
A Friendly Introduction to Mathematical Logic


Christopher C. Leary State University of New York College at Geneseo

Prentice Hall, Upper Saddle River, New Jersey 07458

Library of Congress Cataloging-in-Publication Data

LEARY, CHRISTOPHER C.
A friendly introduction to mathematical logic / Christopher C. Leary. p. cm.
Includes bibliographical references and index.
ISBN 0-13-010705-0
1. Computer logic. 2. Logic, symbolic and mathematical. I. Title.
QA76.9.L63 L43 2000
005.1'01'5113-dc21 99-052220 CIP

Acquisitions Editor: George Lobell
Production Editors: Lynn M. Savino/Bayani Mendoza de Leon
Assistant Vice President of Production and Manufacturing: David W. Riccardi
Senior Managing Editor: Linda Mihatov Behrens
Executive Managing Editor: Kathleen Schiaparelli
Manufacturing Buyer: Alan Fischer
Manufacturing Manager: Trudy Pisciotti
Marketing Manager: Melody Marcus
Marketing Assistant: Vince Jansen
Director of Marketing: John Tweeddale
Art Director: Jayne Conte
Cover Designer: Bruce Kenselaar
Editorial Assistant: Gale Epps

©2000 by Prentice-Hall, Inc. Upper Saddle River, New Jersey 07458

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America 10 9 8 7 6 5 4 3 2 1

ISBN 0-13-010705-0

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada, Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Pearson Education Asia Pte. Ltd.
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

To Andreas R. Blass and to the memory of Edward T. Wong

Two wonderful teachers Two wonderful friends

And to my family:

Sharon, Heather, and Eric Leary


Contents

Preface xi

1 Structures and Languages 1
  1.1 Naively 4
  1.2 Languages 5
    1.2.1 Exercises 9
  1.3 Terms and Formulas 11
    1.3.1 Exercises 15
  1.4 Induction 16
    1.4.1 Exercises 20
  1.5 Sentences 21
    1.5.1 Exercises 24
  1.6 Structures 25
    1.6.1 Exercises 30
  1.7 Truth in a Structure 32
    1.7.1 Exercises 38
  1.8 Substitutions and Substitutability 39
    1.8.1 Exercises 42
  1.9 Logical Implication 42
    1.9.1 Exercises 44
  1.10 Summing Up, Looking Ahead 44

2 Deductions 47
  2.1 Naively 47
  2.2 Deductions 49
    2.2.1 Exercises 54
  2.3 The Logical Axioms 55
    2.3.1 Equality Axioms 56
    2.3.2 Quantifier Axioms 56
    2.3.3 Recap 57
  2.4 Rules of Inference 57
    2.4.1 Propositional Consequence 58
    2.4.2 Quantifier Rules 61
    2.4.3 Exercises 62
  2.5 Soundness 62
    2.5.1 Exercises 66
  2.6 Two Technical Lemmas 67
  2.7 Properties of Our Deductive System 71
    2.7.1 Exercises 75
  2.8 Nonlogical Axioms 76
    2.8.1 Exercises 81
  2.9 Summing Up, Looking Ahead 83

3 Completeness and Compactness 85
  3.1 Naively 85
  3.2 Completeness 86
    3.2.1 Exercises 101
  3.3 Compactness 102
    3.3.1 Exercises 109
  3.4 Substructures and the Löwenheim-Skolem Theorems 111
    3.4.1 Exercises 118
  3.5 Summing Up, Looking Ahead 120

4 Incompleteness—Groundwork 121
  4.1 Introduction 121
  4.2 The Language, the Structure, and the Axioms of N 124
    4.2.1 Exercises 125
  4.3 Recursive Sets and Recursive Functions 126
    4.3.1 Exercises 134
  4.4 Recursive Sets and Computer Programs 135
    4.4.1 Exercises 139
  4.5 Coding—Naively 139
    4.5.1 Exercises 142
  4.6 Coding Is Recursive 142
    4.6.1 Exercise 145
  4.7 Gödel Numbering 146
    4.7.1 Exercises 148
  4.8 Gödel Numbers and N 149
    4.8.1 Exercises 153
  4.9 NUM and SUB Are Recursive 154
    4.9.1 Exercises 161
  4.10 Definitions by Recursion Are Recursive 162
    4.10.1 Exercises 164
  4.11 The Collection of Axioms Is Recursive 165
    4.11.1 Exercise 167
  4.12 Coding Deductions 167
    4.12.1 Exercises 175
  4.13 Summing Up, Looking Ahead 176

5 The Incompleteness Theorems 179
  5.1 Introduction 179
  5.2 The Self-Reference Lemma 180
    5.2.1 Exercises 182
  5.3 The First Incompleteness Theorem 182
    5.3.1 Exercises 187
  5.4 Extensions and Refinements of Incompleteness 187
    5.4.1 Exercises 191
  5.5 Another Proof of Incompleteness 191
    5.5.1 Exercises 194
  5.6 Peano Arithmetic and the Second Incompleteness Theorem 195
    5.6.1 Exercises 200
  5.7 George Boolos on the Second Incompleteness Theorem 201
  5.8 Summing Up, Looking Ahead 202

Appendix: Just Enough Set Theory to Be Dangerous 205

Bibliography 211

Index 213


Preface

This book covers the central topics of first-order mathematical logic in a way that can reasonably be completed in a single semester. From the core ideas of languages, structures, and deductions we move on to prove the Soundness and Completeness Theorems, the Compactness Theorem, and Gödel's First and Second Incompleteness Theorems. There is an introduction to some topics in model theory along the way, but I have tried to keep the text tightly focused.

One choice that I have made in my presentation has been to start right in on the predicate logic, without discussing propositional logic first. I present the material in this way as I believe that it frees up time later in the course to be spent on more abstract and difficult topics. It has been my experience in teaching from preliminary versions of this book that students have responded well to this choice. Students have seen truth tables before, and what is lost in not seeing a discussion of the completeness of the propositional logic is more than compensated for in the extra time for Gödel's Theorem.

I believe that most of the topics I cover really deserve to be in a first course in mathematical logic. Some will question my inclusion of the Löwenheim-Skolem Theorems, and I freely admit that they are included mostly because I think they are so neat. If time presses you, that section might be omitted. You may also want to soft-pedal some of the more technical results in Chapter 4.

The list of topics that I have slighted or omitted from the book is depressingly large. I do not say enough about recursion theory or model theory. I say nothing about linear logic or modal logic or second-order logic. All of these topics are interesting and important, but I believe that they are best left to other courses. One semester is, I believe, enough time to cover the material outlined in this book relatively thoroughly and at a reasonable pace for the student.


Thanks for choosing my book. I would love to hear how it works for you.

To the Student

Welcome! I am really thrilled that you are interested in mathematical logic and that we will be looking at it together! I hope that my book will serve you well and will help to introduce you to an area of mathematics that I have found fascinating and rewarding.

Mathematical logic is absolutely central to mathematics, philosophy, and advanced computer science. The concepts that we discuss in this book—models and structures, completeness and incompleteness—are used by mathematicians in every branch of the subject. Furthermore, logic provides a link between mathematics and philosophy, and between mathematics and theoretical computer science. It is a subject with increasing applications and of great intrinsic interest.

One of the tasks that I set for myself as I wrote this book was to be mindful of the audience, so let me tell you the audience that I am trying to reach with this book: third- or fourth-year undergraduate students, most likely mathematics students. The student I have in mind may not have taken very many upper-division mathematics courses. He or she may have had a course in linear algebra, or perhaps a course in discrete mathematics. Neither of these courses is a prerequisite for understanding the material in this book, but some familiarity with proving things will be required.

In fact, you don't need to know very much mathematics at all to follow this text. So if you are a philosopher or a computer scientist, you should not find any of the core arguments beyond your grasp. You do, however, have to work abstractly on occasion. But that is hard for all of us. My suggestion is that when you are lost in a sea of abstraction, write down three examples and see if they can tell you what is going on.

At several points in the text there are asides that are indented and start with the word Chaff. I hope you will find these comments helpful. They are designed to restate difficult points or emphasize important things that may get lost along the way. Sometimes they are there just to break up the exposition. But these asides really are chaff, in the sense that if they were blown away in the wind, the mathematics that is left would be correct and secure. But do look at them—they are supposed to make your life easier.

Just like every other math text, there are exercises and problems for you to work out. Please try to at least think about the problems. Mathematics is a contact sport, and until you are writing things down and trying to use and apply the material you have been studying, you don't really know the subject. I have tried to include problems of different levels of difficulty, so some will be almost trivial and others will give you a chance to show off.

This is an elementary textbook, but elementary does not mean easy. It was not easy when we learned to add, or read, or write. You will find the going tough at times as we work our way through some very difficult and technical results. But the major theorems of the course—Gödel's Completeness Theorem, the incompleteness results of Gödel and Rosser, the Compactness Theorem, the Löwenheim-Skolem Theorem—provide wonderful insights into the nature of our subject. What makes the study of mathematical logic worthwhile is that it exposes the core of our field. We see the strength and power of mathematics, as well as its limitations. The struggle is well worth it. Enjoy the ride and see the sights.

Thanks

Writing a book like this is a daunting process, and this particular book would never have been produced without the help of many people. Among my many teachers and colleagues I would like to express my heartfelt thanks to Andreas Blass and Claude Laflamme for their careful readings of early versions of the book, for the many helpful suggestions they made, and for the many errors they caught.

I am also indebted to Paul Bankston of Marquette University, William G. Farris of the University of Arizona at Tucson, and Jiping Liu of the University of Lethbridge for their efforts in reviewing the text. Their thoughtful comments and suggestions have made me look smarter and made my book much better.

The Department of Mathematics at SUNY Geneseo has been very supportive of my efforts, and I would also like to thank the many students at Oberlin and at Geneseo who have listened to me lecture about logic, who have challenged me and rewarded me as I have tried to bring this field alive for them. The chance to work with undergraduates was what brought me into this field, and they have never (well, hardly ever) disappointed me.

Much of the writing of this book took place when I was on sabbatical during the fall semester of 1998. The Department of Mathematics and Statistics at the University of Calgary graciously hosted me during that time so I could concentrate on my writing.

I would also like to thank Michael and Jim Henle. On September 10, 1975, Michael told a story in Math 13 about a barber who shaves every man in his town that doesn't shave himself, and that story planted the seed of my interest in logic. Twenty-two years later, when I was speaking with Jim about my interest in possibly writing a textbook, he told me that he thought that I should approach my writing as a creative activity, and if the book was in me, it would come out well. His comment helped give me the confidence to dive into this project.

The typesetting of this book depended upon the existence of Leslie Lamport's LaTeX. I thank everyone who has worked on this typesetting system over the years, and I owe a special debt to David M. Jones for his Index package, and to Piet van Oostrum for Fancyheadings.

Many people at Prentice Hall have worked very hard to make this book a reality. In particular, George Lobell, Gale Epps, and Lynn Savino have been very helpful and caring. You would not be holding this book without their efforts.

But most of all, I would like to thank my wife, Sharon, and my children, Heather and Eric. Writing this book has been like raising another child. But the real family and the real children mean so much more.

Chris Leary Geneseo, New York leary@geneseo.edu

A Friendly Introduction to Mathematical Logic


Chapter 1

Structures and Languages

Let us set the stage. In the middle of the nineteenth century, questions concerning the foundations of mathematics began to appear. Motivated by developments in geometry and in calculus, and pushed forward by results in set theory, mathematicians and logicians tried to create a system of axioms for mathematics, in particular, arithmetic. As systems were proposed, notably by the German mathematician Gottlob Frege, errors and paradoxes were discovered. So other systems were advanced.

At the International Congress of Mathematicians, a meeting held in Paris in 1900, David Hilbert proposed a list of 23 problems that the mathematical community should attempt to solve in the upcoming century. In stating the second of his problems, Hilbert said:

But above all I wish to designate the following as the most important among the numerous questions which can be asked with regard to the axioms [of arithmetic]: To prove that they are not contradictory, that is, that a finite number of logical steps based upon them can never lead to contradictory results. (Quoted in [Feferman 98])

In other words, Hilbert challenged mathematicians to come up with a set of axioms for arithmetic that were guaranteed to be consistent, guaranteed to be paradox-free.


In the first two decades of the twentieth century three major schools of mathematical philosophy developed. The Platonists held that mathematical objects had an existence independent of human thought, and thus the job of mathematicians was to discover the truths about these mathematical objects. Intuitionists, led by the Dutch mathematician L. E. J. Brouwer, held that mathematics should be restricted to concrete operations performed on finite structures. Since vast areas of modern mathematics depended on using infinitary methods, Brouwer's position implied that most of the mathematics of the previous 3000 years should be discarded until the results could be reproved using finitistic arguments. Hilbert was appalled at this suggestion and he became the leading exponent of the Formalist school, which held that mathematics was nothing more than the manipulation of meaningless symbols according to certain rules and that the consistency of such a system was nothing more than saying that the rules prohibited certain combinations of the symbols from occurring.

Hilbert developed a plan to refute the Intuitionist position that most of mathematics was suspect. He proposed to prove, using finite methods that the Intuitionists would accept, that all of classical mathematics was consistent. By using finite methods in his consistency proof, Hilbert was sure that his proof would be accepted by Brouwer and his followers, and then the mathematical community would be able to return to what Hilbert considered the more important work of advancing mathematical knowledge. In the 1920s many mathematicians became actively involved in Hilbert's project, and there were several partial results that seemed to indicate that Hilbert's plan could be accomplished. Then came the shock.

On Sunday, September 7, 1930, at the Conference on Epistemology of the Exact Sciences held in Königsberg, Germany, a 24-year-old Austrian mathematician named Kurt Gödel announced that he could show that there is a sentence such that the sentence is true but not provable in a formal system of classical mathematics. In 1931 Gödel published the proof of this claim along with the proof of his Second Incompleteness Theorem, which said that no consistent formal system of mathematics could prove its own consistency. Thus Hilbert's program was impossible, and there would be no finitistic proof that the axioms of arithmetic were consistent.

Mathematics, which had reigned for centuries as the embodiment of certainty, had lost that role. Thus we find ourselves in a situation where we cannot prove that mathematics is consistent. Although I believe in my heart that mathematics is consistent, I know in my brain that I will not be able to prove that fact, unless I am wrong. For if I am wrong, mathematics is inconsistent. And (as we will see) if mathematics is inconsistent, then it can prove anything, including the statement which says that mathematics is consistent.

So do we throw our hands in the air and give up the study of mathematics? Of course not! Mathematics is still useful, it is still beautiful, and it is still interesting. It is an intellectual challenge. It compels us to think about great ideas and difficult problems. It is a wonderful field of study, with rewards for us all. What we have learned from the developments of the nineteenth and twentieth centuries is that we must temper our hubris. Although we can still agree with Gauss, who said that, "Mathematics is the Queen of the Sciences...", she no longer can claim to be a product of an immaculate conception.

Our study of mathematical logic will take us to a point where we can understand the statement and the proof of Gödel's Incompleteness Theorems. On our way there, we will study formal languages, mathematical structures, and a certain deductive system. The type of thinking, the type of mathematics that we will do, may be unfamiliar to you, and it will probably be tough going at times. But the theorems that we will prove are among the most revolutionary mathematical results of the twentieth century. So your efforts will be well rewarded. Work hard. Have fun.


1.1 Naively

Let us begin by talking informally about mathematical structures and mathematical languages. There is no doubt that you have worked with mathematical models in several previous mathematics courses, although in all likelihood it was not pointed out to you at the time. For example, if you have taken a course in linear algebra, you have some experience working with R², R³, and Rⁿ as examples of vector spaces. In high school geometry you learned that the plane is a "model" of Euclid's axioms for geometry. Perhaps you have taken a class in abstract algebra, where you saw several examples of groups: the integers under addition, permutation groups, and the group of invertible n × n matrices with the operation of matrix multiplication are all examples of groups—they are "models" of the group axioms. All of these are mathematical models, or structures. Different structures are used for different purposes.

Suppose we think about a particular mathematical structure, for example R³, the collection of ordered triples of real numbers. If we try to do plane Euclidean geometry in R³, we fail miserably, as (for example) the parallel postulate is false in this structure. On the other hand, if we want to do linear algebra in R³, all is well and good, as we can think of the points of R³ as vectors and let the scalars be real numbers. Then the axioms for a real vector space are all true when interpreted in R³. We will say that R³ is a model of the axioms for a vector space, whereas it is not a model for Euclid's axioms for geometry.

As you have no doubt noticed, our discussion has introduced two separate types of things to worry about. First, there are the mathematical models, which you can think of as the mathematical worlds, or constructs. Examples of these include R³, the collection of polynomials of degree 17, the set of 3 × 2 matrices, and the real line. We have also been talking about the axioms of geometry and vector spaces, and these are something different. Let us discuss those axioms for a moment.

Just for the purposes of illustration, let us look at some of the axioms which state that V is a real vector space. They are listed here both informally and in a more formal language:

Vector addition is commutative: (∀u ∈ V)(∀v ∈ V) u + v = v + u.

There is a zero vector: (∃0 ∈ V)(∀v ∈ V) v + 0 = v.


One times anything is itself: (∀v ∈ V) 1v = v.

Don't worry if the formal language is not familiar to you at this point; it suffices to notice that there is a formal language. But do let me point out a few things that you probably accepted without question. The addition sign that is in the first two axioms is not the same plus sign that you were using when you learned to add in first grade. Or rather, it is the same sign, but you interpret that sign differently. If the vector space under consideration is R³, you know that as far as the first two axioms up there are concerned, addition is vector addition. Similarly, the 0 in the second axiom is not the real number 0; rather, it is the zero vector. Also, the multiplication in the third axiom that is indicated by the juxtaposition of the 1 and the v is the scalar multiplication of the vector space, not the multiplication of third grade.

So it seems that we have to be able to look at some symbols in a particular formal language and then take those symbols and relate them in some way to a mathematical structure. Different interpretations of the symbols will lead to different conclusions as regards the truth of the formal statement. For example, if we take the commutativity axiom above and work with the space V being R³ but interpret the sign + as standing for cross product instead of vector addition, we see that the axiom is no longer true, as cross product is not commutative.
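This failure of commutativity is easy to check by direct computation. The following Python sketch (my own illustration, not from the text) compares the two interpretations of + on R³:

```python
def vec_add(u, v):
    # Componentwise (vector) addition on R^3.
    return tuple(a + b for a, b in zip(u, v))

def cross(u, v):
    # The standard cross product on R^3.
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u = (1.0, 2.0, 3.0)
v = (4.0, 5.0, 6.0)

print(vec_add(u, v) == vec_add(v, u))  # True: vector addition is commutative
print(cross(u, v) == cross(v, u))      # False: the cross product is not
```

In fact cross(v, u) is the negative of cross(u, v), so the commutativity axiom fails badly under this interpretation of +.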

These, then, are our next objectives: to introduce formal languages, to give an official definition of a mathematical structure, and to discuss truth in those structures. Beauty will come later.

1.2 Languages

We will be constructing a very restricted formal language, and our goal in constructing that language will be to be able to form certain statements about certain kinds of mathematical structures. For our work, it will be necessary to be able to talk about constants, functions, and relations, and so we will need symbols to represent them.

Chaff: Let me emphasize this once more. Right now we are discussing the syntax of our language, the marks on the paper. We are not going to worry about the semantics, or meaning, of those marks until later—at least not formally. But it is silly to pretend that the intended meanings do not drive our choice of symbols and the way in which we use them. If we want to discuss left-hemi-semi-demi-rings, our formal language should include the function and relation symbols that mathematicians in this lucrative and exciting field customarily use, not the symbols involved in chess, bridge, or right-hemi-semi-para-fields. It is not our goal to confuse anyone more than is necessary. So you should probably go through the exercise right now of taking a guess at a reasonable language to use if our intended field of discussion was, say, the theory of the natural numbers. See Exercise 1.

Definition 1.2.1. A first-order language L is an infinite collection of distinct symbols, no one of which is properly contained in another, separated into the following categories:

1. Parentheses: (, ).

2. Connectives: ∨, ¬.

3. Quantifier: ∀.

4. Variables, one for each positive integer n: v1, v2, ..., vn, .... The set of variable symbols will be denoted Vars.

5. Equality symbol: =.

6. Constant symbols: Some set of zero or more symbols.

7. Function symbols: For each positive integer n, some set of zero or more n-ary function symbols.

8. Relation symbols: For each positive integer n, some set of zero or more n-ary relation symbols.

To say that a function symbol is n-ary (or has arity n) means that it is intended to represent a function of n variables. For example, + has arity 2. Similarly, an n-ary relation symbol will be intended to represent a relation on n-tuples of objects. This will be made formal in Definition 1.6.1.


To specify a language, all we have to do is determine which, if any, constant, function, and relation symbols we wish to use. Many authors, by the way, let the equality symbol be optional, or treat the equality symbol as an ordinary binary (i.e., 2-ary) relation symbol. We will assume that each language has the equality symbol, unless specifically noted.

Chaff: I ought to add a word about the phrase "no one of which is properly contained in another," which appears in this definition. We have been quite vague about the meaning of the word symbol, but you are supposed to be thinking about marks made on a piece of paper. We will be constructing sequences of symbols and trying to figure out what they mean in the next few pages, and by not letting one symbol be contained in another, we will find our job of interpreting sequences to be much easier.

For example, suppose that our language contained both a constant symbol ◊ and the constant symbol ◊◊ (notice that the first symbol is properly contained in the second). If you were reading a sequence of symbols and ran across ◊◊, it would be impossible to decide if this was one symbol or a sequence of two symbols. By not allowing symbols to be contained in other symbols, this type of confusion is avoided, leaving the field open for other types of confusion to take its place.
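To see concretely why the containment restriction matters, here is a small Python sketch (mine, not the book's) that counts the ways a string can be broken into symbols, using "D" and "DD" as stand-ins for a symbol and a symbol that properly contains it:

```python
def count_parses(s, symbols):
    # Count the ways the string s can be split into a sequence of symbols.
    # More than one way means a reader cannot decode the string uniquely.
    if s == "":
        return 1
    return sum(count_parses(s[len(sym):], symbols)
               for sym in symbols if s.startswith(sym))

# One symbol properly contained in another: reading is ambiguous.
print(count_parses("DD", {"D", "DD"}))  # 2: either (D)(D) or (DD)

# No symbol contained in another: this string decodes uniquely.
print(count_parses("DD", {"D", "X"}))   # 1: only (D)(D)
```

With the restriction in place, a string of marks breaks into symbols in only one way, which is one ingredient in making the later parsing of terms and formulas unambiguous.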

Example 1.2.2. Suppose that we were taking an abstract algebra course and we wanted to specify the language of groups. A group consists of a set and a binary operation that has certain properties. Among those properties is the existence of an identity element for the operation. Thus, we could decide that our language will contain one constant symbol for the identity element, one binary operation symbol, and no relation symbols. We would get

L_G is {0, +},

where 0 is the constant symbol and + is a binary function symbol. Or perhaps we would like to write our groups using the operation as multiplication. Then a reasonable choice could be

L_G is {1, ·, ⁻¹},


which includes not only the constant symbol 1 and the binary function symbol · but also a unary (or 1-ary) function symbol ⁻¹, which is designed to pick out the inverse of an element of the group. As you can see, there is a fair bit of choice involved in designing a language.

Example 1.2.3. The language of set theory is not very complicated at all. We will include one binary relation symbol, and that is all:

L_ST is {∈}.

The idea is that this symbol will be used to represent the elementhood relation, so the interpretation of the string x ∈ y will be that the set x is an element of the set y. You might be tempted to add other relation symbols, such as ⊆, or constant symbols, such as ∅, but it will be easier to define such symbols in terms of more primitive symbols. Not easier in terms of readability, but easier in terms of proving things about the language.

In general, to specify a language we need to list the constant symbols, the function symbols, and the relation symbols. There can be infinitely many [in fact, uncountably many (cf. the Appendix)] of each. So, here is a specification of a language:

L is {c1, c2, c3, ..., f1^{a(f1)}, f2^{a(f2)}, ..., R1^{a(R1)}, R2^{a(R2)}, ...}.

Here, the ci's are the constant symbols, the fi's are the function symbols, and the Ri's are the relation symbols. The superscripts on the function and relation symbols indicate the arity of the associated symbols, so a is a mapping that assigns a natural number to a string that begins with an f or an R, followed by a subscripted ordinal. Thus, an official function symbol might look like this:

f17^{223},

which would say that the function that will be associated with the 17th function symbol is a function of 223 variables. Fortunately, such dreadful detail will rarely be needed. We will usually see only unary or binary function symbols and the arity of each symbol will be stated once. Then the author will trust that the context will remind the patient reader of each symbol's arity.


1.2.1 Exercises

1. Carefully write out the symbols that you would want to have in a language L that you intend to use to write statements of elementary algebra. Indicate which of the symbols are constant symbols, and the arity of the function and relation symbols that you choose. Now write out another language, M (i.e., another list of symbols) with the same number of constant symbols, function symbols, and relation symbols that you would not want to use for elementary algebra. Think about the value of good notation.

2. What are good examples of unary (1-ary) functions? Binary functions? Can you find natural examples of relations with arity 1, 2, 3, and 4? As you think about this problem, stay mindful of the difference between the function and the function symbol, between the relation and the relation symbol.

3. In the town of Sneezblatt there are three eating establishments: McBurgers, Chez Fancy, and Sven's Tandoori Palace. Think for a minute about statements that you might want to make about these restaurants, and then write out the formal language for your theory of restaurants. Have fun with this, but try to include both function and relation symbols in L. What interpretations are you planning for your symbols?

4. You have been put in charge of drawing up the schedule for a basketball league. This league involves eight teams, each of which must play each of the other seven teams exactly two times: once at home and once on the road. Think of a reasonable language for this situation. What constants would you need? Do you need any relation symbols? Function symbols? It would be nice if your finished schedule did not have any team playing two games on the same day. Can you think of a way to state this using the formal symbols that you have chosen? Can you express the sentence which states that each team plays every other team exactly two times?

5. Let's work out a language for elementary trigonometry. To get you started, let me suggest that you start off with lots of constant symbols—one for each real number. It is tempting to use the symbol 7 to stand for the number seven, but this runs into problems. (Do you see why this is illegal? 7, 77, 7/3, ....) Now, what functions would you like to discuss? Think of symbols for them. What are the arities of your function symbols? Do not forget that you need symbols for addition and multiplication! What relation symbols would you like to use?

6. A computer language is another example of a language. For example, the symbol := might be a binary function symbol, where the interpretation of the instruction

x := 7

would be to alter the internal state of the computer by placing the value 7 into the position in memory referenced by the variable x. Think about the function associated with the binary function symbol

if _ , then _

What are the inputs into this function? What sort of thing does the function do? Look at the statement

If x + y > 3, then z := 7.

Identify the function symbols, constant symbols, and relation symbols. What are the arities of each function and relation symbol?

7. What would be a good language for the theory of vector spaces? This problem is slightly more difficult, as there are two different varieties of objects, scalars and vectors, and you have to be able to tell them apart. Write out the axioms of vector spaces in your language. Or, better yet, use a language that includes a unary function symbol for each real number so that scalars don't exist as objects at all!

8. It is not actually necessary to include function symbols in the language, since a function is just a special kind of relation. Just to see an example, think about the function f : N → N defined by f(x) = x². Remembering that a relation on N × N is just a set of ordered pairs of natural numbers, find a relation R on N × N such that (x, y) is an element of R if and only if y = f(x). Convince yourself that you could do the same for any function defined on any domain. What condition must be true if a relation R on A × B is to be a function mapping A to B?
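If you want to experiment with this exercise's idea, it can be sketched in a few lines of Python. The representation and the names as_relation and is_function below are ours, not the text's:

```python
# A sketch of Exercise 8: the function f(x) = x^2 on a finite piece
# of N, represented as a relation, i.e., a set of ordered pairs
# (x, y) with y = f(x).
def as_relation(f, domain):
    """Return the relation {(x, f(x)) : x in domain}."""
    return {(x, f(x)) for x in domain}

def is_function(R, A, B):
    """The condition for a relation R on A x B to be a function
    mapping A to B: every a in A is paired with exactly one b,
    and every second coordinate lies in B."""
    return all(sum(1 for (a, b) in R if a == x) == 1 for x in A) \
        and all(b in B for (_, b) in R)

N10 = range(10)
R = as_relation(lambda x: x * x, N10)
print((3, 9) in R)                       # True, since f(3) = 9
print(is_function(R, N10, range(100)))   # True
```

The check in is_function is exactly the condition the exercise asks for: existence and uniqueness of an output for each input.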


1.3 Terms and Formulas

Suppose that £ is the language {0, +, <}, and we are going to use £ to discuss portions of arithmetic. If I were to write down the string of symbols from £,

(v1 + 0) < v1,

and the string

v17)(∨ + +(((0,

you would probably agree that the first string conveyed some meaning, even if that meaning were incorrect, while the second string was meaningless. It is our goal in this section to carefully define which strings of symbols of £ we will use. In other words, we will select the strings that will have meaning.

Now, the point of having a language is to be able to make statements about certain kinds of mathematical systems. Thus, we will want the statements in our language to have the ability to refer to objects in the mathematical structures under consideration. So we will need some of the strings in our language to refer to those objects. Those strings are called the terms of £.

Definition 1.3.1. If £ is a language, a term of £ is a nonempty finite string t of symbols from £ such that either:

1. t is a variable, or

2. t is a constant symbol, or

3. t is ft1t2 · · · tn, where f is an n-ary function symbol of £ and each of the ti is a term of £.

This is a definition by recursion, as you notice that in the third clause of the definition, t is a term if it contains substrings that are terms. Since the substrings of t are shorter (contain fewer symbols) than t, and as none of the symbols of £ are made up of other symbols of £, this causes no problems.

Example 1.3.2. Let £ be the language {0, 1, 2, . . . , +, ·}, with one constant symbol for each natural number and two binary function symbols. Here are some of the terms of £: 714, +32, ·+324. Notice that 1 2 3 is not a term of £, but rather is a sequence of three terms in a row.
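For readers who like to compute, Definition 1.3.1 can be turned into a small term-checker. The sketch below is ours, not the text's: terms are given as lists of already-separated symbols (so 714 is a single constant symbol), which sidesteps the 1 2 3 ambiguity noted above:

```python
# A sketch of Definition 1.3.1 for the language of Example 1.3.2,
# with constant symbols '0', '1', '2', ... , variables 'v1', 'v2',
# ... , and binary function symbols '+' and '*'.  Terms are token
# sequences in Polish notation.
ARITY = {'+': 2, '*': 2}

def parse_term(tokens):
    """Consume one term from the front of the token list and return
    the unread remainder; raise ValueError if no term starts here."""
    if not tokens:
        raise ValueError("the empty string is not a term")
    head, rest = tokens[0], tokens[1:]
    if head in ARITY:                           # clause 3: f t1 ... tn
        for _ in range(ARITY[head]):
            rest = parse_term(rest)
        return rest
    if head.isdigit() or head.startswith('v'):  # clauses 1 and 2
        return rest
    raise ValueError(f"{head!r} is not a symbol of the language")

def is_term(tokens):
    try:
        return parse_term(tokens) == []         # a term uses ALL tokens
    except ValueError:
        return False

print(is_term(['714']))                    # True
print(is_term(['*', '+', '3', '2', '4']))  # True: the term *+324
print(is_term(['1', '2', '3']))            # False: three terms in a row
```

The recursion in parse_term mirrors the three clauses of the definition exactly, which is what makes the definition usable at all.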


Chaff: The term +32 looks pretty annoying at this point, but we will use this sort of notation (called Polish notation) for functions rather than the infix notation (3 + 2) that you are used to. We are not really being that odd here: You have certainly seen some functions written in Polish notation: sin(x) and f(x, y, z) come to mind. We are just being consistent in treating addition in the same way. What makes it difficult is that it is hard to remember that addition really is just another function of two variables. But I am sure that by the end of this book, you will be very comfortable with that idea and with the notation that we are using.

A couple of points are probably worth emphasizing, just this once. Notice that in the application of the function symbols, there are no parentheses and no commas. Also notice that all of our functions are written with the operator on the left. So instead of 3 + 2, we write +32. The reason for this is for consistency and to make sure that we can parse our expressions.

Let me give an example. Suppose that, in some language or other, we wrote down a certain string of symbols. Assume that two of our colleagues, Humphrey and Ingrid, were waiting in the hall while we wrote down the string. If Humphrey came into the room and announced that our string was a 3-ary function symbol followed by three terms, whereas Ingrid proclaimed that the string was really a 4-ary relation symbol followed by two terms, this would be rather confusing. It would be really confusing if they were both correct! So we need to make sure that the strings that we write down can be interpreted in only one way. This property, called unique readability, is addressed in Exercise 7 of Section 1.4.1.

Chaff: Unique readability is one of those things that, in the opinion of the author, is important to know, interesting to prove, and boring to read. Thus the proof is placed in (I do not mean "relegated to") the exercises.

Suppose that we look more carefully at the term ·+324. Assume for now that the symbols in this term are supposed to be interpreted in the usual way, so that · means multiply, + means add, and 3 means three. Then if we add some parentheses to the term in order to clarify its meaning, we get

·(+32)4,

which ought to have the same meaning as ·54, which is 20, just as you suspected.

Rest assured that we will continue to use infix notation, commas, and parentheses as seem warranted to increase the readability (by humans) of this text. So ft1t2 . . . tn will be written f(t1, t2, . . . , tn) and +32 will be written 3 + 2, with the understanding that this is shorthand and that our official version is the version given in Definition 1.3.1.
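The claim that Polish notation can be parsed without parentheses or commas can also be made concrete with a small evaluator. This is a sketch of ours, under the usual interpretation of +, ·, and the numerals:

```python
# Evaluating prefix terms: each prefix string has exactly one
# reading, so no parentheses are needed.
OPS = {'+': lambda a, b: a + b, '*': lambda a, b: a * b}

def evaluate(tokens):
    """Evaluate one prefix term; return (value, unread tokens)."""
    head, rest = tokens[0], tokens[1:]
    if head in OPS:
        left, rest = evaluate(rest)    # first argument
        right, rest = evaluate(rest)   # second argument
        return OPS[head](left, right), rest
    return int(head), rest             # a constant symbol

value, leftover = evaluate(['*', '+', '3', '2', '4'])
print(value)     # 20, i.e. *(+32)4 = (3 + 2) * 4
print(leftover)  # [] -- the whole string was consumed
```

Notice that the evaluator never has to look ahead or backtrack: the arity of each function symbol tells it exactly how many arguments to read, which is the computational content of unique readability.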

The terms of £ play the role of the nouns of the language. To make meaningful mathematical statements about some mathematical structure, we will want to be able to make assertions about the objects of the structure. These assertions will be the formulas of £.

Definition 1.3.3. If £ is a first-order language, a formula of £ is a nonempty finite string φ of symbols from £ such that either:

1. φ is = t1t2, where t1 and t2 are terms of £, or

2. φ is Rt1t2 · · · tn, where R is an n-ary relation symbol of £ and t1, t2, . . . , tn are all terms of £, or

3. φ is (¬α), where α is a formula of £, or

4. φ is (α ∨ β), where α and β are formulas of £, or

5. φ is (∀v)(α), where v is a variable and α is a formula of £.

If a formula ψ contains the subformula (∀v)(α) [meaning that the string of symbols that constitute the formula (∀v)(α) is a substring of the string of symbols that make up ψ], we will say that the scope of the quantifier ∀ is α. Any symbol in α will be said to lie within the scope of the quantifier ∀. Notice that a formula ψ can have several different occurrences of the symbol ∀, and each occurrence of the quantifier will have its own scope. Also notice that one quantifier can lie within the scope of another.

The atomic formulas of £ are those formulas that satisfy clause (1) or (2) of Definition 1.3.3.
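If it helps to see the five clauses in action, here is a sketch (ours, not the text's) that builds formulas as strings, one constructor per clause of Definition 1.3.3. The relation and function symbols used in the example are from the number-theory examples in the text:

```python
# Building official-notation formulas, one constructor per clause.
def eq(t1, t2):        return '=' + t1 + t2                   # clause 1
def rel(R, *terms):    return R + ''.join(terms)              # clause 2
def neg(alpha):        return '(¬' + alpha + ')'              # clause 3
def lor(alpha, beta):  return '(' + alpha + '∨' + beta + ')'  # clause 4
def forall(v, alpha):  return '(∀' + v + ')(' + alpha + ')'   # clause 5

atomic = rel('<', 'SSSSS0', 'SSS0')   # an atomic formula: no parentheses
phi = forall('x', neg(eq('x', 'S0')))
print(atomic)   # <SSSSS0SSS0
print(phi)      # (∀x)((¬=xS0))
```

Every string produced by these constructors is a formula, and the doubled parentheses in the last output are exactly what the official definition demands: (∀v)(α), where α is itself the parenthesized formula (¬=xS0).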


You have undoubtedly noticed that there are no parentheses or commas in the atomic formulas, and you have probably decided that we will continue to use both commas and infix notation as seems appropriate. You are correct on both counts. So, instead of writing the official version

< SSSSS0SSS0

in a language containing constant symbol 0, unary function symbol S, and binary relation symbol <, we will write

SSSSS0 < SSS0

or (after some preliminary definitions) 5 < 3.

Also notice that we are using infix notation for the binary logical connective ∨. I hope that this will make your life somewhat easier.

You will be asked in Exercise 8 in Section 1.4.1 to prove that unique readability holds for formulas as well as terms. We will, in our exposition, use different-size parentheses, different shapes of delimiters, and omit parentheses in order to improve readability without (we hope) introducing confusion on your part.

Notice that a term is not a formula! If the terms are the nouns of the language, the formulas will be the statements. Statements can be either true or false. Nouns cannot. Much confusion can be avoided if you keep this simple dictum in mind.

For example, suppose that you are looking at a string of symbols and you notice that the string does not contain either the symbol = or any other relation symbol from the language. Such a string cannot be a formula, as it makes no claim that can be true or false. The string might be a term, it might be nonsense, but it cannot be a formula.

Chaff: I do hope that you have noticed that we are dealing only with the syntax of our language here. We have not mentioned that the symbol ¬ will be used for denial, or that ∨ will mean "or," or even that ∀ means "for every." Don't worry, they will mean what you think they should mean. Similarly, do not worry about the fact that the definition of a formula left out symbols for conjunctions, implications, and biconditionals. We will get to them in good time.


1.3.1 Exercises

1. Suppose that the language £ consists of two constant symbols, ♦ and ♥, a unary relation symbol ♯, a binary function symbol ♭, and a 3-ary function symbol ♮. Write down at least three distinct terms of the language £. Write down a couple of nonterms that look like they might be terms and explain why they are not terms. Write a couple of formulas and a couple of nonformulas that look like they ought to be formulas.

2. The fact that we write all of our operations on the left is important for unique readability. Suppose, for example, that we wrote our binary operations in the middle (and did not allow the use of parentheses). If our language included the binary function symbol #, then the term

u#v#w

could be interpreted two ways. This can make a difference: Suppose that the operation associated with the function symbol # is "subtract." Find three real numbers u, v, and w such that the two different interpretations of u#v#w lead to different answers. Any nonassociative binary function will yield another counterexample to unique readability. Can you think of three such functions?

3. The language of number theory is

£NT = {0, S, +, ·, E, <},

where the intended meanings of the symbols are as follows: 0 stands for the number zero, S is the successor function S(x) = x + 1, the symbols +, ·, and < mean what you expect, and E stands for exponentiation, so E(3, 2) = 9. Assume that £NT-formulas will be interpreted with respect to the nonnegative integers and write an £NT-formula to express the claim that p is a prime number. Can you write a formula stating that there is no largest prime number? Can you write the statement of Lagrange's Theorem, which states that every natural number is the sum of four squares? What is the formal statement of the Twin Primes Conjecture, which says that there are infinitely many pairs (x, y) such that x and y are both prime and y = x + 2?


Use shorthand in your answer to this problem. After you have found the formula which says that p is prime, call the formula Prime(p), and use Prime(p) in your later answers.

4. Suppose that our language has infinitely many constant symbols of the form ′, ′′, ′′′, ′′′′, . . . and no function or relation symbols other than =. Explain why this situation leads to problems by looking at the formula = ′′′′′. Where in our definitions do we outlaw this sort of problem?

1.4 Induction

You are familiar, no doubt, with proofs by induction. They are the bane of most mathematics students from their first introduction in high school through the college years. It is my goal in this section to discuss the proofs by induction that you know so well, put them in a different light, and then generalize that notion of induction to a setting that will allow us to use induction to prove things about terms and formulas rather than just the natural numbers.

Just to remind you of the general form of a proof by induction on the natural numbers, let me state and prove a familiar theorem, assuming for the moment that the set of natural numbers is {1, 2, 3, . . .}.

Theorem 1.4.1. For every natural number n,

1 + 2 + · · · + n = n(n + 1)/2.

Proof. If n = 1, simple computation shows that the equality holds. For the inductive case, fix k ≥ 1 and assume that

1 + 2 + · · · + k = k(k + 1)/2.

If we add k + 1 to both sides of this equation, we get

1 + 2 + · · · + k + (k + 1) = k(k + 1)/2 + (k + 1),

and simplifying the right-hand side of this equation shows that

1 + 2 + · · · + k + (k + 1) = (k + 1)(k + 2)/2,

finishing the inductive step, and the proof. •
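If you would like a sanity check before reading on, the identity is easy to verify numerically. This quick check is ours and is of course not part of the proof:

```python
# Numerically confirm Theorem 1.4.1 for the first few natural numbers.
for n in range(1, 8):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
print("1 + 2 + ... + n = n(n+1)/2 holds for n = 1, ..., 7")
```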


As you look at the proof of this theorem, you notice that there is a base case, when n = 1, and an inductive case. In the inductive step of the proof, we prove the implication

If the formula holds for k, then the formula holds for k + 1.

We prove this implication by assuming the antecedent, that the theorem holds for a (fixed, but unknown) number k, and from that assumption proving the consequent, that the theorem holds for the next number, k + 1. Notice that this is not the same as assuming the theorem that we are trying to prove. The theorem is a universal statement—it claims that a certain formula holds for every natural number.

Looking at this from a slightly different angle, what we have done is to construct a set of numbers with a certain property. If we let S stand for the set of numbers for which our theorem holds, in our proof by induction we show the following facts about S:

1. The number 1 is an element of S. We prove this explicitly in the base case of the proof.

2. If the number k is an element of S, then the number k + 1 is an element of S. This is the content of the inductive step of the proof.

But now, notice that we know that the collection of natural numbers can be defined as the smallest set such that:

1. The number 1 is a natural number.

2. If k is a natural number, then k + 1 is a natural number.

So S, the collection of numbers for which the theorem holds, is identical with the set of natural numbers, thus the theorem holds for every natural number n, as needed. (If you caught the slight lie here, just substitute "superset" where appropriate.)

So what makes a proof by induction work is the fact that the natural numbers can be defined recursively. There is a base case, consisting of the smallest natural number ("1 is a natural number"), and there is a recursive case, showing how to construct bigger natural numbers from smaller ones ("If k is a natural number, then k + 1 is a natural number").


Now, let us look at Definition 1.3.3, the definition of a formula. Notice that the five clauses of the definition can be separated into two groups. The first two clauses, the atomic formulas, are explicitly defined: For example, the first case says that anything that is of the form = t1t2 is a formula if t1 and t2 are terms. These first two clauses form the base case of the definition. The last three clauses are the recursive case, showing how if α and β are formulas, they can be used to build more complex formulas, such as (α ∨ β) or (∀v)(α).

Now since the collection of formulas is defined recursively, we can use an inductive-style proof when we want to prove that something is true about every formula. The inductive proof will consist of two parts, a base case and an inductive case. In the base case of the proof we will verify that the theorem is true about every atomic formula—about every string that is known to be a formula from the base case of the definition. In the inductive step of the proof, we assume that the theorem is true about simple formulas (α and β), and use that assumption to prove that the theorem holds for a more complicated formula φ that is generated by a recursive clause of the definition. This method of proof is called induction on the complexity of the formula.

The following theorem, although mildly interesting in its own right, is included here mostly so that the reader can see an example of a proof by induction in this setting:

Theorem 1.4.2. Suppose that φ is a formula in the language £. Then the number of left parentheses occurring in φ is equal to the number of right parentheses occurring in φ.

Proof. We will present this proof in a fair bit of detail, in order to emphasize the proof technique. As you become accustomed to proving theorems by induction on complexity, not so much detail is needed.

Base Case. We begin our inductive proof with the base case, as you would expect. Our theorem makes an assertion about all formulas, and the simplest formulas are the atomic formulas. They constitute our base case. Suppose that φ is an atomic formula. There are two varieties of atomic formulas: Either φ begins with an equals sign followed by two terms, or φ begins with a relation symbol followed by several terms. As there are no parentheses in any term (we are using the official definition of term, here), there are no parentheses


in φ. Thus, there are as many left parentheses as right parentheses in φ, and we have established the theorem if φ is an atomic formula.

Inductive Case. The inductive step of a proof by induction on complexity of a formula takes the following form: Assume that φ is a formula by virtue of clause (3), (4), or (5) of Definition 1.3.3. Also assume that the statement of the theorem is true when applied to the formulas α and β. With those assumptions we will prove that the statement of the theorem is true when applied to the formula φ. Thus, as every formula is a formula either by virtue of being an atomic formula or by application of clause (3), (4), or (5) of the definition, we will have shown that the statement of the theorem is true when applied to any formula, which has been our goal.

So, assume that α and β are formulas that contain equal numbers of left and right parentheses. Suppose that there are k left parentheses and k right parentheses in α and l left parentheses and l right parentheses in β.

If φ is a formula by virtue of clause (3) of the definition, then φ is (¬α). We observe that there are k + 1 left parentheses and k + 1 right parentheses in φ, and thus φ has an equal number of left and right parentheses, as needed.

If φ is a formula because of clause (4), then φ is (α ∨ β), and φ contains k + l + 1 left and right parentheses, an equal number of each type.

Finally, if φ is (∀v)(α), then φ contains k + 2 left parentheses and k + 2 right parentheses, as needed.

This concludes the possibilities for the inductive case of the proof, so we have established that in every formula, the number of left parentheses is equal to the number of right parentheses. •
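The theorem can also be checked by brute force: build many formulas with the recursive clauses and count parentheses. The sketch below is ours; it is evidence, not a proof. The proof is the induction above:

```python
# Generate formulas by closing an atomic formula under the three
# recursive clauses of Definition 1.3.3, then check the parenthesis
# counts predicted by Theorem 1.4.2.
def neg(a):        return '(¬' + a + ')'
def lor(a, b):     return '(' + a + '∨' + b + ')'
def forall(v, a):  return '(∀' + v + ')(' + a + ')'

def balanced(phi):
    return phi.count('(') == phi.count(')')

atomic = '=v1v2'                  # no parentheses: the base case
formulas = [atomic]
for _ in range(2):                # two rounds of closure
    formulas += [neg(f) for f in formulas]
    formulas += [lor(f, g) for f in formulas for g in formulas]
    formulas += [forall('x', f) for f in formulas]

print(all(balanced(f) for f in formulas))  # True
```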

This theorem is a little cheesy, but the power of this proof technique will become more evident as you work through the following exercises and when we discuss the semantics of our language.

Notice also that the definition of a term (Definition 1.3.1) is a recursive definition, so we can use induction on the complexity of a term to prove that a theorem holds for every term.


1.4.1 Exercises

1. Prove, by ordinary induction on the natural numbers, that

1² + 2² + · · · + n² = n(n + 1)(2n + 1)/6.

2. Prove, by induction, that the sum of the interior angles in a convex n-gon is (n — 2)180°. (A convex n-gon is a polygon with n sides, where the interior angles are all less than 180°.)

3. Prove by induction that if A is a set consisting of n elements, then A has 2ⁿ subsets.

4. Suppose that £ is {0, f, g}, where 0 is a constant symbol, f is a binary function symbol, and g is a 4-ary function symbol. Use induction on complexity to show that every £-term has an odd number of symbols.

5. If £ is {0, <}, where 0 is a constant symbol and < is a binary relation symbol, show that the number of symbols in any formula is divisible by 3.

6. If s and t are strings, we say that s is an initial segment of t if there is a nonempty string u such that t = su, where su is the string s followed by the string u. For example, KUMQ is an initial segment of KUMQUAT and +24 is an initial segment of +24u − v. Prove, by induction on the complexity of s, that if s and t are terms, then s is not an initial segment of t. [Suggestion: The base case, when s is either a variable or a constant symbol, should be easy. Then suppose that s is an initial segment of t and s = ft1t2 · · · tn, where you know that each ti is not an initial segment of any other term. Look for a contradiction.]

7. A language is said to satisfy unique readability for terms if, for each term t, t is in exactly one of the following categories:

(a) Variable
(b) Constant symbol
(c) Complex term

and furthermore, if t is a complex term, then there is a unique function symbol f and a unique sequence of terms t1, t2, . . . , tn


such that t = ft1t2 . . . tn. Prove that our languages satisfy unique readability for terms. [Suggestion: You mostly have to worry about uniqueness—for example, suppose that t is c, a constant symbol. How do you know that t is not also a complex term? Suppose that t is ft1t2 . . . tn. How do you show that the f and the ti's are unique? You may find Exercise 6 useful.]

8. To say that a language satisfies unique readability for formulas is to say that every formula φ is in exactly one of the following categories:

(a) Equality (if φ is = t1t2)
(b) Other atomic (if φ is Rt1t2 . . . tn for an n-ary relation symbol R)
(c) Negation
(d) Disjunction
(e) Quantified

Also, it must be that if φ is both = t1t2 and = t3t4, then t1 is identical to t3 and t2 is identical to t4, and similarly for other atomic formulas. Furthermore, if (for example) φ is a negation (¬α), then it must be the case that there is no other formula β such that φ is also (¬β), and similarly for disjunctions and quantified formulas. Prove that our languages satisfy unique readability for formulas. You will want to look at, and use, Exercise 7. You may have to prove an analog of Exercise 6, in which it may be helpful to think about the parentheses in an initial segment of a formula, in order to prove that no formula is an initial segment of another formula.

9. Take the proof of Theorem 1.4.2 and write it out in the way that you would present it as part of a homework assignment. Thus, you should cut out all of the inessential motivation and present only what is needed to make the proof work.

1.5 Sentences

Among the formulas in the language there are some in which we will be especially interested. These are the sentences of £—the


formulas that can be either true or false in a given mathematical model.

Let me use an example to introduce a language that will be vitally important to us as we work through this book.

Definition 1.5.1. The language £NT is {0, S, +, ·, E, <}, where 0 is a constant symbol, S is a unary function symbol, +, ·, and E are binary function symbols, and < is a binary relation symbol. This will be referred to as the language of number theory.

Chaff: Although we are not fixing the meanings of these symbols yet, I probably ought to tell you that the standard interpretation of £NT will use 0, +, ·, and < in the way that you expect. The symbol S will stand for the successor function that maps a number x to the number x + 1, and E will be used for exponentiation: E32 is supposed to be 3².

Consider the following two formulas of £NT:

¬(∀x)[(y < x) ∨ (y = x)].

(∀x)(∀y)[(x < y) ∨ (x = y) ∨ (y < x)].

(Did you notice that we have begun using an informal presentation of the formulas?)

The second formula should look familiar. It is nothing more than the familiar trichotomy law of <, and you would agree that the second formula is a true statement about the collection of natural numbers, where you are interpreting < in the usual way.

The first formula above is different. It "says" that not every x is greater than or equal to y. The truth of that statement is indeterminate: It depends on what natural number y represents. The formula might be true, or it might be false—it all depends on the value of y. So our goal in this section is to separate the formulas of £ into one of two classes: the sentences (like the second example above) and the nonsentences. To begin this task, we must talk about free variables.

Free variables are the variables upon which the truth value of a formula may depend. The variable y is free in the first formula above. To draw an analogy from calculus, if we look at

∫ₐˣ f(t) dt,


the variable x is free in this expression, as the value of the integral depends on the value of x. The variable t is not free, and in fact it doesn't make any sense to decide on a value for t. The same distinction holds between free and nonfree variables in an £-formula. Let us try to make things a little more precise.

Definition 1.5.2. Suppose that v is a variable and φ is a formula. We will say that v is free in φ if

1. φ is atomic and v occurs in (is a symbol in) φ, or

2. φ is (¬α) and v is free in α, or

3. φ is (α ∨ β) and v is free in at least one of α or β, or

4. φ is (∀u)(α) and v is not u and v is free in α.

Thus, if we look at the formula

∀v2 ¬(∀v3)(v1 = S(v2) ∨ v3 = v2),

the variable v1 is free whereas the variables v2 and v3 are not free. A slightly more complicated example is

(∀v1∀v2(v1 + v2 = 0)) ∨ v1 = S(0).

In this formula, v1 is free whereas v2 is not free. Especially when a formula is presented informally, you must be careful about the scope of the quantifiers and the placement of parentheses.
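Definition 1.5.2 is itself recursive, so computing the free variables of a formula is a natural recursion. In the sketch below (ours, not the text's), formulas are nested tuples, and an atomic formula simply lists the variables occurring in it:

```python
# Free variables per Definition 1.5.2, one branch per clause.
# Representation: ('atomic', v, ...), ('not', a), ('or', a, b),
# ('forall', v, a).
def free_vars(phi):
    kind = phi[0]
    if kind == 'atomic':                         # clause 1
        return set(phi[1:])
    if kind == 'not':                            # clause 2
        return free_vars(phi[1])
    if kind == 'or':                             # clause 3
        return free_vars(phi[1]) | free_vars(phi[2])
    if kind == 'forall':                         # clause 4
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(kind)

# The first example above, ∀v2 ¬(∀v3)(v1 = S(v2) ∨ v3 = v2):
phi = ('forall', 'v2',
       ('not', ('forall', 'v3',
                ('or', ('atomic', 'v1', 'v2'),
                       ('atomic', 'v3', 'v2')))))
print(free_vars(phi))  # {'v1'}
```

A formula is then a sentence exactly when free_vars returns the empty set.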

We will have occasion to use the informal notation ∀x φ(x). This will mean that φ is a formula and x is the only free variable of φ. If we then write φ(t), where t is an £-term, that will denote the formula obtained by taking φ and replacing each occurrence of the variable x with the term t. This will all be defined more formally and more precisely in Definition 1.8.2.

Definition 1.5.3. A sentence in a language £ is a formula of £ that contains no free variables.

For example, if a language contained the constant symbols 0, 1, and 2 and the binary function symbol +, then the following are sentences: 1 + 1 = 2 and (∀x)(x + 1 = x). You are probably convinced that the first of these is true and the second of these is false. In the next two sections we will see that you might be correct. But then again, you might not be.


1.5.1 Exercises

1. For each of the following, find the free variables, if any, and decide if the given formula is a sentence. The language includes a binary function symbol +, a binary relation symbol <, and constant symbols 0 and 2.

(a) (∀x)(∀y)(x + y = 2)
(b) (x + y < x) ∨ (∀z)(z < 0)
(c) ((∀y)(y < x)) ∨ ((∀x)(x < y))

2. Explain precisely, using the definition of a free variable, how you know that the variable v2 is free in the formula

(∀v1)(¬(∀v5)(v2 = v1 + v5)).

3. In mathematics, we often see statements such as sin²x + cos²x = 1. Notice that this is not a sentence, as the variable x is free. But we all agree that this statement is true, given the usual interpretations of the symbols. How can we square this with the claim that sentences are the formulas that can be either true or false?

4. If we look at the first of our example formulas in this section,

¬(∀x)[(y < x) ∨ (y = x)],

and we interpret the variables as ranging over the natural numbers, you will probably agree that the formula is false if y represents the natural number 0 and true if y represents any other number. (If you aren't happy with 0 being a natural number, then use 1.) On the other hand, if we interpret the variables as ranging over the integers, what can we say about the truth or falsehood of this formula? Can you think of an interpretation for the symbols that would make sense if we try to apply this formula to the collection of complex numbers?

5. A variable may occur several times in a given formula. For example, the variable v1 occurs four times in the formula

(∀v1)[(v1 = v3) ∨ (v1 = Sv2) ∨ (0 + v17 < v1 · S0)].


What should it mean for an occurrence of a variable to be free? Write a definition that begins: The nth occurrence of a variable v in a formula φ is said to be free if . . . . An occurrence of v in φ that is not free is said to be bound. Give an example of a formula in a suitable language that contains both free and bound occurrences of a variable v.

6. Look at the formula

[(∀y)(x = y)] ∨ [(∀z)(x < 0)].

If we denote this formula by φ(x) and t is the term S0, find φ(t). [Suggestion: The trick here is to see that there is a bit of a lie in the discussion of φ(t) in the text. Having completed Exercise 5, we can now say that we only replace the free occurrences of the variable x when we move from φ(x) to φ(t).]

1.6 Structures

Let us, by way of example, return to the language £NT of number theory. Recall that £NT is {0, S, +, ·, E, <}, where 0 is a constant symbol, S is a unary function symbol, +, ·, and E are binary function symbols, and < is a binary relation symbol. We now want to discuss the possible mathematical structures in which we can interpret these symbols, and thus the formulas and sentences of £NT.

"But wait!" cries the incredulous reader. "You just said that this is the language of number theory, so certainly we already know what each of those symbols means."

It is certainly the case that you know an interpretation for these symbols. The point of this section is that there are many different possible interpretations for these symbols, and we want to be able to specify which of those interpretations we have in mind at any particular moment.

Probably the interpretation you had in mind (what we will call the standard model for number theory) works with the set of natural numbers {0, 1, 2, 3, . . .}. The symbol 0 stands for the number 0.

Chaff: Carefully, now! The symbol 0 is the mark on the paper, the numeral. The number 0 is the thing that the numeral 0 represents. The numeral is something that


you can see. The number is something that you cannot see.

The symbol S is a unary function symbol, and the function for which that symbol stands is the successor function that maps a number to the next larger natural number. The symbols +, ·, and E represent the functions of addition, multiplication, and exponentiation, and the symbol < will be used for the "less than" relation.

But that is only one of the ways that we might choose to interpret those symbols. Another way to interpret all of those symbols would be to work with the numbers 0 and 1, interpreting the symbol 0 as the number 0, S as the function that maps 0 to 1 and 1 to 0, + as addition mod 2, · as multiplication mod 2, and (just for variety) E as the function with constant value 1. The symbol < can still stand for the relation "less than."
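That alternative interpretation can be written out concretely. The sketch below (ours) packages the two-element structure as a Python dictionary mapping each symbol of £NT to its interpretation:

```python
# The two-element structure just described: universe {0, 1}, with
# S swapping 0 and 1, + and * computed mod 2, E the constant
# function 1, and < the usual "less than".
A = {0, 1}
structure = {
    '0': 0,
    'S': lambda x: 1 - x,              # maps 0 to 1 and 1 to 0
    '+': lambda x, y: (x + y) % 2,
    '*': lambda x, y: (x * y) % 2,
    'E': lambda x, y: 1,               # constant value 1
    '<': {(0, 1)},                     # the "less than" relation on A
}
print(structure['S'](0))               # 1
print(structure['+'](1, 1))            # 0 -- addition mod 2
print(structure['E'](1, 0))            # 1
print((0, 1) in structure['<'])        # True
```

Note how the dictionary anticipates the shape of Definition 1.6.1 below the example: an element for the constant symbol, a function for each function symbol, and a set of tuples for the relation symbol.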

Or, if I were in a slightly more bizarre mood, I could work in a universe consisting of Beethoven, Picasso, and Ernie Banks, interpreting the symbol 0 as Picasso, S as the identity function, < as equality, and each of the binary function symbols as the constant function with output Ernie Banks.

The point is that there is nothing sacred about one mathematical structure as opposed to another. Without determining the structure under consideration, without deciding how we wish to interpret the symbols of the language, we have no way of talking about the truth or falsity of a sentence as trivial as

(∀v1)(v1 < S(v1)).

Definition 1.6.1. Fix a language ℒ. An ℒ-structure 𝔄 is a nonempty set A, called the universe of 𝔄, together with:

1. For each constant symbol c of ℒ, an element c^𝔄 of A

2. For each n-ary function symbol f of ℒ, a function f^𝔄 : A^n → A

3. For each n-ary relation symbol R of ℒ, an n-ary relation R^𝔄 on A (i.e., a subset of A^n)
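Definition 1.6.1 translates naturally into code. Here is a minimal sketch (Python; the dictionary layout and all names are our own choices, not anything from the text) of the two-element structure described above, with S swapping 0 and 1, + and · read mod 2, E constantly 1, and < the ordinary "less than":

```python
# A hypothetical encoding of an L-structure: a universe together with
# an interpretation for each constant, function, and relation symbol.
structure = {
    "universe": {0, 1},
    "constants": {"0": 0},
    "functions": {
        "S": lambda a: 1 - a,            # swaps 0 and 1
        "+": lambda a, b: (a + b) % 2,   # addition mod 2
        "*": lambda a, b: (a * b) % 2,   # multiplication mod 2
        "E": lambda a, b: 1,             # the constant function 1
    },
    "relations": {
        "<": lambda a, b: a < b,         # the ordinary "less than"
    },
}

# Each constant symbol names an element of the universe, and each
# function symbol names an actual function on the universe:
print(structure["functions"]["S"](0))     # the successor of 0 in this structure
print(structure["functions"]["E"](1, 0))  # E ignores its inputs here
```

Note that nothing in the language forces these particular interpretations; swapping in a different universe and different functions gives a different, equally legitimate ℒ-structure.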

1.6. Structures 27

Chaff: The letter 𝔄 is a German Fraktur capital A. We will also have occasion to use 𝔄's friends, 𝔅 and ℭ. The symbol 𝔑 will be used for a particular structure involving the natural numbers. The use of this typeface is traditional (which means this is the way I learned it). For your handwritten work, probably using capital script letters will be the best.

Often, we will write a structure as an ordered k-tuple, like this:

𝔄 = (A, c1^𝔄, c2^𝔄, ..., f1^𝔄, f2^𝔄, ..., R1^𝔄, R2^𝔄, ...).

As you can see, the notation is starting to get out of hand once again, and we will not hesitate to simplify and abbreviate when we believe that we can do so without confusion. So, when we are working in ℒ_NT, we will often talk about the standard structure

𝔑 = (N, 0, S, +, ·, E, <),

where the constants, functions, and relations do not get the superscripts they deserve, and the author trusts you will interpret N as the collection {0, 1, 2, ...} of natural numbers, the symbol 0 to stand for the number zero, + to stand for addition, S to stand for the successor function, and so on. By the way, if you are not used to thinking of 0 as a natural number, do not panic. Set theorists see 0 as the most natural of objects, so we tend to include it in N without thinking about it.

Example 1.6.2. The structure 𝔑 that we have just introduced is called the standard ℒ_NT-structure. To emphasize that there are other perfectly good ℒ_NT-structures, let us construct a different ℒ_NT-structure 𝔄 with exactly four elements. The elements of A will be Oberon, Titania, Puck, and Bottom. The constant 0^𝔄 will be Bottom. Now we have to construct the functions and relations for our structure. As everything is unary or binary, setting forth tables (as in Table 1.1) seems a reasonable way to proceed. So you can see that in this structure 𝔄, Titania + Puck = Oberon, while Puck + Titania = Titania. You can also see that 0 (also known as Bottom) is not the additive identity in this structure, and that < is a very strange ordering. •

Now the particular functions and relation that I chose were just the functions and relations that jumped into my fingers as I typed up this example, but any such functions would have worked perfectly well to define an ℒ_NT-structure. It may well be worth your while to figure out if this ℒ_NT-sentence is true (whatever that means) in 𝔄: SS0 + SS0 < SSSSS0E0 + S0.


x         S^𝔄(x)
Oberon    Oberon
Titania   Bottom
Puck      Titania
Bottom    Titania

+^𝔄       Oberon   Titania  Puck     Bottom
Oberon    Puck     Puck     Puck     Titania
Titania   Puck     Bottom   Oberon   Titania
Puck      Bottom   Titania  Bottom   Titania
Bottom    Bottom   Bottom   Bottom   Oberon

·^𝔄       Oberon   Titania  Puck     Bottom
Oberon    Oberon   Titania  Puck     Bottom
Titania   Titania  Bottom   Oberon   Titania
Puck      Bottom   Bottom   Oberon   Oberon
Bottom    Titania  Oberon   Puck     Puck

E^𝔄       Oberon   Titania  Puck     Bottom
Oberon    Puck     Puck     Oberon   Oberon
Titania   Titania  Titania  Titania  Titania
Puck      Titania  Bottom   Oberon   Puck
Bottom    Bottom   Puck     Titania  Puck

<^𝔄       Oberon   Titania  Puck     Bottom
Oberon    Yes      No       Yes      Yes
Titania   No       No       Yes      No
Puck      Yes      Yes      Yes      Yes
Bottom    No       No       Yes      No

Table 1.1: A Midsummer Night's Structure
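One way to settle the question about the ℒ_NT-sentence SS0 + SS0 < SSSSS0E0 + S0 is to transcribe Table 1.1 and evaluate the two terms mechanically. The sketch below (Python) is our reading of the tables, so double-check the dictionaries against Table 1.1 before trusting the result; the ·^𝔄 table is omitted because the term does not use it.

```python
O, T, P, B = "Oberon", "Titania", "Puck", "Bottom"

S = {O: O, T: B, P: T, B: T}                     # the S^A table
plus = {O: {O: P, T: P, P: P, B: T},             # +^A, indexed by (left, right)
        T: {O: P, T: B, P: O, B: T},
        P: {O: B, T: T, P: B, B: T},
        B: {O: B, T: B, P: B, B: O}}
exp  = {O: {O: P, T: P, P: O, B: O},             # E^A
        T: {O: T, T: T, P: T, B: T},
        P: {O: T, T: B, P: O, B: P},
        B: {O: B, T: P, P: T, B: P}}
less = {O: {O: True,  T: False, P: True,  B: True},   # <^A
        T: {O: False, T: False, P: True,  B: False},
        P: {O: True,  T: True,  P: True,  B: True},
        B: {O: False, T: False, P: True,  B: False}}

zero = B                       # the constant 0 is interpreted as Bottom

# Evaluate SS0 + SS0, then SSSSS0E0 + S0, then compare them with <.
ss0 = S[S[zero]]                               # Bottom
lhs = plus[ss0][ss0]                           # Bottom + Bottom
sssss0 = S[S[S[S[S[zero]]]]]                   # Titania
rhs = plus[exp[sssss0][zero]][S[zero]]         # (Titania E Bottom) + Titania
print(lhs, rhs, less[lhs][rhs])
```

Under this transcription SS0 + SS0 names Oberon, SSSSS0E0 + S0 names Bottom, and Oberon <^𝔄 Bottom holds, so the sentence comes out true in 𝔄.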


Example 1.6.3. We work in a language with one constant symbol, £, and one unary function symbol, X. So, to define a model 𝔄, all we need to do is specify a universe, an element of the universe, and a function X^𝔄. Suppose that we let the universe be the collection of all finite strings of 0 or more capital letters from the Roman alphabet. So A includes such strings as: BABY, LOGICISBETTERTHANSIX, ε (the empty string), and DLKFDFAHADS. The constant symbol £ will be interpreted as the string POTITION, and the function X^𝔄 is the function that adds an X to the beginning of a string. So X^𝔄(YLOPHONE) = XYLOPHONE. Convince yourself that this is a valid, if somewhat odd, ℒ-structure.

To try to be clear about things, notice that we have X, the function symbol, which is an element of the language ℒ. Then there is X, the string of exactly one capital letter of the Roman alphabet, which is one of the elements of the universe. (Did you notice the change in typeface without my pointing it out? You may have a future in publishing!)

Let us look at one of the terms of the language: X£. In our particular ℒ-structure 𝔄 we will interpret this as

X^𝔄(£^𝔄) = X^𝔄(POTITION) = XPOTITION.

In a different structure, 𝔅, it is entirely possible that the interpretation of the term X£ will be HUNNY or AARDVARK or 3π/17. Without knowing the structure, without knowing how to interpret the symbols of the language, we cannot begin to know what object is referred to by a term.

Chaff: All of this stuff about interpreting terms in a structure will be made formal in the next section, so don't panic if it doesn't all make sense right now.

What makes this example confusing, as well as important, is that the function symbol is part of the language and (modulo a superscript and a change in typeface) the function acts on the elements of the structure in the same way that the function symbol is used in creating ℒ-formulas.
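The interpretation of the function symbol X in Example 1.6.3 is a purely syntactic operation, which makes it trivial to simulate. A sketch (Python; the function name X_interp is ours):

```python
# Interpretations in the string structure of Example 1.6.3.
pound = "POTITION"              # the constant symbol is interpreted as POTITION

def X_interp(w: str) -> str:
    """X^A prepends the single letter X to a string in the universe."""
    return "X" + w

# The term X-pound denotes X^A applied to the interpretation of the constant:
print(X_interp(pound))          # the string XPOTITION
print(X_interp("YLOPHONE"))     # the string XYLOPHONE
```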

Example 1.6.4. Now, let ℒ be {0, f, g, R}, where 0 is a constant symbol, f is a unary function symbol, g is a binary function symbol, and R is a 3-ary relation symbol. We define an ℒ-structure 𝔅 as follows: B, the universe, is the set of all variable-free ℒ-terms. The constant 0^𝔅 is the term 0. The functions f^𝔅 and g^𝔅 are defined as in Example 1.6.3, so if t and s are elements of B (i.e., variable-free terms), then f^𝔅(t) is ft and g^𝔅(t, s) is gts.

Let us look at this in a little more detail. Consider 0, the constant symbol, which is an element of ℒ. Since 0 is a constant symbol, it is a term, so 0 is an element of B, the universe of our structure 𝔅. (Alas, there is no change in typeface to help us out this time.) If we want to see what element of the universe is referred to by the constant symbol 0, we see that 0^𝔅 = 0, so the term 0 refers to the element of the universe 0.

If we look at another term of the language, say, f0, and we try to find the element of the universe that is denoted by this term, we find that it is

f^𝔅(0^𝔅) = f^𝔅(0) = f0.

So the term f0 denotes an element of the universe, and that element of the universe is ... f0. This is pretty confusing, but all that is going on is that the elements of the universe are the syntactic objects of the language.

This sort of structure is called a Henkin structure, after Leon Henkin, who introduced them in his Ph.D. dissertation in 1949. These structures will be crucial in our proof of the Completeness Theorem in Chapter 3. The proof of that theorem will involve the construction of a particular mathematical structure, and the structure that we will build will be a Henkin structure.

To finish building our structure 𝔅, we have to define a relation R^𝔅. As R is a 3-ary relation symbol, R^𝔅 is a subset of B³. We will arbitrarily define

R^𝔅 = {(r, s, t) ∈ B³ | the number of function symbols in r is even}.

This finishes defining the structure 𝔅. The definition of R^𝔅 given is entirely arbitrary. I invite you to come up with a more interesting or more humorous definition on your own.
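Since the universe of 𝔅 consists of terms, every interpretation in this Henkin structure is string manipulation. A sketch (Python; representing variable-free terms as Polish-notation strings is our choice):

```python
# Universe: variable-free terms over {0, f, g}, written in Polish notation.
zero_B = "0"                    # the constant symbol 0 names the term "0"

def f_B(t: str) -> str:
    return "f" + t              # f^B(t) is the term ft

def g_B(t: str, s: str) -> str:
    return "g" + t + s          # g^B(t, s) is the term gts

def R_B(r: str, s: str, t: str) -> bool:
    # (r, s, t) is in R^B iff the number of function symbols in r is even.
    return sum(r.count(c) for c in "fg") % 2 == 0

# The term f0 denotes ... the term f0:
print(f_B(zero_B))
print(R_B("gf00", "0", "0"))    # "gf00" has two function symbols
```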

1.6.1 Exercises

1. Consider the structure constructed in Example 1.6.2. Find the value of each of the following: 0 + 0, 0E0, S0 · SS0. Do you think 0 < 0 in this structure?


2. Suppose that ℒ is the language {0, +, <}. Let's work together to describe an ℒ-structure 𝔄. Let the universe A be the set consisting of all of the natural numbers together with Ingrid Bergman and Humphrey Bogart. You decide on the interpretations of the symbols. What is the value of 5 + Ingrid? Is Bogie < 0?

3. Here is a language consisting of one constant symbol, one 3-ary function symbol, and one binary relation symbol: ℒ is {c, f, P}. Describe an ℒ-model that has as its universe ℝ, the set of real numbers. Describe another ℒ-model that has a finite universe.

4. Write a short paragraph explaining the difference between a language and a structure for a language.

5. Suppose that 𝔄 and 𝔅 are two ℒ-structures. We will say that 𝔄 and 𝔅 are isomorphic and write 𝔄 ≅ 𝔅 if there is a bijection i : A → B such that for each constant symbol c of ℒ, i(c^𝔄) = c^𝔅; for each n-ary function symbol f and for each a1, ..., an ∈ A, i(f^𝔄(a1, ..., an)) = f^𝔅(i(a1), ..., i(an)); and for each n-ary relation symbol R in ℒ, (a1, ..., an) ∈ R^𝔄 if and only if (i(a1), ..., i(an)) ∈ R^𝔅. The function i is called an isomorphism.

(a) Show that ≅ is an equivalence relation. [Suggestion: This means that you must show that the relation ≅ is reflexive, symmetric, and transitive. To show that ≅ is reflexive, you must show that for any structure 𝔄, 𝔄 ≅ 𝔄, which means that you must find an isomorphism, a function, mapping A to A that satisfies the conditions above. So the first line of your proof should be, "Consider this function, with domain A and codomain A: i(x) = something brilliant." Then show that your function i is an isomorphism. Then show, if 𝔄 ≅ 𝔅, then 𝔅 ≅ 𝔄. Then tackle transitivity. In each case, you must define a particular function and show that your function is an isomorphism.]

(b) Find a new structure that is isomorphic to the structure given in Example 1.6.2. Prove that the structures are isomorphic.

(c) Find two different structures for a particular language and prove that they are not isomorphic.


(d) Find two different structures for a particular language such that the structures have the same number of elements in their universes but they are still not isomorphic. Prove they are not isomorphic.

6. Take the language of Example 1.6.4 and let C be the set of all ℒ-terms. Create an ℒ-structure ℭ by using this universe in such a way that the interpretation of a term t is not equal to t.

7. If we take the language ℒ_NT, we can create a Henkin structure for that language in the same way as in Example 1.6.4. Do so. Consider the ℒ_NT-formula S0 + S0 = SS0. Is this formula "true" (whatever that means) in your structure? Justify your answer.

1.7 Truth in a Structure

It is at last time to tie together the syntax and the semantics. We have some formal rules about what constitutes a language, and we can identify the terms, formulas, and sentences of a language. We can also identify ℒ-structures for a given language ℒ. In this section we will decide what it means to say that an ℒ-formula φ is true in an ℒ-structure 𝔄.

To begin the process of tying together the symbols with the structures, we will introduce assignment functions. These assignment functions will formalize what it means to interpret a term or a formula in a structure.

Definition 1.7.1. If 𝔄 is an ℒ-structure, a variable assignment function into 𝔄 is a function s that assigns to each variable an element of the universe A. So a variable assignment function into 𝔄 is any function with domain Vars and codomain A.

Variable assignment functions need not be injective or bijective. For example, if we work with ℒ_NT and the standard structure 𝔑, then the function s defined by s(vi) = i is a variable assignment function, as is the function s′ defined by

s′(vi) = the smallest prime number that does not divide i.

We will have occasion to want to fix the value of the assignment function s for certain variables.


Definition 1.7.2. If s is a variable assignment function into 𝔄 and x is a variable and a ∈ A, then s[x|a] is the variable assignment function into 𝔄 defined as follows:

s[x|a](v) = s(v) if v is a variable other than x, and s[x|a](x) = a.

We call the function s[x|a] an x-modification of the assignment function s.

So an x-modification of s is just like s, except that the variable x is assigned to a particular element of the universe.

What we will do next is extend a variable assignment function s to a term assignment function, s̄. This function will assign an element of the universe to each term of the language ℒ.

Definition 1.7.3. Suppose that 𝔄 is an ℒ-structure and s is a variable assignment function into 𝔄. The function s̄, called the term assignment function generated by s, is the function with domain consisting of the set of ℒ-terms and codomain A defined recursively as follows:

1. If t is a variable, s̄(t) = s(t).

2. If t is a constant symbol c, then s̄(t) = c^𝔄.

3. If t is ft1t2...tn, then s̄(t) = f^𝔄(s̄(t1), s̄(t2), ..., s̄(tn)).

There is a potential problem that needs to be addressed. It is not immediately clear, given a variable assignment function s, that there is an extension of s to a term assignment function s̄. For example, maybe I can find a term t such that both rules (2) and (3) apply to t, and thus there might be two different possibilities for s̄(t). Fortunately, the unique readability for terms that you proved back in Section 1.3 (Problem 7) provides the assurance that this cannot happen. The formal statement of this fact, the Recursion Theorem, would take us too far afield at this point. The interested reader can find the theorem and its proof in [Enderton 72].

Although we will be primarily interested in truth of sentences, we will first describe truth (or satisfaction) for arbitrary formulas, relative to an assignment function.
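The recursion in Definition 1.7.3 is exactly a recursive function. Here is a sketch (Python) of s̄ for the standard structure 𝔑, with terms encoded as nested tuples; the encoding and all names are ours, not the book's:

```python
# Terms are either variable names ("v1"), the constant "0",
# or tuples (function_symbol, subterm, ..., subterm).
FUNCS = {"S": lambda a: a + 1,        # the interpretations in the
         "+": lambda a, b: a + b,     # standard structure N
         "*": lambda a, b: a * b,
         "E": lambda a, b: a ** b}

def term_value(t, s):
    """The term assignment function generated by s (Definition 1.7.3)."""
    if isinstance(t, str):
        if t == "0":
            return 0                  # clause 2: a constant symbol
        return s[t]                   # clause 1: a variable
    f, *args = t                      # clause 3: f t1 ... tn
    return FUNCS[f](*(term_value(u, s) for u in args))

# With s(vi) = 2i, the term v1 + v1 evaluates to +^N(2, 2) = 4:
s = {"v1": 2, "v2": 4}
print(term_value(("+", "v1", "v1"), s))
```

Unique readability is what guarantees that the recursion is well defined; in the tuple encoding it is enforced for free, since a tuple cannot also be a string.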


Definition 1.7.4. Suppose that 𝔄 is an ℒ-structure, φ is an ℒ-formula, and s : Vars → A is an assignment function. We will say that 𝔄 satisfies φ with assignment s, and write 𝔄 ⊨ φ[s], in the following circumstances:

1. If φ is = t1t2 and s̄(t1) is the same element of the universe A as s̄(t2), or

2. If φ is Rt1t2...tn and (s̄(t1), s̄(t2), ..., s̄(tn)) ∈ R^𝔄, or

3. If φ is (¬α) and 𝔄 ⊭ α[s] (where ⊭ means "does not satisfy"), or

4. If φ is (α ∨ β) and 𝔄 ⊨ α[s], or 𝔄 ⊨ β[s] (or both), or

5. If φ is (∀x)(α) and, for each element a of A, 𝔄 ⊨ α[s[x|a]].

If Γ is a set of ℒ-formulas, we say that 𝔄 satisfies Γ with assignment s, and write 𝔄 ⊨ Γ[s], if for each γ ∈ Γ, 𝔄 ⊨ γ[s].

Chaff: Notice that the symbol ⊨ is not part of the language ℒ. Rather, ⊨ is a metalinguistic symbol that we use to talk about formulas in the language and structures for the language.

Chaff: Also notice that we have at last tied together the syntax and the semantics of our language! The definition above is the place where we formally put the meanings on the symbols that we will use, so that ∨ means "or" and ∀ means "for all."
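When the universe is finite, Definition 1.7.4 can be run as an algorithm: clause 5 simply loops over every x-modification of s. A sketch (Python; formulas are encoded as tuples, atomic formulas are restricted to plain variables as terms to keep it short, and the toy structure at the bottom is invented for illustration):

```python
def satisfies(A, phi, s):
    """A is (universe, relations); phi is ("=", x, y), ("R", name, x, ...),
    ("not", a), ("or", a, b), or ("forall", x, a); s maps variables into A."""
    universe, rels = A
    op = phi[0]
    if op == "=":                         # clause 1 (terms here are variables)
        return s[phi[1]] == s[phi[2]]
    if op == "R":                         # clause 2
        return tuple(s[t] for t in phi[2:]) in rels[phi[1]]
    if op == "not":                       # clause 3
        return not satisfies(A, phi[1], s)
    if op == "or":                        # clause 4
        return satisfies(A, phi[1], s) or satisfies(A, phi[2], s)
    if op == "forall":                    # clause 5: every x-modification of s
        x, body = phi[1], phi[2]
        return all(satisfies(A, body, {**s, x: a}) for a in universe)
    raise ValueError(op)

# A two-element structure with one binary relation P = {(0, 1)}:
A = ({0, 1}, {"P": {(0, 1)}})
# (forall x)(not P(x, x)), i.e. irreflexivity, holds in this structure:
print(satisfies(A, ("forall", "x", ("not", ("R", "P", "x", "x"))), {}))
```

Of course this loop is only available because the universe is finite; for a structure like 𝔑 the definition is a definition, not an algorithm.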

Example 1.7.5. Let us work with the empty language, so ℒ has no constant symbols, no function symbols, and no relation symbols. So an ℒ-structure is simply a nonempty set, and let us consider the ℒ-structure 𝔄 where A = {Humphrey, Ingrid}. Now think about the formula x = y and look at the assignment function s, where s(x) is Humphrey and s(y) is also Humphrey. If we ask whether 𝔄 ⊨ x = y[s], we have to check whether s̄(x) is the same element of A as s̄(y). Since the two objects are identical, the formula is true.

To emphasize this, the formula x = y can be true in some universes with some assignment functions. Although the variables x and y are distinct, the truth or falsity of the formula depends not on the variables (which are not equal) but rather on which elements of the structure the variables denote, the values of the variables (which are equal for this example). Of course, there are other assignment functions and other structures that make our formula false. I am sure you can think of some.

To talk about the truth or falsity of a sentence in a structure, we will take our definition of satisfaction relative to an assignment function and prove that for sentences, the choice of the assignment function is inconsequential. Then we will say that a sentence σ is true in a structure 𝔄 if and only if 𝔄 ⊨ σ[s] for any (and therefore all) variable assignment functions s.

Chaff: The next couple of proofs are proofs by induction on the complexity of terms or formulas. You may want to reread the proof of Theorem 1.4.2 on page 18 if you find these difficult.

Lemma 1.7.6. Suppose that s1 and s2 are variable assignment functions into a structure 𝔄 such that s1(v) = s2(v) for every variable v in the term t. Then s̄1(t) = s̄2(t).

Proof. We use induction on the complexity of the term t. If t is either a variable or a constant symbol, the result is immediate. If t is ft1t2...tn, then as s̄1(ti) = s̄2(ti) for 1 ≤ i ≤ n by the inductive hypothesis, the definition of s̄1(t) and the definition of s̄2(t) are identical, and thus s̄1(t) = s̄2(t). □

Proposition 1.7.7. Suppose that s1 and s2 are variable assignment functions into a structure 𝔄 such that s1(v) = s2(v) for every free variable v in the formula φ. Then 𝔄 ⊨ φ[s1] if and only if 𝔄 ⊨ φ[s2].

Proof. We use induction on the complexity of φ. If φ is = t1t2, then the free variables of φ are exactly the variables that occur in φ. Thus Lemma 1.7.6 tells us that s̄1(t1) = s̄2(t1) and s̄1(t2) = s̄2(t2), meaning that they are the same elements of the universe A, so 𝔄 ⊨ (= t1t2)[s1] if and only if 𝔄 ⊨ (= t1t2)[s2], as needed.

The other base case, if φ is Rt1t2...tn, is similar and is left as part of Exercise 6.

To begin the first inductive clause, if φ is ¬α, notice that the free variables of φ are exactly the free variables of α, so s1 and s2 agree on the free variables of α. By the inductive hypothesis, 𝔄 ⊨ α[s1] if and only if 𝔄 ⊨ α[s2], and thus (by the definition of satisfaction), 𝔄 ⊨ φ[s1] if and only if 𝔄 ⊨ φ[s2]. The second inductive clause, if φ is α ∨ β, is another part of Exercise 6.

If φ is (∀x)(α), we first note that the only variable that might be free in α that is not free in φ is x. Thus, if a ∈ A, the assignment functions s1[x|a] and s2[x|a] agree on all of the free variables of α. Therefore, by the inductive hypothesis, for each a ∈ A, 𝔄 ⊨ α[s1[x|a]] if and only if 𝔄 ⊨ α[s2[x|a]]. So, by Definition 1.7.4, 𝔄 ⊨ φ[s1] if and only if 𝔄 ⊨ φ[s2]. This finishes the last inductive clause, and our proof. □

Corollary 1.7.8. If σ is a sentence in the language ℒ and 𝔄 is an ℒ-structure, either 𝔄 ⊨ σ[s] for all assignment functions s, or 𝔄 ⊨ σ[s] for no assignment function s.

Proof. There are no free variables in σ, so if s1 and s2 are two assignment functions, they agree on all of the free variables of σ (there just aren't all that many of them). So by Proposition 1.7.7, 𝔄 ⊨ σ[s1] if and only if 𝔄 ⊨ σ[s2], as needed. □

Definition 1.7.9. If φ is a formula in the language ℒ and 𝔄 is an ℒ-structure, we say that 𝔄 is a model of φ, and write 𝔄 ⊨ φ, if and only if 𝔄 ⊨ φ[s] for every assignment function s. If Φ is a set of ℒ-formulas, we will say that 𝔄 models Φ, and write 𝔄 ⊨ Φ, if and only if 𝔄 ⊨ φ for each φ ∈ Φ.

Notice that if σ is a sentence, then 𝔄 ⊨ σ if and only if 𝔄 ⊨ σ[s] for any assignment function s. In this case we will say that the sentence σ is true in 𝔄.

Example 1.7.10. Let's work in ℒ_NT, and let

𝔑 = (N, 0, S, +, ·, E, <)

be the standard structure. Let s be the variable assignment function that assigns vi to the number 2i. Now let the formula φ(v1) be v1 + v1 = SSSS0.

To show that 𝔑 ⊨ φ[s], notice that

s̄(v1 + v1) is +^𝔑(s̄(v1), s̄(v1)) is +^𝔑(2, 2) is 4,

while

s̄(SSSS0) is S^𝔑(S^𝔑(S^𝔑(S^𝔑(0^𝔑)))) is 4.

Now, in the same setting, consider σ, the sentence

(∀v1)¬(∀v2)¬(v1 = v2 + v2),

which states that everything is even. [That is hard to see unless you know to look for that ¬(∀v2)¬ and to read it as (∃v2). See the last couple of paragraphs of this section.] You know that σ is false in the standard structure, but to show how the formal argument goes, let s be any variable assignment function and notice that

𝔑 ⊨ σ[s] iff for every a ∈ N, 𝔑 ⊨ ¬(∀v2)¬(v1 = v2 + v2)[s[v1|a]]
         iff for every a ∈ N, 𝔑 ⊭ (∀v2)¬(v1 = v2 + v2)[s[v1|a]]
         iff for every a ∈ N, there is a b ∈ N such that
             𝔑 ⊨ v1 = v2 + v2[s[v1|a][v2|b]].

Now, if we consider the case when a is the number 3, it is perfectly clear that there is no such b, so we have shown 𝔑 ⊭ σ[s]. Then, by Definition 1.7.9, we see that the sentence σ is false in the standard structure. As you well knew.

When you were introduced to symbolic logic, you were probably told that there were five connectives. In the mathematics that you have learned recently, you have been using two quantifiers. I hope you have noticed that we have not used all of those symbols in this book, but it is now time to make those symbols available. Rather than adding the symbols to our language, however, we will introduce them as abbreviations. This will help us to keep our proofs slightly less complex (as our inductive proofs will have fewer cases) but will still allow us to use the more familiar symbols, at least as shorthand.

Thus, let us agree to use the following abbreviations in constructing ℒ-formulas: We will write (α ∧ β) instead of (¬((¬α) ∨ (¬β))), (α → β) instead of ((¬α) ∨ β), and (α ↔ β) instead of ((α → β) ∧ (β → α)). We will also introduce our missing existential quantifier as an abbreviation, writing (∃x)(α) instead of (¬(∀x)(¬α)). It is an easy exercise to check that the introduced connectives ∧, →, and ↔ behave as you would expect them to. Thus 𝔄 ⊨ (α ∧ β)[s] if and only if both 𝔄 ⊨ α[s] and 𝔄 ⊨ β[s]. The existential quantifier is only slightly more difficult. See Exercise 7.

1.7.1 Exercises

1. I suggested after Definition 1.5.3 that the truth or falsity of the sentences 1 + 1 = 2 and (∀x)(x + 1 = x) might not be automatic. Find a structure for the language discussed there that makes the sentence 1 + 1 = 2 true. Find another structure where 1 + 1 = 2 is false. Prove your assertions. Then show that you can find a structure where (∀x)(x + 1 = x) is true, and another structure where it is false.

2. Let the language ℒ be {S, <}, where S is a unary function symbol and < is a binary relation symbol. Let φ be the formula (∀x)(∃y)(Sx < y).

(a) Find an ℒ-structure 𝔄 such that 𝔄 ⊨ φ.

(b) Find an ℒ-structure 𝔅 such that 𝔅 ⊭ φ.

(c) Prove that your answer to part (a) or part (b) is correct.

(d) Write an ℒ-sentence that is true in a structure 𝔄 if and only if the universe A of 𝔄 consists of exactly two elements.

3. Consider the language and structure of Example 1.6.4. Write two nontrivial sentences in the language, one of which is true in the structure and one of which (not the denial of the first) is false in the structure. Justify your assertions.

4. Consider the sentence σ: (∀x)(∃y)[x < y → x + 1 = y]. Find two structures for a suitable language, one of which makes σ true, and the other of which makes σ false.

5. One more bit of shorthand. Assume that the language ℒ contains the binary relation symbol ∈, which you are intending to use to mean the elementhood relation (so p ∈ q will mean that p is an element of q). Often, it is the case that you want to claim that φ(x) is true for every element of a set b. Of course, to do this you could write

(∀x)[(x ∈ b) → φ(x)].

We will abbreviate this formula as

(∀x ∈ b)(φ(x)).

Similarly, (∃x ∈ b)(φ(x)) will be an abbreviation for the formula (∃x)[(x ∈ b) ∧ φ(x)]. Notice that this formula has a conjunction where the previous formula had an implication! We do that just to see if you are paying attention. (Well, if you think about what the abbreviations are supposed to mean, you'll see that the change is necessary. We'll have to do something else just to see if you're paying attention.) Now suppose that 𝔄 is a structure for the language of set theory. So ℒ has only this one binary relation symbol, ∈, which is interpreted as the elementhood relation. Suppose, in addition, that A = {u, v, w, {u}, {u, v}, {u, v, w}}. In particular, notice that there is no element x of A such that x ∈ x. Consider the sentence

(∀y ∈ y)(∃x ∈ x)(x = y).

Is this sentence true or false in 𝔄?

6. Fill in the details to complete the proof of Proposition 1.7.7.

7. Show that 𝔄 ⊨ (∃x)(α)[s] if and only if there is an element a ∈ A such that 𝔄 ⊨ α[s[x|a]].

1.8 Substitutions and Substitutability

Suppose you knew that the sentence ∀xφ(x) was true in a particular structure 𝔄. Then, if c is a constant symbol in the language, you would certainly expect φ(c) to be true in 𝔄 as well. What we have done is substitute the constant symbol c for the variable x. This seems perfectly reasonable, although there are times when you do have to be careful.

Suppose that 𝔄 ⊨ ∀x∃y¬(x = y). This sentence is, in fact, true in any structure 𝔄 such that A has at least two elements. If we then proceed to replace the variable x by the variable u, we get the statement ∃y¬(u = y), which will still be true in 𝔄, no matter what value we give to the variable u. If, however, we take our original formula and replace x by y, then we find ourselves looking at ∃y¬(y = y), which will be false in any structure. So by a poor choice of substituting variable, we have changed the truth value of our formula. The rules of substitutability that we will discuss in this section are designed to help us avoid this problem, the problem of attempting to substitute a term inside a quantifier that binds a variable involved in the term.

We begin by defining exactly what we mean when we substitute a term t for a variable x in either a term u or a formula φ.

Definition 1.8.1. Suppose that u is a term, x is a variable, and t is a term. We define the term u^x_t (read "u with x replaced by t") as follows:

1. If u is a variable not equal to x, then u^x_t is u.

2. If u is x, then u^x_t is t.

3. If u is a constant symbol, then u^x_t is u.

4. If u is fu1u2...un, where f is an n-ary function symbol and the ui are terms, then u^x_t is f(u1)^x_t (u2)^x_t ... (un)^x_t.

Chaff: In the fourth clause of the definition above and in the first two clauses of the next definition, the parentheses are not really there. However, I personally believe that no one can look at fu1^x_t u2^x_t ... un^x_t and figure out what it is supposed to mean. So the parentheses have been added in the interest of readability.

For example, if we let t be g(c) and we let u be f(x, y) + h(z, x, g(x)), then u^x_t is

f(g(c), y) + h(z, g(c), g(g(c))).

The definition of substitution into a formula is also by recursion:

Definition 1.8.2. Suppose that φ is an ℒ-formula, t is a term, and x is a variable. We define the formula φ^x_t (read "φ with x replaced by t") as follows:

1. If φ is = u1u2, then φ^x_t is = (u1)^x_t (u2)^x_t.


2. If φ is Ru1u2...un, then φ^x_t is R(u1)^x_t (u2)^x_t ... (un)^x_t.

3. If φ is ¬(α), then φ^x_t is ¬(α^x_t).

4. If φ is (α ∨ β), then φ^x_t is (α^x_t ∨ β^x_t).

5. If φ is (∀y)(α), then φ^x_t is φ if x is y, and (∀y)(α^x_t) otherwise.

As an example, suppose that φ is the formula

P(x, y) → [(∀x)(Q(g(x), z)) ∨ (∀y)(R(x, g(x)))].

Then, if t is the term g(c), we get

φ^x_t is P(g(c), y) → [(∀x)(Q(g(x), z)) ∨ (∀y)(R(g(c), g(g(c))))].

Having defined what we mean when we substitute a term for a variable, we will now define what it means for a term to be substitutable for a variable in a formula. The idea is that if t is substitutable for x in φ, we will not run into the problems discussed at the beginning of this section: we will not substitute a term in such a way that a variable contained in that term is inadvertently bound by a quantifier.

Definition 1.8.3. Suppose that φ is an ℒ-formula, t is a term, and x is a variable. We say that t is substitutable for x in φ if

1. φ is atomic, or

2. φ is ¬(α) and t is substitutable for x in α, or

3. φ is (α ∨ β) and t is substitutable for x in both α and β, or

4. φ is (∀y)(α) and either

(a) x is not free in φ, or

(b) y does not occur in t and t is substitutable for x in α.

Notice that φ^x_t is defined whether or not t is substitutable for x in φ. Usually, we will not want to do a substitution unless we check for substitutability, but we have the ability to substitute whether or not it is a good idea. In the next chapter, however, you will often see that certain operations are allowed only if t is substitutable for x in φ. That restriction is there for good reason, as we will be concerned with preserving the truth of formulas after performing substitutions.
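Definitions 1.8.1, 1.8.2, and 1.8.3 are all recursions, so both substitution and the substitutability check are short recursive functions; this is essentially Exercise 7 below. A sketch under our own tuple encoding (variables are leaf strings, constants are 0-ary tuples like ("0",); none of these names come from the text):

```python
def term_sub(u, x, t):
    """u with x replaced by t (Definition 1.8.1)."""
    if isinstance(u, str):
        return t if u == x else u              # clauses 1 and 2
    return (u[0],) + tuple(term_sub(ui, x, t) for ui in u[1:])

def term_vars(u):
    """The set of variables occurring in a term."""
    if isinstance(u, str):
        return {u}
    return set().union(*[term_vars(ui) for ui in u[1:]]) if len(u) > 1 else set()

def formula_sub(phi, x, t):
    """phi with x replaced by t (Definition 1.8.2)."""
    op = phi[0]
    if op in ("=", "R"):
        return (op,) + tuple(term_sub(u, x, t) for u in phi[1:])
    if op == "not":
        return ("not", formula_sub(phi[1], x, t))
    if op == "or":
        return ("or", formula_sub(phi[1], x, t), formula_sub(phi[2], x, t))
    if op == "forall":                         # clause 5: stop if x is bound here
        y, body = phi[1], phi[2]
        return phi if y == x else ("forall", y, formula_sub(body, x, t))

def free_in(x, phi):
    """Is x free in phi?"""
    op = phi[0]
    if op in ("=", "R"):
        return any(x in term_vars(u) for u in phi[1:])
    if op == "not":
        return free_in(x, phi[1])
    if op == "or":
        return free_in(x, phi[1]) or free_in(x, phi[2])
    return phi[1] != x and free_in(x, phi[2])  # forall

def substitutable(t, x, phi):
    """Is t substitutable for x in phi (Definition 1.8.3)?"""
    op = phi[0]
    if op in ("=", "R"):
        return True                            # atomic
    if op == "not":
        return substitutable(t, x, phi[1])
    if op == "or":
        return substitutable(t, x, phi[1]) and substitutable(t, x, phi[2])
    y, body = phi[1], phi[2]                   # forall y
    return not free_in(x, phi) or (y not in term_vars(t) and substitutable(t, x, body))
```

With φ the formula (∀y)(x = y), the term Sy is not substitutable for x (the quantifier would capture y), while S0 is; compare Exercise 2 below.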


1.8.1 Exercises

1. For each of the following, write out u^x_t:

(a) u is cos x, t is sin y.

(b) u is y, t is Sy.

(c) u is h(x, y, z), t is 423 − w.

2. For each of the following, first write out φ^x_t, then decide if t is substitutable for x in φ, and then (if you haven't already) use the definition of substitutability to justify your conclusions.

(a) φ is ∀x(x = y → Sx = Sy), t is S0.

(b) φ is ∀y(x = y → Sx = Sy), t is Sy.

(c) φ is x = y → (∀x)(Sx = Sy), t is Sy.

3. Show that if t is variable-free, then t is always substitutable for x in φ.

4. Show that x is always substitutable for x in φ.

5. Prove that if x is not free in ψ, then ψ^x_t is ψ.

6. You might think that (φ^x_y)^y_x is φ, but a moment's thought will give you an example to show that this doesn't always work. (What if y is free in φ?) Find an example that shows that even if y is not free in φ, we can still have (φ^x_y)^y_x different from φ. Under what conditions do we know that (φ^x_y)^y_x is φ?

7. Write a computer program (in your favorite language, or in pseudocode) that accepts as input a formula φ, a variable x, and a term t and outputs "yes" or "no" depending on whether or not t is substitutable for x in φ.

1.9 Logical Implication

At first glance it seems that a large portion of mathematics can be broken down into answering questions of the form: If I know this statement is true, is it necessarily the case that this other statement is true? In this section we will formalize that question.


Definition 1.9.1. Suppose that Δ and Γ are sets of ℒ-formulas. We will say that Δ logically implies Γ and write Δ ⊨ Γ if for every ℒ-structure 𝔄, if 𝔄 ⊨ Δ, then 𝔄 ⊨ Γ.

This definition is a little bit tricky. It says that if Δ is true in 𝔄, then Γ is true in 𝔄. Remember, for Δ to be true in 𝔄, it must be the case that 𝔄 ⊨ Δ[s] for every assignment function s. See Exercise 4.

If Γ = {γ} is a set consisting of a single formula, we will write Δ ⊨ γ rather than the official Δ ⊨ {γ}.

Definition 1.9.2. An ℒ-formula φ is said to be valid if ∅ ⊨ φ, in other words, if φ is true in every ℒ-structure with every assignment function s. In this case, we will write ⊨ φ.

Chaff: It doesn't seem like it would be easy to check whether Δ ⊨ Γ. To do so directly would mean that we would have to examine every possible ℒ-structure and every possible assignment function s, of which there will be many.

I'm also sure that you've noticed that this double turnstile symbol, ⊨, is getting a lot of use. Just remember that if there is a structure on the left, 𝔄 ⊨ σ, we are discussing truth in a single structure. If there is a set of sentences on the left, Γ ⊨ σ, then we are discussing logical implication.
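The Chaff is right that Definition 1.9.1 quantifies over all ℒ-structures, which cannot be checked directly. For a fixed finite universe, though, a candidate for validity can at least be tested exhaustively. A sketch (Python, encoding ours) that checks the conditional of Example 1.9.3 in all 16 structures with universe {0, 1}, one for each interpretation of the binary relation symbol P:

```python
from itertools import chain, combinations

U = [0, 1]
pairs = [(a, b) for a in U for b in U]

def powerset(xs):
    """All subsets of xs; each subset is a candidate interpretation P^A."""
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def antecedent(P):
    # There is a y such that for every x, P(x, y) holds.
    return any(all((x, y) in P for x in U) for y in U)

def consequent(P):
    # For every x there is a y with P(x, y).
    return all(any((x, y) in P for y in U) for x in U)

# The conditional (antecedent -> consequent) in every 2-element structure:
print(all((not antecedent(set(P))) or consequent(set(P)) for P in powerset(pairs)))
```

Passing on every structure of a fixed finite size is evidence, not a proof of validity; the example that follows gives the real argument, which works for arbitrary structures.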

Example 1.9.3. Let ℒ be the language consisting of a single binary relation symbol, P, and let σ be the sentence (∃y∀xP(x, y)) → (∀x∃yP(x, y)). We show that σ is valid.

So let 𝔄 be any ℒ-structure and let s : Vars → A be any assignment function. We must show that

𝔄 ⊨ (∃y∀xP(x, y)) → (∀x∃yP(x, y))[s].

Assume that 𝔄 ⊨ (∃y∀xP(x, y))[s]. (If 𝔄 does not model this sentence, then we know by the definition of → that 𝔄 ⊨ σ[s].)

Since we know that 𝔄 ⊨ (∃y∀xP(x, y))[s], we know that there is an element of the universe, a, such that 𝔄 ⊨ ∀xP(x, y)[s[y|a]]. And so, again by the definition of satisfaction, we know that if b is any element of A, 𝔄 ⊨ P(x, y)[(s[y|a])[x|b]]. If we chase through the definition of satisfaction (Definition 1.7.4) and of the various

42 Chapter 1. Structures and Languages

1.8.1 Exercises

1. For each of the following, write out uf:

(a) u is cos x, t is sin y.
(b) u is y, t is Sy.
(c) u is g(x, y, z), t is 423 - w.

2. For each of the following, first write out φ^x_t, then decide if t is substitutable for x in φ, and then (if you haven't already) use the definition of substitutability to justify your conclusions.

(a) φ is ∀x(x = y → Sx = Sy), t is S0.
(b) φ is ∀y(x = y → Sx = Sy), t is Sy.
(c) φ is x = y → (∀x)(Sx = Sy), t is Sy.

3. Show that if t is variable-free, then t is always substitutable for x in φ.

4. Show that x is always substitutable for x in φ.

5. Prove that if x is not free in ψ, then ψ^x_t is ψ.

6. You might think that (φ^x_y)^y_x is φ, but a moment's thought will give you an example to show that this doesn't always work. (What if y is free in φ?) Find an example that shows that even if y is not free in φ, we can still have (φ^x_y)^y_x different from φ. Under what conditions do we know that (φ^x_y)^y_x is φ?

7. Write a computer program (in your favorite language, or in pseudocode) that accepts as input a formula φ, a variable x, and a term t and outputs "yes" or "no" depending on whether or not t is substitutable for x in φ.

1.9 Logical Implication

At first glance it seems that a large portion of mathematics can be broken down into answering questions of the form: If I know this statement is true, is it necessarily the case that this other statement is true? In this section we will formalize that question.


Definition 1.9.1. Suppose that Δ and Γ are sets of ℒ-formulas. We will say that Δ logically implies Γ, and write Δ ⊨ Γ, if for every ℒ-structure 𝔄, if 𝔄 ⊨ Δ, then 𝔄 ⊨ Γ.

This definition is a little bit tricky. It says that if Δ is true in 𝔄, then Γ is true in 𝔄. Remember, for Δ to be true in 𝔄, it must be the case that 𝔄 ⊨ Δ[s] for every assignment function s. See Exercise 4.

If Γ = {γ} is a set consisting of a single formula, we will write Δ ⊨ γ rather than the official Δ ⊨ {γ}.

Definition 1.9.2. An ℒ-formula φ is said to be valid if ∅ ⊨ φ, in other words, if φ is true in every ℒ-structure with every assignment function s. In this case, we will write ⊨ φ.

Chaff: It doesn't seem like it would be easy to check whether Δ ⊨ Γ. To do so directly would mean that we would have to examine every possible ℒ-structure and every possible assignment function s, of which there will be many.

I'm also sure that you've noticed that this double turnstile symbol, ⊨, is getting a lot of use. Just remember that if there is a structure on the left, 𝔄 ⊨ σ, we are discussing truth in a single structure. If there is a set of sentences on the left, Γ ⊨ σ, then we are discussing logical implication.

Example 1.9.3. Let ℒ be the language consisting of a single binary relation symbol, P, and let σ be the sentence (∃y∀xP(x, y)) → (∀x∃yP(x, y)). We show that σ is valid.

So let 𝔄 be any ℒ-structure and let s : Vars → A be any assignment function. We must show that

𝔄 ⊨ ((∃y∀xP(x, y)) → (∀x∃yP(x, y)))[s].

Assume that 𝔄 ⊨ (∃y∀xP(x, y))[s]. (If 𝔄 does not model this sentence, then we know by the definition of → that 𝔄 ⊨ σ[s].)

Since we know that 𝔄 ⊨ (∃y∀xP(x, y))[s], we know that there is an element of the universe, a, such that 𝔄 ⊨ ∀xP(x, y)[s[y|a]]. And so, again by the definition of satisfaction, we know that if b is any element of A, 𝔄 ⊨ P(x, y)[(s[y|a])[x|b]]. If we chase through the definition of satisfaction (Definition 1.7.4) and of the various


assignment functions, this means that for our one fixed a, the ordered pair (b, a) ∈ P^𝔄 for any choice of b ∈ A.

We have to prove that 𝔄 ⊨ (∀x∃yP(x, y))[s]. As the statement of interest is universal, we must show that, if c is an arbitrary element of A, 𝔄 ⊨ ∃yP(x, y)[s[x|c]], which means that we must produce an element of the universe, d, such that 𝔄 ⊨ P(x, y)[(s[x|c])[y|d]]. Again, from the definition of satisfaction this means that we must find a d ∈ A such that (c, d) ∈ P^𝔄. Fortunately, we have such a d in hand, namely a. As we know (c, a) ∈ P^𝔄, we have shown 𝔄 ⊨ (∀x∃yP(x, y))[s], and we are finished.
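Validity quantifies over every ℒ-structure, so no program can check it outright; still, for a sentence like σ we can at least confirm that no counterexample exists among small finite structures. The following sketch (Python; the helper name check_sigma and the restriction to universes {0, ..., n−1} are illustrative assumptions, not anything from the text) enumerates every binary relation P on a small universe:

```python
from itertools import product

def check_sigma(n):
    """Check (∃y∀xP(x,y)) → (∀x∃yP(x,y)) in every structure whose
    universe is {0, ..., n-1}, letting P range over all binary relations."""
    universe = list(range(n))
    pairs = [(a, b) for a in universe for b in universe]
    for bits in product([False, True], repeat=len(pairs)):
        P = {pair for pair, bit in zip(pairs, bits) if bit}
        # ∃y∀x P(x, y): some single y is related to by every x
        antecedent = any(all((x, y) in P for x in universe) for y in universe)
        # ∀x∃y P(x, y): every x is related to some y
        consequent = all(any((x, y) in P for y in universe) for x in universe)
        if antecedent and not consequent:
            return False  # a counterexample structure was found
    return True

assert all(check_sigma(n) for n in (1, 2, 3))
```

Of course this is only a sanity check over finitely many structures; the argument in the example is what actually establishes validity.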


1.9.1 Exercises

1. Show that {α, α → β} ⊨ β for any formulas α and β. Translate this result into everyday English. Or Swedish, if you prefer.

2. Show that the formula x = x is valid. Show that the formula x = y is not valid. What can you prove about the formula ¬x = y in terms of validity?

3. Suppose that φ is an ℒ-formula and x is a variable. Prove that φ is valid if and only if (∀x)(φ) is valid. Thus, if φ has free variables x, y, and z, φ will be valid if and only if ∀x∀y∀zφ is valid. The sentence ∀x∀y∀zφ is called the universal closure of φ.

4. (a) Assume that ⊨ (φ → ψ). Show that φ ⊨ ψ.
(b) Suppose that φ is x < y and ψ is z < w. Show that φ ⊨ ψ but ⊭ (φ → ψ). (The slash through ⊨ means "does not logically imply.")

[This exercise shows that the two possible ways to define logical equivalence are not equivalent. The strong form of the definition says that φ and ψ are logically equivalent if ⊨ (φ → ψ) and ⊨ (ψ → φ). The weak form of the definition states that φ and ψ are logically equivalent if φ ⊨ ψ and ψ ⊨ φ.]

1.10 Summing Up, Looking Ahead

What we have tried to do in this first chapter is to introduce the concepts of formal languages and formal structures. I hope that you will agree that you have seen many mathematical structures


in the past, even though you may not have called them structures at the time. By formalizing what we mean when we say that a formula is true in a structure, we will be able to tie together truth and provability in the next couple of chapters.

You might be at a point where you are about to throw your hands up in disgust and say, "Why does any of this matter? I've been doing mathematics for over ten years without worrying about structures or assignment functions, and I have been able to solve problems and succeed as a mathematician so far." Allow me to assure you that the effort and the almost unreasonable precision that we are imposing on our exposition will have a payoff in later chapters. The major theorems that we wish to prove are theorems about the existence or nonexistence of certain objects. To prove that you cannot express a certain idea in a certain language, we have to know, with an amazing amount of exactitude, what a language is and what structures are. Our goals are some theorems that are easy to state incorrectly, so by being precise about what we are saying, we will be able to make (and prove) claims that are truly revolutionary.

Since we will be talking about the existence and nonexistence of proofs, we now must turn our attention to defining (yes, precisely) what sorts of things qualify as proofs. That is the topic of the next chapter.


Chapter 2

Deductions

2.1 Naively

What is it that makes mathematics different from other academic subjects? What is it that distinguishes a mathematician from a poet, a linguist, a biologist, or a civil engineer? I am sure that you have many answers to that question, not all of which are complimentary to the author of this work or to the mathematics instructors that you have known!

I suggest that one of the things that sets mathematics apart is the insistence upon proof. Mathematical statements are not accepted as true until they have been verified, and verified in a very particular manner. This process of verification is central to the subject and serves to define our field of study in the minds of many. Allow me to quote a famous story from John Aubrey's Brief Lives:

[Thomas Hobbes] was 40 years old before he looked on Geometry; which happened accidentally. Being in a Gentleman's Library, Euclid's Elements lay open and 'twas the 47 El. libri I [the Pythagorean Theorem]. He read the Proposition. By G—, sayd he (he would now and then sweare an emphaticall Oath by way of emphasis) this is impossible! So he reads the Demonstration of it, which referred him back to such a Proposition; which proposition he read. That referred him back to another, which he also read. Et sic deinceps [and so on] that at last he was demonstratively convinced of that trueth. This made him in love with Geometry.



Doesn't this match pretty well with your image of a mathematical proof? To prove a proposition, you start from some first principles, or axioms, derive some results from those axioms, then, using those axioms and results, push on to prove other results. This is a technique that you have seen in geometry courses, college mathematics courses, and in the first chapter of this book.

Our goal in this chapter will be to define, precisely, something called a deduction. You probably haven't seen a deduction before, and you aren't going to see very many of them after this chapter is over, but our idea will be that any mathematical proof should be able to be translated into a (probably very long) deduction. This will be crucial in our interpretation of the results of Chapters 3 and 5, where we will discuss the existence and nonexistence of certain deductions, and interpret those results as making claims about the existence and nonexistence of mathematical proofs.

If you think about what a proof is, you probably will come up with a characterization along the lines of: A proof is a sequence of statements, each one of which can be justified by referring to previous statements. This is a perfectly reasonable starting point, and it brings us to the main difficulty we will have to address as we move from an informal understanding of what constitutes a proof to a formal definition of a deduction: What do you mean by the word justified?

Our answer to this question will come in three parts. We will start by specifying a set Λ of ℒ-formulas, which will be called the logical axioms. Logical axioms will be deemed to be "justified" in any deduction. Depending on the situation at hand, we will then specify a set of nonlogical axioms, Σ. Finally, we will develop some rules of inference, which will be pairs (Γ, φ), where Γ is a finite set of formulas and φ is a formula. Then, if α is a formula, we will say that a deduction of α from Σ is a finite list of formulas φ₁, φ₂, …, φₙ such that φₙ is α and for each i, φᵢ is justified by virtue of being either a logical axiom (φᵢ ∈ Λ), a nonlogical axiom (φᵢ ∈ Σ), or the conclusion of one of our rules of inference (Γ, φᵢ), where Γ ⊆ {φ₁, φ₂, …, φᵢ₋₁}.

The proofs that you have seen in your mathematical career have had a couple of nice properties. The first of these is that proofs are easy to follow. (OK, they aren't always easy to follow, but they are supposed to be.) This doesn't mean that it is easy to discover a proof, but rather that if someone is showing you a proof, it should be easy to follow the steps of the proof and to understand why the


proof is correct. The second admirable property of proofs is that when you prove something, you know that it is true! Our definition of deduction will be designed to make sure that deductions, too, will be easily checkable and will preserve truth.

In order to do this, we will impose the following restrictions on our logical axioms and rules of inference:

1. There will be an algorithm (i.e., a mechanical procedure) that will decide, given a formula θ, whether or not θ is a logical axiom.

2. There will be an algorithm that will decide, given a finite set of formulas Γ and a formula θ, whether or not (Γ, θ) is a rule of inference.

3. For each rule of inference (Γ, θ), Γ will be a finite set of formulas.

4. Each logical axiom will be valid.

5. Our rules of inference will preserve truth. In other words, for each rule of inference (Γ, θ), Γ ⊨ θ.

The idea here is that although it may require no end of brilliance and insight to discover a deduction of a formula α, there should be no brilliance and no insight required to check whether an alleged deduction of α is, in fact, a deduction of α. To check whether a deduction is correct will be such a simple procedure that it could be programmed into a computer. Furthermore, if a deduction of α from Σ is given, and if we look at a mathematical structure 𝔄 such that 𝔄 ⊨ Σ, then we will be certain that 𝔄 ⊨ α. This is what we mean when we say that our deductions will preserve truth.

2.2 Deductions

We begin by fixing a language ℒ. Also assume that we have been given a fixed set of ℒ-formulas, Λ, called the set of logical axioms, and a set of ordered pairs (Γ, φ), called the rules of inference. (We will specify which formulas are elements of Λ and which ordered pairs are rules of inference in the next two sections.) A deduction is going to be a finite sequence, or list, of ℒ-formulas with certain properties.


Definition 2.2.1. Suppose that Σ is a collection of ℒ-formulas and D is a finite sequence (φ₁, φ₂, …, φₙ) of ℒ-formulas. We will say that D is a deduction from Σ if for each i, 1 ≤ i ≤ n, either

1. φᵢ ∈ Λ (φᵢ is a logical axiom), or

2. φᵢ ∈ Σ (φᵢ is a nonlogical axiom), or

3. There is a rule of inference (Γ, φᵢ) such that Γ ⊆ {φ₁, φ₂, …, φᵢ₋₁}.

If there is a deduction from Σ, the last line of which is the formula φ, we will call this a deduction from Σ of φ, and write Σ ⊢ φ.

Chaff: Well, we have now established what we mean by the word justified. In a deduction we are allowed to write down any ℒ-formula that we like, as long as that formula is either a logical axiom or is listed explicitly in a collection Σ of nonlogical axioms. Any formula that we write in a deduction that is not an axiom must arise from previous formulas in the deduction via a rule of inference.

You may have gathered that there are many different deductive systems, depending on the choices that are made for Λ and the rules of inference. As a general rule, a deductive system will either have lots of rules of inference and few logical axioms, or not too many rules and a lot of axioms. In developing the deductive system for us to use in this book, we attempt to pursue a middle course.

Also notice that ⊢ is another metalinguistic symbol. It is not part of the language ℒ.

Example 2.2.2. Suppose, for starters, that we don't want to make any assumptions. So, let Σ = ∅, let Λ = ∅, and write down a deduction from Σ. Don't be shy. Go ahead. I'll wait.

Still nothing? Right. There are no deductions from the empty set of axioms. (Actually, after we set up our rules of inference, there will


be some deductions from the empty set of axioms, but that comes later.) This is a problem that the English logician Bertrand Russell found particularly annoying as he began to learn mathematics.

At the age of eleven, I began Euclid, with my brother as my tutor. This was one of the great events of my life, as dazzling as first love. I had not imagined that there was anything so delicious in the world. ...From that moment until Whitehead and I finished Principia Mathematica, when I was thirty-eight, mathematics was my chief interest, and my chief source of happiness. Like all happiness, however, it was not unalloyed. I had been told that Euclid proved things, and was much disappointed that he started with axioms. At first I refused to accept them unless my brother could offer me some reason for doing so, but he said: "If you don't accept them we cannot go on," and as I wished to go on, I reluctantly admitted them pro tem. The doubt as to the premisses of mathematics which I felt at that moment remained with me, and determined the course of my subsequent work. [Russell 67, p. 36]

What we have managed to do with our definition of deduction, though, is to be up front about our need to make assumptions, and we will acknowledge our axiom set in every deduction that we write.

Example 2.2.3. Let us work in the language ℒ = {P}, where P is a binary relation symbol. Let our set of axioms, Σ, be

Σ = {∀xP(x, x), P(u, v), P(u, v) → P(v, u), P(v, u) → P(u, u)}.

We will let Λ = ∅ for now. We also need to have a set of rules of inference. So temporarily let our set of rules of inference be

{({α, α → β}, β) | α and β are formulas of ℒ}.

This is just the rule modus ponens, which says that from the formulas α and α → β we may conclude β.


Now we can write a deduction from Σ of the formula P(u, u), as follows:

P(u, v)
P(u, v) → P(v, u)
P(v, u)
P(v, u) → P(u, u)
P(u, u).

You can easily see that every formula in our deduction is either explicitly listed among the elements of our axiom set Σ, or follows via modus ponens from previously listed formulas in the deduction.
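The check just described is purely mechanical, and a minimal checker is easy to sketch. In the sketch below (Python; my own simplifying assumptions: Λ = ∅, modus ponens is the only rule of inference, atomic formulas are opaque strings, and an implication α → β is modeled as the pair (α, β), with ∀xP(x, x) written as the ASCII string "AxP(x,x)"):

```python
def is_deduction(seq, sigma):
    """Definition 2.2.1 specialized to Λ = ∅ and modus ponens only:
    each entry must be a nonlogical axiom, or follow from two earlier
    entries alpha and (alpha, phi), the latter modeling alpha -> phi."""
    for i, phi in enumerate(seq):
        earlier = seq[:i]
        justified = phi in sigma or any((a, phi) in earlier for a in earlier)
        if not justified:
            return False
    return True

sigma = {"P(u,v)", ("P(u,v)", "P(v,u)"), ("P(v,u)", "P(u,u)"), "AxP(x,x)"}
proof = ["P(u,v)", ("P(u,v)", "P(v,u)"), "P(v,u)", ("P(v,u)", "P(u,u)"), "P(u,u)"]
assert is_deduction(proof, sigma)
# the two-line list ∀xP(x,x), P(u,u) is rejected: no rule instantiates it
assert not is_deduction(["AxP(x,x)", "P(u,u)"], sigma)
```

Only a sketch, of course; a real checker would also recognize logical axioms once Λ is specified in Section 2.3.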

Notice, however, that we cannot use the universal statement ∀xP(x, x) to derive our needed formula P(u, u). Even a statement that seems like it ought to follow from our axioms, P(v, v), for example, will not be deducible from Σ until we either add to our rules of inference or include some additional axioms. Our definition of a deduction is very limiting—we cannot even use standard logical tricks such as universal instantiation [from ∀x blah(x) deduce blah(t)]. These logical axioms will be gathered together in Section 2.3.

Chaff: It is really tempting here to write down the incorrect deduction

∀xP(x, x)
P(u, u).

Please don't say things like that until we have built our collection of logical axioms. Remember, what we are trying to do here is to have a definition of deduction that is entirely syntactic, that does not depend on the meanings of the symbols. Where you are likely to run into trouble is when you start thinking too much about the meanings of the things that you write down. Our definition gives us deductions that are easily verifiable: Given an alleged deduction from Σ, as long as we can decide what formulas are in Σ, we can decide if the alleged deduction is correct. In fact, we could easily program a computer to check the deduction for us. However, this ease in verification comes with a price: Deductions are difficult to write and hard to motivate.


Definition 2.2.1 is a "bottom-up" definition. It defines a deduction in terms of its parts. Another way to define a collection of things is to take a "top-down" approach. The next proposition does just that, by showing that we can think of the collection of deductions from Σ (called Thm_Σ) as the closure of the collection of axioms under the application of the rules of inference.

Proposition 2.2.4. Fix sets of ℒ-formulas Σ and Λ and a collection of rules of inference. The set Thm_Σ = {φ | Σ ⊢ φ} is the smallest set C such that

1. Σ ⊆ C.

2. Λ ⊆ C.

3. If (Γ, θ) is a rule of inference and Γ ⊆ C, then θ ∈ C.

Proof. This proposition makes two separate claims about the set Thm_Σ. The first claim is that Thm_Σ satisfies the three criteria. The second claim is that Thm_Σ is the smallest set to satisfy the criteria. We tackle these claims one at a time.

First, let us look at the criteria in order, and make sure that Thm_Σ satisfies them. So to begin, we must show that Σ ⊆ Thm_Σ. But certainly if σ ∈ Σ, there is a deduction-from-Σ of σ, for example this one-line deduction: σ. Similarly, to show that Λ ⊆ Thm_Σ, we notice that there is a one-line deduction of any λ ∈ Λ. To finish this part of the proof, we must show that if (Γ, θ) is a rule of inference and Γ ⊆ Thm_Σ, then θ ∈ Thm_Σ. But to produce a deduction-from-Σ of θ, all we have to do is write down deductions of each of the γ's in Γ, followed by the formula θ. This is a valid deduction, as θ follows from Γ by the rule of inference (Γ, θ). Thus Thm_Σ satisfies the three criteria of the proposition.

Now we must show that Thm_Σ is the smallest such set. This is quite easy to prove once you figure out what you have to do. What is claimed is that if C is a collection of formulas satisfying the given requirements, then Thm_Σ ⊆ C. So we assume that C is a class satisfying the conditions, and we attempt to show that every element of Thm_Σ is in C.

If φ ∈ Thm_Σ, there is a deduction from Σ with last line φ. If the entry φ is justified by virtue of φ being either a logical or nonlogical axiom, then φ is explicitly included in the set C. If φ is justified by reference to a rule of inference (Γ, φ), then each γ ∈ Γ is an element


of C (this is really a proof by induction, and here is where we use the inductive hypothesis), and thus, by the third requirement on C, φ ∈ C, as needed.

Since Thm_Σ ⊆ C for all such sets C, Thm_Σ is the smallest such set, as claimed. □
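When Σ and the rule set happen to be finite, the closure described by Proposition 2.2.4 can be computed directly: start from the axioms and apply rules until nothing new appears, a least-fixed-point computation. A sketch (Python; the function name theorems and the toy rule instances below are my own illustrations, with Λ folded into the axiom set):

```python
def theorems(axioms, rules):
    """Closure of 'axioms' under 'rules'; each rule is a pair
    (premises, conclusion) with premises a frozenset of formulas."""
    thms = set(axioms)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= thms and conclusion not in thms:
                thms.add(conclusion)
                changed = True
    return thms

# two modus ponens instances, written out as concrete rules
rules = [(frozenset({"p", "p -> q"}), "q"),
         (frozenset({"q", "q -> r"}), "r")]
assert theorems({"p", "p -> q", "q -> r"}, rules) == {"p", "p -> q", "q -> r", "q", "r"}
```

In the book's actual deductive system the set of rule instances is infinite, so this loop is only a picture of the closure, not a usable decision procedure.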

Here is what we will do in the next few sections: We will define A, the fixed set of logical axioms; we will establish our collection of rules of inference; we will prove some results about deductions; and finally, we will discuss some examples of sets of nonlogical axioms.

2.2.1 Exercises

1. Let the collection of nonlogical axioms be

Σ = {(A(x) ∧ A(x)) → B(x, y), A(x), B(x, y) → A(x)},

and let the rule of inference be modus ponens, as in Example 2.2.3. For each of the following, decide if it is a deduction. If it is not a deduction, explain how you know that it is not a deduction.

(a) A(x)
A(x) ∧ A(x)
(A(x) ∧ A(x)) → B(x, y)
B(x, y)

(b) B(x, y) → A(x)
A(x)
B(x, y)

(c) (A(x) ∧ A(x)) → B(x, y)
B(x, y) → A(x)
(A(x) ∧ A(x)) → A(x)

2. Consider the axiom system Σ of Example 2.2.3. It is implied in that example that there is no deduction from Σ of the formula P(v, v). Prove this fact.

3. Carefully write out the proof of Proposition 2.2.4, worrying about the inductive step. [Suggestion: You may want to proceed by induction on the length of the shortest deduction of φ.]


4. Let ℒ be a language that consists of a single unary predicate symbol R, and let B be the infinite set of axioms

B = {R(x₁),
R(x₁) → R(x₂),
R(x₂) → R(x₃),
…,
R(xᵢ) → R(xᵢ₊₁),
…}.

Using modus ponens as the only rule of inference, prove by induction that B ⊢ R(xⱼ) for each natural number j ≥ 1.

2.3 The Logical Axioms

Let a first-order language ℒ be given. In this section we will gather together a collection Λ of logical axioms for ℒ. This set of axioms, though infinite, will be decidable. Roughly this means that if we are given a formula φ that is alleged to be an element of Λ, we will be able to decide whether φ ∈ Λ or φ ∉ Λ. Furthermore, we could, in principle, design a computer program that would be able to decide membership in Λ in a finite amount of time.

After we have established the set of logical axioms Λ and we want to start doing mathematics, we will want to add additional axioms that are designed to allow us to deduce statements about whatever mathematical system we may have in mind. These will constitute the collection of nonlogical axioms, Σ. For example, if we are working in number theory, using the language ℒ_NT, along with the logical axioms Λ we will also want to use other axioms that concern the properties of addition and the ordering relation denoted by the symbol <. These additional axioms are the formulas that we will place in Σ. Then, from this expanded set of axioms Λ ∪ Σ we will attempt to write deductions of formulas that make statements of number-theoretic interest. To reiterate: Λ, the set of logical axioms, will be fixed, as will the collection of rules of inference. But the set of nonlogical axioms must be specified for each deduction. In the current section we set out the logical axioms only, dealing with the


rules of inference in Section 2.4, and deferring our discussion of the nonlogical axioms until Section 2.8.

2.3.1 Equality Axioms

We have taken the route of assuming that the equality symbol, =, is a part of the language ℒ. There are three groups of axioms that are designed for this symbol. The first just says that any object is equal to itself:

x = x for each variable x. (E1)

For the second group of axioms, assume that x₁, x₂, …, xₙ are variables, y₁, y₂, …, yₙ are variables, and f is an n-ary function symbol.

[(x₁ = y₁) ∧ (x₂ = y₂) ∧ ⋯ ∧ (xₙ = yₙ)] →
(f(x₁, x₂, …, xₙ) = f(y₁, y₂, …, yₙ)). (E2)

The assumptions for the third group of axioms are the same as for the second group, except that R is assumed to be an n-ary relation symbol (R might be the equality symbol, which is seen as a binary relation symbol).

[(x₁ = y₁) ∧ (x₂ = y₂) ∧ ⋯ ∧ (xₙ = yₙ)] →
(R(x₁, x₂, …, xₙ) → R(y₁, y₂, …, yₙ)). (E3)

Axioms (E2) and (E3) are axioms that are designed to allow substitution of equals for equals. Nothing fancier than that.

2.3.2 Quantifier Axioms

The quantifier axioms are designed to allow a very reasonable sort of entry in a deduction. Suppose that we know ∀xP(x). Then, if t is any term of the language, we should be able to state P(t). To avoid problems of the sort outlined at the beginning of Section 1.8, we will demand that the term t be substitutable for the variable x.


(∀xφ) → φ^x_t, if t is substitutable for x in φ. (Q1)
φ^x_t → (∃xφ), if t is substitutable for x in φ. (Q2)

In many logic texts, axiom (Q1) would be called universal instantiation, while (Q2) would be known as existential generalization. We will avoid this impressive language and stick with the more mundane (Q1) and (Q2).

2.3.3 Recap

Just to gather all of the logical axioms together in one place, let me state them once again. The set A of logical axioms is the collection of all formulas that fall into one of the following categories:

x = x for each variable x. (E1)

[(x₁ = y₁) ∧ (x₂ = y₂) ∧ ⋯ ∧ (xₙ = yₙ)] →
(f(x₁, x₂, …, xₙ) = f(y₁, y₂, …, yₙ)). (E2)

[(x₁ = y₁) ∧ (x₂ = y₂) ∧ ⋯ ∧ (xₙ = yₙ)] →
(R(x₁, x₂, …, xₙ) → R(y₁, y₂, …, yₙ)). (E3)

(∀xφ) → φ^x_t, if t is substitutable for x in φ. (Q1)

φ^x_t → (∃xφ), if t is substitutable for x in φ. (Q2)

Notice that Λ is decidable: We could write a computer program which, given a formula φ, can decide in a finite amount of time whether or not φ is an element of Λ.

2.4 Rules of Inference

Having established our set Λ of logical axioms, we must now fix our rules of inference. There will be two types of rules, one dealing with propositional consequence and one dealing with quantifiers.


2.4.1 Propositional Consequence

In all likelihood you are familiar with tautologies of propositional logic. They are simply formulas like (A → B) → (¬B → ¬A). If you are comfortable with tautologies, feel free to skip over the next couple of paragraphs. If not, what follows is a very brief review of a portion of propositional logic.

We work with a restricted language 𝒫, consisting only of a set of propositional variables A, B, C, … and the connectives ∨ and ¬. Notice there are no quantifiers, no relation symbols, no function symbols, and no constants. Formulas of propositional logic are defined as being the collection of all φ such that either φ is a propositional variable, or φ is (¬α), or φ is (α ∨ β), with α and β being formulas of propositional logic.

Each propositional variable can be assigned one of two truth values, T or F, corresponding to truth and falsity. Given such an assignment (which is really a function v : propositional variables → {T, F}), we can extend v to a function v̄ assigning a truth value to any propositional formula as follows:

v̄(φ) = v(φ), if φ is a propositional variable;
v̄(φ) = F, if φ is (¬α) and v̄(α) = T;
v̄(φ) = F, if φ is (α ∨ β) and v̄(α) = v̄(β) = F;
v̄(φ) = T, otherwise.

Now we say that a propositional formula φ is a tautology if and only if v̄(φ) = T for any truth assignment v.

One way that you can check whether a given φ is a tautology is by constructing a truth table with one row for each possible combination of truth values for the propositional variables that occur in φ. Then you fill in the truth table and see whether the truth value associated with the main connective is always true. For example, consider the propositional formula A → (B → A), which is translated to ¬A ∨ (¬B ∨ A). The truth table verifying that this formula is a tautology is

A   B   ¬A   ¬B   ¬B ∨ A   ¬A ∨ (¬B ∨ A)
T   T   F    F    T        T
T   F   F    T    T        T
F   T   T    F    F        T
F   F   T    T    T        T
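The truth-table method is easy to mechanize: enumerate all 2ⁿ assignments and evaluate. A sketch (Python; the name is_tautology is my own, and each propositional formula is modeled as a Boolean function built from the two official connectives ¬ and ∨):

```python
from itertools import product

def is_tautology(formula, num_vars):
    """Return True iff 'formula', a function of num_vars booleans,
    evaluates to True under every one of the 2**num_vars assignments."""
    return all(formula(*vals)
               for vals in product([True, False], repeat=num_vars))

# A -> (B -> A), translated as in the text to ¬A ∨ (¬B ∨ A)
assert is_tautology(lambda a, b: (not a) or ((not b) or a), 2)
# A ∨ B, by contrast, fails under the assignment A = B = F
assert not is_tautology(lambda a, b: a or b, 2)
```

The exponential number of rows is no obstacle in principle: tautology-hood is decidable, which is exactly what the rule (PC) below will rely on.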


To discuss propositional consequence in first-order logic, we will transfer our formulas to the realm of propositional logic and use the idea of tautology in that area. To be specific, given β, an ℒ-formula of first-order logic, here is a procedure that will convert β to a formula β_P of propositional logic corresponding to β:

1. Find all subformulas of β of the form ∀xα that are not in the scope of another quantifier. Replace them with propositional variables in a systematic fashion. This means that if ∀yQ(y, c) appears twice in β, it is replaced by the same letter both times, and distinct subformulas are replaced with distinct letters.

2. Find all atomic formulas that remain, and replace them systematically with new propositional variables.

3. At this point, β will have been replaced with a propositional formula β_P.

For example, suppose that we look at the ℒ-formula

(∀xP(x) ∧ Q(c, z)) → (Q(c, z) ∨ ∀xP(x)).

For the first step of the procedure above, we replace the quantified subformulas with the propositional letter B:

(B ∧ Q(c, z)) → (Q(c, z) ∨ B).

To finish the transformation to a propositional formula, replace the atomic formula with a propositional letter:

(B ∧ A) → (A ∨ B).
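Steps 1–3 can be sketched in code if we sidestep the hard part, which is parsing the formula to find its outermost quantified subformulas and its remaining atomic formulas. In the sketch below (Python; my own ASCII stand-ins: "Ax" for ∀x, "&" for ∧, "|" for ∨, "->" for →), the subformula-to-letter map is supplied by hand, standing in for what a real implementation's parser would compute:

```python
def propositionalize(formula, letters):
    """Replace each quantified/atomic subformula by its propositional
    letter, longest subformulas first so replacements do not collide."""
    for sub, letter in sorted(letters.items(), key=lambda kv: -len(kv[0])):
        formula = formula.replace(sub, letter)
    return formula

beta = "(AxP(x) & Q(c,z)) -> (Q(c,z) | AxP(x))"
assert propositionalize(beta, {"AxP(x)": "B", "Q(c,z)": "A"}) == "(B & A) -> (A | B)"
```

Note how replacing "systematically" corresponds to using one fixed map for every occurrence, which is exactly what Definition 2.4.3 below will mean by applying the procedure "uniformly."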

Notice that if β_P is a tautology, then β is valid, but the converse of this statement fails. For example, if β is

[(∀xP(x)) ∧ (∀x(P(x) → Q(x)))] → (∀xQ(x)),

then β is valid, but β_P would be [A ∧ B] → C, which is certainly not a tautology.

We are now almost at a point where we can state our propositional rule of inference. Recall that a rule of inference is an ordered pair (Γ, φ), where Γ is a set of ℒ-formulas and φ is an ℒ-formula.


Definition 2.4.1. Suppose that Γ_P is a set of propositional formulas and φ_P is a propositional formula. We will say that φ_P is a propositional consequence of Γ_P if every truth assignment that makes each propositional formula in Γ_P true also makes φ_P true. Notice that φ_P is a tautology if and only if φ_P is a propositional consequence of ∅.

Lemma 2.4.2. If Γ_P = {γ₁P, γ₂P, …, γₙP} is a nonempty finite set of propositional formulas and φ_P is a propositional formula, then φ_P is a propositional consequence of Γ_P if and only if

[γ₁P ∧ γ₂P ∧ ⋯ ∧ γₙP] → φ_P

is a tautology.

Proof. Exercise 3. •

Now we extend our definition of propositional consequence to include formulas of first-order logic:

Definition 2.4.3. Suppose that Γ is a finite set of ℒ-formulas and φ is an ℒ-formula. We will say that φ is a propositional consequence of Γ if φ_P is a propositional consequence of Γ_P, where φ_P and Γ_P are the results of applying the procedure on the preceding page uniformly to φ and all of the formulas in Γ.

Example 2.4.4. Suppose that ℒ contains two unary relation symbols, P and Q. Let Γ be the set

{∀xP(x) → ∃yQ(y), ∃yQ(y) → P(x), ¬P(x) ↔ (y = z)}.

If we let φ be the formula ∀xP(x) → ¬(y = z), then by applying our procedure uniformly to the elements of Γ and φ, we see that

Γ_P is {A → B, B → C, ¬C ↔ D}

and φ_P is A → ¬D, where the fact that we have substituted the same propositional variables for the same formulas in φ and the elements of Γ is ensured by our applying the procedure uniformly to all of the formulas in question. At this point it is easy to verify that φ is a propositional consequence of Γ.
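The verification at the end of Example 2.4.4 is a finite check over the 16 truth assignments to A, B, C, D. A sketch (Python; the names implies and consequence_holds are my own illustrations):

```python
from itertools import product

def implies(p, q):
    """The connective ->, i.e. (¬p) ∨ q."""
    return (not p) or q

def consequence_holds():
    """Check that every assignment satisfying Γ_P = {A→B, B→C, ¬C↔D}
    also satisfies φ_P = A → ¬D."""
    for a, b, c, d in product([True, False], repeat=4):
        gamma_true = implies(a, b) and implies(b, c) and ((not c) == d)
        if gamma_true and not implies(a, not d):
            return False
    return True

assert consequence_holds()
```

Tracing one case by hand: if A is true, then B and C must be true, forcing D false, so ¬D holds and the conclusion A → ¬D is satisfied; if A is false, A → ¬D is vacuously true.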

Finally, our rule of inference:


Definition 2.4.5. If Γ is a finite set of ℒ-formulas, φ is an ℒ-formula, and φ is a propositional consequence of Γ, then (Γ, φ) is a rule of inference of type (PC).

Chaff: All of this formalism just might be hiding what is really going on here. What rule (PC) says is that if you have proved γ₁ and γ₂ and [(γ₁ ∧ γ₂) → φ]_P is a tautology, then you may conclude φ. Nothing fancier than that.

Also notice that if φ is a formula such that φ_P is a tautology, rule (PC) allows us to assert φ in any deduction, using Γ = ∅.

2.4.2 Quantifier Rules

The motivation behind our quantifier rules is very simple. Suppose, without making any particular assumptions about x, that you were able to prove "x is an ambitious aardvark." Then it seems reasonable to claim that you have proved "(∀x) x is an ambitious aardvark." Dually, if you were able to prove the Riemann Hypothesis from the assumption that "x is a bossy bullfrog," then from the assumption "(∃x) x is a bossy bullfrog," you should still be able to prove the Riemann Hypothesis.

Definition 2.4.6. Suppose that the variable x is not free in the formula ψ. Then both of the following are rules of inference of type (QR):

({ψ → φ}, ψ → (∀x φ))
({φ → ψ}, (∃x φ) → ψ)

The "not making any particular assumptions about x" comment is made formal by the requirement that x not be free in ψ.

Chaff: Just to make sure that you are not lost in the brackets of the definition, what we are saying here is that if x is not free in ψ:

1. From the formula ψ → φ, you may deduce ψ → (∀x φ).

2. From the formula φ → ψ, you may deduce (∃x φ) → ψ.


2.4.3 Exercises

1. We claim that the collection Λ of logical axioms is decidable. Outline an algorithm which, given an L-formula θ, outputs "yes" if θ is an element of Λ and outputs "no" if θ is not an element of Λ. You do not have to be too fussy. Notice that you have to be able to decide if a term t is substitutable for a variable x in a formula φ. See Exercise 7 in Section 1.8.1.

2. Show that the set of rules of inference is decidable. So outline an algorithm that will decide, given a finite set of formulas Γ and a formula θ, whether or not (Γ, θ) is a rule of inference.

3. Prove Lemma 2.4.2.

4. Write a deduction of the second quantifier axiom (Q2) (on page 57) without using (Q2) as an axiom.

5. For each of the following, decide if φ is a propositional consequence of Γ and justify your assertion.

(a) Γ is {(∀xP(x)) → Q(y), (∀xP(x)) ∨ (∀xR(x)), ∃x¬R(x)}; φ is Q(y).

(b) Γ is {x = y ∧ Q(y), Q(y) ∨ x + y < z}; φ is x + y < z.

(c) Γ is {P(x, y, x), x < y ∨ M(w, p), (¬P(x, y, x)) ∧ ¬(x < y)}; φ is ¬M(w, p).

6. Prove that if θ is not valid, then θ_P is not a tautology. Deduce that if θ_P is a tautology, then θ is valid.

2.5 Soundness

Mathematicians are by nature a conservative bunch. I speak not of political or social leanings, but of their professional outlook. In particular, a mathematician likes to know that when something has been proved, it is true. In this section we will prove a theorem that shows that the logical system that we have developed has this highly desirable property. This result is called the Soundness Theorem.

Let me restate the list of requirements that we set out on page 49 for our axioms and rules of inference:


1. There will be an algorithm that will decide, given a formula θ, whether or not θ is a logical axiom.

2. There will be an algorithm that will decide, given a finite set of formulas Γ and a formula θ, whether or not (Γ, θ) is a rule of inference.

3. For each rule of inference (Γ, θ), Γ will be a finite set of formulas.

4. Each logical axiom will be valid.

5. Our rules of inference will preserve truth. In other words, for each rule of inference (Γ, θ), Γ ⊨ θ.

These requirements serve two purposes: They allow us to verify mechanically that an alleged deduction is in fact a deduction, and they provide the basis of the Soundness Theorem. Of course, we must first verify that the system of axioms and rules that we have set out in the preceding two sections satisfies these requirements.

That the first three requirements above are satisfied by our deduction system was noted as the axioms and rules were presented. These are the rules that are needed for deduction verification. We will discuss the last two requirements in more detail and then use those requirements to prove the Soundness Theorem.

Theorem 2.5.1. The logical axioms are valid.

Proof. We must check both the equality axioms and the quantifier axioms. First, consider equality axioms of type (E2). [(E1) and (E3) will be proved in the Exercises.]

Chaff: Let me mention that we will use Theorem 2.6.2 in this proof. Although the presentation of that result has been delayed in order to aid the flow of the exposition, you may want to look at the statement of that theorem now so you won't be surprised when it appears.

So fix a structure 𝔄 and an assignment function s : Vars → A. We must show that

𝔄 ⊨ [(x1 = y1 ∧ x2 = y2 ∧ ··· ∧ xn = yn) → (f(x1, x2, ..., xn) = f(y1, y2, ..., yn))][s].


As the formula in question is an implication, we may assume that the antecedent is satisfied by the pair (𝔄, s), and thus s(x1) = s(y1), s(x2) = s(y2), ..., and s(xn) = s(yn). We must prove that 𝔄 ⊨ (f(x1, x2, ..., xn) = f(y1, y2, ..., yn))[s]. From the definition of satisfaction (Definition 1.7.4), we know this means that we have to show

s̄(f(x1, x2, ..., xn)) = s̄(f(y1, y2, ..., yn)).

Now we look at the definition of the term assignment function (Definition 1.7.3) and see that we must prove

f^𝔄(s(x1), s(x2), ..., s(xn)) = f^𝔄(s(y1), s(y2), ..., s(yn)).

But since s(xi) = s̄(xi) = s̄(yi) = s(yi) for each i, and since f^𝔄 is a function, this is true. Thus our equality axiom (E2) is valid.

Now we examine the quantifier axiom of type (Q1), reserving (Q2) for the Exercises. Once again, fix 𝔄 and s, and assume that the term t is substitutable for the variable x in the formula φ. We must show that

𝔄 ⊨ [(∀x φ) → φ^x_t][s].

So once again, we assume that 𝔄 ⊨ (∀x φ)[s], and we show that 𝔄 ⊨ φ^x_t[s]. By assumption, 𝔄 ⊨ φ[s[x|a]] for any element a ∈ A, so in particular, 𝔄 ⊨ φ[s[x|s̄(t)]].

Informally, this says that φ is true in 𝔄 with assignment function s, where you interpret x as s̄(t). It is plausible, given our assumption that t is substitutable for x in φ, that if we altered the formula φ by replacing x by t, then φ^x_t would be true in 𝔄 with assignment function s. This is the content of Theorem 2.6.2. Since we know that 𝔄 ⊨ φ[s[x|s̄(t)]] and Theorem 2.6.2 states that this is equivalent to 𝔄 ⊨ φ^x_t[s], we have established 𝔄 ⊨ φ^x_t[s], so we have proved that axioms of type (Q1) are valid.

Thus, modulo your proofs of (E1), (E3), and (Q2) and the delayed proof of Theorem 2.6.2, all of our logical axioms are valid, and our proof is complete. □

This leaves one more item on our list of requirements to check. We must show that our rules of inference preserve truth.

Theorem 2.5.2. Suppose that (Γ, θ) is a rule of inference. Then Γ ⊨ θ.


Proof. First, assume that (Γ, θ) is a rule of type (PC). Then Γ is finite, and by Lemma 2.4.2, we know that

[γ1_P ∧ γ2_P ∧ ··· ∧ γn_P] → θ_P

is a tautology, where Γ_P = {γ1_P, γ2_P, ..., γn_P} is the set of propositional formulas corresponding to Γ and θ_P is the propositional formula corresponding to θ. But then, by Exercise 6 on page 62, we know that

[γ1 ∧ γ2 ∧ ··· ∧ γn] → θ

is valid, and thus Γ ⊨ θ.

The other possibility is that our rule of inference is a quantifier rule. So, suppose that x is not free in ψ. We show that (ψ → φ) ⊨ [ψ → (∀x φ)], leaving the other (QR) rule for the Exercises.

So fix a structure 𝔄 and assume that 𝔄 ⊨ (ψ → φ). Thus our assumption is that for any assignment s, 𝔄 ⊨ (ψ → φ)[s]. We must show that 𝔄 ⊨ (ψ → (∀x φ)), which means that we must show that (ψ → ∀x φ) is satisfied in 𝔄 under every assignment function. So let an assignment function t : Vars → A be given. We must show that 𝔄 ⊨ (ψ → ∀x φ)[t]. If 𝔄 ⊭ ψ[t], we are done, so assume that 𝔄 ⊨ ψ[t]. We want to prove that 𝔄 ⊨ ∀x φ[t], which means that if a is any element of A, we must show that 𝔄 ⊨ φ[t[x|a]].

We know, by assumption, that 𝔄 ⊨ (ψ → φ)[t[x|a]]. Furthermore, Proposition 1.7.7 tells us that 𝔄 ⊨ ψ[t[x|a]], as 𝔄 ⊨ ψ[t], and t and t[x|a] agree on all of the free variables of ψ (x is not free in ψ by assumption). But then, by the definition of satisfaction, 𝔄 ⊨ φ[t[x|a]], and we are finished. □

We are now at a point where we can prove the Soundness Theorem. The idea behind this theorem is very simple. Suppose that Σ is a set of L-formulas and suppose that there is a deduction of φ from Σ. What the Soundness Theorem tells us is that in any structure 𝔄 that makes all of the formulas of Σ true, φ is true as well.

Theorem 2.5.3 (Soundness). If Σ ⊢ φ, then Σ ⊨ φ.

Proof. Let Thm_Σ = {φ | Σ ⊢ φ}, and let C = {φ | Σ ⊨ φ}. We show that Thm_Σ ⊆ C, which proves the theorem.

Notice that C has the following characteristics:

1. Σ ⊆ C. If σ ∈ Σ, then certainly Σ ⊨ σ.


2. Λ ⊆ C. As the logical axioms are valid, they are true in any structure. Thus Σ ⊨ λ for any logical axiom λ, which means that if λ ∈ Λ, then λ ∈ C, as needed.

3. If (Γ, θ) is a rule of inference and Γ ⊆ C, then θ ∈ C. So assume that Γ ⊆ C. To prove θ ∈ C we must show that Σ ⊨ θ. Fix a structure 𝔄 such that 𝔄 ⊨ Σ. We must prove that 𝔄 ⊨ θ. If γ is any element of Γ, then since γ ∈ C, we know that Σ ⊨ γ. Since 𝔄 ⊨ Σ and Σ ⊨ γ, we know that 𝔄 ⊨ γ. But this says that 𝔄 ⊨ γ for each γ ∈ Γ, so 𝔄 ⊨ Γ. But Theorem 2.5.2 tells us that Γ ⊨ θ, since (Γ, θ) is a rule of inference. Therefore, since 𝔄 ⊨ Γ and Γ ⊨ θ, 𝔄 ⊨ θ, as needed.

So C is a set of the type outlined in Proposition 2.2.4, and by that proposition, Thm_Σ ⊆ C, as needed. □

Notice that the Soundness Theorem begins to tie together the notions of deducibility and logical implication. It says, "If there is a deduction from Σ of φ, then Σ logically implies φ." Thus the purely syntactic notion of deduction, a notion that relies only upon typographical considerations, is linked to the notions of truth and logical implication, ideas that are inextricably tied to mathematical structures and their properties. This linkage will be tightened in Chapter 3.

Chaff: The proof of the Soundness Theorem that I have presented above has the desirable qualities of being neat and quick. It emphasizes a core fact about the consequences of Σ, namely that Thm_Σ is the smallest set of formulas satisfying the given three conditions. Unfortunately, the proof has the less desirable attribute of being pretty abstract. Exercise 5 outlines a more direct, less abstract proof of the Soundness Theorem.

2.5.1 Exercises

1. Ingrid walks into your office one day and announces that she is puzzled. She has a set of axioms Σ in the language of number theory, and she has a formula φ that she has proved using the assumptions in Σ. Unfortunately, φ is a statement that is not true in the standard model 𝔑. Is this a problem? If it is a problem, what possible explanations can you think of that would explain what went wrong? If it is not a problem, why is it not a problem?

2. Prove that the equality axioms of type (E1) and (E3) are valid.

3. Show that the quantifier axiom of type (Q2) is valid.

4. Show that, if x is not free in ψ, then (φ → ψ) ⊨ [(∃x φ) → ψ].

5. Prove the Soundness Theorem by induction on the complexity of the proof of φ. For the base cases, φ is either a logical axiom or a member of Σ. Then assume that φ is proved by reference to a rule of inference, and show that in this case as well, Σ ⊨ φ.

2.6 Two Technical Lemmas

In this section we present two rather technical lemmas that we need to complete the proof of Theorem 2.5.1. The proofs that are involved are not pretty, and if you are the trusting sort, you may want to scan through this section rather quickly. On the other hand, if you come to grips with these results, you will gain a better appreciation for the details of substitutability and assignment functions.

To motivate the first lemma, consider this example: Suppose that we are working in the language of number theory and that the structure under consideration comprises the natural numbers. Let the term u be x · v and the term t be y + z. Then u^x_t is (y + z) · v. Now we have to fix a couple of assignment functions. Let the assignment function s look like this:

Vars   s
x      12
y      3
z      7
v      4
⋮      ⋮

So s(x) = 12, s(y) = 3, and so on. Now, suppose that s' is an assignment function that is just like s, except that s' sends x to the value s̄(t), which is s̄(y + z) = 3 + 7 = 10:

Vars   s    s'
x      12   10
y      3    3
z      7    7
v      4    4
⋮      ⋮    ⋮

Now, if you compare s̄(u^x_t) and s̄'(u), you find that

s̄(u^x_t) = s̄((y + z) · v) = (3 + 7) · 4 = 10 · 4 = 40
s̄'(u) = s̄'(x · v) = 10 · 4 = 40.

So, in this situation, the element of the universe that is assigned by s to u^x_t is the same as the element of the universe that is assigned by s' to u. In some sense, the lemma states that it does not matter whether you alter the term or the assignment function; the result is the same.

Here is the formal statement:

Lemma 2.6.1. Suppose that u is a term, x is a variable, and t is a term. Suppose that s : Vars → A is a variable assignment function and that s' = s[x|s̄(t)]. Then s̄(u^x_t) = s̄'(u).

Proof. The proof is by induction on the complexity of the term u. If u is a variable and u is x, then

s̄(u^x_t) = s̄(x^x_t) = s̄(t) = s'(x) = s̄'(u).

If u is a variable and u is y with y ≠ x, then

s̄(u^x_t) = s̄(y^x_t) = s̄(y) = s(y) = s'(y) = s̄'(u).

If u is a constant symbol c, then s̄(u^x_t) = s̄(c^x_t) = s̄(c) = c^𝔄 = s̄'(u).

The last inductive case is if u is f(r1, r2, ..., rn), with each ri a term. In this case,

s̄(u^x_t) = s̄([f(r1, r2, ..., rn)]^x_t)
         = s̄(f((r1)^x_t, (r2)^x_t, ..., (rn)^x_t))
         = f^𝔄(s̄[(r1)^x_t], s̄[(r2)^x_t], ..., s̄[(rn)^x_t])    definition of s̄
         = f^𝔄(s̄'(r1), s̄'(r2), ..., s̄'(rn))                     inductive hypothesis
         = s̄'(f(r1, r2, ..., rn))                                definition of s̄'
         = s̄'(u).

So for every term u, s̄(u^x_t) = s̄'(u). □

Chaff: That was hard. If you understood that proof the first time through, you have done something quite out of the ordinary. If, on the other hand, you are a mere mortal, you might want to work through the proof again, keeping an example in mind as you work. Pick a language, terms u and t in your language, and a variable x. Fix a particular assignment function s. Then just follow through the steps of the proof, keeping track of where everything goes. I would write it out for you, but you will get more out of doing it for yourself. Go to it!
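One way to follow the Chaff's advice is to mechanize the bookkeeping. The sketch below encodes terms as nested tuples and replays the section's running example (u is x · v, t is y + z, s as in the table); the encoding and the names subst and evaluate are our own, not notation from the text:

```python
# Terms are nested tuples: a variable is a string, and an application of a
# binary function symbol is (op, arg1, arg2).

def subst(u, x, t):
    # u with every occurrence of the variable x replaced by the term t (u^x_t)
    if isinstance(u, str):
        return t if u == x else u
    op, a, b = u
    return (op, subst(a, x, t), subst(b, x, t))

def evaluate(u, s):
    # the term assignment function s-bar, over the standard natural numbers
    if isinstance(u, str):
        return s[u]
    op, a, b = u
    va, vb = evaluate(a, s), evaluate(b, s)
    return va + vb if op == 'add' else va * vb

u = ('mul', 'x', 'v')               # u is x . v
t = ('add', 'y', 'z')               # t is y + z
s = {'x': 12, 'y': 3, 'z': 7, 'v': 4}

s_prime = dict(s, x=evaluate(t, s))       # s' = s[x | s-bar(t)]
print(evaluate(subst(u, 'x', t), s))      # s-bar(u^x_t) -> 40
print(evaluate(u, s_prime))               # s'-bar(u)    -> 40
```

Both computations agree, just as Lemma 2.6.1 predicts: altering the term and altering the assignment function give the same value.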

Our next technical result is the lemma that we quoted explicitly in the proof of Theorem 2.5.1. This theorem states that as long as t is substitutable for x in φ, the two different ways of evaluating the truth of "φ, where you interpret x as t" coincide. The first way of evaluating the truth would be by forming the formula φ^x_t and seeing if 𝔄 ⊨ φ^x_t[s]. The second way would be to change the assignment function s to interpret x as s̄(t) and checking whether the original formula φ is true with this new assignment function. The theorem states that the two methods are equivalent.

Theorem 2.6.2. Suppose that φ is an L-formula, x is a variable, t is a term, and t is substitutable for x in φ. Suppose that s : Vars → A is a variable assignment function and that s' = s[x|s̄(t)]. Then 𝔄 ⊨ φ^x_t[s] if and only if 𝔄 ⊨ φ[s'].


Proof. We use induction on the complexity of φ. The first base case is where φ is u1 = u2, where u1 and u2 are terms. Then the following are equivalent:

𝔄 ⊨ (u1 = u2)^x_t[s]
𝔄 ⊨ ((u1)^x_t = (u2)^x_t)[s]
s̄((u1)^x_t) = s̄((u2)^x_t)        definition of satisfaction
s̄'(u1) = s̄'(u2)                   by Lemma 2.6.1
𝔄 ⊨ (u1 = u2)[s'].

The second base case is where φ is R(u1, u2, ..., un). This case is similar to the case above.

The inductive cases involving the connectives ∨ and ¬ follow immediately from the inductive hypothesis.

This leaves the last inductive case, where φ is ∀y ψ. We break this case down into two subcases: In the first subcase x is y, and in the second subcase x is not y.

If φ is ∀y ψ and y is x, then φ^x_t is φ. Therefore, 𝔄 ⊨ φ^x_t[s] if and only if 𝔄 ⊨ φ[s]. But as s and s' agree on all of the free variables of φ (x is not free), by Proposition 1.7.7, 𝔄 ⊨ φ[s] if and only if 𝔄 ⊨ φ[s'], as needed for this subcase.

The second subcase, where φ is ∀y ψ and y is not x, is examined in two sub-subcases:

Sub-subcase 1: If φ is ∀y ψ, y is not x, and x is not free in ψ, then we know by Exercise 5 in Section 1.8.1 that ψ^x_t is ψ, and thus φ^x_t is φ. But then

𝔄 ⊨ φ^x_t[s]   iff   𝔄 ⊨ φ[s]   iff   𝔄 ⊨ φ[s'],

as s and s' agree on the free variables of φ.

Sub-subcase 2: If φ is ∀y ψ, y is not x, and x is free in ψ, then as t is substitutable for x in φ (we had to use that assumption somewhere, didn't we?), we know that y does not occur in t and t is substitutable for x in ψ. Then we have

𝔄 ⊨ φ^x_t[s]   iff   𝔄 ⊨ (∀y)(ψ^x_t)[s]
               iff   𝔄 ⊨ ψ^x_t[s[y|a]] for every a ∈ A.

But we also know that

𝔄 ⊨ φ[s']   iff   𝔄 ⊨ (∀y ψ)[s']
            iff   𝔄 ⊨ ψ[s'[y|a]] for every a ∈ A.

But since x is not y, we know that for any a ∈ A, s'[y|a] = s[y|a][x|s̄(t)], so by the inductive hypothesis (notice that t is substitutable for x in ψ) we have

𝔄 ⊨ ψ^x_t[s[y|a]]   iff   𝔄 ⊨ ψ[s'[y|a]].

So 𝔄 ⊨ φ^x_t[s] if and only if 𝔄 ⊨ φ[s'], as needed. □

2.7 Properties of Our Deductive System

Having gone through all the trouble of setting out our deductive system, we will now prove a few things both in and about that system. First, we will show that we can prove, in our deductive system, that equality is an equivalence relation.

Theorem 2.7.1.

1. ⊢ x = x.

2. ⊢ x = y → y = x.

3. ⊢ (x = y ∧ y = z) → x = z.

Proof. We show that we can find deductions establishing that = is reflexive, symmetric, and transitive in turn.

1. This is a logical axiom of type (E1).

2. Here is the needed deduction. Notice that the notations off to the right are listed only as an aid to the reader.

[x = y ∧ x = x] → [x = x → y = x]    (E3)
x = x                                 (E1)
x = y → y = x.                        (PC)


3. Again, we present a deduction:

[x = x ∧ y = z] → [x = y → x = z]    (E3)
x = x                                 (E1)
(x = y ∧ y = z) → x = z.              (PC)    □

Chaff: Notice that we have done a bit more than prove that equality is an equivalence relation. (Heck, you've known that since fourth grade.) Rather, we've shown that our deductive system, with the axioms and rules of inference that have been outlined in this chapter, is powerful enough to prove that equality is an equivalence relation. There will be a fair bit of "our deductive system is strong enough to do such-and-such" in the pages to come.

We now prove some general properties of our deductive system. We start off with a lemma that seems somewhat problematical, but it will help us to think a little more carefully about what our deductions do for us.

Lemma 2.7.2. Σ ⊢ θ if and only if Σ ⊢ ∀x θ.

Proof. First, suppose that Σ ⊢ θ. Here is a deduction from Σ of ∀x θ:

(deduction of θ)
θ
[(∀y(y = y)) ∨ ¬(∀y(y = y))] → θ         (PC)
[(∀y(y = y)) ∨ ¬(∀y(y = y))] → (∀x θ)    (QR)
∀x θ.                                     (PC)

There are a couple of things to point out about this proof. The first use of (PC) is justified by the fact that if θ is true, then (anything → θ) is also true. The second use of (PC) depends on the fact that [(∀y(y = y)) ∨ ¬(∀y(y = y))] is a tautology, and thus ∀x θ is a propositional consequence of the implication. As for the (QR) step of the deduction, notice that the variable x is not free in the sentence [(∀y(y = y)) ∨ ¬(∀y(y = y))], making the use of the quantifier rule legitimate.


Now, suppose that Σ ⊢ ∀x θ. Here is a deduction from Σ of θ (recall that θ^x_x is θ):

(deduction of ∀x θ)
∀x θ
(∀x θ) → θ    (Q1)
θ.            (PC)

Thus Σ ⊢ θ if and only if Σ ⊢ ∀x θ. □

Here is an example to show how strange this lemma might seem. Suppose that Σ consists of the single formula x = 5. Then certainly Σ ⊢ x = 5, and so, by the lemma, Σ ⊢ (∀x)(x = 5). You might be tempted to say that by assuming x was equal to five, we have proved that everything is equal to five. But that is not quite what is going on. If x = 5 is true in a model 𝔄, that means that 𝔄 ⊨ x = 5[s] for every assignment function s. And since for every a ∈ A, there is an assignment function s such that s(x) = a, it must be true that every element of A is equal to 5, so the universe A has only one element, and everything is equal to 5. So our deduction of (∀x)(x = 5) has preserved truth, but our assumption was much stronger than it appeared at first glance. And the moral of our story is: For a formula to be true in a structure, it must be satisfied in that structure with every assignment function.

Lemma 2.7.3. Suppose that Σ ⊢ θ. Then if Σ' is formed by taking any σ ∈ Σ and adding or deleting a universal quantifier whose scope is the entire formula, Σ' ⊢ θ.

Proof. This follows immediately from Lemma 2.7.2. Suppose that ∀x σ is in Σ'. By the preceding, Σ' ⊢ σ. Then, given a deduction from Σ of θ, to produce a deduction from Σ' of θ, first write down a deduction from Σ' of σ, and then copy your deduction from Σ of θ. Having already established σ, this deduction will be a valid deduction from Σ'.

The proof in the case that ∀x σ is an element of Σ and is replaced by σ in Σ' is analogous. □

Notice that one consequence of this lemma is the fact that if we know Σ ⊢ θ, we can assume (if we like) that every element of Σ is a sentence: By quoting Lemma 2.7.3 several times, we can replace each σ ∈ Σ with its universal closure.

Now we will show that in at least some sense, the system of deductions that we have developed mirrors the process that mathematicians use to prove theorems. Suppose you were asked to prove the theorem: If A is a square, then A is a rectangle. A perfectly reasonable way to attack this theorem would be to assume that A is a square, and using that assumption, prove that A is a rectangle. But notice that you have not been asked to prove that A is a rectangle. You were asked to prove an implication! The Deduction Theorem says that there is a deduction of φ from the assumption θ if and only if there is a deduction of the implication θ → φ. (A bit of notation: Rather than writing the formally correct Σ ∪ {θ} ⊢ φ, we shall omit the braces and write Σ ∪ θ ⊢ φ.)

Theorem 2.7.4 (The Deduction Theorem). Suppose that θ is a sentence and Σ is a set of formulas. Then Σ ∪ θ ⊢ φ if and only if Σ ⊢ (θ → φ).

Proof. First, suppose that Σ ⊢ (θ → φ). Then, as the same deduction would show that Σ ∪ θ ⊢ (θ → φ), and as Σ ∪ θ ⊢ θ by a one-line deduction, and as φ is a propositional consequence of θ and (θ → φ), we know that Σ ∪ θ ⊢ φ.

For the more difficult direction we will make use of Proposition 2.2.4. Suppose that C = {φ | Σ ⊢ (θ → φ)}. If we show that C contains Σ ∪ θ, C contains all the axioms of Λ, and C is closed under the rules of inference as noted in Proposition 2.2.4, then by that proposition we will know that {φ | Σ ∪ θ ⊢ φ} ⊆ C. In other words, we will know that if Σ ∪ θ ⊢ φ, then Σ ⊢ (θ → φ), which is what we need to show.

So it remains to prove that C has the properties listed in the preceding paragraph.

1. Σ ⊆ C: If σ ∈ Σ, then Σ ⊢ σ. But then Σ ⊢ (θ → σ), as this is a propositional consequence of σ.

2. θ ∈ C: Σ ⊢ θ → θ, as this is a tautology.

3. Λ ⊆ C: This is identical to (1).

4. C is closed under the rules:


(a) Rule (PC): Suppose that γ1, γ2, ..., γn are all elements of C and φ is a propositional consequence of {γ1, γ2, ..., γn}. We must show that φ ∈ C. By assumption, Σ ⊢ (θ → γ1), Σ ⊢ (θ → γ2), ..., Σ ⊢ (θ → γn). But then as (θ → φ) is a propositional consequence of the set

{θ → γ1, θ → γ2, ..., θ → γn},

we have that Σ ⊢ (θ → φ). In other words, φ ∈ C, as needed.

(b) Quantifier Rules: Suppose that ψ → φ is in C and x is not free in ψ. We want to show that (ψ → ∀x φ) is an element of C. In other words, we have to show that

Σ ⊢ [θ → (ψ → ∀x φ)].

By assumption we have

Σ ⊢ [θ → (ψ → φ)]         ψ → φ is in C
Σ ⊢ (θ ∧ ψ) → φ           propositional consequence
Σ ⊢ (θ ∧ ψ) → ∀x φ        rule (QR)
Σ ⊢ [θ → (ψ → ∀x φ)]      propositional consequence

Notice that our use of rule (QR) is legitimate since we know that θ is a sentence, so x is not free in θ. But the last line of our argument says that (ψ → ∀x φ) ∈ C, which is what we needed to show.

The other quantifier rule, dealing with the existential quantifier, is proved similarly.

So we have shown that C contains θ, all the elements of Σ and Λ, and C is closed under the rules. This finishes the proof of the Deduction Theorem. □

2.7.1 Exercises

1. Lemma 2.7.2 tells us that Σ ⊢ θ if and only if Σ ⊢ ∀x θ. What happens if we replace the universal quantifier by an existential quantifier? So suppose that Σ ⊢ θ. Must Σ ⊢ ∃x θ? Now assume that Σ ⊢ ∃x θ. Does Σ necessarily prove θ?


2. Finish the proof of Lemma 2.7.3 by considering the case when ∀x σ is an element of Σ and is replaced by σ in Σ'.

3. Many authors demand that axioms be sentences rather than formulas. Explain how Lemma 2.7.2 implies that we could replace all of our axioms by their universal closures without changing the strength of our deductive system.

4. Suppose that η is a sentence. Prove that Σ ⊢ η if and only if Σ ∪ {¬η} ⊢ [(∀x)x = x] ∧ ¬[(∀x)x = x]. Notice that this exercise tells us that our deductive system allows us to do proofs by contradiction.

5. Suppose that P is a unary relation symbol and show that

⊢ [(∀x)P(x)] → [(∃x)P(x)].

[Suggestion: Proof by contradiction (see Exercise 4) works nicely here.]

6. If P is a binary relation symbol, show that

(∀x)(∀y)P(x, y) ⊢ (∀y)(∀x)P(x, y).

7. Let P and Q be unary relation symbols, and show that

⊢ [(∀x)(P(x)) ∧ (∀x)(Q(x))] → (∀x)[P(x) ∧ Q(x)].

2.8 Nonlogical Axioms

When we are trying to prove theorems in mathematics, there are almost always additional axioms, beyond the set of logical axioms Λ, that we use. If we are trying to prove a theorem about vector spaces, the axioms of vector spaces come in mighty handy. If we are proving theorems in a real analysis course, we need to have axioms about the structure of the real numbers. These additional axioms are sometimes explicitly stated and sometimes they are blanket assumptions that are made without being stated, but they are almost always there. In this section I give a couple of examples of sets of nonlogical axioms that we might use in writing deductions.


Example 2.8.1. For many of us, the first explicit set of nonlogical axioms that we see is in a course on linear algebra. To work those axioms out explicitly, let us fix the language L as consisting of one binary function symbol, ⊕, and infinitely many unary function symbols, c·, one for each real number c. (Yes, that symbol is "c-dot.") These function symbols will be used to represent the functions of scalar multiplication. We will also have one constant symbol, 0, to represent the zero vector of the vector space. Here, then, is one way to list the nonlogical axioms of a vector space:

1. (∀x)(∀y) x ⊕ y = y ⊕ x (vector addition is commutative).

2. (∀x)(∀y)(∀z) x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z (vector addition is associative).

3. (∀x) x ⊕ 0 = x.

4. (∀x)(∃y) x ⊕ y = 0.

5. (∀x) 1 · x = x.

6. (∀x) (c1c2) · x = c1 · (c2 · x).

7. (∀x)(∀y) c · (x ⊕ y) = c · x ⊕ c · y.

8. (∀x) (c1 + c2) · x = c1 · x ⊕ c2 · x.

Notice a couple of things here: There are infinitely many axioms listed, as the last three axioms are really axiom schemas, consisting of one axiom for each choice of c, c1, and c2. An axiom schema is a template, saying that a formula is in the axiom set if it is of a certain form. Also notice that I've cheated in using the addition sign to stand for addition and juxtaposition to stand for multiplication of real numbers, since the language L does not allow that sort of thing. See Exercise 1.
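Axioms like these can be spot-checked in a familiar model. The sketch below tests a few of them in R², with vadd and smul standing in for ⊕ and c· (the names and the random sampling are our own choices, not the text's):

```python
import random

def vadd(u, v):
    # the binary function symbol "circled plus": componentwise addition
    return (u[0] + v[0], u[1] + v[1])

def smul(c, u):
    # the unary function symbol "c-dot": scalar multiplication by c
    return (c * u[0], c * u[1])

zero = (0.0, 0.0)
random.seed(0)
vecs = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(20)]

for u in vecs:
    assert vadd(u, zero) == u          # axiom 3: x + 0 = x
    assert smul(1, u) == u             # axiom 5: 1.x = x
    for v in vecs:
        assert vadd(u, v) == vadd(v, u)  # axiom 1: commutativity
print("sampled axioms hold in R^2")
```

Of course, a finite sample only illustrates the axioms; it does not prove that R² is a model, though in this case a direct calculation does.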

Example 2.8.2. We will write out the axioms for a dense linear order without endpoints. Our language consists of a single binary relation symbol, <. Our nonlogical axioms are:

1. (∀x)(∀y)(x < y ∨ x = y ∨ y < x).

2. (∀x)(∀y)[x = y → ¬(x < y)].

3. (∀x)(∀y)(∀z)[(x < y ∧ y < z) → x < z].

4. (∀x)(∀y)[x < y → ((∃z)(x < z ∧ z < y))].

5. (∀x)(∃y)(∃z)(y < x ∧ x < z).

The first three axioms guarantee that the relation denoted by < is a linear order, the fourth axiom states that the relation is dense, and the final axiom ensures that there is no smallest element and no greatest element.
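As a quick illustration, the rational numbers with their usual order form a model of these axioms; density (axiom 4) can be spot-checked with the standard witness z = (x + y)/2. The sampling below is our own illustration, not part of the text:

```python
from fractions import Fraction
from itertools import combinations

# A grid of rationals to sample from.
sample = [Fraction(p, q) for p in range(-4, 5) for q in range(1, 5)]

for x, y in combinations(sample, 2):
    if x < y:
        z = (x + y) / 2   # the standard density witness: strictly between x and y
        assert x < z < y
print("density holds on the sample")
```

The same witness works for every pair of rationals, which is why the rationals (unlike the integers) satisfy axiom 4.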

Notice that in both of our examples, the axiom set involved is decidable: Given a formula φ that is alleged to be either an axiom for vector spaces or an axiom for dense linear orders without endpoints, we could decide whether or not the formula was, in fact, such an axiom. And furthermore, we could write a computer program that could decide the issue for us.

Example 2.8.3. It is time to introduce a collection of nonlogical axioms that will be vitally important to us for the rest of the book. We work in the language of number theory, L_NT = {0, S, +, ·, E, <}.

The set of axioms we will call N is a minimal set of assumptions to describe a bare-bones version of the usual operations on the set of natural numbers. Just how weak these axioms are will be discussed in the next chapter. These axioms will, however, be important to us in Chapters 4 and 5 precisely because they are so weak.


The Axioms of N

1. (∀x) ¬Sx = 0.

2. (∀x)(∀y)[Sx = Sy → x = y].

3. (∀x) x + 0 = x.

4. (∀x)(∀y) x + Sy = S(x + y).

5. (∀x) x · 0 = 0.

6. (∀x)(∀y) x · Sy = (x · y) + x.

7. (∀x) xE0 = S0.

8. (∀x)(∀y) xE(Sy) = (xEy) · x.

9. (∀x) ¬x < 0.

10. (∀x)(∀y)[x < Sy ↔ (x < y ∨ x = y)].

11. (∀x)(∀y)[(x < y) ∨ (x = y) ∨ (y < x)].
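Each of these axioms is true in the standard structure 𝔑. A quick machine check of that fact on a finite sample (which, of course, says nothing about what N proves) might look like this:

```python
def S(x):
    # the successor function, as interpreted in the standard model
    return x + 1

for x in range(50):
    assert S(x) != 0                    # Axiom 1
    assert x + 0 == x                   # Axiom 3
    assert x * 0 == 0                   # Axiom 5
    assert x ** 0 == 1                  # Axiom 7: xE0 = S0
    assert not (x < 0)                  # Axiom 9
    for y in range(50):
        assert (S(x) == S(y)) == (x == y)         # Axiom 2
        assert x + S(y) == S(x + y)               # Axiom 4
        assert x * S(y) == (x * y) + x            # Axiom 6
        assert x ** S(y) == (x ** y) * x          # Axiom 8
        assert (x < S(y)) == (x < y or x == y)    # Axiom 10
        assert (x < y) or (x == y) or (y < x)     # Axiom 11
print("all axioms hold on the sample")
```

Note that Python agrees with Axiom 7 that 0 raised to the power 0 is 1, i.e., 0E0 = S0.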

Although I have just claimed that N is a weak set of axioms, let us show that N is strong enough to prove some of the basic facts about the relations and functions on the natural numbers. For the following discussion, if a is a natural number, let ā be the L_NT-term SS···S0, with a occurrences of S. So ā is the canonical term of the language that is intended to refer to the natural number a.
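The canonical terms can be generated (and read back) mechanically; as a tiny illustration, here is a sketch in which the helper names `numeral` and `value` are ours, not the book's.

```python
def numeral(a: int) -> str:
    """Return the canonical L_NT-term for the natural number a:
    a occurrences of the symbol S followed by 0."""
    return "S" * a + "0"

def value(t: str) -> int:
    """Recover the natural number that a canonical term denotes:
    just count the occurrences of S."""
    return t.count("S")
```

So `numeral(3)` is the string `SSS0`, the term the text writes as 3̄.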

Lemma 2.8.4. For natural numbers a and b:

1. If a = b, then N ⊢ ā = b̄.

2. If a ≠ b, then N ⊢ ¬ā = b̄.

3. If a < b, then N ⊢ ā < b̄.

4. If a ≮ b, then N ⊢ ¬ā < b̄.

5. N ⊢ ā + b̄ = \overline{a + b}.

6. N ⊢ ā · b̄ = \overline{a · b}.

7. N ⊢ āEb̄ = \overline{a^b}.


Proof. Let us begin with (1), and let us work rather carefully. Notice that the theorem is saying that if the number a is equal to the number b, then there is a deduction from the axioms in N of the formula

SS···S0 = SS···S0,

with a S's on the left and b S's on the right. We work by induction on a (and b, since a = b). So, first assume that a = b = 0. Here is the needed deduction in N:

⋮                Deduction of (∀x)x = x (see Lemma 2.7.2)
(∀x)x = x
(∀x)x = x → 0 = 0        (Q1)
0 = 0.                   (PC)

Now, what if a = b and a and b are greater than 0? Then certainly a − 1 and b − 1 are equal, and by the inductive hypothesis there is a deduction of SS···S0 = SS···S0, with a − 1 S's on each side. If we follow that deduction with a use of axiom (E2): x = y → Sx = Sy, then (PC) gives us SS···S0 = SS···S0, with a S's on the left and b S's on the right, as needed. Write out the details of the end of this deduction. It is a little trickier than I have made it sound when you actually have to use (Q1) to do the substitution. This finishes the inductive step of the proof, so (1) is established.

Looking at (2), suppose that a ≠ b. If one of a or b is 0, then ¬ā = b̄ follows quickly from Axiom N1 and the fact that N proves that = is an equivalence relation. If neither a nor b is 0, we proceed by induction on the larger of a and b. Since a − 1 ≠ b − 1, by the inductive hypothesis, N ⊢ ¬(\overline{a − 1} = \overline{b − 1}). Then by Axiom N2, N ⊢ ¬S(\overline{a − 1}) = S(\overline{b − 1}). In other words, N ⊢ ¬ā = b̄, as S(\overline{a − 1}) is typographically equivalent to ā and S(\overline{b − 1}) is typographically equivalent to b̄.

For (3), we use induction on b. As a < b, we know that b ≠ 0 and we know that a < b − 1 or a = b − 1. So either

N ⊢ ā < \overline{b − 1} (by the inductive hypothesis) or

N ⊢ ā = \overline{b − 1} (by (1)).


So, N ⊢ (ā < \overline{b − 1} ∨ ā = \overline{b − 1}).

But then by Axiom N10, N ⊢ ā < S(\overline{b − 1}), which is exactly the same as N ⊢ ā < b̄.

We will now discuss (5), leaving (4), (6), and (7) to the exercises. We prove (5) by induction on b. If b = 0, then a + b = a + 0 = a. So Axiom N3 tells us that N ⊢ ā + b̄ = ā.

For the inductive step, if b = c + 1, then ā + b̄ is ā + S(c̄). So Axiom N4 tells us that

N ⊢ ā + b̄ = S(ā + c̄).

Since N ⊢ ā + c̄ = \overline{a + c} by the inductive hypothesis, the equality axioms tell us that N ⊢ S(ā + c̄) = S(\overline{a + c}). But S(\overline{a + c}) is \overline{a + c + 1}, which is \overline{a + b}. Since we know (by Theorem 2.7.1) that N ⊢ "equality is transitive," N ⊢ ā + b̄ = \overline{a + b}. □
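To see the induction in (5) in action, here is a sketch of the chain of equalities that a deduction in N would certify for a = 2 and b = 1; the axiom labels refer to the list above.

```latex
\begin{align*}
\overline{2} + \overline{1} = SS0 + S0
  &= S(SS0 + 0)   && \text{Axiom N4} \\
  &= S(SS0)       && \text{Axiom N3, with (E2)} \\
  &= SSS0 = \overline{3}.
\end{align*}
```

Each "=" here stands for an equality that N proves, chained together by the transitivity of equality as in the proof above.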

2.8.1 Exercises

1. This problem is in the setting of Example 2.8.1. Exactly one of the following two statements is in the collection of nonlogical axioms of that example. Figure out which one it is, and why.

• (∀x)(17 + 42) · x = 17 · x ⊕ 42 · x.
• (∀x)59 · x = 17 · x ⊕ 42 · x.

Now fix up the presentation of the axioms for a vector space. You may need to redefine the language, or you may be able to take what is presented in Example 2.8.1 and fix it up.

2. For each of the following structures, decide whether or not it satisfies all of the axioms of Example 2.8.2. If the structure is not a dense linear order without endpoints, point out which of the axioms the structure fails to satisfy.

(a) The structure (N, <), the natural numbers with the usual less than relation

(b) The structure (Z, <), the integers with the usual less than relation

(c) The structure (Q, <), the set of rational numbers with the usual less than relation


(d) The structure (R, <), the real numbers with the usual less than relation

(e) The structure (C, <), the complex numbers with the relation < defined by:

a + bi < c + di if and only if (a² + b²) < (c² + d²).
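For part (e), the defined order can be experimented with computationally. The helper `lt` below is ours, not the book's, and the computation is only meant to suggest what to look for in the exercise, not to settle it.

```python
def lt(z: complex, w: complex) -> bool:
    """The relation of Exercise 2(e): z < w iff |z|^2 < |w|^2."""
    return abs(z) ** 2 < abs(w) ** 2

# 1 and i are distinct complex numbers, yet both have squared
# modulus 1, so neither is below the other in this order.
z, w = 1 + 0j, 0 + 1j
comparable = lt(z, w) or lt(w, z) or z == w
```

Running this shows that `comparable` is `False`, which is worth comparing against the five axioms one at a time.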

3. Write out the axioms for group theory. If you do not know the axioms of group theory, go to the library and check out any book with the phrase "abstract algebra", "modern algebra", or "group theory" in the title. Then check the index under "group." Specify your language carefully and then writing out the axioms should be easy.

4. In this exercise you are asked to write up some of the axioms of Zermelo-Fraenkel set theory, also known as ZF. The language of set theory consists of a single binary relation symbol, ∈, that is intended to represent the relation "is an element of." So the formula x ∈ y will usually be interpreted as meaning that the set x is an element of the set y. Here are English versions of some of the axioms of ZF. Write them up formally as sentences in the language of set theory.

The Axiom of Extensionality: Two sets are equal if and only if they have the same elements.

The Null Set Axiom: There is a set with no elements.

The Pair Set Axiom: If a and b are sets, then there is a set whose only elements are a and b.

The Axiom of Union: If a is a set, then there is a set consisting of exactly the elements of the elements of a. [Query: Can you figure out why this is called the axiom of union? Write up an example, where a is a set of three sets and each of those three sets has two elements. What does the set whose existence is guaranteed by this axiom look like?]

The Power Set Axiom: If a is a set, then there is a set consisting of all of the subsets of a. [Suggestion: For this axiom it might be nice to define ⊆ by saying that x ⊆ y is shorthand for (some nice formula with x and y free in the language of set theory).]


5. Complete the proof of Lemma 2.8.4.

6. Lemma 2.8.4(2) states that there is a deduction in N of the sentence ¬(S0 = SS0). Find a deduction in N of this sentence.

7. This problem is just to give you a hint of how little we can prove using the axiom system N. Suppose that we wanted to prove that N ⊬ (∀x)¬x < x. It makes sense (and is a consequence of the Completeness Theorem, Theorem 3.2.2) that one way to go about this would be to construct an L_NT-structure 𝔄 in which all the axioms of N are true but (∀x)¬x < x is not true. Do so. I would suggest that you take as your universe the set

A = {0, 1, 2, 3, ...} ∪ {a},

where a is the letter a and not a natural number. You need to define the functions S^𝔄, +^𝔄, etc., and the relation <^𝔄. Don't do anything too strange for the natural numbers, but make sure that a <^𝔄 a. Check that the axioms of N are true in the structure 𝔄, and you're finished!

8. Using more or less the same technique as in Exercise 7, show that N does not prove that addition is commutative.

2.9 Summing Up, Looking Ahead

In these first two chapters we have developed a vocabulary for talking about mathematical structures, mathematical languages, and deductions. Chapter 2 has focused on deductions, which are supposed to be the formal equivalents of the mathematical proofs that you have seen for many years. We have seen some results, such as the Deduction Theorem, which indicate that deductions behave the way proofs do. The Soundness Theorem shows that deductions preserve truth, which gives us some comfort as we try to justify in our minds why proofs preserve truth.

As you look at the statement of the Soundness Theorem, you can see that it is explicitly trying to relate the syntactical notion of deducibility (⊢) with the semantical notion of logical implication (⊨). The first major result of Chapter 3, the Completeness Theorem, will also relate these two notions and will in fact show that they are equivalent. Then the Compactness Theorem (which is really a quite trivial consequence of the Completeness Theorem) will be used to construct some mathematical models with some very interesting properties.

Chapter 3

Completeness and Compactness

3.1 Naively

We are at a point in our explorations where we have established a particular deductive system, consisting of the logical axioms and rules of inference that we set out in the last chapter. The Soundness Theorem showed that our deductive system preserves truth, in the sense that if there is a deduction of φ from Σ, then φ is true in any model of Σ. The Completeness Theorem, the first major result of this chapter, gives us the converse to the Soundness Theorem. So, when the two results are combined, we will have this equivalence:

Σ ⊨ φ if and only if Σ ⊢ φ.

We have already made a big point of the fact that if our deductive system allows us to prove a statement, we would like that statement to be true. Certainly, the content of the Soundness Theorem is exactly that: If ⊢ φ, if there is a deduction of φ from only the logical axioms without any additional assumptions, then we know that ⊨ φ, so φ is true in every structure with every assignment function. To the extent that the informal mathematical practice of everyday proofs is modeled by our formal system of deduction, we can be sure that the things that we prove mathematically are true.

If life were peaches and cream, we would also like to know that we can prove anything that is true. The Completeness Theorem is the result that asserts that our deductive system is that strong. So you would be tempted to conclude that, for example, we are able to prove any statement of first-order logic that is a true statement about the natural numbers.

Unfortunately, this conclusion is based upon a misreading of the statement of the Completeness Theorem. What we will prove is that our deductive system is complete, in the sense of this definition:

Definition 3.1.1. A deductive system consisting of a collection of logical axioms Λ and a collection of rules of inference is said to be complete if for every set of nonlogical axioms Σ and every L-formula φ,

If Σ ⊨ φ, then Σ ⊢ φ.

What this says is that if φ is an L-formula that is true in every model of Σ, then there will be a deduction from Σ of φ. So our ability to prove φ depends on φ being true in every model of Σ. Thus if we want to be able to use Σ to prove every true statement about the natural numbers, we have to be able to find a set of nonlogical axioms Σ such that Σ ⊨ φ if and only if φ is a true statement about the natural numbers. We will have much more to say about that problem in Chapters 4 and 5.

The second part of the chapter concerns the Compactness Theorem and the Löwenheim–Skolem Theorems. We will use these results to investigate various types of mathematical structures, including structures that are quite surprising.

In some sense, we have spent a lot of time in the first couple of chapters of this book developing a lot of vocabulary and establishing some basic results. Now we will roll up our sleeves and get a couple of worthwhile theorems. It is time to start showing some of the beauty and the power, as well as the limitations, of first-order logic.

3.2 Completeness

Let us fix a collection of nonlogical axioms, Σ. Our goal in this section is to show that for any formula φ, if Σ ⊨ φ, then Σ ⊢ φ. In some sense, this is the only possible interpretation of the phrase "you can prove anything that is true," if you are discussing the adequacy of the deductive system. To say that φ is true whenever Σ is a collection of true axioms is precisely to say that Σ logically implies φ. Thus, the Completeness Theorem will say that whenever φ is logically implied by Σ, there is a deduction from Σ of φ. So the Completeness Theorem is the converse of the Soundness Theorem.

We have to begin with a short discussion of consistency.

Definition 3.2.1. Let Σ be a set of L-formulas. We will say that Σ is inconsistent if there is a deduction from Σ of [(∀x)x = x] ∧ ¬[(∀x)x = x]. We say that Σ is consistent if it is not inconsistent.

So Σ is inconsistent if Σ proves a contradiction. Exercise 1 asks you to show that if Σ is inconsistent, then there is a deduction from Σ of every L-formula. For notational convenience, let us agree to use the symbol ⊥ (read "false" or "bottom") for the contradictory sentence [(∀x)x = x] ∧ ¬[(∀x)x = x]. All you will have to remember is that ⊥ is a sentence that is in every language and is true in no structure.

Theorem 3.2.2 (Completeness Theorem). Suppose that Σ is a set of L-formulas and φ is an L-formula. If Σ ⊨ φ, then Σ ⊢ φ.

Proof.

Chaff: This theorem was established in 1929 by the Austrian mathematician Kurt Gödel, in his Ph.D. dissertation. If you haven't picked it up already, you should know that the work of Gödel is central to the development of logic in the twentieth century. He is responsible for most of the major results that we will state in the rest of the book: the Completeness Theorem, the Compactness Theorem, and the two Incompleteness Theorems. Gödel was an absolutely brilliant man, with a complex and troubled personality. A wonderful and engaging biography of Gödel is [Dawson 97]. The first volume of Gödel's collected works, [Gödel-Works], also includes a biography and introductory comments about his papers that can help your understanding of this wonderful mathematics.

The proof we present of the Completeness Theorem is based on work of Leon Henkin. The idea of Henkin's proof is brilliant, but the details take some time to work through. Just to warn you, this proof doesn't end until page 98.


Before we get involved in the details, let us look at a rough outline of how the argument proceeds. There are a few simplifications and one or two outright lies in the outline, but we will straighten everything out as we work out the proof.

Outline of the Proof

There will be a preliminary argument that will show that it is sufficient to prove that if Σ is a consistent set of sentences, then Σ has a model. Then we will proceed to assume that we are given such a set of sentences, and we will construct a model for Σ.

The construction of the model will proceed in several steps, but the central idea was introduced in Example 1.6.4. The elements of the model will be variable-free terms of a language. We will construct this model so that the formulas that will be true in the model are precisely the formulas that are in a certain set of formulas, which we will call Σ'. We will make sure that Σ ⊆ Σ', so all of the formulas of Σ will be true in this constructed model. In other words, we will have constructed a model of Σ.

To make the construction work we will take our given set of L-sentences Σ and extend it to a bigger set of sentences Σ' in a bigger language L'. We do this extension in two steps. First, we will add in some new axioms, called Henkin axioms, to get a collection Σ̄. Then we will extend Σ̄ to Σ' in such a way that:

1. Σ' is consistent.

2. For every L'-sentence σ, either σ ∈ Σ' or (¬σ) ∈ Σ'.

Thus we will say that Σ' is a maximal consistent extension of Σ̄, where maximal means that it is impossible to add any sentences to Σ' without making Σ' inconsistent.

Now there are two possible sources of problems in this expansion of Σ to Σ'. The first is that we will change languages from L to L', where L ⊆ L'. It is conceivable that Σ will not be consistent when viewed as a set of L'-sentences, even though Σ is consistent when viewed as a set of L-sentences. The reason that this might happen is that there are more L'-deductions than there are L-deductions, and one of these new deductions just might happen to be a deduction of ⊥. Fortunately, Lemma 3.2.3 will show us that this does not happen, so Σ is consistent as a set of L'-sentences. The other possible problem is in our two extensions of Σ, first to Σ̄ and then to Σ'. It certainly might happen that we could add a sentence to Σ̄ in such a way as to make Σ' inconsistent. But Lemma 3.2.4 and Exercise 4 will prove that Σ' is still consistent.

Once we have our maximal consistent set of sentences Σ', we will construct a model 𝔄 and prove that the sentences of L' that are in Σ' are precisely the sentences that are true in 𝔄. Thus, 𝔄 will be a model of Σ', and as Σ ⊆ Σ', 𝔄 will be a model of Σ as well.

This looks daunting, but if we keep our wits about us and do things one step at a time, it will all come together at the end.

Preliminary Argument

So let us fix our setting for the rest of this proof. We are working in a language L. For the purposes of this proof, we assume that the language is countable, which means that the formulas of L can be written in an infinite list α₁, α₂, ..., αₙ, .... (An outline of the changes in the proof necessary for the case when L is not countable can be found in Exercise 6.)

We are given a set of formulas Σ, and we are assuming that Σ ⊨ φ. We have to prove that Σ ⊢ φ.

Note that we can assume that φ is a sentence: By Lemma 2.7.2, Σ ⊢ φ if and only if there is a deduction from Σ of the universal closure of φ. Also, by the comments following Lemma 2.7.3, we can assume that every element of Σ is a sentence. So, now all(!) we have to do is prove that if Σ is a set of sentences and φ is a sentence and if Σ ⊨ φ, then Σ ⊢ φ.

Now we claim that it suffices to prove the case where φ is the sentence ⊥. For suppose we know that if Σ ⊨ ⊥, then Σ ⊢ ⊥, and suppose we are given a sentence φ such that Σ ⊨ φ. Then Σ ∪ {¬φ} ⊨ ⊥, as there are no models of Σ ∪ {¬φ}, so Σ ∪ {¬φ} ⊢ ⊥. This tells us, by Exercise 4 in Section 2.7.1, that Σ ⊢ φ, as needed.

So we have reduced what we need to do to proving that if Σ ⊨ ⊥, then Σ ⊢ ⊥, for Σ a set of L-sentences. This is equivalent to saying that if there is no model of Σ, then Σ ⊢ ⊥. We will work with the contrapositive: If Σ ⊬ ⊥, then there is a model of Σ. In other words, we will prove:

If Σ is a consistent set of sentences, then there is a model of Σ.


This ends the preliminary argument that was promised in the outline of the proof. Now we will assume that Σ is a consistent set of L-sentences and go about the task of constructing a model of Σ.

Changing the Language from L to L₁

The model of Σ that we will construct will be a model whose elements are variable-free terms of a language. This might lead to problems. For example, suppose that L contains no constant symbols. Then there will be no variable-free terms of L. Or, perhaps L has exactly one constant symbol c, no function symbols, one unary relation symbol P, and

Σ = {∃xP(x), ¬P(c)}.

Here Σ is consistent, but no structure whose universe is {c} (c is the only variable-free term of L) can be a model of Σ. So we have to expand our language to give us enough constant symbols to build our model.

So let L₀ = L, and define

L₁ = L₀ ∪ {c₁, c₂, ..., cₙ, ...},

where the cᵢ's are new constant symbols.

Chaff: Did you notice that when I was defining L₁ I took something we already knew about, L, and gave it a new name, L₀? When you are reading mathematics and something like that happens, it is almost always a clue that whatever happens next is going to be iterated, in this case to build L₂, L₃, and so on. Back in my English lit course we called that foreshadowing.

We say (for the obvious reason) that L₁ is an extension by constants of L₀. As mentioned in the outline, it is not immediately clear that Σ remains consistent when viewed as a collection of L₁-sentences rather than L-sentences. The following lemma, the proof of which is delayed until page 99, shows that Σ remains consistent.

Lemma 3.2.3. If Σ is a consistent set of L-sentences and L₁ is an extension by constants of L, then Σ is consistent when viewed as a set of L₁-sentences.


The constants that we have added to form L₁ are called Henkin constants, and they serve a particular purpose. They will be the witnesses that allow us to ensure that any time Σ claims ∃xφ(x), then in our constructed model 𝔄, there will be an element (which will be one of these constants c) such that 𝔄 ⊨ φ(c).

Chaff: Recall that the notation ∃xφ(x) implies that φ is a formula with x as the only free variable. Then φ(c) is the result of replacing the free occurrences of x with the constant symbol c. Thus φ(c) is φ^x_c.

The next step in our construction makes sure that the Henkin constants will be the witnesses for the existential sentences in Σ.

Extending Σ to Include Henkin Axioms

Consider the collection of sentences of the form ∃xθ in the language L₀. As the language L₀ is countable, the collection of L₀-sentences is countable, so we can list all such sentences of the form ∃xθ, enumerating them by the positive integers:

∃xθ₁, ∃xθ₂, ∃xθ₃, ..., ∃xθₙ, ....

We will now use the Henkin constants of L₁ to add to Σ countably many axioms, called Henkin axioms. These axioms will ensure that every existential sentence that is asserted by Σ will have a witness in our constructed structure 𝔄. The collection of Henkin axioms is

H₁ = {[∃xθᵢ] → θᵢ(cᵢ) | (∃xθᵢ) is an L₀-sentence},

where θᵢ(cᵢ) is shorthand for (θᵢ)^x_{cᵢ}. Now let Σ₀ = Σ, and define

Σ₁ = Σ₀ ∪ H₁.

Chaff: Foreshadowing!

As Σ₁ contains many more sentences than Σ₀, it seems entirely possible that Σ₁ is no longer consistent. Fortunately, the next lemma shows that is not the case. The proof of the lemma is on page 99.

Lemma 3.2.4. If Σ₀ is a consistent set of sentences and Σ₁ is created by adding Henkin axioms to Σ₀, then Σ₁ is consistent.


Now we have Σ₁, a consistent set of L₁-sentences. We can repeat this construction, building a larger language L₂ consisting of L₁ together with an infinite set of new Henkin constants kᵢ. Then we can let H₂ be a new set of Henkin axioms:

H₂ = {[∃xθᵢ] → θᵢ(kᵢ) | (∃xθᵢ) is an L₁-sentence},

and let Σ₂ be Σ₁ ∪ H₂. As before, Σ₂ will be consistent. We can continue this process to build:

• L = L₀ ⊆ L₁ ⊆ L₂ ⊆ ···, an increasing chain of languages.

• H₁, H₂, H₃, ..., each Hᵢ a collection of Henkin axioms in the language Lᵢ.

• Σ = Σ₀ ⊆ Σ₁ ⊆ Σ₂ ⊆ ···, where each Σᵢ is a consistent set of Lᵢ-sentences.

Let L' = ⋃ᵢ Lᵢ and let Σ̄ = ⋃ᵢ Σᵢ. Each Σᵢ is a consistent set of L'-sentences, as can be shown by proofs that are identical to those of Lemmas 3.2.3 and 3.2.4. You will show in Exercise 2 that Σ̄ is a consistent set of L'-sentences.

Extending to a Maximal Consistent Set of Sentences

As you recall, we were going to construct our model 𝔄 in such a way that the sentences that were true in 𝔄 were exactly the elements of a set of sentences Σ'. It is time to build Σ'. Since every sentence is either true or false in a given model, it will be necessary for us to make sure that for every sentence σ, either σ ∈ Σ' or ¬σ ∈ Σ'. Since we can't have both σ and ¬σ true in any structure, we must also make sure that we don't put both σ and ¬σ into Σ'. Thus, Σ' will be a maximal consistent extension of Σ̄.

To build this extension, fix an enumeration of all of the L'-sentences

σ₁, σ₂, ..., σₙ, ....

We can do this as L' is countable, being a countable union of countable sets. Now we work our way through this list, one sentence at a time, adding either σₙ or the denial of σₙ to our growing list of sentences, depending on which one keeps our collection consistent.

Here are the details. Let Σ⁰ = Σ̄, and assume that Σᵏ is known to be a consistent set of L'-sentences. We will show how to build Σᵏ⁺¹ ⊇ Σᵏ and prove that Σᵏ⁺¹ is also a consistent set of L'-sentences. Then we let

Σ' = Σ⁰ ∪ Σ¹ ∪ Σ² ∪ ··· ∪ Σⁿ ∪ ···.

You will prove in Exercise 4 that Σ' is a consistent set of sentences.

It will be obvious from the construction of Σᵏ⁺¹ from Σᵏ that Σ' is maximal, and thus we will have completed our task of producing a maximal consistent extension of Σ̄.

So all we have to do is describe how to get Σᵏ⁺¹ from Σᵏ and prove that Σᵏ⁺¹ is consistent. Given Σᵏ, consider the set Σᵏ ∪ {σₖ₊₁}, where σₖ₊₁ is the (k + 1)st element of our fixed list of all of the L'-sentences. Let

Σᵏ⁺¹ = Σᵏ ∪ {σₖ₊₁}    if Σᵏ ∪ {σₖ₊₁} is consistent,
Σᵏ⁺¹ = Σᵏ ∪ {¬σₖ₊₁}   otherwise.

You are asked in Exercise 3 to prove that Σᵏ⁺¹ is consistent. Once you have done that, we have constructed a maximal consistent Σ' that extends Σ̄.
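The stage-by-stage choice between σₖ₊₁ and ¬σₖ₊₁ can be mimicked in a toy propositional setting, where sentences are just signed proposition letters and "consistent" simply means that no letter occurs with both signs. This sketch is only an analog of the construction, not the first-order version, and the function names are ours.

```python
def negate(lit: str) -> str:
    """Negation of a literal: "p" <-> "~p"."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def consistent(lits: set) -> bool:
    """A set of literals is consistent iff no letter occurs
    with both signs."""
    return all(negate(l) not in lits for l in lits)

def maximal_extension(sigma: set, enumeration: list) -> set:
    """Mimic the construction of Sigma': walk the fixed enumeration
    of sentences, at each stage adding the sentence when that keeps
    the set consistent, and its negation otherwise."""
    extended = set(sigma)
    for s in enumeration:
        if consistent(extended | {s}):
            extended.add(s)
        else:
            extended.add(negate(s))
    return extended
```

The result contains, for every enumerated sentence, either the sentence or its negation, and stays consistent at every stage, which is exactly the shape of the argument for Σ'.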

The next lemma states that Σ' is deductively closed, at least as far as sentences are concerned. As you work through the proof, the Deduction Theorem will be useful.

Lemma 3.2.5. If σ is a sentence, then σ ∈ Σ' if and only if Σ' ⊢ σ.

Proof. Exercise 5. □

Construction of the Model—Preliminaries

I have mentioned a few times that the model of Σ that we are going to construct will have as its universe the collection of variable-free terms of the language L'. It is now time to confess that I have lied. It is easy to see why the plan of using the terms as the elements of the universe is doomed to failure. Suppose that there are two different terms t₁ and t₂ of the language and somewhere in Σ' is the sentence t₁ = t₂. If the terms were the elements of the universe, then we could not model Σ', as the two terms t₁ and t₂ are not the same (they are typographically distinct), while Σ' demands that they be equal. Our solution to this problem is to take the collection of variable-free terms, define an equivalence relation on that set, and then construct a model from the equivalence classes of the variable-free terms.


So let T be the set of variable-free terms of the language L', and define a relation ∼ on T by

t₁ ∼ t₂ if and only if (t₁ = t₂) ∈ Σ'.

It is not difficult to show that ~ is an equivalence relation. We will verify that ~ is symmetric, leaving reflexivity and transitivity to the Exercises.

To show that ∼ is symmetric, assume that t₁ ∼ t₂. We must prove that t₂ ∼ t₁. As we know t₁ ∼ t₂, by definition we know that the sentence (t₁ = t₂) is an element of Σ'. We need to show that (t₂ = t₁) ∈ Σ'. Assume not. Then by the maximality of Σ', ¬(t₂ = t₁) ∈ Σ'. But since we know that Σ' ⊢ t₁ = t₂, by Theorem 2.7.1, Σ' ⊢ t₂ = t₁. (Can you provide the details?) But since we also know that Σ' ⊢ ¬(t₂ = t₁), it must be the case that Σ' ⊢ ⊥, which is a contradiction, as we know that Σ' is consistent. So our assumption is wrong and (t₂ = t₁) ∈ Σ', and thus ∼ is a symmetric relation.

So, assuming that you have worked through Exercise 7, we have established that ∼ is an equivalence relation. Now let [t] be the set of all variable-free terms s of the language L' such that t ∼ s. So [t] is the equivalence class of all terms that Σ' tells us are equal to t. The collection of all such equivalence classes will be the universe of our model 𝔄.
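As a small computational illustration of passing from terms to classes, here is a sketch (the function name is ours) that partitions a finite list of variable-free terms according to a given list of equations, identifying two terms exactly when the equations force it. This mirrors the fact that Σ', being deductively closed, already contains the reflexive, symmetric, transitive closure of the equalities it asserts.

```python
def term_classes(terms, equations):
    """Partition terms into equivalence classes, where two terms are
    identified iff the listed equations force it. Implemented with a
    union-find structure using path halving."""
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for s, t in equations:
        parent[find(s)] = find(t)  # merge the two classes

    classes = {}
    for t in terms:
        classes.setdefault(find(t), set()).add(t)
    return list(classes.values())
```

For instance, with terms 0, S0, c, Sc and the single equation c = S0, the classes are {0}, {S0, c}, and {Sc}: exactly the [t]'s that would serve as elements of the universe.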

Construction of the Model—The Main Ideas

To define our model of Σ', we must construct an L'-structure. Thus, we have to describe the universe of our structure as well as interpretations of all of the constant, function, and relation symbols of the language L'. We discuss each of them separately.

The Universe A: As explained above, the universe of 𝔄 will be the collection of ∼-equivalence classes of the variable-free terms of L'. For example, if L' includes the binary function symbol f, the non-Henkin constant symbol k, and the Henkin constants c₁, c₂, ..., cₙ, ..., then the universe of our structure would include among its elements [cₙ] and [f(k, c₃)].

The Constants: For each constant symbol c of L' (including the Henkin constants), we need to pick out an element c^𝔄 of the universe to be the element represented by that symbol. We don't do anything fancy here:

c^𝔄 = [c].

So each constant symbol will denote its own equivalence class.

The Functions: If f is an n-ary function symbol, we must define an n-ary function f^𝔄 : Aⁿ → A. Let me write down the definition of f^𝔄, and then we can try to figure out exactly what the definition is saying:

f^𝔄([t₁], [t₂], ..., [tₙ]) = [f t₁ t₂ ··· tₙ].

On the left-hand side of the equality you will notice that there are n equivalence classes that are the inputs to the function f^𝔄.

Since the elements of A are equivalence classes and f^𝔄 is an n-ary function, that should be all right. On the right side of the equation there is a single equivalence class, and the thing inside the brackets is a variable-free term of L'. Notice that the function f^𝔄 acts by placing the symbol f in front of the terms and then taking the equivalence class of the result.

There is one detail that has to be addressed. We must show that the function f^𝔄 is well defined. Let me say a bit about what that means, assuming that f is a unary function symbol, for simplicity. Notice that our definition of f^𝔄([t]) depends on the name of the equivalence class that we are putting into f^𝔄. This might lead to problems, as it is at least conceivable that we could have two terms, t₁ and t₂, such that [t₁] is the same set as [t₂], but f^𝔄([t₁]) and f^𝔄([t₂]) evaluate to be different sets. Then our alleged function f^𝔄 wouldn't even be a function. Showing that this does not happen is what we mean when we say that we must show that the function f^𝔄 is well defined.

Let us look at the proof that our function is, in fact, well defined. Suppose that [t₁] = [t₂]. We must show that f^𝔄([t₁]) = f^𝔄([t₂]). In other words, we must show that if [t₁] = [t₂], then [f(t₁)] = [f(t₂)]. Again looking at the definition of our equivalence relation, this means that we must show that if t₁ = t₂ is an element of Σ', then so is f(t₁) = f(t₂). So assume that t₁ = t₂ is an element of Σ'. Here is an outline of a deduction from Σ' of f(t₁) = f(t₂):


x = y → f(x) = f(y)          axiom (E2)
t₁ = t₂ → f(t₁) = f(t₂)      (Q1)
t₁ = t₂                      element of Σ'
f(t₁) = f(t₂)                (PC)

Since Σ' ⊢ f(t₁) = f(t₂), Lemma 3.2.5 tells us that f(t₁) = f(t₂) is an element of Σ', as needed. So the function f^𝔄 is well defined.

The Relations: Suppose that R is an n-ary relation symbol of L'. We must define an n-ary relation R^𝔄 on A. In other words, we must decide which n-tuples of equivalence classes will stand in the relation R^𝔄. Here is where we use the elements of Σ'. We define R^𝔄 by this statement:

R^𝔄([t₁], [t₂], ..., [tₙ]) is true if and only if Rt₁t₂...tₙ ∈ Σ'.

So elements of the universe are in the relation R^𝔄 if and only if Σ' says they are in the relation R. Of course, we must show that the relation R^𝔄 is well defined, also. Or rather, you must show that the relation R^𝔄 is well defined. See Exercise 8.

At this point we have constructed a perfectly good L'-structure. What we have to do next is show that 𝔄 makes all of the sentences of Σ' true. Then we will have shown that we have constructed a model of Σ'.

Proposition 3.2.6. 𝔄 ⊨ Σ'.

Proof. We will in fact prove something slightly stronger. We will prove, for each sentence σ, that

σ ∈ Σ' if and only if 𝔄 ⊨ σ.

Well, since you have noticed, this isn't really stronger, as we know that Σ' is maximal. But it does appear stronger, and this version of the proposition is what we need to get the inductive steps to work out nicely.


We proceed by induction on the complexity of the formulas in Σ'. For the base case, suppose that σ is an atomic sentence. Then σ is of the form Rt₁t₂...tₙ, where R is an n-ary relation symbol and the tᵢ's are variable-free terms. But then our definition of R^𝔄 guaranteed that 𝔄 ⊨ σ if and only if σ ∈ Σ'. Notice that if R is = and σ is t₁ = t₂, then σ ∈ Σ' iff t₁ ∼ t₂ iff [t₁] = [t₂] iff 𝔄 ⊨ σ.

For the inductive cases, suppose first that σ is ¬α, where we know by the inductive hypothesis that 𝔄 ⊨ α if and only if α ∈ Σ'. Notice that as Σ' is a maximal consistent set of sentences, we know that σ ∈ Σ' if and only if α ∉ Σ'. Thus

σ ∈ Σ' if and only if α ∉ Σ'
        if and only if 𝔄 ⊭ α
        if and only if 𝔄 ⊨ ¬α
        if and only if 𝔄 ⊨ σ.

The second inductive case, when σ is α ∨ β, is similar and is left to the Exercises.

The interesting case is when σ is a sentence of the form ∀xφ. We must show that ∀xφ ∈ Σ' if and only if 𝔄 ⊨ ∀xφ. We do each implication separately.

First, assume that Vz^ € £' . We must show that 21 f= which means that we must show, given an assignment function a, that 211= Vx0[s]. Since the elements of A are equivalence classes of variable-free terms, this means that we have to show for any variable-free term t that

But (here is another lemma for you to prove) for any variable-free term t and any assignment function s, s̄(t) = [t], and so by Theorem 2.6.2, we need to prove that

𝔄 ⊨ φ^x_t[s].

Notice that φ^x_t is a sentence, so 𝔄 ⊨ φ^x_t[s] if and only if 𝔄 ⊨ φ^x_t.

But also notice that Σ′ ⊢ φ^x_t, as ∀xφ is an element of Σ′, (∀xφ) → φ^x_t is a quantifier axiom of type (Q1) (t is substitutable for x in φ as t is variable-free), and Σ′ is deductively closed for sentences by Lemma 3.2.5. But φ^x_t is less complex than ∀xφ, and thus by our inductive hypothesis, 𝔄 ⊨ φ^x_t, as needed.

98 Chapter 3. Completeness and Compactness

For the reverse direction of our biconditional, assume that ∀xφ ∉ Σ′. We need to show that 𝔄 ⊭ ∀xφ. As Σ′ is maximal, ¬∀xφ ∈ Σ′. By deductive closure again, this means that ∃x¬φ ∈ Σ′. From our construction of Σ′, we know there is some Henkin constant cᵢ such that ([∃x¬φ] → ¬φ(cᵢ)) ∈ Σ′, and using deductive closure once again, this tells us that ¬φ(cᵢ) ∈ Σ′. Having stripped off a quantifier, we can assert via the inductive hypothesis that 𝔄 ⊨ ¬φ(cᵢ), so 𝔄 ⊭ ∀xφ, as needed.

This finishes our proof of Proposition 3.2.6, so we know that the ℒ′-structure 𝔄 is a model of Σ′. ∎

Construction of the Model—Cleaning Up

As you recall, back in our outline of the proof of the Completeness Theorem on page 88, we were going to prove the theorem by constructing a model of Σ. We are almost there. We have a structure 𝔄, we know that 𝔄 is a model of Σ′, and we know that Σ ⊆ Σ′, so every sentence in Σ is true in the structure 𝔄. We're just about done. The only problem is that Σ began life as a set of ℒ-sentences, while 𝔄 is an ℒ′-structure, not an ℒ-structure.

Fortunately, this is easily remedied by a slight bit of amnesia: Define the structure 𝔄↾ℒ (read 𝔄 restricted to ℒ, or the restriction of 𝔄 to ℒ) as follows: The universe of 𝔄↾ℒ is the same as the universe of 𝔄. Any constant symbols, function symbols, and relation symbols of ℒ are interpreted in 𝔄↾ℒ exactly as they were interpreted in 𝔄, and we just ignore all of the symbols that were added as we moved from ℒ to ℒ′. Now 𝔄↾ℒ is a perfectly good ℒ-structure, and all that is left to finish the proof of the Completeness Theorem is to work through one last lemma:

Lemma 3.2.7. If σ is an ℒ-sentence, then 𝔄 ⊨ σ if and only if 𝔄↾ℒ ⊨ σ.

Proof. (Outline) Use induction on the complexity of σ, proving that 𝔄↾ℒ ⊨ σ if and only if σ ∈ Σ′, as in the proof of Proposition 3.2.6. ∎

Thus, we have succeeded in producing an ℒ-structure that is a model of Σ, so we know that every consistent set of sentences has a model. By our preliminary remarks on page 89, we thus know that if Σ ⊨ φ, then Σ ⊢ φ, and our proof of the Completeness Theorem is complete. ∎


Proofs of the Lemmas

We present here the proofs of two lemmas that were used in the proof of the Completeness Theorem. The first lemma was introduced when we expanded the language ℒ to the language ℒ′ and we were concerned about the consistency of Σ in the new, expanded language.

(Lemma 3.2.3). If Σ is a consistent set of ℒ-sentences and ℒ′ is an extension by constants of ℒ, then Σ is consistent when viewed as a set of ℒ′-sentences.

Proof. Suppose, by way of contradiction, that Σ is not consistent as a set of ℒ′-sentences. Thus there is a deduction (in ℒ′) of ⊥ from Σ. Let n be the smallest number of new constants used in any such deduction, and let D′ be a deduction using exactly n such constants. Notice that n > 0, as otherwise D′ would be a deduction of ⊥ in ℒ. We show that there is a deduction of ⊥ using fewer than n constants, a contradiction that establishes the lemma.

Let v be a variable that does not occur in D′, let c be one of the new constants that occurs in D′, and let D be the sequence of formulas (φᵢ) that is formed by taking each formula φ′ᵢ in D′ and replacing all occurrences of c in φ′ᵢ by v. The last formula in D is ⊥, so if we can show that D is a deduction, we will be finished.
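The replacement of c by v is a purely syntactic operation; it can be sketched as a recursion on syntax trees (a toy tuple representation of formulas and terms of our own devising, not the book's official definition):

```python
# Toy syntax trees: a formula or term is a tuple (symbol, child, child, ...);
# constants and variables are plain strings. replace(phi, c, v) returns the
# result of replacing every occurrence of the constant c by the variable v.
def replace(phi, c, v):
    if isinstance(phi, str):                      # a constant or a variable
        return v if phi == c else phi
    return (phi[0],) + tuple(replace(sub, c, v) for sub in phi[1:])

# Replacing c by v in the (toy) formula R(c, f(c, x)):
phi = ("R", "c", ("f", "c", "x"))
assert replace(phi, "c", "v") == ("R", "v", ("f", "v", "x"))
```

Applying this operation to every formula in D′ produces the sequence D of the proof; the inductive argument below checks that each justification in D′ survives the replacement.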

So we use induction on the elements of the deduction D. If φ′ᵢ is an element of D′ by virtue of being an equality axiom or an element of Σ, then φᵢ = φ′ᵢ, and φᵢ is an element of a deduction for the same reason. If φ′ᵢ is a quantifier axiom, for example (∀x)θ′ → θ′^x_t′, then φᵢ will also be a quantifier axiom, in this case (∀x)θ → θ^x_t. There will be no problems with substitutability of t for x, given that t′ is substitutable for x. If φ′ᵢ is an element of the deduction by virtue of being the conclusion of a rule of inference (Γ′, φ′ᵢ), then (Γ, φᵢ) will be a rule of inference that will justify φᵢ.

This completes the argument that D is a deduction of ⊥. Since D clearly uses fewer new constant symbols than D′, we have our contradiction and our proof is complete. ∎

The second lemma was needed when we added the Henkin axioms to our consistent set of sentences Σ. We needed to prove that the resulting set, Σ′, was still consistent.

(Lemma 3.2.4). If Σ is a consistent set of sentences and Σ′ is created by adding Henkin axioms to Σ, then Σ′ is consistent.


Proof. Suppose that Σ′ is not consistent. Let n be the smallest number of Henkin axioms used in any possible deduction from Σ′ of ⊥. Fix such a set of n Henkin axioms, and let α be one of those Henkin axioms. So we know that

Σ ∪ H ∪ {α} ⊢ ⊥,

where H is the collection of the other n − 1 Henkin axioms needed in the proof. Now α is of the form ∃xφ → φ(c), where c is a Henkin constant and φ(c) is our shorthand for φ^x_c.

By the Deduction Theorem (Theorem 2.7.4), as α is a sentence, this means that Σ ∪ H ⊢ ¬α, so

Σ ∪ H ⊢ ∃xφ and Σ ∪ H ⊢ ¬φ^x_c.

Since ∃xφ is the same as ¬∀x¬φ, from the first of these facts we know that

Σ ∪ H ⊢ ¬∀x¬φ. (3.1)

We also know that Σ ∪ H ⊢ ¬φ^x_c. If we take a deduction of ¬φ^x_c and replace each occurrence of the constant c by a new variable z, the result is still a deduction (as in the proof of Lemma 3.2.3 above), so Σ ∪ H ⊢ ¬φ^x_z. By Lemma 2.7.2, we know that

Σ ∪ H ⊢ ∀z¬φ^x_z.

Our quantifier axiom (Q1) states that as long as x is substitutable for z in ¬φ^x_z (which it is, as z is a new variable), then we may assert that (∀z¬φ^x_z) → ¬(φ^x_z)^z_x. Therefore

Σ ∪ H ⊢ ¬(φ^x_z)^z_x.

But (φ^x_z)^z_x = φ, so Σ ∪ H ⊢ ¬φ. But now we can use Lemma 2.7.2 again to conclude that

Σ ∪ H ⊢ ∀x¬φ. (3.2)

So, by Equations (3.1) and (3.2), we see that Σ ∪ H ⊢ ⊥. This is a contradiction, as Σ ∪ H contains only n − 1 Henkin axioms. Thus we are led to conclude that Σ′ is consistent. ∎


3.2.1 Exercises

1. Suppose that Σ is inconsistent and φ is an ℒ-formula. Prove that Σ ⊢ φ.

2. Assume that Σ₀ ⊆ Σ₁ ⊆ Σ₂ ⊆ ⋯ are such that each Σᵢ is a consistent set of sentences in a language ℒ. Show ⋃Σᵢ is consistent.

3. Show that if Π is any consistent set of sentences and σ is a sentence such that Π ∪ {σ} is inconsistent, then Π ∪ {¬σ} is consistent. Conclude that in the proof of the Completeness Theorem, if Σₖ is consistent, then Σₖ₊₁ is consistent.

4. Prove that the Σ′ constructed in the proof of the Completeness Theorem is consistent. [Suggestion: Deductions are finite in length.]

5. Prove Lemma 3.2.5.

6. Toward a proof of the Completeness Theorem in a more general setting:

(a) Do not assume that the language ℒ is countable. Suppose that you have been given a set of sentences Σmax that is maximal and consistent. So for each sentence σ, either σ ∈ Σmax or ¬σ ∈ Σmax. Mimic the proof of Proposition 3.2.6 to convince yourself that we can construct a model 𝔄 of Σmax.

(b) Zorn's Lemma implies the following: If we are given a consistent set of ℒ-sentences Σ, then the collection of consistent extensions of Σ has a maximal (with respect to ⊆) element Σmax. If you are familiar with Zorn's Lemma, prove this fact.

(c) Use parts (a) and (b) of this problem to outline a proof of the Completeness Theorem in the case where the language ℒ is not countable.

7. Complete the proof of the claim on page 94 that the relation ∼ is an equivalence relation.

8. Show that the relation R^𝔄 of the structure 𝔄 is well defined. So let R be a relation symbol (a unary relation symbol is fine), and show that if [t₁] = [t₂], then R^𝔄([t₁]) is true if and only if R^𝔄([t₂]) is true.


9. Finish the inductive clause of the proof of Proposition 3.2.6.

10. Fill in the details of the proof of Lemma 3.2.7.

3.3 Compactness

The Completeness Theorem finishes our link between deducibility and logical implication. The Compactness Theorem is our first use of that link. In some sense, what the Compactness Theorem does is focus our attention on the finiteness of deductions, and then we can begin to use that finiteness to our advantage.

Theorem 3.3.1 (Compactness Theorem). Let Σ be any set of axioms. There is a model of Σ if and only if every finite subset Σ₀ of Σ has a model.

We say that Σ is satisfiable if there is a model of Σ, and we say that Σ is finitely satisfiable if every finite subset of Σ has a model. So the Compactness Theorem says that Σ is satisfiable if and only if Σ is finitely satisfiable.

Proof. For the easy direction, suppose that Σ has a model 𝔄. Then 𝔄 is also a model of every finite Σ₀ ⊆ Σ.

For the more difficult direction, assume there is no model of Σ. Then Σ ⊨ ⊥. By the Completeness Theorem, Σ ⊢ ⊥, so there is a deduction D of ⊥ from Σ. Since D is a deduction, it is finite in length and thus can only contain finitely many of the axioms of Σ. Let Σ₀ be the finite set of axioms from Σ that are used in D. Then D is a deduction from Σ₀, so Σ₀ ⊢ ⊥. But then by the Soundness Theorem, Σ₀ ⊨ ⊥, so Σ₀ cannot have a model. ∎

Corollary 3.3.2. Let Σ be a set of ℒ-formulas and let θ be an ℒ-formula. Then Σ ⊨ θ if and only if there is a finite Σ₀ ⊆ Σ such that Σ₀ ⊨ θ.

Proof.

Σ ⊨ θ iff Σ ⊢ θ          (Soundness and Completeness)
      iff Σ₀ ⊢ θ for a finite Σ₀ ⊆ Σ   (deductions are finite)
      iff Σ₀ ⊨ θ          (Soundness and Completeness)   ∎


Now we are in a position where we can use the Compactness Theorem to get a better understanding of the limitations of first-order logic—or, to put a more positive spin on it, a better understanding of the richness of mathematics!

Example 3.3.3. Suppose that we examine the structure 𝔑, whose universe is the set of natural numbers ℕ, endowed with the familiar arithmetic functions of addition, multiplication, and exponentiation and the usual binary relation less than. It would be nice to have a collection of axioms that would characterize the structure 𝔑. By this I mean a set of sentences Σ such that 𝔑 ⊨ Σ, and if 𝔄 is any ℒ_NT-structure such that 𝔄 ⊨ Σ, then 𝔄 is "just like" 𝔑. (𝔄 is "just like" 𝔑 if 𝔄 and 𝔑 are isomorphic—see Exercise 5 in Section 1.6.1.)

Unfortunately, we cannot hope to have such a set of sentences, and the Compactness Theorem shows us why. Suppose we took any set of sentences Σ that seemed like it ought to characterize 𝔑. Let us add some sentences to Σ and create a new collection of sentences Θ in an extended language ℒ′ = ℒ_NT ∪ {c}, where c is a new constant symbol:

Θ = Σ ∪ {0 < c, S0 < c, SS0 < c, …, SSS⋯S0 < c (n S's), …}.

Now notice that Θ is finitely satisfiable: If Θ₀ is a finite subset of Θ, then Θ₀ is a subset of

Θₙ = Σ ∪ {0 < c, S0 < c, SS0 < c, …, SSS⋯S0 < c (n S's)}

for some natural number n. But Θₙ has a model 𝔑ₙ, whose universe is ℕ, the functions and relations are interpreted in the usual way, and c^𝔑ₙ = n + 1. So every finite subset of Θ has a model, and thus Θ has a model 𝔄′. Now forget the interpretation of the constant symbol c and you are left with an ℒ_NT-structure 𝔄 = 𝔄′↾ℒ_NT. This model 𝔄 is interesting, but we cannot claim that 𝔄 is "just like" 𝔑, since 𝔄 has an element (the thing that used to be called c^𝔄′) such that there are infinitely many elements x that stand in the relation < with that element, while there is no such element of 𝔑. The element c^𝔄′ is called a nonstandard element of the universe, and 𝔄 is another example of a nonstandard model of arithmetic, a model of arithmetic that is not isomorphic to 𝔑. We first encountered nonstandard models of arithmetic in Exercise 7 of Section 2.8.1.
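The finite-satisfiability step in this example is easy to check mechanically: interpreting c as n + 1 makes each of the finitely many new axioms S⋯S0 < c (with k S's, k ≤ n) come out true. A minimal sketch, with function names of our own choosing:

```python
# Sketch of why N_n models Theta_n's new axioms: the numeral with k S's
# denotes the natural number k, and every k <= n is less than c = n + 1.
def numeral_value(k):
    # The numeral S^k 0 denotes the natural number k in the standard model.
    return k

def satisfies_new_axioms(n, c_interp):
    """Check the axioms 'S^k 0 < c' for k = 0, ..., n under c -> c_interp."""
    return all(numeral_value(k) < c_interp for k in range(n + 1))
```

For example, `satisfies_new_axioms(10, 11)` is `True`, while `satisfies_new_axioms(10, 10)` fails at the last axiom; no single interpretation of c chosen from ℕ works for all n at once, which is exactly why the full Θ forces a nonstandard element.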

So no set of first-order sentences can completely characterize the natural numbers.

Chaff: Isn't this neat! Notice how each of the 𝔑ₙ's in the last example was a perfectly ordinary model that looked just like the natural numbers, but the thing that we got at the end looked entirely different!

Definition 3.3.4. If 𝔄 is an ℒ-structure, we define the theory of 𝔄 to be Th(𝔄) = {φ | φ is an ℒ-formula and 𝔄 ⊨ φ}. If 𝔄 and 𝔅 are ℒ-structures such that Th(𝔄) = Th(𝔅), then we say that 𝔄 and 𝔅 are elementarily equivalent, and write 𝔄 ≡ 𝔅.

If 𝔄 ≡ 𝔑, we say that 𝔄 is a model of arithmetic.

Example (continued). Notice that the weird structure 𝔄 that we constructed above can be a model of arithmetic if we just let the Σ of our construction be Th(𝔑). Exercise 2 asks you to prove that in this case we have 𝔄 ≡ 𝔑. Since 𝔄 certainly is not anything like the usual model of arithmetic on the natural numbers, calling 𝔄 a nonstandard model of arithmetic makes pretty good sense. The difficult thing to see is that although the universe A certainly contains nonstandard elements, they don't get in the way of elementary equivalence. The reason for this is that the language ℒ_NT can't refer to any nonstandard element explicitly, so we can't express a statement that is (for example) true in 𝔑 but false in 𝔄. So the lesson to be learned is that it is much easier for two structures to be elementarily equivalent than it is for them to be isomorphic: Our structure 𝔄 is not isomorphic to 𝔑, but 𝔄 is elementarily equivalent to 𝔑.

Example 3.3.5. Remember those ε's and δ's from calculus? They were introduced in the nineteenth century in an attempt to firm up the foundations of the subject. When they were developing the calculus, Newton and Leibniz did not worry about limits. They happily used quantities that were infinitely small but not quite zero and they ignored the logical difficulties this presented. These infinitely small quantities live on in today's calculus textbooks as the differentials dx and dy.

Most people find thinking about differentials much easier than fighting through limit computations, and in 1961 Abraham Robinson developed a logical framework for calculus that allowed the use


of these infinitesimals in a coherent, noncontradictory way. Robinson's version of the calculus came to be known as nonstandard analysis. Here is a rough introduction (for a complete treatment, see [Keisler 76]).

Taking as our starting point the real numbers that you know so well, we construct a language ℒ_R, the language of the real numbers. For each real number r, the language includes a constant symbol ṙ. So the language ℒ_R includes constant symbols such as 6̇ and π̇. For each function f : ℝⁿ → ℝ, we toss in a function symbol ḟ, and for each n-ary relation R on the reals we add an n-ary relation symbol Ṙ. So our language includes, for example, the function symbols +̇ and ċos and the relation symbol <̇.

Now we define ℜ to be the ℒ_R-structure (ℝ, {r}, {f}, {R}), where each symbol is interpreted as meaning the number, function, or relation that gave rise to the symbol. So the function symbol +̇ stands for the function addition, and the constant symbol π̇ refers to the real number that is equal to the ratio of the circumference of a circle to its diameter.

Given this structure ℜ (notice that ℜ is not anything fancy—it is just the real numbers you have been working with since high school), it generates the set of formulas Th(ℜ), the collection of first-order ℒ_R-formulas that are true statements about the real numbers. Now it is time to use compactness.

Let ℒ′ = ℒ_R ∪ {c}, where c is a new constant symbol, and look at the collection of ℒ′-sentences

Θ = Th(ℜ) ∪ {0̇ < c} ∪ {c < ṙ | r ∈ ℝ, r > 0}.

(Are you clear about the difference between the dotted and the undotted symbols in this definition?)

By the Compactness Theorem, Θ has a model, 𝔄, and in the model 𝔄, the element denoted by c plays the role of an infinitesimal element: It is positive, yet it is smaller than every positive real number. Speaking roughly, in the universe A of the structure 𝔄 there are three kinds of elements. There are pure standard elements, which constitute a copy of ℝ that lives inside A. Then there are pure nonstandard elements, for example the element denoted by c. Finally, there are elements such as the object denoted by 17̇ + c, which has a standard part and a nonstandard part. (For more of the details, see Exercise 11 in Section 3.4.1.)


The nonstandard elements of the structure 𝔄 provide a method for developing derivatives without using limits. For example, we can define the derivative of a function f at a standard element a to be

f′(a) = the standard part of [f(a + c) − f(a)]/c.

As you can see, there is no limit in the definition. We have traded the limits of calculus for the nonstandard elements of 𝔄, and the slope of a tangent line is nothing more than the slope of a line connecting two points, one of which is not standard. Nonstandard analysis has been an area of active study for the past forty years, and although it is not exactly mainstream, it has been used to discover some new results in various areas of classical analysis.

Example 3.3.6. The idea of coloring a map is supposed to be intuitive. When you were in geography class as a child, you were doubtless given a map of a region and asked to color in the various countries, or states, or provinces. And you were missing the point if you used the same color to shade two countries that shared a common border, although it was permitted to use the same color for two countries whose borders met at a single point. (The states of Utah, Colorado, Arizona, and New Mexico do this in the United States, so coloring both Colorado and Arizona with the color red would be permitted.) The question of how many colors are needed to color any map drawn on the plane was first posed in 1852 by Francis Guthrie, and the answer, that four colors suffice for any such map (as long as each political division consists of a single region—Michigan in a map of the United States or pre-1971 Pakistan in a map of Asia would not be permitted), was proven in 1976 by Kenneth Appel and Wolfgang Haken. We are not going to prove the Four-Color Theorem here; rather, we extend this result by considering maps with infinitely many regions.

Let R be a set (I'm thinking of the elements of R as being the regions of a map with infinitely many countries) with a symmetric binary relation A (adjacency). Let k be a natural number. We claim that it is possible to assign to each region of R one of k possible colors in such a way that adjacent regions receive different colors if and only if it is possible to so color each finite subset of R.
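The finite half of this equivalence is concretely decidable: for finitely many regions we can simply search all assignments of the k colors. A brute-force sketch (our own toy code and example maps, not part of the proof that follows):

```python
# Brute-force k-colorability check for a finite set of regions with a
# symmetric adjacency relation, given here as a set of pairs.
from itertools import product

def k_colorable(regions, adjacent, k):
    """Return a proper coloring dict if one exists, else None."""
    regions = list(regions)
    for colors in product(range(k), repeat=len(regions)):
        coloring = dict(zip(regions, colors))
        # A coloring is proper when no adjacent pair shares a color.
        if all(coloring[r] != coloring[s] for (r, s) in adjacent):
            return coloring
    return None

# A 4-cycle of regions is 2-colorable; a triangle of regions is not.
square = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")}
triangle = {("a", "b"), ("b", "c"), ("a", "c")}
assert k_colorable("abcd", square, 2) is not None
assert k_colorable("abc", triangle, 2) is None
```

Compactness then does the work that no search can: it lifts the existence of colorings for every finite subset to a single coloring of the whole infinite map.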

We will prove this using the Compactness Theorem. One of the tricks to using compactness is to choose your language wisely. For

3.3. Compactness 107

this example, let the language ℒ consist of a collection of constants {r}_{r∈R}, one for each region, and a collection of unary predicates {Cᵢ}_{1≤i≤k}, one for each color. So the atomic statement Cᵢ(r) will be interpreted as meaning that region r gets colored with color i. We will also need a binary relation symbol A, for adjacency.

Let Σ be the collection of sentences:

Σ = { C₁(r) ∨ C₂(r) ∨ ⋯ ∨ Cₖ(r)          for each r ∈ R,
      ¬[Cᵢ(r) ∧ Cⱼ(r)]                   for each r ∈ R and 1 ≤ i < j ≤ k,
      A(r, r′)                           for r, r′ ∈ R, r adjacent to r′,
      ¬A(r, r′)                          for r, r′ ∈ R, r not adjacent to r′,
      A(r, r′) → ¬(Cᵢ(r) ∧ Cᵢ(r′))       for r, r′ ∈ R, 1 ≤ i ≤ k }.

Chaff: Stop now for a minute and make sure that you understand each of the sentences in Σ. You ought to be able to say, in ordinary English, what each sentence asserts. For example, C₁(r) ∨ C₂(r) ∨ ⋯ ∨ Cₖ(r) says that region r must be given one of the k colors. In other words, we have to color each region on the map. Take the time now to translate each of the other statement types of Σ into English.

But now our claim that an infinite map is k-colorable if and only if each finite subset of the map is k-colorable is clear, as a coloring of (a finite subset of) R corresponds to a model of (a finite subset of) Σ, and the Compactness Theorem says that Σ has a model if and only if every finite subset of Σ has a model.

Notice that no quantifiers are used in this example, so we really only needed compactness for predicate logic, not first-order logic. If you are comfortable with the terms, notice also that the proof works whether there is a countably infinite or an uncountably infinite collection of countries.

If you have really been paying attention, you noticed that we did not use the fact that the maps are drawn on the plane. So if we draw a map on a donut with uncountably many countries, it only takes seven colors to color the map, as it was proven in 1890 by Percy John Heawood that seven colors suffice for finite maps drawn on a donut.

Example 3.3.7. You may well be familiar with mathematical trees, as they are often discussed in courses in discrete mathematics or


introductory computer science courses. For our purposes a tree is a set T partitioned into subsets Tᵢ (i = 0, 1, 2, …), called the levels of the tree, together with a function a such that:

1. T₀ consists of a single element (called the root of the tree).

2. a : (T − T₀) → T such that if t ∈ Tᵢ, i > 0, then a(t) ∈ Tᵢ₋₁.

A path through T consists of a subset P ⊆ T such that P ∩ Tᵢ contains exactly one element for each i and P is closed under a. If t ∈ Tᵢ, the immediate predecessor of t is a(t). And an element t₂ is said to be a predecessor of t₁ if t₂ = a(a(⋯a(t₁))) for some k ≥ 1, with k a's. We can now use the Compactness Theorem to prove

Lemma 3.3.8 (König's Infinity Lemma). Let T be a tree all of whose levels are finite and nonempty. Then there is a path through T.

Proof. Suppose that we are given such a tree T. Let ℒ be the language consisting of one constant symbol t̄ for each element t ∈ T, a unary relation symbol Q, which will be true for elements on the path, and one unary function symbol p, where p(t̄) is intended to be the immediate predecessor of t.

Let Σ be the following set of ℒ-formulas:

p(t̄₁) = t̄₂              for each t₁, t₂ ∈ T such that a(t₁) = t₂
Q(t̄₁) ∨ ⋯ ∨ Q(t̄ₖ)        where Tₙ = {t₁, t₂, …, tₖ} (for each n)
¬(Q(t̄₁) ∧ Q(t̄₂))         for t₁, t₂ ∈ Tₙ, t₁ ≠ t₂
Q(t̄) → Q(p(t̄))           for each t ∈ T − T₀.

We claim that Σ is finitely satisfiable: Let Σ₀ be a finite subset of Σ, and let n be so large that if t̄ is mentioned in Σ₀, then t ∈ T₀ ∪ T₁ ∪ ⋯ ∪ Tₙ. Pick any element t* ∈ Tₙ₊₁, and build an ℒ-structure 𝔄 by letting the universe A be the tree T, letting t̄^𝔄 be t, letting p^𝔄 be the function a, and letting Q^𝔄 be the collection of predecessors of t*. It is easy to check that 𝔄 is a model of Σ₀, and thus by compactness, there is a structure 𝔅 such that 𝔅 is a model of Σ. If we let P = {t ∈ T | t̄^𝔅 ∈ Q^𝔅}, then P is a path through T, and König's Infinity Lemma is proven. ∎
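The finite-satisfiability step has a concrete computational shadow: in a finite chunk of the tree, the predecessors of any node t* in level n + 1 pick out exactly one node in each lower level, which is what makes 𝔄 a model of Σ₀. A toy sketch (our own representation, with levels as lists and the function a as a parent dictionary):

```python
# Toy finite tree: levels[i] lists the elements of T_i, and parent plays
# the role of the function a (the immediate predecessor map).
def predecessors(t, parent):
    """Return the chain a(t), a(a(t)), ..., ending at the root."""
    chain = []
    while t in parent:          # the root has no entry in parent
        t = parent[t]
        chain.append(t)
    return chain

levels = [["r"], ["x", "y"], ["x0", "y0", "y1"]]
parent = {"x": "r", "y": "r", "x0": "x", "y0": "y", "y1": "y"}

chain = predecessors("y1", parent)   # predecessors of t* = y1
assert chain == ["y", "r"]           # exactly one node in each lower level
```

The lemma itself, of course, concerns infinite trees, where no such finite walk suffices; compactness supplies the uniform path that the finite approximations only hint at.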


3.3.1 Exercises

1. A common attempt to try to write a set of axioms that would characterize 𝔑 (see Example 3.3.3) is to let Σ be the collection of all ℒ_NT-formulas that are true in 𝔑, and then to argue that this is an element of Σ:

(∀x)(∃n)(x = SSS⋯S0) (n S's).

Therefore, there can be no nonstandard elements in any model of Σ. Explain why this reasoning fails.

2. Show that if we let Σ = Th(𝔑) in the construction of Example 3.3.3, then the structure 𝔄 that is constructed is elementarily equivalent to the structure 𝔑. Thus 𝔄 is a model of arithmetic.

3. Show that if 𝔄 and 𝔅 are ℒ-structures such that 𝔄 ≅ 𝔅, then 𝔄 ≡ 𝔅.

4. Suppose that Σ is a set of ℒ-sentences such that at least one sentence from Σ is true in each ℒ-structure. Show that the disjunction of some finitely many sentences from Σ is logically valid.

5. Show that every nonstandard model of arithmetic contains an infinite prime number, that is, an infinite number a such that if a = bc, then either b = 1 or c = 1.

6. Show that if φ(x) is a formula with one free variable in ℒ_NT such that there are infinitely many natural numbers a such that 𝔑 ⊨ φ(x)[s[x|a]], then in every nonstandard model 𝔄 of arithmetic there is an infinite number b such that 𝔄 ⊨ φ(x)[s[x|b]].

7. Verify that we can use the Compactness Theorem in Example 3.3.5 by verifying that every finite subset of Θ has a model.

8. (a) Using only connectives, quantifiers, variables, and the equality symbol, construct a set of sentences Σ such that every model of Σ is infinite.

(b) Prove that if Γ is a set of sentences with arbitrarily large finite models, then Γ has an infinite model.

(c) Show that there can be no set of sentences in first-order logic that characterizes the finite groups. (See Exercise 3 in Section 2.8.1.)


(d) Prove that there is no finite set of sentences {φ₁, φ₂, …, φₙ} such that 𝔄 ⊨ {φ₁, φ₂, …, φₙ} if and only if A is infinite. [Suggestion: Look at ¬(φ₁ ∧ φ₂ ∧ ⋯ ∧ φₙ).]

9. Suppose that Σ₁ and Σ₂ are two sets of sentences such that no structure is a model of both Σ₁ and Σ₂. Show there is a sentence σ such that every model of Σ₁ is also a model of σ and, furthermore, every model of Σ₂ is a model of ¬σ.

10. A binary relation < on a set A is said to be a linear order if

(a) < is irreflexive—(∀a ∈ A)(¬ a < a).
(b) < is transitive—(∀a, b, c ∈ A)([a < b ∧ b < c] → a < c).
(c) < satisfies trichotomy—for all a, b ∈ A exactly one of the following is true: a < b, b < a, or a = b.

If a linear order < has the additional property that there are no infinite descending chains—there do not exist a₁, a₂, … ∈ A such that a₁ > a₂ > a₃ > ⋯ (where a₁ > a₂ means a₂ < a₁)—then the relation < is a well-order of the set A. Suppose that ℒ is a language containing a binary relation symbol <. Show there is no set of ℒ-sentences Σ such that Σ has both of the following properties:

(a) Σ has an infinite model 𝔄 in which <^𝔄 is a linear order of A.

(b) If 𝔅 is any infinite model of Σ, then <^𝔅 is a well-ordering of B.
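The three conditions defining a linear order in Exercise 10 can be checked directly on a finite relation; a small sketch (our own helper function, not part of the exercise):

```python
# Check irreflexivity, transitivity, and trichotomy for a relation `less`
# on a finite set A, by exhaustive enumeration.
from itertools import product

def is_linear_order(A, less):
    A = list(A)
    irreflexive = all(not less(a, a) for a in A)
    transitive = all(not (less(a, b) and less(b, c)) or less(a, c)
                     for a, b, c in product(A, repeat=3))
    # Trichotomy: exactly one of a < b, b < a, a == b holds.
    trichotomy = all(sum([less(a, b), less(b, a), a == b]) == 1
                     for a, b in product(A, repeat=2))
    return irreflexive and transitive and trichotomy

assert is_linear_order(range(4), lambda a, b: a < b)
assert not is_linear_order(range(4), lambda a, b: a <= b)   # fails irreflexivity
```

The well-order property is different in kind: on an infinite set it quantifies over infinite descending chains, and the point of the exercise is that no first-order Σ can pin it down.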

11. Show that < is not a well-order in any nonstandard model of arithmetic.

12. (a) In the structure 𝔄 that was built in Example 3.3.5, explain how we know that

𝔄 ⊨ (∀x)[(x > 0̇) → (x/2 > 0̇ ∧ x > x/2)].

(b) Show that < is a linear order of A, the universe of 𝔄.
(c) Show that < is not a well-order in this structure.

3.4. Substructures and the Löwenheim-Skolem Theorems 111

3.4 Substructures and the Löwenheim-Skolem Theorems

In this section we will discuss a relation between structures. A given set of sentences may have many different models, and it will turn out that in some cases those models are related in surprising ways. We begin by defining the notion of a substructure.

Definition 3.4.1. If 𝔄 and 𝔅 are two ℒ-structures, we will say that 𝔄 is a substructure of 𝔅, and write 𝔄 ⊆ 𝔅, if:

1. A ⊆ B.

2. For every constant symbol c, c^𝔄 = c^𝔅.

3. For every n-ary relation symbol R, R^𝔄 = R^𝔅 ∩ Aⁿ.

4. For every n-ary function symbol f, f^𝔄 = f^𝔅↾Aⁿ. In other words, for every n-ary function symbol f and every ā ∈ Aⁿ, f^𝔄(ā) = f^𝔅(ā). (This is called the restriction of the function f^𝔅 to the set Aⁿ.)

Thus a substructure of 𝔅 is completely determined by its universe, and this universe can be any nonempty subset of B that contains the constants and is closed under every function f^𝔅.

Example 3.4.2. Suppose that we try to build a substructure 𝔄 of the structure 𝔑 = (ℕ, 0, S, +, ·, E, <). Since A must be closed under the functions and contain the constants, the number 0 must be an element of the universe A. But now, since the substructure must be closed under the function S, it is clear that every natural number must be an element of A. Thus 𝔑 has no proper substructures.
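The closure argument in Example 3.4.2 can be run mechanically whenever the functions live on a finite set. A sketch (a hypothetical finite structure on ℤ/12ℤ, chosen so the closure terminates; in 𝔑 itself the same process never stops, which is the point of the example):

```python
# Compute the universe of the substructure generated by a seed set:
# close it under the structure's constants and (unary) functions.
def generated_universe(seed, constants, functions):
    universe = set(seed) | set(constants)
    changed = True
    while changed:                       # iterate until a fixed point
        changed = False
        for f in functions:
            new = {f(a) for a in universe} - universe
            if new:
                universe |= new
                changed = True
    return universe

succ_mod12 = lambda a: (a + 1) % 12      # a toy "successor" on Z/12Z
assert generated_universe({3}, {0}, [succ_mod12]) == set(range(12))
```

Just as in the example, the single successor function forces the generated universe to swallow the whole domain; over ℕ the corresponding closure is all of ℕ, so 𝔑 has no proper substructures.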

Example 3.4.3. Now suppose that we try to find some substructures of the structure 𝔅 = (ℕ, 0, <), with the usual interpretations of 0 and <. Since there are no function symbols, any nonempty subset of ℕ that includes the number 0 can serve as the universe of a substructure 𝔄 ⊆ 𝔅.

Suppose that we let 𝔄 = ({0}, 0, <). Then notice that even though 𝔄 ⊆ 𝔅, there are plenty of sentences that are true in one structure that are not true in the other structure. For example, (∀x)(∃y)x < y is false in 𝔄 and true in 𝔅. It will not be hard for you to find an example of a sentence that is true in 𝔄 and false in 𝔅.


As Example 3.4.3 shows, if we are given two structures such that 𝔄 ⊆ 𝔅, most of the time you would expect that 𝔄 and 𝔅 would be very different, and there would be lots of sentences that would be true in one of the structures that would not be true in the other.

Sometimes, however, truth in the smaller structure is more closely tied to truth in the larger structure.

Definition 3.4.4. Suppose that 𝔄 and 𝔅 are ℒ-structures and 𝔄 ⊆ 𝔅. We say that 𝔄 is an elementary substructure of 𝔅 (equivalently, 𝔅 is an elementary extension of 𝔄), and write 𝔄 ≺ 𝔅, if for every s : Vars → A and for every ℒ-formula φ,

𝔄 ⊨ φ[s] if and only if 𝔅 ⊨ φ[s].

Chaff: Notice that if we want to prove 𝔄 ≺ 𝔅, we need only prove 𝔄 ⊨ φ[s] → 𝔅 ⊨ φ[s], since once we have done that, the other direction comes for free by using the contrapositive and negations.

Proposition 3.4.5. Suppose that 𝔄 ≺ 𝔅. Then a sentence σ is true in 𝔄 if and only if it is true in 𝔅.

Proof. Exercise 5. ∎

Example 3.4.6. We saw earlier that the structure 𝔅 = (ℕ, 0, <) has lots of substructures. However, 𝔅 has no proper elementary substructures. For suppose that 𝔄 ≺ 𝔅. Certainly, 0 ∈ A, as 𝔄 is a substructure. Since the sentence (∃y)[0 < y ∧ (∀x)(0 < x → (y < x ∨ y = x))] is true in 𝔅, it must be true in 𝔄 as well. So

𝔄 ⊨ (∃y)[0 < y ∧ (∀x)(0 < x → (y < x ∨ y = x))].

Thus, for any assignment function s : Vars → A there is some a ∈ A such that

𝔄 ⊨ [0 < y ∧ (∀x)(0 < x → (y < x ∨ y = x))] [s[y|a]].

Fix such an s and such an a ∈ A. Now we use elementarity again. Since 𝔄 ≺ 𝔅 and s[y|a] : Vars → A, we know that

𝔅 ⊨ [0 < y ∧ (∀x)(0 < x → (y < x ∨ y = x))] [s[y|a]].

But in the structure 𝔅, there is a unique element that makes the formula [0 < y ∧ (∀x)(0 < x → (y < x ∨ y = x))] true, namely the number


1. So a must be the number 1, and so 1 must be an element of A. Similarly, you can show that 2 ∈ A, 3 ∈ A, and so on. Thus ℕ ⊆ A, and 𝔄 will not be a proper elementary substructure of 𝔅.

This example shows that when building an elementary substructure of a given structure 𝔅, we need to make sure that witnesses for each existential sentence true in 𝔅 are included in the universe of the elementary substructure 𝔄. That idea will be the core of the proof of the Downward Löwenheim-Skolem Theorem, Theorem 3.4.8. In fact, the next lemma says that making sure that such witnesses are elements of 𝔄 is all that is needed to ensure that 𝔄 is an elementary substructure of 𝔅.

Lemma 3.4.7. Suppose that 𝔄 ⊆ 𝔅 and that for every formula α and every s : Vars → A such that 𝔅 ⊨ ∃xα[s], there is an a ∈ A such that 𝔅 ⊨ α[s[x|a]]. Then 𝔄 ≺ 𝔅.

Proof. We will show, given the assumptions of the lemma, that if φ is any formula and s is any variable assignment function into A, then 𝔄 ⊨ φ[s] if and only if 𝔅 ⊨ φ[s], and thus 𝔄 ≺ 𝔅.

This is an easy proof by induction on the complexity of φ, which we will make even easier by noting that we can replace the ∀ inductive step by an ∃ inductive step, as ∀ can be defined in terms of ∃.

So for the base case, assume that φ is atomic. For example, if φ is R(x, y), then 𝔄 ⊨ φ[s] if and only if (s(x), s(y)) ∈ R^𝔄. But R^𝔄 = R^𝔅 ∩ A², so (s(x), s(y)) ∈ R^𝔄 if and only if (s(x), s(y)) ∈ R^𝔅. But (s(x), s(y)) ∈ R^𝔅 if and only if 𝔅 ⊨ φ[s], as needed.

For the inductive clauses, assume first that φ is ¬α. Then

𝔄 ⊨ φ[s] if and only if 𝔄 ⊨ ¬α[s]
         if and only if 𝔄 ⊭ α[s]
         if and only if 𝔅 ⊭ α[s]   (inductive hypothesis)
         if and only if 𝔅 ⊨ ¬α[s]
         if and only if 𝔅 ⊨ φ[s].

The second inductive clause, if φ is α ∨ β, is similar. For the last inductive clause, suppose that φ is ∃xα. Suppose also

that 𝔄 ⊨ φ[s]; in other words, 𝔄 ⊨ ∃xα[s]. Then, for some a ∈ A, 𝔄 ⊨ α[s[x|a]]. Since s[x|a] is a function mapping variables into A, by our inductive hypothesis, 𝔅 ⊨ α[s[x|a]]. But then 𝔅 ⊨ ∃xα[s], as needed. For the other direction, assume that 𝔅 ⊨ ∃xα[s], where s : Vars → A. We use the assumption of the lemma to find an a ∈ A such that 𝔅 ⊨ α[s[x|a]]. As s[x|a] is a function with codomain A, by the inductive hypothesis 𝔄 ⊨ α[s[x|a]], and thus 𝔄 ⊨ ∃xα[s], and the proof is complete. □

Chaff: We are now going to look at the Löwenheim-Skolem Theorems, which were published in 1915. To understand these theorems, you need to have at least a basic understanding of cardinality, a topic that is outlined in the Appendix. However, if you are in a hurry, it will suffice if you merely remember that there are many different sizes of infinite sets. An infinite set A is countable if there is a bijection between A and the set of natural numbers ℕ; otherwise, the set is uncountable. Examples of countable sets include the integers and the set of rational numbers. The set of real numbers is uncountable, in that there is no bijection between ℝ and ℕ. So there are more reals than natural numbers. There are infinitely many different sizes of infinite sets. The smallest infinite size is countable.
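The countability claims in the Chaff can be made concrete: a set is countable exactly when its elements can be arranged in a single list indexed by the natural numbers. Here is a small computational sketch of my own (not from the text), listing the integers and, by diagonals, the positive rationals:

```python
# A set is countable when its elements can be listed in one N-indexed
# sequence. Below: a listing of the integers, and a diagonal listing of
# the positive rationals (duplicates such as 2/2 = 1/1 are skipped).

from fractions import Fraction

def integer(n):
    """The n-th integer in the listing 0, 1, -1, 2, -2, 3, -3, ..."""
    return (n + 1) // 2 if n % 2 else -(n // 2)

def rationals(count):
    """List the first `count` positive rationals by diagonals p + q = d."""
    seen, out = set(), []
    d = 2
    while len(out) < count:
        for p in range(1, d):
            r = Fraction(p, d - p)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        d += 1
    return out

print([integer(n) for n in range(7)])        # [0, 1, -1, 2, -2, 3, -3]
print([str(r) for r in rationals(6)])        # ['1', '1/2', '2', '1/3', '3', '1/4']
```

No such listing can exist for the reals; that is exactly Cantor's diagonal argument.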

Theorem 3.4.8 (Downward Löwenheim-Skolem Theorem). Suppose that 𝓛 is a countable language and 𝔅 is an 𝓛-structure. Then 𝔅 has a countable elementary substructure.

Proof. If B is finite or countably infinite, then 𝔅 is its own countable elementary substructure, so assume that B is uncountable. As the language 𝓛 is countable, there are only countably many 𝓛-formulas, and thus only countably many formulas of the form ∃xα.

Let A₀ be any nonempty countable subset of B. We show how to build A₁ such that A₀ ⊆ A₁ and A₁ is countable. The idea is to add to A₀ witnesses for the truth (in 𝔅) of existential statements.

Notice that as A₀ is countable, there are only countably many functions s : Vars → A₀ that are eventually constant. (This is a nice exercise for those of you who have had a course in set theory or are reasonably comfortable with cardinality arguments.) Also, if we are given any φ and any s : Vars → A₀, we can find an eventually constant s' : Vars → A₀ such that s and s' agree on the free variables of φ, and thus 𝔅 ⊨ φ[s] if and only if 𝔅 ⊨ φ[s'].

The construction of A₁: For each formula of the form ∃xα and each s : Vars → A₀ such that 𝔅 ⊨ ∃xα[s], find an eventually constant s' : Vars → A₀ such that s and s' agree on the free variables of ∃xα. Pick an element a_{α,s'} ∈ B such that 𝔅 ⊨ α[s'[x|a_{α,s'}]], and let

A₁ = A₀ ∪ { a_{α,s'} : α an 𝓛-formula and s' : Vars → A₀ eventually constant }.

Notice that A₁ is countable, as there are only countably many α's and countably many s'.

Continue this construction, iteratively building A_{n+1} from A_n. Let A = ⋃_{n=0}^∞ A_n. As A is a countable union of countable sets, A is countable.

Now we have constructed a potential universe A for a substructure of 𝔅. We have to prove that A is closed under the functions of 𝔅 (by the remarks following Definition 3.4.1 this shows that 𝔄 is a substructure of 𝔅), and we have to show that 𝔄 satisfies the criteria set out in Lemma 3.4.7, so we will know that 𝔄 is an elementary substructure of 𝔅.

First, to show that A is closed under the functions of 𝔅, suppose that a ∈ A and f is a unary function symbol (the general case is identical) and that b = f^𝔅(a). We must show that b ∈ A. Fix an n so large that a ∈ A_n, let φ be the formula (∃y)y = f(x), and let s be any assignment function into A such that s(x) = a. We know that 𝔅 ⊨ (∃y)y = f(x)[s], and we know that if 𝔅 ⊨ (y = f(x))[s[y|d]], then d = b. So, in our construction of A_{n+1} we must have chosen the witness for this φ and s to be b, so b ∈ A_{n+1}, and b ∈ A, as needed.

In order to use Lemma 3.4.7, we must show that if α is a formula and s : Vars → A is such that 𝔅 ⊨ ∃xα[s], then there is an a ∈ A such that 𝔅 ⊨ α[s[x|a]]. So, fix such an α and such an s. Find an eventually constant s' : Vars → A such that s and s' agree on all the free variables of α. Thus 𝔅 ⊨ ∃xα[s'], and all of the values of s' are elements of some fixed A_n, as s' takes on only a finite number of values. But then by construction of A_{n+1}, there is an element a of A_{n+1} such that 𝔅 ⊨ α[s'[x|a]]. But this tells us (since s and s' agree on the free variables of α) that 𝔅 ⊨ α[s[x|a]], as needed.

So we have met the hypotheses of Lemma 3.4.7, and thus 𝔄 is a countable elementary substructure of 𝔅, as needed. □

Chaff: I would like to look at a bit of this proof a little more closely. In the construction of A₁, what we did was to find an a_{α,s'} for each formula ∃xα and each s : Vars → A₀, and the point was that a_{α,s'} would be a witness to the truth in 𝔅 of the existential statement ∃xα. So we have constructed a function which, given an existential formula ∃xα and an assignment function, finds a value for x that makes the formula α true. A function of this sort is called a Skolem function, and the construction of A in the proof of the Downward Löwenheim-Skolem Theorem can thus be summarized: Let A₀ be a countable subset of B, and form the closure of A₀ under the set of all Skolem functions. Then show that this closure is an elementary substructure of 𝔅.
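The iterative closure A₀ ⊆ A₁ ⊆ A₂ ⊆ ⋯ can be sketched computationally. The code below is an illustration of mine, not the book's: genuine Skolem functions are indexed by formulas and assignment functions, but for formulas of the form (∃x)x = y·z (the situation of Exercise 7 below) the witness is simply the product y·z, so an ordinary function serves as a stand-in.

```python
# A sketch of the closure construction in the proof of Theorem 3.4.8:
# build A0 ⊆ A1 ⊆ ... by applying each (stand-in) Skolem function to all
# argument tuples drawn from the current stage.

from itertools import product

def close_under(seed, functions, rounds=3):
    """Apply every (f, arity) pair to all tuples from the current stage,
    `rounds` many times, collecting the witnesses produced."""
    stage = set(seed)
    for _ in range(rounds):
        new = set(stage)
        for f, arity in functions:
            for args in product(stage, repeat=arity):
                new.add(f(*args))
        stage = new
    return stage

# The Skolem function for (∃x) x = y·z picks the witness x = y·z.
skolem_mult = (lambda y, z: y * z, 2)

A1 = close_under({2, 3}, [skolem_mult], rounds=1)
print(sorted(A1))   # {2, 3} plus all products: [2, 3, 4, 6, 9]
```

Running more rounds approximates the full closure A = ⋃ A_n; in the proof the union over all n is what guarantees the hypotheses of Lemma 3.4.7.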

Example 3.4.9. We saw an indication in Exercise 4 in Section 2.8.1 that the axioms of Zermelo-Fraenkel set theory (known as ZF) can be formalized in first-order logic. Accepting that as true (which it is), we know that if the axioms are consistent, they have a model, and then by the Downward Löwenheim-Skolem Theorem, there must be a countable model of set theory. But this is interesting, as the following are all theorems of ZF:

• There is a countably infinite set.

• If a set a exists, then the collection of subsets of a exists.

• If a is countably infinite, then the collection of subsets of a is uncountable. (This is Cantor's Theorem.)

Now, let us suppose that 𝔄 is our countable model of ZF, and suppose that a is an element of A and is countably infinite. If b is the set of all of the subsets of a, we know that b is uncountable (by Cantor's Theorem) and yet b must be countable, as all of the elements of b are in the model 𝔄, and 𝔄 is countable! So b must be both countable and uncountable! This is called (somewhat incorrectly) Skolem's paradox, and Exercise 8 asks you to figure out the solution to the paradox.

Probably the way to think about the Downward Löwenheim-Skolem Theorem is that it guarantees that if there are any infinite models of a given set of formulas, then there is a small (countably infinite means small) model of that set of formulas. It seems reasonable to ask if there is a similar guarantee about big models, and there is.


Proposition 3.4.10. Suppose that Σ is a set of 𝓛-formulas with an infinite model. If κ is an infinite cardinal, then there is a model of Σ of cardinality greater than or equal to κ.

Proof. This is an easy application of the Compactness Theorem. Expand 𝓛 to include κ new constant symbols c_i, and let T = Σ ∪ {c_i ≠ c_j | i ≠ j}. Then T is finitely satisfiable, as we can take our given infinite model of Σ and interpret the c_i in that model in such a way that c_i ≠ c_j for the finitely many constant symbols mentioned. By the Compactness Theorem, there is a structure 𝔄 that is a model of T, and thus certainly the cardinality of A is greater than or equal to κ. If we restrict 𝔄 to the original language, we get a model of Σ of the required cardinality. □

Corollary 3.4.11. If Σ is a set of formulas from a countable language with an infinite model, and if κ is an infinite cardinal, then there is a model of Σ of cardinality κ.

Proof. First, use Proposition 3.4.10 to get 𝔅, a model of Σ of cardinality greater than or equal to κ. Then, mimic the proof of the Downward Löwenheim-Skolem Theorem, starting with a set A₀ ⊆ B of cardinality exactly κ. The A that is constructed in that proof will also have cardinality κ, and as 𝔄 ≼ 𝔅, 𝔄 will be a model of Σ of cardinality κ. □

Corollary 3.4.12. If 𝔄 is an infinite 𝓛-structure, then there is no set of first-order formulas that characterizes 𝔄 up to isomorphism.

Proof. More precisely, the corollary says that there is no set of formulas Σ such that 𝔅 ⊨ Σ if and only if 𝔄 ≅ 𝔅. Suppose such a Σ existed. Since 𝔄 is an infinite model of Σ, we know by Corollary 3.4.11 that there are models of Σ of all infinite cardinalities, and we know that there are no bijections between sets of different cardinalities. So there must be many models of Σ that are not isomorphic to 𝔄. □

Chaff: There are sets of axioms that do characterize infinite structures. For example, the second-order axioms of Peano Arithmetic include axioms to ensure that addition and multiplication behave normally, and they also include the principle of mathematical induction: If M is a set of numbers, if 0 ∈ M, and if S(n) ∈ M for every n such that n ∈ M, then (∀n)(n ∈ M). Any model of these second-order axioms is isomorphic to the natural numbers, but notice that we used two notions (sets of numbers and the elementhood relation) that are not part of our description of 𝔑. By introducing sets of numbers we have left the world of first-order logic and have entered second-order logic, and it is only by using second-order logic that we are able to characterize 𝔑. For a nice discussion of this topic, see [Bell and Machover 77, Chapter 7, Section 2].

The results from Proposition 3.4.10 to Corollary 3.4.12 give us models that are large, but they have a slightly different flavor from the Downward Löwenheim-Skolem Theorem, in that they do not guarantee that the small model is an elementary substructure of the large model. That is the content of the Upward Löwenheim-Skolem Theorem, a proof of which is outlined in the Exercises.

Theorem 3.4.13 (Upward Löwenheim-Skolem Theorem). If 𝓛 is a countable language, 𝔄 is an infinite 𝓛-structure, and κ is a cardinal, then 𝔄 has an elementary extension 𝔅 such that the cardinality of B is greater than or equal to κ.

3.4.1 Exercises

1. Suppose that 𝔄 ⊆ 𝔅, that φ is of the form (∀x)ψ, where ψ is quantifier-free, and that 𝔅 ⊨ φ. Prove that 𝔄 ⊨ φ. The short version of this fact is, "Universal sentences are preserved downward." Formulate and prove the corresponding fact for existential sentences.

2. Justify the Chaff following Definition 3.4.4.

3. Show that if 𝔄 ≼ 𝔅 and ℭ ≼ 𝔅 and 𝔄 ⊆ ℭ, then 𝔄 ≼ ℭ.

4. Suppose that we have an elementary chain, a set of 𝓛-structures such that

𝔄₁ ≼ 𝔄₂ ≼ ⋯,

and let 𝔄 = ⋃_{i=1}^∞ 𝔄ᵢ. So the universe A of 𝔄 is the union of the universes Aᵢ, R^𝔄 = ⋃_{i=1}^∞ R^{𝔄ᵢ}, etc. Show that 𝔄ᵢ ≼ 𝔄 for each i. [Suggestion: To show that 𝔄ᵢ ⊆ 𝔄 is pretty easy by the definition. To get that 𝔄 is an elementary extension, you have to use induction on formulas. Notice by the comments following Definition 3.4.4 that you need only prove one direction. You may find it easier to use ∃ rather than ∀ in the quantifier part of the inductive step of the proof.]

5. Prove Proposition 3.4.5.

6. Show that if 𝔄 ≼ 𝔅 and if there is an element b ∈ B and a formula φ(x) such that 𝔅 ⊨ φ[s[x|b]] and for every other b' ∈ B, 𝔅 ⊭ φ[s[x|b']], then b ∈ A. [Suggestion: This is very similar to Example 3.4.6.]

7. Suppose that 𝔅 = (ℕ, +, ·), and let A₀ = {2, 3}. Let F be the set of Skolem functions {f_{α,s}} corresponding to α's of the form (∃x)x = y·z. Find the closure of A₀ under F. [Suggestion: Do not forget that the assignment functions s that you need to consider are functions mapping into A₀ at first, then A₁, and so on. You probably want to explicitly write out A₁, then A₂, etc. I am using the notation here corresponding to the proof of Theorem 3.4.8.]

8. To say that a set a is countable means that there is a function with domain the natural numbers and codomain a that is a bijection. Notice that this is an existential statement, saying that a certain kind of function exists. Now, think about Example 3.4.9 and see if you can figure out why it is not really a contradiction that the set b is both countable and uncountable. In particular, think about what it means for an existential statement to be true in a structure 𝔄, as opposed to true in the real world (whatever that means!).

9. (Toward the Proof of the Upward Löwenheim-Skolem Theorem) If 𝔄 is an 𝓛-structure, let 𝓛(A) = 𝓛 ∪ {ā | a ∈ A}, where each ā is a new constant symbol. Then, let 𝔄̄ be the 𝓛(A)-structure having the same universe as 𝔄 and the same interpretation of the symbols of 𝓛 as 𝔄, and interpreting each ā as a. Then we define the complete diagram of 𝔄 as

Th(𝔄̄) = {σ | σ is an 𝓛(A)-formula such that 𝔄̄ ⊨ σ}.

Show that if 𝔅 is any model of Th(𝔄̄), and if 𝔅' = 𝔅 ↾ 𝓛 is the restriction of 𝔅 to the original language, then 𝔄 is isomorphic to an elementary substructure of 𝔅'. [Suggestion: Let h : A → B be given by h(a) = ā^𝔅. Let C be the range of h. Show C is closed under f^𝔅' for every f in 𝓛, and thus C is the universe of ℭ, a substructure of 𝔅'. Then show h is an isomorphism between 𝔄 and ℭ. Finally, show that ℭ ≼ 𝔅'.]

10. Use Exercise 9 to prove the Upward Löwenheim-Skolem Theorem by finding a model 𝔅 of the complete diagram of the given model 𝔄 such that the cardinality of B is greater than or equal to κ.

11. We can now fill in some of the details of our discussion of nonstandard analysis from Example 3.3.5. As the language 𝓛_R of that example already includes constant symbols for each real number, the complete diagram of ℜ is nothing more than Th(ℜ). Explain how Exercise 9 shows that there is an isomorphic copy of the real line living inside the structure 𝔄.

3.5 Summing Up, Looking Ahead

We have proven a couple of difficult theorems in this chapter, and by understanding the proof of the Completeness Theorem you have grasped an intricate argument with a wonderful idea at its core. Our results have been directed at structures: What kinds of structures exist? How can we (or can't we) characterize them? How large can they be?

The next chapter begins our discussion of Kurt Gödel's famous incompleteness theorems. Rather than discussing the strength of our deductive system as we have done in the last two chapters, we will now discuss the strength of sets of axioms. In particular, we will look at the question of how complicated a set of axioms must be in order to prove all of the true statements about the standard structure 𝔑.

In Chapter 4 we will introduce the idea of coding up the statements of 𝓛_NT as terms and will show that a certain set of nonlogical axioms is strong enough to prove some basic facts about the numbers coding up those statements. Then, in Chapter 5, we will bring those facts together to show that the expressive power we have gained has allowed us to express truths that are unprovable from our set of axioms.

Chapter 4

Incompleteness—Groundwork


4.1 Introduction

Now, I hope you have been paying close enough attention to be bothered by the title of this chapter. The preceding chapter was about completeness, and we proved the Completeness Theorem. Now we seem to be launching an investigation of incompleteness! This point is pretty confusing, so let us try to start out as clearly as possible.

In Chapter 3 we proved the completeness of our axiomatic system. We have shown that the deductive system described in Chapter 2 is sound and complete. What does this mean? For the collection of logical axioms and rules of inference that we have set out, any formula φ that can be deduced from Σ will be true in all models of Σ under any variable assignment function (that's soundness), and furthermore any formula φ that is true in all models of Σ under every assignment function will be deducible from Σ (that's completeness). Thus, our deductive system is as nice as it can possibly be. The rough version of the Completeness and Soundness Theorems is: We can prove it if and only if it is true everywhere.

Now we will change our focus. Rather than discussing the wonderful qualities of our deductive system, we will concentrate on a particular language, 𝓛_NT, and think about a particular structure: 𝔑, the natural numbers.


Wouldn't life be just great if we knew that we could prove every true statement about the natural numbers? Of course, the statements that we can prove depend on our choice of nonlogical axioms Σ, so let me start this paragraph over.

Wouldn't life be just great if we could find a set of nonlogical axioms that could prove every true statement about the natural numbers? I would love to have a set of axioms Σ such that 𝔑 ⊨ Σ (so our axioms are true statements about the natural numbers) and Σ is rich enough so that for every sentence σ, if 𝔑 ⊨ σ, then Σ ⊢ σ. Since Σ has a model, we know that Σ is consistent, so by soundness my wished-for Σ will prove exactly those sentences that are true in 𝔑. The set of sentences of 𝓛_NT that are true in 𝔑 is called the Theory of 𝔑, or Th(𝔑).

Since we know that a sentence is either true in 𝔑 or false in 𝔑, this set of axioms Σ is complete, in the sense that given any sentence σ, Σ will provide either a deduction of σ or a deduction of ¬σ.

Definition 4.1.1. A set of nonlogical axioms Σ is called complete if for every sentence σ, either Σ ⊢ σ or Σ ⊢ ¬σ.

Chaff: To reiterate, in Chapter 3 we showed that our deductive system is complete. This means that for a given Σ, the deductive system will prove exactly those formulas that are logical consequences of Σ. When we say that a set of axioms is complete, we are saying that the axioms are strong enough to provide either a proof or a refutation of any sentence. This is harder.

Our goal is to find a complete and consistent set of axioms Σ such that 𝔑 ⊨ Σ. So this set Σ would be strong enough to prove every 𝓛_NT-sentence that is true in the standard structure 𝔑. Such a set of axioms is said to axiomatize Th(𝔑).

Definition 4.1.2. A set of axioms Σ is an axiomatization of Th(𝔑) if for every sentence σ ∈ Th(𝔑), Σ ⊢ σ.

Actually, as stated, it is pretty easy to find an axiomatization of Th(𝔑): Just let the axiom set be Th(𝔑) itself. This set clearly axiomatizes itself, so we are finished! Off we go to have a drink. Of course, our answer to the search has the problem that we don't have an easy way to tell exactly which formulas are elements of the set of axioms. If I took a random sentence of 𝓛_NT and asked you whether this sentence were true in the standard structure, I doubt you'd be able to tell me. The truth of nonrandom sentences is also hard to figure out: consider the twin prime conjecture, that there are infinitely many pairs of numbers k and k + 2 such that both k and k + 2 are prime. People have been thinking about that one for over 2000 years, and we don't know whether it is true or not. So we have no idea whether the twin prime conjecture is in Th(𝔑). So it looks like Th(𝔑) is unsatisfactory as a set of nonlogical axioms.

So, to refine our question a bit, what we would like is a set of nonlogical axioms Σ that is simple enough so that we can recognize whether or not a given formula is an axiom (so the set of axioms should be decidable) and strong enough to prove every formula in Th(𝔑). So we search for a complete, consistent, decidable set of axioms for 𝔑. Unfortunately, our search is doomed to failure, and that fact is the content of Gödel's First Incompleteness Theorem, which we shall prove in Chapter 5.

There is a fair bit of groundwork to cover before we get to the Incompleteness Theorem, and much of that groundwork is rather technical. Here is a thumbnail sketch of our plan to reach the theorem: The proof of the First Incompleteness Theorem essentially consists of constructing a certain sentence θ and noticing that θ is, by its very nature, a true statement in 𝔑 and a statement that is unprovable from our axioms. So the groundwork consists of making sure that this yet-to-be-constructed θ exists and does what it is supposed to do. In this chapter we will specify our language and N, a set of nonlogical axioms. The axioms of N will be true sentences in 𝔑. We will show that N, although very weak, is strong enough to prove some crucial results. We will then show that our language is rich enough to express several ideas that will be crucial in the construction of θ.

In Chapter 5 we will prove Gödel's Self-Reference Lemma and use that lemma to construct the sentence θ. We shall then state and prove the First Incompleteness Theorem, that there can be no decidable, consistent, complete set of axioms for 𝔑. We will then finish the book with a discussion of Gödel's Second Incompleteness Theorem, which shows that no reasonably strong set of axioms can ever hope to prove its own consistency.


4.2 The Language, the Structure, and the Axioms of N

We work in the language of number theory

𝓛_NT = {0, S, +, ·, E, <},

and we will continue to work in this language for the next two chapters. 𝔑 is the standard model of the natural numbers,

𝔑 = (ℕ, 0, S, +, ·, E, <),

where the functions and relations are the usual functions and relations that you have known since you were knee-high to a grasshopper. E is exponentiation, which will usually be written x^y rather than Exy or xEy.

We will now establish a set of nonlogical axioms, N. You will notice that the axioms are clearly sentences that are true in the standard structure, and thus if T is any set of axioms such that T ⊢ σ for all σ such that 𝔑 ⊨ σ, then T ⊢ N. So, as we prove that several sorts of formulas are derivable from N, remember that those same formulas are also derivable from any set of axioms that has any hope of providing an axiomatization of the natural numbers.

The axiom system N was introduced in Example 2.8.3 and is reproduced below. These eleven axioms establish some of the basic facts about the successor function, addition, multiplication, exponentiation, and the < ordering on the natural numbers.

Chaff: To be honest, the symbol E and the axioms about exponentiation are not needed here. It is possible to do everything that we do in the next couple of chapters by defining exponentiation in terms of multiplication and introducing E as an abbreviation in the language. This has the advantage of showing a little more explicitly how little you need to prove the incompleteness theorems, but it adds some complications to the exposition. I have decided to introduce exponentiation explicitly and add a couple of axioms, which will hopefully allow us to move a little more cleanly through the proofs of our theorems.


The Axioms of N

1. (∀x)¬Sx = 0.

2. (∀x)(∀y)[Sx = Sy → x = y].

3. (∀x)x + 0 = x.

4. (∀x)(∀y)x + Sy = S(x + y).

5. (∀x)x · 0 = 0.

6. (∀x)(∀y)x · Sy = (x · y) + x.

7. (∀x)xE0 = S0.

8. (∀x)(∀y)xE(Sy) = (xEy) · x.

9. (∀x)¬x < 0.

10. (∀x)(∀y)[x < Sy ↔ (x < y ∨ x = y)].

11. (∀x)(∀y)[(x < y) ∨ (x = y) ∨ (y < x)].
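Since each axiom of N is true in the standard structure, each can be checked mechanically over any finite slice of ℕ. Here is a quick sanity check of my own (not part of the text), reading S as successor and E as exponentiation:

```python
# Verify that the eleven axioms of N hold in the standard structure,
# tested over a small finite range of naturals.

S = lambda x: x + 1          # successor
E = lambda x, y: x ** y      # exponentiation
R = range(8)                 # a finite slice of the naturals

assert all(S(x) != 0 for x in R)                                   # Axiom 1
assert all((S(x) == S(y)) <= (x == y) for x in R for y in R)       # Axiom 2
assert all(x + 0 == x for x in R)                                  # Axiom 3
assert all(x + S(y) == S(x + y) for x in R for y in R)             # Axiom 4
assert all(x * 0 == 0 for x in R)                                  # Axiom 5
assert all(x * S(y) == (x * y) + x for x in R for y in R)          # Axiom 6
assert all(E(x, 0) == S(0) for x in R)                             # Axiom 7
assert all(E(x, S(y)) == E(x, y) * x for x in R for y in R)        # Axiom 8
assert all(not (x < 0) for x in R)                                 # Axiom 9
assert all((x < S(y)) == (x < y or x == y) for x in R for y in R)  # Axiom 10
assert all(x < y or x == y or y < x for x in R for y in R)         # Axiom 11
print("all eleven axioms hold on", list(R))
```

Of course, checking finitely many instances is no substitute for truth in 𝔑, let alone provability; it only illustrates why 𝔑 ⊨ N.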

4.2.1 Exercises

1. You have already seen that N is not strong enough to prove the commutative law of addition (Exercise 8 in Section 2.8.1). Use this to show that N is not complete by showing that

N ⊬ (∀x)(∀y)x + y = y + x

and N ⊬ ¬[(∀x)(∀y)x + y = y + x].

2. Suppose that Σ provides an axiomatization of Th(𝔑). Suppose α is a formula such that N ⊢ α. Show that Σ ⊢ α.

3. Suppose that 𝔄 is a nonstandard model of arithmetic. If Th(𝔄) is the collection of sentences that are true in 𝔄, is Th(𝔄) complete? Does Th(𝔄) provide an axiomatization of 𝔑? Of 𝔄?


4.3 Recursive Sets and Recursive Functions

For the sake of discussion, suppose that we let f(x) = x². It will not surprise you to find out that f(4) = 16, so I would like to write 𝔑 ⊨ f(4) = 16. Unfortunately, we are not allowed to do this, since the symbol f, not to mention 4 and 16, is not part of the language.

What we can do, however, is to represent the function f by a formula in 𝓛_NT. To be specific, suppose that φ(x, y) is

y = ExSS0.

Then, if we allow ourselves once again to use the abbreviation ā for the 𝓛_NT-term SS⋯S0 with a S's, we can assert that

𝔑 ⊨ φ(4̄, 16̄),

which is the same thing as

𝔑 ⊨ SSSSSSSSSSSSSSSS0 = ESSSS0SS0.

(Boy, aren't you glad we don't use the official language very often?) Anyway, the situation is even better than this, for φ(4̄, 16̄) is derivable from N rather than just true in 𝔑. In fact, if you look back at Lemma 2.8.4, you probably won't have any trouble believing the following statements:

• N ⊢ ¬φ(3̄, 16̄)

• N ⊢ ¬φ(1̄, 714̄)

In fact, this formula φ is such, and N is such, that if a is any natural number and b = f(a), then

N ⊢ ∀y[φ(ā, y) ↔ y = b̄].

We will say that the formula φ represents the function f in the theory N.


Definition 4.3.1. A set A ⊆ ℕᵏ is said to be recursive if there is an 𝓛_NT-formula φ(x) such that

for every a ∈ A, N ⊢ φ(ā), and
for every b ∉ A, N ⊢ ¬φ(b̄).

In this case we will say that the formula φ represents the set A.

Chaff: A bit of notation has slipped in here. Rather than writing x₁, x₂, …, x_k over and over and over again during the next few sections, we will abbreviate this as x. Similarly, ā is shorthand for ā₁, ā₂, …, ā_k. If you want, you can just assume that there is only one x; it won't make any difference to the exposition.

We will also spend a fair bit of time talking about recursive functions. Since a function f : ℕᵏ → ℕ can be seen as a subset of ℕᵏ⁺¹, we can just think about the function as a set when proving it to be recursive and use Definition 4.3.1. In practice, we can do a little bit better:

If f : ℕᵏ → ℕ is a recursive function, we can assume that f is represented by a formula φ(x, y) such that for every a ∈ ℕᵏ,

N ⊢ ∀y[φ(ā, y) ↔ y = b̄], where b = f(a).

A proof of this is outlined in Exercise 7.

Chaff:

    What's in a name? that which we call a rose
    By any other name would smell as sweet;
    So Romeo would, were he not Romeo call'd,
    Retain that dear perfection which he owes
    Without that title.

    —Romeo and Juliet, Act II, Scene ii

In computer science courses, and in many mathematical logic texts, a different approach is taken when recursive sets and recursive functions are introduced. Starting with certain initial functions, the idea of recursion, and an object called the μ-operator, a collection of functions is defined such that each function in the collection is effectively calculable. This is called the collection of recursive functions, which leads to something called recursive sets, whereas the sets that we have just christened as recursive are called representable sets. Then these texts prove that the collection of representable sets is the same as the collection of recursive sets. Thus we all end up at the same place, with a nicely defined collection of sets and functions that we call recursive. So if you are confused by the different definitions, just remember that they all define the same concept, and remember that the objects that are recursive (or representable) are (in some sense) the simple ones, the ones where membership can be proved in N.

To be fair, it is not quite as simple as Juliet makes it out to be. (It never is, is it?) The path that we have taken to recursive sets is clean and direct but emphasizes the deductions over the functions. The approach through initial functions stresses the fact that everything that we discuss can be calculated, and that viewpoint gives a natural tie between the logic that we have been discussing and its applications to computer science. For more on this connection, see Section 4.4.
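The μ-operator mentioned in the Chaff can be sketched directly: (μy)[g(x, y) = 0] returns the least y making g zero, and need not terminate when no such y exists. The following illustration is my own, using integer square root as a sample minimization; it is a sketch of the idea, not the formal definition:

```python
# A naive computational rendering of the μ-operator (minimization):
# search y = 0, 1, 2, ... for the first zero of g. This loops forever
# when g has no zero, which is exactly why μ produces partial functions.

def mu(g, *args):
    """(μ y)[g(args, y) = 0]: the least y with g(args, y) = 0."""
    y = 0
    while g(*args, y) != 0:
        y += 1
    return y

# Integer square root via minimization, coding "true" as 0:
# isqrt(n) = (μ y)[ (y+1)^2 > n ]
isqrt = lambda n: mu(lambda n, y: 0 if (y + 1) ** 2 > n else 1, n)

print([isqrt(n) for n in range(11)])   # [0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3]
```

The initial functions, composition, primitive recursion, and μ together generate the effectively calculable functions; the texts mentioned above prove this class coincides with the representable one.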

Definition 4.3.2. We will say that a set A ⊆ ℕᵏ is definable if there is a formula φ(x) such that

for every a ∈ A, 𝔑 ⊨ φ(ā), and
for every b ∉ A, 𝔑 ⊨ ¬φ(b̄).

In this case, we will say that φ defines the set A.

Chaff: It is very important to notice the difference between saying that φ represents A and saying that φ defines A, which is the same as the difference between N ⊢ and 𝔑 ⊨. Notice that any recursive set must be definable and is defined by any formula that represents it. The converse, however, is not automatic. In fact, the converse is not true. But we're getting ahead of ourselves.

We have mentioned several times that the axiom system N is relatively weak. We will show in this section that N is strong enough to prove some of the 𝓛_NT-formulas that are true in 𝔑, namely the class of true Σ-sentences. And this will allow us to show that if a set A has a relatively simple definition, then the set A will be recursive.

Definition 4.3.3. If x is a variable that does not occur in the term t, let us agree to use the following abbreviations:

(∀x < t)φ means ∀x(x < t → φ)

(∀x ≤ t)φ means ∀x((x < t ∨ x = t) → φ)

(∃x < t)φ means ∃x(x < t ∧ φ)

(∃x ≤ t)φ means ∃x((x < t ∨ x = t) ∧ φ).

These abbreviations are called bounded quantifiers.

Definition 4.3.4. The collection of Σ-formulas is the smallest set of 𝓛_NT-formulas that:

1. Contains all atomic formulas.

2. Contains all negations of atomic formulas.

3. Is closed under the connectives ∧ and ∨.

4. Is closed under bounded quantifiers and the quantifier ∃.

The collection of Π-formulas is the smallest set of 𝓛_NT-formulas that:

1. Contains all atomic formulas.

2. Contains all negations of atomic formulas.

3. Is closed under the connectives ∧ and ∨.

4. Is closed under bounded quantifiers and the quantifier ∀.

The collection of Δ-formulas is the intersection of the collection of Σ-formulas with the collection of Π-formulas.

Notice that a Δ-formula is a formula in which all of the quantifiers are bounded. The formula can have ∀x < y and ∃u < v but will not have any unbounded ∀'s or ∃'s.
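Because every quantifier in a Δ-formula is bounded, its truth in 𝔑 can be decided by a finite search. As an illustration (my own example, not one from the text), the Δ-style condition (∃y < x)(∃z < x)[y · z = x ∧ S0 < y ∧ S0 < z] says that x is composite, and brute force settles it:

```python
# Bounded quantifiers make truth in the standard structure decidable:
# each (∃y < x) or (∀y < x) is a finite loop over 0, 1, ..., x-1.

def is_composite(x):
    # (∃y < x)(∃z < x)[ y·z = x  ∧  1 < y  ∧  1 < z ]
    return any(y * z == x and 1 < y and 1 < z
               for y in range(x) for z in range(x))

print([x for x in range(2, 20) if is_composite(x)])
# [4, 6, 8, 9, 10, 12, 14, 15, 16, 18]
```

An unbounded ∃ or ∀, by contrast, would demand a search through all of ℕ, which is exactly why Σ-truths are only guaranteed to be verifiable, not refutable, by this kind of search.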


Example 4.3.5. Here is a perfectly nice example of a Δ-formula φ: (∀x < t)(x = 0). Notice that the denial of φ is not a Δ-formula, as ¬(∀x < t)(x = 0) is neither a Σ- nor a Π-formula. But a chain of logical equivalences shows us that ¬φ is equivalent to a Δ-formula if we just push the negation sign inside the quantifier:

¬(∀x < t)(x = 0) is equivalent to (∃x < t)¬(x = 0).

Similarly, we can show that any propositional combination (using ∧, ∨, →, ↔) of Δ-formulas is equivalent to a Δ-formula. We will use this fact approximately 215,342 times in the remainder of this book.

A Σ-sentence is, of course, a Σ-formula that is also a sentence. We will be particularly interested in Σ-formulas and Δ-formulas, for we will show that if φ is a Σ-sentence and 𝔑 ⊨ φ, then the axiom set N is strong enough to provide a deduction of φ. Since every Δ-sentence is also a Σ-sentence, any Δ-sentence that is true in 𝔑 is also provable from N.

The first lemma that we will prove shows that N is strong enough to prove that 1 + 1 = 2. Actually, we already know this, since it was proved back in Lemma 2.8.4. We now expand that result and show that if t is any variable-free term, then N proves that t is equal to what it is supposed to be equal to.

Recall that if t is a term, then t^𝔑 is the interpretation of that term in the structure 𝔑. For example, suppose that t is the term ESSS0SS0, also known as SSS0^{SS0}. Then t^𝔑 would be the number 9, and the numeral for t^𝔑 would be the term SSSSSSSSS0. So when this lemma says that N proves t = ā, where a = t^𝔑, you should think that N proves ESSS0SS0 = SSSSSSSSS0, which is the same as saying that N ⊢ 3² = 9.

Lemma 4.3.6. For each variable-free term t, N ⊢ t = ā, where a is the natural number t^𝔑.

Proof. We proceed by induction on the complexity of the term t. If t is the term 0, then t^𝔑 is the natural number 0, and its numeral is the term 0. Thus we have to prove that N ⊢ 0 = 0, which is an immediate consequence of our logical axioms.

4.3. Recursive Sets and Recursive Functions 131

If t is Su, where u is a variable-free term, then the term \overline{t^𝔑} is identical to the term S(\overline{u^𝔑}). Also, N ⊢ u = \overline{u^𝔑}, by the inductive hypothesis, and thus N ⊢ Su = S(\overline{u^𝔑}), thanks to the equality axiom (E2). Putting all of this together, we get that N ⊢ t = Su = S(\overline{u^𝔑}) = \overline{t^𝔑}, as needed.

If t is u + v, we recall that Lemma 2.8.4 proved that N ⊢ \overline{u^𝔑} + \overline{v^𝔑} = \overline{u^𝔑 + v^𝔑}. But then N ⊢ t = u + v = \overline{u^𝔑} + \overline{v^𝔑} = \overline{u^𝔑 + v^𝔑} = \overline{t^𝔑}, which is what we needed to show. The arguments for terms of the form u · v or u^v are similar, so the proof is complete. •

The next lemma and its corollary will be used in our proof that true Σ-sentences are provable from N.

Lemma 4.3.7 (Rosser's Lemma). If a is a natural number, then

N ⊢ (∀x < \overline{a})[x = \overline{0} ∨ x = \overline{1} ∨ ··· ∨ x = \overline{a − 1}].

Proof. We use induction on a. If a = 0, it suffices to prove that N ⊢ ∀x[x < 0 → ⊥]. By Axiom 9 of N, we know that N ⊢ ¬(x < 0), so N ⊢ (x < 0) → ⊥, as needed.

For the inductive step, suppose that a = b + 1. We must show that

N ⊢ ∀x[x < \overline{b + 1} → x = \overline{0} ∨ ··· ∨ x = \overline{b}].

Since \overline{b + 1} and S\overline{b} are identical, it suffices to show that N ⊢ ∀x[x < S\overline{b} → x = \overline{0} ∨ ··· ∨ x = \overline{b}].

By Axiom 10, we know that N ⊢ x < S\overline{b} → (x < \overline{b} ∨ x = \overline{b}), and then by the inductive hypothesis, we are finished. •

Corollary 4.3.8. If a is a natural number, then

N ⊢ [(∀x < \overline{a})φ(x)] ↔ [φ(\overline{0}) ∧ φ(\overline{1}) ∧ ··· ∧ φ(\overline{a − 1})].

Proof. Exercise 9. •

Now we come to the major result of this section, that our axiom system is strong enough to prove all true Σ-sentences.

Proposition 4.3.9. If φ(x) is a Σ-formula with free variables x, if t are variable-free terms, and 𝔑 ⊨ φ(t), then N ⊢ φ(t).


Proof. This is a proof by induction on the complexity of the formula φ.

1. If φ is atomic, say for example that φ is x < y and the terms t and u are such that 𝔑 ⊨ t < u. Then t^𝔑 < u^𝔑, so by Lemma 2.8.4, N ⊢ \overline{t^𝔑} < \overline{u^𝔑}. But we also know N ⊢ t = \overline{t^𝔑} and N ⊢ u = \overline{u^𝔑} by Lemma 4.3.6, so N ⊢ t < u, as needed.

2. Negations of atomic formulas are handled in the same manner.

3. If φ is α ∨ β or α ∧ β, the argument is left to the Exercises.

4. Suppose that 𝔑 ⊨ ∃xψ(x), where we assume that ψ has only one free variable for simplicity. Then there is a natural number a such that 𝔑 ⊨ ψ(\overline{a}), and thus N ⊢ ψ(\overline{a}) by the inductive hypothesis. But then our second quantifier axiom tells us, as \overline{a} is substitutable for x in ψ, that N ⊢ ∃xψ, as needed.

5. Now if 𝔑 ⊨ (∀x < u)ψ(x), we know by the inductive hypothesis that

N ⊢ [ψ(\overline{0}) ∧ ψ(\overline{1}) ∧ ··· ∧ ψ(\overline{u^𝔑 − 1})].

But then by Corollary 4.3.8,

N ⊢ (∀x < \overline{u^𝔑})ψ(x).

Thus, since N ⊢ \overline{u^𝔑} = u, N ⊢ (∀x < u)ψ(x), as needed. Thus if 𝔑 ⊨ φ(t), then N ⊢ φ(t). •

We will say that φ is provable (from N) if N ⊢ φ. And we shall say that φ is refutable if N ⊢ ¬φ.

Suppose that φ is a Δ-sentence. If 𝔑 ⊨ φ, since we know that φ is also a Σ-sentence, Proposition 4.3.9 shows that N ⊢ φ. But suppose that φ is false; that is, suppose that 𝔑 ⊭ φ. Then 𝔑 ⊨ ¬φ, and ¬φ is equivalent to a Δ-sentence. Thus by the same argument as above, N ⊢ ¬φ. So we have proved the following:

Proposition 4.3.10. If φ(x) is a Δ-formula with free variables x, if t are variable-free terms, and if 𝔑 ⊨ φ(t), then N ⊢ φ(t). If, on the other hand, 𝔑 ⊨ ¬φ(t), then N ⊢ ¬φ(t).

Corollary 4.3.11. Suppose that A ⊆ ℕ^k is defined by a Δ-formula φ(x). Then A is recursive.


Proof. This is immediate from Proposition 4.3.10 and Definition 4.3.1. •

Example 4.3.12. Suppose that we look at the even numbers. You might want to define this set by the ℒ_NT-formula

φ(x) is: (∃y)(x = y + y).

But we can, in fact, do even better than this. We can define the set of evens by a Δ-formula:

Even(x) is: (∃y < x)(x = y + y).

So now we have a Δ-definition of EVEN, the set of even numbers. (We will try to be consistent and use SMALL CAPITALS when referring to a set of numbers and Italics when referring to the ℒ_NT-formula that defines that set.) So by Corollary 4.3.11, we see that the set of even numbers is a recursive subset of the natural numbers.
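Since every quantifier in Even(x) is bounded, the truth of Even(\overline{a}) can be decided by a finite search, which is exactly what makes Δ-definitions recursive. A quick sketch (the function name is mine):

```python
# Even(x) is (exists y < x)(x = y + y); a bounded quantifier is a finite search.
def even_delta(x: int) -> bool:
    return any(x == y + y for y in range(x))  # y runs over 0, 1, ..., x - 1

print([n for n in range(1, 11) if even_delta(n)])  # [2, 4, 6, 8, 10]
```

One fine point the search makes visible: with the strict bound y < x the witness y = x is never tried, which matters only for x = 0; a bound of y ≤ x would capture 0 as well.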

Over the next few sections we will be doing a lot of this. We will look at a set of numbers and prove that it is recursive by producing a A-definition of the set. In many cases, the tricks that we will use to produce the bounds on the quantifiers will be quite impressive.

Chaff: For the rest of this chapter you will see lots of formulas with boxes around them. The idea is that every time we introduce a Δ-definition of a set of numbers, there will be a box around it to set it off. I have also gathered three tables of some of the useful Δ-definitions on the endpapers of this book for (relatively) easy reference.

Example 4.3.13. Take a minute and write Prime(x), a Δ-definition of PRIME, the set of prime numbers. Once you have done that, here is a definition of the set of prime pairs, the set of pairs of numbers x and y such that both x and y are prime, and y is the next prime after x:

Primepair(x, y) is: Prime(x) ∧ Prime(y) ∧ (x < y) ∧ [(∀z < y)(Prime(z) → z ≤ x)].


Notice that Primepair has two free variables, as PRIMEPAIR ⊆ ℕ², while your formula Prime has exactly one free variable. Also notice that all of the quantifiers in each definition are bounded, so we know the definitions are Δ-definitions.
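Both definitions translate directly into finite searches. The sketch below uses one plausible choice for Prime(x) — yours may well differ — and checks the Primepair property exactly as the bounded quantifier (∀z < y) says to:

```python
def prime(x: int) -> bool:
    # Prime(x): 1 < x and no z with 1 < z < x divides x (all quantifiers bounded)
    return x > 1 and not any(x % z == 0 for z in range(2, x))

def primepair(x: int, y: int) -> bool:
    # Primepair(x, y): x and y are prime, x < y, and every prime z < y is <= x,
    # so y is the next prime after x
    return (prime(x) and prime(y) and x < y
            and all(z <= x for z in range(2, y) if prime(z)))

print(primepair(7, 11), primepair(7, 13))  # True False
```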

Chaff: I also hope that you noticed that in the definition of Primepair I used your formula Prime, and I did not try to insert your entire formula every time I needed it—I just wrote Prime(x) or Prime(y). As you work out the many definitions to follow, it will be essential for you to do the same. Freely use previously defined formulas and plug them in by using their names. To do otherwise is to doom yourself to unending streams of unintelligible symbols. This stuff gets quite dense enough as it is. You do not need to make things any harder than they are.

4.3.1 Exercises

1. Show that the set {17} is recursive by finding a Δ-formula that defines the set. Can you come up with a (probably silly) non-Δ formula that defines the same set?

2. Suppose that A ⊆ ℕ is recursive and represented by the formula φ(x). Suppose also that B ⊆ ℕ is recursive and represented by ψ(x). Show that the following sets are also recursive, and find a formula that represents each:

(a) A ∪ B
(b) A ∩ B
(c) The complement of A, {x ∈ ℕ | x ∉ A}

3. Show that every finite subset of the natural numbers is recursive and that every subset of N whose complement is finite is also recursive.

4. Let p(x) be a polynomial with nonnegative integer coefficients. Show that the set {a ∈ ℕ | p(a) = 0} is recursive.

5. Write a Δ-definition for the set DIVIDES. So you must come up with a formula with two free variables, Divides(x, y), such that 𝔑 ⊨ Divides(\overline{a}, \overline{b}) if and only if a is a factor of b.


6. Show that the set {1,2,4,8,16,...} of powers of 2 is recursive.

7. Suppose that you know that α(x, y) represents the function f. Show that φ(x, y) = α(x, y) ∧ (∀z < y)(¬α(x, z)) also represents f, and furthermore, for each a ∈ ℕ,

N ⊢ (∀y)(φ(\overline{a}, y) ↔ y = \overline{f(a)}).

[Suggestion: After you show that φ represents f, the second part is equivalent to showing N ⊢ φ(\overline{a}, \overline{f(a)}), which is pretty trivial, and then proving that

N ⊢ [(α(\overline{a}, y) ∧ (∀z < y)(¬α(\overline{a}, z))) → y = \overline{f(a)}].

So, take as hypotheses N, α(\overline{a}, y), and (∀z < y)(¬α(\overline{a}, z)) and show that there is a deduction of both ¬[\overline{f(a)} < y] and ¬[y < \overline{f(a)}]. Then the last of the axioms of N will give you what you need. For the details, see [Enderton 72, Theorem 33K].]

8. In the last inductive step of the proof of Lemma 4.3.6, the use of the inductive hypothesis is rather hidden. Please expose the use of the inductive hypothesis and write out that step of the proof more completely. Finish the cases for multiplication and exponentiation.

9. Prove Corollary 4.3.8.

10. Fill in the details of the steps omitted in the inductive proof of Proposition 4.3.9. In the last two cases, how does the argument change if there are more free variables? If, for example, instead of φ being of the form ∃xψ(x), φ is of the form ∃xψ(x, y), does that change the proof?

4.4 Recursive Sets and Computer Programs

In this section we shall investigate the relationship between recursive sets and computer programs. Our discussion will be rather informal and will rely on your intuition about computers and computation.

One of the reasons that we must be rather informal when discussing computation is that the idea of a computation is rather


vague. In the mid-1930s many mathematicians developed theoretical constructs that tried to capture the idea of a computable function. Kurt Gödel's recursive functions, the Turing machines of Alan Turing, and Alonzo Church's λ-calculus are three of the best-known models of computability.

One of the reasons that mathematicians accept these formal constructs as accurately modeling the intuitive notion of computability is that all of the formal analogs of computation that have been proposed have been proved to be equivalent. Thus it is known that a function is Turing computable if and only if it is general recursive if and only if it is λ-computable. It is also known that these formal notions are equivalent to the idea of a function being computable on a computer, where we will say that a function f is computable on an idealized computer if there is a computer program P such that if the program P is run with input n, the program will cause the computer to output f(n) and halt.

Thus the situation is like this: On one hand, we have an intuitive idea of what it means for a function to be effectively calculable. On the other hand, we have a slew of formal models of computation, each of which is known to be equivalent to all of the others:

Intuitive Notion          Formal Models

Computable function       Recursive function
                          Turing-computable function
                          λ-computable function
                          Computer-computable function

So why do I say that the idea of a computation is vague? Although all of the current definitions are equivalent, for all we know there might be a new definition of computation that you will come up with tonight over a beer. That new definition will be intuitively correct, in the sense that people who hear your definition agree that it is the "right" definition of what it means for a person to compute something, but your definition may well not be equivalent to the current definitions. This will be an earthshaking development and will give logicians and computer scientists plenty to think about for years to come.

You will go down in history as a brilliant person with great insight into the workings of the human mind! You will win lots of awards and be rich and famous! All right, I admit it. Not rich. Just famous. OK. Maybe not famous. But at least well known in logic and computer science circles.

But until you have that beer we will have to go with the current situation, where we have several equivalent definitions that seem to fit our current understanding of the word computation. So our idea of what constitutes a computation is imprecise, even though there is precision in the sense that lots of people have thought about what the definition ought to be, and every definition that has been proposed so far has been proved (precisely) to be equivalent to every other definition that has been proposed.

Church's Thesis is simply an expression of the belief that the formal models of computation accurately represent the intuitive idea of a calculable function. We will state the thesis in terms of recursiveness, as we have been working with recursive functions and recursive sets.

Church's Thesis. A function is computable if and only if it is recursive.

Now it is important to understand that Church's Thesis is a "thesis" as opposed to a "theorem" and that it will never be a theorem. As an attempt to link an intuitive notion (computability) and a formal notion (recursiveness) it is not the sort of thing that could ever be proved. Proofs require formal definitions, and if we write down a formal definition of computable function, we will have subverted the meaning of the thesis.

So is Church's Thesis true? We can say that all of the evidence to date seems to suggest that Church's Thesis is true, but I am afraid that is all the certainty that we can have on that point. We have over 60 years' experience since the statement of the thesis, and over 3000 years since we started computing functions, but that only counts as anecdotal evidence. Even so, most, if not all, of the mathematical community accepts the identification of "computable" with "recursive" and thus the community accepts Church's Thesis as an article of faith.

One of the areas of mathematical logic that has been very active has been the field of recursion theory. Roughly, recursion theory uses


the formal notions of computability and examines sets and functions that satisfy these definitions. For example, one theorem states

Theorem 4.4.1. A subset A of the natural numbers is recursive if and only if there is an idealized computer and a computer program P such that:

1. If a ∈ A and P is run with input a, then P will cause the computer to output 1 and halt.

2. If a ∉ A and P is run with input a, then P will cause the computer to output 0 and halt.

(Equivalently, for the cognoscenti, A is recursive if and only if its characteristic function is computable by an idealized computer.)

The proof of this theorem is essentially verifying my claim that the formal notions of "recursive function" and "function that is calculable by an idealized computer" are equivalent.

Now the class of recursive subsets of ℕ can be extended by looking at the collection of sets such that there is a program which will tell you if a number is an element of the set, but does not have to do anything at all if the number is not an element of the set.

Definition 4.4.2. A set A ⊆ ℕ is said to be recursively enumerable, or r.e., if there is a computer program P such that if a ∈ A, program P returns 1 on input a, and if a ∉ A, program P does not halt when given input a.
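The flavor of this definition can be seen in a small sketch. The set chosen below, the numbers expressible as a sum of two squares, is in fact recursive (a bounded search would do), so the unbounded search here is only to display the r.e. pattern: halt with output 1 on members, run forever otherwise.

```python
from itertools import count

# Sketch of a semi-decision procedure for A = { n : n = a^2 + b^2 for some a, b }.
# On input n in A it halts and returns 1; on n not in A it searches forever.
def semi_decide(n: int) -> int:
    for bound in count():                 # unbounded search over larger and larger witnesses
        for a in range(bound + 1):
            for b in range(bound + 1):
                if a * a + b * b == n:
                    return 1              # halt with output 1
    # if n has no witness, the loop above never terminates

print(semi_decide(25))  # 1  (25 = 3^2 + 4^2)
```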

It is easy to prove the following:

Proposition 4.4.3. A set A is recursive if and only if both A and ℕ − A are recursively enumerable.

Proof. Exercise 4. •

The exercises provide a little more practice in working with recursively enumerable sets.

The study of recursive functions is an important area of mathematical logic, and emphasizes the tie between logic and computer science. Although we have not attempted to do so in this book, it is possible to develop an entire presentation of Gödel's Incompleteness Theorem based on recursive functions. In this setting, the statement of the Incompleteness Theorem is the statement that the collection


of sentences that are provable-from-Σ, where Σ is an extension of N that is decidable and true-in-𝔑, is a recursively enumerable set that is not recursive. Thus there is a significant difference between the collection of recursive sets and the collection of r.e. sets. One text that emphasizes computability in its treatment of Gödel's Theorem is [Keisler and Robbin 96].

4.4.1 Exercises

1. (a) Show that A ⊆ ℕ is recursively enumerable if and only if A is listable, where a set is listable if there is a computer program L such that L prints out, in some order or another, the elements of A.

(b) Show A ⊆ ℕ is recursive if and only if A is listable in increasing order.

2. Suppose that A ⊆ ℕ is infinite and recursively enumerable and show that there is an infinite set B ⊆ A such that B is recursive.

3. Show that A ⊆ ℕ is recursively enumerable if and only if there is a Σ-formula φ(x) such that φ defines A.

4. Prove Proposition 4.4.3. [Suggestion: First assume that A is recursive. This direction is easy. For the other direction, the assumption guarantees the existence of two programs. Think about writing a new program that runs these two programs in tandem—first you run one program for a minute, then you run the second program for a minute. . . . ]

4.5 Coding—Naively

If you know a child of a certain age, you have undoubtedly run across coded messages of the form

1 14 1 16 16 12 5 1 4 1 25

where letters are coded by numbers associated with their place in the alphabet. If we ignore the spaces above, we can think of the phrase as being coded by a single number, and that number can have special properties. For example, the code above is not prime and it is divisible by exactly five distinct primes. If we like, we could


say that the coding allows us to assert the same statements about the phrase that has been coded. For example, if we take the phrase

The code for this phrase is even

and code it as a number, you might notice that the code ends in 4, so you might be tempted to say that the phrase was correct in what it asserts.
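The child's code is easy enough to mechanize. So as not to spoil Exercise 1 below, the sketch decodes a different message (the helper name is mine):

```python
# Letters coded by their place in the alphabet: A = 1, ..., Z = 26.
def decode(nums):
    return "".join(chr(ord("A") + n - 1) for n in nums)

print(decode([12, 15, 7, 9, 3]))  # LOGIC
```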

What we are doing here is representing English statements as numbers, and investigating the properties of the numbers. We can do the same thing with statements of ℒ_NT. For example, if we take the sentence

= 0S0,

we could perhaps code this sentence as the number

4964250291131250000000,

and then we can assert things about the number associated with the string. For example, the code for = 0S0 is in the set of numbers that are divisible by 10, and it is in the set of numbers that are larger than the national debt. What will make all of this interesting is that we can ask if the code for = 0S0 is in the set of all codes of sentences that can be deduced from N. Then we will be asking if our sentence is a theorem of N. If it is, then we will know that N is inconsistent. If it is not, then N is consistent. Thus we will have reduced the question of consistency of N to a question about numbers and sets!

This is the insight that led Gödel to the Incompleteness Theorem. Given any reasonable set of axioms A, Gödel showed a way to code the phrase

This phrase is not a theorem of A

as a sentence of ℒ_NT and prove that this sentence cannot be in the collection of sentences that are provable from A. So he found a sentence that was true, but not provable. We will, in this chapter and the next, do the same thing.

But first, we will have to establish our coding mechanism. In this section we will not develop our official coding, but rather a simplified version to give you a taste of the things to come. Let me describe the system I used for the example above.

I started by assigning symbol numbers to the symbols of ℒ_NT, given in Table 4.1. Notice that the symbol numbers are only assigned


Symbol   Symbol Number       Symbol   Symbol Number
¬        1                   +        13
∨        3                   ·        15
∀        5                   E        17
=        7                   <        19
0        9                   (        21
S        11                  )        23
vᵢ       2i

Table 4.1: Symbol Numbers for ℒ_NT

for the official elements of the language, so if you need to use any abbreviations, such as → or ∃, you will have to write them out in terms of their definitions.

Then I had to figure out a way to code up sequences of symbols. The idea here is pretty simple, as we will just take the symbol numbers and use them as exponents for the prime numbers, in order. For example, if we look at the expression

= 0S0

the sequence of symbol numbers this generates is

⟨7, 9, 11, 9⟩,

so the code for the sequence would be

2⁷3⁹5¹¹7⁹,

which is also known as

4964250291131250000000,

the example that we looked at earlier.

Notice that our coding is effective, in the sense that it is easy, given a number, to find its factorization and thus to find the string that is coded by the number.

Chaff: The word easy is used here in its mathematical sense, not in its computer science sense. In reality it can take unbelievably long to factor many numbers, especially numbers of the size that we will discuss.


Now, the problem with all of this is not that you would find it difficult to recognize code numbers, or to decode a given number, or anything like that. Rather, what turns out to be tricky is to show that N, our collection of axioms, is strong enough to be able to prove true assertions about the numbers. For example, we would like N to be able to show that the term

4964250291131250000000

represents a number that is the code for an ℒ_NT-sentence. The details of showing that N has this strength will occupy us for the next several sections.

4.5.1 Exercises

1. Decode the message that begins this section.

2. Code up the following formulas using the method described in this section.

(a) = +000
(b) = Ev₁Sv₂ ·Ev₁v₂v₁
(c) (= 00 ∨ (¬ < 00))
(d) (∃v₂)(< v₂0)

3. Find the ℒ_NT-formula that is represented by the following numbers. A calculator (or a computer algebra system) will be helpful. Write your answer in a form that normal people can understand—normal people with some familiarity with first-order logic and mathematics, that is.

(a) 334714343040000000000000
(b) 86829136707262709760000000
(c) 221355672311211371761992323

4.6 Coding Is Recursive

A basic part of our coding mechanism will be the ability to code finite sequences of numbers as a single number. A number c is going to be a code for a sequence of numbers ⟨k₁, …, kₙ⟩ if and only if


c = 2^{k₁}3^{k₂}···p_n^{k_n}, where p_n is the nth prime number. We assume, for convenience, that none of the kᵢ are equal to 0.

We show now that N is strong enough to recognize code numbers. In other words, we want to establish

Proposition 4.6.1. The collection of numbers that are codes for finite sequences is a recursive set.

Proof. It is easy to write a A-definition for the set of code numbers:

Codenumber(c) is: Divides(SS0, c) ∧ (∀z < c)(∀y < z)
[(Prime(z) ∧ Divides(z, c) ∧ Primepair(y, z)) → Divides(y, c)].

Notice that Codenumber(c) is a formula with one free variable, c. If you look at it carefully, all the formula says is that c is divisible by 2 and if any prime divides c, so do all the earlier primes. Since the definition above is a Δ-definition, Corollary 4.3.11 tells us that the set CODENUMBER is a recursive set. •
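The bounded search that this Δ-definition describes can be carried out directly — a naive sketch, with function names of my own:

```python
def is_prime(n: int) -> bool:
    return n > 1 and all(n % d for d in range(2, n))

def is_codenumber(c: int) -> bool:
    # c is a code number iff 2 divides c and whenever a prime divides c,
    # so does every smaller prime (no gaps in the prime factorization)
    if c < 2 or c % 2 != 0:
        return False
    for p in range(2, c + 1):
        if is_prime(p) and c % p != 0:
            # the first prime missing from c: no larger prime may divide c
            return not any(is_prime(q) and c % q == 0 for q in range(p + 1, c + 1))
    return True

print(is_codenumber(18), is_codenumber(45))  # True False
```

Here 18 = 2·3² codes the sequence ⟨1, 2⟩, while 45 = 3²·5 skips the prime 2 and so codes nothing.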

Since CODENUMBER is recursive and Codenumber is a Δ-formula, we now know (for example) that

N ⊢ Codenumber(\overline{18})

and N ⊢ ¬Codenumber(\overline{45}).

Now, suppose that we wanted to take a code number, c, and decode it. To find the third element of the sequence of numbers coded by c, I need to find the exponent of the third prime number. Thus, for N to be able to prove statements about the sequence coded by a number, N will need to be able to recognize the function that takes i and assigns to it the ith prime number, pᵢ. Proving that this function p is recursive is our next major goal.

Definition 4.6.2. The function p : ℕ → ℕ is the function such that p(0) = 1 and, for each positive natural number i, p(i) is the ith prime number. We will often write pᵢ instead of p(i).

Proposition 4.6.3. The function p that enumerates the primes is a recursive function.


Proof. We start by constructing a measure of the primes. A number a will be in the set YARDSTICK if and only if a is of the form 2¹3²5³···pᵢ^i for some i. So the first few elements of the set are {2, 18, 2250, 5402250, 870037764750, ...}.

Yardstick(a) is: Codenumber(a) ∧ Divides(SS0, a) ∧ ¬Divides(SSSS0, a) ∧
(∀y < a)(∀z < a)(∀k < a)[(Divides(z, a) ∧ Primepair(y, z)) →
(Divides(y^k, a) ↔ Divides(z^{Sk}, a))].

If we unravel this, all we have is that 2 divides a, 4 does not divide a, and if z is a prime number such that z divides a, then the power of z that goes into a is one more than the power of the previous prime that goes into a.

Now it is relatively easy to provide a Δ-definition of the function p:

IthPrime(i, y) is: Prime(y) ∧
(∃a < (y^y)^i)[Yardstick(a) ∧ Divides(y^i, a) ∧ ¬Divides(y^{Si}, a)].

Notice the tricky bound on the quantifier! Here is the thinking behind that bound: If y is, in fact, the ith prime, then here is an a ∈ YARDSTICK that shows this fact: a = 2¹3²···y^i. But then certainly a is less than or equal to y^y · y^y ··· y^y (i terms), and so a ≤ (y^y)^i. This bound is, of course, much larger than a (the lone exception being when i = 1 and y = 2), but we will only be interested in the existence of bounds, and will pay almost no attention to making the bounds precise in any sense. •
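The yardstick numbers themselves can be recognized by stripping off prime powers in order — a sketch with names of my own, searching directly rather than through the bounded quantifiers of the formula:

```python
def next_prime(p: int) -> int:
    q = p + 1
    while any(q % d == 0 for d in range(2, q)):
        q += 1
    return q

# YARDSTICK: numbers of the form 2^1 3^2 5^3 ... p_i^i.
def is_yardstick(a: int) -> bool:
    if a < 2:
        return False
    p, want, n = 2, 1, a
    while n > 1:
        e = 0
        while n % p == 0:                 # exponent of p in n
            n, e = n // p, e + 1
        if e != want:                     # must be exactly one more than the last
            return False
        want, p = want + 1, next_prime(p)
    return True

print([a for a in (2, 12, 18, 2250) if is_yardstick(a)])  # [2, 18, 2250]
```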

Chaff: There is a bit of tension over notation that needs to be mentioned here. Suppose that we wished


to discuss the seventeenth prime number, which happens to be 59, and that y is supposed to be equal to 59. The obvious way to assert this would be to state that y = p₁₇, but we will tend to use the explicit ℒ_NT-formula IthPrime(\overline{17}, y). Our choice will give us a great increase in consistency, as our formulas become rather more complicated over the rest of this chapter. We will tend to write all of our functions in this consistent, if not exactly intuitive, manner.

Now we can use the function IthPrime to find each element coded by a number:

IthElement(e, i, c) is: Codenumber(c) ∧ (∃y < c)(IthPrime(i, y) ∧
Divides(y^e, c) ∧ ¬Divides(y^{Se}, c)).

So intuitively, IthElement(e, i, c) is true if c is a code and e is the number at position i of the sequence coded by c.

The length of the sequence coded by c is also easily found:

Length(c, l) is:
Codenumber(c) ∧ (∃y < c)[IthPrime(l, y) ∧ Divides(y, c)
∧ (∀z < c)(Primepair(y, z) → ¬Divides(z, c))].

All this says is that if the lth prime divides c and the (l + 1)st prime does not divide c, then the length of the sequence coded by c is l.
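Reading off elements and lengths is again just factoring. A sketch with my own function names — the formulas above do the same job inside N using bounded quantifiers:

```python
def nth_prime(i: int) -> int:
    # p_1 = 2, p_2 = 3, p_3 = 5, ...
    count, n = 0, 1
    while count < i:
        n += 1
        if all(n % d for d in range(2, n)):
            count += 1
    return n

def ith_element(i: int, c: int) -> int:
    # the e with IthElement(e, i, c): the exponent of the i-th prime in c
    p, e = nth_prime(i), 0
    while c % p == 0:
        c, e = c // p, e + 1
    return e

def length(c: int) -> int:
    # position of the last prime that divides c
    l = 0
    while ith_element(l + 1, c) > 0:
        l += 1
    return l

c = 2**3 * 3**1 * 5**2          # codes the sequence <3, 1, 2>
print([ith_element(i, c) for i in (1, 2, 3)], length(c))  # [3, 1, 2] 3
```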

4.6.1 Exercises

1. Decide if the following statements are true or false as statements about the natural numbers. Justify your answers.

(a) ⟨5, 13⟩ ∈ ITHPRIME

(b) ⟨1200, 3⟩ ∈ LENGTH


(c) IthElement(\overline{1}, \overline{2}, \overline{3630}) (Why are there those lines over the numbers?)

4.7 Gödel Numbering

We now change our focus from looking at functions and relations on the natural numbers, where it makes sense to talk about recursive sets, to functions mapping strings of ℒ_NT-symbols to ℕ. We will establish our coding system for formulas, associating to each ℒ_NT-formula φ its Gödel number, ⌜φ⌝.

Definition 4.7.1. The function ⌜·⌝, with domain the collection of finite strings of ℒ_NT-symbols and codomain ℕ, is defined as follows:

⌜s⌝ =
  2¹3^⌜α⌝            if s is (¬α), where α is an ℒ_NT-formula
  2³3^⌜α⌝5^⌜β⌝       if s is (α ∨ β), where α and β are ℒ_NT-formulas
  2⁵3^⌜vᵢ⌝5^⌜α⌝      if s is (∀vᵢ)(α), where α is an ℒ_NT-formula
  2⁷3^⌜t₁⌝5^⌜t₂⌝     if s is = t₁t₂, where t₁ and t₂ are terms
  2⁹                  if s is 0
  2¹¹3^⌜t⌝           if s is St, with t a term
  2¹³3^⌜t₁⌝5^⌜t₂⌝    if s is +t₁t₂, where t₁ and t₂ are terms
  2¹⁵3^⌜t₁⌝5^⌜t₂⌝    if s is ·t₁t₂, where t₁ and t₂ are terms
  2¹⁷3^⌜t₁⌝5^⌜t₂⌝    if s is Et₁t₂, where t₁ and t₂ are terms
  2¹⁹3^⌜t₁⌝5^⌜t₂⌝    if s is < t₁t₂, where t₁ and t₂ are terms
  2^{2i}              if s is the variable vᵢ
  3                   otherwise.

Notice that each symbol is associated with its symbol number, as set out in Table 4.1.

Example 4.7.2. This is a neat function, but the numbers involved get really big, really fast. Suppose that we work out the Gödel number for the formula φ, where φ is (¬ = 0S0).

Since φ is a denial, the definition tells us that ⌜φ⌝ is 2¹3^{⌜= 0S0⌝}. So we need to find ⌜= 0S0⌝, and by the "equals" clause in the definition,

⌜= 0S0⌝ is 2⁷3^{⌜0⌝}5^{⌜S0⌝}.


But ⌜0⌝ = 2⁹, and ⌜S0⌝ = 2¹¹3^{⌜0⌝} = 2¹¹3^{(2⁹)}. Now we're getting somewhere. Plugging things back in, we get

⌜= 0S0⌝ is 2⁷3^{(2⁹)}5^{(2¹¹3^{(2⁹)})},

so the Gödel number for (¬ = 0S0) is

2¹3^{(2⁷3^{(2⁹)}5^{(2¹¹3^{(2⁹)})})}.
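The term clauses of Definition 4.7.1 can even be run on a computer, as long as the results stay merely astronomical: ⌜S0⌝ = 2¹¹3⁵¹² is a 248-digit integer that Python's arbitrary-precision arithmetic handles happily, though ⌜φ⌝ itself could never be built. A sketch, in which the nested-tuple representation of terms is my own device, not the book's:

```python
# Sketch of Definition 4.7.1 restricted to terms, driven by the symbol
# numbers of Table 4.1.
def gn(t) -> int:
    """Godel number of a term given as a nested tuple/string, e.g.
    ("S", "0") for S0, or ("+", "0", ("v", 1)) for +0v1."""
    if t == "0":
        return 2**9
    if t[0] == "v":                        # the variable v_i gets 2^(2i)
        return 2 ** (2 * t[1])
    if t[0] == "S":
        return 2**11 * 3 ** gn(t[1])
    op = {"+": 13, ".": 15, "E": 17}[t[0]]  # symbol number of the head
    return 2**op * 3 ** gn(t[1]) * 5 ** gn(t[2])

print(gn("0"))                                             # 512
print(gn(("S", "0")) == 2**11 * 3**512)                    # True
print(gn(("+", "0", ("v", 1))) == 2**13 * 3**512 * 5**4)   # True
```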

Chaff: To get an idea about how large this number is, consider the fact that the exponent on the 5 is ⌜S0⌝ = 2¹¹3^{(2⁹)}, which is

395742204565276981359699327426961020014622 018797636168707779330639922518966590197760 367399522017881177908886103953123727465543 866395298629306327869004362537831640066738 737141407294716362902985484718968825205601 82459989001541603088374317950036871168,

approximately 10²⁴⁷.

approximately 10247. Now, let us play a bit with the Godel number of <f>:

so if we take common logarithms, we see that.

logO"^) > 5[1°247] log 3

and taking logarithms again,

log(logr^)) > 10247 log 5 + log(log 3) > 10246.

Hmm. . . . So this means that log(⌜φ⌝) is bigger than 10^{[10²⁴⁶]}. But the common logarithm of a number is (approximately) the number of digits that it takes to express the number in base 10 notation, so we have shown that it takes more than 10^{[10²⁴⁶]} digits to write out the Gödel


number of φ. If you remember that a googol is 10¹⁰⁰ and a googolplex is 10^{10¹⁰⁰}, ⌜φ⌝ is starting to look like a pretty big number, but it gets better!

To write out a string of 10^{[10²⁴⁶]} characters (assuming a million characters per mile, or about 16 characters per inch) would require far more than 10^{[10²⁴⁵]} miles, which is far more than 10^{[10²⁴⁴]} light years.

Or, to look at it another way, if we assume that we can put about 132 lines of type on an 8.5- by 11-inch piece of paper (using both sides), that works out to about 10^{[10²⁴⁴]} pieces of paper, and since a ream of paper (500 sheets) is about 2 inches thick, that gives a stack of paper more than 10^{[10²⁴⁴]} light years high. Since the age of the universe is currently estimated to be in the tens of billions of years (on the order of 10¹⁰ years), if we assume that the universe is both Euclidean and spherical, the volume of the universe is less than 10⁴⁰ cubic light years, rather less than the 10^{[10²⁴⁴]} cubic light years we would need to store our stack of paper. In short, we don't win any prizes for being incredibly efficient with the coding that we have chosen. What we do win is ease of analysis. The fact that we have chosen to code recursively will make our proofs to come much easier to comprehend.

4.7.1 Exercises

1. Evaluate the Gödel number for each of the following.

(a) (∀v₃)(v₃ + 0 = v₄)
(b) SSSS0

2. Find the formula or term that is coded by each of the following:

(a)

(b) 2¹⁹3^{(2⁹)}5^{(2¹¹3^{(2⁹)})}

(c) 2⁵3^{(2²)}5^{(2⁷3⁴5⁴)}


4.8 Gödel Numbers and N

Suppose I asked you if the number

a = 9893555114131924533992483185674025500365550469940904 2176944832659980629741647549440091849880504470294477 2215259882809318663859665988246573265819672510906344 5791001668468428535182367909072574637117974220630140 0456149972503854007720935794875092177920000

was the Gödel number of an ℒ_NT-term. How would you go about finding out? A reasonable approach would be to factor a and try to decode. It turns out that a = 2¹³3⁵¹²5⁴, and since 512 = 2⁹ = ⌜0⌝ and 4 = 2² = ⌜v₁⌝, you know that a is the Gödel number of the term +0v₁. That was easy.
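The factor-and-decode step is mechanical enough to sketch. The function names are mine, and only the term clauses of the definition are handled:

```python
def exponent_of(p: int, n: int) -> int:
    """Exponent of the prime p in n."""
    e = 0
    while n % p == 0:
        n, e = n // p, e + 1
    return e

def decode_term(a: int) -> str:
    head = exponent_of(2, a)               # symbol number of the head symbol
    if head == 9:
        return "0"
    if head % 2 == 0:                      # 2^(2i) codes the variable v_i
        return f"v{head // 2}"
    arg1 = exponent_of(3, a)               # Godel number of the first argument
    if head == 11:
        return "S" + decode_term(arg1)
    arg2 = exponent_of(5, a)               # Godel number of the second argument
    sym = {13: "+", 15: ".", 17: "E"}[head]
    return sym + decode_term(arg1) + decode_term(arg2)

a = 2**13 * 3**512 * 5**4
print(decode_term(a))   # +0v1
```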

What makes this more interesting is that the above is so easy that N can prove that a is the Gödel number of a term. Establishing this fact is the goal of this section.

We will show how to construct certain Δ-formulas, for example the formula Term(x), such that for every natural number a, 𝔑 ⊨ Term(\overline{a}) if and only if a is the Gödel number of a term. Since our formula will be a Δ-formula, this tells us that N ⊢ Term(\overline{a}) if a is the Gödel number of a term, and N ⊢ ¬Term(\overline{a}) if a is not the Gödel number of a term.

The problem is going to be in writing down the formula. As you look at the definition of the function ⌜·⌝ in Definition 4.7.1, you can see that the definition is by recursion, and we will need a way to deal with recursive definitions within the constraints of Δ-definitions. We would like to be able to write something like

Term(x) is: ··· ∨ (∃y < x)[x = 2¹¹3^y ∧ Term(y)] ∨ ···,

but this definition is clearly circular. A technical trick will get us past this point.

But we should start at the beginning. You know that the collection of ℒ_NT-terms is the closure of the set of variables and the collection of constant symbols under the function symbols. We will begin by showing that the collection of Gödel numbers of variables is recursive.


Lemma 4.8.1. The set

VARIABLE = {a ∈ ℕ | a = ⌜v⌝ for some variable v}

is recursive.

Proof. It suffices to provide a Δ-definition for VARIABLE:

Variable(x) is: (∃y < x)(Even(y) ∧ 0 < y ∧ x = 2^y).

Notice that we use the fact that if x = 2^y, then y < x. It is easy to see that 𝔑 ⊨ Variable(ā) if and only if a ∈ VARIABLE, so our formula shows that VARIABLE is a recursive set. □
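The predicate in this Δ-definition is easy to check mechanically; here is a small one-function sketch of it (our code, not the book's):

```python
def is_variable_code(x):
    """True iff x = 2^y for some even y > 0, i.e. x codes a variable."""
    y = 0
    while x > 1 and x % 2 == 0:
        x //= 2
        y += 1
    return x == 1 and y > 0 and y % 2 == 0

assert is_variable_code(2 ** 2)        # <v1>
assert is_variable_code(2 ** 10)       # <v5>
assert not is_variable_code(2 ** 9)    # <0> has an odd exponent
assert not is_variable_code(12)        # not a power of 2
```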

To motivate our development of the formula Term, consider the term t, where t is +0Sv₁. We are used to recognizing that this is a term by looking at it from the outside in; t is a term, as it is the sum of two terms. Now we need to start looking at t from the inside out: t is a term, as there is a sequence of terms, each of which is either a constant symbol, a variable, or constructed from earlier entries in the sequence by application of a function symbol of the appropriate arity. Here is a construction sequence for our term t:

(v₁, Sv₁, 0, +0Sv₁).

From this construction sequence we can look at the associated sequence of Gödel numbers:

(2^2, 2^11 · 3^{2^2}, 2^9, 2^13 · 3^{2^9} · 5^{2^11 · 3^{2^2}}),

and then we can code this sequence up as a single number, as in Section 4.6:

c = 2^{2^2} · 3^{2^11 · 3^{2^2}} · 5^{2^9} · 7^{2^13 · 3^{2^9} · 5^{2^11 · 3^{2^2}}}.

Now we can begin to see what our formula Term(a) is going to look like. We will know that a is the Gödel number of a term if there is a number c that is the code for a construction sequence for a term, and the last number in that construction sequence is a. We begin by defining the collection of construction sequences:


Definition 4.8.2. A finite sequence of L_NT-terms (t₁, t₂, …, t_l) is called a term construction sequence for t_l if, for each i, 1 ≤ i ≤ l, tᵢ is either a variable, the constant symbol 0, or is one of tⱼ, Stⱼ, +tⱼtₖ, ·tⱼtₖ, or Etⱼtₖ, where j < i and k < i.

Proposition 4.8.3. The set

TERMCONSTRUCTIONSEQUENCE =
{(c, a) | c codes a term construction sequence for the term with Gödel number a}

is recursive.

Proof. Here is a Δ-definition for the set:

TermConstructionSequence(c, a) is:
CodeNumber(c) ∧
(∃l < c)(Length(c, l) ∧ IthElement(a, l, c) ∧
(∀e < c)(∀i < l)(IthElement(e, i, c) →
[Variable(e) ∨ e = 2^9 ∨
(∃j < i)(∃k < i)(∃eⱼ < c)(∃eₖ < c)
(IthElement(eⱼ, j, c) ∧ IthElement(eₖ, k, c) ∧
[e = eⱼ ∨ e = 2^11 · 3^{eⱼ} ∨ e = 2^13 · 3^{eⱼ} · 5^{eₖ} ∨
e = 2^15 · 3^{eⱼ} · 5^{eₖ} ∨ e = 2^17 · 3^{eⱼ} · 5^{eₖ}]))).

This just says that (c, a) ∈ TERMCONSTRUCTIONSEQUENCE if and only if c is a code of length l, a is the last number of the sequence coded by c, and if e is an entry at position i of c, then e is either the Gödel number of a variable, the Gödel number of 0, a repeat of an earlier entry, or is the Gödel number that results from applying S, +, ·, or E to earlier entries in c. As all of the quantifiers are bounded, this is a Δ-definition, so the set TERMCONSTRUCTIONSEQUENCE is recursive. □
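As a sanity check on the clauses above, here is a small simulation (ours, not the book's) that tests whether a list of Gödel numbers satisfies the conditions of the Δ-definition, using the exponents 9, 11, 13, 15, 17 for 0, S, +, ·, and E:

```python
def codes_variable(e):
    """e codes a variable iff e = 2^y with y even and positive."""
    return e > 2 and e & (e - 1) == 0 and (e.bit_length() - 1) % 2 == 0

def is_term_construction_sequence(seq):
    """Each entry must code a variable, code 0, repeat an earlier
    entry, or apply S, +, *, or E to earlier entries."""
    for i, e in enumerate(seq):
        if codes_variable(e) or e == 2 ** 9 or e in seq[:i]:
            continue
        ok = any(e == 2 ** 11 * 3 ** ej
                 or any(e == 2 ** f * 3 ** ej * 5 ** ek
                        for f in (13, 15, 17) for ek in seq[:i])
                 for ej in seq[:i])
        if not ok:
            return False
    return len(seq) > 0

v1, zero = 2 ** 2, 2 ** 9
sv1 = 2 ** 11 * 3 ** v1
t = 2 ** 13 * 3 ** zero * 5 ** sv1       # <+0Sv1>
assert is_term_construction_sequence([v1, sv1, zero, t])
assert not is_term_construction_sequence([sv1])   # Sv1 has no history
```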


Now it would seem that to define Term(a), all we would have to do is to say that a is the Gödel number of a term if there is a number c such that TermConstructionSequence(c, a). This is not quite enough, as the quantifier ∃c is not bounded. In order to write down a Δ-definition of Term, we will have to get a handle on how large construction sequences have to be.

Lemma 4.8.4. If t is an L_NT-term and ⌜t⌝ = a, then the number of symbols in t is less than a.

Proof. The proof is by induction on the complexity of t. Just to give you an idea of how true the lemma is, consider the example of t, where t is S0. Then t has two symbols, while ⌜t⌝ = a = 2^11 · 3^512, which is just a little bigger than 2. □

Lemma 4.8.5. If t is a term, the length of the shortest construction sequence of t is less than or equal to the number of symbols in t.

Proof. Again, use induction on the complexity of t. □

These lemmas tell us that if a is the Gödel number of a term, then there is a construction sequence of that term whose length is less than a.

Lemma 4.8.6. Suppose that t is an L_NT-term and u is a subterm of t (in other words, u is a substring of t and u is also an L_NT-term). Then ⌜u⌝ ≤ ⌜t⌝.

Proof. Exercise 4. □

Lemma 4.8.7. If a is a natural number greater than or equal to 1, then p_a ≤ 2^{2^a}, where p_a is the ath prime number.

Proof. Exercise 5. □

Now we have enough to give us our bound on the code for the shortest construction sequence for a term t with Gödel number ⌜t⌝ = a. Any such construction sequence must look like

(t₁, t₂, …, t_k = t),


where k < a and each tᵢ is a subterm of t. But then the code for this construction sequence is

c = 2^{⌜t₁⌝} · 3^{⌜t₂⌝} ⋯ p_k^{⌜t_k⌝}
  ≤ p_k^a · p_k^a ⋯ p_k^a   (k factors)
  ≤ p_a^a · p_a^a ⋯ p_a^a   (a factors)
  = (p_a^a)^a ≤ (2^{2^a})^{a²},

which gives us our needed bound. We are finally at a position where we can give a Δ-definition of the collection of Gödel numbers of L_NT-terms:

Term(a) is:
(∃c ≤ (2^{2^a})^{a²}) TermConstructionSequence(c, a).

So the set

TERM = {a ∈ ℕ | a is the Gödel number of an L_NT-term}

is a recursive set, and thus N has the strength to prove, for any number a, either Term(ā) or ¬Term(ā). In Exercise 6 we will ask you to show that the set FORMULA is also recursive.

4.8.1 Exercises

1. Assume that φ is a formula of L_NT. Which of the following are also L_NT-formulas? For the ones that are not formulas, why are they not formulas?

• Term(φ)
• Term(⌜φ⌝)
• Term(⌜φ⌝̄)


2. Suppose that in the definition of TermConstructionSequence, you saw the following string:

⋯ ∨ e = 2^11 · 3^{eⱼ} ∨ ⋯

Would that be a part of a legal L_NT-formula? How do you know? What if the string were

⋯ ∨ e = 2̄^11 · 3̄^{eⱼ} ∨ ⋯ ?

3. Prove Lemma 4.8.4.

4. Prove Lemma 4.8.6.

5. Prove Lemma 4.8.7 by induction. For the inductive step, if you are trying to prove that p_{n+1} ≤ 2^{2^{n+1}}, use the fact that p_{n+1} is less than or equal to the smallest prime factor of (p₁p₂⋯pₙ) − 1.
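The bound in Lemma 4.8.7 (read here as p_a ≤ 2^{2^a}) is easy to spot-check numerically; this sketch (ours) uses a naive trial-division prime generator:

```python
def first_primes(n):
    """The first n primes, by trial division."""
    found = []
    k = 2
    while len(found) < n:
        if all(k % p != 0 for p in found):
            found.append(k)
        k += 1
    return found

# p_a <= 2^(2^a) for the first several primes
for a, p in enumerate(first_primes(10), start=1):
    assert p <= 2 ** (2 ** a), (a, p)
print(first_primes(10))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The bound is of course wildly generous, which is exactly what makes it easy to prove by induction.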

6. A proof similar to the proof that TERM is recursive will show that

FORMULA = {a ∈ ℕ | a is the Gödel number of an L_NT-formula}

is also recursive. Carefully supply the needed details and define the formula Formula(f). You will probably have to define the formula FormulaConstructionSequence(c, f) and estimate the length of such sequences as part of your exposition.

4.9 NUM and SUB Are Recursive

In our proof of the Self-Reference Lemma in Section 5.2, we will have to be able to substitute the Gödel number of a formula into a formula. To do this it will be necessary to know that a couple of functions are recursive, and in this section we outline how to construct Δ-definitions of those functions. First we work with the function Num.

Recall that ā is the numeral representing the number a. Thus, 2̄ is SS0. Since SS0 is an L_NT-term, it has a Gödel number, in this case

⌜SS0⌝ = 2^11 · 3^{⌜S0⌝} = 2^11 · 3^{2^11 · 3^{⌜0⌝}} = 2^11 · 3^{2^11 · 3^{2^9}}.

The function Num that we seek will map 2 to the Gödel number of its L_NT-numeral, 2^11 · 3^{2^11 · 3^{2^9}}. So Num(a) = ⌜ā⌝.


To write a Δ-definition Num(a, y) we will start, once again, with a construction sequence, but this time we will construct the numeral ā. For a = 2, the construction sequence is

(0, S0, SS0),

which gives rise to the sequence of Gödel numbers

(2^9, 2^11 · 3^{2^9}, 2^11 · 3^{2^11 · 3^{2^9}}),

which is coded by the number

c = 2^{2^9} · 3^{2^11 · 3^{2^9}} · 5^{2^11 · 3^{2^11 · 3^{2^9}}}.

Notice that the length of the construction sequence here is 3, and in general the construction sequence will have length a + 1 if we seek to code the construction of the Gödel number of the numeral associated with the number a. (That is a very long sentence, but it does make sense if you work through it carefully.)

You are asked in Exercise 1 to write down the formula

NumConstructionSequence(c, a, y)

as a Δ-formula. The idea is that

𝔑 ⊨ NumConstructionSequence(c, a, y)

if and only if c is the code for a construction sequence of length a + 1 with last element y = ⌜ā⌝.

Now we would like to define the formula Num(a, y) in such a way that Num(a, y) is true if and only if y is ⌜ā⌝, and as in Section 4.8, the formula (∃c)NumConstructionSequence(c, a, y) does not work, as the quantifier is unbounded. So we must find a bound for c.

If (c, a, y) ∈ NUMCONSTRUCTIONSEQUENCE, then we know that c codes a construction sequence of length a + 1 and c is of the form

c = 2^{⌜t₁⌝} · 3^{⌜t₂⌝} ⋯ p_{a+1}^{⌜t_{a+1}⌝},

where each tᵢ is a subterm of ā. By Lemma 4.8.6 and Lemma 4.8.7, we know that ⌜tᵢ⌝ ≤ y and p_{a+1} ≤ 2^{2^{a+1}}, so

c ≤ 2^y · 3^y ⋯ p_{a+1}^y   (a + 1 factors)   ≤ (p_{a+1})^{(a+1)y} ≤ (2^{2^{a+1}})^{(a+1)y}.


This gives us our needed bound on c. Now we can define

Num(a, y) is:
(∃c ≤ (2^{2^{a+1}})^{(a+1)y}) NumConstructionSequence(c, a, y).

The next formulas that we need to discuss will deal with substitution. We will define two Δ-formulas:

1. TermSub(u, x, t, y) will represent substitution of a term for a variable in a term (u_t^x).

2. Sub(f, x, t, y) will represent substitution of a term for a variable in a formula (φ_t^x).

Specifically, we will show that 𝔑 ⊨ TermSub(⌜u⌝, ⌜x⌝, ⌜t⌝, y) if and only if y is the Gödel number of u_t^x, where it is assumed that u and t are terms and x is a variable. Similarly, Sub(⌜φ⌝, ⌜x⌝, ⌜t⌝, y) will be true if and only if φ is a formula and y = ⌜φ_t^x⌝.

We will develop TermSub carefully and outline the construction of Sub, leaving the details to the reader.

First, let us look at an example. Suppose that u is the term + · 0S0x. Then a construction sequence for u could look like

(0, x, S0, · 0S0, + · 0S0x).

If t is the term SS0, then u_t^x is + · 0S0SS0, and we will find a construction sequence for this term in stages. The first thing we will do is look at the sequence (no longer a construction sequence) that we obtain by replacing all of the x's in u's construction sequence with t's:

(0, SS0, S0, · 0S0, + · 0S0SS0).

This fails to be a construction sequence, as the second term of the sequence is illegal. However, if we precede this with the construction sequence for t, all will be well (recall that we are allowed to repeat elements in a construction sequence):

(0, S0, SS0, 0, SS0, S0, · 0S0, + · 0S0SS0).

So to get all of this to work, we have to do three things:


1. We must show how to change u's construction sequence by replacing the occurrences of x with t.

2. We must show how to put one construction sequence in front of another.

3. We must make sure that all of our quantifiers are bounded throughout, so that our final formula TermSub is a Δ-formula.
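At the level of strings, the first two steps look like this (a toy illustration of the plan, ours, using * for the dot; the arithmetized version works entry by entry on Gödel numbers instead):

```python
u_seq = ["0", "x", "S0", "*0S0", "+*0S0x"]   # construction sequence for u
t_seq = ["0", "S0", "SS0"]                    # construction sequence for t

# Step 1: replace x by t everywhere (the result is no longer a
# construction sequence, since SS0 now appears with no history).
replaced = [s.replace("x", "SS0") for s in u_seq]

# Step 2: put t's construction sequence in front to repair it.
appended = t_seq + replaced

assert appended == ["0", "S0", "SS0",
                    "0", "SS0", "S0", "*0S0", "+*0S0SS0"]
```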

So, first we define a formula TermReplace(c, u, d, x, t) such that if c is the code for a construction sequence of a term with Gödel number u, then d is the code of the sequence (probably not a construction sequence) that results from replacing each occurrence of the variable with Gödel number x by the term with Gödel number t. In the following definition, the idea is that eᵢ and aᵢ are the entries at position i of the sequences coded by c and d, respectively, and if we look at the definition line by line we see: c codes a construction sequence for the term coded by u; d codes a sequence; the lengths of the two sequences are the same; if eᵢ and aᵢ are the ith entries in sequences c and d, respectively, then eᵢ is x if and only if aᵢ is t; eᵢ is another variable if and only if aᵢ is the same variable; eᵢ codes 0 if and only if aᵢ also codes 0; eᵢ codes the successor of a previous eⱼ if and only if aᵢ codes the successor of the corresponding aⱼ; and so on.


TermReplace(c, u, d, x, t) is:
TermConstructionSequence(c, u) ∧ CodeNumber(d) ∧
(∃l < c)(Length(c, l) ∧ Length(d, l) ∧
(∀i < l)(∀eᵢ < c)(∀aᵢ < d)
((IthElement(eᵢ, i, c) ∧ IthElement(aᵢ, i, d)) →
[([(Variable(eᵢ) ∧ eᵢ = x) ↔ aᵢ = t] ∧
[(Variable(eᵢ) ∧ eᵢ ≠ x) ↔ (Variable(aᵢ) ∧ aᵢ = eᵢ)]) ∧
(eᵢ = 2^9 ↔ aᵢ = 2^9) ∧
(∀j < i)(((∃eⱼ < c)(IthElement(eⱼ, j, c) ∧ eᵢ = 2^11 · 3^{eⱼ})) ↔
((∃aⱼ < d)(IthElement(aⱼ, j, d) ∧ aᵢ = 2^11 · 3^{aⱼ}))) ∧
(Similar clauses for +, ·, and E.)])).

Now that we know how to destroy a construction sequence for u by replacing all occurrences of x with t, we have to be able to put a term construction sequence for t in front of our sequence to make it a construction sequence again. So suppose that d is the code for the sequence obtained by replacing x by t, and that b is the code for the term construction sequence for t. Essentially, we want a to code d appended to b. So the length of a is the sum of the lengths of b and d, the first l elements of the a sequence should be the same as the b sequence, where the length of the b sequence is l, and the rest of the a sequence should match, entry by entry, the d sequence:


Append(b, d, a) is:
CodeNumber(b) ∧ CodeNumber(d) ∧ CodeNumber(a) ∧
(∃l < b)(∃m < d)(Length(b, l) ∧
Length(d, m) ∧ Length(a, l + m) ∧
(∀e < a)(∀i < l)[IthElement(e, i, a) ↔ IthElement(e, i, b)] ∧
(∀e < a)(∀j < m)[0 < j → (IthElement(e, l + j, a) ↔ IthElement(e, j, d))]).

Now we are ready to define the formula TermSub(u, x, t, y) that is supposed to be true if and only if y is the (Gödel number of the) term that results when you take the term u and replace x by t. A first attempt at a definition might be

(∃a)(∃b)(∃d)(∃c)[TermConstructionSequence(c, u) ∧
TermConstructionSequence(b, t) ∧
TermReplace(c, u, d, x, t) ∧ Append(b, d, a) ∧
TermConstructionSequence(a, y)].

This is nice, but we need to bound all of the quantifiers. Bounding b and d is easy: They are smaller than a. As for c, we showed when we defined the formula Term(a) on page 153 that the term coded by u must have a construction sequence coded by a number less than (2^{2^u})^{u²}. So all that is left is to find a bound on a. Notice that we can assume that b codes a sequence of length less than or equal to t, the last entry of b, and similarly, c codes a sequence of length less than or equal to u. Since the sequence coded by d has the same length as the sequence coded by c, and as a codes b's sequence followed by d's, the length of the sequence coded by a is less than or equal to t + u. So

a ≤ 2^{e₁} · 3^{e₂} ⋯ p_{t+u}^{e_{t+u}}.

Now we can use the same trick that we used in the definition of TermConstructionSequence. As each entry of a can be assumed to be less than or equal to y, we get

a ≤ 2^y · 3^y ⋯ p_{t+u}^y ≤ [(p_{t+u})^y]^{t+u} ≤ ([2^{2^{t+u}}]^y)^{t+u}.

So our Δ-definition of TERMSUB is

TermSub(u, x, t, y) is:

(∃a ≤ ([2^{2^{t+u}}]^y)^{t+u})(∃b ≤ a)(∃d ≤ a)(∃c ≤ (2^{2^u})^{u²})
[TermConstructionSequence(c, u) ∧ TermConstructionSequence(b, t) ∧
TermReplace(c, u, d, x, t) ∧ Append(b, d, a) ∧
TermConstructionSequence(a, y)].

Chaff: Oh, how I hate garbage cases. My claim in the paragraph preceding the definition of TermSub that each entry of a will be less than or equal to y might not be correct if y = u_t^x = u, so the substitution is vacuous. The reason for this is that the entries of a would include a construction sequence for t, which might be huge, while y might be relatively small. For example, we might have u = 0 and t = v₁₂₃₄₅₆₇₈₉ and x = v₁. In Exercise 3 you are invited to figure out the slight addition to the definition of TermSub that takes care of this.

Now we outline the construction of the formula Sub(f, x, t, y), which is to be true if f is the Gödel number of a formula φ and y is ⌜φ_t^x⌝. The idea is to take a formula construction sequence for φ and follow it by a copy of the same construction sequence where we systematically replace the occurrences of x with t's. You do have to be a little careful in your replacements, though. If you compare Definitions 1.8.1 and 1.8.2, you can see that the rules for replacing variables by terms are a little more complicated in the formula case than in the term case, particularly when you are substituting in a quantified formula. But the difficulties are not too bad.


So if b codes the construction sequence for φ, if d codes the sequence that you get after replacing the x's by t's, and if Append(b, d, a) holds, then a will code up a construction sequence for φ_t^x. After you deal with the bounds, you will have a Δ-formula along the lines of

Sub(f, x, t, y) is:
(∃a < Bound)(∃b < a)(∃d < a)
[FormulaConstructionSequence(b, f) ∧
FormulaReplace(b, f, d, x, t) ∧ Append(b, d, a) ∧
FormulaConstructionSequence(a, y)].

4.9.1 Exercises

1. Write the formula NumConstructionSequence(c, a, y). Make sure that your formula is a Δ-formula with the variables c, a, and y free. [Suggestion: You might want to model your answer on the construction of TermConstructionSequence.]

2. Write out the "Similar clauses" in the definition of TermReplace.

3. Change the definition of TermSub to take care of the problem mentioned on page 160. [Suggestion: The problem occurs only if the substitution is vacuous. So there are two cases. Either u and y are different, in which case our definition is fine, or u and y are equal. What do you need to do then? So I suggest that your answer should be a disjunction, something like

MyTermSub(u, x, t, y) is: [(u ≠ y) ∧ TermSub(u, x, t, y)] ∨
[(u = y) ∧ (Something Brilliant)].

4. Write out the details of the formula Sub.


4.10 Definitions by Recursion Are Recursive

If you look at the definition of a term (Definition 1.3.1), the definition of formula (Definition 1.3.3), the definition of u_t^x (Definition 1.8.1), and the definition of φ_t^x (Definition 1.8.2), you will notice that all of these definitions were definitions "by recursion." For example, in the definition of a term, you see the phrase

. . . t is ft₁t₂ . . . tₙ, where f is an n-ary function symbol of L and each of the tᵢ are terms of L

so a term can have constituent parts that are themselves terms. In the last two sections we have used the device of construction sequences to show that the sets TERM, FORMULA, TERMSUB, and SUB are recursive sets. In this section we outline a proof that all such sets of strings, defined "by recursion," give rise to sets of Gödel numbers that are recursive. It will be clear from our exposition that a more general statement of our theorem could be proved, but what we present will be sufficient for our needs.

Definition 4.10.1. A string of symbols s from a first-order language L is called an expression if s is either a term of L or a formula of L.

Theorem 4.10.2. Suppose that we have a set of L_NT-expressions, which we will call Set, defined as follows: An expression s is an element of Set if and only if:

1. s is an element of BaseCaseSet, or

2. There is an expression t, a proper substring of s, such that (t, s) is an element of ConstructionSet.

If the sets of strings BaseCaseSet and ConstructionSet give rise to sets of Gödel numbers BASECASESET and CONSTRUCTIONSET that are defined by Δ-formulas, then the set

SET = {⌜s⌝ | s ∈ Set}

is recursive and has a Δ-definition Set.

Chaff: Try to keep the various typefaces straight:

• Set is a bunch of L_NT-expressions—strings of symbols from L_NT.

• SET is a set of natural numbers—the Gödel numbers of the strings in Set.

• Set is an L_NT-formula such that 𝔑 ⊨ Set(ā) if and only if there is an s ∈ Set such that ⌜s⌝ = a.

Proof. We follow very closely the proof of the recursiveness of the set TERM, which begins on page 150. As you worked through Exercise 6 in Section 4.8.1, you saw that you could prove the analogs of Lemmas 4.8.4 through 4.8.6 for formulas, so you have established the following lemma:

Lemma 4.10.3. Suppose that s is an L_NT-expression and u is a substring of s that is also an expression. Then

1. If ⌜s⌝ = a, then the number of symbols in s is less than or equal to a.

2. The length of the shortest construction sequence of s is less than or equal to the number of symbols in s.

3. ⌜u⌝ ≤ ⌜s⌝.

Now we can write a Δ-definition of SETCONSTRUCTIONSEQUENCE and then use this lemma to show SET is recursive:

SetConstructionSequence(c, a) is:
CodeNumber(c) ∧
(∃l < c)(Length(c, l) ∧ IthElement(a, l, c) ∧
(∀i < l)(∀e < c)(IthElement(e, i, c) →
[BaseCaseSet(e) ∨
(∃j < i)(∃eⱼ < c)(IthElement(eⱼ, j, c) ∧ ConstructionSet(eⱼ, e))])).

We know, by assumption, that there are Δ-formulas BaseCaseSet and ConstructionSet that define the recursive sets BASECASESET and CONSTRUCTIONSET. Thus, all of the quantifiers in the definition above are bounded, and SetConstructionSequence is a Δ-formula.


As before we can use Lemmas 4.8.5 and 4.8.7 to bound the size of the shortest construction sequence for a: By the same argument as on page 152, there is a construction sequence coded by a number c such that c ≤ (2^{2^a})^{a²}. So we define

Set(a) is: (∃c ≤ (2^{2^a})^{a²}) SetConstructionSequence(c, a),

and we have a Δ-definition of the set SET, finishing our proof. □

This is a wonderful theorem, as it saves us lots of work. Merely by noting that their definitions fit the requirements of Theorem 4.10.2, the following sets all turn out to be recursive and have Δ-definitions:

1. FREE, where (x, f) ∈ FREE if and only if x is the Gödel number of a variable that is free in the formula with Gödel number f.

2. SUBSTITUTABLE, where (t, x, f) ∈ SUBSTITUTABLE if and only if t is the Gödel number of a term that is substitutable for the variable with Gödel number x in the formula with Gödel number f.

4.10.1 Exercises

1. Work through the details (including the needed modification of Theorem 4.10.2) and show that both FREE and SUBSTITUTABLE are recursive.

2. Suppose that the function f : ℕ^{k+1} → ℕ is defined as follows:

f(a⃗, 0) = g(a⃗)
f(a⃗, b + 1) = h(a⃗, b, f(a⃗, b)),

where g and h are recursive functions represented by Δ-formulas. Show that f is a recursive function. [Suggestion: One approach is to define the formula fConstructionSequence(c, a⃗, l), with the idea that the sequence coded by c will be ⟨f(0), f(1), …, f(l)⟩. Or, you might try to fit this situation into a theorem along the lines of Theorem 4.10.2.]


3. Use Exercise 2 to show that the following functions are recursive:

(a) The factorial function n!
(b) The Fibonacci function F, where F(1) = F(2) = 1, and for k ≥ 3, F(k) = F(k − 1) + F(k − 2)
(c) The function a↑j, where a↑0 = 1 and a↑(j + 1) = a^{a↑j}

(You should also compute a few values, along the lines of 2↑3, 2↑4, and 2↑5.)
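The recursion scheme of Exercise 2 is easy to simulate; this sketch (ours) makes the construction sequence explicit and instantiates it for the factorial and a↑ functions of Exercise 3 (Fibonacci needs an extra pairing trick, since it looks back two steps):

```python
def primitive_recursion(g, h):
    """Return f with f(a, 0) = g(a) and f(a, b+1) = h(a, b, f(a, b))."""
    def f(a, b):
        seq = [g(a)]                 # the sequence (f(a,0), ..., f(a,b))
        for i in range(b):
            seq.append(h(a, i, seq[-1]))
        return seq[-1]
    return f

fact = primitive_recursion(lambda a: 1, lambda a, b, r: (b + 1) * r)
up = primitive_recursion(lambda a: 1, lambda a, j, r: a ** r)

assert fact(0, 5) == 120       # 5!; the parameter a is unused here
assert up(2, 3) == 16          # 2^(2^(2^1)) = 16
assert up(2, 4) == 65536
```

The list `seq` plays exactly the role of the sequence coded by c in the suggested formula fConstructionSequence.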

4.11 The Collection of Axioms Is Recursive

In this section we will exhibit two Δ-formulas that are designed to pick out the axioms of our deductive system.

Proposition 4.11.1. The collection of Gödel numbers of the axioms of N is recursive.

Proof. The formula AxiomOfN is easy to describe. As there are only a finite number of N-axioms, a natural number a is in the set AXIOMOFN if and only if it is one of a finite number of Gödel numbers. Thus

AxiomOfN(a) is:
a = ⌜(∀x)¬Sx = 0⌝ ∨
a = ⌜(∀x)(∀y)[Sx = Sy → x = y]⌝ ∨
⋮
∨ a = ⌜(∀x)(∀y)[(x < y) ∨ (x = y) ∨ (y < x)]⌝.

(To be more-than-usually picky, we need to change the x's and y's to v₁'s and v₂'s, but you can do that.) □

Proposition 4.11.2. The collection of Gödel numbers of the logical axioms is recursive.

Proof. The formula that recognizes the logical axioms is more complicated than the formula AxiomOfN for two reasons. The first is that there are infinitely many logical axioms, so we cannot just list them all. The second reason that this group of axioms is more complicated


is that the quantifier axioms depend on the notion of substitutability, so we will have to use our results from Section 4.10.

Quite probably you are at a point where you could turn to Section 2.3.3 and write down a Δ-definition of the set LOGICALAXIOM. To do so would be a worthwhile exercise. But if you are feeling lazy, here is an attempt:

LogicalAxiom(a) is:

(∃x < a)(Variable(x) ∧ a = 2^7 · 3^x · 5^x) ∨

(∃x, y < a)(Variable(x) ∧ Variable(y) ∧
a = [the code, built from x and y, of (x = y) → Sx = Sy]) ∨

(∃x₁, x₂, y₁, y₂ < a)(Variable(x₁) ∧ ⋯ ∧ Variable(y₂) ∧
a = Ugly Mess saying [(x₁ = y₁) ∧ (x₂ = y₂)] → (x₁ + x₂ = y₁ + y₂)) ∨

(Similar clauses coding up (E2) and (E3) for ·, E, =, and <) ∨

(∃f, x, t, y < a)(Formula(f) ∧ Variable(x) ∧ Term(t) ∧
Substitutable(t, x, f) ∧ Sub(f, x, t, y) ∧
a = [the code, built from f, x, and y, of (∀xφ) → φ_t^x]) ∨

(Similar clause for Axiom (Q2)).

To look at this in a little more detail, the first clause of the formula is supposed to correspond to axiom (E1): x = x for each variable x. So a is the Gödel number of an axiom of this form if there is some x that is the Gödel number of a variable [which is what Variable(x) says] such that a is the Gödel number of a formula


that looks like

variable with Gödel number x = variable with Gödel number x.

But the Gödel number for this formula is just 2^7 · 3^x · 5^x, so that is what we demand that a equal.

The second clause covers axioms of the form (E2), when the function f is the function S. We demand that a be the code for the formula

(vᵢ = vⱼ) → Svᵢ = Svⱼ,

where x = ⌜vᵢ⌝ and y = ⌜vⱼ⌝. After you fuss with the coding, you come out with the expression shown. The other clauses of type (E2) and (E3) are similar.

The last clause that is written out is for the quantifier axiom (Q1). After demanding that the term coded by t be substitutable for the variable coded by x in the formula coded by f and that y be the code for the result of substituting in that way, the equation for a is nothing more than the analog of (∀xφ) → φ_t^x. □

4.11.1 Exercise

1. Complete the definition of the formula LogicalAxiom.

4.12 Coding Deductions

It is probably difficult to remember at this point of our journey, but our goal is to prove the Incompleteness Theorem, and to do that we need to write down an L_NT-sentence that is true in 𝔑, the standard structure, but not provable from the axioms of N. Our sentence, θ, will "say" that θ is not provable from N, and in order to "say" that, we will need a formula that identifies the (Gödel numbers of the) formulas that are provable from N. To do that we will need to be able to code up deductions from N, which makes it necessary to code up sequences of formulas. Thus, our next goal will be to settle on a coding scheme for sequences of L_NT-formulas.

We have been pretty careful with our coding up to this point. If you check, every Gödel number that we have used has been even, with the exception of 3, which is the garbage case in Definition 4.7.1. We will now use numbers whose smallest prime factor is 5 to code sequences of formulas.


Suppose that we have the sequence of formulas

D = (φ₁, φ₂, …, φₖ).

We will define the sequence code of D to be the number

5^{⌜φ₁⌝} · 7^{⌜φ₂⌝} ⋯ p_{k+2}^{⌜φₖ⌝}.

So the exponent on the (i+2)nd prime is the Gödel number of the ith element of the sequence. You are asked in the Exercises to produce several useful L_NT-formulas relating to sequence codes.
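A quick sketch of sequence codes (our illustration; it assumes the Gödel numbers of the formulas have already been computed):

```python
def sequence_code(godel_numbers):
    """Put the i-th Godel number on the (i+2)-nd prime, starting at 5."""
    primes, k = [], 2
    while len(primes) < len(godel_numbers) + 2:
        if all(k % p != 0 for p in primes):
            primes.append(k)
        k += 1
    code = 1
    for g, p in zip(godel_numbers, primes[2:]):
        code *= p ** g
    return code

assert sequence_code([3, 2]) == 5 ** 3 * 7 ** 2
assert sequence_code([1]) % 2 == 1    # never even, and never divisible by 3
```

Starting at 5 is what keeps sequence codes disjoint from the (even) Gödel numbers of expressions and from the garbage case 3.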

We will be interested in using sequence codes to code up deductions from N. If you look back at the definition of a deduction (Definition 2.2.1), you will see that to check if a sequence is a deduction, we need only check that each entry is either an axiom or follows from previous lines of the deduction via a rule of inference. So to say that c codes up a deduction from N, we want to be able to say, for each entry e at position i of the deduction coded by c,

AxiomOfN(e) ∨ LogicalAxiom(e) ∨ RuleOfInference(c, e, i).

The first two of these we have already developed. The last major goal of this section (and this chapter) is to flesh out the details of a Δ-definition of a formula that recognizes when an entry in a deduction is justified by one of our rules of inference.

As you recall from Section 2.4, there are two types of rules of inference: propositional consequence and quantifier rules. The latter of these is easier to Δ-define, so we deal with it first.

The rule of inference (QR) is Definition 2.4.6, which says that if x is not free in ψ, then the following are rules of inference:

({ψ → φ}, ψ → (∀xφ)) and ({φ → ψ}, (∃xφ) → ψ).

If we look at the first of these, we see that if e is an entry in a code for a deduction, and e = ⌜ψ → (∀xφ)⌝, then e is justified as long as there is an earlier entry in the deduction that codes up the formula ψ → φ, assuming that x is not free in ψ. So all we have to do is figure out a way to say this.

We will write the formula QRRule1(c, e, i), where c is the code of the deduction, and e is the entry at position i that is being justified


by the first quantifier rule. In the following, f is playing the role of ⌜ψ⌝, while g is supposed to be ⌜φ⌝:

QRRule1(c, e, i) is:
SequenceCode(c) ∧ IthSequenceElement(e, i, c) ∧
(∃x, f, g < c)(Formula(f) ∧ Formula(g) ∧ Variable(x) ∧
¬Free(x, f) ∧ e = [the code, built from f, g, and x, of ψ → (∀xφ)] ∧
(∃j < i)(∃eⱼ < c)(IthSequenceElement(eⱼ, j, c) ∧
eⱼ = [the code, built from f and g, of ψ → φ])).

After you write out QRRule2 in the Exercises, it is obvious how to define

QRRule(c, e, i) is: QRRule1(c, e, i) ∨ QRRule2(c, e, i).

Now we have to address propositional consequence. This will involve some rather tricky coding, so hold on tight as we review propositional logic.

Assume that D is our deduction, and D is the sequence of formulas

(α₁, α₂, …, αₖ).

Notice that entry αᵢ of a deduction is justified as a propositional consequence if and only if the formula

β = (α₁ ∧ α₂ ∧ ⋯ ∧ α_{i−1}) → αᵢ

is a tautology. Now, as we discussed in Section 2.4, in order to decide if a first-order formula is a tautology, we must take β and find the propositional formula β_P. Then if β_P is

[(α₁)_P ∧ (α₂)_P ∧ ⋯ ∧ (α_{i−1})_P] → (αᵢ)_P,


we must show that any truth assignment that makes (α₁)_P through (α_{i−1})_P true must also make (αᵢ)_P true.

As outlined on page 59, to create a propositional version of a formula, we first must find the prime components of that formula, where a prime component is a subformula that is either universal and not contained in any other universal subformula, or atomic and not contained in any universal formula. Rather than explicitly writing out a Δ-formula that identifies the pairs (u, v) such that u is the Gödel number of a prime component of the formula with Gödel number v, let us write down a recursive definition of this set of formulas, and we will leave it to the reader to find the minor modification of Theorem 4.10.2 that will guarantee that the set of Gödel numbers is recursive.

Definition 4.12.1. If β and γ are L_NT-formulas, γ is said to be a prime component of β if:

1. β is atomic and γ = β, or

2. β is universal and γ = β, or

3. β is ¬α and γ is a prime component of α, or

4. β is α₁ ∨ α₂ and γ is a prime component of either α₁ or α₂.
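Definition 4.12.1 translates directly into a recursion on formulas; here is a sketch on a toy formula representation (ours, not the book's coding):

```python
# Formulas as tuples: ('atomic', s), ('forall', var, body),
# ('not', body), ('or', left, right).
def prime_components(beta):
    """All prime components of beta, per the four clauses above."""
    kind = beta[0]
    if kind in ("atomic", "forall"):
        return [beta]
    if kind == "not":
        return prime_components(beta[1])
    if kind == "or":
        return prime_components(beta[1]) + prime_components(beta[2])
    raise ValueError("unknown formula shape")

phi = ("or", ("forall", "x", ("atomic", "0<x")), ("not", ("atomic", "0<x")))
assert prime_components(phi) == [("forall", "x", ("atomic", "0<x")),
                                 ("atomic", "0<x")]
```

Note that the recursion never descends into the body of a universal formula, which is exactly what makes universal subformulas "prime."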

Proposition 4.12.2. The set

PRIMECOMPONENT =
{(u, f) | u = ⌜γ⌝ and f = ⌜β⌝ and γ is a prime component of β, for some L_NT-formulas γ and β}

is recursive and has Δ-definition PrimeComponent(u, f).

Proof. Theorem 4.10.2. □

Now we will code up a canonical sequence of all of the prime components of α₁ through αᵢ. We will say r codes the PrimeList for the first i entries of the deduction coded by c if each element coded by r is a prime component of one of the first i entries of the deduction coded by c, r's elements are distinct, each prime component of each of the first i entries of the deduction coded by c is among the entries in r, and if s is a smaller code number, s is missing one of these prime components:

PrimeList(c, i, r) is:
SequenceCode(c) ∧ CodeNumber(r) ∧
(∃l < r)(Length(r, l) ∧
(∀m, n < l)(∀e_m, e_n < r)
((IthElement(e_m, m, r) ∧ IthElement(e_n, n, r)) →
((∃k ≤ i)(∃f_k < c)(IthSequenceElement(f_k, k, c) ∧
PrimeComponent(e_m, f_k)) ∧
[(m ≠ n) → (e_m ≠ e_n)])) ∧
[(∀k ≤ i)(∀f_k < c)(∀u < f_k)((IthSequenceElement(f_k, k, c) ∧
PrimeComponent(u, f_k)) →
(∃m < l)IthElement(u, m, r))] ∧
(∀s < r)(CodeNumber(s) →
(∃k ≤ i)(∃f_k < c)(∃u < f_k)[IthSequenceElement(f_k, k, c) ∧
PrimeComponent(u, f_k) ∧
(∀m < l)(¬IthElement(u, m, s))])).

To decide if β is a tautology, we need to assign all possible truth values to all of the prime components in our list, and then evaluate the truth of each αᵢ under a given truth assignment. So the next thing we need to do is find a way to code up an assignment of truth values to all of the prime components of all the αᵢ's. We will say that v codes up a truth assignment if v is the code number of a sequence of the right length and all of the elements coded by v are either 1 (for false) or 2 (for true). [Although it would be convenient to use 0's and 1's in v, that runs afoul of our convention that we never code a 0.]


TruthAssignment(c, i, r, v) is: PrimeList(c, i, r) ∧ CodeNumber(v) ∧
(∃l < r)(Length(r, l) ∧ Length(v, l) ∧

(∀m < l)(∀e < v)(IthElement(e, m, v) → [e = 1 ∨ e = 2])).

Now, given a truth assignment for the prime components of α₁ through αᵢ, coded up in v, we need to be able to evaluate the truth of each formula αⱼ under that assignment. To do this we will need to be able to evaluate the truth value of a single formula.

Let us first look at an example. Here is a formula construction sequence that ends with some formula α:

(0 < x, x < y, (0 < x ∨ x < y), ¬(0 < x), (∀x)(0 < x ∨ x < y), (∀x)(0 < x ∨ x < y) ∨ (¬(0 < x))),

where the last entry is α.

Let us assume that the PrimeList we are working with is

⟨(∀x)(0 < x ∨ x < y), 0 < x⟩

and the chosen truth assignment for our PrimeList is

⟨1, 2⟩.

To assign the truth value to α, we follow along the construction sequence. When we see an entry that is in the PrimeList, we assign that entry the corresponding truth value from our truth assignment. If the entry in the construction sequence is not a prime component, one of three things might be true:

1. The entry might be universal, in which case we assign it truth value 3 (for undefined).

2. The entry might be an atomic formula that ends up inside the scope of a quantifier in α. Again, we use truth value 3.

3. The entry might be the denial of or disjunction of earlier entries in the construction sequence. In this case we can figure out its truth value, always using 3 if any of the parts have truth value 3.


So to continue our example from above, the sequence of truth values would be

⟨2, 3, 3, 1, 1, 1⟩.
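The walk just described can be sketched directly. The code below reproduces the example's truth values; once again the tuple representation of formulas is our own illustration, not the book's machinery.

```python
def evaluate_sequence(construction, prime_list, assignment):
    """Walk a construction sequence, assigning 1 (false), 2 (true),
    or 3 (undefined) to each entry, as in the worked example."""
    truth = {}                        # entry -> truth value so far
    values = []
    for entry in construction:
        if entry in prime_list:       # a prime component: look it up
            t = assignment[prime_list.index(entry)]
        elif entry[0] == "not":       # denial of an earlier entry
            s = truth[entry[1]]
            t = 3 if s == 3 else (1 if s == 2 else 2)
        elif entry[0] == "or":        # disjunction of earlier entries
            a, b = truth[entry[1]], truth[entry[2]]
            t = 3 if 3 in (a, b) else (2 if 2 in (a, b) else 1)
        else:                         # universal, or atomic in scope
            t = 3
        truth[entry] = t
        values.append(t)
    return values

forall = ("forall", "x", ("or", "0<x", "x<y"))
seq = ["0<x", "x<y", ("or", "0<x", "x<y"), ("not", "0<x"),
       forall, ("or", forall, ("not", "0<x"))]
print(evaluate_sequence(seq, [forall, "0<x"], [1, 2]))  # [2, 3, 3, 1, 1, 1]
```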

Exercise 7 asks you to write a Δ-formula Evaluate(e, r, v, y), where you should assume that e is the Gödel number for a formula α, r is a code for a list including all of the prime components of α, v is a TruthAssignment for r, and y is the truth value for α, given the truth assignment v. Exercise 8 also concerns this formula.

Now, knowing how to evaluate the truth of a single formula α, we will be able to decide if αᵢ, the ith element of the alleged deduction that is coded by c, can be justified by the propositional consequence rule. Recall that we need only check whether

(α₁ ∧ α₂ ∧ ⋯ ∧ αᵢ₋₁) → αᵢ

is a tautology. To do this, we need only see whether any truth assignment that makes α₁ through αᵢ₋₁ true also makes αᵢ true. Here is the Δ-formula that says this, where c codes the alleged deduction, and e is supposed to be the code for the ith entry in the deduction, the entry that is to be justified by an appeal to the rule PC:

PCRule(c, e, i) is: IthSequenceElement(e, i, c) ∧

(∀r < 5^c · 7^c ⋯ p_{c²+2}^c)(∀v < 2² · 3² ⋯ p_{c²}²)
  ([PrimeList(c, i, r) ∧ TruthAssignment(c, i, r, v)] →

  [((∀j < i)(∀eⱼ < c)

  (IthSequenceElement(eⱼ, j, c) → Evaluate(eⱼ, r, v, 2))) →

  Evaluate(e, r, v, 2)]).

Now, to know that this works, we must justify the bounds that we have given for r and v. The number r is supposed to code the list of prime components among the first i elements of the deduction c. So r is of the form 5^{⌜γ₁⌝} · 7^{⌜γ₂⌝} ⋯ p_{k+2}^{⌜γₖ⌝}, where the prime components are γ₁ through γₖ. First, we need to get a handle on the number of prime components there are. Since c is the code for the deduction, there are fewer than c formulas in the deduction, and each of those formulas has a Gödel number that is less than or equal to c. So each formula in the deduction has fewer than c symbols in it, and thus fewer than c prime components. So we have no more than c formulas, each with no more than c prime components, so there are no more than c² prime components total in the deduction coded by c. If we look at the number r, we see that

r = 5^{⌜γ₁⌝} · 7^{⌜γ₂⌝} ⋯ p_{k+2}^{⌜γₖ⌝} ≤ 5^c · 7^c ⋯ p_{c²+2}^c,

where the last line depends on our usual bound for the size of the (c²+2)nd prime, as we saw on page 152.

As for v, v is a code number that gives us the truth values of the various prime components coded in r. Thus v is of the form 2^{i₁} · 3^{i₂} ⋯ pₖ^{iₖ}, where there are k prime components, and each iⱼ is either 1 or 2. By the argument above, we know there are no more than c² prime components, so

v = 2^{i₁} · 3^{i₂} ⋯ pₖ^{iₖ} ≤ 2² · 3² ⋯ p_{c²}².

Thus we have justified the bounds given for r and v in the definition of PCRule(c, e, i). Thus we have a Δ-definition, and the set PCRULE is recursive.

Now we have to remember where we were on page 169. We are thinking of c as coding an alleged deduction from N, and we need to check all of the entries of c to see if they are legal. We have already written Δ-formulas LogicalAxiom and AxiomOfN, and we are working on the rules of inference. The quantifier rules were relatively easy, so we then wrote a Δ-formula PCRule(c, e, i) that is true if and only if e is the ith entry of the deduction c and can be justified by the rule PC.

At last, we are able to decide if c is the code for a deduction from N of a formula with Gödel number f:

Deduction(c, f) is: SequenceCode(c) ∧ Formula(f) ∧

(∃l < c)[SequenceLength(c, l) ∧ IthSequenceElement(f, l, c) ∧

(∀i ≤ l)(∀e < c)(IthSequenceElement(e, i, c) →

(LogicalAxiom(e) ∨ AxiomOfN(e) ∨

QRRule(c, e, i) ∨ PCRule(c, e, i)))].

This formula represents the set DEDUCTION ⊆ ℕ² and shows that DEDUCTION is a recursive set. This makes formal the ideas of Chapter 2, where it was suggested that we ought to be able to program a computer to decide if an alleged deduction is, in fact, a deduction. Keep the equivalence between recursive and "computer-decidable" in your head; we will say more about this in Chapter 5.
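The shape of Deduction(c, f) is exactly the shape of the obvious checking program. Here is a sketch, with the four checks left as stand-in parameters; any total implementations of them (such as the book's Δ-definable ones) turn this into a decision procedure.

```python
def is_deduction(entries, conclusion, is_logical_axiom, is_axiom_of_N,
                 justified_by_QR, justified_by_PC):
    """Check an alleged deduction the way Deduction(c, f) does:
    the last entry must be the target formula, and every entry must
    be an axiom or be justified by a rule of inference.

    The four predicates are hypothetical stand-ins for the book's
    checks; they take the entry (and, for the rules, the whole
    sequence and the position) and return True or False.
    """
    if not entries or entries[-1] != conclusion:
        return False
    for i, e in enumerate(entries):
        if not (is_logical_axiom(e) or is_axiom_of_N(e)
                or justified_by_QR(entries, e, i)
                or justified_by_PC(entries, e, i)):
            return False
    return True
```

Because every check is total, the whole procedure always halts, which is the informal content of DEDUCTION being recursive.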

4.12.1 Exercises

1. Write a Δ-formula SequenceCode(c) that is true if and only if c is the code of a sequence of formulas.

2. Write a Δ-formula SequenceLength(c, l) that is true if and only if c is a sequence code of a sequence of length l.

3. Write out a Δ-formula IthSequenceElement(e, i, c) that is true in 𝔑 if and only if c is a sequence code and the ith element of the sequence coded by c has Gödel number e.


4. Write out a Δ-formula QRRule2(c, e, i) that will be true if e is the ith entry in the sequence coded by c and is justified by the second quantifier rule.

5. Here is your average, ordinary tautology:

φ(x, y) is [(∀xP(x)) → (Q(x, y) → (∀xP(x)))].

Find a construction sequence for φ. Make a list of the prime components of φ. Pretending that your list of prime components is the prime list for φ, find all possible truth assignments for φ and use the truth assignments to evaluate the truth of φ under your assignments. If all goes well, every time you evaluate the truth of φ, you will get: True.

6. Repeat Exercise 5 with the following formulas, which are not (necessarily) guaranteed to be tautologies:

(a) (∀x)(x < y → x < y)
(b) (∀x)(x < y) → (∀x)(x < y)
(c) (A(x) ∨ B(y)) ∨ ¬B(x)
(d) (A(x) ∨ B(y)) → [(∀x)(∀y)(A(x) ∨ B(y))]
(e) [(∀x)(∀y)(A(x) ∨ B(y))] → (A(x) ∨ B(y))

7. Write out the Δ-formula Evaluate(e, r, v, y), as outlined in the text. You will need to think about formula construction sequences, as you did in Exercise 6 in Section 4.8.1.

8. Show by induction on the longer of the two construction sequences that if d₁ and d₂ are codes for two construction sequences of α, and if Evaluate(d₁, r, v, y₁) and Evaluate(d₂, r, v, y₂) are both true, then y₁ and y₂ are equal. Thus, the truth assigned to α does not depend upon the construction sequence chosen to evaluate that truth.

4.13 Summing Up, Looking Ahead

Well, in all likelihood you are exhausted at this point. This chapter has been full of dense, technical arguments with imposing definition piled upon imposing definition. We have established our axioms, discussed recursive sets, and talked about Δ-definitions. You have just finished wading through an unending stream of Δ-definitions that culminated with the formula Deduction(c, f), which holds if and only if c is a code for a deduction of the formula with Gödel number f. We have succeeded in coding up our deductive theory inside of number theory.

Let me reiterate this. If you look at that formula Deduction, what it looks like is a disjunction of a lot of equations and inequalities. Everything is written in the language ℒ_NT, so everything in that formula is of the form SSS0 < SS0 + x (with, it must be admitted, rather more S's than shown here). Although we have given these formulas names which suggest that they are about formulas and terms and tautologies and deductions, the formulas are formulas of elementary number theory, so the formulas don't know that they are about anything beyond whether this number is bigger than that number, no matter how much you want to anthropomorphize them. The interpretation of the numbers as standing for formulas via the scheme of Gödel numbering is imposed on those numbers by us.

The next chapter brings us to the statement and the proof of Gödel's Incompleteness Theorem. To give you a taste of things to come, notice that if we define the statement

Thm_N(f) is: (∃c)(Deduction(c, f)),

then Thm_N(f) should hold if and only if f is the Gödel number of a formula that is a theorem of N. I am sure you noticed that Thm_N is not a Δ-formula, and there is no way to fix that, as we cannot bound the length of a deduction of a formula. But Thm_N is a Σ-formula, and Proposition 4.3.9 tells us that true Σ-sentences are provable. That will be one of the keys to Gödel's proof.

Well, if 90% of the iceberg is under water, we've covered that. Now it is time to examine that glorious 10% that is left.

Chapter 5

The Incompleteness Theorems

5.1 Introduction

Suppose that A is a collection of axioms in the language of number theory such that A is consistent and is simple enough so that we can decide whether or not a given formula is an element of A. The First Incompleteness Theorem will produce a sentence, θ, such that 𝔑 ⊨ θ and A ⊬ θ, thus showing our collection of axioms A is incomplete.

The idea behind the construction of θ is really neat: We get θ to say that θ is not provable from the axioms of A. In some sense, θ is no more than a fancy version of the liar's paradox, in which the speaker asserts that the speaker is lying, inviting the listener to decide whether that utterance is a truth or a falsehood. The challenge for us is to figure out how to get an ℒ_NT-sentence to do the asserting!

You will notice that there are two parts to θ. The first is that θ will have to talk about the collection of Gödel numbers of theorems of A. That is no problem, as we will have a Σ-formula Thm_A(f) that is true (and thus provable from N) if and only if f is the Gödel number of a theorem of A. The thing that makes θ tricky is that we want θ to be ¬Thm_A(ā), where a = ⌜θ⌝. In this sense, we need θ to refer to itself. Showing that we can do that will be the content of the Self-Reference Lemma that we address in the next section.

After proving the First Incompleteness Theorem, we will discuss some corollaries to the theorem before moving on to discuss the Second Incompleteness Theorem, which states that the set of axioms of Peano Arithmetic cannot prove that Peano Arithmetic is consistent, unless (of course) Peano Arithmetic is inconsistent, in which case it can prove anything. So our goal of proving that we have a complete, consistent set of axioms for 𝔑 is a goal that cannot be reached within the confines of first-order logic.

5.2 The Self-Reference Lemma

Given any formula with only one free variable, we show how to construct a sentence that asserts that the given formula applies to itself:

Lemma 5.2.1 (Gödel's Self-Reference Lemma). Let ψ(v₁) be an ℒ_NT-formula with only v₁ free. Then there is a sentence φ such that

N ⊢ (φ ↔ ψ(⌜φ⌝)).

Chaff: Look at how neat this is! Do you see how φ "says" ψ is true of me? And we can do this for any formula ψ! What a cool idea!

Proof. We will explicitly construct the needed φ. Recall that we have recursive functions Num : ℕ → ℕ and Sub : ℕ³ → ℕ such that Num(n) = ⌜n̄⌝ and Sub(⌜α⌝, ⌜x⌝, ⌜t⌝) = ⌜α_t^x⌝, the Gödel number of the result of substituting the term t for the variable x in α. Since these functions are represented by our Δ-definitions of Section 4.9, we know that

N ⊢ (Num(ā, y) ↔ y = Num(a)), and that

N ⊢ (Sub(ā, b̄, c̄, z) ↔ z = Sub(a, b, c)).

Chaff: Remember, Num is the function and Num is the ℒ_NT-formula that represents the function! Oh, and just because we're going to need it, ⌜v₁⌝ = 4.

Now suppose that ψ(v₁) is given as in the statement of the lemma. Let γ(v₁) be

∀y∀z [(Num(v₁, y) ∧ Sub(v₁, 4̄, y, z)) → ψ(z)].

Let us look at γ(n̄) a little more closely, supposing that n = ⌜α⌝. If the antecedent of γ(n̄) holds, then the first part of the antecedent tells us that

y = Num(n) = ⌜n̄⌝

and the second part of the antecedent asserts that

z = Sub(n, 4, ⌜n̄⌝) = Sub(⌜α⌝, ⌜v₁⌝, ⌜n̄⌝) = ⌜α(n̄)⌝.

So z is the Gödel number of {α with the Gödel number of α substituted in for v₁}.

One more tricky choice will get us to where we want to go. Let m = ⌜γ(v₁)⌝, and let φ be γ(m̄). Certainly, φ is a sentence, so we will be finished if we can show that N ⊢ φ ↔ ψ(⌜φ⌝).

Let us work through a small calculation first. Notice that

Sub(m, 4, ⌜m̄⌝) = Sub(⌜γ(v₁)⌝, ⌜v₁⌝, ⌜m̄⌝)
              = ⌜γ(m̄)⌝
              = ⌜φ⌝.    (5.1)

With this in hand, the following are provably equivalent in N:

φ

∀y∀z [Num(m̄, y) → (Sub(m̄, 4̄, y, z) → ψ(z))]    logic

∀y∀z [y = Num(m) → (Sub(m̄, 4̄, y, z) → ψ(z))]    Num represents Num

∀y∀z [y = ⌜m̄⌝ → (Sub(m̄, 4̄, y, z) → ψ(z))]    calculation

∀z [Sub(m̄, 4̄, ⌜m̄⌝, z) → ψ(z)]    quantifier rules

∀z [z = Sub(m, 4, ⌜m̄⌝) → ψ(z)]    Sub represents Sub

∀z [z = ⌜φ⌝ → ψ(z)]    calculation (5.1) above

ψ(⌜φ⌝)    quantifier rules

So N ⊢ φ ↔ ψ(⌜φ⌝), as needed. □

Notice in this proof that if ψ is a Π-formula, then φ is logically equivalent to a Π-sentence. By altering γ slightly we can also arrange, if ψ is a Σ-formula, to have φ logically equivalent to a Σ-sentence.
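The substitution trick in this proof is the same one that produces self-reproducing programs (quines). As a loose analogue, with strings standing in for Gödel numbers and Python's %-formatting standing in for Sub, diagonalization plugs a "formula" into itself:

```python
def diagonalize(gamma):
    """String analogue of the Self-Reference Lemma.

    gamma is a 'formula' with one hole, written with %r where its own
    code should go; diagonalizing plugs gamma's own text into that
    hole, just as phi = gamma(numeral of the code of gamma) does in
    the proof.  No circular definition is needed anywhere.
    """
    return gamma % gamma

# A 'formula' that ends up talking about the very sentence containing it:
gamma = 'the sentence built from %r talks about its own code'
phi = diagonalize(gamma)
print(phi)
```

This is only an analogy: the lemma works with numerals and a represented substitution function, not with strings. But the two-step structure (first build γ with a hole, then apply γ to its own code) is identical.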


5.2.1 Exercises

1. Suppose that ψ(v₁) is Formula(v₁). By the Self-Reference Lemma, there is a sentence φ such that N ⊢ (φ ↔ Formula(⌜φ⌝)). Does N ⊢ φ? Does N ⊢ ¬φ? Justify your answer. What happens if we use ψ(v₁) = ¬Formula(v₁) instead?

2. Let ψ(v₁) be Even(v₁), and let φ be the sentence generated when the Self-Reference Lemma is applied to ψ. Does N ⊢ φ? Does N ⊢ ¬φ? How can you tell?

3. Show that the proof of the Self-Reference Lemma still works if we use

γ(v₁) = ∃y∃z [Num(v₁, y) ∧ Sub(v₁, 4̄, y, z) ∧ ψ(z)].

Conclude that if ψ is a Σ-formula, then the φ of the Self-Reference Lemma can be taken to be equivalent to a Σ-sentence.

5.3 The First Incompleteness Theorem

We are ready to state and prove the First Incompleteness Theorem, which tells us that if we are given any reasonable axiom system A, there is a sentence that is true in 𝔑 but not provable from A.

You may have been complaining all along about my choice for an axiom system. Perhaps you have been convinced from the beginning that N is clearly too weak to prove every truth about the natural numbers. You are right. We know, for example, that N does not prove the commutative law of addition. Since you are a diligent person, I imagine that you have come up with an axiom system of your own; let's call it A. You might be convinced that A is the "right" choice of axioms, a set of axioms that is strong enough to prove every truth about 𝔑. Although this shows admirable independence on your part, we will, unfortunately, be able to prove that A is no better than N, as long as A satisfies certain reasonable conditions.

Definition 5.3.1. A theory is a collection of sentences T that is closed under deduction: For every sentence σ, if T ⊢ σ, then σ ∈ T. If A is a set of sentences, then the theory of A, written Th(A), is the smallest theory that includes A: Th(A) = {σ | A ⊢ σ}.


Definition 5.3.2. A theory T in the language ℒ_NT is said to be recursively axiomatized if there is a set of axioms A such that

1. T = {σ | σ is a sentence and A ⊢ σ}.

2. AXIOMOFA = {⌜α⌝ | α ∈ A} is a recursive set.

Roughly, a theory is recursively axiomatized if the theory has an axiom set that is simple enough so that we can recognize axioms and nonaxioms. One of the conditions that we will require of your set of axioms A is that it be recursive. The only other condition is that A needs to be consistent.

Here is the theorem:

Theorem 5.3.3 (Gödel's First Incompleteness Theorem). Suppose that A is a consistent and recursive set of axioms in the language ℒ_NT. Then there is a Π-sentence θ such that 𝔑 ⊨ θ but A ⊬ θ.

Proof. Although our analysis in Chapter 4 was worked out with the specific set of axioms N, it is easy to see that as long as the set AXIOMOFA is recursive, there is a recursive set DEDUCTION_A, represented by the formula Deduction_A(c, f), such that c codes up an A-deduction of the formula with Gödel number f if and only if N ⊢ Deduction_A(c̄, f̄).

If we let the formula Thm_A(v₁) be (∃c)(Deduction_A(c, v₁)), then Thm_A is a Σ-formula. Therefore, by Proposition 4.3.9, we know that if 𝔑 ⊨ Thm_A(f̄), then for some code number c, 𝔑 ⊨ Deduction_A(c̄, f̄). Therefore, N ⊢ Deduction_A(c̄, f̄), as Deduction_A(c̄, f̄) is a true Δ-sentence. But this means that N ⊢ Thm_A(f̄). To summarize:

If 𝔑 ⊨ Thm_A(f̄), then N ⊢ Thm_A(f̄).    (5.2)

We assume that A is strong enough to prove all of the axioms of N. If not, the universal closure of any unprovable-from-A axiom of N will be a Π-sentence that is true in 𝔑 and unprovable from A.

With this assumption, use the Self-Reference Lemma to produce a sentence θ such that

N ⊢ (θ ↔ ¬Thm_A(⌜θ⌝)).

Chaff: Do you see how θ "says" I am not a theorem?


Notice by the comments following the proof of the Self-Reference Lemma that θ can be taken to be logically equivalent to a Π-sentence.

Now 𝔑 ⊨ (θ ↔ ¬Thm_A(⌜θ⌝)), so we know that

𝔑 ⊨ θ ↔ 𝔑 ⊭ Thm_A(⌜θ⌝)
      ↔ ⌜θ⌝ ∉ THM_A
      ↔ A ⊬ θ,

so θ is either true in 𝔑 and not provable from A, or false in 𝔑 and provable from A.

Assume, for the moment, that θ is false and A ⊢ θ. Then A ⊢ θ, so 𝔑 ⊨ Thm_A(⌜θ⌝). By (5.2) this means that N ⊢ Thm_A(⌜θ⌝). But then, by our choice of θ, N ⊢ ¬θ. Since A proves all of the axioms of N, this implies that A ⊢ ¬θ. But we already have assumed that A ⊢ θ, which means that A is inconsistent, contrary to our assumption on A.

Thus, A ⊬ θ and 𝔑 ⊨ θ, and θ is true and unprovable, as needed. □

Corollary 5.3.4. If A is a consistent, recursive set of axioms in the language ℒ_NT, then

THM_A = {a | a is the Gödel number of a formula derivable from A}

is not recursive.

Proof. Suppose that THM_A is recursive. Then some formula γ(v₁) represents the set in N, which means that

If f ∈ THM_A, then N ⊢ γ(f̄).
If f ∉ THM_A, then N ⊢ ¬γ(f̄).

If we construct θ as in the proof of the First Incompleteness Theorem (using ¬γ in place of ¬Thm_A), we know that θ is either true and unprovable or false and provable. We also know that the assumption that θ is false and provable leads to a contradiction. Therefore, θ is true and unprovable. But then, since γ(v₁) represents THM_A and ⌜θ⌝ ∉ THM_A, N ⊢ ¬γ(⌜θ⌝). But then, by our choice of θ, N ⊢ θ, so A ⊢ θ, contradicting the fact that θ is unprovable.

So our assumption that THM_A is recursive leads to a contradiction, and we are led to conclude that THM_A is not recursive. □


Chaff: This corollary is the "computers will never put mathematicians out of a job" corollary: If you accept the identification between recursive sets and sets for which a computer can decide membership, Corollary 5.3.4 says that we will never be able to write a computer program which will accept as input an ℒ_NT-formula φ and will produce as output "φ is a theorem" if A ⊢ φ and "φ is not a theorem" if A ⊬ φ.

If you think of the computer as taking φ and systematically listing all deductions and checking to see if it has listed a deduction-of-φ, it is easy to see that if φ is, in fact, a theorem-of-A, the computer will eventually verify that fact. If, however, φ is not a theorem, the computer will never know. All the computer will know is that it has not succeeded, as of this moment, in finding a deduction of φ. But it will not be able to say that it will never come across a deduction of φ. This is saying that the collection of theorems-of-A is recursively enumerable, but not recursive.
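The Chaff's search procedure can be sketched in a few lines. The checker for DEDUCTION_A is abstracted as a parameter, since all the sketch needs is that it be a total, decidable test:

```python
from itertools import count

def is_theorem_semidecide(phi_code, is_deduction_of):
    """Search c = 1, 2, 3, ... for a code of a deduction of phi.

    is_deduction_of(c, phi_code) is a stand-in for the recursive
    relation DEDUCTION_A.  If phi is a theorem, some c works, the
    loop halts, and we return True; if phi is not a theorem, the
    loop runs forever.  That asymmetry is exactly the sense in which
    the set of theorems is recursively enumerable but not recursive.
    """
    for c in count(1):
        if is_deduction_of(c, phi_code):
            return True
```

Calling this with a formula that does have a deduction terminates; calling it with one that does not never returns, and no amount of waiting distinguishes "no deduction exists" from "none found yet."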

We can, in fact, dispense with the requirement that A be a recursive set of axioms:

Theorem 5.3.5. Suppose that A is a consistent set of axioms extending N and in the language ℒ_NT. Then the set THM_A is not representable in A (and therefore THM_A is not recursive).

Proof. Suppose, to the contrary, that γ(v₁) represents THM_A. As usual, let θ be such that

N ⊢ (θ ↔ ¬γ(⌜θ⌝)).

As A extends N, certainly

A ⊢ (θ ↔ ¬γ(⌜θ⌝)).


Now

A ⊢ θ → ⌜θ⌝ ∈ THM_A
      → A ⊢ γ(⌜θ⌝)        since γ represents THM_A
      → A ⊢ ¬θ            choice of θ
      → A ⊬ θ             A is consistent

and

A ⊬ θ → ⌜θ⌝ ∉ THM_A
      → A ⊢ ¬γ(⌜θ⌝)       since γ represents THM_A
      → A ⊢ θ             choice of θ.

This contradiction completes the proof. □

The short version of Theorem 5.3.5 is: Any consistent theory extending N is undecidable, where "undecidable" means not recursive.

Let us apply Theorem 5.3.5 to a particular theory, the theory of the natural numbers, Th(𝔑).

Theorem 5.3.6 (Tarski's Theorem). The set of Gödel numbers of formulas true in 𝔑 is not definable in 𝔑.

Proof. Recall that for any set A ⊆ ℕ, to say φ defines A in 𝔑 means that

If a ∈ A, then 𝔑 ⊨ φ(ā).
If a ∉ A, then 𝔑 ⊨ ¬φ(ā).

Since Th(𝔑) ⊢ σ if and only if 𝔑 ⊨ σ, we can rewrite this statement as

If a ∈ A, then Th(𝔑) ⊢ φ(ā).
If a ∉ A, then Th(𝔑) ⊢ ¬φ(ā).

So φ defines a set in 𝔑 if and only if φ represents the set in Th(𝔑). The set in question is

TRUEIN𝔑 =

{a | a is the Gödel number of a formula that is true in 𝔑}.


Notice that this is precisely the set

THM_Th(𝔑) =

{a | a is the Gödel number of a formula provable from Th(𝔑)},

so TRUEIN𝔑 is definable if and only if THM_Th(𝔑) is representable in Th(𝔑). But Th(𝔑) is a consistent set of axioms extending N, so by Theorem 5.3.5, THM_Th(𝔑) is not representable in Th(𝔑). So TRUEIN𝔑 is not definable. □

5.3.1 Exercises

1. Prove the theorem that is implicit in Definition 5.3.1: If A is a set of sentences, then {σ | σ is a sentence and A ⊢ σ} is a theory.

2. Suppose that 𝔄 is an ℒ-structure, and consider Th(𝔄), as defined in Definition 3.3.4. Prove that Th(𝔄) is a theory in the sense of Definition 5.3.1.

3. The First Incompleteness Theorem gives us a Π-sentence θ such that 𝔑 ⊨ θ and A ⊬ θ. Can we find a Σ-sentence with the same characteristics? Please justify your answer.

4. Show that if X, Y ⊆ ℕ are recursive sets, then so are X ∪ Y and ℕ − X. This fact is used in the comment in the proof of Gödel's First Incompleteness Theorem that the development in Chapter 4 would work for any recursive set of axioms. In Chapter 4 we used heavily the fact that we could write Δ-definitions for everything, but there are recursive sets that do not have Δ-definitions.
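For Exercise 4, the intuition on the computability side of the identification is that total decision procedures compose. The sketch below is an illustration of that idea, not a substitute for the requested proof:

```python
def union_decider(decide_X, decide_Y):
    """A decider for X ∪ Y built from total deciders for X and Y."""
    return lambda n: decide_X(n) or decide_Y(n)

def complement_decider(decide_X):
    """A decider for N − X built from a total decider for X."""
    return lambda n: not decide_X(n)

# Toy recursive sets: the evens and the multiples of three.
evens = lambda n: n % 2 == 0
threes = lambda n: n % 3 == 0

print(union_decider(evens, threes)(9))    # True  (9 is a multiple of 3)
print(complement_decider(evens)(9))       # True  (9 is odd)
```

The key point is totality: because each given decider halts on every input, so do the composed ones. The same construction fails for merely recursively enumerable sets, where the complement step has nothing total to negate.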

5.4 Extensions and Refinements of Incompleteness

If you look carefully at the First Incompleteness Theorem, it does not quite say that the collection of axioms A is incomplete. All that is claimed is that there is a sentence θ such that θ is true-in-𝔑 and θ is not provable from A. But, perhaps, ¬θ is provable from A. Our first result in this section brings the focus onto incompleteness.

Proposition 5.4.1. Suppose that A is a consistent, recursive set of axioms that proves all of the axioms of N. If all of the axioms of A are true in 𝔑, then there is a sentence θ such that A ⊬ θ and A ⊬ ¬θ.


Proof. As in the proof of the First Incompleteness Theorem, let θ be such that

N ⊢ (θ ↔ ¬Thm_A(⌜θ⌝)).

We know that 𝔑 ⊨ θ and A ⊬ θ. Suppose that A proves ¬θ. Then, as all of the axioms of A are true in the structure 𝔑, we know that 𝔑 ⊨ ¬θ, which contradicts the fact that 𝔑 ⊨ θ. Thus A ⊬ ¬θ, and A is incomplete. □

We can also eliminate the hypothesis that the axioms of A be true in 𝔑. To prove that A is incomplete, all that will be required of our axioms will be that they form a consistent extension of N. We will prove this result in two steps, starting by strengthening the hypothesis of consistency to ω-consistency. Then, in Rosser's Theorem, we will show how a slightly trickier use of the Self-Reference Lemma can show that any consistent, recursive extension of N must be incomplete.

Definition 5.4.2. A theory T in ℒ_NT is said to be ω-inconsistent if there is a formula φ(x) such that T ⊢ ∃xφ(x), but for each natural number n, T ⊢ ¬φ(n̄). Otherwise, T is called ω-consistent.

Proposition 5.4.3. If T is ω-consistent, then T is consistent.

Proof. Exercise 2. □

The converse of this proposition is false, as you are asked to show in Exercise 3. Also notice that if T is a theory such that 𝔑 ⊨ T, then T is necessarily ω-consistent. So the chain of implications looks like this:

true in 𝔑 ⇒ ω-consistent ⇒ consistent.

We already know that if A is a recursive, true-in-𝔑 extension of N, then A is incomplete. In Proposition 5.4.4 we will use the same sentence θ as in the First Incompleteness Theorem to show that an ω-consistent recursive extension of N is incomplete, then in Theorem 5.4.5 we will use a slightly trickier sentence ρ to show that mere consistency suffices: Any consistent, recursive extension of N must be incomplete.

Proposition 5.4.4. If A is an ω-consistent recursive set of axioms extending N, then A is incomplete.


Proof. As usual, let θ be such that N ⊢ [θ ↔ ¬Thm_A(⌜θ⌝)]. We already know that 𝔑 ⊨ θ and A ⊬ θ. We will show that A ⊬ ¬θ, and thus A is incomplete.

Assume that A ⊢ ¬θ; then we know by our choice of θ that A ⊢ Thm_A(⌜θ⌝). In other words,

A ⊢ (∃x) Deduction_A(x, ⌜θ⌝).    (5.3)

Since we also know that A ⊬ θ, we know, for each natural number n, that n is not the code for a deduction of θ. So

For each n ∈ ℕ, (n, ⌜θ⌝) ∉ DEDUCTION_A,

and thus, as the formula Deduction_A represents the set DEDUCTION_A,

For each n ∈ ℕ, N ⊢ ¬Deduction_A(n̄, ⌜θ⌝).

Since A is an extension of N, we have

For each n ∈ ℕ, A ⊢ ¬Deduction_A(n̄, ⌜θ⌝),

which, when combined with (5.3), implies that A is ω-inconsistent, contrary to hypothesis. So our assumption must be wrong, and A ⊬ ¬θ, as needed. □

We have gotten a lot of mileage out of our sentence θ. Rosser's Theorem uses a different sentence to get a stronger result.

Theorem 5.4.5 (Rosser's Theorem). If A is a set of ℒ_NT-axioms that is recursive, consistent, and extends N, then A is incomplete.

Proof. We use the Self-Reference Lemma to construct a sentence ρ such that

N ⊢ ρ ↔

(∀x)[Deduction_A(x, ⌜ρ⌝) → (∃y < x)(Deduction_A(y, ⌜¬ρ⌝))].

So ρ says, "If there is a proof of ρ, then there is a proof of ¬ρ with a smaller code."


First, we claim that A ⊬ ρ. Assume, on the contrary, that A ⊢ ρ. Let a be a number that codes up a deduction of ρ. Since the formula Deduction_A represents the set DEDUCTION_A, we know that

A ⊢ Deduction_A(ā, ⌜ρ⌝).

Also, by choice of ρ, we know that

A ⊢ (∀x)[Deduction_A(x, ⌜ρ⌝) → (∃y < x)(Deduction_A(y, ⌜¬ρ⌝))],

which means that

A ⊢ [Deduction_A(ā, ⌜ρ⌝) → (∃y < ā)(Deduction_A(y, ⌜¬ρ⌝))].

But we know that A ⊢ Deduction_A(ā, ⌜ρ⌝), so we are led to conclude that

A ⊢ (∃y < ā)(Deduction_A(y, ⌜¬ρ⌝)).    (5.4)

On the other hand, we have assumed that A ⊢ ρ and A is consistent. Therefore, we know, for each n ∈ ℕ, that

A ⊢ ¬Deduction_A(n̄, ⌜¬ρ⌝).

But then, by Rosser's Lemma (Lemma 4.3.7), we know that

A ⊢ ¬(∃y < ā)(Deduction_A(y, ⌜¬ρ⌝)),

which, when combined with (5.4), shows that A is inconsistent, a contradiction. So we conclude that A ⊬ ρ, as claimed.

We also claim that A ⊬ ¬ρ. For assume that b is a code for a deduction of ¬ρ. Then

A ⊢ Deduction_A(b̄, ⌜¬ρ⌝).    (*)

Now, since A ⊢ ¬ρ, from the choice of ρ we know that

A ⊢ (∃x)(Deduction_A(x, ⌜ρ⌝) ∧ (∀y)¬[y < x ∧ Deduction_A(y, ⌜¬ρ⌝)]).

So if we substitute b̄ for y, we see that

A ⊢ (∃x)(Deduction_A(x, ⌜ρ⌝) ∧ ¬[b̄ < x ∧ Deduction_A(b̄, ⌜¬ρ⌝)]).

But as the formula (*) is provable from A, this means that

A ⊢ (∃x)[Deduction_A(x, ⌜ρ⌝) ∧ ¬(b̄ < x)],

which is equivalent to

A ⊢ (∃x)[x ≤ b̄ ∧ Deduction_A(x, ⌜ρ⌝)].

Now, since we have assumed that A ⊢ ¬ρ and A is consistent, we know that for each n ∈ ℕ, A ⊢ ¬Deduction_A(n̄, ⌜ρ⌝), and so by Rosser's Lemma again,

A ⊢ ¬(∃x)[x ≤ b̄ ∧ Deduction_A(x, ⌜ρ⌝)],

which shows that A is inconsistent, contrary to assumption. Therefore, A ⊬ ¬ρ.

Since A ⊬ ρ and A ⊬ ¬ρ, we know that A is incomplete, as needed. □

5.4.1 Exercises

1. Suppose that A and B are theories in some language and A ⊆ B. Suppose that B is consistent. Show that A is consistent. What happens if B is ω-consistent?

2. Prove Proposition 5.4.3. [Suggestion: Try the contrapositive.]

3. Find an example of a theory that is consistent but not ω-consistent. [Suggestion: If you can construct the correct sort of model, 𝔄, then Th(𝔄) will be consistent and ω-inconsistent.]

4. Suppose θ is such that

N ⊢ (θ ↔ Thm_A(⌜¬θ⌝)).

So θ asserts its own refutability. Is θ true? Provable? Refutable?

5.5 Another Proof of Incompleteness


As we mentioned in the introduction to this chapter, the sentence θ of the First Incompleteness Theorem can be seen as a formalization of the liar paradox, where a speaker asserts that what the speaker says is false. In this section we will outline a proof, due to George Boolos [Boolos 94], of the First Incompleteness Theorem that is based upon Berry's paradox.

G. G. Berry, a librarian at Oxford University at the beginning of the twentieth century, is credited by Bertrand Russell with the observation that the least integer not nameable in fewer than nineteen syllables is nameable in eighteen syllables. We will formalize a version of Berry's phrase to come up with another sentence that is true in 𝔑 but not provable.

For our argument to work we need to make a minor change in our language. It will be important that our language have only finitely many symbols, and to make ℒ_NT finite, we have to rework the way that we denote variables. So, for this section, rather than having Vars be the infinite set of variables v₁, v₂, …, vₙ, …, and thinking of each vᵢ as its own symbol, we will think of the variables as sequences of symbols. So a string such as v12 is no longer a single symbol but is, rather, three symbols. The set Vars then is defined to be the collection of finite strings of symbols that are of the form vs, where s is a string of numerals. Thus the symbols of ℒ_NT are

{(, ), ∨, ¬, ∀, =, v, 1, 2, …, 9, 0, S, +, ·, E, <},

giving us precisely 23 symbols in the language, and ℒ_NT is finite. We restate the First Incompleteness Theorem:

Theorem 5.5.1. Suppose that Q is a consistent and recursive set of ℒ_NT-formulas. Then there is a sentence β such that 𝔑 ⊨ β and Q ⊬ β.

Outline of Proof. We can assume that Q proves all of the axioms of N, since if not, one of the axioms of N would do for β.

Suppose that φ(x) is a formula of ℒ_NT. We say that φ(x) names the natural number n with respect to the axioms Q if and only if

Q ⊢ ((∀x)(φ(x) ↔ x = n̄)).

Notice that no formula can name more than one number.

Here is where we use the fact that our language is finite. Fix a number i. Since there are only 23 symbols in our language, there are no more than 23^i formulas of length i, where the length of a formula is the number of symbols that it contains. So no more than 23^i numbers can be named by formulas of length i.
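The count is crude but sufficient: overcounting formulas by arbitrary symbol strings already gives finiteness, and finiteness is all the pigeonhole step needs. A quick illustration:

```python
ALPHABET = 23   # symbols in the finite version of the language

def strings_up_to(m):
    """Number of nonempty symbol strings of length at most m.

    This overcounts the formulas of length at most m, and hence
    overcounts the numbers nameable by such formulas."""
    return sum(ALPHABET ** i for i in range(1, m + 1))

# For any m there are only finitely many candidates, so among the
# numbers 0, 1, ..., strings_up_to(m) at least one is named by no
# formula of length <= m, and hence there is a least such number.
print(strings_up_to(2))   # 23 + 23**2 = 552
```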


So for each number m, there are only finitely many numbers that can be named by formulas of length less than or equal to m. Thus, for each m, there are some numbers that cannot be named by formulas of length less than or equal to m, so there is a least number not named by any formula containing no more than m symbols.

Now there is a formula η(v₁, l) with two free variables that says that v₁ is a number named by a formula of length l. You are asked in Exercise 5 to find η. Then we can let δ(v₁, v₂) be (∃l < v₂)η(v₁, l). Thus δ(v₁, v₂) says that the number v₁ is nameable in fewer than v₂

symbols. Two more formulas get us home. Define γ(v₁, v₂) by

γ(v₁, v₂) is [¬δ(v₁, v₂)] ∧ [(∀x < v₁)δ(x, v₂)].

So γ(v₁, v₂) says that v₁ is the least number not nameable in fewer than v₂ symbols. Let k be the number of symbols in γ(v₁, v₂). Notice that k > 4. (How's that for a bit of an understatement?)

We can now define

α(v₁) is (∃v₂)(v₂ = 10̄ · k̄ ∧ γ(v₁, v₂)).

So α(v₁) claims that v₁ is the least number not nameable in fewer than 10k symbols, where k is the number of symbols in γ(v₁, v₂). If we write out α(v₁) formally, we see that

α(v₁) is (¬(∀v₂)(¬(v₂ = 10̄ · k̄ ∧ γ(v₁, v₂)))).

Let us count the symbols in α(v₁). There are 11 symbols in 10̄, k + 1 in k̄, and k in γ(v₁, v₂). By using both my fingers and my toes, I get a total of 11 + (k + 1) + k + 18 = 2k + 30 symbols in α(v₁). A bit of algebra tells us that since k > 4, 10k > 2k + 30 (as 8k > 30), so α(v₁) contains fewer than 10k symbols.

Suppose that b is the smallest number not named by any formula with fewer than 10k symbols. Since α(v₁) has fewer than 10k symbols, certainly b is not named by the formula α(v₁). By looking back at our definition of what it means for a formula to name a number, we see that

Q ⊬ ((∀v₁)(α(v₁) ↔ v₁ = b̄)). But this is where we want to be. Let β be the sentence

((∀v₁)(α(v₁) ↔ v₁ = b̄)).

194 Chapter 5. The Incompleteness Theorems

We just saw that Q does not prove the sentence β. But β is a true statement about the natural numbers, for the number b is the least number that is not nameable in fewer than 10k symbols, and that is precisely the meaning of the sentence β. So 𝔑 ⊨ β and Q ⊬ β, as needed. □

One big difference between this proof and our first proof of the First Incompleteness Theorem is that this proof does not use the Self-Reference Lemma, as we do not have to substitute the Gödel number of a formula into itself; rather, we substitute the numeral of a number that makes the formula true. But both proofs do rely on Gödel numbering and coding deductions, so the mechanism of Chapter 4 comes into play for both.

5.5.1 Exercises

1. Give an argument, perhaps based on Church's Thesis or perhaps using a variant of Theorem 4.10.2, to show that there is a Δ-formula LengthOfFormula(f, l) that is true if and only if f is the Gödel number of a formula consisting of exactly l symbols. [Suggestion: You may need to start by showing the existence of a formula LengthOfTerm with two free variables.]

2. Prove that no formula can name two different numbers. [Suggestion: Think about what it would mean if φ(x) named both 17 and 42.]

3. Find an upper bound for the number of numbers that can be named by formulas φ(x) that contain no more than m symbols.

4. Suppose that φ(x) names n with respect to the set of axioms N. Show that φ(x) represents {n}.

5. Find a formula η(n, l) that says that n is named by a formula of length l. Is your formula equivalent to a Σ-formula?

6. The statement of Theorem 5.5.1 makes two assumptions about the collection of axioms Q. Where are they used in the proof?


5.6 Peano Arithmetic and the Second Incompleteness Theorem

It is our goal in this section to show that a set of axioms cannot prove its own consistency. Now this statement needs to be sharpened, for of course some sets of axioms can prove their own consistency. For example, the axiom set A might contain the statement "A is consistent." But that will lead to problems, as we will show.

The first order of business will be to introduce a new collection of axioms, called PA, or the axioms of Peano Arithmetic. This extension of N will be recursive and will be true in 𝔑, so all of the results of this chapter will apply to PA. We will then state, without proof, three properties that are true of PA, properties that are needed for the proof of the Second Incompleteness Theorem.

The Second Incompleteness Theorem is, in some sense, nothing more than finding another true and unprovable statement, but the statement that we will find is much more natural and has a longer history than the sentence θ of the First Incompleteness Theorem. As we mentioned on page 1, at the beginning of the twentieth century, the German mathematician David Hilbert proposed that the mathematical community set itself the goal of proving that mathematics is consistent. In the Second Incompleteness Theorem, we will see that no extension of Peano Arithmetic can prove itself to be consistent, and thus certainly any plan for a self-contained proof of consistency must be doomed to failure. This was the blow that Gödel delivered to Hilbert's consistency program. Understanding the ideas behind this second proof is our current goal. We will not fill in all of the details of the construction. The interested reader is directed to Craig Smorynski's article in [Barwise 77], on which our presentation is based.

We begin by establishing our new set of nonlogical axioms, the axioms of Peano Arithmetic.

Definition 5.6.1. The axioms of Peano Arithmetic, or PA, are

the eleven axioms of N together with the axiom schema

[φ(0) ∧ (∀x)[φ(x) → φ(Sx)]] → (∀x)φ(x)

for each ℒ_NT-formula φ with one free variable. So Peano Arithmetic is nothing more than the familiar set of

axioms N, together with an induction schema for ℒ_NT-definable sets.


Although we will not go through the details, it is easy to see that the set AxiomOfPA is recursive, so PA is a recursively axiomatized extension of N.

What makes PA useful to us is that PA is strong enough to prove certain facts about derivations in PA. In particular, PA is strong enough so that the following derivability conditions hold for all formulas φ:

If PA ⊢ φ, then PA ⊢ Thm_PA(⌜φ⌝). (D1)

PA ⊢ Thm_PA(⌜φ⌝) → Thm_PA(⌜Thm_PA(⌜φ⌝)⌝). (D2)

PA ⊢ [Thm_PA(⌜φ⌝) ∧ Thm_PA(⌜φ → ψ⌝)] → Thm_PA(⌜ψ⌝). (D3)

Granting these conditions, we can move on to prove the Second Incompleteness Theorem. Recall that we agreed to use the symbol ⊥ for the contradictory sentence [(∀x)x = x] ∧ ¬[(∀x)x = x].

Definition 5.6.2. The sentence Con_PA is the sentence ¬Thm_PA(⌜⊥⌝).

Notice that 𝔑 ⊨ Con_PA if and only if PA is a consistent set of axioms. For if PA is not consistent, then PA can prove anything, including ⊥. If PA is consistent, since we know that there is a proof in PA of ¬⊥, there must not be a proof of ⊥.

Theorem 5.6.3 (Gödel's Second Incompleteness Theorem). If Peano Arithmetic is consistent, then PA ⊬ Con_PA.

Proof. Let θ be, as usual, the statement generated by the Self-Reference Lemma, but this time we apply the lemma to the formula ¬Thm_PA(v₁). So θ is such that

PA ⊢ (θ ↔ ¬Thm_PA(⌜θ⌝)). (5.5)

We know that PA ⊬ θ, as PA is a recursive consistent extension of N, all of whose axioms are true in 𝔑 (Proposition 5.4.1). We will show that

PA ⊢ (θ ↔ Con_PA).

If it were the case that PA ⊢ Con_PA, then we would have PA ⊢ θ, a contradiction. Thus PA ⊬ Con_PA. [For this proof, all we really need


is that PA ⊢ (Con_PA → θ), but the other direction is used in the Exercises.]

So all that is left is to show that PA ⊢ (θ ↔ Con_PA). For the forward direction, since ⊥ is the denial of a tautology, we

know that PA ⊢ (⊥ → θ),

so by the first derivability condition (D1) we know that PA ⊢ Thm_PA(⌜⊥ → θ⌝).

This implies, via (D3), that

PA ⊢ Thm_PA(⌜⊥⌝) → Thm_PA(⌜θ⌝),

which is equivalent to

PA ⊢ ¬Thm_PA(⌜θ⌝) → ¬Thm_PA(⌜⊥⌝). (5.6)

Now, if we combine (5.5) and (5.6), we see that

PA ⊢ θ → ¬Thm_PA(⌜⊥⌝),

which is equivalent to

PA ⊢ θ → Con_PA,

which is half of what we need to prove.

For the converse, notice that from derivability condition (D2),

PA ⊢ Thm_PA(⌜θ⌝) → Thm_PA(⌜Thm_PA(⌜θ⌝)⌝). (5.7)

Since we also know that PA ⊢ θ ↔ ¬Thm_PA(⌜θ⌝), the sentence

Thm_PA(⌜Thm_PA(⌜θ⌝) → ¬θ⌝)

is a true Σ-sentence. Since N suffices to prove true Σ-sentences, certainly

PA ⊢ Thm_PA(⌜Thm_PA(⌜θ⌝) → ¬θ⌝). (5.8)

Now, if we take (5.8) and derivability condition (D3), we have

PA ⊢ Thm_PA(⌜Thm_PA(⌜θ⌝)⌝) → Thm_PA(⌜¬θ⌝). (5.9)


If we combine (5.7) and (5.9), we see that

PA ⊢ Thm_PA(⌜θ⌝) → Thm_PA(⌜¬θ⌝). (5.10)

Now, since we know that the sentence θ → [(¬θ) → ⊥] is a tautology, the statement

Thm_PA(⌜θ → [(¬θ) → ⊥]⌝)

is a true Σ-sentence, so

PA ⊢ Thm_PA(⌜θ → [(¬θ) → ⊥]⌝). (5.11)

Once again, using the derivability condition (D3) twice on (5.11), we find that

PA ⊢ Thm_PA(⌜θ⌝) → [Thm_PA(⌜¬θ⌝) → Thm_PA(⌜⊥⌝)],

and if we combine that with (5.10), we see that

PA ⊢ Thm_PA(⌜θ⌝) → Thm_PA(⌜⊥⌝),

which is equivalent to

PA ⊢ ¬Thm_PA(⌜⊥⌝) → ¬Thm_PA(⌜θ⌝),

which, when we translate the antecedent and use the definition of θ on the consequent, combines with (5.5) to give us

PA ⊢ Con_PA → θ,

which is what we needed to prove. •

The proof of the Second Incompleteness Theorem is technical, but the result is fabulous: If Peano Arithmetic is consistent, it cannot prove its own consistency. You will not be surprised to find out that the same result holds for any consistent set of axioms extending Peano Arithmetic that can be represented by a Σ-formula.

Chaff: Time for a bit of technical stuff that is pretty neat. To be precise, the way in which we code up the axioms of Peano Arithmetic is important in the statement of the Second Incompleteness Theorem. The set


AxiomOfPA is recursive, and thus there is a formula φ(x) that represents that set. In fact, the formula φ(x) can be taken to be a Δ-formula, and if you use that φ, then Gödel's result is as we have stated it. But there are other formulas that you could use to represent the set of axioms of Peano Arithmetic, and Solomon Feferman proved in 1960 that it is possible to express consistency using one of these other formulas in such a way that PA ⊢ Con_PA [Feferman 60]. The moral is: You have to be very precise when dealing with consistency statements.

The example of Con_PA as a true and yet unprovable statement stood for 45 years as the most natural sentence of that type. In 1977, however, Jeff Paris and Leo Harrington discovered a generalization of Ramsey's Theorem in combinatorics that is not provable in Peano Arithmetic. Since that time a small cottage industry has developed that produces relatively natural statements that are not provable in one theory or another.

We end this section with a couple of interesting corollaries of the Second Incompleteness Theorem. The first corollary is pretty strange. Suppose for a second that Peano Arithmetic is consistent (you believe that it is, right?). Then we can, if we like, assume that PA is inconsistent, and we will still have a consistent set of axioms!

Corollary 5.6.4. If PA is consistent, the set of axioms

PA ∪ {¬Con_PA}

is consistent.

Proof. This is immediate. We know that PA ⊬ Con_PA. So by Exercise 4 in Section 2.7.1, PA ∪ {¬Con_PA} is consistent. □

To appreciate the next corollary, remember our development of the First Incompleteness Theorem in the setting of Peano Arithmetic. The sentence θ that we construct from the Self-Reference Lemma "says" that it (θ) is not provable from PA. Then the Incompleteness Theorem says that θ is right: θ is unprovable from PA, so θ is correct in what it asserts.

Now, suppose that we use the Self-Reference Lemma to construct a different sentence, α, such that α claims that it is provable from PA. Is α true? False? Löb's Theorem provides the answer.


Corollary 5.6.5 (Löb's Theorem). Suppose α is such that PA ⊢ Thm_PA(⌜α⌝) → α. Then PA ⊢ α.

Proof. This proof hinges on the fact that PA is sufficiently strong to prove the equivalence

¬Thm_PA(⌜α⌝) ↔ Con_{PA∪{¬α}}.

Granting this, since by assumption PA ⊢ (¬α → ¬Thm_PA(⌜α⌝)), we know that

PA ∪ {¬α} ⊢ ¬Thm_PA(⌜α⌝),

so by our equivalence,

PA ∪ {¬α} ⊢ Con_{PA∪{¬α}}.

But then by the Second Incompleteness Theorem, PA ∪ {¬α} must be inconsistent, and therefore PA ⊢ α, as needed. □

Löb's Theorem tells us that the sentence that states, "I am provable in Peano Arithmetic," is true (and therefore provable)!

5.6.1 Exercises

1. Explain how the induction schema of Peano Arithmetic as given

in Definition 5.6.1 differs from the full principle of mathematical induction:

(∀A ⊆ N)[(0 ∈ A ∧ (∀x)(x ∈ A → Sx ∈ A)) → A = N].

[Suggestion: You might look at the comment on page 117.]

2. Suppose θ and η are two sentences that assert their own unprovability (from PA). So

PA ⊢ [θ ↔ ¬Thm_PA(⌜θ⌝)]

and PA ⊢ [η ↔ ¬Thm_PA(⌜η⌝)].

Prove that PA ⊢ θ ↔ η.


3. Let ρ be the Rosser sentence (in the setting of PA). So

PA ⊢ ρ ↔ (∀x)[Deduction_PA(x, ⌜ρ⌝) → (∃y < x)(Deduction_PA(y, ⌜¬ρ⌝))].

Show that

PA ⊢ Con_PA → [(¬Thm_PA(⌜ρ⌝)) ∧ (¬Thm_PA(⌜¬ρ⌝))].

Use this and Gödel's Second Incompleteness Theorem to show

PA ⊢ [Con_PA → ρ]

and PA ⊬ [ρ → Con_PA].

So since Gödel's θ is such that PA ⊢ [θ ↔ Con_PA], you have shown that θ and ρ are not provably equivalent in PA.

5.7 George Boolos on the Second Incompleteness Theorem

George Boolos published a wonderful article on the Second Incompleteness Theorem. Here is the first page of his

Gödel's Second Incompleteness Theorem Explained in Words of One Syllable¹

First of all, when I say "proved", what I will mean is "proved with the aid of the whole of math". Now then: two plus two is four, as you well know. And, of course,

it can be proved that two plus two is four (proved, that is, with the aid of the whole of math, as I said, though in the case of two plus two, of course we do not need the

¹ Mind, Vol. 103, January 1994, pp. 1-3. Used with the permission of Oxford University Press.


whole of math to prove that it is four). And, as may not be quite so clear, it can be proved that it can be proved that two plus two is four, as well. And it can be proved that it can be proved that it can be proved that two plus two is four. And so on. In fact, if a claim can be proved, then it can be proved that the claim can be proved. And that too can be proved.

Now: two plus two is not five. And it can be proved that two plus two is not five. And it can be proved that it can be proved that two plus two is not five, and so on.

Thus: it can be proved that two plus two is not five. Can it be proved as well that two plus two is five? It would be a real blow to math, to say the least, if it could. If it could be proved that two plus two is five, then it could be proved that five is not five, and then there would be no claim that could not be proved, and math would be a lot of bunk.

So, we now want to ask, can it be proved that it can't be proved that two plus two is five? Here's the shock: no, it can't. Or, to hedge a bit: if it can be proved that it can't be proved that two plus two is five, then it can be proved as well that two plus two is five, and math is a lot of bunk. In fact, if math is not a lot of bunk, then no claim of the form "claim X can't be proved" can be proved.

So, if math is not a lot of bunk, then, though it can't be proved that two plus two is five, it can't be proved that it can't be proved that two plus two is five.

By the way, in case you'd like to know: yes, it can be proved that if it can be proved that it can't be proved that two plus two is five, then it can be proved that two plus two is five.

5.8 Summing Up, Looking Ahead

This chapter is the capstone of the book. We have stated and proved (with only a small bit of handwaving) the two incompleteness theorems. I am sure that at this point you have a strong understanding of the theorems as well as an appreciation for the delicate arguments


that are used in their proofs. These theorems have had a profound impact on the philosophical understanding of the subject of mathematics, and they involve some wonderful mathematics in and of themselves.

When I was in college, I was constantly asked by my mother what I was studying, and it became sort of a game to try to distill the essence of a course into a few phrases that would get the idea across without bogging down into technical details. As I think through this course, I think the summary I would give my mom would be something like this:

We looked at the language in which mathematics is written and looked at different kinds of mathematical universes, or structures. We thought about axioms and proofs. We worked through the proof of Gödel's Completeness Theorem, which shows that any set of axioms that is not self-contradictory is true in some mathematical universe.

Then we changed our focus and thought about the natural numbers and ordinary arithmetic. We proved that 1 + 1 is 2. We proved that we can prove that 1 + 1 is 2. But the amazing thing is that even though 1 + 1 is not 3, and even though we can prove that 1 + 1 is not 3, we cannot prove that we cannot prove that 1 + 1 is 3, unless mathematics is inconsistent. That is the core of Gödel's Second Incompleteness Theorem.

So five years from now, if you have stopped studying mathematics and think back on this course, I hope you will at least remember the core ideas. If, on the other hand, you are actively working in a mathematical field, I hope that you will also remember some of the intricate arguments that we have developed and some of the powerful theorems that we have seen. Logic is a useful and powerful part of mathematics, interesting in its own right and shaping our understanding of the entire field.

Appendix Just Enough Set Theory to Be Dangerous

It is my goal in this appendix to review some basic set-theoretic notions and to state some results that are used in the book. Very little will be proved, but the Exercises will give you a chance to get a feel for the subject. There are several laudable texts and reference works on set theory; I have listed a few in the Bibliography.

Think of a set as a collection of objects. If X is a set, we write a ∈ X to say that the object a is in the collection X. For our purposes, it will be necessary that any given thing either is an element of a given set or is not an element of that set. If that sounds obvious to you, suppose I asked you whether or not 153,297 was in the "set" of big numbers. Depending on your age and whether or not you are used to handling numbers that large, your answer might be "yes," "no," or "pretty large, but not all that large." So you can see that membership in an alleged set might not be all that cut and dried. But we won't think about sets of that sort, leaving them to the field of fuzzy set theory.

Our main concern will be with infinite sets and the size of those sets. One of the really neat results of set theory is that infinite sets come in different sizes, so there are different sizes of infinity! Here are some of the details:

We say that sets X and Y have the same cardinality, and write |X| = |Y|, if there is a one-to-one and onto function (also called a bijection) f : X → Y. The function f is sometimes called a one-to-one correspondence. For example, if X and Y are finite sets, this



just means that you can match up the elements of the sets without missing anyone. So if D = {Happy, Sleepy, Grumpy, Doc, Dopey, Sneezy, Bashful} and W = {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}, it is easy to see that |D| = |W| by pairing them up. Notice in doing the pairing that you never actually have to count the number of elements in either set. All you have to do is match them up. This is what allows us to apply the concept of cardinality to infinite sets.
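The matching-without-counting idea can be made concrete. Here is a small Python sketch (my illustration, not the book's) that checks that one particular pairing of the dwarfs with the days is one-to-one and onto, with no counting of either set:

```python
# A bijection shows |D| = |W| by matching alone; no counting is involved.
D = ["Happy", "Sleepy", "Grumpy", "Doc", "Dopey", "Sneezy", "Bashful"]
W = ["Sunday", "Monday", "Tuesday", "Wednesday",
     "Thursday", "Friday", "Saturday"]

pairing = dict(zip(D, W))               # match them up in the order listed
assert len(pairing) == len(D)           # one-to-one: no dwarf reuses a day
assert set(pairing.values()) == set(W)  # onto: every day of the week is used
```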

As an easy example, notice that N = {0, 1, 2, 3, ...} has the same cardinality as EVEN = {0, 2, 4, 6, ...}. To prove this, I must exhibit a bijection between the two sets, and the function f : N → EVEN defined by f(x) = 2x works quite nicely. It is easy to check that f is a bijection, so |EVEN| = |N|. Notice that this says that these two sets have the same size, even though the first is a proper subset of the second. This cannot happen with finite sets, but is rather common with infinite sets.
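The two properties of f(x) = 2x can at least be spot-checked on an initial segment; a Python sketch of mine, not part of the text:

```python
# f(x) = 2x maps N one-to-one onto the even numbers; here we verify both
# properties on the finite segment 0, 1, ..., 999.
def f(x):
    return 2 * x

image = {f(x) for x in range(1000)}

assert len(image) == 1000               # one-to-one: no two outputs collide
assert image == set(range(0, 2000, 2))  # onto: every even below 2000 is hit
```

Of course a finite check is not a proof; the point is only that the bijection is completely explicit.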

A set that has the same cardinality as the set of natural numbers is called countable. The exercises will give you several opportunities to show that the cardinality of one set is equal to the cardinality of another set.

We will say that the cardinality of set A is less than or equal to the cardinality of set B, and write |A| ≤ |B|, if there is a one-to-one function f : A → B. If |A| ≤ |B| but |A| ≠ |B|, we say that the cardinality of A is less than the cardinality of B.

One of the basic theorems of set theory, the Schröder-Bernstein Theorem, says that if |A| ≤ |B| and |B| ≤ |A|, then |A| = |B|. The proof of this theorem is beyond this little appendix, but a nice version is in [Hrbacek and Jech 84]. Be warned, however. This theorem looks obvious because of the notation. What it says, however, is that if you have two sets A and B such that there are injections f₁ : A → B and f₂ : B → A, then there is a bijection f₃ : A → B. But for our purposes, it will suffice to think: "If the size of this set is no bigger than the size of that set, and if the size of that set is no bigger than the size of this set, then the two sets must have the same size!"

With this tool in hand it is not hard to show that Q, the collection of rational numbers, is countable. An injection from N to Q is easy: f(x) = x will do. But to find an injection from the rationals to the naturals requires more thought. I'll let you think about it, but


trust me when I say that there is such a function. If you don't trust me, check any introductory set theory book or discrete mathematics textbook.

You will have noticed by this point that all we have done is show that several different infinite sets are countable. Georg Cantor proved in 1874 that the set of real numbers is not countable; in other words, he showed that |N| < |R|. In 1891 he developed his famous diagonal argument to show that given any set X, there is a set of larger cardinality, namely the collection of all subsets of X, which we call the power set of X and denote P(X). So Cantor's Theorem states that |X| < |P(X)|. Thus there can be no largest cardinality. More picturesquely: There is no largest set.

The way to think about a countable set is that you can write an infinite list that contains every element of the set. This means that you can go through the set one element at a time and be sure that you eventually get to every element of the set, and that any element of the set has only finitely many things preceding it on the list. Think about listing the natural numbers. Although it would take a long time to reach the number 1038756397456 on the list (0, 1, 2, 3, 4, ...), we would eventually get to that number, and after only a finite amount of time. The fact that the real numbers are uncountable means that if you try to write them out in a list, you have to leave some reals off your list. So if you try to list the reals as (1, −4/7, e, 10¹⁰, π/4, ...), then there has to be a real number, maybe 17, that you left out.
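Cantor's diagonal construction can be seen in miniature. The sketch below is an illustration of mine, working with finite lists of digit sequences rather than genuine real numbers: it builds a sequence that disagrees with the n-th row in the n-th place, so it cannot appear anywhere on the list.

```python
def diagonal_miss(listing):
    """Return a digit sequence differing from listing[n] at position n."""
    # use digit 5 unless the diagonal digit is 5, in which case use 6
    return [5 if row[n] != 5 else 6 for n, row in enumerate(listing)]

listing = [[1, 4, 1, 5], [2, 7, 1, 8], [3, 3, 3, 3], [5, 5, 5, 5]]
d = diagonal_miss(listing)
assert all(d[n] != listing[n][n] for n in range(len(listing)))
assert d not in listing
```

Sticking to the digits 5 and 6 sidesteps the usual caveat about decimal expansions that end in repeating 9s or 0s.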

Suppose that X₁, X₂, X₃, ... are all countable sets, and consider the set Y = X₁ ∪ X₂ ∪ X₃ ∪ ⋯. So Y is a countable union of countable sets. If you want to know the size of Y, it is easy to find, for there is a theorem that states that the countable union of countable sets is countable. (Purists: Calm down. I know about the Axiom of Choice. I know we don't need this theorem in this generality. But this isn't a set theory text, so lighten up a little!) Let's look at an application of this wonderful theorem:
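The usual proof of that theorem is a dovetailing argument, and the argument itself is programmable. Here is a Python sketch of mine (it assumes, for simplicity, that each Xᵢ is infinite, and it does not bother to skip duplicates, which is harmless for countability):

```python
from itertools import count, islice

def union_of_countables(X):
    """Enumerate X(0) ∪ X(1) ∪ ..., where each X(i) is an infinite iterator."""
    iters = []
    for _ in count():
        iters.append(X(len(iters)))   # bring one new set into play...
        for it in iters:
            yield next(it)            # ...and one new element of each set

X = lambda i: (10 * i + j for j in count())   # X_i = {10i, 10i+1, 10i+2, ...}
first = list(islice(union_of_countables(X), 10))
# first == [0, 1, 10, 2, 11, 20, 3, 12, 21, 30]
```

Every element of every Xᵢ is reached after finitely many steps, which is exactly what being countable requires.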

Almost all of the languages in this book are countable, meaning that the set consisting of all of the constant symbols, function symbols, and relation symbols, together with the connectives, parentheses, and variables, is countable. Now using the fact that a countable union of countable sets is countable, we can show that S₂, the collection of all strings of two symbols from the language, is countable.



Here is a table that contains every string of exactly two ℒ-symbols, where ℒ = (s₀, s₁, ...):

        s₀      s₁      s₂      s₃      s₄      ⋯
s₀      s₀s₀    s₀s₁    s₀s₂    s₀s₃    s₀s₄    ⋯
s₁      s₁s₀    s₁s₁    s₁s₂    s₁s₃    s₁s₄    ⋯
s₂      s₂s₀    s₂s₁    s₂s₂    s₂s₃    s₂s₄    ⋯
s₃      s₃s₀    s₃s₁    s₃s₂    s₃s₃    s₃s₄    ⋯

Now, this array shows that S₂ can be thought of as a countable union of countable sets: Each row of the table is countable, and there are countably many rows. By the theorem of the last paragraph, the collection of length-two strings is countable.

Chaff: Notice that by reading the table diagonally starting in the upper left-hand corner, we can explicitly construct a listing of all of the length-two strings:

(s₀s₀, s₀s₁, s₁s₀, s₀s₂, s₁s₁, s₂s₀, s₀s₃, s₁s₂, s₂s₁, s₃s₀, ...).

This trick is important when you want to be able to program a computer to produce the listing or when you want to show that the collection of length-two strings is recursive.
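The diagonal reading is indeed the kind of thing a program can produce. A Python sketch of mine (not part of the text): enumerate the pairs (i, j) in order of increasing i + j, so that every length-two string appears after finitely many steps even though the alphabet is infinite.

```python
from itertools import islice

def length_two_strings(prefix="s"):
    """Yield the strings s_i s_j diagonal by diagonal (constant i + j)."""
    total = 0
    while True:
        for i in range(total + 1):
            j = total - i
            yield f"{prefix}{i}{prefix}{j}"
        total += 1

listing = list(islice(length_two_strings(), 6))
# listing == ['s0s0', 's0s1', 's1s0', 's0s2', 's1s1', 's2s0']
```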

Using the same idea, we can prove by induction that for any natural number n, Sₙ, the collection of strings of length n, is countable. [Put the length-(n − 1) strings across the top of the table and the symbols of ℒ down the side.] But then the collection of all finite strings is just the (countable) union S₀ ∪ S₁ ∪ S₂ ∪ ⋯. So the collection of all finite strings is countable. Exercise 5 asks you to think about generating the listing of strings via a computer.

Exercises

1. Show that the set of numbers that are a nonnegative power of 10

(also known as A = {1, 10, 100, 1000, ...}) is countable.

2. Show that the collection of prime numbers is countable.


3. (a) Show that these two intervals on the real line have the same cardinality: [0, 1] and [0, 10].

(b) Show that these two intervals have the same cardinality: [0,1] and [1,3].

(c) Show that these two intervals on the real line have the same cardinality: [a, b] and [c, d], where a ≠ b and c ≠ d.

(d) Show that the interval [0, 1] and the interval (0, 1) have the same cardinality. [Suggestion: The easy way is to quote a theorem. The interesting way is to find an explicit bijection between the two sets.]

4. Show that the relation "has the same cardinality as" is an equivalence relation.

5. Suppose that we have a recursively enumerable listing of the elements of a countable language ℒ = (s₀, s₁, s₂, ...). Design a computer program that lists all of the finite strings of symbols from the language, showing that the collection of finite strings is also recursively enumerable.

Bibliography

[Barwise 77] Jon Barwise, ed. Handbook of Mathematical Logic. Amsterdam: North-Holland, 1977.

[Bell and Machover 77] John L. Bell and Moshé Machover. A Course in Mathematical Logic. Amsterdam: North-Holland, 1977.

[Boolos 89] George Boolos. A New Proof of the Gödel Incompleteness Theorem. Notices of the American Mathematical Society, Vol. 36, No. 4, April 1989, pp. 388-390.

[Boolos 94] ———. Gödel's Second Incompleteness Theorem Explained in Words of One Syllable. Mind, Vol. 103, January 1994, pp. 1-3.

[Chang and Keisler 73] C. C. Chang and H. J. Keisler. Model Theory. Amsterdam: North-Holland, 1973.

[Crossley et al. 72] J. N. Crossley, C. J. Ash, C. J. Brickhill, J. C. Stillwell, and N. H. Williams. What Is Mathematical Logic? London: Oxford University Press, 1972.

[Dawson 97] John W. Dawson, Jr. Logical Dilemmas: The Life and Work of Kurt Gödel. Wellesley, Mass.: A. K. Peters, 1997.

[Enderton 72] Herbert B. Enderton. A Mathematical Introduction to Logic. Orlando, Fla.: Academic Press, 1972.

[Feferman 60] Solomon Feferman. Arithmetization of Metamathematics in a General Setting. Fundamenta Mathematicae, Vol. 49, 1960, pp. 35-92.

[Feferman 98] ———. In the Light of Logic. Oxford: Oxford University Press, 1998.



[Gödel-Works] Kurt Gödel. Collected Works: Vol. I, Publications 1929-1936. Edited by Solomon Feferman, John W. Dawson, Jr., Stephen C. Kleene, Gregory H. Moore, Robert M. Solovay, and Jean van Heijenoort. New York: Oxford University Press, 1986.

[Goldstern and Judah 95] Martin Goldstern and Haim Judah. The Incompleteness Phenomenon: A New Course in Mathematical Logic. Wellesley, Mass.: A. K. Peters, 1995.

[Henle 86] James M. Henle. An Outline of Set Theory. New York: Springer-Verlag, 1986.

[Hrbacek and Jech 84] Karel Hrbacek and Thomas Jech. Introduc-tion to Set Theory. New York: Marcel Dekker, 1984.

[Keisler 76] H. Jerome Keisler. Foundations of Infinitesimal Calculus. Boston: Prindle, Weber & Schmidt, 1976.

[Keisler and Robbin 96] H. Jerome Keisler and Joel Robbin. Mathematical Logic and Computability. New York: McGraw-Hill, 1996.

[Malitz 79] Jerome Malitz. Introduction to Mathematical Logic. New York: Springer-Verlag, 1979.

[Manin 77] Yu. I. Manin. A Course in Mathematical Logic. New York: Springer-Verlag, 1977.

[Mendelson 87] Elliott Mendelson. Introduction to Mathematical Logic. Monterey, Calif.: Brooks/Cole, a division of Wadsworth, 1987.

[Roitman 90] Judith Roitman. Introduction to Modern Set Theory. New York: Wiley, 1990.

[Russell 67] Bertrand Russell. The Autobiography of Bertrand Rus-sell, Vol. I. London: George Allen & Unwin, 1967.

Index

A arity, 6 assignment function

term, 33 variable, 32 as-modification, 33

atomic formula, 13 axiom

equality, 56 logical, 55-57 quantifier, 56-57

axiomatize, 124

B Berry's paradox, 194 Boolos, George, 194, 203

C Cantor's Theorem, 209 Cantor, Georg, 209 cardinality, 207 Church, Alonzo, 138 Church's Thesis, 139 Compactness Theorem, 102 complete

deductive system, 86 set of axioms, 124

complete diagram, 120 Completeness Theorem, 87

consistent, 87 countable, 114, 208

D decidable, 55 deduction, 50 Deduction Theorem, 74 definable set, 130 defines, 130 Δ-formula, 131 derivability conditions for Peano

Arithmetic, 198 dwarfs, seven, 208

E elementarily equivalent, 104 elementary chain, 119 elementary extension, 112 elementary substructure, 112 equality axiom, 56 extension by constants, 90

F Feferman, Solomon, 201 finitely satisfiable set of formulas, 102 formula, 13

atomic, 13 A, 131



Π, 131 Σ, 131

function well-defined, 95

G Gödel, Kurt, 2, 87, 138 Gödel number, 148

H Harrington, Leo, 201 Henkin axiom, 91 Henkin constant, 91 Henkin, Leon, 30, 87 Hilbert, David, 1, 197

I Incompleteness Theorem

First, 185 Second, 198

inconsistent, 87 induction on complexity, 18 initial segment, 20 isomorphic, 31 isomorphism, 31

K

König's Infinity Lemma, 108

L ℒ-structure, 26 language, 6 liar paradox, 193 linear order, 110 listable set, 141 Löb's Theorem, 202

logical axiom, 55-57 logical implication, 43 Löwenheim-Skolem Theorem

Downward, 114 Upward, 118

M model, 36 model of arithmetic, 104

nonstandard, 104 modus ponens, 51

N N, 78 n-ary, 6 names, 194 nonstandard analysis, 105 notation

infix, 12 Polish, 12

number theory language of, 22

O ω-consistent, 190

P Paris, Jeff, 201 Peano Arithmetic

first-order, 197 second-order, 118

Π-formula, 131 prime component of a formula, 172 propositional consequence

in first order logic, 60 in propositional logic, 60


provable formula, 134

Q quantifier

bounded, 131 scope of, 13

quantifier axiom, 56-57

R 105

recursion definition by, 11

recursive set, 129 recursively axiomatized, 185 recursively enumerable set, 140 refutable formula, 134 represent, 129 restriction

of a function, 111 of a model, 98

Robinson, Abraham, 105 Rosser's Lemma, 133 Rosser's Theorem, 191 rule of inference

type (PC), 61 type (QR), 61

S satisfaction, 34 satisfiable set of formulas, 102 Schröder-Bernstein Theorem, 208 scope, 13 Self-Reference Lemma, 182 sentence, 23 sequence code, 170 Σ-formula, 131 Skolem function, 116

Skolem's paradox, 117 Soundness Theorem, 66 structure, 26

Henkin, 30 universe of, 26

subformula, 13 substitutable, 41 substructure, 111

elementary, 112

T
Tarski's Theorem, 188
tautology
  of propositional logic, 58
term, 11
term assignment function, 33
term construction sequence, 153
theory, 184
  of a set of sentences, 184
  of a structure, 104
theory of 𝔑, 124
Thm_Σ, 53
tree, 108
true, 36
Turing, Alan, 138

U
uncountable, 114
unique readability, 12
universal closure, 44
universe, 26
  size of the, 150

V
valid, 43
variable
  bound, 25
  free, 23
variable assignment function, 32

W
well-defined function, 95
well-order, 110


Δ-Formulas Dealing with Coding

Formula                 Defined on Page   Interpretation
Even(x)                 133               x is an even number.
Prime(x)                133               x is a prime number.
PrimePair(x, y)         134               x and y are adjacent primes.
Divides(x, y)           134               x is a factor of y.
Codenumber(c)           143               c codes a finite sequence of numbers.
IthPrime(i, y)          144               y is the ith prime number, so y = p_i.
IthElement(e, i, c)     145               e is the ith element of the sequence coded by c.
Length(c, l)            145               c codes a sequence of length l.
Append(b, d, a)         158               a is the code for b followed by d.
SequenceCode(c)         175               c codes a sequence of 𝔏_NT-formulas.
SequenceLength(c, l)    175               c codes a sequence of exactly l 𝔏_NT-formulas.
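As a concrete illustration of the coding machinery behind Codenumber, IthPrime, IthElement, and Length, here is a minimal Python sketch of one standard prime-power coding scheme (exponents offset by 1 so that zeros in the sequence survive decoding); the book's exact conventions for these Δ-formulas may differ in detail, so treat this as an illustrative assumption rather than the official definitions.

```python
def nth_prime(i):
    # Return the i-th prime (1-indexed): nth_prime(1) == 2, nth_prime(2) == 3, ...
    count, n = 0, 1
    while count < i:
        n += 1
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

def encode(seq):
    # Code <a_1, ..., a_k> as p_1^(a_1 + 1) * ... * p_k^(a_k + 1).
    # The "+1" in each exponent keeps trailing/interior zeros recoverable.
    c = 1
    for i, a in enumerate(seq, start=1):
        c *= nth_prime(i) ** (a + 1)
    return c

def ith_element(c, i):
    # Exponent of the i-th prime in c, minus 1; returns -1 if p_i does not divide c.
    p, e = nth_prime(i), 0
    while c % p == 0:
        c //= p
        e += 1
    return e - 1

def length(c):
    # Length of the coded sequence: count primes dividing c from p_1 upward.
    l = 0
    while ith_element(c, l + 1) >= 0:
        l += 1
    return l
```

For example, encode([3, 0, 2]) is 2^4 * 3^1 * 5^3 = 6000, and decoding recovers the entries and the length.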

Δ-Formulas Dealing with 𝔏_NT-Syntax

Formula                            Defined on Page   Interpretation
Variable(x)                        150               x is the Gödel number of a variable.
TermConstructionSequence(c, a)     151               a is the Gödel number of a term and c codes a construction sequence for the term with Gödel number a.
Term(a)                            153               a is the Gödel number of a term.
Formula(f)                         154               f is the Gödel number of an 𝔏_NT-formula.
NumConstructionSequence(c, a, y)   155               c codes a construction sequence with last element y = ⌜ā⌝.
Num(a, y)                          156               y = ⌜ā⌝.
TermReplace(c, u, d, x, t)         157               u and t are terms, x is a variable, c is a construction sequence for u, and d is the result of taking c and replacing all of the x's by t.
TermSub(u, x, t, y)                160               y = ⌜u_t^x⌝.
Sub(f, x, t, y)                    161               f = ⌜φ⌝ and y = ⌜φ_t^x⌝.
Free(x, f)                         164               the variable coded by x is free in the formula coded by f.
Substitutable(t, x, f)             164               t is a term that is substitutable for variable x in formula f.

Permission granted by rights holder. Further reproduction prohibited. Printed on recycled paper.

Δ-Formulas Dealing with Deductions

Formula                       Defined on Page   Interpretation
AxiomOfN(a)                   165               a is the Gödel number of an axiom of N.
LogicalAxiom(a)               166               a is the Gödel number of a logical axiom.
QRRule(c, e, i)               169               c codes a sequence of formulas (which is supposed to be a deduction), e codes the ith formula of the sequence, and e is justified by the quantifier rules.
PrimeComponent(u, f)          170               u and f are Gödel numbers of formulas, and the formula coded by u is a prime component of the formula coded by f.
PrimeList(c, i, r)            171               r codes the prime components in the first i entries of the deduction coded by c.
TruthAssignment(c, i, r, v)   171               v codes an assignment of truth values to the prime components coded by r.
Evaluate(e, r, v, y)          173               e is the Gödel number of a formula α, and y is the truth value of α given v, a truth assignment for r, which is a prime list that includes the prime components of α.
PCRule(c, e, i)               173               e is the ith entry of the deduction coded by c, and is justified by the propositional consequence rule.
Deduction(c, f)               175               f is the Gödel number of a formula and c is a code for a deduction of that formula.
Thm_N(f)                      177               f is the Gödel number of a theorem of N.

⇒ Not a Δ-formula ⇐
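The reason the theoremhood predicate falls outside the Δ class is visible from its shape: it existentially quantifies the Δ-formula Deduction over an unbounded variable c, and an unbounded existential quantifier in front of a Δ-formula yields, in general, only a Σ-formula. A sketch of the definition, as read off the table (the book's official definition on page 177 may differ in detail):

```latex
\mathit{Thm}_N(f) \;:\equiv\; \exists c \, \mathit{Deduction}(c, f)
```

Since there is no a priori bound on the code c of a deduction of the formula coded by f, the quantifier cannot be bounded, and so Thm_N is Σ but not (provably) Δ.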


List of Symbols

𝔄, 26
φ_t^x, 40
𝔄 ⊨ φ[s], 34
𝔄 ⊨ φ, 36
S, 22
s[x|a], 33
Σ ⊢ φ, 50
𝔄 ≺ 𝔅, 112
Th(𝔄), 184
Con_PA, 198
Δ ⊨ Γ, 43
E, 22
𝔏, 6
𝔏_NT, 22
𝔏_ℜ, 105
Thm_Σ, 53
Vars, 6
≡, 104
⊥, 87
⌜φ⌝, 148
≅, 31
𝔑, 78


top related