
Logic for Philosophy

Theodore Sider

June 2, 2008

Preface

This book is an elementary introduction to the logic that students of contempo-

rary philosophy ought to know. It covers i) basic approaches to logic, including

proof theory and especially model theory, ii) extensions of standard logic (such

as modal logic) that are important in philosophy, and iii) some elementary

philosophy of logic. It prepares students to read the logically sophisticated

articles in today’s philosophy journals, and helps them resist bullying by symbol-

mongerers. In short, it teaches the logic necessary for being a contemporary

philosopher.

For better or for worse (I think better), the last century-or-so’s developments

in logic are part of the shared knowledge base of philosophers, and inform, in

varying degrees of directness, nearly every area of philosophy. Logic is part

of our shared language and inheritance. The standard philosophy curriculum

therefore includes a healthy dose of logic. This is a good thing. But the

advanced logic that is part of this curriculum is usually a course in mathematical

logic, which usually means an intensive course in metalogic (for example, a

course based on the excellent Boolos and Jeffrey (1989)). I do believe in the

value of such a course. But advanced undergraduate philosophy majors and

beginning graduate students often take but a single advanced logic course; and

if there is to be only one, it should not, I think, be a course in metalogic. The

standard metalogic course is too mathematically demanding for the average

philosophy student, and omits material that the average student needs to know.

If there is to be only one advanced logic course, let it be a course designed to

instill logical literacy.

I begin with a sketch of standard propositional and predicate logic (developed

more formally than in a typical intro course). I briefly discuss a few

extensions and variations on each (e.g., three-valued logic, definite descriptions).

I then discuss modal logic and counterfactual conditionals in detail. I

presuppose familiarity with the contents of a typical intro logic course: the

meanings of the logical symbols of first-order predicate logic without identity

or function symbols; truth tables; translations from English into propositional

and predicate logic; some proof system (e.g., natural deduction) in propositional

and predicate logic.

I drew heavily from the following sources, which would be good for supple-

mental reading:

· Propositional logic: Mendelson (1987)

· Descriptions, multi-valued logic: Gamut (1991a)

· Sequents: Lemmon (1965)

· Further quantifiers: Glanzberg (2006); Sher (1991, chapter 2); Westerståhl

(1989); Boolos and Jeffrey (1989, chapter 18)

· Modal logic: Gamut (1991b); Cresswell and Hughes (1996)

· Semantics for intuitionism: Priest (2001)

· Counterfactuals: Lewis (1973)

· Two-dimensional modal logic: Davies and Humberstone (1980)

Another important source was Ed Gettier’s 1988 modal logic class at the Uni-

versity of Massachusetts. My notes from that course formed the basis of the

first incarnation of this work.

I am also deeply grateful for feedback from colleagues, and from students

in courses on this material. In particular, Marcello Antosh, Josh Armstrong,

Gabe Greenberg, Angela Harper, Sami Laine, Gregory Lavers, Alex Morgan,

Jeff Russell, Brock Sides, Jason Turner, Crystal Tychonievich, Jennifer Wang,

Brian Weatherson, and Evan Williams: thank you.

Contents

Preface

1 Nature of Logic
   1.1 Logical consequence and logical truth
   1.2 Form and abstraction
   1.3 Formal logic
   1.4 Correctness and application
   1.5 The nature of logical consequence
   1.6 Extensions, deviations, variations
      1.6.1 Extensions
      1.6.2 Deviations
      1.6.3 Variations
   1.7 Metalogic, metalanguages, and formalization
   1.8 Set theory

2 Propositional Logic
   2.1 Grammar of PL
   2.2 The semantic approach to logic
   2.3 Semantics of PL
   2.4 Natural deduction in PL
      2.4.1 Sequents
      2.4.2 Rules
      2.4.3 Sequent proofs
      2.4.4 Example sequent proofs
   2.5 Axiomatic proofs in PL
      2.5.1 Example axiomatic proofs
   2.6 Soundness of PL
   2.7 Completeness of PL

3 Variations and Deviations from PL
   3.1 Alternate connectives
      3.1.1 Symbolizing truth functions in propositional logic
      3.1.2 Inadequate connective sets
      3.1.3 Sheffer stroke
   3.2 Polish notation
   3.3 Multi-valued logic
      3.3.1 Łukasiewicz’s system
      3.3.2 Kleene’s “strong” tables
      3.3.3 Kleene’s “weak” tables (Bochvar’s tables)
      3.3.4 Supervaluationism
   3.4 Intuitionism

4 Predicate Logic
   4.1 Grammar of predicate logic
   4.2 Semantics of predicate logic
   4.3 Establishing validity and invalidity

5 Extensions of Predicate Logic
   5.1 Identity
      5.1.1 Grammar for the identity sign
      5.1.2 Semantics for the identity sign
      5.1.3 Symbolizations with the identity sign
   5.2 Function symbols
      5.2.1 Grammar for function symbols
      5.2.2 Semantics for function symbols
   5.3 Definite descriptions
      5.3.1 Grammar for ι
      5.3.2 Semantics for ι
      5.3.3 Eliminability of function symbols and definite descriptions
   5.4 Further quantifiers
      5.4.1 Generalized monadic quantifiers
      5.4.2 Generalized binary quantifiers
      5.4.3 Second-order logic

6 Propositional Modal Logic
   6.1 Grammar of MPL
   6.2 Symbolizations in MPL
   6.3 Semantics for MPL
      6.3.1 Relations
      6.3.2 Kripke models
      6.3.3 Semantic validity proofs
      6.3.4 Countermodels
      6.3.5 Schemas, validity, and invalidity
   6.4 Axiomatic systems of MPL
      6.4.1 System K
      6.4.2 System D
      6.4.3 System T
      6.4.4 System B
      6.4.5 System S4
      6.4.6 System S5
      6.4.7 Substitution of equivalents and modal reduction
   6.5 Soundness in MPL
      6.5.1 Soundness of K
      6.5.2 Soundness of T
      6.5.3 Soundness of B
   6.6 Completeness of MPL
      6.6.1 Canonical models
      6.6.2 Maximal consistent sets of wffs
      6.6.3 Definition of canonical models
      6.6.4 Features of maximal consistent sets
      6.6.5 Maximal consistent extensions
      6.6.6 “Mesh”
      6.6.7 Truth and membership in canonical models
      6.6.8 Completeness of systems of MPL

7 Variations on MPL
   7.1 Propositional tense logic
      7.1.1 The metaphysics of time
      7.1.2 Tense operators
      7.1.3 Syntax of tense logic
      7.1.4 Possible worlds semantics for tense logic
      7.1.5 Formal constraints on ≤
   7.2 Intuitionist propositional logic
      7.2.1 Kripke semantics for intuitionist propositional logic
      7.2.2 Examples
      7.2.3 Soundness

8 Counterfactuals
   8.1 Natural language counterfactuals
      8.1.1 Not truth-functional
      8.1.2 Can be contingent
      8.1.3 No augmentation
      8.1.4 No contraposition
      8.1.5 Some implications
      8.1.6 Context dependence
   8.2 The Lewis/Stalnaker approach
   8.3 Stalnaker’s system (SC)
      8.3.1 Syntax of SC
      8.3.2 Semantics of SC
   8.4 Validity proofs in SC
   8.5 Countermodels in SC
   8.6 Logical Features of SC
      8.6.1 Not truth-functional
      8.6.2 Can be contingent
      8.6.3 No augmentation
      8.6.4 No contraposition
      8.6.5 Some implications
      8.6.6 No exportation
      8.6.7 No importation
      8.6.8 No hypothetical syllogism (transitivity)
      8.6.9 No transposition
   8.7 Lewis’s criticisms of Stalnaker’s theory
   8.8 Lewis’s system
   8.9 The problem of disjunctive antecedents

9 Quantified Modal Logic
   9.1 Grammar of QML
   9.2 Symbolizations in QML
   9.3 A simple semantics for QML
   9.4 Countermodels and validity proofs in SQML
   9.5 Philosophical questions about SQML
      9.5.1 The necessity of identity
      9.5.2 The necessity of existence
      9.5.3 Necessary existence defended
   9.6 Variable domains
      9.6.1 Countermodels to the Barcan and related formulas
      9.6.2 Expanding, shrinking domains
      9.6.3 Strong and weak necessity

10 Two-dimensional modal logic
   10.1 Actuality
      10.1.1 Kripke models with designated worlds
      10.1.2 Semantics for @
      10.1.3 Establishing validity and invalidity
   10.2 ×
      10.2.1 Two-dimensional semantics for ×
   10.3 Fixedly
   10.4 A philosophical application: necessity and a priority

A Answers to Selected Exercises

B Answers to Remaining Exercises

References

Chapter 1

Nature of Logic

Since you are reading this book, you are probably already familiar with

some logic. You probably know how to translate English sentences into

symbolic notation—into propositional logic:

English                                      Propositional logic
If snow is white then grass is green         S→G
Either snow is white or grass is not green   S∨∼G

and into predicate logic:

English                                                         Predicate logic
If Jones is happy then someone is happy                         Hj→∃xHx
Any friend of Jones is either insane or friends with everyone   ∀x[Fxj→(Ix∨∀yFxy)]

You are probably also familiar with some basic techniques for evaluating argu-

ments written out in symbolic notation. You have probably encountered truth

tables, and some form of proof theory (perhaps a “natural deduction” system;

perhaps “truth trees”.) You may have even encountered some elementary model

theory. In short: you have had an introductory course in symbolic logic.

What you already have is: literacy in elementary logic. What you will

get out of this book is: literacy in the rest of logic that philosophers tend to

presuppose, plus a deeper grasp of what logic is all about.

So what is logic all about?

1.1 Logical consequence and logical truth

Logic is about logical consequence. The statement “someone is happy” is a logical

consequence of the statement “Ted is happy”. If Ted is happy, then it logically

follows that someone is happy. Put another way: the statement “Ted is happy”

logically implies the statement “someone is happy”. Likewise, the statement

“Ted is happy” is a logical consequence of the statements “It’s not the case that

John is happy” and “Either John is happy or Ted is happy”. The first statement

follows from the latter two statements. If the latter two statements are true,

the former must be true. Put another way: the argument whose premises are

the latter two statements, and whose conclusion is the former statement, is a

logically correct one.1

Relatedly, logic is about logical truth. A logical truth is a sentence that is

“true purely by virtue of logic”. Examples might include: “it’s not the case that

snow is white and also not white”, “All fish are fish”, and “If Ted is happy then

someone is happy”. It is plausible that logical truth and logical consequence

are related thus: a logical truth is a sentence that is a logical consequence of

any sentences whatsoever.

1.2 Form and abstraction

Logicians focus on form. Consider again the following argument:

It’s not the case that John is happy

Ted is happy or John is happy

Therefore, Ted is happy

(Argument A)

Argument A is logically correct—its conclusion is a logical consequence of its

premises. It is customary to say that this is so in virtue of its form—in virtue of

the fact that its form is:

It’s not the case that φ

φ or ψ

Therefore ψ

1. The word ‘valid’ is sometimes used for logically correct arguments, but I will reserve that

word for a different concept: that of a logical truth according to the semantic conception of

logical truth.


Likewise, we say that “it’s not the case that snow is white and snow is not white”

is a logical truth because it has the form: it’s not the case that φ and not-φ.

We need to think hard about the idea of form. Apparently, we got the

alleged form of Argument A by replacing some words with Greek letters and

leaving other words as they were. We replaced the sentences ‘John is happy’ and

‘Ted is happy’ with φ and ψ, respectively, but left the expressions ‘It’s not the

case that’ and ‘or’ as they were, resulting in the schematic form displayed above.

Let’s call that form, “Form 1”. What’s so special about Form 1? Couldn’t we

make other choices for what to leave and what to replace? For instance, if we

replace the predicate ‘is happy’ with the schematic letter α, leaving the rest

intact, we get this:

It’s not the case that John is α

Ted is α or John is α

Therefore, Ted is α

(Form 2)

And if we replace the ‘or’ with the schematic letter γ and leave the rest intact,

then we get this:


It’s not the case that John is happy

Ted is happy γ John is happy

Therefore, Ted is happy


(Form 3)

If we think of Argument A as having Form 1, then we can think of it as being

logically correct in virtue of its form, since every “instance” of Form 1 is

logically correct. That is, no matter what sentences we substitute in for the

Greek letters φ and ψ in Form 1, the result is a logically correct argument.

Now, if we think of Argument A’s form as being Form 2, we can continue to

think of Argument A as being logically correct in virtue of its form, since, like

Form 1, every instance of Form 2 is logically correct: no matter what predicate

we change α to, Form 2 becomes a logically correct argument. But if we think

of Argument A’s form as being Form 3, then we cannot think of it as being

logically correct in virtue of its form, for not every instance of Form 3 is a

logically correct argument. If we change γ to ‘if and only if’, for example, then

we get the following logically incorrect argument:


It’s not the case that John is happy

Ted is happy if and only if John is happy

Therefore, Ted is happy



So, what did we mean, when we said that Argument A is logically correct in

virtue of its form? What is Argument A’s form? Is it Form 1, Form 2, or Form

3?

There is no such thing as the form of an argument. When we assign an

argument a form, what we are doing is focusing on certain words and ignoring

others. We leave intact the words we’re focusing on, and we insert schematic

letters for the rest. Thus, in assigning Argument A Form 1, we’re focusing on

the words (phrases) ‘it is not the case that’ and ‘or’, and ignoring other words.

More generally, in (standard) propositional logic, we focus on the phrases

‘if…then’, ‘if and only if’, ‘and’, ‘or’, and so on, and ignore others. We do this

in order to investigate the relations of logical consequence that hold in virtue

of these words’ meaning. The fact that Argument A is logically correct depends

just on the meaning of the phrases ‘it is not the case that’ and ‘or’; it does not

depend on the meanings of the sentences ‘John is happy’ and ‘Ted is happy’.

We can substitute any sentences we like for ‘φ’ and ‘ψ’ in Form 1 and still get

a logically correct argument.
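The contrast between Form 1 and the ‘if and only if’ instance of Form 3 can be checked by brute force. What follows is only a sketch: it assumes that preserving truth under every assignment of truth values is an adequate stand-in for logical correctness in these two cases, and the function name truth_preserving is made up for the illustration.

```python
from itertools import product

def truth_preserving(premises, conclusion, n_letters=2):
    """Return True if no assignment of truth values makes every premise
    true while making the conclusion false."""
    for values in product([True, False], repeat=n_letters):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# phi stands in for 'John is happy', psi for 'Ted is happy'.
form_1 = truth_preserving(
    premises=[lambda phi, psi: not phi,       # It's not the case that phi
              lambda phi, psi: phi or psi],   # phi or psi
    conclusion=lambda phi, psi: psi,          # Therefore psi
)

# The instance of Form 3 obtained by reading gamma as 'if and only if'.
form_3_iff = truth_preserving(
    premises=[lambda phi, psi: not phi,
              lambda phi, psi: psi == phi],   # psi if and only if phi
    conclusion=lambda phi, psi: psi,
)

print(form_1)      # True
print(form_3_iff)  # False
```

The second check fails on the assignment where both sentences are false: both premises come out true and the conclusion false.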

In predicate logic, on the other hand, we focus on further words: ‘all’ and

‘some’. Broadening our focus in this way allows us to capture a wider range

of logical consequences and logical truths. For example “If Ted is happy then

someone is happy” is a logical truth in virtue of the meaning of ‘someone’, but

not merely in virtue of the meanings of the characteristic words of propositional

logic.

Call the words on which we’re focusing (that is, the words that we leave

intact when we construct the forms of sentences and arguments) the logical constants.

(We can speak of natural language logical constants: ‘and’, ‘or’, etc.

for propositional logic; ‘all’ and ‘some’ in addition for predicate logic; as well

as symbolic logical constants: ∧, ∨, etc. for propositional logic; ∀ and ∃ in

addition for predicate logic.) What we’ve seen is that the forms we assign

depend on what we’re considering to be the logical constants.

We call these expressions logical constants because we interpret them in a

constant way in logic, in contrast to other terms. For example, ∧ is a logical

constant; in propositional logic, it always stands for conjunction. There are

fixed rules governing ∧, in proof systems (the rule that from P∧Q one can

infer P, for example), in the rules for constructing truth tables, and so on.

Moreover, these rules are distinctive for ∧: there are different rules for other

logical constants such as ∨. In contrast, the terms in logic that are not logical

constants do not have fixed, particular rules governing their meanings. For

example, there are no special rules governing what one can do with a P as


opposed to a Q in proofs or truth tables. That’s because P doesn’t symbolize

any sentence in particular; it can stand for any old sentence.

There isn’t anything sacred about the choices of logical constants we make

in propositional and predicate logic; and therefore, there isn’t anything sacred

about the customary forms we assign to sentences. We could treat other words

as logical constants. We could, for example, stop taking ‘or’ as a logical constant,

and instead take ‘It’s not the case that John is happy’, ‘Ted is happy’, and ‘John

is happy’ as logical constants. We would thereby view Argument A as having

Form 3. This would not be a particularly productive choice (since it would not

help to explain the correctness of Argument A), but it’s not wrong simply by

virtue of the concept of form.

More interestingly, consider the fact that every argument of the following

form is logically correct:

α is a bachelor

Therefore, α is unmarried

Accordingly, we could treat the predicates ‘is a bachelor’ and ‘is unmarried’

as logical constants, and develop a corresponding logic. We could introduce

special symbolic logical constants for these predicates, and we could introduce

distinctive rules governing these predicates in proofs. (The rule of “bachelor-elimination”,

for instance, might allow one to infer “α is unmarried” from “α is a bachelor”.)

As with the choices of the previous paragraph, this choice of

what to treat as a logical constant is also not ruled out by the concept of form.

And it would be more productive than the choices of the last paragraph. Still,

it would be far less productive than the usual choices of logical constants in

predicate and propositional logic. The word ‘bachelor’ doesn’t have as general

application as the words commonly treated as logical constants in propositional

and predicate logic; the latter are ubiquitous.

At least, this remark about “generality” is one idea about what should be

considered a “logical constant”, and hence one idea about the scope of what

is usually thought of as “logic”. Where to draw the boundaries of logic—and

indeed, whether the logic/nonlogic boundary is an important one to draw—is

an open philosophical question about logic. At any rate, in this course, one

thing we’ll do is study systems that expand the list of logical constants from

standard propositional and predicate logic.


1.3 Formal logic

Modern logic is “mathematical” or “formal” logic. This means simply that

one studies logic using mathematical techniques. More carefully: in order

to develop theories of logical consequence and logical truth, one develops a

formal language (see below); one treats the sentences of the formal language as

mathematical objects; one uses the tools of mathematics (especially, the tools

of very abstract mathematics, such as set theory) to formulate theories about

the sentences in the formal language; and one applies mathematical standards

of rigor to these theories. Mathematical logic was originally developed to study

mathematical reasoning2, but its techniques are now applied to reasoning of all

kinds.

Think, for example, of propositional logic (this will be our first topic below).

The standard approach to analyzing the logical behavior of ‘and’, ‘or’, and so

on, is to develop a certain formal language, the language of propositional logic.

The sentences of this language look like this:

P

(Q→R)∨(Q→∼S)

P↔(P∧Q)

The symbols ∧, ∨, etc., are used to represent the English words ‘and’, ‘or’, and

so on (the logical constants for propositional logic), and the sentence letters

P,Q, etc., are used to represent declarative English sentences.

Why ‘formal’? Because we stipulate, in a mathematically rigorous way,

a grammar for the language; that is, we stipulate a mathematically rigorous

definition of the idea of a sentence of this language. Moreover, since we are

only interested in the logical behavior of the chosen logical constants ‘and’,

‘or’, and so on, we choose special symbols (∧,∨ . . . ) for these words only; we

use P,Q, R, . . . indifferently to represent any English sentence whose internal

logical structure we are willing to ignore.3
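To illustrate what a stipulated grammar amounts to, here is a minimal sketch of a recursive definition of ‘sentence’. It is an assumption-laden toy and not the official grammar given later in the book: the class names, the choice of primitive connectives, and the convention of writing outer parentheses are all mine.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Letter:
    name: str              # a sentence letter such as P, Q, R

@dataclass(frozen=True)
class Not:
    body: "Sentence"       # the negated sentence

@dataclass(frozen=True)
class Binary:
    op: str                # one of the symbols in BINARY_OPS
    left: "Sentence"
    right: "Sentence"

Sentence = Union[Letter, Not, Binary]
BINARY_OPS = {"→", "∧", "∨", "↔"}

def is_sentence(x) -> bool:
    """Letters are sentences; negations and binary compounds of
    sentences are sentences; nothing else is."""
    if isinstance(x, Letter):
        return True
    if isinstance(x, Not):
        return is_sentence(x.body)
    if isinstance(x, Binary):
        return (x.op in BINARY_OPS
                and is_sentence(x.left)
                and is_sentence(x.right))
    return False

def show(x) -> str:
    """Write a sentence out as a string, fully parenthesized."""
    if isinstance(x, Letter):
        return x.name
    if isinstance(x, Not):
        return "∼" + show(x.body)
    return "(" + show(x.left) + x.op + show(x.right) + ")"

P, Q, R, S = (Letter(n) for n in "PQRS")
example = Binary("∨", Binary("→", Q, R), Binary("→", Q, Not(S)))
print(show(example))        # ((Q→R)∨(Q→∼S))
print(is_sentence(example)) # True
```

Nothing in the definition mentions what P or Q means, which is the sense in which the sentence letters are used indifferently.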

We go on, then, to study (as always, in a mathematically rigorous way) vari-

ous concepts that apply to the sentences in formal languages. In propositional

logic, for example, one constructs a mathematically rigorous definition of a

2Notes

3. Natural languages like English also have a grammar, and the grammar can be studied

using mathematical techniques. But the grammar is much more complicated, and is discovered

rather than stipulated; and natural languages lack abstractions like the sentence letters.


tautology (“all Trues in the truth table”), and a rigorous definition of a provable

formula (e.g., in terms of a system of deduction, using rules of inference,

assumptions, and so on).
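The “all Trues in the truth table” idea can be sketched in a few lines. The representation below is an assumption made for the illustration: sentence letters are strings, compound formulas are nested tuples, and the connective names "not", "and", "or", "if", and "iff" are labels of my own choosing.

```python
from itertools import product

def evaluate(formula, valuation):
    """Truth value of a formula under an assignment of values to letters.
    A formula is a letter (str) or a tuple (connective, subformula, ...)."""
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    vals = [evaluate(a, valuation) for a in args]
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "if":
        return (not vals[0]) or vals[1]
    if op == "iff":
        return vals[0] == vals[1]
    raise ValueError(f"unknown connective: {op}")

def letters(formula):
    """The set of sentence letters occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(letters(a) for a in formula[1:]))

def truth_table(formula):
    """One (valuation, value) pair per row of the truth table."""
    names = sorted(letters(formula))
    rows = []
    for values in product([True, False], repeat=len(names)):
        valuation = dict(zip(names, values))
        rows.append((valuation, evaluate(formula, valuation)))
    return rows

def is_tautology(formula):
    """All Trues in the final column of the truth table."""
    return all(value for _, value in truth_table(formula))

print(is_tautology(("not", ("and", "S", ("not", "S")))))  # True
print(is_tautology(("iff", "P", ("and", "P", "Q"))))      # False
```

The first formula symbolizes “it’s not the case that snow is white and also not white”; the second is one of the sample sentences of the formal language, which is false when P is true and Q is false.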

Of course, the real goal is to apply the notions of logical consequence and

logical truth to sentences of English and other natural languages. The formal

languages are merely a tool; we need to apply the tool.

1.4 Correctness and application

To apply the tools we develop for formal languages, we need to speak of a

formal system as being correct. What does that sort of claim mean?

As we saw, logicians use formal languages and formal structures to study

logical consequence and logical truth. And the range of structures that one

could in principle study is very wide. For example, I could introduce a new

notion of “provability” by saying “in Ted logic, the following rule may be

used when constructing proofs: if you have P on a line, you may infer ∼P.

The annotation is ‘T’.” I could then go on to investigate the properties of

such a system. Logic can be viewed as a branch of mathematics, and we can

mathematically study any system we like, including a system (like Ted logic) in

which one can “prove” ∼P from P.

But no such formal system would shed light on genuine logical consequence

and genuine logical truth. It would be implausible to claim, for example, that

when we translate an English argument into symbols, the conclusion of the

resulting symbolic argument may be derived in Ted logic from its premises iff

the conclusion of the original English argument is a logical consequence of its

premises.
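One rough way to see why Ted logic is a poor candidate is to test its rule for truth-preservation under truth tables. This is a sketch under a stated assumption: truth-table truth-preservation is used here as a proxy for the kind of correctness at issue, which the text describes in terms of genuine logical consequence in English. The function name is hypothetical.

```python
from itertools import product

def truth_preserving(premise, conclusion, n_letters):
    """True if the conclusion is true on every assignment of truth
    values on which the premise is true."""
    return all(conclusion(*v)
               for v in product([True, False], repeat=n_letters)
               if premise(*v))

# The Ted logic rule: from P, infer ∼P.
ted_rule = truth_preserving(lambda p: p, lambda p: not p, 1)

# For comparison, the familiar rule: from P∧Q, infer P.
and_elim = truth_preserving(lambda p, q: p and q, lambda p, q: p, 2)

print(ted_rule)  # False
print(and_elim)  # True
```

The Ted rule fails on the one assignment that matters, the one on which P is true. The system can still be studied mathematically; the failure counts only against applying it to genuine logical consequence.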

Thus, the existence of a coherent, specifiable logical system must be distinguished

from its application. When we say that a logical system is correct,

we have in mind some application of that system. Here’s an oversimplified

account of one such correctness claim. Suppose we have developed a certain

formal system for constructing proofs in propositional logic. And suppose

we have specified some translation scheme from English into the language

of propositional logic. This translation scheme would translate English ‘and’

into the logical ∧, English ‘or’ into the logical ∨, and so on. Then, the claim

that the formal system gives a correct logic of English ‘and’, ‘or’, etc. might be

taken to be the claim that one English sentence is a logical consequence of

some other English sentences in virtue of ‘and’, ‘or’, etc., iff one can prove the


translation of the former English sentence from the translations of the latter

English sentences in the formal system.

In this book I won’t spend much time on philosophical questions about

which formal systems are correct. My goal is rather to introduce those for-

malisms that are ubiquitous in philosophy, to give you the tools you need to

address such philosophical questions yourself. Still, from time to time, we’ll dip

just a bit into these philosophical questions, in order to motivate our choices

of logical systems to study.

1.5 The nature of logical consequence

The previous section discussed what it means to say that a formal theory gives a

correct account of logical consequence (as applied to sentences of English and

other natural languages). But what is it for sentences to stand in the relation of

logical consequence? What is logical consequence?

The question here is a philosophical question, as opposed to a mathematical

one. Logicians define various notions concerning sentences of formal languages:

derivability in this or that proof-system, “all trues in the truth table”, and so

on. They thereby stipulatively introduce various formal concepts. These

formal concepts are good insofar as they correctly model logical truth and

logical consequence. But in what do logical truth and logical consequence—the

intuitive concepts, as opposed to the stipulatively introduced concepts—consist?

This is one of the core questions of philosophical logic.

This book is not primarily a book in philosophical logic, so we won’t spend

much time on the question. However, I do want to make clear that the question

is indeed a question. The question is sometimes obscured by the fact that terms

like ‘logical truth’ are often stipulatively defined in logic books. This can lead

to the belief that there are no genuine issues concerning these notions. It is also

obscured by the fact that one philosophical theory of these notions—the model-

theoretic one—is so dominant that one can forget that it is a nontrivial theory.

Stipulative definitions are of course not things whose truth can be questioned;

but stipulative definitions of logical notions are good insofar as the stipulated

notions accurately model the real, intuitive, nonstipulated notions of logical

consequence and logical truth. Further, the stipulated de�nitions generally

concern formal languages, whereas the ultimate goal is an understanding of

correct reasoning of the sort that we actually do, using natural languages.

Let’s focus just on logical consequence. Here is a quick survey of some


competing philosophical accounts of its nature. Probably the most standard

account is the semantic, or model-theoretic one. Intuitively, a logical truth is

“true no matter what”. The model-theoretic account is one way of making this

slogan precise. It says that φ is a logical consequence of the sentences in set Γ

if the formal translation of φ is true in every model (interpretation) in which

the formal translations of the members of Γ are true. This account needs to be

spelled out in various ways. First, “formal translations” are translations into a

formal language; but which formal language? It will be a language that has a

logical constant for each English logical expression. But that raises the question

of which expressions of English are logical expressions. In addition to ‘and’,

‘or’, ‘all’, and so on, are any of the following logical expressions?

necessarily

it will be the case that

most

it is morally wrong that

Further, the notion of translation must be defined, and an appropriate

definition of ‘model’ must be chosen.

Similar issues of refinement confront a second account, the proof-theoretic

account: φ is a logical consequence of the members of Γ iff the translation of

φ is provable from the translations of the members of Γ. We must decide what

formal language to translate into, and we must decide upon an appropriate

account of provability.

A third view is Quine’s: φ is a logical consequence of the members of Γ

iff there is no way to (uniformly) substitute new nonlogical expressions for

nonlogical expressions in φ and the members of Γ so that the members of Γ

become true and φ becomes false.

Three other accounts should be mentioned. The first account is a modal

one. Say that Γ modally implies φ iff it is not possible for φ to be false while the

members of Γ are true. (What does ‘possible’ mean here? There are many kinds

of possibility one might have in mind: so-called “metaphysical possibility”,

“absolute possibility”, “idealized epistemic possibility”…. Clearly the accept-

ability of the proposal depends on the legitimacy of these notions. We discuss

modality later in the book, beginning in chapter 6.) One might then propose

that φ is a logical consequence of the members of Γ iff Γ modally implies φ. (An

intermediate proposal: φ is a logical consequence of the members of Γ iff, in

virtue of the forms of φ and the members of Γ, Γ modally implies φ. More carefully:

φ is a logical consequence of the members of Γ iff Γ modally implies φ, and

moreover, whenever Γ′ and φ′ result from Γ and φ by (uniform) substitution of

nonlogical expressions, Γ′ modally implies φ′. This is like Quine’s definition,

but with modal implication in place of truth-preservation.) Second, there is

a primitivist account, according to which logical consequence is a primitive

notion. Third, there is a pluralist account according to which there is no one

kind of genuine logical consequence. There are, of course, the various con-

cepts proposed by each account, each of which is trying to capture genuine

logical consequence; but in fact there is no further notion of genuine logical

consequence at all; there are only the proposed construals.

As I say, this is not a book on philosophical logic, and so we will not inquire

further into which (if any) of these accounts is correct. We will, rather, focus

exclusively on two kinds of formal proposals for modeling logical consequence

and logical truth: model-theoretic and proof-theoretic proposals.

1.6 Extensions, deviations, variations⁴

“Standard logic” is what is usually studied in introductory logic courses. It

includes propositional logic (logical constants: ∧,∨,∼,→,↔), and predicate

logic (logical constants: ∀,∃, variables). In this book we’ll consider various

modifications of standard logic:

1.6.1 Extensions

Here we add to standard logic. We add both:

· new symbols

· new cases of logical consequence and logical truth that we can model

We do this in order to get a better representation of logical consequence. There

is more to logic than that captured by plain old standard logic.

⁴ See Gamut (1991a, pp. 156-158).

We extended propositional logic, after all, to get predicate logic. You can

do a lot with propositional logic, but you can’t capture the obvious fact that

‘Ted is happy’ logically implies ‘someone is happy’ using propositional logic

alone. It was for this reason that we added quantifiers, variables, predicates,

etc., to propositional logic (added symbols), and added means to deal with

these new symbols in semantics and proof theory (new cases of logical conse-

quence and logical truth). But there is no need to stop with plain old predicate

logic. We will consider adding symbols for identity, function symbols, and

definite descriptions to predicate logic, for example, and we’ll add a sign for

“necessarily” when we get to modal logic. And in each case, we’ll introduce

modifications to our formal theories that let us account for logical truths and

logical consequences involving the new symbols.

1.6.2 Deviations

Here we change, rather than add. We retain the same symbols from standard

logic, but we alter standard logic’s proof theory and semantics. We therefore

change what we say about the logical consequences and logical truths that

involve the symbols.

Why do this? Perhaps because we think that standard logicians are wrong

about what the right logic for English is. If we want to correctly model logical

consequence in English, therefore, we must construct systems that behave

differently from standard logic.

For example, in the standard semantics for propositional logic, every for-

mula is either true or false. But some have argued that natural language sen-

tences like the following are neither true nor false:

The king of the United States is bald

Sherlock Holmes weighs more than 178 pounds

Bill Clinton is tall

There will be a sea battle tomorrow

If this is correct, then perhaps we should abandon the standard semantics for

propositional logic in favor of multi-valued logic, in which formulas are allowed

to be neither true nor false.

1.6.3 Variations

Here we also change standard logic, but we change the notation without

changing the content of logic. We study alternate ways of expressing the same

thing.


For example, in intro logic we show how:

∼(P∧Q)

∼P∨∼Q

are two different ways of saying the same thing. We will study other ways of

saying what those two sentences say, including:

P | Q

∼∧PQ

In the first case, | is a new symbol for “not both”. In the second case (“Polish

notation”), the ∼ and the ∧ mean what they mean in standard logic; but instead

of going between the P and the Q, the ∧ goes before P and Q. The value of

this, as we’ll see, is that we no longer will need parentheses.
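The contrast between standard and Polish notation can be made concrete with a small sketch. It assumes, purely for illustration, that a formula is stored as a nested Python tuple; that encoding and the helper names `infix` and `polish` are hypothetical and do not come from the text.

```python
# Formulas as nested tuples (an assumed encoding, not the book's):
#   "P"               a sentence letter
#   ("∼", phi)        negation
#   ("∧", phi, psi)   conjunction

def infix(phi):
    """Standard notation: a binary connective goes between its two parts."""
    if isinstance(phi, str):
        return phi
    if phi[0] == "∼":
        return "∼" + infix(phi[1])
    return "(" + infix(phi[1]) + phi[0] + infix(phi[2]) + ")"

def polish(phi):
    """Polish notation: every connective goes before its parts."""
    if isinstance(phi, str):
        return phi
    return phi[0] + "".join(polish(part) for part in phi[1:])

not_both = ("∼", ("∧", "P", "Q"))
print(infix(not_both))    # ∼(P∧Q)
print(polish(not_both))   # ∼∧PQ
```

Note that `polish` never emits a parenthesis: the position of each connective already settles what it applies to.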

1.7 Metalogic, metalanguages, and formalization

In introductory logic, we learned how to use certain logical systems. We learned

how to do truth tables, construct derivations, and so on. But logicians do not

spend much of their time developing systems only to sit around all day doing

derivations in those systems. As soon as a logician develops a new system, he

or she will begin to ask questions about that system. For an analogy, imagine

people who make up games. They might invent a new version of chess. Now,

they might spend some time actually playing the new game. But if they were

like logicians, they would soon get bored with this and start asking questions

about the game, such as: “Is the average length of this new game longer than the

average length of a game of standard chess?” “Is there any strategy one could

pursue which will guarantee a victory?” Analogously, logicians ask questions

like: what things can be proved in such and such a system? Can you prove the

same things in this system as in system X? Proving things about logical systems

is part of “metalogic”, which is an important part of logic.

One particularly important question of metalogic is that of soundness and

completeness. Standard textbooks introduce a pair of methods for characterizing

logical truth for the formulas of propositional logic. One is semantic: a formula

is a semantic logical truth iff the truth table for that formula has all “trues” in its

final column. Another is proof-theoretic: a sentence is a proof-theoretic logical

truth iff there exists a derivation of it (from no premises), where a derivation

is then appropriately defined. (Think: introduction and elimination rules,

conditional and indirect proof, and so on.) The question of soundness and

completeness is: how do these two methods for characterizing logical truth

relate to each other? The question is answered, in the case of propositional

logic, by the following metalogical results, which are proved in standard books

on metalogic:

Soundness of propositional logic: In propositional logic, any proof-theoretic

logical truth is a semantic logical truth

Completeness of propositional logic: In propositional logic, any semantic

logical truth is a proof-theoretic logical truth

These are really interesting claims! They show that the method of truth tables

and the method of constructing derivations amount to the same thing, as

applied to symbolic formulas of propositional logic. One can establish similar

results for standard predicate logic.

A couple of remarks about proving things in metalogic.

First: what do we mean by “proving”? We do not mean: constructing a

derivation in the logical system we’re investigating. We’re trying to construct a

proof about the system. We do this in English, and we do it with informal (though

rigorous!) reasoning of the sort one would encounter in a mathematics book.

Logicians often distinguish the “object language” from the “metalanguage”.

The object language is the language that’s being studied—the language of

propositional logic, for example. Sentences of this object language look like

this:

P∧Q

∼(P∨Q)↔R

The metalanguage is the language we use to talk about the object language.

In the case of the present book, the metalanguage is English. Here are some

example sentences of the metalanguage:

‘P∧Q’ is a sentence with three symbols, one of which is

a logical constant

Every sentence of propositional logic has the same num-

ber of left parentheses as right parentheses


If there exists a derivation of a formula, then its truth

table contains all “trues” in its final column (i.e., sound-

ness)

Thus, we formulate metalogical claims in the metalanguage, and our proofs in

metalogic take place in the metalanguage.

Second: to get anywhere in metalogic, we will have to get picky about a few

things about which one can afford to be lax in introductory logic. Let’s look at

soundness, for instance. To be able to prove this, in a mathematically rigorous

way, we’ll need to have the terms in it defined very carefully. In particular, we’ll

need to say exactly what we mean by ‘sentence of propositional logic’, ‘truth

tables’, and ‘derived’. Defining these terms precisely (another thing we’ll do

using English, the metalanguage!) is known as formalizing logic. Our first task

will be to formalize propositional logic.

1.8 Set theory⁵

As mentioned above, modern logic uses mathematical techniques to study

formal languages. The mathematical techniques in question are those of “set

theory”. Only the most elementary set-theoretic concepts and assumptions will

be needed, and you are probably already familiar with them; but nevertheless,

here is a brief overview.

Sets have members. Consider, for example, the set, N, of natural numbers.

Each natural number is a member of N: 1 is a member of N, 2 is a member of N,

and so on. We use the expression “∈” for this relationship of membership; thus,

we can say: 1 ∈ N, 2 ∈ N, and so on. We often name a set by putting names of

its members between braces: “{1, 2, 3, 4, …}” is another name of N.

Sets are not limited to sets of mathematical entities; anything can be a

member of a set. Thus, we may speak of the set of people, the set of cities,

or—to draw nearer to our intended purpose—the set of sentences in a given

language.

There is also the empty set, ∅. This is the one set with no members. That

is, for each object u, u is not a member of ∅ (i.e.: for each u, u ∉ ∅.)

⁵ Supplementary reading: the beginning of Enderton (1977)

Though the notion of a set is an intuitive one, it is deeply perplexing. This

can be seen by reflecting on the Russell Paradox, discovered by Bertrand Russell,

the great philosopher and mathematician. Let us call R the set of all and only

those sets that are not members of themselves. For short, R is the set of non-

self-members. Russell asks the following question: is R a member of itself?

There are two possibilities:

· R ∉ R. Thus, R is a non-self-member. But R was said to be the set of all

non-self-members, and so we’d have R ∈ R. Contradiction.

· R ∈ R. So R is not a non-self-member. R, by definition, contains only

non-self-members. So R ∉ R. Contradiction.

Thus, each possibility leads to a contradiction. But there are no remaining

possibilities—either R is a member of itself or it isn’t! So it looks like the very

idea of sets is paradoxical.

The modern discipline of axiomatic set theory arose in part to develop a

notion of sets that isn’t subject to this sort of paradox. This is done by imposing

rigid restrictions on when a given “condition” picks out a set. In the example

above, the condition “is a non-self-member” will be ruled out—there’s no set

of all and only the things satisfying this condition. The details of set theory are

beyond the scope of this course; for our purposes, we’ll help ourselves to the

existence of sets, and not worry about exactly what sets are, or how the Russell

paradox is avoided.

Various other useful set-theoretic notions can be defined in terms of the

notion of membership. We say that A is a subset of B (“A ⊆ B”) when every

member of A is a member of B. We say that the intersection of A and B (“A ∩ B”)

is the set that contains all and only those things that are in both A and B, and

that the union of A and B (“A ∪ B”) is the set containing all and only those things

that are members of either A or B.

Suppose we want to refer to the set of the so-and-sos—that is, the set

containing all and only objects, u, that satisfy the condition “so-and-so”. We’ll

do this with the term “{u : u is a so-and-so}”. Thus, we could write: “N = {u :

u is a natural number}”. And we could restate the definitions of ∩ and ∪ from

the previous paragraph as follows:

A ∩ B = {u : u ∈ A and u ∈ B}

A ∪ B = {u : u ∈ A or u ∈ B}
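These set-theoretic notions can be tried out directly in a language with built-in finite sets. The following is only a sketch: the particular sets `A` and `B` and the small `universe` that the variable u ranges over are made up for illustration, and Python sets are finite, unlike N.

```python
# Small finite stand-ins for the sets discussed in the text.
A = {1, 2, 3}
B = {3, 4}
universe = A | B | {5}     # hypothetical domain for the variable u

is_member = 1 in A         # membership: 1 ∈ A
is_subset = {1, 2} <= A    # subset: {1, 2} ⊆ A
empty_is_subset = set() <= A   # ∅ is a subset of every set

# Set-builder definitions, mirroring {u : u ∈ A and u ∈ B} and {u : u ∈ A or u ∈ B}
intersection = {u for u in universe if u in A and u in B}
union = {u for u in universe if u in A or u in B}

print(intersection == A & B)   # agrees with Python's built-in intersection
print(union == A | B)          # agrees with Python's built-in union
```

The comprehension syntax is close to the “{u : u is a so-and-so}” notation, except that Python requires an explicit domain to draw u from.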

Sets have members, but they don’t contain them in any particular order. For

example, the set containing me and Bill Clinton doesn’t have a “first” member.

This is reflected in the fact that “{Ted, Clinton}” and “{Clinton, Ted}” are

two different names for the same set—the set containing just Clinton and Ted.

But sometimes we need to talk about a set-like thing containing Clinton and

Ted, but in a certain order. For this purpose, logicians use ordered sets. Two-

membered ordered sets are called ordered pairs. To name the ordered pair of

Clinton and Ted, we use: “⟨Clinton, Ted⟩”. Here, the order is significant, for

⟨Clinton, Ted⟩ and ⟨Ted, Clinton⟩ are not the same thing. The three-membered

ordered set of u, v, and w (in that order) is written: ⟨u, v, w⟩; and similarly for

ordered sets of any finite size. An n-membered ordered set is called an n-tuple.

Let’s even allow 1-tuples: let’s define the 1-tuple ⟨u⟩ as being the object u itself.
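The difference between sets and ordered sets has a rough analogue in Python's built-in sets and tuples. This is an informal illustration, not part of the text's theory, and it differs on one point: Python does not identify a 1-tuple with its sole member, whereas the convention just adopted does.

```python
# Sets ignore order; tuples do not.
same_set = {"Ted", "Clinton"} == {"Clinton", "Ted"}
same_pair = ("Clinton", "Ted") == ("Ted", "Clinton")
print(same_set)     # True: two names for one set
print(same_pair)    # False: two different ordered pairs

triple = ("u", "v", "w")    # a 3-tuple
one_tuple = ("u",)          # Python's 1-tuple
print(one_tuple == "u")     # False: unlike the text's stipulation that ⟨u⟩ is u itself
```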

In addition to sets, and ordered sets, we’ll need a further related concept:

that of a function. A function is a rule that “takes in” an object or objects,

and “spits out” a further object. For example, the addition function is a rule

that takes in two numbers, and spits out their sum. As with sets and ordered

sets, functions are not limited to mathematical entities: they can “take in” and

“spit out” any objects whatsoever. We can speak of the father-of function, for

example, which is a rule that takes in a person, and spits out the father of that

person. And later in this book we will be considering functions that take in

and spit out linguistic entities: sentences and parts of sentences from formal

languages.

Each function has a fixed number of “places”: a fixed number of objects

it must take in before it is ready to spit out something. You need to give

the addition function two arguments (numbers) in order to get it to spit out

something, so it is called a two-place function. You only need to give the father-

of function one object, on the other hand, to get it to spit out something, so it

is a one-place function.

The objects that the function takes in are called its arguments, and the object

it spits out is called its value. Suppose f is an n-place function, and u₁ … uₙ are

n of its arguments; one then writes “f(u₁ … uₙ)” for the value of function f as

applied to arguments u₁ … uₙ. f(u₁ … uₙ) is the object that f spits out, if you

feed it u₁ … uₙ. For example, where f is the father-of function, since Ron is

my father, we can write: f(Ted) = Ron; and, where a is the addition function,

we can write: a(2,3) = 5.

There’s a trick for “reducing” talk of both ordered pairs and functions to

talk of sets. One first defines ⟨u, v⟩ as the set {{u},{u, v}}; one defines ⟨u, v, w⟩

as ⟨u, ⟨v, w⟩⟩, and similarly for n-membered ordered sets, for each positive

integer n. And, finally, one defines an n-place function as a set, f, of (n+1)-

tuples obeying the constraint that if ⟨u₁, …, uₙ, v⟩ and ⟨u₁, …, uₙ, w⟩ are both

members of f, then v = w; f(u₁, …, uₙ) is then defined as the object, v, such

that ⟨u₁, …, uₙ, v⟩ ∈ f. Thus, ordered sets and functions are defined as certain

sorts of sets. The trick of the definition of ordered pairs is that we put the

set together in such a way that we can look at the set and tell what the first

member of the ordered pair is: it’s the one that “appears twice”. Similarly, the

trick of the definition of a function is that we can take any arguments to the

function, look at the set that is identified with the function, and figure out

what value the function spits out for those arguments. But the technicalities of

these reductions won’t matter for us; I’ll just feel free to speak of ordered pairs,

triples, functions, etc., without defining them as sets.
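For readers who like to see such reductions run, here is a minimal sketch of both tricks. The helper names (`pair`, `first`, `is_function`, `apply`) are hypothetical. As a shortcut, the function part uses Python's built-in tuples rather than nesting set-theoretic pairs, and the addition function is restricted to a small made-up domain so that the set is finite.

```python
def pair(u, v):
    """The ordered pair <u, v> as the set {{u}, {u, v}}."""
    return frozenset({frozenset({u}), frozenset({u, v})})

def first(p):
    """The first member is the one that belongs to every member of the pair."""
    (u,) = frozenset.intersection(*p)
    return u

def is_function(f):
    """A set of (n+1)-tuples is a function iff the arguments fix the value."""
    seen = {}
    for *args, value in f:
        if seen.setdefault(tuple(args), value) != value:
            return False
    return True

def apply(f, *args):
    """f(u1, ..., un): the v such that <u1, ..., un, v> is a member of f."""
    (value,) = {t[-1] for t in f if t[:-1] == args}
    return value

# Addition on the (assumed) domain {0, 1, 2, 3}, as a set of triples.
add = {(a, b, a + b) for a in range(4) for b in range(4)}

print(pair(1, 2) == pair(2, 1))   # False: order is recoverable from the set
print(first(pair(1, 2)))          # 1
print(apply(add, 2, 3))           # 5
```

A set such as `{(1, 2), (1, 3)}` fails `is_function`, since the argument 1 would be assigned two different values.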

Chapter 2

Propositional Logic

We begin with the simplest logic commonly studied: propositional logic.

Despite its simplicity, it has great power and beauty.

2.1 Grammar of PL

Modern logic has made great strides by treating the language of logic as a

mathematical object. To do so, grammar needs to be developed rigorously.

(Our study of a new logical system will always begin with grammar.)

If all you want to do is understand the language of logic informally, and be

able to use it effectively, you don’t really need to get so careful about grammar.

For even if you haven’t ever seen the grammar of propositional logic formalized,

you can recognize that things like this make sense:

P→Q

R∧(∼S↔P)

Whereas things like this do not:

→PQR∼

(P∼Q∼(∨

But to make any headway in metalogic, we will need more than an intuitive

understanding of what makes sense and what does not; we will need a precise

definition that has the consequence that only the strings of symbols in the first

group “make sense”.


Grammatical formulas (i.e., ones that “make sense”) are called well-formed

formulas, or “wffs” for short. We define these by first carefully defining exactly

which symbols are allowed to occur in wffs (the “primitive vocabulary”), and

second, carefully defining exactly which strings of these symbols count as wffs:

Primitive vocabulary:

· Sentence letters: P,Q, R . . . , with or without numerical subscripts

· Connectives: →, ∼

· Parentheses: ( , )

Definition of wff:

i) Sentence letters are wffs

ii) If φ and ψ are wffs then (φ→ψ) and ∼φ are also wffs

iii) Only strings that can be shown to be wffs using i) and ii) are wffs

We will be discussing a number of different logical systems throughout this

book, with differing notions of grammar. What we have defined here is the

notion of a wff for one particular language, the language of PL. So strictly, we

should speak of PL-wffs. But I’ll just say “wff” where there is no danger

of ambiguity.

Notice an interesting feature of this definition: the very expression we

are trying to define, ‘wff’, appears on the right hand side of clause ii) of the

definition. In a sense, we are using the expression ‘wff’ in its own definition.

Is that “circular”? Not in any objectionable way. This definition is what is

called a “recursive” definition, and recursive definitions are legitimate despite

this sort of circularity. The reason is that clause ii) defines the notion of a

wff for certain complex expressions (namely, ∼φ and φ→ψ) in terms of the

notion of a wff as applied to smaller expressions (φ and ψ). These smaller

expressions may themselves be complex, and therefore may have their statuses

as wffs determined, via clause ii), in terms of yet smaller expressions, and so on.

But eventually this procedure will lead us to clause i), not clause ii). And clause

i) is not circular: in that clause, we do not appeal to the notion of a wff in its

own definition; we rather say directly that sentence letters are wffs. Recursive

definitions always “bottom out” in this way; they always include a clause (called

the “base” clause) like i).


Think of this procedure in reverse: we begin with the smallest wffs (sentence

letters), and build up complex wffs using clause ii). Example: we can use clauses

i) and ii) to show that the expression (∼P→(P→Q)) is a wff:

· P is a wff (clause i))

· so, ∼P is a wff (clause ii))

· Q is a wff (clause i))

· so, since P and Q are both wffs, (P→Q) is also a wff (clause ii))

· so, since ∼P and (P→Q) are both wffs, (∼P→(P→Q)) is also a wff (clause

ii))
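The recursive definition translates almost clause by clause into a recursive recognizer. The sketch below rests on two assumptions that are not in the text: a sentence letter is written as one capital letter followed by optional digits standing in for a subscript, and clause ii) writes the conditional with its outer parentheses, as in the worked example (P→Q). The names `parse` and `is_wff` are hypothetical.

```python
import re

# Assumption: one capital letter, optionally followed by digits (the subscript).
LETTER = re.compile(r"[A-Z][0-9]*")

def parse(s, i=0):
    """Return the index just past one wff starting at s[i], or None if there is none."""
    if i >= len(s):
        return None
    m = LETTER.match(s, i)
    if m:                              # clause i): a sentence letter
        return m.end()
    if s[i] == "∼":                    # clause ii): ∼φ
        return parse(s, i + 1)
    if s[i] == "(":                    # clause ii): (φ→ψ)
        j = parse(s, i + 1)
        if j is None or j >= len(s) or s[j] != "→":
            return None
        k = parse(s, j + 1)
        if k is None or k >= len(s) or s[k] != ")":
            return None
        return k + 1
    return None                        # clause iii): nothing else is a wff

def is_wff(s):
    """A string is a wff iff a single wff spans the whole string."""
    return parse(s) == len(s)

print(is_wff("(∼P→(P→Q))"))   # True
print(is_wff("P∼Q∼R"))        # False
```

Each recursive call works on a strictly shorter remainder of the string, which is the computational counterpart of the observation that the definition bottoms out in clause i).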

What’s the point of clause iii)? Clauses i) and ii) provide only sufficient

conditions for being a wff, and therefore do not on their own exclude non-

sense combinations of primitive vocabulary like P∼Q∼R, or even strings like

(P∨147)→⊕ that include disallowed symbols. Clause iii) rules these strings out,

since there is no way to build up either of these strings from clauses i) and ii),

in the way that we built up the wff (∼P→(P→Q)).

What happened to ∧, ∨, and ↔? Our definition of a wff mentions only

→ and ∼; it therefore counts expressions like P∧Q, P∨Q, and P↔Q as not

being wffs. Answer: we can define the ∧, ∨, and ↔ in terms of ∼ and →:

Definitions of ∧, ∨, and ↔:

· “φ∧ψ” is short for “∼(φ→∼ψ)”

· “φ∨ψ” is short for “∼φ→ψ”

· “φ↔ψ” is short for “(φ→ψ)∧(ψ→φ)”

So, whenever we subsequently write down an expression that includes one of

the defined connectives, we can regard it as being short for an expression that

includes only the official primitive connectives, ∼ and →. (We will show below

that the above definitions are good ones; in short, they are good because they

generate the correct truth tables for ∧, ∨, and ↔.)
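Treating the defined connectives as abbreviations amounts to a mechanical rewriting procedure, which can be sketched as follows. The nested-tuple encoding of formulas and the names `expand` and `show` are assumptions made for illustration only.

```python
def expand(phi):
    """Rewrite a formula containing ∧, ∨, ↔ into one using only ∼ and →."""
    if isinstance(phi, str):                 # sentence letter
        return phi
    op, *parts = phi
    parts = [expand(p) for p in parts]       # unabbreviate the parts first
    if op == "∼":
        return ("∼", parts[0])
    a, b = parts
    if op == "→":
        return ("→", a, b)
    if op == "∧":                            # φ∧ψ is short for ∼(φ→∼ψ)
        return ("∼", ("→", a, ("∼", b)))
    if op == "∨":                            # φ∨ψ is short for ∼φ→ψ
        return ("→", ("∼", a), b)
    if op == "↔":                            # φ↔ψ is short for (φ→ψ)∧(ψ→φ)
        return expand(("∧", ("→", a, b), ("→", b, a)))
    raise ValueError(f"unknown connective: {op!r}")

def show(phi):
    """Print a tuple-encoded formula with full parentheses."""
    if isinstance(phi, str):
        return phi
    if phi[0] == "∼":
        return "∼" + show(phi[1])
    return "(" + show(phi[1]) + phi[0] + show(phi[2]) + ")"

print(show(expand(("∧", "P", "Q"))))   # ∼(P→∼Q)
print(show(expand(("∨", "P", "Q"))))   # (∼P→Q)
print(show(expand(("↔", "P", "Q"))))   # ∼((P→Q)→∼(Q→P))
```

The ↔ case shows why the abbreviations compose: its definition uses ∧, which is itself unabbreviated in a second pass.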

Our choice to begin with → and ∼ as our primitive connectives was arbitrary.

We could have started with ∼ and ∧, and defined the others as follows:

· “φ∨ψ” is short for “∼(∼φ∧∼ψ)”

· “φ→ψ” is short for “∼(φ∧∼ψ)”

· “φ↔ψ” is short for “(φ→ψ)∧(ψ→φ)”

And other alternate choices are possible. We’ll talk about this later.

So: → and ∼ are our primitive connectives; the others are defined. Why

do we choose only a small number of primitive connectives? Because, as we

will see, it makes meta-proofs easier.

2.2 The semantic approach to logic

In the next section we will introduce a “semantics” for propositional logic. A

semantics for a language is a way of assigning meanings to words and sentences

of that language. For us, the central notion of meaning will be that of truth.

Roughly speaking, our approach will be to define, for each wff of propositional

logic, the circumstances in which it is true.

Philosophers disagree over how to understand the notion of meaning in

general. But the idea that the meaning of a sentence has something to do with

truth-conditions is hard to deny, and at any rate has currency within logic. On

this approach, one explains the meaning of a sentence by showing how that

sentence depends for its truth or falsity on the way the world is.

We will provide a truth-conditional semantics for a symbolic (formal) lan-

guage in two stages. First, we will define mathematical models of the various

configurations the world could be in. Second, we will define the conditions

under which a sentence of the symbolic language is true in one of these mathe-

matical configurations.

These definitions will have two main benefits. First, they will illuminate

meaning. In logic, the symbols of symbolic languages are typically intended to

represent bits of natural language. The PL connectives, for example, represent

‘and’, ‘or’, and so on. If the definitions are well-constructed, then the ways

in which the configurations render symbolic sentences true and false will be

parallel to the ways in which the real world renders corresponding natural

language sentences true and false. The definitions will therefore shed light on

the meanings of the natural language sentences represented by our symbolic

language. Second, our definitions will allow us to construct a precise theory (of

the semantic/model-theoretic variety) of logical consequence and logical truth.

The semantic conception is a way of making precise the idea that a logical truth


is a sentence that is “true no matter what”, and the idea that one sentence is a

logical consequence of some other sentences iff there is “no way” for the latter

sentences to be true without the former sentence being true. We will use our

definitions to make these rough statements more precise: we will say that one

formula is a logical consequence of others iff there is no configuration in which

the latter formulas are true but the former is not, and that a formula is a logical

truth iff it is true in all configurations.
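To see the shape of these two definitions, here is a rough sketch in which a configuration is simply an assignment of True or False to a few sentence letters, and a formula is modeled as a Python function from configurations to truth values. Both modeling choices, and all the names used, are assumptions made for illustration; the official definitions for propositional logic come in the next section.

```python
from itertools import product

def configurations(letters):
    """Every assignment of True/False to the given sentence letters."""
    for values in product([True, False], repeat=len(letters)):
        yield dict(zip(letters, values))

def is_logical_truth(formula, letters):
    """True in all configurations."""
    return all(formula(c) for c in configurations(letters))

def is_consequence(premises, conclusion, letters):
    """No configuration makes every premise true and the conclusion not true."""
    return not any(
        all(p(c) for p in premises) and not conclusion(c)
        for c in configurations(letters)
    )

# Formulas as functions of a configuration (an assumed modeling choice).
p = lambda c: c["P"]
q = lambda c: c["Q"]
p_or_not_p = lambda c: c["P"] or not c["P"]
if_p_then_q = lambda c: (not c["P"]) or c["Q"]

print(is_logical_truth(p_or_not_p, ["P"]))               # True
print(is_logical_truth(p, ["P"]))                        # False
print(is_consequence([p, if_p_then_q], q, ["P", "Q"]))   # True
```

The search is exhaustive, so it only works because finitely many letters yield finitely many configurations.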

2.3 Semantics of PL

Our semantics for propositional logic is really just truth tables, only presented

a little more carefully than in introductory logic books. What a truth table of a

formula does is depict how the truth value of that formula is determined by the

truth values of its sentence letters, for each possible combination of truth values

for its sentence letters. To do this nonpictorially, we need to define a notion

corresponding to “a possible combination of truth values for sentence letters”.

Definition of interpretation: A PL-interpretation is a function I that

assigns either 1 or 0 to every sentence letter

As with the notion of a wff, we will have different definitions of interpretations

for different logical systems, so strictly we must speak of PL-interpretations.

But usually it will be fine to speak simply of interpretations when it’s clear

which system is at issue.

The numbers 0 and 1 are our truth values. So an interpretation assigns

truth values to sentence letters. Instead of saying “let P be false, and Q be

true”, we can say: let I be an interpretation such that I(P) = 0 and I(Q) = 1.

Once we settle what truth values a given interpretation assigns to the sen-

tence letters, the truth values of complex sentences containing those sentence

letters are thereby fixed. The usual, informal, method for showing exactly how

those truth values are fixed is by giving truth tables for each connective. The

standard truth tables for the → and ∼ are the following:¹

→ | 1  0
--+------
1 | 1  0
0 | 1  1

∼ |
--+---
1 | 0
0 | 1

What we will do, instead, is write out a formal definition of a function—the

valuation function—that assigns truth values to complex sentences as a function

of the truth values of their sentence letters—i.e., as a function of a given

interpretation I. But the idea is the same as the truth tables: truth tables are

really just pictures of the definition of a valuation function:

Definition of valuation: For any PL-interpretation, I, the PL-valuation

for I, V_I, is defined as the function that assigns to each wff either 1 or 0, and

which is such that, for any sentence letter α and any wffs φ and ψ,

V_I(α) = I(α)

V_I(φ→ψ) = 1 iff either V_I(φ) = 0 or V_I(ψ) = 1

V_I(∼φ) = 1 iff V_I(φ) = 0

We have another recursive definition: the valuation function’s values for com-

plex formulas are determined by its values for smaller formulas; and this pro-

cedure bottoms out in the values for sentence letters, which are determined

directly by the interpretation function I.
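The recursive character of the valuation function is easy to see when it is written as a recursive program. The following sketch assumes that wffs are encoded as nested tuples and that an interpretation is a Python dictionary covering just the sentence letters in play (an official interpretation assigns a value to every sentence letter). The name `valuation` is hypothetical.

```python
def valuation(I, phi):
    """The valuation for interpretation I, for wffs built from letters, ∼ and →."""
    if isinstance(phi, str):          # sentence letter: its value is what I assigns
        return I[phi]
    if phi[0] == "∼":                 # ∼φ gets 1 iff φ gets 0
        return 1 if valuation(I, phi[1]) == 0 else 0
    if phi[0] == "→":                 # φ→ψ gets 1 iff φ gets 0 or ψ gets 1
        return 1 if valuation(I, phi[1]) == 0 or valuation(I, phi[2]) == 1 else 0
    raise ValueError(f"not a primitive connective: {phi[0]!r}")

I = {"P": 0, "Q": 1}                             # I(P) = 0 and I(Q) = 1
print(valuation(I, ("→", "P", "Q")))             # 1
print(valuation(I, ("→", "Q", "P")))             # 0
print(valuation(I, ("∼", ("→", "Q", "P"))))      # 1
```

Each recursive call is made on a proper part of the formula, so the computation always ends at the sentence letters, where the dictionary lookup plays the role of I.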

Notice also that in the definition of a valuation function I use the English

logical connectives ‘either…or’, and ‘iff’. I used these English connectives

rather than the logical connectives ∨ and ↔, because at that point I was not

writing down wffs of the language of study (in this case, the language of propo-

sitional logic). I was rather using sentences of English—our metalanguage, the

informal language we’re using to discuss the formal language of propositional

logic—to construct my definition of the valuation function. My definition

needed to employ the logical notions of disjunction and bi-implication, the

English words for which are ‘either…or’ and ‘iff’.

¹ The → table, for example, shows what truth value φ→ψ takes on depending on the truth

values of its parts. Rows correspond to truth values for φ, columns to truth values for ψ. Thus,

to ascertain the truth value of φ→ψ when φ is 1 and ψ is 0, we look in the 1 row and the 0

column. The listed value there is 0—the conditional is false in this case. The ∼ table lacks

multiple columns because ∼ is a one-place connective.


One might again worry that something circular is going on. We defined the

symbols for disjunction and bi-implication, ∨ and ↔, in terms of ∼ and → in

section 2.1, and now we’ve defined the valuation function in terms of disjunction

and bi-implication. So haven’t we given a circular definition of disjunction

and bi-implication? No. When we define the valuation function, we’re not

trying to define logical concepts such as negation, conjunction, disjunction,

implication, bi-implication, and so on, at all. A reductive definition of these

very basic concepts is probably impossible (though one can define some of them

in terms of the others). What we are doing is starting with the assumption that

we already understand the logical concepts, and then using those notions to

provide a formalized semantics for a logical language. This can be put in terms

of object- and meta-language: we use metalanguage connectives, such as ‘iff’

and ‘or’, which we simply take ourselves to understand, to provide a semantics

for the object language connectives ∼,→, etc.

Back to the definition of the valuation function. The definition applies

only to official wffs, which can contain only the primitive connectives → and

∼. But sentences containing ∧, ∨, and ↔ are abbreviations for official wffs,

and therefore they too are governed by the definition. In fact, given the

abbreviations defined in section 2.1, one can show that the definition assigns

the intuitively correct truth values to sentences containing ∧, ∨, and ↔; one

can show that for any PL-interpretation I, and any wffs ψ and χ,

V_I(ψ∧χ) = 1 iff V_I(ψ) = 1 and V_I(χ) = 1

V_I(ψ∨χ) = 1 iff either V_I(ψ) = 1 or V_I(χ) = 1

V_I(ψ↔χ) = 1 iff V_I(ψ) = V_I(χ)

I’ll show that the first statement is true here; the others are exercises for the reader. I’ll write out my proof in excessive detail, to make it clear exactly how the reasoning works.

Proof that ∧ gets the right truth condition. Let ψ and χ be any wffs. The expression ψ∧χ is an abbreviation for the expression ∼(ψ→∼χ). So we want to show that, for any PL-interpretation I, V_I(∼(ψ→∼χ)) = 1 iff V_I(ψ) = 1 and V_I(χ) = 1. Now, in order to show that a statement α holds iff a statement β holds, we must first show that if α holds, then β holds (the “forwards ⇒ direction”); then we must show that if β holds then α holds (the “backwards ⇐ direction”):


⇒: First assume that V_I(∼(ψ→∼χ)) = 1. Then, by definition of the valuation function, clause for ∼, V_I(ψ→∼χ) = 0. So,² V_I(ψ→∼χ) is not 1. But then, by the clause in the definition of V_I for →, we know that it’s not the case that: either V_I(ψ) = 0 or V_I(∼χ) = 1. That is: V_I(ψ) = 1 and V_I(∼χ) = 0. From the latter, by the clause for ∼, we know that V_I(χ) = 1. That’s what we wanted to show: that V_I(ψ) = 1 and V_I(χ) = 1.

⇐: This is sort of like undoing the previous half. Suppose that V_I(ψ) = 1 and V_I(χ) = 1. Since V_I(χ) = 1, by the clause for ∼, V_I(∼χ) = 0; but now since V_I(ψ) = 1 and V_I(∼χ) = 0, by the clause for → we know that V_I(ψ→∼χ) = 0; then by the clause for ∼, we know that V_I(∼(ψ→∼χ)) = 1, which is what we were trying to show.

(The symbol marks the end of meta-language proofs—that is, arguments

I give, phrased in English, to establish facts about various formal languages

discussed in this book.)
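The truth condition for ∧ can also be spot-checked mechanically. The following is a minimal sketch, not part of the official development: it assumes an illustrative encoding in which a sentence letter is a Python string and complex wffs are tuples tagged "not" or "imp", and the names neg, imp, conj, valuation and conj_clause_holds are hypothetical. It checks the four combinations of values for two sentence letters; the metalanguage proof above is what establishes the claim for arbitrary wffs.

```python
from itertools import product

# Illustrative encoding (an assumption): a sentence letter is a string;
# complex official wffs are ("not", phi) or ("imp", phi, psi).

def neg(phi):
    return ("not", phi)

def imp(phi, psi):
    return ("imp", phi, psi)

def conj(phi, psi):
    # The defined symbol: phi ∧ psi abbreviates ∼(phi → ∼psi).
    return neg(imp(phi, neg(psi)))

def valuation(interp, wff):
    """Extend an interpretation (dict from letters to 0 or 1) to all wffs."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 if valuation(interp, wff[1]) == 0 else 0
    if wff[0] == "imp":
        antecedent = valuation(interp, wff[1])
        consequent = valuation(interp, wff[2])
        return 1 if antecedent == 0 or consequent == 1 else 0
    raise ValueError(f"not a wff: {wff!r}")

def conj_clause_holds():
    """Check V(P∧Q) = 1 iff V(P) = 1 and V(Q) = 1 in all four cases."""
    for p, q in product((0, 1), repeat=2):
        interp = {"P": p, "Q": q}
        lhs = valuation(interp, conj("P", "Q")) == 1
        rhs = p == 1 and q == 1
        if lhs != rhs:
            return False
    return True

print(conj_clause_holds())
```

The same pattern could be used to spot-check the clauses for ∨ and ↔ once their abbreviations from section 2.1 are encoded.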

Exercise 2.1 Given the definitions of the defined symbols ∨ and ↔, show that for the valuation function V of any PL-interpretation, and any wffs ψ and χ,

V(ψ∨χ) = 1 iff either V(ψ) = 1 or V(χ) = 1

V(ψ↔χ) = 1 iff V(ψ) = V(χ)

Let’s reflect on what we’ve done so far. We have defined the notion of a PL-interpretation, which assigns 1s and 0s to sentence letters of the symbolic language of propositional logic. And we have also defined, for any PL-interpretation, a corresponding PL-valuation function, which extends the interpretation’s assignment of 1s and 0s to complex wffs of PL. Note that we have been informally speaking of these assignments as assignments of truth values. That’s because the assignment of 1s and 0s accurately models the truth values of statements in English that are represented in the obvious way by PL-wffs. For example, the ∼ of propositional logic is supposed to model the English phrase ‘it is not the case that’. Accordingly, just as an English sentence “It is not the case that φ” is true iff φ is false, one of our valuation functions assigns 1 to ∼φ iff it assigns 0 to φ.

² The careful reader will note that here (and henceforth), I treat “V_I(α) = 0” and “V_I(α) is not 1” interchangeably (for any wff α). (Similarly for “V_I(α) = 1” and “V_I(α) is not 0”.) This is justified as follows. First, if V_I(α) is 0, then it can’t also be that V_I(α) is 1, since V_I was stipulated to be a function. Second, since it was stipulated that V_I assigns either 0 or 1 to each wff, if V_I(α) is not 1, then V_I(α) must be 0.

Semantics in logic, recall, generally defines two things: configurations and truth-in-a-configuration. In the propositional logic semantics we have laid out, the configurations are the interpretation functions, and the valuation function defines truth-in-a-configuration. Each interpretation function gives a complete assignment of truth values to the sentence letters. Thus, insofar as the sentence letters are concerned, an interpretation function completely specifies a possible configuration of the world. And for any interpretation function, its corresponding valuation function specifies, for each complex wff, what truth value that wff has in that interpretation. Thus, for each wff (φ) and each configuration (I), we have specified the truth value of that wff in that configuration (V_I(φ)).

Onward. We are now in a position to define the semantic versions of the notions of logical truth and logical consequence for PL. The semantic notion of a logical truth is that of a valid formula:

Definition of validity: A formula φ is PL-valid iff for every PL-interpretation I, V_I(φ) = 1

We write “⊨_PL φ” for “φ is PL-valid”. (When it’s obvious which system we’re talking about, we’ll omit the subscript on ⊨.) The valid formulas of propositional logic are also called tautologies.

As for logical consequence, the semantic version of this notion is that of a single formula’s being a semantic consequence of a set of formulas:

Definition of semantic consequence: Formula φ is a PL-semantic consequence of the formulas in set Γ iff for every PL-interpretation I, if V_I(γ) = 1 for each γ in Γ, then V_I(φ) = 1

That is, φ is a PL-semantic consequence of Γ iff φ is true whenever each member of Γ is true. We write “Γ ⊨_PL φ” for “φ is a PL-semantic consequence of Γ”. (As usual we’ll often omit the “PL” subscript; and further, let’s improve readability by writing “φ1, . . . , φn ⊨ ψ” instead of “{φ1, . . . , φn} ⊨ ψ”. That is, let’s drop the set braces when it’s convenient to do so.)

A parenthetical remark: now we can see the importance of setting up the grammar for our system according to precise rules. If we hadn’t, the definition of ‘truth value’ given here would have been impossible. In this definition we defined truth values of complicated formulas based on their form. For example, if a formula has the form (φ→ψ), then we assigned it an appropriate truth value based on the truth values of φ and ψ. But suppose we had a formula in our language that looked as follows:

P→P→P

and suppose that P has truth value 0. What is the truth value of the whole? We

can’t tell, because of the missing parentheses. For if the parentheses look like

this:

(P→P )→P

then the truth value is 0, whereas if the parentheses look like this:

P→(P→P )

then it is 1. Certain kinds of grammatical ambiguity, then, make it impossible to

assign truth values. We solve this problem in logic by pronouncing the original

string “P→P→P” as ill-formed; it is missing parentheses. Thus, the precise

rules of grammar assure us that when it comes time to do semantics, we are

able to assign semantic values (in this case, truth values) in an unambiguous

way.
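The two readings of the ill-formed string can be compared directly. This is a small sketch under the assumption that the truth function for → is written as a hypothetical Python function imp on the values 0 and 1; it simply evaluates both parenthesizations with P assigned 0.

```python
def imp(a, b):
    """Truth function for the conditional on the values 0 and 1."""
    return 1 if a == 0 or b == 1 else 0

P = 0
left_grouped = imp(imp(P, P), P)    # (P→P)→P
right_grouped = imp(P, imp(P, P))   # P→(P→P)
print(left_grouped, right_grouped)  # prints: 0 1
```

The two groupings disagree, which is why the unparenthesized string cannot be assigned a truth value.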

Notice also a fact about validity in propositional logic: it is mechanically “decidable”. That is, a computer program could be written that is capable of telling, for any given formula, whether or not that formula is valid. The program would simply construct a complete truth table for the formula in question. We can observe that this is possible by noting the following: every formula contains a finite number N of sentence letters, and so for any formula, there are only a finite number of different “cases” one needs to check, namely, the 2^N assignments of truth values to the contained sentence letters. But given any assignment of truth values to the sentence letters of a formula, it’s clearly a perfectly mechanical procedure to compute the truth value the formula takes for those truth values: simply apply the rules for ∼ and → repeatedly.
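Such a program might be sketched as follows. This is only an illustration: the tuple encoding of wffs (a string for a sentence letter, ("not", A) and ("imp", A, B) for complex wffs) and the names letters, value and is_valid are assumptions made for the example, not anything fixed by the text.

```python
from itertools import product

def letters(wff):
    """Sentence letters occurring in a wff (a str, ("not", A), or ("imp", A, B))."""
    if isinstance(wff, str):
        return {wff}
    found = set()
    for part in wff[1:]:
        found |= letters(part)
    return found

def value(interp, wff):
    """Truth value of wff under interp, a dict from sentence letters to 0 or 1."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 - value(interp, wff[1])
    if wff[0] == "imp":
        return 1 if value(interp, wff[1]) == 0 or value(interp, wff[2]) == 1 else 0
    raise ValueError(f"not a wff: {wff!r}")

def is_valid(wff):
    """Check every row of the truth table: 2^N rows for N sentence letters."""
    names = sorted(letters(wff))
    for row in product((0, 1), repeat=len(names)):
        if value(dict(zip(names, row)), wff) != 1:
            return False
    return True

peirce = ("imp", ("imp", ("imp", "P", "Q"), "P"), "P")
print(is_valid(("imp", "P", "P")))   # True
print(is_valid(("imp", "P", "Q")))   # False
print(is_valid(peirce))              # True
```

Note that the sketch only ever assigns values to the letters that occur in the formula; it relies on the fact that the other sentence letters make no difference.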

It’s worth being very clear about two assumptions in this proof (which, by the way, is our first bit of metatheory, our first proof about a logical system). They are: that every formula has a finite number of sentence letters, and that the truth values of sentence letters not contained in a formula do not affect the truth value of the formula. We need the latter assumption to be sure that we only need to check a finite number of cases (namely, the assignments of truth values to the contained sentence letters) to see whether a formula is valid. After all, there are infinitely many interpretation functions (since there are infinitely many sentence letters in the language of PL), and a valid formula must be true in each one.

These two assumptions are obviously true, but it would be good to prove them. I’ll prove the first assumption here (the second may be proved by a similar method), and take this opportunity to introduce an important technique for metalanguage proofs: proof by induction.

Proof that every wff contains a finite number of sentence letters. In this sort of proof by induction, we’re trying to prove a statement of the form: every wff has property P. The property P in this case is having a finite number of different sentence letters. In order to do this, we must show two separate statements:

base case: we show that every atomic sentence has the property. This is obvious: atomic sentences are just sentence letters, and each of them contains one sentence letter, and thus finitely many different sentence letters.

induction step: we show that if formulas φ and ψ have the property, then so do the complex formulas one can form from φ and ψ by the rules of formation, namely ∼φ and φ→ψ. So, we assume that φ and ψ have finitely many different sentence letters, and we show that the same must hold for ∼φ and φ→ψ. That’s obvious: ∼φ has as many different sentence letters as does φ; since φ, by assumption, has only finitely many, so does ∼φ. As for φ→ψ, by hypothesis, φ and ψ have finitely many different sentence letters, and so φ→ψ has at most n+m different sentence letters, where n and m are the numbers of different sentence letters in φ and ψ, respectively.

We’ve shown that every atomic formula has the property having a finite number of different sentence letters; and we’ve shown that the property is inherited by complex formulas built according to the recursion rules. But every wff is either atomic, or built from atomics by a finite series of applications of the recursion rules. Therefore, by induction, every wff has the property.
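The shape of the induction is mirrored by a recursive function that collects the sentence letters of a wff: one case for atomic sentences and one case for each formation rule. The sketch below assumes an illustrative tuple encoding of wffs (strings for sentence letters, ("not", A) and ("imp", A, B) for complex wffs); the name sentence_letters is hypothetical.

```python
def sentence_letters(wff):
    """Return the set of different sentence letters occurring in wff."""
    if isinstance(wff, str):
        # Base case: an atomic sentence contains exactly one sentence letter.
        return {wff}
    if wff[0] == "not":
        # ∼φ has exactly the sentence letters of φ.
        return sentence_letters(wff[1])
    if wff[0] == "imp":
        # φ→ψ has the letters of φ together with those of ψ (at most n+m).
        return sentence_letters(wff[1]) | sentence_letters(wff[2])
    raise ValueError(f"not a wff: {wff!r}")

example = ("imp", ("not", "P"), ("imp", "P", "Q"))   # ∼P→(P→Q)
found = sentence_letters(example)
print(sorted(found))   # ['P', 'Q']
```

Here the antecedent has one sentence letter and the consequent two, so the bound n+m is 3; the whole formula has two, since P occurs on both sides.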

2.4 Natural deduction in propositional logic

We have investigated a semantic conception of the notions of logical truth and logical consequence. An alternate conception is proof-theoretic, in which the central conception is that of proof. On this conception, logical consequence means “provable from”, and a logical truth is a sentence that can be proved starting from no premises at all. A “proof” procedure, informally, is a method of reasoning one’s way, step by step, according to mechanical rules, from some premises to a conclusion. This all is, of course, informal; we must now make it precise.

One method for characterizing proof is called the method of natural deduction. Any system in which one has assumptions for “conditional proof”, assumptions for “indirect derivation”, etc. is a system of natural deduction. This is the usual method in introductory logic books. Proofs in these systems often look like this:

1 P→(Q→R)

2 P∧Q

3 P 2, ∧E

4 Q 2, ∧E

5 Q→R 1, 3→E

6 R 4, 5→E

7 (P∧Q)→R 2-6,→I

or like this:

1. P→(Q→R) Pr.

2. show (P∧Q)→R CD

3. P∧Q As.

4. show R DD

5. P 3, ∧E

6. Q 3, ∧E

7. Q→R 1, 5 →E

8. R 6, 7 →E

We will implement natural deduction a little differently here, in order to reveal

what is really going on. Our derivations will therefore look a little different


from the derivations familiar from introductory books. Our version of the

above derivation will look like this:

1. P→(Q→R) ⊢ P→(Q→R) RA

2. P∧Q ⊢ P∧Q RA

3. P∧Q ⊢ P 2, ∧E

4. P∧Q ⊢ Q 2, ∧E

5. P→(Q→R), P∧Q ⊢ Q→R 1, 3 →E

6. P→(Q→R), P∧Q ⊢ R 4, 5 →E

7. P→(Q→R) ⊢ (P∧Q)→R 6, →I

It looks different, but the underlying idea is nevertheless the same.

2.4.1 Sequents

Natural deduction systems model the kind of reasoning one employs in everyday life. How does that reasoning work? In its simplest form, one reasons in a step-by-step fashion from premises to a conclusion, each step being sanctioned by a rule of inference. For example, suppose that one already knows the premise that P∧(P→Q) is true. One can then reason one’s way to the conclusion that Q is also true, as follows:

1. P∧(P→Q) premise

2. P from line 1

3. P→Q from line 1

4. Q from lines 2 and 3

In this kind of proof, each step is a tiny, indisputably correct, logical inference. Consider the moves from 1 to 2 and from 1 to 3, for example. These are indisputably correct because a conjunctive statement clearly logically implies either of its conjuncts. Likewise for the move from 2 and 3 to 4: it is clear that a conditional statement, plus its antecedent, together imply its consequent. Natural deduction systems consist in part of simple general principles like these (“a conjunctive statement logically implies either of its conjuncts”); they are known as rules of inference.

In addition to rules of inference, ordinary reasoning employs a further technique: the use of assumptions. In order to establish a conditional claim “if P then Q”, one would ordinarily i) assume P, ii) reason one’s way to Q, and then iii) on that basis conclude that the conditional claim “if P then Q” is true. Once P’s assumption is shown to lead to Q, the conditional claim “if P then Q” may be concluded. Another example: to establish a claim of the form “not-P”, one would ordinarily i) assume P, ii) reason one’s way to a contradiction, and iii) on that basis conclude that “not-P” is true. Once P’s assumption is shown to lead to a contradiction, “not-P” may be concluded. The first sort of reasoning is called conditional proof, the second, reductio ad absurdum.

When one reasons with assumptions, one writes down statements that one does not know to be true. When you write down P as an assumption, with the goal of proving the conditional “if P then Q”, you do not know P to be true. You’re merely assuming P for the sake of establishing the conditional “if P then Q”. Outside the context of this proof, the assumption need not hold; once you’ve reasoned your way to Q on the basis of the assumption of P, and so concluded that the conditional “if P then Q” is true, you stop assuming P. To model this sort of reasoning formally, we need a way to keep track of how the conclusions we establish depend on the assumptions we have made. Natural deduction systems in introductory textbooks tend to do this geometrically (by placement on the page), with special markers (e.g., ‘show’), and by drawing lines or boxes around parts of the proof once the assumptions that led to those parts are no longer operative. We will do it differently: we will keep track of the dependence of conclusions on assumptions by writing down explicitly, for each conclusion, which assumptions it depends on. We will do this by constructing our derivations out of sequents.

A sequent looks like this:

Γ ⊢ φ

Γ is a set of formulas, called the premises of the sequent. φ is a single formula, called the conclusion of the sequent. “⊢” is a sign that goes between the sequent’s premises and its conclusion, to indicate that the whole thing is a sequent. We will introduce sequent proofs below; and when you write down the sequent Γ ⊢ φ in one of them, the idea is that φ is an established conclusion, but it was established by making the assumptions in Γ. Take away those assumptions, and φ may no longer be established. In fact, one may think of a sequent as “meaning” that its conclusion is a logical consequence of its premises.

Thus, our proofs will be proofs of sequents. It’s a bit weird at first to think in terms of proving sequents, rather than formulas, since each sequent itself asserts a relation of logical consequence between its premises and its conclusion, but the idea nevertheless makes sense. Let’s introduce an informal notion of logical correctness for sequents: the sequent Γ ⊢ φ is logically correct if the formula φ is a logical consequence of the formulas in Γ. Thus, one is entitled to conclude the conclusion of a logically correct sequent from its premises. The idea, then, of constructing a sequent proof of a sequent is to show that that sequent is logically correct; that is, to show that its conclusion is a logical consequence of its premises.

From our investigation of the semantics of propositional logic, we already have the makings of a semantic criterion for when a sequent is logically correct: the sequent Γ ⊢ φ is logically correct iff φ is a semantic consequence of Γ. What we will be doing in this section is giving a new, proof-theoretic, criterion for the logical correctness of sequents.
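For a sequent with finitely many premises, the semantic criterion can be checked by a truth table. The following sketch makes several assumptions for illustration: wffs are encoded as strings (sentence letters) and tuples ("not", A), ("imp", A, B); ∧ is treated as the abbreviation from section 2.1; and the names letters, value, conj and semantic_consequence are hypothetical.

```python
from itertools import product

def letters(wff):
    """Sentence letters occurring in a wff."""
    if isinstance(wff, str):
        return {wff}
    found = set()
    for part in wff[1:]:
        found |= letters(part)
    return found

def value(interp, wff):
    """Truth value of wff under interp, a dict from sentence letters to 0 or 1."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 - value(interp, wff[1])
    if wff[0] == "imp":
        return 1 if value(interp, wff[1]) == 0 or value(interp, wff[2]) == 1 else 0
    raise ValueError(f"not a wff: {wff!r}")

def conj(a, b):
    # a ∧ b abbreviates ∼(a → ∼b)
    return ("not", ("imp", a, ("not", b)))

def semantic_consequence(premises, conclusion):
    """True iff the conclusion is 1 in every row where all premises are 1."""
    names = set(letters(conclusion))
    for gamma in premises:
        names |= letters(gamma)
    names = sorted(names)
    for row in product((0, 1), repeat=len(names)):
        interp = dict(zip(names, row))
        premises_true = all(value(interp, g) == 1 for g in premises)
        if premises_true and value(interp, conclusion) != 1:
            return False
    return True

print(semantic_consequence([conj("P", ("imp", "P", "Q"))], "Q"))   # True
print(semantic_consequence([("imp", "P", "Q")], "P"))              # False
```

With an empty list of premises the function reduces to a validity check, matching the idea that a logical truth follows from no premises at all.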

2.4.2 Rules

The first step in developing our system is to write down sequent rules. A sequent rule is a permission to move from certain sequents to certain other sequents. Our goal is to construct rules with the following feature: if the “from” sequents are all logically correct sequents, then any of the “to” sequents will be guaranteed to be a logically correct sequent. Call such sequent rules “logical-correctness preserving”.

Consider, as an example, the first rule of our system, “∧ introduction”, or “∧I” for short. We picture this sequent rule thus:

    Γ ⊢ φ    ∆ ⊢ ψ
    ----------------  ∧I
      Γ,∆ ⊢ φ∧ψ

Above the line go the “from” sequents; below the line go the “to” sequents. (The comma between Γ and ∆ in the “to” sequent simply means that the premises of this sequent are all the members of Γ plus all the members of ∆. Strictly speaking it would be more correct to write this in set-theoretic notation as: Γ∪∆ ⊢ φ∧ψ.) Thus, ∧I permits us to move from the sequents Γ ⊢ φ and ∆ ⊢ ψ to the sequent Γ,∆ ⊢ φ∧ψ. For any sequent rule, we say that any of the “to” sequents (Γ,∆ ⊢ φ∧ψ in this case) follows from the “from” sequents (in this case Γ ⊢ φ and ∆ ⊢ ψ) via the rule.

It seems intuitively clear that ∧I preserves logical correctness. For if some assumptions Γ logically imply φ, and some assumptions ∆ logically imply ψ, then (since φ∧ψ intuitively follows from φ and ψ taken together) the conclusion φ∧ψ should indeed logically follow from all the assumptions together, the ones in Γ and the ones in ∆.

Our next sequent rule is ∧E:

        Γ ⊢ φ∧ψ
    ----------------  ∧E
    Γ ⊢ φ    Γ ⊢ ψ

This lets one move from the sequent Γ ⊢ φ∧ψ to either the sequent Γ ⊢ φ or the sequent Γ ⊢ ψ (or both). This, too, appears to preserve logical correctness. If the members of Γ imply the conjunction φ∧ψ, then (since φ∧ψ intuitively implies both φ and ψ individually) it must be that the members of Γ imply φ, and they must also imply ψ.

The rule ∧I is known as an introduction rule for ∧, since it allows us to move to a sequent whose major connective is the ∧. Likewise, the rule ∧E is known as an elimination rule for ∧, since it allows us to move from a sequent whose major connective is the ∧. In fact our sequent system contains introduction and elimination rules for the other connectives as well: ∼, ∨, and → (let’s forget the ↔ here). We’ll present those rules in turn.

First ∨I and ∨E:

            Γ ⊢ φ
    ----------------------  ∨I
    Γ ⊢ φ∨ψ    Γ ⊢ ψ∨φ

    Γ ⊢ φ∨ψ    ∆1,φ ⊢ χ    ∆2,ψ ⊢ χ
    ---------------------------------  ∨E
              Γ,∆1,∆2 ⊢ χ

Let’s think about what ∨E means. Remember the intuitive meaning of a sequent: its conclusion is a logical consequence of its premises. Another (related) way to think of it is that Γ ⊢ φ means that one can establish that φ if one assumes the members of Γ. So, if the sequent Γ ⊢ φ∨ψ is logically correct, that means we’ve got the disjunction φ∨ψ, assuming the formulas in Γ. Now, suppose we can reason to a new formula χ, assuming φ, plus perhaps some other assumptions ∆1. And suppose we can also reason to χ from ψ, plus perhaps some other assumptions ∆2. Then, since either φ or ψ (plus the assumptions in ∆1 and ∆2) leads to χ, and we know that φ∨ψ is true (conditional on the assumptions in Γ), we ought to be able to infer χ itself, assuming the assumptions we needed along the way (∆1 and ∆2), plus the assumptions we needed to get φ∨ψ, namely, Γ.

Next, we have double negation:

     Γ ⊢ φ          Γ ⊢ ∼∼φ
    ---------      ---------  DN
    Γ ⊢ ∼∼φ         Γ ⊢ φ

In connection with negation, we also have the rule of reductio ad absurdum:

    Γ,φ ⊢ ψ∧∼ψ
    -----------  RAA
      Γ ⊢ ∼φ

That is, if φ (along with perhaps some other assumptions, Γ) leads to a contradiction, we can conclude that ∼φ is true (given the assumptions in Γ). RAA and DN together are our introduction and elimination rules for ∼.

And finally we have →I and →E:

     Γ,φ ⊢ ψ         Γ ⊢ φ→ψ    ∆ ⊢ φ
    ---------  →I    ------------------  →E
    Γ ⊢ φ→ψ              Γ,∆ ⊢ ψ

→E is perfectly straightforward; it’s just the familiar rule of modus ponens. But →I requires a bit more thought. →I is the principle of conditional proof. Suppose you can get to ψ on the assumption that φ (plus perhaps some other assumptions Γ). Then, you should be able to conclude that the conditional φ→ψ is true (assuming the formulas in Γ). Put another way: if you want to establish the conditional φ→ψ, all you need to do is assume that φ is true, and reason your way to ψ.

We add, finally, one more sequent rule, the rule of assumptions:

    ------  RA
    φ ⊢ φ

Note that this is the one sequent rule where no “from” sequent is required, since there are no sequents above the line. The rule permits us to move from no sequents at all to a sequent of the form φ ⊢ φ. (Strictly, this sequent should be written “{φ} ⊢ φ”.) Call any such sequent an “assumption sequent”. Clearly, any assumption sequent is a logically correct sequent, since clearly φ can be proved if we assume φ itself.
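To see how the rules manipulate premise sets, some of them can be written as functions from sequents to sequents. This is a rough sketch under stated assumptions: a sequent is modeled as a pair of a frozenset of formulas and a formula; formulas are strings or tuples tagged "and" or "imp"; and the function names are hypothetical. Only RA, ∧I, ∧E, →I and →E are covered, and →I is implemented for the case where the discharged formula is removed from the premises.

```python
def assumption(phi):
    """RA: from no sequents, move to phi ⊢ phi."""
    return (frozenset([phi]), phi)

def and_intro(s1, s2):
    """∧I: from Γ ⊢ φ and ∆ ⊢ ψ, move to Γ,∆ ⊢ φ∧ψ."""
    (gamma, phi), (delta, psi) = s1, s2
    return (gamma | delta, ("and", phi, psi))

def and_elim(s, side):
    """∧E: from Γ ⊢ φ∧ψ, move to Γ ⊢ φ (side "left") or Γ ⊢ ψ (side "right")."""
    gamma, formula = s
    if not (isinstance(formula, tuple) and formula[0] == "and"):
        raise ValueError("conclusion is not a conjunction")
    return (gamma, formula[1] if side == "left" else formula[2])

def imp_elim(s1, s2):
    """→E: from Γ ⊢ φ→ψ and ∆ ⊢ φ, move to Γ,∆ ⊢ ψ."""
    (gamma, cond), (delta, phi) = s1, s2
    if not (isinstance(cond, tuple) and cond[0] == "imp" and cond[1] == phi):
        raise ValueError("sequents do not fit the pattern of →E")
    return (gamma | delta, cond[2])

def imp_intro(s, phi):
    """→I: from Γ,φ ⊢ ψ, move to Γ ⊢ φ→ψ, dropping φ from the premises."""
    premises, psi = s
    if phi not in premises:
        raise ValueError("the discharged formula must be among the premises")
    return (premises - {phi}, ("imp", phi, psi))

# The derivation of P→(Q→R) ⊢ (P∧Q)→R, line by line.
A = ("imp", "P", ("imp", "Q", "R"))
B = ("and", "P", "Q")
s1 = assumption(A)
s2 = assumption(B)
s3 = and_elim(s2, "left")      # P∧Q ⊢ P
s4 = and_elim(s2, "right")     # P∧Q ⊢ Q
s5 = imp_elim(s1, s3)          # P→(Q→R), P∧Q ⊢ Q→R
s6 = imp_elim(s5, s4)          # P→(Q→R), P∧Q ⊢ R
s7 = imp_intro(s6, B)          # P→(Q→R) ⊢ (P∧Q)→R
print(s7 == (frozenset([A]), ("imp", B, "R")))   # True
```

Using sets for the premises means that the union Γ∪∆ is computed automatically, and that →I visibly removes a dependency while →E and ∧I accumulate them.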

2.4.3 Sequent proofs

We have assembled all the sequent rules; now we need to show how to use those rules to provide a criterion for logically correct sequents. We do this by first defining the notion of a “sequent proof”:

Definition of sequent proof: A sequent proof is a series of sequents, each of

which is either an assumption sequent, or follows from earlier sequents in the

series by some sequent rule.


So, for example, the following is a sequent proof:

1. P∧Q ⊢ P∧Q As

2. P∧Q ⊢ P 1, ∧E

3. P∧Q ⊢ Q 1, ∧E

4. P∧Q ⊢ Q∧P 2, 3 ∧I

Though it isn’t strictly required, we write a line number to the left of each sequent in the series, and to the right of each line we write the sequent rule that justifies it, together with the line or lines (if any) that contained the “from” sequents required by the sequent rule in question. (The rule of assumptions requires no “from” sequents, recall.)

(It’s important to distinguish what we’re now calling proofs, namely, sequent proofs, from the kinds of informal arguments I gave in section 2.3, and will give elsewhere in this book. Sequent proofs (and also the axiomatic proofs we will introduce in section 2.5) are formalized object-language proofs. The sentences in sequent proofs are sentences in the object language; they are wffs of PL. Moreover, we gave a rigorous definition of what a sequent proof is. Finally, sequent proofs are restrictive in that only the system’s official rules may be used. For contrast, consider the argument I gave in section 2.3 that any PL-valuation assigns 1 to φ∧ψ iff it assigns 1 to φ and 1 to ψ. That argument was an informal metalanguage proof. The sentences in the argument were sentences of English, and the argument used informal (i.e., not formalized) techniques of reasoning. “Informal” doesn’t imply lack of rigor. The argument was perfectly rigorous: it conforms to the standards of good argumentation that generally prevail in mathematics. We’re free to use any reasonable pattern of reasoning, for example “universal proof” (to establish something of the form “everything is thus-and-so”, we consider an arbitrary thing and show that it is thus-and-so). We may “skip steps” if it’s clear how the argument is supposed to go. In short, what we must do is convince a well-informed and mathematically sophisticated reader that the result we’re after is indeed true.)

Next we introduce the notion of a “provable sequent”. The idea is that each sequent proof culminates in the proof of some sequent. Thus we offer the following definition:

Definition of provable sequent: A provable sequent is a sequent that is the last line of some sequent proof

(Note that it would be equivalent to define a provable sequent as any line in any sequent proof, because at any point in a sequent proof one may simply stop adding lines, and the proof up until that point counts as a legal sequent proof.) So, for example, the sequent proof given above establishes that P∧Q ⊢ Q∧P is a provable sequent. (We call a sequent proof whose last line is Γ ⊢ φ a sequent proof of Γ ⊢ φ.)

The definitions we have given in this section give us a way to make precise the proof-theoretic conception of the core logical notions, as applied to propositional logic. Namely, we can say that φ is a logical truth, on this conception, iff the sequent ∅ ⊢ φ is a provable sequent, and that φ is a logical consequence of the formulas in set Γ iff the sequent Γ ⊢ φ is a provable sequent. The symbol ∅ stands for the “empty set”, the set containing no members. Thus, a logical truth here is understood as a formula that is provable from no assumptions at all.

2.4.4 Example sequent proofs

Let’s explore how to construct sequent proofs. You may find this a bit less intuitive, initially, than constructing natural deduction proofs in the systems familiar from introductory textbooks. But a little experimentation will show that the techniques for proving things in the usual systems carry over to the present system.

A first simple example: let’s return to the sequent proof of P∧Q ⊢ Q∧P:

1. P∧Q ⊢ P∧Q As

2. P∧Q ⊢ P 1, ∧E

3. P∧Q ⊢ Q 1, ∧E

4. P∧Q ⊢ Q∧P 2, 3 ∧I

Notice the strategy. We first use the rule of assumptions to enter the premise of the sequent we’re trying to prove: P∧Q. We then use the rules of inference to infer the conclusion of that sequent: Q∧P. Since our initial assumption of P∧Q was dependent on the formula P∧Q, our subsequent inferences remain dependent on that same assumption, and so the final formula concluded, Q∧P, remains dependent on that assumption.

Let’s write our proofs out in a simpler way. Instead of writing out entire

sequents, let’s write out only their conclusions. We can indicate the premises

of the sequent using line numbers; the line numbers indicating the premises


of the sequent will go to the left of the number indicating the sequent itself.

Rewriting the previous proof in this way yields:

1 (1) P∧Q As

1 (2) P 1, ∧E

1 (3) Q 1, ∧E

1 (4) Q∧P 2, 3 ∧I

Next, let’s have an example to illustrate conditional proof. Let’s construct a sequent proof of P→Q, Q→R ⊢ P→R:

1. P→Q ⊢ P→Q As

2. Q→R ⊢ Q→R As

3. P ⊢ P As

4. P→Q, P ⊢ Q 1, 3 →E

5. P→Q, Q→R, P ⊢ R 2, 4 →E

6. P→Q, Q→R ⊢ P→R 5, →I

It can be rewritten in the simpler style as follows:

1 (1) P→Q As

2 (2) Q→R As

3 (3) P As

1,3 (4) Q 1, 3→E

1,2,3 (5) R 2, 4→E

1,2 (6) P→R 5,→I

Let’s think about this example. We’re trying to establish P→R on the basis of two formulas, P→Q and Q→R, so we start by assuming the latter two formulas. Then, since the formula we’re trying to establish is a conditional, we assume the antecedent of the conditional, in line 3. We then proceed, on that basis, to reason our way to R, the consequent of the conditional we’re trying to prove. (Notice how in lines 4 and 5, we add more line numbers on the very left. Whenever we use →E, we increase dependencies: when we infer Q from P and P→Q, our conclusion Q depends on all the formulas that P and P→Q depended on, namely, the formulas on lines 1 and 3. Look back to the statement of the rule →E: the conclusion ψ depends on all the formulas that φ and φ→ψ depended on: Γ and ∆.) That brings us to line 5. At that point, we’ve shown that R can be proven, on the basis of various assumptions, including P. The rule →I (that is, the rule of conditional proof) then lets us conclude that the conditional P→R follows merely on the basis of the other assumptions; that rule, note, lets us in line 6 drop line 3 from the list of assumptions on which P→R depends.

Next let’s establish an instance of DeMorgan’s Law, ∼(P∨Q) ⊢ ∼P∧∼Q:

1 (1) ∼(P∨Q) As

2 (2) P As (for reductio)

2 (3) P∨Q 2, ∨I

1,2 (4) (P∨Q)∧∼(P∨Q) 1, 3 ∧I

1 (5) ∼P 4, RAA

6 (6) Q As (for reductio)

6 (7) P∨Q 6, ∨I

1,6 (8) (P∨Q)∧∼(P∨Q) 1, 7 ∧I

1 (9) ∼Q 8, RAA

1 (10) ∼P∧∼Q 5, 9∧I

Next let’s establish ∅ ⊢ P∨∼P:

1 (1) ∼(P∨∼P) As

2 (2) P As (for reductio)

2 (3) P∨∼P 2, ∨I

1,2 (4) (P∨∼P)∧∼(P∨∼P) 1, 3 ∧I

1 (5) ∼P 4, RAA

6 (6) ∼P As (for reductio)

6 (7) P∨∼P 6, ∨I

1,6 (8) (P∨∼P)∧∼(P∨∼P) 1, 7 ∧I

1 (9) ∼∼P 8, RAA

1 (10) ∼P∧∼∼P 5, 9 ∧I

∅ (11) ∼∼(P∨∼P) 10, RAA

∅ (12) P∨∼P 11, DN

Comment: my overall goal was to assume ∼(P∨∼P) and then derive a contradiction. And my route to the contradiction was to separately establish ∼P (lines 2-5) and ∼∼P (lines 6-9), each by reductio arguments.

Finally, let’s establish a sequent corresponding to a way that ∨E is sometimes formulated: P∨Q, ∼P ⊢ Q:

1 (1) P∨Q As

2 (2) ∼P As

3 (3) Q As (for use with ∨E)

4 (4) P As (for use with ∨E)

5 (5) ∼Q As (for reductio)

4,5 (6) ∼Q∧P 4,5 ∧I

4,5 (7) P 6, ∧E

2,4,5 (8) P∧∼P 2,7 ∧I

2,4 (9) ∼∼Q 8, RAA

2,4 (10) Q 9, DN

1,2 (11) Q 1,3,10 ∨E

The basic idea of this proof is to use ∨E on line 1 to get Q. That calls, in turn, for showing that each disjunct of line 1, P and Q, leads to Q. Showing that Q leads to Q is easy; that was line 3. Showing that P leads to Q took lines 4-10; line 10 states the result of that reasoning, namely that Q follows from P (as well as line 2). I began at line 4 by assuming P. Then my strategy was to establish Q by reductio, so I assumed ∼Q in line 5. At this point, I basically had my contradiction: at line 2 I had ∼P and at line 4 I had P. (You might think I had another contradiction: Q at line 3 and ∼Q at line 5. But at the end of the proof, I don’t want my conclusion to depend on line 3, whereas I don’t mind it depending on line 2, since that’s one of the premises of the sequent I’m trying to establish.) So I want to put P and ∼P together, to get P∧∼P, and then conclude ∼∼Q by RAA. But there is a minor hitch. Look carefully at how RAA is formulated. It says that if we have Γ,φ ⊢ ψ∧∼ψ, we can conclude Γ ⊢ ∼φ. The first of these two sequents includes φ in its premises. That means that in order to conclude ∼φ, the contradiction ψ∧∼ψ needs to depend on φ. So in the present case, in order to finish the reductio argument and conclude ∼∼Q, the contradiction P∧∼P needs to depend on the reductio assumption ∼Q (line 5). But if I just used ∧I to put lines 2 and 4 together, the resulting contradiction would depend only on lines 2 and 4. To get around this, I used a little trick. Whenever you have a sequent Γ ⊢ φ, you can always add any formula ψ you like to the premises on which φ depends, using the following method:³

i. Γ ⊢ φ (begin with this)

i+1. ψ ⊢ ψ As (ψ is any chosen formula)

i+2. Γ,ψ ⊢ φ∧ψ i, i+1 ∧I

i+3. Γ,ψ ⊢ φ i+2, ∧E

Lines 4, 6 and 7 in the proof employ this trick: initially, at line 4, P depends only on 4, but then by line 7, P also depends on 5. That way, the move from 8 to 9 by RAA is justified.
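The trick is itself just a three-step use of the rules, so it can be packaged as a derived operation. The sketch below assumes the same illustrative modeling as before (a sequent as a pair of a frozenset of premises and a conclusion, conjunctions as tuples tagged "and"); the names assumption, and_intro, and_elim_left and add_dependency are hypothetical.

```python
def assumption(psi):
    """RA: psi ⊢ psi."""
    return (frozenset([psi]), psi)

def and_intro(s1, s2):
    """∧I: from Γ ⊢ φ and ∆ ⊢ ψ, move to Γ,∆ ⊢ φ∧ψ."""
    (gamma, phi), (delta, psi) = s1, s2
    return (gamma | delta, ("and", phi, psi))

def and_elim_left(s):
    """∧E (left conjunct): from Γ ⊢ φ∧ψ, move to Γ ⊢ φ."""
    gamma, formula = s
    if not (isinstance(formula, tuple) and formula[0] == "and"):
        raise ValueError("conclusion is not a conjunction")
    return (gamma, formula[1])

def add_dependency(sequent, psi):
    """From Γ ⊢ φ, derive Γ,ψ ⊢ φ via As, ∧I and ∧E."""
    step1 = assumption(psi)              # ψ ⊢ ψ
    step2 = and_intro(sequent, step1)    # Γ,ψ ⊢ φ∧ψ
    return and_elim_left(step2)          # Γ,ψ ⊢ φ

start = assumption("P")                           # P ⊢ P
result = add_dependency(start, ("not", "Q"))      # P, ∼Q ⊢ P
print(result == (frozenset(["P", ("not", "Q")]), "P"))   # True
```

The conclusion is unchanged; only the set of premises on which it depends has grown.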

Exercise 2.2 Prove the following sequents:

a) P, Q, R ⊢ P

b) P→(Q→R) ⊢ (Q∧∼R)→∼P

c) P→Q, R→Q ⊢ (P∨R)→Q

2.5 Axiomatic proofs in propositional logic

Natural deduction proofs are comparatively easy to construct; that is their great advantage. A different approach to proof theory, the axiomatic method, offers different advantages. Like natural deduction, the axiomatic method is a proof-theoretic approach to logic, based on the step-by-step reasoning model in which each step is sanctioned by a rule of inference. But unlike natural deduction systems, axiomatic systems do not allow reasoning by assumptions, and they have very few rules of inference. Although these differences make axiomatic proofs much harder to construct, there is a compensatory advantage in metalogic: it is far easier to prove things about axiomatic systems.

Let’s �rst think about axiomatic systems informally. An axiomatic proof is

a series of formulas (not sequents—we no longer need them since we’re not

3Adding arbitrary dependencies is not allowed in relevance logic, where a sequent is provable

only when all of its premises are, in an intuitive sense, relevant to its conclusion. Relevant

logicians modify various rules of classical logic, including the rule of ∧E.


reasoning with assumptions), the last of which is the conclusion of the proof.

Each line in the proof must be justified in one of two ways: it may be inferred

by a rule of inference from earlier lines in the proof, or it may be an axiom.

An axiom is a certain kind of formula, a formula that one is allowed to enter

into a proof without any justification at all. Axioms are the “starting points” of

proofs, the foundation on which proofs rest. Since axioms are to play this role,

the axioms in a good axiomatic system ought to be indisputable logical truths.

For example, “P→P” would be a good axiom—it’s obviously a logical truth.

(As it happens, we won’t choose this particular axiom; we’ll instead choose

other axioms from which this one may be proved.) Similarly, for each rule of

inference in a good axiomatic system, there should be no question but that the

premises of the rule logically imply its conclusion.

Actually we’ll employ a slightly more general notion of a proof: a proof

from a given set of wffs Γ. A proof from Γ will be allowed to contain members

of Γ, in addition to axioms and wffs that follow from earlier lines by a rule.

Think of the members of Γ as premises, which in the context of a proof from

Γ are temporarily treated as axioms, in that they are allowed to be entered

into the proof without any justification. The intuitive point of a proof from

Γ is to demonstrate its conclusion on the assumption that the members of Γ are

true, in contrast to a proof simpliciter (i.e. a proof in the sense of the previous

paragraph), whose point is to demonstrate its conclusion unconditionally. (Note

that we can regard a proof simpliciter as a proof from the empty set ∅.)

Formally, to apply the axiomatic method, we must choose i) a set of rules,

and ii) a set of axioms. An axiom is simply any chosen sentence (though as we

saw, in a good axiomatic system the axioms will be clear logical truths.) A rule

is simply a permission to infer one sort of sentence from other sentences. For

example, the rule modus ponens can be stated thus: “From φ→ψ and φ you may

infer ψ”, and pictured as follows:

φ→ψ    φ
---------- MP
    ψ

(Modus ponens is the analog of the sequent rule→E.) Given any chosen axioms

and rules, we can define the following concepts:

Definition of axiomatic proof from a set: Where Γ is a set of wffs and φ is

a wff, an axiomatic proof of φ from Γ is a finite sequence of wffs whose last line is

φ, in which each line either i) is an axiom, ii) is a member of Γ, or iii) follows

from earlier wffs in the sequence via a rule.


Definition of axiomatic proof: An axiomatic proof of φ is an axiomatic proof

of φ from ∅ (i.e., a finite sequence of wffs whose last line is φ, in which each

line either i) is an axiom, or ii) follows from earlier wffs in the sequence via a

rule.)

It is common to write “Γ ⊢ φ” to mean that φ is provable from Γ, i.e., that

there exists some axiomatic proof of φ from Γ, and to write “⊢ φ” to mean that

∅ ⊢ φ, i.e., that φ is provable, i.e., that there exists some axiomatic proof of φ

from no premises at all. (Formulas provable from no premises at all are often

called theorems.) This notation can be used for any axiomatic system, i.e., any

choice of axioms and rules. The symbol ⊢ may be subscripted with the name

of the system in question. Thus, for our axiom system for PL below, we may

write: ⊢_PL. (We’ll omit this subscript when it’s clear which axiomatic system is

at issue.)

Here is an axiomatic system for propositional logic:4

Axiomatic system for PL:

· Rule: modus ponens

· Axioms: Where φ, ψ, and χ are wffs, anything that comes from one of the

following schemas is an axiom:

φ→ (ψ→φ) (A1)

(φ→(ψ→χ ))→ ((φ→ψ)→(φ→χ )) (A2)

(∼ψ→∼φ)→ ((∼ψ→φ)→ψ) (A3)

Thus, a PL-theorem is any formula that is the last line of a sequence of formulas,

each of which is either an A1, A2, or A3 axiom, or follows from earlier formulas

in the sequence by modus ponens. And a formula is PL-provable from some

set Γ if it is the last line of a sequence of formulas, each of which is either a

member of Γ, an A1, A2, or A3 axiom, or follows from earlier formulas in the

sequence by modus ponens.

The axiom “schemas” A1-A3 are not themselves axioms. They are, rather,

“recipes” for constructing axioms. Take A1, for example:

φ→(ψ→φ)

4 See Mendelson (1987, p. 29).


This string of symbols isn’t itself an axiom because it isn’t a wff; it isn’t a wff

because it contains Greek letters, which aren’t allowed in wffs (since they’re

not on the list of PL primitive vocabulary). φ and ψ are variables of our

metalanguage; you only get an axiom when you replace these variables with

wffs. P→(Q→P), for example, is an axiom; it results from A1 by replacing φ

with P and ψ with Q. (Note: since you can put in any wff for these variables,

and there are infinitely many wffs, there are infinitely many axioms.)

A few points of clarification about how to construct axioms from schemas.

First point: you can stick in the same wff for two different Greek letters. Thus

you can let both φ and ψ in A1 be P, and construct the axiom P→(P→P).

(But of course, you don’t have to stick in the same thing for φ as for ψ.)

Second point: you can stick in complex formulas for the Greek letters. Thus,

(P→Q)→(∼(R→S)→(P→Q)) is an axiom (I put in P→Q for φ and ∼(R→S)

for ψ in A1). Third point: within a single axiom, you cannot substitute different

wffs for a single Greek letter. For example, P→(Q→R) is not an axiom; you

can’t let the first φ in A1 be P and the second φ be R. Final point: even though

you can’t substitute different wffs for a single Greek letter within a single axiom,

you can let a Greek letter become one wff when making one axiom, and let

it become a different wff when making another axiom; and you can use each

of these axioms within a single axiomatic proof. For example, each of the

following is an instance of A1, and one could include both in a single axiomatic

proof:

P→(Q→P)

∼P→((Q→R)→∼P)

In the first case, I made φ be P and ψ be Q; in the second case I made φ be

∼P and ψ be Q→R. This is fine because I kept φ and ψ constant within each

axiom.
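The substitution rules just listed can be made concrete with a small Python sketch. It is only an illustration under assumptions of my choosing: wffs are encoded as nested tuples, and the helper names imp, neg, a1 and is_a1 are hypothetical, not part of the official system.

```python
# Wffs as nested tuples (an assumed encoding): a sentence letter is a string,
# ("~", a) is a negation, and ("->", a, b) is a conditional.

def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("~", a)

def a1(phi, psi):
    """Build the instance of schema A1 for the given wffs: phi -> (psi -> phi)."""
    return imp(phi, imp(psi, phi))

def is_a1(wff):
    """True iff wff has the shape phi -> (psi -> phi) for some wffs phi, psi."""
    return (
        isinstance(wff, tuple) and wff[0] == "->"
        and isinstance(wff[2], tuple) and wff[2][0] == "->"
        and wff[1] == wff[2][2]   # both occurrences of phi must be the same wff
    )

# First point: the same wff for both Greek letters, P -> (P -> P)
same = a1("P", "P")
# Second point: complex wffs, (P->Q) -> (~(R->S) -> (P->Q))
complex_instance = a1(imp("P", "Q"), neg(imp("R", "S")))
# Third point: P -> (Q -> R) is not an instance, since phi would have to be
# both P and R
not_instance = imp("P", imp("Q", "R"))

print(is_a1(same), is_a1(complex_instance), is_a1(not_instance))
```

The equality test in is_a1 is what enforces the third point: a single Greek letter must stand for one and the same wff throughout an axiom.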

The definitions we have given in this section constitute another way of

making precise the proof-theoretic conception of the core logical notions, as

applied to propositional logic. A logical truth, on this conception, is a PL-

theorem; one formula is a logical consequence of others iff it is PL-provable

from them.

2.5.1 Example axiomatic proofs

Axiomatic proofs are much harder than natural deduction proofs. Some are

easy, of course. Here is a proof of (P→Q)→(P→P ):


1. P→(Q→P ) (A1)

2. (P→(Q→P ))→((P→Q)→(P→P )) (A2)

3. (P→Q)→(P→P ) 1,2 MP

The existence of this proof shows that (P→Q)→(P→P ) is a theorem.

Building on the previous proof, we can construct a proof of P→P from {P→Q}:

1. P→(Q→P ) (A1)

2. (P→(Q→P ))→((P→Q)→(P→P )) (A2)

3. (P→Q)→(P→P ) 1,2 MP

4. P→Q member of {P→Q}

5. P→P 3, 4 MP

Thus, we have shown that {P→Q} ⊢ P→P.
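Because the definition of a proof from a set is purely mechanical, it can be checked by a program. The following Python sketch does so for the five-line proof above. It assumes a tuple encoding of wffs, and the function names (imp, neg, is_axiom, is_proof_from) are hypothetical ones introduced just for this illustration.

```python
# Wffs as nested tuples (an assumed encoding): letters are strings,
# ("~", a) is a negation, ("->", a, b) is a conditional.

def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("~", a)

def is_imp(w):
    return isinstance(w, tuple) and w[0] == "->"

def is_neg(w):
    return isinstance(w, tuple) and w[0] == "~"

def is_axiom(w):
    """True iff w is an instance of schema A1, A2, or A3."""
    if not is_imp(w):
        return False
    ant, con = w[1], w[2]
    # A1: phi -> (psi -> phi)
    if is_imp(con) and con[2] == ant:
        return True
    # A2: (phi -> (psi -> chi)) -> ((phi -> psi) -> (phi -> chi))
    if is_imp(ant) and is_imp(ant[2]):
        phi, psi, chi = ant[1], ant[2][1], ant[2][2]
        if con == imp(imp(phi, psi), imp(phi, chi)):
            return True
    # A3: (~psi -> ~phi) -> ((~psi -> phi) -> psi)
    if is_imp(ant) and is_neg(ant[1]) and is_neg(ant[2]):
        psi, phi = ant[1][1], ant[2][1]
        if con == imp(imp(neg(psi), phi), psi):
            return True
    return False

def is_proof_from(lines, gamma):
    """Check that each line is an axiom, a member of gamma, or follows
    from two earlier lines by modus ponens."""
    for i, w in enumerate(lines):
        earlier = lines[:i]
        by_mp = any(imp(a, w) in earlier for a in earlier)
        if not (is_axiom(w) or w in gamma or by_mp):
            return False
    return True

P, Q = "P", "Q"
proof = [
    imp(P, imp(Q, P)),                                  # A1
    imp(imp(P, imp(Q, P)), imp(imp(P, Q), imp(P, P))),  # A2
    imp(imp(P, Q), imp(P, P)),                          # 1, 2 MP
    imp(P, Q),                                          # member of gamma
    imp(P, P),                                          # 3, 4 MP
]
print(is_proof_from(proof, {imp(P, Q)}))  # checked as a proof from {P->Q}
print(is_proof_from(proof, set()))        # checked as a proof from no premises
```

With the empty premise set the check fails at line 4, which is neither an axiom nor the result of modus ponens. The first three lines on their own still pass, matching the earlier three-line proof of (P→Q)→(P→P).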

(When we’re talking about provability from a set, let’s adopt the convention

of writing “φ1 . . . φn ⊢ ψ” instead of “{φ1 . . . φn} ⊢ ψ”, and writing “Γ, φ1 . . . φn”

instead of “Γ ∪ {φ1 . . . φn}”. That is, let’s drop the set-braces on the left hand

side of ⊢ in these circumstances. In this new notation, what we showed in the

previous paragraph was: P→Q ⊢ P→P.)

The next example is a little harder: (R→P )→(R→(Q→P ))

1. [R→(P→(Q→P ))]→[(R→P )→(R→(Q→P ))] A2

2. P→(Q→P ) A1

3. [P→(Q→P )]→[R→(P→(Q→P ))] A1

4. R→(P→(Q→P )) 2,3 MP

5. (R→P )→(R→(Q→P )) 1,4 MP

Here’s how I approached this problem. What I was trying to prove, namely

(R→P )→(R→(Q→P )), is a conditional whose antecedent and consequent both

begin: (R→. That looks like the consequent of A2. So I wrote out an instance

of A2 whose consequent was the formula I was trying to prove; that gave me

line 1 of the proof. Then I tried to figure out a way to get the antecedent of

line 1; namely, R→(P→(Q→P )). And that turned out to be pretty easy. The

consequent of this formula, P→(Q→P ) is an axiom (line 2 of the proof). And

if you can get a formula φ, then you can choose anything you like (say, R)

and then get R→φ by using A1 and modus ponens; that’s what I did in lines 3 and

4.


Exercise 2.3 Establish each of the following facts. For these prob-

lems, do not use the “toolkit” from the following sections; i.e.,

construct the axiomatic proofs “from scratch”. However, you may

use a fact you prove in an earlier problem in later problems.

a) ⊢ P→P

b) ⊢ (∼P→P)→P

c) ∼∼P ⊢ P

In fact, we’ll regularly want to make a move like that of lines 3 and 4 of

the last example proof: whenever we have φ on its own, and we want to move

to ψ→φ. Let’s call this move “adding an antecedent”; this is how it is done:

1. φ (from earlier lines)

2. φ→(ψ→φ) A1

3. ψ→φ 1, 2 MP

In future proofs, instead of repeating such steps, let’s just move directly from φ

to ψ→φ, with the justification “adding an antecedent”.

The preceding proof was a bit tricky, and most proofs are trickier still.

Moreover, the proofs quickly get very long. Practically speaking, the best way

to make progress in an axiomatic system like this is by building up a toolkit.

The toolkit consists of theorems and techniques for doing bits of proofs which

are applicable in a wide range of situations. Then, when approaching a new

problem, one can look to see whether the problem can be reduced to a few

chunks, each of which can be accomplished by using the toolkit. Further, one

can cut down on writing by citing bits of the toolkit, rather than writing down

entire proofs.

So far, we have just one tool in our toolkit: “adding an antecedent”. Let’s

add another: the “MP technique”. Here’s what the technique will let us do.

Suppose we can separately prove φ→ψ and φ→(ψ→χ ). The MP technique

then shows us how to construct a proof of φ→χ . I call this the MP technique

because its effect is that you can do modus ponens “within the consequent of

the conditional φ→”. Here’s how the MP technique works:


1. φ→ψ from earlier lines

2. φ→(ψ→χ ) from earlier lines

3. (φ→(ψ→χ ))→((φ→ψ)→(φ→χ )) A2

4. (φ→ψ)→(φ→χ ) 2,3 MP

5. φ→χ 1,4 MP

Note that the lines in this “proof schema” are schemas (they contain Greek

letters), rather than wffs. It therefore isn’t a proof at all; rather, it becomes a

proof once you fill in wffs for the φ, ψ, and χ. We constructed a proof schema

because we want the MP technique to be applicable whenever we want to move

from formulas of the form φ→ψ and φ→(ψ→χ ), to a formula of the form

φ→χ , no matter what φ,ψ, and χ may be.
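One way to picture a proof schema is as a recipe that returns official proof lines once wffs are supplied for the Greek letters. Here is a minimal Python sketch of the two tools introduced so far, assuming a tuple encoding of wffs. The names imp, show, adding_an_antecedent and mp_technique are hypothetical and serve only this illustration.

```python
# Wffs as nested tuples (an assumed encoding): letters are strings,
# ("~", a) is a negation, ("->", a, b) is a conditional.

def imp(a, b):
    return ("->", a, b)

def show(w):
    """Render a wff as text; conditionals are fully parenthesized."""
    if isinstance(w, str):
        return w
    if w[0] == "~":
        return "~" + show(w[1])
    return "(" + show(w[1]) + "->" + show(w[2]) + ")"

def adding_an_antecedent(phi, psi):
    """Given phi as an earlier line, return the two further lines
    that end in psi -> phi."""
    return [
        imp(phi, imp(psi, phi)),  # A1
        imp(psi, phi),            # MP from phi and the A1 instance
    ]

def mp_technique(phi, psi, chi):
    """Given phi->psi and phi->(psi->chi) as earlier lines, return the
    three further lines that end in phi->chi."""
    return [
        imp(imp(phi, imp(psi, chi)), imp(imp(phi, psi), imp(phi, chi))),  # A2
        imp(imp(phi, psi), imp(phi, chi)),  # MP with phi->(psi->chi)
        imp(phi, chi),                      # MP with phi->psi
    ]

# Fill in the schema with phi = P, psi = Q->R, chi = S.
for line in mp_technique("P", imp("Q", "R"), "S"):
    print(show(line))
```

Calling the function with different arguments gives a different official proof fragment each time, which is exactly the generality the schema is meant to capture.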

And while we’re on the topic of proof schemas, note also that whenever

one constructs a proof of a formula containing sentence letters, one could just

as well have constructed a similar proof schema. Corresponding to the proof

of (R→P )→(R→(Q→P )), for example, there is this proof schema:

1. [φ→(ψ→(χ→ψ))]→[(φ→ψ)→(φ→(χ→ψ))] A2

2. ψ→(χ→ψ) A1

3. [ψ→(χ→ψ)]→[φ→(ψ→(χ→ψ))] A1

4. φ→(ψ→(χ→ψ)) 2,3 MP

5. (φ→ψ)→(φ→(χ→ψ)) 1,4 MP

It’s usually more useful to think in terms of proof schemas, rather than proofs,

because they can go into our toolkit, if they have general applicability. The

proof schema we just constructed, for example, shows that anything of the

form (φ→ψ)→(φ→(χ→ψ)) is a theorem. As it happens, this is a fairly intuitive

theorem schema. Think of it as the principle of “weakening the consequent”.

χ→ψ is logically weaker than ψ, so if φ leads to ψ, φ must also lead to χ→ψ.

That sounds like a pattern that might well recur, so let’s put it into the toolkit,

under the label “weakening the consequent”. If we’re ever in the midst of a

proof and could really use a line of the form (φ→ψ)→(φ→(χ→ψ)), then we

can simply write that line down, and annotate on the right “weakening the

consequent”. Given the proof schema above, we know that we could always

in principle insert a five-line proof of that line; to save writing we simply won’t

bother. (Note that once we do this, omitting those five lines, the proofs we

are constructing will cease to be official proofs, since not every line will be


either an axiom or a line that follows from earlier lines by MP. They will be

instead proof sketches, which are in essence metalanguage arguments to the

effect that there exists some proof or other of the desired type. An ambitious

reader could always construct an official proof on the basis of the proof sketch,

by taking each of the bits, filling in the details using the toolkit, and assembling

the results into one proof.)

Next I want to add to our toolkit the principle of “strengthening the an-

tecedent”: [(φ→ψ)→χ ]→(ψ→χ ). The intuitive idea is that if φ→ψ leads to

χ , then ψ ought to lead to χ , since ψ is logically stronger than φ→ψ. This

proof will be harder still; we’ll need to break it into bits and use the toolkit to

complete it. Here’s a sketch of the overall proof:

a. [(φ→ψ)→χ ]→[ψ→(φ→ψ)] see below

b. [(φ→ψ)→χ ]→[(ψ→(φ→ψ))→(ψ→χ )] see below

c. [(φ→ψ)→χ]→[ψ→χ] a, b, MP technique

All that remains is to supply separate proofs of lines a and b. Step a is pretty

easy. Its consequent, ψ→(φ→ψ) is an instance of A1, so we can prove it in one

line, then use “adding an antecedent” to get a.

Line b is a bit harder. It has the form: (α→β)→[(γ→α)→(γ→β)]. Call

this “adding antecedents” (and put it into the toolkit too), since it lets you add

the same antecedent (γ ) to both the antecedent and consequent of a condi-

tional (α→β). The following proof sketch for adding antecedents uses the MP

technique again!

1. [γ→(α→β)]→[(γ→α)→(γ→β)] A2

2. (α→β)→[γ→(α→β)] A1

3. (α→β)→[[γ→(α→β)]→[(γ→α)→(γ→β)]] adding an antecedent to line 1

4. (α→β)→[(γ→α)→(γ→β)] 2, 3, MP technique

(For this use of the MP technique, we let φ = α→β, ψ = γ→(α→β), and

χ = (γ→α)→(γ→β).) Since we’ve now provided proof sketches for parts a. and

b., we’ve finished our proof sketch for strengthening the antecedent.

Next let’s add a tool to our toolkit, to the effect that conditionals are

“transitive”. Here’s a proof sketch for ⊢ (φ→ψ)→[(ψ→χ)→(φ→χ)]:


1. (ψ→χ)→[(φ→ψ)→(φ→χ)] adding antecedents

2. [(ψ→χ)→[(φ→ψ)→(φ→χ)]]→[[(ψ→χ)→(φ→ψ)]→[(ψ→χ)→(φ→χ)]] A2

3. [(ψ→χ)→(φ→ψ)]→[(ψ→χ)→(φ→χ)] 1, 2 MP

4. [[(ψ→χ)→(φ→ψ)]→[(ψ→χ)→(φ→χ)]]→[(φ→ψ)→[(ψ→χ)→(φ→χ)]] strengthening the antecedent

5. (φ→ψ)→[(ψ→χ)→(φ→χ)] 3, 4 MP

Given this theorem, we can always move from φ→ψ and ψ→χ to φ→χ thus:

1. φ→ψ from earlier lines

2. ψ→χ from earlier lines

3. (φ→ψ)→[(ψ→χ )→(φ→χ )] theorem just proved

4. (ψ→χ )→(φ→χ ) 1, 3 MP

5. φ→χ 2, 4 MP

So, such moves may henceforth be justified by appeal to “transitivity”.

With transitivity in our toolkit, we can really get moving:

⊢ [φ→(ψ→χ)]→[ψ→(φ→χ)] (“swapping antecedents”):

1. [φ→(ψ→χ )]→[(φ→ψ)→(φ→χ )] A2

2. [(φ→ψ)→(φ→χ )]→[ψ→(φ→χ )] strengthening the antecedent

3. [φ→(ψ→χ )]→[ψ→(φ→χ )] 1,2 transitivity

⊢ (∼ψ→∼φ)→(φ→ψ) (“contraposition 1”):

1. (∼ψ→∼φ)→[(∼ψ→φ)→ψ] A3

2. [(∼ψ→φ)→ψ]→(φ→ψ) strengthening the antecedent

3. (∼ψ→∼φ)→(φ→ψ) 1, 2 transitivity

⊢ (φ→ψ)→(∼ψ→∼φ) (“contraposition 2”):


1. (φ→ψ)→[(ψ→∼∼ψ)→(φ→∼∼ψ)] transitivity

2. (ψ→∼∼ψ)→[(φ→ψ)→(φ→∼∼ψ)] 1, swapping antecedents

3. ψ→∼∼ψ exercise 2.4c

4. (φ→ψ)→(φ→∼∼ψ) 2, 3 MP

5. ∼∼φ→φ exercise 2.4b

6. (∼∼φ→φ)→[(φ→∼∼ψ)→(∼∼φ→∼∼ψ)] transitivity

7. (φ→∼∼ψ)→(∼∼φ→∼∼ψ) 5, 6 MP

8. (∼∼φ→∼∼ψ)→(∼ψ→∼φ) contraposition 1

9. (φ→ψ)→(∼ψ→∼φ) 4, 7, 8 transitivity (2x)

⊢ ∼φ→(φ→ψ) (“ex falso quodlibet”):

1. ∼φ→(∼ψ→∼φ) A1

2. (∼ψ→∼φ)→(φ→ψ) contraposition 1

3. ∼φ→(φ→ψ) 1, 2 transitivity

Exercise 2.4 Establish each of the following. For these you may

use the toolkit.

a) Show that ⊢ P→[(P→Q)→Q]

b) Show that ⊢ ∼∼P→P (Hint: convert your proof that ∼∼P ⊢ P

from problem 2.3c into a proof that ⊢ ∼∼P→P, making use of

the MP technique.)

c) Show that ⊢ P→∼∼P

2.6 Soundness of PL

In this chapter we have discussed two approaches to propositional logic: the

proof-theoretic approach and the semantic approach. In each case, we introduced

formal notions of logical truth and logical consequence. For the

semantic approach, these notions involved truth in PL-interpretations. For the


proof-theoretic approach, we considered two formal definitions, one involving

sequent-proofs, the other involving axiomatic proofs.

An embarrassment of riches! We have multiple formal accounts of our

logical notions. But in fact, it can be shown that each of our definitions yields

exactly the same results. That is, whether you define a logical truth (for example)

as a formula that is true in all PL-interpretations, or as a formula that is the

last line of some PL axiomatic proof, or as a formula φ for which the sequent

∅ ⊢ φ is a provable sequent, exactly the same formulas turn out logical truths.

Proving this is a major accomplishment of meta-logic. We won’t prove this

here, not in full anyway, but we will discuss the issues a bit, to show how such

metalogical proofs proceed.

Let’s focus, for the moment, on just two of our notions, the notion of

a theorem (last line of an axiomatic proof) and the notion of a valid formula

(true in all PL-interpretations). An important accomplishment of metalogic is

the establishment of the following two important connections between these

notions:

Soundness: Every PL-theorem is PL-valid

Completeness: Every PL-valid wff is a PL-theorem

Proof of soundness. It’s pretty easy to establish soundness. We do this by induc-

tion. But our inductive proof here is slightly different. We’re not trying to

prove something of the form “Every wff has property P”. Instead, we’re trying

to prove something of the form “Every theorem has property P”. In this case,

the property P is: being a valid formula.

Here’s how induction works in this case. A theorem is the last line of any

proof. So, to show that every theorem has a certain property P, all we need

to do is show that every time one adds another line to a proof, that line has

property P. Now, there are two ways one can add to a proof. First, one can add

an axiom. The base case of the inductive proof must show that adding axioms

always means adding a line with property P. Second, one can add a formula that

follows from earlier lines by a rule. The inductive step of the inductive proof

must show that in this case, too, one adds a line with property P, provided all

the preceding lines have property P. OK, here goes:

base case: here we need to show that every PL-axiom is valid. This is tedious

but straightforward. Take A1, for example. Suppose for reductio that some instance

of A1 is invalid, i.e., for some PL-interpretation I, V_I(φ→(ψ→φ)) = 0.

Thus, V_I(φ) = 1 and V_I(ψ→φ) = 0. Given the latter, V_I(φ) = 0: contradiction.

Analogous proofs can be given that instances of A2 and A3 are also valid.

induction step: here we begin by assuming that every line in a proof up to a

certain point is valid (this is the “inductive hypothesis”); we then show that if

one adds another line that follows from earlier lines by the rule modus ponens,

that line must be valid too. I.e., we’re trying to show that “modus ponens

preserves validity”. So, assume the inductive hypothesis: that all the earlier

lines in the proof are valid. And now, consider the result of applying modus

ponens. That means that the new line we’ve added to the proof is some formula

ψ, which we’ve inferred from two earlier lines that have the forms φ→ψ and

φ. We must show that ψ is a valid formula, i.e., that V_I(ψ) = 1 for every PL-

interpretation I. By the inductive hypothesis, all earlier lines in the proof are

valid, and hence both φ→ψ and φ are valid. Thus, V_I(φ) = 1 and V_I(φ→ψ) = 1.

But if V_I(φ) = 1 then V_I(ψ) can’t be 0, for if it were, then V_I(φ→ψ) would be

0, and it isn’t. Thus, V_I(ψ) = 1.

We’ve shown that axioms are valid, and that modus ponens preserves validity.

So, by induction, every time one adds to a proof, one adds a valid formula.

So the last line in a proof is always a valid formula. Thus, every theorem is

valid.

Notice the general structure of this proof: we first showed that every axiom

has a certain property, and then we showed that the rule of inference preserves

the property. Given the definition of ‘theorem’, it followed that every theorem

has the property. We chose our definition of a theorem with just this sort of

proof in mind.

Remember that this is a proof in the metalanguage, about propositional

logic. It isn’t a proof in any system of derivation.

One nice thing about soundness is that it lets us establish facts of unprovability.

Soundness says “if ⊢ φ then ⊨ φ”. Equivalently, it says: “if ⊭ φ then

⊬ φ”. So, to show that something isn’t a theorem, it suffices to show that it

isn’t valid. Consider, for example, the formula (P→Q)→(Q→P ). There exist

PL-interpretations in which the formula is false, namely, PL-interpretations in

which P is 0 and Q is 1. So, (P→Q)→(Q→P ) is not valid (since it’s not true

in all PL-interpretations.) But then soundness tells us that it isn’t a theorem

either. In general: if we’ve established soundness, then in order to show that a

formula isn’t a theorem, all we need to do is find an interpretation in which it

isn’t true.
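Finding such an interpretation is a mechanical search through the truth table. The Python sketch below illustrates this under assumptions of my choosing: wffs are nested tuples, interpretations are dictionaries from sentence letters to 0 or 1, and the names letters, value and counterexample are hypothetical. Note that checking one instance of a schema with sentence letters, as is done here for A1 to A3, is only a sanity check. It does not by itself show that every instance is valid; that is what the reductio argument in the base case establishes.

```python
from itertools import product

# Wffs as nested tuples (an assumed encoding): letters are strings,
# ("~", a) is a negation, ("->", a, b) is a conditional.

def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("~", a)

def letters(w):
    """The set of sentence letters occurring in w."""
    if isinstance(w, str):
        return {w}
    return set().union(*(letters(part) for part in w[1:]))

def value(w, interp):
    """Truth value (0 or 1) of w in interp, a dict from letters to 0 or 1."""
    if isinstance(w, str):
        return interp[w]
    if w[0] == "~":
        return 1 - value(w[1], interp)
    # A conditional is false only when its antecedent is 1 and its consequent 0.
    return 0 if value(w[1], interp) == 1 and value(w[2], interp) == 0 else 1

def counterexample(w):
    """Return an interpretation in which w is false, or None if w is valid."""
    names = sorted(letters(w))
    for row in product([0, 1], repeat=len(names)):
        interp = dict(zip(names, row))
        if value(w, interp) == 0:
            return interp
    return None

P, Q, R = "P", "Q", "R"
converse = imp(imp(P, Q), imp(Q, P))
a1 = imp(P, imp(Q, P))
a2 = imp(imp(P, imp(Q, R)), imp(imp(P, Q), imp(P, R)))
a3 = imp(imp(neg(Q), neg(P)), imp(imp(neg(Q), P), Q))

print(counterexample(converse))  # the falsifying interpretation, if any
print([counterexample(w) for w in (a1, a2, a3)])
```

For (P→Q)→(Q→P) the search returns the interpretation described above, in which P is 0 and Q is 1, so by soundness the formula is not a theorem.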

One could also prove soundness for the natural deduction system of section


2.4. The soundness proof, for instance, would proceed by proving by induction

that whenever sequent Γ ⊢ φ is provable, φ is a semantic consequence of Γ.

The main thing would be to show that each rule of inference (As, RAA, ∧I, ∧E,

etc.) preserves semantic consequence. But note how much more involved this

proof would be, since there are so many rules of inference. The paucity of rules

in the axiomatic system made the construction of proofs within that system a

real pain in the neck, but now we see how it makes metalogical life easier.

Before we leave this section, let me summarize and clarify the nature of

proofs by induction. Induction is the method of proof to use whenever one

is trying to prove that each entity of a certain sort has a certain feature F ,

where each such entity is generated from certain “starting points” by a finite

number of successive “operations”. To do this, one establishes two things: a)

that the starting points have feature F , and b) that the operations preserve

feature F —i.e., that if the inputs to the operations have feature F then the

output also has feature F .

In logic, it is important to distinguish two different cases where proofs by

induction are needed. One case is where one is establishing a fact of the form:

every theorem has a certain feature F . (The proof of soundness is an example of

this case.) Here’s why induction is applicable: a theorem is defined as the last

line of a proof. So the fact to be established is that every line in every proof has

feature F. Now, a proof is defined as a finite sequence, where each member is

either an axiom or follows from earlier lines by the rule modus ponens. The

axioms are the “starting points” and modus ponens is the “operation”. So if

we want to show that every line in every proof has feature F , all we need to

do is show that a) the axioms all have feature F , and b) show that if you start

with formulas that have feature F , and you apply modus ponens, then what you

get is something with feature F . More carefully, b) means: if φ has feature F ,

and φ→ψ has feature F , then ψ has feature F . Once a) and b) are established,

one can conclude by induction that all lines in all proofs have feature F . When

one gives this first sort of inductive argument, for the conclusion that every

theorem φ has a certain feature, it is sometimes called “induction on the proof

of φ” or “induction on the length of φ’s proof”.

A second case in which induction may be used is when one is trying to

establish a fact of the form: every formula has a certain feature F . (The proof

that every wff has a finite number of sentence letters is an example of this

case.) Here’s why induction is applicable: all formulas are built out of sentence

letters (the “starting points”) by successive applications of the rules of formation

(“operations”) (the rules of formation, recall, say that if φ and ψ are formulas,


then so are (φ→ψ) and ∼φ.) So, to show that all formulas have feature F ,

we must merely show that a) all the sentence letters have feature F , and b)

show that if φ and ψ both have feature F , then both (φ→ψ) and ∼φ also

will have feature F . When one gives this second sort of inductive argument,

for the conclusion that every formula φ has a certain feature, it is sometimes

called “induction on the construction of φ”, or “induction on the number of

connectives in φ”.
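Induction on the construction of a formula has a close programming analogue: a recursive function with one clause for the starting points and one clause for each rule of formation. The Python sketch below is only an illustration, assuming a tuple encoding of wffs; the function names sentence_letters and connectives are hypothetical.

```python
# Wffs as nested tuples (an assumed encoding): letters are strings,
# ("~", a) is a negation, ("->", a, b) is a conditional.

def sentence_letters(w):
    """The set of sentence letters in w, computed clause by clause."""
    if isinstance(w, str):          # starting point: a sentence letter
        return {w}
    if w[0] == "~":                 # operation: negation
        return sentence_letters(w[1])
    # operation: forming a conditional from two wffs
    return sentence_letters(w[1]) | sentence_letters(w[2])

def connectives(w):
    """The number of connectives in w, by the same three clauses."""
    if isinstance(w, str):
        return 0
    if w[0] == "~":
        return 1 + connectives(w[1])
    return 1 + connectives(w[1]) + connectives(w[2])

# ~P -> (Q -> P)
example = ("->", ("~", "P"), ("->", "Q", "P"))
print(sentence_letters(example), connectives(example))
```

Each recursive call is made on a wff with fewer connectives, which is why such a function always terminates and why “induction on the number of connectives in φ” is another name for the same pattern.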

Inductions in logic can take yet other forms, but these two are particularly

common.

If you’re ever proving something by induction, it’s important to identify

what sort of inductive proof you’re constructing. What are the entities you’re

dealing with? What is the feature F ? What are the starting points, and what

are the operations generating new entities from the starting points? If you’re

trying to construct an inductive proof and get stuck, you should return to these

questions and make sure you’re clear about their answers.


Exercise 2.5 Consider the following (strange) system of propositional

logic. The definition of wffs is the same as for standard

propositional logic, and the rules of inference are the same (just one

rule: modus ponens); but the axioms are different. For any wffs φ

and ψ, the following are axioms:

φ→φ

(φ→ψ)→(ψ→φ)

Establish the following two facts about this system:

· every theorem of this system has an even number of “∼”s.

· soundness is false for this system—i.e., some theorems are

not valid formulas

Exercise 2.6 Back to normal propositional logic and its semantics

(for more practice on inductive proofs). Show that the truth value of

a formula depends only on the truth values of the sentence letters in

that formula. That is, let φ be any wff and let V and V′ be valuations

that agree on the sentence letters in φ (i.e., for any sentence letter

α, if α is in φ then V(α) = V′(α)). Show that V(φ) = V′(φ).

Exercise 2.7 Prove the following form of soundness: for any set

of formulas, Γ, and any formula φ, if Γ ⊢ φ then Γ ⊨ φ (i.e., if φ is

provable from Γ then φ is a semantic consequence of Γ.)

Exercise 2.8 Prove the soundness of the sequent calculus. That

is, show that if Γ ⊢ φ is a provable sequent, then Γ ⊨ φ. (No need

to go through each and every detail of the proof once it becomes

repetitive. Hint: call a sequent Γ ⊢ φ valid iff Γ ⊨ φ. Your proof

should be a proof by induction, and it should use the notion of a

valid sequent.)


2.7 Completeness of PL

In this section we’ll prove completeness for propositional logic. It will be a bit

more difficult than the preceding sections, and may be skipped without much

loss. If you decide to work through the more difficult sections dealing with

metalogic later in the book (for example sections 6.5 and 6.6), you might first

return to this section.

Before we prove completeness, we’ll need to prove a helpful theorem (which

is interesting in its own right) and a Lemma.

As you learned in section 2.5.1 (perhaps to your dismay), constructing

axiomatic proofs is much harder than constructing sequent proofs. It’s hard to

prove things when you’re not allowed to use conditional proof! Nevertheless,

one can prove a metalogical theorem about our axiomatic system that is closely

related to conditional proof:

Deduction theorem: If Γ, φ ⊢ ψ, then Γ ⊢ φ→ψ

That is: whenever there exists a proof from (Γ and) φ to ψ, then there also

exists a proof of φ→ψ (from Γ).

Suppose we want to prove φ→ψ. Our axiomatic system does not allow

us to assume φ in a conditional proof of φ→ψ. But once we’ve proved the

deduction theorem, we’ll be able to do the next best thing. Suppose we write

down a proof of ψ from {φ}. That is, we write down a proof in which each line

is either i) a member of {φ} (that is, φ itself), or ii) an axiom, or iii) follows

from earlier lines in the proof by modus ponens. The deduction theorem then

lets us conclude that some proof of φ→ψ exists. We won’t have constructed such

a proof ourselves; we only constructed the proof from φ to ψ. Nevertheless

the deduction theorem assures us that it exists. More generally, whenever we

can construct a proof of ψ from φ plus some other premises (the formulas in

some set Γ), then the deduction theorem assures us that some proof of φ→ψ

from those other premises also exists.

Proof of the deduction theorem. Suppose Γ, φ ⊢ ψ. That is, there is some proof,

call it “Proof A”, of ψ from Γ∪{φ}. Such a proof looks like this:


1. α1

2. α2

⋮

n. ψ

where each αi is either a member of Γ∪{φ}, an axiom, or follows from earlier

lines in the proof by MP. Our strategy will be to establish that:

(*) for each αi in proof A, Γ ⊢ φ→αi

We already know that each line of proof A is provable from Γ∪{φ}; what (*) says

is that if you stick “φ→” in front of any of those lines, the result is provable

from Γ all by itself. Once we succeed in establishing (*) then we will have

proved the deduction theorem. For the last line of proof A is ψ; (*) then tells

us that φ→ψ is provable from Γ.

(*) says that each line of proof A has a certain feature, namely, the feature of:

being provable from Γ when prefixed with “φ→”. Just as in the proof of soundness,

this calls for the method of proof by induction, and in particular, induction on

a proof (here, proof A). Here goes.

What we’re going to do is show that whenever a line is added to proof A,

then it has the feature—provided, that is, that all earlier lines in the proof have

the feature. There are three cases in which a line αi could have been added

to proof A. The first case is where αi is an axiom. We must show that αi has

the feature, that is, show that Γ ⊢ φ→αi. Well, we can prove φ→αi from Γ as

follows:

1. αi axiom

2. φ→αi adding an antecedent

This is not an official proof, of course; it’s a proof sketch. And note that we

didn’t need to use any members of Γ in the proof. That’s OK; if you look back

at the definition of a proof from a set, you’ll see that this counts officially as a

proof from Γ.

The second case in which a line αi could have been added to proof A is

where αi is a member of Γ∪{φ}. This subdivides into two subcases. The first

is where αi is φ itself. Here, φ→αi is φ→φ, which is shown in exercise 2.3a to

be a theorem, i.e., provable from no premises at all. So it is obviously provable


from Γ. The second subcase is where αi ∈ Γ. But here we can prove φ→αi

from Γ as follows:

1. αi member of Γ

2. φ→αi adding an antecedent

The first two cases were “base” cases of our inductive proof, because we

didn’t need to assume anything about earlier lines in proof A. The third case in

which a line αi could have been added to proof A leads us to the inductive part

of our proof: the case in which αi follows from two earlier lines of the proof

by MP. Here we simply assume that those earlier lines of the proof have the

feature we’re interested in (this assumption is the “inductive hypothesis”; the

feature, recall, is: being provable from Γ when prefixed with “φ→”) and we show

that αi has the feature as well.

So: we’re considering the case where αi follows from earlier lines in the

proof by modus ponens. That means that the earlier lines have to have the

forms χ→αi and χ . Furthermore, the inductive hypothesis tells us that the

result of prefixing either of these earlier lines with “φ→” is provable from Γ.

That is, we know that some proof from Γ culminates in φ→(χ→αi ), and some

other proof from Γ culminates in φ→χ . We can then string these two proofs

together into a new proof, and then continue that new proof as follows:

⋮

k. φ→(χ→αi)

⋮

l. φ→χ

l + 1. φ→αi k, l, MP technique

This is a proof of φ→αi from Γ.

Thus, in all three cases, whenever αi was added to proof A, there always

existed some proof of φ→αi from Γ. By induction, (*) is established; and this

in turn completes the proof of the deduction theorem.
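The proof just given is constructive: its three cases tell us how to transform proof A, line by line, into a proof of φ→ψ from Γ. The following Python sketch carries out that transformation, writing out the official lines behind “adding an antecedent” and the MP technique. It rests on several assumptions of my own: wffs are encoded as nested tuples, the input is taken on trust to be a correct proof from Γ∪{φ}, and the name deduction is hypothetical. (Be warned that the case for φ itself spells out one proof of φ→φ, so it gives away an answer to exercise 2.3a.)

```python
# Wffs as nested tuples (an assumed encoding): letters are strings,
# ("~", a) is a negation, ("->", a, b) is a conditional.

def imp(a, b):
    return ("->", a, b)

def deduction(proof, phi):
    """Turn a proof of psi from Gamma ∪ {phi} into a proof of phi->psi from Gamma.

    `proof` is a list of wffs, assumed (not checked) to be a correct proof.
    """
    out = []
    for i, alpha in enumerate(proof):
        earlier = proof[:i]
        # An earlier chi such that chi->alpha is also earlier (the MP case).
        chi = next((c for c in earlier if imp(c, alpha) in earlier), None)
        if alpha == phi:
            # Subcase: alpha is phi itself, so we need phi->phi.
            pp = imp(phi, phi)
            out += [
                imp(phi, imp(pp, phi)),                              # A1
                imp(imp(phi, imp(pp, phi)), imp(imp(phi, pp), pp)),  # A2
                imp(imp(phi, pp), pp),                               # MP
                imp(phi, pp),                                        # A1
                pp,                                                  # MP
            ]
        elif chi is not None:
            # Inductive case: phi->chi and phi->(chi->alpha) are already in out.
            out += [
                imp(imp(phi, imp(chi, alpha)),
                    imp(imp(phi, chi), imp(phi, alpha))),            # A2
                imp(imp(phi, chi), imp(phi, alpha)),                 # MP
                imp(phi, alpha),                                     # MP
            ]
        else:
            # alpha is an axiom or a member of Gamma: add an antecedent.
            out += [alpha, imp(alpha, imp(phi, alpha)), imp(phi, alpha)]
    return out

# A proof of R from {P->Q, Q->R} ∪ {P}
P, Q, R = "P", "Q", "R"
proof_a = [P, imp(P, Q), Q, imp(Q, R), R]
result = deduction(proof_a, P)
print(len(result), result[-1])
```

The output is a proof of P→R from {P→Q, Q→R} in which P no longer appears as a line. As the example suggests, the resulting proof is several times longer than proof A, which is part of why it is convenient to know that it exists without having to write it down.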

Next we’ll prove the following lemma:

Lemma: Let I be any PL-interpretation, let φ be any wff, let s1 . . . sn be the

sentence letters in φ, and where ψ is any formula, define ψ′ as being ψ itself if

ψ is true in I , and as being ∼ψ if ψ is false in I . Then s ′1 . . . s ′n ⊢ φ′


A very rough way of thinking about what Lemma says is this: the truth value of a

formula is provably settled by the truth values of its sentence letters (“provably”

in the sense of provability in our axiomatic system.)
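To make the priming operation concrete, here is a minimal sketch that computes ψ′ for formulas built from → and ∼. The tuple encoding of formulas and the names `truth` and `primed` are my own assumptions for illustration; the sketch only computes primed formulas, it does not construct the proofs that Lemma says exist.

```python
def truth(f, I):
    """Classical truth value of f in interpretation I (a dict from letters to bools).
    Assumed encoding: a letter is a str; ("∼", sub); ("→", left, right)."""
    if isinstance(f, str):
        return I[f]
    if f[0] == "∼":
        return not truth(f[1], I)
    return (not truth(f[1], I)) or truth(f[2], I)

def primed(f, I):
    """ψ′ is ψ itself if ψ is true in I, and ∼ψ if ψ is false in I."""
    return f if truth(f, I) else ("∼", f)

I = {"P": True, "Q": False}
phi = ("→", "P", "Q")
print([primed(s, I) for s in ("P", "Q")])  # the primed sentence letters
print(primed(phi, I))                      # the primed formula
```

With P true and Q false, the primed letters are P and ∼Q and the primed formula is ∼(P→Q), so in this instance Lemma asserts that ∼(P→Q) is provable from {P, ∼Q}.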

Proof of Lemma. Let I ,φ, and s1 . . . sn be as described in Lemma. We must

show that φ′ is provable from {s ′1 . . . s ′n}. We’ll show this by induction. With an

eye toward setting up the inductive proof correctly, think of Lemma as saying:

every formula has the feature being a formula whose primed version is provable from the primed versions of its sentence letters. This makes it clear that the assertion

we’re trying to prove has the form “every formula has a certain feature”, and

thus calls for proof by induction on the formula’s construction (rather than

proof by induction on a proof, as in the previous two inductive proofs). So,

we’ll need to show that all sentence letters have the feature (base case), and

then show that if α and β have the feature (inductive hypothesis), then both

∼α and α→β must have the feature as well.

Base case: suppose φ is a sentence letter. Then there is just one si , which is

φ itself. So what we need to show is: φ′ ⊢ φ′. But that’s trivial; we can give a

one-line proof of φ′ from {φ′}:

1. φ′ member of {φ′}

Now assume the inductive hypothesis: that both α and β have the feature.

That is, where s1 . . . sn are the sentence letters in α, and t1 . . . tm are the sentence

letters in β, we are assuming:

(a) s ′1 . . . s ′n ⊢ α′

(b) t ′1 . . . t ′m ⊢ β′

First we must show that ∼α has the feature. Since ∼α has the same sentence

letters as α (namely, s1 . . . sn), this means showing that s ′1 . . . s ′n ⊢ (∼α)′. Now,

(∼α)′ is either ∼α or ∼∼α depending on whether ∼α is true or false (in I ; I’ll

suppress this from now on). We’ll consider these cases separately.

· In the former case, α is false; and so α′ is ∼α. Then (a) tells us that

s ′1 . . . s ′n ⊢ ∼α. Since (∼α)′ is ∼α in this case, we’ve already shown what

we wanted: s ′1 . . . s ′n ⊢ (∼α)′.


· In the latter case, α is true, and so α′ is just α. So what (a) tells us in this

case is: s ′1 . . . s ′n ⊢ α. Furthermore, since in this case (∼α)′ is ∼∼α, what

we’re trying to establish is s ′1 . . . s ′n ⊢ ∼∼α. So, to construct a proof of

∼∼α from {s ′1 . . . s ′n}, begin with a proof of α from {s ′1 . . . s ′n} (we know

that such a proof exists from what (a) told us). Then insert a proof of

α→∼∼α (exercise 2.4c). Finish the proof by concluding ∼∼α by MP.

Next we must show that α→β has the feature. The sentence letters in

α→β are s1 . . . sn, t1 . . . tm, so what we must show is: s ′1 . . . s ′n, t ′1 . . . t ′m ⊢ (α→β)′.

We’ll consider three separate cases:

· First case: β is true. Then α→β is also true, and so (α→β)′ is just α→β.

We may then construct the desired proof from {s ′1 . . . s ′n, t ′1 . . . t ′m} of α→β as follows. Here β′ is just β, so (b) tells us that t ′1 . . . t ′m ⊢ β. So we may

begin our desired proof with a proof of β from {t ′1 . . . t ′m} (ipso facto this

is a proof from {s ′1 . . . s ′n, t ′1 . . . t ′m}). We then use the technique of “adding

an antecedent” to move to α→β.

· Second case: α is false. Here α→β is again true, so again we must

construct a proof from {s ′1 . . . s ′n, t ′1 . . . t ′m} of α→β. α′ is now ∼α, so (a)

tells us that s ′1 . . . s ′n ⊢ ∼α. So, begin with a proof from {s ′1 . . . s ′n} (and so

from {s ′1 . . . s ′n, t ′1 . . . t ′m}) of ∼α. Then insert a proof of ∼α→(α→β) (ex

falso quodlibet, from section 2.5.1), and then by modus ponens conclude

α→β.

· The remaining case is where α is true (hence α′ is α) and β is false (hence

β′ is ∼β). Here α→β is false, so we must construct a proof of ∼(α→β) from {s ′1 . . . s ′n, t ′1 . . . t ′m}. Begin with a proof of α (guaranteed by (a)), and

continue with a proof of ∼β (guaranteed by (b)), and then continue as

follows:


.
.
i. α
.
.
j. ∼β
j+1. [∼∼(α→β)→∼β]→[(∼∼(α→β)→β)→∼(α→β)]        A3
j+2. ∼∼(α→β)→∼β        j, adding an antecedent
j+3. (∼∼(α→β)→β)→∼(α→β)        j+1, j+2, MP
j+4. α→[∼∼(α→β)→β]        see below
j+5. ∼∼(α→β)→β        i, j+4, MP
j+6. ∼(α→β)        j+3, j+5, MP

As for step j+4, consider the following proof of β from {α,∼∼(α→β)}:

1. α        member of {α,∼∼(α→β)}
2. ∼∼(α→β)        member of {α,∼∼(α→β)}
3. ∼∼(α→β)→(α→β)        exercise 2.4b

4. α→β 2, 3 MP

5. β 1, 4 MP

The existence of this proof shows that α,∼∼(α→β) ⊢ β. The deduction

theorem then allows us to conclude that α ⊢ ∼∼(α→β)→β.

And another use of the deduction theorem allows us to conclude that

⊢ α→[∼∼(α→β)→β]. So we know that there exists some proof of

α→[∼∼(α→β)→β]. Any such proof may be inserted into step j+4

above.

This completes the inductive proof of Lemma.

We’ll now use the deduction theorem to prove completeness:

Completeness: if ⊨ φ then ⊢ φ


Proof. Suppose ⊨ φ. We’ll prove ⊢ φ in a series of stages. Let s1 . . . sn be the

sentence letters in φ.

Stage 1a: Let I be any PL-interpretation in which sn is true. Lemma tells

us that s ′1 . . . s ′n ⊢ φ′. Since ⊨ φ, we know that φ is true in I , so φ′ will be φ.

Also, s ′n is sn. So we’ve learned: s ′1 . . . s ′n−1, sn ⊢ φ. By the deduction theorem,

s ′1 . . . s ′n−1 ⊢ sn→φ.

Stage 1b: Let’s now choose a different interpretation, just like I but in

which sn is false. If we apply Lemma again, s ′n is now ∼sn, and φ′ is still φ,

so we have: s ′1 . . . s ′n−1,∼sn ⊢ φ, from which we may infer s ′1 . . . s ′n−1 ⊢ ∼sn→φ by the deduction theorem. (Note: it’s legitimate to continue to use the same

names for the “primed” versions of s1 . . . sn−1, because our new interpretation

function assigns the same truth values to these sentence letters as did I .)

Stage 1c: We’ve learned that s ′1 . . . s ′n−1 ⊢ sn→φ and s ′1 . . . s ′n−1 ⊢ ∼sn→φ.

Let’s now construct a proof of φ from {s ′1, . . . s ′n−1} by first beginning with a

proof of sn→φ, continuing with a proof of ∼sn→φ, and then continuing as

follows:

.
.
i. sn→φ
.
.
j. ∼sn→φ
j+1. ∼φ→∼sn        i, contraposition 2, MP
j+2. ∼φ→∼∼sn        j, contraposition 2, MP
j+3. (∼φ→∼∼sn)→[(∼φ→∼sn)→φ]        A3
j+4. φ        j+1, j+2, j+3, MP (x2)

Thus we have shown that s ′1 . . . s ′n−1 ⊢ φ.

Stage 2 will show that s ′1 . . . s ′n−2 ⊢ φ. That is, it will remove one further of

the s ′i s on the left of the ⊢. It proceeds like stage 1:

Stage 2a: first choose a new interpretation just like the one chosen at the

end of stage 1b, except that sn−1 is true in it. Following the strategy of stage 1a,

show that s ′1 . . . s ′n−2 ⊢ sn−1→φ.

Stage 2b: choose a new interpretation just like the preceding, but in which

sn−1 is false. Following the strategy of 1b, show that s ′1 . . . s ′n−2 ⊢ ∼sn−1→φ.


Stage 2c: Following the strategy of 1c, show on the basis of stages 2a and

2b that s ′1 . . . s ′n−2 ⊢ φ.

Stage 2 removed one more of the s ′i s on the left of the ⊢. Stage 3 will remove

another one, and so we will have: s ′1 . . . s ′n−3 ⊢ φ. Each stage removes another

one; and so after the last stage, stage nc, they will all be gone and we will have

shown that ⊢ φ.

Chapter 3

Variations and Deviations from Standard Propositional Logic

As promised, we will not stop with the standard logics familiar from in-

troductory textbooks. In this chapter we examine some philosophically

important variations and deviations from standard propositional logic.

3.1 Alternate connectives

3.1.1 Symbolizing truth functions in propositional logic

Our propositional logic is in a sense “expressively complete”. To get at this idea,

let’s introduce the idea of a truth function. A truth function is a (finite-placed)

function that maps truth values (i.e., 0s and 1s ) to truth values. For example,

here is a truth function, f :

f (1) = 0
f (0) = 1

This is called a one-place function because it takes only one truth value as input.

In fact, we have a name for this truth function: negation. And we express that

truth function with our symbol ∼. So: negation is a truth function that we can

express in propositional logic.


Another truth function we can express is the conjunction truth function:

g (1,1) = 1
g (1,0) = 0
g (0,1) = 0
g (0,0) = 0

Conjunction is a two-place truth function, which means that it takes two truth

values as inputs. We have a symbol for this truth function as well: ∧.

Here’s another truth function:

i(1,1) = 0
i(1,0) = 1
i(0,1) = 1
i(0,0) = 1

Think of this truth function as “not both”. Unlike the negation and conjunction

truth functions, we don’t have a single symbol for this truth function. Never-

theless, it too can be expressed in propositional logic. If we want to express

“not-both (P , Q)”, we can just write:

∼(P∧Q)

In fact, it’s not hard to show that any truth function (of any finite number

of places) can be expressed in propositional logic using just the ∧,∨, and ∼.

Proof. The proof will be informal; but before giving it, we need a precise

definition of what it means to say that a truth function “can be expressed” in

propositional logic.

Definition of expressability: n-place truth function h can be expressed in

propositional logic iff there is some sentence of propositional logic, φ, con-

taining n sentence letters, P1 . . . Pn, which has the following feature: whenever

P1 . . . Pn have the truth values t1 . . . tn, respectively, then the whole sentence φ has the truth value h(t1 . . . tn).


Now for the proof. I’ll begin by illustrating the idea with an example. Suppose

we want to express the following three-place truth-function:

f (1,1,1) = 0
f (1,1,0) = 1
f (1,0,1) = 0
f (1,0,0) = 1
f (0,1,1) = 0
f (0,1,0) = 0
f (0,0,1) = 1
f (0,0,0) = 0

We must construct a sentence with three sentence letters, P1, P2, and P3, whose

truth table “matches” function f . Now, if we ignore everything but the numbers

in the above picture of function f , we can think of it as a kind of truth table

for the sentence we’re after. The first column of numbers represents the truth

values of P1, the second column, the truth values of P2, and the third column,

the truth values of P3; and the far right column represents the truth values that

the desired formula should have. Each row represents a possible combination

of truth values for these sentence letters. Thus, the second row (“f (1,1,0) = 1”)

is the combination where P1 is 1, P2 is 1, and P3 is 0; the fact that the fourth

column in this row is 1 indicates that the desired formula should be true here.

Since function f returns the value 1 in just three cases (rows two, four, and

seven), the sentence we’re after should be true in exactly those three cases: (a)

when P1, P2, P3 take on the three truth values in the second row (i.e., 1, 1, 0);

(b) when P1, P2, P3 take on the three truth values in the fourth row (1, 0, 0); and

(c) when P1, P2, P3 take on the three truth values in the seventh row (0, 0, 1) .

Now, we can construct a sentence that is true in case (a) and false otherwise:

P1∧P2∧∼P3. We can also construct a sentence that’s true in case (b) and false

otherwise: P1∧∼P2∧∼P3. And we can also construct a sentence that’s true in

case (c) and false otherwise: ∼P1∧∼P2∧P3. But then we can simply disjoin these

three sentences to get the sentence we want:

(P1∧P2∧∼P3)∨ (P1∧∼P2∧∼P3)∨ (∼P1∧∼P2∧P3)

(Strictly speaking the three-way conjunctions, and the three-way disjunction,

need parentheses added, but since it doesn’t matter where they’re added—

conjunction and disjunction are associative—I’ve left them off.)


This strategy is in fact purely general. Any n-place truth function, f , can

be represented by a chart like the one above. Each row in the chart consists of

a certain combination of n truth values, followed by the truth value returned

by f for those n inputs. For each such row, construct a conjunction whose

i th conjunct is Pi if the i th truth value in the row is 1, and ∼Pi if the i th truth

value in the row is 0. Notice that the conjunction just constructed is true if and

only if its sentence letters have the truth values corresponding to the row in

question. The desired formula is then simply the disjunction of all and only

the conjunctions for rows where the function f returns the value 1.¹ Since the

conjunction for a given row is true iff its sentence letters have the truth values

corresponding to the row in question, the resulting disjunction is true iff its

sentence letters have truth values corresponding to one of the rows where f returns the value 1, which is what we want.
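The row-by-row construction can be sketched in a few lines of code. This is only an illustration under my own assumptions: truth functions are Python functions returning 1 or 0, and the helper names `dnf_rows`, `show` and `value` are hypothetical.

```python
from itertools import product

def dnf_rows(f, n):
    """Rows (tuples of 1s and 0s) on which the n-place truth function f returns 1."""
    return [row for row in product([1, 0], repeat=n) if f(*row) == 1]

def show(rows, n):
    """Write out the disjunction of the conjunctions built from the given rows."""
    if not rows:
        # special case: f returns 0 everywhere, so use a logically false formula
        return "∧".join(["P1", "∼P1"] + [f"P{i}" for i in range(2, n + 1)])
    def conj(row):
        # the i-th conjunct is Pi if the i-th value is 1, and ∼Pi if it is 0
        return "∧".join(f"P{i+1}" if v else f"∼P{i+1}" for i, v in enumerate(row))
    return " ∨ ".join(f"({conj(r)})" for r in rows)

def value(rows, assignment):
    """Truth value of the disjunction: a row's conjunction is true iff
    the assignment to P1..Pn matches that row exactly."""
    return int(any(tuple(assignment) == r for r in rows))

def f(a, b, c):
    # the three-place example: 1 on (1,1,0), (1,0,0) and (0,0,1), else 0
    return int((a, b, c) in {(1, 1, 0), (1, 0, 0), (0, 0, 1)})

rows = dnf_rows(f, 3)
print(show(rows, 3))
# (P1∧P2∧∼P3) ∨ (P1∧∼P2∧∼P3) ∨ (∼P1∧∼P2∧P3)
```

Checking `value(rows, r) == f(*r)` for all eight rows `r` confirms that the constructed sentence matches the function in this example.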

Say that a set of connectives is adequate iff one can express all the truth

functions using a sentence containing only those connectives. What we just

showed was that the set {∧,∨,∼} is adequate. We can then use this fact to

prove that other sets of connectives are adequate. For example, it is easy to

prove that φ∨ψ has the same truth table as (is true relative to exactly the same

PL-interpretations as) ∼(∼φ∧∼ψ). But that means that for any sentence χ whose only connectives are ∧,∨, and ∼, we can construct another sentence χ ′

with the same truth table but whose only connectives are ∧ and∼: simply begin

with χ and use the equivalence between φ∨ψ and ∼(∼φ∧∼ψ) to eliminate

all occurrences of ∨ in favor of occurrences of ∧ and ∼. But now consider

any truth function f . Since {∧,∨,∼} is adequate, f can be expressed by some

sentence χ ; but χ has the same truth table as some sentence χ ′ whose only

connectives are ∧ and ∼; hence f can be expressed by χ ′ as well. So {∧,∼} is

adequate.

Similar arguments can be given to show that other connective sets are

adequate as well. For example, the ∧ can be eliminated in favor of the→ and

the ∼ (since φ∧ψ has the same truth table as ∼(φ→∼ψ)); therefore, since

{∧,∼} is adequate, {→, ∼} is also adequate.
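The two equivalences these arguments rely on are easy to confirm by running through all four rows of the truth table. Here is a minimal sketch; the helper name `same_table` is my own.

```python
from itertools import product

def same_table(f, g):
    """True iff the two-place truth functions f and g agree on every row."""
    return all(f(p, q) == g(p, q) for p, q in product([True, False], repeat=2))

disj = lambda p, q: p or q
conj = lambda p, q: p and q
imp = lambda p, q: (not p) or q

# φ∨ψ versus ∼(∼φ∧∼ψ)
disj_from_conj = lambda p, q: not ((not p) and (not q))
# φ∧ψ versus ∼(φ→∼ψ)
conj_from_imp = lambda p, q: not imp(p, not q)

print(same_table(disj, disj_from_conj))  # True
print(same_table(conj, conj_from_imp))   # True
```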

¹ Special case: if there are no such rows (i.e., if the function returns 0 for all inputs),

then let the formula be simply any logically false formula containing P1 . . . Pn , for example

P1∧∼P1∧P2∧P3∧· · ·∧Pn .


3.1.2 Inadequate connective sets

Can we show that certain sets of connectives are not adequate?

We can quickly answer yes, for a trivial reason. The set {∼} isn’t adequate,

for the simple reason that, since ∼ is a one-place connective, no sentence with

more than one sentence letter can be built using just ∼. So there’s no hope of

symbolizing all the n-place truth functions, for any n > 1, using just the ∼.

More interestingly, we can show that there are inadequate connective sets

containing two-place connectives. Let’s prove that {∧,→} is not an adequate set

of connectives. We’ll do this by proving that if those were our only connectives,

we couldn’t express the negation truth function. And we’ll demonstrate that by

proving the following fact:

(+) For any sentence,φ, containing just sentence letter P and the connectives

∧ and→, φ is true in any PL-interpretation in which P is true

We’ll again use the method of induction. We want to show that (+) holds for all

sentences. So we first prove that (+) is true for all sentences with no connectives

(i.e., for sentences containing just sentence letters.) This is the base case, and

is very easy here, since if φ has no connectives, then obviously φ is just the

sentence letter P itself, in which case, clearly, φ is true in any PL-interpretation

in which P is true. Next we assume the inductive hypothesis:

(ih) (+) is true for sentences φ and ψ. (That is, in any interpretation in which

P is true, both φ and ψ are true.)

And we try to show, on the basis of this assumption, that (+) is true for φ∧ψ and for φ→ψ. This is easy to do. First we show that (+) is true for φ∧ψ;

that is, φ∧ψ is true in any interpretation in which P is true. But we know

by the inductive hypothesis that φ and ψ are individually true in any such

interpretation. But then, we know from the truth table for ∧ that φ∧ψ is also

true in any such interpretation. The reasoning is exactly parallel for φ→ψ: the

inductive hypothesis tells us that whenever P is true, so are φ and ψ, and then

we know that in this case φ→ψ must also then be true, by the truth table for

→. Therefore, by induction, the result is proved.
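The inductive argument can be cross-checked by brute force. In the sketch below, a sentence containing only P is represented by its truth table, a pair giving its value when P is 1 and its value when P is 0. That encoding, and the helper name `closure`, are my own assumptions; the code just collects every table reachable from P using ∧ and →.

```python
def closure(start, ops):
    """Close a set of truth tables under two-place operations applied row by row."""
    tables = set(start)
    while True:
        new = {tuple(op(a, b) for a, b in zip(s, t))
               for op in ops for s in tables for t in tables}
        if new <= tables:      # nothing new: we have every expressible table
            return tables
        tables |= new

AND = lambda a, b: a & b
IMP = lambda a, b: (1 - a) | b

P = (1, 0)   # P's value when P is 1, then when P is 0
reachable = closure({P}, [AND, IMP])
print(sorted(reachable))       # [(1, 0), (1, 1)]
print((0, 1) in reachable)     # False: the table for ∼P is never reached
```

Every reachable table has 1 in its first position, which is what (+) says; and since the table for negation, (0, 1), is missing, negation cannot be expressed.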


Exercise 3.1 Let’s define the connective % to have the following

truth table:

%  1  0
1  0  1
0  1  0

Can all the truth functions be expressed using just the %? Justify

your answer.

3.1.3 Sheffer stroke

We’ve seen how we can choose alternate sets of connectives. Some of these

choices are adequate (i.e., allow symbolization of all truth functions), others

are not.

As we saw, there are some truth functions that can be expressed in propositional

logic, but not by a single connective (e.g., the not-both function i discussed above.)

We could change this, by adding a new connective. Let’s use a new connec-

tive, the “Sheffer stroke”, |, to express not-both. φ|ψ is to mean that not both

φ and ψ are true, so let’s stipulate that φ|ψ will have the same truth table as

∼(φ∧ψ), i.e:

|  1  0
1  0  1
0  1  1

Now here’s an exciting thing about |: it’s an adequate connective all on its own.

You can express all the truth functions using just |!

Here’s how we can prove this. We showed above that {→, ∼} is adequate;

so all we need to do is show how to define the → and the ∼ using just the |.

Defining ∼ is easy; φ|φ has the same truth table as ∼φ. As for φ→ψ, think of

it this way. φ→ψ is equivalent to ∼(φ∧∼ψ), i.e., φ|∼ψ. But given the method

just given for defining ∼ in terms of |, we know that ∼ψ is equivalent to ψ|ψ.

Thus, φ→ψ has the same truth table as: φ|(ψ|ψ).
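Both definitions can be checked mechanically. A small sketch, with function names of my own choosing, that builds ∼ and → out of nothing but the not-both function:

```python
from itertools import product

def nand(p, q):
    """The Sheffer stroke: 1 unless both inputs are 1."""
    return int(not (p and q))

def neg(p):
    return nand(p, p)             # φ|φ

def imp(p, q):
    return nand(p, nand(q, q))    # φ|(ψ|ψ)

for p, q in product([1, 0], repeat=2):
    print(p, q, neg(p), imp(p, q))
```

The printed rows agree with the usual tables: `neg(p)` is always 1 − p, and `imp(p, q)` is 0 only when p is 1 and q is 0.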


Exercise 3.2 For each of the following two truth functions, f and

g , first find a sentence that expresses it in standard propositional

logic (i.e., with ∼, ∧, ∨, ↔, →); then find a sentence that expresses

it using just the Sheffer stroke:

f (1,1) = 1        g (1,1,1) = 1
f (1,0) = 0        g (1,1,0) = 0
f (0,1) = 0        g (1,0,1) = 1
f (0,0) = 1        g (1,0,0) = 1
                   g (0,1,1) = 1
                   g (0,1,0) = 1
                   g (0,0,1) = 0
                   g (0,0,0) = 1

Feel free to avoid writing out extremely long formulas by making

abbreviations, saying things like “make such-and-such substitutions

throughout”, etc.

Exercise 3.3 Show that all truth functions can be defined using

just ↓ (nor). The truth table for ↓ is the following:

↓  1  0
1  0  0
0  0  1

3.2 Polish notation

Alternate connectives, like the Sheffer stroke, are called “variations” of standard

logic because they don’t really change what we’re saying with propositional

logic; it’s just a change in notation.

Another fun change in notation is Polish notation. The basic idea of Polish

notation is that the connectives all go before the sentences they connect. Instead

of writing P∧Q, we write ∧PQ. Instead of writing P∨Q we write ∨PQ.

Formally, here is the definition of a wff:


Definition of wffs for Polish notation:

· sentence letters are wffs

· if φ and ψ are wffs, then so are:

∼φ    ∧φψ    ∨φψ    →φψ    ↔φψ

What’s the point? This notation eliminates the need for parentheses. With the

usual notation, in which we put the connectives between the sentences they

connect, we need parentheses to distinguish, e.g.:

(P∧Q)→R
P∧(Q→R)

But with Polish notation, these are distinguished without parentheses; they

become:

→∧PQR
∧P→QR

respectively.
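To see why no parentheses are needed, it may help to look at a sketch of a translator and a parser. The nested-tuple encoding of formulas and the names `to_polish` and `parse_polish` are assumptions of mine, and the parser assumes single-character sentence letters.

```python
BINARY = {"∧", "∨", "→", "↔"}

def to_polish(f):
    """f is a sentence letter (str), ("∼", sub), or (connective, left, right)."""
    if isinstance(f, str):
        return f
    if f[0] == "∼":
        return "∼" + to_polish(f[1])
    op, left, right = f
    return op + to_polish(left) + to_polish(right)

def parse_polish(s, i=0):
    """Read one wff starting at position i; return (formula, next position)."""
    c = s[i]
    if c == "∼":
        sub, j = parse_polish(s, i + 1)
        return ("∼", sub), j
    if c in BINARY:
        left, j = parse_polish(s, i + 1)
        right, k = parse_polish(s, j)
        return (c, left, right), k
    return c, i + 1   # a sentence letter

a = ("→", ("∧", "P", "Q"), "R")   # (P∧Q)→R
b = ("∧", "P", ("→", "Q", "R"))   # P∧(Q→R)
print(to_polish(a), to_polish(b))  # →∧PQR ∧P→QR
```

The parser never has to guess: each connective announces how many wffs follow it, so `parse_polish("→∧PQR")` recovers `a` and `parse_polish("∧P→QR")` recovers `b`.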

Exercise 3.4 Translate each of the following into Polish notation:

a) P↔∼P

b) (P→(Q→(R→∼∼(S∨T ))))

c) [(P∧∼Q)∨(∼P∧Q)]↔∼[(P∨∼Q)∧(∼P∨Q)]

3.3 Multi-valued logic²

Logicians have considered adding a third truth value to the usual two. In these

new systems, in addition to truth (1) and falsity (0) , we have a third truth value,

² See Gamut (1991a, pp. 173-183).


#. There are a number of things one could take # to mean (e.g., “meaningless”,

or “undefined”, or “unknown”).

Standard logic is “bivalent”—that means that there are no more than two

truth values. So, moving from standard logic to a system that admits a third

truth value is called “denying bivalence”. One could deny bivalence, and go

even further, and admit four, five, or even infinitely many truth values. But

we’ll only discuss trivalent systems—i.e., systems with only three truth values.

Why would one want to admit a third truth value? There are various

philosophical reasons one might give. One concerns vagueness. A person with

one dollar is not rich. A person with a million dollars is rich. Somewhere in the

middle, there are some people that are hard to classify. Perhaps a person with

$100,000 is such a person. They seem neither definitely rich nor definitely

not rich. So there’s pressure to say that the statement “this person is rich” is

capable of being neither definitely true nor definitely false. It’s vague.

Others say we need a third truth value for statements about the future. If

it is in some sense “not yet determined” whether there will be a sea battle

tomorrow, then (it is argued) the sentence:

There will be a sea battle tomorrow

is neither true nor false. In general, statements about the future are neither

true nor false if there is nothing about the present that determines their truth

value one way or the other.³

Yet another case in which some have claimed that bivalence fails concerns

failed presupposition. Consider this sentence:

Ted stopped beating his dog.

In fact, I have never beaten a dog. I don’t even have a dog. So is it true that

I stopped beating my dog? Obviously not. But on the other hand, is this

statement false? Certainly no one would want to assert its negation: “Ted has

not stopped beating his dog”. The sentence presupposes that I was beating a dog;

since this presupposition is false, the question of the sentence’s truth does not

arise: the sentence is neither true nor false.

For a final challenge to bivalence, consider the sentence:

³ An alternate view preserves the “openness of the future” as well as bivalence: both ‘There

will be a sea battle tomorrow’ and ‘There will fail to be a sea battle tomorrow’ are false. This

combination is not contradictory, provided one rejects the equivalence of “It will be the case

tomorrow that ∼φ” and “∼ it will be the case tomorrow that φ”.


Sherlock Holmes has a mole on his left leg

‘Sherlock Holmes’ doesn’t refer to a real entity. Further, Sir Arthur Conan

Doyle does not specify in his Sherlock Holmes stories whether Holmes has

such a mole. Either of these reasons might be argued to result in a truth value

gap for the displayed sentence.

It’s an interesting philosophical question whether any of these arguments

for bivalence’s failing are any good. But we won’t take up that question. Instead,

we’ll look at the formal result of giving up bivalence. That is, we’ll introduce

some non-bivalent formal systems. We won’t ask whether these systems really

model English correctly.

These systems all give different truth tables for the Boolean connectives.

The original truth tables give you the truth values of complex formulas based

on whether their sentence letters are true or false (1 or 0) . The new truth

tables need to take into account cases where the sentence letters are # (neither

1 nor 0) .

3.3.1 Łukasiewicz’s system

Here are the new truth tables (let’s skip the ↔):

∼
1  0
0  1
#  #

∧  1  0  #
1  1  0  #
0  0  0  0
#  #  0  #

∨  1  0  #
1  1  1  1
0  1  0  #
#  1  #  #

→  1  0  #
1  1  0  #
0  1  1  1
#  1  #  1

Using these truth tables, one can calculate truth values of wholes based on

truth values of parts.

Example 3.1: Where P is 1, Q is 0 and R is #, calculate the truth value of

(P∨Q)→∼(R→Q). First, what is R→Q? Answer, from the truth table for→: #.

Next, what is∼(R→Q)? From the truth table for∼, we know that the negation

of a # is a #. So, ∼(R→Q) is #. Next, P∨Q: that’s 1∨0, i.e., 1. Finally, the

whole thing: 1→#, i.e., #.

We can formalize this a bit more by defining new notions of an interpretation,

and of truth relative to an interpretation:

Definition of trivalent interpretation: A trivalent interpretation is a func-

tion that assigns to each sentence letter exactly one of the values: 1, 0, #.


Definition of trivalent valuation: For any trivalent interpretation, I , the

Łukasiewicz-valuation for I , ŁVI , is defined as the function that assigns to

each wff either 1, 0, or #, and which is such that, for any wffs φ and ψ,

ŁVI (φ) = I (φ) if φ is a sentence letter

ŁVI (φ∧ψ) = 1 if ŁVI (φ) = 1 and ŁVI (ψ) = 1
            0 if ŁVI (φ) = 0 or ŁVI (ψ) = 0
            # otherwise

ŁVI (φ∨ψ) = 1 if ŁVI (φ) = 1 or ŁVI (ψ) = 1
            0 if ŁVI (φ) = 0 and ŁVI (ψ) = 0
            # otherwise

ŁVI (φ→ψ) = 1 if ŁVI (φ) = 0, or ŁVI (ψ) = 1, or ŁVI (φ) = ŁVI (ψ) = #
            0 if ŁVI (φ) = 1 and ŁVI (ψ) = 0
            # otherwise

ŁVI (∼φ) = 1 if ŁVI (φ) = 0
           0 if ŁVI (φ) = 1
           # otherwise
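The clauses of this definition translate almost line for line into a recursive function. The sketch below assumes a nested-tuple encoding of wffs and represents the third value by the string "#"; the function name `lv` is hypothetical.

```python
def lv(f, I):
    """Łukasiewicz valuation of wff f under trivalent interpretation I
    (a dict sending each sentence letter to 1, 0 or "#").
    Assumed encoding: a letter is a str; ("∼", sub); (connective, left, right)."""
    if isinstance(f, str):
        return I[f]
    if f[0] == "∼":
        a = lv(f[1], I)
        return {1: 0, 0: 1}.get(a, "#")
    a, b = lv(f[1], I), lv(f[2], I)
    if f[0] == "∧":
        return 1 if a == 1 and b == 1 else 0 if a == 0 or b == 0 else "#"
    if f[0] == "∨":
        return 1 if a == 1 or b == 1 else 0 if a == 0 and b == 0 else "#"
    if f[0] == "→":
        if a == 0 or b == 1 or (a == "#" and b == "#"):
            return 1
        return 0 if a == 1 and b == 0 else "#"
    raise ValueError(f"unknown connective {f[0]!r}")

# the formula and interpretation of Example 3.1: (P∨Q)→∼(R→Q)
example = ("→", ("∨", "P", "Q"), ("∼", ("→", "R", "Q")))
I = {"P": 1, "Q": 0, "R": "#"}
print(lv(("→", "R", "Q"), I), lv(("∨", "P", "Q"), I), lv(example, I))
```

The three printed values are those of R→Q, of P∨Q, and of the whole formula, so the function can be used to recheck a hand calculation step by step.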

Let’s define validity and semantic consequence for Łukasiewicz’s system much

like we did for standard PL:

Łukasiewicz definitions of validity and consequence:

· φ is Łukasiewicz-valid (“⊨Ł φ”) iff for every trivalent interpretation I ,

ŁVI (φ) = 1

· φ is a Łukasiewicz-semantic-consequence of Γ (“Γ ⊨Ł φ”) iff for every

trivalent interpretation, I , if ŁVI (γ ) = 1 for each γ ∈ Γ, then ŁVI (φ) = 1

Notice that there are now two ways a formula can fail to be valid. It can

be 0 under some trivalent interpretation, or it can be # under some trivalent


interpretation. “Valid” (under this definition) means always true; it does not mean never false. (Similarly, the defined notion of semantic consequence is

that of truth-preservation, not nonfalsity-preservation.) The definition leaves

it open that a formula might be never-false, and still not be always-true: such a

formula would be sometimes # and sometimes 1, but never 0.

Example 3.2: Is P ∨∼P Łukasiewicz-valid? Answer: no, it isn’t. Suppose P is #. Then ∼P is #; but then the whole thing is # (since #∨# is #.)

Example 3.3: Is P→P Łukasiewicz-valid? Answer: yes. P could be either 1,

0 or #. From the truth table for→, we see that P→P is 1 in all three cases.
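Validity checks like these can be automated by running through every trivalent assignment. The sketch below uses a numeric encoding that is not in the text but reproduces the tables above: # is written as 0.5, ∧ is min, ∨ is max, ∼φ is 1 minus the value of φ, and φ→ψ is min(1, 1 − φ + ψ). The name `valid` is my own.

```python
from itertools import product

H = 0.5  # assumed numeric stand-in for the third value #

NEG = lambda a: 1 - a
AND = min
OR = max
IMP = lambda a, b: min(1, 1 - a + b)

def valid(formula, n):
    """True iff formula (a function of n truth values) is 1
    under every assignment of 1, 0 or # to its n sentence letters."""
    return all(formula(*vals) == 1 for vals in product([1, 0, H], repeat=n))

print(valid(lambda p: OR(p, NEG(p)), 1))  # False: P∨∼P is # when P is #
print(valid(lambda p: IMP(p, p), 1))      # True: P→P is 1 in all three cases
```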

Exercise 3.5 Assuming Łukasiewicz’s tables, show that the→ is

not definable in terms of the ∼, ∧, and ∨. That is, show that there

is no wff φ such that i) φ contains just the sentence letters P and

Q, plus the connectives ∼, ∧, and ∨ (plus parentheses), and ii) φ has the same truth table as P→Q (i.e., φ is true in exactly the same

Łukasiewicz-valuations as P→Q).

Exercise 3.6 As noted, the definition of Łukasiewicz-validity leaves

it open that a formula might be never-false, and still not be always-

true. Give an example of such a formula.

3.3.2 Kleene’s “strong” tables

This system is like Łukasiewicz’s system, except that the truth table for the → is different:

→  1  0  #
1  1  0  #
0  1  1  1
#  1  #  #

As with Łukasiewicz’s system, let’s continue to understand validity as truth in

all trivalent interpretations, and semantic consequence as the preservation of

truth in a given trivalent interpretation.


Here is the intuitive idea behind the Kleene tables. Let’s call the truth values

0 and 1 the “classical” truth values. If a formula’s halves have only classical truth

values, then the truth value of the whole formula is just the classical truth value

determined by the classical truth values of the halves. But if one or both halves

are #, then we must consider the result of turning each # into one of the classical

truth values. If the entire formula would sometimes be 1 and sometimes be 0 after doing this, then the entire formula is #. But if the entire formula always

takes the same truth value, X, no matter which classical truth value any #s are

turned into, then the entire formula gets this truth value X. Intuitively: if there

is “enough information” in the classical truth values of a formula’s parts to settle

on one particular classical truth value, then that truth value is the formula’s

truth value.

Take the truth table for φ→ψ, for example. When φ is 0 and ψ is #, the

whole formula is 1, because the false antecedent is sufficient to make the whole

formula true, no matter what classical truth value we convert ψ to. On the

other hand, when φ is 1 and ψ is #, then the whole formula is #. The reason is

that what classical truth value we substitute in for ψ’s # affects the truth value of

the whole. If the # becomes a 0 then the whole thing is 0; but if the # becomes

a 1 then the whole thing is 1.

There are two important differences between Łukasiewicz’s and Kleene’s

systems. The first is that, unlike Łukasiewicz’s system, Kleene’s system makes

the formula P→P invalid. The reason is that in Kleene’s system, #→# is #; thus,

P→P isn’t true in all valuations (it is # in the valuation where P is #.)

In fact, it’s easy to show that there are no valid formulas in Kleene’s system.

Proof. Consider the trivalent interpretation that makes every sentence letter #. Here’s an

inductive proof that every wff is # in this interpretation. Base case: all the

sentence letters are # in this interpretation. (That’s obvious.) Inductive step:

assume that φ and ψ are both # in this interpretation. We need now to show

that ∼φ, φ∧ψ, φ∨ψ, and φ→ψ are all # in this interpretation. But that’s easy: just

look at the truth tables for ∼, ∧, ∨ and →. ∼# is #, #∧# is #, #∨# is #, and #→# is #.

Even though there are no valid formulas in Kleene’s system, there are still

cases of semantic consequence. Semantic consequence for Kleene’s system is

defined as truth-preservation: Γ ⊨Kleene φ iff φ is true whenever every member

of Γ is true, given Kleene’s truth tables. Then P∧Q ⊨Kleene P , since the only

way for P∧Q to be true is for P to be true and Q to be true.

The second (related) difference is that in Kleene’s system, → is interdefinable

with the ∼ and ∨, in that φ→ψ has exactly the same truth table as


∼φ∨ψ. (Look at the truth tables to verify that this is true.) But that’s not true

for Łukasiewicz’s system. In Łukasiewicz’s system, when φ and ψ are both #,

then φ→ψ is 1, but ∼φ∨ψ is #.
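Here is a quick way to carry out that verification over all nine rows. The sketch writes the third value as the string "#", and the function names are my own; the tables it encodes are the ones displayed above.

```python
from itertools import product

V = [1, 0, "#"]

def neg(a):
    return {1: 0, 0: 1, "#": "#"}[a]

def disj(a, b):
    return 1 if 1 in (a, b) else 0 if (a, b) == (0, 0) else "#"

def kleene_imp(a, b):
    return 1 if a == 0 or b == 1 else 0 if (a, b) == (1, 0) else "#"

def luk_imp(a, b):
    # Łukasiewicz's → differs from Kleene's only in making #→# equal to 1
    return 1 if (a, b) == ("#", "#") else kleene_imp(a, b)

rows = list(product(V, repeat=2))
k_diff = [(a, b) for a, b in rows if kleene_imp(a, b) != disj(neg(a), b)]
l_diff = [(a, b) for a, b in rows if luk_imp(a, b) != disj(neg(a), b)]
print(k_diff, l_diff)   # [] [('#', '#')]
```

The first list is empty, so Kleene’s φ→ψ and ∼φ∨ψ never come apart; the second shows the single row on which Łukasiewicz’s → departs from ∼φ∨ψ.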

3.3.3 Kleene’s “weak” tables (Bochvar’s tables)

This final system is based on a very different intuitive idea: that # is “infectious”.

That is, if any formula has a part that is #, then the entire formula is #. Thus,

the tables are as follows:

∼
1  0
0  1
#  #

∧  1  0  #
1  1  0  #
0  0  0  #
#  #  #  #

∨  1  0  #
1  1  1  #
0  1  0  #
#  #  #  #

→  1  0  #
1  1  0  #
0  1  1  #
#  #  #  #

So basically, the classical bit of each truth table is what you’d expect; but

everything gets boring if any constituent formula is a #.

One way to think about these tables is to think of the # as indicating nonsense.

The sentence “The sun is purple and blevledgekl;rz”, one might naturally think,

is neither true nor false because it is nonsense. It is nonsense even though it

has a part that isn’t nonsense.
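The “infectious” rule is simple enough to state as one short function: check for # first, and only then consult the classical table. A minimal sketch, with the name `weak` and the string "#" as my own conventions:

```python
V = [1, 0, "#"]

def weak(op, a, b):
    """Weak Kleene (Bochvar) value of a two-place connective:
    # if either input is #, otherwise the classical value."""
    if "#" in (a, b):
        return "#"
    classical = {"∧": a and b, "∨": a or b, "→": (1 - a) or b}
    return classical[op]

for op in "∧∨→":
    print(op, [[weak(op, a, b) for b in V] for a in V])
```

Each printed list of rows has # throughout its last row and last column, with the classical table in the remaining corner.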

3.3.4 Supervaluationism

Recall the guiding thought behind the strong Kleene tables: if a formula’s

classical truth values fix a particular truth value, then that is the value that the

formula takes on. There is a way to take this idea a step further, which results

in a new and interesting way of thinking about three-valued logic.

According to the strong Kleene tables, we get a classical truth value for

φ©ψ, where © is any connective, only when we have “enough classical

information” in the truth values of φ and ψ to fix a classical truth value for

φ© ψ. Consider φ∧ψ for example: if either φ or ψ is false, then since

falsehood of a conjunct is classically sufficient for the falsehood of the whole

conjunction, the entire formula is false. But if, on the other hand, both φ and

ψ are #, then neither φ nor ψ has a classical truth value, we do not have enough

classical information to settle on a classical truth value for φ∧ψ, and so the

whole formula is #.


But now consider a special case of the situation just considered, where φ is

P , ψ is ∼P , and P is #. According to the strong Kleene tables, the conjunction

P∧∼P is #, since it is the conjunction of two formulas that are #. But there is a

way of thinking about truth values of complex sentences according to which

the truth value ought to be 0, not #: no matter what classical truth value P were

to take on, the whole sentence P∧∼P would be 0—therefore, one might think,

P∧∼P ought to be 0. If P were 0 then P∧∼P would be 0∧∼0—that is 0; and if

P were 1 then P∧∼P would be 1∧∼1—0 again.

The general thought here is this: suppose a sentence φ contains some

sentence letters P1 . . . Pn that are #. If φ would be false no matter how we assign

classical truth values to P1 . . . Pn (that is, no matter how we precisified φ), then

φ is in fact false. Further, if φ would be true no matter how we precisified it,

then φ is in fact true. But if precisifying φ would sometimes make it true and

sometimes make it false, then φ in fact is #.

The idea here can be thought of as an extension of the idea behind the

strong Kleene tables. Consider a formula φ©ψ, where © is any connective.

If there is enough classical information in the truth values of φ and ψ to fix on a

particular classical truth value, then the strong Kleene tables assign φ©ψ that

truth value. Our new idea goes further, and says: if there is enough classical

information within φ and ψ to fix a particular classical truth value, then φ©ψ gets that truth value. Information “within” φ and ψ includes, not only the

truth values of φ and ψ, but also a certain sort of information about sentence

letters that occur in both φ and ψ. For example, in P∧∼P , when P is #, there

is insufficient classical information in the truth values of P and of ∼P to settle

on a truth value for the whole formula P∧∼P (since each is #). But when we

look inside P and ∼P , we get more classical information: we can use the fact

that P occurs in each to reason as we did above: whenever we turn P to 0, we

turn ∼P to 1, and so P∧∼P becomes 0; and whenever we turn P to 1 we turn

∼P to 0, and so again, P∧∼P becomes 0.

This new idea—that a formula has a classical truth value iff every way of

precisifying it results in that truth value—is known as supervaluationism. Let us

lay out this idea formally.

Where I is a trivalent interpretation and C is a PL-interpretation (i.e., a

bivalent interpretation in the sense of section 2.3), say that C is a precisification of I iff: whenever I assigns a sentence letter a classical truth value (i.e., 1 or 0),

C assigns that sentence letter the same classical value. Thus, precisifications

of I agree with I on the classical truth values, but in addition, being PL-interpretations, they also assign classical truth values to sentence letters to

which I assigns #. Each precisification of I “decides” each of I’s #s in some

way or other; different precisifications decide those #s in different ways.

We can now say how the supervaluationist assigns truth values to complex

formulas relative to a trivalent interpretation.

Definition of supervaluation: When φ is any wff and I is a trivalent interpretation, the supervaluation of φ relative to I is the function S_I(φ) which

assigns 0, 1, or # to each wff as follows:

S_I(φ) = 1 if V_C(φ) = 1 for every precisification, C, of I
S_I(φ) = 0 if V_C(φ) = 0 for every precisification, C, of I
S_I(φ) = # otherwise

Here V_C is the valuation for PL-interpretation C, as defined in section 2.3.

Some common terminology: when S_I(φ) = 1, we say that φ is supertrue in I,

and when S_I(φ) = 0, we say that φ is superfalse in I. For the supervaluationist,

a formula is true when it is supertrue (i.e., true in all precisifications of I), false

when it is superfalse (i.e., false in all precisifications of I), and # when it is

neither supertrue nor superfalse (i.e., when it is true in some precisifications of

I but false in others.)
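The definition of supervaluation can be tried out mechanically by enumerating precisifications. The following is a minimal sketch under some assumptions of my own: sentence letters are Python strings, complex formulas are nested tuples such as ("and", "P", ("not", "P")), a trivalent interpretation is a dictionary, and the function names are hypothetical. Since the classical value of φ depends only on the letters occurring in φ, the sketch restricts precisifications to those letters.

```python
from itertools import product

def letters(phi):
    # The sentence letters occurring in a formula
    if isinstance(phi, str):
        return {phi}
    return set().union(*(letters(sub) for sub in phi[1:]))

def classical_value(phi, c):
    # V_C(φ): the ordinary bivalent valuation for PL-interpretation c
    if isinstance(phi, str):
        return c[phi]
    if phi[0] == "not":
        return 1 - classical_value(phi[1], c)
    a, b = classical_value(phi[1], c), classical_value(phi[2], c)
    if phi[0] == "and":
        return min(a, b)
    if phi[0] == "or":
        return max(a, b)
    if phi[0] == "imp":
        return max(1 - a, b)
    raise ValueError(f"unknown connective: {phi[0]!r}")

def precisifications(i, relevant):
    # Every bivalent assignment to the relevant letters that agrees with
    # the trivalent interpretation i wherever i is classical
    undecided = sorted(p for p in relevant if i[p] == "#")
    for values in product((0, 1), repeat=len(undecided)):
        c = {p: i[p] for p in relevant if i[p] != "#"}
        c.update(zip(undecided, values))
        yield c

def supervaluation(phi, i):
    # S_I(φ): 1 if supertrue, 0 if superfalse, "#" otherwise
    results = {classical_value(phi, c) for c in precisifications(i, letters(phi))}
    return results.pop() if len(results) == 1 else "#"

print(supervaluation(("and", "P", ("not", "P")), {"P": "#"}))   # 0
print(supervaluation(("and", "P", "Q"), {"P": 1, "Q": "#"}))    # #
```

The brute-force enumeration doubles with every letter that is #, so this is a tool for checking small examples rather than an efficient procedure.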

Supervaluational notions of validity and semantic consequence may be

defined thus:

Supervaluational validity and consequence:

· φ is supervaluationally valid (“⊨_SV φ”) iff φ is supertrue in every trivalent

interpretation

· φ is a supervaluational semantic consequence of Γ (“Γ ⊨_SV φ”) iff φ is

supertrue in each trivalent interpretation in which every member of Γ is

supertrue

To return to the example considered above: the supervaluationist assigns a

different truth value to P∧∼P , when P is #, than do the strong Kleene tables

(and indeed, than do all the other tables we have considered.) The strong

Kleene tables say that P∧∼P is # in this case. But the supervaluationist says

that it is 0: each precisification of any trivalent interpretation that assigns P # is by definition a PL-interpretation, and P∧∼P is 0 in each PL-interpretation.


Let us note a few facts about supervaluationism.

First, note that every tautology (PL-valid formula) turns out to be supervaluationally valid. For let φ be a tautology; and consider any trivalent interpretation

I, and any precisification C of I. Precisifications are PL-interpretations;

so, since φ is a tautology, φ is true in C. Since C was an arbitrary precisification of I, φ is supertrue in I.

Second, note that according to supervaluationism, some formulas are nei-

ther true nor false in some trivalent interpretations. For instance, take the

formula P∧Q, in any trivalent interpretation I in which P is 1 and Q is #. Any

precisification of I must continue to assign 1 to P. But some precisifications of

I will assign 1 to Q, whereas others will assign 0 to Q. Any of the former

precisifications will assign 1 to P∧Q, whereas any of the latter will assign 0 to

P∧Q. Hence P∧Q is neither supertrue nor superfalse in I: S_I(P∧Q) = #.

Finally, notice that the propositional connectives are not truth-functional

according to supervaluationism. To say that a connective © is truth-functional,

given a certain definition of truth in an interpretation, is to say that the truth

value [4] according to that definition of a complex statement whose major connective

© is a function of the truth values of the immediate constituents of that

formula; that is, any two such formulas whose immediate constituents have

the same truth values must themselves have the same truth value. According to

all of the truth tables for three-valued logic we considered earlier (Łukasiewicz,

Kleene strong and weak), the propositional connectives are truth-functional.

Indeed, this is in a way trivial: if a connective weren’t truth-functional according

to some definition of truth in an interpretation, then one couldn’t give that

connective a truth table at all; what a truth table does is specify how a connective

determines the truth value of entire sentences as a function of the truth values

of its parts. But supervaluationism renders the connectives not truth functional.

Consider the following pair of sentences, in a trivalent interpretation in which

P and Q are both #:

P∧Q
P∧∼P

As we have seen, in such trivalent interpretations, the first formula is # and

the second formula is 0 (since it is superfalse). But each of these formulas is

a conjunction, each of whose conjuncts is #: the truth values of φ and ψ do

not determine the truth value of φ∧ψ. So in supervaluationism, the ∧ isn’t

truth-functional. Similar arguments can be given to show that the → and the

∨ aren’t truth-functional either, given supervaluationism.

[4] I am counting #, in addition to 1 and 0, as a “truth value”.
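The failure of truth-functionality can be exhibited by direct computation. In this sketch, which rests on my own illustrative choices, each formula is represented as a Python function from a bivalent assignment to 0 or 1, and `supervalue` (a hypothetical name) collects the formula's values over all precisifications of a trivalent interpretation.

```python
from itertools import product

def supervalue(formula, trivalent):
    # formula: a function from a bivalent assignment (a dict) to 0 or 1.
    # Returns 1 if supertrue, 0 if superfalse, "#" otherwise.
    undecided = [p for p, v in trivalent.items() if v == "#"]
    seen = set()
    for values in product((0, 1), repeat=len(undecided)):
        c = dict(trivalent)
        c.update(zip(undecided, values))
        seen.add(formula(c))
    return seen.pop() if len(seen) == 1 else "#"

I = {"P": "#", "Q": "#"}            # a trivalent interpretation with P and Q both #

P = lambda c: c["P"]
Q = lambda c: c["Q"]
not_P = lambda c: 1 - c["P"]
P_and_Q = lambda c: min(c["P"], c["Q"])
P_and_not_P = lambda c: min(c["P"], 1 - c["P"])

conjuncts_1 = (supervalue(P, I), supervalue(Q, I))       # values of P and Q
conjuncts_2 = (supervalue(P, I), supervalue(not_P, I))   # values of P and ∼P

print(conjuncts_1, supervalue(P_and_Q, I))       # ('#', '#') #
print(conjuncts_2, supervalue(P_and_not_P, I))   # ('#', '#') 0
```

Both conjunctions have conjuncts with the same values, yet the conjunctions themselves receive different values, so no three-valued table for ∧ could reproduce the supervaluationist's verdicts.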

Exercise 3.7 Show that if a formula is true in a trivalent interpretation

given the strong Kleene truth tables, then it is supertrue in

that interpretation.

Exercise 3.8 Suppose that a wff φ has no repetitions of sentence

letters (i.e., each sentence letter occurs at most once in φ.) Show that

φ is not valid according to supervaluationism; that is, show that

it’s not the case that: for each trivalent interpretation I, S_I(φ) = 1.

3.4 Intuitionism

Intuitionism in the philosophy of mathematics is a view according to which

there are no mind-independent mathematical facts. Rather, mathematical facts

and entities are mental constructs that owe their existence to the mental activity

of mathematicians constructing proofs. This philosophy of mathematics leads

intuitionists to a distinctive form of logic: intuitionist logic.

Let P be the statement: The sequence 0123456789 occurs somewhere in the decimal expansion of π. How should we think about its meaning? For the classical mathematician, the answer is straightforward. P is a statement about a part of

mathematical reality, namely, the infinite decimal expansion of π. Either the

sequence 0123456789 occurs somewhere in that expansion, in which case P is

true, or it does not, in which case P is false and ∼P is true.

For the intuitionist, this whole picture is mistaken, premised as it is on the

reality of an infinite decimal expansion of π. Our minds are finite, and so only

the finite initial segment of π’s decimal expansion that we have constructed so

far is real. The intuitionist’s alternate picture of P’s meaning, and indeed of

meaning generally (for mathematical statements) is a radical one. [5]

[5] One intuitionist picture, anyway, on which see Dummett (1973). What follows is a crude

sketch. It does not do justice to the actual intuitionist position, which is, as they say, subtle.

The classical mathematician, comfortable with the idea of a realm of mind-independent

entities, thinks of meaning in terms of truth and falsity. As we saw,

she thinks of P as being true or false depending on the facts about π’s decimal

expansion. Further, she explains the meanings of the propositional connectives

in truth-theoretic terms: a conjunction is true iff each of its conjuncts is true;

a negation is true iff the negated formula is false; and so on. Intuitionists, on

the other hand, reject the centrality of truth to meaning, since truth is tied

up with the rejected picture of mind-independent mathematical reality. For

them, the central semantic concept is that of proof. They simply do not think

in terms of truth and falsity; in matters of meaning, they think in terms of the

conditions under which formulas have been proved.

Take P , for example. Intuitionists advise us: don’t think in terms of what

it would take for P to be true. Think, rather, in terms of what it would take

to prove P . And the answer is clear: we would need to actually continue our

construction of the decimal expansion of π to a point where we found the

sequence 0123456789.

What, now, of ∼P? Again, thinking in terms of proof, not truth: what

would it take for ∼P to be proved? The answer here is less straightforward.

Since P said that there exists a number of a certain sort, it was clear how it

would have to be proved: by actually exhibiting (calculating) some particular

number of that sort. But ∼P says that there is no number of a certain sort; how

do we prove something like that? The intuitionist’s answer: by proving that the assumption that there is a number of that sort leads to a contradiction. In general, a

negation, ∼φ, is proved by proving that φ leads to a contradiction. [6]

Similarly for the other connectives: the intuitionist explicates their meanings

by their role in generating proof conditions, rather than truth conditions. φ∧ψ is proved by separately giving a proof of φ and a proof of ψ; φ∨ψ is proved

by giving either a proof of φ or a proof of ψ; φ→ψ is proved by exhibiting a

construction whereby any proof of φ can be converted into a proof of ψ.

Likewise, the intuitionist thinks of logical consequence as the preservation

of provability, not the preservation of truth. For example, φ∧ψ logically implies

φ because if one has a proof of φ∧ψ, then one has a proof of φ; and conversely,

if one has proofs of φ and ψ separately, then one has the materials for a proof

of φ∧ψ. So far, so classical. But ∼∼φ does not logically imply φ, for the

intuitionist. Simply having a proof of ∼∼P—a proof that the assumption that

0123456789 occurs nowhere in π’s decimal expansion leads to a contradiction—

wouldn’t give us a proof of P , since proving P would require exhibiting a

particular place in π’s decimal expansion where 0123456789 occurs.

[6] Given the contrast with the classical conception of negation, a different symbol (often

“¬”) is sometimes used for intuitionist negation.


Likewise, intuitionists do not accept the law of the excluded middle, φ∨∼φ,

as a logical truth. To be a logical truth, according to an intuitionist, a sentence

should be provable from no premises whatsoever. But to prove P∨∼P , for

example, would require either exhibiting a case of 0123456789 in π’s decimal

expansion, or proving that the assumption that 0123456789 occurs in π’s decimal

expansion leads to a contradiction. We’re not in a position to do either.

Though we won’t consider intuitionist predicate logic, one of its most

striking features is easy to grasp informally. Intuitionists say that an existentially

quantified sentence is proved iff one of its instances has been proved. Therefore

they reject the inference from ∼∀xF x to ∃x∼F x, for one might be able to

prove a contradiction from the assumption of ∀xF x without being able to

prove any instance of ∃x∼F x.

We have so far been considering a putative philosophical justification for

intuitionist propositional logic. That justification has been rough and ready;

but intuitionist propositional logic itself is easy to present, perfectly precise,

and is a coherent system regardless of what one thinks of its philosophical

underpinnings. Two simple modifications to the natural deduction system of

section 2.4 generate a natural deduction system for intuitionistic propositional

logic. First, we need to split up the double-negation rule, DN, into two halves,

“double-negation introduction” and “double-negation elimination”:

Γ ⊢ ∼∼φ
---------- DNE
Γ ⊢ φ

Γ ⊢ φ
---------- DNI
Γ ⊢ ∼∼φ

In our original system from section 2.4 we were allowed to use both DNE and

DNI; but in the intuitionist system, we are only allowed to use DNI; DNE is

not allowed. Second, to make up for the dropped rule DNE, our intuitionist

system adds the rule “ex falso”:

Γ ⊢ φ∧∼φ
---------- EF
Γ ⊢ ψ

Note that EF can be proved in the original system: simply use RAA and then

DNE. So, intuitionist logic results from a system for classical logic by simply

dropping one rule (DNE) and adding another rule that was previously provable

(EF). It follows that every intuitionistically provable sequent is also classically

provable (because every intuitionistic proof can be converted to a classical

proof).


Notice how dropping DNE blocks proofs of various classical theorems the

intuitionist wants to avoid. The proof of ∅ ⊢ P∨∼P in section 2.4, for instance,

used DNE. Of course, for all we’ve said so far, there might be some other way

to prove this sequent. Only when we have a semantics for intuitionistic logic,

and a soundness proof relative to that semantics, can we show that this sequent

cannot be proven without DNE. We will discuss a semantics for intuitionism

in section 7.2.

It is interesting to note that even though intuitionists reject the inference

from ∼∼P to P , they accept the inference from ∼∼∼P to ∼P , since its proof

only requires the half of DN that they accept, namely the inference from P to

∼∼P :

1    (1) ∼∼∼P           As
2    (2) P              As (for reductio)
2    (3) ∼∼P            2, DN (accepted version)
1,2  (4) ∼∼P∧∼∼∼P       1,3 ∧I
1    (5) ∼P             4, RAA

Note that you can’t use this sort of proof to establish ∼∼P ⊢ P. Given the way

RAA is stated, its application always results in a formula beginning with the ∼.
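On the proof-conditional reading of the connectives, the derivation of ∼P from ∼∼∼P is itself a construction, and it can be written down as one. The following Lean 4 sketch is an illustration only: it assumes Lean's built-in reading of ¬P as P → False, and it is not the natural deduction system of section 2.4. No classical axiom is used in either proof.

```lean
-- Double-negation introduction, the half of DN that the intuitionist keeps:
-- from a proof p of P, any refutation np of P yields a contradiction.
example (P : Prop) (p : P) : ¬¬P :=
  fun np => np p

-- From ∼∼∼P to ∼P: assume P, obtain ∼∼P as above, and feed it to
-- the assumed proof h of ∼∼∼P to reach a contradiction.
example (P : Prop) (h : ¬¬¬P) : ¬P :=
  fun p => h (fun np => np p)
```

The second proof mirrors the five-line derivation: the inner function corresponds to the DN step, and the outer function corresponds to the final use of RAA. There is no analogous term for ∼∼P ⊢ P, since nothing in the assumption supplies a proof of P itself.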

Chapter 4

Predicate Logic

Let’s now turn from propositional logic to the “predicate calculus” (PC),

as it is sometimes called. As with propositional logic, we’re going to

formalize predicate logic. We’ll first do grammar, and then move to semantics.

We won’t consider proof theory at all. [1]

4.1 Grammar of predicate logic

As before, we start by specifying the kinds of symbols that may be used in

sentences of predicate logic (the primitive vocabulary) and then go on to define

the well formed formulas as strings of primitive vocabulary that have the right

form.


· logical: →, ∼, ∀

· nonlogical:

· for each n > 0, n-place predicates F ,G . . ., with or without subscripts

· variables x, y . . . with or without subscripts

· individual constants (names) a, b . . ., with or without subscripts

· parentheses

[1] Proof systems for predicate logic, in both axiomatic and natural-deduction form, are

straightforward, and can be found in standard logic textbooks.

No symbol of one type is a symbol of any other type. Let’s call any variable or

constant a term.

Definition of wff:

i) if Π is an n-place predicate and α1 . . .αn are terms, then Πα1 . . .αn is a wff

ii) if φ, ψ are wffs, and α is a variable, then ∼φ, (φ→ψ), and ∀αφ are wffs

iii) nothing else is a wff

We’ll call formulas that are wffs in virtue of clause i) “atomic” formulas. When

a formula has no free variables, we’ll say that it is a closed formula, or sentence;

otherwise it is an open formula. (“Free” means that the variable doesn’t “belong”

to any quantifier in the formula. For example, in ∀yRxy, the variable x is free,

whereas the variable y is “bound” to the quantifier ∀y. ‘Free’ and ‘bound’ can

be precisely defined, but I won’t bother.)
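For readers who would like to see what a precise definition of “free” might look like, here is one standard recursive formulation, written as a small program. Everything about the encoding is an assumption made for illustration: wffs are nested tuples, terms are strings, and the set `VARIABLES` stipulates which terms count as variables.

```python
VARIABLES = {"x", "y", "z"}   # assumed stock of variables; any other term is a constant

def free_variables(phi):
    # The set of variables with at least one free occurrence in phi
    kind = phi[0]
    if kind == "atom":                    # ("atom", predicate, terms)
        return {t for t in phi[2] if t in VARIABLES}
    if kind == "not":                     # ("not", φ)
        return free_variables(phi[1])
    if kind == "imp":                     # ("imp", φ, ψ)
        return free_variables(phi[1]) | free_variables(phi[2])
    if kind == "all":                     # ("all", α, φ): ∀α binds α in φ
        return free_variables(phi[2]) - {phi[1]}
    raise ValueError(f"not a wff: {phi!r}")

def is_sentence(phi):
    # A closed formula is one with no free variables
    return not free_variables(phi)

example = ("all", "y", ("atom", "R", ("x", "y")))   # ∀yRxy
print(free_variables(example))                      # {'x'}
print(is_sentence(example))                         # False
```

The clauses follow the definition of wff: atomic formulas contribute their variables, ∼ and → pass free variables through, and ∀α removes α from the free variables of the formula it governs.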

We have the same defined logical terms: ∧, ∨, ↔. We also add the following

definition of the existential quantifier:

Definition of ∃: “∃αφ” is short for “∼∀α∼φ” (where α is a variable and φ is

a wff)

4.2 Semantics of predicate logic

Recall from section 2.2 the truth-conditional conception of semantics. Semantics

is about meaning; meaning is about the way truth is determined by

the world; and the way we represent the dependence of truth on the world in

logic consists of i) defining certain abstract configurations, which represent

different ways the world could be, and ii) defining the notion of truth for

formulas in these configurations. We thereby shed light on meaning, and we

are thereby able to define precise versions of the notions of logical truth and

logical consequence.

In propositional logic, the configurations were the PL-interpretations:

assignments of truth or falsity (1 or 0) to sentence letters; and valuation functions

defined truth in a configuration. This procedure needs to get more complicated

for predicate logic. The reason is that the method of truth tables assumes that

we can calculate the truth value of a complex formula by looking at the truth

values of its parts. But take the sentence ∃x(F x∧Gx). You can’t calculate its

truth value by looking at the truth values of F x and Gx, since sentences like


F x don’t have truth values at all. The variable ‘x’ doesn’t stand for any one

thing, and so ‘F x’ doesn’t have a truth value.

The solution to this problem is due to the Polish logician Alfred Tarski. It

begins with a new conception of a configuration, that of a model:

Definition of model: A PC-model is an ordered pair ⟨D,I⟩ such that:

· D is a non-empty set (“the domain”)

· I is a function (“the interpretation function”) obeying the following

constraints:

· if α is a constant then I(α) ∈ D

· if Π is an n-place predicate, then I(Π) = some set of n-tuples of

members of D.

(Recall the notion of an n-tuple from section 1.8.)

Models, as we have defined them, seem like good ways to represent configurations

of the world. The domain, D, contains, intuitively, the individuals

that exist in the configuration. I, the interpretation function, tells us what the

non-logical constants (names and predicates) mean in the configuration. I assigns to each name a member of the domain: its denotation. For example,

if the domain is the set of persons, then the name ‘a’ might be assigned me.

One-place predicates get assigned sets of 1-tuples of D—that is, just sets of

members of D. So, a one-place predicate ‘F ’ might get assigned a set of persons.

That set is called the “extension” of the predicate—if the extension is the set

of males, then the predicate ‘F ’ might be thought of as symbolizing “is male”.

Two-place predicates get assigned sets of ordered pairs of members of D—that

is, binary relations over the domain. If a two-place predicate ‘R’ is assigned

the set of ordered pairs of persons ⟨u, v⟩ such that u is taller than v, we might think of ‘R’ as

symbolizing “is taller than”. Similarly, three-place predicates get assigned sets

of ordered triples…

Relative to any PC-model ⟨D,I⟩, we want to define the notion of truth in

a model (the corresponding valuation function). But we’ll need some apparatus

first. It’s pretty easy to see what truth value a sentence like Fa should have. I assigns a member of the domain to a; call that member u. I also assigns a

subset of the domain to F; let’s call that subset S. The sentence Fa should be

true iff u ∈ S—that is, iff the referent of a is a member of the extension of F .

That is, F a should be true iff I (a) ∈ I (F ). Similarly, Rab should be true iff

⟨I (a),I (b )⟩ ∈ I (R). And so on.


As before, we can give recursive clauses for the truth values of negations

and conditionals. φ→ψ, for example, will be true iff either φ is false or ψ is

true.

But this becomes tricky when we try to specify the truth value of ∀xF x. It

should, intuitively, be true if and only if ‘Fx’ is true, no matter what we put

in place of ‘x’. But this is vague. Do we mean “whatever name (constant)

we put in place of ‘x’”? No, because we don’t want to assume that we’ve got a

name for everything in the domain, and what if Fx is true for all the objects we

have names for, but false for one of the nameless things! Do we mean, “true no

matter what object from the domain we put in place of ‘x’”? No; objects from

the domain aren’t part of our primitive vocabulary, so the result of replacing ‘x’

with an object from the domain won’t be a formula! [2]

Tarski’s solution to this problem goes as follows. Initially, we don’t consider

truth values of formulas absolutely. Rather, we let the variables refer to certain

things in the domain temporarily. Then, we’ll say that ∀xF x will be true iff

for all objects u in the domain D: F x is true while x temporarily refers to u.

We implement this idea of temporary reference with the idea of a “variable

assignment”:

Definition of variable assignment: g is a variable assignment for model

⟨D,I ⟩ iff g is a function that assigns to each variable some object in D.

The variable assignments give the “temporary” meanings to the variables; when

g (x) = u, then u is the temporary denotation of x.

We need a further bit of notation. Let u be some object in D, let g be some

variable assignment, and let α be a variable. We then define “g^α_u” to be the

variable assignment that is just like g, except that it assigns u to α. (If g already

assigns u to α then g^α_u will be the same function as g.)

Note the following important fact about variable assignments: g^α_u, when

applied to α, must give the value u. (Work through the definitions to see that

this is so.) That is:

g^α_u(α) = u

One more bit of apparatus. Given any model M (= ⟨D,I⟩), and given any

variable assignment, g, and given any term (i.e., variable or name) α, we define

the denotation of α, relative to M and g, “[α]_{M,g}”, as follows:

[α]_{M,g} = I(α) if α is a constant
[α]_{M,g} = g(α) if α is a variable

The subscripts M and g on [ ] indicate that denotations are assigned relative

to a model (M), and relative to a variable assignment (g).

[2] Unless the domain happens to contain members of our primitive vocabulary!

Now we are ready to define the valuation function. The valuation function

will assign truth values to formulas relative to variable assignments. (Relativization

to assignments is necessary because, as we noticed before, F x doesn’t have a

truth value absolutely. It only has a truth value relative to an assigned value to

the variable x—i.e., relative to a choice of an arbitrary denotation for x.)

Definition of valuation: The PC-valuation function, V_{M,g}, for PC-model

M (= ⟨D,I⟩) and variable assignment g, is defined as the function that assigns

to each wff either 0 or 1 subject to the following constraints:

i) for any n-place predicate Π and any terms α1 . . . αn, V_{M,g}(Πα1 . . . αn) = 1 iff ⟨[α1]_{M,g}, . . . , [αn]_{M,g}⟩ ∈ I(Π)

ii) for any wffs φ, ψ, and any variable α:

V_{M,g}(∼φ) = 1 iff V_{M,g}(φ) = 0

V_{M,g}(φ→ψ) = 1 iff either V_{M,g}(φ) = 0 or V_{M,g}(ψ) = 1

V_{M,g}(∀αφ) = 1 iff for every u ∈ D, V_{M,g^α_u}(φ) = 1

(In understanding clause i), recall that the one-tuple containing just u, ⟨u⟩, is

just u itself. Thus, in the case where Π is F, some one-place predicate, clause i)

says that V_{M,g}(Fα) = 1 iff [α]_{M,g} ∈ I(F).)
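For models with a finite domain, the clauses of this definition can be run directly. The sketch below is an illustration under assumptions of my own: wffs are nested tuples, a model is a Python set `domain` together with a dictionary `interp`, variable assignments are dictionaries, and all function names are hypothetical. Unlike the text, the sketch keeps one-tuples distinct from their members, so a one-place predicate is assigned a set of Python 1-tuples.

```python
VARIABLES = {"x", "y", "z"}   # assumed stock of variables; other terms are constants

def denotation(term, interp, g):
    # [α]_{M,g}: g(α) if α is a variable, I(α) if α is a constant
    return g[term] if term in VARIABLES else interp[term]

def value(phi, domain, interp, g):
    # V_{M,g}(φ) for the model M = ⟨domain, interp⟩
    kind = phi[0]
    if kind == "atom":                        # ("atom", Π, (α1, ..., αn))
        _, pred, terms = phi
        return int(tuple(denotation(t, interp, g) for t in terms) in interp[pred])
    if kind == "not":                         # ("not", φ)
        return 1 - value(phi[1], domain, interp, g)
    if kind == "imp":                         # ("imp", φ, ψ)
        return max(1 - value(phi[1], domain, interp, g),
                   value(phi[2], domain, interp, g))
    if kind == "all":                         # ("all", α, φ)
        _, var, body = phi
        # {**g, var: u} plays the role of the assignment g^α_u
        return int(all(value(body, domain, interp, {**g, var: u}) for u in domain))
    raise ValueError(f"not a wff: {phi!r}")

domain = {"u", "v"}
interp = {"a": "u", "F": {("u",)}}            # F holds of u only
g = {"x": "u", "y": "u", "z": "u"}

all_F = ("all", "x", ("atom", "F", ("x",)))   # ∀xFx
print(value(all_F, domain, interp, g))                                   # 0
print(value(("imp", all_F, ("atom", "F", ("a",))), domain, interp, g))   # 1
```

The clause for ∀ is the only place where the assignment changes: the body is evaluated once for each member of the domain, with the quantified variable temporarily referring to that member.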

So far we have defined the notion of truth in a model relative to a variable assignment. But what we really want is a notion of truth in a model, period; that

is, absolute truth in a model. (We want this because we want to define, e.g., a

valid formula as one that is true in all models.) So, let’s define absolute truth in

a model in this way:

Definition of truth in a model: φ is true in PC-model M iff V_{M,g}(φ) = 1,

for each variable assignment g


It might seem that this is too strict a requirement—why must φ be true relative

to each variable assignment? But in fact, it’s not too strict at all. The kinds of

formulas we’re really interested in are formulas without free variables (we’re

interested in formulas like F a, ∀xF x, ∀x(F x→Gx); not formulas like F x,

∀xRxy, etc.) And if a formula has no free variables, then if there’s even a single

variable assignment relative to which it is true, then it is true relative to every

variable assignment. (And so, we could just as well have defined truth in a

model as truth relative to some variable assignment.) I won’t prove this fact,

but it’s not too hard to prove; one would simply need to prove (by induction)

that, for any wff φ and model M, if variable assignments g and h agree on all

variables free in φ, then V_{M,g}(φ) = V_{M,h}(φ).

Now we can give semantic definitions of the core logical notions:

Definition of validity: φ is PC-valid (“⊨_PC φ”) iff φ is true in all PC-models

Definition of semantic consequence: φ is a PC-semantic consequence of

set of wffs Γ (“Γ ⊨_PC φ”) iff for every PC-model M, if each member of Γ is

true in M then φ is also true in M

Since our new definition of the valuation function treats the propositional

connectives → and ∼ in the same way as the propositional logic valuation did,

it’s easy to see that it also treats the defined connectives ∧, ∨, and ↔ in the

same way:

VM ,g (φ∧ψ) = 1 iff VM ,g (φ) = 1 and VM ,g (ψ) = 1

VM ,g (φ∨ψ) = 1 iff VM ,g (φ) = 1 or VM ,g (ψ) = 1

VM ,g (φ↔ψ) = 1 iff VM ,g (φ) =VM ,g (ψ)

Moreover, we can also prove that the valuation function treats ∃ as it should

(given its intended meaning):

V_{M,g}(∃αφ) = 1 iff there is some u ∈ D such that V_{M,g^α_u}(φ) = 1

This can be established as follows. The definition of ∃αφ is: ∼∀α∼φ. So, we

must show that for any model, and any variable assignment g based on that

model, V_{M,g}(∼∀α∼φ) = 1 iff there is some u ∈ D such that V_{M,g^α_u}(φ) = 1. (In

arguments like these, I’ll sometimes stop writing the subscript M in order to

reduce clutter. It should be obvious from the context what the relevant model

is.) Here’s the argument:

· V_g(∼∀α∼φ) = 1 iff V_g(∀α∼φ) = 0 (given the clause for ∼ in the definition

of the valuation function)

· But, V_g(∀α∼φ) = 0 iff for some u ∈ D, V_{g^α_u}(∼φ) = 0

· Given the clause for ∼, this can be rewritten as: “… iff for some u ∈ D,

V_{g^α_u}(φ) = 1”

4.3 Establishing validity and invalidity

Given our definitions, we can establish that particular formulas are valid.

Example 4.1: Show that ∀xFx→Fa is valid. That is, show that for any

model ⟨D,I⟩, and any variable assignment g, V_g(∀xFx→Fa) = 1:

i) Suppose otherwise; then V_g(∀xFx) = 1 and V_g(Fa) = 0.

ii) Given the latter, that means that [a]_g ∉ I(F); that is, I(a) ∉ I(F).

iii) Given the former, for any u ∈ D, V_{g^x_u}(Fx) = 1.

iv) I(a) ∈ D, and so V_{g^x_{I(a)}}(Fx) = 1.

v) By the truth condition for atomics, [x]_{g^x_{I(a)}} ∈ I(F).

vi) By the definition of the denotation of a variable, [x]_{g^x_{I(a)}} = g^x_{I(a)}(x).

vii) But g^x_{I(a)}(x) = I(a). Thus, I(a) ∈ I(F). Contradiction.

The claim in step iv) that I(a) ∈ D comes from the definition of an interpretation

function: the interpretation of a constant is always a member of the

domain. Notice that “I (a)” is a term of our metalanguage; that’s why, when

I’m given that “for any u ∈D” in step iii), I can set u equal to I (a) to obtain

step iv).

Example 4.2: Show that ⊨ ∀x∀yRxy→∀xRxx:

i) Suppose for reductio that V_g(∀x∀yRxy→∀xRxx) = 0 (for some assignment

g in some model). Then V_g(∀x∀yRxy) = 1 and …

ii) … V_g(∀xRxx) = 0.

iii) Given ii), for some u ∈ D, V_{g^x_u}(Rxx) = 0, and so ⟨[x]_{g^x_u}, [x]_{g^x_u}⟩ ∉ I(R).

iv) [x]_{g^x_u} is g^x_u(x), i.e., u. So ⟨u,u⟩ ∉ I(R).

v) Given i), for every member of D, and so for u in particular, V_{g^x_u}(∀yRxy) = 1.

vi) Given v), for every member of D, and so for u in particular, V_{g^{xy}_{uu}}(Rxy) = 1. (Here g^{xy}_{uu} is the assignment just like g^x_u except that it assigns u to y.)

vii) Given vi), ⟨[x]_{g^{xy}_{uu}}, [y]_{g^{xy}_{uu}}⟩ ∈ I(R).

viii) But [x]_{g^{xy}_{uu}} and [y]_{g^{xy}_{uu}} are each just u. Hence ⟨u,u⟩ ∈ I(R), contradicting

iv).

Exercise 4.1 Show that:

a) ⊨ ∀x(Fx→(Fx∨Gx))

b) ⊨ ∀x(Fx∧Gx)→(∀xFx∧∀xGx)

c) ∀x(Fx→Gx), ∀x(Gx→Hx) ⊨ ∀x(Fx→Hx)

d) ⊨ ∃x∀yRxy→∀y∃xRxy

We’ve seen how to establish that particular formulas are valid. How do

we show that a formula is invalid? We need to simply exhibit a single model

in which the formula is false. (The definition of validity specifies that a valid

formula is true in all models; therefore, it only takes one model in which a

formula is false to make that formula invalid.) So let’s take one example; let’s

show that the formula (∃xF x∧∃xGx)→∃x(F x∧Gx) isn’t valid. To do this, we

must produce a model in which this formula is false. All we need is a single

model, since in order for the formula to be valid, it must be true in all models.

My model will contain letters in its domain:

D = {u,v}
I(F) = {u}
I(G) = {v}


It is intuitively clear that the formula is false in this model. In this model,

something has F (namely, u), and something has G (namely, v), but nothing

has both.
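The intuitive verdict can be double-checked by computation. This sketch simply transcribes the model into Python sets and uses Python's `any` in place of ∃, relying on the derived truth conditions for ∃ and ∧ from section 4.2; the variable names are mine.

```python
D = {"u", "v"}
F = {"u"}          # I(F)
G = {"v"}          # I(G)

some_F = any(x in F for x in D)                  # ∃xFx
some_G = any(x in G for x in D)                  # ∃xGx
some_both = any(x in F and x in G for x in D)    # ∃x(Fx∧Gx)

antecedent = some_F and some_G                   # ∃xFx∧∃xGx
conditional = (not antecedent) or some_both      # the whole formula

print(antecedent, some_both, conditional)        # True False False
```

The antecedent is true and the consequent false, so the conditional is false in this model, which is all that invalidity requires.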

One further example: let’s show that ∀x∃yRxy ⊭ ∃y∀xRxy. We must show

that the first formula does not semantically imply the second. So we must come

up with a model in which the first formula is true and the second is false. It helps

to think about natural language sentences that these formulas might represent.

If R symbolizes “respects”, then the first formula says that “everyone respects

someone or other”, and the second says that “there is someone whom everyone

respects”. Clearly, the first can be true while the second is false: suppose that

each person respects a different person, so that no one person is respected by

everyone. A simple case of this occurs when there are just two people, each of

whom respects the other, but neither of whom respects him/herself:

[Diagram: two points, each with an arrow pointing to the other.]

Here is a model based on this idea:

D = {u,v}
I(R) = {⟨u,v⟩, ⟨v,u⟩}
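With so small a domain, we need not rely on ingenuity to find a countermodel: we can search for one. The sketch below first checks the model just given and then enumerates every binary relation over a two-element domain, keeping those that make the premise true and the conclusion false. The function and variable names are hypothetical, and the search is a brute-force illustration rather than a general method.

```python
from itertools import combinations

D = ["u", "v"]
pairs = [(x, y) for x in D for y in D]    # all ordered pairs over D

def premise(R):
    # ∀x∃yRxy: everything bears R to something
    return all(any((x, y) in R for y in D) for x in D)

def conclusion(R):
    # ∃y∀xRxy: something has R borne to it by everything
    return any(all((x, y) in R for x in D) for y in D)

model_R = {("u", "v"), ("v", "u")}        # the extension of R in the model above
print(premise(model_R), conclusion(model_R))

# Every binary relation over D, represented as a set of ordered pairs
relations = [set(c) for n in range(len(pairs) + 1) for c in combinations(pairs, n)]
countermodels = [R for R in relations if premise(R) and not conclusion(R)]
for R in countermodels:
    print(sorted(R))
```

A search of this kind can establish invalidity but never validity, since validity requires truth in all models, including those with larger and infinite domains.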

Exercise 4.2 Show that

a) ⊭ ∀x(Fx→Gx)→∀x(Gx→Fx)

b) ⊭ ∀x(Fx∨∼Gx)→(∀xFx∨∼∃xGx)

c) Rab ⊭ ∃xRxx

d) ∀x∀y∀z[(Rxy∧Ryz)→Rxz], ∀x∃yRxy ⊭ ∃xRxx

Chapter 5

Extensions of Predicate Logic

The predicate logic we considered in the previous chapter is powerful.

Much natural language discourse can be represented using it, in a way

that reveals logical structure. Nevertheless, it has its limitations. In this chapter

we consider some of its limitations, and corresponding additions to predicate

logic.

5.1 Identity

“Standard” predicate logic is usually taken to include the identity sign (“=”).

“a=b” means that a and b are one and the same thing.

5.1.1 Grammar for the identity sign

We first need to expand our grammar of predicate logic to allow for the new

symbol =. Two changes are needed. First, we need to add = to the primitive

vocabulary of predicate logic. Then we need to add the following clause to the

definition of a well-formed formula:

· If α and β are terms, then α=β is a wff

We need to beware of a potential source of confusion. We’re now using the

symbol ‘=’ as the object-language symbol for identity. But I’ve also been using

‘=’ as the metalanguage symbol for identity, for instance when I write things

like “V(φ) = 1”. This shouldn’t generally cause confusion, but if there’s a

danger of misunderstanding, I’ll clarify by writing things like: “…= (i.e., is the

same object as)…”.

5.1.2 Semantics for the identity sign

This is easy. We keep the notion of a PC-model from the last chapter, and simply add to our definition of truth-in-a-PC-model. All we need to add is a clause to the definition of a valuation function telling it what truth values to give to sentences containing the = sign. Here is the clause:

V_{M,g}(α=β) = 1 iff: [α]_{M,g} = (i.e., is the same object as) [β]_{M,g}

That is, the sentence α = β is true iff the terms α and β refer to the same

object.

Example 5.1: Show that the formula ∀x∃y x=y is valid. We need to show that in any model, and for any variable assignment g in that model, V_g(∀x∃y x=y) = 1. (Below, g^{xy}_{uv} is the assignment just like g except that it assigns u to x and v to y.)

i) Suppose for reductio that for some g in some model, V_g(∀x∃y x=y) = 0.

ii) Given the clause for ∀, for some object in the domain, call it “u”, V_{g^x_u}(∃y x=y) = 0.

iii) Given the clause for ∃, for every v in the domain, V_{g^{xy}_{uv}}(x=y) = 0.

iv) Letting v in iii) be u, we have: V_{g^{xy}_{uu}}(x=y) = 0.

v) So, given the clause for “=”, [x]_{g^{xy}_{uu}} is not the same object as [y]_{g^{xy}_{uu}}.

vi) But [x]_{g^{xy}_{uu}} and [y]_{g^{xy}_{uu}} are the same object: [x]_{g^{xy}_{uu}} is g^{xy}_{uu}(x), i.e., u; and [y]_{g^{xy}_{uu}} is g^{xy}_{uu}(y), i.e., u. This contradicts line v).

Exercise 5.1 Demonstrate each of the following:

a) Fab ⊨ ∀x(x=a→Fxb)

b) ∃x∃y∃z(Fx∧Fy∧Fz∧x≠y∧x≠z∧y≠z), ∀x(Fx→(Gx∨Hx)) ⊭ ∃x∃y∃z(Gx∧Gy∧Gz∧x≠y∧x≠z∧y≠z)


5.1.3 Symbolizations with the identity sign

Why do we ever add anything to our list of logical constants? Why not stick

with the tried and true logical constants of propositional and predicate logic?

We generally add a logical constant when it has a distinctive inferential and

semantic role, and when it has very general application—when, that is, it occurs

in a wide range of linguistic contexts. We studied the distinctive semantic role

of ‘=’ in the previous section. In this section, we’ll look at the range of linguistic

contexts that can be symbolized using ‘=’.

The most obvious sentences that may be symbolized with ‘=’ are those

that explicitly concern identity, such as “Mark Twain is identical to Samuel

Clemens”:

t=c

and “Every man fails to be identical to George Sand”:

∀x(Mx→∼x=s)

(It will be convenient to abbreviate ∼α=β as α≠β. Thus, the second symbolization can be rewritten as: ∀x(Mx→x≠s).) But many other sentences involve the concept of identity in subtler ways.

Consider, for example, “Every lawyer hates every other lawyer”. The ‘other’ signifies nonidentity; we have, therefore:

∀x(Lx→∀y[(Ly∧x≠y)→Hxy])

Consider next “Only Ted can change grades”. This means: “no one other than Ted can change grades”, and may therefore be symbolized as:

∼∃x(x≠t∧Cx)

(letting ‘Cx’ symbolize “x can change grades”.)

Another interesting class of sentences concerns number. We cannot symbolize “There are at least two dinosaurs” as “∃x∃y(Dx∧Dy)”, since this would be true even if there were only one dinosaur: x and y could be assigned the same dinosaur. The identity sign to the rescue:

∃x∃y(Dx∧Dy∧x≠y)

This says that there are two different objects, x and y, each of which is a dinosaur. To say “There are at least three dinosaurs” we say:

∃x∃y∃z(Dx∧Dy∧Dz∧x≠y∧x≠z∧y≠z)

Indeed, for any n, one can construct a sentence φ_n that symbolizes “there are at least n Fs”:

φ_n: ∃x_1…∃x_n(Fx_1∧…∧Fx_n∧δ)

where δ is the conjunction of all sentences “x_i≠x_j” where i and j are integers between 1 and n (inclusive) and i < j. (The sentence δ says in effect that no two of the variables x_1…x_n stand for the same object.)

Since we can construct each φ_n, we can symbolize other sentences involving number as well. To say that there are at most n Fs, we write: ∼φ_{n+1}. To say that there are between n and m Fs (where m > n), we write: φ_n∧∼φ_{m+1}. To say that there are exactly n Fs, we write: φ_n∧∼φ_{n+1}.
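The recipe for φ_n is mechanical enough to be written as a short program. The following Python sketch is an illustration only; the function names at_least, at_most and exactly are my own, and the output uses the abbreviations ∧ and ≠ rather than primitive notation.

```python
from itertools import combinations

def at_least(n, pred="F"):
    """Return the sentence φ_n ("there are at least n F's") as a string."""
    if n < 1:
        raise ValueError("n must be at least 1")
    xs = [f"x{i}" for i in range(1, n + 1)]
    quantifiers = "".join(f"∃{x}" for x in xs)
    conjuncts = [f"{pred}{x}" for x in xs]
    # δ: one conjunct xi≠xj for each pair of variables with i < j
    conjuncts += [f"{a}≠{b}" for a, b in combinations(xs, 2)]
    return f"{quantifiers}({'∧'.join(conjuncts)})"

def at_most(n, pred="F"):
    """∼φ_{n+1}: it is not the case that there are at least n+1 F's."""
    return "∼" + at_least(n + 1, pred)

def exactly(n, pred="F"):
    """φ_n ∧ ∼φ_{n+1}."""
    return f"({at_least(n, pred)}∧{at_most(n, pred)})"

print(at_least(2, "D"))  # ∃x1∃x2(Dx1∧Dx2∧x1≠x2)
```

Note that δ contains n(n−1)/2 conjuncts, so these sentences grow quickly: φ_5 already has ten nonidentity clauses.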

These methods for constructing sentences involving number will always

work; but one can often construct shorter numerical symbolizations by other

methods. For example, to say “there are exactly two dinosaurs”, instead of

saying “there are at least two dinosaurs, and it’s not the case that there are at

least three dinosaurs”, we could say instead:

∃x∃y(Dx∧Dy∧x≠y∧∀z[Dz→(z=x∨z=y)])

Exercise 5.2 Symbolize each of the following, using predicate

logic with identity.

a) Everyone who loves someone else loves everyone

b) The only truly great player who plays in the NBA is Allen

Iverson

c) If a person shares a solitary confinement cell with a guard, then they are the only people in the cell

d) There are at least five dinosaurs (What is the shortest symbolization you can find?)


5.2 Function symbols

A singular term, such as ‘Ted’, ‘New York City’, ‘George W. Bush’s father’, or ‘the sum of 1 and 2’, is a term that purports to refer to a single entity. Notice that some of these have semantically significant structure. ‘George W. Bush’s father’, for example, means what it does because of the meaning of ‘George W. Bush’ and the meaning of ‘father’. But standard predicate logic’s only (constant) singular terms are its names: a, b, c…, which do not have semantically significant parts. Thus, using predicate logic’s names to symbolize semantically complex English singular terms leads to an inadequate representation.

Suppose, for example, that we give the following symbolizations:

“3 is the sum of 1 and 2”: a = b

“George W. Bush’s father was a politician”: P c

By symbolizing ‘the sum of 1 and 2’ as simply ‘b’, the first symbolization ignores the fact that ‘1’, ‘2’, and ‘sum’ are semantically significant constituents of ‘the sum of 1 and 2’; and by symbolizing “George W. Bush’s father” as ‘c’, we ignore the semantically significant occurrences of ‘George W. Bush’ and ‘father’. This is a bad idea. We ought, rather, to produce symbolizations of these terms that take account of their semantic complexity. The symbolizations ought to account for the distinctive logical behavior of sentences containing the complex terms. For example, the sentence “George W. Bush’s father was a politician” logically implies the sentence “Someone’s father was a politician”. This ought to be reflected in the symbolizations; the first sentence’s symbolization ought to semantically imply the second sentence’s symbolization.

One way of doing this is via an extension of predicate logic: we add function symbols to its primitive vocabulary. Think of “George W. Bush’s father” as the result of plugging “George W. Bush” into the blank in “___’s father”. “___’s father” is an English function symbol. Function symbols are like predicates in some ways. The predicate “___ is happy” has a blank in it, in which you can put a name. “___’s father” is similar in that you can put a name into its blank. But there is a difference: when you put a name into the blank of a predicate, you get a complete sentence, whereas when you put a name into the blank of “___’s father”, you get a noun phrase, such as “George W. Bush’s father”.

Corresponding to English function symbols, we’ll add logical function symbols. We’ll symbolize “___’s father” as f(___). We can put names into the blank here. Thus, we’ll symbolize “George W. Bush’s father” as “f(a)”, where “a” stands for “George W. Bush”.

We need to add two more complications. First, what goes into the blank doesn’t have to be a name; it could be something that itself contains a function symbol. E.g., in English you can say: “George W. Bush’s father’s father”. We’d symbolize this as: f(f(a)).

Second, just as we have multi-place predicates, we have multi-place function symbols. “The sum of 1 and 2” contains the function symbol “the sum of ___ and ___”. When you fill in the blanks with the names “1” and “2”, you get the noun phrase “the sum of 1 and 2”. So, we symbolize this using the two-place function symbol “s(___, ___)”. If we let “a” symbolize “1” and “b” symbolize “2”, then “the sum of 1 and 2” becomes: s(a, b).

The result of plugging names into function symbols in English is a noun phrase. Noun phrases combine with predicates to form complete sentences. Function symbols function analogously in logic. Once you combine a function symbol with a name, you can take the whole thing, apply a predicate to it, and get a complete sentence. Thus, the sentence “George W. Bush’s father was a politician” becomes:

P f (a)

And “3 is the sum of 1 and 2” becomes:

c = s(a, b )

(here “c” symbolizes “3”). We can put variables into the blanks of function

symbols, too. Thus, we can symbolize “Someone’s father was a politician” as

∃xP f (x)

Example 5.2: Symbolize the following sentences using predicate logic with

identity and function symbols:

Everyone loves his or her father

∀xLx f (x)

No one’s father is also his or her mother

∼∃x f (x)=m(x)


No one is his or her own father

∼∃x x= f (x)

A person’s maternal grandfather hates that person’s paternal grandmother

∀x H f (m(x)) m( f (x))

Every even number is the sum of two prime numbers

∀x(E x→∃y∃z(P y∧P z∧x=s(y, z)))


Exercise 5.3 Symbolize each of the following, using predicate logic with identity and function symbols.

a) The product of an even number and an odd number is an

even number.

b) If the square of a number that is divisible by each smaller

number is odd, then that number is greater than all numbers. (I

know, the sentence is silly.)

5.2.1 Grammar for function symbols

We need to update our definition of a wff to allow for function symbols. First, we need to add to our vocabulary. So, the new definition starts like this (the new bit is the clause for function symbols):


· logical: →, ∼, ∀, =

· nonlogical:


· for each n > 0, n-place function symbols f, g, …, with or without subscripts

· variables x, y, … with or without subscripts



· parentheses

The definition of a wff, actually, stays the same; all that needs to change is the definition of a “term”. Before, terms were just names or variables. Now, we need to allow for f(a), f(f(a)), etc., to be terms. This is done by the following recursive definition of a term:¹

Definition of terms:

· names and variables are terms

· if f is an n-place function symbol, and α1…αn are terms, then f(α1…αn) is a term

· nothing else is a term

5.2.2 Semantics for function symbols

We now need to update our definition of a PC-model by saying what the interpretation of a function symbol is. That’s easy: the interpretation of an n-place function symbol ought to be an n-place function defined on the model’s domain, i.e., a rule that maps any n members of the model’s domain to another member of the model’s domain. For example, in a model in which the one-place function symbol f(___) is to represent “___’s father”, the interpretation of f will be the function that assigns to any member of the domain that object’s father. Here’s the new general definition of a model (a “PC+FS-model”, for “predicate calculus plus function symbols”):

Definition of model: A PC+FS-model is an ordered pair ⟨D,I ⟩ such that:

· D is a non-empty set (“the domain”)

· I is a function (“the interpretation function”) obeying the following

constraints:


· If α is a constant, then I(α) ∈ D

· If Π is an n-place predicate, then I(Π) is a set of n-tuples of members of D

· If f is an n-place function symbol, then I(f) is an n-place (total) function defined on D

(“Total” simply means that the function must yield an output for any n members of D.)

¹ Note that complex terms formed from function symbols with more than one place do not, officially, contain commas. But I’ll informally include the commas to improve readability. I will write, for example, f(x, y) instead of f(xy).

The definition of a valuation function stays the same; all we need to do is update the definition of denotation to accommodate our new complex terms. Since we now can have arbitrarily long terms (not just names or variables), we need a recursive definition:

Definition of denotation: For any model M (= ⟨D,I⟩), variable assignment g, and term α, the denotation of α relative to M and g, [α]_{M,g}, is defined as follows:

[α]_{M,g} =

· I(α), if α is a constant

· g(α), if α is a variable

· I(f)([α1]_{M,g}…[αn]_{M,g}), if α is a complex term f(α1…αn)

Note the recursive nature of this definition: the denotation of a complex term is defined in terms of the denotations of its smaller parts. Let’s think carefully about what the final clause says. It says that, in order to calculate the denotation of the complex term f(α1…αn) (relative to assignment g), we must first figure out what I(f) is, that is, what the interpretation function I assigns to the function symbol f. This object, the new definition of a model tells us, is an n-place function on the domain. We then take this function, I(f), and apply it to n arguments: namely, the denotations (relative to g) of the terms α1…αn. The result is our desired denotation of f(α1…αn).

It may help to think about a simple case. Suppose that f is a one-place function symbol; suppose our domain consists of the set of natural numbers; suppose that the name a denotes the number 3 in this model (i.e., I(a) = 3), and suppose that f denotes the successor function (i.e., I(f) is the function, successor, that assigns to any natural number n the number n+1). In that case, the definition tells us that:

[f(a)]_g = I(f)([a]_g)

= I(f)(I(a))

= successor(3)

= 4
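The recursive definition of denotation translates almost word for word into a recursive function. The Python sketch below is one way to do it, under some representational assumptions of my own: names and variables are strings, a complex term f(α1…αn) is a tuple whose first member is the function symbol, the interpretation function I is a dictionary called interp, and a variable assignment g is a dictionary from variables to objects.

```python
def denotation(term, interp, g):
    """[term]_{M,g}: interp plays the role of I, g is a variable assignment."""
    if isinstance(term, tuple):            # complex term f(α1...αn)
        symbol, *args = term
        values = [denotation(arg, interp, g) for arg in args]
        return interp[symbol](*values)     # apply I(f) to the denotations
    if term in g:                          # variable
        return g[term]
    return interp[term]                    # name (constant)

# The model from the text, restricted to what the example needs:
# I(a) = 3 and I(f) = the successor function on the natural numbers.
interp = {"a": 3, "f": lambda n: n + 1}
g = {"x": 0}                               # an arbitrary assignment (assumed)

print(denotation(("f", "a"), interp, g))         # 4
print(denotation(("f", ("f", "x")), interp, g))  # 2
```

The first call mirrors the calculation in the text: the function first finds the denotation of a, then applies I(f) to it.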


Example 5.3: Here’s a sample metalanguage argument that makes use of the new definitions. As mentioned earlier, ‘George W. Bush’s father was a politician’ logically implies ‘Someone’s father was a politician’. Let’s show that these sentences’ symbolizations stand in the relation of semantic implication. That is, let’s show that Pf(c) ⊨ ∃xPf(x), i.e., that in any model in which Pf(c) is true, ∃xPf(x) is true:

i) Suppose that Pf(c) is true in a model ⟨D,I⟩, i.e., V_g(Pf(c)) = 1 (where V is the valuation for this model), for each variable assignment g.

ii) Suppose for reductio that ∃xPf(x) is false in this model; i.e., for some variable assignment g, V_g(∃xPf(x)) = 0.

iii) By line i), V_g(Pf(c)) = 1, and so [f(c)]_g ∈ I(P). [f(c)]_g is just I(f)([c]_g), and [c]_g is just I(c). So I(f)(I(c)) ∈ I(P).

iv) By ii), for every object u ∈ D, V_{g^x_u}(Pf(x)) = 0.

v) I(c) ∈ D. So, by line iv), V_{g^x_{I(c)}}(Pf(x)) = 0, and hence [f(x)]_{g^x_{I(c)}} ∉ I(P).

vi) [f(x)]_{g^x_{I(c)}} is just I(f)([x]_{g^x_{I(c)}}), and [x]_{g^x_{I(c)}} is just g^x_{I(c)}(x), i.e., I(c).

vii) So I(f)(I(c)) ∉ I(P), which contradicts line iii).
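A finite search is no substitute for the argument just given, since there are infinitely many models. But as a sanity check one can confirm that no small model is a counterexample. The Python sketch below enumerates every model with a domain of at most three objects (domains are taken to be {0, …, n−1}, an assumption that loses nothing up to isomorphism) and looks for one in which Pf(c) is true and ∃xPf(x) is false.

```python
from itertools import product, combinations

def models(max_size):
    """Yield (D, I(c), I(f), I(P)) for every model with 1..max_size objects."""
    for n in range(1, max_size + 1):
        D = range(n)
        for c in D:
            for f_values in product(D, repeat=n):   # f_values[u] is I(f)(u)
                for k in range(n + 1):
                    for P in combinations(D, k):
                        yield D, c, f_values, set(P)

def counterexamples(max_size=3):
    found = []
    for D, c, f, P in models(max_size):
        premise = f[c] in P                      # Pf(c)
        conclusion = any(f[u] in P for u in D)   # ∃xPf(x)
        if premise and not conclusion:
            found.append((list(D), c, f, P))
    return found

print(counterexamples())  # []
```

The search comes back empty, as the proof says it must.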

Exercise 5.4 Demonstrate each of the following:

a) ⊨ ∀xFx→Ff(a)

b) {∀x f(x)≠x} ⊭ ∃x∃y(f(x)=y∧f(y)=x)

5.3 Definite descriptions

Our logic has gotten more powerful with the addition of function symbols, but it still isn’t perfect. Function symbols let us “break up” certain complex singular terms, e.g., “Bush’s father”. But there are others we still can’t break up, e.g., “The black cat”. Even with function symbols, the only candidate for a direct symbolization of this phrase into the language of predicate logic is a simple name, “a” for example. But this symbolization ignores the fact that “the black cat” contains “black” and “cat” as semantically significant constituents. It therefore fails to provide a good model of this term’s distinctively logical behavior. For example, ‘The black cat is happy’ logically implies ‘Some cat is happy’. But the simple-minded symbolization of the first sentence, Ha, obviously does not semantically imply the obvious symbolization of the second: ∃x(Cx∧Hx).

One response is to introduce another extension of predicate logic. We introduce a new symbol, ι, to stand for “the”. The grammatical function of “the” in English is to turn predicates into noun phrases. “Black cat” is a predicate of English; “the black cat” is a noun phrase that refers to the thing that satisfies the predicate “black cat”. Similarly, in logic, given a predicate F, we’ll let ιxFx be a term that means: the thing that is F.

We’ll want to let ιx attach to complex predicates, not just simple predicates. To symbolize “the black cat” (i.e., the thing that is both black and a cat) we want to write: ιx(Bx∧Cx). In fact, we’ll let ιx attach to wffs with arbitrary complexity. To symbolize “the fireman who saved someone”, we’ll write: ιx(Fx∧∃ySxy).

5.3.1 Grammar for ι

Just as with function symbols, we need to add a bit to the primitive vocabulary, and revise the definition of a term.


· logical: →, ∼, ∀, =, ι

· nonlogical:


· for each n > 0, n-place function symbols f , g ,…, with or without

subscripts

· variables x, y . . . with or without subscripts


· parentheses

Definition of terms and wffs:


i) names and variables are terms

ii) if φ is a wff and α is a variable then ιαφ is a term

iii) if f is an n-place function symbol, and α1…αn are terms, then f(α1…αn) is a term

iv) if Π is an n-place predicate and α1 . . .αn are terms, then Πα1 . . .αn is a wff

v) If α and β are terms, then α=β is a wff

vi) if φ, ψ are wffs, and α is a variable, then ∼φ, (φ→ψ), and ∀αφ are wffs

vii) nothing else is a wff or term

Notice how we needed to combine the recursive definitions of term and wff into a single recursive definition of wffs and terms together. The reason is that we need the notion of a wff to define what counts as a term containing the ι operator (clause ii); but we need the notion of a term to define what counts as a wff (clause iv). The way we accomplish this is not circular. The reason it isn’t is that we can always decide, using these rules, whether a given string counts as a wff or term by looking at whether smaller strings count as wffs or terms. And the smallest strings are said to be wffs or terms in non-circular ways.

5.3.2 Semantics for ι

We need to update the definition of denotation so that ιxφ will denote the one and only thing in the domain that is φ. This is a little tricky, though. What if there is no such thing? Suppose that ‘K’ symbolizes “king of” and ‘a’ symbolizes “USA”. Then, what should ‘ιxKxa’ denote? It is trying to denote the king of the USA, but there is no such thing. Further, what if more than one thing satisfies the predicate? In short, what do we say about “empty descriptions”?

One approach would be to say that every atomic sentence with an empty description is false. One way to do this is to include in each model an “emptiness marker”, E, which is an object we assign as the denotation for each empty description. The emptiness marker shouldn’t be thought of as a “real” denotation; when we assign it as the denotation of a description, this just marks the fact that the description has no real denotation. We will stipulate that the emptiness marker is not in the domain; this ensures that it is not in the extension of any predicate, and hence that atomic sentences containing empty descriptions are always false. Here’s how the semantics looks (“PC+DD”, for “predicate calculus plus definite descriptions”):


Definition of model: A PC+DD-model is an ordered triple ⟨D,I,E⟩ such that:

· D is a non-empty set

· E ∉ D

· I is a function obeying the following constraints:

· If α is a constant, then I(α) ∈ D

· If Π is an n-place predicate, then I(Π) is a set of n-tuples of members of D

· If f is an n-place function symbol, then I(f) is some n-place total function on D ∪ {E} that maps u1…un to a member of D if each ui ∈ D and otherwise maps them to E

Definition of denotation and valuation: The denotation and valuation functions, [ ]_{M,g} and V_{M,g}, for PC+DD-model M (= ⟨D,I,E⟩) and variable assignment g, are defined as the functions that satisfy the following constraints:

i) V_{M,g} assigns to each wff either 0 or 1

ii) For any term α, [α]_{M,g} is:

· I(α) if α is a constant

· g(α) if α is a variable

· I(f)([α1]_{M,g}…[αn]_{M,g}) if α is a complex term f(α1…αn)

· the unique u ∈ D such that V_{M,g^β_u}(φ) = 1, if α is a complex term ιβφ and there is a unique such u

· E if α is a complex term ιβφ and there is no unique such u

iii) for any n-place predicate Π and any terms α1…αn, V_{M,g}(Πα1…αn) = 1 iff ⟨[α1]_{M,g}…[αn]_{M,g}⟩ ∈ I(Π)

iv) V_{M,g}(α=β) = 1 iff: [α]_{M,g} = (i.e., is the same object as) [β]_{M,g}

v) for any wffs φ, ψ, and any variable α:

V_{M,g}(∼φ) = 1 iff V_{M,g}(φ) = 0

V_{M,g}(φ→ψ) = 1 iff either V_{M,g}(φ) = 0 or V_{M,g}(ψ) = 1

V_{M,g}(∀αφ) = 1 iff for every u ∈ D, V_{M,g^α_u}(φ) = 1



As with the grammar, we need to mix together the definition of denotation and the definition of the valuation function. The reason is that we need to define the denotation of definite descriptions using the valuation function (in clause ii), but we need to define the valuation function using the concept of denotation (in clauses iii and iv). As before, this is not circular.

Note a few things about these definitions. First, note that in the definition of a model, function symbols denote functions that “stay within D”. That is, if you feed one of these functions an n-tuple consisting only of members of D, it spits out another member of D. But if you feed it an n-tuple, one of whose members is the emptiness marker E, then it spits back out E. Second, note that the denotation of any term is either E, or a member of D (exercise 5.6). Third, since E is not in the domain, and extensions of predicates are defined to be sets of n-tuples drawn from the domain, it follows that E cannot be present in predicate extensions.
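To see how denotation and valuation call one another without circularity, it can help to write them as two mutually recursive functions. What follows is only a sketch under assumptions of my own: formulas and terms are nested tuples tagged "pred", "=", "not", "imp", "all" and "iota"; a model is a pair of a list D and a dictionary I; the emptiness marker is a fresh Python object; and the helper conj spells out φ∧ψ as ∼(φ→∼ψ). The example model (Tom, Felix, Rex) is hypothetical.

```python
E = object()   # emptiness marker: a fresh object, so it is in no domain

def den(term, M, g):
    D, I = M
    if isinstance(term, str):                  # variable or name
        return g[term] if term in g else I[term]
    if term[0] == "iota":                      # ("iota", variable, wff)
        _, var, phi = term
        witnesses = [u for u in D if val(phi, M, {**g, var: u}) == 1]
        return witnesses[0] if len(witnesses) == 1 else E
    symbol, *args = term                       # (function symbol, terms...)
    values = [den(a, M, g) for a in args]
    # I[symbol] is given on D only; any E among the arguments yields E.
    return E if any(v is E for v in values) else I[symbol](*values)

def val(phi, M, g):
    D, I = M
    op = phi[0]
    if op == "pred":                           # ("pred", Π, term, ...)
        return int(tuple(den(t, M, g) for t in phi[2:]) in I[phi[1]])
    if op == "=":
        return int(den(phi[1], M, g) == den(phi[2], M, g))
    if op == "not":
        return 1 - val(phi[1], M, g)
    if op == "imp":
        return int(val(phi[1], M, g) == 0 or val(phi[2], M, g) == 1)
    if op == "all":                            # ("all", variable, wff)
        return int(all(val(phi[2], M, {**g, phi[1]: u}) == 1 for u in D))
    raise ValueError(f"unknown operator: {op!r}")

def conj(a, b):
    return ("not", ("imp", a, ("not", b)))     # φ∧ψ as ∼(φ→∼ψ)

# Hypothetical model: two cats, one of them black and happy.
D = ["tom", "felix", "rex"]
I = {"B": {("tom",)}, "C": {("tom",), ("felix",)}, "H": {("tom",)}}
M = (D, I)

the_black_cat = ("iota", "x", conj(("pred", "B", "x"), ("pred", "C", "x")))
the_cat = ("iota", "x", ("pred", "C", "x"))    # empty: there are two cats

print(val(("pred", "H", the_black_cat), M, {}))  # 1
print(val(("pred", "H", the_cat), M, {}))        # 0
```

Since “the cat” fails to pick out a unique object here, its denotation is E, and the atomic sentence containing it comes out false, just as the emptiness-marker approach intends.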

An alternate approach to using the emptiness marker E would appeal to three-valued logic. We could leave the denotation of ιxφ undefined if there is no object in the domain such that φ. We could then treat any atomic sentence that contains a denotationless term as being neither true nor false, i.e., #. We would then need to update the other clauses to allow for #s, using one of the three-valued approaches to propositional logic from chapter ??. I won’t pursue this option further.

Exercise 5.5 Establish the following:

a) ⊨ ∀xL(x, ιyFxy)→∀x∃yLxy

b) ⊭ GιxFx→FιxGx

Exercise 5.6 Show that the denotation of any term is either E , or

a member of D.

5.3.3 Eliminability of function symbols and definite descriptions

In a sense, we don’t really need function symbols or the ι. Let’s return to the English singular term ‘the black cat’. Introducing the ι gave us a way to symbolize this singular term in a way that takes into account its semantic structure (namely: ιx(Bx∧Cx)). But even without the ι, there is a way to symbolize whole sentences containing ‘the black cat’, using just standard predicate logic plus identity. We could, for example, symbolize “The black cat is happy” as:

∃x[ (B x∧C x)∧∀y[(By∧C y)→y=x]∧H x]

That is, “there is something such that: i) it is a black cat, ii) nothing else is a

black cat, and iii) it is happy”.

This method for symbolizing sentences containing ‘the’ is called “Russell’s theory of descriptions”, in honor of its inventor Bertrand Russell, the 19th and 20th century philosopher and logician.² The general idea is to symbolize “the φ is ψ” as ∃x[φ(x)∧∀y(φ(y)→x=y)∧ψ(x)]. This method can be iterated so as to apply to sentences with two or more definite descriptions, such as “The 8-foot tall man drove the 20-foot long limousine”, which becomes, letting ‘E’ stand for ‘is eight feet tall’ and ‘T’ stand for ‘is twenty feet long’:

∃x[E x∧M x ∧∀z([E z∧M z]→x=z)∧∃y[T y∧Ly ∧∀z([T z∧Lz]→y=z)∧D xy]]

An interesting problem arises with negations of sentences involving definite descriptions, when we use Russell’s method. Consider “The president is not bald”. Does this mean “The president is such that he’s non-bald”, which is symbolized as follows:

∃x[Px∧∀y(Py→x=y)∧∼Bx]

? Or does it mean “It is not the case that the President is bald”, which is symbolized thus:

∼∃x[Px∧∀y(Py→x=y)∧Bx]

? According to Russell, the original sentence is simply ambiguous. Symbolizing it the first way is called “giving the description wide scope (relative to the ∼)”, since the ∼ is inside the scope of the ∃x. Symbolizing it in the second way is called “giving the description narrow scope (relative to the ∼)”, because the ∃x is inside the scope of the ∼.

² See Russell (1905).

What is the difference in meaning between these two symbolizations? The first says that there really is a unique president, and adds that he is not bald. So the first implies that there’s a unique president. The second merely denies that there is a unique president who is bald. That doesn’t imply that there’s a unique president. It would be true if there’s a unique president who is not bald, but it would also be true in two other cases: the case in which there are no presidents at all, and the case in which there is more than one president.
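The difference between the two readings shows up clearly if we evaluate both Russellian symbolizations in a handful of small models. In the Python sketch below, the domain and the extensions of P (“is president”) and B (“is bald”) are made-up illustrations; the two functions simply compute the truth values of the wide scope and narrow scope formulas.

```python
def unique_president(D, P):
    """Members x of D satisfying Px ∧ ∀y(Py→x=y)."""
    return [x for x in D if x in P and all(y == x for y in D if y in P)]

def wide_scope(D, P, B):
    # ∃x[Px ∧ ∀y(Py→x=y) ∧ ∼Bx]
    return any(x not in B for x in unique_president(D, P))

def narrow_scope(D, P, B):
    # ∼∃x[Px ∧ ∀y(Py→x=y) ∧ Bx]
    return not any(x in B for x in unique_president(D, P))

D = {"al", "bo", "cy"}                 # hypothetical three-person domain
cases = {
    "one president, not bald": ({"al"}, set()),
    "one president, bald":     ({"al"}, {"al"}),
    "no presidents":           (set(), {"bo"}),
    "two presidents":          ({"al", "bo"}, set()),
}
for label, (P, B) in cases.items():
    print(label, wide_scope(D, P, B), narrow_scope(D, P, B))
# one president, not bald True True
# one president, bald False False
# no presidents False True
# two presidents False True
```

The two readings agree whenever there is exactly one president, and come apart only in the last two cases, where the narrow scope reading is true and the wide scope reading is false.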

A similar issue arises with the sentence “The round square does not exist”.

We might think to symbolize it:

∃x[Rx∧S x∧∀y([Ry∧Sy]→x=y)∧∼E x]

letting “E” stand for “exists”. In other words, we might give the description wide scope. But this is wrong, because it says there is a certain round square that doesn’t exist, and that’s a contradiction. This way of symbolizing the sentence corresponds to reading the sentence as saying “The thing that is a round square is such that it does not exist”. But that isn’t the most natural way to read the sentence. The sentence would usually be interpreted to mean: “It is not true that the round square exists”, that is, as the negation of “the round square exists”:

∼∃x[Rx∧S x∧∀y([Ry∧Sy]→x=y)∧ E x]

with the ∼ out in front. Here we’ve given the description narrow scope. Notice also that saying that x exists at the end is redundant, so we could simplify to:

∼∃x[Rx∧S x∧∀y([Ry∧Sy]→x=y)]

Again, notice the moral of these last two examples: if a definite description occurs in a sentence with a ‘not’, the sentence may be ambiguous: does the ‘not’ apply to the entire rest of the sentence, or merely to the predicate?

If we are willing to use Russell’s method for translating definite descriptions, we can drop ι from our language. We would, in effect, not be treating “the F” as a referring phrase. We would instead be paraphrasing sentences that contain “the F” into sentences that don’t. “The black cat is happy” got paraphrased as: “there is something that is a black cat, is such that nothing else is a black cat, and is happy”. See? There is no occurrence of “the black cat” in the paraphrased sentence.

In fact, once we use Russell’s method, we can get rid of function symbols too.

Given function symbols, we treated “father” as a function symbol, symbolized

it with “ f ”, and symbolized the sentence “George W. Bush’s father was a


politician” as P f (b ). But instead, we could treat ‘father of’ as a two-place

predicate, F , and regard the whole sentence as meaning: “The father of George

W. Bush was a politician.” Given the ι, this could be symbolized as:

P ιxF x b

But given Russell’s method, we can symbolize the whole thing without using

either function symbols or the ι:

∃x(F x b ∧∀y(F y b→y=x)∧ P x)

We can get rid of all function symbols this way, if we want. Here’s the method:

· Take any n-place function symbol f

· Introduce a corresponding n+ 1-place predicate R

· In any sentence containing the term “f(α1…αn)”, replace each occurrence of this term with “the x such that R(x,α1…αn)”.

· Finally, symbolize the resulting sentence using Russell’s theory of descriptions

For example, let’s go back to: “Every even number is the sum of two prime numbers”. Instead of introducing a function symbol s(x, y) for “the sum of x and y”, let’s introduce a predicate letter R(z, x, y) for “z is a sum of x and y”. We then use Russell’s method to symbolize the whole sentence thus:

∀x(Ex→∃y∃z[Py∧Pz∧∃w(Rwyz∧∀w1(Rw1yz→w1=w)∧x=w)])

The end of the formula (beginning with ∃w) says “the sum of y and z is identical to x”; that is, there exists some w such that w is a sum of y and z, there is no sum of y and z other than w, and w = x.
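Here is a small check, in Python, that trading a function symbol for a predicate plus Russell’s method does not change truth values. It is an illustration under assumptions of my own: it treats only the one-place case (Pf(b) versus ∃x(Fxb∧∀y(Fyb→y=x)∧Px)), it takes the extension of the two-place predicate F to be built from the function that interprets f, and it searches only models with three objects.

```python
from itertools import product, combinations

def agree_everywhere(n=3):
    """Compare Pf(b) with its Russellian paraphrase in every n-object model."""
    D = list(range(n))
    subsets = [set(s) for k in range(n + 1) for s in combinations(D, k)]
    for f in product(D, repeat=n):              # f[u] plays the role of I(f)(u)
        F = {(f[u], u) for u in D}              # Fxu: "x is the f of u"
        for P, b in product(subsets, D):
            with_function = f[b] in P           # Pf(b)
            russellian = any(                   # ∃x(Fxb ∧ ∀y(Fyb→y=x) ∧ Px)
                (x, b) in F
                and all(y == x for y in D if (y, b) in F)
                and x in P
                for x in D
            )
            if with_function != russellian:
                return False
    return True

print(agree_everywhere())  # True
```

The two symbolizations agree because the extension of F, being built from a total function, always supplies exactly one “father” for each object; that is the assumption doing the work.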



Exercise 5.7 Symbolize each of the following, using predicate logic with identity, function symbols, and the ι operator. (Do not eliminate descriptions using Russell’s method.)

a) If a person commits a crime, then the judge that sentences him/her wears a wig.

b) The tallest spy is a spy. (Use a two-place predicate to symbolize “is taller than”.)

Exercise 5.8 For the sentence “The ten-feet-tall man is not happy”, first symbolize with the ι operator. Then symbolize two readings using Russell’s method. Explain the intuitive difference between those two readings. Which gives truth conditions like the ι symbolization?

5.4 Further quantifiers

Standard logic contains just the quantifiers ∀ and ∃. As we have seen, using just these quantifiers, plus the rest of standard predicate logic, one can represent the truth conditions of a great many sentences of natural language. But not all. For instance, there is no way to symbolize the following sentences in predicate logic:

Most things are massive

Most men are brutes

There are infinitely many numbers

Some critics admire only one another

Like those sentences that are representable in standard logic, these sentences involve quantificational notions: most things, some critics, and so on. In this section we introduce a broader conception of what a quantifier is, and new quantifiers that allow us to symbolize these sentences.


5.4.1 Generalized monadic quantifiers

We will generalize the idea behind the standard quantifiers ∃ and ∀ in two ways. To approach the first, think about the clauses in the definition of truth in a PC-model, M, with domain D, for ∀ and ∃:

V_{M,g}(∀αφ) = 1 iff for every u ∈ D, V_{M,g^α_u}(φ) = 1

V_{M,g}(∃αφ) = 1 iff for some u ∈ D, V_{M,g^α_u}(φ) = 1

Let’s introduce the following bit of terminology. For any PC-model, M (= ⟨D,I⟩), and wff, φ, let’s introduce the name “φ^{M,g,α}” for (roughly speaking) the set of members of M’s domain of which φ is true:

Definition: φ^{M,g,α} = {u : u ∈ D and V_{M,g^α_u}(φ) = 1}

Thus, if we begin with any variable assignment g, then φ^{M,g,α} is the set of things u in D such that φ is true relative to variable assignment g^α_u. Given this terminology, we can rewrite the clauses for ∀ and ∃ as follows:

V_{M,g}(∀αφ) = 1 iff φ^{M,g,α} = D

V_{M,g}(∃αφ) = 1 iff φ^{M,g,α} ≠ ∅

But if we can rewrite the semantic clauses for the familiar quantifiers ∀ and ∃ in this way, as conditions on φ^{M,g,α}, then why not introduce new symbols of the same grammatical type as ∀ and ∃, whose semantics is parallel to ∀ and ∃ except in laying down different conditions on φ^{M,g,α}? These would be new kinds of quantifiers. For instance, for any integer n, we could introduce a quantifier ∃_n, to be read as “there exists at least n”. That is, ∃_nαφ means: “there are at least n φs.” The definitions of a wff, and of truth in a model, would be updated with the following clauses:

· if α is a variable and φ is a wff, then ∃_nαφ is a wff

· V_{M,g}(∃_nαφ) = 1 iff |φ^{M,g,α}| ≥ n

The expression |A| stands for the “cardinality” of set A, i.e., the number of members of A. Thus, this definition says that ∃_nαφ is true iff the cardinality of φ^{M,g,α} is greater than or equal to n, i.e., this set has at least n members.

Now, the introduction of the symbols ∃n do not increase the expressive

power of predicate logic, for as we saw in section 5.1.3, we can symbolize


“there are at least n F s” using just standard predicate logic (plus “=”). The

new notation is merely a space-saver. But other such additions are not mere

space-savers. For example, by analogy with the symbols ∃n, we can introduce a

symbol ∃∞, meaning “there are in�nitely many”:

· if α is a variable and φ is a wff, then “∃∞αφ” is a wff

· VM ,g (∃∞αφ) = 1 iff |φM ,g ,α| is in�nite

As it turns out (though I won’t prove it here), the addition of ∃∞ genuinely

enhances predicate logic: no sentence of standard (�rst-order) predicate logic

has the same truth condition as does ∃∞xF x.

Another generalized quantifier that is not symbolizable using standard predicate logic is most:

· if α is a variable and φ is a wff, then “most αφ” is a wff

· V_{M,g}(most αφ) = 1 iff |φ^{M,g,α}| > |D − φ^{M,g,α}|

The minus sign in the second clause is the symbol for set-theoretic difference: A − B is the set of things that are in A but not in B. Thus, the definition says that most αφ is true iff more things in the domain D are φ than are not φ.
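The clauses above all have the same shape: a condition relating the extension of φ to the domain D. The following is a minimal Python sketch of that idea, restricted to finite domains (so the "infinitely many" quantifier is left out). The function names and the sample domain are illustrative assumptions, not part of the text.

```python
# Sketch: a monadic generalized quantifier as a test on the extension of phi
# (a subset of the domain) together with the domain D. Finite domains only.

def forall(ext, domain):
    return ext == domain                  # the extension is all of D

def exists(ext, domain):
    return len(ext) > 0                   # the extension is nonempty

def at_least(n):
    """Return the quantifier 'there are at least n'."""
    return lambda ext, domain: len(ext) >= n

def most(ext, domain):
    return len(ext) > len(domain - ext)   # more phis than non-phis

# Hypothetical model: D = {1,...,6}, with two sample extensions.
D = frozenset(range(1, 7))
evens = frozenset(u for u in D if u % 2 == 0)   # {2, 4, 6}
small = frozenset(u for u in D if u <= 4)       # {1, 2, 3, 4}

print(at_least(3)(evens, D))   # True: there are three even numbers
print(most(evens, D))          # False: 3 is not greater than 3
print(most(small, D))          # True: 4 > 2
```

Each quantifier ignores which particular objects are in the extension and looks only at sizes, which is what makes it a quantifier in the intuitive sense.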

One could add all sorts of additional “quantifiers” Q in this way. Each would be, grammatically, just like ∀ and ∃, in that each would combine with a variable, α, and then attach to a sentence φ, to form a new sentence Qαφ. Each of these new quantifiers, Q, would be associated with a relation between sets, R_Q, such that Qαφ would be true in a PC-model, M, with domain D, relative to variable assignment g, iff φ^{M,g,α} bears R_Q to D.

If such an added symbol Q is to count as a quantifier in any intuitive sense, then the relation R_Q can’t be just any relation between sets. It should be a relation concerning the relative “quantities” of its relata. It shouldn’t, for instance, “concern particular objects” in the way that the following symbol, ∃_Ted-loved, concerns particular objects:

V_{M,g}(∃_Ted-loved αφ) = 1 iff φ^{M,g,α} ∩ {u : u ∈ D and Ted loves u} ≠ ∅

So we should require the following of R_Q. Consider any set, D, and any one-one function, f, from D onto another set D′. Then, if a subset X of D bears R_Q to D, the set f[X] must bear R_Q to D′. (f[X] is the image of X under function f, i.e., {u : u ∈ D′ and u = f(v), for some v ∈ X}. It is the subset of D′ onto which f “projects” X.)
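The invariance requirement can be tested by brute force on a small finite domain. The sketch below checks only a special case, namely one-one functions from D onto D itself (permutations), rather than onto an arbitrary D′; the helper names and the choice of what Ted loves are hypothetical.

```python
from itertools import combinations, permutations

def subsets(domain):
    items = sorted(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def invariant_under_permutations(quantifier, domain):
    """True iff, for every one-one f from domain onto itself and every
    subset X, X satisfying the quantifier implies f[X] satisfies it."""
    items = sorted(domain)
    for perm in permutations(items):
        f = dict(zip(items, perm))
        for X in subsets(domain):
            image = frozenset(f[v] for v in X)    # f[X], the image of X
            if quantifier(X, domain) and not quantifier(image, domain):
                return False
    return True

def most(ext, domain):
    return len(ext) > len(domain - ext)

TED_LOVES = frozenset({1})    # hypothetical: the things Ted loves

def exists_ted_loved(ext, domain):
    return len(ext & TED_LOVES & domain) > 0

D = frozenset({1, 2, 3})
print(invariant_under_permutations(most, D))              # True
print(invariant_under_permutations(exists_ted_loved, D))  # False
```

The second check fails because swapping 1 with 2 sends the set {1}, which contains something Ted loves, to {2}, which does not.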


Exercise 5.9 Let the quantifier ∃_prime mean “there are a prime number of”. Using the notation of generalized quantifiers, write out the semantics of this quantifier.

5.4.2 Generalized binary quantifiers

We have seen how the standard quantifiers ∀ and ∃ can be generalized in one way: syntactically similar symbols may be introduced and associated with different relations between sets. Our second way of generalizing the standard quantifiers is to allow two-place, or binary, quantifiers. ∀ and ∃ are monadic in that ∀α and ∃α attach to a single open sentence φ. Compare the natural language monadic quantifiers ‘everything’ and ‘something’:

Everything is material

Something is spiritual

Here, the predicates (verb phrases) ‘is material’ and ‘is spiritual’ correspond to the open sentences of logic; it is to these that ‘everything’ and ‘something’ attach.

But in fact, monadic quantifiers in natural language are atypical. ‘Every’ and ‘some’ typically occur as follows:

Every student is happy

Some fish are tasty

The quantifiers ‘every’ and ‘some’ attach to two predicates. In the first, ‘every’ attaches to ‘[is a] student’ and ‘is happy’; in the second, ‘some’ attaches to ‘[is a] fish’ and ‘[is] tasty’. In these sentences, we may think of ‘every’ and ‘some’ as binary quantifiers. (Indeed, one might think of ‘everything’ and ‘something’ as the result of applying the binary quantifiers ‘every’ and ‘some’ to the predicate ‘is a thing’.) A logical notation can be introduced which exhibits a parallel structure, in which ∀ and ∃ attach to two open sentences. In this notation, the form of quantified sentences is:

(∀α:φ)ψ

(∃α:φ)ψ


The first is to be read: “all φs are ψ”; the second is to be read “there is a φ that is a ψ”. The clauses for these new binary quantifiers in the definition of the valuation function for a PC-model are:

V_{M,g}((∀α:φ)ψ) = 1 iff φ^{M,g,α} ⊆ ψ^{M,g,α}

V_{M,g}((∃α:φ)ψ) = 1 iff φ^{M,g,α} ∩ ψ^{M,g,α} ≠ ∅

A further important binary quantifier is the:

· if φ and ψ are wffs and α is a variable, then (the α:φ)ψ is a wff

· V_{M,g}((the α:φ)ψ) = 1 iff |φ^{M,g,α}| = 1 and φ^{M,g,α} ⊆ ψ^{M,g,α}

That is, (the α:φ)ψ is true iff i) there is exactly one φ, and ii) every φ is a ψ. This truth condition, notice, is exactly the truth condition for Russell’s symbolization of “the φ is a ψ”; hence the name the.

As with the introduction of the monadic quantifiers ∃_n, the introduction of the binary existential and universal quantifiers, and of the, does not increase the expressive power of first-order logic, for the same effect can be achieved with monadic quantifiers. (∀α:φ)ψ, (∃α:φ)ψ, and (the α:φ)ψ become, respectively:

∀α(φ→ψ)

∃α(φ∧ψ)

∃α(φ∧∀β(φβ→β=α)∧ψ)

(where φβ is φ with free αs changed to βs.) But, as with the monadic quantifiers ∃_∞ and most, there are binary quantifiers one can introduce that genuinely increase expressive power. For example, most occurrences of ‘most’ in English are binary, e.g.:

Most fish swim

To symbolize such sentences, we can introduce a binary quantifier most2. The sentence (most2 α:φ)ψ is to be read “most φs are ψs”. The semantic clause for most2 is:

V_{M,g}((most2 α:φ)ψ) = 1 iff |φ^{M,g,α} ∩ ψ^{M,g,α}| > |φ^{M,g,α} − ψ^{M,g,α}|

The binary most2 increases our expressive power, even relative to the monadic most: not every sentence expressible with the former is equivalent to a sentence expressible with the latter.³
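The binary clauses compare two extensions with each other rather than one extension with the domain. A minimal Python sketch, assuming finite extensions represented as sets; the function names and the sample data are made up for illustration:

```python
# Sketch: binary quantifiers as relations between the extension of phi
# and the extension of psi (both finite sets here).

def every(phi, psi):
    return phi <= psi                         # all phis are psis

def some(phi, psi):
    return len(phi & psi) > 0                 # some phi is a psi

def the(phi, psi):
    return len(phi) == 1 and phi <= psi       # exactly one phi, and it is a psi

def most2(phi, psi):
    return len(phi & psi) > len(phi - psi)    # more phis are psis than are not

# Hypothetical extensions.
fish = frozenset({"trout", "eel", "shark"})
swimmers = frozenset({"trout", "eel", "otter"})

print(every(fish, swimmers))   # False: the shark is not among the swimmers
print(some(fish, swimmers))    # True
print(most2(fish, swimmers))   # True: two of the three fish swim
```

Note that most2 never consults the domain at all, unlike the monadic most, which compares the extension with its complement in D.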

Exercise 5.10 Symbolize the following sentence:

The number of people multiplied by the number of cats that bite at least one dog is 198.

You may invent any generalized quantifiers you need, provided you write out their semantics.

5.4.3 Second-order logic

All the predicate logic we have considered so far is known as first-order. We’ll now briefly look at second-order predicate logic, a powerful extension of first-order predicate logic. The distinction has to do with how variables behave, and has syntactic and semantic aspects.

The syntactic part of the idea concerns the grammar of variables. All the variables in first-order logic are grammatical terms. That is, they behave grammatically like names: you can combine them with a predicate to get a wff; you cannot combine them solely with other terms to get a wff; etc. In second-order logic, on the other hand, variables can occupy predicate position. Thus, each of the following sentences is a well-formed formula in second-order logic:

∃X Xa

∃X∃y Xy

Here we see the variable X occupying predicate position. Predicate variables, like the normal predicates of standard first-order logic, can be one-place, two-place, three-place, etc.

The semantic part of the idea concerns the interpretation of variables. In first-order logic, a variable assignment assigns to each variable a member of the domain. A variable assignment in second-order logic assigns to each standard (first-order) variable α a member of the domain, as before, but assigns to each n-place predicate variable a set of n-tuples drawn from the domain. (This is what one would expect: the semantic value of an n-place predicate is its extension, a set of n-tuples, and variable assignments assign temporary semantic values.) Then, the following clauses must be added to the definition of truth in a PC-model:

· If π is an n-place predicate variable and α1 . . . αn are terms, then V_{M,g}(πα1 . . . αn) = 1 iff ⟨[α1]_{M,g}, . . . , [αn]_{M,g}⟩ ∈ g(π)

· If π is an n-place predicate variable and φ is a wff, then V_{M,g}(∀πφ) = 1 iff for every set U of n-tuples from D, V_{M,g^π_U}(φ) = 1

(where g^π_U is the variable assignment just like g except in assigning U to π.) Notice that, as with the generalized monadic quantifiers, no alteration to the definition of a PC-model is needed. All we need to do is change the grammar and the definition of the valuation function.

³ Westerståhl (1989, p. 29).

Second-order logic is different from first-order logic in many ways. For instance, one can define the identity predicate in second-order logic:

Second-order definition of identity: “x=y” is short for: ∀X(Xx↔Xy)

This can be seen to work correctly as follows. A one-place second-order variable X gets assigned a set of things. Thus, the atomic sentence Xx says that the object (currently assigned to) x is a member of the set (currently assigned to) X. Thus, ∀X(Xx↔Xy) says that x and y are members of exactly the same sets. But since x and only x is a member of {x} (i.e., x’s unit set), the only way for this to be true is for y to be identical to x.
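The argument just given can be confirmed by brute force on a small domain: quantify over every subset of the domain, as the second-order clause requires, and compare the result with genuine identity. This is only an illustrative sketch; the helper names and the three-element domain are assumptions.

```python
from itertools import combinations

def subsets(domain):
    items = sorted(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def second_order_identical(x, y, domain):
    """Truth condition of 'for all X, Xx iff Xy': every subset of the
    domain contains x exactly when it contains y."""
    return all((x in X) == (y in X) for X in subsets(domain))

D = {1, 2, 3}
# The defined relation agrees with identity on every pair from D.
print(all(second_order_identical(x, y, D) == (x == y)
          for x in D for y in D))   # True
```

The unit set {x} is the subset that separates x from any distinct y, exactly as in the prose argument.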

More importantly, the metalogical properties of second-order logic are dramatically different from those of first-order logic. For instance, the axiomatic method cannot be fully applied to second-order logic. One cannot write down a set of axioms for second-order logic that are both sound and complete, unless, that is, one resorts to cheap tricks like saying “let every valid wff be an axiom”. This trick is “cheap” because one would have no way of telling what an axiom is.⁴

⁴ More precisely, the resulting set of axioms would fail to be recursive. For a rigorous statement and proof of this and other metalogical results about second-order logic, see, e.g., Boolos and Jeffrey (1989, chapter 18).


Second-order logic also allows us to express claims that cannot be expressed in first-order logic. Consider the “Geach-Kaplan sentence”:⁵

Some critics admire only one another (GK)

It can be shown that there is no way to symbolize (one reading of) the sentence using just first-order logic and predicates for ‘is a critic’ and ‘admires’. The sentence (on the desired reading) says that there is a group of critics such that the members of that group admire only other members of the group, but one cannot say this in first-order logic. However, the sentence can be symbolized in second-order logic:

∃X[∃xXx ∧ ∀x(Xx→Cx) ∧ ∀x∀y([Xx∧Axy]→[Xy∧x≠y])] (GK2)

(GK2) “symbolizes” (GK) in the sense that it contains no predicates other than C and A, and for every model ⟨D,I⟩, the following is true:

(*) (GK2) is true in ⟨D,I⟩ iff D has a nonempty subset, X, such that i) X ⊆ I(C), and ii) whenever ⟨u,v⟩ ∈ I(A) and u ∈ X, then v ∈ X as well and v is not u.

No first-order sentence symbolizes the Geach-Kaplan sentence in this sense. However, one can in a sense symbolize the Geach-Kaplan sentence using a first-order sentence, provided the sentence employs, in addition to the predicates C and A, a predicate ∈ for set-membership:

∃z[∃x x∈z ∧ ∀x(x∈z→Cx) ∧ ∀x∀y([x∈z∧Axy]→[y∈z∧x≠y])] (GK1)

(GK1) doesn’t “symbolize” (GK) in the sense of satisfying (*) in every model, for in some models the two-place predicate ∈ doesn’t mean set-membership. Nevertheless, if we just restrict our attention to models ⟨D,I⟩ in which ∈ does mean set-membership (restricted to the model’s domain, of course; that is, I(∈) = {⟨u,v⟩ : u, v ∈ D and u ∈ v}), and in which D contains each subset of I(C) as a member, then (GK1) will indeed satisfy (*). In essence, the difference between (GK1) and (GK2) is that it is hard-wired into the definition of truth in a model that second-order predications Xy express set-membership, whereas this is not hard-wired into the definition of the first-order predication y ∈ z.⁶
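On a finite model, the truth condition (*) for the second-order symbolization can be checked directly by searching through the nonempty subsets of the domain. The sketch below does this; the function names and the two sample "admires" relations are invented for illustration.

```python
from itertools import combinations

def nonempty_subsets(domain):
    items = sorted(domain)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def gk_holds(domain, critics, admires):
    """Condition (*): some nonempty X, all of whose members are critics,
    such that whenever u in X admires v, v is in X and v is not u."""
    for X in nonempty_subsets(domain):
        only_critics = X <= critics
        only_one_another = all(v in X and v != u
                               for (u, v) in admires if u in X)
        if only_critics and only_one_another:
            return True
    return False

D = {"a", "b", "c"}
critics = {"a", "b"}
mutual = {("a", "b"), ("b", "a"), ("c", "a")}   # a and b admire each other
outward = {("a", "a"), ("b", "c")}              # a admires a; b admires c

print(gk_holds(D, critics, mutual))    # True: X = {a, b} works
print(gk_holds(D, critics, outward))   # False: no suitable X exists
```

The search ranges over subsets of D itself, which is the sense in which set-membership is built into the second-order truth definition rather than supplied by an interpreted predicate.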

⁵ The sentence and its significance were discovered by Peter Geach and David Kaplan. See Boolos (1984).

⁶ For more on second-order logic, see Boolos (1975, 1984, 1985).

Chapter 6

Propositional Modal Logic

Modal logic is the logic of necessity and possibility. In it we treat words

like “necessary”, “could be”, “must be”, etc. as logical constants. Here

are our new symbols:

□φ: “It is necessary that φ”, “Necessarily, φ”, “It must be that φ”

◇φ: “It is possible that φ”, “Possibly, φ”, “It could be that φ”, “It can be that φ”, “It might be that φ”

The phrase “φ is possible” is sometimes used in the following sense: “φ could be true, but then again, φ could be false”. For example, if one says “it might rain tomorrow”, one might intend to say not only that there is a possibility of rain, but also that there is a possibility that there will be no rain. This is not the sense of ‘possible’ that we symbolize with the ◇. In our intended sense, “possibly φ” does not imply “possibly not-φ”. To get into the spirit of this sense, note the naturalness of saying the following: “well of course 2+2 can equal 4, since it does equal 4”. Here, ‘can’ is used in our intended sense: it is presumably not possible for 2+2 to fail to be 4, and so in this case, ‘it can be the case that 2+2 equals 4’ does not imply ‘it can be the case that 2+2 does not equal 4’.

It is helpful to think of the □ and the ◇ in terms of possible worlds. A possible world is a complete and possible scenario. Calling a scenario “possible” means simply that it’s possible that the scenario happen, i.e., be actual. This requirement disqualifies scenarios in which, for example, it is both raining and also not raining (at the same time and place); such a thing couldn’t happen, and


so doesn’t happen in any possible world. But within this limit, we can imagine all sorts of possible worlds: possible worlds with talking donkeys, possible worlds in which I am ten feet tall, and so on. “Complete” means simply that no detail is left out: possible worlds are completely specific scenarios. There is no possible world in which I am “somewhere between ten and eleven feet tall” without being some particular height.¹ Likewise, in any possible world in which I am exactly ten feet, six inches tall (say), I must have some particular weight, must live in some particular place, and so on. One of these possible worlds is the actual world: this is the complete and possible scenario that in fact obtains. The rest of them are merely possible: they do not obtain, but would have obtained if things had gone differently.

In terms of possible worlds, we can think of our modal operators thus:

“□φ” is true iff φ is true in all possible worlds

“◇φ” is true iff φ is true in at least one possible world

It is necessarily true that all bachelors are male; in every possible world, every

bachelor is male. There might have existed a talking donkey; some possible

world contains a talking donkey. Possible worlds provide, at the very least,

a vivid way to think about necessity and possibility. How much more than a

vivid guide they provide is an open philosophical question. Some maintain that

possible worlds are the key to the metaphysics of modality, that what it is for a

proposition to be necessarily true is for it to be true in all possible worlds.²

Whether this view is defensible is a question beyond the scope of this book;

what is important for present purposes is that we distinguish possible worlds as

a vivid heuristic from possible worlds as a concern in serious metaphysics.

Our first topic in modal logic is the addition of the □ and the ◇ to propositional logic; the result is modal propositional logic (“MPL”). A further step will be modal predicate logic (chapter 9).

¹ This is not to say that possible worlds exclude vagueness.

² Sider (2003) presents an overview of this topic.

6.1 Grammar of MPL

We need a new language: the language of propositional modal logic. The grammar of this language is just like the grammar of propositional logic, except that we add the □ as a new one-place sentence connective:

· Sentence letters: P, Q, R . . . , with or without numerical subscripts

· Connectives: →, ∼, □

· Parentheses

Definition of wff:

· Sentence letters are wffs

· If φ and ψ are wffs then φ→ψ, ∼φ, and □φ are also wffs

· Nothing else is a wff

The □ is the only new primitive connective. But just as we were able to define ∧, ∨, and ↔, we can define new nonprimitive modal connectives:

· “◇φ” (“Possibly φ”) is short for “∼□∼φ”

· “φ⇒ψ” (“φ strictly implies ψ”) is short for “□(φ→ψ)”
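The grammar can be mirrored in a small data type: one constructor per primitive formation rule, with the possibility operator and the strict conditional as ordinary functions that expand into primitives. This is a sketch only; the class and function names (Letter, Not, If, Box, diamond, strict_if) are hypothetical choices for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Letter:
    name: str          # a sentence letter such as P or Q

@dataclass(frozen=True)
class Not:
    sub: object        # negation of a wff

@dataclass(frozen=True)
class If:
    ant: object        # antecedent of a material conditional
    cons: object       # consequent

@dataclass(frozen=True)
class Box:
    sub: object        # "necessarily": the one new primitive connective

def diamond(phi):
    """'Possibly phi' abbreviates 'not necessarily not phi'."""
    return Not(Box(Not(phi)))

def strict_if(phi, psi):
    """'phi strictly implies psi' abbreviates 'necessarily (phi -> psi)'."""
    return Box(If(phi, psi))

P, Q = Letter("P"), Letter("Q")
print(diamond(P))   # Not(sub=Box(sub=Not(sub=Letter(name='P'))))
```

Because the defined connectives expand immediately into primitives, anything that processes formulas (a semantics, a proof checker) needs to handle only the four primitive cases.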

6.2 Symbolizations in MPL

Modal logic allows us to symbolize a number of sentences we couldn’t symbolize before. The most obvious cases are sentences that overtly involve “necessarily”, “possibly”, or equivalent expressions:

Necessarily, if snow is white, then snow is white or grass is green

□[S→(S∨G)]

I’ll go if I must

□G→G

It is possible that Bush will lose the election

◇L

Snow might have been either green or blue

◇(G∨B)

If snow could have been green, then grass could have been white

◇G→◇W


‘Impossible’ and related expressions signify the lack of possibility:

It is impossible for snow to be both white and not white

∼◇(W∧∼W)

If grass cannot be clever then snow cannot be furry

∼◇C→∼◇F

God’s being merciful is inconsistent with imperfection’s being incompatible with your going to heaven.

∼◇(M∧∼◇(I∧H))

(M = “God is merciful”, I = “You are imperfect”, H = “You go to heaven”)

As for the strict conditional, it arguably does a decent job of representing

certain English conditional constructions:

Snow is a necessary condition for skiing

∼W⇒∼K

Food and water are required for survival

∼(F∧W )⇒∼S

Thunder implies lightning

T⇒L

Once we add modal operators, we can expose an important ambiguity in certain English sentences. The surface grammar of a sentence like “if Ted is a bachelor, then he must be unmarried” is misleading: it suggests the symbolization:

B→□U

But since I am in fact a bachelor, it would follow from this symbolization that the proposition that I am unmarried is necessarily true. But clearly I am not necessarily unmarried: I could have been married! The sentence is not saying that if I am in fact a bachelor, then the following is a necessary truth: I am unmarried. It is rather saying that, necessarily, if I am a bachelor then I am unmarried:

□(B→U)


It is the relationship between my being a bachelor and my being unmarried that is necessary. Think of this in terms of possible worlds: the first symbolization says that if I am a bachelor in the actual world, then I am unmarried in every possible world (which is absurd); whereas the second one says that in each possible world, w, if I am a bachelor in w, then I am unmarried in w (which is quite sensible). The distinction between φ→□ψ and □(φ→ψ) is called the distinction between the “necessity of the consequent” (first sentence) and the “necessity of the consequence” (second sentence). It is important to keep the distinction in mind, because English surface structure is misleading.

English modal words are ambiguous in a systematic way. For example,

suppose I say that I can’t attend a certain conference in Cleveland. What is the

force of “can’t” here? Probably I’m saying that my attending the conference

is inconsistent with honoring other commitments I’ve made at that time. But

notice that another sentence I might utter is: “I could attend the conference;

but I would have to cancel my class, and I don’t want to do that.” Now I’ve

said that I can attend the conference; have I contradicted my earlier assertion

that I cannot attend the conference? No—what I mean now is perhaps that I

have the means to get to Cleveland on that date. I have shifted what I mean by

“can”.

In fact, there are a lot of things one could mean by a modal word like ‘can’.

Examples:

I can come to the party, but I can’t stay late. (“can” = “is not inconvenient”)

Humans can travel to the moon, but not Mars. (“can” = “is achievable with current technology”)

Objects can move almost as fast as the speed of light, but nothing can travel faster than light. (“can” = “is consistent with the laws of nature”)

Objects could have traveled faster than the speed of light (if the laws of nature had been different), but no matter what the laws had been, nothing could have traveled faster than itself. (“can” = “metaphysical possibility”)

You can borrow but you can’t steal. (“can” = “morally acceptable”)

So when representing English sentences using the □ and the ◇, one should keep in mind that these expressions can be used to express different strengths of necessity and possibility. (Though we won’t do this, one could introduce different symbols for different sorts of possibility and necessity.)

The different strengths of possibility and necessity can be made vivid by thinking, again, in terms of possible worlds. As we saw, we can think of the □ and the ◇ as quantifiers over possible worlds (the former a universal quantifier, the latter an existential quantifier). The very broad sort of possibility and necessity, metaphysical possibility and necessity, can be thought of as a completely unrestricted quantifier: a statement is necessarily true iff it is true in all possible worlds whatsoever. The other kinds of possibility and necessity can be thought of as resulting from various restrictions on the quantifiers over possible worlds. Thus, when ‘can’ signifies achievability given current technology, it means: true in some possible world in which technology has not progressed beyond where it has progressed in fact at the current time; when ‘can’ means moral acceptability, it means: true in some possible world in which nothing morally forbidden occurs; and so on.

6.3 Semantics for MPL

As usual, let’s consider semantics first. As always, our goal is to model how statements involving the □ and ◇ are made true by the world, in order to shed light on the meaning of these connectives, and in order to provide semantic definitions of the notions of logical truth and logical consequence.

In constructing a semantics for MPL, we face two main challenges, one philosophical, the other technical. The philosophical challenge is simply that it isn’t wholly clear which formulas of MPL are indeed logical truths. It’s hard to construct an engine to spit out logical truths if you don’t know which logical truths you want it to spit out. With a few exceptions, there is widespread agreement over which formulas of nonmodal propositional and predicate logic are logical truths. But for modal logic, this is less clear, especially for sentences that contain iterations of modal operators. Is □P→□□P a logical truth? It’s hard to say.

A quick peek at the history of modal logic is in order. Modal logic arose from dissatisfaction with the material conditional → of standard propositional logic. The material conditional φ→ψ is true whenever φ is false or ψ is true; but in expressing the conditionality of ψ on φ, we sometimes want to require a tighter relationship: we want it not to be a mere accident that either φ is false or ψ is true. To express this tighter relationship, C. I. Lewis introduced the strict conditional φ⇒ψ, which he defined, as above, as □(φ→ψ).³ Thus defined, φ⇒ψ isn’t automatically true just because φ is false or ψ is true. It must be necessarily true that either φ is false or ψ is true.

Lewis then asked: what principles govern this new symbol □? Certain principles seemed clearly appropriate, for instance: □(φ→ψ)→(□φ→□ψ). Others were less clear. Is □φ→□□φ a logical truth? What about ◇□φ→φ?

Lewis’s solution to this problem was not to choose. Instead, he formulated several different modal systems. He did this axiomatically, by formulating systems that differed from one another by containing different axioms and hence different theorems.

We will follow Lewis’s approach, and construct several different modal systems. Unlike Lewis, we’ll do this semantically at first (the semantics for modal logic we will study was published by Saul Kripke in the 1950s, long after Lewis was writing), by constructing different definitions of a model for modal logic. The definitions will differ from one another in ways that result in different sets of valid formulas. In section 6.4 we’ll study Lewis’s axiomatic systems, and in sections 6.5 and 6.6 we’ll discuss the relationship between the semantics and the axiom systems.

Formulating multiple systems does not answer the philosophical question

of which formulas of modal logic are logically true; it merely postpones it.

The question re-arises when we want to apply Lewis’s systems; when we ask

which system is the correct system—i.e., which one correctly mirrors the logical

properties of the English words ‘possibly’ and ‘necessarily’? (Note that since

there are different sorts of necessity and possibility, different systems might

correctly represent different sorts of necessity.) But we won’t try to address

such philosophical questions here.

The technical challenge to constructing a semantics for MPL is that the modal operators □ and ◇ are not truth-functional. A (sentential) connective is an expression that combines with sentences to make new sentences. A one-place connective combines with one sentence to form a new sentence. ‘It is not the case that’ is a one-place connective of English; the ∼ is a one-place connective in the language of PL. A connective is truth-functional iff whenever it combines with sentences to form a new sentence, the truth value of the resulting sentence is determined by the truth values of the component sentences. Many think that ‘and’ is truth-functional, since they think that an English sentence of the form “φ and ψ” is true iff φ and ψ are both true. But ‘necessarily’ is not truth-functional. Suppose I tell you the truth value of φ; will you be able to tell me the truth value of “Necessarily φ”? Well, if φ is false then presumably you can (it is false), but if φ is true, then you still don’t know. If φ is “Ted is a philosopher” then “Necessarily φ” is false, but if φ is “Either Ted is a philosopher or he isn’t a philosopher” then “Necessarily φ” is true. So the truth value of “Necessarily φ” isn’t determined by the truth value of φ. Similarly, ‘possibly’ isn’t truth-functional either: ‘I might have been six feet tall’ is true, whereas ‘I might have been a round square’ is false, despite the fact that ‘I am six feet tall’ and ‘I am a round square’ each have the same truth value (they’re both false).

³ See Lewis (1918); Lewis and Langford (1932).

Since the □ and the ◇ are supposed to represent ‘necessarily’ and ‘possibly’, respectively, and since the latter aren’t truth-functional, we can’t use the method of truth tables to construct the semantics for the □ and the ◇. For the method of truth tables assumes truth-functionality. Truth tables are just pictures of truth functions: they specify what truth value a complex sentence has as a function of what truth values its parts have. Imagine trying to construct a truth table for the □. It’s presumably clear (though see the discussion of systems K, D, and T below) that □φ should be false if φ is false, but what about when φ is true?

φ | □φ
1 | ?
0 | 0

There’s nothing we can put in this slot in the truth table, since when φ is true, sometimes □φ is true and sometimes it is false.

Our challenge is clear: we need a semantics for the □ and the ◇ other than the method of truth tables.

6.3.1 Relations

Before we investigate how to overcome this challenge, a digression is necessary, to introduce the concept of a relation. A relation is just a feature of multiple objects taken together. The taller-than relation is one example: when one person is taller than another, that’s a feature of those two objects taken together. Another example is the less-than relation for numbers. When one number is less than another, that’s a feature of those two numbers taken together.

“Binary” relations apply to two objects at a time. The taller-than and less-

than relations are binary relations, or “two-place” relations as we might say.

We can also speak of three-place relations, four-place relations, and so on.

An example of a three-place relation would be the betweenness relation for

numbers: the relation that holds between 2, 5, and 23 for example.

Recall our discussion of ordered sets from section 1.8. In addition to their use in constructing models, ordered sets are also useful for giving an official definition of what a relation is.

Definition of relation: An n-place relation is defined as a set of n-tuples. So a binary (two-place) relation is a set of ordered pairs.

For example, the taller-than relation may be taken to be the set of ordered pairs

⟨u, v⟩ such that u is a taller person than v. The less-than relation for positive

integers is the set of ordered pairs ⟨m, n⟩ such that m is a positive integer less

than n, another positive integer. That is, it is the following set:

{⟨1,2⟩, ⟨1,3⟩, ⟨1,4⟩ . . . ⟨2,3⟩, ⟨2,4⟩ . . .}

When ⟨u, v⟩ is a member of relation R, we say, equivalently, that u and v “stand

in” R, or R “holds between” u and v, or that u “bears” R to v. Most simply,

we write “Ruv”. (This notation is like that of predicate logic; but here I’m

speaking the metalanguage, not displaying sentences of a formalized language.)

Some more definitions.

Definition of domain, range, over: Let R be any binary relation.

· The domain of R (“dom(R)”) is the set {u : for some v, Ruv}

· The range of R (“ran(R)”) is the set {u : for some v, Rvu}

· R is over A iff dom(R) ⊆ A and ran(R) ⊆ A

In other words, the domain of R is the set of all things that bear R to something; the range is the set of all things that something bears R to; and R is over A iff the members of the ’tuples in R are all drawn from A.
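Since a binary relation is officially just a set of ordered pairs, these three notions translate almost word for word into code. A minimal Python sketch, with a finite fragment of the less-than relation as a made-up example:

```python
# Sketch: a binary relation as a set of ordered pairs (Python tuples).

def dom(R):
    return {u for (u, v) in R}     # things that bear R to something

def ran(R):
    return {v for (u, v) in R}     # things that something bears R to

def is_over(R, A):
    return dom(R) <= A and ran(R) <= A

# Less-than restricted to {1, 2, 3, 4}.
less_than = {(m, n) for m in range(1, 5) for n in range(1, 5) if m < n}

print(sorted(dom(less_than)))              # [1, 2, 3]
print(sorted(ran(less_than)))              # [2, 3, 4]
print(is_over(less_than, {1, 2, 3, 4}))    # True
print(is_over(less_than, {1, 2, 3}))       # False: 4 is in the range
```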

Binary relations come in different types, depending on the patterns in which they hold. Here are some types of binary relations that we will need to think about:

Definition of kinds of binary relations: Let R be any binary relation over A.

· R is serial (in A) iff for every u ∈ A, there is some v ∈ A such that Ruv

· R is reflexive (in A) iff for every u ∈ A, Ruu

· R is symmetric iff for all u, v, if Ruv then Rvu

· R is transitive iff for any u, v, w, if Ruv and Rvw then Ruw

· R is an equivalence relation (in A) iff R is symmetric, transitive, and reflexive (in A)

· R is total (in A) iff for every u, v ∈ A, Ruv
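For finite relations, each of these definitions can be checked mechanically. The sketch below represents a relation as a set of pairs; note that the set A is a parameter exactly where the definition is relativized to A. The function names and the two sample relations are illustrative assumptions.

```python
# Sketch: the kinds of binary relation, for finite relations given as
# sets of ordered pairs.

def is_serial(R, A):
    return all(any((u, v) in R for v in A) for u in A)

def is_reflexive(R, A):
    return all((u, u) in R for u in A)

def is_symmetric(R):
    return all((v, u) in R for (u, v) in R)

def is_transitive(R):
    return all((u, w) in R
               for (u, v) in R for (v2, w) in R if v == v2)

def is_equivalence(R, A):
    return is_symmetric(R) and is_transitive(R) and is_reflexive(R, A)

def is_total(R, A):
    return all((u, v) in R for u in A for v in A)

A = {1, 2, 3}
same_block = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}   # 1 and 2 grouped together
less_than = {(1, 2), (1, 3), (2, 3)}

print(is_equivalence(same_block, A))   # True
print(is_total(same_block, A))         # False: 1 does not bear it to 3
print(is_transitive(less_than))        # True
print(is_serial(less_than, A))         # False: 3 is less than nothing in A
```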

Notice that we relativize some of these relation types to a given set A. The notion of reflexivity is defined as being relative to a set, for example. We do this because the alternative would be to say that a relation is reflexive simpliciter if everything bears R to itself; but that would require the domain and range of any reflexive relation to be the set of absolutely all objects. It’s better to introduce the notion of being reflexive relative to a set, which is applicable to relations with smaller domains and ranges. (I will sometimes omit the qualifier ‘in A’ when it is clear which set that is.) Why don’t symmetry and transitivity have to be relativized to a set? Because they only say what must happen if R holds among certain things. Symmetry, for example, says merely that if R holds between u and v, then it must also hold between v and u, and so we can say that a relation is symmetric absolutely, without implying that everything is in its domain and range.

6.3.2 Kripke modelsNow we’re ready to introduce a semantics for MPL. As we saw, we can’t

construct truth tables for the 2 or the 3. Instead, we will pursue an approach

called possible-worlds semantics. The intuitive idea is to count 2φ as being true

iff φ is true in all possible worlds, and 3φ as being true iff φ is true in some

possible worlds. More carefully: we are going to develop models for modal

propositional logic. These models will contain objects we will call “possible

worlds”. And formulas are going to be true or false “at” these worlds—that

is, we are going to assign truth values to formulas in these models relative to

possible worlds, rather than absolutely. Truth values of propositional-logic

compound formulas—that is, negations and conditionals—will be determined


by truth tables within each world; ∼φ, for example, will be true at a world iff φis false at that world. But the truth value of 2φ at a world won’t be determined

by the truth value of φ at that world; the truth value of φ at other worlds will

also be relevant.

Speci�cally, 2φ will count as true at a world iff φ is true at every world that

is “accessible” from the �rst world. What does “accessible” mean? Each model

will come equipped with a binary relation, R , that holds between possible

worlds; we will say that world v is “accessible from” world w whenRwv . The

intuitive idea is thatRwv if and only if v is possible relative to w. That is, if you

live in world w, then from your perspective, the events in world v are possible.

The idea that what is possible might vary depending on what possible

world you live in might at �rst seem strange, but it isn’t really. “It is physically

impossible to travel faster than the speed of light” is true in the actual world,

but false in worlds where the laws of nature allow faster-than-light travel.

On to the semantics. We first define a general notion of an MPL-model,

which we’ll then use to give a semantics for each of our systems:

Definition of model: An MPL-model is an ordered triple, ⟨W, R, I⟩, where:

· W is a non-empty set of objects (“possible worlds”)

· R is a binary relation over W (“accessibility relation”)

· I is a two-place function that assigns a 0 or 1 to each sentence letter, relative to (“at”, or “in”) each world; that is, for any sentence letter α, and any w ∈ W, I(α, w) is either 0 or 1 (“interpretation function”)

Each MPL-model contains a set W of possible worlds, and an accessibility relation R. ⟨W, R⟩ is sometimes called the model’s frame. Think of the frame as a map of the “structure” of the model’s space of possible worlds: it contains information about how many worlds there are, and which worlds are accessible from which. In addition to a frame, each model also contains an interpretation function I, which assigns truth values to sentence letters.
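To make the definition concrete, here is a minimal Python sketch of a finite MPL-model. The encoding is an assumption made for illustration only: worlds are strings, R is a set of ordered pairs, and I is a dictionary keyed by (sentence letter, world), with every unlisted pair understood as 0. The name `is_mpl_model` is hypothetical.

```python
def is_mpl_model(W, R, I):
    """Check the three clauses of the definition on a finite candidate model."""
    nonempty = len(W) > 0                                      # W is non-empty
    relation_over_W = all(u in W and v in W for (u, v) in R)   # R relates members of W
    values_ok = all(w in W and val in (0, 1)                   # I assigns 0 or 1 at worlds in W
                    for ((_letter, w), val) in I.items())
    return nonempty and relation_over_W and values_ok


# A three-world example: r sees a and b, and nothing else sees anything.
W = {"r", "a", "b"}
R = {("r", "a"), ("r", "b")}
I = {("P", "a"): 1, ("P", "b"): 0}   # all unlisted (letter, world) pairs count as 0
frame = (W, R)                        # the frame is the model minus its interpretation
```

The default-to-0 convention stands in for the requirement that I assign a value to every one of the infinitely many sentence letters.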

A model’s interpretation function assigns truth values only to sentence

letters. But the sum total of all the truth values of sentence letters relative to

worlds determines the truth values of all complex wffs, again relative to worlds.

It is the job of the model’s valuation function to specify exactly how these truth

values get determined:

Definition of valuation: Where M (= ⟨W, R, I⟩) is any MPL-model, the valuation for M, V_M, is defined as the two-place function that assigns either 0 or 1 to each wff relative to each member of W, subject to the following constraints, where α is any sentence letter, φ and ψ are any wffs, and w is any member of W:

V_M(α, w) = I(α, w)

V_M(∼φ, w) = 1 iff V_M(φ, w) = 0

V_M(φ→ψ, w) = 1 iff either V_M(φ, w) = 0 or V_M(ψ, w) = 1

V_M(□φ, w) = 1 iff for each v ∈ W, if Rwv, then V_M(φ, v) = 1

What about the truth values for complex formulas that contain ∧, ∨, ↔, and ◇? Given the definitions of these defined connectives in terms of the primitive connectives, it is easy to prove that the following derived conditions hold:

V_M(φ∧ψ, w) = 1 iff V_M(φ, w) = 1 and V_M(ψ, w) = 1

V_M(φ∨ψ, w) = 1 iff V_M(φ, w) = 1 or V_M(ψ, w) = 1

V_M(φ↔ψ, w) = 1 iff V_M(φ, w) = V_M(ψ, w)

V_M(◇φ, w) = 1 iff for some v ∈ W, Rwv and V_M(φ, v) = 1
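The clauses for the valuation function translate almost line by line into a recursive evaluator. The sketch below is one possible encoding, not an official one: formulas are nested tuples such as `("imp", ("dia", "P"), ("box", "P"))`, sentence letters are strings, and the connective tags (`"not"`, `"imp"`, `"and"`, `"or"`, `"iff"`, `"box"`, `"dia"`) are names chosen here for illustration.

```python
def value(W, R, I, phi, w):
    """Truth value (0 or 1) of formula phi at world w of the model (W, R, I)."""
    if isinstance(phi, str):                       # sentence letter: look it up in I
        return I.get((phi, w), 0)                  # unlisted letters count as false
    op, args = phi[0], phi[1:]
    seen = [v for v in W if (w, v) in R]           # the worlds accessible from w
    if op == "not":
        return 1 - value(W, R, I, args[0], w)
    if op == "imp":                                # true iff antecedent 0 or consequent 1
        return max(1 - value(W, R, I, args[0], w), value(W, R, I, args[1], w))
    if op == "and":
        return min(value(W, R, I, a, w) for a in args)
    if op == "or":
        return max(value(W, R, I, a, w) for a in args)
    if op == "iff":
        return int(value(W, R, I, args[0], w) == value(W, R, I, args[1], w))
    if op == "box":                                # true at every accessible world
        return int(all(value(W, R, I, args[0], v) for v in seen))
    if op == "dia":                                # true at some accessible world
        return int(any(value(W, R, I, args[0], v) for v in seen))
    raise ValueError(f"unknown connective: {op}")


W = ["r", "a", "b"]
R = {("r", "a"), ("r", "b")}
I = {("P", "a"): 1}
dia_P_at_r = value(W, R, I, ("dia", "P"), "r")     # r sees a, where P is true
box_P_at_r = value(W, R, I, ("box", "P"), "r")     # r also sees b, where P is false
box_P_at_a = value(W, R, I, ("box", "P"), "a")     # a sees nothing: vacuously true
```

Note how a world that sees nothing makes every □-formula true and every ◇-formula false, exactly as the quantifiers in the clauses dictate.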

So far, we have introduced a general notion of an MPL-model, and have defined the notion of a wff’s being true at a world in an MPL-model. Next, let us consider how to define validity.

Remember that our overall strategy is C. I. Lewis’s: we want to construct

different modal systems, since it isn’t obvious which formulas ought to count as

logical truths. The systems will be named: K, D, T, B, S4, S5. Each system will

come with its own definition of a model. As a result, different formulas will come out valid in the different systems. For example, as we’ll see, the formula □P→□□P is going to come out valid in S4 and S5, but not in the other systems. Here are the definitions:

Definition of validity for modal systems:

· A K-model is defined as any MPL-model

· A D-model is any MPL-model whose accessibility relation is serial (i.e., any model ⟨W, R, I⟩ in which R is serial in W)

· A T-model is any MPL-model whose accessibility relation is reflexive (in W)

· A B-model is any MPL-model whose accessibility relation is reflexive (in W) and symmetric

· An S4-model is any MPL-model whose accessibility relation is reflexive (in W) and transitive

· An S5-model is any MPL-model whose accessibility relation is reflexive (in W), symmetric, and transitive

· A formula φ is valid in model M (= ⟨W, R, I⟩) iff for every w ∈ W, V_M(φ, w) = 1

· A formula is valid in system S (where S is either K, D, T, B, S4, or S5) iff it is valid in every S-model
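For finite models, the formal features named in these definitions can be checked mechanically. The following is a rough sketch under the assumption that R is a Python set of ordered pairs over W; the function names and `systems_of` are illustrative, not standard.

```python
def serial(W, R):
    return all(any((w, v) in R for v in W) for w in W)

def reflexive(W, R):
    return all((w, w) in R for w in W)

def symmetric(W, R):
    return all((v, u) in R for (u, v) in R)

def transitive(W, R):
    return all((u, x) in R for (u, v) in R for (y, x) in R if v == y)

def systems_of(W, R):
    """Names of the systems for which a model with frame (W, R) counts as a model."""
    refl = reflexive(W, R)
    tests = [
        ("K", True),
        ("D", serial(W, R)),
        ("T", refl),
        ("B", refl and symmetric(W, R)),
        ("S4", refl and transitive(W, R)),
        ("S5", refl and symmetric(W, R) and transitive(W, R)),
    ]
    return [name for name, ok in tests if ok]


W = {"r", "b"}
only_k = systems_of(W, {("r", "r"), ("r", "b")})                       # b sees nothing
refl_trans = systems_of(W, {("r", "r"), ("r", "b"), ("b", "b")})
equivalence = systems_of(W, {("r", "r"), ("r", "b"), ("b", "b"), ("b", "r")})
```

Since the interpretation function plays no role in deciding which kind of model something is, `systems_of` takes only the frame.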

Notice that for each system, the valid formulas are defined as the formulas that are valid in every model in which the accessibility relation has a certain formal feature. The systems differ from one another by what that formal feature is. For T it is reflexivity: a formula is T-valid iff it is valid in every model in which the accessibility relation is reflexive. For S4 the formal feature is reflexivity + transitivity. Other systems correspond to other formal features.

As before, we’ll use the ⊨ notation for validity. But since we have many modal systems, if we claim that a formula is valid, we’ll need to indicate which system we’re talking about. Let’s do that by subscripting ⊨ with the name of the system; thus, “⊨_T φ” means that φ is T-valid.

It’s important to get clear on the status of possible-worlds lingo here. Where ⟨W, R, I⟩ is a model, we call the members of W “worlds”, and we call R the “accessibility” relation. Now, there is no question that “possible worlds” is a vivid way to think about necessity and possibility. But officially, W is nothing but a nonempty set, any old nonempty set. Its members needn’t be the kinds of things metaphysicians call possible worlds: they can be numbers, people, bananas, whatever you like. Similarly, R is just defined to be any old binary relation on W; it needn’t have anything to do with the metaphysics of modality. Officially, then, the possible-worlds talk we use to describe our models is just talk, not heavy-duty metaphysics. Still, models are usually intended to model something, to depict some aspect of the dependence of truth on the world. So if modal sentences of English containing ‘necessarily’ and ‘possibly’ aren’t made true by anything like possible worlds, it’s hard to see why possible worlds models would shed any light on their meaning, or why truth-in-all-possible-worlds-models would be a good way of modeling (genuine) validity for modal statements. At any rate, this philosophical issue should be kept in mind. Back, now, to the formalism.


6.3.3 Semantic validity proofs

Given our definition of validity, one can now show that a certain formula is valid in a given system. First, a very simple example.

Example 6.1: The formula □(P∨∼P) is K-valid. To show this formula is K-valid, we must show that it is valid in every MPL-model, since validity-in-all-MPL-models is the definition of K-validity. Being valid in a model means being true at every world in the model. So, consider any MPL-model ⟨W, R, I⟩, and let w be any world in W. We must prove that V_M(□(P∨∼P), w) = 1. (As before, I’ll start to omit the subscript M on V_M when it’s clear which model we’re talking about.)

i) Suppose for reductio that V(□(P∨∼P), w) = 0

ii) So, by the truth condition for the □ in the definition of the valuation function, there is some world, v, such that Rwv and V(P∨∼P, v) = 0

iii) Given the truth condition for the ∨, V(P, v) = 0 and V(∼P, v) = 0

iv) Since V(∼P, v) = 0, given the truth condition for the ∼, V(P, v) = 1. But that’s impossible; V(P, v) can’t be both 0 and 1.

Thus, ⊨_K □(P∨∼P).

Note that similar reasoning would establish ⊨_K □φ, for any propositional-logic tautology φ. The reason is this: within any world, the truth values of complex statements of propositional logic are determined by the truth values of their constituents in that world by the usual truth tables. So if φ is a PL-tautology, it will be true in any world in any model; hence □φ will turn out true in any world in any model.
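A semantic validity proof covers all models at once; a computer can only sample. Still, as a sanity check, one can enumerate every model on a small fixed set of worlds and confirm that no world falsifies the formula. The sketch below does this by brute force. It is an illustration under stated assumptions (formulas as nested tuples, the hypothetical helper `true_in_all_models_of_size`), and a finite check of this kind is evidence, not a proof of K-validity.

```python
from itertools import product

def value(W, R, I, phi, w):
    """Truth value of phi at w; only the connectives needed here are handled."""
    if isinstance(phi, str):
        return I.get((phi, w), 0)
    op, args = phi[0], phi[1:]
    seen = [v for v in W if (w, v) in R]
    if op == "not":
        return 1 - value(W, R, I, args[0], w)
    if op == "or":
        return max(value(W, R, I, a, w) for a in args)
    if op == "imp":
        return max(1 - value(W, R, I, args[0], w), value(W, R, I, args[1], w))
    if op == "box":
        return int(all(value(W, R, I, args[0], v) for v in seen))
    if op == "dia":
        return int(any(value(W, R, I, args[0], v) for v in seen))
    raise ValueError(f"unknown connective: {op}")

def true_in_all_models_of_size(phi, letters, n):
    """True iff phi is true at every world of every model whose worlds are 0..n-1."""
    W = list(range(n))
    pairs = [(u, v) for u in W for v in W]
    slots = [(letter, w) for letter in letters for w in W]
    for bits in product([0, 1], repeat=len(pairs)):        # every relation over W
        R = {p for p, b in zip(pairs, bits) if b}
        for vals in product([0, 1], repeat=len(slots)):    # every interpretation
            I = dict(zip(slots, vals))
            if any(value(W, R, I, phi, w) == 0 for w in W):
                return False
    return True


box_taut = ("box", ("or", "P", ("not", "P")))
no_small_countermodel = true_in_all_models_of_size(box_taut, ["P"], 2)
```

Running the same check on a formula that is not K-valid, such as `("imp", ("dia", "P"), ("box", "P"))`, returns False, since a falsifying two-world model exists.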

Example 6.2: Show that ⊨_T (◇□(P→Q)∧□P)→◇Q. We must show that V((◇□(P→Q)∧□P)→◇Q, w) = 1 for the valuation V for an arbitrarily chosen T-model and world w in that model.

i) Assume for reductio that V((◇□(P→Q)∧□P)→◇Q, w) = 0

ii) So V(◇□(P→Q)∧□P, w) = 1 and …

iii) …V(◇Q, w) = 0

iv) From ii), ◇□(P→Q) is true at w, and so V(□(P→Q), v) = 1, for some world, call it v, such that Rwv

v) From ii), V(□P, w) = 1. So, by the truth condition for the □, P is true in every world accessible from w; since Rwv, it follows that V(P, v) = 1.

vi) From iv), P→Q is true in every world accessible from v; since our model is a T-model, R is reflexive. So Rvv; and so V(P→Q, v) = 1

vii) From v) and vi), by the truth condition for the →, V(Q, v) = 1

viii) Given iii), Q is false at every world accessible from w; since Rwv, this contradicts vii)

The last example just showed that the formula (◇□(P→Q)∧□P)→◇Q is valid in T. Suppose we were interested in showing that this formula is also valid in S4. What more would we have to do? Nothing! To be S4-valid is to be valid in every S4-model; but a quick look at the definitions shows that every S4-model is a T-model. So, since we already know that the formula is valid in all T-models, we already know that it must be valid in all S4-models (and hence, S4-valid), without doing a separate proof.

Think of it another way. To do a proof that the formula is S4-valid, we need to do a proof in which we are allowed to assume that the accessibility relation is both transitive and reflexive. And the proof above did just that. We didn’t ever use the fact that the accessibility relation is transitive; we only used the fact that it is reflexive (in line vi). But we don’t need to use everything we’re allowed to assume.

In contrast, the proof above doesn’t establish that this formula is, say, K-valid. To be K-valid, the formula would need to be valid in all models. But some models don’t have reflexive accessibility relations, whereas the proof we gave assumed that the accessibility relation was reflexive. And in fact the formula isn’t K-valid, as we’ll show how to demonstrate in the next section.


Consider the following diagram of systems:

K → D → T;   T → B → S5;   T → S4 → S5

An arrow from one system to another indicates that validity in the first system implies validity in the second system. For example, if a formula is D-valid, then it’s also T-valid. The reason is that if something is valid in all D-models, then, since every T-model is also a D-model (since reflexivity implies seriality), it must be valid in all T-models as well.

S5 is the strongest system, since it has the most valid formulas. (That’s

because it has the fewest models—it’s easier to be S5-valid because there are

fewer potentially falsifying models.)

Notice that the diagram isn’t linear. That’s because of the following. Both B

and S4 are stronger than T; each contains all the T-valid formulas. But neither

B nor S4 is stronger than the other—each contains valid formulas that the

other doesn’t. (They of course overlap, because each contains all the T-valid

formulas.) S5 is stronger than each; S5 contains all the valid formulas of each.

These relationships between the systems will be exhibited below.

Suppose you are given a formula, and for each system in which it is valid,

you want to give a semantic proof of its validity. This needn’t require multiple

semantic proofs—as we have seen, one semantic proof can do the job. To prove

that a certain formula is valid in a number of systems, it suffices to prove that it

is valid in the weakest possible system. Then, that very proof will automatically

be a proof that it is valid in all stronger systems. For example, a proof that a

formula is valid in K would itself be a proof that the formula is D, T, B, S4,

and S5-valid. Why? Because every model of any kind is a K-model, so K-valid

formulas are always valid in all other systems.

In general, then, to show what systems a formula is valid in, it suffices to

give a single semantic proof of it, namely, a semantic proof in the weakest


system in which it is valid. There is an exception, however, since neither B

nor S4 is stronger than the other. Suppose a formula is not valid in T, but one

has given a semantic proof of its validity in B. This proof also establishes that the

formula is also valid in S5, since every S5 model is a B-model. But one still

doesn’t yet know whether the formula is S4-valid, since not every S4-model is a

B-model. Another semantic proof may be needed: of the formula’s S4-validity.

(Of course, the formula may not be S4-valid.)

So: when a wff is valid in both B and S4, but not in T, two semantic proofs

of its validity are needed.
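The bookkeeping described in the last few paragraphs amounts to reachability in the diagram of systems. As a small illustration (the names `ARROWS` and `also_valid_in` are invented for this sketch), the diagram can be stored as a graph and queried for every system to which a validity proof carries over:

```python
# Arrows of the diagram: validity in the key implies validity in each listed system.
ARROWS = {
    "K": ["D"],
    "D": ["T"],
    "T": ["B", "S4"],
    "B": ["S5"],
    "S4": ["S5"],
    "S5": [],
}

def also_valid_in(system):
    """Systems in which a formula must be valid, given that it is valid in `system`."""
    reached, stack = set(), [system]
    while stack:
        current = stack.pop()
        for stronger in ARROWS[current]:
            if stronger not in reached:
                reached.add(stronger)
                stack.append(stronger)
    return reached


from_k = also_valid_in("K")   # a K-validity proof settles every other system
from_b = also_valid_in("B")   # a B-validity proof settles S5 but says nothing about S4
```

The B case shows the exception discussed above: S4 is not reachable from B, so a separate S4 proof (or countermodel) is still needed.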

We are now in a position to do validity proofs. But as we’ll see in the next

section, it’s often easier to do proofs of validity when one has failed to construct

a counter-model for a formula.

Exercise 6.1 Use validity proofs to demonstrate the following:

a) ⊨_D [□P∧□(∼P∨Q)]→◇Q

b) ⊨_S4 ◇◇(P∧Q)→◇Q

6.3.4 Countermodels

We have a definition of validity for the various systems, and we’ve shown how to establish validity of particular formulas. Now we’ll investigate establishing invalidity.

Let’s show that the formula ◇P→□P is not K-valid. A formula is K-valid if it is valid in all K-models, so all we must do is find one K-model in which it isn’t valid. What follows is a procedure for doing this:⁴

Place the formula in a box

The goal is to find some model, and some world in the model, where the formula is false. Let’s start by drawing a box, which represents some chosen world in the model we’ll construct. The goal is to make the formula false in this world. In these examples I’ll always call this first world “r”:

World r: ◇P→□P

⁴ This procedure is from Cresswell and Hughes (1996).


Now, since the box represents a world, we should have some way of representing

the accessibility relation. What worlds are possible, relative to r; what worlds

does r “see”? Well, to represent one world (box) seeing another, we’ll draw

an arrow from the first to the second. However, in the case of this particular

model, we don’t need to make this world r see anything. After all, we’re trying

to construct a K-model, and the accessibility relation of a K-model doesn’t

even need to be serial—no world needs to see any worlds at all. So, we’ll forget

about arrows for the time being.

Make the formula false in the world

We will indicate a formula’s truth value (1 or 0) by writing it above the formula’s major connective. So to indicate that ◇P→□P is to be false in this model, we’ll put a 0 above its arrow:

World r: ◇P→□P [0 over the →]

Enter in forced truth values

If we want to make ◇P→□P false in this world, the definition of a valuation function requires us to assign certain other truth values. Whenever a conditional is false at a world, its antecedent is true at that world and its consequent is false at that world. So, we’ve got to enter in more truth values; a 1 over the major connective of the antecedent (◇P), and a 0 over the major connective of the consequent (□P):

World r: ◇P→□P [1 over the ◇, 0 over the →, 0 over the □]

Enter asterisks

When we assign a truth value to a modal formula, we thereby commit ourselves to assigning certain other truth values to various formulas at various worlds. For example, when we make ◇P true at r, we commit ourselves to making P true at some world that r sees. To remind ourselves of this commitment, we’ll put an asterisk (*) below ◇P. An asterisk below indicates a commitment to there being some world of a certain sort. Similarly, since □P is false at r, this means that P must be false in some world r sees (if it were true in all such worlds, then by the semantic clause for the □, □P would be true at r). We again have a commitment to there being some world of a certain sort, so we enter an asterisk below □P as well:

World r: ◇P→□P [1 over the ◇, 0 over the →, 0 over the □; asterisks below the ◇ and the □]

Discharge bottom asterisks

The next step is to fulfill the commitments we incurred by adding the bottom asterisks. For each, we need to add a world to the diagram. The first asterisk requires us to add a world in which P is true; the second requires us to add a world in which P is false. We do this as follows:

World r: ◇P→□P [1 over the ◇, 0 over the →, 0 over the □; asterisks below the ◇ and the □]
World a: P [1]
World b: P [0]
Arrows: from r to a, from r to b

What I’ve done is added two more worlds to the diagram: a and b. P is true in a, but false in b. I have thereby satisfied my obligations to the asterisks on my diagram, for r does indeed see a world in which P is true, and another in which P is false.

The official model

We now have a diagram of a K-model containing a world in which ◇P→□P is false. But we need to produce an official model, according to the official definition of a model. A model is an ordered triple ⟨W, R, I⟩, so we must specify the model’s three members.

The set of worlds We first must specify the set of worlds, W. W is simply the set of worlds I invoked:

W = {r, a, b}


But what are r, a, and b? Let’s just take them to be the letters ‘r’, ‘a’, and ‘b’. No

reason not to: the members of W, recall, can be any things whatsoever.

The accessibility relation Next, for the accessibility relation. This is

represented on the diagram by the arrows. In our model, there is an arrow

from r to a, an arrow from r to b, and no other arrows. Thus, the diagram

represents that r sees a, that r sees b, and that there are no further cases of

seeing. Now, remember that the accessibility relation, like all relations, is a set

of ordered pairs. So, we simply write out this set:

R = {⟨r, a⟩, ⟨r,b⟩}

That is, we write out the set of all ordered pairs ⟨w1, w2⟩ such that w1 “sees”

w2.

The interpretation function Finally, we need to specify the interpretation function, I, which assigns truth values to sentence letters at worlds. In our model, I must assign 1 to P at world a, and 0 to P at world b. Now, our official definition requires an interpretation to assign a truth value to each of the infinitely many sentence letters at each world; but so long as P is true at world a and false at world b, it doesn’t matter what other truth values I assigns. So let’s just (arbitrarily) choose to make all other sentence letters false at all worlds in the model. We have, then:

I(P, a) = 1

I(P, b) = 0

I(α, w) = 0 for all other sentence letters α and worlds w

That’s it; we’re done. We have produced a model in which ◇P→□P is false at some world; hence this formula is not valid in all models; and hence it’s not K-valid: ⊭_K ◇P→□P.
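The official model can be checked mechanically against the clauses of the valuation function. Here is a minimal sketch, assuming formulas are encoded as nested tuples and the interpretation as a dictionary in which unlisted sentence letters default to 0 (both are conventions adopted for this illustration only):

```python
def value(W, R, I, phi, w):
    """Truth value of phi at w; handles only the connectives used below."""
    if isinstance(phi, str):
        return I.get((phi, w), 0)
    op, args = phi[0], phi[1:]
    seen = [v for v in W if (w, v) in R]           # worlds that w sees
    if op == "imp":
        return max(1 - value(W, R, I, args[0], w), value(W, R, I, args[1], w))
    if op == "box":
        return int(all(value(W, R, I, args[0], v) for v in seen))
    if op == "dia":
        return int(any(value(W, R, I, args[0], v) for v in seen))
    raise ValueError(f"unknown connective: {op}")


W = ["r", "a", "b"]
R = {("r", "a"), ("r", "b")}
I = {("P", "a"): 1, ("P", "b"): 0}
formula = ("imp", ("dia", "P"), ("box", "P"))

values = {w: value(W, R, I, formula, w) for w in W}
valid_in_model = all(v == 1 for v in values.values())
```

The formula is false at r but (vacuously) true at a and b, which see nothing; one falsifying world is enough to make it invalid in the model.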

Check the model

At the end of this process, it’s a good idea to check to make sure that your

model is correct. This involves various things. First, make sure that you’ve

succeeded in producing the correct kind of model. For example, if you’re trying

to produce a T-model, make sure that the accessibility relation you’ve written


down is reflexive. (In our case, we were only trying to construct a K-model, and

so for us this step is trivial.) Secondly, make sure that the formula in question

really does come out false at one of the worlds in your model.

Simplifying models

Sometimes a model can be simplified. Consider the diagram of the final version of the model above:

World r: ◇P→□P [1 over the ◇, 0 over the →, 0 over the □; asterisks below the ◇ and the □]
World a: P [1]
World b: P [0]
Arrows: from r to a, from r to b

We needn’t have used three worlds in the model. When we discharged the first asterisk, we needed to put in a world that r sees, in which P is true. But we needn’t have made that a new world; we could simply have made P true in r. Of course we couldn’t have done that for both asterisks, because that would have made P both true and false at r. So, we could make one simplification:

World r: ◇P→□P [1 over the ◇, 1 over the first P, 0 over the →, 0 over the □; asterisks below the ◇ and the □]
World b: P [0]
Arrows: from r to itself, from r to b

The official model would then look as follows:

W = {r, b}

R = {⟨r, r⟩, ⟨r, b⟩}

I(P, r) = 1, all others 0


Adapting models to different systems

We have shown that ◇P→□P is not K-valid. Now, let’s show that this formula isn’t D-valid, i.e. that it is false in some world of some model with a serial accessibility relation (i.e., some “D-model”). Well, we haven’t quite done this, since the model above does not have a serial accessibility relation. But we can easily change this, as follows:

World r: ◇P→□P [1 over the ◇, 1 over the first P, 0 over the →, 0 over the □; asterisks below the ◇ and the □]
World b: P [0]
Arrows: from r to itself, from r to b, from b to itself

Official model:

W = {r, b}

R = {⟨r, r⟩, ⟨r, b⟩, ⟨b, b⟩}

I(P, r) = 1, all others 0

That was easy: adding the fact that b sees itself didn’t require changing anything else in the model.

Suppose we want now to show that ◇P→□P isn’t T-valid. Well, we’ve already done so! Why? Because we’ve already produced a T-model in which this formula is false. Look back at the most recent model. Its accessibility relation is reflexive. So it’s a T-model already. In fact, that accessibility relation is also already transitive, so it’s already an S4-model.

So far we have established that ⊭_K,D,T,S4 ◇P→□P. What about B and S5? It’s easy to revise our model to make the accessibility relation symmetric:

World r: ◇P→□P [1 over the ◇, 1 over the first P, 0 over the →, 0 over the □; asterisks below the ◇ and the □]
World b: P [0]
Arrows: from r to itself, from r to b, from b to r, from b to itself

Official model:

W = {r, b}

R = {⟨r, r⟩, ⟨r, b⟩, ⟨b, b⟩, ⟨b, r⟩}

I(P, r) = 1, all others 0

Now, we’ve got a B-model, too. What’s more, we’ve also got an S5-model: notice that the accessibility relation is an equivalence relation. (In fact, it’s also a total relation.)

So, we’ve succeeded in establishing that ◇P→□P is not valid in any of our systems. Notice that we could have done this more quickly, if we had given the final model in the first place. After all, this model is an S5, S4, B, T, D, and K-model. So one model establishes that the formula isn’t valid in any of the systems.

In general, in order to establish that a formula is invalid in a number of systems, try to produce a model for the strongest system (i.e., the system with the most requirements on models). If you do, then you’ll automatically have a model for the weaker systems. Keep in mind the diagram of systems:

K → D → T;   T → B → S5;   T → S4 → S5

An arrow from one system to another, recall, indicates that validity in the first system implies validity in the second. The arrows also indicate facts about invalidity, but in reverse: when an arrow points from one system to another, then invalidity in the second system implies invalidity in the first. For example, if a wff is invalid in T, then it is invalid in D. (That’s because every T-model is a D-model; a countermodel in T is therefore a countermodel in D.)

When our task is to discover which systems a given formula is invalid in,

usually only one countermodel will be needed—a countermodel in the strongest


system in which the formula is invalid. But there is an exception involving B

and S4. Suppose a given formula is valid in S5, but we discover a model showing

that it isn’t valid in B. That model is automatically a T, D, and K-model, so we

know that the formula isn’t T, D, or K-valid. But we don’t yet know about that

formula’s S4-validity. If it is S4-invalid, then we will need to produce a second

countermodel, an S4 countermodel. (Notice that the B-model couldn’t already be an S4-model. If it were, then its accessibility relation would be reflexive,

symmetric, and transitive, and so it would be an S5 model, contradicting the

fact that the formula was S5-valid.)

Additional steps in countermodelling

I gave a list of steps in constructing countermodels:

1. Place the formula in a box

2. Make the formula false in the world

3. Enter in forced truth values

4. Enter asterisks

5. Discharge bottom asterisks

6. The official model

We’ll need to adapt this list.

Above asterisks Let’s try to get a countermodel for ◇□P→□◇P in all the systems in which it is invalid, and a semantic validity proof in all the systems in which it is valid. We always start with countermodelling before doing semantic validity proofs, and when doing countermodelling, we start by trying for a K-model. After the first few steps, we have:

World r: ◇□P→□◇P [1 over the first ◇, 0 over the →, 0 over the second □; asterisks below the first ◇ and the second □]
World a: □P [1 over the □]
World b: ◇P [0 over the ◇]
Arrows: from r to a, from r to b


At this point, we’ve got a true □, and a false ◇. Take the first: a true □P. This doesn’t commit us to adding a world in which P is true; rather, it commits us to making P true in every world that a sees. Similarly, a zero over a ◇, over ◇P in world b in this case, commits us to making P false in every world that b sees. We indicate such commitments, commitments in every world seen, by putting asterisks above the relevant modal operators:

World r: ◇□P→□◇P [1 over the first ◇, 0 over the →, 0 over the second □; asterisks below the first ◇ and the second □]
World a: □P [1 over the □, asterisk above the □]
World b: ◇P [0 over the ◇, asterisk above the ◇]
Arrows: from r to a, from r to b

Now, how can we discharge these asterisks? In this case, when trying to construct a K-model, we don’t need to do anything. Since a, for example, doesn’t see any world, P is automatically true in every world it sees; the statement “for every world, w, if Raw then V(P, w) = 1” is vacuously true. Same goes for b: P is automatically false in all worlds it sees. So, we’ve got a K-model in which ◇□P→□◇P is false.

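Now let’s turn the model into a D-model. Every world must now see at least one world. Let’s try:

World r: ◇□P→□◇P [1 over the first ◇, 0 over the →, 0 over the second □; asterisks below the first ◇ and the second □]
World a: □P [1 over the □, asterisk above the □]
World b: ◇P [0 over the ◇, asterisk above the ◇]
World c: P [1]
World d: P [0]
Arrows: from r to a, from r to b, from a to c, from b to d, from c to itself, from d to itself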

I added worlds c and d, so that a and b would each see at least one world.

(Further, worlds c and d each had to see a world, to keep the relation serial.

I could have added still more worlds that c and d saw, but then they would

themselves need to see some worlds…So I just let c and d see themselves.) But

once c and d were added, discharging the upper asterisks in worlds a and b

required making P true in c and false in d (since a sees c and b sees d).

Let’s now try for a T-model. This will involve, among other things, letting

a and b see themselves. But this gets rid of the need for worlds c and d, since

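they were added just to make the relation serial. I’ll try:

World r: ◇□P→□◇P [1 over the first ◇, 0 over the →, 0 over the second □; asterisks below the first ◇ and the second □]
World a: □P [1 over the □, asterisk above the □, 1 over P]
World b: ◇P [0 over the ◇, asterisk above the ◇, 0 over P]
Arrows: from r to a, from r to b, and from each of r, a, and b to itself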

When I added arrows, I needed to make sure that I correctly discharged the

asterisks. This required nothing of world r, since there were no top asterisks

there. There were top asterisks in worlds a and b; but it turned out to be easy

to discharge these asterisks—I just needed to let P be true in a, but false in b.

Notice that I could have moved straight to this T-model (which is itself a D-model) rather than first going through the earlier mere-D-model. However, this won’t always be possible: sometimes you’ll be able to get a D-model, but no T-model.

At this point let’s verify that our model does indeed assign the value 0 to our formula ◇□P→□◇P. First notice that □P is true in a (since a only sees one world, itself, and P is true there). But r sees a. So ◇□P is true at r. Now, consider b. b only sees one world, itself, and P is false there. So ◇P must also be false there. But r sees b. So □◇P is false at r. But now, the antecedent of ◇□P→□◇P is true, while its consequent is false, at r. So that conditional is false at r. Which is what we wanted.

Onward. Our model is not a B-model, since a, for example, doesn’t see r,

despite the fact that r sees a. So let’s try to make this into a B-model. This


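involves making the relation symmetric. Here’s how it looks before I try to discharge the top asterisks in a and b:

World r: ◇□P→□◇P [1 over the first ◇, 0 over the →, 0 over the second □; asterisks below the first ◇ and the second □]
World a: □P [1 over the □, asterisk above the □, 1 over P]
World b: ◇P [0 over the ◇, asterisk above the ◇, 0 over P]
Arrows: between r and a in both directions, between r and b in both directions, and from each of r, a, and b to itself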

Now I need to make sure that all top asterisks are discharged. For example,

since a now sees r, I’ll need to make sure that P is true at r. However, since

b sees r too, P needs to be false at r. But P can’t be both true and false at r.

So we’re stuck, in trying to get a B-model in which this formula is false. This

suggests that maybe it is impossible—that is, perhaps this formula is true in all

worlds in all B-models—that is, perhaps the formula is B-valid. So, the thing

to do is try to prove this: by supplying a semantic validity proof.

So, let ⟨W, R, I⟩ be any model in which R is reflexive and symmetric, let V be its valuation function, and let w be any member of W; we must show that V(◇□P→□◇P, w) = 1.

i) Suppose for reductio that V(◇□P→□◇P, w) = 0

ii) Then V(◇□P, w) = 1 and …

iii) …V(□◇P, w) = 0

iv) By ii), for some v, Rwv and V(□P, v) = 1.

v) By symmetry, Rvw.

vi) From iv), via the truth condition for □, we know that P is true at every world accessible from v; and so, by v), V(P, w) = 1.

vii) By iii), there is some world, call it u, such that Rwu and V(◇P, u) = 0.

viii) By symmetry, w is accessible from u.

ix) By vii), P is false in every world accessible from u; and so by viii), V(P, w) = 0, contradicting vi)

Just as we suspected: the formula is indeed B-valid. So we know that it is

S5-valid (the proof we just gave was itself a proof of its S5-validity). But what

about S4-validity? Remember the diagram—we don’t have the answer yet. The

thing to do here is to try to come up with an S4-model, or an S4 semantic

validity proof. Usually, the best thing to do is to try for a model. In fact, in the

present case this is quite easy: our T-model is already an S4-model.

So, we’re done. Our answer to what systems the formula is valid and invalid in comes in two parts. First, validity. For the systems in which the formula is valid, we gave a semantic proof of B-validity above. This was itself a proof of S5-validity, as I noted. So ⊨_B,S5 ◇□P→□◇P. Second, invalidity. For the systems in which the formula is invalid, we have an S4-model, which we display officially as follows:

W = {r, a, b}

R = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨r, b⟩}

I(P, a) = 1, all others 0

This model is itself also a T, D, and K-model (since its accessibility relation is reflexive and serial), so: ⊭_K,D,T,S4 ◇□P→□◇P.
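The hand search for countermodels can be mimicked by brute force over small models. The sketch below enumerates every model on three worlds, keeps those whose accessibility relation meets a system’s requirements, and reports whether any world falsifies ◇□P→□◇P. Everything here (the tuple encoding of formulas, `SYSTEMS`, `find_countermodel`) is an assumption made for illustration; a failed search over three-world models does not by itself prove validity, which is why the semantic proof above is still needed for B and S5.

```python
from itertools import product

def value(W, R, I, phi, w):
    if isinstance(phi, str):
        return I.get((phi, w), 0)
    op, args = phi[0], phi[1:]
    seen = [v for v in W if (w, v) in R]
    if op == "imp":
        return max(1 - value(W, R, I, args[0], w), value(W, R, I, args[1], w))
    if op == "box":
        return int(all(value(W, R, I, args[0], v) for v in seen))
    if op == "dia":
        return int(any(value(W, R, I, args[0], v) for v in seen))
    raise ValueError(f"unknown connective: {op}")

def serial(W, R):     return all(any((w, v) in R for v in W) for w in W)
def reflexive(W, R):  return all((w, w) in R for w in W)
def symmetric(W, R):  return all((v, u) in R for (u, v) in R)
def transitive(W, R): return all((u, x) in R for (u, v) in R for (y, x) in R if v == y)

SYSTEMS = {
    "K": [], "D": [serial], "T": [reflexive],
    "B": [reflexive, symmetric], "S4": [reflexive, transitive],
    "S5": [reflexive, symmetric, transitive],
}

def find_countermodel(phi, letters, n, system):
    """Return some (W, R, I) of the given system with n worlds falsifying phi, or None."""
    W = list(range(n))
    pairs = [(u, v) for u in W for v in W]
    slots = [(letter, w) for letter in letters for w in W]
    for bits in product([0, 1], repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        if not all(requirement(W, R) for requirement in SYSTEMS[system]):
            continue
        for vals in product([0, 1], repeat=len(slots)):
            I = dict(zip(slots, vals))
            if any(value(W, R, I, phi, w) == 0 for w in W):
                return W, R, I
    return None


phi = ("imp", ("dia", ("box", "P")), ("box", ("dia", "P")))
found = {s: find_countermodel(phi, ["P"], 3, s) is not None for s in SYSTEMS}
```

The search finds countermodels for K, D, T, and S4 and none for B or S5, in agreement with the results reached by hand.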

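Example 6.3: Determine in which systems ◇□P→◇□◇□P is valid and in which systems it is invalid.

Well, we can get a T-model as follows:

World r: ◇□P→◇□◇□P [1 over the first ◇, 0 over the →, 0 over the second ◇ with an asterisk above it, 0 over the □ that follows it; asterisks below the first ◇ and below that □]
World a: □P [1 over the □, asterisk above the □, 1 over P]; □◇□P [0 over the first □, asterisk below it]
World b: ◇□P [0 over the ◇, asterisk above the ◇, 0 over the □, asterisk below the □]; P [1]
World c: P [0]
Arrows: from r to a, from r to b, from a to b, from b to c, and from each world to itself

(I discharged the second bottom asterisk in r by letting r see b. Notice how commitments to specific truth values for different formulas are recorded by placing the formulas side by side in the box.)

Official model:

W = {r, a, b, c}

R = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨c, c⟩, ⟨r, a⟩, ⟨r, b⟩, ⟨a, b⟩, ⟨b, c⟩}

I(P, a) = I(P, b) = 1, all others 0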
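As recommended under “Check the model”, it is worth confirming that this model really is a T-model and really falsifies the formula at r. The sketch below does so, taking P to be true at both a and b: the true □P at a requires this, since a sees itself as well as b. The tuple encoding of formulas and the dictionary encoding of I are conventions assumed for this illustration.

```python
def value(W, R, I, phi, w):
    if isinstance(phi, str):
        return I.get((phi, w), 0)                  # unlisted letters count as false
    op, args = phi[0], phi[1:]
    seen = [v for v in W if (w, v) in R]
    if op == "imp":
        return max(1 - value(W, R, I, args[0], w), value(W, R, I, args[1], w))
    if op == "box":
        return int(all(value(W, R, I, args[0], v) for v in seen))
    if op == "dia":
        return int(any(value(W, R, I, args[0], v) for v in seen))
    raise ValueError(f"unknown connective: {op}")


W = ["r", "a", "b", "c"]
R = {(w, w) for w in W} | {("r", "a"), ("r", "b"), ("a", "b"), ("b", "c")}
I = {("P", "a"): 1, ("P", "b"): 1}

dia_box_P = ("dia", ("box", "P"))
formula = ("imp", dia_box_P, ("dia", ("box", dia_box_P)))

is_reflexive = all((w, w) in R for w in W)
is_symmetric = all((v, u) in R for (u, v) in R)
is_transitive = all((u, x) in R for (u, v) in R for (y, x) in R if v == y)
value_at_r = value(W, R, I, formula, "r")

# With P true only at b, □P fails at a (a sees itself), so the antecedent fails at r.
value_at_r_if_P_only_at_b = value(W, R, {("P", "b"): 1}, formula, "r")
```

The relation is reflexive but neither symmetric nor transitive, so the model witnesses invalidity in K, D, and T only; B and S4 need separate treatment.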

Now consider what happens when we try to turn this model into a B-model. World b must see back to world a. But then the false ◇□P in b conflicts with the true □P in a. So it’s time for a validity proof. In constructing this validity proof, we can be guided by our failed attempt to construct a countermodel (assuming all of our choices in constructing that countermodel were forced). In the following proof that the formula is B-valid, I chose variables for worlds that match up with the countermodel above:

i) Suppose for reductio that V(◇□P→◇□◇□P, r) = 0, in some world r in some B-model ⟨W, R, I⟩

ii) So V(◇□P, r) = 1 and …

iii) …V(◇□◇□P, r) = 0

iv) From ii), there’s some world, call it a, such that V(□P, a) = 1 and Rra

v) From iii), since Rra, V(□◇□P, a) = 0

vi) And so, there’s some world, call it b, such that V(◇□P, b) = 0 and Rab

vii) By symmetry, Rba. And so, given vi), V(□P, a) = 0. This contradicts iv)

We now have a T-model for the formula, and a proof that it is B-valid. The B-validity proof shows the formula to be S5-valid; the T-model shows it to be K-, D-, and T-invalid. We still don’t yet know about S4. So let’s return to the T-model above, and see what happens when we try to make its accessibility relation transitive. World a must then see world c, which is impossible since □P is true in a and P is false in c. So we’re ready for an S4-validity proof (the proof looks like the B-validity proof at first, but then diverges):

i) Suppose for reductio that V(◇□P→◇□◇□P, r) = 0, for some world r in some S4-model ⟨W, R, I⟩

ii) So V(◇□P, r) = 1 and …

iii) …V(◇□◇□P, r) = 0

iv) From ii), there’s some world, call it a, such that V(□P, a) = 1 and Rra

v) From iii), since Rra, V(□◇□P, a) = 0

vi) And so, there’s some world, call it b, such that V(◇□P, b) = 0 and Rab

vii) By reflexivity, Rbb, so given vi), V(□P, b) = 0

viii) And so, there’s some world, call it c, such that V(P, c) = 0 and Rbc.

ix) From vi) and viii), given transitivity, we have Rac. And so, given iv), V(P, c) = 1, contradicting viii)


Daggers

There’s another kind of step in constructing models. When we make a condi-

tional false, we’re forced to enter certain truth values for its components: 1 for

the antecedent, 0 for the consequent. But consider making a disjunction true. A

disjunction can be true in more than one way. The first disjunct might be true,

or the second might be true, or both could be true. So we have a choice for

how to go about making a disjunction true. Similarly for making a conditional

true, a conjunction false, or a biconditional either true or false.

When one has a choice about which truth values to give the constituents

of a propositional compound, it’s best to delay making the choice as long as

possible. After all, some other part of the model might force you to make one

choice rather than the other. If you investigate the rest of the countermodel,

and nothing has forced your hand, you may need then to make a guess: try one

of the truth value combinations open to you, and see whether you can finish

the countermodel. If not, go back and try another combination.

To remind ourselves of these choice points, we will place a dagger (†) un-

derneath the major connective of the formula in question. Consider, as an

example, constructing a countermodel for the formula 3(3P∨2Q)→(3P∨Q). Throwing caution to the wind and going straight for a T-model, we have after

a few steps:

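[Diagram: world r sees itself and world a. In r, 3(3P∨2Q)→(3P∨Q) is 0: the antecedent 3(3P∨2Q) is 1, and 3P, P, and Q are all 0. World a sees itself; in a, 3P∨2Q is 1, with a dagger under the ∨, and P is 0.]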

We still have to decide how to make 3P∨2Q true in world a: which disjunct

to make true? Well, making 2Q true won’t require adding another world to

the model, so let’s do that. We have, then, a T-model:


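[Diagram: as before, world r sees itself and world a, and 3(3P∨2Q)→(3P∨Q) is 0 in r. In world a, the dagger is discharged by making 2Q true: 3P∨2Q is 1, 2Q is 1, Q is 1, and P is 0.]

W = {r, a}

R = {⟨r, r⟩, ⟨a, a⟩, ⟨r, a⟩}

I(Q, a) = 1, all else 0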

OK, let’s try now to upgrade this to a B-model. We can’t simply leave

everything as-is while letting world a see back to world r, since 2Q is true

in a and Q is false in r. But there’s another possibility. We weren’t forced to

discharge the dagger in world a by making 2Q true. So let’s explore the other

possibility; let’s make 3P true:

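[Diagram: three worlds r, a, and b, each seeing itself, with two-way arrows between r and a and between a and b. In r, 3(3P∨2Q)→(3P∨Q) is 0. In a, the dagger is discharged by making 3P true: 3P∨2Q is 1, 3P is 1, and P is 0. In b, P is 1.]

W = {r, a, b}

R = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨a, r⟩, ⟨a, b⟩, ⟨b, a⟩}

I(P, b) = 1, all else 0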

What about an S4-model? We can’t just add the arrows demanded by

transitivity to our B-model, since 3P is false in world r and P is true in world

b. What we can do instead is revisit the choice of which disjunct of 3P∨2Q to

make true. Instead of making 3P true, we can make 2Q true, as we did when

we constructed our T-model. In fact, that T-model is already an S4-model.
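The two official models can be checked mechanically. Here is a minimal Python sketch, assuming a hypothetical tuple encoding of formulas and a hypothetical evaluator `holds`; it evaluates the formula at world r in each model and tests the relevant properties of each accessibility relation.

```python
def holds(f, w, R, I):
    """Truth of formula f at world w; R is a set of accessibility pairs,
    I is the set of (letter, world) pairs at which a sentence letter is true."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in I
    if op == "or":
        return holds(f[1], w, R, I) or holds(f[2], w, R, I)
    if op == "imp":
        return (not holds(f[1], w, R, I)) or holds(f[2], w, R, I)
    if op == "box":
        return all(holds(f[1], v, R, I) for (u, v) in R if u == w)
    if op == "dia":
        return any(holds(f[1], v, R, I) for (u, v) in R if u == w)
    raise ValueError(op)

def symmetric(R):
    return all((v, u) in R for (u, v) in R)

def transitive(R):
    return all((u, x) in R for (u, v) in R for (y, x) in R if v == y)

P, Q = ("atom", "P"), ("atom", "Q")
disjunction = ("or", ("dia", P), ("box", Q))                      # 3P v 2Q
formula = ("imp", ("dia", disjunction), ("or", ("dia", P), Q))    # 3(3P v 2Q) -> (3P v Q)

# First model: the dagger in world a was discharged by making 2Q true.
T_R = {("r", "r"), ("a", "a"), ("r", "a")}
T_I = {("Q", "a")}

# Second model: the dagger in world a was discharged by making 3P true.
B_R = {("r", "r"), ("a", "a"), ("b", "b"),
       ("r", "a"), ("a", "r"), ("a", "b"), ("b", "a")}
B_I = {("P", "b")}

first_falsifies = not holds(formula, "r", T_R, T_I)
second_falsifies = not holds(formula, "r", B_R, B_I)
```

Both models falsify the formula at r. The first relation is transitive but not symmetric, and the second is symmetric but not transitive, which is why the first serves as an S4-model and the second as a B-model.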

So, we have countermodels in both S4 and B. The first resulted from one choice for discharging the dagger in world a, the second from the other choice. An S5-model, though, looks impossible. When we made the first choice (making the right disjunct of 3P∨2Q true) we were able to make the accessibility relation transitive but not symmetric, and when we made the second choice (making the left disjunct of 3P∨2Q true) we were able to make the accessibility relation symmetric but not transitive. It would seem to be impossible, then, to make the accessibility relation both transitive and symmetric. Here is an S5-validity proof, based on this reasoning. Note the “separation of cases” reasoning:

i) Suppose for reductio that in some world r in some S5-model, V(3(3P∨2Q)→(3P∨Q), r) = 0. Then V(3(3P∨2Q), r) = 1 and …

ii) …V(3P∨Q, r) = 0

iii) Given i), for some world a, Rra and V(3P∨2Q, a) = 1. So, either

V(3P, a) = 1 or V(2Q, a) = 1

iv) The first possibility leads to a contradiction:

a) Suppose V(3P, a) = 1. Then for some world b, Rab and V(P, b) = 1

b) R is transitive, so given a) and iii), Rrb.

c) Given ii), V(3P, r) = 0, and so, given b), V(P, b) = 0, which contradicts a).

v) So does the second:

a) Suppose V(2Q, a) = 1.

b) R is symmetric. So, given iii), Rar; and so, given a), V(Q, r) = 1

c) But given ii), V(Q, r) = 0, which is a contradiction.

vi) Either way we have a contradiction.

So we have demonstrated that ⊨S5 3(3P∨2Q)→(3P∨Q).

Summary of steps

Here, then, is a final list of the steps for constructing countermodels:

1. Place the formula in a box

2. Make the formula false in the world

3. Enter in forced truth values

4. Enter in daggers, and after all forced moves over…

5. Enter asterisks

6. Discharge asterisks (hint: do bottom asterisks first)

7. Back to step 3 if not finished

8. The official model


Exercise 6.2 For each of the following wffs, give a countermodel

for every system in which it is not valid, and give a semantic validity

proof for every system in which it is valid. When you use a single

countermodel or validity proof for multiple systems, indicate which

systems it is good for.

a) 2[P→3(Q→R)]→3[Q→(2P→3R)]

b) 2(P∨3Q)→(2P∨3Q)

c) 3(P∧3Q)→(23P→32Q)

d) 2(P↔Q)→2(2P↔2Q)

e) 2(P∧Q)→22(3P→3Q)

f) 2(2P→Q)→2(2P→2Q)

g) 332P↔2P

h) 33P→23P

i) 2[2(P→2P )→2P]→(32P→2P )

6.3.5 Schemas, validity, and invalidity

Let’s digress, for a moment, to clarify the notions of validity and invalidity, as

applied to formulas and schemas.

Formulas (official wffs, that is) are the strings of symbols that are sanctioned by the official rules of grammar of our object language. Formulas of

MPL include P∨(Q→R), 2P→3Q, and so on. Schemas are devices of our

metalanguage that are used to talk about infinitely many formulas at once. We

used schemas, for instance, in section 2.5 to state the axioms of propositional

logic; we said: “each instance of the schema φ→(ψ→φ) is an axiom of PL”.

And we have used schemas throughout the book to define the notion of a wff,

and to define valuation functions. For example, earlier in this chapter we said

that a valuation function must assign the value 1 to any instance of the schema

∼φ relative to any world iff it assigns 0 to the corresponding instance of φ relative to that world. Schemas are not formulas, since they contain schematic


variables (“φ” and “ψ” here) that are not part of the primitive vocabulary of

the object language.

When we defined the notion of validity, what we defined was the notion of

a valid formula. (That’s because validity is defined in terms of truth in a model,

which itself was defined only for formulas.) So it’s not, strictly speaking, correct

to apply the notions of validity or invalidity to schemas.

However, it’s often interesting to show that every instance of a given schema

is valid. (Instances of schemas are formulas, and so the notion of validity can

be properly applied to them.) It’s easy, for example, to show that every instance

of the schema 2(φ→φ) is valid in each of our modal systems. (Let φ be any

MPL-wff, and take any world w in any model. Since the rules for evaluating

propositional compounds within possible worlds are the classical ones, φ→φ must be true at w, no matter what truth value φ has at w. Hence 2(φ→φ) is

true in any world in any model, and so is valid in each system.)

There is, therefore, a kind of indirect notion of schema-validity: validity

of all instances. How about the invalidity of schemas? Here we must take

great care. In particular, the notion of a schema, all of whose instances are

invalid, is not a particularly interesting notion. Take, for instance, the schema

3φ→2φ. We showed earlier that a certain instance of this schema, namely

3P→2P, is invalid in each of our systems. However, the schema 3φ→2φ also

has plenty of instances that are valid in various systems. The following formula,

for example, is an instance of 3φ→2φ, and can easily be shown to be valid in

each of our systems:

3(P→P )→2(P→P )

Thus, even intuitively terrible schemas like 3φ→2φ have some valid instances.

(For an extreme example of this, consider the schema φ. Even this has some

valid instances: P→P , for one.) So it’s not interesting to inquire into whether

each instance of a schema is invalid. What is interesting is to inquire into

whether a given schema has some instances that are invalid. We can show, for

example, that the schema 3φ→2φ has some invalid instances (3P→2P , for

one), and hence is in this way unlike the schema 2(φ→φ).

So when dealing with schemas, it will often be of interest to ascertain

whether each instance of the schema is valid; it will rarely (if ever) be of interest

to ascertain whether each instance of the schema is invalid.


6.4 Axiomatic systems of MPL

We turn next to provability in modal logic. We’ll approach this axiomatically:

we’re going to write down axioms, which are sentences of propositional modal

logic that seem clearly to be logical truths, and we’re going to write down rules

of inference, which say which sentences can be logically inferred from which

other sentences.

We’re going to continue to follow C. I. Lewis in constructing multiple

modal systems, since it’s so unclear which sentences of MPL are logical truths.

Hence, we’ll need to formulate multiple axiomatic systems. These systems will

contain different axioms from one another. As a result, different theorems will

be provable in the different systems.

We will, in fact, give these systems the same names as the systems we

investigated semantically: K, D, T, B, S4, and S5. (Thus we will subscript

the symbol for theoremhood with the names of systems; ⊢K φ, for example,

will mean that φ is a theorem of system K.) Our re-use of the system names

will be justified in sections ?? and ??, where we will establish soundness and

completeness for each system. Given soundness and completeness, for each

system, exactly the same formulas are provable as are valid.

6.4.1 System K

Our first system, K, is the weakest system, i.e., the system with the fewest

theorems.

Axiomatic system K:

· Rules: modus ponens and necessitation:

    φ→ψ    φ                φ
    ----------- MP         ----- NEC
         ψ                  2φ

· Axioms: for any MPL-wffs φ,ψ, and χ , the following are axioms:

φ→(ψ→φ) (A1)

(φ→(ψ→χ ))→((φ→ψ)→(φ→χ )) (A2)

(∼ψ→∼φ)→((∼ψ→φ)→ψ) (A3)

2(φ→ψ)→(2φ→2ψ) (K)


As before, a proof is defined as a series of wffs, each of which is either an

axiom or follows from earlier lines in the proof by a rule, and a theorem is defined

as the last line of any proof.

This axiomatic system (like all the modal systems we will study) is an extension of propositional logic, in the sense that it includes all of the theorems of

propositional logic, but then adds more theorems. It includes all of proposi-

tional logic because one of its rules is the propositional logic rule MP, and each

propositional logic axiom is one of its axioms. It adds theorems by adding a

new rule of inference (NEC), and a new axiom schema (the K-schema) (as well

as adding new wffs—wffs containing the 2—to the stock of wffs that can occur

in the PL axioms.)

The rule of inference, NEC (for “necessitation”), says that if you have a

formula φ on a line, then you may infer the formula 2φ. This may seem

unintuitive. After all, can’t a sentence be true without being necessarily true?

Yes; but the rule of necessitation doesn’t contradict this. Remember that every

line in every axiomatic proof is a theorem. So whenever one uses necessitation in

a proof, one is applying it to a theorem. And necessitation does seem appropriate

when applied to theorems: ifφ is a theorem, then 2φ ought also to be a theorem.

Think of it this way. The worry about the rule of necessitation is that it isn’t a

truth-preserving rule: its premise can be true when its conclusion is false. The

answer to the worry is that while necessitation doesn’t preserve truth, it does

preserve logical truth, which is all that matters in the present context. For in

the present context, we’re only using NEC in a definition of theoremhood.

We want our theorems to be, intuitively, logical truths; and provided that our

axioms are all logical truths and our rules preserve logical truth, the definition

will yield only logical truths as theorems. We will return to this issue.

Let’s investigate what one can prove in K. The simplest sort of distinctively

modal proof consists of first proving something from the PL axioms, and then

necessitating it, as in the following proof of 2((P→Q)→(P→P)):

1. P→(Q→P) (A1)

2. (P→(Q→P))→((P→Q)→(P→P)) (A2)

3. (P→Q)→(P→P) 1, 2, MP

4. 2((P→Q)→(P→P)) 3, NEC
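Because an official K-proof is just a list of wffs, each an axiom instance or the result of MP or NEC applied to earlier lines, the definition can be checked mechanically. The following Python sketch is an illustration only: the tuple encoding, the use of plain strings as schematic letters, and the names `match` and `check_proof` are assumptions, not anything defined in the text.

```python
# Formulas are tuples: ("atom", "P"), ("not", A), ("imp", A, B), ("box", A).
# In axiom schemas, a bare string such as "a" plays the role of a schematic letter.
A1 = ("imp", "a", ("imp", "b", "a"))
A2 = ("imp", ("imp", "a", ("imp", "b", "c")),
             ("imp", ("imp", "a", "b"), ("imp", "a", "c")))
A3 = ("imp", ("imp", ("not", "b"), ("not", "a")),
             ("imp", ("imp", ("not", "b"), "a"), "b"))
K = ("imp", ("box", ("imp", "a", "b")), ("imp", ("box", "a"), ("box", "b")))
AXIOMS = [A1, A2, A3, K]

def match(schema, f, env):
    """Extend env so that the schema, instantiated by env, equals f; None if impossible."""
    if isinstance(schema, str):                 # schematic letter
        if schema in env:
            return env if env[schema] == f else None
        return {**env, schema: f}
    if schema[0] != f[0] or len(schema) != len(f):
        return None
    for s, g in zip(schema[1:], f[1:]):
        env = match(s, g, env)
        if env is None:
            return None
    return env

def check_proof(lines):
    """True iff every line is an axiom instance or follows from earlier lines by MP or NEC."""
    for i, f in enumerate(lines):
        earlier = lines[:i]
        is_axiom = any(match(ax, f, {}) is not None for ax in AXIOMS)
        by_mp = any(("imp", g, f) in earlier for g in earlier)
        by_nec = f[0] == "box" and f[1] in earlier
        if not (is_axiom or by_mp or by_nec):
            return False
    return True

def imp(a, b):
    return ("imp", a, b)

P, Q = ("atom", "P"), ("atom", "Q")
line1 = imp(P, imp(Q, P))                          # A1
line2 = imp(line1, imp(imp(P, Q), imp(P, P)))      # A2
line3 = imp(imp(P, Q), imp(P, P))                  # 1, 2, MP
line4 = ("box", line3)                             # 3, NEC
proof_ok = check_proof([line1, line2, line3, line4])
```

The checker accepts the four-line proof above. It rejects the list consisting of P followed by 2P, since P is not an axiom; this reflects the point that NEC is only ever applied to theorems.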

Using this technique, we can prove anything of the form 2φ, where φ is provable in PL. And, since the PL axioms are complete (section 2.7), that

means that we can prove 2φ whenever φ is a tautology—i.e., a valid wff of


PL. But constructing proofs from the PL axioms is a pain in the neck!—and

anyway not what we want to focus on in this chapter. So let’s introduce the

following time-saving shortcut. Instead of writing out proofs of tautologies,

let’s instead allow ourselves to write any PL tautology at any point in a proof,

annotating simply “PL”.5

Thus, the previous proof could be shortened to:

1. (P→Q)→(P→P ) PL

2. 2((P→Q)→(P→P )) 1, NEC

Furthermore, consider the wff 2P→2P . Clearly, we can construct a proof

of this wff from the PL axioms: begin with any proof of the tautology Q→Q from the PL axioms, and then construct a new proof by replacing each occurrence of Q in the first proof with 2P. (This is a legitimate proof, even though

2P isn’t a wff of propositional logic, because when we stated the system K, the

schematic letters φ, ψ, and χ in the PL axioms were allowed to be filled in with

any wffs of MPL, not just wffs of PL.) So let us also include lines like this in

our modal proofs:

2P→2P PL

Why am I making such a fuss about this? Didn’t I just say in the previous

paragraph that we can write down any tautology at any time, with the annotation

“PL”? Well, strictly speaking, 2P→2P isn’t a tautology. A tautology is a valid

wff of PL, and 2P→2P isn’t even a wff of PL (since it contains a 2). But it

is the result of beginning with some PL-tautology (Q→Q, in this case) and

uniformly changing sentence letters to chosen modal wffs (in this case, Qs to

2P s); hence any proof of the PL tautology may be converted into a proof of

it; hence the “PL” annotation is just as justified here as it is in the case of a

genuine tautology. So in general, MPL wffs that result from PL tautologies in

this way may be written down and annotated “PL”.

Back to investigating what we can prove in K. As we’ve seen, we can prove

that tautologies are necessary—we can prove 2φ whenever φ is a tautology.

One can also prove in K that contradictions are impossible. For instance,

∼3(P∧∼P ) is a theorem of K:

5How do you know whether something is a tautology? Figure it out any way you like: do a

truth table, or a natural deduction derivation—whatever.


1. ∼(P∧∼P ) PL

2. 2∼(P∧∼P ) 1, NEC

3. 2∼(P∧∼P )→∼∼2∼(P∧∼P ) PL

4. ∼∼2∼(P∧∼P ) 2, 3, MP

But ∼3(P∧∼P) is, by definition, an abbreviation of line 4.

Let’s introduce another time-saving shortcut. Note that the move from 2 to

4 in the previous proof is just a move from a formula to a propositional logical

consequence of that formula. Let’s allow ourselves to move directly from any

lines in a proof, φ1 . . .φn, to any propositional logical consequence ψ of those

lines, by “PL”. Thus, the previous proof could be shortened to:

1. ∼(P∧∼P ) PL

2. 2∼(P∧∼P ) 1, NEC

3. ∼∼2∼(P∧∼P ) 2, PL

Why is this legitimate? Suppose that ψ is a propositional logical semantic

consequence of φ1 . . .φn. Then the conditional φ1→(φ2→·· · (φn→ψ) . . . ) is

a PL-valid formula, and so, given the completeness of the PL axioms, is a

theorem of K. That means that if we have φ1, . . . ,φn in an axiomatic K-proof,

then we can always prove the conditional φ1→(φ2→·· · (φn→ψ) . . . ) using the

PL-axioms, and then use MP repeatedly to infer ψ. So inferring ψ directly,

and annotating “PL”, is justified. (As with the earlier “PL” shortcut, let’s use

this shortcut when the conditional φ1→(φ2→·· · (φn→ψ) . . .) results from some

tautology by uniform substitution, even if it contains modal operators and so

isn’t strictly a tautology.)
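Both “PL” shortcuts come down to a truth-table test in which modal subformulas are treated as if they were sentence letters. Here is a minimal Python sketch of that test; the tuple encoding and the names `units`, `value`, and `pl_consequence` are assumptions made for illustration. It treats 2 and 3 formulas as opaque, so it does not unpack the definition of 3 in terms of 2.

```python
from itertools import product

def units(f):
    """The propositional 'atoms' of f: sentence letters and outermost modal subformulas."""
    if f[0] in ("atom", "box", "dia"):
        return {f}
    return set().union(*(units(g) for g in f[1:]))

def value(f, row):
    """Classical truth value of f, given a row assigning truth values to its units."""
    if f in row:
        return row[f]
    if f[0] == "not":
        return not value(f[1], row)
    if f[0] == "and":
        return value(f[1], row) and value(f[2], row)
    if f[0] == "or":
        return value(f[1], row) or value(f[2], row)
    if f[0] == "imp":
        return (not value(f[1], row)) or value(f[2], row)
    raise ValueError(f[0])

def pl_consequence(premises, conclusion):
    """True iff no row makes every premise true and the conclusion false.
    With no premises, this tests whether the conclusion results from a tautology by substitution."""
    all_units = list(set().union(units(conclusion), *(units(p) for p in premises)))
    for vals in product([False, True], repeat=len(all_units)):
        row = dict(zip(all_units, vals))
        if all(value(p, row) for p in premises) and not value(conclusion, row):
            return False
    return True

P, Q = ("atom", "P"), ("atom", "Q")
box = lambda f: ("box", f)
imp = lambda a, b: ("imp", a, b)
neg = lambda a: ("not", a)
conj = lambda a, b: ("and", a, b)

# 2P -> 2P results from the tautology Q -> Q by substitution.
first = pl_consequence([], imp(box(P), box(P)))
# From 2~(P & ~P), the formula ~~2~(P & ~P) follows by PL.
nec_line = box(neg(conj(P, neg(P))))
second = pl_consequence([nec_line], neg(neg(nec_line)))
# 2(P -> Q) does not PL-imply 2P -> 2Q: that step needs the K axiom.
third = pl_consequence([box(imp(P, Q))], imp(box(P), box(Q)))
```

The first two checks succeed and the third fails, which matches the division of labor in the text: “PL” licenses only propositional moves, and distributing the 2 over the → requires the K-schema.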

So far our modal proofs have only used necessitation and the PL axioms.

What about the K-axioms? The point of the K-schema is to enable “distribution

of the 2 over the→”. That is, if you ever have the formula 2(φ→ψ), then you

can always move to 2φ→2ψ as follows:

i. 2(φ→ψ)

i + 1. 2(φ→ψ)→(2φ→2ψ) K axiom

i + 2. 2φ→2ψ i , i + 1, MP

Distribution of the 2 over the→, plus the rule of necessitation, combine

to give us a powerful proof strategy. Whenever one can prove the conditional

φ→ψ, then one can prove the modal conditional 2φ→2ψ as well, as follows.


First prove φ→ψ, then necessitate it to get 2(φ→ψ), then distribute the 2

over the arrow to get 2φ→2ψ. This procedure is one of the core K-strategies,

and is featured in the following proof of 2(P∧Q)→(2P∧2Q):

1. (P∧Q)→P PL

2. 2[(P∧Q)→P] 1, NEC

3. 2[(P∧Q)→P]→[2(P∧Q)→2P] K axiom

4. 2(P∧Q)→2P 2, 3, MP

5. 2(P∧Q)→2Q Insert steps similar to 1-4

6. 2(P∧Q)→(2P∧2Q) 4,5, PL

Notice that the preceding proof, like all of our proofs since we introduced

the time-saving shortcuts, is not a K-proof in the official defined sense. Lines 1,

5, and 6 are not axioms, nor do they follow from earlier lines by MP or NEC.6

So what kind of “proof” is it? It’s a metalanguage proof:

an attempt to convince the reader, by acceptable standards of rigor, that some

real K-proof exists. A reader could use this metalanguage proof as a blueprint

for constructing a real proof. She would begin by replacing line 1 with a proof

from the PL axioms of the conditional (P∧Q)→P . (As we know from chapter

??, this could be a real pain in the neck!—but the completeness of PL assures us

that it is possible.) She would then replace line 5 with lines parallel to lines 1-4,

but which begin with a proof of (P∧Q)→Q rather than (P∧Q)→P . Finally,

in place of line 6, she would insert a proof from the PL axioms of the sen-

tence (2(P∧Q)→2P )→[(2(P∧Q)→2Q)→(2(P∧Q)→(2P∧2Q))], and then

use modus ponens twice to infer 2(P∧Q)→(2P∧2Q).

Let’s introduce another time-saving shortcut, which we’ll use more and

more as we progress: doing two (or more) steps at once. This shortcut is

featured in the following proof of (2P∨2Q)→2(P∨Q):

1. P→(P∨Q) PL

2. 2(P→(P∨Q)) 1, NEC

3. 2(Q→(P∨Q)) PL, NEC

4. 2P→2(P∨Q) 2, K

5. 2Q→2(P∨Q) 3, K

6. (2P∨2Q)→2(P∨Q) 4,5 PL

6A further (even pickier) reason: the symbol ∧ isn’t allowed in wffs; the sentences in the

proof are mere abbreviations for official MPL-wffs.


Line 3 is really short for:

3a. Q→(P∨Q) PL

3b. 2(Q→(P∨Q)) 3a, NEC

And line 4 is short for:

4a. 2(P→(P∨Q))→(2P→2(P∨Q)) K axiom

4b. 2P→2(P∨Q) 2, 4a, MP

One further comment about this last proof: it illustrates a strategy that is

common in modal proofs. We were trying to prove a conditional formula

whose antecedent is a disjunction of two modal formulas. But the modal

techniques we had developed didn’t deliver formulas of this form. They only

showed us how to put 2s in front of PL-tautologies, and how to distribute 2s

over→s. They only yield formulas of the form 2φ and 2φ→2ψ, whereas the

formula we were trying to prove looks different. To overcome this problem,

what we did was to use the modal techniques to prove two conditionals, namely

2P→2(P∨Q) and 2Q→2(P∨Q), from which the desired formula, namely

(2P∨2Q)→2(P∨Q), follows by propositional logic. The trick, in general, is

this: remember that you have PL at your disposal. Simply look for one or

more modal formulas you know how to prove which, by PL, imply the formula

you want. Assemble the desired formulas, and then write down your desired

formula, annotating “PL”. In doing so, it may be helpful to recall PL inferences

like the following:

From φ→ψ and ψ→φ, infer φ↔ψ.

From φ→(ψ→χ), infer (φ∧ψ)→χ.

From φ→ψ and φ→χ, infer φ→(ψ∧χ).

From φ→χ and ψ→χ, infer (φ∨ψ)→χ.

From φ→ψ, infer ∼φ∨ψ.

From φ→∼ψ, infer ∼(φ∧ψ).

The next example illustrates our next major modal proof technique: com-

bining two 2 statements to get a single 2 statement. Let us construct a K-proof

of (2P∧2Q)→2(P∧Q):


1. P→(Q→(P∧Q)) PL

2. 2[P→(Q→(P∧Q))] 1, NEC

3. 2P→2(Q→(P∧Q)) 2, K

4. 2(Q→(P∧Q))→[2Q→2(P∧Q)] K axiom

5. 2P→[2Q→2(P∧Q)] 3,4 PL

6. (2P∧2Q)→2(P∧Q) 5, PL

(If you wanted to, you could skip step 5, and just go straight to 6 by propositional

logic, since 6 is a propositional logical consequence of 3 and 4; I put it in for

perspicuity.)

The general technique illustrated by the last problem applies anytime you

want to move from several 2 statements to a further 2 statement, where the inside parts of the first 2 statements imply the inside part of the final 2 statement.

More carefully: it applies whenever you want to prove a formula of the form

2φ1→(2φ2→·· · (2φn→2ψ) . . . ), provided you are able to prove the formula

φ1→(φ2→·· · (φn→ψ) . . . ). (The previous proof was an instance of this because

it involved moving from 2P and 2Q to 2(P∧Q); and this is a case where one

can move from the inside parts of the first two formulas (namely, P and Q), to

the inside part of the third formula (namely, P∧Q)—by PL.) To do this, one

begins by proving the conditional φ1→(φ2→·· · (φn→ψ) . . . ), necessitating it to

get 2[φ1→(φ2→·· · (φn→ψ) . . . )], and then distributing the 2 over the arrows

repeatedly using K-axioms and PL to get 2φ1→(2φ2→·· · (2φn→2ψ) . . . ).

One cautionary note in connection with this last proof. One might think to

make it more intuitive by using conditional proof:

1. 2P∧2Q assume for conditional proof

2. 2P 1, PL

3. 2Q 1, PL

4. P→(Q→(P∧Q)) PL

5. 2[P→(Q→(P∧Q))] NEC

6. 2P→2(Q→(P∧Q)) 5, K

7. 2(Q→(P∧Q)) 6,2, MP

8. 2Q→2(P∧Q) 7,K

9. 2(P∧Q) 3,8 MP

10. (2P∧2Q)→2(P∧Q) 1-9, conditional proof


But this is not a legal proof, since our axiomatic system allows neither assump-

tions nor conditional proof.

In fact, our decision to omit conditional proof was not at all arbitrary. Given

our rule of necessitation, we couldn’t add conditional proof to our system. If we

did, proofs like the following would become legal:

1. P assume for conditional proof

2. 2P 1, NEC

3. P→2P 1,2, conditional proof

Thus, P→2P would turn out to be a K-theorem. But we don’t want that: after

all, a statement P might be true without being necessarily true.

Once we have a soundness proof (section 6.5), we’ll be able to show that

P→2P isn’t a K-theorem. But as we just saw, one can construct a K-proof from {P} of 2P (recall the notion of a proof from a set Γ, from section 2.5.) It follows

that the deduction theorem (section 2.7), which says that if there exists a proof

of ψ from {φ}, then there exists a proof of φ→ψ, fails for K (it likewise fails for

all the modal systems we will consider.) So there will be no conditional proof

in our axiomatic modal systems. (Of course, to convince yourself that a given

formula is really a tautology of propositional logic, you may sketch a proof of it

to yourself using conditional proof in some standard natural deduction system

for nonmodal propositional logic; and then you may write that formula down

in one of our axiomatic MPL proofs, annotating “PL”.)

Back to techniques for constructing proofs in K. The following proof of

22(P∧Q)→22P illustrates a technique for proving formulas with “nested”

modal operators:

1. (P∧Q)→P PL

2. 2(P∧Q)→2P 1, NEC, K

3. 2[2(P∧Q)→2P] 2, NEC

4. 22(P∧Q)→22P 3, K

Notice in line 3 that we necessitated something that was not a PL theorem.

That’s ok; we’re allowed to necessitate any K-theorems, even those whose proofs

were distinctly modal. Notice also how this proof contains two instances of our

basic K-strategy. This strategy involves obtaining a conditional, necessitating

it, then distributing the 2 over the→. We did this first using the conditional


(P∧Q)→P ; that led us to a conditional, 2(P∧Q)→2P . Then we started the

strategy over again, using this as our initial conditional.

So far we have no techniques dealing with the 3, other than eliminating it

by definition. It will be convenient to derive some shortcuts. For one, there

are the following theorem schemas, which may collectively be called “modal

negation”, or “MN” for short:

⊢K ∼2φ→3∼φ        ⊢K 3∼φ→∼2φ

⊢K ∼3φ→2∼φ        ⊢K 2∼φ→∼3φ

I’ll do one of these; the rest can be done as exercises.

Example 6.4: Prove ∼2φ→3∼φ (one of the MN theorems):

1. ∼∼φ→φ PL

2. 2∼∼φ→2φ 1, NEC, K

3. ∼2φ→∼2∼∼φ 2, PL

The final line, 3, is the definitional equivalent of ∼2φ→3∼φ.

Exercise 6.3 Prove the remaining MN theorems.

It will also be worthwhile to know that an analog of the K axiom for the 3

is a K-theorem:

2(φ→ψ)→(3φ→3ψ) (K3)

K3 is, by definition of the 3, the same formula as:

2(φ→ψ)→(∼2∼φ→∼2∼ψ)

How are we going to construct a K-proof of this theorem? In a natural de-

duction system we would use conditional proof and reductio ad absurdum,

but these strategies are not available to us here. What we must do instead is

look for a formula we know how to prove in K, which is PL-equivalent to the

formula we want to prove. Here is such a formula:

2(φ→ψ)→(2∼ψ→2∼φ)


This is equivalent, given PL, to what we want to show; and it looks like the

result of necessitating a tautology and then distributing the 2 over the→ a

couple times—just the kind of thing we know how to do in K. Here, then, is

the desired proof of K3:

1. (φ→ψ)→ (∼ψ→∼φ) PL

2. 2(φ→ψ)→2(∼ψ→∼φ) 1, NEC, K

3. 2(∼ψ→∼φ)→(2∼ψ→2∼φ) K

4. 2(φ→ψ)→(2∼ψ→2∼φ) 2,3 PL

5. 2(φ→ψ)→(∼2∼φ→∼2∼ψ) 4,PL

In doing proofs, let’s also allow ourselves to refer to earlier theorems proved,

rather than repeating their proofs. The importance of K3 may be illustrated

by the following proof of 2P→(3Q→3(P∧Q)):

1. P→[Q→(P∧Q)] PL

2. 2P→2[Q→(P∧Q)] 1, NEC, K

3. 2[Q→(P∧Q)]→[3Q→3(P∧Q)] K3

4. 2P→[3Q→3(P∧Q)] 2,3, PL

In general, K3 lets us construct proofs of the following sort. Suppose we

wish to prove a formula of the form:

O1φ1→(O2φ2→(. . .→(Onφn→3ψ) . . .)

where the Oi s are modal operators, all but one of which are 2s. (Thus, the

remaining Oi is the 3.) This can be done, provided that ψ is provable in K from

the φi s. The basic strategy is to prove a nested conditional, the antecedents

of which are the φi s, and the consequent of which is ψ; necessitate it; then

repeatedly distribute the 2 over the→s, once using K3, the rest of the times

using K. But there is one catch. We need to make the application of K3 last, after all the applications of K. This in turn requires the conditional we use to

have the φi that is underneath the 3 as the last of the antecedents. For instance,

suppose that φ3 is the one underneath the 3. Thus, what we are trying to

prove is:

2φ1→(2φ2→(3φ3→(2φ4→(. . .→(2φn→3ψ) . . .)


In this case, the conditional to use would be:

φ1→(φ2→(φn→(φ4→(. . .→(φn−1→(φ3→ψ) . . .)

In other words, one must swap one of the other φi s (I arbitrarily chose φn)

with φ3. What one obtains at the end will therefore have the modal statements

out of order:

2φ1→(2φ2→(2φn→(2φ4→(. . .→(2φn−1→(3φ3→3ψ) . . .)

But that problem is easily solved; this is equivalent in PL to what we’re trying

to get. (Recall that φ→(ψ→χ ) is logically equivalent in PL to ψ→(φ→χ ).)

Why do we need to save K3 for last? The strategy of successively distribut-

ing the box over all the nested conditionals comes to a halt as soon as the K3

theorem is used. Let me illustrate with an example. Suppose we wish to prove

⊢K 3P→(2Q→3(P∧Q)). We might think to begin as follows:

1. P→(Q→(P∧Q)) PL

2. 2[P→(Q→(P∧Q))] 1, Nec

3. 3P→3(Q→(P∧Q)) K3, 2, MP

4. ?

But now what? What we need to finish the proof is:

3(Q→(P∧Q))→(2Q→3(P∧Q)).

But neither K nor K3 gets us this. The remedy is to begin the proof with a

different conditional:

1. Q→(P→(P∧Q)) PL

2. 2(Q→(P→(P∧Q))) 1, Nec

3. 2Q→2(P→(P∧Q)) K, 2, MP

4. 2(P→(P∧Q))→(3P→3(P∧Q)) K3

5. 2Q→(3P→3(P∧Q)) 3, 4, PL

6. 3P→(2Q→3(P∧Q)) 5, PL

One can, then, prove a number of theorems in K. Nevertheless, K is a

very weak system. You can’t prove the formula 2P→3P in K. (We’ll be able


to demonstrate this after section 6.5.) Relatedly, one can’t prove in K that

tautologies are possible or that contradictions aren’t necessary.
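The semantic side of this weakness is easy to exhibit: a K-model may contain a “dead end” world that sees no world at all. The Python sketch below (with a hypothetical tuple encoding and evaluator `holds`, neither taken from the text) evaluates the relevant formulas in such a model. It shows only that the formulas fail in some K-model; the further step to unprovability in K depends on the soundness result that the text defers to section 6.5.

```python
def holds(f, w, R, I):
    """Truth of formula f at world w; R is a set of accessibility pairs,
    I is the set of (letter, world) pairs at which a sentence letter is true."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in I
    if op == "not":
        return not holds(f[1], w, R, I)
    if op == "and":
        return holds(f[1], w, R, I) and holds(f[2], w, R, I)
    if op == "or":
        return holds(f[1], w, R, I) or holds(f[2], w, R, I)
    if op == "imp":
        return (not holds(f[1], w, R, I)) or holds(f[2], w, R, I)
    if op == "box":
        return all(holds(f[1], v, R, I) for (u, v) in R if u == w)
    if op == "dia":
        return any(holds(f[1], v, R, I) for (u, v) in R if u == w)
    raise ValueError(op)

P = ("atom", "P")
R = set()   # the single world r sees nothing, not even itself
I = set()   # P is false at r (its value does not matter here)

d_instance = ("imp", ("box", P), ("dia", P))          # 2P -> 3P
tautology_possible = ("dia", ("or", P, ("not", P)))   # 3(P v ~P)
contradiction_necessary = ("box", ("and", P, ("not", P)))  # 2(P & ~P)

d_holds = holds(d_instance, "r", R, I)
taut_holds = holds(tautology_possible, "r", R, I)
contra_holds = holds(contradiction_necessary, "r", R, I)
```

At the dead-end world every 2 formula is vacuously true and every 3 formula is false, so 2P→3P and 3(P∨∼P) come out false while 2(P∧∼P) comes out true.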

Exercise 6.4 Give axiomatic proofs in K of the following formulas:

a) 3(P∧Q)→(3P∧3Q)

b) 2∼P→2(P→Q)

c) ∼3(Q∧R)↔2(Q→∼R)

d) 2(P↔Q)→(2P↔2Q)

e) [2(P→Q)∧2(P→∼Q)]→∼3P

f) (2P∧2Q)→2(P↔Q)

g) 3(P→Q)↔(2P→3Q)

h) 3P→(2Q→3Q)

6.4.2 System D

System D results from adding a new axiom schema to system K:

Axiomatic system D:

· Rules: MP, NEC

· Axioms: the A1, A2, A3, and K schemas, plus the D-schema:

2φ→3φ (D)

Notice that since system D includes all the K axioms and rules, we retain all

the K-theorems. The addition of the D-schema just adds more theorems. In

fact, all of our systems will build on K in this way, by adding new axioms to K.

With the D-schema in place, we can now prove that tautologies are possible:

1. P∨∼P PL

2. 2(P∨∼P ) 1, NEC

3. 2(P∨∼P )→3(P∨∼P ) D

4. 3(P∨∼P ) 2,3 MP


Example 6.5: Show that ⊢D 22P→23P.

1. 2P→3P D

2. 2(2P→3P ) 1, NEC

3. 22P→23P 2, K

Like K, system D is very weak. As we will see later, we can’t prove 2φ→φ in D. Therefore, D doesn’t seem to be a correct logic for metaphysical, or

nomic, or technological necessity, for surely, if something is metaphysically,

nomically, or technologically necessary, then it must be true. (If something is

true in all metaphysically possible worlds, or all nomically possible worlds, or

all technologically possible worlds, then surely it must be true in the actual

world, and so must be plain old true.) But perhaps there is some interest in

D anyway; perhaps D is a correct logic for moral necessity. Suppose we read

2φ as “One ought to make φ be the case”, and, correspondingly, read 3φ as

“One is permitted to make φ be the case”. Then the fact that 2φ→φ cannot

be proved in D would be a virtue, for from the fact that something ought to be

done, it certainly doesn’t follow that it is done. The D-axiom, on the other

hand, would correspond to the principle that if something ought to be done

then it is permitted to be done, which does seem like a logical truth. But I won’t

go any further into the question of whether D in fact does give a correct logic

for moral necessity.

Exercise 6.5 Give axiomatic proofs in D of the following formulas:

a) ∼(2P∧2∼P )

b) ∼2[2(P∧Q)∧2(P→∼Q)]

6.4.3 System T

T is the first system we have considered that has any plausibility of being a

correct logic for a wide range of concepts of necessity (metaphysical necessity,

for example):

Axiomatic system T:

· Rules: MP, NEC


· Axioms: the A1, A2, A3, and K schemas, plus the T-schema:

2φ→φ (T)

Recall that in the case of K, we proved a theorem schema, K3, which was

the analog for the 3 of the K-axiom schema. Let’s do the same thing here; let’s

prove a theorem schema T3, which is the analog for the 3 of the T axiom

schema:

T3: φ→3φ

1. 2∼φ→∼φ T

2. φ→∼2∼φ 1, PL

Line 2 is, by the definition of the 3, just φ→3φ. Thus, we have established that for every

wff φ, ⊢T φ→3φ. So let’s allow ourselves to write down formulas of the form

φ→3φ, annotating simply “T3”.

Notice that instances of the D-axioms are now theorems: 2φ→φ is a T

axiom, we just proved that φ→3φ is a theorem; and from these two by PL we

can prove 2φ→3φ. Thus, T is an extension of D: every theorem of D remains

a theorem of T. (Since D was an extension of K, T too is an extension of K.)

Exercise 6.6 Give axiomatic proofs in T of the following formulas:

a) 32P→3(P∨Q)

b) [2P∧32(P→Q)]→3Q

c) 3(P→2Q)→(2P→3Q)

6.4.4 System B

Our systems so far don’t allow us to prove anything interesting about iterated modalities, i.e., sentences with consecutive boxes or diamonds. Which such

sentences should be theorems? The B axiom schema decides some of these

questions for us; here is system B:

Axiomatic system B:


· Rules: MP, NEC

· Axioms: the A1, A2, A3, K, and T schemas, plus the B-schema:

32φ→φ (B)

Note that we retain the T axiom schema in B. Thus, B is an extension of T

(and hence of K and D as well.)

As with K and T, we can establish a theorem schema that is the analog for

the 3 of B’s characteristic axiom schema. (The proof illustrates techniques for

“moving” ∼s through strings of modal operators.)

B3: φ→23φ:

1. 32∼φ→∼φ B

2. φ→∼32∼φ 1, PL

3. ∼32∼φ↔2∼2∼φ MN

4. ∼2∼φ→3φ PL (since 3 abbreviates ∼2∼)

5. 2∼2∼φ→23φ 4, NEC, K, MP

6. φ→23φ 2, 3, 5, PL

Example 6.6: Show that ⊢B [2P∧232(P→Q)]→2Q.

1. 32(P→Q)→(P→Q) B

2. 232(P→Q)→2(P→Q) 1, Nec, K, MP

3. 2(P→Q)→(2P→2Q) K

4. 232(P→Q)→(2P→2Q) 2, 3 PL

5. [2P∧232(P→Q)]→2Q 4, PL

Exercise 6.7 Give axiomatic proofs in B of the following formulas:

a) 32P↔3232P

b) [2P∧232(P→Q)]→2Q


6.4.5 System S4

The characteristic axiom of our next system, S4, is a different principle governing iterated modalities:

Axiomatic system S4:

· Rules: MP, NEC

· Axioms: the A1, A2, A3, K, and T schemas, plus the S4-schema:

2φ→22φ (S4)

S4 contains the S4-schema but does not contain the B-schema. Symmetri-

cally, B lacks the S4-schema, but of course contains the B-schema. As a result,

some instances of the B-schema are not provable in S4, and some instances of

the S4-schema are not provable in B (we’ll be able to show this after section

6.5). Hence, although S4 and B are each extensions of T, neither B nor S4 is

an extension of the other.

As before, we have a theorem schema that is the analog for the 3 of the S4

axiom schema:

S43: 33φ→3φ:

1. 2∼φ→22∼φ S4

2. 2∼φ→∼3φ MN

3. 22∼φ→2∼3φ 2, NEC, K, MP

4. 2∼3φ→∼33φ MN

5. ∼3φ→2∼φ MN

6. 33φ→3φ 5,1,3,4, PL

Example 6.7: Show that `S4 (3P∧2Q)→3(P∧2Q). This problem is reasonably difficult. My approach is as follows. We saw in the K section above that the following sort of thing may always be proved: 2φ→(3ψ→3χ), whenever the conditional φ→(ψ→χ) can be proved. So we need to try to work the problem into this form. As-is, the problem doesn’t quite have this form. But something very related does have this form, namely: 22Q→(3P→3(P∧2Q)) (since the conditional 2Q→(P→(P∧2Q)) is a tautology). This thought inspires the following proof:


1. 2Q→(P→(P∧2Q)) PL

2. 22Q→2(P→(P∧2Q)) 1, Nec, K, MP

3. 2(P→(P∧2Q))→(3P→3(P∧2Q)) K3

4. 22Q→(3P→3(P∧2Q)) 2, 3 PL

5. 2Q→22Q S4

6. (3P∧2Q)→3(P∧2Q) 4, 5 PL

Exercise 6.8 Give axiomatic proofs in S4 of the following formulas:

a) 2P→232P

b) 2323P→23P

c) 32P→3232P

6.4.6 System S5

Here, instead of the B or S4 schemas, we add the S5 schema to T:

Axiomatic system S5:

· Rules: MP, NEC

· Axioms: the A1, A2, A3, K, and T schemas, plus the S5-schema:

32φ→2φ (S5)

First let’s prove the analog of the S5-schema for the 3:

S53: 3φ→23φ

1. 32∼φ→2∼φ S5

2. ∼3φ→2∼φ MN

3. 3∼3φ→32∼φ 2, NEC, K3, MP

4. ∼23φ→3∼3φ MN

5. 2∼φ→∼3φ MN

6. 3φ→23φ 4,3,1,5, PL


Next, note that the B and S4 axioms are now derivable as theorems. The B

axiom, 32φ→φ, is trivial:

1. 32φ→2φ S5

2. 2φ→φ T

3. 32φ→φ 1,2 PL

And now the S4 axiom, 2φ→22φ. This is a little harder. I used the B3

theorem, which we can now appeal to since the theoremhood of the B-schema

has been established.

1. 2φ→232φ B3

2. 32φ→2φ S5

3. 2(32φ→2φ) 2, Nec

4. 232φ→22φ 3, K, MP

5. 2φ→22φ 4, 1, PL

Exercise 6.9 Give axiomatic proofs in S5 of the following formulas:

a) (2P∨3Q)↔2(P∨3Q)

b) 3(P∧3Q)↔(3P∧3Q)

c) 2(2P→2Q)∨2(2Q→2P )

d) 2[2(3P→Q)↔2(P→2Q)]

6.4.7 Substitution of equivalents and modal reduction

Let’s conclude our discussion of provability in modal logic by proving two simple meta-theorems.

Substitution of equivalents: Where S is any of our modal systems, and wff χ^β results from wff χ by changing occurrences of wff α to occurrences of wff β:

if `S α↔β, then `S χ↔χ^β

Proof. Suppose (*) `S α↔β. We’ll argue by induction that `S χ↔χ^β.

Base case: here χ is a sentence letter. Then either i) changing αs to βs has no effect, in which case χ^β is just χ, in which case obviously `S χ↔χ^β; or ii) χ is α, in which case χ^β is β, and we know `S χ↔χ^β since we are given (*).

Induction case: We now assume the result holds for some formulas χ1 and χ2 (that is, we assume that `S χ1↔χ1^β and `S χ2↔χ2^β) and we show the result holds for ∼χ1, χ1→χ2, and 2χ1.

Take the first case. We must show that the result holds for ∼χ1, i.e., we must show that `S ∼χ1↔(∼χ1)^β. (∼χ1)^β is just ∼(χ1^β), so we must show `S ∼χ1↔∼(χ1^β). But ∼χ1↔∼(χ1^β) follows by PL from χ1↔χ1^β, and the inductive hypothesis tells us that `S χ1↔χ1^β.

Take the second case. We must show `S (χ1→χ2)↔(χ1→χ2)^β. The inductive hypothesis tells us that `S χ1↔χ1^β, and so (since S includes PL):

`S (χ1→χ2)↔(χ1^β→χ2)

The inductive hypothesis also tells us that `S χ2↔χ2^β, from which, using propositional logic in S, we obtain:

`S (χ1^β→χ2)↔(χ1^β→χ2^β)

Now from the two displayed equivalences, again using propositional logic in S, we have:

`S (χ1→χ2)↔(χ1^β→χ2^β)

But note that (χ1^β→χ2^β) is just the same formula as (χ1→χ2)^β. So we’ve shown what we wanted to show.

Finally, take the third case. We must show that `S 2χ1↔2(χ1^β). This follows from the inductive hypothesis `S χ1↔χ1^β. For the inductive hypothesis implies `S χ1→χ1^β, by PL; and then, using NEC and a K-axiom, we have `S 2χ1→2(χ1^β). A parallel argument establishes `S 2(χ1^β)→2χ1; and then the desired conclusion follows by using PL. That completes the inductive proof.

The following examples illustrate the power of substitution of equivalents.

In our discussion of K we proved the following two theorems:

2(P∧Q)→(2P∧2Q)

(2P∧2Q)→2(P∧Q)

Hence (by PL), 2(P∧Q)↔(2P∧2Q) is a K-theorem. Given substitution of equivalents, whenever we prove a theorem in which the formula 2(P∧Q) occurs as a subformula, we can infer that the result of changing 2(P∧Q) to 2P∧2Q is also a K-theorem, without having to do a separate proof.

Similarly, given the modal negation theorems, we know that all instances

of the following schemas are theorems of K (and hence of every other system):

2∼φ↔∼3φ

3∼φ↔∼2φ

Call these “the duals equivalences”.7

Given the duals equivalences, we can swap

∼3φ and 2∼φ, or ∼2φ and 3∼φ, within any theorem of any system, and

the result will also be a theorem of that system. So we can “move” ∼s through

series of modal operators at will. For example, it’s easy to show that each of the

following is a theorem of each system S:

332∼φ↔332∼φ (1)

33∼3φ↔332∼φ (2)

3∼23φ↔332∼φ (3)

∼223φ↔332∼φ (4)

(1) is a theorem of S, since it has the form ψ↔ψ. (2) is the result of changing 2∼φ on the left of (1) to ∼3φ. Since (1) is a theorem of S, (2) is also a theorem of S, by substitution of equivalents via a duals equivalence. We then obtain (3) by changing 3∼3φ in (2) to ∼23φ; by substitution of equivalents via a duals equivalence, this too is a theorem of S. Finally, (4) follows from (3) and a MN theorem by PL, so it too is a theorem of S. (Note how this sort of technique greatly simplifies the process of establishing the existence of theorems such as K3, T3, B3, S43, and S53!)
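The ∼-moving technique just described can be sketched in code. This is only an illustration under stated assumptions, not part of the formal apparatus: a string of operators is represented as a list of the hypothetical token names "not", "box", and "dia" (for ∼, the 2, and the 3), and each rewriting step applies one duals equivalence, turning ∼2 into 3∼ or ∼3 into 2∼.

```python
# Hypothetical token names: "not" for the negation sign, "box" and "dia" for the
# two modal operators. DUAL records which operator each one turns into when a
# negation moves past it (the duals equivalences).
DUAL = {"box": "dia", "dia": "box"}


def push_negations(prefix):
    """Move every "not" rightward past modal operators, one duals step at a time."""
    ops = list(prefix)
    changed = True
    while changed:
        changed = False
        for i in range(len(ops) - 1):
            if ops[i] == "not" and ops[i + 1] in DUAL:
                # not-box becomes dia-not; not-dia becomes box-not
                ops[i], ops[i + 1] = DUAL[ops[i + 1]], "not"
                changed = True
                break
    return ops


# The operator string on the left-hand side of (4): not, box, box, dia
left_of_4 = ["not", "box", "box", "dia"]
print(push_negations(left_of_4))
```

Applied to the left-hand side of (4), the function retraces steps (4), (3), (2), (1) in reverse and arrives at the operator string on the right-hand side. The function only rewrites operator strings; that each step preserves theoremhood is what substitution of equivalents guarantees.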

Our second meta-theorem concerns only system S5:

Modal reduction theorem for S5: Where O1…On are modal operators and φ is a wff:

`S5 O1…Onφ↔Onφ

7Given the duals equivalences, the 2 is related to the 3 the way the ∀ is related to the

∃ (since ∀x∼φ↔∼∃xφ, and ∃x∼φ↔∼∀xφ are logical truths). This shared relationship,

which holds between the 2 and the 3, and between the ∀ and the ∃, is called “duality”; 2 and

3 are said to be duals, as are ∀ and ∃. This logical analogy would be neatly explained by a

metaphysics according to which necessity just is truth in all worlds and possibility just is truth

in some worlds!


That is, whenever a formula has a string of modal operators in front, it is always equivalent to the result of deleting all the modal operators except the innermost one. For example, 223232232323φ and 3φ are provably equivalent in S5 (i.e., 223232232323φ↔3φ is a theorem of S5). This follows from the fact that the following equivalences are all theorems of S5:

32φ↔2φ (a)

22φ↔2φ (b)

23φ↔3φ (c)

33φ↔3φ (d)

The left-to-right direction of (a) is just S5; the right-to-left is T3; (b) is T and

S4; (c) is T and S53; and (d) is S43 and T3. Thus, by repeated applications

of these equivalences, using substitution of equivalents, we can reduce strings

of modal operators to the innermost operator. (It is straightforward to convert

this argument into a more rigorous inductive proof.)8
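The repeated application of (a) through (d) can be sketched as a small rewriting procedure. This is a sketch under assumptions: operator strings are lists of the hypothetical token names "box" and "dia", and each step collapses the two outermost operators into the inner one of the pair, which is what each of (a) through (d) licenses.

```python
# Equivalences (a)-(d): a pair of adjacent modal operators is S5-equivalent
# to the inner operator of the pair. Token names "box"/"dia" are illustrative.
REDUCTIONS = {
    ("dia", "box"): "box",  # (a)
    ("box", "box"): "box",  # (b)
    ("box", "dia"): "dia",  # (c)
    ("dia", "dia"): "dia",  # (d)
}


def reduce_s5(prefix):
    """Collapse a string of modal operators, recording every intermediate string."""
    ops = list(prefix)
    steps = [tuple(ops)]
    while len(ops) > 1:
        # Replace the two outermost operators by the inner one of the pair.
        ops[0:2] = [REDUCTIONS[(ops[0], ops[1])]]
        steps.append(tuple(ops))
    return ops, steps


# The twelve-operator string from the example in the text.
example = ["box", "box", "dia", "box", "dia", "box",
           "box", "dia", "box", "dia", "box", "dia"]
reduced, steps = reduce_s5(example)
print(reduced, len(steps) - 1)
```

Each recorded step corresponds to one provable equivalence, so chaining them by PL yields the equivalence between the original string and its innermost operator. The count of steps is one less than the number of operators, which is the quantity an inductive proof would induct on.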

6.5 Soundness in MPL9

At this point, we have defined twelve logical systems: six semantic systems and six axiomatic systems. But each semantic system was paired with an axiomatic system to which we gave the same name. The time has come to justify this pairing. In this section and the next, we show that for each semantic system, exactly the same wffs are counted valid in that system as are counted theorems by the axiomatic system of the same name. That is, for each of our systems, S (for S = K, D, T, B, S4, and S5), we will prove soundness and completeness:

S-soundness: every S-theorem is S-valid

S-completeness: every S-valid formula is an S-theorem

8The modal reduction formula, the duals equivalences, and substitution of equivalents

together let us “reduce” strings of operators that include ∼s as well as modal operators. Simply

use the duals equivalents to drive any ∼s in the string to the far right hand side, then use the

modal reduction theorem to eliminate all but the innermost modal operator.

9The proofs of soundness and completeness in this and the next section are from Cresswell

and Hughes (1996).


Our study of modal logic has reversed the order of history. We began with semantics, because that is the more intuitive approach. Historically (as we noted earlier), the axiomatic systems came first, in the work of C. I. Lewis. Given the uncertainty over what formulas ought to be counted as axioms, modal logic was in disarray. The discovery by the teenaged Saul Kripke in the late 1950s of the possible-worlds semantics we studied in section 6.3, and of the correspondence between simple constraints (reflexivity, transitivity, etc.) on the accessibility relation in his models and Lewis’s axiomatic systems, was a major advance in the history of modal logic.

The soundness and completeness theorems have practical as well as theoretical value. First, once we’ve proved soundness, we will for the first time have a method for establishing that a given formula is not a theorem: construct a countermodel for that formula, thus establishing that the formula is not valid, and then conclude via soundness that the formula is not a theorem. Second, given completeness, if we want to know that a given formula is a theorem, it suffices to show that it is valid. Since semantic validity proofs are comparatively easy to construct, it’s nice to be able to use them rather than axiomatic proofs.
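The countermodel method can be made concrete with a small evaluator for finite MPL-models. The following is a minimal sketch, not the text's official definition: formulas are nested tuples with the hypothetical tags "atom", "not", "imp", and "box", the accessibility relation is a set of pairs, and the valuation maps each sentence letter to the set of worlds where it is true.

```python
def holds(f, w, R, V):
    """Truth of formula f at world w, given accessibility pairs R and valuation V."""
    tag = f[0]
    if tag == "atom":
        return w in V[f[1]]
    if tag == "not":
        return not holds(f[1], w, R, V)
    if tag == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if tag == "box":
        # True at w iff the embedded formula is true at every world accessible from w.
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown tag: {tag}")


P = ("atom", "P")
t_instance = ("imp", ("box", P), P)   # an instance of the T-schema

# A two-world model whose accessibility relation is not reflexive.
W = [0, 1]
R = {(0, 1)}
V = {"P": {1}}

falsified_at = [w for w in W if not holds(t_instance, w, R, V)]
print(falsified_at)
```

World 0 sees only world 1, where P is true, so the boxed formula is true at 0 while P itself is false there. The instance is therefore not valid in this MPL-model, and once the soundness of K is in hand it follows that this instance of the T-schema is not a theorem of K.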

Let’s begin with soundness. We’re going to prove a general theorem, which

we’ll use in several soundness proofs. First we’ll need a piece of terminology.

Where Γ is any set of modal wffs, let’s call “K+Γ” the axiomatic system that

consists of the same rules of inference as K (MP and NEC), and which has

as axioms the axioms of K (instances of the K- and PL- schemas), plus the

members of Γ. Here, then, is the theorem:

Theorem 6.1 If Γ is any set of modal wffs and M is an MPL-model in which each wff in Γ is valid, then every theorem of K+Γ is valid in M.

Modal systems of the form K+Γ are commonly called normal. Normal modal systems contain all the K-theorems, plus possibly more. What Theorem 6.1 gives us is a method for constructing a soundness proof for any normal system. Since all the systems we have studied here (K, D, etc.) are normal, this method is sufficiently general for us. Here’s how the method works for system T. System T has the same rules of inference as K, and its axioms are all the axioms of K, plus the instances of the T-schema. In the “K+Γ” notation, T = K + {2φ→φ : φ is an MPL wff}. To establish soundness for T, all we need to do is show that every instance of the T-schema is valid in all reflexive models; for we may then conclude by Theorem 6.1 that every theorem of T is valid in all reflexive models. This method can be applied to each of our systems: for any system, S, to establish S’s soundness it will suffice to show that S’s “extra-K” axioms are valid in all of the S-models.

Theorem 6.1 follows from two lemmas we will need to prove:

Lemma 6.2 All PL and K-axioms are valid in all MPL-models

Lemma 6.3 For every MPL-model M, MP and Necessitation preserve validity in M

Proof of Theorem 6.1 from the lemmas. Assume that every wff in Γ is valid in a given MPL-model M, and consider any theorem φ of K+Γ. That theorem is the last line of a proof in which each line is either an axiom of K+Γ, or follows from earlier lines in the proof by MP or NEC. But axioms of K+Γ are either PL axioms, K axioms, or members of Γ. The first two classes of axioms are valid in all MPL-models, by Lemma 6.2, and so are valid in M; and the axioms in the final class are valid in M by hypothesis. Thus, all axioms in the proof are valid in M. Moreover, by Lemma 6.3, the rules of inference in the proof preserve validity in M. Therefore, by induction, every line in the proof is valid in M. Hence the last line in the proof, φ, is valid in M.

We now need to prove the lemmas.

Proof of Lemma 6.2. From our proof of soundness for PL (section 2.6), we know that the PL truth tables generate the value 1 for each PL axiom, no matter what truth values its immediate constituents have. But here in MPL, the truth values of conditionals and negations are determined at a given world by the truth values at that world of their immediate constituents via the PL truth tables. So any PL axiom must have truth value 1 at any world, regardless of what truth values its immediate constituents have. PL-axioms, therefore, are true at every world in every model, and so are valid in every model. We need now to show that any K axiom (i.e., any formula of the form 2(φ→ψ)→(2φ→2ψ)) is valid in any model:

i) Suppose for reductio that V(2(φ→ψ)→(2φ→2ψ), w) = 0, for some model ⟨W, R, I⟩, whose valuation is V, and some w ∈ W

ii) So V(2(φ→ψ), w) = 1 and…

iii) …V((2φ→2ψ), w) = 0

iv) Given iii), V(2φ, w) = 1 and…

v) …V(2ψ, w) = 0

vi) Given v), for some v, Rwv and V(ψ, v) = 0

vii) Given iv), since Rwv, V(φ, v) = 1

viii) Given ii), since Rwv, V(φ→ψ, v) = 1

ix) Lines vi), vii), and viii) contradict, given the truth condition for the →
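As a sanity check on the reductio above (not a substitute for it), one can enumerate every MPL-model on a two-element set of worlds and confirm that a sample K axiom is true at every world of each. The representation is an assumption of this sketch: formulas are nested tuples tagged "atom", "imp", and "box", and `subsets` is a hypothetical helper that lists all subsets of a collection.

```python
from itertools import chain, combinations


def holds(f, w, R, V):
    """Truth of formula f at world w, for accessibility pairs R and valuation V."""
    tag = f[0]
    if tag == "atom":
        return w in V[f[1]]
    if tag == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if tag == "box":
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown tag: {tag}")


def subsets(items):
    """All subsets of items, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


W = [0, 1]
PAIRS = [(u, v) for u in W for v in W]
P, Q = ("atom", "P"), ("atom", "Q")
# The K axiom with P for phi and Q for psi.
k_axiom = ("imp", ("box", ("imp", P, Q)), ("imp", ("box", P), ("box", Q)))


def k_axiom_valid_in_all_two_world_models():
    for R in subsets(PAIRS):              # all 16 accessibility relations
        for p_ext in subsets(W):          # all extensions of P
            for q_ext in subsets(W):      # all extensions of Q
                V = {"P": set(p_ext), "Q": set(q_ext)}
                if not all(holds(k_axiom, w, set(R), V) for w in W):
                    return False
    return True


print(k_axiom_valid_in_all_two_world_models())
```

The enumeration covers 256 models and places no constraint on the accessibility relation, matching the lemma's claim about all MPL-models. A finite search of this kind can only fail to find a countermodel; the general claim still rests on the argument in lines i) through ix).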

Proof of Lemma 6.3. We must show that the rules MP and NEC preserve validity in any given model. That is, we must show that if the inputs to one of these rules are valid in some model, then that rule’s output must also be valid in that model.

First MP. Let φ and φ→ψ be valid in model ⟨W, R, I⟩; we must show that ψ is also valid in that model. That is, where V is this model’s valuation, and w is any member of W, we must show that V(ψ, w) = 1. Since φ and φ→ψ are valid in this model, V(φ→ψ, w) = 1, and V(φ, w) = 1; but then by the truth condition for →, V(ψ, w) must also be 1.

Next NEC. Suppose φ is valid in model M. We must show that 2φ is valid in M, i.e., that 2φ is true at each world in M, i.e., that for each world, w, φ is true at every world accessible from w. But since φ is valid in M, φ is true in every world in M, and hence is true at every world accessible from w.

6.5.1 Soundness of K

We can now construct soundness proofs for the individual systems. I’ll do this for some of the systems, and leave the verification of soundness for the other systems as exercises.

First K. In the “K+Γ” notation, K is just K+∅, and so it follows immediately from Theorem 6.1 that every theorem of K is valid in every MPL-model. So K is sound.

6.5.2 Soundness of T

T is K+Γ, where Γ is the set of all instances of the T-schema. So, given Theorem 6.1, to show that every theorem of T is valid in all T-models, it suffices to show that all instances of the T-schema are valid in all T-models:

i) Assume for reductio that V(2φ→φ, w) = 0 for some world w in some T-model (i.e., some model with a reflexive accessibility relation)

ii) So V(2φ, w) = 1 and…

iii) …V(φ, w) = 0

iv) Rww, by reflexivity. So, from ii), V(φ, w) = 1, contradicting iii)
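The role reflexivity plays in step iv) can be seen by brute force on small frames. The sketch below makes several assumptions: worlds are the numbers 0 and 1, formulas are nested tuples tagged "atom", "imp", and "box", and `subsets` and `frame_validates` are hypothetical helper names. It checks a single instance of the T-schema (with the sentence letter P) against every accessibility relation on two worlds and every valuation of P.

```python
from itertools import chain, combinations


def holds(f, w, R, V):
    """Truth of formula f at world w, for accessibility pairs R and valuation V."""
    tag = f[0]
    if tag == "atom":
        return w in V[f[1]]
    if tag == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if tag == "box":
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown tag: {tag}")


def subsets(items):
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


W = [0, 1]
PAIRS = [(u, v) for u in W for v in W]
P = ("atom", "P")
t_instance = ("imp", ("box", P), P)


def frame_validates(f, R):
    """True iff f is true at every world under every valuation of P on this frame."""
    return all(
        holds(f, w, R, {"P": set(ext)})
        for ext in subsets(W)
        for w in W
    )


frames = [set(R) for R in subsets(PAIRS)]
reflexive = [R for R in frames if all((w, w) in R for w in W)]
validating = [R for R in frames if frame_validates(t_instance, R)]

print(len(reflexive), len(validating))
```

Every reflexive frame passes, which is the direction the soundness argument needs. On these two-world frames the check also fails on every non-reflexive relation, but that converse is an observation about this small search space and is not something the soundness proof relies on.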

6.5.3 Soundness of B

B is K+Γ, where Γ is the set of all instances of the T- and B-schemas. Given Theorem 6.1, it suffices to show that every instance of the B-schema and every instance of the T-schema is valid in every B-model. So, choose an arbitrary model with a reflexive and symmetric accessibility relation, whose valuation is V, and let w be any world in that model. We must show that V counts each instance of the T-schema and the B-schema as being true at w. The proof of the previous section shows that the T-axioms are true at w. Now for the B-axioms:

i) Assume for reductio that V(32φ→φ, w) = 0.

ii) So V(32φ, w) = 1 and…

iii) …V(φ, w) = 0

iv) By ii), V(2φ, v) = 1, for some v such that Rwv.

v) By symmetry, Rvw. So, given iv), V(φ, w) = 1, contradicting iii)
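With B's soundness established, the earlier claim that some instances of the S4-schema are not provable in B can be illustrated with a concrete B-model. The model below is a hypothetical example chosen for this sketch: three worlds, a reflexive and symmetric but non-transitive accessibility relation, and P true at worlds 0 and 1 only. Formulas are nested tuples, with the 3 unpacked by its definition in terms of ∼ and the 2.

```python
def holds(f, w, R, V):
    """Truth of formula f at world w, for accessibility pairs R and valuation V."""
    tag = f[0]
    if tag == "atom":
        return w in V[f[1]]
    if tag == "not":
        return not holds(f[1], w, R, V)
    if tag == "imp":
        return (not holds(f[1], w, R, V)) or holds(f[2], w, R, V)
    if tag == "box":
        return all(holds(f[1], v, R, V) for (u, v) in R if u == w)
    raise ValueError(f"unknown tag: {tag}")


def box(f):
    return ("box", f)


def dia(f):
    # The diamond is an abbreviation: not-box-not.
    return ("not", ("box", ("not", f)))


P = ("atom", "P")
W = [0, 1, 2]
R = {(w, w) for w in W} | {(0, 1), (1, 0), (1, 2), (2, 1)}
V = {"P": {0, 1}}

is_reflexive = all((w, w) in R for w in W)
is_symmetric = all((v, u) in R for (u, v) in R)
is_transitive = all((u, x) in R for (u, v) in R for (y, x) in R if v == y)

s4_instance = ("imp", box(P), box(box(P)))
b_instance = ("imp", dia(box(P)), P)

print(is_reflexive, is_symmetric, is_transitive)
print([w for w in W if not holds(s4_instance, w, R, V)])
print(all(holds(b_instance, w, R, V) for w in W))
```

At world 0 the boxed P is true (worlds 0 and 1 are all it sees), but world 1 also sees world 2, where P is false, so the doubly boxed P fails at 0. The model is a B-model that falsifies an instance of the S4-schema, so by the soundness of B that instance is not a B-theorem.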

6.6 Completeness of MPL

Next, completeness: for each system, we’ll show that every valid formula is a theorem. As with soundness, most of the work will go into developing some general-purpose machinery. At the end we’ll use the machinery to construct completeness proofs for each system.

We’ll be constructing a kind of completeness proof known as a “Henkin-proof”, after Leon Henkin, who used similar methods to demonstrate completeness for (nonmodal) predicate logic.


6.6.1 Canonical models

For each of our systems, we’re going to show how to construct a certain special model, the canonical model for that system. The canonical model for a system, S, will be shown to have the following feature:

If a formula is valid in the canonical model for S, then it is a theorem of S

This sufficient condition for theoremhood can then be used to give completeness proofs, as the following example brings out. Suppose we can demonstrate that the accessibility relation in the canonical model for T is reflexive. Then, since T-valid formulas are by definition true in every world in every model with a reflexive accessibility relation, we know that every T-valid formula is valid in the canonical model for T. But then the displayed statement tells us that every T-valid formula is a theorem of T. So we would have established completeness for T.

The trick for constructing canonical models will be to let the worlds in these models be sets of formulas (remember, worlds are allowed to be anything we like). And we’re going to construct the interpretation function of the canonical model in such a way that a formula will be true at a world iff the formula is a member of the set that is the world. Working out this idea will occupy us for a while.

6.6.2 Maximal consistent sets of wffs

To carry out this idea of constructing worlds as sets of formulas that are true at those worlds, we’ll need to put some constraints on the nature of these sets of wffs. It’s part of the definition of a valuation function that for any wff φ and any world w, either φ or ∼φ is true at w. That means that any set of wffs that we’re going to call a world had better contain either φ or ∼φ. Moreover, we’d better not let such a set contain both φ and ∼φ, since a formula can’t be both true and false at a world. Other constraints must be introduced as well.

Where S is any of our axiomatic systems, let’s define the following notions:

Definition of consistency and maximality:

· A set of MPL-wffs, Γ, is S-inconsistent iff for some φ1…φn ∈ Γ, `S ∼(φ1∧···∧φn). Γ is S-consistent iff it is not S-inconsistent

· A set of MPL-wffs, Γ, is maximal iff for every MPL-wff φ, either φ or ∼φ is a member of Γ

· A set is maximal S-consistent iff it is both maximal and S-consistent

A set is S-inconsistent if it contains some wffs (finite in number) that are provably (in S) contradictory; it is S-consistent if it contains no such wffs. This notion of consistency is proof-theoretic: it has to do with what can be proved in axiomatic systems, not with truth in models. Furthermore, S-consistency has to do with provability in system S. It therefore requires more than the mere absence of contradictions. Thus consider a set of wffs that contains no contradictions (i.e., for no φ does the set contain both φ and ∼φ), but which contains both 2P and ∼P. This set would be T-inconsistent, since ∼(2P∧∼P) is a theorem of T.

A maximal S-consistent set of wffs contains, for each formula, either that formula or its negation; and it contains no finite list of formulas that are collectively disprovable in S. Maximal consistent sets are fit sets to be worlds in our canonical models.

6.6.3 Definition of canonical models

We’re now ready to define canonical models. It may not be fully clear at this point why the definition is phrased as it is; you’ll need to take it on faith, for the moment, that the definition will get us where we want to go.

Definition of canonical model: The canonical model for system S is the MPL-model ⟨W, R, I⟩ where:

· W is the set of all maximal S-consistent sets of wffs

· Rww′ iff 2−(w) ⊆ w′

· I(α, w) = 1 iff α ∈ w, for each sentence letter α and each w ∈ W

(where 2−(∆) is defined as the set of wffs φ such that 2φ is a member of ∆)

Let’s think for a bit about this definition. As promised, we have defined the members of W to be maximal S-consistent sets of wffs. And note that all maximal S-consistent sets of wffs are included in W.

Accessibility is defined using the “2−” notation. Think of this operation as “stripping off the boxes”: to arrive at 2−(∆) (“the box-strip of set ∆”), begin with set ∆, discard any formula that doesn’t begin with a 2, line up the remaining formulas, and then strip one 2 off of the front of each. The definition of accessibility, therefore, says that Rww′ iff for each wff 2φ that is a member of w, the wff φ is a member of w′.

The definition of accessibility in the canonical model says nothing about formal properties like transitivity, reflexivity, and so on. As a result, it is not true by definition that the canonical model for S is an S-model. T-models, for example, must have reflexive accessibility relations, whereas the definition of the accessibility relation in the canonical model for T says nothing about reflexivity. As we will eventually see, the canonical model for each system S turns out to be an S-model, but this fact must be proven; it’s not built into the definition of a canonical model.
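The box-strip operation and the accessibility test it defines can be sketched directly. This is an illustration under assumptions: genuine worlds in a canonical model are infinite maximal S-consistent sets, whereas the sets below are small finite stand-ins, and formulas are nested tuples tagged "atom", "not", and "box". The function names `box_strip` and `accessible` are hypothetical.

```python
def box_strip(delta):
    """Keep only the formulas that begin with a box, and strip one box from each."""
    return {f[1] for f in delta if f[0] == "box"}


def accessible(w, w_prime):
    """Canonical-model accessibility: every box-stripped member of w is in w_prime."""
    return box_strip(w) <= w_prime


P, Q = ("atom", "P"), ("atom", "Q")

# Finite stand-ins for worlds (real canonical worlds are infinite sets of wffs).
w = {("box", P), ("box", ("box", Q)), ("not", Q)}
w1 = {P, ("box", Q), Q}
w2 = {P, ("not", Q)}

print(box_strip(w) == {P, ("box", Q)})
print(accessible(w, w1), accessible(w, w2))
```

Only one box is removed from each formula, so the doubly boxed Q contributes a singly boxed Q rather than Q itself, and the unboxed member of `w` is discarded. Note that nothing in `accessible` mentions reflexivity, symmetry, or transitivity, which is the point made above about what the definition leaves open.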

An atomic wff (sentence letter) is defined to be true at a world iff it is a member of that world. Thus, for atomic wffs, truth and membership coincide. What we really need to know, however, is that truth and membership coincide for all wffs, including complex wffs. Proving this turns out to be a big task, which will occupy us for several sections. We’ll need first to assemble some firepower: a number of preliminary lemmas and theorems which will eventually be used to prove that membership and truth coincide for all wffs in canonical models. We’ll then finally be able to give completeness proofs.

6.6.4 Features of maximal consistent sets

We’ll begin with some lemmas governing the behavior of maximal consistent sets:

Lemma 6.4 Where Γ is any maximal S-consistent set of wffs:

6.4a for any wff φ, exactly one of φ, ∼φ is a member of Γ

6.4b φ→ψ ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ

6.4c if φ and φ→ψ are both members of Γ then so is ψ

6.4d if `S φ then φ ∈ Γ

Proof of Lemma 6.4a. We know from the definition of maximality that at least one of φ or ∼φ is in Γ. But it cannot be that both are in Γ, for then Γ would be S-inconsistent (it would contain the finite subset {φ, ∼φ}; but since all modal systems incorporate propositional logic, it is a theorem of S that ∼(φ∧∼φ)).


Proof of Lemma 6.4b. Suppose first that φ→ψ is in Γ, and suppose for reductio that φ is in Γ but ψ is not. Then, since Γ is maximal, ∼ψ is in Γ; but now Γ is S-inconsistent by containing the subset {φ, φ→ψ, ∼ψ}. Suppose for the other direction that either φ is not in Γ or ψ is in Γ, and suppose for reductio that φ→ψ isn’t in Γ. Since Γ is maximal, ∼(φ→ψ) ∈ Γ. Now, if φ ∉ Γ then ∼φ ∈ Γ, but then Γ would contain the S-inconsistent subset {∼(φ→ψ), ∼φ}. And if on the other hand, ψ ∈ Γ then Γ again contains an S-inconsistent subset: {∼(φ→ψ), ψ}. Either possibility contradicts Γ’s S-consistency.

Exercise 6.10 Prove lemmas 6.4c and 6.4d.

6.6.5 Maximal consistent extensions

Next let’s show that if we begin with an S-consistent set ∆, we can “expand” it into a maximal S-consistent set Γ. (We prove this because we’ll need to know that there exist enough maximal consistent sets in a canonical model’s W in order for the model to do its thing.)

Theorem 6.5 If ∆ is an S-consistent set of wffs, then there exists some maximal S-consistent set of wffs, Γ, such that ∆ ⊆ Γ

Proof of Theorem 6.5. In outline, we’re going to build up Γ as follows. We’re

going to start by dumping all the formulas in∆ into Γ. Then we will go through

all the wffs in the language of MPL, φ1, φ2,…, one at a time. For each of these

wffs, we’re going to dump either it or its negation into Γ, depending on which

choice would be S-consistent. After we’re done, our set Γ will obviously be

maximal; it will obviously contain ∆ as a subset; and, we’ll show, it will also be

S-consistent.

So, let φ1, φ2,… be a list (an infinite list, of course) of all the wffs of MPL.10 Our strategy, recall, is to construct Γ by starting with ∆, and then going through this list one-by-one, at each point adding either φi or ∼φi. Here’s how we do this more carefully. Let’s begin by defining an infinite sequence of sets, Γ0, Γ1, . . . :

· Γ0 is ∆

· Γn+1 is Γn ∪ {φn+1} if that is S-consistent; otherwise Γn+1 is Γn ∪ {∼φn+1}

Note the recursive nature of the definition: the next member of the sequence of sets, Γn+1, is defined as a function of the previous member of the sequence, Γn.

Next let’s prove that each member in this sequence (that is, each Γi) is an S-consistent set. We do this inductively, by first showing that Γ0 is S-consistent, and then showing that if Γn is S-consistent, then so will be Γn+1.

Obviously, Γ0 is S-consistent, since ∆ was stipulated to be S-consistent.

Next, suppose that Γn is S-consistent; we must show that Γn+1 is S-consistent. Look at the definition of Γn+1. What Γn+1 gets defined as depends on whether Γn ∪ {φn+1} is S-consistent. If Γn ∪ {φn+1} is S-consistent, then Γn+1 gets defined as that very set Γn ∪ {φn+1}, and so of course is S-consistent. So we’re ok in that case.

The remaining possibility is that Γn ∪ {φn+1} is S-inconsistent. In that case, Γn+1 gets defined as Γn ∪ {∼φn+1}. So we must show that in this case, Γn ∪ {∼φn+1} is S-consistent. Suppose for reductio that it isn’t. The conjunction of some finite subset of its members must therefore be provably false in S. Since Γn was S-consistent, the finite subset must contain ∼φn+1, and so there exist ψ1…ψm ∈ Γn such that `S ∼(ψ1∧···∧ψm∧∼φn+1). Furthermore, since Γn ∪ {φn+1} is S-inconsistent, it too contains a finite subset that is provably false in S. Since Γn is S-consistent, the finite subset must contain φn+1, so there exist χ1…χp ∈ Γn such that `S ∼(χ1∧···∧χp∧φn+1). But notice that ∼(ψ1∧···∧ψm∧χ1∧···∧χp) is a PL-semantic-consequence of the formulas ∼(ψ1∧···∧ψm∧∼φn+1) and ∼(χ1∧···∧χp∧φn+1). It follows that `S ∼(ψ1∧···∧ψm∧χ1∧···∧χp) (each of our modal systems “contains PL”: given the completeness of the PL axioms, one can always move within an MPL axiomatic proof from some formulas to a PL-semantic-consequence of those formulas). Since ψ1…ψm and χ1…χp are all members of Γn, this contradicts the fact that Γn is S-consistent.

10We need to be sure that there is some way of arranging all the wffs of MPL into such a list. Here is one method. Consider the following list of the primitive expressions of MPL:

(    )    ∼    →    2    P1   P2   . . .
1    2    3    4    5    6    7    . . .

Since we’ll need to refer to what position an expression has in this list, the positions of the expressions are listed underneath those expressions. (E.g., the position of the 2 is 5.) Now, where φ is any wff, call the rating of φ the sum of the positions of the occurrences of its primitive expressions. (The rating for the wff (P1→P1), for example, is 1 + 6 + 4 + 6 + 2 = 19.) We can now construct the listing of all the wffs of MPL by an infinite series of stages: stage 1, stage 2, etc. In stage n, we append to our growing list all the wffs of rating n, in alphabetical order. The notion of alphabetical order here is the usual one, given the ordering of the primitive expressions laid out above. (E.g., just as ‘and’ comes before ‘nad’ in alphabetical order, since ‘a’ precedes ‘n’ in the usual ordering of the English alphabet, ∼2P2 comes before 2∼P2 in alphabetical order since ∼ comes before the 2 in the ordering of the alphabet of MPL. Note that each of these wffs is inserted into the list in stage 15, since each has rating 15.) In stages 1-5 no wffs are added at all, since every wff must have at least one sentence letter and P1 is the sentence letter with the smallest position. In stage 6 there is one wff: P1. Thus, the first member of our list of wffs is P1. In stage 7 there is one wff: P2, so P2 is the second member of the list. In every subsequent stage there are only finitely many wffs; so each stage adds finitely many wffs to the list; each wff gets added at some stage; so each wff eventually gets added after some finite amount of time to this list.
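The rating scheme of footnote 10 is easy to compute, which makes it clear why every wff lands at some finite stage. In this sketch a wff is a list of tokens, and the ASCII token names "~", "->", and "box" are assumptions standing in for the negation sign, the arrow, and the box; sentence letters are written "P1", "P2", and so on.

```python
# Positions of the primitive expressions, as in the footnote's table.
FIXED_POSITIONS = {"(": 1, ")": 2, "~": 3, "->": 4, "box": 5}


def position(token):
    """Position of a primitive expression; sentence letter Pn has position 5 + n."""
    if token in FIXED_POSITIONS:
        return FIXED_POSITIONS[token]
    if token.startswith("P") and token[1:].isdigit() and int(token[1:]) >= 1:
        return 5 + int(token[1:])
    raise ValueError(f"not a primitive expression: {token}")


def rating(tokens):
    """Sum of the positions of all occurrences of primitive expressions."""
    return sum(position(t) for t in tokens)


def listing_key(tokens):
    """Sort key: rating first (the stage), then alphabetical order by position."""
    return (rating(tokens), [position(t) for t in tokens])


conditional = ["(", "P1", "->", "P1", ")"]
neg_box = ["~", "box", "P2"]
box_neg = ["box", "~", "P2"]

print(rating(conditional), rating(neg_box), rating(box_neg))
print(sorted([box_neg, ["P2"], neg_box, ["P1"]], key=listing_key))
```

Sorting by `listing_key` reproduces the order described in the footnote: the two sentence letters come first (stages 6 and 7), and within stage 15 the wff beginning with the negation sign precedes the one beginning with the box. The sketch ranks token lists it is handed; it does not itself check that a token list is a well-formed formula.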

We have shown that all the sets in our sequence Γi are S-consistent. Let us now define Γ to be the union of all the sets in the infinite sequence, i.e., {φ : φ ∈ Γi for some i}. We must now show that Γ is the set we’re after: that i) ∆ ⊆ Γ, ii) Γ is maximal, and iii) Γ is S-consistent.

Any member of ∆ is a member of Γ0 (since Γ0 was defined as ∆), hence is a member of one of the Γi s, and hence is a member of Γ. So ∆ ⊆ Γ.

Any wff of MPL is in the list somewhere, i.e., it is φi for some i. But by definition of Γi, either φi or ∼φi is a member of Γi; and so one of these is a member of Γ. Γ is therefore maximal.

Suppose for reductio that Γ is S-inconsistent; there must then exist ψ1…ψm ∈ Γ such that `S ∼(ψ1∧···∧ψm). By definition of Γ, each of these ψi’s is a member of Γj, for some j. Let k be the largest such j. Note next that, given the way the Γi’s are constructed, each Γi is a subset of all subsequent ones. Thus, all of the ψi’s are members of Γk, and thus Γk is S-inconsistent. But that can’t be: we showed that all the Γi’s are S-consistent.
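The stage-by-stage construction in this proof can be imitated in a finite, purely propositional setting. The sketch below rests on loud assumptions: the listing of wffs is a short finite list rather than an infinite one, formulas are nested tuples tagged "atom", "not", and "imp", and truth-table satisfiability is used as a stand-in for consistency (for propositional logic the two coincide by soundness and completeness, but no modal system S is being modeled here).

```python
from itertools import product


def atoms(f):
    """Set of sentence letters occurring in formula f."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))


def value(f, row):
    """Truth value of f under a truth-table row (a dict from letters to booleans)."""
    if f[0] == "atom":
        return row[f[1]]
    if f[0] == "not":
        return not value(f[1], row)
    if f[0] == "imp":
        return (not value(f[1], row)) or value(f[2], row)
    raise ValueError(f"unknown connective: {f[0]}")


def consistent(formulas):
    """Stand-in for consistency: some truth-table row makes every formula true."""
    names = sorted(set().union(*(atoms(f) for f in formulas)))
    rows = (dict(zip(names, bits)) for bits in product([False, True], repeat=len(names)))
    return any(all(value(f, row) for f in formulas) for row in rows)


def extend(delta, listing):
    """Gamma_0 is delta; at each stage add the next wff if consistent, else its negation."""
    gamma = list(delta)
    for phi in listing:
        gamma.append(phi if consistent(gamma + [phi]) else ("not", phi))
    return gamma


P, Q = ("atom", "P"), ("atom", "Q")
delta = [("imp", P, Q), P]
listing = [Q, ("not", P), ("imp", Q, P)]
gamma = extend(delta, listing)
print(consistent(gamma), len(gamma))
```

As in the proof, the negation is added without a separate consistency test; the inductive argument is what guarantees that this branch is safe. The result contains `delta`, settles every wff in the listing one way or the other, and remains consistent, which mirrors claims i) through iii) for this toy case.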

6.6.6 “Mesh”

Our ultimate goal is to show that in canonical models, a wff is true at a world iff it is a member of that world. If we’re going to be able to show this, we’d better be able to show things like this:

(2) If 2φ is a member of world w, then φ is a member of every world

accessible from w

(3) If 3φ is a member of world w, then φ is a member of some world

accessible from w


We’ll need to be able to show (2) and (3) because it’s part of the definition of truth in any MPL-model (whether canonical or not) that 2φ is true at w iff φ is true at each world accessible from w, and that 3φ is true at w iff φ is true at some world accessible from w. Think of it this way: (2) and (3) say that the modal statements that are members of a world w in a canonical model “mesh” with the members of the other worlds in that canonical model. This sort of mesh had better hold if truth and membership are going to coincide.

(2) we know to be true straightaway, since it follows from the definition of the accessibility relation in canonical models. The definition of the canonical model for S, recall, stipulated that w′ is accessible from w iff for each wff 2φ in w, the wff φ is a member of w′. (3), on the other hand, doesn’t follow immediately from our definitions; we’ll need to prove it. Actually, it will be convenient to prove something slightly different which involves only the 2:

Lemma 6.6 If ∆ is a maximal S-consistent set of wffs containing ∼2φ, then there exists a maximal S-consistent set of wffs Γ such that 2−(∆) ⊆ Γ and ∼φ ∈ Γ

(Given the definition of accessibility in the canonical model and the definition of the 3 in terms of the 2, Lemma 6.6 basically amounts to (3).)

Proof of Lemma 6.6. Let ∆ be as described. The first (and biggest) step is to establish:

(*) 2−(∆)∪{∼φ} is S-consistent.

Suppose for reductio that (*) is false. By the definition of S-inconsistency, for some χ1…χm ∈ 2−(∆)∪{∼φ} the following is a theorem of S:

∼(χ1∧···∧χm)

The following is a PL-semantic-consequence of this formula, and hence, since S includes PL, is also a theorem of S:

∼(χ1∧···∧χm∧∼φ)

Now go through the list χ1…χm, and if it contains any wffs that are not members of 2−(∆), drop them from the list. Call the resulting list ψ1…ψn. Each of the ψi s, note, is a member of 2−(∆).11 The only wff that could have been dropped, in moving from the χi s to the ψi s, is ∼φ (since each χi was a member of 2−(∆)∪{∼φ}); the following wff is therefore a PL-semantic-consequence of the previous wff, and so is itself a theorem of S:

∼(ψ1∧···∧ψn∧∼φ)

Next, begin a proof in S with a proof of∼(ψ1∧· · ·∧ψn∧∼φ), and then continue

as follows:

⋮
i.      ∼(ψ1∧···∧ψn∧∼φ)
i + 1.  ψ1→(ψ2→···(ψn→φ)···)           i, PL
i + 2.  □(ψ1→(ψ2→···(ψn→φ)···))        i + 1, NEC
⋮
j.      □ψ1→(□ψ2→···(□ψn→□φ)···)       i + 2, K, PL (×n)
j + 1.  ∼(□ψ1∧···∧□ψn∧∼□φ)             j, PL
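To make the compressed step from line i + 2 to line j concrete, here is a worked instance for the special case n = 2. It is only an illustration of the pattern (one K axiom and one propositional step per ψ); the line numbering is chosen for this sketch and is not the numbering of the proof above.

```latex
\begin{array}{lll}
1. & \Box(\psi_1 \to (\psi_2 \to \varphi)) & \text{(line } i+2 \text{ above)} \\
2. & \Box(\psi_1 \to (\psi_2 \to \varphi)) \to (\Box\psi_1 \to \Box(\psi_2 \to \varphi)) & \text{K} \\
3. & \Box\psi_1 \to \Box(\psi_2 \to \varphi) & 1, 2, \text{PL} \\
4. & \Box(\psi_2 \to \varphi) \to (\Box\psi_2 \to \Box\varphi) & \text{K} \\
5. & \Box\psi_1 \to (\Box\psi_2 \to \Box\varphi) & 3, 4, \text{PL} \\
6. & \sim(\Box\psi_1 \wedge \Box\psi_2 \wedge \sim\Box\varphi) & 5, \text{PL}
\end{array}
```

For larger n the same two moves are repeated once for each further ψi, which is what the annotation “K, PL (×n)” abbreviates.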

This proof establishes that ⊢_S ∼(□ψ1∧···∧□ψn∧∼□φ). But since □ψ1…□ψn and ∼□φ are all in ∆, this contradicts ∆’s S-consistency (□ψ1…□ψn are members of ∆ because ψ1…ψn are members of □⁻(∆)).

We’ve established (*): □⁻(∆) ∪ {∼φ} is S-consistent. It therefore has a maximal S-consistent extension, Γ, by Theorem 6.5. Since □⁻(∆) ∪ {∼φ} ⊆ Γ, we know that □⁻(∆) ⊆ Γ and that ∼φ ∈ Γ. Γ is therefore our desired set.

Exercise 6.11 Where S is any normal modal system, show that if ∆ is an S-consistent set of wffs containing the formula ◇φ, then □⁻(∆) ∪ {φ} is also S-consistent. You may appeal to lemmas and theorems proved in this chapter so far.

11 If □⁻(∆) is empty then there will be no ψi. In that case, let’s regard “ψ1∧···∧ψn∧∼φ” as standing for ∼φ.


6.6.7 Truth and membership in canonical models

We’re now in a position to put all of our lemmas to work, and prove that canonical models have the desired property that the wffs true at a world are exactly the members of that world:

Theorem 6.7 Where M (= ⟨W, R, I⟩) is the canonical model for any normal modal system, S, for any wff φ and any w ∈ W, V_M(φ, w) = 1 iff φ ∈ w

Proof of Theorem 6.7. We’ll use induction. The base case is when φ has zero connectives, i.e., φ is a sentence letter. In that case, the result is immediate: by the definition of the canonical model, I(φ, w) = 1 iff φ ∈ w; but by the definition of the valuation function, V_M(φ, w) = 1 iff I(φ, w) = 1.

Now the inductive step. We suppose (ih) that the result holds for φ, ψ, and show that it holds for ∼φ, φ→ψ, and □φ as well. First, ∼: we must show that ∼φ is true at w iff ∼φ ∈ w:

i) ∼φ ∈ w iff φ ∉ w (6.4a)

ii) φ ∉ w iff φ is not true at w (ih)

iii) φ is not true at w iff ∼φ is true at w (truth cond. for ∼)

iv) so, ∼φ is true at w iff ∼φ ∈ w (lines i), ii), iii))

Next, →: we must show that φ→ψ is true at w iff φ→ψ ∈ w:

i) φ→ψ is true at w iff either φ is not true at w or ψ is true at w (truth cond. for →)

ii) So, φ→ψ is true at w iff either φ ∉ w or ψ ∈ w (ih)

iii) So, φ→ψ is true at w iff φ→ψ ∈ w (6.4b)

Finally, □: we must show that □φ is true at w iff □φ ∈ w. First the forwards direction. Assume □φ is true at w; then φ is true at every world w′ such that Rww′. By the ih, we have (+) φ is a member of every such w′. Now suppose for reductio that □φ ∉ w; by 6.4a, ∼□φ ∈ w. Since w is maximal S-consistent, by Lemma 6.6, we know that there exists some maximal S-consistent set Γ such that □⁻(w) ⊆ Γ and ∼φ ∈ Γ. By definition of W, Γ is a world; by definition of R, RwΓ; and so by (+) Γ contains φ. But Γ also contains ∼φ, which contradicts its S-consistency.


Now the backwards direction. Assume □φ ∈ w. Then by definition of R, for every w′ such that Rww′, φ ∈ w′. By the ih, φ is true at every such world; hence by the truth condition for □, □φ is true at w.

What was the point of proving theorem 6.7? The whole idea of a canonical model was that a formula should be valid in the canonical model for S iff it is a theorem of S. This fact follows fairly immediately from Theorem 6.7:

Corollary 6.8 φ is valid in the canonical model for S iff ⊢_S φ

Proof of Corollary 6.8. Let ⟨W, R, I⟩ be the canonical model for S. Suppose ⊢_S φ. Then, by lemma 6.4d, φ is a member of every maximal S-consistent set, and hence φ ∈ w, for every w ∈ W. By theorem 6.7, φ is true at every w ∈ W, and so is valid in this model. Now for the other direction: suppose ⊬_S φ. Then {∼φ} is S-consistent, and so by theorem 6.5, has a maximal S-consistent extension; thus, ∼φ ∈ w for some w ∈ W; by theorem 6.7, ∼φ is therefore true at w, and so φ is not true at w, and hence φ is not valid in this model.

So, we’ve gotten where we wanted to go: we’ve shown that every system

has a canonical model, and that a wff is valid in the canonical model iff it is a

theorem of the system. We now use this fact to prove completeness for our

various systems:

6.6.8 Completeness of systems of MPL

Completeness of K

K’s completeness follows immediately. Any K-valid wff is valid in all MPL-models, and so is valid in the canonical model for K, and so, by corollary 6.8, is a theorem of K.

For any other system, S, all we need to do to prove S-completeness is to show that the canonical model for S is an S-model. That is, we must show that the accessibility relation in the canonical model for S satisfies the formal constraint for system S (seriality for D, reflexivity for T and so on). This will be made clear in the proof of completeness for D:

Completeness of D

Let us show that in the canonical model for D, the accessibility relation, R, is serial. Let w be any world in that model. We showed above that ◇(P→P) is a theorem of D, and so is a member of w by lemma 6.4d, and so is true at w by theorem 6.7. Thus, by the truth condition for ◇, there must be some world accessible from w in which P→P is true; and hence there must be some world accessible from w.

Now for D’s completeness. Let φ be D-valid. It is then valid in all D-models, i.e., all models with a serial accessibility relation. But we just showed that the canonical model for D has a serial accessibility relation. φ is therefore valid in that model, and hence by corollary 6.8, ⊢_D φ.

Completeness of T

All we need to do is to prove that the accessibility relation in the canonical model for T is reflexive; given that, every T-valid formula is valid in the canonical model for T, and hence by corollary 6.8, every T-valid formula is a T-theorem.

Let φ be any wff. ⊢_T □φ→φ, so, where w is any world in the canonical model for T, by lemma 6.4d, □φ→φ ∈ w. By lemma 6.4c, if □φ ∈ w, then so is φ. Formula φ was arbitrarily chosen, so we have: for any φ, if □φ ∈ w then φ ∈ w. But this is the definition of Rww. World w was arbitrarily chosen, so R is reflexive.

Completeness of B

We must show that the accessibility relation in the canonical model for B is reflexive and symmetric. Reflexivity can be demonstrated in the same way as it was for T, since every T-theorem is a B-theorem.

Now for symmetry: in the canonical model for B, suppose that Rwv. We must show that Rvw, that is, that for any □ψ in v, ψ ∈ w. So, suppose that □ψ ∈ v. By theorem 6.7, □ψ is true at v; since Rwv, by the truth condition for ◇ it follows that ◇□ψ is true at w, and hence is a member of w by theorem 6.7. Since ⊢_B ◇□ψ→ψ, by lemma 6.4d, ◇□ψ→ψ ∈ w, and so, by lemma 6.4c, ψ ∈ w.

Completeness of S4

We must show that the accessibility relation in the canonical model for S4 is reflexive and transitive. Again, reflexivity can be demonstrated as it was for T; transitivity remains. Suppose Rwv and Rvu. We must show Rwu, that is, that for any □ψ ∈ w, ψ ∈ u. If □ψ ∈ w, since ⊢_S4 □ψ→□□ψ, by lemma 6.4d, □ψ→□□ψ ∈ w, and so by lemma 6.4c, □□ψ ∈ w. By theorem 6.7, □□ψ is true at w; hence by the truth condition for □, □ψ is true at v; again by the truth condition for □, ψ is true at u; by theorem 6.7, ψ ∈ u.

Completeness of S5

We must show that the accessibility relation in the canonical model for S5 is reflexive, symmetric, and transitive. But since each T, B, and S4 theorem is an S5 theorem, the proofs of reflexivity, symmetry, and transitivity from the previous three sections apply here.
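The formal constraints that the completeness proofs rely on can be made concrete for finite frames. The following is a minimal sketch, not part of the proofs above: canonical models are infinite, so nothing here checks them. The function names, the `CONSTRAINTS` table, and the sample frame are assumptions introduced for illustration.

```python
# A frame is a pair (W, R): W is a finite set of worlds and R is a set of
# ordered pairs of worlds. All names below are hypothetical, chosen for this sketch.

def is_serial(W, R):
    """Every world sees at least one world."""
    return all(any((w, v) in R for v in W) for w in W)

def is_reflexive(W, R):
    """Every world sees itself."""
    return all((w, w) in R for w in W)

def is_symmetric(W, R):
    """Whenever w sees v, v sees w."""
    return all((v, w) in R for (w, v) in R)

def is_transitive(W, R):
    """Whenever w sees v and v sees u, w sees u."""
    return all((w, u) in R for (w, v) in R for (v2, u) in R if v == v2)

# The constraint on accessibility associated with each system in the text.
CONSTRAINTS = {
    "K": [],
    "D": [is_serial],
    "T": [is_reflexive],
    "B": [is_reflexive, is_symmetric],
    "S4": [is_reflexive, is_transitive],
    "S5": [is_reflexive, is_symmetric, is_transitive],
}

def is_frame_for(system, W, R):
    """True iff (W, R) meets the formal constraint for the given system."""
    return all(check(W, R) for check in CONSTRAINTS[system])

# A sample three-world frame: reflexive, but neither symmetric nor transitive.
W = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3)}
print(is_frame_for("T", W, R))   # True
print(is_frame_for("S4", W, R))  # False: 1 sees 2 and 2 sees 3, but 1 does not see 3
```

Adding the pair (1, 3) to `R` makes the sample frame transitive, so it then counts as an S4-frame but still not as an S5-frame.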

Exercise 6.12 Consider the system that results from adding to K every axiom of the form ◇φ→□φ. Let the frames for this system be defined as those whose accessibility relation meets the following condition: every world can see at most one world. Prove completeness for this (strange) system.

Chapter 7

Variations on Propositional Modal Logic

As we have seen, possible worlds are useful for giving a semantics for propositional modal logic. Possible worlds are useful in other areas of logic as well. In this chapter we will briefly examine two other uses for possible worlds: semantics for tense logic, and semantics for intuitionist propositional logic.

7.1 Propositional tense logic1

7.1.1 The metaphysics of time

Propositional modal logic concerned the logic of the non-truth-functional sentential operators “it is necessary that” and “it is possible that”. Another set of sentential operators that can be similarly treated are propositional tense operators, such as “it will be the case that”, “it has always been the case that”, etc.

A full logical treatment of natural language obviously requires that we pay attention to temporal notions. Some philosophers, however, think that it requires nothing beyond standard predicate logic. This was the view of many early logicians, most notably Quine.2 Here are some examples of how Quine would regiment temporal sentences in predicate logic:

1See Gamut (1991b, section 2.4); Cresswell and Hughes (1996, pp. 127-134).

2See, for example, Quine (1953b).


Everyone who is now an adult was once a child

∀x(Axn → ∃t[Etn ∧ Cxt])

A dinosaur once trampled a mammal

∃x∃y∃t(Etn ∧ Dx ∧ My ∧ Txyt)

Comments:

· ‘n’ is to be a name for the present time

· The predicate “E” is to be a predicate for the earlier-than relation over moments of time. Thus, “Etn” means that t is a time that is before the present moment; and so, “∃t(Etn ∧ …)” means that there exists some time, t, before the present moment, such that ….

· we add in a new place for all predicates, for the time at which the object satisfies the predicate. Thus, instead of saying “Cx” (“x is a child”) we say “Cxt”: “x is a child at t”

· the quantifier ∃x is atemporal, ranging over all objects at all times. That’s how we can say that there is a thing, x, that is a dinosaur, and which, at some previous time, trampled a mammal.

So: we can use Quine’s strategy to represent temporal notions using standard predicate logic. But some philosophers reject the conception of time that is presupposed by Quine’s strategy. First, Quine presupposes that the past, present, and future are equally real. After all, his symbolization of “A dinosaur once trampled a mammal” says that there is such a thing as a dinosaur. Quine’s view is that time is “space-like”. Other times are as real as the present, just temporally distant, just as other places are equally real but spatially distant. Second, Quine presupposes a distinctive metaphysics of change. Quine accounts for change by adding argument places to temporary predicates like ‘is a child’ and ‘is an adult’. For him, the statement ‘Ted is an adult’ is incomplete in something like the way ‘Philadelphia is north of’ is incomplete: its predicate has an unfilled argument place. When all of a sentence’s argument places are filled, it can no longer change its truth value; as a result (according to some), Quine’s approach leaves no room for genuine change.

Arthur Prior (1967; 1968) and others reject Quine’s picture of time. According to Prior, rather than reducing notions of past, present, and future to notions about what is true at times, we must instead include certain special temporal expressions (sentential tense operators) in our most basic languages, and develop an account of their logic. Thus he initiated the study of tense logic.

One of Prior’s tense operators was P, symbolizing “it was the case that”. Grammatically, P behaves like the ∼ and the □: it attaches to a complete sentence and forms another complete sentence. Thus, if R symbolizes “it is raining”, then PR symbolizes “it was raining”. If a sentence letter occurs by itself, outside of the scope of all temporal operators, then for Prior it is to be read as present-tensed. Thus, it was appropriate to let R symbolize “It is raining”, i.e., it is now raining.

Suppose we symbolize “there exists a dinosaur” as ∃xD x. Prior would then

symbolize “There once existed a dinosaur” as:

P∃xD x

And according to Prior, P∃xD x is not to be analyzed as saying that there exist

dinosaurs located in the past. For him, there is no further analysis of P∃xD x.

Prior’s attitude toward P is like everyone else’s attitude toward the ∼: no one

thinks that ∼∃xU x, “there are no unicorns”, is to be analyzed as saying that

there exist unreal unicorns. Further, Prior can represent the fact that I am

now, but have not always been, an adult, without adding argument places for

times to predicates. Symbolizing ‘is an adult’ with ‘A’, and ‘Ted’ with ‘t ’, Prior

would write: At ∧P∼At (“Ted is an adult, but it was the case that: Ted isn’t an

adult”). For Prior, the sentence At (“Ted is an adult”) is a complete statement,

but nevertheless can alter its truth value.

7.1.2 Tense operators

One can study various tense operators. Here is one group:

Gφ: “it is, and is always going to be the case that φ”

Hφ: “it is, and always has been the case that φ”

Fφ: “it either is, or will at some point in the future be the case that, φ”

Pφ: “it either is, or was at some point in the past the case that φ”


Notice how these tense operators come in interdefinable pairs (related to each other as the □ and the ◇):

Gφ iff ∼F∼φ
Hφ iff ∼P∼φ

Thus one could start with just two of them, G and H say, and define the others. And one could use these two to define further tense operators, for example A and S, for “always” and “sometimes”:

Aφ iff Hφ∧Gφ
Sφ iff ∼H∼φ∨∼G∼φ (i.e., iff Pφ∨Fφ)

There are further tense operators that are not definable in terms of those we’ve been considering so far. There are, for example, metrical tense operators, which concern what happened or will happen at specific temporal distances in the past or future:

Pxφ: “it was the case x minutes ago that φ”

Fxφ: “it will be the case in x minutes that φ”

We will not consider metrical tense operators further.

The (nonmetrical) tense operators, as interpreted above, “include the

present moment”. For example, if Gφ is now true, then φ must now be true.

One could specify an alternate interpretation on which they do not include the

present moment:

Gφ: “it is always going to be the case that φ”

Hφ: “it always has been the case that φ”

Fφ: “it will at some point in the future be the case that φ”

Pφ: “it was at some point in the past the case that φ”

Whether we take the tense operators as including the present moment will affect what kind of logic we develop for them. For example, the “□-like” operators G and H will obey the T-principle (Gφ and Hφ will imply φ) if they are interpreted as including the present moment, but not otherwise.


7.1.3 Syntax of tense logic

In this chapter we will study only propositional tense logic. Its syntax is straightforward: each tense operator has the grammar of the ∼ and the □. For example, if we take the tense operators G and H as basic, then we could begin with the definition of a wff from propositional logic (section 2.1) and add the following clause:

· If φ is a wff then so are Gφ and Hφ

7.1.4 Possible worlds semantics for tense logic

Let’s turn now to semantics. The most natural semantics for tense logic is a possible worlds-style semantics, in which we think of the members of W as times rather than possible worlds, we think of the accessibility relation as the temporal ordering relation, and we think of the interpretation function as assigning truth values to sentence letters at times.

(A Priorean faces hard philosophical questions about the use of such a

semantics, since according to him, the semantics doesn’t accurately model the

metaphysics of time. The questions are like those questions that confront

someone who uses possible worlds semantics for modal logic, but doesn’t think

that possible worlds are part of the metaphysics of modality.)

This change in how we think about possible-worlds models doesn’t require any change to the definition of a model from section 6.3. A model, M, is still an ordered triple, whose first member is a nonempty set, whose second member is a binary relation over that set, and whose third member is a function that assigns to each sentence letter a truth value relative to each member of the first member of M. To signify the change in how we’re thinking about these models, however, let’s change our notation. Let’s call a model’s first member T, rather than W, and let’s use variables like t, t′, etc., for its members. And since we’re thinking of a model’s second member as a relation of temporal ordering (the at-least-as-early-as relation over times), let’s rename it too: “≤”. (If we were interpreting the tense operators as not including the present moment, then we would think of the temporal ordering relation as the strictly-earlier-than relation, and would write it “<”.) Thus, instead of writing “Rww′”, we write: t ≤ t′.

We’ll need to update the definition of the valuation function. The clauses for the propositional connectives remain the same; what we need to add is clauses for the tense operators. Let’s take just G and H as primitive; here are the clauses:

V_M(Gφ, t) = 1 iff for every t′ such that t ≤ t′, V_M(φ, t′) = 1
V_M(Hφ, t) = 1 iff for every t′ such that t′ ≤ t, V_M(φ, t′) = 1

If we define F and P as ∼G∼ and ∼H∼, respectively, then we get the following derived clauses:

V_M(Fφ, t) = 1 iff for some t′ such that t ≤ t′, V_M(φ, t′) = 1
V_M(Pφ, t) = 1 iff for some t′ such that t′ ≤ t, V_M(φ, t′) = 1
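The clauses for G and H, together with the definitions of F and P, can be turned into a small evaluator for finite models. This is a sketch under assumptions that are not in the text: formulas are encoded as nested tuples, sentence letters as strings, and the interpretation as the set of (letter, time) pairs that receive the value 1. The helper names `future` and `past` stand for the defined operators F and P.

```python
def holds(model, formula, t):
    """Return True iff the formula gets value 1 at time t in the model.

    model is a triple (T, leq, I): T a finite set of times, leq a set of
    pairs (t, u) meaning t ≤ u, and I the set of (letter, time) pairs valued 1.
    """
    T, leq, I = model
    if isinstance(formula, str):                      # sentence letter
        return (formula, t) in I
    op = formula[0]
    if op == "not":
        return not holds(model, formula[1], t)
    if op == "imp":
        return (not holds(model, formula[1], t)) or holds(model, formula[2], t)
    if op == "G":                                     # every u with t ≤ u
        return all(holds(model, formula[1], u) for u in T if (t, u) in leq)
    if op == "H":                                     # every u with u ≤ t
        return all(holds(model, formula[1], u) for u in T if (u, t) in leq)
    raise ValueError(f"unknown operator: {op!r}")

def future(phi):
    """F defined as ∼G∼."""
    return ("not", ("G", ("not", phi)))

def past(phi):
    """P defined as ∼H∼."""
    return ("not", ("H", ("not", phi)))

# Three times in a line, ordered reflexively; it rains only at time 1.
T = {0, 1, 2}
leq = {(a, b) for a in T for b in T if a <= b}
I = {("R", 1)}
model = (T, leq, I)

print(holds(model, future("R"), 0))  # True: it rains at time 1, and 0 ≤ 1
print(holds(model, past("R"), 0))    # False: no time ≤ 0 at which it rains
```

Because F and P are computed from G and H by way of negation, the output agrees with the derived clauses above rather than depending on a separate implementation of them.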

Call an MPL-model, thought of in this way, a “PTL-model” (for “Priorean Tense Logic”). And say that a wff is PTL-valid iff it is true at every time in every PTL-model. Given our discussion of system K from chapter 6, we already know a lot about PTL-validity. The truth condition for the G is the same as the truth condition for the □ in MPL. Thus, for each K-valid formula φ of MPL, there is a PTL-valid formula of tense logic: simply replace each □ in φ with G. Replacing □s with Gs in the K-valid formula □(P∧Q)→□P results in the PTL-valid formula G(P∧Q)→GP, for example. Similarly, replacing □s with Hs in a K-valid formula also results in a PTL-valid formula.

But there are further cases of PTL-validity that depend on the interaction between different tense operators, and hence have no direct analog in MPL. For example, we can demonstrate that ⊨_PTL φ→GPφ:

i) Suppose for reductio that in some PTL-model M (= ⟨T, ≤, I⟩) and some t ∈ T, V_M(φ→GPφ, t) = 0. (I henceforth drop the subscript M.)

ii) So V(φ, t) = 1 and …

iii) …V(GPφ, t) = 0.

iv) Given iii), by the truth condition for G: for some t′ ∈ T, t ≤ t′ and V(Pφ, t′) = 0

v) Given iv), by the (derived) truth condition for P: for every t′′ ∈ T, if t′′ ≤ t′ then V(φ, t′′) = 0

vi) letting t′′ in v) be t, given that t ≤ t′ (from iv)), we have: V(φ, t) = 0, contradicting ii).

Similarly, one can show that ⊨_PTL φ→HFφ.
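As a sanity check on these two validity claims, one can enumerate every PTL-model with at most three times and a single sentence letter, and confirm that no countermodel turns up. This is only a finite search for the instance where φ is a sentence letter, not a substitute for the argument above; the tuple encoding of formulas, the letter name "p", and all function names are assumptions of the sketch.

```python
from itertools import product

def holds(T, leq, true_at, f, t):
    """Value of formula f at time t; true_at is the set of times where 'p' is 1."""
    if f == "p":
        return t in true_at
    op = f[0]
    if op == "not":
        return not holds(T, leq, true_at, f[1], t)
    if op == "imp":
        return (not holds(T, leq, true_at, f[1], t)) or holds(T, leq, true_at, f[2], t)
    if op == "G":
        return all(holds(T, leq, true_at, f[1], u) for u in T if (t, u) in leq)
    if op == "H":
        return all(holds(T, leq, true_at, f[1], u) for u in T if (u, t) in leq)
    raise ValueError(f"unknown operator: {op!r}")

def F(f):
    return ("not", ("G", ("not", f)))

def P(f):
    return ("not", ("H", ("not", f)))

def all_models(n):
    """Yield every model on n times: every relation, every assignment to 'p'."""
    T = range(n)
    pairs = [(a, b) for a in T for b in T]
    for rel_bits in product([False, True], repeat=len(pairs)):
        leq = {pair for pair, bit in zip(pairs, rel_bits) if bit}
        for val_bits in product([False, True], repeat=n):
            true_at = {t for t, bit in zip(T, val_bits) if bit}
            yield T, leq, true_at

def valid_up_to(f, max_size):
    """True iff f is true at every time of every model with at most max_size times."""
    return all(holds(T, leq, true_at, f, t)
               for n in range(1, max_size + 1)
               for T, leq, true_at in all_models(n)
               for t in T)

print(valid_up_to(("imp", "p", ("G", P("p"))), 3))  # True: no small countermodel
print(valid_up_to(("imp", "p", ("H", F("p"))), 3))  # True: no small countermodel
print(valid_up_to(("imp", ("G", "p"), "p"), 3))     # False: fails without reflexivity
```

The last line is included as a contrast: since no constraints have been placed on ≤, the search does find a model in which Gp is true and p is false at some time.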


7.1.5 Formal constraints on ≤

PTL-validity is not a good model for logical truth in tense logic. We have so far placed no constraints on the formal properties of the relation ≤ in a PTL-model. That means that there are PTL-models in which the ≤ looks nothing like a temporal ordering. We don’t normally think that time could consist of a number of wholly temporally disconnected points, for example, or of many points each of which is at-least-as-early-as all of the rest, and so on, but there are PTL-models answering to these strange descriptions. As we have defined them, PTL-valid formulas must be true at every time in every PTL-model, even these strange models. This means that many tense-logical statements that ought, intuitively, to count as logical truths, are in fact not PTL-valid.

The formula GP→GGP is an example. It is PTL-invalid, for consider a model with three times, t1, t2, and t3, where t1 ≤ t2, t2 ≤ t3, and t1 ≰ t3, and in which P is true at t1 and t2, but not at t3:

[Diagram: t1 (P) → t2 (P) → t3 (∼P)]

In this model, GP→GGP is false at time t1. But GP→GGP is, intuitively, a logical truth. If it is and will always be raining, then surely it must also be true that: it is and always will be the case that: it is and always will be raining. The problem, of course, is that the ≤ relation in the model we considered is not transitive, whereas, one normally assumes, the at-least-as-early-as relation must be transitive.
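The three-time countermodel can be checked mechanically. The sketch below encodes just enough of the valuation clauses to evaluate GP→GGP; the tuple encoding and the variable names are assumptions made for illustration, with the times t1, t2, t3 represented by the integers 1, 2, 3.

```python
def holds(T, leq, true_at, f, t):
    """Value of f at time t; true_at is the set of times at which 'P' is 1."""
    if f == "P":
        return t in true_at
    op = f[0]
    if op == "imp":
        return (not holds(T, leq, true_at, f[1], t)) or holds(T, leq, true_at, f[2], t)
    if op == "G":
        return all(holds(T, leq, true_at, f[1], u) for u in T if (t, u) in leq)
    raise ValueError(f"unknown operator: {op!r}")

GP = ("G", "P")
GGP = ("G", GP)
formula = ("imp", GP, GGP)

T = {1, 2, 3}
leq = {(1, 2), (2, 3)}          # t1 ≤ t2 and t2 ≤ t3, but not t1 ≤ t3
true_at = {1, 2}                # P is true at t1 and t2, false at t3

print(holds(T, leq, true_at, GP, 1))       # True: the only time t1 sees is t2
print(holds(T, leq, true_at, GGP, 1))      # False: GP fails at t2, which sees t3
print(holds(T, leq, true_at, formula, 1))  # False

# Adding the pair that transitivity demands removes the counterexample.
leq_transitive = leq | {(1, 3)}
print(all(holds(T, leq_transitive, true_at, formula, t) for t in T))  # True
```

Once t1 ≤ t3 is added, GP is no longer true at t1 (since P fails at t3), so the conditional holds there trivially.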

So: a more interesting notion of validity for PTL formulas results from considering only PTL-models with transitive ≤ relations. Doing this validates every instance of the “S4” schemas:

Gφ→GGφ
Hφ→HHφ

There are other interesting constraints on ≤ that one might impose. One might impose reflexivity, for example. This is natural to impose if we are construing the tense operators as including the present moment; not otherwise. Imposing reflexivity validates the “T-schemas” Gφ→φ and Hφ→φ.

One might also impose “connectivity” of some sort.


Definition of kinds of connectivity: Let R be any binary relation over A.

· R is strongly connected in A iff for every u, v ∈ A, either Ruv or Rvu

· R is weakly connected iff for every u, v, v′, IF: either Ruv and Ruv′, or Rvu and Rv′u, THEN: either Rvv′ or Rv′v
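For finite relations the two definitions can be checked directly. The following sketch transcribes them as written; the function names and the three sample orderings (a single line, two disconnected lines, and a branching structure) are assumptions introduced for illustration.

```python
def strongly_connected(A, R):
    """For every u, v in A: either Ruv or Rvu."""
    return all((u, v) in R or (v, u) in R for u in A for v in A)

def weakly_connected(A, R):
    """If v and v2 are both seen by u, or both see u, then Rvv2 or Rv2v."""
    for u in A:
        for v in A:
            for v2 in A:
                both_after = (u, v) in R and (u, v2) in R
                both_before = (v, u) in R and (v2, u) in R
                if (both_after or both_before) and not ((v, v2) in R or (v2, v) in R):
                    return False
    return True

def reflexive_pairs(A):
    return {(a, a) for a in A}

# A single line of three times.
line = {0, 1, 2}
line_R = reflexive_pairs(line) | {(0, 1), (1, 2), (0, 2)}

# Two timelines wholly disconnected from one another.
two = {"a1", "a2", "b1", "b2"}
two_R = reflexive_pairs(two) | {("a1", "a2"), ("b1", "b2")}

# One time with two incomparable successors (a branch).
branch = {0, 1, 2}
branch_R = reflexive_pairs(branch) | {(0, 1), (0, 2)}

print(strongly_connected(line, line_R), weakly_connected(line, line_R))        # True True
print(strongly_connected(two, two_R), weakly_connected(two, two_R))            # False True
print(strongly_connected(branch, branch_R), weakly_connected(branch, branch_R))  # False False
```

The middle case is the one that separates the two notions: disconnected timelines pass the weak test but fail the strong one.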

So, we might require that the ≤ relation be strongly connected (in T), or, alternatively, merely weakly connected. This would be to disallow “incomparable” pairs of times: pairs of times neither of which bears the ≤ relation to the other. The stronger requirement disallows all incomparable pairs; the weaker requirement merely disallows incomparable pairs when each member of the pair is after or before some one time. Thus, the weaker requirement disallows “branches” in the temporal order but allows distinct timelines wholly disconnected from one another, whereas the stronger requirement ensures that all times are part of a single non-branching structure. Each sort validates every instance of the following schemas:

G(Gφ→ψ)∨G(Gψ→φ)
H(Hφ→ψ)∨H(Hψ→φ)

Here’s a sketch of a validity proof for the first schema:

i) Assume for reductio that V(G(Gφ→ψ)∨G(Gψ→φ), t) = 0 in some PTL-model whose accessibility relation is weakly connected.

ii) So V(G(Gφ→ψ), t) = 0 and V(G(Gψ→φ), t) = 0

iii) so there exist times, t′ and t′′, such that t ≤ t′ and t ≤ t′′, and V(Gφ→ψ, t′) = 0 and V(Gψ→φ, t′′) = 0

iv) thus, Gφ is true at t′, ψ is false at t′, Gψ is true at t′′, and φ is false at t′′

v) but by weak connectivity, t′ ≤ t′′ or t′′ ≤ t′. Either way iv) leads to a contradiction: if t′ ≤ t′′ then, since Gφ is true at t′, φ is true at t′′; and if t′′ ≤ t′ then, since Gψ is true at t′′, ψ is true at t′.

There are other constraints one might impose, for example anti-symmetry (no distinct times bear ≤ to each other), density (between any two times there is another time), or eternality (there exists neither a first nor a last time). In some cases, imposing a constraint results in an interesting schema being validated. Further, some constraints are more philosophically controversial than others.


Notice that one should not impose symmetry on ≤. Obviously if one time is at least as early as another, then the second time needn’t be at least as early as the first. Moreover, imposing symmetry would validate the “B” schemas FGφ→φ and PHφ→φ; but these clearly ought not to be validated. Take the first, for example: it doesn’t follow from “it will be the case that it is always going to be the case that I’m dead” that I’m (now) dead.

So far we have been interpreting the tense operators as including the present moment. That led us to call the temporal ordering relation in our models “≤”, and require that it be reflexive. What if we instead interpreted the tense operators as not including the present moment? We would then call the temporal ordering relation “<”, and think of it as the earlier-than relation; and we would no longer require that it be reflexive. Indeed, it would be natural to require that it be irreflexive: that it never be the case that t < t.

We have considered only the semantic approach to tense logic. What of a

proof-theoretic approach? Given the similarity between tense logic and modal

logic, it should be no surprise that axiom systems similar to those of section

6.4 can be developed for tense logic. Moreover, the techniques developed in

sections 6.5-6.6 can be used to give soundness and completeness proofs for

tense-logical axiom systems, relative to the possible-worlds semantics that we

have developed in this section.

7.2 Intuitionist propositional logic

7.2.1 Kripke semantics for intuitionist propositional logic3

As we saw in section 3.4, intuitionists think of meaning in proof-theoretic terms,

rather than truth-theoretic terms, and as a result reject classical propositional

logic in favor of intuitionist propositional logic. We have already developed a

proof-theory for intuitionist logic: we began with the original sequent calculus

and then dropped double-negation elimination while adding ex falso. But we

still need a semantics. What should such a semantics look like?

In this book we have been thinking of logical truth, on the semantic conception (i.e., validity), as “truth no matter what”. It is natural for intuitionists to think, rather, in terms of “provability no matter what”. We will lay out a semantics for intuitionist propositional logic, due to Saul Kripke, that is based on this idea. The semantics will be like that of possible-worlds semantics for propositional modal logic, and so it will include valuation functions that assign the values 1 and 0 to formulas relative to the members of a set W. But the idea is to now think of the members of W as stages in the construction of proofs, rather than as possible worlds, and to think of 1 and 0 as “proof statuses”, rather than truth values. That is, we are to think of V(φ, w) = 1 as meaning that formula φ has been proved at stage w.

3 See (Priest, 2001, chapter 6)

Let us treat the ∧ and the ∨ as primitive connectives. Here is Kripke’s semantics for intuitionist propositional logic. (To emphasize the different way we are regarding the “worlds”, we rename W “S”, for stages in the construction of proofs, and we will use the variables s, s′, etc., for its members.)

Definition of I-model: An I-model is a triple ⟨S, R, I⟩, such that:

· S is a non-empty set (“proof stages”)

· R is a binary relation over S (“accessibility”) that is reflexive, transitive, and obeys the heredity condition: for any sentence letter α, if I(α, s) = 1 and Rss′ then I(α, s′) = 1

· I is a function from sentence letters and stages to truth values (“interpretation function”).

Definition of valuation: Where M (= ⟨S, R, I⟩) is any I-model, the I-valuation for M, V_M, is defined as the two-place function that assigns either 0 or 1 to each wff relative to each member of S, subject to the following constraints, where α is any sentence letter, φ and ψ are any wffs, and s is any member of S:

V_M(α, s) = I(α, s)
V_M(φ∧ψ, s) = 1 iff V_M(φ, s) = 1 and V_M(ψ, s) = 1
V_M(φ∨ψ, s) = 1 iff V_M(φ, s) = 1 or V_M(ψ, s) = 1
V_M(∼φ, s) = 1 iff for every s′ such that Rss′, V_M(φ, s′) = 0
V_M(φ→ψ, s) = 1 iff for every s′ such that Rss′, either V_M(φ, s′) = 0 or V_M(ψ, s′) = 1

Note that the truth conditions for the → and the ∼ at stage s no longer

depend exclusively on what s is like; they are sensitive to what happens at

stages accessible from s . Unlike the ∧ and the ∨, → and ∼ are not “truth

functional” (relative to a stage); they behave like modal operators.
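The definitions of I-model and I-valuation can be sketched for finite models as follows. The encoding is an assumption made for this example: sentence letters are strings, compound formulas are nested tuples, and the interpretation is given as the set of (letter, stage) pairs valued 1. The names `is_i_model` and `proved` are hypothetical.

```python
def is_i_model(S, R, I):
    """Check reflexivity, transitivity and heredity for a finite triple (S, R, I)."""
    reflexive = all((s, s) in R for s in S)
    transitive = all((s, u) in R for (s, t) in R for (t2, u) in R if t == t2)
    heredity = all((letter, s2) in I
                   for (letter, s) in I
                   for (s1, s2) in R if s1 == s)
    return reflexive and transitive and heredity

def proved(model, f, s):
    """True iff formula f is assigned 1 at stage s."""
    S, R, I = model
    if isinstance(f, str):                         # sentence letter
        return (f, s) in I
    op = f[0]
    if op == "and":
        return proved(model, f[1], s) and proved(model, f[2], s)
    if op == "or":
        return proved(model, f[1], s) or proved(model, f[2], s)
    if op == "not":                                # 0 at every accessible stage
        return all(not proved(model, f[1], s2) for s2 in S if (s, s2) in R)
    if op == "imp":                                # checked at every accessible stage
        return all((not proved(model, f[1], s2)) or proved(model, f[2], s2)
                   for s2 in S if (s, s2) in R)
    raise ValueError(f"unknown connective: {op!r}")

# Three stages in a chain: P is proved from stage 1 on, Q only at stage 2.
S = {0, 1, 2}
R = {(a, b) for a in S for b in S if a <= b}
I = {("P", 1), ("P", 2), ("Q", 2)}
model = (S, R, I)

print(is_i_model(S, R, I))                      # True
print(proved(model, ("imp", "P", "Q"), 0))      # False: at stage 1, P is 1 but Q is 0
print(proved(model, ("imp", "P", "Q"), 2))      # True
print(proved(model, ("not", ("not", "P")), 0))  # True, although P itself is 0 at stage 0
```

The second and third lines show the sensitivity to accessible stages that the text describes: the value of P→Q at stage 0 is settled by what happens at stage 1.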


Let us think intuitively about these models. We are to think of each member of S as a stage in the construction of mathematical proofs. At any stage, one has come up with proofs of some things but not others. When V assigns 1 to a formula at a stage, that means intuitively that as of that state of information, the formula has been proven. The assignment of 0 means that the formula has not been proven thus far (though it might nevertheless be proven in the future).

The holding of the accessibility relation R represents which future stages are possible, given one’s current stage. If s′ is accessible from s, that means that s′ contains all the proofs in s, plus perhaps more. Given this understanding, reflexivity and transitivity are obviously correct to impose, as is the heredity condition, since (on the somewhat idealized conception of proof we are operating with) one does not lose proved information when constructing further proofs. But the accessibility relation will not in general be symmetric: for sometimes one will come across a new proof that one did not formerly have.

Let’s also think through why the truth conditions for →,∧,∨ and ∼ are

intuitively correct. Intuitionists, recall, associate with each propositional con-

nective, a conception of what proofs of formulas built using that connective

must be like:

· a proof of ∼φ is a proof that φ leads to a contradiction

· a proof of φ∧ψ is a proof of φ and a proof of ψ

· a proof of φ∨ψ is a proof of φ or a proof of ψ

· a proof of φ→ψ is a construction that can be used to turn any proof of φ into a proof of ψ

This is what inspires the definition of a valuation function. As of a time, one has proved φ∧ψ iff one has proved both φ and ψ then. As of a time, one has proved φ∨ψ iff one has proved one of the disjuncts. As for ∼, a proof of ∼φ, according to an intuitionist, is a proof that φ leads to a contradiction. But i) if one has proved that φ leads to a contradiction, then in no future stage could one prove φ (at least if one’s methods of proof are consistent); and ii) if one has not proved that φ leads to a contradiction, this leaves open the possibility of a future stage at which one proves φ. Thus the valuation condition for ∼ is justified.4 As for →: if one has a method of converting proofs of φ into proofs of ψ, then there could never be a possible future in which one has a proof of φ but not one of ψ. Conversely, if one lacks such a method, then it should be possible one day to have a proof of φ without being able to convert it into a proof of ψ, and thus without then having a proof of ψ.

4 I’m fudging here a bit. Are the stages idealized so that one has already proven everything one can in principle prove? Clearly not, for then any formula assigned 1 at any accessible stage should already be assigned 1 at that stage. But if stages are not idealized in this way, then why suppose that the assignment of 0 at a stage to ∼φ (failure to prove that φ leads to a contradiction) ensures that there is some future stage at which φ is proved? A similar worry confronts the valuation condition for →.

We can now define intuitionist validity and semantic consequence in the obvious way:

Definitions of validity and semantic consequence:

· φ is I-valid (⊨_I φ) iff V_M(φ, s) = 1 for each stage s in each intuitionist model M

· φ is an I-semantic-consequence of Γ (Γ ⊨_I φ) iff for every intuitionist model M and every stage s in M, if V_M(γ, s) = 1 for each γ ∈ Γ, then V_M(φ, s) = 1

Exercise 7.1 Show that φ ⊨_I ψ iff ⊨_I φ→ψ.

Exercise 7.2 Show that intuitionist consequence implies classical consequence. That is, show that if Γ ⊨_I φ then Γ ⊨_PL φ.

7.2.2 Examples

Given the semantics just introduced, it’s straightforward to demonstrate facts about validity and semantic consequence.

Example 7.1: Show that Q ⊨_I P→Q. (I’ll omit the qualifier “I” from now on.) Take any model and any stage s; assume for reductio that V(Q, s) = 1 and V(P→Q, s) = 0. Thus, for some s′, Rss′ and V(P, s′) = 1 and V(Q, s′) = 0. But this violates heredity, since V(Q, s) = 1 and Rss′.


Example 7.2: Show that P→Q ⊨ ∼Q→∼P (contraposition). Suppose V(P→Q, s) = 1 and V(∼Q→∼P, s) = 0. Given the latter, there’s some stage s′ such that Rss′ and V(∼Q, s′) = 1 and V(∼P, s′) = 0. Given the latter, for some s′′, Rs′s′′ and V(P, s′′) = 1. Given the former, V(Q, s′′) = 0. Given transitivity, Rss′′. Given the truth of P→Q at s, either V(P, s′′) = 0 or V(Q, s′′) = 1. Contradiction.

It’s also straightforward to use the techniques of section 6.3.4 to construct countermodels.

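Example 7.3: Show that ⊭ P∨∼P. Here’s a model in which P∨∼P is valuated
as 0 in stage r:

[Diagram: stage r sees itself and stage a; stage a sees itself. In r, P∨∼P is 0,
with P and ∼P each 0 and an asterisk under ∼P. In a, P is 1, with an asterisk
over P.]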

As in section 6.3, we use asterisks to remind ourselves of commitments that

concern other worlds/stages. The asterisk is under ∼P in stage r because a

negation with value 0 carries a commitment to including some stage at which

the negated formula is 1. The asterisk is over the P in stage a because of the

heredity condition: a sentence letter valuated 1 carries a commitment to make

that letter 1 in every accessible stage. (Likewise, negations and conditionals

valuated as 1 generate top-asterisks, and conditionals valuated as 0 generate

bottom-asterisks). The official model:

S : {r, a}
R : {⟨r, r⟩, ⟨a, a⟩, ⟨r, a⟩}
I(P, a) = 1, all other atomics 0 everywhere

(I’ll skip the official models from now on.)


Example 7.4: Show that ∼∼P ⊭ P. Here is a countermodel:

[Diagram: stage r sees itself and stage a; stage a sees itself. In r, ∼∼P is 1
and P is 0. In a, P is 1 and ∼P is 0.]

Note: since ∼∼P is 1 at r, ∼P must be 0 at every stage that r sees. Now, Rrr,
so ∼P must be 0 at r. So r must see some stage in which P is 1. Stage a takes
care of that.
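The countermodels of examples 7.3 and 7.4 can also be checked mechanically. Here is a minimal sketch in Python. The tuple encoding of formulas and the function name val are assumptions made for illustration, not part of the official semantics. It applies the valuation conditions to the official model of example 7.3, which also serves as a countermodel for example 7.4.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("atom", "P"), ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g)

def val(f, s, R, I):
    """Return the intuitionist value (0 or 1) of formula f at stage s."""
    seen = [t for (u, t) in R if u == s]  # the stages that s sees
    op = f[0]
    if op == "atom":
        return I.get((f[1], s), 0)
    if op == "and":
        return min(val(f[1], s, R, I), val(f[2], s, R, I))
    if op == "or":
        return max(val(f[1], s, R, I), val(f[2], s, R, I))
    if op == "not":  # 1 iff the negated formula is 0 at every seen stage
        return int(all(val(f[1], t, R, I) == 0 for t in seen))
    if op == "imp":  # 1 iff no seen stage has antecedent 1 and consequent 0
        return int(all(val(f[1], t, R, I) == 0 or val(f[2], t, R, I) == 1
                       for t in seen))
    raise ValueError(f"unknown connective: {op}")

# The official model of example 7.3
R = {("r", "r"), ("a", "a"), ("r", "a")}
I = {("P", "a"): 1}  # all other atomics 0 everywhere

P = ("atom", "P")
excluded_middle = ("or", P, ("not", P))
double_negation = ("not", ("not", P))

print(val(excluded_middle, "r", R, I))  # value of P∨∼P at r
print(val(double_negation, "r", R, I))  # value of ∼∼P at r
print(val(P, "r", R, I))                # value of P at r
```

The three printed values should be 0, 1 and 0: P∨∼P fails at r, and ∼∼P holds at r although P does not.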

Exercise 7.3 Establish the following facts.

a) ∼(P∧Q) ⊭ ∼P∨∼Q

b) ∼P∨∼Q ⊨ ∼(P∧Q)

c) P→(Q∨R) ⊭ (P→Q)∨(P→R)

7.2.3 Soundness

Recall our proof system for intuitionism from section 3.4. What I’d like to
do next is show that that proof system is sound, relative to our semantics for
intuitionism. But first we’ll need to prove an intermediate theorem:

Generalized heredity: The heredity condition holds for all formulas. That
is, for any wff φ, whether atomic or not, and any stage s in any intuitionist
model, if V(φ, s) = 1 and Rss′ then V(φ, s′) = 1.


Proof. The proof is by induction. The base case is just the official heredity
condition. Next we make the inductive hypothesis (ih): heredity is true for
formulas φ and ψ; we must now show that heredity also holds for ∼φ, φ→ψ,
φ∧ψ, and φ∨ψ. I’ll do this for φ∧ψ, and leave the rest as exercises.

∧: Suppose for reductio that V(φ∧ψ, s) = 1, Rss′, and V(φ∧ψ, s′) = 0.
Given the first of these, V(φ, s) = 1 and V(ψ, s) = 1. By (ih), V(φ, s′) = 1 and
V(ψ, s′) = 1, so V(φ∧ψ, s′) = 1: contradiction.
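Generalized heredity can be spot-checked on a small model. The sketch below is only a sanity check, not a substitute for the inductive proof. The three-stage model, the tuple encoding of formulas, and the helper names val and formulas are all assumptions made for illustration. It searches every formula built from P and Q with at most two layers of connectives for a violation of heredity.

```python
from itertools import product

def val(f, s, R, I):
    """Return the intuitionist value (0 or 1) of formula f at stage s."""
    seen = [t for (u, t) in R if u == s]  # the stages that s sees
    op = f[0]
    if op == "atom":
        return I.get((f[1], s), 0)
    if op == "and":
        return min(val(f[1], s, R, I), val(f[2], s, R, I))
    if op == "or":
        return max(val(f[1], s, R, I), val(f[2], s, R, I))
    if op == "not":
        return int(all(val(f[1], t, R, I) == 0 for t in seen))
    if op == "imp":
        return int(all(val(f[1], t, R, I) == 0 or val(f[2], t, R, I) == 1
                       for t in seen))
    raise ValueError(f"unknown connective: {op}")

def formulas(atoms, depth):
    """All formulas over the atoms with at most `depth` layers of connectives."""
    level = [("atom", a) for a in atoms]
    for _ in range(depth):
        new = [("not", f) for f in level]
        new += [(op, f, g) for op in ("and", "or", "imp")
                for f, g in product(level, repeat=2)]
        level = level + new
    return level

# A hypothetical model: r sees a sees b; R is reflexive and transitive,
# and the interpretation of the sentence letters is hereditary.
R = {("r", "r"), ("a", "a"), ("b", "b"), ("r", "a"), ("a", "b"), ("r", "b")}
I = {("P", "a"): 1, ("P", "b"): 1, ("Q", "b"): 1}

# A violation is a formula that is 1 at s but 0 at some stage s sees.
violations = [(f, s, t) for f in formulas(["P", "Q"], 2) for (s, t) in R
              if val(f, s, R, I) == 1 and val(f, t, R, I) == 0]
print(len(violations))
```

Since the theorem holds for every intuitionist model, the search should find no violations.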

Exercise 7.4 Complete the proof of generalized heredity.

Now for soundness. What does soundness mean in the present context?
The proof system in section 3.4 is a proof system for sequents, not individual
formulas. So first, we need a notion of intuitionist validity for sequents.

Definition of sequent I-validity: Sequent Γ ⊢ φ is intuitionistically valid
(“I-valid”) iff Γ ⊨_I φ

We can now formulate soundness:

Soundness for intuitionism: Every intuitionistically provable sequent is I-valid

Proof. This will be an inductive proof. Since a provable sequent is the last
sequent in some proof, all we need to show is that every sequent in any proof
is I-valid. And to do that, all we need to show is that the rule of assumptions
generates I-valid sequents (base case), and all the other rules preserve I-validity
(induction step). For any set Γ, valuation function V, and stage s, let’s write
“V(Γ, s) = 1” to mean that V(γ, s) = 1 for each γ ∈ Γ.

Base case: the rule of assumptions generates sequents of the form φ ⊢ φ,
which are clearly I-valid.

Induction step: we show that the other sequent rules from section 3.4
preserve I-validity.

∧I: Here we assume that the inputs to ∧I are I-valid, and show that its
output is I-valid. That is, we assume that Γ ⊢ φ and ∆ ⊢ ψ are I-valid sequents,
and we must show that it follows that Γ, ∆ ⊢ φ∧ψ is also I-valid. So, consider
any model with valuation V and any stage s such that V(Γ∪∆, s) = 1, and suppose
for reductio that V(φ∧ψ, s) = 0. Since Γ ⊢ φ is I-valid, V(φ, s) = 1; since
∆ ⊢ ψ is I-valid, V(ψ, s) = 1; contradiction.


∨E: Assume that Γ ⊢ φ∨ψ, ∆1, φ ⊢ Π, and ∆2, ψ ⊢ Π are all I-valid, and
suppose for reductio that V(Γ ∪ ∆1 ∪ ∆2, s) = 1 but V(Π, s) = 0. The first
assumption tells us that V(φ∨ψ, s) = 1, so either φ or ψ is 1 at s. If the former,
then the second assumption tells us that V(Π, s) = 1; if the latter, then the
third assumption tells us that V(Π, s) = 1. Either way, we have a contradiction.

I leave the demonstration that the remaining rules preserve I-validity as an

exercise.

Exercise 7.5 Show that ∧E, ∨I, DNI, RAA, →I, →E, and EF

preserve I-validity.

I can now justify an assertion I made, but did not prove, in section 3.4. I
asserted there that the sequent ∅ ⊢ P∨∼P is not intuitionistically provable.
Given the soundness proof, to demonstrate that a sequent is not intuitionistically
provable, it suffices to show that its premises do not I-semantically-imply
its conclusion. But in example 7.3 we showed that ⊭ P∨∼P, which is equivalent
to saying that ∅ ⊭ P∨∼P.

Similarly, we showed in example 7.4 that ∼∼P ⊭ P. Thus, by the soundness
theorem, the sequent ∼∼P ⊢ P isn’t provable. (Recall how, in constructing our
proof system for intuitionism in section 3.4, we dropped the rule of
double-negation elimination.)

Chapter 8

Counterfactuals1

There are certain conditionals in natural language that are not well-

represented either by propositional logic’s material conditional or by

modal logic’s strict conditional. In this chapter we consider “counterfactual”

conditionals—conditionals that (loosely speaking) have the form:

If it had been that φ, then it would have been that ψ

For instance:

If I had struck this match, it would have lit

The counterfactuals that we typically utter have false antecedents (hence

the name), and are phrased in the subjunctive mood. They must therefore

be distinguished from English conditionals phrased in the indicative mood.

Counterfactuals are generally thought to semantically differ from indicative

conditionals. A famous example: the counterfactual conditional ‘If Oswald

hadn’t shot Kennedy, someone else would have’ is false (assuming that certain

conspiracy theories are false and Oswald was acting alone); but the indicative

conditional ‘If Oswald didn’t shoot Kennedy then someone else did’ is true

(we know that someone shot Kennedy, so if it wasn’t Oswald, it must have been

someone else.) The semantics of indicative conditionals is an important topic

in its own right, but we won’t take up that topic here.

We represent the counterfactual with antecedent φ and consequent ψ thus:

φ□→ψ

1 This section is adapted from my notes from Ed Gettier’s fall 1988 modal logic class.


What should the logic of this new connective be, if it is to accurately represent

natural language counterfactuals?

8.1 Natural language counterfactuals

Well, let’s have a look at how natural language counterfactuals behave. Our
survey will provide guidance for our main task: developing a semantics for □→.
As we’ll see, counterfactuals behave very differently from both material and
strict conditionals.

8.1.1 Not truth-functional

Our system for counterfactuals should have the following features:

∼P ⊭ P□→Q
Q ⊭ P□→Q

For consider: I did not strike the match; but it doesn’t logically follow that
if I had struck the match, it would have turned into a feather. So if □→ is to
represent ‘if it had been that…, it would have been that…’, ∼P should not
semantically imply P□→Q. Similarly, George W. Bush (somehow) won the last
United States presidential election, but it doesn’t follow that if the newspapers
had discovered beforehand that Bush had an affair with Al Gore, he would
still have won. So our semantics had better not count P□→Q as a semantic
consequence of Q either.

These implications hold for the material conditional, however (for any φ
and ψ):

∼φ ⊨ φ→ψ
ψ ⊨ φ→ψ

We have our first difference in logical behavior between counterfactuals and
the material conditional →.
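That the material conditional really does validate these two implications can be confirmed by running through the truth table. The following is a small sketch in Python; the helper names imp and entails are made up for the occasion. Since → is truth-functional, it is enough to treat φ and ψ as arbitrary truth values.

```python
from itertools import product

def imp(a, b):
    """Truth function of the material conditional."""
    return (not a) or b

def entails(premise, conclusion):
    """True iff every assignment of truth values to (phi, psi) that makes
    the premise true also makes the conclusion true."""
    return all(conclusion(p, q)
               for p, q in product([True, False], repeat=2)
               if premise(p, q))

# ∼φ ⊨ φ→ψ
from_false_antecedent = entails(lambda p, q: not p, lambda p, q: imp(p, q))
# ψ ⊨ φ→ψ
from_true_consequent = entails(lambda p, q: q, lambda p, q: imp(p, q))

print(from_false_antecedent, from_true_consequent)
```

Both checks should come out True. No such table is available for the counterfactual, which is the point of this section: its truth value is not fixed by the truth values of its parts.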

8.1.2 Can be contingent

It’s not true, presumably, that if Oswald hadn’t shot Kennedy, then someone else
would have (assuming that the conspiracy theory is false). But the conspiracy
theory might have been true; in a possible world in which there is a conspiracy,
it would be true that if Oswald hadn’t shot Kennedy, someone else would
have. Thus, our logic should allow counterfactuals to be contingent statements.
Just because a counterfactual is true, it should not follow logically that it is
necessarily true; and just because a counterfactual is false, it should not follow
logically that it is necessarily false. Our semantics for □→, that is, should have
the following features:

P□→Q ⊭ □(P□→Q)
∼(P□→Q) ⊭ □∼(P□→Q)

One reason this is important is that it shows an obstacle to using the strict
conditional ⇒ to represent natural language counterfactuals. For remember
that φ⇒ψ is defined as □(φ→ψ). As a result:

φ⇒ψ ⊨_S4,S5 □(φ⇒ψ)
∼(φ⇒ψ) ⊨_S5 □∼(φ⇒ψ)

So if, as is commonly supposed, the logic of the □ is at least as strong as S4, we
have a logical mismatch between counterfactuals and the ⇒.

8.1.3 No augmentation

The → and the ⇒ obey the argument form augmentation:

φ→ψ
--------
(φ∧χ)→ψ

φ⇒ψ
--------
(φ∧χ)⇒ψ

That is, φ→ψ ⊨_PL (φ∧χ)→ψ and φ⇒ψ ⊨_K,… (φ∧χ)⇒ψ. However, natural
language counterfactuals famously do not obey augmentation. Consider:

If I were to strike the match, it would light.
Therefore, if I were to strike the match and I was in outer
space, it would light.

So, our next desideratum is that the corresponding argument should not hold
good for □→ (that is, P□→Q ⊭ (P∧R)□→Q.)


8.1.4 No contraposition

→ and ⇒ obey contraposition:

φ→ψ
--------
∼ψ→∼φ

φ⇒ψ
--------
∼ψ⇒∼φ

But counterfactuals do not. Suppose I’m on the firing squad, and we shoot
someone dead. My gun was loaded, but so were those of the others. Then the
premise of the following argument is true, while its conclusion is false:

If my gun hadn’t been loaded, he would still be dead.
Therefore, if he weren’t dead, my gun would have been
loaded.

8.1.5 Some implications

Here is an argument form that intuitively should hold for the □→:

φ□→ψ
--------
φ→ψ

The counterfactual conditional should imply the material conditional. □→ will
then obey modus ponens and modus tollens:

φ
φ□→ψ
--------
ψ

∼ψ
φ□→ψ
--------
∼φ

The reason is, of course, that modus ponens and modus tollens are valid for
the →. (Note that it’s not inconsistent to say that modus tollens holds for the
□→ and also that contraposition fails.)

Another implication: the strict conditional should imply the counterfactual:

φ⇒ψ
--------
φ□→ψ

To see that these implications should hold, consider first the argument from
the strict conditional to the counterfactual conditional. Surely, if φ entails
(necessitates) ψ, then if φ were indeed true, ψ would be as well. As for the
counterfactual implying the material, suppose that you think that if φ were
true, ψ would also be true. Now suppose that someone tells you that φ is true,
but that ψ is false. Wouldn’t you then need to give up your original claim that
if φ were to be true, then ψ would be true? It seems so. So, the statement
φ□→ψ isn’t consistent with φ∧∼ψ; that is, it isn’t consistent with the denial
of φ→ψ.

8.1.6 Context dependence

Years ago, a few of us were at a restaurant in NY: Red Smith, Frank
Graham, Allie Reynolds, Yogi [Berra] and me. At about 11.30 p.m., Ted
[Williams] walked in helped by a cane. Graham asked us what we thought
Ted would hit if he were playing today. Allie said, “due to the better
equipment probably about .350.” Red Smith said, “About .385.” I said,
“due to the lack of really great pitching about .390.” Yogi said, “.220.” We
all jumped up and I said, “You’re nuts, Yogi! Ted’s lifetime average is .344.”
“Yeah” said Yogi “but he is 74 years old.”

–Buzzie Bavasi, baseball executive.

Who was right? If Ted Williams had played at the time the story was told,

would he or wouldn’t he have hit over .300?

Clearly, there’s no single correct answer. The first respondents were imagining
Williams playing as a young man. Understood that way, the answer is, no
doubt: yes, he would have hit over .300. But Berra took the question a different
way: he was imagining Williams hitting as he was then: a 74-year-old man. Berra
took the others off guard, by deliberately (? This is Yogi Berra we’re talking
about) shifting how the question was construed, but he didn’t make a semantic
mistake in so doing. It’s perfectly legitimate, in other circumstances anyway,
to take the question in Berra’s way. (Imagine Williams talking to himself five
years after he retired: “These punks today! If I were playing today, I’d still
hit over .300!”) Counterfactual sentences can be interpreted in different ways
depending on the conversational context in which they are uttered.

Another example:

If Syracuse were in Louisiana, Syracuse winters would

be warm.

True or false? It might seem true: Louisiana is in the south. But wait—perhaps

Louisiana would include Syracuse by extending its borders north to Syracuse’s

actual latitude.


Would Syracuse be warm in the winter? Would Williams hit over .300?

No one answer is correct, once and for all. Which answer is correct depends

on the linguistic context. Whether a counterfactual is true or whether it is false

depends in part on what the speaker means to be saying, and what her audience

takes her to be saying, when she utters the counterfactual. Would Syracuse be

warm?—in some contexts, it would be correct to say yes, and in others, to say

no. When we imagine Syracuse being warm, we imagine reality being different

in certain respects from actuality. In particular, we imagine Syracuse as being

in Louisiana. In other respects, we imagine a situation that is a lot like reality—

we don’t imagine a situation, for example, in which Syracuse and Louisiana

are both located in China. Now, when considering counterfactuals, there is

a question of what parts of reality we hold constant. In the Syracuse-Louisiana

case, we seem to have at least two choices. Do we hold constant the location of

Syracuse, or do we hold constant the borders of Louisiana? The truth value of

the counterfactual depends on which we hold constant.

What determines which things are to be held constant, when we evaluate
the truth value of a counterfactual? In large part: the context of utterance of
the counterfactual. Suppose I am in the middle of the following conversation:

“Syracuse restaurants struggle to survive because the climate there is so bad:

no one wants to go out to eat in the winter. If Syracuse were in Louisiana,

its restaurants would do much better.” In such a context, an utterance of the

counterfactual “If Syracuse were in Louisiana, Syracuse winters would be warm”

would be regarded as true. But if this counterfactual were uttered in the midst of

the following conversation, it would be regarded as false: “You know, Louisiana

is statistically the warmest state in the country. Good thing Syracuse isn’t in

Louisiana, because that would ruin the statistic.”

Does just saying a sentence, intending it to be true, make it true? Well,

sort of! When a certain sentence has a meaning that is partly determined by

context, then when a person utters that sentence with the intention of saying

something true, that tends to create a context in which the sentence is true.

Compare ‘flat’: we’ll say “the table is flat”, and thereby utter a truth. But
a scientist might look at the same table and say “you know, macroscopic objects are
far from being flat. Take that table, for instance. It isn’t flat at all: when viewed
under a microscope, it can be seen to have a very irregular surface”, and thereby
also utter a truth. The term ‘flat’ has a certain amount of vagueness: how flat
does a thing have to be to count as being “flat”? Well, the amount required is
determined by context.2

2 See Lewis (1979).


8.2 The Lewis/Stalnaker approach

Here is the core idea of David Lewis (1973) and Robert Stalnaker (1968) of how
to interpret counterfactual conditionals. Consider a counterfactual conditional
P□→Q. To determine its truth, Lewis and Stalnaker instruct us to consider

the possible world that is as similar to reality as possible, in which P is true.

Then, the counterfactual is true in the actual world if and only if Q is true in

that possible world. Consider Lewis’s example:

If kangaroos had no tails, they would topple over.

When we consider the possible world that would be actual if kangaroos had

no tails, we do not depart gratuitously from actuality. For example, we do

not consider a world in which kangaroos have wings, or crutches. We do not

consider a world with different laws of nature, in which there is no gravity. We

keep the kangaroos as they actually are, but remove the tails, and we keep the

laws of nature as they actually are. It seems that the kangaroos would then fall

over.

Take the examples of the previous section, in which I got you to give

differing answers to certain sentences. Consider:

If Syracuse were in Louisiana, Syracuse winters would

be warm.

How does the contextual dependence of this sentence work, on the Lewis-Stalnaker
view? By supplying different standards of comparison of similarity.
Think about similarity, for a moment: things can be similar in certain respects,
while not being similar in other respects. A blue square is similar to a blue circle
in respect of color, not in respect of shape. Now, according to Lewis and
Stalnaker, when we answer this counterfactual in the affirmative, we consider
the possible world most similar to the actual world in which Syracuse is in
Louisiana, and we use a kind of similarity that weights heavily Louisiana’s
actual borders. When we count the counterfactual false, we are using a kind of
similarity that weights very heavily Syracuse’s actual location.


8.3 Stalnaker’s system3

I now lay out Stalnaker’s system, SC (for “Stalnaker-Counterfactuals”). The
idea is to add the □→ to propositional modal logic.

8.3.1 Syntax of SC

The primitive vocabulary of SC is that of propositional modal logic, plus the
connective □→. Here’s the grammar:

Definition of SC-wff:

· Sentence letters are wffs

· if φ, ψ are wffs then (φ→ψ), ∼φ, □φ, and (φ□→ψ) are wffs

· nothing else is a wff

8.3.2 Semantics of SC

Where R is a three-place relation, let’s abbreviate “Rxyz” as “R_z xy”. And,
where u is any object, let “R_u” be the two-place relation that holds between
objects x and y iff R_u xy. (Think of R_u as the two-place relation that results
from “plugging” up one place of the three-place relation R with object u.)

We can now define SC-models:

Definition of SC-model: An SC-model, M, is an ordered triple ⟨W, ⪯, I⟩,
where:

· W is a nonempty set (“worlds”)

· I is a two-place function that assigns either 0 or 1 to each sentence letter
relative to each w ∈ W (“interpretation function”)

· ⪯ is a three-place relation over W (“nearness relation”)

· The valuation function V_M for M (see below) and ⪯ satisfy the following
conditions:

  C1: for any w, ⪯_w is strongly connected in W

  C2: for any w, ⪯_w is transitive

  C3: for any w, ⪯_w is anti-symmetric

  C4: for any x, y, x ⪯_x y (“Base”)

  C5: for any SC-wff, φ, provided φ is true in at least one world, then for
  every z, there’s some w such that V_M(φ, w) = 1, and such that for
  any x, if V_M(φ, x) = 1 then w ⪯_z x (“Limit”)

3 See Stalnaker (1968). The version of the theory I present here is slightly different from
Stalnaker’s original version; see Lewis (1973, p. 79).

(Recall that a binary relation R is “strongly connected” in set A iff for each
u, v ∈ A, either Ruv or Rvu, and “anti-symmetric” iff u = v whenever both
Ruv and Rvu.)
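For a finite model, the structural constraints on the nearness relation (strong connectivity, transitivity, anti-symmetry and base) can be checked directly. Here is a minimal sketch in Python. The representation is an assumption of the sketch: near[w] is the set of pairs (x, y) such that x is at least as near to w as y is, and the helper from_order builds such a set from a nearest-first list of worlds. The limit condition is not checked, since it quantifies over all wffs; in a finite model it follows from strong connectivity and transitivity, because any nonempty finite set of worlds then has a nearest member.

```python
def strongly_connected(rel, W):
    return all((u, v) in rel or (v, u) in rel for u in W for v in W)

def transitive(rel):
    return all((u, x) in rel
               for (u, v) in rel for (v2, x) in rel if v == v2)

def anti_symmetric(rel):
    return all(u == v for (u, v) in rel if (v, u) in rel)

def base(near, W):
    # every world is at least as near to itself as any world is
    return all((x, y) in near[x] for x in W for y in W)

def from_order(order):
    """Pairs (x, y) with x listed no later than y in a nearest-first list."""
    return {(x, y) for i, x in enumerate(order) for y in order[i:]}

# A hypothetical three-world nearness relation
W = ["r", "a", "b"]
near = {"r": from_order(["r", "a", "b"]),
        "a": from_order(["a", "r", "b"]),
        "b": from_order(["b", "a", "r"])}

ok = base(near, W) and all(strongly_connected(near[w], W)
                           and transitive(near[w])
                           and anti_symmetric(near[w]) for w in W)

# Adding a tie between a and b (relative to r) breaks anti-symmetry.
tied = from_order(["r", "a", "b"]) | {("b", "a")}

print(ok, anti_symmetric(tied))
```

The first value should be True and the second False: the tie is exactly what anti-symmetry rules out.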

Definition of SC-valuation: Where M (= ⟨W, ⪯, I⟩) is any SC-model, the
SC-valuation for M, V_M, is defined as the two-place function that assigns
either 0 or 1 to each SC-wff relative to each member of W, subject to the
following constraints, where α is any sentence letter, φ and ψ are any wffs, and
w is any member of W:

i) V_M(α, w) = I(α, w)

ii) V_M(∼φ, w) = 1 iff V_M(φ, w) = 0

iii) V_M(φ→ψ, w) = 1 iff either V_M(φ, w) = 0 or V_M(ψ, w) = 1

iv) V_M(□φ, w) = 1 iff for any v, V_M(φ, v) = 1

v) V_M(φ□→ψ, w) = 1 iff for any x, IF [V_M(φ, x) = 1 and for any y such that
V_M(φ, y) = 1, x ⪯_w y] THEN V_M(ψ, x) = 1
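The five clauses translate almost word for word into a recursive function. What follows is a sketch in Python under stated assumptions: formulas are encoded as nested tuples, near[w] is the set of pairs (x, y) such that x is at least as near to w as y is, and the two-world “match” model is invented for illustration (P: the match is struck; Q: it lights).

```python
def sc_val(f, w, W, near, I):
    """SC-valuation of formula f at world w, following clauses i) to v)."""
    op = f[0]
    if op == "atom":                                   # clause i)
        return I.get((f[1], w), 0)
    if op == "not":                                    # clause ii)
        return 1 - sc_val(f[1], w, W, near, I)
    if op == "imp":                                    # clause iii)
        return int(sc_val(f[1], w, W, near, I) == 0
                   or sc_val(f[2], w, W, near, I) == 1)
    if op == "box":                                    # clause iv)
        return int(all(sc_val(f[1], v, W, near, I) == 1 for v in W))
    if op == "cf":                                     # clause v)
        phi, psi = f[1], f[2]
        phi_worlds = [y for y in W if sc_val(phi, y, W, near, I) == 1]
        for x in phi_worlds:
            nearest = all((x, y) in near[w] for y in phi_worlds)
            if nearest and sc_val(psi, x, W, near, I) == 0:
                return 0
        return 1
    raise ValueError(f"unknown connective: {op}")

# Hypothetical model: at r the match is not struck; at a it is struck and lights.
W = ["r", "a"]
near = {"r": {("r", "r"), ("r", "a"), ("a", "a")},
        "a": {("a", "a"), ("a", "r"), ("r", "r")}}
I = {("P", "a"): 1, ("Q", "a"): 1}

P, Q, S = ("atom", "P"), ("atom", "Q"), ("atom", "S")
print(sc_val(("cf", P, Q), "r", W, near, I))           # P□→Q at r
print(sc_val(("cf", P, ("not", Q)), "r", W, near, I))  # P□→∼Q at r
print(sc_val(("cf", S, Q), "r", W, near, I))           # S□→Q at r (S true nowhere)
```

The values should be 1, 0 and 1. The last counterfactual comes out true only because no world makes its antecedent true, so the IF-part of clause v) is never satisfied.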

Phew! Let’s look into what this means.

First, notice that much of this is exactly the same as for our MPL models—

we still have the set of worlds, and formulas being given truth values at worlds.

We’ll still say that φ is true “at w” iff V(φ, w) = 1.

What happened to the accessibility relation? It has simply been dropped,
in favor of a simplified truth-clause for the □. □φ is now true at a world w iff
φ is true at all worlds in the model, not just all worlds accessible from w. It
turns out that this in effect just gives us an S5 logic for the □, for you get the
same valid formulas for MPL, whether you make the accessibility relation an
equivalence relation, or a total relation. Clearly, if φ is valid in all equivalence
relation models, then it is valid in all total models, since every total relation is
an equivalence relation. What’s more, the converse is true: if φ is valid in all
total models then it’s also valid in all equivalence relation models:


Rough proof. Let φ be any formula that’s valid in all total models, and let M be
any equivalence relation model. We need to show that φ is true in an arbitrary
world r ∈ W (M’s set of worlds). Now, any equivalence relation partitions
its domain into non-overlapping subsets in which each world sees every other
world. So W is divided up into one or more non-overlapping subsets. One of
these, W_r, contains r. Now, consider a model, M′, just like M, but whose
set of worlds is just W_r. M′ is a total model, so φ is valid in it by hypothesis.
Thus, in this model, φ is true at r. But then φ is true at r in M, as well.
Why? Roughly: the truth value of φ at r in M isn’t affected by what goes
on outside r’s partition, since chains of modal operators just take us to worlds
seen by r, and worlds seen by worlds seen by r, and… Such chains will never
have us “look at” anything out of r’s partition, since these worlds are utterly
unconnected to r via the accessibility relation. So φ’s truth value at r in M is
determined by what goes on in W_r, and so is the same as its truth value at r in
M′.

So, we get the same class of valid formulas whether we require the accessibility
relation to be total, or an equivalence relation. Things are easier if we make
it a total relation, because then we can simply drop talk of the accessibility
relation and define necessity as truth at all worlds. The corresponding clause
for possibility is:

· V_M(◇φ, w) = 1 iff for some v, V_M(φ, v) = 1

The derived clauses for the other connectives remain the same:

· V_M(φ∧ψ, w) = 1 iff V_M(φ, w) = 1 and V_M(ψ, w) = 1

· V_M(φ∨ψ, w) = 1 iff V_M(φ, w) = 1 or V_M(ψ, w) = 1

· V_M(φ↔ψ, w) = 1 iff V_M(φ, w) = V_M(ψ, w)

Next, what about this nearness relation? Read “x ⪯_z y” as “x is at least as
near to/similar to z as is y”; thus, think of ⪯ as the similarity relation between
possible worlds that we talked about before. In order to evaluate whether
x ⪯_w y, we place ourselves in possible world w, and we ask whether x is at least
as similar to our world as y is. (Recall the bit about counterfactual conditionals
being highly context dependent. In a full treatment of counterfactuals, we
would complicate our semantics by introducing contexts of utterance, and evaluate
sentences relative to these contexts of utterance. The point of this would be to
allow different nearness relations in the different contexts.)


I say “we can think of” ⪯ as a similarity relation, but take this with a grain
of salt: just as our definitions allow the members of W to be any old things, so
⪯ is allowed to be any old relation over W. Just as the members of W could be
fish, so the ⪯ relation could be any old relation over fish. (But as before, if the
truth conditions for natural language counterfactuals have nothing in common
with the truth conditions for □→ statements in our models, the interest in our
semantics is diminished, since our models wouldn’t be modeling the behavior of
natural language counterfactuals.)

The constraints on the formal properties of the nearness relation (certain
of them, at least) seem plausible constraints on ⪯ if it is to be thought of as
a similarity relation. C1 (strong connectivity) simply says that it makes sense
to compare any two worlds in respect of similarity to a given world. C2
(transitivity) has a transparent meaning. C3 (anti-symmetry) means “no ties”:
it says that, relative to a given base world w, it is never the case that there are
two separate worlds x and y such that each is at least as close to w as the other.
C4 is the “base” axiom: it says that every world is at least as close to itself as
every other. Given C3, it has the further implication that every world is closer
to itself than every other. (We define “x is closer to w than y is” (x ≺_w y) to
mean x ⪯_w y and not: y ⪯_w x.) C5 is called the “limit” assumption: according
to it, for any formula φ and any base world w, there is some world that is a
closest world to w in which φ is true (that is, unless φ isn’t true at any worlds
at all). This rules out the following possibility: there are no closest φ worlds,
only an infinite chain of φ worlds, each of which is closer than the previous.
Certain of these assumptions have been challenged, especially C3 and C5. We
will consider those issues below.

Note how condition C5 in the definition of an SC-model made reference
to the valuation function that we went on to define. This is in contrast to
our earlier definitions of models, in which the definition of a model made
no reference to the valuation function. The reason for this difference is that
constraint C5 (the limit assumption) is a constraint that relates the nearness
relation to the truth values of all formulas, complex or otherwise: it says that
any formula φ that is true somewhere is true in some closest-to-w world.

Given our definitions, we can define validity and semantic consequence:

· φ is SC-valid (⊨_SC φ) iff φ is true at every world in every SC-model

· Γ SC-semantically implies φ (Γ ⊨_SC φ) iff for every SC-model and every
world w in that model, if every member of Γ is true at w then φ is also
true at w


8.4 Validity proofs in SC

Given this semantic system, we can give semantic validity proofs just as we did
for the various modal systems.

Example 8.1: Let’s show that the formula (P∧Q)→(P□→Q) is SC-valid.
We pick an arbitrary SC-model, ⟨W, ⪯, I⟩, pick an arbitrary world r ∈ W, and
show that this formula is true at r:

i) Suppose for reductio that V((P∧Q)→(P□→Q), r) = 0

ii) then V(P, r) = 1, V(Q, r) = 1, and V(P□→Q, r) = 0

iii) the truth condition for □→ says that P□→Q is true at r iff for every
closest P-world (to r), Q is true as well. So since P□→Q is false at r,
there must be a closest-to-r P-world at which Q is false; that is, there
is some world a such that:

  a) V(P, a) = 1

  b) for any x, if V(P, x) = 1 then a ⪯_r x

  c) V(Q, a) = 0

iv) Since V(P, r) = 1 (line ii)), given b), a ⪯_r r

v) by “base”, r ⪯_r a. So, by anti-symmetry, r = a. But now, from lines c)
and ii), Q is both true and false at r

Example 8.2: Show that ⊨_SC [(P□→Q)∧((P∧Q)□→R)]→[P□→R]. (This
formula is worth taking note of, because it is valid despite its similarity to the
invalid formula [(P□→Q)∧(Q□→R)]→[P□→R]):

i) Suppose for reductio that P□→Q and (P∧Q)□→R are true at r, but
P□→R is false there.

ii) Then there’s a world, a, that is a nearest-to-r P world, in which R is
false.

iii) Since P□→Q is true at r, Q is true in all the nearest-to-r P worlds, and
so V(Q, a) = 1.

iv) Note now that a is a nearest-to-r P∧Q world:

  a) P and Q are both true there, so P∧Q is true there.

  b) let x be any P∧Q world. x is then a P world. But since a is a
  nearest-to-r P world, we know that a ⪯_r x. (Remember: “a is a nearest-to-r
  P world” means: “V(P, a) = 1, and for any x, if V(P, x) = 1 then
  a ⪯_r x”.)

v) Since (P∧Q)□→R is true at r, it follows that R is true at a. This
contradicts ii).
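Proofs like those in examples 8.1 and 8.2 can be sanity-checked by brute force over small models. The sketch below, in Python, is only that: surviving the search is not a proof of SC-validity, though a failure does yield a genuine countermodel. It rests on the following assumptions. In a finite model, a strongly connected, transitive and anti-symmetric relation is a linear order, so each nearness relation is represented as a nearest-first tuple of worlds with the base world first; the limit assumption then holds automatically. The tuple encoding of formulas and the names sc_val, models and valid_up_to are made up for illustration.

```python
from itertools import permutations, product

def sc_val(f, w, order, I):
    """Value of f at w; order[w] lists all worlds, nearest to w first."""
    op = f[0]
    if op == "atom":
        return I.get((f[1], w), 0)
    if op == "not":
        return 1 - sc_val(f[1], w, order, I)
    if op == "and":
        return min(sc_val(f[1], w, order, I), sc_val(f[2], w, order, I))
    if op == "imp":
        return int(sc_val(f[1], w, order, I) == 0
                   or sc_val(f[2], w, order, I) == 1)
    if op == "cf":
        for x in order[w]:                      # scan outward from w
            if sc_val(f[1], x, order, I) == 1:  # x is the nearest antecedent world
                return sc_val(f[2], x, order, I)
        return 1                                # no antecedent world: vacuously true
    raise ValueError(f"unknown connective: {op}")

def models(n, letters):
    """Every SC-model on worlds 0..n-1 interpreting the given letters."""
    W = list(range(n))
    choices = [[(w,) + p for p in permutations([x for x in W if x != w])]
               for w in W]                      # "base": w comes first in its own order
    keys = [(l, w) for l in letters for w in W]
    for combo in product(*choices):
        for bits in product([0, 1], repeat=len(keys)):
            yield W, dict(zip(W, combo)), dict(zip(keys, bits))

def valid_up_to(f, letters, max_worlds=3):
    return all(sc_val(f, w, order, I) == 1
               for n in range(1, max_worlds + 1)
               for W, order, I in models(n, letters)
               for w in W)

P, Q, R = ("atom", "P"), ("atom", "Q"), ("atom", "R")
cf = lambda a, b: ("cf", a, b)
ex81 = ("imp", ("and", P, Q), cf(P, Q))
ex82 = ("imp", ("and", cf(P, Q), cf(("and", P, Q), R)), cf(P, R))
trans = ("imp", ("and", cf(P, Q), cf(Q, R)), cf(P, R))

print(valid_up_to(ex81, "PQ"), valid_up_to(ex82, "PQR"), valid_up_to(trans, "PQR"))
```

The first two formulas should survive the search, and the third should not: three worlds are already enough for a countermodel to [(P□→Q)∧(Q□→R)]→[P□→R].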

Exercise 8.1 Show that in the SC semantics, the counterfactual
conditional is intermediate in strength between the strict and material
conditionals. That is, show that:

a) φ⇒ψ ⊨ φ□→ψ

b) φ□→ψ ⊨ φ→ψ

8.5 Countermodels in SC

In this section we’ll learn how to construct countermodels in SC. Along the
way we’ll also look at how to decide whether a given formula is SC-valid or
SC-invalid. As with plain old modal logic, the best strategy is to attempt to
come up with a countermodel. If you fail, then you can use your failed attempt
to guide the construction of a validity proof.

We can use diagrams like those from section 6.3.4 to represent SC-countermodels.
The diagrams will be a little different though. They will still contain
boxes (rounded now, to distinguish them from the old countermodels) in which
we put formulas; and we again indicate truth values of formulas with small
numbers above the formulas. But since there is no accessibility relation, we
don’t need the arrows between the boxes. And since we need to represent the
nearness relation, we will arrange the boxes vertically. At the bottom goes a box
for the world, r, of our model in which we’re trying to make a given formula
false. We string the other worlds in the diagram above this bottom world r:
the further away a world is from r in the ⪯_r ordering, the further above r we
place it in the diagram. Thus, a countermodel for the formula ∼P→(P□→Q)
might look as follows:


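[Diagram: three worlds stacked vertically. Top: world b, in which P is 1 and
Q is 1. Middle: world a, in which P is 1 and Q is 0. Bottom: world r, in which
∼P→(P□→Q) is 0 (∼P is 1, P is 0, P□→Q is 0). A “no P” marker to the left
spans worlds a and r.]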

In this diagram, the world we’re primarily focusing on is the bottom world,
world r. The nearest world to r is world r itself. The next nearest world to r is
the next world moving up from the bottom: world a. The furthest world from
r is world b. Notice that P is false at world r, and true at worlds a and b. Thus,
a is the nearest world to r in which P is true. Since Q is false at world a, that
makes the counterfactual P□→Q false at world r. Since ∼P is true at r, the
entire material conditional ∼P→(P□→Q) is false at world r, as desired. (World
b isn’t needed in this countermodel; I included it merely for illustration.) The
“no P” sign to the left of worlds a and r is a reminder to ourselves in case we
want to add further worlds to the diagram: don’t include any worlds between a
and r in which P is true. Otherwise world a would no longer be the nearest P
world.
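The countermodel just described can be checked with a few lines of Python. This is a sketch under assumptions: each nearness relation is written as a nearest-first list of worlds, the diagram fixes only the ordering relative to r, so the orderings relative to a and b are arbitrary choices of mine (they play no role when evaluating at r), and Q is taken to be false at r, which the diagram leaves open.

```python
def sc_val(f, w, order, I):
    """Value of f at w; order[w] lists all worlds, nearest to w first."""
    op = f[0]
    if op == "atom":
        return I.get((f[1], w), 0)
    if op == "not":
        return 1 - sc_val(f[1], w, order, I)
    if op == "imp":
        return int(sc_val(f[1], w, order, I) == 0
                   or sc_val(f[2], w, order, I) == 1)
    if op == "cf":
        for x in order[w]:                      # scan outward from w
            if sc_val(f[1], x, order, I) == 1:  # nearest antecedent world
                return sc_val(f[2], x, order, I)
        return 1                                # vacuously true
    raise ValueError(f"unknown connective: {op}")

order = {"r": ["r", "a", "b"],   # read off the diagram, bottom to top
         "a": ["a", "r", "b"],   # assumed; not fixed by the diagram
         "b": ["b", "a", "r"]}   # assumed; not fixed by the diagram
I = {("P", "a"): 1, ("P", "b"): 1, ("Q", "b"): 1}  # everything else 0

P, Q = ("atom", "P"), ("atom", "Q")
target = ("imp", ("not", P), ("cf", P, Q))

print(sc_val(("cf", P, Q), "r", order, I))  # P□→Q at r
print(sc_val(target, "r", order, I))        # ∼P→(P□→Q) at r
```

Both values should be 0. Placing a world in which P and Q are both true between r and a would make the first value 1, which is why the “no P” reminder matters.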

What strategy should one use for constructing SC-countermodels? As we
saw in section 6.3.4, a good policy is to make “forced” moves first. For example,
if you are committed to making a material conditional false at a world, go
ahead and make its antecedent true and consequent false in that world, right
away. In fact, a false counterfactual also forces certain moves. It follows from
the truth condition for the □→ that if φ□→ψ is false at world w, then there
exists a nearest-to-w φ world at which ψ is false. So if you put a 0 overtop of
a counterfactual φ□→ψ in some world w, it’s good to do the following two
things right away. First, add a nearest-to-w world in which φ is true (if such a
world isn’t already present in your diagram). And second, make ψ false there.

True counterfactuals don’t force your hand quite so much, since there are
two ways for a counterfactual to be true. If φ□→ψ is true at w, then ψ must be
true at every nearest-to-w φ world. This could happen, not only if there exists
a nearest-to-w φ world in which ψ is true, but also if there are no nearest-to-w
φ worlds. In the latter case we say that φ□→ψ is “vacuously true” at w. A
counterfactual can be vacuously true only when its antecedent is necessarily
false, since the limit assumption guarantees that if there is at least one φ world,
then there is a nearest φ world. So: if you want to make a counterfactual true at
a world, it’s a good idea to wait until you’ve been forced to make its antecedent
true at at least one world. Only when this has happened, thus closing off
the possibility of making the counterfactual vacuously true, should you add a
nearest world in which its antecedent is true, and make its consequent true at
that nearest antecedent-world.

Suppose, for example, that we want to show that the following formula is

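SC-invalid: [(P□→Q)∧(Q□→R)]→(P□→R). We begin as follows:

[Diagram: a single world r, in which P□→Q, Q□→R, and their conjunction are marked 1, and both P□→R and the whole conditional [(P□→Q)∧(Q□→R)]→(P□→R) are marked 0.]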

In keeping with the advice I gave a moment ago, let’s deal with the false

counterfactual first: let’s make P□→R false in r. This means that we need to

add a nearest-to-r P world in which R is false. At this point, nothing prevents

us from making this world r itself, but that might collide with other things we

might want to do later, so I’ll make this nearest-to-r P world a distinct world

from r:

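[Diagram, the view from r (nearest world at the bottom):
world a: P = 1, R = 0, Q = 1
(“no P” zone between r and a)
world r: P = 0; P□→Q, Q□→R, and their conjunction marked 1; P□→R and the whole conditional marked 0.]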

“No P” reminds me not to add any P-worlds between a and r. Since world r is

in the “no P zone”, I made P false there.

Notice that I made Q true in a. This is because P□→Q is true in r. This

formula says that Q is true in the nearest-to-r P world; and a is the nearest-to-r

P world. In general, whenever you add a new world to one of these diagrams,

you should go back to all the counterfactuals in the bottom world and see

whether they require their consequents to have certain truth values in the new

world.


Now for the final counterfactual Q□→R. This can be true in two ways:

either there is no Q world at all (the vacuous case), or there is a nearest-to-r

Q world in which R is true. Q is already true in world a, so the vacuous

case is ruled out. So we must include a nearest-to-r Q world, call it “b”, and

make R true there. Where will we put this new world b? There are three

possibilities. World b could be farther away from, identical to, or closer to r

than a. (These are the only three possibilities, given anti-symmetry.) Let’s try

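the first possibility:

[Diagram, the view from r (nearest world at the bottom):
world b: Q = 1, R = 1
world a: P = 1, R = 0, Q = 1
(“no P” zone between r and a; “no Q” zone between r and b)
world r: P = 0, Q = 0; P□→Q, Q□→R, and their conjunction marked 1; P□→R and the whole conditional marked 0.]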

This doesn’t work, because world a is in the no-Q zone, but Q is true at world

a. Put another way: in this diagram, b isn’t the nearest-to-r Q world; world a

is. And so, since R is false at world a, the counterfactual Q□→R would come

out false at world r, whereas we want it to be true.

Likewise, we can’t make world b be identical to world a, since we need to

make R true in b and R is already false in a.

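But the final possibility works out just fine: let world b be closer to r than

a:

[Diagram, the view from r (nearest world at the bottom):
world a: P = 1, R = 0, Q = 1
world b: Q = 1, R = 1, P = 0
(“no P” zone between r and a; “no Q” zone between r and b)
world r: P = 0, Q = 0; P□→Q, Q□→R, and their conjunction marked 1; P□→R and the whole conditional marked 0.]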

Notice that I made P false in b, since b is in the no P zone. Here’s the official

model:

W = {r, a, b}

⪯r = {⟨b,a⟩ . . .}

I(P,a) = I(Q,a) = I(Q,b) = I(R,b) = 1, all others 0

In this official model I left out a lot in giving the similarity relation for this

model. First, I left out some of the elements of ⪯r. Fully written out, it would

be:

⪯r = {⟨r,b⟩, ⟨b,a⟩, ⟨r,a⟩, ⟨r,r⟩, ⟨a,a⟩, ⟨b,b⟩}

I left out ⟨r,b⟩ because it gets included automatically given the “base” assumption

(C4). Also, the element ⟨r,a⟩ is required to make ⪯r transitive. The

elements ⟨r,r⟩, ⟨a,a⟩, and ⟨b,b⟩ were entered to make that relation reflexive.

(Why must it be reflexive? Because reflexivity comes from strong connectivity.

Let w and x be any members of W; we get (x ⪯w x or x ⪯w x) from strong

connectivity of ⪯w, and hence x ⪯w x.) My policy will be to write out enough

of ⪯r so that the rest can be inferred, given the definition of an SC-model.

Secondly, this isn’t a complete writing out of ⪯ itself; it is just ⪯r. To be complete,

we’d need to write out ⪯a and ⪯b. But in this case, these latter two parts of ⪯

don’t matter, so I omitted them. (Later we’ll consider cases where we need to

consider more of ⪯ than simply ⪯r.)
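The countermodel to hypothetical syllogism can also be checked mechanically. What follows is a minimal Python sketch, not part of the official apparatus: the tuple encoding of formulas, the function name `holds`, and the dictionary `order` (which lists the worlds nearest-first from each world) are assumptions made for illustration. The orderings from a and b are arbitrary, since those parts of the similarity relation were left unspecified above.

```python
# Formulas are nested tuples: ("atom", "P"), ("not", f), ("and", f, g),
# ("imp", f, g), and ("cf", f, g) for the counterfactual f □→ g.
# order[w] lists all worlds from nearest-to-w to farthest (no ties),
# so in a finite model the limit assumption holds automatically.

def holds(f, w, order, true_at):
    op = f[0]
    if op == "atom":
        return (f[1], w) in true_at
    if op == "not":
        return not holds(f[1], w, order, true_at)
    if op == "and":
        return holds(f[1], w, order, true_at) and holds(f[2], w, order, true_at)
    if op == "imp":
        return (not holds(f[1], w, order, true_at)) or holds(f[2], w, order, true_at)
    if op == "cf":
        for x in order[w]:                      # scan outward from w
            if holds(f[1], x, order, true_at):  # x is the nearest antecedent-world
                return holds(f[2], x, order, true_at)
        return True                             # no antecedent-world: vacuously true
    raise ValueError(f"unknown operator: {op}")

P, Q, R = ("atom", "P"), ("atom", "Q"), ("atom", "R")

order = {
    "r": ["r", "b", "a"],   # the view from r, as in the final diagram
    "a": ["a", "r", "b"],   # arbitrary: unspecified in the official model
    "b": ["b", "r", "a"],   # arbitrary: unspecified in the official model
}
true_at = {("P", "a"), ("Q", "a"), ("Q", "b"), ("R", "b")}  # all others 0

p_q = ("cf", P, Q)
q_r = ("cf", Q, R)
p_r = ("cf", P, R)
syllogism = ("imp", ("and", p_q, q_r), p_r)

results = {
    name: holds(f, "r", order, true_at)
    for name, f in [("P>Q", p_q), ("Q>R", q_r), ("P>R", p_r), ("whole", syllogism)]
}
print(results)
```

At r the two premises come out true and both P□→R and the whole conditional come out false, which is what the diagram records. Changing the arbitrary orderings from a and b makes no difference here, because the formula contains no nested counterfactuals.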

Example 8.3: Is the formula (P□→R)→((P∧Q)□→R) valid or invalid? (This

is the formula corresponding to the inference pattern of augmentation.) The

best approach to such problems is to first attempt to find a countermodel. In

this case we succeed:

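[Diagram, the view from r (nearest world at the bottom):
world a: P = 1, Q = 1 (so P∧Q = 1), R = 0
world b: P = 1, R = 1, Q = 0
(“no P∧Q” zone between r and a; “no P” zone between r and b)
world r: P = 0; P□→R marked 1; (P∧Q)□→R and the whole conditional (P□→R)→[(P∧Q)□→R] marked 0.]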

I began with the false counterfactual: (P∧Q)□→R. This forced the existence of a nearest

P∧Q world, in which R was false. But since P∧Q was true there, P was true

there; this ruled out the true P□→R in r being vacuously true. So I was forced

to consider the nearest P world. It couldn’t be farther out than a, since P is

true in a. It couldn’t be a, since R was already false there. So I had to put it

nearer than a. Notice that I had to make Q false at b. Why? Well, it was in the

“no P∧Q zone”, and I had made P true in it. Here’s the official model:

W = {r, a, b}

⪯r = {⟨b,a⟩ . . .}

I(P,a) = I(Q,a) = I(P,b) = I(R,b) = 1, all else 0

Example 8.4: Determine whether ⊨SC ◇P→[(P□→Q)→∼(P□→∼Q)]. An

attempt to find a countermodel fails at the following point:

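[Diagram, the view from r (nearest world at the bottom):
world a: P = 1, ∼Q = 1 (so Q = 0), but also Q = 1
(“no P” zone between r and a)
world r: ◇P = 1, P = 0; P□→Q and P□→∼Q both marked 1; ∼(P□→∼Q), the inner conditional, and the whole formula marked 0.]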

At world a, I’ve got Q being both true and false. A word about how I got

to that point. I noticed that I had to make two counterfactuals true: P□→Q

and P□→∼Q. Now, this isn’t a contradiction all by itself. Remember that

counterfactuals are vacuously true if their antecedents are impossible. So if P

were impossible, then both of these would indeed be true, without any problem.

But ◇P has to be true at r. This rules out those counterfactuals’ being vacuously

true. Since P is possible, the limit assumption has the result that there is a closest

P world. This, together with the two true counterfactuals, created the contradiction.

This reasoning is embodied in the following semantic validity proof:

i) Suppose for reductio that (P□→Q)→∼(P□→∼Q) is false at some world

r in which ◇P is true.

ii) Then P□→Q and P□→∼Q are both true at r as well.

iii) Since ◇P is true at r, P is true at some world. So, by the limit assumption,

we have: there exists a world, a, such that V(P,a) = 1 and for any x, if

V(P,x) = 1 then a ⪯r x. For short, a is a closest-to-r P world.

iv) The truth condition for □→, applied to P□→Q, gives us that Q is true

at all the closest-to-r P worlds.

v) Similarly, applied to P□→∼Q, it gives us that ∼Q is true at all the

closest-to-r P worlds.

vi) Thus, both Q and ∼Q would be true at a. Impossible.

Note the use of the limit assumption. It is the limit assumption we must use

when we need to know that there is a nearest φ-world, in cases where we can’t

get this knowledge from other things in the proof.
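A failed search for a countermodel can be backed up by brute force on small models. The following Python sketch is an illustration only and no substitute for the semantic validity proof above: it enumerates every model with at most four worlds and looks for one in which ◇P→[(P□→Q)→∼(P□→∼Q)] is false at the root world. The function names, the use of integers for worlds (0 plays the role of r), and the bound of four worlds are all assumptions made for the example. Since the formula contains no nested counterfactuals, only the ordering from the root matters, so the other orderings are not generated.

```python
from itertools import permutations, product

def nearest(order, pred):
    """Return the first world in `order` satisfying `pred`, or None."""
    for x in order:
        if pred(x):
            return x
    return None

def formula_true_at_root(order_from_root, P, Q):
    """Evaluate ◇P → [(P □→ Q) → ∼(P □→ ∼Q)] at the root world.

    P and Q are the sets of worlds at which each letter is true;
    ◇P is read as 'P is true at some world'.
    """
    possible_p = len(P) > 0
    a = nearest(order_from_root, lambda x: x in P)        # nearest P world, if any
    p_would_q = True if a is None else (a in Q)           # vacuous if no P world
    p_would_not_q = True if a is None else (a not in Q)
    inner = (not p_would_q) or (not p_would_not_q)        # (P□→Q) → ∼(P□→∼Q)
    return (not possible_p) or inner

def search(max_worlds=4):
    """Look for a countermodel among all models with up to max_worlds worlds."""
    for n in range(1, max_worlds + 1):
        worlds = list(range(n))
        for rest in permutations(worlds[1:]):
            order = [0, *rest]                            # base: the root is nearest to itself
            for bits in product([False, True], repeat=2 * n):
                P = {w for w in worlds if bits[w]}
                Q = {w for w in worlds if bits[n + w]}
                if not formula_true_at_root(order, P, Q):
                    return order, P, Q
    return None

countermodel = search()
print(countermodel)
```

The search finds no countermodel, which agrees with the proof: once some P world exists, a finite ordering always has a nearest one, and Q cannot be both true and false there.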

Cases where one counterfactual is nested within another call for something

new. Let’s consider how to show that [P□→(Q□→R)]→[(P∧Q)□→R] is

SC-invalid (this is the formula corresponding to “importation”). We begin by

making the formula false in r, the actual world of the model. This means making

the antecedent true and the consequent false. Now, since the consequent is a

false counterfactual, we are forced to make there be a nearest P∧Q world in

which R is false:


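[Diagram, the view from r (nearest world at the bottom):
world a: P = 1, Q = 1 (so P∧Q = 1), R = 0
(“no P∧Q” zone between r and a)
world r: P□→(Q□→R) marked 1; (P∧Q)□→R and the whole conditional marked 0.]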

Next we must make P□→(Q□→R) true. We can’t make it vacuously true,

because we’ve already got a P-world in the model: a. So, we’ve got to put in

the nearest-to-r P world. Could it be farther away than a? No, because a would

be a closer P world. Could it be a? No, because we’ve got to make Q□→R true

in the closest P world, and since Q is true but R is false in a, Q□→R is already

false in a (a is itself the nearest-to-a Q world). So it must be nearer to r than a:

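[Diagram, the view from r (nearest world at the bottom):
world a: P = 1, Q = 1 (so P∧Q = 1), R = 0
world b: P = 1, Q = 0, Q□→R = 1
(“no P∧Q” zone between r and a; “no P” zone between r and b)
world r: P = 0; P□→(Q□→R) marked 1; (P∧Q)□→R and the whole conditional marked 0.]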

(Why did I make Q false at b? Well, because b is in the no P∧Q zone, and P is

true at b, so Q had to be false there.)

The remaining thing to do is to make Q□→R true at b. This requires some

thought. The diagram right now represents “the view from r”: it represents

how near the worlds in the model are to r. That is, it represents the ⪯r relation.

But the truth value of Q□→R at b depends on “the view from b”, that is, the

⪯b relation. So we need to consider a new diagram, in which b is the bottom

world:


[Diagram, the view from b (nearest world at the bottom):
world c: Q = 1, R = 1
(“no Q” zone between b and c)
world b: P = 1, Q = 0, Q□→R = 1]

I made there be a nearest-to-b Q world, and made R true there. Notice that

I kept the old truth values of b from the other diagram. This is because this

new diagram is a diagram of the same worlds as the old diagram; the difference

is that the new diagram represents the ⪯b nearness relation, whereas the old

one represented a different relation: ⪯r. Now, this diagram isn’t finished. The

diagram is that of the ⪯b relation, and that relation relates all the worlds in the

model. So, worlds r and a have to show up somewhere here. The safest practice

is to put them far away from b, so that there isn’t any possibility of conflict

with the no Q zone that has been established. Thus, the final appearance of

this part of the diagram is as follows:

[Diagram, the view from b (nearest world at the bottom):
world r
world a
world c: Q = 1, R = 1
(“no Q” zone between b and c)
world b: P = 1, Q = 0, Q□→R = 1]

The old truth values from worlds r and a are still in effect (remember that this

is another diagram of the same model, but representing a different nearness

relation), but I left them out because they’ve already been

written on the other part of the diagram.

Notice that the order of the worlds in the r-diagram does not in any way

affect the order of the worlds on the b diagram. The nearness relations in


the two diagrams are completely independent, because in our definition of

‘SC-model’, we entered in no conditions constraining the relations between ⪯i

and ⪯j when i ≠ j. This sometimes seems unintuitive. For example, we could

have two halves of a model looking as follows:

The view from r    The view from a

c                  r

b                  b

a                  c

r                  a

It might, for example, seem odd that b is physically closer to a than to c in the

view from r, but not in the view from a. But remember that in any diagram,

only some of the features are intended to be genuinely representative. These

diagrams are in ink, but this is not intended to convey the idea that the worlds in

the model are made of ink. This feature of the diagram isn’t intended to convey

information about the model. Analogously, the fact that b is physically closer

to a than to c in the view from r is not intended to convey the information that,

in the model, b ⪯a c. In fact, the diagram of the view from r is only intended to

convey information about ⪯r; it doesn’t carry any information about ⪯a, ⪯b, or

⪯c.

Back to the countermodel. That other part of the diagram, the view from r,

must be updated to include world c. The safest procedure is to put c far away

on the model to minimize the possibility of conflict. Thus, the final picture of the

view from r is:


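[Diagram, the view from r (nearest world at the bottom):
world c
world a: P = 1, Q = 1 (so P∧Q = 1), R = 0
world b: P = 1, Q = 0, Q□→R = 1
(“no P∧Q” zone between r and a; “no P” zone between r and b)
world r: P = 0; P□→(Q□→R) marked 1; (P∧Q)□→R and the whole conditional marked 0.]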

Again, I haven’t re-written the truth values in world c, because they’re already

in the other diagram, but they are to be understood as carrying over. Now for

the official model:

W = {r, a, b, c}

⪯r = {⟨b,a⟩, ⟨a,c⟩ . . .}

⪯b = {⟨c,a⟩, ⟨a,r⟩ . . .}

I(P,a) = I(Q,a) = I(P,b) = I(Q,c) = I(R,c) = 1, all else 0

Notice that we needed to express two of ⪯’s subrelations: ⪯r and ⪯b. Remember

that any model has got to contain ⪯i for every world i in the model. For

example, if we were to write out this model completely officially, we’d have to

specify ⪯a and ⪯c. But we don’t bother with those parts of ⪯ that don’t matter.
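Because a nested counterfactual is evaluated from a different vantage point, it helps to see the two orderings at work together. The Python sketch below is an illustration under stated assumptions: formulas are represented as functions from worlds to truth values, and the helper names `atom`, `conj`, `imp`, and `would` are made up for the example. The orderings from r and b follow the official model; those from a and c are arbitrary, since they were left unspecified.

```python
# Each formula is a function from a world to True/False.
# order[w] lists all worlds from nearest-to-w to farthest (no ties).

def atom(letter, true_at):
    return lambda w: (letter, w) in true_at

def conj(f, g):
    return lambda w: f(w) and g(w)

def imp(f, g):
    return lambda w: (not f(w)) or g(w)

def would(f, g, order):
    """f □→ g: g holds at the nearest f-world, judged from the world of evaluation."""
    def truth(w):
        for x in order[w]:      # the ordering used depends on w
            if f(x):
                return g(x)
        return True             # no f-world at all: vacuously true
    return truth

order = {
    "r": ["r", "b", "a", "c"],  # the view from r
    "b": ["b", "c", "a", "r"],  # the view from b
    "a": ["a", "r", "b", "c"],  # arbitrary: unspecified in the official model
    "c": ["c", "r", "a", "b"],  # arbitrary: unspecified in the official model
}
true_at = {("P", "a"), ("Q", "a"), ("P", "b"), ("Q", "c"), ("R", "c")}

P, Q, R = (atom(letter, true_at) for letter in "PQR")

q_would_r = would(Q, R, order)
antecedent = would(P, q_would_r, order)        # P □→ (Q □→ R)
consequent = would(conj(P, Q), R, order)       # (P∧Q) □→ R
importation = imp(antecedent, consequent)

print(antecedent("r"), consequent("r"), importation("r"))
print(q_would_r("b"), q_would_r("a"))
```

Evaluating the antecedent at r first finds b as the nearest P world and then switches to the ordering from b to evaluate Q□→R there. The last line also confirms the earlier remark that Q□→R is true at b but already false at a.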

Exercise 8.2 Determine whether the following wffs are SC-valid

or invalid. Give a falsifying model for every invalid wff, and a

semantic validity proof for every valid wff.

a) ◇P→[(P□→Q)↔∼(P□→∼Q)]

b) [P□→(Q→R)]→[(P∧Q)□→R]

c) [P□→(Q□→R)]→[Q□→(P□→R)]


8.6 Logical Features of SC

Here we’ll discuss various features of SC, which appear to confirm that it’s a

good logic for counterfactual conditionals. In part, we will be showing that our

semantics for □→ matches the logical features of natural language that were

discussed in section 8.1.

8.6.1 Not truth-functional

We wanted our system for counterfactuals to have the following features:

∼P ⊭ P□→Q

Q ⊭ P□→Q

Clearly, it does. The first fact is demonstrated by a model in which P is false

at some world, r, and in which there’s a nearest-to-r P world in which Q is

false; the second, by a model in which Q is true at r, P is false at r, and in which

there’s a nearest-to-r P world in which Q is false.

8.6.2 Can be contingent

We wanted it to turn out that:

P□→Q ⊭ □(P□→Q)

∼(P□→Q) ⊭ □∼(P□→Q)

Our semantics does indeed have this result, because the similarity metrics based

on different worlds can be very different. For example: consider a model with

worlds r and a, in which Q is true in the nearest-to-r P world, but in which Q

is false at the nearest-to-a P world. P□→Q is true at r and false at a, whence

□(P□→Q) is false at r.

8.6.3 No augmentation

In example 8.3 we produced a model containing a world in which P□→R was true

but (P∧Q)□→R was false. Thus, P□→R ⊭ (P∧Q)□→R.


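8.6.4 No contraposition

Let’s show that P□→Q ⊭ ∼Q□→∼P:

[Diagram, the view from r (nearest world at the bottom):
world a: ∼Q = 1 (Q = 0), ∼P = 0 (P = 1)
world b: P = 1, Q = 1
(“no ∼Q” zone between r and a; “no P” zone between r and b)
world r: P□→Q marked 1; ∼Q□→∼P marked 0.]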

I won’t bother with the of�cial model.

8.6.5 Some implications

Exercises 8.1a and 8.1b show that the SC semantics vindicates the inference

from the strict to the counterfactual conditional, and from the counterfactual

conditional to the material conditional.

8.6.6 No exportation

We have shown that the SC-semantics reproduces the logical features of natural

language counterfactuals discussed in section 8.1. In the next few sections we

discuss some further logical features of the SC-semantics, and compare them

with the logical features of the →, the ⇒, and natural language counterfactuals.

The → obeys exportation:

(φ∧ψ)→χ
----------
φ→(ψ→χ)

But the ⇒ doesn’t in any system; (P∧Q)⇒R ⊭S5 P⇒(Q⇒R). Nor does

the □→; it can be easily shown with a countermodel that (P∧Q)□→R ⊭SC

P□→(Q□→R).

Does the natural language counterfactual obey exportation? Here is an

argument that it does not. The following is true:


If Bill had married Laura and Hillary, he would have

been a bigamist.

But one can argue that the following is false:

If Bill had married Laura, then it would have been the

case that if he had married Hillary, he would have been

a bigamist.

Suppose Bill had married Laura. Would it then have been true that: if he had

married Hillary, he would have been a bigamist? Well, let’s ask for comparison:

what would the world have been like, had George W. Bush married Hillary

Rodham Clinton? Would Bush have been a bigamist? Here the natural answer

is no. George W. Bush is in fact married to Laura Bush; but when imagining him

married to Hillary Rodham Clinton, we don’t hold constant his actual marriage.

We imagine him being married to Hillary instead. If this is true for Bush, then

one might think it’s also true for Bill in the counterfactual circumstance in

which he’s married to Laura: it would then have been true of him that, if he

had married Hillary, he wouldn’t have still been married to Laura, and hence

would not have been a bigamist.

It’s unclear whether this is a good argument, though, since it assumes that

ordinary standards for evaluating unembedded counterfactuals (“If George had

married Hillary, he would have been a bigamist”) apply to counterfactuals

embedded within other counterfactuals (“If Bill had married Hillary, he would

have been a bigamist” as embedded within “If Bill had married Laura then…”.)

Contrary to the assumption, it seems most natural to evaluate the consequent of

an embedded counterfactual by holding its antecedent constant. But a defender

of the SC semantics might argue that the second displayed counterfactual

above has a reading on which it is false (recall the context-dependence of

counterfactuals), and hence that we need a semantics that allows for the failure

of exportation.

8.6.7 No importation

Importation holds for →, and for ⇒ in T and stronger systems:

φ→(ψ→χ)        φ⇒(ψ⇒χ)
----------     ----------
(φ∧ψ)→χ        (φ∧ψ)⇒χ

but not for the □→: above we produced an SC-model with a world in which

the conditional [P□→(Q□→R)]→[(P∧Q)□→R] was false.

The status of importation for natural language counterfactuals is similar to

the status of exportation. One can argue that the following is true, at least on

one reading:

If Bill had married Laura, then it would have been the

case that if he had married Hillary, he would have been

happy.

without the result of importing being true:

If Bill had married Laura and Hillary, he would have

been happy

(if he had married both he would have become a public spectacle, which would

have made him most unhappy.)

8.6.8 No hypothetical syllogism (transitivity)

The following inference pattern is valid for → and ⇒ (in all systems):

ψ→χ   φ→ψ        ψ⇒χ   φ⇒ψ
----------       ----------
φ→χ              φ⇒χ

But the model produced above invalidating [(P□→Q)∧(Q□→R)]→(P□→R)

contained a world at which P□→Q and Q□→R were both true but at which

P□→R was false.

Natural language counterfactuals also seem not to obey hypothetical syllo-

gism. I am the oldest child in my family; my brother Mike is the second-oldest.

So the following counterfactuals seem true:4

If I hadn’t been born, Mike would have been my parents’

oldest child.

If my parents had never met, I wouldn’t have been born.

But the result of applying hypothetical syllogism seems false:

If my parents had never met, Mike would have been their

oldest child.

4Note that if these two statements are written in the reverse order, it seems far less clear

that they’re both true: “If my parents had never met, I wouldn’t have been born; If I hadn’t

been born, Mike would have been my parents’ oldest child.” It’s natural in this case to interpret

the second conditional by holding constant the antecedent of the first conditional. This fact,

together with what we observed about embedded counterfactuals in section 8.6.6, suggests a

systematic dependence of the interpretation of counterfactuals on their immediate linguistic

context. See von Fintel (2001) for a “dynamic” semantics for counterfactuals, which more

accurately models this feature of their use, and also makes sense of how hard it is to hear the

readings argued for in sections 8.6.6, 8.6.7, and 8.6.8.

8.6.9 No transposition

Transposition governs the →:

φ→(ψ→χ)
----------
ψ→(φ→χ)

but not the ⇒ (in any of our modal systems); P⇒(Q⇒R) ⊭S5 Q⇒(P⇒R). Nor

does it govern the □→; it’s easy to show that P□→(Q□→R) ⊭SC

Q□→(P□→R).

The status of transposition for natural language counterfactuals is

similar to that of importation and exportation. If we can ignore the effects of

embedding on the evaluation of counterfactuals, then we have the following

counterexample to transposition. It is true that:

If Bill Clinton had married Laura Bush, then if he had

married Hillary Rodham, he’d have been married to a

Democrat.

But it is not true that:

If Bill Clinton had married Hillary Rodham, then if he

had married Laura Bush, he’d have been married to a

Democrat.

8.7 Lewis’s criticisms of Stalnaker’s theory

David Lewis has a rival theory of counterfactuals. Like Stalnaker’s theory, it is

based on similarity, but it differs from Stalnaker’s in certain respects. So far, we


have only discussed features of Stalnaker’s system that are shared by Lewis’s.

Let’s turn, now, to the differences.

Lewis challenges Stalnaker’s assumption that ⪯w is always anti-symmetric.

Real similarity relations permit ties. So it seems implausible to rule out the

possibility of two worlds being exactly similar to a given world.

The challenge to Stalnaker here is most straightforward if Stalnaker intends

to be giving truth conditions for natural language counterfactuals, rather than

merely doing model theory. In that case, the set W in an SC-model must be the

set of genuine possible worlds, and ⪯ must be a relation of genuine similarity,

in which case it ought to admit ties. But even if Stalnaker is not doing this,

the objection may yet have bite, to the extent that the semantics of natural

language conditionals is like similarity-theoretic semantics.5

The validity of certain formulas depends on the “no ties” assumption; the

following two wffs are SC-valid, but are challenged by Lewis:

(P□→Q)∨(P□→∼Q) (“Conditional excluded middle”)

[P□→(Q∨R)]→[(P□→Q)∨(P□→R)] (“distribution”)

Take the first one, for example. Suppose you gave up anti-symmetry, thereby

allowing ties. Then the following would be a countermodel for the law of

conditional excluded middle:

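[Diagram, the view from r (nearest world at the bottom); worlds a and b are tied, equally near to r:
world a: P = 1, Q = 0
world b: P = 1, ∼Q = 0 (Q = 1)
(zone between r and the tied worlds, labeled “no Q” in the source)
world r: P□→Q, P□→∼Q, and their disjunction all marked 0.]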

Remember that P□→Q is true only if Q is true in all the nearest P worlds.

In this model, Q is true in one of the nearest P worlds, but not all, so that

counterfactual is false at r. Similarly for P□→∼Q.

A similar model shows that distribution fails if the “no ties” assumption is

given up.
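The role of ties can be made concrete with a small computation. The Python sketch below is an illustration only: it uses the reading just stated, on which a counterfactual is true iff its consequent holds at all the nearest antecedent-worlds, and it represents nearness by numeric ranks so that two worlds can share a rank. The names `would`, `rank_tied`, and `rank_strict`, and the particular numbers, are assumptions made for the example. Only the ranking from r is supplied, since the formula is evaluated only at r.

```python
def would(ant, cons, rank):
    """ant □→ cons: cons holds at every nearest ant-world (ties allowed)."""
    def truth(w):
        ant_worlds = [x for x in rank[w] if ant(x)]
        if not ant_worlds:
            return True                                   # vacuous case
        best = min(rank[w][x] for x in ant_worlds)        # rank of the nearest ant-worlds
        return all(cons(x) for x in ant_worlds if rank[w][x] == best)
    return truth

P_worlds = {"a", "b"}      # P is true at a and b, false at r
Q_worlds = {"b"}           # Q is true at b only

P = lambda w: w in P_worlds
Q = lambda w: w in Q_worlds
not_Q = lambda w: not Q(w)

def cem_at_r(rank):
    """Truth value at r of (P □→ Q) ∨ (P □→ ∼Q)."""
    return would(P, Q, rank)("r") or would(P, not_Q, rank)("r")

rank_tied = {"r": {"r": 0, "a": 1, "b": 1}}     # a and b equally near to r
rank_strict = {"r": {"r": 0, "a": 1, "b": 2}}   # tie broken in favour of a

print(cem_at_r(rank_tied), cem_at_r(rank_strict))
```

With the tie in place, both disjuncts fail at r and conditional excluded middle comes out false; once the tie is broken either way, exactly one disjunct is true.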

So, should we give up conditional excluded middle? As Lewis concedes,

the principle is initially plausible. An equivalent formulation of conditional

excluded middle is this:

∼(P□→Q)→(P□→∼Q)

But everyone agrees that (P□→∼Q)→∼(P□→Q) is always true, at least when

P is possibly true. So, in cases where P is possibly true anyway, the question

of whether conditional excluded middle is valid is the question of whether

∼(P□→Q) and P□→∼Q are equivalent to each other. And it does indeed

seem that in ordinary usage, one expresses the negation of a counterfactual

by negating its consequent. To deny the counterfactual “if she had played,

she would have won”, one says “no, she wouldn’t have!”, meaning “if she had

played, she would not have won”.

5For an interesting response to Lewis, see Stalnaker (1981).

And take the other formula validated by Stalnaker’s theory, distribution. In

reply to: “if the coin had been flipped, it would have come out either heads or

tails”, one might ask: “which would it have been, heads or tails?”. The thinking

behind the reply is that “if the coin had been flipped, it would have come up

heads”, or “if the coin had been flipped, it would have come up tails” must be

true.

So there’s some plausibility to both these formulas. But Lewis says two

things. The first is metaphysical: if we’re going to accept the similarity analysis,

we’ve got to give them up, because ties just are possible. The second is purely

semantic: the intuitions aren’t completely compelling. About the coin-flipping

case, Lewis denies that if the coin had been flipped, it would have come up

heads, and he also denies that if the coin had been flipped, it would have come

up tails. Rather, he says, if it had been flipped, it might have come up heads.

And if it had been flipped, it might have come up tails. But neither outcome is

such that it would have resulted, had the coin been flipped.

Concerning excluded middle, Lewis says:

It is not the case that if Bizet and Verdi were compatriots, Bizet would be

Italian; and it is not the case that if Bizet and Verdi were compatriots, Bizet

would not be Italian; nevertheless, if Bizet and Verdi were compatriots,

Bizet either would or would not be Italian. (Counterfactuals, p. 80.)

Lewis can follow this up by noting that if Bizet and Verdi were compatriots,

Bizet might be Italian, but it’s not the case that if they were compatriots, he

would be Italian.

Here is a related complaint of Lewis’s about Stalnaker’s semantics. In the

last little bit, I’ve used English phrases of the form “if it were the case that φ,


then it might have been the case that ψ”. This conditional Lewis calls “the

‘might’ counterfactual”; he symbolizes it as φ◇→ψ, and defines it thus:

· “φ◇→ψ” is short for “∼(φ□→∼ψ)”

Lewis criticizes Stalnaker’s system for the fact that this definition of ◇→ doesn’t

work in Stalnaker’s system. Why not? Well, since internal negation (the principle that

∼(φ□→ψ) implies φ□→∼ψ, an equivalent formulation of conditional excluded middle) is valid in

Stalnaker’s system, φ◇→ψ would always imply φ□→ψ. That is not good, since the

might-conditional in English seems weaker than the would-conditional. So,

Lewis’s definition of ◇→ doesn’t work in Stalnaker’s system. Moreover, there

doesn’t seem to be any other plausible definition. So, Stalnaker can’t define

◇→.6

Lewis also objects to Stalnaker’s limit assumption. The following line is

less than one inch long:

[Figure: a short line segment]

Now, consider the counterfactual:

If the line were more than one inch long, it would be

over one hundred miles long.

Seems false. But if we use Stalnaker’s truth conditions as truth conditions for

natural language counterfactuals, and take our intuitive judgments of similarity

seriously, we seem to get the result that it is true! The reason is that there

doesn’t seem to be a closest world in which the line is more than one inch long.

For every world in which the line is, say, 1 + k inches long, there’s another

world in which the line has a length closer to its actual length but still more

than one inch long: say, 1 + k/2 inches. So there doesn’t seem to be any closest

world in which the line is over one inch long.

In light of these criticisms, Lewis proposes a new similarity-based seman-

tics for counterfactuals, which assumes neither anti-symmetry nor the limit

assumption. Let’s look at that system.

6Lewis (1973, p. 80).


8.8 Lewis’s system7

To move from Stalnaker’s system to Lewis’s, we can start by just dropping the

anti-symmetry assumption. We also want to drop the limit assumption. But

after dropping the limit assumption, if we made no further adjustments to the

system, we would get unwanted vacuous truths, as we did in the example of the

one-inch long line above.8 The truth definition for □→ needs to be changed.

Instead of saying that φ□→ψ is true iff ψ is true in all the nearest φ worlds,

we will instead say that φ□→ψ is true iff either i) φ is true in no worlds (the

vacuous case), or ii) there is some φ world such that for every φ world at least

as close, φ→ψ is true there.

Here is the new system, LC (Lewis-counterfactuals). It is exactly the same

as the Stalnaker system except that limit and anti-symmetry are dropped, and

the parts indicated in boldface are changed:

Definition of LC-model: An LC-model, M, is an ordered triple ⟨W, ⪯, I⟩, where:

· W is a nonempty set

· I is a function that assigns either 0 or 1 to each sentence letter relative

to each member of W

· ⪯ is a three-place relation over W

· The valuation function, VM, for M (see below) and ⪯ satisfy the following

conditions:

· for any w, ⪯w is strongly connected

· for any w, ⪯w is transitive

· for any x, y, if y ⪯x x then x = y (Lewis’s “base”)

Definition of LC-valuation: Where M (= ⟨W, ⪯, I⟩) is any LC-model, the

valuation for M, VM, is defined as the two-place function that assigns either

0 or 1 to each wff relative to each member of W, subject to the following

constraints, where α is any sentence letter, φ and ψ are any wffs, and w is any

member of W:

· VM(α, w) = I(α, w)

· VM(∼φ, w) = 1 iff VM(φ, w) = 0

· VM(φ→ψ, w) = 1 iff either VM(φ, w) = 0 or VM(ψ, w) = 1

· VM(□φ, w) = 1 iff for any v, VM(φ, v) = 1

· VM(φ□→ψ, w) = 1 iff EITHER φ is true at no worlds, OR: there is

some world, x, such that VM(φ, x) = 1 and for all y, if y ⪯w x then

VM(φ→ψ, y) = 1

7See Lewis (1973, pp. 48-49). My formulation does away with the accessibility relation (in

Lewis’s terminology, Si, the set of worlds accessible from world i, is always W, the set of all

worlds in the model), so it is a bit simpler.

8Actually, dropping the limit assumption doesn’t affect the class of valid formulas, which is

the same with or without the limit assumption (Lewis, 1973, p. 121).

It may be verified that every LC-valid wff is SC-valid.9 The converse is not

true, as the discussion of conditional excluded middle in the previous section

shows.

Comments on all this: First, notice that the limit and anti-symmetry conditions

are simply dropped. Second, the Base condition is modified; now it

says that no other world is as close to a world as that world is to itself. Before, it said that each world

is at least as close to itself as any other. Stalnaker’s Base condition, plus

anti-symmetry, entails the present Base condition. But Lewis’s system doesn’t have

anti-symmetry, so the Base condition must be stated in the stronger form.10

Third, let’s think about what the truth condition for the □→ says. First,

there’s the vacuous case: if φ is necessarily false then φ□→ψ comes out true.

But if φ is possibly true, then what the clause says is this: φ□→ψ is true at

w iff there’s some φ world where ψ is true, such that no matter how much

closer to w you go, you’ll never get a φ world where ψ is false. If there is a

nearest-to-w φ world, then this implies that φ□→ψ is true at w iff ψ is true in

all the nearest-to-w φ worlds.

9Let’s say that an LC model is “Stalnaker-acceptable” iff it obeys the limit and anti-symmetry

assumptions. Suppose that φ is LC-valid. Then it’s true in all Stalnaker-acceptable LC-models.

Now, notice that in Stalnaker-acceptable models, Lewis’s truth-conditions for formulas yield

the same results as Stalnaker’s (exercise 8.3). So, φ must be true in all SC-models.

10Why do we want to prohibit worlds being just as close to w as w is to itself? So that P∧Q

semantically implies P□→Q. Otherwise P∧Q could be true at w while P∧∼Q was true at

some world as close to w as w is to itself, in which case P□→Q would turn out false at w.

So, thinking of these as truth-conditions for natural-language counterfactuals

for a moment, recall the sentence:

If the line were over one inch long, it would be over ten

inches long.

There’s no nearest world in which the line is over one inch long, only an infinite

series of worlds where the line has lengths getting closer and closer to one

inch long. But this doesn’t make the counterfactual true. A counterfactual is

vacuously true if its antecedent is impossible, but that’s not so in this case. So the only

way the counterfactual could be true is if the second part of the definition is

satisfied: if, that is, there is some world, x, such that the antecedent is true

at x, and the material conditional (antecedent→consequent) is true at every

world at least as similar to the actual world as is x. Since the “at least as similar

as” relation is reflexive, this can be rewritten thus:

· for some world, x, the antecedent and consequent are both true at x, and

the material conditional (antecedent→consequent) is true at every world

at least as similar to the actual world as is x

So, is there any such world, x? No. For let x be any world at which the

antecedent and consequent are both true—i.e., any world in which the line is

over ten inches long. We can always find a world that is more similar to the

actual world than x in which the material conditional (antecedent→consequent)

is false: just choose a world just like x but in which the line is only, say, two

inches long.

Let’s see how Lewis’s theory works in the case of a true counterfactual, for

instance:

If I were more than six feet tall, then I would be less than

nine feet tall

(I am, in fact, less than six feet tall.) The situation here is similar to the previous

example in that there is no nearest world in which the antecedent is true. But

now, we can find a world x, in which the antecedent and consequent are both

true, and such that the material conditional (antecedent→consequent) is true

in every world at least as similar to the actual world as is x. Simply take x to

be a world just like the actual world but in which I am, say, six-feet-one. Any

world that is at least as similar to the actual world as this world must be one in

which I’m less than nine feet tall; so in any such world the material conditional

(I’m more than six feet tall→I’m less than nine feet tall) is true.


Notice that the formulas representing Conditional Excluded Middle and
Distribution come out invalid now, because of the possibility of ties.

Another thing: Lewis gives the following definition for the ‘might’-counterfactual:

· “φ◇→ψ” is short for “∼(φ□→∼ψ)”

From this we may obtain a derived clause for the truth conditions of φ◇→ψ:

· V_M(φ◇→ψ, w) = 1 iff for some x, V_M(φ, x) = 1, and for any x, if
V_M(φ, x) = 1 then there’s some y such that y ⪯_w x and V_M(φ∧ψ, y) = 1

That is, φ◇→ψ is true at w iff φ is possible, and for any φ world, there’s a
world as close or closer to w in which φ and ψ are both true. In cases where
there is a nearest φ world, this means that ψ must be true in at least one of the
nearest φ worlds.
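The clauses above can be tried out on a small finite model. The following is only a sketch under stated assumptions: the function names, the encoding of similarity as a numeric `rank` (lower means closer to the base world, equal ranks are ties), and the three-world model are all introduced here for illustration and are not Lewis's own formulation.

```python
# Hypothetical finite model: rank[x] measures distance from the base world
# (lower = more similar); equal ranks represent ties.

def would(worlds, rank, phi, psi):
    """phi []-> psi at the base world, following the two-part clause."""
    phi_worlds = [x for x in worlds if phi(x)]
    if not phi_worlds:  # first part: the antecedent is true at no world
        return True
    # second part: some phi-world x such that (phi -> psi) holds at every
    # world at least as similar to the base world as x
    return any(
        all((not phi(y)) or psi(y) for y in worlds if rank[y] <= rank[x])
        for x in phi_worlds
    )

def might(worlds, rank, phi, psi):
    """phi <>-> psi, defined as ~(phi []-> ~psi)."""
    return not would(worlds, rank, phi, lambda x: not psi(x))

def might_derived(worlds, rank, phi, psi):
    """Derived clause: phi is possible, and every phi-world has a
    (phi and psi)-world at least as close to the base world."""
    phi_worlds = [x for x in worlds if phi(x)]
    return bool(phi_worlds) and all(
        any(rank[y] <= rank[x] and phi(y) and psi(y) for y in worlds)
        for x in phi_worlds
    )

worlds = ["w", "a", "b"]
rank = {"w": 0, "a": 1, "b": 1}  # a and b are tied
P = lambda x: x in {"a", "b"}
Q = lambda x: x == "a"
not_Q = lambda x: not Q(x)

# An instance of Conditional Excluded Middle: (P []-> Q) or (P []-> ~Q)
cem = would(worlds, rank, P, Q) or would(worlds, rank, P, not_Q)
print(cem)
print(might(worlds, rank, P, Q), might_derived(worlds, rank, P, Q))
```

In this model the two nearest P worlds disagree about Q, so neither would-counterfactual holds, while the might-counterfactual holds under both the definition and the derived clause.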

Exercise 8.3 Show that in any Lewis model in which the limit
and anti-symmetry conditions hold, Lewis’s truth conditions reduce
to Stalnaker’s. That is, in any such model, a wff counts as being
true at a given world given Lewis’s definition of truth in a model if
and only if it counts as being true at that world given Stalnaker’s
definition.

8.9 The problem of disjunctive antecedents

Before we leave counterfactual conditionals, I want to talk about one criticism
that has been raised against both Lewis’s and Stalnaker’s systems.11 In neither
system does the formula (P∨Q)□→R semantically imply P□→R. (Take a model
where there is a unique nearest P∨Q world to r, in which Q is true but not P;
and make there be a unique nearest P world in which R is false.) But shouldn’t
this implication hold? Imagine a conversation between Butch Cassidy and
the Sundance Kid in heaven, after having been surrounded and killed by the
Bolivian army. They say:

11 For references, see the bibliography of Lewis (1977).


If we had surrendered or tried to run away, we would

have been shot.

Intuitively, if this is true, so is this:

If we had surrendered, we would have gotten shot.

In general, one is entitled to conclude from “If P or Q had been the case,

then R would have been the case” that “if P had been the case, R would have

been the case”. If Butch Cassidy and the Sundance Kid could have survived by

surrendering, they certainly would not say to each other “If we had surrendered

or tried to run away, we would have been shot”.
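The countermodel described in the parenthetical at the start of this section can be checked mechanically. This is a minimal sketch: the numeric `rank` encoding of similarity to the base world r, the world labels, and the function name `would` are assumptions made for illustration. Because there are no ties and a nearest antecedent world always exists, the same verdicts hold on Stalnaker's semantics.

```python
def would(worlds, rank, phi, psi):
    """phi []-> psi at the base world (Lewis-style clause, finite model)."""
    phi_worlds = [x for x in worlds if phi(x)]
    if not phi_worlds:
        return True
    return any(
        all((not phi(y)) or psi(y) for y in worlds if rank[y] <= rank[x])
        for x in phi_worlds
    )

# facts[x] lists the sentence letters true at world x (hypothetical model)
facts = {"r": set(), "q": {"Q", "R"}, "p": {"P"}}
rank = {"r": 0, "q": 1, "p": 2}  # q is the nearest P-or-Q world, p the nearest P world
worlds = list(facts)

P = lambda x: "P" in facts[x]
R = lambda x: "R" in facts[x]
P_or_Q = lambda x: "P" in facts[x] or "Q" in facts[x]

premise = would(worlds, rank, P_or_Q, R)   # (P v Q) []-> R
conclusion = would(worlds, rank, P, R)     # P []-> R
print(premise, conclusion)
```

The premise comes out true and the conclusion false, so the inference from (P∨Q)□→R to P□→R is not semantically valid in models of this kind.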

Is this a problem for Lewis and Stalnaker? Some have argued this, but
others respond as follows. One must take great care in translating from natural
language into logic. For example,12 no one would want to criticize the law
∼∼P→P on the grounds that “There ain’t no way I’m doing that” doesn’t
imply that I might do that. And there are notoriously peculiar things about the
behavior of ‘or’ in similar contexts. Consider:

You are permitted to stay or go.

One can argue that this does not have the form:

You are permitted to do the action: (Stay ∨ Go)

After all, suppose that you are permitted to stay, but not to go. If you stay, you
can’t help doing the following act: staying ∨ going. So, surely, you’re permitted
to do that. So, the second sentence is true. But the first isn’t; if someone uttered
it to you when you were in jail, they’d be lying to you! It really means:

You are permitted to stay, AND you are permitted to go.

Similarly, “If either P or Q were true then R would be true” seems usually to

mean “If P were true then R would be true, and if Q were true then R would be

true”. We can’t just expect natural language to translate directly into our logical

language—sometimes the surface structure of natural language is misleading.

12 The example is adapted from Loewer (1976).

Chapter 9

Quantified Modal Logic

We’re going to look at possible-worlds semantics for quantified modal
logic (QML). The language is what you get by adding the □ and ◇ to
the language of predicate logic. There are many interesting issues concerning
the interaction of modal operators with quantifiers.

9.1 Grammar of QML

The grammar of the language of QML is exactly what you’d expect: that of
plain old predicate logic, but with the □ added. Thus, the one new clause to
the definition of a predicate-logic wff is the clause that if φ is a wff, then so
is □φ. (◇φ continues to be defined as meaning ∼□∼φ.) You get a different
grammar for QML depending on what version of predicate logic grammar you
begin with. To keep things simple, let’s consider a stripped-down version of
predicate logic: no function symbols, and no definite description operator. But
let’s include the identity sign =.

9.2 Symbolizations in QML

Like any logical extension, moving to QML gives us more power to analyze the
logical structure of natural language sentences. We began with propositional
logic, which let us analyze a certain level of structure, structure in terms of ‘and’,
‘or’, ‘not’, and so on. The move to predicate logic let us analyze quantificational


structure; the move to modal propositional logic let us analyze modal structure.

Moving to QML lets us do all three at once, as with:

It’s not possible for something to create itself

whose tripartite propositional, predicate, and modal structure is revealed in its

QML symbolization:

∼◇∃xCxx

This deeper level of analysis reveals some new logical features. One example
is the famous distinction between de re and de dicto modal statements.
Consider:

Some rich person might have been poor.

∃x(Rx∧◇Px)

It might have been the case that some rich person is poor.

◇∃x(Rx∧Px)

The first sentence asserts the existence of someone who is in fact rich, but
who might have been poor. This seems true, in contrast to the absurd second
sentence, which says that the following state of affairs is possible: someone
is both rich and poor. The second sentence is called “de dicto” because the
modality is attributed to a sentence (dictum): the modal operator ◇ attaches to
the closed sentence ∃x(Rx∧Px). The first sentence is called “de re” because
the modality is attributed to an object (res): the ◇ attaches to a sentence with a
free variable, Px, and thus can be thought of as attributing a modal property,
the property of possibly being poor, to an object u when x is assigned the value u.

Modal propositional logic alone does not reveal this distinction. Given only
a Q to stand for “some rich person is poor”, we can write only ◇Q, which
represents only the absurd second sentence. To represent the first sentence
we need to insert the ◇ inside the Q, as we can when we further analyze Q as
∃x(Rx∧Px) using predicate logic.

A further example of the de re/de dicto distinction:

Every bachelor is such that he is necessarily male

∀x(Bx→□Mx)

It is necessary that all bachelors are male

□∀x(Bx→Mx)


The second, de dicto, sentence makes the true claim that in any possible world,
anyone that is in that world a bachelor is, in that world, male. The first, de
re, sentence makes the false claim that if any object, u, is a bachelor in the
actual world, then that object u is necessarily male, i.e., the object u is
male in all possible worlds.

What do the following English sentences mean?

All bachelors are necessarily male

Bachelors must necessarily be male

Surface grammar suggests that they would mean the de re claim that each

bachelor is such that he is necessarily male. But in fact, it’s very natural to

hear these sentences as making the de dicto claim that it’s necessary that all

bachelors are male.

The de re/de dicto distinction also emerges with definite descriptions. This
may be illustrated by using Russell’s theory of descriptions (section 5.3.3). Recall
how Russell’s method generated two possible symbolizations for sentences
containing definite descriptions and negations. “The striped bear is not dangerous”,
for example, can be symbolized as either of the following, depending
on whether the definite description is given wide or narrow scope relative to
the negation operator:

∃x(Sx∧Bx∧∀y([Sy∧By]→x=y)∧∼Dx)

∼∃x(Sx∧Bx∧∀y([Sy∧By]→x=y)∧Dx)

(The second denies the existence of something that is both i) the one and only
striped bear, and ii) dangerous; the first says that there exists something that
is the one and only striped bear, and adds that this bear is non-dangerous.) A
similar phenomenon arises with sentences containing definite descriptions and
modal operators. There are two symbolizations of “The number of the planets
is necessarily odd” (letting “Nx” mean that x numbers the planets):

∃x(Nx∧∀y(Ny→x=y)∧□Ox)

□∃x(Nx∧∀y(Ny→x=y)∧Ox)

(Let’s count Pluto as a planet.) The second is de dicto; it says that it’s necessary
that: there is one and only one number of the planets, and that number is odd.
This claim is false, since there could have been eight planets. The first is de
re; it says that (in fact) there is one and only one number of the planets, and
that that number is necessarily odd. That’s true, I suppose: the number nine
(the number that in fact numbers the planets) is necessarily odd.

Natural language sentences containing both definite descriptions and modal
operators are perhaps ambiguous. “The number of the planets is necessarily
odd” is naturally heard as expressing a de re claim; but “The American president
is necessarily an American citizen” can be heard as expressing a de dicto claim.

9.3 A simple semantics for QML

Let’s begin with a very simple semantics, SQML (for “simple QML”). It’s simple
in two ways. First, there is no accessibility relation. □φ will be said to be true
iff φ is true in all worlds in the model. In effect, each world is accessible from
every other (and hence the underlying propositional modal logic is S5). Second,
it will be a “constant domain” semantics. (We’ll discuss what this means, and
more complex semantical treatments of QML, below.)

Definition of SQML-model: An SQML-model is an ordered triple ⟨W, D, I⟩ such that:

· W is a nonempty set (“possible worlds”)

· D is a nonempty set (“domain”)

· I is a function such that: (“interpretation function”)

  · if α is a constant then I(α) ∈ D

  · if Π^n is an n-place predicate then I(Π^n) is a set of (n+1)-tuples
    ⟨u1, . . . , un, w⟩, where u1, . . . , un are members of D, and w ∈ W

Recall that our semantics for modal propositional logic assigned truth values to

sentence letters relative to possible worlds. We have something similar here: we

relativize the interpretation of predicates to possible worlds. The interpretation

of a two-place predicate, R, for example is a set of ordered triples, two members

of which are in the domain, and one member of which is a possible world.

When ⟨u1, u2, w⟩ is in the interpretation of R, that represents R’s applying to

u1 and u2 in possible world w. In a possible worlds setting, this relativization

makes intuitive sense: a predicate can apply to some objects in one possible

world but fail to apply to those same objects in some other possible world.


Notice that the interpretations of constants are not relativized in any way to
possible worlds. The interpretation I assigns simply a member of the domain
to a name. This reflects the common belief that natural language proper
names (which constants are intended to represent) are rigid designators, i.e.,
terms that have the same denotation relative to every possible world (see Kripke
(1972)). We’ll discuss the significance of this feature of our semantics below.

On to the definition of the valuation function for an SQML-model. First,
we keep the definition of a variable assignment from nonmodal predicate logic
(section 4). Our variable assignments therefore assign members of the domain
to variables absolutely, rather than relative to worlds. (This is an appropriate
choice given our choice to assign constants absolute semantic values.) But the
valuation function will now relativize truth values to possible worlds (as well
as to variable assignments). After all, the sentence ‘Fa’, if it represents “Ted is
tall”, should vary in truth value from world to world.

Definition of valuation: The valuation function V_{M,g}, for SQML-model
M (= ⟨W, D, I⟩) and variable assignment g, is defined as the function that
assigns either 0 or 1 to each wff relative to each member of W, subject to the
following constraints:

· for any terms α, β, V_{M,g}(α=β, w) = 1 iff [α]_{M,g} = [β]_{M,g}

· for any n-place predicate, Π, and any terms α1, . . . , αn,
  V_{M,g}(Πα1 . . . αn, w) = 1 iff ⟨[α1]_{M,g}, . . . , [αn]_{M,g}, w⟩ ∈ I(Π)

· for any wffs φ, ψ, and variable, α,

  V_{M,g}(∼φ, w) = 1 iff V_{M,g}(φ, w) = 0

  V_{M,g}(φ→ψ, w) = 1 iff either V_{M,g}(φ, w) = 0 or V_{M,g}(ψ, w) = 1

  V_{M,g}(∀αφ, w) = 1 iff for every u ∈ D, V_{M,g^α_u}(φ, w) = 1

  V_{M,g}(□φ, w) = 1 iff for every v ∈ W, V_{M,g}(φ, v) = 1

The derived clauses are what you’d expect, including the following one for
◇:

V_{M,g}(◇φ, w) = 1 iff for some v ∈ W, V_{M,g}(φ, v) = 1

Finally, we have:



· φ is valid in M (= ⟨W, D, I⟩) iff for every variable assignment, g, and
every w ∈ W, V_{M,g}(φ, w) = 1

· φ is SQML-valid (“⊨SQML φ”) iff φ is valid in all SQML models.

· Γ SQML-semantically-implies φ (“Γ ⊨SQML φ”) iff for every world w in
every SQML model, if every member of Γ is true at w, then so is φ
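To make these definitions concrete, here is a minimal sketch of a valuation function for finite SQML-models. The tuple encoding of formulas, the function name `val`, and the two-world model with a single individual `ann` are all assumptions introduced for illustration; ∧, ∃, and ◇ are implemented directly through their derived clauses. The example evaluates the de re and de dicto sentences about rich and poor people from section 9.2.

```python
def val(model, f, w, g=None):
    """V_{M,g}(f, w) for a finite SQML-model M = (W, D, I); True for 1, False for 0."""
    W, D, I = model
    g = g or {}
    den = lambda t: g[t] if t in g else I[t]  # variables via g, constants via I
    op = f[0]
    if op == "pred":  # ("pred", "F", ("x", ...)): extension holds tuples ending in a world
        return tuple(den(t) for t in f[2]) + (w,) in I[f[1]]
    if op == "eq":    # identity ignores the world of evaluation
        return den(f[1]) == den(f[2])
    if op == "not":
        return not val(model, f[1], w, g)
    if op == "imp":
        return (not val(model, f[1], w, g)) or val(model, f[2], w, g)
    if op == "and":
        return val(model, f[1], w, g) and val(model, f[2], w, g)
    if op == "all":   # ("all", "x", body): quantifier ranges over the one domain D
        return all(val(model, f[2], w, {**g, f[1]: u}) for u in D)
    if op == "ex":
        return any(val(model, f[2], w, {**g, f[1]: u}) for u in D)
    if op == "box":   # no accessibility relation: quantify over all of W
        return all(val(model, f[1], v, g) for v in W)
    if op == "dia":
        return any(val(model, f[1], v, g) for v in W)
    raise ValueError(f"unknown operator: {op}")

# Hypothetical model: ann is rich in the actual world and poor in another.
M = (
    ["actual", "other"],
    ["ann"],
    {"R": {("ann", "actual")}, "P": {("ann", "other")}},
)
Rx, Px = ("pred", "R", ("x",)), ("pred", "P", ("x",))
de_re = ("ex", "x", ("and", Rx, ("dia", Px)))     # some rich person might have been poor
de_dicto = ("dia", ("ex", "x", ("and", Rx, Px)))  # possibly, some rich person is poor

print(val(M, de_re, "actual"), val(M, de_dicto, "actual"))
```

Only closed sentences are evaluated here, so the starting assignment can be left empty; checking validity in a model for open formulas would additionally require ranging over assignments.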

9.4 Countermodels and validity proofs in SQML

As before, we want to come up with countermodels for invalid formulas, and
validity proofs for valid ones. Validity proofs introduce nothing new.

Example 9.1: Show that ⊨SQML ◇∃x(x=a∧□Fx)→Fa:

i) suppose for reductio that (for some model, world r, and variable assignment
g) V_g(◇∃x(x=a∧□Fx)→Fa, r) = 0. Thus V_g(◇∃x(x=a∧□Fx), r) = 1 and …

ii) …V_g(Fa, r) = 0

iii) From i), for some w ∈ W, V_g(∃x(x=a∧□Fx), w) = 1

iv) so for some u ∈ D, V_{g^x_u}(x=a∧□Fx, w) = 1

v) Thus, V_{g^x_u}(x=a, w) = 1 and …

vi) …V_{g^x_u}(□Fx, w) = 1

vii) from vi), V_{g^x_u}(Fx, r) = 1

viii) Thus, ⟨[x]_{g^x_u}, r⟩ ∈ I(F), that is, ⟨u, r⟩ ∈ I(F)

ix) from v), [x]_{g^x_u} = [a]_{g^x_u}

x) By the definition of denotation plus facts about variable assignments,
u = I(a)

xi) By viii) and x), ⟨I(a), r⟩ ∈ I(F)

xii) Thus, V_g(Fa, r) = 1. Contradicts line ii)

Notice that in line xii) I inferred that V_g assigned “Fa” truth at r. I could have
subscripted ‘V’ with any variable assignment, since the truth condition for the
formula “Fa” is the same, regardless of the variable assignment; I picked g
because that’s what I needed to get the contradiction.

As for countermodels, we can use the pictorial method of section 6.3.4,
asterisks and all, with a few changes. First, there’s no need for the arrows
between worlds, since we’ve dropped the accessibility relation, thereby making
every world accessible to every other. Secondly, we have predicates and names
for atomics instead of sentence letters, so how to account for this? Let’s look at
an example: finding a countermodel for the formula (◇Fa∧◇Ga)→◇(Fa∧Ga).
We begin as follows:

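[Diagram: world r, containing (◇Fa∧◇Ga)→◇(Fa∧Ga). Truth values: ◇Fa = 1,
◇Ga = 1, ◇Fa∧◇Ga = 1, the conditional = 0, ◇(Fa∧Ga) = 0. There is an
understar beneath each of the two true ◇s and an overstar above the false ◇.]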

The understars make us create two new worlds:

[Diagram: world r as before, plus two new worlds: world a, in which Fa = 1,
and world b, in which Ga = 1.]

We must then discharge the overstar from the false diamond in each world

(since every world is accessible to every other world in our models):


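[Diagram: in world r, the conjunction Fa∧Ga inside the false ◇ is now false,
with Fa = 0; a † appears beneath this subformula. In world a: Fa = 1, Ga = 0,
Fa∧Ga = 0. In world b: Ga = 1, Fa = 0, Fa∧Ga = 0.]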

(I had to make either F a or Ga false in r—I chose F a arbitrarily.) Now, we’ve

indicated the truth-values that we want the atomics to have. How do we make

the atomics have the TVs we want in the picture?

We do this by introducing a domain for the model, and stipulating what the

names refer to and what objects are in the extensions of the predicates. Let’s

use letters like ‘u’ and ‘v’ as the members of the domain in our models. Now, if

we let the name ‘a’ refer to (the letter) u, and let the extension of F in world r

be {} (the empty set), then the truth value of ‘F a’ in world r will be 0 (false),

since the denotation of a isn’t in the extension of F at world r. Likewise, we

need to put u in the extension of F (but not in the extension of G) in world

a, and put u in the extension of G (but not in the extension of F) in world b.

This all may be indicated on the diagram as follows:

[Diagram: the same three worlds, with the referent of the name shown at the top
(a: u) and extensions listed inside each world. World r: F: {}. World a:
F: {u}, G: {}. World b: F: {}, G: {u}.]

Within each world I’ve included a specification of the extension of each predicate.
But the specification of the referent of the name ‘a’ does not go within
any world; it was rather indicated (in boldface) at the top of the model. This is
because names, unlike predicates, get assigned semantic values absolutely in a
model, not relative to worlds.

Time for the official model:

W = {r, a, b}
D = {u}
I(a) = u
I(F) = {⟨u, a⟩}
I(G) = {⟨u, b⟩}
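The official model can be checked against the valuation clauses directly. The following is a sketch under assumptions: the tuple encoding of formulas and the function name `val` are introduced here for illustration. Note that the key "a" in the interpretation is the name a, while the string "a" in the list of worlds is the world a; the two are kept in separate places and never confused.

```python
def val(model, f, w, g=None):
    """Truth of f at world w in a finite SQML-model (W, D, I)."""
    W, D, I = model
    g = g or {}
    den = lambda t: g[t] if t in g else I[t]  # variables via g, names via I
    op = f[0]
    if op == "pred":
        return tuple(den(t) for t in f[2]) + (w,) in I[f[1]]
    if op == "imp":
        return (not val(model, f[1], w, g)) or val(model, f[2], w, g)
    if op == "and":
        return val(model, f[1], w, g) and val(model, f[2], w, g)
    if op == "dia":  # every world counts as accessible
        return any(val(model, f[1], v, g) for v in W)
    raise ValueError(f"unknown operator: {op}")

# The official model: W = {r, a, b}, D = {u}, I(a) = u,
# I(F) = {<u, a>}, I(G) = {<u, b>}
model = (
    ["r", "a", "b"],
    ["u"],
    {"a": "u", "F": {("u", "a")}, "G": {("u", "b")}},
)
Fa, Ga = ("pred", "F", ("a",)), ("pred", "G", ("a",))
antecedent = ("and", ("dia", Fa), ("dia", Ga))
consequent = ("dia", ("and", Fa, Ga))
formula = ("imp", antecedent, consequent)

for w in model[0]:
    print(w, val(model, formula, w))
```

Since there is no accessibility relation, the formula receives the same value at every world of the model, so it is false not only at r but at a and b as well.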

What about formulas with quantifiers? A countermodel for □∃xFx→∃x□Fx
begins as follows:

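[Diagram: world r, containing □∃xFx→∃x□Fx. Truth values: □∃xFx = 1,
∃xFx = 1, the conditional = 0, ∃x□Fx = 0. There is an overstar above the
true □, a + beneath the true ∃, and a + above the false ∃.]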

The overstar above the □ in the antecedent must be discharged in r itself, since,
remember, every world sees every world in these models. That gives us a true
existential. Now, a true existential is a bit like a true ◇: the true ∃xFx means
that there must be some object u from the domain that’s in the extension of F
in r. I’ll put a + under true ∃s and false ∀s, to indicate a commitment to some
instance of some sort or other. Analogously, I’ll indicate a commitment to all
instances of a given type (which would arise from a true ∀ or a false ∃) with a +
above the connective in question.

OK, how do we make ∃xFx true in r? By making “Fx” true for some
value of x. Let’s put the letter u in the domain, and make “Fx” true when u is
assigned to x. We’ll indicate this by putting a 1 overtop of “F^u_x” in the diagram.
Now, “F^u_x” isn’t a formula of our language; what it indicates is that “Fx” is to
be true when u is assigned to x. And to make this come true, we treat it as an
atomic: we put u in the extension of F at r:


[Diagram: world r as before, now also showing Fx with u assigned to x = 1,
and the extension F: {u}.]

Good. Now we’ve got to attend to the overplus, the + sign overtop the false
∃x□Fx. Since it’s a false ∃, we’ve got to make □Fx false for every object in the
domain (otherwise, if there were something in the domain for which □Fx was
true, ∃x□Fx would be true after all). So far, we’ve got only one object in our
domain, u, so we’ve got to make □Fx false, when u is assigned to the variable
‘x’. We’ll indicate this on the diagram by putting a 0 overtop of “□F^u_x”:

[Diagram: world r as before, now also showing □Fx with u assigned to x = 0,
with an understar beneath its □. F: {u}.]

Ok, now we have an understar, which means we should add a new world to

our model. When doing so, we’ll need to discharge the overstar from the

antecedent. We get:

[Diagram: world r as before; F: {u}. New world a: Fx with u assigned to x = 0,
∃xFx = 1 (with a + beneath the ∃), Fx with v assigned to x = 1; F: {v}.]


This move requires some explanation. Why the v? Well, I was required to
make Fx false, with u assigned to x. Well, that means keeping u out of the
extension of F at a. Easy enough, right? Just make F’s extension {}? Well,
no: because of the true □ in r, I’ve got to make ∃xFx true in a. But that means
that something’s got to be in F’s extension in a! It can’t be u, so I’ll add a new
object, v, to the domain, and put it in F’s extension in a.

But adding v to the domain of the model adds a complication. We had
an overplus in r, over the false ∃. That meant that, in r, for every member of
the domain, □Fx is false. So, □Fx is false in r when v is assigned to x. That
creates another understar, requiring the creation of a new world. The model
then looks as follows:

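[Diagram: world r now also shows □Fx with v assigned to x = 0, with an
understar beneath its □; F: {u}. World a as before; F: {v}. New world b:
Fx with v assigned to x = 0, ∃xFx = 1 (with a + beneath the ∃), Fx with u
assigned to x = 1; F: {u}.]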

(Notice that we needn’t have made another world b—we could simply have

discharged the understar on r.)

Ok, here’s the official model:

W = {r, a, b}
D = {u, v}
I(F) = {⟨u, r⟩, ⟨u, b⟩, ⟨v, a⟩}
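As with the previous example, the official model can be checked against the valuation clauses. This is a sketch under assumptions: the tuple encoding of formulas and the function name `val` are illustrative choices made here, not part of the official semantics.

```python
def val(model, f, w, g=None):
    """Truth of f at world w in a finite SQML-model (W, D, I)."""
    W, D, I = model
    g = g or {}
    op = f[0]
    if op == "pred":  # only variables occur as terms in this example
        return tuple(g[t] for t in f[2]) + (w,) in I[f[1]]
    if op == "imp":
        return (not val(model, f[1], w, g)) or val(model, f[2], w, g)
    if op == "ex":    # constant domain: the same D at every world
        return any(val(model, f[2], w, {**g, f[1]: u}) for u in D)
    if op == "box":
        return all(val(model, f[1], v, g) for v in W)
    raise ValueError(f"unknown operator: {op}")

# The official model: W = {r, a, b}, D = {u, v},
# I(F) = {<u, r>, <u, b>, <v, a>}
model = (
    ["r", "a", "b"],
    ["u", "v"],
    {"F": {("u", "r"), ("u", "b"), ("v", "a")}},
)
Fx = ("pred", "F", ("x",))
antecedent = ("box", ("ex", "x", Fx))   # in every world, something is F
consequent = ("ex", "x", ("box", Fx))   # something is F in every world
formula = ("imp", antecedent, consequent)

print(val(model, antecedent, "r"), val(model, consequent, "r"), val(model, formula, "r"))
```

Every world has some F in it (u in r and b, v in a), but neither u nor v is F in all three worlds, which is exactly the pattern the diagram was built to produce.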


Exercise 9.1 For each formula, give a validity proof if the wff is

SQML-valid, and a countermodel if it is invalid.

a) ◇∀xFx→∃x◇Fx

b) ∃x◇Rax→◇□∃x∃yRxy

c) ∃x(Nx∧∀y(Ny→y=x)∧□Ox)→□∃x(Nx∧∀y(Ny→y=x)∧Ox)

9.5 Philosophical questions about SQML

Our semantics for quantified modal logic faces philosophical challenges. In
each case we will be able to locate a particular feature of our semantics that
gives rise to the alleged problem. In response, one can stick with the simple
semantics and give it a philosophical defense, or one can revise the semantics.

9.5.1 The necessity of identity

Let’s try to come up with a countermodel for the following formula:

∀x∀y(x=y→□(x=y))

When we try to make the formula false by putting a 0 over the initial ∀, we get
an under-plus. So we’ve got to make the inside part, ∀y(x=y→□x=y), false
for some value of x. We do this by putting some object u in the domain, and
letting that be the value of x for which ∀y(x=y→□x=y) is false. We get:

[Diagram: world r. ∀x∀y(x=y→□x=y) = 0, with a + beneath the initial ∀;
∀y(x=y→□(x=y)), with u assigned to x, = 0, with a + beneath its ∀.]

Now we need to do the same thing for our new false universal: ∀y(x=y→□x=y).
For some value of y, the inside conditional has to be false. But that means that
the antecedent must be true. So the value for y has to be u again. We get:


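[Diagram: world r as before, now also showing the conditional x=y→□(x=y)
with u assigned to both x and y: x=y = 1, the conditional = 0, □(x=y) = 0,
with an understar beneath the □.]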

The understar now requires creation of a world in which x=y is false, when
both x and y are assigned u. But there cannot be any such world! An identity
sentence is true (at any world) if the denotations of the terms are identical. Our
attempt to find a countermodel has failed; we must do a validity proof. Consider
any SQML model ⟨W, D, I⟩, any r ∈ W, and any variable assignment g; we’ll
show that V_g(∀x∀y(x=y→□x=y), r) = 1:

i) suppose for reductio that V_g(∀x∀y(x=y→□x=y), r) = 0.

ii) Then, for some u ∈ D, V_{g^x_u}(∀y(x=y→□x=y), r) = 0

iii) So, for some v ∈ D, V_{g^{xy}_{uv}}(x=y→□(x=y), r) = 0.

iv) Thus, V_{g^{xy}_{uv}}(x=y, r) = 1, and …

v) …V_{g^{xy}_{uv}}(□(x=y), r) = 0

vi) from iv), [x]_{g^{xy}_{uv}} = [y]_{g^{xy}_{uv}}

vii) From v), at some world, w, V_{g^{xy}_{uv}}(x=y, w) = 0

viii) And so, [x]_{g^{xy}_{uv}} ≠ [y]_{g^{xy}_{uv}}. Contradicts vi).

Notice at the end how the particular world at which the identity sentence was

false didn’t matter. The truth condition for an identity sentence is simply that

the terms denote the same thing; it doesn’t matter what world this is evaluated

relative to.1

1 A note about variables. In validity proofs, I’m using italicized ‘u’ and ‘v’ as variables to

range over objects in the domain of the model I’m considering. So, a sentence like ‘u = v’

might be true, just as the sentence ‘x=y’ of our object language can be true. But when I’m

doing countermodels, I’m using the roman letters ‘u’ and ‘v’ as themselves being members of

the domain, not as variables ranging over members of the domain. Since the letters ‘u’ and

‘v’ are different letters, they are different members of the domain. Thus, in a countermodel

with letters in the domain, if the denotation of a name ‘a’ is the letter ‘u’, and the denotation


We can think of ∀x∀y(x=y→□(x=y)) as expressing “the necessity of identity”:
it says that whenever identity holds between objects, it necessarily holds.
The necessity of identity is philosophically controversial. On the one hand
it can seem obviously correct. “x=y” says that x and y are one and the same
thing. Now, if there were a world in which x was different from y, since x and
y are the same thing, this would have to be a world in which x was different
from x. How could that be? On the other hand, it was a great discovery that
Hesperus = Phosphorus. Surely, it could have turned out the other way; surely,
Hesperus might not have turned out identical to Phosphorus! But isn’t this
a counterexample to this formula? For a discussion of this example, see Saul
Kripke’s book Naming and Necessity.

It’s worth noting why the necessity of identity turns out valid, given our
semantics. It turns out valid because of the way we defined variable assignments:
our variable assignments assign members of the domain to variables absolutely,
rather than relative to worlds. (Similarly: since the interpretation function I,
according to our definition above, assigns referents to names absolutely, rather
than relative to worlds, the formula a=b→□a=b turns out valid.) One could,
instead, define variable assignments as functions that assign members of the
domain to variables relative to worlds. Given appropriate adjustments to the
definition of the valuation function, this would have the effect of invalidating
the necessity of identity.2 (Similarly, one could make I assign denotations to
names relative to worlds, thus invalidating a=b→□a=b.)
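The parenthetical remark about names can be illustrated with a small sketch. The adjustment used here is only one hypothetical way of relativizing denotation to worlds: `I[name][world]` gives a name's denotation at a world, and an identity sentence is evaluated at w using the denotations at w. The encoding and the function name `val` are assumptions made for illustration.

```python
def val(W, I, f, w):
    """Truth of f at w, where I[name][world] is the name's denotation at that
    world (a hypothetical non-rigid variant of the interpretation function)."""
    op = f[0]
    if op == "eq":   # compare the denotations the names have at w
        return I[f[1]][w] == I[f[2]][w]
    if op == "imp":
        return (not val(W, I, f[1], w)) or val(W, I, f[2], w)
    if op == "box":
        return all(val(W, I, f[1], v) for v in W)
    raise ValueError(f"unknown operator: {op}")

W = ["r", "s"]
formula = ("imp", ("eq", "a", "b"), ("box", ("eq", "a", "b")))  # a=b -> box(a=b)

# 'b' denotes u at r but v at s, so it is not a rigid designator here.
non_rigid = {"a": {"r": "u", "s": "u"}, "b": {"r": "u", "s": "v"}}
# With world-invariant denotations the original verdict is recovered.
rigid = {"a": {"r": "u", "s": "u"}, "b": {"r": "u", "s": "u"}}

print(val(W, non_rigid, formula, "r"))
print(all(val(W, rigid, formula, w) for w in W))
```

At r the names co-refer, so a=b is true there, but they come apart at s, so □a=b fails and the conditional is false at r.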

of the name ‘b’ is the letter ‘v’, then the sentence ‘a=b’ has got to be false, since ‘u’ ≠ ‘v’. If
I were using ‘u’ and ‘v’ as variables ranging over members of the domain, then the sentence
‘u = v’ might be true! This just goes to show that it’s important to distinguish between the
sentence u = v and the sentence ‘u’ = ‘v’. The first could be true, depending on what ‘u’ and
‘v’ currently refer to, but the second one is just plain false, since ‘u’ and ‘v’ are different letters.

2 See Gibbard (1975).

9.5.2 The necessity of existence

Another (in)famous valid formula of SQML is the “Barcan Formula” (named
after Ruth Barcan Marcus):

∀x□Fx→□∀xFx

(Call the schema ∀α□φ→□∀αφ the “Barcan schema”.) If we try to find a
countermodel for this formula we get to the following stage:


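[Diagram: world r, containing ∀x□Fx→□∀xFx, with ∀x□Fx = 1 (a + above the ∀),
the conditional = 0, and □∀xFx = 0 (an understar beneath the □). World a:
∀xFx = 0 (a + beneath the ∀), Fx with u assigned to x = 0; F: {}.]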

When you have a choice between discharging over-things and under-things,
whether plusses or stars, always do the under things first. In this case, this
means discharging the understar and ignoring the over-plus for the moment.
So, discharging the understar gave us world a, in which we made a universal
false. This gave an underplus, and forced us to make an instance false. So I put
object u in our domain, and keep it out of the extension of F in a. This makes
Fx false in a, when x is assigned u.

But now, I need to discharge the overplus in r. I must make □Fx true for
every member of the domain, including u, which is now in the domain. But
then this requires Fx to be true, when u is assigned to x, in a:

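[Diagram: world r as before, now also showing □Fx with u assigned to x = 1
(an overstar above the □) and Fx with u assigned to x = 1; F: {u}. World a:
∀xFx = 0, Fx with u assigned to x = 0, but now also Fx with u assigned to
x = 1; F: {?}.]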

So, we fail to get a model. Time for a validity proof; let’s show that every
instance of the Barcan schema is valid:

i) suppose for reductio that V_g(∀α□φ→□∀αφ, r) = 0. Then V_g(∀α□φ, r) = 1 and …

ii) …V_g(□∀αφ, r) = 0.

iii) from ii), for some w, V_g(∀αφ, w) = 0

iv) so, for some u in the domain, V_{g^α_u}(φ, w) = 0

v) from i), for every member of the domain, and so for u in particular,
V_{g^α_u}(□φ, r) = 1.

vi) thus, for every world, and so for w in particular, V_{g^α_u}(φ, w) = 1.
Contradicts iv).

The validity of the Barcan formula in our semantics is infamous because the
Barcan formula seems, intuitively, to be invalid. To see why, we need to think a
bit about the intuitive significance of the relative order of quantifiers and modal
operators. Consider the difference between the following two sentences:

◇∃xFx

∃x◇Fx

In general, a sentence of the form ◇φ says that it’s possible for the component
sentence, φ, to be true. So the first of our two sentences, ◇∃xFx, says that
it’s possible for “∃xFx” to be true. That is: it’s possible for there to exist an F.
What about the second sentence? In general, a sentence that begins without
a modal operator in front makes a statement about the actual world. Thus,
a statement that begins with “∃x . . .” is saying that there exists, in the actual
world, an object x, such that…. Our second statement, then, says that there
actually exists an object, x, that is possibly F. It matters, therefore, whether
the ∃ comes after or before the ◇. If the ∃ comes first, then the statement is
saying that there actually exists a certain sort of object (namely, an object that
could have been a certain way). But if it comes second, after the ◇, then the
statement is merely saying that there could have existed a certain sort of object.

There is a similar contrast with the following two statements:

□∀xFx

∀x□Fx

The first says that it’s necessary that: everything is F. That is, in every possible
world, every object that exists in that world is F in that world. The objects
ranged over by the ∀, so to speak, are drawn from the worlds the □ introduces,
because the ∀ occurs inside the scope of the □. The second statement, by
contrast, says that: every actual object is necessarily F. That is, every object that
exists in the actual world is F in every possible world. The second statement
concerns just actually existing objects because the ∀ occurs in the front of the
formula, not inside the scope of the □.

With all this in mind, return to the Barcan formula, ∀x□Fx→□∀xFx. It
says:

“If every actually existing thing is F in every possible
world, then in every world, every object in that world is
F in that world”

Now we can see why this claim is questionable. Even if every actual thing is
necessarily F, there could still be worlds containing non-F things, so long as
those non-F things don’t exist in the actual world. Suppose, for instance, that
every object in the actual world is necessarily a material object. Then, letting
F stand for “is a material object”, ∀x□Fx is true. Nevertheless, □∀xFx seems
false, since it would presumably be possible for there to exist an immaterial object:
a ghost, say. Possible worlds containing ghosts would simply need to contain
objects that do not exist in the actual world (since all the objects in the actual
world are necessarily material).

This objection to the validity of the Barcan formula is obviously based on
the idea that what objects exist can vary from possible world to possible world.
But this sort of variation is not represented in the SQML definition of a model.
Each such model contains a single domain, D, rather than different domains
for different possible worlds. The truth condition we specified for a quantified
sentence ∀αφ, at a world w, was simply that φ is true at w of every member of
D: the quantifier ranges over the same domain, regardless of which possible
world is being described. That is why the Barcan formula turns out valid under
our definition.

This feature of SQML models is problematic for an even more direct reason:
the sentence ∀x□∃y y=x, i.e., “everything necessarily exists”, turns out valid!

i) Suppose for reductio that V_g(∀x□∃y y=x, w) = 0.

ii) Then V_{g^x_u}(□∃y y=x, w) = 0, for some u ∈ D

iii) So V_{g^x_u}(∃y y=x, w′) = 0, for some w′ ∈ W

iv) So V_{g^{xy}_{uu′}}(y=x, w′) = 0, for every u′ ∈ D

v) So, since u ∈ D, we have V_{g^{xy}_{uu}}(y=x, w′) = 0. But that can’t be, given the
clause for ‘=’ in the definition of the valuation function.

It’s clear that this formula turns out valid for the same reason that the Barcan

formula turns out valid: SQML models have a single domain common to each

possible world.


The Barcan schema is just one of a number of interesting schemas concerning
how quantifiers and modal operators interact (for each schema I also list an
equivalent schema with ◇ in place of □):

∀α□φ→□∀αφ    ◇∃αφ→∃α◇φ    (Barcan)

□∀αφ→∀α□φ    ∃α◇φ→◇∃αφ    (converse Barcan)

□∃αφ→∃α□φ    ∀α◇φ→◇∀αφ

∃α□φ→□∃αφ    ◇∀αφ→∀α◇φ

We have already discussed the Barcan schema. The third schema raises no
philosophical problems for SQML, since, quite properly, it has instances that turn out
invalid: as we saw above, there are SQML models in which □∃xFx→∃x□Fx
is false. Let’s look at the other two schemas.

First, the converse Barcan schema. Like the Barcan schema, each of its

instances is valid given the SQML semantics (I’ll leave this to the reader to

demonstrate), and like the Barcan schema, this verdict faces a philosophical

challenge. The antecedent says that in every world, everything that exists in
that world is φ. Existents are thus always φ. It might still be that some object

isn’t necessarily φ: perhaps some object that is φ in every world in which it

exists, fails to be φ in worlds in which it doesn’t exist. This talk of an object

being φ in a world in which it doesn’t exist may seem strange, but consider

the following instance of the converse Barcan schema, substituting “∃y y=x”

(think: “x exists”) for φ:

□∀x∃y y=x→∀x□∃y y=x

This formula seems to be false. Its antecedent is clearly true; but its consequent

says that every object in the actual world exists necessarily, and hence seems

intuitively to be false.

Each instance of the fourth schema, ∃α□φ→□∃αφ, is also validated by
the SQML semantics (again, an exercise for the reader); and again, this is
philosophically questionable. Let’s suppose that physical objects are necessarily
physical. Then ∃x□Px seems true, letting P mean ‘is physical’. But □∃xPx
seems false: it seems possible that there are no physical objects. This counterexample
requires that there be worlds with fewer objects than those that

actually exist, whereas the counterexample to the Barcan formula involved the

possibility that there be more objects than those that actually exist.


9.5.3 Necessary existence defended

There are various ways to respond to the challenge of the previous section.

From a logical point of view, the simplest is to stick to one’s guns and defend

the SQML semantics. SQML-models accurately model the modal facts. The

Barcan formula, the converse Barcan formula, the fourth schema, and the state-

ment that everything necessarily exists are all logical truths; the philosophical

objections are mistaken. Contrary to appearances it is not contingent what

exists. Each possible world has exactly the same stock of individuals. Call this

the doctrine of Constancy.

One could uphold Constancy either by taking a narrow view of what is

possible, or by taking a broad view of what exists. On the former alternative,

one would claim that it is just not possible for there to be any ghosts, and that

it is just not possible in any sense for an actual object to have failed to exist. On

the latter alternative, which I’ll be discussing for the rest of this section, one

accepts the possibility of ghosts, dragons, and so on, but claims that possible

ghosts and dragons exist in the actual world.

Think of the objects in D as being all of the possible objects. In addition to

normal things—what one would normally think of as the actually existing entities:

people, tables and chairs, planets and electrons, and so on—our defender of

Constancy claims that there also exist objects that, in other possible worlds, are

ghosts, golden mountains, talking donkeys, and so forth; and these are included

in D as well. Call these further objects “merely possible things” (but don’t be

misled by this label; the claim is that merely possible things actually exist.) The

formula “∀xF x” means that every possible object is F in the actual world. It’s

not enough for the normal things to be F , for the normal things are not all

of the things that there are. There are also all the merely possible things, and

each of these must be F as well (must be F here in the actual world, that is), in

order for ∀xFx to be true. Hence, the objection to the Barcan formula from
the previous section fails. That objection assumed that ∀x□Fx, the antecedent

of (an instance of) the Barcan formula, was true, when F symbolizes “is a

material object”. But this neglects the merely possible things. It’s true that all

the normal objects are necessarily material objects, but there are some further

things—merely possible things—that are not necessarily material objects.

Further: in ordinary language, when we say “Everything” or “something”,

we typically don’t mean to be talking about all possible objects; we’re typically

talking about just the normal things. Otherwise we would be speaking falsely

when we say, for example, “everything has mass”: merely possible unicorns


presumably have no mass (nor any spatial location, nor any other physical

feature.) Ordinary quantification is restricted to normal things. So if we want
to translate an ordinary claim into the language of QML, we must introduce
a predicate for the normal things, “N”, and use it to restrict quantifiers. But

now, consider the following ordinary English statement:

If everything is necessarily a material object, then neces-

sarily: everything is a material object

If we mindlessly translate this into the language of QML, we would get
∀x□Fx→□∀xFx, an instance of the Barcan schema. But since in everyday
usage, quantifiers are restricted to normal things, the thought in the mind
of an ordinary speaker who utters this sentence is more likely the following:

∀x(Nx→□Fx)→□∀x(Nx→Fx)

which says:

If every normal thing is necessarily a material object,

then necessarily: every normal thing is a material object.

And this formula is not an instance of the Barcan schema, nor is it valid, as may

be shown by the following countermodel:

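[Countermodel diagram: two worlds, r and a; u is an object in the domain]

World r:  N: {}
    ∀x(Nx→□Fx)→□∀x(Nx→Fx) is 0 at r: its antecedent is 1 and its consequent is 0.
    With u assigned to x, Nx→□Fx is 1 at r, since Nx is 0 there.

World a:  N: {u}   F: {}
    ∀x(Nx→Fx) is 0 at a: with u assigned to x, Nx is 1 and Fx is 0.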

So in a sense, the ordinary intuitions that were alleged to undermine the Barcan

schema are in fact consistent with Constancy.
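The countermodel to the restricted formula can be verified with a short Python sketch. This is an illustration under stated assumptions: the domain is taken to contain only u, F is taken to be empty at both worlds (the countermodel lists no F-extension at r), and the tuple encoding and the names `holds`, `EXT`, `restricted`, and `barcan` are introduced here for convenience.

```python
WORLDS = {"r", "a"}
DOMAIN = {"u"}                               # assumed: u is the only object
EXT = {"N": {("u", "a")}, "F": set()}        # u is normal only at a; nothing is F


def holds(phi, w, g):
    """Truth value of phi at world w under assignment g (SQML clauses)."""
    op = phi[0]
    if op == "pred":                          # ("pred", "N", "x")
        return (g[phi[2]], w) in EXT[phi[1]]
    if op == "imp":
        return (not holds(phi[1], w, g)) or holds(phi[2], w, g)
    if op == "forall":                        # one domain for every world
        return all(holds(phi[2], w, {**g, phi[1]: u}) for u in DOMAIN)
    if op == "box":                           # every world is accessible
        return all(holds(phi[1], v, g) for v in WORLDS)
    raise ValueError(f"unknown operator: {op}")


Nx, Fx = ("pred", "N", "x"), ("pred", "F", "x")
antecedent = ("forall", "x", ("imp", Nx, ("box", Fx)))     # every normal thing is necessarily F
consequent = ("box", ("forall", "x", ("imp", Nx, Fx)))     # necessarily, every normal thing is F
restricted = ("imp", antecedent, consequent)
barcan = ("imp", ("forall", "x", ("box", Fx)), ("box", ("forall", "x", Fx)))

print(holds(antecedent, "r", {}), holds(consequent, "r", {}), holds(restricted, "r", {}))
# True False False
print(all(holds(barcan, w, {}) for w in WORLDS))  # True
```

The restricted formula fails at r, while the unrestricted Barcan instance holds at both worlds of the same model, which is the contrast the defender of Constancy relies on.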


The defender of Constancy can defend the converse Barcan schema and the
fourth schema in similar fashion. The objection to the converse Barcan schema
assumed the falsity of ∀x□∃y y=x. “Sheer prejudice!”, according to the friend
of Constancy. “And recall further that an ordinary utterance of ‘Everything exists
necessarily’ expresses, not ∀x□∃y y=x, but rather ∀x(Nx→□∃y(Ny∧y=x))
(N for ‘normal’), the falsity of which is perfectly compatible with Constancy.
It’s possible to fail to be normal; all that’s impossible is to utterly fail to exist.
Likewise for the fourth schema.”

This defense of SQML is hard to take. Let “G” stand for a kind of object
that, in fact, has no members, but which could have had members. Perhaps ghost
is such a kind. ∀x□∼Gx→□∀x∼Gx is an instance of the Barcan schema, and
so true according to the defender of Constancy. Since there could have existed
ghosts, the consequent of this conditional is false. Therefore, its antecedent
∀x□∼Gx must be false. That is, there exists something that could have been a

ghost. But this is a very surprising result. The alleged possible ghost couldn’t

be any material object, presumably, assuming it would be impossible for any

material object to be a ghost. The defender of Constancy, then, is committed to

the existence of objects which we wouldn’t otherwise have dreamed of accepting:

things that could have been ghosts, things that could have been dragons, things that could

have been gods, and so on.

The defender of Constancy might try to defend this conclusion by remind-

ing us that these “possible-ghosts”, “possible-dragons”, and so on, are not

normal objects. They aren’t in space and time, presumably, which explains why

no one has ever seen, heard, felt, or smelled one. She might even say that they

are non-actual, or even that they do not exist (though they are). We are quite cor-

rect, she might say, to scoff at the idea that some normal/actual/existing objects

are capable of being ghosts; but what’s the big deal about saying that some non-

normal/non-actual/non-existing objects have these capabilities? This move,

too, will be considered philosophically suspect by many. Many philosophers

regard the idea that there are some non-existent things, or some non-actual

things, as being anywhere from obviously false to conceptually incoherent, or

subversive, or worse.3

And how does it help to point out that the objects aren’t

normal? The postulation of non-normal objects—objects above and beyond

the objects that the rest of us believe in—was exactly what I was claiming is

philosophically suspect!

3See Quine (1948); Lycan (1979).

On the other hand, Constancy’s defenders can point to certain powerful
arguments in its favor. Here’s a quick sketch of one such argument. First, the

following seems to be a logical truth:

Ted=Ted

But it follows from this that:

∃y y =Ted

This latter formula, too, is therefore a logical truth. But if φ is a logical truth

then so is □φ (recall the rule of necessitation from chapter 6). So we may infer
that the following is a logical truth:

□∃y y=Ted

Next, notice that nothing in the argument for □∃y y=Ted depended on any
special features of me. We may therefore conclude that the reasoning holds
good for every object; and so ∀x□∃y y=x is indeed a logical truth. Since,

therefore, every object exists necessarily, it should come as no surprise that

there are things that might have been ghosts, dragons, and so on—for if there

had been a ghost, it would have necessarily existed, and thus must actually exist.

This and other related arguments have apparently wild conclusions, but they

cannot be lightly dismissed, for it is no mean feat to say exactly where they go

wrong (if they go wrong at all!).4

4On this topic see Prior (1967, 149-151); Plantinga (1983); Fine (1985); Linsky and Zalta
(1994, 1996); Williamson (1998, 2002).

9.6 Variable domains

We now consider a way of dealing with the problems discussed in section 9.5.2
above that does not require embracing Constancy.

SQML models contain a single domain, D, over which the quantifiers range
in each possible world. Since it was this feature that led to the problems of
section 9.5.2, let’s introduce a new semantics that instead provides different
domains for different possible worlds. And let’s also reinstate the accessibility
relation, for reasons to be made clear below.5 The new semantics is called
VDQML (“variable-domains quantified modal logic”):

5More care than I take is needed in converting the earlier definition of validity if one is
worried about the validity of formulas with free variables. See ?, p. 275.


Definition of VDQML-model: A VDQML-model is a 5-tuple ⟨W, R, D, Q, I⟩
such that:

· W is a nonempty set (“possible worlds”)

· R is a binary relation on W (“accessibility relation”)

· D is a set (“super-domain”)

· Q is a function that assigns to any w ∈ W a non-empty6 subset of D. Let
us refer to Q(w) as “D_w”. Think of D_w as w’s “sub-domain”: the set of
objects that exist at w.

· I is a function such that: (“interpretation function”)

    · if α is a constant then I(α) ∈ D

    · if Π is an n-place predicate then I(Π) is a set of ordered n+1-tuples
      ⟨u_1, ..., u_n, w⟩, where u_1, ..., u_n are members of D, and w ∈ W.

Definition of valuation: The valuation function V_{M,g}, for VDQML-model
M (= ⟨W, R, D, Q, I⟩) and variable assignment g, is defined as the function
that assigns either 0 or 1 to each wff relative to each member of W, subject to
the following constraints:

· for any terms α and β, V_{M,g}(α=β, w) = 1 iff [α]_{M,g} = [β]_{M,g}

· for any n-place predicate, Π, and any terms α_1, ..., α_n,
  V_{M,g}(Πα_1...α_n, w) = 1 iff ⟨[α_1]_{M,g}, ..., [α_n]_{M,g}, w⟩ ∈ I(Π)

· for any wffs φ and ψ, and variable, α,

  V_{M,g}(∼φ, w) = 1 iff V_{M,g}(φ, w) = 0
  V_{M,g}(φ→ψ, w) = 1 iff either V_{M,g}(φ, w) = 0 or V_{M,g}(ψ, w) = 1
  V_{M,g}(∀αφ, w) = 1 iff for each u ∈ D_w, V_{M,g^α_u}(φ, w) = 1
  V_{M,g}(□φ, w) = 1 iff for each v ∈ W, if Rwv then V_{M,g}(φ, v) = 1

6One could drop this assumption. But if subdomains can be empty then □∃x(Fx→Fx) will
be invalid. Since ∃x(Fx→Fx) is valid given the chapter 4 semantics for non-modal predicate
logic, we would have the odd result of a logical truth whose necessitation isn’t a logical truth.
One could modify the chapter 4 semantics by allowing predicate logic models with empty
domains, thus invalidating ∃x(Fx→Fx). This approach is known as free logic.

The definitions of denotation, validity and semantic consequence remain unchanged.
The obvious derived clauses for ∃ and ◇ are as follows:

V_{M,g}(∃αφ, w) = 1 iff for some u ∈ D_w, V_{M,g^α_u}(φ, w) = 1
V_{M,g}(◇φ, w) = 1 iff for some v ∈ W, Rwv and V_{M,g}(φ, v) = 1

Thus, we have introduced subdomains. We still have D, a set
that contains all of the possible individuals. But for each possible world w,
we introduce a subset of the domain, D_w, to be the domain for w. When
evaluating a quantified sentence at a world w, the quantifier ranges only over
D_w. Notice that we also reinstated the accessibility relation. This isn’t necessary
for introducing subdomains; I did this in order to be able to make a certain
point about subdomains in section 9.6.2.

Note that if M is an SQML model, then we can construct a corresponding
VDQML model with the same set of worlds, (super-) domain, and interpretation
function, in which every world is accessible from every other, and in
which Q is a constant function assigning the whole super-domain to each world.
It is intuitively clear that the same sentences are true in this corresponding
model as are true in M. Hence, whenever a sentence is SQML-invalid, it is
VDQML-invalid. (The converse of course is not true.)
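The construction just described is simple enough to write down directly. Here is a minimal Python sketch, assuming models are represented with plain sets and dictionaries; the function name `sqml_to_vdqml` and this representation are choices made for illustration only.

```python
def sqml_to_vdqml(worlds, domain, interpretation):
    """Build a VDQML model (W, R, D, Q, I) from an SQML model (W, D, I).

    R makes every world accessible from every other, and Q assigns the
    whole super-domain to each world, as in the construction in the text.
    """
    accessibility = {(w, v) for w in worlds for v in worlds}    # universal R
    subdomains = {w: set(domain) for w in worlds}               # Q(w) = D for every w
    return set(worlds), accessibility, set(domain), subdomains, interpretation


# A small SQML model, made up for this example.
W, R, D, Q, I = sqml_to_vdqml({"r", "a"}, {"u", "v"}, {"F": {("u", "r")}})
print(len(R), all(Q[w] == D for w in W))  # 4 True
```

With a universal R and a constant Q, the VDQML clauses for the quantifiers and the box reduce to the SQML clauses, which is why the same sentences come out true.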

9.6.1 Countermodels to the Barcan and related formulas in VDQML

What is the effect of this new truth definition on the Barcan formula and related
formulas? All of these formulas come out invalid:

∀x□Fx→□∀xFx      ◇∃xFx→∃x◇Fx      (Barcan)
□∀xFx→∀x□Fx      ∃x◇Fx→◇∃xFx      (converse Barcan)
□∃xFx→∃x□Fx      ∀x◇Fx→◇∀xFx
∃x□Fx→□∃xFx      ◇∀xFx→∀x◇Fx

The third one on the list was invalid before, and so is still invalid now. As for
the Barcan formula, here is a countermodel:


[Diagram: world r, with D_r: {u} and F: {u}, has an arrow to world a, with
D_a: {u,v} and F: {u}; each world also has an arrow to itself.]

Official model:

W = {r, a}
R = {⟨r,r⟩, ⟨r,a⟩, ⟨a,a⟩}
D = {u, v}
D_r = {u}
D_a = {u, v}
I(F) = {⟨u,r⟩, ⟨u,a⟩}

I leave the demonstrations of the invalidity of the remaining formulas as exer-

cises.
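The official model for the Barcan formula can be checked by transcribing the VDQML clauses into a few lines of Python. This is a sketch, not a general-purpose tool: the tuple encoding of formulas and the names `holds`, `ACCESS`, `SUBDOMAIN`, and `EXT` are assumptions made here, and only the connectives needed for this one formula are implemented.

```python
WORLDS = {"r", "a"}
ACCESS = {("r", "r"), ("r", "a"), ("a", "a")}    # R
SUBDOMAIN = {"r": {"u"}, "a": {"u", "v"}}        # Q: maps each world w to D_w
EXT = {"F": {("u", "r"), ("u", "a")}}            # I(F)


def holds(phi, w, g):
    """Truth value of phi at world w under assignment g (VDQML clauses)."""
    op = phi[0]
    if op == "pred":
        return (g[phi[2]], w) in EXT[phi[1]]
    if op == "imp":
        return (not holds(phi[1], w, g)) or holds(phi[2], w, g)
    if op == "forall":                            # ranges over D_w only
        return all(holds(phi[2], w, {**g, phi[1]: u}) for u in SUBDOMAIN[w])
    if op == "box":                               # ranges over accessible worlds only
        return all(holds(phi[1], v, g) for v in WORLDS if (w, v) in ACCESS)
    raise ValueError(f"unknown operator: {op}")


Fx = ("pred", "F", "x")
antecedent = ("forall", "x", ("box", Fx))
consequent = ("box", ("forall", "x", Fx))
barcan = ("imp", antecedent, consequent)

print(holds(antecedent, "r", {}), holds(consequent, "r", {}), holds(barcan, "r", {}))
# True False False
```

The antecedent is true at r because the only member of D_r, namely u, is F at both accessible worlds; the consequent fails because v, which exists only at a, is not F there.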

Exercise 9.2 Does the move to variable domain semantics change

whether any of the formulas in exercise set 9.1 are valid? Justify

your answers.

Exercise 9.3 Demonstrate the VDQML-invalidity of the following
formulas:

a) □∀xFx→∀x□Fx

b) ∃x□Fx→□∃xFx

c) ∀x□∃y y=x

9.6.2 Expanding, shrinking domains

There are several comments worth making about VDQML-models. First,
note that if we made certain restrictions on variable-domains models, then
the countermodels of the previous section would no longer be legal models.
For example, the first example, the counterexample to the Barcan formula,
required a model in which the domain expanded; world a was accessible from
world r, and had a larger domain. But suppose we made the decreasing domains
requirement:

if Rwv, then D_v ⊆ D_w

The counterexample would then go away. Indeed, every instance of the Barcan
schema would then become VDQML-valid, which may be proved as follows:

i) suppose for reductio that V_g(∀α□φ→□∀αφ, w) = 0. Then V_g(∀α□φ, w) = 1 and…

ii) …V_g(□∀αφ, w) = 0

iii) by ii), for some v, Rwv and V_g(∀αφ, v) = 0

iv) and so, for some u ∈ D_v, V_{g^α_u}(φ, v) = 0

v) given decreasing domains, D_v ⊆ D_w, and so u ∈ D_w

vi) by i), for every object in D_w, and so for u in particular, V_{g^α_u}(□φ, w) = 1

vii) so, since Rwv, V_{g^α_u}(φ, v) = 1. Contradicts iv)
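As a sanity check on this proof, one can enumerate every VDQML model over two worlds and two objects and test the instance ∀x□Fx→□∀xFx. The Python sketch below does this; the fixed sizes, the direct unfolding of the truth conditions in `barcan_true_at`, and all names are assumptions of the sketch, and a finite search illustrates the result without replacing the proof.

```python
from itertools import combinations, product

WORLDS = ["r", "a"]
OBJECTS = ["u", "v"]
PAIRS = list(product(WORLDS, repeat=2))      # candidate members of R
FACTS = list(product(OBJECTS, WORLDS))       # candidate members of I(F)
SUBDOMAINS = [set(c) for n in (1, 2) for c in combinations(OBJECTS, n)]


def subsets(items):
    """Yield every subset of items as a set."""
    for bits in product([False, True], repeat=len(items)):
        yield {x for x, keep in zip(items, bits) if keep}


def barcan_true_at(w, R, Q, F):
    """Evaluate the Barcan instance for F at world w, unfolding the VDQML clauses."""
    access = [v for v in WORLDS if (w, v) in R]
    antecedent = all((u, v) in F for u in Q[w] for v in access)   # everything in D_w is F at each accessible v
    consequent = all((u, v) in F for v in access for u in Q[v])   # at each accessible v, everything in D_v is F
    return (not antecedent) or consequent


def decreasing(R, Q):
    """The decreasing domains requirement: if Rwv then D_v is a subset of D_w."""
    return all(Q[v] <= Q[w] for (w, v) in R)


failures_with_requirement = 0
failures_without_requirement = 0
for R in subsets(PAIRS):
    for D_r, D_a in product(SUBDOMAINS, repeat=2):
        Q = {"r": D_r, "a": D_a}
        for F in subsets(FACTS):
            for w in WORLDS:
                if not barcan_true_at(w, R, Q, F):
                    failures_without_requirement += 1
                    if decreasing(R, Q):
                        failures_with_requirement += 1

print(failures_with_requirement, failures_without_requirement > 0)  # 0 True
```

Failures occur among the unrestricted models (the official countermodel of section 9.6.1 is one of them), but none survives the decreasing domains requirement.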

Similarly, notice that the counterexamples to ∃x□Fx→□∃xFx and the
converse Barcan formula assumed that domains can shrink. But the following
increasing domains requirement validates these formulas (as may be easily shown):

if Rwv then D_w ⊆ D_v

Even after imposing the increasing domains requirement, the Barcan formula
remains VDQML-invalid; and after imposing the decreasing domains
requirement, the converse Barcan formula and also ∃x□Fx→□∃xFx remain
VDQML-invalid (the original countermodels for these formulas establish this.)
However, in systems in which the accessibility relation is symmetric, this
distinction collapses: imposing either of these requirements results in imposing the other.
That is, in B or S5, imposing either the increasing or the decreasing domains
requirement results in imposing both, and hence results in all three formulas
being validated.
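The collapse under symmetry follows from the two requirements being mirror images of one another, and it can be illustrated on small frames. The Python sketch below checks every symmetric accessibility relation on two worlds against every assignment of non-empty subdomains drawn from two objects; the names `increasing`, `decreasing`, and `symmetric`, and the sizes chosen, are assumptions of the sketch.

```python
from itertools import combinations, product


def increasing(R, Q):
    """If Rwv then D_w is a subset of D_v."""
    return all(Q[w] <= Q[v] for (w, v) in R)


def decreasing(R, Q):
    """If Rwv then D_v is a subset of D_w."""
    return all(Q[v] <= Q[w] for (w, v) in R)


def symmetric(R):
    return all((v, w) in R for (w, v) in R)


worlds = ["r", "a"]
pairs = list(product(worlds, repeat=2))
subdomains = [frozenset(c) for n in (1, 2) for c in combinations("uv", n)]

collapses = True
for bits in product([False, True], repeat=len(pairs)):
    R = {p for p, keep in zip(pairs, bits) if keep}
    if not symmetric(R):
        continue
    for D_r, D_a in product(subdomains, repeat=2):
        Q = {"r": D_r, "a": D_a}
        if increasing(R, Q) != decreasing(R, Q):
            collapses = False
print(collapses)  # True

# Without symmetry the requirements come apart, as in the Barcan countermodel:
R0 = {("r", "r"), ("r", "a"), ("a", "a")}
Q0 = {"r": {"u"}, "a": {"u", "v"}}
print(increasing(R0, Q0), decreasing(R0, Q0))  # True False
```

When R is symmetric, each pair ⟨w, v⟩ in R comes with ⟨v, w⟩, so either requirement forces D_w = D_v for accessible worlds.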


Exercise 9.4 Show that every instance of each of the following

schemas is valid given the increasing domains requirement.

a) □∀αφ→∀α□φ

b) ∃α□φ→□∃αφ

9.6.3 Strong and weak necessity

In order for □φ to be true at a world, the VDQML semantics requires that φ
be true at every accessible world. It might be thought that this requirement is
too strong. In order for □Fa, say, to be true, our definition requires Fa to be
true in all possible worlds. But what if a fails to exist in some worlds? In order
for “Necessarily, I am human” to be true, must I be human in every possible
world? Isn’t it enough for me to be human in all the worlds in which I exist?

This argument goes by a little too quickly. The main worry of its proponent
is that our semantics requires a to exist necessarily, in order for □Fa to come
out true. But our semantics doesn’t require this. It does require Fa to be
true in every world, in order for □Fa to be true; but it does not require a
to exist in every world in which Fa is true. The clause in the definition of a
VDQML-model for the interpretation of predicates was this:

· if Π is an n-place predicate then I(Π) is a set of ordered n+1-tuples
⟨u_1, ..., u_n, w⟩, where u_1, ..., u_n are members of D, and w ∈ W.

This allows I(F) to contain pairs ⟨u, w⟩, where u is not a member of D_w. So
one could say that □Fa is consistent with a’s failing to necessarily exist; it’s just
that a has to be F even in worlds where it doesn’t exist.

I doubt this really addresses the worry, since it looks like bad metaphysics

to say that a person could be human at a world where he doesn’t exist. One

could hard-wire a prohibition of this sort of bad metaphysics into VDQML

semantics, by replacing the old clause with a new one:

· if Π is an n-place predicate then I(Π) is a set of ordered n+1-tuples
⟨u_1, ..., u_n, w⟩, where u_1, ..., u_n are members of D_w, and w ∈ W.

thus barring objects from having properties at worlds where they don’t exist.
But some would argue that this goes too far. The new clause validates
∀x□(Fx→∃y y=x). “An object must exist in order to be F” sounds clearly

true if F stands for ‘is human’, but what if F stands for ‘is famous’? If Baconians

had been right and there had been no such person as Shakespeare, perhaps

Shakespeare might still have been famous.

The issues here are complex.7

But whether or not we should adopt the new

clause, it looks as though there are some existence-entailing English predicates

π: predicates π such that nothing can be a π without existing. ‘Is human’ seems

to be such a predicate. So we’re back to our original worry about VDQML-

semantics: its truth condition for 2φ requires truth of φ at all worlds, which is

allegedly too strong, at least when φ is a sentence like πa, where π is existence-

entailing.

One could modify the clause for the □ in the definition of the valuation
function, so that in order for □Fa to be true, a only needs to be F in worlds in
which it exists:

V_{M,g}(□φ, w) = 1 iff for each v ∈ W, if Rwv, and if [α]_{M,g} ∈ D_v for each
name or free variable α occurring in φ, then V_{M,g}(φ, v) = 1

(“Free variable” here means a variable not bound to any quantifier in φ.) This
would indeed have the result that □Fa gets to be true provided a is F in every
world in which it exists. But be careful what you wish for. Along with this result
comes the following: even if a doesn’t necessarily exist, the sentence □∃x x=a
comes out true. For according to the new clause, in order for □∃x x=a to be
true, it must merely be the case that ∃x x=a is true in every world in which a
exists, and of course this is indeed the case.

If □∃x x=a comes out true even if a doesn’t necessarily exist, then □∃x x=a
doesn’t say that a necessarily exists. Indeed, it doesn’t look like we have any way
of saying that a necessarily exists, using the language of QML, if the □ has the
meaning provided for it by the new clause.

A notion of necessity according to which “Necessarily φ” requires truth in

all possible worlds is sometimes called a notion of strong necessity. In contrast,

a notion of weak necessity is one according to which “Necessarily φ” requires

merely that φ be true in all worlds in which objects named within φ exist. The

new clause for the □ corresponds to weak necessity, whereas our original clause
corresponds to strong necessity.

7The question is that of so-called “serious actualism” (Plantinga, 1983).

As we saw, if the □ expresses weak necessity, then one cannot even express
the idea that a thing necessarily exists. That’s because one needs strong necessity
to say that a thing necessarily exists: in order to necessarily exist, you need to
exist at all worlds, not just at all worlds at which you exist! So this is a serious
deficiency of having the □ of QML express weak necessity. But if we allow the
□ to express strong necessity instead, there is no corresponding deficiency, for
one can still express weak necessity using the strong □ and other connectives.
For example, to say that a is weakly necessarily F (that is, that a is F in every
world in which it exists), one can say: □(∃x x=a→Fa).

So it would seem that we should stick with our original truth condition for
the □, and live with the fact that statements like □Fa turn out false if a fails
to be F at worlds in which it doesn’t exist. Those who think that “Necessarily,
Ted is human” is true despite Ted’s possible nonexistence can always translate
this natural language sentence into the language of QML as □(∃x x=a→Fa)
(which requires a to be F only at worlds at which it exists) rather than as □Fa
(which requires a to be F at all worlds).
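The contrast between the two clauses for the □ can be made concrete with a small model. The Python sketch below is an illustration under stated assumptions: the model (two mutually accessible worlds, with the referent of a existing at only one of them), the `weak` flag, and the helper names `den` and `free_terms` are all introduced here, and only the connectives needed for the examples are implemented.

```python
WORLDS = {"w1", "w2"}
ACCESS = {(w, v) for w in WORLDS for v in WORLDS}    # every world sees every world
SUBDOMAIN = {"w1": {"t", "s"}, "w2": {"s"}}          # t does not exist at w2
NAMES = {"a": "t"}                                    # the name a denotes t
EXT = {"F": {("t", "w1")}}                            # t is F exactly where it exists


def den(term, g):
    """Denotation of a name or variable."""
    return NAMES[term] if term in NAMES else g[term]


def free_terms(phi):
    """Names and free variables occurring in phi."""
    op = phi[0]
    if op == "pred":
        return {phi[2]}
    if op == "eq":
        return {phi[1], phi[2]}
    if op == "imp":
        return free_terms(phi[1]) | free_terms(phi[2])
    if op == "exists":
        return free_terms(phi[2]) - {phi[1]}
    if op == "box":
        return free_terms(phi[1])
    raise ValueError(f"unknown operator: {op}")


def holds(phi, w, g, weak=False):
    """VDQML truth at w; weak=True uses the modified (weak necessity) box clause."""
    op = phi[0]
    if op == "pred":
        return (den(phi[2], g), w) in EXT[phi[1]]
    if op == "eq":
        return den(phi[1], g) == den(phi[2], g)
    if op == "imp":
        return (not holds(phi[1], w, g, weak)) or holds(phi[2], w, g, weak)
    if op == "exists":
        return any(holds(phi[2], w, {**g, phi[1]: u}, weak) for u in SUBDOMAIN[w])
    if op == "box":
        for v in WORLDS:
            if (w, v) not in ACCESS:
                continue
            # Weak clause: skip worlds where some referent fails to exist.
            if weak and not all(den(t, g) in SUBDOMAIN[v] for t in free_terms(phi[1])):
                continue
            if not holds(phi[1], v, g, weak):
                return False
        return True
    raise ValueError(f"unknown operator: {op}")


Fa = ("pred", "F", "a")
Ea = ("exists", "x", ("eq", "x", "a"))                # "a exists"

print(holds(("box", Fa), "w1", {}), holds(("box", Fa), "w1", {}, weak=True))  # False True
print(holds(("box", Ea), "w1", {}), holds(("box", Ea), "w1", {}, weak=True))  # False True
print(holds(("box", ("imp", Ea, Fa)), "w1", {}))                              # True
```

Under the weak clause the box applied to "a exists" comes out true even though t is missing from w2, so that sentence no longer says that a exists necessarily. Under the strong clause it is false, and weak necessity is still expressible by the conditional form on the last line.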

Chapter 10

Two-dimensional modal logic

In this chapter we consider an extension to modal logic with considerable

philosophical interest.

10.1 Actuality

The word ‘actually’, in one of its senses anyway, can be thought of as a one-place

sentence operator: “Actually, φ.”

‘Actually’ might at first seem redundant. “Actually, snow is white” basically

amounts to: “snow is white”. But the actuality operator interacts with modal

operators in interesting ways. The following two sentences, for example, clearly

have different meanings:

Necessarily, if snow is white then snow is white

Necessarily, if snow is white then snow is actually white

The first sentence expresses the triviality that snow is white in any possible
world in which snow is white. But the second sentence makes the nontrivial
statement that if snow is white in any world, then snow is white in the actual
world.

So, ‘actually’ is nonredundant, and consequently, worth thinking about.
Let’s add a symbol to modal logic for it. “@φ” will symbolize “Actually, φ”.
We can now symbolize the pair of sentences above as □(S→S) and □(S→@S),


respectively. For some further examples of sentences we can symbolize using

‘actually’, consider:1

It might have been that everyone who is actually rich is

poor

◇∀x(@Rx→Px)

There could have existed something that does not actu-

ally exist

◇∃x@∼∃y y=x

10.1.1 Kripke models with designated worlds

For the purposes of this chapter, the logic of iterated boxes and diamonds isn’t
relevant, so let’s simplify things by dropping the accessibility relation from
models; we will thereby treat every world as being accessible from every other.

Before laying out the semantics of @, let’s examine a slightly different way
of laying out standard modal logic. For propositional modal logic, instead of
defining a model as an ordered pair ⟨W, I⟩ (no accessibility relation, remember),
one could instead define a model as a triple ⟨W, w_@, I⟩, where W and I
are as before, and w_@ is a member of W, thought of as the actual, or designated,
world of the model. The designated world w_@ plays no role in the definition of
the valuation for a given model; it only plays a role in the definitions of truth
in a model and validity:

Definitions of truth in a model and validity with designated worlds:

· φ is true in model M (= ⟨W, w_@, I⟩) iff V_M(φ, w_@) = 1

· φ is valid in system S iff φ is true in all models for system S

One could add a designated world to models for quantified modal logic in a
parallel way.

1In certain special cases, we could do without the new symbol @. For example, instead of
symbolizing “Necessarily, if snow is white then snow is actually white” as □(S→@S), we could
symbolize it as ◇S→S. But the @ is not in general eliminable; see Hodes (1984b,a).

The old definition of validity for a system (section 6.3), recall, never employed
the notion of truth in a model; rather, it proceeded via the notion of
validity in a frame. The nice thing about the new definition is that it’s parallel
to the way validity is usually defined in model theory: one first defines truth in
a model, and then defines validity as truth in all models. But the new definition
doesn’t differ in any substantive way from the old definition, in that it yields
exactly the same class of valid formulas:

Proof. It’s obvious that everything valid on the old definition is valid on the
new definition (the old definition says that validity is truth in all worlds in
all models; the addition of the designated world w_@ doesn’t play any role in
defining truth at worlds, so each of the new models has the same distribution
of truth values as one of the old models.) Moreover, suppose that a formula is
invalid on the old definition, i.e., suppose that φ is false at some world, w, in
some model M. Now construct a model of the new variety that’s just like M
except that its designated world is w. φ will be false in this model, and so φ
turns out invalid under the new definition.

10.1.2 Semantics for @

Now for the semantics of @. We can give @ a very simple semantics using
models with designated worlds. Further, the designated world will now be
involved in the notion of truth in a model, not just in the definition of validity.
We’ll move straight to quantified modal logic, bypassing propositional logic. To
keep things simple, let the models have a constant domain and no accessibility
relation. (It will be obvious how to add these complications back in, if they are
desired.)

Definition of a Designated-world SQML-model: A designated-world
SQML-model is a four-tuple ⟨W, w_@, D, I⟩, where:

· W is a non-empty set (“worlds”)

· w_@ is a member of W (“designated/actual world”)

· D is a non-empty set (“domain”)

· I is an “interpretation” function that assigns semantic values as before (to
names: members of D; to predicates: extensions relative to worlds)

In the definition of the valuation for such a model, the semantic clauses for the
old logical constants run just as with SQML (section 9.3); and we now add a
clause for the new operator @:

· V_{M,g}(@φ, w) = 1 iff V_{M,g}(φ, w_@) = 1

i.e., @φ is true at any world iff φ is true in the designated world of the model.


10.1.3 Establishing validity and invalidity

The strategies for establishing the validity or invalidity of a given formula are
similar to those from chapter 9.

Example 10.1: Show that ⊨ ∀x(Fx∨□Gx)→□∀x(Gx∨@Fx)

i) Suppose for reductio that this formula is not valid. Then for some model
and some variable assignment g, V_g(∀x(Fx∨□Gx)→□∀x(Gx∨@Fx), w_@) = 0.

ii) Then V_g(∀x(Fx∨□Gx), w_@) = 1 and…

iii) …V_g(□∀x(Gx∨@Fx), w_@) = 0

iv) Given the latter, there is some world, call it “a”, such that
V_g(∀x(Gx∨@Fx), a) = 0. And so, there is some object, call it “u”, in the
model’s domain, D, such that V_{g^x_u}(Gx∨@Fx, a) = 0

v) And so V_{g^x_u}(Gx, a) = 0 and…

vi) …V_{g^x_u}(@Fx, a) = 0

vii) Given the latter, V_{g^x_u}(Fx, w_@) = 0 (by the clause in the truth definition
for @)

viii) Given ii), for every object in D, and so for u in particular,
V_{g^x_u}(Fx∨□Gx, w_@) = 1.

ix) And so, either V_{g^x_u}(Fx, w_@) = 1 or V_{g^x_u}(□Gx, w_@) = 1

x) From ix) and vii), V_{g^x_u}(□Gx, w_@) = 1

xi) And so, V_{g^x_u}(Gx, a) = 1, which contradicts v)

Example 10.2: Show that ⊭ □∀x(Gx∨@Fx)→□∀x(Gx∨Fx). Here is a
model in which this formula is false:

W = {w_@, a}
D = {u}
I(F) = {⟨u, w_@⟩}
I(G) = ∅

The formula turns out false in this model, which means that it turns out false
in w_@: the consequent is false in w_@ because at world a, something (namely, u)
is neither G nor F; but the antecedent is true there: since u is F at w_@, it’s
necessary that u is either G or actually F.
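Both examples can be replayed in code by adding one clause for @ to an SQML evaluator. The following Python sketch uses the model of Example 10.2; the tuple encoding and the names `holds`, `ACTUAL`, `ex_10_1`, and `ex_10_2` are assumptions made for illustration.

```python
WORLDS = {"w@", "a"}
ACTUAL = "w@"                                  # the designated world
DOMAIN = {"u"}
EXT = {"F": {("u", "w@")}, "G": set()}         # I(F) and I(G) from Example 10.2


def holds(phi, w, g):
    """Truth value of phi at world w in a designated-world SQML-model."""
    op = phi[0]
    if op == "pred":
        return (g[phi[2]], w) in EXT[phi[1]]
    if op == "or":
        return holds(phi[1], w, g) or holds(phi[2], w, g)
    if op == "imp":
        return (not holds(phi[1], w, g)) or holds(phi[2], w, g)
    if op == "forall":
        return all(holds(phi[2], w, {**g, phi[1]: u}) for u in DOMAIN)
    if op == "box":
        return all(holds(phi[1], v, g) for v in WORLDS)
    if op == "at":                              # the @ clause: jump to the designated world
        return holds(phi[1], ACTUAL, g)
    raise ValueError(f"unknown operator: {op}")


Fx, Gx = ("pred", "F", "x"), ("pred", "G", "x")
box_all_G_or_actually_F = ("box", ("forall", "x", ("or", Gx, ("at", Fx))))
ex_10_1 = ("imp", ("forall", "x", ("or", Fx, ("box", Gx))), box_all_G_or_actually_F)
ex_10_2 = ("imp", box_all_G_or_actually_F, ("box", ("forall", "x", ("or", Gx, Fx))))

# Truth in the model is truth at the designated world.
print(holds(ex_10_1, ACTUAL, {}), holds(ex_10_2, ACTUAL, {}))  # True False
```

That the formula of Example 10.1 is true in this one model does not, of course, establish its validity; the reductio does that. The model does falsify the formula of Example 10.2, as claimed.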

10.2 ×

Adding @ to the language of quantified modal logic is a step in the right
direction, since it allows us to express certain kinds of comparisons between
possible worlds that we couldn’t express otherwise. But it doesn’t go far enough;
we need a further addition.2

Consider this sentence:

It might have been the case that, if all those then rich

might all have been poor, then someone is happy

What it’s saying, in possible worlds terms, is this:

For some world w, if there’s a world v such that (every-

one who is rich in w is poor in v), then someone is happy

in w.

This is a bit like “It might have been that everyone who is actually rich is poor”;

in this new sentence the word ‘then’ plays a role a bit like the role ‘actually’

played in the earlier sentence. But the ‘then’ does not take us back to the actual

world of the model; it rather takes us back to the world, w, that is introduced

by the first possibility operator, ‘it might have been the case that’. We cannot,
therefore, symbolize our new sentence thus:

◇(◇∀x(@Rx→Px)→∃xHx)

for this has the truth condition that there is a world w such that, if there’s a
world v such that (everyone who is rich in w_@ is poor in v), then someone is
happy in w. The problem is that the @, as we’ve defined it, always takes us
back to the model’s designated world, whereas what we need to do is to “mark”
a world, and have @ take us back to the “marked” world:

◇×(◇∀x(@Rx→Px)→∃xHx)

× marks the spot: it is a point of reference for subsequent occurrences of @.

2See Hodes (1984a) on the limitations of @; see Cresswell (1990) on × (his symbol is “Ref”),
and further related additions.


10.2.1 Two-dimensional semantics for ×

So let’s further augment the language of QML with another one-place sentence
operator, ×. The idea is that ×φ means the same thing as φ, except that
subsequent occurrences of @ in φ are to be interpreted as picking out the world
that was the “current world of evaluation” when the × was encountered. (This
will become clearer once we lay out the semantics for × and @.)

To lay out this semantics, let’s return to the old SQML models (i.e., without
a designated world). Thus, a model is a triple ⟨W, D, I⟩: W a non-empty set,
D a non-empty set, I a function assigning referents to names and extensions
to predicates at worlds as before. But now we change the definition of truth.
We no longer evaluate formulas at worlds. Instead we evaluate a formula at a
pair of worlds (hence: “two-dimensional semantics”). One world is the world
we’re used to; it’s the world that we’re evaluating the formula for truth in. Call
this the “world of evaluation”. The other world is a “reference world”: it’s
the world that we’re currently thinking of as the actual world, and the world
that will be relevant to the evaluation of @. Thus, “V_{M,g}(φ, w_1, w_2) = 1” will mean
that φ is true at world w_2, with reference world w_1. We define [α]_{M,g}, the
denotation of term α relative to model M and variable assignment g, as before.
And we define the valuation function as follows.

Definition of two-dimensional valuation function: The two-dimensional
valuation function, V_{M,g}, for an SQML-model M (= ⟨W, D, I⟩) is defined as
the three-place function that assigns to each wff, relative to each pair of worlds,
either 0 or 1 subject to the following constraints, for any n-place predicate Π,
terms α_1, ..., α_n, wffs φ and ψ, and variable β:

· V_{M,g}(Πα_1...α_n, v, w) = 1 iff ⟨[α_1]_{M,g}, ..., [α_n]_{M,g}, w⟩ ∈ I(Π)

· V_{M,g}(∼φ, v, w) = 1 iff V_{M,g}(φ, v, w) = 0

· V_{M,g}(φ→ψ, v, w) = 1 iff V_{M,g}(φ, v, w) = 0 or V_{M,g}(ψ, v, w) = 1

· V_{M,g}(∀βφ, v, w) = 1 iff for all u ∈ D, V_{M,g^β_u}(φ, v, w) = 1

· V_{M,g}(□φ, v, w) = 1 iff for all w′ ∈ W, V_{M,g}(φ, v, w′) = 1

· V_{M,g}(@φ, v, w) = 1 iff V_{M,g}(φ, v, v) = 1

· V_{M,g}(×φ, v, w) = 1 iff V_{M,g}(φ, w, w) = 1


Note what the × does: it changes the reference world. When evaluating a formula, it says to forget about the old reference world, and make the new reference world whatever the current world of evaluation happens to be.
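The clauses above can be sketched as a small recursive evaluator. This is only an illustrative sketch for the propositional fragment: the tuple encoding of formulas, the operator tags ("atom", "not", "imp", "box", "at", "cross"), and the representation of a model as a pair (W, I) with I mapping each sentence letter to the set of worlds where it is true are all assumptions made for the example, not notation from the text.

```python
# Formulas are nested tuples, e.g. ("box", ("at", ("atom", "p"))).
# A model is a pair (W, I): W is a set of worlds, and I maps each
# sentence letter to the set of worlds at which it is true.
# (This encoding is an assumption made for illustration.)

def val(phi, v, w, model):
    """Truth of phi with reference world v and world of evaluation w."""
    W, I = model
    op = phi[0]
    if op == "atom":
        return w in I[phi[1]]
    if op == "not":
        return not val(phi[1], v, w, model)
    if op == "imp":
        return (not val(phi[1], v, w, model)) or val(phi[2], v, w, model)
    if op == "box":    # quantify over worlds of evaluation
        return all(val(phi[1], v, w2, model) for w2 in W)
    if op == "at":     # @: jump to the reference world
        return val(phi[1], v, v, model)
    if op == "cross":  # ×: make the current world the new reference world
        return val(phi[1], w, w, model)
    raise ValueError(f"unknown operator: {op}")

model = ({"c", "d"}, {"p": {"c"}})
p = ("atom", "p")
at_p = ("at", p)

print(val(at_p, "c", "d", model))             # True: p holds at the reference world c
print(val(("cross", at_p), "c", "d", model))  # False: × resets the reference world to d
```

The last two lines illustrate the point of the paragraph above: at the pair ⟨c, d⟩, @p looks back to c, whereas ×@p first resets the reference world to d.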

We can define validity and consequence thus:

Two-dimensional definitions of validity and consequence:

· φ is 2D-valid (“⊨_{2D} φ”) iff for every model M, every world w in that model, and every assignment g based on that model, V_{M,g}(φ, w, w) = 1

· φ is a 2D-semantic consequence of Γ (“Γ ⊨_{2D} φ”) iff for every model M, every assignment g based on that model, and every world w in that model, if V_{M,g}(γ, w, w) = 1 for each γ ∈ Γ, then V_{M,g}(φ, w, w) = 1

Valid formulas are thus defined as those that are true at every pair of worlds of the form ⟨w, w⟩; semantic consequence is truth-preservation at every such pair.

Notice, however, that these aren’t the only notions of validity and con-

sequence that one could introduce. There is also the notion of truth, and

truth-preservation, at every pair of worlds:3

Definitions of general 2D validity and consequence:

· φ is generally 2D-valid (“⊨_{G2D} φ”) iff for every model M, any worlds v and w in that model, and every assignment g based on that model, V_{M,g}(φ, v, w) = 1

· φ is a general 2D-semantic consequence of Γ (“Γ ⊨_{G2D} φ”) iff for every model M, every assignment g based on that model, and any worlds v and w in that model, if V_{M,g}(γ, v, w) = 1 for each γ ∈ Γ, then V_{M,g}(φ, v, w) = 1

Validity and general validity, and consequence and general consequence, come

apart in various ways, as we’ll see below.

As we saw, moving to this new language increases the flexibility of the @; we can symbolize

It might have been the case that, if all those then rich might all have been poor, then someone is happy

as

◇×(◇∀x(@Rx→Px)→∃xHx)

3The term ‘general validity’ is from Davies and Humberstone (1980); the first definition of validity corresponds to their “real-world validity”.

Moreover, it costs us nothing. For we can replace any sentence φ of the old language with ×φ in the new language (i.e., we just put the × operator at the front of the sentence).4

For example, instead of symbolizing

It might have been that everyone who is actually rich is poor

as ◇∀x(@Rx→Px) as we did before, we symbolize it now as:

×◇∀x(@Rx→Px)

Example 10.3: Show that if ⊨ φ then ⊨ @φ. Suppose for reductio that φ is valid but @φ is not. That means that in some model and some world, w (and some assignment g, but I’ll suppress this since it isn’t relevant here), V(@φ, w, w) = 0. Thus, given the truth condition for @, V(φ, w, w) = 0. But that violates the validity of φ.

Example 10.4: Show that every instance of φ↔@φ is 2D-valid, but not every instance of □(φ↔@φ) is. (Moral: any proof theory for this logic had better not include the rule of necessitation!) For the first, the truth condition for @ ensures that for any world w in any model (and any variable assignment), V(@φ, w, w) = 1 iff V(φ, w, w) = 1, and so V(φ↔@φ, w, w) = 1. Thus, ⊨ φ↔@φ.

But some instances of □(φ↔@φ) aren’t valid. Let φ be ‘Fa’; here’s a countermodel:

W = {c, d}

D = {u}

I(a) = u

I(F) = {⟨u, c⟩}

4This amounts to the same thing as the old symbolization in the following sense. Let φ be any wff of the old language. Thus, φ may have some occurrences of @, but it has no occurrences of ×. Then, for every SQML-model M = ⟨W, D, I⟩, and any v, w ∈ W, ×φ is true at ⟨v, w⟩ in M iff φ is true in the designated-world SQML model ⟨W, w, D, I⟩.


In this model, V(□(Fa↔@Fa), c, c) = 0, because V(Fa↔@Fa, c, d) = 0. For ‘Fa’ is true at ⟨c, d⟩ iff the referent of ‘a’ is in the extension of ‘F’ at world d (it isn’t), whereas ‘@Fa’ is true at ⟨c, d⟩ iff the referent of ‘a’ is in the extension of ‘F’ at world c (it is).

Note that this same model shows that φ↔@φ is not generally valid. General validity is truth at all pairs of worlds, and the formula Fa↔@Fa, as we just showed, is false at the pair ⟨c, d⟩.
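The countermodel just described can be checked mechanically. The following is a minimal sketch that hard-codes this one model (worlds c and d, a single object u) and this one formula; the function names are hypothetical, and checking a single model of course shows only invalidity, not validity.

```python
# The countermodel from Example 10.4: W = {c, d}, D = {u}, I(a) = u, I(F) = {<u, c>}.
W = ["c", "d"]
F_ext = {("u", "c")}   # I(F): (object, world) pairs
a = "u"                # I(a)

def Fa(v, w):
    """'Fa' at <v, w>: the referent of a is in the extension of F at w."""
    return (a, w) in F_ext

def at_Fa(v, w):
    """'@Fa' at <v, w>: evaluate 'Fa' at the reference world v."""
    return Fa(v, v)

def iff(v, w):
    """'Fa <-> @Fa' at <v, w>."""
    return Fa(v, w) == at_Fa(v, w)

def box_iff(v, w):
    """'box(Fa <-> @Fa)' at <v, w>: quantify over worlds of evaluation."""
    return all(iff(v, w2) for w2 in W)

diagonal_ok = all(iff(w, w) for w in W)              # truth at every pair <w, w>
generally_ok = all(iff(v, w) for v in W for w in W)  # truth at every pair <v, w>

print(diagonal_ok)        # True
print(generally_ok)       # False: the biconditional fails at <c, d>
print(box_iff("c", "c"))  # False: so the necessitation fails at <c, c>
```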

Exercise 10.1 Demonstrate the following facts:

a) ⊨ φ→□@φ (for any wff φ)

b) ⊨ □×∀x◇@Fx→□∀xFx

10.3 Fixedly

The two-dimensional approach to semantics (evaluating formulas at pairs of worlds rather than single worlds) raises an intriguing possibility. The □ is a universal quantifier over the world of evaluation; we might, by analogy, follow Davies and Humberstone (1980) and introduce an operator that is a universal quantifier over the reference world. Davies and Humberstone call this operator F, and read “Fφ” as “fixedly, φ”. Grammatically, F is a one-place sentential operator. Its semantic clause is this:

· V_{M,g}(Fφ, v, w) = 1 iff for every v′ ∈ W, V_{M,g}(φ, v′, w) = 1

All the other two-dimensional semantic definitions, including the definitions of validity and consequence, remain the same.5

Humberstone and Davies point out that given F, @, and □, we can introduce two new operators: F@ and F□. It’s easy to show that:

· V_{M,g}(F@φ, v, w) = 1 iff for every v′ ∈ W, V_{M,g}(φ, v′, v′) = 1

· V_{M,g}(F□φ, v, w) = 1 iff for all v′, w′ ∈ W, V_{M,g}(φ, v′, w′) = 1

Thus, we can think of F@ and F□, as well as □ and F themselves, as expressing “kinds of necessities”, since their truth conditions introduce universal quantifiers over worlds of evaluation and reference worlds. (What about □F? It’s easy to show that □F is just equivalent to F□.)

5Humberstone and Davies don’t use two-dimensional semantics; they instead use designated-world QML models (and they don’t include ×). Say that designated-world QML models are variants iff they are alike except perhaps for the designated world. Their truth condition for F is then this: V_{M,g}(Fφ, w) = 1 iff for every model M′ that is a variant of M, V_{M′,g}(φ, w) = 1. This approach isn’t significantly different from the present two-dimensional one.

As with the semantics of the previous section, validity and general validity do not always coincide, as the following example shows.

Example 10.5: F@φ→φ is 2D-valid for each wff φ (exercise 10.2). But not every instance of this wff is generally valid. The formula F@(@Ga↔Ga)→(@Ga↔Ga) is not generally valid, for example. General validity requires truth at all pairs ⟨v, w⟩ in all models. But in the following model, V_g(F@(@Ga↔Ga)→(@Ga↔Ga), c, d) = 0 (for any variable assignment g):

W = {c, d}

D = {u}

I(a) = u

I(G) = {⟨u, c⟩}

In this model, the referent of ‘a’ is in the extension of ‘G’ in world c, but not in world d. That means that @Ga is true at ⟨c, d⟩ whereas Ga is false at ⟨c, d⟩, and so @Ga↔Ga is false at ⟨c, d⟩. But F@φ means that φ is true at all pairs of the form ⟨v, v⟩, and the formula @Ga↔Ga is true at any such pair (in any model). Thus, F@(@Ga↔Ga) is true at ⟨c, d⟩ in this model, and so the conditional, with its true antecedent and false consequent, is false at ⟨c, d⟩.
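As a rough check of Example 10.5, here is a sketch that treats formulas as Python functions from a pair (reference world, world of evaluation) to a truth value, with one combinator per operator. The combinator names (actually, fixedly, box, iff, implies) are made up for this illustration, and the model is hard-coded.

```python
# The model from Example 10.5: W = {c, d}, D = {u}, I(a) = u, I(G) = {<u, c>}.
W = ["c", "d"]
G_ext = {("u", "c")}
a = "u"

def atom_Ga(v, w):
    return (a, w) in G_ext

# One combinator per operator; each returns a function of (v, w).
def actually(phi):   # @
    return lambda v, w: phi(v, v)

def fixedly(phi):    # F: quantify over reference worlds
    return lambda v, w: all(phi(v2, w) for v2 in W)

def box(phi):        # quantify over worlds of evaluation
    return lambda v, w: all(phi(v, w2) for w2 in W)

def iff(phi, psi):
    return lambda v, w: phi(v, w) == psi(v, w)

def implies(phi, psi):
    return lambda v, w: (not phi(v, w)) or psi(v, w)

bicond = iff(actually(atom_Ga), atom_Ga)   # @Ga <-> Ga
fa_bicond = fixedly(actually(bicond))      # F@(@Ga <-> Ga)
target = implies(fa_bicond, bicond)        # F@(@Ga <-> Ga) -> (@Ga <-> Ga)

print(fa_bicond("c", "d"))                 # True
print(target("c", "d"))                    # False: not generally valid
print(all(target(w, w) for w in W))        # True at every diagonal pair of this model
```

In this model one can also confirm that box(fixedly(bicond)) and fixedly(box(bicond)) agree at every pair, in line with the remark that □F is equivalent to F□.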

Exercise 10.2 Show that ⊨_{2D} F@φ→φ, for each wff φ.

Exercise 10.3 Show that for some φ, ⊭_{2D} φ→Fφ.

Exercise 10.4 Show that if φ has no occurrences of @, then ⊨_{2D} φ→Fφ (difficult).



10.4 A philosophical application: necessity and apriority

The two-dimensional modal framework has been put to significant philosophical use in the past twenty-five or so years.6 This is not the place for an extended survey; rather, I will briefly present the two-dimensional account of just one philosophical issue: the relationship between necessity and a priority.

In Naming and Necessity, Saul Kripke famously presented putative examples

of necessary a posteriori statements and of contingent a priori statements:

Hesperus = Phosphorus

B (the standard meter bar) is one meter long

The first statement, Kripke argued, is necessary because whenever we try to imagine a possible world in which Hesperus is not Phosphorus, we find that we have in fact merely imagined a world in which ‘Hesperus’ and ‘Phosphorus’ denote different objects than they in fact denote. Given that Hesperus and Phosphorus are in fact one and the same entity (namely, the planet Venus), there is no possible world in which Hesperus is different from Phosphorus, for such a world would have to be a world in which Venus is distinct from itself. Thus, the statement is necessary, despite its a posteriority: it took astronomical evidence to learn that Hesperus and Phosphorus were identical; no amount of pure rational reflection would have sufficed. As for the second statement, Kripke argues that one can know its truth as soon as one knows that the phrase ‘one meter’ has its reference fixed by the description: “the length of bar B”. Thus it is a priori. Nevertheless, he argues, it is contingent: bar B does not have its length essentially, and thus could have been longer or shorter than one meter.

On the face of it, the existence of necessary a posteriori or contingent a priori statements is paradoxical. How can a statement that is true in all possible worlds be in principle resistant to a priori investigation? Worse, how can a statement that might have been false be known a priori?

The two-dimensional framework has been thought by some to shed light on all this. Let’s consider the contingent a priori first. Let’s define the following

6For work in this tradition, see Stalnaker (1978, 2003a, 2004); Evans (1979); Davies and

Humberstone (1980); Hirsch (1986); Chalmers (1996); ?); Jackson (1998); see Soames (2004)

for an extended critique.


notion of contingency:

Definition of superficial contingency: φ is superficially contingent in model M at world w iff, for every variable assignment g, V_{M,g}(□φ, w, w) = 0 and V_{M,g}(□∼φ, w, w) = 0.

This corresponds, intuitively, to this: if we were sitting at w, and we uttered ◇φ∧◇∼φ, we’d speak the truth.

How should we formalize the notion of a priority? As a rough and ready

guide, let’s think of a sentence as being a priori iff it is 2D-valid—i.e., true at

every pair ⟨w, w⟩ of every model. In defense of this guide: we can think of the

truth value of an utterance of a sentence as being the valuation of that sentence

at the pair ⟨w, w⟩ in a model that accurately models the genuine possibilities,

and in which w accurately models the (genuine) possible world of the speaker.

So any 2D-valid sentence is invariably true whenever uttered; hence, if φ is

2D-valid, any speaker who understands his or her language is in a position to

know that an utterance of φ would be true.

Under these definitions, there are sentences that are superficially contingent but nevertheless a priori. Consider any sentence of the form: φ↔@φ. In any model in which φ is true at w and false at some other world, the sentence is superficially contingent at w. But it is a priori, since, as we showed above, it is 2D-valid (though it’s not generally 2D-valid, as we also showed above).

That was a relatively simple example; but one can give other examples that

are similar in spirit both to Kripke’s example of the meter bar, and to a related

example due to Gareth Evans (1979):

Bar B is one meter

Julius invented the zip

where bar B is the standard meter bar, and the “descriptive names” ‘one meter’ and ‘Julius’ are said to be “rigid designators” whose references are “fixed” by the descriptions ‘the length of bar B’ and ‘the inventor of the zip’, respectively. Now, whether or not these sentences, understood as sentences of everyday English, are indeed genuinely contingent and a priori depends on delicate issues in the philosophy of language concerning descriptive names, rigid designation, and reference fixing. Rather than going into all that, let’s construct some examples that are similar to Kripke’s and Evans’s. Let’s simply stipulate that ‘one meter’ and ‘Julius’ are to abbreviate “actualized descriptions”: ‘the actual length of bar B’ and ‘the actual inventor of the zip’. With a little creative reconstruing in the first case, the sentences then have the form: “the actual G is G”:

the actual length of bar B is a length of bar B

the actual inventor of the zip invented the zip

Now, these sentences are not quite a priori, since for all one knows, the G might

not exist—there might exist no unique length of bar B , no unique inventor of

the zip. So suppose we consider instead the following sentences:

If there is exactly one length of bar B , then the actual

length of bar B is a length of bar B

If there is exactly one inventor of the zip, then the actual

inventor of the zip invented the zip

Each has the form:

If there is exactly one G, then the actual G is G

Or, in symbols:

∃x(Gx∧∀y(Gy→y=x))→∃x(@Gx∧∀y(@Gy→y=x)∧Gx)

Any sentence of this form is 2D-valid (though not generally 2D-valid), and is superficially contingent. So we have further examples of the contingent a priori in the neighborhood of the examples of Kripke and Evans.

Various philosophers want to concede that these sentences are contingent in one sense, namely the sense of superficial contingency. But, they claim, this is a relatively unimportant sense (hence the term ‘superficial contingency’, which was coined by Evans). In another sense, they’re not contingent at all. Evans calls the second sense of contingency “deep contingency”, and defines it thus (1979, p. 185):

If a deeply contingent statement is true, there will exist some state of

affairs of which we can say both that had it not existed the statement

would not have been true, and that it might not have existed.

The intended meaning of ‘the statement would not have been true’ is that the

statement, as uttered with its actual meaning, would not have been true. The


idea is supposed to be that ‘Julius invented the zip’ is not deeply contingent,

because we can’t locate the required state of affairs, since in any situation in

which ‘Julius invented the zip’ is uttered with its actual meaning, it is uttered

truly. So the Julius example is not one of a deeply contingent a priori truth.

Evans’s notion of deep contingency is far from clear. One of the nice things about the two-dimensional modal framework is that it allows us to give a clear definition of deep contingency. Davies and Humberstone (1980) give a definition of deep contingency which is parallel to the definition of superficial contingency, but with F@ in place of □:

Definition of deep contingency: φ is deeply contingent in M at w iff (for all g) V_{M,g}(F@φ, w, w) = 0 and V_{M,g}(F@∼φ, w, w) = 0.

Under this definition, the examples we have given are not deeply contingent.

To be sure, this definition is only as clear as the two-dimensional notions of fixedness and actuality. The formal structure of the two-dimensional framework is of course clear, but one can raise philosophical questions about how that formalism is to be interpreted. But at least the formalism provides a clear framework within which the philosophical debate can occur.

Our discussion of the necessary a posteriori will be parallel to that of the contingent a priori. Just as we defined superficial contingency as the falsity of the □, so we can define superficial necessity as the truth of the □:

Definition of superficial necessity: φ is superficially necessary in M at w iff (for all g) V_{M,g}(□φ, w, w) = 1

How shall we construe a posteriority? Let’s follow our earlier strategy, and take the failure to be 2D-valid as our guide.

But here we must take a bit more care. It’s quite a trivial matter to construct models in which 2D-invalid sentences are necessarily true; and we don’t need the two-dimensional framework to do it. We clearly don’t want to say that ‘Everything is a lawyer’ is an example of the necessary a posteriori. But let F stand for ‘is a lawyer’; we can construct a model in which the predicate F is true of every member of the domain at any world. In that model ∀xFx is true, and so is superficially necessary, at every world, despite the fact that it is not 2D-valid. But this is too cheap. We began by letting the predicate F stand for a predicate of English, but then constructed our model without attending to the modal fact that it’s simply not the case that it’s necessarily true that everything is a lawyer. If F is indeed to stand for ‘is a lawyer’, we would need to include in any realistic model (any model faithful to the modal facts) worlds in which not everything is in the extension of F.

To provide nontrivial models of the necessary a posteriori, when we have chosen to think of the nonlogical expressions of the language of QML as standing for certain expressions of English, our strategy will be to provide realistic models (models that are faithful to the real modal facts in relevant respects, given the choice of what the nonlogical expressions stand for) in which 2D-invalid sentences are necessarily true. Now, since the notion of a “realistic model” has not been made precise, the argument here will be imprecise; but in the circumstances this is inevitable.

So: consider now, as a schematic example of an a posteriori and superficially necessary sentence:

If the actual F and the actual G exist, they are identical

In symbols:

[∃x(@Fx∧∀y(@Fy→x=y))∧∃z(@Gz∧∀y(@Gy→z=y))]→∃x[@Fx∧∀y(@Fy→x=y)∧∃z(@Gz∧∀y(@Gy→z=y)∧z=x)]

This sentence isn’t 2D-valid. Nevertheless, it is superficially necessary in any model and any world w in which F and G each have the same single object in their extension, no matter what the extensions of F and G are in other worlds in the model. So whenever such a model is realistic (given what we let F and G stand for), we will have our desired example.

We can fill in this schema and construct an example similar to Kripke’s Hesperus and Phosphorus example. Set aside controversies about the semantics of proper names in natural language; let’s just stipulate that ‘Hesperus’ is to be short for ‘the actual F’, and that ‘Phosphorus’ is to be short for ‘the actual G’. And let’s think of F as standing for ‘the first heavenly body visible in the evening’, and G for ‘the last heavenly body visible in the morning’. Then

(HP) If Hesperus and Phosphorus exist then they are identical

has the form ‘If the actual F and the actual G exist then they are identical’, which was discussed in the previous paragraph. We may then construct a realistic model in which F and G each have the same single object in their extension in some world, w, but in which they have different objects in their extensions in other worlds. In such a model, the sentence


(□HP) □(If Hesperus and Phosphorus exist then they are identical)

is true at ⟨w, w⟩, and so we again have our desired example: (HP) is superficially necessary, despite the fact that it is a posteriori (2D-invalid).

Isn’t it strange that (HP) is both a posteriori and necessary? The two-dimensional response is: no, it’s not, since although it is superficially necessary, it isn’t deeply necessary in the following sense:

Definition of deep necessity: φ is deeply necessary in M at w iff (for all g) V_{M,g}(F@φ, w, w) = 1

It isn’t deeply necessary because in any realistic model (given what F and G currently stand for), there must be worlds and objects other than c and u, that are configured as they are in the model below:

W = {c, d}

D = {u, v}

I(F) = {⟨u, c⟩, ⟨u, d⟩}

I(G) = {⟨u, c⟩, ⟨v, d⟩}

In this model, even though (□HP) is true at ⟨c, c⟩, still, F@(HP), i.e.:

F@{[∃x(@Fx∧∀y(@Fy→x=y))∧∃z(@Gz∧∀y(@Gy→z=y))]→∃x[@Fx∧∀y(@Fy→x=y)∧∃z(@Gz∧∀y(@Gy→z=y)∧z=x)]}

is false at ⟨c, c⟩ (and indeed, at every pair of worlds), since (HP) is false at ⟨d, d⟩. And so, (HP) is not deeply necessary in this model.
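The contrast between superficial and deep necessity in this model can be computed directly. The sketch below is illustrative only: it hard-codes the model with worlds c, d and objects u, v, and it evaluates (HP) through a helper, here called unique (a hypothetical name), rather than through a general evaluator for quantified formulas. It relies on the observation that every predicate occurrence in (HP) is prefixed by @, so only the reference world matters.

```python
# Model: W = {c, d}, D = {u, v},
# I(F) = {<u, c>, <u, d>}, I(G) = {<u, c>, <v, d>}.
W = ["c", "d"]
D = ["u", "v"]
F_ext = {("u", "c"), ("u", "d")}
G_ext = {("u", "c"), ("v", "d")}

def unique(ext, world):
    """The sole object in the extension at `world`, or None if there isn't exactly one."""
    objs = [x for x in D if (x, world) in ext]
    return objs[0] if len(objs) == 1 else None

def hp(v, w):
    """(HP) at <v, w>. Every predicate occurs under @, so the world of
    evaluation w plays no role; only the reference world v matters."""
    f, g = unique(F_ext, v), unique(G_ext, v)
    if f is None or g is None:
        return True   # antecedent false, so the conditional is true
    return f == g

# box(HP) at <c, c>: (HP) at <c, w> for every world of evaluation w.
superficially_necessary_at_c = all(hp("c", w) for w in W)
# F@(HP): (HP) at <v, v> for every world v.
deeply_necessary = all(hp(v, v) for v in W)

print(superficially_necessary_at_c)  # True
print(deeply_necessary)              # False: (HP) fails at <d, d>
```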

One might try to take this two-dimensional line further, and claim that in every case of the necessary a posteriori (or the contingent a priori), the necessity (contingency) is merely superficial. But defending this stronger line would require more than we have in place so far. To take one example, return again to ‘Hesperus = Phosphorus’, but now, instead of thinking of ‘Hesperus’ and ‘Phosphorus’ as abbreviations for actualized descriptions, let us represent them by names in the logical sense (i.e., the expressions called “names” in the definition of well-formed formulas, which are assigned denotations by interpretation functions in models). Thus, ‘Hesperus = Phosphorus’ is now represented as: a=b. Consider the following model:

W = {c, d}

D = {u, v}

I(a) = u

I(b) = u

The model is apparently realistic; it falsifies no relevant modal facts. But the sentence a=b is deeply necessary (at any world in the model). And yet it is a posteriori (2D-invalid).

Exercise 10.5 Show that the symbolization of “If there is exactly one G, then the actual G is G”, that is:

∃x(Gx∧∀y(Gy→y=x))→∃x(@Gx∧∀y(@Gy→y=x)∧Gx)

is valid, though not generally valid, and superficially contingent in any world in any model.

Exercise 10.6 Show that a formula is capable of being superficially contingent (i.e., for some model and some world, it is superficially contingent at that world) iff it fails to be generally valid.

Appendix A

Answers to Selected Exercises

Exercise 5.6 We must show that for any PC+DD model ⟨D, I, E⟩, and any variable assignment g, [α]_g (relative to this model) is either E or a member of D. We’ll do this by induction on the grammar of α. So, we’ll show that the result holds when α is a variable, constant, or ι term (base cases), and then show that, assuming the result holds for simpler terms (inductive hypothesis), it also holds for complex terms made up of the simpler terms using a function symbol.

Base cases. If α is a variable then [α]_g is g(α), which is a member of D given the definition of a variable assignment. If α is a constant then [α]_g is I(α), which is a member of D given the definition of a model’s interpretation function. If α has the form ιβφ then [α]_g is either the unique u ∈ D such that V_{g^β_u}(φ) = 1 (if there is such a u) or E (if there isn’t). So in all three cases, [α]_g is either E or a member of D. (Note that even though ι terms are syntactically complex, we treated them here as a base case of our inductive proof. That’s because we had no need for any inductive hypothesis; we could simply show directly that the result holds for all ι terms.)

Next we assume the inductive hypothesis:

(ih) The denotations of terms α_1 . . . α_n are either E or members of D

and show that the same goes for the complex term f(α_1 . . . α_n). Well, [f(α_1 . . . α_n)]_g is defined as I(f)([α_1]_g . . . [α_n]_g). And the inductive hypothesis tells us that each [α_i]_g is either E or a member of D. And we know from the definition of a model that I(f) is a function that maps any n-tuple of members of D ∪ {E} to a member of D ∪ {E}. So I(f)([α_1]_g . . . [α_n]_g) is a member of D ∪ {E}, i.e., it is either E or a member of D.

Exercise 6.2a □[P→◇(Q→R)]→◇[Q→(□P→◇R)]:

D-countermodel (hence the formula is invalid in K as well):

[Countermodel diagram: r sees a; a sees itself and b; b sees itself]

r: □[P→◇(Q→R)] = 1, ◇[Q→(□P→◇R)] = 0, so the formula is 0

a: P→◇(Q→R) = 1 (P = 1, ◇(Q→R) = 1); Q→(□P→◇R) = 0 (Q = 1, □P = 1, ◇R = 0)

b: Q→R = 1 (Q = 0, R = 0); P = 1

Official model:

W = {r, a, b}

ℛ = {⟨r, a⟩, ⟨a, b⟩, ⟨a, a⟩, ⟨b, b⟩}

I(P, a) = I(Q, a) = I(P, b) = 1, all else 0

T-validity proof (establishes validity in B, S4, and S5 as well):

i) Suppose for reductio that for some T-model ⟨W, ℛ, I⟩ and some r ∈ W, V(the formula, r) = 0.

ii) So V(□[P→◇(Q→R)], r) = 1, and…

iii) …V(◇[Q→(□P→◇R)], r) = 0

iv) From iii), for any v such that ℛrv, V(Q→(□P→◇R), v) = 0

v) But ℛrr (reflexivity), so V(Q→(□P→◇R), r) = 0

vi) So V(□P→◇R, r) = 0 (truth condition for →)

vii) And so, V(□P, r) = 1 (truth condition for →)

viii) And so, since ℛrr, V(P, r) = 1.

ix) From ii), given ℛrr, we know that V(P→◇(Q→R), r) = 1.

x) Given viii), ix), and the truth condition for →, V(◇(Q→R), r) = 1

xi) So for some world, call it “a”, ℛra and V(Q→R, a) = 1

xii) Since ℛra, from iv) we have V(Q→(□P→◇R), a) = 0.

xiii) Given the truth condition for the →, V(Q, a) = 1 and …

xiv) …V(□P→◇R, a) = 0

xv) That means that V(◇R, a) = 0; so, since ℛaa (reflexivity), V(R, a) = 0.

xvi) Lines xiii), xv), and xi) contradict (truth condition for →)

Exercise 6.2d □(P↔Q)→□(□P↔□Q):

B-countermodel (establishes invalidity in K, D, and T as well):

[Countermodel diagram: r and a see each other; a and b see each other; every world sees itself]

r: □(P↔Q) = 1, □(□P↔□Q) = 0, so the formula is 0

a: P↔Q = 1 (P = 1, Q = 1); □P↔□Q = 0 (□P = 1, □Q = 0)

b: Q = 0, P = 1

Official model:

W = {r, a, b}

ℛ = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨a, r⟩, ⟨a, b⟩, ⟨b, a⟩}

I(P, r) = I(Q, r) = I(P, a) = I(Q, a) = I(P, b) = 1, all else 0

Validity proof for S4 (and so for S5 as well):

i) Suppose for reductio that in some world r of some S4-model, the formula is false.

ii) Then V(□(P↔Q), r) = 1 and…

iii) …V(□(□P↔□Q), r) = 0.

iv) Given iii), for some world a, ℛra and V(□P↔□Q, a) = 0.

v) Given iv), □P and □Q must have different truth values in world a. Without loss of generality (given the symmetry between P and Q elsewhere in the problem), let’s suppose that …

vi) …V(□P, a) = 1 and …

vii) …V(□Q, a) = 0.

viii) Given vii), for some world b, ℛab and V(Q, b) = 0.

ix) Given vi), V(P, b) = 1.

x) We already know that ℛra and ℛab. By transitivity, ℛrb.

xi) But then, given ii), V(P↔Q, b) = 1. This contradicts viii) and ix).

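Exercise 6.2g ◇◇□P↔□P:

B-countermodel (this establishes invalidity in K, D, and T as well):

[Countermodel diagram: r and a see each other; r and b see each other; every world sees itself]

r: ◇◇□P = 1, □P = 0, so the formula is 0

a: ◇□P = 1, □P = 1, P = 1

b: P = 0

Official model:

W = {r, a, b}

ℛ = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨a, r⟩, ⟨r, b⟩, ⟨b, r⟩}

I(P, r) = I(P, a) = 1, all else 0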

S4-countermodel:

[Countermodel diagram: r sees a and b; every world sees itself]

r: ◇◇□P = 1, □P = 0, so the formula is 0

a: ◇□P = 1, □P = 1, P = 1

b: P = 0

Official model:

W = {r, a, b}

ℛ = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨r, b⟩}

I(P, a) = 1, all else 0
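A countermodel like this one can be verified with a few lines of code. What follows is a minimal sketch of a Kripke-model evaluator for propositional modal logic; the tuple encoding of formulas and the helper names (val, reflexive, transitive) are assumptions made for the example. It checks that the official S4 model above is reflexive and transitive and that the formula is false at r.

```python
def val(phi, w, R, I):
    """Truth of phi at world w, given accessibility relation R (a set of pairs)
    and interpretation I (the set of (letter, world) pairs assigned 1)."""
    op = phi[0]
    if op == "atom":
        return (phi[1], w) in I
    if op == "not":
        return not val(phi[1], w, R, I)
    if op == "iff":
        return val(phi[1], w, R, I) == val(phi[2], w, R, I)
    if op == "box":
        return all(val(phi[1], u, R, I) for (x, u) in R if x == w)
    if op == "dia":
        return any(val(phi[1], u, R, I) for (x, u) in R if x == w)
    raise ValueError(f"unknown operator: {op}")

def reflexive(W, R):
    return all((w, w) in R for w in W)

def transitive(R):
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)

# The official S4 model for Exercise 6.2g.
W = {"r", "a", "b"}
R = {("r", "r"), ("a", "a"), ("b", "b"), ("r", "a"), ("r", "b")}
I = {("P", "a")}

P = ("atom", "P")
box_P = ("box", P)
formula = ("iff", ("dia", ("dia", box_P)), box_P)

print(reflexive(W, R) and transitive(R))  # True: this is an S4-model
print(val(formula, "r", R, I))            # False: the formula fails at r
```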

S5-validity proof:

i) We must show that in any world r of any S5-model, ◇◇□P and □P have the same truth value.

ii) So first suppose for reductio that V(◇◇□P, r) = 1 and …

iii) …V(□P, r) = 0.

iv) From ii), for some a, ℛra and V(◇□P, a) = 1.

v) So, for some b, ℛab and V(□P, b) = 1.

vi) From iii), for some c, ℛrc and V(P, c) = 0.

vii) Since ℛra and ℛab, ℛrb (transitivity), and so ℛbr (symmetry); since ℛrc, we have ℛbc (transitivity).

viii) But given ℛbc and v), V(P, c) = 1, which contradicts vi).

ix) So it cannot be that ◇◇□P is true at r while □P is false. Suppose next for reductio that the former is false and the latter is true; that is, …

x) V(◇◇□P, r) = 0 and …

xi) …V(□P, r) = 1.

xii) By reflexivity, ℛrr. So, given xi), V(◇□P, r) = 1; and so, V(◇◇□P, r) = 1, contradicting x).

Exercise 6.4a ⊢_K ◇(P∧Q)→(◇P∧◇Q):

1. (P∧Q)→P    PL

2. □[(P∧Q)→P]    1, NEC

3. □[(P∧Q)→P]→[◇(P∧Q)→◇P]    K◇

4. ◇(P∧Q)→◇P    2, 3, MP

5. ◇(P∧Q)→◇Q    Similar to 1-4

6. ◇(P∧Q)→(◇P∧◇Q)    4, 5, PL

Exercise 6.4c ⊢_K ∼◇(Q∧R)↔□(Q→∼R):

1. ∼(Q∧R)→(Q→∼R)    PL

2. □[∼(Q∧R)→(Q→∼R)]    1, NEC

3. □∼(Q∧R)→□(Q→∼R)    2, K, MP

4. □∼(Q∧R)↔∼◇(Q∧R)    MN (modal negation, proved in book)

5. (Q→∼R)→∼(Q∧R)    PL

6. □(Q→∼R)→□∼(Q∧R)    5, NEC, K, MP

7. ∼◇(Q∧R)↔□(Q→∼R)    3, 4, 6, PL

Exercise 6.4g We’re to show that ⊢_K ◇(P→Q)↔(□P→◇Q). This one’s a bit tough. The trick for the first half is getting the right order for the PL tautologies, and for the second half, getting the right PL strategy.

1. P→[(P→Q)→Q]    PL

2. □P→□[(P→Q)→Q]    1, NEC, K, MP

3. □[(P→Q)→Q]→[◇(P→Q)→◇Q]    K◇

4. □P→[◇(P→Q)→◇Q]    2, 3, PL

5. ◇(P→Q)→(□P→◇Q)    4, PL

I must now prove the right-to-left direction, namely, (□P→◇Q)→◇(P→Q). Note that the antecedent of this conditional is PL-equivalent to ∼□P∨◇Q. So my goal will be to get two conditionals, ∼□P→◇(P→Q) and ◇Q→◇(P→Q), from which the desired conditional follows by PL.

6. ∼P→(P→Q)    PL

7. ◇∼P→◇(P→Q)    6, NEC, K◇, MP

8. ∼□P→◇∼P    MN

9. Q→(P→Q)    PL

10. ◇Q→◇(P→Q)    9, NEC, K◇, MP

11. (□P→◇Q)→◇(P→Q)    7, 8, 10, PL

12. ◇(P→Q)↔(□P→◇Q)    5, 11, PL

Exercise 7.2 We must show that if Γ ⊨_I φ then Γ ⊨_PL φ. Suppose Γ ⊨_I φ, and let I be a PL-interpretation in which every member of Γ is true; we must show that V_I(φ) = 1. (V_I, recall, is the classical valuation for I.) Consider the intuitionist model with just one stage, r, in which formulas have the same valuations as they have in the classical interpretation, i.e., ⟨{r}, {⟨r, r⟩}, I*⟩, where I*(α, r) = I(α) for each sentence letter α. It’s easy to check that since the intuitionist model has only one stage, the classical and intuitionist truth conditions collapse in this case, so that for every wff φ, V_{I*}(φ, r) = V_I(φ). So, since every member of Γ is true in I, every member of Γ is true at r in the intuitionist model. Since Γ ⊨_I φ, it follows that φ is 1 at r in the intuitionist model; and so, φ is true in the classical interpretation, i.e., V_I(φ) = 1.

Exercise 10.4 We’re to show that if φ has no occurrences of @, then ⊨_{2D} φ→Fφ. Let’s prove by induction that if φ has no occurrences of @, then φ→Fφ is generally valid (i.e., true in any model at any world pair ⟨v, w⟩ under any variable assignment). The result then follows, because general validity (truth at all pairs) obviously implies validity (truth at pairs ⟨w, w⟩).

First, let φ be atomic. Then we’re trying to show that for any worlds v, w, and any variable assignment g, V_g(Πα_1 . . . α_n→FΠα_1 . . . α_n, v, w) = 1. Suppose otherwise: suppose that (i) V_g(Πα_1 . . . α_n, v, w) = 1 and (ii) V_g(FΠα_1 . . . α_n, v, w) = 0. Given (ii), for some world, call it v′, V_g(Πα_1 . . . α_n, v′, w) = 0, and so the ordered n-tuple of the denotations of α_1 . . . α_n is not in the extension of Π at w, which contradicts (i).

Now the inductive step. We must assume that φ and ψ obey our statement, and show that complex formulas built from φ and ψ also obey our statement. That is, we assume the inductive hypothesis:

(ih) φ and ψ have no occurrences of @, ⊨_{G2D} φ→Fφ and ⊨_{G2D} ψ→Fψ

and we must show that the following are also generally valid: ∼φ→F∼φ, (φ→ψ)→F(φ→ψ), ∀αφ→F∀αφ, □φ→F□φ, Fφ→FFφ, and ×φ→F×φ:

∼: Suppose otherwise: suppose V(∼φ→F∼φ, v, w) = 0 for some v, w. So V(∼φ, v, w) = 1 and V(F∼φ, v, w) = 0. So V(φ, v, w) = 0, and for some v′, V(∼φ, v′, w) = 0; and so V(φ, v′, w) = 1. By (ih), V(φ→Fφ, v′, w) = 1, and so V(Fφ, v′, w) = 1, and so V(φ, v, w) = 1. Contradiction.

→: Suppose for some v, w, V((φ→ψ)→F(φ→ψ), v, w) = 0. So (i) V(φ→ψ, v, w) = 1 and V(F(φ→ψ), v, w) = 0. So, for some world, call it u, V(φ→ψ, u, w) = 0, and so V(φ, u, w) = 1 and (ii) V(ψ, u, w) = 0. Given the former and the inductive hypothesis, V(Fφ, u, w) = 1, and so V(φ, v, w) = 1. And so, given (i), V(ψ, v, w) = 1, and so, given the inductive hypothesis, V(Fψ, v, w) = 1, and so V(ψ, u, w) = 1, which contradicts (ii).

∀: Suppose for some v, w, V_g(∀αφ, v, w) = 1, but V_g(F∀αφ, v, w) = 0. Given the latter, for some v′, V_g(∀αφ, v′, w) = 0; and so, for some u in the domain, V_{g^α_u}(φ, v′, w) = 0. Given the former, V_{g^α_u}(φ, v, w) = 1; given (ih) it follows that V_{g^α_u}(Fφ, v, w) = 1, and so, V_{g^α_u}(φ, v′, w) = 1. Contradiction.

□: Suppose (i) V(□φ, v, w) = 1 and (ii) V(F□φ, v, w) = 0, for some v, w. From (ii), V(□φ, v′, w) = 0 for some v′, and so V(φ, v′, w′) = 0 for some w′. Given (i), V(φ, v, w′) = 1; and so, given (ih), V(Fφ, v, w′) = 1, and so V(φ, v′, w′) = 1. Contradiction.

F: Suppose V(Fφ, v, w) = 1 and V(FFφ, v, w) = 0, for some v, w. From the latter, V(Fφ, v′, w) = 0 for some v′, and so V(φ, v′′, w) = 0 for some v′′, which contradicts the former.

×: Suppose V_g(×φ, v, w) = 1 but V_g(F×φ, v, w) = 0, for some v, w. Given the latter, V_g(×φ, v′, w) = 0 for some v′, and so V_g(φ, w, w) = 0, which contradicts the former.

Appendix B

Answers to Remaining Exercises

Exercise 2.1 We’re to show that the de�ned symbols ∨ and↔ get the right

truth conditions. We must �rst show that V (ψ∨χ ) = 1 iff either V (ψ) = 1 or

V (χ ) = 1, for any valuation V . ψ∨χ is short for ∼ψ→χ , so we need to show

that for any V ,V (∼ψ→χ ) = 1 iff V (ψ) = 1 or V (χ ) = 1.

First suppose that V(∼ψ→χ) = 1. We must now show that V(ψ) = 1 or V(χ) = 1. So suppose for reductio that this is not the case, i.e., suppose that V(ψ) = 0 and V(χ) = 0. Since V(ψ) = 0 then V(∼ψ) = 1, by the clause in the definition of truth-in-a-valuation function for the ∼. And then, since V(∼ψ) = 1 and V(χ) = 0, V(∼ψ→χ) = 0, by the clause in the definition of truth-in-a-valuation function for the →. That contradicts the initial supposition that V(∼ψ→χ) = 1.

Next suppose that either V (ψ) = 1 or V (χ ) = 1, and suppose for reductio

that V (∼ψ→χ ) = 0. Given the latter, V (∼ψ) = 1 and V (χ ) = 0 (clause for→ );

and then given the clause for ∼, V (ψ) = 0. But if both V (ψ) = 0 and V (χ ) = 0,

that contradicts the initial supposition that either V (ψ) = 1 or V (χ ) = 1.

Next we must show that V(ψ↔χ) = 1 iff V(ψ) = V(χ) (for any V). “ψ↔χ” is short for: “(ψ→χ)∧(χ→ψ)”. The ∧ is still not part of our basic vocabulary; however, we showed in class that, given how ∧ is defined, V(α∧β) = 1 iff V(α) = 1 and V(β) = 1. Given this fact, V(ψ↔χ) = 1 iff V(ψ→χ) = 1 and V(χ→ψ) = 1. But it is true that both V(ψ→χ) = 1 and also V(χ→ψ) = 1, iff ψ and χ have the same truth value in V (i.e., iff V(ψ) = V(χ)). For if they have different truth values then one of the conditionals ψ→χ or χ→ψ must be false (whichever one is 1→0); and conversely, if ψ and χ have the same truth value then each of these conditionals must be true (since both 0→0 and 1→1 are 1.)
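The two derived truth conditions can also be checked mechanically by running through all four assignments of truth values. The following is only a sketch: the function names are mine, and I am assuming that ∧ is defined as ∼(φ→∼ψ).

```python
from itertools import product

def neg(a):
    # clause for ∼
    return 1 - a

def imp(a, b):
    # clause for →: 0 only when the antecedent is 1 and the consequent is 0
    return 0 if (a == 1 and b == 0) else 1

def disj(a, b):
    # ψ∨χ is short for ∼ψ→χ
    return imp(neg(a), b)

def conj(a, b):
    # assumed definition: φ∧ψ is short for ∼(φ→∼ψ)
    return neg(imp(a, neg(b)))

def iff(a, b):
    # ψ↔χ is short for (ψ→χ)∧(χ→ψ)
    return conj(imp(a, b), imp(b, a))

pairs = list(product((0, 1), repeat=2))
or_ok = all(disj(a, b) == (1 if (a == 1 or b == 1) else 0) for a, b in pairs)
iff_ok = all(iff(a, b) == (1 if a == b else 0) for a, b in pairs)
print(or_ok, iff_ok)
```

This is no substitute for the argument above, which works for arbitrary ψ and χ; it just confirms the four cases the argument turns on.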

Exercise 2.2a Sequent proof of P, Q, R ⊢ P:

1 (1) P As.

2 (2) Q As.

3 (3) R As.

2,3 (4) Q∧R 2,3 ∧I

1,2,3 (5) P∧(Q∧R) 1,4 ∧I

1,2,3 (6) P 5, ∧E

Exercise 2.2b Sequent proof of P→(Q→R) ⊢ (Q∧∼R)→∼P:

1 (1) P→(Q→R) As

2 (2) Q∧∼R As (for conditional proof)

3 (3) P As (for reductio)

1,3 (4) Q→R 1, 3→E

2 (5) Q 2, ∧E

1,2,3 (6) R 4,5→E

2 (7) ∼R 2, ∧E

1,2,3 (8) R∧∼R 6,7 ∧I

1,2 (9) ∼P 8, RAA

1 (10) (Q∧∼R)→∼P 9,→I

Exercise 2.2c Sequent proof of P→Q, R→Q ⊢ (P∨R)→Q:

1 (1) P→Q As

2 (2) R→Q As

3 (3) P∨R As (for conditional proof)

4 (4) P As (for use with ∨E)

1,4 (5) Q 1, 4→E

6 (6) R As (for use with ∨E)

2,6 (7) Q 2,6→E

1,2,3 (8) Q 3, 5, 7 ∨E

1,2 (9) (P∨R)→Q 8,→I


Exercise 2.3a Axiomatic proof of P→P:

1. (P→((P→P)→P))→((P→(P→P))→(P→P)) From (A2); φ = P, ψ = P→P, χ = P

2. P→((P→P)→P) From (A1); φ = P, ψ = P→P

3. (P→(P→P))→(P→P) 1, 2 MP

4. P→(P→P) From (A1); φ, ψ = P

5. P→P 3, 4 MP

Exercise 2.3b Axiomatic proof of: (∼P→P)→P:

1. ∼P→∼P Repeat the proof of Exercise 2.3a, but substitute ‘∼P’ everywhere for ‘P’

2. (∼P→∼P)→((∼P→P)→P) From (A3); φ, ψ = P

3. (∼P→P)→P 1, 2 MP

Exercise 2.3c Axiomatic proof of P from ∼∼P :

1. ∼∼P Premise

2. ∼∼P→(∼P→∼∼P ) A1

3. ∼P→∼∼P 1,2, MP

4. (∼P→∼∼P )→[(∼P→∼P )→P] A3

5. [(∼P→∼P )→P] 3,4 MP

6. ∼P→∼P Repeat the proof of #1

7. P 5,6 MP

Exercise 2.4a Axiomatic proof of ⊢ P→[(P→Q)→Q]:

1. (P→Q)→(P→Q) Repeat the proof of #1

2. [(P→Q)→(P→Q)]→(P→[(P→Q)→Q]) “Swapping antecedents”

3. P→[(P→Q)→Q] 1, 2 MP

Exercise 2.4b Axiomatic proof that ⊢ ∼∼P→P: Here’s my strategy. Look

back to problem 3. Its conclusion is P , whereas what I want is ∼∼P→P . So


I’m going to convert the proof of problem 3 by sticking a “∼∼P→ ” in front

of a number of its lines. When problem 3 uses modus ponens, I can just use

the modus ponens technique. When problem 3 gets to its conclusion, I’ll be

at the conclusion I want. I’m going to start at line 3 of problem 3, since I can

stick a “∼∼P→ ” in front of that easily, by A1. Here is the proof:

1. ∼∼P→(∼P→∼∼P ) A1 (counterpart of line 3

of old proof)

2. (∼P→∼∼P )→[(∼P→∼P )→P] A3

3. ∼∼P→[(∼P→∼∼P )→[(∼P→∼P )→P]] 2, adding an antecedent

(cpt of old line 4)

4. ∼∼P→[(∼P→∼P )→P] 1, 3, MP technique

5. ∼P→∼P Repeat the proof of

problem 1

6. ∼∼P→(∼P→∼P ) 5, adding an antecedent

(cpt of old line 6)

7. ∼∼P→P 4, 6, MP technique

Exercise 2.4c Axiomatic proof that ⊢ P→∼∼P:

1. ∼∼∼P→∼P repeat proof of problem 5

2. (∼∼∼P→∼P )→(P→∼∼P ) contraposition 1

3. P→∼∼P 1, 2, MP

Exercise 2.5 The system we are to consider uses the same definition of wffs and the same rule (MP), but has different axioms:

φ→φ

(φ→ψ)→(ψ→φ)

Let’s first prove that a) every theorem of this system has an even number of “∼”s:

A theorem of a system is the last line of a proof. So I’ll prove by induction

that every line of every proof in this system has an even number of ∼s. To do

that, I need to prove i) that every axiom of the system has an even number of


∼s (base case), and ii) that if we assume that φ and φ→ψ each have an even

number of tildes, then it follows that what you get from those formulas by

MP—i.e., ψ—must also have an even number of ∼s (inductive step).

Base case: that’s easy. In each axiom schema, each Greek letter occurs twice. In the first schema, φ occurs twice, and in the second schema, both φ and ψ occur twice. So whenever we construct an axiom from either schema, each ∼ that occurs in any wff that we stick in for φ or for ψ will appear in the axiom twice. Thus, that axiom will have an even number of ∼s.

Inductive step: assume the inductive hypothesis: that φ and φ→ψ each have an even number of ∼s. Let n be the number of ∼s in φ, and let m be the number of ∼s in φ→ψ. The inductive hypothesis tells us that both n and m are even. That means that m − n is even. But m − n is the number of ∼s in ψ. So ψ has an even number of ∼s.

Next I’ll show that b) not every theorem of this system is valid. To do this,

I just need to produce a single theorem of this system and a single valuation

in which this theorem is false. I choose the theorem (P→Q)→(Q→P ). That’s

a theorem of the system because it’s an axiom (second axiom schema). The

valuation I choose is one in which Q is 1 and P is 0. In this valuation P→Q is

true (because P is 0), and Q→P is 0 (because Q is 1 and P is 0); so that means

that the whole thing is 0.

NOTE: you can’t just say that the axiom schema (φ→ψ)→(ψ→φ) is invalid.

First, it’s just a schema, not a wff, so the notion of validity doesn’t apply to

it. Second, there are instances of this schema that are valid, for instance:

[(P→P )→Q]→[Q→(P→P )].
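Both claims in part b) and in the NOTE are small enough to check by brute force. Here is a minimal sketch (the function names are hypothetical, chosen just for this illustration): it evaluates the axiom instance (P→Q)→(Q→P) in the valuation where P is 0 and Q is 1, and evaluates the instance [(P→P)→Q]→[Q→(P→P)] in all four valuations.

```python
from itertools import product

def imp(a, b):
    # clause for →
    return 0 if (a == 1 and b == 0) else 1

def axiom_instance(p, q):
    # (P→Q)→(Q→P)
    return imp(imp(p, q), imp(q, p))

def valid_instance(p, q):
    # [(P→P)→Q]→[Q→(P→P)]
    return imp(imp(imp(p, p), q), imp(q, imp(p, p)))

# the valuation chosen in the text: P is 0, Q is 1
counterexample_value = axiom_instance(0, 1)

# the second instance should be 1 in every valuation
always_true = all(valid_instance(p, q) == 1 for p, q in product((0, 1), repeat=2))
print(counterexample_value, always_true)
```

So one instance of the schema is invalid and another is valid, which is why it makes no sense to call the schema itself valid or invalid.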

Exercise 2.6 We are to show (for regular propositional logic) that the truth

value of a formula depends only on the truth values of the sentence letters in

that formula.

Let φ be any wff and let V and V′ be valuations that agree on the sentence letters in φ (i.e., for any sentence letter α, if α is in φ then V(α) = V′(α)). Show that V(φ) = V′(φ).

Let V and V′ be as described. Let’s show by induction that every formula φ containing only sentence letters on which V and V′ agree is such that V(φ) = V′(φ). Since we’re trying to show something of the form “all formulas

φ are blah blah blah”, our base case is to show that all atomic formulas are blah

blah blah; and our induction step will be to show that if ψ and χ are blah blah

blah, then so are ∼ψ and ψ→χ .

Base case: we must show that if φ is atomic then V(φ) = V′(φ). But it’s given in the problem that V(α) = V′(α) if α is an atomic in φ. Since in this case, φ is atomic, φ is one of those αs, and so we have V(φ) = V′(φ).

Inductive step: assume that ψ and χ obey the theorem we’re trying to prove (i.e., assume that V(ψ) = V′(ψ), and also that V(χ) = V′(χ)), where ψ and χ are formulas made up of atomics over which V and V′ agree. We now need to show that both ∼ψ and ψ→χ obey the theorem also, i.e., that V(∼ψ) = V′(∼ψ), and that V(ψ→χ) = V′(ψ→χ). Well, in any valuation, the truth value of ∼ψ is just the opposite of the truth value of ψ. So, since we’re given that V(ψ) = V′(ψ), the truth value of ∼ψ in both V and V′ is the opposite of whatever truth value ψ has in both V and V′. Thus, V(∼ψ) = V′(∼ψ). Finally, in any valuation, the truth value of ψ→χ is 0 if ψ is true and χ is false; otherwise ψ→χ is 1. But we’re given that V(ψ) = V′(ψ), and also that V(χ) = V′(χ). So if ψ is 1 and χ is 0 in both V and V′ then ψ→χ is false in both V and V′; otherwise ψ→χ is true in both V and V′. Either way, V(ψ→χ) = V′(ψ→χ).

End of inductive proof.

Exercise 2.7 We must show that for any set of formulas, Γ, and any formula φ, if Γ ⊢ φ then Γ ⊨ φ (i.e., if φ is provable from Γ then φ is a semantic consequence of Γ.) Like the proof of the original version of soundness, let’s

do this by induction. Here we’re not proving that every formula has a certain

property; we’re trying to prove that anything that is provable from Γ has a certain

property. So our inductive proof will concern the successive addition of lines

to a growing proof according to the rules of proof, not the successive addition

of more formulas to a growing formula by the rules of grammar.

Remember that a formula φ is provable from Γ iff there exists a proof from Γ (i.e., a proof in which each line is either an axiom, a member of Γ, or follows

from earlier lines in the proof by MP) whose last line is φ. So let’s prove by

induction that the last line of every proof from Γ is a semantic consequence of Γ . And

we do that, in essence, by showing that every time you add to a proof from Γ,

you must always add a formula that is a semantic consequence of Γ. Formulas

you add to the proof fall into two categories: i) axioms and members of Γ , and

ii) formulas following by MP from earlier lines.

Base case: we must show that axioms and members of Γ are semantic

consequences of Γ. What does it mean to say that a formula ψ is a semantic

consequence of Γ? It means that for any valuation, V, if every member of Γ is true in V, then so is ψ. Well, it’s then obvious that any member of Γ is a semantic consequence of Γ (obviously, a member of Γ is true in any valuation that counts all of Γ’s members true.) What about axioms: are they all true


in every valuation that counts every member of Γ true? Yes—because axioms

are true in all valuations whatsoever (we proved this when proving the original

soundness theorem).

Inductive case: Here we start by assuming that the inputs to modus ponens,

namely φ and φ→ψ, are both semantic consequences of Γ, and then show that

the output of modus ponens, namely ψ, is a semantic consequence of Γ—i.e.,

that ψ is true in any valuation that counts every member of Γ as true. (We do

this because we’re assuming that a certain proof has the feature we’re interested

in—each line is a semantic consequence of Γ—and then trying to prove that it

follows that we can add one more line to the proof and be assured that it will

continue to have the feature.) Well, let V be any such valuation. Since φ and

φ→ψ are semantic consequences of Γ, V (φ) = 1 and V (φ→ψ) = 1. But then,

V (ψ) must be 1 (since if it were 0, since V (φ) = 1, V (φ→ψ) would have to be

0, which it isn’t.) End of inductive proof.

Exercise 2.8 We must show that if Γ ⊢ φ is a provable sequent, then Γ ⊨ φ. A provable sequent is the last line of any sequent proof, where a sequent proof is a list of sequents, each of which either comes from the rule of assumptions, or comes from earlier lines in the proof by one of the rules (∧I, →E, etc.) Let’s call a sequent Γ ⊢ φ a valid sequent iff Γ ⊨ φ (i.e., iff φ is true in every

interpretation in which all the members of Γ are true.) So all we need to do is

prove by induction that every sequent in every sequent proof is a valid sequent.

Base: prove that sequents coming from the rule of assumptions are valid

sequents. That’s easy. Such sequents look like this: φ ⊢ φ. Any such sequent is

obviously valid, since φ is obviously true in every valuation in which φ is true.

Induction: prove that each rule preserves validity of sequents. I’ll do this for

a few of them to illustrate the idea:

∧E: We assume that the input to this rule (i.e., Γ ⊢ φ∧ψ) is a valid sequent, and must show that its outputs (namely, Γ ⊢ φ and Γ ⊢ ψ) are valid sequents. To show that Γ ⊢ φ is a valid sequent, consider any valuation V in which every member of Γ is true; we must show that V(φ) = 1. By the assumption (namely, Γ ⊢ φ∧ψ is a valid sequent), we know that V(φ∧ψ) = 1. Given the derived truth conditions for ∧ (section 2.3), this means that V(φ) = 1. The proof for ψ is parallel.

∨E: We assume that the inputs to this rule, i.e., Γ ⊢ φ∨ψ; ∆1, φ ⊢ χ; and ∆2, ψ ⊢ χ are valid sequents, and show that the rule’s output, i.e., Γ, ∆1, ∆2 ⊢ χ is also a valid sequent. To show this, consider any valuation V in which the members of Γ, ∆1 and ∆2 are all true. Since Γ ⊢ φ∨ψ is a valid sequent, V(φ∨ψ) = 1, so either V(φ) = 1 or V(ψ) = 1. If the former, then since ∆1, φ ⊢ χ is a valid sequent, and we have assumed that all the members of ∆1 are true in V, then V(χ) = 1. Similarly, if the latter then V(χ) = 1 (since ∆2, ψ ⊢ χ is a valid sequent). Either way, V(χ) = 1.

RAA: We assume that the input to this rule, Γ, φ ⊢ ψ∧∼ψ, is a valid sequent, and show that its output, Γ ⊢ ∼φ, is also a valid sequent. To show this, let V be any valuation in which every member of Γ is true. Then V(∼φ) must be 1. For if it were 0, then V(φ) would be 1. But then, since Γ, φ ⊢ ψ∧∼ψ is a valid sequent, V(ψ∧∼ψ) would be 1, which is impossible.

→I: We assume that the input to this rule, Γ, φ ⊢ ψ, is a valid sequent, and show that its output, Γ ⊢ φ→ψ, is also a valid sequent. To show this, let V be any valuation in which every member of Γ is true, and suppose for reductio that V(φ→ψ) = 0. Then V(φ) = 1 and V(ψ) = 0. But if V(φ) = 1, then since Γ, φ ⊢ ψ is a valid sequent (and since every member of Γ is true in V), V(ψ) = 1. Contradiction.

Exercise 3.1 The connective % is stipulated to have the following truth table:

%  1  0
1  0  1
0  1  0

Let’s show that negation cannot be defined using just %. ∼P is 1 when P is 0, whereas we can show by induction that any φ constructed from just P and % is always 0 whenever P is 0. Let V be any valuation such that V(P) = 0.

Base: φ is atomic. Then φ is just P, and so V(φ) = 0.

Induction: assume result holds for φ and ψ (i.e., each is 0 in V). Show result holds for φ%ψ, i.e., show that this too is 0 in V. That follows from the truth table for %; 0%0 = 0.
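The induction can be mirrored by a small computation. In this sketch (my own encoding, not anything from the text) a formula built from P and % is represented by the truth function it expresses, written as a pair: its value when P is 1, then its value when P is 0. Starting from P itself, we keep applying % until no new truth functions turn up.

```python
# truth table for %, keyed by (left value, right value)
PERCENT = {(1, 1): 0, (1, 0): 1, (0, 1): 1, (0, 0): 0}

def close_under_percent(start):
    # repeatedly apply % pointwise until no new truth functions appear
    funcs = set(start)
    while True:
        new = {tuple(PERCENT[(f[i], g[i])] for i in range(2))
               for f in funcs for g in funcs}
        if new <= funcs:
            return funcs
        funcs |= new

P = (1, 0)       # value when P is 1, value when P is 0
NOT_P = (0, 1)   # the truth function that negation would have to express

definable = close_under_percent({P})
print(sorted(definable))
```

Every pair that appears has 0 in its second slot, which is the inductive claim, and the pair for ∼P never appears.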


Exercise 3.2 We are to express these truth functions:

f(1,1) = 1    g(1,1,1) = 1
f(1,0) = 0    g(1,1,0) = 0
f(0,1) = 0    g(1,0,1) = 1
f(0,0) = 1    g(1,0,0) = 1
              g(0,1,1) = 1
              g(0,1,0) = 1
              g(0,0,1) = 0
              g(0,0,0) = 1

Function f is expressed in standard propositional logic by: P↔Q. To get the Sheffer stroke equivalent, note that P↔Q is equivalent to (P→Q)∧(Q→P), which is in turn equivalent to

∼(P∧∼Q)∧∼(∼P∧Q)

Now, | is equivalent to “not both”, so we can rewrite this as follows:

(P |∼Q)∧ (∼P |Q)

And since ∼φ is equivalent to φ|φ, we can rewrite this as follows:

[P |(Q|Q)]∧ [(P |P )|Q]

Finally, φ∧ψ is equivalent to ∼∼(φ∧ψ), which is equivalent to ∼(φ|ψ), which is equivalent to (φ|ψ)|(φ|ψ). So the final answer is:

([P |(Q|Q)]|[(P |P )|Q]) | ([P |(Q|Q)]|[(P |P )|Q])

As for function g , we want a sentence, S , containing the sentence letters P , Q,

and R, that has the listed truth table. Our sentence S ought to be false in just

two cases:

· P and Q are true and R is false

· P and Q are false and R is true


Thus, the whole thing is false exactly when one of the following is true:

P∧Q∧∼R

∼P∧∼Q∧R

i.e., when the following is true:

(*) (P∧Q∧∼R)∨ (∼P∧∼Q∧R)

Thus, the whole thing is true when (*) is false—i.e., when the negation of (*) is

true. But the negation of (*) is equivalent to:

∼(P∧Q∧∼R)∧∼(∼P∧∼Q∧R)

So S should be equivalent to this last sentence. Let’s arbitrarily pick some groupings for these three-way conjunctions (officially ∧ is a binary connective; I’ve been sloppy in putting in the parentheses because ∧ is associative):

∼[(P∧Q)∧∼R]∧∼[(∼P∧∼Q)∧R]

This is equivalent to:

[(P∧Q)|∼R]∧ [(∼P∧∼Q)|R]

which in turn is equivalent to:

[(P∧Q)|(R|R)]∧ [((P |P )∧(Q|Q))|R]

The only remaining thing to do is eliminate the ∧s with their equivalents using the |, via the equivalence mentioned above: φ∧ψ is equivalent to (φ|ψ)|(φ|ψ). That would take forever, and I’m getting a bit lazy, so I won’t write it out.

Note: a simpler expression of function g is: (P↔Q)→(Q↔R), which

could then be expressed using the Sheffer stroke.
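Since I didn’t write out the stroke-only sentence for g, here is a way to check the work without doing so by hand. This is a sketch under my own naming: `stroke` is the Sheffer stroke, `conj` unpacks ∧ as (φ|ψ)|(φ|ψ), and the two tables are copied from the statement of the exercise.

```python
from itertools import product

def stroke(a, b):
    # Sheffer stroke: "not both"
    return 0 if (a == 1 and b == 1) else 1

def conj(a, b):
    # φ∧ψ written as (φ|ψ)|(φ|ψ)
    return stroke(stroke(a, b), stroke(a, b))

def f_stroke(p, q):
    # ([P|(Q|Q)]|[(P|P)|Q]) | ([P|(Q|Q)]|[(P|P)|Q])
    left = stroke(p, stroke(q, q))
    right = stroke(stroke(p, p), q)
    return stroke(stroke(left, right), stroke(left, right))

def g_stroke(p, q, r):
    # [(P∧Q)|(R|R)] ∧ [((P|P)∧(Q|Q))|R], with each ∧ unpacked by conj
    left = stroke(conj(p, q), stroke(r, r))
    right = stroke(conj(stroke(p, p), stroke(q, q)), r)
    return conj(left, right)

def imp(a, b):
    return 0 if (a == 1 and b == 0) else 1

def iff(a, b):
    return 1 if a == b else 0

def g_simple(p, q, r):
    # the simpler expression: (P↔Q)→(Q↔R)
    return imp(iff(p, q), iff(q, r))

F_TABLE = {(1, 1): 1, (1, 0): 0, (0, 1): 0, (0, 0): 1}
G_TABLE = {(1, 1, 1): 1, (1, 1, 0): 0, (1, 0, 1): 1, (1, 0, 0): 1,
           (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 0, (0, 0, 0): 1}

f_ok = all(f_stroke(p, q) == F_TABLE[(p, q)] for p, q in product((0, 1), repeat=2))
g_stroke_ok = all(g_stroke(*v) == G_TABLE[v] for v in product((0, 1), repeat=3))
g_simple_ok = all(g_simple(*v) == G_TABLE[v] for v in product((0, 1), repeat=3))
print(f_ok, g_stroke_ok, g_simple_ok)
```

Because `conj` is itself defined using only `stroke`, `g_stroke` really is a stroke-only expression of g, just not one that has been written out as a single string.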

Exercise 3.3 The truth table for ↓ is the following:

↓  1  0
1  0  0
0  0  1

We know that all the truth functions can be defined using just ∼ and ∨. So all we need to do is show that ∼ and ∨ can be defined using ↓.

First, ∼φ can be defined as φ↓φ. (The ↓ generates a false sentence from two trues, and a true sentence from two falses.)

Second, note that φ ↓ψ is equivalent to ∼(φ∨ψ). So φ∨ψ is equivalent to

∼(φ ↓ψ), and hence to (φ ↓ψ) ↓ (φ ↓ψ).
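A quick check of the two definitions, written as a sketch with function names of my own choosing:

```python
from itertools import product

def nor(a, b):
    # ↓ is 1 only when both inputs are 0
    return 1 if (a == 0 and b == 0) else 0

def neg_via_nor(a):
    # ∼φ defined as φ↓φ
    return nor(a, a)

def or_via_nor(a, b):
    # φ∨ψ defined as (φ↓ψ)↓(φ↓ψ)
    return nor(nor(a, b), nor(a, b))

neg_ok = all(neg_via_nor(a) == 1 - a for a in (0, 1))
or_ok = all(or_via_nor(a, b) == max(a, b) for a, b in product((0, 1), repeat=2))
print(neg_ok, or_ok)
```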

Exercise 3.4a P↔∼P becomes: ↔P∼P

Exercise 3.4b (P→(Q→(R→∼∼(S∨T )))) becomes: →P→Q→R∼∼∨ST

Exercise 3.4c [(P∧∼Q)∨(∼P∧Q)]↔∼[(P∨∼Q)∧(∼P∨Q)] becomes:

↔∨∧P∼Q∧∼PQ∼∧∨P∼Q∨∼PQ

Here’s how to work this out step by step. Start with the conjuncts on the left

hand side:

P∧∼Q : ∧P∼Q

∼P∧Q : ∧∼PQ

then plug these after an ∨ to get the entire left hand side:

(P∧∼Q)∨(∼P∧Q) : ∨∧P∼Q∧∼PQ

now go to work on the small formulas on the right hand side:

P∨∼Q : ∨P∼Q

∼P∨Q : ∨∼PQ

now to get the conjunction on the right hand side, put an ∧ followed by these

last two formulas:

(P∨∼Q)∧(∼P∨Q) : ∧∨P∼Q∨∼PQ

put a ∼ in front of this to negate it, and so obtain the entire right hand side:

∼[(P∨∼Q)∧(∼P∨Q)] :∼∧∨P∼Q∨∼PQ

Now, for the whole formula. The major connective is the↔, so we start with

a↔, then put the whole symbolization for the left hand side, followed by the

whole symbolization for the right hand side:

↔ ∨∧P∼Q∧∼PQ ∼∧∨P∼Q∨∼PQ
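The step-by-step procedure is really a recursion on the structure of the formula: write the main connective, then the Polish form of each immediate part. Here is a minimal sketch, assuming formulas are handed to us already parsed as nested tuples (that representation and the name `to_polish` are my own, not part of the exercise).

```python
def to_polish(formula):
    # a sentence letter is its own Polish form
    if isinstance(formula, str):
        return formula
    # otherwise: connective first, then the Polish forms of the parts
    op, *parts = formula
    return op + "".join(to_polish(part) for part in parts)

P, Q, R, S, T = "P", "Q", "R", "S", "T"

ex_a = ("↔", P, ("∼", P))
ex_b = ("→", P, ("→", Q, ("→", R, ("∼", ("∼", ("∨", S, T))))))
lhs = ("∨", ("∧", P, ("∼", Q)), ("∧", ("∼", P), Q))
rhs = ("∼", ("∧", ("∨", P, ("∼", Q)), ("∨", ("∼", P), Q)))
ex_c = ("↔", lhs, rhs)

print(to_polish(ex_a))
print(to_polish(ex_b))
print(to_polish(ex_c))
```

Note that no parentheses are ever needed in the output: since each connective has a fixed number of parts, the string can be read back unambiguously.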


Exercise 3.5 Indefinability of the → for Łukasiewicz. We must show that there is no wff φ such that i) φ contains just the sentence letters P and Q, plus the connectives ∼, ∧, and ∨ (plus parentheses), and ii) φ has the same truth table as P→Q (i.e., φ is true in exactly the same Łukasiewicz-valuations as P→Q).

P→Q is true in the Łukasiewicz tables when P and Q are both #. I’ll now show by induction that any wff φ containing just P, Q, ∼, ∧ and ∨ gets assigned # by any Łukasiewicz-valuation that assigns both P and Q #, and hence has a different truth table from P→Q.

Let I be any trivalent interpretation in which both P and Q are #. Here is the inductive proof that for any φ made up of just P, Q, ∼, ∧, and ∨, ŁV_I(φ) = #:

Base: φ is atomic. That means that φ is either P or Q, and so ŁV_I(φ) = #.

Induction: assume the result holds for φ and ψ (i.e., ŁV_I(φ) = #, ŁV_I(ψ) = #), and show that the result also holds for ∼φ, φ∧ψ, and φ∨ψ, i.e., show that each of these is # in ŁV_I. This follows from the truth tables for the ∼, ∧ and ∨: whenever φ is #, so is ∼φ; and whenever φ and ψ are both #, so are φ∧ψ and φ∨ψ.
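The same point can be seen computationally. In this sketch I am assuming the usual numerical encoding of the Łukasiewicz tables, with # as 1/2, ∼ as 1 − a, ∧ as minimum, ∨ as maximum, and → as min(1, 1 − a + b); the encoding and the names are mine. Starting from the single value #, we collect every value reachable using ∼, ∧ and ∨.

```python
from fractions import Fraction

HASH = Fraction(1, 2)   # assumed encoding of the third value #

def l_not(a):
    return 1 - a

def l_and(a, b):
    return min(a, b)

def l_or(a, b):
    return max(a, b)

def l_imp(a, b):
    # Łukasiewicz conditional under the numerical encoding
    return min(Fraction(1), 1 - a + b)

def reachable_values(start):
    # close a set of truth values under ∼, ∧ and ∨
    vals = set(start)
    while True:
        new = {l_not(a) for a in vals}
        new |= {op(a, b) for op in (l_and, l_or) for a in vals for b in vals}
        if new <= vals:
            return vals
        vals |= new

values = reachable_values({HASH})
print(values, l_imp(HASH, HASH))
```

The only reachable value is # itself, whereas the conditional takes # and # to 1.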

Exercise 3.8 We are to show that no wff in which no sentence letter is repeated is supervaluationally valid. Call a sentence “fresh” if no sentence letter is repeated in it, and consider the trivalent interpretation I that assigns # to each sentence letter. I’ll show by induction that each wff φ has the following property (which I’ll call “the property”)

If φ is fresh then S_I(φ) = #

Base case: suppose φ is a sentence letter. Then by stipulation, I(φ) = #, and so S_I(φ) = #.

Induction: suppose that φ and ψ have the property; we must show that (a) φ→ψ and (b) ∼φ have the property.

(a): To show that φ→ψ has the property, we assume φ→ψ is fresh; we must show that S_I(φ→ψ) = #. In order to do that, we need to construct two precisifications of I: one in which φ→ψ is 1, one in which φ→ψ is 0.

Since φ→ψ is fresh, we know that φ and ψ are each fresh. Since by the inductive hypothesis each has the property, we know that S_I(φ) = # and S_I(ψ) = #. But now: if S_I of any formula is #, that means that the formula is 1 in some precisification of I, and also that the formula is 0 in some precisification of I. So, since S_I(φ) and S_I(ψ) are both #, we know that there exist four precisifications of I: B, C, D, and E, such that:

V_B(φ) = 1    V_C(φ) = 0

V_D(ψ) = 1    V_E(ψ) = 0

Note that V_C(φ→ψ) = 1. So this is the first of the two precisifications of I that we need. The other one we need is a precisification of I in which φ→ψ is false. Let F be the precisification that is just like B except that it assigns whatever E assigns to any sentence letter occurring in ψ. Notice that, since φ→ψ was fresh, no sentence letter occurs in both φ and ψ; and so F agrees with B on all the sentence letters occurring in φ, and agrees with E on all the sentence letters occurring in ψ. But that means that φ has the same truth value in F as in B, and ψ has the same truth value in F as in E (since, as shown in exercise 2.6, in any PL-valuation the truth value of an entire sentence is a function solely of the truth values of the sentence letters that occur in that sentence.) And that means that φ is 1 in F and ψ is 0 in F; hence, V_F(φ→ψ) = 0.

(b): It remains to show that ∼φ has the property. That’s easy: to show it has the property we must assume that ∼φ is fresh, and show that S_I(∼φ) = #. But if ∼φ is fresh then so is φ; so by the inductive hypothesis, S_I(φ) = #. That means that φ is 1 in some precisifications of I and 0 in others. But ∼φ is 0 in the first precisifications and 1 in the second precisifications, and so S_I(∼φ) = #.

Exercise 4.1a What we are trying to show is that ∀x(Fx→(Fx∨Gx)) is valid, i.e., that this formula is true in every model, i.e., that for any model M (= ⟨D, I⟩), and any assignment to the variables g defined on that model, V_{g,M}(∀x(Fx→(Fx∨Gx))) = 1. (I’ll leave the subscript M implicit from now on.) So, suppose for reductio that this is not true; that is, suppose for some model and some g defined on that model, we have:

V_g(∀x(Fx→(Fx∨Gx))) = 0

We then reason as follows:

i) So, for some u ∈ D, V_{g^x_u}(Fx→(Fx∨Gx)) = 0. Call one such u, “u”.

ii) V_{g^x_u}(Fx) = 1 and V_{g^x_u}(Fx∨Gx) = 0 (clause for →)

iii) Given ii), V_{g^x_u}(Fx) = 0 (derived clause for ∨)

iv) Lines ii) and iii) contradict.

Exercise 4.1b ⊨ ∀x(Fx∧Gx)→(∀xFx∧∀xGx):

i) Suppose for reductio that for some model and some g, V_g(∀x(Fx∧Gx)→(∀xFx∧∀xGx)) = 0

ii) Then V_g(∀x(Fx∧Gx)) = 1 and V_g(∀xFx∧∀xGx) = 0

iii) Given the second, either V_g(∀xFx) = 0 or V_g(∀xGx) = 0

iv) Suppose the former (i.e., V_g(∀xFx) = 0). Then for some u in the domain, V_{g^x_u}(Fx) = 0. From the first part of ii), we know that for every object in the domain, and so for u in particular, V_{g^x_u}(Fx∧Gx) = 1. From the clause in the definition of V for ∧, we know that V_{g^x_u}(Fx) = 1. Contradiction.

v) Suppose the latter (i.e., V_g(∀xGx) = 0). Then, for some v in the domain, V_{g^x_v}(Gx) = 0. From the first part of ii), we know that for every object in the domain, and so for v in particular, V_{g^x_v}(Fx∧Gx) = 1. So V_{g^x_v}(Gx) = 1. Contradiction.

vi) So either way we get a contradiction.

Exercise 4.1c What we’re trying to show is that the set of formulas {∀x(Fx→Gx), ∀x(Gx→Hx)} logically implies the sentence ∀x(Fx→Hx). That is, we’re trying to show that ∀x(Fx→Hx) is true in every model, M, in which all the premises in the set are true. So, we proceed as follows: suppose for reductio that in some model, M, each of the premises is true and the conclusion is false. We then reason as follows:

i) Since the conclusion is false in this model, we know that for some g, V_g(∀x(Fx→Hx)) = 0.

ii) Since the premises are true, we know that for each variable assignment, and so for g in particular, V_g(∀x(Fx→Gx)) = 1, and V_g(∀x(Gx→Hx)) = 1

iii) From i), for some u ∈ D, V_{g^x_u}(Fx→Hx) = 0 (clause for ∀). Call it “u”

iv) So, V_{g^x_u}(Fx) = 1 and V_{g^x_u}(Hx) = 0 (clause for →)

v) Given ii), we know that for all v ∈ D, V_{g^x_v}(Fx→Gx) = 1 and V_{g^x_v}(Gx→Hx) = 1.

vi) Since u ∈ D, we can conclude from v) that: V_{g^x_u}(Fx→Gx) = 1 and V_{g^x_u}(Gx→Hx) = 1.

vii) Given the clause for →, we have:

a) V_{g^x_u}(Fx) = 0 or V_{g^x_u}(Gx) = 1; and

b) V_{g^x_u}(Gx) = 0 or V_{g^x_u}(Hx) = 1

viii) Given lines iv) and vii)a), we have V_{g^x_u}(Gx) = 1

ix) And so, given line vii)b), we have V_{g^x_u}(Hx) = 1.

x) That contradicts line iv).

Exercise 4.1d ⊨ ∃x∀yRxy→∀y∃xRxy:

i) Suppose for reductio that for some model and some g, V_g(∃x∀yRxy→∀y∃xRxy) = 0.

ii) So, V_g(∃x∀yRxy) = 1 and V_g(∀y∃xRxy) = 0

iii) Given the former, for some member of the domain, call it u, V_{g^x_u}(∀yRxy) = 1

iv) Given the latter, for some member of the domain, call it v, V_{g^y_v}(∃xRxy) = 0

v) From line iii), we know that for each member of the domain, and so for v in particular, V_{g^{xy}_{uv}}(Rxy) = 1

vi) From line iv) we know that for each member of the domain, and so for u in particular, V_{g^{yx}_{vu}}(Rxy) = 0.

vii) The function g^{xy}_{uv} is the same function as the function g^{yx}_{vu} (each is the function just like g, except that it assigns u to x and v to y.) So the previous two lines contradict.

Exercise 4.2a We must show that ⊭ ∀x(Fx→Gx)→∀x(Gx→Fx). To say

that a sentence is valid is to say that it is true in all models. So in order to show

that a sentence is invalid, all we need to do is to produce one model in which

the sentence is false. Our given sentence says that if all F s are Gs, then all Gs

are Fs. So let’s produce a model in which the set of Gs contains the set of F s as

a subset, but more objects in addition. I’ll use numbers in the domains of my

models:

D = {0,1}

I(F) = {0}

I(G) = {0,1}

In this model, everything that is in F ’s extension is also in G’s extension; so

the antecedent of our conditional, namely: ∀x(F x→Gx), is true. But since

the object 1 is in the extension of G without being in the extension of F , the

consequent of the conditional, namely: ∀x(Gx→F x), is false. So the condi-

tional is false in this model. So we’ve found a model in which the conditional

∀x(F x→Gx)→∀x(Gx→F x) is false. So it’s not true in all models. So it’s

invalid.

Exercise 4.2b We must show that ⊭ ∀x(Fx∨∼Gx)→(∀xFx ∨ ∼∃xGx). The

antecedent says that everything is either an F or not a G. The consequent says

that either everything is an F , or nothing is a G. So let’s choose a model with

some things that are F and also G, and other things that are neither F nor G.

Then the antecedent will be true and the consequent will be false:

D = {0,1}

I(F) = {0}

I(G) = {0}

Exercise 4.2c To show that Rab does not semantically imply ∃xRxx, we need to find a model in which the first is true and the second is false. Here is such a model:

D = {0,1}

I(a) = 0

I(b) = 1

I(R) = {⟨0,1⟩}
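Because these countermodels are finite, the truth values of the quantified sentences in them can be computed directly: ∀ becomes a check over every member of the domain and ∃ a search for some member. The following sketch does this for the three models of Exercises 4.2a to 4.2c; the Python variable names are my own.

```python
D = {0, 1}

# Exercise 4.2a: I(F) = {0}, I(G) = {0, 1}
F_a, G_a = {0}, {0, 1}
antecedent_a = all((x not in F_a) or (x in G_a) for x in D)   # ∀x(Fx→Gx)
consequent_a = all((x not in G_a) or (x in F_a) for x in D)   # ∀x(Gx→Fx)

# Exercise 4.2b: I(F) = {0}, I(G) = {0}
F_b, G_b = {0}, {0}
antecedent_b = all((x in F_b) or (x not in G_b) for x in D)   # ∀x(Fx∨∼Gx)
consequent_b = all(x in F_b for x in D) or not any(x in G_b for x in D)  # ∀xFx ∨ ∼∃xGx

# Exercise 4.2c: I(a) = 0, I(b) = 1, I(R) = {⟨0,1⟩}
a, b, R = 0, 1, {(0, 1)}
premise_c = (a, b) in R                         # Rab
conclusion_c = any((x, x) in R for x in D)      # ∃xRxx

print(antecedent_a, consequent_a)
print(antecedent_b, consequent_b)
print(premise_c, conclusion_c)
```

In each case the first value is true and the second false, which is exactly what a countermodel needs.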

Exercise 4.2d We must show that ∀x∀y∀z[(Rxy∧Ryz)→Rxz], ∀x∃yRxy ⊭ ∃xRxx. So we need a model in which the premises are true and the conclusion is false. The first premise says that R is transitive; the second says that everything Rs something. To make the conclusion false, we must make sure in our model that nothing Rs itself.

There is no way to satisfy all these constraints with a finite model. Suppose you start with just a single object in the domain:

•

It must R something, given the second premise. It can’t R itself, since we want the conclusion to be false. So we must posit a second thing that it Rs:

• → •

But now, given the second premise, this new thing has to R something. It can’t R back to the first thing, because given transitivity, that first thing would then need to R itself. So we need to add a third thing:

• → • → •

Also, given transitivity, the first thing Rs the third thing. But now this third thing needs to R something. It can’t R itself, or any of the things earlier in the sequence, because each of those things Rs it; so given transitivity, if the third thing Rs any of those things, it would have to R itself.

And so on. We can never stop with any finite model, since the second premise will always force us to add another object.

But we can have an infinite model, in which each object Rs all the later objects:

• → • → • → • → . . .

Here’s the official model:

D = {0, 1, 2, . . .} (i.e., the set of natural numbers)

I(R) = {⟨i, j⟩ | i, j ∈ D, i < j} (i.e., the set of ordered pairs in which the first member of the pair is a smaller number than the second.)

In essence, R is interpreted in this model as meaning “is less than”. The first premise is true in the model because “is less than” is a transitive relation. The second premise is true because for each natural number there exists a greater natural number. The conclusion is false because no number is less than itself.
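The claim that no finite model will do can at least be spot-checked for very small domains. This sketch (the function name `has_countermodel` is hypothetical) tries every binary relation on a domain of size n and looks for one that is transitive, has everything R-ing something, and has nothing R-ing itself. It only covers sizes 1 to 3, so it illustrates the argument above rather than replacing it.

```python
from itertools import product

def has_countermodel(n):
    # search all binary relations on {0, ..., n-1}
    dom = range(n)
    pairs = [(i, j) for i in dom for j in dom]
    for bits in product((False, True), repeat=len(pairs)):
        R = {p for p, keep in zip(pairs, bits) if keep}
        # first premise: R is transitive
        transitive = all((x, z) in R
                         for (x, y) in R for (y2, z) in R if y == y2)
        # second premise: everything Rs something
        serial = all(any((x, y) in R for y in dom) for x in dom)
        # falsity of the conclusion: nothing Rs itself
        irreflexive = all((x, x) not in R for x in dom)
        if transitive and serial and irreflexive:
            return True
    return False

finite_results = [has_countermodel(n) for n in (1, 2, 3)]
print(finite_results)
```

The search comes up empty for each of these sizes, as the informal argument predicts.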

Exercise 5.1a Fab ⊨ ∀x(x=a→Fxb):

i) Suppose for reductio that for some model and some g, V_g(Fab) = 1, but …

ii) …V_g(∀x(x=a→Fxb)) = 0.

iii) Given ii), for some u ∈ D (call it: u) we have: V_{g^x_u}(x=a→Fxb) = 0

iv) And so we have: V_{g^x_u}(x=a) = 1 and V_{g^x_u}(Fxb) = 0 (clause for →)

a) Given V_{g^x_u}(x=a) = 1 we have: [x]_{g^x_u} is (identical to) [a]_{g^x_u}

b) But [x]_{g^x_u} is just g^x_u(x), that is, u

c) And [a]_{g^x_u} is just I(a)

d) So, we have: u is I(a)

e) Given V_{g^x_u}(Fxb) = 0, we have ⟨[x]_{g^x_u}, [b]_{g^x_u}⟩ ∉ I(F) (clause for atomics)

f) But [x]_{g^x_u} is u, i.e., I(a), as we just showed

g) And [b]_{g^x_u} is I(b)

h) So we have: ⟨I(a), I(b)⟩ ∉ I(F)

v) Given line i), we have ⟨[a]_g, [b]_g⟩ ∈ I(F)

a) [a]_g is just I(a)

b) [b]_g is just I(b)

c) so ⟨I(a), I(b)⟩ ∈ I(F)

d) this contradicts line iv) h)

Exercise 5.1b We are to show that ∃x∃y∃z(Fx∧Fy∧Fz∧x≠y∧x≠z∧y≠z), ∀x(Fx→(Gx∨Hx)) ⊭ ∃x∃y∃z(Gx∧Gy∧Gz∧x≠y∧x≠z∧y≠z). We need a model in which the two premises:

∃x∃y∃z(Fx∧Fy∧Fz∧x≠y∧x≠z∧y≠z)

∀x(Fx→(Gx∨Hx))

are true, but in which the conclusion:

∃x∃y∃z(Gx∧Gy∧Gz∧x≠y∧x≠z∧y≠z)

is false. Let’s think about what these sentences mean. The first premise says that “there are at least three Fs”. The second premise says that “every F is either a G or an H”. The conclusion says that “there are at least three Gs”. Well, if we include three objects in our model, and make each of them Fs, then the first premise is true. If we make some but not all of those objects Gs, and make the rest Hs, then the second premise will be true. But since we haven’t made all three objects in the model Gs, the conclusion will be false. Here is the model:

D = {0,1,2}

I(F) = {0,1,2}

I(G) = {0,1}

I(H) = {2}
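Here is a sketch that evaluates the two premises and the conclusion in this model. The helper `at_least_three` is my own name; it uses `itertools.permutations`, which only produces triples of pairwise distinct members of the domain, to play the role of the three ≠ conjuncts.

```python
from itertools import permutations

D = {0, 1, 2}
F = {0, 1, 2}
G = {0, 1}
H = {2}

def at_least_three(ext):
    # ∃x∃y∃z(... ∧ x≠y ∧ x≠z ∧ y≠z): permutations yields only distinct triples
    return any(x in ext and y in ext and z in ext
               for x, y, z in permutations(D, 3))

premise_1 = at_least_three(F)                                   # at least three Fs
premise_2 = all((x not in F) or (x in G) or (x in H) for x in D)  # ∀x(Fx→(Gx∨Hx))
conclusion = at_least_three(G)                                  # at least three Gs
print(premise_1, premise_2, conclusion)
```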

Exercise 5.2a “Everyone who loves someone else loves everyone”:

∀x[∃y(y≠x ∧ Lxy)→∀yLxy]

Exercise 5.2b “The only truly great player who plays in the NBA is Allen

Iverson”: ∀x[(Gx∧P xn)→x=i]

Exercise 5.2c We must symbolize “If a person shares a solitary confinement cell with a guard, then they are the only people in the cell”. Letting ‘C’ stand for ‘is a solitary confinement cell’, ‘Sxyz’ stand for ‘x shares y with z’ (it’s a three place predicate), and ‘I’ stand for ‘is in’:

∀x∀y∀z[(P x∧C y∧Gz∧S xy z)→∀x1([P x1∧I x1y]→[x1=x∨x1=z])]

Exercise 5.2d The shortest symbolization of “there are at least five dinosaurs” I could find:

∃xDx ∧ ∼∃x1∃x2∃x3∃x4(Dx1∧Dx2∧Dx3∧Dx4∧∀y(Dy→(y=x1∨y=x2∨y=x3∨y=x4)))
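It may not be obvious that this sentence really says “at least five”, since x1 through x4 need not be distinct. One way to convince yourself is to evaluate it in small models. In this sketch (names and the choice of a six-element domain are mine) the extension of D contains k objects, for k from 0 to 6, and we record whether the sentence comes out true.

```python
from itertools import product

def at_least_five(domain, dino):
    # first conjunct: ∃xDx
    some_dino = any(x in dino for x in domain)
    # ∃x1∃x2∃x3∃x4(Dx1∧Dx2∧Dx3∧Dx4∧∀y(Dy→(y=x1∨y=x2∨y=x3∨y=x4)))
    covered_by_four = any(
        all(x in dino for x in xs)
        and all(y in xs for y in domain if y in dino)
        for xs in product(domain, repeat=4)
    )
    return some_dino and not covered_by_four

domain = range(6)
results = {k: at_least_five(domain, set(range(k))) for k in range(7)}
print(results)
```

The first conjunct is what rules out the case of no dinosaurs at all; with one to four dinosaurs, some choice of x1 through x4 (with repeats if necessary) covers them all, so the negated conjunct fails.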

Exercise 5.3a “The product of an even number and an odd number is an

even number.”: ∀x∀y[(E x∧Oy)→E p(x, y)]

Exercise 5.3b “If the square of a number that is divisible by each

smaller number is odd, then that number is greater than all numbers.”:

∀x[(N x∧∀y(Sy x→D xy)∧O s(x))→∀z(N z→S z x)]

Exercise 5.4a To show that ⊨ ∀xFx→Ff(a), let M be any model, let g be any assignment in that model.

i) suppose for reductio that V_g(∀xFx→Ff(a)) = 0.

ii) Then V_g(∀xFx) = 1 and V_g(Ff(a)) = 0

iii) Given the second, [f(a)]_g ∉ I(F)

iv) Given the first, for all u ∈ D, V_{g^x_u}(Fx) = 1.

v) So, letting u = [f(a)]_g (which we know to be in D because all denotations are in D), we have V_{g^x_{[f(a)]_g}}(Fx) = 1

vi) And so, [x]_{g^x_{[f(a)]_g}} ∈ I(F)

a) But [x]_{g^x_{[f(a)]_g}} is g^x_{[f(a)]_g}(x)…

b) …which is just [f(a)]_g

vii) so, [f(a)]_g ∈ I(F), which contradicts line iii)

Exercise 5.4b To show that {∀x f(x)≠x} ⊭ ∃x∃y(f(x)=y ∧ f(y)=x), we must find a model in which ∀x f(x)≠x is true, but in which ∃x∃y(f(x)=y ∧ f(y)=x) is false.

In any model, the interpretation of the function symbol f will be some one-place function defined on the domain. Given that ∀x f(x)≠x is to be true in our desired model, the function assigned to f must map each object to an object other than itself. And given that ∃x∃y(f(x)=y ∧ f(y)=x) is to be false, there can be no two objects that this function maps to each other. So let’s just choose a model with a “triangle” of three objects, each mapped by the function to the next vertex in the clockwise direction:

[Diagram: a triangle with vertices 0, 1, and 2, with arrows 0 → 1, 1 → 2, and 2 → 0.]

The arrows represent the function.

Here is the official model:

D = {0, 1, 2}
I(f) = the function g such that g(0) = 1, g(1) = 2, g(2) = 0
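As a quick check, the triangle model can be evaluated directly. This is a minimal sketch; encoding the interpretation of f as a Python dictionary is an assumption made for illustration.

```python
# The official model: domain D and the interpretation of the function symbol f.
D = {0, 1, 2}
f = {0: 1, 1: 2, 2: 0}

# Premise: ∀x f(x)≠x
premise = all(f[x] != x for x in D)

# Conclusion: ∃x∃y(f(x)=y ∧ f(y)=x)
conclusion = any(f[x] == y and f[y] == x for x in D for y in D)
```

The premise comes out true and the conclusion false, so the model separates them as required.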

Exercise 5.5a We must show that ⊨ ∀xL(x, ιyFxy)→∀x∃yLxy. It’s easy to get confused by the complexity of the antecedent here, “∀xL(x, ιyFxy)”. This just has the form ∀xLxα, where α is “ιyFxy”. L is a two-place predicate; it applies to the terms x and α. If you think of “Fxy” as meaning that x is a father of y, and “Lxy” as meaning that x loves y, then ∀xL(x, ιyFxy) means “everyone x loves the y that he (x) is the father of”.

Now for the proof. Suppose for reductio that in some model, and some assignment g in that model:

i) V_g(∀xL(x, ιyFxy)→∀x∃yLxy) = 0

ii) So, V_g(∀xL(x, ιyFxy)) = 1 and V_g(∀x∃yLxy) = 0

iii) Given the second, for some u ∈ D, V_{g^x_u}(∃yLxy) = 0. Call this object “u”.

iv) Given the first, for every v ∈ D, V_{g^x_v}(L(x, ιyFxy)) = 1

v) Letting v = u, we have: V_{g^x_u}(L(x, ιyFxy)) = 1

vi) So, ⟨[x]_{g^x_u}, [ιyFxy]_{g^x_u}⟩ ∈ I(L)

a) [x]_{g^x_u} is g^x_u(x), i.e., u.

b) Let’s call [ιyFxy]_{g^x_u} “v”. Thus, we have ⟨u, v⟩ ∈ I(L).

Aside: It doesn’t really matter for this problem, but we can infer something about v. Remember that E, the emptiness marker, is never in the extension of any predicate. That goes for two-place predicates like L, as well as one-place predicates. What that means is that for any ordered pair ⟨o1, o2⟩, if ⟨o1, o2⟩ ∈ I(L), then neither o1 nor o2 can be E. Thus, since ⟨[x]_{g^x_u}, [ιyFxy]_{g^x_u}⟩ ∈ I(L), we can conclude that [ιyFxy]_{g^x_u} (i.e., v) is not E. What’s more, given the definition of denotation for ι terms, there must exist exactly one object v ∈ D such that V_{g^{xy}_{uv}}(Fxy) = 1, and [ιyFxy]_{g^x_u} is this v. (If there weren’t exactly one such v, then [ιyFxy]_{g^x_u} would be E.) Summary: we know that v is not the emptiness marker, but rather is the one and only object in the domain such that V_{g^{xy}_{uv}}(Fxy) = 1.

vii) Thus, we have: ⟨u, v⟩ ∈ I(L)

viii) Now, from line iii) we have: for every o ∈ D, V_{g^{xy}_{uo}}(Lxy) = 0

ix) Letting o = v, we have: V_{g^{xy}_{uv}}(Lxy) = 0.

x) And so, ⟨[x]_{g^{xy}_{uv}}, [y]_{g^{xy}_{uv}}⟩ ∉ I(L)

xi) But [x]_{g^{xy}_{uv}} is u and [y]_{g^{xy}_{uv}} is v

xii) So, ⟨u, v⟩ ∉ I(L), contradicting line vii).

Exercise 5.5b We must show that ⊭ GιxFx→FιxGx. To make this formula false in a model, we need to make GιxFx true and FιxGx false. Let’s think about the denotation of ιxFx. To make GιxFx true, the denotation of ιxFx must be in the extension of G; that means that it can’t be the emptiness marker. So let’s let the denotation of ιxFx be the number 0. Now, 0 must be the one and only object in the extension of F (since it is not the emptiness marker and is the denotation of ιxFx). So we have this so far:

D = {E, 0, …?
I(F) = {0}
I(G) = {0, …?

Now let’s ask: can we stop there? Can we let G’s extension just contain 0 and nothing else? The answer is no. For if 0 is the one and only object in G’s extension, then 0 will be the denotation of ιxGx. But since 0 is in the extension of F, that would make FιxGx be true, whereas we want it to be false. So we need to add something else to G’s extension:

D = {E, 0, 1}
I(F) = {0}
I(G) = {0, 1}

Now, the denotation of ιxGx is the emptiness marker, E . Since E is not in the

extension of F , F ιxGx is false, which is what we want.
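The finished model can be checked mechanically. The following is a small sketch under stated assumptions: the emptiness marker is encoded as the string "E", and the helper `iota` is a hypothetical name for the denotation rule for ι terms (the unique satisfier if there is exactly one, and otherwise the emptiness marker).

```python
E = "E"  # assumed encoding of the emptiness marker

def iota(domain, extension):
    """Denotation of ιxφ: the unique member of the extension, else E."""
    satisfiers = [o for o in domain if o in extension]
    return satisfiers[0] if len(satisfiers) == 1 else E

D = [E, 0, 1]
F = {0}
G = {0, 1}

antecedent = iota(D, F) in G   # GιxFx
consequent = iota(D, G) in F   # FιxGx
conditional = (not antecedent) or consequent
```

Because E belongs to no extension, `consequent` is false even though ιxGx still receives a denotation.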

Exercise 5.7a “If a person commits a crime, then the judge that sentences him/her wears a wig”:

∀x[(Px∧∃y(Cy∧Mxy)) → ∃y(Wy∧E ιz(Jz∧Szx) y)]

(“Ex1x2” = “x1 wears x2”)

Exercise 5.7b “The tallest spy is a spy”: Sιx(Sx∧∀y((Sy∧y≠x)→Txy))

Exercise 5.8 “The ten-feet-tall man is not happy”, symbolized first using the ι, and then (under two readings) using Russell’s method:

∼Hιx(Tx∧Mx)
∼∃x(Tx∧Mx∧∀y((Ty∧My)→y=x)∧Hx)
∃x(Tx∧Mx∧∀y((Ty∧My)→y=x)∧∼Hx)

The first Russellian symbolization says that it’s not true that: there is exactly one ten-feet-tall man, and he is happy. The second says that there is exactly one ten-feet-tall man, and he is not happy. So if there isn’t exactly one ten-feet-tall man (whether because no man is ten-feet-tall, or because more than one man is ten-feet-tall), then the first is true while the second is false. Given that the null individual is not in the extension of any predicate, the ι symbolization is also true if there is not exactly one ten-feet-tall man; so the first Russellian symbolization is like the ι symbolization.

Exercise 5.9 The semantics of ∃prime: For any model, M, and any assignment to the variables g, V_{M,g}(∃prime α φ) = 1 iff |φ^{M,g,α}| is prime.

Exercise 5.10 Symbolize “the number of people multiplied by the number of cats that bite at least one dog is 198”, inventing any desired generalized quantifiers:

This sentence concerns the cardinalities of two sets, the set of people and the set of cats that bite at least one dog. So we need a binary quantifier. I’ll invent one called Ted’s quantifier, Ted. The idea is that (Ted α : φ)ψ is to be true iff the number of φs multiplied by the number of ψs is 198. The semantic clause for this quantifier is this: for any model, M, and variable assignment g,

V_{M,g}((Ted α : φ)ψ) = 1 iff |φ^{M,g,α}| · |ψ^{M,g,α}| = 198

The symbolization of the sentence is then:

(Ted x : Px)(Cx∧∃y(Dy∧Bxy))
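The semantic clause for Ted’s quantifier can be illustrated on a toy model. Everything in this sketch is hypothetical (the function `ted`, the object names, and the particular model with 18 people and 12 cats, 11 of which bite the single dog); it only shows how the clause compares a product of two cardinalities with 198.

```python
def ted(domain, phi, psi, target=198):
    """(Ted α : φ)ψ is true iff |φ-satisfiers| · |ψ-satisfiers| = 198."""
    count_phi = sum(1 for o in domain if phi(o))
    count_psi = sum(1 for o in domain if psi(o))
    return count_phi * count_psi == target

# A toy model: 18 people, 12 cats, one dog; 11 of the cats bite the dog.
people = {f"p{i}" for i in range(18)}
cats = {f"c{i}" for i in range(12)}
dogs = {"d0"}
domain = people | cats | dogs
bites = {(f"c{i}", "d0") for i in range(11)}

def is_person(o):            # Px
    return o in people

def cat_biting_a_dog(o):     # Cx ∧ ∃y(Dy ∧ Bxy)
    return o in cats and any((o, y) in bites for y in dogs)

result = ted(domain, is_person, cat_biting_a_dog)  # 18 * 11 == 198
```

Replacing the second predicate with plain cathood would give 18 · 12 = 216, so the quantified sentence would then be false in this model.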

Exercise 6.2b □(P∨◇Q)→(□P∨◇Q):

B-countermodel (hence the formula is invalid in K, D, and T as well):

[Diagram: worlds r, a, and b; each world sees itself, r and a see each other, and a and b see each other. The formula is 0 at r; at a, P and Q are 0 and P∨◇Q is 1; at b, Q is 1.]

Official model:

W = {r, a, b}
R = {⟨r,r⟩, ⟨a,a⟩, ⟨b,b⟩, ⟨r,a⟩, ⟨a,r⟩, ⟨a,b⟩, ⟨b,a⟩}
I(P, r) = I(Q, b) = 1, all else 0
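A countermodel like this one can be verified with a few lines of code. The evaluator below is a minimal sketch, not part of the book’s apparatus: formulas are encoded as nested tuples (an assumed representation), and `holds` simply implements the usual truth conditions for the connectives involved.

```python
def holds(f, w, W, R, I):
    """Return True iff formula f (a nested tuple) is true at world w."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in I
    if op == "or":
        return holds(f[1], w, W, R, I) or holds(f[2], w, W, R, I)
    if op == "imp":
        return (not holds(f[1], w, W, R, I)) or holds(f[2], w, W, R, I)
    if op == "box":   # true at every world that w sees
        return all(holds(f[1], v, W, R, I) for v in W if (w, v) in R)
    if op == "dia":   # true at some world that w sees
        return any(holds(f[1], v, W, R, I) for v in W if (w, v) in R)
    raise ValueError(f"unknown operator: {op}")

# The official model for Exercise 6.2b.
W = {"r", "a", "b"}
R = {("r", "r"), ("a", "a"), ("b", "b"),
     ("r", "a"), ("a", "r"), ("a", "b"), ("b", "a")}
I = {("P", "r"), ("Q", "b")}  # pairs (letter, world) that are true

P, Q = ("atom", "P"), ("atom", "Q")
antecedent = ("box", ("or", P, ("dia", Q)))     # □(P∨◇Q)
consequent = ("or", ("box", P), ("dia", Q))     # □P∨◇Q
formula = ("imp", antecedent, consequent)

is_reflexive = all((w, w) in R for w in W)
is_symmetric = all((v, w) in R for (w, v) in R)
falsified_at_r = not holds(formula, "r", W, R, I)
```

The two frame checks confirm that this is a B-model, and `falsified_at_r` confirms that the formula is false at r.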

Validity proof in S4 (establishes validity in S5 as well):

i) Suppose for reductio that in some S4 model ⟨W, R, I⟩, for some world r ∈ W, V(□(P∨◇Q)→(□P∨◇Q), r) = 0

ii) So V(□(P∨◇Q), r) = 1 and …

iii) …V(□P∨◇Q, r) = 0

iv) Given iii), V(□P, r) = 0. So for some world, call it a, Rra and V(P, a) = 0

v) Given ii), V(P∨◇Q, a) = 1, and so, given iv), V(◇Q, a) = 1. So there is some world, call it b, such that Rab and V(Q, b) = 1

vi) Since Rra and Rab, by transitivity, Rrb. Given iii), V(◇Q, r) = 0. And so V(Q, b) = 0, contradicting v)

Exercise 6.2c ◇(P∧◇Q)→(□◇P→◇□Q):

S5-countermodel (hence, the formula is invalid in all systems):

[Diagram: worlds r and a; each sees itself and they see each other. The formula is 0 at r; at a, Q is 1, ◇P is 1, and □Q is 0.]

Official model:

W = {r, a}
R = {⟨r,r⟩, ⟨a,a⟩, ⟨r,a⟩, ⟨a,r⟩}
I(P, r) = I(Q, a) = 1; all else 0

Exercise 6.2e □(P∧Q)→□□(◇P→◇Q):

T-countermodel (establishes invalidity in K and D as well):

[Diagram: worlds r, a, and b; each sees itself, r sees a, and a sees b. The formula is 0 at r; at a, P∧Q is 1 and □(◇P→◇Q) is 0; at b, ◇P→◇Q is 0.]

Official model:

W = {r, a, b}
R = {⟨r,r⟩, ⟨a,a⟩, ⟨b,b⟩, ⟨r,a⟩, ⟨a,b⟩}

S4-validity proof (establishes validity in S5 as well):

i) Suppose for reductio that the formula is false at some world r in some S4-model

ii) Then V(□(P∧Q), r) = 1 and …

iii) V(□□(◇P→◇Q), r) = 0

iv) Given iii), for some world, call it a, Rra and V(□(◇P→◇Q), a) = 0. And so, for some world, call it b, Rab and V(◇P→◇Q, b) = 0

v) Thus, V(◇P, b) = 1 and …

vi) …V(◇Q, b) = 0

vii) Since Rra and Rab, by transitivity Rrb. Given ii), V(P∧Q, b) = 1, and so V(Q, b) = 1.

viii) Given reflexivity, Rbb, and so by vi), V(Q, b) = 0, contradicting vii)

B-validity proof: like the S4-validity proof through the first 6 steps; then:

vii) Since Rab, by symmetry, Rba. So, given vi), V(Q, a) = 0.

viii) Since Rra, given ii), V(P∧Q, a) = 1, and so V(Q, a) = 1, contradicting vii)

Exercise 6.2f □(□P→Q)→□(□P→□Q):

B-countermodel (establishes invalidity in K, D, and T as well):

[Diagram: worlds r, a, and b; each sees itself, r and a see each other, and a and b see each other. The formula is 0 at r; at a, □P→Q is 1 and □P→□Q is 0; at b, Q is 0 and P is 1.]

Official model:




i) Suppose for reductio that the formula is false at some world r of some S4-model.

ii) Then V(□(□P→Q), r) = 1 and…

iii) …V(□(□P→□Q), r) = 0.

iv) Given iii), for some world a, Rra and V(□P→□Q, a) = 0.

v) By the truth condition for the →, V(□P, a) = 1 and…

vi) …V(□Q, a) = 0.

vii) Given vi), for some b, Rab and V(Q, b) = 0.

viii) Given v), V(P, b) = 1.

ix) Since Rra and Rab, Rrb, by transitivity.

x) But then, given ii), V(□P→Q, b) = 1.

xi) Given x) and vii), by the truth condition for the →, V(□P, b) = 0.

xii) So, for some c, Rbc and V(P, c) = 0.

xiii) Since Rab and Rbc, Rac (transitivity).

xiv) So, given v), V(P, c) = 1. This contradicts xii).

Exercise 6.2h ◇◇P→□◇P:

S4-countermodel (establishes invalidity in K, D, and T as well):

[Diagram: worlds r, a, and b; each sees itself, and r sees a and b. The formula is 0 at r; at a, P is 1 and ◇P is 1; at b, ◇P is 0.]

Official model:

W = {r, a, b}
R = {⟨r,r⟩, ⟨a,a⟩, ⟨b,b⟩, ⟨r,a⟩, ⟨r,b⟩}

B-countermodel:

[Diagram: worlds r, a, and b; each sees itself, r and a see each other, and r and b see each other. The formula is 0 at r; at a, P is 1 and ◇P is 1; at b, ◇P is 0.]

Official model:

W = {r, a, b}
R = {⟨r,r⟩, ⟨a,a⟩, ⟨b,b⟩, ⟨r,a⟩, ⟨r,b⟩, ⟨a,r⟩, ⟨b,r⟩}


S5 validity proof:

i) Suppose for reductio that the formula is false at some world r in some S5 model

ii) Then V(◇◇P, r) = 1 and …

iii) …V(□◇P, r) = 0

iv) Given ii), V(◇P, a) = 1 for some a such that Rra

v) Given iii), V(◇P, b) = 0 for some b such that Rrb.

vi) Given iv), V(P, c) = 1, for some c such that Rac.

vii) Since Rrb, Rbr by symmetry. Since Rbr, Rra, and Rac, by transitivity: Rbc. But then by v), V(P, c) = 0, contradicting vi)

Exercise 6.2i □[□(P→□P)→□P]→(◇□P→□P):

S4-countermodel (establishes invalidity in K, D, and T as well):

[Diagram: worlds r, a, and b; each sees itself, r sees a and b, and b sees r and a. The formula is 0 at r; at a, □(P→□P)→□P and □P are both 1; at b, □(P→□P) is 0 and □(P→□P)→□P is 1.]
Official model:

W = {r, a, b}
R = {⟨r,r⟩, ⟨a,a⟩, ⟨b,b⟩, ⟨r,a⟩, ⟨r,b⟩, ⟨b,r⟩, ⟨b,a⟩}
I(P, r) = I(P, a) = 1, all else 0
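The official S4 model can be checked in the same mechanical way. This is a sketch under the assumption that formulas are encoded as nested tuples; the evaluator `holds` is a hypothetical helper that implements only the truth conditions needed here.

```python
def holds(f, w, W, R, I):
    """Return True iff formula f (a nested tuple) is true at world w."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in I
    if op == "imp":
        return (not holds(f[1], w, W, R, I)) or holds(f[2], w, W, R, I)
    if op == "box":
        return all(holds(f[1], v, W, R, I) for v in W if (w, v) in R)
    if op == "dia":
        return any(holds(f[1], v, W, R, I) for v in W if (w, v) in R)
    raise ValueError(f"unknown operator: {op}")

# The official S4 model for Exercise 6.2i.
W = {"r", "a", "b"}
R = {("r", "r"), ("a", "a"), ("b", "b"),
     ("r", "a"), ("r", "b"), ("b", "r"), ("b", "a")}
I = {("P", "r"), ("P", "a")}

P = ("atom", "P")
inner = ("imp", ("box", ("imp", P, ("box", P))), ("box", P))  # □(P→□P)→□P
formula = ("imp", ("box", inner),
           ("imp", ("dia", ("box", P)), ("box", P)))

is_reflexive = all((w, w) in R for w in W)
is_transitive = all((x, z) in R
                    for (x, y) in R for (y2, z) in R if y == y2)
falsified_at_r = not holds(formula, "r", W, R, I)
```

The relation is reflexive and transitive but not symmetric (a does not see r), which is why this model bears on S4 and weaker systems only.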

B-countermodel (establishes invalidity in K, D, and T as well):

[Diagram: worlds r, a, and b; each sees itself, r and a see each other, and r and b see each other. The formula is 0 at r; at a, □(P→□P)→□P and □P are both 1; at b, □(P→□P) is 0 and □(P→□P)→□P is 1.]

Official model:

I(P, r) = I(P, a) = 1, all else 0


i) Suppose for reductio that the formula is false in some world, r, of some S5 model

ii) Then V(□[□(P→□P)→□P], r) = 1 and …

iii) V(◇□P→□P, r) = 0

iv) Given iii), V(◇□P, r) = 1, so V(□P, a) = 1 for some a such that Rra

v) Given iii), V(P, c) = 0 for some c such that Rrc

vi) Since Rra, Rar by symmetry; and then since Rrc, by transitivity we have Rac, whence V(P, c) = 1 by iv), contradicting v)

Exercise 6.4b ⊢_K □∼P→□(P→Q):

1. ∼P→(P→Q)    PL
2. □(∼P→(P→Q))    1, NEC
3. □∼P→□(P→Q)    2, K

Exercise 6.4d ⊢_K □(P↔Q)→(□P↔□Q):

1. (P↔Q)→(P→Q)    PL
2. □(P↔Q)→□(P→Q)    1, NEC, K, MP
3. □(P→Q)→(□P→□Q)    K
4. □(P↔Q)→(□P→□Q)    2, 3, PL
5. □(P↔Q)→(□Q→□P)    repeat 1-4 starting with (P↔Q)→(Q→P)
6. □(P↔Q)→(□P↔□Q)    4, 5, PL

Exercise 6.4e ⊢_K [□(P→Q)∧□(P→∼Q)]→∼◇P:

1. (P→Q)→[(P→∼Q)→∼P]    PL
2. □(P→Q)→□[(P→∼Q)→∼P]    1, NEC, K, MP
3. □[(P→∼Q)→∼P]→[□(P→∼Q)→□∼P]    K
4. □(P→Q)→[□(P→∼Q)→□∼P]    2, 3, PL
5. ∼◇P↔□∼P    MN
6. [□(P→Q)∧□(P→∼Q)]→∼◇P    4, 5, PL

Exercise 6.4f ⊢_K (□P∧□Q)→□(P↔Q):

1. P→(Q→(P↔Q))    PL
2. □[P→(Q→(P↔Q))]    1, NEC
3. □P→□(Q→(P↔Q))    2, K, MP
4. □(Q→(P↔Q))→(□Q→□(P↔Q))    K
5. □P→(□Q→□(P↔Q))    3, 4, PL
6. (□P∧□Q)→□(P↔Q)    5, PL

Exercise 6.4h ⊢_K ◇P→(□Q→◇Q):

1. Q→(P→Q)    PL
2. □Q→□(P→Q)    1, NEC, K, MP
3. □(P→Q)→(◇P→◇Q)    K◇
4. ◇P→(□Q→◇Q)    2, 3, PL

Exercise 6.5a ⊢_D ∼(□P∧□∼P):

1. □P→◇P    D
2. □P→∼□∼P    rewrite of 1 given def of ◇
3. ∼(□P∧□∼P)    2, PL

Exercise 6.5b ⊢_D ∼□[□(P∧Q)∧□(P→∼Q)]:

1. (P∧Q)→∼(P→∼Q)    PL
2. □(P∧Q)→□∼(P→∼Q)    1, NEC, K, MP
3. □∼(P→∼Q)→◇∼(P→∼Q)    D
4. ◇∼(P→∼Q)↔∼□(P→∼Q)    MN
5. ∼[□(P∧Q)∧□(P→∼Q)]    2, 3, 4, PL
6. □∼[□(P∧Q)∧□(P→∼Q)]    5, NEC
7. ◇∼[□(P∧Q)∧□(P→∼Q)]    D, 6, MP
8. ◇∼[□(P∧Q)∧□(P→∼Q)]↔∼□[□(P∧Q)∧□(P→∼Q)]    MN
9. ∼□[□(P∧Q)∧□(P→∼Q)]    7, 8, PL

Exercise 6.6a ⊢_T ◇□P→◇(P∨Q):

1. □P→P    T
2. □P→(P∨Q)    1, PL
3. ◇□P→◇(P∨Q)    2, NEC, K◇, MP

Exercise 6.6b ⊢_T [□P∧◇□(P→Q)]→◇Q:

1. □(P→Q)→(P→Q)    T
2. P→(□(P→Q)→Q)    1, PL
3. □P→□(□(P→Q)→Q)    2, NEC, K, MP
4. □(□(P→Q)→Q)→(◇□(P→Q)→◇Q)    K◇
5. □P→(◇□(P→Q)→◇Q)    3, 4, PL
6. [□P∧◇□(P→Q)]→◇Q    5, PL

Exercise 6.6c ⊢_T ◇(P→□Q)→(□P→◇Q):

1. □Q→Q    T
2. P→[(P→□Q)→Q]    1, PL
3. □P→□[(P→□Q)→Q]    2, NEC, K, MP
4. □[(P→□Q)→Q]→[◇(P→□Q)→◇Q]    K◇
5. ◇(P→□Q)→(□P→◇Q)    3, 4, PL

Exercise 6.7a ⊢_B ◇□P↔◇□◇□P:

1. □P→□◇□P    B◇
2. ◇□P→◇□◇□P    1, NEC, K◇, MP
3. ◇□◇□P→◇□P    B
4. ◇□P↔◇□◇□P    2, 3, PL

Exercise 6.7b ⊢_B [□P∧□◇□(P→Q)]→□Q:

1. ◇□(P→Q)→(P→Q)    B
2. □◇□(P→Q)→□(P→Q)    1, NEC, K, MP
3. □(P→Q)→(□P→□Q)    K
4. [□P∧□◇□(P→Q)]→□Q    2, 3, PL

Exercise 6.8a ⊢_S4 □P→□◇□P:

1. □P→◇□P    T◇
2. □□P→□◇□P    1, NEC, K, MP
3. □P→□□P    S4
4. □P→□◇□P    2, 3, PL

Exercise 6.8b ⊢_S4 □◇□◇P→□◇P:

1. □◇P→◇P    T
2. ◇□◇P→◇◇P    1, NEC, K◇, MP
3. ◇◇P→◇P    S4◇
4. ◇□◇P→◇P    2, 3, PL
5. □◇□◇P→□◇P    4, NEC, K, MP

This is also provable in B:

1. ◇□◇P→◇P    B
2. □◇□◇P→□◇P    1, NEC, K, MP

Exercise 6.8c ⊢_S4 ◇□P→◇□◇□P:

1. □P→◇□P    T◇
2. □□P→□◇□P    1, NEC, K, MP
3. □P→□□P    S4
4. □P→□◇□P    2, 3, PL
5. ◇□P→◇□◇□P    4, NEC, K◇, MP

Exercise 6.9a We are to show that ⊢_S5 (□P∨◇Q)↔□(P∨◇Q). My strategy for the first half, in lines 1-7, uses the fact from propositional logic that (φ∨ψ)→χ follows from φ→χ and ψ→χ. My strategy for the second half uses MN, plus the fact from propositional logic that χ→(φ∨ψ) is equivalent to χ→(∼ψ→φ):

1. P→(P∨◇Q)    PL
2. □P→□(P∨◇Q)    1, NEC, K, MP
3. ◇Q→(P∨◇Q)    PL
4. □◇Q→□(P∨◇Q)    3, NEC, K, MP
5. ◇Q→□◇Q    S5◇
6. ◇Q→□(P∨◇Q)    4, 5, PL
7. (□P∨◇Q)→□(P∨◇Q)    2, 6, PL (done left-to-right; now for the other direction. Goal: get □(P∨◇Q)→(□∼Q→□P))
8. □∼Q↔∼◇Q    MN
9. (P∨◇Q)→(□∼Q→P)    8, PL
10. □(P∨◇Q)→□(□∼Q→P)    9, NEC, K, MP
11. □(□∼Q→P)→(□□∼Q→□P)    K
12. □∼Q→□□∼Q    S4
13. □(P∨◇Q)→(□∼Q→□P)    10, 11, 12, PL
14. □(P∨◇Q)→(□P∨◇Q)    13, 8, PL
15. (□P∨◇Q)↔□(P∨◇Q)    7, 14, PL

Exercise 6.9b ⊢_S5 ◇(P∧◇Q)↔(◇P∧◇Q):

1. (P∧◇Q)→P    PL
2. ◇(P∧◇Q)→◇P    1, NEC, K◇, MP
3. (P∧◇Q)→◇Q    PL
4. ◇(P∧◇Q)→◇◇Q    3, NEC, K◇, MP
5. ◇◇Q→◇Q    S4◇
6. ◇(P∧◇Q)→(◇P∧◇Q)    2, 4, 5, PL
7. ◇Q→(P→(P∧◇Q))    PL
8. □◇Q→□(P→(P∧◇Q))    7, NEC, K, MP
9. □(P→(P∧◇Q))→(◇P→◇(P∧◇Q))    K◇
10. ◇Q→□◇Q    S5◇
11. ◇Q→(◇P→◇(P∧◇Q))    8, 9, 10, PL
12. (◇P∧◇Q)→◇(P∧◇Q)    11, PL
13. ◇(P∧◇Q)↔(◇P∧◇Q)    6, 12, PL

Exercise 6.9c We must show that ⊢_S5 □(□P→□Q)∨□(□Q→□P). Plan of attack: show the PL equivalent: ∼□(□P→□Q)→□(□Q→□P). Note the equivalence of the antecedent of this conditional, via MN, with ◇∼(□P→□Q).

1. ∼(□P→□Q)→□P    PL
2. ◇∼(□P→□Q)→◇□P    1, NEC, K◇, MP
3. ◇□P→□P    S5
4. □P→□□P    S4
5. □P→(□Q→□P)    PL
6. □□P→□(□Q→□P)    5, NEC, K, MP
7. ◇∼(□P→□Q)↔∼□(□P→□Q)    MN
8. ∼□(□P→□Q)→□(□Q→□P)    7, 2, 3, 4, 6, PL
9. □(□P→□Q)∨□(□Q→□P)    8, PL

Exercise 6.9d ⊢_S5 □[□(◇P→Q)↔□(P→□Q)]:

1. □(◇P→Q)→(□◇P→□Q)    K
2. P→□◇P    B
3. □(◇P→Q)→(P→□Q)    1, 2, PL
4. □□(◇P→Q)→□(P→□Q)    3, NEC, K
5. □(◇P→Q)→□□(◇P→Q)    S4
6. □(◇P→Q)→□(P→□Q)    4, 5, PL
7. □(P→□Q)→(◇P→◇□Q)    K◇ (now the other direction)
8. ◇□Q→Q    B
9. □(P→□Q)→(◇P→Q)    7, 8, PL
10. □□(P→□Q)→□(◇P→Q)    9, NEC, K, MP
11. □(P→□Q)→□□(P→□Q)    S4
12. □(P→□Q)→□(◇P→Q)    10, 11, PL
13. □(◇P→Q)↔□(P→□Q)    12, 6, PL
14. □[□(◇P→Q)↔□(P→□Q)]    13, NEC

Exercise 6.10 Lemma 6.4c is a trivial consequence of lemma 6.4b. As for lemma 6.4d: since Γ is maximal, either φ or ∼φ is a member of Γ. But ∼φ can’t be a member of Γ; otherwise Γ would contain the S-inconsistent subset {∼φ} (this subset is S-inconsistent because ⊢_S φ). So φ ∈ Γ.

Exercise 6.11 Where S is any normal modal system, we must show that if ∆ is an S-consistent set of wffs containing the formula ◇φ, then □⁻(∆)∪{φ} is also S-consistent.

◇φ is an abbreviation of ∼□∼φ; so what we’re given is this: S is a normal modal system, ∆ is an S-consistent set of wffs containing the formula ∼□∼φ. By Lemma 6.6, □⁻(∆)∪{∼∼φ} is S-consistent. Now suppose for reductio that □⁻(∆)∪{φ} is not S-consistent. So given the definition of S-consistency, for some ψ1…ψn in □⁻(∆)∪{φ}, ⊢_S ∼(ψ1∧···∧ψn). Since S includes PL, ⊢_S ∼(ψ1∧···∧ψn∧φ). If φ is one of the ψs, then the rest of the ψs are members of □⁻(∆); so for some δ1…δm in □⁻(∆) (namely, all the ψs other than φ), ⊢_S ∼(δ1∧···∧δm∧φ). Since S includes PL, we have: ⊢_S ∼(δ1∧···∧δm∧∼∼φ), which violates the S-consistency of □⁻(∆)∪{∼∼φ}.

Exercise 6.12 We are to demonstrate completeness for the system that results from adding to K every axiom of the form ◇φ→□φ, where the frames for this system are defined as those whose accessibility relation meets the condition that every world can see at most one world. Let’s first show that

(*) in the canonical model for the strange system, every world sees at most one world.

To do this, suppose for reductio that for some world, w, in this canonical model, Rwv and Rwv′ and v ≠ v′. Now, since v ≠ v′, and v and v′ are maximal consistent sets of sentences, there must be some sentence, φ, that is a member of one set but not the other. Without loss of generality, suppose that φ ∈ v and φ ∉ v′. Then, by theorem 6.7, V(φ, v) = 1 and V(φ, v′) = 0. Since Rwv and Rwv′, that means that V(◇φ, w) = 1 (since φ is true in some world accessible from w) and V(□φ, w) = 0 (since φ isn’t true at all worlds accessible from w). So V(◇φ→□φ, w) = 0. But ◇φ→□φ is a theorem of the strange system, and so by 6.4d is a member of w, and so by theorem 6.7 is true at w. Contradiction.

Now we use (*) to prove completeness for the strange system. Suppose φ is valid in the strange system. That is, φ is true in any world of any model in which the accessibility relation is such that every world sees at most one world. Given (*), the canonical model for the strange system is such a model. So φ is true at every world in the canonical model, i.e., is valid in the canonical model for this system. By corollary 6.8, φ is a theorem of the strange system.

Exercise 7.1 We’re to show that φ ⊨_I ψ iff ⊨_I φ→ψ. For the left-to-right direction, suppose φ ⊨ ψ, and suppose for reductio that V(φ→ψ, s) = 0. Then for some s′ (that s sees), V(φ, s′) = 1 and V(ψ, s′) = 0, which contradicts φ ⊨ ψ. For the other direction, suppose ⊨ φ→ψ, and suppose for reductio that V(φ, s) = 1 while V(ψ, s) = 0. By ⊨ φ→ψ, V(φ→ψ, s) = 1. By reflexivity, either V(φ, s) = 0 or V(ψ, s) = 1. Contradiction.

Exercise 7.3a We’re to show that ∼(P∧Q) ⊭ (∼P∨∼Q):

[Diagram: stages r, a, and b; each sees itself, and r sees a and b. At r, ∼(P∧Q) is 1 and ∼P∨∼Q is 0; at a, P is 1 and P∧Q is 0; at b, Q is 1 and P∧Q is 0.]

Exercise 7.3b We’re to show that ∼P∨∼Q ⊨ ∼(P∧Q). Suppose V(∼P∨∼Q, s) = 1 and V(∼(P∧Q), s) = 0. Given the latter, for some s′, Rss′ and V(P∧Q, s′) = 1. So, V(P, s′) = 1 and V(Q, s′) = 1. Given the former, either V(∼P, s) = 1 or V(∼Q, s) = 1. If the former then V(P, s′) = 0; if the latter V(Q, s′) = 0. Contradiction either way.

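Exercise 7.3c We’re to show that P→(Q∨R) ⊭ (P→Q)∨(P→R). We begin thus:

[Diagram: a single stage r, which sees itself; at r, P→(Q∨R) is 1 and (P→Q)∨(P→R) is 0.]

We now discharge the bottom-asterisks:

[Diagram: stages r, a, and b; each sees itself, and r sees a and b. At r, P→(Q∨R) is 1 and (P→Q)∨(P→R) is 0; at a, P is 1 and Q is 0; at b, the diagram lists P and R.]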

Exercise 7.4 We are assuming the inductive hypothesis (ih) that heredity holds for formulas φ and ψ, and we must show that heredity then must also hold for ∼φ, φ→ψ, and φ∨ψ.

∼ : Suppose for reductio that V(∼φ, s) = 1, Rss′, and V(∼φ, s′) = 0. Given the latter, for some s′′, Rs′s′′ and V(φ, s′′) = 1. By transitivity, Rss′′. This contradicts V(∼φ, s) = 1.

→ : Suppose for reductio that V(φ→ψ, s) = 1, Rss′, and V(φ→ψ, s′) = 0. Given the latter, for some s′′, Rs′s′′ and V(φ, s′′) = 1 and V(ψ, s′′) = 0; but by transitivity, Rss′′, which contradicts the fact that V(φ→ψ, s) = 1.

∨ : Suppose for reductio that V(φ∨ψ, s) = 1, Rss′, and V(φ∨ψ, s′) = 0. Given the former, either V(φ, s) = 1 or V(ψ, s) = 1; and so, given (ih), either φ or ψ is 1 in s′. That violates V(φ∨ψ, s′) = 0.

Exercise 7.5 We must show that ∧E, ∨I, DNI, RAA, →I, →E, and EF preserve I-validity.

∧E: Assume that Γ ⊢ φ∧ψ is I-valid, and suppose for reductio that V(Γ, s) = 1 and V(φ, s) = 0, for some stage s in some model. By the I-validity of Γ ⊢ φ∧ψ, V(φ∧ψ, s) = 1, so V(φ, s) = 1. Contradiction. The case of ψ is parallel.

∨I: Assume that Γ ⊢ φ is I-valid and suppose for reductio that V(Γ, s) = 1 but V(φ∨ψ, s) = 0. Thus V(φ, s) = 0, which contradicts the I-validity of Γ ⊢ φ.

DNI (if Γ ⊢ φ then Γ ⊢ ∼∼φ): assume Γ ⊢ φ is I-valid, and suppose for reductio that V(Γ, s) = 1 but V(∼∼φ, s) = 0. From the latter, for some s′, Rss′ and V(∼φ, s′) = 1. So V(φ, s′) = 0 (since R is reflexive). From the former and the fact that Γ ⊢ φ is I-valid, V(φ, s) = 1. This violates general heredity.

RAA: Suppose that Γ, φ ⊢ ψ∧∼ψ is I-valid, and suppose for reductio that V(Γ, s) = 1 but V(∼φ, s) = 0. Then for some s′, Rss′ and V(φ, s′) = 1. Since V(Γ, s) = 1 (i.e., all members of Γ are 1 at s), by general heredity V(Γ, s′) = 1 (all members of Γ are 1 at s′). Thus, since Γ, φ ⊢ ψ∧∼ψ is I-valid, V(ψ∧∼ψ, s′) = 1. But that is impossible. (If V(ψ∧∼ψ, s′) = 1, then V(ψ, s′) = 1 and V(∼ψ, s′) = 1; but from the latter and the reflexivity of R it follows that V(ψ, s′) = 0.)

→I: Suppose that Γ, φ ⊢ ψ is I-valid, and suppose for reductio that V(Γ, s) = 1 but V(φ→ψ, s) = 0. Given the latter, for some s′, Rss′ and V(φ, s′) = 1 and V(ψ, s′) = 0. Given general heredity, V(Γ, s′) = 1. And so, given that Γ, φ ⊢ ψ is I-valid, V(ψ, s′) = 1, which is a contradiction.

→E: Suppose that Γ ⊢ φ and ∆ ⊢ φ→ψ are both I-valid, and suppose for reductio that V(Γ∪∆, s) = 1 but V(ψ, s) = 0. Since Γ ⊢ φ and ∆ ⊢ φ→ψ are I-valid, V(φ, s) = 1 and V(φ→ψ, s) = 1. Given the latter, and given that R is reflexive, either V(φ, s) = 0 or V(ψ, s) = 1. Contradiction.

EF (ex falso): Suppose Γ ⊢ φ∧∼φ is I-valid, and suppose for reductio that V(Γ, s) = 1 but V(ψ, s) = 0. Given the former and the I-validity of Γ ⊢ φ∧∼φ, V(φ∧∼φ, s) = 1, which is impossible.

Exercise 8.1a We’re to show that φ⇒ψ ⊨ φ□→ψ:

i) Suppose φ⇒ψ is true at r, and suppose for reductio that φ□→ψ is false at r.

ii) Then there’s a nearest-to-r φ world, a, at which ψ is false.

iii) But that can’t be. “φ⇒ψ” means □(φ→ψ). So φ→ψ is true at every world. So there can’t be a world like a, in which φ is true and ψ is false.

Exercise 8.1b We’re to show that φ□→ψ ⊨ φ→ψ:

i) Suppose φ□→ψ is true at some world r in some SC-model, and …

ii) …suppose for reductio that φ→ψ isn’t true there.

iii) Then φ is true at r and …

iv) …ψ is false at r

v) Given “base”, for every world, x, r ≼_r x.

vi) Given iii) and v), r is a closest-to-r φ world. So, given i), ψ is true at r. Contradicts iv).


Exercise 8.2a ⊨_SC ◇P→[(P□→Q)↔∼(P□→∼Q)]:

i) Suppose for reductio that V(◇P→[(P□→Q)↔∼(P□→∼Q)], w) = 0, for some world w in some Stalnaker model.

ii) Then V(◇P, w) = 1, and so P is true at some world; and …

iii) …V((P□→Q)↔∼(P□→∼Q), w) = 0. So P□→Q and ∼(P□→∼Q) have different truth values at w: either the first is true and the second is false, or the first is false and the second is true.

iv) Suppose first that V(P□→Q, w) = 1 and …

v) …V(∼(P□→∼Q), w) = 0.

vi) Given ii) and the limit assumption, there is some nearest-to-w P-world, call it v.

vii) Given iv), Q is true in every nearest-to-w P-world; thus, V(Q, v) = 1.

viii) Given v), V(P□→∼Q, w) = 1; and so V(∼Q, v) = 1, and so V(Q, v) = 0, contradicting vii).

ix) Suppose second that V(∼(P□→∼Q), w) = 1 and…

x) …V(P□→Q, w) = 0

xi) Given x), there is some nearest-to-w P-world, call it v′, such that V(Q, v′) = 0.

xii) Given ix), V(P□→∼Q, w) = 0. So there’s some nearest-to-w P-world, call it v′′, such that V(∼Q, v′′) = 0.

xiii) To say that v′ is a “nearest-to-w” P-world (line xi)) is to say: V(P, v′) = 1 and for every x, if V(P, x) = 1 then v′ ≼_w x. So, since V(P, v′′) = 1 (line xii)), we have v′ ≼_w v′′. Likewise, since v′′ is a nearest-to-w P-world and P is true at v′, we know that v′′ ≼_w v′.

xiv) From xiii), by antisymmetry, v′ = v′′. And so by xii), V(∼Q, v′) = 0, and so V(Q, v′) = 1, contradicting xi)

(Note that this wff is invalid given Lewis’s semantics, as the countermodel from section 8.7 shows.)

Exercise 8.2b ⊭_SC [P□→(Q→R)]→[(P∧Q)□→R]:

[Diagram: worlds in order of nearness to r: r, then a, then b. There is no P-world nearer to r than a, and no P∧Q-world nearer to r than b. The formula is 0 at r; at a, P is 1 and Q→R is 1; at b, P∧Q is 1 and R is 0.]

Official model:

W = {r, a, b}
≼_r = {⟨a,b⟩ …}
I(P, b) = I(Q, b) = I(P, a) = 1; all else 0

Exercise 8.2c ⊭_SC [P□→(Q□→R)]→[Q□→(P□→R)]:

“view from r”:

[Diagram: worlds in order of nearness to r: r, a, b, c, d. There is no P-world nearer to r than a, and no Q-world nearer to r than b. The formula is 0 at r; at a, P is 1 and Q□→R is 1; at b, Q is 1 and P□→R is 0.]

“view from a”:

[Diagram: worlds in order of nearness to a: a, c, b, r, d. There is no Q-world nearer to a than c. At a, P is 1 and Q□→R is 1; at c, Q is 1 and R is 1.]

“view from b”:

[Diagram: worlds in order of nearness to b: b, d, a, r, c. There is no P-world nearer to b than d. At b, Q is 1 and P□→R is 0; at d, P is 1 and R is 0.]

Official model:

W = {r, a, b, c, d}
≼_r = {⟨a,b⟩, ⟨b,c⟩, ⟨c,d⟩ …}
≼_a = {⟨c,b⟩, ⟨b,r⟩, ⟨r,d⟩ …}
≼_b = {⟨d,a⟩, ⟨a,r⟩, ⟨r,c⟩ …}
I(P, a) = I(Q, b) = I(Q, c) = I(R, c) = I(P, d) = 1, all else 0

Exercise 8.3 We must show that in Lewis models where the limit and anti-symmetry conditions hold, Lewis’s truth conditions reduce to Stalnaker’s. Consider any Lewis model ⟨W, ≼, I⟩ in which the limit and anti-symmetry conditions hold. Let V_L and V_S be the Lewis and Stalnaker valuation functions, respectively, for this model. We must show that these are the same functions. We’ll show by induction that for any wff φ and any world w, V_S(φ, w) = V_L(φ, w). Base case: show that V_S and V_L assign the same truth values to sentence letters at each world. This follows from the fact that both functions by definition assign the truth values I(α, w) for sentence letters α.

Induction step: assuming the inductive hypothesis:

(ih) V_S and V_L assign the same truth values at each world to wffs φ and ψ

we must show that V_S and V_L also assign the same truth values at each world to ∼φ, φ→ψ, □φ, and φ□→ψ. This is easy in the first three cases, since i) the clauses in the definitions of V_S and V_L for the ∼, →, and □ are identical, and define the truth values of the complex formulas ∼φ, φ→ψ, and □φ at a given world as a function of the truth values of φ and ψ at that world and other worlds; and ii) (ih) tells us that φ and ψ have the same V_S and V_L values at all worlds.

It remains to show that, for a given world w, V_S(φ□→ψ, w) = V_L(φ□→ψ, w). Given Stalnaker’s truth conditions, we know that V_S(φ□→ψ, w) = 1 iff:

(S) for any x, IF [V_S(φ, x) = 1 and for any y such that V_S(φ, y) = 1, x ≼_w y] THEN V_S(ψ, x) = 1

And given Lewis’s truth conditions, we know that V_L(φ□→ψ, w) = 1 iff:

(L) EITHER φ is true_L at no worlds, OR: there is some world, x, such that V_L(φ, x) = 1 and for all y, if y ≼_w x then V_L(φ→ψ, y) = 1

So what we must show is that (S) holds iff (L) holds.

First: (S) ⇒ (L):

i) Suppose (S) is true

ii) Suppose for reductio that (L) isn’t true. Then each disjunct of (L) is false; so:

iii) φ is true_L at some world, and …

iv) …NOT: “there is some world, x, such that V_L(φ, x) = 1 and for all y, if y ≼_w x then V_L(φ→ψ, y) = 1”. So: for every world, x, if V_L(φ, x) = 1 then for some y, y ≼_w x and V_L(φ→ψ, y) = 0

v) Given iii) and the limit condition (which we are assuming holds in this model), there is some world, call it a, that is a nearest-to-w world in which φ is true_L.

vi) That is, V_L(φ, a) = 1 and …

vii) …for all y, if V_L(φ, y) = 1 then a ≼_w y

viii) Given (ih) and vi), V_S(φ, a) = 1

ix) Given (ih) and vii), for all y, if V_S(φ, y) = 1 then a ≼_w y. (It’s crucial to the success of this step that (ih) tells us that φ has the same value under V_S and V_L at all worlds.)

x) Given i), viii), and ix), V_S(ψ, a) = 1. Given (ih), V_L(ψ, a) = 1

xi) Given iv) and vi), there is some world, call it b, such that b ≼_w a and …

xii) …V_L(φ→ψ, b) = 0

xiii) From xii), V_L(φ, b) = 1. So by vii), a ≼_w b. So, given xi), by antisymmetry (assumed to hold in this model), a = b, and so, given x), V_L(ψ, b) = 1. This contradicts xii).

Now: (L) ⇒ (S):

i) Suppose (L) is true

ii) Suppose for reductio that (S) isn’t true. So for some world, call it a, V_S(φ, a) = 1 and…

iii) …for any y such that V_S(φ, y) = 1, a ≼_w y; and …

iv) V_S(ψ, a) = 0. So by (ih), V_L(ψ, a) = 0

v) Given ii), by (ih) V_L(φ, a) = 1.

vi) v) tells us that φ is true_L at some world. So by i), there is some world, call it b, such that V_L(φ, b) = 1 and …

vii) …for all y, if y ≼_w b then V_L(φ→ψ, y) = 1

viii) Given v) and iv), V_L(φ→ψ, a) = 0. So from vii), it is not the case that a ≼_w b. But given vi) and the ih, V_S(φ, b) = 1; and so, given iii), a ≼_w b. Contradiction.

Exercise 9.1a ⊨_SQML ◇∀xFx→∃x◇Fx:

i) Suppose for reductio that in some QML model ⟨W, D, I⟩, for some w ∈ W, and for some variable assignment g based on this model, V_g(◇∀xFx→∃x◇Fx, w) = 0

ii) Then V_g(◇∀xFx, w) = 1 and…

iii) …V_g(∃x◇Fx, w) = 0

iv) From ii), for some w′ ∈ W, V_g(∀xFx, w′) = 1. So for every u ∈ D, V_{g^x_u}(Fx, w′) = 1

v) From iii), for every u ∈ D, V_{g^x_u}(◇Fx, w) = 0

vi) The definition of a QML model specifies that the domain D cannot be empty. So, D has at least one member; call some member of D “u′”.

vii) From iv), V_{g^x_{u′}}(Fx, w′) = 1

viii) From v), V_{g^x_{u′}}(◇Fx, w) = 0, from which it follows that for every member of W, and so for w′ in particular, V_{g^x_{u′}}(Fx, w′) = 0, contradicting vii)

Exercise 9.1b ⊭_SQML ∃x◇Rax→◇□∃x∃yRxy:

[Diagram: worlds r and c, with D = {u} and I(a) = u. At r, the extension of R is {⟨u,u⟩} and the formula is 0; at c, the extension of R is empty and ∃x∃yRxy is 0.]

Official model:

W = {r, c}
D = {u}
I(a) = u
I(R) = {⟨u,u,r⟩}

Exercise 9.1c We are to determine whether the formula ∃x(Nx∧∀y(Ny→y=x)∧□Ox)→□∃x(Nx∧∀y(Ny→y=x)∧Ox) is valid. Think of N as meaning “numbers the planets”, and think of O as meaning “is odd”. The sentence then says: “If the entity which is in fact the number of the planets is necessarily-odd, then it’s necessary that: the number of the planets is odd”. It is, therefore, intuitively invalid. I won’t write out the full process of constructing the model (with overstars, etc.); I’ll just go directly to a picture of a model. The model will have one world in which one and only one object, u, numbers the planets; and we will make u be odd in every world in the model. The model will also contain a second world in which one and only one object, v, numbers the planets; but v won’t be odd in this second world:

D : {u, v}
World r: N : {u}, O : {u}
World a: N : {v}, O : {u}

W = {r, a}
D = {u, v}
I(N) = {⟨u,r⟩, ⟨v,a⟩}
I(O) = {⟨u,r⟩, ⟨u,a⟩}
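The model can be checked directly. The sketch below hard-codes the truth conditions of this one formula rather than providing a general evaluator; the function names are hypothetical, and it assumes (as in SQML) that □ quantifies over every world in W.

```python
W = {"r", "a"}
D = {"u", "v"}
N = {("u", "r"), ("v", "a")}   # pairs (object, world)
O = {("u", "r"), ("u", "a")}

def uniquely_n(x, w):
    """Nx ∧ ∀y(Ny→y=x) at world w."""
    return (x, w) in N and all(y == x for y in D if (y, w) in N)

def antecedent(w):
    """∃x(Nx ∧ ∀y(Ny→y=x) ∧ □Ox) at world w."""
    return any(uniquely_n(x, w) and all((x, v) in O for v in W) for x in D)

def consequent(w):
    """□∃x(Nx ∧ ∀y(Ny→y=x) ∧ Ox), which has the same value at every world."""
    return all(any(uniquely_n(x, v) and (x, v) in O for x in D) for v in W)

formula_at_r = (not antecedent("r")) or consequent("r")
```

At r the antecedent holds because u uniquely numbers the planets there and is odd everywhere, while the consequent fails because at a the unique planet-numberer is v, which is not odd.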

Exercise 9.2 Formulas 9.1b and 9.1c are SQML-invalid, and so remain invalid in the variable domain semantics. But whereas 9.1a is SQML-valid, it is VDQML-invalid:

[Diagram: worlds r and a. At r, D_r : {v}, F : {}, and ◇∀xFx→∃x◇Fx is 0; at a, D_a : {u}, F : {u}, and ∀xFx is 1.]

Official model:

W = {r, a}
D_r = {v}
D_a = {u}
I(F) = {⟨u,a⟩}

Exercise 9.3a ⊭_VDQML □∀xFx→∀x□Fx:

[Diagram: worlds r and a; each sees itself, and r sees a. At r, D_r : {u,v} and F : {u,v}; at a, D_a : {v} and F : {v}.]

Official model:

W = {r, a}
R = {⟨r,r⟩, ⟨r,a⟩, ⟨a,a⟩}
D = {u, v}
D_r = {u, v}
D_a = {v}
I(F) = {⟨u,r⟩, ⟨v,r⟩, ⟨v,a⟩}
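Here is a small check of this variable-domain countermodel. It is a sketch under stated assumptions: quantifiers at a world range over that world’s own domain, an atomic formula Fx is true at a world iff the pair of the object and the world is in I(F), and the two functions (hypothetical names) hard-code the truth conditions of the antecedent and consequent.

```python
W = ["r", "a"]
R = {("r", "r"), ("r", "a"), ("a", "a")}
dom = {"r": {"u", "v"}, "a": {"v"}}        # world-relative domains D_r, D_a
F = {("u", "r"), ("v", "r"), ("v", "a")}   # pairs (object, world)

def box_all_F(w):
    """□∀xFx at w: at each accessible v, every member of D_v is F at v."""
    return all(all((o, v) in F for o in dom[v])
               for v in W if (w, v) in R)

def all_box_F(w):
    """∀x□Fx at w: each member of D_w is F at every accessible world."""
    return all(all((o, v) in F for v in W if (w, v) in R)
               for o in dom[w])

formula_at_r = (not box_all_F("r")) or all_box_F("r")
```

The object u is what breaks the consequent: it is in D_r but drops out of D_a, and it is not F at a.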

Exercise 9.3b ⊭_VDQML ∃x□Fx→□∃xFx:

[Diagram: worlds r and a; each sees itself, and r sees a. At r, D_r : {u,v} and F : {u}; at a, D_a : {v} and F : {u}.]

Official model:

W = {r, a}
R = {⟨r,r⟩, ⟨r,a⟩, ⟨a,a⟩}
D = {u, v}
D_r = {u, v}
D_a = {v}
I(F) = {⟨u,r⟩, ⟨u,a⟩}

Exercise 9.3c The model in exercise 9.3a shows that ⊭_VDQML ∀x□∃y y=x.

Exercise 9.4a ⊨_VDQML+ID □∀αφ→∀α□φ:

i) suppose for reductio that V_g(□∀αφ, w) = 1, and …

ii) …V_g(∀α□φ, w) = 0

iii) so for some u ∈ D_w, V_{g^α_u}(□φ, w) = 0

iv) so for some v, Rwv and V_{g^α_u}(φ, v) = 0

v) by i), V_g(∀αφ, v) = 1

vi) by increasing domains, u ∈ D_v

vii) by v), for every object in D_v, and so for u in particular, V_{g^α_u}(φ, v) = 1. Contradicts iv)

Exercise 9.4b ⊨_VDQML+ID ∃α□φ→□∃αφ:

i) suppose for reductio that V_g(∃α□φ→□∃αφ, w) = 0. Then V_g(∃α□φ, w) = 1 and …

ii) …V_g(□∃αφ, w) = 0

iii) by i), for some u ∈ D_w, V_{g^α_u}(□φ, w) = 1

iv) by ii), for some world v, Rwv and V_g(∃αφ, v) = 0

v) by the increasing domain requirement, D_w ⊆ D_v, and so u ∈ D_v

vi) by iv), for every object in D_v, and so for u in particular, V_{g^α_u}(φ, v) = 0

vii) by iii), V_{g^α_u}(φ, v) = 1. Contradicts vi)

Exercise 10.1a We are to show that �φ→2@φ. Consider any model, world

w (and variable assignment), and suppose for reductio that V(φ, w, w) = 1but V(2@φ, w, w) = 0. Given the latter, there is some world, v, such that

V(@φ, w, v) = 0. And so, given the truth condition for @, V(φ, w, w) = 0.

Contradiction.

Exercise 10.1b We are to show that ⊨ □×∀x◇@Fx→□∀xFx.

i) Suppose for reductio that for some world w, some variable assignment g, and some model, V_g(□×∀x◇@Fx, w, w) = 1 and …

ii) … V_g(□∀xFx, w, w) = 0.

iii) Given the latter, for some world, call it “a”, V_g(∀xFx, w, a) = 0.

iv) And so for some object in D (call it “u”), V_{g^x_u}(Fx, w, a) = 0.

v) Given i), V_g(×∀x◇@Fx, w, a) = 1

vi) Given the truth condition for ×, V_g(∀x◇@Fx, a, a) = 1

vii) Thus, for every object in the domain, and so for u in particular, V_{g^x_u}(◇@Fx, a, a) = 1

viii) Thus, for some world, call it b, V_{g^x_u}(@Fx, a, b) = 1

ix) Given the truth condition for @, V_{g^x_u}(Fx, a, a) = 1

x) Given the truth condition for atomics, ⟨[x]_{g^x_u}, a⟩ ∈ I(F)

xi) But given iv), ⟨[x]_{g^x_u}, a⟩ ∉ I(F). Contradiction

Exercise 10.2 We must show that ⊨_2D F@φ→φ. Suppose V(F@φ, w, w) = 1

but V(φ, w, w) = 0. Given the former, V(@φ, w, w) = 1 (given the truth

condition for ‘F’); but then V(φ, w, w) = 1 (given the truth condition for @).

Contradiction.

Exercise 10.3 We must find some φ such that ⊭_2D φ→Fφ. In example 10.5

model of the previous problem, the formula @Ga→F@Ga is false at ⟨c, c⟩. The

antecedent is true because the referent of ‘a’ is in the extension of ‘G’ at c. The

consequent is false because F@Ga means that ‘Ga’ is true at all pairs of the

form ⟨v, v⟩, whereas ‘Ga’ is not true at ⟨d,d⟩ (since the referent of ‘a’ is not in

the extension of ‘G’ at d).
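The two-dimensional reasoning in exercises 10.1a, 10.2, and 10.3 can be illustrated with a toy evaluator. This is a rough sketch with several assumptions that go beyond the text: ‘Ga’ is treated as a sentence letter with a set of worlds at which it is true, the model with worlds c and d is a hypothetical stand-in for the example 10.5 model (which is not reproduced here), and the clause for F is taken to be that Fφ is true at ⟨w, v⟩ iff φ is true at ⟨w′, v⟩ for every world w′. In `holds2d`, w is the reference world and v the world of evaluation.

```python
from itertools import chain, combinations

def holds2d(m, f, w, v):
    """Truth of f at the pair <w, v>: w is the reference world, v the world of evaluation."""
    op = f[0]
    if op == "atom":      # atomics depend only on the world of evaluation
        return v in m["ext"][f[1]]
    if op == "imp":
        return (not holds2d(m, f[1], w, v)) or holds2d(m, f[2], w, v)
    if op == "box":       # vary the world of evaluation
        return all(holds2d(m, f[1], w, v2) for v2 in m["W"])
    if op == "at":        # @: evaluate at the reference world
        return holds2d(m, f[1], w, w)
    if op == "fixedly":   # F: vary the reference world (assumed clause)
        return all(holds2d(m, f[1], w2, v) for w2 in m["W"])
    raise ValueError(f"unknown operator: {op}")

def valid_2d_on(worlds, f, letter):
    """True if f holds at every pair <w, w> under every extension for the letter."""
    exts = chain.from_iterable(combinations(worlds, k) for k in range(len(worlds) + 1))
    return all(holds2d({"W": worlds, "ext": {letter: set(e)}}, f, w, w)
               for e in exts for w in worlds)

P = ("atom", "Ga")
f_10_1a = ("imp", P, ("box", ("at", P)))                 # φ→□@φ
f_10_2 = ("imp", ("fixedly", ("at", P)), P)              # F@φ→φ
f_10_3 = ("imp", ("at", P), ("fixedly", ("at", P)))      # @Ga→F@Ga

worlds = ("c", "d")
checks = {
    "10.1a holds on all diagonal pairs": valid_2d_on(worlds, f_10_1a, "Ga"),
    "10.2 holds on all diagonal pairs": valid_2d_on(worlds, f_10_2, "Ga"),
}

# Hypothetical model: 'Ga' is true at c but not at d.
model = {"W": worlds, "ext": {"Ga": {"c"}}}
checks["10.3 instance at <c, c>"] = holds2d(model, f_10_3, "c", "c")
print(checks)
```

The first two checks only survey a two-world frame, so they illustrate the proofs rather than replace them; the third reproduces the countermodel reasoning, where the antecedent is true at ⟨c, c⟩ and the consequent fails because ‘Ga’ is false at ⟨d, d⟩.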

Bibliography

Benacerraf, Paul and Hilary Putnam (eds.) (1983). Philosophy of Mathematics. 2nd edition. Cambridge: Cambridge University Press.

Boolos, George (1975). “On Second-Order Logic.” Journal of Philosophy 72:

509–527. Reprinted in Boolos 1998: 37–53.

— (1984). “To Be Is to Be the Value of a Variable (or to Be Some Values of

Some Variables).” Journal of Philosophy 81: 430–49. Reprinted in Boolos 1998:

54–72.

— (1985). “Nominalist Platonism.” Philosophical Review 94: 327–44. Reprinted

in Boolos 1998: 73–87.

— (1998). Logic, Logic, and Logic. Cambridge, MA: Harvard University Press.

Boolos, George and Richard Jeffrey (1989). Computability and Logic. 3rd edition. Cambridge: Cambridge University Press.

Chalmers, David (1996). The Conscious Mind. Oxford: Oxford University Press.

Cresswell, M. J. (1990). Entities and Indices. Dordrecht: Kluwer.

Cresswell, M. J. and G. E. Hughes (1996). A New Introduction to Modal Logic. London: Routledge.

Davies, Martin and Lloyd Humberstone (1980). “Two Notions of Necessity.”

Philosophical Studies 38: 1–30.

Dummett, Michael (1973). “The Philosophical Basis of Intuitionist Logic.” In

H. E. Rose and J. C. Shepherdson (eds.), Proceedings of the Logic Colloquium, Bristol, July 1973, 5–49. Amsterdam: North-Holland. Reprinted in Benacerraf

and Putnam 1983: 97–129.


Enderton, Herbert (1977). Elements of Set Theory. New York: Academic Press.

Evans, Gareth (1979). “Reference and Contingency.” The Monist 62: 161–189.

Reprinted in Evans 1985.

— (1985). Collected Papers. Oxford: Clarendon Press.

Fine, Kit (1985). “Plantinga on the Reduction of Possibilist Discourse.” In

J. Tomberlin and Peter van Inwagen (eds.), Alvin Plantinga, 145–186. Dordrecht: D. Reidel.

Gamut, L. T. F. (1991a). Logic, Language, and Meaning, Volume 1: Introduction to Logic. Chicago: University of Chicago Press.

— (1991b). Logic, Language, and Meaning, Volume 2: Intensional Logic and Logical Grammar. Chicago: University of Chicago Press.

Gibbard, Allan (1975). “Contingent Identity.” Journal of Philosophical Logic 4:

187–221. Reprinted in Rea 1997: 93–125.

Glanzberg, Michael (2006). “Quantifiers.” In Ernest Lepore and Barry C.

Smith (eds.), The Oxford Handbook of Philosophy of Language, 794–821. Oxford

University Press.

Harper, William L., Robert Stalnaker and Glenn Pearce (eds.) (1981). Ifs: Conditionals, Belief, Decision, Chance, and Time. Dordrecht: D. Reidel Publishing

Company.

Hirsch, Eli (1986). “Metaphysical Necessity and Conceptual Truth.” In

Peter French, Theodore E. Uehling, Jr. and Howard K. Wettstein (eds.),

Midwest Studies in Philosophy XI: Studies in Essentialism, 243–256. Minneapolis:

University of Minnesota Press.

Hodes, Harold (1984a). “On Modal Logics Which Enrich First-order S5.”

Journal of Philosophical Logic 13: 423–454.

— (1984b). “Some Theorems on the Expressive Limitations of Modal Languages.” Journal of Philosophical Logic 13: 13–26.

Jackson, Frank (1998). From Metaphysics to Ethics: A Defence of Conceptual Analysis. Oxford: Oxford University Press.

Kripke, Saul (1972). “Naming and Necessity.” In Donald Davidson and Gilbert

Harman (eds.), Semantics of Natural Language, 253–355, 763–769. Dordrecht:

Reidel. Revised edition published in 1980 as Naming and Necessity (Cambridge,

MA: Harvard University Press).

Lemmon, E. J. (1965). Beginning Logic. London: Chapman & Hall.

Lewis, C. I. (1918). A Survey of Symbolic Logic. Berkeley: University of California

Press.

Lewis, C. I. and C. H. Langford (1932). Symbolic Logic. New York: Century

Company.

Lewis, David (1973). Counterfactuals. Oxford: Blackwell.

— (1977). “Possible-World Semantics for Counterfactual Logics: A Rejoinder.”

Journal of Philosophical Logic 6: 359–363.

— (1979). “Scorekeeping in a Language Game.” Journal of Philosophical Logic 8:

339–59. Reprinted in Lewis 1983: 233–249.

— (1983). Philosophical Papers, Volume 1. Oxford: Oxford University Press.

Linsky, Bernard and Edward N. Zalta (1994). “In Defense of the Simplest

Quantified Modal Logic.” In James Tomberlin (ed.), Philosophical Perspectives 8: Logic and Language, 431–458. Atascadero: Ridgeview.

— (1996). “In Defense of the Contingently Nonconcrete.” Philosophical Studies 84: 283–294.

Loewer, Barry (1976). “Counterfactuals with Disjunctive Antecedents.” Journal of Philosophy 73: 531–537.

Lycan, William (1979). “The Trouble with Possible Worlds.” In Michael J.

Loux (ed.), The Possible and the Actual, 274–316. Ithaca: Cornell University

Press.

Mendelson, Elliott (1987). Introduction to Mathematical Logic. Belmont, California: Wadsworth & Brooks.

Plantinga, Alvin (1983). “On Existentialism.” Philosophical Studies 44: 1–20.


Priest, Graham (2001). An Introduction to Non-Classical Logic. Cambridge:

Cambridge University Press.

Prior, A. N. (1967). Past, Present, and Future. Oxford: Oxford University Press.

— (1968). Papers on Time and Tense. London: Oxford University Press.

Quine, W. V. O. (1948). “On What There Is.” Review of Metaphysics 2: 21–38.

Reprinted in Quine 1953a: 1–19.

— (1953a). From a Logical Point of View. Cambridge, Mass.: Harvard University

Press.

— (1953b). “Mr. Strawson on Logical Theory.” Mind 62: 433–451.

Rea, Michael (ed.) (1997). Material Constitution. Lanham, Maryland: Rowman & Littlefield.

Russell, Bertrand (1905). “On Denoting.” Mind 479–93. Reprinted in Russell

1956: 41–56.

— (1956). Logic and Knowledge. Ed. Robert Charles Marsh. New York: G.P.

Putnam’s Sons.

Sher, Gila (1991). The Bounds of Logic: A Generalized Viewpoint. Cambridge,

Mass.: MIT Press.

Sider, Theodore (2003). “Reductive Theories of Modality.” In Michael J. Loux

and Dean W. Zimmerman (eds.), Oxford Handbook of Metaphysics, 180–208.

Oxford: Oxford University Press.

Soames, Scott (2004). Reference and Description: The Case against Two-Dimensionalism. Princeton: Princeton University Press.

Stalnaker, Robert (1968). “A Theory of Conditionals.” In Studies in Logical Theory: American Philosophical Quarterly Monograph Series, No. 2. Oxford:

Blackwell. Reprinted in Harper et al. 1981: 41–56.

— (1978). “Assertion.” In Peter Cole and Jerry Morgan (eds.), Syntax and Semantics, Volume 9: Pragmatics, 315–332. New York: Academic Press. Reprinted

in Stalnaker 1999: 78–95.


— (1981). “A Defense of Conditional Excluded Middle.” In Harper et al.

(1981), 87–104.

— (1999). Context and Content: Essays on Intentionality in Speech and Thought. Oxford: Oxford University Press.

— (2003a). “Conceptual Truth and Metaphysical Necessity.” In Stalnaker

(2003b), 201–215.

— (2003b). Ways a World Might Be. Oxford: Oxford University Press.

— (2004). “Assertion Revisited: On the Interpretation of Two-Dimensional

Modal Semantics.” Philosophical Studies 118: 299–322. Reprinted in Stalnaker

2003b: 293–309.

von Fintel, Kai (2001). “Counterfactuals in a Dynamic Context.” In Ken Hale: A Life in Language, 123–152. Cambridge, MA: MIT Press.

Westerståhl, Dag (1989). “Quantifiers in Formal and Natural Languages.” In

D. Gabbay and F. Guenthner (eds.), Handbook of Philosophical Logic, volume 4,

1–131. Dordrecht: Kluwer.

Williamson, Timothy (1998). “Bare Possibilia.” Erkenntnis 48: 257–273.

— (2002). “Necessary Existents.” In A. O’Hear (ed.), Logic, Thought and Language, 233–51. Cambridge: Cambridge University Press.