web educationwebéducation.com/wp-content/uploads/2020/03/aaron-r.-bradley-zo… · preface logic...

The Calculus of Computation

Aaron R. Bradley · Zohar Manna

The Calculusof ComputationDecision Procedureswith Applications to Verification

With 60 Figures

123

Authors

Aaron R. BradleyZohar Manna

Gates Building, Room 481Stanford UniversityStanford, CA 94305USA

[email protected]@cs.stanford.edu

Library of Congress Control Number: 2007932679

ACM Computing Classification (1998): B.8, D.1, D.2, E.1, F.1, F.3, F.4, G.2, I.1, I.2

ISBN 978-3-540-74112-1 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the materialis concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broad-casting, reproduction on microfilm or in any other way, and storage in data banks. Duplication ofthis publication or parts thereof is permitted only under the provisions of the German Copyright Lawof September 9, 1965, in its current version, and permission for use must always be obtained fromSpringer. Violations are liable for prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media

springer.com

© Springer-Verlag Berlin Heidelberg 2007

The use of general descriptive names, registered names, trademarks, etc. in this publication does notimply, even in the absence of a specific statement, that such names are exempt from the relevant pro-tective laws and regulations and therefore free for general use.

Typesetting by the authorsProduction: LE-TEX Jelonek, Schmidt & Vöckler GbR, LeipzigCover design: KünkelLopka Werbeagentur, Heidelberg

Printed on acid-free paper 45/3180/YL - 5 4 3 2 1 0

To my wife,

Sarah

A.R.B.

To my grandchildren,

Itai

Maya

Ori

Z.M.

Preface

Logic is the calculus of computation. Forty-five years ago, John McCarthypredicted in A Basis for a Mathematical Theory of Computation that “therelationship between computation and mathematical logic will be as fruitful inthe next century as that between analysis and physics in the last”. The field ofcomputational logic emerged over the past few decades in partial fulfillmentof that vision. Focusing on producing efficient and powerful algorithms fordeciding the satisfiability of formulae in logical theories and fragments, itcontinues to push the frontiers of general computer science.

This book is about computational logic and its applications to programverification. Program verification is the task of analyzing the correctness of aprogram. It encompasses the formal specification of what a program should doand the formal proof that the program meets this specification. The reasoningpower that computational logic offers revolutionized the field of verification.Ongoing research will make verification standard practice in software andhardware engineering in the next few decades. This acceptance into everydayengineering cannot come too soon: software and hardware are becoming evermore ubiquitous and thus ever more the source of failure.

We wrote this book with an undergraduate and beginning graduate audi-ence in mind. However, any computer scientist or engineer who would like toenter the field of computational logic or apply its products should find thisbook useful.

Content

The book has two parts. Part I, Foundations, presents first-order logic, induc-tion, and program verification. The methods are general. For example, Chap-ter 2 presents a complete proof system for first-order logic, while Chapter 5describes a relatively complete verification methodology. Part II, AlgorithmicReasoning, focuses on specialized algorithms for reasoning about fragments offirst-order logic and for deducing facts about programs. Part II trades gener-ality for decidability and efficiency.

VIII Preface

The first three chapters of Part I introduce first-order logic. Chapters 1 and2 begin our presentation with a review of propositional and predicate logic.Much of the material will be familiar to the reader who previously studiedlogic. However, Chapter 3 on first-order theories will be new to many readers.It axiomatically defines the various first-order theories and fragments that westudy and apply throughout the rest of the book. Chapter 4 reviews induction,introducing some forms of induction that may be new to the reader. Inductionprovides the mathematical basis for analyzing program correctness.

Chapter 5 turns to the primary motivating application of computationallogic in this book, the task of verifying programs. It discusses specification, inwhich the programmer formalizes in logic the (sometimes surprisingly vague)understanding that he has about what functions should do; partial correctness,which requires proving that a program or function meets a given specificationif it halts; and total correctness, which requires proving additionally that a pro-gram or function always halts. The presentation uses the simple programminglanguage pi and is supported by the verifying compiler πVC (see The πVCSystem, below, for more information on πVC). Chapter 6 suggests strategiesfor applying the verification methodology.

Part II on Algorithmic Reasoning begins in Chapter 7 with quantifier-elimination methods for limited integer and rational arithmetic. It describesan algorithm for reducing a quantified formula in integer or rational arithmeticto an equivalent formula without quantifiers.

Chapter 8 begins a sequence of chapters on decision procedures forquantifier-free and other fragments of theories. These fragments of first-ordertheories are interesting for three reasons. First, they are sometimes decidablewhen the full theory is not (see Chapters 9, 10, and 11). Second, they aresometimes efficiently decidable when the full theory is not (compare Chapters7 and 8). Finally, they are often useful; for example, proving the verificationconditions that arise in the examples of Chapters 5 and 6 requires just thefragments of theories studied in Chapters 8–11. The simplex method for linearprogramming is presented in Chapter 8 as a decision procedure for decidingsatisfiability in rational and real arithmetic without multiplication.

Chapters 9 and 11 turn to decision procedures for non-arithmetical theo-ries. Chapter 9 discusses the classic congruence closure algorithm for equalitywith uninterpreted functions and extends it to reason about data structureslike lists, trees, and arrays. These decision procedures are for quantifier-freefragments only. Chapter 11 presents decision procedures for larger fragmentsof theories that formalize array-like data structures.

Decision procedures are most useful when they are combined. For example,in program verification one must reason about arithmetic and data structuressimultaneously. Chapter 10 presents the Nelson-Oppen method for combiningdecision procedures for quantifier-free fragments. The decision procedures ofChapters 8, 9, and 11 are all combinable using the Nelson-Oppen method.

Chapter 12 presents a methodology for constructing invariant generationprocedures. These procedures reason inductively about programs to aid in

Preface IX

1–4

5,6 7 8 9

12 10

11

Verification Decision procedures

Fig. 0.1. The chapter dependency graph

verification. They relieve some of the burden on the programmer to provideprogram annotations for verification purposes. For now, developing a staticanalysis is one of the easiest ways of bringing formal methods into generalusage, as a typical static analysis requires little or no input from the pro-grammer. The chapter presents a general methodology and two instances ofthe method for deducing arithmetical properties of programs.

Finally, Chapter 13 suggests directions for further reading and research.

Teaching

This book can be used in various ways and taught at multiple levels. Figure0.1 presents a dependency graph for the chapters. There are two main tracks:the verification track, which focuses on Chapters 1–4, 5, 6, and 12; and thedecision procedures track, which focuses on Chapters 1–4 and 7–11. Withinthe decision procedures track, the reader can focus on the quantifier-free de-cision procedures track, which skips Chapters 7 and 11. The reader interestedin quickly obtaining an understanding of modern combination decision proce-dures would prefer this final track.

We have annotated several sections with a ⋆ to indicate that they provideadditional depth that is unnecessary for understanding subsequent material.Additionally, all proofs may be skipped without preventing a general under-standing of the material.

Each chapter ends with a set of exercises. Some require just a mechanicalunderstanding of the material, while others require a conceptual understand-ing or ask the reader to think beyond what is presented in the book. Theselatter exercises are annotated with a ⋆. For certain audiences, additional exer-cises might include implementing decision procedures or invariant generationprocedures and exploring certain topics in greater depth (see Chapter 13).

In our courses, we assign program verification exercises from Chapters 5and 6 throughout the term to give students time to develop this importantskill. Learning to verify programs is about as difficult for students as learning

X Preface

to program in the first place. Specifying and verifying programs also strength-ens the students’ facility with logic.

Bibliographic Remarks

Each chapter ends with a section entitled Bibliographic Remarks in whichwe attempt to provide a brief account of the historical context and develop-ment of the chapter’s material. We have undoubtedly missed some importantcontributions, for which we apologize. We welcome corrections, comments,and historical anecdotes.

The πVC System

We implemented a verifying compiler called πVC to accompany this text. Itallows users to write and verify annotated programs in the pi programminglanguage. The system and a set of examples, including the programs listed inthis book, are available for download from http://theory.stanford.edu/∼arbrad/pivc. We plan to update this website regularly and welcome readers’comments, questions, and suggestions about πVC and the text.

Acknowledgments

This material is based upon work supported by the National Science Foun-dation under Grant Nos. CSR-0615449 and CNS-0411363 and by Navy/ONRcontract N00014-03-1-0939. Any opinions, findings, and conclusions or rec-ommendations expressed in this material are those of the authors and donot necessarily reflect the views of the National Science Foundation or theNavy/ONR. The first author received additional support from a Sang SamuelWang Stanford Graduate Fellowship.

We thank the following people for their comments throughout the writ-ing of this book: Miquel Bertran, Andrew Bradley, Susan Bradley, Chang-Seo Park, Caryn Sedloff, Henny Sipma, Matteo Slanina, Sarah Solter, FabioSomenzi, Tomas Uribe, the students of CS156, and Alfred Hofmann and thereviewers and editors at Springer. Their suggestions helped us to improvethe presentation substantially. Remaining errors and shortcomings are ourresponsibility.

Stanford University, Aaron R. BradleyJune 2007 Zohar Manna

Contents

Part I Foundations

1 Propositional Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.3 Satisfiability and Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.3.1 Truth Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.3.2 Semantic Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.4 Equivalence and Implication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141.5 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161.6 Normal Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181.7 Decision Procedures for Satisfiability . . . . . . . . . . . . . . . . . . . . . . . 21

1.7.1 Simple Decision Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 211.7.2 Reconsidering the Truth-Table Method . . . . . . . . . . . . . . . 221.7.3 Conversion to an Equisatisfiable Formula in CNF . . . . . . 241.7.4 The Resolution Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . 271.7.5 DPLL. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

1.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

2 First-Order Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392.3 Satisfiability and Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422.4 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

2.4.1 Safe Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472.4.2 Schema Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

2.5 Normal Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512.6 Decidability and Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

2.6.1 Satisfiability as a Formal Language . . . . . . . . . . . . . . . . . . 53

XII Contents

2.6.2 Decidability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542.6.3 ⋆Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

2.7 ⋆Meta-Theorems of First-Order Logic . . . . . . . . . . . . . . . . . . . . . 562.7.1 Simplifying the Language of FOL. . . . . . . . . . . . . . . . . . . . 572.7.2 Semantic Argument Proof Rules . . . . . . . . . . . . . . . . . . . . . 582.7.3 Soundness and Completeness . . . . . . . . . . . . . . . . . . . . . . . 582.7.4 Additional Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61


3 First-Order Theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 693.1 First-Order Theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 693.2 Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713.3 Natural Numbers and Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

3.3.1 Peano Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733.3.2 Presburger Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 753.3.3 Theory of Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

3.4 Rationals and Reals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 793.4.1 Theory of Reals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 803.4.2 Theory of Rationals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

3.5 Recursive Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 843.6 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 873.7 ⋆Survey of Decidability and Complexity . . . . . . . . . . . . . . . . . . . 903.8 Combination Theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 913.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

4 Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 954.1 Stepwise Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 954.2 Complete Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 994.3 Well-Founded Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1024.4 Structural Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1084.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

5 Program Correctness: Mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . 1135.1 pi: A Simple Imperative Language . . . . . . . . . . . . . . . . . . . . . . . . . 114

5.1.1 The Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1155.1.2 Program Annotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

5.2 Partial Correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1235.2.1 Basic Paths: Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1255.2.2 Basic Paths: Function Calls . . . . . . . . . . . . . . . . . . . . . . . . . 131

Contents XIII

5.2.3 Program States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1355.2.4 Verification Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1365.2.5 P -Invariant and P -Inductive . . . . . . . . . . . . . . . . . . . . . . . . 142

5.3 Total Correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1435.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

6 Program Correctness: Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . 1536.1 Developing Inductive Annotations . . . . . . . . . . . . . . . . . . . . . . . . . 153

6.1.1 Basic Facts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1546.1.2 The Precondition Method . . . . . . . . . . . . . . . . . . . . . . . . . . 1566.1.3 A Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

6.2 Extended Example: QuickSort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1646.2.1 Partial Correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1676.2.2 Total Correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171


Part II Algorithmic Reasoning

7 Quantified Linear Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1837.1 Quantifier Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

7.1.1 Quantifier Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1847.1.2 A Simplification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

7.2 Quantifier Elimination over Integers . . . . . . . . . . . . . . . . . . . . . . . 1857.2.1 Augmented Theory of Integers . . . . . . . . . . . . . . . . . . . . . . 1857.2.2 Cooper’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1877.2.3 A Symmetric Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . 1947.2.4 Eliminating Blocks of Quantifiers . . . . . . . . . . . . . . . . . . . . 1957.2.5 ⋆Solving Divides Constraints . . . . . . . . . . . . . . . . . . . . . . . 196

7.3 Quantifier Elimination over Rationals . . . . . . . . . . . . . . . . . . . . . . 2007.3.1 Ferrante and Rackoff’s Method . . . . . . . . . . . . . . . . . . . . . . 200

7.4 ⋆Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2047.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

8 Quantifier-Free Linear Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . 2078.1 Decision Procedures for Quantifier-Free Fragments . . . . . . . . . . . 2078.2 Preliminary Concepts and Notation . . . . . . . . . . . . . . . . . . . . . . . . 2098.3 Linear Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2138.4 The Simplex Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

XIV Contents

8.4.1 From M to M0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2198.4.2 Vertex Traversal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2238.4.3 ⋆Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237


9 Quantifier-Free Equality and Data Structures . . . . . . . . . . . . . . 2419.1 Theory of Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2429.2 Congruence Closure Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244

9.2.1 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2459.2.2 Congruence Closure Algorithm . . . . . . . . . . . . . . . . . . . . . . 247

9.3 Congruence Closure with DAGs . . . . . . . . . . . . . . . . . . . . . . . . . . . 2519.3.1 Directed Acyclic Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . 2519.3.2 Basic Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2549.3.3 Congruence Closure Algorithm . . . . . . . . . . . . . . . . . . . . . . 2559.3.4 Decision Procedure for TE-Satisfiability . . . . . . . . . . . . . . . 2569.3.5 ⋆Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

9.4 Recursive Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2599.5 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2639.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267

10 Combining Decision Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . 26910.1 Combining Decision Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . 26910.2 Nelson-Oppen Method: Nondeterministic Version . . . . . . . . . . . . 271

10.2.1 Phase 1: Variable Abstraction . . . . . . . . . . . . . . . . . . . . . . . 27110.2.2 Phase 2: Guess and Check . . . . . . . . . . . . . . . . . . . . . . . . . . 27310.2.3 Practical Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274

10.3 Nelson-Oppen Method: Deterministic Version . . . . . . . . . . . . . . . 27610.3.1 Convex Theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27610.3.2 Phase 2: Equality Propagation . . . . . . . . . . . . . . . . . . . . . . 27810.3.3 Equality Propagation: Implementation . . . . . . . . . . . . . . . 282

10.4 ⋆Correctness of the Nelson-Oppen Method . . . . . . . . . . . . . . . . . 28310.5 ⋆Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28710.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288

11 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29111.1 Arrays with Uninterpreted Indices . . . . . . . . . . . . . . . . . . . . . . . . . 292

11.1.1 Array Property Fragment . . . . . . . . . . . . . . . . . . . . . . . . . . 29211.1.2 Decision Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294

11.2 Integer-Indexed Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299

Contents XV

11.2.1 Array Property Fragment . . . . . . . . . . . . . . . . . . . . . . . . . . 30011.2.2 Decision Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301

11.3 Hashtables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30411.3.1 Hashtable Property Fragment . . . . . . . . . . . . . . . . . . . . . . . 30511.3.2 Decision Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306

11.4 Larger Fragments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30811.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310

12 Invariant Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31112.1 Invariant Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311

12.1.1 Weakest Precondition and Strongest Postcondition . . . . 31212.1.2 ⋆General Definitions of wp and sp . . . . . . . . . . . . . . . . . . . 31512.1.3 Static Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31612.1.4 Abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

12.2 Interval Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32512.3 Karr’s Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33312.4 ⋆Standard Notation and Concepts . . . . . . . . . . . . . . . . . . . . . . . . . 34112.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345

13 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357

Part I

Foundations

Everything is vague to a degree you do not realize till you have triedto make it precise.

— Bertrand RussellPhilosophy of Logical Atomism, 1918

Modern design and implementation of software and hardware systems lacksprecision. Design documents written in a natural language admit misinterpre-tation. Informal arguments about why a system works miss crucial weaknesses.The resulting systems are fragile. Part I of this book presents an alternativeapproach to system design and implementation based on using a formal lan-guage to specify and reason about software systems.

Chapters 1 and 2 introduce the (first-order) predicate calculus. Chapter 1presents the propositional calculus, and Chapter 2 presents the full predicatecalculus. A central task is determining whether formulae of the calculus arevalid. Chapter 3 formalizes common data types of software in the predicatecalculus. It also introduces the concepts of decidability and complexity ofdeciding validity of formulae.

The final three chapters of Part I discuss applications of the predicate cal-culus. Chapter 4 formalizes mathematical induction in the predicate calculus,in the process introducing several forms of induction that may be new to thereader. Chapters 5 and 6 then apply the predicate calculus and mathemat-ical induction to the specification and verification of software. Specificationconsists of asserting facts about software. Verification applies mathematicalinduction to prove that each assertion evaluates to true when program con-trol reaches it; and to prove that program control eventually reaches specificprogram locations.

Part I thus provides the mathematical foundations for precise engineering.Part II will investigate algorithmic aspects of applying these foundations.

1

Propositional Logic

A deduction is speech in which, certain things having been supposed,something different from the things supposed results of necessity be-cause of their being so.

— AristotlePrior Analytics, 4th century BC

A calculus is a set of symbols and a system of rules for manipulating thesymbols. In an interesting calculus, the symbols and rules have meaning insome domain that matters. For example, the differential calculus defines rulesfor manipulating the integral symbol over a polynomial to compute the areaunder the curve that the polynomial defines. Area has meaning outside of thecalculus; the calculus provides the tool for computing such quantities. Thedomain of the differential calculus, loosely speaking, consists of real numbersand functions over those numbers.

Computer scientists are interested in a different domain and thus requirea different calculus. The behavior of programs, or computation, is a computerscientist’s chief concern. What is an appropriate domain for studying com-putation? The basic entity of the domain is state: roughly, the assignment ofvalues (for example, Booleans, integers, or addresses) to variables. Pairs ofstates comprise transitions. A computation is a sequence of states, each ad-jacent pair of which is a transition. A program defines the form of its states,the set of transitions between states, and the set of computations that it canproduce. A program’s set of computations characterizes the program itself asprecisely as its source code. Chapter 5 studies these ideas in depth.

With a domain in mind, a computer scientist can now ask questions. Doesthis program that accepts an array of integers produce a sorted array? Inother words, does each of the program’s computations have a state in which asorted array is returned? Does this program ever access unallocated memory?Does this function always halt? To answer such questions, we need a calculusto reason about computations.

4 1 Propositional Logic

This chapter and the next introduce the calculus that will be the basis forstudying computation in this book. In this chapter, we cover propositionallogic (PL); in the next chapter, we build on the presentation to define first-order logic (FOL). PL and FOL are also known as propositional calculusand predicate calculus, respectively, because they are calculi for reasoningabout propositions (“the sky is blue”, “this comment references itself”) andpredicates (“x is blue”, “y references z”), respectively. Propositions are eithertrue or false, while predicates evaluate to true or false depending on the valuesgiven to their parameters (x, y, and z).

Just as differential calculus has a set of symbols, a set of rules, and amapping to reality that provides its meaning, propositional logic has its ownsymbols, rules of inference, and meaning. Sections 1.1 and 1.2 introduce thesyntax and semantics (meaning) of PL formulae. Then Section 1.3 discussestwo concepts that are fundamental throughout this book, satisfiability (Isthis formula ever true?) and validity (Is this formula always true?), and therules for computing whether a PL formula is satisfiable or valid. Rules formanipulating PL formulae, some of which preserve satisfiability and validity,are discussed in Section 1.5 and applied in Section 1.6.

1.1 Syntax

In this section, we introduce the syntax of PL. The syntax of a logical lan-guage consists of a set of symbols and rules for combining them to form“sentences” (in this case, formulae) of the language.

The basic elements of PL are the truth symbols ⊤ (“true”) and ⊥(“false”) and the propositional variables, usually denoted by P , Q, R,P1, P2, . . .. A countably infinite set of propositional variable symbols exists.Logical connectives, also called Boolean connectives, provide the expres-sive power of PL. A formula is simply ⊤, ⊥, or a propositional variable P ; orthe application of one of the following connectives to formulae F , F1, or F2:

• ¬F : negation, pronounced “not”;• F1 ∧ F2: conjunction, pronounced “and”;• F1 ∨ F2: disjunction, pronounced “or”;• F1 → F2: implication, pronounced “implies”;• F1 ↔ F2: iff, pronounced “if and only if”.

Each connective has an arity (the number of arguments that it takes): nega-tion is unary (it takes one argument), while the other connectives are binary(they take two arguments). The left and right arguments of → are called theantecedent and consequent, respectively.

Some common terminology is useful. An atom is a truth symbol ⊤, ⊥ orpropositional variable P , Q, . . .. A literal is an atom α or its negation ¬α. Aformula is a literal or the application of a logical connective to a formula orformulae.

1.1 Syntax 5

Formula G is a subformula of formula F if it occurs syntactically withinG. More precisely,

• the only subformula of P is P ;• the subformulae of ¬F are ¬F and the subformulae of F ;• and the subformulae of F1∧F2, F1∨F2, F1 → F2, F1 ↔ F2 are the formula

itself and the subformulae of F1 and F2.

Notice that every formula is a subformula of itself. The strict subformulaeof a formula are all its subformulae except itself.

Example 1.1. Consider the formula

F : (P ∧Q) → (P ∨ ¬Q) .

It contains two propositional variables, P and Q. Each instance of P and Qis an atom and a literal. ¬Q is a literal, but not an atom. F has six distinctsubformulae:

F , P ∨ ¬Q , ¬Q , P ∧Q , P , Q .

Its strict subformulae are all of its subformulae except F itself.

Parentheses are cumbersome. We define the relative precedence of the logi-cal connectives from highest to lowest as follows: ¬, ∧, ∨,→,↔. Additionally,let → and ↔ associate to the right, so that P → Q→ R is the same formulaas P → (Q→ R).

Example 1.2. Abbreviate F of Example 1.1 as

F ′ : P ∧Q → P ∨ ¬Q .

Also,

P1 ∧ ¬P2 ∧ ⊤ ∨ ¬P1 ∧ P2

stands for

(P1 ∧ ((¬P2) ∧ ⊤)) ∨ ((¬P1) ∧ P2) .

Finally,

P1 → P2 → P3

abbreviates

P1 → (P2 → P3) .


1.2 Semantics

So far, we have considered the syntax of PL. The semantics of a logic providesits meaning. What exactly is meaning? In PL, meaning is given by the truthvalues true and false, where true 6= false. Our objective is to define how togive meaning to formulae.

The first step in defining the semantics of PL is to provide a mechanismfor evaluating the propositional variables. An interpretation I assigns toevery propositional variable exactly one truth value. For example,

I : P 7→ true, Q 7→ false, . . .

is an interpretation assigning true to P and false to Q, where . . . elides the(countably infinitely many) assignments that are not relevant to us. That is, Iassigns to every propositional variable available to us (and there are countablyinfinitely many) a value. We usually do not write the elision. Clearly, manyinterpretations exist.

Now given a PL formula F and an interpretation I, the truth value of Fcan be computed. The simplest manner of computing the truth value of F isvia a truth table. Let us first examine truth tables that indicate how to eval-uate each logical connective in terms of its arguments. First, a propositionalvariable gets its truth value immediately from I. Now consider the possibleevaluations of F : it is either true or false. How is ¬F evaluated? The followingtable summarizes the possibilities, where 0 corresponds to the value false, and1 corresponds to true:

F ¬F0 11 0

The other connective can be defined similarly given values of F1 and F2:

F1 F2 F1 ∧ F2 F1 ∨ F2 F1 → F2 F1 ↔ F2

0 0 0 0 1 10 1 0 1 1 01 0 0 1 0 01 1 1 1 1 1

In particular, F1 → F2 is false iff F1 is true and F2 is false. (Throughout thebook, we use the word “iff” to abbreviate the phrase “if and only if”; one canalso read it as “precisely when”.)


F : P ∧Q → P ∨ ¬Q

and the interpretation

1.2 Semantics 7

I : P 7→ true, Q 7→ false .

To evaluate the truth value of F under I, construct the following table:

P Q ¬Q P ∧Q P ∨ ¬Q F

1 0 1 0 1 1

The top row is given by the subformulae of F . I provides values for the firsttwo columns; then the semantics of PL provide the values for the remainderof the table. Hence, F evaluates to true under I.

This tabular notation is convenient, but it is unsuitable for the predicatelogic of Chapter 2. Instead, we introduce an inductive definition of PL’ssemantics that will extend to Chapter 2. An inductive definition defines themeaning of basic elements first, which in the case of PL are atoms. Then itassumes that the meaning of a set of elements is fixed and defines a morecomplex element in terms of these elements. For example, in PL, F1 ∧ F2 is amore complex formula than either of the formulae F1 or F2.

Recall that we want to compute whether F has value true under inter-pretation I. We write I |= F if F evaluates to true under I and I 6|= F ifF evaluates to false. To start our inductive definition, define the meaning oftruth symbols:

I |= ⊤I 6|= ⊥

Under any interpretation I, ⊤ has value true, and ⊥ has value false. Next,define the truth value of propositional variables:

I |= P iff I[P ] = true

P has value true iff the interpretation I assigns P to have value true.Since an interpretation assigns a truth value to every propositional vari-

able, I assigns false to P when I does not assign true to P . Thus, we caninstead define the truth values of propositional variables as follows:

I 6|= P iff I[P ] = false

Since true 6= false, both definitions yield the same (unique) truth values.Having completed the base cases of our inductive definition, we turn to

the inductive step. Assume that formulae F , F1, and F2 have truth values.From these formulae, evaluate the semantics of more complex formulae:

I |= ¬F iff I 6|= FI |= F1 ∧ F2 iff I |= F1 and I |= F2

I |= F1 ∨ F2 iff I |= F1 or I |= F2

I |= F1 → F2 iff, if I |= F1 then I |= F2

I |= F1 ↔ F2 iff I |= F1 and I |= F2, or I 6|= F1 and I 6|= F2


In studying these definitions, it is useful to recall the earlier definitions givenby the truth tables, which are free of English ambiguities.

For implication, consider also the equivalent formulation

I 6|= F1 → F2 iff I |= F1 and I 6|= F2

The formula F1 → F2 has truth value true under I when either F1 is falseor F2 is true. It is false only when F1 is true and F2 is false. Our inductivedefinition of the semantics of PL is complete.


F : P ∧Q → P ∨ ¬Q

and the interpretation

I : P 7→ true, Q 7→ false .

Compute the truth value of F as follows:

1. I |= P since I[P ] = true2. I 6|= Q since I[Q] = false3. I |= ¬Q by 2 and semantics of ¬4. I 6|= P ∧Q by 2 and semantics of ∧5. I |= P ∨ ¬Q by 1 and semantics of ∨6. I |= F by 4 and semantics of →

We considered the distinct subformulae of F according to the subformulaordering: F1 precedes F2 if F1 is a subformula of F2. In that order, wecomputed the truth value of F from its simplest subformulae to its mostcomplex subformula (F itself).

The final line of the calculation deserves some explanation. According tothe semantics for implication,


the implication F1 → F2 has value true when I 6|= F1. Thus, line 5 is unnec-essary for establishing the truth value of F .

1.3 Satisfiability and Validity

We now consider a fundamental characterization of PL formulae.A formula F is satisfiable iff there exists an interpretation I such that

I |= F . A formula F is valid iff for all interpretations I, I |= F . Determiningsatisfiability and validity of formulae are important tasks in logic.

Satisfiability and validity are dual concepts, and switching from one to theother is easy. F is valid iff ¬F is unsatisfiable. For suppose that F is valid;

1.3 Satisfiability and Validity 9

then for any interpretation I, I |= F . By the semantics of negation, I 6|= ¬F ,so ¬F is unsatisfiable. Conversely, suppose that ¬F is unsatisfiable. For anyinterpretation I, I 6|= ¬F , so that I |= F by the semantics of negation. Thus,F is valid.

Because of this duality between satisfiability and validity, we are free tofocus on either one or the other in the text, depending on which is moreconvenient for the discussion. The reader should realize that statements aboutone are also statements about the other.

In this section, we present several methods of determining validity andsatisfiability of PL formulae.

1.3.1 Truth Tables

Our first approach to checking the validity of a PL formula is the truth-tablemethod. We exhibit this method by example.


F : P ∧Q → P ∨ ¬Q .

Is it valid? Construct a table in which the first row is a list of the subformulaeof F ordered according to the subformula ordering. Fill columns of proposi-tional variables with all possible combinations of truth values. Then apply thesemantics of PL to fill the rest of the table:

P Q P ∧Q ¬Q P ∨ ¬Q F

0 0 0 1 1 10 1 0 0 0 11 0 0 1 1 11 1 1 0 1 1

The final column, which represents the truth value of F under the possibleinterpretations, is filled entirely with true. F is valid.


F : P ∨Q → P ∧Q .

Construct the truth table:

P Q P ∨Q P ∧Q F

0 0 0 0 10 1 1 0 01 0 1 0 01 1 1 1 1

Because the second and third rows show that F can be false, F is invalid.


1.3.2 Semantic Arguments

Our next approach to validity checking is the semantic argument method.While more complicated than the truth-table method, we introduce it andemphasize it throughout the remainder of the chapter because it is our onlymethod of evaluating the satisfiability and validity of formulae in Chapter 2.

A proof based on the semantic method begins by assuming that the givenformula F is invalid: hence, there is a falsifying interpretation I such thatI 6|= F . The proof proceeds by applying the semantic definitions of the logicalconnectives in the form of proof rules. A proof rule has one or more premises(assumed facts) and one or more deductions (deduced facts). An applicationof a proof rule requires matching the premises to facts already existing in thesemantic argument and then forming the deductions. The proof rules are thefollowing:

• According to the semantics of negation, from I |= ¬F , deduce I 6|= F ; andfrom I 6|= ¬F , deduce I |= F :

I |= ¬FI 6|= F

I 6|= ¬FI |= F

• According to the semantics of conjunction, from I |= F ∧G, deduce bothI |= F and I |= G; and from I 6|= F ∧ G, deduce I 6|= F or I 6|= G. Thelatter deduction results in a fork in the proof; each case must be consideredseparately.

I |= F ∧G

I |= FI |= G

I 6|= F ∧G

I 6|= F | I 6|= G

• According to the semantics of disjunction, from I |= F ∨G, deduce I |= For I |= G; and from I 6|= F ∨ G, deduce both I 6|= F and I 6|= G. Theformer deduction requires a case analysis in the proof.

I |= F ∨G

I |= F | I |= G

I 6|= F ∨G

I 6|= FI 6|= G

• According to the semantics of implication, from I |= F → G, deduceI 6|= F or I |= G; and from I 6|= F → G, deduce both I |= F and I 6|= G.The former deduction requires a case analysis in the proof.

I |= F → G

I 6|= F | I |= G

I 6|= F → G

I |= FI 6|= G


• According to the semantics of iff, from I |= F ↔ G, deduce I |= F ∧G orI 6|= F ∨ G; and from I 6|= F ↔ G, deduce I |= F ∧ ¬G or I |= ¬F ∧ G.Both deductions require considering multiple cases.

I |= F ↔ G

I |= F ∧G | I 6|= F ∨G

I 6|= F ↔ G

I |= F ∧ ¬G | I |= ¬F ∧G

• Finally, a contradiction occurs when following the above proof rules resultsin the claim that an interpretation I both satisfies a formula F and doesnot satisfy F .

I |= FI 6|= FI |= ⊥

Before explaining proofs in more detail, let us see several examples.

Example 1.7. To prove that the formula

F : P ∧Q → P ∨ ¬Q

is valid, assume that it is invalid and derive a contradiction. Thus, assumethat there is a falsifying interpretation I of F (such that I 6|= F ). Then,

1. I 6|= P ∧Q → P ∨ ¬Q assumption2. I |= P ∧Q by 1 and semantics of →3. I 6|= P ∨ ¬Q by 1 and semantics of →4. I |= P by 2 and semantics of ∧5. I |= Q by 2 and semantics of ∧6. I 6|= P by 3 and semantics of ∨7. I 6|= ¬Q by 3 and semantics of ∨8. I |= Q by 7 and semantics of ¬

Lines 4 and 6 contradict each other, so that our assumption must be wrong:F is actually valid.

We can end the proof as soon as we have a contradiction. For example,

1. I 6|= P ∧Q → P ∨ ¬Q assumption2. I |= P ∧Q by 1 and semantics of →3. I 6|= P ∨ ¬Q by 1 and semantics of →4. I |= P by 2 and semantics of ∧5. I 6|= P by 3 and semantics of ∨

This argument is sufficient because a contradiction already exists. In otherwords, the discovered contradiction closes the one branch of the proof. Wesometimes note the contradiction explicitly in the proof:

6. I |= ⊥ 4 and 5 are contradictory



F : (P → Q) ∧ (Q→ R) → (P → R)

is valid, assume otherwise and derive a contradiction:

1. I 6|= F assumption2. I |= (P → Q) ∧ (Q→ R) by 1 and semantics of →3. I 6|= P → R by 1 and semantics of →4. I |= P by 3 and semantics of →5. I 6|= R by 3 and semantics of →6. I |= P → Q by 2 and semantics of ∧7. I |= Q→ R by 2 and semantics of ∧

There are two cases to consider from 6. In the first case,

8a. I 6|= P by 6 and semantics of →9a. I |= ⊥ 4 and 8a are contradictory

In the second case,

8b. I |= Q by 6 and semantics of →

Now there are two more cases from 7. In the first case,

9ba. I 6|= Q by 7 and semantics of →10ba. I |= ⊥ 8b and 9ba are contradictory

In the second case,

9bb. I |= R by 7 and semantics of →10bb. I |= ⊥ 5 and 9bb are contradictory

All three branches of the proof are closed: F is valid.

We introduce vocabulary for discussing semantic proofs. The reader neednot memorize these terms now; just refer to them as they are used. A lineL : I |= F or L : I 6|= F is a single statement in the proof, sometimes labeledas in the examples. A line L is a direct descendant of a parent M if L isdirectly below M in the proof. L is a descendant of M if M is L itself, if L isa direct descendant of M , or if the parent of L is a descendant of M (in otherwords, descendant is the reflexive and transitive closure of direct descendant).M is an ancestor of L if L is a descendant of M . Several proof rules — thesecond conjunction rule, the first disjunction rule, the first implication rule,and both rules for iff — produce a fork in the argument, as the last exampleshows. A proof thus evolves as a tree rather than linearly. A branch of thetree is a sequence of lines descending from the root. A branch is closed if itcontains a contradiction, either explicitly as I |= ⊥ or implicitly as I |= G


and I 6|= G for some formula G. Otherwise, the branch is open. A semanticargument is finished when no more proof rules are applicable. It is a proofof the validity of F if every branch is closed; otherwise, each open branchdescribes a falsifying interpretation of F .

While the given proof rules are (theoretically) sufficient, derived proofrules can make proofs more concise.

Example 1.9. The derived rule of modus ponens simplifies the proof ofExample 1.8. The rule is the following:

I |= FI |= F → GI |= G

In words, from I |= F and I |= F → G, deduce I |= G.Using this rule, let us simplify the proof of the validity of

F : (P → Q) ∧ (Q→ R) → (P → R) .

We assume that it is invalid and try to derive a contradiction.

1. I 6|= F assumption2. I |= (P → Q) ∧ (Q→ R) by 1 and semantics of →3. I 6|= P → R by 1 and semantics of →4. I |= P by 3 and semantics of →5. I 6|= R by 3 and semantics of →6. I |= P → Q by 2 and semantics of ∧7. I |= Q→ R by 2 and semantics of ∧8. I |= Q by 4, 6, and modus ponens9. I |= R by 8, 7, and modus ponens10. I |= ⊥ 5 and 9 are contradictory

This proof has only one branch.

The truth-table and semantic methods can be used to check satisfiability.For example, the truth table of Example 1.6 can be extended to show that

¬F : ¬(P ∨Q → P ∧Q)

is satisfiable:

P Q P ∨Q P ∧Q F ¬F0 0 0 0 1 00 1 1 0 0 11 0 1 0 0 11 1 1 1 1 0

The second and third rows represent satisfying interpretations of ¬F . Addi-tionally, the semantic argument in the following example shows that


G : ¬(P ∨Q → P ∧Q)

is satisfied by the discovered interpretation I, and thus that G is satisfiable.


F : P ∨Q → P ∧Q

is valid, assume that F is invalid; then there is an interpretation I such thatI |= ¬F :

1. I 6|= P ∨Q → P ∧Q assumption2. I |= P ∨Q by 1 and semantics of →3. I 6|= P ∧Q by 1 and semantics of →

We have two choices to make. By 2 and the semantics of disjunction, eitherP or Q must be true. By 3 and the semantics of conjunction, either P or Qmust be false. So there are two options: either P is true and Q is false, or P isfalse and Q is true. We choose P to be true and Q to be false. Then,

4a. I |= P by 2 and semantics of ∨5a. I 6|= Q by 3 and semantics of ∧

The only subformulae of P and Q are themselves, so the table is complete.Yet we did not derive a contradiction. In fact, we found the interpretation

I : P 7→ true, Q 7→ false

for which I |= ¬F . Therefore, F is actually invalid. The interpretation I :P 7→ true, Q 7→ false is a falsifying interpretation.

If our choice had resulted in a contradiction, then we would have had totry the other choice for P and Q, in which P is false and Q is true. In general,we stop either when we have found an interpretation or when we have closedevery branch.

1.4 Equivalence and Implication

Just as satisfiability and validity are important properties of PL formulae,equivalence and implication are important properties of pairs of formulae.Two formulae F1 and F2 are equivalent if they evaluate to the same truthvalue under all interpretations I. That is, for all interpretations I, I |= F1

iff I |= F2. Another way to state the equivalence of F1 and F2 is to assertthe validity of the formula F1 ↔ F2. We write F1 ⇔ F2 when F1 and F2 areequivalent. F1 ⇔ F2 is not a formula; it simply abbreviates the statement “F1

and F2 are equivalent.”We use the last characterization to prove that two formulae are equivalent.

1.4 Equivalence and Implication 15

Example 1.11. To prove that

P ⇔ ¬¬P ,

we prove that

P ↔ ¬¬P

is valid via a truth table:

P ¬P ¬¬P P ↔ ¬¬P0 1 0 11 0 1 1

Example 1.12. To prove

P → Q ⇔ ¬P ∨Q ,

we prove that

F : P → Q ↔ ¬P ∨Q

is valid via a truth table:

P Q P → Q ¬P ¬P ∨Q F

0 0 1 1 1 10 1 1 1 1 11 0 0 0 0 11 1 1 0 1 1

Formula F1 implies formula F2 if I |= F2 for every interpretation I suchthat I |= F1. Another way to state that F1 implies F2 is to assert the validityof the formula F1 → F2. We write F1 ⇒ F2 when F1 implies F2. Do notconfuse the implication F1 ⇒ F2, which asserts the validity of F1 → F2, withthe PL formula F1 → F2, which is constructed using the logical operator →.F1 ⇒ F2 is not a formula.

As with equivalences, we use the validity characterization to prove impli-cations.


R ∧ (¬R ∨ P ) ⇒ P ,

we prove that


F : R ∧ (¬R ∨ P ) → P

is valid via a semantic argument. Suppose F is not valid; then there exists aninterpretation I such that I 6|= F :

1. I 6|= F assumption2. I |= R ∧ (¬R ∨ P ) by 1 and semantics of →3. I 6|= P by 1 and semantics of →4. I |= R by 2 and semantics of ∧5. I |= ¬R ∨ P by 2 and semantics of ∧

There are two cases to consider. In the first case,

6a. I |= ¬R by 5 and semantics of ∨7a. I |= ⊥ 4 and 6a are contradictory

In the second case,

6b. I |= P by 5 and semantics of ∨7b. I |= ⊥ 3 and 6b are contradictory

Thus, our assumption that I 6|= F is wrong, and F is valid.

1.5 Substitution

Substitution is a syntactic operation on formulae with significant semanticconsequences. It allows us to prove the validity of entire sets of formulae viaformula templates. It is also an essential tool for manipulating formulaethroughout the text.

A substitution σ is a mapping from formulae to formulae:

σ : F1 7→ G1, . . . , Fn 7→ Gn .

The domain of σ, domain(σ), is

domain(σ) : F1, . . . , Fn ,

while the range range(σ) is

range(σ) : G1, . . . , Gn .

The application of a substitution σ to a formula F , Fσ, replaces each occur-rence of a formula Fi in the domain of σ with its corresponding formula Gi inthe range of σ. Replacements occur all at once. We remove any ambiguity byestablishing that when both subformulae Fj and Fk are in the domain of σ,and Fk is a strict subformula of Fj , then the larger subformula Fj is replacedby the corresponding formula Gj . An example clarifies this statement.

1.5 Substitution 17

Example 1.14. Consider formula

F : P ∧Q → P ∨ ¬Q

and substitution

σ : P 7→ R, P ∧Q 7→ P → Q .

Then

Fσ : (P → Q) → R ∨ ¬Q ,

where the antecedent P ∧ Q of F is replaced by P → Q, and the P of theconsequent is replaced by R. Moreover,

Fσ 6= R ∧Q → R ∨ ¬Q

by our convention.

A variable substitution is a substitution in which the domain consistsonly of propositional variables.

One notation is useful when working with substitutions. When we writeF [F1, . . . , Fn], we mean that formula F can have formulae Fi, i = 1, . . . , n, assubformulae. If σ is F1 7→ G1, . . . , Fn 7→ Gn, then

F [F1, . . . , Fn]σ : F [G1, . . . , Gn] .

In the formula of Example 1.14, writing

F [P, P ∧Q]σ : F [R, P → Q]

emphasizes that subformulae P and P ∧ Q of F are replaced by formulae Rand P → Q, respectively.

Two interesting semantic consequences can be derived from substitution.Proposition 1.15 states that substituting subformulae Fi of F with correspond-ing equivalent subformulae Gi results in an equivalent formula F ′.

Proposition 1.15 (Substitution of Equivalent Formulae). Considersubstitution

σ : F1 7→ G1, . . . , Fn 7→ Gn

such that for each i, Fi ⇔ Gi. Then F ⇔ Fσ.

Example 1.16. Consider applying substitution

σ : P → Q 7→ ¬P ∨Q

to


F : (P → Q) → R .

Since P → Q ⇔ ¬P ∨Q, the formula

Fσ : (¬P ∨Q) → R

is equivalent to F .

Proposition 1.17 asserts that proving the validity of a PL formula F actu-ally proves the validity of an infinite set of formulae: those formulae that canbe derived from F via variable substitutions.

Proposition 1.17 (Valid Template). If F is valid and G = Fσ for somevariable substitution σ, then G is valid.

Example 1.18. In Example 1.12, we proved that P → Q is equivalent to¬P ∨Q:

F : (P → Q) ↔ (¬P ∨Q)

is valid. The validity of F implies that every formula of the form F1 → F2 isequivalent to ¬F1 ∨ F2, for arbitrary subformulae F1 and F2.

Finally, it is often useful to compute the composition of substitutions.Given substitutions σ1 and σ2, the idea is to compute substitution σ such thatFσ1σ2 = Fσ for any F . Compute σ1σ2 as follows:

1. apply σ2 to each formula of the range of σ1, and add the results to σ;2. if Fi of Fi 7→ Gi appears in the domain of σ2 but not in the domain of

σ1, then add Fi 7→ Gi to σ.

Example 1.19. Compute the composition of substitutions

σ1σ2 : P 7→ R, P ∧Q 7→ P → QP 7→ S, S 7→ Q

as follows:

P 7→ Rσ2, P ∧Q 7→ (P → Q)σ2, S 7→ Q= P 7→ R, P ∧Q 7→ S → Q, S 7→ Q

1.6 Normal Forms

A normal form of formulae is a syntactic restriction such that for everyformula of the logic, there is an equivalent formula in the normal form. Threenormal forms are particularly important for PL.

1.6 Normal Forms 19

Negation normal form (NNF) requires that ¬, ∧, and ∨ be the onlyconnectives and that negations appear only in literals. Transforming a formulaF to equivalent formula F ′ in NNF can be computed recursively using thefollowing list of template equivalences:

¬¬F1 ⇔ F1

¬⊤ ⇔ ⊥¬⊥ ⇔ ⊤

¬(F1 ∧ F2) ⇔ ¬F1 ∨ ¬F2

¬(F1 ∨ F2) ⇔ ¬F1 ∧ ¬F2

F1 → F2 ⇔ ¬F1 ∨ F2

F1 ↔ F2 ⇔ (F1 → F2) ∧ (F2 → F1)

When implementing the transformation, the equivalences should be appliedleft-to-right. The equivalences

¬(F1 ∧ F2) ⇔ ¬F1 ∨ ¬F2 ¬(F1 ∨ F2) ⇔ ¬F1 ∧ ¬F2

are known as De Morgan’s Law.Propositions 1.15 and 1.17 justify that the result of applying the template

equivalences to a formula produces an equivalent formula. The transitivity ofequivalence justifies that this equivalence holds over any number of transfor-mations: if F ⇔ G and G⇔ H , then F ⇔ H .

Example 1.20. To convert the formula

F : ¬(P → ¬(P ∧Q))

into NNF, apply the template equivalence

F1 → F2 ⇔ ¬F1 ∨ F2 (1.1)

to produce

F ′ : ¬(¬P ∨ ¬(P ∧Q)) .

Let us understand this “application” of the template equivalence in detail.First, apply variable substitution

σ1 : F1 7→ P, F2 7→ ¬(P ∧Q)

to the valid template formula of equivalence (1.1):

(F1 → F2 ↔ ¬F1 ∨ F2)σ1 : P → ¬(P ∧Q) ↔ ¬P ∨ ¬(P ∧Q) .

Proposition 1.17 implies that the result is valid. Then construct substitution

σ2 : P → ¬(P ∧Q) 7→ ¬P ∨ ¬(P ∧Q) ,


and apply Proposition 1.15 to Fσ2 to yield that

F ′ : ¬(¬P ∨ ¬(P ∧Q))

is equivalent to F . Subsequently, we shall not provide these details.Continuing with the conversion to NNF, apply De Morgan’s law

¬(F1 ∨ F2) ⇔ ¬F1 ∧ ¬F2

to produce

F ′′ : ¬¬P ∧ ¬¬(P ∧Q) .

Apply

¬¬F1 ⇔ F1

twice to produce

F ′′′ : P ∧ P ∧Q ,

which is in NNF and equivalent to F .

A formula is in disjunctive normal form (DNF) if it is a disjunctionof conjunctions of literals:∨

i

∧

j

ℓi,j for literals ℓi,j .

To convert a formula F into an equivalent formula in DNF, transform F intoNNF and then use the following table of template equivalences:

(F1 ∨ F2) ∧ F3 ⇔ (F1 ∧ F3) ∨ (F2 ∧ F3)F1 ∧ (F2 ∨ F3) ⇔ (F1 ∧ F2) ∨ (F1 ∧ F3)

Again, when implementing the transformation, the equivalences should beapplied left-to-right. The equivalences simply say that conjunction distributesover disjunction.

Example 1.21. To convert

F : (Q1 ∨ ¬¬Q2) ∧ (¬R1 → R2)

into DNF, first transform it into NNF

F ′ : (Q1 ∨Q2) ∧ (R1 ∨R2) ,

and then apply distributivity to obtain

F ′′ : (Q1 ∧ (R1 ∨R2)) ∨ (Q2 ∧ (R1 ∨R2)) ,

and then distributivity twice again to produce

F ′′′ : (Q1 ∧R1) ∨ (Q1 ∧R2) ∨ (Q2 ∧R1) ∨ (Q2 ∧R2) .

F ′′′ is in DNF and is equivalent to F .

1.7 Decision Procedures for Satisfiability 21

The dual of DNF is conjunctive normal form (CNF). A formula inCNF is a conjunction of disjunctions of literals:

∧

i

∨

j

ℓi,j for literals ℓi,j .

Each inner block of disjunctions is called a clause. To convert a formula Finto an equivalent formula in CNF, transform F into NNF and then use thefollowing table of template equivalences:

(F1 ∧ F2) ∨ F3 ⇔ (F1 ∨ F3) ∧ (F2 ∨ F3)F1 ∨ (F2 ∧ F3) ⇔ (F1 ∨ F2) ∧ (F1 ∨ F3)

Example 1.22. To convert

F : (Q1 ∧ ¬¬Q2) ∨ (¬R1 → R2)

into CNF, first transform F into NNF:

F ′ : (Q1 ∧Q2) ∨ (R1 ∨R2) .

Then apply distributivity to obtain

F ′′ : (Q1 ∨R1 ∨R2) ∧ (Q2 ∨R1 ∨R2) ,

which is in CNF and equivalent to F .

1.7 Decision Procedures for Satisfiability

Section 1.3 introduced the truth-table and semantic argument methods fordetermining the satisfiability of PL formulae. In this section, we study al-gorithms for deciding satisfiability (see Section 2.6 for a formal discussion ofdecidability). A decision procedure for satisfiability of PL formulae reports,after some finite amount of computation, whether a given PL formula F issatisfiable.

1.7.1 Simple Decision Procedures

The truth-table method immediately suggests a decision procedure: constructthe full table, which has 2n rows when F has n variables, and report whetherthe final column, representing F , has value 1 in any row.

The semantic argument method also suggests a decision procedure. Thebasic idea is to make sure that a proof rule is only applied to each line inthe argument at most once. Because each deduction is simpler in constructionthan its premise, the constructed proof is of finite size (see Chapter 4 for


a formal approach to proving this point). When the semantic argument isfinished, report whether any branch is still open.

This simple description leaves out many details. Most importantly, whenmany lines exist to which one can apply proof rules, which line should be con-sidered next? Different implementations of this decision, called proof tactics,result in different proof shapes and sizes. For example, one basic tactic is toapply proof rules with only one deduction before proof rules with multipledeductions to delay forks in the proof as long as possible.

Subsequent sections consider more sophisticated procedures that are thebasis for modern satisfiability solvers.

1.7.2 Reconsidering the Truth-Table Method

In the naive decision procedure based on the truth-table method, the entiretable is constructed. Actually, only one row need be considered at a time, mak-ing for a space efficient procedure. This idea is implemented in the followingrecursive algorithm for deciding the satisfiability of a PL formula F :

let rec sat F =if F = ⊤ then true

else if F = ⊥ then false

else

let P = choose vars(F ) in(sat FP 7→ ⊤) ∨ (sat FP 7→ ⊥)

The notation “let rec sat F =” declares sat as a recursive function thattakes one argument, a formula F . The notation “let P = choose vars(F ) in”means that P ’s value in the subsequent text is the variable returned by thechoose function. When applying the substitutions FP 7→ ⊤ or FP 7→ ⊥,the template equivalences of Exercise 1.2 should be applied to simplify theresult. Then the comparisons F = ⊤ and F = ⊥ can be implemented aspurely syntactic operations.

At each recursive step, if F is not yet ⊤ or ⊥, a variable is chosen on whichto branch. Each possibility for P is attempted if necessary. This algorithmreturns true immediately upon finding a satisfying interpretation. Otherwise,if F is unsatisfiable, it eventually returns⊥. sat may save branching on certainvariables by simplifying intermediate formulae.


F : (P → Q) ∧ P ∧ ¬Q .

To compute sat F , choose a variable, say P , and recurse on the first case,

FP 7→ ⊤ : (⊤ → Q) ∧⊤ ∧ ¬Q ,

which simplifies to


F

F1 : Q ∧ ¬Q ⊥

⊥ ⊥

P 7→ ⊤ P 7→ ⊥

Q 7→ ⊤ Q 7→ ⊥

F

⊥ ⊤

P 7→ ⊤ P 7→ ⊥

(a) (b)

Fig. 1.1. Visualizing runs of sat

F1 : Q ∧ ¬Q .

Now try each of

F1Q 7→ ⊤ and F1Q 7→ ⊥ .

Both simplify to ⊥, so this branch ends without finding a satisfying interpre-tation.

Now try the other branch for P in F :

FP 7→ ⊥ : (⊥ → Q) ∧⊥ ∧ ¬Q ,

which simplifies to ⊥. Thus, this branch also ends without finding a satisfyinginterpretation. Thus, F is unsatisfiable.

The run of sat on F is visualized in Figure 1.1(a).


F : (P → Q) ∧ ¬P .

To compute sat F , choose a variable, say P , and recurse on the first case,

FP 7→ ⊤ : (⊤ → Q) ∧ ¬⊤ ,

which simplifies to ⊥. Therefore, try

FP 7→ ⊥ : (⊥ → Q) ∧ ¬⊥

instead, which simplifies to ⊤. Arbitrarily assigning a value to Q produces thefollowing satisfying interpretation:

I : P 7→ false, Q 7→ true .

The run of sat on F is visualized in Figure 1.1(b).


→ Rep(F )

Rep(P ∨ Q) ∨ ¬ Rep(¬(P ∧ ¬R))

P Q ∧ Rep(P ∧ ¬R)

P ¬ Rep(¬R)

R

Fig. 1.2. Parse tree of F : P∨Q → ¬(P∧¬R) with representatives for subformulae

1.7.3 Conversion to an Equisatisfiable Formula in CNF

The next two decision procedures operate on PL formulae in CNF. The trans-formation suggested in Section 1.6 produces an equivalent formula that can beexponentially larger than the original formula: consider converting a formulain DNF into CNF. However, to decide the satisfiability of F , we need onlyexamine a formula F ′ such that F and F ′ are equisatisfiable. F and F ′ areequisatisfiable when F is satisfiable iff F ′ is satisfiable.

We define a method for converting PL formula F to equisatisfiable PLformula F ′ in CNF that is at most a constant factor larger than F . The mainidea is to introduce new propositional variables to represent the subformulaeof F . The constructed formula F ′ includes extra clauses that assert that thesenew variables are equivalent to the subformulae that they represent.

Figure 1.2 visualizes the idea of the procedure. Each node of the “parsetree” of F represents a subformula G of F . With each node G is associated arepresentative propositional variable Rep(G). In the constructed formula F ′,each representative Rep(G) is asserted to be equivalent to the subformula Gthat it represents in such a way that the conjunction of all such assertions isin CNF. Finally, the representative Rep(F ) of F is asserted to be true.

To obtain a small formula in CNF, each assertion of equivalence betweenRep(G) and G refers at most to the children of G in the parse tree. How is thispossible when a subformula may be arbitrarily large? The main trick is to referto the representatives of G’s children rather than the children themselves.

Let the “representative” function Rep : PL→ V∪⊤,⊥map PL formulaeto propositional variables V , ⊤, or ⊥. In the general case, it is intended tomap a formula F to its representative propositional variable PF such that thetruth value of PF is the same as that of F . In other words, PF provides acompact way of referring to F .

Let the “encoding” function En : PL→ PL map PL formulae to PL formu-lae. En is intended to map a PL formula F to a PL formula F ′ in CNF thatasserts that F ’s representative, PF , is equivalent to F : “Rep(F )↔ F”.


As the base cases for defining Rep and En, define their behavior on ⊤, ⊥,and propositional variables P :

Rep(⊤) = ⊤ En(⊤) = ⊤Rep(⊥) = ⊥ En(⊥) = ⊤Rep(P ) = P En(P ) = ⊤

The representative of ⊤ is ⊤ itself, and the representative of ⊥ is ⊥ itself.Thus, Rep(⊤) ↔ ⊤ and Rep(⊥) ↔ ⊥ are both trivially valid, so En(⊤) andEn(⊥) are both ⊤. Finally, the representative of a propositional variable P isP itself; and again, Rep(P )↔ P is trivially valid so that En(P ) is ⊤.

For the inductive case, F is a formula other than an atom, so define itsrepresentative as a unique propositional variable PF :

Rep(F ) = PF .

En then asserts the equivalence of F and PF as a CNF formula. On conjunc-tion, define

En(F1 ∧ F2) =let P = Rep(F1 ∧ F2) in(¬P ∨ Rep(F1)) ∧ (¬P ∨ Rep(F2)) ∧ (¬Rep(F1) ∨ ¬Rep(F2) ∨ P )

The returned formula

(¬P ∨ Rep(F1)) ∧ (¬P ∨ Rep(F2)) ∧ (¬Rep(F1) ∨ ¬Rep(F2) ∨ P )

is in CNF and is equivalent to

Rep(F1 ∧ F2) ↔ Rep(F1) ∧ Rep(F2) .

In detail, the first two clauses

(¬P ∨ Rep(F1)) ∧ (¬P ∨ Rep(F2))

together assert

P → Rep(F1) ∧ Rep(F2)

(since, for example, ¬P ∨ Rep(F1) is equivalent to P → Rep(F1)), while thefinal clause asserts

Rep(F1) ∧ Rep(F2) → P .

Notice the application of Rep to F1 and F2. As mentioned above, it is thetrick to producing a small CNF formula.

On negation, En(¬F ) returns a formula equivalent to Rep(¬F ) ↔ ¬Rep(F ):

En(¬F ) =let P = Rep(¬F ) in(¬P ∨ ¬Rep(F )) ∧ (P ∨ Rep(F ))


En is defined for ∨, →, and ↔ as well:

En(F1 ∨ F2) =let P = Rep(F1 ∨ F2) in(¬P ∨ Rep(F1) ∨ Rep(F2)) ∧ (¬Rep(F1) ∨ P ) ∧ (¬Rep(F2) ∨ P )

En(F1 → F2) =let P = Rep(F1 → F2) in(¬P ∨ ¬Rep(F1) ∨ Rep(F2)) ∧ (Rep(F1) ∨ P ) ∧ (¬Rep(F2) ∨ P )

En(F1 ↔ F2) =let P = Rep(F1 ↔ F2) in(¬P ∨ ¬Rep(F1) ∨ Rep(F2)) ∧ (¬P ∨ Rep(F1) ∨ ¬Rep(F2))∧ (P ∨ ¬Rep(F1) ∨ ¬Rep(F2)) ∧ (P ∨ Rep(F1) ∨ Rep(F2))

Having defined En, let us construct the full CNF formula that is equisat-isfiable to F . If SF is the set of all subformulae of F (including F itself),then

F ′ : Rep(F ) ∧∧

G∈SF

En(G)

is in CNF and is equisatisfiable to F . The second main conjunct asserts theequivalences between all subformulae of F and their corresponding represen-tatives. Rep(F ) asserts that F ’s representative, and thus F itself (accordingto the second conjunct), is true.

If F has size n, where each instance of a logical connective or a proposi-tional variable contributes one unit of size, then F ′ has size at most 30n + 2.The size of F ′ is thus linear in the size of F . The number of symbols in theformula returned by En(F1 ↔ F2), which incurs the largest expansion, is 29.Up to one additional conjunction is also required per symbol of F . Finally,two extra symbols are required for asserting that Rep(F ) is true.


F : (Q1 ∧Q2) ∨ (R1 ∧R2) ,

which is in DNF. To convert it to CNF, we collect its subformulae

SF : Q1, Q2, Q1 ∧Q2, R1, R2, R1 ∧R2, F

and compute

En(Q1) = ⊤En(Q2) = ⊤

En(Q1 ∧Q2) = (¬P(Q1∧Q2) ∨Q1) ∧ (¬P(Q1∧Q2) ∨Q2)∧ (¬Q1 ∨ ¬Q2 ∨ P(Q1∧Q2))


En(R1) = ⊤En(R2) = ⊤

En(R1 ∧R2) = (¬P(R1∧R2) ∨R1) ∧ (¬P(R1∧R2) ∨R2)∧ (¬R1 ∨ ¬R2 ∨ P(R1∧R2))

En(F ) = (¬P(F ) ∨ P(Q1∧Q2) ∨ P(R1∧R2))∧ (¬P(Q1∧Q2) ∨ P(F ))∧ (¬P(R1∧R2) ∨ P(F ))

Then

F ′ : P(F ) ∧∧

G∈SF

En(G)

is equisatisfiable to F and is in CNF.

1.7.4 The Resolution Procedure

The next decision procedure that we consider is based on resolution andapplies only to PL formulae in CNF. Therefore, the procedure of Section1.7.3 must first be applied to the given PL formula if it is not already in CNF.

Resolution follows from the following observation of any PL formula F inCNF: to satisfy clauses C1[P ] and C2[¬P ] that share variable P but disagreeon its value, either the rest of C1 or the rest of C2 must be satisfied. Why? IfP is true, then a literal other than ¬P in C2 must be satisfied; while if P isfalse, then a literal other than P in C1 must be satisfied. Therefore, the clauseC1[⊥] ∨ C2[⊥], simplified according to the template equivalences of Exercise1.2, can be added as a conjunction to F to produce an equivalent formula stillin CNF.

Clausal resolution is stated as the following proof rule:

C1[P ] C2[¬P ]C1[⊥] ∨C2[⊥]

From the two clauses of the premise, deduce the new clause, called the resol-vent.

If ever ⊥ is deduced via resolution, F must be unsatisfiable since F ∧⊥ isunsatisfiable. Otherwise, if every possible resolution produces a clause that isalready known, then F must be satisfiable.

Example 1.26. The CNF of (P → Q) ∧ P ∧ ¬Q is the following:

F : (¬P ∨Q) ∧ P ∧ ¬Q .

From resolution

(¬P ∨Q) PQ

,


construct

F1 : (¬P ∨Q) ∧ P ∧ ¬Q ∧ Q .

From resolution

¬Q Q⊥ ,

deduce that F , and thus the original formula, is unsatisfiable.


F : (¬P ∨Q) ∧ ¬Q .

The one possible resolution

(¬P ∨Q) ¬Q¬P

yields

F1 : (¬P ∨Q) ∧ ¬Q ∧ ¬P .

Since no further resolutions are possible, F is satisfiable. Indeed,

I : P 7→ false, Q 7→ false

is a satisfying interpretation. A CNF formula that does not contain the clause⊥ and to which no more resolutions can be applied represents all possiblesatisfying interpretations.

1.7.5 DPLL

Modern satisfiability procedures for propositional logic are based on the Davis-Putnam-Logemann-Loveland algorithm (DPLL), which combines the space-efficient procedure of Section 1.7.2 with a restricted form of resolution. Wereview in this section the basic algorithm. Much research in the past decadehas advanced the state-of-the-art considerably.

Like the resolution procedure, DPLL operates on PL formulae in CNF.But again, as the procedure decides satisfiability, we can apply the conversionprocedure of Section 1.7.3 to produce a small equisatisfiable CNF formula.

As in the procedure sat, DPLL attempts to construct an interpretation ofF ; failing to do so, it reports that the given formula is unsatisfiable. Ratherthan relying solely on enumerating possibilities, however, DPLL applies arestricted form of resolution to gain some deductive power. The process ofapplying this restricted resolution as much as possible is called Boolean con-straint propagation (BCP).


BCP is based on unit resolution. Unit resolution operates on two clauses.One clause, called the unit clause, consists of a single literal ℓ (ℓ = P orℓ = ¬P for some propositional variable P ). The second clause contains thenegation of ℓ: C[¬ℓ]. Then unit resolution is the deduction

ℓ C[¬ℓ]C[⊥]

.

Unlike with full resolution, the literals of the resolvent are a subset of theliterals of the second clause. Hence, the resolvent replaces the second clause.

Example 1.28. In the formula

F : (P ) ∧ (¬P ∨Q) ∧ (R ∨ ¬Q ∨ S) ,

(P ) is a unit clause. Therefore, applying unit resolution

P (¬P ∨Q)Q

produces

F ′ : (Q) ∧ (R ∨ ¬Q ∨ S) .

Applying unit resolution again

Q R ∨ ¬Q ∨ SR ∨ S

produces

F ′′ : (R ∨ S) ,

ending this round of BCP.

The implementation of DPLL is structurally similar to sat, except that itbegins by applying BCP:

let rec dpll F =let F ′ = bcp F in

if F ′ = ⊤ then true

else if F ′ = ⊥ then false

else

let P = choose vars(F ′) in(dpll F ′P 7→ ⊤) ∨ (dpll F ′P 7→ ⊥)

As in sat, intermediate formulae are simplified according to the templateequivalences of Exercise 1.2.


F

(R) ∧ (¬R) ∧ (P ∨ ¬R) (¬P ∨ R)

¬P

⊥ I : P 7→ false, Q 7→ false, R 7→ true

Q 7→ ⊤

R (¬R)

⊥

Q 7→ ⊥

R 7→ ⊤

P 7→ ⊥

Fig. 1.3. Visualization of Example 1.30

One easy optimization is the following: if variable P appears only positivelyor only negatively in F , it should not be chosen by choose vars(F ′). P appearsonly positively when every P -literal is just P ; P appears only negatively whenevery P -literal is ¬P . In both cases, F is equisatisfiable to the formula F ′

constructed by removing all clauses containing an instance of P . Therefore,these clauses do not contribute to BCP. When only such variables remain,the formula must be satisfiable: a full interpretation can be constructed bysetting each variable’s value based on whether it appears only positively (true)or only negatively (false).

The values to which propositional variables are set on the path to a solutioncan be recorded so that DPLL can return a satisfying interpretation if oneexists, rather than just true.


F : (P ) ∧ (¬P ∨Q) ∧ (R ∨ ¬Q ∨ S) .

On the first level of recursion, dpll recognizes the unit clause (P ) and appliesthe BCP steps from Example 1.28, resulting in the formula

F ′′ : R ∨ S .

The unit resolutions correspond to the partial interpretation

P 7→ true, Q 7→ true .

Only positively occurring variables remain, so F is satisfiable. In particular,

P 7→ true, Q 7→ true, R 7→ true, S 7→ true

is a satisfying interpretation of F .Branching was not required in this example.


F : (¬P ∨Q ∨R) ∧ (¬Q ∨R) ∧ (¬Q ∨ ¬R) ∧ (P ∨ ¬Q ∨ ¬R) .

1.8 Summary 31

On the first level of recursion, dpll must branch. Branching on Q or R willresult in unit clauses; choose Q.

Then

FQ 7→ ⊤ : (R) ∧ (¬R) ∧ (P ∨ ¬R) .

The unit resolution

R (¬R)⊥

finishes this branch.On the other branch,

FQ 7→ ⊥ : (¬P ∨R) .

P appears only negatively, and R appears only positively, so the formula issatisfiable. In particular, F is satisfied by interpretation

I : P 7→ false, Q 7→ false, R 7→ true .

This run of dpll is visualized in Figure 1.3.

1.8 Summary

This chapter introduces propositional logic (PL). It covers:

• Its syntax. How one constructs a PL formula. Propositional variables,atoms, literals, logical connectives.

• Its semantics. What a PL formula means. Truth values true and false.Interpretations. Truth-table definition, inductive definition.

• Satisfiability and validity. Whether a PL formula evaluates to true underany or all interpretations. Duality of satisfiability and validity, truth-tablemethod, semantic argument method.

• Equivalence and implication. Whether two formulae always evaluate to thesame truth value under every interpretation. Whether under any interpre-tation, if one formula evaluates to true, the other also evaluates to true.Reduction to validity.

• Substitution, which is a tool for manipulating formulae and making generalclaims. Substitution of equivalent formulae. Valid templates.

• Normal forms. A normal form is a set of syntactically restricted formulaesuch that every PL formula is equivalent to some member of the set.

• Decision procedures for satisfiability. Truth-table method, sat, resolutionprocedure, dpll. Transformation to equisatisfiable CNF formula.


PL is an important logic with applications in software and hardware de-sign and analysis, knowledge representation, combinatorial optimization, andcomplexity theory, to name a few. Although relatively simple, the Booleanstructure that is central to PL is often a main source of complexity in appli-cations of the algorithmic reasoning that is the focus of Part II. Exercise 8.1explores this point in more depth.

Besides being an important logic in its own right, PL serves to introducethe main concepts that are important throughout the book, in particularsyntax, semantics, and satisfiability and validity. Chapter 2 presents first-order logic by building on the concepts of this chapter.


For a complete and concise presentation of propositional logic, see Smullyan’stext First-Order Logic [87]. The semantic argument method is similar toSmullyan’s tableau method.

The DPLL algorithm is based on work by Davis and Putnam, presentedin [26], and by Davis, Logemann, and Loveland, presented in [25].

Exercises

1.1 (PL validity & satisfiability). For each of the following PL formulae,identify whether it is valid or not. If it is valid, prove it with a truth table orsemantic argument; otherwise, identify a falsifying interpretation. Recall ourconventions for operator precedence and associativity from Section 1.1.

(a) P ∧Q → P → Q(b) (P → Q) ∨ P ∧ ¬Q(c) (P → Q → R) → P → R(d) (P → Q ∨R) → P → R(e) ¬(P ∧Q) → R → ¬R → Q(f) P ∧Q ∨ ¬P ∨ (¬Q → ¬P )(g) (P → Q → R) → ¬R → ¬Q → ¬P(h) (¬R → ¬Q → ¬P ) → P → Q → R

1.2 (Template equivalences). Use the truth table or semantic argumentmethod to prove the following template equivalences.

(a) ⊤ ⇔ ¬⊥(b) ⊥ ⇔ ¬⊤(c) ¬¬F ⇔ F(d) F ∧⊤ ⇔ F(e) F ∧⊥ ⇔ ⊥(f) F ∧ F ⇔ F

Exercises 33

(g) F ∨⊤ ⇔ ⊤(h) F ∨⊥ ⇔ F(i) F ∨ F ⇔ F(j) F → ⊤ ⇔ ⊤(k) F → ⊥ ⇔ ¬F(l) ⊤ → F ⇔ F

(m) ⊥ → F ⇔ ⊤(n) ⊤ ↔ F ⇔ F(o) ⊥ ↔ F ⇔ ¬F(p) ¬(F1 ∧ F2) ⇔ ¬F1 ∨ ¬F2

(q) ¬(F1 ∨ F2) ⇔ ¬F1 ∧ ¬F2

(r) F1 → F2 ⇔ ¬F1 ∨ F2

(s) F1 → F2 ⇔ ¬F2 → ¬F1

(t) ¬(F1 → F2) ⇔ F1 ∧ ¬F2

(u) (F1 ∨ F2) ∧ F3 ⇔ (F1 ∧ F3) ∨ (F2 ∧ F3)(v) (F1 ∧ F2) ∨ F3 ⇔ (F1 ∨ F3) ∧ (F2 ∨ F3)(w) (F1 → F3) ∧ (F2 → F3) ⇔ F1 ∨ F2 → F3

(x) (F1 → F2) ∧ (F1 → F3) ⇔ F1 → F2 ∧ F3

(y) F1 → F2 → F3 ⇔ F1 ∧ F2 → F3

(z) (F1 ↔ F2) ∧ (F2 ↔ F3) ⇒ (F1 ↔ F3)

1.3 (Redundant logical connectives). Given ⊤, ∧, and ¬, prove that ⊥,∨, →, and↔ are redundant logical connectives. That is, show that each of ⊥,F1 ∨ F2, F1 → F2, and F1 ↔ F2 is equivalent to a formula that uses only F1,F2, ⊤, ∨, and ¬.

1.4 (The nand connective). Let the logical connective ∧ (pronounced“nand”) be defined according to the following truth table:

F1 F2 F1∧F2

0 0 10 1 11 0 11 1 0

Show that all standard logical connectives can be defined in terms of ∧.

1.5 (Normal forms). Convert the following PL formulae to NNF, DNF, andCNF via the transformations of Section 1.6.

(a) ¬(P → Q)(b) ¬(¬(P ∧Q) → ¬R)(c) (Q ∧R → (P ∨ ¬Q)) ∧ (P ∨R)(d) ¬(Q→ R) ∧ P ∧ (Q ∨ ¬(P ∧R))


1.6 (Graph coloring). A solution to a graph coloring problem is an as-signment of colors to vertices such that no two adjacent vertices have the samecolor. Formally, a finite graph G = 〈V, E〉 consists of vertices V = v1, . . . , vnand edges E = 〈vi1 , wi1 〉, . . . , 〈vik

, wik〉. The finite set of colors is given by

C = c1, . . . , cm. A problem instance is given by a graph and a set of colors:the problem is to assign each vertex v ∈ V a color(v) ∈ C such that for everyedge 〈v, w〉 ∈ E, color(v) 6= color(w). Clearly, not all instances have solutions.

Show how to encode an instance of a graph coloring problem into a PLformula F . F should be satisfiable iff a graph coloring exists.

(a) Describe a set of constraints in PL asserting that every vertex is colored.Since the sets of vertices, edges, and colors are all finite, use notation suchas “color(v) = c” to indicate that vertex v has color c. Realize that suchan assertion is encodeable as a single propositional variable P c

v .(b) Describe a set of constraints in PL asserting that every vertex has at most

one color.(c) Describe a set of constraints in PL asserting that no two connected vertices

have the same color.(d) Identify a significant optimization in this encoding. Hint: Can any con-

straints be dropped? Why?(e) If the constraints are not already in CNF, specify them in CNF now. For

N vertices, K edges, and M colors, how many variables does the optimizedencoding require? How many clauses?

1.7 (CNF). Example 1.25 constructs a CNF formula that is equisatisfiableto a given small formula in DNF.

(a) If distribution of disjunction over conjunction (described in Section 1.6)were used, how many clauses would the resulting formula have?

(b) Consider the formulae

Fn :n∨

i=1

(Qi ∧Ri)

for positive integers n. As a function of n, how many clauses are in(i) the formula F ′ constructed based on distribution of disjunction over

conjunction?(ii) the formula

F ′ : Rep(Fn) ∧∧

G∈SFn

En(G) ?

(iii) For which n is the distribution approach better?

1.8 (DPLL). Describe the execution of DPLL on the following formulae.

(a) (P ∨ ¬Q ∨ ¬R) ∧ (Q ∨ ¬P ∨R) ∧ (R ∨ ¬Q)(b) (P ∨Q ∨R) ∧ (¬P ∨ ¬Q ∨ ¬R) ∧ (¬P ∨Q ∨R) ∧ (¬Q ∨R) ∧ (Q ∨ ¬R)

2

First-Order Logic

One task we logicians are interested in is that of analyzing the notionof “proof” — to make it as rigorous as any other notion in mathe-matics.

— Raymond SmullyanThe Lady or the Tiger?, 1982

This chapter extends the machinery of propositional logic to first-orderlogic (FOL), also called both predicate logic and the first-order predicatecalculus. While first-order logic enjoys a degree of expressiveness that makesit suitable for reasoning about computation, it does not admit completelyautomated reasoning.

FOL extends PL with predicates, functions, and quantifiers. As in ourdiscussion of PL, we first introduce the syntax of FOL and its semantics. Wethen build on the semantic argument method of PL to provide a method ofproving first-order validity.

Section 2.6 reviews decidability and complexity. A decidable problem hasan algorithm, which is a procedure that always finishes with a correct answeron every instance of the problem. While validity of PL formulae is decid-able, validity of FOL formulae is not. Complexity is the study of the intrinsichardness of a decidable problem.

The optional Section 2.7 proves the soundness and completeness of thesemantic argument method. It also presents two classic theorems that areapplied in Chapter 10.

2.1 Syntax

All formulae of PL evaluate to true or false. FOL is not so simple. In FOL,terms evaluate to values other than truth values such as integers, people, orcards of a deck. However, we are getting ahead of ourselves: just as in PL,

36 2 First-Order Logic

the syntax of FOL is independent of its meaning. The most basic terms arevariables x, y, z, x1, x2, . . . and constants a, b, c, a1, a2, . . ..

More complicated terms are constructed using functions. An n-ary func-tion f takes n terms as arguments. Notationally, we represent generic FOLfunctions by symbols f , g, h, f1, f2, . . .. A constant can also be viewed as a0-ary function.

Example 2.1. The following are all terms:

• a, a constant (or 0-ary function);• x, a variable;• f(a), a unary function f applied to a constant;• g(x, b), a binary function g applied to a variable x and a constant b;• f(g(x, f(b))).

The propositional variables of PL are generalized to predicates, denotedp, q, r, p1, p2, . . .. An n-ary predicate takes n terms as arguments. A FOLpropositional variable is a 0-ary predicate, which we write P , Q, R, P1,P2, . . ..

Countably infinitely many constant, function, and predicate symbols areavailable.

An atom is ⊤, ⊥, or an n-ary predicate applied to n terms. A literal isan atom or its negation.

Example 2.2. The following are all literals:

• P , a propositional variable (or 0-ary predicate);• p(f(x), g(x, f(x))), a binary predicate applied to two terms;• ¬p(f(x), g(x, f(x))).

The first two literals are also atoms.

A FOL formula is a literal, the application of a logical connective ¬, ∧,∨, →, or ↔ to a formula or formulae, or the application of a quantifier to aformula. There are two FOL quantifiers:

• the existential quantifier ∃x. F [x], read “there exists an x such thatF [x]”;

• and the universal quantifier ∀x. F [x], read “for all x, F [x]”.

In ∀x. F [x], x is the quantified variable, and F [x] is the scope of thequantifier ∀x. For convenience, we sometimes refer informally to the scopeof the quantifier ∀x as the scope of the quantified variable x itself. The caseis similar for ∃x. F [x]. Also, x in F [x] is bound (by the quantifier). Byconvention, the period in “. F [x]” indicates that the scope of the quantifiedvariable x extends as far as possible. We often abbreviate ∀x. ∀y. F [x, y] by∀x, y. F [x, y].

2.1 Syntax 37

Example 2.3. In

∀x. p(f(x), x) → (∃y. p(f(g(x, y)), g(x, y))︸︷︷︸G

) ∧ q(x, f(x))

︸︷︷︸F

,

the scope of x is F , and the scope of y is G. This formula is read: “for all x, ifp(f(x), x) then there exists a y such that p(f(g(x, y)), g(x, y)) and q(x, f(x))”.

A variable is free in formula F [x] if there is an occurrence of x that isnot bound by any quantifier. Denote by free(F ) the set of free variables ofa formula F . A variable is bound in formula F [x] if there is an occurrenceof x in the scope of a binding quantifier ∀x or ∃x. Denote by bound(F ) theset of bound variables of a formula F . In general, it is possible that free(F )∩bound(F ) 6= ∅, as a variable x can have both free and bound occurrences.

A formula F is closed if it does not contain any free variables.

Example 2.4. In

F : ∀x. p(f(x), y) → ∀y. p(f(x), y) ,

x only occurs bound, while y appears both free (in the antecedent) and bound(in the consequent). Thus, free(F ) = y and bound(F ) = x, y.

If free(F ) = x1, . . . , xn, then its universal closure is

∀x1. . . .∀xn. F ,

and its existential closure is

∃x1. . . .∃xn. F .

We usually write the universal and existential closures as ∀ ∗ . F and ∃ ∗ . F ,respectively.

The subformulae of a FOL formula are defined according to an extensionof the PL definition of subformula:

• the only subformula of p(t1, . . . , tn), where the ti are terms, is p(t1, . . . , tn);• the subformulae of ¬F are ¬F and the subformulae of F ;• the subformulae of F1 ∧ F2, F1 ∨ F2, F1 → F2, F1 ↔ F2 are the formula

itself and the subformulae of F1 and F2;• and the subformulae of ∃x. F and ∀x. F are the formula itself and the

subformulae of F .

The strict subformulae of a formula excludes the formula itself.The subterms of a FOL term are defined as follows:

• the only subterm of constant a or variable x is a or x itself, respectively;

• and the subterms of f(t1, . . . , tn) are the term itself and the subterms oft1, . . . , tn.

The strict subterms of a term excludes the term itself.

Example 2.5. In

F : ∀x. p(f(x), y) → ∀y. p(f(x), y) ,

the subformulae of F are

F , p(f(x), y) → ∀y. p(f(x), y) , ∀y. p(f(x), y) , p(f(x), y) .

The subterms of g(f(x), f(h(f(x)))) are

g(f(x), f(h(f(x)))) , f(x) , f(h(f(x))) , h(f(x)) , x .

f(x) occurs twice in g(f(x), f(h(f(x)))).

Example 2.6. Before discussing the formal semantics for FOL, we suggesttranslations of English sentences into FOL. The names of the constants, func-tions, and predicates are chosen to provide some intuition for the meaning ofthe FOL formulae.

• Every dog has its day.

∀x. dog(x) → ∃y. day(y) ∧ itsDay(x, y)

• Some dogs have more days than others.

∃x, y. dog(x) ∧ dog(y) ∧ #days(x) > #days(y)

• All cats have more days than dogs.

∀x, y. dog(x) ∧ cat(y) → #days(y) > #days(x)

• Fido is a dog. Furrball is a cat. Fido has fewer days than does Furrball.

dog(Fido) ∧ cat(Furrball) ∧ #days(Fido) < #days(Furrball)

• The length of one side of a triangle is less than the sum of the lengths ofthe other two sides.

∀x, y, z. triangle(x, y, z) → length(x) < length(y) + length(z)

• Fermat’s Last Theorem.

∀n. integer(n) ∧ n > 2→ ∀x, y, z.

integer(x) ∧ integer(y) ∧ integer(z) ∧ x > 0 ∧ y > 0 ∧ z > 0→ xn + yn 6= zn

2.2 Semantics 39

2.2 Semantics

Having defined the syntax of FOL, we now define its semantics. Formulaeof FOL evaluate to the truth values true and false as in PL. However, termsof FOL formulae evaluate to values from a specified domain. We extend theconcept of interpretations to this more complex setting and then define thesemantics of FOL in terms of interpretations.

First, we define a FOL interpretation I. The domain DI of an interpre-tation I is a nonempty set of values or objects, such as integers, real numbers,dogs, people, or merely abstract objects. |DI | denotes the cardinality, orsize, of DI . Domains can be finite, such as the 52 cards of a deck of cards;countably infinite, such as the integers; or uncountably infinite, such as thereals. But all domains are nonempty.

The assignment αI of interpretation I maps constant, function, and pred-icate symbols to elements, functions, and predicates over DI . It also mapsvariables to elements of DI :

• each variable symbol x is assigned a value xI from DI ;• each n-ary function symbol f is assigned an n-ary function

fI : DnI → DI

that maps n elements of DI to an element of DI ;• each n-ary predicate symbol p is assigned an n-ary predicate

pI : DnI → true, false

that maps n elements of DI to a truth value.

In particular, each constant (0-ary function symbol) is assigned a value fromDI , and each propositional variable (0-ary predicate symbol) is assigned atruth value.

An interpretation I : (DI , αI) is thus a pair consisting of a domain and anassignment.

Example 2.7. The formula

F : x + y > z → y > z − x

contains the binary function symbols + and−, the binary predicate symbol >,and the variables x, y, and z. Again, +, −, and > are just symbols: we choosethese names to provide intuition for the intended meaning of the formulae.We could just as easily have written

F ′ : p(f(x, y), z) → p(y, g(z, x)) .

We construct a “standard” interpretation. The domain is the integers, Z:

DI = Z = . . . ,−2,−1, 0, 1, 2, . . . .


To + and − we assign standard addition +Z and subtraction −Z of integers,respectively. To > we assign the standard greater-than relation >Z of integers.Finally, to x, y, and z, we assign the values 13, 42, and 1, respectively. Weignore the countably infinitely many other constant, function, and predicatesymbols that do not appear in F . We thus have interpretation I : (Z, αI),where

αI : + 7→ +Z, − 7→ −Z, > 7→>Z, x 7→ 13, y 7→ 42, z 7→ 1, . . . .

The elision reminds us that, as always, αI provides values for the countablyinfinitely many other constant, function, and predicate symbols. Usually, wedo not write the elision.

Given a FOL formula F and interpretation I : (DI , αI), we want to com-pute if F evaluates to true under interpretation I, I |= F , or if F evaluates tofalse under interpretation I, I 6|= F . We define the semantics inductively as inPL. To start, define the meaning of truth symbols:

I |= ⊤I 6|= ⊥

Next, consider more complicated atoms. αI gives meaning αI [x], αI [c], andαI [f ] to variables x, constants c, and functions f . Evaluate arbitrary termsrecursively:

αI [f(t1, . . . , tn)] = αI [f ](αI [t1], . . . , αI [tn]) ,

for terms t1, . . . , tn. That is, define the value of f(t1, . . . , tn) under αI byevaluating the function αI [f ] over the terms αI [t1], . . . , αI [tn]. Similarly,evaluate arbitrary atoms recursively:

αI [p(t1, . . . , tn)] = αI [p](αI [t1], . . . , αI [tn]) .

Then

I |= p(t1, . . . , tn) iff αI [p(t1, . . . , tn)] = true

Having completed the base cases of our inductive definition, we turn tothe inductive step. Assume that formulae F1 and F2 have fixed truth values.From these formulae, evaluate the semantics of more complex formulae. Thelogical connectives are handled in FOL in precisely the same way as in PL:

I |= ¬F iff I 6|= FI |= F1 ∧ F2 iff I |= F1 and I |= F2

I |= F1 ∨ F2 iff I |= F1 or I |= F2


I |= F1 ↔ F2 iff I |= F1 and I |= F2, or I 6|= F1 and I 6|= F2

2.2 Semantics 41

Example 2.8. Recall the formula

F : x + y > z → y > z − x

of Example 2.7 and the interpretation I : (Z, αI), where

αI : + 7→ +Z, − 7→ −Z, > 7→>Z, x 7→ 13Z, y 7→ 42Z, z 7→ 1Z .

Compute the truth value of F under I as follows:

1. I |= x + y > z since αI [x + y > z] = 13Z +Z 42 >Z 1Z

2. I |= y > z − x since αI [y > z − x] = 42Z >Z 1Z −Z 13Z

3. I |= F by 1, 2, and the semantics of →

For the quantifiers, let x be a variable. Define an x-variant of an inter-pretation I : (DI , αI) as an interpretation J : (DJ , αJ) such that

• DI = DJ ;• and αI [y] = αJ [y] for all constant, free variable, function, and predicate

symbols y, except possibly x.

That is, I and J agree on everything except possibly the value of variable x.Denote by J : I ⊳ x 7→ v the x-variant of I in which αJ [x] = v for somev ∈ DI . Then

I |= ∀x. F iff for all v ∈ DI , I ⊳ x 7→ v |= FI |= ∃x. F iff there exists v ∈ DI such that I ⊳ x 7→ v |= F

In words, I is an interpretation of ∀x. F iff all x-variants of I are interpre-tations of F . I is an interpretation of ∃x. F iff some x-variant of I is aninterpretation of F .


F : ∃x. f(x) = g(x)

and the interpretation I : (D : , •, αI) in which

αI : f() 7→ , f(•) 7→ •, g() 7→ •, g(•) 7→ .

Compute the truth value of F under I as follows:

1. I ⊳ x 7→ v 6|= f(x) = g(x) for v ∈ D2. I 6|= ∃x. f(x) = g(x) since v ∈ D is arbitrary

In the first line, basic reasoning about the interpretation I reveals that f andg always disagree. The second line follows from the first by the semantics ofexistential quantification.


2.3 Satisfiability and Validity

A formula F is said to be satisfiable iff there exists an interpretation Isuch that I |= F . A formula F is said to be valid iff for all interpretationsI, I |= F . Determining satisfiability and validity of formulae are importanttasks in FOL. Recall that satisfiability and validity are dual: F is valid iff ¬Fis unsatisfiable.

Technically, satisfiability and validity only apply to closed FOL formulae,which do not have free variables. However, let us agree on a convention: if wesay that a formula F such that free(F ) 6= ∅ is valid, we mean that its universalclosure ∀ ∗ . F is valid; and if we say that it is satisfiable, we mean that itsexistential closure ∃ ∗ . F is satisfiable. Duality still holds: a formula F withfree variables is valid (∀ ∗ . F is valid) iff its negation is unsatisfiable (∃ ∗ . ¬Fis unsatisfiable). Henceforth, we freely discuss the validity and satisfiability offormulae with free variables.

For arguing the validity of FOL formulae, we extend the semantic argu-ment method from PL to FOL. Most of the concepts carry over to the FOLcase without change. In addition to the rules for the logical connectives of PL(see Section 1.3), we have the following rules for the quantifiers.

• According to the semantics of universal quantification, from I |= ∀x. F ,deduce I ⊳ x 7→ v |= F for any v ∈ DI .

I |= ∀x. F

I ⊳ x 7→ v |= Ffor any v ∈ DI

In practice, we usually apply this rule using a domain element v that wasintroduced earlier in the proof.

• Similarly, from the semantics of existential quantification, from I 6|= ∃x. F ,deduce I ⊳ x 7→ v 6|= F for any v ∈ DI .

I 6|= ∃x. F

I ⊳ x 7→ v 6|= Ffor any v ∈ DI

Again, we usually apply this rule using a domain element v that was in-troduced earlier in the proof.

• According to the semantics of existential quantification, from I |= ∃x. F ,deduce I ⊳ x 7→ v |= F for some v ∈ DI that has not been previouslyused in the proof.

I |= ∃x. F

I ⊳ x 7→ v |= Ffor a fresh v ∈ DI

• Similarly, from the semantics of universal quantification, from I 6|= ∀x. F ,deduce I ⊳ x 7→ v 6|= F for some v ∈ DI that has not been previouslyused in the proof.

I 6|= ∀x. F

I ⊳ x 7→ v 6|= Ffor a fresh v ∈ DI


The restriction in the latter two rules corresponds to our intuition: if all weknow is that ∃x. F , then we certainly do not know which value in particularsatisfies F . Hence, we choose a new value v that does not appear previously inthe proof: it was never introduced before by a quantification rule. Moreover,αI does not already assign it to some constant, αI [a], or to some functionapplication, αI [f(t1, . . . , tn)].

Notice the similarity between the first two and between the final two rules.The first two rules handle a case that is universal in character. Consider thesecond rule: if there does not exist an x such that F , then for all values, Fdoes not hold. The final two rules are existential in character.

Lastly, the contradiction rule is modified for the FOL case.

• A contradiction exists if two variants of the original interpretation I dis-agree on the truth value of an n-ary predicate p for a given tuple of domainvalues.

J : I ⊳ · · · |= p(s1, . . . , sn)K : I ⊳ · · · 6|= p(t1, . . . , tn)I |= ⊥

for i ∈ 1, . . . , n, αJ [si] = αK [ti]

The intuition behind the contradiction rule is the following. The variants Jand K are constructed only through the rules for quantification. Hence, thetruth value of p on the given tuple of domain values is already establishedby I. Therefore, the disagreement between J and K on the truth value of pindicates a problem with I.

None of these rules cause branching, but several of the rules for the logicalconnectives do. Thus, a proof in general is a tree. A branch is closed if itcontains a contradiction according to the (first-order) contradiction rule; it isopen otherwise. All branches are closed in a finished proof of a valid formula.We exhibit the proof method through several examples.

Example 2.10. We prove that

F : (∀x. p(x)) → (∀y. p(y))

is valid. Suppose not; then there is an interpretation I such that I 6|= F :

1. I 6|= F assumption2. I |= ∀x. p(x) 1 and semantics of →3. I 6|= ∀y. p(y) 1 and semantics of →4. I ⊳ y 7→ v 6|= p(y) 3 and semantics of ∀, for some v ∈ DI

5. I ⊳ x 7→ v |= p(x) 2 and semantics of ∀

Lines 2 and 3 state the case in which line 1 holds: the antecedent and conse-quent of F are respectively true and false under I. Line 4 states that becauseof 3, there must be a value v ∈ DI such that I ⊳ y 7→ v 6|= p(y). Line 5 usesthis same value v and the semantics of ∀ with 2 to derive a contradiction:under I, p(v) is false by 4 and true by 5. Thus, F is valid.


For concision we shorten, for example, “semantics of ∀” to “∀” in theexplanation column of our arguments.

Example 2.11. Consider the following relation between universal and exis-tential quantification:

F : (∀x. p(x)) ↔ (¬∃x. ¬p(x)) .

Is it valid? Suppose not. Then there is an interpretation I such that I 6|= F .Consider the forward (→) and backward (←) directions of ↔ as separatecases. In the first case,

1. I |= ∀x. p(x) assumption2. I 6|= ¬∃x. ¬p(x) assumption3. I |= ∃x. ¬p(x) 2 and ¬4. I ⊳ x 7→ v |= ¬p(x) 3 and ∃, for some v ∈ DI

5. I ⊳ x 7→ v |= p(x) 1 and ∀Lines 4 and 5 are contradictory. In line 5, we use the value introduced in line4 with the semantics of ∀ and line 1. We are allowed to choose this same valuev precisely because line 1 states that for all x, p(x).

For the second case,

1. I 6|= ∀x. p(x) assumption2. I |= ¬∃x. ¬p(x) assumption3. I ⊳ x 7→ v 6|= p(x) 1 and ∀, for some v ∈ DI

4. I 6|= ∃x. ¬p(x) 2 and ¬5. I ⊳ x 7→ v 6|= ¬p(x) 4 and ∃6. I ⊳ x 7→ v |= p(x) 5 and ¬

Lines 3 and 6 are contradictory. Line 4 says that ∃x. ¬p(x) is false under I.Thus, by the semantics of ∃, no value w from DI is such that p(w) is true. Inparticular, line 5 identifies v, introduced in line 3.

As both cases end in contradictions for arbitrary interpretation I, F isvalid.

It is sometimes useful to reference known values, as the following simpleexample illustrates.


F : p(a) → ∃x. p(x)

is valid, assume otherwise and derive a contradiction.

1. I 6|= F assumption2. I |= p(a) 1 and →3. I 6|= ∃x. p(x) 1 and →4. I ⊳ x 7→ αI [a] 6|= p(x) 3 and ∃5. I |= ⊥ 2, 4

2.4 Substitution 45

In line 4, we used the value assigned to a to instantiate the quantified variableof line 3, which has universal character. Because lines 2 and 4 are contradic-tory, F is valid.

To show that a formula F is invalid, it suffices to find an interpretation Isuch that I |= ¬F .


F : (∀x. p(x, x)) → (∃x. ∀y. p(x, y)) .

To show that it is invalid, we find an interpretation I such that

I |= ¬((∀x. p(x, x)) → (∃x. ∀y. p(x, y))) ,

or, according to the semantics of →,

I |= (∀x. p(x, x)) ∧ ¬(∃x. ∀y. p(x, y)) .

Choose

DI = 0, 1

and

pI = (0, 0), (1, 1) .

We use a common notation for defining relations: pI(a, b) is true iff (a, b) ∈ pI .Here, pI(0, 0) is true, and pI(1, 0) is false.

Both ∀x. p(x, x) and ¬(∃x. ∀y. p(x, y)) evaluate to true under I, so

I |= (∀x. p(x, x)) ∧ ¬(∃x. ∀y. p(x, y)) ,

which shows that F is invalid. Interpretation I is a falsifying interpretationof F .

We apply the semantic argument method to more examples in Section 3.1.Equivalence (F1 ⇔ F2) and implication (F1 ⇒ F2) extend directly from

PL to FOL. Equivalence of and implication between two formulae can beargued using the semantic argument method. See, for example, Example 2.11.

2.4 Substitution

Substitution for FOL is more complex than substitution for PL because ofquantification. We introduce two types of substitution in this section withthe goal of generalizing Propositions 1.15 and 1.17 to the FOL setting. As inPL, substitution allows us to consider the validity of entire sets of formulaesimultaneously.


First, we define the renaming of a quantified variable. If variable x isquantified in F so that F has the form F [∀x. G[x]], then the renaming of x tofresh variable x′ produces the formula F [∀x′. G[x′]]. A “fresh variable” is anyvariable that does not occur in F . By the semantics of universal quantification,the original and final formulae are equivalent. The case is similar for existentialquantification. Often, we simply say that a variable is renamed to mean thatits bound occurrences are renamed to a fresh variable. Free occurrences ofvariables are never renamed.

Example 2.14. Renaming the bound variable x to fresh variable x′ in

F : p(x) ∧ ∀x. q(x, y)

produces

F ′ : p(x) ∧ ∀x′. q(x′, y) .

Renaming y does not cause any change because y does not occur bound.

A substitution is a map from FOL formulae to FOL formulae:

σ : F1 7→ G1, . . . , Fn 7→ Gn .

As in PL, domain(σ) = F1, . . . , Fn and range(σ) = G1, . . . , Gn. To com-pute the application of σ to F , Fσ, replace each occurrence of Fi in F by Gi

simultaneously. When both subformulae Fj and Fk are in the domain of σand Fk is a strict subformula of Fj , replace occurrences of Fj by Gj .


F : (∀x. p(x, y)) → q(f(y), x)

and substitution

σ : x 7→ g(x), y 7→ f(x), q(f(y), x) 7→ ∃x. h(x, y) .

Then

Fσ : (∀x. p(g(x), f(x))) → ∃x. h(x, y) .

Notice how there are more bound occurrences of x in Fσ than in F .

Use care when substitutions include quantifiers in the domain. Substitutingfor a quantified subformula ∀x. F requires that all of ∀x. F be replaced.


F : ∃y. p(x, y) ∧ p(y, x)

and substitution

σ : ∃y. p(x, y) 7→ p(x, a) ,

where a is a constant. Fσ = F because the scope of the quantifier ∃y in F isp(x, y) ∧ p(y, x), not just p(x, y).

2.4 Substitution 47

2.4.1 Safe Substitution

A restricted application of substitution has a useful semantic property.Define for a substitution σ its set of free variables:

Vσ =⋃

i

(free(Fi) ∪ free(Gi)) .

Vσ consists of the free variables of all formulae Fi and Gi of the domain andrange of σ. Compute the safe substitution Fσ of formula F as follows:

1. For each quantified variable x in F such that x ∈ Vσ, rename x to a freshvariable to produce F ′.

2. Compute F ′σ.

Example 2.17. Consider again formula

F : (∀x. p(x, y)) → q(f(y), x)

and substitution

σ : x 7→ g(x), y 7→ f(x), q(f(y), x) 7→ ∃x. h(x, y) .

To compute the safe substitution Fσ, first compute

Vσ = free(x) ∪ free(g(x)) ∪ free(y) ∪ free(f(x))∪ free(q(f(y), x)) ∪ free(∃x.h(x, y))

= x ∪ x ∪ y ∪ x ∪ x, y ∪ y= x, y

Then

1. x ∈ Vσ, so rename bound occurrences in F :

F ′ : (∀x′. p(x′, y)) → q(f(y), x) .

x also occurs free in F .2. F ′σ : (∀x′. p(x′, f(x))) → ∃x. h(x, y).

The first step of computing a safe substitution becomes trivial if eachquantified variable has a unique name.


F : (∀z. p(z, y)) → q(f(y), x) ,

in which the quantified variable has a different name than any free variableof F or the substitution


σ : x 7→ g(x), y 7→ f(y), q(f(y), x) 7→ ∃w. h(w, y) .

Compared to Example 2.17, the quantified variable z in F and the quantifiedvariable w of σ have different names than any other variable of F or σ. Thesafe substitution is the unrestricted substitution

Fσ : (∀z. p(z, f(y))) → ∃w. h(w, y) .

Proposition 2.19 (Substitution of Equivalent Formulae). Considersubstitution

σ : F1 7→ G1, . . . , Fn 7→ Gn

such that for each i, Fi ⇔ Gi. Then F ⇔ Fσ when Fσ is computed as a safesubstitution.

The language of Propositions 2.19 and 1.15 are almost identical.

2.4.2 Schema Substitution

Example 2.11 proves an interesting relation between universal and existentialquantification (∀x. p(x) ⇔ ¬∃x. ¬p(x)), but the result is not general. Wewould like to prove that for any FOL formula F ,

H : (∀x. F ) ↔ (¬∃x. ¬F )

is valid. H is a formula schema (plural: schemata). Formula schema andschema substitutions provide the desired generality.

A formula schema H contains at least one placeholder F1, F2, . . .. Forexample, F is a placeholder in the formula schema H above. A formula schemacan also have side conditions that specify that certain variables do not occurfree in the placeholders.

Consider a substitution σ mapping placeholders to FOL formulae. Aschema substitution is an (unrestricted) application of σ to a formulaschema. It is legal only if σ obeys the side conditions of the formula schema.

Example 2.20. Recall from Example 2.11 that

(∀x. p(x)) ↔ (¬∃x. ¬p(x))

is valid. It can act as a formula schema. First, rewrite the formula usingplaceholders:

H : (∀x. F ) ↔ (¬∃x. ¬F ) .

H does not have any side conditions. Next, to prove the validity of

2.4 Substitution 49

G : (∀x. ∃y. q(x, y)) ↔ (¬∃x. ¬∃y. q(x, y)) ,

show that G is derivable from H via a schema substitution:

σ : F 7→ ∃y. q(x, y) .

Then Hσ = G (Hσ is syntactically identical to G), so that by Proposition2.25 below, G is valid.

A formula schema H is valid if Hσ is valid for every schema substitutionσ that obeys the side conditions of H . Apply the semantic method to provethe validity of a formula schema.

Example 2.21. To prove the validity of the formula schema

H : (∀x. F1 ∧ F2) ↔ (∀x. F1) ∧ (∀x. F2) ,

consider the two directions. First, assume that I 6|= (∀x. F1∧F2) → (∀x. F1) ∧(∀x. F2):

1. I |= ∀x. F1 ∧ F2 assumption2. I 6|= (∀x. F1) ∧ (∀x. F2) assumption3. I |= (∃x. ¬F1) ∨ (∃x. ¬F2) 2, ¬

There are two cases to consider:

4a. I |= ∃x. ¬F1 3, ∨5a. I ⊳ x 7→ v |= ¬F1 4a, ∃, for some v ∈ DI

6a. I ⊳ x 7→ v |= F1 ∧ F2 1, ∀7a. I ⊳ x 7→ v |= F1 6a, ∧

ending in a contradiction. The second disjunctive case is similar.For the second main case, assume that I 6|= (∀x. F1) ∧ (∀x. F2) →

(∀x. F1 ∧ F2):

1. I 6|= ∀x. F1 ∧ F2 assumption2. I |= (∀x. F1) ∧ (∀x. F2) assumption3. I |= ∃x. ¬F1 ∨ ¬F2 1, ¬4. I ⊳ x 7→ v |= ¬F1 ∨ ¬F2 3, ∃, for some v ∈ DI

5. I |= ∀x. F1 2, ∧6. I |= ∀x. F2 2, ∧7. I ⊳ x 7→ v |= F1 5, ∀8. I ⊳ x 7→ v |= F2 6, ∀

Again, there are two cases to consider:

9a. I ⊳ x 7→ v |= ¬F1 4, ∨

ending in a contradiction. The second disjunctive case is similar. Thus, H isa valid formula schema.


We now consider formula schemata with side conditions.

Example 2.22. Consider the formula schema with side condition

H : (∀x. F ) ↔ F provided x 6∈ free(F ) .

If we disregard the side condition, then H is an invalid formula schema as, forexample,

G1 : (∀x. p(x)) ↔ p(x) ,

obtained from H by schema substitution

σ : F 7→ p(x) ,

is invalid. However, σ is disallowed by the side condition that x should notoccur free in F .

H (with the side condition) is a valid formula schema. Thus, the formula

G2 : (∀x. ∃y. p(z, y)) ↔ ∃y. p(z, y)

is valid: obtain it via the schema substitution

σ : F 7→ ∃y. p(z, y) ,

which obeys H ’s side condition.

Reasoning about the validity of a formula schema with side conditionsusually requires invoking its side conditions during a semantic argument.

Example 2.23. To prove the validity of

H : (∀x. F ) ↔ F provided x 6∈ free(F ) ,

consider the two directions of ↔. First,

1. I |= ∀x. F assumption2. I 6|= F assumption3. I |= F 1, ∀, since x 6∈ free(F )4. I |= ⊥ 2, 3

Second,

1. I 6|= ∀x. F assumption2. I |= F assumption3. I |= ∃x. ¬F 1 and ¬4. I |= ¬F 3, ∃, since x 6∈ free(F )5. I |= ⊥ 2, 4

Thus, H is a valid formula schema.

2.5 Normal Forms 51

Example 2.24. To prove the validity of

H : (∀x. F1 ∧ F2) ↔ (∀x. F1) ∧ F2 provided x 6∈ free(F2) ,

consider two cases. First,

1. I |= ∀x. F1 ∧ F2 assumption2. I 6|= (∀x. F1) ∧ F2 assumption3. I |= (∀x. F1) ∧ (∀x. F2) 1, valid schema from Example 2.214. I |= (∀x. F1) ∧ F2 3, Example 2.23, since x 6∈ free(F2)5. I |= ⊥ 2, 4

Observe the application of valid formula schemata from previous examples inlines 3 and 4. Second,

1. I 6|= ∀x. F1 ∧ F2 assumption2. I |= (∀x. F1) ∧ F2 assumption3. I |= (∀x. F1) ∧ (∀x. F2) 2, Example 2.23, since x 6∈ free(F2)4. I |= ∀x. F1 ∧ F2 3, Example 2.215. I |= ⊥ 1, 4

Thus, H is a valid formula schema.

Proposition 2.25 (Formula Schema). If H is a valid formula schema andσ is a substitution obeying H’s side conditions, then Hσ is also valid.

The valid PL formula

(P → Q) ↔ (¬P ∨Q)

can be treated as a valid formula schema:

(F1 → F2) ↔ (¬F1 ∨ F2) .

In general, valid propositional templates are valid formulae schemata, so thatProposition 2.25 generalizes Proposition 1.17.

2.5 Normal Forms

The normal forms of PL extend to FOL. A FOL formula F can be trans-formed into negation normal form (NNF) by using the procedure of Section1.6 augmented with these two (schema) equivalences:

¬∀x. F [x] ⇔ ∃x. ¬F [x] ¬∃x. F [x] ⇔ ∀x. ¬F [x] .

Example 2.26. We apply the procedure to find a formula in NNF that isequivalent to

G : ∀x. (∃y. p(x, y) ∧ p(x, z)) → ∃w.p(x, w) .

Each formula below is equivalent to G and is obtained from the previous onethrough an application of an equivalence.

1. ∀x. (∃y. p(x, y) ∧ p(x, z)) → ∃w. p(x, w)2. ∀x. ¬(∃y. p(x, y) ∧ p(x, z)) ∨ ∃w. p(x, w)3. ∀x. (∀y. ¬(p(x, y) ∧ p(x, z))) ∨ ∃w. p(x, w)4. ∀x. (∀y. ¬p(x, y) ∨ ¬p(x, z)) ∨ ∃w. p(x, w)

Formula 2 follows from the equivalence

F1 → F2 ⇔ ¬F1 ∨ F2 .

Formula 3 arises from an application of

¬∃x. F [x] ⇔ ∀x. ¬F [x] ,

and the final formula, which is in NNF, follows from De Morgan’s Law.

A formula is in prenex normal form (PNF) if all of its quantifiersappear at the beginning of the formula:

Q1x1. . . . Qnxn. F [x1, . . . , xn] ,

where Qi ∈ ∀, ∃ and F is quantifier-free. Every FOL formula F can betransformed into an equivalent formula F ′ in PNF. To compute an equivalentPNF F ′ of F ,

1. Convert F into NNF formula F1.2. When multiple quantified variables have the same name, rename them to

fresh variables, resulting in F2.3. Remove all quantifiers from F2 to produce quantifier-free formula F3.4. Add the quantifiers before F3,

F4 : Q1x1. . . . Qnxn. F3 ,

where the Qi are the quantifiers such that if Qj is in the scope of Qi inF1, then i < j.

F4 is equivalent to F .

Example 2.27. We apply the procedure to find a PNF equivalent of

F : ∀x. ¬(∃y. p(x, y) ∧ p(x, z)) ∨ ∃y. p(x, y) .

1. Write F in NNF:

F1 : ∀x. (∀y. ¬p(x, y) ∨ ¬p(x, z)) ∨ ∃y. p(x, y) .

2. Rename quantified variables:

F2 : ∀x. (∀y. ¬p(x, y) ∨ ¬p(x, z)) ∨ ∃w. p(x, w) .

2.6 Decidability and Complexity 53

3. Remove all quantifiers to produce quantifier-free formula

F3 : ¬p(x, y) ∨ ¬p(x, z) ∨ p(x, w) .

4. Add the quantifiers before F3:

F4 : ∀x. ∀y. ∃w. ¬p(x, y) ∨ ¬p(x, z) ∨ p(x, w) .

Alternately, choose order

F ′4 : ∀x. ∃w. ∀y. ¬p(x, y) ∨ ¬p(x, z) ∨ p(x, w) .

Both F4 and F ′4 are equivalent to F . However,

G : ∀y. ∃w. ∀x. ¬p(x, y) ∨ ¬p(x, z) ∨ p(x, w)

is not equivalent to F .

A FOL formula is in CNF (DNF) if it is in PNF and its main quantifier-free subformula is in CNF (DNF). CNF and DNF equivalents are obtained bytransforming formula F into PNF formula F ′ and then applying the relevantprocedure of Section 1.6 to the main quantifier-free subformula of F ′.

2.6 Decidability and Complexity

We review the main concepts from decidability and complexity theory. Bibli-ographic Remarks refers the interested reader to other texts that focus onthese topics.

2.6.1 Satisfiability as a Formal Language

Satisfiability of formulae is the primary decision problem in logic. We canformalize satisfiability as a formal language decision problem. Consider PL.Let LPL be the set of all satisfiable formulae. That is, the word w ∈ LPL iff

1. w is a syntactically well-formed formulae: it parses according to the defi-nition of Section 1.1;

2. and when w is viewed as a PL formula F , F is satisfiable.

Then the formal decision problem is the following: given a word w, is w ∈ LPL?Satisfiability of FOL formulae can be similarly formalized as a language

question. Let LFOL be the set of all FOL formulae (words that parse accordingto Section 2.1) that are satisfiable. Then the formal decision problem is thefollowing: given a word w, is w ∈ LFOL? In other words, is it a well-formedFOL formula, and if so, is it satisfiable? Dually, we can define the validityproblem for PL and FOL.


2.6.2 Decidability

A formal language L is decidable if there exists a procedure that, given aword w, (1) eventually halts and (2) answers yes if w ∈ L and no if w 6∈L. Other terms for “decidable” are recursive and Turing-decidable. Aprocedure for a decidable language is called an algorithm. Satisfiability ofPL formulae is decidable: the truth-table method is a decision procedure.

A formal language is undecidable if it is not decidable.A formal language L is semi-decidable if there exists a procedure that,

given a word w, (1) halts and answers yes iff w ∈ L, (2) halts and answers no ifw 6∈ L, or (3) does not halt if w 6∈ L. The possible outcomes (2) and (3) for thecase w 6∈ L mean that the procedure may or may not halt. Unlike a decidablelanguage, the procedure is only guaranteed to halt if w ∈ L. Other terms for“semi-decidable” are partially decidable, recursively enumerable, andTuring-recognizable.

The terms “Turing-decidable” and “Turing-recognizable” arise from AlanTuring’s classic formalization of procedures as Turing machines. A Turingmachine consists of a finite automaton coupled with an infinite tape and tapehead. Each cell of the tape can hold one character from a finite alphabet.The state of the automaton changes based on its control structure and thecharacter currently under the tape head. During a state change, the automatoninstructs the tape head to write a new character to the current cell and tomove one position left or right.

Church and Turing showed that LFOL is undecidable: there does not existan algorithm for deciding if a FOL formula F is satisfiable (and similarlyfor validity). However, there is a procedure that halts and says yes if F isvalid, so validity is semi-decidable. We describe such a procedure based onthe semantic argument method in Section 2.7.

2.6.3 ⋆Complexity

If a language is decidable, then one considers the complexity of the decisionproblem. We define several of the main complexity classes here. A languageL is polynomial-time decidable, or in PTIME (also, in P), if there exists aprocedure that, given w, answers yes when w ∈ L, answers no when w 6∈ L,and halts with the answer in a number of steps that is at most proportionateto some polynomial of the size of w. For example, determining if the wordw is a well-formed FOL formula is polynomial-time decidable and can beimplemented using standard parsing methods.

A language L is nondeterministic-polynomial-time decidable, or inNPTIME (also, in NP), if there exists a nondeterministic procedure that, givenw,

1. guesses a witness W to the fact that w ∈ L that is at most proportionatein size to some polynomial of the size of w;

2.6 Decidability and Complexity 55

2. checks in time at most proportionate to some polynomial of the size of wthat W really is a witness to w ∈ L;

3. and answers yes if the check succeeds and no otherwise.

For example, LPL is in NP, as exhibited by the following nondeterministicprocedure for deciding if a PL formula is satisfiable:

1. parse the input w as formula F (return no if w is not a well-formed PLformula);

2. guess an interpretation I, which is linear in the size of w;3. check that I |= F .

I is the witness to the satisfiability of F .A language L is in co-NP if its complement language L is in NP. For

example, unsatisfiability of PL formulae is in co-NP because satisfiability isin NP. It is not known if unsatisfiability of PL formulae is in NP. Whilea satisfiable PL formula has a polynomial size witness of its satisfiability(a satisfying interpretation I), there is no known polynomial size witness ofunsatisfiability.

A language L is NP-hard if every instance v ∈ L′ of every other NP decid-able language L′ can be reduced to deciding an instance wv

L′ ∈ L. Moreover,the size of wv

L′ must be at most proportionate to some polynomial of the sizeof v. That is, L is NP-hard if every query v ∈ L′ of every NP language L′

can be encoded into a query w ∈ L, where w is not much larger than v. L isNP-complete if it is in NP and is NP-hard.

LPL is NP-complete. Indeed, LPL, also called SAT, was the first languageshown to be NP-complete. We proved that LPL is in NP above by describing anondeterministic polynomial time procedure. The Cook-Levin theorem showsthat all NP-languages L can be reduced to LPL, so that LPL is NP-hard. Theyexhibited a polynomial time algorithm that, given L and input w, constructsan encoding into PL of a simulation of a run of a nondeterministic Turingmachine for L on w. The encoding as PL formula F has length that is poly-nomial in the length of w. F is satisfiable iff the Turing machine decides thatw ∈ L.

For discussing the complexity of decision problems and algorithms, we usethe standard notation. Consider a function f(n) over integers. For example,f(n) = log n, f(n) = n2, f(n) = 2n, or f(n) = 22n

. A function g(n) is of atmost order f(n) if there exist a scalar c ≥ 0 and an integer n0 ≥ 0 such that

∀n ≥ n0. g(n) ≤ cf(n) .

O(f(n)) denotes the set of all functions of at most order f(n). Similarly,Ω(f(n)) denotes the set of all function of at least order f(n): a function g(n)is of at least order f(n) if there exist a scalar c ≥ 0 and an an integer n0 ≥ 0such that

∀n ≥ n0. g(n) ≥ cf(n) .


Finally, Θ(f(n)) = Ω(f(n)) ∩ O(f(n)) denotes the set of all functions ofprecisely order f(n).

Example 2.28.

• 3n2 + n ∈ O(n2)• 3n2 + n ∈ Ω(n2)• 3n2 + n ∈ Θ(n2)• 1

99n2 + n ∈ Ω(n2)• 3n2 + n ∈ O(2n)• 3n2 + n ∈ Ω(n)• 3n2 + n 6∈ Ω(2n)• 3n2 + n 6∈ Θ(2n)• 2n ∈ Ω(n3)• 2n 6∈ O(n3)

A decision problem has time complexity O(f(n)) if there exists a decisionalgorithm P for the problem and a function g(n) ∈ O(f(n)) such that Pruns in time at most g(n) on input of size n. A decision problem has timecomplexity Ω(f(n)) if there exists a function g(n) ∈ Ω(f(n)) such that alldecision algorithms P for the problem run in time at least g(n) on input ofsize n. Finally, a decision problem has time complexity Θ(f(n)) if it has timecomplexities Ω(f(n)) and O(f(n)).

Example 2.29. The algorithm sat for deciding PL satisfiability runs in timeΘ(2n), where n is the number of variables in the input formula, because eachlevel of recursion branches. Hence, the problem of PL satisfiability has timecomplexity O(2n).

2.7 ⋆Meta-Theorems of First-Order Logic

We prove that the semantic argument method for FOL is sound and, givena proper strategy of applying the proof rules, complete. A proof method issound if every formula that has a proof according to the method is valid. Aproof method is complete if every formula that is valid has a proof accordingto the method. That the semantic argument method is sound means that aclosed semantic argument for I 6|= F proves the validity of F ; and that thesemantic argument method is complete means that every valid formula Fof FOL has a closed semantic argument proving its validity. Because thereexists a complete proof method for FOL, FOL is a complete logic: every validformula of FOL has a proof of its validity.

The second half of this section is devoted to proving two classic theoremsthat we apply in Chapter 10.

2.7 ⋆Meta-Theorems of First-Order Logic 57

2.7.1 Simplifying the Language of FOL

In preparation for the proofs, we simplify the language of FOL without los-ing expressiveness. Exercises 1.3 and 4.6 show that we have many redundantlogical connectives. We choose to use only the logical constant ⊤ and the con-nectives ¬ and ∧, from which the others can be constructed. Additionally, weneed only one quantifier since ∃x. F is equivalent to ¬∀x. ¬F . We choose ∀.

A second simplification is more involved. The goal is to remove constantand function symbols from the language by using predicate symbols instead.Given a formula F , let S be the set of function symbols appearing in it.Associate with each n-ary function symbol f of S a new (n+1)-ary predicatepf . Then for each occurrence of a function f in a literal L of F

L[f(t1, . . . , tn)] ,

replace L in F with the new formula

∃x. pf (t1, . . . , tn, x) ∧ L[x] .

After all replacements, the resulting formula G does not contain any functionsymbols.

The next step ensures that the new predicate pf describes a function f : itassociates with each tuple of domain values v1, . . . , vn precisely one value v.For each introduced predicate pf , construct the formula

If : ∀x. ∃y. pf (x, y) ∧ (∀z. pf (x, z) → y = z) .

Then construct the formula

H :

∧

f∈S

If

→ G .

The equality predicate = is not yet defined. To make = an equivalencerelation, assert that it is reflexive, symmetric, and transitive:

E :(∀x. x = x)∧ (∀x, y. x = y → y = x)∧ (∀x, y, z. x = y ∧ y = z → x = z)

Additionally, every predicate symbol p appearing in G should obey =:

Ep : ∀x, y. x = y → (p(x)↔ p(y))

Let T be the set of predicate symbols of G. Construct the final formula

F ′ :

E ∧∧

p∈T

Ep

→ H .


F ′ is valid iff F is valid. Moreover, F ′ does not contain any function symbols.For the special case of constant symbols, it is simpler to replace F [a] with

F ′ : ∀x. F [x].For the remainder of this section, we consider a version of FOL with only

the logical constant⊤, the connectives ¬ and ∧, the quantifier ∀, and predicatesymbols. It is equivalent in expressive power to the richer language studiedearlier in the chapter.

2.7.2 Semantic Argument Proof Rules

In this simplified form of FOL, we need only the following seven proof rules.

• For handling negation:

I |= ¬FI 6|= F

I 6|= ¬FI |= F

• For handling conjunction:

I |= F ∧G

I |= FI |= G

I 6|= F ∧G

I 6|= F | I 6|= G

• For handling universal quantification:

I |= ∀x. F

I ⊳ x 7→ v |= Ffor any v ∈ DI

and

I 6|= ∀x. F

I ⊳ x 7→ v 6|= Ffor a fresh v ∈ DI

• For deriving a contradiction:

J : I ⊳ · · · |= p(x1, . . . , xn)K : I ⊳ · · · 6|= p(y1, . . . , yn)I |= ⊥

for i ∈ 1, . . . , n, αJ [xi] = αK [yi]

Only the second rule for conjunction requires a case analysis.

2.7.3 Soundness and Completeness

That the semantic argument method is sound is fairly obvious for the first sixrules: each follows almost directly from the semantics of FOL. The final rulefor deriving a contradiction requires some explanation. The variants J andK are constructed only through the rules for handling quantification so thatthey simply assign values to the arguments of p. Hence, the truth value of pon the given tuple of domain values is already established by I. Furthermore,the disagreement between J and K on the truth value of p indicates that I isnot in fact an interpretation. Therefore, we have the following theorem.


Theorem 2.30 (Sound). If every branch of a semantic argument proof ofI 6|= F closes, then F is valid.

Completeness is more complicated. We want to show that there exists aclosed semantic argument proof of I 6|= F when F is valid. Our strategy is asfollows. We define a procedure for applying the proof rules. When applyingthe quantification rules, the procedure selects values from a predeterminedcountably infinite domain. We then show that when some falsifying interpre-tation I exists (such that I 6|= F ) our procedure constructs, at the limit, afalsifying interpretation. Therefore, F must be valid if the procedures actu-ally discovers an argument in which all branches are closed. We now proceedaccording to this proof plan.

Let D be a countably infinite domain of values v1, v2, v3, . . . which wecan enumerate in some fixed order. Start the semantic argument by placingI 6|= F at the root and marking it as unused. Now assume that the procedurehas constructed a partial semantic argument and that each line is marked aseither used or unused. We describe the next iteration.

Select the earliest line L : I |= G or L : I 6|= G in the argument thatis marked unused, and choose the appropriate proof rule to apply accordingto the root symbol of G’s parse tree. To apply a rule, add the appropriatedeductions at the end of every open branch that passes through line L; markeach new deduction as unused ; and mark L as used. The application of thenegation rules and the first conjunction rule is then straightforward. Applyingthe second (branching) conjunction rule introduces a fork at the end of everyopen branch, doubling the number of open branches. In applying the secondquantification rule, choose the next domain element vi that does not appearin the semantic argument so far. For the first quantification rule, assume thatG has the form ∀x. H . Choose the first value vi on which ∀x. H has not beeninstantiated in any ancestor of L. Additionally, consider I |= G as a second“deduction” of this rule (so that both I ⊳ x 7→ vi |= H and I |= G areadded to every branch passing through L and marked as unused). This trickguarantees that x of ∀x. H is instantiated on every domain element withoutpreventing the rest of the proof from progressing. Finally, close any branchthat has a contradiction resulting from a deduction in this iteration.

Recall that a semantic argument is finished if no further applications ofrules are possible. In our proof procedure, this situation occurs when all linesare marked as used. Although we can never construct a finished semanticargument with infinitely many lines in practice, we can reason about sucharguments. For example, such an argument has an infinitely long branch. Forsuppose not: then every branch has finite length, so there must be an infinitenumber of these finite branches. But such a situation requires a deductionstep that results in an infinite number of branches, whereas each proof ruleproduces at most two branches. This result is known as Konig’s Lemma. Wenext prove that each open branch of a finished semantic argument describesa falsifying interpretation.


Lemma 2.31. Each open branch of a finished semantic argument for I 6|= Fdefines a falsifying interpretation of F .

Proof. We apply structural induction on the formulae appearing in the branchto conclude that each line L : I |= G or L : I 6|= G holds, including I 6|= F . Inthat case, I is a falsifying interpretation of F . The technique of structuralinduction is defined in Section 4.4.

For the base case, consider lines in which G is an atom. As the contradictionrule is never applied (otherwise, the branch would be closed) no contradictionexists. Therefore, each instance I⊳· · · |= p(x1, . . . , xn) or I⊳· · · 6|= p(x1, . . . , xn)defines the truth value of p on one tuple of domain elements without contra-dicting any other definition on the branch. (For tuples of domain elementson which p is not explicitly defined on the branch, p may take any value, sayfalse.)

Consider when G is formed by applying a logical connective to one or twoformulae. As the procedure applied the appropriate proof rule, the inductivehypothesis and the semantics of the logical connectives tell us that L holds.For example, consider the case L : I |= F1 ∧ F2. Then I |= F1 and I |= F2

appear on the branch, and both lines hold by the inductive hypothesis. Thereader may verify the other logical connectives with similar reasoning.

Consider the case L : I 6|= ∀x. F . For L to hold, it must be the casethat for some fresh domain value v, M : I ⊳ x 7→ v 6|= F . But the procedureguarantees that M is a descendant of L. Moreover, F is a subformula of ∀x. F ,so the inductive hypothesis asserts that M holds. Then so does L.

Consider the case L : I |= ∀x. F . For L to hold, it must be the case thatfor all domain values v, M : I ⊳ x 7→ v |= F . But the procedure guaranteesthat such a line exists for every v. Moreover, F is a subformula of ∀x. F , sothe inductive hypothesis asserts that each such lines holds. Hence, so does L,finishing the proof.

Remark 2.32. The formulae that appear on an open branch of a finishedsemantic proof comprise a Hintikka set. The proof strategy that we employedis essentially that used in proving Hintikka’s Lemma, which asserts that aHintikka set is satisfiable.

Remark 2.33. We defined the procedure with a fixed countably infinite do-main in mind and then proved that an open branch of a finished semanticargument corresponds to at least one falsifying interpretation. Therefore, wehave proved an additional fact: every satisfiable FOL formula is satisfied by aninterpretation with a countable domain. This result is Lowenheim’s Theo-rem.

Theorem 2.34 (Complete). The semantic argument method is complete:each valid formula F has a semantic argument proof (in which every branchis closed).


Proof. Suppose that F is valid, yet no semantic argument proof exists. Thena finished semantic argument constructed according to our procedure has anopen branch. By Lemma 2.31, this branch describes a falsifying interpretationof F , a contradiction. Hence, all branches of a finished semantic argument mustin fact be closed (and thus finite). By Konig’s Lemma, the semantic argumentitself has finite size.

2.7.4 Additional Theorems

Is a countable (but possibly infinite) set S of satisfiable formulae simulta-neously satisfiable? That is, does there exist a single interpretation I thatsatisfies every member of S? The Compactness Theorem relates simulta-neous satisfiability of S to satisfiability of the conjunction of each finite subsetof S. Dually, we might consider whether the disjunction of a countable set offormulae is valid.

Theorem 2.35 (Compactness Theorem). A countable set of first-orderformulae S is simultaneously satisfiable iff the conjunction of every finite sub-set is satisfiable.

Proof. The forward direction is clear: if I simultaneously satisfies the membersof S, then it satisfies each finite conjunction.

For the other direction, extend the proof procedure of the previous sectionas follows. Arrange the members of S in some order F1, F2, F3, . . ., which ispossible because S is countable. Start the procedure with I 6|= ¬F1. At theend of each iteration of the procedure, choose the next formula Fi in thesequence and append I 6|= ¬Fi to every open branch, marking it as unused.Since each finite subset of S is satisfiable, at least one branch remains openand the procedure does not terminate.

A finished semantic argument constructed in this manner enumerates aninterpretation that falsifies every ¬Fi and thus satisfies every Fi. Hence, S issimultaneously satisfiable.

Remark 2.36. This proof proves an additional fact that extends Lowenheim’sTheorem: every simultaneously satisfiable countable set of FOL formulae issimultaneously satisfied by an interpretation with a countable domain. Thisresult is the Lowenheim-Skolem Theorem.

We apply the next theorem, the Craig Interpolation Lemma, in Chap-ter 10. It asserts that if F → G is valid, then there is a formula H (called aninterpolant) such that F → H and H → G are valid and whose predicatesand free variables occur in both F and G. The proof is constructive: it de-scribes a procedure for extracting the interpolant from a proof of the validityof F → G. However, proofs constructed via the proof rules of Section 2.7.2are not directly amenable to the interpolation procedure. We describe an al-ternate set of proof rules instead and show that any proof constructed from


the rules of 2.7.2 can be translated into a proof using the new rules. Then weprove the Craig Interpolation Lemma using these new proof rules.

One trick that will prove convenient is the following. Associate a freshvariable xi with each domain value vi introduced during the proof. Whenevera variant I ⊳ x 7→ vi is used, rename x to the variable xi corresponding tothe value vi in both the variant interpretation and the formula. This renamingdoes not affect the soundness of the proof, but it makes contradictions moreobvious.

The new rules are the following:

• For handling double negation:

I |= ¬¬FI |= F

I 6|= ¬¬FI 6|= F

• For handling conjunction:

I |= F ∧G

I |= FI |= G

I 6|= F ∧G

I 6|= F | I 6|= G

and

I |= ¬(F ∧G)

I |= ¬F | I |= ¬GI 6|= ¬(F ∧G)

I 6|= ¬FI 6|= ¬G

• For handling universal quantification:

I |= ∀x. F

I ⊳ x 7→ v |= F

I 6|= ¬∀x. F

I ⊳ x 7→ v 6|= ¬Ffor any v ∈ DI

and

I 6|= ∀x. F

I ⊳ x 7→ v 6|= F

I |= ¬∀x. F

I ⊳ x 7→ v |= ¬Ffor a fresh v ∈ DI

• For deriving a contradiction (recall our trick of renaming variables to cor-respond uniquely to domain values):

J : I ⊳ · · · |= p(x1, . . . , xn)K : I ⊳ · · · 6|= p(x1, . . . , xn)I |= ⊥

and

J : I ⊳ · · · |= p(x1, . . . , xn)K : I ⊳ · · · |= ¬p(x1, . . . , xn)I |= ⊥

J : I ⊳ · · · 6|= p(x1, . . . , xn)K : I ⊳ · · · 6|= ¬p(x1, . . . , xn)I |= ⊥


The important characteristic (for proving the interpolation lemma) of this setof proof rules is that premises and deductions agree on the use of |= or 6|=,except in the contradiction rules. In contrast, the negation rules of Section2.7.2 do not have this property. We obtained this property by folding eachnegation rule into every other rule.

Before proving the interpolation lemma, let us prove that the new semanticargument proof system based on these rules is sound and complete. Soundnessis fairly obvious; for completeness, we briefly describe how to map a proof fromthe system of Section 2.7.2 to a proof using these rules.

Lemma 2.37. Every proof in the proof system of Section 2.7.2 has a corre-sponding proof in the new proof system.

Proof. In constructing the new proof, ignore any use of the negation rulesof Section 2.7.2, instead choosing from the (doubled) set of conjunction andquantification rules depending on whether a ¬ is at the root of the parse treeof a formula. Use the new negation rules to remove double negations whennecessary. For deriving a contradiction, one of the three cases represented bythe contradiction rules must occur when a contradiction occurs in the originalproof.

We can now prove the theorem.

Theorem 2.38 (Craig Interpolation Lemma). If F → G is valid, thenthere exists a formula H such that F → H and H → G are valid and whosepredicates and free variables occur in both F and G.

Proof. We prove the result by describing a procedure that extracts from a(closed) semantic argument proof of the validity of F → G the interpolant H .For convenience, let the proof itself begin with the lines

1. I |= F assumption2. I 6|= G assumption

Notice that with the new set of proof rules, only |= rules will be appliedto deductions stemming from line 1, while only 6|= rules will be applied tothose stemming from line 2. The three contradiction rules correspond to threepossible situations: a contradiction between I |= F and I 6|= G (F → G isvalid), within I |= F itself (F is unsatisfiable), and within I 6|= G itself (G isvalid).

The procedure runs backwards through a proof. It associates with each lineL of the proof a set of positive formulae U and a set of negative formulae V .U consists of formulae on lines from which L descends (including itself) thatare satisfied by their interpretation (lines of the form K |= F1). V consists offormulae on lines from which L descends (including itself) that are falsifiedby their interpretation (lines of the form K 6|= F2). Define L’s characteristicformula as


∧U →

∨V ,

written as U → V for concision. That the branch on which L lies ends ina contradiction implies that U → V is valid. The procedure constructsfor each line an interpolant X of U → V ; that is, X is such that

∧U ⇒ X and X ⇒

∨V

and the predicates and free variables of X appear in both U and V . Theinterpolant of line 2 of the proof is the interpolant H that we seek.

Let us begin with the end of a branch, L : I |= ⊥. It must have beendeduced via a contradiction. If the first contradiction rule produced L, thenits characteristic formula is of the form

U, p(x1, . . . , xn),⊥ → V, p(x1, . . . , xn) ,

where the variable renaming trick ensures that the arguments to p are syn-tactically the same. Its parent has characteristic formula

U, p(x1, . . . , xn) → V, p(x1, . . . , xn) ,

and both have interpolant p(x1, . . . , xn). If the second contradiction rule pro-duced L, then its characteristic formula is of the form

U, p(x1, . . . , xn),¬p(x1, . . . , xn),⊥ → V ,

and its parent’s is of the form

U, p(x1, . . . , xn),¬p(x1, . . . , xn) → V .

Both have interpolant ⊥ (¬⊤ in the restricted language). Similarly, if the thirdcontradiction rule produced L, then the interpolant is ⊤.

Consider lines derived via the conjunction rules. Suppose L : I |= F isdeduced from I |= F ∧G. Then the characteristic formulae of L and its parentare

U, F ∧G, F → V and U, F ∧G → V ,

respectively. If L has interpolant X , then so does its parent. The case is similarfor a line L : I 6|= ¬F deduced from I 6|= ¬(F ∧G).

For the next conjunction rule, suppose that L : I 6|= F is deduced on onebranch from I 6|= F ∧G. Then L is at a fork in the proof and has sibling lineL′ : I 6|= G. The characteristic formulae of L, L′, and their parent are

U → V, F ∧G, F , U → V, F ∧G, G , and U → V, F ∧G ,

respectively. If L and L′ have interpolants X and Y , respectively, then theirparent has interpolant X ∧ Y .


Similarly, suppose L : I |= ¬F is deduced on one branch from I |= ¬(F∧G)(so that its sibling is L′ : I |= ¬G). If L and L′ have interpolants X and Y ,respectively, then their parent has interpolant X ∨ Y (¬(¬X ∧ ¬Y ) in therestricted language).

The interpolant X of a line L derived via a double-negation rule passesdirectly to its parent, for the characteristic formula of L simply has a repetition¬¬F of a formula F that the parent’s characteristic formula does not have.

We turn to the quantification rules. Consider a line L : I ⊳ z 7→ v 6|= Fderived from I 6|= ∀x. F . L and its parent M have characteristic formulae

U → V, ∀x. F, F and U → V, ∀x. F ,

respectively. Moreover, v is fresh, and thus z does not appear in either U orV according to our trick. Hence, z cannot occur free in L’s interpolant X . Itthus follows that

∀ ∗ . ∀z. X → V ∨ ∀x.F ∨ F

is equivalent to

∀ ∗ . X → V ∨ ∀x.F ∨ ∀z. F

and thus to

∀ ∗ . X → V ∨ ∀x.F .

Therefore, X is an interpolant of M . Similarly, X is an interpolant of theparent of L : I ⊳ z 7→ v |= ¬F deduced from I |= ¬∀x. F .

Consider L : I ⊳z 7→ v |= F with interpolant X derived from I |= ∀x. F ,where z is not necessarily fresh. The characteristic formulae of L and its parentM are

U, ∀x. F, F → V and U, ∀x. F → V ,

respectively. Clearly,

U ∧ ∀x. F ∧ F ⇒ X

implies that

U ∧ ∀x. F ⇒ X .

Hence, X is an interpolant of M when z is not free in U or V (so that z is notfree in X) and when it is free in both. However, if z is free in V but not in U ,then X is not an interpolant of M . But ∀z. X is an interpolant. In particular,we have

U ∧ ∀x. F ⇒ ∀z. X


and

∀z. X ⇒ V because ∀z. X ⇒ X and X ⇒ V .

For the final case, suppose that L : I ⊳ z 7→ v 6|= ¬F is deduced fromI 6|= ¬∀x. F . The characteristic formula of L is

U → V,¬∀x. F, F .

Then X is the interpolant of the parent M unless z is free in U but not freein V . In the latter case, the interpolant is ∃z. X (¬∀z. ¬X in the restrictedlanguage). The reasoning is similar to the previous case, completing the proof.

2.8 Summary

Building on the presentation of PL in Chapter 1, this chapter introduces first-order logic (FOL). It covers:

• Its syntax. How one constructs a FOL formula. Variables, terms, functionsymbols, predicate symbols, atoms, literals, logical connectives, quantifiers.

• Its semantics. What a FOL formula means. Truth values true and false.Interpretations: domain and assignments. Difference between a function(predicate) symbol and a function (predicate) over a domain.

• Satisfiability and validity. Whether a FOL formula evaluates to true underany or all interpretations. Semantic argument method.

• Substitution, which is a tool for manipulating formulae and making generalclaims. Safe and schema substitutions. Substitution of equivalent formulae.Valid schemata.

• Normal forms. A normal form is a set of syntactically restricted formulaesuch that every FOL formula is equivalent to some member of the set.

• A review of decidability and complexity theory, which provides the conceptsnecessary for discussing decidability and complexity questions in logic.

• Meta-theorems. Semantic argument method is sound and complete. Com-pactness Theorem. Craig Interpolation Lemma.

The results of Section 2.7 are the groundwork for our theoretical treatmentof the Nelson-Oppen combination method in Chapter 10.

FOL is the most general logic that is discussed in this book. Its applicationsinclude software and hardware design and analysis, knowledge representation,and complexity and decidability theory.

FOL is a complete logic: every valid FOL formula has a proof in the se-mantic argument method. However, validity is undecidable. Many applicationsbenefit from complete automation, which is impossible when considering allof FOL. Therefore, Chapter 3 introduces first-order theories, which formal-ize interesting structures, such as integers, rationals, lists, stacks, and arrays.Part II of this book explores algorithms for reasoning within these theories.

Exercises 67


For a complete and concise presentation of propositional and first-order logic,see Smullyan’s text First-Order Logic [87]. The semantic argument methodis similar to Smullyan’s tableau method. Also, the proofs of completeness ofthe semantic argument method, the Compactness Theorem, and the CraigInterpolation Lemma are inspired by Smullyan’s presentation.

The history of the development of mathematical logic is rich. For anoverview, see [98] and related articles in The Stanford Encyclopedia of Phi-losophy. We mention in particular Hilbert’s program of the 1920s — see, forexample, [38] — to find a consistent and complete axiomatization of arith-metic. Godels two incompleteness theorems proved that such a goal is impos-sible. The first incompleteness theorem, which Godel presented in a lecturein September, 1930, and then in [36], states that any axiomatization of arith-metic contains theorems that are not provable within the theory. The second,which Godel had proved by October, 1930, states that a theory such as Peanoarithmetic cannot prove its own consistency unless it is itself inconsistent.Earlier, Godel proved that first-order logic is complete [35]: every theoremhas a proof. However, Church — and, independently, Turing — proved thatsatisfiability in first-order logic is undecidable [13]. Thus, while every theoremof first-order logic has a finite proof, invalid formulae need not have a finiteproof of their invalidity.

For an introduction to formal languages, decidability, and complexity the-ory, see [85, 72, 41].

Exercises

2.1 (English and FOL). Encode the following English sentences into FOL.

(a) Some days are longer than others.(b) In all the world, there is but one place that I call home.(c) My mother’s mother is my grandmother.(d) The intersection of two convex sets is convex.

2.2 (FOL validity & satisfiability). For each of the following FOL formu-lae, identify whether it is valid or not. If it is valid, prove it with a semanticargument; otherwise, identify a falsifying interpretation.

(a) (∀x, y. p(x, y) → p(y, x)) → ∀z. p(z, z)(b) ∀x, y. p(x, y) → p(y, x) → ∀z. p(z, z)(c) (∃x. p(x)) → ∀y. p(y)(d) (∀x. p(x)) → ∃y. p(y)(e) ∃x, y. (p(x, y) → (p(y, x) → ∀z. p(z, z)))

2.3 (Semantic argument). Use the semantic argument method to prove thefollowing formula schemata.


(a) ¬(∀x. F ) ⇔ ∃x. ¬F(b) ¬(∃x. F ) ⇔ ∀x. ¬F(c) ∀x, y. F ⇔ ∀y, x. F(d) ∃y. ∀x. F ⇒ ∀x. ∃y. F(e) ∃x. F ∨G ⇔ (∃x. F ) ∨ (∃y. G)(f) ∃x. F → G ⇔ (∀x. F ) → (∃x. G)(g) ∃x. F ∨G ⇔ (∃x. F ) ∨ G, provided x 6∈ free(G)(h) ∀x. F ∨G ⇔ (∀x. F ) ∨ G, provided x 6∈ free(G)(i) ∃x. F ∧G ⇔ (∃x. F ) ∧ G, provided x 6∈ free(G)(j) ∀x. F → G ⇔ (∃x. F ) → G, provided x 6∈ free(G)

2.4 (Normal forms). Put the following formulae into prenex normal form.

(a) (∀x. ∃y. p(x, y)) → ∀x. p(x, x)(b) ∃z. (∀x. ∃y. p(x, y)) → ∀x. p(x, z)(c) ∀w. ¬(∃x, y. ∀z. p(x, z) → q(y, z)) ∧ ∃z. p(w, z)

2.5 (⋆Characteristic formula). Why is the characteristic formula of a lineon a closed branch of a semantic argument valid?

3

First-Order Theories

Formalization works as “an early-warning system” when things aregetting contorted.

— Edsger W. DijkstraEWD764: Repaying Our Debts, 1980

When reasoning in particular application domains such as software orhardware, one often has particular structures in mind. For example, programsmanipulate numbers, lists, and arrays. First-order theories formalize thesestructures to enable reasoning about them. This chapter introduces first-ordertheories in general and then focuses on theories useful in verification and re-lated tasks. These theories include a theory of equality, of integers, of rationalsand reals, of recursive data structures, and of arrays.

There is another reason to study first-order theories. While validity in FOLis undecidable, validity in particular theories or fragments of theories is some-times decidable. Many of the theories studied in this chapter have importantfragments for which validity is efficiently decidable. For each theory, we iden-tify the decidable and efficiently decidable fragments, which we summarize inSection 3.7. Part II studies decision procedures for the decidable fragments.

3.1 First-Order Theories

A first-order theory T is defined by the following components.

1. Its signature Σ is a set of constant, function, and predicate symbols.2. Its set of axiomsA is a set of closed FOL formulae in which only constant,

function, and predicate symbols of Σ appear.

A Σ-formula is constructed from constant, function, and predicate symbolsof Σ, as well as variables, logical connectives, and quantifiers. As usual, thesymbols of Σ are just symbols without prior meaning. The axioms A providetheir meaning.

70 3 First-Order Theories

A Σ-formula F is valid in the theory T , or T -valid, if every interpre-tation I that satisfies the axioms of T ,

I |= A for every A ∈ A , (3.1)

also satisfies F : I |= F . For this reason, we write

T |= F

to mean that F is T -valid. Formally, the theory T consists of all (closed)formulae that are T -valid. We call an interpretation satisfying (3.1) a T -interpretation.

A Σ-formula F is satisfiable in T , or T -satisfiable, if there is a T -interpretation I that satisfies F .

A theory T is complete if for every closed Σ-formula F , T |= F orT |= ¬F . A theory is consistent if there is at least one T -interpretation. Inparticular, in a consistent theory T , there does not exist a Σ-formula F suchthat both T |= F and T |= ¬F . (Otherwise, by the semantics of conjunction,T |= F ∧ ¬F and thus T |= ⊥; but ⊥ is not satisfied by any interpretation.)

Concepts from general FOL validity carry over to first-order theories inthe natural way. For example, two formulae F1 and F2 are equivalent in T ,or T -equivalent, if T |= F1 ↔ F2: for every T -interpretation I, I |= F1 iffI |= F2.

A fragment of a theory is a syntactically-restricted subset of formulaeof the theory. For example, the quantifier-free fragment of a theory T isthe set of formulae without quantifiers that are valid in T . Recall our con-vention that non-closed formula F is valid iff its universal closure is valid.Technically, the “quantifier-free fragment” of T actually consists of valid for-mulae in which all variables are universally quantified. However, the term“quantifier-free fragment” is the common and accepted name for this frag-ment. Subsequent chapters show that the quantifier-free fragments of theoriesare of great practical and theoretical importance.

A theory T is decidable if T |= F is decidable for every Σ-formula F . Thatis, there is an algorithm that always terminates with “yes” if F is T -valid orwith “no” if F is T -invalid. A fragment of T is decidable if T |= F is decidablefor every Σ-formula F that obeys the fragment’s syntactic restrictions.

The union T1 ∪ T2 of two theories T1 and T2 has signature Σ1 ∪ Σ2 andaxioms A1∪A2. Clearly, a (T1∪T2)-interpretation is both a T1-interpretationand a T2-interpretation since it satisfies the axioms of both T1 and T2. Hence,a formula that is T1-valid or T2-valid is (T1 ∪ T2)-valid, while a formula thatis (T1 ∪ T2)-satisfiable is both T1-satisfiable and T2-satisfiable.

Because FOL (the “empty” theory, or the theory without axioms) is unde-cidable in general, we must turn to theories and fragments of theories for thepossibility of fully automated reasoning. While many interesting theories areundecidable, there are several important theories and fragments of theoriesthat are decidable. These theories and fragments are the main subject of Part

3.2 Equality 71

II of this book. We introduce them in the following sections. In Section 3.7,we summarize the decidability and complexity results for these theories andfragments.

3.2 Equality

The theory of equality TE is the simplest first-order theory. Its signature

ΣE : =, a, b, c, . . . , f, g, h, . . . , p, q, r, . . .

consists of

• = (equality), a binary predicate;• and all constant, function, and predicate symbols.

Equality = is an interpreted predicate symbol: its meaning is defined viathe axioms of TE. The other constant, function, and predicate symbols areuninterpreted except as they relate to equality. The axioms of TE are thefollowing:

1. ∀x. x = x (reflexivity)2. ∀x, y. x = y → y = x (symmetry)3. ∀x, y, z. x = y ∧ y = z → x = z (transitivity)4. for each positive integer n and n-ary function symbol f ,

∀x, y.

(n∧

i=1

xi = yi

)→ f(x) = f(y) (function congruence)

5. for each positive integer n and n-ary predicate symbol p,

∀x, y.

(n∧

i=1

xi = yi

)→ (p(x)↔ p(y)) (predicate congruence)

The notation x stands for the list of variables x1, . . . , xn. Axioms (function con-gruence) and (predicate congruence) are actually axiom schemata. An axiomschema stands for a set of axioms, each an instantiation of the parameters (fand p in (function congruence) and (predicate congruence), respectively). Forexample, for binary function symbol f2, (function congruence) instantiates tothe following axiom:

∀x1, x2, y1, y2. x1 = y1 ∧ x2 = y2 → f2(x1, x2) = f2(y1, y2) .

The first three axioms state that = is an equivalence relation: it is abinary predicate that obeys reflexivity, symmetry, and transitivity. The finaltwo axiom schemata formalize our intuition for the behavior of functions andpredicates under equality. A function (predicate) always evaluates to the same


value (truth value) for a given set of argument values. They assert that = isa congruence relation.

TE is just as undecidable as full FOL because it allows all constant, func-tion, and predicate symbols. In particular, any FOL formula F can be en-coded as a ΣE-formula F ′ simply by replacing occurrences of the symbol =with a fresh symbol. Since = does not occur in this transformed formula F ′,the axioms of TE are irrelevant; hence, F ′ is TE-satisfiable iff F ′ is first-ordersatisfiable.

However, the quantifier-free fragment of TE is both interesting and effi-ciently decidable, as we show in Chapter 9.

Example 3.1. Without quantifiers, free variables and constants play thesame role. In the formula

F : a = b ∧ b = c → g(f(a), b) = g(f(c), a) ,

a, b, and c are constants, while in

F ′ : x = y ∧ y = z → g(f(x), y) = g(f(z), x) ,

x, y, and z are free variables. F is TE-valid iff F ′ is TE-valid; F is TE-satisfiableiff F ′ is TE-satisfiable.

It is often useful to reason about the T -satisfiability or T -validity of a Σ-formula F in a structured but informal way. We show how to use the semanticargument method with TE.


F : a = b ∧ b = c → g(f(a), b) = g(f(c), a)

is TE-valid, assume otherwise: there exists a TE-interpretation I such thatI 6|= F :

1. I 6|= F assumption2. I |= a = b ∧ b = c 1, →3. I 6|= g(f(a), b) = g(f(c), a) 1, →4. I |= a = b 2, ∧5. I |= b = c 2, ∧6. I |= a = c 4, 5, (transitivity)7. I |= f(a) = f(c) 6, (function congruence)8. I |= b = a 4, (symmetry)9. I |= g(f(a), b) = g(f(c), a) 7, 8 (function congruence)10. I |= ⊥ 3, 9

Our assumption is apparently false: F is TE-valid.

3.3 Natural Numbers and Integers 73

3.3 Natural Numbers and Integers

Arithmetic involving the addition and multiplication of the natural numbersN = 0, 1, 2, . . . is perhaps the oldest of mathematical theories. In this sectionwe describe three theories of arithmetic. Peano arithmetic allows additionand multiplication over natural numbers, while Presburger arithmetic isrestricted to addition over natural numbers. The final theory, the theory ofintegers, is convenient for automated reasoning but is no more expressivethan Presburger arithmetic.

3.3.1 Peano Arithmetic

The theory of Peano arithmetic TPA, or first-order arithmetic, hassignature

ΣPA : 0, 1, +, ·, = ,

where

• 0 and 1 are constants;• + (addition) and · (multiplication) are binary functions;• and = (equality) is a binary predicate.

Its axioms are the following:

1. ∀x. ¬(x + 1 = 0) (zero)2. ∀x, y. x + 1 = y + 1 → x = y (successor)3. F [0] ∧ (∀x. F [x]→ F [x + 1]) → ∀x. F [x] (induction)4. ∀x. x + 0 = x (plus zero)5. ∀x, y. x + (y + 1) = (x + y) + 1 (plus successor)6. ∀x. x · 0 = 0 (times zero)7. ∀x, y. x · (y + 1) = x · y + x (times successor)

These axioms concisely define addition, multiplication, and equality over nat-ural numbers. Informally, axioms (zero), (plus zero), and (times zero) define0 as we understand it: it is the minimal element of the natural numbers; itis the identity for addition (x + 0 = x); and under multiplication, it mapsany number to 0 (x · 0 = 0). Axioms (zero), (successor), (plus zero), and (plussuccessor) define addition. Axioms (times zero) and (times successor) definemultiplication: in particular, (times successor) defines multiplication in termsof addition.

(induction) is an axiom schema: it stands for the set of axioms obtainedby substituting for F each ΣPA-formula that has precisely one free variable.It asserts that every TPA-interpretation I obeys induction: if I satisfies F [0]and ∀x. F [x]→ F [x + 1], then I also satisfies ∀x. F [x].

For convenience, we usually do not write the “·” for multiplication. Forexample, we write xy rather than x · y.


The intended interpretations of TPA have domain N and assignmentsαI defining 0, 1, +, ·, and = as we understand them in everyday arithmetic.In particular,

• αI [0] is 0N: αI maps the symbols “0” to 0N ∈ N;• αI [1] is 1N: αI maps the symbols “1” to 1N ∈ N;• αI [+] is +N, addition over N;• αI [·] is ·N, multiplication over N;• αI [=] is =N, equality over N.

Example 3.3. The formula 3x + 5 = 2y can be written using the signatureΣPA as

x + x + x + 1 + 1 + 1 + 1 + 1 = y + y

or as

(1 + 1 + 1) · x + 1 + 1 + 1 + 1 + 1 = (1 + 1) · y .

In practice, we use the abbreviated notation 3x + 5 = 2y.

Example 3.4. Rather than augmenting TPA with axioms defining inequality>, we can transform formulae with inequality into formulae over the restrictedsignature ΣPA. Write

3x + 5 > 2y as ∃z. z 6= 0 ∧ 3x + 5 = 2y + z ,

where z 6= 0 abbreviates ¬(z = 0). The latter formula is a ΣPA-formula. Weakinequality can be similarly transformed. Write

3x + 5 ≥ 2y as ∃z. 3x + 5 = 2y + z .

Example 3.5. The ΣPA-formula

∃x, y, z. x 6= 0 ∧ y 6= 0 ∧ z 6= 0 ∧ xx + yy = zz

is TPA-valid. It asserts that there exists a triple of positive integers fulfillingthe Pythagorean Theorem. The formula

∃x, y, z. x 6= 0 ∧ y 6= 0 ∧ z 6= 0 ∧ xxx + yyy = zzz

is the cubic analogue. For constant n, let xn represent n multiplications of x;then every formula of the set

∀x, y, z. x 6= 0 ∧ y 6= 0 ∧ z 6= 0 → xn + yn 6= zn : n > 2 ∧ n ∈ Z

is TPA-valid, as claimed by Fermat’s Last Theorem and proved by AndrewWiles in 1994.


Remark 3.6. Godel’s first incompleteness theorem (see Bibliographic Re-marks of Chapter 2) implies that Peano arithmetic TPA does not capture truearithmetic: there exist closed ΣPA-formulae representing valid propositions ofnumber theory that are TPA-invalid. Godel’s proof constructs such a formula:it encodes the assertion that the formula itself cannot be proved. Now, eitherthis formula can be proved from the axioms of TPA (contradicting itself so thatTPA is inconsistent) or it cannot be proved (so that TPA is incomplete).

Satisfiability and validity in TPA is undecidable. Therefore, we turn to amore restricted theory of arithmetic that does not allow multiplication.

3.3.2 Presburger Arithmetic

The theory of Presburger arithmetic TN has signature

ΣN : 0, 1, +, = ,

where

• 0 and 1 are constants;• + (addition) is a binary function;• and = (equality) is a binary predicate.

Its axioms are a subset of the axioms of TPA:

1. ∀x. ¬(x + 1 = 0) (zero)2. ∀x, y. x + 1 = y + 1 → x = y (successor)3. F [0] ∧ (∀x. F [x]→ F [x + 1]) → ∀x. F [x] (induction)4. ∀x. x + 0 = x (plus zero)5. ∀x, y. x + (y + 1) = (x + y) + 1 (plus successor)

Again, (induction) is an axiom schema standing for the set of axioms obtainedby replacing F with each ΣN-formula that has precisely one free variable.

The intended interpretations of TN have domain N and are such that

• αI [0] is 0N ∈ N;• αI [1] is 1N ∈ N;• αI [+] is +N, addition over N;• αI [=] is =N, equality over N.

How does one reason about all integers, Z = . . . ,−2,−1, 0, 1, 2, . . .? Suchformulae can be encoded as ΣN-formulae.


F0 : ∀w, x. ∃y, z. x + 2y − z − 13 > −3w + 5 ,

where − is meant to be interpreted as standard subtraction, and w, x, y, andz are intended to range over Z. The formula


F1 :∀wp, wn, xp, xn. ∃yp, yn, zp, zn.

(xp − xn) + 2(yp − yn)− (zp − zn)− 13 > −3(wp − wn) + 5

introduces two variables, vp and vn, for each variable v of F0. While each of vp

and vn can only range over N, vp−vn should range over the integers. But howis − interpreted? Moving negated terms to the other side of the inequalityeliminates −:

F2 :∀wp, wn, xp, xn. ∃yp, yn, zp, zn.

xp + 2yp + zn + 3wp > xn + 2yn + zp + 13 + 3wn + 5 .

The final transformation eliminates constant coefficients and strict inequality:

F3 :

∀wp, wn, xp, xn. ∃yp, yn, zp, zn. ∃u.¬(u = 0) ∧xp + yp + yp + zn + wp + wp + wp

= xn + yn + yn + zp + wn + wn + wn + u+ 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1+ 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 .

Presburger showed in 1929 that TN is decidable. Therefore, the “theoryof (negative and positive) integers” that we loosely constructed above is alsodecidable via the syntactic rewriting of formulae into ΣN-formulae. Ratherthan using this cumbersome rewriting, however, we next study a theory ofintegers.

3.3.3 Theory of Integers

The theory of integers TZ has signature

ΣZ : . . . ,−2,−1, 0, 1, 2, . . . ,−3·,−2·, 2·, 3·, . . . , +, −, =, > ,

where

• . . . , −2, −1, 0, 1, 2, . . . are constants, intended to be assigned the obviouscorresponding values in the intended domain of integers Z;

• . . . , −3·, −2·, 2·, 3·, . . . are unary functions, intended to represent con-stant coefficients (e.g., 2 · x, abbreviated 2x);

• + and − are binary functions, intended to represent the obvious corre-sponding functions over Z;

• = and > are binary predicates, intended to represent the obvious corre-sponding predicates over Z.

Since Example 3.7 shows that ΣZ-formulae can be reduced to ΣN-formulae, wedo not axiomatize TZ. TZ is merely a convenient representation for reasoningabout addition over all integers.


The intended interpretations of TZ have domain Z and are such that αI

assigns the obvious values, functions, and predicates to the constant, function,and predicate symbols of ΣZ.

In Chapter 7, we discuss Cooper’s decision procedure for deciding TZ-validity, while in Chapter 8, we discuss decision procedures for the quantifier-free fragment of TZ. These procedures decide TN-validity as well: the followingexample illustrates that ΣN-formulae can be reduced to ΣZ-formulae.

Example 3.8. To decide the TN-validity of

∀x. ∃y. x = y + 1 ,

decide the TZ-validity of

∀x. x ≥ 0 → ∃y. y ≥ 0 ∧ x = y + 1 ,

where t1 ≥ t2 expands to t1 = t2 ∨ t1 > t2.

We prove the validity of several ΣZ-formulae using the semantic argumentmethod. Our application of the semantic argument method in this context isinformal; it is intended to allow us to argue intuitively about validity untilChapter 7.

Example 3.9. To prove that the ΣZ-formula

F : ∀x, y, z. x > z ∧ y ≥ 0 → x + y > z .

is TZ-valid, assume otherwise: there is a TZ-interpretation I such that I 6|= F :

1. I 6|= F assumption2. I1 : I ⊳ x 7→ v1 ⊳ y 7→ v2 ⊳ z 7→ v3

6|= x > z ∧ y ≥ 0 → x + y > z 1, ∀3. I1 |= x > z ∧ y ≥ 0 2, →4. I1 6|= x + y > z 2, →5. I1 |= ¬(x + y > z) 4, ¬

We derive a contradiction by collecting formulae from lines 3 and 5, applyingthe variant interpretation I1, and querying the theory TZ: are there integersv1, v2, v3 such that

v1 > v3 ∧ v2 ≥ 0 ∧ ¬(v1 + v2 > v3) ?

No, for v1 > v3 ∧ v2 ≥ 0 implies v1 + v2 > v3. We summarize this reasoningin TZ with the line

6. I1 |= ⊥ 3, 5, TZ

Therefore, F is TZ-valid.


Arguing the validity of arithmetic formulae at the level of axioms is te-dious. Therefore, unlike in Example 3.2 in which the theory-specific reasoningis incorporated into the semantic argument method by applying and statingspecific axioms of TE, the semantic argument method for TZ handles the “log-ical” aspects of the structured reasoning, while a separate informal argumentreasons about the theory-specific elements.

Example 3.10. To prove that the ΣZ-formula

F : ∀x, y. x > 0 ∧ (x = 2y ∨ x = 2y + 1) → x− y > 0

is TZ-valid, assume otherwise: there is a TZ-interpretation I such that I 6|= F :

1. I 6|= F assumption2. I1 : I ⊳ x 7→ v1 ⊳ y 7→ v2

6|= x > 0 ∧ (x = 2y ∨ x = 2y + 1) → x− y > 01, ∀

3. I1 |= x > 0 ∧ (x = 2y ∨ x = 2y + 1) 2, →4. I1 |= x > 0 3, ∧5. I1 |= x = 2y ∨ x = 2y + 1 3, ∧6. I1 6|= x− y > 0 2, →7. I1 |= ¬(x − y > 0) 6, ¬

There are two cases to consider. In the first case,

8a. I1 |= x = 2y 5, ∨

We collect the formulae of lines 4, 7, and 8a, apply the variant interpretationI1, and query the theory TZ: are there integers v1, v2 such that

v1 > 0 ∧ v1 = 2v2 ∧ ¬(v1 − v2 > 0) ?

No, for substituting v1 = 2v2 throughout produces

2v2 > 0 ∧ ¬(2v2 − v2 > 0) ,

which simplifies to

v2 > 0 ∧ ¬(v2 > 0) ,

a contradiction. This reasoning is summarized by

9a. I1 |= ⊥ 4, 7, 8a, TZ

In the second case,

8b. I1 |= x = 2y + 1 5, ∨

Considering lines 4, 7, and 8b, are there integers v1, v2 such that

3.4 Rationals and Reals 79

v1 > 0 ∧ v1 = 2v2 + 1 ∧ ¬(v1 − v2 > 0) ?

No, for substituting v1 = 2v2 + 1 throughout produces

2v2 + 1 > 0 ∧ ¬(2v2 + 1− v2 > 0) ,

which simplifies to

2v2 + 1 > 0 ∧ ¬(v2 + 1 > 0) .

The first literal holds only when v2 > −1, while the second holds only whenv2 ≤ −1, a contradiction. This reasoning is summarized by

9b. I1 |= ⊥ 4, 7, 8b, TZ

Thus, F is TZ-valid.

3.4 Rationals and Reals

Almost as old as arithmetic on integers is arithmetic on the rational numbersQ and (not quite as old) on the real numbers R. In this section, we describetwo theories of real arithmetic. The latter theory can also be seen as a theoryof rational arithmetic.

The first theory is the theory of reals, involving addition and multiplica-tion over R; it is also known as elementary algebra. The term “elementary”refers to the restriction that variables range only over domain elements (as inall first-order theories), not over sets or functions of domain elements. Mostjunior high students are familiar with elementary algebra. As a first-ordertheory, elementary algebra is of course more complex since formulae are con-structed with logical connectives and quantifiers.

The second theory is the theory of addition over R or Q. Interpretationswith domains of R are indistinguishable from interpretations with domains ofQ, as we discuss below. For this reason, we call this second theory the theoryof rationals.

Example 3.11. Let us distinguish informally between the theories of integers(with only addition), reals (with addition and multiplication), and rationals(with only addition).

In the theory of integers,

F : ∃x. 2x = 7

is TZ-invalid. However, assigning to x the rational number 72 satisfies 2x = 7,

so F should be satisfiable in the theory of rationals. Moreover, 72 is also a real

number, so F should also be satisfiable in the theory of reals.The theory of reals includes multiplication, allowing a formula like


G : ∃x. x2 = 2

to be expressed, where x2 abbreviates x · x. G should be valid in the theoryof reals because assigning to x the real number

√2 satisfies x2 = 2.

√2 is

irrational.

3.4.1 Theory of Reals

The theory of reals TR, or elementary algebra, has signature

ΣR : 0, 1, +, −, ·, =, ≥ ,

where

• 0 and 1 are constants;• + (addition) and · (multiplication) are binary functions;• − (negation) is a unary function;• and = (equality) and ≥ (weak inequality) are binary predicates.

TR has the most complex axiomatization of the theories that we study. Wegroup axioms by their mathematical content.

First are the axioms of an abelian group. An abelian group is a structurewith additive identity 0, associative and commutative addition +, additiveinverse −, and equality =. The qualifier “abelian” simply means that additionis commutative. The axioms are the following:

1. ∀x, y, z. (x + y) + z = x + (y + z) (+ associativity)2. ∀x. x + 0 = x (+ identity)3. ∀x. x + (−x) = 0 (+ inverse)4. ∀x, y. x + y = y + x (+ commutativity)

The first three axioms are the axioms of a group.Second are the additional axioms of a ring. A ring is an abelian group with

a multiplicative identity 1 and associative multiplication · that distributes overaddition. For convenience, we usually shorten x · y to xy.

1. ∀x, y, z. (xy)z = x(yz) (· associativity)2. ∀x. x1 = x (· left identity)3. ∀x. 1x = x (· right identity)4. ∀x, y, z. x(y + z) = xy + xz (left distributivity)5. ∀x, y, z. (x + y)z = xz + yz (right distributivity)

Both left and right identity and distributivity axioms are required since · isnot commutative (yet). It is made so in the next set of axioms.

Third are the additional axioms of a field. In a field, · is commutative;the additive and multiplicative identities are different; and the multiplicativeinverse of a non-0 value exists (e.g., 1

2 is the multiplicative inverse of 2).

1. ∀x, y. xy = yx (· commutativity)


2. 0 6= 1 (separate identities)3. ∀x. x 6= 0 → ∃y. xy = 1 (· inverse)

The axiom (· commutativity) makes the (· right identity) and (right distributivity)axioms redundant.

Fourth are the additional axioms characterizing ≥ as a total order.

1. ∀x, y. x ≥ y ∧ y ≥ x → x = y (antisymmetry)2. ∀x, y, z. x ≥ y ∧ y ≥ z → x ≥ z (transitivity)3. ∀x, y. x ≥ y ∨ y ≥ x (totality)

Finally are the additional axioms of a real closed field.

1. ∀x, y, z. x ≥ y → x + z ≥ y + z (+ ordered)2. ∀x, y. x ≥ 0 ∧ y ≥ 0 → xy ≥ 0 (· ordered)3. ∀x. ∃y. x = y2 ∨ x = −y2 (square-root)4. for each odd integer n,

∀x. ∃y. yn + x1yn−1 + · · ·+ xn−1y + xn = 0 (at least one root)

We again abbreviate x1, . . . , xn by x. By yn, we mean y multiplied by itselfn times: y · · · y. The axioms (+ ordered) and (· ordered) assert that everyTR-interpretation is an ordered field. The axiom (square-root) asserts theexistence of the square-root of every value. The final axiom schema statesthat polynomials of odd degree have at least one root.

Putting all axioms together and pruning redundant axioms, we have:


4. ∀x, y, z. (x + y) + z = x + (y + z) (+ associativity)5. ∀x. x + 0 = x (+ identity)6. ∀x. x + (−x) = 0 (+ inverse)7. ∀x, y. x + y = y + x (+ commutativity)8. ∀x, y, z. x ≥ y → x + z ≥ y + z (+ ordered)

9. ∀x, y, z. (xy)z = x(yz) (· associativity)10. ∀x. 1x = x (· identity)11. ∀x. x 6= 0 → ∃y. xy = 1 (· inverse)12. ∀x, y. xy = yx (· commutativity)13. ∀x, y. x ≥ 0 ∧ y ≥ 0 → xy ≥ 0 (· ordered)

14. ∀x, y, z. x(y + z) = xy + xz (distributivity)15. 0 6= 1 (separate identities)

16. ∀x. ∃y. x = y2 ∨ −x = y2 (square-root)


17. for each odd integer n,

∀x. ∃y. yn + x1yn−1 + · · ·+ xn−1y + xn = 0 (at least one root)

Example 3.12. The method of quantifier elimination, which we studyin Chapter 7, eliminates quantifiers from a formula to produce an equivalentquantifier-free formula. If a formula F contains free variables, then a quantifierelimination procedure produces an equivalent quantifier-free formula F ′ suchthat free(F ′) ⊆ free(F ). For example, when is the formula

F : ∃x. ax2 + bx + c = 0

satisfiable? That is, what are the conditions on a, b, and c such that a quadraticpolynomial has a real root? Recall that the discriminant must be nonnegative:

F ′ : b2 − 4ac ≥ 0 .

F ′ is the quantifier-free formula that is TR-equivalent to F .

Tarski proved that TR was decidable in the 1930s, although the SecondWorld War prevented his publishing the result until 1956. Collins proposed themore efficient technique of cylindrical algebraic decomposition (CAD) in1975. Unfortunately, even the most efficient decision procedures for TR have

prohibitively high time complexity: CAD runs in time proportionate to 22k|F |

,for some constant k and for |F | the length of ΣR-formula F .

3.4.2 Theory of Rationals

Given the high complexity of deciding TR-validity (and the high intellectualcomplexity of Tarski’s and subsequent decision procedures for TR), we turn toa simpler theory without multiplication, the theory of rationals TQ. It hassignature

ΣQ : 0, 1, +, −, =, ≥ ,

where

• 0 and 1 are constants;• + (addition) is a binary function;• − (negation) is a unary function;• and = (equality) and ≥ (weak inequality) are binary predicates.




4. ∀x, y, z. (x + y) + z = x + (y + z) (+ associativity)5. ∀x. x + 0 = x (+ identity)6. ∀x. x + (−x) = 0 (+ inverse)7. ∀x, y. x + y = y + x (+ commutativity)8. ∀x, y, z. x ≥ y → x + z ≥ y + z (+ ordered)

9. for each positive integer n,

∀x. nx = 0 → x = 0 (torsion-free)

10. for each positive integer n,

∀x. ∃y. x = ny (divisible)

By nx we mean x added to itself n times: x + · · ·+ x. The first eight axiomsare a subset of the axioms of TR. They state that every TQ-interpretation isan ordered abelian group. ≥ is a total order by the first three axioms. Identity0, addition +, additive inverse −, and equality = comprise an abelian groupby the next four axioms. The eighth axiom asserts that the abelian group isordered.

The axiom schema (torsion-free) states that only 0 can be added to itself toproduce 0. The name “torsion-free” comes from the following mathematicalcontext. In a group, the order of an element v is the integer n such that nvis the identity element 0: nv = 0. If no such n exists, then the element v hasinfinite order. A group is torsion-free if the only element with finite order isthe identity 0.

Finally, the axiom schema (divisible) asserts that all elements of the domainDI of a TQ-interpretation I are divisible. That is, for every positive integer n,every element v ∈ DI is the sum of n of some other element w ∈ DI .

Thus, every TQ-interpretation is a divisible torsion-free abelian group. Inparticular, the rationals and reals with +, −, =, and ≥ are divisible torsion-free abelian groups. As TQ-interpretations, the rationals and reals are ele-mentarily equivalent: there does not exist a ΣQ-formula that distinguishesbetween a real TQ-interpretation (an interpretation with domain R) and arational TQ-interpretation (an interpretation with domain Q).

This characteristic makes sense, intuitively: no linear expression with onlyinteger coefficients can capture, say,

√2 without also being satisfied by some

rational values. When junior high students solve linear algebra problems, theyapply addition, subtraction, multiplication, and division; but they do not takeroots.

In contrast, TR is a theory of reals: the ΣR-formula x·x = 2 is only satisfiedby TR-interpretations I in which αI [x] = −

√2 or αI [x] =

√2.

Example 3.13. Strict inequality is simple to express in TQ. Write

∀x, y. ∃z. x + y > z


as the ΣQ-formula

∀x, y. ∃z. ¬(x + y = z) ∧ x + y ≥ z .

The situation is similar for TR.

Example 3.14. Rational coefficients are simple to express in TQ. Write

1

2x +

2

3y ≥ 4

as the ΣQ-formula 3x + 4y ≥ 24.

For convenience, we sometimes write x ≤ y for y ≥ x.In Chapter 7, we study a procedure for eliminating quantifiers in the theory

TQ. On closed formulae, this procedure decides validity. In Chapter 8, we studya decision procedure for the quantifier-free fragment of TQ, which is efficientlydecidable.

3.5 Recursive Data Structures

The theory of recursive data structures (RDS) describes a set of datastructures that are ubiquitous in programming. The most basic RDS is a non-recursive structure, like C’s struct, in which a single variable has multiplefields. Truly recursive RDSs include lists, stacks, and binary trees.

The theory of recursive data structures TRDS formalizes the reasoningabout such structures. It builds on the theory of equality TE.

Theory of Lists

We first focus on the theory of LISP-like lists, Tcons, which has signature

Σcons : cons, car, cdr, atom, = ,

where

• cons is a binary function, called the constructor: cons(a, b) represents thelist constructed by concatenating a to b;

• car is a unary function, called the left projector: car(cons(a, b)) = a;• cdr is a unary function, called the right projector: cdr(cons(a, b)) = b;• atom is a unary predicate: atom(x) is true iff x is a single-element list;• and = (equality) is a binary predicate.

car and cdr are historical names abbreviating “contents of address register”and “contents of decrement register”, respectively. In the intended interpre-tations, atoms are individual elements, while lists are multiple elements as-sembled together via cons. For example, cons(a, cons(b, c)) is a list of three

3.5 Recursive Data Structures 85

elements, while a for which atom(a) holds is an atom. car and cdr are func-tions for accessing parts of lists. For example, car(cons(a, cons(b, c))) returnsthe head a of the list; cdr(cons(a, cons(b, c))) returns the tail cons(b, c) of thelist; and cdr(cdr(cons(a, cons(b, c)))) returns c.

The axioms of Tcons are the following:

1. the axioms of (reflexivity), (symmetry), and (transitivity) of TE

2. instantiations of the (function congruence) axiom schema for cons, car, andcdr:

∀x1, x2, y1, y2. x1 = x2 ∧ y1 = y2 → cons(x1, y1) = cons(x2, y2)

∀x, y. x = y → car(x) = car(y)

∀x, y. x = y → cdr(x) = cdr(y)

3. an instantiation of the (predicate congruence) axiom schema for atom:

∀x, y. x = y → (atom(x)↔ atom(y))

4. ∀x, y. car(cons(x, y)) = x (left projection)5. ∀x, y. cdr(cons(x, y)) = y (right projection)6. ∀x. ¬atom(x) → cons(car(x), cdr(x)) = x (construction)7. ∀x, y. ¬atom(cons(x, y)) (atom)

The first three sets of axioms define = to be a congruence relation for cons,car, cdr, and atom. The axioms (left projection) and (right projection) definethe behavior of car and cdr on non-atom lists: car returns the first element ofa cons structure, and cdr returns the second element. However, they do notspecify the behavior of car and cdr on atoms. The (construction) axiom statesthat the cons of car(x) and cdr(x) is x itself, unless x is an atom. In otherwords, cons constructs structures, and car and cdr deconstructs them. Finally,the axiom (atom) asserts that a term with root function symbol cons is notan atom; it is a non-atomic list.

The congruence axioms for cons, car, and cdr assert an important propertyabout lists: two lists are equal iff their components are equal. The forwarddirection — if two lists are equal, then their components are equal — is aconsequence of the (function congruence) axioms for car and cdr. The backwarddirection is a consequence of the (function congruence) axiom for cons. Thisrelationship between two structures and their components is sometimes calledextensionality. We see it in arrays as well.

General Theory of RDS

Tcons is an instance of the general theory of recursive data structures TRDS.Each RDS contributes the following to the signature:

• an n-ary constructor C;


• n projection functions πC1 , . . . , πC

n;• and one atom predicate atomC.

Associated with each RDS is an instantiation of the following axiom schema:

1. the axioms of (reflexivity), (symmetry), and (transitivity) of TE;2. instantiations of the (function congruence) axiom schema for constructor

C and set of projectors πC1 , . . . , πC

n;3. an instantiation of the (predicate congruence) axiom schema for atomC;4. for each i ∈ 1, . . . , n,

∀x1, . . . , xn. πCi (C(x1, . . . , xn)) = xi (projection)

5. ∀x. ¬atomC(x) → C(πC1 (x), . . . , πC

n(x)) = x (construction)6. ∀x1, . . . , xn. ¬atomC(C(x1, . . . , xn)) (atom)

The axioms of Tcons are an instantiation of this schema. We subsequentlyfocus on Tcons for concreteness.

Theory of Acyclic Lists

A variation on this theory in which data structures are acyclic has been stud-ied. Acyclicity makes sense for stacks, but not necessarily for lists and otherdata structures. Consider the theory of acyclic LISP-like lists, T +

cons. Its axiomsinclude those of Tcons and the following axiom schema:

∀x. car(x) 6= x∀x. cdr(x) 6= x∀x. car(car(x)) 6= x∀x. car(cdr(x)) 6= x∀x. cdr(car(x)) 6= x. . .

T +cons is decidable, but Tcons is not. However, the quantifier-free fragments of

these theories are efficiently decidable.

Theory of Lists with Specified Atoms

The axioms of Tcons leave the behavior of car and cdr on atoms unspecified.Adding the axiom

∀x. atom(x) → atom(car(x)) ∧ atom(cdr(x))

to those of Tcons makes decidability of the resulting theory T atomcons NP-complete.

3.6 Arrays 87

Theory of Lists with Equality

In Chapter 9, we describe a decision procedure for satisfiability in thequantifier-free fragment of Tcons. The decision procedure is actually appli-cable to the quantifier-free fragment of a more expressive theory, T =

cons, whichis the combination of TE and Tcons and thus includes uninterpreted constants,functions, and predicates. Thus, its signature is ΣE∪Σcons, and its axioms arethe union of the axioms of TE and Tcons. In Section 3.8 and Chapter 10, wediscuss more general combinations of theories.

Example 3.15. To prove that the Σ=cons-formula

F :car(a) = car(b) ∧ cdr(a) = cdr(b) ∧ ¬atom(a) ∧ ¬atom(b)→ f(a) = f(b)

is T =cons-valid, assume otherwise: there exists a T =

cons-interpretation I such thatI 6|= F :

1. I 6|= F assumption2. I |= car(a) = car(b) 1, →, ∧3. I |= cdr(a) = cdr(b) 1, →, ∧4. I |= ¬atom(a) 1, →, ∧5. I |= ¬atom(b) 1, →, ∧6. I 6|= f(a) = f(b) 1, →7. I |= cons(car(a), cdr(a)) = cons(car(b), cdr(b))

2, 3, (function congruence)8. I |= cons(car(a), cdr(a)) = a 4, (construction)9. I |= cons(car(b), cdr(b)) = b 5, (construction)10. I |= a = b 7, 8, 9, (transitivity)11. I |= f(a) = f(b) 10, (function congruence)12. I |= ⊥ 6, 11

Therefore, F is T =cons-valid.

3.6 Arrays

Arrays are another common data structure in programming. They are similarto the uninterpreted functions of TE except that they can be modified. Thetheory of arrays TA describes the basic characteristic of an array: if value vis written to position i of array a, then subsequently reading from position iof a should return v. Because logic is static, modified arrays are representedfunctionally, as in functional programming.

The theory of arrays TA has signature

ΣA : ·[·], ·〈· ⊳ ·〉, = ,

where


• a[i] (read) is a binary function: a[i] represents the value of array a atposition i;

• a〈i ⊳ v〉 (write) is a ternary function: a〈i ⊳ v〉 represents the modified arraya in which position i has value v;

• and = (equality) is a binary predicate.

·[·] and ·〈·⊳·〉 really are binary and ternary functions, respectively, even thoughwe write them using a convenient notation. Writing a[i] as read(a, i) and a〈i⊳e〉as write(a, i, e) emphasizes that they are functions.

Arrays are represented functionally. The term a〈i ⊳ v〉 is an array thatis like a except that it has value v at position i. The term a〈i ⊳ v〉[j] (whichabbreviates (a〈i⊳v〉)[j]) is equal to the value of array a〈i⊳v〉 at position j: it isv if j = i and a[j] otherwise. a〈i⊳v〉〈j ⊳w〉 (which abbreviates (a〈i⊳v〉)〈j ⊳w〉)is an array that is like a except that it differs at the positions i, where it hasvalue v, and j, where is has value w. Finally, the term a〈i ⊳ v〉〈j ⊳w〉[k] (whichabbreviates ((a〈i ⊳ v〉)〈j ⊳ w〉)[k]) has value w if k = j (even if k = i also),value v if k = i and k 6= j, and value a[k] otherwise.

The axioms of TA are the following:

1. the axioms of (reflexivity), (symmetry), and (transitivity) of TE;2. ∀a, i, j. i = j → a[i] = a[j] (array congruence)3. ∀a, v, i, j. i = j → a〈i ⊳ v〉[j] = v (read-over-write 1)4. ∀a, v, i, j. i 6= j → a〈i ⊳ v〉[j] = a[j] (read-over-write 2)

The first set of axioms defines = as an equivalence relation. The next ax-iom asserts that accessing an array with two equal expressions produces thesame element. The final two axioms capture the basic characteristic of arrays:reading at an index that has been written produces the most recently writtenvalue.

The equality predicate = is only defined for array “elements”. For example,

F : a[i] = e → a〈i ⊳ e〉 = a

is not TA-valid, although our intuition suggests that it should be. The problemis that the interaction between = and the read and write functions is notcaptured in the axioms of TA. In other words, equality between arrays, notjust between elements, is undefined.

Instead of F , we write

F ′ : a[i] = e → ∀j. a〈i ⊳ e〉[j] = a[j] ,

which is TA-valid.


F ′ : a[i] = e → ∀j. a〈i ⊳ e〉[j] = a[j] ,

is TA-valid, assume otherwise: there is a TA-interpretation I such that I 6|= F ′:

3.6 Arrays 89

1. I 6|= F ′ assumption2. I |= a[i] = e 1, →3. I 6|= ∀j. a〈i ⊳ e〉[j] = a[j] 1, →4. I1 : I ⊳ j 7→ j 6|= a〈i ⊳ e〉[j] = a[j] 3, ∀, for some j ∈ DI

5. I1 |= a〈i ⊳ e〉[j] 6= a[j] 4, ¬6. I1 |= i = j 5, (read-over-write 2)7. I1 |= a[i] = a[j] 6, (array congruence)8. I1 |= a〈i ⊳ e〉[j] = e 6, (read-over-write 1)9. I1 |= a〈i ⊳ e〉[j] = a[j] 2, 7, 8, (transitivity)10. I1 |= ⊥ 4, 9

We derive line 6 from line 5 by using the contrapositive of (read-over-write2). The contrapositive of F1 → F2 is ¬F2 → ¬F1, and

F1 → F2 ⇔ ¬F2 → ¬F1 .

Lines 4 and 9 are contradictory, so that actually I |= F ′. Thus, F ′ is TA-valid.

Unfortunately, TA-validity is undecidable. It is straightforward to encodearbitrary formulae of FOL in TA by viewing functions as multi-dimensionalarrays (arrays whose elements are arrays). Therefore, a theory T =

A in whichthe behavior of = on arrays is axiomatized has been studied. Its quantifier-free fragment is decidable. The signature of T =

A is the same as that of TA. Itsaxioms consists of those of TA and the following axiom:

∀a, b. (∀i. a[i] = b[i]) ↔ a = b (extensionality)


F : a[i] = e → a〈i ⊳ e〉 = a

is T =A -valid, assume otherwise: there is a T =

A -interpretation I such that I 6|=F :

1. I 6|= F assumption2. I |= a[i] = e 1, →3. I 6|= a〈i ⊳ e〉 = a 1, →4. I |= a〈i ⊳ e〉 6= a 3, ¬5. I |= ¬(∀j. a〈i ⊳ e〉[j] = a[j]) 4, (extensionality)6. I 6|= ∀j. a〈i ⊳ e〉[j] = a[j] 5, ¬

The rest of the proof then proceeds as in Example 3.16.

We present a decision procedure for the quantifier-free fragment of TA inChapter 9. In Chapter 11, we present a decision procedure for satisfiability in afragment of TA that is more expressive than even the quantifier-free fragmentof T =

A .


Table 3.1. Decidability of theories and quantifier-free fragments

Theory Description Full QFF

TE equality no yes

TPA Peano arithmetic no no

TN Presburger arithmetic yes yes

TZ linear integers yes yes

TR reals (with ·) yes yes

TQ rationals (without ·) yes yes

TRDS recursive data structures no yes

T+

RDS acyclic recursive data structures yes yes

TA arrays no yes

T=A arrays with extensionality no yes

Table 3.2. Complexities for decidable theories

Theory Complexity

PL NP-complete

TN, TZ Ω“

22n

”

, O

„

222

kn«

TR O“

22kn

”

TQ Ω (2n), O“

22kn

”

T+

RDS not elementary recursive

3.7 ⋆Survey of Decidability and Complexity

We survey the known decidability and complexity results of the theories ofthis chapter.

Table 3.1 summarizes the decidability results for the first-order theories.The quantifier-free fragment of each theory that we study in Part II of thisbook is decidable.

Table 3.2 summarizes the complexity results for satisfiability in PL and thedecidable first-order theories. For all complexities, n is the size of the inputformula, and k is some positive integer. A decision problem is not elementaryrecursive if its running time cannot be bounded by a fixed-height stack ofexponentials. Only decision procedures for satisfiability in PL scale well tolarge problems.

Table 3.3 summarizes the complexity results for the quantifier-free frag-ments. As satisfiability in PL is already NP-complete, we consider only con-junctive formulae, which are just conjunctions of literals. For example, sat-isfiability of propositional conjunctive formulae is decidable in linear time: ifboth P and ¬P appear in F , for some propositional variable P , then F is un-satisfiable; otherwise, F is satisfiable. For quantifier-free (but not conjunctive)formulae, all complexities except that for TR are NP-complete. Satisfiability inthe quantifier-free fragments of TE, TQ, TRDS, and T +

RDS is efficiently decidable.

3.8 Combination Theories 91

Table 3.3. Complexities for quantifier-free, conjunctive fragments of theories

Theory Complexity Theory Complexity

PL Θ(n) TE O(n log n)

TN, TZ NP-complete TR O“

22kn

”

TQ PTIME T+

RDS Θ(n)

TRDS O(n log n) TA NP-complete

3.8 Combination Theories

In practice, the formulae that we want to check for satisfiability or validityspan multiple theories. For example, in program verification, one might wantto prove a property about an array of integers or a list of reals. We will seemany such examples in Chapter 5. Thus, decision procedures for fragments offirst-order theories are essentially useless unless they can be combined.

What does every signature of every theory presented so far have in com-mon? They all have equality, =. Nelson and Oppen made equality the focalpredicate in their general method for combining quantifier-free fragments offirst-order theories (with some restrictions). Given two theories T1 and T2 suchthat Σ1 ∩Σ2 = = — only = is shared — the combined theory T1 ∪ T2 hassignature Σ1 ∪Σ2 and axioms A1 ∪A2. Nelson and Oppen showed that if

• satisfiability in the quantifier-free fragment of T1 is decidable;• satisfiability in the quantifier-free fragment of T2 is decidable;• and certain technical requirements are met,

then satisfiability in the quantifier-free fragment of T1 ∪ T2 is decidable. Fur-thermore, if the decision procedures for T1 and T2 are in P (in NP), then thecombined decision procedure for T1 ∪ T2 is in P (in NP).

Chapter 10 studies the Nelson-Oppen combination of decision procedures.

Example 3.18. To prove that the (Σ=A ∪ΣZ)-formula

F : a = b → a[i] ≥ b[i]

is (T =A ∪ TZ)-valid we assume otherwise: there is a (T =

A ∪ TZ)-interpretation Isuch that I 6|= F :

1. I 6|= F assumption2. I |= a = b 1, →3. I 6|= a[i] ≥ b[i] 1, →4. I |= ¬(a[i] ≥ b[i]) 3, ¬5. I |= a[i] = b[i] 2, T =

A (extensionality)6. I |= ⊥ 4, 5, T =

A ∪ TZ

Line 6 summarizes the argument that it is impossible for a TZ-interpretationto satisfy both a[i] = b[i] and ¬(a[i] ≥ b[i]).


Example 3.19. The (ΣE ∪ΣZ)-formula

1 ≤ x ∧ x ≤ 2 ∧ f(x) 6= f(1) ∧ f(x) 6= f(2)

is (TE ∪ TZ)-unsatisfiable, for x cannot be either 1 or 2 without violating(function congruence). Seen as a (ΣE ∪ΣQ)-formula, it is (TE ∪TQ)-satisfiable:choose x = 3

2 .The (ΣE ∪ΣQ)-formula

f(f(x)− f(y)) 6= f(z) ∧ x ≤ y ∧ y + z ≤ x ∧ 0 ≤ z

is (TE∪TQ)-unsatisfiable. In particular, the final three literals imply that z = 0and x = y, so that f(x) = f(y). But then from the first literal, f(0) 6= f(0)since both f(x) − f(y) and z equal 0.

Finally, the (ΣE ∪ΣZ)-formula

1 ≤ x ∧ x ≤ 3 ∧ f(x) 6= f(1) ∧ f(x) 6= f(3) ∧ f(1) 6= f(2)

is (TE∪TZ)-satisfiable since x can be 2 without violating (function congruence).

3.9 Summary

Important data types in software and hardware models include integers; ratio-nals; recursive data structures like records, lists, stacks, and trees; and arrays.This chapter introduces first-order theories that formalize these data types.It covers:

• First-order theories. Formalizations of structures and operations intofirst-order logic: signatures, axioms. Fragments of theories, in particularquantifier-free fragments. Interpretations, satisfiability, validity.

• Specific theories:– Equality defines the binary predicate = as a congruence relation. Sat-

isfiability in the quantifier-free fragment is efficiently decidable, andthe decision procedure is the basis for decision procedures for datastructures (see Chapter 9).

– Integer arithmetic. Satisfiability in integer arithmetic without multipli-cation is decidable.

– Rational and real arithmetic. Satisfiability in real arithmetic with mul-tiplication is decidable with high complexity. Satisfiability in ratio-nal arithmetic without multiplication is efficiently decidable. Rationalarithmetic without multiplication is indistinguishable from real arith-metic without multiplication.

– Recursive data structures include records, lists, stacks, and queues.Satisfiability in the quantifier-free fragment is efficiently decidable.

Exercises 93

– Arrays can be read and written. Satisfiability in the quantifier-freefragment is decidable. Chapter 11 studies a larger fragment in whichsatisfiability is still decidable.

• Combination theories. How can decision procedures for multiple theoriesbe combined to decide satisfiability in combination theories?

Studying first-order theories is important for two reasons. First, theoriesformalize into FOL interesting structures and operations on the structures.Second, satisfiability in some theories or fragments of theories is decidable andthus can be reasoned about algorithmically, whereas satisfiability in generalFOL is undecidable. Part II of this book focuses on such theories and frag-ments that are useful for program analysis. Chapters 5 and 6 provide manyexamples of formulae from combinations of these theories in the context ofprogram verification.


The undecidability of validity in FOL [13] motivated the subsequent study offirst-order theories and fragments. In 1929, Presburger proved that satisfiabil-ity in arithmetic without multiplication is decidable [73]. Tarski showed in the1930s that real arithmetic is decidable even with multiplication, although theSecond World War delayed the publication of this result [90]. The axiomati-zation of recursive data structures that we study is from work by Nelson andOppen [66]. Oppen studied a variation in which structures are acyclic [69].The axiomatization of arrays, in particular the read-over-write axioms, is dueto McCarthy [59]. The Nelson-Oppen combination method is based on workby Nelson and Oppen in the late 1970s and early 1980s [65].

Exercises

3.1 (Semantic argument in TE). Use the semantic method to argue thevalidity of the following ΣE-formulae, or identify a counterexample (a falsifyingTE-interpretation).

(a) f(x, y) = f(y, x) → f(a, y) = f(y, a)(b) f(g(x)) = g(f(x)) ∧ f(g(f(y))) = x ∧ f(y) = x → g(f(x)) = x(c) f(f(f(a))) = f(f(a)) ∧ f(f(f(f(a)))) = a → f(a) = a(d) f(f(f(a))) = f(a) ∧ f(f(a)) = a → f(a) = a(e) p(x) ∧ f(f(x)) = x ∧ f(f(f(x))) = x → p(f(x))

3.2 (Semantic argument in TZ). Use the semantic method to argue thevalidity of the following ΣZ-formulae, or identify a counterexample (a falsifyingTZ-interpretation).

(a) x ≤ y ∧ z = x + 1 → z ≤ y


(b) x ≤ y ∧ z = x− 1 → z ≤ y(c) 3x = 2 → x ≤ 0(d) 1 ≤ x ∧ x ≤ 2 → x = 1 ∨ x = 2(e) 1 ≤ x ∧ x + y ≤ 3 ∧ 1 ≤ y → x = 1 ∨ x = 2(f) 0 ≤ x ∧ 0 ≤ x + y ∧ x + y ≤ 1 ∧ (y ≤ −2 ∨ 2 ≤ y) → 0 ≤ −1

3.3 (Semantic argument in TQ). Use the semantic method to argue the va-lidity of the following ΣQ-formulae, or identify a counterexample (a falsifyingTQ-interpretation).

(a) 3x = 2 → x ≤ 0(b) 0 ≤ x + 2y ∧ 2x + y ≤ 1(c) 1 ≤ x ∧ x ≤ 2 → x = 1 ∨ x = 2

3.4 (Semantic argument in Tcons). Use the semantic method to argue thevalidity of the following Σcons-formulae, or identify a counterexample (a falsi-fying Tcons-interpretation).

(a) car(x) = y ∧ cdr(x) = z → x = cons(y, z)(b) ¬atom(x) ∧ car(x) = y ∧ cdr(x) = z → x = cons(y, z)

3.5 (Semantic argument in TA). Use the semantic method to argue the va-lidity of the following ΣA-formulae, or identify a counterexample (a falsifyingTA-interpretation).

(a) a〈i ⊳ e〉[j] = e → i = j(b) a〈i ⊳ e〉[j] = e → a[j] = e(c) a〈i ⊳ e〉[j] = e → i = j ∨ a[j] = e(d) a〈i ⊳ e〉〈j ⊳ f〉[k] = g ∧ j 6= k ∧ i = j → a[k] = g

3.6 (Semantic argument in combinations). For each of the following for-mulae, identify the combination of theories in which it lies. To avoid ambiguity,prefer TZ to TQ. Then argue its validity in that combination of theories usingthe semantic method, or identify a counterexample.

(a) 1 ≤ x ∧ x ≤ 2 ∧ cons(1, y) 6= cons(x, y) → cons(2, y) = cons(x, y)(b) a[i] ≥ 1 ∧ a[i] + x ≤ 2 ∧ x > 0 ∧ x = i → a〈x ⊳ 2〉[i] = 1(c) 1 ≤ x ∧ x ≤ 2 ∧ cons(1, y) 6= cons(x, y) → cons(2, y) = cons(x, y)(d) x + y = z ∧ f(z) = z → f(x + y) = z(e) g(x + y, z) = f(g(x, y)) ∧ x + z = y ∧ z ≥ 0 ∧ x ≥ y ∧ g(x, x) = z

→ f(z) = g(2x, 0)

3.7 (Semantic argument in combinations). Redo Exercise 3.6, preferringTQ to TZ.

4

Induction

Even though this proposition may have an infinite number of cases,I shall give a very short proof of it assuming two lemmas. The first,which is self evident, is that the proposition is valid for the second row.The second is that if the proposition is valid for any row then it mustnecessarily be valid for the following row.

— Blaise PascalTraite du Triangle Arithmetique, c. 1654

This chapter discusses induction, a classic proof technique for provingfirst-order theorems with universal quantifiers. Section 4.1 begins with step-wise induction, which may be familiar to the reader from earlier education.Section 4.2 then introduces complete induction in the context of arithmetic.Complete induction is theoretically equivalent in power to stepwise inductionbut sometimes produces more concise proofs. Section 4.3 generalizes com-plete induction to well-founded induction in the context of arithmetic andrecursive data structures. Finally, Section 4.4 covers a form of well-foundedinduction over logical formulae called structural induction. It is useful forreasoning about correctness of decision procedures and properties of logicaltheories and their interpretations.

We apply induction in various ways throughout the book. Structural induc-tion is applied in proofs. Additionally, induction is the basis for the programverification methods of Chapter 5.

4.1 Stepwise Induction

We review stepwise induction for arithmetic and then show that it extendsnaturally to other theories, such as the theory of lists Tcons.

96 4 Induction

Arithmetic

Recall from Chapter 3 that the theory of Peano arithmetic TPA formalizesarithmetic over the natural numbers. Its axioms include an instance of the(induction) axiom schema

F [0] ∧ (∀n. F [n]→ F [n + 1]) → ∀x. F [x]

for each ΣPA-formula F [x] with only one free variable x. This axiom schemasays that to prove ∀x. F [x] — that is, F [x] is TPA-valid for all natural numbersx — it is sufficient to do the following:

• For the base case, prove that F [0] is TPA-valid.• For the inductive step, assume as the inductive hypothesis that for

some arbitrary natural number n, F [n] is TPA-valid. Then prove that F [n+1] is TPA-valid under this assumption.

These two steps comprise the stepwise induction principle for Peano (andPresburger) arithmetic.

Example 4.1. Consider the theory T +PA obtained from augmenting TPA with

the following axioms:

• ∀x. x0 = 1 (exp. zero)• ∀x, y. xy+1 = xy · x (exp. successor)• ∀x, z. exp3(x, 0, z) = z (exp3 zero)• ∀x, y, z. exp3(x, y + 1, z) = exp3(x, y, x · z) (exp3 successor)

The first two axioms define exponentiation xy , while the latter two axiomsdefine a ternary function exp3(x, y, z).

Let us prove that the following formula is T +PA-valid:

∀x, y. exp3(x, y, 1) = xy . (4.1)

We need to choose either x or y as the induction variable. Considering theexp3 axioms, it appears that y is the smarter choice: (exp3 successor) definesexp3 recursively by considering the predecessor of y + 1.

Therefore, we prove by stepwise induction on y that

F [y] : ∀x. exp3(x, y, 1) = xy .

For the base case, we prove

F [0] : ∀x. exp3(x, 0, 1) = x0 .

But x0 = 1 by (exp. zero), and exp3(x, 0, 1) = 1 by (exp3 zero).Assume as the inductive hypothesis that for arbitrary natural number n,

F [n] : ∀x. exp3(x, n, 1) = xn . (4.2)

4.1 Stepwise Induction 97

We want to prove that

F [n + 1] : ∀x. exp3(x, n + 1, 1) = xn+1 . (4.3)

By (exp3 successor), we have

exp3(x, n + 1, 1) = exp3(x, n, x · 1) .

Unfortunately, the inductive hypothesis (4.2) does not apply to the left sideof the equation since n 6= n + 1, and it does not apply to the right side ofthe equation because the third argument is x · 1 rather than 1. Continuing toapply axioms is unlikely to bring us closer to the proof. Thus, we have failedto prove the property.

What went wrong in the proof? Did we choose the wrong induction vari-able? Would x have worked better? In fact, it is often the case that the prop-erty must be strengthened to allow the induction to go through. A strongertheorem provides a stronger inductive hypothesis.

Let us strengthen the property to be proved to

∀x, y, z. exp3(x, y, z) = xy · z . (4.4)

It clearly implies the desired property (4.1): just choose z = 1.Again, we must choose the induction variable. Based on (exp3 successor),

we use y again. Thus, we prove by stepwise induction on y that

F [y] : ∀x, z. exp3(x, y, z) = xy · z .

For the base case, we prove

F [0] : ∀x, z. exp3(x, 0, z) = x0 · z .

From (exp3 zero), we have exp3(x, 0, z) = z, while from (exp. zero), we havex0 · z = 1 · z = z.

Assume as the inductive hypothesis that

F [n] : ∀x, z. exp3(x, n, z) = xn · z (4.5)

for arbitrary natural number n. We want to prove that

F [n + 1] : ∀x, z′. exp3(x, n + 1, z′) = xn+1 · z′ , (4.6)

where we have renamed z to z′ for convenience. We have

exp3(x, n + 1, z′) = exp3(x, n, x · z′) (exp3 successor)

= xn · (x · z′) IH (4.5), z 7→ x · z′

= xn+1 · z′ (exp. successor)

finishing the proof. The annotation z 7→ x·z′ indicates that x·z′ is substitutedfor z when applying the inductive hypothesis (4.5). This substitution is jus-tified because z is universally quantified. Renaming z to z′ avoids confusionduring the application of the inductive hypothesis in the second line.

98 4 Induction

Lists

We can define stepwise induction over recursive data structures such as lists(see Chapters 3 and 9). Consider the theory of lists Tcons. Stepwise inductionin Tcons is defined according to the following schema

(∀ atom u. F [u]) ∧ (∀u, v. F [v]→ F [cons(u, v)]) → ∀x. F [x]

for Σcons-formulae F [x] with only one free variable x. The notation ∀ atom u. F [u]abbreviates ∀u. atom(u)→ F [u]. In other words, to prove ∀x. F [x] — that is,F [x] is Tcons-valid for all lists x — it is sufficient to do the following:

• For the base case, prove that F [u] is Tcons-valid for an arbitrary atom u.• For the inductive step, assume as the inductive hypothesis that for

some arbitrary list v, F [v] is valid. Then prove that for arbitrary list u,F [cons(u, v)] is Tcons-valid under this assumption.

These steps comprise the stepwise induction principle for lists.

Example 4.2. Consider the theory T +cons obtained from augmenting Tcons with

the following axioms:

• ∀ atom u. ∀v. concat(u, v) = cons(u, v) (concat. atom)• ∀u, v, x. concat(cons(u, v), x) = cons(u, concat(v, x)) (concat. list)• ∀ atom u. rvs(u) = u (reverse atom)• ∀x, y. rvs(concat(x, y)) = concat(rvs(y), rvs(x)) (reverse list)• ∀ atom u. flat(u) (flat atom)• ∀u, v. flat(cons(u, v)) ↔ atom(u) ∧ flat(v) (flat list)

The first two axioms define the concat function, which concatenates two liststogether. For example,

concat(cons(a, b), cons(b, cons(c, d)))= cons(a, cons(b, cons(b, cons(c, cons(d))))) .

The next two axioms define the rvs function, which reverses a list. For exam-ple,

rvs(cons(a, cons(b, c))) = cons(c, cons(b, a)) .

Note, however, that rvs is undefined on lists like cons(cons(a, b), c), forcons(cons(a, b), c) cannot result from concatenating two lists together. There-fore, the final two axioms define the flat predicate, which evaluates to ⊤ ona list iff every element is an atom. For example, cons(a, cons(b, c)) is flat , butcons(cons(a, b), c) is not because the first element of the list is itself a list.

Let us prove that the following formula is T +cons-valid:

∀x. flat(x) → rvs(rvs(x)) = x . (4.7)

For example,

4.2 Complete Induction 99

rvs(rvs(cons(a, cons(b, c)))) = rvs(cons(c, cons(b, a)))

= cons(a, cons(b, c))

We prove by stepwise induction on x that

F [x] : flat(x) → rvs(rvs(x)) = x .

For the base case, we consider arbitrary atom u and prove

F [u] : flat(u) → rvs(rvs(u)) = u .

But rvs(rvs(u)) = u follows from two applications of (reverse atom).Assume as the inductive hypothesis that for arbitrary list v,

F [v] : flat(v) → rvs(rvs(v)) = v . (4.8)

We want to prove that for arbitrary list u,

F [cons(u, v)] : flat(cons(u, v)) → rvs(rvs(cons(u, v))) = cons(u, v) . (4.9)

Consider two cases: either atom(u) or ¬atom(u).If ¬atom(u), then

flat(cons(u, v)) ⇔ atom(u) ∧ flat(v) ⇔ ⊥ ,

by (flat list) and assumption. Therefore, (4.9) holds since its antecedent is ⊥.If atom(u), then we have that

flat(cons(u, v)) ⇔ atom(u) ∧ flat(v) ⇔ flat(v)

by (flat list). Furthermore,

rvs(rvs(cons(u, v)))

= rvs(rvs(concat(u, v))) (concat. atom)

= rvs(concat(rvs(v), rvs(u))) (reverse list)

= concat(rvs(rvs(u)), rvs(rvs(v))) (reverse list)

= concat(u, rvs(rvs(v))) (reverse atom)

= concat(u, v) IH (4.8), since flat(v)

= cons(u, v) (concat. atom)

which finishes the proof.

4.2 Complete Induction

Complete induction is a form of induction that sometimes yields moreconcise proofs. For the theory of arithmetic TPA it is defined according to thefollowing schema

100 4 Induction

(∀n. (∀n′. n′ < n→ F [n′])→ F [n]) → ∀x. F [x]

for ΣPA-formulae F [x] with only one free variable x. In other words, to prove∀x. F [x] — that is, F [x] is TPA-valid for all natural numbers x — it is sufficientto follow the complete induction principle:

• Assume as the inductive hypothesis that for arbitrary natural numbern and for every natural number n′ such that n′ < n, F [n′] is TPA-valid.Then prove that F [n] is TPA-valid.

It appears that we are missing a base case. In practice, a case analysis usuallyrequires at least one base case. In other words, the base case is implicit inthe structure of complete induction. For example, for n = 0, the inductivehypothesis does not provide any information — there does not exist a naturalnumber n′ < 0. Hence, F [0] must be shown separately without assistance fromthe inductive hypothesis.

Example 4.3. Consider another augmented version of Peano arithmetic, T ∗PA,

that defines integer division. It has the usual axioms of TPA plus the following:

• ∀x, y. x < y → quot(x, y) = 0 (quotient less)• ∀x, y. y > 0 → quot(x + y, y) = quot(x, y) + 1 (quotient successor)• ∀x, y. x < y → rem(x, y) = x (remainder less)• ∀x, y. y > 0 → rem(x + y, y) = rem(x, y) (remainder successor)

These axioms define functions for computing integer quotients quot(x, y) andremainders rem(x, y). For example, quot(5, 3) = 1 and rem(5, 3) = 2. Weprove two properties, which the reader may recall from grade school, aboutthese functions. First, we prove that the remainder is always less than thedivisor:

∀x, y. y > 0 → rem(x, y) < y . (4.10)

Then we prove that

∀x, y. y > 0 → x = y · quot(x, y) + rem(x, y) . (4.11)

For property (4.10), (remainder successor) suggests that we apply completeinduction on x to prove

F [x] : ∀y. y > 0 → rem(x, y) < y . (4.12)

Thus, for the inductive hypothesis, assume that for arbitrary natural numberx,

∀x′. x′ < x → ∀y. y > 0 → rem(x′, y) < y︸︷︷︸F [x′]

. (4.13)

Let y be an arbitrary positive natural number. Consider two cases: eitherx < y or ¬(x < y).

4.2 Complete Induction 101

If x < y, then

rem(x, y) = x (remainder less)

< y by assumption x < y

as desired.If ¬(x < y), then there is a natural number n, n < x, such that x = n+ y.

Compute

rem(x, y) = rem(n + y, y) x = n + y

= rem(n, y) (remainder successor)

< y IH (4.13), x′ 7→ n, since n < x

finishing the proof of this property.For property (4.11), (remainder successor) again suggests that we apply

complete induction on x to prove

G[x] : ∀y. y > 0 → x = y · quot(x, y) + rem(x, y) . (4.14)

Thus, for the inductive hypothesis, assume that for arbitrary natural numberx,

∀x′. x′ < x → ∀y. y > 0 → x′ = y · quot(x′, y) + rem(x′, y)︸︷︷︸G[x′]

. (4.15)

Let y be an arbitrary positive natural number. Consider two cases: eitherx < y or ¬(x < y).

If x < y, then

y·quot(x, y) + rem(x, y)

= y · 0 + rem(x, y) (quotient less)

= x (remainder less)

as desired.If ¬(x < y), then there is a natural number n < x such that x = n + y.

Compute

y·quot(x, y) + rem(x, y)

= y · quot(n + y, y) + rem(n + y, y) x = n + y

= y · (quot(n, y) + 1) + rem(n + y, y) (quotient successor)

= y · (quot(n, y) + 1) + rem(n, y) (remainder successor)

= (y · quot(n, y) + rem(n, y)) + y

= n + y IH (4.15), x′ 7→ n, since n < x

= x x = n + y

finishing the proof of this property.

In the next section, we generalize complete induction so that we can applyit in other theories.

102 4 Induction

4.3 Well-Founded Induction

A binary predicate ≺ over a set S is a well-founded relation iff there doesnot exist an infinite sequence s1, s2, s3, . . . of elements of S such that eachsuccessive element is less than its predecessor:

s1 ≻ s2 ≻ s3 ≻ · · · ,

where s ≺ t iff t ≻ s. In other words, each sequence of elements of S thatdecreases according to ≺ is finite.

Example 4.4. The relation < is well-founded over the natural numbers. Anysequence of natural numbers decreasing according to < is finite:

1023 > 39 > 30 > 29 > 8 > 3 > 0 .

However, the relation < is not well-founded over the rationals. Consider theinfinite decreasing sequence

1 >1

2>

1

3>

1

4> · · · ,

that is, the sequence si = 1i

for i ≥ 0.

Example 4.5. Consider the theory T PAcons, which includes the axioms of Tcons

and TPA and the following axioms:

• ∀ atom u, v. u c v ↔ u = v (c (1))• ∀ atom u. ∀v. ¬atom(v) → ¬(v c u) (c (2))• ∀ atom u. ∀v, w. u c cons(v, w) ↔ u = v ∨ u c w (c (3))• ∀u1, v1, u2, v2. cons(u1, v1) c cons(u2, v2)

↔ (u1 = u2 ∧ v1 c v2) ∨ cons(u1, v1) c v2 (c (4))• ∀x, y. x ≺c y ↔ x c y ∧ x 6= y (≺c)• ∀ atom u. |u| = 1 (length atom)• ∀u, v. |cons(u, v)| = 1 + |v| (length list)

The first four axioms define the sublist relation c. x c y holds iff x is a(not necessarily strict) sublist of y. The next axiom defines the strict sublistrelation: x ≺c y iff x is a strict sublist of y. The final two axioms define thelength function, which returns the number of elements in a list.

The strict sublist relation ≺c is well-founded on the set of all lists. One canprove that the number of sublists of a list is finite; and that its set of strictsublists is a superset of the set of strict sublists of any of its sublists. Hence,there cannot be an infinite sequence of lists descending according to ≺c.

Well-founded induction generalizes complete induction to arbitrarytheory T by allowing the use of any binary predicate ≺ that is well-foundedin the domain of every T -interpretation. It is defined in the theory T withwell-founded relation ≺ by the following schema

4.3 Well-Founded Induction 103

(∀n. (∀n′. n′ ≺ n→ F [n′])→ F [n]) → ∀x. F [x]

for Σ-formulae F [x] with only one free variable x. In other words, to prove theT -validity of ∀x. F [x], it is sufficient to follow the well-founded inductionprinciple:

• Assume as the inductive hypothesis that for arbitrary element n andfor every element n′ such that n′ ≺ n, F [n′] is T -valid. Then prove thatF [n] is T -valid.

Complete induction in TPA of Section 4.2 is a specific instance of well-foundedinduction that uses the well-founded relation <.

A theory of lists augmented with the first five axioms of Example 4.5 haswell-founded induction in which the well-founded relation is ≺c.

Example 4.6. Consider proving the trivial property

∀x. |x| ≥ 1 (4.16)

in T PAcons, which was defined in Example 4.5. We apply well-founded induction

on x using the well-founded relation ≺c to prove

F [x] : |x| ≥ 1 . (4.17)

For the inductive hypothesis, assume that

∀x′. x′ ≺c x → |x′| ≥ 1︸︷︷︸F [x′]

. (4.18)

Consider two cases: either atom(x) or ¬atom(x).In the first case |x| = 1 ≥ 1 by (length atom).In the second case x is not an atom, so x = cons(u, v) for some u, v by the

(construction) axiom. Then

|x| = |cons(u, v)|= 1 + |v| (length list)

≥ 1 + 1 IH (4.18), x′ 7→ v, since v ≺c cons(u, v)

≥ 1

as desired. Exercise 4.2 asks the reader to prove formally that ∀u, v. v ≺c

cons(u, v).This property is also easily proved using stepwise induction.

In applying well-founded induction, we need not restrict ourselves to theintended domain D of a theory T . A useful class of well-founded relations arelexicographic relations. From a finite set of pairs of sets and well-foundedrelations (S1,≺1), . . . , (Sm,≺m), construct the set

104 4 Induction

S = S1 × · · · × Sm ,

and define the relation ≺:

(s1, . . . , sm) ≺ (t1, . . . , tm) ⇔m∨

i=1

si ≺i ti ∧i−1∧

j=1

sj = tj

for si, ti ∈ Si. That is, for elements s : (s1, . . . , sm), t : (t1, . . . , tm) of S, s ≺ tiff at some position i, si ≺i ti, and for all preceding positions j, sj = tj .For convenience, we abbreviate (s1, . . . , sm) by s and thus write, for example,s ≺ t.

Lexicographic well-founded induction has the form

(∀n. (∀n′. n′ ≺ n→ F [n′])→ F [n]) → ∀x. F [x]

for Σ-formula F [x] with only free variables x = x1, . . . , xm. Notice that theform of this induction principle is the same as well-founded induction. Theonly difference is that we are considering tuples n = (n1, . . . , nm) rather thansingle elements n.

Example 4.7. Consider the following puzzle. You have a bag of red, yellow,and blue chips. If only one chip remains in the bag, you take it out. Otherwise,you remove two chips at random:

1. If one of the two removed chips is red, you do not put any chips in thebag.

2. If both of the removed chips are yellow, you put one yellow chip and fiveblue chips in the bag.

3. If one of the chips is blue and the other is not red, you put ten red chipsin the bag.

These cases cover all possibilities for the two chips. Does this process alwayshalt?

We prove the following property: for all bags of chips, you can executethe choose-and-replace process only a finite number of times before the bag isempty. Let the triple

(#yellow, #blue, #red)

represent the current state of the bag. Such a tuple is in the set of triples ofnatural numbers S : N3. Let <3 be the natural lexicographic extension of <to such triples. For example,

(11, 13, 3) 6<3 (11, 9, 104) but (11, 9, 104) <3 (11, 13, 3) .

We prove that for arbitrary bag state (y, b, r) represented by the triple ofnatural numbers y, b, and r, only a finite number of steps remain.

For the base cases, consider when the bag has no chips (state (0, 0, 0)) oronly one chip (one of states (1, 0, 0), (0, 1, 0), or (0, 0, 1)). In the first case, youare done; in the second set of cases, only one step remains.

Assume for the inductive hypothesis that for any bag state (y′, b′, r′) suchthat

(y′, b′, r′) <3 (y, b, r) ,

only a finite number of steps remain. Now remove two chips from the currentbag, represented by state (y, b, r). Consider the three possible cases:

1. If one of the two removed chips is red, you do not put any chips in the bag.Then the new bag state is (y− 1, b, r− 1), (y, b− 1, r− 1), or (y, b, r− 2).Each is less than (y, b, r) by <3.

2. If both of the removed chips are yellow, you put one yellow chip and fiveblue chips in the bag. Then the new bag state is (y − 1, b + 5, r), which isless than (y, b, r) by <3.

3. If one of the chips is blue and the other is not red, you put ten red chips inthe bag. Then the new bag state is (y−1, b−1, r+10) or (y, b−2, r+10).Each is less than (y, b, r) by <3.

In all cases, we can apply the inductive hypothesis to deduce that only a finitenumber of steps remain from the next state. Since only one step of the processis required to get to the next state, there are only a finite number of stepsremaining from the current state (y, b, r). Hence, the process always halts.

Example 4.8. Consider proving the property

∀x, y. x c y → |x| ≤ |y| (4.19)

in T PAcons. Let ≺2

c be the natural lexicographic extension of ≺c to pairs of lists.That is, (x1, y1) ≺2

c (x2, y2) iff x1 ≺c x2 ∨ (x1 = x2 ∧ y1 ≺c y2).We apply lexicographic well-founded induction to pairs (x, y) to prove

F [x, y] : x c y → |x| ≤ |y| . (4.20)

For the inductive hypothesis, assume that

∀x′, y′. (x′, y′) ≺2c (x, y) → x′ c y′ → |x′| ≤ |y′|︸︷︷︸

F [x′,y′]

. (4.21)

Now consider arbitrary lists x and y. Consider two cases: either atom(x)or ¬atom(x).

If atom(x), then

|x| = 1 (length atom)

≤ |y| Example 4.6

106 4 Induction

Hence, regardless of whether x c y, we have that |x| ≤ |y| so that (4.20)holds.

If ¬atom(x), then consider two cases: either atom(y) or ¬atom(y). Ifatom(y), then

x c y ⇔ ⊥

by (c (2)); therefore, (4.20) holds trivially.For the final case, we have that ¬atom(x) and ¬atom(y). Then x =

cons(u1, v1) and y = cons(u2, v2) for some lists u1, v1, u2, v2. We have

x c y ⇔ cons(u1, v1) c cons(u2, v2) assumption

⇔ (u1 = u2 ∧ v1 c v2) ∨ cons(u1, v1) c v2 (c (4))

The disjunction suggests two possibilities. Consider the first disjunct. Becausev1 ≺c cons(u1, v1) = x, we have that

(v1, v2) ≺2c (x, y) ,

allowing us to appeal to the inductive hypothesis (4.21): from v1 c v2, deducethat |v1| ≤ |v2|. Then with two applications of (length list), we have

|x| ≤ |y| ⇔ 1 + |v1| ≤ 1 + |v2| ⇔ |v1| ≤ |v2| .

Therefore, |x| ≤ |y| and (4.20) holds for this case.Suppose the second disjunct (cons(u1, v1) c v2) holds. We again look to

the inductive hypothesis (4.21). We have

(cons(u1, v1), v2) ≺2c (x, y)

because cons(u1, v1) = x and v2 ≺c cons(u2, v2) = y. Therefore, the inductivehypothesis tells us that |x| ≤ |v2|, while (length list) implies that |v2| < |y|. Inshort,

|x| ≤ |v2| < |y| ,

which implies |x| ≤ |y| as desired, completing the proof.

Example 4.9. Augment the theory of Presburger arithmetic TN (see Chap-ters 3 and 7) with the following axioms to define the Ackermann function:

• ∀y. ack (0, y) = y + 1 (ack left zero)• ∀x. ack (x + 1, 0) = ack(x, 1) (ack right zero)• ∀x, y. ack (x + 1, y + 1) = ack (x, ack (x + 1, y)) (ack successor)

The Ackermann function grows quickly for increasing arguments:

• ack(0, 0) = 1• ack(1, 1) = 3

• ack(2, 2) = 7• ack(3, 3) = 61

• ack(4, 4) = 222216

− 3

One might expect that proving properties about the Ackermann functionwould be difficult.

However, lexicographic well-founded induction allows us to reason aboutcertain properties of the function. Define <2 as the natural lexicographic ex-tension of < to pairs of natural numbers. Now consider input arguments toack and the resulting arguments in recursive calls:

• (ack left zero) does not involve a recursive call.• In (ack right zero), (x + 1, 0) >2 (x, 1).• In (ack successor),

– (x + 1, y + 1) >2 (x + 1, y), and– (x + 1, y + 1) >2 (x, ack (x + 1, y)).

As the arguments decrease according to <2 with each level of recursion, weconclude that the computation of ack (x, y) halts for every x and y. In Chap-ter 5, we show that finding well-founded relations is a general technique forshowing that functions always halt.

Additionally, we can induct over the execution of ack to prove propertiesof the ack function itself. Let us prove that

∀x, y. ack (x, y) > y (4.22)

is T ack

N -valid. We apply lexicographic well-founded induction to the argumentsof ack to prove

F [x, y] : ack(x, y) > y (4.23)

for arbitrary natural numbers x and y. For the inductive hypothesis, assumethat

∀x′, y′. (x′, y′) <2 (x, y) → ack (x′, y′) > y′

︸︷︷︸F [x′,y′]

. (4.24)

Consider three cases: x = 0, x > 0 ∧ y = 0, and x > 0 ∧ y > 0.If x = 0, then ack (0, y) = y + 1 > y by (ack left zero), as desired.If x > 0 ∧ y = 0, then

ack(x, 0) = ack (x− 1, 1)

by (ack right zero). Since

(x′ : x− 1, y′ : 1) <2 (x, y) ,

the inductive hypothesis (4.24) tells us that

108 4 Induction

ack(x − 1, 1) > 1 .

Therefore, we have

ack(x, 0) = ack (x− 1, 1) > 1 ,

so ack (x, 0) > 0 as desired.For the final case, x > 0 ∧ y > 0, we have

ack(x, y) = ack(x − 1, ack(x, y − 1))

by (ack successor). Since

(x′ : x− 1, y′ : ack (x, y − 1)) <2 (x, y) ,

the inductive hypothesis (4.24) implies that

ack(x − 1, ack(x, y − 1)) > ack (x, y − 1) .

Furthermore, since

(x′ : x, y′ : y − 1) <2 (x, y) ,

the inductive hypothesis (4.24) implies that

ack(x, y − 1) > y − 1 .

All together, then, we have

ack(x, y) = ack(x − 1, ack(x, y − 1)) > ack (x, y − 1) > y − 1 ;

hence, ack(x, y) > (y − 1) + 1 = y, completing the proof.

4.4 Structural Induction

Induction has many other applications outside of reasoning about the valid-ity of first-order formulae. In this section, we introduce the proof techniqueof structural induction for proving properties about formulae themselves.Structural induction is applied in Section 2.7, in analyzing the quantifier elim-ination procedures of Chapter 7, and in other applications throughout thebook.

Define the strict subformula relation over FOL formulae as follows:two formulae F1 and F2 are related by the strict subformula relation iff F1 isa strict subformula of F2. The strict subformula relation is well founded overthe set of FOL formulae since every formula, having only a finite number ofsymbols, has only a finite number of strict subformulae; and each of its strictsubformulae has fewer strict subformulae that it does. To prove a desiredproperty of FOL formulae, instantiate the well-founded induction principlewith the strict subformula relation:

4.4 Structural Induction 109

• Assume as the inductive hypothesis that for arbitrary FOL formula Fand for every strict subformula G of F , G has the desired property. Thenprove that F has the property.

Since atoms do not have strict subformulae, they are treated as base cases.This induction principle is the structural induction principle.

Example 4.10. Exercise 1.3 asks the reader to prove that certain logicalconnectives are redundant in the presence of others. Formally, the exerciseis asking the reader to prove the following claim: Every propositional formulaF is equivalent to a propositional formula F ′ constructed with only the logicalconnectives ⊤, ∧, and ¬.

There are three base cases to consider:

• The formula ⊤ can be represented directly as ⊤.• The formula ⊥ is equivalent to ¬⊤.• Any propositional variable P can be represented directly as P .

For the inductive step, consider formulae G, G1, and G2, and assume asthe inductive hypothesis that each is equivalent to formulae G′, G′

1, and G′2,

respectively, which are constructed only from the connectives ⊤, ∨, and ¬(and propositional variables, of course). We show that each possible formulaethat can be constructed from G, G1, and G2 with only one logical connectiveis equivalent to another constructed with only ⊤, ∨, and ¬:• ¬G is equivalent to ¬G′ from the inductive hypothesis.• By considering the truth table in which the four possible valuations of

G1 and G2 are considered, one can establish that G1 ∨ G2 is equivalentto ¬(¬G′

1 ∧ ¬G′2). By the inductive hypothesis, the latter formula is con-

structed only from propositional variables, ⊤, ∧, and ¬.• By similar reasoning, G1 → G2 is equivalent to ¬(G′

1∧¬G′2), which satisfies

the claim.• Similar reasoning handles G1 ↔ G2 as well.

Hence, the claim is proved.Note that the main argument is essentially similar to the answer that the

reader might have provided in answering Exercise 1.3. Structural inductionmerely provides the basis for lifting the truth-table argument to a generalstatement about propositional formulae.

Structural induction is also useful for reasoning about interpretations offormulae, as the following example shows.

Example 4.11. This example relies on several basic concepts of set theory;however, even the reader unfamiliar with set theory can understand the ap-plication of structural induction without understanding the actual claim.

Consider ΣQ-formulae F [x1, . . . , xn] in which the only predicate is ≤,the only logical connectives are ∨ and ∧, and the only quantifier is ∀. We

110 4 Induction

prove that the set of satisfying TQ-interpretations of F (intuitively, those TQ-interpretations that assign to x1, . . . , xn values from Qn that satisfy F ) de-scribes a closed subset of Qn.

For the base case, consider any inequality α ≤ β with free variablesx1, . . . , xn. From basic set theory, the set of satisfying points is closed.

For the inductive step, consider formulae G, G1, and G2 constructedas specified. Assume as the inductive hypothesis that the satisfying TQ-interpretations for each comprise closed sets. Consider applying the allowedlogical connectives and quantifier:

• G1∧G2: The set described by this formula is the set-theoretic intersectionof the sets described by G1 and G2, and is thus closed by the inductivehypothesis and set theory.

• G1 ∨ G2: Similarly, the set described by this formula is the set-theoreticunion of the sets described by G1 and G2, and is thus closed by the induc-tive hypothesis and set theory.

• ∀x. G: Consider subformula G with free variable x (if x is not free in G,then the formula is equivalent to just G, which describes a closed set bythe inductive hypothesis). For each value a

b∈ Q, consider the formula

G ab

: G ∧ bx ≤ a ∧ a ≤ bx .

The set described by each G ab

is closed according to the inductive hy-pothesis and reasoning similar to the previous cases. From set theory,the conjunction of all such sets is still closed, so the set of satisfying TQ-interpretations of ∀x. G describes a closed set.

The induction is complete, so the claim is proved.Results from Chapter 7 prove that ∃ also preserves closed sets in TQ.

Remark 4.12. Example 4.11 considers a subset of FOL formulae. However,this subset is by definition closed under conjunction, disjunction, and universalquantification: if F , F1, and F2 are in the subset, then so are F1∧F2, F1∨F2,and ∀x. F ; and conversely. In other words, all strict subformulae of a formulain the subset are also in the subset, so that structural induction is applicable.

The proof of Lemma 2.31 provides another example of the application ofstructural induction.

4.5 Summary

This chapter covers several induction principles in several first-order theories:

• Stepwise induction is presented in the context of integer arithmetic andlists. The induction principle requires defining a step such as adding oneor constructing a list with one more element.

Exercises 111

• Complete induction is presented in the context of integer arithmetic. Theinduction principle relies on the well-foundedness of the < predicate.Rather than assuming that the desired property holds for one elementn and proving the property for the case n+1 as in stepwise reduction, oneassumes that the property holds for all elements n′ < n and proves thatit holds for n. This stronger assumption sometime yields easier or moreconcise proofs.

• Well-founded induction generalizes complete induction to other theories; itis presented in the context of lists and lexicographic tuples. The inductionprinciple requires a well-founded relation over the domain.

• Structural induction is an instance of well-founded induction in which thedomain is formulae and the well-founded relation is the strict subformularelation.

Besides being an important tool for proving first-order validities, induc-tion is the basis for both verification methodologies studied in Chapter 5.Structural induction also serves as the basis for the quantifier eliminationprocedures studied in Chapter 7.


The induction proofs in Examples 4.1, 4.3, and 4.9 are taken from the text ofManna and Waldinger [55].

Blaise Pascal (1623–1662) and Jacob Bernoulli (1654–1705) are recognizedas having formalized stepwise and complete induction, respectively. Less for-mal versions of induction appear in texts by Francesco Maurolico (1494–1575);Rabbi Levi Ben Gershon (1288–1344), who recognized induction as a distinctform of mathematical proof; Abu Bekr ibn Muhammad ibn al-Husayn Al-Karaji (953–1029); and Abu Kamil Shuja Ibn Aslam Ibn Mohammad IbnShaji (850–930) [97]. Some historians claim that Euclid may have appliedinduction informally.

Exercises

4.1 (T +cons). Prove the following in T +

cons:

(a) ∀u, v. flat(u) ∧ flat(v) → flat(concat(u, v))(b) ∀u. flat(u) → flat(rvs(u))

4.2 (T PAcons). Prove or disprove the following in T PA

cons:

(a) ∀u. u c u(b) ∀u, v, w. cons(u, v) c w → v c w(c) ∀u, v. v ≺c cons(u, v)

112 4 Induction

4.3 (T +cons ∪ T PA

cons). Prove the following in T +cons ∪ T PA

cons:

(a) ∀u, v. |concat(u, v)| = |u|+ |v|(b) ∀u. flat(u) → |rvs(u)| = |u|

4.4 (Chips). Does the process of Example 4.7 still halt if

(a) in Step 1, you return one red chip to the bag?(b) in Step 1, you add one blue chip?(c) in Step 1, you add one blue chip; and in Step 3, you return the blue chip

to the bag but do not add any other chips?

4.5 (Strict sublist). Modify Example 4.8 to prove

∀x, y. x ≺c y → |x| < |y| .

4.6 (Structural induction). Prove that every first-order formula F is equiv-alent to a first-order formula F ′ constructed with only the logical connectives⊤, ∧, and ¬ and the quantifier ∀.

4.7 (Finite number of sublists). Prove that the number of sublists of alist (defined in T PA

cons) is finite.

4.8 (⋆≺c is well-founded). Prove that ≺c, defined in T PAcons, is well-founded

over lists. To avoid circularity, do not apply well-founded induction in thisproof. Hint: Prove that ≺c is transitive and irreflexive (∀u. ¬(u ≺c u)); thenapply Exercise 4.7.

5

Program Correctness: Mechanics

When examining the detail of the algorithm, it seems probable that theproof will be helpful in explaining not only what is happening but why.

— Tony HoareAn Axiomatic Basis for Computer Programming, 1969

We are finally ready to apply FOL and induction to a real problem: spec-ifying and proving properties of programs. In this chapter, we develop thethree foundational methods that underly all verification and program analy-sis techniques. In the next chapter, we discuss strategies for applying them.

First, specification is the precise statement of properties that a programshould exhibit. The language of FOL offers precision. The remaining taskis to develop a scheme for embedding FOL statements into program textas program annotations. We focus on two forms of properties. Partialcorrectness properties, or safety properties, assert that certain states —typically, error states — cannot ever occur during the execution of a program.An important subset of this form of property is the partial correctness ofprograms: if a program halts, then its output satisfies some relation withits input. Total correctness properties, or progress properties, assert thatcertain states are eventually reached during program execution. Section 5.1presents specification in the context of a simple programming language, pi.

The next foundational method is the inductive assertion method forproving partial correctness properties. The inductive assertion method isbased on the mathematical induction of Chapter 4. To prove that every stateduring the execution of a program satisfies FOL formula F , prove as the basecase that F holds at the beginning of execution; assume as the inductive hy-pothesis that F currently holds (at some point during the execution); andprove as the inductive step that F holds after one more step of the program.Section 5.2 discusses the mechanics for reducing a program with a partial cor-rectness specification to this inductive argument. The challenge in applyingthis method is to discover additional annotations to make the induction gothrough. Chapter 6 discusses strategies for finding the extra information.

114 5 Program Correctness: Mechanics

The third foundational method is the ranking function method forproving total correctness properties. Proving total correctness breaks downinto two arguments. First, one proves that some partial correctness propertyholds using the inductive assertion method; second, one argues that some setof loops and recursive functions always halt. The ranking function methodapplies to the latter argument: one associates with each loop and recursivefunction a ranking function that maps the program variables to a well-founded domain (see Chapter 4); then one proves that whenever programcontrol moves from one ranking function to the next, the value decreases ac-cording to the well-founded relation. Since the relation is well-founded, thelooping and recursion must eventually halt. A typical total correctness prop-erty asserts that the program halts and its output satisfies some relation withits input. The ranking function method applies to the first conjunct (the pro-gram halts); the inductive assertion method applies to the second (its outputsatisfies some relation with its input). Section 5.3 presents the ranking func-tion method.

We explain all concepts in this and the next chapter with a set of exampleprograms that manipulate arrays. We chose these programs for several rea-sons. First, they should be familiar to most readers; the reader can thus focuson the verification methodology rather than on understanding new complexprograms. Second, our correctness proofs rely heavily on the decision proce-dures that are discussed in Part II of this book. Finally, they are small butdense, allowing us to exhibit common techniques for proving interesting factsabout programs. The reader should keep in mind, however, that the methodsof this chapter underlie software and hardware analyses that are applied inpractice.

The reader may find the contents of this chapter to be rather technical.Indeed, the transformation of an annotated program into a set of verificationconditions is a purely mechanical task. Fortunately, a verifying compilerdoes this work in practice: it parses an annotated program, checking the syn-tax and semantics as usual, and generates a set of verification conditions. Butjust as learning how compilers work is important for understanding program-ming languages, learning how verifying compilers work is important for un-derstanding verification and programming languages with annotations. More-over, applying the steps of generating verification conditions provides practicein manipulating and understanding FOL formulae, a useful skill. Chapter 6discusses strategies for writing annotations, a task that cannot be fully auto-mated.

5.1 pi: A Simple Imperative Language

This section introduces the programming language pi, an imperative languagewith facilities for annotations. To allow us to focus on the fundamentals ofprogram verification, pi lacks complicating features of typical imperative lan-

5.1 pi: A Simple Imperative Language 115

@pre ⊤@post ⊤bool LinearSearch(int[] a, int ℓ, int u, int e) for @ ⊤

(int i := ℓ; i ≤ u; i := i + 1) if (a[i] = e) return true;

return false;

Fig. 5.1. LinearSearch

guages: its data types do not include pointer or reference types; and it doesnot allow global variables, although it does have global constants (see Exercise6.5). After reading this chapter and Chapter 12, the interested reader shouldconsult the wide literature on program analysis to learn how the techniques ofthese chapters extend to reasoning about standard programming languages.

5.1.1 The Language

Because pi is superficially a C-like language with restrictions, we present theessential features of pi through examples.

Example 5.1. Figure 5.1 lists the function LinearSearch, which searches therange [ℓ, u] of an array a of integers for a value e. It returns true iff the givenarray contains the value between the lower bound ℓ and upper bound u. Itbehaves correctly only if 0 ≤ ℓ and u < |a|; otherwise, the array a is accessedoutside of its domain [0, |a| − 1]. |a| denotes the length of array a.

Observe that most of the syntax is similar to C. For example, the for loopsets i to be ℓ initially and then executes the body of the loop and increments iby 1 as long as i ≤ u. Also, an integer array has type int[], which is constructedfrom base type int. One syntactic difference occurs in assignment, which iswritten := to distinguish it from the equality predicate =. We use = as theequality predicate, rather than ==, to correspond to the standard equalitypredicate of FOL. Finally, unlike C, pi has type bool and constants true andfalse.

Notice the lines beginning with @. They are program annotations, whichwe discuss in detail in the next section.

In LinearSearch, a, ℓ, u, and e are the formal parameters (also, param-eters) of the function. If LinearSearch is called as LinearSearch(b, 0, |b| − 1, v),then b, 0, |b| − 1, and v are the arguments.

Example 5.2. Figure 5.2 lists the recursive function BinarySearch, whichsearches a range [ℓ, u] of a sorted (weakly increasing: a[i] ≤ a[j] if i ≤ j)array a of integers for a value e. Like LinearSearch, it returns true iff the

@pre ⊤@post ⊤bool BinarySearch(int[] a, int ℓ, int u, int e) if (ℓ > u) return false;else int m := (ℓ + u) div 2;if (a[m] = e) return true;else if (a[m] < e) return BinarySearch(a, m + 1, u, e);else return BinarySearch(a, ℓ, m − 1, e);

Fig. 5.2. BinarySearch

given array contains the value in the range [ℓ, u]. It behaves correctly only if0 ≤ ℓ and u < |a|.

One level of recursion operates as follows. If the lower bound ℓ of the rangeis greater than the upper bound u, then the (empty) subarray cannot containe, so it returns false. Otherwise, it examines the middle element a[m] of thesubarray: if it is e, then the subarray clearly contains e; otherwise, it recurseson the left half if a[m] < e and on the right half if a[m] > e.

pi syntactically distinguishes between integer division and real division: forint variables a and b, write a div b instead of a/b. Integer division is definedas follows:

a div bdef=⌊a

b

⌋.

That is, a div b is equal to the greatest integer less than or equal to ab

(thefloor of a

b).

Example 5.3. Figure 5.3 lists the function BubbleSort, which sorts an integerarray. It works by “bubbling” the largest element of the left unsorted region ofthe array toward the sorted region on the right; this element then becomes theleft element of the sorted region, enlarging the region by one cell. In Figure5.4, for example, the first line shows an array in which the rightmost boxedcells comprise the sorted region and the other cells comprise the unsortedregion. In the final line, the sorted region has been expanded by one cell.

Figure 5.4 lists a portion of a sample execution trace. The right two cells(5, 6) of the array have already been sorted. In the trace, the inner loopmoves the largest element 4 of the unsorted region to the right to join thesorted region, which is indicated by the dotted rectangle. In the first twosteps, a[j] ≤ a[j + 1] (2 ≤ 3 and 3 ≤ 4), so the values of cell j and j + 1are not swapped in either case. In the subsequent two steps, a[j] > a[j + 1](4 > 1 and 4 > 2), causing a swap at each step. In the fifth step, the innerloop’s guard i < j no longer holds, so the inner loop exits and the outer

@pre ⊤@post ⊤int[] BubbleSort(int[] a0) int[] a := a0;for @ ⊤

(int i := |a| − 1; i > 0; i := i − 1) for @ ⊤

(int j := 0; j < i; j := j + 1) if (a[j] > a[j + 1]) int t := a[j];a[j] := a[j + 1];a[j + 1] := t;

return a;

Fig. 5.3. BubbleSort

2 3 4 1 2 5 6j i

2 3 4 1 2 5 6j i

2 3 4 1 2 5 6

j i

2 3 1 4 2 5 6

j i

2 3 1 2 4 5 6j, i

2 3 1 2 4 5 6j i

Fig. 5.4. Sample execution of BubbleSort

loop decrements i by 1. The sorted region has been expanded by one cell, asindicated by the final dotted rectangle. The last step shows the beginning ofthe next round of the inner loop.

Because pi does not have pointer or reference types, all data are passed byvalue, including arrays and structures. If BubbleSort were missing the return


typedef struct qs int pivot;int[] array;

qs;

Fig. 5.5. Structure qs

statement, then calling it would not have any discernible effect on the callingcontext. Additionally, pi does not allow updates to parameters, so BubbleSortassigns a0 to a fresh variable a in the first line. This artificial requirementmakes reasoning about functions easier: in annotations (see Section 5.1.2)throughout the function, one can always reference the input.

In this book, our example programs manipulate arrays rather than re-cursive data structures. The reason is that we can express more interestingproperties about arrays in the fragment of the theory of arrays studied inChapter 11 than we can about lists in the fragment of the theory of recursivedata structures studied in Chapter 9. This bias is a reflection of the structureand content of this book, not of what is theoretically possible.

However, we sometimes use records, a basic recursive data type, to allowa function to return multiple values. The following example illustrates such arecord type, which is used in the program QuickSort (see Section 6.2).

Example 5.4. The structure qs of Figure 5.5 is a record with two fields: thepivot field of type int and the array field of type array. If x is a variable oftype qs, then x.pivot returns the value in its pivot field; also, x.array[i] := vassigns v to position i of x’s array field.

5.1.2 Program Annotations

The most important feature of pi is the capacity for complex function an-notations. An annotation is a FOL formula F whose free variables includeonly the program variables of the function in which the annotation occurs. Anannotation F at location L asserts that F is true whenever program controlreaches L. We discuss several forms of annotations in this section.

Function Specifications

The function specification of a function is a pair of annotations. The func-tion precondition is a formula F whose free variables include only the for-mal parameters. It specifies what should be true upon entering the function— or, in other words, under what inputs the function is expected to work. Thefunction postcondition is a formula G whose free variables include only theformal parameters and the special variable rv representing the return valueof the function. The postcondition relates the function’s output (the returnvalue rv) to its input (the parameters).

@pre 0 ≤ ℓ ∧ u < |a|@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e

bool LinearSearch(int[] a, int ℓ, int u, int e) for @ ⊤


return false;

Fig. 5.6. LinearSearch with function specification

@pre ⊤@post rv ↔ ∃i. 0 ≤ ℓ ≤ i ≤ u < |a| ∧ a[i] = e

bool LinearSearch(int[] a, int ℓ, int u, int e) if (ℓ < 0 ∨ u ≥ |a|) return false;for @ ⊤


return false;

Fig. 5.7. Robust LinearSearch with function specification

Example 5.5. In Example 5.1, we informally specified the behavior of Lin-earSearch as follows: LinearSearch returns true iff the array a contains thevalue e in the range [ℓ, u]. It behaves correctly only when ℓ ≥ 0 and u < |a|.

Function specifications formalize such statements. Figure 5.6 presents Lin-earSearch with its specification. The precondition asserts that the lower boundℓ should be at least 0 and that the upper bound u should be less than thelength |a| of the array a. The postcondition asserts that the return value rvis true iff a[i] = e for some index i ∈ [ℓ, u] of a.

Example 5.6. A nontrivial precondition (a formula other than ⊤) is not al-ways acceptable, especially if a function is public to a module. Figure 5.7 listsa more robust version of linear search. The formula

0 ≤ ℓ ≤ i ≤ u < |a|

abbreviates

0 ≤ ℓ ∧ ℓ ≤ i ∧ i ≤ u ∧ u < |a| .

A nontrivial precondition is sometimes acceptable for a function that is privateto a module. The verification method of this chapter checks that every instanceof a call to such a function obeys the precondition.

@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e

bool BinarySearch(int[] a, int ℓ, int u, int e) if (ℓ > u) return false;else int m := (ℓ + u) div 2;if (a[m] = e) return true;else if (a[m] < e) return BinarySearch(a, m + 1, u, e);else return BinarySearch(a, ℓ, m − 1, e);

Fig. 5.8. BinarySearch with function specification

@pre ⊤@post sorted(rv , 0, |rv | − 1)int[] BubbleSort(int[] a0) int[] a := a0;for @ ⊤

(int i := |a| − 1; i > 0; i := i − 1) for @ ⊤


return a;

Fig. 5.9. BubbleSort with function specification

Example 5.7. Figure 5.8 lists BinarySearch with its specification. As ex-pected, its postcondition is identical to the postcondition of LinearSearch.However, its precondition also states that the array a is sorted.

The sorted predicate is defined in the combined theory of integers andarrays, TZ ∪ TA:

sorted(a, ℓ, u) ⇔ ∀i, j. ℓ ≤ i ≤ j ≤ u → a[i] ≤ a[j] .

Example 5.8. Figure 5.9 lists BubbleSort with its specification. Given anyarray, the returned array is sorted. Of course, other properties are desirable


and could be specified as well. For example, the returned array rv should bea permutation of the original array a0 (see Exercise 6.5).

Section 5.2 presents a method for proving that a function satisfies itspartial correctness specification: if the function precondition is satisfied andthe function halts, then the function postcondition holds upon return. Section5.3 discusses a method for proving that, additionally, the function always halts.

Loop Invariants

Each for loop and while loop has an attendant annotation called the loopinvariant. A while loop

while

@ F(〈condition〉) 〈body〉

says to apply the 〈body〉 as long as 〈condition〉 holds. The assertion F musthold at the beginning of every iteration. It is evaluated before the 〈condition〉is evaluated, so it must hold even on the final iteration when 〈condition〉 isfalse. Therefore, on entering the 〈body〉 of the loop,

F ∧ 〈condition〉must hold, and on exiting the loop,

F ∧ ¬〈condition〉must hold.

To consider a for loop, translate the loop

for

@ F(〈initialize〉; 〈condition〉; 〈increment〉) 〈body〉

into the equivalent loop

〈initialize〉;while

@ F(〈condition〉) 〈body〉〈increment〉

F must hold after the 〈initialize〉 statement has been evaluated and, on eachiteration, before the 〈condition〉 is evaluated.

bool LinearSearch(int[] a, int ℓ, int u, int e) for

@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)(int i := ℓ; i ≤ u; i := i + 1) if (a[i] = e) return true;

return false;

Fig. 5.10. LinearSearch with loop invariant

Example 5.9. Figure 5.10 lists LinearSearch with a nontrivial loop invariantat L. It asserts that whenever control reaches L, the loop index is at least ℓand that a[j] 6= e for previously examined indices j.

Section 5.2 shows that loop invariants are crucial for constructing an in-ductive argument that a function obeys its specification.

Assertions

In pi, one can add an annotation anywhere. When an annotation is not afunction precondition, function postcondition, or loop invariant, we call it anassertion. Assertions allow programmers to provide a formal comment. Forexample, if at the statement

i := i + k;

the programmer thinks that k is positive, then the programmer can add anassertion stating that supposition:

@ k > 0;i := i + k;

Later, the programmer’s hyothesis about k is verified with formal verificationat compile time or with dynamic assertion tests at runtime.

Runtime assertions are a special class of assertions. In most program-ming languages, runtime errors include division by 0, modulo by 0, anddereference of null. In particular, division by 0 and modulo by 0 cause hard-ware exceptions, while only some languages, such as Java, catch a dereferenceof null. In pi, runtime errors include division by 0, modulo by 0, and accessingan array out of bounds. The pi compiler generates runtime assertions to catchruntime errors.

Example 5.10. Figure 5.11 lists LinearSearch with runtime assertions. Thearray read a[i] is protected by the assertion that i is a legal index of a.

5.2 Partial Correctness 123

@pre ⊤@post ⊤bool LinearSearch(int[] a, int ℓ, int u, int e) for @ ⊤

(int i := ℓ; i ≤ u; i := i + 1) @ 0 ≤ i < |a|;if (a[i] = e) return true;

return false;

Fig. 5.11. LinearSearch with runtime assertions

@pre ⊤@post ⊤bool BinarySearch(int[] a, int ℓ, int u, int e) if (ℓ > u) return false;else

@ 2 6= 0;int m := (ℓ + u) div 2;@ 0 ≤ m < |a|;if (a[m] = e) return true;else

@ 0 ≤ m < |a|;if (a[m] < e) return BinarySearch(a,m + 1, u, e);else return BinarySearch(a, ℓ, m − 1, e);

Fig. 5.12. BinarySearch with runtime assertions

Example 5.11. Figure 5.12 lists BinarySearch with runtime assertions. Thefirst assertion protects the division: it asserts that 2 6= 0, which clearly holds.The next two assertions protect the array reads.

Example 5.12. Figure 5.13 lists BubbleSort with compiler-generated runtimeassertions. All assertions protect array accesses. The first two runtime asser-tions are sufficient to protect all array accesses. Figure 5.14 lists a conciseversion.

5.2 Partial Correctness

Having specified and implemented each function of a program, we would liketo prove that the functions obey their specifications (from another perspective,

(int i := |a| − 1; i > 0; i := i − 1) for @ ⊤

(int j := 0; j < i; j := j + 1) @ 0 ≤ j < |a|;@ 0 ≤ j + 1 < |a|;if (a[j] > a[j + 1])

@ 0 ≤ j < |a|;int t := a[j];@ 0 ≤ j < |a|;@ 0 ≤ j + 1 < |a|;a[j] := a[j + 1];@ 0 ≤ j + 1 < |a|;a[j + 1] := t;

return a;

Fig. 5.13. BubbleSort with runtime assertions

that their specifications reflect their actual behavior). A function is partiallycorrect if when the function’s precondition is satisfied on entry, its post-condition is satisfied when the function returns (if it ever does). In general,functions may not halt: a loop’s guard could be incorrect, or a recursive func-tion could fail to handle a particular base case. Section 5.3 discusses how toprove that a function always halts.

We present the inductive assertion method for proving that a programis partially correct. The method reduces each function and its annotations toa finite set of verification conditions (VCs), which are FOL formulae. Ifall of a function’s VCs are valid, then the function obeys its specification. Thereduction occurs in two stages: first, each function of the annotated program isbroken down into a finite set of basic paths (Sections 5.2.1 and 5.2.2); second,each basic path generates a verification condition (Section 5.2.4). Sections5.2.3 and 5.2.5 provide a more abstract view of the inductive assertion method.

Loops complicate proofs of partial correctness because they create an un-bounded number of paths from function entry to exit. Recursive functionssimilarly complicate proofs. A path is a sequence of program statements. Forloops, loop invariants cut the paths into a finite set of basic paths, while forrecursive functions, the function specification of the recursive function cutsthe paths. In this section, we assume that we are given function specifications

(int i := |a| − 1; i > 0; i := i − 1) for @ ⊤

(int j := 0; j < i; j := j + 1) @ 0 ≤ j < |a| ∧ 0 ≤ j + 1 < |a|;if (a[j] > a[j + 1]) int t := a[j];a[j] := a[j + 1];a[j + 1] := t;

return a;

Fig. 5.14. BubbleSort with compressed runtime assertions


bool LinearSearch(int[] a, int ℓ, int u, int e) for

@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)(int i := ℓ; i ≤ u; i := i + 1) if (a[i] = e) return true;

return false;

Fig. 5.15. LinearSearch with loop invariants

and loop invariants and concentrate on the task of generating the correspond-ing verification conditions. In practice, this task is performed by a verifyingcompiler. Chapter 6 discusses strategies for constructing specifications andloop invariants.

5.2.1 Basic Paths: Loops

A basic path is a sequence of instructions that begins at the function pre-condition or a loop invariant and ends at a loop invariant, an assertion, orthe function postcondition. Moreover, a loop invariant can only occur at thebeginning or the ending of a basic path. Thus, basic paths do not cross loops.The following examples illustrate the characteristics of basic paths.

Example 5.13. Figure 5.15 lists an annotated version of LinearSearch. Itsbasic paths are the following. The first basic path starts at the function pre-condition, enters the for loop via the initialization statement, and ends atthe loop invariant L:

(1)@pre 0 ≤ ℓ ∧ u < |a|i := ℓ;@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)

The second basic path begins at the loop invariant at L, passes the loopguard i ≤ u, passes the guard a[i] = e of the if statement, executes thereturn (of true), and ends at the postcondition:

(2)@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)assume i ≤ u;assume a[i] = e;rv := true;@post rv ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e

This path exhibits two new aspects of basic paths. First, return statementsbecome assignments to the special variable rv representing the return value.

Second, guards arising in program statements (in for loop guards, whileloop guards, or if statements) become assume statements in basic paths.An assume statement assume c in a basic path means that the remainder ofthe basic path is executed only if the condition c holds at assume c. Eachguard with condition c results in two assumptions: the guard holds (c) orit does not hold (¬c). Therefore, each guard produces two paths with thesame prefix up to the guard. They diverge on the assumption: one basic pathhas the statement assume c, and the other has the statement assume ¬c.These assumptions and the control structure of the program determine theconstruction of the remainder of the basic paths.

For example, the third path has the same prefix as (2) but makes theopposite assumption at the if statement guard: it assumes a[i] 6= e ratherthan a[i] = e. Therefore, this path loops back around to the loop invariant:

(3)@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)assume i ≤ u;assume a[i] 6= e;i := i + 1;@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j u rather thani ≤ u. Therefore, this path exits the loop and returns false:

@pre

L

@post

(1)

(3)

(2),(4)

Fig. 5.16. Visualization of basic paths of LinearSearch

(4)@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j u;rv := false;@post rv ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e

To avoid forgetting a basic path, we list paths in a depth-first order. Whena guard is encountered, assume that it holds and generate the resulting paths;then assume that it does not hold and generate the resulting paths. In thisexample, the loop guard i ≤ u in (2) is first encountered; (2) and (3) followfrom the assumption that the loop guard holds, while (4) follows from theassumption that it does not hold. The if statement guard a[i] = e is nextencountered in (2); (2) follows from the assumption that it holds, while (3)follows from the assumption that it does not hold. Figure 5.16 visualizes thesebasic paths.

Example 5.14. Figure 5.17 lists BubbleSort with loop invariants. The outerloop invariant at L1 asserts that

• i is in the range [−1, |a| − 1] (if |a| = 0, then i is initially −1);• a is sorted in the range [i, |a| − 1];• and a is partitioned such that each element in the range [0, i] is at most

(less than or equal to) each element in the range [i + 1, |a| − 1].

Its inner loop invariant at L2 asserts that

• i is in the range [1, |a| − 1], and j is in the range [0, i];• a is sorted in the range [i, |a| − 1] as in the outer loop;• a is partitioned as in the outer loop;• and a is also partitioned such that each element in the range [0, j − 1] is

at most a[j].

The partitioned predicate is defined in the theory TZ ∪ TA:

partitioned(a, ℓ1, u1, ℓ2, u2)⇔ ∀i, j. ℓ1 ≤ i ≤ u1 < ℓ2 ≤ j ≤ u2 → a[i] ≤ a[j] .

@pre ⊤@post sorted(rv , 0, |rv | − 1)int[] BubbleSort(int[] a0) int[] a := a0;for

@L1 :

2

4

−1 ≤ i < |a|∧ partitioned(a, 0, i, i + 1, |a| − 1)∧ sorted(a, i, |a| − 1)

3

5

(int i := |a| − 1; i > 0; i := i − 1) for

@L2 :

2

6

6

4

1 ≤ i < |a| ∧ 0 ≤ j ≤ i

∧ partitioned(a, 0, i, i + 1, |a| − 1)∧ partitioned(a, 0, j − 1, j, j)∧ sorted(a, i, |a| − 1)

3

7

7

5


return a;

Fig. 5.17. BubbleSort with loop invariants

Performing a depth-first exploration, the first basic path starts at the pre-condition and ends at the outer loop invariant at L1:

(1)@pre ⊤;a := a0;i := |a| − 1;@L1 : −1 ≤ i < |a| ∧ partitioned(a, 0, i, i + 1, |a| − 1) ∧ sorted(a, i, |a| − 1)

The second basic path starts at L1 and ends at the inner loop invariant at L2

(recall that the annotation is checked after the loop initialization j := 0):(2)

@L1 : −1 ≤ i < |a| ∧ partitioned(a, 0, i, i + 1, |a| − 1) ∧ sorted(a, i, |a| − 1)assume i > 0;j := 0;

@L2 :

[1 ≤ i < |a| ∧ 0 ≤ j ≤ i ∧ partitioned(a, 0, i, i + 1, |a| − 1)∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)

]

The third and fourth basic paths follow the inner loop, each handling oneassumption on the guard a[j] > a[j + 1] of the if statement:

(3)

@L2 :


]

assume j < i;assume a[j] > a[j + 1];t := a[j];a[j] := a[j + 1];a[j + 1] := t;j := j + 1;

@L2 :


]

(4)

@L2 :


]

assume j < i;assume a[j] ≤ a[j + 1];j := j + 1;

@L2 :


]

The fifth basic path starts at L2, exits the inner loop, and decrements i on itsway to L1:

(5)

@L2 :


]

assume j ≥ i;i := i− 1;@L1 : −1 ≤ i < |a| ∧ partitioned(a, 0, i, i + 1, |a| − 1) ∧ sorted(a, i, |a| − 1)

The final basic path starts at L1, exits the outer loop, and then exits thefunction, returning the (presumably sorted) array a:

(6)@L1 : −1 ≤ i < |a| ∧ partitioned(a, 0, i, i + 1, |a| − 1) ∧ sorted(a, i, |a| − 1)assume i ≤ 0;rv := a;@post sorted(rv , 0, |rv | − 1)

Figure 5.18 visualizes these basic paths.

Example 5.15. Figure 5.19 lists BubbleSort with runtime assertions at L3

and a different set of loop invariants at L1 and L2 relevant for proving theruntime assertions. Six basic paths correspond in structure to those of Figure5.18, although their content varies based on the new precondition, postcondi-

@pre

L1

@post L2

(1)

(2)(6) (5)

(3), (4)

Fig. 5.18. Visualization of basic paths of BubbleSort

@pre ⊤@post ⊤int[] BubbleSort(int[] a0) int[] a := a0;for

@L1 : −1 ≤ i < |a|(int i := |a| − 1; i > 0; i := i − 1) for

@L2 : 0 a[j + 1]) int t := a[j];a[j] := a[j + 1];a[j + 1] := t;

return a;

Fig. 5.19. BubbleSort with runtime assertions

tion, and loop invariants. These basic paths ignore the runtime assertion atL3. Then one additional basic path ends at the runtime assertion:

(7)@L2 : 0 < i < |a| ∧ 0 ≤ j ≤ iassume j < i;@L3 : 0 ≤ j < |a| ∧ 0 ≤ j + 1 < |a|

@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e

bool BinarySearch(int[] a, int ℓ, int u, int e) if (ℓ > u) return false;else int m := (ℓ + u) div 2;if (a[m] = e) return true;else if (a[m] < e)

@R1 : 0 ≤ m + 1 ∧ u < |a| ∧ sorted(a, m + 1, u);return BinarySearch(a, m + 1, u, e);

else @R2 : 0 ≤ ℓ ∧ m − 1 < |a| ∧ sorted(a, ℓ, m − 1);return BinarySearch(a, ℓ, m − 1, e);

Fig. 5.20. BinarySearch with function call assertions

5.2.2 Basic Paths: Function Calls

Like loops, recursive functions create an unbounded number of paths withinprograms. But just as loop invariants cut loops to produce a finite number ofbasic paths, function specifications cut function calls.

Recall that the function postcondition is a relation between the returnvalue rv and the formal parameters. A function’s postcondition summarizesthe effects of calling it. We use these summaries to replace function calls inbasic paths.

Remark 5.16. The postconditions of the functions of a program need onlyinclude information that is relevant for proving the given specification, so thesummaries may be incomplete. Ignoring irrelevant aspects of functions reducesthe size of annotations. Chapter 6 discusses techniques for developing functionspecifications.

The replacement of function calls by function summaries makes the listingof basic paths (and the resulting analysis described in Section 5.2.4) local tofunctions. Basic paths do not span multiple functions. However, recall that thefunction postcondition is guaranteed to hold on return only when the functionprecondition is satisfied on entry. To ensure that the precondition is satisfied,each instance of a function call generates an extra basic path in which thecalled function’s precondition is asserted. An example clarifies this discussion.

Example 5.17. Figure 5.8 lists BinarySearch with its function specification.BinarySearch contains two (recursive) function calls. In Figure 5.20, each func-tion call is protected by a function call assertion at R1 and R2. Each asser-tion is constructed by applying a substitution to BinarySearch’s precondition

F [a, ℓ, u, e] : 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u) .

The first function call is BinarySearch(a, m + 1, u, e), so the function call as-sertion at R1 is Fσ1, where

σ1 : a 7→ a, ℓ 7→ m + 1, u 7→ u, e 7→ e .

The notation of Section 1.5 allows us to write F [a, m + 1, u, e].The second function call is BinarySearch(a, ℓ, m−1, e), so the function call

assertion at R2 is F [a, ℓ, m−1, e]. These assertions are treated in the same wayas other assertions, such as runtime assertions. So far, we have the followingbasic paths:

(1)@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)assume ℓ > u;rv := false;@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e

(2)@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)assume ℓ ≤ u;m := (ℓ + u) div 2;assume a[m] = e;rv := true;@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e

(3)@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)assume ℓ ≤ u;m := (ℓ + u) div 2;assume a[m] 6= e;assume a[m] < e;@R1 : 0 ≤ m + 1 ∧ u < |a| ∧ sorted(a, m + 1, u)

(5)@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)assume ℓ ≤ u;m := (ℓ + u) div 2;assume a[m] 6= e;assume a[m] ≥ e;@R2 : 0 ≤ ℓ ∧ m− 1 < |a| ∧ sorted(a, ℓ, m− 1)

Because BinarySearch lacks loops, each basic path starts at the function pre-condition.

It remains to consider paths (4) and (6), which pass through the recursivefunction calls and end at the postcondition. Paths (3) and (5) end in thefunction call assertions at R1 and R2 protecting these function calls. Sincethey assert that the called BinarySearch’s precondition holds, we can assumethat the returned values obey the postcondition of BinarySearch in each ofthe calling contexts. Therefore, we can use the function postcondition as asummary of the function call:

(4)@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)assume ℓ ≤ u;m := (ℓ + u) div 2;assume a[m] 6= e;assume a[m] < e;assume v1 ↔ ∃i. m + 1 ≤ i ≤ u ∧ a[i] = e;rv := v1;@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e

The lines

assume v1 ↔ ∃i. m + 1 ≤ i ≤ u ∧ a[i] = e;rv := v1;

arise as follows. First, translate the return statement

return BinarySearch(a, m + 1, u, e);

into an assignment to rv , as usual:

rv := BinarySearch(a, m + 1, u, e);

Next, given that the precondition holds (from path (3)), assume that thepostcondition holds. Therefore, summarize the function call with a relationbased on BinarySearch’s postcondition,

G[a, ℓ, u, e, rv] : rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e .

Specifically, the relation is G[a, m + 1, u, e, v1], where v1 is a fresh variablethat captures the return value. In the basic path, assume this relation; thenuse the return value v1 in the assignment:

assume G[a, m + 1, u, e, v1];rv := v1;

These are the penultimate lines of (4). Hence, (4) replaces the function callBinarySearch(a, m + 1, u, e) with a summary based on the function postcon-dition. Now reasoning about the basic path does not require reasoning aboutall of BinarySearch at once.

@pre

R1 R2

@post

(1) (2)

(3),(4)

(4)

(5),(6)

(6)

Fig. 5.21. Visualization of basic paths of BinarySearch

Construct the final basic path for the function call BinarySearch(a, ℓ, m−1, e) similarly:

(6)@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)assume ℓ ≤ u;m := (ℓ + u) div 2;assume a[m] 6= e;assume a[m] ≥ e;assume v2 ↔ ∃i. ℓ ≤ i ≤ m− 1 ∧ a[i] = e;rv := v2;@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e

Again, v2 is a fresh variable.Figure 5.21 visualizes these basic paths. Paths (4) and (6) are shown to

pass through locations R1 and R2, respectively.

For the general case, consider function f with prototype

@pre F [p1, . . . , pn]@post G[p1, . . . , pn, rv ]type0 f(type1 p1, . . . , typen pn)

Suppose that f is called in context

w := f(e1, . . . , en);

where e1, . . . , en are expressions. Then augment the calling context with thefunction call assertion:

@ F [e1, . . . , en];w := f(e1, . . . , en);

Treat this new assertion the same as any assertion: it results in at least onebasic path ending in

. . .@ F [e1, . . . , en]

Finally, in basic paths that pass through the function call, replace the functioncall by an assumption and assignment constructed from the postcondition,where v is a fresh variable:

. . .assume G[e1, . . . , en, v];w := v;. . .

Note that rv need not have type bool as in BinarySearch. For example, fora function with prototype

@pre ⊤@post rv ≥ xint g(int x)

the statement

w := g(n + 1);

is summarized in basic paths as follows:

assume v ≥ n + 1;w := v;

5.2.3 Program States

Before presenting the final step in proving partial correctness, we formalizeprogram state. A program state s is an assignment of values of the propertype to all variables. The program variables include a distinguished variablepc, the program counter. It holds the current location of control.

Example 5.18. The state

s : pc 7→ L1, a 7→ [2; 0; 1], i 7→ 2, j 7→ 0, t 7→ 2, rv 7→ []

is a state of BubbleSort in which control resides at L1.

A state can be extended to a logical interpretation. Suppose that T is thetheory that captures the functions (+, −, etc.) and predicates (=, <, etc.)of the program. Extend a state s to a logical T -interpretation I : (DI , αI):let DI be all values of the program types; and construct αI by using the as-signments of s and adding assignments for all logical functions and predicatesso that I is a T -interpretation. Subsequently, when we say a state, we meaneither the assignment of program variables to values or the extension to a T -interpretation for the appropriate theory T , depending on the context. Whenwe write s |= F , we mean that I |= F for a T -interpretation I extending s.


•s

wp(F, S)

•s′

F

S

Fig. 5.22. Weakest precondition

Example 5.19. To extend

s : pc 7→ L1, a 7→ [2; 0; 1], i 7→ 2, j 7→ 0, t 7→ 2, rv 7→ []

to a (TZ ∪TA)-interpretation I : (DI , αI), let DI be the set of all integers andarrays of integers; and for αI , assign the standard functions to ·[·], ·〈· ⊳ ·〉, +,−, etc.

5.2.4 Verification Conditions

Our goal is to reduce an annotated function to a finite set of FOL formu-lae, called verification conditions, such that their validity implies that thefunction’s behavior agrees with its annotations. The reduction to basic pathsreduces reasoning about the function to reasoning about a finite set of ba-sic paths. The final reduction from basic paths to VCs requires a mechanismfor incorporating the effects of program statements into FOL formulae. Theweakest precondition predicate transformer is the mechanism. A predi-cate transformer p is a function

p : FOL× stmts→ FOL

that maps a FOL formula F ∈ FOL and program statement S ∈ stmts to aFOL formula.

The weakest precondition wp(F, S) has the defining characteristic that ifstate s is such that

s |= wp(F, S)

and if statement S is executed on state s to produce state s′, then

s′ |= F .

This situation is visualized in Figure 12.1(a). The region labeled F is the setof states that satisfy F ; similarly, the region labeled wp(F, S) is the set of


states that satisfy wp(F, S). Every state s on which executing statement Sleads to a state s′ in the F region must be in the wp(F, S) region.

Define the weakest precondition for the two statement types of basic pathsintroduced in Section 5.2.1:

• Assumption: What must hold before statement assume c is executed toensure that F holds afterward? If c→ F holds before, then satisfying c inassume c guarantees that F holds afterward:

wp(F, assume c) ⇔ c→ F

• Assignment : What must hold before statement v := e is executed to ensurethat F [v] holds afterward? If F [e] holds before, then assigning e to v withv := e makes F [v] hold afterward:

wp(F [v], v := e) ⇔ F [e]

For a sequence of statements S1; . . . ; Sn, define

wp(F, S1; . . . ; Sn) ⇔ wp(wp(F, Sn), S1; . . . ; Sn−1) .

The weakest precondition moves a formula backward over a sequence of state-ments: for F to hold after executing S1; . . . ; Sn, wp(F, S1; . . . ; Sn) must holdbefore executing the statements. Because basic paths have only assumptionand assignment statements, the definition of wp is complete.

Then the verification condition of basic path

@ FS1;...Sn;@ G

is

F → wp(G, S1; . . . ; Sn) .

Its validity implies that when F holds before the statements of the path areexecuted, then G holds afterward. Traditionally, this verification condition isdenoted by the Hoare triple

FS1; . . . ; SnG .

Example 5.20. Consider the basic path(1)

@ x ≥ 0x := x + 1;@ x ≥ 1

The VC is

x ≥ 0x := x + 1x ≥ 1 : x ≥ 0 → wp(x ≥ 1, x := x + 1)

so compute

wp(x ≥ 1, x := x + 1)⇔ (x ≥ 1)x 7→ x + 1⇔ x + 1 ≥ 1⇔ x ≥ 0

Simplifying the VC based on this computation produces

x ≥ 0 → x ≥ 0 ,

which is clearly TZ-valid.

Example 5.21. We generate the VC corresponding to basic path (2) in Ex-ample 5.13 (LinearSearch):

(2)@L : F : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)S1 : assume i ≤ u;S2 : assume a[i] = e;S3 : rv := true;@post G : rv ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e

The VC is

F → wp(G, S1; S2; S3) ,

so compute

wp(G, S1; S2; S3)⇔ wp(wp(rv ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e, rv := true), S1; S2)⇔ wp(true ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e, S1; S2)⇔ wp(∃j. ℓ ≤ j ≤ u ∧ a[j] = e, S1; S2)⇔ wp(wp(∃j. ℓ ≤ j ≤ u ∧ a[j] = e, assume a[i] = e), S1)⇔ wp(a[i] = e → ∃j. ℓ ≤ j ≤ u ∧ a[j] = e, S1)⇔ wp(a[i] = e → ∃j. ℓ ≤ j ≤ u ∧ a[j] = e, assume i ≤ u)⇔ i ≤ u → (a[i] = e → ∃j. ℓ ≤ j ≤ u ∧ a[j] = e)

Replacing F and wp(G, S1; S2; S3) according to the computed formula resultsin the VC

ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)→ (i ≤ u → (a[i] = e → ∃j. ℓ ≤ j ≤ u ∧ a[j] = e))

or, equivalently,

ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e) ∧ i ≤ u ∧ a[i] = e→ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e

according to the equivalence

F1 ∧ F2 → (F3 → (F4 → F5)) ⇔ (F1 ∧ F2 ∧ F3 ∧ F4) → F5 .

This formula is (TZ ∪ TA)-valid.

Example 5.22. For basic path (3) of Example 5.13 (LinearSearch),(3)

@L : F : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)S1 : assume i ≤ u;S2 : assume a[i] 6= e;S3 : i := i + 1;@L : G : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)

we use a different strategy to compute the corresponding VC

F → wp(G, S1; S2; S3) .

We collect all modifications to G during applications of wp and then applythem all at once. Compute

wp(G, S1; S2; S3)⇔ wp(wp(G, i := i + 1), S1; S2)⇔ wp(Gi 7→ i + 1, S1; S2)⇔ wp(wp(Gi 7→ i + 1, assume a[i] 6= e), S1)⇔ wp(a[i] 6= e → Gi 7→ i + 1, S1)⇔ wp(a[i] 6= e → Gi 7→ i + 1, assume i ≤ u)⇔ i ≤ u → a[i] 6= e → Gi 7→ i + 1

Then the VC is

F → i ≤ u → a[i] 6= e → Gi 7→ i + 1 ,

or

ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] 6= e) ∧ i ≤ u ∧ a[i] 6= e→ ℓ ≤ i + 1 ∧ ∀j. ℓ ≤ j < i + 1 → a[j] 6= e ,

which is (TZ ∪ TA)-valid.

Example 5.23. Consider basic path (2) of Example 5.17 (BinarySearch):(2)

@pre F : 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)S1 : assume ℓ ≤ u;S2 : m := (ℓ + u) div 2;S3 : assume a[m] = e;S4 : rv := true;@post G : rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e

The VC is

F → wp(G, S1; S2; S3; S4) ,

so compute

wp(G, S1; S2; S3; S4)⇔ wp(wp(G, rv := true), S1; S2; S3)⇔ wp(Grv 7→ true, S1; S2; S3)⇔ wp(wp(Grv 7→ true, assume a[m] = e), S1; S2)⇔ wp(a[m] = e → Grv 7→ true, S1; S2)⇔ wp(wp(a[m] = e → Grv 7→ true, m := (ℓ + u) div 2), S1)⇔ wp((a[m] = e → Grv 7→ true)m 7→ (ℓ + u) div 2, S1)⇔ wp((a[m] = e → Grv 7→ true)m 7→ (ℓ + u) div 2

, assume ℓ ≤ u)⇔ ℓ ≤ u → (a[m] = e → Grv 7→ true)m 7→ (ℓ + u) div 2

Applying the substitutions produces

ℓ ≤ u → a[(ℓ + u) div 2] = e → Grv 7→ true, m 7→ (ℓ + u) div 2 .

Simplifying the VC accordingly

0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u) ∧ ℓ ≤ u ∧ a[(ℓ + u) div 2] = e→ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e

reveals that it is (TZ ∪ TA)-valid: ℓ ≤ u from the antecedent implies thatℓ ≤ (ℓ + u) div 2 ≤ u, so that a[(ℓ + u) div 2] = e implies that ∃i. ℓ ≤ i ≤u ∧ a[i] = e.

Example 5.24. Consider basic path (3) of Example 5.14 (BubbleSort):(3)

@L2 : F :

1 ≤ i < |a| ∧ 0 ≤ j ≤ i∧ partitioned(a, 0, i, i + 1, |a| − 1)∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)

S1 : assume j < i;S2 : assume a[j] > a[j + 1];S3 : t := a[j];S4 : a[j] := a[j + 1];S5 : a[j + 1] := t;S6 : j := j + 1;

@L2 : G :

1 ≤ i < |a| ∧ 0 ≤ j ≤ i∧ partitioned(a, 0, i, i + 1, |a| − 1)∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)

The VC is

F → wp(G, S1; S2; S3; S4; S5; S6)

so compute

wp(G, S1; S2; S3; S4; S5; S6)⇔ wp(wp(G, j := j + 1), S1; S2; S3; S4; S5)⇔ wp(Gj 7→ j + 1, S1; S2; S3; S4; S5)⇔ wp(wp(Gj 7→ j + 1, a[j + 1] := t), S1; S2; S3; S4)⇔ wp(Gj 7→ j + 1a 7→ a〈j + 1 ⊳ t〉, S1; S2; S3; S4)

Array assignment a[j + 1] := t translates to the functional substitution a 7→a〈j + 1 ⊳ t〉 according to the theory of arrays, TA. That is, the assignmentis modeled by substituting a〈j + 1 ⊳ t〉 for every instance a affected by theassignment. Continuing,

⇔ wp(Gj 7→ j + 1, a 7→ a〈j + 1 ⊳ t〉, S1; S2; S3; S4)⇔ wp(wp(Gj 7→ j + 1, a 7→ a〈j + 1 ⊳ t〉, a[j] := a[j + 1])

, S1; S2; S3)⇔ wp(Gj 7→ j + 1, a 7→ a〈j + 1 ⊳ t〉a 7→ a〈j ⊳ a[j + 1]〉

, S1; S2; S3)⇔ wp(Gj 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ t〉, S1; S2; S3)

The array assignment S4 similarly translates to a substitution. Composingthe two substitutions produces the substitution in which the doubly-modifiedarray a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ t〉 replaces a. Then

⇔ wp(wp(Gj 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ t〉, t := a[j]), S1; S2)

⇔ wp(Gj 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ t〉t 7→ a[j], S1; S2)

⇔ wp(Gj 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, t 7→ a[j], S1; S2)

Here, the composition of substitutions results in applying t 7→ a[j] to theterm a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ t〉, which contains t. To finish,

⇔ wp(wp(Gj 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, t 7→ a[j], assume a[j] > a[j + 1])

, S1)⇔ wp(a[j] > a[j + 1]

→ Gj 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, t 7→ a[j], S1)

⇔ wp(a[j] > a[j + 1]→ Gj 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, t 7→ a[j]

, assume j < i)⇔ j a[j + 1]

→ Gj 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, t 7→ a[j]⇔ j a[j + 1]

→ Gj 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, t 7→ a[j]

Finally, applying the substitution

σ : j 7→ j + 1, a 7→ a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, t 7→ a[j]

to G produces Gσ:

1 ≤ i < |a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉| ∧ 0 ≤ j + 1 ≤ i∧ partitioned(a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, 0, i, i + 1,

|a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉| − 1)∧ partitioned(a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, 0, j, j + 1, j + 1)∧ sorted(a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, i, |a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉| − 1)

The length function | · | obeys the following axiom:

∀a, i, e. |a〈i ⊳ e〉| = |a| (array length)

In other words, modifying an array’s elements does not change its size. Underthis axiom, Gσ is equivalent to

1 ≤ i < |a| ∧ 0 ≤ j + 1 ≤ i∧ partitioned(a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, 0, i, i + 1, |a| − 1)∧ partitioned(a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, 0, j, j + 1, j + 1)∧ sorted(a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, i, |a| − 1)

Thus, the VC is

1 ≤ i < |a| ∧ 0 ≤ j ≤ i∧ partitioned(a, 0, i, i + 1, |a| − 1)∧ partitioned(a, 0, j − 1, j, j)∧ sorted(a, i, |a| − 1)∧ j a[j + 1]

→

1 ≤ i < |a| ∧ 0 ≤ j + 1 ≤ i∧ partitioned(a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, 0, i, i + 1, |a| − 1)∧ partitioned(a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, 0, j, j + 1, j + 1)∧ sorted(a〈j ⊳ a[j + 1]〉〈j + 1 ⊳ a[j]〉, i, |a| − 1)

which is (TZ ∪ TA)-valid.

5.2.5 P -Invariant and P -Inductive

Let us consider the inductive assertion method abstractly.A program is a set of functions with some distinguished entry function.

Let P be a program with distinguished entry function f such that f has func-tion precondition Fpre and initial location L0. A computation of program P ,or a P -computation, is a sequence of states

s0, s1, s2, . . .

such that

5.3 Total Correctness 143

1. s0[pc] = L0 and s0 |= Fpre;2. and for each index i, si+1 is a result of executing the instruction at si[pc]

on state si.

The notation si[pc] indicates the value of pc given by state si. A computationmay be finite or infinite. Infinite computations arise when a program containsa function that does not always halt (either through recursion or looping).

A formula F annotating location L of program P is P -invariant (also, isan invariant of P , or is a P -invariant) iff for all P -computations s0, s1, s2, . . .and for each index i, if si[pc] = L then si |= F . The annotations of P areP -invariant iff each annotation is P -invariant. This definition is not imple-mentable: checking if F is P -invariant requires checking an infinite number ofP -computations in general, even if all computations are finite.

Instead, we use the inductive assertion method introduced in this chapter.If all verification conditions generated from program P are T -valid (in theproper theory T ), then the annotations are P -inductive (also, are inductiveP -invariants) and therefore P -invariant. In summary, we have the followingtheorem.

Theorem 5.25 (Verification Conditions). If for every basic path

@Li : FS1;...Sn;@Lj : G

of program P , the verification condition

FS1; . . . ; SnG

is valid (in the appropriate theory), then the annotations are P -invariant. Inparticular, the annotations are P -inductive.

In other words, when P ’s annotations are P -inductive, each function of Pobeys its specification.

Henceforth, when P is obvious from the context, we say invariant insteadof P -invariant and inductive instead of P -inductive.

5.3 Total Correctness

Partial correctness is just one step in proving the total correctness of afunction or program. Total correctness of a function asserts that if the inputsatisfies the function precondition, then the function eventually halts and pro-duces output that satisfies its function postcondition. In other words, total

correctness requires proving partial correctness and that the function alwayshalts on input satisfying its precondition. We focus now on the latter task.

Proving function termination is based on well-founded relations (see Chap-ter 4). Define a set S with a well-founded relation ≺. Then find a function δmapping program states to S such that δ decreases according to ≺ along everybasic path. Since ≺ is well-founded, there cannot exist an infinite sequence ofprogram states; otherwise, they would map to an infinite decreasing sequencein S. The function δ is called a ranking function.

Choosing Well-Founded Relations and Ranking Functions

The concept of well-founded relations is the fundamental principle of thismethod. The ranking function itself is really just a convenience. One coulddirectly construct a well-founded relation over the program states and showthat the output state of each basic path is less than the input state accordingto this relation. However, it is often conceptually easier to find a functionthat maps states to a known set S with a known well-founded relation ≺. Forexample, we usually choose as S the set of n-tuples of natural numbers and as≺ the lexicographic extension <n of < to n-tuples of natural numbers, wheren varies according to the application.

As with partial correctness, we consider total correctness in the contextof loops first and then in the context of recursion. Section 6.2 examines aprogram with both loops and recursion.

Example 5.26. Figure 5.23 lists BubbleSort with ranking annotations. It con-tains one new type of annotation: ↓ (i+1, i+1) and ↓ (i+1, i− j) assert thatthe functions (i+1, i+1) and (i+1, i− j), respectively, are ranking functions.These functions map states of BubbleSort onto pairs of natural numbers S : N2

with well-founded relation <2. Intuitively, we have captured two separate ar-guments. The outer loop eventually finishes because i decreases to 0; hencei + 1 decreases as well. Why do we use i + 1 rather than i? When |a| = 0, theinitial assignment to i is −1. While i + 1 is always nonnegative, i is not; andrecall that we want to map into the natural numbers. The inner loop haltsbecause j increases to i; hence i − j decreases to 0. Therefore, our intuitiontells us that i + 1 is important for the outer loop, and i− j is important forthe inner loop.

Placing these two functions i + 1 and i− j together as a pair (i + 1, i− j)provides the annotation for the inner loop. We expect i+1 to remain constantwhile the inner loop is executing and decreasing i− j. For the annotation ofthe outer loop, we note that i + 1 > i − j = i − 0 = i on entry to the innerloop, so that (i + 1, i + 1) >2 (i + 1, i− j).

The loop annotations assert that the ranking functions (i + 1, i + 1) and(i + 1, i− j) map program states to pairs of natural numbers. Hence, we needto prove that the loop annotations are inductive using the inductive assertion

@pre ⊤@post ⊤int[] BubbleSort(int[] a0) int[] a := a0;for

@L1 : i + 1 ≥ 0↓ (i + 1, i + 1)(int i := |a| − 1; i > 0; i := i − 1) for

@L2 : i + 1 ≥ 0 ∧ i − j ≥ 0↓ (i + 1, i − j)(int j := 0; j < i; j := j + 1) if (a[j] > a[j + 1]) int t := a[j];a[j] := a[j + 1];a[j + 1] := t;

return a;

Fig. 5.23. BubbleSort with annotations to prove termination

method. We leave this step to the reader. It only remains to prove that thefunctions decrease along each basic path.

The relevant basic paths are the following:(1)

@L1 : i + 1 ≥ 0↓L1 : (i + 1, i + 1)assume i > 0;j := 0;↓L2 : (i + 1, i− j)

(2)@L2 : i + 1 ≥ 0 ∧ i− j ≥ 0↓L2 : (i + 1, i− j)assume j < i;assume a[j] > a[j + 1];t := a[j];a[j] := a[j + 1];a[j + 1] := t;j := j + 1;↓L2 : (i + 1, i− j)

(3)@L2 : i + 1 ≥ 0 ∧ i− j ≥ 0↓L2 : (i + 1, i− j)assume j < i;assume a[j] ≤ a[j + 1];j := j + 1;↓L2 : (i + 1, i− j)

(4)@L2 : i + 1 ≥ 0 ∧ i− j ≥ 0↓L2 : (i + 1, i− j)assume j ≥ i;i := i− 1;↓L1 : (i + 1, i + 1)

The paths entering and exiting the outer loop at L1 are not relevant for thetermination argument. The entering path does not begin with a ranking func-tion annotation, so there is nothing to prove. The exiting path leads to thereturn statement.

For termination purposes, paths (2) and (3) can be treated the same:

@L2 : i + 1 ≥ 0 ∧ i− j ≥ 0↓L2 : (i + 1, i− j)assume j < i;· · ·j := j + 1;↓L2 : (i + 1, i− j)

The excluded statements do not impact the value of the ranking functions.

Verification Conditions

A basic path beginning and ending with a ranking function

@ F↓ δ[x]S1;...Sk;↓ κ[x]

induces a verification condition

F → wp(κ ≺ δ[x0], S1; · · · ; Sk)x0 7→ x .

In words, we must prove that the value of κ after executing statementsS1; · · · ; Sk is less than the value of δ before executing the statements. There-fore, we rename the variables of δ to something new (x0 in this case), computethe weakest precondition of κ ≺ δ[x0] across the statements S1; · · · ; Sk, andthen rename the new variables back to their original names (x in this case).This renaming process preserves the value of δ[x] in its context. The anno-tation F can provide extra invariant information with which to prove thisrelation.

Example 5.27. Let us return to the proof of Example 5.26 that BubbleSorthalts. Path (1) induces the following verification condition:

i + 1 ≥ 0 ∧ i > 0 → (i + 1, i− 0) <2 (i + 1, i + 1) ,

which is valid. Paths (2) and (3) induce the verification condition:

i + 1 ≥ 0 ∧ i− j ≥ 0 ∧ j < i → (i + 1, i− (j + 1)) <2 (i + 1, i− j) ,

which is valid. Finally, (4) induces the verification condition

i+1 ≥ 0 ∧ i− j ≥ 0 ∧ j ≥ i → ((i− 1)+1, (i− 1)+1) <2 (i+1, i− j) ,

which is also valid. Hence, BubbleSort always halts. Combined with the proofof the sortedness property, we can now say that BubbleSort is totally correctwith respect to its specification: it always halts and returns a sorted array.

Let us work through the construction of the final verification conditionfor basic path (4). First, replace i and j with i0 and j0, respectively, in thefunction annotating L2: (i0 + 1, i0 − j0). Then compute

wp((i + 1, i + 1) <2 (i0 + 1, i0 − j0), assume j ≥ i; i := i− 1)⇔ wp(((i− 1) + 1, (i− 1) + 1) <2 (i0 + 1, i0 − j0), assume j ≥ i)⇔ j ≥ i → (i, i) <2 (i0 + 1, i0 − j0)

Now, having preserved the original value of (i + 1, i− j) at L2, replace i0 andj0 with i and j, respectively:

j ≥ i → (i, i) <2 (i + 1, i− j) .

Noting that (4) begins by asserting i + 1 ≥ 0 ∧ i − j ≥ 0, we have ourverification condition:

i + 1 ≥ 0 ∧ i− j ≥ 0 ∧ j ≥ i → (i, i) <2 (i + 1, i− j) .

In this proof, the loop annotations (other than the ranking functions) donot have any bearing on the termination argument. Their purpose is only toprove that the given functions map to the natural numbers.

@pre u − ℓ + 1 ≥ 0@post ⊤↓ u − ℓ + 1bool BinarySearch(int[] a, int ℓ, int u, int e) if (ℓ > u) return false;else int m := (ℓ + u) div 2;if (a[m] = e) return true;else if (a[m] < e) return BinarySearch(a, m + 1, u, e);else return BinarySearch(a, ℓ, m − 1, e);

Fig. 5.24. BinarySearch always halts

Besides loops, recursion is another source of nontermination and hence re-quires ranking annotations as well. Again, we break the termination argumentdown to the level of basic paths.

Example 5.28. Figure 5.24 lists the BinarySearch function. u − ℓ + 1 mapsthe formal parameters of BinarySearch to S : N with well-founded relation <.Why do we use u − ℓ + 1 as the ranking function? Intuitively, the interval[ℓ, u] shortens at each level of recursion, so u− ℓ is a good guess at a rankingfunction. However, it may be that ℓ > u so that u − ℓ does not map into N;but in this case ℓ = u + 1, so u− ℓ + 1 does map into N.

We now need to prove two properties of u− ℓ + 1:

1. Because u− ℓ+1 has type int, we must prove that u− ℓ+1 in fact mapsto N: whenever u − ℓ + 1 is evaluated at function entry, it must be thecase that u− ℓ + 1 ≥ 0.

2. We must prove that u−ℓ+1 decreases from function entry to each recursivecall.

The function precondition asserts the first property, so we need only prove thatthe function specification is inductive as usual. To prove the second property,we reduce the argument to basic paths: u− ℓ + 1 should decrease across eachbasic path.

Convince yourself that the annotations other than the ranking argumentare inductive.

We now consider the ranking argument. The relevant basic paths of thefunction are the following:

5.4 Summary 149

(1)@pre u− ℓ + 1 ≥ 0↓ u− ℓ + 1assume ℓ ≤ u;m := (ℓ + u) div 2;assume a[m] 6= e;assume a[m] < e;↓ u− (m + 1) + 1

(2)@pre u− ℓ + 1 ≥ 0↓ u− ℓ + 1assume ℓ ≤ u;m := (ℓ + u) div 2;assume a[m] 6= e;assume a[m] ≥ e;↓ (m− 1)− ℓ + 1

Two other basic paths exist from function entry to the first two return state-ments; however, as the recursion ends at each, they are irrelevant to the ter-mination argument.

The basic paths induce two verification conditions. Before examining them,notice that the assume statements about a[m] are irrelevant to the terminationargument. Now, the first VC is

u− ℓ + 1 ≥ 0 ∧ ℓ ≤ u ∧ · · ·→ u− (((ℓ + u) div 2) + 1) + 1 < u− ℓ + 1 ,

where · · · elides the literals involving a[m]. It is TZ-valid. The VC

u− ℓ + 1 ≥ 0 ∧ ℓ ≤ u ∧ · · ·→ (((ℓ + u) div 2)− 1)− ℓ + 1 < u− ℓ + 1

for the second basic path is also TZ-valid, so BinarySearch halts on all inputin which ℓ is initially at most u + 1.

Section 6.2 provides an alternative to using the awkward ranking functionu− ℓ + 1. Additionally, the argument proves termination on all input.

5.4 Summary

This chapter introduces the specification and verification of sequential pro-grams. It covers:

• The programming language pi.


• Program specification. Specifying a program involves writing function pre-conditions and function postconditions as first-order assertions. A func-tion precondition asserts on which inputs the function is defined, while afunction postcondition asserts the form of the returned data. Other anno-tations: loop invariants, assertions, runtime assertions.

• Partial correctness, which guarantees that if a function halts and the in-put satisfied the function precondition, then its returned value satisfies thefunction postcondition. Partial correctness is proved via an inductive argu-ment. Additional annotations strengthen the inductive hypothesis. Basicpaths, program state, verification conditions, inductive invariants.

• Total correctness, which guarantees additionally that the program or func-tion always halts. Total correctness requires mapping, via a ranking func-tion, program states to a domain with a well-founded relation and provingthat the mapped values in the domain decrease as computation progresses.Typically, proving termination requires additional partial correctness an-notations.

Chapter 6 discusses strategies for applying the techniques of this chapter.The methods introduced in this chapter are fundamental to verification

of software and hardware. However, we present them in a simple context.Decades of research have made much more possible.

The validity of verification conditions listed in examples or produced byprograms in the chapter are all decided using the decision procedures dis-cussed in Part II. We have focused on specifications that can be expressed inthe fragments of theories introduced in Chapter 3. More complex specifica-tions may require general mechanical theorem proving. However, mechanicaltheorem provers often rely on the decision procedures of Part II when possiblefor speed and to minimize human interaction.

Chapter 12 introduces algorithms for deducing annotations.


Formally proving program correctness has been a subject of active research forfive decades. McCarthy argues in [59, 58] for a “mathematical science of com-putation”. Floyd [34] and Hoare [39] introduce the main concepts for provingproperty invariance and termination. In particular, they develop Floyd-Hoarelogic. Manna describes a verification style similar to ours [52]. The weakestprecondition predicate transformer was first formalized by Dijkstra [28].

King describes in his thesis [50] the idea of a verifying compiler, whichgenerates and proves during compilation the verification conditions that arisefrom program annotations. See [27] for a discussion of the Extended StaticChecker, a verifying compiler for Java.

Exercises 151

@pre p(a0)@post sorted(rv , 0, |rv | − 1)int[] InsertionSort(int[] a0) int[] a := a0;for

@ r1(a, a0, i, j)(int i := 1; i < |a|; i := i + 1) int t := a[i];for

@ r2(a, a0, i, j)(int j := i − 1; j ≥ 0; j := j − 1) if (a[j] ≤ t) break;a[j + 1] := a[j];

a[j + 1] := t;

return a;

Fig. 5.25. InsertionSort for Exercise 5.1(a)

Exercises

5.1 (Basic paths). For each of the following functions, replace each @pre ⊤with a fresh predicate p over the function parameters, each @post ⊤ with afresh predicate q over rv and the function parameters, and each @ ⊤ with afresh predicate r over the function variables. As an example, see Figure 5.25for the replacements for part (a). Then list the basic paths.

(a) InsertionSort of Figure 6.8.(b) merge of Figure 6.9.(c) ms of Figure 6.9.

5.2 (Weakest precondition). Compute the following formulae:

(a) wp(x ≥ 0, x := x− k; assume k ≤ 1)(b) wp(x ≥ 0, assume k ≤ x; x := x− k)(c) wp(x ≥ 0, x := x− k; assume k ≤ x)(d) wp(x + 2y ≥ 3, x := x + 1; assume x > 0; y := y + x)

5.3 (Verification condition generation). Generate the VCs for the follow-ing basic paths:

(1)@ x > 0;x := x− k;assume k ≤ 1;@ x ≥ 0;


(2)@ ⊤;assume k ≤ x;x := x− k;@ x ≥ 0;

(3)@ ⊤;x := x− k;assume k ≤ x;@ x ≥ 0;

(4)@ k ≥ 0;x := x− k;assume k ≤ x;@ x ≥ 0;

(5)@ y ≥ 0;x := x + 1;assume x > 0;y := y + x;@ x + 2y ≥ 3;

Which are TZ-valid? Which are TQ-valid?

5.4 (Verification conditions). For each basic path generated in Exercise5.1, list the corresponding VCs.

5.5 (Public functions). Example 5.6 asserts that a function that is acces-sible outside a module should have a reasonable function precondition, suchas ⊤. Implement verified public wrapper functions to LinearSearch and Bina-rySearch for searching an entire array. The function preconditions should bereasonable.

5.6 (The div function). Integer division, even by a constant, is not a func-tion of TZ; however, it is useful for reasoning about programs like BinarySearch.Show how basic paths that include the div function can be altered to use onlystandard linear arithmetic. Hint : Use an additional assume statement. Howdoes this change affect the resulting VCs?

5.7 (Ackermann function). Implement the ack function of Example 4.9 asa pi program and prove that it always halts.

6

Program Correctness: Strategies

The basis of our approach is the notion of an interpretation of a pro-gram: that is, an association of a proposition with each connection inthe flow of control through a program.

— Robert W. FloydAssigning Meanings to Programs, 1967

As in other applications of mathematical induction (see Example 4.1),the main challenge in applying the inductive assertion method is discoveringextra information to make the inductive argument succeed. Consider a typicalpartial correctness property that asserts that a function’s output satisfies somerelation with its input. Assuming this property as the inductive hypothesisdoes not provide any information about how the function behaves betweenentry and exit. It is a weak hypothesis. Therefore, one must provide moreinformation about the function in the form of additional program annotations.Section 6.1 discusses strategies for discovering this additional information.Section 6.2 applies the strategies to prove that QuickSort always returns asorted array.

6.1 Developing Inductive Annotations

The machinery presented in Section 5.2 automatically reduces an annotatedfunction to a finite set of verification conditions. Furthermore, the decisionprocedures discussed in Part II of this text make deciding the validity ofmany verification conditions an automatable task. For example, all verificationconditions in this text fall within decidable fragments of FOL, unless otherwisenoted. Thus, determining if a program’s annotations are inductive can beautomated in many cases. Ongoing research in decision procedures continuallyexpands the set of annotations that produce decidable verification conditions.

Developing annotations is a different matter. As expected, writing a func-tion specification requires human ingenuity, just as implementing the function

154 6 Program Correctness: Strategies

requires it. Of course, simple assertions, such as that a program is free of run-time errors, can be generated automatically.

Writing the loop invariants also requires human ingenuity. A certain levelof human intervention is acceptable: the programmer ought to know certainfacts about her/his code. Loop invariants often capture insights into how thecode works and what it accomplishes. Developing the implementation and an-notations simultaneously results in more robust systems. Finally, annotationsformally document code, facilitating better development in team projects.

However, Section 5.2.5 points out a fundamental limitation of the inductiveassertion method of program verification: loop invariants must be inductivefor the corresponding verification conditions to be valid, not just invariant.Consequently, the programmer can assert many facts that are indeed invariant;yet if the annotations are not inductive, the facts cannot be proved.

Much research addresses automatic (inductive) invariant discovery. For ex-ample, algorithms exist for discovering linear and polynomial relations amonginteger and real variables. Such invariants can, for example, provide loop in-dex bounds, prove the lack of division by 0, or prove that an index into anarray is within bounds. Other methods exist for discovering the “shape” ofmemory in programming languages with pointers, allowing, for example, thepartially automated analysis of linked lists. One of the most important rolesof automatic invariant discovery is strengthening the programmer’s annota-tions into inductive annotations. Chapter 12 introduces invariant generationprocedures. However, no set of algorithms will ever fully replace humans inwriting verified software.

In this section, we suggest structured techniques for developing inductiveannotations to prove partial correctness. We emphasize that the methods arejust heuristics: human ingenuity is still the most important ingredient in form-ing proofs.

6.1.1 Basic Facts

To begin a proof, include basic facts in loop invariants. Basic facts include loopindex ranges and other “obvious” facts. To be inductive, complex assertionsusually require these basic facts. We illustrate the development of basic factsthrough several examples.

Example 6.1. Consider the loop of LinearSearch(see also Figure 5.1):

for

@L : ⊤(int i := ℓ; i ≤ u; i := i + 1) if (a[i] = e) return true;

Based on the initialization of i, the loop guard, and that i is only modified bybeing incremented in the loop update, we know that at L,

6.1 Developing Inductive Annotations 155

ℓ ≤ i ≤ u + 1 .

Notice the upper bound. It is a common mistake to forget that on the finaliteration, the loop guard is not true. Our basic annotation of the loop is thefollowing:

for

@L : ℓ ≤ i ≤ u + 1(int i := ℓ; i ≤ u; i := i + 1) if (a[i] = e) return true;

Example 6.2. Consider the loops of BubbleSort (see also Figure 5.3):

for

@L1 : ⊤(int i := |a| − 1; i > 0; i := i− 1) for

@L2 : ⊤(int j := 0; j < i; j := j + 1) if (a[j] > a[j + 1]) int t := a[j];a[j] := a[j + 1];a[j + 1] := t;

The outer loop index i ranges according to

−1 ≤ i < |a|

Why −1? If |a| = 0, then |a| − 1 = −1 so that i is initially −1. Keep in mindthat “corner cases” like this one are just as important as normal cases (andperhaps even more important when considering correctness: corner cases areoften the source of bugs). In the inner loop, the range of i is more restricted:

0 < i < |a|

because of the outer loop guard.In the inner loop, j ranges according to

0 ≤ j ≤ i .

Therefore, our basic annotation of the two loops is the following:

for

@L1 : −1 ≤ i < |a|(int i := |a| − 1; i > 0; i := i− 1) for

@L2 : 0 a[j + 1]) int t := a[j];a[j] := a[j + 1];a[j + 1] := t;

Note that the loops modify just the elements of a, not a itself. Therefore,we could add the annotation

|a| = |a0|

to both loops. Such an annotation would be useful if the postcondition as-serted, for example, that

|rv | = |a0| .

For the property that we address (sorted(rv , 0, |rv | − 1)), this annotation isnot useful.

6.1.2 The Precondition Method

Basic facts provide a foundation for more interesting information. The pre-condition method (also called the “backward substitution” or “backwardpropagation” method) is a strategy for developing more interesting informa-tion in a structured way. Again, we emphasize that the method is a heuristic,not an algorithm: it provides some guidance for the human rather than re-placing the human’s intuition and ingenuity.

The precondition method consists of the following steps:

1. Identify a fact F that is known at one location L in the function (@L : F )but that is not supported by annotations earlier in the function.

2. Repeat:a) Compute the weakest preconditions of F backward through the func-

tion, ending at loop invariants or at the beginning of the function.b) At each new annotation location L′, generalize the new facts to new

formula F ′ (@L′ : F ′).

We illustrate the technique through examples.


Example 6.3. Consider the loop of LinearSearch (see also Figure 5.1), anno-tated with basic facts:

for

@L : ℓ ≤ i ≤ u + 1(int i := ℓ; i ≤ u; i := i + 1) if (a[i] = e) return true;return false;

The postcondition of LinearSearch is

rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e .

Consider basic path (4) of Example 5.13 but with the current loop invariantsubstituted for the first assertion:

(4)@L : F1 : ℓ ≤ i ≤ u + 1S1 : assume i > u;S2 : rv := false;@post F2 : rv ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e

Note that we continue to number basic paths as they were numbered in Ex-ample 5.13. The VC

F1S1; S2F2 : ℓ ≤ i ≤ u + 1 ∧ i > u → ¬(∃j. ℓ ≤ j ≤ u ∧ a[j] = e)

is not (TZ ∪ TA)-valid. Essentially, the antecedent does not assert anythinguseful about the content of a. Write the consequent as

F : ∀j. ℓ ≤ j ≤ u → a[j] 6= e

by pushing in the negation. F says that if LinearSearch exits via S2, then noelement of a in the range [ℓ, u] is e. But F is not supported by the currentloop invariant at L. In short, F2 is a fact that is not supported by earlierannotations.

Having identified an unsupported fact, we compute preconditions. To prop-agate F2 back to the loop invariant, compute

wp(F2, S1; S2)⇔ wp(wp(F2, rv := false), S1)⇔ wp(F2rv 7→ false, S1)⇔ wp(F2rv 7→ false, assume i > u)⇔ i > u → F2rv 7→ false⇔ i > u → ∀j. ℓ ≤ j ≤ u → a[j] 6= e

The final formula

G : i > u → ∀j. ℓ ≤ j ≤ u → a[j] 6= e ,

in particular the antecedent i > u, and some intuition suggests the general-ization

G′ : ∀j. ℓ ≤ j u → ∀j. ℓ ≤ j ≤ u → a[j] 6= e

Then

wp(G, S1; S2; S3)⇔ wp(wp(G, i := i + 1), S1; S2)⇔ wp(Gi := i + 1, S1; S2)⇔ wp(wp(Gi := i + 1, assume a[i] 6= e), S1)⇔ wp(a[i] 6= e → Gi := i + 1, S1)⇔ wp(a[i] 6= e → Gi := i + 1, assume i ≤ u)⇔ i ≤ u → a[i] 6= e → Gi := i + 1⇔ i ≤ u ∧ a[i] 6= e ∧ i + 1 > u → ∀j. ℓ ≤ j ≤ u → a[j] 6= e⇔ i = u ∧ a[u] 6= e → ∀j. ℓ ≤ j ≤ u → a[j] 6= e⇔ i = u ∧ a[u] 6= e → ∀j. ℓ ≤ j ≤ u− 1 → a[j] 6= e⇔ i = u ∧ a[u] 6= e → ∀j. ℓ ≤ j ≤ i− 1 → a[j] 6= e

To obtain the second-to-last line from the third-to-last, note that the an-tecedent already asserts that a[u] 6= e; hence, its occurrence as the case j = uof ∀j. ℓ ≤ j ≤ u · · · is redundant. The final line is realized by applying theequality i = u to the upper bound on j. As we suspected, it seems that theright bound on j should be related to the progress of i, rather than being fixedto u. This observation from computing the weakest precondition matches ourintuition. One trick to generalize assertions is to replace fixed terms (bounds,indices, etc.) with terms that evolve according to the loop counter.

Thus, we settle on the formula

G′ : ∀j. ℓ ≤ j < i → a[j] 6= e .

That is, all previously checked entries of a do not equal e. We add this assertionto the loop invariant:

for

@L : ℓ ≤ i ≤ u + 1 ∧ (∀j. ℓ ≤ j < i → a[j] 6= e)(int i := ℓ; i ≤ u; i := i + 1) if (a[i] = e) return true;

The result is similar to the annotation in Figure 5.15. Generating and checkingthe corresponding VCs reveals that the annotations are inductive.

Example 6.4. Consider the version of BinarySearch of Figure 5.12 that con-tains runtime assertions but has only a trivial function specification ⊤. Usingthe precondition method, we infer a function precondition that makes the an-notations inductive. Contexts that call BinarySearch are then forced to obeythis function precondition, guaranteeing a lack of runtime errors.

Consider the path from function entry to the assertion protecting the arrayaccess:

(·)@pre H : ?S1 : assume ℓ ≤ u;S2 : m := (ℓ + u) div 2;@ F : 0 ≤ m < |a|

Compute

wp(F, S1; S2)⇔ wp(wp(F, m := (ℓ + u) div 2), S1)⇔ wp(Fm 7→ (ℓ + u) div 2, S1)⇔ wp(Fm 7→ (ℓ + u) div 2, assume ℓ ≤ u)⇔ ℓ ≤ u → Fm 7→ (ℓ + u) div 2⇔ ℓ ≤ u → 0 ≤ (ℓ + u) div 2 < |a|⇐ 0 ≤ ℓ ∧ u < |a|

The final line implies the penultimate line, for if 0 ≤ ℓ ∧ u < |a| and ℓ ≤ u,then both 0 ≤ ℓ < |a| and 0 ≤ u < |a|; hence, their mean is also in the range[0, |a| − 1]. Therefore, it is guaranteed that

0 ≤ ℓ ∧ u < |a| → wp(F, S1; S2)

is TZ-valid.The formula 0 ≤ ℓ ∧ u < |a| appears as the function precondition in

Figure 6.1. The annotations are inductive, proving that the runtime assertion0 ≤ m < |a| holds in every execution of BinarySearch in which the precondition0 ≤ ℓ ∧ u < |a| is satisfied.

Example 6.5. Consider the following code fragment of BubbleSort (see alsoFigure 5.3)

@pre 0 ≤ ℓ ∧ u < |a|@post ⊤bool BinarySearch(int[] a, int ℓ, int u, int e) if (ℓ > u) return false;else

@ 2 6= 0;int m := (ℓ + u) div 2;@ 0 ≤ m < |a|;if (a[m] = e) return true;else if (a[m] < e) return BinarySearch(a, m + 1, u, e);else return BinarySearch(a, ℓ, m − 1, e);

Fig. 6.1. BinarySearch with runtime assertions

for

@L1 : −1 ≤ i < |a|(int i := |a| − 1; i > 0; i := i− 1) for

@L2 : 0 a[j + 1]) int t := a[j];a[j] := a[j + 1];a[j + 1] := t;return a;

and its postcondition

F : sorted(rv , 0, |rv | − 1) .

Consider the path(6)

@L1 : G : ?S1 : assume i ≤ 0;S2 : rv := a;@post F : sorted(rv , 0, |rv | − 1)

Computing wp(F, S1; S2) produces the formula

F ′ : i ≤ 0 → sorted(a, 0, |a| − 1) ,

which tells us (not surprisingly) that a should be sorted upon exiting theouter loop. Observe the index variable of the outer loop: it starts at |a| − 1

and decrements down to 0. Therefore, recalling the trick to replace fixed terms(bounds, indices, etc.) with terms that evolve according to the loop countersuggests the following generalization of F ′:

G : sorted(a, i, |a| − 1) .

G trivially holds upon entering the outer loop; moreover, it follows from thebehavior of i that progress is made by working down the array. The outer loopinvariant L1 should include G. Thus, we have

@L1 : −1 ≤ i < |a| ∧ sorted(a, i, |a| − 1)

so far.Propagate G via wp to the inner loop along the path from the exit of the

inner loop L2 to the top of the outer loop L1:(5)

@L2 : H : ?S1 : assume j ≥ i;S2 : i := i− 1;@L1 : G : sorted(a, i, |a| − 1)

The result at L2 is the formula

H ′ : j ≥ i → sorted(a, i− 1, |a| − 1) ,

which states that when the inner loop has finished, the range [i− 1, |a| − 1] issorted. Immediately generalizing H ′ to

H ′′ : sorted(a, i− 1, |a| − 1)

is too strong. For suppose H ′′ were to annotate the inner loop at L2, andconsider the path

(2)@L1 : G : sorted(a, i, |a| − 1)S1 : assume i > 0;S2 : j := 0;@L2 : H ′′ : sorted(a, i− 1, |a| − 1)

Computing

G → wp(H ′′, assume i > 0; j := 0)

produces

sorted(a, i, |a| − 1) ∧ i > 0 → sorted(a, i− 1, |a| − 1) ,

which is not (TZ ∪TA)-valid. All we know at L1 (with respect to sortedness ofa) is G : sorted(a, i, |a|−1). Essentially, sorted(a, i−1, |a|−1) at L2 is a special

case that definitely holds only when the inner loop has finished. Therefore,we generalize H ′ to the weaker assertion H : sorted(a, i, |a|−1), which claimsthat a smaller subrange of a is sorted.

At this point, we have annotated the loops of BubbleSort as follows:

for

@L1 : −1 ≤ i < |a| ∧ sorted(a, i, |a| − 1)(int i := |a| − 1; i > 0; i := i− 1) for

@L2 : 0 a[j + 1]) int t := a[j];a[j] := a[j + 1];a[j + 1] := t;

The resulting VCs are not valid. Further annotations require some insight onour part, which leads us to the next section.

6.1.3 A Strategy

In general, proofs require insights beyond generalizing formulae obtainedthrough the precondition method. We adopt the following strategy when prov-ing partial correctness.

First, decompose the function specification into atomic properties. Thenanalyze each atomic property. For example, to prove that BubbleSort returns asorted array that is a permutation of the input, study the sortedness propertyand permutation property separately. In some cases, several atomic propertiesmay have to be examined together to complete the proof. For each basicproperty, apply the following steps:

1. Assert basic facts (Section 6.1.1).2. Repeat:

a) Use the precondition method to propagate annotations (Section 6.1.2).b) Formalize an insight.

The second step suggests applying the precondition method until nothingmore can be learned. Then pause, understand another essential fact aboutthe program, and resume applying the precondition method.

While Chapter 12 discusses the foundations of algorithms for automati-cally generating inductive annotations, the reader should be aware that eventhe best of these algorithms cannot approach the abilities of a human. Takeheart! Experience has shown that students quickly become adept at annotat-ing programs.

Example 6.6. We resume our analysis of BubbleSort from Example 6.5. Somecogitation (and observation of sample traces; see Figure 5.4) suggests thatBubbleSort exhibits the following behavior: the inner loop propagates thelargest value of the unsorted region to the right side of the unsorted region,thus expanding the sorted region. At every iteration, j is the index of thelargest value found so far. In other words, all values in the range [0, j− 1] areat most a[j]:

F : partitioned(a, 0, j − 1, j, j) .

This observations should be added as an annotation at L2. Having gained newinsight into BubbleSort, we return to the precondition method and propagateF back to the outer loop at L1 along the path

(2)@L1 : H : ?S1 : assume i > 0;S2 : j := 0;@L2 : F : partitioned(a, 0, j − 1, j, j)

resulting in the new annotation

wp(F, S1; S2) : i > 0 → partitioned(a, 0,−1, 0, 0)

at L1. The result is trivially valid according to the definition of partitioned,so it does not contribute any new information. Thus, we finish this round ofStep 2 with the annotations

for

@L1 : −1 ≤ i < |a| ∧ sorted(a, i, |a| − 1)(int i := |a| − 1; i > 0; i := i− 1) for

@L2 :

[0 < i < |a| ∧ 0 ≤ j ≤ i∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)

]


The annotations are not yet inductive.Some further meditation enlightens us with the following: the sorted region

must contain the largest elements of a, for the inner loop has assiduouslymoved the largest element of the unsorted region to the sorted region. Inother words, the sorted range [i+1, |a|−1] contains the largest elements of a:


G : partitioned(a, 0, i, i + 1, |a| − 1) .

This observation should be added as an annotation at L1. Now we prop-agate G from L1 to L2. Recall from Example 6.5 that our propagation ofsorted(a, i, |a| − 1) to the inner loop was unsuccessful. Similarly,

wp(G, assume j ≥ i; i := i− 1)⇔ j ≥ i → partitioned(a, 0, i− 1, i, |a| − 1)

for the path(5)

@L2 : H : ?assume j ≥ i;i := i− 1;@L1 : G : partitioned(a, 0, i, i + 1, |a| − 1)

cannot be generalized to

partitioned(a, 0, i− 1, i, |a| − 1) .

Instead, consider the path from L1 to L2:(2)

@L1 : G : partitioned(a, 0, i, i + 1, |a| − 1)S1 : assume i > 0;S2 : j := 0;@L2 : H : ?

Find the strongest formula H that can annotate the inner loop such that theVC

partitioned(a, 0, i, i + 1, |a| − 1) → wp(H, S1; S2)

for the path is valid. In other words, seek a formula H annotating the innerloop that is supported by the annotation G of the outer loop. The strongestsuch formula is G itself.

These new annotations result in Figure 5.17.

6.2 Extended Example: QuickSort

In this section, we bring together the concepts studied in this and the lastchapters through a single example. We prove that QuickSort always halts andreturns a sorted array. We argue at the level of annotations, leaving a computeror the reader to check the VCs.

Figure 6.2 lists the high-level functions of QuickSort. QuickSort is a wrapperfunction (or public interface) for the recursive function qsort, which sorts arraya0 in the range [ℓ, u]. As in BubbleSort, the first line of qsort assigns a0 to an

6.2 Extended Example: QuickSort 165

typedef struct qs int pivot;int[] array;

qs;

@pre ⊤@post sorted(rv , 0, |rv | − 1)int[] QuickSort(int[] a) return qsort(a, 0, |a| − 1);

@pre ⊤@post ⊤int[] qsort(int[] a0, int ℓ, int u) int[] a := a0;if (ℓ ≥ u) return a;else qs p := partition(a, ℓ, u);a := p.array;a := qsort(a, ℓ, p.pivot − 1);a := qsort(a, p.pivot + 1, u);return a;

Fig. 6.2. Main functions of QuickSort

array a because qsort modifies a (recall that pi does not allow parametersto be modified). The qs data structure holds the two data that the partitionfunction, listed in Figure 6.3, returns: the pivot index pivot and the partitionedarray array.

One level of recursion of qsort works as follows. If ℓ ≥ u, then the trivialrange [ℓ, u] of a0 is already sorted. Otherwise, partition chooses a pivot indexpi ∈ [ℓ, u], remembering the pivot value a[pi] as pv. It then swaps cells pi andu of a so that the randomly chosen pivot now appears on the right side of the[ℓ, u] subarray. random has the following prototype:

@pre ℓ ≤ u@post ℓ ≤ rv ≤ uint random(int ℓ, int u);

The for loop of partition partitions a such that all elements at most pvare on the left and all elements greater than pv are on the right. Within theloop, j < u, so that the pivot value pv, stored in a[u], remains untouched.When the loop finishes, if i < u − 1, then the value a[i + 1] is the first valuegreater than pv; otherwise, all elements of a are at most pv. Finally, partition

@pre ⊤@post ⊤qs partition(int[] a0, int ℓ, int u) int[] a := a0;int pi := random(ℓ, u);int pv := a[pi];a[pi] := a[u];a[u] := pv;

int i := ℓ − 1;for @ ⊤

(int j := ℓ; j < u; j := j + 1) if (a[j] ≤ pv)

i := i + 1;t := a[i];a[i] := a[j];a[j] := t;

t := a[i + 1];a[i + 1] := a[u];a[u] := t;return

pivot = i + 1;a = a;

;

Fig. 6.3. QuickSort’s partition function

swaps the pivot value a[u] with a[i + 1] so that a is partitioned as follows inthe range [ℓ, u]: cells to the left of i + 1 have value at most pv; a[i + 1] = pv;and cells to the right of i + 1 have value greater than pv. It returns the pivotindex i + 1 and the partitioned array a via an instance of the qs data type.

Finally, qsort recursively sorts the subarrays to the left and to the right ofthe pivot index.

Figure 6.4 presents a sample trace. In the first line, partition chooses thesecond cell as the pivot and swaps it with cell u. The subsequent six lines followthe partition’s loop as it partitions elements according to pv. The penultimateline shows the swap that brings the pivot element into the pivot position. Thefinal line shows the state of the array when it is returned to qsort. qsort callsitself recursively on the two indicated subarrays. We encourage the reader tounderstand QuickSort and the sample trace before reading further.

0 3 2 6 5

ℓ pi u

0 5 2 6 3 0 ≤ 3i j

0 6 2 5 3

i, j

0 5 2 6 3 5 6≤ 3i j

0 5 2 6 3 2 ≤ 3i j

0 5 2 6 3

i j

0 2 5 6 3 6 6≤ 3i j

0 2 5 6 3

i i + 1 j, u

0 2 3 6 5

9

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

=

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

>

;

loop

Fig. 6.4. Sample execution of QuickSort

6.2.1 Partial Correctness

We prove that if QuickSort halts, then it returns a sorted array. First, wedevelop the function specifications for qsort and partition so that QuickSortand qsort have inductive annotations. We then leave the annotation of theloop of partition as Exercise 6.3.

First, annotate qsort and partition with their function specifications. Toavoid runtime errors, the function preconditions should include

0 ≤ ℓ ∧ u < |a0| .

Next, while the returned array is not the same as the input array a0 in eitherfunction, we know that their lengths are the same:

|rv | = |a0| .

We also observe that neither qsort nor partition modify the array outside ofthe range [ℓ, u]. Thus, we note in the function postcondition that

beq(rv , a0, 0, ℓ− 1) ∧ beq(rv , a0, u + 1, |a0| − 1) .

The bounded equality predicate beq is defined

beq(a, b, k1, k2) ⇔ ∀i. k1 ≤ i ≤ k2 → a[i] = b[i]

in the theory TZ ∪ TA. It asserts that two arrays are equal in the index range[k1, k2].

The annotations for partition vary slightly because of its return type:

|rv .array| = |a0| ∧ beq(rv .array, a0, 0, ℓ− 1)∧ beq(rv .array, a0, u + 1, |a0| − 1)

Since partition returns an integer (pivot) and an array (array) in a qs struc-ture, the postcondition asserts facts about rv .array.

To finish with qsort, formalize that qsort sorts the range [ℓ, u] of the givenarray:

sorted(rv , ℓ, u) .

So far, then, we have specified qsort as follows:

@pre 0 ≤ ℓ ∧ u < |a0|@post

[|rv | = |a0| ∧ beq(rv , a0, 0, ℓ− 1) ∧ beq(rv , a0, u + 1, |a0| − 1)∧ sorted(rv , ℓ, u)

]

int[] qsort(int[] a0, int ℓ, int u)

To finish the specification of partition, recall that partition is intended toreturn an array that is partitioned around the pivot element. Therefore, let usformalize the description that we gave above. Observe that the specificationof random guarantees that

ℓ ≤ rv .pivot ≤ u ,

as desired. Now, for the left subarray,

∀i. ℓ ≤ i < rv .pivot → rv .array[i] ≤ rv .array[rv .pivot] ,

or, as a partition,

partitioned(rv .array, ℓ, rv .pivot− 1, rv .pivot, rv .pivot) ,

while for the right subarray,

∀i. rv .pivot < i ≤ u → rv .array[rv .pivot] < rv .array[i] .

We weaken this assertion slightly to

partitioned(rv .array, rv .pivot, rv .pivot, rv .pivot + 1, u) ,

which does not capture the strict inequality but is more convenient for rea-soning. For partition, we have thus specified the following:

@pre 0 ≤ ℓ ∧ u < |a0|

@post

|rv .array| = |a0| ∧ beq(rv .array, a0, 0, ℓ− 1)∧ beq(rv .array, a0, u + 1, |a0| − 1)∧ ℓ ≤ rv .pivot ≤ u∧ partitioned(rv .array, ℓ, rv .pivot− 1, rv .pivot, rv .pivot)∧ partitioned(rv .array, rv .pivot, rv .pivot, rv .pivot + 1, u)

qs partition(int[] a0, int ℓ, int u)

Let us step back a moment. Essentially, we have specified that qsort doesnot modify the array outside of the range [ℓ, u]. Regarding the subarray givenby [ℓ, u], all we have asserted is that it is sorted in the returned array. Focuson the recursive calls to qsort: is knowing that the ranges [ℓ, p.pivot− 1] and[p.pivot+1, u] of a are sorted enough to conclude that the range [ℓ, u] is sortedwhen a is returned? In other words, is the VC corresponding to the followingbasic path valid? The basic path follows the path from the precondition tothe second return statement, using the function call abstraction introducedin Section 5.2.1 to abstract away functions calls:

(·)@pre 0 ≤ ℓ ∧ u < |a0|a := a0;assume ℓ < u;

assume

|v1.array| = |a| ∧ beq(v1.array, a, 0, ℓ− 1)∧ beq(v1.array, a, u + 1, |a| − 1)∧ ℓ ≤ v1.pivot ≤ u∧ partitioned(v1.array, ℓ, v1.pivot− 1, v1.pivot, v1.pivot)∧ partitioned(v1.array, v1.pivot, v1.pivot, v1.pivot + 1, u)

;

p := v1;a := p.array;

assume

[|v2| = |a| ∧ beq(v2, a, 0, ℓ− 1) ∧ beq(v2, a, p.pivot, |a| − 1)∧ sorted(v2, ℓ, p.pivot− 1)

];

a := v2;

assume

[|v3| = |a| ∧ beq(v3, a, 0, p.pivot) ∧ beq(v3, a, u + 1, |a| − 1)∧ sorted(v3, p.pivot + 1, u)

];

a := v3;rv := a;

@

[|rv | = |a0| ∧ beq(rv , a0, 0, ℓ− 1) ∧ beq(rv , a0, u + 1, |a0| − 1)∧ sorted(rv , ℓ, u)

]

The corresponding VC is not valid. The assumptions about v1, v2, and v3 arenot strong enough to imply that the range [ℓ, u] of rv is sorted.

The standard approach to addressing this problem is to reason simulta-neously that qsort returns an array that is a permutation of its input array

(a permuted array contains the same elements as the original array but pos-sibly in a different order). However, reasoning about permutations presentsa problem. A straightforward formalization of permutation is not possible inFOL, instead requiring second-order logic. We could assert that the outputis a weak permutation of the input: all values occurring in the input arrayoccur in the output array but possibly with a varying number of occurrences.Formally,

∀e. (∃i. a0[i] = e) ↔ (∃j. rv [j] = e) .

In Exercise 6.5, we ask the reader to explore an approximation to weak per-mutation.

However, that the elements are permuted is a stronger statement thannecessary to prove sortedness. Instead, notice that QuickSort imposes a largerpartitioning of the intermediate arrays than we have previously observed inour analysis. At every level of recursion of qsort, the elements of a0 in therange [ℓ, u] are at least the elements to their left and at most the elements totheir right. Formally, we strengthen the specification of qsort as follows:

@pre

0 ≤ ℓ ∧ u < |a0|∧ partitioned(a0, 0, ℓ− 1, ℓ, u)∧ partitioned(a0, ℓ, u, u + 1, |a0| − 1)

@post

|rv | = |a0| ∧ beq(rv , a0, 0, ℓ− 1) ∧ beq(rv , a0, u + 1, |a0| − 1)∧ partitioned(rv , 0, ℓ− 1, ℓ, u)∧ partitioned(rv , ℓ, u, u + 1, |rv | − 1)∧ sorted(rv , ℓ, u)

int[] qsort(int[] a0, int ℓ, int u)

Of course, now the specification of partition must be strengthened to carrythis reasoning through the main basic path of qsort:

@pre

0 ≤ ℓ ∧ u < |a0|∧ partitioned(a0, 0, ℓ− 1, ℓ, u)∧ partitioned(a0, ℓ, u, u + 1, |a0| − 1)

@post

|rv .array| = |a0| ∧ beq(rv .array, a0, 0, ℓ− 1)∧ beq(rv .array, a0, u + 1, |a0| − 1)∧ partitioned(rv .array, 0, ℓ− 1, ℓ, u)∧ partitioned(rv .array, ℓ, u, u + 1, |rv .array| − 1)∧ ℓ ≤ rv .pivot ≤ u∧ partitioned(rv .array, ℓ, rv .pivot− 1, rv .pivot, rv .pivot)∧ partitioned(rv .array, rv .pivot, rv .pivot, rv .pivot + 1, u)

qs partition(int[] a0, int ℓ, int u)

That is, partition preserves this partitioning even as it manipulates the ele-ments in the range [ℓ, u]. Indeed, partition itself imposes the necessary parti-

tioning for the next level of recursion, which we already observed earlier asthe final partitioned assertions of the function postcondition.

The annotations of QuickSort and qsort are inductive. Exercise 6.3 asksthe reader to finish the proof by annotating the for loop of partition so thatthe annotations of partition are also inductive.

6.2.2 Total Correctness

To prove total correctness — that QuickSort actually returns a sorted arrays —we need to prove that QuickSort always halts. Our implementation of QuickSorthas both recursive behavior in function qsort and looping behavior in functionpartition. We must analyze both possible sources of nontermination.

To prove that the loop in partition always halts, let us use the obviousranking function of δ1 : u− j suggested by the structure of the loop. δ1 clearlymaps the program state to Z, but we prove the stronger fact that δ1 actuallymaps the program state to N with well-founded relation <. In particular, weprove that

• u− j ≥ 0 is a loop invariant;• and u− j decreases on each iteration.

Annotating the loop with the bounds on j suggested by the loop structureproves that the loop always halts:

for

@L1 : ℓ ≤ j ∧ j ≤ u↓ δ1 : u− j(int j := ℓ; j < u; j := j + 1)

Proving that the recursion of qsort always halts is superficially more dif-ficult. The argument that we would like to make is that u − ℓ decreases oneach recursive call, which requires proving that the pivot value returned bypartition lies within the range [ℓ, u].

Observe, however, that u − ℓ may be negative when qsort is called withℓ > u. But in this case, ℓ = u + 1, for either |a0| = 0, and qsort was calledfrom QuickSort; or p.pivot = ℓ or p.pivot = u, and qsort was called recursively.More generally, we can establish that u − ℓ + 1 ≥ 0 is an invariant of qsort.Hence, δ2 : u− ℓ + 1 is our proposed ranking function that maps the programstates to N with well-founded relation <.

Figure 6.5 formalizes the arguments that δ1 and δ2 are ranking functions.Notice that bounds on i are proved as loop invariants at L1. These boundsimply that rv .pivot lies within the range [ℓ, u] as required.

One trick that would avoid reasoning about the case in which ℓ > u is tocut the recursion at a point within qsort rather than at function entry. Figure6.6 provides an alternate argument in which the ranking function labels the

@pre u − ℓ + 1 ≥ 0@post ⊤↓ δ2 : u − ℓ + 1int[] qsort(int[] a0, int ℓ, int u) int[] a := a0;if (ℓ ≥ u) return a;else qs p := partition(a, ℓ, u);a := p.array;a := qsort(a, ℓ, p.pivot − 1);a := qsort(a, p.pivot + 1, u);return a;

@pre ℓ ≤ u

@post ℓ ≤ rv .pivot ∧ rv .pivot ≤ u

qs partition(int[] a0, int ℓ, int u) ...int i := ℓ − 1;for

@L1 : ℓ ≤ j ∧ j ≤ u ∧ ℓ − 1 ≤ i ∧ i < j

↓ δ1 : u − j

(int j := ℓ; j < u; j := j + 1) ...

...return

pivot = i + 1;a = a;

;

Fig. 6.5. QuickSort always halts

else branch in qsort. The first branch terminates the recursion. partition isannotated as in Figure 6.5.

6.3 Summary

This chapter presents strategies for specifying and proving the correctness ofsequential programs. It covers:

• Strategies for proving partial correctness. The need for strengthening an-notations. Basic facts; the precondition method.

Exercises 173

@pre ⊤@post ⊤int[] qsort(int[] a0, int ℓ, int u) int[] a := a0;if (ℓ ≥ u) return a;else

↓ δ3 : u − ℓ

qs p := partition(a, ℓ, u);a := p.array;a := qsort(a, ℓ, p.pivot − 1);a := qsort(a, p.pivot + 1, u);return a;

Fig. 6.6. Alternate argument that QuickSort always halts

@pre ⊤@post ∀i. 0 ≤ i < |rv | → rv [i] ≥ 0int[] abs(int[] a0) int[] a := a0;for @ ⊤

(int i := 0; i < |a|; i := i + 1) if (a[i] < 0)

a[i] := − a[i];

return a;

Fig. 6.7. Computing the absolute value of elements of a0

• A full proof that QuickSort returns a sorted array.


QuickSort was discovered by Tony Hoare, who also proposed specifying andverifying programs using FOL [39].

Exercises

6.1 (Absolute value). Prove the partial correctness of abs in Figure 6.7.That is, annotate the function; list basic paths and verification conditions;and argue that the VCs are valid.

@pre ⊤@post sorted(rv , 0, |rv | − 1)int[] InsertionSort(int[] a0) int[] a := a0;for @ ⊤

(int i := 1; i < |a|; i := i + 1) int t := a[i];for @ ⊤

(int j := i − 1; j ≥ 0; j := j − 1) if (a[j] ≤ t) break;a[j + 1] := a[j];

a[j + 1] := t;

return a;

Fig. 6.8. InsertionSort

6.2 (InsertionSort). Prove the partial correctness of InsertionSort. That is,annotate the function; list basic paths and verification conditions; and arguethat the VCs are valid. See Figure 6.8.

As in other languages, the break statement moves control to the loop exit.

6.3 (QuickSort). Finish the proof of the sortedness property of QuickSort. Thatis, annotate partition of Figure 6.3 so that its precondition and postconditionannotations given at the end of Section 6.2 are inductive.

6.4 (MergeSort). Prove the partial correctness of MergeSort. See Figure 6.9.The function merge uses the pi keyword new, which allocates an array of thespecified size. Therefore, it is known after the allocation to buf that |buf | =u− ℓ + 1.

First, deduce the function specifications for ms and merge by focusingon MergeSort and ms. Prove MergeSort and ms correct with respect to theseannotations. Then analyze merge.

Since MergeSort is fairly long, you need not list basic paths and VCs. Justpresent MergeSort with its inductive annotations.

6.5 (Weak permutation). Define weak permutation as follows:

∀e. (∃i. a[i] = e) ↔ (∃j. b[j] = e) . (6.1)

Unfortunately, the decision procedures for arrays discussed in Chapter 11 can-not decide the validity of VCs arising from wperm annotations, as such VCsfall outside of the studied fragments of TA. Instead, we describe an approxi-mation.

Exercises 175

Consider annotating BubbleSort as in Figure 6.10. The define keyworddefines a global constant. In this case, e is defined to have some nondetermin-istic value; that is, e stands for an arbitrary integer. The annotations then usethis e in wperm literals, where wperm is defined as follows:

wperm(a, a0, e) ⇔ (∃i. a[i] = e) ↔ (∃j. a0[j] = e) .

Compared to the full definition (6.1) of weak permutation, wperm does nothave a universally quantified variable e; instead, it uses a given expression e,in this case the global constant e.

(a) Argue that the annotations of Figure 6.10 are inductive. That is, list theVCs and argue their validity.

(b) Argue that the annotations imply that BubbleSort actually satisfies theweakest permutation property. That is, prove that the validity of the VC

wperm(a, a0, e) ∧ a′ = . . . → wperm(a′, a0, e)

implies the validity of the VC

(∀e. (∃i. a[i] = e) ↔ (∃j. a0[j] = e)) ∧ a′ = . . .

→ (∀e. (∃i. a′[i] = e) ↔ (∃j. a0[j] = e)) .

(c) Can this approximation be used to prove the weakest permutation prop-erty of(i) InsertionSort (Figure 6.8)?(ii) MergeSort (Figure 6.9)?(iii) QuickSort (Section 6.2)?If so, prove it. If not, explain why not.

6.6 (Sets with arrays). Implement an API (application programminginterface) for manipulating sets. The underlying data structure of the imple-mentation is arrays.

(a) Prove the correctness of the union function of Figure 6.11 by adding in-ductive annotations.

(b) Implement and specify and prove the correctness of an intersection func-tion, which takes two arrays a0 and b0 and returns the intersection of thesets they represent.

(c) Implement and specify and prove the correctness of a subset function,which takes two arrays a0 and b0 and returns true iff the first set, repre-sented by a0, is a subset of the second set, represented by b0.

6.7 (Sets with sorted arrays). Implement an API for manipulating sets.The underlying data structure of the implementation is sorted arrays.

(a) Prove the correctness of the union function of Figure 6.12 by adding in-ductive annotations.


(b) Implement and specify and prove the correctness of an intersection func-tion, which takes two sorted arrays a0 and b0 and returns the intersectionof the sets they represent as a sorted set.

(c) Implement and specify and prove the correctness of a subset function,which takes two sorted arrays a0 and b0 and returns true iff the first set,represented by a0, is a subset of the second set, represented by b0.

6.8 (QuickSort halts). Provide the basic paths and verification conditions forthe proof of Section 6.2.2 that QuickSort always halts.

6.9 (Intuitive ranking functions). Following the proof that the recursionof qsort halts, move the location of the ranking function annotations in thefollowing functions to produce more intuitive arguments:

(a) BinarySearch, Figure 5.2(b) BubbleSort, Figure 5.3(c) InsertionSort, Figure 6.8

6.10 (Fewer annotations). Notice in the annotated BubbleSort of Figure5.17 that there are only a finite number of basic paths from function entry toL2, from L2 to function exit, and from L2 back to itself.

(a) List these basic paths.(b) Annotate only the inner loop of BubbleSort so that the VCs corresponding

to the basic paths of (a) are valid.(c) Treat InsertionSort of Figure 5.25 similarly.(d) Similarly, annotate only the inner loop of BubbleSort with a ranking an-

notation.(e) Treat InsertionSort similarly.

Exercises 177

@pre ⊤@post sorted(rv , 0, |rv | − 1)int[] MergeSort(int[] a) return ms(a, 0, |a| − 1);

@pre ⊤@post ⊤int[] ms(int[] a0, int ℓ, int u) int[] a := a0;if (ℓ ≥ u) return a;else int m := (ℓ + u) div 2;a := ms(a, ℓ, m);a := ms(a, m + 1, u);a := merge(a, ℓ,m, u);return a;

@pre ⊤@post ⊤int[] merge(int[] a0, int ℓ, int m, int u) int[] a := a0, buf := new int[u − ℓ + 1];int i := ℓ, j := m + 1;for @ ⊤

(int k := 0; k < |buf |; k := k + 1) if (i > m)

buf [k] := a[j];j := j + 1;

else if (j > u) buf [k] := a[i];i := i + 1;

else if (a[i] ≤ a[j]) buf [k] := a[i];i := i + 1;

else buf [k] := a[j];j := j + 1;

for @ ⊤

(k := 0; k < |buf |; k := k + 1) a[ℓ + k] := buf [k];

return a;

Fig. 6.9. MergeSort

define int e = ?;

@pre ⊤@post wperm(a, a0, e)int[] BubbleSort(int[] a0) int[] a := a0;for

@L1 : −1 ≤ i < |a| ∧ wperm(a, a0, e)(int i := |a| − 1; i > 0; i := i − 1) for

@L2 : 0 ≤ j a[j + 1]) int t := a[j];a[j] := a[j + 1];a[j + 1] := t;

return a;

Fig. 6.10. BubbleSort with annotations for weak permutation

define int e = ?;

@pre ⊤

@post

»

(∃i. 0 ≤ i < |rv | ∧ rv [i] = e)↔ (∃i. 0 ≤ i < |a0| ∧ a0[i] = e) ∨ (∃i. 0 ≤ i < |b0| ∧ b0[i] = e)

–

int[] union(int[] a0, int[] b0) int[] u := new int[|a0| + |b0|];int j := 0;for @ ⊤

(int i = 0; i < |a0|; i := i + 1) u[j] := a0[i];j := j + 1;

for @ ⊤

(int i = 0; i < |b0|; i := i + 1) u[j] := b0[i];j := j + 1;

return u;

Fig. 6.11. Function union of the linear set implementation

Exercises 179

define int e = ?;

@pre sorted(a0, 0, |a0| − 1) ∧ sorted(b0, 0, |b0| − 1)

@post

2

4

sorted(rv , 0, |rv | − 1)

∧

»

(∃i. 0 ≤ i < |a0| ∧ a0[i] = e) ∨ (∃i. 0 ≤ i < |b0| ∧ b0[i] = e)↔ (∃i. 0 ≤ i < |rv | ∧ rv [i] = e)

–

3

5

int[] union(int[] a0, int[] b0) int[] u := new int[|a0| + |b0|];int i := 0, j := 0;for @ ⊤

(int k = 0; k < |u|; k := k + 1) if (i ≥ |a0|)

u[k] := b0[j];j := j + 1;

else if (j ≥ |b0|)

u[k] := a0[i];i := i + 1;

else if (a0[i] ≤ b0[j])

u[k] := a0[i];i := i + 1;

else

u[k] := b0[j];j := j + 1;

return u;

Fig. 6.12. Function union of the sorted set implementation

Part II

Algorithmic Reasoning

It is reasonable to hope that the relationship between computation andmathematical logic will be as fruitful in the next century as that be-tween analysis and physics in the last. The development of this rela-tionship demands a concern for both applications and mathematicalelegance.

— John McCarthyA Basis for a Mathematical Theory of Computation, 1963

Having established in Part I the mathematical foundations for precision indesigning and implementing software and hardware systems, we turn in PartII to the task of automating some of the required logical reasoning.

Chapters 7–10 discuss decision procedures for many of the theories andfragments of theories introduced in Chapter 3. These decision procedures au-tomate the task of proving the verification conditions of Chapters 5 and 6.Chapter 7 applies the method of quantifier elimination to decide validity inTZ and TQ. Chapters 8 and 9 focus on decision procedures for the quantifier-free fragments of the theories TQ, TE, Tcons, and TA. In Chapter 10, theseprocedures are combined in the Nelson-Oppen framework to address formulaewith symbols of multiple signatures. The treatment of arrays in Chapter 11looks beyond the quantifier-free fragment, but we show how the combinationmethod of Chapter 10 still applies.

Chapter 12 turns to automating the inductive assertion method of Chapter5. It presents a methodology for constructing algorithms that learn inductivefacts about software. These algorithms reduce the burden on the engineer toprovide annotations.

7

Quantified Linear Arithmetic

Whilst in [Presburger arithmetic] one can only state rather simpletheorems, yet an efficient algorithm. . . is useful in that it can quicklydispose of a host of the simpler formulas. . . , leaving only the morecomplex to be dealt with by some more general theorem prover or bythe human.

— David C. CooperTheorem Proving in Arithmetic Without Multiplication, 1972

Recall from Chapter 3 that satisfiability in the theories of linear integerarithmetic TZ and linear rational (or real) arithmetic TQ is decidable. Thischapter explores a common method of deciding satisfiability of arbitrarilyquantified formulae in decidable theories in the context of TZ and TQ.

A quantifier elimination procedure is an algorithm that constructsfrom a given formula an equivalent quantifier-free formula. When the givenformula does not have any free variables, the atoms of the resulting formula areapplications of predicates to constant terms such as 3 < 5. If the truth value ofthese constant terms is decidable, a quantifier elimination procedure provides abasis for a satisfiability decidable procedure. When the given formula containsfree variables, the resulting quantifier-free formula contains a subset of thesefree variables.

Section 7.1 presents the method of quantifier elimination in an abstractcontext. Sections 7.2 and 7.3 then present quantifier elimination methods forthe theory of integers TZ and the theory of rationals TQ, respectively. The com-plexity results of Section 7.4 suggest that these algorithms are near-optimalin time complexity, although quantifier-elimination methods are, in general,computationally expensive.

Sections 7.2.4 and 7.2.5 discuss optimizations for the integer case that areparticularly effective when the given formula has only existential quantifiers oronly universal quantifiers. With these optimizations, the quantifier eliminationmethod for TZ is our main decision procedure for the quantifier-free fragment

184 7 Quantified Linear Arithmetic

of TZ. However, the case for TQ is different: Chapter 8 presents a specialdecision procedure for the quantifier-free fragment of TQ.

7.1 Quantifier Elimination

7.1.1 Quantifier Elimination

Quantifier elimination (QE) is the main technique that underlies the al-gorithms of this chapter. As the name suggests, the idea is to eliminate quan-tifiers of a formula F until only a quantifier-free formula G that is equivalentto F remains.

Formally, a theory T admits quantifier elimination if there is an algo-rithm that, given Σ-formula F , returns a quantifier-free Σ-formula G that isT -equivalent to F . Then T is decidable if satisfiability in the quantifier-freefragment of T is decidable.

Example 7.1. Consider the ΣQ-formula

F : ∃x. 2x = y ,

which expresses the set of rationals y that can be halved. Intuitively, all ra-tionals can be halved, so a quantifier-free TQ-equivalent formula is

G : ⊤ ,

which expresses the set of all rationals. Also, G states that F is valid.

Example 7.2. Consider the ΣZ-formula

F : ∃x. 2x = y ,

which expresses the set of integers y that can be halved (to produce anotherinteger). Intuitively, only even integers can be halved. Can you think of aquantifier-free TZ-equivalent formula to F? In fact, no such formula exists.Later, we introduce an augmented theory of integers that contains a countablyinfinite number of divisibility predicates. For example, in this extended theory,an equivalent formula to F is

G : 2 | y ,

which expresses the set of even integers: integers that are divisible by 2.

7.2 Quantifier Elimination over Integers 185

7.1.2 A Simplification

In developing a QE algorithm for theory T , we need only consider formulae ofthe form ∃x. F for quantifier-free formula F . For given arbitrary formula G,choose the innermost quantified formula ∃x. H or ∀x. H. In the latter case,rewrite ∀x. H as ¬(∃x. ¬H) and focus on the subformula ∃x. ¬H inside thenegation.

In both the existential and universal cases, we now have a formula of theform ∃x. F to which we apply the QE algorithm to find T -equivalent andquantifier-free formula H ′. In the existential case, replace ∃x. H in G withH ′. In the universal case, replace ∀x. H in G with ¬H ′.

Example 7.3. Consider the Σ-formula

G1 : ∃x. ∀y. ∃z. F1[x, y, z] ,

and assume that T admits quantifier elimination. The innermost quantifiedformula is ∃z. F1[x, y, z]. Applying the QE algorithm for T to this subformulareturns F2[x, y]:

G2 : ∃x. ∀y. F2[x, y] .

The innermost quantified formula is now ∀y. F2[x, y]; rewriting, we have

G3 : ∃x. ¬(∃y. ¬F2[x, y]) .

Applying the QE algorithm to the existential subformula ∃y. ¬F2[x, y] pro-duces F3[x]. We now have

G4 : ∃x. ¬F3[x] .

Finally, applying the QE algorithm one more time to G4 produces a quantifier-free formula G5. G5 is T -equivalent to G1.

7.2 Quantifier Elimination over Integers

We now turn to the theory of integers TZ. Presburger showed this theory tobe decidable in 1929. We discuss the QE procedure of Cooper, first describedin 1971.

7.2.1 Augmented Theory of Integers

Recall that TZ has the following signature:

ΣZ : . . . ,−2,−1, 0, 1, 2, . . . ,−3·,−2·, 2·, 3·, . . . , +, −, =, < ,

where


• . . . , −2, −1, 0, 1, 2, . . . are constants, intended to be assigned the obviouscorresponding values in the intended domain Z;

• . . . , −3·, −2·, 2·, 3·, . . . are unary functions, intended to represent con-stant coefficients (e.g., 2 · x, abbreviated 2x);

• + and − are binary functions, intended to represent the obvious corre-sponding functions over Z;

• = and > are binary predicates, intended to represent the obvious corre-sponding predicates over Z.

Example 7.2 claims that the ΣZ-formula ∃x. 2x = y does not have a TZ-equivalent quantifier-free formula. The following lemma proves a more generalclaim.

Lemma 7.4. Consider quantifier-free ΣZ-formula F such that free(F ) = y.F represents the set of integers

S : n ∈ Z : Fy 7→ n is TZ-valid .

Either S ∩ Z+ or Z+ \ S is finite.

Z+ is the set of positive integers; ∩ and \ are set intersection and com-plement, respectively. The lemma says that every quantifier-free ΣZ-formulawith only one free variable represents a set of integers S such that either thesubset of positive integers in S has finite cardinality or the set of positiveintegers not in S has finite cardinality. Exercise 7.1 asks the reader to provethis lemma.

Consider again the case of ∃x. 2x = y, and let S be the set of integerssatisfying the formula, namely the even integers. Since both the set of positiveeven integers S∩Z+ and the set of positive odd integers Z+\S are infinite, theset of even integers cannot be represented in TZ by a quantifier-free formulaaccording to the lemma. Therefore, there is no quantifier-free ΣZ-formula thatis TZ-equivalent to ∃x. 2x = y, and thus TZ does not admit QE.

To circumvent this problem, we augment the theory TZ with an infinitebut countable number of unary divisibility predicates

k | · for k ∈ Z+ ;

that is, a predicate exists for each positive integer k. The intended interpre-tation of k | x is that it holds iff k divides x without any remainder. Forexample,

x > 1 ∧ y > 1 ∧ 2 | x + y

is satisfiable (choose x = 2 and y = 2), but

¬(2 | x) ∧ 4 | x

is not satisfiable.

The augmented theory TZ extends the signature of TZ with these divisi-bility predicates. Additionally, the predicates are defined by the countable setof axioms

∀x. k | x ↔ ∃y. x = ky (divides)

for k ∈ Z+. The next section shows that TZ admits QE.

7.2.2 Cooper’s Method

In this section, we present the basic form of Cooper’s quantifier eliminationmethod. In the subsequent sections, we examine a set of optimizations.

The algorithm is given a ΣZ-formula ∃x. F [x] as input, where F isquantifier-free but may contain free variables in addition to x. It then proceedsto construct a quantifier-free ΣZ-formula that is TZ-equivalent to ∃x. F [x] ac-cording to the following steps.

Step 1

Put F [x] in NNF. The output ∃x. F1[x] is TZ-equivalent to ∃x. F [x] and issuch that F1 is a positive Boolean combination (only ∧ and ∨) of literals.

Step 2

Replace literals according to the following TZ-equivalences, applied from leftto right:

s = t ⇔ s < t + 1 ∧ t < s + 1¬(s = t) ⇔ s < t ∨ t < s¬(s < t) ⇔ t < s + 1

The output ∃x. F2[x] is TZ-equivalent to ∃x. F [x] and contains only literalsof the form

s < t , k | t , or ¬(k | t) ,

where s, t are ΣZ-terms and k ∈ Z+.

Example 7.5. Applying the TZ-equivalences to

¬(x < y) ∧ ¬(x = y + 3)

produces the TZ-equivalent formula

y < x + 1 ∧ (x < y + 3 ∨ y + 3 < x) .

Step 3

Collect terms containing x so that literals have the form

hx < t , t < hx , k | hx + t , or ¬(k | hx + t) ,

where t is a term that does not contain x and h, k ∈ Z+. The output is theformula ∃x. F3[x], which is TZ-equivalent to ∃x. F [x].

Example 7.6. Collecting terms in

x + x + y < z + 3z + 2y − 4x

produces the TZ-equivalent formula

6x < 4z + y .

Step 4

Let

δ′ = lcmh : h is a coefficient of x in F3[x] ,

where lcm returns the least common multiple of the set. Multiply atoms inF3[x] by constants so that δ′ is the coefficient of x everywhere:

hx < t ⇔ δ′x < h′t where h′h = δ′

t < hx ⇔ h′t < δ′x where h′h = δ′

k | hx + t ⇔ h′k | δ′x + h′t where h′h = δ′

¬(k | hx + t) ⇔ ¬(h′k | δ′x + h′t) where h′h = δ′

Notice the abuse of notation: h′k | · is a different (unary) predicate than k | ·.This rewriting results in formula F ′

3 in which all occurrences of x occur interms δ′x. Replace δ′x terms with a fresh variable x′ to form F ′′

3 :

F ′′3 : F ′

3δ′x 7→ x′ .

Finally, construct

∃x′. F ′′3 [x′] ∧ δ′ | x′

︸︷︷︸F4[x′]

.

The divisibility literal constrains the fresh variable x′ to be divisible by δ′,which exactly captures the values of δ′x.∃x′. F4[x

′] is TZ-equivalent to ∃x. F [x]. Moreover, each literal of F4[x′]

that contains x′ has one of the following forms:

(A) x′ < a(B) b < x′

(C) h | x′ + c(D) ¬(k | x′ + d)

where a, b, c, d are terms that do not contain x, and h, k ∈ Z+.

Step 5

Construct the left infinite projection F−∞[x′] from F4[x′] by replacing

(A) literals x′ < a by ⊤

and

(B) literals b < x′ by ⊥ .

The idea is that very small numbers (the left side of the “number line”) satisfy(A) literals but not (B) literals.

Let

δ = lcm

h of (C) literals h | x′ + ck of (D) literals ¬(k | x′ + d)

and B be the set of b terms appearing in (B) literals. Construct

F5 :

δ∨

j=1

F−∞[j] ∨δ∨

j=1

∨

b∈B

F4[b + j] .

F5 is quantifier-free and TZ-equivalent to ∃x. F [x].

Step 5 is the trickiest part of the procedure, so let us focus on this step.The first major disjunct of F5 contains only divisibility literals. It asserts thatan infinite number of small numbers n satisfy F4[n]. For if there exists onenumber n that satisfies the Boolean combination of divisibility literals in F−∞,then every n− λδ, for λ ∈ Z+, also satisfies F−∞.

The second major disjunct asserts that there is a least n ∈ Z that satisfiesF4[n]. This least n is determined by the b terms of the (B) literals.

More formerly, consider the following periodicity property of the divis-ibility predicates:

If m | δ, then m | n iff m | n + λδ for all λ ∈ Z.

In other words, m | · cannot distinguish between the cases m | n and m | n+λδ.Since δ is chosen in Step 5 to be the least common multiple of the integers hand k of the divides predicates, no divides literal in F5 can distinguish betweentwo integers n and n + λδ, for any λ ∈ Z.

With this property in mind, consider the first major disjunct of F5 again.If n ∈ Z satisfies F [n], then so does n − λδ for λ ∈ Z+. Then surely a smallenough number exists that satisfies all (A) literals and falsifies all (B) literalsof F4, mirroring the construction of F−∞.

For the second major disjunct, suppose that some number n satisfies F4[n].Decreasing this number continues to satisfy the same (A) literals. It cannotdecrease past some value b∗ without changing the truth of some (B) literal.

(a) • • • • • a2

⊳

a1

⊳

(b) • • • • a2

⊳

a1

⊳

b

⊲ |

(c) • b

⊲ |a2

⊳

a1

⊳

Fig. 7.1. (a) Left infinite projection (b) δ-interval (c) false

However, according to the periodicity rule, there exists a satisfying integer n′

within the δ-interval to the right of b∗.Figure 7.1 illustrates several situations when a, b, c, and d terms are con-

stant. Circles represent points that satisfy the divides constraints; solid circlesin particular represent satisfying points. Figure 7.1(a) illustrates a formulax < a1 ∧ x < a2 ∧ δ | x: each left-pointing triangle represents a x < ai literal.The left infinite projection is satisfied. Figure 7.1(b) illustrates an additionalx > b literal; now, the δ-interval following the right-pointing triangle at b issearched. It contains a satisfying point. Finally, b > a1 in Figure 7.1(c), sothe δ-interval does not contain a satisfying point.

Example 7.7. Consider ΣZ-formula

∃x. 3x− 2y + 1 > −y ∧ 2x− 6 < z ∧ 4 | 5x + 1︸︷︷︸F [x]

.

After Step 3, we have

∃x. 2x < z + 6 ∧ y − 1 < 3x ∧ 4 | 5x + 1︸︷︷︸F3[x]

.

Collecting coefficients of x in Step 4, we find

δ′ = lcm2, 3, 5 = 30 .

Multiplying when necessary, we rewrite the formula so that 30 is the coefficientof every occurrence of x:

∃x. 30x < 15z + 90 ∧ 10y − 10 < 30x ∧ 24 | 30x + 6 .

Replacing 30x with fresh x′ and conjoining a divides atom completes Step 4:

∃x′. x′ < 15z + 90 ∧ 10y − 10 < x′ ∧ 24 | x′ + 6 ∧ 30 | x′

︸︷︷︸F4[x′]

.

For Step 5, construct the left infinite projection

F−∞[x] : ⊤ ∧ ⊥ ∧ 24 | x′ + 6 ∧ 30 | x′ ,

which simplifies to ⊥. Compute

δ = lcm24, 30 = 120

and

B = 10y − 10 .

Then replacing x′ by 10y − 10 + j in F4[x′] produces

F5 :

120∨

j=1

[10y − 10 + j < 15z + 90 ∧ 10y − 10 < 10y − 10 + j∧ 24 | 10y − 10 + j + 6 ∧ 30 | 10y − 10 + j

]

which simplifies to

F5 :120∨

j=1

[10y + j < 15z + 100 ∧ 0 < j∧ 24 | 10y + j − 4 ∧ 30 | 10y + j − 10

].

F5 is quantifier-free and TZ-equivalent to ∃x. F [x].

Example 7.8. Consider again the formula defining the set of even integers:

∃x. 2x = y︸︷︷︸F [x]

.

Rewriting according to Steps 2 and 3 produces

∃x. y − 1 < 2x ∧ 2x < y + 1 .

Then

δ′ = lcm2, 2 = 2 ,

so Step 4 completes with

∃x′. y − 1 < x′ ∧ x′ < y + 1 ∧ 2 | x′

︸︷︷︸F4[x′]

.

Computing the left infinite projection F−∞ produces ⊥, as F4[x′] contains a

(B) literal as a conjunct. However,

δ = lcm2 = 2

and

B = y − 1 ,

so

F5 :

2∨

j=1

(y − 1 < y − 1 + j ∧ y − 1 + j < y + 1 ∧ 2 | y − 1 + j) .

Simplifying, we find

F5 :

2∨

j=1

(0 < j ∧ j < 2 ∧ 2 | y + j − 1) ,

and then

F5 : 2 | y ,

which is quantifier-free and TZ-equivalent to ∃x. F [x].


∃x. (3x + 1 < 10 ∨ 7x− 6 > 7) ∧ 2 | x︸︷︷︸F [x]

.

Rewriting to isolate x terms produces

∃x. (3x < 9 ∨ 13 < 7x) ∧ 2 | x ,

so

δ′ = lcm3, 7 = 21 .

After multiplying coefficients by proper constants,

∃x. (21x < 63 ∨ 39 < 21x) ∧ 42 | 21x ,

we replace 21x by x′:

∃x′. (x′ < 63 ∨ 39 < x′) ∧ 42 | x′ ∧ 21 | x′

︸︷︷︸F4[x′]

.

Then

F−∞[x′] : (⊤ ∨ ⊥) ∧ 42 | x′ ∧ 21 | x′ ,

or, simplifying,

F−∞[x′] : 42 | x′ ∧ 21 | x′ .

Finally,

δ = lcm21, 42 = 42

and

B = 39 ,

so

F5 :

42∨

j=1

(42 | j ∧ 21 | j) ∨

42∨

j=1

((39 + j < 63 ∨ 39 < 39 + j) ∧ 42 | 39 + j ∧ 21 | 39 + j) .

Since 42 | 42 and 21 | 42, the left main disjunct simplifies to ⊤, so that ∃x. F [x]

is TZ-equivalent to ⊤. Thus, F is TZ-valid.

Theorem 7.10 (Correct). Given ΣZ-formula ∃x. F [x] in which F is quantifier-

free, Cooper’s method returns a TZ-equivalent quantifier-free formula.

Proof. The transformations of the first four steps produce formula F4. Byinspection, we assert that in TZ

∃x. F [x] ⇔ ∃x. F4[x] .

The focus of the proof is to prove that ∃x. F4[x] ⇔ F5 in TZ:

∃x. F4[x] ⇔δ∨

j=1

F−∞[j] ∨δ∨

j=1

∨

b∈B

F4[b + j] .

We accomplish the proof in two steps.

1. F5 ⇒ ∃x. F4[x]: We assume the existence of an interpretation I such thatI |= F5 and prove that I |= ∃x. F4[x].

2. ∃x. F4[x]⇒ F5: We assume the existence of an interpretation I such thatI |= ∃x. F4[x] and prove that I |= F5.

Assume then that I |= F5, so that one of the disjuncts of F5 is true underI. If one of the second set of disjuncts is true, say F4[b

∗ + j∗], then

I ⊳ x 7→ b∗ + j∗ |= F4[x]

I |= ∃x. F4[x] .

Otherwise, one of the first set of disjuncts is true, so for some j∗ ∈ [1, δ],I ⊳ x 7→ j∗ |= F−∞[x]. By construction of F−∞, there is some λ > 0 suchthat I ⊳ x 7→ j∗−λδ |= F4[x]. That is, there is some j∗−λδ that is so smallthat the inequality literals of F4 evaluate under I ⊳ x 7→ j∗ − λδ exactly asin the construction of F−∞. Thus, I |= ∃x. F4[x] in this case as well.

For the other direction, assume that I |= ∃x. F4[x]. Thus, some n ∈ Z

exists such that I ⊳ x 7→ n |= F4[x]. If for some b∗ ∈ B and j∗ ∈ [1, δ],

I |= n = b∗ + j∗, then I |= F4[b∗ + j∗]. As F4[b

∗ + j∗] is a disjunct of F5,I |= F5.

Otherwise, consider whether I ⊳ x 7→ n− δ |= F4[x]. If not, then one ofthe (B) literals, say b∗ < x for some b∗ ∈ B, of F4 becomes false under I inthe transition from n to n− δ. But then I |= n = b∗ + j∗ for some j∗ ∈ [1, δ],contradicting our assumption that n is not equal to some b∗ + j∗. Hence, itmust be the case that I ⊳ x 7→ n− δ |= F4[x].

By induction using this argument, we find that I ⊳ x 7→ n− λδ |= F4[x]for all λ > 0. For some λ, n− λδ becomes so small that

I ⊳ x 7→ n− λδ |= F4[x]↔ F−∞[x] , so

I ⊳ x 7→ n− λδ |= F−∞[x] .

That is, n − λδ is so small that the inequality literals of F4 evaluate underI ⊳ x 7→ n − λδ exactly as in the construction of F−∞. Now, since F−∞

contains only divides literals, we can choose a µ such that n−λδ+µδ ∈ [1, δ].Let j∗ = n− λδ + µδ. Then I |= F−∞[j∗], so that I |= F5.

7.2.3 A Symmetric Elimination

The construction in Step 5 was biased to the left. We can just as easily definea right elimination. Construct the right infinite projection F+∞[x′] fromF4[x

′] by replacing

(A) literals x′ < a by ⊥

and

(B) literals b < x′ by ⊤ .

The idea is that very large numbers (the right side of the “number line”)satisfy (B) literals but not (A) literals.

Then define δ as before, but now define A as the set of a terms appearingin (A) literals. Construct

F5 :

δ∨

j=1

F+∞[−j] ∨δ∨

j=1

∨

a∈A

F4[a− j] .

Now, instead of choosing the left or right elimination (correspondingto the left or right infinite projections, respectively) arbitrarily, choose theelimination according to the number of (A) and (B) literals. If there arefewer (A) literals than (B) literals, choose the right elimination; otherwise,choose the left elimination. This heuristic minimizes the number of disjunctsin the resulting formula F5.

Example 7.11. In the formula

F : ∃x. (x < 13 ∨ x > 15) ∧ x < y ,

there are two (A) literals but only one (B) literal. Hence, choose the leftinfinite projection to produce fewer disjuncts.

7.2.4 Eliminating Blocks of Quantifiers

Consider a formula with a block of quantifiers: ∃x1. · · · ∃xn. F [x1, . . . , xn],where F is quantifier-free. Eliminating xn produces a formula of the form

G1 :

∃x1. · · · ∃xn−1.δ∨

j=1

F−∞[x1, . . . , xn−1, j] ∨δ∨

j=1

∨

b∈B

F4[x1, . . . , xn−1, b + j] .

Disjunction and existential quantification commute, so we can rewrite G1 as

G2 :

∨δj=1 ∃x1. · · · ∃xn−1. F−∞[x1, . . . , xn−1, j]

∨δ∨

j=1

∨

b∈B

∃x1. · · · ∃xn−1. F4[x1, . . . , xn−1, b + j]

and continue the elimination on each disjunct.This optimization can be taken one step further. Rather than expanding

the disjuncts over the iterator j, treat j as a free variable during the subsequenteliminations. Then only 1 + |B| formulae need be examined during the nextphase: the formula ∃x1. · · · ∃xn−1. F−∞[x1, . . . , xn−1, j] and, for each b ∈ B,the formula ∃x1. · · · ∃xn−1. F4[x1, . . . , xn−1, b + j].


∃y. ∃x. x < −2 ∧ 1− 5y < x ∧ 1 + y < 13x︸︷︷︸F [x,y]

.

At Step 3,

δ′ = lcm1, 13 = 13 ,

producing

∃y. ∃x. 13x < −26 ∧ 13− 65y < 13x ∧ 1 + y < 13x

and then

∃y. ∃x′. x′ < −26 ∧ 13− 65y < x′ ∧ 1 + y < x′ ∧ 13 | x′ .

With δ = lcm13 = 13, A = −26, and B = 13− 65y, 1 + y, choose theright elimination to form:

∃y.

13∨

j=1

[−26− j < −26 ∧ 13− 65y < −26− j∧ 1 + y < −26− j ∧ 13 | − 26− j

].

F+∞ simplifies to ⊥ since F is a conjunction of both (A) and (B) liter-als. Now, instead of applying elimination to the entire subformula within thequantifier ∃y, commute the quantifier and the disjunctions:

G :13∨

j=1

∃y. j > 0 ∧ 39 + j < 65y ∧ y < −27− j ∧ 13 | − j − 26 .

Treating j as a free variable, apply QE to the subformula

H : ∃y. j > 0 ∧ 39 + j < 65y ∧ y < −27− j ∧ 13 | − j − 26

as usual. Then simplify to produce

H ′ :

65∨

k=1

(k < −66j − 1794 ∧ 13 | − j − 26 ∧ 65 | 39 + j + k) ,

and replace H with H ′ in G to produce the final formula

13∨

j=1

65∨

k=1

(k < −66j − 1794 ∧ 13 | − j − 26 ∧ 65 | 39 + j + k) .

This formula is TZ-equivalent to ∃y, x. F [x, y].

7.2.5 ⋆Solving Divides Constraints

Consider a formula of the form G : ∃x1. · · · ∃xn. F [x1, . . . , xn] without freevariables. Applying Cooper’s method with the block elimination optimizationproduces a quantifier-free TZ-equivalent formula of the form

G′ :

δ1∨

j1=1

· · ·δn∨

jn=1

F ′[j1, . . . , jn] ,

also without free variables. Expanding this formula by attempting every pos-sible combination of values for j1, . . . , jn produces δ1× δ2× · · ·× δn disjuncts.This naive expansion is prohibitively expensive on even small problems.

Notice, however, that Step 4 introduces many divisibility literals as con-juncts. F5 has the form


F5 :

δ1∨

j1=1

· · ·δn∨

jn=1

(F ′′ ∧

∧

i

ki | ti[j1, . . . , jn]

),

where the ti are terms containing only constants and the ji iterators. Cooperrealized that the conjuncts

D :∧

i

ki | ti[j1, . . . , jn]

can be solved to reduce significantly the number of disjuncts to consider.There are two steps to solving divides constraints.

Step 1: Triangulate the Constraints

The following theorem provides a means for reducing the number of literalsthat contain some ji. It applies Euclid’s algorithm for computing the greatestcommon divisor (GCD) d of two integers m and n. Euclid’s algorithm alsoreturns two integers p and q such that pm + qn = d.

Theorem 7.13. Consider two divisibility constraints

F : m | ax + b ∧ n | αx + β ,

where m, n ∈ Z+, a, α ∈ Z \ 0, and b, β are terms not containing x. Letd, p, q = gcd(an, αm) be such that d is the GCD of an and αm, and p and qobey pan + qαm = d. Then F is satisfiable iff

G : mn | dx + bpn + βqm ∧ d | αb− aβ

is satisfiable.

While both of the literals of F contain x, only one of the literals of Gcontains x. Therefore, we can apply this theorem to triangulate a set S ofdivisibility constraints. Let ≺ be a linear ordering of j1, . . . , jn. S is in trian-gular form if for each ji, at most one constraint of S contains ji as the least(according to ≺) free variable.

The triangularization algorithm proceeds iteratively. On one iteration, per-form the following steps:

1. Choose from S two constraints

m | aji + b and n | αji + β

such that there is no jk ≺ ji that occurs in at least two divisibility con-straints of S.

2. Apply Theorem 7.13 to produce the new constraints

mn | dji + bpn + βqm and d | αb− aβ .

Replace the original constraints with these constraints in S.


Example 7.14. Consider the divisibility constraints of Example 7.12:

13 | − j − 26 ∧ 65 | 39 + j + k .

Fix the variable order j ≺ k. According to this order, only one constraintshould have an occurrence of j, so apply Theorem 7.13:

m | aj + b n | αj + β13 | −j + −26 ∧ 65 | j + k + 39

Compute

d, p, q = gcd(an, αm)13, 0, 1 = gcd(−65, 13)

so thatpan + qαm = d

0(−65) + 1(13) = 13

and construct

mn | dj + bpn + βqm d | αb − aβ845 | 13j + (−26)(0)(65) + (k + 39)13 ∧ 13 | −26 − (−1)(k + 39)

or, simplifying,

845 | 13j + 13k + 507 ∧ 13 | k + 13 .

As desired, only one constraint has an occurrence of j. Additionally, the sim-plest constraint contains only k.

Step 2: Solve the Constraints

To solve the triangulated system of constraints S, consider the following the-orem.

Theorem 7.15. Consider divisibility constraint

F : m | ax + b ,

for m ∈ Z+, a ∈ Z\0, and b a term not containing x. Let d, p, q = gcd(a, m)be such that d is the GCD of a and m, and p and q obey pa + qm = d. F issatisfiable iff d | b. If it is satisfiable, its solutions are

x = −pb

d+

λm

dfor λ ∈ Z .

With S in triangular form, we need only solve the system recursively:

1. If S is empty, generate a disjunct according to the current values ofj1, . . . , jn.

2. Otherwise, choose from S a constraint

F : m | aji + b

such that b is an integer. Apply Theorem 7.15 to F .

3. If F is unsatisfiable, return.4. Otherwise, instantiate the remaining constraints of S with each solution

to ji within the range [1, δi], and recursively solve.

Example 7.16. Consider the divisibility constraints of Example 7.12:

13 | − j − 26 ∧ 65 | 39 + j + k .

The system is already in triangular form for the variable order k ≺ j. To solvethe first constraint

m | aj + b13 | −j + −26

computed, p, q = gcd(a, m)1, −1, 0 = gcd(−1, 13) .

Does 1 | − 26? Yes, so the solutions are given by

j = −(−1)−26

1+ λ

13

1= −26 + 13λ for λ ∈ Z .

Since j ∈ [1, 13], only λ = 3 is relevant, providing the solution j = 13.After substituting j = 13 into the second constraint, to solve

m | ak + b65 | k + 52

computed, p, q = gcd(a, m)1, 1, 0 = gcd(1, 65) .

Does 1 | 52? Yes, so the solutions are given by

k = −(1)52

1+ λ

65

1= −52 + 65λ for λ ∈ Z .

Since k ∈ [1, 65], only λ = 1 is relevant, providing the solution k = 13. j = 13,k = 13 is the only solution to the divisibility constraints 13 | − j − 26 ∧65 | 39 + j + k.

However, the additional conjunct

(k < −66j − 1794)j 7→ 13, k 7→ 13

is not true, so the formula of Example 7.12 is TZ-equivalent to ⊥ and is thusTZ-unsatisfiable.

Alternatively, we could have enumerated all 13 × 65 = 845 possible dis-juncts to discover that none of them simplifies to ⊤. In 844 of these disjuncts,at least one of the divisibility constraints is ⊥.

Solve the triangulated system of Example 7.14 with variable order j ≺ k,and verify that the solution is the same. The variable order used for triangu-lating the constraint system does not affect the solutions.

Solving divisibility constraints significantly improves the performance ofQE on purely existential and purely universal formulae. What about formulae


with free variables or some quantifier alternation? For example, consider theformula

G : ∀y1. · · · ∀ym. ∃x1. · · · ∃xn. F [x1, . . . , xn, y1, . . . , ym] .

Eliminating the inner block produces the TZ-equivalent formula

G′ : ∀y1. · · · ∀ym.

δ1∨

j1=1

· · ·δn∨

jn=1

F ′[j1, . . . , jn, y1, . . . , ym] .

Order the free variables free(F ′) = j1, . . . , jn, y1, . . . , ym so that the yi’sprecede the ji’s. In the resulting triangulated system, as few constraints aspossible contain yi variables. Drop all resulting constraints that contain ayi variable, and solve the remaining constraints. A variable ji that does notappear in the final set of constraints must be instantiated to all values in itsrange [1, δi].

7.3 Quantifier Elimination over Rationals

QE for the theory of rationals TQ is simpler than for TZ. Recall that TQ hasthe following signature:

ΣQ : 0, 1, +, −, =, ≥ ,

where

• 0 and 1 are constants;• + is a binary function;• − is a unary function;• and = and ≥ are binary predicates.

To be consistent with our presentation of Cooper’s method, we switch fromweak inequality ≥ to strict inequality >. Of course, they are interchangeable:

x ≥ y ⇔ x > y ∨ x = y and x > y ⇔ x ≥ y ∧ ¬(x = y) .

7.3.1 Ferrante and Rackoff’s Method

Given a ΣQ-formula ∃x. F [x] as input, where F is quantifier-free, the algorithmproceeds according to the following steps.

Step 1

Put F [x] in NNF. The output ∃x. F1[x] is TQ-equivalent to ∃x. F [x] and issuch that F1 is a positive Boolean combination (only ∧ and ∨) of literals.

7.3 Quantifier Elimination over Rationals 201

Step 2

Replace literals according to the following TQ-equivalences, applied from leftto right:

¬(s < t) ⇔ t < s ∨ t = s¬(s = t) ⇔ t < s ∨ t > s

The output ∃x. F2[x] is TQ-equivalent to ∃x. F [x] and does not contain anynegations.

Step 3

Solve for x in each atom of F2[x]: for example, replace the atom

t < cx ,

where c ∈ Z \ 0 and t is a term not containing x, with

t

c< x .

Atoms in the output ∃x. F3[x] now have the form

(A) x < a(B) b < x(C) x = c

where a, b, c are terms that do not contain x. ∃x. F3[x] is TQ-equivalent to∃x. F [x].

Step 4

Construct the left infinite projection F−∞ from F3[x] by replacing

(A) atoms x < a by ⊤ , (B) atoms b < x by ⊥ ,

and

(C) atoms x = c by ⊥ .

Construct the right infinite projection F+∞ from F3[x] by replacing

(A) atoms x < a by ⊥ , (B) atoms b < x by ⊤ ,

and

(C) atoms x = c by ⊥ .

(a) b

⊲ •b+a2• a

⊳

(b) b

⊲

c•b+a2 a

⊳ Fig. 7.2. Satisfying points: (a) b+a

2(b) c+c

2

The left (right) infinite projection captures the case when small (large) n ∈ Q

satisfy F3[n].Let S be the set of a, b, and c terms from the (A), (B), and (C) atoms.

Construct the final output

F4 : F−∞ ∨ F+∞ ∨∨

s,t∈S

F3

[s + t

2

],

which is TQ-equivalent to ∃x. F [x].The disjunct F−∞ captures the possibility that all rationals n less than

some value a∗ satisfy F4[n]. The disjunct F+∞ captures the symmetric casefor large n. Finally, the last disjunct can be seen as capturing two possibilities.First, disjuncts in which s and t are the same terms check whether any terms ∈ S satisfies F4[s]. Second, consider the remaining O(|S|2) disjuncts in whichs and t are different terms. In any TQ-interpretation, |S| − 1 pairs s, t ∈ S areadjacent; for such a pair, (s, t) is an interval in which no other s′ ∈ S lies. IfF4[

s+t2 ], then it can be shown that every other point n ∈ (s, t) also satisfies

F4[n]. In other words, s+t2 represents the whole interval (s, t). Since no single

TQ-interpretation is fixed in advance, all O(|S|2) pairs are considered (but seeExercise 7.8 for an optimization).

Figure 7.2 illustrates two cases in which a, b, and c terms are constant.Figure 7.2(a) visualizes the formula b < x ∧ x < a, for which b+a

2 is asatisfying point. Triangles represent inequalities; the solid circle representsb+a2 . All points in the interval (b, a) are satisfying, but b+a

2 is the representativeof this interval. Figure 7.2(b) includes an additional literal, x = c (the solidcircle); now, b+a

2 (the open circle) is not a satisfying point, but c+c2 = c is.


∃x. 2x = y︸︷︷︸F [x]

.

In Step 3, solving for x produces

F ′ : ∃x. x =y

2

so that S = y2. The left F−∞ and right F+∞ infinite projections are both

⊥, as F ′ contains a single (C) atom. Hence, simplifying

7.3 Quantifier Elimination over Rationals 203

F4 :∨

s,t∈S

(s + t

2=

y

2

)

reveals the TQ-equivalent quantifier-free formula y2 = y

2 , or ⊤. Therefore,∃x. F [x] is TQ-valid.


∃x. 3x + 1 < 10 ∧ 7x− 6 > 7︸︷︷︸F [x]

.

Solving for x gives

F ′ : ∃x. x < 3 ∧ x >13

7︸︷︷︸F3[x]

and S = 3, 137 . Since x < 3 is an (A) atom and x > 13

7 is a (B) atom, bothF−∞ and F+∞ simplify to ⊥, leaving

F4 :∨

s,t∈S

(s + t

2< 3 ∧ s + t

2>

13

7

).

s+t2 takes on three expressions: 3, 13

7 , and13

7+3

2 . The first two expressions arisewhen s and t are the same terms. F3[3] and F3[

137 ] both simplify to ⊥ since

the inequalities are strict; however,

F3

[ 137 + 3

2

]:

137 + 3

2< 3 ∧

137 + 3

2>

13

7

simplifies to ⊤. Thus, F4 : ⊤ is TQ-equivalent to ∃x. F [x], so ∃x. F [x] isTQ-valid.


G : ∀x. x < y .

To eliminate x, consider the subformula F of

G′ : ¬(∃x. ¬(x < y)︸︷︷︸F [x]

) .

Step 2 rewrites F as

∃x. y < x ∨ y = x .

The literals are already in solved form for x in Step 3. Then

F−∞ : ⊥ ∨ ⊥ and F+∞ : ⊤ ∨ ⊥simplify to ⊥ and ⊤, respectively. Since F+∞ is ⊤, we need not consider therest of Step 4, but instead declare that ∃x. F [x] is TQ-equivalent to F4 : ⊤.Then G′ is ¬⊤, so that G is TQ-equivalent to ⊥.


Theorem 7.20 (Correct). Given ΣQ-formula ∃x. F [x] in which F isquantifier-free, Ferrante and Rackoff’s method returns a TQ-equivalentquantifier-free formula.

Exercise 7.9 asks the reader to prove the theorem.A limited form of the block elimination optimization discussed in Section

7.2.4 can be adapted to this QE procedure: commute disjunction and exis-tential quantification. This step reduces the size of the term set S in eachsubproblem.

7.4 ⋆Complexity

Fischer and Rabin proved the following lower bounds. The length n of aformula is the number of symbols.

Theorem 7.21 (TZ Lower Bound). There is a fixed constant c > 0 such

that for all sufficiently large n, there is a ΣZ-formula of length n that requiresat least 22cn

steps to decide its validity.

Theorem 7.22 (TQ Lower Bound). There is a fixed constant c > 0 suchthat for all sufficiently large n, there is a ΣQ-formula of length n that requiresat least 2cn steps to decide its validity.

Oppen analyzed Cooper’s method to prove the following upper bound.

Theorem 7.23 (TZ Upper Bound). On a ΣZ-formula of length n, Cooper’s

method requires deterministic time 222pn

for some fixed constant p > 0.

Ferrante and Rackoff proved the following upper bound.

Theorem 7.24 (TQ Upper Bound). On a ΣQ-formula of length n, Ferranteand Rackoff’s method requires deterministic time 22pn

for some fixed constantp > 0.

Closing the gap between the lower and upper bounds would require an-swering long-standing open questions in complexity theory.

7.5 Summary

Quantifier elimination is a standard technique for reasoning about theoriesin which satisfiability is decidable even with arbitrary quantification. Thischapter presents the technique in the context of arithmetic over integers andover rationals or reals. It covers:

Exercises 205

• Quantifier elimination in general. Based on structural induction, one onlyneeds to consider the special case of formulae of the form ∃x. F [x], inwhich F is quantifier-free but may contain free variables in addition to x;arbitrary formulae may then be treated compositionally.

• Elimination over integers, TZ. The basic theory of integers does not admitquantifier elimination; it must be augmented with divisibility predicates.This situation, in which additional predicates are required to develop aquantifier elimination procedure, is common. The main idea of the proce-dure is to identify intervals with periodic behavior induced by the divisi-bility predicates.

• Elimination over rationals, TQ. The main idea of the procedure is to par-tition the rationals into a finite number of points and intervals.

The optimizations of Cooper’s method, particularly solving divides con-straints, make the procedure acceptably fast in practice on quantifier-freeΣZ-formulae. However, faster decision procedures exist for deciding ΣQ-satisfiability of quantifier-free ΣQ-formulae; we study one in Chapter 8.

In addition to handling quantifiers, the algorithms of this chapter treatarbitrary Boolean combinations of literals. The decision procedures of sub-sequent chapters require the Boolean structure to be simple: formulae arejust conjunctions of literals. Treating formulae with arbitrary Boolean struc-ture directly avoids the potential exponential increase in size associated withconverting to DNF.


Presburger proves that arithmetic over the natural numbers without multi-plication TN is decidable [73]. Cooper presents the version of the quantifier-elimination procedure for TZ that we describe [19]. Fischer and Rabin providethe lower bound on the complexity of the decision problem for TZ [33], whileOppen analyzes Cooper’s procedure to obtain an upper bound [68].

Ferrante and Rackoff describe the quantifier-elimination procedure thatwe present and the lower and upper complexity bounds on the problem [32].

Exercises

7.1 (TZ does not admit QE). Prove Lemma 7.4. Hint : Apply structuralinduction; the base cases involve comparisons between ay and c, for constantsa and c.

7.2 (QE for TZ). Apply quantifier-elimination to the following ΣZ-formulae.

(a) ∀y. 3 < x + 2y ∨ 2x + y < 3(b) ∃y. 3 < x + 2y ∨ 2x + y < 3

(c) ∃y. x = 2y ∧ y < x(d) ∀x. (∃y. x = 2y) → (∃y. 3x = 2y)

7.3 (QE for TZ). Construct new ΣZ-formulae such that

(a) the F−∞ component of the elimination simplifies to ⊤;(b) the F−∞ component of the elimination simplifies to ⊥;(c) using the right infinite projection is better than using the left infinite

projections.

In each case, describe the elimination.

7.4 (Block elimination in TZ). Apply quantifier-elimination to the follow-ing ΣZ-formulae. Use block elimination.

(a) ∃x. ∃y. 2x + 3y = 7 ∧ x < y(b) ∃x. ∃y. 2x + 3y = 7 ∧ y < x(c) ∃x. ∃y. 2x + 3y = 7 ∧ x < y ∧ 0 < x ∧ 0 < y(d) ∃x. ∃y. 3x + 3y < 8 ∧ 8 < 3x + 2y(e) ∃x. ∃y. x = 2y ∧ ∃z. x = 3z

7.5 (Divides constraints). Apply the divides-constraints elimination to theΣZ-formulae of Exercise 7.4.

7.6 (QE for TQ). Apply quantifier-elimination to the formulae of Exercise7.2, but treat them as ΣQ-formula.

7.7 (QE for TQ). Construct new ΣQ-formulae such that

(a) the F−∞ and F+∞ components of the elimination simplify to ⊤ and ⊥,respectively;

(b) the F−∞ and F+∞ components of the elimination simplify to ⊥ and ⊤,respectively.

In each case, describe the elimination.

7.8 (Sufficient set). Step 4 of Ferrante and Rackoff’s method examines termss+t2 for all s, t ∈ S, where S is the set of all a, b, and c terms. Describe a

smaller set of terms that is still sufficient. According to this new definition,which terms should be examined in Example 7.18 and Exercise 7.6?

7.9 (⋆Theorem 7.20). Prove Theorem 7.20. Hint : Apply the strategy em-ployed in the proof of Theorem 7.10.

7.10 (⋆Optimization problem). Consider the optimization problemmaxf(x) : F [x] in which the objective function f(x) is a linear expres-sion over the problem variables x, and the constraint F [x] is a ΣZ-formulasuch that free(F [x]) = x. The solution to the problem is the largest numbern such that there exists some evaluation v of x for which f(v) = n and F [v]is true. Show how to use QE for TZ to solve this optimization problem.

8

Quantifier-Free Linear Arithmetic

Because my mathematics has its origin in a real problem doesn’t makeit less interesting to me — just the other way around, I find it makesthe puzzle I am working on all the more exciting. I get satisfaction outof knowing that I’m working on a relevant problem.

— George DantzigAn Interview with George B. Dantzig:

The Father of Linear Programming, 1986

This chapter considers satisfiability in the quantifier-free fragment of thetheory of rationals TQ. Addressing this fragment is motivated by two observa-tions. First, program verification typically requires just considering formulaefrom the quantifier-free fragments of theories such as TQ. Second, decidingsatisfiability in the full theory of TQ is computationally expensive, while de-ciding satisfiability in just the quantifier-free fragment of TQ is fast in practicewhen using, for example, the simplex method for linear programming.

A linear program is an optimization problem in which the goal is to finda point satisfying a set of linear constraints that maximizes a linear objectivefunction. The linear constraints are a quantifier-free conjunctive ΣQ-formulain which each atom is a weak inequality. The objective function is a linearfunction. Deciding TQ-satisfiability can be cast as solving a linear program,so we benefit from existing algorithms for solving them.

Section 8.1 motivates studying the quantifier-free fragments of theoriesin general. Section 8.2 reviews concepts from linear algebra. Then Section8.3 introduces linear programs and shows how to decide TQ-satisfiability bysolving a linear program. Finally, Section 8.4 presents the simplex method forsolving linear programs.

8.1 Decision Procedures for Quantifier-Free Fragments

The time complexities of the algorithms of Chapter 7 limit their practicalimpact. Additionally, Ferrante and Rackoff’s method is not optimal for the

208 8 Quantifier-Free Linear Arithmetic

quantifier-free fragment of TQ. For other theories, the situation is worse: sat-isfiability in the full theories such as equality TE, lists Tcons, and arrays TA

is undecidable. Fortunately, for verification and other applications, we typ-ically need to decide satisfiability of quantifier-free formulae rather than ofarbitrarily quantified formulae.

Recall that the quantifier-free fragment of a theory T with signatureΣ consists of the axioms of T and valid Σ-formulae of the form

G : ∀x1, . . . , xn. F [x1, . . . , xn] ,

where F is quantifier-free and free(F ) = x1, . . . , xn. While such formulaehave quantifiers, the point is that they do not have quantifier alternations:all quantifiers are universal. Using our conventions, we would ordinarily askwhether the formula F is T -valid — whether its universal closure ∀ ∗ . F isT -valid. F is indeed quantifier-free.

T -validity of G corresponds to T -unsatisfiability of

¬G : ∃x1, . . . , xn. ¬F [x1, . . . , xn] ,

or, using our conventions, simply T -unsatisfiability of ¬F . Hence, the quan-tifiers are “natural” for satisfiability checking: ¬G is T -satisfiable iff thereexists a T -interpretation I and an assignment to x1, . . . , xn under which ¬Fevaluates to true.

Fortunately, the quantifier-free fragments of many theories are decidable,often efficiently so. In Chapters 8-10, we focus on decision procedures for thequantifier-free fragments of theories.

For ease of exposition, we consider only conjunctive quantifier-free Σ-formulae in each theory T that we examine. Conjunctive Σ-formulae are con-junctions of Σ-literals. This restriction does not limit the scope of the decisionprocedures. For given arbitrary quantifier-free Σ-formula F , we can convertit into DNF Σ-formula

F1 ∨ · · · ∨ Fk

in which each Fi is conjunctive. F is T -satisfiable iff at least one Fi is T -satisfiable. Decide the T -satisfiability of F by considering each Fi.

Remark 8.1 (⋆Complexity). This restriction does, however, affect com-plexity. Because satisfiability in PL is NP-complete, any decision procedurethat considers arbitrary quantifier-free formulae must be at least NP-hard, asit must handle not only the theory-specific aspects of the formulae but also thecombinatorial (PL) aspects. However, considering only conjunctive formulaeallows us to give more insightful complexity bounds. For example, satisfia-bility in the conjunctive quantifier-free fragments of TQ, TE, Tcons, and theirunion theory is in PTIME. Thus, the “hard” part of deciding satisfiability ofarbitrary quantifier-free formulae in these theories is handling the underlyingPL structure. Analyzing the theory-specific aspects is comparatively easy.

8.2 Preliminary Concepts and Notation 209

8.2 Preliminary Concepts and Notation

We define basic concepts and notation of linear algebra, covering only what isrequired for understanding the remainder of the chapter. We refer the readerinterested in learning more about linear algebra to relevant texts in Biblio-graphic Remarks.

Basic Concepts and Notation

A variable n-vector x is a column of n variables x1, . . . , xn. An n-vector isa column a ∈ Qn of n rationals, and its transpose aT is a row with elementslisted in the same order:

a =

a1

...an

and aT =

[a1 · · · an

].

An m × n-matrix A ∈ Qm×n consists of n columns of m rationals each(alternatively, m rows of n rationals each), and its transpose AT is an n×m-matrix in which element aij is swapped with element aji:

A =

a11 · · · a1n

...am1 · · · amn

and AT =

a11 · · · am1

...a1n · · · amn

.

When we refer to a row ai of A, we mean the row vector[ai1 · · · ain

], and

when we refer to a column aj of A, we mean the column vector[a1j · · · amj

]T.

We use this compact notation of transposed row vectors for column vectorsto save vertical space.

Vector-vector multiplication works as follows:

aTb =[a1 · · · an

]

b1

...bn

=

n∑

i=1

aibi .

Matrix-vector multiplication works as follows:

Ax =

a11 · · · a1n

...am1 · · · amn

x1

...xn

=

n∑

i=1

a1ixi

...n∑

i=1

amixi

.


Each row of A multiplies (as a vector) x. Finally matrix-matrix multi-plication works as follows: the product P of AB, for m × n-matrix A andn× ℓ-matrix B, is an m× ℓ-matrix in which element pij is the product of thevector-vector multiplication of row ai of A and column bj of B:

pij = aibj =[ai1 · · · ain

]

b1j

...bnj

=

n∑

k=1

aikbkj .

There are several important named vectors and matrices. 0 is a (column)vector of 0s. Similarly, 1 is a vector of 1s. Their sizes depend on the contextin which they are used. Combining our notation so far,

1Tx =

n∑

i=1

xi .

In this context, 1 is an n-vector. The n×n-matrix I is the identity matrix,in which the diagonal elements are 1 and all other elements are 0. Thus,

IA = AI = A

for any n × n-matrix A. Finally, the unit vector ei is the vector in whichthe ith element is 1 and all other elements are 0. Again, the sizes of I and ei

depend on their context.

Linear Equations

A vector space is a set of vectors that is closed under addition and scalingof vectors: if v1, . . . , vk ∈ S are vectors in vector space S, then also

λ1v1 + · · ·+ λkvk ∈ S

for λ1, . . . , λn ∈ Q. 0 is always a member of a vector space. For example, Qn

is a vector space with dimension n; it consists of all n-vectors of rationals.In general, the dimension of a vector space is given by the minimal numberof vectors required to produce all vectors of the space through addition andscaling. Such a minimal set is called a basis. For example, a line has dimension1 and can be described by a single vector: the full space of the line is describedsimply by scaling this vector. Similarly, a plane has dimension 2 and can bedescribed by two vectors. The vector space consisting of just 0 has dimension0 and is described by the empty set of vectors.

The linear equation

F : Ax = b ,

for m×n-matrix A, variable n-vector x, and m-vector b, compactly representsthe ΣQ-formula

8.2 Preliminary Concepts and Notation 211

F :m∧

i=1

ai1x1 + · · ·+ ainxn = bi .

The satisfying points comprise a vector space.To solve linear equations (at least on paper), we apply Gaussian elim-

ination. Consider first the case when A is a square matrix: its numbersof columns and rows are equal. For equation Ax = b define the augmentedmatrix

[A b

]. A matrix is in triangular form if all of its entries below

the diagonal are 0; an augmented matrix[A b

]is in triangular form if A is

in triangular form. The goal of Gaussian elimination is to manipulate[A b

]

into triangular form via the following elementary row operations:

1. Swap two rows.2. Multiply a row by a nonzero scalar.3. Add one row to another.

The last two operations are often combined to yield the composite operation:add the scaling of one or more rows (by nonzero scalars) to a row.

Once an augmented matrix is in triangular form, solving the equations issimple. Solve the equation of the final row, which involves only one variable.Then substitute the solution into the other rows, yielding another equationwith one variable. Continue until a solution is obtained for each variable.

Example 8.2. Solve

3 1 21 0 12 2 1

x1

x2

x3

=

612

.

Construct the augmented matrix

3 1 2 61 0 1 12 2 1 2

.

Apply the row operations as follows:

1. Add −2 times the first row and 4 times the second row to the third row:

3 1 2 61 0 1 10 0 1 −6

2. Add −1 times the first row and 2 times the second row to the second row:

3 1 2 60 −1 1 −30 0 1 −6

This augmented matrix is in triangular form.


Now solve the final row, representing equation x3 = −6, for x3, yieldingx3 = −6. Substituting into the second equation yields −x2 − 6 = −3, orx2 = −3. Substituting the solutions for x2 and x3 into the first equation

yields 3x1 − 3− 12 = 6, or x1 = 7. Hence, the solution is x =[7 −3 −6

]T.

Gaussian elimination can also be applied when A is not a square matrix.Rather than achieving a triangular form, the goal is to achieve echelon form,in which the first nonzero element of each row is to the right of those aboveit. In this case, there are multiple solutions to the equation.

Example 8.3. Suppose that an equation over variables x1, x2, x3, x4 reducesto the following echelon form:

3 1 2 0 60 −1 1 −1 00 0 0 2 −6

From the last row, x4 = −3. We cannot solve for x3 because there is not a rowin which the x3 column has the first non-zero element; therefore, x3 can takeon any value. To solve the second row, −x2 + x3 − x4 = 0, for x2, replace x4

with its value −3 and let x3 be any value: −x2 +x3 +3 = 0. Then x2 = 3+x3.Substituting for x2 in the first equation, solve 3x1 +(3+x3)+2x3 = 6 for x1:x1 = 1− x3. Solutions thus lie on the line described by

1− x3

3 + x3

x3

−3

for any value of x3.

A square matrix A is nonsingular or invertible if its inverse A−1 suchthat AA−1 = A−1A = I exists. We can also define nonsingularity in terms ofa matrix’s null space, denoted null(A), which is the set of points v such thatAv = 0. A matrix is nonsingular iff its null space has dimension 0.

For intuition, view square matrix A as a function that maps one point uto another v = Au. Suppose that A is not nonsingular so that null(A) hasdimension greater than 0: A sends more than one point to 0. In this case, onecannot construct an inverse function A−1, as it would have to map 0 backto more than one point. It turns out that it is sufficient to consider only thepoints that A maps to 0: if A sends only 0 to 0, then the inverse A−1 exists.In other words, when A maps only 0 to 0, then it is a 1-to-1 map so that itsinverse exists.

A’s inverse can be computed (on paper) with Gaussian elimination if itexists. Construct the augmented n × 2n-matrix

[A I

], and apply the ele-

mentary row operations to it until the left half is the identity matrix. The righthalf is then the inverse. To find just the kth column of A−1, solve Ay = ek by

8.3 Linear Programs 213

Gaussian elimination for y, rather than computing all of A−1 and extractingthe kth column.

Example 8.4. To find the second column of the inverse of

A =

3 1 21 0 12 2 1

,

solve

3 1 21 0 12 2 1

︸︷︷︸A

x1

x2

x3

=

010

︸︷︷︸e2

.

Construct the augmented matrix

3 1 2 01 0 1 12 2 1 0

Apply the same row operations as in Example 8.2.

1. Add −2 times the first row and 4 times the second row to the third row:

3 1 2 01 0 1 10 0 1 4

2. Add −1 times the first row and 2 times the second row to the second row:

3 1 2 00 −1 1 30 0 1 4

This augmented matrix is in triangular form.

Solving the resulting equations in reverse yields x1 = −3, x2 = 1, and x3 = 4.

Verify that[−3 1 4

]Tis indeed the second column of A−1.

8.3 Linear Programs

Linear Inequalities

The linear inequality

G : Ax ≤ b ,


for m×n-matrix A, variable n-vector x, and m-vector b, compactly representsthe ΣQ-formula

G :

m∧

i=1

ai1x1 + · · ·+ ainxn ≤ bi .

The subset of Qn (or Rn) that this inequality describes is called a poly-hedron. Each member of this subset corresponds to one satisfying TQ-interpretation of G. Exercises 8.3, 8.4, and 8.5 explore polyhedra in depth.

One important characteristic of the space defined by Ax ≤ b is that it isconvex. An n-dimensional space S ⊆ Rn is convex if for all pairs of pointsv1, v2 ∈ S,

λv1 + (1− λ)v2 ∈ S for λ ∈ [0, 1] .

Ax ≤ b defines a convex space. For suppose Av1 ≤ b and Av2 ≤ b; then also

λAv1 ≤ λb and (1− λ)Av2 ≤ (1− λ)b

for λ ∈ [0, 1]. Summing each side of the inequalities yields

λAv1 + (1− λ)Av2 ≤ λb + (1− λ)b

A(λv1 + (1 − λ)v2) ≤ b

as desired.Consider when the m× n-matrix A is such that m ≥ n. An n-vector v is

a vertex of Ax ≤ b if there is a nonsingular n × n-submatrix A0 of A andcorresponding n-subvector b0 of b such that A0v = b0. The rows in A0 and b0

are the set of defining constraints of the vertex v. In general, v may havemultiple sets of defining constraints. Two vertices are adjacent if they havedefining constraints that differ in only one constraint.

Intuitively, a vertex is an extremal point, such as the three vertices ofa triangle. Adjacent vertices are connected by an edge of the space definedby Ax ≤ b; for example, each pair of vertices is adjacent in a triangle andconnected by their common edge.

Linear Programs

The linear optimization problem, or linear program,

max cTxsubject to

Ax ≤ b

is solved by a point v∗ that satisfies the constraints Ax ≤ b and that max-imizes the objective function cTx. That is, Av∗ ≤ b, and cTv∗ is maximal:cTv∗ ≥ cTu for all u satisfying Au ≤ b. If Ax ≤ b is unsatisfiable, the maximumis −∞ by convention. It is also possible that the maximum is unbounded, inwhich case the maximum is ∞ by convention.


Example 8.5. Consider the following linear program:

max[1 1 −1 −1

]︸︷︷︸

cT

xy

z1

z2

︸︷︷︸x

subject to

−1 0 0 00 −1 0 00 0 −1 00 0 0 −11 1 0 01 0 −1 00 1 0 −1

︸︷︷︸A

xy

z1

z2

︸︷︷︸x

≤

0000322

︸︷︷︸b

A is a 7 × 4-matrix, b is a 7-vector, and x is a variable 4-vector representingthe variables x, y, z1, z2. The objective function is

(x− z1) + (y − z2) .

The constraints are equivalent to the ΣQ-formula

x ≥ 0 ∧ y ≥ 0 ∧ z1 ≥ 0 ∧ z2 ≥ 0∧ x + y ≤ 3 ∧ x− z1 ≤ 2 ∧ y − z2 ≤ 2 .

One vertex of the constraints is v =[2 1 0 0

]T. Why is it a vertex? Consider

the submatrix A0 of A consisting of rows 3, 4, 5, and 6; and the subvector b0

of b consisting of the same rows. A0 is invertible. Additionally, A0v = b0:

0 0 −1 00 0 0 −11 1 0 01 0 −1 0

︸︷︷︸A0

2100

︸︷︷︸v

=

0032

︸︷︷︸b0

.

Rows 3-6 are equationally satisfied in this case. Constraints 3, 4, 5, and 6comprise the defining constraints of v.

Another vertex is simply[0 0 0 0

]T, for which the first four constraints

are the defining constraints.

The following theorem is fundamental for solving linear programs. It as-serts that the maximum value achieved by the objective function cTx overx satisfying Ax ≤ b is equal to the minimum value achieved over the dualoptimization problem.


•

Ax ≤ b cTx ≤ δ

δ−

δ+

Fig. 8.1. Choosing δ

Theorem 8.6 (Duality Theorem of Linear Programming). ConsiderA ∈ Zm×n, b ∈ Zm, and c ∈ Zn. Then

maxcTx : Ax ≤ b = minyTb : y ≥ 0 ∧ yTA = cT

if Ax ≤ b is satisfiable.

By convention, when Ax ≤ b is unsatisfiable, the maximum of the primalproblem is −∞, and the minimum of the dual form is ∞.

The left and right optimization problems of the theorem are the primaland dual forms of the same problem. The theorem states that maximizingthe function cTx over Ax ≤ b is equivalent to minimizing the function yTbover all nonnegative y such that yTA = cT.

The theorem is actually surprisingly easy to visualize. In Figure 8.1, theregion labeled Ax ≤ b satisfies the inequality. The objective function cTx isrepresented by the dashed line. Its value increases in the direction of the arrowlabeled δ+ and decreases in the direction of the arrow labeled δ−. Now, ratherthan asking for the maximum value of cTx over x satisfying Ax ≤ b, let usinstead seek the minimal value δ such that

Ax ≤ b ⇒ cTx ≤ δ .

In words, we seek the minimal δ such that Ax ≤ b implies cTx ≤ δ or —visually — the region defined by cTx ≤ δ just covers the region defined byAx ≤ b. Moving the dashed line of Figure 8.1 in the directions given by thearrows, we see that the best δ makes the dashed line just touch the Ax ≤ bregion (and at that “touched” point, cTx achieves its maximum, which is δ).Decreasing δ (δ− direction) causes the implication to fail; increasing δ (δ+

direction) causes the region defined by cTx ≤ δ to be unnecessarily large.

Therefore, we seek the right δ. Now consider multiplying the rows of Ax ≤ bby nonnegative rationals and then summing the multiplied rows together. Theresulting inequality is satisfied by any x satisfying Ax ≤ b; in other words, itis implied by Ax ≤ b. Mathematically, consider any nonnegative vector y ≥ 0;then

Ax ≤ b ⇒ yTAx ≤ yTb .

Hence, to prove

Ax ≤ b ⇒ cTx ≤ δ

for a fixed δ, find y ≥ 0 such that

yTA = cT and yTb = δ .

That is,

Ax ≤ b ⇒ cT

︸︷︷︸yTA

x ≤ δ︸︷︷︸yTb

.

But we want to find a minimal δ such that the implication holds, not justprove it for a fixed δ. Thus, choose y ≥ 0 such that yTA = cT and thatminimizes yTb. This equivalence between maximizing cTx and minimizing yTbis the one claimed by Theorem 8.6.

We refer the reader in Bibliographic Remarks to texts that contain theproof of this theorem.

TQ-Satisfiability

Consider a generic ΣQ-formula

F :m∧

i=1

ai1x1 + · · ·+ ainxn ≤ bi

∧ℓ∧

i=1

αi1x1 + · · ·+ αinxn < βi

with both weak and strict inequalities. Equalities can be written as two in-equalities. F is TQ-equivalent to the ΣQ-formula

F ′ :m∧

i=1


∧ℓ∧

i=1

αi1x1 + · · ·+ αinxn + xn+1 ≤ βi

∧ xn+1 > 0


with only weak inequalities except for xn+1 > 0. To decide the TQ-satisfiabilityof F ′, and thus of F , pose and solve the following linear program:

max xn+1

subject tom∧

i=1


ℓ∧

i=1

αi1x1 + · · ·+ αinxn + xn+1 ≤ βi

F ′ is TQ-satisfiable iff the optimum is positive.When F does not contain any strict inequality literals, the optimization

problem is just a satisfiability problem because xn+1 is not introduced. Thissituation corresponds to a linear program with a constant objective value:

max 1subject to

m∧

i=1


According to convention, the optimum is −∞ iff the constraints are TQ-unsatisfiable and 1 otherwise. This form of linear program is sometimes calleda linear feasibility problem.

8.4 The Simplex Method

Consider the generic linear program

M : max cTxsubject to

G : Ax ≤ b

The simplex method solves the linear program in two main steps. In thefirst step, it obtains an initial vertex v1 of Ax ≤ b. In the second step, ititeratively traverses the vertices of Ax ≤ b, beginning at v1, in search of thevertex that maximizes the objective function. On each iteration of the secondstep, it determines if the current vertex vi has a greater objective value thanthe vertices adjacent to vi. If not, it moves to one of the adjacent vertices witha greater objective value. If so, it halts and reports vi as the optimum pointwith value cTvi.

vi is a local optimum since its adjacent vertices have lesser objectivevalues. But because the space defined by Ax ≤ b is convex, vi is also theglobal optimum: it is the highest value attained by any point that satisfiesthe constraints.

8.4 The Simplex Method 219

How does the simplex method find the initial vertex v1? In the first step,it constructs a new linear program

M0 : max c′Tx

subject to

G0 : A′x ≤ b′

based only on G : Ax ≤ b and for which, by construction, it has an initialvertex v′1. Moreover, by construction, M0 achieves a certain optimal value vG

iff G is satisfiable; and if this optimal value is achieved, the point that achievesit is a vertex v1 of G. M0 is solved by running the second step of the simplexmethod on M0 itself, initialized with the known vertex v′

1.If the objective function of M is constant, then solving M0 is the main

step of the algorithm. G is satisfiable iff the optimum of M0 is vG. The secondstep is not applied to M in this case.

We now discuss the details of the simplex method.

8.4.1 From M to M0

To find the initial vertex of M , the simplex method constructs and solves anew linear program M0. To that end, reformulate the constraints G : Ax ≤ bof M so that they have the form

x ≥ 0, Ax ≤ b

for new matrix A and vectors x and b. A general technique is to introducetwo nonnegative variables x1, x2 for each variable x and then to replace eachinstance of x with x1−x2. Then we obtain a constraint system of the desiredform that is TQ-equisatisfiable to G.

Next, separate the new constraints Ax ≤ b into two sets of inequalities

D1x ≤ g1 and D2x ≥ g2 for g1 ≥ 0, g2 > 0

according to the signs of the bi. That is, if bi is nonnegative, make row i ofAx ≤ b a part of D1x ≤ g1; otherwise, multiply row i of Ax ≤ b by −1 andmake the result a part of D2x ≥ g2.

Pose the following optimization problem:

M0 : max 1T(D2x− z)

subject tox, z ≥ 0 (1)D1x ≤ g1 (2)

D2x− z ≤ g2 (3)

(8.1)

The variable vector z has as many rows as D2. The objective function is thesum of the components of the vector D2x− z.

M0 has several interesting characteristics:


1. The point x = 0, z = 0 satisfies constraints (1)-(3). It is a vertex.• It satisfies constraints (2) and (3), for

D10 = 0 ≤ g1 and D20− 0 = 0 ≤ g2 .

The inequalities hold since g1 ≥ 0 and g2 > 0 according to our con-struction of g1 and g2.

• It satisfies constraints (1). Indeed, constraints (1) are its defining con-straints.

2. The optimum equals vG = 1Tg2 iff G is TQ-satisfiable.

• If the optimal value is 1Tg2 at optimal point x∗, z∗, then x∗ satisfies

G. For

D1x∗ ≤ g1

by (2); and

D2x∗ − z∗ = g2

by constraint (3) of M0 and that the optimal value is 1Tg2, so

D2x∗ = g2 + z∗ ≥ g2

by (1). By construction of D1, D2, g1, and g2, we thus know thatAx∗ ≤ b, so G is satisfiable.

• If x∗ satisfies G, then the optimal value is at least 1Tg2. We have

D1x∗ ≤ g1 , D2x

∗ ≥ g2 , and x∗ ≥ 0 .

Choose z∗ = D2x∗ − g2. Then z∗ ≥ 0 and D2x

∗ − z∗ = g2, so that all

constraints of M0 are satisfied. Additionally, 1T(D2x

∗ − z∗) = 1Tg2,

so that the maximum of M0 is at least 1Tg2. Constraint (3) limits the

maximum to 1Tg2.

Because the optimum equals 1Tg2 iff G is TQ-satisfiable, we need only solve

the optimization problem M0 to determine the satisfiability of G and to finda vertex of G if it is satisfiable. We already have a satisfying vertex x = 0,z = 0 of M0 by characteristic 1.


F : x + y ≥ 1 ∧ x− y ≥ −1 .

Because F has only weak inequality literals, the corresponding linear programhas a constant objective function:

M : max 1subject to

x + y ≥ 1x− y ≥ −1

G

Solving the corresponding linear program M0 is sufficient to determine theTQ-satisfiability of F .

To convert F to the form x ≥ 0 ∧ Ax ≤ b, introduce nonnegative x1, x2

for x and y1, y2 for y:

F ′ : (x1 − x2) + (y1 − y2) ≥ 1 ∧ (x1 − x2)− (y1 − y2) ≥ −1∧ x1, x2, y1, y2 ≥ 0

F is TQ-equisatisfiable to F ′. In matrix form, the first two literals of F ′ are

[−1 1 −1 1−1 1 1 −1

]

︸︷︷︸A

x1

x2

y1

y2

≤

[−1

1

]

︸︷︷︸b

Since b1 < 0 and b2 > 0, separating constraints yields

[−1 1 1 −1

]︸︷︷︸

D1

x1

x2

y1

y2

≤ [1]︸︷︷︸

g1

and[1 −1 1 −1

]︸︷︷︸

D2

x1

x2

y1

y2

≥ [1]︸︷︷︸

g2

.

D2 has only one row, so z = [z]. According to (8.1), pose the following opti-mization problem:

max[1 −1 1 −1

]

x1

x2

y1

y2

− [z]

subject tox1, x2, y1, y2, z ≥ 0

[−1 1 1 −1

]

x1

x2

y1

y2

≤ [1]

[1 −1 1 −1

]

x1

x2

y1

y2

− [z] ≤ [1]

F is TQ-satisfiable iff the optimum is vG = 1Tg2 = 1.

We know that the point[x1 x2 y1 y2 z

]T=[0 0 0 0 0

]Tis a vertex. It

satisfies all constraints and has defining constraints x1, x2, y1, y2, z ≥ 0.

The simplex method requires M0 to be in standard form


max cTysubject to

Ay ≤ b

Let

y =

[xz

].

Since the objective function of M0 is

1T(D2x− z) = 1

T [D2 −I

] [xz

]

︸︷︷︸y

,

construct vector c as follows:

cT = 1T [

D2 −I]

.

The constraints have the form

−I−I

D1

D2 −I

[xz

]

︸︷︷︸y

≤

00

g1

g2

,

where blank regions of the matrix are filled with 0s. Hence, M0 of (8.1) instandard form is written

M0 : max 1T [

D2 −I]

︸︷︷︸cT

[xz

]

︸︷︷︸y

subject to

−I−I

D1

D2 −I

︸︷︷︸A

[xz

]

︸︷︷︸y

≤

00

g1

g2

︸︷︷︸b

(8.2)

Example 8.8. According to (8.2), rewrite the optimization problem of Ex-ample 8.7 as follows:


max[1 −1 1 −1 −1

]︸︷︷︸

cT

x1

x2

y1

y2

z

subject to

−1 0 0 0 00 −1 0 0 00 0 −1 0 00 0 0 −1 00 0 0 0 −1−1 1 1 −1 0

1 −1 1 −1 −1

︸︷︷︸A

x1

x2

y1

y2

z

≤

0000011

︸︷︷︸b

8.4.2 Vertex Traversal

Assume that we have a generic optimization problem

max cTxsubject to

Ax ≤ b

(8.3)

for which we have a satisfying vertex vi. The simplex method traverses verticesof Ax ≤ b to find the vertex v∗ that maximizes cTx. In particular, one iterationof this step seeks a vertex vi+1 adjacent to vi such that cTvi+1 > cTvi.

For the optimization problem M0 of (8.2), the point x = 0, z = 0, whichsatisfies the constraints of M0, is the initial vertex v1.

Example 8.9. The point v1 =[x1 x2 y1 y2 z

]T=[0 0 0 0 0

]Tis a vertex of

the constraints of the optimization problem of Example 8.8.

Construction of u

To begin the ith iteration, we use the vertex vi to construct a vector u suchthat uTA = cT. If u ≥ 0 then the Duality Theorem (Theorem 8.6) impliesthat vi is optimal, as we discuss below, and the process terminates. However,in all but the final iteration, u 6≥ 0: at least one row of u is negative.

To construct u, choose one of its sets of defining constraints: choose a n×nnonsingular submatrix Ai of A with corresponding rows bi such that

Aivi = bi . (8.4)

Let R be the indices of the rows of A in Ai. Such a subset of constraints existsbecause vi is a vertex.


cTx

•v1

x1

x2

Fig. 8.2. Illustration of Example 8.10

Second, solve uTA = cT for u: solve

AiTui = c (8.5)

for ui (which has a solution because Ai is nonsingular), and let u be ui forindices in R with 0s added for rows with indices not in R. Thus,

uTA = cT . (8.6)

Example 8.10. Consider the optimization problem of the form (8.3)

max [−1 1]︸︷︷︸cT

x

subject to

−1 0

0 −12 1

︸︷︷︸A

x ≤

002

︸︷︷︸b

for which we know that v1 =[0 0]T

is a vertex.The problem and initial vertex is visualized in Figure 8.2. The solid lines

represent the constraints of the problem, and the set of satisfying points cor-responds to the interior of the triangle. The dashed line indicates cTx; thearrow points in the direction of increasing value.

Given vertex v1 =[0 0]T

, the first two constraints are the defining con-straints of v1, so choose R = [1; 2]:

A1 =

[−1 0

0 −1

]and b1 =

[00

].

From (8.5), solving


[−1 0

0 −1

]

︸︷︷︸A1

T

u1 =

[−1

1

]

︸︷︷︸c

for u1 (via, for example, Gaussian elimination; or by observing that A1T = −I

and that −Iu1 = c implies that u1 = −c) yields u1 =[1 −1

]T. Adding 0s for

rows not in R produces

u =[1 −1 0

]T,

where the first two elements are from u1. Check that this u satisfies uTA = cT

of (8.6) as desired.

Example 8.11. Continuing from Examples 8.8 and 8.9, choose the first fiverows of A and b (R = [1; 2; 3; 4; 5]) since

−1 0 0 0 00 −1 0 0 00 0 −1 0 00 0 0 −1 00 0 0 0 −1

︸︷︷︸A1

00000

︸︷︷︸v1

=

00000

︸︷︷︸b1

as desired from (8.4). From (8.5), solving

−1 0 0 0 00 −1 0 0 00 0 −1 0 00 0 0 −1 00 0 0 0 −1

︸︷︷︸A1

T

u1 =

1−1

1−1−1

︸︷︷︸c

yields

u1T =

[−1 1 −1 1 1

].

Then

u =[−1 1 −1 1 1 0 0

]T,

where the first five elements are from u1. Check that this u satisfies uTA = cT

of (8.6) as desired.

There are two cases to consider: either u ≥ 0 or u 6≥ 0.

Case 1: u ≥ 0

We prove that in this case, vi is actually the optimal point with optimal valuecTvi. The crux of the argument is the Duality Theorem (Theorem 8.6).

From equation (8.6), we have

cTvi = uTAvi .

We claim next that

uTAvi = uTb . (8.7)

First, from (8.4)

Aivi = bi implies uiTAivi = ui

Tbi

so that equation (8.7) holds at rows R. For rows j 6∈ R, we know that uj = 0by construction, so that both

(uTAvi)j = 0 and (uTb)j = 0 ,

proving equation (8.7). Reasoning further,

uTb ≥ minyTb : y ≥ 0 ∧ yTA = cTsince u is a member of the set by (8.6) and the case u ≥ 0. By duality (Theorem8.6),

minyTb : y ≥ 0 ∧ yTA = cT = maxcTx : Ax ≤ b .

In summary, we have by (8.6), (8.7), and Theorem 8.6,

cTvi = uTAvi = uTb ≥minyTb : y ≥ 0 ∧ yTA = cT = maxcTx : Ax ≤ b ,

which proves that vi is actually the optimal point with optimal value cTvi.Figure 8.3 illustrates this case: the vertex vi maximizes cTx. The dashed

line illustrates the objective function cTx. Moving upward relative to it in-creases its value. In this illustration, cTx cannot be increased without leavingthe region defined by the constraints.

Case 2: u 6≥ 0

In this case, vi is not the optimal point. Thus, we need to move along anedge y to an adjacent vertex to increase the value of the objective function.In moving to an adjacent vertex, we swap one of the defining constraints ofvi for another constraint to form the defining constraints of vi+1.

In this second case, there exists some uk < 0. Let k be the lowest indexof u such that uk < 0 (it must be one of the indices of R since for all otherindices ℓ, uℓ = 0). Let k′ be the index of the row of ui and Ai correspondingto row k of u and Ai.


•vi

Ax ≤ b cTx

Fig. 8.3. Case 1

Construction of y

Having fixed the indices k (row k of A and b) and k′ (the corresponding rowk′ of Ai and bi) of the offending constraint, we seek a direction along whichto travel away from vertex vi and, in particular, away from the plane thatdefines the kth constraint.

Define y to be the k′th column of −A−1i . To find y, solve

Aiy = −ek′ (8.8)

(where, recall, ek′ is the k′th unit vector, which consists of 0s except in positionk′) for y. Thus,

aℓy = 0 for every row aℓ of Ai except the k′th row

and

ak′y = −1 for the k′th row ak′ of Ai .

The vector y provides the direction along which to move to the next vertex.For all rows aℓ but the k′th row of Ai, aℓy = 0 implies that moving in directiony stays on the boundary of the constraints (i.e., both aℓvi = b and, in the nextstep, aℓvi+1 = b), so aℓ will be a row in Ai+1. However, moving along y movesinward from the boundary of the k′th constraint because ak′y = −1. Thischange is desirable, as this constraint is keeping u from being nonnegative.

Example 8.12. Let us examine

u1 =[1 −1

]Tand u =

[1 −1 0

]T

of Example 8.10. Since the second row of u is −1, we are in Case 2 with k = 2,corresponding to row k′ = 2 of u1. Let y be the second column of −A−1

1 : solve[−1 0

0 −1

]

︸︷︷︸A1

y =

[0−1

]

︸︷︷︸−e2


cTx

y

•v1

• v2

x1

x2

Fig. 8.4. Illustration of Example 8.12

for y, yielding y =[0 1]T

.This y is visualized in Figure 8.4 by the dark solid arrow that points up

from v1. The vertical and horizontal lines are the defining constraints of v1; inmoving in the direction y, we keep the vertical constraint for the next vertexv2 but drop the horizontal constraint. The diagonal constraint will becomethe second of v2’s defining constraints.

Example 8.13. Let us examine

u1 =[−1 1 −1 1 1

]Tand u =

[−1 1 −1 1 1 0 0

]T

of Example 8.11. Since the first row of u is −1, k = 1, corresponding to rowk′ = 1 of u1. Thus, solve

−1 0 0 0 00 −1 0 0 00 0 −1 0 00 0 0 −1 00 0 0 0 −1

︸︷︷︸A1

y =

−10000

︸︷︷︸−e1

for y, yielding y =[1 0 0 0 0

]T.

Again we have two cases to consider: either the optimum is bounded (Case2(a)) or it is unbounded (Case 2(b)).

Case 2(a): Optimum is Bounded

In this case, we move along the edge y to a better vertex vi+1, according tothe objective function cTx. However, there is a set of rows of A with indices


S such that for ℓ ∈ S, aℓy > 0. For these constraints, moving in the directiony actually moves toward leaving the satisfying region. These constraints limithow far in direction y we can move. For example, in Figure 8.4, the diagonalconstraint limits how far we can move in direction y.

Construction of λi, vi+1

We want to solve for the greatest λi ≥ 0 such that

A(vi + λiy) ≤ b . (8.9)

Thus, choose λi > 0 such that for some row aℓ, ℓ ∈ S,

aℓ(vi + λiy) = bℓ (8.10)

and for all other rows m ∈ S \ ℓ,

am(vi + λiy) ≤ bm . (8.11)

Set

vi+1def= vi + λiy . (8.12)

Finally, we construct the defining constraints of vi+1 for the next iteration.Construct submatrix Ai+1 of A from Ai: replace row ak′ of Ai with row aℓ ofA. Choose the corresponding rows of b for bi+1. Row ℓ and the rows carriedover from this iteration comprise the defining constraints of vi+1.

Example 8.14. Continuing from Example 8.12, choose λ1 such that

A(v1 + λ1y) ≤ b ,

specifically

−1 0

0 −12 1

([

00

]+ λ1

[01

])≤

002

,

and one constraint is equationally satisfied. Thus, choose λ1 = 2.Notice that the first and third constraints are equationally satisfied. The

first row is already included in R; hence, focus on the third row (ℓ = 3). From(8.12), define vi+1 for the next iteration:

v2 = v1 + λ1y =

[00

]+ 2

[01

]=

[02

].

This vertex is visualized in Figure 8.4. Choosing R = [1; 3] and replacing thesecond row of A1 and b1 with the third row of Ax ≤ b yields


A2 =

[−1 0

2 1

]and b2 =

[02

].

In Figure 8.4, the rows of A2 and b2 correspond to the vertical and diagonalconstraints, respectively, which are the defining constraints of v2.

In the next iteration, solving A2Tu2 = c yields u2 =

[3 1]T

. Adding 0s forrows not in R produces

u =[3 0 1

]T.

Since u ≥ 0, we are in Case 1. The maximum is

cTv2 =[−1 1

] [ 02

]= 2

at vertex v2.

This move from vertex vi to vertex vi+1 makes progress. For

cTvi+1 = cT(vi + λiy)

= cTvi + λicTy

= cTvi + λi(−uk)

> cTvi .

When considering line 3, recall from (8.5) of the construction of u that

uiT = cTA−1

i

and that the k′th column of A−1i is −y by (8.8). Hence, cTy = −uk.

Moreover, since there are only a finite number of vertices to examine, Case1 (or, in the general case, possibly Case 2(b)) eventually occurs.

Figure 8.5(a) illustrates this case: vertex vi+1 is discovered by movingalong ray y as far as possible without violating the constraints. MoreovercTvi+1 is greater than cTvi.

It is possible that λi = 0 when the vertex vi has multiple sets of definingconstraints. However, because there are only a finite number of constraints,the number of defining constraint sets is also finite. Preventing repetitions ofthe matrix Ai across iterations guarantees that the algorithm still halts in thiscase.

Case 2(b): Optimum is Unbounded

In this case, Ay ≤ 0 and the optimum is unbounded. Since Ay ≤ 0, for allλ ≥ 0,

A(vi + λy) = Avi︸︷︷︸≤b

+λ Ay︸︷︷︸≤0

≤ b ,

y

•vi

•vi+1

Ax ≤ b cTx

y

•vi

Ax ≤ b cTx

(a) (b)

Fig. 8.5. Case 2

so all vi + λy satisfy the constraints. In other words, moving in direction yeither moves along the boundary of a constraint (ay = 0) or moves inward(ay < 0). Moreover,

cT(vi + λy) = cTvi + λcTy

= cTvi + λuTAy

= cTvi +−λ uk︸︷︷︸<0

,

which grows with λ (recall cT = uTA, and Ay =[0 · · · 0 −1 0 · · · 0

]Tby

selection of y). Thus, the maximum is unbounded. Figure 8.5(b) illustratesthis case: all points along the ray labeled y satisfy the constraints, and movingalong the ray increases cTx without bound.

Example 8.15. Continuing with y =[1 0 0 0 0

]Tfrom Example 8.13, com-

pute Ay

−1 0 0 0 00 −1 0 0 00 0 −1 0 00 0 0 −1 00 0 0 0 −1−1 1 1 −1 0

1 −1 1 −1 −1

︸︷︷︸A

10000

︸︷︷︸y

=

−10000−1

1

to find that S = [7] since a7y = 1 > 0. Thus, according to (8.10), examine theseventh row of the constraints and choose the greatest λ1 such that

[1 −1 1 −1 −1

]︸︷︷︸

a7

(v1 + λ1y) =[1 −1 1 −1 −1

]

00000

+ λ1

10000

= 1 ;

that is, choose λ1 = 1. By (8.12), define

v2 = v1 + λ1y =[1 0 0 0 0

]T.

Form A2 by replacing the first row (k′ = 1) of A1 with the seventh row (ℓ = 7)of A:

A2 =

1 −1 1 −1 −10 −1 0 0 00 0 −1 0 00 0 0 −1 00 0 0 0 −1

.

This move to vertex v2 makes progress:

cTv1 = 0 < cTv2 = 1 .

Now R = [7; 2; 3; 4; 5]; that is, A2 is the submatrix of A constructed from

rows 7, 2, 3, 4, and 5. Solve A2Tu2 = c for u2, yielding u2 = [1 0 0 0 0]

T. Since

u2 ≥ 0, we are in Case 1: we have found an optimum point v2 with optimalvalue 1.

Recall from Example 8.7 that vG = 1Tg2 = 1. The equality of the optimum

and vG implies that

F : x + y ≥ 1 ∧ x− y ≥ −1

is TQ-satisfiable. In particular, extract from

x1

x2

y1

y2

z

= v2 =

10000

the assignment

x = x1 − x2 = 1− 0 = 1 and y = y1 − y2 = 0− 0 = 0 ,

which indeed satisfies F .


F : x ≥ 0 ∧ y ≥ 0 ∧ x ≥ 2 ∧ y ≥ 2 ∧ x + y ≤ 3 ,


or, in matrix form,

F :

−1 00 −1−1 0

0 −11 1

[xy

]≤

00−2−2

3

.

Is F TQ-satisfiable? Because F has only weak inequality literals, the corre-sponding linear program has a constant objective function:

M : max 1subject to

G :

−1 00 −1−1 0

0 −11 1

[xy

]≤

00−2−2

3

Solving the corresponding linear program M0 is sufficient to determine theTQ-satisfiability of F .

Construction of M0

Because x and y are already constrained to be nonnegative, we do not needto introduce new x1, x2, y1, y2. Rewrite the final three literals of F as two setsof constraints:

[1 1]︸︷︷︸D1

[xy

]≤ [3]︸︷︷︸

g1

and

[1 00 1

]

︸︷︷︸D2

[xy

]≥[

22

]

︸︷︷︸g2

so that g1 ≥ 0 and g2 > 0. Then pose the following optimization problem ofform (8.2):

M0 : max [1 1 − 1 − 1]︸︷︷︸cT

xy

z1

z2

subject to

−1 0 0 00 −1 0 00 0 −1 00 0 0 −11 1 0 01 0 −1 00 1 0 −1

︸︷︷︸A

xy

z1

z2

≤

0000322

︸︷︷︸b


in which c is computed according to

cTx = 1T [

D2 −I]

xy

z1

z2

= [1 1]

[1 0 −1 00 1 0 −1

]

xy

z1

z2

=[1 1 −1 −1

]︸︷︷︸

cT

xy

z1

z2

.

Use the initial vertex

v1 =

xy

z1

z2

=

0000

to start the vertex traversal.F is satisfiable iff the optimal value is equal to

vG = 1Tg2 = [1 1]

[22

]= 4 .

We apply the simplex method to find the optimum.

Iteration 1

Choose rows R = [1; 2; 3; 4] of A and b to form

A1 =

−1 0 0 00 −1 0 00 0 −1 00 0 0 −1

and b1 =

0000

.

This submatrix and subvector correspond to the defining constraints of vertexv1: A1v1 = b1.

Recall that by the Duality Theorem (Theorem 8.6), our goal is to find avector u such that u ≥ 0 and uTA = cT. We use A1 to solve for a u satisfyinguTA = cT and, through iteratively exploring a sequence of vertices, eventuallyfind a u that also satisfies u ≥ 0.

Therefore, we solve for u. By (8.5), solve A1Tu1 = c to yield u1 =[

−1 −1 1 1]T

. Adding 0s for the rows not in R produces u:

u =[−1 −1 1 1 0 0 0

]T,

which satisfies uTA = cT (8.6) as desired.

Since u1, u2 < 0, we are in Case 2 with k = k′ = 1. We have not yet founda u ≥ 0, so we must continue searching. In particular, constraints 1 and 2 arecausing a problem (as indicated by u1, u2 < 0), so we must find a direction yand distance λ1 to travel to a better vertex whose defining constraints do notinclude constraint 1. The direction is given by the first column of −A−1

1 : by

(8.8), solve A1y = −e1 to yield y =[1 0 0 0

]T.

It remains to find the distance λ1 to travel along y. Computing Ay showsthat S = [5; 6]: the fifth and sixth rows a of A are such that ay > 0. Thissituation means that traveling along y too far will result in making one orboth of constraints 5 and 6 unsatisfied. The other constraints will continue tobe satisfied independent of λ1.

Therefore, by (8.9), choose the largest λ1 such that

A(v1 + λ1y) ≤ b .

Focusing on the fifth and sixth rows of A (since S = [5; 6]), choose the largestλ1 such that

[1 1 0 01 0 −1 0

]

︸︷︷︸rows 5,6 of A

0000

︸︷︷︸v1

+ λ1

1000

︸︷︷︸y

≤[

32

]

︸︷︷︸rows 5,6 of b

according to (8.9). Namely, choose λ1 = 2 (and ℓ = 6).We now have a direction (y) and a distance (λ1) to travel away from v1.

By (8.12), define

v2 = v1 + λ1y =

0000

+ 2

1000

=

2000

.

For the next iteration, replace the first row of A1 (since k′ = 1) with the sixthrow of A (since ℓ = 6) to produce

A2 =

1 0 −1 00 −1 0 00 0 −1 00 0 0 −1

and b2 =

2000

.

Have we made progress by moving to vertex v2? Yes, for

cTv1 = 0 < 2 = cTv2 .

The objective function has increased from 0 to 2.

Iteration 2

Now R = [6; 2; 3; 4]. Solve A2Tu2 = c to yield u2 =

[1 −1 0 1

]Tfor rows

[6; 2; 3; 4]. Then filling in 0s for the other rows of A produces:

u = [0 −1 0 1 0 1 0]T

.row: 2 3 4 6

Note that u maintains the order of rows, while u2 is ordered according to R.u2 < 0, so k = 2, which corresponds to row k′ = 2 of u2. According to Case 2,let y be the second column of −A−1

2 : solve A2y = −e2 to yield y = [0 1 0 0]T.Then the fifth and seventh rows a of A are such that ay > 0 so that S = [5; 7].Focusing on the fifth and seventh rows of A, choose the largest λ2 such that

[1 1 0 00 1 0 −1

]

︸︷︷︸rows 5,7 of A

2000

︸︷︷︸v2

+ λ2

0100

︸︷︷︸b

≤[

32

]

︸︷︷︸rows 5,7 of b

.

Choose λ2 = 1 (and ℓ = 5). Then

v3 = v2 + λ2y =

2000

+ 1

0100

=

2100

.

Replace the second row of A2 (since k′ = 2) with the fifth row of A (sinceℓ = 5) to produce

A3 =

1 0 −1 01 1 0 00 0 −1 00 0 0 −1

and b3 =

2300

.

Have we made progress? Yes, for

cTv1 = 0 < cTv2 = 2 < cTv3 = 3 .

The objective function has increased from 2 to 3.

Iteration 3

Now R = [6; 5; 3; 4]. Solve A3Tu3 = c, yielding u3 =

[0 1 1 1

]T. Because

u3 ≥ 0, we are in Case 1: v3 is the optimum with objective value

8.5 Summary 237

cTv3 =[1 1 −1 −1

]

2100

= 3 .

The optimal value of the constructed optimization problem is 3, which isless than the required vG = 4. The constraints G of the linear program M areunsatisfiable, and thus F itself is TQ-unsatisfiable.

8.4.3 ⋆Complexity

Theorem 8.17. TQ-satisfiability of conjunctive quantifier-free ΣQ-formulaeis weakly polynomial-time decidable.

An algorithm is weakly polynomial-time if it runs in time polynomial inthe actual values of the input, not just the size of the input. BibliographicRemarks discusses algorithms that achieve this performance.

While efficient in practice, the simplex method actually has poor worst-case behavior for known fixed pivot rules. A pivot rule determines whichadjacent vertex to visit next. In our presentation, no rule is fixed: the nextvertex can be any of those corresponding to the negative entries in the uvector. However, for all known pivot rules, there exist classes of quantifier-freeΣQ-formulae on which the simplex method requires an exponential number ofsteps in the size of the formulae.

8.5 Summary

This chapter covers linear programming and the simplex method for solvinglinear programs. It covers:

• How decision procedures that reason only about conjunctive formulae ex-tend to arbitrary Boolean structure by converting to DNF. Exercise 8.1explores a more effective way of converting to DNF.

• A review of linear algebra.• Linear programs, which are optimization problems with linear constraints

and linear objective functions. Application to TQ-satisfiability.• The simplex method. Finding an initial point via a new linear program.

Greedy search along vertices.

The structure of the simplex method is markedly different from the struc-ture of the quantifier elimination procedures of Chapter 7. It focuses on thestructure of the set of interpretations of the given formula, rather than on theformula itself. Exercises 8.3, 8.4, and 8.5 explore these sets, which describepolyhedra.


In addition to applications in all of engineering, arithmetical reasoning isnecessary for analyzing program correctness. Of course, arithmetic usually ap-pears in the context of other data structures in software. Chapter 10 discussesa method for combining the arithmetic decision procedures of this and theprevious chapter with other decision procedures.


For a presentation of linear algebra, see [42]; for a comprehensive presentationof linear and integer programming, see [82].

Work on linear programming proceeded in parallel in the western hemi-sphere and the USSR. Dantzig is the inventor of the simplex method in thewestern hemisphere [22]. Earlier work by Kantorovich in the USSR also pro-poses the method [44]. Weakly polynomial-time algorithms were achieved byKhachian [49] in the USSR and Karmarkar [45] in the western hemisphere.

Exercises

8.1 (⋆Conjunctive quantifier-free formulae). Converting an arbitraryquantifier-free Σ-formula to DNF and then applying a decision procedure toeach disjunct can be prohibitively expensive. In practice, SAT solvers (decisionprocedures for propositional logic, such as DPLL) are used to extend a decisionprocedure for conjunctive quantifier-free Σ-formula to arbitrary quantifier-freeΣ-formula.

(a) Show that the DNF of a formula F can be exponentially larger than F .(b) Describe a procedure that, using a SAT solver, extracts conjunctive Σ-

formulae from a quantifier-free Σ-formula F . Using this procedure, eachdiscovered conjunctive formula’s T -satisfiability will be decided. If it isT -satisfiable, then F is T -satisfiable, so the procedure finishes; otherwise,the procedure finds another conjunctive formula.

(c) The proposed procedure is really no more efficient than simply convertingto DNF. This part explores an optimization. An unsatisfiable core of T -unsatisfiable conjunctive Σ-formula G is the conjunction H of a subset ofliterals of G such (1) H is also T -unsatisfiable, and (2) the conjunction ofeach strict subset of literals of H is T -satisfiable. Improve your procedurefrom the previous part to use a function UnsatCoreT (G) that returns aT -unsatisfiable core of a T -unsatisfiable conjunctive Σ-formula G.

(d) Given a decision procedure DPT for conjunctive quantifier-free Σ-formula,describe a procedure UnsatCore(G) for computing an unsatisfiable core ofG that takes no more than a number of DPT calls linear in the numberof literals of G. Note that G can have multiple unsatisfiable cores; yourprocedure need only return one.

Exercises 239

(e) Given a decision procedure DPT for conjunctive quantifier-free Σ-formula,describe a procedure UnsatCore(G) for computing an unsatisfiable core ofG that takes no more than O(m log n) calls of DPT , where m is the numberof literals of the returned unsatisfiable core and n is the number of literalsof G.

8.2 (DP for TQ). Apply the simplex method to decide the TQ-satisfiabilityof the following ΣQ-formulae.

(a) x ≥ 1 ∧ 2x ≤ 1(b) x + 2y ≥ 1 ∧ 2x + y ≥ 1 ∧ 2x + 2y ≤ 1(c) x + 2y ≥ 1 ∧ 2x + y ≥ 1 ∧ x + y ≤ 1

8.3 (⋆Polyhedra & dual representation). Ax ≤ b defines a subspace ofRn called a polyhedron; Ax ≤ b is the constraint representation of thisspace. A polyhedron can also be described by the union of a set of verticesV ⊆ Rn and a set of rays R ⊆ Rn; this representation is the vertex repre-sentation. While vertices v ∈ V describe points in Rn, rays r ∈ R describedirections in Rn. For example, as a vertex, [1 1]

T ∈ R2 describes the point oneunit right and one unit up from the origin on the plane. As a ray, it describesthe “northeast” direction.

Represent each vertex v ∈ V and ray r ∈ R by the (n + 1)-vectors[

x1

]and

[r0

],

respectively. Let P be the union of V and R represented in this way. Then forevery inequality Ax ≤ b, there exists a vertex-ray set P with k elements suchthat

Ax ≤ b iff ∃λ1, . . . , λk ≥ 0.

[x1

]=

k∑

i=1

λipi .

(a) Given an element

p =

[xǫ

]∈ P ,

we can evaluate Ax ≤ ǫb. If p is a vertex, it is equivalent to evaluatingAx ≤ b, as expected. If p is a ray, it is equivalent to evaluating Ax ≤ 0.Explain what the ray case means.

(b) Define the convex hull as follows:

hull(P ) =

x : ∃λ1, . . . , λk ≥ 0.

[x1

]=

k∑

i=1

λipi

.

Show that hull(P ) is convex. Conclude that if Ax ≤ ǫb for each [xT ǫ]T ∈ P ,

then hull(P ) ⊆ x : Ax ≤ b.


8.4 (⋆Set operations). The constraint representations of two polyhedra S1

and S2 are A1x ≤ b1 and A2x ≤ b2, respectively, and their vertex representa-tions are P1 and P2, respectively.

(a) Write the constraint representation of S1 ∩ S2 in terms of S1 and S2.(b) Write the vertex representation of hull(S1 ∪ S2) in terms of P1 and P2.(c) Describe how to compute whether S1 ⊆ S2. Hint: Use both representa-

tions.

8.5 (⋆Set operations with vertex representation). The vertex represen-tations of two polyhedra S1 and S2 are P1 and P2, respectively.

(a) Describe how to compute whether S1 ⊆ S2 using only their vertex repre-sentations. Hint: Use the simplex method.

(b) Given Ax ≤ b, a vertex enumeration algorithm lists the correspondingvertex representation. Using a vertex enumeration algorithm, describe howto compute the vertex representation of S1 ∩ S2 using only their vertexrepresentations.

9

Quantifier-Free Equality and Data Structures

Almost all proofs require reasoning about equalities.— Greg Nelson and Derek C. Oppen

Fast Decision Procedures Based on Congruence Closure, 1980

Equality is perhaps the most widely-used relation among data in pro-grams. In this chapter, we consider equality among variables, constants, andfunction applications (Section 9.1); among recursive data structures (records,lists, trees, and stacks) and their elements (Section 9.4); and among elementsof arrays (Section 9.5). For all three theories, we examine their quantifier-freefragments.

In the last two theories — recursive data structures and arrays — we distin-guish between equality among data structures and among their elements. Thequantifier-free fragment of the theory of recursive data structures can expressequality among whole structures, which reduces to element-wise equality. Incontrast, the quantifier-free fragment of the theory of arrays cannot expressequality among entire arrays. Chapter 11 presents larger fragments of varioustheories of arrays that can express equality among arrays and array segments,among other properties.

Equality among variables, constants, and function applications is expressedin the theory of equality TE. This theory is sometimes referred to as the theoryof equality with uninterpreted functions (EUF). Its signature containsall predicate and function symbols, yet its axioms do not interpret (assignmeaning to) the symbols other than in the context of equality. Specifically,the axioms of TE assert that a function symbol acts like a function: if two termst1 and t2 are equal, then f(t1) and f(t2) are also equal. Similarly, predicatesymbols act like predicates: p(t1) is equivalent to p(t2) when t1 is equal to t2.

Section 9.1 presents the theory TE and its quantifier-free fragment. ThenSection 9.2 describes at a high-level the congruence-closure algorithm todecide TE-satisfiability of quantifier-free ΣE-formulae. Section 9.3 details animplementation that employs an efficient representation of ΣE-formulae.

242 9 Quantifier-Free Equality and Data Structures

The congruence closure algorithm is the basis for the other decision proce-dures of this chapter as well. It is extended in Section 9.4 to decide satisfiabilityin the quantifier-free fragment of the theory of recursive data structures TRDS,and in particular in the theory of lists Tcons. Finally, it is applied in Section 9.5to decide satisfiability in the quantifier-free fragment of the theory of arraysTA.

The quantifier-free fragment of TE and its satisfiability decision procedureplay a central role in combining theories that share the equality predicate. Wediscuss the combination of theories in Chapter 10.

9.1 Theory of Equality

Recall from Chapter 3 that the signature of TE,

ΣE : =, a, b, c, . . . , f, g, h, . . . , p, q, r, . . . ,

consists of

• =, a binary predicate;• and all constant, function, and predicate symbols.

As in every other theory, ΣE-formulae are constructed from symbols of thesignature, variables, logical connectives, and quantifiers.

The equality predicate = is interpreted, or given meaning, via the axiomsof TE. The axioms

1. ∀x. x = x (reflexivity)2. ∀x, y. x = y → y = x (symmetry)3. ∀x, y, z. x = y ∧ y = z → x = z (transitivity)

define = to be an equivalence relation. These axioms give = the expectedmeaning of equality on pairs of variable terms.

However, they do not provide the full meaning for = in the context offunction terms, such as in f(x) = f(g(y, z)). The following axiom schemastands for an infinite but countable set of axioms:

4. for each positive integer n and n-ary function symbol f ,

∀x, y.

(n∧

i=1

xi = yi

)→ f(x) = f(y) (function congruence)

For example, two instances of this axiom schema are the following:

∀x, y. x = y → f(x) = f(y)

and

∀x1, x2, y1, y2. x1 = y1 ∧ x2 = y2 → g(x1, x2) = g(y1, y2) .

9.1 Theory of Equality 243

Then

x = g(y, z) → f(x) = f(g(y, z))

is TE-valid by the first instance. Alternately,

x = g(y, z) ∧ f(x) 6= f(g(y, z))

is TE-unsatisfiable, where t1 6= t2 abbreviates ¬(t1 = t2). This axiom schemamakes = a congruence relation.

Finally, observe that the logical operator ↔ should behave on predicateformulae similarly to the way = behaves on function terms. For example, ourintuition asserts that

x = y → (p(x)↔ p(y)) (9.1)

should be TE-valid. In Chapter 3, we list a fifth axiom schema:

5. for each positive integer n and n-ary predicate symbol p,

∀x, y.

(n∧

i=1

xi = yi

)→ (p(x)↔ p(y)) (predicate congruence)

Under this axiom schema, formula (9.1) is TE-valid.

Example 9.1. The ΣE-formula

f(x) = f(y) ∧ x 6= y

is TE-satisfiable: a function can evaluate unequal arguments to equal values.However,

x = y ∧ f(x) 6= f(y)

is TE-unsatisfiable: a function evaluates each value to one value, independentof representation. Here, x = y, so x and y represent the same value.

Finally,

F : f(f(f(a))) = a ∧ f(f(f(f(f(a))))) = a ∧ f(a) 6= a

is TE-unsatisfiable. We can make the following intuitive argument: substitutinga for f(f(f(a))) in f(f(f(f(f(a))))) = a by the first equality yields f(f(a)) =a; and substituting a for f(f(a)) in f(f(f(a))) = a according to this newequality yields f(a) = a, contradicting the literal f(a) 6= a.

Substitution, of course, rests on the axioms of TE: in particular, the firstsubstitution requires four steps:

1. f(f(f(f(a)))) = f(a) first literal of F , (function congruence)2. f(f(f(f(f(a))))) = f(f(a)) step 1, (function congruence)3. f(f(a)) = f(f(f(f(f(a))))) step 2, (symmetry)4. f(f(a)) = a step 3, second literal of F , (transitivity)

The decision procedure of this chapter reasons similarly.


The (predicate congruence) axiom schema needlessly complicates our dis-cussion. Instead, we describe by example a simple reduction of ΣE-formulaecontaining uninterpreted predicates to ΣE-formulae without predicates otherthan =. This transformation allows us to disregard the (predicate congruence)axiom schema. For example, given ΣE-formula

x = y → (p(x)↔ p(y)) ,

introduce fresh constant • and fresh function fp, and write

x = y → ((fp(x) = •) ↔ (fp(y) = •)) .

Similarly, transform

p(x) ∧ q(x, y) ∧ q(y, z) → ¬q(x, z)

into

fp(x) = • ∧ fq(x, y) = • ∧ fq(y, z) = • → fq(x, z) 6= • .

In the rest of this chapter, we consider ΣE-formulae without predicates otherthan =.

TE-satisfiability is undecidable since satisfiability in FOL is undecidable.However, satisfiability in the quantifier-free fragment of TE is decidable. Wefocus on this fragment in this chapter. Therefore, when we say “ΣE-formula,”we mean “quantifier-free ΣE-formula” unless otherwise stated.

Additionally, as in Chapter 8, we consider only quantifier-free ΣE-formulaethat are conjunctions of literals. Satisfiability of an arbitrary quantifier-freeΣE-formula is considered via conversion to DNF.

9.2 Congruence Closure Algorithm

Each positive literal s = t of a (conjunctive quantifier-free) ΣE-formula Fasserts an equality between two terms s and t. Applying symmetry, reflexivity,transitivity, and congruence to these equalities of F produces more equalitiesover terms occurring in F . Since there are only a finite number of terms in F ,only a finite number of equalities among these terms are possible. Hence, oneof two situations eventually occurs: either some equality is formed that directlycontradicts a negative literal s′ 6= t′ of F ; or the propagation of equalities endswithout finding a contradiction. These cases correspond to TE-unsatisfiabilityand TE-satisfiability, respectively, of F .

This section develops the basic concepts for describing the algorithm math-ematically as forming the congruence closure of the equality relation overterms asserted by F .

9.2 Congruence Closure Algorithm 245

9.2.1 Relations

Binary Relations

Consider a set S and a binary relation R over S. For two elements s1, s2 ∈ S,either s1Rs2 or ¬(s1Rs2).

The relation R is an equivalence relation if it is

• reflexive: ∀s ∈ S. sRs;• symmetric: ∀s1, s2 ∈ S. s1Rs2 → s2Rs1;• transitive: ∀s1, s2, s3 ∈ S. s1Rs2 ∧ s2Rs3 → s1Rs3.

It is a congruence relation if it additionally obeys congruence: for everyn-ary function f ,

∀s, t.(

n∧

i=1

siRti

)→ f(s)Rf(t) .

In this case, function evaluation of R-related terms yields R-related results.The way we think of equality in our everyday computations is as a congruencerelation.

Classes and Partitions

Consider an equivalence relation R over a set S. The equivalence class ofs ∈ S under R is the set

[s]Rdef= s′ ∈ S : sRs′ .

If R is a congruence relation over S, then [s]R is the congruence class of s.

Example 9.2. Consider the set Z of integers and the equivalence relation ≡2

such that

m ≡2 n iff (m mod 2) = (n mod 2) .

m, n ∈ Z are related iff they are both even or both odd. The equivalence classof 3 under ≡2 is

[3]≡2= n ∈ Z : (n mod 2) = (3 mod 2)= n ∈ Z : (n mod 2) = 1= n ∈ Z : n is odd .


A partition P of S is a set of subsets of S that is total,(⋃

S′∈P

S′

)= S ,

and disjoint,

∀S1, S2 ∈ P. S1 6= S2 → S1 ∩ S2 = ∅ .

The quotient S/R of S by the equivalence (congruence) relation R is a par-tition of S: it is a set of equivalence (congruence) classes

S/R = [s]R : s ∈ S .

Example 9.3. The quotient Z/ ≡2 is a partition: it is the set of equivalenceclasses

n ∈ Z : n is odd, n ∈ Z : n is even .

Just as an equivalence relation R induces a partition S/R of S, a givenpartition P of S induces an equivalence relation over S. Specifically, s1Rs2 ifffor some S′ ∈ P , both s1, s2 ∈ S′.

Relation Refinements

Consider two binary relations R1 and R2 over set S. R1 is a refinement ofR2, or R1 ≺ R2, if

∀s1, s2 ∈ S. s1R1s2 → s1R2s2 .

We also say that R1 refines R2. Viewing the relations as sets of pairs, R1 ⊆R2.

Example 9.4. For S = a, b, R1 : aR1b ≺ R2 : aR2b, bR2b.Viewing the relations as sets of pairs, R1 ≺ R2 iff R1 ⊆ R2.

Example 9.5. Consider set S, the relation

R1 : sR1s : s ∈ Sinduced by the partition

P1 : s : s ∈ S ,

and the relation

R2 : sR2t : s, t ∈ Sinduced by the partition

P2 : S .

Then R1 ≺ R2.


Example 9.6. Consider the set Z and the two relations

R1 : xR1y : x mod 2 = y mod 2 and R2 : xR2y : x mod 4 = y mod 4 .

Then R2 ≺ R1.

Closures

The equivalence closure RE of the binary relation R over S is the equiva-lence relation such that

• R refines RE : R ≺ RE ;• for all other equivalence relations R′ such that R ≺ R′, either R′ = RE or

RE ≺ R′.

That is, RE is the “smallest” equivalence relation that “covers” R.

Example 9.7. If S = a, b, c, d and

R = aRb, bRc, dRd ,

then

• aRb, bRc, dRd ∈ RE since R ⊆ RE ;• aRa, bRb, cRc ∈ RE by reflexivity;• bRa, cRb ∈ RE by symmetry;• aRc ∈ RE by transitivity;• cRa ∈ RE by symmetry.

Hence,

RE = aRb, bRa, aRa, bRb, bRc, cRb, cRc, aRc, cRa, dRd .

The congruence closure RC of R is the “smallest” congruence relationthat “covers” R. Shortly, we shall illustrate the congruence closure of a termset.

9.2.2 Congruence Closure Algorithm

Having defined several abstract concepts, we now return to the theory of equal-ity. The subterm set SF of ΣE-formula F is the set that contains preciselythe subterms of F .

Example 9.8. The subterm set of

F : f(a, b) = a ∧ f(f(a, b), b) 6= a

is

SF = a, b, f(a, b), f(f(a, b), b) .


Now we relate the congruence closure of a ΣE-formula’s subterm set withits TE-satisfiability. Given ΣE-formula F

F : s1 = t1 ∧ · · · ∧ sm = tm ∧ sm+1 6= tm+1 ∧ · · · ∧ sn 6= tn , (9.2)

with subterm set SF , F is TE-satisfiable iff there exists a congruence relation∼ over SF such that

• for each i ∈ 1, . . . , m, si ∼ ti;• for each i ∈ m + 1, . . . , n, si 6∼ ti.

Such a congruence relation ∼ defines a TE-interpretation I : (DI , αI) of F .DI consists of |SF / ∼ | elements, one for each congruence class of SF under∼. αI assigns elements of DI to the terms of SF in a way that respects ∼.Finally, αI assigns to = a binary relation over DI that behaves like ∼. Basedon this construction, we abbreviate (DI , αI) |= F with ∼ |= F .

The goal of the congruence closure algorithm is to construct the con-gruence relation of a formula’s subterm set, or to prove that no congruence re-lation exists. Given formula (9.2), the algorithm performs the following steps:

1. Construct the congruence closure ∼ of

s1 = t1, . . . , sm = tm

over the subterm set SF . Then

∼ |= s1 = t1 ∧ · · · ∧ sm = tm .

2. If si ∼ ti for any i ∈ m + 1, . . . , n, return unsatisfiable.3. Otherwise, ∼ |= F , so return satisfiable.

How do we actually construct the congruence closure in Step 1? Initially,begin with the finest congruence relation ∼0 given by the partition

s : s ∈ SF

in which each term of SF is its own congruence class. Then, for each i ∈1, . . . , m, impose si = ti by merging the congruence classes

[si]∼i−1and [ti]∼i−1

to form a new congruence relation ∼i. To accomplish this merging, first formthe union of [si]∼i−1

and [ti]∼i−1. Then propagate any new congruences that

arise within this union. In the new congruence relation∼i, si ∼i ti. Section 9.3presents the details of an efficient data structure and algorithm for performingthis construction.

Example 9.9. Consider the ΣE-formula

F : f(a, b) = a ∧ f(f(a, b), b) 6= a .


Construct the following initial partition by letting each member of the subtermset SF be its own class:

a, b, f(a, b), f(f(a, b), b) .

According to the first literal f(a, b) = a, merge

f(a, b) and a

to form partition

a, f(a, b), b, f(f(a, b), b) .

According to the (function congruence) axiom,

f(a, b) ∼ a, b ∼ b implies f(f(a, b), b) ∼ f(a, b) ,

resulting in the new partition

a, f(a, b), f(f(a, b), b), b .

This partition represents the congruence closure of SF . Now, is it the casethat

a, f(a, b), f(f(a, b), b), b |= F ?

No, as f(f(a, b), b) ∼ a but F asserts that f(f(a, b), b) 6= a. Hence, F isTE-unsatisfiable.


F : f(f(f(a))) = a ∧ f(f(f(f(f(a))))) = a ∧ f(a) 6= a

From the subterm set SF , the initial partition is

a, f(a), f2(a), f3(a), f4(a), f5(a) ,

where, for example, f3(a) abbreviates f(f(f(a))).According to the literal f3(a) = a, merge

f3(a) and a .

From the union a, f3(a), deduce the following congruence propagations:

f3(a) ∼ a ⇒ f(f3(a)) ∼ f(a) , i.e., f4(a) ∼ f(a)

and

f4(a) ∼ f(a) ⇒ f(f4(a)) ∼ f(f(a)) , i.e., f5(a) ∼ f2(a) .


Thus, the final partition for this iteration is the following:

a, f3(a), f(a), f4(a), f2(a), f5(a) .

From the second literal, f5(a) = a, merge

f2(a), f5(a) and a, f3(a)

to form the partition

a, f2(a), f3(a), f5(a), f(a), f4(a) .

Propagating the congruence

f3(a) ∼ f2(a) ⇒ f(f3(a)) ∼ f(f2(a)) , i.e., f4(a) ∼ f3(a)

yields the partition

a, f(a), f2(a), f3(a), f4(a), f5(a) ,

which represents the congruence closure in which all of SF are equal.Now, is it the case that

a, f(a), f2(a), f3(a), f4(a), f5(a) |= F ?

No, as f(a) ∼ a, but F asserts that f(a) 6= a. Hence, F is TE-unsatisfiable.


F : f(x) = f(y) ∧ x 6= y .

The subterm set SF induces the following initial partition:

x, y, f(x), f(y) .

Then f(x) = f(y) indicates to merge

f(x) and f(y) .

The union f(x), f(y) does not yield any new congruences, so the final par-tition is

x, y, f(x), f(y) .

Does

x, y, f(x), f(y) |= F ?

Yes, as x 6∼ y, agreeing with x 6= y. Hence, F is TE-satisfiable.

9.3 Congruence Closure with DAGs 251

1 : f

2 : f

3 : a 4 : b

1 : f

2 : f

3 : a 4 : b

1 : f

2 : f

3 : a 4 : b

(a) (b) (c)

Fig. 9.1. (a) DAG representation; (b), (c) with find

9.3 Congruence Closure with DAGs

So far, we have considered the congruence closure algorithm at an abstractlevel. In this section, we describe an efficient implementation of the algorithm.

Section 9.3.1 defines a graph-based data structure for representing all mem-bers of the subterm set SF of a ΣE-formula F . Each node represents a sub-term. Congruence classes are stored within this data structure via referencesbetween nodes. Sections 9.3.2 and 9.3.3 present algorithms that manipulatethe data structure to construct the congruence closure of the relation definedby F .

9.3.1 Directed Acyclic Graphs

A graph G : 〈N, E〉 has a set of nodes N = n1, n2, . . . , nk and a set ofedges E = . . . , 〈ni, nj〉, . . ., which consists of pairs of nodes. In a directedgraph, edges point from one node to another. For example, the edge 〈n3, n5〉is not the same edge as 〈n5, n3〉: the first points to n5, while the second pointsto n3. In a directed edge 〈m, n〉, m is the source, and n is the target. Adirected acyclic graph (DAG) is a directed graph in which no subset ofedges forms a directed loop, or cycle.

Example 9.12. Consider the directed graph

G : 〈N : 1, 2, 3, E : 〈1, 2〉, 〈2, 3〉, 〈3, 1〉〉 .

It is not a DAG because it contains the loop 1 → 2 → 3 → 1. However, thedirected graph

G : 〈N : 1, 2, 3, E : 〈1, 2〉, 〈2, 3〉, 〈1, 3〉〉

is a DAG.


The congruence closure algorithm uses a DAG to represent terms of thesubterm set. The DAG of Figure 9.1(a) represents the subterm set

a, b, f(a, b), f(f(a, b), b) .

Each node has a unique number identifying it and a text label representingthe root constant or function symbol of the associated term. Edges pointfrom a function symbol to its arguments. For example, node 3 represents theterm a; node 2 represents the term f(a, b); and node 1 represents the termf(f(a, b), b). Node 4, representing the term b, is shared by the terms f(a, b)and f(f(a, b), b).

So far, the following data type is sufficient to represent subterms:

type node = id : idfn : stringargs : id list

The id field holds the node’s unique identification number; the fn field holdsthe constant or function symbol; and the args field holds a list of identificationnumbers representing the function arguments. For example, node 2 of Figure9.1(a) has id = 2, fn = f , and args = [3; 4]; and node 3 has id = 3, fn = a,and args = [].

To represent congruence classes, we augment the data type with one morefield:

type node = id : idfn : stringargs : id listmutable find : id

The mutable keyword indicates that the value of the field can be modified.The find field holds the identification number of another node (possibly itself)in its congruence class. Following a chain of find references leads to therepresentative of the congruence class. A representative node’s find fieldpoints to the node itself.

Consider the DAG of Figure 9.1(b). The dashed edge from node 2 to node3 indicates that node 2’s find field points to node 3, and thus that nodes 2and 3 are in the same congruence class

f(a, b), a .

To avoid cluttering the illustrations, we do not draw dashed lines from anode to itself when its find is set to itself. Thus, the DAG of Figure 9.1(b)represents the partition


f(a, b), a, b, f(f(a, b), b) ;

and, for example, the find field of node 1 is 1. Also, since the find of node 3is 3, node 3 represents the congruence class of nodes 2 and 3.

In Figure 9.1(c), the dashed and dotted edges play the same role for now,each representing the value of the source node’s find field. The DAG repre-sents the partition

f(f(a, b), b), f(a, b), a, b .

Node 3 is the (unique) representative of its congruence class. We shall see thatthe algorithm maintains exactly one representative for each congruence class.

Finally, merging congruence classes requires accessing parent terms (e.g.,f(a, b) is the parent of a and b). Thus, we add a final mutable field:

type node = id : idfn : stringargs : id listmutable find : idmutable ccpar : id set

If a node is the representative for its congruence class, then its ccpar (forcongruence closure parents) field stores the set of all parents of all nodesin its congruence class. A non-representative node’s ccpar field is empty. InFigure 9.1(b), the ccpar

• of 1 is ∅ because 1 is its own representative, but it lacks parents;• of 2 is ∅ because 2 is not a representative;• of 3 is 1, 2 because 3 represents the class 2, 3 in which the parent of 2

is 1 and the parent of 3 is 2;• of 4 is 1, 2 because 4 represents itself and has parents 1, 2.

The DAG representation of node 2 is thus

id = 2;fn = f ;args = [3; 4];find = 3;ccpar = ∅;

Similarly, the DAG representation of node 3 is

id = 3;fn = a;args = [];find = 3;ccpar = 1, 2;


9.3.2 Basic Operations

In this section, we define the union-find algorithm on the DAG data struc-ture. Generally, a union-find algorithm provides an efficient means of manip-ulating sets. It represents each set by a single representative element. Callingthe find function on any element returns its set’s unique representative. Call-ing the union function on two elements unions the two elements’ sets andfixes a unique representative for the union set.

The union-find algorithm on DAGs is sufficient for manipulating equiva-lence classes of terms. In Section 9.3.3, we extend the union-find algorithm tomanipulate congruence classes.

node

node i returns the node n with id i. Thus, (node i).id = i. For example, inFigure 9.1(b), (node 2).find = 3.

find

The find function returns the representative of a node’s equivalence class. Itfollows find edges until it finds a self-loop:

let rec find i =let n = node i inif n.find = i then i else find n.find

Example 9.13. In the DAG of Figure 9.1(b), find 2 is 3. find follows thefind edge of 2 to 3; then it recognizes the self-loop and thus returns 3.

union

The union function returns the union of two equivalence classes, given twonode identities i1 and i2. It first finds the representatives n1 and n2 of i1’sand i2’s equivalence classes, respectively. Next, it sets n1’s find to n2’s repre-sentative, which is the identity of n2 itself. Now n2 represents the new largerequivalence class.

Finally, it combines the congruence closure parents, storing the new setin n2’s ccpar field because n2 is the representative of the union equivalenceclass. This last step is not strictly part of the union-find algorithm (which,recall, computes equivalence classes); rather, it is intended for when we usethe union-find algorithm to compute congruence classes. In code,

let union i1 i2 =let n1 = node (find i1) inlet n2 = node (find i2) inn1.find ← n2.find;n2.ccpar ← n1.ccpar ∪ n2.ccpar;n1.ccpar ← ∅


Example 9.14. Consider the DAG of Figure 9.1(b). To compute union 1 2,union finds n1 = 1 and n2 = 3 as the representatives of i1 = 1 and i2 = 2,respectively. It sets 1’s find field to 3’s find, which is 3, and combines theccpars, setting 3’s ccpar to 1, 2 and 1’s ccpar to ∅. The result is Figure9.1(c). For now, ignore the difference between dotted and dashed lines.

ccpar

The simple function ccpar i returns the parents of all nodes in i’s congruenceclass:

let ccpar i =(node (find i)).ccpar

9.3.3 Congruence Closure Algorithm

We are ready to build the congruence closure algorithm using the basic oper-ations. First we define the congruent function to check whether two nodesthat are not in the same congruence class are in fact congruent. Then wedefine the merge function to merge two congruence classes and to propagatethe effects of new congruences recursively.

congruent

congruent i1 i2 tests whether i1 and i2 are congruent. Let n1 and n2 be thenodes with identities i1 and i2, respectively. If their fn fields or their numbersof arguments are different, then they cannot be congruent, so congruent

returns false. Otherwise, if any argument of n1 is not in the congruenceclass of the corresponding argument of n2, then the terms are not congruent,so congruent returns false. If both of these tests are passed, then n1 andn2 are congruent, so congruent returns true. Formally,

let congruent i1 i2 =let n1 = node i1 in

let n2 = node i2 in

n1.fn = n2.fn∧ |n1.args| = |n2.args|∧ ∀i ∈ 1, . . . , |n1.args|. find n1.args[i] = find n2.args[i]

Example 9.15. Consider the DAG of Figure 9.1(b). Are nodes 1 and 2 con-gruent? congruent notes that

• their fn fields are both f : n1.fn = n2.fn = f ;• their numbers of arguments are both 2;• their left arguments f(a, b) and a are both congruent to 3:

n1.args = [2; 4], n2.args = [3; 4], and find 2 = find 3 = 3;


• and their right arguments b and b are both congruent to 4:n1.args = [2; 4], n2.args = [3; 4], and find 4 = find 4 = 4.

Therefore, nodes 1 and 2 are congruent.

merge

Next, merge i1 i2 merges the congruence classes of i1 and i2. If find i1 =find i2, then the two nodes are already in the same congruence class, somerge returns without changing the DAG structure. Otherwise, let Pi1 andPi2 store the current values of ccpar i1 and ccpar i2, respectively. merge

first calls union i1 i2 to compute the merged equivalence class. Then, foreach pair of terms t1, t2 ∈ Pi1 × Pi2 , it recursively calls merge t1 t2 ifcongruent t1 t2. This last step propagates the effects of new congruences ac-cording to the (function congruence) axiom schema. Upon completion, the newpartition represents a congruence relation in which i1 and i2 are congruent.In code,

let rec merge i1 i2 =if find i1 6= find i2 then begin

let Pi1 = ccpar i1 in

let Pi2 = ccpar i2 in

union i1 i2;foreach t1, t2 ∈ Pi1 × Pi2 do

if find t1 6= find t2 ∧ congruent t1 t2then merge t1 t2

done

end

9.3.4 Decision Procedure for TE-Satisfiability

Given ΣE-formula

F : s1 = t1 ∧ · · · ∧ sm = tm ∧ sm+1 6= tm+1 ∧ · · · ∧ sn 6= tn

with subterm set SF , perform the following steps:

1. Construct the initial DAG for the subterm set SF .2. For i ∈ 1, . . . , m, merge si ti.3. If find si = find ti for some i ∈ m + 1, . . . , n, return unsatisfiable.4. Otherwise (if find si 6= find ti for all i ∈ m+1, . . . , n) return satisfiable.


F : f(a, b) = a ∧ f(f(a, b), b) 6= a .

The subterm set is


SF = a, b, f(a, b), f(f(a, b), b) ,

resulting in the initial partition

a, b, f(a, b), f(f(a, b), b)

in which each term is its own congruence class. The DAG of Figure 9.1(a)represents this initial partition.

According to the literal f(a, b) = a, merge 2 3. find 2 6= find 3, so let

P2 = ccpar 2 = 1 and P3 = ccpar 3 = 2 .

The union 2 3 computation results in the DAG of Figure 9.1(b) in which2 and 3 are equivalent. This new equivalence makes 1 and 2 congruent, somerge 1 2 is called recursively.

In computing merge 1 2,

P1 = ccpar 1 = ∅ and P2 = ccpar 2 = 1, 2 ,

so P1 × P2 = ∅. Hence, after computing union 1 2, the computation finisheswith the DAG of Figure 9.1(c). The dotted edge distinguishes the deducedmerge from the merge dictated by the equality in F , marked by the dashededge. This final DAG represents the partition

a, f(a, b), f(f(a, b), b), b

and thus the congruence relation in which a, f(a, b), and f(f(a, b), b) arecongruent. Does

a, f(a, b), f(f(a, b), b), b |= F ?

No, as f(f(a, b), b) ∼ a, but F asserts that f(f(a, b), b) 6= a. Hence, F isTE-unsatisfiable.


F : f(f(f(a))) = a ∧ f(f(f(f(f(a))))) = a ∧ f(a) 6= a ,

which induces the initial partition and DAG shown in Figure 9.2(a). Accord-ing to the literal f(f(f(a))) = a, merge 3 0. On this initial merge

P3 = 4 and P0 = 1 .

Additionally, the labels of both 4 and 1 are f , and their arguments, 3 and 0respectively, are congruent after union 3 0. Thus, recursively merge 4 1. Afterunion 4 1, their parents 5 and 2 are congruent, so merge 5 2. The recursionfinishes after union 5 2 since P5 = ∅, resulting in the DAG of Figure 9.2(b).


(a) 5 : f 4 : f 3 : f 2 : f 1 : f 0 : a

(b) 5 : f 4 : f 3 : f 2 : f 1 : f 0 : a

(c) 5 : f 4 : f 3 : f 2 : f 1 : f 0 : a

Fig. 9.2. DAGs for Example 9.17

The dotted edges distinguish deduced merges from merges dictated by F ,which are marked by dashed edges. Thus, the partition is now

a, f3(a), f(a), f4(a), f2(a), f5(a) .

Next, according to the literal f(f(f(f(f(a))))) = a, merge 5 0. find 5 = 2and find 0 = 0, so

P5 = 3 and P0 = 1, 4 .

After completing union 5 0 (by adding the dashed line from 2 to 0 in Figure9.2(c)), it is the case that congruent 3 1, so merge 3 1. This merge causesthe final union 3 1, resulting in the dotted line from 0 to 1 in Figure 9.2(c).Figure 9.2(c) represents the partition

a, f(a), f2(a), f3(a), f4(a), f5(a) .

Now, does

a, f(a), f2(a), f3(a), f4(a), f5(a) |= F ?

No, as f(a) ∼ a, but F asserts that f(a) 6= a. Hence, F is TE-unsatisfiable.

Theorem 9.18 (Sound & Complete). Quantifier-free conjunctive ΣE-formula F is TE-satisfiable iff the congruence closure algorithm returns satis-fiable.

9.3.5 ⋆Complexity

Let e be the number of edges and n be the number of nodes in the initialDAG.

Theorem 9.19 (Complexity). The congruence closure algorithm runs intime O(e2) for O(n) merges.

However, Downey, Sethi, and Tarjan described an algorithm with O(e log e)average running time for O(n) merges. Computing TE-satisfiability is inex-pensive.


9.4 Recursive Data Structures

Recursive data structures include records, lists, trees, stacks, and queues. Thetheory TRDS can model records, lists, trees, and stacks, but not queues: whereasa particular list has a single representation, queues — in which only ordermatters — do not. In this section, we discuss the theory of lists Tcons forease of exposition. Both its axiomatization and the decision procedure for thequantifier-free fragment rely on our discussion of TE.

Recall that the signature of Tcons is

Σcons : cons, car, cdr, atom, = ,

where

• cons is a binary function, called the constructor; cons(a, b) represents thelist constructed by prepending a to b;

• car is a unary function, called the left projector: car(cons(a, b)) = a;• cdr is a unary function, called the right projector: cdr(cons(a, b)) = b;• atom is a unary predicate;• and = is a binary predicate.



2. instantiations of the (function congruence) axiom schema for cons, car, andcdr:

∀x1, x2, y1, y2. x1 = x2 ∧ y1 = y2 → cons(x1, y1) = cons(x2, y2)

∀x, y. x = y → car(x) = car(y)

∀x, y. x = y → cdr(x) = cdr(y)

3. an instantiation of the (predicate congruence) axiom schema for atom:

∀x, y. x = y → (atom(x)↔ atom(y))

4. ∀x, y. car(cons(x, y)) = x (left projection)5. ∀x, y. cdr(cons(x, y)) = y (right projection)6. ∀x. ¬atom(x) → cons(car(x), cdr(x)) = x (construction)7. ∀x, y. ¬atom(cons(x, y)) (atom)

As in our discussion of TE, we consider only quantifier-free conjunc-tive Σcons-formulae. Again, satisfiability of quantifier-free but non-conjunctiveΣcons-formulae can be decided by converting to DNF and checking each dis-junct. However, the decision procedure does not extend to the quantified casesince satisfiability in the full theory Tcons is undecidable.


cons

x y

=⇒

car cdr

cons

x y

Fig. 9.3. The transformation in Step 2 of the decision procedure

Given a quantifier-free conjunctive Σcons-formula F , ¬atom(ui) literals areremoved in a preprocessing step that follows from the (construction) axiom.Replace

¬atom(ui) with ui = cons(u1i , u

2i ) .

Now consider Σcons-formula

F : s1 = t1 ∧ · · · ∧ sm = tm ∧ sm+1 6= tm+1 ∧ · · · ∧ sn 6= tn∧ atom(u1) ∧ · · · ∧ atom(uℓ)

in which si, ti, and ui are Tcons-terms. To decide its Tcons-satisfiability, performthe following steps:

1. Construct the initial DAG for the subterm set SF .2. For each node n such that n.fn = cons,• add car(n) to the DAG and merge car(n) n.args[1];• add cdr(n) to the DAG and merge cdr(n) n.args[2]by the (left projection) and (right projection) axioms. See Figure 9.3.

3. For i ∈ 1, . . . , m, merge si ti.4. For i ∈ m + 1, . . . , n, if find si = find ti, return unsatisfiable.5. For i ∈ 1, . . . , ℓ if ∃v. find v = find ui ∧ v.fn = cons, return unsatis-

fiable by axiom (atom).6. Otherwise, return satisfiable.

Steps 1, 3, 4, and 6 are identical to Steps 1-4 of the decision procedure for TE.Because of their similarity, it is simple to combine the two theories.

Example 9.20. Consider the (Σcons ∪ΣE)-formula

F : car(x) = car(y) ∧ cdr(x) = cdr(y) ∧ f(x) 6= f(y)∧ ¬atom(x) ∧ ¬atom(y)

in which the function symbol f is in Σ=. Is it (Tcons∪TE)-satisfiable? Accordingto the final two literals, x and y are non-atom structures; thus, the first twoliterals imply that x = y. Yet the third literal, f(x) 6= f(y), contradicts this


car f cdr car f cdr

x y

cons cons

u1 v1 u2 v2

car f cdr car f cdr

x y

car cdr car cdr

cons cons

u1 v1 u2 v2

(a) (b)

Fig. 9.4. DAG after (a) Step 1 and (b) Step 2

equality according to the (congruence) axiom of TE. Hence, F is (Tcons ∪ TE)-unsatisfiable.

To prepare F for the decision procedure, rewrite F according to the (con-struction) axiom:

F ′ : car(x) = car(y) ∧ cdr(x) = cdr(y) ∧ f(x) 6= f(y)∧ x = cons(u1, v1) ∧ y = cons(u2, v2) .

The first two and final two literals imply that u1 = u2 and v1 = v2 so thatagain x = y. The remaining reasoning is as for F .

Let us apply the decision procedure to F ′. The initial DAG of F ′ is dis-played in Figure 9.4(a). Figure 9.4(b) displays the DAG after Step 2.

According to the literals car(x) = car(y) and cdr(x) = cdr(y), compute

merge car(x) car(y) and merge cdr(x) cdr(y) ,

which add the two dashed arrows on the top of Figure 9.5(a). Then accordingto literal x = cons(u1, v1),

merge x cons(u1, v1) ,

which adds the dashed arrow from x to cons in Figure 9.5(a). Consequently,car(x) and car(cons(u1, v1)) become congruent. Since

find car(x) = car(y) and find car(cons(u1, v1)) = u1 ,

the find of car(y) is set to point to u1 during the subsequent union, resultingin the left dotted arrow of Figure 9.5(a). Similarly, cdr(x) and cdr(cons(u1, v1))become congruent, with similar effects (the right dotted arrow of Figure


car f cdr car f cdr

x y

car cdr car cdr

cons cons

u1 v1 u2 v2

car f cdr car f cdr

x y

car cdr car cdr

cons cons

u1 v1 u2 v2

(a) (b)

Fig. 9.5. (a) Intermediate DAG and (b) final DAG

9.5(a)). The state of the DAG after these merges is shown in Figure 9.5(a).Dashed lines indicate merges that arise directly from the literals of F ′; dottedlines indicate deduced merges.

Next, according to the literal y = cons(u2, v2),

merge y cons(u2, v2) ,

resulting in the new dashed line from y to cons(u2, v2) in Figure 9.5(b). Thismerge produces two new congruences:

car(y) = car(cons(u2, v2)) and cdr(y) = cdr(cons(u2, v2)) .

Trace through the actions of these merges to understand the addition of thetwo bottom dotted arrows from u1 to u2 and from v1 to v2 in Figure 9.5(b).During the computation

merge cdr(y) cdr(cons(u2, v2)) ,

cons(u1, v1) and cons(u2, v2) become congruent; then find of cons(u1, v1) isset to point to cons(u2, v2), as represented by the new horizontal dashed linein Figure 9.5(b). This congruence causes the additional congruence

f(x) = f(y) ;

merge f(x) f(y) produces the final dotted edge from f(x) to f(y). Figure9.5(b) displays the final DAG.

Does this DAG model F? No, as find f(x) is f(y) and find f(y) is f(y) sothat f(x) ∼ f(y); however, F asserts that f(x) 6= f(y). F is thus (Tcons ∪TE)-unsatisfiable.

9.5 Arrays 263

9.5 Arrays

Arrays are a basic nonrecursive data type in programming languages, so rea-soning about them is important. In this section, we present a decision proce-dure for the quantifier-free fragment of TA. This fragment is not expressive:one can assert properties only of individual elements, not of entire arrays.Chapter 11 examines a decision procedure for a more expressive fragment.

Recall from Chapter 3 that the theory of arrays TA has signature

ΣA : ·[·], ·〈· ⊳ ·〉, = ,

where

• a[i] is a binary function: a[i] represents the value of array a at position i;• a〈i ⊳ v〉 is a ternary function: a〈i ⊳ v〉 represents the modified array a in

which position i has value v;• and = is a binary predicate.

The axioms of TA are the following:


2. ∀a, i, j. i = j → a[i] = a[j] (array congruence)3. ∀a, v, i, j. i = j → a〈i ⊳ v〉[j] = v (read-over-write 1)4. ∀a, v, i, j. i 6= j → a〈i ⊳ v〉[j] = a[j] (read-over-write 2)

We consider TA-satisfiability in the quantifier-free fragment of TA. As usual,we consider only conjunctive ΣA-formulae since conversion to DNF extendsthe decision procedure to arbitrary quantifier-free ΣA-formulae.

The decision procedure for TA-satisfiability of quantifier-free ΣA-formulaF is based on a reduction to TE-satisfiability via applications of the (read-over-write) axioms. Intuitively, if F does not contain any write terms, then the readterms can be viewed as uninterpreted function terms. Otherwise, any writeterm must occur in the context of a read — as a read-over-write term a〈i⊳v〉[j]— since arrays themselves cannot be asserted to be equal or not equal. In thiscase, the (read-over-write) axioms can be applied to deconstruct the read-over-write terms. In detail, to decide the TA-satisfiability of F , perform thefollowing recursive steps.

Step 1

If F does not contain any write terms a〈i ⊳ v〉, perform the following steps:

1. Associate each array variable a with a fresh function symbol fa, and re-place each read term a[i] with fa(i).

2. Decide and return the TE-satisfiability of the resulting formula.


Step 2

Select some read-over-write term a〈i ⊳ v〉[j] (recall that a may itself be a writeterm), and split on two cases:

1. According to (read-over-write 1), replace

F [a〈i ⊳ v〉[j]] with F1 : F [v] ∧ i = j ,

and recurse on F1. If F1 is found to be TA-satisfiable, return satisfiable.2. According to (read-over-write 2), replace

F [a〈i ⊳ v〉[j]] with F2 : F [a[j]] ∧ i 6= j ,

and recurse on F2. If F2 is found to be TA-satisfiable, return satisfiable.

If both F1 and F2 are found to be TA-unsatisfiable, return unsatisfiable.

Example 9.21. Consider ΣA-formula

F : i1 = j ∧ i1 6= i2 ∧ a[j] = v1 ∧ a〈i1 ⊳ v1〉〈i2 ⊳ v2〉[j] 6= a[j] .

Recall that a〈i1 ⊳ v1〉〈i2 ⊳ v2〉[j] abbreviates the term

((a〈i1 ⊳ v1〉)〈i2 ⊳ v2〉)[j] .

F contains a write term, so select a read-over-write term to deconstruct:

a〈i1 ⊳ v1〉〈i2 ⊳ v2〉[j] .

According to (read-over-write 1), assume i2 = j and recurse on

F1 : i2 = j ∧ i1 = j ∧ i1 6= i2 ∧ a[j] = v1 ∧ v2 6= a[j] .

F1 does not contain any write terms, so rewrite it to

F ′1 : i2 = j ∧ i1 = j ∧ i1 6= i2 ∧ fa(j) = v1 ∧ v2 6= fa(j) .

The first two literals imply that i1 = i2, contradicting the third literal, so F ′1

is TE-unsatisfiable.Returning, we try the second case: according to (read-over-write 2), assume

i2 6= j and recurse on

F2 : i2 6= j ∧ i1 = j ∧ i1 6= i2 ∧ a[j] = v1 ∧ a〈i1 ⊳ v1〉[j] 6= a[j] .

F2 contains a write term, so select the only read-over-write term to decon-struct. According to (read-over-write 1), assume i1 = j and recurse on

F3 : i1 = j ∧ i2 6= j ∧ i1 = j ∧ i1 6= i2 ∧ a[j] = v1 ∧ v1 6= a[j] .

Clearly, following this path leads to a contradiction because of the final twoterms. Thus, according to (read-over-write 2), assume i1 6= j and recurse on

F4 : i1 6= j ∧ i2 6= j ∧ i1 = j ∧ i1 6= i2 ∧ a[j] = v1 ∧ a[j] 6= a[j] .

But the final literal is contradictory. As all branches have been tried, F isTA-unsatisfiable.

9.6 Summary 265

Theorem 9.22 (Sound & Complete). Given quantifier-free conjunctiveΣA-formula F , the decision procedure returns satisfiable iff F is TA-satisfiable;otherwise, it returns unsatisfiable.

Theorem 9.23 (Complexity). TA-satisfiability of quantifier-free conjunc-tive ΣA-formulae is NP-complete.

The power of disjunction manifested in Steps 2(a) and 2(b) of the decisionprocedure results in this complexity.

Proof. That the problem is in NP is simple: for a given formula F , guessa formula like F , except that for every instance of a read-over-write terma〈i ⊳ v〉[j], it includes as a conjunction either i = j or i 6= j. Only a linearnumber of literals are added. Then check TA-satisfiability of this formula, usingthe new literals to simplify read-over-write terms.

To prove NP-hardness, we reduce from satisfiability of PL formulae, whichis NP-complete. Given a propositional formula F in CNF, assert

vP 6= v¬P

for each variable P in F ; vP and v¬P represent the values of P and ¬P ,respectively. Introduce a fresh constant •. Then consider the nth clause of F ,say

(¬P ∨ Q ∨ ¬R) .

In this case, assert

a[jn] 6= • ∧ a〈iP ⊳ v¬P 〉〈iQ ⊳ vQ〉〈iR ⊳ v¬R〉[jn] = • .

jn must be equal to one of the introduced values at iP , iQ, or iR. Therefore,at cell jn the corresponding value (v¬P , vQ, or v¬R) must equal •. In thisfashion, add an assertion for each clause of F . Conjoin all assertions to formquantifier-free conjunctive ΣA-formula G. G is equisatisfiable to F : vP = • iffP is true; and v¬P iff P is false. G is also of size polynomial in the size of F .Thus, deciding TA-satisfiability of G decides propositional satisfiability of F ,so TA-satisfiability is NP-complete.

9.6 Summary

This chapter presents the congruence closure algorithm and applications todeciding satisfiability in the quantifier-free fragments of TE, Tcons, and TA. Itcovers:

• The congruence closure algorithm at an abstract level. Relations, equiva-lence relations, congruence relations. Partitions; equivalence and congru-ence classes. Closures.


• The DAG-based implementation. Directed acyclic graph representation offormulae. The union-find algorithm. Merging.

• Recursive data structures. These structures include records, lists, stacks,and trees, but not queues. The decision procedure extends the congruenceclosure algorithm.

• Arrays. The decision procedure branches based on read-over-write termsaccording to the read-over-write axioms. When all write terms have beenremoved on one branch, the congruence closure algorithm is applied.

Equality is found within many first-order theories, including all the theoriesstudied in this book. Additionally, equality and the congruence closure algo-rithm are the unifying components of the Nelson-Oppen combination methodstudied in Chapter 10. Hence, reasoning about it is fundamental. Fortunately,satisfiability is efficiently decidable using the congruence closure algorithm.

Uninterpreted functions are used to represent data structures in the con-text of TRDS and TA. In applications such as software and hardware verifica-tion, they allow abstracting away implementations of select components tosimplify reasoning.

The DAG-based data structure that is the basis of the congruence closurealgorithm is used whenever subformulae must be represented uniquely, eitherfor algorithm correctness or for space and time considerations. Exercises 9.4and 9.5 explore this data structure.

So far, we have seen three types of decision procedures. The quantifier-elimination procedures of Chapter 7 manipulate formulae to construct equiva-lent quantifier-free (and possibly variable-free) formulae. The simplex methodof Chapter 8 works explicitly with the underlying structure of the set of satis-fying interpretations. The congruence closure algorithm shares characteristicsof both types of procedures: it manipulates formulae efficiently using the DAG-based algorithm; but it also represents satisfying interpretations explicitly ascongruence classes of terms via the find pointers.


The quantifier-free fragment of TE was first proved decidable by Ackermannin 1954 [1] and later studied by various teams in the late 1970s. Shostak [83],Nelson and Oppen [66], and Downey, Sethi, and Tarjan [29] present alternatesolutions to the problem. We discuss the method of Nelson and Oppen [66].

Oppen presents a theory of acyclic recursive data structures [69, 71]. Thedecision problem in the quantifier-free fragment of this theory is decidable inlinear time, and the full theory is decidable. Our presentation is based on thework of Nelson and Oppen [66] for possibly-cyclic data structures.

McCarthy proposes the axiomatization of arrays based on read-over-write[58]. James King implemented the decision procedure for the quantifier-freefragment as part of his thesis work [50].

Exercises 267

Exercises

9.1 (DP for TE). Apply the decision procedure for TE to the following ΣE-formulae. Provide a level of detail as in Example 9.10.

(a) f(x, y) = f(y, x) ∧ f(a, y) 6= f(y, a)(b) f(g(x)) = g(f(x)) ∧ f(g(f(y))) = x ∧ f(y) = x ∧ g(f(x)) 6= x(c) f(f(f(a))) = f(f(a)) ∧ f(f(f(f(a)))) = a ∧ f(a) 6= a(d) f(f(f(a))) = f(a) ∧ f(f(a)) = a ∧ f(a) 6= a(e) p(x) ∧ f(f(x)) = x ∧ f(f(f(x))) = x ∧ ¬p(f(x))

9.2 (DAG-based DP for TE). Apply the DAG-based decision procedure forTE to the ΣE-formulae of Exercise 9.1. Provide a level of detail as in Example9.17.

9.3 (⋆Undecidable fragment). Show that allowing even one quantifier al-ternation (i.e., ∃x1, . . . , xk. ∀y1, . . . , yn. F [x, y]) makes satisfiability in TE un-decidable.

9.4 (⋆DAG). Describe a data structure and algorithm for constructing theinitial DAG in the congruence closure procedure. It should run in time ap-proximately linear in the size of the formula.

9.5 (⋆PL & DAGs). This problem explores a concise representation ofpropositional logic formulae.

(a) Describe a DAG-based representation of PL formulae.(b) If you have not already done so, consider that the logical connectives ∧

and ∨ are associative and commutative. Improve your representation toexploit this observation.

(c) Modify the CNF conversion algorithm of Section 1.7.3 to operate on yourDAG-based representation. How large is the resulting CNF formula rela-tive to the DAG?

9.6 (DP for Tcons). Apply the decision procedure for Tcons to the followingTcons-formulae. Provide a level of detail as in Example 9.20.

(a) car(x) = y ∧ cdr(x) = z ∧ x 6= cons(y, z)(b) ¬atom(x) ∧ car(x) = y ∧ cdr(x) = z ∧ x 6= cons(y, z)

9.7 (Flawed DP for Tcons). Consider a variant of the Tcons-satisfiabilityprocedure in which Steps 2 and 3 are swapped.1 What is wrong with re-versing these two steps? Identify a counterexample to its correctness: find aTcons-unsatisfiable (conjunctive, quantifier-free) Σcons-formula that the incor-rect procedure claims is satisfiable, and show the final DAGs for both theincorrect and the correct procedures.

1 Suggested by a typo in [66].


9.8 (DP for quantifier-free TA). Apply the decision procedure for quantifier-free TA to the following ΣA-formulae.

(a) a〈i ⊳ e〉[j] = e ∧ i 6= j(b) a〈i ⊳ e〉[j] = e ∧ a[j] 6= e(c) a〈i ⊳ e〉[j] = e ∧ i 6= j ∧ a[j] 6= e(d) a〈i ⊳ e〉〈j ⊳ f〉[k] = g ∧ j 6= k ∧ i = j ∧ a[k] 6= g(e) i1 = j ∧ a[j] = v1 ∧ a〈i1 ⊳ v1〉〈i2 ⊳ v2〉[j] 6= a[j]

10

Combining Decision Procedures

The expressions which arise in program manipulation often do not fallwithin any. . . naturally defined theories — they usually involve mixedterms containing functions and predicates from several theories.

— Greg Nelson and Derek C. OppenSimplification by Cooperating Decision Procedures, 1979

Chapters 7–9 consider decision procedures for theories that each formalizejust one data type. Yet almost all formulae in Chapter 5 are formulae ofunion theories. For example, many assert facts in TZ ∪ TA about arrays ofintegers indexed by integers. Additionally, the decision procedure for the arrayproperty fragment of T Z

A that we discuss in Chapter 11 requires a procedure forthe quantifier-free fragment of TZ ∪ TA. Can we reuse the decision proceduresof Chapters 7–9 to decide satisfiability of formulae in union theories, or mustwe invent a new procedure for each combination?

Fortunately, there is a general result for quantifier-free fragments of uniontheories that allows us to reuse the procedures. This chapter discusses theNelson-Oppen combination method for constructing decision proceduresfor union theories from decision procedures for individual theories. Section10.1 introduces the method and discusses its limitations. Then Section 10.2presents a nondeterministic version, for which correctness is proved in Section10.4; and Section 10.3 presents the more practical deterministic version.

In this chapter, decision procedures for individual theories apply just toquantifier-free fragments. We rely on Cooper’s method with all optimizationsfor considering quantifier-free ΣZ-formulae. Procedures for the other theoriesalready apply only to their quantifier-free fragments.

10.1 Combining Decision Procedures

Consider two theories T1 and T2 over signatures Σ1 and Σ2, respectively. Forthe quantifier-free fragments of T1 and T2, we have decision procedures P1 andP2. How do we decide satisfiability in the quantifier-free fragment of T1 ∪ T2?

270 10 Combining Decision Procedures

Example 10.1. Consider the (ΣE ∪ΣZ)-formula

F : 1 ≤ x ∧ x ≤ 2 ∧ f(x) 6= f(1) ∧ f(x) 6= f(2) .

Chapter 9 describes a decision procedure for TE, while Chapter 7 presents adecision procedure for TZ. We would like to combine these decision proceduresto decide the (TE ∪ TZ)-satisfiability of F and other quantifier-free (ΣE ∪ΣZ)-formulae.

The Nelson-Oppen combination method (N-O method) combinesdecision procedures for the quantifier-free fragments of several theories intoone decision procedure for the quantifier-free fragment of the union theory.In our presentation of the N-O method, we usually discuss combining twotheories and their decision procedures; however, the N-O method can com-bine an arbitrary number of theories and procedures. Additionally, we restrictourselves to considering conjunctive formulae; however, the satisfiability ofarbitrary (quantifier-free) formulae can be considered by converting to DNFand checking each disjunct.

Besides being restricted to quantifier-free formulae, the N-O method hastwo additional restrictions. First, the signatures Σ1 and Σ2 can only shareequality =:

Σ1 ∩Σ2 = = .

Second, the theories T1 and T2 must be stably infinite.A theory T with signature Σ is stably infinite if for every quantifier-free

Σ-formula F , if F is T -satisfiable, then there exists some T -interpretationthat satisfies F and has a domain of infinite cardinality. We illustrate thisconcept with two example theories.

Example 10.2. Consider the theory Ta,b with signature

Σ2 : a, b, = ,

where both a and b are constants, and axiom

1. ∀x. x = a ∨ x = b (two)

Because of axiom (two), every Ta,b-interpretation I : (DI , αI) is such thatthe domain DI has at most two elements: |DI | ≤ 2. Hence, Ta,b is not stablyinfinite.

Example 10.3. We prove that TE is stably infinite. Consider the TE-satisfiablequantifier-free ΣE-formula F with arbitrary satisfying TE-interpretation I :(DI , αI) in which αI maps = to =I . Let A be any infinite set disjoint fromDI . Then construct new interpretation J : (DJ , αJ ):

• DJ = DI ∪A

10.2 Nelson-Oppen Method: Nondeterministic Version 271

• αJ = = 7→ =J , . . ., where for v1, v2 ∈ DJ ,

v1 =J v2def=

v1 =I v2 if v1, v2 ∈ DI

⊤ if v1 is the same element as v2

⊥ otherwise

J is a TE-interpretation satisfying F with infinite domain. Hence, TE is stablyinfinite.

The other theories discussed in this book are also stably infinite.

Example 10.4. Consider the quantifier-free conjunctive (ΣE ∪ΣZ)-formula

F : 1 ≤ x ∧ x ≤ 2 ∧ f(x) 6= f(1) ∧ f(x) 6= f(2) .

The signatures of TE and TZ only share =. Also, both theories are stablyinfinite. Hence, the N-O combination of the decision procedures for TE and TZ

decides the (TE ∪ TZ)-satisfiability of F .Intuitively, F is (TE ∪ TZ)-unsatisfiable. For the first two literals imply

x = 1 ∨ x = 2 so that f(x) = f(1) ∨ f(x) = f(2). Yet the last two literalscontradict this conclusion.

10.2 Nelson-Oppen Method: Nondeterministic Version

In this section, we discuss the nondeterministic version of the N-O method.While simple to present, it suffers from high complexity. Section 10.3 refor-mulates the method to be deterministic and efficient.

Consider a quantifier-free conjunctive (Σ1 ∪ Σ2)-formula F . The N-Omethod proceeds in two steps.

10.2.1 Phase 1: Variable Abstraction

The variable abstraction phase transforms a quantifier-free conjunctive for-mula F into two quantifier-free conjunctive formulae, a Σ1-formula F1 and aΣ2-formula F2, such that F and F1 ∧ F2 are (T1 ∪ T2)-equisatisfiable. Thatis, F is (T1 ∪ T2)-satisfiable iff F1 ∧ F2 is (T1 ∪ T2)-satisfiable. F1 and F2 arelinked via a set of shared variables.

For term t, let hd(t) be the root symbol; e.g., hd(f(x)) = f . Then fori, j ∈ 1, 2 and i 6= j, repeat the following transformations as long as possible:

1. if function f ∈ Σi and hd(t) ∈ Σj ,

F [f(t1, . . . , t, . . . , tn)] =⇒ F [f(t1, . . . , w, . . . , tn)] ∧ w = t

2. if predicate p ∈ Σi and hd(t) ∈ Σj ,

F [p(t1, . . . , t, . . . , tn)] =⇒ F [p(t1, . . . , w, . . . , tn)] ∧ w = t


3. if hd(s) ∈ Σi and hd(t) ∈ Σj ,

F [s = t] =⇒ F [w = t] ∧ w = s

w is a fresh variable in each application of a transformation. Transformation3 also applies to s 6= t literals: replace F [s 6= t] with F [w 6= t] ∧ w = s.

After applying the transformations, each literal of the resulting formulafalls entirely within the signature of one of the two theories (or possibly withineach if it is just an equality x = y or a disequality x 6= y between variables:such literals are in every signature since they do not have symbols otherthan =). Divide the literals into two sets, one for each theory. These sets arenot disjoint when there is a literal that is an equality or disequality betweenvariables. Then return the conjunction of each set.

Example 10.5. Consider (ΣE ∪ΣZ)-formula

F : 1 ≤ x ∧ x ≤ 2 ∧ f(x) 6= f(1) ∧ f(x) 6= f(2) .

Since f ∈ Σ= and 1 ∈ ΣZ, replace f(1) by f(w1) and add w1 = 1 by trans-formation 1. Similarly, replace f(2) by f(w2) and add w2 = 2.

Now, the literals

1 ≤ x, x ≤ 2, w1 = 1, and w2 = 2

are TZ-literals, while the literals

f(x) 6= f(w1) and f(x) 6= f(w2)

are TE-literals. Hence, construct the ΣZ-formula

FZ : 1 ≤ x ∧ x ≤ 2 ∧ w1 = 1 ∧ w2 = 2

and the ΣE-formula

FE : f(x) 6= f(w1) ∧ f(x) 6= f(w2) .

FZ and FE share the variables x, w1, and w2. FZ∧FE is (TE∪TZ)-equisatisfiableto F .


F : f(x) = x + y ∧ x ≤ y + z ∧ x + z ≤ y ∧ y = 1 ∧ f(x) 6= f(2) .

Intuitively, F is (TE∪TZ)-satisfiable: consider an interpretation in which x = 0,y = 1, z = 1, f(0) = 1, and f(2) = 2.

In the first literal, hd(f(x)) = f ∈ ΣE and hd(x + y) = + ∈ ΣZ; thus, bytransformation 3, replace the literal with

w1 = x + y ∧ w1 = f(x) .


In the last literal, f ∈ ΣE but 2 ∈ ΣZ, so by transformation 1, replace it with

f(x) 6= f(w2) ∧ w2 = 2 .

Now, separating the literals results in two formulae:

FZ : w1 = x + y ∧ x ≤ y + z ∧ x + z ≤ y ∧ y = 1 ∧ w2 = 2

is a ΣZ-formula, and

FE : w1 = f(x) ∧ f(x) 6= f(w2)

is a ΣE-formula. The conjunction FZ ∧ FE is (TE ∪ TZ)-equisatisfiable to F .

10.2.2 Phase 2: Guess and Check

Phase 1 separates (Σ1∪Σ2)-formula F into two formulae, Σ1-formula F1, andΣ2-formula F2. F1 and F2 are linked by a set of shared variables. Let

V = shared(F1, F2) = free(F1) ∩ free(F2)

be the shared variables of F1 and F2. Let E be an equivalence relation overV . The arrangement α(V, E) of V induced by E is the formula

α(V, E) :∧

u,v ∈ V. uEv

u = v ∧∧

u,v ∈ V. ¬(uEv)

u 6= v ,

which asserts that variables related by E are equal and that variables unre-lated by E are not equal. The formula F is (T1∪T2)-satisfiable iff there existsan equivalence relation E of V such that

• F1 ∧ α(V, E) is T1-satisfiable, and• F2 ∧ α(V, E) is T2-satisfiable.

Otherwise, F is (T1 ∪ T2)-unsatisfiable.

Example 10.7. Consider (ΣE ∪ΣZ)-formula

F : 1 ≤ x ∧ x ≤ 2 ∧ f(x) 6= f(1) ∧ f(x) 6= f(2) .

Phase 1 separates this formula into the ΣZ-formula

FZ : 1 ≤ x ∧ x ≤ 2 ∧ w1 = 1 ∧ w2 = 2

and the ΣE-formula

FE : f(x) 6= f(w1) ∧ f(x) 6= f(w2) ,

with

V = shared(FZ, FE) = x, w1, w2 .

There are 5 equivalence relations to consider, which we list by stating thepartitions:


1. x, w1, w2, i.e., x = w1 = w2: FE ∧ α(V, E) is TE-unsatisfiable becauseit cannot be the case that both x = w1 and f(x) 6= f(w1).

2. x, w1, w2, i.e., x = w1, x 6= w2: FE ∧ α(V, E) is TE-unsatisfiablebecause it cannot be the case that both x = w1 and f(x) 6= f(w1).

3. x, w2, w1, i.e., x = w2, x 6= w1: FE ∧ α(V, E) is TE-unsatisfiablebecause it cannot be the case that both x = w2 and f(x) 6= f(w2).

4. x, w1, w2, i.e., x 6= w1, w1 = w2: FZ ∧ α(V, E) is TZ-unsatisfiablebecause it cannot be the case that both w1 = w2 and w1 = 1 ∧ w2 = 2.

5. x, w1, w2, i.e., x 6= w1, x 6= w2, w1 6= w2: FZ ∧ α(V, E) is TZ-unsatisfiable because it cannot be the case that both x 6= w1 ∧ x 6= w2

and x = w1 = 1 ∨ x = w2 = 2 (since 1 ≤ x ≤ 2 implies that x = 1 ∨ x = 2in TZ).

Hence, F is (TE ∪ TZ)-unsatisfiable.

Example 10.8. Consider the (Σcons ∪ΣZ)-formula

F : car(x) + car(y) = z ∧ cons(x, z) 6= cons(y, z) .

After two applications of transformation 1, Phase 1 separates F into the Σcons-formula

Fcons : w1 = car(x) ∧ w2 = car(y) ∧ cons(x, z) 6= cons(y, z)

and the ΣZ-formula

FZ : w1 + w2 = z ,

with

V = shared(Fcons, FZ) = z, w1, w2 .

Consider the equivalence relation E given by the partition

z, w1, w2 .

The arrangement

α(V, E) : z 6= w1 ∧ z 6= w2 ∧ w1 6= w2

satisfies both Fcons and FZ: Fcons∧α(V, E) is Tcons-satisfiable, and FZ∧α(V, E)is TZ-satisfiable. Hence, F is (Tcons ∪ TZ)-satisfiable.

10.2.3 Practical Efficiency

Phase 2 is formulated as “guess and check”: first, guess an equivalence relationE, then check the induced arrangement. Unfortunately, the number of equiva-lence relations increases significantly with the number of shared variables. The


number of equivalence relations is given by the sequence of Bell numbers,which grows super-exponentially. For example, just 12 shared variables induceover four million equivalence relations. Hence, the guess-and-check method isimpractical.

However, there is no need to guess the entire equivalence relation at once;instead, construct it incrementally, as the following example illustrates:

Example 10.9. In Example 10.6, Phase 1 separates the (ΣE ∪ΣZ)-formula

F : f(x) = x + y ∧ x ≤ y + z ∧ x + z ≤ y ∧ y = 1 ∧ f(x) 6= f(2)

into ΣZ-formula

FZ : w1 = x + y ∧ x ≤ y + z ∧ x + z ≤ y ∧ y = 1 ∧ w2 = 2

and ΣE-formula

FE : w1 = f(x) ∧ f(x) 6= f(w2)

Then


We attempt to construct an arrangement.

1. Suppose x = w1. But then w1 = x + y of FZ implies that y = 0, yet FZ

asserts that y = 1. Hence, x 6= w1.2. FZ ∧ x 6= w1 and FE ∧ x 6= w1 are TZ- and TE-satisfiable, respectively.3. Suppose x = w2. But f(x) 6= f(w2) of FE contradicts this supposition.

Hence, x 6= w2.4. FZ ∧ x 6= w1 ∧ x 6= w2 and FE ∧ x 6= w1 ∧ x 6= w2 are TZ- and

TE-satisfiable, respectively.5. Suppose w1 = w2. No contradiction exists.

We discovered the arrangement

x 6= w1 ∧ x 6= w2 ∧ w1 = w2 ,

so F is (TE ∪ TZ)-satisfiable.

Readers interested in implementing a simple Nelson-Oppen-based decisionprocedure could consider this incremental-construction “optimization” of thenondeterministic method. However, in practice, implementations are based onthe deterministic method described in the next section.


10.3 Nelson-Oppen Method: Deterministic Version

Phase 1 of the deterministic version is the same as in the nondeterministicversion.

Phase 2 of the nondeterministic method (both the guess-and-check methodand the optimized incremental construction) proposes a set of equalities anddisequalities and then lets each decision procedure Pi check the set with thecorresponding formula Fi. In contrast, Phase 2 of the deterministic versionasks the decision procedures P1 and P2 to propagate information in the formof new equalities.

A convex theory is particularly well-suited for propagating equalities.Section 10.3.1 discusses convex theories. Then Section 10.3.2 presents thedeterministic Nelson-Oppen method.

10.3.1 Convex Theories

If a conjunctive formula in a convex theory implies a disjunction of equalitiesbetween variables, then it actually implies a single equality. Formally, considera quantifier-free conjunctive Σ-formula F and a disjunction

G :

n∨

i=1

ui = vi , (10.1)

for variables ui and vi. Theory T is convex if for every such F and G, if

F ⇒n∨

i=1

ui = vi

then

F ⇒ ui = vi for some i ∈ 1, . . . , n .

If F implies G, then F actually implies one of the disjuncts of G.Intuitively, F cannot be “covered” by any disjunction of equalities — no

matter how many — if no single equality covers F (F is covered by a formulaif F implies it). This intuition is especially apparent for vector spaces (Section8.2): a plane cannot be covered by a finite disjunction of lines; it cannot evenbe covered by a finite disjunction of other planes unless at least one of theplanes is the plane itself.

Example 10.10. The theory of integers TZ is not convex. For consider thequantifier-free conjunctive ΣZ-formula

F : 1 ≤ z ∧ z ≤ 2 ∧ u = 1 ∧ v = 2 .

Then

10.3 Nelson-Oppen Method: Deterministic Version 277

F ⇒ z = u ∨ z = v ,

but neither

F ⇒ z = u nor F ⇒ z = v .

Example 10.11. The theory of arrays TA is not convex. For consider thequantifier-free conjunctive ΣA-formula

F : a〈i ⊳ v〉[j] = v .

Then

F ⇒ i = j ∨ a[j] = v ,

but neither

F ⇒ i = j nor F ⇒ a[j] = v .

Example 10.12. ⋆ The theory of rationals TQ is convex, as it is convex in ageometric sense (see Chapter 8).

Each equality ui = vi of the disjunction G of (10.1) is geometrically convex,but G itself is not. Consider, for example,

H : x = y ∨ x = z .

Let SH be the set of points satisfying H . The point (x, y, z) = (0, 0, 1) isincluded in SH , as is the point (1, 0, 1). However, the average of the twopoints, (1

2 , 0, 1) (choosing λ = 12 ), is not in SH . Indeed, choose any two points

(u, u, v1) and (w, v2, w)

from Sx=y and Sx=z, respectively, such that neither is in their intersectionSx=y=z (i.e., v1 6= u and v2 6= w). Then for any λ ∈ (0, 1), the point

(λu + (1 − λ)w, λu + (1− λ)v2, λv1 + (1 − λ)w)

is neither in Sx=y nor in Sx=z.Suppose, then, that F ⇒ G :

∨ni=1 ui = vi, but for no i ∈ 1, . . . , n does

F ⇒ ui = vi. Then it must be the case that there are two points s1 and s2 ofSF in separate subsets Sui=vi

, Suj=vj, i 6= j, of SG. By the argument above,

the points on the line segment between s1 and s2 are not in SG and thus notin SF . Then F is not geometrically convex, a contradiction.

Thus, TQ is convex.

Exercise 10.5 asks the reader to prove that the theories TE and Tcons arealso convex.


10.3.2 Phase 2: Equality Propagation

Recall that the nondeterministic version guesses an equivalence relation Eover the shared variables V and checks that both F1∧α(V, E) is T1-satisfiableand F2∧α(V, E) is T2-satisfiable. If it finds a satisfying equivalence relation E,it declares that F is (T1 ∪ T2)-satisfiable. This method suffers from the enor-mous number of equivalence relations that are possible even over small setsof shared variables. In the deterministic version, a central manager asks thedecision procedures P1 and P2 to report any new implied equalities betweenshared variables. It then adds this new information to the already discoveredequalities and propagates it to the other decision procedure. This method isefficient.

In the context of already discovered equalities E , a decision procedure Pi

for a convex theory Ti discovers a new equality u = v, for shared variables uand v, when

Fi ∧ E ⇒ u = v .

The central manager then propagates this new equality to the other decisionprocedure.

If Tj is not convex, Pj discovers a new disjunction of equalities S when

Fj ∧ E ⇒∨

ui=vi ∈ S

(ui = vi) ,

for shared variables ui and vi. In this case, the central manager must split thedisjunction and search along multiple branches. Each branch assumes one ofthe disjuncts. The search along a branch ends either when a full arrangementis discovered (so the original formula is (T1 ∪ T2)-satisfiable; see below) orwhen all sub-branches end in contradiction (Ti-unsatisfiability for some i).In the latter case, the central manager tries another branch. If no branchesremain to try, then the central manager declares the original formula to be(T1 ∪ T2)-unsatisfiable.

If at some point, neither P1 nor P2 finds a new equality (or a disjunctionof equalities in the non-convex case), then the central manager concludes thatthe given formula is (T1 ∪ T2)-satisfiable. For if E is the set of all learnedequalities, S is the set of all possible remaining equalities, and

F1 ∧ E 6⇒∨

ui=vi ∈ S

(ui = vi) and F2 ∧ E 6⇒∨

ui=vi ∈ S

(ui = vi) ,

(which must hold when no new disjunctions of equalities are discovered), then

F1 ∧ E ∧∧

ui=vi ∈ S

(ui 6= vi) and F2 ∧ E ∧∧

ui=vi ∈ S

(ui 6= vi)

are T1-satisfiable and T2-satisfiable, respectively. Hence, the discovered ar-rangement is


α(V, E) = E ∧∧

ui=vi ∈ S

(ui 6= vi) ,

and F is (T1 ∪ T2)-satisfiable.

Example 10.13. Consider the (ΣE ∪ΣQ)-formula

F : f(f(x)− f(y)) 6= f(z) ∧ x ≤ y ∧ y + z ≤ x ∧ 0 ≤ z .

F is (TE∪TQ)-unsatisfiable: the final three literals imply that z = 0 and x = y,so that f(x) = f(y). But then from the first literal, f(0) 6= f(0) since bothf(x)− f(y) and z equal 0.

Phase 1 separates F into two formulae. According to transformation 1, itreplaces f(x) by u, f(y) by v, and u− v by w, resulting in ΣE-formula

FE : f(w) 6= f(z) ∧ u = f(x) ∧ v = f(y)

and ΣQ-formula

FQ : x ≤ y ∧ y + z ≤ x ∧ 0 ≤ z ∧ w = u− v ,

with

V = shared(FE, FQ) = x, y, z, u, v, w .

Recall that TE and TQ are convex theories. The decision procedure PQ forTQ discovers

FQ ⇒ x = y

from x ≤ y ∧ y + z ≤ x ∧ 0 ≤ z, so

E1 : x = y .

Then PE discovers the new congruence f(x) = f(y) from x = y, so that

FE ∧ E1 ⇒ u = v ,

yielding

E2 : x = y ∧ u = v .

But then

FQ ∧ E2 ⇒ z = w

since w = u− v = 0, according to u = v, and z = 0. Propagating this equalityback to PE via

E3 : x = y ∧ u = v ∧ z = w


FQ |= x = y

x = yFE ∧ x = y |= u = v

x = y, u = vFQ ∧ u = v |= z = w

x = y, u = v, z = wFE ∧ z = w |= ⊥

⊥

Fig. 10.1. Summary of Example 10.13

reveals the contradiction

FE ∧ E3 ⇒ ⊥ ;

in particular, z = w contradicts f(w) 6= f(z). Therefore, F is (TE ∪ TQ)-unsatisfiable.

Since both TE and TQ are convex, no case splitting was required.Figure 10.1 summarizes this argument. The left and right halves list deduc-

tions made in TQ and TE, respectively. The sets in the middle are the deducedsets of shared equalities. The deductions terminate with ⊥, indicating that Fis (TE ∪ TQ)-unsatisfiable.


F : 1 ≤ x ∧ x ≤ 2 ∧ f(x) 6= f(1) ∧ f(x) 6= f(2) .

While TE is convex, TZ is not. Thus, we should expect some case splits.According to transformation 1, Phase 1 replaces f(1) by f(w1) and f(2)

by f(w2), resulting in the ΣZ-formula

FZ : 1 ≤ x ∧ x ≤ 2 ∧ w1 = 1 ∧ w2 = 2

and the ΣE-formula

FE : f(x) 6= f(w1) ∧ f(x) 6= f(w2) ,

with


Immediately, PZ recognizes that

FZ ⇒ x = w1 ∨ x = w2 ,

since 1 ≤ x ≤ 2 implies that either x = 1 or x = 2. Hence, case split on thesetwo disjuncts. For the first case, propagate


⋆

x = w1 x = w2

⊥ ⊥

⋆ : FZ |= x = w1 ∨ x = w2

x = w1

FE ∧ x = w1 |= ⊥

x = w2

FE ∧ x = w2 |= ⊥


Ea1 : x = w1

to PE, which discovers that

FE ∧ Ea1 ⇒ ⊥ ,

as x = w1 contradicts f(x) 6= f(w1).For the second case,

Eb1 : x = w2 .

Again, PE discovers that

FE ∧ Eb1 ⇒ ⊥ ,

as x = w2 contradicts f(x) 6= f(w2).As all branches end in contradiction, F is (TE ∪ TZ)-unsatisfiable.Figure 10.2 summarizes this argument. Unlike in Example 10.13 and Fig-

ure 10.1, the nonconvexity of TZ causes the argument to branch along twopossibilities. Each branch ends in a contradiction.


F : 1 ≤ x ∧ x ≤ 3 ∧ f(x) 6= f(1) ∧ f(x) 6= f(3) ∧ f(1) 6= f(2) .

Applying transformation 1 of Phase 1 three times produces the ΣZ-formula

FZ : 1 ≤ x ∧ x ≤ 3 ∧ w1 = 1 ∧ w2 = 2 ∧ w3 = 3

and the ΣE-formula

FE : f(x) 6= f(w1) ∧ f(x) 6= f(w3) ∧ f(w1) 6= f(w2) ,

with


V = shared(FZ, FE) = x, w1, w2, w3 .

From 1 ≤ x ≤ 3, PZ discovers that

FZ ⇒ x = w1 ∨ x = w2 ∨ x = w3 .

Recall that TZ is not convex. On case

Ea1 : x = w1 ,

PE finds that

FE ∧ Ea1 ⇒ ⊥

because of f(x) 6= f(w1). On case

Eb1 : x = w2 ,

neither PZ nor PE discovers any contradiction or new equality. That is,

FZ ∧ Eb1 6⇒ x = w1 ∨ x = w3 ∨ w1 = w2 ∨ w1 = w3 ∨ w2 = w3

and

FE ∧ Eb1 6⇒ x = w1 ∨ x = w3 ∨ w1 = w2 ∨ w1 = w3 ∨ w2 = w3 ;

or, in other words,

FZ ∧ Eb1 ∧ x 6= w1 ∧ x 6= w3 ∧ w1 6= w2 ∧ w1 6= w3 ∧ w2 6= w3

is TZ-satisfiable, and

FE ∧ Eb1 ∧ x 6= w1 ∧ x 6= w3 ∧ w1 6= w2 ∧ w1 6= w3 ∧ w2 6= w3

is TE-satisfiable. Thus, F is (TE ∪ TZ)-satisfiable.Figure 10.3 summarizes this argument. The middle branch terminates with

a satisfying arrangement. We did not actually explore the right branch.

10.3.3 Equality Propagation: Implementation

Equality propagation can be implemented somewhat efficiently without mod-ifying the individual decision procedures. For convex theory Tj, test eachpossible equality ui = vi. Suppose that Fj is the Σj-formula constructed inPhase 1 and E is the conjunction of equalities discovered so far. Then checkif any equality ui = vi is implied:

Fj ∧ E ⇒ ui = vi .

Any implied equality should be propagated to the other theories.

10.4 ⋆Correctness of the Nelson-Oppen Method 283

⋆

x = w1 x = w2 x = w3

⊥ ⊥

⋆ : FZ |= x = w1 ∨ x = w2 ∨ x = w3

x = w1

FE ∧ x = w1 |= ⊥

x = w2x = w3

FE ∧ x = w3 |= ⊥


This procedure is not applicable to a non-convex theory Tk. A procedurefor a non-convex theory must be able to find disjunctions of equalities thatare implied by a Σk-formula Fk. Moreover, the disjunctions should be as smallas possible since the Nelson-Oppen method must branch on each disjunct. Adisjunction is minimal if it is implied by Fk and if each smaller disjunctionis not implied by Fk.

A simple procedure to find a minimal disjunction is based on the obser-vation that any disjunction that contains a minimal disjunction — which isimplied by Fk by definition — is also implied by Fk. Therefore, we can stripoff extra disjuncts one-by-one. First, consider the disjunction of all equalitiesat once. If it is not implied, then no subset is implied either, so we are done.Otherwise, drop each equality in turn: if the remaining disjunction is stillimplied by Fk, continue with this smaller disjunction; otherwise, restore theequality and continue. When all equalities have been considered, the result-ing disjunction is minimal. This procedure requires checking Tk-satisfiabilityO(|V |2) times, where V is the set of shared variables. Exercise 10.4 asks thereader to describe a procedure based on binary search that requires asymp-totically fewer satisfiability checks when the final disjunction is small relativeto the disjunction of all equalities.

10.4 ⋆Correctness of the Nelson-Oppen Method

In this section, we prove the correctness of the Nelson-Oppen combinationmethod. We reason at the level of arrangements, which is more suited to thenondeterministic version of the method. However, Section 10.3 shows how toconstruct an arrangement in the deterministic version, as well, so the followingproof can be extended to the deterministic version. We also focus on the secondphase of the nondeterministic procedure, which chooses an arrangement ifone exists. We thus assume that the variable abstraction phase is correct: it


produces formulae F1 and F2 such that F1 ∧ F2 is (T1 ∪ T2)-equivalent to thegiven (Σ1 ∪Σ2)-formula F .

A theory has equality (or is a theory with equality) if its signature in-cludes the binary predicate = and its axioms imply reflexivity, symmetry, andtransitivity of equality. The pure equality fragment of a theory with equalityis composed of formulae that are possibly quantified Boolean combinations ofequalities between variables.

Theorem 10.16 (Sound & Complete). Consider stably infinite theories T1

and T2 such that Σ1∩Σ2 = =. For conjunctive quantifier-free Σ1-formula F1

and conjunctive quantifier-free Σ2-formula F2, F1 ∧F2 is (T1 ∪ T2)-satisfiableiff there exists an arrangement K = α(shared(F1, F2), E) such that F1 ∧K isT1-satisfiable and F2 ∧K is T2-satisfiable.

Soundness is straightforward. Suppose that F1 ∧F2 is (T1 ∪ T2)-satisfiablewith satisfying (T1 ∪ T2)-interpretation I. Extract from I the equivalencerelation E such that the arrangement K = α(shared(F1, F2), E) is satisfied byI. Then F1∧K and F2∧K are both satisfied by I, which can be viewed as botha T1-interpretation and a T2-interpretation, so that they are T1-satisfiable andT2-satisfiable, respectively.

Completeness is more complicated. Let K = α(shared(F1, F2), E) be anarrangement such that F1∧K and F2∧K are T1-satisfiable and T2-satisfiable,respectively. Suppose that F1 ∧ F2 is (T1 ∪ T2)-unsatisfiable. We derive acontradiction.

The outline of the proof is the following. Because F1 ∧ F2 is (T1 ∪ T2)-unsatisfiable, we know that F1 implies ¬F2 in T1 ∪ T2. An adaptation ofthe Craig Interpolation Lemma (Theorem 2.38) tells us that there isa quantifier-free formula H such that F1 implies H over all infinite T1-interpretations (T1-interpretations with infinite domains) and F2 implies ¬Hover all infinite T2-interpretations: H interpolates between F1 and F2. We thenshow that the arrangement K implies H , which means that F2 implies ¬K overall infinite T2-interpretations. In other words, no infinite T2-interpretation sat-isfies F2∧K. Yet if T2 is stably infinite and F2∧K is T2-satisfiable as assumed,then F2 ∧K is satisfied by some infinite T2-interpretation, a contradiction.

We now present the details of the proof. First, because we are consideringonly stably infinite theories, we need only consider interpretations with infinitedomains. For we can extend a T1- or T2-interpretation with a finite domainto a T1- or T2-interpretation with an infinite domain. Therefore, define⇒∗ asa weaker form of implication: F ⇒∗ G iff G is true on every interpretation Ithat has an infinite domain and that satisfies F . Similarly, weaken ⇔ to ⇔∗.If F ⇒∗ G, we say that F weakly implies G; if F ⇔∗ G, we say that F isweakly equivalent to G.

Recall from Section 2.7.4 the following theorem.

Theorem 10.17 (Compactness Theorem). A countable set of first-orderformulae S is simultaneously satisfiable iff the conjunction of every finite sub-set is satisfiable.

10.4 ⋆Correctness of the Nelson-Oppen Method 285

Since F1 ∧ F2 is (T1 ∪ T2)-unsatisfiable, the Compactness Theorem tellsus that there exist a conjunction S1 of a finite subset of axioms of T1 and aconjunction S2 of a finite subset of axioms of T2 such that S1 ∧F1 ∧S2∧F2 is(first-order) unsatisfiable. Choose S1 and S2 to include the axioms that implyreflexivity, symmetry, and transitivity of equality. Then, rearranging, we havethat

S1 ∧ F1 ⇒ ¬S2 ∨ ¬F2 . (10.2)

Recall from Section 2.7.4 the following theorem.

Theorem 10.18 (Craig Interpolation Lemma). If F1 ⇒ F2, then thereexists a formula H such that F1 ⇒ H, H ⇒ F2, and each free variable,function symbol, and predicate symbol of H appears in F1 and F2.

Hence, from implication (10.2), there exists an interpolant H ′ such thatfree(H ′) = shared(F1, F2) and

S1 ∧ F1 ⇒ H ′ and S2 ∧ H ′ ⇒ ¬F2 .

The latter implication is derived by rearranging H ′ ⇒ ¬S2 ∨¬F2. Because =is the only predicate or function shared between S1∧F1 and S2∧F2, H ′ is of aspecial form: its atoms are equalities between variables of shared(F1, F2). How-ever, H ′ may have quantifiers. We prove next that in fact a “weak” quantifier-free interpolant H exists.

Lemma 10.19 (Weak Quantifier Elimination for Pure Equality). Con-sider any stably infinite theory T with equality. For each pure equality formulaF , there exists a quantifier-free pure equality formula F ′ such that F is weaklyT -equivalent to F ′.

Proof. Consider pure equality formula ∃x. G[x, y], where G is quantifier-freewith free variables x and y. Define

G0 : Gx = x 7→ true, x = y1 7→ false, . . . , x = yn 7→ false

and, for i ∈ 1, . . . , n,

Gi : Gx 7→ yi .

We claim that ∃x. G is weakly T -equivalent to

G′ : G0 ∨ G1 ∨ · · · ∨ Gn .

For G′ asserts that x is either equal to some free variable yi or not. Becausewe consider only interpretations with infinite domains, it is always possiblefor x not to equal any yi.

By Section 7.1, we have a weak quantifier elimination procedure over thepure equality fragment of T . It is weak because equivalence is only guaranteedto hold on infinite interpretations.


Example 10.20. Consider the pure equality formula

F : x 6= y ∧ (∀z. z = x ∨ z = y) .

For eliminating z, consider the negation of the second conjunct,

G : ∃z. z 6= x ∧ z 6= y ,

for which we have

G0 : ¬⊥ ∧ ¬⊥ ⇔ ⊤

and

Gx : x 6= x ∧ x 6= y ⇔ ⊥ Gy : y 6= x ∧ y 6= y ⇔ ⊥ .

Then

G′ : G0 ∨ Gx ∨ Gy ⇔ ⊤ .

Substituting into F , we have

x 6= y ∧ ¬(⊤) ⇔ ⊥ .

Hence, over infinite interpretations satisfying the axioms of equality, F isequivalent to ⊥.

However, in an interpretation with a two-element domain that satisfies theequality axioms, F is not equivalent to ⊥, but rather to x 6= y. For if x 6= yon such an interpretation, then every element is equal either to x or to y.

Continuing the main theorem, we claim that there exists a quantifier-freepure equality formula H over shared(F1, F2) such that

S1 ∧ F1 ⇒∗ H and S2 ∧ H ⇒∗ ¬F2 .

For by Lemma 10.19, a quantifier-free pure equality formula H exists suchthat H is weakly equivalent to the Craig interpolant H ′ in any stably infinitetheory with equality.

For the next step, recall from the beginning of the proof that F1 ∧ K isT1-satisfiable and F2 ∧K is T2-satisfiable, where K = α(shared(F1, F2), E) isan arrangement. We thus know that

S1 ∧ F1 ∧ K and S2 ∧ F2 ∧ K

are (first-order) satisfiable. Moreover, as T1 and T2 are stably infinite, each ofthese formulae has an interpretation with an infinite domain.

Now, K is a conjunction of equalities and disequalities between pairs ofvariables of shared(F1, F2). Moreover, by the definition of an arrangement, Kis as strong as possible: no additional equality literals L over shared(F1, F2)

10.5 ⋆Complexity 287

can be added to K without either K and K ∧ L being equivalent in a theorywith equality or K ∧L being unsatisfiable in a theory with equality. Based onthis observation, construct the formula K ′ by conjoining additional equalityliterals: for each pair of variables u, v ∈ shared(F1, F2), conjoin either u = vor u 6= v, depending on which maintains the satisfiability of K ′ in a theorywith equality. Now, since S1 ∧ F1 ∧K is satisfiable, then so is S1 ∧ F1 ∧K ′,indeed by the same interpretations.

We claim that the DNF representation of H must include K ′ or a (con-junctive) subformula of K ′ as a disjunct. Suppose not; then every disjunctof the DNF representation of H contradicts the satisfying interpretations ofS1∧F1 ∧K ′, of which at least one exists. Therefore, K ′ ⇒ H , and — becauseK and K ′ are equivalent in a theory with equality — K ⇒ H . In other words,the discovered arrangement K is a special case of the weak interpolant H .

To finish, we have

S2 ∧ H ⇒∗ ¬F2 ,

or, rearranging,

S2 ∧ F2 ⇒∗ ¬H .

From K ⇒ H , we have ¬H ⇒ ¬K, so

S2 ∧ F2 ⇒∗ ¬K .

But this weak implication contradicts that S2 ∧ F2 ∧ K is satisfied by someinfinite interpretation. Thus, F1 ∧ F2 is actually (T1 ∪ T2)-satisfiable, and theNelson-Oppen method is correct.

10.5 ⋆Complexity

Assume that T1 and T2 are stably infinite theories such that Σ1 ∩Σ = =.Also, they have decision procedures P1 and P2 for their respective conjunctivequantifier-free fragments.

Theorem 10.21 (Complexity: Convex Theories). If convex theories T1

and T2 have PTIME decision procedures P1 and P2, then the Nelson-Oppencombination based on equality propagation is a PTIME decision procedure forthe conjunctive quantifier-free fragment of T1 ∪ T2.

Theorem 10.22 (Complexity: Non-Convex Theories). If T1 and T2

have NPTIME decision procedures P1 and P2, then the Nelson-Oppen com-bination based on equality propagation is an NPTIME decision procedure forthe conjunctive quantifier-free fragment of T1 ∪ T2.


10.6 Summary

Combining decision procedures in a general and efficient manner is crucialfor most applications. This chapter covers the Nelson-Oppen combinationmethod, in particular:

• The nondeterministic Nelson-Oppen method. Three requirements: the the-ories only share =; the theories are stably infinite; and the considered for-mula is quantifier-free. Variable abstraction, separation into theory-specificformulae. Shared variables, equivalence relations over shared variables, ar-rangements.

• The deterministic Nelson-Oppen method. Convex theories. Equality prop-agation.

• Correctness of the Nelson-Oppen method, which follows from the CraigInterpolation Lemma of Chapter 2.

• Complexity. When the individual decision procedures are convex and runin polynomial time, the combination procedure runs in polynomial time.

The Nelson-Oppen combination method provides a general means of reasoningsimultaneously about the theories studied in this book using the individualdecision procedures. Being able to reason in union theories is crucial. Forexample, almost all of the verification conditions of Chapters 5 and 6 areexpressed in multiple signatures.


Nelson and Oppen describe the Nelson-Oppen combination method [65]. Theiroriginal proof of correctness was flawed; Oppen presents a corrected proof in[70], and Nelson presents a corrected proof in [64]. Oppen also proves in [70]the complexity results that we state. Tinelli and Harandi present an alter-nate proof of correctness in [92]. Our correctness proof derives from that ofNelson and Oppen. See [56] for another presentation of the method and itscorrectness.

Another general combination method that has received much attention isthat of Shostak [84]. See the work of Ruess and Shankar [78] for a correctpresentation of the method.

Exercises

10.1 (DP for combinations). For each of the following formulae, identifythe combination of theories in which it lies. To avoid ambiguity, prefer TZ toTQ. Then apply the N-O method using the appropriate decision procedures.Use either the nondeterministic or deterministic version. Provide a level ofdetail as in the examples of the chapter.

Exercises 289

(a) 1 ≤ x ∧ x ≤ 2 ∧ cons(1, y) 6= cons(x, y) ∧ cons(2, y) 6= cons(x, y)(b) a[i] ≥ 1 ∧ a[i] + x ≤ 2 ∧ x > 0 ∧ x = i ∧ a〈x ⊳ 2〉[i] 6= 1

10.2 (Deterministic N-O). Apply the deterministic N-O method to thefollowing formulae. Prefer TQ to TZ.

(a) 1 ≤ x ∧ x ≤ 2 ∧ cons(1, y) 6= cons(x, y) ∧ cons(2, y) 6= cons(x, y)(b) x + y = z ∧ f(z) = z ∧ f(x + y) 6= z(c) g(x + y, z) = f(g(x, y)) ∧ x + z = y ∧ z ≥ 0 ∧ x ≥ y

∧ g(x, x) = z ∧ f(z) 6= g(2x, 0)

10.3 (⋆Equality propagation in TE). Section 10.3.3 explains general tech-niques for propagating equalities. However, some decision procedures are eas-ily modified to propagate new equalities. Describe such a modification of thecongruence closure algorithm of Chapter 9.

10.4 (⋆Equality propagation). Consider conjunctive Σ-formula F of non-convex theory T and the disjunction of equalities

G :

n∨

i=1

ui = vi

such that F ⇒ G. Describe a procedure based on binary search that discoversa minimal disjunction G′ of the equalities of G that is implied by F . If theprocedure returns a disjunction with m equalities, then it should have invokedthe decision procedure for T at most O(m lg n) times. Hint: The solution isrelated to the solution of Exercise 8.1(e).

10.5 (⋆Convex theories). Prove that the following theories are convex:

(a) TE

(b) Tcons

10.6 (⋆Complexity). Prove the complexity results about the N-O method.

(a) Theorem 10.21.(b) Theorem 10.22.

11

Arrays

So let us let our ordinals start at zero: an element’s ordinal (subscript)equals the number of elements preceding it in the sequence. And themoral of the story is that we had better regard — after all those cen-turies! — zero as a most natural number.

— Edsger W. DijkstraEWD831: Why Numbering Should Start at Zero, 1982

Programs rely on data structures to store and manipulate data in com-plex ways. Therefore, verifying these programs requires decision proceduresthat reason about the data structures. This chapter discusses fragments anddecision procedures for reasoning about arrays and array-like data structures:arrays with anonymous or uninterpreted indices (Section 11.1), the familiararrays of C and other imperative languages with integer indices (Section 11.2),and hashtables (Section 11.3).

Unlike the quantifier-free fragment of arrays presented in Chapter 9, thefragments of this chapter allow some quantification. Specifically, at most onequantifier alternation is allowed: for satisfiability, formulae may have uni-versal quantifiers in addition to the implicit existential ones. The techniquethat underlies the decision procedures of this chapter is quantifier instan-tiation. A satisfiability decision procedure based on quantifier instantiationidentifies a finite set of terms suggested by the given formula F ; it then re-places universal quantification with finite conjunction over the set of terms.The resulting formula F ′ is quantifier-free and equisatisfiable to the given one.Finally, it applies to F ′ a satisfiability decision procedure for the quantifier-free fragment (for example, a Nelson-Oppen combination of procedures; seeChapter 10); the answer applies also to F .

Section 11.1 studies the array property fragment of TA in which one(restricted) quantifier alternation is allowed. Section 11.2 examines the fa-miliar case in which arrays are indexed by integers. Reasoning about integerindices allows making relative comparisons via ≤, providing more power thanthe case in which indices are only compared via equality.

292 11 Arrays

Hashtables are another important data type. They are similar to arrayswith uninterpreted indices in that their indices, or keys, can only be comparedvia equality. However, hashtables allow two new interesting operations: first, akey/value pair can be removed; and second, a hashtable’s domain — its set ofkeys that it maps to values — can be read. Section 11.3 formalizes reasoningabout hashtables in the theory TH and then presents a decision procedurefor the hashtable property fragment of TH. The procedure operates bytransforming ΣH-formulae to ΣA-formulae in the array property fragment suchthat the original formula is TH-satisfiable iff the constructed formula is TA-satisfiable.

11.1 Arrays with Uninterpreted Indices

The quantifier-free fragment of TA enables basic reasoning in the presence ofarrays. For verification purposes, it allows verifying properties of individualelements but not of entire arrays. However, in practice, it is useful to be able toreason about properties such as equality between arrays. Using combinationsof theories (see Chapter 10), one would also like to reason about propertiessuch as that all integer elements of an array are positive.

11.1.1 Array Property Fragment

In this section, we define a decidable fragment of TA that allows some quan-tification. This fragment is called the array property fragment because itallows specifying basic properties of arrays, not just properties of array el-ements. The principal characteristic of the array property fragment is thatarray indices can be universally quantified with some restrictions.

Example 11.1. In the ΣA-formula

∀j. a〈i ⊳ v〉[j] = a[j] ∧ a[i] 6= v ,

the first conjunct asserts that a〈i ⊳ v〉 and a are equal. This formula is TA-unsatisfiable.

Unfortunately, the use of universal quantification must be restricted toavoid undecidability (see Section 11.4 for further discussion). An array prop-erty is a ΣA-formula of the form

∀i. F [i] → G[i]

in which i is a list of variables, and F [i] and G[i] are the index guard andthe value constraint, respectively. The index guard F [i] is any ΣA-formulathat is syntactically constructed according to the following grammar:

11.1 Arrays with Uninterpreted Indices 293

iguard → iguard ∧ iguard | iguard ∨ iguard | atomatom → var = var | evar 6= var | var 6= evar | ⊤

var → evar | uvar

where uvar is any universally quantified index variable, and evar is any con-stant or unquantified (that is, implicitly existentially quantified) variable.

Additionally, a universally quantified index can occur in a value constraintG[i] only in a read a[i], where a is an array term. The read cannot be nested;for example, a[b[i]] is not allowed.

The array property fragment of TA then consists of formulae that areBoolean combinations of quantifier-free ΣA-formulae and array properties.

Example 11.2. The antecedent of the implication in the ΣA-formula

F : ∀i. i 6= a[k] → a[i] = a[k]

is not a legal index guard since a[k] is not a variable (neither a uvar nor anevar ); however, a simple manipulation makes it conform:

F ′ : v = a[k] ∧ ∀i. i 6= v → a[i] = a[k]

Here, i 6= v is a legal index guard, and a[i] = a[k] is a legal value constraint.F and F ′ are equisatisfiable.

However, no amount of manipulation can make the following formula con-form:

G : ∀i. i 6= a[i] → a[i] = a[k] .

Thus, G is not in the array property fragment.

Example 11.3. The array property fragment allows expressing equality be-tween arrays, a property referred to as extensionality: two arrays are equalprecisely when their corresponding elements are equal. For given formula

F : · · · ∧ a = b ∧ · · ·with array terms a and b, rewrite F as

F ′ : · · · ∧ (∀i. a[i] = b[i]) ∧ · · · .

F and F ′ are equisatisfiable. Moreover, the index guard in the literal is just⊤, and the value constraint a[i] = b[i] obeys the requirement that i appearonly as an index in read terms.

Recall that the theory of arrays with extensionality T =A (see Section 3.6)

augments TA with the following axiom:

∀a, b. (∀i. a[i] = b[i]) ↔ a = b (extensionality)

The universal quantifier of the array property fragment allows expressingequality between arrays directly.

Subsequently, where convenient, we write equality a = b between arraysto abbreviate ∀i. a[i] = b[i].

294 11 Arrays

Example 11.4. Reasoning about arrays is most useful when we can say some-thing interesting about their elements. Suppose array elements are interpretedin some theory T with signature Σ. Then we can assert that all elements ofan array have some property F [x], where F is a quantifier-free Σ-formula:∀i. F [a[i]]; or that all but a finite number of elements have some propertyF [x]:

∀i.(

n∧

k=1

i 6= jk

)→ F [a[i]] .

11.1.2 Decision Procedure

The idea of the decision procedure for the array property fragment is to reduceuniversal quantification to finite conjunction. It constructs a finite set of indexterms such that examining only these positions of the arrays is sufficient todecide satisfiability.


F : a〈i ⊳ v〉 = a ∧ a[i] 6= v ,

which expands to

F ′ : ∀j. a〈i ⊳ v〉[j] = a[j] ∧ a[i] 6= v .

Intuitively, to determine that F ′ is TA-unsatisfiable requires merely examiningindex i:

F ′′ :

∧

j∈i

a〈i ⊳ v〉[j] = a[j]

∧ a[i] 6= v ,

or simply

a〈i ⊳ v〉[i] = a[i] ∧ a[i] 6= v .

Simplifying,

v = a[i] ∧ a[i] 6= v ,

it is clear that this formula, and thus F , is TA-unsatisfiable.

Given array property formula F , decide its TA-satisfiability as follows.

Step 1

Put F in NNF.


Step 2

Apply the following rule exhaustively to remove writes:

F [a〈i ⊳ v〉]F [a′] ∧ a′[i] = v ∧ (∀j. j 6= i → a[j] = a′[j])

for fresh a′ (write)

Rules should be read from top to bottom. For example, this rule states thatgiven a formula F containing an occurrence of a write term a〈i⊳v〉, substituteevery occurrence of a〈i ⊳ v〉 with a fresh variable a′ and conjoin several newconjuncts.

This step deconstructs write terms in a straightforward manner, essen-tially encoding the (read-over-write) axioms into the new formula. After anapplication of the rule, the resulting formula contains at least one fewer writeterms than the given formula.

Step 3

Apply the following rule exhaustively to remove existential quantification:

F [∃i. G[i]]

F [G[j]]for fresh j (exists)

Existential quantification can arise during Step 1 if the given formula has anegated array property.

Step 4

Steps 4-6 accomplish the reduction of universal quantification to finite con-junction. The main idea is to select a set of symbolic index terms on whichto instantiate all universal quantifiers. The proof of Theorem 11.7 argues thatthe following set is sufficient for correctness.

From the output F3 of Step 3, construct the index set I:

I =λ∪ t : ·[t] ∈ F3 such that t is not a universally quantified variable∪ t : t occurs as an evar in the parsing of index guards

Recall that evar is any constant or unquantified variable. This index set is thefinite set of indices that need to be examined. It includes all terms t that occurin some read a[t] anywhere in F (unless it is a universally quantified variable)and all terms t that are compared to a universally quantified variable in someindex guard. λ is a fresh constant that represents all other index positionsthat are not explicitly in I.

296 11 Arrays

Step 5

Apply the following rule exhaustively to remove universal quantification:

H [∀i. F [i] → G[i]]

H

∧

i∈In

(F [i] → G[i]

)

(forall)

where n is the size of the list of quantified variables i. This is the key step.It replaces universal quantification with finite conjunction over the index set.The notation i ∈ In means that the variables i range over all n-tuples ofterms in I.

Step 6

From the output F5 of Step 5, construct

F6 : F5 ∧∧

i ∈ I\λ

λ 6= i .

The new conjuncts assert that the variable λ introduced in Step 4 is indeedunique: it does not equal any other index mentioned in F5.

Step 7

Decide the TA-satisfiability of F6 using the decision procedure for the quantifier-free fragment.

Suppose array elements are interpreted in some theory T with signature Σ.For deciding the (TA∪T )-satisfiability of an array property (ΣA∪Σ)-formula,use a combination decision procedure for the quantifier-free fragment of TA∪Tin Step 7. Thus, this procedure is a decision procedure precisely when thequantifier-free fragment of TA ∪ T is decidable. Chapter 10 discusses decidingsatisfiability in combinations of quantifier-free fragments of theories.

Example 11.6. Consider the array property formula

F : a〈ℓ ⊳ v〉[k] = b[k] ∧ b[k] 6= v ∧ a[k] = v ∧ (∀i. i 6= ℓ → a[i] = b[i]) .

It contains one array property,

∀i. i 6= ℓ → a[i] = b[i] ,

in which the index guard is i 6= ℓ and the value constraint is a[i] = b[i]. It isalready in NNF. According to Step 2, rewrite F as


F2 : a′[k] = b[k] ∧ b[k] 6= v ∧ a[k] = v ∧ (∀i. i 6= ℓ → a[i] = b[i])∧ a′[ℓ] = v ∧ (∀j. j 6= ℓ → a[j] = a′[j]) .

F2 does not contain any existential quantifiers. Its index set is

I = λ ∪ k ∪ ℓ= λ, k, ℓ .

According to Step 5, replace universal quantification as follows:

F5 : a′[k] = b[k] ∧ b[k] 6= v ∧ a[k] = v ∧∧

i ∈ I

(i 6= ℓ → a[i] = b[i])

∧ a′[ℓ] = v ∧∧

j ∈ I

(j 6= ℓ → a[j] = a′[j]) .

Expanding produces

F ′5 : a′[k] = b[k] ∧ b[k] 6= v ∧ a[k] = v ∧ (λ 6= ℓ → a[λ] = b[λ])

∧ (k 6= ℓ → a[k] = b[k]) ∧ (ℓ 6= ℓ → a[ℓ] = b[ℓ])∧ a′[ℓ] = v ∧ (λ 6= ℓ → a[λ] = a′[λ])∧ (k 6= ℓ → a[k] = a′[k]) ∧ (ℓ 6= ℓ → a[ℓ] = a′[ℓ]) .

Simplifying produces

F ′′5 : a′[k] = b[k] ∧ b[k] 6= v ∧ a[k] = v ∧ (λ 6= ℓ → a[λ] = b[λ])

∧ (k 6= ℓ → a[k] = b[k])∧ a′[ℓ] = v ∧ (λ 6= ℓ → a[λ] = a′[λ])∧ (k 6= ℓ → a[k] = a′[k]) .

Step 6 distinguishes λ from other members of I:

F6 : a′[k] = b[k] ∧ b[k] 6= v ∧ a[k] = v ∧ (λ 6= ℓ → a[λ] = b[λ])∧ (k 6= ℓ → a[k] = b[k])∧ a′[ℓ] = v ∧ (λ 6= ℓ → a[λ] = a′[λ])∧ (k 6= ℓ → a[k] = a′[k])∧ λ 6= k ∧ λ 6= ℓ .

Simplifying, we have

F ′6 : a′[k] = b[k] ∧ b[k] 6= v ∧ a[k] = v

∧ a[λ] = b[λ] ∧ (k 6= ℓ → a[k] = b[k])∧ a′[ℓ] = v ∧ a[λ] = a′[λ] ∧ (k 6= ℓ → a[k] = a′[k])∧ λ 6= k ∧ λ 6= ℓ .

There are two cases to consider. If k = ℓ, then a′[ℓ] = v and a′[k] = b[k]imply b[k] = v, yet b[k] 6= v. If k 6= ℓ, then a[k] = v and a[k] = b[k] implyb[k] = v, but again b[k] 6= v. Hence, F ′

6 is TA-unsatisfiable, indicating that Fis TA-unsatisfiable.

Verify that the array decision procedure of Section 9.5 reaches the sameconclusion for F ′

6.

298 11 Arrays

Theorem 11.7 (Sound & Complete). Consider (ΣA ∪Σ)-formula F fromthe array property fragment of TA ∪ T . The output F6 of Step 6 is (TA ∪ T )-equisatisfiable to F .

Proof. Inspection proves the equivalence between the input and the output ofSteps 1-3. The crux of the proof is that the index set constructed in Step 4 issufficient for producing a (TA ∪ T )-equisatisfiable quantifier-free formula.

That satisfiability of F implies the satisfiability of F6 is straightforward:Step 5 weakens universal quantification to finite conjunction. Moreover, thenew conjuncts of Step 6 do not affect the satisfiability of F6, as λ is a freshconstant.

The opposite direction is more complicated. Assume that (TA ∪ T )-interpretation I is such that I |= F6. We construct a (TA ∪ T )-interpretationJ such that J |= F .

First, define a projection function that maps ΣA-terms to terms of theindex set I:

projI : ΣA-terms → I .

In particular, if αI [i] = vi for i ∈ I and αI [λ] = vλ, then

projI(t) =

i if αI [t] = vi for some i ∈ Iλ otherwise

Extend projI to vectors of variables:

projI(i) = (projI(i1), . . . , projI(in)) .

Define J to be like I except for its arrays. Under J , let a[i] = a[projI(i)].Technically, we are specifying how αJ assigns values to terms of F and thearray read function ·[·]; however, we can think in terms of arrays.

To prove that J |= F , we focus on a particular subformula ∀i. F [i] → G[i].Assume that

I |=∧

i∈In

(F [i] → G[i]

);

then also

J |=∧

i∈In

(F [i] → G[i]

)(11.1)

by construction of J . We need to prove that

J |= ∀i. F [i] → G[i] ; (11.2)

that is, that

J ⊳ i 7→ v |= F [i] → G[i]

11.2 Integer-Indexed Arrays 299

for all v ∈ DnJ . Let K = J ⊳ i 7→ v.

To do so, we prove the two implications represented by dashed arrows inthe following diagram:

F [projK(i)] G[projK(i)]

K |=

F [i] G[i]

(1) (2)

?

The top implication holds under K by assumption (11.1). If both implications(1) and (2) hold under K, then the transitivity of implication implies that thebottom implication holds under K as well.

For (1), we apply structural induction to the index guard F [i]. Atoms havethe form i = e, i 6= e, e 6= i, and i = j, for universally quantified variables iand j and term e without universally quantified variables. When such a literalis true under K, then so is the corresponding literal projK(i) = e, projK(i) 6= e,e 6= projK(i), or projK(i) = projK(j), respectively, by definition of projK . Forexample, if K |= i 6= e, then αK [i] is either equal to some αK [j] for somej ∈ I such that αK [j] 6= αK [e] or equal to some other value v. In the lattercase, projK(i) = λ; that λ 6= e is asserted in F6 implies that projK(i) 6= e.

For the inductive case, observe that conjunction and disjunction are mono-tonic on truth-values: each can only become “more true” as its argumentsswitch from false to true. Thus, (1) holds.

For (2), just note that αK [a[i]] = αK [a[projK(i)]] by the construction of J .The bottom implication thus holds under each variant K of J , so (11.2)

holds, completing the proof.

Theorem 11.8 (⋆Complexity). Suppose T -satisfiability is in NP. For sub-fragments of the array property fragment in which formulae have bounded-sizeblocks of quantifiers, (TA ∪ T )-satisfiability is NP-complete.

NP-hardness follows from Theorem 9.23. That the problem is in NP followseasily from the procedure: instantiating a block of n universal quantifiersquantifying subformula G over index set I produces |I|n new subformulae,each of length polynomial in the length of G. Hence, the output of Step 6 isof length only a polynomial factor greater than the input to the procedure forfixed n.

11.2 Integer-Indexed Arrays

Software engineers usually think of arrays as integer-indexed segments ofmemory. Reasoning about indices as integers provides the power of comparisonvia ≤, which enables reasoning about subarrays and properties such as that a(sub)array is sorted or partitioned. In particular, reasoning about subarrays

300 11 Arrays

is essential for reasoning about programs that incrementally construct or ma-nipulate arrays. See, for example, the programs BubbleSort and QuickSort ofChapters 5 and 6.

The theory of integer-indexed arrays T ZA augments the signature of

TA with the signature of TZ. It includes the axioms of both TA and TZ.

11.2.1 Array Property Fragment

As in Section 11.1, we are interested in the array property fragment ofT Z

A . An array property is again a ΣZA-formula of the form

∀i. F [i] → G[i] ,

where i is a list of integer variables, and F [i] and G[i] are the index guard andthe value constraint, respectively. The form of an index guard is constrainedaccording to the following grammar:

iguard → iguard ∧ iguard | iguard ∨ iguard | atomatom → expr ≤ expr | expr = exprexpr → uvar | pexpr

pexpr → pexpr′

pexpr′ → Z | Z · evar | pexpr′ + pexpr′

where uvar is any universally quantified integer variable, and evar is anyexistentially quantified or free integer variable.

The form of a value constraint is also constrained. Any occurrence of aquantified index variable i must be as a read into an array, a[i], for array terma. Array reads may not be nested; e.g., a[b[i]] is not allowed. Section 11.4explains the need for these restrictions.

The array property fragment of T ZA then consists of formulae that are

Boolean combinations of quantifier-free ΣZA-formulae and array properties.

Example 11.9. As in the basic arrays of Section 11.1, reasoning about arraysis most useful when we can say something interesting about their elements.Suppose array elements are interpreted in some theory T with signature Σ.Now that both indices and elements can be interpreted in theories, we listseveral interesting forms of properties and their definitions for various elementtheories.

• Array equality a = b in TA:

∀i. a[i] = b[i]

• Bounded array equality beq(a, b, ℓ, u) in T ZA :

∀i. ℓ ≤ i ≤ u → a[i] = b[i]

• Universal properties F [x] in TA ∪ T , where F [x] is a Σ-formula:

∀i. F [a[i]]

• Bounded universal properties F [x] in T ZA ∪ T , where F [x] is a Σ-formula:

∀i. ℓ ≤ i ≤ u → F [a[i]]

• Bounded and unbounded sorted arrays sorted(a, ℓ, u) in T ZA ∪TZ or T Z

A ∪TQ:

∀i, j. ℓ ≤ i ≤ j ≤ u → a[i] ≤ a[j]

• Partitioned arrays partitioned(a, ℓ1, u1, ℓ2, u2) in T ZA ∪ TZ or T Z

A ∪ TQ:

∀i, j, ℓ1 ≤ i ≤ u1 < ℓ2 ≤ j ≤ u2 → a[i] ≤ a[j]

The last two predicates are necessary for reasoning about sorting algorithms,while the first four forms of properties are useful in general. For example,bounded equality is essential for summarizing the effects of a function on anarray — in particular, what parts of the array are unchanged.


As in Section 11.1, the idea of the decision procedure is to reduce universalquantification to finite conjunction. Given F from the array property fragmentof T Z

A , decide its T ZA -satisfiability as follows:

Step 1

Put F in NNF.

Step 2

Apply the following rule exhaustively to remove writes:

F [a〈i ⊳ e〉]F [a′] ∧ a′[i] = e ∧ (∀j. j 6= i → a[j] = a′[j])

for fresh a′ (write)

To meet the syntactic requirements on an index guard, rewrite the third con-junct as

∀j. j ≤ i− 1 ∨ i + 1 ≤ j → a[j] = a′[j] .

Step 3

Apply the following rule exhaustively to remove existential quantification:

F [∃i. G[i]]

F [G[j]]for fresh j (exists)

Existential quantification can arise during Step 1 if the given formula has anegated array property.

302 11 Arrays

Step 4

From the output of Step 3, F3, construct the index set I:

I =t : ·[t] ∈ F3 such that t is not a universally quantified variable∪ t : t occurs as a pexpr in the parsing of index guards

If I = ∅, then let I = 0. The index set contains all relevant symbolic indicesthat occur in F3.

Step 5

Apply the following rule exhaustively to remove universal quantification:

H [∀i. F [i] → G[i]]

H

∧

i∈In

(F [i] → G[i]

)

(forall)

n is the size of the block of universal quantifiers over i.

Step 6

F5 is quantifier-free. Decide the (TA∪TZ)-satisfiability of the resulting formula.

Suppose array elements are interpreted in some theory T with signatureΣ. For deciding the (T Z

A ∪ T )-satisfiability of an array property (ΣZA ∪ Σ)-

formula, use a combination decision procedure for the quantifier-free fragmentof TA∪TZ∪T in Step 6. Thus, this procedure is a decision procedure preciselywhen the quantifier-free fragment of TA ∪ TZ ∪ T is decidable. Chapter 10discusses deciding satisfiability in combinations of quantifier-free fragments oftheories.

Example 11.10. Consider the following ΣZA-formula:

F : (∀i. ℓ ≤ i ≤ u → a[i] = b[i])∧ ¬(∀i. ℓ ≤ i ≤ u + 1 → a〈u + 1 ⊳ b[u + 1]〉[i] = b[i]) .

In NNF, we have

F1 : (∀i. ℓ ≤ i ≤ u → a[i] = b[i])∧ (∃i. ℓ ≤ i ≤ u + 1 ∧ a〈u + 1 ⊳ b[u + 1]〉[i] 6= b[i]) .

Step 2 produces

F2 : (∀i. ℓ ≤ i ≤ u → a[i] = b[i])∧ (∃i. ℓ ≤ i ≤ u + 1 ∧ a′[i] 6= b[i])∧ a′[u + 1] = b[u + 1]∧ (∀j. j ≤ u + 1− 1 ∨ u + 1 + 1 ≤ j → a[j] = a′[j]) .


Step 3 removes the existential quantifier by introducing a fresh constant k:

F3 : (∀i. ℓ ≤ i ≤ u → a[i] = b[i])∧ ℓ ≤ k ≤ u + 1 ∧ a′[k] 6= b[k]∧ a′[u + 1] = b[u + 1]∧ (∀j. j ≤ u + 1− 1 ∨ u + 1 + 1 ≤ j → a[j] = a′[j]) .

Simplifying, we have

F ′3 : (∀i. ℓ ≤ i ≤ u → a[i] = b[i])

∧ ℓ ≤ k ≤ u + 1 ∧ a′[k] 6= b[k]∧ a′[u + 1] = b[u + 1]∧ (∀j. j ≤ u ∨ u + 2 ≤ j → a[j] = a′[j]) .

The index set is thus

I = k, u + 1 ∪ ℓ, u, u + 2 ,

which includes the read terms k and u + 1 and the terms ℓ, u, and u + 2 thatoccur as pexprs in the index guards. Step 5 rewrites universal quantificationto finite conjunction over this set:

F5 :∧

i ∈ I

(ℓ ≤ i ≤ u → a[i] = b[i])

∧ ℓ ≤ k ≤ u + 1 ∧ a′[k] 6= b[k]∧ a′[u + 1] = b[u + 1]

∧∧

j ∈ I

(j ≤ u ∨ u + 2 ≤ j → a[j] = a′[j]) .

Expanding the conjunctions according to the index set I and simplifyingaccording to trivially true or false antecedents (ℓ ≤ u + 1 ≤ u simplifies to ⊥,while u ≤ u ∨ u + 2 ≤ u simplifies to ⊤) produces:

F ′5 : (ℓ ≤ k ≤ u → a[k] = b[k])

∧ (ℓ ≤ u → a[ℓ] = b[ℓ] ∧ a[u] = b[u])∧ ℓ ≤ k ≤ u + 1 ∧ a′[k] 6= b[k]∧ a′[u + 1] = b[u + 1]∧ (k ≤ u ∨ u + 2 ≤ k → a[k] = a′[k])∧ (ℓ ≤ u ∨ u + 2 ≤ ℓ → a[ℓ] = a′[ℓ])∧ a[u] = a′[u] ∧ a[u + 2] = a′[u + 2] .

(TA∪TZ)-satisfiability of this quantifier-free (ΣA∪ΣZ)-formula can be decidedusing the techniques of Chapter 10. But let us finish the example. F ′

5 is (TA ∪TZ)-unsatisfiable. In particular, note that by the third conjunct k is restrictedsuch that ℓ ≤ k ≤ u + 1; that the first conjunct asserts that for most of thisrange (k ∈ [ℓ, u]), a[k] = b[k]; that the third-to-last conjunct asserts that fork ≤ u, a[k] = a′[k], contradicting a′[k] 6= b[k] by the fourth conjunct; and thateven if k = u + 1, a′[k] 6= b[k] = b[u + 1] = a′[u + 1] = a′[k] by the fourth andfifth conjuncts, a contradiction. Hence, F is T Z

A -unsatisfiable.

304 11 Arrays

Theorem 11.11 (Sound & Complete). Consider (ΣZA∪Σ)-formula F from

the array property fragment of T ZA ∪ T . The output F5 of Step 5 is (T Z

A ∪ T )-equisatisfiable to F .

The proof proceeds using the same strategy as in the proof of Theorem11.7. The main difference is that the projection function projI is defined sothat it maps an index to its nearest neighbor in the index set. For a giveninterpretation I, define a projection function projI : ΣA-terms→ I that mapsΣZ

A-terms to terms of the index set I. Let projI(t) = i ∈ I be such that either

• αI [i] ≤ αI [t] ∧ (∀j ∈ I. αI [j] ≤ αI [t]→ αI [j] ≤ αI [i])• or αI [t] < αI [i] ∧ (∀j ∈ I. αI [i] ≤ αI [j]).

That is, i is the index set term that is t’s nearest neighbor under I, withpreference for left neighbors. Extend projI to vectors of variables:

projI(i) = (projI(i1), . . . , projI(in)) .

Using this projection function, the remainder of the proof closely follows theproof of Theorem 11.7. Exercise 11.4 asks the reader to finish the proof.

11.3 Hashtables

Hashtables are a common data structure in modern programs. In this section,we describe a theory for hashtables TH and provide a reduction of the hashtableproperty fragment of ΣH-formulae into the array property fragment of TA.

The signature of TH is the following:

ΣH : get(·, ·), put(·, ·, ·), remove(·, ·), · ∈ keys(·), = ,

where

• put(h, k, v) is the hashtable that is modified from h by mapping key k tovalue v.

• remove(h, k) is the hashtable that is modified from h by unmapping thekey k.

• get(h, k) is the value mapped by key k, which is undetermined if h doesnot map k to any value.

• k ∈ keys(h) is true iff h maps the key k.

k ∈ keys(h) is merely convenient notation for a binary predicate. However, wewill exploit this notation in the following useful operations:

• Key sets keys(h) can be unioned (k1 ∪ k2), intersected (k1 ∩ k2), andcomplemented (k).

• The predicate init(h) is true iff h does not map any key.

Each is definable using the basic signature.The axioms of TH are the following:

11.3 Hashtables 305

• ∀x. x = x (reflexivity)• ∀x, y. x = y → y = x (symmetry)• ∀x, y, z. x = y ∧ y = z → x = z (transitivity)• ∀h, j, k. j = k → get(h, j) = get(h, k) (hashtable congruence)• ∀h, j, k, v. j = k → get(put(h, k, v), j) = v

(read-over-put 1)• ∀h, k, v. ∀j ∈ keys(h). j 6= k → get(put(h, k, v), j) = get(h, j)

(read-over-put 2)• ∀h, k. ∀j ∈ keys(h). j 6= k → get(remove(h, k), j) = get(h, j)

(read-over-remove)• ∀h, k, v. k ∈ keys(put(h, k, v)) (keys-put)• ∀h, k. k 6∈ keys(remove(h, k)) (keys-remove)

Notice the similarity between the first six axioms of TH and those of TA. Keysets complicate the (read-over-put 2) axiom compared to the (read-over-write2) axiom, while keys sets and key removal require three additional axioms. Inparticular, reading a hashtable with an unmapped key is undefined.

11.3.1 Hashtable Property Fragment

A hashtable property has the form

∀k. F [k] → G[k] ,

where F [k] is the key guard, and G[k] is the value constraint. Key guardsare defined exactly as index guards of the array property fragment of TA: theyare positive Boolean combinations of equalities between universally quantifiedkeys; and equalities and disequalities between universally quantified keys kand other key terms. Value constraints can use universally quantified keysk in hashtable reads get(h, k) and in key set membership checks. Finally, ahashtable property does not contain any init literals.

ΣH-formulae that are Boolean combinations of quantifier-free ΣH-formulaeand hashtable properties comprise the hashtable property fragment of TH.

Example 11.12. Consider the following hashtable property formula:

F : ∀k ∈ keys(h). get(h, k) ≥ 0 .

Its key guard is trivial, while its value constraint is

k ∈ keys(h) → get(h, k) ≥ 0 .

Suppose that F annotates locations L1 and L2 in the following basic path:

@L1 : Fassume v ≥ 0;put(h, s, v);@L2 : F

306 11 Arrays

We want to prove that if F holds at L1, then it holds at L2. The resultingverification condition is

∀h, s, v.

[(∀k ∈ keys(h). get(h, k) ≥ 0) ∧ v ≥ 0→ (∀k ∈ keys(put(h, s, v)). get(put(h, s, v), k) ≥ 0)

].

The key set keys(h) provides a mechanism for reasoning about the incrementalmodification of hashtables.

Example 11.13. To express equality between key sets, keys(h1) = keys(h2),in TH, write

∀k. k ∈ keys(h1) ↔ k ∈ keys(h2) .

To express equality between hashtables, h1 = h2, write

keys(h1) = keys(h2) ∧ ∀k ∈ keys(h1). get(h1, k) = get(h2, k) .


Given F from the hashtable property fragment of TH, the decision procedurereduces its TH-satisfiability to TA-satisfiability of a formula in the array prop-erty fragment. The main idea of the reduction is to represent hashtables h ofF by two arrays in the ΣA-formula: an array h for the elements and an arraykeysh that indicates if a key maps to an element. In particular, keysh[k] = in-dicates that k is not mapped by h, while keysh[k] 6= (keysh[k] = •) indicatesthat h does map h to a value given by h[k].

Step 1

Construct F ∧ • 6= , for fresh constants • and .

Step 2

Rewrite F1 according to the following set of rules:

F [put(h, k, v)] =⇒ F [h′] ∧ h′ = h〈k ⊳ v〉 ∧ keysh′ = keysh〈k ⊳ •〉F [remove(h, k)] =⇒ F [h′] ∧ h = h′〈k ⊳ h[k]〉 ∧ keysh′ = keysh〈k ⊳ 〉

for fresh variable h′. In the second rule, h = h′〈k⊳h[k]〉 expresses that positionk of h′ is undetermined since the mapping from k is being removed. Recallthat equality a = b between arrays is defined by ∀i. a[i] = b[i].

11.3 Hashtables 307

Step 3

Rewrite F2 according to the following set of rules:

F [get(h, k)] =⇒ F [h[k]]F [k ∈ keys(h)] =⇒ F [keysh[k] 6= ]

F [k ∈ K1 ∪K2] =⇒ F [k ∈ K1 ∨ k ∈ K2]F [k ∈ K1 ∩K2] =⇒ F [k ∈ K1 ∧ k ∈ K2]

F [k ∈ K] =⇒ F [¬(k ∈ K)]F [init(h)] =⇒ F [∀k. ¬(k ∈ keys(h))]

where K, K1, and K2 are constructed from union, disjunction, and comple-mentation of key set membership atoms. The final four rules define auxiliaryoperations: the right-side of each is expressible in the hashtable property frag-ment.

Step 4

Decide the TA-satisfiability of F3.

As in Sections 11.1 and 11.2, hashtable values may be interpreted in sometheory T with signature Σ. Then the above procedure is a decision procedureprecisely when there is a decision procedure for the array property fragmentof TA ∪ T .

Example 11.14. Consider the following ΣH-formula:

G : ∀h, s, v.

[(∀k ∈ keys(h). get(h, k) ≥ 0) ∧ v ≥ 0→ (∀k ∈ keys(put(h, s, v)). get(put(h, s, v), k) ≥ 0)

].

To prove its TH-validity, prove the TH-unsatisfiability of the following:

F : (∀k ∈ keys(h). get(h, k) ≥ 0) ∧ v ≥ 0∧ ¬(∀k ∈ keys(put(h, s, v)). get(put(h, s, v), k) ≥ 0) .

Step 1 introduces • and :

F1 : F ∧ • 6= .

Step 2 reduces hashtable modifications to arrays:

F2 : (∀k ∈ keys(h). get(h, k) ≥ 0) ∧ v ≥ 0∧ ¬(∀k ∈ keys(h′). get(h′, k) ≥ 0)∧ h′ = h〈s ⊳ v〉 ∧ keysh′ = keysh〈s ⊳ •〉∧ • 6= .

Step 3 completes the reduction:

308 11 Arrays

F3 : (∀k. keysh[k] 6= → h[k] ≥ 0) ∧ v ≥ 0∧ ¬(∀k. keysh′ [k] 6= → h′[k] ≥ 0)∧ (∀i. h′[i] = h〈s ⊳ v〉[i]) ∧ (∀j. keysh′ [j] = keysh〈s ⊳ •〉[j])∧ • 6= .

F3 is TA-unsatisfiable. In particular, keysh′ = keysh〈s ⊳ •〉 and h′ = h〈s ⊳ v〉,so that for all keys of h (k such that keysh[k] 6= ), h′[k] = h[k] ≥ 0. For theone new key s of h′, we know that h′[s] = h〈s ⊳ v〉[s] = v ≥ 0. Therefore, nokey k of h′ exists at which h[k] < 0, a contradiction. Hence, G is TH-valid.

The decision procedure of Section 11.1 also proves the TA-unsatisfiabilityof F3.

Theorem 11.15 (Sound & Complete). Consider (ΣH∪Σ)-formula F fromthe hashtable property fragment of TH∪T . The rewriting procedure terminates,and its output F3 of Step 3 is equisatisfiable to F and in the array propertyfragment of (TA ∪ T ).

Proof. Inspection shows that if F is a (ΣH ∪ Σ)-formula from the hashtableproperty fragment, then F3 is a ΣA-formula from the array property fragment.

The rewrite system terminates: the rules of Step 2 and the first two rulesof Step 3 remove instances of (ΣH ∪ Σ)-literals and (ΣH ∪ Σ)-terms withoutintroducing more; and the final four rules of Step 3 clearly terminate with afinite number of applications of the first two rules of Step 3.

We must show that F has a satisfying (TH ∪ T )-interpretation preciselywhen F3 has a satisfying (TA ∪ T )-interpretation. Suppose that (TH ∪ T )-interpretation I satisfies F . Construct the (TA∪T )-interpretation J satisfyingF3 as follows:

• if h maps k to v in I, set keysh[k] = • in J and h[k] = v;• otherwise, set keysh[k] = and h[k] = v for some arbitrary value v.

This J satisfies the conjuncts added in Step 2; also, h[k] is the same valueunder J as get(h, k) under I, and keysh[k] 6= under J iff k ∈ keys(h) underI, showing the correctness of the first two rules of Step 3. The remaining rulesare auxiliary and reduce to the first two.

Similarly, a (TA ∪T )-interpretation J satisfying F3 corresponds to a (TH ∪T )-interpretation I satisfying F . Construct I as follows:

• if keysh[k] 6= and h[k] = v in J , assert get(h, k) = v and k ∈ keys(h) in I;• otherwise, assert ¬(h ∈ keys(h)) in I.

Again, the correspondence between the two sides of each rule is clear.

11.4 Larger Fragments

Why does this chapter focus on the array property fragments of TA and T ZA ?

Why not examine more expressive fragments? In fact, the following theorem

11.5 Summary 309

states that extending the array property fragments in natural ways producesfragments for which satisfiability is undecidable.

Theorem 11.16. Consider the following extensions to the array propertyfragment of T Z

A (TA, where appropriate):

• Permit an additional quantifier alternation.• Permit nested reads (e.g., a1[a2[i]], where i is universally quantified).• Permit array reads by a universally quantified variable in the index guard.• Permit general Presburger arithmetic expressions over universally quanti-

fied index variables (even just addition of 1: i + 1) in the index guard orin the value constraint.

• Permit strict comparison < between universally quantified variables.• Augment the theory with a predicate expressing that one array is a permu-

tation of another.

For each resulting fragment, there exists an element theory T such that sat-isfiability in the array property fragment of T Z

A ∪ T (TA ∪ T ) is decidable, yetsatisfiability in the resulting fragment of T Z

A ∪ T (TA ∪ T ) is undecidable.

Bibliographic Remarks refers the interested reader to texts that containthe proof of this theorem.

11.5 Summary

This chapter presents several decision procedures for reasoning about array-like data structures with some quantification. It covers:

• The array property fragment of TA, which allows expressing propertiesof arrays themselves, rather than just their elements. Elements may beinterpreted in some theory.

• The array property fragment of T ZA , which allows expressing properties of

arrays and subarrays with indices interpreted within TZ.• Quantifier instantiation as a basis for decision procedures, which allows

the direct application of decision procedures for quantifier-free fragments.• Hashtables. The decision procedure rewrites a ΣH-formula into a ΣA-

formula in which each hashtable is represented by two arrays. Reductionsof this form extend decision procedures to reason about theories similarto the originally targeted theory.

Reasoning about data structures is crucial for considering the correctnessof programs. The decision procedures of Chapter 9 for reasoning about re-cursive data structures and arrays without quantifiers provide a means ofaccessing elements of data structures. Additionally, equality in TRDS extendsto data structures. The decision procedures for the array property fragmentsof TA and T Z

A facilitate reasoning about whole or segments of arrays, not justindividual elements.

310 11 Arrays


Theories of arrays have been studied for over four decades. McCarthy proposesthe axiomatization based on read-over-write in [58]. James King implementedthe decision procedure for the quantifier-free fragment as part of his thesiswork [50]. Several authors discuss the quantifier-free fragment of a theorywith predicates useful for reasoning about sorting algorithms [57, 43, 89].Suzuki and Jefferson present a permutation predicate in a more restrictedfragment [89]. The approximation to reasoning about the weak permutationpredicate of Exercise 6.5 captures a similar fragment, though for a weaker formof permutation. Stump, Barrett, Dill, and Levitt describe a decision procedurefor the quantifier-free fragment of an extensional theory of arrays [88]. Bradley,Manna, and Sipma [9] and Bradley [6] explore the array property fragment,including the proof of Theorem 11.16, that is the basis for the presentation ofthis chapter.

Exercises

11.1 (DP for array property fragment of TA). Apply the decision pro-cedure for the array property fragment of TA to the following TA-formulae.

(a) ∀i. a〈k ⊳ e〉[i] 6= e(b) a[k] = b[k] ∧ ∀i. a[i] 6= b[i](c) a[k] 6= b[k] ∧ ∀i. a[i] = b[i]

11.2 (DP for array property fragment of T ZA ). Apply the decision pro-

cedure for the array property fragment of T ZA to the following ΣZ

A-formulae.

(a) sorted(a, ℓ, u) ∧ a[ℓ] > a[u](b) sorted(a, ℓ, u) ∧ e ≤ a[ℓ] ∧ ¬sorted(a〈ℓ− 1 ⊳ e〉, ℓ− 1, u)

11.3 (DP for array property fragment of T ZA ). Apply the decision pro-

cedure for the array property fragment of T ZA to the following ΣZ

A-formula:

sorted(a〈0 ⊳ 7〉〈5 ⊳ 9〉, 0, 5) ∧ sorted(a〈0 ⊳ 11〉〈5 ⊳ 13〉, 0, 5) .

11.4 (⋆Correctness of procedure for T ZA ). Prove Theorem 11.11 using

the strategy suggested following the statement of the theorem.

11.5 (⋆Arrays with extensionality). Consider the extensional theory ofarrays T =

A (see Section 3.6). Describe a decision procedure for the quantifier-free fragment of T =

A using the methods of this chapter. Hint : Does the indexset require a fresh λ variable? Why or why not?

12

Invariant Generation

It is easier to write an incorrect program than [to] understand a correctone.

— Alan PerlisEpigrams on Programming, 1982

While applying the inductive assertion method of Chapters 5 and 6 cer-tainly requires insights from the programmer, algorithms that examine pro-grams can discover many simple properties. This chapter describes a formof static analysis called invariant generation, whose task is to discoverinductive assertions of programs. A static analysis is a procedure that oper-ates on the text of a program. An invariant generation procedure is a staticanalysis that produces inductive program annotations as output.

Section 12.1 discusses the general context of invariant generation and de-scribes a methodology for constructing invariant generation procedures. Ap-plying this methodology, Section 12.2 describes interval analysis, an invari-ant generation procedure that discovers invariants of the form c ≤ v or v ≤ c,for program variable v and constant c. Section 12.3 describes Karr’s anal-ysis, an invariant generation procedure that discovers invariants of the formc0 + c1x1 + · · · + cnxn = 0, for program variables x1, . . . , xn and constantsc0, c1, . . . , cn.

Many other invariant generation algorithms are studied in the literature,including analyses to discover linear inequalities c0 + c1x1 + · · · + cnxn ≤ 0,polynomial equalities and inequalities, and facts about memory and variablealiasing. Bibliographic Remarks refers the interested reader to examplepapers on these topics.

12.1 Invariant Generation

After revisiting the weakest precondition and defining the strongest post-condition in Sections 12.1.1 and 12.1.2, Section 12.1.3 describes the general

312 12 Invariant Generation

•s

wp(F, S)

•s′

F

S

•s0

F

•s

sp(F, S)

S

(a) (b)

Fig. 12.1. (a) Weakest precondition and (b) strongest postcondition

forward propagation procedure for discovering inductive assertions. Ingeneral, this procedure is not an algorithm: it can run forever without produc-ing an answer. Hence, Section 12.1.4 presents a methodology for constructingabstract interpretations of programs, which focus on particular elements ofprograms and apply a heuristic to guarantee termination. Subsequent sectionsexamine particular instances of this methodology.

12.1.1 Weakest Precondition and Strongest Postcondition

Recall from Section 5.2.4 that a predicate transformer p is a function

p : FOL× stmts→ FOL

that maps a FOL formula F ∈ FOL and program statement S ∈ stmts to aFOL formula. The weakest precondition predicate transformer wp(F, S) isdefined on the two types of program statements that occur in basic paths:

• wp(F, assume c) ⇔ c→ F , and• wp(F [v], v := e) ⇔ F [e];

and inductively on sequences of statements S1; . . . ; Sn:

wp(F, S1; . . . ; Sn) ⇔ wp(wp(F, Sn), S1; . . . ; Sn−1) .

The weakest precondition wp(F, S) has the defining characteristic that if states is such that

s |= wp(F, S)

and if statement S is executed on state s to produce state s′, then

s′ |= F .

12.1 Invariant Generation 313

In other words, the weakest precondition moves a formula backward over asequence of statements: for F to hold after executing S1; . . . ; Sn, the formulawp(F, S1; . . . ; Sn) must hold before executing the statements.

This situation is visualized in Figure 12.1(a). The region labeled F is theset of states that satisfy F ; similarly, the region labeled wp(F, S) is the setof states that satisfy wp(F, S). Every state s on which executing statementS leads to a state s′ in the F region must be in the wp(F, S) region.

For reasoning in the opposite direction, we define the strongest post-condition predicate transformer. The strongest postcondition sp(F, s) hasthe defining characteristic that if s is the current state and

s |= sp(F, S)

then there exists a state s0 such that executing S on s0 results in state s and

s0 |= F .

Figure 12.1(b) visualizes this characteristic. Executing statement S on anystate s0 in the F region must result in a state s in the sp(F, S) region.

Define sp(F, S) as follows. On assume statements,

sp(F, assume c) ⇔ c ∧ F ,

for if program control makes it past the statement, then c must hold.Unlike in the case of wp, there is no simple definition of sp on assignments:

sp(F [v], v := e[v]) ⇔ ∃v0. v = e[v0] ∧ F [v0] .

Let s0 and s be the states before and after executing the assignment, respec-tively. v0 represents the value of v in state s0. Every variable other than vmaintains its value from s0 in s. Then v = e[v0] asserts that the value of vin state s is equal to the value of e in state s0. F [v0] asserts that s0 |= F .Overall, sp(F, v := e) describes the states that can be obtained by executingv := e from F -states, states that satisfy F .

Finally, define sp inductively on a sequence of statements S1; . . . ; Sn:

sp(F, S1; . . . ; Sn) ⇔ sp(sp(F, S1), S2; . . . ; Sn) .

sp progresses forward through the statements.

Example 12.1. Compute

sp(i ≥ n, i := i + k)

⇔ ∃i0. i = i0 + k ∧ i0 ≥ n

⇔ i− k ≥ n

since i0 = i− k.


Compute

sp(i ≥ n, assume k ≥ 0; i := i + k)

⇔ sp(sp(i ≥ n, assume k ≥ 0), i := i + k)

⇔ sp(k ≥ 0 ∧ i ≥ n, i := i + k)

⇔ ∃i0. i = i0 + k ∧ k ≥ 0 ∧ i0 ≥ n

⇔ k ≥ 0 ∧ i− k ≥ n

Example 12.2. Let us prove that

sp(wp(F, S), S) ⇒ F ⇒ wp(sp(F, S), S) ;

that is, the strongest postcondition of the weakest precondition of F on state-ment S implies F , which implies the weakest precondition of the strongestpostcondition of F on S. We prove the first implication and leave the secondimplication as Exercise 12.1.

Suppose that S is the statement assume c. Then

sp(wp(F, assume c), assume c)

⇔ sp(c→ F, assume c)

⇔ c ∧ (c→ F )

⇔ c ∧ F

⇒ F

Now suppose that S is an assignment statement v := e. Then

sp(wp(F [v], v := e[v]), v := e[v])

⇔ sp(F [e[v]], v := e[v])

⇔ ∃v0. v = e[v0] ∧ F [e[v0]]

⇔ ∃v0. v = e[v0] ∧ F [v]

⇒ F

Recall the definition of a verification condition in terms of wp:

FS1; . . . ; SnG : F ⇒ wp(G, S1; . . . ; Sn) .

We can similarly define a verification condition in terms of sp:

FS1; . . . ; SnG : sp(F, S1; . . . ; Sn) ⇒ G .

Typically, we prefer working with the weakest precondition because of itssyntactic handling of assignment statements. However, in the remainder ofthis chapter, we shall see the value of the strongest postcondition.


12.1.2 ⋆General Definitions of wp and sp

Section 12.1.1 defines the wp and sp predicate transformers for our simplelanguage of assumption (assume c) and assignment (v := e) statements. Thissection defines these predicate transformers more generally.

Describe program statements with FOL formulae over the program vari-ables x, the program counter pc, and the primed variables x′ and pc′.The program counter ranges over the locations of the program. The primedvariables represent the values of the corresponding unprimed variables in thenext state. For example, the statement

Li : assume c;Lj :

is captured by the relation

ρ1[pc, x, pc′, x′] : pc = Li ∧ pc′ = Lj ∧ c ∧ x′ = x . (12.1)

To execute this statement, the program counter must be Li and c must betrue. Executing the statement sets the program counter to Lj and leaves thevariables x unchanged. Similarly, the assignment

Li : xi := e;Lj :

is captured by the relation

ρ2[pc, x, pc′, x′] : pc = Li ∧ pc′ = Lj ∧ x′i = e ∧ pres(x \ xi) , (12.2)

where pres(V ), for a set of variables V , abbreviates the assertion that eachvariable in V remains unchanged:

∧

v∈V

v′ = v .

Formulae (12.1) and (12.2) are called transition relations. The expressive-ness of FOL allows many more constructs to be encoded as transition relationsthan can be encoded in either pi or the simple language of assumptions andassignments.

Let us consider the weakest precondition and the strongest postcon-dition in this general context. For convenience, let y be all the variables ofthe program including the program counter pc. The weakest precondition ofF over transition relation ρ[y, y′] is given by

wp(F, ρ[y, y′]) ⇔ ∀y′. ρ[y, y′] → F [y′] .

Notice that free(wp(F, ρ)) = y. A satisfying y represents a state from which allρ-successors, which are states y′ that ρ relates to y, are F -states. Technically,


such a state might not have any successors at all; for example, ρ could describea guard that is false in the state described by y.

Consider the transition relation (12.1) corresponding to assume c:

wp(F, pc = Li ∧ pc′ = Lj ∧ c ∧ x′ = x)

⇔ ∀pc′, x′. pc = Li ∧ pc′ = Lj ∧ c ∧ x′ = x → F [pc′, x′]

⇔ pc = Li ∧ c → F [Lj, x]

Disregarding the program counter, we have recaptured our original definitionof wp on assume statements:

wp(F, assume c) ⇔ c→ F .

Similarly,

wp(F, pc = Li ∧ pc′ = Lj ∧ x′i = e ∧ pres(x \ xi))

⇔ ∀pc′, x′. pc = Li ∧ pc′ = Lj ∧ x′i = e ∧ pres(x \ xi)

→ F [pc′, x′1, . . . , x

′i, . . . , x

′n]

⇔ pc = Li → F [Lj, x1, . . . , e, . . . , xn]

Again, disregarding the program counter reveals our original definition:

wp(F [v], v := e) ⇔ F [e] .

The general form of the strongest postcondition is the following:

sp(F, ρ) ⇔ ∃y0. ρ[y0, y] ∧ F [y0] .

Notice that free(sp(F, ρ)) = y. It describes all states y that have someρ-predecessor y0 that is an F -state; or, in other words, it describes all ρ-successors of F -states.

Exercise 12.2 asks the reader to specialize this definition of sp to the caseof assumption and assignment statements. Exercise 12.3 asks the reader toreproduce the arguments of Example 12.2 and Exercise 12.1 in this moregeneral setting.

12.1.3 Static Analysis

We now turn to the task of generating inductive information about programs.Throughout this chapter, we consider programs with just a single function.Exercise 12.6 asks the reader to generalize the methods to treat programswith multiple, possibly recursive, functions.

Consider a function with locations L forming a cutset including the initiallocation L0. A cutset is a set of locations such that every path that begins andends at a pair of cutpoints — a location in the cutset — without includinganother cutpoint is a basic path. An assertion map


µ : L → FOL

is a map from the set L to first-order assertions. Assertion map µ is an induc-tive assertion map, also called an inductive map, if for each basic path

Li : @ µ(Li)Si;...

Sj ;Lj : @ µ(Lj)

for Li, Lj ∈ L, the verification condition

µ(Li)Si; . . . ; Sjµ(Lj) (12.3)

is valid. The task of invariant generation is to find inductive assertion mapsµ. Viewing each µ(Li) as a variable that ranges over FOL formulae, the taskof invariant generation is to find solutions to the constraints imposed by theverification conditions (12.3) for all basic paths. To avoid trivial solutions,we require that at function entry location L0 with function precondition Fpre,Fpre ⇒ µ(L0).

How can we solve constraints (12.3) for µ? One common technique is toperform a symbolic execution of the function using the strongest postcondi-tion predicate transformer. This process is also called forward propagationbecause information flows forward through the function. The idea is to rep-resent sets of states by formulae over the program variables. sp provides amechanism for executing the function over formulae, and thus over sets ofstates.

Suppose that the function has function precondition Fpre at initial locationL0. Define the initial assertion map µ: let

µ(L0) := Fpre , and for L ∈ L \ L0, µ(L) := ⊥ .

This configuration represents entering the function with some values satisfyingits precondition.

Maintain a set S ⊆ L of locations that still need processing. Initially, letS = L0. Terminate when S = ∅.

Suppose that we are on iteration i, having constructed the intermediateassertion map µ. Choose some location Lj ∈ S to process, and remove it fromS. For each basic path

Lj : @ µ(Lj)Sj ;...

Sk;


let ForwardPropagate Fpre L =S := L0;µ(L0) := Fpre;µ(L) := ⊥ for L ∈ L \ L0;while S 6= ∅ do

let Lj = choose S in

S := S \ Lj;foreach Lk ∈ succ(Lj) dolet F = sp(µ(Lj), Sj ; . . . ; Sk) in

if F 6⇒ µ(Lk)then µ(Lk) := µ(Lk) ∨ F ;

S := S ∪ Lk;done;

done;µ

Fig. 12.2. ForwardPropagate

Lk : @ µ(Lk)

starting at Lj , compute if

sp(µ(Lj), Sj ; . . . ; Sk) ⇒ µ(Lk) . (12.4)

If the implication holds, then sp(µ(Lj), Sj ; . . . ; Sk) does not represent anystates that are not already represented by µ(Lk), so do nothing. Otherwise,set

µ(Lk) := µ(Lk) ∨ sp(µ(Lj), Sj ; . . . ; Sk) .

In other words, assign µ(Lk) to be the union of the set of states representedby sp(µ(Lj), Sj ; . . . ; Sk) with the set of states currently represented by µ(Lk).Additionally, add Lk to S so that future iterations propagate the effect of thenew states. For all other locations Lℓ ∈ L, assign µ(Lℓ) := µ(Lℓ).

This procedure is summarized in Figure 12.2. Given a function’s precon-dition Fpre and a cutset L of its locations, ForwardPropagate returns theinductive map µ if the main loop finishes. In the code, Lk ∈ succ(Lj) is asuccessor of Lj if there is a basic path from Lj to Lk.

If at some point S is empty, then implication (12.4) and the policy foradding locations to S guarantees that all VCs (12.3) are valid, so that µ isnow an inductive map. For the initialization of µ ensures that

Fpre ⇒ µ(L0) ,

while the check in the inner loop and the emptiness of S guarantees that

sp(µ(Lj), Sj ; . . . ; Sk) ⇒ µ(Lk) ,

which is precisely the verification condition

µ(Lj)Sj ; . . . ; Skµ(Lk) .

However, there is no guarantee that S ever becomes empty.

Example 12.3. The forward propagation procedure is an algorithm whenanalyzing hardware circuits: it always terminates. In particular, if a circuithas n Boolean variables, then the set of possible states has cardinality 2n.Since each state set µ(Li) can only grow, it must eventually be the case(within |L| × 2n iterations) that no new states are added to any µ(Li), so Seventually becomes empty.

12.1.4 Abstraction

Two elements of the forward propagation procedure prohibit it from termi-nating in the general case. First, as we know from Chapter 2, checking theimplication

sp(µ(Lj), Sj ; . . . ; Sk) ⇒ µ(Lk)

of the inner loop is undecidable for FOL. Second, even if this check weredecidable — for example, if we restricted µ(Lk) to be in a decidable theory orfragment — the while loop itself may not terminate, as the following exampleshows.

Example 12.4. Consider the following loop with integer variables i and n:

@L0 : i = 0 ∧ n ≥ 0;while

@L1 : ?(i < n) i := i + 1;

There are two basic paths to consider:(1)

@L0 : i = 0 ∧ n ≥ 0;@L1 : ?;

and(2)

@L1 : ?;assume i < n;i := i + 1;@L1 : ?;

To obtain an inductive assertion at location L1, apply the procedure ofFigure 12.2. Initially,

µ(L0) ⇔ i = 0 ∧ n ≥ 0µ(L1) ⇔ ⊥ .

Following path (1) results in assigning

µ(L1) := µ(L1) ∨ (i = 0 ∧ n ≥ 0) .

µ(L1) was ⊥, so that it becomes

µ(L1) ⇔ i = 0 ∧ n ≥ 0 .

On the next iteration, following path (2) yields

µ(L1) := µ(L1) ∨ sp(µ(L1), assume i < n; i := i + 1) .

Currently µ(L1) ⇔ i = 0 ∧ n ≥ 0, so

F : sp(i = 0 ∧ n ≥ 0, assume i < n; i := i + 1)

⇔ sp(i < n ∧ i = 0 ∧ n ≥ 0, i := i + 1)

⇔ ∃i0. i = i0 + 1 ∧ i0 < n ∧ i0 = 0 ∧ n ≥ 0

⇔ i = 1 ∧ n > 0 .

Since the implication

i = 1 ∧ n > 0︸︷︷︸F

⇒ i = 0 ∧ n ≥ 0︸︷︷︸µ(L1)

is invalid,

µ(L1) ⇔ (i = 0 ∧ n ≥ 0) ∨ (i = 1 ∧ n > 0)︸︷︷︸F

at the end of the iteration.At the end of the next iteration,

µ(L1) ⇔ (i = 0 ∧ n ≥ 0) ∨ (i = 1 ∧ n > 0) ∨ (i = 2 ∧ n > 1) ,

and at the end of the kth iteration,

µ(L1) ⇔ (i = 0 ∧ n ≥ 0) ∨ (i = 1 ∧ n ≥ 1)∨ · · · ∨ (i = k ∧ n ≥ k) .

Because the implication

i = k ∧ n ≥ k⇓

(i = 0 ∧ n ≥ 0) ∨ (i = 1 ∧ n ≥ 1) ∨ · · · ∨ (i = k − 1 ∧ n ≥ k − 1)


is invalid for any k, the main loop of ForwardPropagate never finishes.However, it is obvious that

0 ≤ i ≤ n

is an inductive annotation of the loop.

The problem is that ForwardPropagate attempts to compute the exactset of reachable states. A state s is reachable if s appears in some compu-tation. Inductive annotations usually over-approximate the set of reachablestates: every reachable state s satisfies the annotation, but other unreachablestates can also satisfy the annotation. Cleverly over-approximating the reach-able state set can force the main loop of ForwardPropagate to terminate.

The method of abstract interpretation addresses these problems. Anabstract interpretation makes two compromises to guarantee termination.First, it manipulates an artificially constrained form of state sets so thatimplication checking is decidable. The constrained form cannot express everypossible state set. Rather, the form is chosen to focus on a particular aspect ofprograms, such as relationships among numerical variables or aliasing in pro-grams with pointers. Second, it introduces a heuristic step, called widening,in which it guesses a limit over-approximation to a sequence of state sets. Wedescribe how to construct an abstract interpretation in six steps.

Step 1: Choose an abstract domain D.

In the first step, we choose the form of state sets that the abstract interpreta-tion manipulates. The abstract domain D is a syntactic class of Σ-formulaeof some theory T ; each member Σ-formula represents a particular set of states(those that satisfy it). In Section 12.2, the interval abstract domain DI

consists of conjunctions of ΣQ-literals of the forms

c ≤ v and v ≤ c ,

for constant c and program variable v. In Section 12.3, we fix Karr’s abstractdomain DK to consist of conjunctions of ΣQ-literals of the form

c0 + c1x1 + · · ·+ cnxn = 0 ,

for constants c0, c1, . . . , cn and program variables x1, . . . , xn.

Step 2: Construct a map from FOL formulae to D.

It is useful to glean as much information from existing annotations, such as thefunction precondition, as possible. Additionally, Step 3 requires interpretingconditions of assume statements in D. Hence, define a function

νD : FOL→ D


to map a FOL formula F to element νD(F ) of D. For example, the assertion

F : i = 0 ∧ n ≥ 0

at L0 of the loop of Example 12.4 can be represented in the interval abstractdomain by

νDI(F ) : 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n

and in Karr’s abstract domain by

νDK(F ) : i = 0

with some loss of information.The map should have the property that for any F ,

F ⇒ νD(F ) .

Step 3: Define an abstract sp.

Define an abstract strongest postcondition operator spD for assumptionand assignment statements such that

sp(F, S) ⇒ spD(F, S) and spD(F, S) ∈ D

for statement S and F ∈ D.Consider first the statement assume c. Conjunction is used in computing

the strongest postcondition of assumption statements:

sp(F, assume c) ⇔ c ∧ F .

Define abstract conjunction ⊓D for this purpose such that

F1 ∧ F2 ⇒ F1 ⊓D F2 and F1 ⊓D F2 ∈ D

for F1, F2 ∈ D. Then for F ∈ D,

spD(F, assume c) ⇔ νD(c) ⊓D F .

Often, the abstract domain D consists of conjunctions of literals of some form;in this case, ⊓D is just the usual conjunction ∧.

Defining sp for assignment statements is more complex. For suppose thatwe use the standard definition

sp(F [v], v := e[v]) ⇔ ∃v0. v = e[v0] ∧ F [v0]︸︷︷︸G

,

which requires existential quantification. Then, later, when we compute thevalidity of an implication


G ⇒ µ(L) ,

µ(L) can contain existential quantification, resulting in a quantifier alterna-tion. However, most decision procedures apply only to quantifier-free formulae,including the procedures of Chapters 8-10. Therefore, introducing existentialquantification in sp is undesirable.

We look at specific instances in Sections 12.2 and 12.3.

Step 4: Define abstract disjunction.

Disjunction is applied in ForwardPropagate to grow the formulae µ(L)representing the state sets at locations L ∈ L. Define abstract disjunction ⊔D

for this purpose, such that

F1 ∨ F2 ⇒ F1 ⊔D F2 and F1 ⊔D F2 ∈ D

for F1, F2 ∈ D.Unlike conjunction, exact disjunction is usually not represented in the

domain D. We examine specific instances in Sections 12.2 and 12.3.

Step 5: Define abstract implication checking.

On each iteration of the inner loop of ForwardPropagate, validity of theimplication

sp(µ(Lj), Sj ; . . . ; Sk) ⇒ µ(Lk)

is checked to determine whether µ(Lk) has changed. A proper selection ofD ensures that this validity check is decidable. Additionally, some domainsadmit a simpler check than querying a full decision procedure, as we see inSections 12.2 and 12.3.

Step 6: Define widening.

While some abstract domains D and abstract strongest postconditions spD

guarantee that the forward propagation procedure terminates, defining anabstraction is not sufficient to guarantee termination in general; we exhibitsuch a case in Example 12.10 of Section 12.2. Thus, abstractions that do notguarantee termination are equipped with a widening operator D.

A widening operator D is a binary function

D : D ×D → D

such that

F1 ∨ F2 ⇒ F1 D F2 .


let AbstractForwardPropagate Fpre L =S := L0;µ(L0) := νD(Fpre);µ(L) := ⊥ for L ∈ L \ L0;while S 6= ∅ do

let Lj = choose S in

S := S \ Lj;foreach Lk ∈ succ(Lj) dolet F = spD(µ(Lj), Sj ; . . . ; Sk) inif F 6⇒ µ(Lk)then if Widen()

then µ(Lk) := µ(Lk) D (µ(Lk) ⊔D F );else µ(Lk) := µ(Lk) ⊔D F ;S := S ∪ Lk;

done;done;µ

Fig. 12.3. AbstractForwardPropagate

for F1, F2 ∈ D. It obeys the following property. Let F1, F2, F3, . . . be an infinitesequence of elements Fi ∈ D such that for each i,

Fi ⇒ Fi+1 .

Define the sequence

G1 = F1 and Gi+1 = Gi D Fi+1 .

Notice that Fi ⇒ Gi. For some i∗ and for all i ≥ i∗,

Gi ⇔ Gi+1 .

That is, the sequence Gi converges even if the sequence Fi does not converge.Intuitively, the widening operator “guesses” an over-approximation to thelimit of a sequence of formulae. Of course, a widening operator could alwaysreturn ⊤; however, better widening operators make more precise guesses.

A proper strategy of applying widening guarantees that the forward prop-agation procedure terminates.

Abstract Forward Propagation

Figure 12.3 applies the operators defined in these six steps in the abstractforward propagation algorithm. Given a function’s precondition Fpre anda cutset L of its locations, AbstractForwardPropagate returns the in-ductive map µ. When Widen() determines that widening should be applied,

12.2 Interval Analysis 325

µ(Lk) is updated to be the widening of µ(Lk) and µ(Lk)⊔DF . A proper defini-tion of Widen() ensures that the procedure terminates. For example, a simplestrategy is that after some predetermined number of iterations, Widen() al-ways evaluates to true.

Subsequent sections examine instances of this framework.

12.2 Interval Analysis

In this section, we describe the interval abstract interpretation, whichfinds inductive assertions that are conjunctions of ΣQ-literals of the forms

v ≤ c and c ≤ v ,

for c ∈ Q. This analysis is useful for generating simple bounds on variables.Although the abstract domain reasons about rational variables, the analysisis applicable to integer variables: the generated invariants hold whether theprogram variables range over integers or rationals. Our presentation is fairlysimple; one could improve some of the operations to provide better invariants.

Example 12.5. In the loop

@L0 : i = 0 ∧ n ≥ 0;while

@L1 : ?(i < n) i := i + 1;

the interval analysis discovers the inductive invariant 0 ≤ i ∧ 0 ≤ n at L1.

Step 1: Construct the domain DI.

Fix the domain DI to consist of ⊥, ⊤, and conjunctions of literals of the form

c ≤ v and v ≤ c ,

for c ∈ Q and (program) variable v. Throughout our discussion, we assumethat formulae are maintained in a canonical form according to the followingrules:

• Simplify ⊥ ∧ F to ⊥.• Simplify ⊤ ∧ F to F .• Simplify c1 ≤ v ∧ c2 ≤ v to max(c1, c2) ≤ v.• Simplify v ≤ c1 ∧ v ≤ c2 to v ≤ min(c1, c2).• If c1 > c2, simplify c1 ≤ v ∧ v ≤ c2 to ⊥.

Maintaining formulae in canonical form guarantees that no two syntactically-unequal formulae in canonical form are equivalent.

While the domain is a syntactic class of ΣQ-formulae, we can representelements in a way that is suitable for computation. A useful representation isbased on interval arithmetic. Let [ℓ, u] represent the interval from ℓ to u,inclusive. The sum of two intervals is

[ℓ1, u1] + [ℓ2, u2] = [ℓ1 + ℓ2, u1 + u2] .

The sum of an interval and a scalar c is found by treating c as the interval[c, c].

For scalar multiplication, compute

c [ℓ, u] =

[cℓ, cu] if c ≥ 0[0, 0] if c = 0[cu, cℓ] if c < 0

It is convenient to allow lower and upper bounds to be −∞ or ∞. Define

−∞+ c = −∞ , ∞+ c =∞ , c · −∞ = −∞ , and c · ∞ =∞

for c ≥ 0, and

−∞+ c = −∞ , ∞+ c =∞ , c · −∞ =∞ , and c · ∞ = −∞ ,

for c < 0. Also, define the empty interval as

[∞,−∞] ;

multiplying it by scalar c or summing it with other intervals yields the emptyinterval.

Consider set operations:

• Interval intersection:

[ℓ1, u1] ⊓ [ℓ2, u2]

=

[∞,−∞] if max(ℓ1, ℓ2) > min(u1, u2)[max(ℓ1, ℓ2), min(u1, u2)] otherwise

Intersection is exact: the computed interval represents the set that is theset intersection of the two sets represented by the given intervals.

• Interval union:

[ℓ1, u1] ⊔ [ℓ2, u2] = [min(ℓ1, ℓ2), max(u1, u2)]

The result is called the interval hull. It over-approximates the true union:the computed interval represents a set that may include more elementsthan the set union of the two sets represented by the given intervals.


For predicates,

[ℓ1, u1] ≤ [ℓ2, u2] if ℓ1 ≤ u2 ,

and

[ℓ1, u1] = [ℓ2, u2] if [ℓ1, u1] ⊓ [ℓ2, u2] 6= [∞,−∞] .

Like interval union, these predicates are over-approximations: if any pair ofpoints in [ℓ1, u1] and [ℓ2, u2] satisfy the predicate, then the predicate holds onthe intervals.

These definitions are enough to define interval arithmetic over linear arith-metic. For example, let each variable xi range in the interval [ℓi, ui]. Then

c0 + c1x1 + · · ·+ cnxn ,

for ci ∈ Q, ranges in the interval

c0 + c1[ℓ1, u1] + · · ·+ cn[ℓn, un] .

The correspondence of interval arithmetic with DI is the following. F ∈ DI,which is a conjunction of literals of the form c ≤ v and v ≤ c, asserts thatxi ∈ [ℓi, ui], for variable xi, where

• ℓi =∞ and ui = −∞ if F is the formula ⊥;• ℓi = d if d ≤ xi is a literal of F , and ℓi = −∞ otherwise;• ui = e if xi ≤ e is a literal of F , and ui =∞ otherwise.

Example 12.6. The formula

F : 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n

is an element of DI. It asserts that i ∈ [0, 0] and n ∈ [0,∞]. Given F , compute

i + 2n = [0, 0] + 2[0,∞] = [0,∞] ,

and

i− 2n = [0, 0] +−2[0,∞] = [0, 0] + [−∞, 0] = [−∞, 0] .

Step 2: Construct the map νDI.

Define the map νDIas follows. Consider literal F ; then

νDI(F ) =

⊥ if F ⇔ ⊥ba≤ v ∧ v ≤ b

aif F ⇔ av = b

v ≤ ba

if F ⇔ av ≤ bba≤ v if F ⇔ b ≤ av⊤ otherwise


for a, b ∈ N such that a > 0. Define

νDI

(∧

i

Fi

)=∧

i

νDI(Fi) .

This definition of νDIis not as precise as it could be. For consider the case

in which we know G ∈ DI; then it is possible to evaluate the truth-value of,for example,

H : c0 + c1x1 + · · ·+ cnxn ≤ 0 ,

even when n > 1. Define νDI(H, G) to equal the interval evaluation (either⊤ or

⊥) of H in the context of G. For all other literals F , define νDI(F, G) = νDI

(F ).When F has several literals, it is sometimes more precise to compute

νDI(F, G ∧ νDI

(F )) ,

which uses as much information from F as possible.

Example 12.7. Consider

F : i = 0 ∧ n ≥ 0 ∧ j + 2n ≤ 4︸︷︷︸H

.

Then

νDI(F ) = 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n .

Note that νDI(H) = ⊤. Now, given

G : 5 ≤ j ,

compute G ∧ νDI(F ):

G′ : 5 ≤ j ∧ 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n .

Then compute

νDI(H, G′) = ⊥

since in the context G′,

j + 2n = [5,∞] + 2[0,∞] = [5,∞] 6≤ [4, 4] .

Hence,

νDI(F, G ∧ νDI

(F )) = ⊥ .

Compare this result to the weaker νDI(F ).

Simply computing νDI(F, G) yields

νDI(F, G) : 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n ,

since in the context G,

j + 2n = [5,∞] + 2[−∞,∞] = [−∞,∞] ≤ [4, 4] .


Step 3: Define sp.

Conjunction is natural in DI, so ⊓DIis simply normal conjunction ∧.

For assume statements, define

spDI(F, assume c) ⇔ νDI

(c, F ) ∧ F .

That is, compute the element of DI corresponding to c in the context F , andconjoin it to F .

If c is a conjunction of several literals, the following may compute a moreprecise formula:

spDI(F, assume c) ⇔ νDI

(c, F ∧ νDI(c)) ∧ F .

It uses information from c to form a more precise context.


F : 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n .

Compute

spDI(F, assume i ≤ n)

⇔ νDI(i ≤ n, 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n ∧ νDI

(i ≤ n)) ∧ F

⇔ νDI(i ≤ n, 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n) ∧ F

⇔ [0, 0] ≤ [0,∞] ∧ F

⇔ F

The context F asserts that i ∈ [0, 0] and n ∈ [0,∞], resulting in the compari-son of intervals of the fourth line.

Assignment statements are handled via interval arithmetic. ComputespDI

(F, v := e) as follows. Let G be the conjunction of all the literals ofF except those referring to v (at most two refer to v: c1 ≤ v and v ≤ c2). Ife is a linear expression, let [ℓ, u] be the interval evaluation of e in the contextF , and

spDI(F, v := e) ⇔ ℓ ≤ v ∧ v ≤ u ∧ G ,

where −∞ ≤ v and v ≤ ∞ both simplify to ⊤. If e is not a linear expression,then let

spDI(F, v := e) ⇔ G .

According to G, v can have any value.

F : 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n .

To compute spDI(F, i := i + 1), let G be the literals of F not involving i,

G : 0 ≤ n ,

and compute the interval evaluation of i + 1 in context F :

i + 1 = [0, 0] + [1, 1] = [1, 1] .

Then

spDI(F, i := i + 1) ⇔ 1 ≤ i ∧ i ≤ 1 ∧ 0 ≤ n .

Step 4: Define interval disjunction, ⊔DI.

Define F1 ⊔DIF2 as follows. For each variable x, let F1 assert that x ∈ [ℓ1, u1]

and F2 assert that x ∈ [ℓ2, u2]. Then F1 ⊔DIF2 asserts that x is in the interval

hull of [ℓ1, u1] and [ℓ2, u2]: x ∈ [ℓ1, u1]⊔ [ℓ2, u2]. The interval hull is defined inStep 1.

Step 5: Define interval implication checking.

Validity of the ΣQ-formula F → G is decidable for F, G ∈ DI. However, we candefine a more efficient check: let F assert xi ∈ [ℓi, ui] and G assert xi ∈ [mi, vi]for each variable xi. Then F ⇒ G iff for each xi

[ℓi, ui] ⊆ [mi, vi] , i.e., mi ≤ ℓi ∧ ui ≤ vi .

Step 6: Define interval widening, DI.

Interval analysis does not naturally terminate, as the following example illus-trates.

Example 12.10. Consider again the loop

@L0 : i = 0 ∧ n ≥ 0;while

@L1 : ?(i < n) i := i + 1;

of Example 12.4 with paths(1)

@L0 : i = 0 ∧ n ≥ 0;@L1 : ?;

and(2)

@L1 : ?;assume i < n;i := i + 1;@L1 : ?;

Initially,

µ(L0) ⇔ νDI(i = 0 ∧ n ≥ 0) ⇔ 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n .

After one iteration, following path (1), µ(L1) ⇔ µ(L0).Following path (2) on the next iteration yields

µ(L1) := µ(L1) ⊔DIspDI

(µ(L1), assume i < n; i := i + 1) .

Currently µ(L1) ⇔ 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n, so

sp(0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n, assume i < n; i := i + 1)

⇔ sp(0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n, i := i + 1)

⇔ 1 ≤ i ∧ i ≤ 1 ∧ 0 ≤ n

Then the new µ(L1) is

µ(L1) := µ(L1) ⊔DIspDI

(· · · )⇔ (0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n) ⊔DI

(1 ≤ i ∧ i ≤ 1 ∧ 0 ≤ n)⇔ 0 ≤ i ∧ i ≤ 1 ∧ 0 ≤ n

Since the implication

0 ≤ i ∧ i ≤ 1 ∧ 0 ≤ n︸︷︷︸new µ(L1)

⇒ 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n︸︷︷︸old µ(L1)

is invalid,

µ(L1) ⇔ 0 ≤ i ∧ i ≤ 1 ∧ 0 ≤ n

at the end of the iteration.At the end of the kth iteration,

µ(L1) ⇔ 0 ≤ i ∧ i ≤ k ∧ 0 ≤ n .

It is never the case that the implication

0 ≤ i ∧ i ≤ k ∧ 0 ≤ n ⇒ 0 ≤ i ∧ i ≤ k − 1 ∧ 0 ≤ n

is valid, so the main loop of AbstractForwardPropagate never finishes.

Hence, we need to define a widening operator DI. First, for any F ∈ DI,

F DI⊥ ⇔ F and ⊥DI

F ⇔ F .

Otherwise, let F, G ∈ DI be other than ⊥. For each variable v, suppose thatF asserts v ∈ [ℓ1, u1] and G asserts v ∈ [ℓ2, u2]. Then F DI

G asserts thatv ∈ [ℓ, u], where

• ℓ = −∞ if ℓ2 < ℓ1, and otherwise ℓ = ℓ1;• u =∞ if u2 > u1, and otherwise u = u1.

Intuitively, F DIG drops bounds that grow from F to G. Since at most only

twice as many finite bounds as variables exist, widening can only be applieda finite number of times before all bounds become stable.

Example 12.11. On the kth iteration (for some small k, say, k = 3) of theanalysis in Example 12.10, compute

µ(L1) := µ(L1)DI(µ(L1) ⊔DI

spDI(µ(L1), assume i < n; i := i + 1)) .

That is,

(0 ≤ i ∧ i ≤ k − 1 ∧ 0 ≤ n)DI(0 ≤ i ∧ i ≤ k ∧ 0 ≤ n)

⇔ 0 ≤ i ∧ 0 ≤ n

because the upper bound on i increases from k − 1 to k. Then

µ(L1) ⇔ 0 ≤ i ∧ 0 ≤ n .

While this new µ(L1) does not imply the previous one,

0 ≤ i ∧ 0 ≤ n 6⇒ 0 ≤ i ∧ i ≤ k − 1 ∧ 0 ≤ n ,

one more iteration yields the same µ(L1), finishing the analysis. Thus,

0 ≤ i ∧ 0 ≤ n

is an inductive assertion at L1.Unfortunately, the interval abstract domain is incapable of representing

the more interesting invariant i ≤ n.

12.3 Karr’s Analysis 333

12.3 Karr’s Analysis

Karr’s analysis discovers inductive assertions of the form

c0 + c1x1 + · · ·+ cnxn = 0 ,

for ci ∈ Z and program variables xi. Such assertions are called affine asser-tions. They are useful for tracking the relationship among program variablesand loop counters. Karr’s analysis can be implemented efficiently, with run-ning time polynomial in the program size.

In this section, we present a simplified version of the analysis that MichaelKarr originally proposed. In particular, our analysis ignores guards of loopsand if statements. We use the notation and concepts from Section 8.2.

Example 12.12. Consider the loop

@L0 : ⊤;i := 0;j := 0;k := 0;while

@L1 : ?(∗) k := k + 1;if (∗) i := i + 1;else j := j + 1;

The guard ∗ denotes nondeterministic choice: either branch can be taken.Karr’s analysis discovers the inductive invariant i + j = k at L1.

Step 1: Construct the domain DK.

DK consists of ⊥, ⊤, and conjunctions of literals of the form

c0 + c1x1 + · · ·+ cnxn = 0 ,

which define affine spaces. An affine space is a point, a line, a plane, etc. Anaffine space can be specified by a set of equations

Ax = b , abbreviating∧

i

ai1x1 + · · ·+ ainxn = bi ,

so that the space is given by points v satisfying Av = b; or by a finite set ofpoints V = v1, . . . , vk, so that the space is given by

affine(V ) =

∑

i

λivi :∑

i

λi = 1

,


that is, the set of affine combinations of vectors in V . An affine combinationof vectors V is a weighted sum

∑

i

λivi such that∑

i

λi = 1 .

For example, the affine combination of two disjoint points is a line passingthrough both. These two representations are the constraint representationand the vertex representation, respectively.

Example 12.13. The affine space represented in constraint form by

i + j = k

has vertex representation

000

,

101

,

011

,

where vectors represent values for[i j k

]T. Confirm that each vertex satisfies

i + j = k.The point

112

= (−1)

000

+ (1)

101

+ (1)

011

is in the affine space because λ1 + λ2 + λ3 = −1 + 1 + 1 = 1: it is an affinecombination of the vertices.

The vertex representation is best suited for the version of Karr’s analysisthat we present. Recall, though, that the abstract domain is really the set ofΣQ-formulae that are conjunctions of literals of the form

c0 + c1x1 + · · ·+ cnxn = 0 .

The vertex representation of domain elements is convenient for computation.

Step 2: Construct the map νDK.

This step is trivial, as this version of Karr’s analysis does not use informationfrom annotations or assumption statements. Hence,

νDK(F ) = ⊤ .


Step 3: Define sp.

Let

spDK(F, assume c) ⇔ F

for any F and c. That is, ignore assumption statements. Ignoring assump-tion statements is not a terrible loss in precision: at best, only affine guardsc : Ax = b could be interpreted within DK. Such guards are uncommon inpractice.

Consider assignment xk := e, where e is an affine expression

e0 + e1x1 + · · ·+ enxn ,

for ei ∈ Q. Construct the affine transformation

Ax + b :

11

...e1 e2 · · · en

...1

x1

...xn

+

00...e0

...0

where the row with e is the kth row, corresponding to xk, and the rest ofthe matrix is the identity matrix. Abbreviate this transformation with thenotation [[xk := e]].

Now consider an affine space F represented by a set of vertices VF . Tocompute the effect of applying the assignment xk := e, apply [[xk := e]]:

[[xk := e]]F = [[xk := e]]affine(VF ) = affine[[xk := e]]v : v ∈ VF .

The transformed affine space is given by applying [[xk := e]] to each of thevertices of VF . Then

spDK(F, xk := e) ⇔ [[xk := e]]F

for affine expression e.

Example 12.14. Consider the affine space given by vertex representation

V =

101

,

for variables[i j k

]T, and assignment i := 2i + j + 3. The assignment corre-

sponds to transformation


2 1 00 1 00 0 1

ijk

+

300

.

Then

[[i := 2i + j + 3]]V =

2 1 00 1 00 0 1

101

+

300

=

501

.

For assignments xk := e in which e is not an affine expression, define theaffine hull ⊔DK

of two affine spaces F1 and F2 with vertex representations Vand W , respectively, as

F1 ⊔DKF2 =

∑

i

λivi +∑

j

µjwj :∑

i

λi +∑

j

µj = 1

.

To implement this definition, simply let U = V ∪W be the vertex represen-tation of F1 ⊔DK

F2. Then

F1 ⊔DKF2 = affine(U) =

∑

i

λiui :∑

i

λi = 1

,

which is equivalent to the definition in terms of V and W , as desired. Forexample, the affine hull of two disjoint points is a line; and the affine hull of aline and a point not on the line is a plane. Clearly, the affine hull vastly over-approximates the union of two affine spaces; however, it is the most preciseaffine space that includes their union.

Now define

spDK(F, xk := e) ⇔ [[xk := 0]]F ⊔DK

[[xk := 1]]F

when e is not an affine expression. In the new affine space, xk can have anyvalue. Exercise 12.5 asks the reader to prove this claim.

Example 12.15. Consider affine space given by vertex representation

V =

101

,

for variables[i j k

]T, and non-affine assignment i := f(i, j, k). Compute


[[i := 0]]V ⊔DK[[i := 1]]V

=

0 0 00 1 00 0 1

101

+

000

∪

0 0 00 1 00 0 1

101

+

100

=

001

∪

101

=

001

,

101

The final set of vertices represents the set of states in which j = 0, k = 1, andi is any value.

Step 4: Define affine disjunction, ⊔DK.

The affine hull ⊔DK, defined in Step 3, over-approximates disjunction.

Step 5: Define affine implication checking.

For F1, F2 ∈ DK, whether F1 ⇒ F2 is decidable. But a more efficient testrelies on the vertex representations V and W of F1 and F2, respectively. Foreach v ∈ V , check if v ∈ affine(W ): determine if there is a λ such that

v =∑

j

λjwj and∑

j

λj = 1 .

In other words, determine if there is some affine combination of the elementsof W that equals v. More concisely, determine if there is some λ such that,for AW a matrix with columns w ∈W ,

[AW

1T

]λ =

[v1

]; that is, 1

Tλ = 1 and AW λ = v .

This query can be decided efficiently using algorithms for solving linear equa-tions, such as Gaussian elimination.

Then F1 ⇒ F2 iff for all v ∈ V , v ∈ affine(W ).

Example 12.16. Consider affine space F1 given by vertex representation

V1 =

112

and affine space F2 given by vertex representation


V2 =

000

,

101

,

011

.

Is the implication F1 ⇒ F2 valid? Yes, as

112

= (−1)

000

+ (1)

101

+ (1)

011

and −1 + 1 + 1 = 1, so the vertex is in affine(V2).

Step 6: Define affine widening.

With each growth of a µ(L), its dimension increases by at least 1 by definitionof the affine hull. As each µ(L) can be at most n-dimensional, for n the numberof program variables, the procedure AbstractForwardPropagate termi-nates even without the use of widening. Hence, we do not define a wideningoperator.

Example 12.17. Consider the loop

@L0 : ⊤;i := 0;j := 0;k := 0;while

@L1 : ?(∗) k := k + 1;if (∗) i := i + 1;else j := j + 1;

which has three basic paths:(1)

@L0 : ⊤;i := 0;j := 0;k := 0;@L1 : ?;

which is summarized by transformation

τ1 :

0 0 00 0 00 0 0

ijk

+

000

,


(2)@L1 : ?;k := k + 1;i := i + 1;@L1 : ?;


τ2 : I

ijk

+

101

where, recall, I is the identity matrix, and(3)

@L1 : ?;k := k + 1;j := j + 1;@L1 : ?;


τ3 : I

ijk

+

011

.

Initially, µ(L0)⇔ ⊤, so its vertex representation is the set of unit vectors

100

,

010

,

001

,

000

,

while µ(L1)⇔ ⊥, represented by ∅. Then

µ(L1) := µ(L1) ⊔DKτ1µ(L0) = ∅ ∪

000

.

For the next iteration, consider the two transitions τ2 and τ3:

µ(L1) := µ(L1) ⊔DKτ2µ(L1) =

000

∪

τ2

000

=

000

,

101

.

Next,


µ(L1) := µ(L1) ⊔DKτ3µ(L1) =

000

,

101

∪

τ3

000

, τ3

101

=

000

,

101

,

011

.

The new vertex[0 1 1

]Tis obtained from τ3

[0 0 0

]T. Note that τ3 is applied

to[1 0 1

]Tas well; however

τ3

101

=

112

= (−1)

000

+ (1)

101

+ (1)

011

,

and −1 + 1 + 1 = 1. Hence,[1 1 2

]Tis redundant.

On the next iteration, we obtain convergence. For

τ2

000

=

101

, τ2

101

=

202

= (−1)

000

+ (2)

101

,

and

τ2

011

=

112

= (−1)

000

+ (1)

101

+ (1)

011

,

so that τ2 does not modify µ(L1). Additionally,

τ3

000

=

011

, τ3

101

=

112

= (−1)

000

+ (1)

101

+ (1)

011

,

and

τ3

011

=

022

= (−1)

000

+ (2)

011

,

so that τ3 does not modify µ(L1), either.Hence, the final vertex representation of µ(L1) is

V =

000

,

101

,

011

.

To obtain the constraint representation of this affine space, solve

12.4 ⋆Standard Notation and Concepts 341

[V T −1

]

a1

...an

b

= 0 ;

that is,

0 0 0 −11 0 1 −10 1 1 −1

a1

a2

a3

b

=

000

.

In this case, we find the inductive assertion

i + j = k

at L1.

12.4 ⋆Standard Notation and Concepts

Our presentation of abstract interpretation differs markedly from the stan-dard presentation in the literature. To facilitate the reader’s foray into theliterature, we discuss here the standard notation and concepts and relate itto our presentation. The main idea is to describe an abstract interpretationin terms of a set of operations over two lattices.

Lattices

A partially ordered set (S,), also called a poset, is a set S equipped witha partial order , which is a binary relation that is

• reflexive: ∀s ∈ S. s s;• antisymmetric: ∀s1, s2. s1 s2 ∧ s2 s1 → s1 = s2;• transitive: ∀s1, s2, s3 ∈ S. s1 s2 ∧ s2 s3 → s1 s3.

A lattice (S,⊔,⊓) is a set equipped with join ⊔ and meet ⊓ operatorsthat are

• commutative:– ∀s1, s2. s1 ⊔ s2 = s2 ⊔ s1,– ∀s1, s2. s1 ⊓ s2 = s2 ⊓ s1;

• associative:– ∀s1, s2, s3. s1 ⊔ (s2 ⊔ s3) = (s1 ⊔ s2) ⊔ s3,– ∀s1, s2, s3. s1 ⊓ (s2 ⊓ s3) = (s1 ⊓ s2) ⊓ s3;

• idempotent:– ∀s. s ⊔ s = s,


⊔S

s1 s2 : s3 ⊔ s4

... s3 s4

s6 : s3 ⊓ s4

......

s7 s8

⊓S

⊔S

s1 s2 : s3 ⊔ s4

... s3 s4

s6 : s3 ⊓ s4

......

s7 s8

⊓S

(a) (b)

Fig. 12.4. (a) A complete lattice with (b) monotone function f

– ∀s. s ⊓ s = s.

Additionally, they satisfy the absorption laws:

• ∀s1, s2. s1 ⊔ (s1 ⊓ s2) = s1;• ∀s1, s2. s1 ⊓ (s1 ⊔ s2) = s1.

One can define a partial order on S:

∀s1, s2. s1 s2 ↔ s1 = s1 ⊓ s2 ,

or equivalently

∀s1, s2. s1 s2 ↔ s2 = s1 ⊔ s2 .

A lattice is complete if for every subset S′ ⊆ S (including infinite sub-sets), both the join, or supremum, ⊔S′ and the meet, or infimum, ⊓S′ exist.In particular, a complete lattice has a least element ⊓S and a greatest element⊔S. Figure 12.4(a) visualizes a complete lattice.

A function f on elements of a lattice is monotone in the lattice if

∀s1, s2. s1 s2 → f(s1) f(s2) .

A fixpoint of f is any element s ∈ S such that f(s) = s. The dashed linesof Figure 12.4(b) visualize the iterative application of monotone function ffrom ⊓S until it reaches the fixpoint s3.

The following theorem is an important classical result for monotone func-tions on complete lattices.

Theorem 12.18 (Knaster-Tarski Theorem). The fixpoints of a monotonefunction on a complete lattice comprise a complete lattice.

12.4 ⋆Standard Notation and Concepts 343

Since complete lattices are nonempty, a fixpoint of a monotone function on acomplete lattice always exists. More, the Knaster-Tarski theorem guaranteesthe existence of the least fixpoint and the greatest fixpoint of f , which arethe least and greatest elements, respectively, of the complete lattice definedby f ’s fixpoints.

Abstract Interpretation

An important complete lattice for our purposes is the lattice defined by thesets of a program P ’s possible states S: the join is set union, the meet is setintersection, and the partial order is set containment. The greatest elementis the set of all states; the least element is the empty set. The lattice is thusrepresented by (2S,∪,∩), where 2S is the powerset of S. Call this lattice CP .

Treat P as a function on S: P (s) is the successor state of s during execu-tion. Define the strongest postcondition on subsets S′ of states S and programP :

sp(S′, P )def= P (s) : s ∈ S′ ,

which we abbreviate as P (S′). Now define the function

FP (S′)def= S′ ∪ P (S′) ;

FP is a monotone function on CP . Hence, by the Knaster-Tarski theorem, theleast fixpoint of FP exists on CP : it is the set of reachable states of P . Thelattice CP and the monotone function FP define the concrete semantics ofprogram P . For this reason, the elements 2S of CP comprise the concretedomain.

The ForwardPropagate algorithm of Figure 12.2 can be described asapplying FP to the set of states satisfying the function precondition until theleast fixpoint is reached.

However, the least fixpoint of FP usually cannot be computed in practice(see Section 12.1.4). Therefore, consider another lattice, AP : (A,⊔,⊓), whoseelements A are “simpler” than those of CP . Its elements A might be intervalsor affine equations, for example, and thus comprise the abstract domain.Define two functions

α : 2S → A and γ : A→ 2S ,

the abstraction function and the concretization function, respectively.These functions should preserve order:

∀S1, S2 ⊆ S. S1 ⊆ S2 → α(S1) α(S2) ,

and

∀a1, a2 ∈ A. a1 a2 → γ(a1) ⊆ γ(a2) .


A monotone function FP on AP is a valid abstraction of FP if

∀a ∈ A. FP (γ(a)) ⊆ γ(FP (a)) ,

that is, if the set of program states represented by abstract element FP (a)is a superset of the application of FP to the set of states represented byabstract element a. One possible abstraction is given by applying FP to theconcretization of a and then abstracting the result:

FP (a)def= α(FP (γ(a))) .

A valid abstraction FP provides a means for approximating the least fixpointof FP in CP : the concretization γ(a) of any fixpoint a of FP in AP is a supersetof the least fixpoint of FP in CP .

Lattice AP need not be any better behaved than CP , though. A well-behaved lattice satisfies the ascending chain condition: every nondecreas-ing sequence of elements eventually converges. Consequently, computing theleast fixpoint of a monotone function in such a lattice requires only a finitenumber of iterations. The lattice defined by the affine spaces of Karr’s analysistrivially satisfies this condition: it is of finite height because the number ofdimensions is finite. But, for example, the lattice defined by intervals does notsatisfy the ascending chain condition. In this case, one must define a wideningoperator on the abstract domain.

Relating Notation and Concepts

In our presentation, we take as our concrete set not 2S but rather FOL rep-resentations of sets of states. The concrete lattice is thus (FOL,∨,∧) withpartial order ⇒. This lattice is not complete: there need not be a finite first-order representation of the conjunction of an infinite number of formulae. Butnot surprisingly, there need not be a finite represention of a set of infinitecardinality, either, so the completeness of CP is not of practical value.

Our abstract domains are given by syntactic restrictions on the form ofFOL formulae. The abstraction function is νD, and the concretization functionis just the identity. spD is a valid abstraction of sp, and both are monotonein their respective lattices.

12.5 Summary

This chapter describes a methodology for developing algorithms to reasonabout program correctness. It covers:

• Invariant generation in a general setting. The forward propagation al-gorithm based on the strongest postcondition. The need for abstraction.Issues: decidability and convergence. Abstract interpretations.

Exercises 345

• Interval analysis, an abstract interpretation with a domain describing in-tervals. It is appropriate for reasoning about simple bounds on variables.

• Karr’s analysis, an abstract interpretation with a domain describing affinespaces. It discovers equations among rational or real variables.

• Notation and concepts in the standard presentation of abstract interpre-tation.

This chapter provides just an introduction to a widely studied area of research.


We present a simplified version of the abstract interpretation framework ofCousot and Cousot, who also describe a version of the interval domain thatwe present [20].

Karr developed his analysis a year before the abstract interpretation frame-work was presented [46]. For background on linear algebra, see [42]. Our pre-sentation of Karr’s analysis is based on that of Muller-Olm and Seidl [63].

Many other domains of abstract interpretation have been studied. Themost widely known is the domain of polyhedra, which Cousot and Halbwachsdescribe in [21]. Exercise 12.4 explores the octagon domain of Mine [61]. As anexample of a non-numerical domain, see the work of Sagiv, Reps, and Wilhelmon shape analysis [96].

Exercises

12.1 (wp and sp). Prove the second implication of Example 12.2; that is,prove that

F ⇒ wp(sp(F, S), S) .

12.2 (General sp). Compute sp(F, ρ1) and sp(F, ρ2) for transition relationsρ1 and ρ2 of (12.1) and (12.2), respectively. Show that disregarding pc revealsthe original definition of sp.

12.3 (General wp and sp). For the general definitions of wp and sp, provethat

sp(wp(F, S), S) ⇒ F ⇒ wp(sp(F, S), S) .

12.4 (Octagon Domain). Design an abstract interpretation for the octagondomain. That is, extend the interval domain to include literals of the form

c ≤ v1 + v2 , v1 + v2 ≤ c , c ≤ v1 − v2 , and v1 − v2 ≤ c .

Apply it to the loop of Example 12.4. Because i and n are integer variables,the loop guard i < n is equivalent to i ≤ n− 1.


12.5 (Non-affine assignments). Consider affine spaces F and

G : [[xk := 0]]F ⊔DK[[xk := 1]]F .

Prove the following:

(a) If [x1 · · ·xk · · ·xn] ∈ F , then [x1 · · ·m · · ·xn] ∈ G for all m ∈ R.(b) If [x1 · · ·xk · · ·xn] ∈ G, then [x1 · · ·m · · ·xn] ∈ F for some m ∈ R.

Hint : Use the definition of the affine hull.

12.6 (⋆Analyzing programs).

(a) Describe how to use AbstractForwardPropagate to analyze pro-grams with many functions, some of which may be recursive. Hints : First,include a location in each function to collect information for the functionpostcondition. Consider that this information might include function vari-ables other than rv and the parameters, so define an abstract operatorelim to eliminate these variables. Then recall that function calls in basicpaths are replaced by function summaries constructed from the functionpostconditions. Therefore, AbstractForwardPropagate need not bemodified to handle function calls. But in which order should functions beanalyzed? What about recursive functions?

(b) Describe an instance of this generalization for interval analysis. In partic-ular, define elim.

(c) Describe an instance of this generalization for Karr’s analysis. In partic-ular, define elim.

13

Further Reading

Do not seek to follow in the footsteps of the men of old; seek what theysought.

— Matsuo BashoKyoroku Ribetsu no Kotoba, 1693

In this book we have presented a classical method of specifying and verify-ing sequential programs (Chapters 5 and 6) based on first-order logic (Chap-ters 1–3) and induction (Chapter 4). We then focused on algorithms for au-tomating the application of this method: decision procedures for reasoningabout verification conditions (Chapters 7–11), and invariant generation pro-cedures for deducing inductive facts about programs (Chapter 12). This ma-terial is fundamental to all modern research in verification. In this chapter,we indicate topics for further reading and research.

First-Order Logic

Other texts on first-order logic include [87, 31, 55]. Smullyan [87], on whichthe presentation of Section 2.7 is partly based, concisely presents the mainresults in first-order logic. Enderton [31] provides a comprehensive discussionof theories of arithmetic and Godel’s first incompleteness theorem. Manna andWaldinger [55] explore additional first-order theories.

Decision Procedures

We covered three forms of decision procedures: quantifier elimination (Chap-ter 7) for full theories, decision procedures for quantifier-free fragments oftheories with equality (Chapters 8–10), and instantiation-based proceduresfor limited quantification (Chapter 11). New decision procedures in each ofthese styles are discovered regularly (see, for example, the proceedings of theIEEE Symposium on Logic in Computer Science [51]).

348 13 Further Reading

Proofs of correctness in Chapters 7 and 11 appeal to the structure ofinterpretations. Model theory studies logic from the perspective of models (in-terpretations), and decision procedures can be understood in model-theoreticterms. Hodges covers this topic comprehensively [40].

The combination result of Chapter 10 is not the only one known, althoughit is the simplest general result. For example, Shostak’s method treats a sub-class of fragments with greater speed [84, 78]. Current efforts aim to extendcombination methods to require fewer or different restrictions [93, 99, 94, 4].However, most theories cannot be combined in a general way, particularlywhen quantification is allowed, so logicians develop special combination pro-cedures [100].

Decision procedures for PL (“SAT solvers”) and combination procedures(“SMT”, for SAT Modulo Theories) have received much practical attentionrecently, even in the form of annual competitions [81, 86]. Motivating appli-cations include software and hardware verification.

Automated Theorem Proving

While satisfiability/validity decision procedures have obvious advantages —speed; the guarantee of an answer in theory and often in practice; and, typi-cally, the ability to produce counterexamples — not all theories or interestingfragments are decidable. For example, reasoning about permutations of arraysis undecidable in certain contexts (see Exercise 6.5 and [77, 6]), yet we wouldlike to prove, for example, that sorting functions return permutations of theirinput.

Because FOL is complete, semi-decision procedures (see Section 2.6.2) arepossible: the procedure described before the proof of Lemma 2.31 is such aprocedure. More relevant are widely-used procedures with tactics [76, 67, 47].Recent work has looked at effectively incorporating decision procedures intogeneral theorem proving [48]. The specification language for verified program-ming is more expressive using these procedures. However, proving verificationconditions sometimes requires user intervention.

FOL may not be sufficiently expressive for some applications. Second-orderlogic extends first-order logic with quantification over predicates. Despite be-ing incomplete, researchers continue to investigate heuristics for partly au-tomating second-order reasoning [67]. Separation logic is another incompletelogic designed for reasoning about mutable data structures [75].

Static Analysis

Static analysis is one of the most active areas of research in verification. Clas-sically, static analyses of the form presented in Chapter 12 have been studiedin two areas: compiler development and research [62] and verification [20].

Important areas of current research include fast numerical analyses fordiscovering numerical relations among program variables [80], precise alias

13 Further Reading 349

and shape analyses for discovering how a program manipulates memory [96],and predicate abstraction and refinement [5, 37, 16, 24].

Static analyses of the form in Chapter 12 solve a set of implications fora fixpoint (an inductive assertion map is a fixpoint) by forward propagation.Other methods exist for finding fixpoints, including constraint-based staticanalysis [2, 17].

Static analyses also address total correctness by proving that loops andfunctions halt [18, 7, 8]. Their structure is different than the analyses of Chap-ter 12 as they seek ranking functions.

Concurrent Programs

We focus on specifying and verifying sequential programs in this book. Just asconcurrent programs are more complex than sequential programs, specifica-tion and verification methodologies for concurrent programs are more complexthan for sequential programs. Fortunately, many of the same methods, includ-ing the inductive assertion method and the ranking function method, are stillof fundamental importance in concurrent programming. Manna and Pnueliprovide a comprehensive introduction to this topic [53, 54]. Milner describes acalculus of concurrent systems in which both the system and its specificationis written [60].

Temporal Logic

Because functions of pi do not have side effects, specifying function behaviorthrough function preconditions and postconditions is sufficient. However, re-active and concurrent systems, such as operating systems, web servers, andcomputer processors, exhibit remarkably complex behavior. Temporal logicsare typically used for specifying their behavior.

A temporal logic extends PL or FOL with temporal operators that ex-press behavior over time. Canonical behaviors include invariance, in whichsome condition always holds; progress, in which a particular event eventuallyoccurs; and reactivity, in which a condition causes a particular event to occureventually.

Temporal logics are divided into logics over linear-time structures (LinearTemporal Logic (LTL)) [53]; branching-time structures (Computational TreeLogic (CTL), CTL∗) [14]; and alternating-time structures (Alternating-timeTemporal Logic (ATL), ATL∗) [3].

Model Checking

A finite-state model checker [15, 74] is an algorithm that checks whether finite-state systems such as hardware circuits satisfy given temporal properties.

An explicit-state model checker manipulates sets of actual states as vectorsof bits. A symbolic model checker uses a formulaic representation to represent

350 13 Further Reading

sets of states, just as we use FOL to represent possibly infinite sets of statesin Chapters 5 and 6. The first symbolic model checker was for CTL [12]; itrepresents sets of states with Reduced Ordered Binary Decision Diagrams(ROBDDs, or just BDDs) [10, 11].

LTL model checking is based on manipulating automata over infinitestrings [95]. A rich literature exists on such automata; see [91] for an in-troduction.

Predicate abstraction and refinement [5, 37, 16, 24] has allowed modelcheckers to be applied to software and represents one of the many intersectionsbetween areas of research (model checking and static analysis in this case).

Clarke, Grumberg, and Peled discuss model checking in detail [14].

References

1. W. Ackermann. Solvable cases of the decision problem. The Journal of Sym-bolic Logic, 22(1):68–72, March 1957.

2. A. Aiken. Introduction to set constraint-based program analysis. Science ofComputer Programming, 35(2-3):79–111, November 1999.

3. R. Alur, T. Henzinger, and O. Kupferman. Alternating-time temporal logic.Journal of the ACM, 49:672–713, 2002.

4. F. Baader, S. Ghilardi, and C. Tinelli. A new combination procedure forthe word problem that generalizes fusion decidability results in modal logics.Information and Computation, 204(10):1413–1452, 2006.

5. N. S. Bjørner, A. Browne, and Z. Manna. Automatic generation of invariantsand intermediate assertions. Theoretical Computer Science, 173(1):49–87, 1997.

6. A. R. Bradley. Safety Analysis of Systems. PhD thesis, Stanford University,June 2007.

7. A. R. Bradley, Z. Manna, and H. B. Sipma. Linear ranking with reachability. InComputer Aided Verification, volume 3576 of LNCS, pages 491–504. Springer-Verlag, 2005.

8. A. R. Bradley, Z. Manna, and H. B. Sipma. Termination of polynomial pro-grams. In Verification, Model Checking and Abstract Interpretation, volume3385 of LNCS, pages 113–129. Springer-Verlag, 2005.

9. A. R. Bradley, Z. Manna, and H. B. Sipma. What’s decidable about arrays?In Verification Model Checking and Abstract Interpretation, volume 3855 ofLNCS, pages 427–442. Springer-Verlag, January 2006.

10. R. E. Bryant. Graph-based algorithms for Boolean function manipulation.IEEE Transactions on Computers, 35(8):677–691, August 1986.

11. R. E. Bryant. Symbolic Boolean manipulation with ordered binary-decisiondiagrams. ACM Computing Surveys, 24(3):293–318, September 1992.

12. J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang. Sym-bolic model checking: 1020 states and beyond. Information and Computation,98(2):142–170, 1992.

13. A. Church. A note on the Entscheidungsproblem. Journal of Symbolic Logic,1, 1936.

14. E. Clarke, O. Grumberg, and D. Peled. Model Checking. MIT Press, 2000.15. E. M. Clarke and E. A. Emerson. Design and synthesis of synchronization

skeletons using branching-time temporal logic. In Logic of Programs, pages52–71. Springer-Verlag, 1982.

352 References

16. E. M. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith. Counterexample-guided abstraction refinement. In Computer Aided Verification, volume 1855of LNCS, pages 154–169. Springer-Verlag, 2000.

17. M. Colon, S. Sankaranarayanan, and H. B. Sipma. Linear invariant generationusing non-linear constraint solving. In Computer Aided Verification, volume2725 of LNCS, pages 420–433. Springer-Verlag, 2003.

18. M. Colon and H. B. Sipma. Synthesis of linear ranking functions. In Toolsand Algorithms for the Construction and Analysis of Systems, volume 2031 ofLNCS, pages 67–81. Springer-Verlag, April 2001.

19. D. C. Cooper. Theorem proving in arithmetic without multiplication. MachineIntelligence, 7:91–99, 1972.

20. P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model forstatic analysis of programs by construction or approximation of fixpoints. InPrinciples of Programming Languages, pages 238–252. ACM Press, 1977.

21. P. Cousot and N. Halbwachs. Automatic discovery of linear restraints amongthe variables of a program. In Principles of Programming Languages, pages84–96. ACM Press, 1978.

22. G. B. Dantzig. Programming in a Linear Structure. USAF, February 1948.23. G. B. Dantzig. Maximization of linear function of variables subject to linear

inequalities. In Cowles Commission for Research in Economics, volume 13,pages 339–347. Wiley, 1951.

24. S. Das and D. L. Dill. Successive approximation of abstract transition relations.In Logic in Computer Science, page 51. IEEE Computer Society, 2001.

25. M. Davis, G. Logemann, and D. Loveland. A machine program for theorem-proving. Communications of the ACM, 5(7):394–397, July 1962.

26. M. Davis and H. Putnam. A computing procedure for quantification theory.Journal of the ACM, 7(3):201–215, July 1960.

27. D. L. Detlefs, K. R. M. Leino, G. Nelson, and J. B. Saxe. Extended staticchecking. Technical Report 159, Compaq SRC, December 1998.

28. E. W. Dijkstra. Guarded commands, nondeterminacy and formal derivation ofprograms. Communications of the ACM, 18(8):453–457, August 1975.

29. P. J. Downey, R. Sethi, and R. E. Tarjan. Variations on the common subex-pressions problem. Journal of the ACM, 27(4):758–771, 1980.

30. B. Dreben and H. Putnam. The Craig interpolation lemma. Notre DameJournal of Formal Logic, 8(3):229–233, July 1967.

31. H. B. Enderton. A Mathematical Introduction to Logic. Academic Press, 1972.32. J. Ferrante and C. Rackoff. A decision procedure for the first order theory of

real addition with order. SIAM Journal on Computing, 4(1):69–76, 1975.33. M. J. Fischer and M. O. Rabin. Super-exponential complexity of Presburger

arithmetic. In R. M. Karp, editor, Complexity of Computation, volume 7 ofSIAM-AMS Proceedings, pages 27–42. American Mathematical Society, 1974.

34. R. W. Floyd. Assigning meanings to programs. In Symposia in Applied Math-ematics, volume 19, pages 19–32. American Mathematical Society, 1967.

35. K. Godel. Uber die Vollstandigkeit des Logikkalkuls. PhD thesis, University ofVienna, 1929.

36. K. Godel. Uber formal unentscheidbare Satze der Principia Mathematica undverwandter Systeme I. Monatshefte fur Mathematik und Physik, 38:173–198,1931.

References 353

37. S. Graf and H. Saidi. Construction of abstract state graphs with PVS. InComputer Aided Verification, volume 1254 of LNCS, pages 72–83. Springer-Verlag, 1997.

38. D. Hilbert. Die Grundlagen der Mathematik. Abhandlungen aus dem Seminarder Hamburgischen Universitat, 6:65–85, 1928.

39. C. A. R. Hoare. An axiomatic basis for computer programming. Communica-tions of the ACM, 12(10):576–580, October 1969.

40. W. Hodges. A Shorter Model Theory. Cambridge University Press, 1997.41. J. E. Hopcroft, R. Motwani, and J. D. Ullman. Automata Theory, Languages,

and Computation. Addison-Wesley, 3rd edition, 2006.42. R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press,

1985.43. J. Jaffar. Presburger arithmetic with array segments. Information Processing

Letters, 12(2):79–82, 1981.44. L. Kantorovich. Mathematical methods of organizing and planning production.

Management Science, 6:366–422, 1960. In Russian, Leningrad University, 1939.45. N. Karmarkar. A new polynomial-time algorithm for linear programming.

Combinatorica, 4:373–395, 1984.46. M. Karr. Affine relationships among variables of a program. Acta Informatica,

6:133–151, 1976.47. M. Kaufmann, P. Manolios, and J. S. Moore. Computer-Aided Reasoning: An

Approach. Kluwer Academic Publishers, 2000.48. M. Kaufmann, J. S. Moore, S. Ray, and E. Reeber. Integrating external deduc-

tion tools with ACL2. In Workshop on the Implementation of Logics, volume212, pages 7–26, 2006.

49. L. G. Khachian. A polynomial algorithm in linear programming. Soviet Math.Dokl., 20:191–194, 1979. In Russian, Dokl. Akad. Nauk SSSR 244, 1093–1096,1979.

50. J. King. A Program Verifier. PhD thesis, Carnegie Mellon University, Septem-ber 1969.

51. IEEE Symposium on Logic in Computer Science. http://www2.informatik.

hu-berlin.de/lics.52. Z. Manna. Mathematical Theory of Computation. McGraw-Hill, 1974. Also

Dover, 2004.53. Z. Manna and A. Pnueli. The Temporal Logic of Reactive and Concurrent

Systems: Specification. Springer-Verlag, 1991.54. Z. Manna and A. Pnueli. Temporal Verification of Reactive Systems: Safety.

Springer-Verlag, 1995.55. Z. Manna and R. Waldinger. The Deductive Foundations of Computer Pro-

gramming. Addison-Wesley, 1993.56. Z. Manna and C. G. Zarba. Combining decision procedures. In Formal Methods

at the Cross Roads: From Panacea to Foundational Support, volume 2757 ofLNCS, pages 381–422. Springer-Verlag, 2003.

57. P. Mateti. A decision procedure for the correctness of a class of programs.Journal of the ACM, 28(2), 1981.

58. J. McCarthy. Towards a mathematical science of computation. In InternationalFederation for Information Processing, pages 21–28, 1962.

59. J. McCarthy. A basis for a mathematical theory of computation. ComputerProgramming and Formal Systems, 1963.

354 References

60. R. Milner. Communication and Concurrency. Prentice Hall, 1989.61. A. Mine. The octagon abstract domain. In Analysis, Slicing and Transfor-

mation (part of Working Conference on Reverse Engineering), IEEE, pages310–319. IEEE Computer Society, October 2001.

62. S. S. Muchnick. Advanced Compiler Design and Implementation. MorganKaufmann Publishers Inc., 1997.

63. M. Muller-Olm and H. Seidl. A note on Karr’s algorithm. In InternationalColloquium on Automata, Languages and Programming, volume 3142 of LNCS,pages 1016–1027. Springer-Verlag, 2004.

64. G. Nelson. Combining satisfiability procedures by equality-sharing. Contem-porary Mathematics, 29:201–211, 1984.

65. G. Nelson and D. C. Oppen. Simplification by cooperating decision procedures.ACM Transactions on Programming Languages and Systems, 1(2):245–257, Oc-tober 1979.

66. G. Nelson and D. C. Oppen. Fast decision procedures based on congruenceclosure. Journal of the ACM, 27(2):356–364, April 1980.

67. T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL: A Proof Assistantfor Higher-Order Logic, volume 2283 of LNCS. Springer-Verlag, 2002.

68. D. C. Oppen. A 222

pn

upper bound on the complexity of Presburger arithmetic.Journal of Computer and System Sciences, 16(3):323–332, 1978.

69. D. C. Oppen. Reasoning about recursively defined data structures. In Princi-ples of Programming Languages, pages 151–157. ACM Press, 1978.

70. D. C. Oppen. Complexity, convexity and combinations of theories. TheoreticalComputer Science, 12:291–302, 1980.

71. D. C. Oppen. Reasoning about recursively defined data structures. Journal ofthe ACM, 27(3):403–411, July 1980.

72. C. H. Papadimitriou. Computational Complexity. Addison-Wesley, 1993.73. M. Presburger. Uber die Vollstandigkeit eines gewissen Systems der Arithmetik

ganzer Zahlen, in welchen die Addition als einzige Operation hervortritt. InComptes Rendus du Premier Congres des Mathematicienes des Pays Slaves,pages 92–101, 1929.

74. J. P. Queille and J. Sifakis. Specification and verification of concurrent systemsin CESAR. In International Symposium on Programming, volume 137 of LNCS.Springer-Verlag, 1982.

75. J. C. Reynolds. Separation logic: A logic for shared mutable data structures.In Logic in Computer Science, pages 55–74. IEEE Computer Society, 2002.

76. A. Riazanov and A. Voronkov. The design and implementation of VAMPIRE.AI Communications, 15(2):91–110, September 2002.

77. L. E. Rosier. A note on Presburger arithmetic with array segments, permuta-tion and equality. Information Processing Letters, 22(1):33–35, 1986.

78. H. Ruess and N. Shankar. Deconstructing Shostak. In Logic in ComputerScience. IEEE Computer Society, 2001.

79. B. Russell. Philosophy of Logical Atomism, 1918.80. S. Sankaranarayanan, M. Colon, H. Sipma, and Z. Manna. Efficient strongly

relational polyhedral analysis. In Verification, Model Checking, and AbstractInterpretation, volume 3855 of LNCS, pages 111–125, 2006.

81. The international SAT competitions web page. http://www.satcompetition.org.

82. A. Schrijver. Theory of Linear and Integer Programming. Wiley, 1986.

References 355

83. R. E. Shostak. An algorithm for reasoning about equality. Communications ofthe ACM, 21(7):583–585, July 1978.

84. R. E. Shostak. Deciding combinations of theories. Journal of the ACM, 31(1):1–12, January 1984.

85. M. Sipser. Introduction to the Theory of Computation. Course Technology,1996.

86. SMT-LIB: SMT solver competition. http://goedel.cs.uiowa.edu/smtlib.87. R. M. Smullyan. First-Order Logic. Dover, 1968.88. A. Stump, C. W. Barrett, D. L. Dill, and J. Levitt. A decision procedure for

an extensional theory of arrays. In Logic in Computer Science, pages 29–37.IEEE Computer Society, 2001.

89. N. Suzuki and D. Jefferson. Verification decidability of Presburger array pro-grams. Journal of the ACM, 27(1):191–205, January 1980.

90. A. Tarski. A Decision Method for Elementary Algebra and Geometry. Univer-sity of California Press, 1951.

91. W. Thomas. Automata on infinite objects. In J. van Leeuwen, editor, Handbookof Theoretical Computer Science, volume B, pages 133–191. Elsevier North-Holland, Inc., 1990.

92. C. Tinelli and M. T. Harandi. A new correctness proof of the Nelson-Oppencombination procedure. In Frontiers of Combining Systems, pages 103–120.Kluwer Academic Publishers, 1996.

93. C. Tinelli and C. Ringeissen. Unions of non-disjoint theories and combina-tions of satisfiability procedures. Theoretical Computer Science, 290(1):291–353, January 2003.

94. C. Tinelli and C. Zarba. Combining non-stably infinite theories. TechnicalReport TinZar-RR-03, University of Iowa, 2003.

95. M. Y. Vardi. An automata-theoretic approach to linear temporal logic. InLogics for Concurrency: Structure versus Automata, pages 238–266. Springer-Verlag, 1996.

96. R. Wilhelm, S. Sagiv, and T. W. Reps. Parametric shape analysis via 3-valuedlogic. Transactions on Programming Languages and Systems, 24(3):217–298,May 2002.

97. M. Yadegari. The use of mathematical induction by Abu Kamil Shuja’ IbnAslam (850-930). Isis, 69(2):259–262, June 1978.

98. R. Zach. Hilbert’s program. In E. N. Zalta, editor, The Stanford Ency-clopedia of Philosophy. The Metaphysics Research Lab, Fall 2003. http:

//plato.stanford.edu/archives/fall2003/entries/hilbert-program.99. C. Zarba, Z. Manna, and H. B. Sipma. Combining theories sharing dense

orders. In Automated Reasoning with Analytic Tableaux and Related Methods:Position Papers and Tutorials, pages 83–98, 2003.

100. T. Zhang, H. B. Sipma, and Z. Manna. Decision procedures for term algebraswith integer constraints. Information and Computation, 204(10):1526–1574,October 2006.

Index

+ associativity [axiom] 80, 81, 83+ commutativity [axiom] 80, 81, 83+ identity [axiom] 80, 81, 83+ inverse [axiom] 80, 81, 83+ ordered [axiom] 81, 83· associativity [axiom] 80, 81· commutativity [axiom] 80, 81· identity [axiom] 81· inverse [axiom] 81· left identity [axiom] 80· ordered [axiom] 81· right identity [axiom] 80NPTIME 54NPTIME-complete 55NP 54NP-hard 55PTIME 54P 54

abelian group, theory of 80, 83abs [program] 173absorption laws 342abstract domain 321, 343

interval 321Karr’s 321

abstract forward propagation algorithm324

abstract interpretation 312, 321abstract strongest postcondition 322AbstractForwardPropagate

[function] 324abstraction function 343ack left zero [axiom] 106

ack right zero [axiom] 106ack successor [axiom] 106

adjacent vertices 218admits (quantifier elimination) 184

affine assertion 333affine combination 334affine hull 336

affine space 333affine transformation 335

algorithm 54ancestor (semantic argument) 12

annotation 118assertion 122

function call assertion 131function postcondition 118

function precondition 118loop invariant 121

runtime assertion 122antecedent 4

antisymmetric 341antisymmetry [axiom] 81, 82application programming interface

175arguments 115

arity 4binary 4

unary 4arrangement 273

array congruence [axiom] 88, 263array predicate

bounded equality 168partitioned 127sorted 120

358 Index

weak permutation 170, 174array property 292, 300array property fragment 292, 300arrays, theory of 87, 291ascending chain condition 344assertion 122assertion map 316assignment (interpretation) 39assignment statement 126associative 341assume statement 126at least one root [axiom] 81, 82atom 4, 36atom [axiom] 85, 86, 259augmented matrix 211axiom schema 71axioms 69

+ associativity 80, 81, 83+ commutativity 80, 81, 83+ identity 80, 81, 83+ inverse 80, 81, 83+ ordered 81, 83· associativity 80, 81· commutativity 80, 81· identity 81· inverse 81· left identity 80· ordered 81· right identity 80ack left zero 106ack right zero 106ack successor 106antisymmetry 81, 82array congruence 88, 263at least one root 81, 82atom 85, 86, 259concat. atom 98concat. list 98construction 85, 86, 259distributivity 81divides 187divisible 83exp3 successor 96exp3 zero 96exp. successor 96exp. zero 96extensionality 89, 293flat atom 98flat list 98

function congruence 71, 242hashtable congruence 305induction 73, 75keys-put 305keys-remove 305left distributivity 80left projection 85, 259length atom 102length list 102plus successor 73, 75plus zero 73, 75predicate congruence 71, 243projection 86quotient less 100quotient successor 100read-over-put 1 305read-over-put 2 305read-over-remove 305read-over-write 1 88, 263read-over-write 2 88, 263reflexivity 71, 242, 305remainder less 100remainder successor 100reverse atom 98reverse list 98right distributivity 80right projection 85, 259separate identities 81square-root 81successor 73, 75symmetry 71, 242, 305times successor 73times zero 73torsion-free 83totality 81, 82transitivity 71, 81, 82, 242, 305two 270zero 73, 75

base case 96, 98basic path 125, 316basis 211BCP 28Bell numbers 275binary 4BinarySearch [program] 115–176Boolean connectives 4bound variable 36bounded equality 168

Index 359

branch (semantic argument) 12BubbleSort [program] 116–176

canonical form 325cardinality 39ccpar [function] 255characteristic formula 63class

congruence 245equivalence 245

clause 21closed (semantic argument) 12closed formula 37CNF see conjunctive normal formco-NP 55commutative 341Compactness Theorem 61complete 70complete (lattice) 342complete (proof method) 56complete (theory) 70complete induction 95, 99complete induction principle 100composition (substitution) 18computation 142concat. atom [axiom] 98concat. list [axiom] 98concrete domain 343concrete semantics 343concretization function 343congruence class 245

representative 252congruence closure 244, 247

algorithm 248parents 253

congruence closure parents 253congruence relation 72, 243, 245congruence-closure algorithm 241congruent [function] 255conjunctive fragment 208conjunctive normal form 21consequent 4consistent 70constant 35constraint 206constraint representation (affine space)

334constraint representation (polyhedron)

238

construction [axiom] 85, 86, 259constructor 259contrapositive 89convex 276convex hull 239convex space 218, 238convex theory 276Cooper’s method 187

divisibility predicate 186left elimination 194left infinite projection 189periodicity property 189right elimination 194right infinite projection 194triangular form 197

Craig Interpolation Lemma 61, 284cutpoint 316cutset 316cycle (graph) 251cylindrical algebraic decomposition 82

DAG see directed acyclic graphDe Morgan’s Law 19decidable 54, 70decision procedure 21deduction (semantic argument) 10descendant (semantic argument) 12dimension 211direct descendant (semantic argument)

12directed acyclic graph 251disjoint (partition) 246disjunctive normal form 20distributivity [axiom] 81divides [axiom] 187divisibility predicate 186divisible [axiom] 83DNF see disjunctive normal formdomain (interpretation) 39domain (substitution) 16DPLL decision procedure 28

Boolean constraint propagation 28unit clause 29unit resolution 29

dual optimization problem 215

echelon form 212edge (graph) 251elementarily equivalent 83

360 Index

elementary algebra see reals, theory ofelementary recursive 90elementary row operations 211entry function 142equality with uninterpreted functions

see equality, theory ofequality, theory of 71, 241equationally satisfied 215equisatisfiable 24equivalence class 245equivalence closure 247equivalence relation 71, 242, 245equivalent 14, 70

weakly 284EUF see equality, theory ofexistential closure 37existential quantifier 36exp3 successor [axiom] 96exp3 zero [axiom] 96exp. successor [axiom] 96exp. zero [axiom] 96extensionality 85, 293extensionality [axiom] 89, 293

falsifying interpretation 10Ferrante and Rackoff’s method 200

left infinite projection 201right infinite projection 201

field, theory of 80find [function] 254finished (semantic argument) 13first-order arithmetic see Peano

arithmeticfirst-order logic 4, 35

atom 36constant 35formula 36

closed 37existential closure 37universal closure 37

function 36interpretation 39

assignment 39domain 39variant 41

literal 36predicate 36propositional variable 36quantifier 36

bound variable 36existential 36quantified variable 36scope 36universal 36

satisfiable 42semantics 39subformula 37

strict 37substitution 46subterm 37

strict 38term 35valid 42variable 35

bound 37free 37

first-order theory see theoryfixpoint 342flat atom [axiom] 98flat list [axiom] 98FOL see first-order logicformal parameters 115formula 4, 36, 69

closed 37existential closure 37universal closure 37

formula schema 48placeholder 48side condition 48valid 49

formula template 16forward propagation 312, 317ForwardPropagate [function] 318fragment 70

array property 292, 300conjunctive 90, 208hashtable property 292, 305pure equality 284quantifier-free 70

free variable 37function 36function call assertion 131function congruence [axiom] 71, 242function postcondition 118function precondition 118functions

AbstractForwardPropagate

324

Index 361

ccpar 255congruent 255find 254ForwardPropagate 318merge 256node 254union 254Widen 324

Gaussian elimination 211echelon form 212elementary row operations 211

global optimum 218graph 251

directed 251directed acyclic 251edge 251node 251

graph coloring 33greatest fixpoint 343group

torsion-free, theory of 83group, theory of 80

has equality 284hashtable congruence [axiom] 305hashtable property 305hashtable property fragment 292, 305hashtables, theory of 291Hintikka set 60Hintikka’s Lemma 60Hoare triple 137hull

affine 336convex 239interval 326

idempotent 341identity matrix 210implies 15

weakly 284index guard 292, 300index set 295induction 95

complete 95, 99lexicographic well-founded 104stepwise 95, 96, 98structural 60, 95, 108well-founded 95, 102

induction [axiom] 73, 75inductive 143inductive assertion map 317inductive assertion method 113, 124,

143inductive definition 7inductive hypothesis 96, 98, 100, 103,

109inductive map see inductive assertion

mapinductive step 96, 98infimum see meetInsertionSort [program] 150, 174, 176integer-indexed arrays, theory of 300integers, theory of 73, 76, 183intended interpretation 74interpolant 61interpretation 6, 39, 70

assignment 39domain 39falsifying 10intended 74, 75, 77variant 41

interpreted 71intersection [program] 175, 176interval abstract domain 321interval analysis 311interval arithmetic 326interval hull 326invariant 143invariant generation 311invertible matrix 213irreflexive 112

join 341

Konig’s Lemma 59Karr’s abstract domain 321Karr’s analysis 311, 333key guard 305key set 304keys-put [axiom] 305keys-remove [axiom] 305

Lowenheim’s Theorem 60Lowenheim-Skolem Theorem 61lattice 341

complete 342least fixpoint 343

362 Index

left distributivity [axiom] 80left elimination 194left infinite projection

Cooper’s method 189Ferrante and Rackoff’s method 201

left projection [axiom] 85, 259left projector 259length atom [axiom] 102length function 142length list [axiom] 102lexicographic relation 103lexicographic well-founded induction

104line (semantic argument) 12linear equation 211linear inequality 214LinearSearch [program] 115–176lists, theory of 84literal 4, 36local optimum 218logical connectives 4loop invariant 121

matrix 209augmented 211identity 210invertible 213nonsingular 213square 211triangular form 211

matrix-matrix multiplication 210matrix-vector multiplication 210meet 341merge [function] 256merge [program] 174, 177MergeSort [program] 174minimal disjunction 283modus ponens 13monotone function 342ms [program] 174, 177

N-O see Nelson-Oppen combinationmethod

negation normal form 19Nelson-Oppen combination method

269, 270arrangement 273

NNF see negation normal formnode [function] 254

node (graph) 251nondeterministic-polynomial time 54nonsingular matrix 213normal form 18

conjunctive 21disjunctive 20negation 19prenex 52

null space 213

objective function 206, 214open (semantic argument) 13optimization problem 206, 214

constraint 206constraints 214dual 215, 216objective function 206, 214primal 216

order 55order (group) 83ordered field, theory of 81

parent (semantic argument) 12partial correctness 113partial order 341partially correct 124partially decidable see semi-decidablepartially ordered set 341partition 246partition [program] 164–176partitioned 127path 124

basic 125, 316Peano arithmetic, theory of 73, 96periodicity property 189pivot rule 236PL see propositional logicplaceholder (formula schema) 48plus successor [axiom] 73, 75plus zero [axiom] 73, 75PNF see prenex normal formpolyhedra

vertex enumeration algorithm 239polyhedron 214, 238

constraint representation 238ray 238vertex representation 238

polynomial time 54poset see partially ordered set

Index 363

precondition method 156predicate 36predicate calculus see first-order logicpredicate congruence [axiom] 71, 243predicate logic see first-order logicpredicate transformer 136, 312

strongest postcondition 312–316weakest precondition 136, 312–316

premise (semantic argument) 10prenex normal form 52Presburger arithmetic, theory of 73,

75primed variables 315program 142program annotation 113program counter 135, 315programs

abs 173BinarySearch 115–176BubbleSort 116–176InsertionSort 150, 174, 176intersection 175, 176LinearSearch 115–176merge 174, 177MergeSort 174ms 174, 177partition 164–176qsort 164–176QuickSort 164–176random 164–176subset 175, 176union 175, 178, 179

progress property 113projection [axiom] 86projector 259proof rules (semantic argument) 10proof tactics 22propositional calculus see proposi-

tional logicpropositional logic 4

atom 4Boolean connectives 4DPLL decision procedure 28

Boolean constraint propagation 28unit clause 29unit resolution 29

equivalent 14formula 4implies 15

interpretation 6falsifying 10

literal 4logical connectives 4resolution

resolvent 27resolution decision procedure 27satisfiable 8semantic argument method 10semantics 6subformula 5

strict 5substitution 16syntax 4truth symbols 4truth table 6truth values 6truth-table method 9valid 8variables 4

propositional variable 36pure equality fragment 284

QE see quantifier eliminationqsort [program] 164–176quantifier 36

bound variable 36existential 36quantified variable 36scope 36universal 36

quantifier alternation 208, 291quantifier elimination 82, 183, 184

admits 184integer arithmetic see Cooper’s

methodrational arithmetic see Ferrante and

Rackoff’s methodquantifier elimination procedure 183quantifier instantiation 291quantifier-free fragment 70, 208QuickSort [program] 164–176quotient 246quotient less [axiom] 100quotient successor [axiom] 100

random [program] 164–176range (substitution) 16ranking function 114, 144

364 Index

ranking function method 114rationals, theory of 79, 82, 183, 207RDS see recursive data structures,

theory ofreachable state 321read-over-put 1 [axiom] 305read-over-put 2 [axiom] 305read-over-remove [axiom] 305read-over-write 1 [axiom] 88, 263read-over-write 2 [axiom] 88, 263real closed field, theory of 81reals, theory of 79, 80recursive see decidable, recursiverecursive data structures, theory of 84recursively enumerable see semi-

decidablerefinement (relation) 246reflexive 245, 341reflexivity [axiom] 71, 242, 305relation

congruence 72, 243, 245equivalence 71, 242, 245lexicographic 103well-founded 102

remainder less [axiom] 100remainder successor [axiom] 100renaming (substitution) 46representative (congruence class) 252resolution

resolvent 27resolution decision procedure 27resolvent 27reverse atom [axiom] 98reverse list [axiom] 98right distributivity [axiom] 80right elimination 194right infinite projection

Cooper’s method 194Ferrante and Rackoff’s method 201

right projection [axiom] 85, 259right projector 259ring, theory of 80runtime assertion 122runtime error 122

safe (substitution) 47safety property 113satisfiable 8, 42, 70schema (substitution) 48

scope 36semantic argument method 10

ancestor 12branch 12closed 12deduction 10descendant 12direct descendant 12finished 13line 12open 13parent 12premise 10proof rules 10

derived 13semantics 6, 39semi-decidable 54separate identities [axiom] 81side condition (formula schema) 48signature 69simplex method 207, 218simultaneously satisfiable 61sorted 120sound 56source (edge) 251space

affine 333vector 210

specification 113square matrix 211square-root [axiom] 81stably infinite 270state 135statement

assignment 126assume 126

static analysis 311interval analysis 311, 325Karr’s analysis 311, 333

stepwise induction 95, 96, 98stepwise induction principle 96, 98strengthened hypothesis 97strict subformula 5strict subformula relation 108strict subterms 38strongest postcondition 312–316

abstract 322structural induction 95, 108structural induction principle 109

Index 365

subformula 5, 37strict 5, 37

subformula ordering 8subset [program] 175, 176substitution 16, 46

composition 18domain 16range 16renaming 46safe 47schema 48variable 17

subterm 37set 247strict 38

subterm set 247successor [axiom] 73, 75successor location 318supremum see joinsymbolic execution 317symmetric 245symmetry [axiom] 71, 242, 305syntax 4

target (edge) 251term 35theory 69

axioms 69complete 70consistent 70convex 276decidable 70equivalent 70formula 69fragment 70

conjunctive 90quantifier-free 70

has equality 284interpretation 70

intended 74, 75, 77satisfiable 70signature 69stably infinite 270valid 70

theory ofabelian group 80, 83arrays 87, 291equality 71, 241field 80

group 80torsion-free 83

hashtables 291integer-indexed arrays 300integers 73, 76, 183lists 84ordered field 81Peano arithmetic 73, 96Presburger arithmetic 73, 75rationals 79, 82, 183, 207real closed field 81reals 79, 80recursive data structures 84ring 80total order 81, 83

theory of rationals 79theory of reals 79times successor [axiom] 73times zero [axiom] 73torsion-free 83torsion-free [axiom] 83total (partition) 246total correctness 113, 143total order, theory of 81, 83totality [axiom] 81, 82transition relation 315transitive 245, 341transitivity [axiom] 71, 81, 82, 242, 305transpose 209triangular form (Cooper’s method)

197triangular form (matrix) 211truth symbols 4truth table 6truth values 6truth-table method 9Turing machine 54Turing-decidable see decidableTuring-recognizable see semi-

decidabletwo [axiom] 270

unary 4undecidable 54union [function] 254union [program] 175, 178, 179union-find algorithm 254unit clause 29unit resolution 29

366 Index

unit vector 210universal closure 37universal quantifier 36unsatisfiable core 237

valid 8, 42, 49, 70valid (formula schema) 49valid abstraction 344value constraint 292, 300, 305variable 35

bound 37free 37

variable (substitution) 17variable vector 209variables 4variant interpretation 41VC see verification conditionvector 209

unit 210vector space 210

basis 211dimension 211

vector-vector multiplication 209

verification condition 124, 136, 137verifying compiler 114vertex 214

defining constraints 214vertex enumeration algorithm 239vertex representation (affine space)

334vertex representation (polyhedron)

238

weak permutation 170, 174weakest precondition 136, 312–316weakly equivalent 284weakly implies 284well-founded induction 95, 102well-founded induction principle 103well-founded relation 102Widen [function] 324widening 321widening operator 323

zero [axiom] 73, 75

web educationwebéducation.com/wp-content/uploads/2020/03/aaron-r.-bradley-zo… · preface logic...

Documents