
SC-square 2018 Table of Contents

Table of Contents

Preface

Program committee

Author index

Keyword index

1 Invited talk

Hard Combinatorial Problems: A Challenge for Satisfiability
Ilias Kotsireas

2 Research papers

Towards Incremental Cylindrical Algebraic Decomposition in Maple
Alexander Cowen-Rivers and Matthew England

Evaluation of Equational Constraints for CAD in SMT Solving
Rebecca Haehn, Gereon Kremer and Erika Abraham

Refutation of Products of Linear Polynomials
Jan Horacek and Martin Kreuzer

Non-linear Real Arithmetic Benchmarks derived from Automated Reasoning in Economics
Casey Mulligan, Russell Bradford, James H. Davenport, Matthew England and Zak Tonks

A Practical Polynomial Calculus for Arithmetic Circuit Verification
Daniela Ritirc, Armin Biere and Manuel Kauers

Unknot Recognition Through Quantifier Elimination
Meesum Syed Mohammad and Prathamesh T. V. H.

3 Extended abstracts

New in CoCoA-5.2.4 and CoCoALib-0.99600 for SC-Square
John Abbott, Anna Maria Bigatti and Elisa Palezzato

Constraint Systems from Traffic Scenarios for the Validation of Autonomous Driving
Andreas Eggers, Matthias Stasch, Tino Teige, Tom Bienmuller and Udo Brockmeyer

Wrapping Computer Algebra is Surprisingly Successful for Non-Linear SMT
Pascal Fontaine, Mizuhito Ogawa, Thomas Sturm, Van Khanh To and Xuan Tung Vu

SMT-like Queries in Maple
Stephen A. Forrest

Techniques for Natural-style Proofs in Elementary Analysis
Tudor Jebelean


SC-square 2018 Preface

Preface

This volume contains the papers presented at Satisfiability Checking and Symbolic Computation (SC-square) 2018, held on July 11, 2018 in Oxford, UK as part of FLoC'18.

The workshop was one part of the SC-square project, a H2020 FETOPEN Coordination and Support Activity. Its goal was to provide time and space to bring together two communities, Symbolic Computation and SAT/SMT Solving, to share knowledge and experience, build bridges and collaborate.

The Symbolic Computation community is focused on using mathematical approaches, particularly computational algebraic geometry, to create algorithms that can find exact solutions to complex mathematical problems. With a comprehensive foundation of modern algebra, geometry and analysis, they represent the state of the art in mathematical insight into real-valued polynomial problems.

Conversely, the SAT/SMT community takes a practical and engineering approach to solving a variety of logical problems arising from verification and synthesis of computer hardware and software. Increasingly this includes supporting algebraic "theories" such as reasoning over real and floating-point numbers.

These two communities, separated by history, tradition and approach, are unified in their common interest in providing capable, efficient, scalable and flexible tools for solving a variety of mathematical, engineering and computational problems. The SC-square project has brought the two communities together to build understanding between researchers, and interfaces and common road-maps for tools.

The chairs would like to thank the PC, the authors and all of the participants who made the workshop and the wider project an interesting and productive meeting of minds.

Anna Maria Bigatti
Martin Brain

This volume has been created with the help of EasyChair


SC-square 2018 Program Committee

Program Committee

John Abbott             Universita degli Studi di Genova
Erika Abraham           RWTH Aachen University
Anna Maria Bigatti      Universita degli Studi di Genova
Martin Brain            University of Oxford
James H. Davenport      University of Bath
Matthew England         Coventry University
Pascal Fontaine         Loria, INRIA, University of Lorraine
Stephen Forrest         Maplesoft Europe GmbH
Alberto Griggio         Fondazione Bruno Kessler
Tudor Jebelean          RISC-Linz
Martin Kreuzer          Universitaet Passau
Werner M. Seiler        University of Kassel
Thomas Sturm            CNRS
Wolfgang Windsteiger    RISC Institute, JKU Linz, Austria


SC-square 2018 Author Index

Author Index

Abbott, John 88
Abraham, Erika 19

Bienmuller, Tom 95
Biere, Armin 61
Bigatti, Anna Maria 88
Bradford, Russell 48
Brockmeyer, Udo 95

Cowen-Rivers, Alexander 3

Davenport, James H. 48

Eggers, Andreas 95
England, Matthew 3, 48

Fontaine, Pascal 110
Forrest, Stephen A. 118

Haehn, Rebecca 19
Horacek, Jan 33

Jebelean, Tudor 122

Kauers, Manuel 61
Kotsireas, Ilias 1
Kremer, Gereon 19
Kreuzer, Martin 33

Mulligan, Casey 48

Ogawa, Mizuhito 110

Palezzato, Elisa 88

Ritirc, Daniela 61

Stasch, Matthias 95
Sturm, Thomas 110
Syed Mohammad, Meesum 77

T. V. H., Prathamesh 77
Teige, Tino 95
To, Van Khanh 110


Tonks, Zak 48

Vu, Xuan Tung 110


SC-square 2018 Keyword Index

Keyword Index

algebraic extensions 88
algebraic proof systems 33
algorithms 77
autocorrelation 1
automatic proof checking 61
autonomous driving 95

boolean polynomials 33

CoCoA and MathSAT 88
computer algebra 61
constraint solving 95
constraint systems 95
cylindrical algebraic decomposition 3, 19

D-optimal designs 1
decision procedures 110

economic reasoning 48
equational constraints 19

factorization 88

Gröbner basis 61, 88

Hadamard matrices 1

incremental 3
interval arithmetics 88
interval propagation 110

knot theory 77

linear clauses 33
logic 77

Maple 118
multiplier verification 61

natural-style proofs 122
non-linear arithmetic 110
non-linear real arithmetic 48

polynomial calculus 61


quantifier elimination 48, 110

real algebraic geometry 77
real roots 88

SAT solving 33
satisfiability checking 122
satisfiability modulo theories 48
SAT solvers 1
SMT 3, 110, 118
SMT solving 19, 77
SMTLIB 118
symbolic computation 77, 122

traffic scenarios 95


Hard Combinatorial Problems: A Challenge for Satisfiability*

Ilias S. Kotsireas

CARGO Lab, Wilfrid Laurier University

Waterloo, ON, [email protected]

http://www.cargo.wlu.ca

Abstract. The theory and practice of satisfiability solvers has experienced dramatic advances [1] in the past couple of decades. This fact attracted the attention of researchers who work with hard combinatorial problems [2, 6, 9–11, 5], with the hope that if suitable and efficient SAT encodings of these problems can be constructed, then SAT solvers can be used to solve large instances of such problems effectively. On the other hand, researchers working in the area of SAT and SMT solvers observed that by combining the combinatorial search capabilities of SAT solvers with the mathematical reasoning abilities of computer algebra systems (CAS), one could attack combinatorial problems in a way that neither approach may be able to on its own [2]. Further, SAT researchers have been interested in hard combinatorial problems and produced significant breakthroughs [7, 8, 3, 4] using either custom-tailored, highly-tuned SAT solver implementations or by combining the SAT and CAS paradigms. In our own work, we are using SAT solvers to solve hard combinatorial problems, such as Williamson Hadamard matrices, D-optimal matrices, complex Golay pairs and so forth. These problems are defined via the fundamental concept of autocorrelation [12]. It turns out that both these approaches (namely hand-tuned SAT solvers and SAT+CAS combinations) have had a number of successes already and it is safe to assume that a lot more successes are to be expected in the near future. Combinatorics is a vast source of very hard and challenging problems, often containing thousands of discrete variables, and I firmly believe that the interaction between SAT researchers and combinatorialists will continue to be very fruitful.

Acknowledgement This is joint work with Vijay Ganesh at the University of Waterloo.

Keywords: SAT solvers · combinatorial conjectures · autocorrelation · D-optimal designs · Hadamard matrices.

* Supported by an NSERC grant


References

1. Armin Biere, Marijn Heule, Hans van Maaren, Toby Walsh. Handbook of Satisfiability. Frontiers in Artificial Intelligence and Applications Volume 185, 2009.

2. Edward Zulkoski, Krzysztof Czarnecki, Vijay Ganesh. MathCheck: A Math Assistant via a Combination of Computer Algebra Systems and SAT Solvers. International Conference on Automated Deduction CADE 2015, pp. 607–622, LNCS 9195, Springer, 2015.

3. Curtis Bright, Ilias Kotsireas, Vijay Ganesh. A SAT+CAS Method for Enumerating Williamson Matrices of Even Order. Thirty-second Conference on Artificial Intelligence AAAI 2018, AAAI Press, 2018.

4. Curtis Bright, Ilias Kotsireas, Albert Heinle, Vijay Ganesh. Enumeration of Complex Golay Pairs via Programmatic SAT. International Symposium on Symbolic and Algebraic Computation ISSAC 2018, pp. 111–118, ACM, 2018.

5. Erika Abraham. Building Bridges between Symbolic Computation and Satisfiability Checking. Invited Talk, ISSAC 2015, Bath, United Kingdom.

6. Srinivasan Arunachalam, Ilias Kotsireas. Hard satisfiable 3-SAT instances via autocorrelation. J. Satisf. Boolean Model. Comput. 10 (2016), pp. 11–22.

7. Marijn Heule. Avoiding triples in arithmetic progression. J. Comb. 8 (2017), no. 3, pp. 391–422.

8. Marijn Heule, Oliver Kullmann, Victor W. Marek. Solving and verifying the Boolean Pythagorean triples problem via cube-and-conquer. Theory and applications of satisfiability testing, SAT 2016, pp. 228–245, LNCS 9710, Springer, Cham, 2016.

9. Edward Zulkoski, Curtis Bright, Albert Heinle, Ilias Kotsireas, Krzysztof Czarnecki, Vijay Ganesh. Combining SAT solvers with computer algebra systems to verify combinatorial conjectures. J. Automat. Reason. 58 (2017), no. 3, pp. 313–339.

10. Daniela Ritirc, Armin Biere, Manuel Kauers. Improving and extending the algebraic approach for verifying gate-level multipliers. DATE 2018, pp. 1556–1561.

11. Manuel Kauers, Martina Seidl. Symmetries of Quantified Boolean Formulas. SAT 2018, pp. 199–216.

12. I. S. Kotsireas. Algorithms and Meta-heuristics for Combinatorial Matrices. Handbook of Combinatorial Optimization, 2nd Edition, 2013, P. M. Pardalos, D.-Z. Du, R. Graham (Editors), pp. 283–309.

Towards Incremental Cylindrical Algebraic Decomposition in Maple

Alexander I. Cowen-Rivers¹ and Matthew England²

¹ University College London, London, [email protected]

² Coventry University, Coventry, [email protected]

Abstract. Cylindrical Algebraic Decomposition (CAD) is an important tool within computational real algebraic geometry, capable of solving many problems for polynomial systems over the reals. It has long been studied by the Symbolic Computation community and has found recent interest in the Satisfiability Checking community.

The present report describes a proof of concept implementation of an Incremental CAD algorithm in Maple, where CADs are built and then refined as additional polynomial constraints are added. The aim is to make CAD suitable for use as a theory solver for SMT tools, which search for solutions by continually reformulating logical formulae and querying whether a logical solution is admissible.

We describe experiments for the proof of concept, which clearly display the computational advantages compared to iterated re-computation. In addition, the project implemented this work under the recently verified Lazard projection scheme (with corresponding Lazard valuation).

1 Introduction

We aim to adapt Cylindrical Algebraic Decomposition (CAD) for use with SMT-solvers [2], as part of the SC² Project which seeks to build collaborations between researchers in Symbolic Computation and Satisfiability Solving [1]. We report on an implementation of incremental CAD in Maple which can build a CAD and then refine it by incrementally adding polynomials. The implementation is restricted to Open CAD (full dimensional cells) and the addition of constraints (an SMT solver would also want the ability to remove them). While a proof of concept implementation, experiments show clear savings on offer.

Another minor contribution of the present work is an implementation of the Lazard projection operator (and corresponding valuation for lifting). The operator was proposed in 1994 [8], but shortly after a flaw was found in its proof of correctness (see [10] for details). However, recent work [11] has given an alternative proof (which necessitates some changes to the lifting stage). It is now the smallest known complete CAD projection operator.


1.1 Terminology

We work over $n$-dimensional real space $\mathbb{R}^n$ in which there is a variable ordering.

Definition 1 A decomposition of the space $X \subset \mathbb{R}^n$ is a finite collection of disjoint regions, called cells, whose union is $X$.

Definition 2 A set is semi-algebraic if it can be constructed by finitely many applications of union, intersection and complementation operations on sets of the form $\{x \in \mathbb{R}^n \mid f(x) \geq 0\}$ where $f \in \mathbb{R}[x_1, \dots, x_n]$.

Definition 3 A decomposition $D$ is algebraic if each of its components $x \in D$ is a semi-algebraic set.

Definition 4 A finite partition $D$ of $\mathbb{R}^n$ is called a cylindrical decomposition of $\mathbb{R}^n$ if the projections of any two cells onto any lower dimensional coordinate space with respect to the variable ordering are either equal or disjoint.

Thus a cylindrical algebraic decomposition (CAD) satisfies Definitions 1-4.

Definition 5 A CAD is sign-invariant with respect to a set of input polynomials if each polynomial has a constant sign (positive, negative or zero) on each cell.

CADs may be produced with other invariant properties (see for example [5], [3]) but we assume sign-invariance in the present work. Each CAD cell is equipped with: a cell index, which is a list of integers that defines the position of a cell in the decomposition; and a sample point of the cell. The cells we produce also come with a cell description: a cylindrical formula, that is, a description of the cell as a sequence of conditions on ordered variables of the form $\ell(x_1, \dots, x_k) < x_{k+1} < u(x_1, \dots, x_k)$, where $\ell$ and $u$ may be $\pm\infty$.

CADs are traditionally produced through a two-stage process: first projection identifies polynomials of importance for the invariance property, and then lifting incrementally builds CADs of $\mathbb{R}^k$ for $k = 1, \dots, n$ according to these polynomials. Decompositions are performed by working at a sample point of a cell, reducing multivariate polynomials to univariate and then decomposing according to the output of real root isolation. For a fuller introduction see the lecture notes [7].

1.2 Example

We give a visual example³. The gingerbread face in Figure 1 is formed by four closed curves, each of which is defined by a bivariate polynomial equation. A corresponding sign-invariant CAD of $\mathbb{R}^2$ is visualised in Figure 2. We label the 37 open cells (those of two dimensions). There are a further 28 partially open cells (1-dimensional line segments) and 28 closed cells (isolated points), giving 93 CAD cells in total. Of course, in many industrial and SMT applications the polynomials will not form such aesthetically pleasing geometric shapes.

³ Inspired by http://planning.cs.uiuc.edu/node296.html


Fig. 1. Gingerbread face formed by 4 bivariate polynomials.

Fig. 2. A CAD of Fig. 1. Numbers correspond to open (full dimensional) cells.

1.3 Report plan

We aim to work with CADs that change incrementally by constraint. That is, given a CAD of $\mathbb{R}^n$ sign-invariant for polynomials $\{f_1, \dots, f_m\}$ we aim to adapt this to one sign-invariant for $\{f_1, \dots, f_m, f_{m+1}\}$ or $\{f_1, \dots, f_{m-1}\}$. The present report deals only with the first problem. Such incrementality is needed to use CAD in SMT, and could also benefit CAD directly as a way of reducing the search space. We proceed by considering the changes required in Projection (Section 2) and Lifting (Section 3) alongside the issues in using the Lazard projection operator. We finish with a summary and plans for future work in Section 4. A larger report, with additional details we do not have room for here, is available on arXiv [4].

2 Projection

2.1 Lazard projection

The present project built upon code from the ProjectionCAD package [6] for Maple. This implemented the McCallum family of projection operators [9], [3] and our first step was to adapt this for the Lazard projection scheme. All projection operators take a set of polynomials and produce another set in one less variable. The Lazard operator is essentially a subset of the McCallum operator. Both take discriminants and cross-resultants of the input polynomials. The Lazard operator then takes in addition leading and trailing coefficients while the McCallum operator takes all coefficients. Thus adaptations to the projection algorithms were fairly minimal here (see [4] for algorithms). The main computational differences occur in the lifting stage as discussed later.
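The difference in the coefficient sets can be illustrated with a small sketch. It is not the ProjectionCAD code: the function names are hypothetical, coefficients are represented as opaque strings, and "trailing" is taken to mean the lowest-degree nonzero coefficient, which is an assumption of this sketch.

```python
def lazard_coefficients(coeffs):
    """coeffs[k] is the coefficient of x_n**k (use 0 for a missing term).

    Returns only the leading and trailing nonzero coefficients.
    """
    nonzero = [c for c in coeffs if c != 0]
    return {nonzero[0], nonzero[-1]}


def mccallum_coefficients(coeffs):
    """Returns every nonzero coefficient."""
    return {c for c in coeffs if c != 0}


# Hypothetical polynomial in x2 with coefficients in x1:
#   x1*x2^3 + (x1 + 1)*x2^2 + (x1^2 - 1)
coeffs = ["x1^2 - 1", 0, "x1 + 1", "x1"]

lazard = lazard_coefficients(coeffs)
mccallum = mccallum_coefficients(coeffs)
print(sorted(lazard))
print(sorted(mccallum))
```

Under these assumptions the Lazard set is always contained in the McCallum set, which is the sense in which the text calls one operator a subset of the other.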


2.2 Worked example

We describe a worked example to illustrate what projection does and to use later to illustrate incremental projection. We work with a system of two polynomials $F_1$ and seek a sign-invariant CAD:

$$F_1 = \{\underbrace{x_1^2 + x_2^2 - 1}_{f_1},\ \underbrace{x_1^3 - x_2^2}_{f_2}\}. \tag{1}$$

Since the problem involves only two variables, we need only a single projection (which we do with respect to $x_2$). The leading coefficients are constant, but the trailing coefficients clearly identify the points on the $x_1$ line at $0$ and $\pm 1$. The discriminants do not identify anything more but the resultant

$$\mathrm{Resultant}(f_1, f_2, x_2) = (x_1^3 + x_1^2 - 1)^2 \tag{2}$$

identifies $\pm\alpha_1 \approx \pm 0.7549$⁴. Thus the real line is decomposed into 11 cells according to these 5 points.

Figure 3 plots the graphs of these functions along with the real roots isolated. We see they mostly correspond to geometrically relevant features ($-\alpha_1$ corresponds to an intersection in $\mathbb{C}^2$).
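As an illustration of how the decimal value of $\alpha_1$ can be checked, the following sketch isolates the real root of $x_1^3 + x_1^2 - 1$ in $(0, 1)$ by bisection over exact rationals. This is only an illustrative stand-in: the function names are hypothetical, and a CAD implementation would use a proper real root isolation routine and an algebraic number representation instead.

```python
from fractions import Fraction


def p(x):
    """The factor x^3 + x^2 - 1 appearing in resultant (2)."""
    return x**3 + x**2 - 1


def isolate_root(f, lo, hi, width):
    """Shrink [lo, hi] by bisection until it is at most `width` wide.

    Assumes f changes sign on [lo, hi]; all arithmetic is exact.
    """
    lo, hi = Fraction(lo), Fraction(hi)
    assert f(lo) * f(hi) < 0, "need a sign change on the interval"
    while hi - lo > width:
        mid = (lo + hi) / 2
        if f(mid) == 0:
            return mid, mid
        if f(lo) * f(mid) < 0:
            hi = mid
        else:
            lo = mid
    return lo, hi


lo, hi = isolate_root(p, 0, 1, Fraction(1, 10**6))
print(float(lo), float(hi))
```

The returned pair is an isolating interval: the polynomial is negative at the left endpoint and positive at the right one, so exactly this kind of interval (rather than a float) can be refined further when more precision is needed.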

2.3 Incremental Lazard projection

Our second step was to adapt the Lazard Projection algorithms to calculate newprojection polynomials incrementally.

We continue our example to illustrate the incremental working. Suppose we take $F_1$ from above and also the polynomial forming a line, $f_3 = x_2 - x_1$, to give

$$F_2 = \{\underbrace{x_1^2 + x_2^2 - 1,\ x_1^3 - x_2^2}_{F_1},\ \underbrace{x_2 - x_1}_{f_3}\}. \tag{3}$$

We will compute all the projection polynomials discussed above and some additional ones. The discriminant and leading coefficient of $f_3$ are constant, and the trailing coefficient identifies $x_1 = 0$, which was already present from the trailing coefficient of $f_2$. Similarly, the resultant of $f_2$ and $f_3$ is $x_1^2(x_1 - 1)$, which identifies two more roots we saw already. However, $\mathrm{Resultant}(f_1, f_3, x_2) = 2x_1^2 - 1$ identifies two new points, $\pm\alpha_2 = \pm 1/\sqrt{2} \approx \pm 0.7071$.

Figure 4 shows that the two new roots correspond to the two new intersections of the straight line with the circle.
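The two resultants involving $f_3$ can be checked by hand. Because $f_3 = x_2 - x_1$ is linear and monic in $x_2$, its resultant with another polynomial in $x_2$ equals, up to sign, that polynomial evaluated at $x_2 = x_1$. The following derivation is added as an illustration and is not taken from the original report:

```latex
\begin{align*}
\mathrm{Resultant}(f_1, f_3, x_2)
  &= f_1\big|_{x_2 = x_1} = x_1^2 + x_1^2 - 1 = 2x_1^2 - 1,\\
\mathrm{Resultant}(f_2, f_3, x_2)
  &= f_2\big|_{x_2 = x_1} = x_1^3 - x_1^2 = x_1^2(x_1 - 1).
\end{align*}
```

The first has real roots $\pm 1/\sqrt{2}$, and the second has real roots $0$ and $1$, matching the points described in the text.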

⁴ We give a decimal approximation but emphasise that CAD would use the full algebraic number representation: that $\alpha_1$ is the sole real root of (2) in $(0, 1)$.


Fig. 3. The blue curve is $f_1$ and the orange $f_2$. Dotted lines show the projection roots in $\mathbb{R}$.

Fig. 4. As Fig. 3 but with the additional teal curve $f_3$ and the additional roots identified.

We developed Algorithms 1 and 2 to implement such an incremental projection. The pseudo code is closely linked to the Maple implementation. The former computes all the projection polynomials via calls to the latter, which performs one projection. The adjustments (highlighted in blue, and marked as Adjustments 1 to 3 in the listings) required to make the original projection code (see [4]) incremental were:

1. To output from ProjectionPolysAdd the full table of projection polynomials organised by the main variable and re-input this with incremental calls.

2. To process and pass polynomials separately into ProjectionAdd.

3. To calculate only the new projection polynomials and store them appropriately.

2.4 Incremental projection experimental results

There were seven elements of the projection set to calculate in the original projection system $\mathrm{Proj}_L(F_1)$, and after the addition of the extra polynomial $\mathrm{Proj}_L(F_2)$ had 12. By avoiding full recomputation, we had to compute only the extra five. At the cell level the savings are more significant: on the real line there were 5 original roots (11 cells), and two new ones (making 15 cells).

We performed experiments to see how such savings transferred into computation time. We created examples using the random polynomial function randpoly in Maple. Testing was conducted through an external bash script creating new Maple instances to avoid any result caching. The testing code is available online⁵.

⁵ https://github.com/acr42/InCAD.git


Algorithm 1 ProjectionPolysAdd

Input: Set of calculated projection polynomials prev ∈ R[x_n, ..., x_1] and set of new polynomials new ∈ R[x_n, ..., x_1] (Adjustment 1)
Output: All projection polynomials ∈ R[x_n, ..., x_1]

Procedure ProjectionPolysAdd (compute all projection polynomials)
    dim ← Number of variables + 1
    pset[0] ← table()
    pset[0] ← Primitive set from new, wrt variable x_n (Adjustment 2)
    pset[0] ← Square-free basis set from pset[0], wrt variable x_n
    pset[0] ← Set of factors from pset[0], wrt variable x_n
    cont ← Set of contents of new, wrt x_n (Adjustment 2)
    for i from 1 to dim − 1 do
        out ← ProjectionAdd(prev[i − 1], pset[i − 1]) (Adjustment 2)
        pset[i] ← out ∪ cont
        cont ← Content set from pset[i], wrt variable x_{n−i}
        pset[i] ← Primitive set from pset[i], wrt variable x_{n−i}
        pset[i] ← Square-free basis set from pset[i], wrt variable x_{n−i}
        pset[i] ← Set of factors from pset[i], wrt variable x_{n−i}
    end for
    pset[dim − 1] ← Remove constant multiples from pset[dim − 1]
    ret ← pset[dim − 1]
    return pset (Adjustment 1)

Bivariate polynomials. We created 60 pairs of bivariate polynomials, and considered first finding the projection of one and then incrementing the projection by including the other. On average it was 16% faster to increment compared to computing the projection for both polynomials together. However, there was a large variance: the cases which were faster were on average 55% faster, while a small number of cases were slower, by as much as 87%.

Projection Results (bivariate)

Statistic         Classical     Incremental   Change
Variance          0.0004660s    0.0006425s    27.46% Larger
Mean              0.03743s      0.0315s       15.85% Faster
Lower Quartile    0.024s        0.008s        66.66% Faster
Median            0.0285s       0.015s        47.39% Faster
Upper Quartile    0.03675s      0.05525s      50.34% Slower
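The "Faster" percentages appear to be the time saved as a fraction of the classical time. That reading is an assumption rather than something the report states, and the helper below is a hypothetical illustration of it. Applied to the rounded times printed in the table, it agrees with the reported mean and median figures to within a few hundredths of a percentage point.

```python
def percent_faster(classical, incremental):
    """Time saved by the incremental run, as a percentage of the classical time."""
    return 100 * (classical - incremental) / classical


# Mean and median timings (seconds) copied from the bivariate table.
mean_saving = percent_faster(0.03743, 0.0315)
median_saving = percent_faster(0.0285, 0.015)
print(round(mean_saving, 2), round(median_saving, 2))
```

The small differences from the table are consistent with the published percentages having been computed from unrounded timings.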


Algorithm 2 ProjectionAdd

Input: Sets of polynomials new = {f_1, ..., f_m} ∈ R[x_n, ..., x_1], old ∈ R[x_{n−1}, ..., x_1], and variable ordering (Adjustment 2)
Output: Set of polynomials Pset = {p_1, ..., p_q} ∈ R[x_{n−1}, ..., x_1]

Procedure ProjectionAdd
    Polys ← Primitive set from new, wrt variable x_n (Adjustment 2)
    Cont ← Content set from new, wrt variable x_n (Adjustment 2)
    Polys ← Square-free basis set from Polys, wrt variable x_n
    Pset1 ← table()
    for i from 1 to number of elements of Polys do
        Pol ← Polys[i]
        clist ← Lazard coefficient set from Pol, wrt x_n
        temp ← Discriminant set from Pol, wrt x_n
        temp ← Remove constant multiples from temp
        Pset1[i] ← temp ∪ clist
    end for

    for i from 1 to number of Polys do
        for j from i + 1 to number of Polys do
            Pset2[i, j] ← Resultant of Polys[i] and Polys[j], wrt variable x_n
            Pset2[i, j] ← Remove constant multiples from Pset2[i, j]
        end for
    end for
    oldset ← old

    for i from 1 to number of Polys do
        for j from 1 to size(oldset) do
            Pset3[i, j] ← Resultant of Polys[i] and oldset[j], wrt variable x_n
            Pset3[i, j] ← Remove constant multiples from Pset3[i, j]
        end for
    end for (Adjustment 3)
    Pset ← union(Cont, Pset1, Pset2, Pset3)
    Pset ← Remove constant multiples from Pset
    return Pset
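The saving in Algorithm 2 comes from which pairs of polynomials need a resultant. The sketch below tracks only that bookkeeping. It computes no resultants, the function names are hypothetical, and polynomials are stood in for by labels.

```python
from itertools import combinations, product


def incremental_pairs(old, new):
    """Pairs needing a resultant when `new` is added to an already projected `old`.

    Mirrors the two nested loops of Algorithm 2: new-with-new, then new-with-old.
    Pairs drawn only from `old` are skipped because they were computed earlier.
    """
    return list(combinations(new, 2)) + list(product(new, old))


def full_pairs(polys):
    """Pairs needing a resultant when projecting everything from scratch."""
    return list(combinations(polys, 2))


old = ["f1", "f2"]   # the system F1 from the worked example
new = ["f3"]         # the added line

inc = incremental_pairs(old, new)
full = full_pairs(old + new)
print(inc)
print(len(full) - len(inc), "pair(s) reused from the earlier projection")
```

For the worked example this gives two new resultants instead of three, and the gap widens as the previously projected set grows, since the old-with-old pairs grow quadratically.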


Trivariate polynomials. We next created 80 further examples with pairs of trivariate polynomials, in this case restricting to 4 terms per polynomial. Here there were greater savings, on average 29% faster to increment than to compute altogether. There was also a smaller variance in the timings of the example set, although it was still the case that a few examples were slower to increment than to recompute.

Projection Results (trivariate)

Statistic         Classical     Incremental   Change
Variance          0.002743s     0.002205s     24.39% Smaller
Mean              0.06739s      0.04809s      28.64% Faster
Lower Quartile    0.02475s      0.013s        47.47% Faster
Median            0.0625s       0.035s        44.00% Faster
Upper Quartile    0.09425s      0.07525s      20.16% Faster

We suggest that the extra overheads of the incremental approach will become less important in comparison to the savings as the number of variables increases: indeed, this follows from the well-known complexity results on CAD.

3 Lifting

3.1 Lifting after Lazard projection

We had to make changes to the lifting code in ProjectionCAD, not just to allow for incrementality but also to validate the use of the Lazard projection operator [11]. The McCallum projection operator [9] is known to be incomplete if a projection polynomial is nullified over the sample point of a cell. For example, the polynomial $(y^2 - 2)w + z(y - x^2 + x + 2)$ is nullified over a cell in $(x, y, z)$-space with sample point $(\sqrt{2}, -\sqrt{2}, 1)$. When lifting after McCallum projection, one must check for this situation and warn the user that above the cell in question we are not guaranteed sign-invariance.

With the Lazard operator (as proved valid in [11]) we can avoid such checks and warnings, but we must do some additional work during lifting to recover information lost by nullification, as outlined in Algorithm 3. With the previous example we would first substitute for $x = \sqrt{2}$ to get $(y^2 - 2)w + z(y + \sqrt{2})$; but then before substituting for $y$ we must divide by $y + \sqrt{2}$ to give $(y - \sqrt{2})w + z$. We can only then substitute for $y$ to give $-2\sqrt{2}w + z$ and finally $z$ to give $-2\sqrt{2}w + 1$. We thus must lift with respect to this univariate polynomial in $w$, creating necessary cell divisions that would have been lost by nullification.

This process is difficult when involving irrational sample points. However, since our prototype implementation lifts only over open cells, we have avoided such difficulties for now. Our implementation produces Open CADs, avoiding costly algebraic number calculations, but still getting a good understanding of the solution set.


Definition 6 An Open-CAD is produced by lifting over open intervals only⁶.

Algorithm 3 Lazard valuation

Input: A polynomial f ∈ R[x_1, ..., x_d], and sp = [r_1, ..., r_{d−1}] ∈ R^{d−1}.
Output: The univariate polynomial in x_d obtained from f at sp (its real roots are then isolated for lifting).

Procedure LazardValuation
    for j from 1 to d − 1 do
        while f vanishes identically on substituting x_j = r_j do
            f ← f/(x_j − r_j)
        end while
        substitute x_j = r_j into f
    end for
    return f
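Algorithm 3 can be sketched in a few lines when the sample point is rational. The code below is a minimal illustration under stated assumptions, not the Maple implementation: polynomials are dictionaries from exponent tuples to coefficients, the helper names are hypothetical, the input must be nonzero, and the example is a rational analogue of the nullification example above (with 2 in place of $\sqrt{2}$), namely $(y^2 - 4)w + z(y - x^2 + x + 4)$ at $(x, y, z) = (2, -2, 1)$.

```python
from fractions import Fraction


def evaluate_at(f, j, r):
    """Substitute x_j = r; the exponent of x_j becomes 0 in every term."""
    out = {}
    for exps, c in f.items():
        key = exps[:j] + (0,) + exps[j + 1:]
        out[key] = out.get(key, 0) + c * r ** exps[j]
    return {k: v for k, v in out.items() if v != 0}


def divide_by_linear(f, j, r):
    """Exact quotient of f by (x_j - r) via synthetic division in x_j.

    Assumes f vanishes at x_j = r, so the remainder is zero.
    """
    by_deg = {}
    for exps, c in f.items():
        rest = exps[:j] + (0,) + exps[j + 1:]
        by_deg.setdefault(exps[j], {})[rest] = c
    quotient, carry = {}, {}
    for k in range(max(by_deg), 0, -1):
        cur = dict(by_deg.get(k, {}))          # q_{k-1} = a_k + r * q_k
        for rest, c in carry.items():
            cur[rest] = cur.get(rest, 0) + r * c
        cur = {m: c for m, c in cur.items() if c != 0}
        for rest, c in cur.items():
            quotient[rest[:j] + (k - 1,) + rest[j + 1:]] = c
        carry = cur
    return quotient


def lazard_valuation(f, sample):
    """Divide out nullifying factors before each substitution, as in Algorithm 3."""
    for j, r in enumerate(sample):
        r = Fraction(r)
        while not evaluate_at(f, j, r):        # empty dict: identically zero
            f = divide_by_linear(f, j, r)
        f = evaluate_at(f, j, r)
    return f


# Variables ordered (x, y, z, w): y^2*w - 4*w + y*z - x^2*z + x*z + 4*z
f = {(0, 2, 0, 1): 1, (0, 0, 0, 1): -4, (0, 1, 1, 0): 1,
     (2, 0, 1, 0): -1, (1, 0, 1, 0): 1, (0, 0, 1, 0): 4}
result = lazard_valuation(f, (2, -2, 1))
print(result)
```

Plain substitution of the sample point would give the zero polynomial here, whereas the valuation leaves $-4w + 1$, whose root is the cell boundary that nullification would otherwise hide.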

We will now discuss our approach for incremental lifting, which can be thought of graphically as a form of acyclic tree merge, as shown later.

3.2 Worked example

We will describe lifting the projection polynomial system defined previously, $\mathrm{Proj}_L(F_1)$. Recall that we identified four points on the real line: $\{-1, 0, \alpha_1, 1\}$. Thus, we need to choose a sample value from each of the 9 cells in the decomposition:

$$\begin{aligned}
&a_1 = \{x_1 < -1\},\quad a_2 = \{x_1 = -1\},\quad a_3 = \{-1 < x_1 < 0\},\\
&a_4 = \{x_1 = 0\},\quad a_5 = \{0 < x_1 < \alpha_1\},\quad a_6 = \{x_1 = \alpha_1\},\\
&a_7 = \{\alpha_1 < x_1 < 1\},\quad a_8 = \{x_1 = 1\},\quad a_9 = \{1 < x_1\}
\end{aligned} \tag{4}$$

We are forced to pick a non-rational sample point for cell $a_6$ but the others can be rational. We choose sample points: $(-2, -1, -\tfrac{1}{2}, 0, \tfrac{1}{2}, \alpha_1, \tfrac{9}{10}, 1, 2)$.
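The step from a sorted list of roots to the cells of the real line is mechanical: $k$ roots give $2k + 1$ cells, alternating open intervals and points. The sketch below illustrates it. The function names are hypothetical, $\alpha_1$ is replaced by a decimal stand-in, and picking midpoints as sample points is an assumption of this sketch (the text chooses $\tfrac{1}{2}$ and $\tfrac{9}{10}$ for two of the intervals instead).

```python
import math


def real_line_cells(roots):
    """Return the 2k+1 cells induced by k distinct roots as (lower, upper) pairs.

    A pair with lower == upper is a point cell; otherwise it is an open interval.
    """
    bounds = [-math.inf] + sorted(roots) + [math.inf]
    cells = []
    for lo, hi in zip(bounds, bounds[1:]):
        cells.append((lo, hi))
        if hi != math.inf:
            cells.append((hi, hi))
    return cells


def sample_point(cell):
    """Pick a sample point: the point itself, a midpoint, or one unit past an end."""
    lo, hi = cell
    if lo == hi:
        return lo
    if lo == -math.inf:
        return hi - 1
    if hi == math.inf:
        return lo + 1
    return (lo + hi) / 2


alpha1 = 0.7549  # decimal stand-in for the algebraic number alpha_1
cells = real_line_cells([-1, 0, alpha1, 1])
samples = [sample_point(c) for c in cells]
print(len(cells), samples)
```

Any point inside an open interval serves equally well as its sample, which is why an Open CAD can work with rational sample points throughout.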

We lift over each cell at the designated sample point by isolating real roots of the univariate polynomials in $x_2$ we get from the Lazard valuation at the sample point in $x_1$. Below, $p_{i,j}$ denotes the polynomial acquired after applying the Lazard valuation method to the $i$'th sample point, on the $j$'th polynomial from $F_1$. For example, if we use sample point $-\tfrac{1}{2}$ for cell $a_3$ then polynomial $f_1$ becomes $\tfrac{1}{4} + x_2^2 - 1$, which has two real roots at $\pm\beta_0 = \pm\sqrt{3}/2 \approx \pm 0.8660$. We proceed this way to generate our CAD cells in $\mathbb{R}^2$. The structure is as shown in Figure 5. For a full list of the new cell descriptions see [4].

⁶ Not actually a decomposition of $\mathbb{R}^n$, as it is missing the boundaries of the $n$-dimensional cells.


Fig. 5. CAD tree of F1. Green nodes are in the first dimension and red the second.

3.3 Incremental Lazard lifting

The general concept of how we solved this stage of the problem was to think of it as solving a graph (tree) attachment/detachment problem. One should think of the old CAD as having a tree structure which we save: where nodes are cells; and branches link a cell to its parent (the cell it projects onto), or child (decomposition in a cylinder above) cells. At each depth of the CAD/tree, say depth $p$, are all the cells within $\mathbb{R}^p$ before we lifted to $\mathbb{R}^{p+1}$. We go through a worked example.

We perform an incremental lift on the polynomial system $F_1$, incremented by a new polynomial $f_4 = x_1^3 + x_2^2$, forming the new system (5).

$$F_3 = \{\underbrace{x_1^2 + x_2^2 - 1,\ x_1^3 - x_2^2}_{F_1},\ \underbrace{x_1^3 + x_2^2}_{f_4}\} \tag{5}$$

The new system is symmetrical about the y axis as you can see in Figure 8.

Fig. 6. The blue curve is $f_1$, the orange $f_2$ and the teal $f_4$.

Fig. 7. Dotted lines show the projection roots.

Fig. 8. CAD tree of unchanged cells from $F_1$ incremented by $f_4$.

Fig. 9. CAD tree of new cells from $F_1$ incremented by $f_4$. Blue outlines around lines/nodes represent new connections/cells.

We skip the projection steps⁷. The additional polynomials identify one further point on the real line: $-\alpha_1$. The decomposition of the real line is now:

$$\begin{aligned}
&a_1 = \{x_1 < -1\},\quad a_2 = \{x_1 = -1\},\quad a_3 = \{-1 < x_1 < -\alpha_1\},\\
&a_4 = \{x_1 = -\alpha_1\},\quad a_5 = \{-\alpha_1 < x_1 < 0\},\quad a_6 = \{x_1 = 0\},\quad a_7 = \{0 < x_1 < \alpha_1\},\\
&a_8 = \{x_1 = \alpha_1\},\quad a_9 = \{\alpha_1 < x_1 < 1\},\quad a_{10} = \{x_1 = 1\},\quad a_{11} = \{1 < x_1\}
\end{aligned}$$

and our sample points are: $-2, -1, -\tfrac{9}{10}, -\alpha_1, -\tfrac{1}{2}, 0, \tfrac{1}{2}, \alpha_1, \tfrac{9}{10}, 1, 2$.

We first test whether each sample point leads to new roots when lifting with $f_4$. If so we must refine the decomposition in the cylinder above. For example, on $a_2$ we have sample point $x_1 = -1$, and the Lazard valuation of $f_4$ is $x_2^2 - 1$ with real roots at $\pm 1$. However, we had already identified these from other polynomials, so no change is required. On $a_5$ with sample point $-\tfrac{1}{2}$, by contrast, $f_4$ evaluates to $x_2^2 - \tfrac{1}{8}$ and we find two new real roots. We thus refine the decomposition above.

We then lift over the new sample points with respect to all polynomials, creating new decompositions. For example, at $x_1 = \tfrac{9}{10}$ we have two roots from the valuation of $f_1$ and another two from $f_4$ (so the decomposition above is into 9 cells). Figures 10-12 show the new CAD tree structure and its split into new and unchanged cells, illustrating potential savings. Full cell descriptions are in [4].

7 See ”Worked Examples” worksheet in: https://github.com/acr42/InCAD.git


Fig. 10. CAD tree of F1 incremented by f4: a merger of Fig. 10 and 11.

3.4 Algorithms

The two new algorithms created for incremental lifting are LiftSetupAdd (Algorithm 4), which concentrates on the base phase (the CAD of the real line), and LiftAdd (Algorithm 5), which deals with the rest of the CAD. The non-incremental versions can be found in the report [4].

When incrementing the lift stage, we can think of it as starting at the root node of the old CAD tree and working our way down it, one depth level at a time, until we reach the leaves. In the process of working down the old tree, we create subtrees, which are later reconnected to the unchanged tree to form the new incremented CAD tree.

One tree (UnchangedCells) is a strict subset of the old tree's nodes and edges. It is discovered by walking down the old structure and, at each depth p, computing the Lazard valuation of the new projection polynomials in $\mathbb{R}^{p+1}$ on each cell not marked as new. If new real roots are discovered, we prune the inspected cell's children, and the cell is sent to the NewCells set for full re-computation. In the NewCells set, we then use this cell to form a subtree. Each cell in the NewCells list is thus a subtree, to be reconnected later with the UnchangedCells tree via source indices.

When going through the old CAD tree structure, we have only two cases:

CASE 1: When a node has new children. New real roots have been acquired from one of the new projection polynomials. We start by pruning all of the child branches in the old tree structure, by labelling them as new, and then perform a full lift with respect to the set of all projection polynomials from $\mathbb{R}^k, \ldots, \mathbb{R}^n$, where k is the depth at which the new root was discovered. Labelling a cell as new effectively halts tree growth in the UnchangedCells structure, so that later on we can attach the branch extensions gained from the incremental lift. Such cells have an updated source index, as their source cell is now saved in a new tree structure with a new index.

CASE 2: When a node has no new children. The new flag is passed down to children from parents. Cells which are not new are stored in UnchangedCells. Cells here only have the Lazard valuation computed at the new projection polynomial (rather than at all projection polynomials). If this leads to a new root, we move the cell over to the NewCells structure. Otherwise, we continue with its child cells.

When moving cells into UnchangedCells, we make sure that the indexing of each cell does not clash with the indexing in NewCells at each depth level in the tree. We then merge the NewCells and UnchangedCells trees, forming the full incremented CAD tree. At each stage of the lift, we merge-sort the list.
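The two cases can be pictured with the following minimal Python sketch of the top-down walk. It is not the Maple implementation: the `Cell` class, `split_tree` and the callback `has_new_roots` (standing in for the Lazard valuation of the new projection polynomials at a cell's sample point) are hypothetical names, and the index bookkeeping and the re-lift of pruned cells are omitted.

```python
from dataclasses import dataclass, field

@dataclass
class Cell:
    name: str
    children: list = field(default_factory=list)
    new: bool = False

def split_tree(cell, has_new_roots, unchanged, new_cells):
    """Walk the old CAD tree top-down, splitting it into two collections.

    has_new_roots(cell) is an assumed callback that reports whether the new
    projection polynomials introduce real roots above this cell.
    """
    if has_new_roots(cell):
        # CASE 1: prune the children and queue the cell for a full re-lift.
        cell.new = True
        cell.children = []
        new_cells.append(cell)
        return
    # CASE 2: the cell is kept and its children are inspected in turn.
    unchanged.append(cell)
    for child in cell.children:
        split_tree(child, has_new_roots, unchanged, new_cells)
```

Everything below a cell that gains new roots is discarded in one step, which is where the potential saving comes from: subtrees under unchanged cells are never recomputed.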

3.5 Incremental lifting experimental results

We conducted testing for the incremental lift method on the same examples used to test the incremental projection earlier.

Bivariate polynomials: On average 30% faster than recomputing.

Lift Results      Classical    Incremental   Change
Variance          0.003734s    0.002903s     28.63% Smaller
Mean              0.1778s      0.1240s       30.25% Faster
Lower Quartile    0.1328s      0.089s        32.96% Faster
Median            0.163s       0.1255s       23.01% Faster
Upper Quartile    0.226s       0.163s        27.87% Faster

Trivariate polynomials: On average only 7% faster; one example 28% slower.

Lift Results      Classical    Incremental   Change
Variance          0.05541s     0.06838s      18.96% Larger
Mean              0.2880s      0.2687s       6.707% Faster
Lower Quartile    0.1275s      0.0995s       21.96% Faster
Median            0.207s       0.164s        20.77% Faster
Upper Quartile    0.3605s      0.3523s       2.29% Faster

So the lifting code shows the opposite trend to projection, with the savings decreasing with input size. This indicates that the overheads required for the incremental work grow faster than the savings (at least for the size of examples studied).

Of course, we can also experiment with combined projection and lifting. Unsurprisingly, the lifting costs dominate. For the bivariate polynomials the incremental code was 37% faster, but for the trivariate only 12% faster (see [4] for details). We think the drop in performance is due to poor choices of Maple data structure: in particular, Maple lists are implemented as immutable types, meaning that each of our edits caused a separate list to be created. Further, the project may benefit from a fully object-oriented approach. It is likely that further progress could come through code refactoring.


4 Summary and Future Work

The presented work acts as a proof of concept that incremental CAD construction in Maple is possible, with savings on offer. Care needs to be given to the datatypes used. Beyond that, the main areas of further work are moving out of the open case (our implementation restricted lifting to open cells), considering what happens if the new polynomial has a variable not already represented in the system, and considering incremental reduction of constraints. To remove a polynomial from the CAD means finding all those projection polynomials created from only that source polynomial (or as a resultant of that with another) and removing them and the cylinder splits that their real roots caused.

Acknowledgements This work was funded by the EU's H2020 programme under grant No H2020-FETOPEN-2015-CSA 712689 (SC2). We thank C. Brown for a tutorial on the Lazard valuation; S. Timms, J.H. Davenport and S. Forrest for useful discussions; and the organisers of the SC2 2017 Summer School where these took place.

References

1. E. Abraham, J. Abbott, B. Becker, A.M. Bigatti, M. Brain, B. Buchberger, A. Cimatti, J.H. Davenport, M. England, P. Fontaine, S. Forrest, A. Griggio, D. Kroening, W.M. Seiler, and T. Sturm. SC2: Satisfiability checking meets symbolic computation. In: Intelligent Computer Mathematics (LNCS 9791), pages 28–43. Springer International Publishing, 2016.

2. A. Biere, M. Heule, H. van Maaren, and T. Walsh. Handbook of Satisfiability (Vol. 185 Frontiers in Artificial Intelligence and Applications). IOS Press, 2009.

3. R. Bradford, J.H. Davenport, M. England, S. McCallum, and D. Wilson. Truth table invariant cylindrical algebraic decomposition. J. Symb. Comp., 76:1–35, 2016.

4. A.I. Cowen-Rivers and M. England. Summer research report: Towards incremental Lazard cylindrical algebraic decomposition. arXiv:1804.08564, 2018.

5. M. England, R. Bradford, and J.H. Davenport. Improving the use of equational constraints in cylindrical algebraic decomposition. In Proc. ISSAC '15, pages 165–172. ACM, 2015.

6. M. England, D. Wilson, R. Bradford, and J.H. Davenport. Using the Regular Chains Library to build cylindrical algebraic decompositions by projecting and lifting. In Proc. ICMS '14 (LNCS 8592), pages 458–465. Springer, 2014.

7. M. Jirstrand. Cylindrical algebraic decomposition: An introduction. Course notes from Linköping University, 1995.

8. D. Lazard. An improved projection for cylindrical algebraic decomposition. Algebraic Geometry and its Applications, pages 467–476, 1994.

9. S. McCallum. An improved projection operation for cylindrical algebraic decomposition. In Quantifier Elimination and Cylindrical Algebraic Decomposition, Texts & Monographs in Symbolic Computation, pages 242–268. Springer-Verlag, 1998.

10. S. McCallum and H. Hong. On using Lazard's projection in CAD construction. J. Symb. Comp., 72:65–81, 2016.

11. S. McCallum, A. Parusiński, and L. Paunescu. Validity proof of Lazard's method for CAD construction. In Press: J. Symb. Comp., 2017.

Algorithm 4 LiftSetupAdd

Input: A set of new projection polynomials new, an oldcad, a table of sets of all projection polynomials psetfull, and a variable ordering
Output: [NewCells, OldCad, OldRoots, UnchangedCells], where NewCells are in the last variable, LiftIncf_2 is information for the next lift, OldCad contains the previous CAD tree, and UnchangedCells is a subset.

Procedure LiftSetupAdd
    cad ← table()
    NewRoots ← [ ]
    NewCells ← table()
    UnchangedCells ← table()
    LiftIncf_2 ← [ ]
    for i from 1 to size(new[1]) do
        Append to NewRoots output of RealRoots(new[1][i])
        NewRoots ← Sort in ascending order and remove duplicates
    end for
    NewCells[1], UnchangedCells[1] ← Split(OldRoots, NewRoots, OldCad)
    for i from 1 to size(oldcad[1]) do
        for j from 1 to size(new[2]) do
            Add oldcad[1][i] to NewCells
        end for
    end for
    for i from 1 to size(NewCells[1]) do
        for j from 1 to size(psetfull[2]) do
            Set roots to real roots of LazardValuation(oldcad[1][i], new[2][j])
            Append [[i], [roots]] to LiftIncf_2
        end for
    end for
    for i from 1 to size(oldcad[1]) do
        If cell OldCad[1][i]'s flag is not equal to new, then add cell to UnchangedCells[1] and update index accordingly.
    end for
    return list of variables as in Output


Algorithm 5 LiftAdd

Input: A set of new projection polynomials new, an oldcad, a table of sets of all projection polynomials psetfull, and a variable ordering
Output: Incremented CAD

Procedure LiftAdd
    [NewCells, OldCad, LiftIncf_2, Unchanged] ← LiftSetupAdd(pset, vars)
    dim ← Number of elements in vars
    for d from 2 to dim − 1 do
        LiftIncf_{d+1} ← [ ]
        NewCells[d] ← Lift(NewCells_{d−1}, LiftIncf_d)
        for i from 1 to size(OldCad[d]) do
            for j from 1 to size(new[d+1]) do
                Add oldcad[d][i] to NewCells
            end for
        end for
        for i from 1 to size(NewCells[d]) do
            for j from 1 to size(psetfull[d+1]) do
                Set roots to real roots of LazardValuation(OldCad[d][i], new[d+1][j])
                Append [[i], [roots]] to LiftIncf_{d+1}
            end for
        end for
        for i from 1 to size(OldCad[d]) do
            If cell OldCad[d][i]'s flag is not equal to new, then add cell to Unchanged[d] and update index accordingly.
        end for
    end for
    FinalUnchangedCells ← [ ]
    FinalUnchangedCells ← FinalUnchangedCells Union Unchanged[d][f] for all cells f with their corresponding flags not equal to new.
    IncCAD ← FinalUnchangedCells Union NewCells[d]
    return IncCAD

Evaluation of Equational Constraints for CAD in SMT Solving

Rebecca Haehn, Gereon Kremer, and Erika Abraham

RWTH Aachen University, Germany

Abstract. The cylindrical algebraic decomposition algorithm is a quantifier elimination method for real-algebraic formulae. We use it as a theory solver in the context of satisfiability-modulo-theories (SMT) solving to solve sequences of related real-algebraic satisfiability questions. In this paper, we consider some optimizations for handling equational constraints. We review some previously published ideas, in particular Brown's projection operator and some improvements suggested by McCallum. Then we discuss different variants of the restricted projection operator to implement them in our SMT solver SMT-RAT and provide experimental results on the SMT-LIB benchmark set QF_NRA. We show that the performance improves especially for unsatisfiable inputs.

1 Introduction

Solving the satisfiability (SAT) problem is an important sub-problem in many applications, ranging from program analysis [GAB+17] to industrial configuration management [Vol15] or the dependency management in Linux package managers [ADCTZ11]. An extension to the regular SAT problem is the satisfiability-modulo-theories (SMT) problem that allows for richer logics. While SAT only allows for propositional formulae, SMT deals with quantifier-free first-order logic formulae over one or more theories. One popular theory is the theory of (non-linear) real arithmetic.

An SMT solver [BHvMW09,KS08] is traditionally separated into a SAT solving part, which makes use of a regular SAT solving engine, and a theory solving part. The SAT solving part deals with the logical structure of the formula and issues theory queries to the theory solving part. The theory solving module checks sets of theory constraints for consistency.

A number of theory solving modules for non-linear real arithmetic have been proposed, including incomplete algorithms like interval constraint propagation [GGI+10,HR97] or virtual substitution [Wei97]. In this paper, we deal with the cylindrical algebraic decomposition (CAD) method [Col75] which is, to the best of our knowledge, the only complete decision procedure for non-linear real arithmetic implemented in any SMT solver.

Given a set (conjunction) of real-algebraic input constraints and an ordering of the variables, the CAD method proceeds in two phases. It first uses a projection operator to produce a sequence of sets of polynomials of decreasing dimension. The sets of real roots of these polynomials can be used to construct the borders of finitely many semi-algebraic sets that we call cells. The points in each of the cells are equivalent regarding the satisfaction of the input formula. In the second phase, samples are constructed dimensionwise for each of the cells. Thus, to decide satisfiability it is sufficient to check whether any of the generated samples satisfies the formula.

A number of different projection operators exist, most notably the ones due to Collins [Col75], Hong [Hon90], Lazard [Laz94,MPP17], McCallum [McC85], and Brown [Bro01]. They use the same ingredients and mainly differ in the size of the sets of polynomials, essentially representing the continuing research in this area. It should be noted that the projection operators due to McCallum and Brown are incomplete in the sense that they may not generate some polynomials that are needed to find satisfying solutions. Past studies [VKA17] have shown, however, that this incompleteness does not surface on the SMT-LIB [BFT16] benchmarks.

In addition to the different projection operators, several modifications have been proposed to reduce the number of polynomials in certain cases. One particular modification that we analyse in this paper makes use of equational constraints. If equations are present in the input, they can be used to reduce the work for the projection.

In this paper, we present several possibilities to exploit the fundamental idea, in particular different variants of the restricted projection operator suggested by McCallum [McC01] in combination with Brown's projection operator [Bro01]. Then we discuss their impact on a CAD implementation used in the context of SMT solving in the SMT solver SMT-RAT and review some experimental results on the benchmark set QF_NRA provided by SMT-LIB. Some of the presented optimizations are proven to be sound while others might lead to incompleteness. However, similar to the projection operators that are incomplete in general, none of our optimizations yielded incorrect satisfiability results in our experiments.

After presenting some preliminaries in Section 2, we discuss the optimizations that we have implemented for the CAD projection phase in Section 3. In Section 4 we present and discuss our experimental results, and we conclude the paper in Section 5.

2 Preliminaries

2.1 SMT Solving

SMT solvers [BHvMW09,KS08] are used to decide the satisfiability of first-order logic formulae over one or more underlying theories. Traditionally, an SMT solver consists of a SAT solver and one or more theory solving modules. The SAT solver deals with the logical structure of the input formula by iteratively searching for solutions for its Boolean skeleton, which is the propositional logic formula obtained by replacing the theory constraints in the formula with fresh Boolean propositions. During this search, the theory solving modules are used to check whether the current Boolean assignment is consistent with the theories. Therefore they check the consistency of a set of theory constraints, which contains all theory constraints whose corresponding proposition in the Boolean skeleton is assigned true and that appear non-negated in the formula, as well as those whose proposition is assigned false and that appear negated.

[Figure omitted; labels: input CNF formula, Boolean abstraction, SAT solver, theory solver(s), constraints, (SAT+model) or (UNSAT+explanation) or (UNKNOWN), SAT/UNSAT]

Fig. 1: The SMT solving framework [VKA17]

The so-called lazy SMT solving approach described above is illustrated in Figure 1. The frequency of theory checks varies between approaches: in the full lazy approach they are only executed for full Boolean solutions, while in the less lazy approach they are already executed for partial solutions. If the theory constraints are conflicting, then the theory solving module returns an explanation for the conflict and the SAT solver searches for a different Boolean solution informed by the explanation. If the constraints are consistent in the theory, then either the Boolean assignment is not yet complete and the SAT solver continues its search, or it is complete, which means that a satisfying solution has been found.
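As a toy illustration of the full lazy loop, the following Python sketch enumerates complete Boolean assignments by brute force and passes each one that satisfies the skeleton to a theory check. It rests on several simplifying assumptions that are not from the paper: the "SAT solver" is plain enumeration without explanations or learning, the theory is restricted to bounds on a single real variable x, every atom is passed to the theory check with its assigned polarity, and the names `theory_consistent` and `lazy_smt` are hypothetical.

```python
from itertools import product

def theory_consistent(literals):
    """Check a conjunction of bounds (op, c) on one real variable x."""
    lo, lo_strict = float("-inf"), False
    hi, hi_strict = float("inf"), False
    for op, c in literals:
        if op in (">", ">="):
            strict = op == ">"
            if c > lo or (c == lo and strict):
                lo, lo_strict = c, strict
        else:
            strict = op == "<"
            if c < hi or (c == hi and strict):
                hi, hi_strict = c, strict
    return lo < hi or (lo == hi and not lo_strict and not hi_strict)

NEGATE = {"<": ">=", "<=": ">", ">": "<=", ">=": "<"}

def lazy_smt(clauses, atoms):
    """Full lazy loop: complete Boolean assignment first, theory check second.

    clauses: CNF over proposition names; a literal is (name, polarity).
    atoms:   proposition name -> (op, c), the constraint it abstracts.
    """
    names = sorted(atoms)
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        if not all(any(assignment[n] == pol for n, pol in clause)
                   for clause in clauses):
            continue                      # Boolean skeleton not satisfied
        literals = [atoms[n] if assignment[n]
                    else (NEGATE[atoms[n][0]], atoms[n][1]) for n in names]
        if theory_consistent(literals):
            return "SAT", assignment
    return "UNSAT", None
```

For example, with atoms a: x < 0, b: x > 5, c: x > 1 and the skeleton (a ∨ b) ∧ (¬a ∨ c), every assignment with a true is rejected by the theory check, and the loop settles on a false, b true, c true. A real solver would use the explanation of each conflict to prune the search instead of enumerating.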

2.2 The CAD Method

The CAD method [Col75] can be used as a theory solving module for the theory of (non-linear) real arithmetic. It works with respect to a fixed, static variable ordering that we assume to be given.

Polynomials $p \in \mathbb{Z}[x_1, \ldots, x_n]$ with integer coefficients over variables $x_1, \ldots, x_n$ are expressions of the form $p = \sum_{i=1}^{m} a_i \prod_{j=1}^{n} x_j^{e_{i,j}}$ with coefficients $a_i \in \mathbb{Z}$ and exponents $e_{i,j} \in \mathbb{N}_0$ for all $i = 1, \ldots, m$ and $j = 1, \ldots, n$. The products $\prod_{j=1}^{n} x_j^{e_{i,j}}$ are called monomials. If $n = 1$ then p is called univariate, otherwise multivariate. Multivariate polynomials in $\mathbb{Z}[x_1, \ldots, x_n]$ are usually interpreted as univariate polynomials in $x_n$ with polynomial coefficients from $\mathbb{Z}[x_1, \ldots, x_{n-1}]$, i.e., as polynomials from $\mathbb{Z}[x_1, \ldots, x_{n-1}][x_n]$.

Polynomial constraints have the form $p \sim 0$, where p is a polynomial in $\mathbb{Z}[x_1, \ldots, x_n]$ and where $\sim$ is one of the comparison predicates $<, \leq, =, \neq, \geq, >$. The sign of a polynomial $p \in \mathbb{Z}[x_1, \ldots, x_n]$ in a point $r \in \mathbb{R}^n$ is defined as $-1$ if p evaluates in r to a strictly negative value, $1$ for a strictly positive value, and $0$ otherwise. To decide whether $p \sim 0$ evaluates to true for a point r, it is sufficient to determine the sign of p under r, since p is compared to zero. If the sign of p is the same under all points from a set $C \subseteq \mathbb{R}^n$ then C is called p-sign-invariant; in this case it suffices to determine the sign of p for a single point in the set to evaluate the constraint for all points in the set. Thus the satisfiability of $p \sim 0$ can be determined if we can construct a finite partition $\mathcal{C} = \{C_1, \ldots, C_m\}$ of $\mathbb{R}^n$ into p-sign-invariant cells $C_i$, which we call a decomposition.

A decomposition $\mathcal{C} = \{C_1, \ldots, C_m\}$ is called $P_n$-sign-invariant for a set $P_n$ of polynomials if it is p-sign-invariant for each polynomial from $P_n$. It is furthermore called algebraic if each $C_i \in \mathcal{C}$ is a connected semi-algebraic set (see footnote 1). Such a decomposition can be determined by the roots of the polynomials in $P_n$. It is additionally cylindrical when the cells are cylindrically ordered, which means that for each $C_i, C_j \in \mathcal{C}$ and each $1 \leq k < n$ the projections of $C_i$ and $C_j$ to $\mathbb{R}^{n-k}$ (by removing the last k coordinates) are either disjoint or identical.

For a set $P_n$ of polynomials, a $P_n$-sign-invariant cylindrical algebraic decomposition (CAD) of $\mathbb{R}^n$ as described above can be computed using the CAD method. Though in general CAD works on real arithmetic formulae, in this paper we consider only sets of polynomial constraints as input, as we use the CAD method as an SMT theory solving module. The considered set $P_n$ contains all polynomials of the input constraints.

The CAD method computes the CAD in two phases: the projection phase and the lifting phase. In the projection phase the polynomials that describe the boundaries of the CAD cells are computed using a projection operator. The projection operator is applied to the polynomials in $P_n \subseteq \mathbb{Z}[x_1, \ldots, x_n]$ to compute a set $P_{n-1} \subseteq \mathbb{Z}[x_1, \ldots, x_{n-1}]$ of polynomials, for which a CAD is computed recursively. The projection operator has the important property that each CAD $\mathcal{C}_{n-1}$ for $P_{n-1}$ can be extended to a CAD $\mathcal{C}_n$ for $P_n$ in an easy way by defining the cells of $\mathcal{C}_n$ to be the $P_n$-sign-invariant regions in the cylinders $C'_i \times \mathbb{R}$ for each cell $C'_i \in \mathcal{C}_{n-1}$.

There are several possible projection operators. In this paper, Brown's projection operator [Bro01] is used since, compared to other operators, it is faster and fewer polynomials have to be computed [VKA17]. In the following, the degree $\deg(p)$ and the leading coefficient $\mathrm{lcf}(p)$ of a polynomial p are used with the usual meaning. To define Brown's projection operator we additionally need the Sylvester matrix of two univariate polynomials $p = \sum_{i=0}^{k} a_i x_n^i$ and $q = \sum_{i=0}^{l} b_i x_n^i$ in $x_n$ with polynomial coefficients $a_i, b_i \in \mathbb{Z}[x_1, \ldots, x_{n-1}]$, $\deg(p) = k \geq 1$, and $\deg(q) = l \geq 1$, which is the following $(k+l) \times (k+l)$-matrix.

1 A set $C \subseteq \mathbb{R}^n$ that can be described by a conjunction of polynomial constraints is called semi-algebraic.

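\[
\mathrm{Syl}_{x_n}(p, q) :=
\begin{pmatrix}
a_k & \cdots & a_0 &        &     \\
    & \ddots &     & \ddots &     \\
    &        & a_k & \cdots & a_0 \\
b_l & \cdots & b_0 &        &     \\
    & \ddots &     & \ddots &     \\
    &        & b_l & \cdots & b_0
\end{pmatrix}
\]

Here the first $l$ rows contain the coefficients of p, the last $k$ rows contain the coefficients of q, and all remaining entries are 0.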

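Furthermore, the resultant of p and q is defined as $\mathrm{res}(p, q) = \det(\mathrm{Syl}_{x_n}(p, q))$, the discriminant of p as $\mathrm{disc}(p) = \det(\mathrm{Syl}_{x_n}(p, p'))$, and the content of p as $\mathrm{cont}(p) = \gcd(a_0, \ldots, a_k)$, where gcd is the greatest common divisor. Finally, Brown's projection operator Proj itself is defined as follows. Let $P'_n = \{p'_1, \ldots, p'_m\}$ be the finest square-free basis of $P_n$.

\[
\begin{aligned}
\mathrm{Proj}(P_n) = P_{n-1} = {} & \{\mathrm{lcf}(p_i),\ \mathrm{disc}(p_i),\ \mathrm{res}(p_i, p_j) \mid p_i, p_j \in P'_n,\ i \neq j\} \\
& \cup \{\mathrm{cont}(p_i) \mid p_i \in P_n \text{ and } \mathrm{cont}(p_i) \text{ is non-zero, non-unit}\} \\
& \subset \mathbb{Z}[x_1, \ldots, x_{n-1}]
\end{aligned}
\]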
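The determinant-based definitions above can be tried out with the following Python sketch. It is deliberately restricted to univariate polynomials with integer coefficients, given as coefficient lists `[a_k, ..., a_0]`, whereas the paper's coefficients are themselves polynomials; the function names are hypothetical and the determinant is computed by plain Gaussian elimination over exact fractions.

```python
from fractions import Fraction

def sylvester(p, q):
    """Sylvester matrix of p and q (coefficient lists, highest degree first)."""
    k, l = len(p) - 1, len(q) - 1
    size = k + l
    rows = []
    for i in range(l):          # l shifted copies of the coefficients of p
        rows.append([0] * i + list(p) + [0] * (size - k - 1 - i))
    for i in range(k):          # k shifted copies of the coefficients of q
        rows.append([0] * i + list(q) + [0] * (size - l - 1 - i))
    return rows

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result    # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= factor * m[c][j]
    return result

def derivative(p):
    k = len(p) - 1
    return [c * (k - i) for i, c in enumerate(p[:-1])]

def resultant(p, q):
    return det(sylvester(p, q))

def discriminant(p):
    """disc(p) as det(Syl(p, p')); needs deg(p) >= 2 so that deg(p') >= 1."""
    return det(sylvester(p, derivative(p)))
```

For $p = x^2 - 1$ and $q = x - 2$ the sketch gives resultant 3, which is non-zero because the two polynomials share no root, while $x^2 - 2x + 1$ has discriminant 0 because of its double root.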

Instead of polynomial sets, individual polynomials can be projected incrementally [CKJ+15]. The CAD itself is then modified accordingly in a lifting phase that is also incremental. This way the polynomials can be added and removed individually, one after another. This makes it possible to reuse large parts of previously computed CADs instead of recomputing them, which is useful since many similar CAD computations are needed when using the CAD method as a theory solving module in a less lazy SMT solver: the SAT solver adds and removes only a few constraints, while most constraints usually remain the same.

Given a $P_n$-sign-invariant CAD, it suffices to consider one sample point from each cell to check whether any point in the cell satisfies the input constraints. These sample points are computed in the lifting phase. Since the CAD method is used to check whether a set of polynomial constraints is consistent, which is the case if we can assign a real value to each of the variables occurring in the set such that each constraint evaluates to true, it is also sufficient to compute only a partial CAD and stop the computation if a satisfying point is found [CH91].
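The idea of testing one sample per cell and stopping early can be sketched as follows. The sketch assumes the univariate case with rational sample points given in advance; the names `sign`, `CHECK` and `first_satisfying_sample` are hypothetical and polynomials are coefficient lists `[a_k, ..., a_0]`.

```python
from fractions import Fraction

def sign(p, r):
    """Sign (-1, 0 or 1) of the univariate polynomial p at the point r."""
    value = Fraction(0)
    for c in p:               # Horner evaluation
        value = value * r + c
    return (value > 0) - (value < 0)

CHECK = {
    "<": lambda s: s < 0, "<=": lambda s: s <= 0, "=": lambda s: s == 0,
    "!=": lambda s: s != 0, ">=": lambda s: s >= 0, ">": lambda s: s > 0,
}

def first_satisfying_sample(constraints, samples):
    """Return the first sample satisfying every constraint (p, op), else None."""
    for r in samples:
        if all(CHECK[op](sign(p, r)) for p, op in constraints):
            return r          # partial CAD: stop as soon as a model is found
    return None
```

For the constraints $x^2 - 1 < 0$ and $x > 0$ with one sample per cell of the decomposition induced by the roots $-1, 0, 1$, the search stops at $\tfrac{1}{2}$ without looking at the remaining samples; on an unsatisfiable input every sample has to be visited.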

3 Modification of the CAD Projection

When we construct a CAD for the purpose of theory solving in an SMT solver, we do not need the CAD to be sign-invariant on the input polynomials. Instead, we are content if every cell is invariant with respect to the evaluations of the constraints. We call a CAD truth-invariant if the truth value of every input constraint is constant on each cell.


Additionally, we can exploit the fact that we only use the CAD method as a theory solving module, which means that it is only used to check the consistency of sets of constraints instead of arbitrary Boolean combinations of constraints. This implies that a part of the solution space where a single constraint evaluates to false can be discarded as a whole, even if other constraints are not sign-invariant on this area. This part of the solution space may be more than a single cell of the CAD, and we can try to avoid constructing individual cells within this part of the solution space altogether.

In order to exploit this, we consider modifications of the projection operators based on equational constraints [McC99]. These are all constraints of the form $p = 0$, where p is a polynomial as defined above, which we call an equational constraint polynomial, since we only consider conjunctions of constraints as input for the CAD method. Technically, McCallum suggested different ways to construct CADs that are sign-invariant with respect to equational constraints and sign-invariant with respect to the other constraints only in the cells where the equational constraints evaluate to true. Of course, this is only possible if equational constraints are present in the set of input constraints.

The advantage of these modifications is that we can use a coarser CAD that still answers our question. This CAD considers fewer polynomials in the projection phase, which leads to a smaller number of sample points in the lifting phase and eventually to fewer cells. Therefore we expect the modified method to scale better on larger inputs in the presence of equational constraints.

We note that we compute partial CADs, i.e., we let the CAD method terminate if a satisfying sample is found, and thus a full CAD is usually only computed on unsatisfiable inputs. We also observe that many satisfiable SMT problems do not produce a lot of unsatisfiable theory calls, since most examples in the used benchmark set have little Boolean structure and large Boolean satisfying regions. Therefore we expect the impact of this modification to be beneficial mainly on unsatisfiable inputs. Furthermore, the lifting phase may even profit if more polynomials that are comparably easy (for example, with small degrees) are present in the projection, as they may lead to a satisfying sample without considering hard polynomials at all. In such a case these modifications may actually hinder the lifting phase and slow down our solver.

3.1 Restricted Projection

The first modification suggested by McCallum is to restrict the projection operator that is applied in the first step of the projection phase [McC99]. Afterwards, we continue with the original projection operator.

McCallum suggested this approach for his projection operator. However, it can be applied in combination with projection operators other than his own as well. We use this restricted projection with the projection operator due to Brown [Bro01].

For a finite set of constraints with polynomials $P_n \subset \mathbb{Z}[x_1, \ldots, x_n]$, let $E \subseteq P_n$ be a set of one polynomial which appears in an equational constraint and contains the variable $x_n$ that is to be eliminated first. The restricted projection of $P_n$ relative to E is defined as follows:

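\[
\begin{aligned}
\mathrm{Proj}_E(P_n) &= \mathrm{Proj}(E') \cup \{\mathrm{res}(e, p) \mid e \in E',\ p \in P'_n,\ p \notin E'\} \\
&\quad \cup \{\mathrm{cont}(p_i) \mid p_i \in P_n \text{ and } \mathrm{cont}(p_i) \text{ is non-zero, non-unit}\} \\
&= \{\mathrm{lcf}(e),\ \mathrm{disc}(e) \mid e \in E'\} \cup \{\mathrm{res}(e, p) \mid e \in E',\ p \in P'_n,\ p \notin E'\} \\
&\quad \cup \{\mathrm{cont}(p_i) \mid p_i \in P_n \text{ and } \mathrm{cont}(p_i) \text{ is non-zero, non-unit}\} \\
&\subset \mathbb{Z}[x_1, \ldots, x_{n-1}]
\end{aligned}
\]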

with the primitive part of a set of polynomials P defined as $\mathrm{prim}(P) = \{p/\mathrm{cont}(p) \mid p \in P \text{ and } p/\mathrm{cont}(p) \text{ is not constant}\}$, the finest square-free basis $P'_n$ for $\mathrm{prim}(P_n)$ and the finest square-free basis $E'$ for $\mathrm{prim}(E)$. Note that we may only be able to apply this restricted projection operator under an appropriate variable ordering, as $x_n$ must be present in the equational constraint.

When using this restricted projection operator instead of the original projection operator, fewer leading coefficients, discriminants, and resultants are added to the projection. This may reduce the size of the projection significantly, which we hope to be beneficial for the overall performance. As mentioned before, the removal of comparably easy polynomials (in particular leading coefficients) may also be a disadvantage for satisfiable sets of constraints, though. We nevertheless hope a partial CAD using this modified projection operator to be faster on average.

The result when applying this approach is a CAD that is sign-invariant with respect to the used equational constraint and sign-invariant with respect to the other constraints in those cells where the equational constraint is satisfied. McCallum gave a proof that validates the use of $\mathrm{Proj}_E$ in the first projection step in [McC99], as well as in both steps for a 3-dimensional CAD.

In our implementation, we apply $\mathrm{Proj}_E$ on all levels, provided that appropriate equational constraints are part of the input. Though doing so is not formally proven to be sound, we base our application on the following argument: the places where the proof fails are statistically rare, so in the context of solving many problems quickly we accept the risk.

3.2 Semi-Restricted Projection

Another modification suggested by McCallum is the semi-restricted projection operator [McC01], for which the repeated application is formally validated, wherever it is applicable throughout the projection phase. For sets of polynomials $P_n \subset \mathbb{Z}[x_1, \ldots, x_n]$ and $E = \{e\} \subset P_n$, where e is an equational constraint polynomial that contains the variable $x_n$, the semi-restricted projection of $P_n$ relative to E is defined as follows:

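\[
\mathrm{Proj}^*_E(P_n) = \mathrm{Proj}_E(P_n) \cup \{\mathrm{disc}(p) \mid p \in P'_n,\ p \notin E'\} \subset \mathbb{Z}[x_1, \ldots, x_{n-1}]
\]

with the finest square-free basis $P'_n$ for $\mathrm{prim}(P_n)$ and the finest square-free basis $E'$ for $\mathrm{prim}(E)$.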
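To get a feeling for how much smaller the restricted and semi-restricted projections are, the following Python sketch lists, symbolically, which polynomials each operator forms in a single projection step. It is a counting aid only, under stated assumptions: the input is taken to be a finest square-free basis already, contents are omitted, each resultant is listed once per unordered pair, nothing is actually computed, and the function names are hypothetical.

```python
from itertools import combinations

def brown_projection(polys):
    """Symbolic list of the polynomials formed by Proj (contents omitted)."""
    out = [("lcf", p) for p in polys] + [("disc", p) for p in polys]
    out += [("res", p, q) for p, q in combinations(polys, 2)]
    return out

def restricted_projection(polys, e):
    """Proj_E for E = {e}: lcf and disc of e, plus resultants with e."""
    return [("lcf", e), ("disc", e)] + [("res", e, p) for p in polys if p != e]

def semi_restricted_projection(polys, e):
    """Proj*_E: Proj_E plus the discriminants of the remaining polynomials."""
    return restricted_projection(polys, e) + [("disc", p) for p in polys if p != e]
```

With four input polynomials, one of which is an equational constraint polynomial, this gives 14 entries for Proj, 5 for $\mathrm{Proj}_E$ and 8 for $\mathrm{Proj}^*_E$. The gap widens with more inputs, since the number of resultants in Proj grows quadratically while it grows linearly in the two restricted variants.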

McCallum has shown that this operator can be used whenever applicable and that, alternatively, the restricted projection operator $\mathrm{Proj}_E$ can be used for the first step and the last step in the projection phase, and the semi-restricted projection operator $\mathrm{Proj}^*_E$ in every other step [McC01]. For the latter combination a detailed complexity analysis can be found in [EBD15,ED16]. This allows using equational constraint polynomials at every projection step where one is present to reduce the size of the projection. Note that if the underlying projection operator is incomplete (like McCallum's or Brown's) then the restricted projection operator is also incomplete.

3.3 The Resultant Rule

McCallum proposed a method called the resultant rule [McC01] to exploit the semi-restricted projection even if no explicit equational constraint is present for a specific level. If $e_1$ and $e_2$ are both equational constraint polynomials, their resultant $\mathrm{res}(e_1, e_2)$ is a propagated equational constraint polynomial, since $e_1 = 0 \wedge e_2 = 0 \Rightarrow \mathrm{res}(e_1, e_2) = 0$. Due to this rule more polynomials in the projection are classified as equational constraint polynomials.
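As a small worked illustration of the rule (the two polynomials are chosen here for the example and do not come from the paper), take $e_1 = x_2^2 + x_1^2 - 1$ and $e_2 = x_2 - x_1$, both viewed as polynomials in $x_2$ with $\deg(e_1) = 2$ and $\deg(e_2) = 1$. Following the Sylvester matrix definition in Section 2.2, with one row for $e_1$ and two rows for $e_2$:

```latex
\[
\mathrm{res}(e_1, e_2) =
\det \begin{pmatrix}
1 & 0 & x_1^2 - 1 \\
1 & -x_1 & 0 \\
0 & 1 & -x_1
\end{pmatrix}
= 1 \cdot \bigl((-x_1)(-x_1) - 0\bigr) + (x_1^2 - 1) \cdot \bigl(1 \cdot 1 - 0\bigr)
= 2x_1^2 - 1 .
\]
```

Any common solution of $e_1 = 0$ and $e_2 = 0$ has $x_2 = x_1$ and hence $2x_1^2 - 1 = 0$, so $2x_1^2 - 1$ can be treated as an equational constraint polynomial on the level of $x_1$.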

This method was only proposed for the semi-restricted projection, as the restricted projection was defined for the top level only. Given that we assume the restricted projection to be usable on all levels as well, we could also use the resultant rule when we apply the restricted projection multiple times. Currently, we do not use this rule in our implementation. The reasons are specific requirements for the underlying data structures, causing challenges for the implementation. Intuitively, a polynomial can be part of the projection for several reasons. In the incremental setting, we would need to keep track of all possible reasons and apply postponed projection steps if, e.g., an equational constraint is removed from the input set.

3.4 Bounds

A different approach to modify the projection uses bounds. Bounds are polynomial constraints of the form $b \cdot x + a \sim 0$, with $a, b \in \mathbb{Z}$, $\sim\, \in \{<, \leq, \geq, >\}$ and a variable x. They can be used to neglect some polynomials in the projection, namely those that are either always positive or always negative for all permitted values of their variables, since these have no roots and are therefore not needed to determine the CAD. For polynomials that can be neglected due to bounds, no successors (leading coefficient, discriminant, and resultants) need to be computed. This approach is described in more detail in [LSC+13].
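One simple way to detect that a polynomial cannot vanish within given bounds is interval arithmetic, sketched below in Python. This is an assumption made for illustration and not necessarily the technique of [LSC+13]: the sketch handles a single variable restricted to a finite closed interval, polynomials are coefficient lists `[a_k, ..., a_0]`, the function names are hypothetical, and the test is only sufficient, since interval evaluation may overestimate the range.

```python
def interval_mul(a, b):
    """Product of two closed intervals given as (lo, hi) pairs."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

def range_enclosure(p, box):
    """Interval enclosing the values of p over the closed interval box (Horner)."""
    lo, hi = 0, 0
    for c in p:
        lo, hi = interval_mul((lo, hi), box)
        lo, hi = lo + c, hi + c
    return (lo, hi)

def negligible_under_bounds(p, box):
    """True if the enclosure excludes 0, so p has no root within the bounds."""
    lo, hi = range_enclosure(p, box)
    return lo > 0 or hi < 0
```

For example, $x + 3$ is enclosed by $[3, 13]$ on $[0, 10]$ and can be neglected there, whereas $x^2 - 1$ on $[0, 5]$ yields $[-1, 24]$ and must be kept. A `False` answer does not prove that a root exists within the bounds; it only means that this check was inconclusive.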

4 Experimental Results

We implemented several of the modifications described in the previous section as part of the CAD module in our SMT solver SMT-RAT [CKJ+15]. These implementations are included in the publicly available version. To evaluate them we examine different combinations of the restricted and semi-restricted projection, as well as the simplification using bounds. We consider the following strategies: B uses only bounds to simplify the projection, while R uses the restricted projection operator for as many consecutive steps as possible, starting at the first step. BR combines these two modifications. BRI extends it by allowing for an interruption of the restricted projection, in the sense that it may be applied even when it is not applied in the step before. BSI employs the semi-restricted projection operator instead of the restricted one; otherwise, there is no difference to BRI.

For comparison we use Default, the standard CAD-based strategy in SMT-RAT. It uses a (supposedly less powerful) variant of the simplification based on bounds, but no optimizations using equational constraints. The modifications added to the regular CAD computations in the strategies B, BSI, and Default are provably sound, while for the other strategies no formal soundness proofs have been provided yet. Additionally, all strategies are based on the projection operator due to Brown [Bro01], which is incomplete on certain inputs.

As benchmark problems we use the QF_NRA benchmark set from the SMT-LIB [BFT16], which consists of 12084 problems from 10 different applications. We used a time limit of 30 seconds per problem instance and allowed at most 4 GB of memory for every solver strategy and every input problem. The possible outcomes of a solving run are sat, unsat, timeout or memout. For sat and unsat, the problem was solved correctly and is satisfiable or unsatisfiable, respectively. For timeout and memout, the solver was unable to find a solution or determine unsatisfiability within the given time and memory limits.

We did not find incorrect results for any of the solvers on any input problem, despite the incompleteness of the projection operator used. For the projection operator due to Brown, inputs that actually lead to incorrect results are known, but past experiments [VKA17] already showed that this benchmark set does not contain such examples.

Strategy   sat    unsat   timeout   memout
Default    4743   3939    3259      143
B          4764   3951    3225      144
R          4744   3962    3236      142
BR         4750   3964    3228      142
BRI        4752   4039    3151      142
BSI        4752   4038    3152      142

Table 1: Overall solver performances

The overall performance of each strategy is shown in Table 1. As expected, all modified strategies solve more problems than Default, and the improvements arise mostly on unsatisfiable inputs. We assume the reason is that the solver can only take advantage of the modifications when a full CAD is computed, since in that case fewer polynomials are needed. A full CAD is computed to detect a theory conflict, which should occur more often for unsatisfiable problems. Satisfiable instances, on the other hand, are usually solved with only a fraction of the projection computed. We examine the number of solved problems for which theory conflicts occurred in Table 2. Most satisfiable problems can be solved without a single theory conflict. Also, most of the additionally solved problems are problems where a theory conflict occurs.
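The comparison with Default can be made explicit with a small calculation on the numbers of Table 1. The sketch below only re-derives quantities from the table (solved instances and gains over Default); the variable names are chosen for this illustration.

```python
# Rows of Table 1: strategy -> (sat, unsat, timeout, memout)
results = {
    "Default": (4743, 3939, 3259, 143),
    "B":       (4764, 3951, 3225, 144),
    "R":       (4744, 3962, 3236, 142),
    "BR":      (4750, 3964, 3228, 142),
    "BRI":     (4752, 4039, 3151, 142),
    "BSI":     (4752, 4038, 3152, 142),
}
TOTAL = 12084  # size of the benchmark set

def solved(name):
    """Number of instances answered with sat or unsat."""
    sat, unsat, _, _ = results[name]
    return sat + unsat

base_sat, base_unsat, _, _ = results["Default"]
gains = {}
for name, (sat, unsat, timeout, memout) in results.items():
    # Every row has to account for the complete benchmark set.
    assert sat + unsat + timeout + memout == TOTAL
    gains[name] = (sat - base_sat, unsat - base_unsat, solved(name) - solved("Default"))
    print(name, solved(name), gains[name])
```

For BRI, for example, this gives 9 additional satisfiable and 100 additional unsatisfiable instances compared to Default.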

Strategy   overall   thereof sat   thereof unsat
Default    3798      396           3402
B          3823      408           3415
R          3828      402           3426
BR         3835      407           3428
BRI        3912      409           3503
BSI        3911      409           3502

Table 2: Numbers of problems where a theory conflict occurred

The strategy B was the best on satisfiable problems, but also the one with the least improvement on unsatisfiable problems. The pruning of polynomials due to bounds reduces the size of the projection but essentially does not change the lifting phase, as the removed polynomials provide no new samples anyway. The restricted projection operator further decreases the size of the projection but also removes samples from the lifting phase. As the restricted projection tends to remove polynomials of smaller degrees, this may actually be detrimental for the lifting phase and thus for satisfiable instances. We note that in practice there seems to be no significant impact of using the restricted projection on satisfiable instances.

The strategies BRI and BSI achieve the best results overall. Allowing for interruptions between applications of the (semi-)restricted projection operator improves the solver’s performance on unsatisfiable problems significantly compared to the other variants. At least on this set of benchmarks, the differences between the restricted and the semi-restricted projection are negligible.

Next, we take a closer look at the running times of the different strategies, based on the 8657 input problems that could be solved by all strategies. The average running times are collected in Table 3. Compared to Default, the simplifications using bounds and the restricted projection operator do speed up the computations significantly. The strategy BSI has the best average running time, closely followed by BRI. This directly reflects the superior overall result of these two strategies.

Strategy   all (ms)   sat (ms)   unsat (ms)
Default    705        108        1422
B          675        98         1368
R          683        113        1369
BR         673        101        1361
BRI        649        101        1308
BSI        649        101        1307

Table 3: Average running times in ms

Lastly, we take a look at the memouts: none of the modifications significantly changes the number of memouts. One would expect that significantly reducing the size of the projections would mitigate memory issues. However, the modified strategies, in contrast to Default, never actually remove polynomials but merely deactivate them. This causes the solver to consume more memory, which leads to more memouts. It is possible to delete these polynomials instead of deactivating them when using the modified strategies as well. However, we do not expect that to significantly improve the overall performance, since this is only relevant for large problems that are often hard to solve anyway and are therefore likely to just result in a timeout instead. This gets even more likely due to the fact that deleted polynomials might have to be recomputed later.

We examined this by means of the BRI strategy and implemented the corresponding strategy with the deletion of polynomials, referred to as BRID in the following. Table 4 compares the results for these two strategies, and they are indeed relatively similar. As expected, more timeouts occur when using BRID, while with BRI one more memout occurs. Thus we observe that deleting polynomials even decreases the overall solver performance, though the average running time on the problems solved by all strategies is nearly the same for both variants.

Strategy   sat    unsat   timeout   memout
BRI        4751   4039    3151      142
BRID       4736   4032    3175      141

Table 4: Comparison of deactivation and deletion

5 Conclusion

We examined the impact of restricted projection operators, as proposed by McCallum for the CAD method, on the performance of an SMT solver. After presenting several variants of how to use these in an actual implementation, we provided some experimental results for the implemented modifications. We showed that the performance improves, especially for unsatisfiable inputs, when the restricted or semi-restricted projection operators are used instead of the original one. It made, however, no noticeable difference whether the restricted or the semi-restricted projection operator was used, though the repeated application is currently only validated for the semi-restricted operator.

We further investigated the difference between deleting polynomials from the projection and only disabling them in the incremental setting during SMT solving. Though one could hope for a decreased memory consumption when deleting polynomials, this change did not make a significant difference.

Our ideas for future investigation concerning equational constraints mainly deal with making the best possible use of the restricted projection operator by applying this idea in as many levels as possible. One direction would be the use of the resultant rule [McC01] that allows propagating equational constraints. Another option would be the modification of the variable ordering heuristic depending on the equations that are present in the input. It would also be interesting to investigate whether a heuristic for the choice of which equational constraint to use for the restricted projection operator can be found. Currently we use the equational constraint that is added first; however, the number of cells in the resulting CAD depends on the designated equational constraint, as shown in [EBD15]. That paper furthermore shows how equational constraints can be used to make additional savings in the lifting phase, which could also be implemented in SMT-RAT.

References

[ADCTZ11] Pietro Abate, Roberto Di Cosmo, Ralf Treinen, and Stefano Zacchiroli, MPM: A modular package manager, 14th International ACM SIGSOFT Symposium on Component Based Software Engineering (CBSE-2011) (Boulder, CO, United States) (Ivica Crnkovic, Judith A. Stafford, Antonia Bertolino, and Kendra M. L. Cooper, eds.), Proceedings of the 14th International ACM Sigsoft Symposium on Component Based Software Engineering, CBSE 2011, part of Comparch ’11 Federated Events on Component-Based Software Engineering and Software Architecture, ACM, 2011, pp. 179–187.

[BFT16] Clark Barrett, Pascal Fontaine, and Cesare Tinelli, The satisfiability modulo theories library (SMT-LIB), www.SMT-LIB.org, 2016.

[BHvMW09] Armin Biere, Marijn Heule, Hans van Maaren, and Toby Walsh, Handbook of satisfiability, Frontiers in Artificial Intelligence and Applications, vol. 185, IOS Press, 2009.

[Bro01] Christopher W. Brown, Improved projection for cylindrical algebraic decomposition, Journal of Symbolic Computation 32 (2001), no. 5, 447–465.

[CH91] George E. Collins and Hoon Hong, Partial cylindrical algebraic decomposition for quantifier elimination, Journal of Symbolic Computation 12 (1991), no. 30, 299–328.

[CKJ+15] Florian Corzilius, Gereon Kremer, Sebastian Junges, Stefan Schupp, and Erika Abraham, SMT-RAT: An open source C++ toolbox for strategic and parallel SMT solving, Proceedings of SAT’15, LNCS, vol. 9340, Springer, 2015, pp. 360–368.


[Col75] George E. Collins, Quantifier elimination for real closed fields by cylindrical algebraic decomposition, Automata Theory and Formal Languages, LNCS, vol. 33, Springer, 1975, pp. 134–183.

[EBD15] Matthew England, Russell Bradford, and James H. Davenport, Improving the use of equational constraints in cylindrical algebraic decomposition, Proceedings of the 2015 ACM on International Symposium on Symbolic and Algebraic Computation (New York, NY, USA), ISSAC ’15, ACM, 2015, pp. 165–172.

[ED16] Matthew England and James H. Davenport, The complexity of cylindrical algebraic decomposition with respect to polynomial degree, Computer Algebra in Scientific Computing (Cham) (Vladimir P. Gerdt, Wolfram Koepf, Werner M. Seiler, and Evgenii V. Vorozhtsov, eds.), Springer International Publishing, 2016, pp. 172–192.

[GAB+17] Jurgen Giesl, Cornelius Aschermann, Marc Brockschmidt, Fabian Emmes, Florian Frohn, Carsten Fuhs, Jera Hensel, Carsten Otto, Martin Plucker, Peter Schneider-Kamp, Thomas Stroder, Stephanie Swiderski, and Rene Thiemann, Analyzing program termination and complexity automatically with AProVE, Journal of Automated Reasoning 58 (2017), no. 1, 3–31.

[GGI+10] Sicun Gao, Malay Ganai, Franjo Ivancic, Aarti Gupta, Sriram Sankaranarayanan, and Edmund M. Clarke, Integrating ICP and LRA solvers for deciding nonlinear real arithmetic problems, Proceedings of FMCAD’10, IEEE, 2010, pp. 81–90.

[Hon90] Hoon Hong, An improvement of the projection operator in cylindrical algebraic decomposition, ISSAC ’90 Proceedings of the International Symposium on Symbolic and Algebraic Computation (1990), 261–264.

[HR97] Stefan Herbort and Dietmar Ratz, Improving the efficiency of a nonlinear-system-solver using a componentwise Newton method, Tech. Report 2/1997, Inst. für Angewandte Mathematik, University of Karlsruhe, 1997.

[KS08] Daniel Kroening and Ofer Strichman, Decision procedures: An algorithmic point of view, Springer, 2008.

[Laz94] Daniel Lazard, An improved projection for cylindrical algebraic decomposition, Algebraic Geometry and its Applications: Collections of Papers from Shreeram S. Abhyankar’s 60th Birthday Conference (Chandrajit L. Bajaj, ed.), Springer New York, New York, NY, 1994, pp. 467–476.

[LSC+13] Ulrich Loup, Karsten Scheibler, Florian Corzilius, Erika Abraham, and Bernd Becker, A symbiosis of interval constraint propagation and cylindrical algebraic decomposition, Proceedings of CADE-24, LNCS, vol. 7898, Springer, 2013, pp. 193–207.

[McC85] Scott McCallum, An improved projection operation for cylindrical algebraic decomposition, Tech. report, University of Wisconsin Madison, 1985.

[McC99] Scott McCallum, On projection in CAD-based quantifier elimination with equational constraint, Proceedings of the 1999 International Symposium on Symbolic and Algebraic Computation, ACM, 1999, pp. 145–149.

[McC01] Scott McCallum, On propagation of equational constraints in CAD-based quantifier elimination, Proceedings of the 2001 International Symposium on Symbolic and Algebraic Computation, ACM, 2001, pp. 223–231.

[MPP17] Scott McCallum, Adam Parusiński, and Laurentiu Paunescu, Validity proof of Lazard’s method for CAD construction, Journal of Symbolic Computation (2017).


[VKA17] Tarik Viehmann, Gereon Kremer, and Erika Abraham, Comparing different projection operators in the cylindrical algebraic decomposition for SMT solving, Proceedings of the 2nd International Workshop on Satisfiability Checking and Symbolic Computation, CEUR Workshop Proceedings, vol. 1974, CEUR-WS.org, 2017.

[Vol15] Matthias Volk, Using SAT solvers for industrial combinatorial problems, Master’s thesis, RWTH Aachen University, Germany, Aachen, 2015.

[Wei97] Volker Weispfenning, Quantifier elimination for real algebra - the quadratic case and beyond, Applicable Algebra in Engineering, Communication, and Computing 8 (1997), no. 2, 85–101.

Refutation of Products of Linear Polynomials

Jan Horacek and Martin Kreuzer

Faculty of Informatics and Mathematics, University of Passau, D-94030 Passau, Germany


Abstract. In this paper we consider formulas that are conjunctions of linear clauses, i.e., of disjunctions of linear equations. Such formulas are very interesting because they encode CNF constraints that are typically very hard for SAT solvers. We introduce a new proof system SRES that works with linear clauses and show that SRES is implicationally and refutationally complete. Algebraically speaking, linear clauses correspond to products of linear polynomials over a ring of Boolean polynomials. That is why SRES can certify if a product of linear polynomials lies in the ideal generated by some other such products, i.e., the SRES calculus decides the ideal membership problem. Furthermore, an algorithm for certifying inconsistent systems of the above shape is described. We also establish the connection with another combined proof system, R(lin).

Keywords: linear clauses, Boolean polynomials, algebraic proof systems, SAT solving

1 Introduction

In this paper we deal with a special case of the ideal membership problem: given a set of Boolean polynomials $S$ which are products of linear polynomials and a Boolean polynomial $f$ which is also a product of linear polynomials, decide whether $f \in \langle S \rangle$. Traditionally, one can use Gröbner bases to solve this problem. The assumption that the given polynomials are products of linear polynomials allows us to introduce a new calculus, called SRES, that is tailored to tackle this problem. It needs only a lightweight version of the S-polynomials used in Gröbner basis algorithms, namely the s-resolvents we introduced in [6]. These s-resolvents also generalize the classical resolution rule from propositional logic. Nevertheless, we consider SRES to be an algebraic proof system (in the sense of [5], [2] or [4]), and hence we mostly use the terminology of commutative algebra to describe it. The main difference in the algebraic approach is that we study assignments that yield False, i.e., are zeros of a system, in contrast to looking for assignments that yield True in the SAT terminology.

From the propositional logic point of view, we would like to decide if the semantic implication $S \models f$ holds. Recall that a linear Boolean polynomial $x_{i_1} + \dots + x_{i_j}$ corresponds to a linear XOR constraint $x_{i_1} \oplus \dots \oplus x_{i_j}$. Such constraints are very difficult for SAT solvers because the truth value depends genuinely on all variables, i.e., changing the value of one variable results in flipping the truth value of the XOR. Many hard formulas in CNF can be constructed by combining literals into linear XOR constraints, resulting in so-called linear clauses. For details, we refer the readers to [1] or [12]. Such formulas appear quite frequently in cryptography, where linear constraints are typically mixed with highly nonlinear ones, for instance in substitution permutation networks. Another application is to handle the bit-blasted version of ARX operations (Add-Rotate-Xor), which are becoming a popular part of ARX ciphers. When converted to bit-wise operations, these conversions give XOR-heavy formulae with a handful of AND gates describing the carry chains of the adders.

The paper is organized as follows. In Section 2 we recall the definition of linear clauses and introduce several ways of describing them, in particular using products of linear Boolean polynomials. Our main data structure to describe them is the set of linear polynomials that appear in the linear clause.

In Section 3 we define the proof system SRES which incorporates and extends the s-resolution rule defined in [6]. We prove that the SRES proof system is implicationally and refutationally complete. Moreover, we establish a relation with other combined systems that use resolution and polynomial calculus (see [11]). In particular, we show that SRES simulates the proof system R(lin) defined in [7, 11].

In Section 4 we describe a concrete algorithm that searches for SRES-refutation proofs of inconsistent systems, called the SRES Refutation Algorithm, and compare it to other approaches to the problem at hand. Finally, in Section 5, we apply the SRES Refutation Algorithm to some examples and point out possible further improvements.

In the following we use some of the definitions and results given in [6], in particular the algorithm Sres given there. However, to aid the readers, we tried to make the paper as self-contained as possible and mention any overlaps explicitly.

2 Background

Let $\mathbb{F}_2 = \mathbb{Z}/2\mathbb{Z}$ be the binary field and $\mathbb{F}_2[x_1, \dots, x_n]$ a polynomial ring over $\mathbb{F}_2$. The ring $B_n = \mathbb{F}_2[x_1, \dots, x_n]/F$ with $F = \langle x_1^2 + x_1, \dots, x_n^2 + x_n \rangle$ is called the ring of Boolean polynomials in the indeterminates $x_1, \dots, x_n$. Let $L_n$ be the set of all linear polynomials in $\mathbb{F}_2[x_1, \dots, x_n]$, i.e., the set of all polynomials of degree $\leq 1$. (Here we use $\deg(0) = -1$.)

Throughout the paper we consider the obvious correspondences between the following types of objects, where $\ell_1, \dots, \ell_k \in L_n$:

(C1) A set of linear polynomials $H = \{\ell_1, \dots, \ell_k\} \subseteq L_n$.

(C2) A Boolean polynomial $h = \prod_{i=1}^{k} \ell_i$ which is a product of linear polynomials in $B_n$.

(C3) A linear clause $(\ell_1 = 0) \vee \dots \vee (\ell_k = 0)$.

Next we give an example of the above correspondence.

Example 1 Let $H = \{x_1 + x_2 + 1,\ x_1 + x_3\} \subseteq L_3$. Then $h = (x_1 + x_2 + 1)(x_1 + x_3) = x_1 x_3 + x_1 x_2 + x_2 x_3 + x_3$, and $H$ (resp. $h$) corresponds to the propositional logic formula

$$(x_1 \oplus x_2 \oplus 1 = 0) \vee (x_1 \oplus x_3 = 0).$$

To simplify the notation, we view the sets in (C1) as polynomials in (C2), e.g., we speak of zeros of $H$ instead of zeros of $h$. Furthermore, we always assume that the linear polynomials $\ell_1, \dots, \ell_k$ appearing in (C1) are pairwise distinct, i.e., that the set $H$ does not contain duplicates. The notion of linear clauses has been considered before in [7, 11]. Note that many difficult CNF instances can be naturally encoded in the form (C3). Thus we are targeting very hard formulas (see [10, Sec. 3]). Furthermore, we observe that two different subsets in (C1) may represent the same polynomial, as the following example shows.

Example 2 Consider the sets of linear polynomials $\{x_3,\ x_2 + x_3,\ x_1 + x_3 + 1\}$ and $\{x_3,\ x_1 + x_2,\ x_1 + x_3 + 1\}$ as in (C1). Then we have $x_3 (x_2 + x_3)(x_1 + x_3 + 1) = x_3 (x_1 + x_2)(x_1 + x_3 + 1)$ in $B_3$, i.e., the corresponding polynomials in (C2) agree.
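The correspondence between (C1) and (C2) can be checked mechanically for small cases. The following is a minimal sketch, assuming a Boolean polynomial is stored as a set of monomials and a monomial as a set of variable indices (so that $x_i^2 = x_i$ holds automatically); the helper names are hypothetical and not taken from [6].

```python
from functools import reduce
from itertools import product

ONE = frozenset()  # the monomial 1 (empty set of variables)

def linear(*variables, constant=0):
    """Linear polynomial x_i + ... (+ 1) for distinct indices, as a set of monomials."""
    monomials = {frozenset({i}) for i in variables}
    if constant % 2:
        monomials.add(ONE)
    return frozenset(monomials)

def mul(p, q):
    """Product in B_n: monomials multiply by union, coefficients live in F_2."""
    result = set()
    for m1, m2 in product(p, q):
        result ^= {m1 | m2}  # symmetric difference cancels equal monomials
    return frozenset(result)

def clause_product(H):
    """Boolean polynomial (C2) of a set of linear polynomials (C1)."""
    return reduce(mul, H, frozenset({ONE}))

# Example 1: (x1 + x2 + 1)(x1 + x3) = x1 x3 + x1 x2 + x2 x3 + x3
h = clause_product([linear(1, 2, constant=1), linear(1, 3)])
print(sorted(sorted(m) for m in h))

# Example 2: two different sets with the same product in B_3
left = clause_product([linear(3), linear(2, 3), linear(1, 3, constant=1)])
right = clause_product([linear(3), linear(1, 2), linear(1, 3, constant=1)])
print(left == right)
```

Expanding products in this way is only feasible for small clauses; the point of working with the sets in (C1) is precisely to avoid this expansion.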

The subsets in (C1) are measured by degree and size. We recall here some definitions from [6] and [11].

Definition 3 Let $H = \{\ell_1, \dots, \ell_k\} \subseteq L_n$.

(1) The number $\deg(H) = \#H$ is called the degree of $H$.
(2) For $\ell \in L_n$, we let $\mathrm{var}(\ell)$ be the set of indeterminates occurring in $\ell$. We call the number $\mathrm{size}(H) = \#\mathrm{var}(\ell_1) + \dots + \#\mathrm{var}(\ell_k)$ the size of $H$.

Our goal in this paper is to show that a given set of linear clauses is unsatisfiable. For this purpose, we define semantic implication as follows.

Definition 4 Let $F_1, \dots, F_m, H \subseteq L_n$. We say that $F_1, \dots, F_m$ semantically imply $H$ if every $\mathbb{F}_2$-rational common zero of the Boolean polynomials corresponding to $F_1, \dots, F_m$ is a zero of $H$. In this case we write $F_1, \dots, F_m \models H$.

From the algebraic point of view, we have $F_1, \dots, F_m \models H$ if and only if $H \in \langle F_1, \dots, F_m \rangle \subseteq B_n$. Note that the zeros of an ideal in $B_n$ are always $\mathbb{F}_2$-rational, because ideals in $B_n$ correspond to ideals in $\mathbb{F}_2[x_1, \dots, x_n]$ that contain the field ideal $\langle x_1^2 + x_1, \dots, x_n^2 + x_n \rangle$. Using the newly defined terminology, we can say that our goal is to refute a given set of linear clauses, i.e., to prove that the corresponding sets of linear polynomials semantically imply $\{1\}$.
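For small $n$, semantic implication in the sense of Definition 4 can be tested by enumerating all points of $\mathbb{F}_2^n$. The sketch below is a brute-force reference check, not a proof procedure; it assumes a linear polynomial is stored as a pair (set of variable indices, constant bit), a representation chosen for this illustration.

```python
from itertools import product

def lin(*variables, const=0):
    """Linear polynomial over F_2 as (set of distinct variable indices, constant bit)."""
    return (frozenset(variables), const % 2)

def evaluate(l, point):
    """Value of the linear polynomial l at a point given as {index: bit}."""
    variables, const = l
    return (const + sum(point[i] for i in variables)) % 2

def vanishes(H, point):
    """The product of the polynomials in H is zero iff some factor is zero."""
    return any(evaluate(l, point) == 0 for l in H)

def implies(premises, H, n):
    """Check F_1, ..., F_m |= H by enumerating all 2^n points (small n only)."""
    for bits in product((0, 1), repeat=n):
        point = dict(zip(range(1, n + 1), bits))
        if all(vanishes(F, point) for F in premises) and not vanishes(H, point):
            return False
    return True

# {x1}, {x1 + x2} |= {x2}, but {x1 + x2} alone does not imply {x1}.
print(implies([[lin(1)], [lin(1, 2)]], [lin(2)], n=2))
print(implies([[lin(1, 2)]], [lin(1)], n=2))
```

Refuting a set of linear clauses then amounts to checking `implies(premises, [lin(const=1)], n)`, since the clause $\{1\}$ has no zeros.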

3 The SRES Proof System

Recall that a proof system, sometimes called a Hilbert system or Hilbert calculus, consists of a syntax, i.e., a set of rules which determine the set of well-formed formulas of the system, a set of axioms, i.e., a set of formulas which are assumed to be tautologies, and a set of rules of inference, i.e., a set of rules which determine how one can get new tautologies from known ones.


For instance, the Gröbner proof system defined in [5] admits all formulas of the form $f(x_1, \dots, x_n) = 0$ where $f \in \mathbb{F}_2[x_1, \dots, x_n]$. Its axioms are the Boolean axioms $x_i^2 + x_i = 0$ for $i = 1, \dots, n$, and its rules of inference are

$$\frac{f \qquad g}{f + g}\ (\mathrm{Pa}) \qquad \text{and} \qquad \frac{f}{x_i f}\ (\mathrm{Pm})$$

where $f, g \in \mathbb{F}_2[x_1, \dots, x_n]$ and $x_i$ is an indeterminate.

In this section we continue to study the proof system based on s-resolution which was introduced in [6]. More precisely, we extend that system by further inference rules, and thus define the new proof system called SRES. Let us begin by recalling the proof system in [6].

Definition 5 The proof system s-resolve is defined by the following parts:

(1) A formula is a set of linear polynomials $\{\ell_1, \dots, \ell_s\} \subset L_n$, where $s \in \mathbb{N}_+$.
(2) The axioms are the Boolean axioms given by $\{x_i, x_i + 1\}$ for $i = 1, \dots, n$.
(3) There exists one rule of inference (R), namely

$$\frac{\bigcup_{i=1}^{s} \{\ell_i\} \cup G \qquad \bigcup_{i=1}^{s} \{\ell_i + 1\} \cup \widetilde{G}}{\bigcup_{i=1}^{s-1} \{\ell_i + \ell_{i+1} + 1\} \cup G \cup \widetilde{G}}\ (\mathrm{R})$$

where $G, \widetilde{G} \subseteq L_n$ and $s \geq 1$. This rule is also called s-resolution. (For $s = 1$, we let $\bigcup_{i=1}^{s-1} \{\ell_i + \ell_{i+1} + 1\} = \emptyset$.) The result of an application of the s-resolution rule is called an s-resolvent.

Let us note that the Boolean axioms immediately imply the following remark.

Remark 6 By applying 2-resolution to the axioms $\{x_i, x_i + 1\}$ and $\{x_i + 1, x_i\}$, we obtain that $\{0\}$ is a tautology in the s-resolve proof system.

In [6] we showed that s-resolve is correct in the sense of the following definition.

Definition 7 Let (I) be a rule of inference of a proof system. We say that the rule of inference (I) is correct if

$$\frac{F_1 \quad F_2 \quad \dots \quad F_m}{H}\ (\mathrm{I})$$

for some formulas $F_1, \dots, F_m, H$ implies $F_1, \dots, F_m \models H$.

For $s \geq 3$, the s-resolution rule depends, e.g., on the numbering of the linear polynomials. (2-resolvents are unique because there is only one way to form the linear polynomial $\ell_1 + \ell_2 + 1$.) However, considered as a Boolean polynomial, the s-resolvent is uniquely determined. The next example is a case in point.


Example 8 Resolving two sets $F = \{x_1 + 1,\ x_1 + x_3,\ x_1 + x_2 + 1,\ x_2 + x_3 + 1\}$ and $G = \{x_1,\ x_1 + x_3 + 1,\ x_1 + x_2,\ x_2 + x_3\}$ with the numbering $\ell_1 = x_1 + 1$, $\ell_2 = x_1 + x_3$, $\ell_3 = x_1 + x_2 + 1$, $\ell_4 = x_2 + x_3 + 1$ yields the 4-resolvent $R_1 = \{x_3,\ x_2 + x_3,\ x_1 + x_3 + 1\}$. If we swap the last two polynomials in the numbering, 4-resolution with $\ell_1 = x_1 + 1$, $\ell_2 = x_1 + x_3$, $\ell_3 = x_2 + x_3 + 1$, $\ell_4 = x_1 + x_2 + 1$ yields $R_2 = \{x_3,\ x_1 + x_2,\ x_1 + x_3 + 1\}$. Both $R_1$ and $R_2$ correspond to the same Boolean polynomial, as we saw in Example 2.
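The dependence of the s-resolvent on the numbering can be reproduced with a short sketch of rule (R). It assumes a linear polynomial is stored as a pair (set of variable indices, constant bit) and that the numbering $\ell_1, \dots, \ell_s$ is passed explicitly; the function names are hypothetical.

```python
def lin(*variables, const=0):
    """Linear polynomial over F_2 as (set of distinct variable indices, constant bit)."""
    return (frozenset(variables), const % 2)

def add(p, q):
    """Sum of two linear polynomials over F_2."""
    return (p[0] ^ q[0], p[1] ^ q[1])

ONE = lin(const=1)

def s_resolve(numbering, F, G):
    """Apply rule (R) to F and G for the given ordered list l_1, ..., l_s."""
    F, G = frozenset(F), frozenset(G)
    matched_f = frozenset(numbering)                       # the l_i, taken from F
    matched_g = frozenset(add(l, ONE) for l in numbering)  # the l_i + 1, taken from G
    if not (matched_f <= F and matched_g <= G):
        raise ValueError("numbering does not match the premises")
    chain = {add(add(a, b), ONE) for a, b in zip(numbering, numbering[1:])}
    return frozenset(chain) | (F - matched_f) | (G - matched_g)

# Example 8
F = [lin(1, const=1), lin(1, 3), lin(1, 2, const=1), lin(2, 3, const=1)]
G = [lin(1), lin(1, 3, const=1), lin(1, 2), lin(2, 3)]
R1 = s_resolve(F, F, G)                           # numbering l_1, l_2, l_3, l_4
R2 = s_resolve([F[0], F[1], F[3], F[2]], F, G)    # last two swapped
print(R1 == R2)
```

The two resolvents differ as sets of linear polynomials, although, as stated in Example 8, they represent the same Boolean polynomial.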

Next we enrich s-resolve with a new rule of inference as follows.

Definition 9 The proof system SRES is defined by the following parts.

(1) The syntax agrees with the syntax of s-resolve, i.e., a formula is a set of linear polynomials $\{\ell_1, \dots, \ell_s\} \subset L_n$, where $s \in \mathbb{N}_+$.
(2) The axioms are the Boolean axioms $\{x_i, x_i + 1\}$ for $i = 1, \dots, n$.
(3) The rules of inference consist of s-resolution (R) and the following weakening rule (W):

$$\frac{H}{H \cup \{\ell\}}\ (\mathrm{W})$$

for $H \subseteq L_n$ and $\ell \in L_n$.

Since we trivially have $H \models H \cup \{\ell\}$, the proof system SRES is correct with respect to semantic implication. Next we recall the definition of a proof.

Definition 10 Let PS be a proof system. A PS-proof of a formula $H$ from the initial premises $F_1, \dots, F_m$ in the proof system PS is a sequence of formulas $\pi = (G_1, \dots, G_k)$ such that $G_k = H$ and each of the formulas $G_i$ is of one of the following forms:

(1) $G_i \in \{F_1, \dots, F_m\}$.
(2) $G_i$ is one of the axioms of PS.
(3) $G_i$ is obtained from some of the formulas $G_j$ with $j < i$ by applying one of the rules of inference of the proof system PS.

If a formula $H$ has a PS-proof from $\{F_1, \dots, F_m\}$, we write $F_1, \dots, F_m \vdash_{\mathrm{PS}} H$, or simply $F_1, \dots, F_m \vdash H$ if no confusion can arise.

Since the proof system SRES is correct, our goal to refute $\{F_1, \dots, F_m\}$ can now be expressed by asking that we should prove $F_1, \dots, F_m \vdash \{1\}$. Notice that the intended semantics is that an unsatisfiable formula corresponds to a system of polynomial equations $\ell_{i,1} \cdots \ell_{i,s} = 0$ for $i \in \{1, \dots, m\}$ which has no solution. Equivalently, a tautology is an equation which holds for all points of $\mathbb{F}_2^n$. A proof can be seen as a sequence of applications of rules to axioms or previously proved tautologies.

Next we extend the capabilities of our proof system by deriving the following further rules of inference.

Definition 11 Let $F, H$ be subsets of $L_n$, and let $\ell, \ell_1, \ell_2 \in L_n$.

(1) The rule (U) defined by

$$\frac{H \cup \{1\}}{H}\ (\mathrm{U})$$

is called unit cancellation.
(2) The rule (MP) defined by

$$\frac{\{\ell\} \qquad H \cup \{\ell + 1\}}{H}\ (\mathrm{MP})$$

is called modus ponens.
(3) The rule (A) defined by

$$\frac{F \cup \{\ell_1\} \qquad H \cup \{\ell_2\}}{F \cup H \cup \{\ell_1 + \ell_2\}}\ (\mathrm{A})$$

is called the addition rule.

Proposition 12 The rules (U), (MP), and (A) are correct. Furthermore, any proof derived using (U), (MP) and (A) can be rewritten into an SRES-proof.

Proof. The correctness of (U) follows from the observation that multiplying a Boolean polynomial by 1 does not change its zeros. Using Remark 6 we apply 1-resolution on $H \cup \{1\}$ and $\{0\}$, and we get $H$. To prove modus ponens it suffices to use (R) with $s = 1$ and $G = \emptyset$. Finally, we show the correctness of (A). Using the weakening rule, we infer $F \cup \{\ell_1, \ell_2 + 1\}$ from $F \cup \{\ell_1\}$ and $H \cup \{\ell_2, \ell_1 + 1\}$ from $H \cup \{\ell_2\}$. Then, using 2-resolution, we get $F \cup H \cup \{\ell_1 + \ell_2\}$. □
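The derivation of the addition rule (A) in the proof above, two weakening steps followed by one 2-resolution step, can be traced in code. The sketch assumes the pair representation (set of variable indices, constant bit) for linear polynomials and generic inputs (no accidental coincidences between $\ell_1$, $\ell_2 + 1$ and elements of $F$ or $H$); all names are hypothetical.

```python
def lin(*variables, const=0):
    """Linear polynomial over F_2 as (set of distinct variable indices, constant bit)."""
    return (frozenset(variables), const % 2)

def add(p, q):
    """Sum of two linear polynomials over F_2."""
    return (p[0] ^ q[0], p[1] ^ q[1])

ONE = lin(const=1)

def weaken(H, l):
    """Weakening rule (W): from H infer H ∪ {l}."""
    return frozenset(H) | {l}

def two_resolve(l1, l2, A, B):
    """Rule (R) with s = 2: l1, l2 lie in A and l1 + 1, l2 + 1 lie in B."""
    used_a, used_b = {l1, l2}, {add(l1, ONE), add(l2, ONE)}
    if not (used_a <= A and used_b <= B):
        raise ValueError("premises do not match the rule")
    return frozenset((A - used_a) | (B - used_b) | {add(add(l1, l2), ONE)})

def addition_rule(F, l1, H, l2):
    """Derive F ∪ H ∪ {l1 + l2} from F ∪ {l1} and H ∪ {l2} as in Proposition 12."""
    A = weaken(weaken(F, l1), add(l2, ONE))   # F ∪ {l1, l2 + 1}
    B = weaken(weaken(H, l2), add(l1, ONE))   # H ∪ {l2, l1 + 1}
    return two_resolve(l1, add(l2, ONE), A, B)

# From {x3, x1} and {x4, x2} we obtain {x3, x4, x1 + x2}.
result = addition_rule({lin(3)}, lin(1), {lin(4)}, lin(2))
print(result == frozenset({lin(3), lin(4), lin(1, 2)}))
```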

The following remark and the subsequent proposition will become important later in the proof of Proposition 18.

Remark 13 Let $\pi = (H_1, \dots, H_k)$ be an SRES-proof of $H_k$ from $F_1, \dots, F_m$ that uses only the s-resolution rule. Let $\ell$ be an element of one of the sets $H_i$. Then $\ell$ lies in the $\mathbb{F}_2$-vector space generated by the linear polynomials contained in the union $\bigcup_{i=1}^{m} F_i$.

Proposition 14 Let $F \subseteq L_n$ and let $i \in \{1, \dots, n\}$. For $a \in \{0, 1\}$, let $F(x_i \mapsto a)$ denote the set which is obtained by substituting $x_i \mapsto a$ into the linear polynomials contained in $F$. Then the following claims hold.

(1) $F, \{x_i\} \vdash_{\mathrm{SRES}} F(x_i \mapsto 0)$
(2) $F(x_i \mapsto 0), \{x_i\} \vdash_{\mathrm{SRES}} F$
(3) $F, \{x_i + 1\} \vdash_{\mathrm{SRES}} F(x_i \mapsto 1)$
(4) $F(x_i \mapsto 1), \{x_i + 1\} \vdash_{\mathrm{SRES}} F$

Proof. First we prove (1). Let $\ell \in F$ be such that $x_i$ occurs in $\ell$. The polynomial $x_i + \ell$ corresponds to substituting $x_i = 0$ into $\ell$. This addition can be derived in SRES using Proposition 12. The claims (2)–(4) follow analogously from Proposition 12. □

The next example illustrates this proposition.

Example 15 Let $F_1 = \{x_1 + x_2,\ x_1\}$ and $F_2 = \{x_1 + 1\}$. By Proposition 12, there exists an SRES-proof of the set $\{x_2 + 1,\ 1\}$ from $F_1, F_2$ which corresponds to the substitution of $x_1 = 1$ into $F_1$. On the other hand, we can “backtrack” the substitution, i.e., we have $F_2, \{x_2 + 1, 1\} \vdash_{\mathrm{SRES}} F_1$.
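The substitution $F(x_i \mapsto a)$ itself is easy to compute. The following sketch only computes the resulting set and does not construct the SRES-proof whose existence Proposition 14 asserts; it assumes the pair representation (set of variable indices, constant bit) for linear polynomials.

```python
def lin(*variables, const=0):
    """Linear polynomial over F_2 as (set of distinct variable indices, constant bit)."""
    return (frozenset(variables), const % 2)

def substitute(F, i, a):
    """Return F(x_i -> a): replace x_i by the constant a in every polynomial of F."""
    result = set()
    for variables, const in F:
        if i in variables:
            # Dropping x_i and adding a to the constant term equals adding x_i + a.
            result.add((variables - {i}, (const + a) % 2))
        else:
            result.add((variables, const))
    return frozenset(result)

# Example 15: substituting x1 = 1 into F1 = {x1 + x2, x1} gives {x2 + 1, 1}.
F1 = {lin(1, 2), lin(1)}
print(substitute(F1, 1, 1) == frozenset({lin(2, const=1), lin(const=1)}))
```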

The following example shows an important advantage of the SRES proof system over the Gröbner proof system defined in [5]. More precisely, it shows that the data structures (C1)–(C3) efficiently store dense linear clauses.

Example 16 Consider the sets

$$\begin{aligned}
F_1 &= \{x_1 + x_2, \dots, x_n + x_{n+1}\} \\
F_2 &= \{x_1 + x_2 + 1\} \\
F_3 &= \{x_2 + x_3 + 1\} \\
&\ \ \vdots \\
F_{n+1} &= \{x_n + x_{n+1} + 1\}
\end{aligned}$$

On the one hand, it is easy to see that the system is inconsistent because substituting $F_2, \dots, F_{n+1}$ into $F_1$ gives us 1. On the other hand, the input for the Gröbner basis algorithm is assumed to be expanded (i.e., not in the form of products of linear polynomials). We can always find $n \in \mathbb{N}$ such that expanding $F_1$ to the polynomial $f_1 = \prod_{i=1}^{n} (x_i + x_{i+1})$, which has $2^n$ terms, exceeds the available memory, and hence no Gröbner basis algorithm can be applied. Notice that we have $\mathrm{size}(F_1) = 2n$.

One workaround would be to introduce new indeterminates to break the long product, but this does not help much because the Gröbner basis algorithm substitutes the new indeterminates back, and thus recovers the expanded polynomial $F_1$ again.

However, by Proposition 14, there exists a short SRES refutation that corresponds to the substitution of $F_2, \dots, F_{n+1}$ into $F_1$.
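The gap between the two representations in Example 16 can be observed for small $n$. The sketch below rests on one assumption that should be stated clearly: the product is expanded in the polynomial ring $\mathbb{F}_2[x_1, \dots, x_{n+1}]$ without reducing modulo the field equations $x_i^2 + x_i$. The function names are hypothetical.

```python
def expand_chain(n):
    """Expand (x_1 + x_2)(x_2 + x_3)...(x_n + x_{n+1}) over F_2.

    Monomials are exponent tuples of length n + 1; x_i^2 is NOT reduced to x_i.
    """
    terms = {(0,) * (n + 1)}  # start with the polynomial 1
    for i in range(n):        # factor x_{i+1} + x_{i+2}, stored at positions i and i + 1
        new_terms = set()
        for exponents in terms:
            for j in (i, i + 1):
                bumped = exponents[:j] + (exponents[j] + 1,) + exponents[j + 1:]
                new_terms ^= {bumped}  # coefficients are taken modulo 2
        terms = new_terms
    return terms

def size_of_chain(n):
    """size(F_1): every one of the n linear polynomials contains two indeterminates."""
    return 2 * n

for n in range(1, 11):
    print(n, size_of_chain(n), len(expand_chain(n)))
```

The size of the set representation grows linearly in $n$, while the number of terms of the expanded product doubles with every additional factor.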

The following definition provides further useful properties of proof systems.

Definition 17 (1) A proof system is called implicationally complete if for every formula $H$ and every set of formulas $\{F_1, \dots, F_m\}$ such that we have $F_1, \dots, F_m \models H$, there exists a proof of $H$ from $F_1, \dots, F_m$ in this proof system.

(2) A proof system is called refutationally complete if for every inconsistent set of formulas $\{F_1, \dots, F_m\}$, i.e., for $F_1, \dots, F_m \models \{1\}$, there exists a proof of $\{1\}$ from $F_1, \dots, F_m$ in this proof system.


The proof of the following proposition is inspired by the corresponding proof for the classical resolution calculus (see for instance [3, Th. 4.1.5]) and by the proof given in [11, Th. 5.1].

Proposition 18 The proof system SRES is implicationally and refutationally complete.

Proof. First we show that SRES is implicationally complete. Let $F_1, \dots, F_m, H \subseteq L_n$ be sets of linear polynomials such that $F_1, \dots, F_m \models H$. We want to prove that $F_1, \dots, F_m \vdash_{\mathrm{SRES}} H$. We proceed by induction on the number $k = \mathrm{size}(F_1) + \dots + \mathrm{size}(F_m) + \mathrm{size}(H)$.

Let us consider the case $k = 0$, i.e., the case when all linear polynomials in $F_i$ and $H$ are constants 0 or 1. If all sets $F_1, \dots, F_m$ are equal to $\{0\}$, there is trivially an SRES-proof of $\{0\}$. All the semantic implicants of $\{0\}$ of size 0 can be derived from $\{0\}$ by the weakening rule. If there exists a set $F_i = \{1\}$, then there is trivially an SRES-proof that refutes $F_1, \dots, F_m$, and hence we can derive any linear clause by the weakening rule and the unit cancellation rule. Finally, if $F_i = \{0, 1\}$, then $F_i$ is simplified to $\{0\}$ by unit cancellation, and thus we can use the previous argument.

Now assume that the claim holds for some $k \geq 0$ and consider a semantic implication

$$F_1, \dots, F_m \models H$$

in which the sum of sizes of $F_1, \dots, F_m, H$ is at most $k + 1$. Choose $i \in \{1, \dots, n\}$ such that $x_i$ occurs in $F_1 \cup \dots \cup F_m \cup H$. Given $a \in \{0, 1\}$, let $F(x_i \mapsto a)$ denote the set which is obtained by substituting $x_i \mapsto a$ into the linear polynomials contained in $F$.

By Proposition 14, we have $F_j, \{x_i\} \vdash_{\mathrm{SRES}} F_j(x_i \mapsto 0)$ for $j = 1, \dots, m$. Moreover, we clearly have $F_1(x_i \mapsto 0), \dots, F_m(x_i \mapsto 0) \models H(x_i \mapsto 0)$. By the induction hypothesis, there is an SRES-proof of $H(x_i \mapsto 0)$ from $F_1(x_i \mapsto 0), \dots, F_m(x_i \mapsto 0)$. Altogether, we get

$$\{x_i\}, F_1, \dots, F_m \vdash_{\mathrm{SRES}} H(x_i \mapsto 0).$$

Furthermore, by Proposition 14, we have $H(x_i \mapsto 0), \{x_i\} \vdash_{\mathrm{SRES}} H$. All in all, there is an SRES-proof $\pi_1$ of $H$ from $\{x_i\}, F_1, \dots, F_m$. Analogously, we get an SRES-proof $\pi_2$ of $H$ from $\{x_i + 1\}, F_1, \dots, F_m$.

Next we modify the derivations of $\pi_1$ (resp. $\pi_2$) such that they start from $\{x_i, x_i + 1\}, F_1, \dots, F_m$ instead of $\{x_i\}, F_1, \dots, F_m$ (resp. $\{x_i + 1\}, F_1, \dots, F_m$). Note that the same rules of inference can be applied, and thus one can rewrite the proof $\pi_1$ into an SRES-proof $\alpha_1$ from $\{x_i, x_i + 1\}, F_1, \dots, F_m$ of either $H$ or $H \cup \{x_i\}$. Similarly, we rewrite $\pi_2$ into an SRES-proof $\alpha_2$ from $\{x_i, x_i + 1\}, F_1, \dots, F_m$ of $H$ or $H \cup \{x_i + 1\}$. If the proof $\alpha_1$ or $\alpha_2$ ends with $H$, we are done. Otherwise, we have $H \cup \{x_i\}, H \cup \{x_i + 1\} \vdash_{\mathrm{SRES}} H$ by a single step of the 1-resolution rule, which concludes the proof. □

The SRES proof system is in fact a combined proof system in the sense of [8, Sec. 7.1]. For instance, R(lin) in [11] is another example of a combined proof system, one that combines resolution with polynomial calculus. By Proposition 12 we immediately get that SRES efficiently simulates the system R(lin). This means that there exists a polynomial-time algorithm that translates any R(lin)-proof of H from F1, ..., Fm to an SRES-proof of H from F1, ..., Fm.

On the other hand, SRES cannot simulate the rule (Pa) of the Gröbner proof system because the sum of two products of linear polynomials does not have to be a product of linear polynomials in Bn, e.g., x1 · x2 + 1 cannot be written as a product of linear polynomials in B3.
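The counterexample can be checked by brute force. The following sketch is an illustration added for the reader, not part of the SRES system: it assumes that Bn is the Boolean ring in which xi · xi = xi, so that each element can be identified with its truth table on {0,1}^3. The helper names are hypothetical.

```python
from itertools import product

N = 3  # number of indeterminates, as in B3
POINTS = list(product((0, 1), repeat=N))


def truth_table(poly):
    """Evaluate a Boolean function at every point of {0,1}^N (values mod 2)."""
    return tuple(poly(p) % 2 for p in POINTS)


def linear_polynomials():
    """Truth tables of all c0 + c1*x1 + ... + cN*xN over F2."""
    tables = set()
    for c0, *cs in product((0, 1), repeat=N + 1):
        tables.add(truth_table(lambda p: c0 + sum(c * x for c, x in zip(cs, p))))
    return tables


def products_of_linear():
    """Close the set of linear polynomials under pointwise multiplication."""
    linear = linear_polynomials()
    closed, frontier = set(linear), set(linear)
    while frontier:
        new = {tuple(a * b for a, b in zip(f, l)) for f in frontier for l in linear}
        frontier = new - closed
        closed |= frontier
    return closed


PRODUCTS = products_of_linear()
x1x2 = truth_table(lambda p: p[0] * p[1])
x1x2_plus_1 = truth_table(lambda p: p[0] * p[1] + 1)
is_product = x1x2_plus_1 in PRODUCTS   # False: no such factorisation exists
```

The reason is visible in the truth tables: a product of linear polynomials takes the value 1 either nowhere or on an affine subspace of size 1, 2, 4 or 8, whereas x1 · x2 + 1 takes the value 1 at six of the eight points.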

4 The SRES Refutation Algorithm

In [6] the authors introduced an algorithm that finds refutation proofs for unsatisfiable formulas in CNF using s-resolution. In this section we generalize this result to formulas that are conjunctions of linear clauses and show that it is refutationally complete in the sense of Definition 17. We recall the following algorithm for computing the s-resolvent of two polynomials which is described in [6].

Algorithm 1 Sres (s-Resolution of Two Polynomials)
Input: Sets F, G ⊆ Ln. We assume that #{ℓ ∈ F | ℓ + 1 ∈ G} ≥ 1.
Output: A set R ⊆ Ln such that R is the s-resolvent of F and G.

 1: Write {ℓ ∈ F | ℓ + 1 ∈ G} as {ℓ1, ..., ℓs}.
 2: F′ := {ℓ ∈ F | ℓ + 1 ∉ G}
 3: G′ := {ℓ ∈ G | ℓ + 1 ∉ F}
 4: R := F′ ∪ G′
 5: if s = 1 and R = ∅ then
 6:   return {1}
 7: else
 8:   for i = 1, ..., s − 1 do
 9:     if ℓi + ℓi+1 ∉ R then
10:       R := R ∪ {ℓi + ℓi+1 + 1}
11:     else
12:       return {0}
13:     end if
14:   end for
15: end if
16: return R
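A minimal Python sketch of Algorithm 1 may help to fix ideas. The encoding is an assumption of this sketch and not taken from [6]: a linear polynomial over F2 is stored as an integer bitmask in which bit 0 is the constant term and bit i is the coefficient of xi, so that addition is XOR and ℓ + 1 is `l ^ 1`. The order ℓ1, ..., ℓs is fixed by sorting, which is likewise an arbitrary choice.

```python
def sres(F, G):
    """s-resolvent of two linear clauses, following the steps of Algorithm 1.

    Linear polynomials are int bitmasks (assumption of this sketch):
    bit 0 = constant term, bit i = coefficient of x_i.
    """
    F, G = set(F), set(G)
    pivots = sorted(l for l in F if (l ^ 1) in G)        # l_1, ..., l_s
    if not pivots:
        raise ValueError("precondition violated: no l in F with l + 1 in G")
    # R := F' ∪ G'
    R = {l for l in F if (l ^ 1) not in G} | {l for l in G if (l ^ 1) not in F}
    if len(pivots) == 1 and not R:
        return {1}
    for a, b in zip(pivots, pivots[1:]):                 # consecutive pivots
        if (a ^ b) in R:
            return {0}
        R.add(a ^ b ^ 1)                                 # l_i + l_{i+1} + 1
    return R


X1, X2, X3 = 0b0010, 0b0100, 0b1000
clause_a = {X1, X2 ^ 1}            # {x1, x2 + 1}
clause_b = {X2, X3 ^ 1}            # {x2, x3 + 1}
resolvent = sres(clause_a, clause_b)   # resolve on x2
```

With these inputs the only pivot is x2 + 1, and the result is {x1, x3 + 1}. Calling `sres({X1}, {X1 ^ 1})` returns {1}, the refuted clause.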

Algorithm 2 extends two polynomials by the weakening rule in all possible ways such that s-resolution can be applied. More precisely, this set of extensions is defined as follows.

Definition 19 Let F, G ⊆ Ln. The set K of all pairs (F′, G′) of subsets of Ln that satisfy the following two conditions

(1) #{ℓ ∈ F′ | ℓ + 1 ∈ G′} ≥ 1

(2) F′ ∪ G′ ⊆ F ∪ G ∪ {ℓ + 1 | ℓ ∈ F ∪ G}

is called the set of expansions for F, G.

Condition (1) encodes the fact that there exists at least one ℓ ∈ F′ with ℓ + 1 ∈ G′ on which we can s-resolve. Condition (2) restricts F′, G′ to contain linear polynomials ℓ or ℓ + 1 such that ℓ ∈ F ∪ G. Note that the set of expansions for F, G is unique.

Algorithm 2 produces the set of expansions in a brute-force way, i.e., Condition (1) is implemented in Step 10, and Condition (2) is fulfilled because of the two foreach loops in Steps 3, 4.

Algorithm 2 AllExpansions (All Possible Expansions to s-Resolution)
Input: Sets F, G ⊆ Ln.
Output: The set of expansions for F, G.

 1: {ℓ1, ..., ℓk} := {ℓ ∈ F | ℓ ∉ G, ℓ + 1 ∉ G}
 2: {ℓ′1, ..., ℓ′k′} := {ℓ ∈ G | ℓ ∉ F, ℓ + 1 ∉ F}
 3: foreach A ∈ {ℓ1, ℓ1 + 1, 1} × ··· × {ℓk, ℓk + 1, 1} do
 4:   foreach B ∈ {ℓ′1, ℓ′1 + 1, 1} × ··· × {ℓ′k′, ℓ′k′ + 1, 1} do
 5:     Write A = (a1, ..., ak).
 6:     Write B = (b1, ..., bk′).
 7:     F′ := F ∪ ⋃_{i=1}^{k} {ai}
 8:     G′ := G ∪ ⋃_{i=1}^{k′} {bi}
 9:     Minimize the representation of F′ and G′ by applying unit cancellation, i.e., remove the element 1 from F′, G′.
10:     if #{ℓ ∈ F′ | ℓ + 1 ∈ G′} ≥ 1 then
11:       K := K ∪ {(F′, G′)}
12:     end if
13:   end foreach
14: end foreach
15: return K
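The brute-force enumeration can be sketched in Python as follows. Two assumptions are made and should be read as such: linear polynomials are integer bitmasks (bit 0 is the constant term, bit i the coefficient of xi), and each clause is weakened by elements drawn from the other clause, which is the reading that reproduces the extensions listed in Example 20. The function name `all_expansions` is hypothetical.

```python
from itertools import product


def all_expansions(F, G):
    """Enumerate weakened pairs (F', G') on which s-resolution applies."""
    F, G = frozenset(F), frozenset(G)
    # Elements of one clause that are unrelated to the other clause.
    only_F = sorted(l for l in F if l not in G and (l ^ 1) not in G)
    only_G = sorted(l for l in G if l not in F and (l ^ 1) not in F)
    K = set()
    # Assumption: F is weakened by l, l + 1 or nothing (encoded as 1) for l from G,
    # and G likewise by elements coming from F.
    for A in product(*[(l, l ^ 1, 1) for l in only_G]):
        for B in product(*[(l, l ^ 1, 1) for l in only_F]):
            F_ext = (F | set(A)) - {1}      # unit cancellation removes 1
            G_ext = (G | set(B)) - {1}
            if any((l ^ 1) in G_ext for l in F_ext):   # Condition (1)
                K.add((F_ext, G_ext))
    return K


X1, X2, X3, X4 = 0b00010, 0b00100, 0b01000, 0b10000
F = {X1, X2, X3 ^ 1}         # {x1, x2, x3 + 1}
G = {X1 ^ 1, X2, X4}         # {x1 + 1, x2, x4}
pairs = all_expansions(F, G)
```

For these inputs `pairs` contains nine elements, one for each combination of the three extensions of F and the three extensions of G.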

The next example shows the generation of the pairs in Algorithm 2.

Example 20 Let F = {x1, x2, x3 + 1} and G = {x1 + 1, x2, x4}. Then we may extend F to F itself, to {x1, x2, x3 + 1, x4}, or to {x1, x2, x3 + 1, x4 + 1}. Similarly, we may extend G to G, to {x1 + 1, x2, x4, x3 + 1}, or to {x1 + 1, x2, x4, x3}. Altogether, nine pairs are constructed.

Next we process the pairs computed by Algorithm 2 in increasing order with respect to the following ordering relation.

Definition 21 Let F, G, F1, F2, G1, G2 ⊆ Ln.

(1) We write F ≺ G if we have deg(F) < deg(G), or if we have deg(F) = deg(G) and size(F) < size(G).

(2) Let #{ℓ ∈ F1 | ℓ + 1 ∈ G1} ≥ 1 and #{ℓ ∈ F2 | ℓ + 1 ∈ G2} ≥ 1. We write (F1, G1) ⊴ (F2, G2) if Sres(F1, G1) ≺ Sres(F2, G2).

Note that the size of the s-resolvents can be determined without actually executing Sres (see [6, Alg. 8]). Now we are ready to present Algorithm 3 for constructing SRES-refutations. We assume that all sets occurring in the algorithm do not contain duplicates, i.e., that removal of duplicates is applied whenever possible. This is the classical assumption on sets in programming languages such as Python.

Algorithm 3 SRES Refute (SRES Refutation Algorithm)
Input: Subsets F1, ..., Fm ⊆ Ln such that Fi does not contain any duplicate elements.
Output: False if F1, ..., Fm ⊨ {1}, True otherwise.
Require: Algorithms 1, 2.

 1: S := {F1, ..., Fm}
 2: if {1} ∈ S then
 3:   return False
 4: end if
 5: Apply unit simplification and cancellation on S.
 6: {Q1, ..., Qk} := S ∪ ⋃_{i=1}^{n} {xi, xi + 1}
 7: Let P be the list containing all pairs computed by AllExpansions(Qi, Qj) for 1 ≤ i < j ≤ k.
 8: while P ≠ ∅ do
 9:   Let (F, G) be a minimum of P w.r.t. ⊴, and remove (F, G) from P.
10:   R := Sres(F, G)
11:   if R = {1} then
12:     return False
13:   else if R is not a subset of any Q ∈ S then
14:     Remove all Q from S with R ⊊ Q and all pairs (Q1, Q2) from P such that R ⊊ Q1 or R ⊊ Q2.
15:     Append all pairs in AllExpansions(R, G) for G ∈ S, R ≠ G to the list P.
16:     S := S ∪ {R}
17:   end if
18: end while
19: return True

Proposition 22 Algorithm 3 is correct, refutationally complete, and finite. In particular, the algorithm returns False if and only if F1, ..., Fm ⊨ {1}.

Proof. Finiteness follows from the fact that there are only finitely many linear polynomials in Ln. Hence the sets in S are finite, and so is the set P ⊆ Ln × Ln.

The algorithm is correct because the SRES inference rules semantically imply their results.

It remains to show that the algorithm is refutationally complete. Assume that the ideal generated by the polynomials corresponding to the input sets has no common zero. By Proposition 18, we know that there exists an SRES-refutation of F1, ..., Fm. The Boolean axiom is incorporated in Step 6, and weakening takes place in the function AllExpansions such that all possible choices to form an s-resolvent are created for all possible s ∈ ℕ+. The s-resolution rule is then applied by calling Sres. Any set Q that is a proper superset of some set already occurring in S is ignored because all possible s-resolvents that would be created using Q are at that time already in P. Thus the algorithm sequentially creates all SRES-derivations from F1, ..., Fm. It arranges the computation such that “smaller” sets in terms of size are preferred. Since Proposition 18 shows that there exists an SRES-refutation, the set {1} is eventually discovered by the algorithm. □

The list P in Algorithm 3 can be implemented as a min-heap such that the minimal pair is always easily found and extracted. The number of pairs in P may be huge. Thus it is convenient to compute the s-resolvents only in Step 10. The next example indicates how the pairs can be stored. The s-resolvent is created only if its size is minimal.

Example 23 Let F1 = {x1 + 1} and F2 = {x1, x2}. Algorithm AllExpansions outputs (F1, F2), (F1 ∪ {x2 + 1}, F2), (F1 ∪ {x2}, F2). The pair (F1 ∪ {x2}, F2) can be stored as a tuple (1, {x2}, 2, ∅, 1) with the meaning “Sres of F1 ∪ {x2} and F2 ∪ ∅ has size 1”.

Recall that we need to know the sizes of all s-resolvents in P in order to select the minimum. Thus the chosen format is very convenient because the size of the s-resolvent can be predicted as in [6, Alg. 8] without computing the actual s-resolvents.
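One way to organise P as a min-heap is sketched below. It is an illustration only: the class `PairQueue`, the bitmask encoding of linear polynomials (bit 0 for the constant term, bit i for xi) and the function `estimated_size` are assumptions of this sketch. In particular, `estimated_size` is a crude estimate standing in for the size prediction of [6, Alg. 8], which is not reproduced here.

```python
import heapq
from itertools import count


def estimated_size(F, G):
    """Crude size estimate for Sres(F, G): surviving elements plus one
    new element per consecutive pair of pivots (hypothetical stand-in)."""
    pivots = [l for l in F if (l ^ 1) in G]
    rest = {l for l in F if (l ^ 1) not in G} | {l for l in G if (l ^ 1) not in F}
    return len(rest) + max(len(pivots) - 1, 0)


class PairQueue:
    """Min-heap of clause pairs, ordered by a key computed at insertion time."""

    def __init__(self, predict_key):
        self._heap = []
        self._tie = count()              # insertion counter breaks ties
        self._predict_key = predict_key

    def push(self, F, G):
        key = self._predict_key(F, G)
        heapq.heappush(self._heap, (key, next(self._tie), F, G))

    def pop(self):
        _key, _tie, F, G = heapq.heappop(self._heap)
        return F, G                      # the resolvent is computed only now

    def __len__(self):
        return len(self._heap)


X1, X2, X3, X4, X5 = 2, 4, 8, 16, 32
queue = PairQueue(estimated_size)
queue.push(frozenset({X1, X3, X4}), frozenset({X1 ^ 1, X5}))   # estimate 3
queue.push(frozenset({X1 ^ 1, X2}), frozenset({X1, X2}))       # estimate 1
first = queue.pop()
```

The second pair pushed is the pair (F1 ∪ {x2}, F2) of Example 23; it is popped first because its estimated resolvent size is 1. Only the key is stored with each pair, so no s-resolvent is built before a pair is extracted.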

Furthermore, one can form only 2-resolvents in Algorithm 3 since 2-resolution is enough to simulate R(lin), which is implicationally and refutationally complete. However, “extra” linear clauses coming from s-resolution steps with s ≥ 3 may come in handy, and the refutation can be found faster in Algorithm 3.

5 Examples and Future Directions

In this section we give some examples of the SRES proof system. Initial experiments on refuting CNF formulae using SRES can be found in [6, Sec. 9]. Those experiments were focused on comparing resolution with SRES based on CNF benchmarks coming from [10]. These formulae are hard for classical resolution, but there may exist short refutations in other axiomatic systems of propositional calculus [13].

In the following example taken from [5], we compare SRES with algebraic systems such as the Gröbner proof system and the Nullstellensatz system [4, Def. 2.1]. Let P = F2[x1, ..., xn]. A Nullstellensatz proof of a polynomial h ∈ P from polynomials f1, ..., fm ∈ P is a tuple of polynomials (p1, ..., pm, r1, ..., rn) ∈ P^(m+n) such that the equation

$$\sum_{i=1}^{m} p_i f_i + \sum_{j=1}^{n} r_j \, (x_j^2 + x_j) = h$$

holds in P. The degree of the proof is defined as

$$\max\Bigl\{ \max_i \bigl(\deg(p_i) + \deg(f_i)\bigr),\ \max_j \bigl(\deg(r_j) + 2\bigr) \Bigr\}.$$

Example 24 Consider the polynomials f1 = x1 + x1x2 and f2 = x2 + x2x3 in F2[x1, x2, x3]. They encode two implications x1 → x2 and x2 → x3. (E.g., if x1 = 1, then x2 is constrained to be 1.) Let us write a proof of h = x1 + x1x3, i.e., the implication x1 → x3, in the Gröbner proof system.

Firstly, we multiply f2 by x1, and we get g1 = x1x2 + x1x2x3. The addition f1 + g1 gives us x1 + x1x2x3. Then we compute g2 = x3 · f1 = x1x3 + x1x2x3. Finally, we get (f1 + g1) + g2 = x1 + x1x3.

Note that the maximal degree appearing in the proof is 3. On the other hand, there exists a Nullstellensatz proof of maximal degree O(log(n)) of x1 → xn from x1 → x2, ..., xn−1 → xn (see [5]). In our case, the Nullstellensatz proof is the tuple (1 + x3, x1, 0, 0, 0) because of the equality

(1 + x3)(x1 + x1x2) + x1(x2 + x2x3) = x1 + x1x3.

Now we write the polynomials f1, f2 as F1 = {x1, 1 + x2} and F2 = {x2, 1 + x3}. Resolving on x2 yields H = {x1, 1 + x3} which corresponds to the polynomial h. Note that the maximal degree in the proof is now 2, and the SRES proof is simpler than the Gröbner proof.
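The two algebraic derivations of this example can be replayed mechanically. The sketch below uses a hypothetical minimal representation chosen for illustration: a polynomial over F2 is a set of monomials and a monomial is a set of variable indices. This representation builds in xi · xi = xi; no variable is multiplied by itself in the computations below, so the results also hold in F2[x1, x2, x3].

```python
def poly(*monomials):
    """Polynomial over F2 as a frozenset of monomials (sets of variable indices)."""
    return frozenset(frozenset(m) for m in monomials)


def add(p, q):
    """Addition mod 2: equal monomials cancel (symmetric difference)."""
    return p ^ q


def mul(p, q):
    """Multiply term by term; a product of monomials is the union of their variables."""
    result = set()
    for a in p:
        for b in q:
            result ^= {a | b}
    return frozenset(result)


ONE = poly(())                          # the constant polynomial 1
x1, x2, x3 = poly((1,)), poly((2,)), poly((3,))

f1 = add(x1, mul(x1, x2))               # x1 + x1*x2
f2 = add(x2, mul(x2, x3))               # x2 + x2*x3
h = add(x1, mul(x1, x3))                # x1 + x1*x3

# Steps of the Groebner-style derivation
g1 = mul(x1, f2)                        # x1*x2 + x1*x2*x3
g2 = mul(x3, f1)                        # x1*x3 + x1*x2*x3
groebner_result = add(add(f1, g1), g2)

# Nullstellensatz certificate (1 + x3, x1, 0, 0, 0)
nullstellensatz_result = add(mul(add(ONE, x3), f1), mul(x1, f2))
```

Both `groebner_result` and `nullstellensatz_result` equal `h`.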

Further typical examples that are easy for SRES and difficult for resolution [13] are inconsistent systems of linear equations. By Proposition 12, SRES simulates the addition rule, and thus one can use Gaussian elimination to produce SRES-refutations for such systems. Recall that the encoding into linear clauses in Example 16 is very efficient for SRES, but multiplying out the products causes an exponential blowup. A similar problem emerges in the next example in the case of CNF encodings.

Example 25 Consider the linear system

f1 = x1 + ··· + xn, f2 = x1 + ··· + xn + 1

over F2. On one hand, encoding f1 and f2 in CNF suffers from introducing either exponentially many auxiliary variables or exponentially many new clauses. On the other hand, by the weakening rule and 2-resolution we get f1(f2 + 1) + (f1 + 1)f2 = 1 in Bn immediately.
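The final identity can be checked by hand. The short derivation below uses only f2 = f1 + 1 together with two facts that are assumed here about Bn, namely that g · g = g and g + g = 0 for every element g (as in any Boolean ring):

```latex
\begin{align*}
f_1 (f_2 + 1) + (f_1 + 1) f_2
  &= f_1 \cdot f_1 + (f_1 + 1)(f_1 + 1) \\
  &= f_1 + (f_1 + 1) \\
  &= 1 .
\end{align*}
```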

Let us conclude this section and this paper with a few remarks about the problem that is solved by Algorithm 3. The traditional way in Computer Algebra is to use Gröbner basis algorithms. As we saw in Example 16, they have serious drawbacks in our setting and tend to run out of memory quickly.

Another approach is to use a SAT solver, e.g. a version of the DPLL algorithm. However, the DPLL approach needs a huge number of clauses to describe XOR constraints. All in all, the classical DPLL algorithm can be extended to linear clauses in a rather straightforward way, as for instance done in [7], but appears to be not very efficient.

The new approach introduced here, namely the SRES proof system and the SRES Refutation Algorithm, offers the prospect of a new and tailor-made way to treat systems of linear clauses. Besides finding possible optimizations of the implementation, we may enhance it further by combining it with a suitable conflict-learning mechanism such as the ones which lie at the heart of modern SAT solvers. Recall that conflict clauses in SAT solvers can be generated from 1-resolvents of a cut in the implication graph. Thus, s-resolution may generalize conflict learning in the case of linear clauses or help to improve a separate XOR-reasoning module such as the one introduced in [9].

Acknowledgments. The authors thank Jan Krajíček and Iddo Tzameret for fruitful discussions on combined proof systems. We thank the anonymous reviewers for their many insightful comments. This work was financially supported by the DFG project “Algebraische Fehlerangriffe” [KR 1907/6-2].

References

1. Baumgartner, P., and Massacci, F. The taming of the (X)OR. In Computational Logic – CL 2000. Springer, 2000, pp. 508–522.

2. Berkholz, C. The relation between polynomial calculus, Sherali-Adams, and sum-of-squares proofs. In LIPIcs-Leibniz International Proceedings in Informatics (2018), vol. 96, Leibniz-Zentrum fuer Informatik, Schloss Dagstuhl.

3. Büning, H. K., and Lettmann, T. Propositional logic: deduction and algorithms, vol. 48. Cambridge University Press, 1999.

4. Buss, S., Impagliazzo, R., Krajíček, J., Pudlák, P., Razborov, A. A., and Sgall, J. Proof complexity in algebraic systems and bounded depth Frege systems with modular counting. Computational Complexity 6, 3 (1996), 256–298.

5. Clegg, M., Edmonds, J., and Impagliazzo, R. Using the Groebner basis algorithm to find proofs of unsatisfiability. In Proceedings of the 28th Annual ACM Symposium on the Theory of Computing (New York, USA, 1996), STOC ’96, ACM Press, pp. 174–183.

6. Horáček, J., and Kreuzer, M. On conversions from CNF to ANF. (submitted) (2018).

7. Itsykson, D., and Sokolov, D. Lower bounds for splittings by linear combinations. In International Symposium on Mathematical Foundations of Computer Science (2014), Springer, pp. 372–383.

8. Krajíček, J. Proof complexity (book draft). Available at https://www.karlin.mff.cuni.cz/krajicek/prfdraft2.pdf, 2018.

9. Laitinen, T., Junttila, T., and Niemelä, I. Conflict-driven XOR-clause learning. In Theory and Applications of Satisfiability Testing – SAT 2012 (Berlin, Heidelberg, 2012), A. Cimatti and R. Sebastiani, Eds., Springer Berlin Heidelberg, pp. 383–396.

10. Lauria, M., Elffers, J., Nordström, J., and Vinyals, M. CNFgen: A generator of crafted benchmarks. In Theory and Applications of Satisfiability Testing – SAT 2017 (Cham, 2017), S. Gaspers and T. Walsh, Eds., Springer International Publishing, pp. 464–473.

11. Raz, R., and Tzameret, I. Resolution over linear equations and multilinear proofs. Annals of Pure and Applied Logic 155, 3 (2008), 194–224.

12. Soos, M., Nohl, K., and Castelluccia, C. Extending SAT solvers to cryptographic problems. In International Conference on Theory and Applications of Satisfiability Testing (2009), Springer, pp. 244–257.

13. Urquhart, A. Hard examples for resolution. J. ACM 34, 1 (Jan. 1987), 209–219.

Non-linear Real Arithmetic Benchmarks derived from Automated Reasoning in Economics

Casey B. Mulligan1, Russell Bradford2, James H. Davenport2, Matthew England3, and Zak Tonks2

1 University of Chicago
2 University of Bath, UK
{R.J.Bradford, J.H.Davenport, Z.P.Tonks}@bath.ac.uk
3 Coventry University, UK

Abstract. We consider problems originating in economics that may be solved automatically using mathematical software. We present and make freely available a new benchmark set of such problems. The problems have been shown to fall within the framework of non-linear real arithmetic, and so are in theory soluble via Quantifier Elimination (QE) technology as usually implemented in computer algebra systems. Further, they all can be phrased in prenex normal form with only existential quantifiers and so are also admissible to those Satisfiability Modulo Theories (SMT) solvers that support the QF_NRA logic. There is a great body of work considering QE and SMT application in science and engineering, but we demonstrate here that there is potential for this technology also in the social sciences.

1 Introduction

1.1 Economic reasoning

A general task in economic reasoning is to determine whether, with variables v = (v1, ..., vn), the hypotheses H(v) follow from the assumptions A(v), i.e. is it the case that

∀v . A(v) ⇒ H(v)?

Ideally the answer would be True or False, and of course logically it is. But the application being modelled, like real life, can require a more nuanced analysis. It may be that for most v the theorem holds but there are some special cases that should be ruled out (additions to the assumptions). Such knowledge is valuable to the economist. Another possibility is that the particular set of assumptions chosen is contradictory⁴, i.e. A(v) itself is False. As all students of an introductory logic class know, this would technically make the implication True, but for the purposes of economics research it is important to identify this separately!

4 A lengthy set of assumptions is often required to conclude the hypothesis of interest, so the possibility of contradictory assumptions is real.


Table 1. Table of possible outcomes from a potential theorem ∀v . A ⇒ H

                | ¬∃v[A ∧ ¬H]               | ∃v[A ∧ ¬H]
∃v[A ∧ H]       | True                      | Mixed
¬∃v[A ∧ H]      | Contradictory Assumptions | False

We categorise the situation into four possibilities that are of interest to an economist (Table 1) via the outcome of a pair of fully existentially quantified statements which check the existence of both an example (∃v[A ∧ H]) and a counterexample (∃v[A ∧ ¬H]) of the theorem. So we see that every economics theorem generates a pair of SAT problems, in practice actually a trio since we would first perform the cheaper check of the compatibility of the assumptions (∃v[A]).
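The categorisation of Table 1 amounts to a small decision function. A minimal sketch follows, in which the function name and the string labels are choices made here for illustration; the two Boolean inputs are the answers a QE or SMT tool would return for the example and counterexample queries.

```python
def classify(example_sat: bool, counterexample_sat: bool) -> str:
    """Outcome of Table 1 for a potential theorem 'forall v. A => H'.

    example_sat        -- answer to: exists v. A and H
    counterexample_sat -- answer to: exists v. A and not H
    """
    if example_sat and counterexample_sat:
        return "Mixed"
    if example_sat:
        return "True"
    if counterexample_sat:
        return "False"
    return "Contradictory Assumptions"
```

If the preliminary compatibility check ∃v[A] already fails, both inputs are necessarily False and the function returns "Contradictory Assumptions" without the other two queries being needed.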

Such a categorisation is valuable to an economist. They may gain a proof of their theorem, but if not they still have important information to guide their research: knowledge that their assumptions contradict; or information about {v : A(v) ⇒ H(v)}.

Also of great value to an economist are the options for exploration that such technology would provide. An economist could vary the question by strengthening the assumptions that led to a Mixed result in search of a True theorem. However, of equal interest would be to weaken the assumptions that generated a True result for the purpose of identifying a theorem that can be applied more widely. Such weakening or strengthening is implemented by quantifying more or less of the variables in v. For example, we might partition v as v1, v2 and ask for {v1 : ∀v2 . A(v1, v2) ⇒ H(v1, v2)}. The result in these cases would be a formula in the free variables that weakens or strengthens the assumptions as appropriate.

1.2 Technology to solve such problems automatically

Such problems all fall within the framework of Quantifier Elimination (QE). QE refers to the generation of an equivalent quantifier free formula from one that contains quantifiers. SAT problems are thus a sub-case of QE when all variables are existentially quantified.

QE is known to be possible over real closed fields (real QE) thanks to the seminal work of Tarski [34] and practical implementations followed the work of Collins on the Cylindrical Algebraic Decomposition (CAD) method [11] and Weispfenning on Virtual Substitution [36]. There are modern implementations of real QE in Mathematica [32], Redlog [14], Maple (SyNRAC [21] and the RegularChains Library [10]) and Qepcad-B [6].

The economics problems identified all fall within the framework of QE over the reals. Further, the core economics problem of checking a theorem can be answered via fully existentially quantified QE problems, and so is also soluble using the technology of Satisfiability Modulo Theory (SMT) solvers; at least those that support the QF_NRA (Quantifier Free Non-Linear Real Arithmetic) logic such as SMT-RAT [12], veriT [18], Yices2 [24], and Z3 [23].


1.3 Case for novelty of the new benchmarks

QE has found many applications within engineering and the life sciences. Recent examples include the derivation of optimal numerical schemes [17], artificial intelligence to pass university entrance exams [35], weight minimisation for truss design [9], and biological network analysis [3]. However, applications in the social sciences are lacking (the nearest we can find is [25]).

Similarly, for SMT with non-linear reals, applications seem focussed on other areas of computer science. The original nlsat benchmark set [23] was made up mostly of verification conditions from KeYmaera [30], and theorems on polynomial bounds of special functions generated by the MetiTarski automatic theorem prover [29]. This category of the SMT-LIB [1] has since been broadened with problems from physics, chemistry and the life sciences [33]. However, we are not aware of any benchmarks from economics or the social sciences.

The reader may wonder why QE has not been used in economics previously. On the few occasions when QE algorithms have been mentioned in economics they have been characterized as “something that is do-able in principle, but not by any computer that you and I are ever likely to see” [31]. Such assertions were based on interpretations of theoretical computer science results rather than experience with actual software applied to an actual economic reasoning problem. Simply put, the recent progress on QE/SMT technology is not (yet) well known in the social sciences.

1.4 Availability and further details

The benchmark set consists of 45 potential economic theorems. Each theorem requires the three QE/SMT calls to check the compatibility of assumptions, the existence of an example, and the existence of a counterexample, so 135 problems in total. In all cases the assumption and example checks are SAT, in fact often fully satisfied (any point can witness it) as they relate to a true theorem. Correspondingly, the counterexample is usually UNSAT (for 42/45 theorems) and thus the more difficult problem from the SMT point of view.

The benchmark problems are hosted on the Zenodo data repository at URL https://doi.org/10.5281/zenodo.1226892 in both the SMT2 format and as input files suitable for Redlog and Maple. The SMT2 files have been accepted into the SMT-LIB [1] and will appear in the next release.

Available from http://examples.economicreasoning.com/ are Mathematica notebooks⁵ which contain commands to solve the examples in Mathematica and also further information on the economic background: meaning of variable names and economic implications of results etc.

5 A Mathematica licence is needed to run them, but static pdf print outs are also available to download.


1.5 Plan

The paper proceeds as follows. In Section 2 we describe in detail some examples from economics, ranging from textbook examples common in education to questions arising from current research discussions. Then in Section 3 we offer some statistics on the logical and algebraic structure of these examples. We then give some final thoughts on future and ongoing work in Section 4.

2 Examples of Economic Reasoning in Tarski’s Algebra

The fields of economics ranging from macroeconomics to industrial organization to labour economics to econometrics involve deducing conclusions from assumptions or observations. Will a corporate tax cut cause workers to get paid more? Will a wage regulation reduce income inequality? Under what conditions will political candidates cater to the median voter? We detail in this section a variety of such examples.

2.1 Comparative static analysis

We start with Alfred Marshall’s [26] classic, and admittedly simple, analysis of the consequences of cost-reducing progress for activity in an industry. Marshall concluded that, for any supply-demand equilibrium in which the two curves have their usual slopes, a downward supply shift increases the equilibrium quantity q and decreases the equilibrium price p.

One way to express his reasoning in Tarski’s framework is to look at the industry’s local comparative statics: meaning a comparison of two different equilibrium states between supply and demand. With a downward supply shift represented as da > 0 (where a is a reduction in costs) we have here:

$$A \equiv D'(q) < 0 \;\land\; S'(q) > 0 \;\land\; \frac{d}{da}\bigl(S(q) - a\bigr) = \frac{dp}{da} \;\land\; \frac{dp}{da} = \frac{d}{da} D(q)$$

$$H \equiv \frac{dq}{da} > 0 \;\land\; \frac{dp}{da} < 0$$

where:

– D′(q) is the slope of the inverse demand curve in the neighborhood of the industry equilibrium;
– S′(q) is the slope of the inverse supply curve;
– dq/da is the quantity impact of the cost reduction; and
– dp/da is the price impact of the cost reduction.

Economically, the first two atoms of A are the usual slope conditions: that demand slopes down and supply slopes up. The last two atoms of A say that the cost change moves the industry from one equilibrium to another. Marshall’s hypothesis was that the result is a greater quantity traded at a lesser price.

Hence we set the “variables” v to be four real numbers (v1, v2, v3, v4):

$$v = \Bigl\{ D'(q),\ S'(q),\ \frac{dq}{da},\ \frac{dp}{da} \Bigr\}.$$

Then, after applying the chain rule, A and H may be understood as Boolean combinations of polynomial equalities and inequalities:

A ≡ v1 < 0 ∧ v2 > 0 ∧ v3v2 − 1 = v4 ∧ v4 = v3v1,

H ≡ v3 > 0 ∧ v4 < 0.

Thus Marshall’s reasoning fits in the Tarski framework and therefore is amenable to automatic solution. Any of the tools mentioned in the introduction can evaluate the two existential problems for Table 1 and conclude easily that ∀v . A ⇒ H is True, confirming Marshall’s conclusion.
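To illustrate how such a problem might be handed to an SMT solver, the sketch below generates the three existential queries for Marshall’s example as SMT-LIB 2 text. The transcription of A and H into SMT-LIB syntax and the helper `smt2_query` are written for this illustration; they are not the files distributed with the benchmark set.

```python
VARIABLES = ["v1", "v2", "v3", "v4"]

# Hand transcription of the polynomial forms of A and H.
A = "(and (< v1 0.0) (> v2 0.0) (= (- (* v3 v2) 1.0) v4) (= v4 (* v3 v1)))"
H = "(and (> v3 0.0) (< v4 0.0))"


def smt2_query(formula: str) -> str:
    """Return an SMT-LIB 2 script asking whether `formula` is satisfiable."""
    lines = ["(set-logic QF_NRA)"]
    lines += [f"(declare-fun {v} () Real)" for v in VARIABLES]
    lines += [f"(assert {formula})", "(check-sat)"]
    return "\n".join(lines) + "\n"


QUERIES = {
    "assumptions": smt2_query(A),                            # exists v. A
    "example": smt2_query(f"(and {A} {H})"),                 # exists v. A and H
    "counterexample": smt2_query(f"(and {A} (not {H}))"),    # exists v. A and not H
}
```

Each string can be written to a `.smt2` file and passed to a solver that supports QF_NRA. For a True theorem one would expect the answers sat, sat and unsat, in that order.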

In fact, for this example it is straightforward to produce a short hand proof in a couple of lines⁶ and so we did not include it in the benchmark set. But it demonstrates the basic approach to placing economics reasoning into a form admissible for QE/SMT tools. A similar approach is used for many of the larger examples forming the benchmark set, such as the next one.

2.2 Scenario analysis

Economics is replete with “what if?” questions. Such questions are logically and algebraically more complicated, and thereby susceptible to human error, because they require tracking various scenarios. We consider now one such question originating from recent economic debate.

Writing at nytimes.com⁷, Economics Nobel laureate Paul Krugman asserted that whenever taxes on labour supply are primarily responsible for a recession, then wages increase. Two scenarios are discussed here: what actually happens (act) when taxes (t) and demand forces (a) together create a recession, and what would have happened (hyp) if taxes on labour supply had been the only factor affecting the labour market.

6 Subtract the final assumption from the penultimate one and then apply the sign conditions of the other assumptions to conclude v3 > 0. Then with this the last assumption provides the other hypothesis, v4 < 0.

7 https://krugman.blogs.nytimes.com/2012/11/03/soup-kitchens-caused-the-great-depression/


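Expressed logically we have:

$$
\begin{aligned}
A \equiv \Bigl(\ & \frac{\partial D(w,a)}{\partial w} < 0 \;\land\; \frac{\partial S(w,t)}{\partial w} > 0 \\
 \land\ & \frac{\partial D(w,a)}{\partial a} = 1 \;\land\; \frac{\partial S(w,t)}{\partial t} = 1 \\
 \land\ & \frac{d}{d\,\mathit{act}} \bigl( D(w,a) = q = S(w,t) \bigr) \\
 \land\ & \frac{d}{d\,\mathit{hyp}} \bigl( D(w,a) = q = S(w,t) \bigr) \\
 \land\ & \frac{dt}{d\,\mathit{act}} = \frac{dt}{d\,\mathit{hyp}} \;\land\; \frac{da}{d\,\mathit{hyp}} = 0 \\
 \land\ & \frac{dq}{d\,\mathit{hyp}} < \frac{1}{2}\,\frac{dq}{d\,\mathit{act}} < 0 \ \Bigr) \\
H \equiv\ & \frac{dw}{d\,\mathit{act}} > 0 .
\end{aligned}
$$

In economics terms, the first line of assumptions contains the usual slope restrictions on the supply and demand curves. Because nothing is asserted about the units of a or t, the next line just contains normalizations. The third and fourth lines say that each scenario involves moving the labour market from one equilibrium (at the beginning of a recession) to another (at the end of a recession). The fifth line defines the scenarios: both have the same tax change but only the act scenario has a demand shift. The final assumption / assertion is that a majority of the reduction in the quantity q of labour was due to supply (that is, most of dq/dact would have occurred without any shift in demand). The hypothesis is that wages w are higher at the end of the recession than they were at the beginning.

Viewed as a Tarski formula this has twelve variables,

$$
v = \Bigl\{ \frac{da}{d\,\mathit{act}},\ \frac{da}{d\,\mathit{hyp}},\ \frac{dt}{d\,\mathit{act}},\ \frac{dt}{d\,\mathit{hyp}},\ \frac{dq}{d\,\mathit{act}},\ \frac{dq}{d\,\mathit{hyp}},\ \frac{dw}{d\,\mathit{act}},\ \frac{dw}{d\,\mathit{hyp}},\ \frac{\partial D(w,a)}{\partial a},\ \frac{\partial S(w,t)}{\partial t},\ \frac{\partial D(w,a)}{\partial w},\ \frac{\partial S(w,t)}{\partial w} \Bigr\},
$$

each of which is a real number representing a partial derivative describing the supply and demand function or a total derivative indicating a change over time within a scenario.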

An analysis of the two existential problems as suggested in Section 1.1 shows that the result is Mixed: that is, both examples and counterexamples exist. In particular, even when all of the assumptions are satisfied, it is possible that wages actually go down.

Moreover, if ∂D(w, a)/∂w and ∂S(w, t)/∂w are left as free variables, a QE analysis recovers a disjunction of three quantifier-free formulae. Two of them contradict the assumptions, but the third does not and can therefore be added to A in order to guarantee the truth of H:

$$\frac{\partial S(w,t)}{\partial w} \;\ge\; -\frac{\partial D(w,a)}{\partial w} \;>\; 0.$$

I.e. assuming that labour supply is at least as sensitive to wages as labour demand is (recall that the demand slope is a negative number) guarantees that wages w are higher at the end of a recession that is primarily due to taxes on labour supply. See also [27] for further details on the economics.

This example can be found in the benchmark set as Model #0013.

2.3 Vector summaries

Economics problems sometimes involve an unspecified (and presumably large) number of variables. Take Sir John Hicks’ analysis of household decisions among a large number N of goods and services [19]. The quantities purchased are represented as an N-dimensional vector, as are the prices paid.

We assume that when prices are p (p̂), the household makes purchases q (q̂), respectively:⁸

A ≡ (p · q ≤ p · q̂) ∧ (p̂ · q̂ ≤ p̂ · q).

Hicks asserted that the quantity impact of the price changes q − q̂ cannot be positively correlated with the price changes p − p̂:

H ≡ (q − q̂) · (p − p̂) ≤ 0.

Hicks’ reasoning depends only on the value of vector dot products, four of which appear above, rather than the actual vectors themselves whose length remains unspecified.
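For this particular theorem the implication can also be seen by hand, which gives a useful sanity check on any automated result. The sketch below writes $\hat{p}$, $\hat{q}$ for the second price and quantity vectors (the accent is a notational assumption) and reads A as saying that each purchased bundle costs no more than the other bundle at the prices at which it was purchased:

```latex
\begin{align*}
p\cdot q \le p\cdot\hat{q}
  &\;\Longrightarrow\; p\cdot(q-\hat{q}) \le 0,\\
\hat{p}\cdot\hat{q} \le \hat{p}\cdot q
  &\;\Longrightarrow\; -\hat{p}\cdot(q-\hat{q}) \le 0,\\
\text{adding the two inequalities:}\quad
  & (q-\hat{q})\cdot(p-\hat{p}) \le 0 .
\end{align*}
```

Only the four dot products enter the argument, in line with the observation that the length N of the vectors plays no role.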

Hicks implicitly assumed that prices and quantities are real valued, which places additional restrictions on the dot products. We then need to require that the symmetric Gram matrix corresponding to the four vectors q, q̂, p, p̂ be positive semi-definite. To state this restriction we need to increase the list of variables from the four dot products that appear in A and H to all ten that it is possible to produce with four vectors.

QE technology certifies that there are no values for the dot products that can simultaneously satisfy A and contradict H. It follows that Sir John Hicks was correct regardless of the length N of the price and quantity vectors. In fact, it turns out that the additional restriction to real valued quantities is not needed for this example (there are no counter-examples even without this). While those additional restrictions could hence be removed from the theorem we have left them in the statement in the benchmark set because (a) they would be standard restrictions an economist would work with and (b) it is not obvious that they are unnecessary before the analysis.

8 The price change is compensated in the Hicksian sense, which means that q and q̂ are the same in terms of results or “utility”, so that the consumer compares them only on the basis of what they cost (dot product with prices).

The above analysis could be performed easily with both Mathematica and Z3 but proved difficult for Redlog. We have yet to find a variable ordering that leads to a result in reasonable time⁹. It is perhaps not surprising that a tool like Z3 excels in comparison to computer algebra here, since a large part of the logical statement is irrelevant to its truth: the search-based SAT algorithms are explicitly designed to avoid considering this.


2.4 Concave production function

Our final example is adapted from the graduate-level microeconomics textbook[22]. The example asserts that any differentiable, quasi-concave, homogeneous,three-input production function with positive inputs and marginal products mustalso be a concave function. In the Tarski framework, we have a twelve-variablesentence, which we have expressed simply with variable names v1, . . . , v12 tocompress the presentation:

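$$
\begin{aligned}
A \equiv{} & v_1 v_{10} + v_2 v_7 + v_3 v_5 = 0 \;\land\; v_1 v_{11} + v_2 v_8 + v_3 v_7 = 0 \\
& \land\; v_1 v_{12} + v_{10} v_3 + v_{11} v_2 = 0 \\
& \land\; v_1 > 0 \land v_2 > 0 \land v_3 > 0 \land v_4 > 0 \land v_6 > 0 \land v_9 > 0 \\
& \land\; 2 v_{11} v_6 v_9 > v_{12} v_6^2 + v_8 v_9^2 \\
& \land\; 2 v_{10} v_6 (v_{11} v_4 + v_7 v_9) + v_9 (2 v_{11} v_4 v_7 - 2 v_{11} v_5 v_6 + v_5 v_8 v_9) \\
& \qquad + v_{12} (v_4^2 v_8 - 2 v_4 v_6 v_7 + v_5 v_6^2) > v_{10}^2 v_6^2 + 2 v_{10} v_4 v_8 v_9 + v_{11}^2 v_4^2 + v_7^2 v_9^2, \\[4pt]
H \equiv{} & v_{12} \le 0 \land v_5 \le 0 \land v_8 \le 0 \\
& \land\; v_{12} v_5 \ge v_{10}^2 \land v_{12} v_8 \ge v_{11}^2 \land v_8 v_5 \ge v_7^2 \\
& \land\; v_8 (v_{10}^2 - v_{12} v_5) + v_{11}^2 v_5 + v_{12} v_7^2 \ge 2 v_{10} v_{11} v_7.
\end{aligned}
$$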

In disjunctive normal form (DNF) A ∧ ¬H is a disjunction of seven clauses. Each atom of H, and therefore each clause of A ∧ ¬H in DNF, corresponds to a principal minor of the production function’s Hessian matrix. So the clauses in the DNF are A ∧ v12 > 0, A ∧ v5 > 0, . . . in the same sequence used in H above. The existential quantifiers applied to A ∧ ¬H can be distributed across the clauses to form seven smaller QE problems.
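The distribution step can be written out explicitly. In the following restatement the names h_1, . . . , h_7 are introduced only for illustration and denote the seven atoms of H in the order in which they are listed:

$$
\exists v\,[A \land \neg H] \;\iff\; \bigvee_{k=1}^{7} \exists v\,[A \land \neg h_k],
\qquad
\begin{aligned}
\neg h_1 &\equiv v_{12} > 0, & \neg h_2 &\equiv v_5 > 0, & \neg h_3 &\equiv v_8 > 0, \\
\neg h_4 &\equiv v_{12} v_5 < v_{10}^2, & \neg h_5 &\equiv v_{12} v_8 < v_{11}^2, & \neg h_6 &\equiv v_8 v_5 < v_7^2, \\
\neg h_7 &\equiv v_8 (v_{10}^2 - v_{12} v_5) + v_{11}^2 v_5 + v_{12} v_7^2 < 2 v_{10} v_{11} v_7 .
\end{aligned}
$$

The sentence A ∧ ¬H is unsatisfiable exactly when each of the seven smaller problems is.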

As of early 2018, Z3 could not determine whether the Tarski formula A ∧ ¬H is satisfiable: it was left running for several days without producing a result or error message. For this problem virtual substitution techniques may be advantageous (note the degree in any one variable is at most two). Indeed, Redlog and Mathematica can resolve it quickly. The problem would certainly be difficult for CAD based tools (which underlie the theory solver used by Z3 for such problems). For example, if we take only the first clause of the disjunction (the one containing v12 > 0), then the Tarski formula has twelve polynomials in twelve variables, and just three CAD projections (using Brown’s projection operator) to eliminate {v12, v11, v10} result in 200 unique polynomials with nine variables still remaining to eliminate!

9 Since the problem is fully existentially quantified we are free to use any ordering of the 10 variables; but there are 10! = 3,628,800 such orderings, so we have not tried them all.

We note that the problem has a lot of structure to exploit with regard to the sparse distribution of variables in atoms. One approach currently under investigation is to eliminate quantifiers incrementally, taking advantage of the repetitive structure of the sub-problems one uncovers this way (an approach that works on several other problems in the benchmark set). This will be the topic of a future paper.


3 Analysis of the Structure of Economics Benchmarks

The examples collected for the benchmark set come from a variety of areas of economics. They were chosen for their importance in economic reasoning and their ability to be expressed in the Tarski framework, not on the basis of their computational complexity. The sentences tend to be significantly larger formulae than the initial examples described above. We summarise some statistics on the benchmarks. These statistics are generated for the 45 counterexample checks (∃v[A ∧ ¬H]) which are representative of the full 135 problems.

3.1 Logical size

Presented in DNF, the benchmarks have between 7 and 43 clauses, with a mean of 18.5 and a median of 18. Of course each clause can be a conjunction, but this is fairly rare: the average number of literals in a clause is at most 1.5 in the dataset, and the average number of atoms in a formula, at 19.3, is only slightly more than the number of clauses.

3.2 Algebraic size

The number of polynomials involved in the benchmarks varies from 7 to 43 with a mean of 19.2 and a median of 19. The number of variables ranges from 8 to 101 with a mean of 17.2 and a median of 14. It is worth noting that examples with this number of variables would rarely appear in the QE literature, since many QE algorithms have complexity doubly exponential in the number of variables [15], [16], with QE itself doubly exponential in the number of quantifier changes [2]. The number of polynomials per variable ranges from 0.4 to 2.0 with an average of 1.2.


3.3 Low degrees

While the algebraic dimensions of these examples are large in comparison to the QE literature, their algebraic structure is usually less complex. In particular the degrees are strictly limited. The maximum total degree of a polynomial in a benchmark ranges from 2 to 7 with an average of 4.2. So although linear methods cannot help with any of these benchmarks the degrees do not rise far. The mean total degree of a polynomial in a benchmark is only 1.8, so the non-linearity, while always present, can be quite sparse.

However, of greater relevance to QE techniques is not the total degree of a polynomial but the maximum degree in any one variable: e.g. this controls the number of potential roots of a polynomial and thus cell decompositions in a CAD (with this degree becoming the base for the double exponent in the complexity bound); see for example the complexity analysis in [4]. The maximum degree in any one variable is never more than 4 in any polynomial in the benchmarks. If we take the maximum degree in any one variable in a formula and average for the dataset we get 2.1; if we take the maximum degree in any one variable in a polynomial and average for the dataset we get 1.3.
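The difference between the two degree measures can be made concrete with a minimal sketch. The dictionary representation and the function names below are assumptions made for illustration and are not taken from the benchmark tooling; the sample polynomial is the one obtained from the atom 2v11v6v9 > v12v6² + v8v9² of Section 2.4 by moving all terms to one side.

```python
# Hypothetical representation: a polynomial is a dict mapping a monomial,
# given as a tuple of (variable, exponent) pairs, to its integer coefficient.

def total_degree(poly):
    """Largest sum of exponents over all monomials of the polynomial."""
    return max(sum(e for _, e in mono) for mono in poly)

def max_degree_in_one_variable(poly):
    """Largest exponent with which any single variable occurs."""
    return max((e for mono in poly for _, e in mono), default=0)

# 2*v11*v6*v9 - v12*v6^2 - v8*v9^2
p = {
    (("v11", 1), ("v6", 1), ("v9", 1)): 2,
    (("v12", 1), ("v6", 2)): -1,
    (("v8", 1), ("v9", 2)): -1,
}

print(total_degree(p), max_degree_in_one_variable(p))
```

For this polynomial the total degree is 3 while no variable occurs with an exponent above 2, which is the quantity that matters for virtual substitution and for the CAD complexity bound.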

3.4 Plentiful useful side conditions

With variables frequently multiplying each other in a sentence’s polynomials, a potentially large number of polynomial singularities might be relevant to a QE procedure. However, we note that the benchmarks here will typically include sign conditions that in principle rule out the need to consider many such singularities. That is, these side conditions will often coincide with unrealistic economic interpretations that are excluded in the assumptions. Of course, the computational path of many algorithms may not necessarily exploit these currently without human guidance.

3.5 Sparse occurrence of variables

For each sentence ∃v[A ∧ ¬H] (these are the computationally more challenging of the 135) we have formed a matrix of occurrences, with rows representing variables and columns representing polynomials. A matrix entry is one if and only if the variable appears in the polynomial, and zero otherwise. The average entry of all 45 occurrence matrices is 0.15. This sparsity indicates that the use of specialised methods or optimisations for sparse systems could be beneficial.
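As a minimal sketch of this statistic, the following computes an occurrence matrix and its average entry. The helper names and the set-of-variables representation are assumptions made for illustration; the sample data are only the three equational polynomials of A from Section 2.4, not a full benchmark.

```python
def occurrence_matrix(variables, polynomials):
    """Rows are variables, columns are polynomials; an entry is 1 iff the variable occurs."""
    return [[1 if v in poly else 0 for poly in polynomials] for v in variables]

def density(matrix):
    """Average entry of the occurrence matrix."""
    entries = [x for row in matrix for x in row]
    return sum(entries) / len(entries)

variables = [f"v{i}" for i in range(1, 13)]
# Each polynomial is described only by the set of variables occurring in it.
polynomials = [
    {"v1", "v10", "v2", "v7", "v3", "v5"},    # v1*v10 + v2*v7 + v3*v5
    {"v1", "v11", "v2", "v8", "v3", "v7"},    # v1*v11 + v2*v8 + v3*v7
    {"v1", "v12", "v10", "v3", "v11", "v2"},  # v1*v12 + v10*v3 + v11*v2
]

m = occurrence_matrix(variables, polynomials)
print(density(m))  # 0.5
```

These three polynomials alone are much denser than the dataset average of 0.15; the sparsity in the benchmarks comes from the many atoms that mention only one or two variables.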

3.6 Variable orderings

Since the problems are fully existentially quantified we have a free choice of variable ordering for the theory tools. It is well known that the ordering can greatly affect the performance of tools. A classic result in CAD presents a class of problems in which one variable ordering gave output of double exponential complexity in the number of variables and another output of a constant size [7]. Heuristics have been developed to help with this choice (e.g. [13], [5]) with a machine learned choice performing the best in [20]. The benchmarks come with a suggested variable ordering but not a guarantee that this is optimal: tools should investigate further (see also [28]).

4 Final Thoughts

4.1 Summary

We have presented a dataset of examples from economic reasoning suitable for automatic solution by QE or SMT technology. These clearly broaden the application domains present in the relevant category of the SMT-LIB, and the literature of QE. Further, their logical and algebraic structure are sources of additional diversity.

While some examples can be solved easily there are others that prove challenging to the various tools on offer. The example in Section 2.3 could not be tackled by Redlog while the one in Section 2.4 caused problems for Z3. Of the tools we tried, only Mathematica was able to decide all problems in the benchmark set. However, in terms of median timing over the benchmark set Mathematica takes longer, at 825 ms, than Redlog (50 ms) or Z3 (< 15 ms).

4.2 Ongoing and future work

As noted at the end of Section 2, some of these problems have a lot of structure to exploit and so the optimisation of existing QE tools for problems from economics is an ongoing area of work.

More broadly, we hope these benchmarks and examples will promote interest from the Computer Science community in problems from the social sciences, and correspondingly, that successful automated solution of these problems will promote greater interest in the social sciences for such technology. A key barrier to the latter is the learning curve that technology like Computer Algebra Systems and SMT solvers has, particularly for users from outside Mathematics and Computer Science. In [28] we describe one solution for this: a new Mathematica package called TheoryGuru which acts as a wrapper for the QE functionality in the Resolve command designed to ease its use for an economist.

References

1. C. Barrett, P. Fontaine, and C. Tinelli. The Satisfiability Modulo Theories Library (SMT-LIB). www.SMT-LIB.org, 2016.

2. S. Basu, R. Pollack, and M.F. Roy. Algorithms in Real Algebraic Geometry, vol. 10 of Algorithms and Computations in Mathematics. Springer-Verlag, 2006.

3. R. Bradford, J.H. Davenport, M. England, H. Errami, V. Gerdt, D. Grigoriev, C. Hoyt, M. Kosta, O. Radulescu, T. Sturm, and A. Weber. A case study on the parametric occurrence of multiple steady states. In Proc. 2017 ACM International Symposium on Symbolic and Algebraic Computation, ISSAC ’17, pages 45–52. ACM, 2017.

4. R. Bradford, J.H. Davenport, M. England, S. McCallum, and D. Wilson. Truth table invariant cylindrical algebraic decomposition. Journal of Symbolic Computation, 76:1–35, 2016.

5. R. Bradford, J.H. Davenport, M. England, and D. Wilson. Optimising problem formulations for cylindrical algebraic decomposition. In J. Carette, D. Aspinall, C. Lange, P. Sojka, and W. Windsteiger, editors, Intelligent Computer Mathematics, vol. 7961 of Lecture Notes in Computer Science, pages 19–34. Springer Berlin Heidelberg, 2013.

6. C.W. Brown. QEPCAD B: A program for computing with semi-algebraic sets using CADs. ACM SIGSAM Bulletin, 37(4):97–108, 2003.

7. C.W. Brown and J.H. Davenport. The complexity of quantifier elimination and cylindrical algebraic decomposition. In Proc. 2007 International Symposium on Symbolic and Algebraic Computation, ISSAC ’07, pages 54–60. ACM, 2007.

8. B. Caviness and J. Johnson. Quantifier Elimination and Cylindrical Algebraic Decomposition. Texts & Monographs in Symbolic Computation. Springer-Verlag, 1998.

9. A.E. Charalampakis and I. Chatzigiannelis. Analytical solutions for the minimum weight design of trusses by cylindrical algebraic decomposition. Archive of Applied Mechanics, 88(1):39–49, 2018.

10. C. Chen and M. Moreno Maza. Quantifier elimination by cylindrical algebraic decomposition based on regular chains. In Proc. 39th International Symposium on Symbolic and Algebraic Computation, ISSAC ’14, pages 91–98. ACM, 2014.

11. G.E. Collins. Quantifier elimination for real closed fields by cylindrical algebraic decomposition. In Proc. 2nd GI Conference on Automata Theory and Formal Languages, pages 134–183. Springer-Verlag (reprinted in the collection [8]), 1975.

12. F. Corzilius, U. Loup, S. Junges, and E. Abraham. SMT-RAT: An SMT-compliant nonlinear real arithmetic toolbox. In A. Cimatti and R. Sebastiani, editors, Theory and Applications of Satisfiability Testing - SAT 2012, vol. 7317 of Lecture Notes in Computer Science, pages 442–448. Springer Berlin Heidelberg, 2012.

13. A. Dolzmann, A. Seidl, and T. Sturm. Efficient projection orders for CAD. In Proc. 2004 International Symposium on Symbolic and Algebraic Computation, ISSAC ’04, pages 111–118. ACM, 2004.

14. A. Dolzmann and T. Sturm. REDLOG: Computer algebra meets computer logic. SIGSAM Bull., 31(2):2–9, 1997.

15. M. England, R. Bradford, and J.H. Davenport. Improving the use of equational constraints in cylindrical algebraic decomposition. In Proc. 2015 International Symposium on Symbolic and Algebraic Computation, ISSAC ’15, pages 165–172. ACM, 2015.

16. M. England and J.H. Davenport. The complexity of cylindrical algebraic decomposition with respect to polynomial degree. In V.P. Gerdt, W. Koepf, W.M. Werner, and E.V. Vorozhtsov, editors, Computer Algebra in Scientific Computing: 18th International Workshop, CASC 2016, vol. 9890 of Lecture Notes in Computer Science, pages 172–192. Springer International Publishing, 2016.

17. M. Erascu and H. Hong. Real quantifier elimination for the synthesis of optimal numerical algorithms (Case study: Square root computation). Journal of Symbolic Computation, 75:110–126, 2016.

18. P. Fontaine, M. Ogawa, T. Sturm, and X.T. Vu. Subtropical satisfiability. In C. Dixon and M. Finger, editors, Frontiers of Combining Systems (FroCoS 2017), vol. 10483 of Lecture Notes in Computer Science, pages 189–206. Springer International Publishing, 2017.


19. J.R. Hicks. Value and Capital. Clarendon Press, 2nd edition, 1946.

20. Z. Huang, M. England, D. Wilson, J.H. Davenport, L. Paulson, and J. Bridge. Applying machine learning to the problem of choosing a heuristic to select the variable ordering for cylindrical algebraic decomposition. In S.M. Watt, J.H. Davenport, A.P. Sexton, P. Sojka, and J. Urban, editors, Intelligent Computer Mathematics, vol. 8543 of Lecture Notes in Artificial Intelligence, pages 92–107. Springer, 2014.

21. H. Iwane, H. Yanami, and H. Anai. SyNRAC: A toolbox for solving real algebraic constraints. In H. Hong and C. Yap, editors, Mathematical Software – Proc. ICMS 2014, vol. 8592 of Lecture Notes in Computer Science, pages 518–522. Springer Heidelberg, 2014.

22. G.A. Jehle and P.J. Reny. Advanced Microeconomic Theory. Pearson, 2011.

23. D. Jovanovic and L. de Moura. Solving non-linear arithmetic. In B. Gramlich, D. Miller, and U. Sattler, editors, Automated Reasoning: 6th International Joint Conference (IJCAR), vol. 7364 of Lecture Notes in Computer Science, pages 339–354. Springer, 2012.

24. D. Jovanovic and B. Dutertre. LibPoly: A library for reasoning about polynomials. In M. Brain and L. Hadarean, editors, Proc. 15th International Workshop on Satisfiability Modulo Theories (SMT 2017), number 1889 in CEUR-WS, 2017.

25. X. Li and D. Wang. Computing equilibria of semi-algebraic economies using triangular decomposition and real solution classification. Journal of Mathematical Economics, 54:48–58, 2014.

26. A. Marshall. Principles of Economics. MacMillan and Co., 1895.

27. C.B. Mulligan. The Redistribution Recession: How Labor Market Distortions Contracted the Economy. Oxford University Press, 2012.

28. C.B. Mulligan, J.H. Davenport, and M. England. TheoryGuru: A Mathematica package to apply quantifier elimination technology to economics. In J.H. Davenport, M. Kauers, G. Labahn, and J. Urban, editors, Mathematical Software – Proc. ICMS 2018, vol. 10931 of Lecture Notes in Computer Science, pages 369–378. Springer, 2018.

29. L.C. Paulson. Metitarski: Past and future. In L. Beringer and A. Felty, editors, Interactive Theorem Proving, vol. 7406 of Lecture Notes in Computer Science, pages 1–10. Springer, 2012.

30. A. Platzer, J.D. Quesel, and P. Rummer. Real world verification. In R.A. Schmidt, editor, Automated Deduction (CADE-22), vol. 5663 of Lecture Notes in Computer Science, pages 485–501. Springer Berlin Heidelberg, 2009.

31. C. Steinhorn. Tame Topology and O-Minimal Structures, pages 165–191. Springer Berlin Heidelberg, 2008.

32. A. Strzebonski. Cylindrical algebraic decomposition using validated numerics. Journal of Symbolic Computation, 41(9):1021–1038, 2006.

33. T. Sturm and A. Weber. Investigating generic methods to solve Hopf bifurcation problems in algebraic biology. In K. Horimoto, G. Regensburger, M. Rosenkranz, and H. Yoshida, editors, Algebraic Biology, pages 200–215. Springer, 2008.

34. A. Tarski. A Decision Method For Elementary Algebra And Geometry. RAND Corporation, Santa Monica, CA (reprinted in the collection [8]), 1948.

35. Y. Wada, T. Matsuzaki, A. Terui, and N.H. Arai. An automated deduction and its implementation for solving problem of sequence at university entrance examination. In G.-M. Greuel, T. Koch, P. Paule, and A. Sommese, editors, Mathematical Software – Proc. ICMS 2016, vol. 9725 of Lecture Notes in Computer Science, pages 82–89. Springer International Publishing, 2016.

36. V. Weispfenning. The complexity of linear problems in fields. Journal of Symbolic Computation, 5(1/2):3–27, 1988.

A Practical Polynomial Calculus for Arithmetic Circuit Verification

Daniela Ritirc, Armin Biere, Manuel Kauers

Johannes Kepler University, Linz, Austria

Abstract. Generating and automatically checking proofs independently increases confidence in the results of automated reasoning tools. The use of computer algebra is an essential ingredient in recent substantial improvements to scale verification of arithmetic gate-level circuits, such as multipliers, to large bit-widths. There is also a large body of work on theoretical aspects of propositional algebraic proof systems in the proof complexity community starting with the seminal paper introducing the polynomial calculus. We show that the polynomial calculus provides a framework to define a practical algebraic calculus (PAC) proof format to capture low-level algebraic proofs needed in scalable gate-level verification of arithmetic circuits. We apply these techniques to generate proofs obtained as a by-product of verifying gate-level multipliers using state-of-the-art techniques. Our experiments show that these proofs can be checked efficiently with independent tools.

1 Introduction

Formal verification gives correctness guarantees. However, the process of verification might itself not be error-free. A common approach to increase confidence in the results of verification consists of generating machine checkable proofs which are then checked by independent proof checkers. These checkers are less complex than, for example, the theorem provers producing the proofs, and can also be verified.

For instance, many applications of formal verification rely on SAT solvers. Their results can be validated by producing and checking resolution proofs [17,37] or clausal proofs [15,17]. Generating proofs has been mandatory in the main track of the SAT Competition since 2016. These approaches have also recently been shown to scale to huge low-level proofs of combinatorial problems such as the Boolean Pythagorean triples problem [18] or Schur Number Five [16].

However, in certain applications, e.g., arithmetic circuit verification, resolution based SAT solving does not work. Reasoning about gate-level multipliers in particular is considered to be hard [5]. For arithmetic circuit verification the currently most promising approach uses algebraic reasoning [11,26,30,32].

In this approach each circuit gate is translated into a polynomial to model constraints between its output and inputs, i.e., roots of polynomials are identified as solutions of gate constraints. Additional polynomials ensure that values remain in the Boolean domain. Word-level specifications relating circuit outputs and inputs can also be translated into polynomials. Thus verification boils down to showing that the specification polynomial is “implied” by the polynomials induced by the circuit gates (contained in the ideal generated by them).

To validate results of algebraic reasoning the polynomial calculus can be used [12]. It operates on polynomials and allows one to check whether a polynomial is a logical consequence of a given set of polynomials. The main focus in this area has been on proof complexity to obtain lower bounds for the degree and size of proofs [20]. For instance [27] introduces a general method to obtain lower bounds and [25] shows that certifying the non-k-colorability of graphs requires proofs of large degree. A more general calculus capable of detecting unsatisfiability of nonlinear equalities as well as inequalities is discussed in [34].

Our paper shows that the polynomial calculus can also be used in practice. In particular we generate low-level algebraic proofs needed to validate the results of ideal membership testing used in arithmetic circuit verification by translating proofs extracted from computer algebra systems to polynomial refutations in the polynomial calculus. After we review preliminaries in Sect. 2, we present a concrete proof format for polynomial calculus proofs, called practical algebraic calculus, in Sect. 3. In Sect. 4 we give a comprehensive introduction to arithmetic circuit verification, following [30]. Section 5 introduces the tool flow of verifying and proof checking arithmetic circuits. In our experiments, shown in Sect. 6, our new proof checker PacTrim is used to independently validate the results of multiplier verification [30]. We further apply these techniques to equivalence checking of multipliers [31] and proving certain ring properties, e.g. commutativity of multipliers [3]. In general, we claim that our approach is the first to provide machine checkable proofs for current state-of-the-art techniques in verifying arithmetic circuits [11,26,30,32].

2 Preliminaries

Proof systems are used to validate the results of verification systems. While a verification system only gives a yes/no answer, a proof system additionally provides a certificate with which the answer can be checked independently. We are concerned here with a proof system for reasoning about polynomial equations. The question is whether the zeroness of a certain set of polynomials implies the zeroness of another polynomial. We consider polynomials p ∈ F[X] where F is a field and X = {x1, . . . , xn} is a finite set of variables. The function X ↦ p(X) is called the polynomial function of p. The polynomial equation of p is defined as p(X) = 0 and the solutions of this equation are the roots of p. From now on we drop the function argument and write p = 0 instead of p(X) = 0.

Reasoning with polynomial equations is well-understood both in computer algebra and in computational logic. Hilbert and collaborators already studied the theory of polynomial ideals in order to reason about the solution sets of polynomial equations. The application of Gröbner bases [8] by for instance Kapur [21,22,23] has turned the algebraic approach into a valuable computational tool for automated theorem proving with renewed recent interest [1,38].

In order to introduce the notation and terminology needed later, let us give a brief summary of the theory. As far as algebra is concerned, we follow the standard textbooks [4,9,13]. From the logical perspective, we use a variant of the polynomial calculus (PC) as proposed by [12]. It is more flexible than the Nullstellensatz (NS) proof system [2], which is also heavily used in the proof complexity community. The relation between PC and NS in the context of our application is further discussed at the end of this section.

Let G ⊆ F[X] and f ∈ F[X]. In logical terms, the question is whether the equation f = 0 can be deduced from the equations g = 0 with g ∈ G, i.e., every common root of the polynomials g ∈ G is also a root of f. As we will only consider polynomial equations with right hand side zero, we take the freedom to write f instead of f = 0. We write proofs as tuples P = (p1, . . . , pn) of polynomials where each pi is derived by one of the following rules.

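$$
\textbf{Addition}\quad \frac{p_i \qquad p_j}{p_i + p_j}
\qquad p_i, p_j \text{ appearing earlier in the proof or contained in } G
$$

$$
\textbf{Multiplication}\quad \frac{p_i}{q\,p_i}
\qquad p_i \text{ appearing earlier in the proof or contained in } G, \text{ and } q \in F[X] \text{ arbitrary}
$$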

If f can be deduced from the polynomials g ∈ G, i.e. pn = f, we write G ⊢ f. In algebraic terms, G ⊢ f means that f belongs to the ideal generated by G. Recall that an ideal I ⊆ F[X] is defined as a set with 0 ∈ I and the closure properties u, v ∈ I ⇒ u + v ∈ I and w ∈ F[X], u ∈ I ⇒ wu ∈ I. If G = {g1, . . . , gm} ⊆ F[X] is a finite set of polynomials, then the ideal generated by G is defined as the set {q1g1 + · · · + qmgm : q1, . . . , qm ∈ F[X]} and denoted by 〈G〉. The set G is called a basis of the ideal 〈G〉. It is clear that this is an ideal and that it consists of all the polynomials whose zeroness can be deduced from the zeroness of the polynomials in G. In logical terms we would call G an axiom system and 〈G〉 the corresponding theory. If we can derive G ⊢ 1, or in algebraic terms 1 ∈ 〈G〉, the PC proof is called a PC refutation.

Example 1. This example shows that the output c of an XOR gate over an input a and its negation b = ¬a is always true, i.e., c = 1 or equivalently −c + 1 (= 0). We apply the polynomial calculus over the ring Q[c, b, a]. Over Q a NOT gate x = ¬y is modeled by the polynomial −x + 1 − y and an XOR gate z = x ⊕ y is modeled by the polynomial −z + x + y − 2xy. Because the variables range over the Boolean domain we further need to enforce that every variable can only take the values 0 or 1. Therefore we add for each variable xi a polynomial of the form xi(xi − 1) to the given set of polynomials. The corresponding circuit representation, the given polynomials and a polynomial proof are shown in Fig. 1.
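The two gate encodings can be checked exhaustively on Boolean inputs. The following is a minimal sketch; the helper name `encodes` and the lambda-based gate semantics are assumptions made for illustration and are not part of the paper's tooling.

```python
from itertools import product

def not_gate(x, y):
    """Polynomial modelling x = NOT y over Q."""
    return -x + 1 - y

def xor_gate(z, x, y):
    """Polynomial modelling z = x XOR y over Q."""
    return -z + x + y - 2 * x * y

def encodes(poly, gate, arity):
    """True iff, on Boolean values, poly vanishes exactly when output == gate(inputs)."""
    for bits in product((0, 1), repeat=arity + 1):
        out, *ins = bits
        if (poly(*bits) == 0) != (out == gate(*ins)):
            return False
    return True

print(encodes(not_gate, lambda y: 1 - y, 1))     # True
print(encodes(xor_gate, lambda x, y: x ^ y, 2))  # True
```

The check only covers values in {0, 1}; outside the Boolean domain the polynomials have further roots, which is why the polynomials xi(xi − 1) are added.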

Example 2. Let G = {x, x + y} ⊆ Q[x, y], f = y. We have G ⊢ f. A proof is P = (−x, y). The first entry follows by the multiplication rule from x with q = −1, and the second entry follows by the addition rule from the first entry and x + y, which is contained in G.
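In algebraic terms, the proof of Example 2 corresponds to an explicit ideal membership certificate. The cofactor names q_1, q_2 follow the definition of 〈G〉 given above:

$$
y \;=\; q_1 \cdot x + q_2 \cdot (x + y) \qquad \text{with } q_1 = -1,\; q_2 = 1 .
$$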


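G = { −b + 1 − a,  −c + a + b − 2ab,  a² − a,  b² − b,  c² − c }

[Circuit diagram: input a feeds a NOT gate with output b; a and b feed an XOR gate with output c.]

Proof tree (premises ⟹ conclusion):
  −c + a + b − 2ab,  −b + 1 − a  ⟹  −c + 1 − 2ab
  −b + 1 − a  ⟹  2ab − 2a + 2a²
  −c + 1 − 2ab,  2ab − 2a + 2a²  ⟹  −c + 1 − 2a + 2a²
  a² − a  ⟹  −2a² + 2a
  −c + 1 − 2a + 2a²,  −2a² + 2a  ⟹  −c + 1

Fig. 1: The circuit, polynomial representation of the gates and proof for Ex. 1.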

Thanks to the theory of Gröbner bases [4,8,13], the polynomial calculus is decidable, i.e., there is an algorithm which, for any finite G ⊆ F[X] and f ∈ F[X], can decide whether G ⊢ f or not. A basis of an ideal I is called a Gröbner basis if it enjoys certain structural properties whose precise definition is not relevant for our purpose. What matters are the following fundamental facts:

– There is an algorithm (Buchberger’s algorithm) which for any given finite set B ⊆ F[X] computes a Gröbner basis for the ideal 〈B〉 generated by B.

– Given a Gröbner basis G, there is a computable function red_G : F[X] → F[X] such that ∀ p ∈ F[X] : red_G(p) = 0 ⇐⇒ p ∈ 〈G〉.

– Moreover, if G = {g1, . . . , gm} is a Gröbner basis of an ideal I and p, r ∈ F[X] are such that red_G(p) = r, then there exist h1, . . . , hm ∈ F[X] such that p − r = h1g1 + · · · + hmgm, and such polynomials hi can be computed.

Consider the extended calculus with the additional rule

$$
\textbf{Radical}\quad \frac{p_i^m}{p_i}
\qquad m \in \mathbb{N} \setminus \{0\} \text{ and } p_i^m \text{ appearing earlier in the proof or contained in } G.
$$

If the polynomial f can be deduced from the polynomials g ∈ G with the rules of PC and this additional radical rule, we write G ⊢₊ f and call the proof a radical proof. In algebra, the set { f ∈ F[X] : G ⊢₊ f } is called the radical ideal of G and is typically denoted by √〈G〉.

The extended calculus ⊢₊ is also decidable. It can be reduced to ⊢ using the so-called Rabinowitsch trick [13, 4§2 Prop. 8], which says

f ∈ √〈G〉 ⇐⇒ 1 ∈ 〈G ∪ {yf − 1}〉   or   G ⊢₊ f ⇐⇒ G ∪ {yf − 1} ⊢ 1,

depending on whether you prefer algebraic or logical notation. In both cases, y is a new variable and the ideal/theory on the right hand sides is understood as an ideal/theory of the extended ring F[X, y]. The Rabinowitsch trick is therefore used to replace a radical proof by a PC refutation.
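One direction of the Rabinowitsch equivalence can be verified by a short computation. As a sketch, assume that some power of f lies in the ideal, say f^m = q_1 g_1 + · · · + q_k g_k with g_i ∈ G and m ≥ 1 (the cofactors q_i are names introduced here for illustration). Then

$$
1 \;=\; (yf)^m - (yf - 1)\sum_{j=0}^{m-1} (yf)^j
\;=\; y^m \sum_{i=1}^{k} q_i g_i \;-\; (yf - 1)\sum_{j=0}^{m-1} (yf)^j
\;\in\; \langle G \cup \{yf - 1\} \rangle .
$$

The converse direction is the substantive part of the cited proposition.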

For a given set G ⊆ F[X], a model is a point u = (u1, . . . , un) ∈ F^n such that g(u1, . . . , un) = 0 for all g ∈ G. Here, by g(u1, . . . , un) we mean the element of F obtained by evaluating the polynomial g for x1 = u1, . . . , xn = un. For a set G ⊆ F[X] and a polynomial f ∈ F[X], we write G ⊨ f if every model for G is also a model for {f}. Given G ⊆ F[X], define V(G) as the set of all models of G. For an algebraically closed field F, Hilbert’s Nullstellensatz [13, 4§1 Thms. 1 and 2] asserts that V(G) is nonempty if and only if 1 ∉ 〈G〉, and furthermore, f ∈ √〈G〉 ⇐⇒ V(G) ⊆ V({f}). In other words, G ⊨ f ⇐⇒ G ⊢₊ f. In particular, the PC including the radical rule is correct (“⇐”) and complete (“⇒”). In combination with Rabinowitsch’s trick, we can therefore decide the existence of models and furthermore produce certificates for the non-existence of models.

For our applications, only models u ∈ {0, 1}^n ⊆ F^n matter. Let us write G ⊨_bool f if every model u ∈ {0, 1}^n of G is also a model of {f}. Using basic properties of ideals as described in [13, 4§3 Thm. 4], it is easy to show that G ⊨_bool f ⇐⇒ G ∪ B ⊨ f, where B = {xi(xi − 1) : i = 1, . . . , n}. Furthermore, the equivalence G ∪ B ⊨ f ⇐⇒ G ∪ B ⊢₊ f holds also when F is not algebraically closed, because changing from F to its algebraic closure F̄ will not have any effect on the models in {0, 1}^n. Finally, let us remark that the finiteness of {0, 1}^n also implies that G ∪ B ⊢₊ f ⇐⇒ G ∪ B ⊢ f. This follows from Seidenberg’s lemma [4, Lemma 8.13] and generalizes Theorem 1 of [12].
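For small n the semantic notion of Boolean entailment can be checked directly by enumerating {0, 1}^n. The following is a minimal sketch for the gates of Example 1; the function name `bool_entails` and the representation of polynomials as Python functions are assumptions made for illustration. The enumeration is exponential in n, so it only serves as a sanity check, not as a verification method.

```python
from itertools import product

def bool_entails(G, f, n):
    """True iff every u in {0,1}^n that is a root of all g in G is also a root of f."""
    return all(
        f(*u) == 0
        for u in product((0, 1), repeat=n)
        if all(g(*u) == 0 for g in G)
    )

# Gate polynomials of Example 1 in the variables (a, b, c).
G = [
    lambda a, b, c: -b + 1 - a,              # b = NOT a
    lambda a, b, c: -c + a + b - 2 * a * b,  # c = a XOR b
]
f = lambda a, b, c: -c + 1                   # claim: c = 1

print(bool_entails(G, f, 3))  # True
```

Restricting the enumeration to {0, 1}^n plays the role of adding the polynomials in B to G.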

In contrast to a PC refutation G ∪ {1 − yf} ∪ B ⊢ 1, where each polynomial in the proof is generated using the rules of PC, a refutation in the NS proof system is a set of polynomials Q = {q1, . . . , qm} ⊆ F[X] such that

$$
\sum_{i=1}^{m} q_i p_i = 1 \qquad \text{for } p_i \in G \cup \{1 - yf\} \cup B.
$$

Although both systems are able to verify correctness of a refutation, we will use PC and not the NS proof system, because for arithmetic circuit verification we will rewrite some polynomials of G ∪ {1 − yf} ∪ B, and thus gain an optimized algebraic representation of the circuit, cf. Sect. 4. In a correct NS refutation we would also need to express these rewritten polynomials as a linear combination of elements of G ∪ {1 − yf} ∪ B and thus lose the optimized representation, which will most likely lead to an exponential blow-up of monomials in the NS proof [10]. In PC we can generate these optimized polynomials on-the-fly and then use these polynomials to show the correctness of the refutation.

3 Practical Algebraic Calculus

For practical proof checking we translate the abstract polynomial calculus (PC) into a concrete proof format, i.e., we only define a format based on PC, which is logically equivalent but more precise. In principle a proof in PC can be seen as a finite sequence of polynomials derived from given polynomials and previously inferred polynomials by applying either an addition or multiplication rule.

To ensure correctness of each rule it is of course necessary to know which rule was used, to check that it was applied correctly, and in particular which given or previously derived polynomials are involved. During proof generation these polynomials are usually known and thus we require that all of this information is part of a rule in our concrete practical algebraic calculus (PAC) proof format to simplify proof checking. The syntax of PAC is shown in Fig. 2. White space is allowed everywhere except between letters and digits in a constant or a variable.

letter     ::= ‘a’ | ‘b’ | . . . | ‘z’ | ‘A’ | ‘B’ | . . . | ‘Z’
number     ::= ‘0’ | ‘1’ | . . . | ‘9’
constant   ::= (number)+
variable   ::= letter (letter | number)∗
power      ::= variable [ ‘^’ constant ]
term       ::= power (‘*’ power)∗
monomial   ::= constant | [ constant ‘*’ ] term
operator   ::= ‘+’ | ‘-’
polynomial ::= [ ‘-’ ] monomial (operator monomial)∗
given      ::= (polynomial ‘;’)∗
rule       ::= (‘+’ | ‘*’) ‘:’ polynomial ‘,’ polynomial ‘,’ polynomial ‘;’
proof      ::= (rule ‘;’)∗

Fig. 2: Syntax of given polynomials and proofs in PAC-format

A proof rule contains four components:

o : v, w, p;

The first component o denotes the operator which is either ‘+ ’ for addition or ‘* ’for multiplication. The next two components v, w specify the two (antecedent)polynomials used to derive p (conclusion). In the multiplication rule w plays therole of the polynomial q of the multiplication rule of PC, cf. Sect. 2. A refutationin PAC is a proof, which contains a non-zero constant polynomial (typically justthe constant “1”) as conclusion p of a rule.

As discussed above we do not need the radical rule for our purpose, eventhough it could be easily added. Further note that the format is independent ofthe domain of the models u, e.g., u ∈ {0, 1}n for gate-level circuit verification,to which the values of variables are restricted. If such a restriction is necessary,all elements of the corresponding set B (often also called field polynomials) haveto be added to the given set of polynomials.

Although the definition of number together with the definition of polynomial only allows integer coefficients, this is not a severe restriction. Rational number coefficients can be simulated by multiplying the involved polynomials with appropriate non-zero constants to eliminate denominators.

Example 3. Consider again Ex. 1. To test membership of $1 - c \in \sqrt{\langle G \rangle}$ we add $1 + y(c - 1)$ to the set of given polynomials G in order to apply Rabinowitsch’s trick and obtain a PAC refutation:

+ : -c+a+b-2a*b, -b+1-a, -c+1-2a*b;

* : -b+1-a, -2a, 2a*b-2a+2a^2;

+ : -c+1-2a*b, 2a*b-2a+2a^2, -c+1-2a+2a^2;

* : a^2-a, -2, -2a^2+2a;

+ : -c+1-2a+2a^2, -2a^2+2a, -c+1;

* : -c+1, y, -c*y+y;

+ : -c*y+y, 1+c*y-y, 1;


input   G: sequence of given polynomials
        r_1 ... r_k: sequence of PAC proof rules
output  “incorrect”, “correct-proof”, or “correct-refutation”

P_0 ← G
for i ← 1 ... k
    let r_i = (o_i, v_i, w_i, p_i)
    case o_i = +
        if v_i ∈ P_{i−1} ∧ w_i ∈ P_{i−1} ∧ p_i = v_i + w_i then P_i ← append(P_{i−1}, p_i)
        else return “incorrect”
    case o_i = ∗
        if v_i ∈ P_{i−1} ∧ p_i = v_i ∗ w_i then P_i ← append(P_{i−1}, p_i)
        else return “incorrect”
for i ← 1 ... k
    if p_i is a non-zero constant polynomial then return “correct-refutation”
return “correct-proof”

Fig. 3: Proof Checking Algorithm

For proof validation we need to make sure that two properties hold. The connection property states that the components v, w are either given polynomials or conclusions of previously applied proof rules. For multiplication we only have to check this property for v, because w is an arbitrary polynomial. By the second property, called inference property, we verify the correctness of each rule: we simply calculate v + w resp. v ∗ w and check that the obtained result matches p. In a correct PAC refutation we further need to verify that at least one p_i is a non-zero constant. The complete checking algorithm is shown in Fig. 3.
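The checking algorithm of Fig. 3 can be sketched in a few lines of Python. This is only an illustration and not the PacTrim or Python/Singular implementation described later: the dictionary-based polynomial representation, the helper names `poly`, `add`, `mul` and `check`, and the choice to build polynomials directly instead of parsing PAC text are all assumptions made here. The example data are the rules of Example 3, with the given set consisting of the two gate polynomials, the field polynomial a^2 − a and the Rabinowitsch polynomial 1 + c*y − y.

```python
from collections import defaultdict

# Assumed representation: a polynomial is a dict {monomial: integer coefficient},
# a monomial is a sorted tuple of (variable, exponent) pairs, () is the constant 1.

def poly(*terms):
    """Build a polynomial from (coefficient, {variable: exponent}) pairs."""
    acc = defaultdict(int)
    for coeff, powers in terms:
        acc[tuple(sorted(powers.items()))] += coeff
    return {m: c for m, c in acc.items() if c != 0}

def add(p, q):
    acc = defaultdict(int, p)
    for m, c in q.items():
        acc[m] += c
    return {m: c for m, c in acc.items() if c != 0}

def mul(p, q):
    acc = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            powers = defaultdict(int)
            for var, exp in m1 + m2:
                powers[var] += exp
            acc[tuple(sorted(powers.items()))] += c1 * c2
    return {m: c for m, c in acc.items() if c != 0}

def check(given, rules):
    """Follow Fig. 3: connection property, inference property, refutation test."""
    known = list(given)                           # P_0 <- G
    for op, v, w, p in rules:
        if op == "+":
            ok = v in known and w in known and add(v, w) == p
        elif op == "*":
            ok = v in known and mul(v, w) == p    # w may be an arbitrary polynomial
        else:
            ok = False
        if not ok:
            return "incorrect"
        known.append(p)
    if any(len(p) == 1 and () in p for _, _, _, p in rules):
        return "correct-refutation"
    return "correct-proof"

# Exponent maps used to spell out the polynomials of Example 3.
a, b, c, y = {"a": 1}, {"b": 1}, {"c": 1}, {"y": 1}
ab, cy, a2, one = {"a": 1, "b": 1}, {"c": 1, "y": 1}, {"a": 2}, {}

xor_gate = poly((-1, c), (1, a), (1, b), (-2, ab))   # -c+a+b-2a*b
not_gate = poly((-1, b), (1, one), (-1, a))          # -b+1-a
field_a = poly((1, a2), (-1, a))                     # a^2-a
neg_spec = poly((1, one), (1, cy), (-1, y))          # 1+c*y-y
given = [xor_gate, not_gate, field_a, neg_spec]

p1 = poly((-1, c), (1, one), (-2, ab))
p2 = poly((2, ab), (-2, a), (2, a2))
p3 = poly((-1, c), (1, one), (-2, a), (2, a2))
p4 = poly((-2, a2), (2, a))
p5 = poly((-1, c), (1, one))
p6 = poly((-1, cy), (1, y))
rules = [
    ("+", xor_gate, not_gate, p1),
    ("*", not_gate, poly((-2, a)), p2),
    ("+", p1, p2, p3),
    ("*", field_a, poly((-2, one)), p4),
    ("+", p3, p4, p5),
    ("*", p5, poly((1, y)), p6),
    ("+", p6, neg_spec, poly((1, one))),
]
print(check(given, rules))
```

Run on the seven rules of Example 3 the sketch reports a correct refutation; dropping the first rule makes the third one fail the connection property. Membership tests on a Python list are linear, whereas a dedicated checker would use hashing, as the text notes for PacTrim.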

4 Circuit verification using Computer Algebra

Following [11,30,31,32,33,36] we consider gate-level (integer) multipliers with 2n input bits $a_0, \dots, a_{n-1}, b_0, \dots, b_{n-1} \in \{0,1\}$ and 2n output bits $s_0, \dots, s_{2n-1} \in \{0,1\}$. Each internal gate (output) is represented by a further variable $l_1, \dots, l_m$. In this setting let $X = a_0, \dots, a_{n-1}, b_0, \dots, b_{n-1}, l_1, \dots, l_m, s_0, \dots, s_{2n-1}$. Then a multiplier is correct iff for all possible inputs the following specification holds:

$$\sum_{i=0}^{2n-1} 2^i s_i = \left(\sum_{i=0}^{n-1} 2^i a_i\right)\left(\sum_{i=0}^{n-1} 2^i b_i\right) \qquad (1)$$

Using algebraic reasoning this can be verified by showing that the specification is contained in the ideal generated by the gate constraints. For each logical gate in the circuit a so-called gate polynomial $g \in \mathbb{Q}[X]$ representing the relation between the gate inputs and output is defined. Example 1 defines these polynomials for a NOT and an XOR gate. To indicate that the circuit operates over Boolean variables we add, for each variable $x_i \in X$, the relation $x_i(x_i - 1)$, matching the definition of B in the last paragraph of Sect. 2, to the gate polynomials G.
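As a small sanity check of what a gate polynomial encodes, the following Python sketch evaluates the two gate polynomials that appear as antecedents in the first rule of Example 3, together with a field polynomial, on all Boolean assignments. Reading -b+1-a as the NOT gate and -c+a+b-2a*b as the XOR gate is inferred from their shape; the function names are chosen here for illustration only.

```python
from itertools import product

def not_gate(a, b):
    """Gate polynomial -b + 1 - a, zero iff b = NOT a on Boolean inputs."""
    return -b + 1 - a

def xor_gate(a, b, c):
    """Gate polynomial -c + a + b - 2ab, zero iff c = a XOR b on Boolean inputs."""
    return -c + a + b - 2 * a * b

def field(x):
    """Field polynomial x(x - 1), zero exactly for x in {0, 1}."""
    return x * (x - 1)

# Boolean assignments on which each gate polynomial vanishes.
not_roots = {(a, b) for a, b in product((0, 1), repeat=2)
             if not_gate(a, b) == 0}
xor_roots = {(a, b, c) for a, b, c in product((0, 1), repeat=3)
             if xor_gate(a, b, c) == 0}

print(sorted(not_roots))
print(sorted(xor_roots))
```

The roots are exactly the rows of the respective truth tables, which is why membership of the specification in the ideal generated by these polynomials expresses correctness of the circuit.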


Although all variables are restricted to Boolean values we use $\mathbb{Q}$ as the base field. Using $\mathbb{Q}$ connects the circuit specification (Eqn. (1)) to multiplication in $\mathbb{Q}$. The specification would be the same over $\mathbb{Z}$, but $\mathbb{Z}$ is not a field, hence the underlying Gröbner basis theory would be more complex. Theoretically, reasoning in the field $\mathbb{Z}_2$ is possible, but would probably be much more involved. A more precise comparison will be done in the future.

A term order is a lexicographic term order if for all terms $\sigma_1 = x_1^{u_1} \cdots x_n^{u_n}$, $\sigma_2 = x_1^{v_1} \cdots x_n^{v_n}$ we have $\sigma_1 < \sigma_2$ iff there exists i with $u_j = v_j$ for all $j < i$, and $u_i < v_i$. If the terms in the gate polynomials are ordered according to such a lexicographic variable ordering where the variable corresponding to the output of a gate is always bigger than the variables corresponding to inputs of the gate, then by Buchberger’s product criterion [13] the gate polynomials define a Gröbner basis for the ideal generated by the gate polynomials. Thus the correctness of the circuit can be shown by reducing the specification by the gate polynomials using polynomial reduction ($\mathrm{red}_G$) and checking if the result is zero. We generate and check proofs for this reduction, cf. Sect. 5.
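The definition of the lexicographic order translates directly into a comparison of exponent vectors. The sketch below assumes that a term is represented by its exponent tuple (u_1, ..., u_n); the function name `lex_less` is hypothetical.

```python
def lex_less(u, v):
    """True iff the term with exponents u is smaller than the term with exponents v.

    The first position where the exponent vectors differ decides the comparison.
    """
    for ui, vi in zip(u, v):
        if ui != vi:
            return ui < vi
    return False  # identical terms are not strictly smaller


# Terms in two variables x1, x2 given by their exponent vectors.
terms = [(0, 2), (1, 0), (0, 0), (1, 1)]
# Python compares tuples position by position, so sorted() uses the same order.
ordered = sorted(terms)
print(ordered)
```

Note that x1 is larger than any power of x2 in this order, which is what makes the choice "gate output bigger than gate inputs" determine the leading term of each gate polynomial.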

Directly reducing the specification without rewriting the Gröbner basis leads to an explosion of intermediate results [30]. In practice it is necessary to use rewriting techniques to simplify the Gröbner basis. In recent work [32] a reduction scheme was proposed which effectively (partially) reduces the Gröbner basis. These preprocessing steps [32] are also applied in [30], where we introduced a column-wise checking algorithm which cuts the circuit into 2n slices $S_i$ with $0 \le i < 2n$ such that each slice contains exactly one output bit $s_i$. In each slice the relation that the sum of the outgoing carries $C_{i+1}$ and the output bit $s_i$ is equal to the sum of the partial products $P_i = \sum_{k+l=i} a_k b_l$ and the incoming carries $C_i$ of the slice has to hold. Thus we define for each slice $S_i$ a corresponding specification $C_i = 2C_{i+1} + s_i - P_i$. Initially we set $C_{2n} = 0$ and recursively calculate $C_i$ as the remainder of reducing $2C_{i+1} + s_i - P_i$ by the gate polynomials of the corresponding slice. In a correct multiplier $C_0 = 0$ has to hold. Hence each slice is verified recursively, and the problem of circuit verification is divided into smaller, more manageable sub-problems.
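The slice recurrence can be illustrated numerically. The sketch below does not perform polynomial reduction; it only evaluates the recurrence C_i = 2C_{i+1} + s_i − P_i on concrete bit values, which agrees with the remainder on every consistent assignment because the gate polynomials vanish there. The helper names and the choice n = 4 are assumptions made for the example.

```python
from itertools import product

def bits(value, width):
    """Little-endian list of the lowest `width` bits of value."""
    return [(value >> i) & 1 for i in range(width)]

def slice_carries(a_bits, b_bits, s_bits):
    """Evaluate C_i = 2*C_{i+1} + s_i - P_i from i = 2n-1 down to 0, with C_{2n} = 0."""
    n = len(a_bits)
    carries = [0] * (2 * n + 1)
    for i in range(2 * n - 1, -1, -1):
        # P_i is the sum of all partial products a_k * b_l with k + l = i.
        p_i = sum(a_bits[k] * b_bits[i - k] for k in range(n) if 0 <= i - k < n)
        carries[i] = 2 * carries[i + 1] + s_bits[i] - p_i
    return carries

n = 4
# For the correct product bits, C_0 evaluates to 0 for every input pair.
all_ok = all(
    slice_carries(bits(a, n), bits(b, n), bits(a * b, 2 * n))[0] == 0
    for a, b in product(range(2 ** n), repeat=2)
)
# Flipping one output bit (3 * 5 = 15, but 14 is reported) yields C_0 != 0.
faulty = slice_carries(bits(3, n), bits(5, n), bits(14, 2 * n))[0]
print(all_ok, faulty)
```

In the algebraic setting the same recurrence is carried out on polynomials, slice by slice, so that only the gate polynomials of one slice are involved in each reduction.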

In [31] we further improved incremental checking by eliminating variables [7] local to full- and half-adders. Since these preprocessing and incremental algorithms are complex and error-prone to implement but essential to achieve scalable verification, we also generate and check proofs for them.

5 Engineering

We take as input circuit an And-Inverter Graph (AIG) [24] in the common AIGER format [6]. The AIG is then verified using the computer algebra system Mathematica [35]. We also generate proofs in our PAC-format (cf. Sect. 3), which are then either passed on to the computer algebra system Singular [14] or to our own algebraic proof checker PacTrim. The complete verification flow is depicted in Fig. 4. Boxes with “.〈suffix〉” refer to the input AIG or generated files. The variable n defines the length of the two input bitvectors of the multiplier.


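[Figure 4 (diagram not reproduced). Labels recoverable from the original: input .aig and parameter n; tools AigMulToPoly, ProofIt, AigToPoly, PacMultSpec / PacEqSpec; back ends Mathematica, Python (connect), Singular (inference) and PacTrim; files .wl, .out, .pac, .polys, .spec, .singular; paths verify, verify+, check I and check II; stages Verification and Certification.]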

Fig. 4: Toolflow of verifying and proof checking circuits

The tool AigMulToPoly [30,31] is used for verification without generating proofs (verify). It takes an AIG as input and produces a file which can be passed on to either Mathematica or Singular, which then performs the actual ideal membership test. Different option settings can be selected to enable or disable the preprocessing and rewriting techniques discussed in Sect. 4.

For proof generation (verify+) we use a second tool, ProofIt, which takes the output file from AigMulToPoly as well as the original AIG and returns a file which can be passed on to Mathematica. In Mathematica the proof (.pac) is calculated. In the tool AigToPoly the original AIG is translated into a set of polynomials G without applying any preprocessing. Together with the set $B = \{x_i(x_i - 1) \mid x_i \in X\}$ these polynomials define the given set of polynomials $G \cup B$ of the PAC proof (.polys). This is a rather trivial task implemented in less than 130 lines of C code (half of them are just about command line option handling) using the AIGER [6] library for parsing.

In the same spirit PacMultSpec and PacEqSpec have been implemented to produce the specifications we want to verify (.spec). In PacMultSpec we simply generate the multiplier specification as given in Sect. 4, i.e. Eqn. (1) flattened. In PacEqSpec we generate a similar specification for equivalence checking of two multipliers [31]. To obtain a PAC refutation both types of specifications are produced in negated form using the Rabinowitsch trick and hence become part of the given set of polynomials.

For each polynomial that AigMulToPoly derives during preprocessing we need to check whether it is a logical consequence of the given set of polynomials. Hence for each preprocessed polynomial f the representation modulo the given set of polynomials $G \cup B = \{g_1, \dots, g_k\}$ is calculated in Mathematica using the built-in function “PolynomialReduce”. This command not only computes the reduction $\mathrm{red}_{G \cup B}(f) = r$, but also returns cofactors $h_1, \dots, h_k$ such that $f = h_1 g_1 + \dots + h_k g_k + r$. If the preprocessing is done correctly the derived polynomials f are contained in the ideal $\langle G \cup B \rangle$, thus $\mathrm{red}_{G \cup B}(f) = 0$ and the above representation simplifies to $f = h_1 g_1 + \dots + h_k g_k$. Knowing the cofactors $h_i$ and the corresponding elements of $G \cup B$ we generate proof rules in PAC in the following way. First we generate a multiplication proof rule for each product $h_i g_i$.

$$* : g_1,\ h_1,\ h_1 g_1; \quad \cdots \quad * : g_k,\ h_k,\ h_k g_k;$$


In the listed rules the result p is always depicted simply as the product $h_i g_i$, but in the actual PAC proof p is written in expanded (flattened) form. These products are now simply added together as follows:

$+ : h_1 g_1,\ h_2 g_2,\ h_1 g_1 + h_2 g_2;$
$+ : h_1 g_1 + h_2 g_2,\ h_3 g_3,\ h_1 g_1 + h_2 g_2 + h_3 g_3;$
$\vdots$
$+ : h_1 g_1 + \dots + h_{k-1} g_{k-1},\ h_k g_k,\ f;$
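The translation from cofactors to PAC rules described above can be sketched as follows. To keep the example short it uses univariate polynomials stored as coefficient tuples (lowest degree first), which is an assumption made purely for illustration; the actual flow works on multivariate polynomials and obtains the cofactors from Mathematica. The names `padd`, `pmul` and `rules_from_cofactors` are hypothetical.

```python
from itertools import zip_longest

def padd(p, q):
    """Add two univariate polynomials given as coefficient tuples."""
    s = [x + y for x, y in zip_longest(p, q, fillvalue=0)]
    while s and s[-1] == 0:
        s.pop()
    return tuple(s)

def pmul(p, q):
    """Multiply two univariate polynomials given as coefficient tuples."""
    if not p or not q:
        return ()
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    while out and out[-1] == 0:
        out.pop()
    return tuple(out)

def rules_from_cofactors(gs, hs):
    """Return (rules, f): k multiplication rules, then k-1 addition rules deriving f."""
    products = [pmul(g, h) for g, h in zip(gs, hs)]
    rules = [("*", g, h, p) for g, h, p in zip(gs, hs, products)]
    partial = products[0]
    for p in products[1:]:
        total = padd(partial, p)
        rules.append(("+", partial, p, total))   # extend the running sum by one product
        partial = total
    return rules, partial

# g1 = x^2 - x, g2 = x - 1 with cofactors h1 = 1, h2 = x, so f = 2x^2 - 2x.
gs = [(0, -1, 1), (-1, 1)]
hs = [(1,), (0, 1)]
rules, f = rules_from_cofactors(gs, hs)
for rule in rules:
    print(rule)
```

Every antecedent of an addition rule is either a product derived by an earlier multiplication rule or an earlier partial sum, so the generated sequence satisfies the connection property by construction.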

In the experiments we also use a non-incremental verification approach where we do not use the incremental optimizations presented in Sect. 4, hence we have to reduce the complete word-level specification of a multiplier by the (preprocessed) gate and field polynomials. Extracting a proof works in the same way as just described for the preprocessed polynomials.

Generating proofs for incremental verification is also similar, but instead of the word-level specification of the multiplier we have to use the incremental specifications $C_i = 2C_{i+1} + s_i - P_i$ of each slice, cf. Sect. 4. The polynomials $C_i$ describing the incoming carries of a slice can be derived by calculating $\mathrm{red}_{G \cup B}(2C_{i+1} + s_i - P_i) = C_i$. Since verification can be assumed to succeed we have $C_{2n} = 0$ and $C_0 = 0$. As described in the last bullet on fundamental facts in Sect. 2 we are able to obtain cofactors $h_1, \dots, h_k$ such that $2C_{i+1} + s_i - P_i - C_i = h_1 g_1 + \dots + h_k g_k$ and consequently a translation into the PAC-format to derive the left-hand side of the equation.

To derive the word-level specification of a multiplier from the incremental specifications we first multiply for each slice $S_i$ its incremental specification $C_i = 2C_{i+1} + s_i - P_i$ by the constant $2^i$.

$* : 2C_1 + s_0 - P_0,\ 1,\ 2C_1 + s_0 - P_0;$
$* : 2C_2 + s_1 - P_1 - C_1,\ 2,\ 4C_2 + 2s_1 - 2P_1 - 2C_1;$
$\vdots$
$* : s_{2n-1} - P_{2n-1} - C_{2n-1},\ 2^{2n-1},\ 2^{2n-1} s_{2n-1} - 2^{2n-1} P_{2n-1} - 2^{2n-1} C_{2n-1};$

Subsequent accumulation of the polynomials above using PAC addition rules cancels the terms $C_i$, and $\sum_{i=0}^{2n-1} 2^i s_i - \sum_{i=0}^{2n-1} 2^i P_i$ remains. The sum of partial products can be reordered to $\sum_{i=0}^{2n-1} 2^i P_i = \left(\sum_{i=0}^{n-1} 2^i a_i\right)\left(\sum_{i=0}^{n-1} 2^i b_i\right)$ [30], and thus we are able to deduce the word-level specification of multipliers.
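The cancellation of the carry terms is a telescoping sum. Written out, with the boundary values $C_{2n} = 0$ and $C_0 = 0$ stated earlier, the accumulation of the scaled slice polynomials reads as follows (a worked restatement of the step above, not an additional result):

```latex
\begin{align*}
\sum_{i=0}^{2n-1} 2^i \bigl(2C_{i+1} + s_i - P_i - C_i\bigr)
  &= \sum_{i=0}^{2n-1} 2^i (s_i - P_i)
     + \sum_{i=0}^{2n-1} \bigl(2^{i+1} C_{i+1} - 2^i C_i\bigr) \\
  &= \sum_{i=0}^{2n-1} 2^i s_i - \sum_{i=0}^{2n-1} 2^i P_i
     + 2^{2n} C_{2n} - C_0 \\
  &= \sum_{i=0}^{2n-1} 2^i s_i - \sum_{i=0}^{2n-1} 2^i P_i .
\end{align*}
```

Each intermediate carry $C_i$ with $0 < i < 2n$ occurs once with coefficient $+2^i$ (from slice $i-1$) and once with coefficient $-2^i$ (from slice $i$), which is why only the two boundary terms survive.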

In both approaches, the incremental as well as the non-incremental one, we multiply the word-level specification of the multiplier by the additional variable y and add it to the given polynomial $1 - y \cdot \mathit{spec} \in G \cup B$ to derive 1 and thus obtain a correct PAC refutation.

As Fig. 4 shows we have two different flows for checking PAC proofs independently from Mathematica, which was used for verification. The first one uses Python scripts to validate the connection property of each rule and whether the proof actually defines a refutation. With Singular we check the inference property of each proof line, which in essence uses Singular as a calculator for adding and multiplying polynomials.


[Figure 5 (diagrams not reproduced): for each of the two architectures, rows of half-adders (HA) and full-adders (FA) accumulate the partial products p00 ... p33 into the output bits s7 ... s0.]

Fig. 5: Architecture of “btor” (left) and “sparrc” (right), where pij = aibj [31]

We also provide a new dedicated proof checker called PacTrim, implemented from scratch in C. Its features are similar to those of DRAT-trim, which is the standard proof checker in the SAT community for clausal proofs (and is used in the SAT Competition, see also [16,18]). Our new PacTrim checker contains a parser for PAC proofs and checks the connection property using hash tables and the inference property using a dedicated stand-alone implementation of polynomial arithmetic over arbitrary precision integers represented as strings.

While the first approach is rather general and easy to adapt it is, as the experiments confirm, less robust (due to, for instance, the limit on variables in Singular) and, more importantly, far less efficient than our dedicated checker. The latter also allows producing proof cores (of both original polynomials and proof lines), and is also much closer to being certifiable.

6 Experiments

In our experiments we generate and validate PAC proofs for the (integer) multiplier benchmarks used in [30,31]. The “btor”-benchmarks are generated by Boolector [28] and the “sparrc”-multipliers are part of the bigger AOKI benchmark set [19], containing several multiplier architectures. In both multiplier architectures the partial products are generated as products of two input bits which are then accumulated by full- and half-adders, as shown in Fig. 5 for input size n = 4. In “btor”-multipliers the full- and half-adders are accumulated in a grid-like structure, thus they are considered as array multipliers, whereas in “sparrc”-multipliers full- and half-adders are accumulated diagonally.

In all our experiments we use a standard Ubuntu 16.04 Desktop machine with an Intel i7-2600 3.40 GHz CPU and 16 GB of main memory. The (wall-clock) time limit is 90 000 seconds and the main memory usage is limited to 7 GB. The time in our experiments is measured in seconds (wall-clock time). We mark unfinished experiments by TO (reached time limit), MO (reached memory limit) or by EE, when an error state is reached. An error state is reached by Singular, because it has a limit of 32767 on the number of ring variables. All experimental data, benchmarks and source code are available at http://fmv.jku.at/pac.

In Table 1 we separately list the time taken for verification, the generation as well as checking of PAC-proofs for “btor” and “sparrc” multipliers of different input bitwidth n. The third column lists configurations of AigMulToPoly. The default configuration uses incremental column-wise slicing of [30], cf. Sect. 4,



 n  mult    option   verify  verify+  chk I   con   inf  chk II  length  core     size  core  deg
 4  btor    inc           0        1      0     0     0       0     646   68%     3551   72%    6
 4  btor    inc-add       0        1      0     0     0       0     594   62%     4001   63%    5
 4  btor    noninc        0        1      0     0     0       0     638   68%     3862   74%    6

64  btor    inc         622    49638      4   586  4539    5125  259606   63%  2839901   81%    6
64  btor    inc-add     121    22378      4   414  4236    4650  216834   61%  2387831   74%    5
64  btor    noninc       MO       MO      -     -     -       -       -     -        -     -    -

 4  sparrc  inc           0        1      0     0     0       0     753   64%     4943   68%    6
 4  sparrc  inc-add       0        1      0     0     0       0     764   65%     8156   66%    8
 4  sparrc  noninc        0        1      0     0     0       0     745   65%     5252   71%    6

32  sparrc  inc         104     3582      1    43   132     175   73301   62%   735218   74%    6
32  sparrc  inc-add       8     2611      2    55   402     457   75244   63%  1492082   63%    8
32  sparrc  noninc      351       TO      -     -     -       -       -     -        -     -    -

64  sparrc  inc        1575       TO      -     -     -       -       -     -        -     -    -
64  sparrc  inc-add     133    80906     12  1307    EE      EE  309164   62%  6727026   65%    8
64  sparrc  noninc       MO        -      -     -     -       -       -     -        -     -    -

Table 1: Word-level proof checking (times in seconds; each core column gives the percentage of the preceding column that belongs to the proof core)

both with (inc-add) and without (inc) our new optimization of eliminating local variables in full- and half-adders [31]. In the third configuration (noninc) the whole word-level specification is reduced without any slicing of the multiplier.

The time needed for verification, proof generation and proof checking is listed in the following columns. The corresponding execution paths are marked in Fig. 4 by dashed rectangles. The column verify shows the time Mathematica needs to verify the multiplier, column verify+ shows the time needed to generate the proof including the time of verify, and in column chk I we measure the time our own proof checker PacTrim needs to validate the proof. The time Python needs to verify the connection property is listed in column con and the time Singular needs to verify the inference property is listed in column inf. The column chk II is the total time needed to verify the proof with Python and Singular. We did not include the time the tools AigToPoly, PacMultSpec and PacEqSpec need, because in the worst case it only takes a second for 64-bit multipliers.

Inspired by [27] we also compute and include the number of polynomials in a proof (length), the total number of monomials of the derived polynomials (size), counted with repetition, and the maximum total degree of any monomial (deg).


 n  mult           verify  verify+  chk I   con   inf  chk II  length  core     size  core  deg
 4  btor-btor           1        1      0     0     1       1    1170   59%     7952   61%    5
 8  btor-btor           1        6      0     0     1       1    5794   59%    43902   63%    5
16  btor-btor           2       75      1     5    10      14   25410   59%   210666   65%    5
32  btor-btor          27     1632      3    87   189     277  106114   59%   995330   69%    5
64  btor-btor         502    45155     15  1625    EE      EE  433410   59%  4942642   74%    5

 4  btor-sparrc         1        2      0     1     1       2    1340   61%    12107   64%    8
 8  btor-sparrc         1        9      1     1     2       3    6844   61%    81317   63%    8
16  btor-sparrc         3      148      1     7    42      48   30476   61%   424189   63%    8
32  btor-sparrc        28     3456      7   163   848    1011  128236   60%  1999501   64%    8

 4  sparrc-sparrc       1        2      0     0     0       1    1510   62%    16270   65%    8
 8  sparrc-sparrc       1       12      1     1     5       6    7894   62%   118820   63%    8
16  sparrc-sparrc       2      223      2     9    73      82   35542   61%   638248   62%    8
32  sparrc-sparrc      29     5363     11   308  1591    1899  150358   61%  3006256   63%    8

Table 2: Equivalence proof checking (times in seconds; each core column gives the percentage of the preceding column that belongs to the proof core)

Usually not all given polynomials in the data set $G \cup \{1 - yf\} \cup B$ are needed to derive a correct refutation; in particular only a small subset of B is used. Thus next to the length and size columns we list the percentage of polynomials and monomials which are actually necessary to derive a PAC refutation (core) w.r.t. the number of original and derived polynomials.

In general it can be seen that “sparrc”-multipliers need more time and space for verification, certification and proof checking than “btor”-multipliers. By far most of the time is needed for generating the proofs. For more scalable proof generation it is clear that computer algebra systems would need to be adapted to support proof generation on-the-fly, or even that application-specific algebraic reasoning engines would have to be implemented. Checking the proof with PacTrim takes a fraction of the time needed for verification, at most 12 seconds, even for 64-bit multipliers. Proof checking using an independent computer algebra system takes much longer: for 64-bit multipliers more than 4000 seconds.

In further experiments shown in Table 2 we construct proofs for the commutativity property of multipliers, i.e., we want to prove for a certain multiplier architecture that A ∗ B = B ∗ A holds. Among other things it was shown in [3] that polynomial-sized resolution proofs for the commutativity property of array and diagonal multipliers exist. Motivated by this result we generate proofs for these two multiplier architectures, where “btor”-multipliers play the role of array multipliers and “sparrc”-multipliers are considered as diagonal multipliers. We generate the commutativity miters by checking the equivalence of a multiplier and the same multiplier with input bit-vectors swapped (btor-btor, sparrc-sparrc). Furthermore we derive proofs for checking the equivalence of the two architectures “btor” vs. “sparrc” (btor-sparrc). The columns in Table 2 follow the same structure as in Table 1. In all commutativity and equivalence checking experiments we used the configuration “inc-add”, which uses our incremental column-wise slicing of [30] with the optimization of eliminating local variables in full- and half-adders. We did not include commutativity or equivalence checking experiments containing “sparrc” multipliers with bit-width n = 64, because we reached an error state (EE) in the experiments of Table 1.


[Figure 6 (plots not reproduced): two panels with x-axis “Bitwidth n” (10 to 60) and y-axes “Length of core proof” and “Size of core proof”.]

Fig. 6: Length and size of btor-btor commutativity check

In Fig. 6 data points depicting core size (left plot) and core length (right plot) of the “btor-btor”-commutativity proofs are shown for various input bitwidths n. The additional polynomial curves are fitted to the data points (using linear regression with R). For the proof length we used a parameterized model of a quadratic polynomial. The proof size required a cubic polynomial. In both cases the match is perfect, with absolute values of residuals less than $9 \cdot 10^{-10}$. This empirically suggested quadratic complexity of algebraic proofs compares favourably to the $O(n^7 \log n)$ upper bound for resolution proofs given in Thm. 2 of [3].
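The polynomial growth can be reproduced from the numbers printed in Table 2. The sketch below is not the R regression used for Fig. 6: it works on the total length and size columns of the “btor-btor” rows (the core values are only available as rounded percentages) and uses exact Lagrange interpolation with rational arithmetic instead of least squares. A quadratic is fixed by the first three lengths and a cubic by the first four sizes; the remaining rows then serve as a check. The function name `lagrange_eval` is chosen here for illustration.

```python
from fractions import Fraction

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

# "length" and "size" columns of the btor-btor rows in Table 2, indexed by n.
length = {4: 1170, 8: 5794, 16: 25410, 32: 106114, 64: 433410}
size = {4: 7952, 8: 43902, 16: 210666, 32: 995330, 64: 4942642}

quadratic = [(n, length[n]) for n in (4, 8, 16)]
cubic = [(n, size[n]) for n in (4, 8, 16, 32)]

for n in (32, 64):
    print("length", n, lagrange_eval(quadratic, n), length[n])
print("size", 64, lagrange_eval(cubic, 64), size[64])
```

For these rows the quadratic through n = 4, 8, 16 also passes exactly through the lengths at n = 32 and n = 64, and the cubic through the first four sizes passes exactly through the size at n = 64, which is consistent with the near-zero residuals reported above.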

Comparing the meta data of the “btor-btor” and “sparrc-sparrc”-benchmarks, the proof lengths of “sparrc-sparrc”-benchmarks are of the same magnitude as the proof lengths of “btor-btor”-benchmarks. The proof sizes of “sparrc-sparrc” are around three times as big as the proof sizes of “btor-btor”, with nearly the same percentages for the cores. Hence both measurements of “sparrc-sparrc”-benchmarks can also be depicted by quadratic and cubic curves.

7 Conclusion

This paper applies proof checking to algebraic reasoning, not only in theory, but also in practice, in order to validate verification techniques based on computer algebra. We show how the abstract polynomial calculus [12] can be instantiated to yield a practical proof format (PAC). Proofs in this format can be obtained as a by-product of verifying multiplier circuits using state-of-the-art techniques and can be checked with our new proof checker tool PacTrim in a fraction of the time needed for verification. Our experiments produce small polynomial proofs which certify the correctness of certain multipliers. The theoretical analysis in [3] gives much larger polynomial upper bounds (for clausal resolution proofs).

To explore the connection between PAC and clausal proof systems, such as RUP and DRAT [17], is an interesting subject for future work, as well as embedding PAC into more general systems, such as Isabelle [29].

We want to thank Thomas Sturm for pointing out the Rabinowitsch trick to the second author and Jakob Nordström for discussions on the polynomial calculus and Nullstellensatz proof systems. This work is supported by Austrian Science Fund (FWF), NFN S11408-N23 (RiSE), Y464-N18, SFB F5004.


References

1. E. Ábrahám, J. Abbott, B. Becker, A. M. Bigatti, M. Brain, B. Buchberger, A. Cimatti, J. H. Davenport, M. England, P. Fontaine, S. Forrest, A. Griggio, D. Kroening, W. M. Seiler, and T. Sturm. Satisfiability checking and symbolic computation. ACM Comm. Computer Algebra, 50(4):145–147, 2016.

2. P. Beame, R. Impagliazzo, J. Krajíček, T. Pitassi, and P. Pudlák. Lower bounds on Hilbert’s Nullstellensatz and propositional proofs. In Proceedings of the London Mathematical Society, pages 1–26, 1996.

3. P. Beame and V. Liew. Towards verifying nonlinear integer arithmetic. In CAV, volume 10427 of LNCS, pages 238–258. Springer, 2017.

4. T. Becker, V. Weispfenning, and H. Kredel. Gröbner Bases. Springer, 1993.

5. A. Biere. Collection of combinational arithmetic miters submitted to the SAT Competition 2016. In SAT Competition 2016, volume B-2016-1 of Department of Computer Science Series of Publications B, pages 65–66. Univ. Helsinki, 2016.

6. A. Biere, K. Heljanko, and S. Wieringa. AIGER 1.9 and beyond. Technical report, FMV Reports Series, Institute for Formal Models and Verification, Johannes Kepler University, Altenbergerstr. 69, 4040 Linz, Austria, 2011.

7. A. Biere, M. Kauers, and D. Ritirc. Challenges in verifying arithmetic circuits using computer algebra. In SYNASC, 2017, in press.

8. B. Buchberger. Ein Algorithmus zum Auffinden der Basiselemente des Restklassenringes nach einem nulldimensionalen Polynomideal. PhD thesis, 1965.

9. B. Buchberger and M. Kauers. Gröbner basis. Scholarpedia, 5(10):7763, 2010. http://www.scholarpedia.org/article/Groebner_basis.

10. J. Buresh-Oppenheim, M. Clegg, R. Impagliazzo, and T. Pitassi. Homogenization and the polynomial calculus. Computational Complexity, 11(3-4):91–108, 2002.

11. M. Ciesielski, C. Yu, W. Brown, D. Liu, and A. Rossi. Verification of gate-level arithmetic circuits by function extraction. In DAC, pages 52:1–52:6. ACM, 2015.

12. M. Clegg, J. Edmonds, and R. Impagliazzo. Using the Groebner basis algorithm to find proofs of unsatisfiability. In STOC, pages 174–183. ACM, 1996.

13. D. Cox, J. Little, and D. O’Shea. Ideals, Varieties, and Algorithms. Springer, 1997.

14. W. Decker, G.-M. Greuel, G. Pfister, and H. Schönemann. Singular 4-1-0. http://www.singular.uni-kl.de, 2016.

15. E. I. Goldberg and Y. Novikov. Verification of proofs of unsatisfiability for CNF formulas. In DATE, pages 10886–10891. IEEE Computer Society, 2003.

16. M. J. H. Heule. Schur number five. CoRR, abs/1711.08076, 2017.

17. M. J. H. Heule and A. Biere. Proofs for satisfiability problems. In All about Proofs, Proofs for All, volume 55, pages 1–22, 2015.

18. M. J. H. Heule, O. Kullmann, and V. W. Marek. Solving and verifying the Boolean Pythagorean triples problem via cube-and-conquer. In SAT, volume 9710 of LNCS, pages 228–245. Springer, 2016.

19. N. Homma, Y. Watanabe, T. Aoki, and T. Higuchi. Formal design of arithmetic circuits based on arithmetic description language. IEICE Transactions, 89-A(12):3500–3509, 2006.

20. R. Impagliazzo, P. Pudlák, and J. Sgall. Lower bounds for the polynomial calculus and the Gröbner basis algorithm. Computational Complexity, 8(2):127–144, 1999.

21. D. Kapur. Geometry theorem proving using Hilbert’s Nullstellensatz. In SYMSAC, pages 202–208. ACM, 1986.

22. D. Kapur. Using Gröbner bases to reason about geometry problems. J. Symb. Comput., 2(4):399–408, 1986.


23. D. Kapur and P. Narendran. An equational approach to theorem proving in first-order predicate calculus. In IJCAI, pages 1146–1153. Morgan Kaufmann, 1985.

24. A. Kuehlmann, V. Paruthi, F. Krohm, and M. Ganai. Robust Boolean reasoning for equivalence checking and functional property verification. IEEE TCAD, 21(12):1377–1394, 2002.

25. M. Lauria and J. Nordström. Graph Colouring is Hard for Algorithms Based on Hilbert’s Nullstellensatz and Gröbner Bases. In R. O’Donnell, editor, CCC, volume 79 of LIPIcs, pages 2:1–2:20. Schloss Dagstuhl, 2017.

26. J. Lv, P. Kalla, and F. Enescu. Efficient Gröbner basis reductions for formal verification of Galois field arithmetic circuits. IEEE TCAD, 32(9):1409–1420, 2013.

27. M. Mikša and J. Nordström. A generalized method for proving polynomial calculus degree lower bounds. In CCC, volume 33 of LIPIcs. Schloss Dagstuhl, 2015.

28. A. Niemetz, M. Preiner, and A. Biere. Boolector 2.0 system description. JSAT, 9:53–58, 2014 (published 2015).

29. T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL - A Proof Assistant for Higher-Order Logic, volume 2283 of LNCS. Springer, 2002.

30. D. Ritirc, A. Biere, and M. Kauers. Column-wise verification of multipliers using computer algebra. In FMCAD, pages 23–30. IEEE, 2017.

31. D. Ritirc, A. Biere, and M. Kauers. Improving and extending the algebraic approach for verifying bit-level multipliers. In DATE, 2018, in press.

32. A. Sayed-Ahmed, D. Große, U. Kühne, M. Soeken, and R. Drechsler. Formal verification of integer multipliers by combining Gröbner basis with logic reduction. In DATE, pages 1048–1053. IEEE, 2016.

33. A. Sayed-Ahmed, D. Große, M. Soeken, and R. Drechsler. Equivalence checking using Gröbner bases. In FMCAD, pages 169–176. IEEE, 2016.

34. A. Tiwari. An Algebraic Approach for the Unsatisfiability of Nonlinear Constraints, pages 248–262. Springer Berlin Heidelberg, Berlin, Heidelberg, 2005.

35. Wolfram Research, Inc. Mathematica, 2016. Version 10.4.

36. C. Yu, W. Brown, D. Liu, A. Rossi, and M. Ciesielski. Formal verification of arithmetic circuits by function extraction. IEEE TCAD, 35(12):2131–2142, 2016.

37. L. Zhang and S. Malik. Validating SAT solvers using an independent resolution-based checker: Practical implementations and other applications. In DATE, 2003.

38. E. Zulkoski, C. Bright, A. Heinle, I. S. Kotsireas, K. Czarnecki, and V. Ganesh. Combining SAT solvers with computer algebra systems to verify combinatorial conjectures. J. Autom. Reasoning, 58(3):313–339, 2017.

Unknot Recognition Through Quantifier Elimination

Syed M. Meesum1[0000−0002−1771−403X] and T. V. H. Prathamesh2[0000−0003−3842−3626]

1 Institute of Computer Science, University of Wrocław, Poland
2 Department of Computer Science, University of Innsbruck, Austria

meesum.syed@gmail, [email protected]

Abstract. Unknot recognition is one of the fundamental questions in low dimensional topology. In this work, we show that this problem can be encoded as a validity problem in the existential fragment of the first-order theory of real closed fields. This encoding is derived using a well-known result on SU(2) representations of knot groups by Kronheimer-Mrowka. We further show that applying existential quantifier elimination to the encoding enables an UnKnot Recognition algorithm with a complexity of the order $2^{O(n)}$, where n is the number of crossings in the given knot diagram. Our algorithm is simple to describe and has the same runtime as the currently best known unknot recognition algorithms. This leads to an interesting class of problems, of interest to both SMT solving and knot theory.

Keywords: real algebraic geometry, knot theory, algorithms, symbolic computation, SMT solving.

1 Introduction

In mathematics, a knot refers to an entangled loop. The fundamental problem in the study of knots is the question of knot recognition: can two given knots be transformed to each other without involving any cutting and pasting? This problem was shown to be decidable by Haken in 1962 [7] using the theory of normal surfaces. We study the special case in which we ask if it is possible to untangle a given knot to an unknot. An UnKnot Recognition algorithm takes a knot presentation as an input and answers Yes if and only if the given knot can be untangled to an unknot. The best known complexity of UnKnot Recognition is $2^{O(n)}$, where n is the number of crossings in a knot diagram [2,7].

More recent developments show that UnKnot Recognition is in NP ∩ co-NP. Using the theory of normal surfaces, Hass, Lagarias and Pippenger [8] proved the existence of an NP membership certificate for UnKnot Recognition. A notion which is closer to the commonly accepted notion of untangling a knot is that of using Reidemeister moves. Lackenby [12] proved that there exists a sequence of Reidemeister moves of polynomial length $O(n^{11})$ that untangles an unknot. Searching over all possible Reidemeister moves gives a simple-to-describe algorithm with a runtime of $2^{O(n^{11})}$. According to Lackenby [12], a proof sketch for co-NP membership of UnKnot Recognition was first announced by Agol, but not written down in detail. Assuming the Generalized Riemann Hypothesis, Kuperberg [11] proved that a polynomial-time certificate for non-membership of a knot in UnKnot Recognition exists. Finally, an unconditional proof for the membership of UnKnot Recognition in co-NP was given by Lackenby [13].

Several approaches to unknot recognition can be found in the literature, such as complete knot invariants like Khovanov homology, branching algorithms [2] involving normal surface theory, manifold hierarchies [13], Dynnikov diagrams [4], equational reasoning [5], and automated reasoning [15].

Most of the known algorithms deciding UnKnot Recognition are complex and have an involved analysis. There are few implementations of these algorithms. One of the implementations is known to show polynomial-time behaviour when recognizing an unknot, but behaves exponentially when detecting that a given diagram represents a nontrivial knot [2].

The authors believe that this paper presents a simpler alternative algorithm, which relies on reducing the above problem to a sentence in the existential theory of reals [17]. This enables application of the decision procedure for the existential theory of reals using quantifier elimination, to obtain an algorithm which is exponential in complexity, and thus of the same complexity class as the best known approaches. The algorithm presented in this paper has not yet been implemented.

Acknowledgments: The authors would like to thank the Institute of Mathematical Sciences, HBNI, Chennai, India, where a part of the work was carried out. The first author is supported by the NCN grant number 2015/18/E/ST6/00456. The second author is supported by the FWF project number P30301.

2 Preliminaries

This section contains some of the basic definitions in knot theory, and a brief note on quantifier elimination and the existential theory of reals, without explicitly stating the algorithm. For a more detailed introduction to knot theory one may refer to [3,9,14,18], and for quantifier elimination in the existential theory of reals, one may refer to [1,6].

For a natural number $n$, we use $[n]$ to denote the set $\{1, 2, \ldots, n\}$. We use SU(2) to denote the group of $2 \times 2$ complex unitary matrices with unit determinant, with multiplication as the group operation. For a natural number $d$, we use $S^d$ to denote the subset of $\mathbb{R}^{d+1}$ consisting of the points with Euclidean norm equal to one. The symbol $i$ denotes $\sqrt{-1}$, the imaginary unit. The symbol $\wedge$ is used to denote the operation of logical conjunction. The symbol $\vee$ is used to denote the operation of logical disjunction.

Unknot Recognition Through Quantifier Elimination 79

2.1 Knot Theory

Basic Definitions

Definition 1. A (tame) knot $K$ is the image of a smooth injective map from $S^1$ to $S^3$.

Remark 1. $S^3$ in the above definition can be replaced by $\mathbb{R}^3$. But we use $S^3$, because some of the concepts that we introduce, such as knot groups, exist only in the context of the embedding of a circle in $S^3$.

Two knots are considered to be the same if they are related by an equivalence condition called ambient isotopy. It is defined as follows:

Definition 2 (Ambient Isotopy). The knots $K_1$ and $K_2$ are ambient isotopic if there exists a smooth map $F : S^3 \times [0, 1] \to S^3$ such that:

– $\forall x \in S^3.\ F(x, 0) = x$.
– $F(K_1, 1) = K_2$.
– $\forall t \in [0, 1]$, the map $x \mapsto F(x, t)$ is a homeomorphism of $S^3$.

Ambient isotopy describes when a knot can be transformed into another by a deformation that does not involve any cutting and pasting. To draw a knot on paper, we use the convention that wherever the string is shown broken it is assumed to be passing under the unbroken string. To illustrate the above condition, consider the following knots:

[Figure: three knot diagrams. 1) Unknot, 2) A Twist, 3) Trefoil Knot]

The first two knot diagrams shown above can be deformed into each other by twisting/untwisting, thus they represent the same knot. Deforming either of the first two knots into the third knot would involve cutting and pasting. Thus it is different from the former knots.

Definition 3. An unknot is a knot which is ambient isotopic to the circle $S^1$. A knot $K$ is knotted if and only if it is not an unknot.

Determining when two diagrams represent the same knot is the key problem of knot theory. The special case of determining when a given knot diagram is equivalent to the unknot is called the unknot recognition problem. Several algorithms and approaches to unknot recognition were listed in the previous section.


Knot Group. One of the known invariants of knots is the fundamental group of the knot complement. The knot complement is the compact 3-manifold obtained by taking the complement of a tubular neighbourhood of the knot. This invariant can detect knots up to mirror image. A presentation of this group, called the Wirtinger presentation, can be easily computed from a knot diagram in the following manner:

– The knot diagram is oriented in one of the two possible directions. The string constituting the knot is given a direction, which fixes the direction of all the arcs occurring in the knot diagram.
– Every connected arc is associated with a distinct generator.
– Every crossing gives rise to a relation in the presentation. The relation depends on the orientation of the arcs in the crossing, as shown in Figure 1.

Computing the Wirtinger presentation of the knot group from the diagram can be achieved using the steps described above in time which is a linear function of the number of crossings in the given knot diagram.
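The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input encoding (a list whose k-th entry `(j, sign)` names the over-crossing arc and the crossing type at the crossing where arc k ends and arc k+1 begins), the function names, and the choice of which crossing type is called positive are all hypothetical and not taken from the paper.

```python
def wirtinger_relators(crossings):
    """Build one relator per crossing in a single pass (linear time).

    crossings[k-1] = (j, sign): at the crossing where arc k ends and arc k+1
    begins, arc j passes over. The encoding and the sign convention are
    assumptions made for this sketch.
    A relator is a list of (generator index, exponent) pairs.
    """
    n = len(crossings)
    relators = []
    for k, (j, sign) in enumerate(crossings, start=1):
        nxt = k % n + 1  # arc k+1, with indices read cyclically
        if sign > 0:
            word = [(j, 1), (k, 1), (j, -1), (nxt, -1)]   # g_j g_k g_j^-1 g_{k+1}^-1
        else:
            word = [(j, -1), (k, 1), (j, 1), (nxt, -1)]   # g_j^-1 g_k g_j g_{k+1}^-1
        relators.append(word)
    return relators


def show(word):
    """Render a relator as a readable string."""
    return " ".join(f"g{i}" if e == 1 else f"g{i}^-1" for i, e in word)


# A three-crossing diagram of the trefoil: at each crossing the third arc passes over.
trefoil = [(3, +1), (1, +1), (2, +1)]
for relator in wirtinger_relators(trefoil):
    print(show(relator))
```

Each crossing is visited once and contributes a relator of constant length, which is where the linear running time comes from.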

SU(2) representations of the knot group. The following theorem by Kronheimer and Mrowka translates unknot recognition into the existence of non-commutative SU(2) representations of the knot group.

Proposition 1 ([10], [11]). If $K$ is knotted, then the knot group $\pi_1(S^3 \setminus K)$ has a non-commutative SU(2) representation.

The following lemma is derived from the theorem above. The reverse direction of the lemma follows from the fact that the knot group of the unknot is $\mathbb{Z}$, and all its SU(2) representations are commutative.

Lemma 1. A knot $K$ is knotted if and only if there exists a non-commutative SU(2) representation of the knot group $\pi_1(S^3 \setminus K)$.

We note that every finitely presented group has a trivial homomorphism to the group SU(2) via a mapping of each generator to the identity matrix.
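As an illustration that is not part of the original text, consider the trefoil, whose knot group has the well-known presentation $\langle x, y \mid xyx = yxy \rangle$. The matrices below are one possible choice of a non-commutative SU(2) representation; both have unit determinant and are unitary, they satisfy the defining relation, and they do not commute.

```latex
% Illustrative example (assumed, not from the paper): a non-commutative
% SU(2) representation of the trefoil group <x, y | xyx = yxy>.
\[
X = \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix},
\qquad
Y = \begin{bmatrix} -\tfrac{i}{2} & \tfrac{\sqrt{3}}{2} \\[2pt]
                    -\tfrac{\sqrt{3}}{2} & \tfrac{i}{2} \end{bmatrix},
\]
\[
XYX \;=\; YXY \;=\;
\begin{bmatrix} \tfrac{i}{2} & \tfrac{\sqrt{3}}{2} \\[2pt]
                -\tfrac{\sqrt{3}}{2} & -\tfrac{i}{2} \end{bmatrix},
\qquad
XY - YX \;=\;
\begin{bmatrix} 0 & i\sqrt{3} \\ i\sqrt{3} & 0 \end{bmatrix} \;\neq\; 0 .
\]
```

By Lemma 1, the existence of such a pair certifies that the trefoil is knotted.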

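[Figure: two oriented crossings, each with over-crossing arc $x_j$ and under-crossing arcs $x_k$ and $x_{k+1}$, labelled with the relations $x_j x_k x_j^{-1} = x_{k+1}$ and $x_j^{-1} x_k x_j = x_{k+1}$ respectively]

Fig. 1. Wirtinger presentation relations for the knot group.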


2.2 Quantifier Elimination in Existential Theory of Reals

Decidability of the first-order existential theory of reals refers to the existence of a decision procedure for the validity of all sentences of the form:

$$\exists X.\ F(X),$$

where $F$ is any quantifier-free formula of polynomial equalities and inequalities in real variables $X$. It follows from the Tarski-Seidenberg theorem that the problem above is decidable by a quantifier elimination algorithm. In fact, quantifier elimination can be used to decide the validity of all first-order sentences over the reals. A quantifier elimination algorithm computes a quantifier-free sentence which is equivalent to the given sentence with quantifiers. The validity of quantifier-free sentences can be computed, which makes the algorithm a decision procedure for the first-order theory. A quantifier elimination algorithm for the existential fragment is restricted to finding equivalent quantifier-free sentences only for first-order sentences with existential quantifiers, of the form described above.
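Two textbook examples, added here for illustration only, may help fix the idea. The first eliminates a quantifier in the presence of free parameters; the second is a sentence in the existential fragment, which quantifier elimination reduces to a truth value.

```latex
% Quantifier elimination with free parameters b and c:
\[
\exists x.\; x^2 + b x + c = 0
\quad\Longleftrightarrow\quad
b^2 - 4c \ge 0 .
\]
% A sentence (no free variables) reduces to true or false:
\[
\exists x\, \exists y.\; \bigl(x^2 + y^2 = 1 \;\wedge\; x y > 1\bigr)
\quad\text{is false, since } xy \le \tfrac{1}{2}\,(x^2 + y^2) = \tfrac{1}{2}.
\]
```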

The known complexity bounds for quantifier elimination in the general first-order theory of reals are doubly exponential. The existential fragment has a much lower complexity bound and there are several algorithms for it. Our analysis will be based on the following result:

Proposition 2 (see Proposition 4.2 in [17]). Given a set $P$ of $\ell$ constraints, each of which is a polynomial equality or inequality of degree at most $d$ in $k$ variables with integer coefficients of bit length $L$, we can decide the feasibility of $P$ with $L \log L \log\log L \, (\ell \cdot d)^{O(k)}$ bit operations.

3 Algorithm

The algorithm Unknot-QE appears as Algorithm 1 below.

Remark 2. The algorithm can be simplified, leading to improvements in efficiency within the same complexity class, but our choice of description is motivated by expository concerns.

The key idea behind the algorithm can be stated in terms of the following theorem, which will be proved in the next section.

Theorem 1. There exists a computable map $\Phi$ which takes a knot diagram $K$ to a sentence in the existential fragment of the first-order theory of reals. A knot $K$ is knotted if and only if $\Phi(K)$ is valid. Moreover, applying an existential quantifier elimination algorithm to $\Phi(K)$ leads to a decision method for UnKnot Recognition.


Algorithm: Unknot-QE
Input: A knot group $\pi_1(S^3 \setminus K) = \langle g_1, g_2, \ldots, g_n \mid R_1, R_2, \ldots, R_n \rangle$
Output: Yes if $K$ is an unknot, otherwise No

begin
    $E \leftarrow \emptyset$
    $P \leftarrow \emptyset$
    for $k \leftarrow 1$ to $n$ do
        $M_k \leftarrow \begin{bmatrix} a_k + i b_k & c_k + i d_k \\ -c_k + i d_k & a_k - i b_k \end{bmatrix}$
    end for
    for $k \leftarrow 1$ to $n$ do
        if $R_k = g_j g_k g_j^{-1} g_{k+1}^{-1}$ then
            $E_k \leftarrow M_{k+1} M_j - M_j M_k$
        end if
        if $R_k = g_j^{-1} g_k g_j g_{k+1}^{-1}$ then
            $E_k \leftarrow M_k M_j - M_j M_{k+1}$
        end if
        /* the real parts of the entries of the first row of $E_k$ */
        $E_k^{\mathrm{Re}} \leftarrow \mathrm{Re}_U(E_k)$
        /* the imaginary parts of the entries of the first row of $E_k$ */
        $E_k^{\mathrm{Im}} \leftarrow \mathrm{Im}_U(E_k)$
        Put all the polynomials in $E_k^{\mathrm{Re}}$ and $E_k^{\mathrm{Im}}$ in the set $P$
        Put $a_k^2 + b_k^2 + c_k^2 + d_k^2 - 1$ in $P$
    end for
    Put the equation $\sum_{p \in P} p^2 = 0$ in $E$
    $N \leftarrow \emptyset$
    for $k \leftarrow 2$ to $n$ do
        Put $a_k - a_1$, $b_k - b_1$, $c_k - c_1$ and $d_k - d_1$ in $N$
    end for
    Put the inequality $\sum_{p \in N} p^2 \neq 0$ in $E$
    /* $E$ is satisfiable exactly when a non-commutative SU(2) representation exists, i.e. when $K$ is knotted */
    if $E$ is satisfiable then
        return No
    else
        return Yes
    end if
end

Algorithm 1: Description of the algorithm for Unknot detection.
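To make the construction concrete, the following Python sketch evaluates the two sums of squares built by the algorithm at a given real assignment. It does not decide satisfiability (that is the job of a decision procedure for the existential theory of reals); it only checks a candidate witness numerically. The crossing encoding `(j, sign)`, the function names, and the trefoil witness are assumptions introduced for this illustration.

```python
import math


def su2(a, b, c, d):
    """The matrix [[a+ib, c+id], [-c+id, a-ib]] as nested tuples."""
    return ((complex(a, b), complex(c, d)),
            (complex(-c, d), complex(a, -b)))


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(x[r][t] * y[t][c] for t in range(2))
                       for c in range(2)) for r in range(2))


def evaluate_E(crossings, values):
    """Return the left-hand sides of the equality and the inequality in E.

    crossings[k-1] = (j, sign) selects the relation form for R_k (assumed
    encoding); values[k-1] = (a_k, b_k, c_k, d_k).
    """
    n = len(crossings)
    mats = [su2(*v) for v in values]
    P = []
    for k, (j, sign) in enumerate(crossings, start=1):
        Mk, Mj, Mnext = mats[k - 1], mats[j - 1], mats[k % n]  # M_{k+1}, cyclic
        if sign > 0:   # R_k = g_j g_k g_j^-1 g_{k+1}^-1
            left, right = matmul(Mnext, Mj), matmul(Mj, Mk)
        else:          # R_k = g_j^-1 g_k g_j g_{k+1}^-1
            left, right = matmul(Mk, Mj), matmul(Mj, Mnext)
        for col in range(2):  # top row of E_k, real and imaginary parts
            entry = left[0][col] - right[0][col]
            P += [entry.real, entry.imag]
        a, b, c, d = values[k - 1]
        P.append(a * a + b * b + c * c + d * d - 1)
    first = values[0]
    N = [v[i] - first[i] for v in values[1:] for i in range(4)]
    return sum(p * p for p in P), sum(q * q for q in N)


# Trefoil diagram with three arcs; candidate witness: three trace-zero
# matrices whose (b, c) coordinates sit at angles 0, 120 and 240 degrees.
trefoil = [(3, +1), (1, +1), (2, +1)]
witness = [(0.0, math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3), 0.0)
           for k in range(3)]
equality_lhs, inequality_lhs = evaluate_E(trefoil, witness)
print(equality_lhs, inequality_lhs)
```

For this witness the first value vanishes up to rounding and the second is clearly nonzero, so the assignment satisfies $E$ and certifies that the trefoil is knotted. Mapping every generator to the identity, `(1, 0, 0, 0)`, makes both values zero and therefore does not satisfy $E$.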


4 Proof of the Algorithm

In the proof, we reduce the Kronheimer-Mrowka property, stated in Section 2.1, to a first-order sentence in the existential theory of reals. Observe that every knot group has Wirtinger presentations which correspond to knot diagrams. These presentations are of the following form:

$$\langle g_1, g_2, \ldots, g_n \mid R_1, R_2, \ldots, R_n \rangle.$$

For $i \in [n]$, the symbol $g_i$ denotes a generator of the group and $R_i$ denotes a relation satisfied by the generators. In the Wirtinger presentation, each $R_k$ is either $g_j g_k g_j^{-1} g_{k+1}^{-1}$ or $g_j^{-1} g_k g_j g_{k+1}^{-1}$, where $j \in [n]$ depends on $k$; we use $+(R_k)$ and $-(R_k)$ to denote these two forms respectively. Indices are read cyclically, so that $g_{n+1}$ stands for $g_1$.

Finding a non-commutative SU(2) representation of $\pi_1(S^3 \setminus K)$, if it exists, can be seen as a conjunction of the following steps:

1. Mapping generators of the Wirtinger presentation to matrices in SU(2).
2. Checking that the above map extends to a well-defined homomorphism, i.e. the matrices corresponding to generators satisfy the defining relations of the Wirtinger presentation.
3. Checking that the map is non-commutative.

In the following paragraphs, we elaborate on and construct equivalent conditions for each of the above steps. Let $g_k$ be a generator in the Wirtinger presentation associated to a knot diagram. Consider a map $\Phi$ from the set of generators to the set of $2 \times 2$ complex matrices, in which we map $g_k$ to $M_k$:

$$M_k = \begin{bmatrix} a_k + i b_k & c_k + i d_k \\ -c_k + i d_k & a_k - i b_k \end{bmatrix} \qquad (1)$$

Here $a_k, b_k, c_k, d_k$ are real variables. For $M_k$ to be an element of SU(2), it must be unitary (i.e. the inverse of $M_k$ is equal to the transpose of its complex conjugate) and it must have unit determinant, which gives us the following extra condition on the variables used to define it.

Observation 2 (folklore) $M_k \in \mathrm{SU}(2)$ if and only if $a_k^2 + b_k^2 + c_k^2 + d_k^2 = 1$.

In addition, the matrices $M_k$ have to satisfy the knot group relations obtained from the Wirtinger presentation, i.e.

$$\left(+(R_k) \rightarrow M_j M_k M_j^{-1} M_{k+1}^{-1} = I\right) \wedge \left(-(R_k) \rightarrow M_j^{-1} M_k M_j M_{k+1}^{-1} = I\right) \qquad (2)$$

where $I$ is the $2 \times 2$ identity matrix. For $k \in [n]$, we define $E_k$ as follows:

$$E_k = \begin{cases} M_{k+1} M_j - M_j M_k & \text{if } +(R_k) \\ M_k M_j - M_j M_{k+1} & \text{if } -(R_k) \end{cases}$$

The condition on matrices in Equation (2) can be restated in terms of $E_k$ as follows:


Observation 3 For $M_k \in \mathrm{SU}(2)$, $k \in [n]$, a knot group representation must satisfy $E_k = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$.

The above observation meets the goal of step (2). The above matrix equality can be rewritten as a system of four quadratic equations in real variables in the following fashion:

– Decompose the matrix $E_k$ into real and imaginary parts, $\mathrm{Re}(E_k)$ and $\mathrm{Im}(E_k)$: then $E_k = 0$ if and only if $\mathrm{Re}(E_k) = 0$ and $\mathrm{Im}(E_k) = 0$.
– Define $\mathrm{Re}_U(E_k)$ and $\mathrm{Im}_U(E_k)$ to be the sets of polynomial equalities $p(x) = 0$, where $p(x)$ is an entry in the top row of $\mathrm{Re}(E_k)$ and $\mathrm{Im}(E_k)$ respectively. We similarly define $\mathrm{Re}_D(E_k)$ and $\mathrm{Im}_D(E_k)$ for the bottom row. Either by simplifying $E_k$, or by noticing that the matrices $M_k$ form a group so that their products must also be of the same form as Equation (1), one can observe that, up to the signs of the polynomials,

$$\mathrm{Re}_U(E_k) \cup \mathrm{Im}_U(E_k) = \mathrm{Re}_D(E_k) \cup \mathrm{Im}_D(E_k).$$

Consider the set $P$ consisting of all the polynomial equalities in $\mathrm{Re}_U(E_k)$ and $\mathrm{Im}_U(E_k)$, together with $a_k^2 + b_k^2 + c_k^2 + d_k^2 - 1 = 0$, where $k \in [n]$. The following lemma allows us to decrease the number of equalities we have in the system of equations.

Lemma 2 (Reverse Rabinowitsch Encoding [16]). Let $P = \{p_1 = 0, \ldots, p_m = 0\}$ be the system of $m$ equality constraints, as defined above. Then $p_1 = 0 \wedge p_2 = 0 \wedge \cdots \wedge p_m = 0$ is satisfiable if and only if $\sum_{i \in [m]} p_i^2 = 0$ is satisfiable.

The above equation gives an equivalent condition for checking the existence of an SU(2) representation of a knot group. We need to further ensure that the representation is non-commutative. In general, to check that the generators are non-commutative, we would have to check that at least one of the pairs of generators does not commute. However, the special structure of knot group relations allows for a much simpler encoding into polynomial inequalities. In the following lemma we show that finding a non-commutative SU(2) representation is equivalent to finding a representation which maps at least two distinct generators of the Wirtinger presentation to distinct elements of SU(2).

Lemma 3. Let $\rho$ be a homomorphism from a knot group $\pi_1(S^3 \setminus K)$, with Wirtinger generators $g_i$, to a group. Then $\rho$ is non-commutative if and only if $\rho(g_i) \neq \rho(g_j)$ for some $i \neq j$.

Proof. For the forward direction, observe that if the images of the generators are all equal then the image of $\rho$ is generated by a single element, so $\rho$ is commutative. For the backward direction, assume that the image set of $\{g_i\}_{1 \le i \le n}$ has at least two distinct elements. Then there must exist an index $k \in [n]$ such that $\rho(g_k) \neq \rho(g_{k+1})$. Without loss of generality assume that the relation $R_k$ has the form $+(R_k)$; similar steps apply to the $-(R_k)$ form of the relations. Since $\rho(R_k) = I$, we have

$$\rho(g_j)\,\rho(g_k)\,\rho(g_j)^{-1}\,\rho(g_{k+1})^{-1} = I.$$

As $\rho(g_k) \neq \rho(g_{k+1})$, it must be the case that

$$\rho(g_j)\,\rho(g_k)\,\rho(g_j)^{-1} \neq \rho(g_k) \implies \rho(g_j)\,\rho(g_k) \neq \rho(g_k)\,\rho(g_j).$$

Therefore $\rho$ is non-commutative.

If $\rho$ is the SU(2) representation, then it suffices to check that there exist at least two distinct matrices in the image to obtain the existence of a non-commutative representation, in addition to the earlier mentioned constraints. The following series of observations further simplify the criterion:

Observation 4 Consider the matrices $M_j$ and $M_k$, as defined above, where $j, k \in [n]$.

$$(M_j \neq M_k) \leftrightarrow (a_j \neq a_k \vee b_j \neq b_k \vee c_j \neq c_k \vee d_j \neq d_k)$$

Observation 5 Let $r_1, \ldots, r_m$ be real numbers. There exist indices $j, k \in [m]$ such that $r_j \neq r_k$ if and only if $\bigvee_{\ell=2}^{m} (r_1 \neq r_\ell)$ is true.

The following lemma allows us to convert the system of inequalities encoding the constraint of non-commutativity into just one equivalent inequality.

Lemma 4. Let $N = \{p_1 \neq 0, \ldots, p_m \neq 0\}$ be a system of $m$ inequality constraints. Then $p_1 \neq 0 \vee p_2 \neq 0 \vee \cdots \vee p_m \neq 0$ is satisfiable if and only if $\sum_{i \in [m]} p_i^2 \neq 0$ is satisfiable.

Proof. The lemma follows from the negation of the statement of Lemma 2.

Combining Lemmas 3, 4 and Observations 4, 5, we get that it suffices to add the following inequality to check non-commutativity:

$$\sum_{i=1}^{n} \left[ (a_i - a_1)^2 + (b_i - b_1)^2 + (c_i - c_1)^2 + (d_i - d_1)^2 \right] \neq 0$$

Let $E$ be the set consisting of the above inequality and the equation in Lemma 2. It is easy to see from Lemma 2, Observations 4 and 5 and Lemma 4 that there exists a non-commutative representation from the knot group to SU(2) if and only if $E$ has a solution. This completes the proof of Theorem 1.


5 Complexity Analysis

The algorithm consists of first computing the Wirtinger presentation from the input knot diagram, which can be done in linear time. The formula $E$ can be constructed in polynomial time. Next, we analyse the complexity of deciding the feasibility of the constructed existential formula. If the number of crossings in the provided knot diagram is $n$ then the number of real variables in the system of equations is $4n$. The system $E$ consists of exactly two statements, one equality and one inequality, in which the maximum total degree of any monomial is 4. Finally, note that the coefficients of polynomials occurring in our system of equations are from the set $\{-2, -1, 1, 2\}$, as the coefficients of the polynomials before squaring are from the set $\{-1, 1\}$. Using Proposition 2, we get the following result.

Theorem 6. The procedure Unknot-QE solves the problem UnKnot Recognition in time $2^{O(n)}$, where $n$ is the number of crossings in the given knot diagram.
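The bound can be read off by substituting the parameters from the analysis above into Proposition 2. The short calculation below is an added worked step that takes the stated parameter values at face value.

```latex
% Parameters of the system E as stated in the analysis:
%   \ell = 2 constraints, degree d = 4, k = 4n variables, bit length L = O(1).
\[
L \log L \log\log L \,(\ell \cdot d)^{O(k)}
  \;=\; O(1)\cdot (2\cdot 4)^{O(4n)}
  \;=\; 2^{\,3\cdot O(4n)}
  \;=\; 2^{O(n)} .
\]
```

The linear-time computation of the Wirtinger presentation and the polynomial-time construction of $E$ are absorbed into this bound.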

6 Conclusion

In this article, we presented an algorithm for UnKnot Recognition, a proof of correctness, and an analysis of its complexity. The key advantage of this algorithm over the existing algorithms is the simplicity of its description, while having the same runtime complexity as the other currently best algorithms. The simplicity of this algorithm is largely from an expository perspective, if the existential quantifier elimination is treated as a black box. As an open problem, it would be of interest to probe whether one can reduce the runtime complexity further by using a variant of the algorithm presented. It may be possible to do so by decreasing the number of variables in the equation via substitution methods. Practical aspects of this algorithm also need to be explored further; for instance, an implementation using existing tools such as SMT solvers and Cylindrical Algebraic Decomposition would be useful. A more 'efficient' or implementable algorithm for quantifier elimination in the existential theory of reals would further aid both the theoretical and practical aspects of unknot recognition.

References

1. Saugata Basu, Richard Pollack, and Marie-Francoise Coste-Roy. Algorithms in real algebraic geometry, volume 10. Springer Science & Business Media, 2007.

2. Benjamin A Burton and Melih Ozlen. A fast branching algorithm for unknot recognition with experimental polynomial-time behaviour. arXiv preprint arXiv:1211.1079v3, 2014.

3. Richard H Crowell and Ralph Hartzler Fox. Introduction to knot theory, volume 57. Springer Science & Business Media, 2012.

4. IA Dynnikov. Arc-presentations of links: monotonic simplification. Fundamenta Mathematicae, 190:29–76, 2006.

5. Andrew Fish, Alexei Lisitsa, David Stanovsky, and Sarah Swartwood. Efficient knot discrimination via quandle coloring with SAT and #-SAT. In International Congress on Mathematical Software, pages 51–58. Springer, 2016.

6. D Yu Grigor'ev. Complexity of deciding Tarski algebra. Journal of Symbolic Computation, 5(1-2):65–108, 1988.

7. Wolfgang Haken. Theorie der Normalflächen. Acta Mathematica, 105(3-4):245–375, 1961.

8. Joel Hass, Jeffrey C Lagarias, and Nicholas Pippenger. The computational complexity of knot and link problems. Journal of the ACM (JACM), 46(2):185–211, 1999.

9. Akio Kawauchi. A survey of knot theory. Birkhäuser, 2012.

10. Peter B Kronheimer, Tomasz S Mrowka, et al. Witten's conjecture and property P. Geometry & Topology, 8(1):295–310, 2004.

11. Greg Kuperberg. Knottedness is in NP, modulo GRH. Advances in Mathematics, 256:493–506, 2014.

12. M Lackenby. A polynomial upper bound on Reidemeister moves. Annals of Mathematics, 182(2):491–564, 2015.

13. Marc Lackenby. The efficient certification of knottedness and Thurston norm. arXiv preprint arXiv:1604.00290, 2016.

14. WB Raymond Lickorish. An introduction to knot theory, volume 175. Springer Science & Business Media, 2012.

15. Alexei Lisitsa and Alexei Vernitski. Automated reasoning for knot semigroups and π-orbifold groups of knots. In Mathematical Aspects of Computer and Information Sciences, pages 3–18. Springer International Publishing, 2017.

16. Grant Olney Passmore and Paul B Jackson. Combined decision techniques for the existential theory of the reals. In International Conference on Intelligent Computer Mathematics, pages 122–137. Springer, 2009.

17. James Renegar. On the computational complexity and geometry of the first-order theory of the reals. Part I: Introduction. Preliminaries. The geometry of semi-algebraic sets. The decision problem for the existential theory of the reals. Journal of Symbolic Computation, 13(3):255–299, 1992.

18. Edward Witten, Martin Bridson, Helmut Hofer, Marc Lackenby, and Rahul Pandharipande. Lectures on Geometry. Oxford University Press, 2017.

New in CoCoA-5.2.4 and CoCoALib-0.99600 for SC-Square

John Abbott$^1$, Anna M. Bigatti$^1$, and Elisa Palezzato$^2$

$^1$ Dipartimento di Matematica, Università degli Studi di Genova, Italy
$^2$ Department of Mathematics, Hokkaido University

Abstract. CoCoALib is a C++ software library offering operations on polynomials, ideals of polynomials, and related objects. The principal developers of CoCoALib are members of the SC$^2$ project. We give an overview of the latest developments of the library, especially those relating to the project SC$^2$.

The CoCoA software suite also includes the programmable, interactive system CoCoA-5. Most of the operations in CoCoALib are also accessible via CoCoA-5. The programmability of CoCoA-5 together with its interactivity help in fast prototyping and testing conjectures.

1 Introduction

We briefly recall what is described in [2] and [3]. The CoCoA project dates back to 1987, with the first public release of its interactive system in 1989. The main purpose of the project has always been to offer a convenient software laboratory for studying Computational Commutative Algebra, especially ideals of multivariate polynomials (e.g. Gröbner bases).

The CoCoA software has a modular design: its "mathematical expertise" resides in the C++ software library [4], while the interactive system [7] uses an interpreter which grants easy access to CoCoALib's capabilities (without requiring any knowledge of C++). All code is free and open source (under licence GPL-3).

We give an overview of the latest developments of the library and of the system (for the summer 2018 release). There are some new aspects of particular interest to the project SC$^2$. One feature is an implementation in CoCoALib of a new efficient algorithm for computing the minimal polynomial ([5, 3]), and its application to many operations on 0-dimensional ideals; a prototype implementation in CoCoA-5 was mentioned in [3]. In particular, in view of applications to Cylindrical Algebraic Decomposition (CAD), we focus on factoring polynomials over algebraic field extensions, and on evaluating good approximations for their roots.

Another feature of interest to SC$^2$ is the prototype interface for communication between CoCoALib and MathSAT (Section 4).


2 Improving usability of CoCoA for SC$^2$

2.1 Timeout mechanism

A flexible "timeout" mechanism has been added to CoCoALib. It has a simple user interface, and can be used with several function calls to CoCoALib. The exact behaviour of the timeout depends on the specific function: e.g. some functions either return the correct and complete answer within the requested time limit, or throw an exception to say that the result could not be computed quickly enough; other functions, which compute approximations to some exact value, return the best approximation which could be obtained within the given time limit.

One purpose of the timeout mechanism is to allow a "speculative" approach to solving: i.e. a potentially costly algorithm may be called with a time limit, and in fortunate cases the answer is returned, but in the worst case of a vain attempt only the specified amount of time has been consumed. For example, one may attempt to find all real solutions to a 0-dimensional polynomial system (internally this computes a Gröbner basis); if successful then the result is valuable, but in some unlucky cases the Gröbner basis computation may be unreasonably slow, so we use the timeout mechanism to avoid the "black hole", and must then proceed without knowing what the solutions to the system are. This technique has already been successfully employed inside CoCoALib itself: for example, in testing primality of zero-dimensional ideals (which is called during the factorization of polynomials over algebraic extensions, Section 3.1).
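The speculative pattern described above can be sketched generically in Python. This is not CoCoALib's interface: the names `Deadline`, `TimeoutExpired`, `slow_sum_of_squares` and `speculative` are hypothetical, and the "costly algorithm" is a stand-in loop that polls the deadline cooperatively.

```python
import time


class TimeoutExpired(Exception):
    """Raised when a computation exceeds its time budget."""


class Deadline:
    """A time budget that long-running code polls from time to time."""

    def __init__(self, seconds):
        self._end = time.monotonic() + seconds

    def check(self):
        if time.monotonic() > self._end:
            raise TimeoutExpired("time limit exceeded")


def slow_sum_of_squares(n, deadline):
    """Stand-in for a costly algorithm: polls the deadline every 1000 steps."""
    total = 0
    for i in range(n):
        if i % 1000 == 0:
            deadline.check()
        total += i * i
    return total


def speculative(n, seconds):
    """Return the result, or None if it could not be computed in time."""
    try:
        return slow_sum_of_squares(n, Deadline(seconds))
    except TimeoutExpired:
        return None  # proceed without the answer


print(speculative(10, 60.0))         # cheap call, finishes within the budget
print(speculative(10**9, 0.01))      # costly call, abandoned once the budget is spent
```

The caller pays at most roughly the requested time for a failed attempt, which is the property the speculative approach relies on.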

2.2 Towards Real Solving

We have implemented a prototype for GBasisRealSolve which removes some non-real components during Gröbner basis computation. The critical internal operation is a quick function for "approximating" the real radical of a polynomial. In this context "approximating" means applying quick heuristics for determining whether a polynomial admits any real solutions; the heuristic may respond true, false or uncertain. Factors which surely have no real roots are removed, whereas those which do have or may have real roots are retained.

The heuristic employs two other ideas mentioned here: real root counting (via Sturm sequences), and timeout. Even though this heuristic approach is, mathematically speaking, not a satisfactory solution to the general problem, the function GBasisRealSolve seems to work well on examples arising from SMT computations (see Section 4).

A mathematically complete solution could go via Cylindrical Algebraic Decomposition. Towards this end, we have developed in CoCoA factorisation of polynomials over algebraic extensions, and evaluation of polynomials over real intervals (Section 3).


3 Algebraic extensions

CoCoALib has had for some time the capability to compute with polynomials whose coefficients lie in an algebraic extension. The extension may be specified by any maximal ideal, and does not require that a primitive element be furnished. One important step for CAD is the computation of isolating intervals for the real roots of a polynomial with coefficients in a (real) algebraic extension. CoCoALib is not yet able to do this last operation, but we are developing the various components necessary to reach this goal.

3.1 Factorization over algebraic extensions

In this section we describe an effective method for factorizing univariate polynomials over finite field extensions. This method, based on the computation of primary decompositions of zero-dimensional ideals, is described in the PhD thesis [10] of the third author, and took inspiration from [11] and [9]; this work included a full implementation in the CoCoA-5 language. It has now been streamlined and ported into CoCoALib-0.99600 (together with all functions derived from the computation of the minimal polynomial, see [5]).

Example 1. Let $L = \mathbb{F}_2[a]/M \cong \mathbb{F}_8$ be the base field, where $M = \langle a^3 + a + 1 \rangle$ is a maximal ideal, and consider $f(x) = x^5 + x + 1 \in L[x]$, the polynomial to be factorized.

/**/ FF_2 ::= ZZ/(2);
/**/ use S ::= FF_2[a];
/**/ M := ideal(a^3+a+1);
/**/ L := S/M;
/**/ use Lx ::= L[x];
/**/ f := x^5+x+1;
/**/ factor(f);
record[
  RemainingFactor := (1),
  factors := [x+(a^2+a+1), x+(a^2+1), x+(a+1), x^2+x+(1)],
  multiplicities := [1, 1, 1, 1]
]
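The factorization reported in Example 1 can be checked independently by multiplying the factors back together. The Python sketch below is an added illustration, not CoCoA code: it encodes elements of $\mathbb{F}_8 = \mathbb{F}_2[a]/\langle a^3+a+1 \rangle$ as 3-bit integers (bit 2 for $a^2$, bit 1 for $a$, bit 0 for 1), an encoding chosen only for this example.

```python
MODULUS = 0b1011  # a^3 + a + 1


def gf8_mul(u, v):
    """Multiply two elements of F_8, each encoded as a 3-bit integer."""
    result = 0
    while v:
        if v & 1:
            result ^= u      # addition in characteristic 2 is XOR
        v >>= 1
        u <<= 1              # multiply u by a
        if u & 0b1000:       # reduce modulo a^3 + a + 1
            u ^= MODULUS
    return result


def poly_mul(p, q):
    """Multiply polynomials in x over F_8 (coefficient lists, lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] ^= gf8_mul(pi, qj)
    return out


factors = [
    [0b111, 1],   # x + (a^2 + a + 1)
    [0b101, 1],   # x + (a^2 + 1)
    [0b011, 1],   # x + (a + 1)
    [1, 1, 1],    # x^2 + x + 1
]
product = [1]
for factor in factors:
    product = poly_mul(product, factor)
print(product)  # [1, 1, 0, 0, 0, 1], i.e. x^5 + x + 1
```

The three linear factors multiply to $x^3 + x^2 + 1$, which together with $x^2 + x + 1$ gives back $x^5 + x + 1$.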

The algorithm implemented in CoCoALib is the following. Let $S = K[a_1, \ldots, a_m]$ be a polynomial ring over a perfect field $K$, let $M$ be a maximal ideal in $S$, let $L = S/M$, and let $f(x) \in L[x]$ be a univariate polynomial. The factorization of $f(x)$ is obtained by computing the primary decomposition of the ideal $M + \langle f(x) \rangle$ in the polynomial ring $K[a_1, \ldots, a_m, x]$; for the proof of correctness see [10].

Algorithm for Factorization over Algebraic Extension
Notation: let $S = K[a_1, \ldots, a_m]$, $M$ a maximal ideal in $S$, $L = S/M$
Input: $f(x)$, a polynomial in $L[x]$

1. create the ring $P = K[a_1, \ldots, a_m, x]$
   let $\varphi : S \to P$, defined by $a_i \mapsto a_i$
   let $\psi : P \to L[x]$, defined by $a_i \mapsto a_i$ and $x \mapsto x$
2. let $M_P = \langle \varphi(g) \mid g \in \mathrm{gens}(M) \rangle \subseteq P$
   let $f_P \in \psi^{-1}(f)$, a representative of $f$ in $P$
   let $J = M_P + \langle f_P \rangle$, an ideal in $P$
3. compute the primary decomposition of $J$, giving $Q_1 \cap \cdots \cap Q_s$
4. compute $g_i$, the monic generator of $\psi(Q_i)$ in $L[x]$ (note: $L[x]$ is a PID)
5. compute $h_i = \mathrm{rad}(g_i)$ and let $m_i = \deg(g_i)/\deg(h_i)$

Output: $\mathrm{LC}(f) \cdot \prod_{i=1}^{s} h_i^{m_i}$, the factorization of $f$ in $L[x]$

Example 2. The internal computation for Example 1, following the algorithm above, actually performs these steps:

/**/ use P ::= FF_2[a, x];
/**/ phi := PolyAlgebraHom(S, P, [a]);
/**/ psi := PolyRingHom(P, Lx, CanonicalHom(FF_2, L),
                        [RingElem(Lx, "a"), RingElem(Lx, "x")]);
/**/ J := ideal(a^3+a+1) + ideal(x^5+x+1);
/**/ PrDec := PrimaryDecomposition(J); PrDec;
[ ideal(a^3 +a +1, x^5 +x +1, x^2 +x +1),
  ideal(x^3 +x^2 +1, a^3 +a +1, a^2*x +a*x^2 +a^2 +a),
  ideal(x^3 +x^2 +1, a^2 +a*x +x^2 +a, a*x^2 +x +1),
  ideal(x^3 +x^2 +1, a^2 +a*x +x^2 +a, a*x^2 +x) ]
/**/ [ ReducedGBasis(ideal(apply(psi, gens(Q)))) | Q in PrDec ];
[[x^2 +x +(1)], [x +(a^2+a+1)], [x +(a^2+1)], [x +(a+1)]]

and these are indeed the (square-free) factors of $f$.

Next we see a call to factor in an algebraic extension which is given by a non-principal maximal ideal, $L = \mathbb{Q}[a, b]/\langle a^2 - 3,\ b^2 - ab - 4 \rangle$:

/**/ use S ::= QQ[a, b];
/**/ M := ideal(a^2-3, b^2-a*b-4);
/**/ IsMaximal(M); // check that M is a maximal ideal
true
/**/ L := S/M;
/**/ use Lx ::= L[x];
/**/ factor(x^6 +(b^2)*x^4 +(-55*b^2+80)*x^2 +(315*b^2-528));
record[
  RemainingFactor := (1),
  factors := [x^2 +(3*b^2), x +(-b), x +(b)],
  multiplicities := [1, 2, 2]
]

3.2 Counting real roots

We have implemented a function for computing a primitive Sturm sequence of a given (univariate) polynomial with rational coefficients. Sturm sequences enable one to compute easily the number of real roots a given (univariate) polynomial has in a given real interval; in particular, they can be used to tell whether a given (univariate) polynomial has any real roots. CoCoA also offers a function to count the number of real roots:

/**/ W := product([x-k | k in 1..20]); // Wilkinson's polynomial
/**/ W2 := W + (1/7402570310)*x^19;
/**/ NumRealRoots(W2);
18
/**/ W3 := W + (1/7402570311)*x^19;
/**/ NumRealRoots(W3);
20
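For readers unfamiliar with the technique, the following Python sketch shows how a Sturm sequence yields the number of distinct real roots. It is a textbook construction over exact rationals, written for illustration; it is not CoCoA's implementation, it does not make the sequence primitive, and all function names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zeros; coefficients are listed from highest degree down."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]


def remainder(num, den):
    """Remainder of num divided by den (den has a nonzero leading coefficient)."""
    num = list(num)
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num.pop(0)  # the leading coefficient is now zero
    return trim(num) if num else [Fraction(0)]


def sturm_sequence(p):
    """p, p', then successive negated remainders, until the remainder is zero."""
    p = trim([Fraction(c) for c in p])
    degree = len(p) - 1
    derivative = [c * (degree - i) for i, c in enumerate(p[:-1])] or [Fraction(0)]
    seq = [p, trim(derivative)]
    while any(c != 0 for c in seq[-1]):
        seq.append([-c for c in remainder(seq[-2], seq[-1])])
    return seq[:-1]  # drop the final zero polynomial


def sign_changes(signs):
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_real_roots(p):
    """Number of distinct real roots: sign changes at -infinity minus those at +infinity."""
    seq = sturm_sequence(p)
    at_plus_inf = [1 if q[0] > 0 else -1 for q in seq]
    at_minus_inf = [s if (len(q) - 1) % 2 == 0 else -s
                    for s, q in zip(at_plus_inf, seq)]
    return sign_changes(at_minus_inf) - sign_changes(at_plus_inf)


print(count_real_roots([1, 0, -1, 0]))  # x^3 - x
print(count_real_roots([1, 0, 1]))      # x^2 + 1
```

Restricting the count to an interval works the same way, with the signs at the two infinities replaced by the signs of the sequence evaluated at the interval's end points.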

The interactive CoCoA-5 system offers a suite of interpreted functions for computing tight isolating intervals for real roots. This code is not yet available in CoCoALib, but we are planning to port it (though this lengthy task will likely be completed after the end of this first SC$^2$ project). While a list of isolating intervals gives more explicit information than a Sturm sequence, it is also generally more costly to compute.

3.3 Bounds on roots of polynomials

CoCoA now also offers a collection of functions for obtaining a good upper bound for the absolute value of every complex root of a given (univariate) polynomial. The main function strikes an automatic balance between speed of computation and tightness of the computed bound; an optional second argument lets the caller choose a different balance. A root bound gives less information than a Sturm sequence but can be computed faster, and may sometimes give enough information to decide unsolvability. A root bound can be computed for any non-constant (univariate) polynomial, but gives no indication whether any of the roots is real.

/**/ H := HermitePoly(50, x); // largest real root is 9.18
/**/ RootBound(H);            // fair compromise for speed/accuracy
295/32                        // 9.22
/**/ RootBound(H,0);          // fastest, 0 iterations
237/8                         // 29.625 (quite loose)
/**/ RootBound(H,1);          // still fast, 1 iteration
115/8                         // 14.375 (better than 0 iters)
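As a point of comparison, the classical Cauchy bound is about the simplest root bound there is. The Python sketch below implements that classical bound, not the method behind CoCoA's RootBound (which is not described here); the function name is hypothetical.

```python
from fractions import Fraction


def cauchy_root_bound(coeffs):
    """Cauchy's bound: every complex root z satisfies |z| <= 1 + max |c_i / c_n|.

    coeffs lists the coefficients from highest to lowest degree; the
    polynomial must be non-constant with a nonzero leading coefficient.
    """
    lead = Fraction(coeffs[0])
    if lead == 0 or len(coeffs) < 2:
        raise ValueError("need a non-constant polynomial with nonzero leading coefficient")
    return 1 + max(abs(Fraction(c) / lead) for c in coeffs[1:])


# x^2 - 3x + 2 = (x - 1)(x - 2): the bound must be at least 2.
print(cauchy_root_bound([1, -3, 2]))  # 4
```

Like any root bound, this is cheap to compute but says nothing about whether any root is real.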

3.4 Interval arithmetic, and range of a polynomial

We have recently added to CoCoALib a prototype suite of functions for "interval arithmetic" on closed real intervals with rational endpoints. The suite includes a more advanced function for evaluating a given (univariate) polynomial with rational coefficients (or interval coefficients) over a given interval. The result is another interval, comprising effectively a lower bound for the minimum and an upper bound for the maximum for the polynomial over the given interval. In general, the result strictly contains the true interval of reachable values: in fact, the true range interval may have irrational end points; e.g. if $f = x^3 - x$ then its range over the interval $[-1, 1]$ has both end points being irrational, namely $\left[-\frac{2\sqrt{3}}{9}, \frac{2\sqrt{3}}{9}\right]$. Consequently, when using intervals with rational end points only an approximation to the correct answer can be produced.
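The overestimation is easy to see with the most naive approach, Horner's rule carried out in interval arithmetic. The Python sketch below is an added illustration with hypothetical function names; the CoCoALib suite uses a more advanced evaluation method that is not reproduced here.

```python
from fractions import Fraction


def interval_mul(x, y):
    """Product of two closed intervals given as (lo, hi) pairs."""
    products = [x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1]]
    return (min(products), max(products))


def interval_horner(coeffs, x):
    """Enclose the range of a polynomial over the closed interval x.

    coeffs lists the coefficients from highest to lowest degree.
    """
    acc = (Fraction(coeffs[0]), Fraction(coeffs[0]))
    for c in coeffs[1:]:
        lo, hi = interval_mul(acc, x)
        acc = (lo + c, hi + c)
    return acc


# f = x^3 - x over [-1, 1]; the true range is about [-0.385, 0.385].
enclosure = interval_horner([1, 0, -1, 0], (Fraction(-1), Fraction(1)))
print(enclosure)  # (Fraction(-2, 1), Fraction(2, 1))
```

The enclosure $[-2, 2]$ is valid but far from tight, which illustrates why the compromise between speed and tightness needs study.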

We are still studying the compromise between speed of computation andtightness of the resulting interval. Naturally, it can help answer some questions ofsolvability where both the variable and the value of the polynomial are limited tofinite intervals. One future aim is to use this suite to allow isolation of real rootsof polynomials whose coefficients are real algebraic numbers (each representedas a small isolating interval together with the minimal polynomial).

As an example we can compute the range of values of the 10-th Chebyshev polynomial evaluated on the interval I = [−1, 1]. From the theory we know that the true answer is the interval [−1, 1]; our current prototype implementation produces a fair approximation, being the wider interval from −1.15 to 1.12. The Chebyshev polynomials are an interesting test case because they have many local maximums and minimums, which become global maximums and minimums when we restrict the domain to the interval I. We do not give an explicit example in CoCoA-5 since the user interface is not yet settled.

4 Interface MathSAT↔CoCoA-5

In [3] we described the first steps in the communication between CoCoALib and MathSAT. Some developments followed, in close collaboration with Alberto Griggio.

The construction in CoCoALib of polynomial rings with user-defined orderings (such as elimination orderings) has been reorganized so that it makes fewer repeated checks on the admissibility of the term-ordering. Such checks are not a problem in the usual context of Commutative Algebra, where the cost of computing a Gröbner basis far exceeds the time spent in the preliminary construction of the polynomial ring. But the first collection of examples arising from MathSAT computations highlighted that they are indeed an issue when we deal with many indeterminates and relatively simple Gröbner basis computations.

The communication between CoCoALib and MathSAT has been improved, so the overhead for the data conversions is now negligible.

Indeed, thanks to this collaboration both CoCoALib and MathSAT sped up parts of their code whose cost had never been noticeable during their respective normal usage.

Using this new interface, we have built a first collection of examples of systems of polynomial equalities and inequalities generated from MathSAT computations, for experimenting with CoCoA. Since CoCoA natively handles only equalities, we applied an input manipulation which converts MathSAT inequalities into CoCoA equalities by adding squares of slack variables, with further increase of the number of indeterminates. Then we tested the function GBasisRealSolve (Section 2.2), using a specific weighted graduation and term-ordering, and observed it works very effectively on these examples, detecting UNSAT in remarkably few steps.
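The exact conversion is not spelled out in the text. A standard encoding over the reals, given here only as an assumption about what such a manipulation may look like, introduces one fresh indeterminate $s$ per inequality:

```latex
\begin{align*}
p \ge 0 &\iff \exists s \in \mathbb{R} :\; p - s^2 = 0, \\
p > 0   &\iff \exists s \in \mathbb{R} :\; p \cdot s^2 - 1 = 0.
\end{align*}
```

Each inequality thus costs one additional indeterminate, which matches the remark about the growing number of indeterminates.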

We wrote a prototype CoCoA-5 package using MSatLinSolve, the first MathSAT function in CoCoA-5 (see [3]), for the computation of a Gröbner fan as an alternative to the call to GFanRelativeInteriorPoint (part of the communication between CoCoALib and Gfanlib [8]). This is an intriguing application; it is not (yet) competitive with the original code for two main reasons: partly because the non-trivial code preparing the input for MSatLinSolve is written in the interpreted language of CoCoA-5, but especially because the adaptation of the original CoCoA code was not designed to exploit incrementality, which is the strong point of MathSAT.

References

1. J. Abbott, Fault-Tolerant Modular Reconstruction of Rational Numbers, J. Symb. Comp. 80P3, pp. 707–718, 2017.

2. J. Abbott, A.M. Bigatti, CoCoA and CoCoALib: Fast prototyping and flexible C++ library for Computations in Commutative Algebra, Proc. 1st Workshop on Satisfiability Checking and Symbolic Computation, SC-Square 2016, CEUR Workshop Proceedings 1804, pp. 1–3, 2016.

3. J. Abbott, A.M. Bigatti, CoCoA and CoCoALib: Fast prototyping and flexible C++ library for Computations in Commutative Algebra, Proc. 2nd Workshop on Satisfiability Checking and Symbolic Computation, SC-Square 2017, CEUR Workshop Proceedings 1974, pp. 1–6, 2017.

4. J. Abbott, A.M. Bigatti, CoCoALib: a C++ library for doing Computations in Commutative Algebra. Available at http://cocoa.dima.unige.it/cocoalib

5. J. Abbott, A. Bigatti, E. Palezzato, L. Robbiano, Computing and Using Minimal Polynomials, arXiv:1704.03680, 2017.

6. J. Abbott, A. Bigatti, L. Robbiano, Ideals modulo p, In preparation.

7. J. Abbott, A.M. Bigatti, L. Robbiano, CoCoA: a system for doing Computations in Commutative Algebra. Available at http://cocoa.dima.unige.it/

8. A.N. Jensen, Gfan, a software system for Gröbner fans and tropical varieties. Available from http://home.math.au.dk/jensen/software/gfan/gfan.html

9. M. Kreuzer and L. Robbiano, Computational Linear and Commutative Algebra, Springer, Heidelberg 2016.

10. E. Palezzato, Minimal Polynomials, Sectional Matrices, and Applications, PhD thesis, Università degli Studi di Genova, 2017.

11. Y. Sun and D. Wang, Efficient algorithm for factoring polynomials over algebraic extension fields, Sci. China Math., 56, pp. 1155–1168, 2013.

Constraint Systems from Traffic Scenarios for the Validation of Autonomous Driving*

Andreas Eggers, Matthias Stasch, Tino Teige, Tom Bienmüller, and Udo Brockmeyer

BTC Embedded Systems AG, Gerhard-Stalling-Straße 19, 26135 Oldenburg, Germany
{eggers, stasch, teige, bienmueller, brockmeyer}@btc-es.de

http://www.btc-es.de

Abstract. One does not need the gift of clairvoyance to predict that in the near future autonomously driving cars will occupy a significant place in our everyday life. In fact, in designated and even public test-drive areas it is reality even now. Autonomous driving comes with the ambition to make road traffic safer, more efficient, more economic, and more comfortable – and thus to “make the world a bit better”. Recent accidents with automatic cars resulting in severe injuries and death, however, spark a debate on the safety validation of autonomous driving in general. The biggest challenge for autonomous driving to become a reality is thus most likely not the actual development of intelligent vehicles themselves but their rigorous validation that would justify the necessary level of confidence. It is widely agreed that classical test approaches are far from feasible in this emerging area of autonomous driving, as these would require billions of kilometers of real-world driving in each release cycle. To cope with the tremendous complexity of traffic situations that a self-driving vehicle must deal with – without doing any harm to any other traffic participants – a promising approach to safety validation is virtual simulation, i.e. driving the huge amount of test kilometers in a virtual but realistic simulation environment. A particular interest here is in the validation of the behavior of the autonomous car in rather critical traffic scenarios.

In this position paper, we concentrate on one important aspect in virtual safety validation of autonomous driving, namely on the specification and concretization of critical traffic scenarios. This aspect gives rise to complex and challenging constraint systems to be solved, to which we believe the SC² community may give essential contributions by their rich variety of diverse methods and techniques.

Keywords: Autonomous Driving, Traffic Scenarios, Constraint Systems, Constraint Solving

* This work has been conducted within the ENABLE-S3 project that has received funding from the ECSEL joint undertaking under grant agreement no 692455. This joint undertaking receives support from the European Union’s Horizon 2020 Research and Innovation Programme and Austria, Denmark, Germany, Finland, Czech Republic, Italy, Spain, Portugal, Poland, Ireland, Belgium, France, Netherlands, United Kingdom, Slovakia, Norway.


1 Introduction

While driver assistance systems that are in use today already have to be subjected to rigorous validation and verification efforts, there always is a human driver who can serve as a fallback when the system goes into a “fail safe” state. At the latest when it comes to “mind-off” SAE level 4 automated driving [?], however, human drivers can no longer be expected to take over control instantaneously and in time when a malfunction of the autonomous driving service occurs. Consequently, humans cannot serve as fallback insurance in that case, meaning that with this degree of automation the level of required confidence in the controlling machine increases enormously. ISO 26262 [?] defines the term “safety validation” for safety goals on the vehicle level based on tests and examination. Applying today’s state-of-the-art validation methods to this safety validation ends up in billions of kilometers of autonomous test driving to reach the needed confidence, which is obviously impractical, in particular since any change in the algorithm requires repeating such test drives [?]. Note that this “billions of kilometers to drive” observation is based on today’s statistics about injuries and fatalities, taking into account human drivers. On the one hand, a single human driver can act differently in very similar traffic situations – also depending on physical conditions, emotions, and abilities. On the other hand, humans are able to make analogous decisions based on comparable situations – also relying on acquired knowledge and intelligence to transfer from experience. A digital machine neither has emotions nor is it able to really transfer decisions from previous experience – validating safety for automated driving requires a paradigm change for safety goal validation.

Validating safety goals for automated driving requires two major ingredients. First, virtualization will be needed to shift safety goal validation from physical real-world driving to a virtual world [?]. Second, traffic situations need to be (graphically) captured and expressed in a semantically unique, systematic, and comprehensive way [?,?,?,?,?]. Fundamental to virtual validation is the way in which traffic scenarios are specified. On the one hand, a huge mass of scenarios will be required to show safe behavior of the automated driver. Here, abstract scenarios capturing the general scene such as an overtaking maneuver need to be varied, for example, under weather or road conditions. On the other hand, when looking at today’s virtual simulation engines such as VIRES VTD¹ or IPG CarMaker², modeling reality as specifically as possible requires lots of details such as trees, buildings, road surfaces, etc. in order to end up in a very concrete scenario, which can serve as an input to a simulator.

We believe that dealing with the huge amount of scenarios and details needed for simulation pragmatically requires providing an abstract specification method for base scenarios that captures only the essential core of a driving maneuver, and equipping this with automatic generation of concrete scenarios from these abstract descriptions. This paper aims at showing the fundamental experimental concepts of reducing this automatic generation to a constraint solving problem, while still abstracting from details like road surfaces or weather conditions.

¹ https://vires.com/vtd-vires-virtual-test-drive/

² https://ipg-automotive.com/products-services/simulation-software/carmaker/

Structure of the paper. We first sketch ideas on a graphical specification language for abstract traffic scenarios in Section 2.1 by means of an example. Section 2.2 describes the constraints of the example and a simple physical model of the traffic participants including the ego vehicle which allows modeling the abstract scenario on a straight road. A brief outline of how more complex road geometries could be handled is given in Section 2.3. In Section 2.4, we discuss the uncontrollability of the ego vehicle and its impact on the simulation of a scenario. Intuitively, as the ego vehicle’s behavior cannot (and must not) be controlled by the test environment but the intended scenario is still to be simulated as expected, we view the other traffic participants as run by to-be-synthesized controllers whose aim is to interact with the ego vehicle in a reactive way such that the expected scenario is actually performed as accurately as possible. In Section 3, we give some further examples of abstract traffic scenarios which may serve as benchmarks for the SC² community, while Section 4 concludes the paper.

2 Problem Statement

As motivated in Section 1, one major challenge in the validation of autonomous driving functionality is to synthesize concrete critical traffic situations from a rather abstract, intuitive, graphical, and declarative description of a traffic scenario. In this section, we approach the problem by means of a concrete example, namely an overtaking maneuver on a highway for a truck as ego vehicle³. We first explain the abstract description of the example scenario and then derive the necessary constraint system to be solved in order to concretize the abstract scenario, i.e. to synthesize a concrete instance of the allowed traffic behavior matching the constraints of the original scenario.

The full problem class would involve challenging ingredients from several scientific disciplines like continuous behavior to enforce the driving physics of vehicles, discrete choices to start lane changes or braking actions, geometric shapes to specify curves by clothoids, or material science to describe the effect of the road surface on driving vehicles. Ultimately, solving the problem not only calls for finding value sequences for system inputs like acceleration or steering movement of the vehicles, but aims for synthesizing a reactive controller of the surrounding traffic, which is controllable, that obeys the abstract traffic scenario under each behavior of the uncontrollable ego vehicle.

³ The ego vehicle in the terminology of autonomous driving is the vehicle that is controlled by the autonomous driving system and in the context of its validation is thus the system under test. To avoid some possible confusion, we shall point out that during an actual test of the ego vehicle, it has the freedom to control its own behavior, i.e. it is uncontrollable from the perspective of the test environment. For the scenario concretization problem which we are about to introduce in this section, however, we first ignore this limitation and assume that a concretization may in fact prescribe a behavior of all vehicles including the ego vehicle to concretize the scenario. Later in Sect. 2.4, we revisit this important topic.

Fig. 1. Graphical description of the traffic scenario overtaking maneuver on a highway. (Note: we use as a convention traffic moving from left to right, i.e. on the bottom is the rightmost, slowest lane.)

2.1 Example: Overtaking Maneuver on Highway

To approach the problem statement, we start by explaining the example of an overtaking maneuver on a highway. The graphical abstract scenario description is shown in Fig. 1. We first explain the meaning of the traffic scenario phase by phase.

Phase 0. Initially there are four vehicles on a three-lane road, and these vehicles keep their corresponding lanes throughout this phase. Truck 1 is driving on the rightmost lane with a velocity ranging between 21 and 22 m/s. Truck 2 is driving in the middle lane with a velocity that is a bit greater than that of Truck 1, namely by 0.1 to 0.2 m/s. The front distance between Truck 1 and Truck 2 lies in the interval [1.5 m, 2.5 m] initially and can then evolve arbitrarily within this phase. We remark that the distance will decrease, of course, due to the constraint on the velocities. However, this knowledge, in particular how the distance will evolve, is not specified but is implied by physics and potentially other dependent constraints, and its concrete evolution is left to solving the concretization problem. The Ego Truck shall drive a bit faster than Truck 2 (also in the middle lane), thus closing the distance to Truck 2 from 50 m initially to 30 m at the end of the phase. A similar behavior is specified for the vehicle Car (in the middle lane, too). Each traffic phase also includes a global timing constraint on the allowed duration. The duration of phase 0 must be at least 0 s and at most 30 s.

Phase 1. At the beginning of this phase all four vehicles reside in their corresponding lanes. This is required to ensure continuity between the phases, meaning that there cannot be unspecified behavior between two consecutive traffic phases. Truck 1 stays on the right lane throughout the phase and keeps a constant velocity. As a design decision for this scenario, the exact velocity is not specified, but instead it is constrained to be the final velocity of Truck 1 at the end of phase 0. A similar behavior is defined for Truck 2, which stays on the middle lane with a constant velocity. The Ego Truck and Car are also specified to use a constant velocity but they have to change from the middle lane to the left lane during the phase. This lane change must be completed within 10 s from the start of the phase as this is defined as the upper limit for the duration of the phase.

Phase 2. In this phase Truck 1 and Truck 2 continue their behavior from phase 1, staying in their respective lanes at a constant velocity. The Car is driving on the left lane for the entirety of this phase while keeping a velocity similar to that of the Ego Truck, denoted by $v_C \sim v_E$, and staying at a distance between 25 m and 35 m behind the Ego Truck. The Ego Truck stays on the left lane during this phase and is not directly constrained in its velocity. The velocity of the Ego Truck is governed by the duration of the phase given by the interval [0 s, 50 s] and the distance constraint which requires the Ego Truck to be between 25 m and 35 m in front of Truck 2 at the end of the phase.

Phase 3. During this phase the Ego Truck changes from the left lane to the right lane while keeping a constant velocity. The exact velocity is not given and is therefore defined by the final velocity at the end of phase 2. The lane change must be completed within the phase duration of [0 s, 30 s]. The Car is staying on the left lane and has to keep a distance which is greater than or equal to 0 m to the Ego Truck while the velocity is unconstrained. Truck 1 and Truck 2 are both keeping their respective lanes and driving with constant velocities defined by the final velocities of phase 2.

Phase 4. In this phase the Car keeps staying on the left lane and overtakes the Ego Truck. The velocity of Car is undefined but it has to drive between 5 m and 20 m in front of the Ego Truck at the end of the phase. The duration of the phase is given by the interval [0 s, 60 s]. The Ego Truck, Truck 1, and Truck 2 all keep their lanes and drive at a constant velocity.


Phase 5. In this phase, Truck 1 and Truck 2 are not present anymore as they are expected to be too far away to have any further impact on the Ego Truck. The Ego Truck keeps driving on the right lane at constant velocity. The Car is changing from the left lane to the right lane while keeping a constant velocity. This lane change has to be finished within 100 s.

Phase 6. This is the final phase of the scenario which has a duration of at least 60 s and at most 100 s in which the Ego Truck and Car both keep driving on the right lane at a constant velocity.
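The timing constraints of the seven phases can be collected in a small data structure, sketched here in Python. The class name Phase and the helper total_duration_bounds are hypothetical; the lower bound of 0 s for phase 5 is an assumption, since the description only states an upper limit.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phase:
    name: str
    min_duration: float  # seconds
    max_duration: float  # seconds

# Duration bounds as stated in the phase descriptions.
# The lower bound of phase 5 is assumed to be 0 s (only "within 100 s" is given).
PHASES = [
    Phase("phase 0", 0, 30),
    Phase("phase 1", 0, 10),
    Phase("phase 2", 0, 50),
    Phase("phase 3", 0, 30),
    Phase("phase 4", 0, 60),
    Phase("phase 5", 0, 100),
    Phase("phase 6", 60, 100),
]

def total_duration_bounds(phases):
    """Shortest and longest scenario duration admitted by the timing constraints alone."""
    return (sum(p.min_duration for p in phases),
            sum(p.max_duration for p in phases))

shortest, longest = total_duration_bounds(PHASES)
```

These sums ignore all distance and velocity constraints, so the durations that are actually realizable may form a narrower range.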

2.2 Constraints of the Problem Statement

On the most abstract level, the scenario concretization problem consists of a sequence of phases $p_0, \ldots, p_n$. Within each phase, constraints describe the initial state of the phase, invariants regarding its continuous evolution over time, and a final state which must overlap with the initial state of the next phase (if any). Let $\mathbf{x}$ be the vector of vehicle position ($x$, $y$), orientation ($\gamma$), velocity ($v$), angular velocity ($\omega$), and acceleration ($a$) variables and their derivatives with respect to time for all vehicles involved in the scenario. Furthermore, let $\psi_t(\mathbf{x})$ denote the valuation of these variables at a time $t$. Then a discrete scenario concretization is a sequence of valuations $\psi_{t_0}(\mathbf{x}), \ldots, \psi_{t_m}(\mathbf{x})$ such that there exist monotonically increasing time stamps $\tau_0, \ldots, \tau_{n+1} \in \{t_0, \ldots, t_m\}$ with $\tau_0 = t_0$ and $\tau_{n+1} = t_m$ such that

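\begin{align*}
       & \psi_{\tau_0}(\mathbf{x}) \models \mathit{init}_{p_0}(\mathbf{x}) \\
\land\ & \forall t \in [\tau_0, \tau_1] : \psi_t(\mathbf{x}) \models \mathit{invar}_{p_0}(\mathbf{x}) \\
\land\ & \mathit{min\_duration}_{p_0} \le \tau_1 - \tau_0 \le \mathit{max\_duration}_{p_0} \\
\land\ & \psi_{\tau_1}(\mathbf{x}) \models \mathit{final}_{p_0}(\mathbf{x}) \\
\land\ & \psi_{\tau_1}(\mathbf{x}) \models \mathit{init}_{p_1}(\mathbf{x}) \\
\land\ & \forall t \in [\tau_1, \tau_2] : \psi_t(\mathbf{x}) \models \mathit{invar}_{p_1}(\mathbf{x}) \\
\land\ & \mathit{min\_duration}_{p_1} \le \tau_2 - \tau_1 \le \mathit{max\_duration}_{p_1} \\
\land\ & \psi_{\tau_2}(\mathbf{x}) \models \mathit{final}_{p_1}(\mathbf{x}) \\
\land\ & \psi_{\tau_2}(\mathbf{x}) \models \mathit{init}_{p_2}(\mathbf{x}) \\
       & \quad \vdots \\
\land\ & \psi_{\tau_n}(\mathbf{x}) \models \mathit{final}_{p_{n-1}}(\mathbf{x}) \\
\land\ & \psi_{\tau_n}(\mathbf{x}) \models \mathit{init}_{p_n}(\mathbf{x}) \\
\land\ & \forall t \in [\tau_n, \tau_{n+1}] : \psi_t(\mathbf{x}) \models \mathit{invar}_{p_n}(\mathbf{x}) \\
\land\ & \mathit{min\_duration}_{p_n} \le \tau_{n+1} - \tau_n \le \mathit{max\_duration}_{p_n} \\
\land\ & \psi_{\tau_{n+1}}(\mathbf{x}) \models \mathit{final}_{p_n}(\mathbf{x}) \\
\land\ & \forall t \in [t_0, t_m] : C(\psi_t(\mathbf{x})),
\end{align*}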


i.e. each time point $\tau_i$ selects a valuation that allows at least one transition from one phase to the next, between such time points, the invariant and duration constraints of the then-current phase are satisfied, and there are finitely many (possibly none) intermediate valuations within each phase (the $t_i$ without corresponding $\tau_j$), at which the stimuli ($a$, $\omega$) – which determine the vehicles’ behaviors – can change, while all valuations can be connected by continuous evolutions of the physical system dynamics ($C$ as detailed further below). The most precise interpretation of the continuous behavior and invariant constraints on them is a continuous-time semantics, i.e. $\forall t \in [\tau_i, \tau_{i+1}]$ really means for all points of time that exist in $[\tau_i, \tau_{i+1}] \subseteq \mathbb{R}$. Less strict interpretations can however be employed to reach approximations of this semantics, e.g. by considering only $t \in \{t_0, \ldots, t_m\}$ with $\tau_i \le t \le \tau_{i+1}$. For the purpose of the semantics description, we will take a continuous-time perspective, but for practical purposes, such discrete approximations may very well be suitable, too (also see Sect. 2.4). The intention of the final constraint $\forall t \in [t_0, t_m] : C(\psi_t(\mathbf{x}))$ is to ensure that a solution $\psi_{t_0}(\mathbf{x}), \ldots, \psi_{t_m}(\mathbf{x})$ of the discrete scenario concretization problem can be understood as an excerpt of a continuous solution, i.e. a function that satisfies $C$ over continuous time, keeps the stimuli constant except at times $t_0, \ldots, t_m$, and switches from one phase to the next at time points $\tau_1, \ldots, \tau_n$.

Intuitively, such a sequence of valuations describes stimuli (accelerations and angular velocities⁴) and the states reached when applying them such that the phases of the scenario are traversed. Between the valuations, no change to the stimuli is made, i.e. finding a discrete scenario concretization directly leads to showing that a finite number of stimuli changes suffices to perform the scenario. We expect that this can often be achieved with relatively low numbers of intermediate steps for realistic scenarios. However, in general, when not imposing any limit on the number of intermediate steps within each phase, this allows at least theoretically an arbitrarily fine resolution of stimuli changes.⁵

⁴ Alternatively, the use of an angular acceleration as stimulus and the angular velocity as resulting physical variable are of course possible, too.

⁵ Note that the formula structure can be matched to the bounded model checking formulas used to analyze among others also discrete-continuous-hybrid systems, of which this problem is one special case [?]. Starting e.g. from the assumption of at most one valuation per phase transition, incremental solving could amount to increasing the number of intermediate valuations, effectively looking for simple solutions before looking for more complex ones.

Fig. 2. Phase 0 of the traffic scenario overtaking maneuver on a highway.

The ingredients of the above formula need to be described in more detail. Focusing on phase 0 of the example (see Fig. 2), we can identify the following types of constraints:

– Distance evolution constraints: the distance marker between the vehicles Car (with $x_{\text{Car}}$ as its longitudinal position) and Ego Truck (analogously $x_{\text{Ego Truck}}$) has been graphically annotated with a constraint “50 m → 30 m”. This notation actually amounts to three specifications which contribute to (1) the initial predicate of the phase $\mathit{init}_{p_0}$, i.e. when entering the phase, the initial state $\psi_{\tau_0}(\mathbf{x})$ of the phase must satisfy

\[ x_{\text{Ego Truck}} - x_{\text{Car}} = 50\,\text{m}, \]

(2) $\mathit{invar}_{p_0}$, such that all states traversed while passing through the phase must stay within the interval, i.e.

\[ 30\,\text{m} \le x_{\text{Ego Truck}} - x_{\text{Car}} \le 50\,\text{m}, \]

and (3) $\mathit{final}_{p_0}$, i.e. the phase can only be left when

\[ x_{\text{Ego Truck}} - x_{\text{Car}} = 30\,\text{m}. \]

– Velocity constraints: the vehicle Truck 1 is annotated with the constraint “$v_{\text{Truck 1}} : [21, 22]\,\text{m/s}$”. This constraint describes the velocity of the vehicle during the entire phase, i.e. it contributes the constraint $v_{\text{Truck 1}} \in [21, 22]\,\text{m/s}$ to $\mathit{init}_{p_0}$, $\mathit{invar}_{p_0}$, and $\mathit{final}_{p_0}$.

– Velocity difference constraints: the Ego Truck is annotated with “$v_{\text{Ego Truck}} : v_{\text{Truck 2}} + [0.4, 1.2]\,\text{m/s}$”, which constrains the difference of velocities between the Ego Truck and the truck in front of it. Again, this constraint directly contributes to $\mathit{init}_{p_0}$, $\mathit{invar}_{p_0}$, and $\mathit{final}_{p_0}$ as $v_{\text{Ego Truck}} \in v_{\text{Truck 2}} + [0.4, 1.2]\,\text{m/s}$, i.e. must hold when entering, during, and when leaving phase 0.

– Lane position constraints: each vehicle has been placed graphically onto one of the lanes, e.g. Truck 1 drives on the rightmost lane, which amounts to a constraint $y_{\text{Truck 1}} \in [\mathit{mid\_lane}_{\text{right}} + 0.5 \cdot \mathit{width}_{\text{Truck 1}},\ \mathit{mid\_lane}_{\text{right}} - 0.5 \cdot \mathit{width}_{\text{Truck 1}}]$ where $y$ is the lateral position of the vehicle, $\mathit{width}$ the vehicle’s width, and $\mathit{mid\_lane}_{\text{right}}$ the midpoint of the rightmost lane. Also this constraint contributes directly to $\mathit{init}_{p_0}$, $\mathit{invar}_{p_0}$, and $\mathit{final}_{p_0}$. Analogous constraints are added for the other vehicles based on their widths and lane midpoint constants.

– Phase duration constraints: on top of the phase graph, a duration constraint is given, such that $\mathit{min\_duration}_{p_0} = 0\,\text{s}$ and $\mathit{max\_duration}_{p_0} = 30\,\text{s}$.
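Under the discrete approximation of the semantics, the formula and the phase 0 constraints listed above can be checked on a sampled trace. The following Python sketch does so for a single phase, restricted to the Car / Ego Truck distance and the Truck 1 velocity. The function name, the dictionary layout, and the sample trace are hypothetical, and the dynamics constraint $C$ is deliberately not checked.

```python
def is_discrete_concretization(trace, phases, boundaries):
    """Check the phase constraints on a discrete trace.

    trace:      list of (time, state) pairs with increasing times t_0 < ... < t_m
    phases:     list of dicts with predicates init, invar, final on a state
                and numbers min_duration, max_duration (seconds)
    boundaries: indices into trace playing the role of tau_0, ..., tau_{n+1}
    Invariants are only evaluated at the sampled time points.
    """
    if len(boundaries) != len(phases) + 1:
        return False
    if boundaries[0] != 0 or boundaries[-1] != len(trace) - 1:
        return False
    for phase, start, end in zip(phases, boundaries, boundaries[1:]):
        if start > end:
            return False
        (t_start, s_start), (t_end, s_end) = trace[start], trace[end]
        if not (phase["init"](s_start) and phase["final"](s_end)):
            return False
        if not phase["min_duration"] <= t_end - t_start <= phase["max_duration"]:
            return False
        if not all(phase["invar"](s) for _, s in trace[start:end + 1]):
            return False
    return True

def gap(s):
    return s["x_ego"] - s["x_car"]

def v_truck1_ok(s):
    return 21 <= s["v_truck1"] <= 22

phase0 = {
    "init": lambda s: gap(s) == 50 and v_truck1_ok(s),
    "invar": lambda s: 30 <= gap(s) <= 50 and v_truck1_ok(s),
    "final": lambda s: gap(s) == 30 and v_truck1_ok(s),
    "min_duration": 0,
    "max_duration": 30,
}

# Hypothetical trace: the Car closes in on the Ego Truck by 1 m/s, sampled every 10 s.
trace = [(t, {"x_ego": 50 + 22.5 * t, "x_car": 23.5 * t, "v_truck1": 21.5})
         for t in (0, 10, 20)]
ok = is_discrete_concretization(trace, [phase0], [0, 2])
```

Cutting the trace after 10 s makes the check fail, because the final predicate (a gap of exactly 30 m) is not yet satisfied.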

Looking at the remaining phases, we additionally get the following types of constraints:


– Constant velocity constraints: in phase 1, vehicle Car is annotated with “$v_{\text{Car}} : \text{const}$”, which contributes as a constraint $\dot{v} = 0\,\text{m/s}^2$, i.e. the time-derivative of $v$ is zero and hence the velocity does not change.

– Lane position evolution constraints: in phase 1, e.g. vehicle Car changes from the middle to the left lane. While this could be annotated with a change rate, this has not been done here explicitly, so that only (1) $\mathit{init}_{p_1}$ is supplied with a constraint

\[ y_{\text{Car}} \in [\mathit{mid\_lane}_{\text{mid}} + 0.5 \cdot \mathit{width}_{\text{Car}},\ \mathit{mid\_lane}_{\text{mid}} - 0.5 \cdot \mathit{width}_{\text{Car}}], \]

(2) $\mathit{invar}_{p_1}$ is constrained by

\[ y_{\text{Car}} \in [\mathit{mid\_lane}_{\text{mid}} + 0.5 \cdot \mathit{width}_{\text{Car}},\ \mathit{mid\_lane}_{\text{left}} - 0.5 \cdot \mathit{width}_{\text{Car}}], \]

and (3) $\mathit{final}_{p_1}$ contains the resulting state of the lane change – the leftmost lane has been reached – i.e.

\[ y_{\text{Car}} \in [\mathit{mid\_lane}_{\text{left}} + 0.5 \cdot \mathit{width}_{\text{Car}},\ \mathit{mid\_lane}_{\text{left}} - 0.5 \cdot \mathit{width}_{\text{Car}}]. \]

Note, however, that the presence of a phase duration constraint of [0 s, 10 s] implicitly defines a minimum rate such that the lane change takes no longer than 10 s.

– Approximate velocity constraints: in phase 2, the velocity of the vehicle Car is constrained to $v_{\text{Car}} \sim v_{\text{Ego Truck}}$. This constraint is only an abbreviated notation for a velocity difference constraint with an implicit ε-tolerance that is chosen small enough for the velocities to be similar, e.g. ε = 0.4 m/s.

More systematically, the constraints can be categorized in the following way. Simple constraints provide non-changing bounds within which the value of the variable may fluctuate arbitrarily. Evolution constraints use the initial and final states of a phase to provide an expectation about the development of the states during the time in which the phase is active, potentially even quite precisely with a change rate, including as a special case a change rate of 0 to constrain a value not to change. In Section 3, we will provide examples of more complex evolution constraints. Orthogonally, constraints may refer to a single variable, e.g. the velocity of one vehicle, or to differences of variables, e.g. the distance as a difference in positions, or the difference of velocities.

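To model the physical behavior of the vehicles, $C(\mathbf{x})$ contains the following relationships between the variables and their derivatives:

\begin{align*}
\dot{\gamma} &= \omega \\
\dot{v} &= a \\
\dot{x} &= \cos(\gamma) \cdot v \\
\dot{y} &= \sin(\gamma) \cdot v
\end{align*}

for each vehicle.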
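A minimal Python sketch shows how these relationships turn piecewise-constant stimuli into a trajectory. Explicit Euler integration, the step size, and the function name simulate are choices made for this illustration only; a solver would have to treat the dynamics exactly or with verified bounds.

```python
import math

def simulate(state, stimuli, dt=0.01):
    """Integrate the kinematic model with explicit Euler steps.

    state:   dict with keys x, y, gamma, v
    stimuli: list of (duration, a, omega); a and omega are constant per segment
    Returns the state reached after all segments.
    """
    x, y, gamma, v = state["x"], state["y"], state["gamma"], state["v"]
    for duration, a, omega in stimuli:
        steps = round(duration / dt)
        for _ in range(steps):
            x += math.cos(gamma) * v * dt   # derivative of x is cos(gamma) * v
            y += math.sin(gamma) * v * dt   # derivative of y is sin(gamma) * v
            gamma += omega * dt             # derivative of gamma is omega
            v += a * dt                     # derivative of v is a
    return {"x": x, "y": y, "gamma": gamma, "v": v}

# Truck 1 in phase 0: straight ahead at a constant 21.5 m/s for 10 s
end = simulate({"x": 0.0, "y": 0.0, "gamma": 0.0, "v": 21.5}, [(10.0, 0.0, 0.0)])
```

With zero stimuli the vehicle covers 215 m in 10 s and stays at lateral position 0, as expected for a constant-velocity constraint on a straight road.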


2.3 Road Geometries

Using only the constraints presented so far, a scenario concretization can be interpreted in the context of a sufficiently long straight road. The position of curves or even the height profile of a road, however, is not negligible when thinking about their impact on the criticality of a scenario. When intending to use scenarios to simulate and test autonomous driving functions, all kinds of road geometries may thus have to be used to ensure that e.g. the overtaking scenario can be carried out safely even if curves or slopes make it harder to detect all relevant surrounding traffic.

Regarding the constraint system that needs to be solved, some of these aspects can be ignored under the assumption that their impact can be compensated for by changing the stimuli. If e.g. the height of the road increases gently, the vehicle’s acceleration can be adjusted by just a bit such that its velocity reaches the exact same values as it would on a flat surface (assuming that the simulation takes this slow-down effect caused by gravity into account). Similarly, the longer outside lanes in curves can be compensated for by driving a bit faster. A scenario found for a straight road can thus be adapted to more complex road geometries by adapting the stimuli. The price, however, is that the resulting stimuli may be considered too artificial to serve as test cases – since they accelerate e.g. at the beginning of a curve and brake at its end just to follow an “ideal” trajectory computed for a straight road – or even leave the envelope allowed by the constraints (e.g. a maximum velocity needs to be exceeded). If the phases of the scenario request two vehicles to be side by side at the end of a curve, the road-geometry-aware solution could instead e.g. simply give the vehicle on the outer lane a head start, which is consumed during the curve without any additional changes to the stimuli.

When aiming at such geometry-aware solutions, the constraint system needs to be augmented by constraints describing the direction of straight road segments, the radii of arc segments, the curvature change rates of clothoid segments that connect straight and arc segments, or the parameters of polynomial road segments, which form less regular connections. One possible formalization can be based on a function $r : l \mapsto \beta$ that yields a road orientation $\beta$ for any given length $l$. For example for a straight road segment $s$ with direction $\beta_s$, which starts at road length $s_{\text{start}}$ and ends at road length $s_{\text{end}}$, this function would constantly return the same angle: $r_s(l) = \beta_s$ for all $l \in [s_{\text{start}}, s_{\text{end}}]$. For an arc segment, on the other hand, the returned angle would change with a constant rate that depends on the radius. For clothoid and polynomial segments, even that rate of orientation change itself changes over the length of the segment. The purpose of this piecewise-defined function is then to provide an expected direction for the vehicles such that they drive in the direction of the road. A constraint that shall keep a vehicle on its lane must enforce that the direction of the vehicle coincides with the road’s direction, while during a lane change this requirement must be relaxed such that the orientation can deviate from the road’s direction sufficiently to reach the other lane. For distance constraints, the semantics must be clarified by using either a road-length interpretation (i.e. when in a curve, the distance along the curve is taken) or the Euclidean distance. A major technical challenge in the formalization may additionally be the lack of closed-form solutions for the points on clothoid road segments.
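The piecewise-defined orientation function can be sketched in Python for straight, arc, and clothoid segments, whose orientation is the integral of a constant or linearly changing curvature. The Segment class, its field names, and the example road are hypothetical; polynomial segments are not covered by this sketch.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    start: float            # road length at which the segment begins (m)
    end: float              # road length at which the segment ends (m)
    beta_start: float       # road orientation at the start (rad)
    curvature_start: float  # 1/radius at the start (1/m); 0 for a straight segment
    curvature_rate: float   # change of curvature per metre; non-zero only for clothoids

    def orientation(self, l):
        """Orientation at road length l, obtained by integrating the curvature."""
        d = l - self.start
        return self.beta_start + self.curvature_start * d + 0.5 * self.curvature_rate * d * d

def road_orientation(segments, l):
    """Piecewise-defined r(l): orientation of the road at road length l."""
    for seg in segments:
        if seg.start <= l <= seg.end:
            return seg.orientation(l)
    raise ValueError("road length outside the described road")

# Hypothetical road: 100 m straight, then a quarter circle of radius 50 m
arc_length = 0.5 * math.pi * 50.0
road = [
    Segment(0.0, 100.0, 0.0, 0.0, 0.0),
    Segment(100.0, 100.0 + arc_length, 0.0, 1.0 / 50.0, 0.0),
]
```

Only the orientation has a simple closed form here; the positions along a clothoid involve Fresnel integrals, which is the difficulty mentioned at the end of the paragraph above.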

Since we believe that solving the scenario concretization problem on the straight-road geometry is already a very challenging task by itself and a reasonable intermediate step in the direction of solving this more complex problem, we have not yet attempted to fully formalize the road-geometry-aware case.

2.4 Variants of the Problem Statement

Looking at the validation of autonomous driving functions via simulated scenarios, the discrete scenario concretization has one significant drawback: it contains stimuli for all vehicles including the ego vehicle, which must not be controlled by the test environment since it is the system under test. While such solutions prove the existence of concretizations and thus the plausibility of the scenario (not an unimportant result at all), an actual test run will contain controllable vehicles, whose stimuli must be chosen such that the scenario is performed successfully, and the uncontrollable ego vehicle, which chooses its stimuli by itself.

To solve this issue, the constraint system presented above must be understood as a controller synthesis problem rather than a pure constraint solving problem. The solution would thus have a reactive element, e.g. in the form of an executable function, such that in each simulation step the state of and stimuli chosen by the ego vehicle can be used as an input, and the stimuli of the other vehicles can then be computed by the function as an output such that these stimuli lead to a discrete scenario concretization in the sense presented above. Obviously, an ego vehicle can undermine the attempt to finish a scenario, e.g. by changing to the wrong lane, driving too slowly or too fast, etc. An optimal solution would thus be a reactive controller that concretizes the solution for all behaviors of the ego vehicle for which there still exists a possible completion.

The above problem statement can thus be considered a challenge with multiple degrees of solution:

– Controller synthesis: Find a controller that stimulates the surrounding traffic in such a way that the resulting sequence forms a discrete scenario concretization, as long as the ego vehicle's behavior allows for the existence of such a concretization.

– Plausibility proof: Find one discrete scenario concretization with stimuli for all vehicles including the ego vehicle. Though not directly simulatable, this result proves that the scenario can be realized under the assumption that the ego vehicle cooperates, i.e. the scenario specification as such is plausible and does not contradict itself (e.g. adjacent phases whose constraints allow no valid transitions).

• . . . on road geometry: using road geometry constraints such that the vehicles follow a potentially complex road geometry during the scenario simulation.


Fig. 3. Graphical description of the traffic scenario truck platooning.

• . . . on a straight road: not taking any road geometry constraints into account, i.e. driving just on a straight road and thus showing that the scenario is plausible in theory, but not necessarily on all concrete road geometries (e.g. because velocities would need to be higher than allowed by the constraints when driving on the longer outer lanes of a curve).

Orthogonally to these degrees of solution, different levels of abstraction can be chosen. An approximate solution might, for example, be obtained more easily by ignoring phase invariant constraints between stimuli changes and only checking a posteriori whether the invariant constraints hold during each simulation step. Using simulations with sufficiently small sample times, one will normally be able to detect invariant violations with practically sufficient accuracy during simulation. If violations are observed in this way, the concretization of the scenario can be repeated with a finer resolution for the intermediate steps, i.e. using more intermediate steps (potentially evenly distributed between the steps of the concretization in which the violation occurred) and thus reducing the risk of overlooking an invariant violation. Similarly, the continuous dynamics can be approximated, especially when using a reactive component that uses control strategies to achieve satisfaction of the constraints. In this case, the behavior provided by the simulation environment and one's own approximation can be compared in each step, and the stimuli can be adapted such that small deviations caused by the approximation are corrected. Additionally, tolerances can be used to allow slight deviations from the “exact” constraints: a velocity, for example, is barely ever truly constant in the real world, so a small neighborhood around the constraint may be considered a rather natural relaxation of the problem.
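The a posteriori check and the refinement of intermediate steps can be sketched as follows. This is only an illustration under simplifying assumptions: states are arbitrary Python objects, the invariant is a Boolean predicate on a single state, and the function names are hypothetical.

```python
def find_violations(states, invariant):
    """Return the indices of sampled states that violate the invariant."""
    return [i for i, state in enumerate(states) if not invariant(state)]


def refine(times, bad_indices, extra=3):
    """Insert `extra` evenly spaced time points before each violating step.

    `times` is the sorted list of sample times of the concretization and
    `bad_indices` the indices at which a violation was observed. A violation
    at index 0 has no preceding interval and is left untouched.
    """
    bad = set(bad_indices)
    refined = [times[0]]
    for i in range(1, len(times)):
        if i in bad:
            lo, hi = times[i - 1], times[i]
            # Evenly distribute the additional steps inside (lo, hi).
            refined.extend(lo + (hi - lo) * k / (extra + 1) for k in range(1, extra + 1))
        refined.append(times[i])
    return refined
```

For instance, with an assumed invariant "the gap to the vehicle ahead is at least 10 m", `find_violations` reports the steps where the sampled gap is too small, and `refine` yields the denser time grid on which the concretization would be repeated.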

3 Further Examples of Abstract Traffic Scenarios

In this section, we give three further graphical descriptions of abstract traffic scenarios which may serve as benchmarks for the SC² community.


Fig. 4. Graphical description of the traffic scenario dangerous lane change.

Fig. 5. Graphical description of the traffic scenario late merge.

Truck Platooning. The critical aspect of the truck platooning example, as shown in Fig. 3, is the very short distance between the trucks of the platoon. This short distance is essential, however, since such platoons of autonomous and cooperating trucks aim at reducing fuel consumption and optimizing the utilization of roads.

We remark that phase 2 introduces a more complex velocity evolution constraint: in addition to the initial and final velocity constraints, the change rate of the velocity, i.e. the acceleration a_E, is bounded from above by 0.6 m/s², which is important in order to specify fuel-efficient driving.
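On a sampled velocity trace, such an acceleration bound can be checked with finite differences. The sketch below assumes a fixed sample time `dt` in seconds and velocities in m/s; the function name is hypothetical, and only the bound of 0.6 m/s² comes from the scenario.

```python
def acceleration_bounded(velocities, dt, a_max=0.6):
    """Check that the finite-difference acceleration never exceeds a_max."""
    return all(
        (v_next - v_prev) / dt <= a_max
        for v_prev, v_next in zip(velocities, velocities[1:])
    )
```

Note that this only bounds the acceleration from above, as the constraint in phase 2 does; braking is not restricted by it.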

Dangerous Lane Change. A dangerous lane change of a car from the left to the right lane into a small gap between two trucks is illustrated in Fig. 4. The task of the ego truck is to assure a safe distance to the car driving ahead while avoiding abrupt braking.

Late Merge. The abstract traffic scenario of a late merge in Fig. 5 describes the critical situation in dense traffic when two lanes are merged into one.


4 Conclusion and Future Work

The development and, in particular, the safety validation of autonomous vehicles is one of the biggest challenges in the near future. In this position paper, we focussed on a central aspect of the validation of autonomous driving, namely on the specification and concretization of abstract traffic scenarios, which will establish the fundamental basis for safety argumentation. We sketched ideas on a graphical specification language for abstract traffic scenarios. By means of a concrete example we derived the constraint problem class to be solved in order to synthesize concrete traffic scenarios. We also outlined an extension of the problem statement by adding road geometry constraints and by asking for controller synthesis in addition to the plausibility check of abstract traffic scenarios. In total, we provided four concrete examples of abstract scenarios which may serve as benchmarks for the SC² community.

With regard to potential approaches to solving such constraint systems arising from traffic scenarios, one has to cope with several dimensions of complexity. The problem class incorporates system evolution over real time, while the behavior of vehicles is represented by differential equations involving transcendental functions like cos and sin. This suggests employing constraint solvers that are able to handle differential equations, which in turn raises doubts about applicability in industrial practice. When trying to simplify the problem, it is not obvious whether these differential equations have closed-form solutions and, if not, whether good-enough approximations exist that still yield valid concrete scenarios. In addition to the intricacy of the constraints, it might be the case that the controller synthesis problem calls for nested quantifiers and thus for quantifier elimination techniques. A potential approach might also be along the lines of combining numerical optimization and SAT solving [?]. Another important question is whether decomposition of the abstract traffic scenario, e.g. phase-wise or even more fine-grained within a phase, or abstraction-refinement approaches might be beneficial when solving the scenario concretization problem.

Solving this problem class arising from traffic scenarios is thus a major challenge and is of fundamental importance for the validation of autonomous driving. Due to the remarkable progress that the two communities of Symbolic Computation and of Satisfiability Checking have made in recent years in solving industrially relevant problems, and due to their recent intention of “joining forces”, we believe that the SC² community may make essential contributions based on its rich variety of diverse methods and techniques.

Acknowledgments. We would like to thank the anonymous reviewers for their valuable comments that have helped us clarify the presentation in this paper. Additionally, we would like to thank David Specht for his feedback on one of the examples and Karsten Scheibler for helpful discussions on potential constraint representations. This work was supported by the H2020-FETOPEN-2016-2017-CSA project SC² (712689).


References

1. Damm, W., Kemper, S., Mohlmann, E., Peikenkamp, T., Rakow, A.: Traffic sequence charts - from visualization to semantics. AVACS Technical Report 117 (Oct 2017)

2. Damm, W., Kemper, S., Mohlmann, E., Peikenkamp, T., Rakow, A.: Using traffic sequence charts for the development of HAVs. In: Embedded Real Time Software and Systems - ERTS2018 (2018)

3. Damm, W., Mohlmann, E., Peikenkamp, T., Rakow, A.: A formal semantics for traffic sequence charts. In: Festschrift in honor of Edmund A. Lee (2017)

4. Giorgetti, N., Pappas, G., Bemporad, A.: Bounded model checking of hybrid dynamical systems. In: 44th IEEE Conference on Decision and Control and 2005 European Control Conference (CDC-ECC’05). pp. 672–677 (2005)

5. International Organization for Standardization: Road vehicles – Functional safety (ISO 26262) (2011)

6. Kalra, N., Paddock, S.M.: Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability? (2016), https://www.rand.org/pubs/research_reports/RR1478.html

7. Kemper, S., Etzien, C.: A visual logic for the description of highway traffic scenarios. In: Aiguier, M., Boulanger, F., Krob, D., Marchal, C. (eds.) Complex Systems Design & Management, Proceedings of the Fourth International Conference on Complex Systems Design & Management CSD&M 2013, Paris, France, December 4-6, 2013. pp. 233–245. Springer (2013), https://doi.org/10.1007/978-3-319-02812-5_17

8. Maurer, M., Gerdes, J.C., Lenz, B., Winner, H.: Autonomes Fahren - Technische, rechtliche und gesellschaftliche Aspekte. Springer Berlin Heidelberg (May 2015)

9. Menzel, T., Bagschik, G., Maurer, M.: Scenarios for Development, Test and Validation of Automated Vehicles. ArXiv e-prints (Jan 2018), https://arxiv.org/abs/1801.08598

10. Priya Inala, J., Gao, S., Kong, S., Solar-Lezama, A.: REAS: Combining Numerical Optimization with SAT Solving. ArXiv e-prints (Feb 2018), https://arxiv.org/abs/1802.04408

11. SAE-International: J3016 201609: Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles (Sep 2016)





Wrapping Computer Algebra is Surprisingly Successful for Non-Linear SMT

Pascal Fontaine¹ (orcid.org/0000-0003-4700-6031), Mizuhito Ogawa² (orcid.org/0000-0002-8050-7228), Thomas Sturm¹,³ (orcid.org/0000-0002-8088-340X), Van Khanh To⁴, Xuan Tung Vu¹,² (orcid.org/0000-0002-2239-6574) *

1 University of Lorraine, CNRS, Inria, and LORIA, Nancy, France, {pascal.fontaine,thomas.sturm}@loria.fr

2 Japan Advanced Institute of Science and Technology, Nomi, Japan, {mizuhito,tungvx}@jaist.ac.jp

3 MPI Informatics and Saarland University, Saarbrucken, [email protected]

4 University of Engineering and Technology, VNU, Hanoi, [email protected]

Abstract. We report on a prototypical tool for Satisfiability Modulo Theory solving for quantifier-free formulas in Non-linear Real Arithmetic or, more precisely, real closed fields, which uses a computer algebra system as the main component. This is complemented with two heuristic techniques, also stemming from computer algebra, viz. interval constraint propagation and subtropical satisfiability. Our key idea is to make optimal use of existing knowledge and work in the symbolic computation community, reusing available methods and implementations to the greatest possible extent. Experimental results show that our approach is surprisingly efficient in practice.

1 Introduction

Quantifier-free non-linear real arithmetic (QF_NRA) appears in many applications of satisfiability modulo theories (SMT) solving. Efficient reasoning techniques for the corresponding constraints in SMT solvers are thus highly relevant. Examples of SMT solvers with theory reasoners for non-linear real arithmetic are CVC4 [2], SMT-RAT [6], Z3 [11] and Yices [5]. We report here on the prototypical non-linear theory reasoner in veriT. It is based on three techniques, namely interval constraint propagation, subtropical satisfiability, and quantifier elimination. Interval constraint propagation and subtropical satisfiability are heuristic procedures that are easy to implement, but incomplete. Completeness of the theory solver is guaranteed by the use of quantifier elimination. In order to make optimal use of existing knowledge and work in the symbolic computation community, we take advantage of an independent computer algebra system (Redlog/Reduce). The theory reasoner built from these three components is called lazily, on full models produced by the underlying SAT solver. This simple SMT solver is actually quite efficient in solving problems in the QF_NRA category of the SMT-LIB.

* The order of authors is strictly alphabetic.


In Section 2, we briefly present the three procedures that are used in the tool. We then describe how they are combined into one theory reasoner for non-linear real arithmetic literals. We finally report on experiments on the SMT-LIB.

2 Component Procedures

2.1 Interval Constraint Propagation

This section briefly introduces Interval Constraint Propagation (ICP) [1], a technique providing an incomplete but efficient method to check the satisfiability over the reals of a finite set of polynomial constraints [8,17,9,12,25]. Algorithm 1 depicts a naive version of ICP. This algorithm works on boxes in R^n; initially (line 2) there is one box set to ]−∞,∞[^n for the n variables in the constraints. It iteratively (line 3) picks one box B (line 4) and uses the constraints to heuristically contract that box (line 6), then it evaluates the constraints on B, discarding the box if there is no possible solution (one constraint is unsatisfiable on the box, line 7), or decomposing B into smaller boxes, which will later be considered the same way (line 12). The algorithm terminates concluding unsatisfiability (line 15) when all boxes have been discarded (one constraint is unsatisfiable, for each box), or concluding satisfiability when all constraints are valid in the same box (line 10, indeed every point in the box is a solution for the set of constraints). Note that ICP does not require tedious computations with real algebraic numbers at all. However, the algorithm does not necessarily terminate since it might infinitely decompose boxes into smaller ones.

Algorithm 1 ICP for a set of polynomial constraints ϕ
 1: function ICP(ϕ)
 2:     S ← { ]−∞,∞[^n }
 3:     while S ≠ ∅ do
 4:         choose B ∈ S
 5:         S ← S \ {B}
 6:         B′ ← contract B from constraints
 7:         if B′ = ∅ or one constraint is unsatisfiable on B′ then
 8:             continue
 9:         else if all constraints are valid on B′ then
10:             return SAT
11:         end if
12:         B1, B2 ← decompose B′
13:         S ← S ∪ {B1, B2}
14:     end while
15:     return UNSAT
16: end function
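The loop of Algorithm 1 can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in the tool: the contraction step (line 6) is omitted, the search starts from a bounded box instead of ]−∞,∞[^n, a step budget forces termination, and plain floating-point interval arithmetic without outward rounding is used, so the answers are not rigorous. Every constraint is assumed to have the form p > 0 and is given as a hypothetical function that maps a box to an interval enclosing p over that box.

```python
from collections import deque


def iadd(a, b):
    """Interval addition."""
    return (a[0] + b[0], a[1] + b[1])


def ineg(a):
    """Interval negation."""
    return (-a[1], -a[0])


def imul(a, b):
    """Interval multiplication."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def icp(constraints, box, budget=10_000):
    """Naive ICP for constraints p > 0; `box` is a list of (lo, hi) pairs."""
    queue = deque([box])
    while queue and budget > 0:
        budget -= 1
        b = queue.popleft()
        ranges = [c(b) for c in constraints]
        if any(hi <= 0 for _, hi in ranges):
            continue  # one constraint is unsatisfiable on the box: discard it
        if all(lo > 0 for lo, _ in ranges):
            return "SAT", b  # every point of the box is a solution
        # Decompose the box by bisecting its widest dimension.
        k = max(range(len(b)), key=lambda i: b[i][1] - b[i][0])
        lo, hi = b[k]
        mid = (lo + hi) / 2
        queue.append(b[:k] + [(lo, mid)] + b[k + 1:])
        queue.append(b[:k] + [(mid, hi)] + b[k + 1:])
    return ("UNSAT", None) if not queue else ("UNKNOWN", None)


# Illustrative constraints: x*y - 1 > 0 and 4 - x - y > 0.
c1 = lambda b: iadd(imul(b[0], b[1]), (-1.0, -1.0))
c2 = lambda b: iadd((4.0, 4.0), ineg(iadd(b[0], b[1])))
# Illustrative unsatisfiable constraint: -x*x - 1 > 0.
c3 = lambda b: iadd(ineg(imul(b[0], b[0])), (-1.0, -1.0))
```

Calling `icp([c1, c2], [(0.0, 4.0), (0.0, 4.0)])` returns a box on which both constraints hold, while `icp([c3], [(-2.0, 2.0)])` discards both halves of the interval after one bisection.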

Our implementation of ICP uses testing [12,25] to improve satisfiability detection: the constraints are regularly checked for satisfiability on random test points inside given boxes. Since ICP uses interval arithmetic as an approximation to check validity or unsatisfiability of constraints on a box of full dimension, it is rarely able to find satisfiable assignments in the presence of equations. Our ICP algorithm furthermore uses the Intermediate Value Theorem as a heuristic to assert that equations are satisfied within a given box [25].

2.2 Subtropical Satisfiability

The subtropical satisfiability checking method [7] (SUBTROP) is based on the simultaneous evaluation of the input set of polynomial constraints when the values of variables approach 0, 1, or ∞ following some polynomial curve. It basically checks, for such limits, whether within each polynomial one monomial dominates all other ones. If so, the algorithm computes an interpretation for variables, based on the curve, such that all polynomials in the set have a value of a given sign. The method is mostly limited to determining the satisfiability of a set of constraints containing only inequalities, but it appears that this is useful for the various applications represented in the SMT-LIB.

Example 1. Consider $f_1 = -12 + 2x^{12}y^{25}z^{49} - 31x^{13}y^{22}z^{110} - 11x^{1000}y^{500}z^{89}$ and $f_2 = -23 + 5xy^{22}z^{110} - 21x^{15}y^{20}z^{1000} + 2x^{100}y^{2}z^{49}$, and the satisfiability checking problem $f_1 > 0 \land f_2 > 0$. The method (see [7] for details) shows that the monomials $2x^{12}y^{25}z^{49}$ and $5xy^{22}z^{110}$ respectively dominate in $f_1$ and $f_2$ when $(x,y,z)$ tends to $(0,\infty,0)$, more precisely, for values $(x,y,z) = (a^{n_1}, a^{n_2}, a^{n_3})$ with, for instance, $n_1 = -\frac{238834}{120461}$, $n_2 = \frac{2672460}{1325071}$, $n_3 = -\frac{368561}{1325071}$ and $a$ sufficiently large. The method actually computes those $n_1$, $n_2$, and $n_3$. So, if $a \in \mathbb{R}^+$ is large enough, then both constraints $f_1(a^{n_1}, a^{n_2}, a^{n_3}) > 0$ and $f_2(a^{n_1}, a^{n_2}, a^{n_3}) > 0$ are satisfied. Consider $a = 2$ for instance: then $f_1(a^{n_1}, a^{n_2}, a^{n_3}) \approx 16371.99$ and $f_2(a^{n_1}, a^{n_2}, a^{n_3}) \approx 17707.27$. □
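The numbers in Example 1 can be reproduced with a short script. The sketch below represents each polynomial as a list of (coefficient, exponent vector) pairs, which is a representation chosen here for illustration. Since every variable is a power of a, each monomial c·x^i·y^j·z^k evaluates to c·a^(i·n1 + j·n2 + k·n3); computing that exponent exactly with fractions avoids overflow and underflow in the huge powers.

```python
from fractions import Fraction as F

# Exponents n1, n2, n3 of the curve (x, y, z) = (a^n1, a^n2, a^n3) from Example 1.
n = (F(-238834, 120461), F(2672460, 1325071), F(-368561, 1325071))

# Polynomials as lists of (coefficient, (deg_x, deg_y, deg_z)).
f1 = [(-12, (0, 0, 0)), (2, (12, 25, 49)), (-31, (13, 22, 110)), (-11, (1000, 500, 89))]
f2 = [(-23, (0, 0, 0)), (5, (1, 22, 110)), (-21, (15, 20, 1000)), (2, (100, 2, 49))]


def evaluate_on_curve(poly, a):
    """Evaluate poly at (a^n1, a^n2, a^n3), one monomial at a time."""
    total = 0.0
    for coeff, degrees in poly:
        # Exact exponent of a for this monomial.
        exponent = sum(d * ni for d, ni in zip(degrees, n))
        total += coeff * a ** float(exponent)
    return total


v1 = evaluate_on_curve(f1, 2)
v2 = evaluate_on_curve(f2, 2)
```

With these exponents, the dominating monomial of f1 becomes exactly 2·a^13, which is why f1 is so close to 2·8192 − 12 at a = 2.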

The subtropical method expresses as linear constraints some conditions for one monomial to dominate within a polynomial. Subsequently, an SMT solver is used to check the satisfiability of these linear constraints. If the linear constraints are satisfiable, then the non-linear constraint is satisfiable and the model of the linear constraints provides a witness for the polynomial curve, i.e., for the previous example, the values of n1, n2, and n3. For sets of polynomial constraints, the process is essentially the same.
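The reason why dominance can be stated with linear constraints may be sketched as follows. This is a simplified reading for a single constraint f > 0 and positive variable values; the notation (the exponent vectors e_j and the index k of the dominating monomial) is introduced here for illustration, and the precise encoding is given in [7].

```latex
% Write f as a sum of monomials and substitute the curve x_i = a^{n_i}, a > 0:
\[
  f(x) = \sum_{j} c_j \, x^{e_j}
  \quad\Longrightarrow\quad
  f(a^{n_1}, \dots, a^{n_d}) = \sum_{j} c_j \, a^{\langle n, e_j \rangle},
  \qquad \langle n, e_j \rangle = \sum_{i=1}^{d} n_i \, e_{j,i}.
\]
% If some monomial k has a positive coefficient and the strictly largest exponent,
\[
  c_k > 0
  \quad\text{and}\quad
  \langle n, e_k \rangle > \langle n, e_j \rangle \ \text{ for all } j \neq k,
\]
% then c_k a^{<n, e_k>} outgrows all other terms, so f > 0 for sufficiently large a.
% The conditions on n are linear (strict) inequalities in n_1, ..., n_d.
```

In Example 1 the exponent vector (12, 25, 49) plays the role of e_k for f1, and the computed values of n1, n2, n3 are one solution of the resulting linear system.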

Experimentally, the application of the subtropical satisfiability method to solve non-linear constraints is fast, either to succeed or to fail. In our experiments, it furthermore provides an answer on several problems that defeat all state-of-the-art solvers. The method is thus a useful complementary heuristic with little drawback, to be used either upfront or in a portfolio with other approaches to deal with non-linear constraints.

2.3 Quantifier Elimination

As a complete method, we use Quantifier Elimination techniques (QE). While it is possible to quickly implement ICP and subtropical satisfiability checking, those quantifier elimination techniques are significantly more complex and require expertise. It is not feasible to implement a custom version for SMT within a few person months. We thus used a computer algebra system, namely Reduce, and more precisely, the package Redlog [4], which implements interpreted first-order logic on top of computer algebra. Besides many other domains, Redlog provides decision procedures for real closed fields. For real arithmetic Redlog combines two quantifier elimination techniques, virtual substitution [26,27,13] with partial cylindrical algebraic decomposition [3,19]. For a survey of real applications of Redlog see [22]. Further theories supported by Redlog include discretely valued fields [21], term algebras [23], QBF [20,24], Presburger Arithmetic [15] and fragments of non-linear integer arithmetic [14].

To embed a decision procedure in SMT, one property is mandatory: the procedure should feature small conflict production. From an unsatisfiable set of literals, it should produce an unsatisfiable subset containing few unnecessary literals, optimally a minimal unsatisfiable subset, i.e. such that all proper subsets are satisfiable. With small conflict production, the SMT infrastructure goes considerably beyond a SAT solver enumerating all possible models of the Boolean abstraction of the input formula, with the theory reasoner refuting them one at a time. We recently presented [10] a method using linear optimization to compute conflict sets for cylindrical algebraic decomposition and virtual substitution. Virtual substitution and cylindrical algebraic decomposition share the same basic idea of finding a finite set of test points that suffice to determine the unsatisfiability of a set of constraints S. If the set S is unsatisfiable, each of these test points falsifies at least one constraint in S. Finding a conflict set reduces to finding a subset of the constraints that contains, for each test point, at least one unsatisfied constraint. We implemented this in the quantifier elimination algorithms in Redlog. Redlog/Reduce now furthermore provides an interface for SMT solvers, so that the software can be used as a theory reasoner with little effort. This reasoner is used in our portfolio of tools, and guarantees the completeness of the combination of reasoners on real closed fields.
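The reduction to a covering problem can be illustrated with a small sketch. Note that it swaps in a simple greedy heuristic, whereas the method in [10] uses linear optimization; the input format (one set of falsified constraint names per test point) and the function name are assumptions made for illustration.

```python
def greedy_conflict(falsified_by_point):
    """Pick constraints so that every test point falsifies at least one of them.

    `falsified_by_point[i]` is the set of constraint names falsified at test
    point i. The result is a (not necessarily minimal) conflict set.
    """
    remaining = [set(s) for s in falsified_by_point]
    if any(not s for s in remaining):
        raise ValueError("a test point satisfies all constraints: no conflict exists")
    conflict = set()
    while remaining:
        # Count how many still uncovered test points each constraint covers.
        counts = {}
        for s in remaining:
            for c in s:
                counts[c] = counts.get(c, 0) + 1
        # Greedy choice: the constraint covering the most points (ties by name).
        best = max(sorted(counts), key=counts.get)
        conflict.add(best)
        remaining = [s for s in remaining if best not in s]
    return conflict
```

A greedy choice is fast but may return more constraints than necessary, which is one reason to prefer an optimization-based formulation when small conflicts matter.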

3 Combining Procedures

We have previously discussed subtropical satisfiability (SUBTROP), interval constraint propagation (ICP), and quantifier elimination (QE). Our aim is to combine these in a complete and efficient framework for solving polynomial constraints. Recall that SUBTROP is only an incomplete heuristic, and ICP does not even guarantee termination; ICP might loop forever, for instance in the case of touching spheres. Combining ICP sequentially with other procedures thus requires heuristics for ICP termination to ensure fairness and to let the other algorithms also work on the constraints. Before decomposing a box into smaller ones (line 12 in Algorithm 1), our algorithm checks whether, if bounded boxes have been generated, all the bounded boxes are smaller than a chosen value ε in all dimensions. If so, the algorithm gives up and returns UNKNOWN along with the box B resulting from the contraction of ]−∞,∞[^n from constraints; the algorithm returns the empty box in case of unsatisfiability and the valid box in case of satisfiability. Intuitively, B is an over-approximation of all boxes possibly containing solutions of constraints. We furthermore require that boxes are handled in chronological order, that is, S in Algorithm 1 becomes a queue, where the first box (the oldest one) is chosen and removed, and newer boxes are added at the end. Boxes are decomposed only along axes with lengths greater than ε. The procedure ensures unbounded boxes are not decomposed infinitely. The algorithm terminates either when it detects satisfiability (or unsatisfiability) or when all bounded boxes have all dimensions smaller than ε. We refer to this terminating version of ICP as ICPT.

A lazy combining approach considers the procedures as black boxes and invokes them sequentially to check the satisfiability of the constraints. The fastest procedures to terminate are ordered first, and the complete procedure is called last. In Algorithm 2, SUBTROP(ϕ) returns SAT if and only if subtropical satisfiability succeeds in finding a model for ϕ; remember that subtropical satisfiability is indeed essentially a model finding method. The complete decision procedure (in our case, quantifier elimination methods as implemented in Redlog/Reduce) is called in line 9, and returns SAT (resp. UNSAT) if the input is satisfiable (resp. unsatisfiable). Notice that, when calling the complete decision procedure, the set of constraints ϕ is complemented with a box B found by ICPT. Actually (this is not shown in Algorithm 2), ϕ is furthermore cleaned of the constraints that are valid in the box B.

Algorithm 2 Lazy combination of procedures
 1: function LAZY(ϕ)
 2:     if SUBTROP(ϕ) = SAT then
 3:         return SAT
 4:     end if
 5:     (result, B) ← ICPT(ϕ)
 6:     if result ≠ UNKNOWN then
 7:         return result
 8:     end if
 9:     return QE(ϕ ∧ B)
10: end function
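Algorithm 2 translates almost literally into code once the three procedures are treated as black boxes. In the sketch below they are passed in as plain callables; their signatures and the string results "SAT", "UNSAT" and "UNKNOWN" are assumptions made for illustration, not the interface used in veriT.

```python
def lazy(phi, subtrop, icpt, qe):
    """Lazy combination: fast incomplete procedures first, complete one last.

    subtrop(phi)  -> "SAT" if a model was found, anything else otherwise
    icpt(phi)     -> (result, box) with result in {"SAT", "UNSAT", "UNKNOWN"}
    qe(phi, box)  -> "SAT" or "UNSAT" for phi restricted to the box
    """
    if subtrop(phi) == "SAT":
        return "SAT"
    result, box = icpt(phi)
    if result != "UNKNOWN":
        return result
    # Only now pay for the complete procedure, restricted to the box from ICPT.
    return qe(phi, box)
```

Passing the procedures as arguments keeps the combination independent of how each one is implemented, which mirrors the black-box view taken in the text.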

We experimentally evaluated another combination that interleaves the complete procedure and ICP more tightly. The combination is similar to ideas in [16]: when a box of ICP becomes smaller than a threshold ε, quantifier elimination is called, for the sake of completeness, to solve the remaining unknown constraints over this small box. We found, however, that this performs less well than the lazy combination above. We are investigating further techniques to fix this.

4 Experiments

All experiments were conducted on an Intel Xeon E5-2680v2 at 2.80 GHz running GNU/Linux CentOS 6.4 with kernel 2.6.32. Each solver ran on the 11354 benchmarks of the QF_NRA category (quantifier-free Non-linear Real Arithmetic) of the SMT-LIB (4963 are labeled as satisfiable, 5296 unsatisfiable, 1095 unknown) with a memory limit of 8 GB and a timeout of 2500 seconds for each benchmark.⁵

⁵ See http://www.jaist.ac.jp/~s1520002/veriT+STROPSAT+raSAT+Redlog/ for full results and the tool with the source code. Note to reviewers: this will soon be on Zenodo.


Table 1. Numbers of benchmarks solved by component procedures

Benchmarks       SUBTROP   ICP     QE      All     w/o SUBTROP   w/o ICP   w/o QE
SAT              1936      4302    4400    4450    4433          4436      4313
UNSAT            2530      4472    4959    5012    5012          4959      4472
Total (11354)    4466      8774    9359    9462    9445          9395      8785
Total time (s)   4744      18835   67945   50632   44815         67420     22357

In our implementation, raSAT [12,25] serves as the ICP-based solver, the computer algebra system Redlog/Reduce [4] for quantifier elimination, and STROPSAT [7] as the implementation of subtropical satisfiability. The interface for the three tools within veriT is 900 lines of C code. Notice that CVC4 is used inside STROPSAT to solve linear constraints (see again [7]). The framework for solving polynomial constraints in the SMT solver is called lazily, when the underlying SAT solver has produced a full model and if the corresponding set of literals is not shown unsatisfiable by the linear arithmetic decision procedure in veriT.

Table 1 compares the performance of the distinct procedures and their combination. Notice that QE alone already solves many problems, but combining the procedures brings improvements both in running time and in the number of solved problems. SUBTROP increases the number of solved satisfiable problems and improves times for several satisfiable problems, without a significant negative impact on the problems solved by the other methods: indeed, SUBTROP takes around 2000 additional seconds for six difficult problems also solved by the combination of ICP and QE, and uses 4000 seconds to solve 17 more problems. ICP has a general positive impact both on running times and on the number of problems solved.
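The contribution of each procedure to the full combination can be read off Table 1 by subtracting the corresponding "w/o" column from the "All" column. The short script below does this arithmetic; the numbers are copied from Table 1, while the dictionary layout and function name are only illustrative.

```python
# (SAT, UNSAT) benchmarks solved, copied from Table 1.
solved = {
    "All": (4450, 5012),
    "w/o SUBTROP": (4433, 5012),
    "w/o ICP": (4436, 4959),
    "w/o QE": (4313, 4472),
}


def contribution(procedure):
    """Benchmarks (SAT, UNSAT) that are lost when `procedure` is switched off."""
    all_sat, all_unsat = solved["All"]
    sat, unsat = solved[f"w/o {procedure}"]
    return (all_sat - sat, all_unsat - unsat)


for name in ("SUBTROP", "ICP", "QE"):
    print(name, contribution(name))
```

The 17 additional problems attributed to SUBTROP in the text are all satisfiable ones, which fits its nature as a model finding method.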

Table 2. Performance of state-of-the-art SMT solvers on QF_NRA

Benchmarks       CVC4     SMT-RAT   Z3      Yices    veriT   veriT only   Virtual best
SAT              2929     4398      4905    4845     4450    18           5183
UNSAT            5324     4425      5038    5120     5012    1            5744
Total (11354)    8253     8823      9943    9965     9462    19           10927
Total time (s)   146154   57787     37740   132137   50632   11706        119998

Table 2 compares state-of-the-art SMT solvers with support for non-linear arithmetic. SMT-RAT implements a less lazy combination (as mentioned in the previous section) between interval constraint propagation and cylindrical algebraic decomposition [16]; Yices and Z3 implement the nlsat procedure [11]. CVC4 uses context-dependent simplification [18] and incremental linearization [2]. Our results validate the main point of the paper: a combination of simple heuristics (ICP and subtropical satisfiability) with quantifier elimination as implemented in a computer algebra system (Redlog/Reduce), slightly tuned to fit the SMT infrastructure, is an efficient decision procedure to solve non-linear arithmetic SMT problems. Table 2 also clearly shows that the virtual best solver (a portfolio of all mentioned solvers running in parallel) is much better than each individual solver; we attribute this to the variety of techniques used in the solvers.
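The notion of a virtual best solver can be made concrete with a small sketch: for every benchmark, take the fastest solver among those that solved it. The input layout (a dictionary of per-solver running times, with None for unsolved benchmarks), the function name, and the toy data are assumptions made for illustration and are not taken from the experiments.

```python
def virtual_best(results):
    """Return {benchmark: (solver, time)} using the fastest solver per benchmark.

    `results[solver][benchmark]` is the running time in seconds, or None if
    the solver did not solve the benchmark.
    """
    best = {}
    for solver, times in results.items():
        for benchmark, t in times.items():
            if t is None:
                continue
            if benchmark not in best or t < best[benchmark][1]:
                best[benchmark] = (solver, t)
    return best


# Toy data, not from the paper.
toy = {
    "solverA": {"b1": 1.0, "b2": None, "b3": 7.5},
    "solverB": {"b1": 0.4, "b2": 12.0, "b3": None},
}
vbs = virtual_best(toy)
```

In this toy example each solver alone solves two benchmarks, whereas the virtual best solver solves all three, which is the effect of complementary strengths described in the text.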

5 Conclusion

Implementing state-of-the-art decision procedures for the theory of real non-linear arithmetic requires a lot of expertise and is very time consuming. Fortunately, reusing a computer algebra system actually provides good results on the SMT-LIB. We also notice that, thanks to the diversity of techniques used to tackle this theory, the various SMT solvers have complementary strengths and the virtual best solver is much better than each solver alone.

We expect the non-linear arithmetic reasoner in veriT to improve, following the improvements in the embedded computer algebra system itself. For instance, Redlog/Reduce features a new experimental virtual substitution algorithm which shows very promising results on problems stemming from other computer algebra applications; veriT will eventually benefit from this new algorithm once it becomes stable.

In future work, we plan to investigate a less lazy combination of ICP and quantifier elimination. In our current prototype, quantifier elimination is applied for each box that ICP is unable to discard, one at a time. The idea would be to eliminate several boxes in ICP with a single call to quantifier elimination.

Acknowledgments

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No H2020-FETOPEN-2015-CSA 712689 (SC²), and from the European Research Council under the European Union's Horizon 2020 research and innovation program (grant agreement No. 713999, Matryoshka). Xuan Tung Vu would like to acknowledge the JAIST Off-Campus Research Grant for fully supporting him during his stay at LORIA, Nancy. The work has also been partially supported by the JSPS KAKENHI Grant-in-Aid for Scientific Research (B) (15H02684) and the JSPS Core-to-Core Program (A. Advanced Research Networks).

References

1. Benhamou, F., Granvilliers, L.: Continuous and interval constraints. In: F. Rossi, P.v.B., Walsh, T. (eds.) Handbook of Constraint Programming, pp. 571–604. Elsevier, New York (2006)

2. Cimatti, A., Griggio, A., Irfan, A., Roveri, M., Sebastiani, R.: Satisfiability modulo transcendental functions via incremental linearization. In: de Moura, L. (ed.) CADE 2017. LNCS, vol. 10395, pp. 95–113. Springer (2017)

3. Collins, G.E., Hong, H.: Partial cylindrical algebraic decomposition for quantifier elimination. J. Symb. Comput. 12(3), 299–328 (Sep 1991)

4. Dolzmann, A., Sturm, T.: Redlog: Computer algebra meets computer logic. SIGSAM Bull. 31(2), 2–9 (Jun 1997)

5. Dutertre, B.: Yices 2.2. In: Biere, A., Bloem, R. (eds.) CAV 2014. pp. 737–744. Springer (2014)

6. Florian, C., Ulrich, L., Sebastian, J., Erika, A.: SMT-RAT: An SMT-Compliant nonlinear real arithmetic toolbox. In: Alessandro, C., Roberto, S. (eds.) SAT 2012, pp. 442–448. Springer (2012)

7. Fontaine, P., Ogawa, M., Sturm, T., Vu, X.T.: Subtropical satisfiability. In: Dixon, C., Finger, M. (eds.) FroCoS 2017, pp. 189–206. Springer (2017)

8. Franzle, M., Herde, C., Teige, T., Ratschan, S., Schubert, T.: Efficient solving of large non-linear arithmetic constraint systems with complex boolean structure. JSAT 1, 209–236 (2007)

9. Gao, S., Kong, S., Clarke, E.M.: Satisfiability modulo ODEs. In: FMCAD 2013. pp. 105–112 (2013)

10. Jaroschek, M., Dobal, P.F., Fontaine, P.: Adapting real quantifier elimination methods for conflict set computation. In: Lutz, C., Ranise, S. (eds.) FroCoS 2015, pp. 151–166. Springer (2015)

11. Jovanovic, D., de Moura, L.: Solving non-linear arithmetic. In: Gramlich, B., Miller, D., Sattler, U. (eds.) Automated Reasoning, pp. 339–354. Springer (2012)

12. Khanh, T.V., Ogawa, M.: SMT for polynomial constraints on real numbers. ENTCS 289, 27–40 (2012), TAPAS’ 2012

13. Kosta, M.: New Concepts for Real Quantifier Elimination by Virtual Substitution. Doctoral dissertation, Saarland University, Germany (December 2016)

14. Lasaruk, A., Sturm, T.: Weak integer quantifier elimination beyond the linear case. In: CASC 2007, LNCS, vol. 4770. Springer (2007)

15. Lasaruk, A., Sturm, T.: Weak quantifier elimination for the full linear theory of the integers. A uniform generalization of Presburger arithmetic. Appl. Algebra Eng. Commun. Comput. 18(6), 545–574 (Dec 2007)

16. Loup, U., Scheibler, K., Corzilius, F., Abraham, E., Becker, B.: A symbiosis of interval constraint propagation and cylindrical algebraic decomposition. In: Bonacina, M.P. (ed.) CADE 24, pp. 193–207. Springer (2013)

17. Ratschan, S.: Efficient solving of quantified inequality constraints over the real numbers. ACM Trans. Comput. Log. 7, 723–748 (Oct 2006)

18. Reynolds, A., Tinelli, C., Jovanovic, D., Barrett, C.: Designing theory solvers with extensions. In: Dixon, C., Finger, M. (eds.) FroCoS 2017. LNCS, vol. 10483, pp. 22–40. Springer (2017)

19. Seidl, A.: Cylindrical Decomposition Under Application-Oriented Paradigms. Ph.D. thesis, Universität Passau, 94030 Passau, Germany (Mar 2006)

20. Seidl, A.M., Sturm, T.: Boolean quantification in a first-order context. In: Ganzha, V.G., Mayr, E.W., Vorozhtsov, E.V. (eds.) CASC 2003, pp. 329–345. Institut für Informatik, Technische Universität München, München, Germany (2003)

21. Sturm, T.: Linear problems in valued fields. J. Symb. Comput. 30(2), 207–219 (Aug 2000)22. Sturm, T.: A survey of some methods for real quantifier elimination, decision, and satisfia-

bility and their applications. Math. Comput. Sci. 11(3–4), 483–502 (December 2017)23. Sturm, T., Weispfenning, V.: Quantifier elimination in term algebras. The case of finite lan-

guages. pp. 285–30024. Sturm, T., Zengler, C.: Parametric quantified SAT solving. In: Watt, S.M. (ed.) ISSAC 2010,

pp. 77–84. ACM Press, New York (2010)25. Tung, V.X., Khanh, T.V., Ogawa, M.: rasat: an SMT solver for polynomial constraints. For-

mal Methods in System Design 51(3), 462–499 (2017)26. Weispfenning, V.: The complexity of linear problems in fields. J. Symb. Comput. 5(1–2),

3–27 (Feb–Apr 1988)27. Weispfenning, V.: Quantifier elimination for real algebra—the quadratic case and beyond.

Appl. Algebra Eng. Commun. Comput. 8(2), 85–101 (Feb 1997)

SMT-like Queries in Maple

Stephen A. Forrest

Maplesoft Europe Ltd., Cambridge, [email protected]

Abstract. The recognition that Symbolic Computation tools could benefit from techniques from the world of Satisfiability Checking was a primary motive for the founding of the SC² community. These benefits would be further demonstrated by the existence of “SMT-like” queries in legacy computer algebra systems; that is, computations which seek to decide satisfiability or identify a satisfying witness.

The Maple CAS has been under continuous development since the 1980s and its core symbolic routines incorporate many heuristics. We describe ongoing work to compose an inventory of such “SMT-like” queries extracted from the built-in Maple library, most of which were added long before the inclusion in Maple of explicit software links to SAT/SMT tools. Some of these queries are expressible in the SMT-LIB format using an existing logic, and it is hoped that those that are not could help inform future development of the SMT-LIB standard.

1 Introduction

1.1 Maple

Maple is a computer algebra system originally developed by members of the Symbolic Computation Group in the Faculty of Mathematics at the University of Waterloo. Since 1988, it has been developed and commercially distributed by Maplesoft (formerly Waterloo Maple Inc.), a company based in Waterloo, Ontario, Canada, with ongoing contributions from affiliated research centres. The core Maple language is implemented in a kernel written in C++ and much of the computational library is written in the Maple language, though the system does employ external libraries such as LAPACK and the GNU Multiprecision Library (GMP) for special-purpose computations.

2 The commands is and coulditbe

Consistent with Maple’s roots as a computer algebra system, its core symbolic solvers (such as solve, dsolve, and int) generally aim to provide a general solution to a posed problem which is both compact and useful. Further transformation or simplification of such solutions using simplifiers based on heuristic methods [3] is often necessary.


Nevertheless, the approach of posing queries as questions about satisfiability or requests for a satisfying witness is not unknown in Maple. The most obvious example is in the commands is and coulditbe. These are the standard general-purpose commands in Maple for querying universal and existential properties, respectively, about a given expression [5]. They are widely used by other symbolic commands in Maple (e.g. solve, int).

The is command accepts an expression p and asks if p evaluates to the value true for every possible assignment of values to the symbols in p. The coulditbe command operates similarly but asks if there is any assignment of values to the symbols in p which could cause p to evaluate to true.

Both is and coulditbe return results in ternary logic: true, false, or FAIL. Both also make use of the “assume facility”, which is a system for associating Boolean properties with symbolic variables. This provides limits on the range of possible assignments considered by is and coulditbe and is roughly analogous to a type declaration. For example, the expression is(x^2>=0) evaluates to false because there are many possible values of x for which x^2 is not a nonnegative real number, in particular the imaginary unit $\sqrt{-1}$. By contrast, the expression is(x^2>=0) assuming x::real returns true because the range of possible values of x has been constrained to real numbers.
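The correspondence between these two commands and a satisfiability check can be sketched as follows. This is an illustrative Python sketch and not Maple's implementation: the helper names `smtlib_query` and `interpret` are hypothetical, and the example assumes the query is expressible in a standard SMT-LIB logic (here QF_NRA). A universal query in the style of is asserts the negated formula and looks for `unsat`; an existential query in the style of coulditbe asserts the formula itself; a solver answer of `unknown` plays the role of FAIL.

```python
def smtlib_query(decls, formula, logic, universal):
    """Return an SMT-LIB 2 script for a universal or existential query.

    decls maps symbol names to SMT-LIB sorts and plays the role of the
    assumptions (for example x::real becomes the sort Real).
    """
    lines = [f"(set-logic {logic})"]
    for name, sort in decls.items():
        lines.append(f"(declare-const {name} {sort})")
    # A formula holds for every assignment iff its negation is unsatisfiable.
    body = f"(not {formula})" if universal else formula
    lines.append(f"(assert {body})")
    lines.append("(check-sat)")
    return "\n".join(lines)


def interpret(answer, universal):
    """Map a solver answer to ternary logic; None stands in for FAIL."""
    if answer == "unknown":
        return None
    if universal:
        return answer == "unsat"
    return answer == "sat"


# Analogue of: is(x^2 >= 0) assuming x::real
script = smtlib_query({"x": "Real"}, "(>= (* x x) 0.0)", "QF_NRA", universal=True)
print(script)
```

The generated script can be passed to any SMT-LIB 2 compliant solver; the sketch only builds the text and does not call a solver.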

An illustrative example is found in the function product. In the evaluation of the expression product(f(n),n=a..b), the system seeks to compute a symbolic formula for the product $\prod_{n=a}^{b} f(n)$. As one can verify by inspecting the source code with showstat(product), the implementation of product computes a set of roots of f(n) and, if neither a nor b is infinite, checks whether there exists a root r such that r is an integer and a ≤ r ≤ b. If so, it returns zero as the result of the product. (Similar logic is applied if either of a or b is infinite.)
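The existential check described above can be illustrated with a small Python sketch. It is not the Maple source: the function name `has_integer_root_in_range` and the example polynomial are assumptions made for illustration, and the roots are supplied by hand as exact rationals rather than computed.

```python
from fractions import Fraction
from math import prod


def has_integer_root_in_range(roots, a, b):
    """True if some root r is an integer with a <= r <= b."""
    return any(Fraction(r).denominator == 1 and a <= r <= b for r in roots)


# Hypothetical example: f(n) = (n - 3) * (2*n - 1) has roots 3 and 1/2.
roots = [Fraction(3), Fraction(1, 2)]


def f(n):
    return (n - 3) * (2 * n - 1)


# The root 3 lies in 1..5, so the product over n = 1..5 must vanish.
zero_on_1_to_5 = has_integer_root_in_range(roots, 1, 5)
# No integer root lies in 4..6, so no factor of that product is zero.
zero_on_4_to_6 = has_integer_root_in_range(roots, 4, 6)

print(zero_on_1_to_5, prod(f(n) for n in range(1, 6)))
print(zero_on_4_to_6, prod(f(n) for n in range(4, 7)))
```

The query "is there an integer root r with a ≤ r ≤ b" is exactly the kind of witness-seeking question that a coulditbe call expresses.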

As evidence of the ubiquity of such queries, Table 1 summarizes the distinct invocations of is and coulditbe encountered during a complete run through Maplesoft’s internal test suite for the Maple library performed on 24 April 2018. (An investigation into an earlier version of this dataset was published in [4].) This includes both instances in which the test case explicitly calls is/coulditbe and instances in which is/coulditbe are invoked by other library functions such as product, as shown previously.

Note that in Table 1, whenever logic L2 is an extension of logic L1, the listed results for logic L2 refer only to those queries which are expressible in L2 but not in L1. For example, the 2888 is queries expressible in QF_LIA are not included among the 1449 queries listed under AUFLIRA, even though all of them are expressible in the more general logic.

In total, 24006 distinct is and 5701 distinct coulditbe queries were issued during the course of the test run. The inputs vary considerably in size and in the complexity of the underlying theory, and for both is and coulditbe approximately 11% of queries cannot be decided (i.e. return FAIL rather than true or false). A complete list of queries encountered may be viewed at https://doi.org/10.5281/zenodo.943349.


Description                                                    is      coulditbe

Total expressible in SMT-LIB                                   15572   4690
  Expressible with QF_LIA                                      2888    1686
  Expressible with QF_NIA                                      2129    744
  Expressible with QF_LRA                                      1542    505
  Expressible with QF_NRA                                      284     41
  Expressible with AUFLIRA                                     1449    687
  Expressible with AUFNIRA                                     7230    1027

Total not expressible in SMT-LIB                               8434    1011
  Expressible with complex arithmetic                          4134    565
  Linear arithmetic with Gaussian integers (“QF_LICA”)         258     171
  Nonlinear arithmetic with Gaussian integers (“QF_NICA”)      259     88
  Linear arithmetic with complex numbers (“AUFLIRCA”)          165     32
  Nonlinear arithmetic with complex numbers (“AUFNIRCA”)       2728    207
  All “special” functions                                      3248    441
  Exponential functions and logarithms                         461     76
  RootOf expressions                                           232     40
  RootOf with exponential and trigonometric functions          231     24
  Remaining queries with Boolean structure                     599     5

Total distinct queries                                         24006   5701

Table 1. Distinct is and coulditbe queries encountered in a full library test run

Of the total, 15572 of the is queries and 4690 of the coulditbe queries can be assigned to one of the SMT-LIB2 predefined logics. Of the queries which cannot be so assigned, the reasons include the use of special functions unsupported by SMT-LIB, as well as complex arithmetic.
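As a quick arithmetic check on the totals quoted above, the following Python sketch recomputes the share of queries expressible in SMT-LIB from the figures in Table 1. The dictionary names are illustrative; only the numbers come from the table. The shares come out at roughly 65% for is and 82% for coulditbe.

```python
# Totals copied from Table 1.
total = {"is": 24006, "coulditbe": 5701}
expressible = {"is": 15572, "coulditbe": 4690}
not_expressible = {"is": 8434, "coulditbe": 1011}

shares = {}
for cmd in total:
    # The two top-level categories partition the distinct queries.
    assert expressible[cmd] + not_expressible[cmd] == total[cmd]
    shares[cmd] = 100 * expressible[cmd] / total[cmd]
    print(f"{cmd}: {shares[cmd]:.1f}% expressible in an SMT-LIB logic")
```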

3 Future Work

Recent versions of Maple have seen the addition of explicit links to SAT and SMT solvers: Maple 2018 is distributed with both the SAT solver MapleSAT [1] and the SMT solver Z3 [7]. In future, we aim to examine the effectiveness of using these packaged solvers on SMT instances which arise during evaluation of symbolic expressions.

An important factor in this assessment will be whether this implementation offers better performance and meaningful answers (not FAIL) for a larger class of such queries than existing tools in Maple.

References

1. Hui Liang, Vijay Ganesh. MapleSAT development site. https://sites.google.com/a/gsd.uwaterloo.ca/maplesat/

2. Clark Barrett, Pascal Fontaine, and Cesare Tinelli. The Satisfiability Modulo Theories Library (SMT-LIB), http://www.smt-lib.org, 2016.

3. Jacques Carette. 2004. Understanding Expression Simplification. In Proceedings of the 2004 International Symposium on Symbolic and Algebraic Computation, Santander, Spain (ISSAC 2004), ACM, New York, NY, USA, 72–79. doi:10.1145/1005285.1005298.

4. Stephen A. Forrest. 2017. Integration of SMT-LIB Support into Maple. Second Annual SC² Workshop, ISSAC 2017, Kaiserslautern, Germany. http://www.sc-square.org/CSA/workshop2-papers/EA5-FinalVersion.pdf

5. The Assume Facility in Maple, Maple Online Help: The Assume Facility. https://www.maplesoft.com/support/help/maple/view.aspx?path=assume

6. Logics in SMT-LIB, http://smtlib.org/logics.shtml.

7. de Moura, L. M., and Bjørner, N. Z3: an efficient SMT solver. In TACAS (2008), vol. 4963 of Lecture Notes in Computer Science, Springer, pp. 337–340. https://github.com/Z3Prover/z3.

Techniques for Natural-style Proofs in Elementary Analysis*

work in progress

Tudor Jebelean

RISC-Linz, JKU
www.risc.jku.at

Abstract. Combining methods from satisfiability checking with methods from symbolic computation promises to solve challenging problems in various areas of theory and application. We look at the basically equivalent problem of proving statements directly in a non-clausal setting, when additional information on the underlying domain is available in the form of specific properties and algorithms. We demonstrate on a concrete example several heuristic techniques for the automation of natural-style proving of statements from elementary analysis. The purpose of this work in progress is to generate proofs similar to those produced by humans, by combining automated reasoning methods with techniques from computer algebra. Our techniques include: the S-decomposition method for formulae with alternating quantifiers, quantifier elimination by cylindrical algebraic decomposition, analysis of the behaviour of terms at zero, bounding the ε-bounds, rewriting of expressions involving absolute value, algebraic manipulations, and identification of equal terms under unknown functions. These techniques are being implemented in the Theorema system and are able to construct automatically natural-style proofs for numerous examples including: convergence of sequences, limits and continuity of functions, uniform continuity, and others.

Keywords: Satisfiability Checking · Natural-style Proofs · Symbolic Computation.

1 Introduction

The need for natural¹-style proofs (that is, proofs similar to those produced by humans; see [2]) arises in various applications, for instance in tutorials, demonstrations, and interactive teaching systems. Some authors argue for the use of natural style when the proof system is not completely automatic (e.g. interactive provers) because this facilitates the interaction with the human user.

* Partially supported by project “Satisfiability Checking and Symbolic Computation” (H2020-FETOPN-2015-CSA 712689).

¹ Here we do not mean natural deduction as described e.g. in [7].


When applied to problems over reals, Satisfiability Modulo Theories (SMT) solving combines techniques from Automated Reasoning and from Computer Algebra. From the point of view of Automated Reasoning, proving unsatisfiability of a set of clauses appears to be quite different from producing natural-style proofs. Indeed the proof systems are different (resolution on clauses² vs. some version of sequent calculus³), but they are essentially equivalent, relying on equivalent transformations of formulae. Moreover, the most important steps in first-order proving, namely the instantiations of universally quantified formulae (which in natural-style proofs are also present as the equivalent operation of finding witnesses for existentially quantified goals), are actually the same or very similar. (For an illustration of instantiations in natural-style proofs see [4].) From the point of view of Computer Algebra, finding these instantiations is the most important operation, thus here again one can use the same techniques in SMT solving and natural-style proving. Therefore the techniques can be easily moved from one area to the other, because they are essentially equivalent.

In this paper we present our results on a class of proof problems which arise in elementary analysis. These are problems involving formulae with alternating quantifiers, which are difficult to solve by the purely logical approach, because this requires the use of a large number of formulae which express the necessary properties of numbers (naturals, integers, rationals, reals). We use the following techniques, which extend our previous work [8]:

– the S-decomposition method for formulae with alternating quantifiers [6],

– Quantifier Elimination by Cylindrical Algebraic Decomposition [3],

– analysis of terms behaviour in zero,

– bounding the ε-bounds,

– rewriting of expressions involving absolute value,

– algebraic manipulations: solving, substitution, and simplification,

– identification of equal terms under unknown functions.

Our prover, implemented in the frame of the Theorema system [2], aims at producing natural-style proofs for simple theorems involving limits of sequences and of functions, continuity, uniform continuity, etc. An important aspect of the naturalness of the proof is the fact that the prover does not need to access a large collection of formulae (expressing the properties of the domains involved). Rather, the prover uses symbolic computation techniques from algebra in order to discover relevant terms and to check necessary conditions, and only needs as starting formulae the definitions of the main notions involved.

The prover can run either in interactive mode (the user has a choice of certain techniques at certain points) or fully automatic mode, because this is provided by the Theorema system. When in automatic mode, the proofs tested by us took at most 30 seconds.

² en.wikipedia.org/wiki/Resolution_(logic) (May 2018)

³ www.encyclopediaofmath.org/index.php/Sequent_calculus (May 2018), en.wikipedia.org/wiki/Sequent_calculus (May 2018)


2 Example: Product of Convergent Sequences

We illustrate our method by the proof of the theorem “The product of two convergent sequences is convergent”, which is presented in detail below. The lines labeled [K1], . . . , [K5] are not part of the proof, but only annotations for the purpose of this presentation: they indicate the key steps in the proof where special symbolic methods have to be used.

Note the specific structure of the quantified formulae: the quantifier has a first underline which declares the type of the variable, and possibly a second underline which declares a condition upon the quantified variable. These two conditions have a specific role during the decomposition of the proof using our method (see [1]). For space reasons, in this presentation we do not address the treatment of types. Note also that we follow the convention of Mathematica and Theorema, by denoting function and predicate application by square brackets instead of the traditional round parentheses.

The proof starts from the definition of the notion of convergent sequence and of the product of sequences, and decomposes the theorem into 3 statements: two assumptions (1), (2) and one goal (3).

The main structure of the proof follows from the S-decomposition method (see [6]): the quantifiers are removed from the 3 statements in parallel, using a combination of inference steps which decompose the proof into several branches, introduce Skolem constants⁴, and require special terms (for instantiations or as witnesses). In the background the prover keeps certain quantified formulae which express the general structure of the proof and which are used at certain moments for finding the witnesses and the instantiation terms (as described in [1]).

At the beginning the 3 formulae are existential; in this situation S-decomposition is applied as follows:

– First for the assumptions (1), (2): introduce the Skolem constants $a_1$, $a_2$ instead of the quantified variables; assume that the type conditions under the quantifier hold;

– Second for the goal (3): introduce the witness $a_1 * a_2$ instead of the existentially quantified variable; prove additionally the type condition (not shown in the example) under the quantifier.

As an effect of these transformations the 3 formulae become universal (4), (5), (6), and S-decomposition proceeds as follows:

– First for the goal (6): introduce the Skolem constant $e_0$ instead of the quantified variable; assume that the conditions $e_0 \in \mathbb{R}$ and $e_0 > 0$ under the quantifier hold;

– Second for the assumptions (4), (5): introduce the instantiation term $e = \ldots$ instead of the universally quantified variable; prove additionally the conditions under the quantifier (the proof of the type condition is not shown in the example); and instantiate the assumptions.

⁴ Skolem constants are new constant symbols introduced instead of quantified variables in certain situations: existential assumptions and universal goals.

After these transformations we obtain again 3 existentially quantified formulae and the cycle re-iterates. At every iteration of the proof cycle one needs a witness for the existential goal and an instantiation term for the universal assumptions: these are the difficult steps in the proof, for which we use special proof techniques based on symbolic computation.

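Definition: The sequence $f : \mathbb{N} \to \mathbb{R}$ is convergent iff:

$$\underset{a\in\mathbb{R}}{\exists}\ \underset{\substack{e\in\mathbb{R}\\ e>0}}{\forall}\ \underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |f[n]-a| < e$$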

Theorem: The product of convergent sequences is convergent.

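Proof: We assume:

$$\underset{a\in\mathbb{R}}{\exists}\ \underset{\substack{e\in\mathbb{R}\\ e>0}}{\forall}\ \underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |f_1[n]-a| < e \tag{1}$$

$$\underset{a\in\mathbb{R}}{\exists}\ \underset{\substack{e\in\mathbb{R}\\ e>0}}{\forall}\ \underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |f_2[n]-a| < e \tag{2}$$

and we prove:

$$\underset{a\in\mathbb{R}}{\exists}\ \underset{\substack{e\in\mathbb{R}\\ e>0}}{\forall}\ \underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |(f_1[n]*f_2[n])-a| < e \tag{3}$$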

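By (1), (2) we can take $a_1, a_2 \in \mathbb{R}$ such that:

$$\underset{\substack{e\in\mathbb{R}\\ e>0}}{\forall}\ \underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |f_1[n]-a_1| < e \tag{4}$$

$$\underset{\substack{e\in\mathbb{R}\\ e>0}}{\forall}\ \underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |f_2[n]-a_2| < e \tag{5}$$

[K1] Witness for the existential goal: $a \to a_1 * a_2$

For proving (3) it is sufficient to prove:

$$\underset{\substack{e\in\mathbb{R}\\ e>0}}{\forall}\ \underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |(f_1[n]*f_2[n])-(a_1*a_2)| < e \tag{6}$$

For proving (6) we take $e_0 \in \mathbb{R}$ arbitrary but fixed, we assume:

$$e_0 > 0 \tag{7}$$

and we prove:

$$\underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |(f_1[n]*f_2[n])-(a_1*a_2)| < e_0 \tag{8}$$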


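[K2] Instantiation term for universal assumptions: $e \to \mathrm{Min}[\ldots]$

We consider:

$$e = \mathrm{Min}\left[1, \frac{e_0}{|a_2|+|a_1|+1}\right]$$

First we prove:

$$e > 0 \tag{9}$$

This follows from (7) and elementary properties of $\mathbb{R}$.

Using (9), from (4) and (5) we obtain:

$$\underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |f_1[n]-a_1| < e \tag{10}$$

$$\underset{M\in\mathbb{N}}{\exists}\ \underset{\substack{n\in\mathbb{N}\\ n\ge M}}{\forall}\ \ |f_2[n]-a_2| < e \tag{11}$$

By (10) and (11) we can take $M_1, M_2 \in \mathbb{N}$ such that:

$$\underset{\substack{n\in\mathbb{N}\\ n\ge M_1}}{\forall}\ \ |f_1[n]-a_1| < e \tag{12}$$

$$\underset{\substack{n\in\mathbb{N}\\ n\ge M_2}}{\forall}\ \ |f_2[n]-a_2| < e \tag{13}$$

[K3] Witness for existential goal: $M \to \mathrm{Max}[M_1, M_2]$

In order to prove (8) it suffices to prove:

$$\underset{\substack{n\in\mathbb{N}\\ n\ge \mathrm{Max}[M_1,M_2]}}{\forall}\ \ |(f_1[n]*f_2[n])-(a_1*a_2)| < e_0 \tag{14}$$

For proving (14) we take $n_0 \in \mathbb{N}$ arbitrary but fixed, we assume:

$$n_0 \ge \mathrm{Max}[M_1, M_2] \tag{15}$$

and we prove:

$$|(f_1[n_0]*f_2[n_0])-(a_1*a_2)| < e_0 \tag{16}$$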


[K4] Instantiation term for universal assumptions: $n \to n_0$

First we prove:

$$(n_0 \ge M_1) \wedge (n_0 \ge M_2) \tag{17}$$

This follows from (15) and elementary properties of $\mathbb{R}$.

Using (17), from (12) and (13) we obtain:

$$|f_1[n_0]-a_1| < e \tag{18}$$

$$|f_2[n_0]-a_2| < e \tag{19}$$

[K5] Algebraic manipulations

Using elementary properties of $\mathbb{R}$ we transform (16) into:

$$|a_1*(f_2[n_0]-a_2) + a_2*(f_1[n_0]-a_1) + (f_1[n_0]-a_1)*(f_2[n_0]-a_2)| < e_0 \tag{20}$$

Using elementary properties of $\mathbb{R}$, from (18) and (19) we obtain:

$$\begin{aligned}
&|a_1*(f_2[n_0]-a_2) + a_2*(f_1[n_0]-a_1) + (f_1[n_0]-a_1)*(f_2[n_0]-a_2)| \\
&\quad\le |a_1*(f_2[n_0]-a_2)| + |a_2*(f_1[n_0]-a_1)| + |(f_1[n_0]-a_1)*(f_2[n_0]-a_2)| \\
&\quad= |a_1|*|f_2[n_0]-a_2| + |a_2|*|f_1[n_0]-a_1| + |f_1[n_0]-a_1|*|f_2[n_0]-a_2| \\
&\quad< |a_1|*e + |a_2|*e + e*e \le |a_1|*e + |a_2|*e + e = e*(|a_1|+|a_2|+1) \\
&\quad\le \frac{e_0}{|a_2|+|a_1|+1}*(|a_1|+|a_2|+1) = e_0
\end{aligned} \tag{21}$$

which proves the goal. (The last inequality is an equality whenever the minimum in the definition of $e$ is attained by the second argument.)
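The choice of terms in the proof can be sanity-checked numerically on a concrete instance. The following Python sketch is only an illustration under stated assumptions: the two sequences $f_1[n] = 2 + 1/n$ and $f_2[n] = -3 + 1/n$, the value $e_0 = 1/10$, and the finite test range are hypothetical choices, and a finite check is of course not a proof.

```python
from fractions import Fraction
from math import floor

a1, a2 = Fraction(2), Fraction(-3)   # limits of the two example sequences


def f1(n):
    return a1 + Fraction(1, n)       # f1[n] converges to a1


def f2(n):
    return a2 + Fraction(1, n)       # f2[n] converges to a2


e0 = Fraction(1, 10)                 # the arbitrary but fixed e0 > 0
# Instantiation term from [K2].
e = min(Fraction(1), e0 / (abs(a2) + abs(a1) + 1))

# For these particular sequences |f[n] - a| = 1/n, so 1/n < e holds for
# all n >= floor(1/e) + 1; this value plays the role of M1 and M2.
M1 = M2 = floor(1 / e) + 1
M = max(M1, M2)                      # witness from [K3]

# Check goal (16) on a finite range of indices n0 >= M.
goal_holds = all(
    abs(f1(n) * f2(n) - a1 * a2) < e0 for n in range(M, M + 1000)
)
print(e, M, goal_holds)
```

Exact rational arithmetic is used so that the strict inequalities are not affected by floating-point rounding.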

3 Application of Special Techniques

We describe here how the special techniques mentioned in the introduction are used in the course of the proof presented above, namely at the key steps indicated with [K1], . . . , [K5].

3.1 K1: Witness for Existential Goal

Here the prover must produce the witness $a_1 * a_2$ needed for the existential variable $a$ in the current goal (3). We use the well known technique of metavariables (see also [4]), that is, we replace the existential variable by a new symbol, which is a name for the term which we need to find. This term will be found later, when the prover generates a certain simplified formula (see [8, 1]) which we call formula (A):

$$\underset{a_1,a_2}{\forall}\ \underset{a_0}{\exists}\ \underset{e_0}{\forall}\ \bigl(e_0 > 0 \Rightarrow \underset{e_1,e_2}{\exists}\ (e_1 > 0 \wedge e_2 > 0 \wedge \underset{x_1,x_2}{\forall}\ (|x_1-a_1| < e_1 \wedge |x_2-a_2| < e_2) \Rightarrow |x_1*x_2-a_0| < e_0)\bigr)$$

The value of the metavariable (standing for $a_0$) can be found using Quantifier Elimination (QE) by Cylindrical Algebraic Decomposition (CAD), as described in [1], which works for the case of the sum of convergent sequences. However, in the case of product the corresponding QE problem cannot be solved in a reasonable time by CAD (e.g. in Mathematica), and even if solved, it generates a very complicated expression which cannot be used for finding $a_0$.

In the proof above we used another, much simpler technique: reasoning about the behaviour of terms at zero. It is clear that the formula (A) expresses the behaviour of the polynomials in any (small) vicinity of zero. Since polynomials are continuous, this will also be their behaviour at zero. One can in fact prove that formula (A) is equivalent to the formula:

$$\underset{a_1,a_2}{\forall}\ \underset{a_0}{\exists}\ \underset{x_1,x_2}{\forall}\ (|x_1-a_1| = 0 \wedge |x_2-a_2| = 0) \Rightarrow |x_1*x_2-a_0| = 0$$

In our special case it is immediately clear that $a_0$ equals $a_1 * a_2$, but we implemented a more general method: the two LHS equations are solved for $x_1$, $x_2$, then the values are replaced in the RHS equation, which is then solved for $a_0$.
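For the product example, the three steps of this general method can be traced by hand (a worked illustration, not output of the prover). Solving the two left-hand equations gives

$$|x_1 - a_1| = 0 \wedge |x_2 - a_2| = 0 \;\Longrightarrow\; x_1 = a_1,\ x_2 = a_2,$$

and substituting these values into the right-hand equation and solving for $a_0$ gives

$$|x_1 * x_2 - a_0| = 0 \;\Longrightarrow\; |a_1 * a_2 - a_0| = 0 \;\Longrightarrow\; a_0 = a_1 * a_2.$$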

3.2 K2: Instantiation Term for Universal Assumptions

Here the prover must produce an appropriate term for the instantiation of the assumptions (4) and (5). In the case of sum of sequences this is $e_0/2$ and is relatively easy to guess by a human prover. Similar to [K1], a metavariable is used instead of the unknown term, and this will correspond to the existential variables $e_1$, $e_2$ in formula (A).

Again it is possible to use QE by CAD, by treating the formula (A) after replacing $a_0$ with its value and removal of the quantifiers for $a_0$, $a_1$, $a_2$ (see [1]). This works in the case of sum, but again it does not work satisfactorily in the case of product.

In order to find this witness (we assume that it is the same for $e_1$ and $e_2$), we use algebraic manipulation (solving, substitution, and computation), as well as rewriting of terms under the absolute value function. This is probably the most interesting of the new techniques presented here, and is detailed below at [K5].

3.3 K3: Witness for the Existential Goal

The proof needs a witness for the existential variable $M$ in the goal (8). Similarly to the other key steps, the prover uses a metavariable and produces an appropriate quantified formula whose treatment by QE allows the prover to infer the right term, as described in [1].


3.4 K4: Instantiation of Universal Assumptions

The assumptions (12) and (13) need to be instantiated with appropriate terms for the universally quantified variable $n$. Here we use a special heuristic: identification of equal terms under unknown functions.

Since $f_1$ and $f_2$ are universally quantified in the original formula, and later become arbitrary constants, we do not know anything about their behaviour. In the goal (16), $f_1$ and $f_2$ have argument $n_0$. Therefore it will be possible to use the assumptions (12) and (13) for proving (16) only if $f_1$ and $f_2$ are applied to the same argument. (This corresponds in fact to resolution in first order logic.) In the case of this proof the solution is to set $n$ to $n_0$, but even if the expressions are more complicated one can use equation solving, substitution, and computation in order to find more complicated terms. Moreover, after this instantiation we substitute $f_1[n_0]$ and $f_2[n_0]$ with new arbitrary constants (e.g. $x_1$, $x_2$, respectively): this makes our expressions polynomial and helps in creating the formula (A).

3.5 K5: Algebraic Manipulations

The most challenging part is the automatic generation of the instantiation term needed at step [K2], which is performed by a heuristic combination of solving, substitution, and simplifying, as well as rewriting of expressions under the absolute value function.

Note that the goal (16) has under the absolute value function the expression (call it $E_0$) corresponding to $x_1 * x_2 - a_1 * a_2$. Let us also name the absolute value arguments of the assumptions (18) and (19) as $E_1$ and $E_2$, respectively.

First we use the following heuristic principle: transform the goal expression $E_0$ such that it uses as much as possible $E_1$ and $E_2$, because about those we know that they are small. In order to do this we take new variables $y_1$, $y_2$, we solve the equations $y_1 = E_1$ and $y_2 = E_2$ for $x_1$, $x_2$, we substitute the solutions in $E_0$ and the result simplifies to: $a_1 * y_2 + a_2 * y_1 + y_1 * y_2$. This is the internal representation of the absolute value argument in the goal (20). Note that the transformation from (16) to (20) is relatively challenging even for a human prover.
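Written out by hand (as an illustration of the steps just described, with $E_1 = x_1 - a_1$ and $E_2 = x_2 - a_2$), solving for $x_1$, $x_2$ gives $x_1 = y_1 + a_1$ and $x_2 = y_2 + a_2$, and the substitution into $E_0$ yields

$$E_0 = x_1 * x_2 - a_1 * a_2 = (y_1 + a_1) * (y_2 + a_2) - a_1 * a_2 = a_1 * y_2 + a_2 * y_1 + y_1 * y_2 .$$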

The formula (21) is realized by rewriting of the absolute value expressions. Namely, we apply certain rewrite rules to expressions of the form $|E|$ and their combination, as well as to the metavariable $e$. Every rewrite rule transforms a (sub)term into one which is not smaller, so we are sure to obtain a greater or equal term. The final purpose of these transformations is to obtain a strictly positive ground term $t$ multiplied by the target metavariable (here $e$). Since we need a value for $e$ which fulfils $t * e \le e_0$, we can set $e$ to $e_0/t$. The rewrite rules come from the elementary properties of the absolute value function (e.g. $|u + v| \le |u| + |v|$) and from the principle of bounding the ε-bounds: since we are interested in the behaviour of the expressions in the immediate vicinity of zero, the bounds ($e$, $e_0$, $e_1$, $e_2$) can be bounded from above by any positive value. In the case of product (presented here), we also use the rule $e * e \le e$, that is, we bound $e$ by 1. This is why the final expression of $e$ is the minimum between 1 and the term $e_0/t$ found as above.
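For the product example the rewriting can be summarised as follows (a hand-written recap of the chain in (21), using the variables $y_1$, $y_2$ introduced above with $|y_1| < e$ and $|y_2| < e$):

$$|a_1 * y_2 + a_2 * y_1 + y_1 * y_2| \le |a_1| * |y_2| + |a_2| * |y_1| + |y_1| * |y_2| < |a_1| * e + |a_2| * e + e * e \le (|a_1| + |a_2| + 1) * e ,$$

where the last step uses $e \le 1$. The ground term is therefore $t = |a_1| + |a_2| + 1$, which gives $e = \mathrm{Min}[1, e_0/t]$, the term used at [K2].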

This method of course also works for the case of sum of sequences.

In order to make it work for more complex expressions, namely rational functions, we use a second set of rules which decrease the term: in order to obtain a bound for $U/V$, increase $U$ and decrease $V$. Using this we obtain automatically appropriate bounds for the case of inverse of a sequence and for the case of fraction of two sequences.

Full details of the techniques and of the examples are presented in [5].

3.6 Proving Simple Conditions

At certain places in the proof, the conditions upon certain quantified variables have to be proven. The prover does not display a proof of these simple statements, but just declares them to be consequences of “elementary properties of $\mathbb{R}$”. (Such elementary properties are also invoked when developing formulae (20) and (21).) By “elementary properties” in this context we understand the properties of various constants (like 0, 1), functions (like Min, Max, absolute value, $+$, $-$, $*$, $/$), and predicates (like $=$, $<$, $\le$) over reals and naturals, which are normally studied before the notions of limit, continuity, etc. and which are typically considered prerequisites for working in elementary analysis.

In the background, however, the prover uses Mathematica functions in order to check that these statements are correct. This happens for the subgoal (9), which is treated after the instantiation term is found, by using QE on the formula $\forall e_0\,(e_0 > 0 \Rightarrow e > 0)$ (where $e$ has the found value $\mathrm{Min}[\ldots]$), which returns true in Mathematica. The same procedure is used for the subgoal (17).

4 Conclusion and Further Work

The full automation of proofs in elementary analysis constitutes a very interesting application for the combination of logic and algebraic techniques, which is essentially equivalent to SMT solving (combining satisfiability checking and symbolic computation). Our experiments show that complete and efficient automation is possible by using certain heuristics in combination with complex algebraic algorithms.

Further work includes a systematic treatment of various formulae which appear in textbooks, and extension of the heuristics to more general types of formulae. In this way we hope to address the class of problems which are usually subject to SMT solving.

References

1. Ábrahám, E., Jebelean, T.: Adapting Cylindrical Algebraic Decomposition for Proof Specific Tasks. In: Kusper, G. (ed.) ICAI 2017: 10th International Conference on Applied Informatics (2017), in print

2. Buchberger, B., Jebelean, T., Kutsia, T., Maletzky, A., Windsteiger, W.: Theorema 2.0: Computer-Assisted Natural-Style Mathematics. JFR 9(1), 149–185 (2016)

3. Collins, G.E.: Quantifier elimination for real closed fields by cylindrical algebraic decomposition. In: Automata Theory and Formal Languages. LNCS, vol. 33, pp. 134–183. Springer (1975)

4. Dramnesc, I., Jebelean, T.: Synthesis of list algorithms by mechanical proving. Journal of Symbolic Computation 69, 61–92 (2015)

5. Jebelean, T.: Experiments with Automatic Proofs in Elementary Analysis. Tech. Rep. 18-06, Research Institute for Symbolic Computation (RISC), Johannes Kepler University Linz (April 2018)

6. Jebelean, T., Buchberger, B., Kutsia, T., Popov, N., Schreiner, W., Windsteiger, W.: Automated Reasoning. In: Buchberger, B., et al. (eds.) Hagenberg Research, pp. 63–101. Springer (2009)

7. Pelletier, F.J.: A history of natural deduction and elementary logic textbooks. Logical consequence: Rival approaches 1, 105–138 (2000)

8. Vajda, R., Jebelean, T., Buchberger, B.: Combining Logical and Algebraic Techniques for Natural Style Proving in Elementary Analysis. Mathematics and Computers in Simulation 79(8), 2310–2316 (April 2009)
