automated theorem proving presentation by kenny pearce cse 391: artificial intelligence april 22,...

Automated Theorem Proving
Presentation by Kenny Pearce
CSE 391: Artificial Intelligence
April 22, 2005

The State of the Art
OTTER
  Organized Techniques for Theorem-Proving and Effective Research
  Proves theorems in first-order logic with equality using resolution-based techniques
  Written in C for Unix
  Developed at Argonne National Lab
  http://www-unix.mcs.anl.gov/AR/otter/

Resolution
Russell and Norvig 7.5
Consider this problem (to which we will return later):
  Axiom Set: {(L&W), (L->(~F))}
  Theorem: (~F)
Standard Proof:
  (L&W) ⊢ L (& Elimination)
  {(L->(~F)), L} ⊢ (~F) (Modus Ponens)
  QED

Resolution
Russell and Norvig 7.5
Resolution Proof:
  Axiom Set: {(L&W), (L->(~F))}
  Theorem: (~F)
First, translate into conjunctive normal form, with the theorem negated:
  (L&W&(~L|~F)&F)
Resolving L against (~L|~F) cancels L and ~L, leaving ~F:
  (L&W&~F&F)
Resolving ~F against F cancels both and yields an empty clause, therefore the argument is valid.
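The refutation above can be sketched in a few lines of Python. This is a minimal illustration, not OTTER's algorithm and not part of the presented program: clauses are assumed to be frozensets of string literals such as "L" and "~L", and the helper names (negate, resolve, refutes) are hypothetical.

```python
from itertools import combinations

def negate(lit):
    """Return the complement of a literal written as 'L' or '~L'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return every resolvent of two clauses (frozensets of literals)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            # Drop the complementary pair and merge what is left.
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

def refutes(clauses):
    """True if the empty clause can be derived from the clause set."""
    known = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolve(c1, c2):
                if not r:          # empty clause: contradiction found
                    return True
                new.add(r)
        if new <= known:           # nothing new: no refutation exists
            return False
        known |= new

# CNF from the slide: L & W & (~L | ~F) & F  (theorem ~F negated to F)
cnf = [frozenset({"L"}), frozenset({"W"}),
       frozenset({"~L", "~F"}), frozenset({"F"})]
print(refutes(cnf))
```

Dropping the negated theorem (the last clause) leaves a satisfiable set, so the same function reports that no empty clause can be derived from the axioms alone.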

OTTER Successes
  Found a shorter single axiom for lattice theory
  Found several shorter single axioms for ternary boolean algebra
  Proved that all Robbins algebras are boolean
  Found a shorter axiom system for two-valued sentential calculus
  More at http://www-unix.mcs.anl.gov/AR/new_results/

Is OTTER an AI Theorem Prover?
  Uses Resolution
  Guided by Heuristics
  BUT relative weights of heuristics, restrictions, etc. must be customized by a human for each problem
  The human has to have a good idea of what kinds of proof techniques will be useful on this problem; OTTER can't figure that out on its own.

My Approach: A Computer That Does Logic Like a Human
  Resolution is not the most effective proof procedure for humans. It is easy to do, but it is time-consuming.
  Instead, humans use truth trees or derivation systems.
  These proof systems are better for humans because we are good at deciding which of several possible next steps will be most useful; an AI with a sufficiently powerful heuristic should perform better with one of these systems for the same reason humans do.
  My program implements a subset of the derivation system found in The Logic Book by Merrie Bergmann, James Moor, and Jack Nelson.

Bergmann et al.'s Sentential Derivation System (SD)
SD Rules:
  Reiteration (P ⊢ P)
  Introduction and Elimination Rules for Each Operator
Rules I Implemented:
  Those not requiring sub-derivations, namely:
    & Elimination
    ~ Elimination
    & Introduction
    Modus Ponens (-> and <-> Elimination)
    -> Introduction
    <-> Introduction
    | Introduction
    <-> Elimination
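To make the idea of rules as search steps concrete, here is a minimal sketch of two of the rules above (& Elimination and Modus Ponens) acting on a derivation state. It is not the presented program: sentences are assumed to be plain strings for atoms and tuples such as ("&", "L", "W") for complex sentences, and the names successors and prove are hypothetical.

```python
from collections import deque

def successors(state):
    """Yield (action, new_state) pairs for two non-sub-derivation rules."""
    for s in state:
        if isinstance(s, tuple) and s[0] == "&":
            # & Elimination: from (P & Q) derive P, and derive Q.
            for part in (s[1], s[2]):
                if part not in state:
                    yield "& Elim", state + (part,)
        if isinstance(s, tuple) and s[0] == "->" and s[1] in state:
            # Modus Ponens: from (P -> Q) and P derive Q.
            if s[2] not in state:
                yield "-> Elim", state + (s[2],)

def prove(axioms, goal):
    """Breadth-first search over derivation states; returns the derivation."""
    frontier = deque([tuple(axioms)])
    seen = {frozenset(axioms)}
    while frontier:
        state = frontier.popleft()
        if goal in state:
            return state
        for _action, nxt in successors(state):
            key = frozenset(nxt)   # ignore the order sentences were derived in
            if key not in seen:
                seen.add(key)
                frontier.append(nxt)
    return None

ax1 = ("&", "L", "W")
ax2 = ("->", "L", ("~", "F"))
proof = prove([ax1, ax2], ("~", "F"))
print(proof)
```

On the running example the search returns the axioms followed by L and ~F, the same two-step derivation as the standard proof. Each state is the tuple of sentences derived so far, so a rule application simply appends one sentence.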

Implementation Details: The Sentence Class (Sentence.py)

    class Sentence:
        def __init__(self, rep=None, sent1=None, op=None, sent2=None):
            if op is None:
                # Atomic sentence: only the string representation is stored.
                self.op = None
                self.sent1 = None
                self.sent2 = None
                self.rep = rep
            elif op == "&" or op == "|" or op == "->" or op == "<->":
                # Binary operators take two sub-sentences.
                self.op = op
                self.sent1 = sent1
                self.sent2 = sent2
                self.rep = None
            elif op == "~":
                # Negation takes a single sub-sentence.
                self.op = op
                self.sent1 = sent1
                self.sent2 = None
                self.rep = None
            else:
                raise Sentence.OpException()

Implementation Details: The Sentence Class (Sentence.py)

        def negate(self):
            """Returns the negation of this sentence"""
            return newComplexSentence(self, "~")

        def conjoin(self, other):
            """Returns the conjunction of this Sentence with the Sentence other"""
            return newComplexSentence(self, "&", other)

        def disjoin(self, other):
            """Returns the disjunction of this Sentence with the Sentence other"""
            return newComplexSentence(self, "|", other)

        def cond(self, other):
            """Returns the Sentence "self -> other"."""
            return newComplexSentence(self, "->", other)

        def bicond(self, other):
            """Returns the Sentence "self <-> other"."""
            return newComplexSentence(self, "<->", other)

Implementation Details: The Proof Class (Proof.py)

    class Proof(Problem):
        def path_cost(self, c, state1, action, state2):
            # Every step costs at least 1.
            addCost = 1
            # Penalize a newly derived sentence that is more complex than the goal.
            if state2[len(state2)-1].complexity() > self.goal.complexity():
                addCost = addCost + (state2[len(state2)-1].complexity() - \
                                     self.goal.complexity()) * 2
            # Penalize repeated atoms and atoms that do not occur in the goal.
            S = state2[len(state2)-1].atoms()
            dups = 0
            for s in S:
                dups = dups + S.count(s) - 1
                if s not in self.goal.atoms():
                    addCost = addCost + 1
            addCost = addCost + dups
            # Introduction rules are more expensive than elimination rules.
            if action.find("Intro") != -1:
                addCost = addCost + 3
            # Discount rules for operators that appear in the goal.
            for op in self.goal.ops():
                if(action.startswith(op)):
                    if(addCost > 1):
                        addCost = addCost - 1
                    else:
                        addCost = 1
            return addCost + c

Implementation Details: The Proof Class (Proof.py)

        def goal_test(self, state):
            return (self.goal in state or self.goal.negate() in state)

        def missing_pieces(self, thm, state):
            """Helper function that counts the number of subcomponents of
            thm not in state. For use in recursive calls (see component_h
            below)"""
            if(thm in state):
                return 0
            if(thm.sent1 == None and thm.sent2 == None):
                return 1
            count = 1
            if(thm.sent1 != None):
                count = count + self.missing_pieces(thm.sent1, state)
            if(thm.sent2 != None):
                count = count + self.missing_pieces(thm.sent2, state)
            return count

        def component_h(self, n):
            """A heuristic that counts the number of subcomponents of the
            goal theorem which are not in the state"""
            val = self.missing_pieces(self.goal, n.state)
            return val
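A worked example may help show how the component count shrinks as a proof advances. The sketch below re-implements the same counting as a standalone function, under the assumption that atoms are strings and complex sentences are tuples such as ("&", "P", "T"); this encoding and the variable names are illustrative only and stand in for the Sentence class.

```python
def missing_pieces(thm, state):
    """Count the subcomponents of thm (including thm itself) not in state."""
    if thm in state:
        return 0
    if isinstance(thm, str):       # atomic sentence with no sub-sentences
        return 1
    # The sentence itself is missing, plus whatever is missing beneath it.
    return 1 + sum(missing_pieces(sub, state) for sub in thm[1:])

# Problem P1: axioms {(P&C), (T&M), (E&(I&R))}, theorem ((P&T)&R)
axioms = (("&", "P", "C"), ("&", "T", "M"), ("&", "E", ("&", "I", "R")))
p_and_t = ("&", "P", "T")
goal = ("&", p_and_t, "R")

h_start = missing_pieces(goal, axioms)                              # 5
h_atoms = missing_pieces(goal, axioms + ("P", "T", "R"))            # 2
h_inner = missing_pieces(goal, axioms + ("P", "T", "R", p_and_t))   # 1
print(h_start, h_atoms, h_inner)
```

The count is not always a lower bound on the steps remaining. With goal ~F and a state containing (L->~F) and L, it returns 2 (for ~F and F) although a single Modus Ponens step finishes the proof.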

Implementation Details: Defining a Problem

p2.py

    from Proof import *

    Ax1 = Sentence("L").conjoin(Sentence("W"))
    Ax2 = Sentence("L").cond(Sentence("F").negate())
    thm = Sentence("F").negate()
    p2 = Proof((Ax1,Ax2), thm)

Console

    >>> from p2 import *
    >>> p2.initial
    ((L & W), (L -> ~F))
    >>> p2.goal
    ~F

Does It Work?
Yes!
Problems Solved:
P1 (A* Only)
  Axioms: {(P&C), (T&M), (E&(I&R))}
  Theorem: ((P&T)&R)
  Solution: (P&C), (T&M), (E&(I&R)), P, T, (I&R), R, (P&T), ((P&T)&R)
P2 (The one we looked at earlier)
  Solution: (L&W), (L->~F), L, ~F
P3
  Axioms: {(K<->(~E&Q)), K}
  Theorem: Q
  Solution: (K<->(~E&Q)), K, (~E&Q), Q
P4 (A* Only)
  Axioms: {(P->(S<->(A&B))), (P&S)}
  Theorem: B
  Solution: (P->(S<->(A&B))), (P&S), S, P, (S<->(A&B)), (A&B), B

Does It Work?
Yes!
Problems Solved:
P5 (A* Only)
  Axioms: {(S&(S->E)), (C<->E), (C<->G)}
  Theorem: G
  Solution: (S&(S->E)), (C<->E), (C<->G), S, (S->E), E, C, G
P6 (A* Only)
  Axioms: {((A&(B&C))&D), ((B&C)->E)}
  Theorem: E
  Solution: ((A&(B&C))&D), ((B&C)->E), (A&(B&C)), (B&C), E
P7
  Axioms: {A, C, (C->D)}
  Theorem: (A&D)
  Solution: A, C, (C->D), D, (A&D)

Efficiency Comparison
P2
  Axioms: {(L&W), (L->~F)}
  Theorem: ~F
P3
  Axioms: {(K<->(~E&Q)), K}
  Theorem: Q
P7
  Axioms: {A, C, (C->D)}
  Theorem: (A&D)
Format:
  Searcher                      P2                     P3                        P7
  breadth_first_graph_search    < 50/ 51/3221/((L >    < 322/ 323/24703/((K >
  iterative_deepening_search    < 3/ 54/ 91/((L >      < 9/ 332/ 358/((K >       < 39/2982/3051/(A, >
  astar_search                  < 8/ 9/ 345/((L >      < 22/ 23/ 967/((K >       < 2/ 3/ 125/(A, >
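The figures above can be compared numerically with a short script. The slide does not spell out what each field means; this sketch assumes the cells follow the layout printed by the AIMA search code's compare_searchers (successor calls / goal tests / states generated / first characters of the goal state found). The names parse_cell and states_ratio are hypothetical helpers.

```python
import re

# Assumed cell layout: <successor calls/goal tests/states generated/found prefix>
CELL = re.compile(r"<\s*(\d+)/\s*(\d+)/\s*(\d+)/(.*)>")

def parse_cell(cell):
    """Split one table cell into its three integer counters."""
    match = CELL.fullmatch(cell.strip())
    if match is None:
        raise ValueError("unrecognized cell: %r" % cell)
    succs, goal_tests, states = (int(g) for g in match.groups()[:3])
    return succs, goal_tests, states

# Cells copied from the table; the breadth-first P7 entry is absent there.
table = {
    "breadth_first_graph_search": {"P2": "< 50/ 51/3221/((L >",
                                   "P3": "< 322/ 323/24703/((K >"},
    "iterative_deepening_search": {"P2": "< 3/ 54/ 91/((L >",
                                   "P3": "< 9/ 332/ 358/((K >",
                                   "P7": "< 39/2982/3051/(A, >"},
    "astar_search": {"P2": "< 8/ 9/ 345/((L >",
                     "P3": "< 22/ 23/ 967/((K >",
                     "P7": "< 2/ 3/ 125/(A, >"},
}

def states_ratio(problem, slower, faster):
    """Ratio of the third counter between two searchers on one problem."""
    return parse_cell(table[slower][problem])[2] / parse_cell(table[faster][problem])[2]

for problem in ("P2", "P3"):
    ratio = states_ratio(problem, "breadth_first_graph_search", "astar_search")
    print(problem, round(ratio, 1))
```

Under that reading, breadth-first graph search generates roughly 9 times as many states as A* on P2 and roughly 25 times as many on P3.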

Conclusions: Automated Theorem Proving
  Computers are good at it!
  But not as good as people...

  Resolution works best right now
  But with better heuristics other methods may overtake it
  Having more rules requires fewer steps, but more code and more memory, and there's more opportunity for the prover to go down the wrong track.
  Even with only very simple heuristics, a computer can solve all the logic problems on your 260 homework. Hmm...

Bibliography

Argonne National Laboratory. Automated Reasoning at Argonne. [Cited 21 April 2005]. Available Online (http://www-unix.mcs.anl.gov/AR/).

Bergmann, Merrie, James Moor, and Jack Nelson. 1998. The Logic Book. New York: McGraw-Hill.

Boyer, Robert S. and J. Strother Moore. July 1990. A Theorem Prover for a Computational Logic. Keynote Address, 10th Conference on Automated Deduction. Available Online (http://www.cs.utexas.edu/users/boyer/cade90.pdf).

McCune, William. 2003. OTTER 3.3 Reference Manual. Argonne, IL: Argonne National Laboratory. Available Online (http://www-unix.mcs.anl.gov/AR/otter/otter33.pdf).

Norton, L. M. 1971. Experiments with a heuristic theorem proving program for the predicate calculus with equality. Artificial Intelligence 2:261-284.

Russell, Stuart and Peter Norvig. 2003. Artificial Intelligence: A Modern Approach. Upper Saddle River, New Jersey: Pearson Education, Inc.
