KNOWLEDGE REPRESENTATION & REASONING - SAT
SAT
Problem Definition
KR with SAT
Tractable Subclasses
DPLL Search Algorithm
Slides by: Florent Madelaine, Roberto Sebastiani, Edmund Clarke, Sharad Malik, Toby Walsh, Kostas Stergiou
Material of lectures on SAT
- SAT definitions
- Tractable subclasses
  - Horn-SAT
  - 2-SAT
- CNF
- Algorithms for SAT
  - DPLL-based
    - Basic chronological backtracking algorithm
    - Branching heuristics
    - Look-ahead (propagation)
    - Backjumping and learning
  - Local Search
    - GSAT
    - WalkSAT
  - Other enhancements
- Application of SAT
  - Planning as satisfiability
  - Hardware verification
What is SAT?
SATisfying assignment!
Given a propositional formula in Conjunctive Normal Form (CNF), find an assignment to Boolean variables that makes the formula true:
c1 = (x2 ∨ x3)
c2 = (¬x1 ∨ ¬x4)
c3 = (¬x2 ∨ x4)
A = {x1=0, x2=1, x3=0, x4=1}
Why do we study SAT?
Fundamental problem from a theoretical point of view
- NP-completeness: the first problem to be proved NP-complete (Cook's theorem)
- Reductions from SAT are often used to prove NP-completeness of other problems
- Studies on tractability
Numerous applications:
- CAD, VLSI
- Combinatorial optimization
- Bounded model checking and other types of formal software and hardware verification
- AI, planning, automated deduction
Representing knowledge using SAT
Embassy ball (a diplomatic problem)
King wants to invite PERU or exclude QATAR
Queen wants to invite QATAR or ROMANIA
King wants to exclude ROMANIA or PERU
Who can we invite?
(P ∨ ¬Q) ∧ (Q ∨ R) ∧ (¬R ∨ ¬P)
is satisfied by P=true, Q=true, R=false
and by P=false, Q=false, R=true
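The two models of the embassy ball problem can be checked by enumerating all eight truth assignments. This is a minimal sketch: the function name `embassy` is illustrative only, and the clause signs follow the wording of the three wishes (invite = true, exclude = false).

```python
from itertools import product

def embassy(p, q, r):
    # (P or not Q) and (Q or R) and (not R or not P)
    return (p or not q) and (q or r) and (not r or not p)

# Keep every assignment (P, Q, R) that satisfies all three wishes.
models = [m for m in product([True, False], repeat=3) if embassy(*m)]
print(models)
```

Brute-force enumeration is only feasible for a handful of variables, which is why the search algorithms later in the lecture matter.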
Other applications of SAT
Hardware verification
S = Cin ⊕ (P ⊕ Q), …
Formulation of a famous problem as SAT: k-Coloring
The k-Coloring problem: given an undirected graph G(V,E) and a natural number k, is there an assignment color: V → {1, …, k} such that color(u) ≠ color(v) for every edge (u,v) ∈ E?
Formulation of a famous problem as SAT: k-Coloring
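x_{i,j} = node i is assigned the 'color' j (1 ≤ i ≤ n, 1 ≤ j ≤ k)
Constraints:
i) At least one color to each node: (x_{1,1} ∨ x_{1,2} ∨ … ∨ x_{1,k}) ∧ …
$\bigwedge_{i=1}^{n} \Big( \bigvee_{j=1}^{k} x_{i,j} \Big)$
ii) At most one color to each node:
$\bigwedge_{i=1}^{n} \bigwedge_{j=1}^{k-1} \bigwedge_{t=j+1}^{k} \big( x_{i,j} \rightarrow \neg x_{i,t} \big)$
iii) Coloring constraints:
$\bigwedge_{i=1}^{n} \bigwedge_{j=1}^{n} \bigwedge_{c=1}^{k} \big( (i,j) \in E \rightarrow (x_{i,c} \rightarrow \neg x_{j,c}) \big)$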
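The three constraint groups can be generated mechanically. The sketch below is an illustration under stated assumptions: clauses are lists of signed integers (a negative number is a negated variable), the helper names `var`, `coloring_cnf` and `satisfiable` are made up for this example, and the brute-force check is only there to test tiny graphs.

```python
from itertools import product

def var(i, j, k):
    """Index of x_{i,j} (node i has color j), numbered from 1."""
    return (i - 1) * k + j

def coloring_cnf(n, edges, k):
    """CNF clauses for k-coloring a graph with nodes 1..n."""
    clauses = []
    for i in range(1, n + 1):
        # i) at least one color per node
        clauses.append([var(i, j, k) for j in range(1, k + 1)])
        # ii) at most one color per node: x_{i,j} -> not x_{i,t}
        for j in range(1, k):
            for t in range(j + 1, k + 1):
                clauses.append([-var(i, j, k), -var(i, t, k)])
    # iii) adjacent nodes never share a color
    for (i, j) in edges:
        for c in range(1, k + 1):
            clauses.append([-var(i, c, k), -var(j, c, k)])
    return clauses

def satisfiable(clauses, num_vars):
    """Brute-force check, for tiny instances only."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in cl) for cl in clauses):
            return True
    return False

triangle = [(1, 2), (2, 3), (1, 3)]
print(satisfiable(coloring_cnf(3, triangle, 2), 6))
print(satisfiable(coloring_cnf(3, triangle, 3), 9))
```

A triangle needs three colors, so the first call should report unsatisfiable and the second satisfiable.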
SAT Notation
Boolean formula:
- T and F are formulas
- A propositional atom (variable) is a formula
- If φ1 and φ2 are formulas then ¬φ1, φ1 ∧ φ2, φ1 ∨ φ2, φ1 → φ2, φ1 ↔ φ2 are formulas
Atoms(φ): the set of atoms appearing in φ
Literal: either an atom p (positive literal) or its negation ¬p (negative literal)
- p and ¬p are complementary literals
Clause: a disjunction L1 ∨ … ∨ Ln, n ≥ 0, of literals
- Empty clause when n = 0 (the empty clause is false in every interpretation)
- Unit clause when n = 1
SAT Notation
Total truth assignment μ for φ: μ: Atoms(φ) → {T, F}
Partial truth assignment μ for φ: μ: A → {T, F}, A ⊂ Atoms(φ)
Set and formula representation of an assignment:
- μ can be represented as a set of literals, e.g. {μ(A1) = T, μ(A2) = F} => {A1, ¬A2}
- μ can be represented as a formula, e.g. {μ(A1) = T, μ(A2) = F} => A1 ∧ ¬A2
- both representations are used for sets of clauses (formulas)
SAT Notation
μ |= φ (μ satisfies φ):
- μ |= Ai ⟺ μ(Ai) = T
- μ |= ¬φ ⟺ not μ |= φ
- μ |= φ1 ∧ φ2 ⟺ μ |= φ1 and μ |= φ2
- ...
φ is satisfiable iff μ |= φ for some μ
φ1 |= φ2 (φ1 entails φ2) iff for every μ, μ |= φ1 => μ |= φ2
|= φ (φ is valid) iff for every μ, μ |= φ
- what does this mean for ¬φ?
SAT Notation
φ1 and φ2 are equivalent iff for every μ, μ |= φ1 iff μ |= φ2
φ1 and φ2 are equisatisfiable iff exists μ1 s.t. μ1 |= φ1 iff exists μ2 s.t. μ2 |= φ2
If φ1 and φ2 are equivalent then they are also equisatisfiable but the opposite does not hold
Example: φ1 ∨ φ2 and (φ1 ∨ ¬l) ∧ (l ∨ φ2), where l does not occur in φ1 ∨ φ2, are equisatisfiable but not equivalent
Conjunctive Normal Form (CNF)
A formula A is in conjunctive normal form, or simply CNF, if it is either T, or F, or a conjunction of disjunctions of literals:
$\bigwedge_{i} \Big( \bigvee_{j} x_{i,j} \Big)$, where each x_{i,j} is a literal
(That is, a conjunction of clauses.)
A formula B is called a conjunctive normal form of a formula A if B is equivalent to A and B is in conjunctive normal form.
Conjunctive Normal Form
Every sentence in propositional logic can be transformed into conjunctive normal form
i.e. a conjunction of disjunctions
Simple Algorithm
- Eliminate → using the rule that (p → q) is equivalent to (¬p ∨ q)
- Use de Morgan's laws so that negation applies to literals only
- Distribute ∨ over ∧ to write the result as a conjunction of disjunctions
Conjunctive Normal Form - Example
(p → q) → (r → p)
Eliminate implication signs: ¬(¬p ∨ q) ∨ (¬r ∨ p)
Apply de Morgan's laws: (p ∧ ¬q) ∨ (¬r ∨ p)
Apply associative and distributive laws: (p ∨ ¬r ∨ p) ∧ (¬q ∨ ¬r ∨ p), which simplifies to (p ∨ ¬r) ∧ (¬q ∨ ¬r ∨ p)
Tractable Subclasses
SAT is NP-complete, therefore it is hard to solve in general!
Question: In what ways can we restrict the expressiveness of SAT in order to achieve tractability?
Answer:
- Horn-SAT
- 2-SAT
Algorithms for SAT
The study of algorithms for SAT dates back to 1960!
- one of the most widely studied NP-complete problems
There are five general approaches to SAT solving:
- Resolution-based (DP)
- Complete Search (DPLL)
- Decision Diagrams
- Incomplete Local Search
- Stålmarck's algorithm (breadth-first search)
most widely used in practice
and the ones we will study
Algorithms for SAT
How do we test if a problem is SAT or not?
Complete methods
- Return "Yes" if SATisfiable
- Return "No" if UNSATisfiable
Incomplete methods
- If they return "Yes", the problem is SATisfiable
- Otherwise (timeout/run forever), the problem can be SAT or UNSAT
Algorithms for SAT
The first algorithm was based on resolution (Davis & Putnam, 1960)
- exponential space complexity: memory explosion!
The second algorithm was based on search (Davis, Logemann, Loveland, 1962)
- usually referred to as DPLL (although Putnam was not involved)
- still the basis of most modern complete SAT solvers
Some early DPLL-based SAT solvers: Tableau (NTAB), POSIT, 2cl, CSAT
- not used any more (many orders of magnitude slower than modern solvers)
Davis-Putnam Algorithm
Existential abstraction using resolution
Iteratively select a variable for resolution till no more variables are left.
Example 1:
(a ∨ b ∨ c) ∧ (b ∨ -c ∨ f) ∧ (-b ∨ e)
∃b: (a ∨ c ∨ e) ∧ (-c ∨ e ∨ f)
∃b,c: (a ∨ e ∨ f)
∃b,c,a,e,f: T, so SAT
Example 2:
(a ∨ b) ∧ (a ∨ -b) ∧ (-a ∨ c) ∧ (-a ∨ -c)
∃b: (a) ∧ (-a ∨ c) ∧ (-a ∨ -c)
∃b,a: (c) ∧ (-c)
∃b,a,c: ( ), so UNSAT
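The elimination loop can be sketched in a few lines. This is an illustrative sketch, not the original 1960 procedure in full: clauses are sets of signed integers, the function name `dp_satisfiable` is made up here, the variable to eliminate is picked arbitrarily, and tautological resolvents are dropped.

```python
def dp_satisfiable(clauses):
    """Davis-Putnam style check: eliminate variables one by one by resolution."""
    clauses = {frozenset(c) for c in clauses}
    while clauses:
        if frozenset() in clauses:
            return False                      # empty clause derived: UNSAT
        var = abs(next(iter(next(iter(clauses)))))   # any remaining variable
        pos = [c for c in clauses if var in c]
        neg = [c for c in clauses if -var in c]
        rest = {c for c in clauses if var not in c and -var not in c}
        for p in pos:                         # all resolvents on var
            for n in neg:
                r = (p - {var}) | (n - {-var})
                if not any(-l in r for l in r):   # skip tautologies
                    rest.add(r)
        clauses = rest                        # var no longer occurs
    return True                               # no clauses left: SAT

# The two slide examples, with a=1, b=2, c=3, e=4, f=5.
print(dp_satisfiable([[1, 2, 3], [2, -3, 5], [-2, 4]]))
print(dp_satisfiable([[1, 2], [1, -2], [-1, 3], [-1, -3]]))
```

The set `rest` can grow quadratically at every elimination step, which is the memory explosion the lecture attributes to this approach.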
DPLL Solvers
DPLL-based solvers are relatively small pieces of software
- a few thousand lines of code
- but they involve quite complex algorithms and heuristics
The evolution of SAT solvers into the modern ultra-fast tools that can tackle large (and huge) real problems is based on the following enhancements of DPLL:
- preprocessing
- advanced propagation/deduction techniques for look-ahead and preprocessing
- sophisticated branching heuristics
- very detailed and fast implementations + smart memory management
- backjumping and learning methods
(in increasing order of importance?)
DPLL
status = preprocess();                      // preprocessing
if (status != UNKNOWN) return status;
while (1) {
    decide_next_branch();                   // branching heuristics
    while (true) {
        status = deduce();                  // propagation/deduction
        if (status == CONFLICT) {
            blevel = analyze_conflict();    // backjumping/learning
            if (blevel == 0) return UNSATISFIABLE;
            else backtrack(blevel);
        }
        else if (status == SATISFIABLE) return SATISFIABLE;
        else break;
    }
}
DPLL is traditionally described in a recursive way
We will use this modern iterative description due to Zhang and Malik
Unit Propagation
Unit propagation (UP) is the core deduction method used by all DPLL-based solvers
- a clause is called unit if all but one of its literals have been assigned false and the remaining literal is unassigned (i.e. it effectively consists of a single literal)
- UP repeatedly applies unit resolution (i.e. it resolves unit clauses)
- Let us look at an example
The efficient implementation of UP is of primary importance in a SAT solver
- most of the time is spent on doing UP!!!
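A naive version of unit propagation is easy to write down. The sketch below assumes clauses are lists of signed integers and the assignment is a dict from variable to bool; the name `unit_propagate` and the status strings are illustrative. It rescans every clause on each pass, which is exactly the cost that the engineering techniques later in the lecture try to avoid.

```python
def unit_propagate(clauses, assignment=None):
    """Return (status, assignment), with status 'CONFLICT' or 'OK'."""
    assignment = dict(assignment or {})
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                       # clause already satisfied
            free = [l for l in clause if abs(l) not in assignment]
            if not free:
                return "CONFLICT", assignment  # every literal is false
            if len(free) == 1:                 # unit clause: force its literal
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return "OK", assignment

print(unit_propagate([[1], [-1, 2], [-2, 3]]))
print(unit_propagate([[1], [-1]])[0])
```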
DPLL examples
Given in CNF: (x,y,z),(-x,y),(-y,z),(-x,-y,-z)
[Figure: DPLL search tree for this formula, branching on x, y and z, with the Decide(), Deduce() and Analyze_Conflict() steps annotated. Nodes show the remaining clauses, e.g. (y),(-y,z),(-y,-z) after x is set to true; failed branches are marked X.]
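The search in this example can be reproduced with a compact recursive DPLL. This is a sketch under simplifying assumptions, not the iterative procedure from the previous slide: clauses are lists of signed integers, the branching choice is the first literal of the first clause, backtracking is chronological, and the names `simplify` and `dpll` are illustrative.

```python
def simplify(clauses, lit):
    """Make lit true: drop satisfied clauses, delete the false literal elsewhere."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def dpll(clauses, model=()):
    """Return a tuple of true literals satisfying all clauses, or None."""
    while True:                                # deduce(): unit propagation
        if any(len(c) == 0 for c in clauses):
            return None                        # conflict: empty clause
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        clauses, model = simplify(clauses, units[0]), model + (units[0],)
    if not clauses:
        return model                           # every clause is satisfied
    lit = clauses[0][0]                        # decide_next_branch(): naive choice
    for choice in (lit, -lit):                 # try one value, then the other
        result = dpll(simplify(clauses, choice), model + (choice,))
        if result is not None:
            return result
    return None

# The slide formula with x=1, y=2, z=3.
formula = [[1, 2, 3], [-1, 2], [-2, 3], [-1, -2, -3]]
print(dpll(formula))
```

Variables that do not appear in the returned tuple were never needed and can take either value.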
DPLL
Recap of the iterative DPLL loop, with the topic attached to each step:
- preprocess(): preprocessing
- decide_next_branch(): branching heuristics
- deduce(): UP
- analyze_conflict() and backtrack(): backjumping/learning
Propagation / Deduction
Apart from UP, several other deduction methods have been proposed and used during preprocessing (mainly) and search (less frequently):
- Pure Literal rule
- Binary Clause reasoning
- Hyper Resolution
- Failed Literal Detection
- Equality Reduction
- Krom Subsumption Resolution
- Generalized Subsumption Resolution
- …
Most of them are only used for preprocessing the formula because they are expensive
One notable exception is the pure literal rule
Pure Literal Rule
The pure literal rule (Davis, Logemann, Loveland, 1962) states the following:
- if a variable occurs only positively then it can be assigned to true
- if a variable occurs only negatively then it can be assigned to false
Example: Given in CNF: (x,y,z),(-x,y),(y,-w),(-x,y,-z)
- y is a pure literal: it can be assigned true
- w is a pure literal: it can be assigned false
Clauses with pure literals or tautologies can be removed!
- a tautology is a clause of the form x ∨ ¬x ∨ y
The pure literal rule is expensive to apply during search
Pure Literal Rule
The pure literal rule can be sequentially applied
Consider the formula
(u ∨ w ∨ x), (-w ∨ x ∨ y), (-u ∨ -x), (v ∨ w ∨ -y)
- v is a pure literal: it can be assigned true
The formula becomes
(u ∨ w ∨ x), (-w ∨ x ∨ y), (-u ∨ -x)
- y is a pure literal: it can be assigned true
The formula becomes
(u ∨ w ∨ x), (-u ∨ -x)
- w is a pure literal: it can be assigned true
The formula becomes
(-u ∨ -x)
- both u and x are pure literals: they can be assigned false
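The sequential application can be sketched as a loop that keeps removing clauses containing a pure literal until none is left. The function name `pure_literal_elimination` and the integer encoding (u=1, v=2, w=3, x=4, y=5) are assumptions made for this example.

```python
def pure_literal_elimination(clauses):
    """Repeatedly make pure literals true and drop the clauses they satisfy."""
    clauses = [set(c) for c in clauses]
    assigned = []                              # literals made true, in order
    while True:
        literals = {l for c in clauses for l in c}
        pure = {l for l in literals if -l not in literals}
        if not pure:
            return clauses, assigned
        assigned.extend(sorted(pure, key=abs))
        clauses = [c for c in clauses if not (c & pure)]

# (u w x), (-w x y), (-u -x), (v w -y)
remaining, assigned = pure_literal_elimination(
    [[1, 3, 4], [-3, 4, 5], [-1, -4], [2, 3, -5]])
print(remaining, assigned)
```

On this formula every clause disappears, so the pure literal rule alone finds a satisfying assignment, in the same order as on the slide: v, y, w, then u and x false.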
Other Deduction Methods
Weaker versions of UP
- Binary UP resolves only unit and binary clauses
  - Can be used to solve a 2-SAT problem in quadratic time
- Fixed-depth UP applies UP only up to a certain depth
Variants of Binary Resolution
- BinRes, Equality Reduction, HyperBinRes
Failed Literal Detection
Hyper-Resolution
Krom Subsumption Resolution
Generalized Subsumption Resolution
Equivalence Reasoning
Etc.
propagation/deduction
preprocessing
Failed Literal Detection
Failed literal detection (Freeman, 1995) is a one-step lookahead with UP.
Say we force (assign) literal l and then perform UP. If this process yields a contradiction (empty clause) then we know that ¬l is entailed by the current input and we can force it (and then perform UP).
DPLL solvers often perform failed literal detection on a set of likely, heuristically selected, literals at each node
- but doing it on all literals is too expensive
The SATZ system (Li & Anbulagan, 1997) was the first to show that very aggressive failed literal detection can pay off.
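Failed literal detection is a probe followed by a forced assignment. In the sketch below the helper names `propagate` and `failed_literals`, the dict-based assignment, and the caller-supplied list of candidate literals are all assumptions made for illustration; real solvers choose the candidates heuristically.

```python
def propagate(clauses, assignment):
    """Unit propagation; returns the extended assignment, or None on conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue
            free = [l for l in clause if abs(l) not in assignment]
            if not free:
                return None
            if len(free) == 1:
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return assignment

def failed_literals(clauses, candidates, assignment=None):
    """Probe each candidate; if forcing it fails under UP, force its negation."""
    assignment = propagate(clauses, assignment or {})
    for lit in candidates:
        if assignment is None:
            break                              # the formula itself is in conflict
        if abs(lit) in assignment:
            continue
        probe = propagate(clauses, {**assignment, abs(lit): lit > 0})
        if probe is None:                      # lit failed, so its negation is entailed
            assignment = propagate(clauses, {**assignment, abs(lit): lit < 0})
    return assignment

# Forcing 1 yields 2 and then falsifies (-1, -2), so 1 must be false and 3 true.
print(failed_literals([[-1, 2], [-1, -2], [1, 3]], [1]))
```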
Binary Resolution
One “cheap” form of binary resolution consists of performing all possible resolutions of pairs of binary clauses
Such resolutions yield only new binary clauses or new unit clauses
BinRes (Bacchus, 2002) repeatedly:
(a) adds to the formula all new binary or unit clauses producible by resolving pairs of binary clauses, and
(b) performs UP on any new unit clauses that appear (which in turn might produce more binary clauses, causing another iteration of (a)),
until either a contradiction is achieved, or nothing new can be added by a step of (a) or (b).
For example, BinRes((a,b),(¬a,c),(¬b,c)) produces the new binary clauses (b,c), (a,c), and the unit clause (c). Then unit propagation yields the final reduction.
Hyper Resolution
A hyper resolution rule resolves more than two clauses at the same time
HypBinRes is a rule of inference involving hyper-resolution
- It takes as input a single n-ary clause (n ≥ 2) (l1, l2, ..., ln) and n−1 binary clauses each of the form (¬li, l) (i = 1, ..., n−1).
- It produces as output the new binary clause (l, ln).
For example, using HypBinRes hyperresolution on the inputs (a, b, c, d), (h, ¬a), (h, ¬c), and (h, ¬d) produces the new binary clause (h, b)
HypBinRes is equivalent to a sequence of ordinary resolution steps (i.e., resolution steps involving only two clauses). However, such a sequence would generate clauses of intermediate length while HypBinRes only generates the final binary clause
Krom Subsumption
Krom-subsumption resolution (Van Gelder and Tsuji, 1996) takes as input two clauses of the form x ∨ y and ¬x ∨ y ∨ Z and generates the clause y ∨ Z
- where Z is a clause of arbitrary length
- y ∨ Z subsumes (entails) ¬x ∨ y ∨ Z, therefore ¬x ∨ y ∨ Z can be deleted
Generalized Subsumption resolution takes two clauses x ∨ Y and ¬x ∨ Y ∨ Z and generates Y ∨ Z
We can derive propagation methods by repeatedly applying either form of resolution
Equality Reduction
If a formula F contains (a,¬b) as well as (¬a,b), then we can form a new formula EqReduce(F) by equality reduction.
Equality reduction (Bacchus, 2002) involves:
(a) replacing all instances of b in F by a (or vice versa),
(b) removing all clauses which now contain both a and ¬a,
(c) removing all duplicate instances of a (or ¬a) from all clauses.
This process might generate new binary clauses
For example, EqReduce((a,¬b),(¬a,b),(a,¬b,c),(b,¬d),(a,b,d)) = ((a,¬d),(a,d))
EqReduce(F) has a satisfying truth assignment iff F does. And any truth assignment for EqReduce(F) can be extended to one for F by assigning b the same value as a.
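Steps (a) to (c) translate almost directly into code. The sketch below assumes that a and b are positive integers, that clauses are lists of signed integers, and that the caller has already found the two binary clauses establishing the equality; the name `eq_reduce` is illustrative.

```python
def eq_reduce(clauses, a, b):
    """Equality reduction sketch: assumes (a, -b) and (-a, b) occur in clauses."""
    reduced = []
    for clause in clauses:
        # (a) replace b by a and -b by -a; building a set also performs (c)
        new = frozenset(a if l == b else -a if l == -b else l for l in clause)
        if a in new and -a in new:
            continue                           # (b) drop clauses with a and -a
        if new not in reduced:
            reduced.append(new)
    return reduced

# The slide example with a=1, b=2, c=3, d=4.
result = eq_reduce([[1, -2], [-1, 2], [1, -2, 3], [2, -4], [1, 2, 4]], 1, 2)
print([sorted(c, key=abs) for c in result])
```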
DPLL
Recap of the iterative DPLL loop, with the topic attached to each step:
- preprocess(): HyperRes, BinRes, EqRed etc.
- decide_next_branch(): branching heuristics
- deduce(): UP
- analyze_conflict() and backtrack(): backjumping/learning
Decision heuristics
DLIS (Dynamic Largest Individual Sum)
For a given variable x:
- Cx,p = # unresolved clauses in which x appears positively
- Cx,n = # unresolved clauses in which x appears negatively
Let x be the literal for which Cx,p is maximal
Let y be the literal for which Cy,n is maximal
If Cx,p > Cy,n choose x and assign it TRUE
Otherwise choose y and assign it FALSE
Requires l (#literals) queries for each decision.
(Implemented in some solvers, e.g. Grasp)
Decision heuristics
DLCS (Dynamic Largest Combined Sum)
For a given variable x:
- Cx,p = # unresolved clauses in which x appears positively
- Cx,n = # unresolved clauses in which x appears negatively
Let x be the variable for which Cx,p + Cx,n is maximal
If Cx,p > Cx,n then assign x to TRUE
Otherwise assign x to FALSE
Requires l (#literals) queries for each decision.
(Implemented in some solvers, e.g. Grasp)
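Both heuristics rest on the same literal counts over unresolved clauses. The sketch below is illustrative only: the names `literal_counts`, `dlis` and `dlcs` are made up here, each function returns the literal to make true (a negative result means assigning the variable FALSE), and the counts are recomputed from scratch rather than maintained incrementally as a real solver would.

```python
from collections import Counter

def literal_counts(clauses, assignment):
    """Occurrences of each free literal in unresolved (not yet satisfied) clauses."""
    counts = Counter()
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue
        counts.update(l for l in clause if abs(l) not in assignment)
    return counts

def dlis(clauses, assignment):
    counts = literal_counts(clauses, assignment)
    if not counts:
        return None
    best_pos = max((l for l in counts if l > 0), key=counts.get, default=None)
    best_neg = max((l for l in counts if l < 0), key=counts.get, default=None)
    if best_neg is None or (best_pos is not None
                            and counts[best_pos] > counts[best_neg]):
        return best_pos                        # choose x, assign TRUE
    return best_neg                            # choose y, assign FALSE

def dlcs(clauses, assignment):
    counts = literal_counts(clauses, assignment)
    if not counts:
        return None
    variables = sorted({abs(l) for l in counts})
    var = max(variables, key=lambda v: counts[v] + counts[-v])
    return var if counts[var] > counts[-var] else -var

cnf = [[1, 2], [1, 3], [-2, -3], [1, -3], [-1, 2]]
print(dlis(cnf, {}), dlcs(cnf, {}))
```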
Decision heuristics
Bohm's Heuristic
At each step of the backtrack search algorithm, the BOHM heuristic selects a variable with the maximal vector (H1(x), H2(x), …, Hn(x)) in lexicographic order. Each Hi(x) is computed as follows:
Hi(x) = a · max(hi(x), hi(¬x)) + b · min(hi(x), hi(¬x))
where hi(x) is the number of unresolved clauses with i literals that contain literal x. Hence, each selected literal gives preference to satisfying small clauses (when assigned value true) or to further reducing the size of small clauses (when assigned value false).
The values of a and b are chosen heuristically.
Decision heuristics
Jeroslow-Wang method
Compute for every clause ω and every literal l:
$J(l) := \sum_{\omega \in \varphi,\ l \in \omega} 2^{-|\omega|}$
One-sided JW: choose a literal l that maximizes J(l)
Two-sided JW: choose a variable x that maximizes J(x) + J(¬x)
- assign it to true if J(x) ≥ J(¬x) and false otherwise
This gives an exponentially higher weight to literals in shorter clauses.
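The Jeroslow-Wang score is a single pass over the clauses. In this sketch the names `jw_scores` and `two_sided_jw` are illustrative, clauses are lists of signed integers, and the formula is assumed to be non-empty and already restricted to the clauses that still matter.

```python
def jw_scores(clauses):
    """J(l): sum of 2^-|clause| over the clauses that contain literal l."""
    scores = {}
    for clause in clauses:
        weight = 2.0 ** -len(clause)
        for lit in clause:
            scores[lit] = scores.get(lit, 0.0) + weight
    return scores

def two_sided_jw(clauses):
    """Return the literal to make true under two-sided JW."""
    scores = jw_scores(clauses)
    j = lambda l: scores.get(l, 0.0)
    variables = sorted({abs(l) for l in scores})
    var = max(variables, key=lambda v: j(v) + j(-v))
    return var if j(var) >= j(-var) else -var

cnf = [[1, 2], [-1, 2, 3], [-2]]
print(jw_scores(cnf))
print(two_sided_jw(cnf))
```

In this small formula the unit clause (-2) contributes 1/2 on its own, which outweighs the two longer clauses containing 2, so variable 2 is chosen and set to false.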
Decision heuristics
MOM (Maximum Occurrence of clauses of Minimum size)
Let f*(x) be the # of unresolved smallest clauses containing x. Choose x that maximizes:
(f*(x) + f*(¬x)) · 2^k + f*(x) · f*(¬x)
k is chosen heuristically.
The idea:
- Give preference to satisfying small clauses.
- Among those, give preference to balanced variables (e.g. f*(x) = 3, f*(¬x) = 3 is better than f*(x) = 1, f*(¬x) = 5).
Decision heuristics
VSIDS (Variable State Independent Decaying Sum)
1. Each variable in each polarity has a counter initialized to 0.
2. When a clause is added, the counters are updated.
3. The unassigned variable with the highest counter is chosen.
4. Periodically, all the counters are divided by a constant.
(Implemented in Chaff)
Decision heuristics
VSIDS (cont'd)
• Chaff holds a list of unassigned variables sorted by the counter value.
• Updates are needed only when adding conflict clauses.
• Thus, a decision is made in constant time.
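The four VSIDS steps can be sketched as a small class. Everything specific here is an assumption for illustration: the class name `Vsids`, the decay constant 2, the decay period of 256 added clauses, and the reading of "the counters are updated" as incrementing the counter of every literal in the added clause. The `pick` method also scans all literals instead of keeping the sorted list that makes Chaff's decision constant-time.

```python
class Vsids:
    def __init__(self, num_vars, decay=2.0, period=256):
        # 1. one counter per variable and polarity, initialised to 0
        self.score = {l: 0.0 for v in range(1, num_vars + 1) for l in (v, -v)}
        self.decay, self.period, self.added = decay, period, 0

    def on_clause_added(self, clause):
        # 2. bump the counters of the literals in the new (conflict) clause
        for lit in clause:
            self.score[lit] += 1.0
        self.added += 1
        # 4. periodically divide all counters by a constant
        if self.added % self.period == 0:
            for lit in self.score:
                self.score[lit] /= self.decay

    def pick(self, assignment):
        # 3. the unassigned literal with the highest counter is chosen
        free = [l for l in self.score if abs(l) not in assignment]
        return max(free, key=self.score.get) if free else None

h = Vsids(3, period=2)
h.on_clause_added([1, -2])
h.on_clause_added([-2, 3])
print(h.pick({}))
```

Because the scores depend only on added clauses and not on the current partial assignment, nothing needs updating when variables are assigned or unassigned.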
Decision heuristics
VSIDS is a 'quasi-static' strategy:
- static because it doesn't depend on the current assignment
- dynamic because it gradually changes. Variables that appear in recent conflicts have higher priority.
This strategy is a conflict-driven decision strategy.
"...employing this strategy dramatically (i.e. an order of magnitude) improved performance..."
DPLL
Recap of the iterative DPLL loop, with the topic attached to each step:
- preprocess(): HyperRes, BinRes, EqRed etc.
- decide_next_branch(): branching heuristics
- deduce(): UP
- analyze_conflict() and backtrack(): backjumping/learning
Conflict Analysis, Learning, Backjumping
When a conflicting clause is derived (i.e. a clause with all its literals assigned 0), the solver must backtrack
- conflict analysis finds the reason for a conflict and tries to resolve it
The DPLL algorithm uses chronological backtracking
- it backtracks to the most recent decision point where a variable has not had both of its values tried, and flips the current assignment
Modern SAT solvers employ more advanced conflict analysis techniques to identify the actual reasons for the conflict
- in this way they can achieve non-chronological backjumping
Conflict Analysis, Learning, Backjumping
Suppose the conflicting clause ω = (¬a ∨ x ∨ ¬c) has been derived, i.e. a=1, x=0, c=1
A set R of value assignments to variables in the problem is called a conflict assignment if after making these assignments and running UP, clause ω becomes unsatisfiable
- assignment {a=1, x=0, c=1} is a trivial conflict assignment
- but it is not of much use
Question: how can we derive more interesting conflict assignments?
Answer: determine why and at what decision level a=1, x=0, c=1
Suppose we find that R = {x=0, y=1, z=1} is also a conflict assignment for clause ω
- the implied clause (x ∨ ¬y ∨ ¬z), which records the conflict assignment R, is called a conflict clause
Conflict Analysis, Learning, Backjumping
Suppose that assignment x=0 of R = {x=0, y=1, z=1} is chosen (or implied) at the current decision level v
- assume that y=1 and z=1 are deduced at nodes v' and v'' respectively
- suppose that v > v' > v'' (i.e. v'' is closest to the root)
After adding conflict clause (x ∨ ¬y ∨ ¬z) to the problem, we can backjump from v to v' (skipping the nodes in between)
- because whatever assignments we make there, the conflict at node v will still exist!
After we make the backjump, we can deduce x=1. Why?
- because the added clause (x ∨ ¬y ∨ ¬z) will be a unit clause, forcing x=1
- without learning this clause, this deduction would not be possible
- now we can avoid needless search and save time!
Conflict Analysis, Learning, Backjumping
During conflict analysis, information about conflicts is usually recorded and added to the problem as new (learned) clauses
- these conflict clauses are redundant but they often help prune the search space in the future
- this mechanism is called conflict-directed learning
Non-chronological backtracking is also called conflict-directed backjumping
- originally proposed for CSPs (Prosser, 1993)
- then incorporated in SAT solvers like GRASP (Silva and Sakallah, 1996) and rel_sat (Bayardo and Schrag, 1997)
Learning and conflict-directed backjumping can be analyzed using implication graphs or they can be viewed as a resolution process
Implication graphs and learning
DPLL solvers organize the search in the form of a decision tree
Each node corresponds to a decision
Depth of the node in the decision tree ↔ decision level
Notation: x=v@d means that v ∈ {0,1} is assigned to x at decision level d
Implication graphs and learning
ω1 = (x2 ∨ x3)
ω2 = (¬x1 ∨ ¬x4)
ω3 = (¬x2 ∨ x4)
[Figure: decision tree branching on x1, with the implication graph of each branch]
- Branch x1 = 0@1: the decision x2 = 0@2 implies x3 = 1@2, giving {(x1,0), (x2,0), (x3,1)}
- Branch x1 = 1@1: implies x4 = 0@1, x2 = 0@1 and x3 = 1@1, giving {(x1,1), (x2,0), (x3,1), (x4,0)}
No backtrack in this example!
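Implication graphs and learning
Let's add a clause:
ω1 = (x2 ∨ x3)
ω2 = (¬x1 ∨ ¬x4)
ω3 = (¬x2 ∨ x4)
ω4 = (¬x1 ∨ x2 ∨ ¬x3)
[Figure: decision tree branching on x1, with the implication graph of each branch]
- Branch x1 = 1@1: implies x4 = 0@1, x2 = 0@1 and x3 = 1@1, which falsifies ω4: conflict
- Branch x1 = 0@1: the decision x2 = 0@2 implies x3 = 1@2, giving {(x1,0), (x2,0), (x3,1)}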
Implication graphs and learning
c1 = (¬x1 ∨ x2)
c2 = (¬x1 ∨ x3 ∨ x9)
c3 = (¬x2 ∨ ¬x3 ∨ x4)
c4 = (¬x4 ∨ x5 ∨ x10)
c5 = (¬x4 ∨ x6 ∨ x11)
c6 = (¬x5 ∨ ¬x6)
c7 = (x1 ∨ x7 ∨ ¬x12)
c8 = (x1 ∨ x8)
c9 = (¬x7 ∨ ¬x8 ∨ ¬x13)
Current truth assignment: {x9=0, x10=0, x11=0, x12=1, x13=1}
Current decision assignment: {x1=1@6}
[Implication graph, edges labelled with the clause that forces each implication]
- x1=1@6 implies x2=1 (c1)
- x1=1@6 and x9=0 imply x3=1 (c2)
- x2=1 and x3=1 imply x4=1 (c3)
- x4=1 and x10=0 imply x5=1 (c4)
- x4=1 and x11=0 imply x6=1 (c5)
- x5=1 and x6=1 falsify c6: conflict
We learn the conflict clause c10: (¬x1 ∨ x9 ∨ x11 ∨ x10)
Implication graph, flipped assignment
c1 = (¬x1 ∨ x2)
c2 = (¬x1 ∨ x3 ∨ x9)
c3 = (¬x2 ∨ ¬x3 ∨ x4)
c4 = (¬x4 ∨ x5 ∨ x10)
c5 = (¬x4 ∨ x6 ∨ x11)
c6 = (¬x5 ∨ ¬x6)
c7 = (x1 ∨ x7 ∨ ¬x12)
c8 = (x1 ∨ x8)
c9 = (¬x7 ∨ ¬x8 ∨ ¬x13)
c10: (¬x1 ∨ x9 ∨ x11 ∨ x10)
Current truth assignment: {x9=0, x10=0, x11=0, x12=1, x13=1}
Current decision assignment: {x1=0}
[Implication graph, edges labelled with the clause that forces each implication]
- x9=0, x10=0 and x11=0 imply x1=0@6 (c10, i.e. due to the conflict clause)
- x1=0@6 and x12=1 imply x7=1@6 (c7)
- x1=0@6 implies x8=1@6 (c8)
- x7=1, x8=1 and x13=1 falsify c9: a second conflict
Conflict-directed backjumping
Non-chronological backtracking
[Figure: decision tree with decision levels 3, 4, 5 and 6; the decision on x1 at level 6 leads to a conflict on both branches.]
Which assignments caused the conflicts? x9=0, x10=0, x11=0, x12=1, x13=1
These assignments are sufficient for causing a conflict.
If the deepest of them was done at level 3: backtrack to decision level 3
Resolution and learning
Learned clauses can be generated by resolution
- as we know, resolution takes two clauses (C1,a), (¬a,C2), where C1 and C2 are disjunctions of literals, and generates clause (C1,C2)
- the resulting clause is redundant. Therefore, we can apply resolution and add new clauses to the clause knowledge base without changing the problem's satisfiability
Why would we want to do this?
Conflict-driven learning uses resolution to generate new clauses that can help us achieve conflict-directed backjumping and prune the search space in the future
- as usual we assume that UP is the only deduction method applied
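A single resolution step, and the way a learned clause falls out of a chain of such steps, can be sketched as follows. The clause signs are an assumed reconstruction of c1 to c6 from the implication-graph example (a negative integer is a negated variable), and the resolution order, which walks the implied variables x6, x5, x4, x3, x2 backwards, is one possible choice rather than the only one.

```python
def resolve(c1, c2, var):
    """Resolvent of c1 (containing var) and c2 (containing -var)."""
    assert var in c1 and -var in c2
    return (set(c1) - {var}) | (set(c2) - {-var})

c1, c2, c3 = {-1, 2}, {-1, 3, 9}, {-2, -3, 4}
c4, c5, c6 = {-4, 5, 10}, {-4, 6, 11}, {-5, -6}

learned = c6                                   # start from the conflicting clause
for antecedent, var in [(c5, 6), (c4, 5), (c3, 4), (c2, 3), (c1, 2)]:
    # replace the implied variable by the reasons recorded in its antecedent
    learned = resolve(antecedent, learned, var)
print(sorted(learned, key=abs))
```

The result contains only the decision variable x1 and the earlier assignments x9, x10 and x11, which is the content of the conflict clause c10.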
Resolution and learning
analyze_conflict() {
    cl = find_conflicting_clause();
    while (!stop_criterion_met(cl)) {
        lit = choose_literal(cl);       // choose a literal from the conflicting clause
        var = variable_of_literal(lit);
        ante = antecedent(var);         // the unit clause which implied var
        cl = resolve(cl, ante, var);    // all literals in cl and ante except the literals of var
    }
    add_clause_to_knowledge_base(cl);
    back_dl = clause_asserting_level(cl);
    return back_dl;
}
Every implied variable has an antecedent
Decision variables don't
Resolution and learning
The loop stops when some predefined stop criterion is met
- usually the stop criterion is that an asserting clause has been derived
A clause is asserting if all literals it contains have value 0 and only one of them (l) is assigned at the current decision level
- the decision level of the literal with the second highest decision level in an asserting clause is the asserting level of the clause
After backtracking to the asserting level, the asserting clause will become unit and force l to take its opposite value
- thus a new area in the search space will be explored
It is easy to meet the stop criterion if choose_literal(cl) chooses literals in reverse chronological order of their assignment
Resolution and learning
As new learned clauses are added two problems may appear:
- The system slows down because it has to process a larger number of clauses
- The memory requirements increase
To solve these problems, SAT solvers delete some learned clauses periodically
- this does not affect correctness since these clauses are redundant
- the less useful ones and the ones that have many literals are preferred for deletion
- usefulness is measured according to various heuristics, e.g. the most recent ones are kept, older ones are deleted
DPLL
Recap of the iterative DPLL loop, with the topic attached to each step:
- preprocess(): HyperRes, BinRes, EqRed etc.
- decide_next_branch(): branching heuristics
- deduce(): UP
- analyze_conflict() and backtrack(): backjumping/learning
Random Restarts
The variable order during search plays a very significant role in the runtime
- two problems that are identical except for the variable order may take totally different times to solve
- a given variable order can make the algorithm focus on certain parts of the search tree (that may not be fruitful)
One way to solve this problem is through random restarts
- the search process is restarted from scratch at certain points
Random restarts may force the algorithm to consider diverse areas in the search tree in two ways:
- by keeping some information after a restart (learned clauses)
- by incorporating some degree of randomness in the branching heuristic
Engineering Aspects of SAT Solvers
Storage of Clause Knowledge Base
- many real problems currently solved by SAT solvers are very large
  - circuit verification problems may contain millions of variables and clauses
- also, during search learned clauses are added to the clause KB
Efficient ways to store the clauses are needed!
- usually clauses are stored in a linear way (sparse matrix representation)
  - each clause has its own space and no overlap exists
- early SAT solvers used complex pointer-heavy data structures
  - convenient for clause manipulation (additions/deletions)
  - but not memory efficient (access locality not preserved, hence many cache misses)
Engineering Aspects of SAT Solvers
Storage of Clause Knowledge Base
- Chaff (Moskewicz, Madigan, Zhao, Zhang, Malik, 2001) stores all clauses in a large array
  - less flexible in clause manipulation than linked lists
  - but increased access locality offers significant speed-ups
Another approach to the storing problem proposes the use of tries to store clauses (Zhang & Stickel, 1994)
- a trie is a ternary tree
- each internal node is a variable index
- the three children of each node are labeled Pos, Neg, DC (Don't Care)
- a leaf node is either true or false
- a path to a true leaf node represents a clause
Tries for clause storage
Apart from saving space, the ordered trie structure has the nice property of being able to detect tail subsumed clauses of a database quickly.
A clause is said to be tail subsumed by another clause if its first portion of the literals is also a clause in the clause database. For example, (a ∨ b ∨ c) is tail subsumed by (a ∨ b).
[Figure] A trie data structure representing clauses
(V1V2),(V1 V3),(V1 V3),(V2 V3)
More engineering aspects of SAT solvers
Observation: More than 90% of the time SAT solvers perform Deduction().
• i.e. UP
Deduction() creates new implied variables and conflicts. How can this be done efficiently?
Grasp implements Deduction() with counters
Hold 2 counters for each clause ω:
- val1(ω) = # of negative literals assigned 0 in ω + # of positive literals assigned 1 in ω
- val0(ω) = # of negative literals assigned 1 in ω + # of positive literals assigned 0 in ω
Each variable x has two lists that contain:
- list1(x): all clauses where x appears positively
- list2(x): all clauses where x appears negatively
Grasp implements Deduction() with counters
ω is satisfied iff val1(ω) > 0
ω is unsatisfied iff val0(ω) = |ω|
ω is unit iff val1(ω) = 0 ∧ val0(ω) = |ω| − 1
ω is unresolved iff val1(ω) = 0 ∧ val0(ω) < |ω| − 1
Every assignment to a variable x results in updating the counters for all the clauses that contain x.
- This is the main problem with counters!
Backtracking: same complexity as variable assignment
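The four clause states follow directly from the two counters. The sketch below recomputes val1 and val0 from the assignment on every call, purely to illustrate the classification; a counter-based solver would instead store both values per clause and update them through list1(x) and list2(x) on each assignment. The function name `clause_status` and the dict-based assignment are assumptions for this example.

```python
def clause_status(clause, assignment):
    """Classify a clause using val1 (true literals) and val0 (false literals)."""
    val1 = sum(1 for l in clause if assignment.get(abs(l)) == (l > 0))
    val0 = sum(1 for l in clause if assignment.get(abs(l)) == (l < 0))
    if val1 > 0:
        return "satisfied"
    if val0 == len(clause):
        return "unsatisfied"
    if val0 == len(clause) - 1:
        return "unit"
    return "unresolved"

clause = [1, -2, 3]
for a in ({}, {1: False}, {1: False, 2: True},
          {1: False, 2: True, 3: False}, {2: False}):
    print(a, clause_status(clause, a))
```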
SATO implements Deduction() with head/tail pointers
In SATO each clause has two pointers associated with it, called the head and tail pointer respectively. A clause stores all its literals in an array. Initially, the head pointer points to the first literal of the clause and the tail pointer points to the last literal of the clause.
Each variable keeps four linked lists that contain pointers to clauses. The linked lists for the variable v are clause_of_pos_head(v), clause_of_neg_head(v), clause_of_pos_tail(v) and clause_of_neg_tail(v).
If v is assigned the value 1, clause_of_pos_head(v) and clause_of_pos_tail(v) will be ignored. For each clause C in clause_of_neg_head(v), the head literal of C is the negative literal of v and now evaluates to 0, so the solver will search for a literal that does not evaluate to 0, from the position of the head literal of C to the position of the tail literal of C.
SATO implements Deduction() with head/tail pointers
During this search process, four cases may occur:
1. If during the search we first encounter a literal that evaluates to 1, then the clause is satisfied and we need to do nothing.
2. If during the search we first encounter a literal l that is free and l is not the tail literal, then we remove C from clause_of_neg_head(v) and add C to the head list of the variable corresponding to l. In essence the head pointer is moved from its original position to the position of l.
3. If all literals in between these two pointers are assigned value 0, but the tail literal is unassigned, then the clause is a unit clause, and the tail literal is the unit literal for this clause.
4. If all literals in between these two pointers and the tail literal are assigned value 0, then the clause is a conflicting clause.
Similar actions are performed for clause_of_neg_tail(v), only the search is in the reverse direction (i.e. from tail to head).
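The four cases for the head-side search can be written as a single scan. The Python below is a hedged sketch, not SATO's code: the function names `lit_value` and `scan_from_head`, the outcome strings, and the representation (a clause as a list of DIMACS-style integers, an assignment as a dict from variable to 0 or 1) are assumptions. Moving the clause between the per-variable head lists is left to the caller.

```python
def lit_value(lit, assignment):
    """Return 1, 0, or None (free) for a literal under a partial assignment."""
    v = assignment.get(abs(lit))
    if v is None:
        return None
    return v if lit > 0 else 1 - v

def scan_from_head(lits, head, tail, assignment):
    """Search after the head literal lits[head] has just become 0.

    Returns (outcome, index). outcome is one of "satisfied", "move_head",
    "unit" or "conflict"; index is the position of the relevant literal,
    or None for a conflict.
    """
    for i in range(head + 1, tail):
        val = lit_value(lits[i], assignment)
        if val == 1:
            return ("satisfied", i)   # case 1: nothing to do
        if val is None:
            return ("move_head", i)   # case 2: new head position
    tail_val = lit_value(lits[tail], assignment)
    if tail_val == 1:
        return ("satisfied", tail)    # case 1 again, found at the tail
    if tail_val is None:
        return ("unit", tail)         # case 3: tail is the unit literal
    return ("conflict", None)         # case 4: every literal is 0

# Clause (¬x1 ∨ x2 ∨ x3 ∨ x4) with head at index 0 and tail at index 3
lits = [-1, 2, 3, 4]
first = scan_from_head(lits, 0, 3, {1: 1})
second = scan_from_head(lits, 0, 3, {1: 1, 2: 0, 3: 0})
```

The tail-side search would mirror this loop, scanning from `tail - 1` down toward the head.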
Chaff implements Deduction() with a pair of watched literals
Observation: during Deduction(), we are only interested in newly implied variables and conflicts.
These occur only when the number of literals in ω with value 'false' is greater than |ω| - 2.
Conclusion: no need to visit a clause ω unless val0(ω) > |ω| - 2.
How can this be implemented ?
Chaff implements Deduction() with a pair of watched literals
Define two 'watched literals' for each clause ω: w1(ω) and w2(ω). They point to two distinct literals in ω which are not 'false'.
Each variable x has two lists, pos_watched(x) and neg_watched(x), containing pointers to all the watched literals corresponding to it in positive and negative polarity respectively.
When a variable x is assigned value 0, the solver visits each literal p pointed to from pos_watched(x); when x is assigned value 1, it visits each literal p pointed to from neg_watched(x) (p must be a negative literal of x in this case). In both cases p has just become 'false', and the solver will search for a literal l in the clause containing p that is not set to 0.
In this way we visit a clause ω only if w1(ω) or w2(ω) becomes 'false'.
Chaff implements Deduction() with a pair of watched literals
…the solver will search for a literal l in the clause containing p that is not set to 0. There are four cases that may occur during the search:
1. If there exists such a literal l and it is not the other watched literal, then we remove the pointer to p from the watched list of x that held it (neg_watched(x) when x was assigned 1), and add a pointer to l to the watched list of the variable corresponding to l. We refer to this operation as moving the watched literal, because one of w1(ω) and w2(ω) is moved from its original position to the position of l.
2. If the only such l is the other watched literal and it is free, then the clause is a unit clause, with the other watched literal being the unit literal.
3. If the only such l is the other watched literal and it evaluates to 1, then we need to do nothing.
4. If all literals in the clause are assigned value 0 and no such l exists, then the clause is a conflicting clause.
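The four cases above can be put together into a small unit propagation loop. The following Python is a minimal sketch under stated assumptions, not Chaff's implementation: the class `WatchedState`, its methods, the single `watches` dictionary keyed by literal (standing in for pos_watched and neg_watched), and the convention of keeping the two watched literals at positions 0 and 1 of each clause are all choices made for this example. Clauses are assumed to have at least two distinct literals, written as DIMACS-style integers.

```python
from collections import defaultdict

def value(lit, assign):
    """Return True, False, or None (free) for a literal."""
    v = assign.get(abs(lit))
    if v is None:
        return None
    return v if lit > 0 else not v

class WatchedState:
    def __init__(self, clauses):
        self.clauses = [list(c) for c in clauses]
        self.assign = {}                  # variable -> True / False
        self.watches = defaultdict(list)  # literal -> clause indices watching it
        for ci, c in enumerate(self.clauses):
            self.watches[c[0]].append(ci)
            self.watches[c[1]].append(ci)

    def propagate(self, lit):
        """Make lit true and propagate. Returns a conflicting clause index or None."""
        self.assign[abs(lit)] = lit > 0
        queue = [lit]
        while queue:
            false_lit = -queue.pop()      # this literal has just become false
            pending = self.watches[false_lit]
            self.watches[false_lit] = []
            while pending:
                ci = pending.pop()
                c = self.clauses[ci]
                if c[0] == false_lit:     # keep the false watch at position 1
                    c[0], c[1] = c[1], c[0]
                for k in range(2, len(c)):
                    if value(c[k], self.assign) is not False:
                        # case 1: move the watched literal
                        c[1], c[k] = c[k], c[1]
                        self.watches[c[1]].append(ci)
                        break
                else:
                    self.watches[false_lit].append(ci)  # watch stays in place
                    other = value(c[0], self.assign)
                    if other is None:     # case 2: unit clause
                        self.assign[abs(c[0])] = c[0] > 0
                        queue.append(c[0])
                    elif other is False:  # case 4: conflicting clause
                        self.watches[false_lit].extend(pending)
                        return ci
                    # case 3: other watch is true, nothing to do
        return None

state = WatchedState([[1, 2, 3], [-1, -2], [-3, 4]])
r1 = state.propagate(-1)  # x1 = 0: the watch in clause 0 moves to x3
r2 = state.propagate(-2)  # x2 = 0: clause 0 becomes unit, x3 then x4 are implied
```

In this sketch, undoing an assignment only needs to delete entries from `assign`; the watch lists are left as they are.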
Chaff implements Deduction() with a pair of watched literals
Both watched literals of an implied clause are on the highest decision level present in the clause. Therefore, backtracking will un-assign them first.
Conclusion: when backtracking, watched literals stay in place.
[Figure: a clause with five literal positions (1 to 5) and its watched literals w1 and w2, shown after each step: V[1]=0; then V[5]=0, V[4]=0; then V[2]=0, at which point it is a unit clause; then backtracking with V[4]=V[5]=X and V[1]=1.]
Backtracking: No updating. Complexity = constant.
Chaff implements Deduction() with a pair of watched literals
The choice of watched literals is important. The best strategy is to watch literals of the least frequently updated variables.
The watched literals method has a learning curve in this respect:
1. The initial watched literals are chosen arbitrarily.
2. The process shifts the watched literals away from variables that were recently updated (these variables will most probably be reassigned in a short time).
Another example from Zhang-Malik paper
head/tail pointers vs. watched literals