artificial intelligence inference in first-order logic

Artificial IntelligenceInference in first-order logic

Fall 2008

professor: Luigi Ceccaroni

2

Role of first-order logic (FOL)

• Knowledge representation (KR) is an interdisciplinary subject that applies theories and techniques from three fields:– Logic provides the formal structure and

rules of inference.– Ontology defines the kinds of things that

exist in the application domain.– Computation supports the applications that

distinguish KR from pure philosophy. 2

Syntax of FOL

3

Effective procedures for answering questions posed in FOL

• More or less anything can be stated in FOL.• It is important to have algorithms that can

answer any answerable question stated in FOL.

• Three major families of first-order inference algorithms:– forward chaining and its applications to

deductive databases and production systems

– backward chaining and logic programming systems

– resolution-based theorem-proving systems

Forward chaining

• Forward chaining can be applied to first-order definite clauses.

• First-order definite clauses are disjunctions of literals of which exactly one is positive.

• Definite clauses such as Situation 1 ∧Situation 2 Response⇒ are especially useful for systems that make inference in response to newly arrived information.

6

Forward chaining

• The idea:– start with the atomic sentences in the knowledge

base– apply Modus Ponens (A, A B |- B) in the ⇒

forward direction, adding new atomic sentences, until no further inferences can be made

• Reasoning with forward chaining can be much more efficient than resolution theorem proving.

• This idea needs to be applied efficiently to first-order definite clauses. 6

7

First-order definite clauses

• Consider the following problem:– The law says that it is a crime for a Spanish

to sell weapons to other nations. The country Israel has some missiles and all of its missiles were sold to it by Rodríguez Zapatero, who is Spanish.

• To prove that Rodríguez Zapatero is a criminal, first, we have to represent the facts as first-order definite clauses.

• Then we can solve the problem with forward-chaining.

7

8


• “... it is a crime for a Spanish to sell weapons to other nations”:Spanish(x) Weapon(y) Sells(x, y, z) ∧ ∧ ∧NotInSpain(z) Criminal(x).⇒

• “...Israel has some missiles” is transformed into two definite clauses by Existential Elimination, introducing a new constant AGM12:Owns(Israel, AGM12)

Missile(AGM12)8

9


• “All of its missiles were sold to it by Rodríguez Zapatero”:Missile(x) Owns(Israel, x) Sells(ZP, x, Israel).∧ ⇒

• We need to know that missiles are weapons:Missile(x) Weapon(x)⇒

• We also need to know that Israel is not in Spain:NotInSpain(Israel)

• “Rodríguez Zapatero, who is Spanish...”:Spanish(ZP)

9

10

A simple forward-chaining algorithm

function FOL-FC-ASK (KB,a) returns a substitution or falselocal new, the new sentences inferred on each iterationrepeat until new is empty

new <- {}for each sentence r in KB do

(p1 … pn q ) <- STANDARDIZE-APART(r)

for each t such that SUBST(t , p1 … pn)=SUBST(t , p1’ … pn’) for some p1’ ,…, pn’ in KB

q’ <- SUBST(t ,q) if q’ <- is not a renaming of some sentence already in

KB or new then do add q’ to new

f <- UNIFY(q’, a)

if f is not fail then return fadd new to KB

return answers10

11


• Starting from the known facts,– it triggers all the rules whose premises are

satisfied,– adding their conclusions to the known facts.

• The process repeats until – the query is answered (assuming that just one

answer is required)– or no new facts are added.

11

12


• Notice that a fact is not “new” if it is just renaming a known fact.

• One sentence is a renaming of another if they are identical except for the names of the variables.

• Likes(x, IceCream) and Likes(y, IceCream) are renamings of each other– their meanings are identical:

•everyone likes ice cream12

13


• In the application of forward chaining to the crime problem (with three rules), two iteration are required:– Iteration 1

“Spanish(x) Weapon(y) Sells(x, y, z) ∧ ∧ ∧NotInSpain(z) Criminal(x).” ⇒ has unsatisfied premises.

“Missile(x) Owns(Israel, x) Sells(ZP, x, Israel).” ∧ ⇒ is satisfied with {x/AGM12} and “Sells(ZP, AGM12, Israel).” is added.

“Missile(x) Weapon(x).” ⇒ is satisfied with {x/AGM12} and “Weapon(AGM12)” is added.

13

14


– Iteration 2“Spanish(x) Weapon(y) Sells(x, y, z) ∧ ∧ ∧NotInSpain(z) Criminal(x).” ⇒ is satisfied with {x/ZP, y/AGM12, z/Israel} and Criminal(ZP) is added.

•Fixed point: no new inferences are possible at this point because every sentence that could be concluded by forward chaining is already contained explicitly in the KB.

•Algorithm not designed for efficiency of operation. 14

15

Efficient forward chaining

• There are three possible sources of complexity:– The “inner loop” involves finding all possible

unifiers such that the premise of a rule unifies with a suitable set of facts in the KB.•This pattern matching can be very expensive.

– The algorithm rechecks every rule on every iteration, even if very few additions are made to the KB on each iteration.

– The algorithm might generate many facts that are irrelevant to the goal. 15

16

Matching rules against known facts

• Consider the two rules:Missile(x) Owns(Israel, x) Sells(ZP, x, Israel).∧ ⇒

Missile(x) Weapon(x).⇒

• If the KB contains many objects owned by Israel and very few missiles, – it would be better to find all the missiles first – then check whether they are owned by Israel

• This is the conjunct ordering problem:– find an ordering to solve the conjuncts of the rule

premise so that the total cost is minimized.16

17

Matching rules against known facts

• Finding the optimal ordering is NP-hard, but good heuristics are available:– for example, the most constrained variable

heuristic

• The connection between pattern matching and constraint satisfaction is very close.

• Most rules in real-world KBs are small and simple rather than large and complex.

17

18

Incremental forward chaining

• To avoid redundant rule matching:– Every new fact inferred on iteration t must be

derived from at least one new fact inferred on iteration t-1.

• This leads to an incremental forward chaining algorithm where, at iteration t, a rule is checked only if its premise includes a conjunct that unifies with a fact newly inferred at iteration t-1.

18

Rete algorithm

• Only a small fraction of the rules in a knowledge base are triggered by the addition of new facts.

• A great deal of redundant work is done in constructing partial matches repeatedly that have some unsatisfied premises.

• It would be better to retain and gradually complete the partial matches as new facts arrive, rather than discarding them.

• The rete algorithm was the first to address this problem seriously.

• The algorithm preprocesses the set of rules in the knowledge base to construct a sort of dataflow network in which each node is a literal from a rule premise.

• Variable bindings flow through the network and are filtered out when they fail to match a literal.

Production systems

• Rete networks have been a key component of so-called production systems, which were among the earliest forward chaining systems in widespread use.

• The word “production” denotes a condition-action rule.

• The steps to solve a problem are described as a chain of deductions.

Production systems

• The representation is based on two elements:– facts: definite clauses with no negative literals, which

simply assert a given proposition; they can include variables

– rules: conditional formulas where the consequent is exactly one literal

• A problem is defined by:– fact base: atomic sentences describing the specific

problem– rule base: rules describing the reasoning

mechanisms– inference engine: algorithm executing the rules in a

reasoning chain

22

Irrelevant facts

• Forward chaining makes all allowable inferences, even if they are irrelevant to the goal at hand.

• One solution is to restrict forward chaining to a selected subset of rules.

• Another solution is to use backward chaining.

22

23

Backward chaining

• This algorithm works backward from the goal, chaining through rules to find known facts that support the proof.

• It is called with a list of goals containing a single element, the original query.

• It returns the set of all substitutions satisfying the query.

• If all of them can be satisfied, then the current branch of the proof succeeds.

23

24

Backward chaining

• Set of sentences:• S1: for each x1,y1 child(x1,y1) ⇒ parent(y1,x1)• S2: for each x2,y2 parent(x2,y2) ∧ female(x2) ⇒ mother(x2,y2)• S3: child(Lisa,Homer)• S4: child(Lisa,Marge)• S5: female(Marge)

• Goal: mother(x0,Lisa)

• Note: variables have already been standardized apart using subscripts

24

25

Backward chaining

Backward chaining

• Based on the inductive method: – The objective/hypothesis is validated

reconstructing the reasoning chain backwards

• Each step implies new sub-objectives:– New hypotheses to be validated

26

Backward chaining

• Algorithm– The FB is initialized with a set of initial facts.– The hypothesis set (HS) is initialized with the

objectives.– While there are hypotheses in HS, one is

chosen and validated:•Facts and right-end-side of rules are compared to

the hypotheses• If a hypothesis is in the FB, then eliminate it from

HS.•Else, look for rules with the hypothesis as

conclusion. Select one and add its non-satisfied premises to HS.

27

Backward chaining

• Advantages:– Only what is necessary for the resolution of

the problem is considered.

• The resolution process consist of a tree exploration

• Most representative language: Prolog– W. F. Clocksin y C. S. Mellish . Programming

in Prolog: Using the ISO Standard. Springer, 2003 (first edition: 1981).

28

Hybrid chaining

• Parts of the reasoning chain from facts to objectives are deductively constructed.

• Other parts are inductively constructed.• Bi-directional exploration• The combinatorial explosion of

deductive reasoning is avoided.

29

Hybrid chaining

• Meta-rules decide about strategy change as a function of:– Number of initial and final states

•Better from smaller to larger

– Branching factor•Better in the direction of lower factor

– Necessity of justifying the reasoning process•Better what reflects typical user reasoning

30

Inference engine

• The inference engine or control mechanism is made of two elements:– Rule interpreter or inference mechanism

• It determines which rules can be applied

– Control strategy of conflict-resolution strategy

• The function of the inference engine is to execute actions to solve the problem (objective) from an initial set of facts, possibly via some interaction with the user. 31

Inference engine

• Phases of a basic engine cycle:1. Detection (filter): relevant rules

• Obtaining from the KB the set of executable rules for a given situation (state) of the FB (rule interpreter)

• Creation of the conflict set2. Selection: which rule?

• Conflict resolution: selection of a rule to be executed (control strategy)

3. Execution• Execution of the rule with an instantiation of the

variables: modification of the working memory4. Return to 1, or stop if the problem is solved

• If a solution had not been found and no rule is executable: failure

32

Detection

• Creation of the set of executable rules• The rule interpreter obtains possible

instantiations of the LHS of the rules via matching.

• A rule can be instantiated more than once with different values matching the variables.

33

Selection

• Rules are selected according to the control strategy, which can be:– Fixed– Dynamic with criteria decided in advance– Guided by dynamic meta-rules

34

Selection

• Strategies for the selection of the “best” rule/instantiation:– First rule according to KB ordering– Least/most used rule– Most specific (with more literals) / most general rule– Rule with the highest certainty value– The instantiation satisfying

• Facts with highest priority• Oldest facts• Most recent facts

– Most recently used rule– Dynamic meta-rules– Combination of different criteria 35

Execution

• The execution of a rule can cause:– A modification of the FB (in forward chaining

only)– New calculations, new actions, questions to

the user– New sub-objectives (in backward chaining)

• Instantiations are propagated• Certainty degrees are propagated

36

End of deduction/induction process

• The conclusion (objective) is found/demonstrated: success

• No rule is executable: success? / failure?

37

Prolog: lenguaje de programación lógica

• Programa Prolog:– conjunto de aserciones lógicas– cada aserción es una cláusula de Horn– proceso de comparación: unificación– el orden de las aserciones (reglas) es significativo

• Ejemplo– gato (bufa).– animaldomestico (X) :- gato (X).– animaldomestico (X) :- pequeño (X), conplumas (X).

• p → q ⇔ q :- p• Consultas: ?- predicado.• La negación se representa con la falta de la aserción

correspondiente: asunción de mundo cerrado.

Cláusula de Horn•De Wikipedia:

Las cláusulas de Horn (instrucciones ejecutables de PROLOG) tienen el siguiente aspecto:

hija (*A, *B) :- mujer (*A), padre (*B, *A). que podría leerse así: "A es hija de B si A es mujer y B es padre de A".

Obsérvese que, en PROLOG, el símbolo :- separa la conclusión de las condiciones. En PROLOG, las variables se escriben comenzando con un asterisco. Todas las condiciones deben cumplirse simultáneamente para que la conclusión sea válida. Por tanto, la coma que separa las distintas condiciones es equivalente a la conjunción copulativa (en algunas versiones de PROLOG se sustituye la coma por el símbolo &). La disyunción, en cambio, no se representa mediante símbolos especiales, sino definiendo reglas nuevas, como la siguiente:

hija (*A, *B) :- mujer (*A), madre (*B, *A). que podría leerse así: "A es hija de B si A es mujer y B es madre de A".

Cláusulas con como mucho un literal positivo.

http://es.wikipedia.org/wiki/PROLOG




artificial intelligence inference in first-order logic

Documents