inference in fol all pl inference rules hold in fol, as well as the following additional ones:...

Inference in FOL

All PL inference rules hold in FOL, as well as the following additional ones:

– Universal elimination rule: x, A ===> Subst({x / g}, A). Here x is a variable, A is a wff, g is a ground term, and Subst(, A) is a result of applying the substitution to A.

Example: Consider the formula y likes(Bob, y). From here, we can infer by the UE rule likes(Bob, Tom) if {y / Tom}, likes(Bob, Bob} if

{y / Bob}, likes(Bob, father(Bob) if {y / father(Bob)}.

– Existential elimination rule: x, A ===> Subst({x / k}, A). Here k is a constant term which does not appear elsewhere in the KB. Example: Consider the formula z hates(y, z). Let foe be a new function constant. Then, we can infer by the EE rule hates(y, foe(y)). Here foe(y) designates a particular object hated by y.

Note that in performing existential elimination, it is important to avoid objects

and functions that have already been used, because otherwise we would be

able to infer hates(Bob, Bob) from the fact z hates(Bob, z) which has much

weaker meaning.

Inference in FOL (cont.)

– Existential introduction rule: A ===> x Subst({g / x}, A).

Example: From hates(Bob, apple), we can infer by the EI rule

x hates(Bob, x).

Given any set of inference rules, we say that conclusion is derivable

from a set of premises iff: , or is the result of applying a rule of inference to sentences derivable

from .

A derivation of from is a sequence of sentences in which each

sentence is either a member of , or is a result of applying a rule of

inference to sentences generated earlier in the sequence.

Example

We know that horses are faster than dogs, and that there is a greyhound that is

faster than any rabbit. We also know that Harry is a horse, and Ralph is a rabbit.

We want to derive the fact that Harry is faster than Ralph.

Step 1: Build the initial knowledge base (i.e. the set )

1. x, y horse(x) & dog(y) => faster(x, y)

2. y greyhound(y) & ( z rabbit(z) => faster(y, z))

3. y greyhound(y) => dog(y) Note that this piece of knowledge is not

explicitly stated in the problem description.

4. x, y, z faster(x, y) & faster(y, z) => faster(x, z) We must explicitly say

that faster is a transitive relationship.

5. Horse(Harry)

6. Rabbit(Ralph)

Example (cont.)

Step 2: State the goal, i.e. the formula that must be derived from the initial set of

premises 1 to 6.

faster(Harry, Ralph)

Step 3: Apply inference rules to and all newly derived formulas, until the goal

is reached.

• From 2.) and the existential elimination rule, we get

7. greyhound(Greg) & ( z rabbit(z) => faster(Greg, z))

• From 7.) and the and-elimination rule, we get

8. greyhound(Greg)

9. z rabbit(z) => faster(Greg, z)

• From 9.) and the universal elimination rule, we get

10. rabbit(Ralph) => faster(Greg, Ralph)

• From 10.), 6.) and MP, we get

11. Faster(Greg, Ralph)

Example (cont.)

• From 3.) and the universal elimination rule, we get

12. greyhound(Greg) => dog(Greg)


13. dog(Greg)

• From 1.) and the universal elimination rule applied twice, we get

14. horse(Harry) & dog(Greg) => faster(Harry, Greg)

• From 5.), 13.) and the and-introduction rule, we get

15. horse(Harry) & dog(Greg)


16. faster(Harry, Greg)

• From 4.) and the universal elimination rule applied three times, we get

17. faster(Harry, Greg) & faster(Greg, Ralph) => faster(Harry, Ralph)

• From 16.), 11.) and the and-introduction rule, we get

18. faster(Harry, Greg) & faster(Greg, Ralph)


19. faster(Harry, Ralph)

Inference in FOL (cont.)

There are two important conclusions that we can make from these examples:

1. The derivation process is completely mechanical: each conclusion

follows from previous conclusions by a mechanical application of a

rule of inference.

2. The derivation process can be viewed as a search process, where

inference rules are the operators transforming one state of the

search space into another state. We know that a search problem

of this type has an exponential complexity (only the universal elimination

operator has an enormous branching factor). Notice, however, that in the

derivation process we have used 3 inference rules (universal elimination,

and-introduction and MP) in combination. If we can substitute them with a

single rule, then the derivation process will become much more efficient.

Towards developing a more efficient inference procedure utilizing a single inference rule.

Consider the following sentences:

1. missile(M1)

2. y owns(y, M1)

3. x missile(x) & owns(Nono, x) => sells(West, Nono, x).

In order to infer sells(West, Nono, M1), we must first apply the universal

elimination rule with substitution {y / Nono} to 2.) to infer owns(Nono, M1).

Then, apply the same rule with substitution {x / M1} to 3.) to infer

missile(M1) & owns(Nono, M1) => sells(West, Nono, M1).

Then, by applying the and-introduction rule to missile(M1) and owns(Nono, M1),

we get missile(M1) & owns(Nono, M1), which finally will let us derive

sells(West, Nono, M1) by applying the MP rule. All this can be combined in a

single inference rule, called the generalized modus ponens.

Generalized Modus Ponens

Given a set of n atomic sentences, p1’, p2’, …, pn’, and q, and one implication

p1 & p2 & … & pn => q, the generalized MP rule states that:

p1’, p2’, …, pn’, (p1 & p2 & … & pn => q) ===> Subst(, q)

where Subst(, pi’) = Subst(, pi) for all i. The process of computing

substitution is called unification.

In the “Colonel West” example, we have the following:

p1’: missile(M1)

p1: missile(x)

p2’: owns(y, M1)

p2: owns(Nono, x)

: {x / M1, y / Nono}

q: sells(West, Nono, x}

Subst(, q): sells(West, Nono, M1)

Canonical form of a set of FOL formulas

Generalized MP (GMP) can only be applied to sets of Horn formulas. These areformulas of the following form:

A1 & A2 & ... & An ==> B,

where A1, A2, ..., An, B are positive literals (a literal is a formula or its negation)

Therefore, to utilize the GMP we must first convert the initial knowledge base intoa canonical form, where all sentences are Horn formulas. Assuming that thishas already been done, the generalized MP can be applied in two ways:

1. Given the initial description of the problem, generate all possible conclusions that can be derived from the initial set of formulas by the GMP. This type of reasoning from the premises to the conclusions is called forward chaining.

2. Given the goal, find implication sentences that would derive this goal. This type of reasoning from the goal to the premises is called backward chaining.

Forward chaining: an overview

We start with a set of data collected through observations, and check each rule to

see if the data satisfy its premises. This process is called rule interpretation; it is

performed by the Inference Engine (IE) and involves the following steps:

Rules New rules

Applicable Selected

rules rules

Facts New facts

Knowledge (rule memory)

Facts(working memory)

Step1:Match

Step2:Conflict

Resolution

Step 3:Execution

Augmenting the rule memory with new rules: explanation-based learning

To illustrate how one type of machine learning, known as explanation-basedlearning, can help an AI agent to augment its knowledge consider again the5 puzzle problem. Let the following solution be found by the breadth-first searchfor the initial state (4 5 3 0 1 2):

19 ;; the length of the shortest path((4 5 3 0 1 2) (0 5 3 4 1 2) (5 0 3 4 1 2) (5 1 3 4 0 2) (5 1 3 4 2 0) (5 1 0 4 2 3) (5 0 1 4 2 3) (0 5 1 4 2 3) (4 5 1 0 2 3) (4 5 1 2 0 3) (4 0 1 2 5 3) (4 1 0 2 5 3)(4 1 3 2 5 0) (4 1 3 2 0 5) (4 1 3 0 2 5) (0 1 3 4 2 5) (1 0 3 4 2 5) (1 2 3 4 0 5)(1 2 3 4 5 0))

Let us rewrite this solution in terms of moves of the empty tile (not in terms ofstates as we did it so far):

up right down right up left left down right up right down left left up right down right

Search for expensive or repetitive components of the reasoning (search) process

The two outlined components of the solution are exactly the same. Instead of

repeating the work involved in the generation of intermediate states, we can

create a new operator which, if applicable, will transform the starting state of

the sequence into the finish state of the sequence. That is,

4 5 3

0 1 2

5 1 3

4 2 0

4 1 3

0 2 5

1 2 3

4 5 0

new

operator

new

operator

Here is the revised set of operators for the specified position of the empty tile:

(defun move-4 (state) ;; defines all moves if the empty tile is in position 4

(setf new-states (cons (append (list (second state)) ;; the new operator

(list (fifth state)) (list (third state)) ;; will be tried

(list (first state)) (list (sixth state)) ;; first

(list (fourth state))) new-states))

(setf new-states (cons (append (list (fourth state))

(list (second state)) (list (third state)) (list (first state))

(nthcdr 4 state)) new-states))

(setf new-states (cons (append (butlast state 3)

(list (fifth state)) (list (fourth state)) (last state))

new-states)))

The revised solution has a length of 13, and as seen below the new operator

Was applied twice, at state 1 (4 5 3 0 1 2) and state 12 (4 1 3 0 2 5). Here is the

new solution after learning:

((4 5 3 0 1 2) (5 1 3 4 2 0) (5 1 0 4 2 3) (5 0 1 4 2 3) (0 5 1 4 2 3) (4 5 1 0 2 3) (4 5 1 2 0 3)

(4 0 1 2 5 3) (4 1 0 2 5 3) (4 1 3 2 5 0) (4 1 3 2 0 5) (4 1 3 0 2 5) (1 2 3 4 5 0))

Forward chaining: The fruit identification example (adapted from Gonzalez & Dankel)

To illustrate chaining algorithms (no learning component is assumed), considerthe following set of rules intended to recognize different fruits provided fruit descriptions.

Rule 1: If (shape = long) and (color = green) Then (fruit = banana)Rule 1A: If (shape = long) and (color = yellow) Then (fruit = banana)Rule 2: If (shape = round) and (diameter > 4 inches) Then (fruitclass = vine)Rule 2A: If (shape = oblong) and (diameter > 4 inches) Then (fruitclass = vine)Rule 3: If (shape = round) and (diameter < 4 inches) Then (fruitclass = tree)Rule 4: If (seedcount = 1) Then (seedclass = stonefruit)Rule 5: If (seedcount > 1) Then (seedclass = multiple)Rule 6: If (fruitclass = vine) and (color = green) Then (fruit = watermelon)

The fruit identification example (cont.)

Rule 7: If (fruitclass = vine) and (surface = smooth) and (color = yellow)

Then (fruit = honeydew)

Rule 8: If (fruitclass = vine) and (surface = rough) and (color = tan)

Then (fruit = cantaloupe)

Rule 9: If (fruitclass = tree) and (color = orange) and (seedclass = stonefruit)

Then (fruit = apricot)

Rule 10: If (fruitclass = tree) and (color = orange) and (seedclass = multiple)

Then (fruit = orange)

Rule 11: If (fruitclass = tree) and (color = red) and (seedclass = stonefruit)

Then (fruit = cherry)

Rule 12: If (fruitclass = tree) and (color = orange) and (seedclass = stonefruit)

Then (fruit = peach)

Rule 13: If (fruitclass = tree) and (color = red) and (seedclass = multiple)

Then (fruit = apple)

Rule 13A: If (fruitclass = tree) and (color = yellow) and (seedclass = multiple)


Rule 13B: If (fruitclass = tree) and (color = green) and (seedclass = multiple)



Assume the following set of facts comprising the initial working memory:

FB = ((diameter = 1 inch) (shape = round) (seedcount = 1)

(color = red))

The forward reasoning process is carried out as follows:

Cycle 1:

Step1 (matching) Rules 3 and 4 are applicable

Step2 (conflict resolution) Select rule 3

Step 3 (execution) FB (fruitclass = tree)

Cycle 2:

Step1 (matching) Rules 3 and 4 are applicable


Step 3 (execution) FB (seedclass = stonefruit)


Cycle 3:

Step1 (matching) Rules 3, 4 and 11 are applicable


Step 3 (execution) FB (fruit = cherry)

Cycle 4:

Step1 (matching) Rules 3, 4 and 11 are applicable

Step2 (conflict resolution) No new rule can be selected. Stop.

Forward chaining: general rule format

Rules used to represent “diagnostic” or procedural knowledge have the following format:

If <antecedent 1> is true, <antecedent 2> is true, … <antecedent i> is trueThen <consequent> is true.

The rule interpretation procedure utilizes the idea of renaming, which states that one sentence is a renaming of another if they are the same except for thenames of the variables, called pattern variables.Examples:

• likes(x, ice-cream) is a renaming of likes(y, ice-cream)• flies(bird1) is a renaming of flies(bird2), and both are renaming of

flies(Tom). Here x, y, bird1 and bird2 are pattern variables.

Pattern matching

To recognize pattern variables more easily, we will arrange them in two-element

lists, where the first element is ?, and the second element is the pattern variable.

Examples:

(color (? X) red)

(color apple (? Y))

(color (? X) (? Y))

If the pattern contains no pattern variables, then the pattern matches the “basic”

statement (called a datum) iff the pattern and the datum are exactly the same.

Example: Pattern (color apple red) matches datum (color apple red).

If the pattern contains a pattern variable, then an appropriate substitution must be

found to make the pattern match the datum.

Example: Pattern (color (? X) red) matches datum (color apple red) given substitution = {x / apple}

Pattern matching (cont.)

To implement pattern matching, we need a function, match, which works as

follows:

> (match ’(color (? X) red) ’(color apple red))

((X apple))

> (match ’((? Animal) is a parent of (? Child)) ’(Tim is a parent of Bob))

((Child Bob) (Animal Tim))

(? _) will denote anonymous variables; these match everything.

Example: (color (? _) (? _)) match (color apple red), (color grass green), etc.

Macro procedures in LISP

Macro procedures in Lisp have the following format:

(defmacro <macro name> (<parameter1> … <parameter m>)

<form 1> … <form n>)

Contrary to ordinary procedures they do not evaluate their arguments, and

evaluating the body results in a new form which is then evaluated to

produce a value.

Example: Develop a procedure, when-plusp, which prints out alarm whenever variable pressure becomes greater than zero.

Version 1.

> (defun when-plusp (number result)

(when (plusp number) result))

> (setf pressure -3)

> (when-plusp pressure (print ’Alarm))

ALARM ;the side effect produced when the second argument is evaluated

NIL

Example (cont.)

Version 2.

> (defmacro when-plusp (number result)

(list ’when (list ’plusp number) result))

> (when-plusp pressure (print ’Alarm))

Step 1: Evaluation of the body produces the following

form

(when (plusp pressure) (print ’Alarm))

Step 2A: Evaluation of the form generated at the first step

produces the final result if pressure is greater than zero.

ALARM

ALARM

Step 2B: Evaluation of the form generated at the first step

produces the final result if pressure is less than zero.

NIL

The backquote mechanism

Recall that the normal quote, ’ , isolates the entire expression from evaluation;

the backquote, `, does not have such a strong meaning. Whenever a comma

appears inside a backquoted expression, the sub-expression immediately

following the comma is replaced by its value.

* (setf variable 'test)

TEST

* `(This is a ,variable)

(THIS IS A TEST)

* (setf variable '(Yet another example))

(YET ANOTHER EXAMPLE)

* `(This is a ,variable)

(THIS IS A (YET ANOTHER EXAMPLE))

* `(This is a ,@variable)

(THIS IS A YET ANOTHER EXAMPLE)

The backquote mechanism is widely used in macro procedures to fill-in the macro

templates.

Object streams

Streams are lists of objects intended to be processed in the exact order in

which they appear in the stream, namely from the front to the back of the

stream. When a new object is added to the stream, it must be added to its back.

To represent streams, we can use ordinary lists. Then, first allow us to access

the first element, and rest will trim the first element off. However, we can also

access the other elements of the list by means of second, third, etc., thus

violating the definition of the stream. To prevent this from happening, streams

will be represented as two-element lists, where the first element is the first

object in the stream, and the second element is itself a stream.

Example:* 'empty-stream

EMPTY-STREAM

* (stream-cons 'object1 'empty-stream)

(OBJECT1 EMPTY-STREAM)

* (stream-cons 'object1 '(OBJECT1 EMPTY-STREAM) )

(OBJECT1 (OBJECT1 EMPTY-STREAM))

Operations on streams

(defun stream-endp (stream) (eq stream 'empty-stream)) (defun stream-first (stream) (first stream)) (defun stream-rest (stream) (second stream)) (defun stream-cons (object stream) (list object stream))

(defun stream-append (stream1 stream2)

(if (stream-endp stream1) stream2

(stream-cons (stream-first stream1)

(stream-append (stream-rest stream1)

stream2))))

(defun stream-concatenate (streams)

(if (stream-endp streams) 'empty-stream

(if (stream-endp (stream-first streams))

(stream-concatenate (stream-rest streams))

(stream-cons (stream-first (stream-first streams))

(stream-concatenate

(stream-cons (stream-rest

(stream-first streams)) (stream-rest streams)))))))

Operations on streams (cont.)

(defun stream-transform (procedure stream)

(if (stream-endp stream) 'empty-stream

(stream-cons (funcall procedure (stream-first stream))

(stream-transform procedure

(stream-rest stream)))))

(defun stream-member (object stream)

(cond ((stream-endp stream) nil)

((equal object (stream-first stream)) t)

(t (stream-member object (stream-rest stream)))))

(defmacro stream-remember (object variable)

`(unless (stream-member ,object ,variable)

(setf ,variable

(stream-append ,variable

(stream-cons ,object 'empty-stream)))

,object))

Here STREAM-REMEMBER is a macro that inserts new assertions at the end of the stream, so that

when the stream is processed the assertions will be processed in the order in which they have been

entered.

An implementation of forward chaining

Consider the “Zoo” example from Winston & Horn. The KBS is intended to identify

animals provided their descriptions. Assume that all facts about the domain are

stored in the fact base, *assertions*, represented as a stream, and all rules

are stored in the rule base, *rules*, also represented as a stream. Pattern

variables make it possible for a rule to match the fact base in a different way.

Let us keep all such possibilities in a binding stream.

Example: Consider the following rule set containing just one rule:

((identify

((? Animal) is a (? Species))

((? Animal) is a parent of (? Child))

((? Child) is a (? Species))) empty-stream)

Let the fact base contains the following facts:

((Bozo is a dog) ((Deedee is a horse) ((Deedee is a parent of sugar)

((Deedee is a parent of Brassy) empty-stream))))

Example (cont.)

The first match produces the following two-element binding stream:

(((species dog) (animal bozo))

((species horse) (animal deedee)) empty-stream)

Next, the second rule antecedent is matched against each of the assertions in

The fact base. However, this time we have the binding list from the first step,

Which suggests particular substitutions:

1 Matching ((? Animal) is a parent of (? Child)) against the fact base fails for substitution ((species dog) (animal bozo))

2 Matching ((? Animal) is a parent of (? Child)) against the fact base given the substitution ((species horse) (animal deedee)) succeeds resulting in a new binding list:

(((child sugar) (species horse) (animal deedee))

((child brassy) (species horse) (animal deedee)) empty-stream)

inference in fol all pl inference rules hold in fol, as well as the following additional ones:...

Documents