Chapter 10 Learning Sets Of Rules


TRANSCRIPT

Page 1: Chapter 10 Learning Sets Of Rules. Content Introduction Sequential Covering Algorithm Learning First-Order Rules (FOIL Algorithm) Induction As Inverted

Chapter 10 Learning Sets Of Rules

Page 2:

Content

Introduction Sequential Covering Algorithm Learning First-Order Rules

(FOIL Algorithm) Induction As Inverted Deduction Inverting Resolution

Page 3:


Introduction

GOAL: learn a target function as a set of IF-THEN rules.

BEFORE: learning with decision trees
- learn the decision tree
- translate the tree into a set of IF-THEN rules (one rule per leaf)

OTHER POSSIBILITY: learning with genetic algorithms
- each rule set is coded as a bit vector
- several genetic operators are applied to the hypothesis space

TODAY AND HERE:
- first: learning rules in propositional form
- second: learning rules in first-order form (Horn clauses, which include variables)
- sequential search for rules, one after the other

Page 4:

Introduction

IF (Outlook = Sunny) ∧ (Humidity = High) THEN PlayTennis = No

IF (Outlook = Sunny) ∧ (Humidity = Normal) THEN PlayTennis = Yes

Page 5:

Introduction

An example of a first-order rule set; target concept: Ancestor

IF Parent(x,y) THEN Ancestor(x,y)
IF Parent(x,y) ∧ Ancestor(y,z) THEN Ancestor(x,z)

The content of this chapter: learning algorithms capable of learning such rules, given sets of training examples.
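The two Ancestor rules compute the transitive closure of the Parent relation. A minimal fixpoint sketch (the facts and the function are illustrative, not from the chapter):

```python
# The two rules above compute the transitive closure of the Parent
# relation; applied repeatedly, rule 2 derives all ancestor facts.
def ancestors(parent):
    # Rule 1: IF Parent(x,y) THEN Ancestor(x,y)
    ancestor = set(parent)
    # Rule 2: IF Parent(x,y) AND Ancestor(y,z) THEN Ancestor(x,z),
    # applied until no new facts are derived (a fixpoint).
    while True:
        new = {(x, z)
               for (x, y) in parent
               for (y2, z) in ancestor
               if y == y2}
        if new <= ancestor:
            return ancestor
        ancestor |= new

parent = {("Ann", "Bob"), ("Bob", "Cid")}  # hypothetical facts
print(sorted(ancestors(parent)))  # [('Ann', 'Bob'), ('Ann', 'Cid'), ('Bob', 'Cid')]
```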

Page 6:

Content

Introduction Sequential Covering Algorithm Learning First-Order Rules

(FOIL Algorithm) Induction As Inverted Deduction Inverting Resolution

Page 7:

Sequential Covering Algorithm

Page 8:


Sequential Covering Algorithm

Goal of such an algorithm: learning a disjunctive set of rules that together classify the training data as well as possible.

Principle: learn rule sets based on the strategy of learning one rule, removing the examples it covers, then iterating this process.

Requirements for the Learn-One-Rule method:
- as input it accepts a set of positive and negative training examples
- as output it delivers a single rule that covers many of the positive examples and few of the negative examples

Required: the output rule has high accuracy, but not necessarily high coverage.

Page 9:


Sequential Covering Algorithm

Procedure:
- the rule-set learner invokes the Learn-One-Rule method on all of the available training examples
- every positive example covered by the learned rule is removed
- finally, the resulting rules can be sorted, so that more accurate rules are considered first

Greedy search: it is not guaranteed to find the smallest or best set of rules that covers the training examples.

Page 10:


Sequential Covering Algorithm

SequentialCovering( target_attribute, attributes, examples, threshold )
  learned_rules ← { }
  rule ← LearnOneRule( target_attribute, attributes, examples )
  while Performance( rule, examples ) > threshold do
    learned_rules ← learned_rules + rule
    examples ← examples − { examples correctly classified by rule }
    rule ← LearnOneRule( target_attribute, attributes, examples )
  learned_rules ← sort learned_rules according to Performance over examples
  return learned_rules
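The loop above can be sketched as runnable Python. The rule representation, the accuracy-based Performance measure, and the toy single-test stand-in for Learn-One-Rule are simplifications of mine, not part of the original algorithm:

```python
# Runnable sketch of SequentialCovering with simplified helpers.

def performance(rule, examples, target):
    # Accuracy of the rule over the examples its precondition covers.
    covered = [e for e in examples
               if all(e[a] == v for a, v in rule["if"].items())]
    if not covered:
        return 0.0
    return sum(e[target] == rule["then"] for e in covered) / len(covered)

def learn_one_rule(target, attributes, examples):
    # Toy stand-in: the single attribute-value test and prediction with
    # the best accuracy on the current examples.
    best, best_perf = None, -1.0
    for a in attributes:
        for v in {e[a] for e in examples}:
            for label in {e[target] for e in examples}:
                rule = {"if": {a: v}, "then": label}
                p = performance(rule, examples, target)
                if p > best_perf:
                    best, best_perf = rule, p
    return best

def sequential_covering(target, attributes, examples, threshold=0.5):
    learned, remaining = [], list(examples)
    rule = learn_one_rule(target, attributes, remaining)
    while remaining and performance(rule, remaining, target) > threshold:
        learned.append(rule)
        # Remove the examples the rule classifies correctly.
        remaining = [e for e in remaining
                     if not (all(e[a] == v for a, v in rule["if"].items())
                             and e[target] == rule["then"])]
        if not remaining:
            break
        rule = learn_one_rule(target, attributes, remaining)
    # More accurate rules (over the full data) first.
    learned.sort(key=lambda r: -performance(r, examples, target))
    return learned
```

On a toy PlayTennis-style table this keeps learning one high-accuracy rule, removing the examples it classifies correctly, until no rule exceeds the threshold.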

Page 11:


General to Specific Beam Search

LearnOneRule( target_attribute, attributes, examples, k )
  Initialise best_hypothesis to the most general hypothesis Ø
  Initialise candidate_hypotheses to the set { best_hypothesis }
  while candidate_hypotheses is not empty do
    1. Generate the next more-specific candidate_hypotheses
    2. Update best_hypothesis
    3. Update candidate_hypotheses
  return a rule of the form "IF best_hypothesis THEN prediction",
  where prediction is the most frequent value of target_attribute among
  those examples that match best_hypothesis.

Performance( h, examples, target_attribute )
  h_examples ← the subset of examples that match h
  return −Entropy( h_examples ), where Entropy is computed with respect to target_attribute

The CN2-Algorithm

Page 12:


General to Specific Beam Search

1. Generate the next more-specific candidate_hypotheses:
  all_constraints ← the set of all constraints (a = v), where a ∈ attributes and v is a value of a occurring in the current set of examples
  new_candidate_hypotheses ← for each h in candidate_hypotheses and each c in all_constraints, create a specialisation of h by adding the constraint c
  remove from new_candidate_hypotheses any hypotheses that are duplicates, inconsistent, or not maximally specific

2. Update best_hypothesis:
  for all h in new_candidate_hypotheses do
    if h is statistically significant when tested on examples
       and Performance( h, examples, target_attribute ) > Performance( best_hypothesis, examples, target_attribute )
    then best_hypothesis ← h
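The generation step can be sketched as follows; representing a hypothesis as a dict of attribute → value tests is an assumption of mine:

```python
# Generating the more specific candidates: add to each hypothesis one
# attribute-value constraint it does not already contain.
def specialisations(candidate_hypotheses, attributes, examples):
    # all constraints (a = v) with v occurring in the current examples
    all_constraints = {(a, e[a]) for a in attributes for e in examples}
    new_candidates = []
    for h in candidate_hypotheses:
        for (a, v) in all_constraints:
            if a not in h:  # skip duplicate/contradictory tests on a
                new_candidates.append({**h, a: v})
    return new_candidates
```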

Page 13:


General to Specific Beam Search

3. Update candidate_hypotheses:
  candidate_hypotheses ← the k best members of new_candidate_hypotheses, according to the Performance function

The Performance function guides the search in Learn-One-Rule:
  s: the current set of training examples
  c: the number of possible values of the target attribute
  p_i: the proportion of examples in s classified with the i-th value

  Entropy(s) = − Σ_{i=1}^{c} p_i log2 p_i
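The Performance function (the negative entropy of the matched examples, as defined above) is small enough to write out; the dict-based example representation is my own:

```python
# Performance for Learn-One-Rule: -Entropy of the examples matched by a
# hypothesis, computed with respect to the target attribute.
from collections import Counter
from math import log2

def neg_entropy(h_examples, target_attribute):
    if not h_examples:
        return 0.0
    n = len(h_examples)
    counts = Counter(e[target_attribute] for e in h_examples)
    # sum of p_i * log2(p_i) is exactly -Entropy(h_examples)
    return sum((c / n) * log2(c / n) for c in counts.values())

# A pure subset scores 0.0; an evenly mixed one scores -1.0.
print(neg_entropy([{"Play": "Yes"}, {"Play": "No"}], "Play"))  # -1.0
```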

Page 14:

Learn-One-Rule

Page 15:


Learning Rule Sets: Summary

Key dimension in the design of a rule-learning algorithm:
- here, sequential covering: learn one rule, remove the positive examples it covers, iterate on the remaining examples
- ID3: simultaneous covering
Which one should be preferred?

Key difference: the choice made at the most primitive step in the search.
- ID3 chooses among attributes by comparing the partitions of the data they generate.
- CN2 chooses among attribute-value pairs by comparing the subsets of data they cover.

Number of choices: to learn n rules, each containing k attribute-value tests in its precondition:
- CN2: n·k primitive search steps
- ID3: fewer independent search steps

If the data is plentiful, it may support the larger number of independent decisions.
If the data is scarce, sharing decisions about the preconditions of different rules may be more effective.

Page 16:


Learning Rule Sets: Summary

Direction of the search in Learn-One-Rule:
- CN2: general-to-specific (cf. Find-S: specific-to-general)
- advantage: there is a single maximally general hypothesis from which to begin the search, whereas there are many maximally specific ones
- GOLEM: chooses several positive examples at random to initialise and guide the search; the best hypothesis obtained over these random choices is selected

Generate-then-test versus example-driven:
- CN2: generate then test
- Find-S and CANDIDATE-ELIMINATION are example-driven
- advantage of the generate-and-test approach: each choice in the search is based on the hypothesis's performance over many examples, so the impact of noisy data is minimised

Page 17:

Content

Introduction Sequential Covering Algorithm Learning First-Order Rules

(FOIL Algorithm) Induction As Inverted Deduction Inverting Resolution

Page 18:

Learning First-Order Rules

Why do that?

We can then learn sets of rules such as
IF Parent(x,y) THEN Ancestor(x,y)
IF Parent(x,y) ∧ Ancestor(y,z) THEN Ancestor(x,z)

Page 19:

Learning First-Order Rules

Terminology
  Term: Mary, x, age(Mary), age(x)
  Literal: Female(Mary), ¬Female(x), Greater_than(age(Mary), 20)
  Clause: M1 ∨ … ∨ Mn
  Horn clause: H ← (L1 ∧ … ∧ Ln)
  Substitution: {x/3, y/z}
  Unifying substitution θ for literals L1 and L2: L1θ = L2θ
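A substitution such as {x/3, y/z} simply maps variables to terms. A tiny illustrative sketch (the tuple encoding of compound terms is mine):

```python
# Applying a substitution: variables/constants are plain values,
# compound terms are tuples of the form (functor, arg1, ...).
def apply_subst(term, subst):
    if isinstance(term, tuple):   # compound term: recurse into arguments
        return (term[0],) + tuple(apply_subst(t, subst) for t in term[1:])
    return subst.get(term, term)  # variable (if bound) or constant

lit = ("Greater_than", ("age", "x"), 20)
print(apply_subst(lit, {"x": "Mary"}))  # ('Greater_than', ('age', 'Mary'), 20)
```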

FOIL Algorithm

(Learning Sets of First-Order Rules)

Page 20:

Cover all positive examples

Avoid all negative examples

Page 21:

Learning First-Order Rules

Analysis of FOIL
- outer loop: specific-to-general (each new rule covers additional positive examples)
- inner loop: general-to-specific (each new literal specialises the current rule)
- differences from Sequential-Covering with Learn-One-Rule: how candidate specialisations of a rule are generated, and the FOIL_Gain measure used to choose among them

Page 22:

Learning First-Order Rules

Candidate Specializations in FOIL (adding the next literal Ln+1)

Page 23:

Learning First-Order Rules

Example

Target predicate: GrandDaughter(x,y)
Other predicates: Father, Female
Initial rule: GrandDaughter(x,y) ←
Candidate literals: Equal(x,y), Female(x), Female(y), Father(x,y), Father(y,x), Father(x,z), Father(z,x), Father(y,z), Father(z,y), plus the corresponding negated literals

A possible specialisation:
  GrandDaughter(x,y) ← Father(y,z)
and a final rule:
  GrandDaughter(x,y) ← Father(y,z) ∧ Father(z,x) ∧ Female(y)

Page 24:

Learning First-Order Rules

Information Gain in FOIL

Assertions in the training data:
  GrandDaughter(Victor,Sharon), Father(Sharon,Bob), Father(Tom,Bob), Female(Sharon), Father(Bob,Victor)

Rule: GrandDaughter(x,y) ←
  variable bindings (16 in total), e.g. {x/Bob, y/Sharon}
  positive example binding (1): {x/Victor, y/Sharon}
  negative example bindings (15), e.g. {x/Bob, y/Tom}, …
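The binding counts are easy to reproduce: with the four constants in the data, the rule GrandDaughter(x,y) ← has 4 × 4 = 16 variable bindings, exactly one of which matches the single positive assertion:

```python
# Enumerating all {x/., y/.} bindings over the four constants.
from itertools import product

constants = ["Victor", "Sharon", "Bob", "Tom"]
positives = {("Victor", "Sharon")}          # GrandDaughter(Victor, Sharon)

bindings = list(product(constants, repeat=2))
pos = [b for b in bindings if b in positives]
neg = [b for b in bindings if b not in positives]
print(len(bindings), len(pos), len(neg))    # 16 1 15
```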

Page 25:

Learning First-Order Rules: Information Gain in FOIL

  L: the candidate literal to add to rule R
  p0: number of positive bindings of R
  n0: number of negative bindings of R
  p1: number of positive bindings of R+L
  n1: number of negative bindings of R+L
  t: the number of positive bindings of R that are still covered by R+L
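These quantities combine into the FOIL_Gain measure, FOIL_Gain(L, R) = t · ( log2( p1 / (p1 + n1) ) − log2( p0 / (p0 + n0) ) ); a direct transcription (the function name is mine):

```python
# FOIL_Gain, using the quantities defined above.
from math import log2

def foil_gain(p0, n0, p1, n1, t):
    # t positive bindings of R remain covered after adding literal L;
    # the gain weights the reduction in information content by t.
    return t * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))

# With the GrandDaughter data (1 positive, 15 negative bindings), a
# literal that keeps the positive binding and excludes every negative
# one gains log2(16) = 4 bits:
print(foil_gain(1, 15, 1, 0, 1))  # 4.0
```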

Page 26:

Learning First-Order Rules

Learning Recursive Rule Sets

The target predicate itself can be among the candidate literals

IF Parent(x,y) THEN Ancestor(x,y)

IF Parent(x,y)∧Ancestor(y,z)

THEN Ancestor(x,z)


Page 28:

Content

Introduction Sequential Covering Algorithm Learning First-Order Rules

(FOIL Algorithm) Induction As Inverted Deduction Inverting Resolution

Page 29:

Induction As Inverted Deduction

Machine learning: building theories that explain the observed data.

Given training data D = { ⟨xi, f(xi)⟩ } and background knowledge B, induction is finding a hypothesis h such that

  ( ∀⟨xi, f(xi)⟩ ∈ D ) ( B ∧ h ∧ xi ) ⊢ f(xi)

where ⊢ denotes entailment.

Page 30:

Induction As Inverted Deduction

Example: target concept Child(u, v)

Page 31:

Induction As Inverted Deduction

What we are interested in is designing inverse entailment operators: an operator O(B, D) that takes the training data and background knowledge as input and produces as output a hypothesis h satisfying the entailment constraint.

Page 32:

Content

Introduction Sequential Covering Algorithm Learning First-Order Rules

(FOIL Algorithm) Induction As Inverted Deduction Inverting Resolution

Page 33:

Inverting Resolution

(Diagram: resolution tree with parent clauses C1 and C2 and resolvent C.)

Resolution operator: from clauses C1 and C2, construct a clause C such that C1 ∧ C2 ⊢ C. In the propositional case, if L ∈ C1 and ¬L ∈ C2, then C = (C1 − {L}) ∪ (C2 − {¬L}).

Page 34:
Page 35:

Inverse resolution operator O( C , C1 ) = C2: given the resolvent C and one parent clause C1, construct a second clause C2 such that C1 ∧ C2 ⊢ C.

Page 36:

Inverting Resolution

First-order resolution

Resolution operator (first-order): find a literal L1 from clause C1, a literal L2 from clause C2, and a unifying substitution θ such that L1θ = ¬L2θ. The resolvent is

  C = ( C1 − {L1} )θ ∪ ( C2 − {L2} )θ

Example:
  C1 = White(x) ∨ ¬Swan(x)    C2 = Swan(Fred)
  L1 = ¬Swan(x), L2 = Swan(Fred), θ = {x/Fred}
  C = White(Fred)
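In the propositional (ground) case the operator reduces to C = (C1 − {L}) ∪ (C2 − {¬L}); a small sketch with clauses as frozensets of string literals (the "~" encoding of negation is mine):

```python
# Propositional resolution on clauses represented as frozensets.
def negate(l):
    return l[1:] if l.startswith("~") else "~" + l

def resolve(c1, c2):
    # Find a literal L in C1 whose negation occurs in C2, then return
    # C = (C1 - {L}) | (C2 - {~L}); None if the clauses do not resolve.
    for l in c1:
        if negate(l) in c2:
            return (c1 - {l}) | (c2 - {negate(l)})
    return None

c1 = frozenset({"White(Fred)", "~Swan(Fred)"})  # C1 with {x/Fred} applied
c2 = frozenset({"Swan(Fred)"})
print(resolve(c1, c2))  # frozenset({'White(Fred)'})
```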

Page 37:

Inverting Resolution

Inverting first-order resolution

Factor the unifier as θ = θ1 θ2, where θ1 contains the substitutions for the variables of C1 and θ2 those for the variables of C2. The inverse resolution rule (first-order) is then

  C2 = ( C − ( C1 − {L1} )θ1 ) θ2⁻¹ ∪ { ¬L1 θ1 θ2⁻¹ }

Page 38:

Inverting Resolution

Example

target predicate : GrandChild(y,x)

D = { GrandChild(Bob,Shannon) }

B = {Father(Shannon,Tom) , Father(Tom,Bob)}

One inverse resolution step, taking C = GrandChild(Bob,Shannon) and C1 = Father(Shannon,Tom):
  L1 = Father(Shannon,Tom), θ1 = {}, θ2⁻¹ = {Shannon/x}
  C2 = GrandChild(Bob,x) ∨ ¬Father(x,Tom)
  equivalently: IF Father(x,Tom) THEN GrandChild(Bob,x)

Page 39:

Inverting Resolution

(Diagram: the inverse resolution step drawn as a tree, with clauses C1 and C2 and resolvent C.)

Page 40:

Summary

Sequential covering algorithm
From propositional rules to first-order rules (the FOIL algorithm)
A second approach to learning first-order rules: inverting resolution