
Page 1: Covering (Rules-based) Algorithm

Chapter 8

Covering (Rules-based) Algorithm

Data Mining Technology

Page 2: Covering (Rules-based) Algorithm

Chapter 8

Covering (Rules-based) Algorithm

Written by Shakhina Pulatova

Presented by Zhao Xinyou

[email protected]

Data Mining Technology

Some materials (examples) are taken from websites.

Page 3: Covering (Rules-based) Algorithm

Contents

What is the Covering (Rule-based) algorithm?
Classification Rules - Straightforward
  1. If-Then rule
  2. Generating rules from Decision Tree
Rule-based Algorithms
  1. The 1R Algorithm / Learn One Rule
  2. The PRISM Algorithm
  3. Other Algorithms
Application of Covering algorithms
Discussion on e/m-learning application

Page 4: Covering (Rules-based) Algorithm

Introduction-App-1

PP87-88

A training data set consists of records described by attributes. Classification rules may be

1. Rules given by people
2. Rules generated by computer

Example rule setting for height:

1. (0, 1.75) -> short
2. [1.75, 1.95) -> medium
3. [1.95, ...) -> tall

(Figure: training data table)
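As a minimal sketch, the hand-written height rules above map directly to an if-then chain in Python (the function name is illustrative, not from the text):

# If-then rules for the height setting above.
def classify_height(height_m):
    if height_m < 1.75:
        return "short"     # rule 1: (0, 1.75)
    elif height_m < 1.95:
        return "medium"    # rule 2: [1.75, 1.95)
    else:
        return "tall"      # rule 3: [1.95, ...)

print(classify_height(1.80))   # -> medium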

Page 5: Covering (Rules-based) Algorithm

Introduction-App-2

PP87-88

How can we get all the tall people in B based on the training data A?

(Figure: training data A plus new data B)

Page 6: Covering (Rules-based) Algorithm

What is a Rule-based Algorithm?

Definition: each classification method uses an algorithm to generate rules from the sample data. These rules are then applied to new data.

Rule-based algorithms provide mechanisms that generate rules by
1. concentrating on a specific class at a time, and
2. maximizing the probability of the desired classification.

The rules should be compact, easy to interpret, and accurate.

PP87-88

Page 7: Covering (Rules-based) Algorithm

Classification Rules - Straightforward

1. If-Then rule
2. Generating rules from Decision Tree

PP88-89

Page 8: Covering (Rules-based) Algorithm

Formal Specification of Rule-based Algorithm

A classification rule r = <a, c> consists of:
- a (antecedent/precondition): a series of tests that can be evaluated as true or false;
- c (consequent/conclusion): the class or classes that apply to instances covered by rule r.

PP88

(Figure: a decision tree splitting on a = 0 and b = 0, with leaf classes X and Y for the combinations a=0,b=0; a=0,b=1; a=1,b=0; a=1,b=1, and the corresponding rule form "if a = x ... then c = y".)
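As a minimal sketch (the class and field names are illustrative, not from the text), the rule r = <a, c> can be represented in Python with the antecedent as a list of true/false tests and the consequent as the class assigned to covered instances:

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Rule:
    antecedent: List[Callable[[Dict], bool]]   # a: series of tests, all must be true
    consequent: str                            # c: class for instances covered by r

    def covers(self, record: Dict) -> bool:
        return all(test(record) for test in self.antecedent)

# Hypothetical rule "if a = 0 and b = 1 then class = Y".
r = Rule([lambda t: t["a"] == 0, lambda t: t["b"] == 1], "Y")
print(r.covers({"a": 0, "b": 1}))   # True -> this instance is classified as Y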

Page 9: Covering (Rules-based) Algorithm

Remarks on Straightforward Classification

The antecedent contains a predicate that can be evaluated as true or false against each tuple in the database. These rules relate directly to a corresponding decision tree (DT) that could be created. A DT can always be used to generate rules, but they are not equivalent. Differences:
- the tree has an implied order in which the splitting is performed; rules have no order.
- a tree is created by looking at all classes; with rules, only one class must be examined at a time.

PP88-89

Page 10: Covering (Rules-based) Algorithm

If-Then rule

A straightforward way to perform classification is to generate if-then rules that cover all cases.

PP88

Page 11: Covering (Rules-based) Algorithm

Generating rules from Decision Tree -1-Con’

Decision Tree


Page 12: Covering (Rules-based) Algorithm

Generating rules from Decision Tree -2-Con’

(Figure: a decision tree with tests on attributes a, b, c, d and leaf classes x and y, from which rules are read off along each path.)

Page 13: Covering (Rules-based) Algorithm

Generating rules from Decision Tree -3-Con’

Page 14: Covering (Rules-based) Algorithm

Remarks

Rules generated from a DT may be more complex and harder to comprehend. Adding a new test or rule requires reshaping the whole tree.

(Figure: a decision tree in which the subtree testing c and d is duplicated under several branches, illustrating the duplicate-subtrees problem.)

Rules obtained without decision trees are more compact and accurate, so many other covering algorithms have been proposed.

PP89-90

(Figure: the earlier decision tree on a = 0 and b = 0 with leaf classes X and Y; a rule such as a=1 and c=0 -> Y cannot be added without rebuilding the tree.)

Page 15: Covering (Rules-based) Algorithm

Rule-based Classification

Generate rules:
1. The 1R Algorithm / Learn One Rule
2. The PRISM Algorithm
3. Other Algorithms

PP90

Page 16: Covering (Rules-based) Algorithm

Generating rules without Decision Trees-1-Con'

Goal: find rules that identify the instances of a specific class.
Generate the "best" rule possible by optimizing the desired classification probability.
Usually, the "best" attribute-value pair is chosen.

Remark: these techniques are also called covering algorithms because they attempt to generate rules that exactly cover a specific class.

Page 17: Covering (Rules-based) Algorithm

Generate Rules-Example-2-Con'

Example 3. Question: we want to generate a rule to classify persons as tall.

Basic format of the rule: if ? then class = tall

Goal: replace "?" with predicates that can be used to obtain the "best" probability of being tall.

PP90

Page 18: Covering (Rules-based) Algorithm

Generate Rules-Algorithms-3-Con'

1. Generate rule R on training data S;
2. Remove the training data covered by rule R;
3. Repeat the process.

PP90
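A minimal Python sketch of this loop, assuming a helper learn_one_rule(data, target_class) (a hypothetical name) that returns the best single rule for the target class plus the records it covers, for example via 1R or PRISM described later:

def sequential_covering(records, target_class, learn_one_rule, max_rules=20):
    # Repeat: learn a rule on the remaining data, then remove the data it covers.
    rules, remaining = [], list(records)
    while any(r["class"] == target_class for r in remaining) and len(rules) < max_rules:
        rule, covered = learn_one_rule(remaining, target_class)
        if not covered:        # no further progress possible; stop
            break
        rules.append(rule)
        remaining = [r for r in remaining if r not in covered]
    return rules               # R = R1 U R2 U R3 ...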

Page 19: Covering (Rules-based) Algorithm

Generate Rules-Example-4-Con' Sequential Covering

(Figure)
(i) Original data
(ii) Step 1: r = NULL
(iii) Step 2: r = R1
(iv) Step 3: r = R1 U R2
(v) Step 4: r = R1 U R2 U R3; a region marked "Wrong Class" remains in the figure.

Page 20: Covering (Rules-based) Algorithm

1R Algorithm / Learn One Rule-Con'

A simple and cheap method: it only generates a one-level decision tree and classifies an object on the basis of a single attribute.

Idea: rules are constructed to test a single attribute and branch for every value of that attribute. For each branch, the assigned class is the one occurring most often in the training data.

PP91

Page 21: Covering (Rules-based) Algorithm

1R Algorithm / Learn One Rule-Con'

Idea:
1. Rules are constructed to test a single attribute and branch for every value of that attribute.
2. For each branch, the assigned class is the one occurring most often in the training data.
3. Take the class with the largest count on each branch as the rule.
4. Evaluate the error rate of each attribute's rules.
5. Choose the attribute whose rules give the minimum error rate.

PP91

(Figure: evaluating the Gender attribute against attributes A2 ... An.
Gender = F: S = 2, M = 5, T = 1, so the rule F -> M makes 3 errors;
Gender = M: S = 1, M = 4, T = 10, so the rule M -> T makes 5 errors;
Total Error = 8 for Gender; another attribute in the figure has Total Error = 3; Total Error = ... is computed the same way for each remaining attribute.)

Page 22: Covering (Rules-based) Algorithm

1R Algorithm

Input:  D // Training Data
        T // Attributes to consider for rules
        C // Classes
Output: R // Rules

Algorithm:
R = Ø;
for all A in T do
  R_A = Ø;
  for all possible values, v, of A do
    for all C_j ∈ C do
      find count(C_j);
    end for
    let C_m be the class with the largest count;
    R_A = R_A ∪ {(A = v) -> (class = C_m)};
  end for
  ERR_A = number of tuples incorrectly classified by R_A;
end for
R = R_A where ERR_A is minimum;

Example: D = the training data; T = {Gender, Height}; C = {C1, C2} with C1 = {F, M} and C2 = (0, ∞).

Counts in the training data:
Gender = F: Short = 3, Medium = 6, Tall = 0  ->  R1 = F -> medium
Gender = M: Short = 1, Medium = 2, Tall = 3  ->  R2 = M -> tall

The Height attribute is evaluated in the same way.
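A minimal Python sketch of 1R for categorical attributes (a hypothetical implementation, not from the text); the record list is illustrative only, since the slides give class counts rather than raw tuples:

from collections import Counter, defaultdict

def one_r(records, attributes, class_key="class"):
    # For each attribute, map every value to its majority class,
    # then keep the attribute whose rules make the fewest errors.
    best = None                                  # (errors, attribute, {value: class})
    for attr in attributes:
        counts = defaultdict(Counter)
        for rec in records:
            counts[rec[attr]][rec[class_key]] += 1
        rules = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        errors = sum(sum(c.values()) - c.most_common(1)[0][1] for c in counts.values())
        if best is None or errors < best[0]:
            best = (errors, attr, rules)
    return best

# Illustrative records only (not the book's data):
data = [{"gender": "M", "class": "tall"},   {"gender": "M", "class": "tall"},
        {"gender": "M", "class": "short"},  {"gender": "F", "class": "medium"},
        {"gender": "F", "class": "medium"}, {"gender": "F", "class": "tall"}]
print(one_r(data, ["gender"]))   # (2, 'gender', {'M': 'tall', 'F': 'medium'})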

Page 23: Covering (Rules-based) Algorithm

Example 5 - 1R-3-Con'

Option  Attribute           Rules                  Error   Total Error
1       Gender              F -> medium            3/9
                            M -> tall              3/6     6/15
2       Height (step=0.1)   (0, 1.6]   -> short    0/2
                            (1.6, 1.7] -> short    0/2
                            (1.7, 1.8] -> medium   0/3
                            (1.8, 1.9] -> medium   0/4
                            (1.9, 2.0] -> medium   1/2
                            (2.0, ∞]   -> tall     0/2     1/15

The rules based on Height are chosen, since Height gives the minimum total error (1/15).

Page 24: Covering (Rules-based) Algorithm

Example 6 - 1R

Option  Attribute     Rules              Error   Total Error
1       outlook       Sunny -> no        2/5
                      Overcast -> yes    0/4
                      Rainy -> yes       2/5     4/14
2       temperature   Hot -> no          2/4
                      Mild -> yes        2/6
                      Cool -> yes        1/4     5/14
3       humidity      High -> no         3/7
                      Normal -> yes      1/7     4/14
4       windy         False -> yes       2/8
                      True -> no         3/6     5/14

Rules based on humidity (High -> no; Normal -> yes) OR rules based on outlook (Sunny -> no; Overcast -> yes; Rainy -> yes); both give the minimum total error, 4/14.

PP92-93
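The totals above can be reproduced with a few lines of Python over the classic 14-row weather ("play") dataset, which these counts appear to come from (the raw rows are assumed here; they are not shown on the slide):

from collections import Counter, defaultdict

# (outlook, temperature, humidity, windy, play) - the standard 14-row weather data.
weather = [
    ("sunny","hot","high",False,"no"),      ("sunny","hot","high",True,"no"),
    ("overcast","hot","high",False,"yes"),  ("rainy","mild","high",False,"yes"),
    ("rainy","cool","normal",False,"yes"),  ("rainy","cool","normal",True,"no"),
    ("overcast","cool","normal",True,"yes"),("sunny","mild","high",False,"no"),
    ("sunny","cool","normal",False,"yes"),  ("rainy","mild","normal",False,"yes"),
    ("sunny","mild","normal",True,"yes"),   ("overcast","mild","high",True,"yes"),
    ("overcast","hot","normal",False,"yes"),("rainy","mild","high",True,"no"),
]

for i, name in enumerate(["outlook", "temperature", "humidity", "windy"]):
    by_value = defaultdict(Counter)
    for row in weather:
        by_value[row[i]][row[-1]] += 1            # class counts per attribute value
    errors = sum(sum(c.values()) - c.most_common(1)[0][1] for c in by_value.values())
    print(name, f"{errors}/14")   # outlook 4/14, temperature 5/14, humidity 4/14, windy 5/14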

Page 25: Covering (Rules-based) Algorithm

PRISM Algorithm-Con'

PRISM generates rules for each class by looking at the training data and adding rules that completely describe all tuples in that class.

It generates only correct or perfect rules: the accuracy of the rules so constructed is 100% on the training data.

The success of a rule is measured by p/t, where
- p is the number of positive instances covered by the rule,
- t is the total number of instances covered by the rule.

Example (class = tall): Gender = Male gives p = 10, t = 10; Gender = Female gives p = 1, t = 8. So R = Gender = Male ...

(Figure: counts per value of Gender among attributes A2 ... An. Gender = F: S = 2, M = 5, T = 1; Gender = M: S = 0, M = 0, T = 10.)

Page 26: Covering (Rules-based) Algorithm

PRISM Algorithm

Input:  D // Training Data
        C // Classes
Output: R // Rules

Steps (over (Attribute -> Value) pairs):
1. Compute p/t for every (Attribute -> Value) pair;
2. Find one or more (Attribute -> Value) pairs with p/t = 100%;
3. Select that (Attribute -> Value) pair as a rule;
4. Repeat 1-3 until no data remain in D.
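A minimal Python sketch of PRISM for categorical data (a hypothetical implementation, not from the text). Besides the steps above, it also grows a rule with extra (attribute = value) tests when no single pair reaches p/t = 100%, which is how a conjunctive rule such as "humidity = normal and windy = false" in the play example arises:

def prism_for_class(records, target, attributes, class_key="class"):
    rules, remaining = [], list(records)
    while any(r[class_key] == target for r in remaining):
        rule, covered = {}, list(remaining)
        # Grow the rule until it is perfect: every covered record belongs to `target`.
        while any(r[class_key] != target for r in covered):
            candidates = []
            for attr in attributes:
                if attr in rule:
                    continue
                for value in {r[attr] for r in covered}:
                    subset = [r for r in covered if r[attr] == value]
                    p = sum(r[class_key] == target for r in subset)
                    candidates.append((p / len(subset), p, attr, value))
            if not candidates:
                return rules                     # no usable test left; stop early
            _, _, attr, value = max(candidates)  # best p/t, ties broken by larger p
            rule[attr] = value
            covered = [r for r in covered if r[attr] == value]
        rules.append(rule)                       # e.g. {"outlook": "overcast"} -> target
        # Remove the tuples this (perfect) rule covers, then look for the next rule.
        remaining = [r for r in remaining
                     if not all(r[a] == v for a, v in rule.items())]
    return rules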

Page 27: Covering (Rules-based) Algorithm

Example 8-Con' - which people may be tall?

Num (Attribute, value) p / t

1 Gender = F 0/9

2 Gender = M 3/6

3 Height ≤ 1.6 0/2

4 1.6< Height ≤ 1.7 0/2

5 1.7< Height ≤ 1.8 0/3

6 1.8< Height ≤ 1.9 0/4

7 1.9< Height ≤ 2.0 1/2

8 2.0< Height 2/2

Compute the value p/t for each pair; which one is 100%? R1 = (2.0 < Height) -> tall

PP94-95

Page 28: Covering (Rules-based) Algorithm

Num (Attribute, value) p / t

… … …

1.9< Height ≤ 1.95 0/1

1.95< Height ≤ 2.0 1/1

R2 = (1.95 < Height ≤ 2.0) -> tall

R = R1 U R2

PP94-96

Page 29: Covering (Rules-based) Algorithm

Example 9-Con’-which days may play?

The predicate outlook=overcast correctly implies play=yes on all four rows

R1=if outlook=overcast, then play=yes

Compute the value p / t

Page 30: Covering (Rules-based) Algorithm

Example 9-Con'

R2=if humidity=normal and windy=false, then play=yes

Page 31: Covering (Rules-based) Algorithm

Example 9-Con'

R3 = …

R = R1 U R2 U R3 U …

Page 32: Covering (Rules-based) Algorithm

Application of Covering Algorithm

Covering algorithms are used to derive classification rules for diagnosing illness, business planning, banking, and government.

They are also applied in machine learning and text classification; applying them to photos, however, is difficult. And so on.

Page 33: Covering (Rules-based) Algorithm

Application on E-learning/M-learning

Adaptive and personalized learning materials; virtual group classification.

(Figure: workflow. Initial learner's information -> classification of learning styles (by similarity or Bayesian methods from Chapter 2 or 3, or by a rule-based algorithm) -> provide adaptive and personalized materials -> collect learning styles -> feedback back into the classification step.)

Page 34: Covering (Rules-based) Algorithm

Discussion