Iterative Dichotomiser 3 (ID3) Algorithm
Medha Pradhan, CS 157B, Spring 2007
Agenda
- Basics of Decision Trees
- Introduction to ID3
- Entropy and Information Gain
- Two Examples
Basics
What is a decision tree?
- A tree where each branching (decision) node represents a choice between two or more alternatives, and every branching node lies on a path to a leaf node.
- Decision node: specifies a test of some attribute.
- Leaf node: indicates the classification of an example.
ID3
- Invented by J. Ross Quinlan.
- Employs a top-down greedy search through the space of possible decision trees. Greedy because there is no backtracking: at each node it commits to the locally best attribute and never revisits the choice.
- At each step it selects the attribute that is most useful for classifying the examples, i.e. the attribute with the highest information gain.
Entropy
Entropy measures the impurity of an arbitrary collection of examples. For a collection S containing positive and negative examples:

    Entropy(S) = -p+ log2(p+) - p- log2(p-)

where p+ is the proportion of positive examples and p- is the proportion of negative examples.

In general, Entropy(S) = 0 if all members of S belong to the same class, and Entropy(S) = 1 (the maximum for two classes) when the members are split equally between the classes.
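The boundary cases above can be checked with a short Python sketch (not from the original slides; the convention 0·log2(0) = 0 is used for empty classes):

```python
import math

def entropy(counts):
    """Entropy of a collection S, given per-class example counts.

    counts -- e.g. [4, 2] for a set with 4 positive and 2 negative examples.
    """
    total = sum(counts)
    ent = 0.0
    for c in counts:
        if c > 0:                 # treat 0 * log2(0) as 0
            p = c / total
            ent -= p * math.log2(p)
    return ent

print(entropy([6, 0]))            # all one class -> 0.0
print(entropy([3, 3]))            # even split    -> 1.0 (the maximum)
print(round(entropy([4, 2]), 5))  # a mixed set   -> 0.9183
```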
Information Gain
Information gain measures the expected reduction in entropy from partitioning on an attribute: the higher the gain, the greater the expected reduction in entropy.

    Gain(S, A) = Entropy(S) - Σ (|Sv| / |S|) * Entropy(Sv), summed over v in Values(A)

where Values(A) is the set of all possible values for attribute A, and Sv is the subset of S for which attribute A has value v.
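The gain formula can be sketched in Python as follows. The four-row dataset is a made-up illustration (not from the slides) in which attribute 'a' separates the classes perfectly and 'b' carries no information:

```python
from collections import Counter
import math

def entropy_of(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Gain(S, A) = Entropy(S) - sum over v of |Sv|/|S| * Entropy(Sv)."""
    gain = entropy_of([r[target] for r in rows])
    for v in set(r[attr] for r in rows):
        sv = [r[target] for r in rows if r[attr] == v]
        gain -= len(sv) / len(rows) * entropy_of(sv)
    return gain

# Hypothetical 4-example set: 'a' matches the class exactly, 'b' is noise.
rows = [
    {"a": "y", "b": "y", "cls": "+"},
    {"a": "y", "b": "n", "cls": "+"},
    {"a": "n", "b": "y", "cls": "-"},
    {"a": "n", "b": "n", "cls": "-"},
]
print(info_gain(rows, "a", "cls"))  # 1.0: a perfect split removes all entropy
print(info_gain(rows, "b", "cls"))  # 0.0: no reduction in entropy
```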
Example 1
Sample training data to determine whether an animal lays eggs. Warm-blooded, Feathers, Fur, and Swims are the independent (condition) attributes; Lays Eggs is the dependent (decision) attribute.

| Animal    | Warm-blooded | Feathers | Fur | Swims | Lays Eggs |
|-----------|--------------|----------|-----|-------|-----------|
| Ostrich   | Yes          | Yes      | No  | No    | Yes       |
| Crocodile | No           | No       | No  | Yes   | Yes       |
| Raven     | Yes          | Yes      | No  | No    | Yes       |
| Albatross | Yes          | Yes      | No  | No    | Yes       |
| Dolphin   | Yes          | No       | No  | Yes   | No        |
| Koala     | Yes          | No       | Yes | No    | No        |
S = [4Y,2N]
Entropy(S) = -(4/6)log2(4/6) - (2/6)log2(2/6) = 0.91829

Now we have to find the IG for all four attributes: Warm-blooded, Feathers, Fur, Swims.
For attribute 'Warm-blooded': Values(Warm-blooded) = [Yes, No], S = [4Y,2N]
  SYes = [3Y,2N], E(SYes) = 0.97095
  SNo  = [1Y,0N], E(SNo)  = 0 (all members belong to the same class)
  Gain(S, Warm-blooded) = 0.91829 - [(5/6)*0.97095 + (1/6)*0] = 0.10916

For attribute 'Feathers': Values(Feathers) = [Yes, No], S = [4Y,2N]
  SYes = [3Y,0N], E(SYes) = 0
  SNo  = [1Y,2N], E(SNo)  = 0.91829
  Gain(S, Feathers) = 0.91829 - [(3/6)*0 + (3/6)*0.91829] = 0.45914
For attribute 'Fur': Values(Fur) = [Yes, No], S = [4Y,2N]
  SYes = [0Y,1N], E(SYes) = 0
  SNo  = [4Y,1N], E(SNo)  = 0.7219
  Gain(S, Fur) = 0.91829 - [(1/6)*0 + (5/6)*0.7219] = 0.3167

For attribute 'Swims': Values(Swims) = [Yes, No], S = [4Y,2N]
  SYes = [1Y,1N], E(SYes) = 1 (equal members in both classes)
  SNo  = [3Y,1N], E(SNo)  = 0.81127
  Gain(S, Swims) = 0.91829 - [(2/6)*1 + (4/6)*0.81127] = 0.04411
Summary:
  Gain(S, Warm-blooded) = 0.10916
  Gain(S, Feathers)     = 0.45914
  Gain(S, Fur)          = 0.31670
  Gain(S, Swims)        = 0.04411

Gain(S, Feathers) is maximum, so Feathers becomes the root node.
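As a cross-check, the four gains above can be recomputed with a short self-contained Python sketch over the animal table (illustrative only, not from the slides):

```python
from collections import Counter
import math

def entropy_of(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Gain(S, A) = Entropy(S) - sum over v of |Sv|/|S| * Entropy(Sv)."""
    gain = entropy_of([r[target] for r in rows])
    for v in set(r[attr] for r in rows):
        sv = [r[target] for r in rows if r[attr] == v]
        gain -= len(sv) / len(rows) * entropy_of(sv)
    return gain

# The six training examples from the table above.
animals = [
    {"Warm-blooded": "Yes", "Feathers": "Yes", "Fur": "No",  "Swims": "No",  "Lays Eggs": "Yes"},  # Ostrich
    {"Warm-blooded": "No",  "Feathers": "No",  "Fur": "No",  "Swims": "Yes", "Lays Eggs": "Yes"},  # Crocodile
    {"Warm-blooded": "Yes", "Feathers": "Yes", "Fur": "No",  "Swims": "No",  "Lays Eggs": "Yes"},  # Raven
    {"Warm-blooded": "Yes", "Feathers": "Yes", "Fur": "No",  "Swims": "No",  "Lays Eggs": "Yes"},  # Albatross
    {"Warm-blooded": "Yes", "Feathers": "No",  "Fur": "No",  "Swims": "Yes", "Lays Eggs": "No"},   # Dolphin
    {"Warm-blooded": "Yes", "Feathers": "No",  "Fur": "Yes", "Swims": "No",  "Lays Eggs": "No"},   # Koala
]
gains = {a: info_gain(animals, a, "Lays Eggs")
         for a in ("Warm-blooded", "Feathers", "Fur", "Swims")}
for a, g in gains.items():
    print(f"{a}: {g:.5f}")
print("best:", max(gains, key=gains.get))  # best: Feathers
```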
Feathers
  Yes -> [Ostrich, Raven, Albatross] : Lays Eggs
  No  -> [Crocodile, Dolphin, Koala] : ?

The 'Yes' descendant has only positive examples, so it becomes a leaf node with classification 'Lays Eggs'.
We now repeat the procedure on the 'No' branch.
S = [Crocodile, Dolphin, Koala] = [1Y,2N]
Entropy(S) = -(1/3)log2(1/3) - (2/3)log2(2/3) = 0.91829

| Animal    | Warm-blooded | Feathers | Fur | Swims | Lays Eggs |
|-----------|--------------|----------|-----|-------|-----------|
| Crocodile | No           | No       | No  | Yes   | Yes       |
| Dolphin   | Yes          | No       | No  | Yes   | No        |
| Koala     | Yes          | No       | Yes | No    | No        |
For attribute 'Warm-blooded': Values(Warm-blooded) = [Yes, No], S = [1Y,2N]
  SYes = [0Y,2N], E(SYes) = 0
  SNo  = [1Y,0N], E(SNo)  = 0
  Gain(S, Warm-blooded) = 0.91829 - [(2/3)*0 + (1/3)*0] = 0.91829

For attribute 'Fur': Values(Fur) = [Yes, No], S = [1Y,2N]
  SYes = [0Y,1N], E(SYes) = 0
  SNo  = [1Y,1N], E(SNo)  = 1
  Gain(S, Fur) = 0.91829 - [(1/3)*0 + (2/3)*1] = 0.25162

For attribute 'Swims': Values(Swims) = [Yes, No], S = [1Y,2N]
  SYes = [1Y,1N], E(SYes) = 1
  SNo  = [0Y,1N], E(SNo)  = 0
  Gain(S, Swims) = 0.91829 - [(2/3)*1 + (1/3)*0] = 0.25162

Gain(S, Warm-blooded) is maximum, so Warm-blooded is chosen for this branch.
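The same numerical cross-check works for the three-example subset (a self-contained sketch, not from the slides):

```python
from collections import Counter
import math

def entropy_of(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Gain(S, A) = Entropy(S) - sum over v of |Sv|/|S| * Entropy(Sv)."""
    gain = entropy_of([r[target] for r in rows])
    for v in set(r[attr] for r in rows):
        sv = [r[target] for r in rows if r[attr] == v]
        gain -= len(sv) / len(rows) * entropy_of(sv)
    return gain

# The Feathers = No branch: Crocodile, Dolphin, Koala.
subset = [
    {"Warm-blooded": "No",  "Fur": "No",  "Swims": "Yes", "Lays Eggs": "Yes"},  # Crocodile
    {"Warm-blooded": "Yes", "Fur": "No",  "Swims": "Yes", "Lays Eggs": "No"},   # Dolphin
    {"Warm-blooded": "Yes", "Fur": "Yes", "Swims": "No",  "Lays Eggs": "No"},   # Koala
]
for a in ("Warm-blooded", "Fur", "Swims"):
    print(a, round(info_gain(subset, a, "Lays Eggs"), 5))
# Warm-blooded wins: it splits the subset into two pure groups.
```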
The final decision tree will be:

Feathers
  Yes -> Lays Eggs
  No  -> Warm-blooded
           Yes -> Does not lay eggs
           No  -> Lays Eggs
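The finished tree is small enough to transcribe directly as a classification function (an illustrative sketch; the function name and dict keys are our own, following the table headers):

```python
def lays_eggs(animal):
    """Classify with the learned tree: Feathers at the root,
    Warm-blooded under the Feathers = No branch."""
    if animal["Feathers"] == "Yes":
        return "Yes"
    # Feathers = No branch: decided by Warm-blooded
    return "Yes" if animal["Warm-blooded"] == "No" else "No"

# The six training examples; the tree only consults Feathers and Warm-blooded.
training = [
    ("Ostrich",   "Yes", "Yes", "Yes"),
    ("Crocodile", "No",  "No",  "Yes"),
    ("Raven",     "Yes", "Yes", "Yes"),
    ("Albatross", "Yes", "Yes", "Yes"),
    ("Dolphin",   "Yes", "No",  "No"),
    ("Koala",     "Yes", "No",  "No"),
]
for name, warm, feathers, label in training:
    assert lays_eggs({"Warm-blooded": warm, "Feathers": feathers}) == label
print("tree reproduces all 6 training labels")
```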
Example 2
Factors affecting sunburn.

| Name  | Hair   | Height  | Weight  | Lotion | Sunburned |
|-------|--------|---------|---------|--------|-----------|
| Sarah | Blonde | Average | Light   | No     | Yes       |
| Dana  | Blonde | Tall    | Average | Yes    | No        |
| Alex  | Brown  | Short   | Average | Yes    | No        |
| Annie | Blonde | Short   | Average | No     | Yes       |
| Emily | Red    | Average | Heavy   | No     | Yes       |
| Pete  | Brown  | Tall    | Heavy   | No     | No        |
| John  | Brown  | Average | Heavy   | No     | No        |
| Katie | Blonde | Short   | Light   | Yes    | No        |
S = [3+,5-]
Entropy(S) = -(3/8)log2(3/8) - (5/8)log2(5/8) = 0.95443

Find the IG for all four attributes: Hair, Height, Weight, Lotion.

For attribute 'Hair': Values(Hair) = [Blonde, Brown, Red], S = [3+,5-]
  SBlonde = [2+,2-], E(SBlonde) = 1
  SBrown  = [0+,3-], E(SBrown)  = 0
  SRed    = [1+,0-], E(SRed)    = 0
  Gain(S, Hair) = 0.95443 - [(4/8)*1 + (3/8)*0 + (1/8)*0] = 0.45443
For attribute 'Height': Values(Height) = [Average, Tall, Short]
  SAverage = [2+,1-], E(SAverage) = 0.91829
  STall    = [0+,2-], E(STall)    = 0
  SShort   = [1+,2-], E(SShort)   = 0.91829
  Gain(S, Height) = 0.95443 - [(3/8)*0.91829 + (2/8)*0 + (3/8)*0.91829] = 0.26571

For attribute 'Weight': Values(Weight) = [Light, Average, Heavy]
  SLight   = [1+,1-], E(SLight)   = 1
  SAverage = [1+,2-], E(SAverage) = 0.91829
  SHeavy   = [1+,2-], E(SHeavy)   = 0.91829
  Gain(S, Weight) = 0.95443 - [(2/8)*1 + (3/8)*0.91829 + (3/8)*0.91829] = 0.01571

For attribute 'Lotion': Values(Lotion) = [Yes, No]
  SYes = [0+,3-], E(SYes) = 0
  SNo  = [3+,2-], E(SNo)  = 0.97095
  Gain(S, Lotion) = 0.95443 - [(3/8)*0 + (5/8)*0.97095] = 0.34759
Summary:
  Gain(S, Hair)   = 0.45443
  Gain(S, Height) = 0.26571
  Gain(S, Weight) = 0.01571
  Gain(S, Lotion) = 0.34759

Gain(S, Hair) is maximum, so Hair becomes the root node.

Hair
  Blonde -> [Sarah, Dana, Annie, Katie] : ?
  Red    -> [Emily] : Sunburned
  Brown  -> [Alex, Pete, John] : Not Sunburned
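The root-split gains can again be verified with a self-contained Python sketch over the sunburn table (illustrative only):

```python
from collections import Counter
import math

def entropy_of(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Gain(S, A) = Entropy(S) - sum over v of |Sv|/|S| * Entropy(Sv)."""
    gain = entropy_of([r[target] for r in rows])
    for v in set(r[attr] for r in rows):
        sv = [r[target] for r in rows if r[attr] == v]
        gain -= len(sv) / len(rows) * entropy_of(sv)
    return gain

# The eight training examples from the sunburn table.
people = [
    {"Hair": "Blonde", "Height": "Average", "Weight": "Light",   "Lotion": "No",  "Sunburned": "Yes"},  # Sarah
    {"Hair": "Blonde", "Height": "Tall",    "Weight": "Average", "Lotion": "Yes", "Sunburned": "No"},   # Dana
    {"Hair": "Brown",  "Height": "Short",   "Weight": "Average", "Lotion": "Yes", "Sunburned": "No"},   # Alex
    {"Hair": "Blonde", "Height": "Short",   "Weight": "Average", "Lotion": "No",  "Sunburned": "Yes"},  # Annie
    {"Hair": "Red",    "Height": "Average", "Weight": "Heavy",   "Lotion": "No",  "Sunburned": "Yes"},  # Emily
    {"Hair": "Brown",  "Height": "Tall",    "Weight": "Heavy",   "Lotion": "No",  "Sunburned": "No"},   # Pete
    {"Hair": "Brown",  "Height": "Average", "Weight": "Heavy",   "Lotion": "No",  "Sunburned": "No"},   # John
    {"Hair": "Blonde", "Height": "Short",   "Weight": "Light",   "Lotion": "Yes", "Sunburned": "No"},   # Katie
]
gains = {a: info_gain(people, a, "Sunburned")
         for a in ("Hair", "Height", "Weight", "Lotion")}
for a, g in gains.items():
    print(f"{a}: {g:.5f}")
print("best:", max(gains, key=gains.get))  # best: Hair
```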
Repeating again on the 'Blonde' branch:
S = [Sarah, Dana, Annie, Katie] = [2+,2-]
Entropy(S) = 1

| Name  | Hair   | Height  | Weight  | Lotion | Sunburned |
|-------|--------|---------|---------|--------|-----------|
| Sarah | Blonde | Average | Light   | No     | Yes       |
| Dana  | Blonde | Tall    | Average | Yes    | No        |
| Annie | Blonde | Short   | Average | No     | Yes       |
| Katie | Blonde | Short   | Light   | Yes    | No        |

Find the IG for the remaining three attributes: Height, Weight, Lotion.

For attribute 'Height': Values(Height) = [Average, Tall, Short], S = [2+,2-]
  SAverage = [1+,0-], E(SAverage) = 0
  STall    = [0+,1-], E(STall)    = 0
  SShort   = [1+,1-], E(SShort)   = 1
  Gain(S, Height) = 1 - [(1/4)*0 + (1/4)*0 + (2/4)*1] = 0.5
For attribute 'Weight': Values(Weight) = [Average, Light], S = [2+,2-]
  SAverage = [1+,1-], E(SAverage) = 1
  SLight   = [1+,1-], E(SLight)   = 1
  Gain(S, Weight) = 1 - [(2/4)*1 + (2/4)*1] = 0

For attribute 'Lotion': Values(Lotion) = [Yes, No], S = [2+,2-]
  SYes = [0+,2-], E(SYes) = 0
  SNo  = [2+,0-], E(SNo)  = 0
  Gain(S, Lotion) = 1 - [(2/4)*0 + (2/4)*0] = 1

Therefore, Gain(S, Lotion) is maximum, and Lotion is chosen for the Blonde branch.
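A quick sketch confirms these three gains on the four blonde examples; Lotion's gain of 1 means it separates this subset perfectly:

```python
from collections import Counter
import math

def entropy_of(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Gain(S, A) = Entropy(S) - sum over v of |Sv|/|S| * Entropy(Sv)."""
    gain = entropy_of([r[target] for r in rows])
    for v in set(r[attr] for r in rows):
        sv = [r[target] for r in rows if r[attr] == v]
        gain -= len(sv) / len(rows) * entropy_of(sv)
    return gain

# The Hair = Blonde branch: Sarah, Dana, Annie, Katie.
blonde = [
    {"Height": "Average", "Weight": "Light",   "Lotion": "No",  "Sunburned": "Yes"},  # Sarah
    {"Height": "Tall",    "Weight": "Average", "Lotion": "Yes", "Sunburned": "No"},   # Dana
    {"Height": "Short",   "Weight": "Average", "Lotion": "No",  "Sunburned": "Yes"},  # Annie
    {"Height": "Short",   "Weight": "Light",   "Lotion": "Yes", "Sunburned": "No"},   # Katie
]
for a in ("Height", "Weight", "Lotion"):
    print(a, round(info_gain(blonde, a, "Sunburned"), 4))
```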
In this case, the final decision tree will be:

Hair
  Blonde -> Lotion
              Yes -> Not Sunburned
              No  -> Sunburned
  Red    -> Sunburned
  Brown  -> Not Sunburned
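As with Example 1, the learned tree can be transcribed as a small classification function and checked against all eight training rows (an illustrative sketch; names and keys follow the table headers):

```python
def sunburned(person):
    """Classify with the learned tree: Hair at the root, Lotion under Blonde."""
    if person["Hair"] == "Red":
        return "Yes"
    if person["Hair"] == "Brown":
        return "No"
    # Blonde branch: decided by Lotion
    return "No" if person["Lotion"] == "Yes" else "Yes"

# (Name, Hair, Lotion, Sunburned) -- the tree never consults Height or Weight,
# so those two columns are omitted here.
training = [
    ("Sarah", "Blonde", "No",  "Yes"),
    ("Dana",  "Blonde", "Yes", "No"),
    ("Alex",  "Brown",  "Yes", "No"),
    ("Annie", "Blonde", "No",  "Yes"),
    ("Emily", "Red",    "No",  "Yes"),
    ("Pete",  "Brown",  "No",  "No"),
    ("John",  "Brown",  "No",  "No"),
    ("Katie", "Blonde", "Yes", "No"),
]
for name, hair, lotion, label in training:
    assert sunburned({"Hair": hair, "Lotion": lotion}) == label
print("tree reproduces all 8 training labels")
```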
References
- "Machine Learning", Tom Mitchell, McGraw-Hill, 1997.
- "Building Decision Trees with the ID3 Algorithm", Andrew Colin, Dr. Dobb's Journal, June 1996.
- http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/dt_prob1.html
- Professor Sin-Min Lee, SJSU. http://cs.sjsu.edu/~lee/cs157b/cs157b.html