Decision Trees
DESCRIPTION
Some concepts on decision trees
TRANSCRIPT
![Page 1: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/1.jpg)
Decision Trees
![Page 2: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/2.jpg)
What is a tree in CS?
• A tree is a non-linear data structure
• It has a unique node called the root
• Every non-trivial tree has one or more leaf nodes, arranged in different levels
• Trees are always drawn with the root at the top or on the left
• Nodes at a level are connected to nodes at a higher (parent) level or a lower (child) level
• There are no loops in a tree
![Page 3: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/3.jpg)
Decision Trees
• A decision tree (DT) is a hierarchical classification and prediction model
• It is organized as a rooted tree with 2 types of nodes called decision nodes and class nodes
• It is a supervised data mining model used for classification or prediction
![Page 4: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/4.jpg)
An Example Data Set and Decision Tree
| #  | Outlook | Company | Sailboat | Sail? (class) |
|----|---------|---------|----------|---------------|
| 1  | sunny   | big     | small    | yes |
| 2  | sunny   | med     | small    | yes |
| 3  | sunny   | med     | big      | yes |
| 4  | sunny   | no      | small    | yes |
| 5  | sunny   | big     | big      | yes |
| 6  | rainy   | no      | small    | no  |
| 7  | rainy   | med     | small    | yes |
| 8  | rainy   | big     | big      | yes |
| 9  | rainy   | no      | big      | no  |
| 10 | rainy   | med     | big      | no  |

Decision tree induced from this data:

outlook
├── sunny → yes
└── rainy → company
    ├── no → no
    ├── med → sailboat
    │   ├── small → yes
    │   └── big → no
    └── big → yes
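The tree on this slide can be written directly as nested conditionals. Below is a minimal Python sketch (the function name is ours, not from the slides); it reproduces the Sail? label for all ten training instances in the table:

```python
def classify_sail(outlook, company, sailboat):
    # Decision tree from the example data set: sunny days are always "yes";
    # rainy days depend on company, and for "med" company on sailboat size.
    if outlook == "sunny":
        return "yes"
    # outlook == "rainy"
    if company == "no":
        return "no"
    if company == "big":
        return "yes"
    # company == "med": decide on sailboat size
    return "yes" if sailboat == "small" else "no"

# Check the rules against all 10 training instances from the table
data = [
    ("sunny", "big", "small", "yes"), ("sunny", "med", "small", "yes"),
    ("sunny", "med", "big", "yes"),   ("sunny", "no",  "small", "yes"),
    ("sunny", "big", "big", "yes"),   ("rainy", "no",  "small", "no"),
    ("rainy", "med", "small", "yes"), ("rainy", "big", "big", "yes"),
    ("rainy", "no",  "big", "no"),    ("rainy", "med", "big", "no"),
]
assert all(classify_sail(o, c, s) == label for o, c, s, label in data)
```

The same function can then label unseen instances, which is exactly what the Classification slide that follows does with the two unlabeled rows.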
![Page 5: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/5.jpg)
Classification
• What is classification?
• What are some applications of Decision Tree Classifiers (DTC)?
• What is a BDTC?
• Misclassification errors
![Page 6: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/6.jpg)
Classification
| # | Outlook | Company | Sailboat | Sail? (class) |
|---|---------|---------|----------|---------------|
| 1 | sunny   | no      | big      | ? |
| 2 | rainy   | big     | small    | ? |

Decision tree used to classify the new instances:

outlook
├── sunny → yes
└── rainy → company
    ├── no → no
    ├── med → sailboat
    │   ├── small → yes
    │   └── big → no
    └── big → yes
![Page 7: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/7.jpg)
Chance and Terminal nodes
• Each internal node of a DT is a decision point, where some condition is tested
• The result of this condition determines which branch of the tree is to be taken next
• Such nodes are therefore called decision nodes, chance nodes or non-terminal nodes
• Chance nodes partition the data available at that point so as to maximize differences in the dependent variable
![Page 8: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/8.jpg)
Terminal nodes
• The leaf nodes of a DT are called terminal nodes
• They indicate the class into which a data instance will be classified
• They have exactly one incoming edge
• They have no child nodes (outgoing edges)
• No conditions are tested at terminal nodes
• Traversing the tree from the root to a leaf produces the production rule for that class
![Page 9: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/9.jpg)
Advantages of DT
• Easy to understand and interpret
• Works for both categorical and quantitative data
• A DT can grow to any depth
• Attributes can be chosen in any desired order
• Pruning a DT is very easy
• Works even with missing or null values
![Page 10: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/10.jpg)
Advantages contd.
• Can be used to identify outliers
• Production rules can be obtained directly from the built DT
• Relatively faster than other classification models
• A DT can be used even when domain experts are absent
![Page 11: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/11.jpg)
Disadvantages
• A DT induces sequential decisions
• The class-overlap problem
• Correlated data
• Complex production rules
• A DT can be sub-optimal
![Page 12: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/12.jpg)
Quinlan’s classical example
| #  | Outlook  | Temperature | Humidity | Windy | Play (class) |
|----|----------|-------------|----------|-------|--------------|
| 1  | sunny    | hot         | high     | no    | N |
| 2  | sunny    | hot         | high     | yes   | N |
| 3  | overcast | hot         | high     | no    | P |
| 4  | rainy    | moderate    | high     | no    | P |
| 5  | rainy    | cold        | normal   | no    | P |
| 6  | rainy    | cold        | normal   | yes   | N |
| 7  | overcast | cold        | normal   | yes   | P |
| 8  | sunny    | moderate    | high     | no    | N |
| 9  | sunny    | cold        | normal   | no    | P |
| 10 | rainy    | moderate    | normal   | no    | P |
| 11 | sunny    | moderate    | normal   | yes   | P |
| 12 | overcast | moderate    | high     | yes   | P |
| 13 | overcast | hot         | normal   | no    | P |
| 14 | rainy    | moderate    | high     | yes   | N |
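Entropy-based induction (used by the ID3 and C4.5 algorithms covered later in the deck) picks the root attribute with the highest information gain. A small self-contained Python sketch, using the Play counts taken directly from Quinlan's table (the variable names are ours):

```python
from math import log2
from collections import Counter

# Play labels from Quinlan's table, grouped by Outlook value
play = {
    "sunny":    ["N", "N", "N", "P", "P"],  # rows 1, 2, 8, 9, 11
    "overcast": ["P", "P", "P", "P"],       # rows 3, 7, 12, 13
    "rainy":    ["P", "P", "N", "P", "N"],  # rows 4, 5, 6, 10, 14
}

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

all_labels = [lab for subset in play.values() for lab in subset]
h_before = entropy(all_labels)          # 9 P vs 5 N over 14 rows
h_after = sum(len(s) / len(all_labels) * entropy(s) for s in play.values())
gain = h_before - h_after               # information gain of Outlook
print(round(h_before, 3), round(gain, 3))  # → 0.94 0.247
```

Outlook's gain of roughly 0.247 bits is the largest of the four attributes, which is why the simple tree on the next slide tests Outlook at the root.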
![Page 13: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/13.jpg)
Simple Tree
Outlook
├── sunny → Humidity
│   ├── high → N
│   └── normal → P
├── overcast → P
└── rainy → Windy
    ├── yes → N
    └── no → P
![Page 14: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/14.jpg)
Complicated Tree
[Figure: a much larger tree for the same data. It tests Temperature at the root (cold / moderate / hot) and then repeatedly tests Outlook, Windy and Humidity along its branches; one branch ends in a null leaf covering no training instances. It classifies the training data correctly but is far less compact than the simple tree.]
![Page 15: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/15.jpg)
Production rules
• Rules abstracted by a DT can be converted into production rules
• These are obtained by traversing each branch of the DT from root to each of the leaves
• A DT can be reconstructed if all production rules are known
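As a sketch of that root-to-leaf traversal, here is one hypothetical way to store the simple Play tree as nested dicts and enumerate its production rules (the encoding and all names are ours):

```python
# Internal nodes map an attribute name to {value: subtree}; leaves are labels.
tree = {"Outlook": {
    "sunny":    {"Humidity": {"high": "N", "normal": "P"}},
    "overcast": "P",
    "rainy":    {"Windy": {"yes": "N", "no": "P"}},
}}

def production_rules(node, conditions=()):
    """Traverse every root-to-leaf path, yielding one IF-THEN rule per leaf."""
    if isinstance(node, str):                       # leaf: emit the rule
        yield "IF " + " AND ".join(conditions) + f" THEN class = {node}"
        return
    (attr, branches), = node.items()                # single attribute per node
    for value, subtree in branches.items():
        yield from production_rules(subtree, conditions + (f"{attr} = {value}",))

for rule in production_rules(tree):
    print(rule)
```

The five printed rules (one per leaf) carry the same information as the tree, which is why the tree can be reconstructed from them.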
![Page 16: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/16.jpg)
General View of DT Induction
![Page 17: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/17.jpg)
ID3 induction algorithm
• ID3 (Iterative Dichotomiser 3)
• Introduced in 1986 by Quinlan
• Uses a greedy tree-growing method
• Works on binary attributes
• Uses the entropy measure
![Page 18: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/18.jpg)
C4.5 induction algorithm
• Invented by Quinlan in 1993
• An extension of the ID3 algorithm
• Uses a greedy tree-growing method
• Works on general attributes
• Uses the entropy measure
• Uses multi-way splits
![Page 19: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/19.jpg)
CART induction algorithm
• Invented by Breiman et al. in 1984
• Uses the binary recursive partitioning method
• Works on general attributes
• Uses the Gini measure
• Uses two-way (binary) splits
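The Gini measure CART uses is easy to state: a node's impurity is one minus the sum of squared class proportions. A minimal Python sketch (the function name is ours):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

# A pure node has impurity 0; a 50/50 two-class node has the maximum, 0.5.
assert gini(["P", "P", "P"]) == 0.0
assert gini(["P", "N"]) == 0.5
```

CART evaluates each candidate binary split by the weighted Gini impurity of the two child nodes and keeps the split that reduces impurity the most.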
![Page 20: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/20.jpg)
Measures for node splitting
• Gini’s Index measure
• Modified Gini Index
• Normalized, symmetric and asymmetric Gini Index measures
• Shannon’s entropy measure
• Minimum classification error measure
• Chi-square statistic
![Page 21: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/21.jpg)
Entropy
• The average amount of information I needed to classify an object is given by the entropy measure
• For a two-class problem with class probabilities p1 and p2 = 1 − p1:

  I = −p1 log2(p1) − p2 log2(p2)
![Page 22: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/22.jpg)
Chi-squared Automatic Interaction Detector(CHAID)
• As the name implies, this is a statistical technique for tree induction that uses Karl Pearson's χ² (chi-squared) test for contingency tables.
• It works for categorical variables (with 2 or more categories), and can be used as an alternative to logistic regression.
• There is no pruning step as it stops growing the DT when a certain condition is met.
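The χ² statistic behind CHAID can be computed without any library. A minimal sketch (function name ours), applied to a 2×2 contingency table of Outlook versus Sail? with counts read from the sailing example (sunny: 5 yes / 0 no; rainy: 2 yes / 3 no):

```python
def chi_squared(obs):
    """Pearson's chi-squared: sum of (observed - expected)^2 / expected,
    with expected counts from the row and column marginals.
    Assumes no zero row or column totals."""
    row_tot = [sum(r) for r in obs]
    col_tot = [sum(c) for c in zip(*obs)]
    total = sum(row_tot)
    return sum(
        (obs[i][j] - row_tot[i] * col_tot[j] / total) ** 2
        / (row_tot[i] * col_tot[j] / total)
        for i in range(len(obs)) for j in range(len(obs[0]))
    )

table = [[5, 0],   # sunny: yes, no
         [2, 3]]   # rainy: yes, no
print(round(chi_squared(table), 2))  # → 4.29
```

A larger statistic means a stronger association between the attribute and the class, so CHAID splits on the attribute with the most significant χ² value.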
![Page 23: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/23.jpg)
Pruning DT
• Once the decision tree has been constructed, a sensitivity analysis should be performed to test how well the model withstands variations in the data instances. The expected value of each alternative is evaluated to determine the optimal model. However, the decision maker's attitude towards high-risk alternatives can negatively influence the outcome of a sensitivity analysis. Most decision tree software packages allow the user to carry out sensitivity analysis.
![Page 24: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/24.jpg)
Pre Vs Post-pruning
• There are two approaches to prune a DT -- pre-pruning and post-pruning. In pre-pruning, the tree growing is halted when a stopping condition is met.
• Post-pruning works with a completely grown tree. In post-pruning, test cases are used to prune the DT to minimize the classification error or to adjust the tree to data changes.
• Tree pruning is usually a post-processing step intended to minimize overfitting and to remove redundancies.
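Pre-pruning can be sketched as a stopping condition checked while the tree grows. A toy Python illustration (all names are ours; a real inducer would pick the split attribute with a measure such as entropy or Gini rather than taking them in order):

```python
from collections import Counter

def grow(rows, attrs, depth=0, max_depth=1):
    """Tree growing with a pre-pruning stop: once max_depth is reached
    (or the node is pure), return a majority-class leaf instead of
    splitting further. rows = (attribute_dict, label) pairs."""
    labels = [lab for _, lab in rows]
    if depth >= max_depth or len(set(labels)) <= 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]   # majority-class leaf
    attr = attrs[0]   # toy attribute choice, not a real split measure
    return {attr: {v: grow([(a, l) for a, l in rows if a[attr] == v],
                           attrs[1:], depth + 1, max_depth)
                   for v in {a[attr] for a, _ in rows}}}

# Counts loosely modelled on the sailing data: 7 yes vs 3 no overall
rows = [({"Outlook": "sunny"}, "yes")] * 5 + \
       [({"Outlook": "rainy"}, "yes")] * 2 + [({"Outlook": "rainy"}, "no")] * 3
print(grow(rows, ["Outlook"], max_depth=0))  # pruned to a single leaf: yes
print(grow(rows, ["Outlook"], max_depth=1))  # one split on Outlook allowed
```

Post-pruning would instead grow the full tree and then collapse subtrees whose removal does not increase the error on held-out test cases.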
![Page 25: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/25.jpg)
Decision Tables
• A decision table is a hierarchical structure akin to decision trees, except that data are enumerated into a table using a pair of attributes, rather than a single attribute.
• Quantitative variables should be categorized using the discretisation technique discussed in chapter 1.
![Page 26: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/26.jpg)
Fraud Detection
• Fraud detection is increasingly becoming a necessity due to the large number of uncaught frauds. Fraudulent financial transactions amount to billions of dollars every year throughout the world. Fraud prevention is different from fraud detection: the former is pre-transaction safety, while the latter is applied during or immediately after a transaction.
![Page 27: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/27.jpg)
Software for DT
• DTREG is a powerful statistical analysis program that generates classification and regression trees (www.dtreg.com)
• GATree (www.gatree.com)
• Weka (University of Waikato, NZ)
• TreeAge Pro (www.treeage.com)
• YaDT (www.di.unipi.it/~ruggieri/YaDT/YaDT1.2.1.zip)
![Page 28: Decision Trees](https://reader033.vdocuments.site/reader033/viewer/2022061116/54658a43b4af9f54198b4b83/html5/thumbnails/28.jpg)
THE END