Page Number: 1 DataMining in E-Business: Veni, Vidi, Vici! by Prof. Dr. Veljko Milutinovic
TRANSCRIPT
Page Number:
2
THIS IS A DEMO VERSION OF THE TUTORIAL IN DATAMINING FOR E-BUSINESS
ONLY A FEW SLIDES OF THE ORIGINAL TUTORIAL ARE PRESENTED HERE
Page Number:
3
Focus of this Presentation
Data Mining problem types
Data Mining models and algorithms
Efficient Data Mining
Available software
Page Number:
4
Decision Trees
Balance>10 Balance<=10
Age<=32 Age>32
Married=NO Married=YES
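A minimal sketch of how split tests like the three above could be evaluated as a tree. The slide shows only the tests; the nesting order and the leaf names ("leaf A" to "leaf D") are assumptions made for illustration.

```python
def classify(balance, age, married):
    """Walk a hypothetical tree built from the three tests on the slide.

    Assumed nesting: Balance at the root, then Age, then Married.
    The leaf names are placeholders, not classes from the tutorial.
    """
    if balance > 10:
        return "leaf A"      # Balance>10 branch
    # Balance<=10 branch: test Age next (assumed order)
    if age > 32:
        return "leaf B"      # Age>32 branch
    # Age<=32 branch: test Married last (assumed order)
    if married:
        return "leaf C"      # Married=YES
    return "leaf D"          # Married=NO


if __name__ == "__main__":
    print(classify(balance=5, age=25, married=False))
```

Every case follows exactly one path from the root to a leaf, so a tree always yields one prediction per case.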
Page Number:
6
Rule InductionRule InductionRule InductionRule Induction
Method of deriving a set of rules to classify cases
Creates independent rules that are unlikely to form a tree
Rules may not cover all possible situations
Rules may sometimes conflict in a prediction
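The last two points can be sketched with a small rule set. The rules, attribute names, and class labels below are hypothetical; they only illustrate that independent rules may leave a case uncovered or fire with conflicting predictions.

```python
# Each rule: (name, condition on a case, predicted class). All hypothetical.
RULES = [
    ("r1", lambda c: c["balance"] > 10, "yes"),
    ("r2", lambda c: c["age"] <= 32 and not c["married"], "no"),
]


def fired(case):
    """Return (rule name, prediction) for every rule whose condition holds."""
    return [(name, pred) for name, cond, pred in RULES if cond(case)]


def status(case):
    """Summarise the rule set's verdict for one case."""
    predictions = {pred for _, pred in fired(case)}
    if not predictions:
        return "uncovered"   # no rule applies to this case
    if len(predictions) > 1:
        return "conflict"    # rules disagree on the class
    return predictions.pop()


if __name__ == "__main__":
    print(status({"balance": 20, "age": 25, "married": False}))
    print(status({"balance": 5, "age": 40, "married": True}))
```

A practical tool needs a policy for both outcomes, for example a default class for uncovered cases and a priority order for conflicting rules.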
Page Number:
7
Comparison of fourteen DM tools
Evaluated by four undergraduates inexperienced at data mining, a relatively experienced graduate student, and a professional data mining consultant
Run under MS Windows 95, MS Windows NT, Macintosh System 7.5
Use one of the four technologies: Decision Trees, Rule Induction, Neural, or Polynomial Networks
Solve four problems: two binary classification problems, a multi-class classification problem, and a noiseless estimation problem
Price from $75 to $25,000
Page Number:
8
Comparison of fourteen DM tools
The Decision Tree products were
- CART
- Scenario
- See5
- S-Plus
The Rule Induction tools were
- WizWhy
- DataMind
- DMSK
Neural Networks were built from three programs
- NeuroShell2
- PcOLPARS
- PRW
The Polynomial Network tools were
- ModelQuest Expert
- Gnosis
- a module of NeuroShell2
- KnowledgeMiner
Page Number:
9
Criteria for evaluating DM tools
A list of 20 criteria for evaluating DM tools, put into 4 categories:
Capability measures what a desktop tool can do, and how well it does it
- Handles missing data
- Considers misclassification costs
- Allows data transformations
- Includes quality of testing options
- Has a programming language
- Provides useful output reports
- Provides visualisation
Page Number:
10
Criteria for evaluating DM tools
Interoperability shows a tool's ability to interface with other computer applications
- Importing data
- Exporting data
- Links to other applications
Flexibility
- Model adjustment flexibility
- Customizable work environment
- Ability to write or change code