Issues in Data Mining Infrastructure

DESCRIPTION
Issues in Data Mining Infrastructure. Authors: Nemanja Jovanovic, [email protected]; Valentina Milenkovic, [email protected]; Voislav Galic, [email protected]; Dusan Zecevic, [email protected]; Sonja Tica, [email protected]; Prof. Dr. Dusan Tosic, [email protected] - PowerPoint PPT Presentation

TRANSCRIPT
Issues in Data Mining Infrastructure

Authors: Nemanja Jovanovic, [email protected]; Valentina Milenkovic, [email protected]; Voislav Galic, [email protected]; Dusan Zecevic, [email protected]; Sonja Tica, [email protected]; Prof. Dr. Dusan Tosic, [email protected]; Prof. Dr. Veljko Milutinovic, [email protected]
Data Mining in a Nutshell

Uncovering the hidden knowledge
Huge NP-complete search space
Multidimensional interface
NOTICE:
All trademarks and service marks mentioned in this document are marks of their respective owners. Furthermore, the CRISP-DM consortium (NCR Systems Engineering Copenhagen (USA and Denmark), DaimlerChrysler AG (Germany), SPSS Inc. (USA), and OHRA Verzekeringen en Bank Groep B.V. (The Netherlands)) permitted presentation of their process model.
A Problem …
You are a marketing manager for a cellular phone company
Problem: churn is too high
Bringing back a customer after quitting is both difficult and expensive
Giving a new telephone to everyone whose contract is expiring is expensive
You pay a sales commission of $250 per contract
Customers receive a free phone (cost $125)
Turnover (after contract expires) is 40%
… A Solution
Three months before a contract expires, predict which customers will leave
If you want to keep a customer that is predicted to churn, offer them a new phone
The ones that are not predicted to churn need no attention
If you don’t want to keep the customer, do nothing
How can you predict future behavior?
Tarot Cards?
Magic Ball?
Data Mining?
Still Skeptical?
The Definition
The automated extraction of predictive information from (large) databases

Automated
Extraction
Predictive
Databases
History of Data Mining
Repetition in Solar Activity
1613 – Galileo Galilei
1859 – Heinrich Schwabe
The Return of Halley's Comet
239 BC … 1531 … 1607 … 1682 … 1910 … 1986 … 2061 ???

Edmund Halley (1656 - 1742)
Data Mining is Not
Data warehousing
Ad-hoc query/reporting
Online Analytical Processing (OLAP)
Data visualization
Data Mining is
Automated extraction of predictive information from various data sources
Powerful technology with great potential to help users focus on the most important information stored in data warehouses or streamed through communication lines
Data Mining can
Answer questions that were too time-consuming to resolve in the past
Predict future trends and behaviors, allowing us to make proactive, knowledge-driven decisions
Data Mining Models
Neural Networks
Characterizes processed data with a single numeric value
Efficient modeling of large and complex problems
Based on biological structures - neurons
A network consists of neurons grouped into layers
Neuron Functionality
A neuron receives inputs I1, I2, I3, …, In, weighted by W1, W2, W3, …, Wn, and applies a function f to the weighted inputs:

Output = f (W1*I1, W2*I2, …, Wn*In)
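The formula above can be sketched in a few lines of Python. The sigmoid activation and the example inputs/weights are illustrative assumptions; the slide leaves f unspecified:

```python
import math

def neuron(inputs, weights, f=None):
    """One neuron: weighted sum of inputs passed through an activation f."""
    if f is None:
        f = lambda s: 1.0 / (1.0 + math.exp(-s))  # sigmoid (an assumption)
    s = sum(w * x for w, x in zip(weights, inputs))
    return f(s)

# a 3-input example: s = 0.4*1.0 + 0.3*0.5 + 0.2*(-0.5) = 0.45
out = neuron([1.0, 0.5, -0.5], [0.4, 0.3, 0.2])
```

Stacking such neurons into layers, with each layer's outputs feeding the next layer's inputs, gives the network structure described on the previous slide.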
Training Neural Networks
Neural Networks
Once trained, Neural Networks can efficiently estimate the value of an output variable for given input
Neurons and network topology are essential
Usually used for prediction or regression problem types
Difficult to understand
Data pre-processing often required
Decision Trees
A way of representing a series of rules that lead to a class or value
Iterative splitting of data into discrete groups maximizing distance between them at each split
CHAID, CART, QUEST, C5.0
Classification trees and regression trees
Unlimited growth and stopping rules
Univariate splits and multivariate splits
Decision Trees
Example splits (from the tree diagram): Balance > 10 vs. Balance <= 10; Age <= 32 vs. Age > 32; Married = NO vs. Married = YES
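The three splits in the diagram amount to nested conditionals. A hand-coded sketch, using churn labels in the spirit of the deck's running example; the leaf labels are hypothetical, since the transcript does not preserve them:

```python
def classify(balance, age, married):
    """Walk the example tree: Balance at the root, then Age, then Married.
    Leaf labels are hypothetical -- the original figure's leaves were lost."""
    if balance <= 10:
        return "no churn risk"
    if age <= 32:
        return "churn risk"
    return "no churn risk" if married else "churn risk"
```

A fitted tree is exactly such a cascade of rules, which is why decision trees are comparatively easy to read and explain.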
Rule Induction
Method of deriving a set of rules to classify cases
Creates independent rules that are unlikely to form a tree
Rules may not cover all possible situations
Rules may sometimes conflict in a prediction
Rule Induction
If balance > 100,000 then confidence = HIGH & weight = 1.7
If balance > 25,000 and status = married then confidence = HIGH & weight = 2.3
If balance < 40,000 then confidence = LOW & weight = 1.9
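A minimal sketch of how such weighted rules might be applied. The rule encoding and the conflict-resolution convention (highest-weight rule wins) are illustrative assumptions, not something the slides specify:

```python
# Hypothetical encoding of the three rules above.
RULES = [
    (lambda r: r["balance"] > 100_000, "HIGH", 1.7),
    (lambda r: r["balance"] > 25_000 and r["status"] == "married", "HIGH", 2.3),
    (lambda r: r["balance"] < 40_000, "LOW", 1.9),
]

def predict(record):
    """Fire every matching rule; resolve conflicts by taking the heaviest."""
    fired = [(label, weight) for cond, label, weight in RULES if cond(record)]
    if not fired:
        return None        # the rules need not cover every possible case
    return max(fired, key=lambda lw: lw[1])[0]
```

For a married customer with balance 30,000, rules 2 and 3 both fire with conflicting labels; the weight 2.3 rule wins, illustrating the two caveats on this slide (incomplete coverage and conflicting predictions).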
K-nearest Neighbor and Memory-Based Reasoning (MBR)
Usage of knowledge of previously solved similar problems in solving the new problem
Assigning the class to the group where most of the k "neighbors" belong
First step – finding a suitable measure for the distance between attributes in the data
+ Easy handling of non-standard data types
- Huge models
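The two steps above (a distance measure, then a majority vote among the k nearest stored cases) can be sketched as follows; Euclidean distance and the toy training set are illustrative assumptions:

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """train: list of (feature_vector, label) pairs.
    Assign the majority label among the k nearest neighbors of query."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    nearest = sorted(train, key=lambda fl: dist(fl[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((5, 5), "B"), ((6, 5), "B")]
```

Note that the whole training set is retained and scanned at query time, which is exactly the "huge models" drawback mentioned above.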
Data Mining Algorithms
Logistic regression
Discriminant analysis
Generalized Additive Models (GAM)
Genetic algorithms
The Apriori algorithm
Etc…
Many other available models and algorithms
Many application specific variations of known models
Final implementation usually involves several techniques
The Apriori Algorithm
The task – mining association rules by finding large itemsets and translating them to the corresponding association rules;
A => B, or A1 ∧ A2 ∧ … ∧ Am => B1 ∧ B2 ∧ … ∧ Bn, where A ∩ B = ∅
The terminology:
– Confidence
– Support
– k-itemset – a set of k items;
– Large itemsets – the large itemset {A, B} corresponds to the following rules (implications): A => B and B => A;
The Apriori Algorithm
The ⊗ operator definition:
– n = 1: S2 = S1 ⊗ S1 = {{A}, {B}, {C}} ⊗ {{A}, {B}, {C}} = {{AB}, {AC}, {BC}}
– n = k: Sk+1 = Sk ⊗ Sk = {X ∪ Y | X, Y ∈ Sk, |X ∩ Y| = k-1}
– X and Y must have the same number of elements, and must have exactly k-1 identical elements;
– Every k-element subset of any resulting set element (an element is actually a k+1 element set) has to belong to the original set of itemsets;
The Apriori Algorithm
Example:

TID  elements
10   A C D
20   B C E
30   A B C E
40   B E
The Apriori Algorithm
Step 1 – generate a candidate set of 1-itemsets C1
– Every possible 1-element set from the database is potentially a large itemset, because we don't know the number of its appearances in the database in advance (a priori);
– The task adds up to identifying (counting) all the different elements in the database; every such element forms a 1-element candidate set;
– C1 = {{A}, {B}, {C}, {D}, {E}}
– Now, we are going to scan the entire database, to count the number of appearances for each one of these elements (i.e. one-element sets);
The Apriori Algorithm

Scanning the entire database gives the number of appearances for each one-element set:

{A}  2
{B}  3
{C}  3
{D}  1
{E}  3
The Apriori Algorithm
Step 2 – generate a set of large 1-itemsets L1
– Each element in C1 with support that exceeds some adopted minimum support (for example 50%) becomes a member of L1;
– L1 = {{A}, {B}, {C}, {E}}, and we can omit D in further steps (if D doesn't have enough support alone, there is no way it could satisfy the requested support in a combination with some other element(s));
The Apriori Algorithm
Step 3 – generate a candidate set of large 2-itemsets, C2
– C2 = L1 ⊗ L1 = {{AB}, {AC}, {AE}, {BC}, {BE}, {CE}}
– Count the corresponding appearances:

{AB}  1
{AC}  2
{AE}  1
{BC}  2
{BE}  3
{CE}  2

Step 4 – generate a set of large 2-itemsets, L2
– Eliminate the candidates without minimum support;
– L2 = {{AC}, {BC}, {BE}, {CE}}
The Apriori Algorithm
Step 5 (C3)
– C3 = L2 ⊗ L2 = {{BCE}}
– Why not {ABC} and {ACE} – because their 2-element subsets {AB} and {AE} are not elements of the large 2-itemset set L2 (calculation is made according to the ⊗ operator definition);
Step 6 (L3)
– L3 = {{BCE}}, since {BCE} satisfies the required support of 50% (two appearances);
There can be no further steps in this particular case, because L3 ⊗ L3 = ∅;
Answer = L1 ∪ L2 ∪ L3
The Apriori Algorithm
L1 = {large 1-itemsets}
for (k = 2; Lk-1 ≠ ∅; k++) begin
    Ck = apriori-gen(Lk-1);
    forall transactions t ∈ D do begin
        Ct = subset(Ck, t);
        forall candidates c ∈ Ct do
            c.count++;
    end;
    Lk = {c ∈ Ck | c.count ≥ minsup}
end;
Answer = ∪k Lk
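The pseudocode can be turned into a compact, runnable Python sketch. A couple of simplifications relative to the pseudocode: candidate generation and the subset-pruning check are inlined rather than factored into a separate apriori-gen, support is recounted with a scan per level, and minsup is an absolute transaction count rather than a percentage:

```python
from itertools import combinations

def apriori(transactions, minsup):
    """Return the non-empty large itemset collections L1, L2, ... as a list."""
    db = [frozenset(t) for t in transactions]
    support = lambda s: sum(1 for t in db if s <= t)
    items = {i for t in db for i in t}
    # L1: 1-itemsets with enough support
    L = [{frozenset([i]) for i in items if support(frozenset([i])) >= minsup}]
    while L[-1]:
        prev = L[-1]
        k = len(next(iter(prev))) + 1
        # join step (the slide's operator): unions of two (k-1)-itemsets
        # that share exactly k-2 elements, i.e. unions of size k
        cands = {a | b for a in prev for b in prev if len(a | b) == k}
        # prune step: every (k-1)-subset of a candidate must itself be large
        cands = {c for c in cands
                 if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        L.append({c for c in cands if support(c) >= minsup})
    return [lk for lk in L if lk]   # Answer = union over k of the Lk's

db = [["A", "C", "D"], ["B", "C", "E"], ["A", "B", "C", "E"], ["B", "E"]]
result = apriori(db, minsup=2)      # 50% support on the four-transaction example
```

Run on the four-transaction example from the earlier slides, this reproduces the worked steps: L1 = {A, B, C, E}, L2 = {AC, BC, BE, CE}, L3 = {BCE}.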
The Apriori Algorithm
Enhancements to the basic algorithm – scan reduction
– The most time-consuming operation in the Apriori algorithm is the database scan; it is originally performed after each candidate set generation, to determine the frequency of each candidate in the database;
– Scan number reduction – counting candidates of multiple sizes in one pass;
– Rather than counting only candidates of size k in the k-th pass, we can also calculate the candidates C'k+1, where C'k+1 is generated from Ck (instead of Lk), using the ⊗ operator;
The Apriori Algorithm
– Compare: C'k+1 = Ck ⊗ Ck vs. Ck+1 = Lk ⊗ Lk
– Note that C'k+1 ⊇ Ck+1
– This variation can pay off in later passes, when the cost of counting and keeping in memory the additional C'k+1 - Ck+1 candidates becomes less than the cost of scanning the database;
– There has to be enough space in main memory for both Ck and C'k+1;
– Following this idea, we can make further scan reduction:
• C'k+1 is calculated from Ck for k > 1;
• There must be enough memory space for all Ck's (k > 1);
– Consequently, only two database scans need to be performed (the first to determine L1, and the second to determine all the other Lk's);
The Apriori Algorithm
Abstraction levels
– Higher-level associations are stronger (more powerful), but also less certain;
– A good practice would be adopting different thresholds for different abstraction levels (higher thresholds for higher levels of abstraction)
References

Devedzic, V., "Inteligentni informacioni sistemi," Digit, FON, Beograd, 2003.
http://www.marconi.com
http://www.blueyed.com
http://www.fipa.org
http://www.rpi.edu
http://research.microsoft.com
http://imatch.lcs.mit.edu
DM Process Model
CRISP – tends to become a standard
5A – used by SPSS Clementine (Assess, Access, Analyze, Act and Automate)
SEMMA – used by SAS Enterprise Miner (Sample, Explore, Modify, Model and Assess)
CRISP-DM
CRoss-Industry Standard Process for Data Mining
Conceived in 1996 by three companies:
CRISP-DM methodology
Four-level breakdown of the CRISP-DM methodology:
Phases
Generic Tasks
Specialized Tasks
Process Instances
Mapping generic models to specialized models
Analyze the specific context
Remove any details not applicable to the context
Add any details specific to the context
Specialize the generic context according to concrete characteristics of the context
Possibly rename generic contents to provide more explicit meanings
CRISP-DM model
Business understanding
Data understanding
Data preparation
Modeling
Evaluation
Deployment
Business Understanding
Determine business objectives
Assess situation
Determine data mining goals
Produce project plan
Data Understanding
Collect initial data
Describe data
Explore data
Verify data quality
Data Preparation
Select data
Clean data
Construct data
Integrate data
Format data
Modeling
Select modeling technique
Generate test design
Build model
Assess model
Evaluation
Evaluate results
Review process
Determine next steps
results = models + findings
Deployment
Plan deployment
Plan monitoring and maintenance
Produce final report
Review project
At Last…
Evolution of Data Mining
Data Collection (1960s)
– Business question: What was my average total revenue over the last 5 years?
– Enabling technologies: computers, tapes, disks
– Product providers: IBM, CDC
– Characteristics: retrospective, static data delivery

Data Access (1980s)
– Business question: What were unit sales in New England last March?
– Enabling technologies: RDBMS, SQL, ODBC
– Product providers: Oracle, Sybase, Informix, IBM, Microsoft
– Characteristics: retrospective, dynamic data delivery at record level

Data Navigation (1990s)
– Business question: What were unit sales in New England last March? Drill down to Boston.
– Enabling technologies: OLAP, multidimensional databases, data warehouses
– Product providers: Pilot, IRI, Arbor, Redbrick, Evolutionary Technologies
– Characteristics: retrospective, dynamic data delivery at multiple levels

Data Mining (2000)
– Business question: What's likely to happen to Boston unit sales next month? Why?
– Enabling technologies: advanced algorithms, multiprocessors, massive databases
– Product providers: Lockheed, IBM, SGI, numerous startups
– Characteristics: prospective, proactive information delivery
Examples of DM projects to stimulate your imagination
Here are six examples of how data mining is helping corporations to operate more efficiently and profitably in today's business environment:
– Targeting a set of consumers who are most likely to respond to a direct mail campaign
– Predicting the probability of default for consumer loan applications
– Reducing fabrication flaws in VLSI chips
– Predicting audience share for television programs
– Predicting the probability that a cancer patient will respond to radiation therapy
– Predicting the probability that an offshore oil well is actually going to produce oil
Comparison of fourteen DM tools
Evaluated by four undergraduates inexperienced at data mining, a relatively experienced graduate student, and a professional data mining consultant
Run under MS Windows 95, MS Windows NT, or Macintosh System 7.5
Use one of four technologies: Decision Trees, Rule Induction, Neural Networks, or Polynomial Networks
Solve two binary classification problems, a multi-class classification problem, and a noiseless estimation problem
Price from $75 to $25,000
Comparison of fourteen DM tools
The Decision Tree products were:
– CART
– Scenario
– See5
– S-Plus
The Rule Induction tools were:
– WizWhy
– DataMind
– DMSK
Neural Networks were built from three programs:
– NeuroShell2
– PcOLPARS
– PRW
The Polynomial Network tools were:
– ModelQuest Expert
– Gnosis – a module of NeuroShell2
– KnowledgeMiner
Criteria for evaluating DM tools
A list of 20 criteria for evaluating DM tools, put into 4 categories:
Capability measures what a desktop tool can do, and how well it does it
– Handles missing data
– Considers misclassification costs
– Allows data transformations
– Includes quality of testing options
– Has a programming language
– Provides useful output reports
– Provides visualisation
Visualisation

+ excellent capability · good capability · - some capability · "blank" no capability
Criteria for evaluating DM tools
Learnability/Usability shows how easy a tool is to learn and use
– Tutorials
– Wizards
– Easy to learn
– User's manual
– Online help
– Interface
Criteria for evaluating DM tools
Interoperability shows a tool's ability to interface with other computer applications
– Importing data
– Exporting data
– Links to other applications
Flexibility
– Model adjustment flexibility
– Customizable work environment
– Ability to write or change code
Data Input & Output Model

+ excellent capability · good capability · - some capability · "blank" no capability
A classification of data sets
Pima Indians Diabetes data set
– 768 cases of Native American women from the Pima tribe, some of whom are diabetic, most of whom are not
– 8 attributes plus the binary class variable for diabetes per instance
Wisconsin Breast Cancer data set
– 699 instances of breast tumors, some of which are malignant, most of which are benign
– 10 attributes plus the binary malignancy variable per case
The Forensic Glass Identification data set
– 214 instances of glass collected during crime investigations
– 10 attributes plus the multi-class output variable per instance
Moon Cannon data set
– 300 solutions to the equation: x = 2v²sin(g)cos(g)/g
– the data were generated without adding noise
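Taking the slide's equation literally, a noiseless data set of this shape can be regenerated in a few lines; the sampling ranges for v and g are illustrative assumptions, since the slide does not state them:

```python
import math
import random

def moon_cannon(v, g):
    """The slide's formula, taken literally: x = 2 v^2 sin(g) cos(g) / g."""
    return 2 * v ** 2 * math.sin(g) * math.cos(g) / g

rng = random.Random(0)
# 300 noiseless (v, g, x) rows, mirroring the data set's description
data = [(v, g, moon_cannon(v, g))
        for v, g in ((rng.uniform(1.0, 10.0), rng.uniform(0.1, 1.5))
                     for _ in range(300))]
```

Because x is an exact function of v and g, a tool's error on this set measures pure function approximation, with no noise floor.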
Evaluation of fourteen DM tools
Potentials of R&D in Cooperation with U. of Belgrade

An Overview of Advanced Datamining Projects for High-Tech Computer Industry in the USA and EU
VLSI Detection for Internet/Telephony Interfaces
Goran Davidović, Miljan Vuletić, Veljko Milutinović,
Tom Chen, and Tom Brunett
* eT
[Figure: Home/Office/Factory automation on the Internet – users connected through an Internet service provider to a specialized remote site performing superposition/detection]
Reconfigurable FPGA for EBI
Božidar Radunović, Predrag Knežević, Veljko Milutinović,
Steve Casselman, and John Schewel*
* Virtual
[Figure: Customer satisfaction vs. customer profile – users connected through an Internet provider to a specialized service (VCC)]
BioPoP
Veljko Milutinovic, Vladimir Jovicic, Milan Simic, Bratislav Milic, Milan Savic, Veljko Jovanovic, Stevo Ilic, Djordje Veljkovic, Stojan Omorac, Nebojsa Uskokovic, and Fred Darnell
•isItWorking.com
Testing the Infrastructure for EBI

Phones, Faxes, Email, Web links, Servers, Routers, Software
• Statistics
• Correlation
• Innovation
CNUCE – Integration and Datamining on Ad-Hoc Networks and the Internet
Veljko Milutinović,
Luca Simoncini, and Enrico Gregory
*University of Pisa, Santanna, CNUCE
[Figure: datamining across GSM, ad-hoc networks, and the Internet]
Genetic Search with Spatial/Temporal Mutations
Jelena Mirković, Dragana Cvetković, and Veljko Milutinović
*Comshare
Drawbacks of INDEX-BASED: time to index + ranking
Advantages of LINKS-BASED: mission-critical applications + customer-tuned ranking

Well-organized markets: best-first search
If elements of disorder: genetic search with DB mutations
Chaotic markets: genetic search with S/T mutations
e-Banking on the Internet

Miloš Kovačević, Bratislav Milic, Veljko Milutinović, Marco Gori, and Roberto Giorgi
*University of Siena
Bottleneck#1: Searching for Clients and Investments
1472++
*University of Siena + Banco di Monte dei Paschi
WaterMarking for e-Banking on the Internet

Darko Jovic, Ivana Vujovic, Veljko Milutinovic
Fraunhofer, IPSI, Darmstadt, Germany
Bottleneck#1: SpeedUp
SSGRR – Organizing Conferences via the Internet
Zoran Horvat, Nataša Kukulj, Vlada Stojanović,
Dušan Dingarac, Marjan Mihanović, Miodrag Stefanović,
Veljko Milutinović, and Frederic Patricelli
*SSGRR, L’Aquila
2000: Arno Penzias
2001: Bob Richardson
2002: Jerry Friedman
2003: Harry Kroto
http://www.ssgrr.it
Summary
Books with Nobel Laureates:
– Kenneth Wilson, Ohio (North-Holland)
– Leon Cooper, Brown (Prentice-Hall)
– Robert Richardson, Cornell (Kluwer-Academics)
– Herb Simon (Kluwer-Academics)
– Jerome Friedman, MIT (IOS Press)
– Harold Kroto (IOS Press)
– Arno Penzias (IOS Press)