Intelligent Data Mining Techniques — ktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová
Intelligent Data Mining Techniques (tutorial presented at ANNIE´2003)
Iveta Mrázová
Department of Software Engineering, Charles University, Prague
Iveta Mrázová, ANNIE´03 2
Content outline
Intelligent Data Mining: introduction and overview of Intelligent Data Mining Techniques (20 min)
Selected Data Mining Techniques: principles and examples
– undirected DM-techniques:
  Market Basket Analysis (MBA) (20 min)
  Link Analysis and Scale-Free Networks (10 min)
  Automatic Cluster Detection and Fuzzy Systems: Clustering the World Bank Data (20 min)
– directed DM-techniques:
  Internal Knowledge Representation in BP-Networks (20 min)
  Modular Networks, Sensitivity Analysis and Feature Selection (20 min)
  Neural Networks and Decision Trees: Students' Questionnaire (20 min)
  Genetic Algorithms and BP-networks: Generating Melodies (10 min)
Conclusions, Questions + Answers (10 min)
Intelligent Data Mining: References
M. J. A. Berry, G. Linoff: Data Mining Techniques for Marketing, Sales, and Customer Support, John Wiley & Sons, 1997
M. J. A. Berry, G. Linoff: Mastering Data Mining, John Wiley & Sons, 2000
J. Han, M. Kamber: Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001
D. Hand, H. Mannila, P. Smyth: Principles of Data Mining, The MIT Press, 2001
D. Pyle: Data Preparation for Data Mining, Morgan Kaufmann Publishers, 1999
I. H. Witten, E. Frank: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann Publishers, 2000
http://www.mkp.com/datamining
http://www.cs.waikato.ac.nz/ml/weka
What is Data Mining?
Discovering patterns in data:
– discovered patterns should be meaningful
– should lead to some advantage, e.g. economic
– allows non-trivial predictions on new data
The data is present in substantial quantities
An automatic or semi-automatic process
Two extremes for the form of discovered patterns:
– black box, e.g. neural networks
– transparent box: more structured, captures the decision structure in an explicit way
Building models for the data
Classification model:
– assigns an existing classification to new records
Predictive model
– Time-series model
Clustering model
[Diagram: INPUTS → Model → Output + Confidence level]
Data Analysis: Influence of other disciplines
– statistics: interpret observations
– sampling: reduce the size of data
– regression analysis: inter- and extrapolate observations (linear regression – fit a line to observed data)
– correlation analysis: mutual occurrence of observations
– memory-based reasoning: directly from AI
– link analysis: graph theory
– genetic algorithms and neural networks: model biological processes
Intelligent DM-Techniques: an overview
– Market Basket Analysis (MBA)
– Memory-Based Reasoning (MBR)
– Automatic Cluster Detection
– Fuzzy Systems (FS)
– Link Analysis
– Decision Trees
– Artificial Neural Networks (ANN)
– Genetic Algorithms (GA)
Market Basket Analysis (MBA)
Analyses in the retail industry:
What items occur together in a “basket”?
Results:
– expressed as rules
– highly actionable
Applications:
– planning store layouts
– offering coupons, limiting specials
– bundling products
Memory-Based Reasoning (MBR)
Look for the nearest “known” neighbor to classify or predict a value!
– applicable to virtually any data
– new instances are learned by adding them to the data set
– the distance to the neighbors estimates the correctness of the results
Key elements in MBR:
– distance function – to find the nearest neighbors
– combination function – combines the values at the nearest neighbors to classify or predict
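The two key elements can be sketched as a toy nearest-neighbor classifier; the Euclidean distance function, the majority-vote combination function, and the sample records below are illustrative choices, not part of the tutorial:

```python
from collections import Counter
import math

def classify(query, records, labels, k=3):
    """MBR sketch: distance function (Euclidean) + combination function (majority vote)."""
    nearest = sorted(range(len(records)),
                     key=lambda i: math.dist(query, records[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# made-up two-dimensional records with known class labels
records = [(1.0, 1.0), (1.2, 0.9), (4.0, 4.2), (4.1, 3.9)]
labels = ["low", "low", "high", "high"]
print(classify((1.1, 1.0), records, labels))  # "low"
```

A new instance is "learned" simply by appending it to `records` and `labels` — no retraining step is needed.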
Link Analysis
Goals:
– find patterns in relationships between records
– visualize the links
Application areas:
– telecommunications
– law enforcement – clues about crimes are linked together to solve them
– marketing – relationships between customers
Automatic Cluster Detection
Goal: Find previously unknown similarities in the data!
– build models that find data records similar to each other
– good as an initial analysis of the data
– undirected data mining
Decision Trees and Rule Induction
Divide the data into disjoint subsets characterized by simple rules!
– directed data mining (classification)
– explainable rules applicable directly to new records
Techniques:
– Classification And Regression Trees (CART)
– Chi-squared Automatic Interaction Detection (CHAID)
– C4.5
Artificial Neural Networks (ANN)
Detect patterns in the data in a way “similar” to human thinking!
– directed data mining (classification and prediction)
– applicable also to undirected data mining (SOMs)
Two major drawbacks:
– difficulty in understanding the models they produce
– sensitivity to the format of incoming data
Genetic Algorithms (GA)
Apply genetics and natural selection to find optimal parameters of a predictive function!
GAs use “genetic” operators to evolve successive generations of solutions:
– selection
– crossover
– mutation
The best candidates “survive” into further generations until convergence is achieved
Directed data mining
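The three operators can be sketched on a toy problem; the "one-max" fitness function (maximize the number of 1-bits) and all parameter values below are illustrative assumptions, not taken from the tutorial:

```python
import random

def evolve(fitness, length=8, pop_size=20, generations=40, seed=0):
    """Toy GA: selection (best half survives), one-point crossover, bit-flip mutation."""
    random.seed(seed)
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]              # selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, length)         # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.2:                 # mutation: flip one random bit
                i = random.randrange(length)
                child[i] ^= 1
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

# "one-max": fitness of a bit string is simply the sum of its bits
print(evolve(fitness=sum))  # converges towards the all-ones string
```

Because the best half always survives, the best fitness in the population never decreases from one generation to the next.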
On-Line Analytic Processing (OLAP)
– an important tool for extracting and presenting information
– facilitates understanding of the data and the important patterns inside it
– a way of presenting relational data to users
Multi-dimensional databases (MDDs):
– a representation of data
– allow users to drill down into the data and understand various important summarizations
Market Basket Analysis (MBA)
Analyses in the retail industry:
What items occur together in a “basket”?
Results:
– expressed as rules
– highly actionable
Applications:
– planning store layouts
– offering coupons, limiting specials
– bundling products
Association rules
How do the products relate to each other?
Association rules should be:
– easy to understand: once the pattern is found, it is easy to justify it
– useful: contain actionable information leading to further interventions
Association rules should not be:
– trivial: results already known to anyone familiar with the business
– inexplicable: seem to have no explanation and do not suggest any action
MBA to compare stores
Virtual items:
– specify which group the transaction comes from
– do not correspond to a product or service
Comparison between new and existing stores:
1. Gather data for a specific period from store openings
2. Gather about the same amount of data from existing stores
3. Apply MBA to find association rules in each set
4. Consider especially the association rules containing the virtual items
MBA - how does it work?
– items: products or service offerings
– transactions contain one or more items
Co-occurrence table:
– indicates the number of times that any two items co-occur in a transaction (i.e. these products were purchased together)
– values along the diagonal represent the number of transactions containing just that one item
MBA - example
Grocery transactions:

Customer  Items
1         bread, butter
2         milk, bread, butter
3         bread, coffee
4         bread, butter, coffee
5         coffee, butter

Co-occurrence of products:

        bread  butter  milk  coffee
bread     4      3      1      2
butter    3      4      1      2
milk      1      1      1      0
coffee    2      2      0      3

Sales patterns apparent from the co-occurrence table:
– milk is never purchased with coffee
– bread and butter are likely to be purchased together
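The co-occurrence table above can be computed directly from the five grocery transactions; only the table-building code itself is an addition here:

```python
from itertools import combinations_with_replacement

transactions = [
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"bread", "coffee"},
    {"bread", "butter", "coffee"},
    {"coffee", "butter"},
]
items = ["bread", "butter", "milk", "coffee"]

# co[i][j] = number of transactions containing both i and j;
# the diagonal co[i][i] counts transactions containing item i at all
co = {i: {j: 0 for j in items} for i in items}
for t in transactions:
    for i, j in combinations_with_replacement(sorted(t), 2):
        co[i][j] += 1
        if i != j:
            co[j][i] += 1          # keep the table symmetric

print(co["bread"]["butter"])  # 3
print(co["milk"]["coffee"])   # 0
```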
MBA - Association rules
Rule: IF Condition THEN Result.
( Rule_r : IF Item_i THEN Item_j . )
Questions:
– How good are the found association rules?
  support, confidence, improvement
– How to find association rules automatically?
Support and confidence
Support: How frequently can the rule be applied?

Support(Rule_r) = Nr_of_Transactions_containing_i_and_j / Number_of_all_Transactions · 100 %

Confidence: How much can we rely on the result of the rule?

Confidence(Rule_r) = Nr_of_Transactions_containing_i_and_j / Nr_of_Transactions_containing_i · 100 %
Support and confidence - example

Rule 1: If a customer purchases bread, then the customer also purchases butter.
Rule 2: If a customer purchases coffee, then the customer also purchases butter.

Support ( Rule_1 ) = 3 / 5 · 100 % = 60 %
Confidence ( Rule_1 ) = 3 / 4 · 100 % = 75 %
Support ( Rule_2 ) = 2 / 5 · 100 % = 40 %
Confidence ( Rule_2 ) = 2 / 3 · 100 % ≈ 66.7 %
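The two measures can be reproduced from the grocery transactions; the helper functions below are a minimal sketch of the formulas, with the transaction list repeated so the snippet is self-contained:

```python
transactions = [
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"bread", "coffee"},
    {"bread", "butter", "coffee"},
    {"coffee", "butter"},
]

def support(i, j):
    """Fraction of all transactions containing both the condition and the result."""
    both = sum(1 for t in transactions if i in t and j in t)
    return both / len(transactions)

def confidence(i, j):
    """Fraction of the transactions containing the condition that also contain the result."""
    with_i = [t for t in transactions if i in t]
    return sum(1 for t in with_i if j in t) / len(with_i)

print(support("bread", "butter"))     # 0.6  -> 60 %
print(confidence("bread", "butter"))  # 0.75 -> 75 %
```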
Improvement of a rule
Improvement: How much better is the rule at predicting the result than just assuming it?

Improvement(Rule_r) = p(i_and_j) / ( p(i) · p(j) )

If Improvement < 1:
– the rule is worse at predicting the result than random chance
– NEGATING the result might produce a better rule:
  IF Condition THEN NOT Result.
Improvement of a rule - example

Rule: If a customer purchases milk, then the customer also purchases butter.

Support ( Rule ) = 1 / 5 · 100 % = 20 %
Confidence ( Rule ) = 1 / 1 · 100 % = 100 %
Improvement ( Rule ) = ( 1 / 5 ) / ( ( 1 / 5 ) · ( 4 / 5 ) ) = 5 / 4 = 1.25
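The improvement value of 1.25 follows from the same transaction set; the snippet below is a self-contained sketch of the formula, repeating the grocery data for clarity:

```python
transactions = [
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"bread", "coffee"},
    {"bread", "butter", "coffee"},
    {"coffee", "butter"},
]

def improvement(i, j):
    """improvement = p(i and j) / (p(i) * p(j)); values > 1 beat random chance."""
    n = len(transactions)
    p_i = sum(1 for t in transactions if i in t) / n
    p_j = sum(1 for t in transactions if j in t) / n
    p_ij = sum(1 for t in transactions if i in t and j in t) / n
    return p_ij / (p_i * p_j)

print(improvement("milk", "butter"))  # 1.25
```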
Basic steps of MBA
1. Choose the right set of items and the right level
2. Generate rules by deciphering the co-occurrence matrix
– calculate the probabilities and joint probabilities of items and their combinations in transactions
– limit the search with thresholds set on support
3. Analyze probabilities to determine the best rules
– overcome the limits imposed by the number of items and their combinations in “interesting” transactions
MBA - the choice of the right items
Gathering transaction data:
– often of bad quality, requiring extensive pre-processing
– items of interest may change over time
The right level of detail:
– a growing number of item combinations
– actionable results (specific items)
– rules with sufficient support (frequent occurrence in the data set)
Taxonomies: hierarchical categories
MBA - complexity of generated rules:
– use more general items initially
– then generate rules for more specific items, using only the transactions containing these items
MBA - actionable results:
Items should occur in roughly the same number of transactions:
– roll up rare items to higher levels in the taxonomy (to become more frequent)
– keep more common items at lower levels (to prevent rules from being dominated by the most common items)
Virtual items: go beyond the taxonomy
– cross product boundaries of the original items (e.g. designer labels – Calvin Klein)
– may include information about the transaction itself:
  anonymous (day of week, time, etc.)
  signed (info about customers and their behavior over time)
– might be a cause of redundant rules:
  items from the taxonomy are associated with just one virtual item (“If Coke product then Coke.”)
  virtual and generalized items appear together in a rule (“If Coke product and diet soda then pretzels” instead of “If Diet Coke then pretzels”)
MBA - generating rules
Compute the co-occurrence table:
– provides the information about which combinations of items occur most commonly in the transactions
– applicable for evaluating the basic probabilities necessary to assess the importance of generated rules
Provide useful rules:
– improvement should be greater than 1
  low improvement can be increased by negating the rules
  negated rules might be less actionable than the original rules
– reduce the number of generated rules – PRUNING
Minimum support pruning
Eliminate less frequent items:
– actions should affect enough transactions
Two possibilities:
– eliminate rare items from consideration (then eliminate their respective association rules)
– use the taxonomy to generalize items (the resulting generalized items should then meet the threshold criteria)
Variable minimum support – a cascading effect
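The first possibility can be sketched as a single pass over the grocery transactions; the 40 % threshold below is an illustrative choice:

```python
transactions = [
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"bread", "coffee"},
    {"bread", "butter", "coffee"},
    {"coffee", "butter"},
]

def frequent_items(transactions, min_support=0.4):
    """Keep only the items whose support meets the threshold; rules built
    from the pruned items are discarded along with them."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    return {i for i, c in counts.items() if c / n >= min_support}

print(frequent_items(transactions))  # milk (support 20 %) is pruned
```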
MBA - Dissociation rules
Rule: IF A AND NOT B THEN C.
– introduce new items inverse to the original ones
– each transaction will contain an inverse item if it does not contain the original one
Drawbacks:
– doubled number of items
– growing size of transactions
– inverse items tend to occur more frequently than the original ones (leading to less actionable rules with all items inverted: “IF NOT A AND NOT B THEN NOT C.”)
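Adding the inverse items is a simple set operation; the `not_` prefix and the three-item universe below are illustrative assumptions:

```python
all_items = {"bread", "butter", "coffee"}

def with_inverse_items(t):
    """Add a 'not_<item>' marker for every item absent from the transaction,
    so that ordinary rule mining can express IF A AND NOT B THEN C."""
    return t | {f"not_{i}" for i in all_items - t}

print(with_inverse_items({"bread", "coffee"}))  # adds 'not_butter'
```

Note how the drawbacks on the slide show up immediately: the item universe doubles, and every transaction grows by one marker per absent item.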
Time-series analysis with MBA
Analyze causes and effects:
– time- or sequencing information to determine when transactions occurred relative to each other
– usually requires some way of identifying the customer
Conversions to an MBA-problem:
– include in the transactions items before the event of interest (for causes) or after the event of interest (for effects); then remove duplicate items from the transaction
– time-window: a “snapshot” of all items that occur within a certain period (e.g. all transactions within a month)
  trends for rare items
Strengths of MBA
– produces clear and understandable results: actionable IF-THEN rules
– supports undirected data mining: important when approaching large data sets with no prior knowledge
– works on variable-length data
– computations are easy to understand
  (but computational costs grow exponentially with the number of items!)
Weaknesses of MBA
– exponentially growing computational costs: necessity for item taxonomies and virtual items
– limited support for attributes on the data: pruning of less actionable general items
– difficult to determine the right number of items: items should have approximately the same frequency
– discounts rare items: variable thresholds for minimum support pruning, higher levels in item taxonomies
Link Analysis
Goals:
– find patterns in relationships between records
– visualize the links
Application areas:
– telecommunications
– law enforcement – clues about crimes are linked together to solve them
– marketing – relationships between customers
Scale-Free Networks
– some nodes have an extremely large number of links (edges) to other nodes: hubs
– most nodes have just a few links to other nodes
– robust against accidental failures
– vulnerable to coordinated attacks
New application areas:
– preventing computer viruses from spreading through the Internet
– medicine (vaccinations)
– business (marketing)
Scale-Free Networks
adapted from “A. L. Barabási and E. Bonabeau: Scale-Free Networks, Scientific American, May 2003”
[Figure: distribution of edges (number of nodes vs. number of edges) for a random graph and for a scale-free network]
Examples of Scale-Free Networks
Social networks:
– research collaboration (scientists, co-authorship of papers)
– Hollywood (actors, appearance in the same movie)
Biological networks:
– cellular metabolism (molecules involved in energy production, participation in the same biological reaction)
– protein regulatory network (proteins controlling cell activity, interactions among proteins)
Socio-technical networks:
– Internet (routers, optical or other connections)
– World Wide Web (Web pages and URLs)
Scale-Free Networks: basic characteristics
Two basic mechanisms:
– growth
– preferential attachment
“The rich get richer” (hubs):
– new nodes tend to connect to the more connected sites
– “popular locations” acquire more links over time than their less connected neighbors
Reliability:
– accidental failures (80 % of randomly selected nodes can fail without fragmenting the cluster)
– coordinated attacks (eliminating 5–15 % of all hubs can crash the system)
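The two mechanisms, growth and preferential attachment, can be sketched as a toy network-growth model in the spirit of the Barabási–Albert construction; the network size and seed below are illustrative choices:

```python
import random

def grow_network(n_nodes, seed=42):
    """Grow a network: each new node links to one existing node chosen with
    probability proportional to its current degree ('the rich get richer')."""
    random.seed(seed)
    edges = [(0, 1)]          # start from a single link
    endpoints = [0, 1]        # a node appears here once per incident edge
    for new in range(2, n_nodes):
        target = random.choice(endpoints)   # degree-proportional choice
        edges.append((new, target))
        endpoints += [new, target]
    return edges

edges = grow_network(200)
degree = {}
for a, b in edges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1
# a handful of hubs collect far more links than the average node (about 2)
print(max(degree.values()))
```

Sampling a uniform endpoint of a uniform edge is exactly a degree-proportional draw, which is why the `endpoints` list implements preferential attachment without computing any probabilities explicitly.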
Scale-Free Networks
adapted from “A. L. Barabási and E. Bonabeau: Scale-Free Networks, Scientific American, May 2003”
[Figure: before/after comparison of a random network under accidental node failure, a scale-free network under accidental node failure, and a scale-free network under an attack on its hubs]
Implications of Scale-Free Networks
– computing: networks with scale-free architectures
– medicine: vaccination campaigns and new drugs
– business: cascading financial failures, marketing
Implications of Scale-Free Networks - Computing
Computer networks with scale-free architectures (e.g. the WWW):
– highly resistant to accidental failures
– very vulnerable to deliberate attacks and sabotage
– eradicating viruses from the Internet will be effectively impossible
Implications of Scale-Free Networks - Medicine
– vaccination campaigns against serious viruses focused on hubs:
  people with many connections to others
  difficult to identify such people
– new drugs targeting the hub molecules involved in certain diseases
– controlling the side-effects of drugs with maps of the networks within cells
Implications of Scale-Free Networks - Business
Financial failures:
– understand how companies, industries and economies are inter-linked
– monitor and avoid cascading financial failures
Marketing:
– study the spread of a contagion on a scale-free network
– more efficient ways of propagating consumer buzz about new products
Automatic Cluster Detection
Goal: Find previously unknown similarities in the data!
– build models that find data records similar to each other
– good as an initial analysis of the data
– undirected data mining
Economies grouped according to their results
[Figure: economies plotted by purchasing power parity, gross national product, and GDP growth rates]
Mining the World Bank Data: the Fuzzy c-means Clustering Approach
with Cihan H. Dagli, Engineering Management Department, University of Missouri - Rolla
![Page 49: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/49.jpg)
FCM-clustering: introduction
World Development Indicators (WDI)
– published annually by the World Bank
– reflect the development process in the countries
– incomplete and imprecise data
Previously applied techniques
– regression analysis - linear relationships
– US-based grouping of countries (G. Ip, Wall Street Journal)
– GDP-based grouping of economies (World Bank)
– self-organizing feature maps (T. Kohonen, S. Kaski, G. Deboeck)
![Page 50: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/50.jpg)
Poverty maps - T. Kohonen
• more neurons than countries
• only local geometric relations are important
• countries mapped close to each other have a similar state of development
adapted from “T. Kohonen: Self-Organizing Maps, 3rd Edition, Springer-Verlag, 2001”
![Page 51: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/51.jpg)
Poverty maps - T. Kohonen, S. Kaski
U-matrix:
– illustrates the “boundaries” between clusters
– represents average distances between neighboring neurons in a gray scale
small average distance ⇒ light shade
large average distance ⇒ dark shade
adapted from “T. Kohonen: Self-Organizing Maps, 3rd Edition, Springer-Verlag, 2001”
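The U-matrix idea above — average distance of each map unit to its grid neighbours — can be sketched in a few lines. This is a minimal illustration, not Kohonen's implementation; the 4-neighbourhood and the `codebook` layout (rows × cols × weight dimension) are simplifying assumptions:

```python
import numpy as np

def u_matrix(codebook):
    """Average distance of each SOM unit to its 4-neighbours; rendered on a
    gray scale, small averages come out light, large ones dark (boundaries)."""
    rows, cols, _ = codebook.shape
    U = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            dists = [np.linalg.norm(codebook[i, j] - codebook[a, b])
                     for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                     if 0 <= a < rows and 0 <= b < cols]
            U[i, j] = np.mean(dists)   # large value = cluster boundary
    return U
```

On a toy 2×4 map whose left half holds one weight vector and whose right half holds another, the boundary column gets a visibly larger U-value than the homogeneous interior.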
![Page 52: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/52.jpg)
Our goal
Cluster efficiently imprecise data
Estimate the number of clusters
Visualize the results
Interpret the results
![Page 53: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/53.jpg)
Our goal
Cluster efficiently imprecise data ⇒ fuzzy c-means clustering (FCM)
Estimate the number of clusters ⇒ cluster validity indicators
Visualize the results ⇒ spread-sheet-like form
Interpret the results ⇒ find “landmarks”
![Page 54: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/54.jpg)
The objective function
corresponds to the weighted distance between input patterns and cluster centers:

$J_s(U, \mathbf{v}) = \sum_{p=1}^{P} \sum_{i=1}^{c} (u_{ip})^s \sum_{j=1}^{n} (x_{pj} - v_{ij})^2$

($u_{ip}$ … membership degree, $s$ … fuzziness parameter, $\vec{x}_p$ … input pattern, $\vec{v}_i$ … cluster center)

membership degrees between 0 and 1: $0 \le u_{ip} \le 1$
total membership of a pattern equals 1: $\sum_{i=1}^{c} u_{ip} = 1$ for every $p$
no empty or full clusters: $0 < \sum_{p=1}^{P} u_{ip} < P$ for every $i$
![Page 55: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/55.jpg)
Fuzzy c-means Clustering (FCM)
Step 1: Initialize c, s, ε and t = 0. Choose $U^{(0)}$ randomly.
Step 2: Determine new fuzzy cluster centers:
$\vec{v}_i^{\,(t)} = \sum_{p} \left(u_{ip}^{(t)}\right)^s \vec{x}_p \Big/ \sum_{p} \left(u_{ip}^{(t)}\right)^s$
Step 3: Calculate the new partition matrix $U^{(t+1)}$:
$u_{ip}^{(t+1)} = \dfrac{\left(1/\|\vec{x}_p - \vec{v}_i^{\,(t)}\|^2\right)^{1/(s-1)}}{\sum_{k=1}^{c} \left(1/\|\vec{x}_p - \vec{v}_k^{\,(t)}\|^2\right)^{1/(s-1)}}$
Step 4: Evaluate $\Delta = \|U^{(t+1)} - U^{(t)}\| = \max_{i,p} \left| u_{ip}^{(t+1)} - u_{ip}^{(t)} \right|$.
If ∆ > ε then set t = t + 1 and go to Step 2. If ∆ ≤ ε then Stop.
END of FCM
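The four steps above can be sketched compactly with NumPy. This is a minimal illustration, not the tutorial's implementation; it starts from initial cluster centers (optionally `V0`, otherwise random data points) rather than from a random $U^{(0)}$, which is an equivalent and numerically more convenient entry point into the same loop:

```python
import numpy as np

def fcm(X, c, s=1.4, eps=0.05, max_iter=100, V0=None, rng=None):
    """Fuzzy c-means: alternate partition-matrix and cluster-center updates
    until the largest membership change drops below eps (Step 4)."""
    rng = np.random.default_rng(rng)
    P, n = X.shape
    V = (np.array(V0, dtype=float) if V0 is not None
         else X[rng.choice(P, size=c, replace=False)].astype(float))
    U = np.full((c, P), 1.0 / c)
    for _ in range(max_iter):
        # squared distances of every pattern to every center, clipped away from 0
        d2 = np.fmax(((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2), 1e-12)
        inv = (1.0 / d2) ** (1.0 / (s - 1.0))
        U_new = inv / inv.sum(axis=0)                  # Step 3: new partition matrix
        W = U_new ** s                                 # fuzzified memberships
        V = (W @ X) / W.sum(axis=1, keepdims=True)     # Step 2: new centers
        delta = np.max(np.abs(U_new - U))              # Step 4: convergence test
        U = U_new
        if delta <= eps:
            break
    return U, V
```

On two well-separated blobs the memberships quickly become nearly crisp and the centers settle on the blob means.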
![Page 56: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/56.jpg)
Cluster validity criteria
Partition coefficient:
$F(U; c) = \dfrac{1}{P} \sum_{p=1}^{P} \sum_{i=1}^{c} (u_{ip})^2$
Partition entropy:
$H(U; c) = -\dfrac{1}{P} \sum_{p=1}^{P} \sum_{i=1}^{c} u_{ip} \ln(u_{ip})$ ;  $u_{ip} \ln(u_{ip}) = 0$ for $u_{ip} = 0$.
Windham´s proportion exponent:
$W(U; c) = -\ln \displaystyle\prod_{p=1}^{P} \left[ \sum_{j=1}^{\lfloor \mu_p^{-1} \rfloor} (-1)^{j+1} \binom{c}{j} (1 - j\mu_p)^{c-1} \right]$ ;  $\mu_p = \max_{1 \le i \le c} u_{ip}$
($p$ … patterns, $i$ … clusters, $u_{ip}$ … membership degree)
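The first two indicators are straightforward to compute from the partition matrix; a minimal sketch (the entropy uses the slide's convention $u \ln u = 0$ at $u = 0$):

```python
import numpy as np

def partition_coefficient(U):
    """F(U; c) = (1/P) * sum_p sum_i u_ip^2 -- approaches 1 for crisp partitions."""
    return (U ** 2).sum() / U.shape[1]

def partition_entropy(U):
    """H(U; c) = -(1/P) * sum_p sum_i u_ip * ln(u_ip), with 0 * ln(0) = 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(U > 0, U * np.log(U), 0.0)
    return -terms.sum() / U.shape[1]
```

A crisp partition gives F = 1 and H = 0; a completely fuzzy one with c = 2 gives F = 1/2 and H = ln 2, which is why F is maximized and H minimized when searching for a good number of clusters.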
![Page 57: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/57.jpg)
How many clusters?
Partition coefficient: $\max_{2 \le c \le P-1} \left\{ \max_U \left[ F(U; c) \right] \right\}$
Partition entropy: $\min_{2 \le c \le P-1} \left\{ \min_U \left[ H(U; c) \right] \right\}$
Windham´s proportion exponent: $\max_{2 \le c \le P-1} \left\{ \max_U \left[ W(U; c) \right] \right\}$
($c$ … clusters, $U$ … partitions)
![Page 58: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/58.jpg)
Supporting experiments - artificial data
Cluster validity indicators for artificial data; fuzzy 4-partition of the data
(21 input patterns, s = 1.4, ε = 0.05; ´×´ indicates cluster centers, patterns from the same clusters are labeled identically)
![Page 59: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/59.jpg)
Supporting experiments - artificial data
Fuzzy 6-partition and fuzzy 8-partition of the data
(´×´ indicates cluster centers, patterns from the same clusters are labeled identically)
![Page 60: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/60.jpg)
Interpret the results!
Characteristic features for detected clusters:
cluster centers - “fictive” patterns out of the data set
“calibrate” clusters with the “most representative” patterns from the data set - based on just one pattern ⇒ fuzzy c-landmarks
Determine outstanding properties for clusters:
– compared to other properties within the cluster
– compared to properties of other clusters
– exception: “border areas”
![Page 61: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/61.jpg)
Automatic landmark selection
Fuzzy c-landmark $(\vec{x}_{j^*}, \vec{v}_{i^*})$ for cluster $i^*$:
– the “fuzzy distance” from the cluster center should be small
– the “fuzzy distance” from all other cluster centers should be large

$j^* = \arg\min_{1 \le p \le P} \dfrac{ (u_{i^*p})^s \sum_{j=1}^{n} (x_{pj} - v_{i^*j})^2 \Big/ \sum_{p=1}^{P} (u_{i^*p})^s }{ \min_{1 \le i \le c,\, i \ne i^*} \left[ (u_{ip})^s \sum_{j=1}^{n} (x_{pj} - v_{ij})^2 \Big/ \sum_{p=1}^{P} (u_{ip})^s \right] }$

($u_{ip}$ … membership degrees, $\vec{v}_i$ … cluster centers, $\vec{x}_p$ … input patterns, $i$ … clusters, $j$ … inputs)
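The selection rule can be sketched as follows. This is an illustration of the idea — small fuzzy distance to the own center, large fuzzy distance to every other center, compared as a ratio; the exact normalization on the original slide may differ in detail:

```python
import numpy as np

def fuzzy_landmark(X, U, V, i_star, s=1.4):
    """Index of the landmark pattern for cluster i_star: minimize the ratio of
    the fuzzy distance to its own center over the smallest fuzzy distance to
    any other center."""
    # squared distances of every pattern to every center: shape (c, P)
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    fuzzy_d = (U ** s) * d2                                  # "fuzzy distances"
    others = np.delete(fuzzy_d, i_star, axis=0).min(axis=0)  # nearest foreign center
    score = fuzzy_d[i_star] / np.fmax(others, 1e-12)
    return int(np.argmin(score))
```

For two well-separated groups the landmark of each cluster comes out as one of that cluster's own, centrally located patterns.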
![Page 62: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/62.jpg)
Supporting experiments: The World Bank Data
99 state economies with 16 (latest) indicators for each country
– economic and social potential of countries and their citizens
– all indicators are relative to population
element-wise transformation to (0, 1) with:
$x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$  and  $x'' = \dfrac{1}{1 + e^{-k(x' - 1/2)}}$
($x_{\max}$ / $x_{\min}$ … maximum / minimum over all patterns)
the choice of the other parameters: k = 4; s = 1.4; ε = 0.05
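The two-stage transformation above — min-max scaling followed by a logistic squashing with k = 4 — is a one-liner per stage; a minimal sketch for a single indicator column:

```python
import numpy as np

def wdi_transform(x, k=4.0):
    """Element-wise transform of one indicator to (0, 1):
    x' = (x - x_min)/(x_max - x_min), then x'' = 1/(1 + exp(-k(x' - 1/2)))."""
    x = np.asarray(x, dtype=float)
    x1 = (x - x.min()) / (x.max() - x.min())       # x' in [0, 1]
    return 1.0 / (1.0 + np.exp(-k * (x1 - 0.5)))   # x'' in (0, 1)
```

The squashing keeps every transformed value strictly inside (0, 1) and de-emphasizes the extreme ends of the range, which suits the imprecise WDI values.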
![Page 63: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/63.jpg)
Used Development Indicators
GNP per capita
Purchasing Power Parity
Growth rate of GDP per capita
GDP implicit deflator
External debt (% of GNP)
Total debt service (% of export of goods and services)
High technology exports (% of manufactured exports)
Military expenditures (% of GNP)
Expenditures for R&D (% of GNP)
Total expenditures on health (% of GDP)
Public expenditures on education (% of GNP)
Male life expectancy at birth
Fertility rates
GINI-index (distribution of income/consumption)
Internet hosts per 10000 people
Mobile phones per 1000 people
![Page 64: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/64.jpg)
Supporting experiments: the World Bank data
Cluster validity indicators for the WB-data: 99 countries with 16 indicators; s = 1.1, ε = 0.05
Cluster validity indicators for the WB-data: 99 countries with 16 indicators; s = 1.4, ε = 0.05
![Page 65: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/65.jpg)
Fuzzy 7-partition of the WB data
A part of the fuzzy 7-partition of the World Bank data: 99 countries with 16 indicators; s = 1.4, ε = 0.05
![Page 66: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/66.jpg)
Landmarks for the WB data
“Representative patterns” and fuzzy 7-landmarks for the World Bank data: 99 countries with 16 indicators; s = 1.4, ε = 0.05
![Page 67: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/67.jpg)
FCM-Clustering: conclusions
FCM-clustering
– efficiency and cluster validity
– choice of the fuzziness parameter
– grouping of country economies (World Bank, Ip, Kohonen, Deboeck)
Visualization
– membership degree
– topological relationships
Landmarks and interpretation of the results
– formulation of “class discriminating” criteria
![Page 68: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/68.jpg)
From FCM towards Fuzzy Systems
Rule extraction: Characteristics ⇒ Economical results
– fuzzy inference systems
– (feed-forward) neural networks
back-propagation
RBF-networks
Neuro-fuzzy systems with adaptive inputs
– detection of significant input patterns
– influence of internal knowledge representation
– speed-up of the training and recall process
![Page 69: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/69.jpg)
Artificial Neural Networks (ANN)
Detect patterns in the data in a way “similar” to human thinking!
Directed data mining (classification and prediction)
Applicable also to undirected data mining (SOMs)
Two major drawbacks:
– difficulty in understanding the models they produce
– sensitivity to the format of incoming data
![Page 70: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/70.jpg)
Back-Propagation and GREN-networks
![Page 71: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/71.jpg)
Introduction
Multi-layer feed-forward networks (BP-networks)
– one of the most often used models
– relatively simple training algorithm
– relatively good results
Limits of the considered model
– the speed of the training process
– convergence and local minima
– generalization and “over-training” ⇒ additional demands on the desired network behavior
![Page 72: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/72.jpg)
Back-Propagation training algorithm
The error function
corresponds to the difference between the actual and the desired network output:
$E = \dfrac{1}{2} \sum_{p} \sum_{j} \left( y_{j,p} - d_{j,p} \right)^2$
($y_{j,p}$ … actual output, $d_{j,p}$ … desired output, $p$ … patterns, $j$ … output neurons)
the goal of the training process is to minimize this difference on the given training set
![Page 73: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/73.jpg)
The Back-Propagation training algorithm
computes the actual output for a given training pattern
compares the desired and the actual output
adapts the weights and the thresholds
– against the gradient of the error function
– backwards from the output layer towards the input layer
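The three steps — forward pass, comparison with the desired output, gradient-based weight adaptation from the output layer backwards — can be sketched for a one-hidden-layer network. This is a minimal illustration (sigmoid units, no biases, a single training pattern), not the tutorial's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bp_step(x, d, W1, W2, alpha=0.5):
    """One Back-Propagation step minimizing E = 1/2 * sum_j (y_j - d_j)^2."""
    h = sigmoid(W1 @ x)                            # hidden activations
    y = sigmoid(W2 @ h)                            # actual output
    delta_out = (d - y) * y * (1 - y)              # output error terms
    delta_hid = (W2.T @ delta_out) * h * (1 - h)   # back-propagated to hidden layer
    W2 = W2 + alpha * np.outer(delta_out, h)       # adapt against the gradient
    W1 = W1 + alpha * np.outer(delta_hid, x)
    return W1, W2, y

# Repeated steps drive the actual output towards the desired one:
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
x, d = np.array([1.0, 0.0]), np.array([1.0])
for _ in range(500):
    W1, W2, y = bp_step(x, d, W1, W2)
```

After a few hundred steps the single output settles close to its target of 1.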
![Page 74: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/74.jpg)
Drawbacks of the standard BP-model
The error function
– correspondence to the desired behavior
– the form of the training set
requires the knowledge of desired network outputs
better performance for “larger” and “well-balanced” training sets
Generalization abilities
– ability to interpret and evaluate the “gained” experience
– retraining for modified and/or developing task domains
![Page 75: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/75.jpg)
Desired properties of trained networks
Robustness against small deviations of those input patterns lying “close to the separating hyper-plane”
Transparent network structure with a suitable internal knowledge representation
A possible reuse of already trained networks under changed conditions
![Page 76: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/76.jpg)
Condensed internal representation
interpret the activity of hidden neurons:
1 active YES
0 passive NO
½ silent “no decision possible”
“clear” the inner network structure
detect superfluous neurons and prune
![Page 77: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/77.jpg)
How to force the condensed internal representation?
formulate “the desired properties” in the form of an objective function:
$G = E + c_F F$
($E$ … standard error function, $F$ … representation error function, $c_F$ … the influence of F on G)
local minima of the representation error function correspond to active, passive and silent states:
$F = \sum_{h} \sum_{p} \left( y_{h,p} \right)^s \left( 1 - y_{h,p} \right)^s \left( y_{h,p} - 0.5 \right)^2$
($h$ … hidden neurons, $p$ … patterns; $s$ determines the shape of F; minima at the active (1), passive (0) and silent (0.5) states)
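Both terms of the objective above are easy to evaluate directly; a minimal sketch (the default weight $c_F$ is an illustrative choice, not a value from the tutorial):

```python
import numpy as np

def representation_error(Y_hidden, s=1.0):
    """F = sum_h sum_p y_hp^s (1 - y_hp)^s (y_hp - 0.5)^2 -- exactly zero in
    the active (1), passive (0) and silent (0.5) states."""
    Y = np.asarray(Y_hidden, dtype=float)
    return ((Y ** s) * ((1 - Y) ** s) * (Y - 0.5) ** 2).sum()

def objective(E, F, c_F=0.1):
    """G = E + c_F * F; c_F weighs the influence of F on G."""
    return E + c_F * F
```

Hidden outputs sitting exactly at 0, 1 or 0.5 contribute nothing to F; anything in between is penalized, which is what pushes the network towards a condensed representation.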
![Page 78: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/78.jpg)
Influence of parameters
– slower forcing of the internal representation and the desired network function
– stability of the forced internal representation and an optimal network architecture
– the shape of the representation error function, the speed of the representation forcing process and its form
– the time-overhead of the weight adjustment
$w_{ij}(t+1) = w_{ij}(t) + \alpha \, \delta_j y_i + \alpha_r \, \rho_j y_i + \alpha_m \left( w_{ij}(t) - w_{ij}(t-1) \right)$
(plot: representation error vs. actual neuron output)
![Page 79: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/79.jpg)
Shape of the representation function
$F(y) = y^s (1 - y)^s (y - 0.5)^t$
(plots for s = 1, t = 2; s = 4, t = 2; s = 8, t = 2; s = 5, t = 4)
![Page 80: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/80.jpg)
Further modifications of the representation function
Discrete internal representation ($r_1, \ldots, r_S$ … S allowed output values for neurons from the last hidden layer):
$F = \sum_{j} \sum_{p} \left( y_{j,p} - r_1 \right)^{2t} \cdots \left( y_{j,p} - r_S \right)^{2t} = \sum_{j} \sum_{p} \prod_{s=1}^{S} \left( y_{j,p} - r_s \right)^{2t}$
Condensed internal representation for all hidden layers:
$F = \sum_{l'} \sum_{j} \sum_{p} \left( y^{l'}_{j,p} \right)^{s_{l'}} \left( 1 - y^{l'}_{j,p} \right)^{s_{l'}} \left( y^{l'}_{j,p} - 0.5 \right)^2$
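The discrete variant is a straightforward product over the allowed values; a minimal sketch:

```python
import numpy as np

def discrete_representation_error(Y, r, t=2):
    """F = sum_j sum_p prod_s (y_jp - r_s)^(2t): zero exactly when every hidden
    output hits one of the S allowed values r_1, ..., r_S."""
    Y = np.asarray(Y, dtype=float)
    r = np.asarray(r, dtype=float)
    diffs = (Y[..., None] - r) ** (2 * t)   # one factor per allowed value
    return diffs.prod(axis=-1).sum()
```

As soon as an output coincides with any allowed value, one factor of the product vanishes and that output stops contributing to F.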
![Page 81: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/81.jpg)
Unambiguous internal representation
Patterns with highly different outputs should form highly different internal representations
Formulate the requirements as a modified objective function: $G = E + F + H$
Ambiguity criterion for the internal representation:
$H = \dfrac{1}{2} \sum_{p} \sum_{q \ne p} \sum_{j} \sum_{o} \left( d_{o,p} - d_{o,q} \right)^2 \left[ 1 - \left( y_{j,p} - y_{j,q} \right)^2 \right]$
($p, q$ … patterns, $j$ … hidden neurons, $o$ … output neurons; the $d$-terms are constant for a given $p$)
![Page 82: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/82.jpg)
Modular structure of BP-networks
Decompose the task into the particular subtasks
Propose and form the modular architecture
– strategy for extracting ε-equivalent BP-modules
elimination of superfluous hidden and/or input neurons
suitable for "already trained" networks
a compromise between the desired accuracy of the extracted module and its optimal architecture
Communication between the particular modules
– serial and parallel composition of BP-networks
![Page 83: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/83.jpg)
Extracting BP-modules - allowed potential deviations
The potential change $\delta_r^-(\xi)$ is in this case smaller than the potential change $\delta_r^+(\xi)$
The potential should change "towards the separating hyper-plane"
The changed potential should preserve the location of the input pattern in the same half-space
The allowed potential changes should be independent of each particular input pattern (from S)
(plot: transfer function $f(\xi)$ with the band $f(\xi) \pm \varepsilon_r$, here $\varepsilon_r = 0.2$, and the allowed interval $\left( \xi - \delta_r^-(\xi),\; \xi + \delta_r^+(\xi) \right)$)
![Page 84: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/84.jpg)
Notes on the construction of an ε-equivalent network
possible improvements of network properties:
– “egalitarian” versus “differentiated” approach
the relationship of the construction to “more robust” networks
– necessary knowledge of $\varepsilon_r$-boundary regions
– preserve the created internal representation
![Page 85: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/85.jpg)
Desired properties of “experts” for training (modular) BP-networks
An “expert” should:
– evaluate the error connected with the actual response of a BP-network
– “explain” to the BP-network its error during training
– not require the knowledge of the desired network output
– but recognize a correct behavior
– “suggest” a “better” behavior
![Page 88: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/88.jpg)
GREN-networks: Generalized relief error networks
assign the error to the pairs [input pattern, actual output]
trained e.g. by the standard BP-training algorithm
should have good approximation and generalization abilities
“approximates” the error function by:
$E = \sum_{p} \sum_{e} B^{GR}_{e,p}$
($p$ … patterns, $e$ … output neurons of the GREN-network, $B^{GR}_{e,p}$ … output values of the GREN-network)
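Building the GREN-network's training set from an already trained BP-network can be sketched as follows. The names `bp_forward`, `X`, `D` are illustrative, and using per-output squared errors as the target "error values" is an assumption — the slides leave the exact error encoding open:

```python
import numpy as np

def gren_training_set(bp_forward, X, D):
    """GREN training pairs: inputs are [input pattern, actual output] of the
    BP-network, targets are the error values it should learn to assign."""
    inputs, targets = [], []
    for x, d in zip(X, D):
        y = bp_forward(x)                         # actual output of B
        inputs.append(np.concatenate([x, y]))     # [input pattern, actual output]
        targets.append((y - d) ** 2)              # assumed error encoding
    return np.array(inputs), np.array(targets)
```

The resulting pairs can then be fed to standard Back-Propagation to train the GREN-network itself.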
![Page 89: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/89.jpg)
A modular system for training BP-networks with GREN-networks
(block diagram: the input pattern enters the adapted BP-network; the input pattern together with the actual output of the BP-network enters the GREN-network, which yields the error values)
![Page 90: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/90.jpg)
Training with a GREN-network
Applies the basic idea of Back-Propagation
How to determine the error terms at the output of the trained BP-network?
⇒ Use error terms back-propagated from the GREN-network
Weight adjustment rules similar to the standard Back-Propagation
![Page 91: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/91.jpg)
Training with a GREN-network
Applies the basic idea of Back-Propagation
How to determine $\partial E / \partial y_j^B$ at the output layer of the BP-network B?
$\Delta w_{ij}^B = -\dfrac{\partial E}{\partial w_{ij}^B} = -\dfrac{\partial E}{\partial y_j^B} \, \dfrac{\partial y_j^B}{\partial \xi_j^B} \, \dfrac{\partial \xi_j^B}{\partial w_{ij}^B}$
($w_{ij}^B$ … weight of the BP-network B, $y_j^B$ … actual output, $\xi_j^B$ … potential of neuron $j$, $E$ … error computed by the GREN-network)
![Page 92: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/92.jpg)
Weight adjustment rules
Use error terms back-propagated from the GREN-network
Rules similar to the standard Back-Propagation:
$w_{ij}^B(\text{new}) = w_{ij}^B(\text{old}) + \alpha \, \delta_j^B \, y_i^B$
($\alpha$ … learning rate, $\delta_j^B$ … error term, $y_i^B$ … actual output of neuron $i$, $w_{ij}^B$ … weight)
For output neurons, compute $\delta_j^B$ by means of $\delta_k^{GR_B}$ propagated from the GREN-network $GR^B$:
$\delta_k^{GR_B} = -\dfrac{\partial E}{\partial y_k^{GR_B}} \, \dfrac{\partial y_k^{GR_B}}{\partial \xi_k^{GR_B}}$
($\delta_k^{GR_B}$ … error term, $y_k^{GR_B}$ … actual output of neuron $k$, $\xi_k^{GR_B}$ … potential of neuron $k$, $E$ … error)
![Page 93: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/93.jpg)
Error terms for the trained BP-network
With $E = \sum_{e} e_e$, the back-propagated error terms $\delta_j^B$ correspond to:
$\delta_j^B = -\left[ \sum_{e} f'(\xi_e^{GR}) \, w_{je}^{GR} \right] f'(\xi_j^B)$  … for an output neuron of B and $GR^B$ with no hidden layer
$\delta_j^B = \left[ \sum_{k} \delta_k^{GR} \, w_{jk}^{GR} \right] f'(\xi_j^B)$  … for an output neuron of B and $GR^B$ with hidden layers
$\delta_j^B = \left[ \sum_{k} \delta_k^B \, w_{jk}^B \right] f'(\xi_j^B)$  … for a hidden neuron of B
![Page 94: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/94.jpg)
Is the GREN-network an “expert”?
It does not have to “know the right answer”
But it should “recognize the correct answer”:
for an input pattern, it yields the minimum error only for one actual output - the right one
Simple test for “problematic” GREN-experts:
– zero weights from the actual output $y^B$
– zero “y-terms” of the potentials in the 1st hidden layer
– “too many large negative weights”: $\sum_i w_i^- \gg \sum_i w_i^+$
![Page 95: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/95.jpg)
Find “better” input patterns!
input patterns of a GREN-network
– “similar” to those presented to and recalled by the BP-network
– with a smaller error
⇒ minimize the error at the output of the GREN-network, e.g. by back-propagation
⇒ adjust input patterns against the gradient of the GREN-network error function
![Page 96: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/96.jpg)
Avoid “problematic” GREN-networks!
Insensitive to the outputs of trained BP-networks
– inadequately small error terms back-propagated by the GREN-network
Incapable of training further BP-networks
– small error terms even for large errors
Our goal: Increase the sensitivity of GREN-networks to their inputs!
![Page 97: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/97.jpg)
How to handle the sensitivity of BP-networks?
Increase their robustness:
over-fitting leads to functions with a lot of structure and a relatively high curvature
favor “smoother” network functions
alternative formulation of the objective function
– penalizing large second-order derivatives of the network function
– penalizing large second-order derivatives of the transfer function for hidden neurons
– weight-decay regularizers
![Page 98: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/98.jpg)
Iveta Mrázová, ANNIE´03 98
Controlled learning of GREN-networks
Require GREN-networks sensitive to their inputs
– non-zero error terms for incorrect BP-network outputs
Favor larger values of the error terms
Minimize during training

$$E^{REG} \;=\; \sum_{q} E_q^{REG} \;=\; E \;-\; \sum_{q} \sum_{s} \sum_{r>n} \left( \frac{\partial y_{qs}}{\partial y_{qr}} \right)^{2}$$

where q runs over the patterns, s over the output neurons and r over the controlled input neurons; y_{qs} are the output values and y_{qr} the controlled input values.
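As an illustration, the objective above can be sketched in a few lines of numpy. The two-layer tanh network, its weights and the choice of controlled inputs are invented for this example, and central finite differences stand in for the back-propagated sensitivities:

```python
import numpy as np

# Toy two-layer network standing in for a trained network (weights invented).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 3))
W2 = rng.normal(size=(2, 5))

def forward(x):
    return W2 @ np.tanh(W1 @ x)

def sensitivity(x, s, r, h=1e-5):
    """Central-difference estimate of dy_s/dy_r (output s w.r.t. input r)."""
    e = np.zeros_like(x)
    e[r] = h
    return (forward(x + e)[s] - forward(x - e)[s]) / (2 * h)

def regularized_error(patterns, targets, controlled=(2,)):
    """E_REG = E minus the squared sensitivities w.r.t. the controlled inputs,
    so minimizing E_REG simultaneously favors larger sensitivity terms."""
    e_reg = 0.0
    for x, t in zip(patterns, targets):
        y = forward(x)
        e_reg += 0.5 * np.sum((y - t) ** 2)          # standard error E
        e_reg -= sum(sensitivity(x, s, r) ** 2       # sensitivity reward
                     for s in range(len(y)) for r in controlled)
    return e_reg
```

Since the sensitivity term is subtracted, gradient descent on E_REG drives the network toward both a small output error and a large response to the controlled inputs.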
Weight adjustment rules
Regularization by means of

$$\Delta w_{ij}^{E_{REG}} \;=\; -\,\frac{\partial E^{REG}}{\partial w_{ij}} \;=\; -\,\frac{\partial E}{\partial w_{ij}} \;+\; \frac{\partial}{\partial w_{ij}} \sum_{s} \sum_{r>n} \left( \frac{\partial y_{s}}{\partial y_{r}} \right)^{2}$$

(s runs over the output neurons, r over the controlled inputs)
Rules similar to the standard Back-Propagation:

$$w_{ij}^{GR}(T+1) \;=\; w_{ij}^{GR}(T) \;+\; \alpha\,\Delta w_{ij}^{E_{B}} \;+\; \alpha_c\,\Delta w_{ij}^{E_{REG}} \;+\; \alpha_m \big( w_{ij}^{GR}(T) - w_{ij}^{GR}(T-1) \big)$$

with the BP-weight adjustment $\alpha\,\Delta w_{ij}^{E_{B}}$, the controlled weight adjustment $\alpha_c\,\Delta w_{ij}^{E_{REG}}$, and the momentum term $\alpha_m \big( w_{ij}^{GR}(T) - w_{ij}^{GR}(T-1) \big)$.
Characteristics of the method
Applicable to any BP-network and/or input neuron
Quicker training of "actual" BP-networks
– larger "sensitivity terms" $\partial y_s / \partial y_r$ transfer the errors from the GREN-network better
Oscillations during training "actual" BP-networks
– due to the "linear" nature of the GREN-specified error function

$$E \;=\; \sum_{p} \sum_{e} e_{pe}^{B,GR}$$

where p runs over the patterns and e over the output neurons of the GREN-network; $e_{pe}^{B,GR}$ are the output values of the GREN-network.
Modification of the method
Use "quadratic" GREN-specified error terms for training "actual" BP-networks:

$$\hat{E} \;=\; \sum_{p} \sum_{e} \left( e_{pe}^{B,GR} \right)^{2}$$

where p runs over the patterns and e over the output neurons of the GREN-network; $e_{pe}^{B,GR}$ are the output values of the GREN-network.
Considers both the GREN-network outputs $e_{pe}^{B,GR}$ and the "sensitivity" terms $\partial e_{pe}^{B,GR} / \partial y_{pj}^{B}$
Crucial for low sensitivity to erroneous training patterns
Supporting experiments
[3-D surface plot: output of the standard BP-network (x-coordinate, y-coordinate vs. network output); 3000 cycles, SSE = 0.89]
[3-D surface plot: output of the BP-network trained with a GREN-network; 3000 cycles, SSE = 0.05]
Supporting experiments
[Plot: BP-network output for a constant y = 0.25; error bars correspond to the GREN-error]
[Plots: GREN-adjusted input/output patterns for a constant y = 0.25; error bars correspond to the GREN-error; initial I/O_pattern_1 = [0, 0.25, 0.197], I/O_pattern_2 = [0.5, 0.25, 0.388], I/O_pattern_3 = [1, 0.25, 0.932]]
Supporting experiments
[3-D surface plots: output of the standard BP-network with 8 hidden neurons (1500 cycles, SSE = 0.51) vs. output of the GREN-trained BP-network with 8 hidden neurons (1500 cycles, SSE = 0.06, GREN-error = 1.2)]
Supporting experiments
[Plots of SSE vs. training cycles: sensitivity and error for a standard BP-trained GREN-network, and for a controlled-trained GREN-network (control rates = 0.2)]
Supporting experiments
[Plots of SSE and sensitivity to BP-outputs vs. training cycles: sensitivities and error for a controlled-trained GREN-network (control rates = 0.2), and for an over-trained GREN-network (control rates = 0.2)]
GREN-networks: conclusions
GREN-networks can train BP-networks without the knowledge of their desired outputs
A simple detection of "problematic" GREN-experts
GREN-networks can find "similar" input patterns with a lower error
Conclusions: sensitivity of GREN-networks
Increase the sensitivity of trained GREN-networks to their inputs
Detect "over-training" in GREN-networks
Train BP-networks more efficiently by minimizing squared GREN-network outputs instead of the "linear" ones
Further research: simplified sensitivity control
Acoustic Emission and Feature Selection Based on Sensitivity Analysis
with M. Chlada and Z. Převorovský, Institute of Thermomechanics, Academy of Sciences
Acoustic Emission and Feature Selection Based on Sensitivity Analysis
BP-networks and sensitivity analysis
– larger "sensitivity terms" $| \partial y_j / \partial x_i |$ indicate a higher importance of the feature i
Numerical experiments
– acoustic emission: classification of simulated AE data
– feature selection: reduction of original input parameters (from 14 to 6); model dependence between parameters
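The feature-ranking idea can be sketched as follows; the two-layer tanh network and its data are invented for the example, and the sensitivities ∂y_j/∂x_i are computed analytically via the chain rule and averaged over a sample of patterns:

```python
import numpy as np

# Invented two-layer tanh network; features are ranked by the mean
# absolute sensitivity |dy_j/dx_i| over a sample of input patterns.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(8, 4))   # 4 input features, 8 hidden neurons
W2 = rng.normal(size=(3, 8))   # 3 outputs

def forward(x):
    h = np.tanh(W1 @ x)
    return np.tanh(W2 @ h), h

def input_jacobian(x):
    """Analytic dy/dx via the chain rule: diag(1-y^2) W2 diag(1-h^2) W1."""
    y, h = forward(x)
    return ((1 - y ** 2)[:, None] * W2) @ ((1 - h ** 2)[:, None] * W1)

patterns = rng.normal(size=(100, 4))
coeff = np.mean([np.abs(input_jacobian(x)) for x in patterns], axis=0)
importance = coeff.sum(axis=0)       # one aggregate score per input feature
ranking = np.argsort(-importance)    # most important feature first
```

Features whose aggregate score stays small across all outputs are candidates for pruning, which is how the slide arrives at the reduced input set.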
Simulation of AE-data
[Plots of model pulses: WAVE 1, WAVE 2, WAVE 3, and the mixture 0.3*WAVE1 + 0.2*WAVE2 + 0.5*WAVE3]
Simulation of AE-data
[Plots: input signal (a=0.3, b=0.2, c=0.5), its convolution with the Green function, and the Green function - 140 mm]
Original Features for AE-signals
amplitude: $z_{max} = \max_{t \in T} \{ z(t) \}$
rise time
effective value (RMS): $RMS = \sqrt{ \tfrac{1}{T} \int_T z^2(t)\,dt }$
energy moment: $t_E = \int_T t \cdot z^2(t)\,dt$
mean value: $t_s = \left( \int_T t \cdot z(t)\,dt \right) / T$
deviation: $\sigma^2 = \left( \int_T (t - t_s)^2\, z(t)\,dt \right) / T$
asymmetry: $\eta = \left( \int_T (t - t_s)^3\, z(t)\,dt \right) / (\sigma^2)^{3/2}$
excess: $\xi = \left( \int_T (t - t_s)^4\, z(t)\,dt \right) / (\sigma^2)^{2}$
6 spectral parameters: $P_X = \int_{G_X} f(k)\,dk \,\Big/ \int_{G} f(k)\,dk\,, \quad X \in \{A, B, C, D, E, F\}$
with $G_A = [0,\,0.12)\,f_N/2$, $G_B = [0.12,\,0.24)\,f_N/2$, $G_C = [0.24,\,0.36)\,f_N/2$, $G_D = [0.36,\,0.48)\,f_N/2$, $G_E = [0.48,\,0.6)\,f_N/2$, $G_F = [0.6,\,1)\,f_N/2$, $G = [0,\,1)\,f_N/2$, and $f_N$ is the Nyquist frequency.
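The band parameters P_X can be sketched with an FFT as below; the test signal and its sampling frequency are invented, while the band edges follow the fractions of f_N/2 listed above:

```python
import numpy as np

# Fraction of spectral energy per sub-band of [0, f_N/2] (signal/fs are made up).
fs = 1000.0                           # sampling frequency (assumption)
t = np.arange(0, 1.0, 1 / fs)
z = np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 180 * t)

f = np.fft.rfftfreq(len(z), 1 / fs)   # 0 .. f_N, with f_N = fs/2 (Nyquist)
spec = np.abs(np.fft.rfft(z))

bands = {"A": (0.00, 0.12), "B": (0.12, 0.24), "C": (0.24, 0.36),
         "D": (0.36, 0.48), "E": (0.48, 0.60), "F": (0.60, 1.00)}
f_half = (fs / 2) / 2                 # f_N / 2, the upper edge used above
total = spec[f < f_half].sum()
P = {X: spec[(f >= lo * f_half) & (f < hi * f_half)].sum() / total
     for X, (lo, hi) in bands.items()}
```

Because the six half-open bands partition [0, f_N/2), the parameters sum to one; here the 50 Hz component lands in band B and the weaker 180 Hz component in band F.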
Factor analysis for input parameters
9 factors selected, "explaining" 98.4% of all variables, e.g.
– a higher energy of the signals leads to higher amplitudes and RMS (parameters 1, 3, 4)
Allows to reduce linearly dependent input parameters
– in our case to: 2, 3, 5, 6, 7, 8, 11, 12 and 2 new spectral parameters
[Matrix plot: input parameters vs. selected factors]
Sensitivity analysis of trained BP-networks
Sensitivity coefficients (14 inputs × 3 outputs; each value row below lists the coefficients of one output over the inputs 1-14):
output 1: 0.173 0.093 0.320 0.301 0.564 0.196 0.099 0.065 0.022 0.053 0.035 0.039 0.081 0.260
output 2: 0.266 0.068 0.193 0.178 0.250 0.322 0.063 0.015 0.014 0.020 0.012 0.050 0.134 0.172
output 3: 0.149 0.047 0.184 0.196 0.206 0.158 0.043 0.030 0.016 0.012 0.032 0.022 0.082 0.109
2000 samples – 500 of them for training
architecture 14-27-19-3, 180 iterations
selected inputs:
– sensitivity analysis: 1, 3, 4, 5, 6, 13, 14
– + factor analysis: 1, 3, 5, 6, 13, 14
new architecture: 6-13-7-3 (even with slightly better MSE-results)
Model dependence
Sensitivity coefficients for X4 = (X1)^4 (rows: parameters 1-4, columns: target parameters 1-4):
0.32 0.16 0.08 0.04
0.04 0.92 0.03 0.04
0.02 0.05 0.98 0.03
0.16 0.14 0.05 0.35
[Plot: X4 = (X1)^4]
Model dependence
Sensitivity coefficients for X4 = sin(9*X1) (rows: parameters 1-4, columns: target parameters 1-4):
0.77 0.13 0.06 0.11
0.03 0.89 0.05 0.03
0.04 0.02 0.97 0.03
0.12 0.11 0.05 0.59
[Plot: X4 = sin(9*X1)]
Knowledge extraction in neural networks (students' questionnaire)
with Eva Poučková, Department of Software Engineering, Charles University Prague
Knowledge representation in NN
Distributed!
⌦ Which inputs are the most important ones?
⌦ What does the network do, and how?
[Diagram: network layers between INPUT and OUTPUT]
Knowledge extraction in NN
Dimension reduction and sensitivity analysis for inputs
Rule extraction from trained networks
– Structural learning with forgetting
  BP-networks
  GREN-networks
– Babsi-trees (B. Hammer et al.)
  GRLVQ
Dimension reduction
PCA: linear transformation of the input data
Sensitivity analysis
Feature Subset Selection (FSS)
Correlation-based Feature Selection (CFS): select a group of features with a high average correlation input_feature - output, but with a low mutual correlation
Dimension reduction: results
PCA: method not suitable for further processing (knowledge and rule extraction)
25 original features
FSS: 7 features, CFS: 7 features
8 features selected as the union of the results for FSS and CFS
Features selected for the overall evaluation
Feature subset selection (FSS):
(1) understandable subject
(2) structured and prepared presentations
(3) interesting classes
(4) quality of education
(5) understandable classes
(6) start/end of class on time
(7) relationship to students
Correlation-based feature selection (CFS):
(1) understandable subject
(2) structured and prepared presentations
(3) interesting classes
(4) quality of education
(5) understandable classes
(6) start/end of class on time
(8) students prepare for classes
Methods for knowledge extraction
SLF – Structural learning with forgetting
• Learning with forgetting
• Learning with forcing internal representations on hidden neurons
• Learning with selective forgetting
Babsi-trees
• Form a tree from a neural network trained by means of the GRLVQ-method
Generalized relevance learning vector quantization (GRLVQ)
A robust combination of GLVQ and RLVQ
Provides weighting factors (λ) for input features
– a larger λ corresponds to a "more important" feature
Applicable to pruning of input features
GLVQ: considers class representatives
– separating surfaces approach the optimal Bayesian ones
RLVQ: input features can have different importance
– relatively unstable, sensitive to noise
Generalized LVQ - GLVQ
Select a fixed number of representatives w_1, …, w_L for all classes C_i, i = 1, …, q.
Receptive field of the class representative w_i:

$$R_i = \{\, x \in T \mid \forall k \neq i : \| x - w_i \| < \| x - w_k \| \,\}$$

Receptive fields of class representatives should be as small as possible!
– minimize $E = \sum_{k=1}^{p} \sigma\big( \eta( x^{(k)} ) \big)$; σ denotes the sigmoid
– and

$$\eta\big( x^{(t)} \big) = \frac{ \| x^{(t)} - w^{+} \| - \| x^{(t)} - w^{-} \| }{ \| x^{(t)} - w^{+} \| + \| x^{(t)} - w^{-} \| }$$

with $w^{+}$ the closest representative of the correct classification and $w^{-}$ the closest representative of a wrong classification.
Generalized LVQ - GLVQ
Weight adjustment:

$$\Delta w^{+} = \alpha\, \sigma'\big( \eta( x^{(t)} ) \big)\, \frac{ \| x^{(t)} - w^{-} \| }{ \big( \| x^{(t)} - w^{+} \| + \| x^{(t)} - w^{-} \| \big)^{2} }\, \big( x^{(t)} - w^{+} \big)$$

$$\Delta w^{-} = -\,\alpha\, \sigma'\big( \eta( x^{(t)} ) \big)\, \frac{ \| x^{(t)} - w^{+} \| }{ \big( \| x^{(t)} - w^{+} \| + \| x^{(t)} - w^{-} \| \big)^{2} }\, \big( x^{(t)} - w^{-} \big)$$

with $\sigma'\big( \eta( x^{(t)} ) \big) = \sigma\big( \eta( x^{(t)} ) \big) \Big( 1 - \sigma\big( \eta( x^{(t)} ) \big) \Big)$ and learning rates α.
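One online GLVQ step following the update rules above can be sketched as below; the prototypes, labels and learning rate are invented for the example:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glvq_step(x, label, W, labels, alpha=0.1):
    """Update the closest correct (w+) and closest wrong (w-) prototypes."""
    d = np.linalg.norm(W - x, axis=1)
    p = np.argmin(np.where(labels == label, d, np.inf))   # index of w+
    m = np.argmin(np.where(labels != label, d, np.inf))   # index of w-
    dp, dm = d[p], d[m]
    eta = (dp - dm) / (dp + dm)
    g = sigmoid(eta) * (1 - sigmoid(eta))                 # sigma'(eta)
    W[p] += alpha * g * dm / (dp + dm) ** 2 * (x - W[p])  # pull w+ toward x
    W[m] -= alpha * g * dp / (dp + dm) ** 2 * (x - W[m])  # push w- away
    return W
```

A single step therefore shrinks the correct prototype's distance to the pattern and enlarges the wrong one's, which is exactly what makes the receptive fields small.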
Relevance LVQ - RLVQ
Input features can have different importance λ:

$$dist_{\lambda}( x, w ) = \sum_{i=1}^{n} \lambda_i \,( x_i - w_i )^{2}\,; \qquad \sum_{i=1}^{n} \lambda_i = 1$$

Receptive field of the class representative w_i:

$$R_i = \{\, x \in T \mid \forall j \neq i : \| x - w_i \|_{\lambda} < \| x - w_j \|_{\lambda} \,\}$$

Weight adjustment according to GLVQ, with adaptive importance factors λ_i for the input features (0 < ε < 1, w_j the closest class representative):

$$\lambda_i^{(t)} = \max\big( \lambda_i^{(t-1)} - \varepsilon\, ( x_i - w_{ji} )^{2},\; 0 \big) \quad \text{if } x \text{ is classified correctly}$$
$$\lambda_i^{(t)} = \lambda_i^{(t-1)} + \varepsilon\, ( x_i - w_{ji} )^{2} \quad \text{else}$$
Generalized Relevance Learning Vector Quantization (GRLVQ)
Weight adjustment according to GLVQ, with adaptive importance factors λ_i for the input features:

$$\Delta \lambda_i = -\,\varepsilon\, \sigma'\big( \eta( x^{(t)} ) \big) \left( \frac{ \| x^{(t)} - w^{-} \| }{ \big( \| x^{(t)} - w^{+} \| + \| x^{(t)} - w^{-} \| \big)^{2} } \big( x_i^{(t)} - w_i^{+} \big)^{2} \;-\; \frac{ \| x^{(t)} - w^{+} \| }{ \big( \| x^{(t)} - w^{+} \| + \| x^{(t)} - w^{-} \| \big)^{2} } \big( x_i^{(t)} - w_i^{-} \big)^{2} \right)$$
Babsi-trees
Root-trees G=(V,E) satisfying the following conditions:
– all vertices v_i ∈ V can have an arbitrary number n_i of sons
– all leaves v_J are labeled with the corresponding classification class C_J
– all vertices v_i which are not leaves are labeled with $I_{v_i}$; $I_{v_i}$ stands for the currently processed input dimension i; dimensions are "ordered" according to their importance (λ)
– all edges going from a vertex v_i to its sons are labeled with intervals $\big[ s_{t_k}^{v_i},\, s_{t_l}^{v_i} \big)$; interval boundaries are placed in the middle between two neighboring cluster representatives
![Page 131: Intelligent Data Mining Techniquesktiml.ms.mff.cuni.cz/~mrazova/Tutor_IM.pdf · Iveta Mrázová, ANNIE´03 2 Content outline QIntelligent Data Mining: introduction and overview of](https://reader034.vdocuments.site/reader034/viewer/2022051912/600296e08dd40b326108cdcf/html5/thumbnails/131.jpg)
Iveta Mrázová, ANNIE´03 131
SLF for layered networks
– feed-forward neural networks
– GREN-networks
Results for the SLF-method
Both BP-networks and GREN-trained networks lead to similar sets of rules:
The resulting Babsi-tree
Relevant dimensions (features): 4, 8 and 7
[Tree diagram: root tests feature (4), then (8), then (7); leaves give the overall evaluation — some cover many patterns, some just one pattern, some yield an ambiguous classification]
Values for intervals: 1 → (-∞, 1.5), 2 → [1.5, 2.5), 3 → [2.5, 3.5), 4 → [3.5, 4.5), 5 → [4.5, ∞)
Comparison of the results
SLF:
– few simple hierarchically ordered rules
– possibility to add rules after achieving the desired accuracy
– rule correctly applicable: 71% and 73%, resp.
Babsi-trees:
– many simple rules
– quick training of the network
– few training parameters
– rule correctly applicable: 67%
Knowledge extraction: Conclusions
Main results achieved:
• dimension reduction for the input space
• analysis of various models for knowledge extraction
• rule extraction from GREN-networks
• comparison with other neural network models
Further research:
• adjusting rules extracted from a neural network trained with the GRLVQ-algorithm
• (automatic) selection of training parameters for the SLF-algorithm
Genetic Algorithms (GA)
Apply genetics and natural selection to find optimal parameters of a predictive function!
GA use "genetic" operators to evolve successive generations of solutions:
– selection
– crossover
– mutation
Best candidates "survive" to further generations until convergence is achieved
Directed data mining
The basic Genetic Algorithm
Step 1: Create an initial population of individuals
Step 2: Evaluate the fitness of all individuals in the population
Step 3: Select candidates for the next generation
Step 4: Create new individuals (use genetic operators - crossover and mutation)
Step 5: Form a new population by replacing (some) old individuals by new ones
GOTO Step 2
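The five steps can be run on a toy problem in a few lines; maximizing the number of 1-bits is a stand-in fitness, and all population sizes and rates are illustrative:

```python
import random

def fitness(ind):
    return sum(ind)          # toy fitness: count of 1-bits

def evolve(n=30, length=20, generations=40, pmut=0.02, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]  # Step 1
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)                   # Step 2
        parents = scored[: n // 2]                                        # Step 3
        children = []
        while len(children) < n - len(parents):                           # Step 4
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]                                     # crossover
            child = [bit ^ (rng.random() < pmut) for bit in child]        # mutation
            children.append(child)
        pop = parents + children                                          # Step 5
    return max(pop, key=fitness)
```

Replacing `fitness` with any scoring function (including a trained network, as in the project below) changes what the loop optimizes without touching the algorithm itself.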
ANTARES
STUDENT SOFTWARE PROJECT
supervised by I. Mrázová, F. Mráz
participating students: D. Bělonožník, D. J. Květoň, M. Šubert, J. Tomaštík, J. Tupý
http://www.ms.mff.cuni.cz/~mraz/antares
Project ANTARES
Generate melodies with genetic algorithms
– the fitness of candidate solutions is evaluated by the cooperating feed-forward neural network
Parallel implementation of genetic algorithms
– open system for the design and testing of genetic algorithms and neural networks
– supports mutual cooperation between neural networks and genetic algorithms
Fitness evaluation with neural networks
For some problems, it might be difficult to define the fitness function explicitly
– e.g. "evaluate" the beauty of generated melodies
Fitness of candidate melodies (generated by the GA) is evaluated by the pre-trained NN:
– provide a set of positive and negative examples (supervised learning)
– train a feed-forward network to approximate the "unknown" fitness function (on the training set)
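A minimal sketch of this idea, with a single trained logistic unit standing in for the feed-forward network and invented "melody" feature vectors as positive/negative examples:

```python
import numpy as np

# Invented positive/negative example features ("good" vs. "bad" melodies).
rng = np.random.default_rng(2)
pos = rng.normal(1.0, 0.3, size=(40, 8))
neg = rng.normal(-1.0, 0.3, size=(40, 8))
X = np.vstack([pos, neg])
t = np.array([1.0] * 40 + [0.0] * 40)

# Supervised training of a logistic unit by gradient descent.
w, b = np.zeros(8), 0.0
for _ in range(500):
    y = 1 / (1 + np.exp(-(X @ w + b)))
    grad = y - t
    w -= 0.1 * X.T @ grad / len(t)
    b -= 0.1 * grad.mean()

def fitness(candidate):
    """Learned stand-in for the 'unknown' fitness function of the GA."""
    return 1 / (1 + np.exp(-(candidate @ w + b)))
```

The GA then simply calls `fitness` on each candidate's feature vector, so the "beauty" judgment is whatever the network learned from the examples.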
Generating melodies: positive training samples
Generating melodies
Positive training samples
Negative training samples
Test samples (with a high fitness value)
Generated melodies
Thank you for your attention!