data mining
TRANSCRIPT
![Page 1: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/1.jpg)
• Data mining
https://store.theartofservice.com/the-data-mining-toolkit.html
![Page 2: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/2.jpg)
Data mining
1 The overall goal of the data mining process is to extract information from
a data set and transform it into an understandable structure for further
use
![Page 3: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/3.jpg)
Data mining
1 Even the popular book "Data mining: Practical machine learning tools and techniques with Java" (which covers mostly machine learning material)
was originally to be named just "Practical machine learning", and the term "data mining" was only added
for marketing reasons
![Page 4: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/4.jpg)
Data mining
1 Neither the data collection, data preparation, nor result interpretation
and reporting are part of the data mining step, but do belong to the overall KDD process as additional
steps.
![Page 5: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/5.jpg)
Data mining
1 The related terms data dredging, data fishing, and data snooping refer to the use of data mining methods to sample
parts of a larger population data set that are (or may be) too small for reliable
statistical inferences to be made about the validity of any patterns discovered.
These methods can, however, be used in creating new hypotheses to test against
the larger data populations.
![Page 6: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/6.jpg)
Data mining
1 Data mining turns data into real-time analysis that can be used to increase sales, promote new products,
or discontinue products that add no value to the company.
![Page 7: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/7.jpg)
Data mining Etymology
1 Currently, Data Mining and Knowledge
Discovery are used interchangeably.
![Page 8: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/8.jpg)
Data mining Background
1 Data mining is the process of applying these methods with the intention of uncovering hidden
patterns in large data sets
![Page 9: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/9.jpg)
Data mining Research and evolution
1 The premier professional body in the field is the Association for Computing
Machinery's (ACM) Special Interest Group (SIG) on Knowledge Discovery and Data Mining (SIGKDD). Since 1989 this ACM SIG has hosted an annual international
conference and published its proceedings, and since 1999 it has
published a biannual academic journal titled "SIGKDD Explorations".
![Page 10: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/10.jpg)
Data mining Research and evolution
1 Computer science conferences on data
mining include:
![Page 11: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/11.jpg)
Data mining Research and evolution
1 DMKD Conference – Research Issues on Data Mining and
Knowledge Discovery
![Page 12: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/12.jpg)
Data mining Research and evolution
1 ECDM Conference – European
Conference on Data Mining
![Page 13: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/13.jpg)
Data mining Research and evolution
1 ECML-PKDD Conference – European Conference on Machine Learning and Principles and Practice of Knowledge
Discovery in Databases
![Page 14: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/14.jpg)
Data mining Research and evolution
1 EDM Conference – International Conference
on Educational Data Mining
![Page 15: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/15.jpg)
Data mining Research and evolution
1 PAKDD Conference – The annual Pacific-Asia Conference on Knowledge Discovery and Data
Mining
![Page 16: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/16.jpg)
Data mining Research and evolution
1 SSTD Symposium – Symposium on Spatial and Temporal
Databases
![Page 17: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/17.jpg)
Data mining Research and evolution
1 Data mining topics are also present at many data
management/database conferences, such as the ICDE Conference,
the SIGMOD Conference, and the International Conference on Very
Large Data Bases.
![Page 18: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/18.jpg)
Data mining Process
1 (5) Interpretation/Evaluation.
![Page 19: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/19.jpg)
Data mining Process
1 It exists, however, in many variations on this theme, such as the Cross
Industry Standard Process for Data Mining (CRISP-DM) which defines six
phases:
![Page 20: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/20.jpg)
Data mining Process
1 (5) Evaluation
![Page 21: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/21.jpg)
Data mining Process
1 or a simplified process such as (1) pre-processing, (2) data mining, and (3) results validation.
![Page 22: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/22.jpg)
Data mining Process
1 Polls conducted in 2002, 2004, and 2007 show that CRISP-DM is the leading methodology used by data miners. The only other data mining standard named
in these polls was SEMMA; however, three to four times as many people reported using CRISP-DM.
Several teams of researchers have published reviews of data mining process
models, and Azevedo and Santos compared CRISP-DM and SEMMA in
2008.
![Page 23: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/23.jpg)
Data mining Pre-processing
1 Before algorithms can be used, a target data set must be
assembled
![Page 24: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/24.jpg)
Data mining
1 Anomaly detection (outlier/change/deviation detection) – The identification of unusual data records that might be interesting, or of
data errors that require further investigation.
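One simple way to flag unusual records is a z-score test, shown below as a minimal sketch. The function name `zscore_outliers`, the threshold of 2.0, and the sample readings are illustrative assumptions, not part of the original slides; real anomaly detection often uses more robust methods.

```python
from statistics import mean, stdev

def zscore_outliers(values, threshold=2.0):
    """Return (index, value) pairs whose distance from the mean,
    measured in sample standard deviations, exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings; one record is far from the rest.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]
print(zscore_outliers(readings))
```

A flagged record is only a candidate: it may be an interesting event or simply a data error, which is why further investigation is needed.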
![Page 25: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/25.jpg)
Data mining
1 Association rule learning (Dependency modeling) – Searches for relationships
between variables. For example a supermarket might gather data on customer purchasing habits. Using
association rule learning, the supermarket can determine which products are
frequently bought together and use this information for marketing purposes. This is
sometimes referred to as market basket analysis.
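The supermarket example can be sketched by counting how often pairs of products share a basket. This is a simplified illustration, not a full association rule algorithm; the function name `frequent_pairs`, the support threshold, and the baskets are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_support):
    """Return {pair: support} for product pairs that appear together
    in at least min_support (a fraction) of all baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

# Hypothetical purchase data.
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["bread", "milk", "butter"],
    ["eggs", "butter"],
]
print(frequent_pairs(baskets, min_support=0.5))
```

Counting every pair is quadratic in basket size, so practical systems prune candidates first; the sketch only shows the idea of "bought together".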
![Page 26: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/26.jpg)
Data mining
1 Clustering – The task of discovering groups and structures in the data that are in some way or another "similar", without using known
structures in the data.
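As a minimal sketch of the idea, the following one-dimensional k-means loop groups points without any labels. k-means is only one of many clustering methods and is chosen here for brevity; the function name `kmeans_1d`, the starting centroids, and the data are assumptions.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning each point to its nearest centroid
    and moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical data with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```

The result depends on the starting centroids, which is one reason clustering output should be inspected rather than accepted blindly.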
![Page 27: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/27.jpg)
Data mining
1 Classification – The task of generalizing known structure to
apply to new data. For example, an e-mail program might attempt to
classify an e-mail as "legitimate" or as "spam".
![Page 28: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/28.jpg)
Data mining
1 Regression – Attempts to find a function which models the data with the least error.
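For the simplest case, a straight line, "least error" is commonly taken to mean least squared error, which has a closed-form solution. The sketch below assumes that interpretation; the function name `fit_line` and the sample data are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx  # requires at least two distinct x values
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```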
![Page 29: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/29.jpg)
Data mining
1 Summarization – providing a more compact representation of the data
set, including visualization and report generation.
![Page 30: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/30.jpg)
Data mining Results validation
1 For example, a data mining algorithm trying to distinguish "spam" from
"legitimate" emails would be trained on a training set of sample e-mails
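The spam example can be sketched as follows. The toy keyword classifier, the two-hit decision rule, and all sample e-mails are assumptions made for illustration; the point is only that patterns learned on a training set are then checked on e-mails the algorithm has not seen, which is a common validation practice.

```python
from collections import Counter

def train(examples):
    """Learn the words seen in more spam than legitimate training e-mails."""
    spam_words, ham_words = Counter(), Counter()
    for text, label in examples:
        target = spam_words if label == "spam" else ham_words
        target.update(set(text.lower().split()))
    return {w for w, c in spam_words.items() if c > ham_words[w]}

def classify(spam_vocab, text):
    """Label an e-mail as spam if it contains at least two spam words."""
    hits = sum(1 for w in text.lower().split() if w in spam_vocab)
    return "spam" if hits >= 2 else "legitimate"

def accuracy(spam_vocab, examples):
    correct = sum(1 for text, label in examples
                  if classify(spam_vocab, text) == label)
    return correct / len(examples)

# Hypothetical training and held-out test sets.
training_set = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("lunch on monday now", "legitimate"),
]
test_set = [
    ("claim free money", "spam"),
    ("agenda for lunch meeting", "legitimate"),
    ("free lunch on monday", "legitimate"),
    ("urgent invoice attached", "spam"),
]

spam_vocab = train(training_set)
print(accuracy(spam_vocab, training_set), accuracy(spam_vocab, test_set))
```

In this sketch the classifier is perfect on its own training data but misses a spam message in the test set whose words it never saw, which is the kind of gap validation is meant to expose.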
![Page 31: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/31.jpg)
Data mining Results validation
1 If the learned patterns do not meet the desired standards, it is
necessary to re-evaluate and change the pre-processing and data mining
steps. If the learned patterns do meet the desired standards, then the final step
is to interpret the learned patterns and turn them into knowledge.
![Page 32: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/32.jpg)
Data mining Standards
1 There have been some efforts to define standards for the data mining process, for
example the 1999 European Cross Industry Standard Process for Data Mining
(CRISP-DM 1.0) and the 2004 Java Data Mining standard (JDM 1.0). Development on successors to these processes (CRISP-DM 2.0 and JDM 2.0) was active in 2006,
but has stalled since. JDM 2.0 was withdrawn without reaching a final draft.
![Page 33: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/33.jpg)
Data mining Standards
1 As the name suggests, it only covers prediction models, a particular data mining task of high importance to
business applications
![Page 34: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/34.jpg)
Data mining Games
1 for 3x3-chess) with any beginning configuration, small-board dots-and-boxes, small-board-hex, and certain endgames in chess, dots-and-boxes, and hex; a new area for data mining
has been opened
![Page 35: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/35.jpg)
Data mining Business
1 If Walmart analyzed their point-of-sale data with data mining
techniques they would be able to determine sales trends, develop marketing campaigns, and more
accurately predict customer loyalty
![Page 36: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/36.jpg)
Data mining Business
1 Once the results from data mining (potential prospect/customer and
channel/offer) are determined, this "sophisticated application" can either
automatically send an e-mail or a regular mail
![Page 37: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/37.jpg)
Data mining Business
1 In order to maintain this quantity of models, they need to manage model versions and move on to automated
data mining.
![Page 38: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/38.jpg)
Data mining Business
1 Data mining can also be helpful to human resources (HR) departments in identifying the
characteristics of their most successful employees. Information obtained – such as universities attended by highly successful
employees – can help HR focus recruiting efforts accordingly. Additionally, Strategic Enterprise
Management applications help a company translate corporate-level goals, such as profit
and margin share targets, into operational decisions, such as production plans and
workforce levels.
![Page 39: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/39.jpg)
Data mining Business
1 If a clothing store records the purchases of customers, a data
mining system could identify those customers who favor silk shirts over
cotton ones
![Page 40: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/40.jpg)
Data mining Business
1 Market basket analysis has also been used to identify the purchase patterns of the Alpha Consumer. Alpha Consumers are
people that play a key role in connecting with the concept behind a product, then
adopting that product, and finally validating it for the rest of society. Analyzing the data collected on this type of user has allowed companies to predict future buying trends
and forecast supply demands.
![Page 41: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/41.jpg)
Data mining Business
1 Data mining is a highly effective tool in the catalog marketing industry.
Catalogers have rich databases of transaction histories for millions of customers, dating back a number of years. Data mining tools
can identify patterns among customers and help identify the customers
most likely to respond to upcoming mailing campaigns.
![Page 42: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/42.jpg)
Data mining Business
1 Data mining for business applications is a component that needs to be integrated
into a complex modeling and decision making process. Reactive business
intelligence (RBI) advocates a "holistic" approach that integrates data mining, modeling, and interactive visualization
into an end-to-end discovery and continuous innovation process powered
by human and automated learning.
![Page 43: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/43.jpg)
Data mining Business
1 The relation between the quality of a data mining system and the amount of investment that the decision maker is willing to make was
formalized by providing an economic perspective on the value of “extracted knowledge” in terms of its payoff to the
organization. This decision-theoretic classification framework was applied to a real-world semiconductor wafer manufacturing line, where decision rules for effectively monitoring
and controlling the semiconductor wafer fabrication line were developed.
![Page 44: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/44.jpg)
Data mining Business
1 Another implication is that on-line monitoring of the semiconductor
manufacturing process using data mining may be highly effective.
![Page 45: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/45.jpg)
Data mining Science and engineering
1 In recent years, data mining has been used widely in the areas of science and engineering, such as
bioinformatics, genetics, medicine, education and electrical power
engineering.
![Page 46: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/46.jpg)
Data mining Science and engineering
1 The data mining method that is used to perform this task is known as
multifactor dimensionality reduction.
![Page 47: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/47.jpg)
Data mining Science and engineering
1 In the area of electrical power engineering, data mining methods
have been widely used for condition monitoring of high voltage electrical
equipment
![Page 48: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/48.jpg)
Data mining Science and engineering
1 Data mining methods have also been applied to dissolved gas analysis
(DGA) in power transformers. DGA, as a diagnostic for power
transformers, has been available for many years. Methods such as SOM
have been applied to analyze generated data and to determine
trends which are not obvious to the standard DGA ratio methods (such as
Duval Triangle).
![Page 49: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/49.jpg)
Data mining Science and engineering
1 In this way, data mining can facilitate
institutional memory.
![Page 50: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/50.jpg)
Data mining Science and engineering
1 Other examples of application of data mining methods are biomedical
data facilitated by domain ontologies, mining clinical trial data,
and traffic analysis using SOM.
![Page 51: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/51.jpg)
Data mining Science and engineering
1 In adverse drug reaction surveillance, the Uppsala Monitoring Centre has, since 1998,
used data mining methods to routinely screen for reporting patterns indicative of emerging drug safety issues in the WHO global database of 4.6 million suspected
adverse drug reaction incidents. Recently, similar methodology has been developed to mine large collections of electronic health records for temporal patterns associating drug prescriptions to medical diagnoses.
![Page 52: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/52.jpg)
Data mining Human rights
1 Data mining of government records – particularly records of the justice
system (i.e., courts, prisons) – enables the discovery of systemic
human rights violations in connection to generation and publication of
invalid or fraudulent legal records by various government agencies.
![Page 53: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/53.jpg)
Data mining Medical data mining
1 In 2011, in Sorrell v. IMS Health, Inc., the Supreme Court of the United States ruled that pharmacies may share information
with outside companies. This practice was held to be protected by the First
Amendment to the Constitution, which protects the "freedom of speech."
![Page 54: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/54.jpg)
Data mining Spatial data mining
1 So far, data mining and Geographic Information Systems (GIS) have
existed as two separate technologies, each with its own
methods, traditions, and approaches to visualization and data analysis
![Page 55: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/55.jpg)
Data mining Spatial data mining
1 Data mining offers great potential benefits for GIS-based applied decision-making.
Recently, the task of integrating these two technologies has become of critical
importance, especially as various public and private sector organizations possessing
huge databases with thematic and geographically referenced data begin to
realize the huge potential of the information contained therein. Among those
organizations are:
![Page 56: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/56.jpg)
Data mining Spatial data mining
1 offices requiring analysis or dissemination of geo-referenced statistical data
![Page 57: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/57.jpg)
Data mining Spatial data mining
1 public health services searching for explanations of disease clustering
![Page 58: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/58.jpg)
Data mining Spatial data mining
1 environmental agencies assessing the impact of changing land-use patterns on climate
change
![Page 59: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/59.jpg)
Data mining Spatial data mining
1 geo-marketing companies doing customer segmentation based on spatial location.
![Page 60: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/60.jpg)
Data mining Spatial data mining
1 Challenges in Spatial mining: Geospatial data repositories tend to be very large
![Page 61: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/61.jpg)
Data mining Spatial data mining
1 Developing and supporting geographic data warehouses
(GDWs): Spatial properties are often reduced to simple aspatial attributes
in mainstream data warehouses. Creating an integrated GDW requires solving issues of spatial and temporal
data interoperability – including differences in semantics, referencing systems, geometry, accuracy, and
position.
![Page 62: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/62.jpg)
Data mining Spatial data mining
1 Geographic data mining methods should recognize more complex
geographic objects (i.e., lines and polygons) and relationships (i.e.,
non-Euclidean distances, direction, connectivity, and interaction through attributed geographic space such as
terrain)
![Page 63: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/63.jpg)
Data mining Spatial data mining
1 Geographic knowledge discovery using diverse data types: GKD
methods should be developed that can handle diverse data types
beyond the traditional raster and vector models, including imagery
and geo-referenced multimedia, as well as dynamic data types (video
streams, animation).
![Page 64: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/64.jpg)
Data mining Sensor data mining
1 By measuring the spatial correlation between data sampled by different sensors, a wide class of specialized
algorithms can be developed for more efficient spatial data
mining.
![Page 65: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/65.jpg)
Data mining Visual data mining
1 In the transition from analog to digital, large data sets have been generated, collected, and
stored, so that statistical patterns, trends, and information
hidden in the data can be discovered in order to build predictive patterns. Studies
suggest that visual data mining is faster and much more intuitive than traditional data mining. See also
Computer vision.
![Page 66: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/66.jpg)
Data mining Music data mining
1 Data mining techniques, and in particular co-occurrence analysis,
have been used to discover relevant similarities among music corpora (radio lists, CD databases) for the purpose of classifying music into
genres in a more objective manner.
![Page 67: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/67.jpg)
Data mining Surveillance
1 Data mining has been used to fight
terrorism by the U.S
![Page 68: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/68.jpg)
Data mining Surveillance
1 In the context of combating terrorism, two particularly plausible methods of data mining are "pattern mining" and
"subject-based data mining".
![Page 69: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/69.jpg)
Data mining Pattern mining
1 "Pattern mining" is a data mining method that involves finding existing patterns in data. In
this context patterns often means association rules. The original motivation for searching association rules came from the desire to
analyze supermarket transaction data, that is, to examine customer behavior in terms of the
purchased products. For example, an association rule "beer ⇒ potato chips (80%)"
states that four out of five customers that bought beer also bought potato chips.
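The 80% figure in such a rule is usually called its confidence: among the transactions containing beer, the fraction that also contain potato chips. The sketch below computes it for a small made-up data set; the function name `confidence` and the transactions are assumptions chosen to reproduce the four-out-of-five example.

```python
def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing the antecedent
    that also contain the consequent."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    both = sum(1 for t in with_antecedent if consequent in t)
    return both / len(with_antecedent)

# Hypothetical supermarket transactions: five contain beer,
# and four of those five also contain potato chips.
transactions = [
    {"beer", "potato chips"},
    {"beer", "potato chips", "salsa"},
    {"beer", "potato chips"},
    {"beer", "potato chips", "bread"},
    {"beer", "bread"},
    {"milk", "bread"},
]
print(confidence(transactions, "beer", "potato chips"))
```

Confidence is directional: the rule "potato chips ⇒ beer" is computed over a different set of transactions and can have a different value.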
![Page 70: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/70.jpg)
Data mining Pattern mining
1 In the context of pattern mining as a tool to identify terrorist activity, the National Research
Council provides the following definition: "Pattern-based data mining looks for patterns (including
anomalous data patterns) that might be associated with terrorist activity — these patterns
might be regarded as small signals in a large ocean of noise." Pattern mining includes new
areas such as Music Information Retrieval (MIR), where patterns seen in both the temporal and
non-temporal domains are imported to classical knowledge discovery search methods.
![Page 71: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/71.jpg)
Data mining Subject-based data mining
1 "Subject-based data mining" is a data mining method involving the search for associations between individuals in data. In the context of combating terrorism, the National Research
Council provides the following definition: "Subject-based data mining uses an initiating individual or other datum that is considered,
based on other information, to be of high interest, and the goal is to determine what other persons or financial transactions or movements,
etc., are related to that initiating datum."
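One way to picture this definition is as a search outward from the initiating datum through a graph of recorded links. The sketch below is an illustrative assumption, not a description of any actual system: the function name `related_to`, the hop limit, and the link records are all hypothetical.

```python
from collections import deque

def related_to(links, start, max_hops):
    """Breadth-first search from the initiating datum.
    Returns {entity: hops} for everything reachable within max_hops."""
    neighbours = {}
    for a, b in links:
        # Treat every recorded link as undirected.
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    del hops[start]
    return hops

# Hypothetical link records between persons, accounts, and movements.
links = [
    ("person_A", "account_1"),
    ("account_1", "person_B"),
    ("person_B", "flight_7"),
    ("person_C", "account_9"),
]
print(related_to(links, "person_A", max_hops=2))
```

The hop limit matters because the number of reachable entities can grow quickly with each additional hop.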
![Page 72: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/72.jpg)
Data mining Knowledge grid
1 Knowledge discovery "On the Grid" generally refers to conducting
knowledge discovery in an open environment using grid computing
concepts, allowing users to integrate data from various online data
sources, as well as make use of remote resources, for executing their data
mining tasks
![Page 73: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/73.jpg)
Data mining Reliability / Validity
1 Data mining can be misused, and can also unintentionally produce
results which appear significant but which do not actually predict future behavior and cannot be reproduced on a new sample of data. See Data
dredging.
![Page 74: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/74.jpg)
Data mining Privacy concerns and ethics
1 In particular, data mining government or commercial data sets
for national security or law enforcement purposes, such as in the Total Information Awareness Program
or in ADVISE, has raised privacy concerns.
1 This is not data mining per se, but a result of the preparation of data
before – and for the purposes of – the analysis
1 It is recommended that an individual is made aware of the following before data are collected:
• the purpose of the data collection and any (known) data mining projects
• how the data will be used
• who will be able to mine the data and use the data and their derivatives
• the status of security surrounding access to the data
1 In the United States, Congress has addressed privacy concerns to some extent through regulatory controls such as the Health Insurance Portability and Accountability Act (HIPAA).
1 Data may also be modified so as to become anonymous, so that individuals may not readily be identified. However, even "de-identified"/"anonymized" data sets can potentially contain enough information to allow identification of individuals, as occurred when journalists were able to find several individuals based on a set of search histories that were inadvertently released by AOL.
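A rough check for this risk is to count how many records share each combination of seemingly harmless attributes; a combination that occurs once points to a single person. The sketch below assumes a k-anonymity style check, which is a named technique swapped in for illustration rather than anything the source describes. The field names and records are hypothetical.

```python
from collections import Counter

def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def k_anonymity(records, quasi_identifiers):
    """Smallest group size: every record is hidden among at least this many records."""
    return min(group_sizes(records, quasi_identifiers).values())

def exposed_records(records, quasi_identifiers):
    """Records whose quasi-identifier combination is unique in the data set."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records if sizes[tuple(r[q] for q in quasi_identifiers)] == 1]

# Hypothetical "de-identified" records: names removed, other attributes kept.
records = [
    {"zip": "30301", "birth_year": 1950, "sex": "F"},
    {"zip": "30301", "birth_year": 1950, "sex": "F"},
    {"zip": "30301", "birth_year": 1962, "sex": "M"},
    {"zip": "94110", "birth_year": 1950, "sex": "F"},
]
fields = ["zip", "birth_year", "sex"]
print(k_anonymity(records, fields), len(exposed_records(records, fields)))
```

A result of k = 1 means at least one record can be singled out by anyone who knows those attributes from another source.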
Data mining Free open-source data mining software and applications
• Carrot2: Text and search results clustering framework.
• Chemicalize.org: A chemical structure miner and web search engine.
• ELKI: A university research project with advanced cluster analysis and outlier detection methods written in the Java language.
• GATE: A natural language processing and language engineering tool.
• KNIME: The Konstanz Information Miner, a user-friendly and comprehensive data analytics framework.
• ML-Flex: A software package that enables users to integrate with third-party machine-learning packages written in any programming language, execute classification analyses in parallel across multiple computing nodes, and produce HTML reports of classification results.
• NLTK (Natural Language Toolkit): A suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python language.
• SenticNet API: A semantic and affective resource for opinion mining and sentiment analysis.
• Orange: A component-based data mining and machine learning software suite written in the Python language.
• R: A programming language and software environment for statistical computing, data mining, and graphics. It is part of the GNU Project.
• UIMA: The UIMA (Unstructured Information Management Architecture) is a component framework for analyzing unstructured content such as text, audio and video, originally developed by IBM.
• Weka: A suite of machine learning software applications written in the Java programming language.
Data mining Commercial data-mining software and applications
• Angoss KnowledgeSTUDIO: data mining tool provided by Angoss.
• BIRT Analytics: visual data mining and predictive analytics tool provided by Actuate Corporation.
• Clarabridge: enterprise class text analytics solution.
• IBM DB2 Intelligent Miner: in-database data mining platform provided by IBM, with modeling, scoring and visualization services based on the SQL/MM - PMML framework.
• LIONsolver: an integrated software application for data mining, business intelligence, and modeling that implements the Learning and Intelligent OptimizatioN (LION) approach.
• NetOwl: suite of multilingual text and entity analytics products that enable data mining.
• SAS Enterprise Miner: data mining software provided by the SAS Institute.
Data mining Marketplace surveys
1 Several researchers and organizations have conducted reviews of data mining tools and surveys of data miners. These identify some of the strengths and weaknesses of the software packages. They also provide an overview of the behaviors, preferences and views of data miners. Some of these reports include:
• Forrester Research 2010 Predictive Analytics and Data Mining Solutions report
• Gartner 2008 "Magic Quadrant" report
• Haughton et al.'s 2003 Review of Data Mining Software Packages in The American Statistician
Data Mining Extensions
1 Data Mining Extensions (DMX) is a query language for Data Mining
Models supported by Microsoft's SQL Server Analysis Services product.
1 DMX is used to create and train data mining models, and to browse, manage, and predict
against them
Data Mining Extensions - DMX Queries
1 DMX Queries are formulated using the SELECT statement. They can extract information from existing
data mining models in various ways.
Data Mining Extensions - Data Definition Language
1 The Data Definition Language (DDL) part of DMX can be used to:
• Create new data mining models and mining structures - CREATE MINING STRUCTURE, CREATE MINING MODEL
• Delete existing data mining models and mining structures - DROP MINING STRUCTURE, DROP MINING MODEL
• Export and import mining structures - EXPORT, IMPORT
Data Mining Extensions - Data Manipulation Language
1 The Data Manipulation Language (DML) part of DMX can be used to:
• Make predictions using mining model - SELECT ... FROM PREDICTION JOIN
Data Mining Extensions - Example: a prediction query
1 This example is a singleton prediction query, which predicts for the given customer whether she will be interested in home loan products.
1 NATURAL PREDICTION JOIN
1 18 AS [Total Years of Education]
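Only fragments of the query survive in this transcript, so the sketch below does not reconstruct it. It shows, under stated assumptions, how a singleton prediction query of this general shape could be assembled as text in Python. The model name, the predicted column, and the `Age` input are hypothetical; only the `NATURAL PREDICTION JOIN` clause and the `18 AS [Total Years of Education]` column come from the slides. Doubling single quotes to escape string values is an assumption based on SQL convention.

```python
def dmx_literal(value):
    """Render a Python value as a literal for the query text (strings are single-quoted)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"  # assumed SQL-style escaping
    return str(value)

def singleton_prediction_query(model, target, inputs):
    """Build the text of a singleton prediction query for one hand-written input row."""
    columns = ",\n    ".join(
        f"{dmx_literal(value)} AS [{name}]" for name, value in inputs.items()
    )
    return (
        f"SELECT [{target}], PredictProbability([{target}])\n"
        f"FROM [{model}]\n"
        "NATURAL PREDICTION JOIN\n"
        f"(SELECT\n    {columns}\n) AS t"
    )

# Hypothetical model and columns, apart from [Total Years of Education].
query = singleton_prediction_query(
    model="Home Loan Model",
    target="Loan Seeker",
    inputs={"Age": 35, "Total Years of Education": 18},
)
print(query)
```

With NATURAL PREDICTION JOIN the input columns are matched to model columns by name, which is why each value in the inner SELECT is aliased to a bracketed column name.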
OAuth - Abuse of OAuth for Internet data mining
1 A growing number of social networking services promote OAuth logins to the dominant social networks (Facebook, Twitter, etc.) as the primary authentication method, in preference to "traditional" email-confirmation processes.
1 The use of OAuth logins to social networks for "authentication" permits
the application provider to legitimately circumvent the often
significant restrictions on API use put in place by social network providers
to prevent large-scale data extraction
Social networking service - Data mining
1 Through data mining, companies are able to improve their sales and profitability
United States Department of Homeland Security - Data mining (ADVISE)
1 The Associated Press reported on September 5, 2007, that DHS had scrapped an anti-terrorism data
mining tool called ADVISE (Analysis, Dissemination, Visualization, Insight and Semantic Enhancement) after
the agency's Privacy Office and Office of Inspector General (OIG)
found that pilot testing of the system had been performed using data on real people without having done a
Privacy Impact Assessment, a required privacy safeguard for the
various uses of real personally identifiable information required by
section 208 of the e-Government Act of 2002
Multitenancy - Data aggregation/data mining
1 One of the most compelling reasons for vendors/ISVs to utilize
multitenancy is for the inherent data aggregation benefits
Machine learning - Machine learning and data mining
1 These two terms are commonly confused, as they often employ the
same methods and overlap significantly. They can be roughly
defined as follows:
1 Machine learning focuses on prediction, based on known properties learned from the training
data.
1 Data mining focuses on the discovery of (previously) unknown properties in the data. This is the analysis step of Knowledge Discovery in Databases.
1 Much of the confusion between these two research communities (which do often have separate conferences and separate journals, ECML PKDD being a major exception) comes from the
basic assumptions they work with: in machine learning, performance is
usually evaluated with respect to the ability to reproduce known
knowledge, while in Knowledge Discovery and Data Mining (KDD) the
key task is the discovery of previously unknown knowledge
Surveillance - Data mining and profiling
1 Data mining is the application of statistical techniques and
programmatic algorithms to discover previously unnoticed relationships
within the data.
1 Economic (such as credit card purchases) and social (such as telephone calls and emails) transactions in modern society create large amounts of stored data and records. In the past, this data was documented in paper records, leaving a paper trail, or was simply not documented at all. Correlating paper-based records was a laborious process: it required human intelligence operators to manually dig through documents, which was time-consuming and incomplete at best.
1 But today many of these records are electronic, resulting in an electronic trail
1 Information relating to many of these individual transactions is often easily available because it is generally not
guarded in isolation, since the information, such as the title of a
movie a person has rented, might not seem sensitive
1 In addition to its own aggregation and profiling tools, the government is able to access information from third parties (for example, banks, credit companies, or employers) by requesting access informally, by compelling access through the use of subpoenas or other procedures, or by purchasing data from commercial data aggregators or data brokers.
1 Under United States v. Miller (1976) (http://caselaw.lp.findlaw.com/scripts/getcase.pl?court=us&vol=425&invol=435), data held by third parties is generally not subject to Fourth Amendment warrant requirements.
Criticism of Facebook - Data mining
1 There have been some concerns expressed regarding the use of
Facebook as a means of surveillance and data mining
1 The possibility of data mining by private individuals unaffiliated with Facebook has been a concern, as evidenced by the fact that two
Massachusetts Institute of Technology (MIT) students were able
to download, using an automated script, over 70,000 Facebook profiles
from four schools (MIT, NYU, the University of Oklahoma, and Harvard
University) as part of a research project on Facebook privacy
published on December 14, 2005
1 A second clause that brought criticism from some users allowed Facebook the right to sell users' data to private companies, stating "We may share your information with third parties, including responsible companies with which we have a relationship." This concern was addressed by spokesman Chris Hughes, who said, "Simply put, we have never provided our users' information to third party companies, nor do we intend to." Facebook eventually removed this clause from its privacy policy.
1 Previously, third party applications had access to almost all user information. Facebook's privacy policy previously stated: "Facebook does not screen or approve Platform Developers and cannot control how such Platform Developers use any personal information." However, that language has since been removed. Regarding use of user data by third party applications, the 'Preapproved Third-Party Websites and Applications' section of the Facebook privacy policy now states:
1 In the United Kingdom, the Trades Union Congress (TUC) has
encouraged employers to allow their staff to access Facebook and other social-networking sites from work,
provided they proceed with caution.
1 In September 2007, Facebook drew a fresh round of criticism after it began allowing non-members to search for users, with the intent of opening limited public profiles up to search engines such as Google in the following months. Facebook's privacy settings, however, allow users to block their profiles from search engines.
1 Concerns were also raised on the BBC's Watchdog program in October 2007, when Facebook was shown to be an easy way to collect an individual's personal information in order to facilitate identity theft. However, very little personal information is presented to non-friends: if users leave the privacy controls on their default settings, the only personal information visible to a non-friend is the user's name, gender, profile picture, networks, and user name.
1 A New York Times article in February 2008 pointed out that Facebook did not actually provide a mechanism for users to close their accounts, and raised the concern that private user data would remain indefinitely on Facebook's servers. Facebook has since given users the options to deactivate or delete their accounts.
https://store.theartofservice.com/the-data-mining-toolkit.html
![Page 153: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/153.jpg)
Criticism of Facebook - Data mining
1 Deactivating an account allows it to be restored later, while deleting it
will remove the account permanently, although some data
submitted by that account (like posting to a group or sending
someone a message) will remain.
![Page 154: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/154.jpg)
Criticism of Facebook - Data mining
1 A third party site, uSocial, was involved in a controversy
surrounding the sale of fans and friends. uSocial received a cease-
and-desist letter from Facebook and has stopped selling friends.
![Page 155: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/155.jpg)
Data visualization - Data mining
1 Data mining is the process of sorting through large amounts of data and
picking out relevant information. It is usually used by business intelligence organizations and financial analysts, but is increasingly being used in the sciences to extract information from
the enormous data sets generated by modern experimental and observational methods.
![Page 156: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/156.jpg)
Data visualization - Data mining
1 It has been described as "the nontrivial extraction of implicit, previously unknown,
and potentially useful information from data" and as "the science of extracting useful information from large data sets or databases". In relation to enterprise
resource planning, according to Monk (2006), data mining is the statistical and
logical analysis of large sets of transaction data, looking for patterns that can aid
decision making.
![Page 157: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/157.jpg)
Mass surveillance in the United States - Data mining of subpoenaed records
1 The FBI collected nearly all hotel, airline,
rental car, gift shop, and casino records in Las
Vegas during the last two weeks of 2003.
![Page 158: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/158.jpg)
Oracle Data Mining
1 It provides means for the creation, management and operational
deployment of data mining models inside the database environment.
![Page 159: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/159.jpg)
Oracle Data Mining - Overview
1 These operations include functions to create, apply, test, and
manipulate data mining models
![Page 160: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/160.jpg)
Oracle Data Mining - Overview
1 In data mining, the process of using a model to derive predictions or
descriptions of behavior that is yet to occur is called scoring
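As a rough illustration of scoring (this is not Oracle's API): a model is built once from historical cases and then applied to new, unseen rows. The nearest-centroid classifier, the sample data, and the function names `build_model` and `score` are assumptions chosen for brevity.

```python
from math import dist

def build_model(rows, labels):
    """Build a toy nearest-centroid model: one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def score(model, row):
    """Score one new row: return the label of the closest class centroid."""
    return min(model, key=lambda label: dist(model[label], row))

# Hypothetical historical cases: (age, purchases last year) -> response level
history = [(25, 1), (30, 2), (52, 9), (60, 11)]
responses = ["low", "low", "high", "high"]
model = build_model(history, responses)

# Scoring: apply the finished model to rows it has never seen.
new_customers = [(28, 1), (55, 10)]
predictions = [score(model, c) for c in new_customers]
```

Building and scoring are kept as separate functions to mirror the distinction drawn above: the model is created once and can then be applied repeatedly to new data.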
![Page 161: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/161.jpg)
Oracle Data Mining - Overview
1 Most Oracle Data Mining functions also allow text mining by accepting
Text (unstructured data) attributes as input
![Page 162: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/162.jpg)
Oracle Data Mining - History
1 Oracle Data Mining was first introduced in 2002 and its releases
are named according to the corresponding Oracle database
release:
![Page 163: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/163.jpg)
Oracle Data Mining - History
1 * Oracle Data Mining 10gR1 (10.1.0.2.0 - February 2004)
![Page 164: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/164.jpg)
Oracle Data Mining - History
1 * Oracle Data Mining 10gR2 (10.2.0.1.0 - July 2005)
![Page 165: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/165.jpg)
Oracle Data Mining - History
1 Oracle Data Mining is a logical successor of the Darwin data mining
toolset developed by Thinking Machines Corporation in the mid-
1990s and later distributed by Oracle after its acquisition of Thinking
Machines in 1999. However, the product itself
![Page 166: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/166.jpg)
Oracle Data Mining - History
1 is a complete redesign and rewrite from the ground up:
while Darwin was a classic GUI-based analytical workbench, ODM
offers a data mining development/deployment platform
integrated into the Oracle database, along with the Oracle Data Miner
GUI.
![Page 167: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/167.jpg)
Oracle Data Mining - History
1 The Oracle Data Miner 11gR2 New Workflow GUI was previewed at Oracle Open World 2009. An
updated Oracle Data Miner GUI was released in 2012. It is free, and is available as an extension to Oracle
SQL Developer 3.1.
![Page 168: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/168.jpg)
Oracle Data Mining - Functionality
1 As of release 11gR1 Oracle Data Mining contains the following data mining functions:
![Page 169: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/169.jpg)
Oracle Data Mining - Functionality
1 ** Model exploration,
evaluation and analysis.
![Page 170: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/170.jpg)
Oracle Data Mining - Functionality
1 * Feature selection (Attribute Importance).
![Page 171: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/171.jpg)
Oracle Data Mining - Functionality
1 ** Support Vector Machine (SVM).
![Page 172: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/172.jpg)
Oracle Data Mining - Functionality
1 ** One-class Support Vector Machine (SVM).
![Page 173: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/173.jpg)
Oracle Data Mining - Functionality
1 ** Generalized linear model (GLM) for
Multiple regression
![Page 174: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/174.jpg)
Oracle Data Mining - Functionality
1 ** Orthogonal Partitioning Clustering (O-Cluster).
![Page 175: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/175.jpg)
Oracle Data Mining - Functionality
1 * Association rule learning:
![Page 176: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/176.jpg)
Oracle Data Mining - Functionality
1 ** Itemsets and association rules
(AM).
![Page 177: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/177.jpg)
Oracle Data Mining - Functionality
1 * Feature extraction.
![Page 178: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/178.jpg)
Oracle Data Mining - Functionality
1 ** Combined text and non-text columns of
input data.
![Page 179: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/179.jpg)
Oracle Data Mining - Input sources and data preparation
1 Most Oracle Data Mining functions accept as input one relational table or view. Flat data can be combined with transactional data through the
use of nested columns, enabling mining of data involving one-to-many
relationships (e.g. a star schema). The full functionality of SQL can be used when preparing data for data mining, including dates and spatial
data.
![Page 180: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/180.jpg)
Oracle Data Mining - Input sources and data preparation
1 Oracle Data Mining distinguishes numerical, categorical, and
unstructured (text) attributes. The product also provides utilities for
data preparation steps prior to model building, such as outlier treatment,
discretization, normalization, and
binning (grouping values into intervals).
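The preparation steps named above can be sketched generically as follows. This is a minimal illustration, not Oracle's utilities: the function names, the clipping bounds, and the choice of min-max scaling and equal-width bins are all assumptions.

```python
def clip_outliers(values, lower, upper):
    """Outlier treatment by clipping: pull values outside [lower, upper] back to the bounds."""
    return [min(max(v, lower), upper) for v in values]

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def equal_width_bins(values, n_bins):
    """Assign each value the index of one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid dividing by zero for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical income column with one extreme value
incomes = [20, 30, 40, 50, 1000]
clipped = clip_outliers(incomes, 0, 100)   # [20, 30, 40, 50, 100]
normalized = min_max_normalize(clipped)    # [0.0, 0.125, 0.25, 0.375, 1.0]
bins = equal_width_bins(clipped, 4)        # [0, 0, 1, 1, 3]
```

Clipping before normalizing keeps the single extreme value from compressing every other value toward zero.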
![Page 181: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/181.jpg)
Oracle Data Mining - Graphical user interface: Oracle Data Miner
1 There is also an independent interface: the Spreadsheet Add-In for
Predictive Analytics, which enables access to the Oracle Data Mining
Predictive Analytics PL/SQL package from Microsoft Excel.
![Page 182: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/182.jpg)
Oracle Data Mining - PL/SQL and Java interfaces
1 Oracle Data Mining provides a native PL/SQL package
(DBMS_DATA_MINING) to create, destroy, describe, apply, test, export and import models. The code below
illustrates a typical call to build a classification
model:
![Page 183: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/183.jpg)
Oracle Data Mining - PMML
1 In Release 11gR2 (11.2.0.2), ODM supports the import of externally-
created PMML for some of the data mining models. PMML is an XML-
based standard for representing data mining models.
![Page 184: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/184.jpg)
Oracle Data Mining - Predictive Analytics MS Excel Add-In
1 The PL/SQL package DBMS_PREDICTIVE_ANALYTICS
automates the data mining process including data preprocessing, model building and evaluation, and scoring
of new data
![Page 193: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/193.jpg)
Computational sociology - Data mining and social network analysis
1 Independent of developments in computational models of social
systems, social network analysis emerged in the 1970s and 1980s from advances in graph theory, statistics, and studies of social
structure as a distinct analytical method and was articulated and
employed by sociologists like James S. Coleman
![Page 194: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/194.jpg)
Department of Homeland Security - Data mining (ADVISE)
1 found that pilot testing of the system had been
performed using data on real people without having done a Privacy Impact
Assessment, a privacy safeguard for the various uses of real
personally identifiable information required by section 208 of the
E-Government Act of 2002
![Page 195: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/195.jpg)
List of free and open-source software packages - Data mining
1 * Environment for DeveLoping KDD-Applications
Supported by Index-Structures (ELKI) — data mining software framework
written in Java with a focus on clustering and outlier detection
methods.
![Page 196: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/196.jpg)
List of free and open-source software packages - Data mining
1 * Orange — data visualization and data mining for
novices and experts, through visual programming or Python scripting. Extensions for bioinformatics and
text mining.
![Page 197: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/197.jpg)
List of free and open-source software packages - Data mining
1 * RapidMiner — data mining software written in Java, fully integrating
Weka, featuring 350+ operators for preprocessing, machine learning,
visualization, etc.
![Page 198: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/198.jpg)
List of free and open-source software packages - Data mining
1 * Scriptella ETL — ETL (Extract-Transform-Load) and script execution tool. Supports integration with J2EE and Spring. Provides connectors to CSV, LDAP, XML, JDBC/ODBC and
other data sources.
![Page 199: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/199.jpg)
List of free and open-source software packages - Data mining
1 * Weka — data mining software written in Java featuring machine learning operators
for classification, regression, and clustering.
![Page 200: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/200.jpg)
List of open-source software packages - Data mining
1 * OpenNN — Open source neural networks software library written in the C++
programming language.
![Page 201: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/201.jpg)
Learning analytics - Differentiating Learning Analytics and Educational Data Mining
1 They go on to attempt to disambiguate educational data mining from academic analytics based on whether the process is hypothesis driven or not, though
Brooks C
![Page 202: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/202.jpg)
Learning analytics - Differentiating Learning Analytics and Educational Data Mining
1 Regardless of the differences between the LA and EDM
communities, the two areas have significant overlap both in the
objectives of investigators as well as in the methods and techniques that
are used in the investigation.
![Page 203: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/203.jpg)
Customer analytics - Data mining
1 There are two categories of data mining. Predictive models use previous customer interactions to
predict future events, while segmentation techniques are used to
place customers with similar behaviors and attributes into distinct
groups. This grouping can help marketers to optimize their campaign management and
targeting processes.
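A minimal sketch of the segmentation idea, using plain k-means. The algorithm choice, the two customer attributes, and the deterministic starting centroids are assumptions made for illustration; the text above does not name a specific technique.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Plain k-means: returns a cluster index for each point."""
    centroids = [list(p) for p in points[:k]]  # deterministic start: first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each customer to the nearest centroid.
        assignment = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Hypothetical customers: (visits per month, average basket value)
customers = [(1, 10), (12, 95), (2, 12), (11, 100), (1, 8), (13, 90)]
segments = kmeans(customers, k=2)
```

Here the occasional low-spend customers and the frequent high-spend customers end up in separate groups, which is the kind of grouping a campaign could then target differently.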
![Page 204: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/204.jpg)
Conference on Knowledge Discovery and Data Mining
1 'SIGKDD' is the Association for Computing Machinery's
Special Interest Group on Knowledge Discovery and Data Mining. It became an official ACM SIG in 1998. The official web page of SIGKDD can be found at
www.KDD.org.
![Page 205: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/205.jpg)
Conference on Knowledge Discovery and Data Mining - Conferences
1 SIGKDD has hosted an annual conference, the 'ACM SIGKDD
Conference on Knowledge Discovery and Data Mining' ('KDD'), since
1995. KDD conferences grew from KDD (Knowledge Discovery and Data
Mining) workshops at AAAI conferences, which were started by
Gregory Piatetsky-Shapiro in 1989, 1991, and 1993, and Usama
Fayyad in 1994.
![Page 206: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/206.jpg)
Conference on Knowledge Discovery and Data Mining - Conferences
1 Conference papers of each Proceedings of the SIGKDD
International Conference on Knowledge Discovery and Data Mining are published through
ACM (http://dl.acm.org/event.cfm?id=RE329). See also http://www.sigkdd.org/conferences.php
![Page 207: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/207.jpg)
Conference on Knowledge Discovery and Data Mining - Conferences
1 KDD-2012 took place in Beijing, China (http://kdd2012.sigkdd.org/), KDD-2013 took place in Chicago,
USA, and KDD-2014 will take place in New York City, USA, August 24–27,
2014. A full list of past KDD meetings is available at http://www.kdnuggets.com/meetings/past-meetings-kdd.html
![Page 208: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/208.jpg)
Conference on Knowledge Discovery and Data Mining - KDD-Cup
1 SIGKDD sponsors the [http://www.kdd.org/kddcup/ KDD Cup] competition every year in
conjunction with the annual conference. It is aimed at members
of the industry and academia, particularly students, interested in
KDD.
![Page 209: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/209.jpg)
Conference on Knowledge Discovery and Data Mining - Awards
1 The group also annually recognizes members of the KDD community with
its [http://www.kdd.org/sigkdd-innovation-award Innovation Award] and [http://www.kdd.org/innovation-service-awards Service Award]. Additionally, KDD presents a Best
Paper Award to recognize the highest quality paper at each
conference.
![Page 210: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/210.jpg)
Conference on Knowledge Discovery and Data Mining - SIGKDD Explorations
1 SIGKDD has also published a biannual academic journal titled
[http://www.kdd.org/explorations/ SIGKDD Explorations] since June
1999.
![Page 211: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/211.jpg)
Conference on Knowledge Discovery and Data Mining - Leadership
1 The new SIGKDD leadership team
took office on July 1, 2013
![Page 212: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/212.jpg)
Conference on Knowledge Discovery and Data Mining - Leadership
1 * Gregory Piatetsky-Shapiro (http://www.kdnuggets.com/gps.html) (2005-2008)
![Page 213: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/213.jpg)
Conference on Knowledge Discovery and Data Mining - Leadership
1 * David D. Jensen (http://kdl.cs.umass.edu/people/jensen/)
![Page 214: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/214.jpg)
Conference on Knowledge Discovery and Data Mining - Information Directors
1 * [http://faculty.washington.edu/ankurt/ Ankur Teredesai]
(2011-)
![Page 215: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/215.jpg)
Quantitative structure–activity relationship - Data mining approach
1 Computer SAR models typically calculate a relatively large number of
features. Because those lack structural interpretation ability, the preprocessing steps face a feature
selection problem (i.e., which structural features should be interpreted to determine the
structure-activity relationship). Feature selection can be
accomplished by visual inspection (qualitative selection by a human);
by data mining; or by molecule mining.
![Page 216: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/216.jpg)
Quantitative structure–activity relationship - Data mining approach
1 A typical data mining based prediction uses, for example, support vector machines, decision trees, or neural networks for
inducing a predictive learning model.
![Page 217: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/217.jpg)
Quantitative structure–activity relationship - Data mining approach
1 Molecule mining approaches, a special case of structured data
mining approaches, apply a similarity matrix based prediction or an
automatic fragmentation scheme into molecular substructures. There are also
approaches using
maximum common subgraph searches or graph kernels.
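One way to picture a similarity matrix based prediction is sketched below. The Tanimoto (Jaccard) coefficient over sets of fragment labels, the nearest-neighbour rule, and every molecule name and label are assumptions for illustration; the text above does not specify a similarity measure.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of substructure labels."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def similarity_matrix(molecules):
    """Pairwise similarity of every molecule with every other molecule."""
    names = list(molecules)
    return {m: {n: tanimoto(molecules[m], molecules[n]) for n in names} for m in names}

def predict_activity(query, molecules, activities):
    """Predict the query's activity as that of its most similar known molecule."""
    nearest = max(activities, key=lambda name: tanimoto(query, molecules[name]))
    return activities[nearest]

# Hypothetical fragment sets and activity labels
molecules = {
    "mol_a": {"benzene", "hydroxyl"},
    "mol_b": {"benzene", "hydroxyl", "methyl"},
    "mol_c": {"carboxyl", "amine"},
}
activities = {"mol_a": "active", "mol_b": "active", "mol_c": "inactive"}

matrix = similarity_matrix(molecules)
prediction = predict_activity({"benzene", "methyl"}, molecules, activities)
```

The query shares the most fragments with `mol_b`, so it inherits that molecule's label.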
![Page 218: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/218.jpg)
Data mining in meteorology
1 Meteorology is the interdisciplinary scientific study of the atmosphere. It
observes changes in temperature, air pressure, moisture,
and wind direction. Temperature, pressure, wind,
and humidity are usually measured with a
thermometer, barometer, anemometer, and hygrometer, respectively. There are many
methods of collecting data; radar, lidar, and satellites are some of
them.
![Page 219: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/219.jpg)
Data mining in meteorology
1 Weather forecasts are made by collecting quantitative data about
the current state of the atmosphere. The main issue that arises in this
prediction is that it involves high-dimensional data. To overcome
this issue, it is necessary to first analyze and simplify the data before proceeding with other analysis. Some
data mining techniques are appropriate in this context.
![Page 220: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/220.jpg)
Data mining in meteorology - What is Data mining?
1 Consequently, data mining consists of more than collecting and analyzing
data; it also includes analysis and prediction
![Page 221: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/221.jpg)
Data mining in meteorology - What is Data mining?
1 The network architectures and signal processes used to model nervous
systems can roughly be divided into three categories, each based on a
different philosophy.
![Page 222: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/222.jpg)
Data mining in meteorology - What is Data mining?
1 #Feedforward neural network: the input information is transformed into a set of output signals.
![Page 223: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/223.jpg)
Data mining in meteorology - What is Data mining?
1 #Feedback network: the input information defines the initial activity state of a feedback system, and after state transitions, the asymptotic final state is identified as the outcome of
the computation.
![Page 224: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/224.jpg)
Data mining in meteorology - What is Data mining?
1 #Neighboring cells in a neural network compete in their activities by means of mutual lateral interactions and develop adaptively into specific detectors of different signal patterns. In this category, learning is called competitive, unsupervised, or self-organizing learning.
![Page 225: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/225.jpg)
Data mining in meteorology - Self-organizing Maps
1 The Self-Organizing Map (SOM) is one of the most popular neural network models and is especially suitable for visualizing, clustering, and modeling high-dimensional data.
![Page 226: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/226.jpg)
Data mining in meteorology - Self-organizing Maps
1 The Self-Organizing Map projects high-dimensional input data onto a low-dimensional (usually two-dimensional) space.
![Page 227: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/227.jpg)
Data mining in meteorology - Self-organizing Maps
1 Given an input vector, the system chooses the output neuron (the winning neuron) that most closely matches that input vector.
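The winner selection described here can be sketched in a few lines of Python. This is a minimal illustration, not a full SOM: it assumes Euclidean distance as the matching criterion, updates only the winning neuron (the neighborhood update of a real SOM is omitted), and the names `best_matching_unit`, `som_update`, and the sample weights are hypothetical.

```python
import math

def best_matching_unit(weights, x):
    """Return the index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

def som_update(weights, x, learning_rate=0.5):
    """Move the winning neuron toward x (assumption: no neighborhood update)."""
    winner = best_matching_unit(weights, x)
    weights[winner] = [w + learning_rate * (xi - w)
                       for w, xi in zip(weights[winner], x)]
    return winner

# Three output neurons with two-dimensional weight vectors (made-up values).
weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
winner = som_update(weights, [0.9, 0.8])
```

After the call, `winner` holds the index of the neuron nearest the input, and that neuron's weights have moved halfway toward the input.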
![Page 228: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/228.jpg)
Police-enforced ANPR in the UK - Data mining
1 A major feature of the National ANPR Data Centre for car numbers is the ability to data mine.
Advanced versatile automated data mining software trawls through the
vast amounts of data collected, finding patterns and meaning in the data. Data mining can be used on
the records of previous sightings to build up intelligence of a vehicle's
movements on the road network or can be used to find cloned vehicles
by searching the database for impossibly quick journeys.
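The search for "impossibly quick journeys" can be illustrated with a small sketch. It assumes each sighting is a (plate, timestamp, (latitude, longitude)) tuple and flags consecutive sightings of the same plate whose implied straight-line speed exceeds a threshold. The record layout, the 200 km/h threshold, and the sample plates are assumptions for illustration, not details of the actual ANPR system.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0

def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def impossible_journeys(sightings, max_speed_kmh=200.0):
    """Flag consecutive sightings of one plate that imply an implausible speed."""
    by_plate = {}
    for plate, when, where in sightings:
        by_plate.setdefault(plate, []).append((when, where))
    flagged = []
    for plate, rows in by_plate.items():
        rows.sort(key=lambda row: row[0])  # order each plate's sightings by time
        for (t1, p1), (t2, p2) in zip(rows, rows[1:]):
            hours = (t2 - t1).total_seconds() / 3600
            km = distance_km(p1, p2)
            speed = km / hours if hours > 0 else (math.inf if km > 0 else 0.0)
            if speed > max_speed_kmh:
                flagged.append((plate, t1, t2))
    return flagged

# Hypothetical sightings: the first plate appears in London and Manchester 30 minutes apart.
sightings = [
    ("AB12 CDE", datetime(2014, 1, 1, 12, 0), (51.5074, -0.1278)),
    ("AB12 CDE", datetime(2014, 1, 1, 12, 30), (53.4808, -2.2426)),
    ("XY34 ZZZ", datetime(2014, 1, 1, 12, 0), (51.5074, -0.1278)),
    ("XY34 ZZZ", datetime(2014, 1, 1, 13, 0), (51.7520, -1.2577)),
]
flagged = impossible_journeys(sightings)
```

A flagged pair only suggests that two physical vehicles may share one plate; a misread plate would produce the same signal, so a real system would need further checks.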
![Page 229: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/229.jpg)
Police-enforced ANPR in the UK - Data mining
1 We can use ANPR on investigations or we can use it looking forward in a proactive,
intelligence way
![Page 230: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/230.jpg)
Multifactor dimensionality reduction - Data mining with MDR
1 Another approach is to generate many random permutations of the data to see what the data mining algorithm finds when given the
chance to overfit
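The permutation idea can be sketched as follows. The scoring function `mean_gap` is a hypothetical stand-in for "what the data mining algorithm finds" (it is not MDR itself), and the data values are made up. Labels are shuffled many times, and the share of shuffles that score at least as well as the real labels estimates how easily the result arises by chance.

```python
import random

def mean_gap(values, labels):
    """Hypothetical score: absolute gap between the two groups' means."""
    ones = [v for v, l in zip(values, labels) if l == 1]
    zeros = [v for v, l in zip(values, labels) if l == 0]
    return abs(sum(ones) / len(ones) - sum(zeros) / len(zeros))

def permutation_p_value(values, labels, score, n_permutations=1000, seed=0):
    """Estimate how often randomly permuted labels score as well as the real ones."""
    rng = random.Random(seed)
    observed = score(values, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real link between values and labels
        if score(values, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (at_least_as_good + 1) / (n_permutations + 1)

values = [1, 2, 3, 4, 5, 11, 12, 13, 14, 15]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
p_value = permutation_p_value(values, labels, mean_gap)
```

A small `p_value` suggests the pattern found on the real labels is unlikely to be an artifact of overfitting; a large one suggests the algorithm finds comparable "patterns" in noise.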
![Page 231: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/231.jpg)
Educational data mining
1 Baker (2010) Data Mining for Education
![Page 232: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/232.jpg)
Educational data mining - Definition
1 Educational Data Mining refers to techniques, tools, and research
designed for automatically extracting meaning from large repositories of
data generated by or related to people's learning activities in
educational settings
![Page 233: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/233.jpg)
Educational data mining - Definition
1 In other cases, the data is less fine-grained.
![Page 234: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/234.jpg)
Educational data mining - History
1 Educational Data Mining: A Review of the State-of-the-Art
![Page 235: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/235.jpg)
Educational data mining - History
1 As interest in EDM continued to increase, EDM researchers
established an academic journal in 2009, the
[http://www.educationaldatamining.org/JEDM/ Journal of Educational Data
Mining], for sharing and disseminating research results. In
2011, EDM researchers established the
[http://educationaldatamining.org/ International Educational Data Mining Society] to connect EDM researchers
and continue to grow the field.
![Page 236: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/236.jpg)
Educational data mining - History
1 With the introduction of public educational data repositories in
2008, such as the Pittsburgh Science of Learning Centre’s (PSLC) DataShop
and the National Center for Education Statistics (NCES), public data sets have made educational data mining more accessible and
feasible, contributing to its growth.
![Page 237: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/237.jpg)
Educational data mining - Goals
1 Baker and Yacef identified the following
four goals of EDM:
![Page 238: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/238.jpg)
Educational data mining - Goals
1 #'Predicting students' future learning behavior' – With the use of student modeling, this goal can be achieved
by creating student models that incorporate the learner’s
characteristics, including detailed information such as their knowledge, behaviours and motivation to learn. The user experience of the learner
and their overall satisfaction with learning are also
measured.
![Page 239: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/239.jpg)
Educational data mining - Goals
1 #'Discovering or improving domain models' – Through the various methods and applications of EDM, new models can be discovered and existing models improved. Examples include illustrating the educational content to engage learners and determining optimal instructional sequences to support the student’s learning style.
![Page 240: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/240.jpg)
Educational data mining - Goals
1 #'Studying the effects of educational support' that can be achieved through learning
systems.
![Page 241: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/241.jpg)
Educational data mining - Goals
1 #'Advancing scientific knowledge about learning and learners' by
building and incorporating student models, the field of EDM research and the technology and software
used.
![Page 242: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/242.jpg)
Educational data mining - Users and Stakeholders
1 There are four main users and stakeholders involved with educational data mining. These
include:
![Page 243: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/243.jpg)
Educational data mining - Users and Stakeholders
1 JEDM-Journal of Educational Data Mining
5.2 (2013): 102-126.
![Page 244: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/244.jpg)
Educational data mining - Users and Stakeholders
1 * 'Educators' - Educators attempt to understand the learning process and the methods they can use to improve
their teaching methods
![Page 245: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/245.jpg)
Educational data mining - Users and Stakeholders
1 * 'Researchers' - Researchers focus on the development and the evaluation of data mining techniques for effectiveness. A yearly international conference for researchers began in 2008, followed by the establishment of the [http://www.educationaldatamining.org/JEDM/index.php/JEDM Journal of Educational Data Mining] in 2009. Topics in EDM range from using data mining to improve institutional effectiveness to improving student performance.
![Page 246: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/246.jpg)
Educational data mining - Users and Stakeholders
1 * 'Administrators' - Administrators are
responsible for allocating the resources for implementation in
institutions
![Page 247: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/247.jpg)
Educational data mining - Phases of Educational Data Mining
1 As research in the field of educational data mining has
continued to grow, a myriad of data mining techniques have been applied to a variety of educational contexts. In each case, the goal is to translate raw data into meaningful information about the learning process in order to
make better decisions about the design and trajectory of a learning environment. Thus, EDM generally
consists of four phases:
![Page 248: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/248.jpg)
Educational data mining - Phases of Educational Data Mining
1 # The first phase of the EDM process (not counting pre-processing) is discovering relationships in data
![Page 249: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/249.jpg)
Educational data mining - Phases of Educational Data Mining
1 # Discovered relationships must then be validated in
order to avoid overfitting.
![Page 250: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/250.jpg)
Educational data mining - Phases of Educational Data Mining
1 # Validated relationships are applied to make predictions about future
events in the learning environment.
![Page 251: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/251.jpg)
Educational data mining - Phases of Educational Data Mining
1 # Predictions are used to support decision-making processes and policy decisions.
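The four phases above can be walked through on a toy example. The sketch below assumes a simple least-squares line relating hours of practice to test score; the data, the held-out split, and the support threshold of 60 are all hypothetical and chosen only to show how each phase feeds the next.

```python
def fit_line(xs, ys):
    """Phase 1: discover a relationship (least-squares slope and intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, xs, ys):
    """Phase 2: validate on data the model has not seen, to guard against overfitting."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

# Made-up data: hours of practice and resulting test scores.
train_hours, train_scores = [1, 2, 3, 4, 5, 6], [52, 55, 61, 64, 70, 73]
held_hours, held_scores = [2.5, 4.5, 7], [58, 67, 78]

model = fit_line(train_hours, train_scores)                      # phase 1
held_out_error = mean_abs_error(model, held_hours, held_scores)  # phase 2
predicted_score = model[0] * 3.5 + model[1]                      # phase 3
needs_support = predicted_score < 60                             # phase 4 (assumed threshold)
```

If the held-out error were much larger than the error on the training data, the discovered relationship would be suspect and should not be used for the later phases.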
![Page 252: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/252.jpg)
Educational data mining - Phases of Educational Data Mining
1 During phases 3 and 4, data is often visualized or in some other way distilled for human judgment. A large amount of research has been conducted on best practices for visualizing data.
![Page 253: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/253.jpg)
Educational data mining - Main Approaches
1 Of the general categories of methods mentioned, prediction, clustering, and relationship mining are considered universal methods across all types of data mining; however, 'Discovery with Models' and 'Distillation of Data for Human Judgment' are considered more prominent approaches within educational data mining.
![Page 254: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/254.jpg)
Educational data mining - Discovery with Models
1 In the Discovery with Models method, a model is developed via prediction, clustering, or knowledge engineering based on human reasoning, and is then used as a component in another analysis, such as prediction or relationship mining.
![Page 255: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/255.jpg)
Educational data mining - Discovery with Models
1 Key applications of this method include discovering relationships between student behaviors, characteristics and contextual variables in the learning environment. Broad and specific research questions across a wide range of contexts can also be explored using this method.
![Page 256: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/256.jpg)
Educational data mining - Distillation of Data for Human Judgment
1 Humans can make inferences about data that may be beyond the scope of what an automated data mining method provides. In educational data mining, data is distilled for human judgment for two key purposes: identification and classification.
![Page 257: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/257.jpg)
Educational data mining - Distillation of Data for Human Judgment
1 For the purpose of identification, data is distilled to enable humans to identify well-known patterns, which may otherwise be difficult to interpret. For example, the learning curve, classic to educational studies, is a pattern that clearly reflects the relationship between learning and experience over time.
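As an illustration of distilling data into a learning curve, the sketch below aggregates the error rate at each practice opportunity. It assumes attempts are logged as (student, opportunity number, correct) tuples; that record layout and the sample data are hypothetical.

```python
from collections import defaultdict

def learning_curve(attempts):
    """Return {opportunity number: error rate} across all students."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for _student, opportunity, correct in attempts:
        totals[opportunity] += 1
        if not correct:
            errors[opportunity] += 1
    return {opp: errors[opp] / totals[opp] for opp in sorted(totals)}

# Made-up log: two students, three practice opportunities each.
attempts = [
    ("s1", 1, False), ("s1", 2, False), ("s1", 3, True),
    ("s2", 1, False), ("s2", 2, True), ("s2", 3, True),
]
curve = learning_curve(attempts)
```

Plotting `curve` against the opportunity number gives the familiar falling error rate that a human reader can recognize at a glance.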
![Page 258: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/258.jpg)
Educational data mining - Distillation of Data for Human Judgment
1 Data is also distilled for the purpose of classifying features of data, which in educational data mining is used to support the development of the prediction model. Classification helps expedite the development of the prediction model considerably.
![Page 259: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/259.jpg)
Educational data mining - Distillation of Data for Human Judgment
1 The goal of this method is to summarize and present the
information in a useful, interactive and visually appealing way in order to understand the large amounts of
education data and to support decision making
![Page 260: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/260.jpg)
Educational data mining - Applications
1 A list of the primary applications of EDM is provided by Cristobal Romero
and Sebastian Ventura. In their taxonomy, the areas of EDM
application are:
![Page 261: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/261.jpg)
Educational data mining - Applications
1 * Providing feedback for supporting
instructors
![Page 262: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/262.jpg)
Educational data mining - Applications
1 * Recommendations for students
![Page 263: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/263.jpg)
Educational data mining - Applications
1 * Predicting student performance
![Page 264: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/264.jpg)
Educational data mining - Applications
1 * Detecting undesirable student behaviors
![Page 265: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/265.jpg)
Educational data mining - Applications
1 * Constructing courseware - EDM can be applied to course management systems such as the open source Moodle. Moodle contains usage data that includes various activities by users, such as test results, amount of readings completed, and participation in discussion forums. Data mining tools can be used to customize learning activities for each user and adapt the pace at which the student completes the course. This is particularly beneficial for online courses with varying levels of competency.
![Page 266: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/266.jpg)
Educational data mining - Applications
1 New research on mobile learning environments also suggests that data mining can be useful. Data mining can be used to help provide personalized content to mobile users, despite the differences in managing content between mobile devices and standard PCs and web browsers.
![Page 267: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/267.jpg)
Educational data mining - Applications
1 New EDM applications will focus on allowing non-technical users to use and engage with data mining tools and activities, making data collection and processing more accessible for all users of EDM. Examples include statistical and visualization tools that analyze social networks and their influence on learning outcomes and productivity.
![Page 268: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/268.jpg)
Educational data mining - Courses
1 In October 2013, Coursera offered a free online course on “Big Data in Education” that teaches how and
when to use key methods for EDM. A course archive is now available
online.
![Page 269: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/269.jpg)
Educational data mining - Courses
1 Teachers College, Columbia University offers a Learning Analytics focus as part of its Cognitive Studies Masters. http://catalog.tc.columbia.edu/tc/departments/humandevelopment/cognitivestudiesineducation/
![Page 270: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/270.jpg)
Educational data mining - Publication Venues
1 Considerable amounts of EDM work are published at the peer-reviewed International Conference on Educational Data Mining, organized by the [http://www.educationaldatamining.org/ International Educational Data Mining Society].
![Page 271: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/271.jpg)
Educational data mining - Publication Venues
1 * [http://www.educationaldatamining.org/EDM2008 1st International Conference on Educational Data Mining] (2008) -- Montreal, Canada
![Page 272: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/272.jpg)
Educational data mining - Publication Venues
1 * [http://www.educationaldatamining.org/EDM2009 2nd International Conference on Educational Data Mining] (2009) -- Cordoba, Spain
![Page 273: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/273.jpg)
Educational data mining - Publication Venues
1 * [http://www.educationaldatamining.org/EDM2010 3rd International Conference on Educational Data Mining] (2010) -- Pittsburgh, USA
![Page 274: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/274.jpg)
Educational data mining - Publication Venues
1 * [http://www.educationaldatamining.org/EDM2011 4th International Conference on Educational Data Mining] (2011) -- Eindhoven, Netherlands
![Page 275: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/275.jpg)
Educational data mining - Publication Venues
1 * [http://www.educationaldatamining.org/EDM2012 5th International Conference on Educational Data Mining] (2012) -- Chania, Greece
![Page 276: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/276.jpg)
Educational data mining - Publication Venues
1 * [http://www.educationaldatamining.org/EDM2013 6th International Conference on Educational Data Mining] (2013) -- Memphis, USA
![Page 277: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/277.jpg)
Educational data mining - Publication Venues
1 EDM papers are also published in the [http://www.educationaldatamining.org/JEDM/ Journal of Educational Data
Mining] (JEDM).
![Page 278: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/278.jpg)
Educational data mining - Publication Venues
1 Many EDM papers are routinely published in related conferences, such as Artificial Intelligence in Education, Intelligent Tutoring Systems, and User Modeling, Adaptation, and Personalization.
![Page 279: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/279.jpg)
Educational data mining - Publication Venues
1 In 2011, Chapman Hall/CRC Press, Taylor and Francis Group published the first Handbook of Educational Data Mining. This resource was created for those who are interested in participating in the educational data mining community.
![Page 280: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/280.jpg)
Educational data mining - Contests
1 In 2010, the Association for Computing Machinery's
[http://www.kdd.org/kdd2010/kddcup.shtml KDD Cup] was conducted using data from an educational
setting
![Page 281: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/281.jpg)
Educational data mining - Costs and Challenges
1 Technological advancements bring costs and challenges associated with implementing EDM applications.
![Page 282: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/282.jpg)
Educational data mining - Criticisms
1 Research also indicates that the field of educational data mining is concentrated in North America and Western cultures; consequently, other countries and cultures may not be represented in the research and findings.
![Page 283: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/283.jpg)
Educational data mining - Criticisms
1 As users become savvy in their understanding of online privacy, administrators of educational data mining tools need to be proactive in protecting the privacy of their users and be transparent about how the information will be used and with whom it will be shared.
![Page 284: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/284.jpg)
Educational data mining - Criticisms
1 * 'Plagiarism' - Plagiarism detection is an ongoing challenge for educators
and faculty whether in the classroom or online. However, due to the complexities associated with
detecting and preventing digital plagiarism in particular, educational data mining tools are not currently sophisticated enough to accurately
address this issue. Thus, the development of predictive capability in plagiarism-related issues should
be an area of focus in future research.
![Page 285: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/285.jpg)
Educational data mining - Criticisms
1 * 'Adoption' - It is unknown how widespread the adoption of EDM is and the extent to which institutions
have applied and considered implementing an EDM strategy. As
such, it is unclear whether there are any barriers that prevent users from adopting EDM in their educational
settings.
![Page 286: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/286.jpg)
Java Data Mining
1 JDM enables applications to integrate data mining technology for
developing predictive analytics applications and tools
![Page 287: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/287.jpg)
Java Data Mining
1 Various data mining functions and techniques, such as statistical classification and association, regression analysis, data clustering, and attribute importance, are covered by the 1.0 release of this standard.
![Page 288: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/288.jpg)
Cross Industry Standard Process for Data Mining
1 In Proceedings of the IADIS European Conference on Data Mining 2008, pp 182-185.
![Page 289: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/289.jpg)
Cross Industry Standard Process for Data Mining - Major phases
1 The lessons learned during the process can trigger new, often more
focused business questions and subsequent data mining processes will benefit from the experiences of
previous ones.
![Page 290: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/290.jpg)
Cross Industry Standard Process for Data Mining - Major phases
1 ;Business Understanding: This initial phase focuses on understanding the project objectives and requirements
from a business perspective, and then converting this knowledge into
a data mining problem definition, and a preliminary plan designed to
achieve the objectives.
![Page 291: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/291.jpg)
Cross Industry Standard Process for Data Mining - Major phases
1 ;Data Understanding: The data understanding phase starts with an initial data collection and proceeds
with activities in order to get familiar with the data, to identify data quality
problems, to discover first insights into the data, or to detect interesting
subsets to form hypotheses for hidden information.
![Page 292: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/292.jpg)
Cross Industry Standard Process for Data Mining - Major phases
1 ;Data Preparation: The data preparation phase covers all
activities to construct the final dataset (data that will be fed into the modeling tool(s)) from the initial raw
data. Data preparation tasks are likely to be performed multiple times,
and not in any prescribed order. Tasks include table, record, and
attribute selection as well as transformation and cleaning of data
for modeling tools.
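The data preparation tasks listed above (attribute selection, record selection, cleaning, and transformation) might look like the following minimal sketch. It assumes raw records arrive as dictionaries of strings; the field names, the `converters` mapping, and the rule of dropping records with missing values are illustrative assumptions rather than anything CRISP-DM prescribes.

```python
def prepare(records, converters):
    """Build the final dataset from raw records.

    converters maps each attribute to keep onto a function that
    transforms its raw string value (hypothetical convention).
    """
    prepared = []
    for record in records:
        # Attribute selection: keep only the attributes the model needs.
        row = {name: record.get(name) for name in converters}
        # Cleaning / record selection: drop records with missing values.
        if any(value is None or value == "" for value in row.values()):
            continue
        # Transformation: convert raw strings into typed values.
        prepared.append({name: convert(row[name])
                         for name, convert in converters.items()})
    return prepared

raw = [
    {"id": "1", "age": "34", "income": "52000", "notes": "call back"},
    {"id": "2", "age": "", "income": "41000"},
    {"id": "3", "age": "45", "income": "61000"},
]
dataset = prepare(raw, {"age": int, "income": float})
```

Because these tasks are likely to be repeated in no fixed order, keeping them in one small function makes it cheap to rerun preparation whenever the modeling phase asks for a different form of data.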
![Page 293: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/293.jpg)
Cross Industry Standard Process for Data Mining - Major phases
1 ;Modeling: In this phase, various modeling techniques are selected and applied, and their parameters are calibrated to optimal values.
Typically, there are several techniques for the same data mining problem type. Some techniques have specific requirements on the form of data. Therefore, stepping back to the
data preparation phase is often needed.
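Parameter calibration in the modeling phase can be illustrated with a tiny grid search. The sketch assumes a one-parameter threshold classifier and picks the candidate threshold with the highest accuracy on a small labelled set; the model, the candidate values, and the data are hypothetical, and "optimal" here means only best among the candidates tried.

```python
def accuracy(threshold, xs, labels):
    """Share of examples where 'x >= threshold' agrees with the label."""
    hits = sum((x >= threshold) == label for x, label in zip(xs, labels))
    return hits / len(xs)

def calibrate(candidates, xs, labels):
    """Return the candidate threshold with the highest accuracy."""
    return max(candidates, key=lambda t: accuracy(t, xs, labels))

# Made-up scores and labels.
xs = [0.1, 0.35, 0.4, 0.6, 0.8, 0.9]
labels = [False, False, True, True, True, True]
best_threshold = calibrate([0.2, 0.4, 0.6, 0.8], xs, labels)
```

In practice the calibration data should be separate from the data used for final evaluation, otherwise the chosen parameter value will look better than it really is.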
![Page 294: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/294.jpg)
Cross Industry Standard Process for Data Mining - Major phases
1 At the end of this phase, a decision on the use of the data mining results should be
reached.
![Page 295: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/295.jpg)
Cross Industry Standard Process for Data Mining - Major phases
1 Depending on the requirements, the deployment phase can be as simple as generating a report or as complex as implementing a repeatable data
mining process
![Page 296: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/296.jpg)
Cross Industry Standard Process for Data Mining - History
1 CRISP-DM was conceived in 1996. In 1997 it got underway as a European
Union project under the ESPRIT (European Strategic Program on Research in Information Technology)
funding initiative. The project was led by five companies: SPSS, Teradata, Daimler AG, NCR
Corporation and OHRA, an insurance company.
![Page 297: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/297.jpg)
Cross Industry Standard Process for Data Mining - History
1 This core consortium brought different experiences to the project. ISL was later acquired and merged into SPSS Inc. The computer giant NCR Corporation produced the Teradata data warehouse and its own data
mining software. Daimler-Benz had a significant data mining team. OHRA
was just starting to explore the potential use of data mining.
![Page 298: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/298.jpg)
Cross Industry Standard Process for Data Mining - History
1 and published as a step-by-step data mining guide later that year (Pete Chapman, Julian Clinton, Randy
Kerber, Thomas Khabaza, Thomas Reinartz, Colin Shearer, and Rüdiger
Wirth, 2000; [ftp://ftp.software.ibm.com/software/analytics/spss/support/Modeler/Documentation/14/UserManual/CRISP-DM.pdf CRISP-DM 1.0 Step-by-step data mining guide]).
![Page 299: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/299.jpg)
Cross Industry Standard Process for Data Mining - History
1 Between 2006 and 2008 a CRISP-DM 2.0 SIG was formed and there were
discussions about updating the CRISP-DM process model (Colin
Shearer, 2006; [http://www.kdnuggets.com/news/2006/n19/4i.html First CRISP-DM 2.0 Workshop Held]).
The current status of these efforts is not known. The original crisp-dm.org website cited in the reviews
and the CRISP-DM 2.0 SIG website are both no longer active.
![Page 300: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/300.jpg)
Cross Industry Standard Process for Data Mining - History
1 While many non-IBM data mining practitioners use CRISP-DM, IBM is
the primary corporation that currently embraces the CRISP-DM
process model. IBM makes some of the old CRISP-DM documents available
for download and has incorporated the process model into its SPSS Modeler product.
![Page 301: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/301.jpg)
Data mining in agriculture
1 'Data mining in agriculture' is a very recent research topic. It consists
of the application of data mining techniques to agriculture. Recent
technologies can provide a lot of information on
agriculture-related activities, which can then be analyzed in order to find important information. A related, but
not equivalent, term is precision agriculture.
![Page 302: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/302.jpg)
Data mining in agriculture - Prediction of problematic wine fermentations
1 Wine is widely produced all around the world.
![Page 303: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/303.jpg)
Data mining in agriculture - Detection of diseases from sounds issued by animals
1 The detection of animal diseases on farms can positively affect the
productivity of the farm, because sick animals can cause contamination.
![Page 304: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/304.jpg)
Data mining in agriculture - Sorting apples by watercores
1 For this reason, a computational system is under study which takes X-ray
photographs of the fruit while it runs on conveyor belts, and
which can also analyse the pictures taken (using data mining techniques)
and estimate the probability that the fruit contains watercores.
![Page 305: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/305.jpg)
Data mining in agriculture - Optimizing pesticide use by data mining
1 By data mining the cotton Pest Scouting data along with the
meteorological recordings, it was shown how pesticide use can be
optimized (reduced).
![Page 306: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/306.jpg)
Data mining in agriculture - Explaining pesticide abuse by data mining
1 After a novel Pilot Agriculture Extension Data Warehouse was created, analysis through querying and
data mining yielded some interesting discoveries, such as pesticides sprayed at the wrong
time, wrong pesticides used for the right reasons, and a temporal
relationship between pesticide usage and day of the week.
![Page 307: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/307.jpg)
Data mining in agriculture - Literature
1 There are a few precision agriculture journals, such as Springer's
[http://www.springerlink.com/content/103317/ Precision Agriculture] or
Elsevier's [http://www.sciencedirect.com/science/journal/01681699 Computers and Electronics in Agriculture], but those are not exclusively devoted to data
mining in agriculture.
![Page 308: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/308.jpg)
Data mining in agriculture - Conferences
1 There are many conferences organized every year on data mining techniques
and applications, but rather few of them consider problems arising in the
agricultural field. To date, there is only one example of a conference completely devoted to applications of
data mining in agriculture. It is organized by Georg Ruß. See the conference
[http://dma-workshop.de/ web page].
![Page 309: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/309.jpg)
Dependent variables - Data mining
1 In data mining tools (for multivariate statistics and machine learning), the
dependent variable is assigned a role as 'target variable' (or in some tools as label
attribute), while an independent variable may be assigned a role as regular
variable ([http://1xltkxylmzx3z8gd647akcdvov.wpengine.netdna-cdn.com/wp-content/uploads/2013/10/rapidminer-5.0-manual-english_v1.0.pdf English Manual version 1.0] for RapidMiner 5.0, October 2013).
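The idea of assigning roles can be illustrated with a small, tool-independent sketch in plain Python. The records, the attribute names, and the `roles` mapping are hypothetical; real tools such as RapidMiner expose role assignment through their own interfaces, which are not reproduced here.

```python
# Hypothetical records and role assignment; names are illustrative only.
records = [
    {"temperature": 21.0, "humidity": 0.40, "yield": 3.1},
    {"temperature": 25.0, "humidity": 0.55, "yield": 3.8},
]
roles = {"temperature": "regular", "humidity": "regular", "yield": "target"}

regular_names = [name for name, role in roles.items() if role == "regular"]
target_name = next(name for name, role in roles.items() if role == "target")

# Regular (independent) variables form the inputs; the target is the dependent variable.
X = [[record[name] for name in regular_names] for record in records]
y = [record[target_name] for record in records]
```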
![Page 310: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/310.jpg)
Learning algorithms - Machine learning and data mining
1 * Machine learning focuses on prediction, based on known properties learned from the
training data.
![Page 311: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/311.jpg)
Learning algorithms - Machine learning and data mining
1 * Data mining focuses on the discovery of (previously) unknown properties in
the data. This is the analysis step of Knowledge
Discovery in Databases.
![Page 312: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/312.jpg)
Learning algorithms - Machine learning and data mining
1 Much of the confusion between these two research communities (which do often have separate conferences and separate journals, ECML PKDD being a major exception) comes from the basic assumptions they work with: in machine learning, performance is usually
evaluated with respect to the ability to reproduce known knowledge, while in
Knowledge Discovery and Data Mining (KDD) the key task is the discovery of previously
unknown knowledge.
![Page 313: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/313.jpg)
Activity recognition - Data mining based approach to activity recognition
1 They proposed a data mining approach based on discriminative patterns which describe significant changes between any two activity
classes of data to recognize sequential, interleaved and
concurrent activities in a unified solution.
![Page 314: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/314.jpg)
Activity recognition - Data mining based approach to activity recognition
1 Gilbert et al. (Gilbert A, Illingworth J, Bowden R. Action Recognition using Mined
Hierarchical Compound Features. IEEE Trans Pattern Analysis and Machine Intelligence) use 2D corners in both space and time. These
are grouped spatially and temporally using a hierarchical process, with an increasing
search area. At each stage of the hierarchy, the most distinctive and descriptive features are learned efficiently through data mining
(Apriori rule).
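For readers unfamiliar with Apriori, the sketch below shows the generic level-wise frequent-itemset search in plain Python. It is not the pipeline of Gilbert et al.; the function name `apriori`, the `min_support` parameter, and the `corner_*` feature labels are assumptions used only to illustrate how frequently co-occurring feature groups can be mined.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count how many transactions contain each candidate; keep frequent ones.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = count({frozenset([item]) for item in items})
    frequent = dict(current)

    size = 2
    while current:
        # Join step: merge frequent (size-1)-itemsets into size-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, size - 1))
        }
        current = count(candidates)
        frequent.update(current)
        size += 1
    return frequent


# Hypothetical "transactions": sets of feature labels observed together.
observations = [
    {"corner_a", "corner_b", "corner_c"},
    {"corner_a", "corner_b"},
    {"corner_a", "corner_c"},
    {"corner_b", "corner_c"},
    {"corner_a", "corner_b", "corner_c"},
]
result = apriori(observations, min_support=3)
```

With this toy data every single feature and every pair is frequent, while the full triple appears in only two observations and is therefore discarded.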
![Page 315: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/315.jpg)
Covert surveillance - Data mining and profiling
1 Data mining is the application of statistical techniques and
programmatic algorithms to discover previously unnoticed relationships
within the data.
![Page 316: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/316.jpg)
Covert surveillance - Data mining and profiling
1 Economic (such as credit card purchases) and social (such as
telephone calls and emails) transactions in modern society
create large amounts of stored data and records. In the past, this data was documented in paper records, leaving a paper trail, or was simply
not documented at all. Correlation of paper-based records was a laborious
process—it required human intelligence operators to manually dig through documents, which was time-consuming and incomplete, at
best.
![Page 317: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/317.jpg)
Covert surveillance - Data mining and profiling
1 But today many of these records are electronic, resulting in an electronic trail.
![Page 318: Data mining](https://reader038.vdocuments.site/reader038/viewer/2022110208/56649dde5503460f94ad70cb/html5/thumbnails/318.jpg)
For More Information, Visit:
• https://store.theartofservice.com/the-data-mining-toolkit.html
The Art of Service: https://store.theartofservice.com