Data Mining
2
Models Created by Data Mining
• Linear Equations
• Rules
• Clusters
• Graphs
• Tree Structures
• Recurrent Patterns
3
Knowledge Discovery in Databases (KDD)
• Select target data
• Preprocess data
• Transform (if necessary)
• Mine the data for information
• Interpret discovered structures
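The five steps above can be sketched as a small pipeline. This is only an illustration, not a standard API: the record fields (`age`, `spend`), the function names, and the toy "mining" step (mean spend per age band) are all assumptions made for the example.

```python
# Hypothetical toy records; the field names are assumptions for illustration.
RAW = [
    {"id": 1, "age": 34, "spend": 120.0, "notes": "n/a"},
    {"id": 2, "age": None, "spend": 80.0, "notes": "n/a"},
    {"id": 3, "age": 51, "spend": 300.0, "notes": "n/a"},
    {"id": 4, "age": 29, "spend": 60.0, "notes": "n/a"},
]

def select(records):
    """Step 1: keep only the target attributes."""
    return [{"age": r["age"], "spend": r["spend"]} for r in records]

def preprocess(records):
    """Step 2: drop records that have missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def transform(records):
    """Step 3: derive a coarser attribute (an age band) from age."""
    return [
        {"band": "under_40" if r["age"] < 40 else "40_plus", "spend": r["spend"]}
        for r in records
    ]

def mine(records):
    """Step 4: build a trivial 'model': mean spend per age band."""
    groups = {}
    for r in records:
        groups.setdefault(r["band"], []).append(r["spend"])
    return {band: sum(v) / len(v) for band, v in groups.items()}

def interpret(model):
    """Step 5: turn the discovered structure into a readable statement."""
    top = max(model, key=model.get)
    return f"Highest average spend: {top} ({model[top]:.2f})"

model = mine(transform(preprocess(select(RAW))))
summary = interpret(model)
print(summary)
```

Each function maps to one KDD step, so a real project could swap in a heavier technique at any stage without changing the overall flow.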
4
Dependent and Independent Variables
• Dependent Variable - The attribute to be predicted.
• Independent Variables - The attributes used for making the prediction.
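As a rough illustration of the two roles, the sketch below fits a straight line by ordinary least squares, one of the "linear equation" models data mining can produce. The variable names (`ad_spend`, `sales`) and the numbers are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one independent variable.

    Returns (slope, intercept) of the line y = slope * x + intercept.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Independent variable: used to make the prediction (hypothetical data).
ad_spend = [1.0, 2.0, 3.0, 4.0]
# Dependent variable: the attribute being predicted.
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 5.0 + intercept  # predicted sales for ad_spend = 5.0
print(slope, intercept, predicted)
```

With more than one independent variable the same idea carries over to multiple regression, which this sketch does not cover.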
5
Fields Contributing to Data Mining
• Database Technology
• Statistics
• Machine Learning
• High Performance Computing
• Pattern Recognition
• Neural Networks
• Data Visualization
• Information Retrieval
6
Applications of Data Mining
• Decision Making
• Process Control
• Information Management
• Query Processing
7
Methods of Data Reduction
• Drill-down analysis
• Clustering
• Aggregation
• Simple Tabulation
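Three of these reduction methods can be shown on a handful of rows. The sketch below assumes a made-up list of `(region, category, amount)` sales tuples; the helper name `drill_down` is likewise hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical transactions: (region, category, amount).
sales = [
    ("north", "books", 12.0),
    ("north", "toys", 8.0),
    ("south", "books", 5.0),
    ("north", "books", 3.0),
    ("south", "toys", 7.0),
]

# Simple tabulation: how many transactions fall in each region.
tabulation = Counter(region for region, _, _ in sales)

# Aggregation: collapse many rows into one total per region.
totals = defaultdict(float)
for region, _, amount in sales:
    totals[region] += amount

def drill_down(rows, region):
    """Drill-down: split one aggregated region back into its categories."""
    detail = defaultdict(float)
    for r, category, amount in rows:
        if r == region:
            detail[category] += amount
    return dict(detail)

print(dict(tabulation))
print(dict(totals))
print(drill_down(sales, "north"))
```

Clustering, the remaining method on the slide, reduces data by replacing groups of similar records with a representative, and is not shown here.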
8
Exploratory Data Analysis (EDA)
• Distributions of Variables
• Correlation Matrices
• Multi-way Frequency Tables
• Cluster Analysis
• Classification Trees
• Other multivariate techniques
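A correlation matrix, one of the EDA tools listed above, can be computed with nothing more than the Pearson formula. The sketch below uses invented columns (`hours`, `score`, `errors`) chosen so the relationships are easy to see.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical variables, one list per column.
data = {
    "hours": [1, 2, 3, 4, 5],
    "score": [2, 4, 6, 8, 10],
    "errors": [5, 4, 3, 2, 1],
}

names = list(data)
# matrix[a][b] holds the correlation between columns a and b.
matrix = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

for a in names:
    print(a, [round(matrix[a][b], 2) for b in names])
```

Values near 1 or -1 flag pairs of variables worth a closer look; values near 0 suggest no linear relationship, though a nonlinear one may still exist.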
9
Statistical Methods Used in Data Mining
• Regression Analysis
• Standard Distribution
• Cluster Analysis
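Cluster analysis covers many algorithms; the slide does not name one. As one possible illustration, the sketch below runs k-means on one-dimensional data. The values and starting centers are invented, and the function name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on 1-D data with caller-supplied starting centers."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

The result depends on the starting centers, which is why practical implementations usually try several initializations.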
10
Industries Using Data Mining
• Banking
• Insurance
• Medicine
• Retail
• Security
• Sciences
11
Financial Uses of Data Mining
• Fraud Detection
• Money Laundering Detection
• Risk Management
12
Medical Uses of Data Mining
• Chemical Compounds
• Genetic Material
• Predictive Treatment Models
13
Retail Uses of Data Mining
• Direct Marketing
• Store Design
• Store Operations
14
Security Uses of Data Mining
• Assess crime patterns
• Homeland Security
• Identification of suspicious activities
• Pre-screening
15
Scientific Uses of Data Mining
• Image analysis
• Classification of large data sets
16
Other Novel Uses for Data Mining
• NBA’s Advanced Scout Program
• Firefly
17
Predictive Analytics
• An advanced form of data mining that builds models to predict the behavior of variables in large data sets.
• Highly specialized for each application
18
Uses of Predictive Analytics
• Cost-Benefit Analysis
• Predicting Customer Behavior
• Reducing Costs
19
Financial Uses of Predictive Analytics
• Credit Ratings
• Economic Prediction Models
• Federal Reserve
20
Text Mining
• Extracts structured information from unstructured text
• Allows data mining of large data sets that are not stored in databases
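A common first step in text mining is turning free text into countable terms. The sketch below is a minimal version of that idea; the sample sentences, the tiny `STOPWORDS` set, and the function names are assumptions for illustration.

```python
import re
from collections import Counter

# Hypothetical unstructured input: free-text customer comments.
documents = [
    "The shipment arrived late and the box was damaged.",
    "Great service, the shipment arrived early!",
]

# A deliberately tiny stop-word list; real lists are much longer.
STOPWORDS = {"the", "and", "was", "a", "an"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def term_counts(text):
    """Count the non-stop-word terms in one document."""
    return Counter(t for t in tokenize(text) if t not in STOPWORDS)

# One row of term counts per document: a structured view of the text.
table = [term_counts(d) for d in documents]
shared_terms = set(table[0]) & set(table[1])

print(table)
print(sorted(shared_terms))
```

Once text is in this tabular form, the same clustering and classification methods used on database records can be applied to it.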
21
Sentiment Analysis
• Uses semantic techniques and keywords to detect favorable and unfavorable opinions toward specific subjects.
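The keyword part of this idea can be sketched in a few lines. The word lists below are invented and far too small for real use, and the sketch ignores negation ("not good") and other semantic cues that a real system would need to handle.

```python
import re

# Hypothetical keyword lexicons; real ones contain thousands of entries.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}

def sentiment(text, subject):
    """Classify the opinion expressed in text about a given subject."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if subject.lower() not in tokens:
        return "not mentioned"
    # Score = positive keyword hits minus negative keyword hits.
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"

print(sentiment("I love the new camera, great photos", "camera"))
print(sentiment("The battery is terrible", "battery"))
print(sentiment("The battery is terrible", "camera"))
```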
22
Privacy Concerns with Data Mining
• Fear of "Big Brother" surveillance
• Puts too much power into the hands of government security forces
23
False Positives in Data Mining for Security Reasons
• Impose costs on both the public and the government
• A source of controversy and public mistrust
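A short calculation shows why false positives dominate when the thing being searched for is rare. All numbers below are invented for illustration and do not describe any real screening program.

```python
def false_positive_share(population, actual_cases, sensitivity, false_positive_rate):
    """Fraction of all flagged people who are in fact innocent."""
    true_positives = actual_cases * sensitivity
    false_positives = (population - actual_cases) * false_positive_rate
    return false_positives / (true_positives + false_positives)

# Assumed scenario: 100 real cases among 1,000,000 people, screened by a
# system that catches 99% of real cases and wrongly flags 1% of everyone else.
share = false_positive_share(
    population=1_000_000,
    actual_cases=100,
    sensitivity=0.99,
    false_positive_rate=0.01,
)
print(f"{share:.1%} of flagged people are false positives")
```

Under these assumptions roughly 99 of every 100 flags point at an innocent person, even though the system looks accurate on paper.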
24
Data Mining as Another Tool for Security
• Government doesn’t wish to interfere in civilian life
• Actual intrusions of privacy incur legal costs
• Useful for correlating with other sources of data
25
Visual and Speech Processing
• Examining large amounts of real-time input for specific data and relationships between data
• Requires a certain amount of predictive modeling
26
Data Mining is an Essential Use of Computers
• It makes the previously impossible possible
• Powerful tool for progress and understanding
• Lasting Impact