data mining

Alisha Korpal, Nivia Jain, Sharuti Jain

Upload: alisha-korpal

Posted on 15-Nov-2014


Page 1: Data mining

Alisha Korpal

Nivia Jain

Sharuti Jain

Page 2: Data mining

Data Mining?

Huge amounts of data
Electronic records of our decisions
Choices in the supermarket
Financial records

Page 3: Data mining

Data vs. Information

Page 4: Data mining

Data: a collection of raw facts and figures.

Information: the processed form of data.

Page 5: Data mining

Data Mining

Extracting or “mining” knowledge from large amounts of data
Data-driven discovery and modeling of hidden patterns in large volumes of data
Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases

Page 6: Data mining

Data Mining Process

Defining the problem
Preparing data
Exploring data
Building models
Exploring and validating models
Deploying and updating models

Page 7: Data mining

Data Mining Process

Page 8: Data mining

Defining the Problem

What are you looking for?

What types of relationships are you trying to find?

Do you want to make predictions from the data mining model, or just look for interesting patterns and associations?

Page 9: Data mining

Defining the Problem (contd.)

Which attribute of the dataset do you want to try to predict?

How are the columns related? If there are multiple tables, how are the tables related?

Does the problem you are trying to solve reflect the policies or processes of the business?

Page 10: Data mining

Preparing Data

Page 11: Data mining

Exploring Data

You must understand the data in order to make appropriate decisions when you create the mining models. Exploration techniques include calculating the minimum and maximum values, calculating the mean and standard deviation, and looking at the distribution of the data.
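The exploration techniques named above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the `unit_sales` column, the helper names `summarize` and `distribution`, and the bin width are assumptions made for the example and do not come from the slides.

```python
import statistics
from collections import Counter

# Hypothetical column of daily unit sales; the values are illustrative only.
unit_sales = [12, 15, 11, 15, 22, 15, 9, 30, 12, 15]

def summarize(values):
    """Return basic exploration statistics for a numeric column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

def distribution(values, bin_width):
    """Count how many values fall into each fixed-width bin."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))

summary = summarize(unit_sales)
histogram = distribution(unit_sales, bin_width=10)
print(summary)
print(histogram)
```

A histogram with most values in one bin and a few far away is a hint that outliers may need attention before any model is built.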

Page 12: Data mining

Models

Building models
Exploring and validating models
Deploying and updating models
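As a rough sketch of the build and validate steps, the Python below trains a very simple nearest-mean classifier on part of a data set and checks it on held-out rows. The records, the field meanings, and the function names are hypothetical, and the classifier is only a stand-in for whatever mining algorithm is actually chosen.

```python
# Hypothetical labeled records: (monthly_spend, is_repeat_customer).
records = [
    (5, 0), (8, 0), (12, 0), (15, 0), (40, 1), (45, 1), (52, 1), (60, 1),
    (7, 0), (50, 1), (10, 0), (48, 1),
]

def split(rows, train_fraction=0.75):
    """Hold out the last part of the rows for validation."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def build_model(train_rows):
    """Build a nearest-mean classifier: one mean feature value per class."""
    means = {}
    for label in {label for _, label in train_rows}:
        values = [x for x, y in train_rows if y == label]
        means[label] = sum(values) / len(values)
    return means

def predict(model, x):
    """Predict the class whose mean is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))

def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for x, y in rows if predict(model, x) == y)
    return hits / len(rows)

train_rows, validation_rows = split(records)
model = build_model(train_rows)
validation_accuracy = accuracy(model, validation_rows)
print(model, validation_accuracy)
```

Updating a deployed model would, under the same assumptions, amount to calling `build_model` again once new labeled rows arrive and re-checking the accuracy on fresh held-out data.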

Page 13: Data mining

Evolution of Data Mining

Data collection: 1960s
Data access: 1980s
Data warehousing and decision support: 1990s
Data mining: emerging today

Page 14: Data mining

Evolutionary step: Data Collection (1960s)
Business question: "What was my total revenue in the last five years?"
Enabling technologies: Computers, tapes, disks
Characteristics: Retrospective, static data delivery

Evolutionary step: Data Access (1980s)
Business question: "What were unit sales in New England last March?"
Enabling technologies: Relational databases (RDBMS), Structured Query Language (SQL), ODBC
Characteristics: Retrospective, dynamic data delivery at record level

Evolutionary step: Data Warehousing & Decision Support (1990s)
Business question: "What were unit sales in New England last March? Drill down to Boston."
Enabling technologies: On-line analytic processing (OLAP), multidimensional databases, data warehouses
Characteristics: Retrospective, dynamic data delivery at multiple levels

Evolutionary step: Data Mining (Emerging Today)
Business question: "What’s likely to happen to Boston unit sales next month? Why?"
Enabling technologies: Advanced algorithms, multiprocessor computers, massive databases
Characteristics: Prospective, proactive information delivery

Page 15: Data mining

Data Mining vs. OLAP

On-line analytical processing (OLAP) provides a very good view of what is happening, but it cannot predict what will happen in the future or explain why it is happening.
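The contrast can be made concrete with a small Python sketch. The first two functions answer retrospective, OLAP-style questions by aggregating past sales; the third extrapolates a trend to the next month. The sales figures and function names are invented for illustration, and the least-squares line is only a simple stand-in for a real mining algorithm: it gives a forecast but does not by itself explain why.

```python
from collections import defaultdict

# Hypothetical monthly unit sales: (region, city, month_index, units).
sales = [
    ("New England", "Boston", 1, 100),
    ("New England", "Boston", 2, 110),
    ("New England", "Boston", 3, 120),
    ("New England", "Hartford", 1, 40),
    ("New England", "Hartford", 2, 42),
    ("New England", "Hartford", 3, 41),
]

def roll_up(rows, month):
    """OLAP-style view: total units per region for one past month."""
    totals = defaultdict(int)
    for region, _city, m, units in rows:
        if m == month:
            totals[region] += units
    return dict(totals)

def drill_down(rows, region, month):
    """OLAP-style view: the same month broken down by city."""
    return {city: units for r, city, m, units in rows if r == region and m == month}

def forecast_next(rows, city):
    """Mining-style step: fit a least-squares line and extrapolate one month."""
    points = [(m, units) for _r, c, m, units in rows if c == city]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sum(
        (x - mean_x) ** 2 for x, _ in points
    )
    intercept = mean_y - slope * mean_x
    next_month = max(x for x, _ in points) + 1
    return intercept + slope * next_month

region_totals = roll_up(sales, month=3)
city_totals = drill_down(sales, "New England", month=3)
boston_forecast = forecast_next(sales, "Boston")
print(region_totals, city_totals, boston_forecast)
```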

Page 16: Data mining

Scope of Data Mining

Automated prediction of trends and behaviors
Automated discovery of previously unknown patterns

Page 17: Data mining

Applications

Science: chemistry, physics, medicine
Biochemical analysis
Remote sensors on a satellite
Medical image analysis

Page 18: Data mining

Applications

Financial industry, banks, businesses, e-commerce
Stock and investment analysis
Risk management
Sales forecasting

Page 19: Data mining

Applications

Database analysis and decision support

Market analysis and management: target marketing, customer relationship management, market basket analysis, cross-selling
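Market basket analysis, one of the applications listed above, is commonly described in terms of two measures: the support of an itemset and the confidence of a rule. The Python below is a minimal sketch of both; the baskets and the function names are made up for the example.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)

bread_butter_support = support({"bread", "butter"}, baskets)
bread_to_butter = confidence({"bread"}, {"butter"}, baskets)
print(bread_butter_support, bread_to_butter)
```

A rule such as "bread implies butter" with high support and high confidence is the kind of pattern that could feed cross-selling decisions, subject to checking that it is not simply a reflection of butter being popular overall.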

Page 20: Data mining

Applications

Risk analysis and management: forecasting, customer retention, improved underwriting
Fraud detection and management

Page 24: Data mining

Result of Data Mining

Forecasting what may happen in the future
Classifying people or things into groups by recognizing patterns
Clustering people or things into groups based on their attributes
Sequencing what events are likely to lead to later events
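To illustrate the clustering result above, here is a small sketch of k-means (Lloyd's algorithm) on a single numeric attribute. The `ages` data, the starting centers, and the function name are assumptions chosen for the example; real clustering would normally use several attributes and a library implementation.

```python
def k_means_1d(values, centers, iterations=10):
    """Cluster numbers around the given starting centers (Lloyd's algorithm)."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the group of its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: move each center to its group mean (unchanged if empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical customer ages to be grouped by similarity.
ages = [18, 21, 22, 25, 47, 52, 55, 60]
centers, groups = k_means_1d(ages, centers=[20, 50])
print(centers, groups)
```

Unlike classification, no labels are supplied here: the two groups emerge only from how close the values are to each other.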

Page 25: Data mining

Data Mining is not

“Blind” application of algorithms
Going to find relations where none exist
Presenting data in different ways
A difficult-to-understand technology requiring an advanced degree in computer science
