Data Mining CS 157B Section 2 Keng Teng Lao

Post on 23-Dec-2015

Page 1: Data Mining CS 157B Section 2 Keng Teng Lao. Overview Definition of Data Mining Application of Data Mining

Data Mining

CS 157B Section 2

Keng Teng Lao

Overview

• Definition of Data Mining

• Application of Data Mining

Data Mining

• Refers to the mining or discovery of new information in terms of patterns or rules from vast amounts of data.

• To be useful, data mining must be carried out efficiently on large files and databases.

KDD

• Knowledge Discovery in Databases

• Process flow (from the slide diagram): Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation
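The KDD stages can be read as a pipeline in which each step feeds the next. The following is a minimal Python sketch, not a definitive implementation. The two in-memory "databases", the record layout, the function names (clean, integrate, select, mine, evaluate) and the min_count threshold are all assumptions made for illustration.

```python
from collections import Counter

# Two hypothetical source "databases": lists of (customer, item) records.
db_a = [("ann", "milk"), ("bob", "bread"), ("ann", None)]
db_b = [("cat", "milk"), ("bob", "bread"), ("dan", "milk")]

def clean(records):
    """Data cleaning: drop records with a missing field."""
    return [r for r in records if all(field is not None for field in r)]

def integrate(*sources):
    """Data integration: merge the cleaned sources into one warehouse list."""
    warehouse = []
    for source in sources:
        warehouse.extend(source)
    return warehouse

def select(warehouse):
    """Selection: keep only the task-relevant attribute (the item bought)."""
    return [item for _customer, item in warehouse]

def mine(items):
    """Data mining: count how often each item occurs."""
    return Counter(items)

def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns that occur often enough."""
    return {item: n for item, n in patterns.items() if n >= min_count}

warehouse = integrate(clean(db_a), clean(db_b))
frequent = evaluate(mine(select(warehouse)), min_count=3)
print(frequent)
```

Here the mining step is only a frequency count; a real system would substitute one of the techniques listed later in these slides.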

Data Mining Vs. Data Warehousing

• The goal of a data warehouse is to support decision making with data.

• Data mining can be used in conjunction with a data warehouse to help with certain types of decisions.

Goals of Data Mining and Knowledge Discovery

• Prediction – Data mining can show how certain attributes within the data will behave in the future.

• Identification – Data patterns can be used to identify the existence of an item, an event, or an activity.

Cont.

• Classification – Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters.

• Optimization – One eventual goal of data mining may be to optimize the use of limited resources such as time, space… in order to maximize output variables such as sales or profits under a given set of constraints.

Types of Knowledge Discovered During Data Mining

• Association rules

• Classification hierarchies

• Sequential patterns

• Patterns within time series

• Clustering
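As an illustration of the first item, an association rule such as "milk ⇒ bread" is usually judged by its support (how often both sides occur together) and its confidence (how often the right side occurs when the left side does). The sketch below computes both in Python; the transactions and the function names are hypothetical.

```python
# Hypothetical market-basket transactions (each one is a set of items).
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk", "juice"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(lhs, rhs, transactions):
    """Support of lhs and rhs together, divided by the support of lhs."""
    return support(lhs | rhs, transactions) / support(lhs, transactions)

# Evaluate the rule milk => bread.
rule_support = support({"milk", "bread"}, transactions)
rule_confidence = confidence({"milk"}, {"bread"}, transactions)
print(rule_support, rule_confidence)
```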

Classification hierarchies

• Process of learning a model that describes different classes of data.

• Decision Tree
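A small decision tree can be learned from labeled records as in the Python sketch below. It is an assumption-laden illustration: the records and attribute names are invented, and the split criterion is a plain misclassification count, swapped in for the information-gain measures that tree learners commonly use.

```python
from collections import Counter

# Hypothetical training records: (attributes, class label).
records = [
    ({"income": "high", "owns_home": "yes"}, "good"),
    ({"income": "high", "owns_home": "no"}, "good"),
    ({"income": "low", "owns_home": "yes"}, "good"),
    ({"income": "low", "owns_home": "no"}, "poor"),
]

def majority(rows):
    """Most common class label among the rows."""
    return Counter(label for _attrs, label in rows).most_common(1)[0][0]

def errors_after_split(rows, attr):
    """Rows misclassified if each branch of attr predicts its majority class."""
    total = 0
    for value in {attrs[attr] for attrs, _label in rows}:
        branch = [r for r in rows if r[0][attr] == value]
        guess = majority(branch)
        total += sum(1 for _attrs, label in branch if label != guess)
    return total

def learn(rows, attrs):
    """Return a class label (leaf) or a pair (attribute, {value: subtree})."""
    labels = {label for _attrs, label in rows}
    if len(labels) == 1 or not attrs:
        return majority(rows)
    best = min(attrs, key=lambda a: errors_after_split(rows, a))
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in {a[best] for a, _label in rows}:
        subset = [r for r in rows if r[0][best] == value]
        branches[value] = learn(subset, rest)
    return (best, branches)

def classify(tree, attrs):
    """Walk the tree until a leaf (a plain label) is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[attrs[attr]]
    return tree

tree = learn(records, ["income", "owns_home"])
print(classify(tree, {"income": "low", "owns_home": "no"}))
```

The learned model is the nested (attribute, branches) structure; classifying a new record only requires following one path from the root to a leaf.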

Sequential Patterns

• The discovery of sequential patterns is based on the concept of a sequence of itemsets.

• The goal is to find all subsequences of the given set of sequences that have at least a user-defined minimum support.
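The support test can be sketched in Python as follows. This is a simplified illustration: it only checks given candidate patterns against a minimum support and does not enumerate all subsequences. The customer histories, the candidate patterns and the 0.6 threshold are hypothetical.

```python
# Hypothetical customer histories: each is an ordered list of itemsets.
sequences = [
    [{"milk"}, {"bread", "butter"}, {"juice"}],
    [{"milk", "eggs"}, {"juice"}],
    [{"bread"}, {"milk"}],
    [{"milk"}, {"bread"}, {"juice"}],
]

def contains(sequence, pattern):
    """True if the pattern's itemsets appear, in order, within the sequence."""
    position = 0
    for itemset in sequence:
        # Match the next pattern itemset as early as possible.
        if position < len(pattern) and pattern[position] <= itemset:
            position += 1
    return position == len(pattern)

def support(pattern, sequences):
    """Fraction of sequences that contain the pattern as a subsequence."""
    return sum(contains(s, pattern) for s in sequences) / len(sequences)

candidates = [[{"milk"}, {"juice"}], [{"bread"}, {"juice"}]]
min_support = 0.6  # user-defined threshold (assumed value)
frequent = [p for p in candidates if support(p, sequences) >= min_support]
print(frequent)
```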

Patterns within Time Series

• Time series are sequences of events.

• Each event may be a given fixed type of a transaction.

• For example, the closing price of a stock or a fund is an event that occurs every weekday for each stock and fund.
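One simple pattern to look for in such a series is a streak of consecutive daily increases. The Python sketch below assumes a hypothetical list of closing prices, one per weekday; the function name longest_rising_run is likewise illustrative.

```python
# Hypothetical closing prices, one per weekday.
closing_prices = [10.0, 10.5, 11.0, 10.8, 11.2, 11.9, 12.4, 12.1]

def longest_rising_run(prices):
    """Length, in days, of the longest streak of day-over-day increases."""
    best = current = 0
    for previous, today in zip(prices, prices[1:]):
        # Extend the streak on a rise, otherwise start over.
        current = current + 1 if today > previous else 0
        best = max(best, current)
    return best

run = longest_rising_run(closing_prices)
print(run)
```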

Applications of Data Mining

• Marketing – Applications include analysis of consumer behavior based on buying patterns.

• Finance – Applications include analysis of creditworthiness of clients, segmentation of accounts receivable…

Cont.

• Manufacturing – Applications involve optimization of resources like machines, manpower, and materials.

• Health Care – Applications include discovering patterns in radiological images, analyzing side effects of drugs…

Real Life Application

• The Los Angeles Police Department's counterterrorism unit is using a new data-analysis system designed to identify and connect related pieces of intelligence, to help officers deter and respond to terrorist attacks.

References

• Elmasri, Ramez. Fundamentals of Database Systems. Pearson, Singapore, 2004.

• LAPD turns to data analysis to fight terrorism. <http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=107670>