Post on 23-Dec-2015

Data Mining

CS 157B Section 2

Keng Teng Lao

Overview

• Definition of Data Mining

• Application of Data Mining

Data Mining

• Refers to the mining or discovery of new information in terms of patterns or rules from vast amounts of data.

• To be useful, data mining must be carried out efficiently on large files and databases.

KDD

• Knowledge Discovery in Databases

• Process flow (figure): Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation
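
The KDD stages named on this slide can be sketched as a small pipeline. This is only an illustrative toy, not a real KDD system: the function names, the two "store" data sources, and the frequency-count "mining" step are all assumptions made for the example.

```python
from collections import Counter

def clean(records):
    """Data cleaning: drop records that have any missing (None) value."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: combine records from several sources into one list."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Selection: keep only the task-relevant fields of each record."""
    return [{f: r[f] for f in fields} for r in records]

def mine(records, field):
    """Data mining (toy stand-in): count how often each value of a field occurs."""
    return Counter(r[field] for r in records)

def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns seen at least min_count times."""
    return {value: n for value, n in patterns.items() if n >= min_count}

# Hypothetical source databases, invented for this sketch.
store_a = [{"customer": "c1", "item": "milk"}, {"customer": "c2", "item": None}]
store_b = [{"customer": "c3", "item": "milk"}, {"customer": "c4", "item": "tea"}]

warehouse = integrate(clean(store_a), clean(store_b))   # cleaned, integrated data
relevant = select(warehouse, ["item"])                  # task-relevant data
result = evaluate(mine(relevant, "item"), min_count=2)  # evaluated patterns
print(result)
```

Each stage takes the output of the previous one, which mirrors the left-to-right flow of the figure.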

Data Mining Vs. Data Warehousing

• The goal of a data warehouse is to support decision making with data.

• Data mining can be used in conjunction with a data warehouse to help with certain types of decisions.

Goals of Data Mining and Knowledge Discovery

• Prediction – Data mining can show how certain attributes within the data will behave in the future.

• Identification – Data patterns can be used to identify the existence of an item, an event, or an activity.

Goals of Data Mining and Knowledge Discovery (Cont.)

• Classification – Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters.

• Optimization – One eventual goal of data mining may be to optimize the use of limited resources such as time, space… and to maximize output variables such as sales or profits under a given set of constraints.

Types of Knowledge Discovered During Data Mining

• Association rules

• Classification hierarchies

• Sequential patterns

• Patterns within time series

• Clustering

Classification hierarchies

• Process of learning a model that describes different classes of data.

• Decision Tree
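
As a rough illustration of learning a model that separates classes, the sketch below learns a one-level decision tree (a "stump") on a single numeric attribute. The training records, the attribute name `income`, and the class labels are hypothetical, and real decision-tree learners use more refined split criteria than the simple error count assumed here.

```python
from collections import Counter

def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def learn_stump(records, attribute, label_key):
    """Learn a one-level decision tree on one numeric attribute.

    Tries each observed value as a threshold and keeps the one that
    misclassifies the fewest training records. Returns None if no
    threshold splits the data into two non-empty groups.
    """
    best = None
    for threshold in sorted({r[attribute] for r in records}):
        left = [r[label_key] for r in records if r[attribute] <= threshold]
        right = [r[label_key] for r in records if r[attribute] > threshold]
        if not left or not right:
            continue
        left_label, right_label = majority(left), majority(right)
        errors = (sum(l != left_label for l in left)
                  + sum(l != right_label for l in right))
        if best is None or errors < best["errors"]:
            best = {"threshold": threshold, "left": left_label,
                    "right": right_label, "errors": errors}
    return best

def classify(stump, value):
    """Follow the single test in the stump to a class label."""
    return stump["left"] if value <= stump["threshold"] else stump["right"]

# Hypothetical training data: income in thousands and a credit-risk class.
training = [
    {"income": 20, "risk": "high"},
    {"income": 25, "risk": "high"},
    {"income": 40, "risk": "low"},
    {"income": 60, "risk": "low"},
]
stump = learn_stump(training, "income", "risk")
print(stump["threshold"], classify(stump, 22), classify(stump, 55))
```

A full decision tree would apply the same kind of split recursively to each branch.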

Sequential Patterns

• The discovery of sequential patterns is based on the concept of a sequence of itemsets.

• The goal is to find all subsequences from the given set of sequences that have a user-defined minimum support.
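
The idea of minimum support can be sketched as follows. This example only checks a hand-picked list of candidate subsequences and does not enumerate all of them, as a real sequential-pattern miner would. The purchase histories, the candidates, and the threshold value are assumptions made for illustration.

```python
def is_subsequence(candidate, sequence):
    """True if the itemsets of candidate appear, in order, inside sequence.

    Each candidate itemset must be a subset of some itemset in the
    sequence, and the matching itemsets must occur in the same order.
    """
    position = 0
    for itemset in sequence:
        if position < len(candidate) and candidate[position] <= itemset:
            position += 1
    return position == len(candidate)

def support(candidate, sequences):
    """Fraction of the sequences that contain candidate as a subsequence."""
    return sum(is_subsequence(candidate, s) for s in sequences) / len(sequences)

# Hypothetical purchase histories: each is an ordered list of itemsets.
sequences = [
    [{"milk", "bread"}, {"juice"}, {"eggs"}],
    [{"milk"}, {"bread", "juice"}],
    [{"bread"}, {"milk"}, {"juice", "eggs"}],
    [{"eggs"}, {"milk"}],
]
candidates = [
    [{"milk"}, {"juice"}],
    [{"bread"}, {"eggs"}],
    [{"eggs"}, {"milk"}],
]
min_support = 0.5  # user-defined threshold, chosen arbitrarily here

frequent = [c for c in candidates if support(c, sequences) >= min_support]
for c in frequent:
    print(c, support(c, sequences))
```

Here "milk, then juice" is found in three of the four histories, so it passes the threshold, while "eggs, then milk" is found in only one and is dropped.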

Patterns within Time Series

• Time series are sequences of events.

• Each event may be a given fixed type of a transaction.

• The closing price of a stock or a fund is an event that occurs every weekday for each stock and fund.
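
One simple kind of pattern in such a series is a run of consecutive rising closes. The sketch below finds those runs. The price values and the minimum run length are made up for the example, and this is only one of many patterns that could be looked for in a time series.

```python
def rising_runs(prices, min_length):
    """Return (start, end) index pairs of runs of strictly rising prices.

    A run is reported only if it contains at least min_length
    consecutive rises (so it spans min_length + 1 prices or more).
    """
    runs = []
    start = 0
    for i in range(1, len(prices) + 1):
        # A run ends at the end of the series or when the price fails to rise.
        if i == len(prices) or prices[i] <= prices[i - 1]:
            if (i - 1) - start >= min_length:
                runs.append((start, i - 1))
            start = i
    return runs

# Hypothetical daily closing prices for one stock.
closes = [10.0, 10.5, 11.0, 10.8, 10.9, 11.2, 11.6, 11.5]
print(rising_runs(closes, 2))
```

Raising `min_length` keeps only the longer runs, in the same way a minimum support threshold filters out weak patterns.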

Application of Data Mining

• Marketing – Applications include analysis of consumer behavior based on buying patterns.

• Finance – Applications include analysis of creditworthiness of clients, segmentation of account receivables…

Application of Data Mining (Cont.)

• Manufacturing – Applications involve optimization of resources like machines, manpower, and materials.

• Health Care – Applications include discovering patterns in radiological images, analyzing side effects of drugs…

Real Life Application

• The LA police department's counterterrorism unit is using a new data-analysis system designed to identify and connect related pieces of intelligence to help officers deter and respond to terrorist attacks.

References

• Elmasri, Ramez, and Shamkant B. Navathe. Fundamentals of Database Systems. Pearson. Singapore. 2004.

• LAPD turns to data analysis to fight terrorism. <http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=107670>