data mining & data warehousing (ppt)

Presented by: Harish Chand Data Mining and Data Warehousing

Upload: harish-chand

Post on 14-Aug-2015


TRANSCRIPT

Page 1: Data mining & data warehousing (ppt)

Presented by: Harish Chand

Data Mining and Data Warehousing

Page 2: Data mining & data warehousing (ppt)

Data Mining

• Data mining is the extraction of hidden predictive information from large databases

• The overall goal of the data mining process is to extract information from a dataset and transform it into an understandable structure for further use

• It is often confused with analytics, information extraction, or data analysis

• Its goal, however, is the extraction of patterns and knowledge from large amounts of data, not the extraction of data itself

Page 3: Data mining & data warehousing (ppt)

DATA MINING TECHNIQUES

• Data mining extracts interesting patterns such as groups of data records (cluster analysis) and unusual records (anomaly detection)

• These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics

• For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system
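
The two techniques named on this slide can be sketched in a few lines of Python. This is only a minimal illustration, not a production method: the function names `kmeans_1d` and `find_anomalies`, the sample purchase amounts, and the threshold of 5 are all assumptions made for the example. Clustering uses a tiny one-dimensional k-means loop. Anomaly detection flags values that lie far from the median, measured in units of the median absolute deviation.

```python
from statistics import mean, median


def kmeans_1d(values, k=2, iterations=20):
    """Group 1-D values into k clusters with a tiny k-means loop."""
    ordered = sorted(values)
    # Deterministic start: pick k evenly spaced values as initial centers.
    step = max(1, len(ordered) // k)
    centers = ordered[::step][:k]
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to the nearest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Recompute each center as its cluster mean; keep it if the cluster is empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


def find_anomalies(values, threshold=5.0):
    """Flag values far from the median, in units of median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All typical values are identical, so anything different is unusual.
        return [v for v in values if v != med]
    return [v for v in values if abs(v - med) / mad > threshold]


# Hypothetical purchase amounts, invented for illustration only.
groups = kmeans_1d([12, 15, 14, 13, 95, 99, 102, 98], k=2)
unusual = find_anomalies([12, 15, 14, 13, 16, 14, 250])
```

Here `groups` separates the small purchases from the large ones, and `unusual` contains only the single extreme value.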

Page 4: Data mining & data warehousing (ppt)

Data Warehousing

• A data warehouse (DW), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis. DWs are central repositories of integrated data from one or more disparate sources

• It stores company data in a secondary location, typically away from production systems

• Location: where information can be proactively reported and queried against

• Goal: creation of a single logical view of data that may reside in many different physical databases

Page 5: Data mining & data warehousing (ppt)

• It exists to help users understand and enhance their organization's performance

• It is designed for query and analysis rather than for transaction processing

• It usually contains historical data derived from transaction data, but can include data from other sources
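
As a rough sketch of the "single logical view" idea, the example below loads records from two differently shaped sources into one table and queries them together. Everything in it is hypothetical: the source names (`crm`, `shop`), the `sales` table, and the sample rows are assumptions for illustration, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical extracts from two separate operational systems,
# each with its own layout.
crm_rows = [{"customer": "Asha", "amount": 120.0, "day": "2015-08-01"}]
shop_rows = [("2015-08-02", "Ravi", 80.0), ("2015-08-02", "Asha", 40.0)]


def build_warehouse(crm, shop):
    """Load both sources into one integrated table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (source TEXT, customer TEXT, sale_date TEXT, amount REAL)"
    )
    # Transform each source into the common (customer, sale_date, amount) layout.
    conn.executemany(
        "INSERT INTO sales VALUES ('crm', ?, ?, ?)",
        [(r["customer"], r["day"], r["amount"]) for r in crm],
    )
    conn.executemany(
        "INSERT INTO sales VALUES ('shop', ?, ?, ?)",
        [(customer, day, amount) for day, customer, amount in shop],
    )
    conn.commit()
    return conn


warehouse = build_warehouse(crm_rows, shop_rows)
# One query now answers a question that spans both source systems.
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
```

The reporting query never needs to know which system a row originally came from, which is the point of the integrated view.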

Page 6: Data mining & data warehousing (ppt)

TYPES OF DATA WAREHOUSING

• Data Mart

• Online Transaction Processing

• Online Analytical Processing.

Page 7: Data mining & data warehousing (ppt)

Data Mart

• A data mart is a simple form of a data warehouse that is focused on a single functional area such as sales, finance or marketing.

• Data marts are often built and controlled by a single department within an organization.

• Given their single-subject focus, data marts usually draw data from only a few sources.

• The sources could be internal operational systems, a central data warehouse, or external data.

Page 8: Data mining & data warehousing (ppt)

Online Analytical Processing (OLAP)

• Is characterized by a relatively low volume of transactions. Queries are often very complex and involve aggregations.

• For OLAP systems, response time is an effectiveness measure.

• OLAP databases store aggregated, historical data in multi-dimensional schemas (usually star schemas).

• OLAP systems typically have data latency of a few hours, as opposed to data marts, where latency is expected to be closer to one day.
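
A star schema and a typical aggregating query might look like the sketch below. The table names (`fact_sales`, `dim_product`, `dim_date`) and the figures are invented for illustration, and SQLite is used only because it ships with Python; a real OLAP system would use a dedicated engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the "who/what/when" of each fact.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
    -- The central fact table holds the measures and points at the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        revenue REAL
    );
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Toys');
    INSERT INTO dim_date VALUES (1, 2014, 4), (2, 2015, 1);
    INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 2, 70.0), (2, 2, 30.0);
    """
)

# A typical OLAP-style query: join the fact table to its dimensions and aggregate.
revenue_by_year_and_category = conn.execute(
    """
    SELECT d.year, p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
    """
).fetchall()
```

The query touches few rows here, but the shape (joins from one fact table out to several dimensions, followed by aggregation) is what makes OLAP queries complex compared with single-row transactions.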

Page 9: Data mining & data warehousing (ppt)

Online Transaction Processing (OLTP)

• Is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE).

• OLTP systems emphasize very fast query processing and maintaining data integrity in multi-access environments.

• For OLTP systems, effectiveness is measured by the number of transactions per second.

Page 10: Data mining & data warehousing (ppt)

• OLTP databases contain detailed and current data.

• The schema used to store transactional databases is the entity model (usually 3NF).
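
A short OLTP-style transaction can be sketched as follows. The `account` table, the `transfer` helper, and the balances are hypothetical and chosen only to show a transaction that either commits completely or is rolled back to protect data integrity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the integrity rule: no account may go negative.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one short, all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would break the CHECK constraint, so the credit is undone too.
        return False


ok = transfer(conn, 1, 2, 30.0)        # succeeds
rejected = transfer(conn, 1, 2, 500.0)  # fails and is rolled back
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

After both calls the balances reflect only the successful transfer; the failed one leaves no partial update behind.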

Page 11: Data mining & data warehousing (ppt)

Data mining VS Data warehousing

Data warehouse

• Process of storing data in order in a given dataset.

• Data warehousing is the process of extracting and storing data to allow easier reporting.

• The tools in data warehousing are designed to extract data and store it in a way that provides enhanced system performance.

• Helps in identifying certain data in a collection of data.

Data mining

• Process of finding patterns in a given dataset.

• Data mining is the use of pattern recognition logic to identify trends within a sample data set and extrapolate this information against the larger data pool.

• A typical use of data mining is to create targeted marketing programs or identify financial fraud.

• Helps in figuring out a certain pattern in the data or in a cluster of data.

Page 12: Data mining & data warehousing (ppt)

Benefits of Data Warehouse

• A Data Warehouse Delivers Enhanced Business Intelligence

By providing data from various sources, managers and executives will no longer need to make business decisions based on limited data

• A Data Warehouse Saves Time

Since business users can quickly access critical data from a number of sources, all in one place, they can rapidly make informed decisions on key initiatives

Page 13: Data mining & data warehousing (ppt)

Benefits of Data Warehouse

• A Data Warehouse Enhances Data Quality and Consistency

Individual business units and departments, including sales, marketing, finance, and operations, will start to utilize the same data repository as the source system for their individual queries and reports.

Thus each of these individual business units and departments will produce results that are consistent with the other business units within the organization.

Page 14: Data mining & data warehousing (ppt)

Benefits of Data Warehousing

• A Data Warehouse Provides Historical Intelligence

A data warehouse stores large amounts of historical data so we can analyze different time periods and trends in order to make future predictions.

It can enable advanced business intelligence, including time-period analysis, trend analysis, and trend prediction.
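
Trend analysis and trend prediction can be illustrated with a simple least-squares line fitted to historical values. This is a minimal sketch under stated assumptions: the helper names `fit_trend` and `predict_next` and the quarterly figures are invented for the example, and real forecasting would normally account for seasonality and noise.

```python
def fit_trend(values):
    """Fit y = intercept + slope * x by least squares, with x = 0, 1, 2, ..."""
    n = len(values)  # needs at least two points
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict_next(values):
    """Extend the fitted trend line one period past the history."""
    slope, intercept = fit_trend(values)
    return intercept + slope * len(values)


# Hypothetical quarterly sales pulled from the warehouse.
history = [100, 110, 120, 130]
slope, intercept = fit_trend(history)
forecast = predict_next(history)
```

With this perfectly linear history the slope is 10 per quarter, so the forecast for the next quarter is 140.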

Page 15: Data mining & data warehousing (ppt)

Benefits of Data Warehousing

• A Data Warehouse Generates a High ROI

Return on Investment (ROI)

Past references show that companies that have implemented data warehouses have generated more revenue and saved more money than companies that haven't invested in data warehouses.
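
ROI is commonly computed as the net gain divided by the cost of the investment. The sketch below uses that common definition; the function name `roi` and the figures are hypothetical and are not taken from any real deployment.

```python
def roi(gain, cost):
    """Return on investment as a fraction: (gain - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


# Hypothetical figures: a warehouse costing 100,000 that yields
# 150,000 in added revenue and savings.
example_roi = roi(gain=150_000, cost=100_000)
```

In this made-up case the ROI is 0.5, that is, a 50% return on the amount invested.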

Page 16: Data mining & data warehousing (ppt)

Challenges Faced in Data Warehousing

• User expectation

End users demand and expect more accurate and refined results in return for processing;

however, performance decreases as data volumes explode, and so the efficiency of the system falls.

• Systems optimization

Business intelligence tools require frequent maintenance and fine-tuning of the whole system in order to meet users' expectations.

Page 17: Data mining & data warehousing (ppt)

Challenges Faced in Data Warehousing

• Data structuring

Proper processing of data requires structuring it in a desired format so that further operations can be executed.

As the volume of data increases, the task of structuring unstructured data adds up, slowing down the system's processing capabilities and eventually making it hectic for the system manager to qualify the data for analytic purposes.

Page 18: Data mining & data warehousing (ppt)

Challenges Faced in Data Warehousing

• Prefabricated vs. custom warehouse

The variety of warehouses available in the market creates ambiguity about which type to choose.

A custom warehouse saves the time of building the warehouses from various operational

Prefabricated warehouses save the time of initial configuration and installation.

Page 19: Data mining & data warehousing (ppt)

Challenges Faced in Data Warehousing

• Resource Balancing

Many departments inside an organization tend to access the processing capabilities of the warehouse, which eventually reduces the performance of the system and decreases efficiency as the stress on the system increases.

Access control and security are some techniques which can be used to maintain a balance between the utilization and performance of warehouse systems.

Page 20: Data mining & data warehousing (ppt)

Decision support systems (DSS)

Decision support systems (DSS) are a specific class of computerized information system that supports business and organizational decision-making activities.

A DSS is a well-integrated, user-friendly, computer-based tool that combines data with various decision-making models to solve semi-structured and unstructured problems.

Page 21: Data mining & data warehousing (ppt)

Characteristics

Provides decision support for several interdependent decisions.

Assists the decision maker in making decisions under dynamic business conditions.

Supports a wide variety of decision-making processes and styles.

Page 22: Data mining & data warehousing (ppt)

[Figure: How a DSS works]

Page 23: Data mining & data warehousing (ppt)

Components of DSS

• Database management system

• Model management system

• Support tools

Page 24: Data mining & data warehousing (ppt)

Database Management

In database management, the data necessary to solve a problem may come from internal and external databases.

Within the organization, internal data are generated by systems such as TPS and MIS; external data come from a variety of sources such as periodicals, databases, newspapers and online data services.

Page 25: Data mining & data warehousing (ppt)

Model Management Component

It stores and accesses the models that managers use to make decisions.

Models are an integral part of most decision making and are used for many tasks, such as designing a manufacturing facility, analysing the financial health of an organization, forecasting demand for a product or service, and determining the quality of a particular batch of products.
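
One way to picture a model management component is as a small registry that stores named models and runs them on request. The sketch below is an assumption-laden illustration, not a description of any real DSS product: the `MODELS` registry, the `run_model` helper, the moving-average forecast, and the 2% defect threshold are all hypothetical.

```python
from statistics import mean


def forecast_demand(history, window=3):
    """Forecast next-period demand as the average of the last `window` periods."""
    return mean(history[-window:])


def batch_passes(defects, batch_size, max_defect_rate=0.02):
    """Accept a batch if its defect rate is at or below the allowed maximum."""
    return defects / batch_size <= max_defect_rate


# The registry stores models by name so a manager-facing tool can look them up.
MODELS = {
    "demand_forecast": forecast_demand,
    "batch_quality": batch_passes,
}


def run_model(name, *args, **kwargs):
    """Fetch a stored model by name and apply it to the supplied data."""
    return MODELS[name](*args, **kwargs)


next_demand = run_model("demand_forecast", [90, 100, 110, 120])
good_batch = run_model("batch_quality", 3, 200)
bad_batch = run_model("batch_quality", 10, 200)
```

Keeping the models behind one lookup function means new models can be added to the registry without changing the code that calls them.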

Page 26: Data mining & data warehousing (ppt)

Support Tools

It consists of tools such as pull-down menus, online help, user interfaces, graphical analysis and error-correction mechanisms, all of which facilitate users' interactions with the system.

Interfaces are an important support tool. This is because middle and top managers have neither the time nor the inclination to learn difficult and complicated procedures in order to run a system. The better the interface, the greater the chances that users will accept the system.

Page 27: Data mining & data warehousing (ppt)

Advantages of DSS

• Cost saving

• Improves managerial effectiveness

• Flexible and adaptive

• Improves the effectiveness of the decision

• Reduces the time and effort spent collecting and analysing data from different sources, so a large number of alternatives can be evaluated.

Page 28: Data mining & data warehousing (ppt)

Executive Information System (EIS)

It is also termed an Executive Support System (ESS).

It is a specialized decision support system used to assist senior executives in the decision-making process.

It includes various hardware, software, data, procedures and people.

It is very user-friendly in nature.

It is supported to a large extent by graphics.

Page 29: Data mining & data warehousing (ppt)


Page 30: Data mining & data warehousing (ppt)

Characteristics of Executive Information System

1. Informational characteristics

2. User interface/orientation characteristics

3. Managerial / executive characteristics

Page 31: Data mining & data warehousing (ppt)

1. Informational characteristics

i. Flexibility and ease of use.

ii. Provides timely information with a short response time and quick retrieval.

iii. Produces correct information.

iv. Produces relevant information.

v. Produces validated information.

Page 32: Data mining & data warehousing (ppt)

2. User interface/orientation characteristics

• Includes sophisticated self-help.

• Contains user-friendly interfaces, including a graphical user interface.

• Can be used from many places.

Page 33: Data mining & data warehousing (ppt)

• Offers secure, reliable, confidential access along with the access procedure.

• Is highly customized. Suits the management style of the individual executive.

Page 34: Data mining & data warehousing (ppt)

3. Managerial / executive characteristics

i. Supports the overall vision, mission and strategy.

ii. Provides support for strategic management.

iii. Sometimes helps to deal with situations that have a high degree of risk.

iv. Is linked to value-added business processes.

v. Supports access to databases.

vi. Is very result-oriented in nature.

Page 35: Data mining & data warehousing (ppt)

Advantages of EIS

• Achievement of the various organizational objectives.

• Facilitates access to information by integrating many sources of data.

• Facilitates a broad, aggregated perspective and context.

• Offers broad, highly aggregated information.

• Users' productivity is also improved to a large extent.

• Communication capability and quality are increased.

Page 36: Data mining & data warehousing (ppt)

Factors affecting EIS

• Internal factors: accurate and reliable information, improved communications, use of historical data

• External factors: increasing global competition, a changing business environment, government regulations.

Page 37: Data mining & data warehousing (ppt)

Differences between DSS and EIS

DSS

• Used by professionals

• Required for day-to-day operations

• Deals with both semi-structured and unstructured data

• Consists only of internal information

EIS

• Used by executives

• Required for strategic plans and procedures

• Deals only with unstructured data (which cannot be described in detail)

• Consists of both internal and external information