(5)data mining and ware housing
8/3/2019 (5)Data Mining and Ware Housing
http://slidepdf.com/reader/full/5data-mining-and-ware-housing 1/16
A Paper Presentation on
Presented by
B.YOGI REDDY K.SREEHARSHA
06471A0508 06471A1235
EMAIL: [email protected] EMAIL:[email protected]
(3/4) B.Tech (3/4)B.Tech
Mobile No:9490847674
NARASARAOPETA ENGINEERING COLLEGE Kotappakonda Road,Yellamanda(P.O), NARASARAOPET.
Abstract
One may claim that the exponential growth in the amount of data provides great opportunities for data mining. In many real-world applications, however, the number of sources over which this information is fragmented grows at an even faster rate, resulting in barriers to the widespread application of data mining. A data warehouse is designed especially for decision support queries.
Data warehousing is the process of extracting and transforming operational data into informational data and loading it into a central data store, or warehouse.
The idea behind data mining, then, is the “nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data”.
Data mining is concerned with the analysis of data and the use of software techniques for finding patterns and regularities in sets of data. The potential of data mining can be enhanced if the appropriate data has been collected and stored in a data warehouse.
Data warehousing provides the means to change raw data into information for making effective business decisions; the emphasis is on information, not data. The data warehouse is the hub for decision support data.
This paper also explains a partition algorithm to discover all requirements sets from the data warehouse using data mining. It also explains the relation between operational data, the data warehouse and data marts.
Content Overview
Introduction
Warehouse with a database
What is Data-Warehousing?
Warehousing Functions
Architecture Of Data Warehouse
What is Data Mining ?
Warehousing and Mining
Data Mining as a part of Knowledge Discovery
Goals of Data Mining and Knowledge Discovery
Compendium
Introduction
“Knowledge [no more Information] is not only power, but also has significant competitive advantage”
Organizations have lately realized that merely processing transactions and/or information faster and more efficiently no longer gives them a competitive advantage vis-à-vis their competitors in achieving business excellence. Information technology (IT) tools that are oriented towards knowledge processing can provide the edge that organizations need to survive and thrive in the current era of fierce competition. Increasing competitive pressures and the desire to leverage information technology have led many organizations to explore the benefits of newly emerging technologies, viz. "Data Warehousing and Data Mining". What is needed today is not just the latest information, updated to the nanosecond, but cross-functional information that can support decision making as an "on-line" process.
Evolution of Information Technology Tools
The evolution of information systems has been a progression from data maintenance systems to systems that transform data into "information" for use in the decision-making process. These systems supported the acquisition of information from the database of transactional data. They could not, however, directly reveal new patterns emerging in a changing scenario; the planner was expected to infer these from experience.
Figure: The Transformation of Data into Knowledge and associated tools (labels: Data, Information, Knowledge; Transaction Processing, Information Processing, Management Information, On-Line Analytical Processing Tools, Data Mining Tools).
What is Data-Warehousing?
The data warehouse makes an attempt to figure out "what we need" before we know we need it.
What is it, actually?
* A data warehouse stores current and historical data.
* This data is taken from various, perhaps incompatible, sources and stored in a uniform format.
* Several tools transform this data into meaningful business information for the purpose of comparisons, trends and forecasting.
* Data in a warehouse is not updated or changed in any way, but is only loaded and accessed later on.
* Data is organized according to subject instead of application.
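The properties listed above can be illustrated with a minimal Python sketch. The two source layouts (`billing_rows`, `crm_rows`), the field names and the `Warehouse` class are hypothetical, introduced only for illustration. The sketch shows incompatible source records being converted into one uniform, subject-oriented format and a store that is only loaded and read, never updated in place.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical operational sources with incompatible layouts (assumed for illustration).
billing_rows = [("C001", "2019-03-01", "1,250.50")]             # tuple, ISO date, amount as text
crm_rows = [{"cust": "c002", "when": "01/04/2019", "amt": 80}]  # dict, DD/MM/YYYY date


@dataclass(frozen=True)
class Fact:
    """Uniform warehouse record, organized by subject (customer sales)."""
    customer_id: str
    sale_date: date
    amount: float
    source: str


def from_billing(row):
    """Convert a billing tuple into the uniform format."""
    cust, iso_day, amount_text = row
    return Fact(cust.upper(), date.fromisoformat(iso_day),
                float(amount_text.replace(",", "")), "billing")


def from_crm(row):
    """Convert a CRM dict into the uniform format."""
    day, month, year = (int(part) for part in row["when"].split("/"))
    return Fact(row["cust"].upper(), date(year, month, day), float(row["amt"]), "crm")


class Warehouse:
    """Load-only store: records are appended and read, never changed."""

    def __init__(self):
        self._facts = []

    def load(self, facts):
        self._facts.extend(facts)

    def facts(self):
        return tuple(self._facts)  # immutable copy handed to consumers


warehouse = Warehouse()
warehouse.load(from_billing(r) for r in billing_rows)
warehouse.load(from_crm(r) for r in crm_rows)
for fact in warehouse.facts():
    print(fact)
```

Freezing the record type and exposing only `load` and `facts` is one simple way to mirror the "loaded and accessed later on" rule; a real warehouse would enforce this in the database rather than in application code.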
In general, a database is not a data warehouse unless it has the following two features:
• It collects information from a number of different, disparate sources and is the place where this disparity is reconciled, and
• It allows several different applications to make use of the same information.
Conceptually, a Data Warehouse looks like this:
Information Sources always include the core operational systems which form the backbone of day-to-day activities. It is these systems which have traditionally provided management information to support decision making.
Decision Support Tools are used to analyze the information stored in the warehouse, typically to identify trends and new business opportunities.
The Data Warehouse itself is the bridge between the operational systems and the decision support tools. It holds a copy of much of the operational system data in a logical structure which is more conducive to analysis. The Data Warehouse, which will be refreshed in scheduled bursts from operational systems and from relevant external data sources, provides a single, consistent view of corporate data, leaving operational systems unaffected.
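The refresh described above can be sketched in Python with the standard `sqlite3` module. The table names (`orders`, `sales_fact`), the schema and the `refresh` function are assumptions made for illustration; the idea shown is that each burst copies only the rows added since the previous one, and that the operational database is read but never modified.

```python
import sqlite3

# Hypothetical operational system: an in-memory orders table (assumed schema).
operational = sqlite3.connect(":memory:")
operational.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
operational.executemany("INSERT INTO orders (region, amount) VALUES (?, ?)",
                        [("north", 120.0), ("south", 75.5)])

# Separate warehouse database holding a copy structured for analysis.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_fact (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)")


def refresh(source, target):
    """Copy rows added since the previous refresh; the source is only read."""
    last_loaded = target.execute(
        "SELECT COALESCE(MAX(order_id), 0) FROM sales_fact").fetchone()[0]
    new_rows = source.execute(
        "SELECT id, region, amount FROM orders WHERE id > ? ORDER BY id",
        (last_loaded,)).fetchall()
    target.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", new_rows)
    target.commit()
    return len(new_rows)


first_burst = refresh(operational, warehouse)    # initial load
operational.execute("INSERT INTO orders (region, amount) VALUES ('north', 40.0)")
second_burst = refresh(operational, warehouse)   # next scheduled burst
total = warehouse.execute("SELECT COUNT(*), SUM(amount) FROM sales_fact").fetchone()
print(first_burst, second_burst, total)
```

In practice the bursts would be triggered by a scheduler and would also handle external data sources; the high-water-mark comparison on `order_id` is just one simple way to detect new rows.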
Data Warehouse Functions
The main function behind a data warehouse is to get the enterprise-wide data in a format that is most useful to end-users, regardless of their locations. Data warehousing is used for:
* Increasing the speed and flexibility of analysis.
* Providing a foundation for enterprise-wide integration and access.
* Improving or re-inventing business processes.
* Gaining a clear understanding of customer behavior.
Data Warehouse Architecture
Each implementation of a data warehouse is different in its detailed design (a schematic high-level view of the architecture and its components is given in the figure below), but all are characterised by a handful of key components:
• A data model to define the warehouse contents.
• A carefully designed warehouse database, whether hierarchical, relational, or multidimensional. When choosing a DBMS, keep in mind that the database management system should be powerful enough to handle huge amounts of data, running up to terabytes.
• A front end for the Decision Support System (DSS), for reporting and for structured and unstructured analysis.
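The three components can be made concrete with a small Python sketch using `sqlite3`. The star-schema layout (one fact table with two dimension tables), the table and column names, and the `sales_report` function are assumptions chosen for illustration; the paper does not prescribe a particular data model. SQLite stands in for the warehouse DBMS, and the report function plays the role of a DSS front end.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the warehouse DBMS
# Data model (assumed star schema): one fact table keyed to two dimensions.
db.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales  (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    amount     REAL
);
""")
db.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "books"), (2, "music")])
db.executemany("INSERT INTO dim_region VALUES (?, ?)", [(1, "north"), (2, "south")])
db.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
               [(1, 1, 10.0), (1, 2, 20.0), (2, 1, 5.0), (1, 1, 15.0)])


def sales_report(conn):
    """Front-end style report: total sales per category and region."""
    return conn.execute("""
        SELECT p.category, r.name, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_region  r ON r.region_id  = f.region_id
        GROUP BY p.category, r.name
        ORDER BY p.category, r.name
    """).fetchall()


report = sales_report(db)
for row in report:
    print(row)
```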
Figure: Schematic view of the Data Warehouse Architecture. Sources (Legacy Database, Operational Database, External Data Source) pass through Extract, Transform and Maintain steps into the Data Warehouse and its Metadata, which serve query and reporting tools, multi-dimensional analysis tools, other OLAP tools, and data mining tools.
Data Mining
Database mining, or data mining (DM), formally termed Knowledge Discovery in Databases (KDD), is a process that aims to use existing data to discover new facts and to uncover new relationships previously unknown even to experts thoroughly familiar with the data. It is like extracting precious metal (say, gold) and/or gems, hence the term “mining”: it is based on the filtration and assaying of a mountain of data “ore” in order to get “nuggets” of knowledge.
The data mining process is illustrated diagrammatically in the figure below.
Figure: The Data Mining Process. Data Sources (1, 2, ..., N) are assimilated into the Data Warehouse; Selected Data is drawn from it, transformed into Transformed Data, and mined to yield Extracted Information.
Data Mining and Data Warehousing
· The goal of a data warehouse is to support decision making with data.
· Data mining can be used in conjunction with a data warehouse to help with certain types of decisions.
· Data mining can be applied to operational databases with individual transactions.
· To make data mining more efficient, the data warehouse should have an aggregated or summarized collection of data.
· Data mining helps in extracting meaningful new patterns that cannot necessarily be found by merely querying or processing data or metadata in the data warehouse.
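The difference between querying and mining in the last point can be sketched in Python. The basket data and the `frequent_pairs` function are hypothetical, and simple pair counting with a minimum-support threshold is used here as a stand-in technique, not as the method of any particular tool. A query answers a question posed in advance, whereas the mining step searches all item pairs for ones that co-occur often.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions (assumed data, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

# Querying: answers a question that was posed in advance.
milk_count = sum(1 for basket in transactions if "milk" in basket)


def frequent_pairs(baskets, min_support):
    """Mining: find every item pair whose support meets the threshold."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order before counting.
        counts.update(combinations(sorted(basket), 2))
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


patterns = frequent_pairs(transactions, min_support=0.5)
print(milk_count, patterns)
```

Here the query only reports how many baskets contain milk, while the mining step surfaces the bread-and-milk pairing without being told to look for it.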
Data Mining as a Part of the Knowledge Discovery Process
· Knowledge Discovery in Databases, frequently abbreviated as KDD, typically encompasses more than data mining.
· The knowledge discovery process comprises six phases: data selection, data cleansing, enrichment, data transformation or encoding, data mining, and the reporting and display of the discovered information. The first four prepare the data for mining:
Data selection: Data about specific items or categories of items, or from stores in a specific region or area of the country, may be selected.
Data cleansing: The cleansing process may then correct invalid zip codes or eliminate records with incorrect phone prefixes.
Enrichment: Enrichment typically enhances the data with additional sources of information.
Data transformation and encoding: These may be done to reduce the amount of data.
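The preparation phases above can be sketched as a small Python pipeline. The record fields, the six-digit zip rule, the `store_size` lookup and the function names are all assumptions made for illustration; for brevity, the cleansing step here drops invalid records rather than correcting them.

```python
# Hypothetical store records (assumed fields and values, for illustration only).
records = [
    {"store": "S1", "region": "south", "zip": "522601", "item": "tea", "qty": 3},
    {"store": "S2", "region": "north", "zip": "110001", "item": "tea", "qty": 1},
    {"store": "S3", "region": "south", "zip": "52A601", "item": "rice", "qty": 8},
    {"store": "S4", "region": "south", "zip": "522002", "item": "rice", "qty": 12},
]
# Assumed external source used for enrichment.
store_size = {"S1": "small", "S3": "large", "S4": "large"}


def select(rows, region):
    """Data selection: keep only records from one region."""
    return [r for r in rows if r["region"] == region]


def cleanse(rows):
    """Data cleansing: drop records whose zip code is not six digits."""
    return [r for r in rows if len(r["zip"]) == 6 and r["zip"].isdigit()]


def enrich(rows, sizes):
    """Enrichment: add an attribute from an additional source."""
    return [{**r, "store_size": sizes.get(r["store"], "unknown")} for r in rows]


def encode(rows):
    """Transformation and encoding: reduce each record to a few coded fields."""
    return [(r["item"], r["store_size"], "high" if r["qty"] >= 5 else "low") for r in rows]


prepared = encode(enrich(cleanse(select(records, "south")), store_size))
print(prepared)
```

The output of `encode` is the compact, coded data set that a mining step would then consume.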
Goals of Data Mining and Knowledge Discovery
The goals of data mining fall into the following classes:
Prediction: Data mining can show how certain attributes within the data will behave in the future.
Identification: Data patterns can be used to identify the existence of an item, an event, or an activity.
Classification: Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters.
Optimization: One eventual goal of data mining may be to optimize the use of limited resources such as time, space, money, or materials and to maximize output variables such as sales or profits under a given set of constraints.
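Two of these goals, prediction and classification, can be illustrated with a minimal Python sketch. The sales figures, the customer attributes and the thresholds are hypothetical; a least-squares trend line is used as one simple way to predict a future value, and the classification rules are fixed by hand here, whereas a mining tool would derive them from the data.

```python
# Hypothetical monthly sales figures (assumed data, for illustration only).
months = [1, 2, 3, 4]
sales = [10.0, 12.0, 14.0, 16.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def classify(customer):
    """Classification: assign a class from a combination of two parameters."""
    if customer["visits"] >= 10 and customer["spend"] >= 500:
        return "loyal"
    if customer["visits"] >= 10 or customer["spend"] >= 500:
        return "promising"
    return "occasional"


slope, intercept = fit_line(months, sales)
forecast_month_5 = slope * 5 + intercept         # prediction of a future value
label = classify({"visits": 12, "spend": 300})   # classification of one record
print(forecast_month_5, label)
```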
Conclusion
Data warehousing provides the means to change raw data into information for making effective business decisions; the emphasis is on information, not data. The data warehouse is the hub for decision support data. Comprehensive data warehouses that integrate operational data with customer, supplier, and market information have resulted in an explosion of information. Competition requires timely and sophisticated analysis on an integrated view of the data. Data mining tools can enhance the inference process and speed up the design cycle, but cannot be a substitute for statistical and domain expertise. Data mining allows for the creation of a self-learning organization.
So the future of data warehouses lies in their accessibility from the internet. Successful implementation of a data warehouse and data mining requires a high-performance, scalable combination of hardware and software that can integrate easily with existing systems, so that customers can use the data warehouse to improve their decision making and their competitive advantage.
Last but not least, the Internet has emerged as the largest data warehouse of unstructured and free-form data. New technologies are geared towards mining this great data warehouse.
A good data warehouse provides the RIGHT data… to the RIGHT PEOPLE… at the RIGHT time… RIGHT now! While data warehousing organizes data for business analysis, the internet has emerged as the standard for information sharing.