dwh life cycle
Data Warehouse Life Cycle
Data Warehouse Defined
“A data warehouse is a collection of corporate information, derived directly from operational systems and some external data sources. Its specific purpose is to support business decisions, not business operations.”
Characteristics of a DW
- Subject-oriented data: collects all data for a subject, from different sources
- Read-only requests: loaded during off-hours, read-only during day hours
- Interactive features, ad-hoc query: flexible design to handle spontaneous user queries
- Pre-aggregated data: to improve runtime performance
- Highly denormalized data structures: dimension tables with redundant columns
Components of a Data Warehouse
[Diagram: four components, left to right]
- Source Systems
- Data Staging Area: storage (flat files, RDBMS), processing; no user query services
- DWH Servers: Data Mart 1, Data Mart 2 (dimensional; conforms to DW bus)
- End User Data Access: query tools, report writers, mining tools
Data Modeling
WHAT IS A DATA MODEL?
- A data model is an abstraction of some aspect of the real world (system).
WHY A DATA MODEL?
- Helps to visualise the business.
- A model is a means of communication.
- Models help elicit and document requirements.
- Models reduce the cost of change.
- The model is the essence of the DW architecture, on the basis of which the DW will be implemented.
What do we want to do with the data?
The model depends on what kind of data analysis we want to do.
Different data analysis techniques:
- Query and reporting: display query results
- Multidimensional analysis: analyse data content by looking at it from different perspectives
- Data mining: discover patterns and clustering attributes in data
Impact of Data Analysis Techniques on DM
Query and reporting:
- Normalized data model
- Select associated data elements
- Summarize and group by category
- Present results
- Direct table scan
- ER model, normalized or denormalized, is appropriate
Impact of Data Analysis Techniques on DM
Multidimensional analysis:
- Fast and easy access to data
- Any number of analysis dimensions in any combination
- ER will mean many joins
- Dimensional model is appropriate
Levels of modeling
- Conceptual modeling
  - Describe data requirements from a business point of view, without technical details
- Logical modeling
  - Refine conceptual models
  - Data structure oriented, platform independent
- Physical modeling
  - Detailed specification of what is physically implemented using specific technology
Conceptual Model
A conceptual model shows data through business eyes.
- All entities which have business meaning.
- Important relationships.
- Few significant attributes in the entities.
- Few identifiers or candidate keys.
Logical Model
- Replaces many-to-many relationships with associative entities.
- Defines a full population of entity attributes.
- May use non-physical entities for domains and sub-types.
- Establishes entity identifiers.
- Has no specifics for any RDBMS or configuration.
Physical Model
A physical data model may include:
- Referential integrity
- Indexes
- Views
- Alternate keys and other constraints
- Tablespaces and physical storage objects
What needs to be modeled during a data warehouse project
- STAGING AREA: YES! (maybe multiple data models are required)
- ODS: YES!
- DATA WAREHOUSE / DATA MART: YES!
Data Modeling - Techniques
Modeling techniques:
- E-R Modeling
- Dimensional Modeling
Implementation and modeling styles
Modeling versus implementation:
- Modeling: describe what should be built to non-technical folks
- Implementation: describe what is actually built to technical folks
Implementation and modeling styles
- Relational modeling
  - Use for implementation
  - Difficult for non-technical folks to understand
- Dimensional modeling
  - Use for modeling during the analysis and design phases
  - Can be implemented using other modeling styles, e.g. object-oriented, relational
Limitations of E-R Modeling
- Poor performance
- E-R models tend to be very complex and difficult to navigate.
Dimensional Modeling
- Dimensional modeling uses three basic concepts: measures, facts, dimensions.
- It is powerful in representing the requirements of the business user in the context of database tables.
- It focuses on numeric data, such as values, counts, weights, balances and occurrences.
- Must identify:
  - Business process to be supported
  - Grain (level of detail)
  - Dimensions
  - Facts
Conventions used in Dimensional modeling
- Facts
- Measures (variables)
- Dimensions
  - Dimension members
  - Dimension hierarchies
Facts
- A fact is a collection of related data items, consisting of measures and context data.
- Each fact typically represents a business item, a business transaction, or an event that can be used in analyzing the business or business process.
- Facts are measured, “continuously valued”, rapidly changing information. They can be calculated and/or derived.
Fact Table
A table that is used to store business information (measures) that can be used in mathematical equations:
- Quantities
- Percentages
- Prices
Dimensions
- A dimension is a collection of members or units of the same type of views.
- Dimensions determine the contextual background for the facts.
- Dimensions represent the way business people talk about the data resulting from a business process, e.g. who, what, when, where, why, how.
Dimension Table
Table used to store qualitative data about fact records:
- Who
- What
- When
- Where
- Why
Dimension data should be
- verbose, descriptive
- complete
- free of misspellings and impossible values
- indexed
- equally available
- documented (metadata to explain the origin and interpretation of each attribute)
Dimensional model
Visualise a dimensional model as a CUBE (a hypercube, because there can be more than 3 dimensions).
Operations for OLAP:
- Drill Down: higher level of detail
- Roll Up: summarized level of data
  (The navigation path is determined by hierarchies within dimensions.)
- Slice: cuts through the cube. Users can focus on specific perspectives.
- Dice: rotates the cube to another perspective (change the dimension)
[Figure: Drill down ... Roll up]
[Figure: Slice and Dice]
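The operations above can be sketched on a tiny in-memory cube. This is only an illustration under stated assumptions: the fact rows, the column names and the helper functions aggregate and slice_cube are hypothetical and do not come from any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, quarter, region, product, sales)
COLUMNS = ("month", "quarter", "region", "product", "sales")
FACTS = [
    ("Jan", "Q1", "North", "Widget", 100),
    ("Feb", "Q1", "North", "Widget", 120),
    ("Jan", "Q1", "South", "Gadget", 80),
    ("Apr", "Q2", "North", "Gadget", 90),
    ("Apr", "Q2", "South", "Widget", 70),
]


def aggregate(rows, by):
    """Sum the sales measure, grouped by the named dimension columns."""
    idx = [COLUMNS.index(name) for name in by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_cube(rows, column, member):
    """Keep only the rows where one dimension equals a single member."""
    i = COLUMNS.index(column)
    return [row for row in rows if row[i] == member]


# Roll up: summarize along the time hierarchy to the quarter level.
by_quarter = aggregate(FACTS, ["quarter"])
# Drill down: follow the same hierarchy to the more detailed month level.
by_month = aggregate(FACTS, ["quarter", "month"])
# Slice: fix region = North, then look at the remaining product dimension.
north_by_product = aggregate(slice_cube(FACTS, "region", "North"), ["product"])
# View the same facts from another perspective (region by product).
by_region_product = aggregate(FACTS, ["region", "product"])
```

Here the roll-up and drill-down differ only in how far down the time hierarchy (quarter, then month) the grouping goes, which is what "the navigation path is determined by hierarchies" means in practice.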
Dimensions
- Collection of members or units of the same type of views.
- Determine the contextual background for the facts.
- The parameters over which we want to perform OLAP (e.g. Time, Location/region, Customers).
- Member: a distinct name that determines a data item’s position (e.g. Time - month, quarter)
- Hierarchy: arranges members into hierarchies or levels
Hierarchies
Allow for the ‘rollup’ of data to more summarized levels.
- Time: day, month, quarter, year
Aggregates
- Aggregate tables are pre-stored summarized tables, created at a higher (coarser) level of granularity across any or all of the dimensions.
- If the existing granularity is day-wise sales, then creating a separate month-wise sales table is an example of an aggregate table.
- The use of such aggregates is the single most effective tool the data warehouse designer has to improve query performance.
- Using aggregates can increase the performance of queries several times over.
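The day-wise to month-wise example can be sketched as follows. The rows, the tuple layout and the function name build_monthly_aggregate are assumptions made for illustration; a real warehouse would normally build such a table inside the database during the load.

```python
from collections import defaultdict
from datetime import date

# Hypothetical day-grain fact rows: (sale_date, product, amount)
daily_sales = [
    (date(2024, 1, 5), "Widget", 100.0),
    (date(2024, 1, 20), "Widget", 50.0),
    (date(2024, 1, 20), "Gadget", 30.0),
    (date(2024, 2, 3), "Widget", 70.0),
]


def build_monthly_aggregate(rows):
    """Summarize day-grain sales to one row per (year, month, product)."""
    totals = defaultdict(float)
    for sale_date, product, amount in rows:
        totals[(sale_date.year, sale_date.month, product)] += amount
    return dict(totals)


# Pre-stored once; month-level queries then read this instead of daily_sales.
monthly_sales = build_monthly_aggregate(daily_sales)
```

A month-level report now scans three aggregate rows rather than every daily row, which is where the performance gain comes from.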
Measures
- A measure is a numeric attribute of a fact, representing the performance or behaviour of the business relative to the dimensions.
- The actual numbers are called variables, e.g. sales in money, sales volume, quantity supplied, supply cost, transaction amount.
- A measure is determined by combinations of the members of the dimensions and is located on facts.
[Figure: The Cube]
Types of Facts
- Additive
  - Able to add the facts along all the dimensions
  - Discrete numerical measures, e.g. retail sales in $
- Semi-additive
  - Snapshot, taken at a point in time
  - Measures of intensity
  - Not additive along the time dimension, e.g. account balance, inventory balance
  - Added and divided by the number of time periods to get a time-average
- Non-additive
  - Numeric measures that cannot be added across any dimension
  - Intensity measure averaged across all dimensions, e.g. room temperature
  - Textual facts: AVOID THEM
Common structures for Data Marts: Denormalize!
Star
- Single fact table surrounded by denormalized dimension tables
- The fact table primary key is the composite of the foreign keys (the primary keys of the dimension tables)
- The fact table contains transaction-type information.
- Many star schemas in a data mart
- Easily understood by end users; more disk storage required
[Figure: Example of Star Schema]
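A minimal star schema can be sketched with Python's built-in sqlite3 module. The table and column names (dim_time, dim_product, fact_sales) and the sample rows are assumptions chosen for illustration; they are not taken from the slides' figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized dimension tables, each with a single-column key.
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, month TEXT, year INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Fact table: its primary key is the composite of the dimension foreign keys.
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity INTEGER,
    amount REAL,
    PRIMARY KEY (time_key, product_key)
);

INSERT INTO dim_time VALUES (1, 'Jan', 2024), (2, 'Feb', 2024);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
INSERT INTO fact_sales VALUES (1, 1, 10, 100.0), (1, 2, 4, 80.0), (2, 1, 12, 120.0);
""")

# A typical star join: one join per dimension, then group by a dimension attribute.
sales_by_month = conn.execute("""
    SELECT t.month, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_key = f.time_key
    GROUP BY t.time_key, t.month
    ORDER BY t.time_key
""").fetchall()
```

Every query against the star follows this shape: the fact table in the middle, one join out to each dimension that supplies a filter or a grouping column.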
Common structures for Data Marts: Denormalize!
Snowflake
- Single fact table surrounded by normalized dimension tables
- Normalizes dimension tables to save data storage space
- Used when dimensions become very, very large
- Less intuitive; slower performance due to joins
May want to use both approaches, especially if supporting multiple end-user tools.
[Figure: Example of Snowflake schema]
Snowflake - Disadvantages
- Normalization of a dimension makes it difficult for the user to understand.
- Decreases query performance because it involves more joins.
- Dimension tables are normally smaller than fact tables, so space may not be a big enough issue to warrant snowflaking.
Keys ...
- Primary Keys
  - uniquely identify a record
- Foreign Keys
  - primary key of another table, referred to here
- Surrogate Keys
  - system-generated key for dimensions
  - key on its own has no meaning
  - integer key, less space
More Keys ...
- Smart Keys
  - primary key built out of various attributes of the dimension
  - AVOID THEM!
  - Join to the fact table should be on a single surrogate key
- Production Keys
  - DO NOT USE production-defined attributes
  - Business may reuse/change them - DW cannot!
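One way to picture surrogate keys is a lookup that hands out meaningless integers and remembers which production key each one stands for. The class name SurrogateKeyGenerator and the sample production keys are hypothetical; in practice this mapping usually lives in the dimension table itself and is maintained by the load process.

```python
class SurrogateKeyGenerator:
    """Maps production (source-system) keys to meaningless integer keys."""

    def __init__(self):
        self._by_production_key = {}
        self._next_key = 1

    def key_for(self, production_key):
        """Return the existing surrogate key, or assign the next integer."""
        if production_key not in self._by_production_key:
            self._by_production_key[production_key] = self._next_key
            self._next_key += 1
        return self._by_production_key[production_key]


customer_keys = SurrogateKeyGenerator()
first = customer_keys.key_for("CUST-0042")   # new production key
second = customer_keys.key_for("CUST-0977")  # another new production key
again = customer_keys.key_for("CUST-0042")   # already seen: same surrogate key
```

The fact table stores only the integer, so a later change to the format or meaning of the production key touches the lookup and not the fact rows.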
Basic Dimensional Modeling Techniques
- Slowly changing dimensions
- Rapidly changing small dimensions
- Large dimensions
- Rapidly changing large dimensions
- Degenerate dimensions
- Junk dimensions
Slowly Changing Dimensions
A dimension is considered a Slowly Changing Dimension when its attributes remain almost constant over time, requiring relatively minor alterations to represent the evolved state.
The Time Dimension
Attributes: Time_key, day_of_week, day_number_in_month, day_number_overall, week_number_in_year, month, quarter, fiscal_period, holiday_flag, weekday_flag, last_day_in_month_flag, season, event
Time Dimension
- A dedicated Time dimension is required because SQL date semantics and functions cannot generate several important attributes required for analytical purposes.
- Attributes like fiscal period, holidays and season depend on business calendars and cannot be derived from a date by SQL functions; weekday and weekend flags are awkward to derive consistently across databases.
Time Dimension
- Moreover, SQL date stamps occupy more space, considerably increasing the size of the fact table.
- Joins on such SQL-generated date stamps are costly and decrease query speed significantly.
Time Dimension
- The day of week (Monday, ...) is useful for creating reports that compare, for example, Monday sales to Friday sales.
- The day number in month is useful for comparing measures for the same day in each month.
- The last day in month flag is useful for performing payday analysis.
Time Dimension
- The holiday flag and season attributes are useful for holiday versus non-holiday analysis and seasonal business analysis.
- The event attribute is needed to record special days such as strike days.
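As a rough sketch, the calendar-derived attributes of the time dimension can be pre-computed once and stored as rows. The HOLIDAYS set, the full_date column and the function names are assumptions added for illustration; fiscal_period, season and event are left out because they depend on business rules that only the organization can supply.

```python
import calendar
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
HOLIDAYS = {date(2024, 1, 1)}  # hypothetical holiday calendar


def time_dimension_row(time_key, day, first_day):
    """Build one time-dimension row for a single calendar day."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    return {
        "time_key": time_key,                       # surrogate integer key
        "full_date": day.isoformat(),               # assumption: kept for reference
        "day_of_week": DAY_NAMES[day.weekday()],
        "day_number_in_month": day.day,
        "day_number_overall": (day - first_day).days + 1,
        "week_number_in_year": day.isocalendar()[1],  # ISO week number
        "month": day.month,
        "quarter": (day.month - 1) // 3 + 1,
        "holiday_flag": day in HOLIDAYS,
        "weekday_flag": day.weekday() < 5,          # Monday to Friday
        "last_day_in_month_flag": day.day == last_day,
    }


def build_time_dimension(first_day, days):
    """Generate consecutive rows starting at first_day."""
    return [
        time_dimension_row(i + 1, first_day + timedelta(days=i), first_day)
        for i in range(days)
    ]


time_dim = build_time_dimension(date(2024, 1, 1), 31)
```

With the rows stored, a report such as Monday versus Friday sales becomes a plain filter on day_of_week rather than a date calculation inside each query.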
ETVL Overview
Introduction
[Diagram: Source System 1, Source System 2, Source System 3 --> ETVL --> Staging Area --> ETVL --> Data warehouse]
ETVL: Extraction, Transformation, Validation, Load
Extraction
- Source systems (multiple source systems)
  - Flat files, Excel, legacy systems, RDBMS etc.
- Frequency of extraction
- Staging area (if any? how many?)
- Most transformations happen from source to staging
- Cleansing and data quality
  - Data integrity, de-duplication, completeness, correctness
Transformation
- Usage of tools
- Reusability of transformations
- Reusability of mappings
- Different tools
  - Informatica
  - Warehouse Builder
  - ETI
  - Sagent
  - PL/SQL scripts
Loading
- Loading frequency
- Optimized loading
  - Indexing
  - Partitioning
- Aggregation
  - Sum
  - Average
  - Max
- Update strategy
- Error handling
Synopsis
- Flat files, Excel, legacy systems, RDBMS etc.
- Implement business rules
- ODBC connectivity
- Scheduling the ETVL
- Frequency of extraction
- Staging area
- Most transformations happen from source to staging
Synopsis
- Cleansing and data quality
  - Data integrity, de-duplication, completeness, correctness
- Rejected records
- Exception handling and error log
- Optimized loading
- Re-usability
- Aggregation of data
- Update strategy
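The extraction, cleansing and rejected-record points above can be sketched as one small pass. Everything here is a hypothetical stand-in: extract returns hard-coded records instead of reading real sources, and the field names and rejection messages are invented for illustration. It is not how any of the listed tools works internally.

```python
def extract():
    """Stand-in for reading flat files, spreadsheets or an RDBMS."""
    return [
        {"order_id": "1", "customer": " alice ", "amount": "10.50"},
        {"order_id": "2", "customer": "Bob", "amount": ""},         # incomplete
        {"order_id": "1", "customer": "Alice", "amount": "10.50"},  # duplicate
        {"order_id": "3", "customer": "Carol", "amount": "abc"},    # incorrect
    ]


def transform(record):
    """Cleanse and convert one source record; raises ValueError on bad data."""
    return {
        "order_id": int(record["order_id"]),
        "customer": record["customer"].strip().title(),
        "amount": float(record["amount"]),
    }


def run_etvl(records):
    """Transform and validate records; return (rows to load, rejected records)."""
    loaded, rejected, seen = [], [], set()
    for record in records:
        try:
            row = transform(record)
        except ValueError as exc:
            # Completeness/correctness failure: log it instead of aborting.
            rejected.append((record, f"invalid value: {exc}"))
            continue
        if row["order_id"] in seen:
            # De-duplication on the business key.
            rejected.append((record, "duplicate order_id"))
            continue
        seen.add(row["order_id"])
        loaded.append(row)
    return loaded, rejected


loaded, rejected = run_etvl(extract())
```

Keeping the rejected records together with a reason gives the error log the slides call for, and lets the load continue past bad rows.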
STAGING AREA - Some Clarity
- The staging area is optional; it is used to cleanse the source data
- Accepts data from different sources
- A data model is required for the staging area
- Multiple data models may be required, for parking the different sources and for transformed data to be pushed out to the warehouse
ODS - Some Clarity
- Operational Data Store
- Optional
- Granular, detailed-level data
- May feed the warehouse (e.g. when the warehouse is aggregated)
- Usually a relational model
- May keep data for a shorter time period than the warehouse
A look at different DW architectures
[Diagram labels: Operational Data, External Data, Load Manager, Warehouse Manager, Query Manager, Detailed Information, Summary Information, Meta Data, OLAP]
[Figure: Data Warehouse Architecture - 2]
[Figure: Data Warehouse Architecture - 3]
[Figure: Data Warehouse Architecture - 4]
DW Architecture
Architecture choices depend on:
- Current infrastructure
- Business environment
- Desired management and control structure
- Resources
- Commitment ...
[Figure: Data Warehouse / data mart]
DW Architecture
Architecture choices determine:
- Where will the DW reside?
  - Centrally / locally / distributed
- Where will it be managed from?
  - Centrally / independently
3 choices:
- Global
- Independent
- Interconnected
- (or) a combination of these three
DW Architecture
Global Architecture
- related to the scope of data access and storage
- does not mean centralized
- can be physically centralized or distributed
- enterprise view of data
- time-consuming and costly to implement
[Figure: Global Architecture]
DW Architecture
Independent Architecture
- stand-alone
- controlled by a department
- minimal integration
- no global view
- very fast to implement
DW Architecture
Interconnected Architecture
- distributed
- integrated and interconnected
- gives a global view of the enterprise
- more complexity
  - who manages / controls data
  - another tier in the architecture to share common data between multiple data marts
  - a data-sharing schema across data marts
[Figure: Independent & Interconnected Architecture]
Types of Data Warehouse
- Enterprise Data Warehouse
- Data Mart
[Diagram: Enterprise Data Warehouse with three Datamarts]
Enterprise data warehouse
- Contains data drawn from multiple operational systems
- Supports time-series and trend analysis across different business areas
- Can be used as a transient storage area to clean all data and ensure consistency
- Can be used to populate data marts
- Can be used for everyday and strategic decision making
Data Mart
- Logical subset of the enterprise data warehouse
- Organized around a single business process
- Based on granular data
- May or may not contain aggregates
- Object of analytical processing by the end user
- Less expensive and much smaller than a full-blown corporate data warehouse
Distributed and Centralized Data warehouses
- DW sitting on a monolithic machine - unrealistic
- Separate machines, different OS, different DB systems - reality
- Solution
  - Share a uniform architecture to allow them to be fused coherently
Classical Architectures
- Physical data warehouse
  - Data warehouse --> data marts
  - Data marts --> data warehouse
  - Parallel data warehouse and data marts
Physical data warehouse: Data warehouse --> data marts
[Diagram: Source Data (External Data, Operational Data) --> Staging Area --> Data Warehouse --> Data Marts]
Physical data warehouse: Data marts --> data warehouse
[Diagram: Source Data (External Data, Operational Data) --> Staging Area --> Data Marts --> Data Warehouse]
Physical data warehouse: Parallel data warehouse and data marts
[Diagram: Source Data (External Data, Operational Data) --> Staging Area --> Data Warehouse and Data Marts in parallel]
DW Implementation Approaches
- Top Down
- Bottom Up
- Combination of both
Choices depend on:
- current infrastructure
- resources
- architecture
- ROI
- implementation speed
[Figure: Top Down Implementation]
[Figure: Bottom Up Implementation]
DW Implementation Approaches
Top Down
- More planning and design initially
- Involves people from different work-groups and departments
- Data marts may be built later from the Global DW
- Overall data model to be decided up-front
Bottom Up
- Can plan initially without waiting for global infrastructure
- Built incrementally
- Can be built before or in parallel with the Global DW
- Less complexity in design
DW Implementation Approaches
Top Down
- Consistent data definition and enforcement of business rules across the enterprise
- High cost, lengthy and time-consuming process
- Works well when there is a centralized IS department responsible for all H/W and resources
Bottom Up
- Data redundancy and inconsistency between data marts may occur
- Integration requires great planning
- Lower cost of H/W and other resources
- Faster pay-back