Transcript
Page 1: 1 Sharif University Data Warehouse. 2 Sharif University Objectives Need for Data Warehouse. What is Data Warehouse? Data Warehouse Properties. Data Warehouse

Sharif University

Data Warehouse

Page 2:

Objectives

• Need for a Data Warehouse.

• What is a Data Warehouse?

• Data Warehouse Properties.

• Data Warehouse Architectures.

• Data Marts.

• Corporate Information Factory.

• Extraction, Transportation, Loading and Transformation.

• Design in Data Warehouses.

• Data Warehousing Schemas.

Page 3:

Decision support questions that enterprises need to have answered

• How did sales representatives perform over different periods of time?

• What are the popular products?

• What types of customers buy what types of products?

• How much are the various internal organizations spending on what products?

Page 4:

Decision support questions (cont.)

• What were the variances between the amounts budgeted and the amounts spent?

• What positions are being filled by people with what types of background?

• What is the average pay for people within different age brackets?


Page 5:

What is a Data Warehouse?

• A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing.

• A common way of introducing data warehousing is to refer to the characteristics of a data warehouse as set forth by William Inmon:

– Subject Oriented

– Integrated

– Nonvolatile

– Time Variant

Page 6:

Data Warehouse Properties

[Figure: the four properties of a data warehouse: Subject Oriented, Integrated, Non Volatile, Time Variant.]

Page 7:

Subject Oriented

• For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales and use it to answer questions such as "Who was our best customer for this item, in this region, last year?"

• This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented.

• Data is categorized and stored by business subject rather than by application.

[Figure: Operational Systems compared with a Data Warehouse Subject Area; labels shown: Region, Time, Customer, Product, Customer Financial Information.]
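The question quoted on this slide can be sketched as a query over a sales subject area. This is only an illustration using Python's built-in sqlite3 module; the table name `sales`, its columns, and the sample rows are hypothetical and do not come from the slides.

```python
import sqlite3

# Hypothetical sales subject area: one row per sale, organized by subject
# (customer, item, region, year) rather than by source application.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, item TEXT, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        ("Alice", "widget", "north", 2003, 120.0),
        ("Bob", "widget", "north", 2003, 300.0),
        ("Alice", "widget", "north", 2003, 250.0),
        ("Bob", "widget", "south", 2003, 900.0),
        ("Carol", "gadget", "north", 2003, 999.0),
    ],
)


def best_customer(item, region, year):
    """Return (customer, total) with the highest total sales, or None."""
    return conn.execute(
        """
        SELECT customer, SUM(amount) AS total
        FROM sales
        WHERE item = ? AND region = ? AND year = ?
        GROUP BY customer
        ORDER BY total DESC
        LIMIT 1
        """,
        (item, region, year),
    ).fetchone()


print(best_customer("widget", "north", 2003))
```

Because all sales facts sit in one subject area, the question becomes a single filter-and-group query instead of a search across several applications.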

Page 8:

Integrated

Data warehouses must put data from disparate sources into a consistent format.
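As a rough sketch of what "a consistent format" can mean in practice, the following Python example maps records from different sources onto one representation. The source encodings (gender codes, date formats) and the field names are assumptions made for illustration only.

```python
from datetime import datetime

# Hypothetical source conventions: each system encodes gender and dates differently.
GENDER_CODES = {"m": "M", "male": "M", "1": "M", "f": "F", "female": "F", "0": "F"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%b %d, %Y")


def normalize_gender(raw):
    """Map a source-specific gender code to the warehouse code 'M' or 'F'."""
    return GENDER_CODES[str(raw).strip().lower()]


def normalize_date(raw):
    """Parse a date in any known source format and return ISO 8601 text."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def integrate(record):
    """Convert one source record into the warehouse's consistent format."""
    return {
        "customer_id": int(record["id"]),
        "gender": normalize_gender(record["gender"]),
        "signup_date": normalize_date(record["signup"]),
    }


print(integrate({"id": "7", "gender": "male", "signup": "15/01/2003"}))
print(integrate({"id": 8, "gender": 0, "signup": "Jan 16, 2003"}))
```

Both records end up with the same keys, the same gender codes, and the same date format, regardless of which source produced them.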

Page 9:

Time Variant (time series)

• Data is stored as a series of snapshots, each representing a period of time.

[Figure: the data warehouse holds one snapshot per period: data for January (Jan/03), data for February (Feb/03), data for March (Mar/03).]
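The snapshot idea can be sketched in a few lines of Python. The period keys, account names, and values below are hypothetical; the point is only that each period's data is kept alongside the earlier ones rather than replacing them.

```python
# snapshots[period] holds the state loaded for that period; earlier
# periods are kept, so the warehouse can answer "as of" questions.
snapshots = {}


def load_snapshot(period, balances):
    """Store a copy of the balances as the snapshot for one period."""
    snapshots[period] = dict(balances)


def history(account):
    """Return the account's value in every period, in period order."""
    return [
        (period, snap[account])
        for period, snap in sorted(snapshots.items())
        if account in snap
    ]


load_snapshot("2003-01", {"acct-1": 100, "acct-2": 50})
load_snapshot("2003-02", {"acct-1": 140, "acct-2": 50})
load_snapshot("2003-03", {"acct-1": 90})

print(history("acct-1"))
```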

Page 10:

Non Volatile

• Typically, data in the data warehouse is not updated or deleted.

[Figure: operational databases support INSERT, UPDATE, DELETE, and read operations; the warehouse database supports only load and read.]

Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.
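One way to picture the load-and-read behavior is an append-only container. The class below is a hypothetical sketch in Python, not a description of how any particular warehouse product enforces the rule.

```python
class WarehouseTable:
    """Append-only table: rows can be loaded and read, never changed."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Append a batch of rows; existing rows are left untouched."""
        self._rows.extend(dict(row) for row in rows)

    def read(self, predicate=lambda row: True):
        """Return copies of matching rows so callers cannot edit stored data."""
        return [dict(row) for row in self._rows if predicate(row)]

    def update(self, *args, **kwargs):
        raise PermissionError("warehouse data is nonvolatile: UPDATE is not allowed")

    def delete(self, *args, **kwargs):
        raise PermissionError("warehouse data is nonvolatile: DELETE is not allowed")


table = WarehouseTable()
table.load([{"order": 1, "amount": 10.0}, {"order": 2, "amount": 25.0}])
rows = table.read(lambda row: row["amount"] > 20)
rows[0]["amount"] = 0  # editing the returned copy does not affect the table
print(table.read())
```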

Page 11:

Other Characteristics of a Data Warehouse

• Summarized

• Not Normalized

• Meta Data

• Sources (both operational and external data are present)

Page 12:

Summary Data

– Provide fast access to pre-computed data

– Reduce use of

• I/O

• CPU

• Memory

– Distill from

• Source systems - lightly summarized

• Pre-calculated summaries - highly summarized

– Determine requirements early

Page 13:

Summary Data

• Average

• Maximum

• Total

• Percentage

[Figure: example summary table with dimension data (Store; Product A, Product B, Product C) and fact data (Units Sold, Sales ($)), with a Total row for each product.]
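The summary measures listed on this slide (average, maximum, total, percentage) can be pre-computed once from detail rows, as in the following Python sketch. The fact rows and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail-level fact rows: (store, product, units_sold, sales).
facts = [
    ("store-1", "Product A", 3, 30.0),
    ("store-2", "Product A", 5, 50.0),
    ("store-1", "Product B", 2, 80.0),
    ("store-2", "Product B", 1, 40.0),
]


def summarize_by_product(rows):
    """Pre-compute total, average, maximum, and percentage of sales per product."""
    sales_by_product = defaultdict(list)
    units_by_product = defaultdict(int)
    for _store, product, units, sales in rows:
        sales_by_product[product].append(sales)
        units_by_product[product] += units
    grand_total = sum(sum(values) for values in sales_by_product.values())
    return {
        product: {
            "total_units": units_by_product[product],
            "total_sales": sum(values),
            "average_sales": sum(values) / len(values),
            "maximum_sales": max(values),
            "percent_of_sales": 100.0 * sum(values) / grand_total,
        }
        for product, values in sales_by_product.items()
    }


# Built once at load time; queries then read the small summary
# instead of rescanning the detail rows.
summary = summarize_by_product(facts)
print(summary["Product A"])
```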

Page 14:

Summary Data

[Figure: a derived summary fact linked to the Time, Product, and Store dimensions.]

Page 15:

Normalization

– Normalized data contains no

• Redundancy.

• Repeating data.

• Key independent columns.

– Denormalized data often

• Improves efficiency in OLAP systems.

• Exists in data warehouse databases.

• Comprises derived or summary data.

– Star models are denormalized; snowflake models normalize the dimension tables further.

Page 16:

Meta Data (Data about Data)

Provides information about the content of the warehouse.

Meta Data includes:

• A guide to moving data to the warehouse

• Rules for summarization

• Business terms used to describe data

• Technical terminology

• Rules for data extractions

Page 17:

Data Warehouse Architectures

• Data Warehouse Architecture (Basic)

• Data Warehouse Architecture (with a Staging Area)

• Data Warehouse Architecture (with a Staging Area and Data Marts)

Page 18:

Data Warehouse Architecture (Basic)

• End users directly access data derived from several source systems through the data warehouse.

Page 19:

Data Warehouse Architecture (with a Staging Area)

You need to clean and process your operational data before putting it into the warehouse. You can do this programmatically, although most data warehouses use a staging area instead.
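A staging area can be pictured as a holding copy where rows are cleaned before any of them reach the warehouse. The Python sketch below assumes hypothetical fields (`name`, `amount`) and hypothetical cleaning rules; real staging logic depends on the sources involved.

```python
def clean(row):
    """Return a cleaned copy of a staged row, or None if it must be rejected."""
    name = row.get("name", "").strip().title()
    try:
        amount = float(row.get("amount", ""))
    except ValueError:
        return None
    if not name or amount < 0:
        return None
    return {"name": name, "amount": amount}


def run_staging(operational_rows):
    """Copy rows into staging, clean them there, and return (loaded, rejected)."""
    staging = [dict(row) for row in operational_rows]  # raw copy of the source
    warehouse, rejected = [], []
    for row in staging:
        cleaned = clean(row)
        if cleaned is None:
            rejected.append(row)
        else:
            warehouse.append(cleaned)
    return warehouse, rejected


source = [
    {"name": "  alice smith ", "amount": "10.5"},
    {"name": "bob", "amount": "not-a-number"},
    {"name": "", "amount": "3"},
]
loaded, rejected = run_staging(source)
print(loaded)
print(len(rejected))
```

Keeping rejected rows in a separate list, rather than dropping them silently, makes it possible to review and correct them at the source.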

Page 20:

Data Warehouse Architecture (with a Staging Area and Data Marts)

You may want to customize your warehouse's architecture for different groups within your organization. You can do this by adding data marts, which are systems designed for a particular line of business.

Page 21:

Data Marts

A Data Mart is a small warehouse designed for a strategic business unit or a department.

Data Mart Advantages:

• The cost is low.

• Implementation time is shorter.

• They are controlled locally rather than centrally.

• They contain less information than the data warehouse and hence have more rapid response.

• They allow a business unit to build its own DSS without relying on a centralized IS department.

Data Mart Types:

• Replicated Data Marts.

• Stand-alone Data Marts.

Page 22:

Corporate Information Factory

[Figure: Corporate Information Factory (CIF) diagram. Labels shown: Operational Systems; External, ERP, Internet, Legacy, Other; API; Data Acquisition; Operational Data Store; Data Warehouse; CIF Data Management; Data Delivery; Exploration Warehouse; Data Mining Warehouse; OLAP Data Mart; Oper Mart; DSI; TrI; Information Feedback; Information Workshop; Library & Toolbox; Workbench; Meta Data Management; Operation & Administration; Systems Management; Data Acquisition Management; Service Management; Change Management.]

Page 23:

Major Business Functions

[Figure: the Corporate Information Factory diagram, annotated with three major business functions: Business Operations, Business Intelligence, and Business Management.]

Page 24:

Operational Systems

Operational Systems are the internal and external core systems that run the day-to-day business operations. They are accessed through application program interfaces (APIs) and are the source of data for the data warehouse and operational data store.

Page 25:

External Data

External Data is any data outside the normal data collected through an enterprise’s internal applications. Generally, external data, such as demographic, credit, competitor, and financial information, is purchased by the enterprise from a vendor of such information.

Page 26:

Data Acquisition

Data Acquisition is the set of processes that capture, integrate, transform, cleanse, and load source data into the data warehouse and operational data store.

Page 27:

Data Problems

Page 28:

Data Warehouse

The Data Warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data used to support the strategic decision-making process for the enterprise.

Page 29:

Operational Data Store

The Operational Data Store is a subject-oriented, integrated, current, volatile collection of data used to support the tactical decision-making process for the enterprise.
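The contrast between a current, volatile store and a time-variant, non-volatile one can be sketched as follows. This Python example is purely illustrative: the customer id, status values, and dates are hypothetical, and real systems would use database tables rather than in-memory structures.

```python
ods = {}        # current value per key; overwritten on every change (volatile)
warehouse = []  # one row per change; rows are only appended (non-volatile)


def apply_change(customer_id, status, as_of):
    """Record a status change in both stores."""
    ods[customer_id] = status
    warehouse.append({"customer_id": customer_id, "status": status, "as_of": as_of})


apply_change(42, "trial", "2003-01-10")
apply_change(42, "active", "2003-02-01")
apply_change(42, "suspended", "2003-03-15")

print(ods[42])         # only the current state survives in the ODS
print(len(warehouse))  # the warehouse retains every change
```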

Page 30:

Comparing an Operational Data Store and a Data Warehouse

Page 31:

CIF Data Management

CIF Data Management is the set of processes that protect the integrity and continuity of the data within and across the data warehouse and operational data store. It may employ a staging area for cleansing and synchronizing data.

Page 32:

Transactional Interface

The Transactional Interface is an easy-to-use and intuitive interface for the end user to access and manipulate data in the operational data store.

Page 33:

Data Delivery

Data Delivery is the set of processes that enables end users and their supporting IT groups to filter, format, and deliver data to data marts and oper-marts.

Page 34:

Exploration Warehouse

The Exploration Warehouse is a data mart whose purpose is to provide a safe haven for exploratory and ad hoc processing. An exploration warehouse may utilize specialized technologies to provide fast response times with the ability to access the entire database.

Page 35:

Data Mining Warehouse

The Data Mining Warehouse includes tasks known as knowledge extraction, data archaeology, data exploration, data pattern processing and data harvesting.

Page 36:

OLAP Data Mart

The OLAP (online analytical processing) Data Mart is aggregated and/or summarized data that is derived from the data warehouse and tailored to support the multidimensional requirements of a given business unit or business function.
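Aggregating warehouse data along several dimensions at once can be sketched as a small cube. In this Python example the dimensions (month, product, store), the `ALL` marker, and the sample rows are assumptions for illustration; it is not a description of any specific OLAP product.

```python
from collections import defaultdict
from itertools import product

ALL = "ALL"  # marker meaning "aggregated over this dimension"

# Hypothetical warehouse rows: (month, product, store, sales).
rows = [
    ("2003-01", "A", "north", 10.0),
    ("2003-01", "B", "north", 20.0),
    ("2003-02", "A", "south", 5.0),
    ("2003-02", "A", "north", 15.0),
]


def build_cube(fact_rows):
    """Sum sales for every combination of dimension values and ALL rollups."""
    cube = defaultdict(float)
    for month, prod, store, sales in fact_rows:
        # Each row contributes to 2 x 2 x 2 = 8 cells: the detail cell
        # plus every rollup that replaces one or more dimensions with ALL.
        for key in product((month, ALL), (prod, ALL), (store, ALL)):
            cube[key] += sales
    return dict(cube)


cube = build_cube(rows)
print(cube[(ALL, "A", ALL)])            # product A across all months and stores
print(cube[("2003-01", ALL, "north")])  # January in the north store, all products
print(cube[(ALL, ALL, ALL)])            # grand total
```

Once the cube is built, any slice along the three dimensions is a dictionary lookup rather than a scan of the detail rows.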

Page 37:

Oper-Mart

The Oper-Mart is a subset of data derived from the operational data store, used in tactical analysis and usually stored in a multidimensional manner (star schema or hypercube). Oper-marts may be created in a temporary manner and dismantled when no longer needed.

Page 38:

Decision Support Interface

The Decision Support Interface is an easy-to-use, intuitive tool to enable end user capabilities such as exploration, data mining, OLAP, query, and reporting to distill information from data.

Page 39:

Meta Data Management

Meta Data Management is the set of processes for managing the information needed to promote data legibility, use, and administration.

Page 40:

Information Feedback

Information Feedback is the set of processes that transmit the intelligence gained through usage of the Corporate Information Factory to appropriate data stores.

Page 41:

Information Workshop

Information Workshop is the set of facilities that optimize use of the Corporate Information Factory by organizing its capabilities and knowledge, and then assimilating them into the business process.

Page 42:

Library and Toolbox

The Library and Toolbox is the collection of meta data and capabilities that provides information to effectively use and administer the Corporate Information Factory. The library provides the medium from which knowledge is enriched. The toolbox is a vehicle for organizing, locating, and accessing capabilities.

Page 43:

Workbench

The Workbench is a strategic mechanism for automating the integration of capabilities and knowledge into the business process.

Page 44:

Operations and Administration

Operation and Administration is the set of activities required to ensure smooth daily operations, to ensure that resources are optimized, and to ensure that growth is managed.

Page 45:

Systems Management

Systems Management is the set of processes for maintaining, versioning, and upgrading the core technology on which the data, software, and tools operate.

Page 46:

Data Acquisition Management

Data Acquisition Management is the set of processes that manage and maintain the processes used to capture source data and prepare it for loading into the data warehouse or operational data store.

Page 47:

Service Management

Service Management is the set of processes for promoting user satisfaction and productivity within the Corporate Information Factory. It includes processes that manage and maintain service level agreements, requests for change, user communications, and the data delivery mechanisms.

Page 48:

Change Management

Change Management is the set of processes coordinating modifications to the Corporate Information Factory.

Page 49

Extraction, Transportation, Loading and Transformation (ETL)

[Figure: OLTP Databases -> Staging File -> Warehouse Database]

Purchase specialist tools, or develop programs

• Extraction - select data using different methods
• Transportation - move data into the warehouse
• Loading and Transformation - validate, clean, integrate, and time stamp data
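The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling the slide has in mind: the row layout, the in-memory lists standing in for the OLTP source, the staging file and the warehouse, and all function names are assumptions made for the example. The integration step (merging several sources) is left out.

```python
from datetime import datetime, timezone

# Hypothetical OLTP rows; names and values are illustrative only.
OLTP_ROWS = [
    {"order_id": 1, "product": " Ham Pizza ", "amount": "10.00"},
    {"order_id": 2, "product": "Cheese Pizza", "amount": "15.00"},
    {"order_id": 3, "product": "", "amount": "12.00"},             # missing product
    {"order_id": 4, "product": "Sausage Pizza", "amount": "n/a"},  # bad amount
]


def extract(source_rows):
    """Extraction: select the rows of interest from the source system."""
    return [dict(row) for row in source_rows]


def transport(rows, staging_area):
    """Transportation: move the extracted rows into a staging area."""
    staging_area.extend(rows)
    return staging_area


def load_and_transform(staging_area, warehouse, loaded_at):
    """Loading and Transformation: validate, clean, and time stamp each row."""
    rejected = []
    for row in staging_area:
        product = row["product"].strip()      # clean: trim stray whitespace
        try:
            amount = float(row["amount"])     # validate: amount must be numeric
        except ValueError:
            rejected.append(row)
            continue
        if not product:                       # validate: product is required
            rejected.append(row)
            continue
        warehouse.append({
            "order_id": row["order_id"],
            "product": product,
            "amount": amount,
            "loaded_at": loaded_at,           # time stamp
        })
    return rejected


staging, warehouse = [], []
transport(extract(OLTP_ROWS), staging)
rejected = load_and_transform(staging, warehouse, datetime.now(timezone.utc))
print(len(warehouse), "loaded;", len(rejected), "rejected")
```

Rejected rows are returned rather than silently dropped, so that a data quality step can inspect them later.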

Page 50

Data Quality - Importance

• Ensure data is
  – Relevant
  – Useful
  – Quality
  – Accurate
  – Accessible
• Large, time-consuming task

[Figure: Operational systems -> Change, Clean up, Restructure -> Warehouse]

Page 51

An Example

[Figure: illustration with the labels "Customers:", "Browser: http://" and "Hollywood"; the remaining figure text is not recoverable. The two lists of sale and return records from the slide follow.]

Sale 1/2/98 12:00:01 Ham Pizza $10.00

Sale 1/2/98 12:00:02 Cheese Pizza $15.00

Sale 1/2/98 12:00:02 Anchovy Pizza $12.00

Return 1/2/98 12:00:03 Anchovy Pizza - $12.00

Sale 1/2/98 12:00:04 Sausage Pizza $11.00

Sale 1/2/98 12:00:02 Anchovy Pizza $12.00

Return 1/2/98 12:00:03 Anchovy Pizza - $12.00

Sale 1/2/98 12:00:01 Ham Pizza $10.00

Sale 1/2/98 12:00:02 Cheese Pizza $15.00

Sale 1/2/98 12:00:04 Sausage Pizza $11.00

Page 52

Extraction in Data Warehouses

• Logical Extraction Methods
  – Full Extraction
    • The data is extracted completely from the source system.
  – Incremental Extraction
    • At a specific point in time, only the data that has changed since a well-defined event back in history will be extracted.
• Physical Extraction Methods
  – Online Extraction
    • The data is extracted directly from the source system itself.
  – Offline Extraction
    • Flat files
    • Dump files
    • Redo and archive logs
    • Transportable tablespaces
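The difference between the two logical extraction methods can be shown with a small Python sketch using the standard sqlite3 module. It assumes the source table carries a last_modified column that marks the "well-defined event"; the table name, column names, sample rows and the reading of the dates as 2 January 1998 are hypothetical.

```python
import sqlite3

# Hypothetical source system with a last_modified column (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, product TEXT, last_modified TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "Ham Pizza", "1998-01-02 12:00:01"),
        (2, "Cheese Pizza", "1998-01-02 12:00:02"),
        (3, "Sausage Pizza", "1998-01-02 12:00:04"),
    ],
)


def full_extract(conn):
    """Full extraction: read every row from the source table."""
    return conn.execute(
        "SELECT order_id, product, last_modified FROM orders ORDER BY order_id"
    ).fetchall()


def incremental_extract(conn, since):
    """Incremental extraction: read only rows changed after `since`."""
    return conn.execute(
        "SELECT order_id, product, last_modified FROM orders "
        "WHERE last_modified > ? ORDER BY order_id",
        (since,),
    ).fetchall()


full_rows = full_extract(conn)
changed_rows = incremental_extract(conn, "1998-01-02 12:00:02")
print(len(full_rows), "rows in full extract;", len(changed_rows), "changed")
```

The timestamps are stored as ISO-8601 text so that plain string comparison orders them correctly. A source without such a column would need another change-tracking mechanism.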

Page 53

Changing Data

[Figure: Operational Databases -> Warehouse Database. A first time load is followed by repeated refreshes, and then by purge or archive.]

Page 54

Transportation in Data Warehouses

• Transportation Mechanisms in Data Warehouses
  – Transportation Using Flat Files
  – Transportation Through Distributed Operations
  – Transportation Using Transportable Tablespaces

Page 55

Transportation in Data Warehouses

• Transportation Using Flat Files
  – The most common method for transporting data is the transfer of flat files, using mechanisms such as FTP or other remote file system access protocols.
• Transportation Through Distributed Operations
  – Distributed queries, either with or without gateways, can be an effective mechanism for extracting data. These mechanisms also transport the data directly to the target system.
• Transportation Using Transportable Tablespaces
  – Some databases, such as Oracle and DB2, introduced an important mechanism for transporting data: transportable tablespaces. This feature is the fastest way to move large volumes of data between two databases.
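As a rough illustration of the flat-file mechanism, the Python sketch below writes extracted rows to a CSV file and reads them back on the receiving side. The file name, column names and sample rows are assumptions. The actual transfer step (FTP or another remote file protocol) is not shown, and a temporary directory stands in for both machines.

```python
import csv
import os
import tempfile

COLUMNS = ["order_id", "product", "amount"]  # hypothetical layout


def write_flat_file(rows, path):
    """Source side: dump the extracted rows to a delimited flat file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)   # header row
        writer.writerows(rows)


def read_flat_file(path):
    """Target side: read the flat file back as one dict per row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


source_rows = [(1, "Ham Pizza", "10.00"), (2, "Cheese Pizza", "15.00")]

with tempfile.TemporaryDirectory() as tmp_dir:
    flat_file = os.path.join(tmp_dir, "orders.csv")
    write_flat_file(source_rows, flat_file)
    # In a real setup the file would be copied to the warehouse host here.
    received = read_flat_file(flat_file)

print(len(received), "rows received")
```

Every value arrives as text, which is one reason a validation and type conversion step normally follows during loading.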

Page 56

Loading and Transformation in Data Warehouses

• Loading Mechanisms
  – SQL*Loader
  – External Tables
  – OCI and Direct-Path APIs
  – Export/Import
• Transformation Mechanisms
  – Transformation Using SQL
  – Transformation Using PL/SQL
  – Transformation Using Table Functions
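Transformation using SQL usually means an INSERT ... SELECT from a staging table into a warehouse table. The sketch below shows the idea with Python's sqlite3 module rather than the Oracle tools listed above. The table and column names are hypothetical, the rows reuse the pizza sales from the earlier example slide, and the date 1/2/98 is assumed to mean 2 January 1998.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical staging and warehouse tables.
    CREATE TABLE staging_sales (kind TEXT, sold_at TEXT, product TEXT, amount REAL);
    CREATE TABLE daily_sales (sale_date TEXT, product TEXT,
                              net_dollars REAL, net_units INTEGER);
    """
)
conn.executemany(
    "INSERT INTO staging_sales VALUES (?, ?, ?, ?)",
    [
        ("Sale", "1998-01-02 12:00:01", "Ham Pizza", 10.00),
        ("Sale", "1998-01-02 12:00:02", "Cheese Pizza", 15.00),
        ("Sale", "1998-01-02 12:00:02", "Anchovy Pizza", 12.00),
        ("Return", "1998-01-02 12:00:03", "Anchovy Pizza", -12.00),
        ("Sale", "1998-01-02 12:00:04", "Sausage Pizza", 11.00),
    ],
)

# Transformation in SQL: summarise transactions per day and product,
# netting returns against sales.
conn.execute(
    """
    INSERT INTO daily_sales (sale_date, product, net_dollars, net_units)
    SELECT date(sold_at),
           product,
           SUM(amount),
           SUM(CASE kind WHEN 'Sale' THEN 1 ELSE -1 END)
    FROM staging_sales
    GROUP BY date(sold_at), product
    """
)

rows = conn.execute(
    "SELECT product, net_dollars, net_units FROM daily_sales ORDER BY product"
).fetchall()
for row in rows:
    print(row)
```

The anchovy pizza sale and its return cancel out, so that product ends the day with zero net dollars and zero net units.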

Page 57

Incremental Development

• Increments
  – Focus on business functionality
  – Deliver business benefit
  – Are suited to warehouse evolution
• Once an increment is complete, the selection and scope of the next increment are defined
• Each increment follows the same phase sequence

[Figure: incremental development lifecycle. Labels: Strategy; Project and Program Management; Enterprise Technical Architecture (ETA); Incremental Development; Definition; Analysis; Design; Build; Transition to Production; Discovery.]

Page 58

Roles

– The project team: roles and responsibilities
– Common roles
  • Analyst, Database Administrator, Programmer, Tester
– Warehouse specific roles
  • DW Architect, Metadata Architect, Data Quality Administrator, DW Administrator

Page 59

Design in Data Warehouses

• Logical Design in Data Warehouses
  – Data Warehousing Schemas
    • Star
    • Snowflake
    • Constellation
• Physical Design in Data Warehouses
  – Physical Design Structures
    • Tablespaces
    • Tables and Partitioned Tables
    • Views
    • Integrity Constraints
    • Dimensions
    • Indexes and Partitioned Indexes
    • Materialized Views

Page 60

Data Warehousing Schemas

• Star
• Snowflake
• Constellation

Page 61

Star Schema

• The center of the star consists of one or more fact tables and the points of the star are the dimension tables.

Sales Fact Table: Product_id, Store_id, Item_id, Day_id, Sales_dollars, Sales_units, ...
Store Table: Store_id, District_id, ...
Item Table: Item_id, Item_desc, ...
Time Table: Day_id, Month_id, Period_id, Year_id
Product Table: Product_id, Product_desc, ...
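The star layout on this slide can be tried out with Python's sqlite3 module. The sketch below is an illustration under assumptions: the column names follow the slide, but the data types, the sample rows, the lower-case table names and the name time_dim for the Time table are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: the points of the star.
    CREATE TABLE store    (store_id INTEGER PRIMARY KEY, district_id INTEGER);
    CREATE TABLE item     (item_id INTEGER PRIMARY KEY, item_desc TEXT);
    CREATE TABLE time_dim (day_id INTEGER PRIMARY KEY, month_id INTEGER,
                           period_id INTEGER, year_id INTEGER);
    CREATE TABLE product  (product_id INTEGER PRIMARY KEY, product_desc TEXT);

    -- Fact table: the center of the star, one key per dimension plus measures.
    CREATE TABLE sales_fact (
        product_id    INTEGER REFERENCES product(product_id),
        store_id      INTEGER REFERENCES store(store_id),
        item_id       INTEGER REFERENCES item(item_id),
        day_id        INTEGER REFERENCES time_dim(day_id),
        sales_dollars REAL,
        sales_units   INTEGER
    );

    -- Hypothetical sample rows.
    INSERT INTO store VALUES (1, 10), (2, 20);
    INSERT INTO item VALUES (1, 'Large'), (2, 'Small');
    INSERT INTO time_dim VALUES (19980102, 199801, 1, 1998),
                                (19980203, 199802, 1, 1998);
    INSERT INTO product VALUES (1, 'Ham Pizza'), (2, 'Cheese Pizza');
    INSERT INTO sales_fact VALUES
        (1, 1, 1, 19980102, 10.0, 1),
        (2, 1, 1, 19980102, 15.0, 1),
        (1, 2, 2, 19980203, 20.0, 2);
    """
)

# A typical star query: join the fact table to the dimensions it needs,
# then aggregate the measures.
rows = conn.execute(
    """
    SELECT p.product_desc, t.month_id,
           SUM(f.sales_dollars), SUM(f.sales_units)
    FROM sales_fact AS f
    JOIN product  AS p ON p.product_id = f.product_id
    JOIN time_dim AS t ON t.day_id = f.day_id
    GROUP BY p.product_desc, t.month_id
    ORDER BY p.product_desc, t.month_id
    """
).fetchall()
for row in rows:
    print(row)
```

Each dimension is reached from the fact table with a single join, which is the main practical appeal of the star layout.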

Page 62

Snowflake Schema

Sales Fact Table: Item_id, Store_id, Sales_dollars, Sales_units
Store Table: Store_id, Store_desc, District_id
Item Table: Item_id, Item_desc, Dept_id
Time Table: Week_id, Period_id, Year_id
District Table: District_id, District_desc
Dept Table: Dept_id, Dept_desc, Mgr_id
Mgr Table: Dept_id, Mgr_id, Mgr_name
Product Table: Product_id, Product_desc
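In the snowflake layout on this slide, dimension tables point to further tables (Store to District, Item to Dept to Mgr), so a query may need a chain of joins. The Python sqlite3 sketch below follows the Item, Dept and Mgr chain. Column names come from the slide, while the data types, sample rows and manager names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Normalised dimension chain: item -> dept -> mgr.
    CREATE TABLE mgr  (mgr_id INTEGER PRIMARY KEY, dept_id INTEGER, mgr_name TEXT);
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_desc TEXT,
                       mgr_id INTEGER REFERENCES mgr(mgr_id));
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, item_desc TEXT,
                       dept_id INTEGER REFERENCES dept(dept_id));
    CREATE TABLE sales_fact (item_id INTEGER REFERENCES item(item_id),
                             store_id INTEGER,
                             sales_dollars REAL, sales_units INTEGER);

    -- Hypothetical sample rows.
    INSERT INTO mgr  VALUES (1, 100, 'Avery'), (2, 200, 'Blake');
    INSERT INTO dept VALUES (100, 'Pizza', 1), (200, 'Drinks', 2);
    INSERT INTO item VALUES (1, 'Ham Pizza', 100), (2, 'Cheese Pizza', 100),
                            (3, 'Cola', 200);
    INSERT INTO sales_fact VALUES (1, 1, 10.0, 1), (2, 1, 15.0, 1),
                                  (3, 1, 2.0, 1), (3, 2, 4.0, 2);
    """
)

# Sales per manager: three joins are needed to walk from the fact table
# out to the mgr table.
rows = conn.execute(
    """
    SELECT m.mgr_name, d.dept_desc, SUM(f.sales_dollars)
    FROM sales_fact AS f
    JOIN item AS i ON i.item_id = f.item_id
    JOIN dept AS d ON d.dept_id = i.dept_id
    JOIN mgr  AS m ON m.mgr_id  = d.mgr_id
    GROUP BY m.mgr_name, d.dept_desc
    ORDER BY m.mgr_name
    """
).fetchall()
for row in rows:
    print(row)
```

Compared with a star schema, the department and manager attributes are stored once instead of being repeated on every item row, at the cost of the extra joins.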

Page 63

Constellation

Inventory Fact Table: Product_id, Shelf_id, Cost_dollars, Qty_on_hand
Sales Fact Table: Item_id, Store_id, Sales_dollars, Sales_units
Warehouse Table: Warehouse_id, Warehouse_loc
Store Table: Store_id, District_id
Item Table: Item_id, Dept_id
Time Table: Week_id, Period_id, Year_id
Product Table: Product_id, Product_desc

Page 64

Summary

• Need for Data Warehouse.
• What is Data Warehouse?
• Data Warehouse Properties.
• Data Warehouse Architectures.
• Data Marts.
• Corporate Information Factory.
• Extraction, Transportation, Loading and Transformation.
• Design in Data Warehouses.
• Data Warehousing Schemas.

Page 65

Q & A

[Figure: Internal and external systems -> Data warehouse -> Decision makers]

