how to design a data warehouse

June 08, 2011 How to design a DATA WAREHOUSE Linh Nguyen (Elly)


TRANSCRIPT

Page 1: How to design a DATA WAREHOUSE

June 08, 2011

How to design a DATA WAREHOUSE

Linh Nguyen (Elly)

Page 2: How to design a DATA WAREHOUSE

Agenda

What is DW?

DW architecture

How to design a DW?

Advantages of DW

Disadvantages of DW

Questions and Answers

Page 4: How to design a DATA WAREHOUSE

What is DW?

A relational database organized to hold information in a structure that best supports reporting and analysis

• Subject oriented

• Time variant

• Non-volatile: stable once loaded

• Integrated: data comes from multiple sources

Page 6: How to design a DATA WAREHOUSE

DW architecture

Page 7: How to design a DATA WAREHOUSE

DW architecture (cont)

Page 8: How to design a DATA WAREHOUSE

DW architecture (cont)

Data marts

Page 9: How to design a DATA WAREHOUSE

DW architecture (cont)

I. Slowly changing dimensions (SCD)

II. Dimension (table) types

III. Fact (table) types

IV. Schema types

Page 10: How to design a DATA WAREHOUSE

DW architecture (cont)

I. Slowly changing dimensions (SCD)

– There are six types of SCD: Types 0, 1, 2, 3, 4, and 6

– The most common SCDs are Types 1, 2, and 3

Page 11: How to design a DATA WAREHOUSE

DW architecture (cont)

I.1. SCD Type 0

– Does not manage a slowly changing dimension.

– Values remain as they were when the dimension record was first entered.

– A surrogate key is not required.

– Easy to maintain.

Page 12: How to design a DATA WAREHOUSE

DW architecture (cont)

I.2. SCD Type 1

– Does not track historical data at all.

– Overwrites old data with new data.

– A surrogate key is not required.

– Easy to maintain.
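A minimal sketch of the Type 1 overwrite described above. The `customer_dim` structure, its attribute names, and the `apply_type1` helper are hypothetical and are used only for illustration.

```python
# Hypothetical customer dimension, keyed by the natural key (customer_id).
customer_dim = {
    "C001": {"name": "Ana Lee", "city": "Hanoi"},
}


def apply_type1(dim, natural_key, changes):
    """Overwrite attributes in place; the previous values are lost."""
    dim.setdefault(natural_key, {}).update(changes)
    return dim[natural_key]


apply_type1(customer_dim, "C001", {"city": "Da Nang"})
print(customer_dim["C001"])
```

After the call there is still one record for the customer, and the old city can no longer be recovered from the dimension.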

Page 13: How to design a DATA WAREHOUSE

DW architecture (cont)

I.3. SCD Type 2

– Creates multiple records for a given natural key, each with a separate surrogate key

– A new record is inserted each time a change is made

– Should not be used if the dimensional model is subject to change
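One possible way to picture Type 2 is shown below. The column names (`start_date`, `end_date`, `is_current`) and the `apply_type2` helper are assumptions for this sketch; the slides do not prescribe a particular layout.

```python
from datetime import date


def apply_type2(rows, natural_key, new_attrs, change_date):
    """Expire the current row for natural_key and append a new version."""
    next_key = max((r["surrogate_key"] for r in rows), default=0) + 1
    for r in rows:
        if r["natural_key"] == natural_key and r["is_current"]:
            r["is_current"] = False
            r["end_date"] = change_date
    rows.append({
        "surrogate_key": next_key,   # new surrogate key per version
        "natural_key": natural_key,  # same natural key across versions
        **new_attrs,
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })
    return next_key


supplier_dim = []
apply_type2(supplier_dim, "ABC", {"state": "CA"}, date(2010, 1, 1))
apply_type2(supplier_dim, "ABC", {"state": "IL"}, date(2011, 6, 8))
for row in supplier_dim:
    print(row)
```

Fact rows loaded before the change keep pointing at the first surrogate key, so history is preserved.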

Page 14: How to design a DATA WAREHOUSE

DW architecture (cont)

I.4. SCD Type 3

– Tracks changes using separate columns

– Preserves only limited history, depending on the number of columns designated for storing historical data
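A small sketch of Type 3 with a single history column. The `previous_` column prefix and the `apply_type3` helper are hypothetical names chosen for this example.

```python
def apply_type3(row, attribute, new_value, change_date):
    """Shift the current value into the 'previous_' column, then overwrite."""
    row[f"previous_{attribute}"] = row[attribute]
    row[attribute] = new_value
    row[f"{attribute}_changed_on"] = change_date
    return row


supplier = {"supplier_key": 1, "state": "CA", "previous_state": None}
apply_type3(supplier, "state", "IL", "2011-06-08")
apply_type3(supplier, "state", "NY", "2012-01-15")
print(supplier)
```

With only one history column, the second change pushes out the original value "CA", which illustrates the limited history.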

Page 15: How to design a DATA WAREHOUSE

DW architecture (cont)

I.5. SCD Type 4

– Two tables store the data of a dimension

• One table keeps the current data

• An additional table keeps a record of changes
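The two-table idea can be sketched as follows. The `current` and `history` containers, the `replaced_on` column, and the `apply_type4` helper are assumptions made for illustration.

```python
def apply_type4(current, history, natural_key, new_attrs, change_date):
    """Move the existing current row to history, then store the new values."""
    old = current.get(natural_key)
    if old is not None:
        history.append({"natural_key": natural_key, **old,
                        "replaced_on": change_date})
    current[natural_key] = dict(new_attrs)


current_dim = {}   # one row per natural key, always the latest values
history_dim = []   # append-only record of replaced values

apply_type4(current_dim, history_dim, "ABC", {"state": "CA"}, "2010-01-01")
apply_type4(current_dim, history_dim, "ABC", {"state": "IL"}, "2011-06-08")
print(current_dim, history_dim)
```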

Page 16: How to design a DATA WAREHOUSE

DW architecture (cont)

I.6. SCD Type 6 (Hybrid)

– Type 6 combines the approaches of types 1, 2 and 3 (1+2+3=6)
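A rough sketch of how the three approaches might be combined. The column names `historical_state` and `current_state` and the `apply_type6` helper are assumptions; real implementations vary.

```python
def apply_type6(rows, natural_key, new_state, change_date):
    """Add a new version row and refresh current_state on all versions."""
    next_key = max((r["surrogate_key"] for r in rows), default=0) + 1
    for r in rows:
        if r["natural_key"] != natural_key:
            continue
        r["current_state"] = new_state   # Type 1 style overwrite
        if r["is_current"]:
            r["is_current"] = False      # Type 2 style expiry
            r["end_date"] = change_date
    rows.append({
        "surrogate_key": next_key,
        "natural_key": natural_key,
        "historical_state": new_state,   # Type 3 style extra column
        "current_state": new_state,
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })


dim = []
apply_type6(dim, "ABC", "CA", "2010-01-01")
apply_type6(dim, "ABC", "IL", "2011-06-08")
for row in dim:
    print(row)
```

Every version row carries both the value that applied at the time and the latest value, so reports can group by either.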

Page 17: How to design a DATA WAREHOUSE

DW architecture (cont)

II. Dimension types

– Conformed dimension

– Junk dimension

– Degenerate dimension

– Role-playing dimension

Page 18: How to design a DATA WAREHOUSE

DW architecture (cont)

II.1. Conformed dimension

– A dimension table that can be shared by multiple fact tables

– Has a single meaning and content throughout the data warehouse

Page 19: How to design a DATA WAREHOUSE

DW architecture (cont)

II.2. Junk dimension

– A collection of miscellaneous attributes that are unrelated to any particular dimension and do not fit into a tight star schema. It is used to:

• Reduce the number of dimension tables

• Decrease the number of columns in fact tables
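As an illustration, a junk dimension can be built from every combination of a few low-cardinality flags. The flag names below are invented for this sketch.

```python
from itertools import product

# Hypothetical miscellaneous attributes that belong to no other dimension.
flags = {
    "payment_type": ["cash", "card"],
    "is_gift": [False, True],
    "is_rush": [False, True],
}
names = list(flags)

# One junk-dimension row per combination, each with its own surrogate key.
junk_dim = [
    dict(zip(names, combo), junk_key=i)
    for i, combo in enumerate(product(*flags.values()), start=1)
]

# The fact table stores a single junk_key instead of three flag columns.
lookup = {tuple(row[n] for n in names): row["junk_key"] for row in junk_dim}
print(len(junk_dim), lookup[("card", True, False)])
```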

Page 20: How to design a DATA WAREHOUSE

DW architecture (cont)

II.3. Degenerate dimension

– Any value in the fact table that does not join to a dimension is considered either a degenerate dimension or a measure

– A degenerate dimension can be rolled up from facts within SSAS

Page 21: How to design a DATA WAREHOUSE

DW architecture (cont)

II.4. Role-playing dimension

– A dimension that can play different roles in a fact table

• For example, a Date dimension can be used for Order date, Shipping date, Due date...
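The Date example can be sketched with Python's built-in `sqlite3` module. The table and column names (`dim_date`, `fact_order`, `order_date_key`, `ship_date_key`) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT);
CREATE TABLE fact_order (
    order_id INTEGER PRIMARY KEY,
    order_date_key INTEGER REFERENCES dim_date(date_key),
    ship_date_key INTEGER REFERENCES dim_date(date_key),
    amount REAL
);
INSERT INTO dim_date VALUES (20110608, '2011-06-08'), (20110610, '2011-06-10');
INSERT INTO fact_order VALUES (1, 20110608, 20110610, 99.5);
""")

# The same dimension table is joined twice, once per role, under two aliases.
row = conn.execute("""
SELECT f.order_id, od.full_date AS order_date, sd.full_date AS ship_date
FROM fact_order f
JOIN dim_date od ON od.date_key = f.order_date_key
JOIN dim_date sd ON sd.date_key = f.ship_date_key
""").fetchone()
print(row)
```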

Page 22: How to design a DATA WAREHOUSE

DW architecture (cont)

III. Fact types

– A fact table consists of the measurements, metrics or facts of a business process (a fact table that contains no measures or facts is a factless fact table)

– Measure types:

• Additive: can be summed across all of the dimensions

• Semi-additive: can be summed across some of the dimensions

• Non-additive: cannot be summed across any of the dimensions

– Fact data can be at detail level or aggregated

– Fact types:

• Transactional: the most detailed data held in the table

• Periodic snapshot: takes a picture of the data from the transactional table at a moment or over a period of time

• Accumulating snapshot: stores the activity of a process from beginning to end
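The difference between additive and semi-additive measures can be seen in a small periodic snapshot. The account data below is made up for this sketch: `deposits` is treated as additive, while `balance` can be summed across accounts but not across days.

```python
# Hypothetical daily snapshot of two bank accounts.
snapshots = [
    {"day": 1, "account": "A", "balance": 100, "deposits": 20},
    {"day": 1, "account": "B", "balance": 50, "deposits": 0},
    {"day": 2, "account": "A", "balance": 130, "deposits": 30},
    {"day": 2, "account": "B", "balance": 50, "deposits": 0},
]

# Additive: deposits can be summed across every dimension (day and account).
total_deposits = sum(r["deposits"] for r in snapshots)

# Semi-additive: balance is summed across accounts, for one day only.
last_day = max(r["day"] for r in snapshots)
closing_balance = sum(r["balance"] for r in snapshots if r["day"] == last_day)

# Summing balance across days as well gives a number with no business meaning.
naive_balance_sum = sum(r["balance"] for r in snapshots)

print(total_deposits, closing_balance, naive_balance_sum)
```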

Page 23: How to design a DATA WAREHOUSE

DW architecture (cont)

IV. Schema types

– Star schema

– Snowflake schema

Page 24: How to design a DATA WAREHOUSE

DW architecture (cont)

IV.1. Star schema Design

Page 25: How to design a DATA WAREHOUSE

DW architecture (cont)

IV.1. Advantages of Star schema

– Very easy to understand, even for non-technical business managers

– Provides better performance and shorter query times

– Easily extensible, so it handles future changes easily
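A minimal star schema sketch using the built-in `sqlite3` module: one fact table joined directly to denormalized dimension tables. All table names, column names, and data are hypothetical.

```python
import sqlite3

star = sqlite3.connect(":memory:")
star.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    store_key INTEGER REFERENCES dim_store(store_key),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
INSERT INTO dim_store VALUES (1, 'Hanoi'), (2, 'Hue');
INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 5.0), (2, 1, 20.0);
""")

# Each dimension is one join away from the fact table.
totals = star.execute("""
SELECT p.category, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY p.category
ORDER BY p.category
""").fetchall()
print(totals)
```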

Page 26: How to design a DATA WAREHOUSE

DW architecture (cont)

IV.2. Snowflake schema Design

Page 27: How to design a DATA WAREHOUSE

DW architecture (cont)

IV.2. Advantages of Snowflake schema

– Save storage space

– Reduce processing time in some cases
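For comparison, a snowflake sketch in which the product dimension is normalized into a separate category table. The names are again hypothetical. The category text is stored once per category rather than once per product, at the cost of an extra join.

```python
import sqlite3

snow = sqlite3.connect(":memory:")
snow.executescript("""
CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES dim_category(category_key)
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_category VALUES (1, 'Books'), (2, 'Games');
INSERT INTO dim_product VALUES (10, 'Atlas', 1), (11, 'Novel', 1), (12, 'Chess', 2);
INSERT INTO fact_sales VALUES (10, 10.0), (11, 5.0), (12, 20.0);
""")

# Reaching the category now takes two joins: fact -> product -> category.
by_category = snow.execute("""
SELECT c.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
JOIN dim_category c ON c.category_key = p.category_key
GROUP BY c.category_name
ORDER BY c.category_name
""").fetchall()
print(by_category)
```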

Page 29: How to design a DATA WAREHOUSE

How to design a DW?

Analyze requirements***

Identify the Grain

Identify the Dimensions

Identify the Facts

Page 30: How to design a DATA WAREHOUSE

How to design a DW? (cont)

• Recommendations

– Use a surrogate key instead of the natural key; a PK is required in SSAS

– Do not create foreign keys with the check option

– Avoid NULL surrogate key values in the fact table

• Assign a special surrogate key instead

– Pay close attention to many-to-many relationships
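The recommendation to avoid NULL surrogate keys could look like this during fact loading. The reserved key value `-1`, the `customer_keys` mapping, and the `to_fact_row` helper are assumptions for this sketch.

```python
UNKNOWN_KEY = -1  # hypothetical reserved surrogate key for "unknown" members

# Hypothetical mapping from natural key to surrogate key, plus the
# special "Unknown" member that the reserved key points at.
customer_keys = {"C001": 1, "C002": 2}
customer_dim = {UNKNOWN_KEY: {"name": "Unknown"}, 1: {"name": "Ana"}, 2: {"name": "Binh"}}


def to_fact_row(source_row):
    """Resolve the surrogate key, falling back to the special key, never NULL."""
    key = customer_keys.get(source_row.get("customer_id"), UNKNOWN_KEY)
    return {"customer_key": key, "amount": source_row["amount"]}


fact_rows = [
    to_fact_row({"customer_id": "C001", "amount": 10.0}),
    to_fact_row({"customer_id": "C999", "amount": 7.5}),  # not in the dimension
    to_fact_row({"customer_id": None, "amount": 3.0}),    # missing in source
]
print(fact_rows)
```

Every fact row then joins to some dimension row, so inner joins do not silently drop facts.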

Page 31: How to design a DATA WAREHOUSE

How to design a DW? (cont)

• Demo: star schema

• Demo: snowflake schema

Page 33: How to design a DATA WAREHOUSE

Advantages of DW

• Enhances end-user access to a wide variety of data.

• Increases data consistency.

• Increases productivity and decreases computing costs.

Page 34: How to design a DATA WAREHOUSE

Advantages of DW (cont)

• Combines data from different sources in one place.

• Can record historical information for source tables that are not set up to save an update history.

Page 36: How to design a DATA WAREHOUSE

Disadvantages of DW

• Extracting, cleaning and loading data can be time-consuming.

• The scope of a data warehousing project might grow.

• Compatibility problems with systems already in place, e.g. a transaction processing system.

Page 37: How to design a DATA WAREHOUSE

Disadvantages of DW (cont)

• Training must be provided to end-users, who may end up not using the data warehouse.

• Security could develop into a serious issue, especially if the data warehouse is web accessible.

• A data warehouse is a HIGH maintenance system.

Page 39: How to design a DATA WAREHOUSE

QA

Thank you for your attendance

Page 40: How to design a DATA WAREHOUSE

Reference

The Data Warehouse Toolkit book by Ralph Kimball

Building the Data Warehouse book by W. H. Inmon

Data Warehouse Design Solutions by Christopher Adamson and Michael Venerable

http://www.cognosforums.com/forums/

http://dbms.knowledgehills.com/

http://www.dwinfocenter.org/index.html

http://en.wikipedia.org/wiki/Slowly_changing_dimension

http://en.wikipedia.org/wiki/Fact_table

http://www.1keydata.com/datawarehousing/fact-table-types.html

http://dbms.knowledgehills.com/Dimensional-Modeling-(DM)-tutorial-with-OLAP-and-data-warehouse-design-concepts/a32p1