
Decision Support and Data Warehouse Jingyi Lu

Upload: ashlie-jones

Post on 04-Jan-2016



Decision Support and Data Warehouse

Jingyi Lu


Outline

Decision Support System

OLAP vs. OLTP

What is a Data Warehouse?

Dimensional Modeling

Extract, Transform, and Load (ETL)


Decision Support System

Information technology to help the knowledge worker (executive, manager, analyst) make faster and better decisions.

– What were the sales volumes by region and product category for the last year?

– Which orders should we fill to maximize revenues?

– Will a 10% discount increase sales volume sufficiently?


Decision Support Systems

Created to facilitate the decision making process

So much information that it is difficult to extract it all from a traditional database

Need for a more comprehensive data storage facility

Data Warehouse


Decision Support Systems

Extract information from data to use as the basis for decision making

Used at all levels of the organization

Tailored to specific business areas

Ad hoc queries to retrieve and display information

Combines historical operational data with business activities


Decision Support Systems


OLAP vs. OLTP

OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions. -> Operational database

OLAP (On-line Analytical Processing) is characterized by a relatively low volume of transactions. Queries are often very complex and involve aggregations. -> Data Warehouse


OLAP vs. OLTP

Attribute | OLTP | OLAP

users | clerk, IT professional | knowledge worker

function | day-to-day operations | decision support

DB design | application-oriented | subject-oriented

data | current, up-to-date, detailed, flat relational, isolated | historical, summarized, multidimensional, integrated, consolidated

usage | repetitive | ad-hoc

access | read/write, index/hash on primary key | lots of scans

unit of work | short, simple transaction | complex query

# records accessed | tens | millions

# users | thousands | hundreds

DB size | 100 MB-GB | 100 GB-TB

metric | transaction throughput | query throughput, response
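
The contrast in "unit of work" can be made concrete with a small sketch. It uses Python's standard `sqlite3` module with an in-memory database; the `orders` table, its columns, and the sample rows are hypothetical and only serve to show a short keyed update next to a scan-and-aggregate query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical operational table, invented for this illustration.
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 45.5), ("West", 210.0)],
)

# OLTP-style unit of work: a short transaction that touches one row by primary key.
with conn:
    conn.execute("UPDATE orders SET amount = amount + 10 WHERE order_id = ?", (1,))

# OLAP-style unit of work: a read-only query that scans and aggregates many rows.
totals = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

In a real deployment the two workloads would normally run against different databases (operational vs. warehouse); a single in-memory database is used here only to keep the sketch self-contained.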


What is a Data Warehouse

The repository for the DSS is the DATA WAREHOUSE

Definition: Integrated, Subject-Oriented, Time-Variant, Nonvolatile database that provides support for decision making.


Integrated

The data warehouse is a centralized, consolidated database that integrates data derived from the entire organization

Multiple Sources

Diverse Sources

Diverse Formats


Subject-Oriented

Data is arranged and optimized to provide answers to questions from diverse functional areas

Data is organized and summarized by topic

– Sales / Marketing / Finance / Distribution / etc.


Time-Variant

The Data Warehouse represents the flow of data through time

Can contain projected data from statistical models

Data is periodically uploaded, then time-dependent data is recomputed


Nonvolatile

Once data is entered it is NEVER removed

Represents the company’s entire history

Near-term history is continually added to it

Always growing

Must support terabyte databases and multiprocessors

Read-Only database for data analysis and query processing
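
One way to picture the "load periodically, then read-only" behavior is the sketch below. It is an illustration under assumptions, not a description of any particular warehouse product: it uses SQLite's `PRAGMA query_only` through Python's `sqlite3` module, and the `fact_sales` table and its single row are hypothetical.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
# Hypothetical fact table, invented for this illustration.
warehouse.execute("CREATE TABLE fact_sales (sale_date TEXT, units INTEGER)")

# Periodic load: near-term history is appended to the warehouse.
with warehouse:
    warehouse.execute("INSERT INTO fact_sales VALUES ('2016-01-04', 5)")

# After the load window, switch this connection to query-only mode.
warehouse.execute("PRAGMA query_only = ON")

try:
    warehouse.execute("DELETE FROM fact_sales")  # analysts cannot remove history
    write_blocked = False
except sqlite3.Error:
    write_blocked = True

row_count = warehouse.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
print(write_blocked, row_count)
```

Production systems usually enforce this with user permissions rather than a per-connection setting; the pragma is simply the shortest way to show the idea in a self-contained script.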


Dimensional Modeling

Dimension

– A dimension is a data element that categorizes each item in a data set into non-overlapping regions

Facts

– A fact is a value or measurement that represents a fact about the managed entity or system

– Typically numeric values that can be aggregated


Dimensional Modeling

Database is a set of facts (points) in a multidimensional space

Fact tables

– Contain business facts or measures, plus foreign keys that refer to primary keys in the dimension tables

Dimension tables

– Each dimension table has a set of attributes, e.g., Day, Month, Year of Date

– Attributes of a dimension may be related by partial order

– Hierarchy: e.g., Day -> Month -> Year (days roll up into months, months into years)
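
Rolling facts up along a date hierarchy can be sketched in a few lines of plain Python. The fact rows and the `roll_up` helper below are hypothetical names made up for this illustration; each fact is a (date, units) pair and the date dimension is aggregated at the day, month, or year level.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows: (sale date, units sold).
facts = [
    (date(2015, 1, 5), 3),
    (date(2015, 1, 20), 4),
    (date(2015, 2, 2), 5),
    (date(2016, 1, 4), 6),
]

def roll_up(rows, level):
    """Aggregate units at the requested level of the date hierarchy."""
    keys = {
        "day": lambda d: (d.year, d.month, d.day),
        "month": lambda d: (d.year, d.month),
        "year": lambda d: (d.year,),
    }
    totals = defaultdict(int)
    for day, units in rows:
        totals[keys[level](day)] += units
    return dict(totals)

by_month = roll_up(facts, "month")
by_year = roll_up(facts, "year")
print(by_month)
print(by_year)
```

Because every day belongs to exactly one month and every month to exactly one year, the year totals can also be obtained by summing the month totals, which is what makes the hierarchy useful for aggregation.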


Example of Star Schema
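
A star schema of the kind described on the previous slide might look like the following sketch: one fact table whose foreign keys point directly at denormalized dimension tables. All table names, columns, and rows (`fact_sales`, `dim_date`, `dim_product`) are assumptions chosen for illustration, written with Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one row per member, descriptive attributes stored inline.
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Fact table: measures plus foreign keys into each dimension.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    units INTEGER,
    revenue REAL
);

INSERT INTO dim_date VALUES (1, 4, 1, 2016), (2, 5, 1, 2016);
INSERT INTO dim_product VALUES (10, 'Pen', 'Stationery'), (11, 'Lamp', 'Furniture');
INSERT INTO fact_sales VALUES (1, 10, 5, 10.0), (1, 11, 1, 30.0), (2, 10, 2, 4.0);
""")

# A typical query joins the fact table to its dimensions and aggregates the measures.
rows = conn.execute("""
    SELECT p.category, d.year, SUM(f.units), SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.category, d.year
    ORDER BY p.category
""").fetchall()
print(rows)
```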


Example of Snowflake Schema
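
A snowflake schema normalizes the dimension tables further, so a dimension can itself reference another table. In the sketch below the product category is moved out of the product dimension into its own table; the names `dim_category`, `dim_product`, and `fact_sales` and the sample rows are hypothetical, and the code uses Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The category attribute is normalized out of the product dimension.
CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name TEXT,
    category_key INTEGER REFERENCES dim_category(category_key)
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    units INTEGER
);

INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Furniture');
INSERT INTO dim_product VALUES (10, 'Pen', 1), (11, 'Pencil', 1), (12, 'Lamp', 2);
INSERT INTO fact_sales VALUES (10, 5), (11, 3), (12, 1);
""")

# Reaching the category now takes one more join than a denormalized dimension would.
units_by_category = conn.execute("""
    SELECT c.category, SUM(f.units)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_category c ON c.category_key = p.category_key
    GROUP BY c.category
    ORDER BY c.category
""").fetchall()
print(units_by_category)
```

The trade-off is the usual one for normalization: less redundancy in the dimension data in exchange for extra joins at query time.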


ETL


ETL

Extraction, Transformation, Loading (ETL)

– Gets data out of the source and loads it into the data warehouse; at its simplest, a process of copying data from one database to another

Data is extracted from an OLTP database, transformed to match the data warehouse schema and loaded into the data warehouse database

Many data warehouses also incorporate data from non-OLTP systems such as text files, legacy systems, and spreadsheets; such data also requires extraction, transformation, and loading

When defining ETL for a data warehouse, it is important to think of ETL as a process, not a physical implementation
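
The three steps can be sketched as three small functions. This is a minimal illustration under assumptions, not a recommended implementation: the source table `orders`, the warehouse table `fact_orders`, and the cleanup rules (trimmed upper-case region names, cents converted to dollars) are all invented for the example, and both databases are in-memory SQLite databases.

```python
import sqlite3

def extract(source):
    """Read raw order rows from the (hypothetical) OLTP source."""
    return source.execute("SELECT order_id, region, amount_cents FROM orders").fetchall()

def transform(rows):
    """Match the warehouse schema: normalized region names, amounts in dollars."""
    return [(oid, region.strip().upper(), cents / 100) for oid, region, cents in rows]

def load(warehouse, rows):
    """Append the transformed rows to the warehouse fact table."""
    with warehouse:
        warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)

# Hypothetical source system with inconsistently formatted data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, " east ", 1250), (2, "West", 800)],
)

# Hypothetical warehouse with its own schema.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE fact_orders (order_id INTEGER, region TEXT, amount REAL)")

load(warehouse, transform(extract(source)))
loaded = warehouse.execute("SELECT * FROM fact_orders ORDER BY order_id").fetchall()
print(loaded)
```

Keeping the steps as separate functions mirrors the point that ETL is a process: the same three stages apply whether the source is a database, a text file, or a spreadsheet.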


ETL

ETL is often a complex combination of process and technology that consumes a significant portion of the data warehouse development effort and requires the skills of business analysts, database designers, and application developers

It is not a one-time event, as new data is added to the data warehouse periodically: monthly, daily, hourly

Because ETL is an integral, ongoing, and recurring part of a data warehouse, it should be:

– Automated

– Well documented

– Easily changeable
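
Because the load recurs, an automated job needs some way to pick up only the rows added since the previous run. One common approach, assumed here rather than taken from the slides, is a high-water mark on a monotonically increasing key. The `incremental_load` helper and the sample rows are hypothetical, and plain lists stand in for the source and warehouse tables.

```python
def incremental_load(source_rows, warehouse_rows, last_loaded_id):
    """Append only source rows newer than the high-water mark; return the new mark."""
    new_rows = [row for row in source_rows if row[0] > last_loaded_id]
    warehouse_rows.extend(new_rows)
    return max((row[0] for row in new_rows), default=last_loaded_id)

warehouse_rows = []
source_rows = [(1, "East"), (2, "West")]

mark = incremental_load(source_rows, warehouse_rows, 0)     # first run loads both rows
source_rows.append((3, "North"))                            # new data arrives in the source
mark = incremental_load(source_rows, warehouse_rows, mark)  # second run loads only row 3
mark = incremental_load(source_rows, warehouse_rows, mark)  # re-running adds nothing
print(mark, warehouse_rows)
```

Making a re-run harmless is what allows the job to be scheduled monthly, daily, or hourly without manual bookkeeping.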


ETL Staging Database

ETL operations should be performed on a relational database server separate from the source databases and the data warehouse database

Creates a logical and physical separation between the source systems and the data warehouse

Minimizes the impact of the intense periodic ETL activity on source and data warehouse databases
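
The separation can be pictured with three independent databases, as in the sketch below. Everything in it is an assumption made for illustration: the `customers`, `raw_customers`, and `dim_customer` tables, the sample rows, and the cleanup rules. In-memory SQLite databases stand in for what would be separate servers.

```python
import sqlite3

source = sqlite3.connect(":memory:")     # stands in for the operational system
staging = sqlite3.connect(":memory:")    # stands in for the separate staging server
warehouse = sqlite3.connect(":memory:")  # stands in for the data warehouse

source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, " ann "), (2, "BOB"), (2, "BOB")],
)
staging.execute("CREATE TABLE raw_customers (id INTEGER, name TEXT)")
warehouse.execute("CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT)")

# 1. One quick bulk read is the only work done against the source system.
raw = source.execute("SELECT id, name FROM customers").fetchall()
with staging:
    staging.executemany("INSERT INTO raw_customers VALUES (?, ?)", raw)

# 2. The heavy cleanup (trimming, case folding, de-duplication) runs in staging.
clean = staging.execute(
    "SELECT DISTINCT id, UPPER(TRIM(name)) FROM raw_customers ORDER BY id"
).fetchall()

# 3. The warehouse only receives the finished rows.
with warehouse:
    warehouse.executemany("INSERT INTO dim_customer VALUES (?, ?)", clean)

dim_rows = warehouse.execute("SELECT * FROM dim_customer ORDER BY customer_key").fetchall()
print(dim_rows)
```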