riadh ben messaoud - fsegn analytical... · 2019-10-24 · data warehouses & olap 4 what is...

48
Riadh Ben Messaoud

Upload: others

Post on 28-Mar-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Riadh Ben Messaoud

Page 2: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

1. The Big Picture2. Data Warehouse Philosophy3. Data Warehouse Concepts4. Warehousing Applications5. Warehouse Schema Design6. Business Intelligence Reporting7. On-Line Analytical Processing8. OLAP Applications9. Data Warehouse Implementation10. Warehousing Software

2Data Warehouses & OLAP

Page 3: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

1. The Big Picture2. Data Warehouse Philosophy3. Data Warehouse Concepts4. Warehousing Applications5. Warehouse Schema Design6. Business Intelligence Reporting7. On-Line Analytical Processing8. OLAP Applications9. Data Warehouse Implementation10. Warehousing Software

3Data Warehouses & OLAP

Page 4: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 4

What is OLAP? On-Line Analytical Processing is not a

definition… It gives no help in deciding if a product is an

OLAP tool or not!

Since late 1994, many vendors claim to have OLAP compliant products

It is not possible to rely on the vendors’ own description

Membership of the OLAP council is not a good indicator…

The Codd rules are also an unsuitable way of detecting OLAP compliance…

Page 5: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 5

What is OLAP? Researchers were forced to create their own

definition… It had to be simple, memorable and

product-independent

The FASMI test is one of the most converging definition efforts for detecting OLAP compliance

It defines the characteristics of an OLAP application in a specific way…

FASMIFast Analysis of Shared Multidimensional

Information

Page 6: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 6

FAST The system is targeted to deliver responses to

users within about 5 seconds◦ Simplest analysis ~ no more than 1 second◦ Most complicated analysis ~ no more than 20

seconds

End-users assume that a process has failed if results are not received within 30 seconds

Unless the system warns that the report will take longer time, the user will hit “Alt+Ctrl+Delete”

Page 7: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 7

FAST The OLAP response speed is not easy to

achieve… … especially when on-the-fly and ad hoc

calculations are required

Vendors resort to many techniques to achieve this goal:◦ Specialized forms of data storage,◦ Extensive pre-calculations,◦ Specific hardware requirements.

Page 8: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 8

FAST None of the existent products is fully

optimized … an area of developing technology◦ The full pre-calculation approach fails with large and

sparse data◦ Doing everything on-the-fly is much too slow with

large data

According to surveys, slow query response is consistently the most often-cited technical problem with OLAP product …

Page 9: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 9

ANALYSIS

The system can cope with any business logic and statistical analysis relevant for the application and the user

In some OLAP product some pre-programming may be needed…◦ Without having to program, it is necessary to

allow the user to define new ad hoc calculations

Page 10: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 10

ANALYSIS Analysis could include specific features like:◦ Time series analysis,◦ Cost allocations,◦ Currency translation,◦ Goal seeking,◦ Ad hoc multidimensional structural changes,◦ Non-procedural modeling,◦ Exception alerting,◦ Data mining.

These capabilities differ between products, depending on their target markets

Page 11: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 11

SHARED The system implements all the security

requirements for confidentiality If multiple write access is needed,

concurrent update locking at an appropriate level should be implemented

The system should be able to handle multiple updates in a timely and secure manner

This is a major area of weakness in many OLAP products

… assuming that OLAP applications will be read-only

Even products with multi-user read-write have crude security models

Page 12: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 12

MULTIDIMENSIONAL Is the key requirement for all OLAP

applications

The system must provide a multidimensional conceptual view including:◦ Full support for hierarchies◦ Multiple hierarchies

… This is the most logical way to analyze businesses and organizations

Page 13: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 13

INFORMATION Is all of the data and derived information

needed, wherever it is and however much is relevant for the application

The capacity if handling data differ between OLAP products◦ The largest OLAP products can hold at least a

thousand times as much as the smallest

Many considerations must be taking:◦ Data duplication, RAM required, disk space utilization,

performance, integration with DWs

Page 14: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 14

The FASMI test is a reasonable and understandable definition of the goals OLAP is meant to achieve

Researches encourage users and vendors to adopt this definition, which we hope will avoid the controversies of previous attempts

Page 15: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 15

The Codd rules In 1993, Codd et al. published a white paper

“Providing OLAP to User-Analysts: An IT Mandate”

Codd was very well known as a respected database researcher from the 1960s till the late 1980s

He is credited with being the inventor of the relational database model in 1969

Unfortunately, his OLAP rules proved to be controversial due to being vendor-sponsored, rather than mathematically based

Page 16: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 16

The Codd rules The OLAP white paper included 12 rules,

which are now well known They were followed by another 6 rules in

1995

Codd restructured the rules into four groups, calling them “features”◦ Basic Features◦ Special Features◦ Reporting Features◦ Dimension Control

Page 17: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 17

The Codd rulesBasic Features

1. Multidimensional Conceptual View◦ Few would argue with this feature◦ Codd believes this to be the central core of OLAP◦ Codd included “slice and dice” as part of this

requirement

Page 18: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 18

The Codd rulesBasic Features

2. Intuitive Data Manipulation◦ Data manipulation through direct actions on

cells in the view◦Without recourse to menus or multiple

actions, we assume that this is by using a mouse

◦ Many products fail on this, because they do not necessarily support double clicking or drag and drop

Page 19: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 19

The Codd rulesBasic Features

3. Accessibility: OLAP as a Mediator◦ OLAP engines are considered as middleware, sitting

between heterogeneous data sources and an OLAP front-end

◦ Most products can achieve this, but often with more data staging and batching than vendors like to admit

Page 20: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 20

The Codd rulesBasic Features

4. Batch Extraction vs Interpretive◦ This rule effectively required that products offer

both their own staging database for OLAP data as well as offering live access to external data

◦ Only a minority of OLAP products properly comply with it

Page 21: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 21

The Codd rulesBasic Features

5. OLAP Analysis Models◦ Codd required that OLAP products should support

all four analysis models : Categorical: parameterized static reporting ~ All

OLAP tools Exegetical: slicing and dicing with drill down ~ All

OLAP tools Contemplative: « what if? » analysis ~ Most OLAP

tools Formulaic: goal seeking models ~ Very few OLAP

tools

Page 22: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 22

The Codd rulesBasic Features

6. Client/Server Architecture◦ The OLAP server component of an OLAP product

should be sufficiently intelligent that various clients could be attached with minimum effort and programming for integration◦ Relatively few OLAP products are qualified for

this test◦ A very tough test

◦ What the Web would deliver on this issue?◦ What XML would deliver on this issue?

Page 23: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 23

The Codd rulesBasic Features

7. Transparency◦ This test, dealing with openness, is also a tough

but valid one◦ A spreadsheet user should be able to get full

values from an OLAP engine and not even be aware of where the data comes from◦ OLAP products must allow live access to

heterogeneous data sources from a full function spreadsheet add-in, with the OLAP server engine in between◦ A very few products that do fully comply with

the test

Page 24: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 24

The Codd rulesBasic Features

8. Multi-User Support◦ OLAP tools must provide concurrent access

(retrieval and update), integrity and security◦ Many OLAP applications are still read-only

◦ However, almost all vendors claim compliance!!!

Page 25: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 25

The Codd rulesSpecial Features

9. Treatment of Non-Normalized Data◦ Refers to the integration between an OLAP engine

and denormalized source data◦ Any data updates performed in the OLAP

environment should not be allowed to alter stored denormalized data in feeder systems◦ Data changes should not be allowed in what are

normally regarded as calculated cells within the OLAP database

Page 26: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 26

The Codd rulesSpecial Features

10. Storing OLAP Results: Keeping them Separate from Source Data◦ This is really an implementation rather than a

product issue◦ But few would disagree with it◦ Read-write OLAP applications should not be

implemented directly on live transaction data◦ OLAP data changes should be kept distinct from

transaction data◦ The method of data write-back used in Microsoft

Analysis Services is the best implementation of this

Page 27: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 27

The Codd rulesSpecial Features

11. Extraction of Missing Values◦ All missing values are cast in the uniform

representation defined by the Relational Model◦ Missing values are to be distinguished from

zero values◦ A few OLAP tools do break this rule

Page 28: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 28

The Codd rulesSpecial Features

12. Treatment of Missing Values◦ All missing values are to be ignored by the OLAP

analyzer regardless of their source◦ This is an almost inevitable consequence of how

multidimensional engines treat all data

Page 29: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 29

The Codd rulesReporting Features

13. Flexible Reporting◦ The dimensions can be laid out in any way that the

user requires in reports◦ Most products are capable of this in their

formal report writers◦ It is preferable that analysis and reporting facilities

be combined in one module

Page 30: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 30

The Codd rulesReporting Features

14. Uniform Reporting Performance◦ Reporting performance be not significantly

degraded by increasing the number of dimensions or database size◦ There are differences between products◦ The principal factor that affects performance is the

degree to which the calculations are performed in advance and where live calculations are done

Page 31: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 31

The Codd rulesReporting Features

15. Automatic Adjustment of Physical Level◦ OLAP system must adjust its physical schema

automatically to adapt to the type of model, data volumes and sparsity

◦ Most vendors fall far short of this noble ideal◦ Since 1996, users can benefit from it in

Microsoft Analysis Services

Page 32: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 32

The Codd rulesDimension Control

16. Generic Dimensionality◦ Each dimension must be equivalent in both its

structure and operational capabilities◦ This has proven to be one of the most controversial

Codd’s rules

◦ With a strictly purist interpretation, few products fully comply◦ If you are buying a product for a specific

application, you may safely ignore the rule

Page 33: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 33

The Codd rulesDimension Control

17. Unlimited Dimensions & Aggregation Levels◦ Technically, no product can possibly comply

with this feature◦ There is no such thing as an unlimited entity on a

limited computer◦ Few applications need more than about eight or

ten dimensions◦ Few hierarchies have more than about six

consolidation levels

◦ In practice, you can probably ignore this requirement.

Page 34: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 34

The Codd rulesDimension Control

18. Unrestricted Cross-dimensional Operations◦ All forms of calculation must be allowed across all

dimensions, not just the “measures” dimension◦ Many products which use only relational

storage are weak in this area

◦ These types of calculations are important if you are doing complex calculations

Page 35: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 35

OLAP Milestones

Page 36: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 36

OLAP Milestones

Page 37: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 37

OLAP Milestones

Page 38: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 38

OLAP Milestones

Page 39: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 39

OLAP Milestones

Page 40: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 40

OLAP Milestones

Page 41: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 41

OLAP Milestones

Page 42: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 42

OLAP Milestones

Page 43: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 43

OLAP Milestones

Page 44: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 44

OLAP Milestones

Page 45: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 45

OLAP Milestones

Page 46: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 46

OLAP Milestones

Page 47: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 47

OLAP Milestones

Page 48: Riadh Ben Messaoud - FSEGN Analytical... · 2019-10-24 · Data Warehouses & OLAP 4 What is OLAP? On-Line Analytical Processing is not a definition… It gives no help in deciding

Data Warehouses & OLAP 48

OLAP Milestones