lecture olap
Post on 05-Apr-2018
-
8/2/2019 Lecture OLAP
1/83
ON-LINE ANALYTICAL PROCESSING
-
04/23/12 2
Lecture Objectives
What is OLAP
Need for OLAP
Features & functions of OLAP
Different OLAP models
OLAP implementations
-
OLAP
Term coined in the early 1990s (E. F. Codd, 1993)
Main goal: support ad-hoc but complex querying by business analysts
Extends worksheet-like analysis to work with huge amounts of data in a DW
-
Demand for OLAP
2 approaches to developing EDWs
In both approaches, Data Marts rest on the Dimensional Model
Data Marts are sufficient for basic data analysis
Users need to go beyond such basic analysis
-
Demand for OLAP
Need for Multidimensional Analysis
Fast Access & Powerful Calculations
Limitations of other analysis methods like:
  SQL
  Spreadsheets
  Report Writers
-
Demand for OLAP
Traditional tools (report writers, query products, spreadsheets, and language interfaces) do not meet user expectations for multidimensional analysis with complex calculations.
Tools used with OLTP and basic DW environments are not up to the task.
-
OLAP is the Answer!
OLAP is a category of software technology that enables analysts, managers, and
executives to gain insight into data through fast, consistent, interactive access to
a wide variety of possible views of information that has been transformed from
raw data to reflect the real dimensionality of the enterprise as understood by the user.
-
What is OLAP?
OLAP software provides the ability to analyze large volumes of information to
improve decision making at all levels of an organization.
-
What is OLAP?
A wide spectrum of multidimensional analysis involving intricate calculations and
requiring fast response times.
-
What is OLAP?
The name OLAP has two immediate consequences:
the on-line part requires that answers to queries be fast;
the analytical part hints that the queries themselves are complex,
i.e., complex questions with fast answers!
-
Why a separate OLAP tool?
o Empowers end users to do their own analysis
o Frees up IS backlog of report requests
o Ease of use
o No knowledge of tables or SQL required
-
OLAP Characteristics
o Multi-user environment
o Client-server architecture
o Rapid response to queries, regardless of DB size and complexity
-
Data Warehouse & OLAP
o OLAP is a software system that works on top of a DW
o A front-end tool for a DW
o Information delivery system for the DW
o Complements the information delivery capacities of a DW
-
Why is OLAP useful?
Facilitates multidimensional data analysis by pre-computing aggregates across many sets of dimensions
Provides for:
  Greater speed and responsiveness
  Improved user interactivity
-
The OLAP Market
-
Warehouse Models & Operators
Data Models
  relations
  stars & snowflakes
  cubes
Operators
  slice & dice
  roll-up, drill down
  pivoting
  other
-
Data Warehouses
A data warehouse is based on a multidimensional data model, which views data in the form of a data cube
A data cube allows data to be modeled and viewed in multiple dimensions
In data warehousing literature, an n-D base cube is called a base cuboid. The topmost 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube.
-
Lattice of Cuboids
0-D (apex) cuboid: all
1-D cuboids: time; item; location; supplier
2-D cuboids: time,item; time,location; time,supplier; item,location; item,supplier; location,supplier
3-D cuboids: time,item,location; time,item,supplier; time,location,supplier; item,location,supplier
4-D (base) cuboid: time,item,location,supplier
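The lattice can be enumerated mechanically: each cuboid is one subset of the dimensions, so n dimensions give 2^n cuboids. A minimal Python sketch, in which the function name `cuboid_lattice` is only illustrative:

```python
from itertools import combinations

def cuboid_lattice(dimensions):
    """Group every subset of the dimensions by its size (0-D up to n-D)."""
    return {
        k: [tuple(c) for c in combinations(dimensions, k)]
        for k in range(len(dimensions) + 1)
    }

lattice = cuboid_lattice(["time", "item", "location", "supplier"])
for k, cuboids in lattice.items():
    print(f"{k}-D cuboids ({len(cuboids)}):", cuboids)

# Total number of cuboids for 4 dimensions is 2**4.
total = sum(len(cuboids) for cuboids in lattice.values())
print("total cuboids:", total)
```

The empty tuple at level 0 plays the role of the apex cuboid ("all"), and the single tuple at level 4 is the base cuboid.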
-
CUBE
Fact table view:
sale  prodId  storeId  date  amt
      p1      c1       1     12
      p2      c1       1     11
      p1      c3       1     50
      p2      c2       1     8
      p1      c1       2     44
      p1      c2       2     4
Multi-dimensional cube (dimensions = 3):
day 1   c1   c2   c3
p1      12        50
p2      11   8
day 2   c1   c2   c3
p1      44   4
p2
-
Aggregates
sale  prodId  storeId  date  amt
      p1      c1       1     12
      p2      c1       1     11
      p1      c3       1     50
      p2      c2       1     8
      p1      c1       2     44
      p1      c2       2     4
Add up amounts for day 1
In SQL: SELECT sum(amt) FROM SALE WHERE date = 1
Result: 81
-
Aggregates
sale  prodId  storeId  date  amt
      p1      c1       1     12
      p2      c1       1     11
      p1      c3       1     50
      p2      c2       1     8
      p1      c1       2     44
      p1      c2       2     4
Add up amounts by day
In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date
ans  date  sum
     1     81
     2     48
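Both aggregate queries can be checked against the six-row fact table. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the choice of SQLite and the column types are assumptions made for the example, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)"
)
rows = [
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8), ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", rows)

# Add up amounts for day 1.
day1_total = conn.execute(
    "SELECT sum(amt) FROM sale WHERE date = 1"
).fetchone()[0]

# Add up amounts by day.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date"
).fetchall()

print(day1_total)  # 81
print(by_day)      # [(1, 81), (2, 48)]
```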
-
Aggregates
Operators: sum, count, max, min, median, avg
"Having" clause
Using dimension hierarchy
  average by region (within store)
  maximum by month (within date)
-
Cube Aggregation
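Example: computing sums
day 1   c1   c2   c3
p1      12        50
p2      11   8
day 2   c1   c2   c3
p1      44   4
p2
Summed over date:
        c1   c2   c3
p1      56   4    50
p2      11   8
Summed over date and product:
        c1   c2   c3
sum     67   12   50
Summed over date and store:
        sum
p1      110
p2      19
Summed over everything: 129
. . .
rollup: from detailed cells toward the totals
drill-down: the reverse direction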
-
Cube Operators
day 1   c1   c2   c3
p1      12        50
p2      11   8
day 2   c1   c2   c3
p1      44   4
p2
Summed over date:
        c1   c2   c3
p1      56   4    50
p2      11   8
sum     67   12   50
        sum
p1      110
p2      19
        129
. . .
Cell notation sale(store, product, date), where * means "all":
sale(c1,*,*) = 67
sale(c2,p2,*) = 8
sale(*,p1,*) = 110
sale(*,*,*) = 129
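The sale(...) notation can be read as a lookup in which "*" matches every value of a dimension. A small Python sketch over the same six facts, using the argument order (store, product, date) seen in the examples on the slide; the helper itself is illustrative, not part of any OLAP product:

```python
# Fact rows: (prodId, storeId, date, amt), copied from the fact table.
FACTS = [
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8), ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]

def sale(store="*", product="*", day="*"):
    """Sum amt over all facts matching the coordinates; '*' matches anything."""
    return sum(
        amt
        for prod, st, d, amt in FACTS
        if store in ("*", st) and product in ("*", prod) and day in ("*", d)
    )

print(sale("c1", "*", "*"))   # 67
print(sale("c2", "p2", "*"))  # 8
print(sale("*", "p1", "*"))   # 110
print(sale("*", "*", "*"))    # 129
```

A real OLAP server would read these values from precomputed aggregates rather than rescanning the facts on every call.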
-
Extended Cube
day 1   c1   c2   c3   *
p1      12        50   62
p2      11   8         19
*       23   8    50   81
day 2   c1   c2   c3   *
p1      44   4         48
p2
*       44   4         48
*       c1   c2   c3   *
p1      56   4    50   110
p2      11   8         19
*       67   12   50   129
Example: sale(*,p2,*) = 19
-
Aggregation Using Hierarchies
day 1   c1   c2   c3
p1      12        50
p2      11   8
day 2   c1   c2   c3
p1      44   4
p2
Hierarchy: customer -> region -> country
(customer c1 in Region A; customers c2, c3 in Region B)
Rolled up to region:
        region A   region B
p1      56         54
p2      11         8
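Rolling up along a hierarchy amounts to replacing each member by its parent and summing. A minimal Python sketch using the customer-to-region mapping stated on the slide; the names `REGION_OF` and `rollup_to_region` are hypothetical:

```python
from collections import defaultdict

# Fact rows: (prodId, storeId, date, amt); the slide treats c1..c3 as customers.
FACTS = [
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8), ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]
REGION_OF = {"c1": "A", "c2": "B", "c3": "B"}

def rollup_to_region(facts, region_of):
    """Sum amounts per (product, region), aggregating over date."""
    totals = defaultdict(int)
    for prod, customer, _day, amt in facts:
        totals[(prod, region_of[customer])] += amt
    return dict(totals)

by_region = rollup_to_region(FACTS, REGION_OF)
print(by_region[("p1", "A")], by_region[("p1", "B")])  # 56 54
print(by_region[("p2", "A")], by_region[("p2", "B")])  # 11 8
```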
-
Pivoting
Fact table view:
sale  prodId  storeId  date  amt
      p1      c1       1     12
      p2      c1       1     11
      p1      c3       1     50
      p2      c2       1     8
      p1      c1       2     44
      p1      c2       2     4
Multi-dimensional cube:
day 1   c1   c2   c3
p1      12        50
p2      11   8
day 2   c1   c2   c3
p1      44   4
p2
Pivoted on product and store (summed over date):
        c1   c2   c3
p1      56   4    50
p2      11   8
-
Cube Aggregates Lattice
Lattice of group-bys:
city, product, date
city, product; city, date; product, date
city; product; date
all
Example aggregates:
(city, product, date)
day 1   c1   c2   c3
p1      12        50
p2      11   8
day 2   c1   c2   c3
p1      44   4
p2
(city, product)
        c1   c2   c3
p1      56   4    50
p2      11   8
(city)
        c1   c2   c3
sum     67   12   50
(all): 129
Use a greedy algorithm to decide what to materialize
-
Dimension Hierarchies
Hierarchy: all -> state -> city
cities  city  state
        c1    CA
        c2    NY
-
Dimension Hierarchies
city, product, date
city, product; city, date; product, date
city; product; date
all
With the state level added:
state, product, date
state, product; state, date
state
(not all arcs shown...)
-
Interesting Hierarchy
all
years
quarters
months
days
weeks
Conceptual dimension table:
time  day  week  month  quarter  year
      1    1     1      1        2000
      2    1     1      1        2000
      3    1     1      1        2000
      4    1     1      1        2000
      5    1     1      1        2000
      6    1     1      1        2000
      7    1     1      1        2000
      8    2     1      1        2000
-
SAMPLE CUBE
[Figure: 3-D sales cube. Date axis: 1Qtr, 2Qtr, 3Qtr, 4Qtr, sum. Product axis: TV, VCR, PC, sum. Country axis: U.S.A, Canada, Mexico, sum.]
Annotated cells:
Total annual sales of TV in U.S.A.; of PC in U.S.A.; of VCR in U.S.A.
Total Q1 sales in U.S.A; in Canada; in Mexico; in all countries
Total Q2 sales in all countries
Total sales in U.S.A; in Canada; in Mexico
TOTAL SALES
-
OLAP Operations
Roll-Up
Drill-Down
Slice & Dice
Pivot
Drill-Across
Drill-Through
-
OLAP Operations
Roll up (drill-up): summarize data
  by climbing up a hierarchy or by dimension reduction
Drill down (roll down): reverse of roll-up
  from higher-level summary to lower-level summary or detailed data, or introducing new dimensions
Slice and dice:
  project and select
Pivot (rotate):
  reorient the cube, visualization, 3D to series of 2D planes
Other operations
  drill across: involving (across) more than one fact table
  drill through: through the bottom level of the cube to its back-end relational tables (using SQL)
-
Example Schema
Fact Table
  Sales (Store_id, Product_id, Time_id, Sales_amt)
Dimension Tables
  Store (Store_id, city, state, region, country)
  Product (Product_id, name, category)
  Day (Time_id, month, quarter, year)
Hierarchies
  Store -> City -> State -> Region -> Country
  Product -> Category
  Day -> Month -> Quarter -> Year
-
Drill-Down
Store level:
SELECT S.product_id, S.store_id, SUM(S.sales_amt)
FROM Sales S
GROUP BY S.store_id, S.product_id

State level:
SELECT S.product_id, St.state, SUM(S.sales_amt)
FROM Sales S, Store St
WHERE St.store_id = S.store_id
GROUP BY S.product_id, St.state

City level (drill-down from State to City):
SELECT S.product_id, St.city, SUM(S.sales_amt)
FROM Sales S, Store St
WHERE St.store_id = S.store_id
GROUP BY S.product_id, St.city
-
Drill-Down
-
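Rolling Up
SELECT S.product_id, St.city, SUM(S.sales_amt) AS sales_amt
INTO City_sales
FROM Sales S, Store St
WHERE St.store_id = S.store_id
GROUP BY S.product_id, St.city

SELECT T.product_id, C.state, SUM(T.sales_amt)
FROM City_sales T, (SELECT DISTINCT city, state FROM Store) C
WHERE C.city = T.city
GROUP BY T.product_id, C.state

(Store has one row per store, so the city-to-state mapping is de-duplicated before the join; otherwise a city with several stores would be counted more than once.)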
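The roll-up from city totals to state totals can be tried on a toy database. The sketch below uses Python's sqlite3 module; the store and sales rows are made up for illustration, SQLite writes SELECT ... INTO as CREATE TABLE ... AS SELECT, and city names are assumed to be unique across states. It reduces Store to one row per city before the join, so that a city with several stores is not counted more than once, and then compares the result with aggregating the fact table directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Store (store_id INTEGER PRIMARY KEY, city TEXT, state TEXT);
CREATE TABLE Sales (store_id INTEGER, product_id TEXT, time_id INTEGER,
                    sales_amt INTEGER);
-- Hypothetical data: two stores share one city.
INSERT INTO Store VALUES (1, 'San Jose', 'CA'), (2, 'San Jose', 'CA'),
                         (3, 'Albany', 'NY');
INSERT INTO Sales VALUES (1, 'p1', 1, 10), (2, 'p1', 1, 20),
                         (3, 'p1', 1, 5), (1, 'p2', 2, 7);
-- City-level summary table.
CREATE TABLE City_sales AS
  SELECT S.product_id, St.city, SUM(S.sales_amt) AS sales_amt
  FROM Sales S JOIN Store St ON St.store_id = S.store_id
  GROUP BY S.product_id, St.city;
""")

# Roll up the city-level summary to state level.
rolled_up = conn.execute("""
  SELECT T.product_id, C.state, SUM(T.sales_amt)
  FROM City_sales T
  JOIN (SELECT DISTINCT city, state FROM Store) C ON C.city = T.city
  GROUP BY T.product_id, C.state
  ORDER BY T.product_id, C.state
""").fetchall()

# Same answer computed directly from the fact table.
direct = conn.execute("""
  SELECT S.product_id, St.state, SUM(S.sales_amt)
  FROM Sales S JOIN Store St ON St.store_id = S.store_id
  GROUP BY S.product_id, St.state
  ORDER BY S.product_id, St.state
""").fetchall()

print(rolled_up)
print(rolled_up == direct)
```

Rolling up from the smaller City_sales table gives the same totals as the direct query while reading fewer rows, which is the reason for keeping such summaries.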
-
Pivoting
When we view the data as a multidimensional cube and group on a subset of its axes, we are said to be performing a pivot on those axes
- Pivoting on dimensions D1, ..., Dk of a cube with dimensions D1, ..., Dn means that we GROUP BY A1, ..., Ak and aggregate over Ak+1, ..., An, where Ai is an attribute of dimension Di
- Pivoting on product & time corresponds to grouping on product_id & quarter and aggregating over store_id
-
Pivoting
SELECT S.product_id, T.quarter, SUM(S.sales_amt)
FROM Sales S, Time T
WHERE T.time_id = S.time_id
GROUP BY S.product_id, T.quarter
-
Dicing
When we use GROUP BY to specify part of a hierarchy, we are performing a range selection called a DICE
Dicing Sales in the time dimension: total sales for each product in each quarter
SELECT S.product_id, T.quarter, SUM(S.sales_amt)
FROM Sales S, Time T
WHERE T.time_id = S.time_id
GROUP BY T.quarter, S.product_id
-
Slicing
When we use WHERE to specify a particular value for an axis, we are performing a SLICE
Slicing in the time dimension: choosing sales only in week 12, then pivoting to product_id (aggregating over store_id)
SELECT S.product_id, SUM(S.sales_amt)
FROM Sales S, Time T
WHERE T.time_id = S.time_id AND T.week = 12
GROUP BY S.product_id
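A runnable sketch of the week-12 slice, again with Python's sqlite3 module. The rows and the week column of the Time table are hypothetical (the example schema lists only month, quarter, and year). The second query illustrates a dice in the commonly used sense of a sub-cube, that is, a selection that restricts more than one dimension at once; that reading is an assumption added here for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time (time_id INTEGER PRIMARY KEY, week INTEGER, quarter INTEGER);
CREATE TABLE Sales (store_id INTEGER, product_id TEXT, time_id INTEGER,
                    sales_amt INTEGER);
-- Hypothetical rows for illustration only.
INSERT INTO Time VALUES (1, 11, 1), (2, 12, 1), (3, 12, 1), (4, 14, 2);
INSERT INTO Sales VALUES (1, 'p1', 1, 10), (1, 'p1', 2, 20), (2, 'p1', 3, 30),
                         (2, 'p2', 3, 5), (1, 'p2', 4, 8);
""")

# Slice: fix one value on the time axis (week 12), then group by product.
slice_week_12 = conn.execute("""
  SELECT S.product_id, SUM(S.sales_amt)
  FROM Sales S, Time T
  WHERE T.time_id = S.time_id AND T.week = 12
  GROUP BY S.product_id
  ORDER BY S.product_id
""").fetchall()

# Dice (sub-cube): restrict a range of weeks and a set of products together.
dice = conn.execute("""
  SELECT S.product_id, T.week, SUM(S.sales_amt)
  FROM Sales S, Time T
  WHERE T.time_id = S.time_id
    AND T.week BETWEEN 11 AND 12
    AND S.product_id IN ('p1')
  GROUP BY S.product_id, T.week
  ORDER BY S.product_id, T.week
""").fetchall()

print(slice_week_12)  # [('p1', 50), ('p2', 5)]
print(dice)           # [('p1', 11, 10), ('p1', 12, 50)]
```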
-
Slicing
-
OLAP Operations
-
Slicing
-
Dicing (Sub-cube)
-
Roll-Up
-
Drill-Down
-
Other OLAP Operations
o Drill-Across: queries involving more than one fact table
o Drill-Through: makes use of SQL to drill through the bottom level of a data cube down to its back-end relational tables
o Pivot (also called "rotate"): a visualization operation that rotates the data axes in view to provide an alternative presentation of the data. Examples include rotating the axes in a 3-D cube, or transforming a 3-D cube into a series of 2-D planes.
-
Other OLAP Operations
o Top N or Bottom N queries
o Moving Averages
o Growth Rates
o Depreciation
o Currency Conversion
o Statistical Functions
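Three of these calculations are simple enough to sketch in a few lines of Python. The monthly figures below are hypothetical, and the helper names are illustrative rather than taken from any OLAP tool:

```python
monthly_sales = [100, 120, 90, 150, 180, 160]  # hypothetical data

def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

def growth_rates(values):
    """Period-over-period relative change."""
    return [(cur - prev) / prev for prev, cur in zip(values, values[1:])]

def top_n(totals, n):
    """The n largest (key, total) pairs, biggest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

averages = moving_average(monthly_sales, 3)
rates = growth_rates(monthly_sales)
best = top_n({"p1": 110, "p2": 19}, 1)
print(averages)
print(rates)
print(best)
```

In an OLAP server these would typically be built-in functions applied along a chosen dimension, most often time.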
-
Conceptual vs. Actual
The cube is a logical way of visualizing the data in an OLAP setting
Not how the data is actually represented on disk
Two ways of storing data:
  ROLAP: Relational OLAP
  MOLAP: Multidimensional OLAP
-
Approaches to OLAP Servers
It is all about which DBMS you choose to store your data warehouse data
RDBMS -> ROLAP
MDDB -> MOLAP
BOTH -> HOLAP
-
OLAP Flavours
OLAP
ROLAP MOLAP DOLAP
HOLAP
-
Approaches to OLAP Servers
Three possibilities for OLAP servers
(1) Relational OLAP (ROLAP)
  Relational and specialized relational DBMS to store and manage warehouse data
  OLAP middleware to support missing pieces
(2) Multidimensional OLAP (MOLAP)
  Array-based storage structures
  Direct access to array data structures
(3) Hybrid OLAP (HOLAP)
  Storing detailed data in RDBMS
  Storing aggregated data in MDBMS
  User access via MOLAP tools
-
ROLAP
Special schema design: star, snowflake
Special indexes: bitmap, multi-table join
Proven technology (relational model, DBMS); tends to outperform specialized MDDB, especially on large data sets
Products
  IBM DB2, Oracle, Sybase IQ, RedBrick, Informix
-
ROLAP
Defines complex, multi-dimensional data with a simple model
Reduces the number of joins a query has to process
Allows the data warehouse to evolve with relatively low maintenance
Can contain both detailed and summarized data
ROLAP is based on familiar, proven, and already selected technologies
BUT!!!
It relies on SQL for multi-dimensional manipulation and calculations
-
MOLAP
MDDB: a special-purpose data model
Facts stored in multi-dimensional arrays
Dimensions used to index array
Sometimes on top of relational DB
Products
  Pilot, Arbor Essbase, Gentia
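The idea of facts stored in an array that is indexed by dimension members can be sketched with nested Python lists. This is only an illustration built on the small sales example from the earlier slides, not how any of the listed products lays out its storage; the names `build_cube` and `cell` are hypothetical.

```python
PRODUCTS = ["p1", "p2"]
STORES = ["c1", "c2", "c3"]
DAYS = [1, 2]
# Fact rows: (prodId, storeId, date, amt).
FACTS = [
    ("p1", "c1", 1, 12), ("p2", "c1", 1, 11), ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8), ("p1", "c1", 2, 44), ("p1", "c2", 2, 4),
]

# Each dimension maps its members to array positions.
P_IDX = {p: i for i, p in enumerate(PRODUCTS)}
S_IDX = {s: i for i, s in enumerate(STORES)}
D_IDX = {d: i for i, d in enumerate(DAYS)}

def build_cube(facts):
    """Dense day x product x store array; None marks an empty cell."""
    cube = [[[None for _ in STORES] for _ in PRODUCTS] for _ in DAYS]
    for prod, store, day, amt in facts:
        cube[D_IDX[day]][P_IDX[prod]][S_IDX[store]] = amt
    return cube

def cell(cube, prod, store, day):
    """Direct array access: no search and no join."""
    return cube[D_IDX[day]][P_IDX[prod]][S_IDX[store]]

cube = build_cube(FACTS)
empty_cells = sum(v is None for plane in cube for row in plane for v in row)
print(cell(cube, "p1", "c3", 1))  # 50
print(empty_cells)                # 6 of the 12 cells hold no fact
```

Half of the cells are empty even in this tiny cube, which hints at why sparsity handling matters for array-based storage.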
-
MOLAP
Pre-calculating or pre-consolidating transactional data improves speed.
BUT
To fully pre-consolidate incoming data, MDDs require an enormous amount of overhead, both in processing time and in storage. An input file of 200 MB can easily expand to 5 GB.
MDDBs are great candidates for the < 100 GB department data marts.
With MDDs, application design is essentially the definition of dimensions and calculation rules, while the RDBMS requires that the database schema be a star or snowflake.
-
Quick Recap of OLAP Needs
User Needs
  Multidimensional view
  Excellent Performance
  Analytical Flexibility
  Real-Time Data Access
  High Data Capacity
MIS Needs
  Leverages Data Warehouse
  Easy Development
  Low Structure Maintenance
  Low Aggregate Maintenance
-
Quick Recap of OLAP Needs: User Needs
Multidimensional View
All true OLAP tools, whether they work with an MDDB or an RDBMS, provide a multidimensional view of data.
For example, decision makers may view sales by office, quarter, representative, product, etc. This perspective on data, which mirrors the way business professionals think, allows for more intuitive and more powerful analysis.
-
Quick Recap of OLAP Needs: User Needs
Excellent Performance
The performance of your decision support tool directly depends on the way it manages aggregates.
RDBMS
  Calculate aggregates on the fly (response time suffers)
  DBA creates summary tables to store aggregates (enormous amount of disk space)
-
Quick Recap of OLAP Needs: User Needs
Excellent Performance
For example, suppose you have a Sales indicator with six dimensions: Representatives, Products, Customers, Regions, Months, and Years.
MOLAP tools will store a given aggregate, such as the November 1997 government sales of product A504 by representative 1040 in New York, in one cell of the MDDB.
In contrast, ROLAP tools consume 600% more space, because they require a record of seven values (six foreign keys and the actual aggregate) in a relational summary table.
-
Quick Recap of OLAP Needs: User Needs
Excellent Performance
-
Quick Recap of OLAP Needs: User Needs
Excellent Performance
RDBMSs must use several summary tables to store the aggregates that a MOLAP could store in just one cube. For example, consider a Sales indicator with three dimensions: Months, Regions, and Products. The indicator cube will contain seven sets of aggregates:
  Sales by month
  Sales by product
  Sales by region
  Sales by month and product
  Sales by month and region
  Sales by product and region
  Sales by product, month, and region
To store these aggregates in an RDBMS, you'd have to create seven summary tables, one for each aggregate set.
HOW MANY SUMMARY TABLES FOR 6 DIMENSIONS?
(Separate fact table and shrunken dimension table approach for storing aggregates)
-
Quick Recap of OLAP Needs: User Needs
Excellent Performance
Huge amounts of extra storage space are required (even if there is no sparsity failure)
Maintenance costs are high
A lot of statistical analysis needs to be done to decide which aggregates are to be precomputed
DBA must keep the cost/performance ratio in check
-
Quick Recap of OLAP Needs: User Needs
Excellent Performance
In contrast, we've seen that multidimensional databases store aggregates in a very compact structure that consumes very little disk space and requires very little maintenance
All levels of consolidation can therefore be precomputed and stored in the MDDB
As a result, fast response time is not limited to the most frequently accessed queries; all aggregates can be accessed with lightning speed.
-
Quick Recap of OLAP Needs: User Needs
Analytical Flexibility
Both ROLAP & MOLAP tools offer comparable performance for:
  Comparative Analysis
  Roll-up and Drill-down
  Slicing & Dicing
Only MOLAP tools offer what-if analysis
-
Quick Recap of OLAP Needs: User Needs
Real-Time Data Access
MOLAP tools load data into the multidimensional cubes. Consequently, the data being accessed is only as recent as the last load.
Some applications require real-time data access
Process of continually refreshing the data attaches higher costs to operating a MOLAP system
Some MOLAP tools offer reach-through functionality to access volatile data stored outside the MDDB
Unfortunately, users must be aware of the underlying database structure
Relational data access is too complex for the typical user
-
Quick Recap of OLAP Needs: User Needs
Real-Time Data Access
ROLAP tools maintain a constant link to the operational RDBMS, which provides users with up-to-the-minute, accurate data (Real-Time Data Warehousing)
Industries & organizations with highly volatile data particularly benefit from this access to live, operational data.
-
Quick Recap of OLAP Needs: User Needs
High Capacity Data
MOLAP products are limited by the size of the cube defined by the multidimensional view. When dimension elements are predefined, the scope of available data is limited at the outset.
ROLAP tools circumvent this barrier. Dynamic dimensions are not stored in the predefined multidimensional model, but fetched at run time from the RDBMS.
-
Quick Recap of OLAP Needs: User Needs
High Capacity Data
o In MOLAP, only aggregates are stored in the cube. Atomic, operational data are forced out of the user's analytical realm.
o ROLAP systems can access extremely detailed operational data, as well as aggregated data stored in summary tables.
-
Quick Recap of OLAP Needs
MIS Needs
Administrators should be able to leverage their existing relational databases without devoting large amounts of time and effort to intricate development, fine tuning, or intensive maintenance.
-
Quick Recap of OLAP Needs: MIS Needs
Leveraging Data Warehouse
Both the finance and the MIS departments of your organization will appreciate a decision support tool that leverages existing investments in data warehousing.
MIS staff that opts for a MOLAP tool must duplicate data in its own proprietary MDDB.
MIS staff that chooses a ROLAP tool will be able to access the data warehouse directly.
-
Quick Recap of OLAP Needs: MIS Needs
Easy Development
MOLAP development is straightforward: it requires no fine tuning and creates its own aggregates.
ROLAP tools, on the other hand, require a specific schema for the relational database.
Skilled DBAs must provide the appropriate schema (star or snowflake schema), tune the database, and create the appropriate summary tables.
However, many ROLAP tools are metadata-driven, which means the multidimensional view is generated and maintained more easily.
-
Quick Recap of OLAP Needs: MIS Needs
Low Structure Maintenance
The structure of a MOLAP tool's underlying MDDB greatly depends on each of its dimensions. When one dimension changes, the entire MDDB must be restructured.
Multi-matrix MDDBs reduce the maintenance burden
ROLAP systems do not store data in a proprietary structure.
They build and maintain a constant link between the multidimensional view and the underlying RDBMS using the metadata.
No database restructuring is required.
-
Quick Recap of OLAP Needs: MIS Needs
Low Aggregate Maintenance
MOLAP tools automatically create high-level aggregates based on your lower-level MDDB data and aggregate definitions.
When data is updated, the aggregates are automatically updated and stored in the MDDB.
With ROLAP tools, MIS staff must continually monitor the use of summary tables to keep their cost/performance ratio in check.
DBAs inevitably use sophisticated statistics to isolate only the most frequently accessed aggregates, and store them in summary tables.
These tables leave ROLAP administrators with a heavy maintenance burden.
-
ROLAP vs. MOLAP
1) Performance:
How fast will the system appear to the end-user? MDD server vendors believe this is a key point in their favor.
2) Data volume and scalability:
While MDD servers can handle up to 100 GB of storage, RDBMS servers can handle hundreds of gigabytes and terabytes.
-
Hybrid OLAP - HOLAP
o Best of both worlds
o Storing detailed data in RDBMS
o Storing aggregated data in MDBMS
o User access via MOLAP tools
-
HOLAP
[Figure: HOLAP architecture. Components: Client (Multidimensional Viewer, Relational Viewer); MDBMS Server (multidimensional data); RDBMS Server (user data, metadata, derived data). Connections labeled: multidimensional access, SQL-Read, SQL-Reach Through.]
-
ROLAP, MOLAP, or HOLAP
IF
  A. You require write access
  B. Your data is under 50 GB
  C. Your timetable to implement is 60-90 days
  D. Lowest level already aggregated
  E. Data access on aggregated level
  F. You're developing a general-purpose application for inventory movement or assets management
THEN
  Consider an MDD/MOLAP solution for your data mart
IF
  A. Your data is over 100 GB
  B. You have a "read-only" requirement
  C. Historical data at the lowest level of granularity
  D. Detailed access, long-running queries
  E. Data assigned to lowest level elements
THEN
  Consider an RDBMS/ROLAP solution for your data mart
IF
  A. OLAP on aggregated and detailed data
  B. Different user groups
  C. Ease of use and detailed data
THEN
  Consider an HOLAP for your data mart
-
Conclusions
ROLAP: RDBMS -> star/snowflake schema
MOLAP: MDDB -> Cube structures
ROLAP or MOLAP: the data models used play a major role in performance differences
MOLAP: for summarized and relatively smaller volumes of data (under 100 GB)
ROLAP: for detailed and larger volumes of data
Both storage methods have strengths and weaknesses
The choice is requirement specific, though currently data warehouses are predominantly built using RDBMSs/ROLAP.
HOLAP is emerging as the OLAP server of choice