BI Terminologies

Upload: alfred-wilkins

Post on 03-Jan-2016


Agenda
OLAP
Data Warehouse
Data Warehouse design
Cube
Dimensions
Measures
Facts

Agenda Contd
Measure Groups
Calculated Measures
Measure Expression
Hierarchies
Dimension Members
Discretization
Calculated Members
Custom Member Formulas

Agenda Contd
Cell and Tuple
Set and Named Set
KPI
MDX
Dimension Types
Dimension relationship to Facts
Summary
Questions

OLAP

Data Warehouse
Repository of an organization's electronically stored data.

Designed to facilitate reporting and analysis.

How is it used in the BI solution?

Data Warehouse design
Star Schema: Consists of a few fact tables, each referencing any number of dimension tables directly.

Data Warehouse design Contd
Snowflake Schema: Consists of a centralized fact table connected to multiple dimensions, where the dimensions are normalized into multiple related tables.

Data Warehouse design Contd
Combination Schema: A mixed approach in which some dimension tables are kept as in a star schema, while other dimension tables are normalized as in a snowflake schema to save space.
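As a rough illustration of the difference between the star and snowflake layouts, the sketch below models tables as plain Python dictionaries. All table names, keys, and rows (fact_sales, star_dim_product, snow_dim_category, and so on) are hypothetical and do not come from the slides.

```python
# Hypothetical tables, modelled as dicts keyed by primary key.

# Star schema: the fact row points directly at a denormalized dimension row.
star_dim_product = {
    1: {"name": "Laptop", "category": "Computers"},
    2: {"name": "Phone", "category": "Mobile"},
}

# Snowflake schema: the product dimension is normalized, so the category
# lives in its own table and is reached through a second key.
snow_dim_category = {10: {"category": "Computers"}, 20: {"category": "Mobile"}}
snow_dim_product = {
    1: {"name": "Laptop", "category_key": 10},
    2: {"name": "Phone", "category_key": 20},
}

# The fact table looks the same in both layouts: foreign keys plus measures.
fact_sales = [
    {"product_key": 1, "sales_amount": 1200},
    {"product_key": 2, "sales_amount": 800},
    {"product_key": 1, "sales_amount": 300},
]


def sales_by_category_star(facts):
    """One lookup per fact row: fact -> product dimension."""
    totals = {}
    for row in facts:
        category = star_dim_product[row["product_key"]]["category"]
        totals[category] = totals.get(category, 0) + row["sales_amount"]
    return totals


def sales_by_category_snowflake(facts):
    """Two lookups per fact row: fact -> product -> category."""
    totals = {}
    for row in facts:
        product = snow_dim_product[row["product_key"]]
        category = snow_dim_category[product["category_key"]]["category"]
        totals[category] = totals.get(category, 0) + row["sales_amount"]
    return totals
```

Both functions return the same totals; the snowflake version needs an extra lookup per row but stores each category name only once.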

Cube
Data structure that allows fast analysis of data.

Cube Operations:
Slice (Where)
Dice (Select)
Drill up/down
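The three operations can be sketched on a tiny in-memory cube. This is only one common reading of slice, dice, and drill up, and the cube contents (years, quarters, cities, amounts) as well as the function names are made up for illustration.

```python
# Hypothetical cube cells: (year, quarter, city) -> sales amount.
cube = {
    (2006, "Q1", "Cairo"): 100,
    (2006, "Q2", "Cairo"): 150,
    (2006, "Q1", "Giza"): 70,
    (2007, "Q1", "Cairo"): 120,
    (2007, "Q1", "Giza"): 90,
}


def slice_cube(cells, year):
    """Slice: fix one dimension (here the year) to a single member."""
    return {key: value for key, value in cells.items() if key[0] == year}


def dice_cube(cells, years, cities):
    """Dice: keep a sub-cube by choosing members on several dimensions."""
    return {
        key: value
        for key, value in cells.items()
        if key[0] in years and key[2] in cities
    }


def drill_up(cells):
    """Drill up: aggregate the quarters away, leaving (year, city) totals."""
    totals = {}
    for (year, _quarter, city), amount in cells.items():
        totals[(year, city)] = totals.get((year, city), 0) + amount
    return totals
```

Drilling down is the reverse direction: going from the (year, city) totals back to the per-quarter cells.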

Types of storage for cube aggregations:
MOLAP (high performance, high storage)
ROLAP (less performance, better storage)
HOLAP (benefit from both)

Dimensions
The categories by which the user wants to view the data.

Measures
The fields that the system is concerned with measuring.

Facts
A group of measures that can be categorized by dimensions.

Example
If we have a cube connected to a data source that contains sales amount data and date-time data, and the user wants to view the sales amount by time (e.g. the sales amount per month), then we have a dimension, Date Time, as this is what the data is categorized by, and a measure, Sales Amount. In the fact table, each measure value is linked to a certain period of time using foreign keys.
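The sales-by-month example can be sketched as follows. The date keys, row values, and the names dim_date, fact_sales, and sales_amount_by_month are assumptions made for the illustration.

```python
from collections import defaultdict

# Hypothetical Date Time dimension: date key -> attributes.
dim_date = {
    20060115: {"year": 2006, "month": "2006-01"},
    20060120: {"year": 2006, "month": "2006-01"},
    20060203: {"year": 2006, "month": "2006-02"},
}

# Hypothetical fact rows: a foreign key into the dimension plus the measure.
fact_sales = [
    {"date_key": 20060115, "sales_amount": 500},
    {"date_key": 20060120, "sales_amount": 250},
    {"date_key": 20060203, "sales_amount": 400},
]


def sales_amount_by_month(facts, dates):
    """Categorize the Sales Amount measure by the month attribute."""
    totals = defaultdict(int)
    for row in facts:
        month = dates[row["date_key"]]["month"]  # follow the foreign key
        totals[month] += row["sales_amount"]
    return dict(totals)
```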

Measure Groups
A group of measures; a fact is considered a measure group.

Calculated Measure
A measure that is calculated by a formula, not by direct aggregation.

Example:
If we have a measure that sums the salaries of employees, but we want the salaries including 10% taxes, then we create a calculated measure that takes the salary value and multiplies it by 1.1 (multiplying by 0.1 would give the tax amount alone). A calculated measure can also be used to multiply two measures with each other.

Measure Expression
Works on the lowest level of members.

The formula is executed before the aggregation process.

Example: Calculated Measure Vs Measure Expression
Total amount for B1?

Calculated measure: It aggregates first, then executes the multiplication, so for B1 the total amount is (20% + 10% + 10%) * $2000, which is $800.

Measure expression: It executes the multiplication on the lowest level, then aggregates, so for B1 the total amount is (20% * $1000) + (10% * $1000) + (10% * $1000), which is $400. This is the correct value.
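The order-of-operations difference can be reproduced in a few lines. The rows below are hypothetical and are not the table from the slide; rates are stored as whole percentages to keep the arithmetic exact.

```python
# Hypothetical fact rows for one branch (not the slide's data).
rows = [
    {"rate_percent": 50, "amount": 100},
    {"rate_percent": 10, "amount": 200},
]


def calculated_measure(fact_rows):
    """Aggregate each measure first, then multiply the totals."""
    total_rate = sum(row["rate_percent"] for row in fact_rows)
    total_amount = sum(row["amount"] for row in fact_rows)
    return total_rate * total_amount / 100


def measure_expression(fact_rows):
    """Multiply on each row (the lowest level), then aggregate."""
    return sum(row["rate_percent"] * row["amount"] for row in fact_rows) / 100
```

With these rows, calculated_measure(rows) gives 180.0 (60% of 300), while measure_expression(rows) gives 70.0 (50 + 20), which is the per-row figure that is actually wanted.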

Attribute
A column within a dimension table.

Hierarchies
A grouping of attributes, ordered to reflect their relationship with other attributes.

Example:
In the time dimension, we could make the hierarchy Year > Quarter > Month > Week > Day, which means that every year is divided into quarters, each quarter is divided into months, and so on.

Dimension Members
The values in the cells of the dimension table.

Example:
In the time dimension, the Year attribute has 2005, 2006, 2007, ... as its members.

Discretization
The process of grouping members of an attribute into a number of member groups.

Example:
If we have an attribute City that contains a large number of members (e.g. 500,000 cities), we can discretize this attribute into groups to make it easier to browse (e.g. cities from A-B, then C-D, and so on).

Calculated Members
A member that is calculated by a formula, unlike the normal members of the attribute.

Example:
Year to Date, added to the members of the attribute. Be aware that calculated members behave differently from normal members, so they should be tested along with the normal members.
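A Year to Date member can be thought of as a value derived from the stored members rather than stored itself. The sketch below assumes a hypothetical Month attribute with three members; the names monthly_sales, month_order, and year_to_date are illustrative only.

```python
# Hypothetical stored members of a Month attribute with their sales.
monthly_sales = {"Jan": 100, "Feb": 150, "Mar": 50}
month_order = ["Jan", "Feb", "Mar"]


def year_to_date(sales, order, current_month):
    """Calculated member: sum every month up to and including current_month."""
    end = order.index(current_month) + 1
    return sum(sales[month] for month in order[:end])
```

Because the value is computed from other members, summing Year to Date across months would double count, which is one reason calculated members need their own tests.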

Custom Member Formulas
A member value that is calculated or set under certain conditions. It is not generic like a calculated member, and it is only executed with certain members of other attributes.

Example:
If we want to multiply the salaries of employees in the United States only by 10, then we create a custom member formula and set its value, for the United States member only, to salary*10.
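The United States example can be sketched as a formula attached to one specific member, with every other member keeping the default aggregation. The data and the names salaries, custom_member_formulas, and salary_by_country are hypothetical.

```python
# Hypothetical salary facts, each tagged with a country member.
salaries = [
    {"country": "United States", "salary": 1000},
    {"country": "United States", "salary": 2000},
    {"country": "Egypt", "salary": 1500},
]

# Custom formulas are attached to specific existing members only.
custom_member_formulas = {"United States": lambda salary: salary * 10}


def salary_by_country(rows, formulas):
    """Sum salaries per country, applying a member's custom formula if any."""
    totals = {}
    for row in rows:
        formula = formulas.get(row["country"], lambda salary: salary)
        totals[row["country"]] = totals.get(row["country"], 0) + formula(row["salary"])
    return totals
```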

Custom Member Vs Calculated Member
Custom Member:
Exists in the dimension table.
It can be considered a modification of the aggregation formula of already existing members.

Calculated Member:
It is a new virtual member that does not exist in the dimension table.
Considered a dimension member.

Cell and Tuple
Cell: A certain value within a dimension and/or fact.

Tuple: A certain row within a dimension and/or fact.

Set and Named Set
Set: A group of rows of data.

Named Set: It can be considered a calculated set; it can also be considered a view with certain conditions other than the normal behavior of the set.

Example on Named Set
If you want to select the employees whose salaries are more than 3000, and you will use this set of data in multiple queries, you can put the output of this query in a named set and use it in each of those queries. If any condition of the set changes, you do not have to change each query; you only change the named set, and every query that uses it is affected by the change.
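The reuse idea behind a named set resembles defining a filter once and calling it from several queries. In the sketch below the employee rows and the function names are hypothetical, and a Python function stands in for the named set.

```python
# Hypothetical employee rows.
employees = [
    {"name": "Amira", "salary": 4200, "dept": "Sales"},
    {"name": "Bassem", "salary": 2800, "dept": "Sales"},
    {"name": "Dina", "salary": 3500, "dept": "IT"},
]


def high_earners(rows, threshold=3000):
    """Stand-in for the named set: one shared definition of the condition."""
    return [row for row in rows if row["salary"] > threshold]


# Two separate "queries" reuse the same named set.
def high_earner_names(rows):
    return [row["name"] for row in high_earners(rows)]


def high_earner_count_by_dept(rows):
    counts = {}
    for row in high_earners(rows):
        counts[row["dept"]] = counts.get(row["dept"], 0) + 1
    return counts
```

Changing the default threshold in high_earners changes the result of both queries without editing either of them.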

KPI
A measurement for gauging business success.

Frequently evaluated over time and varies from one organization to the other.

Consists of:
Goal [Target]
Value [Actual]
Status [Score]
Trend [Upward, Downward]
Status indicator
Trend indicator
Parent KPI
Weight [in case of parent KPI]

MDX
Language for OLAP databases

Stands for? (Multidimensional Expressions)

Dimension Types
Standard Dimension

Time Dimension

Server Time Dimension

Dimension Relationship to Facts
Regular: A regular dimension relationship between a cube dimension and a measure group (Fact) exists when the key column for the dimension is joined directly to the fact table. This direct relationship is based on a primary key-foreign key relationship in the underlying relational database, but might also be based on a logical relationship that is defined in the data source view.

Dimension Relationship to Facts Contd
Reference (Snowflake): A reference dimension relationship between a cube dimension and a measure group exists when the key column for the dimension is joined indirectly to the fact table through a key in another dimension table.

Dimension Relationship to Facts Contd
Fact (Degenerate): A fact table that acts as a dimension for itself, rather than using a separate dimension table. It is usually used for drill-to-details information.

Dimension Relationship to Facts Contd
Many To Many: The same as a many-to-many relationship in a relational database; it is resolved by using an intermediate table.
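A many-to-many relationship resolved through an intermediate table can be sketched as below. The joint-account scenario and every name in it (dim_customer, fact_balance, bridge_account_customer) are assumptions chosen for illustration.

```python
# Hypothetical customer dimension: key -> name.
dim_customer = {1: "Alice", 2: "Bob"}

# Hypothetical fact rows: one balance per account.
fact_balance = [
    {"account_key": 100, "balance": 500},
    {"account_key": 200, "balance": 300},
]

# Intermediate table: account 100 is a joint account owned by both customers.
bridge_account_customer = [(100, 1), (100, 2), (200, 2)]


def balance_by_customer(facts, bridge, customers):
    """Reach the customer dimension from the facts through the bridge rows."""
    totals = {name: 0 for name in customers.values()}
    for fact in facts:
        for account_key, customer_key in bridge:
            if account_key == fact["account_key"]:
                totals[customers[customer_key]] += fact["balance"]
    return totals
```

The joint account is counted under each of its owners, so the per-customer totals (500 and 800) do not add up to the overall balance of 800; this is expected behavior for a many-to-many dimension.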

Summary
OLAP
Data Warehouse
Dimensions
Facts
Measure
Members
Dimension Relationship to Facts

Questions

THANK YOU
A VERY BIG THANK YOU TO Sherif Anwar