dw design 1_dim_facts
DESCRIPTION
Dimensional modeling
TRANSCRIPT
![Page 1: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/1.jpg)
DATA WAREHOUSING Multi Dimensional Data Modeling. Facts and Dimensions
![Page 2: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/2.jpg)
![Page 3: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/3.jpg)
While an entity-relationship modeling approach from relational database design could be used, the dimensional modeling approach to logical design is more often used for a data warehouse.
![Page 4: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/4.jpg)
End users cannot understand, remember, or navigate an E/R model (not even with a GUI).
One reason is that an enterprise-level ERM would be too complex to understand.
![Page 5: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/5.jpg)
Software cannot usefully query an E/R model
![Page 6: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/6.jpg)
E/R modeling does not meet the purpose of a data warehouse: intuitive, high-performance querying.
![Page 7: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/7.jpg)
Fact table: Sales_Fact (TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $ . . .)
Dimension tables:
Time_Dim (TimeKey, TheDate, . . .)
Employee_Dim (EmployeeKey, EmployeeID, . . .)
Product_Dim (ProductKey, ProductID, . . .)
Customer_Dim (CustomerKey, CustomerID, . . .)
Shipper_Dim (ShipperKey, ShipperID, . . .)
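The star schema on this slide could be declared roughly as follows. This is a minimal sketch using Python's built-in `sqlite3` module; the column types and the `SalesAmount` name (standing in for the slide's "$" measure) are assumptions, and the columns elided with ". . ." on the slide are left out.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# One table per dimension, each with a single-column key, plus a fact
# table whose key columns reference the dimensions.
# Column types and SalesAmount are assumptions, not from the slide.
conn.executescript("""
CREATE TABLE Time_Dim     (TimeKey     INTEGER PRIMARY KEY, TheDate    TEXT);
CREATE TABLE Employee_Dim (EmployeeKey INTEGER PRIMARY KEY, EmployeeID TEXT);
CREATE TABLE Product_Dim  (ProductKey  INTEGER PRIMARY KEY, ProductID  TEXT);
CREATE TABLE Customer_Dim (CustomerKey INTEGER PRIMARY KEY, CustomerID TEXT);
CREATE TABLE Shipper_Dim  (ShipperKey  INTEGER PRIMARY KEY, ShipperID  TEXT);

CREATE TABLE Sales_Fact (
    TimeKey     INTEGER NOT NULL REFERENCES Time_Dim(TimeKey),
    EmployeeKey INTEGER NOT NULL REFERENCES Employee_Dim(EmployeeKey),
    ProductKey  INTEGER NOT NULL REFERENCES Product_Dim(ProductKey),
    CustomerKey INTEGER NOT NULL REFERENCES Customer_Dim(CustomerKey),
    ShipperKey  INTEGER NOT NULL REFERENCES Shipper_Dim(ShipperKey),
    SalesAmount REAL
);
""")

# List the tables that now make up the star.
star_tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(star_tables)
```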
![Page 8: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/8.jpg)
Dimension tables: Geographic, Product, Time
Fact table: Geographic, Product, Time (dimension keys); Units, $ (measures)
Several distinct dimensions, combined with facts, enable you to answer business questions.
![Page 9: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/9.jpg)
Dimensions are normally textual descriptions of the business.
Dimensions
![Page 10: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/10.jpg)
Dimension tables contain relatively small amounts of relatively static data.
Dimensions
![Page 11: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/11.jpg)
Dimension tables are usually not normalized.
Dimensions
![Page 12: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/12.jpg)
Dimensions are independent of each other, not hierarchically related.
Dimensions
![Page 13: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/13.jpg)
Dimensional attributes (non-key attributes) help to describe the dimensional value.
Dimensional attributes
![Page 14: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/14.jpg)
Facts are (usually numerical) measures of the business.
Facts
![Page 15: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/15.jpg)
The fact table is the largest table in the star schema and is composed of large volumes of data.
Facts
![Page 16: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/16.jpg)
The fact table is (often) normalized.
Facts
![Page 17: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/17.jpg)
The fact table has a composite primary key made up of foreign keys.
Facts
PK = (FK1, FK2, . . ., FKn)
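A composite key of this kind can be sketched with `sqlite3`. The two-dimension fact table and the `SalesAmount` column below are simplifying assumptions for illustration; the point is only that the combination of foreign keys identifies a fact row, so a second row with the same combination is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified fact table (assumption: only two dimension keys are shown).
# The primary key is the combination of the dimension keys.
conn.execute("""
CREATE TABLE Sales_Fact (
    TimeKey     INTEGER NOT NULL,
    ProductKey  INTEGER NOT NULL,
    SalesAmount REAL,
    PRIMARY KEY (TimeKey, ProductKey)
)
""")

conn.execute("INSERT INTO Sales_Fact VALUES (1, 10, 99.5)")
conn.execute("INSERT INTO Sales_Fact VALUES (1, 11, 20.0)")  # same day, other product

# The same (TimeKey, ProductKey) combination violates the composite key.
try:
    conn.execute("INSERT INTO Sales_Fact VALUES (1, 10, 5.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Sales_Fact").fetchone()[0]
print(duplicate_rejected, row_count)
```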
![Page 18: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/18.jpg)
The fact table usually contains one or more numerical facts that occur for the combination of keys that define each record.
Facts
measures
![Page 19: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/19.jpg)
A fact table contains either detail-level facts or facts that have been aggregated (summary tables)
Facts
Σ
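As an illustration of the difference between detail-level facts and a summary table, the sketch below rolls hypothetical daily sales rows up to monthly totals. The table names `Sales_Detail` and `Sales_Monthly`, the columns, and the sample rows are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Detail-level facts: one row per day and product (hypothetical data).
conn.execute("CREATE TABLE Sales_Detail (TheDate TEXT, ProductID TEXT, Amount REAL)")
conn.executemany(
    "INSERT INTO Sales_Detail VALUES (?, ?, ?)",
    [
        ("2024-01-05", "A", 10.0),
        ("2024-01-20", "A", 15.0),
        ("2024-01-20", "B", 7.0),
        ("2024-02-01", "A", 3.0),
    ],
)

# Summary table: the same facts aggregated to month and product.
conn.execute("""
CREATE TABLE Sales_Monthly AS
SELECT substr(TheDate, 1, 7) AS Month, ProductID, SUM(Amount) AS Amount
FROM Sales_Detail
GROUP BY substr(TheDate, 1, 7), ProductID
""")

monthly = conn.execute(
    "SELECT Month, ProductID, Amount FROM Sales_Monthly ORDER BY Month, ProductID"
).fetchall()
print(monthly)
```

Because the amount is additive, the grand total is the same whether it is computed from the detail rows or from the summary rows.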
![Page 20: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/20.jpg)
Facts are:
additive
semi-additive
non-additive
Facts
![Page 21: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/21.jpg)
Additive facts can be summed along all dimensions.
Non-additive facts cannot be added at all. An example of this is averages.
Semi-additive facts can be aggregated along some of the dimensions and not along others. current_Balance is a semi-additive fact: it makes sense to add balances up across all accounts (what is the total current balance for all accounts in the bank?), but it does not make sense to add them up through time (adding up all current balances for a given account for each day of the month does not give us any useful information).
The most useful measures are numeric and additive.
Facts
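The balance and average examples can be worked through with a few numbers. This is a small sketch in plain Python; the accounts, dates, and amounts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical daily snapshots: (day, account, current_balance).
balances = [
    ("2024-01-01", "acct-1", 100.0),
    ("2024-01-01", "acct-2", 50.0),
    ("2024-01-02", "acct-1", 120.0),
    ("2024-01-02", "acct-2", 50.0),
]

# Semi-additive: summing across accounts for one day is meaningful.
total_by_day = defaultdict(float)
for day, account, balance in balances:
    total_by_day[day] += balance

# Summing across days is not meaningful (acct-1 never held 220.0),
# so along time we take the latest snapshot instead of a sum.
latest_by_account = {}
for day, account, balance in sorted(balances):
    latest_by_account[account] = balance

# Non-additive: averages. Averaging the per-store averages gives a
# different number than averaging all sales together.
sales_by_store = {"store-A": [10.0, 20.0, 30.0], "store-B": [100.0]}
store_averages = [sum(v) / len(v) for v in sales_by_store.values()]
avg_of_avgs = sum(store_averages) / len(store_averages)

all_sales = [amount for v in sales_by_store.values() for amount in v]
true_avg = sum(all_sales) / len(all_sales)

print(dict(total_by_day), latest_by_account, avg_of_avgs, true_avg)
```

One common way around the non-additive case is to store the additive components (here the sum and the count) and compute the average at query time.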
![Page 22: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/22.jpg)
Grain: the atomic level of data of the business process.
It defines the highest level of detail that is supported in a data warehouse.
![Page 23: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/23.jpg)
A fact table usually contains facts with the same level of aggregation.
A proper dimensional design allows only facts of a uniform grain (the same dimensionality) to coexist in a single fact table.
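A short sketch of why mixed grains cause trouble: if a monthly total is stored next to the daily rows it summarizes, a plain sum counts the same sales twice. The rows and the `grain` marker below are hypothetical.

```python
# Hypothetical rows: two daily facts plus a monthly total that was
# loaded into the same table.
mixed_grain = [
    {"grain": "day", "period": "2024-01-01", "amount": 10.0},
    {"grain": "day", "period": "2024-01-02", "amount": 20.0},
    {"grain": "month", "period": "2024-01", "amount": 30.0},
]

# A plain sum counts January twice: once per day and once as a month.
naive_total = sum(row["amount"] for row in mixed_grain)

# Restricting to a single grain gives the real total.
daily_total = sum(row["amount"] for row in mixed_grain if row["grain"] == "day")

print(naive_total, daily_total)
```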
![Page 24: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/24.jpg)
Some perfectly good fact tables represent measurements that have no facts! This kind of measurement is often called an event. The classic example of such a factless fact table is a record representing a student attending a class on a specific day. The dimensions are Day, Student, Professor, Course, and Location, but there are no obvious numeric facts. The tuition paid and grade received are good facts but not at the grain of the daily attendance.
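The attendance example might look like the following sketch. The table name `Attendance_Fact`, the key column names, and the sample rows are assumptions; the table holds only dimension keys, and questions are answered by counting rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Factless fact table: dimension keys only, no numeric measures.
conn.execute("""
CREATE TABLE Attendance_Fact (
    DayKey       INTEGER NOT NULL,
    StudentKey   INTEGER NOT NULL,
    ProfessorKey INTEGER NOT NULL,
    CourseKey    INTEGER NOT NULL,
    LocationKey  INTEGER NOT NULL
)
""")
conn.executemany(
    "INSERT INTO Attendance_Fact VALUES (?, ?, ?, ?, ?)",
    [
        (1, 101, 7, 500, 3),  # student 101 attends course 500 on day 1
        (1, 102, 7, 500, 3),
        (2, 101, 7, 500, 3),
        (1, 101, 8, 501, 4),
    ],
)

# Each row is an attendance event, so counting rows answers
# "how many attendances did each course have?"
attendance_by_course = conn.execute("""
    SELECT CourseKey, COUNT(*)
    FROM Attendance_Fact
    GROUP BY CourseKey
    ORDER BY CourseKey
""").fetchall()
print(attendance_by_course)
```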
![Page 25: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/25.jpg)
Dimensions without attributes (such as a transaction number or order number) are known as degenerate dimensions.
Put the value into the fact table even though it is not an additive fact.
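A sketch of this idea, assuming a hypothetical `Order_Line_Fact` table: the order number is stored directly in the fact row, with no order dimension table behind it, and is still useful for grouping the lines of one order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# OrderNumber has no attributes of its own, so it lives in the fact
# table rather than in a separate dimension table.
# Table name, columns, and rows are assumptions for illustration.
conn.execute("""
CREATE TABLE Order_Line_Fact (
    OrderNumber TEXT    NOT NULL,
    ProductKey  INTEGER NOT NULL,
    Quantity    INTEGER,
    Amount      REAL
)
""")
conn.executemany(
    "INSERT INTO Order_Line_Fact VALUES (?, ?, ?, ?)",
    [
        ("SO-1", 10, 2, 20.0),
        ("SO-1", 11, 1, 5.0),
        ("SO-2", 10, 1, 10.0),
    ],
)

# The order number is never summed, but it groups the lines of an order.
orders = conn.execute("""
    SELECT OrderNumber, COUNT(*) AS Lines, SUM(Amount) AS Total
    FROM Order_Line_Fact
    GROUP BY OrderNumber
    ORDER BY OrderNumber
""").fetchall()
print(orders)
```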
![Page 26: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/26.jpg)
![Page 27: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/27.jpg)
Sales_Fact (TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $ . . .)
Multipart key (dimensional keys): TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey
Measures: $ . . .
Each dimensional key references one dimension table:
Time_Dim (TimeKey, TheDate, . . .)
Employee_Dim (EmployeeKey, EmployeeID, . . .)
Product_Dim (ProductKey, ProductID, . . .)
Customer_Dim (CustomerKey, CustomerID, . . .)
Shipper_Dim (ShipperKey, ShipperID, . . .)
The fact table provides statistics for sales broken down by the product, time, employee, shipper, and customer dimensions.
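Breaking sales down by a dimension amounts to joining the fact table to that dimension and grouping by one of its attributes. The sketch below uses only two of the five dimensions to stay short; the `SalesAmount` column and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced star: two dimensions and one fact table (hypothetical data).
conn.executescript("""
CREATE TABLE Time_Dim    (TimeKey    INTEGER PRIMARY KEY, TheDate   TEXT);
CREATE TABLE Product_Dim (ProductKey INTEGER PRIMARY KEY, ProductID TEXT);
CREATE TABLE Sales_Fact (
    TimeKey     INTEGER REFERENCES Time_Dim(TimeKey),
    ProductKey  INTEGER REFERENCES Product_Dim(ProductKey),
    SalesAmount REAL,
    PRIMARY KEY (TimeKey, ProductKey)
);
INSERT INTO Time_Dim    VALUES (1, '2024-01-01'), (2, '2024-01-02');
INSERT INTO Product_Dim VALUES (10, 'P-10'), (11, 'P-11');
INSERT INTO Sales_Fact  VALUES (1, 10, 5.0), (1, 11, 7.0), (2, 10, 3.0);
""")

# Sales broken down by product.
sales_by_product = conn.execute("""
    SELECT p.ProductID, SUM(f.SalesAmount)
    FROM Sales_Fact AS f
    JOIN Product_Dim AS p ON p.ProductKey = f.ProductKey
    GROUP BY p.ProductID
    ORDER BY p.ProductID
""").fetchall()

# The same facts broken down by date.
sales_by_date = conn.execute("""
    SELECT t.TheDate, SUM(f.SalesAmount)
    FROM Sales_Fact AS f
    JOIN Time_Dim AS t ON t.TimeKey = f.TimeKey
    GROUP BY t.TheDate
    ORDER BY t.TheDate
""").fetchall()

print(sales_by_product, sales_by_date)
```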
![Page 28: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/28.jpg)
![Page 29: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/29.jpg)
1. Choose the data mart for the small group of end users we deal with.
Choose a business process to model, e.g., orders, invoices, etc.
![Page 30: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/30.jpg)
2. Determine the fact table granularity (the smallest defined level of data in the table).
![Page 31: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/31.jpg)
3. Select the fact table dimensions.
Choose the dimensions that will apply to each fact table record.
Add dimensions for "everything you know" about this grain.
![Page 32: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/32.jpg)
4. Determine the facts for the table. In most cases, the granularity is at the transaction level, so the fact is the amount.
Choose the measures that will populate each fact table record.
Add numeric measured facts true to the grain.
![Page 33: Dw design 1_dim_facts](https://reader034.vdocuments.site/reader034/viewer/2022051613/54c6d9874a7959fc018b457c/html5/thumbnails/33.jpg)
Ralph Kimball and Margy Ross. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling. Second Edition.