Designing and Developing a Business Process Dimensional Model or Data Warehouse
About Me
Slava Kokaev, Lead Business Intelligence Architect at Industrial Defender
Boston BI User Group leader
email: vkokaev@bostonbi.org
web: www.bostonbi.org/blogs
twitter: @SlavaKokaev
Business Process Dimensional Model or “Star Schema” Database
Dimensions
Dimensions are the foundation of the dimensional model. They describe the objects of the business, such as employee, product, customer, and service, and they provide the context surrounding measurement events: the business processes (facts), or actions of the business, in which the dimensions participate. Each dimension table links to all the business processes in which it participates.
Fact Tables
Each fact table contains the measurements associated with a specific business process. A record in a fact table is a measurement, and a measurement event can always produce a fact table record. These events usually have numeric measurements that quantify the magnitude of the event, such as quantity ordered, sale amount, or call duration. These numbers are called facts (or measures in Analysis Services).
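To make the relationship between a dimension and a fact table concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (dim_product, fact_sales), the columns, and the sample rows are hypothetical and are not taken from the presentation; the point is only that each fact row stores numeric measures plus a key into the dimension that gives them context.

```python
import sqlite3

# Hypothetical star schema: one dimension table and one fact table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,   -- key referenced by the fact table
    product_name TEXT NOT NULL,
    category     TEXT NOT NULL
);
CREATE TABLE fact_sales (
    product_key      INTEGER NOT NULL REFERENCES dim_product(product_key),
    quantity_ordered INTEGER NOT NULL,  -- fact (measure)
    sale_amount      REAL NOT NULL      -- fact (measure)
);
""")
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Road Bike", "Bikes"), (2, "Helmet", "Accessories")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 2, 2400.0), (1, 1, 1200.0), (2, 3, 90.0)],
)


def sales_by_category(connection):
    """Aggregate the measures, grouped by a descriptive dimension attribute."""
    return connection.execute("""
        SELECT d.category, SUM(f.quantity_ordered), SUM(f.sale_amount)
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_key = f.product_key
        GROUP BY d.category
        ORDER BY d.category
    """).fetchall()


totals = sales_by_category(conn)
```

The query joins the fact table to the dimension on the key and groups by a dimension attribute, which is the typical access pattern for a star schema.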
Surrogate Keys
Primary key purpose
• Identifies uniqueness
• Relates to foreign keys in a fact table
Two candidates
• Business key: represents the source primary key
• Surrogate key:
  • Consolidates multiple data sources
  • Consolidates multi-value business keys
  • Allows tracking of dimension history
  • Limits fact table width for optimization
Using a surrogate key is considered best practice
Surrogate Keys Implementation
Source OLTP Table: Business Key MS-1981
Target DW Table: Surrogate Key 163, Business Key MS-1981
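One way to picture surrogate key assignment during a load is a lookup that hands out a new integer the first time a business key is seen and returns the same integer afterwards. The sketch below is an illustration under assumptions, not the presenter's implementation: the class SurrogateKeyMap, the source names "oltp" and "crm", and the reading of 163 as the surrogate key for business key MS-1981 are all hypothetical.

```python
from itertools import count


class SurrogateKeyMap:
    """Assigns one integer surrogate key per (source system, business key) pair."""

    def __init__(self, start=1):
        self._next_key = count(start)
        self._keys = {}

    def key_for(self, source, business_key):
        natural = (source, business_key)
        if natural not in self._keys:
            # First time this business key is seen: allocate the next integer.
            self._keys[natural] = next(self._next_key)
        return self._keys[natural]


keys = SurrogateKeyMap(start=163)
first = keys.key_for("oltp", "MS-1981")   # new business key, gets 163
again = keys.key_for("oltp", "MS-1981")   # same business key, same surrogate
other = keys.key_for("crm", "MS-1981")    # same code from another source, new surrogate
```

Keying the lookup on the source system as well as the business key is what lets a single dimension consolidate rows from several sources whose business keys might collide.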
Best Practices
Snowflaking
Snowflaking is the practice of connecting lookup tables to fields in the dimension tables. Sometimes it's easier to maintain a dimension in the ETL process when it's been partially normalized, or snowflaked.
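As a rough illustration of snowflaking, the sketch below keeps the category name in a separate lookup table that the dimension references by key. The table and column names (lookup_category, dim_product, category_key) and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Lookup table split out of the dimension (the "snowflake" branch).
CREATE TABLE lookup_category (
    category_key  INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL
);
-- The dimension stores only a key into the lookup table.
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL,
    category_key INTEGER NOT NULL REFERENCES lookup_category(category_key)
);
INSERT INTO lookup_category VALUES (1, 'Bikes'), (2, 'Accessories');
INSERT INTO dim_product VALUES (10, 'Road Bike', 1), (11, 'Helmet', 2);
""")

# Flattening the snowflaked dimension back into a star-style view costs one extra join.
rows = conn.execute("""
    SELECT p.product_key, p.product_name, c.category_name
    FROM dim_product AS p
    JOIN lookup_category AS c ON c.category_key = p.category_key
    ORDER BY p.product_key
""").fetchall()
```

Renaming a category now touches one lookup row instead of every product row, which is the maintenance benefit; the trade-off is the additional join at query time.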
Reviewing Star Schema Benefits
• Transforms normalized data into a simpler model
• Delivers high-performance queries
• Delivers higher-performing queries using Star Join Query Optimization
• Uses mature modeling techniques that are widely supported by many BI tools
• Requires low maintenance as the data warehouse design evolves
OLTP vs. OLAP
Normalized (OLTP)
Denormalized (Star Schema)
Slowly Changing Dimensions
• Support the primary role of the data warehouse: to describe the past accurately
• Maintain historical context as new or changed data is loaded into dimension tables
Slowly Changing Dimension (SCD) types
• Type 1: Overwrite the existing dimension record
• Type 2: Insert a new ‘versioned’ dimension record
• Type 3: Track limited history with attributes
The concept of Slowly Changing Dimensions was introduced by Ralph Kimball
Slowly Changing Dimensions Type 1
Existing record is updated
History is not preserved
LastName update to Valdez-Smythe
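A Type 1 change can be sketched as an in-place overwrite of the matching dimension row. The function apply_type1, the column names, and the prior last name "Valdez" are assumptions made for illustration; only the new value Valdez-Smythe comes from the slide.

```python
def apply_type1(dimension, business_key, changes):
    """Overwrite attributes in place on the row that matches the business key."""
    for row in dimension:
        if row["employee_id"] == business_key:
            row.update(changes)  # the old values are lost
            return row
    raise KeyError(business_key)


dim_employee = [
    {"employee_key": 1, "employee_id": "E100", "last_name": "Valdez"},
]
apply_type1(dim_employee, "E100", {"last_name": "Valdez-Smythe"})
```

The row count and the surrogate key stay the same, so every existing fact row now reports under the new name and the earlier value can no longer be recovered.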
Slowly Changing Dimensions Type 2
Existing record is ‘expired’ and new record inserted
History is preserved
Most common form of Slowly Changing Dimension
SalesTerritoryKey update to 10
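A Type 2 change can be sketched as two steps: expire the current row, then insert a new version with a fresh surrogate key. The function apply_type2, the start_date/end_date/is_current columns, the dates, and the prior territory value 4 are hypothetical; real implementations differ in how they mark the current row.

```python
from datetime import date


def apply_type2(dimension, business_key, changes, effective):
    """Expire the current row for business_key and append a new versioned row."""
    current = next(
        (r for r in dimension
         if r["employee_id"] == business_key and r["is_current"]),
        None,
    )
    if current is None:
        raise KeyError(business_key)
    # Step 1: expire the existing version.
    current["end_date"] = effective
    current["is_current"] = False
    # Step 2: insert the new version under a new surrogate key.
    new_row = {
        **current,
        **changes,
        "employee_key": max(r["employee_key"] for r in dimension) + 1,
        "start_date": effective,
        "end_date": None,
        "is_current": True,
    }
    dimension.append(new_row)
    return new_row


dim_employee = [{
    "employee_key": 1, "employee_id": "E100", "sales_territory_key": 4,
    "start_date": date(2005, 1, 1), "end_date": None, "is_current": True,
}]
new_version = apply_type2(
    dim_employee, "E100", {"sales_territory_key": 10}, date(2008, 7, 1)
)
```

Fact rows loaded before the change keep pointing at surrogate key 1 and fact rows loaded afterwards use key 2, which is how the old territory stays attached to the old sales.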
Slowly Changing Dimensions Type 3
Existing record is updated
Limited history is preserved
Implementation is rare
SalesTerritoryKey update to 10
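The limited-history approach (Type 3 in the list of SCD types) can be sketched as keeping the prior value in an extra attribute on the same row. The function apply_type3, the "previous_" column naming, and the starting territory value 4 are assumptions for illustration.

```python
def apply_type3(row, attribute, new_value):
    """Move the current value into a 'previous_' column, then overwrite it."""
    row[f"previous_{attribute}"] = row[attribute]
    row[attribute] = new_value
    return row


employee = {
    "employee_key": 1,
    "employee_id": "E100",
    "sales_territory_key": 4,
    "previous_sales_territory_key": None,
}
apply_type3(employee, "sales_territory_key", 10)
```

Only one prior value survives: a second change would push 10 into the previous column and discard 4, which is why the history is described as limited.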
DEMO
Resources
SQL Server 2008 Books Online, msdn2.microsoft.com/en-us/library/bb543165(sql.100).aspx
The Microsoft Data Warehouse Toolkit by Joy Mundy, Warren Thornthwaite, and Ralph Kimball
The Data Warehouse Lifecycle Toolkit by Ralph Kimball, et al.