Data Modeling Fundamentals Version 1.1 Cristi Salcescu

Post on 22-May-2015

Page 1: Data modeling fundamentals

Data Modeling Fundamentals

Version 1.1
Cristi Salcescu

Page 2: Data modeling fundamentals

Subjects

• Relational Modeling
• Dimensional Modeling
• Object Modeling

Page 3: Data modeling fundamentals

What is data modeling?

• Apply structure
• Organize

Page 4: Data modeling fundamentals

Relational Modeling

• Tables
– Columns
– Rows

• Keys
– Primary Key
– Foreign Key (Referential Integrity)
– Surrogate Key
– Composite Key: a key that contains more than one column
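The key types listed above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the deck's actual schema: the table layout, the sample values, and the choice of (Serial, Number) as a composite key are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE Persons (
    Id        INTEGER PRIMARY KEY,   -- surrogate key: carries no business meaning
    LastName  TEXT NOT NULL,
    FirstName TEXT NOT NULL
);
CREATE TABLE Policies (
    Serial   TEXT    NOT NULL,
    Number   INTEGER NOT NULL,
    IdPerson INTEGER NOT NULL REFERENCES Persons(Id),  -- foreign key
    PRIMARY KEY (Serial, Number)     -- composite key: two columns together (assumed)
);
""")

conn.execute("INSERT INTO Persons (Id, LastName, FirstName) VALUES (1, 'Doe', 'Jane')")
conn.execute("INSERT INTO Policies VALUES ('AB', 1001, 1)")


def try_insert(sql):
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Same (Serial, Number) pair again: rejected by the composite primary key.
duplicate_ok = try_insert("INSERT INTO Policies VALUES ('AB', 1001, 1)")
# Person 99 does not exist: rejected by referential integrity.
orphan_ok = try_insert("INSERT INTO Policies VALUES ('AB', 1002, 99)")
# Same Serial but a new Number: accepted, because only the pair must be unique.
same_serial_ok = try_insert("INSERT INTO Policies VALUES ('AB', 1003, 1)")
```

The foreign key is what enforces referential integrity here: a policy row cannot point at a person that does not exist.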

Page 5: Data modeling fundamentals

Types of Relations

• One-to-Many
• Many-to-One
• Many-to-Many
• One-to-One
• Recursive

Page 6: Data modeling fundamentals

One-to-Many

Diagram: table Persons (Id, LastName, FirstName) and table Policies (Id, Serial, Number, IssuedDate, BeginDate, EndDate, IdPerson, IdPolicyType, IdUser)
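A one-to-many relation between the Persons and Policies tables can be sketched as follows. The example assumes that Policies.IdPerson is a foreign key to Persons.Id, keeps only a few of the columns, and uses made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (Id INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
CREATE TABLE Policies (
    Id       INTEGER PRIMARY KEY,
    Serial   TEXT,
    Number   INTEGER,
    IdPerson INTEGER REFERENCES Persons(Id)   -- the "many" side points to the "one" side
);
INSERT INTO Persons VALUES (1, 'Doe', 'Jane'), (2, 'Roe', 'Rick');
INSERT INTO Policies VALUES (10, 'AB', 1001, 1), (11, 'AB', 1002, 1), (12, 'CD', 2001, 2);
""")

# One person row joins to many policy rows; count the policies per person.
rows = conn.execute("""
    SELECT p.LastName, COUNT(pol.Id)
    FROM Persons AS p
    LEFT JOIN Policies AS pol ON pol.IdPerson = p.Id
    GROUP BY p.Id
    ORDER BY p.Id
""").fetchall()
```

Read from the Policies side, the same foreign key is a many-to-one relation: many policies belong to one person.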

Page 7: Data modeling fundamentals

Many-to-Many

Page 8: Data modeling fundamentals

One-to-One

Diagram: table PolciesHousehold (Id, IdAddress, Age, Surface, RoomsNo), table Policies (Id, Serial, Number, IssuedDate, BeginDate, EndDate, IdPerson, IdPolicyType, IdUser) and table PoliciesMotor (Id, ConstructionYear, CylCap, ChassisNo, PlateNo)
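One common way to implement a one-to-one relation is to let the detail table reuse the parent's primary key. The sketch below assumes that PoliciesMotor.Id is both the primary key and a foreign key to Policies.Id; the diagram does not confirm this, and the sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Policies (Id INTEGER PRIMARY KEY, Serial TEXT, Number INTEGER);
CREATE TABLE PoliciesMotor (
    Id        INTEGER PRIMARY KEY REFERENCES Policies(Id),  -- shared key: at most one row per policy
    PlateNo   TEXT,
    ChassisNo TEXT
);
INSERT INTO Policies VALUES (1, 'AB', 1001), (2, 'AB', 1002);
INSERT INTO PoliciesMotor VALUES (1, 'B-01-XYZ', 'CH123');
""")

# Policy 2 has no motor details, so the motor columns come back as NULL (None).
rows = conn.execute("""
    SELECT p.Id, m.PlateNo
    FROM Policies AS p
    LEFT JOIN PoliciesMotor AS m ON m.Id = p.Id
    ORDER BY p.Id
""").fetchall()
```

Keeping the motor-only columns in their own table means non-motor policies do not carry empty columns.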

Page 9: Data modeling fundamentals

Many-to-One

Page 10: Data modeling fundamentals

Self-Referencing

Diagram: table _Categories (IdCategory, Name, IdParent)
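A self-referencing table can be walked with a recursive query. The sketch below assumes that IdParent points to IdCategory in the same table and is NULL for root categories; the category names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE _Categories (
    IdCategory INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL,
    IdParent   INTEGER REFERENCES _Categories(IdCategory)  -- NULL marks a root (assumed)
);
INSERT INTO _Categories VALUES
    (1, 'Insurance', NULL),
    (2, 'Motor', 1),
    (3, 'Household', 1),
    (4, 'Truck', 2);
""")

# Start from the roots and repeatedly join children to the rows found so far.
rows = conn.execute("""
    WITH RECURSIVE tree(IdCategory, Name, Depth) AS (
        SELECT IdCategory, Name, 0 FROM _Categories WHERE IdParent IS NULL
        UNION ALL
        SELECT c.IdCategory, c.Name, t.Depth + 1
        FROM _Categories AS c
        JOIN tree AS t ON c.IdParent = t.IdCategory
    )
    SELECT Name, Depth FROM tree ORDER BY Depth, IdCategory
""").fetchall()
```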

Page 11: Data modeling fundamentals

Normalization

• creates granularity
• removes duplication
• is a set of cumulative rules, the Normal Forms: 1st, 2nd, 3rd Normal Form
• good for saving space, but I/O costs are cheap
• bad for performance: Joins

Page 12: Data modeling fundamentals

1st Normal Form

• creates a Many-to-One relation
• removes duplication that occurs horizontally

Page 13: Data modeling fundamentals

2nd Normal Form

• creates a One-to-Many relation
• removes duplication that occurs vertically

Page 14: Data modeling fundamentals

3rd Normal Form

• Creates Many-to-Many relation

Page 15: Data modeling fundamentals

4th Normal Form

• creates a One-to-One relation
• separates NULL values

Page 16: Data modeling fundamentals

Insurance Policies - Car, Home and Life

Page 18: Data modeling fundamentals

OLTP vs OLAP

• OLTP: On-line Transaction Processing
• OLAP: On-line Analytical Processing

Why does the Relational Model fail for reporting?
• too granular
• high concurrency (lots of users sharing small pieces at the same time)
• too many tables: Joins are too big, SQL code is too slow

Page 19: Data modeling fundamentals

OLTP

– recent data
– daily basis
– hundreds of millions of users
– high concurrency
– designed for working with a single record/entity at a time
– highly “normalized”
– getting data for a report involves many joins

Page 20: Data modeling fundamentals

OLAP

– huge amount of (historical) data
– high-speed access to huge amounts of data
– access to many tables
– low concurrency: few users (top executives)
– the number of tables is reduced, which reduces the number of joins
– data is de-normalized

Page 21: Data modeling fundamentals

Dimensional Modeling

• Data Warehouse
– A gigantic storehouse of data
– All data
– Provides long-term storage of data
– Aggregates data from multiple systems
– Reduces the load on the production system

• Facts
– Transactional information
– Hold numeric measures

• Dimensions
– Hold the values that describe facts
– Static or slowly changing information
– Answer questions like: who, what, when, where?
– Look-up values
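The split between facts and dimensions can be sketched with one small fact table and two dimension tables. All table names, column names, and figures below are hypothetical; they are not taken from the deck's fact table example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: descriptive look-up values (what? when?)
CREATE TABLE DimPolicyType (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE DimDate (Id INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER);

-- Fact: one row per sale, holding keys to the dimensions plus a numeric measure
CREATE TABLE FactPolicySales (
    IdPolicyType INTEGER REFERENCES DimPolicyType(Id),
    IdDate       INTEGER REFERENCES DimDate(Id),
    Premium      INTEGER
);

INSERT INTO DimPolicyType VALUES (1, 'Motor'), (2, 'Household');
INSERT INTO DimDate VALUES (1, 2014, 12), (2, 2015, 1);
INSERT INTO FactPolicySales VALUES (1, 1, 100), (1, 2, 150), (2, 2, 200), (1, 2, 50);
""")

# A typical report: aggregate the measure, grouped by dimension attributes.
rows = conn.execute("""
    SELECT t.Name, d.Year, SUM(f.Premium)
    FROM FactPolicySales AS f
    JOIN DimPolicyType AS t ON t.Id = f.IdPolicyType
    JOIN DimDate AS d ON d.Id = f.IdDate
    GROUP BY t.Name, d.Year
    ORDER BY t.Name, d.Year
""").fetchall()
```

Every join goes directly from the fact table to a dimension, so the report needs only one join per dimension.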

Page 22: Data modeling fundamentals

Fact table example

Page 23: Data modeling fundamentals

Denormalization

• removing Normal Forms
• removes granularity
• uses lots of space: I/O costs
• good for performance
• reduces the number of Joins
• good for large databases

Page 24: Data modeling fundamentals

3rd Normal Form

Page 25: Data modeling fundamentals

Denormalized

Page 26: Data modeling fundamentals

Relational Model

Page 27: Data modeling fundamentals

Denormalize facts tables

Page 28: Data modeling fundamentals

Snowflake Schema

Page 29: Data modeling fundamentals

Star Schema

Page 31: Data modeling fundamentals

Object Modeling

• a layer of objects that model the business area you're working in

Page 32: Data modeling fundamentals

UML

Unified Modeling Language

The most basic of UML diagrams is the Class Diagram. It describes classes and shows the relationships among them.

Page 33: Data modeling fundamentals

Types of Relations

• Inheritance
• Association
• Aggregation
• Composition

Page 34: Data modeling fundamentals

Inheritance

Class diagram: A, B

A generalizes B
B derives from A
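In code, the inheritance relation might look like the sketch below. Policy plays the role of A and MotorPolicy the role of B; both class names and their fields are hypothetical, chosen to match the insurance theme of the deck.

```python
class Policy:
    """General class (A in the diagram)."""

    def __init__(self, serial, number):
        self.serial = serial
        self.number = number

    def label(self):
        return f"{self.serial}-{self.number}"


class MotorPolicy(Policy):
    """Derived class (B in the diagram): inherits fields and behavior, adds its own."""

    def __init__(self, serial, number, plate_no):
        super().__init__(serial, number)
        self.plate_no = plate_no


motor = MotorPolicy("AB", 1001, "B-01-XYZ")
motor_label = motor.label()  # label() is inherited from Policy
```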

Page 35: Data modeling fundamentals

Association

A uses B

Class field
Method parameter
Method return type
Local variable

Class diagram: A, B
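The four ways in which A can use B can be shown in one small sketch. The method names and the arithmetic are invented purely to give each form something to do.

```python
class B:
    def __init__(self, value: int):
        self.value = value


class A:
    def __init__(self, b: B):
        self.b = b                       # B as a class field

    def add(self, other: B) -> int:      # B as a method parameter
        return self.b.value + other.value

    def doubled(self) -> B:              # B as a method return type
        return B(self.b.value * 2)

    def total(self) -> int:
        temp = B(10)                     # B as a local variable
        return self.add(temp)


a = A(B(1))
```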

Page 36: Data modeling fundamentals

Aggregation

Shared association

A aggregates B
B is part of A

Class diagrams: A, B; Airport, Aircraft

Page 37: Data modeling fundamentals

Composition

Not-shared association

A is composed of B

Class diagrams: A, B; Person, Leg
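The difference between aggregation (shared) and composition (not shared) can be sketched with the Airport/Aircraft and Person/Leg pairs from the diagrams. The attributes and method names are assumptions; Python does not enforce ownership, so the distinction is expressed only by who creates the parts.

```python
class Aircraft:
    def __init__(self, tail_no):
        self.tail_no = tail_no


class Airport:
    """Aggregation: aircraft are created elsewhere and only referenced here."""

    def __init__(self):
        self.aircraft = []

    def park(self, aircraft):
        self.aircraft.append(aircraft)


class Leg:
    def __init__(self, side):
        self.side = side


class Person:
    """Composition: the person creates its own legs and does not share them."""

    def __init__(self, name):
        self.name = name
        self._legs = (Leg("left"), Leg("right"))

    @property
    def leg_sides(self):
        return [leg.side for leg in self._legs]


plane = Aircraft("TAIL-1")
first, second = Airport(), Airport()
first.park(plane)
second.park(plane)        # the same aircraft object is referenced by both airports

person = Person("Jane")   # the legs exist only as parts of this person
```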

Page 38: Data modeling fundamentals

Domain Layer

Domain Layer
– Introduced by Eric Evans in his book “Domain-Driven Design: Tackling Complexity in the Heart of Software” (2003)
– Entities
• An object that is not defined by its attributes, but rather by its identity
– Value Objects
• An object that contains attributes but has no conceptual identity
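The distinction between entities and value objects shows up most clearly in how equality is defined. In this sketch Person is an entity compared by its identifier and Address is a value object compared by its attributes; both classes and their fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Value object: immutable, equal when all attributes are equal."""
    street: str
    city: str


class Person:
    """Entity: equal when the identity matches, whatever the attributes say."""

    def __init__(self, person_id, last_name, address):
        self.person_id = person_id
        self.last_name = last_name
        self.address = address

    def __eq__(self, other):
        return isinstance(other, Person) and self.person_id == other.person_id

    def __hash__(self):
        return hash(self.person_id)


home = Address("Main Street 1", "Springfield")
same_home = Address("Main Street 1", "Springfield")

before = Person(1, "Doe", home)
after = Person(1, "Smith", home)    # name changed, identity unchanged
stranger = Person(2, "Doe", home)   # same attributes, different identity
```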

Page 39: Data modeling fundamentals

Insurance – Relational Model

Diagram: table Persons (Id, LastName, FirstName), table PolciesHousehold (Id, IdAddress, Age, Surface, RoomsNo), table Policies (Id, Serial, Number, IssuedDate, BeginDate, EndDate, IdPerson, IdPolicyType, IdUser) and table PoliciesMotor (Id, ConstructionYear, CylCap, ChassisNo, PlateNo)

Page 40: Data modeling fundamentals

Insurance – Object Model

Page 42: Data modeling fundamentals

Data Flow between the 3 Models

Package diagram "Models": Domain Model (Entities, Value Objects), Relational Model (Tables) and Dimensional Model (Facts, Dimensions), connected by «flow» arrows

Page 43: Data modeling fundamentals

ORM / ETL

• ORM (Object-relational mapping) http://www.agiledata.org/essays/mappingObjects.html

• ETL (Extract, transform and load)
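Both ideas can be sketched in a few lines. The save and load_all functions below are a hand-written stand-in for what an ORM automates (object to row and back), and run_etl extracts relational rows, transforms them into totals, and loads them into a fact table. The Policy class, the tables, and all values are hypothetical; real ORM and ETL tools do far more than this.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Policy:
    """Domain object with hypothetical fields."""
    id: int
    policy_type: str
    issued_year: int
    premium: int


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Policies (Id INTEGER PRIMARY KEY, PolicyType TEXT, IssuedYear INTEGER, Premium INTEGER);
CREATE TABLE FactPremiums (PolicyType TEXT, IssuedYear INTEGER, TotalPremium INTEGER);
""")


def save(policy):
    """Object-relational mapping by hand: one object becomes one row."""
    conn.execute(
        "INSERT INTO Policies VALUES (?, ?, ?, ?)",
        (policy.id, policy.policy_type, policy.issued_year, policy.premium),
    )


def load_all():
    """Reverse mapping: each row becomes one object."""
    cursor = conn.execute(
        "SELECT Id, PolicyType, IssuedYear, Premium FROM Policies ORDER BY Id"
    )
    return [Policy(*row) for row in cursor]


def run_etl():
    # Extract: read the granular rows from the relational model.
    rows = conn.execute("SELECT PolicyType, IssuedYear, Premium FROM Policies").fetchall()
    # Transform: aggregate premiums per policy type and year.
    totals = {}
    for policy_type, year, premium in rows:
        key = (policy_type, year)
        totals[key] = totals.get(key, 0) + premium
    # Load: rebuild the fact table from the aggregated values.
    conn.execute("DELETE FROM FactPremiums")
    conn.executemany(
        "INSERT INTO FactPremiums VALUES (?, ?, ?)",
        [(t, y, total) for (t, y), total in sorted(totals.items())],
    )


for p in [Policy(1, "Motor", 2015, 100), Policy(2, "Motor", 2015, 50), Policy(3, "Household", 2014, 200)]:
    save(p)
run_etl()

policies = load_all()
facts = conn.execute(
    "SELECT PolicyType, IssuedYear, TotalPremium FROM FactPremiums ORDER BY PolicyType, IssuedYear"
).fetchall()
```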

Page 44: Data modeling fundamentals

Summary

• Relational Modeling
– Tables (columns, rows)
– Types of Relations
– Normal Forms

• Dimensional Modeling
– Facts and Dimensions
– De-Normalization

• Object Modeling
– Entities and Value Objects
– Inheritance, Aggregation, Association

Page 45: Data modeling fundamentals

Resources

• VTC – Data Modeling
• Pluralsight – Introduction to Data Warehousing