database design - for all levels
TRANSCRIPT
-
7/27/2019 Database Design - For ALL LEVELS
1/50
2010 Wipro Ltd - Confidential
Sandeep Kar
Database Design
-
7/27/2019 Database Design - For ALL LEVELS
2/50
2010 Wipro Ltd - Confidential2
Logical Database DesignPhysical Vs Logical Database ModelingNormalization TechniquesDe-normalization TechniquesData Modeling Tool - ERWIN
Topics We are going to discuss
-
7/27/2019 Database Design - For ALL LEVELS
3/50
2010 Wipro Ltd - Confidential3
Database design is the process of producing a detailed data model of adatabase.
Data modeling in software engineering is the process of creating a datamodel by applying formal data model descriptions using data modelingtechniques.
Data modeling is atechnique for defining business requirements for adatabase.
Definition
-
7/27/2019 Database Design - For ALL LEVELS
4/50
2010 Wipro Ltd - Confidential4
Types of Data ModelsTypes of Data Models
Model
Conceptual Logical Physical
-
7/27/2019 Database Design - For ALL LEVELS
5/50
2010 Wipro Ltd - Confidential5
Conceptual Data Model:A conceptual data model identifies thehighest-level relationships between the different entities.
Logical Data Modelinguses Entity-Relationship Modeling technique(E/R Modeling technique)
Physical Data Modelinguses Relational Modeling technique
Logical Data Independence :The ability to change the logical(conceptual) schema without changing the External schema is called logicaldata independence. E.g. Views can shield users from changes in thestructure of the real tables .
Modeling techniquesModeling techniques
-
7/27/2019 Database Design - For ALL LEVELS
6/50
2010 Wipro Ltd - Confidential6
Communicate via Business
Analyst
Data Modeler DBA Data Architect
Business
Analyst
--- Produces
Data Requirements
Use Cases
--- Data Requirements
Use Cases
LDM
Data Modeler Logical Data Model --- Physical Data Model Logical and PhysicalData Models
DBA --- Physical Data Model --- Physical Data Model
Data Architect Data RequirementsUse Cases
LDM
Logical and PhysicalData Models Performs the detailedphysical design
Creates the DDL and/or
required DML
Implements the model
into the database
Reviews the DataModels and assesses
the completeness for
the physical
implementation
Roles and responsibilitiesRoles and responsibilities
-
7/27/2019 Database Design - For ALL LEVELS
7/50
2010 Wipro Ltd - Confidential7
Includes the important entities and the relationships among them.
No attribute is specified.
No primary key is specified.
A conceptualentity-relationship model shows how the business world sees
information. It suppresses non-critical details in order to emphasize business rules and
user objects.
It typically includes only significant entities which have business
meaning, along with theirrelationships.
Conceptual Data Model:
-
7/27/2019 Database Design - For ALL LEVELS
8/50
2010 Wipro Ltd - Confidential8
Logical database designLogical database design
Derive a logical model from the information represented in theER model (conceptual model)
Validate the logical model to check if it fulfils clients
data and
transaction requirements
We focus on one type of logical model which is relationalmodel
Logical model = relational model Relational model means Collection of connected tables Mapping from conceptual model to logical model mainly
involves
Designing tables with primary keysAnd linking tables with foreign keys
Logical modelling is independent of database type
-
7/27/2019 Database Design - For ALL LEVELS
9/50
2010 Wipro Ltd - Confidential9
Logical Data ModelLogical Data Model
Creates/
Iscreatedbya
Customer
CUSTOMERNUMBER:Number
CUSTOMERFIRSTNAME:String
CUSTOMERLASTNAME:String
CUSTOMERBIRTHDATE:String
Transaction
TRANSACTIONIDENTIFIER:Number
CUSTOMERNUMBER:Number(FK)TRANSACTIONAMOUNT:Number
TRANSACTIONDATE:Datetime
-
7/27/2019 Database Design - For ALL LEVELS
10/50
2010 Wipro Ltd - Confidential10
How to read the cardinalitiesHow to read the cardinalities
Must have one and only one
May have one/Must not exceed one/Must have zero or one
May have zero, one or more
Must have at least one and possibly more
-
7/27/2019 Database Design - For ALL LEVELS
11/50
2010 Wipro Ltd - Confidential11
Entity is an LDM construct that has uniformstructure storing data instances in pre-definedformat
Attribute is part of an entity and stores atomic dataof uniform structure (e.g. number, string etc.)
Relationship is a modeling object that defines thedependency between two entities
Logical Data ModelLogical Data Model
-
7/27/2019 Database Design - For ALL LEVELS
12/50
2010 Wipro Ltd - Confidential12
Individual tables are derived from strong entities(entities with a clear Primary key)
Fields in the tables are derived from attributesassociated with entities
Define the data types of the fields Define the primary key of the table
Unique and Not Null
Foreign keys are decided later while modelling therelationships Not all tables (relations) have foreign keys However, relation model is incomplete without deciding
foreign keys
Entities & their AttributesEntities & their Attributes
-
7/27/2019 Database Design - For ALL LEVELS
13/50
2010 Wipro Ltd - Confidential13
Table is a physical implementation of entity (s). Ithas uniform structure and comprises of records
Column is a physical implementation of an attribute,stores a single value (although not necessarily) andhas uniform validation rule (s)
Reference is a physical implementation of arelationship that defines basic integrity rules in themodel. It can be implemented using the database,
trigger or application (e.g. Stored Procedure ofexternal application code)
Physical Data ModelPhysical Data Model
-
7/27/2019 Database Design - For ALL LEVELS
14/50
2010 Wipro Ltd - Confidential14
Constraints are also defined, including primary keys, foreign keys, otherunique keys, and check constraints.
Views can be created from database tables to summarize data or to simplyprovide the user with another perspective of certain data.
When physical modeling occurs, objects are being defined at the schemalevel. A schema is a group of related objects in a database. A databasedesign effort is normally associated with one schema.
Other objects such as indexes and snapshots can also be defined duringphysical modeling. Physical modeling is when all the pieces come togetherto complete the process of defining a database for a business.
Physical Database Modeling
-
7/27/2019 Database Design - For ALL LEVELS
15/50
2010 Wipro Ltd - Confidential15
Physical modeling involves the actual design of a database according to therequirements that were established during logical modeling.
Logical modeling mainly involves gathering the requirements of thebusiness, with the latter part of logical modeling directed toward the goalsand requirements of the database.
Physical modeling deals with the conversion of the logical, or businessmodel, into a relational database model.
During physical modeling, objects such as tables and columns are createdbased on entities and attributes that were defined during logical modeling.
Logical Vs Physical Database Modeling
-
7/27/2019 Database Design - For ALL LEVELS
16/50
2010 Wipro Ltd - Confidential16
Identifying Relationship
Non-Identifying Relationship
Weak Relationship
Binary relationships can be
One-to-one(1:1)
One-to-many(1:*)
Many-to-many(*:*)
Each of these is modelled differently
Understanding 1:* type is particularly important Many real world relationships are of type 1:*
Relationships
-
7/27/2019 Database Design - For ALL LEVELS
17/50
2010 Wipro Ltd - Confidential17
An identifying relationship means that the child table cannot be uniquely
identified without the parent. For example, you have this situation in theintersection table used to resolve a many-to-many relationship where the
intersecting table's Primary Key is a composite of the left and right
(parents) table's Primary Keys. Example...
Account (AccountID, AccountNum, AccountTypeID)
PersonAccount (AccountID, PersonID, Balance)
Person(PersonID, Name)
The Account to PersonAccount relationship and the Person to
PersonAccount relationship are identifying because the child row
(PersonAccount) cannot exist without having been defined in the parent
(Account or Person). In other words: there is no personaccount when there
is no Person or when there is no Account.
Identifying Relationships
-
7/27/2019 Database Design - For ALL LEVELS
18/50
2010 Wipro Ltd - Confidential18
A non-identifying relationship is one where the child can be identified
independently of the parent ( Account - AccountType)
Example...
Account( AccountID, AccountNum, AccountTypeID )
AccountType( AccountTypeID, Code, Name, Description )
The relationship between Account and AccountType is non-identifying
because each AccountType can be identified without having to exist in the
parent table.
Non-Identifying Relationships
-
7/27/2019 Database Design - For ALL LEVELS
19/50
2010 Wipro Ltd - Confidential19
These are the most common type of relationships Also known as parent:child relationship
One parent can have many children The entity on the One side of the relationship is known as the Parent
entity
The entity on the many side is known as the Child entity Our task: how 1:* relationship between two entities at ER
Model level is represented in a relational model We assume that both the participating entities are modelled as tables (
as explained earlier) Do we make any changes to these tables to reflect the relationship
between them? Yes, we use a foreign keyto mark the relationship
We make foreign key decision while modelling 1:* relationship
One-to-many (1:*) relationships
-
7/27/2019 Database Design - For ALL LEVELS
20/50
2010 Wipro Ltd - Confidential20
In a 1:* relationship Foreign key is designed as a column in the child table (table one the *
side) Foreign key references the parent table (table on the 1 side)
In other words, when you post a foreign key to a table it
means This table is the child table and For every row in the parent table, this table may havemore than one
(many) corresponding rows
Create a few rows of data in the tables participating in the 1:*relationship and check if the foreign key is acting as a link for
information from the child table to the information from theparent table Example data is always useful in designing foreign keys
Foreign Key Design
-
7/27/2019 Database Design - For ALL LEVELS
21/50
2010 Wipro Ltd - Confidential21
Consider the 1:* relationship Oversees between Staff and PropertyForRent
In this case, Staff is the Parent entity
Because it is on the one side of the relationship PropertyForRent is the child entity
Because it is one the many side of the relationship
When we model this relationship at the relational level We assume that Staff and PropertyForRent are modelled as tables as discussed earlier We post a copy of the PrimaryKey, StaffNo from the Parent entity, Staff as a foreign key
in the child entity,PropertyForRent
Our final tables are Staff(StaffNo, lName, fName, Position)
Primary key StaffNo PropertyForRent(PropertyNo, Street, Town, StaffNo)
Primary key ProperrtyNoForeign key StaffNo references Staff(StaffNo)
Example
Staff PropertyForRentOversees0..1 0..*
-
7/27/2019 Database Design - For ALL LEVELS
22/50
2010 Wipro Ltd - Confidential22
There are two methods to tackle *:* relationships Method: Create a new table to represent the relationship
We assume that the two entities participating in the relationship arealready modelled as tables as explained earlier
The third table is created to represent the relationship
This methods result in similar solutions Three tables, where one of the tables (relationship table) links both
the entity tables through foreign keys
Many-to-many (*:*) Relationships
-
7/27/2019 Database Design - For ALL LEVELS
23/50
2010 Wipro Ltd - Confidential23
Example: Method for modelling *:* relationship
Here, the *:* relationship between Client and
PropertyForRent is directly represented as a new table
viewing
Primary key for the new entity includes the two foreign
keys from the two participating entities
-
7/27/2019 Database Design - For ALL LEVELS
24/50
2010 Wipro Ltd - Confidential24
Generally, in relationship modelling we always identify theparent table Then post a copy of its primary key as the foreign key in the child
table
In this case of 1:1, max (cardinality) constraints which are 1:1do not help to identify the parent table
Therefore we use min (participation) constraints to identifythe parent table
For example, we choose the entity with min value zero as theparent entity, if the other participating entity has min value ofone
One-to-one (1:1) relationships
-
7/27/2019 Database Design - For ALL LEVELS
25/50
2010 Wipro Ltd - Confidential25
Complex relationships too can be simplified into simpler 1:1 or 1:*relationships first and then modelled at the logical level
Alternatively, a new table can be created to represent a complexrelationship.
Foreign keys are posted in the new table from all the participating entities
Complex Relationships
-
7/27/2019 Database Design - For ALL LEVELS
26/50
2010 Wipro Ltd - Confidential26
Complexity
Conceptual
Data Model
Logical
Data Model
Physical
Data Model
Dimensional
Data Model
Physical
Data Model
-
7/27/2019 Database Design - For ALL LEVELS
27/50
2010 Wipro Ltd - Confidential27
Enterprise Data Modeling tools:
Rational Rose (Object Oriented and Relational modeling)
Data Modeling tools:
AllFusion (ERWin)-This tool is a Market leader (according to GG and
Foresters) Power Designer-This tool is a major contender
Embarkaderos E/R Studio
Model repositories:
Model Mart (ERWin/AllFusion)
Power Designer Model repository
Classification of Data Modeling tools
-
7/27/2019 Database Design - For ALL LEVELS
28/50
2010 Wipro Ltd - Confidential28
Normalization is the process of efficiently organizing data in a database.
There are two goals of the normalization process:
eliminating redundant data
ensuring data dependencies (only storing related data in a table)
What Is Database Normalization?
-
7/27/2019 Database Design - For ALL LEVELS
29/50
2010 Wipro Ltd - Confidential29
Decreased storage requirements by effectively choosing the data type
Faster search performance!
Smaller file for table scans.
More directed searching.
Improved data integrity!
What are the Benefits of Database Normalization?
-
7/27/2019 Database Design - For ALL LEVELS
30/50
2010 Wipro Ltd - Confidential30
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF) Fifth Normal Form (5NF)
What are the Normal Forms?
-
7/27/2019 Database Design - For ALL LEVELS
31/50
2010 Wipro Ltd - Confidential31
Our Table
name phone1 phone2 email1 email2
Mike Hillyer 403-555-1717 403-555-1919 [email protected] [email protected]
Tom Jensen 403-555-1919 403-555-1313 [email protected] [email protected]
Ray Smith 403-555-1919 403-555-1111 [email protected]
user
name
nickname
phone1
phone2
phone3
cell
pager
address
city
province
postal_code
country
email1
email2web_url
company
department
picture
notes
email_format
-
7/27/2019 Database Design - For ALL LEVELS
32/50
2010 Wipro Ltd - Confidential32
Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify eachrow with a unique column or set of columns (the primary key).
Benefits
Easier to query/sort the data
More scalable
Each row can be identified for updating
First Normal Form
-
7/27/2019 Database Design - For ALL LEVELS
33/50
2010 Wipro Ltd - Confidential33
One Solution
first_name last_name phone email
Mike Hillyer 403-555-1717 [email protected]
Mike Hillyer 403-555-1919 [email protected]
Tom Jensen 403-555-1919 [email protected]
Tom Jensen 403-555-1313 [email protected]
Ray Smith 403-555-1919 [email protected]
Ray Smith 403-555-1111
Multiple rows per user
Emails are associated with only one other
phone
Hard to Search
user
first_name
last_name
nickname
phonecell
pager
address
city
province
postal_code
country
web_url
department
picture
notes
-
7/27/2019 Database Design - For ALL LEVELS
34/50
2010 Wipro Ltd - Confidential34
Satisfying 1NF
user
PK user_id
first_name
last_namenickname
address
city
province
postal_code
country
web_url
company
department
picture
notes
phone
PK phone_id
country_code
number
extension
email
PK email_id
address
-
7/27/2019 Database Design - For ALL LEVELS
35/50
2010 Wipro Ltd - Confidential35
Three Forms
One to (zero or) One
One to (zero or) Many
Many to Many
One to One
Same Table?
One to Many
Place PK of the One in the Many
Many to Many
Create a joining table
Forming Relationships
-
7/27/2019 Database Design - For ALL LEVELS
36/50
2010 Wipro Ltd - Confidential36
Joining Tables
user
PK user_id
first_name
last_name
nickname
address
city
province
postal_code
countryweb_url
picturenotes
email_format
email
PK address
FK1 user_id
user_phone
PK,FK1 phone_idPK user_id
type
phone
PK phone_id
country_code
number
extension
-
7/27/2019 Database Design - For ALL LEVELS
37/50
2010 Wipro Ltd - Confidential37
user
PK user_id
first_name
last_name
nickname
address
city
province
postal_code
country
web_url
picture
notes
email_format
email
PK address
FK1 user_id
user_phone
PK,FK1 phone_id
PK user_id
type
phone
PK phone_id
country_code
number
extension
Our User Table
first_name last_name company department
Mike Hillyer MySQL Documentation
Tom Jensen CPNS Finance
Ray Smith CPNS Documentation
-
7/27/2019 Database Design - For ALL LEVELS
38/50
2010 Wipro Ltd - Confidential38
Meet all the requirements of the first normal form.
Remove subsets of data that apply to multiple rows of a table and place
them in separate tables.
Create relationships between these new tables and their predecessors
through the use of foreign keys.
Benefits
Increased storage efficiency
Less data repetition
Second Normal Form
-
7/27/2019 Database Design - For ALL LEVELS
39/50
2010 Wipro Ltd - Confidential39
Satisfying 2NF
email
PK address
type
FK1 user_id
phone
PK phone_id
country_code
number
extension
type
user
PK user_id
first_name
last_name
nickname
address
city
province
postal_code
country
web_url
picture
notes
user_phone
PK,FK1 user_id
PK,FK2 phone_id
company
PK company_id
name
user_company
PK,FK1 user_id
PK,FK2 company_id
department
email
PK address
FK1 user_id
user
PK user_id
first_name
last_name
nickname
address
city
province
postal_code
country
web_url
picture
notes
email_format
-
7/27/2019 Database Design - For ALL LEVELS
40/50
2010 Wipro Ltd - Confidential40
Table must be in Second Normal Form
If your table is 2NF, there is a good chance it is 3NF
Remove columns that are not dependent upon the primary key.
All attributes that are not part of the key must not depend on
any other non-key attributes.
Benefits
No extraneous data
Third Normal Form
-
7/27/2019 Database Design - For ALL LEVELS
41/50
2010 Wipro Ltd - Confidential41
Satisfying 3NF
email
PK address
FK1 user_id
format
phone
PK phone_id
country_codenumber
type
user
PK user_id
first_namelast_name
nickname
address
city
province
postal_code
country
web_url
picture
notes
user_phone
PK,FK1 user_id
PK,FK2 phone_id
extension
company
PK company_id
name
user_company
PK,FK1 user_id
PK,FK2 company_id
department
-
7/27/2019 Database Design - For ALL LEVELS
42/50
2010 Wipro Ltd - Confidential42
Finding Balance
email
PK address
FK1 user_id
format
phone
PK phone_id
FK1 type_id
area_code
NXX
NCXFK2 country_id
user
PK user_id
first_name
last_name
nickname
unit
street_number
street_name
street_type
quadrant
web_url
picture
notes
FK1 postal_code
user_phone
PK,FK1 user_id
PK,FK2 phone_id
extension
company
PK company_id
name
user_department
PK,FK1 user_id
PK,FK2 department_id
type
PK type_id
type
country
PK country_id
Name
phone_code
department
PK department_id
name
FK1 company_id
postal_code
PK postal_code
FK1 city_id
province
PK province_id
Name
Abbreviation
FK1 country_id
city
PK city_id
name
FK1 province_id
-
7/27/2019 Database Design - For ALL LEVELS
43/50
2010 Wipro Ltd - Confidential43
Only one valid reason exists for de-normalizing a relational design - to
enhance performance.
De-normalization is the process of putting one fact in numerous places.
This speeds data retrieval at the expense of data modification.
Normalize first, then de-normalize
Use only when you cannot optimize
Try temp tables, UNIONs, VIEWs,
Always monitor and periodically re-evaluate all de-normalized
applications.
I/O saved , CPU saved , complexity of update programming , cost of
returning to a normalized design
Goal of de-normalization
-
7/27/2019 Database Design - For ALL LEVELS
44/50
2010 Wipro Ltd - Confidential44
Types of De-normalization
-
7/27/2019 Database Design - For ALL LEVELS
45/50
2010 Wipro Ltd - Confidential45
Surrogate Key: A unique {primary key}generated by the {RDBMS}that isnot derived from any data in the database and whose only significance is toact as the primary key. A surrogate key is frequently a sequential number(e.g. a {Sybase}"{identity column})
Candidate Key: is a combination of attributes that can be uniquely used to
identify a database record without any extraneous data. Each table mayhave one or more candidate keys. One of these candidate keys is selectedas the table primary key.
Vertical slicing: A Projection is a vertical slicing of the table. You simplyindicate which columns you want.
Horizontal slicing: A Selection is a horizontal slicing of the table. Theselection defines which records you returned out of all possible records inthe table. The projection is defined in the WHERE clause of the SQLstatement.
Concepts
-
7/27/2019 Database Design - For ALL LEVELS
46/50
2010 Wipro Ltd - Confidential46
Using the Data Modeling tools
CASE tool
repository
Target
DB
CASE
Tool
Model
Meta-model
DB
Meta-model
CompareGenerate
DDL
-
7/27/2019 Database Design - For ALL LEVELS
47/50
2010 Wipro Ltd - Confidential47
Purpose of Model Mart in Erwin is to allow centralized sharing of models
automate programming tasks and forward and reverse engineer databases
include data types, macros and model storage that a data modeler must
address.
We can generate the Erwin met model to create a data dictionary that
stores information about the data structures used in Erwin models
Attribute Name as it appears in the logical model.
Data type lists the default data type assigned to the table column. The
actual data type used is dependent on which target server you use to hold
the Erwin dictionary.
Description indicates the type of information stored by the attribute.
Valid Values lists the valid values, the reference table that contains thevalid value list, or validation expression for the attribute. Where no
validation rule has been defined, the data must conform to the attributes
data type.
FK indicates if the attribute is a foreign key (FK) attribute. This column is
blank for non-foreign key attributes.
Data Modeling Tool - ERWIN
-
7/27/2019 Database Design - For ALL LEVELS
48/50
2010 Wipro Ltd - Confidential48
Pick the topic for analysis
Create the Conceptual Data Model
Create LDM
Validate the LDM (Formal validation, normalization etc.)
Perform the transition from LDM to PDM using ERWin
Create PDM
Perform the Physical design (indexes, table spaces, buffer pools, partitioning etc.) Validate the PDM (make sure that de-normalization did not change the data integirty,
validate reference implementation through RI)
Perform the model walkthrough (attack the PDM to see if we have any holes)
Implement OLTP model (enjoy the work of DBA)
Play with Data Model/Database to see what we can/cant do (experience the SQL againstour masterpiece)
Define the requirements for Decision support system (understand what Data Analyticsmeans and why we have to create a separate data model for this)
Create DSS model (Star, Snow flake, Constellation, Data Mart, Data Warehouse)
PlanPlan
-
7/27/2019 Database Design - For ALL LEVELS
49/50
Questions?
-
7/27/2019 Database Design - For ALL LEVELS
50/50
2010 Wipro Ltd - Confidential
Thank You
Sandeep Kar
Module Leader