
Upload: ilene-hubbard

Post on 20-Jan-2016


Relational Model & Normalization

Relational terminology
Anomalies and the need for normalization
Normal forms
Relation synthesis
De-normalization

Why the Relational Model?

General model
DBMS-independent design
Widely used in DBMS products
But we must deal with anomalies

Relational Terminology

Relation: schema or structure versus instance

EMPLOYEE(Name, Age, Sex, EmployeeNumber)

When is a table a relation?

Single value cells - no repeating groups or arrays

Each attribute has unique name
All values in a column are of same kind
Order of columns is not significant
No identical rows
Order of rows is not significant

Approaches to Relation Design

Analysis
– start with table structure and normalize (eliminate anomalies)
– Entity Relationship Model (3rd Normal Form)
Synthesis
– construct relations from attributes

Basic Concepts

Functional Dependency
– relationship between or among attributes
Key
– group of one or more attributes that uniquely identifies a row

Functional Dependency

Y is functionally dependent on X if the value of X determines the value of Y
– if we know the value of X, we can obtain (look up, compute, …) the value of Y
– determined by user model and business rules
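The definition above can be tested against sample data. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the helper name `holds` and the sample rows are hypothetical and do not come from the slides. A table instance can only refute a dependency. Whether the dependency truly holds is decided by the business rules.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows: list of dicts (one dict per tuple)
    lhs, rhs: lists of attribute names
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The same X value must always map to the same Y value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample data, not taken from the slides.
students = [
    {"StudentID": 100, "StudentName": "Jones", "Activity": "Golf"},
    {"StudentID": 100, "StudentName": "Jones", "Activity": "Skiing"},
    {"StudentID": 200, "StudentName": "Davis", "Activity": "Golf"},
]

print(holds(students, ["StudentID"], ["StudentName"]))  # True
print(holds(students, ["Activity"], ["StudentName"]))   # False
```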

Functional Dependency Example

StudentID → StudentName

(StudentID is the determinant)

Functional Dependency Notation

X → (Y, Z)
(X, Y) → Z

X → (Y, Z) is equivalent to X → Y & X → Z
(X, Y) → Z does not imply X → Z & Y → Z

Keys

Single or group attributes
Depend on user model
Example: why {SID, Activity}?
Is there another option?

Functional Dependencies, Keys and Uniqueness

Key is always unique
Key functionally determines entire row
Determinant need not be unique, hence is not necessarily a key
Example: Activity → Fee
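One way to see why a determinant is not necessarily a key is to compute what each attribute set determines. The sketch below uses the standard attribute-closure procedure on ACTIVITY(SID, Activity, Fee). It assumes that Activity → Fee is the only dependency. The function names `closure` and `is_superkey` are illustrative.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs.

    fds: list of (lhs, rhs) pairs, each a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If every attribute on the left is known, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    """A superkey determines every attribute of the relation."""
    return closure(attrs, fds) == set(all_attrs)


# ACTIVITY(SID, Activity, Fee); assumed single dependency Activity -> Fee.
attributes = {"SID", "Activity", "Fee"}
fds = [({"Activity"}, {"Fee"})]

print(is_superkey({"SID", "Activity"}, attributes, fds))  # True
print(is_superkey({"Activity"}, attributes, fds))         # False
```

Activity is a determinant (it determines Fee), yet it does not determine SID, so it cannot identify a row by itself.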

Reality check

ProjectID → EmployeeName?
ProjectID → EmployeeSalary?
(ProjectID, EmployeeName) → EmployeeSalary?
EmployeeName → EmployeeSalary?
EmployeeSalary → ProjectID?
EmployeeSalary → (ProjectID, EmployeeName)?
What is the key?

Normalization

Modification Anomalies
Referential Integrity Constraint
Normal Forms
Golden Rule: "A relation should have a single theme; if it has more, break it into more relations."

Modification Anomalies

What happens when you want to
– add a new book?
– change the address of a patron?
– delete a patron record?

PatronName  PatronAddress  BookID  BookTitle  BookAuthor  BorrowDate  DueDate  ReturnDate
Smith       12 Elk         AAA     Peace      Bart        2/4         2/18     2/15
Jones       25 Sun         BBB     War        Hine        2/4         2/18     2/19
Hart        73 Sera        CCC     System     Vang        2/5         2/19     2/23
Hicks       22 Main        AAA     Peace      Bart        2/12        2/25     2/28
Rice        69 Witt        DDD     Spring     Lyon        2/6         2/20     2/8
Jones       25 Sun         CCC     System     Vang        1/26        2/7      2/6

Modification Anomalies

Deletion anomaly
– deleting one fact about an entity deletes a fact about another entity
Insertion anomaly
– cannot insert one fact about an entity unless a fact about another entity is also added
Update anomaly
– changing one fact about an entity requires multiple changes to a table
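The deletion and update anomalies can be reproduced with the library checkout rows from the earlier slide. This is only a sketch: the rows are held in a plain Python list, the date columns are left out for brevity, and the variable names are illustrative.

```python
# Rows of the library table from the slide:
# (PatronName, PatronAddress, BookID, BookTitle, BookAuthor)
# Date columns are omitted to keep the sketch short.
checkouts = [
    ("Smith", "12 Elk",  "AAA", "Peace",  "Bart"),
    ("Jones", "25 Sun",  "BBB", "War",    "Hine"),
    ("Hart",  "73 Sera", "CCC", "System", "Vang"),
    ("Hicks", "22 Main", "AAA", "Peace",  "Bart"),
    ("Rice",  "69 Witt", "DDD", "Spring", "Lyon"),
    ("Jones", "25 Sun",  "CCC", "System", "Vang"),
]

# Deletion anomaly: removing Rice's only checkout also removes
# the only record of book DDD.
books_before = {row[2] for row in checkouts}
remaining = [row for row in checkouts if row[0] != "Rice"]
books_after = {row[2] for row in remaining}
lost_books = books_before - books_after
print(lost_books)  # {'DDD'}

# Update anomaly: Jones's address is stored once per checkout,
# so a change of address touches more than one row.
rows_to_change = sum(1 for row in checkouts if row[0] == "Jones")
print(rows_to_change)  # 2
```

The insertion anomaly is the mirror image: a new book cannot be recorded in this table until some patron borrows it.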

Referential Integrity Constraint

When we split a relation, we must pay attention to the references across the newly formed relations

E.g., a book must exist before it can be checked out:
– CHECKOUT[BookID] ⊆ BOOK[BookID]

The DBMS or the applications will have to check/enforce constraints
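As an illustration of the DBMS enforcing the constraint, the sketch below uses Python's built-in `sqlite3` module. The table and column names (`book`, `checkout`, `book_id`) and the book ID `'ZZZ'` are hypothetical. SQLite checks foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE book (book_id TEXT PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE checkout ("
    " patron TEXT,"
    " book_id TEXT REFERENCES book(book_id))"
)
conn.execute("INSERT INTO book VALUES ('AAA', 'Peace')")
conn.execute("INSERT INTO checkout VALUES ('Smith', 'AAA')")  # book exists

try:
    # 'ZZZ' is a made-up ID with no matching row in book.
    conn.execute("INSERT INTO checkout VALUES ('Hart', 'ZZZ')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True
```

Without the pragma, or in a system with no foreign key support, the same check would have to be written into every application that inserts checkouts.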

Classification of relations

[Figure: classification of relations; outermost class: all relations]

Second Normal Form

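Single attribute key, or all non-key attributes are dependent on the entire key
– ACTIVITY(SID, Activity, Fee): with key {SID, Activity} and Activity → Fee, Fee depends on only part of the key, so the relation is not in 2NF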
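A common fix for the ACTIVITY example is to split it into one relation per theme. The sketch below uses hypothetical sample rows and hypothetical relation names (`stu_act`, `act_fee`). It projects the table onto (SID, Activity) and (Activity, Fee), then checks that joining the projections gives back the original rows.

```python
# Hypothetical rows of ACTIVITY(SID, Activity, Fee); key is (SID, Activity).
activity = [
    (100, "Golf", 65),
    (100, "Skiing", 200),
    (200, "Golf", 65),
]

# Project onto the two themes: who does what, and what each activity costs.
stu_act = sorted({(sid, act) for sid, act, _ in activity})
act_fee = dict(sorted({(act, fee) for _, act, fee in activity}))

# Joining the projections on Activity rebuilds the original rows.
rejoined = [(sid, act, act_fee[act]) for sid, act in stu_act]
print(rejoined == sorted(activity))  # True
print(act_fee)  # {'Golf': 65, 'Skiing': 200}
```

Each fee is now stored once, so changing the golf fee is a single update.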

Third Normal Form

No transitive dependencies
– WORKER(Employee, Dept, Location)
– WORKER(Employee, Dept)
  OFFICE(Dept, Location)

Quick Quiz

Determine if the following relations are in 1NF, 2NF or 3NF

Rewrite each relation in 3NF
– EMPLOYEE(EmpID, EmpName, JobCode)
– EMPLOYEE(EmpID, EmpName, JobCode, JobDesc)
– EMPLOYEE(EmpID, EmpName, ProjectID, HrsWorked)

Boyce-Codd Normal Form

Every determinant is a candidate key
– ADVISER(SID, Major, Fname)
– STU-ADV(SID, Fname)
  ADV-SUBJ(Fname, Subject)
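The BCNF condition can be checked mechanically once the dependencies are written down. The sketch below assumes two dependencies for ADVISER that the slide does not state: (SID, Major) → Fname and Fname → Major. The helper names are illustrative. The check accepts any determinant that determines the whole relation, which is enough to flag the violation here.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    while True:
        added = {a for lhs, rhs in fds if lhs <= result for a in rhs} - result
        if not added:
            return result
        result |= added


def bcnf_violations(all_attrs, fds):
    """Return the determinants that do not determine the whole relation."""
    return [lhs for lhs, _ in fds if closure(lhs, fds) != set(all_attrs)]


# Assumed dependencies for ADVISER(SID, Major, Fname); not given on the slide.
attrs = {"SID", "Major", "Fname"}
fds = [
    ({"SID", "Major"}, {"Fname"}),
    ({"Fname"}, {"Major"}),
]
print(bcnf_violations(attrs, fds))  # [{'Fname'}]
```

Under these assumptions Fname is a determinant but not a candidate key, which is why the relation is split.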

Multi-valued Dependency

Two or more functionally independent multi-valued attributes are dependent on another attribute
– EMPLOYEE(Name, Dependent, Project)
Data redundancy and modification anomalies
4NF: BCNF & no multi-valued dependencies
– EMPLOYEE(Name, Dependent)
– EMPLOYEE(Name, Project)
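The redundancy caused by a multi-valued dependency shows up as a cross product. In this sketch the employee, dependents, and projects are made-up sample values. The single relation must pair every dependent with every project, while the two 4NF relations store each fact once.

```python
from itertools import product

# Hypothetical facts about one employee.
dependents = {"Jones": ["Ann", "Bob"]}
projects = {"Jones": ["P1", "P2", "P3"]}

# Single relation: every dependent must be paired with every project.
combined = [
    (name, dep, proj)
    for name in dependents
    for dep, proj in product(dependents[name], projects[name])
]

# 4NF: one relation per independent multi-valued fact.
emp_dependent = [(n, d) for n, ds in dependents.items() for d in ds]
emp_project = [(n, p) for n, ps in projects.items() for p in ps]

print(len(combined))                          # 6
print(len(emp_dependent) + len(emp_project))  # 5
```

The gap widens quickly: the combined table grows with the product of the two list lengths, the split tables with their sum.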

Domain/Key Normal Form

Every constraint on the relation is a logical consequence of the definitions of keys and domains

Constraints: rules, functional and multi-valued dependencies, anything that can be statically ascertained as true or false

Enforcing key and domain restrictions causes all of the constraints to be met

Summary of Normal Forms

De-Normalization

Many databases are not normalized, or are poorly normalized, which implies bad design

We may also want to de-normalize to improve efficiency or ease of use

Consider the alternatives:
– CUSTOMER(CustNo, CustName, City, State, Zip)
– CUSTOMER(CustNo, CustName, Zip)
  CODES(Zip, City, State)
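The trade-off between the two alternatives can be sketched with Python's built-in `sqlite3` module. The lowercase table and column names and the sample rows are hypothetical. In the normalized design City and State are stored once per Zip, but every full-address listing needs a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE codes (zip TEXT PRIMARY KEY, city TEXT, state TEXT);
    CREATE TABLE customer (
        cust_no INTEGER PRIMARY KEY,
        cust_name TEXT,
        zip TEXT REFERENCES codes(zip)
    );
    -- Hypothetical sample rows.
    INSERT INTO codes VALUES ('98101', 'Seattle', 'WA');
    INSERT INTO customer VALUES (1, 'Acme', '98101');
    INSERT INTO customer VALUES (2, 'Bolt', '98101');
""")

# The normalized design needs a join to list a full address.
rows = conn.execute(
    "SELECT c.cust_no, c.cust_name, k.city, k.state, c.zip"
    " FROM customer AS c JOIN codes AS k ON k.zip = c.zip"
    " ORDER BY c.cust_no"
).fetchall()
print(rows)
```

The de-normalized CUSTOMER(CustNo, CustName, City, State, Zip) would return the same rows without the join, at the cost of repeating City and State for every customer in the same Zip.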

Optimization

There may be more than one way to normalize a table
– COLLEGE(CollegeName, Dean, AsstDean)
  » DEAN(CollegeName, Dean)
    ASSTDEAN(CollegeName, AsstDean)
  » COLLEGE(CollegeName, Dean, AsstDean1, AsstDean2, AsstDean3)

Which is best depends on efficiency considerations

Synthesis