lecture 7 - is220site.files.wordpress.com · informal design guidelines for relational databases....

IS220: Database Fundamentals. LECTURE 7: FUNCTIONAL DEPENDENCIES AND NORMALIZATION FOR RELATIONAL DATABASES. Ref. “Chapter 13” from “Database Systems: A Practical Approach to Design, Implementation and Management.” Thomas Connolly, Carolyn Begg.

Upload: vothuan

Post on 04-Jun-2018


Page 1: LECTURE 7 - is220site.files.wordpress.com · Informal Design Guidelines for Relational Databases. The concept of functional dependency, which describes the relationship between attributes

IS220: Database Fundamentals

LECTURE 7:

FUNCTIONAL DEPENDENCIES AND

NORMALIZATION FOR RELATIONAL

DATABASES

Ref. “Chapter 13” from

“Database Systems: A Practical Approach to Design, Implementation and Management.”

Thomas Connolly, Carolyn Begg.

Objectives

Informal Design Guidelines for Relational Databases.

The concept of functional dependency, which describes the relationship between attributes.

The characteristics of functional dependencies used in normalization.

The purpose of normalization.

How normalization can be used when designing a relational database.

How normalization uses functional dependencies to group attributes into relations that are in a known normal form.

How to identify the most commonly used normal forms, namely First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF).

Informal Design Guidelines for Relational Databases

What is relational database design?

➢ The grouping of attributes to form "good" relation schemas

Two levels of relation schemas

➢ The logical "user view" level.

➢ The storage "base relation" level.

Design is concerned mainly with base relations

What are the criteria for "good" base relations?

Informal Design Guidelines for Relation Schemas (cont.)

Four informal measures of quality for relation schema design:

1. Semantics of the attributes.

2. Reducing the redundant information in tuples.

3. Reducing the NULL values in tuples.

4. Disallowing the possibility of generating spurious tuples.

1. Semantics of the Relation Attributes

GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to individual relations and their attributes.)

Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation.

Only foreign keys should be used to refer to other entities.

Entity and relationship attributes should be kept apart as much as possible.

Bottom Line: Design a schema that can be explained easily relation by relation. The semantics of attributes should be easy to interpret.

[Figure: A simplified COMPANY relational database schema]

2. Redundant Information in Tuples and Update Anomalies

Redundant information in tuples:

➢ Wastes storage.

➢ Causes update anomalies:

Insertion anomalies

Deletion anomalies

Modification anomalies

2. Redundant Information in Tuples and Update Anomalies (cont.)

GUIDELINE 2:

➢ Design a schema that does not suffer from insertion, deletion, and update anomalies.

EXAMPLE OF AN UPDATE ANOMALY

Consider the relation:

EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

Update Anomaly:

Changing the name of project number P1 from “Billing” to “Customer-Accounting” requires the update to be made in the tuples of all 100 employees working on project P1. If any tuple is missed, the relation becomes inconsistent.

EXAMPLE OF AN INSERT ANOMALY

Consider the relation:

EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

Insert Anomaly:

Cannot insert a project unless an employee is assigned to it.

Conversely, cannot insert an employee unless he/she is assigned to a project.

EXAMPLE OF A DELETE ANOMALY

Consider the relation:

EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

Delete Anomaly:

➢ When a project is deleted, all the employees who work on that project are deleted as well.

➢ Alternatively, if an employee is the sole employee on a project, deleting that employee also deletes the corresponding project.
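The three anomalies can be reproduced on a tiny in-memory version of EMP_PROJ. The following is a minimal Python sketch, not part of the lecture: the sample rows, the employee and project names other than “Billing” and “Customer-Accounting”, and the helper names `rename_project` and `delete_employee` are all hypothetical.

```python
# EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) as a list of dicts.
# All sample values below are made up for illustration.
emp_proj = [
    {"emp": "E1", "proj": "P1", "ename": "Ann", "pname": "Billing", "hours": 10},
    {"emp": "E2", "proj": "P1", "ename": "Bob", "pname": "Billing", "hours": 20},
    {"emp": "E3", "proj": "P2", "ename": "Cat", "pname": "Payroll", "hours": 5},
]


def rename_project(rows, proj, new_name):
    """Update anomaly: the name is stored once per assignment,
    so every matching row has to be changed."""
    changed = 0
    for row in rows:
        if row["proj"] == proj:
            row["pname"] = new_name
            changed += 1
    return changed


def delete_employee(rows, emp):
    """Return the rows that remain after removing one employee."""
    return [row for row in rows if row["emp"] != emp]


# Update anomaly: one logical change touches two rows.
rows_touched = rename_project(emp_proj, "P1", "Customer-Accounting")

# Delete anomaly: E3 is the sole employee on P2, so P2 disappears too.
remaining = delete_employee(emp_proj, "E3")
projects_left = {row["proj"] for row in remaining}

# Insert anomaly: a project with no employee has no value for the
# Emp# part of the key (Emp#, Proj#), so the row cannot be identified.
unstaffed = {"emp": None, "proj": "P3", "ename": None, "pname": "Audit", "hours": None}
key_is_complete = unstaffed["emp"] is not None and unstaffed["proj"] is not None
```

With separate employee, project, and assignment relations, the rename would touch a single row and neither the delete nor the insert would lose or block unrelated facts.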


[Figure: Example of two relation schemas suffering from update anomalies]

[Figure: Example states for EMP_DEPT and EMP_PROJ]

3. Null Values in Tuples

GUIDELINE 3:

▪ Relations should be designed such that their tuples will have as few NULL values as possible.

▪ Attributes that are NULL frequently could be placed in separate relations (with the primary key).

Reasons for nulls:

▪ Attribute not applicable or invalid.

▪ Attribute value unknown (may exist).

▪ Value known to exist, but unavailable.

4. Generation of Spurious Tuples

Bad designs for a relational database may produce erroneous results for certain JOIN operations.

The "lossless join" property guarantees meaningful results for join operations: joining the decomposed relations yields the original tuples and no spurious ones.

GUIDELINE 4:

The relations should be designed to satisfy the lossless join condition.
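How a poor decomposition generates spurious tuples can be seen on a two-row example. This is a minimal Python sketch under stated assumptions: the relation `works`, its attribute names and values, and the helpers `project`, `natural_join`, and `as_set` are hypothetical and written only for this illustration.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; rows are dicts."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(seen)]


def natural_join(left, right):
    """Natural join on all attribute names the two inputs share."""
    if not left or not right:
        return []
    common = sorted(set(left[0]) & set(right[0]))
    joined = []
    for lrow in left:
        for rrow in right:
            if all(lrow[a] == rrow[a] for a in common):
                joined.append({**lrow, **rrow})
    return joined


def as_set(rows):
    """Order-independent view of a list of dict rows."""
    return {tuple(sorted(row.items())) for row in rows}


# Hypothetical relation: who works on which project, and where.
works = [
    {"ename": "Ann", "proj": "P1", "location": "North"},
    {"ename": "Bob", "proj": "P2", "location": "North"},
]

# Bad decomposition: the only shared attribute, location, is not a key
# of either projection, so the join pairs up unrelated rows.
bad = natural_join(project(works, ["ename", "location"]),
                   project(works, ["proj", "location"]))

# Better decomposition: the shared attribute proj determines location.
good = natural_join(project(works, ["ename", "proj"]),
                    project(works, ["proj", "location"]))

# Tuples produced by the bad join that were never in the original.
spurious = as_set(bad) - as_set(works)
```

Here the bad join returns four rows, two of which (Ann on P2, Bob on P1) were never in `works`, while the second decomposition joins back to exactly the original relation.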


Functional Dependencies

An important concept associated with normalization is functional dependency, which describes the relationship between attributes.

For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B), if each value of A in R is associated with exactly one value of B in R.
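The definition translates directly into a check on a relation instance: no value of A may appear with two different values of B. The following is a small Python sketch; the function name `fd_holds` and the sample Staff rows are assumptions made for illustration and do not come from the lecture.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this relation instance.

    rows: list of dicts; lhs, rhs: lists of attribute names.
    """
    seen = {}
    for row in rows:
        a = tuple(row[x] for x in lhs)
        b = tuple(row[x] for x in rhs)
        # setdefault stores b the first time a is seen; a later,
        # different b for the same a violates the dependency.
        if seen.setdefault(a, b) != b:
            return False
    return True


# Hypothetical Staff rows (values are illustrative only).
staff = [
    {"staffNo": "S1", "sName": "Lee", "branchNo": "B1"},
    {"staffNo": "S2", "sName": "Kim", "branchNo": "B1"},
    {"staffNo": "S3", "sName": "Lee", "branchNo": "B2"},
]

staffno_determines_name = fd_holds(staff, ["staffNo"], ["sName"])
name_determines_staffno = fd_holds(staff, ["sName"], ["staffNo"])
```

A check like this can only refute a dependency on the data at hand. Whether a dependency really holds is a matter of the attributes' semantics, not of one sample.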


Pearson Education © 2009


Characteristics of Functional Dependencies

Functional dependency is a property of the meaning or semantics of the attributes in a relation.

Diagrammatic representation.

The determinant of a functional dependency refers to the attribute or group of attributes on the left-hand side of the arrow.

[Figure: An Example of Functional Dependency (Slide 19)]

[Figure: Example (Slide 20)]

Example Functional Dependency that Holds for All Time

Consider the values shown in staffNo and sName attributes of the Staff relation (see Slide 20).

Based on sample data, the following functional dependencies appear to hold.

staffNo → sName

sName → staffNo

However, the only functional dependency that remains true for all possible values for the staffNo and sName attributes of the Staff relation is:

staffNo → sName

Characteristics of Functional Dependencies

Determinants should have the minimal number of attributes necessary to maintain the functional dependency with the attribute(s) on the right-hand side.

This requirement is called full functional dependency.

Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.

Characteristics of Functional Dependencies (cont.)

A functional dependency A → B is a full functional dependency if removal of any attribute from A results in the dependency no longer existing.

A functional dependency A → B is a partial dependency if there is some attribute that can be removed from A and yet the dependency still holds.

Example of Full and Partial Functional Dependency

The following functional dependency exists in the Staff relation (see Slide 20):

staffNo, sName → branchNo

True: each value of (staffNo, sName) is associated with a single value of branchNo.

However, branchNo is also functionally dependent on a subset of (staffNo, sName), namely staffNo. The example above is therefore a partial dependency.
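The test for a partial dependency can be sketched in Python: a dependency is partial if some proper subset of its determinant already determines the right-hand side. The sketch assumes the dependencies are given as a list of (left, right) attribute sets and uses the standard attribute-closure computation; the names `closure` and `is_partial` are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_partial(lhs, rhs, fds):
    """True if some proper, non-empty subset of lhs already determines rhs."""
    lhs = sorted(lhs)
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if set(rhs) <= closure(subset, fds):
                return True
    return False


# The single dependency assumed here is the slide's remark that
# branchNo is functionally dependent on staffNo alone.
fds = [({"staffNo"}, {"branchNo"})]

partial = is_partial({"staffNo", "sName"}, {"branchNo"}, fds)
full = not is_partial({"staffNo"}, {"branchNo"}, fds)
```

Under that assumption, `partial` is true for (staffNo, sName) → branchNo, and staffNo → branchNo is full because a single-attribute determinant has no proper non-empty subset.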


Summary of Functional Dependencies Characteristics

Main characteristics of functional dependencies used in normalization:

1. There is a one-to-one relationship between the attribute(s) on the left-hand side (determinant) and those on the right-hand side of a functional dependency.

2. Holds for all time.

3. The determinant has the minimal number of attributes necessary to maintain the dependency with the attribute(s) on the right-hand side.

Transitive Dependencies

It is important to recognize a transitive dependency because its existence in a relation can potentially cause update anomalies.

Transitive dependency describes a condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).

Example of a Transitive Dependency

Consider functional dependencies in the StaffBranch relation (see Slide 20).

staffNo → sName, position, salary, branchNo, bAddress

branchNo → bAddress

bAddress is transitively dependent on staffNo via branchNo, because staffNo → branchNo and branchNo → bAddress.

Identifying Functional Dependencies

Identifying all functional dependencies between a set of attributes is relatively simple if the meaning of each attribute and the relationships between the attributes are well understood.

This information should be provided by the enterprise in the form of discussions with users and/or documentation such as the users’ requirements specification.


Example - Identifying a set of functional dependencies.

Examine semantics of attributes in StaffBranch relation (see Slide 20). Assume that position held and branch determine a member of staff’s salary.

With sufficient information available, identify the functional dependencies for the StaffBranch relation as:

staffNo → sName, position, salary, branchNo, bAddress

branchNo → bAddress

bAddress → branchNo

branchNo, position → salary

bAddress, position → salary


Identifying the Primary Key for a Relation using Functional Dependencies

The main purpose of identifying a set of functional dependencies for a relation is to specify the set of integrity constraints that must hold on a relation.

An important integrity constraint to consider first is the identification of candidate keys, one of which is selected to be the primary key for the relation.

Example - Identify Primary Key for StaffBranch Relation

StaffBranch relation has five functional dependencies (see Slide 29).

The determinants are staffNo, branchNo, bAddress, (branchNo, position), and (bAddress, position).

To identify all candidate key(s), identify the attribute (or group of attributes) that uniquely identifies each tuple in this relation.

Example - Identifying Primary Key for StaffBranch Relation

All attributes that are not part of a candidate key should be functionally dependent on the key.

The only candidate key, and therefore primary key, for the StaffBranch relation is staffNo, as all other attributes of the relation are functionally dependent on staffNo.
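The same conclusion can be reached mechanically: a candidate key is a minimal set of attributes whose closure under the functional dependencies is the whole relation. The Python sketch below encodes the five StaffBranch dependencies from Slide 29; the function names `closure` and `candidate_keys` are hypothetical, and the brute-force enumeration is only meant for small relations like this one.

```python
from itertools import combinations

ATTRS = {"staffNo", "sName", "position", "salary", "branchNo", "bAddress"}

# The five functional dependencies listed for StaffBranch.
FDS = [
    ({"staffNo"}, {"sName", "position", "salary", "branchNo", "bAddress"}),
    ({"branchNo"}, {"bAddress"}),
    ({"bAddress"}, {"branchNo"}),
    ({"branchNo", "position"}, {"salary"}),
    ({"bAddress", "position"}, {"salary"}),
]


def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            # Skip supersets of a key already found: they are not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


keys = candidate_keys(ATTRS, FDS)
```

For these dependencies the search returns staffNo as the only candidate key: no combination that omits staffNo can reach staffNo or sName through the listed dependencies.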


Normalization

Normalization is a database design technique that begins by examining the relationships (called functional dependencies) between attributes.

Purpose of Normalization

The purpose of normalization is to identify a suitable set of relations that support the data requirements of an enterprise.

Characteristics of a suitable set of relations include:

➢ the minimal number of attributes necessary to support the data requirements of the enterprise;

➢ attributes with a close logical relationship are found in the same relation;

➢ minimal redundancy with each attribute represented only once with the important exception of attributes that form all or part of foreign keys.

Purpose of Normalization (cont.)

The benefits of using a database that has a suitable set of relations are that the database will:

➢ be easier for the user to access and maintain;

➢ take up minimal storage space on the computer.

How Normalization Supports Database Design

There are two main approaches for using normalization:

1. Bottom-up approach: use normalization as a standalone database design technique to create a set of relations.

2. Top-down approach: map an ER model to a set of relations, then use normalization as a validation technique to check the structure of those relations.

[Figure: How Normalization Supports Database Design (cont.)]

The Process of Normalization

Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between the attributes of that relation.

It is often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties.

As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.

The Process of Normalization (cont.)

Normal forms:

▪ First Normal Form (1NF).

▪ Second Normal Form (2NF).

▪ Third Normal Form (3NF).

▪ Boyce-Codd Normal Form (BCNF).

▪ Fourth Normal Form (4NF).

[Figure: The Process of Normalization (cont.)]

[Figure: The Process of Normalization (cont.)]

Unnormalized Form (UNF)

Unnormalized Form (UNF): a table that contains one or more repeating groups.

Repeating group: an attribute, or group of attributes, within a table that occurs with multiple values for a single occurrence of the nominated key attribute(s) for that table.

To create an unnormalized table:

➢ Transform the data from the information source (e.g. form) into table format with columns and rows.

[Figure: Example of Unnormalized Form]

First Normal Form (1NF)

First Normal Form (1NF): a relation in which the intersection of each row and column contains one and only one value.

UNF to 1NF

To transform the unnormalized table to First Normal Form we identify and remove repeating groups within the table.

Remove the repeating group by placing the repeating data along with a copy of the original key attribute(s) into a separate relation.


UNF to 1NF Example

The format of the resulting 1NF relations is as follows:

Client (clientNo, cName)

PropertyRentalOwner (clientNo, propertyNo, pAddress,

rentStart, rentFinish, rent, ownerNo, oName)
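The step from the unnormalized table to the Client and PropertyRentalOwner relations can be sketched in Python as follows. The nested input format, the key name `rentals`, the function name `to_1nf`, and every data value are assumptions made for illustration; only the attribute names come from the slides.

```python
# Hypothetical unnormalized ClientRental data: each client carries a
# repeating group of property rentals. Values are illustrative only.
client_rental_unf = [
    {
        "clientNo": "C1", "cName": "Client One",
        "rentals": [
            {"propertyNo": "P1", "pAddress": "1 First St",
             "rentStart": "2020-01-01", "rentFinish": "2020-12-31",
             "rent": 350, "ownerNo": "O1", "oName": "Owner One"},
            {"propertyNo": "P2", "pAddress": "2 Second St",
             "rentStart": "2021-01-01", "rentFinish": "2021-12-31",
             "rent": 450, "ownerNo": "O2", "oName": "Owner Two"},
        ],
    },
    {
        "clientNo": "C2", "cName": "Client Two",
        "rentals": [
            {"propertyNo": "P1", "pAddress": "1 First St",
             "rentStart": "2021-01-01", "rentFinish": "2021-06-30",
             "rent": 350, "ownerNo": "O1", "oName": "Owner One"},
        ],
    },
]


def to_1nf(unf_rows):
    """Split the repeating group into its own relation, copying clientNo."""
    client = []
    property_rental_owner = []
    for row in unf_rows:
        client.append({"clientNo": row["clientNo"], "cName": row["cName"]})
        for rental in row["rentals"]:
            # One flat tuple per occurrence of the repeating group.
            property_rental_owner.append({"clientNo": row["clientNo"], **rental})
    return client, property_rental_owner


client, property_rental_owner = to_1nf(client_rental_unf)
```

Every cell in both outputs now holds a single value, and the copied clientNo lets each rental be traced back to its client.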


Second Normal Form (2NF)

Based on the concept of full functional dependency.

Second Normal Form applies to relations with composite keys, that is, relations with a primary key composed of two or more attributes.

A relation is in 2NF if it is in 1NF and every non-primary-key attribute is fully functionally dependent on the primary key.

1NF to 2NF

Identify the primary key for the 1NF relation.

Identify the functional dependencies in the relation.

If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.

1NF to 2NF Example: (cont.)

The format of the resulting 2NF relations is as follows:

Client (clientNo, cName)

Rental (clientNo, propertyNo, rentStart, rentFinish)

PropertyOwner (propertyNo, pAddress, rent, ownerNo, oName)
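Removing a partial dependency amounts to two projections of the 1NF relation: one keeps the attributes that depend on the whole key, the other takes the partially dependent attributes together with a copy of their determinant. A minimal Python sketch, assuming hypothetical sample rows and a hypothetical helper `project`:

```python
def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


# Hypothetical 1NF rows of PropertyRentalOwner (illustrative values).
property_rental_owner = [
    {"clientNo": "C1", "propertyNo": "P1", "pAddress": "1 First St",
     "rentStart": "2020-01-01", "rentFinish": "2020-12-31",
     "rent": 350, "ownerNo": "O1", "oName": "Owner One"},
    {"clientNo": "C1", "propertyNo": "P2", "pAddress": "2 Second St",
     "rentStart": "2021-01-01", "rentFinish": "2021-12-31",
     "rent": 450, "ownerNo": "O2", "oName": "Owner Two"},
    {"clientNo": "C2", "propertyNo": "P1", "pAddress": "1 First St",
     "rentStart": "2021-01-01", "rentFinish": "2021-06-30",
     "rent": 350, "ownerNo": "O1", "oName": "Owner One"},
]

# Attributes that depend on the whole key (clientNo, propertyNo)
# stay in Rental.
rental = project(property_rental_owner,
                 ["clientNo", "propertyNo", "rentStart", "rentFinish"])

# Attributes that depend only on propertyNo move to PropertyOwner
# with a copy of their determinant.
property_owner = project(property_rental_owner,
                         ["propertyNo", "pAddress", "rent", "ownerNo", "oName"])
```

In this sample the details of property P1 appear twice in the 1NF relation but only once in `property_owner`, which is the redundancy that 2NF removes.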

[Figure: 1NF to 2NF Example (Slide 50)]

Third Normal Form (3NF)

Based on the concept of transitive dependency.

A relation is in 3NF if it is in 1NF and 2NF and no non-primary-key attribute is transitively dependent on the primary key.

2NF to 3NF

Identify the primary key in the 2NF relation.

Identify functional dependencies in the relation.

If transitive dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.

2NF to 3NF Example:

The functional dependencies for the Client, Rental, and PropertyOwner relations, derived from Slide 50, are as follows:

➢ Client

fd2 clientNo → cName (Primary key)

➢ Rental

fd1 clientNo, propertyNo → rentStart, rentFinish (Primary key)

fd5′ clientNo, rentStart → propertyNo, rentFinish (Candidate key)

fd6′ propertyNo, rentStart → clientNo, rentFinish (Candidate key)

➢ PropertyOwner

fd3 propertyNo → pAddress, rent, ownerNo, oName (Primary key)

fd4 ownerNo → oName (Transitive dependency)


2NF to 3NF Example: (cont.)

The format of the resulting 3NF relations is as follows:

PropertyForRent (propertyNo, pAddress, rent, ownerNo)

Owner (ownerNo, oName)
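Splitting PropertyOwner on the transitive dependency ownerNo → oName can be sketched in Python, together with a check that joining the two new relations on ownerNo gives back the original tuples. The sample rows are hypothetical, and representing Owner as a dictionary keyed by ownerNo is a convenience of this sketch rather than anything prescribed by the lecture.

```python
# Hypothetical PropertyOwner rows (2NF); values are illustrative only.
property_owner = [
    {"propertyNo": "P1", "pAddress": "1 First St", "rent": 350,
     "ownerNo": "O1", "oName": "Owner One"},
    {"propertyNo": "P2", "pAddress": "2 Second St", "rent": 450,
     "ownerNo": "O1", "oName": "Owner One"},
    {"propertyNo": "P3", "pAddress": "3 Third St", "rent": 375,
     "ownerNo": "O2", "oName": "Owner Two"},
]

# PropertyForRent keeps ownerNo as a foreign key to Owner.
property_for_rent = [
    {k: row[k] for k in ("propertyNo", "pAddress", "rent", "ownerNo")}
    for row in property_owner
]

# Owner: one entry per ownerNo, so each oName is stored only once.
owner = {row["ownerNo"]: row["oName"] for row in property_owner}

# Joining back on ownerNo reproduces the original tuples.
rejoined = [{**row, "oName": owner[row["ownerNo"]]} for row in property_for_rent]
```

Because ownerNo determines oName, the join adds exactly one name to each property row, so the decomposition loses no information and creates no spurious tuples.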


Normalization Example:

The ClientRental relation has been transformed by the process of normalization into four relations in 3NF. The resulting 3NF relations have the form:

Client (clientNo, cName)

Rental (clientNo, propertyNo, rentStart, rentFinish)

PropertyForRent (propertyNo, pAddress, rent, ownerNo)

Owner (ownerNo, oName)


General Definitions of 2NF and 3NF

Second normal form (2NF):

A relation that is in first normal form and in which every non-candidate-key attribute is fully functionally dependent on any candidate key.

Third normal form (3NF):

A relation that is in first and second normal form and in which no non-candidate-key attribute is transitively dependent on any candidate key.