schema refinement and normal forms
DESCRIPTION
Schema Refinement and Normal Forms. Review: Database Design. Requirements Analysis user needs; what must database do? Conceptual Design high level description (often done w/ER model) Logical Design translate ER into DBMS data model Schema Refinement consistency, normalization - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/1.jpg)
Schema Refinement and
Normal Forms
![Page 2: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/2.jpg)
Review: Database Design
• Requirements Analysis– user needs; what must database do?
• Conceptual Design– high level description (often done w/ER model)
• Logical Design– translate ER into DBMS data model
• Schema Refinement – consistency, normalization
• Physical Design - indexes, disk layout• Security Design - who accesses what
![Page 3: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/3.jpg)
Design Steps
• Step (3) to step (4) is based on a “design theory” for relations and is called “normalization”. It is important for two reasons:
– Automatic mappings from ER to relations may not produce the best relational design possible.
– Database designers may go directly from (1) to (3), in which case, the relational design can be really bad.
![Page 4: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/4.jpg)
4
Informal guidelines Semantics of the attributes
–easy to explain relation–doesn’t mix concepts
Reducing the redundant values in tuples Choosing attribute domains that are atomic Reducing the null values in tuples Disallowing spurious tuples
![Page 5: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/5.jpg)
1. Semantics of Attributes• Semantics of attributes specify how to interpret
the attributes values stored in a tuple of the relation.
• In other words, how the attributes’ values in a tuple are related to one another.
Guideline 1: Design a relation schema so that it is easy to
explain its meaning. Do not combine attributes from multiple entity
types and relationship types into a single relation.
![Page 6: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/6.jpg)
2. Null Values• If many of the attributes do not apply to all tuples in the
relation, we end up with many null values (no value) in those tuples.
• This leads to wasted space and misunderstandings.
Guideline 2: As much as possible, avoid placing attributes in a relation
whose values may frequently be null. If nulls are unavoidable, make sure that they apply in
exceptional cases, and do not apply to the majority of tuples in the relation.
![Page 7: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/7.jpg)
![Page 8: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/8.jpg)
3. Spurious Tuples• Additional tuples that were not in the original
relation are called spurious tuples because they represent spurious or wrong information that is not valid.
• This is called the lossless join property.Guideline 3: Design relation schemas so that they can be
JOINed (with equality condition on attributes that are
either primary keys or foreign keys) in a way that guarantees that no spurious tuples are generated.
![Page 9: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/9.jpg)
3. Spurious Tuples cont.
CustomerID
True Lies
Title Price KindDate
0001 3.25 D04-19-2002True Lies0002 3.25 D04-21-2002The Lion King0001 4.00 C04-19-2002
Henry V0001 1.75 D04-18-2002The Lion King0003 4.00 C04-19-2002
![Page 10: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/10.jpg)
3. Spurious Tuples cont.
Bad Relational Schema:
True LiesTitle Price Kind
3.25 DThe Lion King 3.25 CHenry V 1.75 D
CustomerIDTrue LiesTitle Price Kind Date
0001 3.25 D 04-19-2002True Lies0002 3.25 D 04-21-2002The Lion King0001 3.25 C 04-19-2002
Henry V0001 1.75 D 04-18-2002The Lion King0003 3.25 C 04-19-2002
The Lion King0002 3.25 D 04-21-2002True Lies0003 3.25 C 04-19-2002
The
Join
Of the
Above
2 Relations
CustomerID Price Date0001 3.25 04-19-20020002 3.25 04-21-2002
0001 1.75 04-18-20020003 3.25 04-19-2002
![Page 11: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/11.jpg)
3. Spurious Tuples cont.Good Relational Schema:
True LiesTitle Price Kind
3.25 DThe Lion King 3.25 CHenry V 1.75 D
CustomerIDTrue LiesTitle Price Kind Date
0001 3.25 D 04-19-2002True Lies0002 3.25 D 04-21-2002The Lion King0001 3.25 C 04-19-2002
Henry V0001 1.75 D 04-18-2002The Lion King0003 3.25 C 04-19-2002
The
Join
Of the
Above
2 Relations
CustomerIDTrue LiesTitle Date
0001 04-19-2002True Lies0002 04-21-2002The Lion King0001 04-19-2002
Henry V0001 04-18-2002The Lion King0003 04-19-2002
![Page 12: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/12.jpg)
Modification Anomalies:
CustomerIDTrue Lies
Title Price KindDate0001 3.25 D04-19-2002
True Lies0002 3.25 D04-21-2002The Lion King0001 4.00 C04-19-2002
Henry V0001 1.75 D04-18-2002The Lion King0003 4.00 C04-19-2002
4. Reducing Redundancies cont.
Insertion Anomaly: Cannot insert information about a film if it has not been rented yet.
Update Anomaly: Updating the rental price for “True Lies” to $4, requires changing it in
several typles (if not, it will cause inconsistencies). Deletion Anomaly:
Deleting the rental information will cause the film information to disappear.
![Page 13: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/13.jpg)
4. Reducing Redundancies• Redundancies in a relation schema result in:
Waste of space Potential for inconsistent data (loss of data
integrity) Potential for modification anomalies (unusual
behavior): Insertion anomalies Update anomalies Deletion anomalies
Guideline 4: Design the relation schemas so that no insertion,
modification, or modification anomalies occur.
![Page 14: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/14.jpg)
Refinements
![Page 15: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/15.jpg)
• Integrity constraints, in particular functional dependencies, can be used to identify schemas with such problems and to suggest refinements.
• Decomposition should be used judiciously:– Is there reason to decompose a relation?– What problems (if any) does the decomposition
cause?
![Page 16: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/16.jpg)
Q1)answered by applying various Normal forms Q2)answered by properties of decomposition that interests us
are lossless-join ( enables us to recover any
instance of the decomposed relation from corresponding instances of the smaller relations)dependency-preservation ( enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations. We do not have to perform join of smaller relation to check if a constraint on original relation is violated.
![Page 17: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/17.jpg)
• From Performance point of view
If queries over the original relation are common then decomposing is not acceptable
In some cases the decomposition is improves performance when queries and updates examine only decomposed relations.
![Page 18: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/18.jpg)
A BAD Relational Schema
S N L R W H
123-22-3666 Attishoo 48 8 10 40
231-31-5368 Smiley 22 8 10 30
131-24-3650 Smethurst 35 5 7 30
434-26-3751 Guldu 35 5 7 32
612-67-4134 Madayan 35 8 10 40
S N L R H
123-22-3666 Attishoo 48 8 40
231-31-5368 Smiley 22 8 30
131-24-3650 Smethurst 35 5 30
434-26-3751 Guldu 35 5 32
612-67-4134 Madayan 35 8 40
R W
8 10
5 7
An Improved Schema
![Page 19: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/19.jpg)
What’s a Good Design?
• Three properties:– No anomalies.– Can reconstruct all original information.– Ability to check all FDs within a single relation.
• Role of FDs in detecting redundancy:– Consider a relation R with 3 attributes, ABC.
• No FDs hold: There is no redundancy here.• Given A B: Several tuples could have the same A
value, and if so, they’ll all have the same B value!
![Page 20: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/20.jpg)
Decomposition of a Relation Scheme
• Suppose that relation R contains attributes A1 ... An. A decomposition of R consists of replacing R by two or more relations such that:– Each new relation scheme contains a subset of the attributes of R
(and no attributes that do not appear in R), and– Every attribute of R appears as an attribute of one of the new
relations.• Intuitively, decomposing R means we will store instances of the
relation schemes produced by the decomposition, instead of instances of R.
• E.g., Can decompose SNLRWH into SNLRH and RW.
![Page 21: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/21.jpg)
Functional Dependency• A functional dependency (FD) is a constraint
between two sets of attributes in relation R.
• It is denoted by: X Y
• Reads:– Y is functionally dependent on X
– X (functionally) determines Y
• Means:– If two tuples in R agree on their X-value, they must
necessarily agree on their Y-value.
![Page 22: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/22.jpg)
Functional Dependencies (FDs)
• A functional dependency X Y holds over relation R if, for every allowable instance r of R:– i.e., given two tuples in r, if the X values agree, then
the Y values must also agree. (X and Y are sets of attributes.)
• K is a key for relation R if:1. K determines all attributes of R.2. For no proper subset of K is (1) true.
• If K satisfies only (1), then K is a superkey.
• K is a candidate key for R means that K R– However, K R does not require K to be minimal!
![Page 23: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/23.jpg)
Functional Dependencies (FDs)
• A functional dependency X Y holds over relation schema R if, for every allowable instance r of R: t1 r, t2 r, X (t1) = X (t2) implies Y (t1) = Y (t2)(where t1 and t2 are tuples;X and Y are sets of attributes)
• In other words: X Y means Given any two tuples in r, if the X values are the same,
then the Y values must also be the same. (but not vice versa)
• Read “” as “determines”
![Page 24: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/24.jpg)
![Page 25: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/25.jpg)
![Page 26: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/26.jpg)
![Page 27: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/27.jpg)
![Page 28: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/28.jpg)
![Page 29: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/29.jpg)
![Page 30: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/30.jpg)
![Page 31: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/31.jpg)
![Page 32: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/32.jpg)
![Page 33: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/33.jpg)
FD Examples cont.
EMP_PROJ (SSN, PNUMBER, HOURS, ENAME , PNAME, PLOCATION)
Can Assert the FDs
SSN ENAME
PNUMBER { PNAME, PLOCATION }
{ SSN, PNUMBER } HOURS
FD Diagram
![Page 34: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/34.jpg)
Example• Consider relation Hourly_Emps:
– Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)
FD is a key:– ssn is the key
S SNLRWH
FDs give more detail than the mere assertion of a key.
– rating determines hrly_wages R W
S N L R W H
123-22-3666 Attishoo 48 8 10 40
231-31-5368 Smiley 22 8 10 40
131-24-3650 Smethurst 35 5 7 30
434-26-3751 Guldu 35 5 7 30
612-67-4134 Madayan 35 8 10 40
![Page 35: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/35.jpg)
FD’s Continued
• An FD is a statement about all allowable relations.– Must be identified based on semantics of application.– Given some instance r1 of R, we can check if r1 violates
some FD f, but we cannot determine if f holds over R.
• FDs are a generalization of keys.
![Page 36: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/36.jpg)
Try This One
A B Ca1 b1 c1
a1 b2 c1
a2 b3 c3
a3 b4 c4
a4 b1 c5
a5 b5 c2
FD T/FA B FA C TB A FB C FC A T
AB C TAC B FBC A T
C B F
AB C is equivalent to {A, B} C
Assuming that all the FDs in the relation are apparent in the following instance of the relation:
![Page 37: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/37.jpg)
… and This OneR (SID, CourseID, TotalCreditHours, Grade , SName, Status)
Can Assert the FDs
SID { SName, TotalCreditHours, Status }
{ SID, CourseID } Grade
TotalCreditHours Status
![Page 38: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/38.jpg)
Notes about FDs
• Functional dependencies are constraints that hold on the whole relation R, not on any particular instance of the relation.
• An FD X Y is trivial if Y X (subset)
Examples: StuID StuID
{ StuID, CourseID } CourseID
• X Y does not mean Y X (an FD is not reversible)
![Page 39: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/39.jpg)
Notes about FDs cont.
• The left-hand-side (LHS) of any FD
X Y (X in this case) is called a determinant.
• Even though we can write X YZ (standard form), you should always remember that this is TWO FDs in one: X Y and X Z (canonical form).
• We can write the above formally as:
X YZ |= { X Y , X Z }
( |= denotes “logical implication”)
![Page 40: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/40.jpg)
Notes about FDs cont.
• We denote by F the set of functional dependencies that are specified on a relation schema R.
R (SID, CourseID, TotalCreditHours, Grade , SName, Status)
F = { SID { SName, TotalCreditHours, Status },{ SID, CourseID } GradeTotalCreditHours Status }
![Page 41: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/41.jpg)
Notes about FDs cont.• If X Y is an FD that holds in R, we say that Y is fully FD on X
if removal of any attribute from X means that the FD does not hold any more; otherwise, we say Y is partially FD on X.
• Notice that if X is a single attribute, then for sure Y is fully FD on X.
R (SID, CourseID, TotalCreditHours, Grade , SName, Status)
SID SName
{SID, CourseID} Grade
{SID, CourseID} SName
{SID, CourseID} Status
SName is fully FD on SID
Grade is fully FD on {SID, CourseID}
SName is NOT fully FD on {SID, CourseID}
Status is NOT fully FD on {SID, CourseID}
![Page 42: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/42.jpg)
Reasoning About FDs
• Given some FDs, we can usually infer additional FDs:– ssn did, did lot implies ssn lot
• An FD f is implied by a set of FDs F if f holds whenever all FDs in F hold.– F+ = closure of F is the set of all FDs that are implied by F.
• Armstrong’s Axioms (X, Y, Z are sets of attributes):– Reflexivity: If Y X, then X Y – Augmentation: If X Y, then XZ YZ for any Z– Transitivity: If X Y and Y Z, then X Z
• These are sound and complete inference rules for FDs!
![Page 43: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/43.jpg)
FD Inference RulesIR1 (Reflexivity): X Y |= X Y
• X is superset of Y
• The trivial dependency rule (e.g. AB B); useful for derivations.
IR2 (Augmentation): X Y |= XZ YZ
• If a dependency holds, then we can freely expand its left hand side.
IR3 (Transitivity): X Y, Y Z |= X Z
• The most powerful inference rule; useful in multi-step derivations.
![Page 44: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/44.jpg)
FD Inference Rules cont.Armstrong inference rules (also called Armstrong’s Axioms) are:
• Sound:
meaning that given a set of FDs F specified on a relation schema R, any FD that we can infer from F by using IR1 through IR3 holds on every relation state (instance) of R that satisfies the dependencies in F.
• Complete:
meaning that using IR1 through IR3 repeatedly to infer FDs, until no more FDs can be inferred, results in the complete set of all possible FDs that can be inferred from F (closure of F, denoted as F+).
![Page 45: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/45.jpg)
IR ProofsProve or Disprove:
X YZ |= X Y and X Z(decomposition or projective rule IR4)
1. X YZ (given)
2. YZ Y (using IR1 and knowing that YZ Y)
3. X Y (using IR3 on 1 and 2)
![Page 46: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/46.jpg)
Reasoning About FDs (Cont.)
• Couple of additional rules (that follow from AA):– Union IR5: If X Y and X Z, then X YZ
• Proof of Union:– X Y (given)– X XY (augmentation using X)– X Z (given)– XY YZ (augmentation)– X YZ (transitivity)
![Page 47: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/47.jpg)
IR ProofsProve or Disprove:
X Y, WY Z |= WX Z(pseudotransitive rule IR6)
1. X Y (given)
2. WY Z (given)
3. WX WY (using IR2 on 1 by augmenting with W)
4. WX Z (using IR3 (transitivity) on 3 and 2)
![Page 48: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/48.jpg)
Reasoning About FDs • Computing the closure of a set of FDs can be expensive.
(Size of closure is exponential in # attrs!)• Typically, we just want to check if a given FD X Y is in the
closure of a set of FDs F. An efficient check:– Compute attribute closure of X (denoted X+) wrt F:
• Set of all attributes A such that X A can be inferred using the Armstrong Axioms
• There is a linear time algorithm to compute this. – Check if Y is in X+
• Does F = {A B, B C, C D E } imply A E?– i.e, is A E in the closure F+ ? Equivalently, is E in A+ ?
![Page 49: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/49.jpg)
Finding All Implied FDs• Motivation: Suppose we have a relation ABCD with some
FDs F. If we decide to decompose ABCD into ABC and AD, what are the FDs for ABC, AD?
• Example: F = AB C, C D, D A. It looks like just AB C holds in ABC, but in fact C A follows from F and applies to relation ABC.
• Problem is exponential in worst case.• Algorithm to find F+:
– For each set of attributes X of R, compute X+.
![Page 50: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/50.jpg)
X+ := X
repeat
oldX+ := X+
for each FD Y Z in F do
if Y X+ then X+ := X+ Z
until oldX+ = X+
Closure of Attributes• Given a set of FDs F in relation R, the set of all the attributes that
can be determined (directly or indirectly) from a given attribute (or set of attributes) X is called the closure of X, denoted by X+
• X+ can be determined using the simple algorithm:
![Page 51: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/51.jpg)
Example
• A B, BC D– A+ = AB– C+ = C– (AC)+ = ABCD
• Thus, AC is a key.
![Page 52: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/52.jpg)
Example• F = AB C, C D, D A. What FDs follow?
– A+ = A; B+ = B (nothing)– C+ = ACD (add C A)– D+ = AD (nothing new)– (AB)+ = ABCD (add AB D; skip all supersets of AB).– (BC)+ = ABCD (nothing new; skip all supersets of BC). – (BD)+ = ABCD (add BD C; skip all supersets of BD).– (AC)+ = ACD; (AD)+ = AD; (CD)+ = ACD (nothing new).– (ACD)+ = ACD (nothing new).– All other sets contain AB, BC, or BD, so skip.– Thus, the only interesting FDs that follow from F are:
• C A, AB D, BD C.
![Page 53: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/53.jpg)
IR Proofs cont.Prove or Disprove:
X Z, Y Z |= X YWe will disprove this by presenting a relation instance that satisfies the FDs in the LHS of the rule, but does not satisfy the FDs in the RHS:
X Y Z??
x1x1
z1z1
??
y1y1
z2z2
y2y3
x3x2
It is clear from those two records that the FDX Ydoes not hold
![Page 54: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/54.jpg)
Prove or Disprove:
X Y, X W, WY Z |= X Z
Try This
![Page 55: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/55.jpg)
1. X Y (given)
2. X W (given)
3. WY Z (given)
4. X XY (using IR2 on 1 (augmenting with X; XX=X))5. XY WY (using IR2 on 2 (augmenting with Y))6. X WY (using IR3 on 4 and 5)7. X Z (using IR3 on 6 and 3)
![Page 56: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/56.jpg)
… and ThisProve or Disprove:
XY Z, Z W |= X WWe will disprove this by presenting a relation instance that satisfies the FDs in the LHS of the rule, but does not satisfy the FDs in the RHS:
X Y Z Wx1 y1 z1x1 y1 z1? ?? ?
??
z2z2
w1w1
It is clear from those two records that the FD X Wdoes not hold
x1x2
y3y2
w2w2
Notice the choices for values
![Page 57: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/57.jpg)
Closure Example • R(A, B, C, D, E, F) with FDs:• Compute {A, B}+
(can also be written as AB+)
A BC
A BYou Have
A CBYou Have
EA CBYou Have
FA CBYou Have E
AB+ = ABCEF (i.e. {A,B}+ = {A,B,C,E,F})
CD EF B E E CF
Visit CYou Get
Visit You Get E
Visit You Get F
End of Shopping
![Page 58: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/58.jpg)
Try This • R(A, B, C, D, E, F) with FDs:• Compute:
A BC
CD EF B E E CF
A+
C+
BC+
AD+
ABCD+
ABCEFCBCEFABCDEFABCDEF
![Page 59: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/59.jpg)
Attribute Closure (example)• R = {A, B, C, D, E}• F = { B CD, D E, B A, E C, AD B }• Is B E in F+ ?
B+ = BB+ = BCDB+ = BCDAB+ = BCDAE … Yes! and B is a key for R too!
• Is D a key for R?D+ = DD+ = DED+ = DEC … Nope!
• Is AD a key for R? AD+ = ADAD+ = ABD and B is a key, so Yes!
• Is AD a candidate key for R?A+ = A, D+ = DEC… A,D not keys, so Yes!
• Is ADE a candidate key for R?
… No! AD is a key, so ADE is a superkey, but not a cand. key
![Page 60: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/60.jpg)
![Page 61: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/61.jpg)
Given P QRQ SUsing augmentation,QR RS
PQR, QR RS hence PRSRT (given)RS ST (augmentation)By transitivityPRS, RS ST hence PST
![Page 62: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/62.jpg)
Equivalence of sets of functional dependencies
• A set of FD F is said to cover another set of FD E if every FD in E is also in F+. i.e if every dependency in E can be inferred from F, alternately we can say that E is covered by F
• Two sets of FD E and F are equivalent if E+ = F+
.i.e E covers F and F covers E
![Page 63: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/63.jpg)
A Note About Closures• We have just seen that given a set of FDs that
hold in R, we can calculate the closure of a set of attributes
• The reverse process is also possible. In other words, given a closure of a set of attributes we can infer the non-trivial FDs that hold in R
![Page 64: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/64.jpg)
Note About Closure cont.• Given R(A, B, C, D, E, F), and:
B+ B C E F
It can be easily inferred that the following FD holds on R: B C E F
A BC
CD EF B E E CF
As a matter of fact, it can even be proven, using inference rules, if we knew it only came from the relation in the previous example
![Page 65: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/65.jpg)
Minimal Cover for an FD Set• One can imagine the number of FDs that can be
inferred from a given set of FDs (using inference rules).
• Thus, we need minimal set of FDs that maintains the relationships between all the attributes and with no redundancies
![Page 66: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/66.jpg)
Minimal Cover cont. A set of FDs F is minimal iff:
1. Every FD in F has a single attribute for its RHS
2. We cannot replace any FD AB in F with an FD CB, where C is a proper subset of A, and still have a set of FDs that is equivalent to F
3. We cannot remove any FD in F, and still have a set of FDs that is equivalent to F
![Page 67: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/67.jpg)
Algorithm to Find Minimal CoverStep 1: Make the RHS of every FD singular. In other words,
replace every FD of the type A BC with A B and A C
Step 2: For every FD A B where A is not singular. Remove any attribute (or set of attributes) X from A, and check if A+ = {A-X}+. If the answer is yes, then replace A B with {A-X} B
Step 3: For every FD A B, cover it with your finger (i.e. imagine it not there), and ask if A+ includes B without it. If the answer is yes, eliminate A B permanently
Note: For each step, You can use the resulting FDs from the previous step.
![Page 68: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/68.jpg)
ResultsStep 1
Step 3
Step 2
Minimal Cover Example• R(A, B, C, D) with FDs: A BCD AD C CD B
remove C is D+=CD+?
AD C
CD B noremove D is C+=CD+? noremove A is D+=AD+? noremove D is A+=AD+? yes
A B
AD C CD B
A C A D
A B
AD C CD B
A C A D
A C A B
CD B
A C A D
cover it: is B in A+? yescover it: is C in A+? nocover it: is D in A+? nocover it: is B in CD+? no
CD B A D A C
![Page 69: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/69.jpg)
Step 3
ResultsStep 1
Step 2
Try This One• R(A, B, C, D, E) with FDs: D ACE
AB C A B
AB C
D A
AB C A B
D C D E
D A
AB C A B
D C D E
A C
A B D E
A C D A
A B
D C D E
A C
D A remove B is A+=AB+?remove A is B+=AB+?
cover it: is A in D+?cover it: is C in D+?cover it: is E in D+?cover it: is B in A+?cover it: is C in A+?
no
noyes
yesnonono
![Page 70: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/70.jpg)
Step 2 Results
….
….
ABC D
Note on Minimal Cover AlgorithmIn Step 2: For every FD A B where A is not singular.
Remove any attribute (or set of attributes) X from A, and check if A+ = {A-X}+. If the answer is yes, then replace A B with {A-X} B.
ABC D
BC D
remove AB is C+=ABC+?remove AC is B+=ABC+?
remove C is B+=BC+?
remove BC is A+=ABC+? yes
yes
remove A is BC+=ABC+?remove B is C+=BC+?
noyesno
nonono A D ABC D BC D B D
![Page 71: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/71.jpg)
Keys• If X+ (the closure of the attribute (or set of
attributes) X) includes all the attributes in a relation R, then X is a superkey (SK) for the relation R
• If X is a superkey for R, and the removal of any attribute from X will cause X not to be a superkey anymore, then X is called the key for R
• The difference between a key and a superkey is that a key has to be minimal
![Page 72: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/72.jpg)
Keys cont.• If R has more than one key, then each is called
a candidate key (CK) for R• The terms key and candidate key are used
interchangeably• One of the candidate keys for R is (arbitrarily)
designated to be the primary key (PK) for R, and the others are called secondary keys
• Secondary keys are used for indexing purposes to improve performance
![Page 73: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/73.jpg)
Keys Example• R(A, B, C, D, E, F) with FDs:
A+ ABCDEF
AD+ ABCDEFBD+ ABCDEFCD+ CDFCDE+ ABCDEF
A CDE
CD F AD E B CE
BD A CDE ABD
AADBDCDCDE
SK CK
D+
B+
DCBE
C+ CD+ DE+ ECE+ CE… ……..
B+ CBE
![Page 74: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/74.jpg)
Finding CKs Systematically
To systematically find all the candidate keys for any relation is a brute force algorithm that requires trying all the possible combinations of the attributes:Single attribute: take the closure of every single attribute. If the closure gives you all the attributes in the relation, then the attribute is a CK.Two attributes: Take the closure of the 2-attributes combinations not including the ones you found in the previous step above.Three attributes: Take the closure of the 3-attributes combinations not including the ones you found in the previous step above.Four attributes: ……… Continue as above
![Page 75: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/75.jpg)
Finding CKs Example1
Answer:Single attribute: A+ = ABCDEF A is a CKThen we try B+, C+, D+, E+, F+ (but they will not work)Two attributes: (all combinations not including A because no future CK can contain A which is itself a CK):
BD+ = ABCDEF BD is a CKThen we try BE+, BF+, CD+, CE+ ………… etc.
Three attributes: (all combinations not including A or BD)CDE+ = ABCDEF CDE is a CKThen we try BCE+, BCF+, BEF+ ………… etc.
Four attributes: (all combinations not including A, BD or CDE)BCEF+ = ….
Final Results: The only CKs are: A, BD & CDE.
Find the CKs for the relation R (A, B, C, D, E, F) with FDs:
FD1: B CE FD2: AD E FD3: CD F FD4: BD A
FD5: A CDE FD6: CDE ABD
![Page 76: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/76.jpg)
Hints on Finding CKs• Look at the RHS of the FDs. If an attribute does
not appear on the RHS of all the FDs, then such attribute must be part of the CK. Why?
• It makes your life easier if you start with the minimal cover instead of the given FDs.
• Do not be fooled by trying ONLY the attributes (or combination of attributes) that appear on the LHS of the FDs. (An example will show you why?)
![Page 77: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/77.jpg)
Finding CKs Example2
Answer:Because any candidate key must have B and E in its closure, but B and E do not appear at the RHS of any FDs, thus B and E can only be determined by themselves.Therefore B and E must be part of the candidate key (1)Single attribute: not applicable because of (1)Two attributes: only BE is possible because of (1)
BE+ = ABCDE BE is a CKThree attributes: because BE was fund to be a CK, thus no future candidate key can contain BE, but because of (1) any CK must contain BE, this is a contradictionFinal Results: The only CK is BE.
Find the CKs for the relation R (A, B, C, D, E) with FDs:
FD1: B D FD2: E C FD3: AC D
FD4: CD A FD5: BE A
![Page 78: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/78.jpg)
Finding CKs Example3
Answer:Single attribute: A+ = A, B=B, E +=AE+
C+ = ABCDE C is a CK D+ = ABCDE D is a CK
Two attributes: (all combinations not including C or D) AE+ = AE AB+ = ABCDE AB is a CK BE+ = ABCDE BE is a CK
Three attributes: (all combos not including C, D, AB or BE) None can be found
Find the CKs for the relation R (A, B, C, D, E) with FDs:
FD1: E A FD2: D BE FD3: C D FD4: AB C
Notice: If you would have tried only the LHS of the FDs, then BE would not have been found as a CK.
![Page 79: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/79.jpg)
An attribute that is part (member) of any candidate key is called a prime attribute
An attribute is called nonprime if it not a prime attribute
Prime or Nonprime
![Page 80: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/80.jpg)
Database Normalization• Database normalization is the process of removing
redundant data from your tables in order to improve storage efficiency, data integrity, and scalability.
• In the relational model, methods exist for quantifying how efficient a database is. These classifications are called normal forms (or NF), and there are algorithms for converting a given database between them.
• Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked each time a query is issued.
![Page 81: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/81.jpg)
History
• Edgar F. Codd first proposed the process of normalization and what came to be known as the 1st normal form in his paper A Relational Model of Data for Large Shared Data Banks Codd stated:“There is, in fact, a very simple elimination procedure which we shall call normalization. Through decomposition nonsimple domains are replaced by ‘domains whose elements are atomic (nondecomposable) values.’”
![Page 82: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/82.jpg)
Normal Form
• Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF. There are now others that are generally accepted, but 3NF is widely considered to be sufficient for most applications. Most tables when reaching 3NF are also in BCNF (Boyce-Codd Normal Form).
![Page 83: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/83.jpg)
Normal formsUniverse of relations
1 NF
2NF
3NF
BCNF
4NF
5NF
![Page 84: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/84.jpg)
A relation is in 1NF iff (if and only if) every attribute is single valued for each tuple (i.e. no multivalued attributes).
In other words, a relation is in 1NF if it has the characteristics of a table:
Has the rows and columns format Each data value is atomic Each column has a domain Has at least one key (i.e. no duplicate records) Order of rows is not important Order of columns is not important
First Normal Form
![Page 85: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/85.jpg)
To put a relation in 1NF: Make the multivalued attribute as part of the key, and
duplicate all the other information.
First Normal Form cont.
EmpName ChildName SpouseName
SpouseAge
John Newton
AlicePeter
Mary Joe 29
Nancy Corn George Paul Atlas 46
Joe Johnson Nora
John
Sally
Pat Thomas 38
![Page 86: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/86.jpg)
For all our purposes: All relations are in 1NF (i.e. 1NF is of no value other than
historical).
First Normal Form cont.
EmpName ChildName SpouseName
SpouseAge
John Newton
Alic Mary Joe 29
John Newton
Peter Mary Joe 29
Nancy Corn George Paul Atlas 46
Joe Johnson Nora Pat Thomas 38
Joe Johnson John Pat Thomas 38
Joe Johnson Sally Pat Thomas 38
![Page 87: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/87.jpg)
Remember: A prime is any attribute that is part of any CK. For X Y, Y is fully FD on X if Y is not FD on any part of
X.
A relation R is in 2NF iff every nonprime attribute is fully FD on every candidate key in R
In other words, a relation R is in 2NF iff every nonprime attribute is not partially FD on any candidate key in R
In other words, R is not in 2NF if it contain an FD of the form:prime nonprime
Second Normal Form
![Page 88: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/88.jpg)
Answer: The key is {EmpName, ChildName} In FD1, SpouseName is a nonprime that is FD on a prime
EmpName (i.e. SpouseName is not fully FD on the key) That is an explicit violation to 2NF R is not in 2NF.
R (EmpName, SpouseName, ChildName, SpouseAge) with FDs:
FD1: EmpName SpouseName
FD2: SpouseName SpouseAge
FD3: {EmpName, ChildName} {SpouseName, SpouseAge}
Is R in 2NF?
2NF Example1
![Page 89: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/89.jpg)
Answer: The key is AE In FD2, B is a nonprime that is FD a prime A (i.e. B is not
fully FD on the key) That is an explicit violation to 2NF R is not in 2NF.
R (A, B, C, D, E, F, G, H) with FDs:
FD1: AE GH FD2: A BC
FD3: C D FD4: E F
Is R in 2NF?
2NF Example2
![Page 90: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/90.jpg)
Third Normal Form
A relation R is in 3NF iff the set of FDs for R does not contain any transitive FD.
An FD is transitive if it has a nonprime on both of its sides.
In other words, R is not in 3NF if it contain an FD of the form:
nonprime nonprime
![Page 91: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/91.jpg)
Answer: The CK is D In FD2, a nonprime C is FD on another nonprime AB That is an explicit violation to 3NF R is not in 3NF. Question: Is R in 2NF?
R (A, B, C, D, E) with FDs:
FD1: A B FD2: AB C
FD3: D ACE
Is R in 3NF?
3NF Example1
![Page 92: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/92.jpg)
Answer: The key is GE There is no explicit violation to 3NF In FD2, a nonprime BF is FD on a prime G This is an explicit violation to 2NF, and an implicit
violation to 3NF R is not in 3NF.
R (A, B, C, D, E, F, G) with FDs:
FD1: GE AD FD2: G BF
FD3: E C
Is R in 3NF?
3NF Example2
![Page 93: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/93.jpg)
Boyce-Codd Normal Form
A relation R is in BCNF iff every determinant is a superkey.
To find a violation to BCNF, just find a determinant that is not a superkey.
Remember: A determinant is the LHS of any FD. A superkey for a relation R is an attribute (or set of
attributes) whose closure gives all the attributes of the relation R.
![Page 94: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/94.jpg)
Answer: A+ = AB A is a determinant in FD1, but A is not a superkey This is an explicit violation to BCNF R is not in BCNF. Question: Is R in 2NF? Question: Is R in 3NF?
R (A, B, C, D) with FDs:
FD1: A B FD2: BC D
FD3: D BC FD4: C A
Is R in BCNF?
BCNF Example
![Page 95: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/95.jpg)
Notes on BCNF: Every relation that has only two attributes is in BCNF.
Every relation that has only 1 candidate key, if it is in 3NF, then it is in BCNF; except in the very rare case:
if an FD X A exists in R with X not a super key, and A is a prime attribute, then R will be in 3NF but not in BCNF.
If the relation has only one candidate key, if you manage to put the relation in 3NF then you will achieve BCNF automatically except in the above rare case.
![Page 96: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/96.jpg)
Normalizing a Relation
To normalize a relation, means to rid the relation of the modifications anomalies.
This usually means breaking up the relation into two (or more) relations. This is called decomposition.
Two normalization algorithms are available:1) 3NF Algorithm:
Guarantees every resulting relation is in 3NF Guarantees Lossless Join property Guarantees FD perseverance
2) BCNF Algorithm: Guarantees every resulting relation is in BCNF Guarantees Lossless Join property
![Page 97: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/97.jpg)
3NF Algorithm
Step 1: Get the minimal cover for the FDs, and work with it instead of the original FDs.
Step 2: Combine the attributes on RHSs of any FDs that have the same LHS (i.e A B and A C will be combined into A BC (standard form))
Step 3: Find the candidate key(s).Step 4: Output each of the FDs as a relation by itself (i.e. the
FD AB C will be outputted as the relation (A, B, C)).
Step 5: If none of the relations (from step 4) contains a key (i.e. we need at least one key), then create one more relation that contains the attributes that form a key.
![Page 98: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/98.jpg)
Notes on 3NF Algorithm:
If two relations have the same set of attributes, then eliminate one of them.
Always underline a key in each of the new decomposed relations.
The key for a new decomposed relation is the determinant of the FD that the relation has resulted from (i.e. AB C will result in the relation (A, B, C)).
![Page 99: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/99.jpg)
Answer:
Step 1: As above (no change).
Step 2: As above (no change).
Step 3: The key is BE
Step 4: FD1 R1(B, D) FD2 R2(E, C)
FD3 R3(A, C, D) FD4 R4(A, C, D) duplicate
Step 5: None of the relations contains the key, we need
another relation: R4(B, E)
Final results: R1(B, D), R2(C, E), R3(A, C, D), and R4(B, E)
Normalize the relation R (A, B, C, D, E) with FDs:
FD1: B DFD2: E C FD3: AC D FD4: CD A
3NF Algorithm Example1
![Page 100: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/100.jpg)
Old FD Diagram
A C
R
B D E
![Page 101: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/101.jpg)
New FD Diagram
B DR1
E CR2 B ER4
A CR3 D
![Page 102: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/102.jpg)
Answer:
Step 1: As above (no change).
Step 2: As above (no change).
Step 3: The key is {EmpName, ChildName}
Step 4: FD1 R1(EmpName, SpouseName)
FD2 R2(SpouseName, SpouseAge)
Step 5: None of the relations contains the key, we need
another relation: R3(EmpName, ChildName)
Final results: R1, R2 and R3 as above.
Normalize the relation R (EmpName, SpouseName, ChildName, SpouseAge) with FDs: FD1: EmpName SpouseName
FD2: SpouseName SpouseAge
3NF Algorithm Example2
![Page 103: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/103.jpg)
Old FD Diagram
EmpName SpouseName ChildNameSpouseAge
R
![Page 104: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/104.jpg)
New FD Diagram
EmpName ChildName
SpouseName SpouseAge
EmpName SpouseName
R1
R2
R3
![Page 105: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/105.jpg)
Answer:
Step 1: FD1: A B, FD2: BC D , FD3: D A ,
FD4: D C , FD4: D E
Step 2: FD1: A B, FD2: BC D , FD3: D ACE
Step 3: The CK is D
Step 4: FD1 R1(A, B) FD2 R2(B, C, D)
FD3 R3(A, C, D, E)
Step 5: The CK is in relation R3 above
Final results: R1(A, B), R2(B, C, D) and R3(A, C, D, E)
Normalize the relation R (A, B, C, D, E) with FDs:
FD1: A B FD2: BC D FD3: D ACE
3NF Algorithm Example3
![Page 106: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/106.jpg)
BCNF Algorithm Step 1: Find an FD that violates BCNF (explicit and
implicit)Step 2: Split R into:
R1 which contains the attributes of the violating FD.
R2 which contains all the attributes in the original relation except the attributes on the RHS of the violating FD.
Step 3: Check each of the new relations and repeat steps 1 and 2 on each one of them when the violation is found.
![Page 107: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/107.jpg)
Notes on BCNF Algorithm:
This is a universal algorithm that can work to put any relation into 2NF, 3NF or BCNF.(i.e. if you want to put a relation into 2NF, then in step 1 of
the algorithm, find an FD that violates 2NF). It is better (but not a must) to start with the
minimal cover as it gives less FDs to work with. It is better (but not a must) to apply the algorithm
to higher violations first (i.e. start with FDs that explicitly violate BCNF, then FDs that explicitly violate 3NF … etc.).
![Page 108: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/108.jpg)
BCNF Algorithm Notes cont.
If you want to preserve the FDs of a relation, then be satisfied with 3NF
You can try to apply 3NF (which guarantees that every resulting relation is in 3NF), then check each resulting relation if it is in BCNF. If so, then GREAT; otherwise, you can try applying the BCNF algorithm and risk loosing some of the FDs
![Page 109: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/109.jpg)
Answer: Remember: The key is {EmpName, ChildName}.
Since EmpNname SpouseName is an explicit violation to 2NF
and SpouseName SpouseAge is a explicit violation to 3NF
We start with the highest violation first:
We will split SpouseName and SpouseAge into a separate relation:
R3(SpouseName, SpouseAge)
Now: what remains in R is what was there originally except what is on the right hand side of the violating FDs (i.e. we exclude SpouseAge):
R(EmpName, ChildName, SpouseName)
Normalize the relation R (EmpName, SpouseName, ChildName, SpouseAge) with FDs: FD1: EmpName SpouseName
FD2: SpouseName SpouseAge
BCNF Algorithm Example
![Page 110: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/110.jpg)
Answer cont.: So far we have: R3(SpouseName, SpouseAge)and R(emp-name, child-name).
Now we consider the other violation:
We will split EmpName and SpouseName into a separate relation:
R2(EmpName, SpouseName)
Now: what remains in R is what was there originally except what is on the right hand side of the violating FDs (i.e. we exclude SpouseName):
R(EmpName, ChildName)
Normalize the relation R (EmpName, SpouseName, ChildName, SpouseAge) with FDs: FD1: EmpName SpouseName
FD2: SpouseName SpouseAge
BCNF Algorithm Example cont.
![Page 111: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/111.jpg)
1. BCNF decomposition Normalize the relation R(A,B,C,D,E) with FDs: D->B, CE->ADCE is a minimal key.R violates BCNF since D (in itself) is not a key.D+=DB and we split by D->B. The result is DB, DCEADCEA violates BCNF since CE is not a key. CE+=CEA and we split by CE-
>A. The result is CEA, CED CEA, CED do not violate BCNF.R1(D,B), R2(C,E,A) , R3(C,E,D)
![Page 112: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/112.jpg)
Normalize the relation S(A,B,C,D,E)With FDs: A->E, BC->A, DE->BDCE, DCB, DCA are minimal keys S violates BCNF since A is not a key A+=AE and we split by A->E. The result is AE,
ABCD ABCD violates BCNF since BC is not a key BC+=BCA and we split by BC->A. The result is BCA,
BCD.BCA, BCD do not violate BCNF
Answer: AE, BCA, BCD
![Page 113: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/113.jpg)
2. 3NF decomposition R(A,B,C,D,E); D->B, CE->ADCE is a minimal key The first condition is already violated (as above for BCNF) R violates 3NF since B is not a prime (part of a key) D+=DB and we split by D->B. The result is DB, DCEA DCEA violates 3NF since A is not a prime. CE+=CEA and we split by CE->A. The result is CEA, CED CEA, CED do not violate 3NF. Answer: DB, CEA, CED
![Page 114: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/114.jpg)
S(A,B,C,D,E); A->E, BC->A, DE->BDCE, DCB, DCA are minimal keysS does not violate 3NF since each right side of given
FDs is a prime (part of a key) Answer: ABCDE
![Page 115: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/115.jpg)
Normalize the relation R (A, B, C, D, E, F) with FDs:
FD1: AB CDEF FD2: B C FD3: D F
Notice: The Key is AB
Successive Normalization Example
R (A, B, C, D, E, F)Remove 2NFViolation: B C
R1(B, C)R (A, B, D, E, F)
Remove 3NFViolation: D F
R1(B, C) R1(D, F) R (A, B, D, E)
R1(B, C) with FD: B C is in BCNF. Why?R (A, B, D, E)with FD: AB DE is in BCNF. Why?R1(D, F) with FD: D F is in BCNF. Why?R (A, B, D, E, F)with FDs: AB DEF & D F is in 2NF. Why?
![Page 116: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/116.jpg)
Assume that AC is the key for the relation R (A,B,C,D,E), and in addition the following FDs hold on the relation R:
FD1: A BFD2: D E
What is the best normal form for the relation R? Normalize the relation R.
Try this …… IMPORTANT
![Page 117: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/117.jpg)
Final Word … It is not always possible to get a BCNF
decomposition that preserves the functional dependencies of a relation, as this example will show.
R (A, B, C) with FDs: FD1: AB C FD2: C B
Discussion: The CKs are: AB and AC C is a determinant in FD2, but C is not a superkey This is an explicit violation to BCNF, so R is not in BCNF. Problem: Any decomposition of R will fail to preserve the
FD AB C (FD1).
![Page 118: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/118.jpg)
![Page 119: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/119.jpg)
![Page 120: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/120.jpg)
![Page 121: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/121.jpg)
![Page 122: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/122.jpg)
![Page 123: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/123.jpg)
![Page 124: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/124.jpg)
![Page 125: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/125.jpg)
![Page 126: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/126.jpg)
![Page 127: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/127.jpg)
![Page 128: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/128.jpg)
![Page 129: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/129.jpg)
![Page 130: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/130.jpg)
![Page 131: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/131.jpg)
![Page 132: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/132.jpg)
![Page 133: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/133.jpg)
![Page 134: Schema Refinement and Normal Forms](https://reader033.vdocuments.site/reader033/viewer/2022061607/56813488550346895d9b6d82/html5/thumbnails/134.jpg)