Multivalued Dependencies by David Wortham
Post on 21-Dec-2015
Problem Introduction
Assume a relation R (from the book): (credit Ullman and Widom)
Name | AddrStreet | AddrCity | FilmName | FilmYear
C. Fisher | 123 Maple Dr. | Hollywood | Star Wars | 1977
C. Fisher | 5 Locust Ln. | Malibu | Star Wars | 1977
C. Fisher | 123 Maple Dr. | Hollywood | The Empire Strikes Back | 1980
C. Fisher | 5 Locust Ln. | Malibu | The Empire Strikes Back | 1980
C. Fisher | 123 Maple Dr. | Hollywood | Return of the Jedi | 1983
C. Fisher | 5 Locust Ln. | Malibu | Return of the Jedi | 1983
Carrie Fisher = Princess Leia Organa
Problem Introduction
What is the highest normal form that R complies with?
Recall 1NF
1NF eliminates wasted space due to duplicate attributes (columns)
Example (before 1NF normalization):
Manager | Subordinate1 | Subordinate2 | Subordinate3 | Subordinate4
Bob | Jim | Mary | Beth |
Mary | Mike | Jason | Carol | Mark
Jim | Alan | | |
Recall 1NF (cont’d)
After 1NF normalization:
Manager | Subordinate
Bob | Jim
Bob | Mary
Bob | Beth
Mary | Mike
Mary | Jason
Mary | Carol
Mary | Mark
Jim | Alan
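The 1NF step above can be sketched in Python. This is only an illustration of the idea, not part of the original slides: the function name to_1nf and the use of None for an empty cell are assumptions made for the example.

```python
# Wide table from the slide: one row per manager, up to four subordinate columns.
# None marks an empty cell (an assumption for this sketch).
wide = [
    ("Bob", "Jim", "Mary", "Beth", None),
    ("Mary", "Mike", "Jason", "Carol", "Mark"),
    ("Jim", "Alan", None, None, None),
]

def to_1nf(rows):
    """Turn each (manager, sub1, ..., subN) row into (manager, subordinate) pairs."""
    return [
        (manager, sub)
        for manager, *subs in rows
        for sub in subs
        if sub is not None
    ]

pairs = to_1nf(wide)
for manager, sub in pairs:
    print(manager, "|", sub)
```

Each repeated Subordinate column becomes its own row, so no cell is left empty.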
Identify the FDs
Functional Dependencies in R: Since every tuple in R is unique, the ...
attrA | attrB | attrC | attrD | attrE
1 | 2 | 3 | 6 | 7
1 | 4 | 5 | 6 | 7
1 | 2 | 3 | 8 | 9
1 | 4 | 5 | 8 | 9
1 | 2 | 3 | 10 | 11
1 | 4 | 5 | 10 | 11
Identify the FDs
Functional Dependencies in R: The only FD is: {attrB, attrC, attrD, attrE} → attrA
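A candidate FD can be tested against the stored tuples with a short Python sketch. The helper name fd_holds is hypothetical. Note the limitation: a check like this only looks at one instance of R, so it can refute an FD but cannot establish that the FD holds for every legal instance.

```python
# Relation R from the slide, with the abstract attribute names attrA..attrE.
ATTRS = ("attrA", "attrB", "attrC", "attrD", "attrE")
R = [
    (1, 2, 3, 6, 7),
    (1, 4, 5, 6, 7),
    (1, 2, 3, 8, 9),
    (1, 4, 5, 8, 9),
    (1, 2, 3, 10, 11),
    (1, 4, 5, 10, 11),
]

def fd_holds(rows, attrs, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    def pick(row, names):
        return tuple(row[attrs.index(n)] for n in names)

    seen = {}
    for row in rows:
        key, val = pick(row, lhs), pick(row, rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True

print(fd_holds(R, ATTRS, ["attrB", "attrC", "attrD", "attrE"], ["attrA"]))  # True
print(fd_holds(R, ATTRS, ["attrA"], ["attrB"]))  # False
```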
Normal Form of R
- 1NF (no multivalues) [check]
- 2NF (no FDs where a proper subset of the key to the relation is on the left and a non-key attribute is on the right) [check]
- 3NF (for every non-trivial FD, either the determinant is a superkey or the RHS of the FD is a member of some key) [check]
- BCNF (the determinant of any non-trivial FD is a superkey for the relation) [check]
Problem Intro. (cont’d)
Notice:
Name | AddrStreet | AddrCity | FilmName | FilmYear
C. Fisher | 123 Maple Dr. | Hollywood | Star Wars | 1977
C. Fisher | 5 Locust Ln. | Malibu | Star Wars | 1977
C. Fisher | 123 Maple Dr. | Hollywood | The Empire Strikes Back | 1980
C. Fisher | 5 Locust Ln. | Malibu | The Empire Strikes Back | 1980
C. Fisher | 123 Maple Dr. | Hollywood | Return of the Jedi | 1983
C. Fisher | 5 Locust Ln. | Malibu | Return of the Jedi | 1983
Problem Intro. (cont’d)
Also Notice:
Name | AddrStreet | AddrCity | FilmName | FilmYear
C. Fisher | 123 Maple Dr. | Hollywood | Star Wars | 1977
C. Fisher | 5 Locust Ln. | Malibu | Star Wars | 1977
C. Fisher | 123 Maple Dr. | Hollywood | The Empire Strikes Back | 1980
C. Fisher | 5 Locust Ln. | Malibu | The Empire Strikes Back | 1980
C. Fisher | 123 Maple Dr. | Hollywood | Return of the Jedi | 1983
C. Fisher | 5 Locust Ln. | Malibu | Return of the Jedi | 1983
Observe the Pattern
R ~= T x U x V (R is similar to the Cartesian product of relations T, U, and V)

AddrStreetAndCity
123 Maple Dr. | Hollywood
5 Locust Ln. | Malibu

FilmNameAndYear
Star Wars | 1977
The Empire Strikes Back | 1980
Return of the Jedi | 1983

Name
C. Fisher
Relation T Relation U Relation V
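The pattern can be reproduced with a small Python sketch that builds R as the product of the three small relations. The variable names names, addresses, and films are chosen for the example only; the transcript does not say which of them is T, U, or V.

```python
from itertools import product

names = [("C. Fisher",)]
addresses = [("123 Maple Dr.", "Hollywood"), ("5 Locust Ln.", "Malibu")]
films = [
    ("Star Wars", 1977),
    ("The Empire Strikes Back", 1980),
    ("Return of the Jedi", 1983),
]

# Concatenate one tuple from each small relation to build a row of R.
R = [n + a + f for n, a, f in product(names, addresses, films)]

for row in R:
    print(row)
print(len(R))  # 1 * 2 * 3 = 6
```

Six stored tuples carry no more information than the 1 + 2 + 3 tuples of the small relations, which is the duplication the slides point out.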
Problem Definition
- The relation R contains unnecessary duplication of data
- R is valid 1NF, 2NF, 3NF, and BCNF (and there are no exact duplicate tuples)
- R repeats the same AddrStreet and AddrCity data for every film
- R repeats the same FilmName and FilmYear data for every address
Solution
Introduction of a 4NF:
- Eliminate “non-trivial” MVDs
- Eliminate additional FDs that violate BCNF
Definitions
Fourth Normal Form: R is in 4NF
- if R is valid BCNF and…
- for every “non-trivial” MVD A1A2…An →→ B1B2…Bm, {A1A2…An} is a superkey
An MVD A1A2…An →→ B1B2…Bm for a relation R is “non-trivial” if:
1. none of the Bs are among the As
2. not all of the attributes of R are among the As and Bs
An MVD is “trivial” if all of the Bs are among the As, or if the As and Bs together make up all of the attributes of R.
A relation cannot be decomposed any further (under 4NF rules) if its only MVDs are trivial
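The triviality test is mechanical, so it can be sketched in a few lines of Python. The function name is_trivial_mvd is an assumption for this example, and the attribute names are those of the relation R used earlier in the slides.

```python
def is_trivial_mvd(all_attrs, lhs, rhs):
    """An MVD lhs ->> rhs is trivial if rhs is contained in lhs,
    or if lhs and rhs together cover every attribute of the relation."""
    all_attrs, lhs, rhs = set(all_attrs), set(lhs), set(rhs)
    return rhs <= lhs or (lhs | rhs) == all_attrs

R_ATTRS = ["Name", "AddrStreet", "AddrCity", "FilmName", "FilmYear"]

# Name ->> {AddrStreet, AddrCity} leaves out the film attributes: non-trivial.
print(is_trivial_mvd(R_ATTRS, ["Name"], ["AddrStreet", "AddrCity"]))  # False
# Name ->> everything else: the two sides cover all of R, so trivial.
print(is_trivial_mvd(R_ATTRS, ["Name"], R_ATTRS[1:]))  # True
```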
Note about FDs and MVDs
- Every FD is an MVD (if A1A2…An → B1B2…Bm, then A1A2…An →→ B1B2…Bm)
- The converse is not always true
- FDs rule out certain tuples (i.e. if A → B then two tuples will not have the same value for A and different values for B)
- MVDs do not rule out tuples. They guarantee that certain tuples must exist. (we will see this later)
Another Example
Another MVD example (credit www.cs.jcu.edu.au)
emp_name | qualifications | languages
SMITH | B. Sc | NULL
SMITH | Dip. CS | NULL
SMITH | NULL | FORTRAN
SMITH | NULL | COBOL
SMITH | NULL | PASCAL
Another Example (cont’d)
Our second example 4NF decomposed:

Relation L
emp_name | qualifications | languages
SMITH | B. Sc | NULL
SMITH | Dip. CS | NULL
SMITH | NULL | FORTRAN
SMITH | NULL | COBOL
SMITH | NULL | PASCAL

Relation M
emp_name | qualifications
SMITH | B. Sc
SMITH | Dip. CS

Relation O
emp_name | languages
SMITH | FORTRAN
SMITH | COBOL
SMITH | PASCAL
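The decomposition can be sketched in Python as two projections of L. The helper name project and the choice to drop rows whose projected value is NULL (None here) are assumptions for this sketch; a plain relational projection would keep the NULL rows.

```python
# Relation L from the slide; None stands in for NULL.
L = [
    ("SMITH", "B. Sc", None),
    ("SMITH", "Dip. CS", None),
    ("SMITH", None, "FORTRAN"),
    ("SMITH", None, "COBOL"),
    ("SMITH", None, "PASCAL"),
]

def project(rows, indexes):
    """Project rows onto the given column positions, dropping rows that
    contain None in those columns and removing duplicates (order kept)."""
    out = []
    for row in rows:
        picked = tuple(row[i] for i in indexes)
        if None not in picked and picked not in out:
            out.append(picked)
    return out

M = project(L, [0, 1])  # (emp_name, qualifications)
O = project(L, [0, 2])  # (emp_name, languages)
print(M)
print(O)
```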
Explanation of second example
- Unnecessary tuples are eliminated (those with NULL values)… saving space
- Note: for this example, M x O is not similar to L: this is because L pads each tuple with NULL instead of pairing every qualification with every language
Second example (modified)
- If we were to modify the second example L so that the MVD emp_name →→ qualifications holds, we would need to combine every possible value of M with every one of O
- This relation would look similar to the product of M and O
Second example (modified)
If we were to modify the second example L, we would need to combine every possible value of M with every one of O
emp_name | qualifications | languages
SMITH | B. Sc | NULL
SMITH | Dip. CS | NULL
SMITH | NULL | FORTRAN
SMITH | NULL | COBOL
SMITH | NULL | PASCAL
Relation L
Second example (modified)
If we were to modify the second example L, we would need to combine every possible value of M with every one of O
In this relation the MVD emp_name →→ qualifications holds, but it is non-trivial and emp_name is not a superkey, so the relation is still not 4NF; decomposing it into M and O is what achieves 4NF (the original L, padded with NULLs, did not satisfy this MVD at all)
emp_name | qualifications | languages
SMITH | B. Sc | FORTRAN
SMITH | Dip. CS | FORTRAN
SMITH | B. Sc | COBOL
SMITH | Dip. CS | COBOL
SMITH | B. Sc | PASCAL
SMITH | Dip. CS | PASCAL
Relation L with every M value combined with every O value
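Combining every value of M with every one of O amounts to joining the two relations on emp_name. A minimal Python sketch, with natural_join as a hypothetical helper written only for these two-column relations:

```python
M = [("SMITH", "B. Sc"), ("SMITH", "Dip. CS")]
O = [("SMITH", "FORTRAN"), ("SMITH", "COBOL"), ("SMITH", "PASCAL")]

def natural_join(left, right):
    """Join (emp_name, x) with (emp_name, y) on emp_name -> (emp_name, x, y)."""
    return [
        (name_l, qual, lang)
        for name_l, qual in left
        for name_r, lang in right
        if name_l == name_r
    ]

joined = natural_join(M, O)
for row in joined:
    print(row)
```

The join yields the same six tuples as the table above, although in a different row order.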
Formal Definition of a MVD
Using the following relation, (credit Ullman & Widom)
Note that A, B, and “others” are not actual attributes, but rather sets of attributes
tuple | As | Bs | others
t | a1 | b1 | c1
u | a1 | b1 | c2
v | a1 | b2 | c2
(t and u agree on the As and the Bs; u and v agree on the As and the others)
Formal Definition of a MVD (cont’d)
Suppose A →→ B holds, then for tuples t and v that agree on A there is a tuple u with:
- t[A] = u[A] = v[A]
- t[B] = u[B]
- u[C] = v[C] (C being the “others”)
If an MVD is to hold, u must exist for every such pair of tuples t and v.
Formal Definition of a MVD (cont’d)
For the MVD A →→ B to be trivial, either:
- B ⊆ A, or
- B ∪ A = R
must be true
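The t, u, v definition translates directly into a check over a stored relation. The following Python sketch is one possible reading of it; mvd_holds is a hypothetical helper, None stands in for NULL and is compared like any other value, and the check only examines the given instance.

```python
from itertools import product

def mvd_holds(rows, attrs, lhs, rhs):
    """Check lhs ->> rhs on one relation instance.

    For every pair of tuples t, v that agree on lhs, the instance must also
    contain a tuple u with u[lhs] = t[lhs], u[rhs] = t[rhs], and
    u[others] = v[others].
    """
    lhs, rhs = set(lhs), set(rhs) - set(lhs)
    present = set(rows)
    for t, v in product(rows, repeat=2):
        if any(t[i] != v[i] for i, a in enumerate(attrs) if a in lhs):
            continue  # t and v disagree on lhs, so nothing is required
        u = tuple(t[i] if (a in lhs or a in rhs) else v[i]
                  for i, a in enumerate(attrs))
        if u not in present:
            return False
    return True

ATTRS = ("emp_name", "qualifications", "languages")
padded = [
    ("SMITH", "B. Sc", None), ("SMITH", "Dip. CS", None),
    ("SMITH", None, "FORTRAN"), ("SMITH", None, "COBOL"),
    ("SMITH", None, "PASCAL"),
]
combined = [("SMITH", q, l)
            for q in ("B. Sc", "Dip. CS")
            for l in ("FORTRAN", "COBOL", "PASCAL")]

print(mvd_holds(padded, ATTRS, ["emp_name"], ["qualifications"]))    # False
print(mvd_holds(combined, ATTRS, ["emp_name"], ["qualifications"]))  # True
```

The NULL-padded relation fails because, for example, the required tuple (SMITH, B. Sc, FORTRAN) is missing; the fully combined relation contains every required u.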
Armstrong’s Axioms WRT MVDs
Many of Armstrong’s Axioms work with MVDs including:
- Reflexivity rule
- Augmentation rule
- Transitivity rule
- Complementation rule
- Multivalued augmentation rule
- Multivalued transitivity rule
- Replication rule
- Coalescence rule
- Multivalued union rule
- Intersection rule
- Difference rule
See below for specifics on these rules:
http://www.cs.sfu.ca/CC/354/zaiane/material/notes/Chapter7/node15.html
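As one illustration, the complementation rule is commonly stated as: if A →→ B holds in R, then A →→ (R − A − B) also holds. The Python sketch below checks this on the Carrie Fisher relation from the start of the slides. The helper mvd_holds is hypothetical and only tests one stored instance, so this demonstrates the rule on an example rather than proving it.

```python
from itertools import product

ATTRS = ("Name", "AddrStreet", "AddrCity", "FilmName", "FilmYear")
R = [
    ("C. Fisher",) + addr + film
    for addr in [("123 Maple Dr.", "Hollywood"), ("5 Locust Ln.", "Malibu")]
    for film in [("Star Wars", 1977),
                 ("The Empire Strikes Back", 1980),
                 ("Return of the Jedi", 1983)]
]

def mvd_holds(rows, attrs, lhs, rhs):
    """True if, for all t, v agreeing on lhs, the tuple taking lhs and rhs
    values from t and all other values from v is also in rows."""
    keep = set(lhs) | set(rhs)
    present = set(rows)
    for t, v in product(rows, repeat=2):
        if all(t[i] == v[i] for i, a in enumerate(attrs) if a in lhs):
            u = tuple(t[i] if a in keep else v[i] for i, a in enumerate(attrs))
            if u not in present:
                return False
    return True

lhs = ["Name"]
rhs = ["AddrStreet", "AddrCity"]
# The complement is every attribute of R that is in neither lhs nor rhs.
complement = [a for a in ATTRS if a not in lhs and a not in rhs]

print(mvd_holds(R, ATTRS, lhs, rhs))         # True
print(mvd_holds(R, ATTRS, lhs, complement))  # True, as complementation predicts
print(complement)                            # ['FilmName', 'FilmYear']
```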
References
- Course textbook: A First Course in Database Systems (Jeffrey D. Ullman and Jennifer Widom)
- Normalizing your database: First Normal Form (1NF): http://webc.nicc.edu/~weedd/sysanaly/Chapter%206%20Database%20normalization/1NF.html
- Multivalued Dependencies (Osmar Zaiane): http://www.cs.sfu.ca/CC/354/zaiane/material/notes/Chapter7/node13.html
- Multivalued Dependencies: http://www.cs.jcu.edu.au/Subjects/cp1500/1998/Lecture_Notes/normalisation/mvd.html