Multivalued Dependencies

By David Wortham

Post on 21-Dec-2015



Problem Introduction

Assume a relation R (from the book; credit Ullman and Widom):

Name | AddrStreet | AddrCity | FilmName | FilmYear
C. Fisher | 123 Maple Dr. | Hollywood | Star Wars | 1977
C. Fisher | 5 Locust Ln. | Malibu | Star Wars | 1977
C. Fisher | 123 Maple Dr. | Hollywood | The Empire Strikes Back | 1980
C. Fisher | 5 Locust Ln. | Malibu | The Empire Strikes Back | 1980
C. Fisher | 123 Maple Dr. | Hollywood | Return of the Jedi | 1983
C. Fisher | 5 Locust Ln. | Malibu | Return of the Jedi | 1983

Carrie Fisher = Princess Leia Organa

Problem Introduction

What is the highest Normal Form that R complies with?

Recall 1NF

1NF eliminates wasted space due to duplicate attributes (columns)

Example (before 1NF normalization):

Manager | Subordinate1 | Subordinate2 | Subordinate3 | Subordinate4
Bob | Jim | Mary | Beth |
Mary | Mike | Jason | Carol | Mark
Jim | Alan | | |

Recall 1NF (con’t)

After 1NF normalization:

Manager | Subordinate
Bob | Jim
Bob | Mary
Bob | Beth
Mary | Mike
Mary | Jason
Mary | Carol
Mary | Mark
Jim | Alan
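The 1NF step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_1nf` and the tuple layout (manager first, then up to four subordinate slots, with `None` for an empty slot) are invented for the example and do not come from the slides.

```python
# Hypothetical sketch: turn repeating Subordinate1..4 columns into
# one (Manager, Subordinate) row per subordinate.
wide_rows = [
    ("Bob", "Jim", "Mary", "Beth", None),
    ("Mary", "Mike", "Jason", "Carol", "Mark"),
    ("Jim", "Alan", None, None, None),
]

def to_1nf(rows):
    """Return one (manager, subordinate) pair per non-empty subordinate slot."""
    return [
        (row[0], subordinate)
        for row in rows
        for subordinate in row[1:]
        if subordinate is not None
    ]

flat_rows = to_1nf(wide_rows)
for manager, subordinate in flat_rows:
    print(manager, "|", subordinate)
```

The empty slots disappear entirely, which is where the space saving comes from.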

Identify the FDs

Functional Dependencies in R: Since every tuple in R is unique, the ...

attrA | attrB | attrC | attrD | attrE
1 | 2 | 3 | 6 | 7
1 | 4 | 5 | 6 | 7
1 | 2 | 3 | 8 | 9
1 | 4 | 5 | 8 | 9
1 | 2 | 3 | 10 | 11
1 | 4 | 5 | 10 | 11

Identify the FDs

Functional Dependencies in R: The only FD is: {attrB, attrC, attrD, attrE} → attrA

attrA | attrB | attrC | attrD | attrE
1 | 2 | 3 | 6 | 7
1 | 4 | 5 | 6 | 7
1 | 2 | 3 | 8 | 9
1 | 4 | 5 | 8 | 9
1 | 2 | 3 | 10 | 11
1 | 4 | 5 | 10 | 11
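A small Python check can test whether a given FD is violated by the rows of this table. This is a sketch, not part of the original slides; the helper name `fd_holds` is hypothetical. An instance can only refute an FD, never prove it: attrB → attrC, for example, also happens to hold in these six rows, so the helper reports what the data permits rather than what the schema intends.

```python
# Hypothetical helper: does the FD lhs -> rhs hold in this one instance?
ATTRS = ("attrA", "attrB", "attrC", "attrD", "attrE")
R = [
    (1, 2, 3, 6, 7),
    (1, 4, 5, 6, 7),
    (1, 2, 3, 8, 9),
    (1, 4, 5, 8, 9),
    (1, 2, 3, 10, 11),
    (1, 4, 5, 10, 11),
]

def fd_holds(rows, attrs, lhs, rhs):
    """True if no two rows agree on lhs while differing on rhs."""
    lhs_idx = [attrs.index(a) for a in lhs]
    rhs_idx = [attrs.index(a) for a in rhs]
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs_idx)
        value = tuple(row[i] for i in rhs_idx)
        # Remember the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

print(fd_holds(R, ATTRS, ["attrB", "attrC", "attrD", "attrE"], ["attrA"]))  # True
print(fd_holds(R, ATTRS, ["attrA"], ["attrB"]))  # False
```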

Normal Form of R

1NF (no multivalued attributes) [check]
2NF (no FD from a proper subset of a key to a non-key attribute) [check]
3NF (for every non-trivial FD, either the determinant is a superkey or the RHS of the FD is a member of some key) [check]
BCNF (the determinant of any non-trivial FD is a superkey for the relation) [check]

Problem Intro. (con’t)

Notice, and also notice (relation R again, shown on two consecutive slides; the emphasis drawn on the original slides is not preserved in this transcript):

Name | AddrStreet | AddrCity | FilmName | FilmYear
C. Fisher | 123 Maple Dr. | Hollywood | Star Wars | 1977
C. Fisher | 5 Locust Ln. | Malibu | Star Wars | 1977
C. Fisher | 123 Maple Dr. | Hollywood | The Empire Strikes Back | 1980
C. Fisher | 5 Locust Ln. | Malibu | The Empire Strikes Back | 1980
C. Fisher | 123 Maple Dr. | Hollywood | Return of the Jedi | 1983
C. Fisher | 5 Locust Ln. | Malibu | Return of the Jedi | 1983

Observe the Pattern

R ~= TxUxV (R is similar to the Cartesian product of relations T, U, and V)

Relation T:
Name
C. Fisher

Relation U:
AddrStreetAndCity
123 Maple Dr. | Hollywood
5 Locust Ln. | Malibu

Relation V:
FilmNameAndYear
Star Wars | 1977
The Empire Strikes Back | 1980
Return of the Jedi | 1983
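The pattern can be reproduced with Python's standard `itertools.product`. This is a minimal sketch; it assumes T holds the name, U the address pairs, and V the film pairs, which is one reading of the slide's labels.

```python
from itertools import product

# Assumed contents of the three small relations.
T = [("C. Fisher",)]
U = [("123 Maple Dr.", "Hollywood"), ("5 Locust Ln.", "Malibu")]
V = [
    ("Star Wars", 1977),
    ("The Empire Strikes Back", 1980),
    ("Return of the Jedi", 1983),
]

# Each row of R concatenates one tuple from T, one from U, and one from V.
R = {t + u + v for t, u, v in product(T, U, V)}

print(len(R))  # 1 * 2 * 3 = 6
for row in sorted(R):
    print(row)
```

Nine stored values in T, U, and V are enough to regenerate all six five-column rows of R, which is the redundancy the slides point at.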

Problem Definition

The relation R contains unnecessary duplication of data
R is valid 1NF, 2NF, 3NF, and BCNF (and there are no exact duplicate tuples)
R repeats each (AddrStreet, AddrCity) pair once for every film
R repeats each (FilmName, FilmYear) pair once for every address

Solution

Introduction of a 4NF
  Eliminate “non-trivial” MVDs
  Eliminate additional FDs that violate BCNF

Definitions

Fourth Normal Form: R is in 4NF if R is valid BCNF and, for every “non-trivial” MVD A1A2…An ↠ B1B2…Bn that holds in R, {A1A2…An} is a superkey.

A MVD A1A2…An ↠ B1B2…Bn for a relation R is “non-trivial” if:
1. none of the Bs are among the As
2. not all of the attributes of R are among the As and Bs

A MVD is “trivial” otherwise; in particular when every B is already among the As, or when the As and Bs together make up all of the attributes of R.

A relation cannot be decomposed any further (under 4NF rules) if its only MVDs are trivial.

Note about FDs and MVDs

Every FD is a MVD (if A1A2…An → B1B2…Bn, then A1A2…An ↠ B1B2…Bn)
The converse is not always true
FDs rule out certain tuples (i.e. if A → B then two tuples will not have the same value for A and different values for B)
MVDs do not rule out tuples. They guarantee that certain tuples must exist. (we will see this later)

Another Example

Another MVD example (credit www.cs.jcu.edu.au)

emp_name | qualifications | languages
SMITH | B. Sc | NULL
SMITH | Dip. CS | NULL
SMITH | NULL | FORTRAN
SMITH | NULL | COBOL
SMITH | NULL | PASCAL

Another Example (con’t)

Our second example 4NF decomposed:

Relation L:
emp_name | qualifications | languages
SMITH | B. Sc | NULL
SMITH | Dip. CS | NULL
SMITH | NULL | FORTRAN
SMITH | NULL | COBOL
SMITH | NULL | PASCAL

Relation M:
emp_name | qualifications
SMITH | B. Sc
SMITH | Dip. CS

Relation O:
emp_name | languages
SMITH | FORTRAN
SMITH | COBOL
SMITH | PASCAL
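The decomposition of L into M and O amounts to two projections that discard the NULL-padded rows. The following Python sketch shows one way to compute them; the helper `project` and the use of `None` for NULL are assumptions made for the example.

```python
ATTRS = ("emp_name", "qualifications", "languages")
L = [
    ("SMITH", "B. Sc", None),
    ("SMITH", "Dip. CS", None),
    ("SMITH", None, "FORTRAN"),
    ("SMITH", None, "COBOL"),
    ("SMITH", None, "PASCAL"),
]

def project(rows, attrs, keep):
    """Project rows onto `keep`, skipping rows with a NULL (None) and duplicates."""
    idx = [attrs.index(a) for a in keep]
    result = []
    for row in rows:
        picked = tuple(row[i] for i in idx)
        if None not in picked and picked not in result:
            result.append(picked)
    return result

M = project(L, ATTRS, ["emp_name", "qualifications"])
O = project(L, ATTRS, ["emp_name", "languages"])
print(M)
print(O)
```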

Explanation of second example

Unnecessary tuples are eliminated (those with NULL values), saving space.

Note: for this example, MxO is not similar to L. L pads each row with NULL instead of pairing every qualification with every language, so the MVD emp_name ↠ qualifications does not hold in L as written.

Second example (modified)

If we were to modify the second example L so that the MVD emp_name ↠ qualifications holds, we would need to combine every possible value of M with every one of O.

This relation would look similar to the product of M and O.

Second example (modified)

If we were to modify the second example L, we would need to combine every possible value of M with every one of O.

Relation L:
emp_name | qualifications | languages
SMITH | B. Sc | NULL
SMITH | Dip. CS | NULL
SMITH | NULL | FORTRAN
SMITH | NULL | COBOL
SMITH | NULL | PASCAL

Second example (modified)

If we were to modify the second example L, we would need to combine every possible value of M with every one of O.

In this relation the MVD emp_name ↠ qualifications holds. It is non-trivial and emp_name is not a superkey, so this relation is not in 4NF. Decomposing it into M and O yields two relations in which emp_name ↠ qualifications and emp_name ↠ languages are trivial.

Relation L with the MVD emp_name ↠ qualifications:
emp_name | qualifications | languages
SMITH | B. Sc | FORTRAN
SMITH | Dip. CS | FORTRAN
SMITH | B. Sc | COBOL
SMITH | Dip. CS | COBOL
SMITH | B. Sc | PASCAL
SMITH | Dip. CS | PASCAL
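Joining M and O on emp_name produces exactly the six combined rows shown above. The sketch below is illustrative only; `join_on_first` is a hypothetical helper written for two-column relations that share their first column.

```python
M = [("SMITH", "B. Sc"), ("SMITH", "Dip. CS")]
O = [("SMITH", "FORTRAN"), ("SMITH", "COBOL"), ("SMITH", "PASCAL")]

def join_on_first(left, right):
    """Natural join of two binary relations on their first column."""
    return [
        (name, qualification, language)
        for name, qualification in left
        for other_name, language in right
        if name == other_name
    ]

joined = join_on_first(M, O)
print(len(joined))  # 2 qualifications * 3 languages = 6
for row in joined:
    print(row)
```

Because the join can always rebuild the combined relation, storing M and O separately loses no information.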

Formal Definition of a MVD

Using the following relation (credit Ullman & Widom):

Note that A, B, and “others” are not actual attributes, but rather sets of attributes.

tuple | As | Bs | others
t | a1 | b1 | c1
u | a1 | b1 | c2
v | a1 | b2 | c2

(t and u agree on As and Bs; u and v agree on As and others.)

Formal Definition of a MVD (con’t)

Suppose A ↠ B holds, then:
t[A] = u[A] = v[A]
t[B] = u[B]
u[C] = v[C] (where C is the set of “others”)

If the MVD is to hold, u must exist: for every pair of tuples t and v that agree on A, the relation must also contain a tuple of the form u.

tuple | As | Bs | others
t | a1 | b1 | c1
u | a1 | b1 | c2
v | a1 | b2 | c2
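The definition translates directly into a brute-force check over one relation instance. The Python below is a sketch under that reading; the name `mvd_holds` is hypothetical, and the check says only whether the given rows satisfy the MVD, not whether the MVD is a constraint of the schema.

```python
ATTRS = ("Name", "AddrStreet", "AddrCity", "FilmName", "FilmYear")
ADDRS = [("123 Maple Dr.", "Hollywood"), ("5 Locust Ln.", "Malibu")]
FILMS = [
    ("Star Wars", 1977),
    ("The Empire Strikes Back", 1980),
    ("Return of the Jedi", 1983),
]
R = [("C. Fisher",) + addr + film for film in FILMS for addr in ADDRS]

def mvd_holds(rows, attrs, lhs, rhs):
    """True if lhs ->> rhs holds in this instance.

    For every pair t, v that agree on lhs there must be a tuple u with
    u[lhs] = t[lhs], u[rhs] = t[rhs], and u[others] = v[others].
    """
    rowset = set(rows)
    a_idx = [attrs.index(x) for x in lhs]
    b_idx = [attrs.index(x) for x in rhs]
    for t in rowset:
        for v in rowset:
            if all(t[i] == v[i] for i in a_idx):
                u = list(v)       # start from v, which keeps v's "others"
                for i in b_idx:
                    u[i] = t[i]   # take the B values from t
                if tuple(u) not in rowset:
                    return False
    return True

print(mvd_holds(R, ATTRS, ["Name"], ["AddrStreet", "AddrCity"]))  # True
print(mvd_holds(R, ATTRS, ["Name"], ["AddrStreet"]))              # False
print(mvd_holds(R[:-1], ATTRS, ["Name"], ["AddrStreet", "AddrCity"]))  # False
```

The last call removes a single row from R, which is enough to break the MVD: the dependency requires that tuple to exist.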

Formal Definition of a MVD (con’t)

For the MVD A ↠ B to be trivial, either B ⊆ A or B ∪ A = R must be true.

tuple | As | Bs | others
t | a1 | b1 | c1
u | a1 | b1 | c2
v | a1 | b2 | c2
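The triviality test needs only the attribute sets, not the data. A minimal Python sketch, with the hypothetical name `is_trivial_mvd`:

```python
def is_trivial_mvd(all_attrs, lhs, rhs):
    """A ->> B is trivial if B is a subset of A, or A union B is all of R."""
    a, b = set(lhs), set(rhs)
    return b <= a or (a | b) == set(all_attrs)

R_ATTRS = ["Name", "AddrStreet", "AddrCity", "FilmName", "FilmYear"]
M_ATTRS = ["emp_name", "qualifications"]

# Non-trivial in the five-attribute relation R.
print(is_trivial_mvd(R_ATTRS, ["Name"], ["AddrStreet", "AddrCity"]))  # False
# Trivial in the two-attribute relation M, because A union B is all of M.
print(is_trivial_mvd(M_ATTRS, ["emp_name"], ["qualifications"]))      # True
```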

Armstrong’s Axioms WRT MVDs

Many of Armstrong’s Axioms work with MVDs, including:

Reflexivity rule
Augmentation rule
Transitivity rule
Complementation rule
Multivalued augmentation rule
Multivalued transitivity rule
Replication rule
Coalescence rule
Multivalued union rule
Intersection rule
Difference rule

See below for specifics on these rules:
http://www.cs.sfu.ca/CC/354/zaiane/material/notes/Chapter7/node15.html
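As one concrete instance of these rules, the complementation rule says that if A ↠ B holds in R, then A ↠ (R − A − B) holds as well. The Python sketch below checks this on the Carrie Fisher relation. It is an illustration on a single instance, not a proof of the rule, and the helper names `mvd_holds` and `complement` are hypothetical.

```python
ATTRS = ("Name", "AddrStreet", "AddrCity", "FilmName", "FilmYear")
ADDRS = [("123 Maple Dr.", "Hollywood"), ("5 Locust Ln.", "Malibu")]
FILMS = [
    ("Star Wars", 1977),
    ("The Empire Strikes Back", 1980),
    ("Return of the Jedi", 1983),
]
R = [("C. Fisher",) + addr + film for film in FILMS for addr in ADDRS]

def mvd_holds(rows, attrs, lhs, rhs):
    """Brute-force check that lhs ->> rhs holds in this instance."""
    rowset = set(rows)
    a_idx = [attrs.index(x) for x in lhs]
    b_idx = [attrs.index(x) for x in rhs]
    for t in rowset:
        for v in rowset:
            if all(t[i] == v[i] for i in a_idx):
                u = list(v)       # keep v's remaining attributes
                for i in b_idx:
                    u[i] = t[i]   # take the rhs values from t
                if tuple(u) not in rowset:
                    return False
    return True

def complement(attrs, lhs, rhs):
    """Attributes of R that are in neither lhs nor rhs (R - A - B)."""
    return [x for x in attrs if x not in lhs and x not in rhs]

lhs = ["Name"]
rhs = ["AddrStreet", "AddrCity"]
rest = complement(ATTRS, lhs, rhs)

print(rest)  # ['FilmName', 'FilmYear']
print(mvd_holds(R, ATTRS, lhs, rhs), mvd_holds(R, ATTRS, lhs, rest))  # True True
```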

References

Course textbook: A First Course in Database Systems (Jeffrey D. Ullman and Jennifer Widom)

Normalizing your database: First Normal Form (1NF): http://webc.nicc.edu/~weedd/sysanaly/Chapter%206%20Database%20normalization/1NF.html

Multivalued Dependencies (Osmar Zaiane): http://www.cs.sfu.ca/CC/354/zaiane/material/notes/Chapter7/node13.html

Multivalued Dependencies: http://www.cs.jcu.edu.au/Subjects/cp1500/1998/Lecture_Notes/normalisation/mvd.html