multi-valued dependencies and fourth normal form

Multi-valued Dependencies and Fourth Normal Form COSC 6340

Upload: wynona

Post on 19-Feb-2016


Page 1: Multi-valued Dependencies and Fourth Normal Form

Multi-valued Dependencies and Fourth Normal Form

COSC 6340

Page 2: Multi-valued Dependencies and Fourth Normal Form

Ch. Eick: 4thNF and MVD's 2

Topics Covered
1. Definition of Multivalued Dependencies
2. Reasoning about MVDs
3. Fourth Normal Form

Page 3: Multi-valued Dependencies and Fourth Normal Form

Motivation
There are schemas that are in BCNF that do not seem to be sufficiently normalized.

Stars
name       street          city       title                year
C. Fisher  123 Maple Str.  Hollywood  Star Wars            1977
C. Fisher  5 Locust Ln.    Malibu     Star Wars            1977
C. Fisher  123 Maple Str.  Hollywood  Empire Strikes Back  1980
C. Fisher  5 Locust Ln.    Malibu     Empire Strikes Back  1980
C. Fisher  123 Maple Str.  Hollywood  Return of the Jedi   1983
C. Fisher  5 Locust Ln.    Malibu     Return of the Jedi   1983

Page 4: Multi-valued Dependencies and Fourth Normal Form

Attribute Independence
No reason to associate address with one movie and not another.
When we repeat address and movie facts in all combinations, there is obvious redundancy.
However, there is NO BCNF violation in the Stars relation.
There are no non-trivial FD's at all; all five attributes form the only superkey. Why?

Page 5: Multi-valued Dependencies and Fourth Normal Form

Multi-valued Dependency
Definition: A multivalued dependency (MVD) A1A2…An ->> B1B2…Bm holds for relation R if:
For all tuples t, u in R: if t[A1A2…An] = u[A1A2…An], then there exists a v in R such that:
(1) v[A1A2…An] = t[A1A2…An] = u[A1A2…An]
(2) v[B1B2…Bm] = t[B1B2…Bm]
(3) v[C1C2…Ck] = u[C1C2…Ck], where C1C2…Ck is all attributes in R except (A1A2…An B1B2…Bm)
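The definition can be turned directly into a check on a single relation instance. The following is a minimal sketch in Python, assuming a relation is stored as a collection of tuples with a fixed attribute order; the function name `satisfies_mvd` and the constants `ATTRS` and `STARS` are illustrative and not part of the slides.

```python
from itertools import product

def satisfies_mvd(rows, attrs, lhs, rhs):
    """Check the MVD lhs ->> rhs on one relation instance (a set of tuples)."""
    a_idx = [attrs.index(a) for a in lhs]
    b_idx = [attrs.index(b) for b in rhs]
    # The C's: every attribute that is neither an A nor a B.
    c_idx = [i for i in range(len(attrs)) if i not in a_idx and i not in b_idx]
    instance = set(rows)
    for t, u in product(instance, repeat=2):
        if all(t[i] == u[i] for i in a_idx):
            v = list(t)              # conditions (1) and (2): A's and B's from t
            for i in c_idx:
                v[i] = u[i]          # condition (3): the C's from u
            if tuple(v) not in instance:
                return False         # the required tuple v is missing
    return True

ATTRS = ("name", "street", "city", "title", "year")
STARS = [
    ("C. Fisher", "123 Maple Str.", "Hollywood", "Star Wars", 1977),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Star Wars", 1977),
    ("C. Fisher", "123 Maple Str.", "Hollywood", "Empire Strikes Back", 1980),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Empire Strikes Back", 1980),
    ("C. Fisher", "123 Maple Str.", "Hollywood", "Return of the Jedi", 1983),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Return of the Jedi", 1983),
]

# Full six-row extent versus only its first three rows.
print(satisfies_mvd(STARS, ATTRS, ["name"], ["street", "city"]))
print(satisfies_mvd(STARS[:3], ATTRS, ["name"], ["street", "city"]))
```

Note that such a check only says whether one instance is consistent with the MVD; whether the MVD holds for the schema is a statement about all legal instances.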

Page 6: Multi-valued Dependencies and Fourth Normal Form

Pictorially Speaking...
An MVD guarantees v exists.
The existence of a fourth tuple w is implied by interchanging t and u.
[Figure: tuples t, u, v, w drawn across the column groups A's, B's, Others]

Page 7: Multi-valued Dependencies and Fourth Normal Form

Example: name ->> street city

Stars
name       street          city       title                year
C. Fisher  123 Maple Str.  Hollywood  Star Wars            1977
C. Fisher  123 Maple Str.  Hollywood  Empire Strikes Back  1980
C. Fisher  5 Locust Ln.    Malibu     Empire Strikes Back  1980
C. Fisher  5 Locust Ln.    Malibu     Star Wars            1977
C. Fisher  123 Maple Str.  Hollywood  Return of the Jedi   1983
C. Fisher  5 Locust Ln.    Malibu     Return of the Jedi   1983

[Slide marks three of these tuples as t, u, and v]

Page 8: Multi-valued Dependencies and Fourth Normal Form

Example: name ->> street city

Stars
name       street          city       title                year
C. Fisher  123 Maple Str.  Hollywood  Star Wars            1977
C. Fisher  5 Locust Ln.    Malibu     Star Wars            1977
C. Fisher  123 Maple Str.  Hollywood  Empire Strikes Back  1980
C. Fisher  5 Locust Ln.    Malibu     Empire Strikes Back  1980
C. Fisher  123 Maple Str.  Hollywood  Return of the Jedi   1983
C. Fisher  5 Locust Ln.    Malibu     Return of the Jedi   1983

[Slide marks four of these tuples as t, u, v, and w]

Page 9: Multi-valued Dependencies and Fourth Normal Form

More on MVDs
Intuitively, A1A2…An ->> B1B2…Bm says that the relationship between A1A2…An and B1B2…Bm is independent of the relationship between A1A2…An and R - {B1B2…Bm}.
MVD's uncover situations where independent facts related to a certain object are being squished together in one relation.
Functional dependencies rule out certain tuples from being in a relation. How?
Multivalued dependencies require that other tuples of a certain form be present in the relation (a.k.a. tuple-generating dependencies).

Page 10: Multi-valued Dependencies and Fourth Normal Form

Let's Illustrate
In Stars, we must repeat the movie (title, year) once for each address (street, city) a movie star has.
Alternatively, we must repeat the address for each movie a star has made.
Example: Stars with name ->> street city

name       street          city       title                year
C. Fisher  123 Maple Str.  Hollywood  Star Wars            1977
C. Fisher  5 Locust Ln.    Malibu     Empire Strikes Back  1980
C. Fisher  123 Maple Str.  Hollywood  Return of the Jedi   1983

is an incomplete extent of Stars.
Infer the existence of a fourth tuple under the given MVD.
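The "tuple-generating" reading of an MVD can be made concrete by repeatedly adding the tuple v that the definition demands until nothing is missing. The sketch below is one possible way to do this in Python; `mvd_closure` and the variable names are hypothetical helpers, not something defined in the course material.

```python
def mvd_closure(rows, attrs, lhs, rhs):
    """Smallest superset of `rows` that satisfies the MVD lhs ->> rhs."""
    a_idx = [attrs.index(a) for a in lhs]
    b_idx = [attrs.index(b) for b in rhs]
    c_idx = [i for i in range(len(attrs)) if i not in a_idx and i not in b_idx]
    result = set(rows)
    changed = True
    while changed:                       # repeat until no new tuple is generated
        changed = False
        for t in list(result):
            for u in list(result):
                if all(t[i] == u[i] for i in a_idx):
                    v = list(t)          # A's and B's from t
                    for i in c_idx:
                        v[i] = u[i]      # remaining attributes from u
                    v = tuple(v)
                    if v not in result:
                        result.add(v)
                        changed = True
    return result

ATTRS = ("name", "street", "city", "title", "year")
INCOMPLETE = {
    ("C. Fisher", "123 Maple Str.", "Hollywood", "Star Wars", 1977),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Empire Strikes Back", 1980),
    ("C. Fisher", "123 Maple Str.", "Hollywood", "Return of the Jedi", 1983),
}

complete = mvd_closure(INCOMPLETE, ATTRS, ["name"], ["street", "city"])
for row in sorted(complete - INCOMPLETE):
    print(row)                           # tuples the MVD forces into Stars
```

Pairing the first two rows already yields a fourth tuple; running the rule to a fixpoint shows every tuple the three given rows force under name ->> street city.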

Page 11: Multi-valued Dependencies and Fourth Normal Form

Trivial MVDs
Trivial MVD: A1A2…An ->> B1B2…Bm where B1B2…Bm is a subset of A1A2…An, or (A1A2…An B1B2…Bm) contains all attributes of R

Page 12: Multi-valued Dependencies and Fourth Normal Form

Reasoning About MVDs
FD-IS-AN-MVD Rule (Replication): If A1A2…An -> B1B2…Bm then A1A2…An ->> B1B2…Bm holds

Page 13: Multi-valued Dependencies and Fourth Normal Form

Reasoning About MVDs
COMPLEMENTATION Rule: If A1A2…An ->> B1B2…Bm then A1A2…An ->> C1C2…Ck, where C1C2…Ck is all attributes in R except (A1A2…An B1B2…Bm)
AUGMENTATION Rule: If X ->> Y and W ⊇ Z then WX ->> YZ
TRANSITIVITY Rule: If X ->> Y and Y ->> Z then X ->> (Z - Y)

Page 14: Multi-valued Dependencies and Fourth Normal Form

Coalescence Rule for MVD
If: X ->> Y and there exists a W such that W -> Z
Then: X -> Z
Remark: Y and W have to be disjoint and Z has to be a subset of or equal to Y

Page 15: Multi-valued Dependencies and Fourth Normal Form

Definition 4NF
Given: relation R and set of MVD's for R
Definition: R is in 4NF with respect to its MVD's if for every non-trivial MVD A1A2…An ->> B1B2…Bm, A1A2…An is a superkey
Note: Since every FD is also an MVD, 4NF implies BCNF
Example: Stars is not in 4NF
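A 4NF violation test for a single given MVD can be sketched as follows. This is an illustrative Python sketch, not the course's algorithm: the superkey test uses attribute closure over the explicitly given FD's only, so FD's that follow from MVD's (for example via the Coalescence Rule) are not taken into account. All function names are assumptions made for this example.

```python
def attribute_closure(attrs, fds):
    """Closure of `attrs` under the FD's in `fds` (pairs of lhs, rhs)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def is_trivial_mvd(schema, lhs, rhs):
    """Trivial: rhs is a subset of lhs, or lhs and rhs together cover the schema."""
    return set(rhs) <= set(lhs) or set(lhs) | set(rhs) == set(schema)

def violates_4nf(schema, fds, lhs, rhs):
    """True if the MVD lhs ->> rhs is non-trivial and lhs is not a superkey."""
    is_superkey = attribute_closure(lhs, fds) >= set(schema)
    return not is_trivial_mvd(schema, lhs, rhs) and not is_superkey

STARS_SCHEMA = ("name", "street", "city", "title", "year")

# Stars has no non-trivial FD's, so name is not a superkey.
print(violates_4nf(STARS_SCHEMA, [], ["name"], ["street", "city"]))
```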

Page 16: Multi-valued Dependencies and Fourth Normal Form

Decomposition Algorithm
(1) apply closure to the user-specified FD's and MVD's**
(2) repeat until no more 4NF violations:
    if R with AA ->> BB violates 4NF then:
    (2a) decompose R into R1(AA,BB) and R2(AA,CC), where CC is all attributes in R except (AA BB)
    (2b) assign FD's and MVD's to the new relations**

** MVD's: hard problem! No simple test analogous to computing the attribute closure for FD's exists for MVD's. You are stuck with using the 5 inference rules for MVD's when computing the closure!

Page 17: Multi-valued Dependencies and Fourth Normal Form

Exercise
Decompose Stars into a set of relations that are in 4NF.
name ->> street city is a 4NF violation.
Apply decomposition:
R(name, street, city)
S(name, title, year)
What about name ->> street city in R and name ->> title year in S?
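One way to convince yourself that this decomposition loses nothing is to project the six-row Stars extent onto R and S and join the projections back on name. The following Python sketch does this for the sample data only; `project` and the variable names are illustrative assumptions.

```python
def project(rows, attrs, keep):
    """Projection of `rows` onto the attributes in `keep` (duplicates removed)."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(r[i] for i in idx) for r in rows}

ATTRS = ("name", "street", "city", "title", "year")
STARS = {
    ("C. Fisher", "123 Maple Str.", "Hollywood", "Star Wars", 1977),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Star Wars", 1977),
    ("C. Fisher", "123 Maple Str.", "Hollywood", "Empire Strikes Back", 1980),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Empire Strikes Back", 1980),
    ("C. Fisher", "123 Maple Str.", "Hollywood", "Return of the Jedi", 1983),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Return of the Jedi", 1983),
}

R = project(STARS, ATTRS, ("name", "street", "city"))   # one row per address
S = project(STARS, ATTRS, ("name", "title", "year"))    # one row per movie

# Natural join of R and S on the shared attribute name.
rejoined = {r + s[1:] for r in R for s in S if r[0] == s[0]}

print(len(R), len(S), rejoined == STARS)
```

In R, the attributes name, street, city make up the whole schema, and likewise name, title, year in S, so by the definition of trivial MVD's both dependencies become trivial after the split.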

Page 18: Multi-valued Dependencies and Fourth Normal Form

Exercise
For the relation R(A,B,C,D) with only MVD's A ->> B and A ->> C, find all 4NF violations and decompose R into a collection of relation schemas in 4NF.

Page 19: Multi-valued Dependencies and Fourth Normal Form

Solution
Since there are no functional dependencies, the only key is all four attributes, ABCD.
Thus, each of the nontrivial multivalued dependencies A ->> B and A ->> C violates 4NF.
Separate out the attributes of these dependencies, first decomposing into AB and ACD.
Then decompose the latter into AC and AD, because A ->> C is still a 4NF violation for ACD.
The final set of relations is AB, AC, and AD.
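The steps of this solution can be replayed mechanically. The sketch below is a simplified Python version of the decomposition loop under two stated assumptions: there are no FD's (so the only key of a fragment is the fragment itself), and only the given MVD's are considered, each restricted to a fragment by intersecting its right-hand side with the fragment's attributes. It does not compute the full MVD closure, and `decompose_4nf` is a hypothetical name.

```python
def decompose_4nf(schema, mvds):
    """Split `schema` until no given MVD, restricted to a fragment, violates 4NF.

    Assumes no FD's, so a left-hand side is a superkey only if it is the
    whole fragment.
    """
    schema = frozenset(schema)
    for lhs, rhs in mvds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        if not lhs <= schema:
            continue                          # MVD does not apply to this fragment
        rhs_here = (rhs & schema) - lhs       # right-hand side inside the fragment
        rest = schema - lhs - rhs_here        # the "CC" attributes of step (2a)
        if rhs_here and rest:                 # non-trivial here, and lhs is not a key
            return (decompose_4nf(lhs | rhs_here, mvds)
                    + decompose_4nf(lhs | rest, mvds))
    return [schema]

fragments = decompose_4nf("ABCD", [("A", "B"), ("A", "C")])
print(sorted("".join(sorted(f)) for f in fragments))
```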

Page 20: Multi-valued Dependencies and Fourth Normal Form

Exercise
Suppose we have relation R(A,B,C) with MVD A ->> B. If we know that the tuples (a,b1,c1), (a,b2,c2), and (a,b3,c3) are in the current instance of R, what other tuples do we know must also be in R?

Page 21: Multi-valued Dependencies and Fourth Normal Form

Solution
Since A ->> B, and all the tuples have the same value for attribute A, we can pair the B-value from any tuple with the value of the remaining attribute C from any other tuple.
Thus, we know that R must have at least the nine tuples of the form (a,b,c), where b is any of b1, b2, or b3, and c is any of c1, c2, or c3. That is, we can derive, using the definition of a multivalued dependency, that each of the tuples (a,b1,c2), (a,b1,c3), (a,b2,c1), (a,b2,c3), (a,b3,c1), and (a,b3,c2) is also in R.
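The pairing argument amounts to taking, for each A-value, the cross product of the B-values and C-values seen with it. A short Python sketch of that computation follows; `required_tuples` is an illustrative name, and the code is specific to a three-attribute relation R(A,B,C) with A ->> B.

```python
from collections import defaultdict
from itertools import product

def required_tuples(rows):
    """All tuples R(A,B,C) must contain given `rows` and the MVD A ->> B."""
    groups = defaultdict(lambda: (set(), set()))
    for a, b, c in rows:
        groups[a][0].add(b)       # B-values seen with this A-value
        groups[a][1].add(c)       # C-values seen with this A-value
    return {(a, b, c)
            for a, (bs, cs) in groups.items()
            for b, c in product(bs, cs)}

known = {("a", "b1", "c1"), ("a", "b2", "c2"), ("a", "b3", "c3")}
required = required_tuples(known)
derived = sorted(required - known)

print(len(required))
for row in derived:
    print(row)
```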

Page 22: Multi-valued Dependencies and Fourth Normal Form

Another View of 4NF
[Figure: diagram of the set of MVD's X ->> Y with regions "Trivial MVD's", "MVD's that are also FD's" (X ->> Y and X -> Y), and "True MVD's" (X ->> Y and not X -> Y)]
4NF := Relation is in BCNF and there are no true MVD's (yellow part is empty)
True MVD X ->> Y := non-trivial & X -> Y does not hold
Remark: If X ->> Y is a true MVD then X cannot be a superkey (because X -> Y does not hold); therefore, true MVD's always violate 4NF ("true MVD's are always bad")

Page 23: Multi-valued Dependencies and Fourth Normal Form

H1-2005 Problem 8
8) Normalization [6] graded
R(A,B,C,D,E,F) is given with: (1) AB -> CD (2) CD -> AB (3) AB -> F (4) F -> E
a) What are the candidate keys of relation R? [1]
b) Transform R into a relational schema that is in BCNF and does not have any lost functional dependencies! [5]

Correct Solution:
a) Candidate keys: AB and CD
b) Decompose R into R1(A,B,C,D,F) with local FD's (1), (2), (3) and R2(E,F) with local FD (4). Since all four dependencies are still present, no functional dependency has been lost. Moreover, all functional dependencies are good.

A non-optimal ("too many relations") solution I also saw was: Decompose R into R1(A,B,C,D) with local functional dependencies AB -> CD and CD -> AB, R2(A,B,F) with local functional dependency AB -> F, and R3(F,E) with local functional dependency F -> E.
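The candidate keys and the BCNF property of the proposed relations can be checked with attribute closure. The Python sketch below uses brute-force enumeration of attribute subsets, which is fine for six attributes; the BCNF test only inspects the explicitly listed local FD's, and all function names are chosen for this example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of lhs, rhs strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure covers the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue              # contains a smaller key, so not minimal
            if closure(attrs, fds) >= set(schema):
                keys.append(attrs)
    return keys

def is_bcnf(schema, local_fds):
    """Every listed FD must have a superkey of `schema` on its left-hand side."""
    return all(closure(lhs, local_fds) >= set(schema) for lhs, _ in local_fds)

FDS = [("AB", "CD"), ("CD", "AB"), ("AB", "F"), ("F", "E")]

keys = candidate_keys("ABCDEF", FDS)
r_ok = is_bcnf("ABCDEF", FDS)        # original relation R
r1_ok = is_bcnf("ABCDF", FDS[:3])    # R1 with FD's (1), (2), (3)
r2_ok = is_bcnf("EF", FDS[3:])       # R2 with FD (4)

print(sorted("".join(sorted(k)) for k in keys), r_ok, r1_ok, r2_ok)
```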

Page 24: Multi-valued Dependencies and Fourth Normal Form

Problem 1; H1-2004
a. Candidate keys are: {a,b}, {a,d}, {a,e}
b. 14 superkeys total
c. All but the first functional dependency are bad

Page 25: Multi-valued Dependencies and Fourth Normal Form

Problem 2; H2-04
a. No; E ->> BC is a "true" multi-valued dependency and E is not a candidate key (as a matter of fact {E}+ = {A,D,E,F}, see below).
b. No (but just mentioning that neither E -> ABC nor E -> CF holds is not sufficient, e.g. if E ->> ABC holds then the decomposition is lossless!); a counterexample should be given to show that the statement is false!
c. Yes; C is a candidate key; therefore C -> BDEF; therefore C ->> BDEF.
d. Yes; E ->> BC and BC ->> BCD imply E ->> D due to MVD-transitivity (C -> CD => BC -> BCD => BC ->> BCD).
e. Yes; E ->> BC; therefore E ->> ADF; moreover, C -> ADF and using the Coalescence Rule we obtain E -> ADF; therefore, E -> A holds.
f. No; R is not in BCNF because E -> ADF holds and E is not a candidate key.

Page 26: Multi-valued Dependencies and Fourth Normal Form

Problem 3a-2004
From A ->> B and A ->> C we can infer: A ->> BC??
(1) A ->> C  =>  A ->> AC
(2) A ->> B  =>  AC ->> ABC
(3) AC ->> ABC  =>  AC ->> DE
(4) A ->> AC, AC ->> DE  =>  A ->> DE
(5) A ->> DE  =>  A ->> BC
Using: 1. Augmentation, 2. Augmentation, 3. Complementation, 4. Transitivity, 5. Complementation

Wrong!!

Remark: This problem will be revised in Homework3-2005; it is too complicated to worry about for the midterm exam!

Page 27: Multi-valued Dependencies and Fourth Normal Form

MVD's and FD's --- Ungraded Homework
Assume we have a relation R(A,B,C,D,E) with the following dependencies:
(1) AB -> CDE
(2) CD -> ABE
(3) E ->> DB
Answer the following questions giving reasons for your answers:
a) Is R in BCNF? ????? (answer after Spring break) Warning: The presence of the MVD might imply other functional dependencies (see textbook page 637)
b) Does ABE D hold for R? yes
c) Does CD B hold for R? yes
d) Does E D always hold for R (either show that this dependency can be inferred from the given 3 dependencies, or give a counterexample of a relation that satisfies (1), (2), (3) but violates ED)? No