dbms unit-5 notes

Upload: fathermother

Post on 11-Oct-2015

190 views

Category:

Documents


1 download

TRANSCRIPT

  • 5/21/2018 Dbms Unit-5 Notes

    1/27

    A Helpful Hand

    DATABASE MANAGEMENT SYSTEMS

    UNIT V

    SCHEMA REFINEMENT

    DEPARTMENT OF COMPUTER SCIENCE

    AND ENGINEERING

    CSEROCKZ

  • 5/21/2018 Dbms Unit-5 Notes

    2/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    INTRODUCTION TO SCHEMA REFINEMENTConceptual database design gives us a set of relation schemas and integrity

    constraints (ICs) that can be regarded as a good starting point for the final database

    design. This initial design must be refined by taking the ICs into account and also

    by considering performance criteria and typical workloads.

    A major aim of relational database design is to minimie data redundancy. The

    problems associated with data redundancy are illustrated as follows!

    Problems caused by redudacy

    "toring the same information in more than one place within a database is called

    redundancyand can lead to several problems!

    Reduda! S!ora"e# "ome information is stored repeatedly.

    U$da!e Aomal%es# If one copy of such repeated data is updated# an

    inconsistency is created unless all copies are similarly updated.

    Iser!%o Aomal%es# It may not be possible to store certain information

    unless some other# unrelated# information is stored as well.

    Dele!%o Aomal%es# It may not be possible to delete certain information

    without losing some other# unrelated# information as well.

    E Consider a relation# $ourly%&mps(ssn, name, lot, rating, hourly_wages,

    hours_worked)

    The key for $ourly%&mps is ssn. In addition# suppose that the hourly_wages

    attribute is determined by the rating attribute. That is# for a given rating value#

    there is only one permissible hourly_wages value. This IC is an e'ample of a

    functional dependency. It leads to possible redundancy in the relation

    $ourly%&mps# as shown below!

    ssn name lo

    t

    ratin

    g

    hourly%wages hours%worked

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    3/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    * Adithya +, , - +

    * /evesh 00 , - 1

    *+ Ayush

    "oni

    1 * 1

    2* 3ajasekhar

    1 * 10

    c- "unil 1 , - +

    If the same value appears in the rating column of two tuples# the IC tells us that the

    same value must appear in the hourly_wages column as well. This redundancy has

    the following problems!

    Redundant Storage:The rating value , corresponds to the hourly%wage -#

    and this association is repeated three times.

    Update Anomalies: The hourly_wage in the first tuple could be updated

    without making a similar change in the second tuple.

    Insertion Anomalies: 4e cannot insert a tuple for an employee unless we

    know the hourly_wage for the employee5s rating value.

    Deletion Anomalies: If we delete all tuples with a given rating value (e.g.#

    we delete the tuples for Ayush "oni and 3ajasekhar) we lose the association

    between that rating value and its hourly_wage value.

    Null )alues

    6ull values cannot provide a complete solution# but they can provide some help.

    Consider the e'ample $ourly%&mps relation. $ere null values cannot help to

    eliminate redundant storage# update or deletion anomalies. It appears that they can

    address insertion anomalies. 7or instance# we can insert an employee tuple with

    null values in the hourly wage field. $owever# null values cannot address all

    insertion anomalies. Thus# null values do not provide a general solution to theproblems of redundancy# even though they can help in some cases.

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    4/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    Decom$os%!%os

    A decom$os%!%o o* rela!%o sc+emaR consists of replacing the relation schema

    by two or more relation schemas that each contain a subset of the attributes of R

    and together include all attributes inR.

    Ee can decompose $ourly%&mps into two relations!

    $ourly%&mps0(ssn, name, lot, rating, hours_worked)

    4ages(rating

    , hourly_wages)

    Problems Rela!ed !o Decom$os%!%o

    Two important 8uestions must be asked during decomposition process!

    -. /o we need to decompose a relation9

    0. 4hat problems does a given decomposition cause9

    To answer a first 8uestion# several normal forms have been proposed for relations.

    If a relation schema is in one of these normal forms# we know that certain kinds of

    problems cannot arise.

    4ith respect second 8uestion# two properties of decomposition are to beconsidered!

    The lossless!oinproperty enables us to recover any instance of the relation of the

    decomposed relation from corresponding instances of the smaller relations.

    The dependencypreser"ationproperty enables us to enforce any constraint on the

    original relation by simply enforcing some constraints on each of the smaller

    relations. That is# we need not perform joins of the smaller relations to check

    whether a constraint on the original relation is violated.

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    5/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    Fuc!%oal De$edec%es

    A *uc!%oal de$edecy (7/) is a kind of IC that generalies the concept of a

    key.

    :et 3 be a relation schema and let ; and < be nonempty sets of attributes in 3. 4e

    say that an instance r of 3 satisfies the 7/ ;= t0.;# then t-.< > t0.

  • 5/21/2018 Dbms Unit-5 Notes

    6/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    Biven a set of 7/s over a relation schema 3# typically several additional 7/s hold

    over 3 whenever all of the given 7/s hold.

    E Consider 4orkers(ssn, name, lot, did, since)

    4ith given 7/sssn # did and did# lot. Then# in any legal instance of 4orkers# if

    two tuples have the same ssn value# they must have the same did value# and

    because they have the same did value# they must also have the same lot value.

    Therefore# the 7/ssn# lot also holds on 4orkers.

    Closure o* a Se! o* FDs

    The set of all 7/s implied by a given set $ of 7/s is called the closure o* F,

    denoted as $%. The closure $% can be calculated by using the following

    Arms!ro"-s A&%oms rules. :et ;#

  • 5/21/2018 Dbms Unit-5 Notes

    7/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    Contracts(contractid, supplierid, pro!ectid, deptid, partid, &ty, "alue)

    4e denote the schema for Contracts as 'S(D)*+.

    The following are the given 7/s!

    i '# 'S(D)*+.

    ii ()#'.

    iii SD#).

    "everal additional 7/s hold in the closure of the set of given 7/s!

    7rom()#' and '#'S(D)*+, and transitivity# we infer()#'S(D)*+.

    7rom SD#) and augmentation# we infer SD(# ().

    7rom SD(#(), ()#'S(D)*+, and transitivity# we infer SD(#'S(D)*+.

    Note:

    In a !r%.%al FD, the right side contains only attributes that also appear on the

    left side. Dsing refle-i"ity, we can generate all trivial dependencies# whichare of the form!

    ;=

  • 5/21/2018 Dbms Unit-5 Notes

    8/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    If we want to check whether a given dependency# say# ;=

  • 5/21/2018 Dbms Unit-5 Notes

    9/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    2ormali3ation is a techni8ue for producing a set of relations with desirable

    properties# given the data re8uirements of an enterprise.

    The process of normaliation was first developed by &.7.Codd.

    2ormali3ation is often performed as a series of tests on a relation to

    determine whether it satisfies or violates the re8uirements of a given normal

    form.

    Ad.a!a"es o* Normal%4a!%o#

    :ess storage space

    3educes data redundancy in a database

    It eliminates serious manipulation anomalies.

    7le'ible structure

    Normal Forms

    Biven a relation schema# we need to decide whether it is a good design or we need

    to decompose it into smaller relations. To make such a decision# several ormal

    *orms have been proposed.

    The normal forms based on 7/s are first normal form 452$, second normal

    form462$, third normal form472$, and8oyce'odd normal form 48'2$.

    F%rs! Normal Form 5/NF6#

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    10/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    A relation in which the intersection of each row and column contains one and only

    one value.

    E&am$le#

    &mp6um &mpHhone &mp/egrees

    --- +

    01,+--0

    000 +

    012,*+

    E ?A# ?"c# Hh/

    G

    111 +

    01+*,2

    E ?"c# J"c G

    Identification of multi"alued attri9utes:

    The above relation in o!in -67 as &mp/egrees is a multivalued field!

    employee 111 has two degrees!8ScandSc

    employee 000 has three degrees!8A, 8Sc, )hD

    ;ransformation into 52$:

    In order to make this relation to be in -67# we remove the repeating group

    (&mp/egrees details) by placing the repeating data along with a copy of the

    original key attribute ( &mp6um) in a separate relation. The format of the resulting

    -67 relations are as follows!

    &mployee( &mp6um# &mpHhone) &mployee/egree(&mp6um#

    &mp/egrees)

    &mp6um &mpHhone

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    11/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    --- +01,+--0

    000 +012,*+

    111 +01+*,2

    Secod Normal Form 50NF6#

    Par!%al De$edecy# A partial dependency e'ists when an attribute ? is

    functionally dependent on an attribute A# and A is a component of a multipart

    candidate key.

    0NF# A relation is in 067 if it is in -67# and every nonkey attribute is fully

    dependent on each candidate key. (That is# we don5t have any partial functional

    dependency.)

    A relation in 067 will not have any partial dependencies.

    E&am$le#

    Consider this I.3%e table (in -67)!

    Inv6um :ine6um Hrod6um Kty Inv/ate

    &mp6um &mp/egrees

    000 ?A

    000 ?"c

    000 Hh/

    111 ?"c

    111 J"c

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    12/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    The above relation has the following 7/s!

    EInv6um# :ine6umG=Hrod6um# Kty

    EInv6um# Hrod6umG=:ine6um# Kty

    Inv6um=Inv/ate

    Identification of partial dependencies:

    Inv:ine is o! 0NFsince there is a partial dependency of Inv/ate on Inv6um.

    ;ransformation into 62$:

    4e can impro"e the database by decomposing the above relation into relations!

    Inv6um :ine6um Hrod6um Kty

    Inv6um Inv/ate

    T+%rd Normal Form 57NF6#

    Tras%!%.e de$edecy# A condition where A# ?# and C are attributes of a relation

    such that if A =? and ?=C# then C is transitively dependent on A via ?.

    7NF# A relation that is in first and second normal form# and in which no nonkeyattribute is transitively dependent on the candidate key.

    In other words#

    A relation in 167 will not have any transitive dependencies of nonkey attribute on

    a candidate key through another non key attribute.

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    13/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    7ormally it can be given as!

    :et 3 be relation schema# 7 be the set of 7/s given to hold over 3# then for every

    7/ ;=A in 7# one of the following is true!

    -) ; is a candidate key

    or

    0) A is part of the key.

    E&am$le#

    Consider an Em$loyee relation!

    &mp6um &mp6ame /ept6um /ept6ame

    This relation has the following 7/s!

    &mp6um=&mp6ame# /ept6um# /ept6ame

    /ept6um=/ept6ame

    Identifying transiti"e dependencies:

    &mp6ame# /ept6um# and /ept6ame are nonkey attributes.

    /ept6um determines /ept6ame# a nonkey attribute# and /ept6um is not a

    candidate key.

    ;ransformation into 72$:

    Is the relation in 1679 L no

    Is the relation in 0679 L yes

    4e correct the situation by decomposing the original relation into two 167

    relations!

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    14/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    8oyce9Codd Normal Form 58CNF6#

    6amed after its inventors! ?oyce and

    Codd

    "tronger than 167

    De!erm%a!# 3efers to the attribute or

    group of attributes on the left hand side of

    the arrow of a functional dependency.

    &'! Consider an 7/#&mp6um=&mp&mail. $ere# &mp6um is

    a determinant of &mp&mail.

    8CNF#A relation is in ?C67# if and only

    if# every determinant is a candidate key.

    7ig! &.7.Codd

    &mp6um &mp6ame /ept6um /ept6um /ept6ame

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    15/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    7ormally it can be given as!

    :et 3 be a relation schema# 7 be the set of 7/s given over 3# then for every 7/

    ; =A in 7# ; should be a candidate key.

    E&am$le#

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    16/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    ;ransformation into 8'2$:

    Note:

    any relation that is in ?C67# is in 167

    any relation in 167 is in 067

    any relation in 067 is in -67

    There is a se8uence to normal forms!

    -67 is considered the weakest

    067 is stronger than -67

    167 is stronger than 067# and

    ?C67 is considered the strongest

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    17/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    Mul!%9)alued De$edec%es 5M)D6

    The possible e'istence of multivalued dependencies in a relation is due to first

    normal form (-67)# which disallows an attribute in a tuple from having a set of

    values.

    7or e'ample# if we have two multivalued attributes in a relation# we have to repeat

    each value of one of the attributes with every value of the other attribute# to ensure

    that tuples of a relation are consistent. This type of constraint is referred to as a

    multivalued dependency and results in data redundancy.

    Consider the

  • 5/21/2018 Dbms Unit-5 Notes

    18/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    Mul!%9)alued De$edecy 5M)D6#

    3epresents a dependency between attributes (for e'ample# A# ?# and C) in a

    relation# such that for each value of A there is a set of values for ? and a set of

    values for C. $owever# the set of values for ? and C are independent of each other.

    4e represent an JF/ between attributes A# ?# and C in a relation using the

    following notation!

    A == ?

    A == C

    For e&am$le,we specify the JF/ in the above t-(;)# t+( t0(

    t-().

    2$%%Y &a' (e read asX multi-determines Y.

    ; <

    '- y- - t-

    '- y0 0 t0

    '- y- 0 t1

    '- y0 - t+

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    19/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    The JF/ ;== < says that the relationship between ; and < is independent of

    the relationship between ; and 3M

  • 5/21/2018 Dbms Unit-5 Notes

    20/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    Four!+ Normal Form 5:NF6#

    A relation that is in ?oyceCodd normal form and contains no nontrivial JF/s.

    7ormally# :et 3 be a relation schema# ; and < be nonempty subsets of theattributes of 3. 3 is said to be in :NF, if# for every JF/ ;==< that holds over

    3# one of the following is true!

    < is a subset of ; or ;< > 3# or

    ; is a super key.

    E&am$le#

    Consider the above

  • 5/21/2018 Dbms Unit-5 Notes

    21/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    Pro$er!%es o* Decom$os%!%o

    3ossless9;o% Decom$os%!%o#

    :et 3 be a relation schema and let 7 be a set of 7/s over 3. A decomposition of 3into two schemas with attribute sets ; and < is said to be a lossless9

  • 5/21/2018 Dbms Unit-5 Notes

    22/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    N"H (r) NH/ (r)

    7ig! Instances illustrating :ossy /ecompositions

    T+eorem#:et 3 be a relation and 7 be a set of 7/s that hold over 3. The

    decomposition of 3 into relations with attribute sets 3- and 30 is lossless if and

    only if 7O contains either the 7/ 3- P 30 = 3- (or 3-M30) or the 7/ 3- P 30

    = 30 (or 30M3-).

    Consider the $ourly%&mps relation. It has attributes S2?R@, and the 7/ 3=4

    causes a violation of 167. 4e dealt this violation by decomposing the relation into

    S2?R and R@. "ince 3 is common to both decomposed relations and 3=4

    holds# this decomposition is losslessjoin.

    De$edecy9Preser.%" Decom$os%!%o#

    Consider the Contracts relation with attributes 'S(D)*+. The given 7/s are

    C=C"Q/HKF# QH=C# and "/=H. ?ecause "/ is not a key# the dependency

    "/=H causes a violation of ?C67.

    4e can decompose Contracts into relations with schemas C"Q/KF and "/H to

    address this violation. The decomposition is losslessjoin. ?ut# there is one

    problem. If we want to enforce an integrity constraint QH=C# it re8uires an

    e'pensive join of the two relations. 4e say that this decomposition is not

    dependencypreserving.

    :et 3 be a relation schema that is decomposed into two schemas with attributes

    sets ; and

  • 5/21/2018 Dbms Unit-5 Notes

    23/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    of 7/s in the closure 7Othat involve only attributes in ;. 4e denote the projection

    of 7 on attributes ; as 7; . 6ote that a dependency D=F in 7O is in 7; only if all

    the attributes in D and F are in ;.

    The decomposition of relation schema 3 with 7/s 7 into schemas with attributesets ; and < is de$edecy9$reser.%" if (7; D 7 7O.

    E&am$le#

    Consider the relation 3 with attributes A?C is decomposed into relations with

    attributes A? and ?C. The set of 7/s over 3 includes A=?# ?=C# and C=A.

    The closure of 7 contains all dependencies in 7 plus A=C# ?=A# and C=?.

    Conse8uently 7A? contains

    A=? and ?=A# and 7?C contains ?=C and C=?. Therefore# 7A? D 7?Ccontains

    A=?# ?=C# ?=A

    and C=?. The closure of 7A? and 7?C now includes C=A (which follows from

    C=? and ?=A). Thus

    the decomposition preserves the dependency C=A.

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    24/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    E1ERCISES

    0) i) The following relation is in 167# but not in ?C679 &'plain why

    and take an appropriate action9

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    25/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    ii) If we reverse the direction between staff%id and class%code# then

    determine whether it is violating 067# 167# ?C679 If yes# perform the

    necessary action.

    1) 6ormalie the following relation!

    +) Biven a relation with schema EI/# 6ame# Address# Hostcode#

    CardType# Card6umberG# the candidate key EI/G# and the following

    functional dependencies!

    R EI/G E6ame# Address# Hostcode# CardType# Card6umberG

    R EAddressG EHostcodeG

    R ECard6umberG ECardTypeG

    (i) &'plain why this relation is in second normal form# but not in third

    normal form.

    ' ' ' ( C S E R O C K Z ( C O M Page 2

  • 5/21/2018 Dbms Unit-5 Notes

    26/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    (ii) "how how this relation can be converted to third normal form.

  • 5/21/2018 Dbms Unit-5 Notes

    27/27

    DATABASE MANAGEMENT SYSTEMS UNIT-V

    www.cserockz.com

    Kee$ 'a!c+%" *or Re"ular U$da!esB(

    ' ' ' ( C S E R O C K Z ( C O M Page 2

    http://www.cserockz.com/http://www.cserockz.com/