dbms unit-5 notes
TRANSCRIPT
-
5/21/2018 Dbms Unit-5 Notes
A Helpful Hand
DATABASE MANAGEMENT SYSTEMS
UNIT V
SCHEMA REFINEMENT
DEPARTMENT OF COMPUTER SCIENCE
AND ENGINEERING
CSEROCKZ
DATABASE MANAGEMENT SYSTEMS UNIT-V
INTRODUCTION TO SCHEMA REFINEMENT
Conceptual database design gives us a set of relation schemas and integrity
constraints (ICs) that can be regarded as a good starting point for the final database
design. This initial design must be refined by taking the ICs into account and also
by considering performance criteria and typical workloads.
A major aim of relational database design is to minimize data redundancy. The
problems associated with data redundancy are illustrated as follows:
Problems caused by redundancy
Storing the same information in more than one place within a database is called
redundancy and can lead to several problems:
Redundant Storage: Some information is stored repeatedly.
Update Anomalies: If one copy of such repeated data is updated, an
inconsistency is created unless all copies are similarly updated.
Insertion Anomalies: It may not be possible to store certain information
unless some other, unrelated, information is stored as well.
Deletion Anomalies: It may not be possible to delete certain information
without losing some other, unrelated, information as well.
Ex: Consider a relation, Hourly_Emps(ssn, name, lot, rating, hourly_wages,
hours_worked)
The key for Hourly_Emps is ssn. In addition, suppose that the hourly_wages
attribute is determined by the rating attribute. That is, for a given rating value,
there is only one permissible hourly_wages value. This IC is an example of a
functional dependency. It leads to possible redundancy in the relation
Hourly_Emps, as shown below:
ssn   name         lot   rating   hourly_wages   hours_worked
…     Adithya      48    8        10             40
…     Devesh       22    8        10             30
…     Ayush Soni   …     5        …              30
…     Rajasekhar   …     5        …              32
…     Sunil        …     8        10             40
If the same value appears in the rating column of two tuples, the IC tells us that the
same value must appear in the hourly_wages column as well. This redundancy has
the following problems:
Redundant Storage: The rating value 8 corresponds to the hourly_wages value 10,
and this association is repeated three times.
Update Anomalies: The hourly_wages value in the first tuple could be updated
without making a similar change in the second tuple.
Insertion Anomalies: We cannot insert a tuple for an employee unless we
know the hourly_wages value for the employee's rating value.
Deletion Anomalies: If we delete all tuples with a given rating value (e.g.,
we delete the tuples for Ayush Soni and Rajasekhar) we lose the association
between that rating value and its hourly_wages value.
Null Values
Null values cannot provide a complete solution, but they can provide some help.
Consider the example Hourly_Emps relation. Here null values cannot help to
eliminate redundant storage, update or deletion anomalies. It appears that they can
address insertion anomalies. For instance, we can insert an employee tuple with
a null value in the hourly_wages field. However, null values cannot address all
insertion anomalies. For example, we cannot record the hourly wage for a rating
unless some employee has that rating, because ssn is the key and cannot be null.
Thus, null values do not provide a general solution to the problems of redundancy,
even though they can help in some cases.
Decompositions
A decomposition of a relation schema R consists of replacing the relation schema
by two or more relation schemas that each contain a subset of the attributes of R
and together include all attributes in R.
Ex: We can decompose Hourly_Emps into two relations:
Hourly_Emps2(ssn, name, lot, rating, hours_worked)
Wages(rating, hourly_wages)
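The decomposition above can be sketched as two projections of an Hourly_Emps instance. This is a minimal illustration, not part of the original notes: the ssn values, some lot values, and the wage for rating 5 are placeholders assumed for the sketch, and `project` is a hypothetical helper name.

```python
# Attribute order of the Hourly_Emps relation.
ATTRS = ("ssn", "name", "lot", "rating", "hourly_wages", "hours_worked")

# Illustrative instance. The ssn values (s1..s5), some lot values, and the
# hourly wage for rating 5 are assumptions made for this sketch.
HOURLY_EMPS = [
    ("s1", "Adithya", 48, 8, 10, 40),
    ("s2", "Devesh", 22, 8, 10, 30),
    ("s3", "Ayush Soni", 35, 5, 7, 30),
    ("s4", "Rajasekhar", 35, 5, 7, 32),
    ("s5", "Sunil", 35, 8, 10, 40),
]


def project(rows, attrs, keep):
    """Project rows onto the attributes in `keep`, dropping duplicate tuples."""
    idx = [attrs.index(a) for a in keep]
    result = []
    for row in rows:
        t = tuple(row[i] for i in idx)
        if t not in result:
            result.append(t)
    return result


hourly_emps2 = project(
    HOURLY_EMPS, ATTRS, ("ssn", "name", "lot", "rating", "hours_worked")
)
wages = project(HOURLY_EMPS, ATTRS, ("rating", "hourly_wages"))

# Each rating now appears exactly once in Wages, so the rating/wage
# association is no longer stored redundantly.
print(wages)
```

After the projection, Wages holds one tuple per rating, while Hourly_Emps2 still holds one tuple per employee.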
Problems Related to Decomposition
Two important questions must be asked during the decomposition process:
1. Do we need to decompose a relation?
2. What problems does a given decomposition cause?
To answer the first question, several normal forms have been proposed for relations.
If a relation schema is in one of these normal forms, we know that certain kinds of
problems cannot arise.
With respect to the second question, two properties of decompositions are to be considered:
The lossless-join property enables us to recover any instance of the decomposed
relation from corresponding instances of the smaller relations.
The dependency-preservation property enables us to enforce any constraint on the
original relation by simply enforcing some constraints on each of the smaller
relations. That is, we need not perform joins of the smaller relations to check
whether a constraint on the original relation is violated.
Functional Dependencies
A functional dependency (FD) is a kind of IC that generalizes the concept of a
key.
Let R be a relation schema and let X and Y be nonempty sets of attributes in R. We
say that an instance r of R satisfies the FD X → Y if the following holds for every
pair of tuples t1 and t2 in r: if t1.X = t2.X, then t1.Y = t2.Y.
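The definition of an FD can be checked directly on a relation instance. The sketch below assumes rows are represented as Python dicts; `satisfies_fd` is a hypothetical helper name and the sample rows are invented for illustration.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if any two rows that agree on `lhs` also agree on `rhs`.

    rows: list of dicts mapping attribute name -> value
    lhs, rhs: sequences of attribute names (the X and Y of X -> Y)
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y value seen for this X value and compare later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Invented sample rows, for illustration only.
rows = [
    {"rating": 8, "hourly_wages": 10},
    {"rating": 8, "hourly_wages": 10},
    {"rating": 5, "hourly_wages": 7},
]
ok = satisfies_fd(rows, ["rating"], ["hourly_wages"])

# Changing one wage for rating 8 breaks the FD rating -> hourly_wages.
broken = rows + [{"rating": 8, "hourly_wages": 12}]
not_ok = satisfies_fd(broken, ["rating"], ["hourly_wages"])
```

Note that an instance can only refute an FD; a single instance that satisfies it does not prove that the FD holds for the schema.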
Given a set of FDs over a relation schema R, typically several additional FDs hold
over R whenever all of the given FDs hold.
Ex: Consider Workers(ssn, name, lot, did, since)
with the given FDs ssn → did and did → lot. Then, in any legal instance of Workers, if
two tuples have the same ssn value, they must have the same did value, and
because they have the same did value, they must also have the same lot value.
Therefore, the FD ssn → lot also holds on Workers.
Closure of a Set of FDs
The set of all FDs implied by a given set F of FDs is called the closure of F,
denoted as F+. The closure F+ can be calculated by using the following
rules, known as Armstrong's Axioms. Let X, Y, and Z denote sets of attributes
over a relation schema R:
Reflexivity: If X ⊇ Y, then X → Y.
Augmentation: If X → Y, then XZ → YZ for any Z.
Transitivity: If X → Y and Y → Z, then X → Z.
Contracts(contractid, supplierid, projectid, deptid, partid, qty, value)
We denote the schema for Contracts as CSJDPQV.
The following are the given FDs:
i) C → CSJDPQV.
ii) JP → C.
iii) SD → P.
Several additional FDs hold in the closure of the set of given FDs:
From JP → C and C → CSJDPQV, and transitivity, we infer JP → CSJDPQV.
From SD → P and augmentation, we infer SDJ → JP.
From SDJ → JP, JP → CSJDPQV, and transitivity, we infer SDJ → CSJDPQV.
Note:
In a trivial FD, the right side contains only attributes that also appear on the
left side. Using reflexivity, we can generate all trivial dependencies, which are of the form:
X → Y, where Y ⊆ X.
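If we want to check whether a given dependency, say, X → Y, is in the closure of a
set F of FDs, we can do so without computing F+: we compute the attribute closure
X+ with respect to F (the set of attributes A such that X → A can be inferred using
Armstrong's Axioms) and check whether Y is contained in X+.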
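A common way to test whether an FD follows from a given set is the attribute-closure computation. The sketch below is one minimal version of it, checked against the Workers and Contracts examples from these notes; the function names `attribute_closure` and `implies` are chosen for this illustration, and single-letter attributes are written as strings for brevity.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`.

    fds: list of (lhs, rhs) pairs, each an iterable of attribute names.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """Return True if lhs -> rhs is in the closure of `fds`."""
    return set(rhs) <= attribute_closure(lhs, fds)


# Workers: ssn -> did and did -> lot imply ssn -> lot.
workers_fds = [(("ssn",), ("did",)), (("did",), ("lot",))]
ssn_lot = implies(workers_fds, ("ssn",), ("lot",))

# Contracts (CSJDPQV): C -> CSJDPQV, JP -> C, SD -> P.
contracts_fds = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
sdj_all = implies(contracts_fds, "SDJ", "CSJDPQV")
sd_c = implies(contracts_fds, "SD", "C")
```

Here `ssn_lot` and `sdj_all` come out true, matching the inferences made in the text, while `sd_c` is false because SD alone does not determine C.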
Normalization is a technique for producing a set of relations with desirable
properties, given the data requirements of an enterprise.
The process of normalization was first developed by E.F. Codd.
Normalization is often performed as a series of tests on a relation to
determine whether it satisfies or violates the requirements of a given normal
form.
Advantages of Normalization:
Less storage space
Reduces data redundancy in a database
Eliminates serious manipulation anomalies
Flexible structure
Normal Forms
Given a relation schema, we need to decide whether it is a good design or we need
to decompose it into smaller relations. To make such a decision, several normal
forms have been proposed.
The normal forms based on FDs are first normal form (1NF), second normal
form (2NF), third normal form (3NF), and Boyce-Codd normal form (BCNF).
First Normal Form (1NF):
A relation in which the intersection of each row and column contains one and only
one value.
Example:
EmpNum   EmpPhone   EmpDegrees
111      …
222      …          {BA, BSc, PhD}
333      …          {BSc, MSc}
Identification of multivalued attributes:
The above relation is not in 1NF as EmpDegrees is a multivalued field:
employee 333 has two degrees: BSc and MSc
employee 222 has three degrees: BA, BSc, PhD
Transformation into 1NF:
To bring this relation into 1NF, we remove the repeating group
(EmpDegrees details) by placing the repeating data along with a copy of the
original key attribute (EmpNum) in a separate relation. The resulting
1NF relations have the following format:
Employee(EmpNum, EmpPhone)
EmployeeDegree(EmpNum, EmpDegrees)
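The removal of the repeating group can be sketched in a few lines of Python. This is an illustration only: the phone values are placeholders, and `to_1nf` is a hypothetical helper name.

```python
# Unnormalized rows: EmpDegrees holds a list of values (a repeating group).
# The EmpPhone values are placeholders assumed for this sketch.
unnormalized = [
    {"EmpNum": 111, "EmpPhone": "phone-111", "EmpDegrees": []},
    {"EmpNum": 222, "EmpPhone": "phone-222", "EmpDegrees": ["BA", "BSc", "PhD"]},
    {"EmpNum": 333, "EmpPhone": "phone-333", "EmpDegrees": ["BSc", "MSc"]},
]


def to_1nf(rows):
    """Split rows into Employee(EmpNum, EmpPhone) and
    EmployeeDegree(EmpNum, EmpDegrees), one degree per tuple."""
    employee = [(r["EmpNum"], r["EmpPhone"]) for r in rows]
    employee_degree = [(r["EmpNum"], d) for r in rows for d in r["EmpDegrees"]]
    return employee, employee_degree


employee, employee_degree = to_1nf(unnormalized)
```

Every cell of both results now holds a single value; an employee without degrees simply contributes no tuple to EmployeeDegree.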
Employee:
EmpNum   EmpPhone
111      …
222      …
333      …
EmployeeDegree:
EmpNum   EmpDegrees
222      BA
222      BSc
222      PhD
333      BSc
333      MSc
Second Normal Form (2NF):
Partial Dependency: A partial dependency exists when an attribute B is
functionally dependent on an attribute A, and A is a component of a multipart
candidate key.
2NF: A relation is in 2NF if it is in 1NF, and every nonkey attribute is fully
dependent on each candidate key. (That is, we don't have any partial functional
dependency.)
A relation in 2NF will not have any partial dependencies.
Example:
Consider this InvLine table (in 1NF):
InvNum   LineNum   ProdNum   Qty   InvDate
The above relation has the following FDs:
{InvNum, LineNum} → ProdNum, Qty
{InvNum, ProdNum} → LineNum, Qty
InvNum → InvDate
Identification of partial dependencies:
InvLine is not in 2NF since there is a partial dependency of InvDate on InvNum.
Transformation into 2NF:
We can improve the database by decomposing the above relation into two relations:
InvNum   LineNum   ProdNum   Qty
InvNum   InvDate
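The search for partial dependencies can be automated once the candidate keys and FDs are known. The following is a minimal sketch for the InvLine example, assuming the two candidate keys {InvNum, LineNum} and {InvNum, ProdNum} implied by the FDs above; the helper names are hypothetical.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def partial_dependencies(candidate_keys, fds, all_attrs):
    """Return (proper_subset_of_a_key, nonkey_attribute) pairs."""
    prime = set().union(*candidate_keys)
    nonkey = set(all_attrs) - prime
    found = set()
    for key in candidate_keys:
        # Proper, nonempty subsets of the key.
        for size in range(1, len(key)):
            for subset in combinations(sorted(key), size):
                for attr in attribute_closure(subset, fds) & nonkey:
                    found.add((subset, attr))
    return found


attrs = {"InvNum", "LineNum", "ProdNum", "Qty", "InvDate"}
fds = [
    (("InvNum", "LineNum"), ("ProdNum", "Qty")),
    (("InvNum", "ProdNum"), ("LineNum", "Qty")),
    (("InvNum",), ("InvDate",)),
]
keys = [{"InvNum", "LineNum"}, {"InvNum", "ProdNum"}]
partials = partial_dependencies(keys, fds, attrs)
```

For InvLine the only pair reported is InvNum → InvDate, which is the dependency removed by the decomposition.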
Third Normal Form (3NF):
Transitive dependency: A condition where A, B, and C are attributes of a relation
such that if A → B and B → C, then C is transitively dependent on A via B.
3NF: A relation that is in first and second normal form, and in which no nonkey
attribute is transitively dependent on the candidate key.
In other words,
a relation in 3NF will not have any transitive dependencies of a nonkey attribute on
a candidate key through another nonkey attribute.
Formally it can be given as:
Let R be a relation schema and F be the set of FDs given to hold over R. Then R is
in 3NF if, for every nontrivial FD X → A in F, one of the following is true:
1) X is a superkey
or
2) A is part of some candidate key.
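The formal 3NF condition can be tested mechanically for small schemas. The sketch below finds candidate keys by brute force (fine for a handful of attributes, not for large schemas) and then checks each given FD; it is tried on the Employee relation used in this section's example. All function names are assumptions of this sketch.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def candidate_keys(all_attrs, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    attrs = sorted(all_attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so not minimal
            if attribute_closure(s, fds) == set(attrs):
                keys.append(s)
    return keys


def violations_3nf(all_attrs, fds):
    """Return (X, A) pairs where X is not a superkey and A is not prime."""
    prime = set().union(*candidate_keys(all_attrs, fds))
    bad = []
    for lhs, rhs in fds:
        lhs = set(lhs)
        is_superkey = attribute_closure(lhs, fds) == set(all_attrs)
        for a in sorted(set(rhs) - lhs):  # ignore the trivial part of the FD
            if not is_superkey and a not in prime:
                bad.append((tuple(sorted(lhs)), a))
    return bad


attrs = {"EmpNum", "EmpName", "DeptNum", "DeptName"}
fds = [
    (("EmpNum",), ("EmpName", "DeptNum", "DeptName")),
    (("DeptNum",), ("DeptName",)),
]
bad = violations_3nf(attrs, fds)
```

The only violation reported is DeptNum → DeptName: DeptNum is not a superkey and DeptName is not part of any candidate key.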
Example:
Consider an Employee relation:
EmpNum   EmpName   DeptNum   DeptName
This relation has the following FDs:
EmpNum → EmpName, DeptNum, DeptName
DeptNum → DeptName
Identifying transitive dependencies:
EmpName, DeptNum, and DeptName are nonkey attributes.
DeptNum determines DeptName, a nonkey attribute, and DeptNum is not a
candidate key.
Transformation into 3NF:
Is the relation in 3NF? No
Is the relation in 2NF? Yes
We correct the situation by decomposing the original relation into two 3NF
relations:
EmpNum   EmpName   DeptNum
DeptNum   DeptName
Boyce-Codd Normal Form (BCNF):
Named after its inventors: Boyce and Codd
Stronger than 3NF
Determinant: Refers to the attribute or group of attributes on the left hand side of
the arrow of a functional dependency.
Ex: Consider an FD, EmpNum → EmpEmail. Here, EmpNum is
a determinant of EmpEmail.
BCNF: A relation is in BCNF if and only if every determinant is a candidate key.
Fig: E.F. Codd
Formally it can be given as:
Let R be a relation schema and F be the set of FDs given over R. Then R is in BCNF
if, for every nontrivial FD X → A in F, X is a superkey.
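The BCNF condition reduces to one closure computation per FD. A minimal sketch follows, tried on the Contracts schema CSJDPQV with the FDs C → CSJDPQV, JP → C, and SD → P from these notes; `bcnf_violations` is a hypothetical name, and single-letter attributes are written as strings.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def bcnf_violations(all_attrs, fds):
    """Return the nontrivial FDs whose left side is not a superkey."""
    all_attrs = set(all_attrs)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)  # skip trivial FDs
        and attribute_closure(lhs, fds) != all_attrs  # left side is not a superkey
    ]


contracts_fds = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
bad = bcnf_violations("CSJDPQV", contracts_fds)
```

Only SD → P is reported, because C and JP each determine every attribute while SD determines only S, D, and P.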
Example:
Transformation into BCNF:
Note:
Any relation that is in BCNF is in 3NF
Any relation in 3NF is in 2NF
Any relation in 2NF is in 1NF
There is a sequence to normal forms:
1NF is considered the weakest
2NF is stronger than 1NF
3NF is stronger than 2NF, and
BCNF is considered the strongest
Multi-Valued Dependencies (MVD)
The possible existence of multivalued dependencies in a relation is due to first
normal form (1NF), which disallows an attribute in a tuple from having a set of
values.
For example, if we have two multivalued attributes in a relation, we have to repeat
each value of one of the attributes with every value of the other attribute, to ensure
that tuples of the relation are consistent. This type of constraint is referred to as a
multivalued dependency and results in data redundancy.
Consider the
Multi-Valued Dependency (MVD):
Represents a dependency between attributes (for example, A, B, and C) in a
relation, such that for each value of A there is a set of values for B and a set of
values for C. However, the sets of values for B and C are independent of each other.
We represent an MVD between attributes A, B, and C in a relation using the
following notation:
A →→ B
A →→ C
X →→ Y can be read as "X multi-determines Y".
In the instance below, tuples t1 and t2 agree on X, so the MVD X →→ Y requires
tuples t3 and t4 to be present as well:
X    Y    Z
x1   y1   z1   tuple t1
x1   y2   z2   tuple t2
x1   y1   z2   tuple t3
x1   y2   z1   tuple t4
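Whether an instance satisfies an MVD X →→ Y can be checked by grouping tuples on X and testing that every combination of a Y value with a value of the remaining attributes occurs. The sketch below uses the four-tuple instance over X, Y, Z as an example; `satisfies_mvd` is a hypothetical helper name.

```python
from itertools import product


def satisfies_mvd(rows, attrs, x, y):
    """Return True if the instance `rows` satisfies the MVD x ->> y.

    rows: list of tuples; attrs: attribute names in column order;
    x, y: tuples of attribute names. Z is every attribute not in x or y.
    """
    z = [a for a in attrs if a not in x and a not in y]

    def pick(row, names):
        return tuple(row[attrs.index(a)] for a in names)

    groups = {}
    for row in rows:
        ys, zs, pairs = groups.setdefault(pick(row, x), (set(), set(), set()))
        ys.add(pick(row, y))
        zs.add(pick(row, z))
        pairs.add((pick(row, y), pick(row, z)))
    # Within each X group, the (Y, Z) pairs must form the full cross product.
    return all(set(product(ys, zs)) == pairs for ys, zs, pairs in groups.values())


attrs = ("X", "Y", "Z")
rows = [
    ("x1", "y1", "z1"),  # t1
    ("x1", "y2", "z2"),  # t2
    ("x1", "y1", "z2"),  # t3
    ("x1", "y2", "z1"),  # t4
]
holds = satisfies_mvd(rows, attrs, ("X",), ("Y",))
fails = satisfies_mvd(rows[:3], attrs, ("X",), ("Y",))  # t4 missing
```

With all four tuples the MVD holds; dropping t4 leaves the combination (y2, z1) missing, so the check fails.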
The MVD X →→ Y says that the relationship between X and Y is independent of
the relationship between X and R − XY.
Fourth Normal Form (4NF):
A relation that is in Boyce-Codd normal form and contains no nontrivial MVDs.
Formally, let R be a relation schema, and let X and Y be nonempty subsets of the
attributes of R. R is said to be in 4NF if, for every MVD X →→ Y that holds over
R, one of the following is true:
Y is a subset of X or XY = R, or
X is a super key.
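The 4NF condition can be sketched as a check over a list of MVDs supplied by the designer. The code below is an illustration under that assumption: it inspects only the MVDs passed in (it does not derive implied ones), and the schema ABC with the MVD A →→ B is a hypothetical example, not taken from the notes.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def violations_4nf(all_attrs, fds, mvds):
    """Return the listed MVDs X ->> Y that are nontrivial and whose X is not a superkey."""
    all_attrs = set(all_attrs)
    bad = []
    for x, y in mvds:
        x, y = set(x), set(y)
        trivial = y <= x or (x | y) == all_attrs
        superkey = attribute_closure(x, fds) == all_attrs
        if not trivial and not superkey:
            bad.append((tuple(sorted(x)), tuple(sorted(y))))
    return bad


# Hypothetical schema ABC with no FDs and the MVD A ->> B.
bad = violations_4nf("ABC", [], [("A", "B")])
```

In this hypothetical schema A →→ B is nontrivial and A is not a superkey, so the relation is not in 4NF; adding the FD A → BC would make A a superkey and remove the violation.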
Example:
Consider the above
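Properties of Decomposition
Lossless-Join Decomposition:
Let R be a relation schema and let F be a set of FDs over R. A decomposition of R
into two schemas with attribute sets X and Y is said to be a lossless-join
decomposition with respect to F if, for every instance r of R that satisfies the
dependencies in F, πX(r) ⋈ πY(r) = r.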
Fig: Instances illustrating Lossy Decompositions (panels πSP(r) and πPD(r))
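A lossy decomposition shows up as spurious tuples when the projections are joined back. The sketch below demonstrates this on a small instance over attributes S, P, D; the tuple values are invented for illustration and are not claimed to match the original figure, and `project` and `natural_join` are hypothetical helper names.

```python
def project(rows, attrs, keep):
    """Project a set of tuples onto the attributes in `keep`."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(r1, attrs1, r2, attrs2):
    """Natural join of two sets of tuples; result columns are attrs1 plus
    the attributes of attrs2 that are not in attrs1."""
    common = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    out = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[attrs1.index(a)] == t2[attrs2.index(a)] for a in common):
                out.add(t1 + tuple(t2[attrs2.index(a)] for a in extra))
    return out


# Invented instance r over (S, P, D).
r = {("s1", "p1", "d1"), ("s2", "p2", "d2"), ("s3", "p1", "d3")}
sp = project(r, ("S", "P", "D"), ("S", "P"))
pd = project(r, ("S", "P", "D"), ("P", "D"))
joined = natural_join(sp, ("S", "P"), pd, ("P", "D"))
spurious = joined - r
```

The join contains every original tuple plus two that were never in r, so the original instance cannot be recovered from the two projections.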
Theorem: Let R be a relation and F be a set of FDs that hold over R. The
decomposition of R into relations with attribute sets R1 and R2 is lossless if and
only if F+ contains either the FD R1 ∩ R2 → R1 (or R1 − R2) or the FD R1 ∩ R2
→ R2 (or R2 − R1).
Consider the Hourly_Emps relation. It has attributes SNLRWH, and the FD R → W
causes a violation of 3NF. We dealt with this violation by decomposing the relation into
SNLRH and RW. Since R is common to both decomposed relations and R → W
holds, this decomposition is lossless-join.
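The theorem gives a simple test: compute the closure of the common attributes and see whether it covers one of the two schemas. A minimal sketch, applied to the Hourly_Emps decomposition; the FD S → SNLRWH encodes that ssn is the key, and the function names are assumptions of this sketch.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def is_lossless(r1, r2, fds):
    """Binary decomposition test: (R1 ∩ R2) must determine R1 or R2."""
    common = set(r1) & set(r2)
    closure = attribute_closure(common, fds)
    return set(r1) <= closure or set(r2) <= closure


# Hourly_Emps (SNLRWH): ssn is the key, and rating determines hourly_wages.
fds = [("S", "SNLRWH"), ("R", "W")]
good = is_lossless("SNLRH", "RW", fds)
# A split with no common attribute, shown only as a contrasting case.
poor = is_lossless("SNL", "RWH", fds)
```

The first decomposition passes because the common attribute R determines all of RW; the second fails because the two schemas share no attributes.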
Dependency-Preserving Decomposition:
Consider the Contracts relation with attributes CSJDPQV. The given FDs are
C → CSJDPQV, JP → C, and SD → P. Because SD is not a key, the dependency
SD → P causes a violation of BCNF.
We can decompose Contracts into relations with schemas CSJDQV and SDP to
address this violation. The decomposition is lossless-join. But there is one
problem. If we want to enforce the integrity constraint JP → C, it requires an
expensive join of the two relations. We say that this decomposition is not
dependency-preserving.
Let R be a relation schema that is decomposed into two schemas with attribute
sets X and Y, and let F be a set of FDs over R. The projection of F on X is the set
of FDs in the closure F+ that involve only attributes in X. We denote the projection
of F on attributes X as FX. Note that a dependency U → V in F+ is in FX only if all
the attributes in U and V are in X.
The decomposition of relation schema R with FDs F into schemas with attribute
sets X and Y is dependency-preserving if (FX ∪ FY)+ = F+.
Example:
Consider a relation R with attributes ABC that is decomposed into relations with
attributes AB and BC. The set F of FDs over R includes A → B, B → C, and C → A.
The closure of F contains all dependencies in F plus A → C, B → A, and C → B.
Consequently, FAB contains
A → B and B → A, and FBC contains B → C and C → B. Therefore, FAB ∪ FBC contains
A → B, B → C, B → A,
and C → B. The closure of FAB ∪ FBC now includes C → A (which follows from
C → B and B → A). Thus
the decomposition preserves the dependency C → A.
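Dependency preservation can be tested without materializing the projections FX and FY. The sketch below uses a standard iterative test: starting from the left side of an FD, it repeatedly adds whatever can be derived while staying inside one schema at a time. It is run on the ABC example above and on the Contracts decomposition; the function names are hypothetical.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def is_preserved(lhs, rhs, schemas, fds):
    """Return True if lhs -> rhs follows from the FDs projected onto `schemas`."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            schema = set(schema)
            # Attributes derivable from (result ∩ schema) that lie inside schema.
            gained = attribute_closure(result & schema, fds) & schema
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


def unpreserved(schemas, fds):
    """Return the given FDs that the decomposition does not preserve."""
    return [(l, r) for l, r in fds if not is_preserved(l, r, schemas, fds)]


abc_lost = unpreserved(["AB", "BC"], [("A", "B"), ("B", "C"), ("C", "A")])
contracts_lost = unpreserved(
    ["CSJDQV", "SDP"], [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
)
```

For ABC nothing is lost, in line with the example; for Contracts the check reports JP → C, the constraint that would need a join to enforce.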
EXERCISES
2) i) The following relation is in 3NF, but not in BCNF. Explain why
and take an appropriate action.
ii) If we reverse the direction between staff_id and class_code, then
determine whether it violates 2NF, 3NF, or BCNF. If yes, perform the
necessary action.
3) Normalize the following relation:
4) Given a relation with schema {ID, Name, Address, Postcode,
CardType, CardNumber}, the candidate key {ID}, and the following
functional dependencies:
• {ID} → {Name, Address, Postcode, CardType, CardNumber}
• {Address} → {Postcode}
• {CardNumber} → {CardType}
(i) Explain why this relation is in second normal form, but not in third
normal form.
(ii) Show how this relation can be converted to third normal form.
www.cserockz.com