chapter 8 normalization. © 2001 the mcgraw-hill companies, inc. all rights reserved....

28
Chapter 8 Chapter 8 Normalization

Post on 19-Dec-2015

221 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

Chapter 8Chapter 8Normalization

Page 2: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Outline Outline

Modification anomaliesFunctional dependenciesMajor normal formsRelationship independencePractical concerns

Page 3: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Modification AnomaliesModification Anomalies

Unexpected side effectInsert, modify, and delete more data than

desiredCaused by excessive redundanciesStrive for one fact in one place

Page 4: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Big University Database TableBig University Database Table

StdSSN StdClass OfferNo OffYear Grade CourseNo CrsDesc

S1 JUN O1 2000 3.5 C1 DB S1 JUN O2 2000 3.3 C2 VB S2 JUN O3 2000 3.1 C3 OO S2 JUN O2 2000 3.4 C2 VB

Page 5: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Functional DependenciesFunctional Dependencies

Constraint on the possible rows in a tableValue neutral like FKs and PKsAssertedUnderstand business rules

Page 6: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

FD DefinitionFD Definition

X YX (functionally) determines YX: left-hand-side (LHS) or determinantFor each X value, there is at most one Y

valueSimilar to candidate keys

Page 7: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

FD Diagrams and ListsFD Diagrams and Lists

StdSSN StdCity StdClass OfferNo OffTerm OffYear EnrGradeCourseNo CrsDesc

StdSSN StdCity, StdClass

OfferNo OffTerm, OffYear, CourseNo, CrsDesc

CourseNo CrsDesc

StdSSN, OfferNo EnrGrade

Page 8: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

FDs in DataFDs in Data

• Prove non-existence (but not existence) by looking at data

• Two rows that have the same X value but a different Y value

StdSSN StdClass OfferNo OffYear Grade CourseNo CrsDesc

S1 JUN O1 2000 3.5 C1 DB S1 JUN O2 2000 3.3 C2 VB S2 JUN O3 2000 3.1 C3 OO S2 JUN O2 2000 3.4 C2 VB

Page 9: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

NormalizationNormalization

Process of removing unwanted redundancies

Apply normal forms– Identify FDs– Determine whether FDs meet normal form– Split the table to meet the normal form if there

is a violation

Page 10: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Relationships of Normal FormsRelationships of Normal Forms1NF

2NF

3NF/BCNF

4NF

5NF

DKNF

Page 11: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

1NF1NF

Starting point for SQL2 databasesNo repeating groups: flat rows

StdSSN StdClass OfferNo OffYear Grade CourseNo CrsDesc

S1 JUN O1 2000 3.5 C1 DB O2 2000 3.3 C2 VB S2 JUN O3 2000 3.1 C3 OO O2 2000 3.4 C2 VB

Page 12: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Combined Definition of Combined Definition of 2NF/3NF2NF/3NFKey column: candidate key or part of

candidate keyAnalogy to the traditional justice oathEvery nonkey depends on a key, the whole

key, and nothing but the keyUsually taught as separate definitions

Page 13: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

2NF2NF

Every nonkey column depends on a whole key, not part of a key

Violations– Part of key nonkey– Violations only for combined keys

Page 14: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

2NF Example2NF Example

Many violations for the big university database table– StdSSN StdCity, StdClass– OfferNo OffTerm, OffYear, CourseNo,

CrsDescSplitting the table

– UnivTable1 (StdSSN, StdCity, StdClass) – UnivTable2 (OfferNo, OffTerm, OffYear,

CourseNo, CrsDesc)

Page 15: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

3NF3NF

Every nonkey column depends only on a key not on nonkey columns

Violations: Nonkey NonkeyAlternative formulation

– No transitive FDs– A B, B C then A C– OfferNo CourseNo, CourseNo CrsDesc

then OfferNo CrsDesc

Page 16: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

3NF Example3NF Example

One violation in UnivTable2– CourseNo CrsDesc

Splitting the table– UnivTable2-1 (OfferNo, OffTerm, OffYear,

CourseNo, CrsDesc)– UnivTable2-2 (CourseNo, CrsDesc)

Page 17: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

BCNFBCNF

Every determinant must be a candidate keySimpler definitionApply with simple synthesis procedureSpecial case not covered by 3NF

– Part of key Part of key– Special case is not common

Page 18: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

BCNF ExampleBCNF Example

Many violations for the big university database table– StdSSN StdCity, StdClass– OfferNo OffTerm, OffYear, CourseNo,

CrsDesc– CourseNo CrsDesc

Splitting into four tables

Page 19: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Simple Synthesis ProcedureSimple Synthesis Procedure

1. Eliminate extraneous columns from the LHSs.

2. Remove derived FDs. 3. Arrange the FDs into groups with each

group having the same determinant. 4. For each FD group, make a table with the

determinant as the primary key.5. Merge tables in which one table contains

all columns of the other table.

Page 20: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Simple Synthesis ExampleSimple Synthesis Example

Step 1: no extraneous columnsStep 2: eliminate OfferNo CrsDescStep 3: already arranged by LHSStep 4: four tables (Student, Enrollment,

Course, Offering)Step 5: no redundant tables

Page 21: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Relationship Independence and Relationship Independence and 4NF4NFM-way relationship that can be derived

from binary relationships Split into binary relationshipsSpecialized problem4NF does not involve FDs

Page 22: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Relationship Independence Relationship Independence ProblemProblem

StdSSNStdName

StudentOfferNoOffLocation

Offering

TextNoTextTitle

Textbook

EnrollStd-Enroll

Offer-Enroll

Text-Enroll

Page 23: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Relationship Independence Relationship Independence SolutionSolution

StdSSNStdName

Student

OfferNoOffLocation

Offering

TextNoTextTitle

Textbook

Enroll Orders

Page 24: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

MVDs and 4NFMVDs and 4NF

MVD: difficult to identify– A B | C (multi-determines)– A associated with a collection of B and C

values– B and C are independent– Nontrivial MVD: not also an FD

4NF: no nontrivial MVDs

Page 25: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Higher Level Normal FormsHigher Level Normal Forms

5NF for M-way relationshipsDKNF: absolute normal formDKNF is an ideal, not a practical normal

form

Page 26: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Role of NormalizationRole of Normalization

Refinement– Use after ERD– Apply to table design or ERD

Initial design– Record attributes and FDs– No initial ERD– May reverse engineer an ERD

Page 27: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

Normalization ObjectiveNormalization Objective

Update biasedNot a concern for databases without

updates (data warehouses)Denormalization

– Purposeful violation of a normal form– Some FDs may not cause anomalies– May improve performance

Page 28: Chapter 8 Normalization. © 2001 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Outline Modification anomalies Functional dependencies

© 2001 The McGraw-Hill

Companies, Inc. All rights reserved.

McGraw-Hill/Irwin

SummarySummary

Beware of unwanted redundanciesFDs are important constraintsStrive for BCNFUse a CASE tool for large problemsImportant tool of database developmentFocus on the normalization objective