class agenda (03/29 through 04/10)


Upload: seven

Post on 04-Feb-2016


Class Agenda (03/29 through 04/10)

• Review HW #7 (do this on 4/3/2012).
• Present normalization process.
  • Enhance conceptual knowledge of database design.
  • Improve practical skills of database design.
• Approach
  • Review goals of database design.
  • Identify and define vocabulary for normalization.
  • Do an “intuitive” database design for a refresher.
  • Discuss the characteristics of the three normal forms and the characteristics of a data model in third normal form.
  • Use the normalization process to do the same database design.
  • Compare results.


Review of database design goals

• Protect the integrity of the data.
  • Reduce data redundancy.
  • Prevent data anomalies.
• Provide for change.
  • Prevent inflexible data structures.
  • Anticipate changes.
• Provide access to complete data for decision making.


What are data anomalies?

An anomaly is a potential error or inconsistency in the data.

Data anomalies are most frequently caused by the implementation of M:N relationships.

An M:N relationship can be implemented through a “false” 1:M relationship.


Example of a “False” 1:m Relationship

Product
  PK      productID
          description
          unitofmeasure
          type
          eoq
          warehouseID
          warehousename
          warehousephone#

Order
  PK      orderid
  PK,FK1  productID
  PK      expectedshipdate
          dateplaced
          vendorID
          billinginstructions
          price
          quantityordered

Relationship: Product is ordered on Order.

Order (sample rows)
12100  5613  02-27-2012  03-10-2012  200  net30   4.99  11
12100  7816  02-27-2012  03-10-2012  200  net30  45.89  15
12100  5613  03-12-2012  03-26-2012  200  net30   4.78  50
12250  4512  02-30-2012  03-23-2012  231  cod     9.99  87
12250  5622  03-12-2012  03-18-2012  231  cod     5.70  25

Product (sample rows)
5613  tumbler 12 oz   ea  inventory  50  452  Sacramento South  9162234551
5613  tumbler 12 oz   ea  inventory  50  455  Sacramento North  9167817273
7816  food processor  ea  inventory  15  455  Sacramento North  9167817273
5622  paper           rm  supplies   25  452  Sacramento South  9162244551
4512  glass pitcher   ea  inventory  75  488  Reno              7753314551


Redundant Data

What data is redundant in the previous slide?

Is redundant data really a problem? Why or why not?

Do the primary keys for both entities yield unique values for each row in the table?


Potential anomalies in the example

Insertion anomaly: Can’t add “some” of a row; must have all the key attributes. Example - Suppose we need to add a new warehouse?

Deletion anomaly: Lose some relevant data when deleting other data. Example - What happens to the Reno warehouse information (name and phone#) when we delete item #4512?

Update anomaly: Must update more than one row when one piece of data changes. Examples - What happens if the telephone number at the Sacramento North warehouse changes? What happens if the date the purchase order was placed is entered incorrectly and must be updated?
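The update and deletion anomalies can be seen directly in the Product sample rows. The following is a minimal Python sketch, not part of the original slides: the dictionary layout and the helper names (`conflicting_phones`, `delete_product`, `warehouse_ids`) are assumptions made for illustration, and only the warehouse-related columns of the sample rows are kept.

```python
from collections import defaultdict

# Product sample rows from the example, reduced to the warehouse columns.
# The dictionary layout is an assumption made for this sketch.
PRODUCT_ROWS = [
    {"productID": 5613, "warehouseID": 452, "warehousename": "Sacramento South", "warehousephone": "9162234551"},
    {"productID": 5613, "warehouseID": 455, "warehousename": "Sacramento North", "warehousephone": "9167817273"},
    {"productID": 7816, "warehouseID": 455, "warehousename": "Sacramento North", "warehousephone": "9167817273"},
    {"productID": 5622, "warehouseID": 452, "warehousename": "Sacramento South", "warehousephone": "9162244551"},
    {"productID": 4512, "warehouseID": 488, "warehousename": "Reno", "warehousephone": "7753314551"},
]


def conflicting_phones(rows):
    """Return {warehouseID: sorted phones} for warehouses stored with more than one phone."""
    phones = defaultdict(set)
    for row in rows:
        phones[row["warehouseID"]].add(row["warehousephone"])
    return {wid: sorted(found) for wid, found in phones.items() if len(found) > 1}


def delete_product(rows, product_id):
    """Return the rows that remain after deleting every row for one product."""
    return [row for row in rows if row["productID"] != product_id]


def warehouse_ids(rows):
    """Return the set of warehouses that are still recorded anywhere in the rows."""
    return {row["warehouseID"] for row in rows}


if __name__ == "__main__":
    # Update anomaly: the same warehouse appears with two different phone numbers.
    print(conflicting_phones(PRODUCT_ROWS))
    # Deletion anomaly: deleting item 4512 also removes the only record of warehouse 488.
    print(warehouse_ids(delete_product(PRODUCT_ROWS, 4512)))
```

Because the warehouse phone number is repeated on every product row, nothing forces the copies to agree, and the only record of the Reno warehouse lives on a single product row.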


Other problems with “false” 1:m relationships

What happens when the database design grows or changes?

How do you add new data attributes?

What about keeping track of the buyer for an order? Or the manager for a warehouse?


What is normalization?

Normalization is a formal, process-oriented approach to data modeling.

Normalization is the process of:
• examining groups of data attributes;
• splitting them into appropriate entities;
• identifying the relationships between the entities; and
• identifying appropriate primary and foreign keys.


Database Normalization

What will you know about database normalization?
• Define normalization.
• Know the vocabulary of normalization.
• Understand the process of normalization.
• Better understand the characteristics of an effective database design.

What will you be able to do?
• Be able to identify the characteristics of each normal form.
• Be able to tell whether or not a data model is in third normal form.
• Potentially be able to use normalization to assist you in the design of a database.


Same old/same old

Normalization should sound like what you have already done during database design.

The ultimate goals of design have not changed; we are just going to go about it in a slightly different way.

Let’s start with an application and work through it using “intuitive” database design: the student information grade screen design exercise.


Normalization process

Some refer to this as the “bottom-up” form of database design.

Contrast with the more intuitive “top-down” approach we have been using.

The results from the normalization process are stable, flexible entities. The results from the intuitive approach should be the same.


Two methods of applying normalization

1. Use it to help in designing a database.
  • Normalization starts with a single entity.
  • Normalization breaks that entity into a series of additional entities.
  • More entities are discovered and named during the process.
  • Entities are linked during the process.

2. Use it to validate the design of a database.
  • Identify entities from the meaning of the data.
  • Create conceptual and logical data models.
  • Apply the rules of normalization to ensure a stable, non-redundant design.


Vocabulary for normalization

A “functional dependency” is a relationship between attributes in which one attribute or group of attributes determines the value of another.

A “determinant” is an attribute or group of attributes that, once known, can determine the value of another attribute.


Examples of functional dependencies and determinants

A social security number determines your name and address. SSN → name, address.

A vehicle id number determines the make and model of a car. VIN → make, model.

Name and address are “functionally dependent” on SSN.

SSN “determines” name and address.

Functional dependency diagram format:
• CourseID → CourseName, CourseDescription, CourseCredits
• ZipCode → City, State
• PatientID, TreatmentDateTime → TestResults
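A functional dependency such as ZipCode determining City and State can be tested against sample rows. The sketch below is an illustration only: the function name `holds` and the address rows are hypothetical and not taken from the slides. Note that sample data can disprove a dependency, but it can never prove that the dependency holds for all future data; that comes from the meaning of the data.

```python
def holds(rows, determinant, dependents):
    """Return True if, in these rows, each determinant value maps to one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in determinant)
        value = tuple(row[name] for name in dependents)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows, made up for this sketch.
ADDRESSES = [
    {"ZipCode": "95819", "City": "Sacramento", "State": "CA"},
    {"ZipCode": "95819", "City": "Sacramento", "State": "CA"},
    {"ZipCode": "95814", "City": "Sacramento", "State": "CA"},
    {"ZipCode": "89501", "City": "Reno", "State": "NV"},
]

if __name__ == "__main__":
    print(holds(ADDRESSES, ["ZipCode"], ["City", "State"]))  # ZipCode is a determinant here
    print(holds(ADDRESSES, ["City"], ["ZipCode"]))           # City does not determine ZipCode
```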


Normalization process

Normalization is accomplished in stages. A “normal form” is a state (level of completeness) of a data model.

Unnormalized data: A data model that has not been normalized. It contains repeating groups and is not a stable model.

Unnormalized data is essentially one entity. The system under analysis is categorized as a single entity.


Steps/forms/phases in Normalization

First normal form: Remove repeating groups.

Second normal form: Remove partial functional dependencies.

Third normal form: Remove transitive dependencies.


Unnormalized data for grade report exercise

• Semester
• Year
• Student Name
• Student Address
• Student City
• Student State
• Student Zip Code
• Student ID
• Student College
• Student Major
• Student Minor
• Student Year
• Course ID
• Course Title
• Course Instructor
• Course Credits
• Grade

What attributes might be needed that aren’t visible on the grade report?

Group all attributes in one “big” entity.
Identify a primary key for the entity. Maybe studentID for this one.


First Normal Form

First normal form: Remove repeating groups.
• A repeating group is an attribute or group of attributes that can have more than one value for an instance of an entity. If it is a single attribute, we have been calling it a “multi-valued” attribute.

To get a data model into first normal form:
• Identify repeating groups and place them as separate entities in the model.
• Identify a primary key for the repeating group. The key may be concatenated.
• Create the relationships between entities.
• Divide m:n relationships with appropriate intersection entities.
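The step of moving a repeating group into its own entity can be sketched in Python. This is a reduced illustration, not the solution to the grade report exercise: it keeps only a few attributes, and the record layout, the `Courses` field, the sample values, and the function name `to_first_normal_form` are all assumptions. The new entity gets the concatenated key (StudentID, CourseID) within this reduced example.

```python
def to_first_normal_form(report):
    """Split one unnormalized record into a student row and one row per repeating-group entry."""
    # Everything except the repeating group stays with the original entity.
    student = {name: value for name, value in report.items() if name != "Courses"}
    # Each entry of the repeating group becomes a row that carries the parent's key.
    enrollments = [
        {"StudentID": report["StudentID"], **course}
        for course in report["Courses"]
    ]
    return student, enrollments


# Hypothetical unnormalized record: "Courses" is the repeating group.
REPORT = {
    "StudentID": "S1",
    "StudentName": "Pat Example",
    "Courses": [
        {"CourseID": "C1", "CourseTitle": "Intro", "Grade": "A"},
        {"CourseID": "C2", "CourseTitle": "Design", "Grade": "B"},
    ],
}

if __name__ == "__main__":
    student, enrollments = to_first_normal_form(REPORT)
    print(student)
    for row in enrollments:
        print(row)
```

After the split, no attribute holds more than one value per row, and StudentID acts as the foreign key that relates each new row back to the student.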


Second Normal Form

Second normal form: Remove partial functional dependencies.

A partial functional dependency is a situation in which one or more non-key attributes are functionally dependent on part, but not all, of the primary key. Partial functional dependencies occur only with concatenated keys.

Examples of partial functional dependencies:
• PatientID, TreatmentDateTime → PatName, TstResults, TrtID, LocID
• CourseID, StudentID → CourseTitle, Grade

Which entities developed during the transition to first normal form for the grade report have concatenated keys?
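In the CourseID, StudentID example, CourseTitle depends on CourseID alone, while Grade needs the whole key. A minimal Python sketch of removing that partial dependency follows; the function name `remove_partial_dependency` and the sample rows are hypothetical, chosen only to illustrate the step.

```python
def remove_partial_dependency(rows, part_of_key, dependents):
    """Move attributes that depend on only part of the key into their own entity."""
    new_entity = {}
    remaining = []
    for row in rows:
        key = tuple(row[name] for name in part_of_key)
        # One row per distinct value of the partial key in the new entity.
        new_entity[key] = {name: row[name] for name in (*part_of_key, *dependents)}
        # The original entity keeps its full key and the fully dependent attributes.
        remaining.append({name: value for name, value in row.items() if name not in dependents})
    return list(new_entity.values()), remaining


# Hypothetical rows keyed by (CourseID, StudentID).
ENROLLMENTS = [
    {"CourseID": "C1", "StudentID": "S1", "CourseTitle": "Intro", "Grade": "A"},
    {"CourseID": "C1", "StudentID": "S2", "CourseTitle": "Intro", "Grade": "B"},
    {"CourseID": "C2", "StudentID": "S1", "CourseTitle": "Design", "Grade": "B"},
]

if __name__ == "__main__":
    courses, grades = remove_partial_dependency(ENROLLMENTS, ["CourseID"], ["CourseTitle"])
    print(courses)
    print(grades)
```

The course title is now stored once per course, and CourseID remains in the grade rows as both part of the key and a foreign key to the new entity.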


Third normal form

Third normal form: Remove transitive dependencies.
• A transitive dependency occurs when a non-key attribute is functionally dependent on one or more non-key attributes.

Third normal form examines entities with single primary keys and removes the “floating” or transitive dependencies.

It may be possible to have attributes that are determined by other attributes, rather than by the primary key. They must be removed into entities with appropriate primary keys.

Example of transitive dependencies:
PatID, TrtDateTime → TstResults, TrtType, TrtDescription, LocName, TrtID, LocID
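One way the PatID, TrtDateTime example might look after third normal form is sketched below with Python's built-in `sqlite3` module. The decomposition rests on an assumption the slides do not state outright: that TrtID determines TrtType and TrtDescription, and that LocID determines LocName. The table names, column types, and sample values are also hypothetical.

```python
import sqlite3

# Assumed decomposition: Treatment and Location hold the attributes that
# depend on TrtID and LocID rather than on the (PatID, TrtDateTime) key.
SCHEMA = """
CREATE TABLE Treatment (
    TrtID          TEXT PRIMARY KEY,
    TrtType        TEXT,
    TrtDescription TEXT
);
CREATE TABLE Location (
    LocID   TEXT PRIMARY KEY,
    LocName TEXT
);
CREATE TABLE PatientTreatment (
    PatID       TEXT,
    TrtDateTime TEXT,
    TstResults  TEXT,
    TrtID       TEXT REFERENCES Treatment(TrtID),
    LocID       TEXT REFERENCES Location(LocID),
    PRIMARY KEY (PatID, TrtDateTime)
);
"""


def build_demo():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Treatment VALUES ('T1', 'lab', 'blood panel')")
    conn.execute("INSERT INTO Location VALUES ('L1', 'Main Clinic')")
    conn.executemany(
        "INSERT INTO PatientTreatment VALUES (?, ?, ?, ?, ?)",
        [
            ("P1", "2012-04-03 09:00", "normal", "T1", "L1"),
            ("P2", "2012-04-03 10:00", "pending", "T1", "L1"),
        ],
    )
    return conn


def location_names(conn):
    """Return (PatID, LocName) pairs by joining through the LocID foreign key."""
    cursor = conn.execute(
        "SELECT pt.PatID, loc.LocName FROM PatientTreatment pt "
        "JOIN Location loc ON loc.LocID = pt.LocID ORDER BY pt.PatID"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = build_demo()
    # Renaming the location touches exactly one row, so no update anomaly arises.
    conn.execute("UPDATE Location SET LocName = 'North Clinic' WHERE LocID = 'L1'")
    print(location_names(conn))
```

Since the location name is stored once, a single-row update is reflected in every treatment row that references it.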


Summary of normalization process

Examine and evaluate the logical data model for effectiveness.
• Find the repeating groups and put the model into first normal form.
  • Identify primary key fields for any new entities.
  • Relate entities with foreign keys.
• Find the functional dependencies.
• Identify the partial functional dependencies and put the model into second normal form.
  • Identify primary key fields for any new entities.
  • Relate entities with foreign keys.
• Find the transitive dependencies and put the model into third normal form.
  • Identify primary key fields for any new entities.
  • Relate entities with foreign keys.


Goal of normalization

A set of entities where each attribute in each entity is dependent on the primary key, the whole primary key, and nothing but the primary key.