Database Conceptual and Logical Design

Zachary G. Ives/Grigoris Karvounarakis

University of Pennsylvania

CIS 550 – Database & Information Systems

October 3, 2007

Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan

Now: How Do We Get the Database in the First Place?

Database design theory!

Neat outcome: we can actually prove that we have an optimal design, in a manner of speaking…

But first we need to understand how to visualize a design in pretty pictures…

Databases Anonymous: A 6-Step Program

1. Requirements Analysis: what data, apps, critical operations

2. Conceptual DB Design: high-level description of data and constraints – typically using ER model

3. Logical DB Design: conversion into a schema

4. Schema Refinement: normalization (eliminating redundancy)

5. Physical DB Design: consider workloads, indexes and clustering of data

6. Application/Security Design

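Entity-Relationship Diagram (based on our running example)

[ER diagram: entity sets STUDENTS (sid, name), COURSES (serno, subj, cid), and PROFESSORS (fid, name). Relationship set Takes (attribute exp-grade) connects STUDENTS and COURSES; relationship set Teaches (attribute semester) connects PROFESSORS and COURSES.]

Diagram annotations: entity set; relationship set; attributes (recall these have domains)

Underlined attributes are keys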

Conceptual Design Process

What are the entities being represented?

What are the relationships?

What info (attributes) do we store about each?

What keys & integrity constraints do we have?

[Diagram fragment: entity set STUDENTS (sid, name) connected to relationship set Takes (exp-grade).]

Translating Entity Sets to Logical Schemas & SQL DDL

CREATE TABLE STUDENTS (
    sid INTEGER,
    name VARCHAR(15),
    PRIMARY KEY (sid)
);

CREATE TABLE COURSES (
    serno INTEGER,
    subj VARCHAR(30),
    cid CHAR(15),
    PRIMARY KEY (serno)
);

Fairly straightforward to generate a schema…

Translating Relationship Sets

Generate schema with attributes consisting of:

• Key(s) of each associated entity (foreign keys)

• Descriptive attributes

CREATE TABLE Takes (
    sid INTEGER,
    serno INTEGER,
    exp_grade CHAR(1),
    PRIMARY KEY (?),
    FOREIGN KEY (serno) REFERENCES COURSES,
    FOREIGN KEY (sid) REFERENCES STUDENTS
)

… OK, But What about Connectivity in the E-R Diagram?

Attributes can only be connected to entities or relationships

Entities can only be connected via relationships

As for the edges, let’s consider kinds of relationships and integrity constraints…

[Diagram: entity sets PROFESSORS and COURSES connected via the relationship set Teaches.]

(warning: the book has a slightly different notation here!)

Logical Schema Design

Roughly speaking, each entity set or relationship set becomes a table (this will not always be the case; see Monday)

Attributes associated with each entity set or relationship set become attributes of the relation; the key is also copied (ditto with foreign keys in a relationship set)

Binary Relationships & Participation

Binary relationships can be classified as 1:1, 1:Many, or Many:Many, as in:

[Figure: three example mappings between entity sets, labeled 1:1, 1:n, and m:n.]

1:Many (1:n) Relationships

Place an arrow in the many → one direction, i.e., towards the entity that is referenced via a foreign key

Suppose profs teach multiple courses, but may not have taught yet: partial participation (0 or more…)

[Diagram: PROFESSORS, Teaches, COURSES, with partial participation of PROFESSORS.]

Suppose profs must teach to be on the roster: total participation (1 or more…)

[Diagram: PROFESSORS, Teaches, COURSES, with total participation of PROFESSORS.]

Many-to-Many Relationships

• Many-to-many relationships have no arrows on edges

• The “relationship set” relation has a key that includes the foreign keys, plus any other attributes specified as key

[Diagram: STUDENTS and COURSES connected via Takes, with no arrows on the edges.]
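Applied to Takes, the rule above could look like the following sketch. It assumes the key consists of exactly the two foreign keys (no further key attributes) and spells exp-grade as exp_grade, since a hyphen is not valid in an unquoted SQL identifier. Column types follow the earlier DDL.

```sql
CREATE TABLE STUDENTS (
    sid  INTEGER,
    name VARCHAR(15),
    PRIMARY KEY (sid)
);

CREATE TABLE COURSES (
    serno INTEGER,
    subj  VARCHAR(30),
    cid   CHAR(15),
    PRIMARY KEY (serno)
);

-- Many-to-many: the key is the combination of both foreign keys.
-- A student may take many courses and a course may have many students,
-- but each (student, course) pair appears at most once.
CREATE TABLE Takes (
    sid       INTEGER,
    serno     INTEGER,
    exp_grade CHAR(1),
    PRIMARY KEY (sid, serno),
    FOREIGN KEY (sid)   REFERENCES STUDENTS (sid),
    FOREIGN KEY (serno) REFERENCES COURSES (serno)
);
```

Choosing only sid or only serno as the key would silently turn the relationship into a 1:n one.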

Examples

Suppose courses must be taught to be on the roster

Suppose students must have enrolled in at least one course

Representing 1:n Relationships in Tables

• Key of relationship set:

CREATE TABLE Teaches (
    fid INTEGER,
    serno INTEGER,
    semester CHAR(4),
    PRIMARY KEY (serno),
    FOREIGN KEY (fid) REFERENCES PROFESSORS,
    FOREIGN KEY (serno) REFERENCES COURSES
);

• Or embed relationship in “many” entity set:

CREATE TABLE Teaches_Course (
    serno INTEGER,
    subj VARCHAR(30),
    cid CHAR(15),
    fid INTEGER,
    name CHAR(40),
    PRIMARY KEY (serno),
    FOREIGN KEY (fid) REFERENCES PROFESSORS
);
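With the embedded form, a NOT NULL constraint on the foreign key is one way to express that every course must be taught (total participation of COURSES in Teaches). This is a sketch under assumptions: PROFESSORS is keyed on an integer fid, its name column type is a guess, and the semester attribute is carried along in the merged table.

```sql
CREATE TABLE PROFESSORS (
    fid  INTEGER,
    name VARCHAR(40),
    PRIMARY KEY (fid)
);

CREATE TABLE Teaches_Course (
    serno    INTEGER,
    subj     VARCHAR(30),
    cid      CHAR(15),
    semester CHAR(4),
    fid      INTEGER NOT NULL,   -- every course row must name a professor
    PRIMARY KEY (serno),         -- each course has at most one professor
    FOREIGN KEY (fid) REFERENCES PROFESSORS (fid)
);
```

The reverse constraint, that every professor teaches at least one course, cannot be stated with key and NOT NULL constraints alone.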

1:1 Relationships

If you borrow money or have credit, you might get:

[ER diagram: entity sets CreditReport and Borrower connected by the 1:1 relationship set Describes; attributes shown: rid, debt, delinquent?, ssn, name.]

What are the table options?
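One possible table layout, sketched under assumptions: keep two tables and embed the relationship in CreditReport as a foreign key that is both UNIQUE and NOT NULL, so each report describes exactly one borrower and no borrower has two reports. The assignment of attributes to entity sets (ssn and name on Borrower; rid, debt, and delinquent on CreditReport) and all column types are guesses from the diagram labels.

```sql
CREATE TABLE Borrower (
    ssn  CHAR(11),
    name VARCHAR(40),
    PRIMARY KEY (ssn)
);

CREATE TABLE CreditReport (
    rid        INTEGER,
    debt       REAL,
    delinquent CHAR(1),                    -- assumed encoding: 'Y' or 'N'
    ssn        CHAR(11) NOT NULL UNIQUE,   -- UNIQUE is what makes this 1:1 rather than 1:n
    PRIMARY KEY (rid),
    FOREIGN KEY (ssn) REFERENCES Borrower (ssn)
);
```

Another option is a single merged table; which one fits better depends on whether every borrower is required to have a report.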

Roles: Labeled Edges

Sometimes a relationship connects the same entity, and the entity has more than one role:

This often indicates the need for recursive queries

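[ER diagram: entity set Parts (id, name) connected twice to the relationship set Includes (qty); the two edges are labeled with the roles Assembly and Subpart.]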

DDL for Role Example

CREATE TABLE Parts (
    Id INTEGER,
    Name CHAR(15),
    …
    PRIMARY KEY (Id)
)

CREATE TABLE Includes (
    Assembly INTEGER,
    Subpart INTEGER,
    Qty INTEGER,
    PRIMARY KEY (Assembly, Subpart),
    FOREIGN KEY (Assembly) REFERENCES Parts,
    FOREIGN KEY (Subpart) REFERENCES Parts
)
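To illustrate why roles on a single entity set often lead to recursive queries, here is a sketch that lists every direct and indirect subpart of one assembly. It assumes a dialect that supports WITH RECURSIVE (for example PostgreSQL or SQLite); the sample parts are made up for illustration, and Parts is cut down to the two columns shown on the slide.

```sql
CREATE TABLE Parts (
    id   INTEGER,
    name CHAR(15),
    PRIMARY KEY (id)
);

CREATE TABLE Includes (
    assembly INTEGER,
    subpart  INTEGER,
    qty      INTEGER,
    PRIMARY KEY (assembly, subpart),
    FOREIGN KEY (assembly) REFERENCES Parts (id),
    FOREIGN KEY (subpart)  REFERENCES Parts (id)
);

-- Hypothetical data: a bicycle includes wheels, a wheel includes spokes.
INSERT INTO Parts VALUES (1, 'bicycle'), (2, 'wheel'), (3, 'spoke');
INSERT INTO Includes VALUES (1, 2, 2), (2, 3, 32);

-- Start from the direct subparts of part 1, then repeatedly follow
-- Includes from each part found so far. UNION removes duplicates,
-- which also keeps the recursion from looping on repeated parts.
WITH RECURSIVE AllSubparts (id) AS (
    SELECT subpart FROM Includes WHERE assembly = 1
    UNION
    SELECT i.subpart
    FROM Includes i
    JOIN AllSubparts a ON i.assembly = a.id
)
SELECT p.id, p.name
FROM AllSubparts a
JOIN Parts p ON p.id = a.id;
```

A plain join over Includes only reaches a fixed depth; the recursive form follows the Assembly/Subpart chain to whatever depth the data has.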

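Roles vs. Separate Entities

[Two ER diagrams. First: separate entity sets Husband (id, name) and Wife (id, name) connected by the relationship set Married. Second: a single entity set Person (id, name) connected twice to Married, with edges labeled by the roles Husband and Wife.]

What is the difference between these two representations?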

ISA Relationships: Subclassing (Structurally)

Inheritance states that one entity is a “special kind” of another entity: “subclass” should be member of “base class”

[ER diagram: entity set People (id, name); entity set Employees (salary) connected to People through an ISA relationship.]

But How Does this Translate into the Relational Model?

Compare these options:

• Two tables, disjoint tuples

• Two tables, disjoint attributes

• One table with NULLs

• Object-relational databases
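As one concrete point of comparison, the “two tables, disjoint attributes” option could be sketched as follows for People and Employees. Column types and the ON DELETE CASCADE choice are assumptions, not part of the slides.

```sql
CREATE TABLE People (
    id   INTEGER,
    name VARCHAR(40),
    PRIMARY KEY (id)
);

-- Employees stores only the subclass-specific attribute. Its key is also
-- a foreign key into People, so every employee is a member of the base class.
CREATE TABLE Employees (
    id     INTEGER,
    salary REAL,
    PRIMARY KEY (id),
    FOREIGN KEY (id) REFERENCES People (id) ON DELETE CASCADE
);

-- Full employee records require a join.
SELECT p.id, p.name, e.salary
FROM People p
JOIN Employees e ON e.id = p.id;
```

The trade-off against “one table with NULLs” is a join on every employee lookup in exchange for no NULL salary values on non-employees.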

Weak Entities

A weak entity can only be identified uniquely using the primary key of another (owner) entity.

• Owner and weak entity sets in a one-to-many relationship set, 1 owner : many weak entities

• Weak entity set must have total participation

[ER diagram: owner entity set People (ssn, name), identifying relationship set Feeds (weeklyCost), weak entity set Pets (name, species).]

Translating Weak Entity Sets

Weak entity set and identifying relationship set are translated into a single table; when the owner entity is deleted, all owned weak entities must also be deleted

CREATE TABLE Feed_Pets (
    pname VARCHAR(20),
    species INTEGER,
    weeklyCost REAL,
    ssn CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES People
        ON DELETE CASCADE
);

N-ary Relationships

Relationship sets can relate an arbitrary number of entity sets:

[ER diagram: entity sets Student, Project, and Advisor all connected to the single relationship set IndepStudy.]
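A ternary relationship set translates the same way as a binary one, with one foreign key per participating entity set. The sketch below is illustrative only: the key attributes sid, pid, and aid are hypothetical (the slide shows no attributes), and it assumes no cardinality constraints, so the key is the full combination.

```sql
-- Hypothetical minimal entity tables; real ones would carry more attributes.
CREATE TABLE Student (sid INTEGER PRIMARY KEY);
CREATE TABLE Project (pid INTEGER PRIMARY KEY);
CREATE TABLE Advisor (aid INTEGER PRIMARY KEY);

-- One row per (student, project, advisor) combination.
CREATE TABLE IndepStudy (
    sid INTEGER,
    pid INTEGER,
    aid INTEGER,
    PRIMARY KEY (sid, pid, aid),
    FOREIGN KEY (sid) REFERENCES Student (sid),
    FOREIGN KEY (pid) REFERENCES Project (pid),
    FOREIGN KEY (aid) REFERENCES Advisor (aid)
);
```

If, say, each student could have only one advisor per project, the key would shrink to (sid, pid).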

Summary of ER Diagrams

One of the primary ways of designing logical schemas

CASE tools exist built around ER (e.g. ERWin, PowerBuilder, etc.)

• Translate the design automatically into DDL, XML, UML, etc.

• Use a slightly different notation that is better suited to graphical displays

• Some tools support constraints beyond what ER diagrams can capture

Can you get different ER diagrams from the same data?

Schema Refinement & Design Theory

ER Diagrams give us a start in logical schema design

Sometimes need to refine our designs further

• There’s a system and theory for this

• Focus is on redundancy of data

Let’s briefly touch on one key concept in preparation for Monday’s lecture on normalization…

Not All Designs are Equally Good

Why is this a poor schema design?

Stuff(sid, name, cid, subj, exp-grade)

And why is this one better?

Student(sid, name)

Course(cid, subj)

Takes(sid, cid, exp-grade)
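One way to see that the three-table design loses nothing: the single wide relation can be recomputed from it as a join. The sketch below assumes integer ids, guessed column types, and exp_grade in place of exp-grade (a hyphen is not valid in an unquoted SQL identifier).

```sql
CREATE TABLE Student (sid INTEGER PRIMARY KEY, name VARCHAR(15));
CREATE TABLE Course  (cid INTEGER PRIMARY KEY, subj VARCHAR(30));

CREATE TABLE Takes (
    sid       INTEGER,
    cid       INTEGER,
    exp_grade CHAR(1),
    PRIMARY KEY (sid, cid),
    FOREIGN KEY (sid) REFERENCES Student (sid),
    FOREIGN KEY (cid) REFERENCES Course (cid)
);

-- Stuff is derivable, so it does not need to be stored.
CREATE VIEW Stuff AS
SELECT s.sid, s.name, c.cid, c.subj, t.exp_grade
FROM Takes t
JOIN Student s ON s.sid = t.sid
JOIN Course  c ON c.cid = t.cid;
```

The converse does not hold: a course with no enrolled students can be stored in Course, but it has no row in Stuff.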

Focus on the Bad Design

• Certain items (e.g., name) get repeated

• Some information requires that a student be enrolled (e.g., courses) due to the key

sid   name    cid   subj   exp-grade
1     Sam     570   AI     B
23    Nitin   550   DB     A
45    Jill    505   OS     A
1     Sam     505   OS     C

Functional Dependencies Describe “Key-Like” Relationships

A key is a set of attributes where: if keys match, then the tuples match

A functional dependency (FD) is a generalization: if an attribute set determines another, written A → B, then if two tuples agree on A, they must agree on B:

sid → Address

What other FDs are there in this data?

FDs are independent of our schema design choice

Formal Definition of FD’s

Def. Given a relation scheme R (a set of attributes) and subsets X, Y of R:

An instance r of R satisfies FD X → Y if, for any two tuples t1, t2 ∈ r, t1[X] = t2[X] implies t1[Y] = t2[Y]

For an FD to hold for scheme R, it must hold for every possible instance r of R

(Can a DBMS verify this? Can we determine this by looking at an instance?)
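As a sketch of what checking a single instance looks like, the queries below search the sample Stuff instance from the earlier slide for violations of a candidate FD. Column types are assumptions, and exp_grade stands in for exp-grade.

```sql
CREATE TABLE Stuff (
    sid       INTEGER,
    name      VARCHAR(15),
    cid       INTEGER,
    subj      VARCHAR(30),
    exp_grade CHAR(1),
    PRIMARY KEY (sid, cid)
);

INSERT INTO Stuff VALUES
    (1,  'Sam',   570, 'AI', 'B'),
    (23, 'Nitin', 550, 'DB', 'A'),
    (45, 'Jill',  505, 'OS', 'A'),
    (1,  'Sam',   505, 'OS', 'C');

-- Candidate FD sid → name: list sid values that occur with more than one name.
-- No rows come back for this instance, so the instance satisfies the FD.
SELECT sid
FROM Stuff
GROUP BY sid
HAVING COUNT(DISTINCT name) > 1;

-- Candidate FD sid → cid: sid 1 occurs with cid 570 and cid 505,
-- so this query returns sid 1 and the FD is refuted.
SELECT sid
FROM Stuff
GROUP BY sid
HAVING COUNT(DISTINCT cid) > 1;
```

A non-empty result is a counterexample and settles the question; an empty result only says that this particular instance satisfies the FD, not that the FD holds for the scheme.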

General Thoughts on Good Schemas

We want all attributes in every tuple to be determined by the tuple’s key attributes

• What does this say about redundancy?

But:

• What about tuples that don’t have keys (other than the entire value)?

• What about the fact that every attribute determines itself?

Stay tuned for Monday!