relational data model ch. 7.1 – 7.3

27
Relational Data Model Ch. 7.1 – 7.3 John Ortiz

Upload: nanji

Post on 14-Jan-2016

38 views

Category:

Documents


0 download

DESCRIPTION

Relational Data Model Ch. 7.1 – 7.3. John Ortiz. Why Study Relational Model?. Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. “Legacy systems” in older models E.G., IBM’s IMS (hierarchical model) Recent competitor: object-oriented model - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Relational Data Model Ch. 7.1 – 7.3

Relational Data ModelCh. 7.1 – 7.3

John Ortiz

Page 2: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 2

Why Study Relational Model?

Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle,

Sybase, etc. “Legacy systems” in older models

E.G., IBM’s IMS (hierarchical model) Recent competitor: object-oriented model

ObjectStore, Versant, Ontos, O2 A synthesis emerging: object-relational

model Informix UDS, UniSQL, Oracle, DB2

Page 3: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 3

Anatomy of a Relation Each relation is a table with a name. An attribute is a column heading. The heading is the schema of the relation

Students(SSN, Name, Age, GPA)

SID Name Age GPA 1002 John Smith 20 3.2 1005 Mary Day 18 2.9 1020 Bill Lee 19 2.7

StudentsRelation name

Attribute name

Column

Tuple

Page 4: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 4

Domain of an Attribute

The domain of an attribute A, denoted by Dom(A), is the set of values that the attribute can take. A domain is usually represented by a

type. E.g., SID char(4) Name varchar(30) --- character

string of variable length up to 30 Age number --- a number

Page 5: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 5

Tuples

A tuple of a relation is a row in its table. If t is a tuple of a relation R and A is an

attribute of R, we use t[A] to represent the value of t under A in R.

Example: If t is the second tuple in Students t[Name] = ‘Mary Day’

t[Age] = 18,

t[Name, Age] = (‘Mary Day’, 18)

Page 6: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 6

Schema and Instance

A relation schema, denoted by R(A1, A2, …, An), consists of the relation name R and a list of attributes A1, …, An. R.A denotes attribute A of R. # of attributes = degree

A relation instance (state) of a relation schema R(A1, …, An), denoted by r(R), is a set of tuples in the table of R at some instance of time. # of tuples = cardinality

Page 7: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 7

Schema & Instance Update

The schema of a relation may change (e.g., adding, deleting, renaming attributes and deleting a table schema) but it is infrequent

The state of a relation may also change (e.g., inserting or deleting a tuple, and changing an attribute value in a tuple) & it is frequent

A schema may have different states at different times.

Page 8: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 8

Relational Database

A relational database schema is a set of relation schemas S={R1, …, Rm}.

A relational database is a set of relations DB(S)={r(R1), …, r(Rm)}.

A database state is a set of relation instances at some instance of time.

In addition, a relational database must satisfy a number of constraints (more to come later).

Page 9: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 9

A University Databasesee p. 204, Fig. 7.5

SID Name Major GPA 1002 J. Smith CS 3.2 1005 M. Day Math 2.9 1020 B. Lee EE 2.7

StudentsCno Sno Semester Prof CS374 001 F2000 Zhang CS455 002 S2000 Smith Math210 001 F1999 Brown

Sections

Cno Name Hour Dept CS374 Database 3 CS CS455 Network 3 CS CS100 Prog.Lang 4 CS Math210 Calculus 3 Math

Courses

Name Room Chair CS SB220 Hansen EE EB318 Johnson Math AB119 Miller

Departments

Page 10: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 10

Constraints of Relational DB

Relations must satisfy the following constraints. Domain (1NF) Constraint. Access-by-Content Constraint. Key (Unique Tuple) Constraint. Entity Integrity Constraint. Referential Integrity Constraint.

Integrity constraints are enforced by the RDBMS.

Page 11: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 11

Domain Constraint

Also known as the First Normal Form (1NF): Attributes can only take atomic values (I.e., set values are not allowed).

How to handle multivalued attributes? Use multiple tuples, one per value Use multiple columns, one per value Use separate tablesWhat problems does these solutions

have?

Page 12: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 12

Handle Multi-Valued Attributes

EID Name Age Dependents 1234 Bob 34 {Allen, Ann} 1357 Mary 23 {Kathy} 2468 Peter 54 {Mike, Sue, David}

Employees

EID Name Age Dependents 1234 Bob 34 Allen 1234 Bob 34 Ann 1357 Mary 23 Kathy 2468 Peter 54 Mike 2468 Peter 54 Sue 2468 Peter 54 David

Employees

Multiple Values:

Use Multiple Tuples:

Page 13: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 13

Handle Multi-Valued Attributes

EID Name Age Dep1 Dep2 Dep3 1234 Bob 34 Allen Ann 1357 Mary 23 Kathy 2468 Peter 54 Mike Sue David

Employees

EID Name Age 1234 Bob 34 1357 Mary 23 2468 Peter 54

EmployeesEID Name 1234 Allen 1234 Ann 1357 Kathy 2468 Mike 2468 Sue 2468 David

Dependents

Use Multiple Columns:

Use SeparateRelations:

Page 14: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 15

Access-by-Content Constraint

A tuple is retrieved only by values of its attributes, i.e., the order of tuples is not important.

This is because a relation is a set of tuples.

Although the order of tuples is insignificant for query formulation, it is significant for query evaluation.

Page 15: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 16

Superkey

A superkey of a relation is a set of attributes whose values uniquely identify the tuples of the relation.

Every relation has at least one superkey (default is all attributes together?).

Any superset of a superkey is a superkey. From a state of a relation, we may

determine that a set of attributes is not a superkey, but can not determine that a set of attributes is a superkey.

Page 16: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 17

Superkey Example Find all superkeys of the Students relation.

SID Name Major GPA 1002 J. Smith CS 3.2 1005 M. Day Math 2.9 1020 B. Lee EE 2.7

Students

A B C D A1 B2 C1 D2 A2 B2 C3 D2 A2 B1 C2 D1 A3 B3 C4 D1

R

With the only state of R, is A a superkey? What about {A, B}?

Page 17: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 18

Candidate Key

A candidate key of a relation is a set of attributes of the relation such that it is a superkey, and none of its proper subsets is a superkey.

Find all candidate keys in Students relation. Is it true that every relation has at least

candidate key? Why? Can candidate keys be found from a state? If AB is a candidate key of a relation, can A

also be a candidate key? What is ABC called?

Page 18: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 19

Primary Key

A primary key of a relation is a candidate key designated (with an underline) by a database designer. Often chosen at the time of schema

design, & once specified to DBMS, it cannot be changed.

Better be the smallest candidate key for improvement of both storage and query processing efficiencies.

What should be the primary key of Students?

Page 19: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 20

Key Constraint

Every relation must have a primary key. Why is key constraint needed?

Every tuple has a different primary key value.

Only the primary key values need to be checked for identifying duplicate when new tuples are inserted (index is often used).

Primary key values can be referenced from within other relations

Page 20: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 21

Entity Integrity Constraint

A null value is a special value that is unknown, yet to be assigned, or inapplicable.

Entity Integrity Constraint: No primary key values can be null.

Why?

Page 21: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 22

Foreign Key

A foreign key in relation R1 referencing relation R2 is a set of attributes FK of R1, such that, FK is compatible with a candidate (or

primary) key PK of R2 (with same number of attributes and compatible domains); and

for every tuple t1 in R1, either there exists a tuple t2 in R2 such that t1[FK] = t2[PK] or t1[FK] = null.

Foreign keys need to be explicitly defined.

Page 22: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 23

Foreign Key Example

EID Name Age DName 1234 Bob 34 Sales 1357 Mary 23 Service 2468 Peter 54 null

EmployeesName City Manager Sales Huston Bill Payroll Dallas Steve Service Chicago Tom

Departments

DName of Employees is a foreign key referencing Name of Departments

A foreign key may reference its own relation. Employee(EID, Name, Age, Dept, ManegerID)

Page 23: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 24

Referential Integrity Constraint

Referential Integrity Constraint: No relation can contain unmatched foreign key values.

Using foreign keys in a relation to reference primary keys of other relations is the only way in the relational data model to establish relationships among different relations.

Page 24: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 25

Update Operations Insert

Can violate any of the 4 previous constraints – what were they again?

1 solution: reject the insert Delete

Can only violate referential integrity – why?

3 solutions: reject deletion, propagate deletion, modify referencing attributes

Modify Can violate any of the 4 previous

constraints

Page 25: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 26

Relational Model: Summary

A tabular representation of data. Simple and intuitive, currently the most

widely used. Integrity constraints can be specified by the

DBA, based on application semantics. DBMS checks for violations. Two important ICs for primary and

foreign keys In addition, we always have domain

constraints.

Page 26: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 27

Relational Model: Summary

ICs are based upon the semantics of the real-world enterprise that is being described in the database relations.

We can check a database instance to see if an IC is violated, but we can never infer that an IC is true by looking at an instance.

Powerful and natural query languages exist.

Guidelines to translate ER to relational model (next class…)

Page 27: Relational Data Model Ch. 7.1 – 7.3

Lecture 3 Relational Data Model 28

Look Ahead Next topic: Relational Algebra Read Textbook:

Chapter 7.4 – 7.6