

Post on 17-Aug-2020


Page 1: Relational Database Model III. Introduction to the In 1970, E. F. …cs.furman.edu/~ktreu/csc341/lectures/chapter03.pdf · 2019. 9. 20. · 1 III. Introduction to the Relational Database

1

III. Introduction to the Relational Database Model

2

Relational Database Model

In 1970, E. F. Codd published “A Relational Model of Data for Large Shared Data Banks” in CACM.
In the early 1980s, commercially viable relational database management systems became available.

3

Relational Database Model

While the relational database was very appealing in concept in the 1970s, it was not easily applicable in a real-world environment for reasons related to performance.
The earlier hierarchical and network database management systems were just coming onto the commercial scene and were the focus of intense marketing efforts by the software and hardware vendors.

4

The Relational Database Concept

Data appears to be stored in what we have been referring to as simple, linear files.
Relational databases are based on mathematics.
  Relational algebra
A relational database is a collection of relations that, as a group, contain the data that describes a particular business environment.


5

Relational Terminology

Relation = logical representation of an underlying file. Also called a table.
Row = record (files) = tuple (relation)
Column = field (files) = attribute (relation)
Relation schema names a relation and lists its attributes (including keys and other properties)

Dependencies

Functional dependence: Value of one or more attributes determines the value of one or more other attributes
  Determinant: Attribute whose value determines another
  Dependent: Attribute whose value is determined by the other attribute
Full functional dependence: Entire collection of attributes in the determinant is necessary for the relationship
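The definition of functional dependence can be tested against sample rows. The sketch below is illustrative only: the helper name `holds` and the short attribute names are assumptions, and the rows come from the SALESPERSON relation used later in these slides. A check like this can refute a dependency for a given set of rows, but it cannot prove that the dependency holds for every possible instance of the relation.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The first time a determinant value appears, remember its dependent value;
        # any later row with the same determinant must agree with it.
        if seen.setdefault(key, value) != value:
            return False
    return True


salespersons = [
    {"number": 137, "name": "Baker", "commission": 10, "year": 1995},
    {"number": 186, "name": "Adams", "commission": 15, "year": 2001},
    {"number": 204, "name": "Dickens", "commission": 10, "year": 1998},
    {"number": 361, "name": "Carlyle", "commission": 20, "year": 2001},
]

# Salesperson Number determines Salesperson Name in these rows.
number_determines_name = holds(salespersons, ["number"], ["name"])
# Commission Percentage does not: 10 appears with both Baker and Dickens.
commission_determines_name = holds(salespersons, ["commission"], ["name"])
```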

6

7

Primary Key

A relation always has a unique primary key
A primary key (also called “the key”) is an attribute or a group of attributes whose values are unique throughout all of the rows of the relation
  All attributes are functionally dependent on the key
Used to:
  Ensure that each row in a table is uniquely identifiable
  Establish relationships among tables and ensure the integrity of the data

8

Primary Key

Specified in the relation schema
The number of attributes involved in the primary key is always the minimum number of attributes that provide the uniqueness quality
In the worst case, all of the relation’s attributes combined could serve as the primary key


9

Candidate Key

If a relation has more than one attribute or minimum group of attributes that represents a way of uniquely identifying the entities, then they are each called a candidate key
When there is more than one candidate key, one of them must be chosen to be the primary key of the relation
Alternate key is a candidate key that was not chosen to be the primary key of the relation
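As a rough illustration of "minimum group of attributes that is unique", the following sketch enumerates attribute combinations over sample rows and keeps only the minimal unique ones. The function names and short attribute names are assumptions made for this example, and the rows are taken from the OFFICE relation shown later in the slides. Uniqueness in sample data only suggests a candidate key; whether an attribute is truly a key is a design decision about all possible rows.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows share the same values for `attrs`."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows, attributes):
    """Return the minimal attribute groups that are unique in `rows`."""
    keys = []
    # Try smaller groups first so that supersets of a found key can be skipped.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_unique(rows, attrs):
                keys.append(attrs)
    return keys


offices = [
    {"office": 1253, "telephone": "901-555-4276", "size": 120},
    {"office": 1227, "telephone": "901-555-0364", "size": 100},
    {"office": 1284, "telephone": "901-555-7335", "size": 120},
    {"office": 1209, "telephone": "901-555-3108", "size": 95},
]

keys = candidate_keys(offices, ["office", "telephone", "size"])
```

In this sample, Office Number and Telephone are each unique on their own, so either could be chosen as the primary key and the other would be an alternate key.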

10

Foreign Key

An attribute or group of attributes that serves as the primary key of one relation and also appears in another relation (foreign key in this relation).

11

Foreign Key

An attribute or group of attributes that serves as the primary key of one relation and also appears in another relation (foreign key in the second relation).
Crucial in a relational database, because the foreign key is the mechanism that ties relations together to represent unary, binary, and ternary relationships.
A foreign key attribute must have the same domain of values (i.e., type) as the primary key attribute in the other relation.

Keys and Integrity Constraints

Entity integrity: Condition in which each row in the table has its own unique identity
  All of the values in the primary key must be unique
  No key attribute in the primary key can contain a null
Referential integrity: Every reference to an entity instance by another entity instance is valid

12


Integrity Rules

13

Entity Integrity: Description

Requirement: All primary key entries are unique, and no part of a primary key may be null
Purpose: Each row will have a unique identity, and foreign key values can properly reference primary key values
Example: No invoice can have a duplicate number, nor can it be null

Integrity Rules

14

Referential Integrity: Description

Requirement: A foreign key may have either a null entry or an entry that matches a primary key value in a table to which it is related
Purpose: It is possible for an attribute not to have a corresponding value, but it is impossible to have an invalid entry
  It is impossible to delete a row in a table whose primary key has mandatory matching foreign key values in another table
Example: It is impossible to have an invalid sales representative number
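The two integrity rules can be expressed as simple checks over rows. This is a minimal sketch, not how a DBMS enforces constraints: the function names are assumptions, `None` stands in for null, and the customer "9999" with salesperson 999 is a hypothetical row added to show a violation.

```python
def entity_integrity_ok(rows, key):
    """True if the primary key `key` is never null and never repeated."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in key)
        if any(v is None for v in value) or value in seen:
            return False
        seen.add(value)
    return True


def dangling_references(child_rows, fk, parent_rows, pk):
    """Return child rows whose non-null foreign key matches no parent key."""
    valid = {row[pk] for row in parent_rows}
    return [row for row in child_rows
            if row[fk] is not None and row[fk] not in valid]


salespersons = [
    {"number": 137, "name": "Baker"},
    {"number": 186, "name": "Adams"},
]
customers = [
    {"customer": "1826", "salesperson": 137},   # valid reference
    {"customer": "0839", "salesperson": None},  # null is allowed
    {"customer": "9999", "salesperson": 999},   # hypothetical invalid entry
]

violations = dangling_references(customers, "salesperson", salespersons, "number")
```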

15

Relational Database Model

Relations used to implement building blocks from the data model:
  Entity sets
    Including dependent entity sets
  Attributes
    Including composite, multi-valued, derived
  Relationships
    Binary, unary, ternary
    1-1, 1-M, M-M
  Constraints
  Specialization/generalization hierarchies

Example of a Simple Relational Database

16


17

Binary Relationships One-to-One

One-to-Many

Many-to-Many

18

One-to-Many Binary Relationship

Salesperson Customer

The Salesperson Number foreign key in the CUSTOMER relation effectively establishes the one-to-many relationship between salespersons and customers.

19

General Hardware Co.

Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire
137                 Baker             10                     1995
186                 Adams             15                     2001
204                 Dickens           10                     1998
361                 Carlyle           20                     2001

SALESPERSON relation.

Customer Number  Customer Name        Salesperson Number  HQ City
0121             Main St. Hardware    137                 New York
0839             Jane’s Stores        186                 Chicago
0933             ABC Home Stores      137                 Los Angeles
1047             Acme Hardware Store  137                 Los Angeles
1525             Fred’s Tool Stores   361                 Atlanta
1700             XYZ Stores           361                 Washington
1826             City Hardware        137                 New York
2198             Western Hardware     204                 New York
2267             Central Stores       186                 New York

CUSTOMER relation.

20

One-to-One Binary Relationship


21

General Hardware Co. including OFFICE

Office Number  Telephone     Size (sq. ft.)
1253           901-555-4276  120
1227           901-555-0364  100
1284           901-555-7335  120
1209           901-555-3108  95

OFFICE relation.

Can SALESPERSON and OFFICE be combined into one relation?

Could the foreign key be in OFFICE instead?

Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire  Office Number
137                 Baker             10                     1995          1284
186                 Adams             15                     2001          1253
204                 Dickens           10                     1998          1209
361                 Carlyle           20                     2001          1227

SALESPERSON relation.

22

Many-to-Many Relationship

23

Unacceptable: Many-to-Many

Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire  Product and Quantity Pairs
137                 Baker             10                     1995          (19440, 473) (24013, 170) (26722, 688)
186                 Adams             15                     2001          (16386, 1745) (19440, 2529) (21765, 1962) (24013, 3071)
204                 Dickens           10                     1998          (21765, 809) (26722, 734)
361                 Carlyle           20                     2001          (16386, 3729) (21765, 3110) (26722, 2738)

24

Many-to-Many Relationship

Has its own relation in the database
Can have its own attributes
  Intersection data
It is a kind of entity: an associative entity (or composite or bridge)


25

SALES Relation (without intersection data)

Salesperson Number  Product Number
137                 19440
137                 24013
137                 26722
186                 16386
186                 19440
186                 21765
186                 24013
204                 21765
204                 26722
361                 16386
361                 21765
361                 26722

26

Many-to-Many Relationship

Salesperson Number  Product Number  Quantity
137                 19440           473
137                 24013           170
137                 26722           688
186                 16386           1745
186                 19440           2529
186                 21765           1962
186                 24013           3071
204                 21765           809
204                 26722           734
361                 16386           3729
361                 21765           3110
361                 26722           2738

SALES relation.

Product Number  Product Name  Unit Price
16386           Wrench        12.95
19440           Hammer        17.50
21765           Drill         32.99
24013           Saw           26.25
26722           Pliers        11.50

PRODUCT relation.

Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire  Office Number
137                 Baker             10                     1995          1284
186                 Adams             15                     2001          1253
204                 Dickens           10                     1998          1209
361                 Carlyle           20                     2001          1227

SALESPERSON relation.
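To make the role of the associative SALES relation concrete, the sketch below walks the many-to-many relationship in both directions using the rows shown above. The tuple layout and the helper names `products_sold_by` and `sellers_of` are assumptions chosen for illustration.

```python
# SALES rows as (salesperson number, product number, quantity).
# Quantity is the intersection data: it belongs to the pair, not to either side.
sales = [
    (137, 19440, 473), (137, 24013, 170), (137, 26722, 688),
    (186, 16386, 1745), (186, 19440, 2529), (186, 21765, 1962), (186, 24013, 3071),
    (204, 21765, 809), (204, 26722, 734),
    (361, 16386, 3729), (361, 21765, 3110), (361, 26722, 2738),
]

# PRODUCT rows reduced to product number -> product name.
products = {16386: "Wrench", 19440: "Hammer", 21765: "Drill",
            24013: "Saw", 26722: "Pliers"}


def products_sold_by(salesperson):
    """One salesperson relates to many products."""
    return [products[p] for s, p, _ in sales if s == salesperson]


def sellers_of(product):
    """One product relates to many salespersons."""
    return [s for s, p, _ in sales if p == product]


baker_products = products_sold_by(137)
drill_sellers = sellers_of(21765)
```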

27

General Hardware Co. including OFFICE

Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire  Office Number
137                 Baker             10                     1995          1284
186                 Adams             15                     2001          1253
204                 Dickens           10                     1998          1209
361                 Carlyle           20                     2001          1227

(a) SALESPERSON relation.

Customer Number  Customer Name        Salesperson Number  HQ City
0121             Main St. Hardware    137                 New York
0839             Jane’s Stores        186                 Chicago
0933             ABC Home Stores      137                 Los Angeles
1047             Acme Hardware Store  137                 Los Angeles
1525             Fred’s Tool Stores   361                 Atlanta
1700             XYZ Stores           361                 Washington
1826             City Hardware        137                 New York
2198             Western Hardware     204                 New York
2267             Central Stores       186                 New York

(b) CUSTOMER relation.

Customer Number  Employee Number  Employee Name  Title
0121             27498            Smith          Co-Owner
0121             30441            Garcia         Co-Owner
0933             25270            Chen           VP Sales
0933             30441            Levy           Sales Manager
0933             48285            Morton         President
1525             33779            Baker          Sales Manager
2198             27470            Smith          President
2198             30441            Jones          VP Sales
2198             33779            Garcia         VP Personnel
2198             35268            Kaplan         Senior Accountant

(c) CUSTOMER EMPLOYEE relation.

Product Number  Product Name  Unit Price
16386           Wrench        12.95
19440           Hammer        17.50
21765           Drill         32.99
24013           Saw           26.25
26722           Pliers        11.50

(d) PRODUCT relation.

28

Data Retrieval from a Relational Database

Relational DBMS
  Have the ability to accept high-level data retrieval commands
  Process the commands against the database’s relations and return the desired data.
Based on relational algebra


29

Relational Algebra

The formal basis for operations on the relational data model
  Underlying basis for both SQL and QBE
An “algebra” is a collection of operations on some domain
Relational algebra is a collection of operators
  Operands and results are relations (closure)
  Significant operators
    Projection and selection remove parts of a relation
    Set operators: union, intersection, and difference
    Joins and products combine the tuples of two relations

30

Select Operator

Retrieves a horizontal slice of the relation.
The result of a relational operation will always be a relation.
  σ applied to R produces a subset of R
Applies a selection criterion to a relation
  Select all rows from the SALESPERSON relation in which Salesperson Number = 204.
  Select all rows from the MOVIE relation that are longer than 100 minutes and whose genre is ‘action comedy’
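The Select operator can be sketched as a filter over rows. In this illustrative example a relation is modeled as a list of dictionaries; the function name `select` and the short attribute names are assumptions, and the rows come from the SALESPERSON relation.

```python
def select(relation, predicate):
    """Return the rows of `relation` that satisfy `predicate` (a horizontal slice)."""
    return [row for row in relation if predicate(row)]


salespersons = [
    {"number": 137, "name": "Baker", "commission": 10, "year": 1995},
    {"number": 186, "name": "Adams", "commission": 15, "year": 2001},
    {"number": 204, "name": "Dickens", "commission": 10, "year": 1998},
    {"number": 361, "name": "Carlyle", "commission": 20, "year": 2001},
]

# Select all rows in which Salesperson Number = 204.
result = select(salespersons, lambda row: row["number"] == 204)

# A compound criterion works the same way: hired in 2001 with commission above 15.
hired_2001 = select(salespersons,
                    lambda row: row["year"] == 2001 and row["commission"] > 15)
```

The result keeps every column and is itself a list of rows, so it can be fed to another operator.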

Select

31 32

Project Operator

Retrieves a vertical slice of the relation.
Used to produce a relation with fewer attributes (columns)
  Called projection because it reduces the dimension of the relation
  e.g., removing a column from a 3-dimensional relation produces a 2-dimensional relation
Examples
  List (project) the Title of every Movie
  List the Salesperson Number and Salesperson Name over the SALESPERSON relation.
May have fewer rows than the original relation, because duplicate rows in the result are eliminated
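A minimal sketch of the Project operator follows, again modeling a relation as a list of dictionaries. The function name `project` and the short attribute names are assumptions. The second call shows why a projection can have fewer rows than the original relation.

```python
def project(relation, attributes):
    """Keep only `attributes` from each row and drop duplicate result rows."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:  # a relation holds no duplicate tuples
            result.append(projected)
    return result


salespersons = [
    {"number": 137, "name": "Baker", "commission": 10, "year": 1995},
    {"number": 186, "name": "Adams", "commission": 15, "year": 2001},
    {"number": 204, "name": "Dickens", "commission": 10, "year": 1998},
    {"number": 361, "name": "Carlyle", "commission": 20, "year": 2001},
]

# Salesperson Number and Salesperson Name: still four rows, because the key is kept.
numbers_and_names = project(salespersons, ["number", "name"])

# Commission Percentage alone: 10 occurs twice, so only three rows remain.
commissions = project(salespersons, ["commission"])
```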


Project

33 34

Set Operations

Union: R ∪ S = { t | t ∈ R or t ∈ S }
  R and S must have the same number and types of attributes
  No duplicate tuples
Consider the union of tables
  Rental (accountId, videoId, dateRented, dateDue, cost)
  PreviousRental (accountId, videoId, dateRented, dateReturned, cost)
  Rental ∪ PreviousRental
Intersection and Difference are similarly based on normal set operations
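Because relations are sets of tuples, the set operators map directly onto Python sets. The sketch below is illustrative: the tuple values are made up for the example (the slides give only the schemas), and the `union` helper with its attribute-count check is an assumption about how union compatibility might be verified.

```python
def union(r, s):
    """Union of two relations whose tuples have the same number of attributes."""
    if r and s and len(next(iter(r))) != len(next(iter(s))):
        raise ValueError("relations are not union-compatible")
    return r | s  # a set keeps one copy of any tuple found in both


# Tuples follow (accountId, videoId, dateRented, dateDue or dateReturned, cost).
# All values are hypothetical sample data.
rental = {
    (101, 5, "2020-01-02", "2020-01-05", 3.0),
    (102, 7, "2020-01-03", "2020-01-06", 2.5),
}
previous_rental = {
    (102, 7, "2020-01-03", "2020-01-06", 2.5),  # same tuple as in rental
    (103, 9, "2019-12-20", "2019-12-23", 3.0),
}

all_rentals = union(rental, previous_rental)  # Rental ∪ PreviousRental
in_both = rental & previous_rental            # intersection
only_current = rental - previous_rental       # difference
```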

Union

35

Intersect

36


Difference

37 38

Cartesian Product

Cartesian product of two relations R and S creates a new relation with
  one attribute for each attribute in R and one for each attribute in S
  one tuple for each combination of a tuple in R and a tuple in S
  |R| * |S| tuples: if R has m tuples and S has n tuples, R × S has m*n tuples
Just like the Cartesian product of any two sets
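The following sketch builds a Cartesian product of two small relations and confirms the m*n row count. The function name and the relation prefixes "S" and "C" are assumptions; prefixing is one simple way to keep one attribute per source attribute when both relations use the same attribute names.

```python
def cartesian_product(r, r_name, s, s_name):
    """Pair every row of `r` with every row of `s`, prefixing attribute names."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a in r
        for b in s
    ]


salespersons = [
    {"number": 137, "name": "Baker"},
    {"number": 186, "name": "Adams"},
]
customers = [
    {"number": "1826", "name": "City Hardware"},
    {"number": "0839", "name": "Jane's Stores"},
    {"number": "2198", "name": "Western Hardware"},
]

pairs = cartesian_product(salespersons, "S", customers, "C")
```

With 2 salespersons and 3 customers the product has 6 rows, each with 2 + 2 = 4 attributes.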

Product

39 40

Data Integration

The relational algebra Join operation
  Based on Cartesian product
  R join S on A has those tuples of R × S where R.A = S.A
  Each tuple from R is joined to all tuples of S that have the same value for attribute A
Example
  Every combination of Salesperson and Customer where the Salesperson Number fields match
  How would I find the name of the salesperson for City Hardware?
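One way to answer the question above is to join SALESPERSON and CUSTOMER on Salesperson Number and then look at the City Hardware row. This is a sketch under simple assumptions: the `join` helper and short attribute names are hypothetical, only a few rows of each relation are included, and the join attribute is the only attribute name the two relations share.

```python
def join(r, s, attr):
    """Rows of R x S where R.attr = S.attr, with the shared attribute kept once."""
    return [{**a, **b} for a in r for b in s if a[attr] == b[attr]]


salespersons = [
    {"number": 137, "name": "Baker"},
    {"number": 186, "name": "Adams"},
    {"number": 204, "name": "Dickens"},
    {"number": 361, "name": "Carlyle"},
]
customers = [
    {"customer_number": "1826", "customer_name": "City Hardware", "number": 137},
    {"customer_number": "0839", "customer_name": "Jane's Stores", "number": 186},
    {"customer_number": "2198", "customer_name": "Western Hardware", "number": 204},
]

joined = join(salespersons, customers, "number")

# The salesperson for City Hardware is found in the matching joined row.
city_hardware_salesperson = [row["name"] for row in joined
                             if row["customer_name"] == "City Hardware"]
```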


41

Terminology

Cartesian Product: forms every possible combination of the elements of two sets, or of the tuples of two relations
Natural join: Links tables by selecting only the rows with common values in their common attributes
  Join columns: the common columns
  Computed as a Product, then a Select, then a Project
Equijoin: Links tables on the basis of an equality condition that compares specified columns of each table
Theta join: uses a comparison condition other than equality

Natural JOIN

42

Types of Joins

Inner join: Only returns matched records from the tables that are being joined
Outer join: Matched pairs are retained and unmatched values in the other table are left null
  Left outer join: Yields all of the rows in the first table, including those that do not have a matching value in the second table
  Right outer join: Yields all of the rows in the second table, including those that do not have matching values in the first table
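The difference between the outer joins can be seen on a small example. In this sketch `None` plays the role of null, the helper names are assumptions, and two rows are hypothetical: salesperson 500 ("Newhire") has no customers, and "Unassigned Co." has no salesperson.

```python
def left_outer_join(left, right, attr):
    """Keep every row of `left`; pad with None where `right` has no match."""
    right_attrs = {k for row in right for k in row} - {attr}
    result = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[attr] == l_row[attr]]
        if matches:
            result.extend({**l_row, **m} for m in matches)
        else:
            result.append({**l_row, **{k: None for k in right_attrs}})
    return result


def right_outer_join(left, right, attr):
    """Keep every row of `right`: a left outer join with the operands swapped."""
    return left_outer_join(right, left, attr)


salespersons = [
    {"number": 137, "name": "Baker"},
    {"number": 186, "name": "Adams"},
    {"number": 500, "name": "Newhire"},                    # hypothetical, no customers
]
customers = [
    {"customer_name": "City Hardware", "number": 137},
    {"customer_name": "Jane's Stores", "number": 186},
    {"customer_name": "Unassigned Co.", "number": None},   # hypothetical, no salesperson
]

left_rows = left_outer_join(salespersons, customers, "number")
right_rows = right_outer_join(salespersons, customers, "number")
```

An inner join of the same data would return only the two matched pairs; each outer join adds the unmatched row from the table it preserves.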

43

Left and Right JOINs

44


45

Relational Expressions

Select customer 1826, project name and HQ city
  π_{Customer Name, HQ City}(σ_{Customer Number=1826}(Customer))
Salesperson name for City Hardware
  π_{Salesperson Name}((σ_{Customer Name=‘City Hardware’}(Customer)) ⋈_{Salesperson Number} Salesperson)
  π_{Salesperson Name}(σ_{Customer Name=‘City Hardware’}(Salesperson ⋈_{Salesperson Number} Customer))
What is the order of evaluation?
What is the best order of evaluation?
  Query optimization
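The two expressions for "salesperson name for City Hardware" give the same answer but build intermediate results of different sizes, which is the concern of query optimization. The sketch below evaluates both orders on a few sample rows; the helper functions and short attribute names are assumptions, and comparing intermediate row counts is only a rough stand-in for the cost model a real optimizer would use.

```python
def select(relation, predicate):
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


def join(r, s, attr):
    return [{**a, **b} for a in r for b in s if a[attr] == b[attr]]


salespersons = [
    {"number": 137, "name": "Baker"}, {"number": 186, "name": "Adams"},
    {"number": 204, "name": "Dickens"}, {"number": 361, "name": "Carlyle"},
]
customers = [
    {"customer_name": "Main St. Hardware", "number": 137},
    {"customer_name": "Jane's Stores", "number": 186},
    {"customer_name": "City Hardware", "number": 137},
    {"customer_name": "Western Hardware", "number": 204},
]


def is_city_hardware(row):
    return row["customer_name"] == "City Hardware"


# Order A: select first, then join. The join sees only one customer row.
selected = select(customers, is_city_hardware)
plan_a = project(join(selected, salespersons, "number"), ["name"])

# Order B: join first, then select. The join produces a row for every customer.
joined = join(salespersons, customers, "number")
plan_b = project(select(joined, is_city_hardware), ["name"])
```

Here the intermediate result of order A has 1 row and that of order B has 4, while both plans return Baker.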

Data Dictionary and the System Catalog

Data dictionary: Description of all tables in the database created by the user and designer
System catalog: System data dictionary that describes all objects within the database
Homonyms and synonyms must be avoided to lessen confusion
  Homonym: Same name is used to label different attributes
  Synonym: Different names are used to describe the same attribute

46

47

Achieving Direct Access

If we know the value of a field of a record that we want to retrieve, we want to find its location in the physical file and instruct the hardware mechanisms of the disk device where to find it.
Need either:
  An index
  A hashing method

48

The Index

The items of interest are copied over into the index, but the original text is not disturbed in any way.
The items in the index are sorted.
Each item in the index is associated with a “pointer.”
  The pointer identifies a cylinder, track and sector


49

Simple Linear Index

Index is ordered by Salesperson Name field.

The first index record shows Adams 3 because the record of the Salesperson file with salesperson name Adams is at relative record location 3 in the Salesperson file.

Index

Salesperson Name  Record Address
Adams             3
Baker             2
Carlyle           6
Dickens           4
Green             7
Lincoln           5
Taylor            1

Salesperson File

Record Number  Salesperson Number  Salesperson Name  City      ...
1              119                 Taylor            New York  ...
2              137                 Baker             Detroit   ...
3              186                 Adams             Dallas    ...
4              204                 Dickens           Dallas    ...
5              255                 Lincoln           Atlanta   ...
6              361                 Carlyle           Detroit   ...
7              420                 Green             Tucson    ...

Figure 2.11 Salesperson file on the right with index built over the Salesperson Name field, on the left.
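The index in Figure 2.11 can be imitated in a few lines. This is an illustrative sketch only: the in-memory list stands in for the physical file, relative record numbers stand in for disk addresses, and the names `salesperson_file`, `name_index`, and `lookup` are assumptions. Because the index is sorted, a binary search finds the entry without scanning the file.

```python
from bisect import bisect_left

# Salesperson file in physical order; the record number is the position, starting at 1.
salesperson_file = [
    (119, "Taylor", "New York"),
    (137, "Baker", "Detroit"),
    (186, "Adams", "Dallas"),
    (204, "Dickens", "Dallas"),
    (255, "Lincoln", "Atlanta"),
    (361, "Carlyle", "Detroit"),
    (420, "Green", "Tucson"),
]

# Copy the Salesperson Name values into an index, each with its record number,
# and sort the index. The file itself is left untouched.
name_index = sorted(
    (name, record_number)
    for record_number, (number, name, city) in enumerate(salesperson_file, start=1)
)
index_keys = [name for name, _ in name_index]


def lookup(name):
    """Binary-search the index, then follow the pointer into the file."""
    i = bisect_left(index_keys, name)
    if i < len(name_index) and name_index[i][0] == name:
        record_number = name_index[i][1]
        return salesperson_file[record_number - 1]
    return None


adams = lookup("Adams")
```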

50

Simple Linear Index

An index built over the City field.

An index can be built over a field with nonunique values.

Index

City      Record Address
Atlanta   5
Dallas    3
Dallas    4
Detroit   2
Detroit   6
New York  1
Tucson    7

Salesperson File

Record Number  Salesperson Number  Salesperson Name  City      ...
1              119                 Taylor            New York  ...
2              137                 Baker             Detroit   ...
3              186                 Adams             Dallas    ...
4              204                 Dickens           Dallas    ...
5              255                 Lincoln           Atlanta   ...
6              361                 Carlyle           Detroit   ...
7              420                 Green             Tucson    ...

Figure 2.12 Salesperson file on the right with index built over the City field, on the left.