III. Introduction to the Relational Database Model
Relational Database Model
In 1970, E. F. Codd published “A Relational Model of Data for Large Shared Data Banks” in Communications of the ACM (CACM).
In the early 1980s, commercially viable relational database management systems became available.
Relational Database Model
While the relational database was very tempting in concept in the 1970s, it was not easily applicable in a real-world environment for reasons related to performance.
The earlier hierarchical and network database management systems were just coming onto the commercial scene and were the focus of intense marketing efforts by the software and hardware vendors.
The Relational Database Concept
Data appears to be stored in what we have been referring to as simple, linear files.
Relational databases are based on mathematics: relational algebra.
A relational database is a collection of relations that, as a group, contain the data that describes a particular business environment.
Relational Terminology
Relation = logical representation of an underlying file. Also called a table.
Row = record (files) = tuple (relation)
Column = field (files) = attribute (relation)
Relation schema: names a relation and lists its attributes (including keys and other properties)
Dependencies
Functional dependence: The value of one or more attributes determines the value of one or more other attributes
  Determinant: Attribute whose value determines another
  Dependent: Attribute whose value is determined by the other attribute
Full functional dependence: The entire collection of attributes in the determinant is necessary for the relationship
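A functional dependence can be checked against sample rows. The sketch below is only an illustration: the helper name `holds` and the lowercase attribute names are assumptions, and the rows are the SALESPERSON rows used elsewhere in these slides. Passing the check on sample data shows only that the data does not violate the dependence; the dependence itself is a rule of the business environment.

```python
from collections import defaultdict

# SALESPERSON rows from these slides; the attribute names are illustrative.
SALESPERSON = [
    {"number": 137, "name": "Baker", "commission": 10, "year_of_hire": 1995},
    {"number": 186, "name": "Adams", "commission": 15, "year_of_hire": 2001},
    {"number": 204, "name": "Dickens", "commission": 10, "year_of_hire": 1998},
    {"number": 361, "name": "Carlyle", "commission": 20, "year_of_hire": 2001},
]


def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by these rows."""
    seen = defaultdict(set)
    for row in rows:
        left = tuple(row[a] for a in determinant)
        right = tuple(row[a] for a in dependent)
        seen[left].add(right)
    # A violation is one determinant value paired with two different dependent values.
    return all(len(values) == 1 for values in seen.values())


print(holds(SALESPERSON, ["number"], ["name", "commission"]))  # True
print(holds(SALESPERSON, ["commission"], ["name"]))            # False
```

The second check fails because commission 10 appears with both Baker and Dickens, so Commission Percentage cannot be a determinant of Salesperson Name.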
Primary Key
A relation always has a unique primary key
A primary key (also called “the key”) is an attribute or a group of attributes whose values are unique throughout all of the rows of the relation
All attributes are functionally dependent on the key
Used to:
  Ensure that each row in a table is uniquely identifiable
  Establish relationships among tables and ensure the integrity of the data
Primary Key
Specified in the relation schema
The number of attributes involved in the primary key is always the minimum number of attributes that provide the uniqueness quality
In the worst case, all of the relation’s attributes combined could serve as the primary key
Candidate Key
If a relation has more than one attribute or minimum group of attributes that represents a way of uniquely identifying the entities, then they are each called a candidate key
When there is more than one candidate key, one of them must be chosen to be the primary key of the relation
Alternate key: A candidate key that was not chosen to be the primary key of the relation
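The "minimum group of attributes" idea can be sketched as a search for the smallest attribute sets whose values are unique in every row. The function name `candidate_keys` and the attribute names are assumptions made for this illustration; the rows are the SALESPERSON rows (with Office Number) from these slides. Uniqueness in four sample rows only suggests candidates; real candidate keys follow from the business rules.

```python
from itertools import combinations

ATTRIBUTES = ["number", "name", "commission", "year_of_hire", "office"]
SALESPERSON = [
    dict(zip(ATTRIBUTES, row))
    for row in [
        (137, "Baker", 10, 1995, 1284),
        (186, "Adams", 15, 2001, 1253),
        (204, "Dickens", 10, 1998, 1209),
        (361, "Carlyle", 20, 2001, 1227),
    ]
]


def candidate_keys(rows, attributes):
    """Return the minimal attribute sets whose values are unique in rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip any set that contains a smaller key: it is not minimal.
            if any(set(key) <= set(combo) for key in keys):
                continue
            projected = {tuple(row[a] for a in combo) for row in rows}
            if len(projected) == len(rows):
                keys.append(combo)
    return keys


print(candidate_keys(SALESPERSON, ATTRIBUTES))
# [('number',), ('name',), ('office',), ('commission', 'year_of_hire')]
```

The pair (commission, year_of_hire) is unique only by accident in this small sample, which is why the designer, not the data, chooses the primary key.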
Foreign Key
An attribute or group of attributes that serves as the primary key of one relation and also appears in another relation (where it is a foreign key).
Crucial in a relational database, because the foreign key is the mechanism that ties relations together to represent unary, binary, and ternary relationships.
A foreign key attribute must have the same domain of values (i.e., type) as the primary key attribute in the other relation.
Keys and Integrity Constraints
Entity integrity: Condition in which each row in the table has its own unique identity
  All of the values in the primary key must be unique
  No key attribute in the primary key can contain a null
Referential integrity: Every reference to an entity instance by another entity instance is valid
Integrity Rules: Entity Integrity
Requirement: All primary key entries are unique, and no part of a primary key may be null
Purpose: Each row will have a unique identity, and foreign key values can properly reference primary key values
Example: No invoice can have a duplicate number, nor can it be null
Integrity Rules: Referential Integrity
Requirement: A foreign key may have either a null entry or an entry that matches a primary key value in a table to which it is related
Purpose: It is possible for an attribute not to have a corresponding value, but it is impossible to have an invalid entry
  It is impossible to delete a row in a table whose primary key has mandatory matching foreign key values in another table
Example: It is impossible to have an invalid sales representative number
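Both integrity rules can be observed in a running DBMS. The sketch below uses Python's built-in sqlite3 module with two cut-down tables; the table layout, the helper `rejected`, and the extra row values (such as salesperson 999) are assumptions for illustration only. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default
conn.execute("CREATE TABLE salesperson (number INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE customer ("
    " number TEXT PRIMARY KEY NOT NULL,"
    " name TEXT NOT NULL,"
    " salesperson_number INTEGER REFERENCES salesperson(number))"
)
conn.execute("INSERT INTO salesperson VALUES (137, 'Baker')")
conn.execute("INSERT INTO customer VALUES ('0121', 'Main St. Hardware', 137)")


def rejected(sql):
    """Run one statement and report whether the DBMS refused it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


results = {
    # Entity integrity: primary key values must be unique and not null.
    "duplicate primary key": rejected("INSERT INTO salesperson VALUES (137, 'Adams')"),
    "null primary key": rejected("INSERT INTO customer VALUES (NULL, 'No Key Stores', 137)"),
    # Referential integrity: a foreign key is null or matches a primary key value.
    "unmatched foreign key": rejected("INSERT INTO customer VALUES ('0839', 'Janes Stores', 999)"),
    "null foreign key": rejected("INSERT INTO customer VALUES ('1700', 'XYZ Stores', NULL)"),
    "delete referenced row": rejected("DELETE FROM salesperson WHERE number = 137"),
}
for rule, refused in results.items():
    print(rule, "->", "rejected" if refused else "accepted")
```

Only the null foreign key is accepted; the other four statements are refused.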
Relational Database Model
Relations are used to implement building blocks from the data model:
  Entity sets
    Including dependent entity sets
  Attributes
    Including composite, multi-valued, derived
  Relationships
    Binary, unary, ternary
    1-1, 1-M, M-M
  Constraints
  Specialization/generalization hierarchies
Example of a Simple Relational Database
Binary Relationships
One-to-One
One-to-Many
Many-to-Many
One-to-Many Binary Relationship
[Diagram: Salesperson and Customer entities]
The Salesperson Number foreign key in the CUSTOMER relation effectively establishes the one-to-many relationship between salespersons and customers.
General Hardware Co.
Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire
137                 Baker             10                     1995
186                 Adams             15                     2001
204                 Dickens           10                     1998
361                 Carlyle           20                     2001
SALESPERSON relation.
Customer Number  Customer Name        Salesperson Number  HQ City
0121             Main St. Hardware    137                 New York
0839             Jane’s Stores        186                 Chicago
0933             ABC Home Stores      137                 Los Angeles
1047             Acme Hardware Store  137                 Los Angeles
1525             Fred’s Tool Stores   361                 Atlanta
1700             XYZ Stores           361                 Washington
1826             City Hardware        137                 New York
2198             Western Hardware     204                 New York
2267             Central Stores       186                 New York
CUSTOMER relation.
One-to-One Binary Relationship
General Hardware Co. including OFFICE
Office Number  Telephone     Size (sq. ft.)
1253           901-555-4276  120
1227           901-555-0364  100
1284           901-555-7335  120
1209           901-555-3108  95
OFFICE relation.
Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire  Office Number
137                 Baker             10                     1995          1284
186                 Adams             15                     2001          1253
204                 Dickens           10                     1998          1209
361                 Carlyle           20                     2001          1227
SALESPERSON relation.
Can SALESPERSON and OFFICE be combined into one relation?
Could the foreign key be in OFFICE instead?
Many-to-Many Relationship
Unacceptable: Many-to-Many
Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire  Product and Quantity Pairs
137                 Baker             10                     1995          (19440, 473) (24013, 170) (26722, 688)
186                 Adams             15                     2001          (16386, 1745) (19440, 2529) (21765, 1962) (24013, 3071)
204                 Dickens           10                     1998          (21765, 809) (26722, 734)
361                 Carlyle           20                     2001          (16386, 3729) (21765, 3110) (26722, 2738)
Many-to-Many Relationship
Has its own relation in the database
Can have its own attributes
  Intersection data
It is a kind of entity: an associative entity (or composite or bridge)
SALES Relation (without intersection data)
Salesperson Number  Product Number
137                 19440
137                 24013
137                 26722
186                 16386
186                 19440
186                 21765
186                 24013
204                 21765
204                 26722
361                 16386
361                 21765
361                 26722
Many-to-Many Relationship
[Diagram: Salesperson and Product entities]
Salesperson Number  Product Number  Quantity
137                 19440           473
137                 24013           170
137                 26722           688
186                 16386           1745
186                 19440           2529
186                 21765           1962
186                 24013           3071
204                 21765           809
204                 26722           734
361                 16386           3729
361                 21765           3110
361                 26722           2738
SALES relation.
Product Number  Product Name  Unit Price
16386           Wrench        12.95
19440           Hammer        17.50
21765           Drill         32.99
24013           Saw           26.25
26722           Pliers        11.50
PRODUCT relation.
Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire  Office Number
137                 Baker             10                     1995          1284
186                 Adams             15                     2001          1253
204                 Dickens           10                     1998          1209
361                 Carlyle           20                     2001          1227
SALESPERSON relation.
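The SALES relation can be modelled as a mapping whose key is the pair (Salesperson Number, Product Number) and whose value is the intersection data, Quantity. This is only a sketch of the idea using the rows above; the function names `products_sold_by` and `sellers_of` are assumptions for illustration.

```python
# SALES relation: composite key (salesperson number, product number) -> quantity.
SALES = {
    (137, 19440): 473, (137, 24013): 170, (137, 26722): 688,
    (186, 16386): 1745, (186, 19440): 2529, (186, 21765): 1962, (186, 24013): 3071,
    (204, 21765): 809, (204, 26722): 734,
    (361, 16386): 3729, (361, 21765): 3110, (361, 26722): 2738,
}


def products_sold_by(salesperson):
    """One salesperson relates to many products."""
    return sorted(p for (s, p) in SALES if s == salesperson)


def sellers_of(product):
    """One product relates to many salespersons."""
    return sorted(s for (s, p) in SALES if p == product)


print(products_sold_by(137))  # [19440, 24013, 26722]
print(sellers_of(21765))      # [186, 204, 361]
print(SALES[(186, 19440)])    # 2529
```

Because the key is the pair, the same salesperson or the same product can appear in many rows, but each combination appears once and carries its own quantity.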
General Hardware Co. including OFFICE
Salesperson Number  Salesperson Name  Commission Percentage  Year Of Hire  Office Number
137                 Baker             10                     1995          1284
186                 Adams             15                     2001          1253
204                 Dickens           10                     1998          1209
361                 Carlyle           20                     2001          1227
(a) SALESPERSON relation.
Customer Number  Customer Name        Salesperson Number  HQ City
0121             Main St. Hardware    137                 New York
0839             Jane’s Stores        186                 Chicago
0933             ABC Home Stores      137                 Los Angeles
1047             Acme Hardware Store  137                 Los Angeles
1525             Fred’s Tool Stores   361                 Atlanta
1700             XYZ Stores           361                 Washington
1826             City Hardware        137                 New York
2198             Western Hardware     204                 New York
2267             Central Stores       186                 New York
(b) CUSTOMER relation.
Customer Number  Employee Number  Employee Name  Title
0121             27498            Smith          Co-Owner
0121             30441            Garcia         Co-Owner
0933             25270            Chen           VP Sales
0933             30441            Levy           Sales Manager
0933             48285            Morton         President
1525             33779            Baker          Sales Manager
2198             27470            Smith          President
2198             30441            Jones          VP Sales
2198             33779            Garcia         VP Personnel
2198             35268            Kaplan         Senior Accountant
(c) CUSTOMER EMPLOYEE relation.
Product Number  Product Name  Unit Price
16386           Wrench        12.95
19440           Hammer        17.50
21765           Drill         32.99
24013           Saw           26.25
26722           Pliers        11.50
(d) PRODUCT relation.
Data Retrieval from a Relational Database
Relational DBMS
  Have the ability to accept high-level data retrieval commands
  Process the commands against the database’s relations and return the desired data
  Based on relational algebra
Relational Algebra
The formal basis for operations on the relational data model
  Underlying basis for both SQL and QBE
An “algebra” is a collection of operations on some domain
Relational algebra is a collection of operators
  Operands and results are relations (closure)
  Significant operators
    Projection and selection remove parts of a relation
    Set operators: union, intersection, and difference
    Joins and products combine the tuples of two relations
Select Operator
Retrieves a horizontal slice of the relation.
The result of a relational operation will always be a relation.
  Select (σ) applied to R produces a subset of R
Applies a selection criterion to a relation
  Select all rows from the SALESPERSON relation in which Salesperson Number = 204.
  Select all rows from the MOVIE relation that are longer than 100 minutes and whose genre is ‘action comedy’
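A minimal sketch of the Select operator, assuming a relation is held as a list of rows and the selection criterion is passed as a function. The name `select` and the attribute names are illustrative; the rows are the SALESPERSON rows from these slides.

```python
SALESPERSON = [
    {"number": 137, "name": "Baker", "commission": 10, "year_of_hire": 1995},
    {"number": 186, "name": "Adams", "commission": 15, "year_of_hire": 2001},
    {"number": 204, "name": "Dickens", "commission": 10, "year_of_hire": 1998},
    {"number": 361, "name": "Carlyle", "commission": 20, "year_of_hire": 2001},
]


def select(relation, predicate):
    """Keep whole rows that satisfy the criterion: a horizontal slice."""
    return [row for row in relation if predicate(row)]


by_number = select(SALESPERSON, lambda row: row["number"] == 204)
hired_2001 = select(SALESPERSON, lambda row: row["year_of_hire"] == 2001)

print([row["name"] for row in by_number])   # ['Dickens']
print([row["name"] for row in hired_2001])  # ['Adams', 'Carlyle']
```

The result keeps every attribute of the rows it returns, so it is itself a relation with the same schema as the input.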
Select
Project Operator
Retrieves a vertical slice of the relation.
Used to produce a relation with fewer attributes (columns)
  Called projection because it reduces the dimension of the relation
  e.g., removing a column from a 3-dimensional relation produces a 2-dimensional relation
Examples
  List (project) the Title of every Movie
  List the Salesperson Number and Salesperson Name over the SALESPERSON relation.
May have fewer rows than the original relation, because duplicate tuples are eliminated
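A minimal sketch of the Project operator, again assuming a relation is a list of rows. The name `project` and the attribute names are illustrative. Duplicate tuples are dropped, which is why a projection can have fewer rows than its input.

```python
SALESPERSON = [
    {"number": 137, "name": "Baker", "commission": 10, "year_of_hire": 1995},
    {"number": 186, "name": "Adams", "commission": 15, "year_of_hire": 2001},
    {"number": 204, "name": "Dickens", "commission": 10, "year_of_hire": 1998},
    {"number": 361, "name": "Carlyle", "commission": 20, "year_of_hire": 2001},
]


def project(relation, attributes):
    """Keep only the named columns and drop duplicate rows: a vertical slice."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


names = project(SALESPERSON, ["number", "name"])
commissions = project(SALESPERSON, ["commission"])

print(len(names))   # 4
print(commissions)  # [{'commission': 10}, {'commission': 15}, {'commission': 20}]
```

Projecting on Commission Percentage alone returns three rows from four, because Baker and Dickens share the value 10.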
Project
Set Operations
Union: R ∪ S = { t | t ∈ R or t ∈ S }
  R and S must have the same number and types of attributes
  No duplicate tuples
Consider the union of tables
  Rental (accountId, videoId, dateRented, dateDue, cost)
  PreviousRental (accountId, videoId, dateRented, dateReturned, cost)
  Rental ∪ PreviousRental
Intersection and Difference are similarly based on normal set operations
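The set operators map directly onto Python sets of tuples. Since the slides give no rows for Rental and PreviousRental, this sketch instead uses two union-compatible relations drawn from the General Hardware CUSTOMER data: customers headquartered in New York and customers of salesperson 137. The variable names are assumptions for illustration.

```python
# Both relations have the same schema: (customer number, customer name).
NEW_YORK_CUSTOMERS = {
    ("0121", "Main St. Hardware"),
    ("1826", "City Hardware"),
    ("2198", "Western Hardware"),
    ("2267", "Central Stores"),
}
CUSTOMERS_OF_137 = {
    ("0121", "Main St. Hardware"),
    ("0933", "ABC Home Stores"),
    ("1047", "Acme Hardware Store"),
    ("1826", "City Hardware"),
}

union = NEW_YORK_CUSTOMERS | CUSTOMERS_OF_137         # in either relation, no duplicates
intersection = NEW_YORK_CUSTOMERS & CUSTOMERS_OF_137  # in both relations
difference = NEW_YORK_CUSTOMERS - CUSTOMERS_OF_137    # in the first but not the second

print(sorted(number for number, _ in union))
# ['0121', '0933', '1047', '1826', '2198', '2267']
print(sorted(number for number, _ in intersection))  # ['0121', '1826']
print(sorted(number for number, _ in difference))    # ['2198', '2267']
```

The union has six tuples rather than eight because the two tuples that appear in both relations are kept once.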
Union
Intersect
Difference
Cartesian Product
Cartesian product of two relations R and S creates a new relation
  One attribute for each attribute in R and one for each attribute in S
  One tuple for each combination of a tuple in R and a tuple in S
  |R| * |S| tuples: if R has m tuples and S has n tuples, R × S has m*n tuples
Just like the Cartesian product of any sets
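The tuple count can be confirmed with `itertools.product` from the Python standard library. This sketch pairs cut-down SALESPERSON and OFFICE tuples (number and name, number and telephone) from these slides; the variable names are illustrative.

```python
from itertools import product

SALESPERSON = [(137, "Baker"), (186, "Adams"), (204, "Dickens"), (361, "Carlyle")]
OFFICE = [
    (1253, "901-555-4276"),
    (1227, "901-555-0364"),
    (1284, "901-555-7335"),
    (1209, "901-555-3108"),
]

# Each result tuple carries every attribute of R followed by every attribute of S.
R_TIMES_S = [r + s for r, s in product(SALESPERSON, OFFICE)]

print(len(R_TIMES_S))  # 16
print(R_TIMES_S[0])    # (137, 'Baker', 1253, '901-555-4276')
```

With m = 4 and n = 4 the product has 16 tuples, most of which pair a salesperson with an office that is not theirs.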
Product
Data Integration
The relational algebra Join operation
  Based on Cartesian product
  R join S on A has those tuples of R × S where R.A = S.A
  Each tuple from R is joined to all tuples of S that have the same value for attribute A
Example
  Every combination of Salesperson and Customer where the Salesperson Number fields match
How would I find the name of the salesperson for City Hardware?
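One way to answer the City Hardware question is to build the join exactly as defined above: form every combination of tuples and keep those whose Salesperson Number values match. The helper name `join` and the attribute names are assumptions for this sketch; the rows are from the General Hardware relations.

```python
SALESPERSON = [
    {"salesperson_number": n, "salesperson_name": name}
    for n, name in [(137, "Baker"), (186, "Adams"), (204, "Dickens"), (361, "Carlyle")]
]
CUSTOMER = [
    {"customer_number": c, "customer_name": name, "salesperson_number": s}
    for c, name, s in [
        ("0121", "Main St. Hardware", 137), ("0839", "Jane's Stores", 186),
        ("0933", "ABC Home Stores", 137), ("1047", "Acme Hardware Store", 137),
        ("1525", "Fred's Tool Stores", 361), ("1700", "XYZ Stores", 361),
        ("1826", "City Hardware", 137), ("2198", "Western Hardware", 204),
        ("2267", "Central Stores", 186),
    ]
]


def join(r, s, attribute):
    """Tuples of the Cartesian product of r and s that agree on attribute."""
    return [{**a, **b} for a in r for b in s if a[attribute] == b[attribute]]


joined = join(SALESPERSON, CUSTOMER, "salesperson_number")
answer = [
    row["salesperson_name"] for row in joined if row["customer_name"] == "City Hardware"
]

print(len(joined))  # 9
print(answer)       # ['Baker']
```

Each of the nine customers matches exactly one salesperson, so the join has nine tuples, and the one for City Hardware names Baker.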
Terminology
Cartesian product: Every possible combination of the tuples of two sets, or two relations
Natural join: Links tables by selecting only the rows with common values in their common attributes
  Join columns: Common columns
  Steps: Product, Select, Project
Equijoin: Links tables on the basis of an equality condition that compares specified columns of each table
Theta join: Uses a comparison condition other than equality
Natural JOIN
Types of Joins
Inner join: Only returns matched records from the tables that are being joined
Outer join: Matched pairs are retained and unmatched values in the other table are left null
  Left outer join: Yields all of the rows in the first table, including those that do not have a matching value in the second table
  Right outer join: Yields all of the rows in the second table, including those that do not have matching values in the first table
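The difference between an inner and a left outer join shows up as soon as one row has no match. In this sketch the second table is limited to the customers headquartered in New York, so salesperson 361 (Carlyle) has no matching row. The function names and attribute names are assumptions for illustration, and Python's `None` stands in for null.

```python
SALESPERSON = [
    {"salesperson_number": n, "salesperson_name": name}
    for n, name in [(137, "Baker"), (186, "Adams"), (204, "Dickens"), (361, "Carlyle")]
]
NY_CUSTOMER = [
    {"customer_number": c, "customer_name": name, "salesperson_number": s}
    for c, name, s in [
        ("0121", "Main St. Hardware", 137), ("1826", "City Hardware", 137),
        ("2198", "Western Hardware", 204), ("2267", "Central Stores", 186),
    ]
]


def inner_join(left, right, attribute):
    """Only matched pairs."""
    return [{**l, **r} for l in left for r in right if l[attribute] == r[attribute]]


def left_outer_join(left, right, attribute, right_only_attributes):
    """Every left row; unmatched left rows get None for the right-hand columns."""
    result = []
    for l in left:
        matches = [r for r in right if r[attribute] == l[attribute]]
        if matches:
            result.extend({**l, **r} for r in matches)
        else:
            result.append({**l, **{a: None for a in right_only_attributes}})
    return result


inner = inner_join(SALESPERSON, NY_CUSTOMER, "salesperson_number")
outer = left_outer_join(
    SALESPERSON, NY_CUSTOMER, "salesperson_number", ["customer_number", "customer_name"]
)

print(len(inner), len(outer))  # 4 5
print(outer[-1])
# {'salesperson_number': 361, 'salesperson_name': 'Carlyle', 'customer_number': None, 'customer_name': None}
```

A right outer join is the same operation with the roles of the two tables swapped.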
Left and Right JOINs
Relational Expressions
Select customer 1826, project name and HQ city
  π[Customer Name, HQ City] (σ[Customer Number = 1826] (Customer))
Salesperson name for City Hardware
  π[Salesperson Name] ((σ[Customer Name = ‘City Hardware’] (Customer)) ⋈[Salesperson Number] Salesperson)
  π[Salesperson Name] (σ[Customer Name = ‘City Hardware’] (Salesperson ⋈[Salesperson Number] Customer))
What is the order of evaluation?
What is the best order of evaluation?
  Query optimization
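The two expressions for the City Hardware query give the same answer but do different amounts of work. The sketch below evaluates both orders over the General Hardware rows and compares the size of the intermediate relation each one builds. The helper names (`select`, `join`, `project`) and attribute names are assumptions for illustration; a real optimizer weighs many more factors than intermediate size.

```python
SALESPERSON = [
    {"salesperson_number": n, "salesperson_name": name}
    for n, name in [(137, "Baker"), (186, "Adams"), (204, "Dickens"), (361, "Carlyle")]
]
CUSTOMER = [
    {"customer_number": c, "customer_name": name, "salesperson_number": s}
    for c, name, s in [
        ("0121", "Main St. Hardware", 137), ("0839", "Jane's Stores", 186),
        ("0933", "ABC Home Stores", 137), ("1047", "Acme Hardware Store", 137),
        ("1525", "Fred's Tool Stores", 361), ("1700", "XYZ Stores", 361),
        ("1826", "City Hardware", 137), ("2198", "Western Hardware", 204),
        ("2267", "Central Stores", 186),
    ]
]


def select(relation, predicate):
    return [row for row in relation if predicate(row)]


def join(r, s, attribute):
    return [{**a, **b} for a in r for b in s if a[attribute] == b[attribute]]


def project(relation, attributes):
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


def is_city_hardware(row):
    return row["customer_name"] == "City Hardware"


# Order 1: select first, then join the single remaining customer row.
selected = select(CUSTOMER, is_city_hardware)
plan1 = project(join(selected, SALESPERSON, "salesperson_number"), ["salesperson_name"])

# Order 2: join everything first, then select from the larger result.
joined = join(SALESPERSON, CUSTOMER, "salesperson_number")
plan2 = project(select(joined, is_city_hardware), ["salesperson_name"])

print(plan1 == plan2, plan1)       # True [{'salesperson_name': 'Baker'}]
print(len(selected), len(joined))  # 1 9
```

Selecting first leaves one tuple to join instead of nine, which is the kind of saving query optimization looks for.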
Data Dictionary and the System Catalog
Data dictionary: Description of all tables in the database created by the user and designer
System catalog: System data dictionary that describes all objects within the database
Homonyms and synonyms must be avoided to lessen confusion
  Homonym: Same name is used to label different attributes
  Synonym: Different names are used to describe the same attribute
Achieving Direct Access
If we know the value of a field of a record that we want to retrieve, we want to find its location in the physical file and instruct the hardware mechanisms of the disk device where to find it.
Need either:
  An index
  A hashing method
The Index
The items of interest are copied over into the index, but the original text is not disturbed in any way.
The items in the index are sorted.
Each item in the index is associated with a “pointer.”
  The pointer identifies a cylinder, track, and sector
Simple Linear Index
Index is ordered by Salesperson Name field.
The first index record shows Adams 3 because the record of the Salesperson file with salesperson name Adams is at relative record location 3 in the Salesperson file.
Index                              Salesperson File
Salesperson Name  Record Address   Record Number  Salesperson Number  Salesperson Name  City      ...
Adams             3                1              119                 Taylor            New York  ...
Baker             2                2              137                 Baker             Detroit   ...
Carlyle           6                3              186                 Adams             Dallas    ...
Dickens           4                4              204                 Dickens           Dallas    ...
Green             7                5              255                 Lincoln           Atlanta   ...
Lincoln           5                6              361                 Carlyle           Detroit   ...
Taylor            1                7              420                 Green             Tucson    ...
Figure 2.11 Salesperson file on the right with index built over the Salesperson Name field, on the left.
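The index in Figure 2.11 can be sketched as a sorted list of (name, record address) pairs searched by binary search. The sketch uses Python's standard `bisect` module; the names `SALESPERSON_FILE`, `NAME_INDEX`, and `find_by_name` are assumptions for illustration, and a list position stands in for the disk address.

```python
import bisect

# Salesperson file in physical order; record number n is stored at position n - 1.
SALESPERSON_FILE = [
    (119, "Taylor", "New York"),
    (137, "Baker", "Detroit"),
    (186, "Adams", "Dallas"),
    (204, "Dickens", "Dallas"),
    (255, "Lincoln", "Atlanta"),
    (361, "Carlyle", "Detroit"),
    (420, "Green", "Tucson"),
]

# The index copies the name field and keeps a pointer (the record number).
# It is sorted by name; the file itself is left undisturbed.
NAME_INDEX = sorted(
    (record[1], position + 1) for position, record in enumerate(SALESPERSON_FILE)
)
NAME_KEYS = [name for name, _ in NAME_INDEX]


def find_by_name(name):
    """Binary-search the index, then follow the pointer into the file."""
    i = bisect.bisect_left(NAME_KEYS, name)
    if i < len(NAME_KEYS) and NAME_KEYS[i] == name:
        record_number = NAME_INDEX[i][1]
        return SALESPERSON_FILE[record_number - 1]
    return None


print(NAME_INDEX[0])          # ('Adams', 3)
print(find_by_name("Adams"))  # (186, 'Adams', 'Dallas')
print(find_by_name("Zola"))   # None
```

The first index entry is ('Adams', 3), matching the figure.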
Simple Linear Index
An index built over the City field.
An index can be built over a field with nonunique values.
Index                    Salesperson File
City      Record Address  Record Number  Salesperson Number  Salesperson Name  City      ...
Atlanta   5               1              119                 Taylor            New York  ...
Dallas    3               2              137                 Baker             Detroit   ...
Dallas    4               3              186                 Adams             Dallas    ...
Detroit   2               4              204                 Dickens           Dallas    ...
Detroit   6               5              255                 Lincoln           Atlanta   ...
New York  1               6              361                 Carlyle           Detroit   ...
Tucson    7               7              420                 Green             Tucson    ...
Figure 2.12 Salesperson file on the right with index built over the City field, on the left.
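With nonunique values, a lookup returns a run of adjacent index entries rather than a single one. A minimal sketch, again using the standard `bisect` module with hypothetical names (`CITY_INDEX`, `find_by_city`) and list positions standing in for disk addresses:

```python
import bisect

# Salesperson file in physical order; record number n is stored at position n - 1.
SALESPERSON_FILE = [
    (119, "Taylor", "New York"),
    (137, "Baker", "Detroit"),
    (186, "Adams", "Dallas"),
    (204, "Dickens", "Dallas"),
    (255, "Lincoln", "Atlanta"),
    (361, "Carlyle", "Detroit"),
    (420, "Green", "Tucson"),
]

# Index over the City field: equal cities sit next to each other once sorted.
CITY_INDEX = sorted(
    (record[2], position + 1) for position, record in enumerate(SALESPERSON_FILE)
)
CITY_KEYS = [city for city, _ in CITY_INDEX]


def find_by_city(city):
    """Return every file record whose City field equals city."""
    low = bisect.bisect_left(CITY_KEYS, city)    # first entry for this city
    high = bisect.bisect_right(CITY_KEYS, city)  # one past the last entry
    return [SALESPERSON_FILE[number - 1] for _, number in CITY_INDEX[low:high]]


print(CITY_INDEX[1:3])          # [('Dallas', 3), ('Dallas', 4)]
print(find_by_city("Detroit"))  # [(137, 'Baker', 'Detroit'), (361, 'Carlyle', 'Detroit')]
print(find_by_city("Boston"))   # []
```

The two Detroit entries point to records 2 and 6, as in the figure.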