lesson 9: relational data model & sql
TRANSCRIPT
Lesson 9: Relational Data Model &
SQL
Lesson 9 / Page 2AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Contentsn Structure of Relational Databasesn Relational Algebran Basic Relational-Algebra Operationsn Additional Relational-Algebra Operationsn Extended Relational-Algebra Operationsn Null Values and Three-valued Logicsn Database Modification by Relational-Algebra Operations
n Brief Introduction to SQLn SQL and Relationsn Fundamental SQL statementsn null values in SQLn Database modifications in SQL
Lesson 9 / Page 3AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Why Relations?n We have seen tablesn Why do we need another
view of data?n There is a number of
reasons:l Need to create a rigorous
mathematical modell This model enables for formalizing database operationsl The exact model is needed to formulate declarative queries and
optimize their processingn The central idea is to describe a database as a collection of
predicates over a finite set of predicate variables, defining constraints on the possible values and combinations of values.
350Palo AltoA-305700RedwoodA-222750BrightonA-217700BerkeleyA-215900BrightonA-201400BridgesA-102500DowntownA-101
balancebranch_nameaccount_number
Lesson 9 / Page 4AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
What is a Relation?
n Mathematically, given sets 1, 2, …, n a relation is a subset of the Cartesian product 1 x 2 x … x nThus, a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈i
n Example:l customer_name = {Jones, Smith, Curry, Lindsay, …}
/* Set of all customer names */l customer_street = {Main, North, Park, …} /* Set of all street names*/l customer_city = {Harrison, Rye, Pittsfield, …} /* Set of all city names */Then r = { (Jones, Main, Harrison),
(Smith, North, Rye),(Curry, North, Rye),(Lindsay, Park, Pittsfield) }
is a relation, i.e. subset of customer_name x customer_street x customer_city
n As we are concerned with finite sets, such sets can be expressed by enumeration, i.e. tables
Lesson 9 / Page 5AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Relation is a Subset of a Cartesian Productn No duplicates in sets
l Very important for database applications
n Component set members can be in any orderl Sorted or unsorted
Bush
Carter
Clinton
Jefferson
Kenedy
Lincoln
Obama
Roosevelt
Washington
Abr
aham
Bar
ac Bill
Fran
klin
Geo
rge
Jim
my
John
Theo
dore
Thom
as
First names
Last
nam
es
Selected U.S. Presidents
Lesson 9 / Page 6AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Attribute Typesn Each attribute of a relation has a namen The set of allowed values for each attribute is called the
domain of the attributen Attribute values are (normally) required to be atomic;
that is, indivisiblel E.g. the value of an attribute can be an account number,
but cannot be a set of account numbersn Domain is said to be atomic if all its members are atomicn The special value null is a member of every domainn The null value causes complications in the definition of
many operationsl We shall ignore the effect of null values in our main presentation
and consider their effect later
Lesson 9 / Page 7AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Relation Schema & Relation Instancen Relation Schema
l A1, A2, …, An are attributesl R = (A1, A2, …, An ) is a relation schema
Example:Customer_schema = (customer_name, customer_street, customer_city)
l r(R) denotes a relation r on the relation schema RExample:
customer (Customer_schema)n Relation Instance
l The current values (relation instance) of a relation are specified by a table
l An element t of r is a tuple, represented by a row in a table
JonesSmithCurryLindsay
customer_name
MainNorthNorthPark
customer_street
HarrisonRyeRyePittsfield
customer_city
customer
attributes(or columns)
tuples(or rows)
Lesson 9 / Page 8AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Databasen A database consists of multiple relationsn Information about an enterprise is broken up into parts,
with each relation storing one part of the informationaccount : stores information about accountsdepositor : stores information about which customer
owns which account customer : stores information about customers
n Storing all information as a single relation such as bank(account_number, balance, customer_name, ..)
results inl repetition of information
4 e.g., if two customers own an account (What gets repeated?)l the need for null values
4 e.g., to represent a customer without an account
n Normalization theory deals with how to design relational schemas
Lesson 9 / Page 9AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Keys (revisited)n Let K ⊆ Rn K is a superkey of R if values for K are sufficient to identify
a unique tuple of each possible relation r(R)l by “possible r ” we mean a relation r that could exist in the
enterprise we are modeling.l Example: {customer_name, customer_street} and
{customer_name} are both superkeys of Customer, if no two customers can possibly have the same name4 In real life, an attribute such as customer_id would be used instead of
customer_name to uniquely identify customers, but we omit it to keep our examples small, and instead assume customer names are unique.
n K is a candidate key if K is minimall Example: {customer_name} is a candidate key for Customer, since
it is a superkey and no subset of it is a superkey.n Primary key: a candidate key chosen as the principal
means of identifying tuples within a relationl Should choose an attribute whose value never, or very rarely,
changes.4 E.g. email address is unique, but may change
Lesson 9 / Page 10AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Foreign Keysn A relation schema may have an attribute that corresponds
to the primary key of another relation. The attribute is called a foreign key.l E.g. customer_name and account_number attributes of depositor
are foreign keys to customer and account respectively.l Only values occurring in the primary key attribute of the
referenced relation may occur in the foreign key attribute of the referencing relation.
branch_cityassets
branch_namebranch
branch_namebalance
account_numberaccount
customer_streetcustomer_city
customer_namecustomer
branch_nameamount
loan_numberloan
customer_nameloan_number
borrower
customer_nameaccount_number
depositor
Lesson 9 / Page 11AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Relational Algebran Procedural languagen Six basic operators
l select: σl project: ∏l union: ∪l set difference: –l Cartesian product: xl rename: ρ
n The operators take one or two relations as inputs and produce a new relation as a result.
Lesson 9 / Page 12AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Select Operationn Notation: σ p(r)n p is called the selection predicaten Defined as:
σp(r) = {t | t ∈ r and p(t)}Where p is a formula in propositional calculus consisting of termsconnected by : ∧ (and), ∨ (or), ¬ (not)Each term is one of:
<attribute> op <attribute> or <constant>where op is one of: =, ≠, >, ≥, <, ≤
n Example of selection: σ branch_name=“Redwood”(account)
A B C D
αβ
αβ
123
710
σA=B ^ D > 5 (r)
A B C D
ααββ
αβββ
151223
77310
r
Lesson 9 / Page 13AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Project Operationn Notation:
where A1, A2 are attribute names and r is a relation name.n The result is defined as the relation of k columns obtained
by erasing the columns that are not listedl Duplicate rows are removed from result, since relations are sets
n Example: To eliminate the branch_name attribute of account
∏account_number, balance (account)
)(,,, 21r
kAAA K∏
A B C
α
α
β
β
10
20
30
40
1
1
1
2
r
A C
α
α
β
β
1
1
1
2
A C
α
β
β
1
1
2
∏A,C (r) =
Lesson 9 / Page 14AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Union Operationn Notation: r ∪ sn Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}n For r ∪ s to be valid.
1. r, s must have the same arity (same number of attributes)2. The attribute domains must be compatible
(example: 2nd column of r deals with the same type of values as does the 2nd column of s)
n Example: to find all customers with either an account or a loan
∏customer_name (depositor) ∪ ∏customer_name (borrower)
A B
α
α
β
1
2
1r
A B
α
β
2
3
s
Relations r, s:r ∪ s:
A B
α
α
β
β
1
2
1
3
Lesson 9 / Page 15AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Set Difference Operationn Notation r – sn Defined as:
r – s = {t | t ∈ r and t ∉ s}
n Set differences must be taken between compatiblerelations.l r and s must have the same arityl attribute domains of r and s must be compatible
A B
α
α
β
1
2
1r
A B
α
β
2
3
s
Relations r, s:
A B
α
β
1
1
r – s:
Lesson 9 / Page 16AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Cartesian-Product Operationn Notation r x sn Defined as:
r x s = {t q | t ∈ r and q ∈ s}
n Assume that attributes of r(R) and s(S) are disjoint l That is, R ∩ S = ∅.
n Can build expressions using multiple operationsn If attributes of r(R) and s(S) are not disjoint, then renaming
must be used.A B
ααααββββ
11112222
C D
αββγαββγ
1010201010102010
E
aabbaabb
r x s:Relations r, s:
A B
αβ
12
r
C D
αββγ
10102010
E
aabb
s
Caution: May generate HUGE tables
Lesson 9 / Page 17AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Composition of Operationsn Building operations by composing several together
A B C D E
αββ
122
αββ
101020
aab
σA=C(r x s):
A B
ααααββββ
11112222
C D
αββγαββγ
1010201010102010
E
aabbaabb
r x s:
Lesson 9 / Page 18AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Rename Operationn Not a “true relational algebra” operationn Allows us to name, and therefore to refer to, the results of
relational-algebra expressions.n Allows us to refer to a relation by more than one name.n Example:
returns the expression E under the name Xn If a relational-algebra expression E has arity n, then
returns the result of expression E under the name X, and with the attributes renamed to A1 , A2 , …., An .
)(),...,,( 21E
nAAAXρ
( )EXρ
Lesson 9 / Page 19AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Banking Examplen Relationsl branch(branch_name, branch_city, assets)l customer(customer_name, customer_street, customer_city)l account(account_number, branch_name, balance)l loan(loan_number, branch_name, amount)l depositor(customer_name, account_number)l borrower(customer_name, loan_number)
n Example Queriesl Find all loans of over $1200
l Find the loan number for each loan amounting over $1200
l Find the names of all customers who have an account at the Redwood branch
)(1200 loanamount>σ
))(( 1200_ loanamountnumberloan >Π σ
)))((
( Redwood""__
loandepositorberccount_num account.amber account_nudepositor.
namebranchnamecustomer
×
Π
=
=
σ
σ
Lesson 9 / Page 20AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Banking Example (cont.)n Example Queries (cont.)
l Find the names of all customers who have a loan at the Redwoodbranch but do not have an account at any branch of the bank
l Find the names of all customers who have a loan at the Redwoodbranch4 Possibility No. 1
4 Possibility No. 2
)()))((
( Redwood""
depositorloanborrower
amecustomer_n
_number loan.loan oan_numberborrower.l
ebranch_namamecustomer_n
Π−×
Π=
=σ
σ
)))((( Redwood""__
loanborrower_number loan.loan oan_numberborrower.l
namebranchnamecustomer
×
Π
=
=
σ
σ
))))((((
Redwood""_
_
loanborrowernamebranch
_number loan.loan oan_numberborrower.lnamecustomer
×
Π
=
=
σ
σ
Lesson 9 / Page 21AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Banking Example (cont.)n Example Queries (use of rename)
l Find the largest account balancel Strategy:
4 Find those balances that are not the largest4 Rename account relation as temp so that we can compare each
account balance with all others4 Use set difference to find those account balances that were not found in
the earlier step. l The query is:
( )( )( )( )accountaccount
account
tempnce temp.balalance account.ba
balanceaccountbalance
ρσ ×Π−Π
<
.
Lesson 9 / Page 22AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Formal Definitionn A basic expression in the relational algebra consists of
either one of the following:l A relation in the databasel A constant relation
n Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions:l E1 ∪ E2l E1 – E2l E1 x E2l σP (E1), P is a predicate on attributes in E1l ∏s(E1), S is a list consisting of some of the attributes in E1l ρ x (E1), x is the new name for the result of E1
Lesson 9 / Page 23AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Additional Operations
n We define additional operations that do not add any power to the relational algebra, but that simplify common queries.l Set intersectionl Natural joinl Divisionl Assignment
Lesson 9 / Page 24AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Set-Intersection Operationn Notation: r ∩ sn Defined as:
r ∩ s = { t | t ∈ r and t ∈ s }n Assume:
l r, s have the same arityl attributes of r and s are compatible
n Note: r ∩ s = r – (r – s)
A B
ααβ
121
r
A B
αβ
23
s
Relations r, s:A B
α 2r ∩ s:
Lesson 9 / Page 25AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Natural-Join Operationn Notation: r ⋈ sn Let r and s be relations on schemas R and S respectively.
Then, r⋈ s is a relation on schema R ∪ S obtained as follows:l Consider each pair of tuples tr from r and ts from s. l If tr and ts have the same value on each of the attributes in R ∩ S,
add a tuple t to the result, where4 t has the same value as tr on r4 t has the same value as ts on s
n The result of the natural join is the set of all combinations oftuples in R and S that are equal on their common attribute names
n Example:R = (A, B, C, D)S = (E, B, D)l Result schema = (A, B, C, D, E)l r⋈ s is defined as:
( )( )srDsDrBsBrEsDrCrBrAr ×Π =∧= .....,.,.,.,. σ
Lesson 9 / Page 26AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
n Relations r, s: r⋈s:
n Practical example
Natural Join Operation – ExampleA B
ααααδ
11112
C D
ααγγβ
aaaab
E
αγαγδ
A B
αβγαδ
12412
C D
µγβγβ
aabab
r
B
13123
D
aaabb
E
αβγδ∈
s
Employee Name EmplId DeptName Harry 1235 Finance Sally 2241 Sales Joe 3401 Finance
Harriet 2202 Production
Dept DeptName Manager
Finance George Sales Harrald
Production Charles
Employee ⋈ Dept Name EmplId DeptName Manager Harry 1235 Finance George Sally 2241 Sales Harrald Joe 3401 Finance George
Harriet 2202 Production Charles
Lesson 9 / Page 27AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Division Operationn Notation: r ÷ sn Suited to queries that include the phrase “for all”.n Let r and s be relations on schemas R and S respectively
wherel R = (A1, …, Am , B1, …, Bn ) and S = (B1, …, Bn)The result of r ÷ s is a relation on schema R – S = (A1, …, Am)
r ÷ s = { t | t ∈ ∏ R-S (r) ∧ ∀ u ∈ s (tu ∈ r) } where tu means the concatenation of tuples t and u to produce a single tuple
n Property l Let q = r ÷ s
4 Then q is the largest relation satisfying q x s ⊆ rn Definition in terms of the basic algebra operation
l Let r(R) and s(S) be relations, and let S ⊆ Rr ÷ s = ∏R-S (r ) – ∏R-S ( ( ∏R-S (r ) x s ) – ∏R-S,S(r ))
l To see why4 ∏R-S,S (r) simply reorders attributes of r4 ∏R-S (∏R-S (r) x s ) – ∏R-S,S(r)) gives those tuples t in
∏R-S (r) such that for some tuple u ∈ s, tu ∉ r
Lesson 9 / Page 28AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
n Relations r, s: r ÷s:
n Practical example
Division Operation – ExampleA
αβ
A B
αααβγδδδ∈∈β
12311134612
r
B
1
2
s
Reports_to Name Manager Harry George Sally Harrald Joe George
Harriet Charles
Boss Manager
George Charles
Reports_to ÷ Boss Name Harry
Joe Harriet
Lesson 9 / Page 29AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Assignment Operationn The assignment operation (←) provides a convenient way
to express complex queries. l Write query as a sequential program consisting of
4 a series of assignments 4 followed by an expression whose value is displayed as a result of the
query.l Assignment must always be made to a temporary relation variable.
n Example: Write r ÷ s as temp1 ← ∏R-S (r )temp2 ← ∏R-S ((temp1 x s ) – ∏R-S,S (r ))result = temp1 – temp2
l The result to the right of the ← is assigned to the relation variable on the left of the ←.
l May use variable in subsequent expressions
Lesson 9 / Page 30AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Bank Example Queriesn Find the names of all customers who simultaneously have
a loan and an account at bank∏customer_name (borrower) ∩ ∏customer_name (depositor)
n Find the name of all customers who have a loan at the bank and the loan amount
∏customer_name, loan_number, amount (borrower ⋈ loan)n Find all customers who have an account from at least the
“Downtown” and the "Uptown” branchesl Possibility 1
∏customer_name (σbranch_name = “Downtown” (depositor⋈ account)) ∩∏customer_name (σbranch_name = “Uptown” (depositor⋈ account))
l Possibility 2∏customer_name, branch_name (depositor⋈ account)
÷ ρtemp(branch_name) ({(“Downtown”), (“Uptown”)})4 Note, that this version uses a "constant relation"
Lesson 9 / Page 31AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Extended Relational-Algebra-Operations
n Generalized Projectionn Aggregate Functionsn Outer Join
Lesson 9 / Page 32AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Generalized Projectionn Extends the projection operation by allowing arithmetic
functions to be used in the projection list
n E is any relational-algebra expressionn Each of F1, F2, …, Fn are are arithmetic expressions
involving constants and attributes in the schema of E.n Is used to compute ‘derived’ (calculated) attributesn Given relation
credit_info(customer_name, limit, credit_balance),find how much more each person can spend:
∏customer_name, limit – credit_balance (credit_info)
)(,,, 21E
nFFF L∏
Lesson 9 / Page 33AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Aggregate Functions and Operationsn Aggregate function takes a collection of values and
returns a single value as a result.avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values
n Aggregate operation in relational algebra
E is any relational-algebra expressionl G1, G2, …, Gn is a list of attributes on which to group (can be empty)l Each Fi is an aggregate functionl Each Ai is an attribute name
)()(,,(),(,,, 221121E
nnn AFAFAFGGG KK ϑ
Lesson 9 / Page 34AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Aggregate Operation – Example
n Relation r: ϑsum(C)(r):
n Relation account grouped by branch_name:
n branch_name ϑ sum(balance)(account):
A B
ααββ
αβββ
C
77310
sum(c )
27
branch_name account_number balancePerryridgePerryridgeBrightonBrightonRedwood
A-102A-201A-217A-215A-222
400900750750700
branch_name sum(balance)PerryridgeBrightonRedwood
13001500700
Lesson 9 / Page 35AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Outer Joinn An extension of the join operation that avoids loss of
information.n Computes the join and then adds tuples form one relation
that does not match tuples in the other relation to the result of the join.
n Uses null values:l null signifies that the value is unknown or does not exist l All comparisons involving null are (roughly speaking) false by
definition.4 We shall study precise meaning of comparisons with nulls later
Lesson 9 / Page 36AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Outer Join – Exampleloan
loan_number branch_name amount L-170 Downtown 3000 L-230 Redwood 4000 L-260 Perryridge 1700
borrower customer_name loan_number
Jones L-170 Smith L-230 Hayes L-155
loan ⋈ borrower loan_number branch_name amount customer_name
L-170 Downtown 3000 Jones L-230 Redwood 4000 Smith
Natural join
loan ⊐⋈ borrower loan_number branch_name amount customer_name
L-170 Downtown 3000 Jones L-230 Redwood 4000 Smith L-260 Perryridge 1700 null
Left outer join
loan ⋈⊏ borrower loan_number branch_name amount customer_name
L-170 Downtown 3000 Jones L-230 Redwood 4000 Smith L-155 null null Hayes
Right outer join
loan ⊐⋈⊏ borrower loan_number branch_name amount customer_name
L-170 Downtown 3000 Jones L-230 Redwood 4000 Smith L-260 Perryridge 1700 null L-155 null null Hayes
Full outer join
Lesson 9 / Page 37AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Null Valuesn It is possible for tuples to have a null value, denoted by
null, for some of their attributesn null signifies an unknown value or that a value does not
exist.n The result of any arithmetic expression involving null is
null.n Aggregate functions simply ignore null valuesn For duplicate elimination and grouping, null is treated
like any other value, and two nulls are assumed to be the same
Lesson 9 / Page 38AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Null Values (cont.)n Comparisons with null values return the special logical
value: unknownl If false was used instead of unknown, then not (A < 5)
would not be equivalent to A >= 5n Three-valued logic using the truth value unknown:
l OR: (unknown or true) = true, (unknown or false) = unknown(unknown or unknown) = unknown
l AND: (true and unknown) = unknown, (false and unknown) = false,(unknown and unknown) = unknown
l NOT: (not unknown) = unknownn Result of select predicate is treated as false if it evaluates
to unknown
Lesson 9 / Page 39AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Modification of the Databasen The content of the database may be modified using the
following operations:l Deletionl Insertionl Updating
n All these operations are expressed using the assignment operator.
Lesson 9 / Page 40AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Deletionn A delete request is expressed similarly to a query, except
instead of displaying tuples to the user, the selected tuples are removed from the database.
n Can delete only whole tuples; cannot delete values on only particular attributes
n A deletion is expressed in relational algebra by:r ← r – E
where r is a relation and E is a relational algebra queryn Examples
l Delete all account records in the Perryridge branch
l Delete all loan records with amount in the range of 0 to 50
account ← account – σ branch_name = “Perryridge” (account )
loan ← loan – σamount ≥ 0 and amount ≤ 50(loan)
Lesson 9 / Page 41AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
r1 ← (σbranch_name = “Perryridge” (borrower ⋈ loan))account ← account ∪ ∏loan_number, branch_name, 200 (r1)depositor ← depositor ∪ ∏customer_name, loan_number (r1)
Insertionn To insert data into a relation, we either:
l specify a tuple to be insertedl write a query whose result is a set of tuples to be inserted
n In relational algebra, an insertion is expressed by:r ← r ∪ E
where r is a relation and E is a relational algebra expression.
n The insertion of a single tuple is expressed by letting E be a constant relation containing one tuple
n Example:l Insert information in the database specifying that Smith has $1200
in account A-973 at the Perryridge branch
l Provide as a gift for all loan customers in the Perryridge branch, a $200 savings account. Let the loan number serve as the account number for the new savings account.
account ← account ∪ {(“A-973”, “Perryridge”, 1200)}depositor ← depositor ∪ {(“Smith”, “A-973”)}
Lesson 9 / Page 42AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Updatingn A mechanism to change a value in a tuple without
charging all values in the tuplen Use the generalized projection operator to do this task
n Each Fi is either l the I th attribute of r, if the I th attribute is not updated, or,l if the attribute is to be updated Fi is an expression, involving only
constants and the attributes of r, which gives the new value for the attribute
n Examplesl Make interest payments by increasing all balances by 5 %
l Pay all accounts with balances over $10,000 6 % interest and payall others 5 %
)(,,,, 21rr
lFFF K∏←
account ← ∏ account_number, branch_name, balance * 1.05 (account)
account ← ∏ account_number, branch_name, balance * 1.06 (σ BAL > 10000 (account ))∪ ∏ account_number, branch_name, balance * 1.05 (σBAL ≤ 10000 (account))
Lesson 9, part 2Structured Query Language (SQL)
Lesson 9 / Page 44AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Create Table Constructn An SQL relation is defined using the create table
command:create table r (A1 D1, A2 D2, ..., An Dn,
(integrity-constraint1), ...,(integrity-constraintk))
l r is the name of the relationl each Ai is an attribute name in the schema of relation rl Di is the data type of values in the domain of attribute Ai
n Integrity constraints in create tablel not nulll primary key(A1, ..., AL )
n Example:create table branch( branch_name char(15) not null,
branch_city char(30),assets integer,primary key(branch_name)
)n Note:
l SQL names are case insensitive (i.e., you may use upper- or lower-case letters); e.g.: Branch_Name ≡ BRANCH_NAME ≡branch_name
Lesson 9 / Page 45AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Basic Query Structure n SQL is based on set and relational operations with certain
modifications and enhancementsn A typical SQL query has the form:
select A1, A2, ..., Anfrom R1, R2, ..., Rmwhere Pl Ai represents an attributel Ri represents a relationl P is a predicate.
n This query is equivalent to the relational algebra expression
n The result of an SQL query is a relationn Important remark:
l SQL is a declarative (query) language while relational algebra is procedural
l Mapping SQL queries to relational expressions converts declarative queries to procedures
l Query execution will use procedures implementing relation algebra operations
))(( 21,,, 21 mPAAA RRRn
×××∏ KK σ
Lesson 9 / Page 46AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
The select clausen The select clause lists the attributes desired in the
result of a queryl corresponds to the projection operation of the relational algebra
n Example: l Find the names of all branches in the loan relation:
select branch_name from loanl In the relational algebra, the query would be:
∏branch_name (loan)n SQL allows duplicates in relations as well as in query
resultsl This violates relational model assumptions but may speed-up
processingn To force the elimination of duplicates, insert the keyword distinct after select.l Find the names of all branches in the loan relations, and remove
duplicatesselect distinct branch_name from loan
l The keyword all specifies that duplicates not be removedselect all branch_name from loan
Lesson 9 / Page 47AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
The select clause (cont.)n An asterisk in the select clause denotes “all attributes”
select * from loann The select clause can contain arithmetic expressions
involving the operation, +, –, ∗, and /, and operating on constants or attributes of tuples
n The queryselect loan_number, branch_name, amount ∗ 100from loan
would return a relation that is the same as the loan relation, except that the value of the attribute amount is multiplied by 100l This is, in fact, the generalized projection
Πloan_number, branch_name, amount ∗ 100(loan)
Lesson 9 / Page 48AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
The where clausen The where clause specifies conditions that the result
must satisfyl Corresponds to the selection predicate of the relational algebra.
n Examplel Find all loan numbers for loans made at the Perryridge branch
with loan amounts greater than $1200.select loan_number
from loanwhere branch_name = 'Perryridge' and amount > 1200
n Comparisonl results can be combined using the logical connectives and, or,
and not.l Comparisons may be applied to results of arithmetic expressions.l SQL includes a between comparison operator
4 Example: Find the loan number of those loans with loan amounts between $90,000 and $100,000 (that is, ≥ $90,000 and ≤ $100,000)select loan_number from loan
where amount between 90000 and 100000which maps toΠloan_number(σ(amount ≥ 90000)∧(amount ≤ 100000)(loan))
Lesson 9 / Page 49AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
The from clausen The from clause lists the relations involved in the query
l Corresponds to the Cartesian product operation of the relationalalgebra
l Find the Cartesian product borrower x loanselect ∗ from borrower, loan
l Find the name, loan number and loan amount of all customers having a loan at the Brighton branch
select customer_name, borrower.loan_number, amountfrom borrower, loanwhere borrower.loan_number = loan.loan_number and
branch_name = 'Brighton' corresponds to
Π customer_name, borrower.loan_number, amount (σ borrower.loan_number = loan.loan_number ∧ branch_name = 'Brighton‘
(borrower x loan))
Lesson 9 / Page 50AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
The Rename Operationn The SQL allows renaming relations and attributes using the
as clause:old-name as new-name
l Find the name, loan number and loan amount of all customers; rename the column name loan_number as loan_id
n Home work:l Rewrite this query to relational expression
select customer_name, borrower.loan_number as loan_id, amountfrom borrower, loanwhere borrower.loan_number = loan.loan_number
Lesson 9 / Page 51AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Tuple Variablesn Tuple variables are defined in the from clause via the use
of the as clausen Example
l Find the customer names and their loan numbers for all customershaving a loan at some branchselect customer_name, T.loan_number, S.amount
from borrower as T, loan as Swhere T.loan_number = S.loan_number
l Find the names of all branches that have greater assets than some branch located in Brooklynselect distinct T.branch_name
from branch as T, branch as Swhere T.assets > S.assets and S.branch_city = 'Brooklyn'
Lesson 9 / Page 52AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
SQL allows duplicatesn Multiset versions of some of the relational algebra
operators – given multiset relations r1 and r2:l σθ (r1): If there are c1 copies of tuple t1 in r1, and t1 satisfies
selections σθ,, then there are c1 copies of t1 in σθ (r1).l ΠA (r ): For each copy of tuple t1 in r1, there is a copy of tuple ΠA
(t1) in ΠA (r1) where ΠA (t1) denotes the projection of the single tuple t1.
l r1 x r2: If there are c1 copies of tuple t1 in r1 and c2 copies of tuple t2in r2, there are c1 ∗ c2 copies of the tuple t1…t2 in r1 x r2
n Example: l Suppose multiset relations r1 (A, B) and r2 (C) are as follows:
r1 = {(1, a) (2,a)} r2 = {(2), (3), (3)}l Then ΠB(r1) would be {(a), (a)}, while ΠB(r1) x r2 would be
{(a,2), (a,2), (a,3), (a,3), (a,3), (a,3)}n SQL duplicate semantics:
l select A1,, A2, ..., An from r1, r2, ..., rm where Pis equivalent to the multiset version of the expression:
))(( 21,,, 21 mPAAA rrrn
×××∏ KK σ
Lesson 9 / Page 53AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Set Operationsn The set operations union, intersect, and except
operate on relations and correspond to the relational algebra operations ∪, ∩, −
n Find all customers who have a loan, an account, or both:
n Find all customers who have both a loan and an account:
n Find all customers who have an account but no loan
(select customer_name from depositor)union(select customer_name from borrower)
(select customer_name from depositor)intersect(select customer_name from borrower)
(select customer_name from depositor)except(select customer_name from borrower)
Lesson 9 / Page 54AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Aggregate Functions in SQLn These functions operate on the multiset of values of a
column of a relation, and return a valueavg average valuemin minimum valuemax maximum valuesum sum of valuescount number of values
n Find the average account balance at the Perryridge branch
n Find the number of depositors in the bank
select avg (balance)from accountwhere branch_name = 'Perryridge'
select count (distinct customer_name)from depositor
Lesson 9 / Page 55AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Null Valuesn It is possible for tuples to have a null value, denoted by
null, for some of their attributesn null signifies an unknown value or that a value does not
exist.n The predicate is null can be used to check for null
values.l Example: Find all loan number which appear in the loan relation
with null values for amount.select loan_number from loan
where amount is nulln The result of any arithmetic expression involving null is null
l Example: 5 + null returns nulln However, aggregate functions simply ignore nullsn Any comparison with null returns unknown
l Example: 5 < null or null <> null or null = nulln Three-valued logic using the truth value unknown is the
same as above for relationsn “P is unknown” evaluates to true if predicate P evaluates
to unknown
Lesson 9 / Page 56AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Nested Subqueries
n SQL provides a mechanism for the nesting of queriesn A subquery is a select-from-where expression that is
nested within another queryn A common use of subqueries is to perform tests for set
membership, set comparisons, and set cardinalityn Example:
l Find all customers who have both an account and a loan at the bank
select distinct customer_namefrom borrowerwhere customer_name in (select customer_name
from depositor )
Lesson 9 / Page 57AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Viewsn In some cases, it is not desirable for all users to see the
entire logical modell that is, all the actual relations stored in the database
n Consider a person who needs to know a customer’s name, loan number and branch name, but has no need to see the loan amount. This person should see a relation described, in SQL, by
(select customer_name, borrower.loan_number, branch_namefrom borrower, loanwhere borrower.loan_number = loan.loan_number )
n A view provides a mechanism to hide certain data from the view of certain users. l Any relation that is not of the conceptual model but is made visible
to a user as a “virtual relation” is called a view.n A view is defined using the create view statement which
has the formcreate view v as < query expression >
The view name is represented by v.l Once a view is defined, the view name can be used to refer to the
virtual relation that the view generates
Lesson 9 / Page 58AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Modification of the Databasen Deletion
l Statement is delete-from-where with the arguments similar to the select-from-where construct
l Delete all account tuples at the Brighton branchdelete from account where branch_name = ‘Brighton‘
n Insertionl Statement is: insert into relation values
<compatible_relation>l Add a new tuple to accountinsert into account (branch_name, balance, account_number)
values ('Perryridge', 1200, 'A-9732')n Updates
l Statement is: update relation set attribute = expression where condition
l Add 6% to accounts over $1000update account set balance = balance∗1.06 where balance>1000
Lesson 9 / Page 59AE3B33OSD Silberschatz, Korth, Sudarshan S. ©2007
Joined Relationsn Join operations take two relations and return as a result
another relation.n These additional operations are typically used as subquery
expressions in the from clausen Join condition – defines which tuples in the two relations
match, and what attributes are present in the result of the join.
n Join type – defines how tuples in each relation that do not match any tuple in the other relation (based on the join condition) are treated.
n Completely based on relational-algebra joins. SQL syntax described in the SQL standards
n Examplel Find all customers who have either an account or a loan (but not
both) at the bankselect customer_name
from (depositor full outer join borrower )where account_number is null or loan_number is null
End of Lesson 9