1

Relational Algebra & Calculus

Yanlei Diao UMass Amherst

Slides Courtesy of R. Ramakrishnan and J. Gehrke

2

Outline

v  Conceptual Design: ER model

v  Logical Design: ER to relational model

v  Querying and manipulating data

§  Practical language: SQL

• Declarative: say “what you want”, not “how to get it”

§  Formal languages: Relational Algebra and Calculus

• Mathematical foundation for query processing

3

What is an “Algebra”?

v  A mathematical system consisting of:

§  Operands: variables or values from which new values can be constructed.

§  Operators: symbols denoting procedures that construct new values from given values.

4

What is “Relational Algebra”?

v  Relational Algebra:

§  Operands are relations.

§  Each operator takes 1 or 2 relations and produces a relation.

v  Closure property: relational algebra is closed under the relational model, i.e., the result of every operator is again a relation.

§  Relational operators can be arbitrarily composed!

5

Relational Algebra

v  Basic operations:

§  Selection ( σ ): selects a subset of rows from a relation.

§  Projection ( π ): deletes unwanted columns from a relation.

§  Cross-product ( × ): allows us to combine two relations.

§  Set-difference ( - ): tuples in relation 1, but not in relation 2.

§  Union ( ∪ ): tuples in relation 1 or in relation 2.

v  Additional operations:

§  Join ( ⋈ ), Intersection ( ∩ ), Division ( / ), Renaming ( ρ )

§  Can be derived from basic operators. Not essential, but useful for writing simpler queries.

6

Example Instances

R1 (an instance of Reserves)

sid  bid  day
22   101  10/10/96
58   103  11/12/96

S1 (an instance of Sailors)

sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2 (an instance of Sailors)

sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

7

Projection

$\pi_{sname,\ rating}(S2)$

sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10

v  Retain only attributes in the projection list; delete others.

v  Schema of result contains exactly the fields in projection list.

Input: S2 (see Example Instances).

8

Projection (contd.)

$\pi_{age}(S2)$

age
35.0
55.5

v  Projection operator has to eliminate duplicates!

§  Real systems typically don’t do duplicate elimination unless the user explicitly asks for it. (Why not?)

Input: S2 (see Example Instances).
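The difference between set and bag semantics for projection can be sketched in Python. The list-of-dicts representation and the helper names `project_bag` and `project` are illustrative assumptions, not part of the slides; the data is the S2 instance.

```python
# Assumed representation: a relation is a list of dicts, one dict per tuple.
S2 = [
    {"sid": 28, "sname": "yuppy", "rating": 9, "age": 35.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 44, "sname": "guppy", "rating": 5, "age": 35.0},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]


def project_bag(relation, fields):
    """Keep only the listed fields; duplicates survive (bag semantics)."""
    return [tuple(row[f] for f in fields) for row in relation]


def project(relation, fields):
    """Relational-algebra projection: keep the listed fields, drop duplicates."""
    return set(project_bag(relation, fields))


ages_bag = project_bag(S2, ["age"])  # one entry per input row
ages_set = project(S2, ["age"])      # distinct values only
```

Building the set costs extra work (hashing or sorting every row), which is one plausible answer to the "Why not?" on the slide.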

9

Selection

$\sigma_{rating > 8}(S2)$

sid  sname  rating  age
28   yuppy  9       35.0
58   rusty  10      35.0

v  Select rows that satisfy the selection condition; discard others.

v  Schema of result identical to schema of input.

Input: S2 (see Example Instances).

10

Selection (contd.)

$\sigma_{(rating > 5 \,\lor\, rating < 5) \,\land\, age < 50}(S2)$

Input: S2 (see Example Instances).
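A selection with a compound condition can be sketched as a filter over rows. The `select` helper and the list-of-dicts representation are assumptions made for illustration; the predicate is the condition from this slide, applied to S2.

```python
# Assumed representation: a relation is a list of dicts, one dict per tuple.
S2 = [
    {"sid": 28, "sname": "yuppy", "rating": 9, "age": 35.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 44, "sname": "guppy", "rating": 5, "age": 35.0},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]


def select(relation, predicate):
    """Keep the rows that satisfy the predicate; the schema is unchanged."""
    return [row for row in relation if predicate(row)]


# (rating > 5 OR rating < 5) AND age < 50
result = select(
    S2,
    lambda r: (r["rating"] > 5 or r["rating"] < 5) and r["age"] < 50,
)
selected_sids = [r["sid"] for r in result]
```

With this data, lubber fails the age test and guppy fails the rating test, so only the rows for sid 28 and 58 remain.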

11

A Simple Example of Composition

$\sigma_{rating > 8}(S2)$

sid  sname  rating  age
28   yuppy  9       35.0
58   rusty  10      35.0

$\pi_{sname,\ rating}(\sigma_{rating > 8}(S2))$

sname  rating
yuppy  9
rusty  10

v  Composition: result relation of an operator can be the input to another operator.
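Because every operator returns a relation, operators nest like ordinary function calls. A minimal sketch, assuming relations are lists of dicts and using hypothetical helpers `select` and `project`:

```python
# Assumed representation: a relation is a list of dicts, one dict per tuple.
S2 = [
    {"sid": 28, "sname": "yuppy", "rating": 9, "age": 35.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 44, "sname": "guppy", "rating": 5, "age": 35.0},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]


def select(relation, predicate):
    """Selection: keep rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, fields):
    """Projection with duplicate elimination; the result is again a relation."""
    seen, out = set(), []
    for row in relation:
        key = tuple(row[f] for f in fields)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(fields, key)))
    return out


# pi_{sname, rating}(sigma_{rating > 8}(S2))
result = project(select(S2, lambda r: r["rating"] > 8), ["sname", "rating"])
```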

12

Union, Intersection, Set-Difference

v  Set operations:

§  Union ( ∪ )

§  Intersection ( ∩ )

§  Set difference ( - )

v  Take two input relations, which must be union-compatible:

§  Same number of fields.

§  Corresponding fields have the same type.

v  What is the schema of result?

Inputs: S1 and S2 (see Example Instances).

13

Example Set Operations

$S1 \cup S2$

sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0

Duplicate elimination: remove tuples that have the same values in all attributes.

Inputs: S1 and S2 (see Example Instances).

14

Example Set Operations

$S1 \cap S2$

sid  sname   rating  age
31   lubber  8       55.5
58   rusty   10      35.0

$S1 - S2$

sid  sname   rating  age
22   dustin  7       45.0

Inputs: S1 and S2 (see Example Instances).

Duplicates?
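The three set operations map directly onto Python's built-in set operators when each tuple is stored as a Python tuple. This is only a sketch under that assumed representation; the data is S1 and S2 from the example instances.

```python
# Assumed representation: a relation is a set of (sid, sname, rating, age) tuples.
S1 = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

union = S1 | S2          # tuples in S1 or in S2, duplicates removed
intersection = S1 & S2   # tuples in both S1 and S2
difference = S1 - S2     # tuples in S1 but not in S2
```

Since the inputs are sets, intersection and difference cannot introduce duplicates; only union needs duplicate elimination, which the set type performs automatically.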

15

Cross (Cartesian) Product

v  S1 × R1: Each row of S1 is paired with each row of R1.

S1 × R1

(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  22     101  10/10/96
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  22     101  10/10/96
31     lubber  8       55.5  58     103  11/12/96
58     rusty   10      35.0  22     101  10/10/96
58     rusty   10      35.0  58     103  11/12/96

Inputs: S1 and R1 (see Example Instances).

16

Cross-Product (contd.)

v  Result schema inherits all fields of S1 and R1.

§  Conflict: Both S1 and R1 have a field called sid.

•  Renaming operator: $\rho(C(1 \to sid1,\ 5 \to sid2),\ S1 \times R1)$ names the result C and renames its fields 1 and 5 to sid1 and sid2.

What if we want to find pairs such that S1.sid = R1.sid?
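The cross-product and the positional renaming can be sketched as follows. The tuple representation and the helper names `cross` and `rename_positions` are assumptions for illustration only.

```python
from itertools import product

# Assumed representation: tuples for rows, a separate tuple of field names.
S1_fields = ("sid", "sname", "rating", "age")
R1_fields = ("sid", "bid", "day")
S1 = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
R1 = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]


def cross(r, s):
    """Pair every row of r with every row of s."""
    return [a + b for a, b in product(r, s)]


def rename_positions(fields, renames):
    """Rename fields by 1-based position, e.g. {1: 'sid1', 5: 'sid2'}."""
    out = list(fields)
    for position, name in renames.items():
        out[position - 1] = name
    return tuple(out)


C = cross(S1, R1)
C_fields = rename_positions(S1_fields + R1_fields, {1: "sid1", 5: "sid2"})
```

After renaming, the two sid columns can be told apart, which is what a later selection on them needs.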

17

Join

v  Condition (theta) Join:

$R \bowtie_{\theta} S = \sigma_{\theta}(R \times S)$

§  Result schema same as that of cross-product.

§  But often fewer tuples than the cross-product, so it can usually be computed more efficiently.

$S1 \bowtie_{S1.sid < R1.sid} R1$

(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  58     103  11/12/96
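The definition of the condition join as a selection over a cross-product translates almost literally into code. A minimal sketch, assuming rows are tuples and `theta_join` is a hypothetical helper:

```python
from itertools import product

# Assumed representation: rows are tuples; sid is field 0 in both relations.
S1 = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
R1 = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]


def theta_join(r, s, theta):
    """sigma_theta(r x s): keep the pairs of rows for which theta holds."""
    return [a + b for a, b in product(r, s) if theta(a, b)]


# S1 joined with R1 on the condition S1.sid < R1.sid
result = theta_join(S1, R1, lambda s_row, r_row: s_row[0] < r_row[0])
```

This naive version still enumerates every pair; a real system would try to avoid materializing the full cross-product.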

18

Join

v  Equi-Join: A special case of condition join where the condition θ contains only equalities.

§  Result schema contains only one copy of fields for which equality is specified.

v  Natural Join ( R ⋈ S ): equi-join on all common fields.

$S1 \bowtie_{sid} R1$

sid  sname   rating  age   bid  day
22   dustin  7       45.0  101  10/10/96
58   rusty   10      35.0  103  11/12/96
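A natural join can be sketched by matching rows on whatever field names they share and keeping one copy of each shared field. The list-of-dicts representation and the `natural_join` helper are illustrative assumptions.

```python
# Assumed representation: a relation is a list of dicts, one dict per tuple.
S1 = [
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]
R1 = [
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
]


def natural_join(r, s):
    """Equi-join on all common fields; each common field appears once."""
    out = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[f] == b[f] for f in common):
                out.append({**a, **b})  # merging keeps a single copy of sid
    return out


result = natural_join(S1, R1)
```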

19

Self-Join

S (both copies, renamed S1 and S2, hold these tuples)

sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

$\sigma_{S1.rating > S2.rating \,\land\, S1.age < S2.age}(\rho(S1, S) \times \rho(S2, S))$
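A self-join needs two differently named copies of the same relation so that the condition can refer to each side. A sketch, where the `rename` helper and the `name.field` key convention are assumptions:

```python
# Assumed representation: a relation is a list of dicts, one dict per tuple.
S = [
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]


def rename(relation, name):
    """rho(name, relation): qualify every field with the new relation name."""
    return [{f"{name}.{k}": v for k, v in row.items()} for row in relation]


# Pairs where the first sailor has a higher rating but a lower age.
pairs = [
    {**a, **b}
    for a in rename(S, "S1")
    for b in rename(S, "S2")
    if a["S1.rating"] > b["S2.rating"] and a["S1.age"] < b["S2.age"]
]
name_pairs = [(p["S1.sname"], p["S2.sname"]) for p in pairs]
```

On this instance only rusty (rating 10, age 35.0) is both higher rated and younger than another sailor.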

20

Example Schema

v  Sailors(sid: integer, sname: string, rating: integer, age: real)

v  Boats(bid: integer, color: string)

v  Reserves(sid: integer, bid: integer, day: date)

21

Find names of sailors who’ve reserved boat #103

v  How many relations do we need to access?

§  If more than 1, use cross-products or joins.

v  Filtering condition? Use selection.

v  Attributes to return? Use projection.

22

Find names of sailors who’ve reserved boat #103

v  Solution 1: $\pi_{sname}((\sigma_{bid = 103}\ Reserves) \bowtie Sailors)$

v  Solution 2:

$\rho(Temp1,\ \sigma_{bid = 103}\ Reserves)$

$\rho(Temp2,\ Temp1 \bowtie Sailors)$

$\pi_{sname}(Temp2)$

v  Solution 3: $\pi_{sname}(\sigma_{bid = 103}(Reserves \bowtie Sailors))$

Algebraic equivalence!
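The equivalence of the three solutions can be checked on a small instance. This sketch reuses the S1 and R1 example instances as Sailors and Reserves; the helpers `select`, `project`, and `natural_join` are hypothetical and not part of the slides.

```python
# Assumed representation: a relation is a list of dicts, one dict per tuple.
Sailors = [
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]
Reserves = [
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
]


def select(relation, predicate):
    return [row for row in relation if predicate(row)]


def project(relation, fields):
    return {tuple(row[f] for f in fields) for row in relation}


def natural_join(r, s):
    return [
        {**a, **b}
        for a in r
        for b in s
        if all(a[f] == b[f] for f in a.keys() & b.keys())
    ]


def is_103(row):
    return row["bid"] == 103


# Solution 1: select first, then join.
sol1 = project(natural_join(select(Reserves, is_103), Sailors), ["sname"])

# Solution 2: the same plan with named intermediate results.
temp1 = select(Reserves, is_103)
temp2 = natural_join(temp1, Sailors)
sol2 = project(temp2, ["sname"])

# Solution 3: join first, then select.
sol3 = project(select(natural_join(Reserves, Sailors), is_103), ["sname"])
```

All three return the same answer, but Solution 3 joins every Reserves row before filtering, so it builds a larger intermediate result.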

23

Find names of sailors who’ve reserved a red boat

v  Boat color is only available in Boats; so need an extra join:

$\pi_{sname}((\sigma_{color = 'red'}\ Boats) \bowtie Reserves \bowtie Sailors)$

25

Find names of sailors who’ve reserved a red or a green boat

v  Can identify all red or green boats, then find sailors who’ve reserved one of these boats:

$\rho(Tempboats,\ (\sigma_{color = 'red' \,\lor\, color = 'green'}\ Boats))$

$\pi_{sname}(Tempboats \bowtie Reserves \bowtie Sailors)$

v  Can also define Tempboats using union. How?
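One way to see the union formulation: a selection with an OR condition returns the same tuples as the union of two single-condition selections. A small check, using a made-up Boats instance (the slides do not give one):

```python
# Hypothetical Boats instance as a set of (bid, color) tuples.
boats = {(101, "red"), (102, "green"), (103, "blue")}

# sigma_{color='red' OR color='green'}(Boats)
by_or = {b for b in boats if b[1] == "red" or b[1] == "green"}

# sigma_{color='red'}(Boats) UNION sigma_{color='green'}(Boats)
by_union = {b for b in boats if b[1] == "red"} | {b for b in boats if b[1] == "green"}
```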

26

Find names of sailors who’ve reserved a red boat and a green boat

v  A single selection won’t work. Why?

v  Instead, intersect sailors who’ve reserved red boats and sailors who’ve reserved green boats.

$\rho(Tempred,\ \pi_{sid}((\sigma_{color = 'red'}\ Boats) \bowtie Reserves))$

$\rho(Tempgreen,\ \pi_{sid}((\sigma_{color = 'green'}\ Boats) \bowtie Reserves))$

$\pi_{sname}((Tempred \cap Tempgreen) \bowtie Sailors)$

sid is a key for Sailors

Can we project onto name and then do the intersection?
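The question about projecting onto the name first can be explored with a small counterexample. The data below is invented for illustration (two different sailors share a name); dictionaries keyed by sid and bid are an assumed shortcut for the Sailors and Boats relations.

```python
# Hypothetical instance: two distinct sailors are both called "horatio".
sailors = {1: "horatio", 2: "horatio", 3: "dustin"}   # sid -> sname
boats = {101: "red", 102: "green"}                     # bid -> color
reserves = {(1, 101), (2, 102), (3, 101), (3, 102)}    # (sid, bid)


def sids_reserving(color):
    """pi_sid((sigma_{color=...} Boats) join Reserves)."""
    return {sid for sid, bid in reserves if boats[bid] == color}


tempred = sids_reserving("red")
tempgreen = sids_reserving("green")

# Intersect on sid (a key), then look up names.
correct = {sailors[sid] for sid in tempred & tempgreen}

# Project onto names first, then intersect.
names_first = {sailors[s] for s in tempred} & {sailors[s] for s in tempgreen}
```

Here `names_first` wrongly includes "horatio": one horatio reserved a red boat and a different horatio reserved a green one. Intersecting on sid avoids this because sid is a key.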

27

Find the names of sailors who’ve reserved all boats

28

Relational Calculus

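v  Query has the form:

$\{\, \langle x_1, x_2, \ldots, x_n \rangle \mid p(\langle x_1, x_2, \ldots, x_n \rangle) \,\}$

§  $\langle x_1, x_2, \ldots, x_n \rangle$: domain variables, or constants

§  $p(\langle x_1, x_2, \ldots, x_n \rangle)$: formula

v  Answer includes all tuples $\langle x_1, x_2, \ldots, x_n \rangle$ that make the formula $p(\langle x_1, x_2, \ldots, x_n \rangle)$ true.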

29

Formulas

v  Formula is recursively defined:

§  Atomic formulas: retrieving tuples from relations, or making comparisons of values

§  Logical connectives: ¬, ∧, ∨

§  Quantifiers: ∃, ∀ (optional)

o  ∀x F(x) is equivalent to ¬∃x ¬F(x)

30

Find names of sailors rated > 7 who have reserved boat #103

Relational Algebra:

$\pi_{sname}((\sigma_{bid = 103}\ Reserves) \bowtie (\sigma_{rating > 7}\ Sailors))$

Relational Calculus:

{Xsname | ∃ Xsid, Xrating, Xage (Sailors(Xsid, Xsname, Xrating, Xage) ∧ Xrating > 7 ∧ ∃ Xbid, Xday (Reserves(Xsid, Xbid, Xday) ∧ Xbid = 103)) }

v  Where is the join?

§  Use ∃ to find a tuple in Reserves that ‘joins with’ the Sailors tuple under consideration.
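The calculus query reads naturally as a Python set comprehension, with `any` playing the role of ∃. This is only a sketch; it reuses the S1 and R1 example instances as Sailors and Reserves and stores rows as tuples.

```python
# Assumed representation: rows are tuples in schema order.
Sailors = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
Reserves = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

# {Xsname | exists a Sailors tuple with rating > 7 for which
#           there exists a Reserves tuple with the same sid and bid = 103}
answer = {
    sname
    for (sid, sname, rating, age) in Sailors
    if rating > 7
    and any(r_sid == sid and bid == 103 for (r_sid, bid, day) in Reserves)
}
```

The shared variable `sid` in the inner test is where the join happens: the same value must appear in both the Sailors tuple and the Reserves tuple.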

31

Find names of sailors who’ve reserved all boats

{Xsname | ∃ Xsid, Xrating, Xage 〈Xsid, Xsname, Xrating, Xage〉∈Sailors ∧ ∀ 〈Xbid, Xcolor〉∈Boats (∃ Xday 〈Xsid, Xbid, Xday〉 ∈ Reserves) }

v  To find sailors who’ve reserved all red boats:

{Xsname | ∃ Xsid, Xrating, Xage 〈Xsid, Xsname, Xrating, Xage〉∈Sailors ∧ ∀ 〈Xbid, Xcolor〉∈Boats (Xcolor ≠ 'red' ∨ ∃ Xday 〈Xsid, Xbid, Xday〉 ∈ Reserves) }

Sailors(sid: integer, sname: string, rating: integer, age: real)

Boats(bid: integer, color: string)

Reserves(sid: integer, bid: integer, day: date)

32

Unsafe Queries, Expressive Power

v  Some syntactically correct calculus queries can have an infinite number of answers! Such queries are called unsafe.

§  e.g., $\{\, S \mid \neg (S \in Sailors) \,\}$

v  Equivalence result: every query that can be expressed in relational algebra can be expressed as a safe query in relational calculus; the converse is also true.

v  Relational completeness: a query language (e.g., SQL) is relationally complete if it can express every query that is expressible in relational algebra/calculus.

33

Find names of sailors who’ve reserved all boats

{Xsname | ∃ Xsid, Xrating, Xage 〈Xsid, Xsname, Xrating, Xage〉∈Sailors ∧ ∀ 〈Xbid, Xcolor〉∈Boats (∃ Xday 〈Xsid, Xbid, Xday〉 ∈ Reserves) }

{Xsname | ∃ Xsid, Xrating, Xage 〈Xsid, Xsname, Xrating, Xage〉∈Sailors ∧ ¬∃ 〈Xbid, Xcolor〉∈Boats (¬∃ Xday 〈Xsid, Xbid, Xday〉 ∈ Reserves) }
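The two formulations correspond to `all(...)` and `not any(not ...)` in Python. A small check on invented data (the slides give no Boats instance), with rows stored as tuples:

```python
# Hypothetical instance for illustration.
sailors = [(22, "dustin"), (31, "lubber")]       # (sid, sname)
boats = [(101, "red"), (103, "green")]           # (bid, color)
reserves = {(22, 101), (22, 103), (31, 101)}     # (sid, bid)

# For all boats, the sailor has a reservation.
with_forall = {
    sname
    for sid, sname in sailors
    if all((sid, bid) in reserves for bid, _color in boats)
}

# There does not exist a boat without a reservation by the sailor.
with_not_exists = {
    sname
    for sid, sname in sailors
    if not any((sid, bid) not in reserves for bid, _color in boats)
}
```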

34

Find the names of sailors who’ve reserved all boats

v  Step 1: for each sailor, check if there exists a boat that the sailor has not reserved (called formula F).

$\rho(S\_neg,\ \pi_{sid}(((\pi_{sid}\ Reserves) \times (\pi_{bid}\ Boats)) - (\pi_{sid,\ bid}\ Reserves)))$

v  Step 2: find sailors for which F is not true and retrieve their names.

$\pi_{sname}((\pi_{sid}\ Reserves - S\_neg) \bowtie Sailors)$

‘-’: the only way to express negation in relational algebra!
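The two steps can be traced with Python sets. The instance below is invented for illustration, and the dictionary for Sailors is an assumed shortcut for the final join with Sailors.

```python
from itertools import product

# Hypothetical instance for illustration.
sailors = {22: "dustin", 31: "lubber"}           # sid -> sname
boats = {101, 103}                               # pi_bid(Boats)
reserves = {(22, 101), (22, 103), (31, 101)}     # pi_{sid,bid}(Reserves)

reserving_sids = {sid for sid, _bid in reserves}  # pi_sid(Reserves)

# Step 1: every (sid, bid) pair that would have to exist, minus the pairs
# that do exist, leaves the missing reservations; their sids form S_neg.
all_pairs = set(product(reserving_sids, boats))
s_neg = {sid for sid, _bid in all_pairs - reserves}

# Step 2: sailors with no missing reservation, joined back to get names.
answer = {sailors[sid] for sid in reserving_sids - s_neg}
```

Both uses of set difference express a negation: first "pairs not reserved", then "sailors not missing any boat".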