l4 relational algebra - github pages › files › lectures › lec4.pdfoverview last time, learned...

89
L4 Relational Algebra Eugene Wu

Upload: others

Post on 26-Jun-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

L4Relational Algebra

Eugene Wu

Page 2: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Supplemental Materials

Helpful Referenceshttps://en.wikipedia.org/wiki/Relational_algebra

Page 3: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Overview

Last time, learned aboutpre-relational modelsan informal introduction to relational modelan introduction to the SQL query language.

Learn about formal relational query languagesRelational Algebra (algebra: perform operations)Relational Calculus (logic: are statements true?)

Keys to understanding SQL and query processing

Page 4: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Who Cares?

Clean query semantics & rich program analysis

Helps/enables optimization

Opens up rich set of topics

Materialized views

Data lineage/provenance

Query by example

Distributed query execution

Page 5: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

What’s a Query Language?

Allows manipulation and retrieval of data from a database.

Traditionally: QL != programming languageDoesn’t need to be turing completeNot designed for computationSupports easy, efficient access to large databases

Recent YearsScaling to large datasets is a realityQuery languages are a powerful way to

think about data algorithms scalethink about asynchronous/parallel programming

Page 6: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

What’s a Query Language?

Tutorial at VLDB 2015 Abadi et al.http://www.slideshare.net/abadid/sqlonhadoop-tutorial

Page 7: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Formal Relational Query Languages

Formal basis for real languages e.g., SQL

Relational Algebra

Operational, used to represent execution plans

Relational Calculus

Logical, describes what data users want

(not operational, fully declarative)

Page 8: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Prelims

Query is a function over relation instances

Q(R1,…,Rn) = Rresult

Schemas of input and output relations are fixed and well defined by the query Q.

Positional vs Named field notationPosition easier for formal defs

one-indexed (not 0-indexed!!!)Named more readableBoth used in SQL

Page 9: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Prelims

Relation (for this lecture)Instance is a set of tuplesSchema defines field names and types (domains)Students(sid int, name text, major text, gpa int)

How are relations different than generic sets (ℝ)?Can assume item structure due to schemaSome algebra operations (x) need to be modifiedWill use this later

Page 10: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Relational Algebra Overview

Core 5 operationsPROJECT (π)SELECT (σ)UNION (U)SET DIFFERENCE (-)

CROSSPRODUCT (x)

Additional operationsRENAME (p)INTERSECT (∩)

JOIN (⨝)DIVIDE (/)

Page 11: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Instances Used Today: Library

Students, Reservations

Use positional or named field notation

Fields in query results are inherited from input relations (unless specified)

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

sid name gpa age

4 aziz 3.2 212 barb 3 213 tanya 2 885 rusty 3.5 21

sid rid day

1 101 10/10

2 102 11/11

R1

S1

S2

Page 12: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Project

π<attr1,…>(A) = Rresult

Pick out desired attributes (subset of columns)

Schema is subset of input schema in the projection listπ<a,b,c>(A) has output schema (a,b,c) w/ types carried over

Page 13: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Project

πname,age(S2) = name age

aziz 21barb 21tanya 88rusty 21

sid name gpa age

4 aziz 3.2 212 barb 3 213 tanya 2 885 rusty 3.5 21

S2

Page 14: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Project

πname,name,age(S2) = name name age

aziz aziz 21barb barb 21tanya tanya 88rusty rusty 21

sid name gpa age

4 aziz 3.2 212 barb 3 213 tanya 2 885 rusty 3.5 21

S2

Page 15: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Project

age

2188

sid name gpa age

4 aziz 3.2 212 barb 3 213 tanya 2 885 rusty 3.5 21

S2

Where did all the rows go?Real systems typically don’t remove duplicates. Why?

πage(S2) =

Page 16: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Select

σ<p>(A) = Rresult

Select subset of rows that satisfy condition p

Won’t have duplicates in result. Why?

Result schema same as input

Page 17: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Select

sid name gpa age

1 eugene 4 202 barb 3 21

σage<30 (S1) =

name

eugenebarb

πname(σage<30 (S1)) =

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1

Page 18: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively & Associativity

A + B = B + AA * B = B * A

A + (B * C) = (B * C) + A

A + (B + C) = (A + B) + CA + (B * C) = (A + B) * C

Page 19: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively & Associativity

A + B = B + AA * B = B * A

A + (B * C) = (B * C) + A

A + (B + C) = (A + B) + CA + (B * C) = (A + B) * C

Page 20: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively

πage(σage<30 (S1))

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

σage<30 ( )sid name gpa age

1 eugene 4 202 barb 3 21

Page 21: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively

πage(σage<30 (S1))

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

σage<30

πage

sid name gpa age

1 eugene 4 202 barb 3 21

( )age

2021

( ) =

Page 22: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively

σage<30(πage(S1))

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

( )age

202188

πage

Page 23: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively

σage<30(πage(S1))

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

( )age

2021

( ) age

202188

πage

σage<30 =

Page 24: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively

Does Project and Select commute?

πage(σage<30 (S1)) = σage<30(πage(S1))

What about

πname(σage<30 (S1))?

Page 25: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively

Does Project and Select commute?

πage(σage<30 (S1)) = σage<30(πage(S1))

What about

πname(σage<30 (S1)) != σage<30(πname(S1))

Page 26: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively

Does Project and Select commute?

πage(σage<30 (S1)) = σage<30(πage(S1))

What about

πname(σage<30 (S1)) != σage<30(πname, age(S1))

Page 27: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Commutatively

Does Project and Select commute?

πage(σage<30 (S1)) = σage<30(πage(S1))

What about

πname(σage<30 (S1)) = πname(σage<30(πname, age(S1)))

OK!

Page 28: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Union, Set-Difference

A op B = Rresult

A, B must be union-compatibleSame number of fieldsField i in each schema have same type

Result Schema borrowed from first arg (A)A(big int, poppa int) U B(thug int, life int) = ?

Page 29: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Union, Set-Difference

A op B = Rresult

A, B must be union-compatibleSame number of fieldsField i in each schema have same type

Result Schema borrowed from first arg (A)A(big int, poppa int) U B(thug int, life int) = Rresult(big int, poppa int)

Page 30: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Union, Intersect, Set-Difference

S1 US2=

sid name gpa age

1 eugene 4 204 aziz 3.2 215 rusty 3.5 213 tanya 2 882 barb 3 21

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

sid name gpa age

4 aziz 3.2 212 barb 3 213 tanya 2 885 rusty 3.5 21

S1 S2

Page 31: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Union, Intersect, Set-Difference

sid name gpa age

1 eugene 4 20S1- S2=

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

sid name gpa age

4 aziz 3.2 212 barb 3 213 tanya 2 885 rusty 3.5 21

S1 S2

Page 32: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Note on Set Difference & Performance

Notice that most operators are monotonicincreasing size of inputs à outputs growif A⊇B à Q(A, T) ⊇ Q(B, T)can compute incrementally

Set Difference is not monotonicif A⊇B à T – A⊆T – Be.g., 5 > 1 à 9 – 5 < 9 – 1

Set difference is blocking:For T – S, must wait for all S tuples before any results

Page 33: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Cross-Product

A(a1,…,an) x B(an+1,…,am) = Rresult(a1,…,am)

Each row of A paired with each row of B

Result schema concats A and B’s fields, inherit if possibleConflict: students and reservations have sid field

Different than mathematical “X” by flattening results:

math A x B = { (a, b) | a∈A ^ b∈B }e.g., {1, 2} X {3, 4} = { (1, 3), (1, 4), (2, 3), (2, 4) }

what is {1, 2} x {3, 4} x {5, 6}?

Page 34: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Cross-Product

(sid) name gpa age (sid) rid day

1 eugene 4 20 1 101 10/10

2 barb 3 21 1 101 10/10

3 tanya 2 88 1 101 10/10

1 eugene 4 20 2 102 11/11

2 barb 3 21 2 102 11/11

3 tanya 2 88 2 102 11/11

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1

S1xR1=

sid rid day

1 101 10/10

2 102 11/11

R1

Page 35: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Rename

p(<new_name>(<mappings>), Q)

Explicitly defines/changes field names of schema

p(C(1 à sid1, 5 à sid2), S1 x R1)

sid1 name gpa age sid2 rid day

1 eugene 4 20 1 101 10/10

2 barb 3 21 1 101 10/10

3 tanya 2 88 1 101 10/10

1 eugene 4 20 2 102 11/11

2 barb 3 21 2 102 11/11

3 tanya 2 88 2 102 11/11

C =

Page 36: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

π( ) =Project

σ( ) =Select

x =Cross product

- =Difference

U =Union

∩ =Intersect

Page 37: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Compound/Convenience Operators

INTERSECT (∩)

JOIN (⨝)DIVIDE (/)

Page 38: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Intersect

A ∩ B = Rresult

A, B must be union-compatible

Page 39: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Intersect

sid name gpa age

2 barb 3 213 tanya 2 88

S1∩S2=

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

sid name gpa age

4 aziz 3.2 212 barb 3 213 tanya 2 885 rusty 3.5 21

S1 S2

Page 40: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Intersect

A ∩ B = Rresult

A, B must be union-compatible

Can we express using core operators?A∩B = ?

Page 41: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Intersect

A ∩ B = Rresult

Can we express using core operators?A∩B = A - ? (think venn diagram)

A B

Page 42: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Intersect

A ∩ B = Rresult

Can we express using core operators?A∩B = A – (A – B)

A B

Page 43: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Joins (high level)

Reserves BooksStudents

relation relation relation

What if you want to query across all three tables?e.g., all names of students that reserved “The Purple Crayon”

Need to combine these tables Cross product? But that ignores foreign key references

day

Page 44: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Joins (high level)

Reserves BooksStudents

relation relation relation

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1rid name

101 The PurpleCrayon

102 1984

B1

day

sid rid day

1 101 10/10

2 102 11/11

R1

Page 45: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Joins (high level)

Reserves BooksStudents

relation relation relation

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1sid rid day

1 101 10/10

2 102 11/11

R1rid name

101 The PurpleCrayon

102 1984

B1

day

Page 46: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Joins (high level)

Reserves

day

BooksStudents

relation relation relation

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1 RB1sid (rid) day

1 101 10/10

2 102 11/11

(rid) name

101 The PurpleCrayon

102 1984

Page 47: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Joins (high level)

Reserves BooksStudents

relation relation relation

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1 RB1sid (rid) day

1 101 10/10

2 102 11/11

(rid) name

101 The PurpleCrayon

102 1984

day

Page 48: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Joins (high level)

Reserves BooksStudents

relation relation relation

(sid) (name) gpa age

1 eugene 4 202 barb 3 21

SRB1

day

(sid) (rid) day

1 101 10/10

2 102 11/11

(rid) (name)

101 The PurpleCrayon

102 1984

Page 49: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

theta (θ) Join

A ⨝c B = σc(A x B)

Most general form

Result schema same as cross product

Often far more efficient to compute than cross product

Commutative

(A⨝cB) ⨝cC = A⨝c(B ⨝cC)

Page 50: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

theta (θ) Join

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1sid rid day

1 101 10/10

2 102 11/11

R1

(sid) name gpa age (sid) rid day

1 eugene 4 20 1 101 10/10

2 barb 3 21 1 101 10/10

3 tanya 2 88 1 101 10/10

1 eugene 4 20 2 102 11/11

2 barb 3 21 2 102 11/11

3 tanya 2 88 2 102 11/11

S1⨝S1.sid ≤ R1.sid R1 = σ S1.sid ≤ R1.sid(S1xR1)=

Page 51: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

theta (θ) Join

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1sid rid day

1 101 10/10

2 102 11/11

R1

(sid) name gpa age (sid) rid day

1 eugene 4 20 1 101 10/10

2 barb 3 21 1 101 10/10

3 tanya 2 88 1 101 10/10

1 eugene 4 20 2 102 11/11

2 barb 3 21 2 102 11/11

3 tanya 2 88 2 102 11/11

S1⨝S1.sid ≤ R1.sid R1 = σ S1.sid ≤ R1.sid(S1xR1)=

Page 52: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

theta (θ) Join

(sid) name gpa age (sid) rid day

1 eugene 4 20 1 101 10/10

1 eugene 4 20 2 102 11/11

2 barb 3 21 2 102 11/11

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1sid rid day

1 101 10/10

2 102 11/11

R1

S1⨝S1.sid ≤ R1.sid R1 = σ S1.sid ≤ R1.sid(S1xR1)=

Page 53: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Equi-Join

A ⨝attr B = πall attrs except B.attr(A ⨝A.attr = B.attr B)

Special case where the condition is attribute equality

Result schema only keeps one copy of equality fields

Natural Join (A⨝B):

Equijoin on all shared fields (fields w/ same name)

Page 54: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Equi-Join

S1⨝sidR1 = sid name gpa age rid day

1 eugene 4 20 101 10/10

2 barb 3 21 102 11/11

sid name gpa age

1 eugene 4 202 barb 3 213 tanya 2 88

S1sid rid day

1 101 10/10

2 102 11/11

R1

Page 55: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Division

Let us have relations A(x, y), B(y)

A/B = { <x> | ∀y ∈B <x,y> ∈A }

Find all students that have reserved all booksA/B = all x (students) s.t. for every y (reservation), <x,y> ∈A

Good to ponder, not supported in most systems (why?)

Generalizationy can be a list of fields in Bx U y is fields in A

Page 56: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Examples

sid rid

1 1

1 2

1 3

1 4

2 1

2 2

3 2

4 2

4 4

rid

2

rid

2

4

rid

1

2

4

A R1 R2 R3

A/R1 A/R2 A/R3

Which sid values have relationship with all vals in Rx?

Page 57: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Examples

sid rid

1 1

1 2

1 3

1 4

2 1

2 2

3 2

4 2

4 4

rid

2

rid

2

4

rid

1

2

4

A R1 R2 R3

sid

1

2

3

4

A/R1 A/R2 A/R3

Which sid values have relationship with all vals in Rx?

Page 58: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Examples

sid rid

1 1

1 2

1 3

1 4

2 1

2 2

3 2

4 2

4 4

rid

2

rid

2

4

rid

1

2

4

A R1 R2 R3

sid

1

2

3

4

sid

1

4

A/R1 A/R2 A/R3

Which sid values have relationship with all vals in Rx?

Page 59: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Examples

sid rid

1 1

1 2

1 3

1 4

2 1

2 2

3 2

4 2

4 4

rid

2

rid

2

4

rid

1

2

4

A R1 R2 R3

sid

1

2

3

4

sid

1

4

sid

1

A/R1 A/R2 A/R3

Which sid values have relationship with all vals in Rx?

Page 60: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Is A/B a Fundamental Operation?

No. Shorthand like Joins

joins natively supported: ubiquitous, can be optimized

Hint: Find all xs not ‘disqualified’ by some y in B.

A(x, y), B(y)x value is disqualified if

1. by attaching y value from B (e.g., create <x, y>)

2. we obtain an <x,y> that is not in A.

Page 61: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

sid rid

1 1

1 2

1 3

1 4

2 1

2 2

3 2

4 2

4 4

A

Disqualified = πx((πx(A) x B) – A)A/B = πx(A) - Disqualified

rid

2

4

B

Page 62: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

sid rid

1 1

1 2

1 3

1 4

2 1

2 2

3 2

4 2

4 4

A

Disqualified = πx((πsid(A) x B) – A)A/B = πx(A) - Disqualified

rid

2

4

B

sid

1

2

3

4

πsid(A)

Page 63: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

sid rid

1 1

1 2

1 3

1 4

2 1

2 2

3 2

4 2

4 4

A

Disqualified = πx((πsid(A) x B) – A)A/B = πx(A) - Disqualified

rid

2

4

B

sid rid

1 2

1 4

2 2

2 4

3 2

3 4

4 2

4 4

πsid(A) x B

sid

1

2

3

4

πsid(A)

Page 64: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

sid rid

1 1

1 2

1 3

1 4

2 1

2 2

3 2

4 2

4 4

A

Disqualified = πx((πsid(A) x B) – A)A/B = πx(A) - Disqualified

rid

2

4

B

sid rid

1 2

1 4

2 2

2 4

3 2

3 4

4 2

4 4

sid rid

2 4

3 4

πsid(A) x B (πsid(A) x B) – A

sid

1

2

3

4

πsid(A)

Page 65: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

sid rid

1 1

1 2

1 3

1 4

2 1

2 2

3 2

4 2

4 4

A

sid

1

4

A/B

Disqualified = πsid((πsid(A) x B) – A)A/B = πsid(A) - Disqualified

rid

2

4

B πsid(A) x B

sid rid

1 2

1 4

2 2

2 4

3 2

3 4

4 2

4 4

(πsid(A) x B) – A

sid rid

2 4

3 4

sid

1

2

3

4

πsid(A)

Page 66: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Different Plans, Same Results

Semantic equivalence:results are always the same

Note that it is independent of the database instance!

Page 67: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Names of students that reserved book 2

πname(σrid=2 (R1) ⨝ S1)

p(tmp1, σrid=2 (R1))p(tmp2, tmp1 ⨝ S1)πname(tmp2)

πname(σrid=2(R1 ⨝ S1))

EquivalentQueries

Page 68: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Names of students that reserved dbbooks

Need to join DB books with reserve and studentsπname(σtype=‘db’ (Book) ⨝ Reserve ⨝ Student)

Book(rid, type) Reserve(sid, rid) Student(sid, name)

Page 69: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Names of students that reserved dbbooks

Need to join DB books with reserve and studentsπname(σtype=‘db’ (Book) ⨝ Reserve ⨝ Student)

Book(rid, type) Reserve(sid, rid) Student(sid, name)

Page 70: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Names of students that reserved dbbooks

Need to join DB books with reserve and studentsπname(σtype=‘db’ (Book) ⨝ Reserve ⨝ Student

Book(rid, type) Reserve(sid, rid) Student(sid, name)

Page 71: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Names of students that reserved dbbooks

Need to join DB books with reserve and studentsπname(σtype=‘db’ (Book) ⨝ Reserve ⨝ Student)

Book(rid, type) Reserve(sid, rid) Student(sid, name)

Page 72: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Names of students that reserved dbbooks

Need to join DB books with reserve and studentsπname(σtype=‘db’ (Book) ⨝ Reserve ⨝ Student)

More efficient queryπname(πsid((πrid σtype=‘db’ (Book)) ⨝ Reserve) ⨝ Student)

Query optimizer can find the more efficient query!

Book(rid, type) Reserve(sid, rid) Student(sid, name)

Page 73: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Names of students that reserved dbbooks

Need to join DB books with reserve and studentsπname(σtype=‘db’ (Book) ⨝ Reserve ⨝ Student)

More efficient queryπname(πsid((πrid σtype=‘db’ (Book)) ⨝ Reserve) ⨝ Student)

Query optimizer can find the more efficient query!

Book(rid, type) Reserve(sid, rid) Student(sid, name)

Page 74: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Names of students that reserved dbbooks

Need to join DB books with reserve and studentsπname(σtype=‘db’ (Book) ⨝ Reserve ⨝ Student)

More efficient queryπname(πsid((πrid σtype=‘db’ (Book)) ⨝ Reserve) ⨝ Student)

Query optimizer can find the more efficient query!

Book(rid, type) Reserve(sid, rid) Student(sid, name)

Page 75: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Names of students that reserved dbbooks

Need to join DB books with reserve and studentsπname(σtype=‘db’ (Book) ⨝ Reserve ⨝ Student)

More efficient queryπname(πsid((πrid σtype=‘db’ (Book)) ⨝ Reserve) ⨝ Student)

Query optimizer can find the more efficient query!

Book(rid, type) Reserve(sid, rid) Student(sid, name)

Page 76: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Students that reserved DB or HCI book

1. Find all DB or HCI books

2. Find students that reserved one of those books

p(tmp, (σtype=‘DB’ v type=‘HCI’ (Book))

πname(tmp ⨝ Reserve ⨝ Student)

Alternatives

define tmp using UNION (how?)

what if we replaced v with ^ in the query?

Page 77: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Students that reserved a DB and HCI book

Does previous approach work?

p(tmp, (σtype=‘DB’ ^ type=‘HCI’ (Book))

πname(tmp ⨝ Reserve ⨝ Student)

NO

Page 78: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Why?

p(tmp, (σtype=‘DB’ ^ type=‘HCI’ (Book))πname(tmp ⨝ Reserve ⨝ Student)

for b in Book:

if b.type = ‘DB’ and b.type = ‘HCI’:for r in Reserve:

for s in Student:

if r.sid = s.sid and r.bid = b.bid:yield b.name

Page 79: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Students that reserved a DB and HCI book

Does previous approach work?

1. Find students that reserved DB books

2. Find students that reversed HCI books

3. Intersection

p(tmpDB, πsid(σtype=‘DB’ Book) ⨝ Reserve)

p(tmpHCI, πsid(σtype=‘HCI’ Book) ⨝ Reserve)

πname((tmpDB∩tmpHCI) ⨝ Student)

Page 80: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Students that reserved all books

Use division

Be careful with schemas of inputs to / !

p(tmp, (Reserves / Books))

tmp ⨝ Students

Books(rid, type) Reserves(sid, rid, date) Students(sid,)

Type error

Page 81: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Students that reserved all books

Use division

Be careful with schemas of inputs to / !

p(tmp, (Reserves / πrid (Books)))

tmp ⨝ Students

Books(rid, type) Reserves(sid, rid, date) Students(sid,)

Each student must have reserved all books on the same date!

Page 82: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Students that reserved all books

Use divisionBe careful with schemas of inputs to / !

p(tmp, (πsid,rid (Reserves)) / (πrid (Books)))tmp ⨝ Student

What if want students that reserved all horror books?

p(tmp, (πsid,rid Reserves) / (πrid(σtype=‘horror’ Book) ))

Books(rid, type) Reserves(sid, rid, date) Students(sid)

Page 83: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Let’s step back

Relational algebra is expressiveness benchmarkA language equal in expressiveness as relational algebra is relationally complete

But has limitationsnullsaggregationrecursionduplicates

Page 84: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Equi-Joins are a way of life

Matching of two sets based on shared attributes

Yelp: Join between your location and restaurants

Market: Join between consumers and suppliers

High five: Join between two hands on time and space

Comm.: Join between minds on ideas/concepts

Page 85: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Equi-Joins are a way of life

Matching of two things based on shared common attribute

Page 86: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

What can we do with RA?

Query(DB instance) à Relation instance

Page 87: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

GestureDB. Nandi et al.

What can we do with RA?

Query(DB instance) = Relation instance

Query by exampleHere’s data and examples of result, generate the query for me

Novel relationally complete interfaces

Page 88: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Summary

Relational Algebra (RA) operators

Operators are closed inputs & outputs are relations

Multiple Relational Algebra queries can be equivalent It is operationalSame semantics but different performance

Forms basis for optimizations

Page 89: L4 Relational Algebra - GitHub Pages › files › lectures › lec4.pdfOverview Last time, learned about pre-relational models an informal introduction to relational model an introduction

Next Time

Relational CalculusSQL