1 data models and query languages cse 590db, winter 1999 theory of databases zack ives january 10,...
TRANSCRIPT
![Page 1: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/1.jpg)
1
Data Models and Query Languages
CSE 590DB, Winter 1999Theory of Databases
Zack IvesJanuary 10, 1999
![Page 2: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/2.jpg)
2
The Relational Model
Database consists of relations, where columns are attributes and rows are tuples
Declarative queries using an algebraic or logic language (or SQL, which we shall ignore)
attribute
tuple
arity = 3, cardinality = 3NewMovies(Name, Rating, Director)
NewMoviesName Rating DirectorThe Phantom Menace PG George LucasPatch Adams PG-13 Tom ShadyacStepmom PG-13 Chris Columbus
![Page 3: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/3.jpg)
3
Schema
Each attribute in a relation has a type:NewMovies(Name: String, Rating: RatingDomain, Director:
String), where RatingDomain = {G, PG, PG-13, R, NC-17}(In many cases, we will omit the type, and assume the type
most appropriate to the expression)
Relation schema: relation name, attributes, and types Relation instance: a set of tuples for a given schema Database schema: set of relation schemas Database instance: relation instance for every
relation schema
![Page 4: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/4.jpg)
4
More on Tuples
Formally, a tuple is a mapping from attribute names to (correctly typed) values: Director -> “George Lucas” Name -> “Stepmom”
May specify a tuple using values & schema notation:NewMovies(“The Phantom Menace”, “PG”, “George Lucas”)
Can have constraints on attribute values: Integrity constraints Keys Foreign keys Functional dependencies
![Page 5: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/5.jpg)
5
Relational Algebra
All operators take one or more sets of tuples as input, and produce a new set as output Basic set operators (, , —, but normally no
complement) Selection () Projection () Cartesian Product () Join () Division ()
![Page 6: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/6.jpg)
6
Set Operations
Sets must be compatible! Same number of attributes Same types/domains
R1 R2 = {all tuples in either R1 or R2}R1 R2 = {all tuples in both R1 and R2}Complement???Negation: R1 — R2 = {all tuples in R1 not
in R2}Query without unions is conjunctive
![Page 7: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/7.jpg)
7
Selection
Chooses a subset of tuples satisfying a predicate
Rating = “PG-13” (NewMovies)
Name Rating DirectorPatch Adams PG-13 Tom ShadyacStepmom PG-13 Chris Columbus
NewMoviesName Rating DirectorThe Phantom Menace PG George LucasPatch Adams PG-13 Tom ShadyacStepmom PG-13 Chris Columbus
![Page 8: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/8.jpg)
8
Projection
Chooses a subset of attributes, removes duplicates
NewMoviesName Rating DirectorThe Phantom Menace PG George LucasPatch Adams PG-13 Tom ShadyacStepmom PG-13 Chris Columbus
Name DirectorThe Phantom Menace George LucasPatch Adams Tom ShadyacStepmom Chris Columbus
Name, Director (NewMovies)
![Page 9: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/9.jpg)
9
Cartesian Product
Combine two relations - all pairings of tuples
NewMovieName DirectorPhantom Menace George LucasPatch Adams Tom ShadyacStepmom Chris Columbus
MetroShowsName TimePatch Adams 7:00Patch Adams 9:40Stepmom 6:50Stepmom 9:00
NewMovie MetroShowsName Director Name TimePhantom Menace George Lucas Patch Adams 7:00Phantom Menace George Lucas Patch Adams 9:40Phantom Menace George Lucas Stepmom 6:50Phantom Menace George Lucas Stepmom 9:00Patch Adams Tom Shadyac Patch Adams 7:00… … … ...
![Page 10: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/10.jpg)
10
Join
Combine tuples from two relations by predicate. If predicate is equality, remove duplicate attribute.
NewMovieName DirectorPhantom Menace George LucasPatch Adams Tom ShadyacStepmom Chris Columbus
MetroShowsName TimePatch Adams 7:00Patch Adams 9:40Stepmom 6:50Stepmom 9:00
NewMovie NewMovie.Name = MetroShows.Name MetroShows
Name Director TimePatch Adams Tom Shadyac 7:00Patch Adams Tom Shadyac 9:40Stepmom Chris Columbus 6:50Stepmom Chris Columbus 9:00
Equivalent to:Name, Director, Time ( NewMovie.Name = MetroShows.Name
(NewMovie MetroShows))
![Page 11: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/11.jpg)
11
Division
Only returns results from dividend which match attributes of all of tuples in divisor (“for all”)Customersname branchJohnson DowntownSmith BrightonLindsay RedwoodGreen BrightonGreen Downtown
BranchesbranchBrightonDowntown
Customers BranchesnameGreen
![Page 12: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/12.jpg)
12
Logic-Based Query Languages
Tuple-relational calculus Based on first-order predicate calculus, but
imposes restrictions (e.g., no function symbols)
Equivalent in power to relational algebra
Datalog More powerful than tuple-relational calculus
Supports recursion and thus transitive closure
Equivalent to set of Horn clauses
![Page 13: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/13.jpg)
13
Predicates & Atoms
Relations represented by predicatesTuples represented by atoms:
Flight(“Alon”, “British Airways”, 1234, “Seattle”, “Israel”, “1/9/1999”, “10:00”)
Arithmetic atoms:X < 100, X + Y + 5 > Z * 2
In certain cases, negated atoms:not Flight(“Zack”, “British Airways”, 1234, “Seattle”,
“Israel”, “1/9/1999”, “10:00”)
![Page 14: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/14.jpg)
14
Datalog Rules & Programs
“Pure” datalog rules:head :- atom1, atom2, atom3
where all atoms are non-negated and relational
Datalog program is set of datalog rules Conjunctive query has single rule:
HanksDirector(D) :- ActorIn(“Hanks”, M) & DirectedBy(D, M)
Disjunctive query:HanksDirector(D) :- ActorIn(“Hanks”, M) & DirectedBy(D, M)HanksDirector(D) :- ProducedBy(“Hanks”, M) & DirectedBy(D, M)
![Page 15: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/15.jpg)
15
Intension & Extension
Distinction between EDB & IDB: Extensional DB
what’s stored in the databasecan only be in the body of a Datalog rule
Intensional DBresult of a datalog rulecan be in head or body of a Datalog rule
![Page 16: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/16.jpg)
16
Evaluating Rules
Evaluating rules: Start with EDB, iteratively derive facts for IDB’s Repeat until cannot derive any new facts:
Consider every possible assignment from database to variables in body
If each atom in body is made true by assignment, add tuple for the head into the head’s relation
Conjunctive queries are inexpensive Disjunctive are more expensive (even though
they’re Horn clauses)
![Page 17: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/17.jpg)
17
Transitive Closure
Suppose we to know all possible destination airports reachable from SeaTac, given an EDB Flight(From, To):
SEA
SFO
LAX
JFK
LGA DEN
We need to express the query: Find all airports reachable from SEA
![Page 18: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/18.jpg)
18
Recursion in Datalog
Program:Dest(X, Y) :- Flight(X, Y)Dest(X, Y) :- Dest(X, Z), Flight(Z, Y)
Evaluate unknown # of times until fixpoint Flight: {(SEA,SFO), (SEA,LAX), (SEA,JFK), (LAX,LGA),
(JFK,LGA),(LGA,DEN)}Dest0: {}
Dest1: Dest0 {(SEA,SFO), (SEA,LAX), (SEA,JFK)} Dest2: Dest1 {(SEA,LGA),(LAX,DEN),(JFK,DEN)} Dest3: Dest2 {(SEA,DEN)} No more changes, so stop (minimum fixpoint)
![Page 19: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/19.jpg)
19
Built-in Predicates
Rules may include atoms with built-in predicates:Expensive(X) :- Product(X, Y, P) & P > 100
But need to restrict use of built-in atoms:P(X) :- R(X) & X < YWhen is X < Y?
We require that every variable from a built-in atom appear in a relational atom
![Page 20: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/20.jpg)
20
Negated Subgoals
Restrict forms:P(X, Y) :- Between(X, Y, Z) & NOT Direct(X,Z)
Bad:Q(X, Y) :- From(X) & NOT To(Y)
Bad but fixable:P(X) :- From(X) & NOT Connects(X, Y)
Rewrite as:Connects’(X) :- Connects(X,Y)P(X) :- End(X) & NOT Connects’(X)
![Page 21: 1 Data Models and Query Languages CSE 590DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999](https://reader035.vdocuments.site/reader035/viewer/2022062517/56649f355503460f94c52cd4/html5/thumbnails/21.jpg)
21
Stratified/Stratisfiable Rules
Predicate P depends on predicate Q if Q appears negated in a rule defining P
If cycle in dependency graph, program is not stratifiable:p(X) :- r(X) & NOT q(X)
q(X) :- r(X) & NOT p(X)
Suppose r has tuple {1}
Intuitively, stratification provides a way of executing program P as sequence of subprograms P1 … Pn defining IDB relations w/o “forward references”. Results independent of stratification.