Containment CSE 590 DB Rachel Pottinger

Post on 22-Dec-2015

Containment

CSE 590 DB

Rachel Pottinger

Outline

Introduction

Motivation

Formal definition

Algorithms for different complexities

An application: rewriting queries using views

Containment, what is it?

For two queries Q1 and Q2, Q1 is contained in Q2 if, for every database, the answers to Q1 are a subset of the answers to Q2.

Denoted as Q1 ⊆ Q2.

For general datalog, containment is undecidable (by reduction from decision problems for context-free languages).

Why should I care?

Containment is useful in a number of situations, including:

Query minimization

Independence of queries from updates

Rewriting queries using views

Interesting logic problem

More definitions

Equivalence of queries: Q1 ≡ Q2 if they return the same answers for all databases. This is the same as Q1 ⊆ Q2 and Q2 ⊆ Q1.

Conjunctive query - a query that is formed only of conjunctions of predicates.

Q(X,Y):- e(X,Z),e(Z,Y)
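
To make the conjunctive query above concrete, here is a minimal Python sketch that evaluates Q(X,Y) :- e(X,Z), e(Z,Y) over a small edge relation. The function name `eval_q` and the sample edges are assumptions made for illustration; they do not come from the slides.

```python
def eval_q(e):
    """Answers to Q(X,Y) :- e(X,Z), e(Z,Y) over a set of (source, target) pairs."""
    # The two subgoals share the variable Z, so the facts are joined on it.
    return {(x, y) for (x, z1) in e for (z2, y) in e if z1 == z2}


# Hypothetical sample database: a chain 1 -> 2 -> 3 -> 4.
edges = {(1, 2), (2, 3), (3, 4)}
answers = eval_q(edges)
print(sorted(answers))
```

The query returns every pair of nodes connected by a path of length two.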

Containment Mapping

Let Q1 and Q2 be two conjunctive queries:

Q1: I :- J1, …, Jl

Q2: H :- G1, …, Gk

A symbol mapping h is said to be a containment mapping if h turns Q2 into Q1; that is, h(H) = I, and for each i = 1, 2, …, k, there is some j such that h(Gi) = Jj. There is no requirement that each Jj be the target of some Gi.

Proof Sketch

If there’s a containment mapping from Q2 to Q1, then Q1 ⊆ Q2.

Suppose h maps Vars(Q2) → Vars(Q1).

Let D be a database and let σ be an answer to Q1 on D; σ is a mapping from Vars(Q1) → D.

Then σ ∘ h maps Vars(Q2) → D, so Q2 produces the same answer on D.

The rest of the proof follows later

Example of homomorphism rules

Q1: fp(X,Y) :- e(Y,X), e(X,Z)

Q2: fp(A,B) :- e(B,A), e(C,A),e(A,D)

For Q1 ⊆ Q2, map from Q2 to Q1: A → X, B → Y, C → Y, D → Z
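
As a rough check of the definition, the sketch below tests whether a given symbol mapping is a containment mapping from Q2 to Q1 for the two queries on this slide. The tuple encoding of atoms and the names `apply_mapping` and `is_containment_mapping` are assumptions of this sketch, and it handles only queries whose terms are all variables.

```python
def apply_mapping(h, atom):
    """Apply symbol mapping h to an atom written as (predicate, term, ...)."""
    return (atom[0],) + tuple(h.get(term, term) for term in atom[1:])


def is_containment_mapping(h, q2, q1):
    """True if h turns q2 into q1; each query is a (head, body) pair."""
    head2, body2 = q2
    head1, body1 = q1
    if apply_mapping(h, head2) != head1:
        return False
    targets = set(body1)
    # Every subgoal of q2 must land on some subgoal of q1;
    # subgoals of q1 that are never hit are allowed.
    return all(apply_mapping(h, g) in targets for g in body2)


q1 = (("fp", "X", "Y"), [("e", "Y", "X"), ("e", "X", "Z")])
q2 = (("fp", "A", "B"), [("e", "B", "A"), ("e", "C", "A"), ("e", "A", "D")])

ok = is_containment_mapping({"A": "X", "B": "Y", "C": "Y", "D": "Z"}, q2, q1)
bad = is_containment_mapping({"A": "X", "B": "Y", "C": "X", "D": "Z"}, q2, q1)
print(ok, bad)
```

The second mapping fails because it sends e(C,A) to e(X,X), which is not a subgoal of Q1.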

Test for containment of a conjunctive query (Q1 ⊆ Q2)

Freeze the body of Q1, and put this into a canonical database

Apply Q2 to the canonical database

If the frozen head of Q1 is derived when Q2 is applied to the canonical database, then Q1 ⊆ Q2; otherwise not

A chilling example

Q1: p(X,Z) :- a(X,Y), a(Y,Z)

Q2: p(X,Z) :- a(X,U), a(V,Z)

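Canonical database of Q1: a(x,y), a(y,z), where x, y, z are the frozen constants for X, Y, Z. Applying Q2 to it derives p(x,z), the frozen head of Q1, so Q1 ⊆ Q2.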
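
The freeze-and-apply test can be sketched in Python as follows, using Q1 and Q2 from this slide. This is a minimal sketch for conjunctive queries without negation or comparisons; the helper names (`is_var`, `freeze`, `answers`, `contained_in`) and the convention that variables start with an upper-case letter are assumptions of the sketch.

```python
def is_var(term):
    # Assumption: variables start with an upper-case letter, as on the slides.
    return term[:1].isupper()


def freeze(atom):
    """Replace each variable by a constant (its lower-case name).
    Assumes the query has no constant that clashes with such a name."""
    return (atom[0],) + tuple(t.lower() if is_var(t) else t for t in atom[1:])


def extend(binding, atom, fact):
    """Match one body atom against one fact; return the extended binding or None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    new = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def answers(query, db):
    """All head facts that `query` derives from the set of facts `db`."""
    head, body = query
    bindings = [{}]
    for atom in body:
        next_bindings = []
        for b in bindings:
            for fact in db:
                b2 = extend(b, atom, fact)
                if b2 is not None:
                    next_bindings.append(b2)
        bindings = next_bindings
    return {(head[0],) + tuple(b.get(t, t) for t in head[1:]) for b in bindings}


def contained_in(q1, q2):
    """True if q1 ⊆ q2: freeze q1's body, apply q2, look for q1's frozen head."""
    head1, body1 = q1
    canonical_db = {freeze(a) for a in body1}
    return freeze(head1) in answers(q2, canonical_db)


q1 = (("p", "X", "Z"), [("a", "X", "Y"), ("a", "Y", "Z")])
q2 = (("p", "X", "Z"), [("a", "X", "U"), ("a", "V", "Z")])
print(contained_in(q1, q2), contained_in(q2, q1))
```

Run in the other direction, the canonical database of Q2 is a(x,u), a(v,z); Q1 needs two facts that share a middle value, finds none, and so Q2 is not contained in Q1.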

Proof continued

If Q1 ⊆ Q2, then there is a containment mapping.

Since Q1 ⊆ Q2, we know that if we apply Q2 to the canonical database formed from Q1, we’ll get back the same fact that Q1 produces there (the frozen head of Q1). The variable assignment Q2 uses to derive that fact is a mapping from Q2 to Q1.

Conjunctive queries with negation

Negation is applied to subgoals, i.e.: Q(X,Y) :- e(X,Z), ¬e(Z,Y)

The Levy and Sagiv test looks at an exponential number of canonical databases; the problem is Π₂ᵖ-complete

1. Consider all partitions of the variables of Q1; form canonical databases for all of them, D1, …, Dk

2. For each database Di, see if the database makes all subgoals of Q1 true.

3. For all Di’s passing step 2, see if the head of Q1 can be derived by applying Q2

4. If so, then Q1 ⊆ Q2, else not
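
The first two steps above (enumerate the partitions of Q1's variables, build one canonical database per partition, and keep those that make every subgoal of Q1 true) can be sketched as below. This is only a partial sketch: it does not implement the later step of applying Q2, and the example query p(X,Z) :- a(X,Y), a(Y,Z), ¬a(X,Z) as well as the names `partitions`, `freeze`, and `passing_databases` are assumptions chosen for illustration.

```python
def partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + smaller


def freeze(atom, rep):
    """Replace each variable by the constant chosen for its block."""
    return (atom[0],) + tuple(rep[v] for v in atom[1:])


def passing_databases(positive, negated, variables):
    """Canonical databases (one per partition) that satisfy all subgoals."""
    result = []
    for partition in partitions(variables):
        # Variables in the same block are frozen to the same constant.
        rep = {v: block[0].lower() for block in partition for v in block}
        db = {freeze(a, rep) for a in positive}
        # Positive subgoals hold by construction; negated ones must be absent.
        if all(freeze(n, rep) not in db for n in negated):
            result.append((partition, db))
    return result


variables = ["X", "Y", "Z"]
positive = [("a", "X", "Y"), ("a", "Y", "Z")]
negated = [("a", "X", "Z")]

passing = passing_databases(positive, negated, variables)
for partition, db in passing:
    print(partition, sorted(db))
```

Three variables have five partitions; in this example only two of the resulting canonical databases survive, because the others contain the fact that the negated subgoal forbids.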

A negative example

Q1: p(X,Z) :- a(X,Y), a(Y,Z), ¬a(X,Z)

Q2: p(A,C) :- a(A,B), a(B,C), ¬a(A,D)

Conjunctive Queries with Arithmetic Comparisons

Q(X,Y) :- e(X,Z), e(Z,Y), Z < Y

Treat the same as the negated subgoals, only a check must be made for each ordering of each partition

Also Π₂ᵖ-complete for dense domains such as the reals

Example with arithmetic comparisons

Q1: p(X,Z) :- a(X,Y), a(Y,Z), X < Y

Q2: p(A,C) :- a(A,B), a(B,C), A < C

Q1 ⊆ Q2 is false; see x = z = 0, y = 1
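
The counterexample x = z = 0, y = 1 can be checked directly. The sketch below hard-codes the two queries as Python set comprehensions over the relation a and evaluates them on the database {a(0,1), a(1,0)}; the function names `q1` and `q2` and the variable `db` are assumptions of the sketch.

```python
def q1(a):
    # p(X,Z) :- a(X,Y), a(Y,Z), X < Y
    return {(x, z) for (x, y) in a for (y2, z) in a if y == y2 and x < y}


def q2(a):
    # p(A,C) :- a(A,B), a(B,C), A < C
    return {(p, c) for (p, b) in a for (b2, c) in a if b == b2 and p < c}


# Database obtained from the body of Q1 with x = z = 0, y = 1.
db = {(0, 1), (1, 0)}

print(q1(db))             # Q1 derives p(0,0)
print(q2(db))             # Q2 derives nothing: 0 < 0 and 1 < 1 both fail
print(q1(db) <= q2(db))   # subset test fails, so Q1 is not contained in Q2
```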

Other complexity results

Queries restricted to queries Q1 and Q2 such that all database predicates have arity at most 2 and every database predicate occurs at most three times in the body of Q1 - Π₂ᵖ

Conjunctive queries where Q1 is fixed - NP complete

Conjunctive queries where Q2 is fixed - polynomial

Conjunctive query containment where Q2 is an acyclic query - polynomial time

Conjunctive queries where every database predicate occurs at most twice in the body of Q1 - linear time

Rewriting Queries Using Views

Useful in query optimization

Good for query minimization

Needed to make the best use of cached information

Necessary in data integration

Views

A view is a relation that is not part of the conceptual model, but is visible to the user.

Useful for common expressions, or protecting data

Example: If you had faculty(name, office, ssn), you may want students to access only faculty_office(name, office)

Views (cont.)

Views can be either materialized or virtual

In data integration, data sources can be thought of as views

An example of rewriting queries using views

Suppose you had two databases:

One has famous people and whether they are right- or left-handed

One has the birthdays of famous people

You want the birthdays of all of the lefties
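
Treating each database as a view, the request can be answered by joining the two sources on the person's name. The sketch below is only illustrative: the relation names `handedness` and `birthday`, the placeholder people, and the dates are all hypothetical.

```python
# Hypothetical contents of the two sources (placeholder names and dates).
handedness = {("person_a", "left"), ("person_b", "right"), ("person_c", "left")}
birthday = {("person_a", "01-15"), ("person_b", "03-02"), ("person_c", "11-30")}


def lefty_birthdays(handedness, birthday):
    # q(Name, Date) :- handedness(Name, "left"), birthday(Name, Date)
    lefties = {name for (name, hand) in handedness if hand == "left"}
    return {(name, date) for (name, date) in birthday if name in lefties}


result = lefty_birthdays(handedness, birthday)
print(sorted(result))
```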

Containment in rewriting

Query: q(X) :- e(X,Y), e(Y,X)

View: v(A,B) :- e(A,C), e(C,B)
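
One candidate rewriting of this query over the view is q'(X) :- v(X,X); the slide does not state it, so treat it as an assumption of this sketch. Its expansion, e(X,C), e(C,X), is the body of q up to renaming, which is the containment argument for equivalence. The code below only illustrates this on a hypothetical sample relation by computing q directly and through the view.

```python
def q(e):
    # q(X) :- e(X,Y), e(Y,X)
    return {x for (x, y) in e for (y2, x2) in e if y == y2 and x == x2}


def v(e):
    # v(A,B) :- e(A,C), e(C,B)
    return {(a, b) for (a, c) in e for (c2, b) in e if c == c2}


def q_rewritten(view):
    # Assumed rewriting: q'(X) :- v(X,X)
    return {a for (a, b) in view if a == b}


# Hypothetical sample relation.
e = {(1, 2), (2, 1), (2, 3), (4, 4)}

direct = q(e)
via_view = q_rewritten(v(e))
print(sorted(direct), sorted(via_view))
```

Agreement on one database is not a proof; equivalence for all databases follows from containment in both directions between q and the expansion of the rewriting.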

A more complicated example

Q(x,u) :- p(x,y), p0(y,z), p1(x,w), p2(w,u)

V1(a,b) :- p(a,c), p0(c,b), p1(a,d)

V2(a,b) :- p1(a,b)

V3(a,b) :- p2(a,b)
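
For this example, one candidate rewriting is Q'(x,u) :- V1(x,z), V2(x,w), V3(w,u). It is not given on the slide, so it is an assumption of the sketch below, as are the names `VIEWS`, `expand`, and `has_containment_mapping`. The sketch unfolds the view atoms into their definitions and then searches, by brute force, for containment mappings in both directions between Q and the expansion.

```python
from itertools import count, product

# View definitions from the slide: name -> (head variables, body).
VIEWS = {
    "V1": (("a", "b"), [("p", "a", "c"), ("p0", "c", "b"), ("p1", "a", "d")]),
    "V2": (("a", "b"), [("p1", "a", "b")]),
    "V3": (("a", "b"), [("p2", "a", "b")]),
}


def expand(body):
    """Unfold each view atom, giving non-head variables fresh names."""
    fresh = count()
    out = []
    for name, *args in body:
        params, definition = VIEWS[name]
        rename = dict(zip(params, args))
        suffix = f"_{next(fresh)}"
        for pred, *terms in definition:
            out.append((pred,) + tuple(rename.get(t, t + suffix) for t in terms))
    return out


def has_containment_mapping(src, dst):
    """True if some mapping turns query `src` into `dst` (all terms are variables)."""
    (head_s, body_s), (head_d, body_d) = src, dst
    vars_s = sorted({t for atom in [head_s] + body_s for t in atom[1:]})
    vars_d = sorted({t for atom in [head_d] + body_d for t in atom[1:]})
    targets = set(body_d)
    for images in product(vars_d, repeat=len(vars_s)):
        h = dict(zip(vars_s, images))

        def image(atom):
            return (atom[0],) + tuple(h[t] for t in atom[1:])

        if image(head_s) == head_d and all(image(g) in targets for g in body_s):
            return True
    return False


query = (("Q", "x", "u"),
         [("p", "x", "y"), ("p0", "y", "z"), ("p1", "x", "w"), ("p2", "w", "u")])
rewriting_body = [("V1", "x", "z"), ("V2", "x", "w"), ("V3", "w", "u")]
expansion = (("Q", "x", "u"), expand(rewriting_body))

# A mapping from Q to the expansion shows expansion ⊆ Q, and vice versa.
equivalent = (has_containment_mapping(query, expansion)
              and has_containment_mapping(expansion, query))
print(expansion[1])
print(equivalent)
```

The extra subgoal p1(x, d_0) contributed by V1 maps onto p1(x,w) in Q, which is why the expansion is equivalent to Q even though it has one more subgoal.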