
Post on 21-Dec-2015


11-2-2009 1

Lecture 5 on Query Optimization

This lecture demonstrates techniques for optimizing SQL queries for efficient execution. The optimizer transforms an SQL query into an equivalent relational algebra expression with a lower cost. The lecture shows how to apply heuristic rules to reduce the cost of a query.


Query Optimizer

We can interpret an expression of relational algebra not only as the specification of the semantics of a query, but also as the specification of a sequence of operations. From this viewpoint, two expressions with the same semantics can describe two different sequences of operations.

Given:
Relation EMP(Empnum, Name, Sal, Tax, Mgrnum, Deptnum)
Relation DEPT(Deptnum, Name, Area, Mgrnum)
Relation SUPPLIER(Snum, Name, City)
Relation SUPPLY(Snum, Pnum, Deptnum, Quan)

PJ NAME, DEPTNUM SL DEPTNUM=15 EMP = SL DEPTNUM=15 PJ NAME, DEPTNUM EMP

Condition: the selection attribute DEPTNUM must be among the projected attributes NAME, DEPTNUM (the selection can only follow the projection if the attribute it tests is still present).

These are equivalent expressions, but they define two different sequences of operations.
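The equivalence above can be checked on a small instance. The following is only a sketch: the sample rows and the helper names `select`, `project` and `as_set` are assumptions made for illustration, not part of the lecture.

```python
# Hypothetical sample rows for EMP(Empnum, Name, Sal, Tax, Mgrnum, Deptnum).
EMP = [
    {"Empnum": 1, "Name": "Ann", "Sal": 50, "Tax": 5, "Mgrnum": 9, "Deptnum": 15},
    {"Empnum": 2, "Name": "Bob", "Sal": 40, "Tax": 4, "Mgrnum": 9, "Deptnum": 20},
    {"Empnum": 3, "Name": "Cy", "Sal": 30, "Tax": 3, "Mgrnum": 8, "Deptnum": 15},
]

def select(rows, pred):
    """SL: keep the rows that satisfy pred."""
    return [r for r in rows if pred(r)]

def project(rows, attrs):
    """PJ: keep only attrs and drop duplicate rows (set semantics)."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

def as_set(rows):
    """Order-independent view of a relation, used only for comparison."""
    return {frozenset(r.items()) for r in rows}

in_dept_15 = lambda r: r["Deptnum"] == 15

# Selection first, then projection.
plan_a = project(select(EMP, in_dept_15), ["Name", "Deptnum"])
# Projection first, then selection (legal because Deptnum is projected).
plan_b = select(project(EMP, ["Name", "Deptnum"]), in_dept_15)

assert as_set(plan_a) == as_set(plan_b)
```

Both plans return the same tuples, but `plan_a` projects only the selected rows while `plan_b` projects every row first, which is why the two sequences can differ in cost.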


The operator tree of an expression of relational algebra can be regarded as the parse tree of the expression itself, assuming the following grammar:

R -> identifier

R -> (R)

R -> un_op R

R -> R bin_op R

un_op -> SL_F | PJ_A

bin_op -> CP | UN | DF | JN_F | NJN_F | SJ_F | NSJ_F

Two relations are equivalent when their tuples represent the same mapping from attribute names to values, even if the order of attributes is different.
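As an illustration of the grammar, an operator tree can be held as a small recursive data structure. This is a minimal sketch; the class names `Rel`, `Unary`, `Binary` and the function `show` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Rel:                 # R -> identifier
    name: str

@dataclass(frozen=True)
class Unary:               # R -> un_op R
    op: str                # "SL" or "PJ"
    arg: str               # formula F or attribute list A
    child: "Node"

@dataclass(frozen=True)
class Binary:              # R -> R bin_op R
    op: str                # "CP", "UN", "DF", "JN", ...
    left: "Node"
    right: "Node"
    arg: str = ""          # join formula, empty for CP/UN/DF

Node = Union[Rel, Unary, Binary]

def show(n):
    """Render the tree back into the linear notation used on the slides."""
    if isinstance(n, Rel):
        return n.name
    if isinstance(n, Unary):
        return f"{n.op} {n.arg} {show(n.child)}"
    op = f"{n.op} {n.arg}".strip()
    return f"({show(n.left)} {op} {show(n.right)})"

tree = Unary("PJ", "NAME, DEPTNUM", Unary("SL", "DEPTNUM=15", Rel("EMP")))
join = Binary("JN", Rel("EMP"), Rel("DEPT"), "DEPTNUM=DEPTNUM")
```

Here `show(tree)` gives back the expression `PJ NAME, DEPTNUM SL DEPTNUM=15 EMP`, so the tree and the expression carry the same information.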


• Commutativity of unary operations:

U1 U2 R <-> U2 U1 R

• Commutativity of operands of binary operations:

R B S <-> S B R

• Associativity of binary operations:

(R B S) B T <-> R B (S B T)

• Idempotence of unary operations:

U R <-> U1 U2 R

• Distributivity of unary operations with respect to binary operations:

U (R B S) -> U(R) B U(S)

• Factorization of unary operations (this transformation is the inverse of distributivity):

U(R) B U(S) -> U (R B S)


Commutativity of unary operations

SL_F1 SL_F2 R → SL_F2 SL_F1 R

SL_F1 PJ_A2 R → PJ_A2 SL_F1 R, provided Attr(F1) ⊆ A2

Commutativity and associativity of binary operations

R UN S → S UN R
R CP S → S CP R
R JN_F S → S JN_F R

(R UN S) UN T → R UN (S UN T)
(R CP S) CP T → R CP (S CP T)

Idempotence of unary operations

PJ_A R → PJ_A1 PJ_A2 R, where A = A1 and A ⊆ A2

SL_F R → SL_F1 SL_F2 R, where F = F1 ∧ F2


Distributivity of unary operations

SL_F (R UN S) → (SL_F R) UN (SL_F S)

SL_F (R DF S) → (SL_F R) DF (SL_F S)

SL_F (R SJ_F3 S) → (SL_F R) SJ_F3 (SL_FS S), where FS = true => result is not empty

PJ_A (R CP S) → (PJ_AR R) CP (PJ_AS S), where AR = A − Attr(S) (the attributes of A that belong to R) and AS = A − Attr(R) (the attributes of A that belong to S)

Factorization of unary operations from binary operations

(PJ_AR R) CP (PJ_AS S) → PJ_A (R CP S), where A = AR ∪ AS

(SL_FR R) JN_F1 (SL_FS S) → SL_F (R JN_F1 S), where F = FR ∧ FS
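A few of these transformations can be sanity-checked with Python sets standing in for relations. This is a sketch under assumptions: the relations `R` and `S`, the predicates and the helper `sl` are invented for the example, and a check on one instance illustrates the rules rather than proving them.

```python
# Relations as sets of (A, B) tuples; hypothetical data over the same schema.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (4, "d")}

def sl(pred, rel):
    """SL: selection of the tuples of rel that satisfy pred."""
    return {t for t in rel if pred(t)}

F = lambda t: t[0] >= 2
F1 = lambda t: t[0] >= 2
F2 = lambda t: t[1] != "c"

# Distributivity of selection over union and difference.
union_ok = sl(F, R | S) == sl(F, R) | sl(F, S)
diff_ok = sl(F, R - S) == sl(F, R) - sl(F, S)

# Idempotence: SL_F R = SL_F1 SL_F2 R with F = F1 AND F2.
idem_ok = sl(lambda t: F1(t) and F2(t), R) == sl(F1, sl(F2, R))

# Commutativity of two selections.
comm_ok = sl(F1, sl(F2, R)) == sl(F2, sl(F1, R))

assert union_ok and diff_ok and idem_ok and comm_ok
```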


[Example relation instances with columns R.A, S.B, A, B and C; the table contents are not recoverable from the transcript.]

SL A=a (R SJ R.C=T.C T) → (SL A=a R) SJ R.C=T.C (SL C=a,b,d T)

(PJ A R) CP (PJ B S) → PJ A,B (R CP S)


Qualified Relations

A qualified relation is a pair [R: qR], where R is a relation called the body of the qualified relation and qR is a predicate called the qualification of the qualified relation. For example, horizontal fragments are qualified relations in which the qualification corresponds to the partitioning predicate.

As a result, [R: qR] can be viewed as the result of a unary operation, such as selection (horizontal fragmentation) and/or projection (vertical fragmentation), applied to relation R.


21-2-2009 18

Algebra of qualified relations

The application of a unary operation to a qualified relation R:

un_op [R: qR1] → [un_op R: qR2]

produces the relation un_op R as its body and the predicate qR2 as its qualification. On the left-hand side, we apply qualification qR1 followed by the unary operation. On the right-hand side, we apply the unary operation first, followed by qualification qR2.

Rule 1: SL_F [R: qR] → [SL_F R: F AND qR] (horizontal fragmentation)

F holds for all the tuples of the result, as does qR.

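[Figure: operator trees for Rule 1. SL_F is applied to R, then to the union (UN) of the fragments R1, R2, R3, and finally the selection is moved below the union, with selections labelled SL_F1, SL_F2, SL_F3 on the fragments.]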


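Rule 2: PJ_A [R: qR] → [PJ_A R: qR] (vertical fragmentation)

[Figure: operator trees for Rule 2. PJ_A is applied to R, then to the natural join (NJN) of the fragments R1, R2, R3, and finally the projection is moved below the join, with projections labelled PJ_A1, PJ_A2, PJ_A3 on the fragments.]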


[Figure: horizontal fragmentation and vertical fragmentation.]


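[Figure: operator trees for Rule 3. R CP S, with R = R1 UN R2 and S = S1 UN S2, is rewritten as the union of the four products R1 CP S1, R1 CP S2, R2 CP S1 and R2 CP S2.]

Rule 3: [R: qR] CP [S: qS] → [R CP S: qR AND qS]

The two qualifications apply to disjoint attributes of R CP S.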


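Rule 4: [R: qR] DF [S: qS] → [R DF S: qR]

[Figure: operator trees for Rule 4. R DF S, with R = R1 UN R2 and S = S1 UN S2, and the fragment-level tree built from the differences R1 DF S1, R1 DF S2, R2 DF S1 and R2 DF S2.]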


Rule 5: [R: qR] UN [S: qS] → [R UN S: qR OR qS]

[Figure: operator trees for Rule 5. R UN S, with R = R1 UN R2 and S = S1 UN S2, and the fragment-level tree built from unions of the fragments R1, R2, S1 and S2.]
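The way qualifications propagate under Rules 1, 4 and 5 can be sketched in a few lines. Everything here is an assumption made for illustration: the class `Qualified`, the helper names and the sample DEPT rows do not come from the lecture, and qualifications are modelled as plain Python predicates rather than symbolic formulas.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]
Pred = Callable[[Row], bool]

@dataclass
class Qualified:
    body: List[Row]   # the relation R
    qual: Pred        # the qualification qR, true for every row of body

def q_select(f: Pred, r: Qualified) -> Qualified:
    """Rule 1: SL_F [R: qR] -> [SL_F R: F AND qR]."""
    return Qualified([t for t in r.body if f(t)],
                     lambda t: f(t) and r.qual(t))

def q_diff(r: Qualified, s: Qualified) -> Qualified:
    """Rule 4: [R: qR] DF [S: qS] -> [R DF S: qR]."""
    return Qualified([t for t in r.body if t not in s.body], r.qual)

def q_union(r: Qualified, s: Qualified) -> Qualified:
    """Rule 5: [R: qR] UN [S: qS] -> [R UN S: qR OR qS]."""
    merged = r.body + [t for t in s.body if t not in r.body]
    return Qualified(merged, lambda t: r.qual(t) or s.qual(t))

# Hypothetical horizontal fragments of DEPT.
dept1 = Qualified([{"DEPTNUM": 1}, {"DEPTNUM": 5}], lambda t: t["DEPTNUM"] < 10)
dept3 = Qualified([{"DEPTNUM": 25}], lambda t: t["DEPTNUM"] > 20)

whole = q_union(dept1, dept3)
picked = q_select(lambda t: t["DEPTNUM"] == 1, whole)
rest = q_diff(whole, dept3)
```

In each result the qualification still holds for every row of the body, which is the invariant the algebra of qualified relations relies on.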


Properties to simplify an operator tree

• R NJN R → R
• R UN R → R
• R DF R → 0
• R NJN SL_F R → SL_F R
• R UN SL_F R → R
• R DF SL_F R → SL_(NOT F) R
• (SL_F1 R) NJN (SL_F2 R) → SL_(F1 AND F2) R
• (SL_F1 R) UN (SL_F2 R) → SL_(F1 OR F2) R
• (SL_F1 R) DF (SL_F2 R) → SL_(F1 AND NOT F2) R
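These simplification properties can be checked on a small instance. The sketch below relies on one stated assumption: for two relations over the same schema, the natural join coincides with set intersection, so `njn` is written that way. The relation `R` and the predicates are invented for the example.

```python
R = {(1, 10), (2, 20), (3, 30), (4, 40)}   # hypothetical tuples (A, B)

def sl(pred, rel):
    """SL: selection."""
    return {t for t in rel if pred(t)}

def njn(a, b):
    """Natural join of two relations with identical schemas (= intersection)."""
    return a & b

F1 = lambda t: t[0] >= 2
F2 = lambda t: t[1] <= 30

checks = {
    "R NJN R": njn(R, R) == R,
    "R UN R": (R | R) == R,
    "R DF R": (R - R) == set(),
    "R NJN SL_F R": njn(R, sl(F1, R)) == sl(F1, R),
    "R UN SL_F R": (R | sl(F1, R)) == R,
    "R DF SL_F R": (R - sl(F1, R)) == sl(lambda t: not F1(t), R),
    "NJN of selections": njn(sl(F1, R), sl(F2, R))
        == sl(lambda t: F1(t) and F2(t), R),
    "UN of selections": (sl(F1, R) | sl(F2, R))
        == sl(lambda t: F1(t) or F2(t), R),
    "DF of selections": (sl(F1, R) - sl(F2, R))
        == sl(lambda t: F1(t) and not F2(t), R),
}

assert all(checks.values())
```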


Sub-expression of a query is empty

• SL_F (0) → 0
• PJ_A (0) → 0
• R CP 0 → 0
• R UN 0 → R
• R DF 0 → R
• 0 DF R → 0
• R JN_F 0 → 0
• R SJ_F 0 → 0
• 0 SJ_F R → 0

where 0 is the empty relation.
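The empty-relation rules lend themselves to a bottom-up rewrite of the operator tree. The following is a minimal sketch, assuming a hypothetical tuple encoding: leaves are relation names, unary nodes are `(op, argument, child)` and binary nodes are `(op, left, right)` with the join formula omitted. The symmetric cases (for example 0 UN R) are included by commutativity.

```python
EMPTY = "0"

def simplify(node):
    """Propagate the empty relation upwards through an operator tree."""
    if isinstance(node, str):              # leaf: relation name or "0"
        return node
    op, *args = node
    if op in ("SL", "PJ"):                 # SL_F (0) -> 0, PJ_A (0) -> 0
        arg, child = args
        child = simplify(child)
        return EMPTY if child == EMPTY else (op, arg, child)
    left, right = simplify(args[0]), simplify(args[1])
    if op == "UN":                         # R UN 0 -> R
        if left == EMPTY:
            return right
        if right == EMPTY:
            return left
    elif op == "DF":                       # 0 DF R -> 0, R DF 0 -> R
        if left == EMPTY:
            return EMPTY
        if right == EMPTY:
            return left
    elif op in ("CP", "JN", "SJ"):         # any empty operand -> 0
        if EMPTY in (left, right):
            return EMPTY
    return (op, left, right)

# (R JN SL_F 0) UN S collapses to S.
example = simplify(("UN", ("JN", "R", ("SL", "F", EMPTY)), "S"))
```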


Criteria for Query Optimization

1. Use idempotence of selection and projection to generate appropriate selections and projections for each operand relation.

2. Push selections and projections down in the tree as far as possible.

3. Push selections down to the leaves of the tree, and then apply them using the algebra of qualified relations. Substitute the selection result with the empty relation if the qualification of the result is contradictory.

4. Use the algebra of qualified relations to evaluate the qualification of operands of joins. Substitute the subtree, including the join and its operands, with the empty relation if the qualification of the result of the join is contradictory.

5. In order to distribute joins that appear in the global query, unions (representing fragment collections) must be pushed up, beyond the joins that we want to distribute.


A modified operator tree for query Q1 by criterion 1 (use of idempotence)

Modified operator tree:

PJ EMPNUM, PNUM ((PJ DEPTNUM, PNUM SUPPLY) JN DEPTNUM = DEPTNUM (PJ DEPTNUM, EMPNUM SL SALARY > 1000000 EMP))

[Figure: operator tree of Q1 before the modification, with nodes PJ EMPNUM, PNUM; JN DEPTNUM = DEPTNUM; SL SALARY > 1000000; and leaves EMP and SUPPLY.]

Note: R DF SL_F R → SL_(NOT F) R


Decompose query Q2 by criterion 2 (push selection/projection down in the tree)

• Q2: give the names of employees who work in a department whose manager has number 373 but who do not earn more than $35,000

PJ EMP.NAME ((EMP JN DEPTNUM=DEPTNUM SL MGRNUM=373 DEPT) DF (SL SAL>35000 EMP JN DEPTNUM=DEPTNUM SL MGRNUM=373 DEPT))


Operator tree

PJ EMP.NAME ((EMP JN DEPTNUM = DEPTNUM (SL MGRNUM=373 DEPT)) DF ((SL SAL>35000 EMP) JN DEPTNUM = DEPTNUM (SL MGRNUM=373 DEPT)))


Idempotence

PJ EMP.NAME ((EMP JN DEPTNUM = DEPTNUM (SL MGRNUM=373 DEPT)) DF SL SAL>35000 (EMP JN DEPTNUM = DEPTNUM (SL MGRNUM=373 DEPT)))


Factorization

PJ EMP.NAME (X DF SL SAL>35000 X), where the common sub-expression is X = EMP JN DEPTNUM = DEPTNUM (SL MGRNUM=373 DEPT)


Simplification

PJ EMP.NAME SL SAL<=35000 (EMP JN DEPTNUM = DEPTNUM (SL MGRNUM=373 DEPT))

This applies R DF SL_F R → SL_(NOT F) R; the negation of SAL>35000 is SAL<=35000.


Push down selection in the tree

PJ EMP.NAME ((PJ NAME, DEPTNUM SL SAL<=35000 EMP) JN DEPTNUM = DEPTNUM (PJ DEPTNUM SL MGRNUM=373 DEPT))
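To see that the rewritten Q2 returns the same answer as the original difference-based form, both can be evaluated on a toy instance. This is a sketch only: the sample rows are hypothetical, EMP is cut down to the four attributes the query needs, and the helper `join` is an assumption made for the example.

```python
# Hypothetical data: EMP as (Empnum, Name, Sal, Deptnum), DEPT as (Deptnum, Mgrnum).
EMP = [(1, "Ann", 30000, 10), (2, "Bob", 40000, 10),
       (3, "Cy", 20000, 20), (4, "Di", 35000, 10)]
DEPT = [(10, 373), (20, 500)]

def join(emps, depts):
    """EMP JN DEPTNUM=DEPTNUM DEPT, as a set of concatenated tuples."""
    return {e + d for e in emps for d in depts if e[3] == d[0]}

# Original form: (EMP JN SL DEPT) DF (SL SAL>35000 EMP JN SL DEPT), then PJ NAME.
dept_373 = [d for d in DEPT if d[1] == 373]
all_in_dept = join(EMP, dept_373)
high_paid = join([e for e in EMP if e[2] > 35000], dept_373)
original = {t[1] for t in all_in_dept - high_paid}

# Optimized form: selections and projections pushed to the leaves.
emp_small = {(e[1], e[3]) for e in EMP if e[2] <= 35000}   # PJ NAME, DEPTNUM SL SAL<=35000 EMP
dept_small = {d[0] for d in DEPT if d[1] == 373}           # PJ DEPTNUM SL MGRNUM=373 DEPT
# Join on DEPTNUM followed by the projection on NAME.
optimized = {name for name, deptnum in emp_small if deptnum in dept_small}

assert original == optimized
```

The optimized plan joins much smaller inputs (two-attribute employee tuples and a single-attribute department list) and needs only one join instead of two.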


Decompose query Q3 by criterion 3 (eliminate empty set)

Q3: SL DEPTNUM=1 DEPT

Given:

DEPT = [DEPT1: DEPTNUM<10] UN [DEPT2: 10<DEPTNUM<20] UN [DEPT3: DEPTNUM>20]


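SL DEPTNUM=1 ([DEPT1: DEPTNUM<10] UN [DEPT2: 10<DEPTNUM<20] UN [DEPT3: DEPTNUM>20])

The qualifications of DEPT2 and DEPT3 contradict DEPTNUM=1, so those branches are replaced by the empty relation, leaving:

SL DEPTNUM=1 [DEPT1: DEPTNUM<10]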
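The contradiction test behind this reduction can be sketched by modelling each fragment qualification as an open interval on DEPTNUM. The dictionary `FRAGMENTS` and the function `relevant_fragments` are assumptions made for illustration; the bounds are copied from the predicates given for Q3.

```python
import math

# Each qualification lo < DEPTNUM < hi, taken from the fragment definitions of Q3.
FRAGMENTS = {
    "DEPT1": (-math.inf, 10),
    "DEPT2": (10, 20),
    "DEPT3": (20, math.inf),
}

def relevant_fragments(value):
    """Fragments whose qualification is not contradicted by DEPTNUM = value."""
    return [name for name, (lo, hi) in FRAGMENTS.items() if lo < value < hi]

q3_fragments = relevant_fragments(1)
```

For `DEPTNUM = 1` only `DEPT1` survives, so the selection needs to be sent to a single fragment.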


Decompose query Q4 by criterion 4 (eliminate irrelevant joins) and 5 (push union up in the tree)

Q4: PJ SNUM (SUPPLY NJN SUPPLIER)

Given:

SUPPLIER => [SUPPLIER1: CITY="SF"] UN [SUPPLIER2: CITY="LA"]

SUPPLY => [SUPPLY1: Snum=SUPPLIER1.Snum] UN [SUPPLY2: Snum=SUPPLIER2.Snum]


Operator tree of Q4:

PJ SNUM (SUPPLY NJN SUPPLIER)


PJ SNUM (([SUPPLY1: SNUM=SUPPLIER.SNUM AND SUPPLIER.CITY="SF"] UN [SUPPLY2: SNUM=SUPPLIER.SNUM AND SUPPLIER.CITY="LA"]) NJN ([SUPPLIER1: CITY="SF"] UN [SUPPLIER2: CITY="LA"]))


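Pushing the unions up distributes the join over the fragments:

(PJ SNUM ([SUPPLY1: SNUM=SUPPLIER.SNUM AND SUPPLIER.CITY="SF"] NJN [SUPPLIER1: CITY="SF"]))
UN (PJ SNUM ([SUPPLY2: SNUM=SUPPLIER.SNUM AND SUPPLIER.CITY="LA"] NJN [SUPPLIER2: CITY="LA"]))
UN (PJ SNUM ([SUPPLY1: SNUM=SUPPLIER.SNUM AND SUPPLIER.CITY="SF"] NJN [SUPPLIER2: CITY="LA"]))
UN (PJ SNUM ([SUPPLY2: SNUM=SUPPLIER.SNUM AND SUPPLIER.CITY="LA"] NJN [SUPPLIER1: CITY="SF"]))

Elimination of empty relations: the last two joins have contradictory qualifications (CITY="SF" AND CITY="LA"), so they are replaced by the empty relation.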


Optimized operator tree

(PJ SNUM ([SUPPLY1: SNUM=SUPPLIER.SNUM AND SUPPLIER.CITY="SF"] NJN [SUPPLIER1: CITY="SF"]))
UN (PJ SNUM ([SUPPLY2: SNUM=SUPPLIER.SNUM AND SUPPLIER.CITY="LA"] NJN [SUPPLIER2: CITY="LA"]))
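The effect of distributing the join over the fragments can be reproduced on sample data. The rows below and the helper names `semijoin` and `pj_snum_of_join` are hypothetical; SUPPLY is fragmented by a semi-join with the matching SUPPLIER fragment, as in the fragment definitions given for Q4.

```python
# Hypothetical data: SUPPLIER(Snum, Name, City), SUPPLY(Snum, Pnum, Deptnum, Quan).
SUPPLIER = [(1, "Acme", "SF"), (2, "Bolt", "LA"), (3, "Core", "SF")]
SUPPLY = [(1, 100, 15, 5), (2, 101, 15, 7), (3, 100, 20, 2), (2, 102, 20, 1)]

supplier1 = [s for s in SUPPLIER if s[2] == "SF"]
supplier2 = [s for s in SUPPLIER if s[2] == "LA"]

def semijoin(supply, suppliers):
    """SUPPLY SJ Snum=Snum SUPPLIERi: the derived fragment of SUPPLY."""
    keys = {s[0] for s in suppliers}
    return [t for t in supply if t[0] in keys]

supply1 = semijoin(SUPPLY, supplier1)
supply2 = semijoin(SUPPLY, supplier2)

def pj_snum_of_join(supply, suppliers):
    """PJ SNUM (supply NJN suppliers), joining on Snum."""
    return {t[0] for t in supply for s in suppliers if t[0] == s[0]}

global_result = pj_snum_of_join(SUPPLY, SUPPLIER)

# Cross-fragment joins have contradictory qualifications and are empty.
cross_12 = pj_snum_of_join(supply1, supplier2)
cross_21 = pj_snum_of_join(supply2, supplier1)

# Only the two matching fragment joins are needed.
optimized = pj_snum_of_join(supply1, supplier1) | pj_snum_of_join(supply2, supplier2)

assert cross_12 == set() and cross_21 == set()
assert optimized == global_result
```

Dropping the two empty joins halves the number of fragment-level joins while leaving the result unchanged.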


Lecture summary

Many heuristic rules can be applied to reduce the cost of an SQL query, as follows:

Idempotence of select and project.

Push select and project down in the operator tree.

Qualify relation first for eliminating contradictory relation(s).

Evaluate join operations with qualified relation(s).

Push Union operation up in the operator tree.


Review Question 5

(1) Discuss the reasons for converting SQL queries into relational algebra queries before optimization is done.

(2) Show an example of qualifying the following relation R (Key, Attribute 1, Attribute 2) into relations R1 and R2 by
• Horizontal fragmentation
• Vertical fragmentation

(3) Show how to reconstruct relations R1 and R2 back into relation R from
(a) Horizontal fragmentation
(b) Vertical fragmentation


Make-up Tutorial Question 5

Given:
Relation PATIENT (Pnum, Name, Dept, Treat, Dnum)
Relation CARE (*Pnum, Drug, Quan)

A query is needed to find the names of the patients who are taking the drug "aspirin". List the query
(i) in SQL (20%)
(ii) in relational algebra (20%)
(iii) as an operator tree (20%)

(iv) Given its qualified relations (fragments) as follows:
Relation PATIENT1 = SL Dept="surgery" AND Treat="intensive" PATIENT
Relation PATIENT2 = SL Dept="surgery" AND Treat≠"intensive" PATIENT
Relation PATIENT3 = SL Dept≠"surgery" PATIENT
Relation CARE1 = CARE SJ Pnum=Pnum PATIENT1
Relation CARE2 = CARE SJ Pnum=Pnum PATIENT2
Relation CARE3 = CARE SJ Pnum=Pnum PATIENT3

Translate the query into fragment queries (20%) and simplify them for optimization (20%).


Reading Assignment

Chapter 15, "Algorithms for Query Processing and Optimization", of "Fundamentals of Database Systems" by Elmasri and Navathe, 5th edition, Pearson, 2007.