TRANSCRIPT
1
Database Query Execution
Zack Ives, CSE 544 - Principles of DBMS
Ullman Chapter 6, Query Execution
Spring 1999
2
Query Execution
Inputs: query execution plan from the optimizer, data from source relations, indices
Outputs: query results, data distribution statistics
(Also uses temporary storage)
3
Query Plans
Data-flow tree (or graph) of relational algebra operators
Statically pre-compiled vs. dynamic decisions: "choose nodes", competition, fragments
[Figure: query-plan tree with operators Select (Client = "Atkins"), Join (PressRel.Symbol = Clients.Symbol), Scan (PressRel), Scan (Clients), Join (Symbol = Northwest.CoSymbol), Project (CoSymbol), Scan (Northwest)]
4
Plan Execution
Execution granularity & parallelism: pipelining vs. blocking, threads, materialization
Execution flow: iterator/top-down vs. data-driven/bottom-up
5
Data-Driven Execution
Schedule the leaves (generally a parallel or distributed system)
Leaves feed data "up" the tree; may need to buffer
Good for slow sources or parallel/distributed execution
Often less efficient than the iterator model w.r.t. memory and CPU
6
The Iterator Model
Execution begins at the root: open, getNext, close
Calls propagate to the children
A non-pipelined operation may require multiple getNexts
Efficient scheduling & resource usage
Poor if sources are slow (getNext may block)
7
Tukwila Modified Iterator Model
Same operations: open, getNext, close
Some operators are multithreaded
Use buffering and synchronization
Schedule work while blocked
Multithreaded operators need more memory
8
The Cost of Execution
Costs are very important to the optimizer: it must search for a low-cost query execution plan
Statistics: cardinalities, histograms (estimate selectivities)
Impact of data integration?
I/O vs. computation costs
Time-to-first-tuple vs. completion time
9
Reducing Costs with Buffering
Read a page/block at a time (should look familiar to OS people!)
Use a page replacement strategy: LRU (not as good as you might think), MRU (good for one-time sequential scans), Clock, etc.
Note that we have more knowledge than the OS to predict paging behavior, e.g. a one-time scan should use MRU
Can also prefetch when appropriate
[Figure: buffer manager mediating tuple reads/writes]
10
Select Operator
If unsorted & no index, check each tuple against the predicate:
  Read tuple
  While tuple doesn't meet predicate
    Read tuple
  Return tuple
Sorted data: can stop after a particular value is encountered
Indexed data: apply the predicate to the index, if possible
If the predicate is a:
  conjunction: may use indexes and/or the scanning loop above (may need to sort/hash to compute the intersection)
  disjunction: may use a union of index results, or the scanning loop
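The scanning loop above can be sketched in a few lines of Python; this is a minimal illustration, assuming tuples are represented as dicts and the predicate is an arbitrary callable (the Clients data is hypothetical, echoing the Client = "Atkins" example from the plan figure):

```python
def select_scan(tuples, predicate):
    # Unsorted, unindexed case: read every tuple, return those
    # that satisfy the predicate.
    for t in tuples:
        if predicate(t):
            yield t

# Hypothetical Clients relation
clients = [{"Client": "Atkins", "Symbol": "MSFT"},
           {"Client": "Boyd", "Symbol": "IBM"}]
matches = list(select_scan(clients, lambda t: t["Client"] == "Atkins"))
```

A generator keeps the operator pipelined: each getNext from the parent pulls one qualifying tuple.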
11
Project Operator
Simple scanning method often used if no index:
  Read tuple
  While more tuples
    Output specified attributes
    Read tuple
Duplicate removal may be necessary
  Partition the output into separate files by bucket, do duplicate removal on those
  May need to use recursion
  If there are many duplicates, sorting may be better
Can sometimes do an index-only scan, if the projected attributes are all indexed
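A minimal sketch of the project operator, assuming dict tuples. Here duplicate removal is done with a single in-memory hash set; the slide's partitioned, possibly recursive version is what you would use when the distinct output does not fit in memory:

```python
def project(tuples, attrs, distinct=False):
    # Output only the specified attributes from each input tuple.
    seen = set()  # in-memory stand-in for hash-partitioned dedup files
    for t in tuples:
        out = tuple(t[a] for a in attrs)
        if distinct:
            if out in seen:
                continue  # duplicate: drop it
            seen.add(out)
        yield out

rows = [{"Symbol": "MSFT", "Qty": 10}, {"Symbol": "MSFT", "Qty": 20}]
result = list(project(rows, ["Symbol"], distinct=True))
```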
12
The Simplest Join — Nested-Loops
Requires two nested loops:
  For each tuple in the outer relation
    For each tuple in the inner, compare
      If match on the join attribute, output
Block nested-loops join: read & match a page at a time
What if the join attributes are indexed? Index nested-loops join
Very simple to implement
Inefficient if the size of the inner relation > memory (keeps swapping pages); requires sequential search for a match
[Figure: Join operator over outer and inner inputs]
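The two nested loops above translate directly to code; a minimal tuple-at-a-time sketch (dict tuples, hypothetical PressRel/Clients data; a real implementation would work a page/block at a time):

```python
def nested_loops_join(outer, inner, outer_key, inner_key):
    # For each outer tuple, scan the entire inner relation for matches.
    for o in outer:
        for i in inner:
            if o[outer_key] == i[inner_key]:
                yield {**o, **i}

press_rel = [{"Symbol": "MSFT", "Headline": "Earnings up"}]
clients = [{"Symbol": "MSFT", "Client": "Atkins"}]
joined = list(nested_loops_join(press_rel, clients, "Symbol", "Symbol"))
```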
13
Sort-Merge Join
First sort the data based on the join attributes
  Use an external sort (as previously described), unless the data is already ordered
Merge and join the files, reading sequentially a block at a time
  Maintain two file pointers; advance the pointer that's pointing at guaranteed non-matches
Allows joins based on inequalities (non-equijoins)
Very efficient for presorted data
Not pipelined unless the data is presorted
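The merge phase can be sketched as follows; a minimal in-memory version (dict tuples, sorted lists standing in for sorted files) that advances whichever pointer is at a guaranteed non-match and joins all pairs sharing a key value:

```python
def sort_merge_join(left, right, key):
    left = sorted(left, key=lambda t: t[key])    # external sort in practice
    right = sorted(right, key=lambda t: t[key])
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i][key] < right[j][key]:
            i += 1                               # left tuple can't match
        elif left[i][key] > right[j][key]:
            j += 1                               # right tuple can't match
        else:
            v = left[i][key]
            i2 = i
            while i2 < len(left) and left[i2][key] == v:
                j2 = j
                while j2 < len(right) and right[j2][key] == v:
                    out.append({**left[i2], **right[j2]})
                    j2 += 1
                i2 += 1
            i, j = i2, j2                        # past this key on both sides
    return out

left = [{"Symbol": "IBM", "x": 1}, {"Symbol": "MSFT", "x": 2}]
right = [{"Symbol": "MSFT", "y": 3}]
merged = sort_merge_join(left, right, "Symbol")
```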
14
Hashing it Out: Hash-Based Joins
Allows (at least some) pipelining of operations with equality comparisons (e.g. equijoin, union)
Sort-based operations block, but allow range and inequality comparisons
Hash joins are usually done with a static number of hash buckets
Alternatives use directories and are more complex: extendible hashing, linear hashing
Generally have fairly long overflow chains
15
Hash Join
Read the entire inner relation into a hash table (join attributes as key)
For each tuple from the outer, look up in the hash table & join
Very efficient, very good for databases
Not fully pipelined
Supports equijoins only
Data integration?
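The build/probe structure above can be sketched in Python; a minimal version assuming the inner relation fits in memory (dict tuples, hypothetical data):

```python
from collections import defaultdict

def hash_join(inner, outer, inner_key, outer_key):
    # Build phase: hash the entire inner relation on its join attribute.
    table = defaultdict(list)
    for t in inner:
        table[t[inner_key]].append(t)
    # Probe phase: stream the outer relation, probing the hash table.
    for o in outer:
        for i in table.get(o[outer_key], []):
            yield {**i, **o}

inner = [{"Symbol": "MSFT", "Client": "Atkins"}]
outer = [{"Symbol": "MSFT", "Headline": "Earnings up"}]
result = list(hash_join(inner, outer, "Symbol", "Symbol"))
```

The build phase blocks, which is why the join is not fully pipelined; the probe phase streams.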
16
Overflowing Memory - GRACE
Two possible strategies:
  Overflow prevention (prevent it from happening)
  Overflow resolution (handle overflow when it occurs)
GRACE hash:
  Write each bucket to a separate file
  Finish reading the inner, swapping tuples to the appropriate files
  Read the outer, swapping tuples to overflow files matching those from the inner
  Recursively GRACE-hash-join the matching outer & inner overflow files
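The key property GRACE relies on is that hashing both relations on the join attribute with the same function sends all potentially matching tuples to the same bucket number. A minimal partitioning sketch (lists stand in for the on-disk bucket files):

```python
def grace_partition(tuples, key, n_buckets):
    # Hash each tuple's join attribute to one of n_buckets partition files.
    buckets = [[] for _ in range(n_buckets)]
    for t in tuples:
        buckets[hash(t[key]) % n_buckets].append(t)
    return buckets

inner_parts = grace_partition([{"Symbol": "MSFT"}, {"Symbol": "IBM"}],
                              "Symbol", 4)
outer_parts = grace_partition([{"Symbol": "MSFT", "Qty": 5}], "Symbol", 4)
```

Bucket i of the inner then needs to be joined only with bucket i of the outer, recursing if a bucket pair is still too large for memory.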
17
Overflowing Memory - Hybrid Hash
A "lazy" version of the GRACE hash:
  When memory overflows, only swap a subset of the tables
  Continue reading the inner relation and building the table (sending tuples to buckets on disk as necessary)
  Read the outer, joining with buckets in memory or swapping to disk as appropriate
  Join the corresponding overflow files, using recursion
18
Double-Pipelined Join
Two hash tables
As a tuple comes in, add it to the appropriate side & join it with the opposite table
Fully pipelined, data-driven
Needs more memory
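The symmetric insert-then-probe step can be sketched as follows; a minimal in-memory version, assuming tuples arrive interleaved on a single stream tagged with their side (the stream encoding and data are illustrative, not from the original):

```python
from collections import defaultdict

def double_pipelined_join(stream, key_a, key_b):
    # stream yields ("A", tuple) or ("B", tuple) in arrival order.
    tables = {"A": defaultdict(list), "B": defaultdict(list)}
    keys = {"A": key_a, "B": key_b}
    for side, t in stream:
        other = "B" if side == "A" else "A"
        k = t[keys[side]]
        tables[side][k].append(t)           # insert into this side's table
        for m in tables[other].get(k, []):  # probe the opposite table
            yield {**m, **t}

stream = [("A", {"Symbol": "MSFT", "Client": "Atkins"}),
          ("B", {"Symbol": "MSFT", "Qty": 5})]
out = list(double_pipelined_join(stream, "Symbol", "Symbol"))
```

Because every arriving tuple probes immediately, results flow as soon as matches exist, which is what makes the join fully pipelined; the price is keeping both hash tables in memory.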
19
Double Pipelined Join Performance for Data Integration
[Figure: join performance for Lineitem, Supplier, Order. Time (sec) vs. number of tuples output (1000's), comparing Double Pipelined, Hybrid (Lineitem join Supplier) join Order, and Hybrid (Supplier join Lineitem) join Order]
20
Overflow Resolution in the DPJoin
Requires a bunch of ugly bookkeeping! Need to mark tuples depending on the state of the opposite bucket; this lets us know whether they need to be joined later
Tukwila "incremental left flush" strategy:
  Pause reading from the outer relation, swap some of its buckets
  Finish reading from the inner; still join with the left-side hash table if possible, or swap to disk
  Read the outer relation, join with the inner's hash table
  Read from the overflow files and join as in hybrid hash join
21
Overflow Resolution, Pt. II
Tukwila "symmetric flush" strategy:
  Flush all tuples for the same bucket from both sides
  Continue joining; when done, join the overflow files by hybrid hash
Urhan and Franklin's XJoin:
  Flush buckets from either relation
  If stalled, start trying to join from the overflow files
  Needs lots of really nasty bookkeeping
22
Performance of Overflow Methods
[Figure: performance of overflow methods. Time (sec) vs. number of tuples produced (1000's), comparing Left Flush (32MB, 16MB), Symmetric Flush (32MB, 16MB), and Fits in Memory (64MB)]
23
The Semi-Join/Dependent Join
Take attributes from the left and feed them to the right source as input/filter
Important in data integration
Simple method:
  For each tuple from the left
    Send to the right source
    Get data back, join
More complex: hash "cache" of attributes & mappings; don't send an attribute already seen
[Figure: Join (A.x = B.y), with x fed from source A to source B]
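The cached variant can be sketched as follows; a minimal version where fetch_right is a hypothetical callable standing in for a query to the right-hand source, and the returned tuples are illustrative:

```python
def dependent_join(left, key, fetch_right):
    # Feed each left tuple's join-attribute value to the right source,
    # caching results so a value already seen is never re-sent.
    cache = {}
    for l in left:
        v = l[key]
        if v not in cache:
            cache[v] = fetch_right(v)   # one remote call per distinct value
        for r in cache[v]:
            yield {**l, **r}

calls = []
def fetch(v):                            # stand-in for the remote source
    calls.append(v)
    return [{"y": v, "Price": 42}]

left = [{"x": "MSFT"}, {"x": "MSFT"}]
out = list(dependent_join(left, "x", fetch))
```

With the duplicate "MSFT" on the left, the cache means the right source is contacted only once, which is exactly the saving the slide's hash "cache" of attributes provides.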
25
Issues in Choosing Joins
Goal: minimize I/O costs!
Is the data pre-sorted?
How much memory do I have and need?
Selectivity estimates
Inner relation vs. outer relation
Am I doing an equijoin or some other join?
Is pipelining important?
How confident am I in my estimates?
Partition such that partition files don't overflow!