replacing geqo - join ordering via simulated …replacing geqo join ordering via simulated annealing...

72
Replacing GEQO Join ordering via Simulated Annealing Jan Urba´ nski [email protected] University of Warsaw / Flumotion May 21, 2010 Jan Urba´ nski [email protected] (University of Warsaw / Flumotion) Replacing GEQO May 21, 2010 1 / 46

Upload: others

Post on 23-Mar-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

Replacing GEQOJoin ordering via Simulated Annealing

Jan [email protected]

University of Warsaw / Flumotion

May 21, 2010

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 1 / 46

Page 2: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

For those following at home

Getting the slides

$ wget http://wulczer.org/saio.pdf

Trying it out

$ git clone git://wulczer.org/saio.git

$ cd saio && make

$ psql

=# load ’.../saio.so’;

Do not try to compile against an assert-enabled build. Do not be scaredby lots of ugly code.

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 2 / 46

Page 3: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

1 The problemDetermining join order for large queriesGEQO, the genetic query optimiser

2 The solutionSimulated Annealing overviewPostgreSQL specificsQuery tree transformations

3 The resultsComparison with GEQO

4 The futureDevelopment focuses

5 The end

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 3 / 46

Page 4: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The problem Determining join order for large queries

Outline

1 The problemDetermining join order for large queriesGEQO, the genetic query optimiser

2 The solution

3 The results

4 The future

5 The end

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 4 / 46

Page 5: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The problem Determining join order for large queries

Getting the optimal join order

I part of of planning a query is determining the order in which relationsare joined

I it is not unusual to have queries that join lots of relations

I JOIN and subquery flattening contributes to the number or relationsto join

I automatically generated queries can involve very large joins

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 5 / 46

Page 6: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The problem Determining join order for large queries

Problems with join ordering

I finding the optimal join order is an NP-hard problem

I considering all possible ways to do a join can exhaust availablememory

I not all join orders are valid, because of:I outer joins enforcing a certain join orderI IN and EXISTS clauses that get converted to joins

I joins with restriction clauses are preferable to Cartesian joins

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 6 / 46

Page 7: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The problem GEQO, the genetic query optimiser

Outline

1 The problemDetermining join order for large queriesGEQO, the genetic query optimiser

2 The solution

3 The results

4 The future

5 The end

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 7 / 46

Page 8: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The problem GEQO, the genetic query optimiser

Randomisation helps

I PostgreSQL switches from exhaustive search to a randomisedalgorithm after a certain limit

I GEQO starts by joining the relations in any order

I and then proceeds to randomly change the join order

I genetic algorithm techniques are used to choose the cheapest joinorder

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 8 / 46

Page 9: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The problem GEQO, the genetic query optimiser

Problems with GEQO

I has lots of dead/experimental code

I there is a TODO item to remove it

I nobody really cares about it

I is an adaptation of an algorithm to solve TSP, not necessarily bestsuited to join ordering

I requires some cooperation from the planner, which violatesabstractions

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 9 / 46

Page 10: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

Outline

1 The problem

2 The solutionSimulated Annealing overviewPostgreSQL specificsQuery tree transformations

3 The results

4 The future

5 The end

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 10 / 46

Page 11: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

Previous work

I numerous papers on optimising join order have been written

I Adriano Lange implemented a prototype using a variation ofSimulated Annealing

I other people discussed the issue on -hackers

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 11 / 46

Page 12: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

What is Simulated Annealing

Annealing (...) is a processthat produces conditions byheating to above there-crystallisation temperatureand maintaining a suitabletemperature, and then cooling.

– Wikipedia

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 12 / 46

Page 13: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA Algorithm cont.

I the system starts with an initial temperature and a random state

I uphill moves are accepted with probability that depends on thecurrent temperature

probability of accepting an uphill move

p = ecostprev−costnew

temperature

I moves are made until equilibrium is reached

I temperature is gradually lowered

I once the system is frozen, the algorithm ends

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 13 / 46

Page 14: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 14 / 46

Page 15: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 14 / 46

Page 16: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 14 / 46

Page 17: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 14 / 46

Page 18: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 14 / 46

Page 19: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 14 / 46

Page 20: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA Algorithm cont.

Implementing Simulated Annealing means solving the following problems:

I finding an initial state

I generating subsequent states

I defining an acceptance function

I determining the equilibrium condition

I suitably lowering the temperature

I determining the freeze conditions

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 15 / 46

Page 21: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 16 / 46

Page 22: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 16 / 46

Page 23: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 16 / 46

Page 24: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 16 / 46

Page 25: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 16 / 46

Page 26: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 16 / 46

Page 27: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

The SA algorithm

Simulated Annealing

state = random state()

do {

do {

new state = random move()

if (acceptable(new state))

state = new state

}

while (!equilibrium())

reduce temperature()

}

while (!frozen())

return state

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 16 / 46

Page 28: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Simulated Annealing overview

A visual example

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 17 / 46

Page 29: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution PostgreSQL specifics

Outline

1 The problem

2 The solutionSimulated Annealing overviewPostgreSQL specificsQuery tree transformations

3 The results

4 The future

5 The end

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 18 / 46

Page 30: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution PostgreSQL specifics

Differences from the original algorithm

I PostgreSQL always considers all possible paths for a relation

I make join rel is symmetrical

I you can have join order constraints (duh)

I the planner is keeping a list of all relations...

I ... and sometimes turns it into a hash

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 19 / 46

Page 31: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution PostgreSQL specifics

Join order representation

I SAIO represents joins as query trees

I chosen to mimic the original algorithm more closely

I each state is a query tree

I leaves are basic relations

I internal nodes are joinrels

I the joinrel in the root of the tree is the current result

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 20 / 46

Page 32: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution PostgreSQL specifics

Query trees

Example query tree for a sixrelation join.

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 21 / 46

Page 33: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution PostgreSQL specifics

Query trees cont.

Some useful query tree properties:

I symmetrical (no difference between left and right child)

I fully determined by the tree structure and relations in leaves

I each node has a cost

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 22 / 46

Page 34: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution PostgreSQL specifics

Query trees cont.

Some useful query tree properties:

I symmetrical (no difference between left and right child)

I fully determined by the tree structure and relations in leaves

I each node has a cost

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 22 / 46

Page 35: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution PostgreSQL specifics

Query trees cont.

Some useful query tree properties:

I symmetrical (no difference between left and right child)

I fully determined by the tree structure and relations in leaves

I each node has a cost

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 22 / 46

Page 36: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

Outline

1 The problem

2 The solutionSimulated Annealing overviewPostgreSQL specificsQuery tree transformations

3 The results

4 The future

5 The end

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 23 / 46

Page 37: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

Implementing Simulated Annealing means solving the following problems:

I finding an initial state

I generating subsequent states

I defining an acceptance function

I determining the equilibrium condition

I suitably lowering the temperature

I determining the freeze conditions

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 24 / 46

Page 38: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

Implementing Simulated Annealing means solving the following problems:

I finding an initial state

I generating subsequent states

I defining an acceptance function

I determining the equilibrium condition

I suitably lowering the temperature

I determining the freeze conditions

Some are easy

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 24 / 46

Page 39: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

Implementing Simulated Annealing means solving the following problems:

I finding an initial state

I generating subsequent states

I defining an acceptance function

I determining the equilibrium condition

I suitably lowering the temperature

I determining the freeze conditions

Some are easy, some are hard

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 24 / 46

Page 40: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

The easy problems - initial state

Finding an initial state

Make base relations into one-node trees, keep merging them on joins withrestriction clauses, forcefully merge the remaining ones using Cartesianjoins. Results in a query tree that is as left-deep as possible.

This is exactly what GEQO does.

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 25 / 46

Page 41: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

The easy problems - temperature

The acceptance function

A uphill move is accepted with the probability that depends on the currenttemperature.

P(accepted) = ecostprev−costnew

temperature

Lowering the temperature

The initial temperature depends on the number of initial relations anddrops geometrically.

initial temperature = I ∗ initial rels

new temperature = temperature ∗ K

where0 < K < 1

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 26 / 46

Page 42: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

The easy problems - equilibrium and freezing

Equilibrium condition

Equilibrium is reached after a fixed number of moves that depend on thenumber of initial relations.

moves to equilibrium = N ∗ initial rels

Freezing condition

The system freezes if temperature falls below 1 and a fixed number ofconsecutive moves has failed.

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 27 / 46

Page 43: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

Move types

The difficult part seems to be generating subsequent states.

I a number of move generating approaches can be taken

I the most costly operations is creating a joinrel, especially computingpaths

I need to free memory between steps, otherwise risk overrunning

I need to deal with planner scribbling on its structures when creatingjoinrels

I how to efficiently sample the solution space?

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 28 / 46

Page 44: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO move

I randomly choose two nodes from the query tree

I swap the subtrees around

I recalculate the whole query tree

I if it cannot be done, the move fails

I check if the cost of the new tree is acceptable

I if not, the move fails

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 29 / 46

Page 45: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO move example

Take a tree,

choose two nodes, swap them around

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 30 / 46

Page 46: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO move example

Take a tree, choose two nodes,

swap them around

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 30 / 46

Page 47: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO move example

Take a tree, choose two nodes, swap them around

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 30 / 46

Page 48: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO move problems

I choosing a node narrows down the possible choices for the secondnode

I can’t choose descendant node (how would that work?)I can’t choose ancestor node (for the same reason)I can’t choose sibling node (because of symmetry)

I the changes to the tree are big, the algorithm takes “large steps”

I if the resulting query tree is invalid, lots of work is thrown away

I any join failure results in the whole move failing, so it doesn’t explorethe solution space very deeply

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 31 / 46

Page 49: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO pivot

I change (A join B) join C into A join (B join C)

I in practise, choose a node at random

I swap the subtrees of one of its children and the sibling’s

I continue trying such pivots until all nodes have been tried

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 32 / 46

Page 50: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO pivot example

Take a tree,

choose a node, pivot, choose another, pivot ...

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 33 / 46

Page 51: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO pivot example

Take a tree, choose a node,

pivot, choose another, pivot ...

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 33 / 46

Page 52: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO pivot example

Take a tree, choose a node, pivot,

choose another, pivot ...

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 33 / 46

Page 53: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO pivot example

Take a tree, choose a node, pivot, choose another,

pivot ...

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 33 / 46

Page 54: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO pivot example

Take a tree, choose a node, pivot, choose another, pivot ...

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 33 / 46

Page 55: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO pivot problems

I each move explores a lot of possibilities, but requires lots ofcomputation

I does not introduce big changes, which sometimes are needed to breakpessimal joins

I actually, it’s not obvious that the solution space is smooth wrt costsI small changes in the structure may result it gigantic changes in costsI might want to augment the cost assessment function (number of

non-cross joins?)

I the same join might be recalculated many times in each step

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 34 / 46

Page 56: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO recalc

I essentially the same as move

I but keeps all the relations built between moves

I recalculate the joins from the chosen nodes up to the commonancestor

I if it succeeded, recalculate the nodes from the common ancestor upto the root node

I avoids pointless recalculations when joins fail

I each query tree node has its own memory context

I needs to remove individual joinrels from the planner (nasty!)

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 35 / 46

Page 57: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO recalc example

Take a tree,

choose two nodes, swap them around, recalculate up to thecommon ancestor, just discard the built joinrels if failure, recalculate therest of the relations

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 36 / 46

Page 58: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO recalc example

Take a tree, choose two nodes,

swap them around, recalculate up to thecommon ancestor, just discard the built joinrels if failure, recalculate therest of the relations

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 36 / 46

Page 59: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO recalc example

Take a tree, choose two nodes, swap them around,

recalculate up to thecommon ancestor, just discard the built joinrels if failure, recalculate therest of the relations

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 36 / 46

Page 60: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO recalc example

Take a tree, choose two nodes, swap them around, recalculate up to thecommon ancestor,

just discard the built joinrels if failure, recalculate therest of the relations

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 36 / 46

Page 61: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO recalc example

Take a tree, choose two nodes, swap them around, recalculate up to thecommon ancestor, just discard the built joinrels if failure,

recalculate therest of the relations

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 36 / 46

Page 62: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO recalc example

Take a tree, choose two nodes, swap them around, recalculate up to thecommon ancestor, just discard the built joinrels if failure, recalculate therest of the relations

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 36 / 46

Page 63: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The solution Query tree transformations

SAIO recalc problems

I does really nasty hacks

I does not speed things up as much as it should

I does not solve the problem of failing, it just makes failures cheaper

I because the trees are usually left-deep, the benefits from partialrecalculation are not as big

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 37 / 46

Page 64: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The results Comparison with GEQO

Outline

1 The problem

2 The solution

3 The resultsComparison with GEQO

4 The future

5 The end

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 38 / 46

Page 65: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The results Comparison with GEQO

Moderately big query

Collapse limits set to 100. Move algorithm used is recalc.

algorithm equilibrium loops temp reduction avg cost avg time

GEQO n/a n/a 1601.540000 0.54379

SAIO 4 0.6 1623.874000 0.42599

SAIO 6 0.9 1617.341000 3.69639

SAIO 8 0.7 1618.838000 1.44605

SAIO 12 0.4 1627.873000 0.87609

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 39 / 46

Page 66: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The results Comparison with GEQO

Very big query

Collapse limits set to 100. Move algorithm used is recalc.

algorithm equilibrium loops temp reduction avg cost avg time

GEQO n/a n/a 22417.210000 777.20033

SAIO 4 0.6 21130.063000 145.27570

SAIO 6 0.4 21218.134000 125.08545

SAIO 6 0.6 21131.333000 240.70522

SAIO 8 0.4 21250.160000 179.34261

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 40 / 46

Page 67: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The future Development focuses

Outline

1 The problem

2 The solution

3 The results

4 The futureDevelopment focuses

5 The end

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 41 / 46

Page 68: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The future Development focuses

What the future brings

I MSc thesis :o)

I smarter tree transformation methods

I less useless computation

I perhaps some support from the core infrastructure

I faster and higher quality results than GEQO

I git://wulczer.org/saio.git

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 42 / 46

Page 69: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The future Development focuses

Acknowledgements

I Adriano Lange, the author of the TWOPO implementation

I Andres Freund, for providing the craziest test query ever

I Robert Haas, for providing a slightly less crazy test query

I dr Krzysztof Stencel, for help and guidance

I Flumotion Services, for letting me mess with the PG planner insteadof doing work

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 43 / 46

Page 70: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The future Development focuses

Further reading

Yannis E. Ioannidis and Eugene Wong.Query optimization by simulated annealing.SIGMOD Rec., 16(3):9–22, 1987.

Michael Steinbrunn, Guido Moerkotte, and Alfons Kemper.Heuristic and randomized optimization for the join ordering problem.The VLDB Journal, 6(3):191–208, 1997.

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 44 / 46

Page 71: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

The end Questions

Questions?

Jan Urbanski [email protected] (University of Warsaw / Flumotion)Replacing GEQO May 21, 2010 45 / 46

Page 72: Replacing GEQO - Join ordering via Simulated …Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski j.urbanski@wulczer.org University of Warsaw / Flumotion May 21, 2010

Example query tree for a sixrelation join.