The Lost Art of the Self Join
Beat Vontobel, CTO, MeteoNews
[email protected]
Why and what?
• Idea for session dates back to 2005
‣ Sudoku solver in a Stored Procedure (Per-Erik Martin)
‣ „The lost Art of the Join“ (Erik Bergen)
‣ Self Joins in my presentation from last year, „The declarative power of VIEWs“
• A few serious, but simpler examples of Self Joins
• One to be taken less seriously, but more complex
From last year: Paradigms
• Imperative Programming
‣ PHP, C, Java…
‣ Specify the Algorithm: How?
• Declarative Programming
‣ Prolog, Lisp, XSLT, SQL…
‣ Specify the Goal: What?
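Every Table needs an Alias
SELECT child.child AS child, sibling.child AS sibling
FROM parents [AS] child
INNER JOIN parents [AS] sibling
        ON child.parent = sibling.parent
WHERE child.child != sibling.child;

[Diagram: family tree with Martha at the top, Paul and Chris below her, and Julie below Chris]

+--------+-------+
| parent | child |
+--------+-------+
| martha | paul  |
| chris  | julie |
| martha | chris |
+--------+-------+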
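A simple Self Join
SELECT child.child AS child, sibling.child AS sibling
FROM parents [AS] child
INNER JOIN parents [AS] sibling
        ON child.parent = sibling.parent
WHERE child.child != sibling.child;

+-------+---------+
| child | sibling |
+-------+---------+
| Paul  | Chris   |
| Chris | Paul    |
+-------+---------+
2 rows in set (0.00 sec)

[Diagram: family tree with Martha at the top, Paul and Chris below her, and Julie below Chris]

+-------+--------+
| child | parent |
+-------+--------+
| paul  | martha |
| julie | chris  |
| chris | martha |
+-------+--------+

+--------+-------+
| parent | child |
+--------+-------+
| martha | paul  |
| chris  | julie |
| martha | chris |
+--------+-------+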
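The sibling query can be tried without a MySQL server. The following is a minimal sketch that assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for MySQL; the table contents are the three rows from the slide, and the `ORDER BY` is added only to make the output order predictable.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parents (parent TEXT, child TEXT)")
conn.executemany(
    "INSERT INTO parents VALUES (?, ?)",
    [("martha", "paul"), ("chris", "julie"), ("martha", "chris")],
)

# The same table is referenced twice, so each reference needs its own alias.
siblings = conn.execute(
    """
    SELECT child.child AS child, sibling.child AS sibling
    FROM parents AS child
    INNER JOIN parents AS sibling
            ON child.parent = sibling.parent
    WHERE child.child != sibling.child
    ORDER BY child.child
    """
).fetchall()

for row in siblings:
    print(row)
```

Julie does not show up because no other row shares her parent; without the `WHERE` clause every child would also be paired with itself.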
Trees in SQL
• Basic Text Book Example: Employees Table
• „Nested Set Model“
‣ Google for „Trees SQL Mike Hillyer“
Restriction on Self Joins: Temporary Tables
mysql> CREATE TEMPORARY TABLE t (i INT);
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM t t1 CROSS JOIN t t2;
ERROR 1137 (HY000): Can't reopen table: 't1'

Workaround: Create global tables with unique names (e.g. using session ID)

mysql> CREATE TABLE t_89372 (i INT);
Example table: Temperatures
temps
station CHAR(3) PK
dtime TIMESTAMP
temp DECIMAL(3, 1)
mysql1.intern-test [admin] > SELECT * FROM temps;
+---------+---------------------+------+
| station | dtime               | temp |
+---------+---------------------+------+
| ABO     | 2008-04-04 00:10:00 | -2.0 |
| ABO     | 2008-04-04 00:20:00 | -1.9 |
| …       | …                   | …    |
| BAS     | 2008-04-04 00:10:00 |  6.1 |
| BAS     | 2008-04-04 00:20:00 |  6.2 |
| …       | …                   | …    |
+---------+---------------------+------+
Absolute to relative
SELECT current.station AS stat, current.dtime,
       previous.temp AS prev, current.temp AS curr,
       current.temp - previous.temp AS diff
FROM temps current
INNER JOIN temps previous
        ON current.station = previous.station
       AND previous.dtime = current.dtime - INTERVAL 10 MINUTE
ORDER BY diff DESC
LIMIT 10
Absolute to relative
+------+---------------------+-------+-------+------+
| stat | dtime               | prev  | curr  | diff |
+------+---------------------+-------+-------+------+
| SAM  | 2008-04-04 08:20:00 |  -4.8 |  -1.1 |  3.7 |
| MAG  | 2008-04-04 01:10:00 |   7.6 |  10.7 |  3.1 |
| BUF  | 2008-04-04 22:10:00 | -13.1 | -10.2 |  2.9 |
| MAG  | 2008-04-04 05:00:00 |   7.3 |  10.1 |  2.8 |
| CIM  | 2008-04-04 10:00:00 |   1.8 |   4.6 |  2.8 |
| MAG  | 2008-04-04 00:20:00 |   6.0 |   8.4 |  2.4 |
| CHZ  | 2008-04-04 09:40:00 |   7.8 |  10.2 |  2.4 |
| MAG  | 2008-04-04 04:20:00 |   6.3 |   8.7 |  2.4 |
| EGH  | 2008-04-04 12:10:00 |  -8.5 |  -6.2 |  2.3 |
| VIS  | 2008-04-04 05:40:00 |  -1.8 |   0.5 |  2.3 |
+------+---------------------+-------+-------+------+
10 rows in set (0.21 sec)
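A runnable sketch of the same "previous row" join, under two assumptions: SQLite (via Python's `sqlite3`) replaces MySQL, and SQLite's `datetime(..., '-10 minutes')` replaces MySQL's `- INTERVAL 10 MINUTE`. The aliases `cur` and `prev` are shortened from the slide, and the four sample rows are the ones shown in the `temps` excerpt.

```python
import sqlite3

# Assumption: in-memory SQLite instead of MySQL; timestamps stored as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temps (station TEXT, dtime TEXT, temp REAL, "
    "PRIMARY KEY (station, dtime))"
)
conn.executemany(
    "INSERT INTO temps VALUES (?, ?, ?)",
    [
        ("ABO", "2008-04-04 00:10:00", -2.0),
        ("ABO", "2008-04-04 00:20:00", -1.9),
        ("BAS", "2008-04-04 00:10:00", 6.1),
        ("BAS", "2008-04-04 00:20:00", 6.2),
    ],
)

# Each row is joined to the row of the same station measured 10 minutes earlier.
diffs = conn.execute(
    """
    SELECT cur.station, cur.dtime, prev.temp, cur.temp,
           ROUND(cur.temp - prev.temp, 1) AS diff
    FROM temps AS cur
    INNER JOIN temps AS prev
            ON cur.station = prev.station
           AND prev.dtime = datetime(cur.dtime, '-10 minutes')
    ORDER BY cur.station
    """
).fetchall()

for row in diffs:
    print(row)
```

The first measurement of each station has no predecessor, so the inner join silently drops it; a `LEFT JOIN` would keep it with a NULL difference.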
Missed opportunity for a Self Join
// Typical example of „keeping state“ over
// a loop fetching rows from the database

while ($row = mysql_fetch_row($result)) {
    // Computation involving $oldrow and $row
    …
    $oldrow = $row;
}
Why not in a loop?
• SQL code is clearer
• Logical dependency between SQL and application level
• But: „I have to loop anyway! Might even be faster in some cases…“
‣ You have to order by what you need for computation
‣ Different order requested for the result?
‣ You might miss the opportunity to use a framework
Use for any kind of serial data…
• Meteorological data
• Racing lap times
• Fuel used
• Bank account figures
• Stock values
• Webserver hit statistics
• Mails processed
• …
Fill the gaps (simple linear interpolation)
SELECT current.dtime, current.temp AS orig,
       COALESCE(
         current.temp,
         ROUND((prev.temp + next.temp) / 2, 1)
       ) AS interpol
FROM temps current
INNER JOIN temps prev
        ON prev.dtime = current.dtime - INTERVAL 10 MINUTE
       AND prev.station = current.station
INNER JOIN temps next
        ON next.dtime = current.dtime + INTERVAL 10 MINUTE
       AND next.station = current.station
WHERE current.station LIKE 'TAE'
Fill the gaps (simple linear interpolation)
+---------------------+------+----------+
| dtime               | orig | interpol |
+---------------------+------+----------+
| …                   | …    | …        |
| 2008-04-04 16:30:00 |  7.9 |      7.9 |
| 2008-04-04 16:40:00 |  8.0 |      8.0 |
| 2008-04-04 16:50:00 | NULL |      7.9 |
| 2008-04-04 17:00:00 |  7.8 |      7.8 |
| …                   | …    | …        |
+---------------------+------+----------+
142 rows in set (0.01 sec)
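The gap-filling query can be checked against the four rows shown in the result. This sketch assumes SQLite via Python's `sqlite3` in place of MySQL, uses `datetime()` modifiers instead of `INTERVAL`, and renames the aliases to `cur`, `prev` and `nxt`; the table holds only those four rows, so the first and last row have no neighbour on one side.

```python
import sqlite3

# Assumption: in-memory SQLite instead of MySQL; only the four rows from the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temps (station TEXT, dtime TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO temps VALUES (?, ?, ?)",
    [
        ("TAE", "2008-04-04 16:30:00", 7.9),
        ("TAE", "2008-04-04 16:40:00", 8.0),
        ("TAE", "2008-04-04 16:50:00", None),  # the gap to be filled
        ("TAE", "2008-04-04 17:00:00", 7.8),
    ],
)

# COALESCE keeps the measured value and falls back to the mean of both neighbours.
filled = conn.execute(
    """
    SELECT cur.dtime, cur.temp AS orig,
           COALESCE(cur.temp, ROUND((prev.temp + nxt.temp) / 2, 1)) AS interpol
    FROM temps AS cur
    INNER JOIN temps AS prev
            ON prev.dtime = datetime(cur.dtime, '-10 minutes')
           AND prev.station = cur.station
    INNER JOIN temps AS nxt
            ON nxt.dtime = datetime(cur.dtime, '+10 minutes')
           AND nxt.station = cur.station
    WHERE cur.station = 'TAE'
    ORDER BY cur.dtime
    """
).fetchall()

for row in filled:
    print(row)
```

Because both joins are inner joins, a row is returned only when both its neighbours exist, and two adjacent gaps would leave the interpolated value NULL.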
Walking average
SELECT current.dtime, current.temp,
       ROUND(
         ( 3 * current.temp + 2 * prev1.temp + 1 * prev2.temp ) / 6, 1
       ) AS walking_avg
FROM temps current
INNER JOIN temps prev1
        ON prev1.dtime = current.dtime - INTERVAL 10 MINUTE
       AND prev1.station = current.station
INNER JOIN temps prev2
        ON prev2.dtime = current.dtime - INTERVAL 20 MINUTE
       AND prev2.station = current.station
WHERE current.station LIKE 'TAE'
ORDER BY current.dtime
Walking average
+---------------------+------+-------------+
| dtime               | temp | walking_avg |
+---------------------+------+-------------+
| …                   | …    | …           |
| 2008-04-04 09:10:00 |  5.7 |         5.7 |
| 2008-04-04 09:20:00 |  5.8 |         5.7 |
| 2008-04-04 09:30:00 |  6.3 |         6.0 |
| 2008-04-04 09:40:00 |  6.3 |         6.2 |
| 2008-04-04 09:50:00 |  6.0 |         6.2 |
| 2008-04-04 10:00:00 |  6.2 |         6.2 |
| 2008-04-04 10:10:00 |  6.6 |         6.4 |
| 2008-04-04 10:20:00 |  6.6 |         6.5 |
| 2008-04-04 10:30:00 |  6.6 |         6.6 |
| …                   | …    | …           |
+---------------------+------+-------------+
142 rows in set (0.01 sec)
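The weighting (3, 2, 1 over the current and the two previous measurements) can be reproduced with the first four rows of the result. As before, this is a sketch that assumes SQLite via Python's `sqlite3` instead of MySQL, with `datetime()` modifiers replacing `INTERVAL` and `cur` as a shortened alias.

```python
import sqlite3

# Assumption: in-memory SQLite instead of MySQL; four rows taken from the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temps (station TEXT, dtime TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO temps VALUES (?, ?, ?)",
    [
        ("TAE", "2008-04-04 09:10:00", 5.7),
        ("TAE", "2008-04-04 09:20:00", 5.8),
        ("TAE", "2008-04-04 09:30:00", 6.3),
        ("TAE", "2008-04-04 09:40:00", 6.3),
    ],
)

# Two self joins fetch the values measured 10 and 20 minutes earlier.
averages = conn.execute(
    """
    SELECT cur.dtime, cur.temp,
           ROUND((3 * cur.temp + 2 * prev1.temp + 1 * prev2.temp) / 6, 1)
             AS walking_avg
    FROM temps AS cur
    INNER JOIN temps AS prev1
            ON prev1.dtime = datetime(cur.dtime, '-10 minutes')
           AND prev1.station = cur.station
    INNER JOIN temps AS prev2
            ON prev2.dtime = datetime(cur.dtime, '-20 minutes')
           AND prev2.station = cur.station
    WHERE cur.station = 'TAE'
    ORDER BY cur.dtime
    """
).fetchall()

for row in averages:
    print(row)
```

Only the 09:30 and 09:40 rows have two predecessors in this tiny table; their averages match the 6.0 and 6.2 shown on the slide.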
Coherence („Correlation“)
SELECT source.station, correlated.station,
       STDDEV( source.temp - correlated.temp ) AS dev,
       AVG( source.temp - correlated.temp ) AS offset
FROM temps source
INNER JOIN temps correlated
        ON source.dtime = correlated.dtime
WHERE source.station = 'TAE'
GROUP BY source.station, correlated.station
ORDER BY dev
Coherence („Correlation“)
+---------+---------+---------+----------+
| station | station | dev     | offset   |
+---------+---------+---------+----------+
| TAE     | TAE     | 0.00000 |  0.00000 |
| TAE     | ABO     | 0.60563 |  5.43636 |
| TAE     | FRE     | 0.65031 |  4.14615 |
| …       | …       | …       | …        |
| TAE     | CHZ     | 1.05226 | -1.57063 |
| TAE     | CIM     | 1.07280 |  3.45035 |
| …       | …       | …       | …        |
| TAE     | SBO     | 3.58539 | -6.88811 |
+---------+---------+---------+----------+
87 rows in set (0.04 sec)
Groupwise maximum row (Subquery)
SELECT a.station, a.dtime, a.temp
FROM temps a
WHERE a.temp = (
        SELECT MAX(temp)
        FROM temps b
        WHERE b.station = a.station
      )
ORDER BY a.station, a.temp;
Groupwise maximum row
+---------+---------------------+-------+
| station | dtime               | temp  |
+---------+---------------------+-------+
| ABO     | 2008-04-04 14:40:00 |   4.7 |
| AIG     | 2008-04-04 13:40:00 |  13.0 |
| ALT     | 2008-04-04 12:20:00 |  10.9 |
| ALT     | 2008-04-04 12:30:00 |  10.9 |
| …       | …                   | …     |
| WYN     | 2008-04-04 14:30:00 |  10.5 |
| ZER     | 2008-04-04 14:10:00 |   5.3 |
+---------+---------------------+-------+
114 rows in set (4.03 sec)
Groupwise maximum row (Self Join)
SELECT maximum.station, maximum.dtime, maximum.temp
FROM temps maximum
LEFT JOIN temps higher
       ON maximum.station = higher.station
      AND maximum.temp < higher.temp
WHERE higher.station IS NULL
  AND maximum.temp IS NOT NULL
ORDER BY maximum.station, maximum.temp
Groupwise maximum row (Joined Subquery)
SELECT a.station, a.dtime, a.temp
FROM temps a
INNER JOIN (
        SELECT station, MAX(temp) AS temp
        FROM temps
        GROUP BY station
      ) b
      ON (a.station, a.temp) = (b.station, b.temp)
ORDER BY a.station, a.temp;
Groupwise maximum row (Alternatives)
CORRELATED SUBQUERY
| ZER     | 2008-04-04 14:10:00 |   5.3 |
+---------+---------------------+-------+
114 rows in set (4.03 sec)

SELF JOIN
| ZER     | 2008-04-04 14:10:00 |   5.3 |
+---------+---------------------+-------+
114 rows in set (1.43 sec)

JOINED SUBQUERY
| ZER     | 2008-04-04 14:10:00 |   5.3 |
+---------+---------------------+-------+
114 rows in set (0.05 sec)
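That the three formulations return the same rows can be checked on a small sample. The sketch below assumes SQLite via Python's `sqlite3` instead of MySQL, and it says nothing about the timings above. The ABO and ALT maxima come from the slide; the lower readings and the NULL reading are made up for illustration. The row-value comparison of the joined subquery is spelled out as two `AND`-ed conditions, and rows are ordered by `dtime` to make the comparison deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite instead of MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temps (station TEXT, dtime TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO temps VALUES (?, ?, ?)",
    [
        ("ABO", "2008-04-04 14:30:00", 4.5),   # made-up lower reading
        ("ABO", "2008-04-04 14:40:00", 4.7),
        ("ABO", "2008-04-04 14:50:00", None),  # made-up missing reading
        ("ALT", "2008-04-04 12:10:00", 10.2),  # made-up lower reading
        ("ALT", "2008-04-04 12:20:00", 10.9),
        ("ALT", "2008-04-04 12:30:00", 10.9),  # tie: both rows are maxima
    ],
)

QUERIES = {
    "correlated subquery": """
        SELECT a.station, a.dtime, a.temp
        FROM temps a
        WHERE a.temp = (SELECT MAX(temp) FROM temps b
                        WHERE b.station = a.station)
        ORDER BY a.station, a.dtime""",
    "self join": """
        SELECT maximum.station, maximum.dtime, maximum.temp
        FROM temps maximum
        LEFT JOIN temps higher
               ON maximum.station = higher.station
              AND maximum.temp < higher.temp
        WHERE higher.station IS NULL      -- no higher reading was found
          AND maximum.temp IS NOT NULL    -- NULL readings never find a match
        ORDER BY maximum.station, maximum.dtime""",
    "joined subquery": """
        SELECT a.station, a.dtime, a.temp
        FROM temps a
        INNER JOIN (SELECT station, MAX(temp) AS temp
                    FROM temps GROUP BY station) b
                ON a.station = b.station AND a.temp = b.temp
        ORDER BY a.station, a.dtime""",
}

results = {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
for name, rows in results.items():
    print(name, rows)
```

The NULL reading shows why the self join needs `maximum.temp IS NOT NULL`: a NULL temperature never satisfies `<`, so without that filter it would look like a row with nothing higher.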
Comment from a Blog Post
„I left joined a table with itself once, and a lightning fast query had a second or so delay. Two joins, and it was slow. Three joins, and it took upwards of 15 seconds. This kind of joining a table to itself repeatedly kills database performance.“ (John)
A few Words of Caution
• Rows to scan: n^m (rows^table_references)
• 15‘000 rows (example table) (0.003 s @ 5 million rows/s)
‣ joined once: n^2 = 225‘000‘000 rows (45 s)
‣ joined twice: n^3 = 3‘375‘000‘000‘000 rows (7.8 days)
‣ joined three times: n^4 = 50‘625‘000‘000‘000‘000 rows (321 years)
• Check your JOIN conditions and indexes
• Do some EXPLAINs
‣ „EXPLAIN demystified“(Baron Schwartz, today, 2:00pm, Ballroom D)
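The worst-case figures above follow directly from n^m. A small sketch of the arithmetic, taking the 15,000 rows and the 5 million rows per second from the slide and assuming a 365-day year; the function name `scan_estimate` is made up for this example.

```python
ROWS = 15_000            # rows in the example table (from the slide)
RATE = 5_000_000         # rows scanned per second (from the slide)
SECONDS_PER_DAY = 86_400


def scan_estimate(table_references):
    """Return (rows to scan, seconds) for an unrestricted join."""
    rows = ROWS ** table_references
    return rows, rows / RATE


for m in range(1, 5):
    rows, seconds = scan_estimate(m)
    days = seconds / SECONDS_PER_DAY
    print(f"{m} table reference(s): {rows:,} rows, "
          f"{seconds:,.3f} s = {days:,.2f} days = {days / 365:,.1f} years")
```

These numbers describe a join with no usable condition or index; a selective, indexed join condition is what keeps a real self join far away from them.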
A few Words of Caution: Many Joins
• Time spent executing query
• Before that: Time spent finding execution plan!
Sudoku
5 3 7
6 1 9 5
9 8 6
8 6 3
4 8 3 1
7 2 6
6 2 8
4 1 9 5
8 7 9
Sudoku: Fill every square with all digits 1–9
Sudoku: No digit repeated on column or row
Solve a Sudoku with one Query?
• SQL: One „solution“ equals „one row“
‣ There might be more than one solution
‣ Sudoku „spread out“ horizontally (one column per field)
• Table `digits` holding the „base material“: 1, 2, 3, 4, 5…
‣ Self Joins: One table reference for every field
‣ 9-by-9: 81 table references („80 Joins“)
‣ MySQL Limit: 61 Joins (31 back in MySQL 3.23)
How to Solve a Sudoku „Brute Force“
Starting grid (. = empty field):
. . 6 . . .
. . . 6 . 3
4 . . 1 . .
1 . . 3 . 6
. . 5 4 6 .
. . . . . 1

First row, step by step:
1 . 6 . . .
2 . 6 . . .
2 1 6 . . .
2 1 6 1 . .
2 1 6 2 . .
2 1 6 ? . .
2 1 6 5 . .
How to Solve a Sudoku „Brute Force“
• Try all 6 digits for a field
‣ Still no solution?
‣ Backtrack!
  Erase field
  Try something different in the previous field
  Sometimes this means „back to square one“
• So, a long, long time later…
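The procedure on these slides can be written down as a short recursive search. This is a sketch in Python rather than the speaker's code: the givens are the rows of the `start` table, the 2-by-3 box shape is inferred from the constraints in the query, and all names (`START`, `conflicts`, `solve`, `as_rows`) are made up for this example.

```python
# Givens as (row, column) -> digit, copied from the `start` table.
START = {
    (1, 3): 6, (2, 4): 6, (2, 6): 3, (3, 1): 4, (3, 4): 1, (4, 1): 1,
    (4, 4): 3, (4, 6): 6, (5, 3): 5, (5, 4): 4, (5, 5): 6, (6, 6): 1,
}
SIZE, BOX_ROWS, BOX_COLS = 6, 2, 3  # assumption: boxes are 2 rows by 3 columns


def conflicts(grid, i, j, d):
    """True if digit d already occurs in the row, column or box of (i, j)."""
    for (r, c), value in grid.items():
        if value != d:
            continue
        same_box = ((r - 1) // BOX_ROWS == (i - 1) // BOX_ROWS
                    and (c - 1) // BOX_COLS == (j - 1) // BOX_COLS)
        if r == i or c == j or same_box:
            return True
    return False


def solve(grid, fields):
    """Yield every solution, trying the digits 1..6 field by field."""
    if not fields:
        yield dict(grid)
        return
    (i, j), rest = fields[0], fields[1:]
    for d in range(1, SIZE + 1):
        if not conflicts(grid, i, j, d):
            grid[(i, j)] = d
            yield from solve(grid, rest)
            del grid[(i, j)]  # backtrack: erase the field, try the next digit


def as_rows(grid):
    return [[grid[(i, j)] for j in range(1, SIZE + 1)] for i in range(1, SIZE + 1)]


empty = [(i, j) for i in range(1, SIZE + 1) for j in range(1, SIZE + 1)
         if (i, j) not in START]
solutions = list(solve(dict(START), empty))
for row in as_rows(solutions[0]):
    print(*row)
```

Collecting all results instead of stopping at the first one mirrors the point that the search is not finished after the first hit, since there might be another solution.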
How to Solve a Sudoku „Brute Force“
5 3 6 2 1 4
2 4 1 6 5 3
4 6 3 1 2 5
1 5 2 3 4 6
3 1 5 4 6 2
6 2 4 5 . 1

Last open field, step by step:
6 2 4 5 1 1
6 2 4 5 2 1
6 2 4 5 3 1
How to Solve a Sudoku „Brute Force“
• We‘re not finished yet!
‣ There might be another solution…
‣ So, backtrack and try other possibilities…
Solving a Sudoku with one SELECT (1)
SELECT CONCAT(
  d11.d, ' ', d12.d, ' ', d13.d, ' ', d14.d, ' ', d15.d, ' ', d16.d, ' ', CHAR(10),
  d21.d, ' ', d22.d, ' ', d23.d, ' ', d24.d, ' ', d25.d, ' ', d26.d, ' ', CHAR(10),
  d31.d, ' ', d32.d, ' ', d33.d, ' ', d34.d, ' ', d35.d, ' ', d36.d, ' ', CHAR(10),
  d41.d, ' ', d42.d, ' ', d43.d, ' ', d44.d, ' ', d45.d, ' ', d46.d, ' ', CHAR(10),
  d51.d, ' ', d52.d, ' ', d53.d, ' ', d54.d, ' ', d55.d, ' ', d56.d, ' ', CHAR(10),
  d61.d, ' ', d62.d, ' ', d63.d, ' ', d64.d, ' ', d65.d, ' ', d66.d, ' ', CHAR(10)
) AS solution
FROM digits d11
INNER JOIN digits d12
  ON COALESCE(d12.d = (SELECT d FROM start WHERE i = 1 AND j = 2), 1)
  AND d12.d != d11.d  -- row
INNER JOIN digits d13
  ON COALESCE(d13.d = (SELECT d FROM start WHERE i = 1 AND j = 3), 1)
  AND d13.d != d11.d AND d13.d != d12.d  -- row
INNER JOIN digits d14
  ON COALESCE(d14.d = (SELECT d FROM start WHERE i = 1 AND j = 4), 1)
  AND d14.d != d11.d AND d14.d != d12.d AND d14.d != d13.d  -- row
INNER JOIN digits d15
  ON COALESCE(d15.d = (SELECT d FROM start WHERE i = 1 AND j = 5), 1)
  AND d15.d != d11.d AND d15.d != d12.d AND d15.d != d13.d AND d15.d != d14.d  -- row
INNER JOIN digits d16
  ON COALESCE(d16.d = (SELECT d FROM start WHERE i = 1 AND j = 6), 1)
  AND d16.d != d11.d AND d16.d != d12.d AND d16.d != d13.d AND d16.d != d14.d AND d16.d != d15.d  -- row
INNER JOIN digits d21
  ON COALESCE(d21.d = (SELECT d FROM start WHERE i = 2 AND j = 1), 1)
  AND d21.d != d11.d  -- column
  AND d21.d != d12.d AND d21.d != d13.d  -- box
INNER JOIN digits d22
  ON COALESCE(d22.d = (SELECT d FROM start WHERE i = 2 AND j = 2), 1)
  AND d22.d != d21.d  -- row
  AND d22.d != d12.d  -- column
  AND d22.d != d11.d AND d22.d != d13.d  -- box
INNER JOIN digits d23
  ON COALESCE(d23.d = (SELECT d FROM start WHERE i = 2 AND j = 3), 1)
  AND d23.d != d21.d AND d23.d != d22.d  -- row
  AND d23.d != d13.d  -- column
  AND d23.d != d11.d AND d23.d != d12.d  -- box
Solving a Sudoku with one SELECT (2)
INNER JOIN digits d24
  ON COALESCE(d24.d = (SELECT d FROM start WHERE i = 2 AND j = 4), 1)
  AND d24.d != d21.d AND d24.d != d22.d AND d24.d != d23.d  -- row
  AND d24.d != d14.d  -- column
  AND d24.d != d15.d AND d24.d != d16.d  -- box
INNER JOIN digits d25
  ON COALESCE(d25.d = (SELECT d FROM start WHERE i = 2 AND j = 5), 1)
  AND d25.d != d21.d AND d25.d != d22.d AND d25.d != d23.d AND d25.d != d24.d  -- row
  AND d25.d != d15.d  -- column
  AND d25.d != d14.d AND d25.d != d16.d  -- box
INNER JOIN digits d26
  ON COALESCE(d26.d = (SELECT d FROM start WHERE i = 2 AND j = 6), 1)
  AND d26.d != d21.d AND d26.d != d22.d AND d26.d != d23.d AND d26.d != d24.d AND d26.d != d25.d  -- row
  AND d26.d != d16.d  -- column
  AND d26.d != d14.d AND d26.d != d15.d  -- box
INNER JOIN digits d31
  ON COALESCE(d31.d = (SELECT d FROM start WHERE i = 3 AND j = 1), 1)
  AND d31.d != d11.d AND d31.d != d21.d  -- column
INNER JOIN digits d32
  ON COALESCE(d32.d = (SELECT d FROM start WHERE i = 3 AND j = 2), 1)
  AND d32.d != d31.d  -- row
  AND d32.d != d12.d AND d32.d != d22.d  -- column
INNER JOIN digits d33
  ON COALESCE(d33.d = (SELECT d FROM start WHERE i = 3 AND j = 3), 1)
  AND d33.d != d31.d AND d33.d != d32.d  -- row
  AND d33.d != d13.d AND d33.d != d23.d  -- column
INNER JOIN digits d34
  ON COALESCE(d34.d = (SELECT d FROM start WHERE i = 3 AND j = 4), 1)
  AND d34.d != d31.d AND d34.d != d32.d AND d34.d != d33.d  -- row
  AND d34.d != d14.d AND d34.d != d24.d  -- column
INNER JOIN digits d35
  ON COALESCE(d35.d = (SELECT d FROM start WHERE i = 3 AND j = 5), 1)
  AND d35.d != d31.d AND d35.d != d32.d AND d35.d != d33.d AND d35.d != d34.d  -- row
  AND d35.d != d15.d AND d35.d != d25.d  -- column
INNER JOIN digits d36
  ON COALESCE(d36.d = (SELECT d FROM start WHERE i = 3 AND j = 6), 1)
  AND d36.d != d31.d AND d36.d != d32.d AND d36.d != d33.d AND d36.d != d34.d AND d36.d != d35.d  -- row
  AND d36.d != d16.d AND d36.d != d26.d  -- column
Solving a Sudoku with one SELECT (3)
INNER JOIN digits d41
  ON COALESCE(d41.d = (SELECT d FROM start WHERE i = 4 AND j = 1), 1)
  AND d41.d != d11.d AND d41.d != d21.d AND d41.d != d31.d  -- column
  AND d41.d != d32.d AND d41.d != d33.d  -- box
INNER JOIN digits d42
  ON COALESCE(d42.d = (SELECT d FROM start WHERE i = 4 AND j = 2), 1)
  AND d42.d != d41.d  -- row
  AND d42.d != d12.d AND d42.d != d22.d AND d42.d != d32.d  -- column
  AND d42.d != d31.d AND d42.d != d33.d  -- box
INNER JOIN digits d43
  ON COALESCE(d43.d = (SELECT d FROM start WHERE i = 4 AND j = 3), 1)
  AND d43.d != d41.d AND d43.d != d42.d  -- row
  AND d43.d != d13.d AND d43.d != d23.d AND d43.d != d33.d  -- column
  AND d43.d != d31.d AND d43.d != d32.d  -- box
INNER JOIN digits d44
  ON COALESCE(d44.d = (SELECT d FROM start WHERE i = 4 AND j = 4), 1)
  AND d44.d != d41.d AND d44.d != d42.d AND d44.d != d43.d  -- row
  AND d44.d != d14.d AND d44.d != d24.d AND d44.d != d34.d  -- column
  AND d44.d != d35.d AND d44.d != d36.d  -- box
INNER JOIN digits d45
  ON COALESCE(d45.d = (SELECT d FROM start WHERE i = 4 AND j = 5), 1)
  AND d45.d != d41.d AND d45.d != d42.d AND d45.d != d43.d AND d45.d != d44.d  -- row
  AND d45.d != d15.d AND d45.d != d25.d AND d45.d != d35.d  -- column
  AND d45.d != d34.d AND d45.d != d36.d  -- box
INNER JOIN digits d46
  ON COALESCE(d46.d = (SELECT d FROM start WHERE i = 4 AND j = 6), 1)
  AND d46.d != d41.d AND d46.d != d42.d AND d46.d != d43.d AND d46.d != d44.d AND d46.d != d45.d  -- row
  AND d46.d != d16.d AND d46.d != d26.d AND d46.d != d36.d  -- column
  AND d46.d != d34.d AND d46.d != d35.d  -- box
INNER JOIN digits d51
  ON COALESCE(d51.d = (SELECT d FROM start WHERE i = 5 AND j = 1), 1)
  AND d51.d != d11.d AND d51.d != d21.d AND d51.d != d31.d AND d51.d != d41.d  -- column
INNER JOIN digits d52
  ON COALESCE(d52.d = (SELECT d FROM start WHERE i = 5 AND j = 2), 1)
  AND d52.d != d51.d  -- row
  AND d52.d != d12.d AND d52.d != d22.d AND d52.d != d32.d AND d52.d != d42.d  -- column
Solving a Sudoku with one SELECT (4)
INNER JOIN digits d53
  ON COALESCE(d53.d = (SELECT d FROM start WHERE i = 5 AND j = 3), 1)
  AND d53.d != d51.d AND d53.d != d52.d  -- row
  AND d53.d != d13.d AND d53.d != d23.d AND d53.d != d33.d AND d53.d != d43.d  -- column
INNER JOIN digits d54
  ON COALESCE(d54.d = (SELECT d FROM start WHERE i = 5 AND j = 4), 1)
  AND d54.d != d51.d AND d54.d != d52.d AND d54.d != d53.d  -- row
  AND d54.d != d14.d AND d54.d != d24.d AND d54.d != d34.d AND d54.d != d44.d  -- column
INNER JOIN digits d55
  ON COALESCE(d55.d = (SELECT d FROM start WHERE i = 5 AND j = 5), 1)
  AND d55.d != d51.d AND d55.d != d52.d AND d55.d != d53.d AND d55.d != d54.d  -- row
  AND d55.d != d15.d AND d55.d != d25.d AND d55.d != d35.d AND d55.d != d45.d  -- column
INNER JOIN digits d56
  ON COALESCE(d56.d = (SELECT d FROM start WHERE i = 5 AND j = 6), 1)
  AND d56.d != d51.d AND d56.d != d52.d AND d56.d != d53.d AND d56.d != d54.d AND d56.d != d55.d  -- row
  AND d56.d != d16.d AND d56.d != d26.d AND d56.d != d36.d AND d56.d != d46.d  -- column
INNER JOIN digits d61
  ON COALESCE(d61.d = (SELECT d FROM start WHERE i = 6 AND j = 1), 1)
  AND d61.d != d11.d AND d61.d != d21.d AND d61.d != d31.d AND d61.d != d41.d AND d61.d != d51.d  -- column
  AND d61.d != d52.d AND d61.d != d53.d  -- box
INNER JOIN digits d62
  ON COALESCE(d62.d = (SELECT d FROM start WHERE i = 6 AND j = 2), 1)
  AND d62.d != d61.d  -- row
  AND d62.d != d12.d AND d62.d != d22.d AND d62.d != d32.d AND d62.d != d42.d AND d62.d != d52.d  -- column
  AND d62.d != d51.d AND d62.d != d53.d  -- box
INNER JOIN digits d63
  ON COALESCE(d63.d = (SELECT d FROM start WHERE i = 6 AND j = 3), 1)
  AND d63.d != d61.d AND d63.d != d62.d  -- row
  AND d63.d != d13.d AND d63.d != d23.d AND d63.d != d33.d AND d63.d != d43.d AND d63.d != d53.d  -- column
  AND d63.d != d51.d AND d63.d != d52.d  -- box
INNER JOIN digits d64
  ON COALESCE(d64.d = (SELECT d FROM start WHERE i = 6 AND j = 4), 1)
  AND d64.d != d61.d AND d64.d != d62.d AND d64.d != d63.d  -- row
  AND d64.d != d14.d AND d64.d != d24.d AND d64.d != d34.d AND d64.d != d44.d AND d64.d != d54.d  -- column
  AND d64.d != d55.d AND d64.d != d56.d  -- box
Solving a Sudoku with one SELECT (5)
INNER JOIN digits d65
  ON COALESCE(d65.d = (SELECT d FROM start WHERE i = 6 AND j = 5), 1)
  AND d65.d != d61.d AND d65.d != d62.d AND d65.d != d63.d AND d65.d != d64.d  -- row
  AND d65.d != d15.d AND d65.d != d25.d AND d65.d != d35.d AND d65.d != d45.d AND d65.d != d55.d  -- column
  AND d65.d != d54.d AND d65.d != d56.d  -- box
INNER JOIN digits d66
  ON COALESCE(d66.d = (SELECT d FROM start WHERE i = 6 AND j = 6), 1)
  AND d66.d != d61.d AND d66.d != d62.d AND d66.d != d63.d AND d66.d != d64.d AND d66.d != d65.d  -- row
  AND d66.d != d16.d AND d66.d != d26.d AND d66.d != d36.d AND d66.d != d46.d AND d66.d != d56.d  -- column
  AND d66.d != d54.d AND d66.d != d55.d  -- box
WHERE COALESCE(d11.d = (SELECT d FROM start WHERE i = 1 AND j = 1), 1)
Table `digits` for the „pool“ of digits
+---+
| d |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+---+
Table `start` for initial conditions
+---+---+------+
| i | j | d    |
+---+---+------+
| 1 | 3 |    6 |
| 2 | 4 |    6 |
| 2 | 6 |    3 |
| 3 | 1 |    4 |
| 3 | 4 |    1 |
| 4 | 1 |    1 |
| 4 | 4 |    3 |
| 4 | 6 |    6 |
| 5 | 3 |    5 |
| 5 | 4 |    4 |
| 5 | 5 |    6 |
| 6 | 6 |    1 |
+---+---+------+

. . 6 . . .
. . . 6 . 3
4 . . 1 . .
1 . . 3 . 6
. . 5 4 6 .
. . . . . 1
How the query works: First field, Second field, Third field
…
FROM digits d11
INNER JOIN digits d12
  ON COALESCE( d12.d = ( SELECT d FROM start WHERE i = 1 AND j = 2 ), 1 )
  AND d12.d != d11.d
INNER JOIN digits d13
  ON COALESCE( d13.d = ( SELECT d FROM start WHERE i = 1 AND j = 3 ), 1 )
  AND d13.d != d11.d AND d13.d != d12.d
…
How the query works: Last field
…
INNER JOIN digits d66
  ON COALESCE( … )
  AND d66.d != d61.d AND d66.d != d62.d AND d66.d != d63.d AND d66.d != d64.d AND d66.d != d65.d
  AND d66.d != d16.d AND d66.d != d26.d AND d66.d != d36.d AND d66.d != d46.d AND d66.d != d56.d
  AND d66.d != d54.d AND d66.d != d55.d
…
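The `COALESCE(… = (subquery), 1)` pattern deserves a closer look: when `start` has no row for a field, the scalar subquery yields NULL, the comparison is NULL, and `COALESCE` turns it into 1 (true), so any digit is allowed; when a given exists, only that digit passes. The sketch below demonstrates this for the first row only. It assumes SQLite via Python's `sqlite3` instead of MySQL, generates the SQL text with a made-up helper `given()`, and leaves out the column and box constraints, which involve other rows.

```python
import sqlite3

# Assumption: in-memory SQLite instead of MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE digits (d INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO digits VALUES (?)", [(d,) for d in range(1, 7)])
conn.execute("CREATE TABLE start (i INTEGER, j INTEGER, d INTEGER, PRIMARY KEY (i, j))")
conn.execute("INSERT INTO start VALUES (1, 3, 6)")  # the only given in row 1


def given(field, i, j):
    """SQL condition: field matches the given digit, or is free if none exists."""
    return (f"COALESCE({field}.d = "
            f"(SELECT d FROM start WHERE i = {i} AND j = {j}), 1)")


fields = [f"d1{j}" for j in range(1, 7)]  # d11 .. d16, one alias per field
sql = ("SELECT " + ", ".join(f"{f}.d" for f in fields)
       + f" FROM digits {fields[0]}")
for j in range(2, 7):
    field = fields[j - 1]
    # Given digit first, then "differs from every earlier field in the row".
    conditions = [given(field, 1, j)]
    conditions += [f"{field}.d != {prev}.d" for prev in fields[:j - 1]]
    sql += f" INNER JOIN digits {field} ON " + " AND ".join(conditions)
sql += " WHERE " + given(fields[0], 1, 1)

first_rows = conn.execute(sql).fetchall()
print(len(first_rows), "candidate first rows, e.g.", first_rows[0])
```

With the third field pinned to 6, the remaining five fields can hold any arrangement of the digits 1 to 5, which gives 5! = 120 candidate first rows.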
Conclusions from the „Sudoku-Case“
• Declarative Paradigm (Constraint Programming)
‣ Don‘t care about the „how“, but about the „what“
‣ Optimizer does a great job!
• (Ab-)use built-in Backtracking of Join Engine
• A query might look awkward – but still performs!
Some reasons for reasonable performance…
• Very small table (`digits`) and covering index
• Small result set: Always working on one row!
• Subqueries basically optimized away
‣ „Impossible WHERE noticed“ (no pre-condition case)
‣ Constant (pre-condition case)
• Optimizer/Join Engine is good at this stuff!
+----+-------------+-------+-------+---------------+---------+---------+-------------+------+-----------------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref         | rows | Extra                                               |
+----+-------------+-------+-------+---------------+---------+---------+-------------+------+-----------------------------------------------------+
| 1  | PRIMARY     | d11   | index | NULL          | PRIMARY | 1       | NULL        | 6    | Using where; Using index                            |
| 1  | PRIMARY     | d12   | index | NULL          | PRIMARY | 1       | NULL        | 6    | Using where; Using index                            |
| …  | …           | …     | …     | …             | …       | …       | …           | …    | …                                                   |
| 37 | SUBQUERY    | NULL  | NULL  | NULL          | NULL    | NULL    | NULL        | NULL | Impossible WHERE noticed after reading const tables |
| 36 | SUBQUERY    | start | const | PRIMARY       | PRIMARY | 2       | const,const | 1    |                                                     |
| …  | …           | …     | …     | …             | …       | …       | …           | …    | …                                                   |
+----+-------------+-------+-------+---------------+---------+---------+-------------+------+-----------------------------------------------------+
72 rows in set (0.01 sec)
Final Message
• Have fun with the declarative power of SQL!
‣ Despite its flaws…
• Do it the SQL way!
• Slides and code will be made available on conference website
• Check out the Developer Zone on the MySQL website for an upcoming article version of my session from last year, „The declarative power of VIEWs“
This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/
or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.