The Lost Art of the Self Join Beat Vontobel, CTO, MeteoNews AG [email protected]

Post on 12-Nov-2014


Page 1: The Lost Art of the Self Join

The Lost Art of the Self Join
Beat Vontobel, CTO, MeteoNews AG
[email protected]

Page 2: The Lost Art of the Self Join

Why and what?

• Idea for session dates back to 2005

‣ Sudoku solver in a Stored Procedure (Per-Erik Martin)

‣ „The lost Art of the Join“ (Erik Bergen)

‣ Self Joins in my last year‘s presentation „The declarative power of VIEWs“

• A few serious, but simpler examples of Self Joins

• One to be taken less seriously, but more complex

Page 3: The Lost Art of the Self Join

From last year: Paradigms

• Imperative Programming

‣ PHP, C, Java…

‣ Specify the Algorithm: How?

• Declarative Programming

‣ Prolog, Lisp, XSLT, SQL…

‣ Specify the Goal: What?

Page 4: The Lost Art of the Self Join

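Every Table needs an Alias

SELECT child.child AS child, sibling.child AS sibling
FROM parents [AS] child
INNER JOIN parents [AS] sibling
ON child.parent = sibling.parent
WHERE child.child != sibling.child;

[Diagram: family tree. Martha is the parent of Paul and Chris; Chris is the parent of Julie.]

Table parents:

+--------+-------+
| parent | child |
+--------+-------+
| martha | paul  |
| chris  | julie |
| martha | chris |
+--------+-------+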

Page 5: The Lost Art of the Self Join

A simple Self Join

SELECT child.child AS child, sibling.child AS sibling
FROM parents [AS] child
INNER JOIN parents [AS] sibling
ON child.parent = sibling.parent
WHERE child.child != sibling.child;

+-------+---------+
| child | sibling |
+-------+---------+
| Paul  | Chris   |
| Chris | Paul    |
+-------+---------+
2 rows in set (0.00 sec)

[Diagram: the family tree (Martha is the parent of Paul and Chris; Chris is the parent of Julie) next to two copies of the parents table, one listed as (child, parent) and one as (parent, child).]
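The sibling query can be tried without a MySQL server. The following is a minimal sketch that assumes Python's standard-library sqlite3 module as a stand-in engine (not part of the original slides); the table and data are the ones from the slide, and the optional AS is written out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parents (parent TEXT, child TEXT)")
conn.executemany(
    "INSERT INTO parents VALUES (?, ?)",
    [("martha", "paul"), ("chris", "julie"), ("martha", "chris")],
)

# The same table is referenced twice; the aliases keep the two roles apart.
siblings = conn.execute(
    """
    SELECT child.child AS child, sibling.child AS sibling
    FROM parents AS child
    INNER JOIN parents AS sibling
      ON child.parent = sibling.parent
    WHERE child.child != sibling.child
    ORDER BY child.child
    """
).fetchall()

for child, sibling in siblings:
    print(child, "is a sibling of", sibling)
```

Every sibling pair shows up twice, once in each direction, because both aliases range over all rows of the table.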

Page 6: The Lost Art of the Self Join

Trees in SQL

• Basic Text Book Example: Employees Table

• „Nested Set Model“

‣ Google for „Trees SQL Mike Hillyer“

Page 7: The Lost Art of the Self Join

Restriction on Self Joins: Temporary Tables

mysql> CREATE TEMPORARY TABLE t (i INT);
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM t t1 CROSS JOIN t t2;
ERROR 1137 (HY000): Can't reopen table: 't1'

Workaround: Create global tables with unique names (e.g. using session ID)

mysql> CREATE TABLE t_89372 (i INT);

Page 8: The Lost Art of the Self Join

Example table: Temperatures

temps

station CHAR(3) PK
dtime TIMESTAMP
temp DECIMAL(3, 1)

mysql1.intern-test [admin] > SELECT * FROM temps;
+---------+---------------------+------+
| station | dtime               | temp |
+---------+---------------------+------+
| ABO     | 2008-04-04 00:10:00 | -2.0 |
| ABO     | 2008-04-04 00:20:00 | -1.9 |
| …       | …                   |    … |
| BAS     | 2008-04-04 00:10:00 |  6.1 |
| BAS     | 2008-04-04 00:20:00 |  6.2 |
| …       | …                   |    … |
+---------+---------------------+------+

Page 9: The Lost Art of the Self Join

Absolute to relative

SELECT current.station AS stat,
  current.dtime,
  previous.temp AS prev,
  current.temp AS curr,
  current.temp - previous.temp AS diff
FROM temps current
INNER JOIN temps previous
ON current.station = previous.station
AND previous.dtime = current.dtime - INTERVAL 10 MINUTE
ORDER BY diff DESC
LIMIT 10

Page 10: The Lost Art of the Self Join

Absolute to relative

+------+---------------------+-------+-------+------+
| stat | dtime               | prev  | curr  | diff |
+------+---------------------+-------+-------+------+
| SAM  | 2008-04-04 08:20:00 |  -4.8 |  -1.1 |  3.7 |
| MAG  | 2008-04-04 01:10:00 |   7.6 |  10.7 |  3.1 |
| BUF  | 2008-04-04 22:10:00 | -13.1 | -10.2 |  2.9 |
| MAG  | 2008-04-04 05:00:00 |   7.3 |  10.1 |  2.8 |
| CIM  | 2008-04-04 10:00:00 |   1.8 |   4.6 |  2.8 |
| MAG  | 2008-04-04 00:20:00 |   6.0 |   8.4 |  2.4 |
| CHZ  | 2008-04-04 09:40:00 |   7.8 |  10.2 |  2.4 |
| MAG  | 2008-04-04 04:20:00 |   6.3 |   8.7 |  2.4 |
| EGH  | 2008-04-04 12:10:00 |  -8.5 |  -6.2 |  2.3 |
| VIS  | 2008-04-04 05:40:00 |  -1.8 |   0.5 |  2.3 |
+------+---------------------+-------+-------+------+
10 rows in set (0.21 sec)
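A small runnable sketch of the same "difference to the previous reading" join, assuming Python's sqlite3 module instead of MySQL. Two things differ from the slide and are assumptions of this sketch: SQLite has no INTERVAL syntax, so datetime(..., '-10 minutes') is used, and the last BAS reading is invented to get a visible jump.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temps (station TEXT, dtime TEXT, temp REAL, "
    "PRIMARY KEY (station, dtime))"
)
conn.executemany(
    "INSERT INTO temps VALUES (?, ?, ?)",
    [
        ("ABO", "2008-04-04 00:10:00", -2.0),
        ("ABO", "2008-04-04 00:20:00", -1.9),
        ("BAS", "2008-04-04 00:10:00", 6.1),
        ("BAS", "2008-04-04 00:20:00", 6.2),
        ("BAS", "2008-04-04 00:30:00", 6.6),  # hypothetical extra reading
    ],
)

# Join every reading to the reading of the same station ten minutes earlier.
diffs = conn.execute(
    """
    SELECT cur.station AS stat, cur.dtime,
           previous.temp AS prev, cur.temp AS curr,
           ROUND(cur.temp - previous.temp, 1) AS diff
    FROM temps AS cur
    INNER JOIN temps AS previous
      ON cur.station = previous.station
     AND previous.dtime = datetime(cur.dtime, '-10 minutes')
    ORDER BY diff DESC
    LIMIT 10
    """
).fetchall()

for row in diffs:
    print(row)
```

The first reading of each station has no predecessor, so the inner join drops it from the result.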

Page 11: The Lost Art of the Self Join

Missed opportunity for a Self Join

// Typical example of „keeping state“ over
// a loop fetching rows from the database

while($row = mysql_fetch_row($result)) {
    // Computation involving $oldrow and $row
    …
    $oldrow = $row;
}

Page 12: The Lost Art of the Self Join

Why not in a loop?

• SQL code is clearer

• Logical dependency between SQL and application level

• But: „I have to loop anyway! Might even be faster in some cases…“

‣ You have to order by what you need for computation

‣ Different order requested for the result?

‣ You might miss the opportunity to use a framework

Page 13: The Lost Art of the Self Join

Use for any kind of serial data…

• Meteorological data

• Racing lap times

• Fuel used

• Bank account figures

• Stock values

• Webserver hit statistics

• Mails processed

• …

Page 14: The Lost Art of the Self Join

Fill the gaps (simple linear interpolation)

SELECT current.dtime,
  current.temp AS orig,
  COALESCE(
    current.temp,
    ROUND((prev.temp + next.temp) / 2, 1)
  ) AS interpol
FROM temps current
INNER JOIN temps prev
ON prev.dtime = current.dtime - INTERVAL 10 MINUTE
AND prev.station = current.station
INNER JOIN temps next
ON next.dtime = current.dtime + INTERVAL 10 MINUTE
AND next.station = current.station
WHERE current.station LIKE 'TAE'

Page 15: The Lost Art of the Self Join

Fill the gaps (simple linear interpolation)

+---------------------+------+----------+
| dtime               | orig | interpol |
+---------------------+------+----------+
| …                   |    … |        … |
| 2008-04-04 16:30:00 |  7.9 |      7.9 |
| 2008-04-04 16:40:00 |  8.0 |      8.0 |
| 2008-04-04 16:50:00 | NULL |      7.9 |
| 2008-04-04 17:00:00 |  7.8 |      7.8 |
| …                   |    … |        … |
+---------------------+------+----------+
142 rows in set (0.01 sec)
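The gap-filling join can be reproduced on the four readings shown above. This is a sketch under assumptions: it uses Python's sqlite3 module rather than MySQL, replaces INTERVAL arithmetic with SQLite's datetime() modifiers, assumes the readings belong to station TAE, and shortens the aliases to cur, prev and nxt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temps (station TEXT, dtime TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO temps VALUES (?, ?, ?)",
    [
        ("TAE", "2008-04-04 16:30:00", 7.9),
        ("TAE", "2008-04-04 16:40:00", 8.0),
        ("TAE", "2008-04-04 16:50:00", None),  # the gap to fill
        ("TAE", "2008-04-04 17:00:00", 7.8),
    ],
)

# Three references to the same table: the reading itself and both neighbours.
filled = conn.execute(
    """
    SELECT cur.dtime, cur.temp AS orig,
           COALESCE(cur.temp, ROUND((prev.temp + nxt.temp) / 2, 1)) AS interpol
    FROM temps AS cur
    INNER JOIN temps AS prev
      ON prev.dtime = datetime(cur.dtime, '-10 minutes')
     AND prev.station = cur.station
    INNER JOIN temps AS nxt
      ON nxt.dtime = datetime(cur.dtime, '+10 minutes')
     AND nxt.station = cur.station
    WHERE cur.station = 'TAE'
    ORDER BY cur.dtime
    """
).fetchall()

for row in filled:
    print(row)
```

Because both joins are inner joins, the very first and the very last reading of a station have no neighbour on one side and do not appear in the output.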

Page 16: The Lost Art of the Self Join

Walking average

SELECT current.dtime,
  current.temp,
  ROUND(
    ( 3 * current.temp + 2 * prev1.temp + 1 * prev2.temp ) / 6,
    1
  ) AS walking_avg
FROM temps current
INNER JOIN temps prev1
ON prev1.dtime = current.dtime - INTERVAL 10 MINUTE
AND prev1.station = current.station
INNER JOIN temps prev2
ON prev2.dtime = current.dtime - INTERVAL 20 MINUTE
AND prev2.station = current.station
WHERE current.station LIKE 'TAE'
ORDER BY current.dtime

Page 17: The Lost Art of the Self Join

Walking average

+---------------------+------+-------------+
| dtime               | temp | walking_avg |
+---------------------+------+-------------+
| …                   |    … |           … |
| 2008-04-04 09:10:00 |  5.7 |         5.7 |
| 2008-04-04 09:20:00 |  5.8 |         5.7 |
| 2008-04-04 09:30:00 |  6.3 |         6.0 |
| 2008-04-04 09:40:00 |  6.3 |         6.2 |
| 2008-04-04 09:50:00 |  6.0 |         6.2 |
| 2008-04-04 10:00:00 |  6.2 |         6.2 |
| 2008-04-04 10:10:00 |  6.6 |         6.4 |
| 2008-04-04 10:20:00 |  6.6 |         6.5 |
| 2008-04-04 10:30:00 |  6.6 |         6.6 |
| …                   |    … |           … |
+---------------------+------+-------------+
142 rows in set (0.01 sec)
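The weighting (3 for the current reading, 2 and 1 for the two before it, divided by 6) can be checked against the first rows of the output above. A sketch, again assuming Python's sqlite3 module, SQLite's datetime() in place of INTERVAL, and station TAE for the sample readings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temps (station TEXT, dtime TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO temps VALUES (?, ?, ?)",
    [
        ("TAE", "2008-04-04 09:10:00", 5.7),
        ("TAE", "2008-04-04 09:20:00", 5.8),
        ("TAE", "2008-04-04 09:30:00", 6.3),
        ("TAE", "2008-04-04 09:40:00", 6.3),
    ],
)

# Weighted average over the current reading and its two predecessors.
averages = conn.execute(
    """
    SELECT cur.dtime, cur.temp,
           ROUND((3 * cur.temp + 2 * prev1.temp + 1 * prev2.temp) / 6, 1)
             AS walking_avg
    FROM temps AS cur
    INNER JOIN temps AS prev1
      ON prev1.dtime = datetime(cur.dtime, '-10 minutes')
     AND prev1.station = cur.station
    INNER JOIN temps AS prev2
      ON prev2.dtime = datetime(cur.dtime, '-20 minutes')
     AND prev2.station = cur.station
    WHERE cur.station = 'TAE'
    ORDER BY cur.dtime
    """
).fetchall()

for row in averages:
    print(row)
```

For 09:30 the sum is 3 * 6.3 + 2 * 5.8 + 5.7 = 36.2, and 36.2 / 6 rounds to 6.0, which matches the slide.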

Page 18: The Lost Art of the Self Join

Coherence/„Correlation“

SELECT source.station,
  correlated.station,
  STDDEV( source.temp - correlated.temp ) AS dev,
  AVG( source.temp - correlated.temp ) AS offset
FROM temps source
INNER JOIN temps correlated
ON source.dtime = correlated.dtime
WHERE source.station = 'TAE'
GROUP BY source.station, correlated.station
ORDER BY dev

Page 19: The Lost Art of the Self Join

Coherence („Correlation“)

+---------+---------+---------+----------+
| station | station | dev     | offset   |
+---------+---------+---------+----------+
| TAE     | TAE     | 0.00000 |  0.00000 |
| TAE     | ABO     | 0.60563 |  5.43636 |
| TAE     | FRE     | 0.65031 |  4.14615 |
| …       | …       |       … |        … |
| TAE     | CHZ     | 1.05226 | -1.57063 |
| TAE     | CIM     | 1.07280 |  3.45035 |
| …       | …       |       … |        … |
| TAE     | SBO     | 3.58539 | -6.88811 |
+---------+---------+---------+----------+
87 rows in set (0.04 sec)
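A sketch of the coherence query on made-up data, assuming Python's sqlite3 module. SQLite has no built-in STDDEV, so the sketch registers a small aggregate (the hypothetical class PopulationStdDev) that computes the population standard deviation, which is what MySQL's STDDEV() returns. All readings are invented for illustration, and the alias offset is renamed avg_offset to stay clear of the OFFSET keyword.

```python
import sqlite3
import statistics


class PopulationStdDev:
    """Aggregate standing in for MySQL's STDDEV() (population std. dev.)."""

    def __init__(self):
        self.values = []

    def step(self, value):
        if value is not None:
            self.values.append(value)

    def finalize(self):
        return statistics.pstdev(self.values) if self.values else None


conn = sqlite3.connect(":memory:")
conn.create_aggregate("stddev", 1, PopulationStdDev)
conn.execute("CREATE TABLE temps (station TEXT, dtime TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO temps VALUES (?, ?, ?)",
    [  # hypothetical readings
        ("TAE", "2008-04-04 00:10:00", 5.0),
        ("TAE", "2008-04-04 00:20:00", 6.0),
        ("TAE", "2008-04-04 00:30:00", 7.0),
        ("ABO", "2008-04-04 00:10:00", 0.0),
        ("ABO", "2008-04-04 00:20:00", 0.5),
        ("ABO", "2008-04-04 00:30:00", 2.0),
    ],
)

# Pair every TAE reading with the readings of all stations at the same time.
coherence = conn.execute(
    """
    SELECT source.station, correlated.station,
           stddev(source.temp - correlated.temp) AS dev,
           AVG(source.temp - correlated.temp) AS avg_offset
    FROM temps AS source
    INNER JOIN temps AS correlated
      ON source.dtime = correlated.dtime
    WHERE source.station = 'TAE'
    GROUP BY source.station, correlated.station
    ORDER BY dev
    """
).fetchall()

for row in coherence:
    print(row)
```

A small dev means the two stations move in step, even if avg_offset shows that one is consistently warmer than the other.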

Page 20: The Lost Art of the Self Join

Groupwise maximum row (Subquery)

SELECT a.station, a.dtime, a.temp
FROM temps a
WHERE a.temp = (
  SELECT MAX(temp)
  FROM temps b
  WHERE b.station = a.station
)
ORDER BY a.station, a.temp;

Page 21: The Lost Art of the Self Join

Groupwise maximum row

+---------+---------------------+-------+
| station | dtime               | temp  |
+---------+---------------------+-------+
| ABO     | 2008-04-04 14:40:00 |   4.7 |
| AIG     | 2008-04-04 13:40:00 |  13.0 |
| ALT     | 2008-04-04 12:20:00 |  10.9 |
| ALT     | 2008-04-04 12:30:00 |  10.9 |
| …       | …                   |     … |
| WYN     | 2008-04-04 14:30:00 |  10.5 |
| ZER     | 2008-04-04 14:10:00 |   5.3 |
+---------+---------------------+-------+
114 rows in set (4.03 sec)

Page 22: The Lost Art of the Self Join

Groupwise maximum row (Self Join)

SELECT maximum.station, maximum.dtime, maximum.temp
FROM temps maximum
LEFT JOIN temps higher
ON maximum.station = higher.station
AND maximum.temp < higher.temp
WHERE higher.station IS NULL
AND maximum.temp IS NOT NULL
ORDER BY maximum.station, maximum.temp

Page 23: The Lost Art of the Self Join

Groupwise maximum row (Joined Subquery)

SELECT a.station, a.dtime, a.temp
FROM temps a
INNER JOIN (
  SELECT station, MAX(temp) AS temp
  FROM temps
  GROUP BY station
) b
ON (a.station, a.temp) = (b.station, b.temp)
ORDER BY a.station, a.temp;

Page 24: The Lost Art of the Self Join

Groupwise maximum row (Alternatives)

CORRELATED SUBQUERY
| ZER     | 2008-04-04 14:10:00 |   5.3 |
+---------+---------------------+-------+
114 rows in set (4.03 sec)

SELF JOIN
| ZER     | 2008-04-04 14:10:00 |   5.3 |
+---------+---------------------+-------+
114 rows in set (1.43 sec)

JOINED SUBQUERY
| ZER     | 2008-04-04 14:10:00 |   5.3 |
+---------+---------------------+-------+
114 rows in set (0.05 sec)
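That the three formulations return the same rows can be checked on a tiny table. The sketch below assumes Python's sqlite3 module, so it says nothing about the MySQL timings above. The ABO and ALT maxima are taken from the slide, the lower readings are invented, and the row comparison (a.station, a.temp) = (b.station, b.temp) is written as two AND-ed equalities to stay portable. A secondary sort on dtime is added to make the row order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temps (station TEXT, dtime TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO temps VALUES (?, ?, ?)",
    [
        ("ABO", "2008-04-04 14:30:00", 4.1),   # hypothetical lower reading
        ("ABO", "2008-04-04 14:40:00", 4.7),
        ("ALT", "2008-04-04 12:10:00", 9.0),   # hypothetical lower reading
        ("ALT", "2008-04-04 12:20:00", 10.9),
        ("ALT", "2008-04-04 12:30:00", 10.9),  # tie: both maxima are returned
    ],
)

correlated_subquery = conn.execute(
    """
    SELECT a.station, a.dtime, a.temp
    FROM temps AS a
    WHERE a.temp = (SELECT MAX(temp) FROM temps AS b
                    WHERE b.station = a.station)
    ORDER BY a.station, a.dtime
    """
).fetchall()

# Keep only rows for which no higher reading of the same station exists.
self_join = conn.execute(
    """
    SELECT maximum.station, maximum.dtime, maximum.temp
    FROM temps AS maximum
    LEFT JOIN temps AS higher
      ON maximum.station = higher.station
     AND maximum.temp < higher.temp
    WHERE higher.station IS NULL
      AND maximum.temp IS NOT NULL
    ORDER BY maximum.station, maximum.dtime
    """
).fetchall()

joined_subquery = conn.execute(
    """
    SELECT a.station, a.dtime, a.temp
    FROM temps AS a
    INNER JOIN (SELECT station, MAX(temp) AS temp
                FROM temps GROUP BY station) AS b
      ON a.station = b.station AND a.temp = b.temp
    ORDER BY a.station, a.dtime
    """
).fetchall()

print(correlated_subquery == self_join == joined_subquery)
```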

Page 25: The Lost Art of the Self Join

Comment from a Blog Post

„I left joined a table with itself once, and a lightning fast query had a second or so delay. Two joins, and it was slow. Three joins, and it took upwards of 15 seconds. This kind of joining a table to itself repeatedly kills database performance.“ (John)

Page 26: The Lost Art of the Self Join

A few Words of Caution

• Rows to scan: n^m (rows^table_references)

• 15‘000 rows (example table) (0.003s @ 5 Mio. rows/s)

‣ joined once: n^2 = 225‘000‘000 rows (45s)

‣ joined twice: n^3 = 3‘375‘000‘000‘000 rows (7.8 days)

‣ joined three times: n^4 = 50‘625‘000‘000‘000‘000 rows (321 years)

• Check your JOIN conditions and indexes

• Do some EXPLAINs

‣ „EXPLAIN demystified“ (Baron Schwartz, today, 2:00pm, Ballroom D)
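The figures above follow directly from n^m and the assumed scan rate of 5 million rows per second. A short Python sketch of the arithmetic (the function name scan_estimate is made up for this example):

```python
ROWS = 15_000            # rows in the example table
SCAN_RATE = 5_000_000    # rows scanned per second, as assumed on the slide


def scan_estimate(table_references):
    """Return (rows to scan, seconds) for an unrestricted join."""
    rows_to_scan = ROWS ** table_references
    return rows_to_scan, rows_to_scan / SCAN_RATE


for refs in range(1, 5):
    rows_to_scan, seconds = scan_estimate(refs)
    print(f"{refs} table reference(s): {rows_to_scan:,} rows, "
          f"{seconds:,.3f} s = {seconds / 86_400:,.1f} days "
          f"= {seconds / 86_400 / 365:,.0f} years")
```

These are worst-case numbers for joins without any usable condition or index, which is why the join conditions and the EXPLAIN output deserve a look.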

Page 27: The Lost Art of the Self Join

A few Words of Caution: Many Joins

• Time spent executing query

• Before that: Time spent finding execution plan!

Page 28: The Lost Art of the Self Join

Sudoku

[Figure: partially filled 9-by-9 Sudoku grid. Given digits per row: 5 3 7 / 6 1 9 5 / 9 8 6 / 8 6 3 / 4 8 3 1 / 7 2 6 / 6 2 8 / 4 1 9 5 / 8 7 9]

Page 29: The Lost Art of the Self Join

Sudoku: Fill every square with all digits 1–9

[Figure: the same 9-by-9 Sudoku grid as on page 28]

Page 30: The Lost Art of the Self Join

Sudoku: No digit repeated on column or row

[Figure: the same 9-by-9 Sudoku grid as on page 28]

Page 31: The Lost Art of the Self Join

Solve a Sudoku with one Query?

• SQL: One „solution“ equals „one row“

‣ There might be more than one solution

‣ Sudoku „spread out“ horizontally (one column per field)

• Table `digits` holding the „base material“: 1, 2, 3, 4, 5…

‣ Self Joins: One table reference for every field

‣ 9-by-9: 81 table references („80 Joins“)

‣ MySQL Limit: 61 Joins (31 back in MySQL 3.23)

Page 32: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[6-by-6 Sudoku grid; . marks an empty field]

. . 6 . . .
. . . 6 . 3
4 . . 1 . .
1 . . 3 . 6
. . 5 4 6 .
. . . . . 1

Page 33: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[Grid as on page 32, with row 1 now: 1 . 6 . . .]

Page 34: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[Grid as on page 32, with row 1 still: 1 . 6 . . .]

Page 35: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[Grid as on page 32, with row 1 now: 2 . 6 . . .]

Page 36: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[Grid as on page 32, with row 1 now: 2 1 6 . . .]

Page 37: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[Grid as on page 32, with row 1 now: 2 1 6 1 . .]

Page 38: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[Grid as on page 32, with row 1 still: 2 1 6 1 . .]

Page 39: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[Grid as on page 32, with row 1 now: 2 1 6 2 . .]

Page 40: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[Grid as on page 32, with row 1 now: 2 1 6 ? . .]

Page 41: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

[Grid as on page 32, with row 1 now: 2 1 6 5 . .]

Page 42: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

• Try all 6 digits for a field

‣ Still no solution?

‣ Backtrack!
Erase field
Try something different in the previous field
Sometimes this means „back to square one“

• So, a long, long time later…
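The brute-force procedure described above (try each digit, erase and go back when stuck, keep going after a hit to find further solutions) can be sketched imperatively. This is a minimal Python sketch, not the presenter's code. The given digits are those of the 6-by-6 example in this talk, and the 2-row-by-3-column box shape is an assumption read off the constraints of the single-SELECT solver.

```python
SIZE, BOX_ROWS, BOX_COLS = 6, 2, 3

# (row, column) -> given digit, 1-based, for the 6-by-6 example puzzle.
GIVENS = {
    (1, 3): 6, (2, 4): 6, (2, 6): 3, (3, 1): 4, (3, 4): 1, (4, 1): 1,
    (4, 4): 3, (4, 6): 6, (5, 3): 5, (5, 4): 4, (5, 5): 6, (6, 6): 1,
}


def conflicts(grid, row, col, digit):
    """True if digit already occurs in the row, column or box of the field."""
    if any(grid[row][k] == digit or grid[k][col] == digit for k in range(SIZE)):
        return True
    top, left = row - row % BOX_ROWS, col - col % BOX_COLS
    return any(
        grid[r][c] == digit
        for r in range(top, top + BOX_ROWS)
        for c in range(left, left + BOX_COLS)
    )


def solve(grid, pos=0):
    """Yield every completed grid, filling the fields in reading order."""
    if pos == SIZE * SIZE:
        yield [line[:] for line in grid]
        return
    row, col = divmod(pos, SIZE)
    if grid[row][col]:                    # given digit: nothing to try
        yield from solve(grid, pos + 1)
        return
    for digit in range(1, SIZE + 1):      # try all 6 digits for the field
        if not conflicts(grid, row, col, digit):
            grid[row][col] = digit
            yield from solve(grid, pos + 1)
            grid[row][col] = 0            # backtrack: erase the field


grid = [[GIVENS.get((i, j), 0) for j in range(1, SIZE + 1)]
        for i in range(1, SIZE + 1)]
solutions = list(solve(grid))

for line in solutions[0]:
    print(" ".join(map(str, line)))
```

Collecting all results from the generator corresponds to the "we are not finished yet" step: the search continues after the first complete grid.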

Page 43: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

5 3 6 2 1 4

2 4 1 6 5 3

4 6 3 1 2 5

1 5 2 3 4 6

3 1 5 4 6 2

6 2 4 5 . 1

Page 44: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

5 3 6 2 1 4

2 4 1 6 5 3

4 6 3 1 2 5

1 5 2 3 4 6

3 1 5 4 6 2

6 2 4 5 1 1

Page 45: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

5 3 6 2 1 4

2 4 1 6 5 3

4 6 3 1 2 5

1 5 2 3 4 6

3 1 5 4 6 2

6 2 4 5 2 1

Page 46: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

5 3 6 2 1 4

2 4 1 6 5 3

4 6 3 1 2 5

1 5 2 3 4 6

3 1 5 4 6 2

6 2 4 5 3 1

Page 47: The Lost Art of the Self Join

How to Solve a Sudoku „Brute Force“

• We‘re not finished yet!

‣ There might be another solution…

‣ So, backtrack and try other possibilities…

Page 48: The Lost Art of the Self Join

Solving a Sudoku with one SELECT (1)

SELECT CONCAT(
  d11.d, ' ', d12.d, ' ', d13.d, ' ', d14.d, ' ', d15.d, ' ', d16.d, ' ', CHAR(10),
  d21.d, ' ', d22.d, ' ', d23.d, ' ', d24.d, ' ', d25.d, ' ', d26.d, ' ', CHAR(10),
  d31.d, ' ', d32.d, ' ', d33.d, ' ', d34.d, ' ', d35.d, ' ', d36.d, ' ', CHAR(10),
  d41.d, ' ', d42.d, ' ', d43.d, ' ', d44.d, ' ', d45.d, ' ', d46.d, ' ', CHAR(10),
  d51.d, ' ', d52.d, ' ', d53.d, ' ', d54.d, ' ', d55.d, ' ', d56.d, ' ', CHAR(10),
  d61.d, ' ', d62.d, ' ', d63.d, ' ', d64.d, ' ', d65.d, ' ', d66.d, ' ', CHAR(10)
) AS solution
FROM digits d11
INNER JOIN digits d12
ON COALESCE(d12.d = ( SELECT d FROM start WHERE i = 1 AND j = 2 ), 1)
AND d12.d != d11.d
INNER JOIN digits d13
ON COALESCE(d13.d = ( SELECT d FROM start WHERE i = 1 AND j = 3 ), 1)
AND d13.d != d11.d AND d13.d != d12.d
INNER JOIN digits d14
ON COALESCE(d14.d = ( SELECT d FROM start WHERE i = 1 AND j = 4 ), 1)
AND d14.d != d11.d AND d14.d != d12.d AND d14.d != d13.d
INNER JOIN digits d15
ON COALESCE(d15.d = ( SELECT d FROM start WHERE i = 1 AND j = 5 ), 1)
AND d15.d != d11.d AND d15.d != d12.d AND d15.d != d13.d AND d15.d != d14.d
INNER JOIN digits d16
ON COALESCE(d16.d = ( SELECT d FROM start WHERE i = 1 AND j = 6 ), 1)
AND d16.d != d11.d AND d16.d != d12.d AND d16.d != d13.d AND d16.d != d14.d AND d16.d != d15.d
INNER JOIN digits d21
ON COALESCE(d21.d = ( SELECT d FROM start WHERE i = 2 AND j = 1 ), 1)
AND d21.d != d11.d
AND d21.d != d12.d AND d21.d != d13.d
INNER JOIN digits d22
ON COALESCE(d22.d = ( SELECT d FROM start WHERE i = 2 AND j = 2 ), 1)
AND d22.d != d21.d -- same row
AND d22.d != d12.d -- same column
AND d22.d != d11.d AND d22.d != d13.d -- same 2-by-3 box
INNER JOIN digits d23
ON COALESCE(d23.d = ( SELECT d FROM start WHERE i = 2 AND j = 3 ), 1)
AND d23.d != d21.d AND d23.d != d22.d
AND d23.d != d13.d
AND d23.d != d11.d AND d23.d != d12.d

Page 49: The Lost Art of the Self Join

Solving a Sudoku with one SELECT (2)

INNER JOIN digits d24
ON COALESCE(d24.d = ( SELECT d FROM start WHERE i = 2 AND j = 4 ), 1)
AND d24.d != d21.d AND d24.d != d22.d AND d24.d != d23.d
AND d24.d != d14.d
AND d24.d != d15.d AND d24.d != d16.d
INNER JOIN digits d25
ON COALESCE(d25.d = ( SELECT d FROM start WHERE i = 2 AND j = 5 ), 1)
AND d25.d != d21.d AND d25.d != d22.d AND d25.d != d23.d AND d25.d != d24.d
AND d25.d != d15.d
AND d25.d != d14.d AND d25.d != d16.d
INNER JOIN digits d26
ON COALESCE(d26.d = ( SELECT d FROM start WHERE i = 2 AND j = 6 ), 1)
AND d26.d != d21.d AND d26.d != d22.d AND d26.d != d23.d AND d26.d != d24.d AND d26.d != d25.d
AND d26.d != d16.d
AND d26.d != d14.d AND d26.d != d15.d
INNER JOIN digits d31
ON COALESCE(d31.d = ( SELECT d FROM start WHERE i = 3 AND j = 1 ), 1)
AND d31.d != d11.d AND d31.d != d21.d
INNER JOIN digits d32
ON COALESCE(d32.d = ( SELECT d FROM start WHERE i = 3 AND j = 2 ), 1)
AND d32.d != d31.d
AND d32.d != d12.d AND d32.d != d22.d
INNER JOIN digits d33
ON COALESCE(d33.d = ( SELECT d FROM start WHERE i = 3 AND j = 3 ), 1)
AND d33.d != d31.d AND d33.d != d32.d
AND d33.d != d13.d AND d33.d != d23.d
INNER JOIN digits d34
ON COALESCE(d34.d = ( SELECT d FROM start WHERE i = 3 AND j = 4 ), 1)
AND d34.d != d31.d AND d34.d != d32.d AND d34.d != d33.d
AND d34.d != d14.d AND d34.d != d24.d
INNER JOIN digits d35
ON COALESCE(d35.d = ( SELECT d FROM start WHERE i = 3 AND j = 5 ), 1)
AND d35.d != d31.d AND d35.d != d32.d AND d35.d != d33.d AND d35.d != d34.d
AND d35.d != d15.d AND d35.d != d25.d
INNER JOIN digits d36
ON COALESCE(d36.d = ( SELECT d FROM start WHERE i = 3 AND j = 6 ), 1)
AND d36.d != d31.d AND d36.d != d32.d AND d36.d != d33.d AND d36.d != d34.d AND d36.d != d35.d
AND d36.d != d16.d AND d36.d != d26.d

Page 50: The Lost Art of the Self Join

Solving a Sudoku with one SELECT (3)

INNER JOIN digits d41
ON COALESCE(d41.d = ( SELECT d FROM start WHERE i = 4 AND j = 1 ), 1)
AND d41.d != d11.d AND d41.d != d21.d AND d41.d != d31.d
AND d41.d != d32.d AND d41.d != d33.d
INNER JOIN digits d42
ON COALESCE(d42.d = ( SELECT d FROM start WHERE i = 4 AND j = 2 ), 1)
AND d42.d != d41.d
AND d42.d != d12.d AND d42.d != d22.d AND d42.d != d32.d
AND d42.d != d31.d AND d42.d != d33.d
INNER JOIN digits d43
ON COALESCE(d43.d = ( SELECT d FROM start WHERE i = 4 AND j = 3 ), 1)
AND d43.d != d41.d AND d43.d != d42.d
AND d43.d != d13.d AND d43.d != d23.d AND d43.d != d33.d
AND d43.d != d31.d AND d43.d != d32.d
INNER JOIN digits d44
ON COALESCE(d44.d = ( SELECT d FROM start WHERE i = 4 AND j = 4 ), 1)
AND d44.d != d41.d AND d44.d != d42.d AND d44.d != d43.d
AND d44.d != d14.d AND d44.d != d24.d AND d44.d != d34.d
AND d44.d != d35.d AND d44.d != d36.d
INNER JOIN digits d45
ON COALESCE(d45.d = ( SELECT d FROM start WHERE i = 4 AND j = 5 ), 1)
AND d45.d != d41.d AND d45.d != d42.d AND d45.d != d43.d AND d45.d != d44.d
AND d45.d != d15.d AND d45.d != d25.d AND d45.d != d35.d
AND d45.d != d34.d AND d45.d != d36.d
INNER JOIN digits d46
ON COALESCE(d46.d = ( SELECT d FROM start WHERE i = 4 AND j = 6 ), 1)
AND d46.d != d41.d AND d46.d != d42.d AND d46.d != d43.d AND d46.d != d44.d AND d46.d != d45.d
AND d46.d != d16.d AND d46.d != d26.d AND d46.d != d36.d
AND d46.d != d34.d AND d46.d != d35.d
INNER JOIN digits d51
ON COALESCE(d51.d = ( SELECT d FROM start WHERE i = 5 AND j = 1 ), 1)
AND d51.d != d11.d AND d51.d != d21.d AND d51.d != d31.d AND d51.d != d41.d
INNER JOIN digits d52
ON COALESCE(d52.d = ( SELECT d FROM start WHERE i = 5 AND j = 2 ), 1)
AND d52.d != d51.d
AND d52.d != d12.d AND d52.d != d22.d AND d52.d != d32.d AND d52.d != d42.d

Page 51: The Lost Art of the Self Join

Solving a Sudoku with one SELECT (4)

INNER JOIN digits d53
ON COALESCE(d53.d = ( SELECT d FROM start WHERE i = 5 AND j = 3 ), 1)
AND d53.d != d51.d AND d53.d != d52.d
AND d53.d != d13.d AND d53.d != d23.d AND d53.d != d33.d AND d53.d != d43.d
INNER JOIN digits d54
ON COALESCE(d54.d = ( SELECT d FROM start WHERE i = 5 AND j = 4 ), 1)
AND d54.d != d51.d AND d54.d != d52.d AND d54.d != d53.d
AND d54.d != d14.d AND d54.d != d24.d AND d54.d != d34.d AND d54.d != d44.d
INNER JOIN digits d55
ON COALESCE(d55.d = ( SELECT d FROM start WHERE i = 5 AND j = 5 ), 1)
AND d55.d != d51.d AND d55.d != d52.d AND d55.d != d53.d AND d55.d != d54.d
AND d55.d != d15.d AND d55.d != d25.d AND d55.d != d35.d AND d55.d != d45.d
INNER JOIN digits d56
ON COALESCE(d56.d = ( SELECT d FROM start WHERE i = 5 AND j = 6 ), 1)
AND d56.d != d51.d AND d56.d != d52.d AND d56.d != d53.d AND d56.d != d54.d AND d56.d != d55.d
AND d56.d != d16.d AND d56.d != d26.d AND d56.d != d36.d AND d56.d != d46.d
INNER JOIN digits d61
ON COALESCE(d61.d = ( SELECT d FROM start WHERE i = 6 AND j = 1 ), 1)
AND d61.d != d11.d AND d61.d != d21.d AND d61.d != d31.d AND d61.d != d41.d AND d61.d != d51.d
AND d61.d != d52.d AND d61.d != d53.d
INNER JOIN digits d62
ON COALESCE(d62.d = ( SELECT d FROM start WHERE i = 6 AND j = 2 ), 1)
AND d62.d != d61.d
AND d62.d != d12.d AND d62.d != d22.d AND d62.d != d32.d AND d62.d != d42.d AND d62.d != d52.d
AND d62.d != d51.d AND d62.d != d53.d
INNER JOIN digits d63
ON COALESCE(d63.d = ( SELECT d FROM start WHERE i = 6 AND j = 3 ), 1)
AND d63.d != d61.d AND d63.d != d62.d
AND d63.d != d13.d AND d63.d != d23.d AND d63.d != d33.d AND d63.d != d43.d AND d63.d != d53.d
AND d63.d != d51.d AND d63.d != d52.d
INNER JOIN digits d64
ON COALESCE(d64.d = ( SELECT d FROM start WHERE i = 6 AND j = 4 ), 1)
AND d64.d != d61.d AND d64.d != d62.d AND d64.d != d63.d
AND d64.d != d14.d AND d64.d != d24.d AND d64.d != d34.d AND d64.d != d44.d AND d64.d != d54.d
AND d64.d != d55.d AND d64.d != d56.d

Page 52: The Lost Art of the Self Join

Solving a Sudoku with one SELECT (5)

INNER JOIN digits d65
ON COALESCE(d65.d = ( SELECT d FROM start WHERE i = 6 AND j = 5 ), 1)
AND d65.d != d61.d AND d65.d != d62.d AND d65.d != d63.d AND d65.d != d64.d
AND d65.d != d15.d AND d65.d != d25.d AND d65.d != d35.d AND d65.d != d45.d AND d65.d != d55.d
AND d65.d != d54.d AND d65.d != d56.d
INNER JOIN digits d66
ON COALESCE(d66.d = ( SELECT d FROM start WHERE i = 6 AND j = 6 ), 1)
AND d66.d != d61.d AND d66.d != d62.d AND d66.d != d63.d AND d66.d != d64.d AND d66.d != d65.d
AND d66.d != d16.d AND d66.d != d26.d AND d66.d != d36.d AND d66.d != d46.d AND d66.d != d56.d
AND d66.d != d54.d AND d66.d != d55.d
WHERE COALESCE(d11.d = ( SELECT d FROM start WHERE i = 1 AND j = 1 ), 1)

Page 53: The Lost Art of the Self Join

Table `digits` for the „pool“ of digits

+---+
| d |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+---+

Page 54: The Lost Art of the Self Join

Table `start` for initial conditions

+---+---+------+
| i | j | d    |
+---+---+------+
| 1 | 3 |    6 |
| 2 | 4 |    6 |
| 2 | 6 |    3 |
| 3 | 1 |    4 |
| 3 | 4 |    1 |
| 4 | 1 |    1 |
| 4 | 4 |    3 |
| 4 | 6 |    6 |
| 5 | 3 |    5 |
| 5 | 4 |    4 |
| 5 | 5 |    6 |
| 6 | 6 |    1 |
+---+---+------+

[The same initial conditions as a grid; . marks an empty field]

. . 6 . . .
. . . 6 . 3
4 . . 1 . .
1 . . 3 . 6
. . 5 4 6 .
. . . . . 1
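Writing 36 table references by hand is error-prone, so a query of this shape can also be generated. The sketch below is an illustration under assumptions, not the presenter's code. It uses Python's sqlite3 module instead of MySQL, puts all conditions into one WHERE clause, uses CROSS JOIN so that SQLite keeps the written join order, and emits an inequality for every pair of fields that share a row, a column, or a 2-by-3 box. It returns one column per field rather than a CONCAT string.

```python
import sqlite3

SIZE, BOX_ROWS, BOX_COLS = 6, 2, 3
GIVENS = [  # rows of the `start` table: (i, j, d)
    (1, 3, 6), (2, 4, 6), (2, 6, 3), (3, 1, 4), (3, 4, 1), (4, 1, 1),
    (4, 4, 3), (4, 6, 6), (5, 3, 5), (5, 4, 4), (5, 5, 6), (6, 6, 1),
]
cells = [(i, j) for i in range(1, SIZE + 1) for j in range(1, SIZE + 1)]


def same_unit(a, b):
    """True if two fields share a row, a column or a 2-by-3 box."""
    same_box = ((a[0] - 1) // BOX_ROWS == (b[0] - 1) // BOX_ROWS
                and (a[1] - 1) // BOX_COLS == (b[1] - 1) // BOX_COLS)
    return a[0] == b[0] or a[1] == b[1] or same_box


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE digits (d INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO digits VALUES (?)",
                 [(d,) for d in range(1, SIZE + 1)])
conn.execute("CREATE TABLE start (i INTEGER, j INTEGER, d INTEGER, "
             "PRIMARY KEY (i, j))")
conn.executemany("INSERT INTO start VALUES (?, ?, ?)", GIVENS)

conditions = []
for n, (i, j) in enumerate(cells):
    # Either the field matches its given digit, or there is no given digit
    # (the subquery yields NULL and COALESCE turns the comparison into 1).
    conditions.append(
        f"COALESCE(d{i}{j}.d = "
        f"(SELECT d FROM start WHERE i = {i} AND j = {j}), 1)"
    )
    # The field must differ from every earlier field in the same unit.
    for (pi, pj) in cells[:n]:
        if same_unit((i, j), (pi, pj)):
            conditions.append(f"d{i}{j}.d != d{pi}{pj}.d")

query = (
    "SELECT " + ", ".join(f"d{i}{j}.d" for i, j in cells)
    + " FROM " + " CROSS JOIN ".join(f"digits AS d{i}{j}" for i, j in cells)
    + " WHERE " + " AND ".join(conditions)
)

solutions = [
    [list(row[r * SIZE:(r + 1) * SIZE]) for r in range(SIZE)]
    for row in conn.execute(query)
]

for line in solutions[0]:
    print(" ".join(map(str, line)))
```

Each result row is one complete grid, so a puzzle with several solutions simply yields several rows.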

Page 55: The Lost Art of the Self Join

How the query works: First field

…
FROM digits d11
INNER JOIN digits d12
ON COALESCE( d12.d = ( SELECT d FROM start WHERE i = 1 AND j = 2 ), 1 )
AND d12.d != d11.d
INNER JOIN digits d13
ON COALESCE( d13.d = ( SELECT d FROM start WHERE i = 1 AND j = 3 ), 1 )
AND d13.d != d11.d AND d13.d != d12.d
…

Page 56: The Lost Art of the Self Join

How the query works: Second field

…
FROM digits d11
INNER JOIN digits d12
ON COALESCE( d12.d = ( SELECT d FROM start WHERE i = 1 AND j = 2 ), 1 )
AND d12.d != d11.d
INNER JOIN digits d13
ON COALESCE( d13.d = ( SELECT d FROM start WHERE i = 1 AND j = 3 ), 1 )
AND d13.d != d11.d AND d13.d != d12.d
…

Page 57: The Lost Art of the Self Join

How the query works: Third field

…
FROM digits d11
INNER JOIN digits d12
ON COALESCE( d12.d = ( SELECT d FROM start WHERE i = 1 AND j = 2 ), 1 )
AND d12.d != d11.d
INNER JOIN digits d13
ON COALESCE( d13.d = ( SELECT d FROM start WHERE i = 1 AND j = 3 ), 1 )
AND d13.d != d11.d AND d13.d != d12.d
…

Page 58: The Lost Art of the Self Join

How the query works: Last field

…
INNER JOIN digits d66
ON COALESCE( … )
AND d66.d != d61.d AND d66.d != d62.d AND d66.d != d63.d AND d66.d != d64.d AND d66.d != d65.d
AND d66.d != d16.d AND d66.d != d26.d AND d66.d != d36.d AND d66.d != d46.d AND d66.d != d56.d
AND d66.d != d54.d AND d66.d != d55.d
…


Page 62: The Lost Art of the Self Join

Conclusions from the „Sudoku-Case“

• Declarative Paradigm (Constraint Programming)

‣ Don‘t care about the „how“, but about the „what“

‣ Optimizer does a great job!

• (Ab-)use built-in Backtracking of Join Engine

• A query might look awkward – but still performs!

Page 63: The Lost Art of the Self Join

Some reasons for reasonable performance…

• Very small table (`digits`) and covering index

• Small result set: Always working on one row!

• Subqueries basically optimized away

‣ „Impossible WHERE noticed“ (no pre-condition case)

‣ Constant (pre-condition case)

• Optimizer/Join Engine is good at this stuff!

+----+-------------+-------+-------+---------------+---------+---------+-------------+------+-----------------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref         | rows | Extra                                               |
+----+-------------+-------+-------+---------------+---------+---------+-------------+------+-----------------------------------------------------+
| 1  | PRIMARY     | d11   | index | NULL          | PRIMARY | 1       | NULL        | 6    | Using where; Using index                            |
| 1  | PRIMARY     | d12   | index | NULL          | PRIMARY | 1       | NULL        | 6    | Using where; Using index                            |
| …  | …           | …     | …     | …             | …       | …       | …           | …    | …                                                   |
| 37 | SUBQUERY    | NULL  | NULL  | NULL          | NULL    | NULL    | NULL        | NULL | Impossible WHERE noticed after reading const tables |
| 36 | SUBQUERY    | start | const | PRIMARY       | PRIMARY | 2       | const,const | 1    |                                                     |
| …  | …           | …     | …     | …             | …       | …       | …           | …    | …                                                   |
+----+-------------+-------+-------+---------------+---------+---------+-------------+------+-----------------------------------------------------+
72 rows in set (0.01 sec)

Page 64: The Lost Art of the Self Join

Final Message

• Have fun with the declarative power of SQL!

‣ Despite its flaws…

• Do it the SQL way!

• Slides and code will be made available on conference website

• Check out Developer Zone on MySQL website for an upcoming article version of my last year‘s session „The declarative power of VIEWs“

Page 65: The Lost Art of the Self Join

This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.