still using windows 3.1? - modern sql · select * from t1 cross join lateral (select * from t2...
TRANSCRIPT
Still using Windows 3.1? So why stick with
SQL-92?
@ModernSQL - https://modern-sql.com/ @MarkusWinand
SQL:1999
LATERAL
Select-list sub-queries must be scalar[0]:
LATERAL Before SQL:1999
SELECT…,(SELECTcolumn_1FROMt1WHEREt1.x=t2.y)AScFROMt2…
(an atomic quantity that can hold only one value at a time[1])
[0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar
Select-list sub-queries must be scalar[0]:
LATERAL Before SQL:1999
SELECT…,(SELECTcolumn_1FROMt1WHEREt1.x=t2.y)AScFROMt2…
(an atomic quantity that can hold only one value at a time[1])
[0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar
✗,column_2
More thanone column? ⇒Syntax error
Select-list sub-queries must be scalar[0]:
LATERAL Before SQL:1999
SELECT…,(SELECTcolumn_1FROMt1WHEREt1.x=t2.y)AScFROMt2…
(an atomic quantity that can hold only one value at a time[1])
[0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar
✗,column_2
More thanone column? ⇒Syntax error
}More thanone row?
⇒Runtime error!
SELECT*FROMt1CROSSJOINLATERAL(SELECT*FROMt2WHEREt2.x=t1.x)derived_tableON(true)
LATERAL Since SQL:1999Lateral derived queries can see table names defined before:
SELECT*FROMt1CROSSJOINLATERAL(SELECT*FROMt2WHEREt2.x=t1.x)derived_tableON(true)
LATERAL Since SQL:1999
Valid due to
LATERAL
keyword
Lateral derived queries can see table names defined before:
SELECT*FROMt1CROSSJOINLATERAL(SELECT*FROMt2WHEREt2.x=t1.x)derived_tableON(true)
LATERAL Since SQL:1999
Valid due to
LATERAL
keyword
Useless, but still required
Lateral derived queries can see table names defined before:
SELECT*FROMt1CROSSJOINLATERAL(SELECT*FROMt2WHEREt2.x=t1.x)derived_tableON(true)
LATERAL Since SQL:1999
Valid due to
LATERAL
keyword
Lateral derived queries can see table names defined before:
Use CROSS JOIN to omit the ON clause
But WHY?
Use-CasesLATERAL
‣ Top-N per group
inside a lateral derived tableFETCHFIRST (or LIMIT, TOP)applies per row from left tables.
Use-CasesLATERAL
FROMtCROSSJOINLATERAL(SELECT…FROM…WHEREt.c=…ORDERBY…LIMIT10)derived_table
‣ Top-N per group
inside a lateral derived tableFETCHFIRST (or LIMIT, TOP)applies per row from left tables.
Use-CasesLATERAL
FROMtCROSSJOINLATERAL(SELECT…FROM…WHEREt.c=…ORDERBY…LIMIT10)derived_table
‣ Top-N per group
inside a lateral derived tableFETCHFIRST (or LIMIT, TOP)applies per row from left tables.
Use-CasesLATERAL
Add proper indexfor Top-N query
https://use-the-index-luke.com/sql/partial-results/top-n-queries
FROMtCROSSJOINLATERAL(SELECT…FROM…WHEREt.c=…ORDERBY…LIMIT10)derived_table
‣ Top-N per group
inside a lateral derived tableFETCHFIRST (or LIMIT, TOP)applies per row from left tables.
‣ Also useful to find most recent news from several subscribed topics (“multi-source top-N”).
Use-CasesLATERAL
FROMtCROSSJOINLATERAL(SELECT…FROM…WHEREt.c=…ORDERBY…LIMIT10)derived_table
‣ Top-N per group
inside a lateral derived tableFETCHFIRST (or LIMIT, TOP)applies per row from left tables.
‣ Also useful to find most recent news from several subscribed topics (“multi-source top-N”).
‣ Table function arguments
(TABLE often implies LATERAL)
Use-CasesLATERAL
FROMtJOINTABLE(your_func(t.c))
LATERAL Availability1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 MariaDB8.0.14 MySQL
9.3 PostgreSQLSQLite
9.1 DB2 LUW11gR1[0] 12cR1 Oracle
2005[1] SQL Server[0]Undocumented. Requires setting trace event 22829.[1]LATERAL is not supported as of SQL Server 2016 but [CROSS|OUTER]APPLY can be used for the same effect.
WITHRECURSIVE(Common Table Expressions)
WITHRECURSIVECREATETABLEt(idINTEGER,parentINTEGER)
WITHRECURSIVECREATETABLEt(idINTEGER,parentINTEGER)
WITHRECURSIVECREATETABLEt(idINTEGER,parentINTEGER)
WITHRECURSIVE Since SQL:1999
SELECTt.id,t.parentFROMtWHEREt.id=?
WITHRECURSIVE Since SQL:1999
SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtWHEREt.parent=?
WITHRECURSIVE Since SQL:1999
SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtWHEREt.parent=?
WITHRECURSIVE Since SQL:1999
SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtWHEREt.parent=?
WITHRECURSIVE Since SQL:1999
WITHRECURSIVEprev(id,parent)AS(
)
SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtWHEREt.parent=?
SELECT*FROMprev
WITHRECURSIVE Since SQL:1999
WITHRECURSIVEprev(id,parent)AS(
)
SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtJOINprevONt.parent=prev.id
SELECT*FROMprev
WITHRECURSIVE Since SQL:1999
WITHRECURSIVEprev(id,parent)AS(
)
SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtJOINprevONt.parent=prev.id
SELECT*FROMprev
WITHRECURSIVE Since SQL:1999
WITHRECURSIVEprev(id,parent)AS(
)
SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtJOINprevONt.parent=prev.id
SELECT*FROMprev
WITHRECURSIVE Since SQL:1999
WITHRECURSIVE Use Cases
“[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer performance superior to that of the dedicated graph databases.” event.cwi.nl/grades2013/07-welc.pdf
‣ Processing graphs
Shortest route from person A to B in LinkedIn/Facebook/…
WITHRECURSIVE Use Cases
“[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer performance superior to that of the dedicated graph databases.” event.cwi.nl/grades2013/07-welc.pdf
https://wiki.postgresql.org/wiki/Loose_indexscan
† n … # distinct values, N … # of table rows. Suitable index required
‣ Processing graphs
Shortest route from person A to B in LinkedIn/Facebook/…
‣ Finding distinct values
in n*log(N)† time complexity
WITHRECURSIVE Use Cases
WITHRECURSIVEgen(n)AS(SELECT1UNIONALLSELECTn+1FROMgenWHEREn<?)SELECT*FROMgen
“[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer performance superior to that of the dedicated graph databases.” event.cwi.nl/grades2013/07-welc.pdf
https://wiki.postgresql.org/wiki/Loose_indexscan
† n … # distinct values, N … # of table rows. Suitable index required
‣ Processing graphs
Shortest route from person A to B in LinkedIn/Facebook/…
‣ Finding distinct values
in n*log(N)† time complexity
‣ Row generators
To fill gaps (e.g., in time series), generate test data
WITHRECURSIVE Use Cases
AvailabilityWITHRECURSIVE1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 10.2 MariaDB8.0 MySQL
8.4 PostgreSQL3.8.3[0] SQLite
7.0 DB2 LUW11gR2 Oracle
2005 SQL Server[0]Only for top-level SELECT statements
GROUPINGSETS
Only one GROUPBY operation at a time:
GROUPINGSETS Before SQL:1999
Only one GROUPBY operation at a time:
GROUPINGSETS Before SQL:1999
SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month
Monthly revenue
Only one GROUPBY operation at a time:
GROUPINGSETS Before SQL:1999
SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month
Monthly revenue Yearly revenue
SELECTyear,sum(revenue)FROMtblGROUPBYyear
GROUPINGSETS Before SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month
SELECTyear,sum(revenue)FROMtblGROUPBYyear
GROUPINGSETS Before SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month
SELECTyear,sum(revenue)FROMtblGROUPBYyear
UNIONALL
,null
GROUPINGSETS Since SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month
SELECTyear,sum(revenue)FROMtblGROUPBYyear
UNIONALL
,null
SELECTyear,month,sum(revenue)FROMtblGROUPBYGROUPINGSETS((year,month),(year))
GROUPINGSETS Availability1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1[0] MariaDB5.0[1] MySQL
9.5 PostgreSQLSQLite
5 DB2 LUW9iR1 Oracle
2008 SQL Server[0]Only ROLLUP (properitery syntax).[1]Only ROLLUP (properitery syntax). GROUPING function since MySQL 8.0.
SQL:2003
OVERand
PARTITIONBY
OVER (PARTITION BY) The ProblemTwo distinct concepts could not be used independently:
OVER (PARTITION BY) The ProblemTwo distinct concepts could not be used independently:
‣Merge rows with the same key properties
‣ GROUPBY to specify key properties
‣ DISTINCT to use full row as key
OVER (PARTITION BY) The ProblemTwo distinct concepts could not be used independently:
‣Merge rows with the same key properties
‣ GROUPBY to specify key properties
‣ DISTINCT to use full row as key
‣ Aggregate data from related rows ‣ Requires GROUPBY to segregate the rows
‣ COUNT, SUM, AVG, MIN, MAX to aggregate grouped rows
OVER (PARTITION BY) The Problem
OVER (PARTITION BY) The Problem
SELECTc1,c2FROMt
SELECTc1,c2FROMt
OVER (PARTITION BY) The Problem
Yes ⇠
Mer
ge ro
ws ⇢
No SELECTc1
,c2FROMt
SELECTc1,c2FROMt
OVER (PARTITION BY) The Problem
Yes ⇠
Mer
ge ro
ws ⇢
No SELECTc1
,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMt
OVER (PARTITION BY) The Problem
Yes ⇠
Mer
ge ro
ws ⇢
No
No ⇠ Aggregate ⇢ Yes
SELECTc1,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMt
SELECTc1,SUM(c2)totFROMtGROUPBYc1
OVER (PARTITION BY) The Problem
Yes ⇠
Mer
ge ro
ws ⇢
No
No ⇠ Aggregate ⇢ Yes
SELECTc1,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMt
SELECTc1,SUM(c2)totFROMtGROUPBYc1
SELECTc1,SUM(c2)totFROMtGROUPBYc1
OVER (PARTITION BY) The Problem
Yes ⇠
Mer
ge ro
ws ⇢
No
No ⇠ Aggregate ⇢ Yes
SELECTc1,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMt
SELECTc1,SUM(c2)totFROMtGROUPBYc1
SELECTc1,SUM(c2)totFROMtGROUPBYc1
OVER (PARTITION BY) The Problem
Yes ⇠
Mer
ge ro
ws ⇢
No
No ⇠ Aggregate ⇢ Yes
SELECTc1,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)
SELECTc1,SUM(c2)totFROMtGROUPBYc1
SELECTc1,SUM(c2)totFROMtGROUPBYc1
OVER (PARTITION BY) The Problem
Yes ⇠
Mer
ge ro
ws ⇢
No
No ⇠ Aggregate ⇢ Yes
SELECTc1,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)
SELECTc1,SUM(c2)totFROMtGROUPBYc1
SELECTc1,SUM(c2)totFROMtGROUPBYc1
OVER (PARTITION BY) The Problem
Yes ⇠
Mer
ge ro
ws ⇢
No
No ⇠ Aggregate ⇢ Yes
SELECTc1,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)
SELECTc1,SUM(c2)totFROMtGROUPBYc1
,tot
SELECTc1,SUM(c2)totFROMtGROUPBYc1
OVER (PARTITION BY) The Problem
Yes ⇠
Mer
ge ro
ws ⇢
No
No ⇠ Aggregate ⇢ Yes
SELECTc1,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)
SELECTc1,SUM(c2)totFROMtGROUPBYc1
,tot
SELECTc1,SUM(c2)totFROMtGROUPBYc1
OVER (PARTITION BY) Since SQL:2003
Yes ⇠
Mer
ge ro
ws ⇢
No
No ⇠ Aggregate ⇢ Yes
SELECTc1,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMtFROMt
SELECTc1,SUM(c2)totFROMtGROUPBYc1
OVER (PARTITION BY) Since SQL:2003
Yes ⇠
Mer
ge ro
ws ⇢
No
No ⇠ Aggregate ⇢ Yes
SELECTc1,c2FROMt
SELECTDISTINCTc1,c2FROMt
SELECTc1,c2FROMt
FROMt
,SUM(c2)OVER(PARTITIONBYc1)
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000
OVER (PARTITION BY) How it works
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000
OVER (PARTITION BY) How it works
Look here
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000
OVER (PARTITION BY) How it works
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000
OVER (PARTITION BY) How it works
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000
OVER (PARTITION BY) How it works
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000
OVER (PARTITION BY) How it works
dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000
SELECTdep,salary,SUM(salary)OVER()FROMemp
OVER (PARTITION BY) How it works
)
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 100022 1000 200022 1000 2000333 1000 3000333 1000 3000333 1000 3000
OVER (PARTITION BY) How it works
)
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 100022 1000 200022 1000 2000333 1000 3000333 1000 3000333 1000 3000
OVER (PARTITION BY) How it works
)PARTITIONBYdep
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 100022 1000 200022 1000 2000333 1000 3000333 1000 3000333 1000 3000
OVER (PARTITION BY) How it works
)PARTITIONBYdep
SELECTdep,salary,SUM(salary)OVER()FROMemp
dep salary1 1000 100022 1000 200022 1000 2000333 1000 3000333 1000 3000333 1000 3000
OVER (PARTITION BY) How it works
)PARTITIONBYdep
OVERand
ORDERBY(Framing & Ranking)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) The Problem
SELECTid,value,FROMtransactionst
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) The Problem
SELECTid,value,FROMtransactionst
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) The Problem
SELECTid,value,FROMtransactionst
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) The Problem
SELECTid,value,
(SELECTSUM(value)FROMtransactionst2WHEREt2.id<=t.id)
FROMtransactionst
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) The Problem
SELECTid,value,
(SELECTSUM(value)FROMtransactionst2WHEREt2.id<=t.id)
FROMtransactionst
Range segregation (<=)not possible with
GROUP BY orPARTITION BY
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
(SELECTSUM(value)FROMtransactionst2WHEREt2.id<=t.id)
FROMtransactionst
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYid
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDING
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +30
22 3 -10 +20
333 4 +50 +70
333 5 -30 +40
333 6 -20 +20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10
22 2 +20
22 3 -10
333 4 +50
333 5 -30
333 6 -20
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
OVER (ORDER BY) Since SQL:2003
SELECTid,value,
FROMtransactionst
SUM(value)OVER(
)
acnt id value balance
1 1 +10 +10
22 2 +20 +20
22 3 -10 +10
333 4 +50 +50
333 5 -30 +20
333 6 -20 .0
ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW
PARTITIONBYacnt
OVER (ORDER BY) Since SQL:2003With OVER(ORDERBYn) a new type of functions make sense:
n ROW_NUMBER RANK DENSE_RANK PERCENT_RANK CUME_DISTA 1 1 1 0 0.25B 2 2 2 0.33… 0.75B 3 2 2 0.33… 0.75X 4 4 3 1 1
‣ Aggregates without GROUPBY
Use CasesOVER (SQL:2003)
‣ Aggregates without GROUPBY
‣ Running totals, moving averages
Use Cases
AVG(…)OVER(ORDERBY…ROWSBETWEEN3PRECEDINGAND3FOLLOWING)moving_avg
OVER (SQL:2003)
‣ Aggregates without GROUPBY
‣ Running totals, moving averages
‣ Ranking
Use Cases
AVG(…)OVER(ORDERBY…ROWSBETWEEN3PRECEDINGAND3FOLLOWING)moving_avg
OVER (SQL:2003)
‣ Aggregates without GROUPBY
‣ Running totals, moving averages
‣ Ranking‣ Top-N per Group
Use Cases
SELECT*FROM(SELECTROW_NUMBER()OVER(PARTITIONBY…ORDERBY…)rn,t.*FROMt)numbered_tWHERErn<=3
AVG(…)OVER(ORDERBY…ROWSBETWEEN3PRECEDINGAND3FOLLOWING)moving_avg
OVER (SQL:2003)
‣ Aggregates without GROUPBY
‣ Running totals, moving averages
‣ Ranking‣ Top-N per Group
‣ Avoiding self-joins
Use Cases
SELECT*FROM(SELECTROW_NUMBER()OVER(PARTITIONBY…ORDERBY…)rn,t.*FROMt)numbered_tWHERErn<=3
AVG(…)OVER(ORDERBY…ROWSBETWEEN3PRECEDINGAND3FOLLOWING)moving_avg
OVER (SQL:2003)
‣ Aggregates without GROUPBY
‣ Running totals, moving averages
‣ Ranking‣ Top-N per Group
‣ Avoiding self-joins
[… many more …]
Use Cases
SELECT*FROM(SELECTROW_NUMBER()OVER(PARTITIONBY…ORDERBY…)rn,t.*FROMt)numbered_tWHERErn<=3
AVG(…)OVER(ORDERBY…ROWSBETWEEN3PRECEDINGAND3FOLLOWING)moving_avg
OVER (SQL:2003)
1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 10.2 MariaDB8.0 MySQL
8.4 PostgreSQL3.25.0 SQLite
7.0 DB2 LUW8i Oracle
2005[0] 2012 SQL Server[0]No framing
OVER (SQL:2003) Availability
1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 10.2 MariaDB8.0 MySQL
8.4 PostgreSQL3.25.0 SQLite
7.0 DB2 LUW8i Oracle
2005[0] 2012 SQL Server[0]No framing
OVER (SQL:2003) AvailabilityImpalaSpark
NuoDB
BigQuery Hive
Inverse Distribution Functions (percentiles)
Inverse Distribution Functions The ProblemGrouped rows cannot be ordered prior to aggregation.
(how to get the middle value (median) of a set)
SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)
Inverse Distribution Functions The ProblemGrouped rows cannot be ordered prior to aggregation.
(how to get the middle value (median) of a set)
SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)
Inverse Distribution Functions The ProblemGrouped rows cannot be ordered prior to aggregation.
(how to get the middle value (median) of a set)
Number rows
SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)
Inverse Distribution Functions The ProblemGrouped rows cannot be ordered prior to aggregation.
(how to get the middle value (median) of a set)
Number rows
Pick middle one
SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)
Inverse Distribution Functions The ProblemGrouped rows cannot be ordered prior to aggregation.
(how to get the middle value (median) of a set)
Number rows
Pick middle one
SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)
Inverse Distribution Functions The ProblemGrouped rows cannot be ordered prior to aggregation.
(how to get the middle value (median) of a set)
Number rows
Pick middle one
Since SQL:2003Inverse Distribution Functions
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Median
Since SQL:2003Inverse Distribution Functions
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Median
Which value?
Since SQL:2003Inverse Distribution Functions
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
Two variants: ‣ for discrete values (categories) ‣ for continuous values (linear interpolation)
1
2
3
4
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
Two variants: ‣ for discrete values (categories) ‣ for continuous values (linear interpolation)
1
2
3
4
0 0.25 0.5 0.75 11
2
3
4
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
Two variants: ‣ for discrete values (categories) ‣ for continuous values (linear interpolation)
1
2
3
4
0 0.25 0.5 0.75 11
2
3
4
0 0.25 0.5 0.75 11
2
3
4
PERCENTILE_DISC
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
Two variants: ‣ for discrete values (categories) ‣ for continuous values (linear interpolation)
1
2
3
4
0 0.25 0.5 0.75 11
2
3
4
0 0.25 0.5 0.75 11
2
3
4
PERCENTILE_DISC0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_DISC
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
Two variants: ‣ for discrete values (categories) ‣ for continuous values (linear interpolation)
1
2
3
4
0 0.25 0.5 0.75 11
2
3
4
0 0.25 0.5 0.75 11
2
3
4
PERCENTILE_DISC0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_DISC0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_DISC(0.5)
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
Two variants: ‣ for discrete values (categories) ‣ for continuous values (linear interpolation)
1
2
3
4
0 0.25 0.5 0.75 11
2
3
4
0 0.25 0.5 0.75 11
2
3
4
PERCENTILE_DISC0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_DISC0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_DISC(0.5)0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_CONT
PERCENTILE_DISC(0.5)
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
Two variants: ‣ for discrete values (categories) ‣ for continuous values (linear interpolation)
1
2
3
4
0 0.25 0.5 0.75 11
2
3
4
0 0.25 0.5 0.75 11
2
3
4
PERCENTILE_DISC0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_DISC0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_DISC(0.5)0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_CONT
PERCENTILE_DISC(0.5)0 0.25 0.5 0.75 1
1
2
3
4
PERCENTILE_CONT(0.5)
PERCENTILE_DISC(0.5)
SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata
Since SQL:2003Inverse Distribution Functions
Two variants: ‣ for discrete values (categories) ‣ for continuous values (linear interpolation)
Inverse Distribution Functions Availability1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 10.3[0] MariaDBMySQL
9.4 PostgreSQLSQLite
11.1 DB2 LUW9iR1 Oracle
2012[0] SQL Server[0]Only as window function (requires OVER clause)
TABLESAMPLE
TABLESAMPLE Since SQL:2003
SELECT*FROMtblTABLESAMPLEsystem(10)REPEATABLE(0)
User provided seed
value
TABLESAMPLE Since SQL:2003
SELECT*FROMtblTABLESAMPLEsystem(10)REPEATABLE(0)
Or: bernoulli(10)
better sampling,but slow.
User provided seed
value
TABLESAMPLE Since SQL:2003
SELECT*FROMtblTABLESAMPLEsystem(10)REPEATABLE(0)
Or: bernoulli(10)
better sampling,but slow.
User provided seed
value
TABLESAMPLE Availability1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 MariaDBMySQL
9.5[0] PostgreSQLSQLite
8.2[0] DB2 LUW8i[0] Oracle
2005[0] SQL Server[0]Not for derived tables
SQL:2008
MERGE
Copying rows from another table is easy:
INSERTINTO<target>SELECT…FROM<source>WHERENOTEXISTS(SELECT*FROM<target>WHERE…)
MERGE The Problem
Copying rows from another table is easy:
INSERTINTO<target>SELECT…FROM<source>WHERENOTEXISTS(SELECT*FROM<target>WHERE…)
MERGE The Problem
Both, <target> and <source>
are in scope here.
Deleting rows that exist in another table is also possible:
DELETEFROM<target>WHEREEXISTS(SELECT*FROM<source>WHERE…)
MERGE The Problem
Deleting rows that exist in another table is also possible:
DELETEFROM<target>WHEREEXISTS(SELECT*FROM<source>WHERE…)
MERGE The Problem
Both, <target> and <source>
are in scope here.
Updating rows from another table is awkward:
UPDATE<target>SET…WHERE…
MERGE The Problem
Updating rows from another table is awkward:
UPDATE<target>SET…WHERE…
MERGE The Problem
Requires a name(table or updatable view)
but not a subquery
Updating rows from another table is awkward:
UPDATE<target>SET…WHERE…SET(col1,col2)=(SELECTcol1,col2FROM<source>WHERE…)WHERE…
MERGE The Problem
Both, <target> and <source>
are in scope here.
SQL:2008 introduced merge to improve this situation:
MERGEINTO<target>USING<source>ON<joincondition>WHENMATCHED[AND<cond>]THEN[UPDATE…|DELETE…]WHENNOTMATCHED[AND<cond>]THENINSERT…
‣It has always two tables in scope,the source can be a derived table (subquery). ‣It can do insert, update, or delete in one go.
MERGE Since SQL:2008
…
SQL:2008 introduced merge to improve this situation:
MERGEINTO<target>USING<source>ON<joincondition>WHENMATCHED[AND<cond>]THEN[UPDATE…|DELETE…]WHENNOTMATCHED[AND<cond>]THENINSERT…
‣It has always two tables in scope,the source can be a derived table (subquery). ‣It can do insert, update, or delete in one go.
MERGE Since SQL:2008
SQL:2008 introduced merge to improve this situation:
MERGEINTO<target>USING<source>ON<joincondition>WHENMATCHED[AND<cond>]THEN[UPDATE…|DELETE…]WHENNOTMATCHED[AND<cond>]THENINSERT…
‣It has always two tables in scope,the source can be a derived table (subquery). ‣It can do insert, update, or delete in one go.
MERGE Since SQL:2008
MERGE Since SQL:20081999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 MariaDBMySQLPostgreSQLSQLite
9.1 DB2 LUW9iR1[0] Oracle
2008 SQL Server[0]No AND condition.
FETCHFIRST
FETCHFIRST The ProblemLimit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)
SELECT*FROM(SELECT*,ROW_NUMBER()OVER(ORDERBYx)rnFROMdata)numbered_dataWHERErn<=10
FETCHFIRST The ProblemLimit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)
SELECT*FROM(SELECT*,ROW_NUMBER()OVER(ORDERBYx)rnFROMdata)numbered_dataWHERErn<=10
FETCHFIRST The ProblemLimit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)
SQL:2003 introduced ROW_NUMBER() to number rows.But this still requires wrapping to limit the result.
And how about databases not supporting ROW_NUMBER()?
SELECT*FROM(SELECT*,ROW_NUMBER()OVER(ORDERBYx)rnFROMdata)numbered_dataWHERErn<=10
FETCHFIRST The ProblemLimit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)
SQL:2003 introduced ROW_NUMBER() to number rows.But this still requires wrapping to limit the result.
And how about databases not supporting ROW_NUMBER()?
Dammit! Let's takeLIMIT
SELECT*FROMdataORDERBYxFETCHFIRST10ROWSONLY
FETCHFIRST Since SQL:2008SQL:2008 introduced the FETCHFIRST…ROWSONLY clause:
FETCHFIRST Availability1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 MariaDB3.19.3[0] MySQL6.5[1] 8.4 PostgreSQL
2.1.0[1] SQLite
7.0 DB2 LUW12cR1 Oracle
7.0[2] 2012 SQL Server[0]Earliest mention of LIMIT. Probably inherited from mSQL[1]Functionality available using LIMIT[2]SELECT TOP n ... SQL Server 2000 also supports expressions and bind parameters
SQL:2011
OFFSET
SELECT*FROM(SELECT*,ROW_NUMBER()OVER(ORDERBYx)rnFROMdata)numbered_dataWHERErn>10andrn<=20
OFFSET The ProblemHow to fetch the rows after a limit?
(pagination anybody?)
SELECT*FROMdataORDERBYxOFFSET10ROWSFETCHNEXT10ROWSONLY
OFFSET Since SQL:2011SQL:2011 introduced OFFSET, unfortunately!
SELECT*FROMdataORDERBYxOFFSET10ROWSFETCHNEXT10ROWSONLY
OFFSET Since SQL:2011SQL:2011 introduced OFFSET, unfortunately!
OFFSET
SELECT*FROMdataORDERBYxOFFSET10ROWSFETCHNEXT10ROWSONLY
OFFSET Since SQL:2011SQL:2011 introduced OFFSET, unfortunately!
OFFSETGrab coasters & stickers!
https://use-the-index-luke.com/no-offset
OFFSET Since SQL:20111999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 MariaDB3.20.3[0] 4.0.6[1] MySQL6.5 PostgreSQL
2.1.0 SQLite
9.7[2] 11.1 DB2 LUW12cR1 Oracle
2012 SQL Server[0]LIMIT[offset,]limit: "With this it's easy to do a poor man's next page/previous page WWW application."[1]The release notes say "Added PostgreSQL compatible LIMIT syntax"[2]Requires enabling the MySQL compatibility vector: db2set DB2_COMPATIBILITY_VECTOR=MYS
OVER
OVER (SQL:2011) The ProblemDirect access of other rows of the same window is ugly.
(E.g., calculate the difference to the previous rows)
SELECTbalance,balance-COALESCE(MAX(balance)OVER(ORDERBY…ROWSBETWEEN1PRECEDINGAND1PRECEDING),0)diffFROMt
balance50907030
……………
OVER (SQL:2011) The ProblemDirect access of other rows of the same window is ugly.
(E.g., calculate the difference to the previous rows)
SELECTbalance,balance-COALESCE(MAX(balance)OVER(ORDERBY…ROWSBETWEEN1PRECEDINGAND1PRECEDING),0)diffFROMt
balance50907030
……………
diff+50+40-20-40
OVER (SQL:2011) The ProblemDirect access of other rows of the same window is ugly.
(E.g., calculate the difference to the previous rows)
SELECTbalance,balance-COALESCE(MAX(balance)OVER(ORDERBY…ROWSBETWEEN1PRECEDINGAND1PRECEDING),0)diffFROMt
balance50907030
……………
diff+50+40-20-40
OVER (SQL:2011) The ProblemDirect access of other rows of the same window is ugly.
(E.g., calculate the difference to the previous rows)
SELECTbalance,balance-COALESCE(MAX(balance)OVER(ORDERBY…ROWSBETWEEN1PRECEDINGAND1PRECEDING),0)diffFROMt
balance50907030
……………
diff+50+40-20-40
OVER (SQL:2011) The ProblemDirect access of other rows of the same window is ugly.
(E.g., calculate the difference to the previous rows)
SELECTbalance,balance-COALESCE(MAX(balance)OVER(ORDERBY…ROWSBETWEEN1PRECEDINGAND1PRECEDING),0)diffFROMt
balance50907030
……………
diff+50+40-20-40
OVER (SQL:2011) The ProblemDirect access of other rows of the same window is ugly.
(E.g., calculate the difference to the previous rows)
SELECTbalance,balance-COALESCE(MAX(balance)OVER(ORDERBY…ROWSBETWEEN1PRECEDINGAND1PRECEDING),0)diffFROMt
balance50907030
……………
diff+50+40-20-40
OVER (SQL:2011) Since SQL:2011SQL:2011 introduced LEAD, LAG, NTH_VALUE, … for that:
SELECT*,balance-COALESCE(LAG(balance)OVER(ORDERBYx),0)FROMt
OVER (SQL:2011) Since SQL:2011SQL:2011 introduced LEAD, LAG, NTH_VALUE, … for that:
SELECT*,balance-COALESCE(LAG(balance)OVER(ORDERBYx),0)FROMt
Available functions:LEAD/LAGFIRST_VALUE/LAST_VALUENTH_VALUE(col,n)FROMFIRST/LASTRESPECT/IGNORENULLS
OVER (SQL:2011) Since SQL:2011SQL:2011 introduced LEAD, LAG, NTH_VALUE, … for that:
OVER (LEAD, LAG, …) Since SQL:20111999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 10.2[0] MariaDB8.0[0] MySQL
8.4[0] PostgreSQL3.25.0[0] SQLite
9.5[1] 11.1 DB2 LUW8i[1] 11gR2 Oracle
2012[1] SQL Server[0]No IGNORENULLS and FROMLAST[1]No NTH_VALUE
System Versioning (Time Traveling)
INSERTUPDATEDELETE
are DESTRUCTIVE
System Versioning The Problem
System Versioning Since SQL:2011Table can be system versioned, application versioned or both.
CREATETABLEt(...,
System Versioning Since SQL:2011Table can be system versioned, application versioned or both.
CREATETABLEt(...,start_tsTIMESTAMP(9)GENERATEDALWAYSASROWSTART,
System Versioning Since SQL:2011Table can be system versioned, application versioned or both.
CREATETABLEt(...,start_tsTIMESTAMP(9)GENERATEDALWAYSASROWSTART,end_tsTIMESTAMP(9)GENERATEDALWAYSASROWEND,
System Versioning Since SQL:2011Table can be system versioned, application versioned or both.
CREATETABLEt(...,start_tsTIMESTAMP(9)GENERATEDALWAYSASROWSTART,end_tsTIMESTAMP(9)GENERATEDALWAYSASROWEND,
PERIODFORSYSTEM_TIME(start_ts,end_ts)
System Versioning Since SQL:2011Table can be system versioned, application versioned or both.
CREATETABLEt(...,start_tsTIMESTAMP(9)GENERATEDALWAYSASROWSTART,end_tsTIMESTAMP(9)GENERATEDALWAYSASROWEND,
PERIODFORSYSTEM_TIME(start_ts,end_ts))WITHSYSTEMVERSIONING
System Versioning Since SQL:2011Table can be system versioned, application versioned or both.
ID Data start_ts end_ts1 X 10:00:00
UPDATE...SETDATA='Y'...
ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00
DELETE...WHEREID=1
INSERT...(ID,DATA)VALUES(1,'X')
System Versioning Since SQL:2011
ID Data start_ts end_ts1 X 10:00:00
UPDATE...SETDATA='Y'...
ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00
DELETE...WHEREID=1
INSERT...(ID,DATA)VALUES(1,'X')
System Versioning Since SQL:2011
ID Data start_ts end_ts1 X 10:00:00
UPDATE...SETDATA='Y'...
ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00
DELETE...WHEREID=1
INSERT...(ID,DATA)VALUES(1,'X')
System Versioning Since SQL:2011
ID Data start_ts end_ts1 X 10:00:00
UPDATE...SETDATA='Y'...
ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00
DELETE...WHEREID=1
ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00 12:00:00
INSERT...(ID,DATA)VALUES(1,'X')
System Versioning Since SQL:2011
Although multiple versions exist, only the “current” one is visible per default.
After 12:00:00, SELECT*FROMt doesn’t return anything anymore.
ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00 12:00:00
System Versioning Since SQL:2011
ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00 12:00:00
With FOR…ASOF you can query anything you like: SELECT*FROMtFORSYSTEM_TIMEASOFTIMESTAMP'2019-05-0810:30:00'
ID Data start_ts end_ts
1 X 10:00:00 11:00:00
System Versioning Since SQL:2011
System Versioning Since SQL:20111999
2001
2003
2005
2007
2009
2011
2013
2015
2017
5.1 10.3 MariaDBMySQLPostgreSQLSQLite
10.1 DB2 LUW10gR1[0] 11gR1[1] Oracle
2016 SQL Server[0]Short term using Flashback.[1]Flashback Archive. Proprietery syntax.
SQL:2016 (released: 2016-12-15)
JSON_TABLE
JSON Since SQL:2016
[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]
JSON Since SQL:2016
id a1
42 foo
43 bar
[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]
JSON Since SQL:2016
SELECT*FROMJSON_TABLE(?,'$[*]'COLUMNS(idINTPATH'$.id',a1VARCHAR(…)PATH'$.a1'))r
[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]
id a142 foo43 bar
JSON_TABLE Since SQL:2016
SELECT*FROMJSON_TABLE(?,'$[*]'COLUMNS(idINTPATH'$.id',a1VARCHAR(…)PATH'$.a1'))r
[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]
id a142 foo43 bar
Bind Parameter
JSON_TABLE Since SQL:2016
SELECT*FROMJSON_TABLE(?,'$[*]'COLUMNS(idINTPATH'$.id',a1VARCHAR(…)PATH'$.a1'))r
[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]
id a142 foo43 bar
SQL/JSON Path ‣ Query language to
select elements from a JSON document ‣Defined in the
SQL standardBind
Parameter
JSON_TABLE Since SQL:2016
SELECT*FROMJSON_TABLE(?,'$[*]'COLUMNS(idINTPATH'$.id',a1VARCHAR(…)PATH'$.a1'))r
[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]
id a142 foo43 bar
SQL/JSON Path ‣ Query language to
select elements from a JSON document ‣Defined in the
SQL standardBind
Parameter
JSON_TABLE Since SQL:2016
SELECT*FROMJSON_TABLE(?,'$[*]'COLUMNS(idINTPATH'$.id',a1VARCHAR(…)PATH'$.a1'))r
[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]
id a142 foo43 bar
SQL/JSON Path ‣ Query language to
select elements from a JSON document ‣Defined in the
SQL standardBind
Parameter
JSON_TABLE Since SQL:2016
SELECT*FROMJSON_TABLE(?,'$[*]'COLUMNS(idINTPATH'$.id',a1VARCHAR(…)PATH'$.a1'))r
JSON_TABLE Since SQL:2016
SELECT*FROMJSON_TABLE(?,'$[*]'COLUMNS(idINTPATH'$.id',a1VARCHAR(…)PATH'$.a1'))r
INSERTINTOtarget_table
JSON_TABLE Since SQL:2016
JSON_TABLE Availability1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
MariaDB8.0 MySQL
PostgreSQLSQLite
DB2 LUW12cR1 Oracle
SQL Server
MATCH_RECOGNIZE (Row Pattern Matching)
Example: Logfile
Row Pattern Matching The Problem
Time
30 minutes
Example: Logfile
Row Pattern Matching The Problem
Time
30 minutes
Example: Logfile
Row Pattern Matching The Problem
Time
30 minutes
Example: Logfile
Row Pattern Matching The Problem
Example: Logfile
Time
30 minutes
Session 1 Session 2
Session 3
Session 4
Row Pattern Matching The Problem
Time
30 minutes
Row Pattern Matching The Problem
SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,SUM(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE'1900-01-01')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ELSE0ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped
Time
30 minutes
Row Pattern Matching The Problem
SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,SUM(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE'1900-01-01')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ELSE0ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped
Time
30 minutes
Start-of-group tags
Row Pattern Matching The Problem
SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,SUM(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE'1900-01-01')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ELSE0ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped
Time
30 minutes
Start-of-group tags
Row Pattern Matching The Problem
SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,SUM(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE'1900-01-01')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ELSE0ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped
Time
30 minutes
number sessions
Row Pattern Matching The Problem
SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,SUM(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE'1900-01-01')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ELSE0ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped
Time
30 minutes
number sessions
2222 2 33 3 44 42 3 41
Row Pattern Matching The Problem
SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,SUM(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE'1900-01-01')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ELSE0ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped
Time
30 minutes 2222 2 33 3 44 42 3 41
Row Pattern Matching The Problem
Regular ExpressionsRow Pattern Matching
.\S*
Regular ExpressionsRow Pattern Matching
.\S*Regular Expression
any character non-white space
Regular ExpressionsRow Pattern Matching
.\S*{
Rail track diagram by regexper.com
any character non-white space
Regular ExpressionsRow Pattern Matching
.\S*{ {Rail track diagram by regexper.com
any character non-white spaceany character non-white space
Regular ExpressionsRow Pattern Matching
.\S*{ { {Rail track diagram by regexper.com
MATCH_RECOGNIZE applies regular expressions on rows.
Row Pattern Matching
MATCH_RECOGNIZE applies regular expressions on rows.
Row Pattern Matching
‣Define the pattern variables (“characters”):
DEFINEdeletedASstatus=99
You can also look at other rows for that:
DEFINEcontinuationASts<PREV(ts)+INTERVAL'30'MINUTE
MATCH_RECOGNIZE applies regular expressions on rows.
Row Pattern Matching
‣Define the pattern variables (“characters”):
DEFINEdeletedASstatus=99
You can also look at other rows for that:
DEFINEcontinuationASts<PREV(ts)+INTERVAL'30'MINUTE
MATCH_RECOGNIZE applies regular expressions on rows.
Row Pattern Matching
‣Define the pattern variables (“characters”):
DEFINEdeletedASstatus=99
You can also look at other rows for that:
DEFINEcontinuationASts<PREV(ts)+INTERVAL'30'MINUTE
‣Use this alphabet in an regular expression:
PATTERN(deletedcontinuation*)
Since SQL:2016Row Pattern Matching
Time
30 minutes
SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(anycont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t
Since SQL:2016Row Pattern Matching
Time
30 minutes
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(anycont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t
Since SQL:2016Row Pattern Matching
Time
30 minutes
definecontinued
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(anycont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t
Since SQL:2016Row Pattern Matching
Time
30 minutes
Oracle doesn’t support avg on intervals — query doesn’t work as shown
undefinedpattern variable: matches any row
SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(anycont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t
Since SQL:2016Row Pattern Matching
Time
30 minutes
any numberof “cont”
rows
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(anycont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t
Since SQL:2016Row Pattern Matching
Time
30 minutes
Very muchlike GROUP BY
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(anycont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t
Since SQL:2016Row Pattern Matching
Time
30 minutes
Very muchlike SELECT
Oracle doesn’t support avg on intervals — query doesn’t work as shown
SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(anycont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t
Since SQL:2016Row Pattern Matching
Time
30 minutes
Oracle doesn’t support avg on intervals — query doesn’t work as shown
Row Pattern Matching Availability1999
2001
2003
2005
2007
2009
2011
2013
2015
2017
MariaDBMySQLPostgreSQLSQLite
DB2 LUW12cR1 Oracle
SQL Server
https://www.flickr.com/photos/mfoubister/25367243054/
https://www.flickr.com/photos/mfoubister/25367243054/
A lot has happened
since SQL-92
https://www.flickr.com/photos/mfoubister/25367243054/
SQL has evolved
beyond the relational idea
A lot has happened
since SQL-92
https://www.flickr.com/photos/mfoubister/25367243054/
SQL has evolved
beyond the relational idea
If you use SQL for
CRUD operations only, you are doing it wrong
A lot has happened
since SQL-92
https://www.flickr.com/photos/mfoubister/25367243054/
SQL has evolved
beyond the relational idea
If you use SQL for
CRUD operations only, you are doing it wrong
A lot has happened
since SQL-92
https://modern-sql.com@ModernSQL by @MarkusWinand