still using windows 3.1? - modern sql · select * from t1 cross join lateral (select * from t2...

Still using Windows 3.1? So why stick with

SQL-92?

@ModernSQL - https://modern-sql.com/ @MarkusWinand

SQL:1999

LATERAL

Select-list sub-queries must be scalar[0]:

LATERAL Before SQL:1999

SELECT…,(SELECTcolumn_1FROMt1WHEREt1.x=t2.y)AScFROMt2…

(an atomic quantity that can hold only one value at a time[1])

[0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar






✗,column_2

More thanone column? ⇒Syntax error






✗,column_2

More thanone column? ⇒Syntax error

}More thanone row?

⇒Runtime error!

SELECT*FROMt1CROSSJOINLATERAL(SELECT*FROMt2WHEREt2.x=t1.x)derived_tableON(true)

LATERAL Since SQL:1999Lateral derived queries can see table names defined before:


LATERAL Since SQL:1999

Valid due to

LATERAL

keyword

Lateral derived queries can see table names defined before:



Valid due to

LATERAL

keyword

Useless, but still required




Valid due to

LATERAL

keyword


Use CROSS JOIN to omit the ON clause

But WHY?

Use-CasesLATERAL

‣ Top-N per group

inside a lateral derived tableFETCHFIRST (or LIMIT, TOP)applies per row from left tables.

Use-CasesLATERAL

FROMtCROSSJOINLATERAL(SELECT…FROM…WHEREt.c=…ORDERBY…LIMIT10)derived_table

‣ Top-N per group


Use-CasesLATERAL


‣ Top-N per group


Use-CasesLATERAL

Add proper indexfor Top-N query

https://use-the-index-luke.com/sql/partial-results/top-n-queries


‣ Top-N per group


‣ Also useful to find most recent news from several subscribed topics (“multi-source top-N”).

Use-CasesLATERAL


‣ Top-N per group


‣ Also useful to find most recent news from several subscribed topics (“multi-source top-N”).

‣ Table function arguments

(TABLE often implies LATERAL)

Use-CasesLATERAL

FROMtJOINTABLE(your_func(t.c))

LATERAL Availability1999

2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 MariaDB8.0.14 MySQL

9.3 PostgreSQLSQLite

9.1 DB2 LUW11gR1[0] 12cR1 Oracle

2005[1] SQL Server[0]Undocumented. Requires setting trace event 22829.[1]LATERAL is not supported as of SQL Server 2016 but [CROSS|OUTER]APPLY can be used for the same effect.

WITHRECURSIVE(Common Table Expressions)

WITHRECURSIVECREATETABLEt(idINTEGER,parentINTEGER)

WITHRECURSIVE Since SQL:1999

SELECTt.id,t.parentFROMtWHEREt.id=?


SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtWHEREt.parent=?


WITHRECURSIVEprev(id,parent)AS(

)

SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtWHEREt.parent=?

SELECT*FROMprev


WITHRECURSIVEprev(id,parent)AS(

)

SELECTt.id,t.parentFROMtWHEREt.id=?UNIONALLSELECTt.id,t.parentFROMtJOINprevONt.parent=prev.id

SELECT*FROMprev


WITHRECURSIVE Use Cases

“[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer performance superior to that of the dedicated graph databases.” event.cwi.nl/grades2013/07-welc.pdf

‣ Processing graphs

Shortest route from person A to B in LinkedIn/Facebook/…


http://event.cwi.nl/grades2013/07-welc.pdf


https://wiki.postgresql.org/wiki/Loose_indexscan

† n … # distinct values, N … # of table rows. Suitable index required



‣ Finding distinct values

in n*log(N)† time complexity




WITHRECURSIVEgen(n)AS(SELECT1UNIONALLSELECTn+1FROMgenWHEREn<?)SELECT*FROMgen



† n … # distinct values, N … # of table rows. Suitable index required



‣ Finding distinct values

in n*log(N)† time complexity

‣ Row generators

To fill gaps (e.g., in time series), generate test data




AvailabilityWITHRECURSIVE1999

2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 10.2 MariaDB8.0 MySQL

8.4 PostgreSQL3.8.3[0] SQLite

7.0 DB2 LUW11gR2 Oracle

2005 SQL Server[0]Only for top-level SELECT statements

GROUPINGSETS

Only one GROUPBY operation at a time:

GROUPINGSETS Before SQL:1999



SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month

Monthly revenue



SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month

Monthly revenue Yearly revenue

SELECTyear,sum(revenue)FROMtblGROUPBYyear

GROUPINGSETS Before SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month


GROUPINGSETS Before SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month


UNIONALL

,null

GROUPINGSETS Since SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month


UNIONALL

,null

SELECTyear,month,sum(revenue)FROMtblGROUPBYGROUPINGSETS((year,month),(year))

GROUPINGSETS Availability1999

2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1[0] MariaDB5.0[1] MySQL


5 DB2 LUW9iR1 Oracle

2008 SQL Server[0]Only ROLLUP (properitery syntax).[1]Only ROLLUP (properitery syntax). GROUPING function since MySQL 8.0.

SQL:2003

OVERand

PARTITIONBY

OVER (PARTITION BY) The ProblemTwo distinct concepts could not be used independently:


‣Merge rows with the same key properties

‣ GROUPBY to specify key properties

‣ DISTINCT to use full row as key


‣Merge rows with the same key properties

‣ GROUPBY to specify key properties

‣ DISTINCT to use full row as key

‣ Aggregate data from related rows ‣ Requires GROUPBY to segregate the rows

‣ COUNT, SUM, AVG, MIN, MAX to aggregate grouped rows

OVER (PARTITION BY) The Problem


SELECTc1,c2FROMt

SELECTc1,c2FROMt


Yes ⇠

Mer

ge ro

ws ⇢

No SELECTc1

,c2FROMt

SELECTc1,c2FROMt


Yes ⇠

Mer

ge ro

ws ⇢

No SELECTc1

,c2FROMt

SELECTDISTINCTc1,c2FROMt

SELECTc1,c2FROMt


Yes ⇠

Mer

ge ro

ws ⇢

No

No ⇠ Aggregate ⇢ Yes

SELECTc1,c2FROMt


SELECTc1,c2FROMt

SELECTc1,SUM(c2)totFROMtGROUPBYc1


Yes ⇠

Mer

ge ro

ws ⇢

No


SELECTc1,c2FROMt


SELECTc1,c2FROMt




Yes ⇠

Mer

ge ro

ws ⇢

No


SELECTc1,c2FROMt


SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)




Yes ⇠

Mer

ge ro

ws ⇢

No


SELECTc1,c2FROMt


SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)


,tot


OVER (PARTITION BY) Since SQL:2003

Yes ⇠

Mer

ge ro

ws ⇢

No


SELECTc1,c2FROMt


SELECTc1,c2FROMtFROMt


OVER (PARTITION BY) Since SQL:2003

Yes ⇠

Mer

ge ro

ws ⇢

No


SELECTc1,c2FROMt


SELECTc1,c2FROMt

FROMt

,SUM(c2)OVER(PARTITIONBYc1)

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000

OVER (PARTITION BY) How it works


dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000


Look here


dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000


dep salary1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000



)


dep salary1 1000 100022 1000 200022 1000 2000333 1000 3000333 1000 3000333 1000 3000


)


dep salary1 1000 100022 1000 200022 1000 2000333 1000 3000333 1000 3000333 1000 3000


)PARTITIONBYdep

OVERand

ORDERBY(Framing & Ranking)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

OVER (ORDER BY) The Problem

SELECTid,value,FROMtransactionst


1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20


SELECTid,value,

(SELECTSUM(value)FROMtransactionst2WHEREt2.id<=t.id)

FROMtransactionst


1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20


SELECTid,value,


FROMtransactionst

Range segregation (<=)not possible with

GROUP BY orPARTITION BY

OVER (ORDER BY) Since SQL:2003

SELECTid,value,


FROMtransactionst


1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20


SELECTid,value,

FROMtransactionst


1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20


SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)


1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20


SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)


1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYid


SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)


1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDING


SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)


1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW


SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)


1 1 +10

22 2 +20

22 3 -10

333 4 +50

333 5 -30

333 6 -20



SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)


1 1 +10 +10

22 2 +20 +20

22 3 -10 +10

333 4 +50 +50

333 5 -30 +20

333 6 -20 .0


PARTITIONBYacnt

OVER (ORDER BY) Since SQL:2003With OVER(ORDERBYn) a new type of functions make sense:

n ROW_NUMBER RANK DENSE_RANK PERCENT_RANK CUME_DISTA 1 1 1 0 0.25B 2 2 2 0.33… 0.75B 3 2 2 0.33… 0.75X 4 4 3 1 1

‣ Aggregates without GROUPBY

Use CasesOVER (SQL:2003)


‣ Running totals, moving averages

Use Cases

AVG(…)OVER(ORDERBY…ROWSBETWEEN3PRECEDINGAND3FOLLOWING)moving_avg

OVER (SQL:2003)



‣ Ranking

Use Cases


OVER (SQL:2003)



‣ Ranking‣ Top-N per Group

Use Cases

SELECT*FROM(SELECTROW_NUMBER()OVER(PARTITIONBY…ORDERBY…)rn,t.*FROMt)numbered_tWHERErn<=3


OVER (SQL:2003)




‣ Avoiding self-joins

Use Cases



OVER (SQL:2003)




‣ Avoiding self-joins

[… many more …]

Use Cases



OVER (SQL:2003)

1999

2001

2003

2005

2007

2009

2011

2013

2015

2017


8.4 PostgreSQL3.25.0 SQLite

7.0 DB2 LUW8i Oracle

2005[0] 2012 SQL Server[0]No framing

OVER (SQL:2003) Availability

1999

2001

2003

2005

2007

2009

2011

2013

2015

2017


8.4 PostgreSQL3.25.0 SQLite

7.0 DB2 LUW8i Oracle

2005[0] 2012 SQL Server[0]No framing

OVER (SQL:2003) AvailabilityImpalaSpark

NuoDB

BigQuery Hive

Inverse Distribution Functions (percentiles)

Inverse Distribution Functions The ProblemGrouped rows cannot be ordered prior to aggregation.

(how to get the middle value (median) of a set)

SELECTd1.valFROMdatad1JOINdatad2ON(d1.val<d2.valOR(d1.val=d2.valANDd1.id<d2.id))GROUPBYd1.valHAVINGcount(*)=(SELECTFLOOR(COUNT(*)/2)FROMdatad3)






Number rows




Number rows

Pick middle one

Since SQL:2003Inverse Distribution Functions

SELECTPERCENTILE_DISC(0.5)WITHINGROUP(ORDERBYval)FROMdata



Median



Median

Which value?




Two variants: ‣ for discrete values (categories) ‣ for continuous values (linear interpolation)

1

2

3

4




1

2

3

4

0 0.25 0.5 0.75 11

2

3

4




1

2

3

4

0 0.25 0.5 0.75 11

2

3

4

0 0.25 0.5 0.75 11

2

3

4

PERCENTILE_DISC




1

2

3

4

0 0.25 0.5 0.75 11

2

3

4

0 0.25 0.5 0.75 11

2

3

4

PERCENTILE_DISC0 0.25 0.5 0.75 1

1

2

3

4

PERCENTILE_DISC




1

2

3

4

0 0.25 0.5 0.75 11

2

3

4

0 0.25 0.5 0.75 11

2

3

4


1

2

3

4


1

2

3

4

PERCENTILE_DISC(0.5)




1

2

3

4

0 0.25 0.5 0.75 11

2

3

4

0 0.25 0.5 0.75 11

2

3

4


1

2

3

4


1

2

3

4

PERCENTILE_DISC(0.5)0 0.25 0.5 0.75 1

1

2

3

4

PERCENTILE_CONT





1

2

3

4

0 0.25 0.5 0.75 11

2

3

4

0 0.25 0.5 0.75 11

2

3

4


1

2

3

4


1

2

3

4

PERCENTILE_DISC(0.5)0 0.25 0.5 0.75 1

1

2

3

4

PERCENTILE_CONT

PERCENTILE_DISC(0.5)0 0.25 0.5 0.75 1

1

2

3

4

PERCENTILE_CONT(0.5)





Inverse Distribution Functions Availability1999

2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 10.3[0] MariaDBMySQL


11.1 DB2 LUW9iR1 Oracle

2012[0] SQL Server[0]Only as window function (requires OVER clause)

TABLESAMPLE

TABLESAMPLE Since SQL:2003

SELECT*FROMtblTABLESAMPLEsystem(10)REPEATABLE(0)

User provided seed

value

TABLESAMPLE Since SQL:2003

SELECT*FROMtblTABLESAMPLEsystem(10)REPEATABLE(0)

Or: bernoulli(10)

better sampling,but slow.

User provided seed

value

TABLESAMPLE Availability1999

2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 MariaDBMySQL

9.5[0] PostgreSQLSQLite

8.2[0] DB2 LUW8i[0] Oracle

2005[0] SQL Server[0]Not for derived tables

SQL:2008

Copying rows from another table is easy:

INSERTINTO<target>SELECT…FROM<source>WHERENOTEXISTS(SELECT*FROM<target>WHERE…)

MERGE The Problem

Copying rows from another table is easy:

INSERTINTO<target>SELECT…FROM<source>WHERENOTEXISTS(SELECT*FROM<target>WHERE…)

MERGE The Problem

Both, <target> and <source>

are in scope here.

Deleting rows that exist in another table is also possible:

DELETEFROM<target>WHEREEXISTS(SELECT*FROM<source>WHERE…)

MERGE The Problem

Deleting rows that exist in another table is also possible:

DELETEFROM<target>WHEREEXISTS(SELECT*FROM<source>WHERE…)

MERGE The Problem


are in scope here.

Updating rows from another table is awkward:

UPDATE<target>SET…WHERE…

MERGE The Problem


UPDATE<target>SET…WHERE…

MERGE The Problem

Requires a name(table or updatable view)

but not a subquery


UPDATE<target>SET…WHERE…SET(col1,col2)=(SELECTcol1,col2FROM<source>WHERE…)WHERE…

MERGE The Problem


are in scope here.

SQL:2008 introduced merge to improve this situation:

MERGEINTO<target>USING<source>ON<joincondition>WHENMATCHED[AND<cond>]THEN[UPDATE…|DELETE…]WHENNOTMATCHED[AND<cond>]THENINSERT…

‣It has always two tables in scope,the source can be a derived table (subquery). ‣It can do insert, update, or delete in one go.

MERGE Since SQL:2008

…

SQL:2008 introduced merge to improve this situation:

MERGEINTO<target>USING<source>ON<joincondition>WHENMATCHED[AND<cond>]THEN[UPDATE…|DELETE…]WHENNOTMATCHED[AND<cond>]THENINSERT…

‣It has always two tables in scope,the source can be a derived table (subquery). ‣It can do insert, update, or delete in one go.



2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 MariaDBMySQLPostgreSQLSQLite

9.1 DB2 LUW9iR1[0] Oracle

2008 SQL Server[0]No AND condition.

FETCHFIRST

FETCHFIRST The ProblemLimit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)

SELECT*FROM(SELECT*,ROW_NUMBER()OVER(ORDERBYx)rnFROMdata)numbered_dataWHERErn<=10




SQL:2003 introduced ROW_NUMBER() to number rows.But this still requires wrapping to limit the result.

And how about databases not supporting ROW_NUMBER()?



SQL:2003 introduced ROW_NUMBER() to number rows.But this still requires wrapping to limit the result.

And how about databases not supporting ROW_NUMBER()?

Dammit! Let's takeLIMIT

SELECT*FROMdataORDERBYxFETCHFIRST10ROWSONLY

FETCHFIRST Since SQL:2008SQL:2008 introduced the FETCHFIRST…ROWSONLY clause:

FETCHFIRST Availability1999

2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 MariaDB3.19.3[0] MySQL6.5[1] 8.4 PostgreSQL

2.1.0[1] SQLite

7.0 DB2 LUW12cR1 Oracle

7.0[2] 2012 SQL Server[0]Earliest mention of LIMIT. Probably inherited from mSQL[1]Functionality available using LIMIT[2]SELECT TOP n ... SQL Server 2000 also supports expressions and bind parameters

SQL:2011

OFFSET

SELECT*FROM(SELECT*,ROW_NUMBER()OVER(ORDERBYx)rnFROMdata)numbered_dataWHERErn>10andrn<=20

OFFSET The ProblemHow to fetch the rows after a limit?

(pagination anybody?)

SELECT*FROMdataORDERBYxOFFSET10ROWSFETCHNEXT10ROWSONLY

OFFSET Since SQL:2011SQL:2011 introduced OFFSET, unfortunately!



OFFSET



OFFSETGrab coasters & stickers!

https://use-the-index-luke.com/no-offset

OFFSET Since SQL:20111999

2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 MariaDB3.20.3[0] 4.0.6[1] MySQL6.5 PostgreSQL

2.1.0 SQLite

9.7[2] 11.1 DB2 LUW12cR1 Oracle

2012 SQL Server[0]LIMIT[offset,]limit: "With this it's easy to do a poor man's next page/previous page WWW application."[1]The release notes say "Added PostgreSQL compatible LIMIT syntax"[2]Requires enabling the MySQL compatibility vector: db2set DB2_COMPATIBILITY_VECTOR=MYS

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is ugly.

(E.g., calculate the difference to the previous rows)

SELECTbalance,balance-COALESCE(MAX(balance)OVER(ORDERBY…ROWSBETWEEN1PRECEDINGAND1PRECEDING),0)diffFROMt

balance50907030

……………

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is ugly.

(E.g., calculate the difference to the previous rows)

SELECTbalance,balance-COALESCE(MAX(balance)OVER(ORDERBY…ROWSBETWEEN1PRECEDINGAND1PRECEDING),0)diffFROMt

balance50907030

……………

diff+50+40-20-40

OVER (SQL:2011) Since SQL:2011SQL:2011 introduced LEAD, LAG, NTH_VALUE, … for that:

SELECT*,balance-COALESCE(LAG(balance)OVER(ORDERBYx),0)FROMt


SELECT*,balance-COALESCE(LAG(balance)OVER(ORDERBYx),0)FROMt

Available functions:LEAD/LAGFIRST_VALUE/LAST_VALUENTH_VALUE(col,n)FROMFIRST/LASTRESPECT/IGNORENULLS


OVER (LEAD, LAG, …) Since SQL:20111999

2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 10.2[0] MariaDB8.0[0] MySQL

8.4[0] PostgreSQL3.25.0[0] SQLite

9.5[1] 11.1 DB2 LUW8i[1] 11gR2 Oracle

2012[1] SQL Server[0]No IGNORENULLS and FROMLAST[1]No NTH_VALUE

System Versioning (Time Traveling)

INSERTUPDATEDELETE

are DESTRUCTIVE

System Versioning The Problem

System Versioning Since SQL:2011Table can be system versioned, application versioned or both.

CREATETABLEt(...,


CREATETABLEt(...,start_tsTIMESTAMP(9)GENERATEDALWAYSASROWSTART,


CREATETABLEt(...,start_tsTIMESTAMP(9)GENERATEDALWAYSASROWSTART,end_tsTIMESTAMP(9)GENERATEDALWAYSASROWEND,



PERIODFORSYSTEM_TIME(start_ts,end_ts)



PERIODFORSYSTEM_TIME(start_ts,end_ts))WITHSYSTEMVERSIONING


ID Data start_ts end_ts1 X 10:00:00

UPDATE...SETDATA='Y'...

ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00

DELETE...WHEREID=1

INSERT...(ID,DATA)VALUES(1,'X')

System Versioning Since SQL:2011

ID Data start_ts end_ts1 X 10:00:00

UPDATE...SETDATA='Y'...

ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00

DELETE...WHEREID=1

ID Data start_ts end_ts1 X 10:00:00 11:00:001 Y 11:00:00 12:00:00

INSERT...(ID,DATA)VALUES(1,'X')


Although multiple versions exist, only the “current” one is visible per default.

After 12:00:00, SELECT*FROMt doesn’t return anything anymore.




With FOR…ASOF you can query anything you like: SELECT*FROMtFORSYSTEM_TIMEASOFTIMESTAMP'2019-05-0810:30:00'

ID Data start_ts end_ts

1 X 10:00:00 11:00:00



2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 10.3 MariaDBMySQLPostgreSQLSQLite

10.1 DB2 LUW10gR1[0] 11gR1[1] Oracle

2016 SQL Server[0]Short term using Flashback.[1]Flashback Archive. Proprietery syntax.

SQL:2016 (released: 2016-12-15)

JSON_TABLE

JSON Since SQL:2016

[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]

JSON Since SQL:2016

id a1

42 foo

43 bar

[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]

JSON Since SQL:2016

SELECT*FROMJSON_TABLE(?,'$[*]'COLUMNS(idINTPATH'$.id',a1VARCHAR(…)PATH'$.a1'))r

[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]

id a142 foo43 bar

JSON_TABLE Since SQL:2016


[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]

id a142 foo43 bar

Bind Parameter



[{"id":42,"a1":"foo"},{"id":43,"a1":"bar"}]

id a142 foo43 bar

SQL/JSON Path ‣ Query language to

select elements from a JSON document ‣Defined in the

SQL standardBind

Parameter



INSERTINTOtarget_table


JSON_TABLE Availability1999

2001

2003

2005

2007

2009

2011

2013

2015

2017

MariaDB8.0 MySQL

PostgreSQLSQLite

DB2 LUW12cR1 Oracle

SQL Server

MATCH_RECOGNIZE (Row Pattern Matching)

Example: Logfile

Row Pattern Matching The Problem

Time

30 minutes

Example: Logfile


Example: Logfile

Time

30 minutes

Session 1 Session 2

Session 3

Session 4


Time

30 minutes


SELECTcount(*)sessions,avg(duration)avg_durationFROM(SELECTMAX(ts)-MIN(ts)durationFROM(SELECTts,SUM(grp_start)OVER(ORDERBYts)session_noFROM(SELECTts,CASEWHENts>=LAG(ts,1,DATE'1900-01-01')OVER(ORDERBYts)+INTERVAL'30'minuteTHEN1ELSE0ENDgrp_startFROMlog)tagged)numberedGROUPBYsession_no)grouped

Time

30 minutes



Time

30 minutes

Start-of-group tags



Time

30 minutes

number sessions



Time

30 minutes

number sessions

2222 2 33 3 44 42 3 41



Time

30 minutes 2222 2 33 3 44 42 3 41


Regular ExpressionsRow Pattern Matching

.\S*


.\S*Regular Expression

any character non-white space


.\S*{

Rail track diagram by regexper.com

http://regexper.com

any character non-white space


.\S*{ {Rail track diagram by regexper.com

http://regexper.com

any character non-white spaceany character non-white space


.\S*{ { {Rail track diagram by regexper.com

http://regexper.com

MATCH_RECOGNIZE applies regular expressions on rows.

Row Pattern Matching



‣Define the pattern variables (“characters”):

DEFINEdeletedASstatus=99

You can also look at other rows for that:

DEFINEcontinuationASts<PREV(ts)+INTERVAL'30'MINUTE



‣Define the pattern variables (“characters”):

DEFINEdeletedASstatus=99

You can also look at other rows for that:

DEFINEcontinuationASts<PREV(ts)+INTERVAL'30'MINUTE

‣Use this alphabet in an regular expression:

PATTERN(deletedcontinuation*)

Since SQL:2016Row Pattern Matching

Time

30 minutes

SELECTCOUNT(*)sessions,AVG(duration)avg_durationFROMlogMATCH_RECOGNIZE(ORDERBYtsMEASURESLAST(ts)-FIRST(ts)ASdurationONEROWPERMATCHPATTERN(anycont*)DEFINEcontASts<PREV(ts)+INTERVAL'30'minute)t


Time

30 minutes

Oracle doesn’t support avg on intervals — query doesn’t work as shown



Time

30 minutes

definecontinued




Time

30 minutes


undefinedpattern variable: matches any row



Time

30 minutes

any numberof “cont”

rows




Time

30 minutes

Very muchlike GROUP BY




Time

30 minutes

Very muchlike SELECT




Time

30 minutes


Row Pattern Matching Availability1999

2001

2003

2005

2007

2009

2011

2013

2015

2017

MariaDBMySQLPostgreSQLSQLite

DB2 LUW12cR1 Oracle

SQL Server

https://www.flickr.com/photos/mfoubister/25367243054/


A lot has happened

since SQL-92


SQL has evolved

beyond the relational idea

A lot has happened

since SQL-92


SQL has evolved


If you use SQL for

CRUD operations only, you are doing it wrong

A lot has happened

since SQL-92


SQL has evolved


If you use SQL for

CRUD operations only, you are doing it wrong

A lot has happened

since SQL-92

https://modern-sql.com@ModernSQL by @MarkusWinand

still using windows 3.1? - modern sql · select * from t1 cross join lateral (select * from t2...

Documents