12c SQL Pattern Matching: When Will I Use It?
Andrej Pashchenko, Senior Consultant, Trivadis GmbH
19.11.2015
Our company.
Trivadis is a leader in IT consulting, system integration, solution engineering and the delivery of IT services, focusing on [...] and [...] technologies in Switzerland, Germany, Austria and Denmark. Trivadis delivers its services through the following strategic business areas:
OPERATION: Trivadis Services takes over the corresponding operation of your IT systems.
With more than 600 IT and subject-matter experts on site at your location.
14 Trivadis branch offices with more than 600 employees.
More than 200 service level agreements.
More than 4,000 training participants.
Research and development budget: CHF 5.0 million.
Financially independent and sustainably profitable.
Experience from more than 1,900 projects per year for over 800 customers.
About me
Senior Consultant at Trivadis GmbH, Düsseldorf
Focus on Oracle
– Application Development
– Application Performance
– Data Warehousing
22 years of IT experience, 16 of them with Oracle Database
Course instructor for „Oracle 12c New Features für Entwickler“ and „Beyond SQL and PL/SQL“
Agenda
1. Introduction
2. Find consecutive ranges and gaps
3. Trouble Ticket roundtrip
4. Grouping on fuzzy criteria
5. Merge temporal intervals
Introduction
Analytic functions
Analytic functions enhancements
SQL Model Clause
LISTAGG
NTH_VALUE
PIVOT/UNPIVOT clause
Pattern Matching
Top-N
Oracle Database 12c supports SQL pattern matching with the new MATCH_RECOGNIZE clause
– pattern matching in a sequence of rows
– nothing to do with string patterns (REGEXP_... functions)
– it's a clause, not a function
– it follows the table name in the FROM clause
– patterns are expressed with regular expression syntax over pattern variables
– pattern variables are defined as SQL expressions
MATCH_RECOGNIZE (
  [ PARTITION BY <cols> ]
  [ ORDER BY <cols> ]
  [ MEASURES <cols> ]
  [ ONE ROW PER MATCH | ALL ROWS PER MATCH ]
  [ AFTER MATCH SKIP <option> ]
  PATTERN ( <row pattern> )
  [ SUBSET <subset list> ]
  DEFINE <definition list>
)
Example: find mappings in the ETL logging table that ran faster from day to day over a period of four days. Output: start and end dates of the period, elapsed time at the beginning and at the end of the period, and average elapsed time.
SELECT etl_date, mapping_name, elapsed
FROM dwh_etl_runs;
...
04-NOV-14  MAP_STG_S_ORDER_ITEM  +000000 00:14:54.42738
05-NOV-14  MAP_STG_S_ORDER       +000000 00:10:13.44989
05-NOV-14  MAP_STG_S_ORDER_ITEM  +000000 00:15:06.24587
05-NOV-14  MAP_STG_S_ASSET       +000000 00:14:15.22855
06-NOV-14  MAP_STG_S_ASSET       +000000 00:14:00.49513
06-NOV-14  MAP_STG_S_ORDER       +000000 00:11:05.07337
06-NOV-14  MAP_STG_S_ORDER_ITEM  +000000 00:10:12.67410
07-NOV-14  MAP_STG_S_ORDER_ITEM  +000000 00:19:29.64314
07-NOV-14  MAP_STG_S_ORDER       +000000 00:14:59.80953
07-NOV-14  MAP_STG_S_ASSET       +000000 00:13:33.80789
08-NOV-14  MAP_STG_S_ASSET       +000000 00:10:14.65652
08-NOV-14  MAP_STG_S_ORDER       +000000 00:13:30.77744
08-NOV-14  MAP_STG_S_ORDER_ITEM  +000000 00:17:15.11789
...
SELECT *
FROM dwh_etl_runs MATCH_RECOGNIZE (
  PARTITION BY mapping_name
  ORDER BY etl_date
  MEASURES FIRST (etl_date) AS start_date,
           LAST (etl_date)  AS end_date,
           FIRST (elapsed)  AS first_elapsed,
           LAST (elapsed)   AS last_elapsed,
           AVG(elapsed)     AS avg_elapsed
  PATTERN (STRT DOWN{3})
  DEFINE DOWN AS elapsed < PREV(elapsed)
)
– PARTITION BY, ORDER BY: partition and order, as for analytic functions
– MEASURES: define measures, which are accessible in the main query
– PATTERN: define the search pattern with a regular expression over boolean pattern variables
– DEFINE: define the pattern variables
Navigation operators:
– PREV, NEXT: physical offset
– FIRST, LAST: logical offset
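To make the meaning of PATTERN (STRT DOWN{3}) more tangible, here is a small procedural sketch in Python for a single partition. It is only an illustration of the matching rule, not how Oracle evaluates the clause; the function name `find_speedups` and the sample numbers (elapsed time in seconds) are made up for this example.

```python
from datetime import date, timedelta

def find_speedups(runs, steps=3):
    """Emulate PATTERN (STRT DOWN{3}) for one partition.

    runs: list of (etl_date, elapsed_seconds), already ordered by date.
    Returns one dict per match, similar to ONE ROW PER MATCH.
    """
    matches = []
    i = 0
    while i + steps < len(runs):
        window = runs[i:i + steps + 1]        # STRT plus `steps` DOWN rows
        # DOWN AS elapsed < PREV(elapsed): every row is faster than its predecessor
        falling = all(cur[1] < prev[1] for prev, cur in zip(window, window[1:]))
        if falling:
            elapsed = [row[1] for row in window]
            matches.append({
                "start_date": window[0][0],
                "end_date": window[-1][0],
                "first_elapsed": elapsed[0],
                "last_elapsed": elapsed[-1],
                "avg_elapsed": sum(elapsed) / len(elapsed),
            })
            i += steps + 1                    # default: resume after the last matched row
        else:
            i += 1                            # no match: try the next row as STRT
    return matches

# Hypothetical data for one mapping: six daily runs, elapsed time in seconds
first_day = date(2014, 11, 1)
sample = [(first_day + timedelta(days=n), secs)
          for n, secs in enumerate([900, 850, 800, 700, 750, 600])]
speedups = find_speedups(sample)
print(speedups)
```

The first four runs get faster every day, so they form the only match; the remaining two rows are too few to match the pattern again.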
PATTERN: subset of Perl syntax for regular expressions
– * : 0 or more iterations
– + : 1 or more iterations
– ? : 0 or 1 iterations
– {n} : n iterations (n > 0)
– {n,} : n or more iterations (n >= 0)
– {n,m} : between n and m (inclusive) iterations (0 <= n <= m, 0 < m)
– {,m} : between 0 and m (inclusive) iterations (m > 0)
– ( ) : grouping
– | : alternation
– {- … -} : exclusion
– ^ : before the first row in the partition
– $ : after the last row in the partition
– ? after a quantifier : “reluctant” instead of “greedy”
– …
Patterns are everywhere
Financial, Telcos, Retail, Traffic, Automotive, Transport / Logistics
Fraud Detection, Quality of Service, Trouble Ticketing, Price Trends, Buying Patterns, Stock Market, Money Laundering, Sensor Data, Network Activity, Advertising Campaigns, Sessionization, Frequent Flyer Programs, Process Chain, CRM
SQL had no efficient way to handle such questions
pre 12c solutions
self-joins, subqueries (NOT) IN, (NOT) EXISTS
switch to PL/SQL - „Do it yourself“, often multiple SQL queries
transfer some logic to pipelined functions and integrate them in the main query
analytic (window) functions
– ORA-30483: window functions are not allowed here
– not possible to use in WHERE clause
– not possible to nest them
– unable to access the output of analytic functions in other rows
– often leads to nesting queries, self-joins, etc.
Find consecutive ranges and gaps
SLA, QoS: find the longest period without outage
Table T_GAPS
Find consecutive ranges in the values of column ID
Output: start and end ID of each consecutive range
ID: 1, 2, 3, 5, 6, 10, 11, 12, 14, 20, 21, …
mr_consecutive.sql
Start of Range   End of Range
1                3
5                6
10               12
Pre-12c solution using analytic functions (mr_consecutive.sql)
WITH groups_marked AS (
  SELECT id,
         CASE
           WHEN id != LAG(id, 1, id) OVER (ORDER BY id) + 1 THEN 1
           ELSE 0
         END new_grp
  FROM t_gaps
),
sum_grp AS (
  SELECT id, SUM(new_grp) OVER (ORDER BY id) grp_sum
  FROM groups_marked
)
SELECT MIN(id) start_of_range, MAX(id) end_of_range
FROM sum_grp
GROUP BY grp_sum
ORDER BY grp_sum;
„Tabibitosan“ method* (mr_consecutive.sql)
* https://community.oracle.com/message/3991177#3991177
SELECT MIN(id) start_of_range,
       MAX(id) end_of_range
FROM (SELECT id,
             id - ROW_NUMBER() OVER (ORDER BY id) distance
      FROM t_gaps)
GROUP BY distance
ORDER BY distance;
12c solution with MATCH_RECOGNIZE (mr_consecutive.sql)
SELECT *
FROM t_gaps MATCH_RECOGNIZE (
  ORDER BY id
  MEASURES FIRST(id) start_of_range,
           LAST(id)  end_of_range,
           COUNT(*)  cnt
  ONE ROW PER MATCH
  PATTERN (strt cont*)
  DEFINE cont AS id = PREV(id) + 1
);
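What PATTERN (strt cont*) does with the rows can be sketched procedurally. The following Python snippet is an illustration only (the function name `consecutive_ranges` is invented here): every ID either extends the current range, like `cont`, or opens a new one, like `strt`.

```python
def consecutive_ranges(ids):
    """Emulate PATTERN (strt cont*) with cont AS id = PREV(id) + 1.

    Returns (start_of_range, end_of_range, cnt) tuples, one per match.
    """
    ranges = []
    for value in sorted(ids):                     # ORDER BY id
        if ranges and value == ranges[-1][1] + 1:
            start, _, cnt = ranges[-1]            # cont: extend the current range
            ranges[-1] = (start, value, cnt + 1)
        else:
            ranges.append((value, value, 1))      # strt: open a new range
    return ranges

t_gaps = [1, 2, 3, 5, 6, 10, 11, 12, 14, 20, 21]  # the IDs shown on the slide
ranges = consecutive_ranges(t_gaps)
print(ranges)
```

For the listed IDs this yields the ranges 1-3, 5-6 and 10-12 from the slide, followed by 14-14 and 20-21.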
Table T_GAPS, numeric column ID with gaps
Find the gaps in the values of column ID
Output: start and end ID of each gap
ID: 1, 2, 3, 5, 6, 10, 11, 12, 14, 20, 21, …
mr_gaps.sql
Start of Gap   End of Gap
4              4
7              9
13             13
15             19
Solution with analytic functions (mr_gaps.sql)
SELECT start_of_gap, end_of_gap
FROM (SELECT id + 1 start_of_gap,
             LEAD(id) OVER (ORDER BY id) - 1 end_of_gap,
             CASE
               WHEN id + 1 != LEAD(id) OVER (ORDER BY id) THEN 1
               ELSE 0
             END is_gap
      FROM t_gaps)
WHERE is_gap = 1;
„Tabibitosan“ method*
* https://community.oracle.com/message/3991177#3991177
SELECT MAX(id) + 1 start_of_gap,
       LEAD(MIN(id)) OVER (ORDER BY distance) - 1 end_of_gap
FROM (SELECT id,
             id - ROW_NUMBER() OVER (ORDER BY id) distance
      FROM t_gaps)
GROUP BY distance;
12c solution with MATCH_RECOGNIZE (mr_gaps.sql)
SELECT *
FROM t_gaps MATCH_RECOGNIZE (
  ORDER BY id
  MEASURES PREV(gap.id) + 1 start_of_gap,
           gap.id - 1       end_of_gap
  ONE ROW PER MATCH
  -- Resume at the row after strt, so that a row closing one gap can open the next
  -- (12, 14, 20 yields both 13-13 and 15-19; with "gap+" the gap 13-13 would be lost)
  AFTER MATCH SKIP TO NEXT ROW
  PATTERN (strt gap)
  DEFINE gap AS id != PREV(id) + 1
);
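The gap search itself boils down to comparing every ID with its predecessor. A minimal Python sketch of that idea, assuming unique IDs (the function name `find_gaps` is made up for this illustration):

```python
def find_gaps(ids):
    """Return (start_of_gap, end_of_gap) for every hole between neighbouring IDs.

    Assumes the IDs are unique; they are sorted first, like ORDER BY id.
    """
    ordered = sorted(ids)
    return [(prev + 1, cur - 1)                   # the missing values between prev and cur
            for prev, cur in zip(ordered, ordered[1:])
            if cur != prev + 1]                   # same test as: id != PREV(id) + 1

t_gaps = [1, 2, 3, 5, 6, 10, 11, 12, 14, 20, 21]  # the IDs shown on the slide
gaps = find_gaps(t_gaps)
print(gaps)
```

For the listed IDs the result is the four gaps from the task slide: 4-4, 7-9, 13-13 and 15-19.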
Trouble Ticket roundtrip
ID  Assignee  Date
1   SCOTT     01.02.2015
1   SCOTT     02.02.2015
1   ADAMS     03.02.2015
1   SCOTT     04.02.2015
2   ADAMS     01.02.2015
2   ADAMS     02.02.2015
2   SCOTT     03.02.2015
3   KING      01.02.2015
3   ADAMS     02.02.2015
3   ADAMS     03.02.2015
3   KING      04.02.2015
3   ADAMS     05.02.2015
4   KING      01.02.2015
4   ADAMS     02.02.2015
4   SCOTT     03.02.2015
4   KING      05.02.2015
Find the tickets that went back to the same assignee
Pre-12c solution using self-joins (mr_trouble_ticket.sql)
SELECT DISTINCT t1.ticket_id,
       t1.assignee    AS first_assignee,
       t3.change_date AS last_change
FROM trouble_ticket t1, trouble_ticket t2, trouble_ticket t3
WHERE t1.ticket_id = t2.ticket_id
  AND t1.assignee != t2.assignee
  AND t2.change_date > t1.change_date
  AND t3.assignee = t1.assignee
  AND t3.ticket_id = t1.ticket_id
  AND t3.change_date > t2.change_date
ORDER BY ticket_id;
12c solution using the MATCH_RECOGNIZE clause (mr_trouble_ticket.sql)
New:
– Row pattern skip to: where to start over after a match is found?
– match overlapping patterns
SELECT *
FROM trouble_ticket MATCH_RECOGNIZE (
  PARTITION BY ticket_id
  ORDER BY change_date
  MEASURES strt.assignee          AS first_assignee,
           LAST(same.change_date) AS letzte_bearbeitung
  AFTER MATCH SKIP TO FIRST another
  PATTERN (strt another+ same+)
  DEFINE same    AS same.assignee = strt.assignee,
         another AS another.assignee != strt.assignee
);
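The effect of PATTERN (strt another+ same+) together with AFTER MATCH SKIP TO FIRST another can be followed step by step in a small Python sketch. It is an illustration under two assumptions: dates are written as ISO strings so that they sort correctly, and the function name `roundtrips` is invented. Because the first `another` row always directly follows `strt`, skipping to it is the same as trying every row as a possible start.

```python
from itertools import groupby

def roundtrips(rows):
    """rows: (ticket_id, assignee, change_date) tuples, dates as ISO strings.

    Returns (ticket_id, first_assignee, last_change) per match.
    """
    result = []
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))      # PARTITION BY / ORDER BY
    for ticket_id, grp in groupby(ordered, key=lambda r: r[0]):
        part = list(grp)
        for i, strt in enumerate(part):                      # every row may be strt
            j = i + 1
            while j < len(part) and part[j][1] != strt[1]:   # another+
                j += 1
            k = j
            while k < len(part) and part[k][1] == strt[1]:   # same+
                k += 1
            if j > i + 1 and k > j:                          # both parts matched at least once
                result.append((ticket_id, strt[1], part[k - 1][2]))
    return result

# The ticket history from the slide, dates rewritten as ISO strings
tickets = [
    (1, "SCOTT", "2015-02-01"), (1, "SCOTT", "2015-02-02"),
    (1, "ADAMS", "2015-02-03"), (1, "SCOTT", "2015-02-04"),
    (2, "ADAMS", "2015-02-01"), (2, "ADAMS", "2015-02-02"),
    (2, "SCOTT", "2015-02-03"),
    (3, "KING", "2015-02-01"), (3, "ADAMS", "2015-02-02"),
    (3, "ADAMS", "2015-02-03"), (3, "KING", "2015-02-04"),
    (3, "ADAMS", "2015-02-05"),
    (4, "KING", "2015-02-01"), (4, "ADAMS", "2015-02-02"),
    (4, "SCOTT", "2015-02-03"), (4, "KING", "2015-02-05"),
]
found = roundtrips(tickets)
print(found)
```

Ticket 3 appears twice because its two roundtrips (KING and ADAMS) overlap; this is the case the skip option is there for.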
Grouping on fuzzy criteria
„Sessionization“
– Group rows together where the gap between the timestamps is less than a defined limit
  ...
  PATTERN (STRT SESS+)
  DEFINE SESS AS SESS.ins_date - PREV(SESS.ins_date) <= 10/24/60
– Group rows together that fall within a defined interval relative to the first row; otherwise start the next group
  https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:13946369553642#3478381500346951056
  ...
  PATTERN (A+)
  DEFINE A AS ins_date < FIRST(ins_date) + 6/24
Group over running totals
– Split the data into groups of a defined capacity
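The difference between the two timestamp groupings is easy to miss: the first compares each row with its predecessor, the second with the first row of the group. The following Python sketch illustrates both rules; the function names and the sample timestamps are invented, and unlike PATTERN (STRT SESS+), which needs at least two rows per match, the sketch also keeps single-row groups.

```python
from datetime import datetime, timedelta

def sessions_by_gap(stamps, max_gap=timedelta(minutes=10)):
    """New group whenever the distance to the PREVIOUS row exceeds max_gap."""
    groups = []
    for ts in sorted(stamps):
        if groups and ts - groups[-1][-1] <= max_gap:   # compare with PREV(ins_date)
            groups[-1].append(ts)
        else:
            groups.append([ts])
    return groups

def groups_by_window(stamps, width=timedelta(hours=6)):
    """New group whenever a row is not within `width` of the FIRST row of its group."""
    groups = []
    for ts in sorted(stamps):
        if groups and ts < groups[-1][0] + width:       # compare with FIRST(ins_date)
            groups[-1].append(ts)
        else:
            groups.append([ts])
    return groups

# Hypothetical insert timestamps on one day
day = datetime(2015, 11, 19)
stamps = [day.replace(hour=h, minute=m)
          for h, m in [(8, 0), (8, 5), (8, 12), (9, 0), (14, 30)]]
by_gap = sessions_by_gap(stamps)
by_window = groups_by_window(stamps)
print([len(g) for g in by_gap], [len(g) for g in by_window])
```

With these timestamps the 10-minute rule produces three groups (08:00 to 08:12, 09:00, 14:30), while the 6-hour window starting at 08:00 collects the first four rows and starts a new group at 14:30.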
Example schema SH (Sales History)
Task: split the data into groups of fixed capacity
Fit all customers, ordered by age, into groups such that the total sales in every group do not exceed $200,000
12c solution with the MATCH_RECOGNIZE clause (mr_group_running_total.sql)
WITH q AS (
  SELECT c.cust_id, c.cust_year_of_birth, SUM(s.amount_sold) cust_amount_sold
  FROM customers c JOIN sales s ON s.cust_id = c.cust_id
  GROUP BY c.cust_id, c.cust_year_of_birth
)
SELECT *
FROM q MATCH_RECOGNIZE (
  ORDER BY cust_year_of_birth
  MEASURES MATCH_NUMBER() gruppe,
           SUM(cust_amount_sold) running_sum,
           FINAL SUM(cust_amount_sold) final_sum
  ALL ROWS PER MATCH
  PATTERN (gr*)
  DEFINE gr AS SUM(cust_amount_sold) <= 200000
);
– ALL ROWS PER MATCH: we need all rows of every match
– MATCH_NUMBER(): function returns the match number
– SUM(...) in DEFINE: aggregate function in the pattern variable's condition
– Aggregates in MEASURES: running vs. final
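The grouping rule behind PATTERN (gr*) with a running SUM in DEFINE is a greedy fill: keep adding rows while the group total stays within the capacity, then start the next group. A Python sketch of that rule, with invented amounts and the hypothetical function name `capacity_groups`; it assumes that no single amount exceeds the capacity (the sketch would put such a row into a group of its own).

```python
def capacity_groups(amounts, capacity=200_000):
    """Greedy grouping in input order: close a group before its total would exceed capacity."""
    groups, current, total = [], [], 0
    for amount in amounts:
        if current and total + amount > capacity:   # running SUM would break the limit
            groups.append(current)                  # close the group (next MATCH_NUMBER)
            current, total = [], 0
        current.append(amount)
        total += amount
    if current:
        groups.append(current)
    return groups

# Hypothetical sales totals per customer, already ordered by year of birth
sales = [80_000, 70_000, 60_000, 150_000, 40_000, 30_000]
groups = capacity_groups(sales)
for number, members in enumerate(groups, start=1):
    print(number, members, sum(members))            # group number, rows, final sum
```

The per-group sum printed at the end corresponds to the FINAL aggregate; the running aggregate would be the partial total after each individual row.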
Merge temporal intervals
Temporal version of the SCOTT schema: the data in EMP, DEPT and JOB have temporal validity (VALID_FROM - VALID_TO)
Task: query the data for one employee, joining four tables with respect to temporal validity:
WITH joined AS (
SELECT e.empno,
g.valid_from,
LEAST( e.valid_to, d.valid_to, j.valid_to,
NVL(m.valid_to, e.valid_to),
LEAD(g.valid_from - 1, 1, e.valid_to) OVER(
PARTITION BY e.empno ORDER BY g.valid_from )
) AS valid_to,
e.ename, j.job, e.mgr, m.ename AS mgr_ename, e.hiredate,
e.sal, e.comm, e.deptno, d.dname
FROM empv e
INNER JOIN (SELECT valid_from FROM empv
UNION
SELECT valid_from FROM deptv
UNION
SELECT valid_from FROM jobv
UNION
SELECT valid_to + 1 FROM empv
WHERE valid_to != DATE '9999-12-31'
UNION
SELECT valid_to + 1 FROM deptv
WHERE valid_to != DATE '9999-12-31'
UNION
SELECT valid_to + 1 FROM jobv
WHERE valid_to != DATE '9999-12-31') g
ON g.valid_from BETWEEN e.valid_from AND e.valid_to
INNER JOIN deptv d
ON d.deptno = e.deptno AND g.valid_from BETWEEN d.valid_from AND d.valid_to
INNER JOIN jobv j
ON j.jobno = e.jobno AND g.valid_from BETWEEN j.valid_from AND j.valid_to
LEFT JOIN empv m
ON m.empno = e.mgr AND g.valid_from BETWEEN m.valid_from AND m.valid_to )
...
Source: Philipp Salvisberg: http://www.salvis.com/blog/2012/12/28/joining-temporal-intervals-part-2/
...
SELECT empno, valid_from, valid_to, ename, job, mgr, mgr_ename, hiredate, sal, comm, deptno, dname
FROM joined MATCH_RECOGNIZE (
  PARTITION BY empno, ename, job, mgr, mgr_ename, hiredate, sal, comm, deptno, dname
  ORDER BY valid_from
  MEASURES FIRST(valid_from) valid_from,
           LAST(valid_to)    valid_to
  PATTERN ( strt nxt* )
  DEFINE nxt AS valid_from = PREV(valid_to) + 1
)
WHERE empno = 7788;
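The merge step, PATTERN (strt nxt*) with nxt AS valid_from = PREV(valid_to) + 1, joins intervals that follow each other without a gap. A Python sketch of that rule for the rows of one partition, meaning rows whose attributes are all identical; the function name `merge_adjacent` and the sample dates are made up for this illustration.

```python
from datetime import date, timedelta

def merge_adjacent(intervals):
    """intervals: (valid_from, valid_to) pairs with inclusive dates, one partition.

    Intervals are merged when one starts exactly one day after the previous one ends.
    """
    one_day = timedelta(days=1)
    merged = []
    for start, end in sorted(intervals):                   # ORDER BY valid_from
        if merged and start - one_day == merged[-1][1]:    # nxt: valid_from = PREV(valid_to) + 1
            merged[-1] = (merged[-1][0], end)              # extend: LAST(valid_to)
        else:
            merged.append((start, end))                    # strt: new interval
    return merged

# Hypothetical validity periods of identical rows; 9999-12-31 marks "open end"
periods = [
    (date(2015, 1, 1), date(2015, 1, 31)),
    (date(2015, 2, 1), date(2015, 3, 15)),
    (date(2015, 4, 1), date(9999, 12, 31)),
]
merged = merge_adjacent(periods)
print(merged)
```

The first two periods touch and collapse into one interval; the third starts after a gap and stays separate.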
Conclusion
Very powerful feature
Significantly simplifies a lot of queries (self-joins, semi-, anti-joins, nested queries), mostly with performance benefit
Since 2007 a proposal for ANSI-SQL
Requires thinking in patterns
Complicated syntax (at first sight)
But in many cases the code looks like the requirement in „plain English“
Further information ...
Database Data Warehousing Guide, SQL for Pattern Matching: http://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8956
Stewart Ashton's blog: https://stewashton.wordpress.com
Oracle white paper, Patterns everywhere - Find them Fast!: http://www.oracle.com/ocom/groups/public/@otn/documents/webcontent/1965433.pdf
Trivadis at DOAG 2015
Level 3, right next to the escalator
We look forward to your visit.
Because with Trivadis you always win.