FUN WITH ANALYTIC FUNCTIONS UTOUG TRAINING DAYS 2017


ABOUT ME

• Born and raised here in UT

• In IT for 10 years, DBA for the last 6

• Databases and data are my hobbies; I’m rather boring

• This isn’t why you’re here though

ANALYTIC FUNCTIONS… SAY WHAT?

• Analytic Functions compute a value based upon a subset of the rows in a query result

• The subset is referred to as “the partition” (unrelated to table partitioning)

• The best way to understand these functions is to compare them to standard aggregate functions (SUM, MIN, MAX, etc.)

AGGREGATE VS. ANALYTIC

[Figure: three result sets side by side: The Data, Aggregate AVG, Analytic Function AVG]
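The comparison can be sketched with two queries against the SCOTT sample schema used later in the deck. This is only an illustration; the alias names are chosen here and are not taken from the slide.

```sql
-- Aggregate AVG: one row per department; the employee detail is collapsed away.
SELECT deptno,
       AVG(sal) AS avg_sal
FROM scott.emp
GROUP BY deptno;

-- Analytic AVG: every employee row is kept, and each row carries
-- the average of its own department (its partition).
SELECT ename,
       deptno,
       sal,
       AVG(sal) OVER (PARTITION BY deptno) AS avg_sal_by_deptno
FROM scott.emp;
```

The first query returns one row per department, while the second returns one row per employee with the department average repeated on each.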

41 FLAVORS

• 41 different Analytic Functions

• Positional (FIRST, LAST, ROW_NUMBER, LEAD, LAG, RANK, etc.)

• Statistical (CORR, REGR_*, NTILE, STDDEV, etc.)

• Aggregate (SUM, AVG, MIN, MAX, etc.)

• Pattern Matching (Find patterns, like V shaped dips in stock ticker data)

• ListAgg

SAMPLES!

• Samples based on SCOTT schema

• View -> Snippets

THE SYNTAX

It’s not as complicated as it looks

QUICK EXAMPLES

[Figure: The Data beside the Analytic Function AVG result]

select ename,
       job,
       deptno,
       avg(sal) over (partition by deptno) avg_sal_by_deptno,
       sal,
       sal / (avg(sal) over (partition by deptno)) pct_of_average
from scott.emp
order by deptno desc;

FUNCTION(<field a>) OVER (PARTITION BY <field b>)

MIX ‘N MATCH

select ename,
       job,
       deptno,
       avg(sal) over (partition by deptno) avg_sal_by_deptno,
       sal,
       sal / (avg(sal) over (partition by deptno)) pct_of_average
from scott.emp
order by deptno desc;

select ename,
       job,
       deptno,
       min(sal) over (partition by deptno) min_sal_by_deptno,
       sal,
       sal / (min(sal) over (partition by deptno)) pct_of_min
from scott.emp
order by deptno desc;

REAL LIFE

C-LEVEL ASKS EASY QUESTION

“Can you tell me the order that accounts were opened in?” “Can you give me an ordinal number (1st, 2nd, 3rd)?”

row_number() over (partition by acct order by acct_open_date)

WHAT ABOUT WHEN TWO SUB ACCOUNTS ARE OPENED ON THE SAME DAY, CAN YOU MAKE THOSE BE THE SAME?

dense_rank() over (partition by acct order by acct_open_date)

rank() over (partition by acct order by acct_open_date)

row_number() over (partition by acct order by acct_open_date)   <- original query
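To see how the three ranking functions treat sub accounts opened on the same day, here is a small self-contained sketch. The inline `sub_accounts` rows and their values are invented for illustration; only the column names `acct`, `acct_description` and `acct_open_date` come from the slides.

```sql
WITH sub_accounts AS (
  -- Hypothetical sample data: CHECKING and VISA share an open date.
  SELECT 1001 AS acct, 'SAVINGS' AS acct_description, DATE '2016-01-05' AS acct_open_date FROM dual UNION ALL
  SELECT 1001, 'CHECKING',  DATE '2016-03-10' FROM dual UNION ALL
  SELECT 1001, 'VISA',      DATE '2016-03-10' FROM dual UNION ALL
  SELECT 1001, 'AUTO LOAN', DATE '2016-07-01' FROM dual
)
SELECT acct_description,
       acct_open_date,
       ROW_NUMBER() OVER (PARTITION BY acct ORDER BY acct_open_date) AS row_num,
       RANK()       OVER (PARTITION BY acct ORDER BY acct_open_date) AS rnk,
       DENSE_RANK() OVER (PARTITION BY acct ORDER BY acct_open_date) AS dense_rnk
FROM sub_accounts
ORDER BY acct_open_date, acct_description;
```

With this data, ROW_NUMBER still hands the two tied rows 2 and 3 (in an arbitrary order), RANK gives both 2 and then skips to 4 for AUTO LOAN, and DENSE_RANK gives both 2 and continues with 3.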

CAN YOU TELL ME HOW LONG IT TAKES BETWEEN ONE ACCOUNT AND ANOTHER?

lag(acct_open_date) over (partition by acct order by acct_open_date)

acct_open_date - lag(acct_open_date) over (partition by acct order by acct_open_date)

LAG looks back to the previous row in the partition

LEAD looks ahead to the next row in the partition
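A minimal sketch of the days-between calculation, assuming `acct_open_date` is an Oracle DATE (so subtracting two dates yields a number of days). The inline rows are made up for illustration.

```sql
WITH sub_accounts AS (
  -- Hypothetical sample data for a single account.
  SELECT 1001 AS acct, DATE '2016-01-05' AS acct_open_date FROM dual UNION ALL
  SELECT 1001, DATE '2016-01-15' FROM dual UNION ALL
  SELECT 1001, DATE '2016-02-01' FROM dual
)
SELECT acct,
       acct_open_date,
       -- Previous open date within the same account (NULL on the first row).
       LAG(acct_open_date) OVER (PARTITION BY acct ORDER BY acct_open_date) AS prev_open_date,
       -- Days elapsed since the previous sub account was opened.
       acct_open_date
         - LAG(acct_open_date) OVER (PARTITION BY acct ORDER BY acct_open_date) AS days_since_prev,
       -- Next open date within the same account (NULL on the last row).
       LEAD(acct_open_date) OVER (PARTITION BY acct ORDER BY acct_open_date) AS next_open_date
FROM sub_accounts
ORDER BY acct_open_date;
```

Here `days_since_prev` comes out as NULL, 10 and 17 for the three rows.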

WHAT SHE REALLY WANTED…

• I just need the sequence patterns, in general

This uses LISTAGG

LISTAGG

• LISTAGG(<string to concatenate>, '<concatenator>') WITHIN GROUP (ORDER BY <field>)

• LISTAGG(job, ' -> ') within group (order by hiredate)
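As a sketch of how the LISTAGG call above might sit in a full query, here are two forms against the SCOTT schema: one as a plain aggregate and one as an analytic function. The alias `job_sequence` is invented here.

```sql
-- Aggregate form: one row per department, jobs strung together in hire order.
SELECT deptno,
       LISTAGG(job, ' -> ') WITHIN GROUP (ORDER BY hiredate) AS job_sequence
FROM scott.emp
GROUP BY deptno
ORDER BY deptno;

-- Analytic form: every employee row is kept and carries its department's list.
SELECT ename,
       deptno,
       LISTAGG(job, ' -> ') WITHIN GROUP (ORDER BY hiredate)
         OVER (PARTITION BY deptno) AS job_sequence
FROM scott.emp
ORDER BY deptno, hiredate;
```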

NOT GOOD ENOUGH…

• “Can you order those by how common each pattern is?”

• Sure…?

SELECT DISTINCT
       listagg(acct_description, ' -> ') WITHIN GROUP (order by ACCT_OPEN_DATE),
       count(DISTINCT listagg(acct_description, ' -> ') WITHIN GROUP (order by ACCT_OPEN_DATE)) pattern_observance_count

Analytic Functions can’t go in a GROUP BY Clause

DON’T PUT YOUR AF’S WHERE THEY DON’T BELONG

• Use a subquery to get around this

-- Works: analytic functions in the select list
select deptno,
       avg(sal) over (partition by deptno) avg_sal_by_deptno,
       sal,
       sal / (avg(sal) over (partition by deptno)) pct_of_average
from scott.emp
order by deptno desc;

-- Fails: analytic functions are not allowed in the WHERE clause
-- (ORA-30483: window functions are not allowed here)
select deptno,
       avg(sal) over (partition by deptno) avg_sal_by_deptno,
       sal,
       sal / (avg(sal) over (partition by deptno)) pct_of_average
from scott.emp
where sal / (avg(sal) over (partition by deptno)) > 1
order by deptno desc;

-- Works: compute the analytic function in a subquery, then filter on its alias
select deptno, avg_sal_by_deptno, sal, pct_of_average
from (
       select deptno,
              avg(sal) over (partition by deptno) avg_sal_by_deptno,
              sal,
              sal / (avg(sal) over (partition by deptno)) pct_of_average
       from scott.emp
     )
where pct_of_average >= 1
order by deptno desc;
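The same subquery idea could answer the earlier "how common is each pattern" question: build one pattern string per account in an inner query, then count the strings in the outer query. This is a sketch with invented inline data, not the presenter's actual query; only the column names come from the slides.

```sql
WITH sub_accounts AS (
  -- Hypothetical sample data: accounts 1 and 2 follow the same opening pattern.
  SELECT 1 AS acct, 'SAVINGS' AS acct_description, DATE '2016-01-05' AS acct_open_date FROM dual UNION ALL
  SELECT 1, 'CHECKING', DATE '2016-02-01' FROM dual UNION ALL
  SELECT 2, 'SAVINGS',  DATE '2016-03-01' FROM dual UNION ALL
  SELECT 2, 'CHECKING', DATE '2016-04-01' FROM dual UNION ALL
  SELECT 3, 'CHECKING', DATE '2016-05-01' FROM dual UNION ALL
  SELECT 3, 'VISA',     DATE '2016-06-01' FROM dual
)
SELECT pattern,
       COUNT(*) AS pattern_observance_count
FROM (
       -- Inner query: one pattern string per account.
       SELECT acct,
              LISTAGG(acct_description, ' -> ')
                WITHIN GROUP (ORDER BY acct_open_date) AS pattern
       FROM sub_accounts
       GROUP BY acct
     )
GROUP BY pattern
ORDER BY pattern_observance_count DESC;
```

With this data, `SAVINGS -> CHECKING` is counted twice and `CHECKING -> VISA` once.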

GETTING ROLLED…

Can you tell me the transactions an account has done? Can you sum the Amounts?

NO, COULD YOU SUM UP THE AMOUNTS FOR EACH MONTH, BUT DON'T HIDE THE TRANSACTION DETAILS?

[Figure: the original data beside the plain sum(amount) result]

sum(amount) over (partition by trunc(business_date,'MM'), acct_num) monthly_total
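A self-contained sketch of the monthly total that keeps every transaction row visible. The inline `txns` rows are invented; the column names `acct_num`, `business_date` and `amount` follow the slide.

```sql
WITH txns AS (
  -- Hypothetical sample data: three January transactions and one in February.
  SELECT 100 AS acct_num, DATE '2017-01-03' AS business_date, 50 AS amount FROM dual UNION ALL
  SELECT 100, DATE '2017-01-10', 200 FROM dual UNION ALL
  SELECT 100, DATE '2017-01-20',  30 FROM dual UNION ALL
  SELECT 100, DATE '2017-02-02',  25 FROM dual
)
SELECT acct_num,
       business_date,
       amount,
       -- TRUNC(date, 'MM') maps every date to the first of its month,
       -- so all rows in the same month and account share one partition.
       SUM(amount) OVER (PARTITION BY TRUNC(business_date, 'MM'), acct_num) AS monthly_total
FROM txns
ORDER BY business_date;
```

Each January row shows a `monthly_total` of 280 and the February row shows 25, with no detail rows hidden.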

COULD YOU BREAK IT OUT BY THE TYPE OF TRANSACTION IT WAS? DEBIT VS. CREDIT?

sum(amount) over (partition by trunc(business_date,'MM'), acct_num, tran_type) monthly_total

sum(amount) over (partition by trunc(business_date,'MM'), acct_num) monthly_total

Nulls are treated together

Same partition => same total

Different partition => different total
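The effect of adding `tran_type` to the partition, including rows where it is NULL, can be sketched like this. The sample rows are invented, and the two aliases are renamed here so both totals can appear side by side.

```sql
WITH txns AS (
  -- Hypothetical sample data; two rows have no transaction type.
  SELECT 100 AS acct_num, DATE '2017-01-03' AS business_date, 'DEBIT' AS tran_type, 50 AS amount FROM dual UNION ALL
  SELECT 100, DATE '2017-01-10', 'CREDIT', 200 FROM dual UNION ALL
  SELECT 100, DATE '2017-01-20', 'DEBIT',   30 FROM dual UNION ALL
  SELECT 100, DATE '2017-01-25', NULL,      10 FROM dual UNION ALL
  SELECT 100, DATE '2017-01-28', NULL,       5 FROM dual
)
SELECT business_date,
       tran_type,
       amount,
       -- Partition includes tran_type: one total per month, account and type.
       SUM(amount) OVER (PARTITION BY TRUNC(business_date, 'MM'), acct_num, tran_type) AS monthly_total_by_type,
       -- Partition without tran_type: one total per month and account.
       SUM(amount) OVER (PARTITION BY TRUNC(business_date, 'MM'), acct_num) AS monthly_total
FROM txns
ORDER BY business_date;
```

Here `monthly_total_by_type` is 80 for the DEBIT rows, 200 for the CREDIT row and 15 for the two NULL rows (they fall into one partition together), while `monthly_total` is 295 on every row.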

COULD YOU MAKE A ROLLING SUM TOO, BROKEN OUT THE SAME WAY?

sum(amount) over (partition by trunc(business_date,'MM'), acct_num, tran_type) monthly_total,

sum(amount) over (partition by trunc(business_date,'MM'), acct, suffix, tran_type order by acct_seq_num) rolling_monthly_total

PERFECT, BUT COULD YOU EXCLUDE THE CURRENT TRANSACTION FROM THE ROLLING MONTHLY TOTAL ?

sum(amount) over (partition by trunc(business_date,'MM'), acct_num, tran_type) monthly_total,

sum(amount) over (partition by trunc(business_date,'MM'), acct, suffix, tran_type order by acct_seq_num) rolling_monthly_total,

sum(amount) over (partition by trunc(business_date,'MM'), acct, suffix, tran_type order by acct_seq_num
  ROWS BETWEEN UNBOUNDED PRECEDING and 1 PRECEDING) roll_mnthly_tot_excl_cur_tran
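A runnable sketch of the three totals side by side. The sample rows are invented, and for simplicity it partitions on `acct_num` throughout rather than the `acct, suffix` pair shown on the slide. Note that a ROWS window needs an ORDER BY in the same OVER clause, since "preceding" has no meaning without an order.

```sql
WITH txns AS (
  -- Hypothetical sample data: three debits in one month for one account.
  SELECT 100 AS acct_num, 1 AS acct_seq_num, DATE '2017-01-03' AS business_date, 'DEBIT' AS tran_type, 50 AS amount FROM dual UNION ALL
  SELECT 100, 2, DATE '2017-01-10', 'DEBIT', 30 FROM dual UNION ALL
  SELECT 100, 3, DATE '2017-01-20', 'DEBIT', 20 FROM dual
)
SELECT acct_seq_num,
       amount,
       -- No ORDER BY: the window is the whole partition.
       SUM(amount) OVER (PARTITION BY TRUNC(business_date, 'MM'), acct_num, tran_type) AS monthly_total,
       -- ORDER BY only: the default window runs from the partition start to the current row.
       SUM(amount) OVER (PARTITION BY TRUNC(business_date, 'MM'), acct_num, tran_type
                         ORDER BY acct_seq_num) AS rolling_monthly_total,
       -- Explicit window that stops one row before the current row.
       SUM(amount) OVER (PARTITION BY TRUNC(business_date, 'MM'), acct_num, tran_type
                         ORDER BY acct_seq_num
                         ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS roll_mnthly_tot_excl_cur_tran
FROM txns
ORDER BY acct_seq_num;
```

With these rows, `monthly_total` is 100 throughout, `rolling_monthly_total` is 50, 80, 100, and the version excluding the current transaction is NULL, 50, 80 (the first row has nothing before it).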

ROWS AND RANGE – SUB PARTITIONS

• ROWS BETWEEN UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING

• ROWS BETWEEN UNBOUNDED PRECEDING and X PRECEDING

• ROWS counts a number of rows

• RANGE is a numeric or date range on the ORDER BY value

• PRECEDING is before the current row

• FOLLOWING is after the current row
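The difference between ROWS and RANGE shows up as soon as the ordered values have a gap. A minimal sketch with four invented numbers (1, 2, 4, 5):

```sql
WITH nums AS (
  -- Hypothetical sample data with a gap between 2 and 4.
  SELECT 1 AS n FROM dual UNION ALL
  SELECT 2 FROM dual UNION ALL
  SELECT 4 FROM dual UNION ALL
  SELECT 5 FROM dual
)
SELECT n,
       -- ROWS: the physically previous row plus the current row.
       SUM(n) OVER (ORDER BY n ROWS  BETWEEN 1 PRECEDING AND CURRENT ROW) AS sum_rows,
       -- RANGE: every row whose n lies between (current n - 1) and current n.
       SUM(n) OVER (ORDER BY n RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) AS sum_range
FROM nums
ORDER BY n;

-- n  sum_rows  sum_range
-- 1  1         1
-- 2  3         3
-- 4  6         4    <- ROWS picks up the 2; RANGE finds no row with n = 3
-- 5  9         9
```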

SIMPLE EXAMPLE

lead(row_number) over (partition by 'X' order by row_number) next_number,

first_value(row_number) over (partition by 'X' order by row_number rows between 2 FOLLOWING and 3 FOLLOWING) number_after_the_next_number,

sum(row_number) over (partition by 'X' order by row_number rows between 1 FOLLOWING and 2 FOLLOWING) sum_of_next_2_nums,

sum(row_number) over (partition by 'X' order by row_number rows between 1 FOLLOWING and UNBOUNDED FOLLOWING) sum_nums_from_next_to_the_end,

sum(row_number) over (partition by 'X' order by row_number rows between 1 PRECEDING and 1 FOLLOWING) sum_nums_1_before_to_1_after
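To try these windows on a known input, the expressions can be wrapped in a query over the numbers 1 to 5. This is a sketch: the generated column is called `n` here instead of `row_number`, and the constant `partition by 'X'` is dropped because a single partition is the default.

```sql
WITH nums AS (
  -- Generates the numbers 1..5.
  SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 5
)
SELECT n,
       LEAD(n) OVER (ORDER BY n) AS next_number,
       FIRST_VALUE(n) OVER (ORDER BY n ROWS BETWEEN 2 FOLLOWING AND 3 FOLLOWING) AS number_after_the_next,
       SUM(n) OVER (ORDER BY n ROWS BETWEEN 1 FOLLOWING AND 2 FOLLOWING) AS sum_of_next_2_nums,
       SUM(n) OVER (ORDER BY n ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS sum_from_next_to_end,
       SUM(n) OVER (ORDER BY n ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_1_before_to_1_after
FROM nums
ORDER BY n;

-- n  next  after_next  next_2  next_to_end  1_before_to_1_after
-- 1  2     3           5       14           3
-- 2  3     4           7       12           6
-- 3  4     5           9       9            9
-- 4  5     (null)      5       5            12
-- 5  (null)(null)      (null)  (null)       9
```

Whenever a window contains no rows (for example, anything FOLLOWING the last row), the function returns NULL.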

FILLING HOLES

Can you tell me what a drawer’s end-of-day totals are each day?

Lots of missing days

How can we fill in those gaps?

LET’S GET THE NEXT USED DATE ON EACH ROW

lead(branch_date) over (partition by branch_code,cashbox_id order by branch_date) next_used_date

Let’s fix this null

AF’S CAN BE USED ALMOST ANYWHERE

case
  when lead(branch_date) over (partition by branch_code, cashbox_id order by branch_date) is null
    then branch_date
  else lead(branch_date) over (partition by branch_code, cashbox_id order by branch_date)
end next_used_date,
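The CASE expression above could also be written more compactly. Two equivalent sketches follow: wrapping LEAD in NVL, or using LEAD's optional third argument, which supplies a default when there is no next row. The inline `cashbox_history` rows are invented; the column names follow the slide.

```sql
WITH cashbox_history AS (
  -- Hypothetical sample data for one cash box.
  SELECT 'B01' AS branch_code, 7 AS cashbox_id, DATE '2016-11-15' AS branch_date FROM dual UNION ALL
  SELECT 'B01', 7, DATE '2016-11-16' FROM dual UNION ALL
  SELECT 'B01', 7, DATE '2016-11-19' FROM dual
)
SELECT branch_date,
       -- Option 1: replace a NULL result with branch_date.
       NVL(LEAD(branch_date) OVER (PARTITION BY branch_code, cashbox_id ORDER BY branch_date),
           branch_date) AS next_used_date_nvl,
       -- Option 2: LEAD(expr, offset, default) returns the default on the last row.
       LEAD(branch_date, 1, branch_date)
         OVER (PARTITION BY branch_code, cashbox_id ORDER BY branch_date) AS next_used_date_default
FROM cashbox_history
ORDER BY branch_date;
```

Both columns return the same values; on the last row they fall back to that row's own `branch_date`.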

NULLS FIXED!

[Figure: results before and after the null fix]

But we still have gaps…

JOIN THIS TO A “CALENDAR”

-- Begin date: to_date('20161101','YYYYMMDD')
-- 10000: some big number larger than how far you want to go back;
--        this determines the “end date”
SELECT to_date('20161101','YYYYMMDD') + ROWNUM - 1 calendar_date
FROM ( SELECT 1 just_a_column
       FROM dual
       CONNECT BY LEVEL <= (10000) )

* 20161101 is shorthand for to_date('20161101','YYYYMMDD')

JOINING TO A CALENDAR

WHERE calendar_date BETWEEN branch_date and next_used_date-1

20161115 is between 20161115 and (20161116 - 1)

The 20th is missing, but 20161120 is between 20161119 and (20161121 - 1)
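Putting the pieces together, the gap fill might look like the sketch below. The `drawer_totals` rows and the `end_of_day_total` column are invented for illustration. One deliberate difference from the slides: the default for the last row is `branch_date + 1` rather than `branch_date`, an assumption made here so that the final day still satisfies the BETWEEN condition and is not dropped.

```sql
WITH drawer_totals AS (
  -- Hypothetical sample data: the 17th, 18th and 20th are missing.
  SELECT DATE '2016-11-15' AS branch_date, 100 AS end_of_day_total FROM dual UNION ALL
  SELECT DATE '2016-11-16', 250 FROM dual UNION ALL
  SELECT DATE '2016-11-19', 175 FROM dual UNION ALL
  SELECT DATE '2016-11-21', 300 FROM dual
),
with_next AS (
  -- Attach the next used date to each row.
  SELECT branch_date,
         end_of_day_total,
         LEAD(branch_date, 1, branch_date + 1) OVER (ORDER BY branch_date) AS next_used_date
  FROM drawer_totals
),
calendar AS (
  -- One row per day in November 2016.
  SELECT DATE '2016-11-01' + LEVEL - 1 AS calendar_date
  FROM dual
  CONNECT BY LEVEL <= 30
)
SELECT c.calendar_date,
       w.branch_date AS source_date,
       w.end_of_day_total
FROM with_next w
JOIN calendar c
  ON c.calendar_date BETWEEN w.branch_date AND w.next_used_date - 1
ORDER BY c.calendar_date;
```

This returns a row for every day from the 15th through the 21st. The missing 17th and 18th carry the total from the 16th, and the missing 20th carries the total from the 19th.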

FILLED GAPS – THANKS TO AN AF

[Figure: results before and after filling the gaps]

HOW BIG IS THAT CANYON?

• Department wanted to know details of accounts going negative

• They wanted to know how deep and how wide the “canyon” was when looking at a daily history of account balances

[Figure: line chart of daily account balance (axis from -2000 to 1500), annotated: How deep? How wide? Start time? End time?]

USE PATTERN MATCHING (12C)

[Figure: The Data (line chart of account balance, axis from -500 to 1500) and The Result of the pattern-matching query]
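The slides do not include the query itself, so the following is only a sketch of how the "canyon" question might be expressed with MATCH_RECOGNIZE (Oracle 12c and later). The `daily_balances` data, its column names and the measure names are all invented for illustration.

```sql
WITH daily_balances AS (
  -- Hypothetical sample data: the balance is negative from Jan 3 to Jan 5.
  SELECT 1 AS acct_num, DATE '2017-01-01' AS bal_date, 500 AS balance FROM dual UNION ALL
  SELECT 1, DATE '2017-01-02',   200 FROM dual UNION ALL
  SELECT 1, DATE '2017-01-03',  -300 FROM dual UNION ALL
  SELECT 1, DATE '2017-01-04', -1500 FROM dual UNION ALL
  SELECT 1, DATE '2017-01-05',  -400 FROM dual UNION ALL
  SELECT 1, DATE '2017-01-06',   100 FROM dual UNION ALL
  SELECT 1, DATE '2017-01-07',   600 FROM dual
)
SELECT *
FROM daily_balances
MATCH_RECOGNIZE (
  PARTITION BY acct_num
  ORDER BY bal_date
  MEASURES
    FIRST(neg.bal_date) AS start_date,      -- start time
    LAST(neg.bal_date)  AS end_date,        -- end time
    COUNT(*)            AS days_negative,   -- how wide
    MIN(neg.balance)    AS lowest_balance   -- how deep
  ONE ROW PER MATCH
  PATTERN (neg+)                 -- one or more consecutive negative days
  DEFINE neg AS neg.balance < 0
);
```

For this data the query returns a single row: account 1, negative from 2017-01-03 to 2017-01-05, 3 days wide, with a lowest balance of -1500.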

THINGS YOU CAN DO WITH IT:

• Find V, W and other patterns in Stock Prices

• Find timeframes of high database use

• Group clicks in web logs into sessions

• Detect traversal patterns of Finite State Machines

• We won’t go much deeper… but look into these, they’re neat!

NOT COMPLICATED, JUST INVOLVED

• Used wherever you can put data into a line graph, i.e. data is a log of events

• Lots of great resources:

• Ask Tom - http://www.oracle.com/technetwork/issue-archive/2013/13-nov/o63asktom-2034271.html

• GitHub - https://github.com/oracle/analytical-sql-examples/tree/master/pattern-matching

• Burleson - http://www.dba-oracle.com/t_sql_match_recognize.htm

• YouTube has some good demos too

AF PERFORMANCE?

• Keep an eye on performance: these do lots of sorts

• Try to use indexes, and filter your data before applying analytic functions

• Sometimes AF’s can help improve performance, other times they can reduce it

• Tom Kyte says: In general, analytics are great for answering "really big" questions or questions against "small sets" (https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1137250200346660664)

QUESTIONS?