open data and xe˜yvind-isene...sql for the curious mind •answer to a query leads to more...
TRANSCRIPT
![Page 1: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/1.jpg)
![Page 2: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/2.jpg)
HavingFunwithOpenDataandOracleXELetsseewhatyoucandowith1CPU,11GBdata,andafreedatabase
Some pr
eaching
toward
s DBAs i
nclude
d
![Page 3: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/3.jpg)
Øyvind Isene @OyvindIsene
http://oisene.blogspot.com
http://sysco.no http://www.bicon.no
https://enesi.no/
![Page 4: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/4.jpg)
3MembershipTiers• OracleACEDirector• OracleACE• OracleACEAssociate
bit.ly/OracleACEProgram
500+TechnicalExpertsHelpingPeersGlobally
Connect:
Nominateyourselforsomeoneyouknow:acenomina?on.oracle.com
@oracleace
Facebook.com/oracleaces
![Page 5: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/5.jpg)
Whattheheck?
• WhyXEwhenyoucanplayaroundwithEE?• Whyopendata?• WhyshouldDBAscareaboutthis?• Whathasthisgottodowithautonomousdatabases?
![Page 6: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/6.jpg)
WhyOpenData?
• Lotsofexitingfreedatasetonthenet• RealisticdatatopracticeSQLon• MotivateyoutolearnmoreSQL&PL/SQL• Testyourskillsandverifyassumptions• Learnmoreaboutafieldthatinterestsyou
![Page 7: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/7.jpg)
SQLforthecuriousmind
• Answertoaqueryleadstomorequestions• SQLmakesiteasytoiterativelyinvestigateadataset
• Learnmorestatistics-usefulforeverybody• IncludingDBAsinvolvedwithoptimisation• Skewandweirdstatisticalpatterns
How can a DBA help others if he don’t know SQL?
![Page 8: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/8.jpg)
SQLandPL/SQLforeverything
• Almost• FeelthepressuretolearnR,Python,otherstuff?
• …butnotenoughtime?• Trytodoitfirstinthedatabase• Learnalgorithmsandmethodsfirst
![Page 9: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/9.jpg)
Improveapplications
• SQL,PL/SQLandpackagesimproveforeveryrelease
• Almostguaranteedtoworkbetterthananythingelse
• CheckoutsuppliedpackagesandSQLReference
• Beagoodfriendwiththedevelopers
![Page 10: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/10.jpg)
• It’saboutlatency• Reducenetworktrips• Closenesstodata• Findthefastestalgorithm• Usecodetestedbymany
Why?
![Page 11: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/11.jpg)
Inthedatabaseyoucan
• Load• Transform• Massageandwrangledata• Analyse• DisplaywithAPEX
![Page 12: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/12.jpg)
OK,butwhyXE?
• Self-imposedlimittoseehowmuchIcanfitintoXE
• XEisfree• Ifyoumakesomethinggreatyoucanreleaseit*
• Smallerfootprintandrunsalmosteverywhere• Fastertoinstall
![Page 13: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/13.jpg)
Notafaircomparison,but
TimetocreateDockercontainer
VirtualSizeofContainer
XE(11.2.0.1) <4minutes 3GB
EE(12.1.0.2) ~16minutes 11GB
![Page 14: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/14.jpg)
XELimits
• Runson1CPU• Usesmax1GBRAM• Max11GBuserdata• SomeheavyliftingfromEEnotincluded• Relatedtorecovery,onlineoperations,• Partitioning,managementpacks• DataMining:-(• OracleSpatial(butLocatoris)
In other words: Still lot of fun stuff in there
![Page 15: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/15.jpg)
XE18c
• ExpectedbetweenMarchandAugust2018• Willhave12GBandmorefeaturesincludingadvancedcompression(giving40GBrealcapacity)
• Use2GBRAM• 2CPUsand4pluggabledatabases• Stillnopatches• Yearlyreleases(meaninglessvulnerable)
![Page 16: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/16.jpg)
GetStarted
![Page 17: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/17.jpg)
Yourownlab
• DockerandVagrant• QuickinstallationofXE• DownloadableVMsfromODC(akaOTN)• Freecloudtrial?• Needstobeeasyandquick-focusonlearning
![Page 18: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/18.jpg)
Expressinstallation
• Linux:rpm-ivhdownloads/oracle-xe-11.2.0-1.0.x86_64.rpm
• Windows:Doubleclicksomething• Docker:SeepostbySQLMaria:http://bit.ly/2yeKIQx
![Page 19: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/19.jpg)
• Archivingturnedoffbydefault-OKforfun• Redologfilesaredefault50Mfor11g• Increasetosay500M
Increaseredologs
![Page 20: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/20.jpg)
• Youshould• Version4.0.2includedinXE11.2.0.2• Upgradetolatest-Easy!
APEX?
cd /tmp unzip apex_5.1.3.zip sqlplus / as sysdba @apexins.sql SYSAUX SYSAUX TEMP /i/ sqlplus / as sysdba @apex_epg_config.sql /tmp @apxchpwd.sql
![Page 21: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/21.jpg)
My Workflow
![Page 22: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/22.jpg)
Getmotivated
• Findanareathatinterestsyou• Beerdata• Governmentdata• Yourowndatafromsocialnetwork• Lookforsomethinginareasonableformat• FireupSQLDeveloper
![Page 23: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/23.jpg)
Freedatasets
• Data.world• DataisPluralbyJeremySinger-Vine• kdnuggets.com• kaggle.com• Reddit:reddit.com/r/datasets• Collectyouownfromgadgetsandsocialnetworks
![Page 24: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/24.jpg)
Whatcanyousqueezeinto11GB?
• Manydatasetsarequitesmall• 30yearsofmovies:<1MB• Differentsetstodiscoverweirdconnections• Somewillnotfit:• AmonthofRedditcomments:47GBJSON• AdvancedCompressionnotsupported• ButyoucanuseUTL_COMPRESS
![Page 25: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/25.jpg)
Loadingdata• Getitin
![Page 26: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/26.jpg)
ImportinSQLDeveloper
• Easyimportforstandardformats• CSV,XL,text• Savetimebylazyimport:• largeVARCHAR2columns• transformdatatypeslater
![Page 27: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/27.jpg)
Externaltable
• Easiertoautomate• Worksbestondatawithagoodstructure
![Page 28: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/28.jpg)
• ReadentiretextfileintoaCLOB
• Easierforridiculousformats
DBMS_LOB1.5 mill. beer tastings from Beer Advocate, 1999-2011
![Page 29: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/29.jpg)
create table file_clob(id number, text clob);
declare l_file BFILE ; l_clob clob ; l_dest_offset integer := 1; l_src_offset integer := 1; l_lang_context integer := 0; l_warning integer ; begin insert into file_clob values(1,empty_clob()) returning text into l_clob; l_file := bfilename('TMP_DIR','Beeradvocate.txt'); dbms_lob.fileopen( l_file, dbms_lob.FILE_READONLY ); DBMS_LOB.LOADCLOBFROMFILE ( dest_lob => l_clob, src_bfile => l_file, amount => DBMS_LOB.LOBMAXSIZE, dest_offset => l_dest_offset, src_offset => l_src_offset, bfile_csid => 873, lang_context => l_lang_context, warning => l_warning); dbms_lob.fileclose( l_file ); dbms_output.put_line('warning: ' || l_warning); COMMIT; end; /
Example with DBMS_LOB
![Page 30: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/30.jpg)
REST
• Widelyusedwhenintegrating• ManysitesoffersaRESTAPI• Easiertoautomate-useschedulerindb• NotdifficulttowritePL/SQLtoconsumeawebservicewithUTL_HTTP.
![Page 31: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/31.jpg)
RESTinpeace
• RESTiseveneasierwithAPEX• CreatereportsdirectlyonRESTdatasource• Somesitesoffersrealtimedataonly• Needtofetchregularlytogethistoricaldata
![Page 32: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/32.jpg)
JSON
• Usedalotindatasets• Goodsupportin12c• TransformtoandfromJSONwithstandardfunctions
• OpensourcelibrariesonGithubfor11g
![Page 33: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/33.jpg)
OtherTools
• SQLLoader• OpenSourcetools• Writeyourown• ButIpreferSQLDeveloperforsimplead-hoc
![Page 34: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/34.jpg)
Transform• Cleanupotherpeople’smess
![Page 35: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/35.jpg)
• Verifystructureandlookforbaddata• CreateTableasSelect(CTAS)• Probablythefastestwaytoconvertdata• Addnewfeatures(columns)andaggregations
SQL
![Page 36: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/36.jpg)
PL/SQL
• Whenyoucan’tdoitinSQL• ReadlinebylineandtransformwithPL/SQL• Learnregularexpressions!• REGEXP_SUBSTR• REGEXP_COUNT• REGEXP_INSTR• REGEXP_REPLACE
![Page 37: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/37.jpg)
• Structureddatahasafieldandarecordseparator
• Usefultohaveroutinestosearchfortheseandsplitaccordingly
• Wildvariationsinvariousuglyformats
FieldandRecordSeparators
![Page 38: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/38.jpg)
ABEERTablecreate table beer( name varchar2(500), id integer, brewerid integer, abv number, style varchar2(100));
declare l_pos integer := 1; l_end_of_record integer ; l_line varchar2(32767); l_length integer; l_clob clob; l_beer beer%rowtype; begin select text into l_clob from file_clob where id=1; l_length := dbms_lob.getlength(l_clob); while l_pos < l_length - 5 loop l_end_of_record := dbms_lob.instr(l_clob,chr(9) || chr(9) || chr(10) || chr(10),l_pos,1); l_line := dbms_lob.substr(l_clob,l_end_of_record - l_pos,l_pos); l_beer.name := regexp_substr(l_line,'beer\/name:\s+(.+)\s',1,1,'i',1); l_beer.id := regexp_substr(l_line,'beer\/beerId:\s+(\d+)\s',1,1,'i',1); l_beer.brewerid := regexp_substr(l_line,'beer\/brewerId:\s+(\d+)\s',1,1,'i',1); l_beer.abv := regexp_substr(l_line,'beer\/ABV:\s+(\d+\.\d+)\s',1,1,'i',1); l_beer.style := regexp_substr(l_line,'beer\/style:\s+(.+)\s',1,1,'i',1); insert into beer values l_beer; l_pos := l_end_of_record + 4; end loop; end; /
Parsing file from Beeradvocate
1.58 million records, 8 minutes with XE on i7
![Page 39: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/39.jpg)
Finalversionwithallattributes
1.59 mill. rows Size on disk: 1.5 GB Segment in db: 156 MB
![Page 40: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/40.jpg)
LearningOpportunities
• OracleTextonreviewtext• Analytical/statisticalfunctions• Findotherbeersthatmatchesyourtaste• Aretherefaultsinthedata?• HowdoyoucalculateaDATEfromUnixepoch?
![Page 41: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/41.jpg)
AddVirtualColumn
• ThisdatasetusesUNIXepochtime(secondssince1970-01-01)
• EasilyderiveaDATEcolumn:alter table reviews add review_date date generated always as (date '1970-01-01' + time_epoch /60/60/24);
![Page 42: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/42.jpg)
Analyseit• Thisissupposedtobefun
![Page 43: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/43.jpg)
Whatistheaveragescore?
Dopeoplegetbetterover
timetofindgoodbeer?
Whatisthelongestperiodapersonhasbeendrinkingbeereveryday?Whatisthemostpopularstyle?
IsFridayrea
llybeerda
y?
DobeerswithhigherABVgetbetterscores?
![Page 44: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/44.jpg)
Whatisthemostreviewedstyle?select style, count(*) from beer group by style order by 2 desc;
STYLE COUNT(*)AmericanIPA 117563AmericanDouble/ImperialIPA 85958AmericanPaleAle(APA) 63451RussianImperialStout 54109AmericanDouble/ImperialStout 50698AmericanPorter 50461AmericanAmber/RedAle 45741BelgianStrongDarkAle 37724Fruit/VegetableBeer 33853AmericanStrongAle 31939
![Page 45: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/45.jpg)
Butifyoucheckthescore…select style,round(avg(overall),2) avg_score, round(stddev(overall),2) stddev_score from reviews group by style order by 2 desc; Style AvgScore StdDev
Gueuze 4.09 0.64AmericanWildAle 4.09 0.65Quadrupel(Quad) 4.07 0.63Lambic-Unblended 4.05 0.66AmericanDouble/ImperialStout 4.03 0.67RussianImperialStout 4.02 0.64Weizenbock 4.01 0.6AmericanDouble/ImperialIPA 4 0.64FlandersRedAle 3.99 0.68Eisbock 3.98 0.63
![Page 46: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/46.jpg)
IsFridayBeerDay?
select year,day from ( select to_char(review_date,'YYYY') year , to_char(review_date,'DAY') day, row_number() over (partition by to_char(review_date,'YYYY') order by count(*) desc) rn from reviews group by to_char(review_date,'YYYY') , to_char(review_date,'DAY') ) where rn=1 order by 1;
Thegroupbyisexecutedfirst,thentheanalyticalrow_number()onthoserows,sortedbyhighesttolowest.Intheouterselectthedaywithrn=1(highest)isselected.
![Page 47: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/47.jpg)
WeekdaywithmostReviewsbyYearYEAR DAY1996 THURSDAY1998 THURSDAY1999 TUESDAY2000 SATURDAY2001 FRIDAY2002 MONDAY2003 MONDAY2004 MONDAY2005 SUNDAY2006 SUNDAY2007 SUNDAY2008 SUNDAY2009 SUNDAY2010 SUNDAY2011 SATURDAY2012 SUNDAY
![Page 48: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/48.jpg)
DoBeerswithHigherABVGetBetterScores?select style,round(corr(abv,overall),2) Correlation, count(*) cnt from reviews where review_date between date '2011-01-01' and date '2011-12-31' group by style order by 2 desc; STYLE CORRELATION CNT
ChileBeer 0.39 504Dortmunder/ExportLager
0.35 670
Faro 0.35 184ViennaLager 0.32 1189BièredeChampagne/BièreBrut
0.29 347
Happoshu 0.29 25
![Page 49: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/49.jpg)
select profile_name,min(review_day), max(review_day), count(*) days_drinking from (
) group by profile_name,lb order by days_drinking desc;
LongestPeriod
select profile_name, review_day, review_day - row_number() over (partition by profile_name order by review_day) lb from (
)
select profile_name,trunc(review_date) review_day from reviews group by profile_name, trunc(review_date)
This method is well explained in KISS series on Analytics, part 3, Mind the Gap
![Page 50: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/50.jpg)
TheResultsAreIn
mikesgroove 2008-04-01 2008-07-14 105atsprings 2010-07-06 2010-10-14 101cvstrickland 2008-05-12 2008-08-07 88Daniellobo 2010-01-13 2010-04-02 80jwinship83 2009-05-22 2009-08-07 78Zorro 2004-01-27 2004-04-06 71Gusler 2002-10-06 2002-12-12 68Jadjunk 2010-12-19 2011-02-23 67superdedooperboy 2008-08-14 2008-10-18 66
Query above took 8 sec. A simple inspection to verify:
select review_date from reviews where profile_name='mikesgroove' and review_date >= date '2008-04-01' order by 1;
![Page 51: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/51.jpg)
• Ahiddengemfrom10g• Sortofdatamining• Finditemsthatoccurtogether-basketanalysis• Usefortasterecommendations
DBMS_FREQUENT_ITEMSET
![Page 52: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/52.jpg)
Fromdoc
DBMS_FREQUENT_ITEMSET.FI_TRANSACTIONAL ( tranx_cursor IN SYSREFCURSOR, support_threshold IN NUMBER, itemset_length_min IN NUMBER, itemset_length_max IN NUMBER, including_items IN SYS_REFCURSOR DEFAULT NULL, excluding_items IN SYS_REFCURSOR DEFAULT NULL) RETURN TABLE OF ROW ( itemset [Nested Table of Item Type DERIVED FROM tranx_cursor], support NUMBER, length NUMBER, total_tranx NUMBER);
![Page 53: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/53.jpg)
Tablewiththreemostpopular
create table popular_style_list as select profile_name,style,avg_score from ( select profile_name,style,round(avg(overall),1) avg_score, row_number() over (partition by profile_name order by avg(overall) desc) rn from reviews group by profile_name,style ) where rn <=3;
Thewillprofile_nameserveasthetransactionidThethreemostpopularstylesarechosenforeachperson
![Page 54: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/54.jpg)
Anewtype
create or replace type fi_style_nt as table of varchar2(200) /
![Page 55: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/55.jpg)
Findmostcommoncombinations
SELECT CAST(itemset as fi_style_nt) itemset, support, length, total_tranx FROM table(DBMS_FREQUENT_ITEMSET.FI_TRANSACTIONAL( CURSOR(SELECT profile_name,style FROM popular_style_list), 0.01, 2, 3, NULL, NULL )) order by support desc;
![Page 56: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/56.jpg)
![Page 57: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/57.jpg)
Getrecommendations
SELECT CAST(itemset as fi_style_nt) itemset, support, length, total_tranx FROM table(DBMS_FREQUENT_ITEMSET.FI_TRANSACTIONAL( CURSOR(SELECT profile_name,style FROM popular_style_list), 0.01, 2, 3, CURSOR(SELECT * FROM table(fi_style_nt ('American IPA'))), NULL )) order by support desc;
![Page 58: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/58.jpg)
![Page 59: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/59.jpg)
Visualiseit• Morefun?
![Page 60: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/60.jpg)
Dotheygetbetteratfindinggoodbeer?
![Page 61: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/61.jpg)
Not so easy to seeBuckeyeNation 2003 3.88BuckeyeNation 2004 3.75BuckeyeNation 2005 3.75BuckeyeNation 2006 3.69BuckeyeNation 2007 3.63BuckeyeNation 2008 3.71BuckeyeNation 2009 3.82BuckeyeNation 2010 3.82BuckeyeNation 2011 3.84ChainGangGuy 2005 3.76ChainGangGuy 2006 3.69ChainGangGuy 2007 3.41ChainGangGuy 2008 3.51ChainGangGuy 2009 3.6
![Page 62: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/62.jpg)
LineChartinSQLDeveloper
I picked the persons with most reviews, maybe they drink everything?
![Page 63: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/63.jpg)
Numberofstylesovertime
select*from(selectprofile_name,review_month,round(avg(no_of_styles)over(partitionbyprofile_nameorderbyreview_monthrangebetweeninterval'2'monthprecedingandcurrentrow))moving_avg,dense_Rank()over(orderbymonths_drinkingdesc)rnfrom(selectprofile_name,trunc(review_date,'MM')review_month,count(distinctstyle)no_of_styles,count(*)over(partitionbyprofile_name)months_drinkingfromreviewsgroupbyprofile_name,trunc(review_date,'MM')))wherern<=10orderbyrn,2;
![Page 64: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/64.jpg)
• Latestversionhasseveraloptionfornicecharts• CheckoutSampleAppforchartsinAPEX5.1
ChartsinApex
![Page 65: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/65.jpg)
![Page 66: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/66.jpg)
• XEispowerfulformanypurposes• Opendataletsyoulearnonrealisticdata• Remembertohavefun• Limitssometimesmakeiteasiertoprioritise
Conclusion
![Page 67: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/67.jpg)
Credits to Conno
r McDonald
![Page 68: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/68.jpg)
AnewworldforDBAs• Cloud,autonomousdatabase…?
![Page 69: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/69.jpg)
• Cloudrealityshowstheneedfornewskills• Autonomousdatabasemightwork-advancesinAI
• DBAspendslesstimeontedioustasks• …andmoreonwhat?
Sometrends
I’m done with patching, backup, extending data files
![Page 70: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/70.jpg)
Weneedtoimproveon
• Security• Dataquality• Integration• Analysis• OpenSource
Quite a fe
w are doing
this
- don’t get be
hind!
![Page 71: Open Data and XE˜yvind-Isene...SQL for the curious mind •Answer to a query leads to more questions •SQL makes it easy to iteratively investigate a data set •Learn more statistics](https://reader035.vdocuments.site/reader035/viewer/2022071106/5fe0ea4752cf0e4dab5aa77c/html5/thumbnails/71.jpg)
• Datagovernance• Datamodelling• Somecoding• Security• Howtoutilisethedatabasebetter
DBAshouldlearnmore
Are DBAs too busy helping others? Need to reserve time for learning