Oracle OpenWorld 2019 SAN FRANCISCO Copyright © 2019 Oracle and/or its affiliates.


Post on 08-Feb-2020


Page 1: Oracle OpenWorld 2019...Oracle Machine Learning for SQL (OML4SQL) Python (OML4Py) R (OML4R) Empower SQL users with immediate access to ML in Oracle Database and Oracle Autonomous Database

Oracle OpenWorld 2019, San Francisco

Page 2

Oracle Machine Learning Overview of New Features and Roadmap

Mark Hornick, Senior Director, Product Management

Marcos Arancibia, Product Manager

Charlie Berger, Senior Director, Product Management

Page 3

Safe Harbor

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.

Page 4

Oracle Machine Learning Key Attributes

• Automated: Get better results faster with less effort, even for non-expert users
• Scalable: Handle big data volumes using parallel, distributed algorithms, with no data movement
• Production-ready: Deploy and update data science solutions faster with an integrated ML platform

Increase productivity, achieve enterprise goals, innovate more

Page 5

Why this matters

Machine learning offers tremendous promise for enterprises, but problems with data access, model scalability, and model deployment all too often derail even the best initiatives.

Oracle Machine Learning integrates with key enterprise infrastructure and delivers the performance, scalability, and automation required by enterprise-scale data science projects.

Page 6

Agenda

• Oracle Machine Learning family of products

• Supporting multiple personas

• OML component details

• Enabling applications with ML

• Roadmap

Page 7

Oracle Machine Learning

• OML4SQL (Oracle Advanced Analytics): SQL API
• OML4Py*: Python API
• OML4R (Oracle R Enterprise): R API
• OML Notebooks: with Apache Zeppelin on Autonomous Database
• OML4Spark (Oracle R Advanced Analytics for Hadoop)
• Oracle Data Miner: Oracle SQL Developer extension
• OML Microservices*: Supporting Oracle Applications (Image, Text, Scoring, Deployment, Model Management)

* Coming soon

Page 8

OML empowers Enterprise Users

Oracle Machine Learning

• Data Scientists
• Business and Data Analysts
• DBA and IT Professionals
• Application / Dashboard Developers
• Executives

Page 9

Data Scientists

• Popular data science languages: Python, R, SQL
• Augment with 3rd party packages
• Scalability and performance
• Automation-enhanced productivity
• Greater enterprise collaboration
• Integrate and analyze data across the enterprise

Page 10

Business and Data Analysts

• Expand analytical tool set with ML
• Enable non-ML experts with AutoML
• Leverage domain knowledge for better results
• Collaborate with Data Scientists and IT

Page 11

DBA and IT Professionals

• Even greater value from Oracle investment
• Support scalability and performance
• Simpler, streamlined infrastructure
• Maintain data security, backup, recovery
• Use SQL, expand to Python and R
• Leverage Database and Big Data sources

Page 12

Application and Dashboard Developers

• Realize intelligent solutions faster through Oracle stack integration
• Easily adopt data scientists’ R, Python, and SQL scripts and rapidly deploy solutions
• Embed ML in applications and dashboards using SQL, REST, and SODA APIs

Page 13

Executives

• Benefit from world-class data management technology and support
• Democratize ML across the enterprise to enable better data-driven decisions
• Deploy solutions faster to realize ROI

Page 14

Evolving into End-to-End Analytics Data Platform

Autonomous Database Cloud Service: tightly integrated analytics data platform

• Analytics
• Machine Learning
• Application Development
• Data Acquisition & Transformation
• Data Virtualization

Page 15

Cross-Platform Machine Learning

• Multiple user interfaces and APIs
• Deployed in cloud and on-premises
• From database to entire data management ecosystem

Select user interface, e.g.: SQL Developer, popular R IDEs, popular Python IDEs, OML Notebooks, OCI Data Science

API options: OML4SQL, OML4R, OML4Py, REST

Cloud or on-premises: Oracle Database, Data Lake, OML4Spark

Reach broader data sources (Oracle Cloud SQL, Oracle Big Data SQL): Oracle Object Storage, Big Data Service (HDFS), NoSQL databases, Kafka streams, Amazon S3, Azure Blob Storage

Page 16

Oracle Machine Learning Algorithms and Analytics

• CLASSIFICATION: Naïve Bayes; Logistic Regression (GLM); Decision Tree; Random Forest; Neural Network; Support Vector Machine (SVM); Explicit Semantic Analysis
• CLUSTERING: Hierarchical K-Means; Hierarchical O-Cluster; Expectation Maximization (EM)
• ANOMALY DETECTION: One-Class SVM
• TIME SERIES: Forecasting with Exponential Smoothing; includes popular models, e.g., Holt-Winters with trends, seasonality, irregularity, missing data
• REGRESSION: Linear Model; Generalized Linear Model (GLM); Support Vector Machine (SVM); Stepwise Linear Regression; Neural Network; LASSO
• ATTRIBUTE IMPORTANCE: Minimum Description Length; Principal Component Analysis (PCA); Unsupervised pair-wise KL divergence; CUR decomposition for row and attribute importance
• ASSOCIATION RULES: Apriori / market basket
• PREDICTIVE QUERIES: Predict, cluster, detect, features
• SQL ANALYTICS: SQL Windows; SQL Patterns; SQL Aggregates
• FEATURE EXTRACTION: Principal Component Analysis (PCA); Non-negative Matrix Factorization; Singular Value Decomposition (SVD); Explicit Semantic Analysis (ESA)
• TEXT MINING SUPPORT: Algorithms support text columns; Tokenization and theme extraction; Explicit Semantic Analysis (ESA) for document similarity
• STATISTICAL FUNCTIONS: Basic statistics: min, max, median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc.
• R AND PYTHON PACKAGES: Third-party R and Python packages through Embedded Execution; Spark MLlib algorithm integration

Page 17

OML algorithm features (In-Database and Spark)

• No data movement to separate analytical engines
• Wide range of ML techniques supported (In-Database: native; Spark: native and Spark MLlib)
• High performance from parallel, distributed execution (Spark: Spark 2-based, uses all nodes of the Hadoop cluster)
• Greater scalability from improved memory utilization
• Handle narrow, wide, and sparse data, plus star schema and nested data
• Automation: data preparation, text mining, partitioned models, AutoML

Page 18

Oracle Machine Learning Notebooks

for Oracle Autonomous Database

Page 19

Oracle Machine Learning Notebooks

Collaborative UI Based on Apache Zeppelin

Supports data scientists, data analysts, application developers, DBAs

Easy sharing of notebooks and templates

Permissions, versioning, and execution scheduling

Included with Autonomous Database

Automatically provisioned, managed, backed up

In-database SQL algorithms and analytics functions

Soon to be augmented with Python and R

Autonomous Database as a Data Science Platform

Page 20

Page 21

Oracle Machine Learning for SQL (OML4SQL), Python (OML4Py), R (OML4R)

Empower SQL users with immediate access to ML in Oracle Database and Oracle Autonomous Database

Empower data scientists with open source environments

Page 22

Traditional Analytics and Data Source Interaction

Data movement: extract / export from the data source to flat files, read into R/Python, then export and load results back
• Data source connectivity packages
• Read/write files using built-in tool capabilities

Deployment: ad hoc, e.g., cron job

Issues
• Access latency
• Paradigm shift: R/Python → data access language → R/Python
• Memory limitation: data size, in-memory processing
• Single threaded
• Issues for backup, recovery, security
• Ad hoc production deployment

Page 23

Oracle Machine Learning for SQL

Component of Oracle Autonomous Database and Oracle Advanced Analytics option to Oracle Database

• In-database, parallel, distributed algorithms
• ML models as first class database objects
• Export / import models across databases
• Batch and real-time scoring
• Explanatory predictive details
• Leverage ML across Oracle stack

SQL interfaces: SQL*Plus, SQL Developer, OML Notebooks, …

Platforms: Oracle Autonomous Database; Oracle Database with OAA option

Page 24

OML4SQL: Model Build and Real-time Prediction

Simple SQL Syntax: Classification Model

Model build (PL/SQL)

    -- Build a classification model from table CUST_INSUR_LTV;
    -- algorithm settings are read from table CUST_INSUR_LTV_SET
    BEGIN
      DBMS_DATA_MINING.CREATE_MODEL(
        model_name          => 'BUY_INSUR1',
        mining_function     => dbms_data_mining.classification,
        data_table_name     => 'CUST_INSUR_LTV',
        case_id_column_name => 'CUST_ID',
        target_column_name  => 'BUY_INSURANCE',
        settings_table_name => 'CUST_INSUR_LTV_SET');
    END;
    /

Real-time scoring (SQL query)

    -- Probability that a single, ad hoc customer record is scored 'Yes'
    SELECT prediction_probability(BUY_INSUR1, 'Yes'
             USING 3500 AS bank_funds, 825 AS checking_amount,
                   400 AS credit_balance, 22 AS age,
                   'Married' AS marital_status,
                   93 AS MONEY_MONTLY_OVERDRAWN, 1 AS house_ownership)
    FROM dual;

Page 25

Oracle Data Miner User Interface

SQL Developer Extension

Automates typical data science steps

Easy to use drag-and-drop interface

Analytical workflows quickly defined and shared

Wide range of algorithms and data transformations

Generate SQL code for immediate deployment

Create analytical workflows – supports “Citizen Data Scientists”

Page 26

Oracle Machine Learning for R and Python*

Components of Oracle Advanced Analytics option to Oracle Database

• Oracle Database as HPC environment
• In-database parallel and distributed machine learning algorithms
• Manage scripts and objects in Oracle Database
• Integrate results into applications and dashboards via SQL
• OML4Py automated machine learning

Diagram labels: client (SQL interfaces such as SQL*Plus and SQL Developer, OML4Py, OML4R); database server machine

* Coming soon

Page 27

Oracle Machine Learning for R and Python

Components of Oracle Advanced Analytics option to Oracle Database

Transparency layer
• Leverage proxy objects so data remain in database
• Overload native functions translating functionality to SQL
• Use familiar R / Python syntax on database data

Parallel, distributed algorithms
• Scalability and performance
• Exposes in-database algorithms available from OML4SQL

Embedded execution
• Manage and invoke R or Python scripts in Oracle Database
• Data-parallel, task-parallel, and non-parallel execution
• Use open source packages to augment functionality

OML4Py AutoML
• Model selection, feature selection, hyper-parameter tuning

Page 28

Proxy objects

Example using OML4R interface

Proxy data.frame inherits from data.frame

Page 29

Transparency Layer

Leverages proxy objects for database data: oml.DataFrame

    # Assumes a database connection has already been established
    import oml

    # Create a database table from pandas DataFrame `data`; returns a proxy object
    DATA = oml.create(data, table='BOSTON')

    # Alternatively, get a proxy object for the existing database table BOSTON
    DATA = oml.sync(table='BOSTON')

Uses familiar Python syntax to manipulate database data

    DATA.shape
    DATA.head()
    DATA.describe()
    DATA.std()
    DATA.skew()

    # Split into training and test sets, which are proxy objects as well
    TRAIN, TEST = DATA.split()
    TRAIN.shape
    TEST.shape

Overloads Python functions translating functionality to SQL

In-database performance: indexes, query optimization, parallelism, partitioning
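The idea of a transparency layer can be illustrated with a small, self-contained sketch. The classes below (`Predicate`, `ProxyColumn`, `ProxyFrame`) are hypothetical names invented for this illustration and are not part of OML4Py; they only show how overloaded Python operators on a proxy object can emit SQL text instead of touching any data.

```python
class Predicate:
    """Holds a SQL boolean expression produced by an overloaded comparison."""

    def __init__(self, sql):
        self.sql = sql


class ProxyColumn:
    """Toy stand-in for a database column: knows only its table and name."""

    def __init__(self, table, name):
        self.table = table
        self.name = name

    def __gt__(self, value):
        # The comparison is not evaluated in Python; it becomes SQL text.
        return Predicate(f"{self.name} > {value!r}")

    def mean_sql(self):
        return f"SELECT AVG({self.name}) FROM {self.table}"


class ProxyFrame:
    """Toy stand-in for a table proxy: holds a table name and a WHERE clause."""

    def __init__(self, table, where=None):
        self.table = table
        self.where = where

    def __getitem__(self, key):
        if isinstance(key, Predicate):
            # Filtering returns a new proxy; existing filters are AND-ed.
            if self.where is None:
                return ProxyFrame(self.table, key.sql)
            return ProxyFrame(self.table, f"({self.where}) AND ({key.sql})")
        return ProxyColumn(self.table, key)

    def head_sql(self, n=5):
        clause = f" WHERE {self.where}" if self.where else ""
        return f"SELECT * FROM {self.table}{clause} FETCH FIRST {n} ROWS ONLY"


# pandas-like syntax on the proxy; only SQL strings are produced
boston = ProxyFrame("BOSTON")
older = boston[boston["AGE"] > 50]
query = older.head_sql(3)
print(query)
print(boston["AGE"].mean_sql())
```

A real transparency layer would send such statements to the database and wrap the result in another proxy, so rows leave the database only when the user explicitly pulls them.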

Page 30

Parallel, Distributed Algorithms

In-database modeling using Support Vector Machine

Diagram labels: OML4Py Python client; Oracle Database with user tables

    import oml
    from oml import svm

    # create proxy object for the ONTIME_S table
    ONTIME_S = oml.sync(table='ONTIME_S')

    # define model object
    settings = {'svms_outlier_rate': 0.01}
    svm_mod = svm('anomaly_detection',
                  svms_kernel_function='dbms_data_mining.svms_linear',
                  **settings)

    # build anomaly detection model (no target column, so y=None)
    svm_mod = svm_mod.fit(x=ONTIME_S, y=None)

    # view model object
    svm_mod

Page 31

Embedded Execution

Example of parallel execution for partitioned data flow using third party package

Diagram labels: OML4Py Python client; pyq*eval() interface; Oracle Database with user tables; two extproc processes, each with an OML4Py Python engine

    import oml

    # user-defined function using sklearn
    def build_lm(dat):
        from sklearn import linear_model
        lm = linear_model.LinearRegression()
        X = dat[['PETAL_WIDTH']]
        y = dat[['PETAL_LENGTH']]
        lm.fit(X, y)
        return lm

    # select column(s) for partitioning data
    # (IRIS is assumed to be an existing oml.DataFrame proxy for the iris data)
    index = oml.DataFrame(IRIS['SPECIES'])

    # invoke function in parallel on IRIS table
    mods = oml.group_apply(IRIS, index,
                           func=build_lm,
                           parallel=2)

    # pull the fitted models, one per species, into the client session
    mods.pull().items()
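The data-parallel pattern behind a grouped apply can be sketched without a database. The following is a minimal illustration, not the OML4Py implementation: `group_apply_sketch` and `fit_line` are hypothetical helpers, the rows are made-up values, and a thread pool stands in for the database-spawned Python engines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fit_line(rows):
    """Least-squares fit of petal_length on petal_width for one partition."""
    xs = [r["petal_width"] for r in rows]
    ys = [r["petal_length"] for r in rows]
    n = len(rows)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def group_apply_sketch(rows, key, func, parallel=2):
    """Partition rows by `key`, then run `func` on each partition in a pool."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        futures = {name: pool.submit(func, part) for name, part in groups.items()}
        return {name: future.result() for name, future in futures.items()}


# Made-up rows: group "a" follows y = 2x, group "b" follows y = x + 3
rows = [
    {"species": "a", "petal_width": 1.0, "petal_length": 2.0},
    {"species": "a", "petal_width": 2.0, "petal_length": 4.0},
    {"species": "a", "petal_width": 3.0, "petal_length": 6.0},
    {"species": "b", "petal_width": 1.0, "petal_length": 4.0},
    {"species": "b", "petal_width": 2.0, "petal_length": 5.0},
    {"species": "b", "petal_width": 3.0, "petal_length": 6.0},
]
models = group_apply_sketch(rows, "species", fit_line, parallel=2)
for name, (slope, intercept) in sorted(models.items()):
    print(name, slope, intercept)
```

Each partition yields its own model, keyed by the grouping value, which mirrors the one-model-per-species result of the slide's example.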

Page 32

AutoML – new with OML4Py

Increase data scientist productivity – reduce overall compute time

Enables non-expert users to leverage Machine Learning

Pipeline: Data Table → Auto Model Selection (much faster than exhaustive search) → Auto Feature Selection (>50% reduction in features) → Auto Tune (significant score improvement) → ML Model

Auto Feature Selection
– Reduce # of features by identifying most predictive
– Improve performance and accuracy

Auto Model Selection
– Identify in-database algorithm that achieves highest model quality
– Find best model faster than with exhaustive search

Auto Tune Hyperparameters
– Significantly improve model accuracy
– Avoid manual or exhaustive search techniques

Page 33

Auto Feature Selection: examples

Reduce # features by identifying most relevant; improve performance and accuracy

[Figure: OpenML dataset 312 with 1925 rows, 299 columns. Panels: ML training time (seconds) and prediction accuracy, for 299 vs. 9 features. Annotations: 97% reduction in features, 33x reduction in training time, +4% accuracy.]

[Figure: OpenML dataset 40996 with 56K rows, 784 columns. Panel: prediction accuracy for 784 vs. 309 features. Annotations: 60% reduction in features, +18% accuracy, 1.3X time reduction to build SVM Gaussian model.]
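To make "identifying the most relevant features" concrete, here is a minimal filter-style sketch that ranks features by absolute Pearson correlation with the target and keeps the top few. This is a generic textbook technique chosen for illustration; the slides do not say which method OML4Py AutoML uses, and the function names and data below are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_features(columns, target, keep):
    """Rank features by |correlation with target| and keep the best `keep`."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores


# Made-up data: one informative column, one weak column, one constant column
columns = {
    "signal": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "weak": [2.0, 1.0, 2.0, 1.0, 3.0, 2.0],
    "constant": [7.0, 7.0, 7.0, 7.0, 7.0, 7.0],
}
target = [0, 0, 0, 1, 1, 1]

selected, scores = select_features(columns, target, keep=2)
print(selected)
```

Training on fewer columns is what drives the shorter build times shown in the figures; a filter like this cannot capture feature interactions, which a production feature selector would have to handle.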

Page 34

Oracle Machine Learning for Spark (OML4Spark)

supported by Oracle R Advanced Analytics for Hadoop

Page 35

Oracle Machine Learning for Spark

R Language API Component to Oracle Big Data Connectors

• Leverage Spark 2 environment for powerful data preparation and machine learning
• Use data across range of Data Lake sources
• Achieve scalability and performance using full Hadoop cluster
• Parallel and distributed ML algorithms from native and Spark MLlib implementations

Diagram labels: R client; OML4Spark; Java API; HDFS | Hive | Spark DF | Impala | JDBC Sources; BDA, BDS, DIY

Page 36

Oracle Machine Learning for Spark

R Language API Component to Oracle Big Data Connectors

Transparency layer
• Proxy objects reference data from file system, HDFS, Hive, Impala, Spark DataFrame, and JDBC sources
• Overloaded R functions translate functionality to native language, e.g., HiveQL for Hive and Impala
• Users manipulate data via standard R syntax

Parallel, distributed machine learning algorithms
• Scalability and performance leveraging full Hadoop cluster
• Spark-based custom LM, GLM, NN, K-Means plus Spark MLlib
• Use expressive R Formula specification

Compute framework with custom R mappers/reducers
• Data-parallel and task-parallel execution
• Allows open source CRAN packages to run on cluster nodes

Page 37

OML4Spark Performance: Logistic Regression (GLM)

• Data fits in memory: up to 7x faster than Spark MLlib
• Data cannot fit memory: able to solve a 10B row model

Benchmark environment
• ORAAH 2.8.0
• Big Data Appliance X7-2
• 6 nodes, 256 GB of RAM per node

Formula: cancelled ~ distance + origin + dest + as.factor(month) + as.factor(year) + as.factor(dayofmonth) + as.factor(dayofweek) + as.factor(flightnum)

[Figure: OML4Spark vs. Spark MLlib for GLM Logistic Regression. X axis: dataset size (# rows), 100K to 10B. Y axis: execution time (seconds), ticks from 1 to 10,000. Series: OML4Spark, MLlib.]

Page 38

Enabling Oracle Applications

with OML Models and Microservices

Page 39

Applications integrating OML

• HCM Cloud: Workforce Predictions
• CRM Sales Cloud: Sales Prediction
• Retail GBU: Customer Insights, Customer Segmentation
• Adaptive Intelligent Applications for Manufacturing
• Configure, Price, Quote Cloud
• Content and Experience: Unstructured Data Analytics
• Oracle Integration Cloud: Digital Process Automation
• Industry Data Models: Communications, SNA, Utilities, Airlines, Retail, …
• EBS Spend Classification: organize spend into logical categories
• EBS Depot Repair: optimize speed, cost, quality of product repair, reuse, recycling
• Oracle Identity Management: Adaptive Access Management
• FSGBU: Analytical Applications Infrastructure

Page 40

Application platforms using OML Platform

Oracle Integration Cloud (OIC)

PaaS service that exposes OML features
• Build models in ADB
• Deploy via OML Microservices

Digital Process Automation
• Help business users make better decisions by using recommendations from ML models
• Increase automation of human-centric approval workflows
• Used by Oracle SaaS process-centric apps

Oracle Content and Experience (OCE)

Cloud-based content hub to drive omni-channel content management and accelerate experience delivery

Improve Content Discoverability
• Search, organize content, reduce duplication
• Find relevant images/docs during content creation
• Automatic tagging and classification of images, videos, text
• Visual search
• Leverages OML cognitive microservices

Page 41

Oracle Machine Learning Microservices

Currently available to Oracle Applications teams – GA coming soon

• Model Management Services: building and deploying OML models
• Cognitive Services: feature extraction, image and text
• Model repository: store, version, compare ML models
• REST APIs for application integration

Page 42

Sample of Microservices APIs

Model Management (ModelRepository REST)
• GET /models
• GET /{model name}
• GET /{model name}/{version}
• POST /{model name}
• POST /{model name}/{version}
• DELETE /{model name}/{version}

Model Deployment (ModelDeploy REST)
• GET /models
• GET /{uri}
• GET /{uri}/api
• POST /{uri}
• POST /{uri}/score
• DELETE /{uri}

Cognitive Image (CognitiveImage REST)
• POST /imageClassification
• POST /nsfw
• POST /objectDetection
• POST /faceDetection
• POST /imageSimilarity
• POST /faceSimilarity

Cognitive Text (CognitiveText REST)
• POST /ner
• POST /topics
• POST /keywords
• POST /sentiment
• POST /summary
• POST /similarity
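As an illustration of how a client might address the `POST /{uri}/score` route listed above, the sketch below only constructs the HTTP request and does not send it. The base URL, the bearer-token header, and the JSON payload shape (`records`) are assumptions made for this example; the slides give only the route names, and the service was not generally available at the time.

```python
import json
from urllib.request import Request

# Hypothetical host; the slides do not give a base URL.
BASE_URL = "https://oml.example.com"


def build_score_request(uri, records, token):
    """Build (but do not send) a POST request for the /{uri}/score route."""
    body = json.dumps({"records": records}).encode("utf-8")  # assumed payload shape
    return Request(
        url=f"{BASE_URL}/{uri}/score",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


req = build_score_request(
    "buy_insur",                        # hypothetical deployment URI
    [{"age": 22, "bank_funds": 3500}],  # made-up input record
    token="example-token",
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the call against a real endpoint; that step is left out because no real endpoint is assumed here.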

Page 43

Oracle Machine Learning Roadmap

• Algorithms
• OML Notebooks
• OML User Interface
• OML4R and OML4Py
• OML4Spark
• OML Microservices
• GPUs

Page 44

Roadmap: Algorithms for Database 20c

Frequently requested algorithms

eXtreme Gradient Boosting Trees (XGBoost)
• Classification, regression, ranking, survival analysis
• Highly popular and powerful algorithm

MSET-SPRT
• Anomaly detection for sensors, IoT data sources
• “Multivariate State Estimation Technique”
• A non-linear, non-parametric anomaly detection ML technique
• Based on Oracle Labs algorithm
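The SPRT half of MSET-SPRT (sequential probability ratio test) can be sketched in a few lines. The code below is a generic Wald SPRT for a shift in the mean of Gaussian residuals, written for illustration only; it is not Oracle's implementation, the MSET estimation step that would produce the residuals is not shown, and all parameter defaults are assumptions.

```python
from math import log


def sprt(residuals, mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.01, beta=0.01):
    """Wald's SPRT: H0 mean=mu0 (normal) vs. H1 mean=mu1 (anomaly).

    Returns (decision, samples_used). alpha and beta are the target
    false-alarm and missed-alarm probabilities.
    """
    upper = log((1 - beta) / alpha)  # crossing this accepts H1
    lower = log(beta / (1 - alpha))  # crossing this accepts H0
    llr = 0.0
    for count, r in enumerate(residuals, start=1):
        # Log-likelihood ratio increment for two Gaussians with equal variance
        llr += (mu1 - mu0) / sigma ** 2 * (r - (mu0 + mu1) / 2)
        if llr >= upper:
            return "anomaly", count
        if llr <= lower:
            return "normal", count
    return "undecided", len(residuals)


# Residuals = observed sensor value minus estimated value (made-up data)
healthy = sprt([0.0] * 20)
drifting = sprt([1.0] * 20)
short = sprt([0.0, 1.0])
print(healthy, drifting, short)
```

The appeal of a sequential test for sensor streams is that it decides as soon as the accumulated evidence is strong enough, rather than after a fixed window.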

Page 45

Roadmap: Expand Autonomous Database with Python and R

Autonomous Database as a Data Science Platform

• OML Notebooks add support for Python and R
• Python and R scripts managed in-database
  – Invoke from OML Notebooks, and REST or SQL APIs
  – Deploy into SQL and Web applications easily
• Scalable Python and R execution
  – Transparency layer-enabled database functionality
  – In-database machine learning algorithms
  – AutoML functionality via OML4Py
• OML4Py integrated with OCI Data Science (DataScience.com)

Diagram labels: data scientists; SQL clients (SQL); REST applications

Page 46

Roadmap: OML AutoML User Interface

“Code-free” user interface supporting automated end-to-end machine learning

Automate production and deployment of ML models
• Enhance Data Scientist productivity, user-experience
• Enable non-expert users to leverage ML
• Unify model deployment and monitoring
• Support model management

Features
• Minimal user input: data, target
• Model leaderboard
• Model deployment in applications via REST endpoint
• Model monitoring: accuracy, prediction/predictor drift
• Cognitive features for processing image and text
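Predictor drift monitoring can be pictured with the Population Stability Index (PSI), a common drift statistic that compares the binned distribution of a predictor at training time with its distribution in recent scoring data. The slides do not say which drift metric the planned monitoring uses, so PSI is an assumption here, and the bin counts are made up.

```python
from math import log


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index over matching bins.

    PSI = sum((a_i - e_i) * ln(a_i / e_i)) where e_i and a_i are the
    expected and actual bin fractions; eps guards against empty bins.
    """
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_frac = max(e / expected_total, eps)
        a_frac = max(a / actual_total, eps)
        total += (a_frac - e_frac) * log(a_frac / e_frac)
    return total


# Made-up bin counts for one predictor: training data vs. two scoring windows
training = [100, 200, 300]
stable = psi(training, [10, 20, 30])      # same proportions as training
shifted = psi([50, 50], [90, 10])         # mass moved into the first bin
print(stable, shifted)
```

A value of zero means identical bin proportions; larger values indicate a larger shift, and teams typically choose an alert threshold that suits their data.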

Page 47

Roadmap: OML4R and OML4Py

Expose additional OML4SQL algorithms to Python and R

Support for recent R and Python releases

Enhance analytics and data operations further with in-database computation

Deliver high performance, scalable, auto-tuning algorithms and data analytics to Python and R users

Enable Oracle Database standard, integrated installation, patching, upgrade/downgrade

Expand support for open source languages and ecosystems

Page 48

Roadmap: OML4Spark

Support advanced machine learning activities on Big DataModel management and cognitive image and text processing

Model deployment and monitoring on Big Data (including Database models)

Cloud-oriented packaging (containers, REST APIs) of OML4Spark functionality

Enable OML4Py and OML4R for uniform experience across platforms

Algorithms

Neural Network gradient descent enhancements to avoid over-fitting

New native Support Vector Machine with linear and non-linear kernels

New native k-Means and k-Mode clustering algorithms

New cloud-based architecture with powerful Spark analytics

Roadmap: OML Platform REST APIs

Extend Model Management Microservices

Model Monitoring of accuracy and prediction/predictor drift

User-defined scripts deployment

Python and R user-defined functions

Diagram labels: REST API; ModelMonitoring REST; User-defined Scripts Deployment REST

Oracle Machine Learning Platform REST APIs

Cognitive Services

FeatureExtraction REST

CognitiveImage REST

CognitiveText REST

Model Management and Online Scoring Services

ModelRepository REST

ModelDeploy REST

ModelMonitoring REST (Roadmap)

User-defined Scripts Deployment REST (Roadmap)

Build / Batch Scoring / Export

Store Models and Metadata

Oracle REST Data Services (REST)

Roadmap: Enabling GPUs

Enable GPUs for in-database algorithms: replace MKL with cuBLAS

Leverage GPUs for user-defined R and Python functions

Include 3rd party packages leveraging GPUs, e.g., TensorFlow, Keras

Execute on GPU cloud compute shape

Support state-of-the-art ML processing, e.g., deep learning

Augment OML Microservices for GPU processing – key for images

Oracle Machine Learning Key Attributes

Automated, Scalable, Production-ready

Increase productivity, achieve enterprise goals, innovate more

For more information…

https://www.oracle.com/database/technologies/datawarehouse-bigdata/machine-learning.html

Thank You

Mark Hornick

Senior Director, Product Management

Data Science and Machine Learning

Appendix

Business Value of Machine Learning

Deliver better outcomes

Understand the reasons behind outcomes

Capitalize on an enterprise’s unique data

Create sustainable competitive advantage

Increase the rate of better decision making

Respond to market conditions faster

Attain a higher level of customer satisfaction

Know which of your big data you should pay attention to

Traditional analytics tools examine historical data; machine learning explains the past and predicts the future

On-premises configuration options

Oracle Machine Learning Components

Oracle Database Enterprise Edition

Oracle Advanced Analytics option

Oracle Advanced Analytics option

Oracle Data Mining

Oracle R Enterprise

Oracle Data Miner UI in SQL Developer

Oracle R Distribution

Standard R IDEs

Oracle Machine Learning for Python

Standard Python IDEs

Python

Oracle R Enterprise client library

On-premises configuration options

Oracle Machine Learning Components

Oracle Database Enterprise Edition

Oracle Advanced Analytics option

Oracle Advanced Analytics option

OML4SQL

OML4R

Oracle Data Miner UI in SQL Developer

Oracle R Distribution

Standard R IDEs

OML4Py

Standard Python IDEs

Python

OML4R client library

OML4R / ORE 1.5.1

Machine Learning algorithms in-database

Classification

• Decision Tree
• Logistic Regression
• Naïve Bayes
• Support Vector Machine
• RandomForest

Regression

• Linear Model
• Generalized Linear Model
• Multi-Layer Neural Networks
• Stepwise Linear Regression
• Support Vector Machine

Attribute Importance

• Minimum Description Length
• Expectation Maximization

Clustering

• Hierarchical k-Means
• Orthogonal Partitioning
• Expectation Maximization

Feature Extraction

• Nonnegative Matrix Factorization
• Principal Component Analysis
• Singular Value Decomposition
• Explicit Semantic Analysis

Market Basket Analysis

• Apriori – Association Rules

Anomaly Detection

• 1 Class Support Vector Machine

Time Series

• Single Exponential Smoothing
• Double Exponential Smoothing

…plus open source R packages for algorithms in combination with embedded R execution

Supports integrated partitioned models, text mining

OML4Py 1.0

Machine Learning algorithms in-database

Classification

• Decision Tree
• Naïve Bayes
• Generalized Linear Model
• Support Vector Machine
• RandomForest
• Neural Network

Regression

• Generalized Linear Model
• Neural Network
• Support Vector Machine

Attribute Importance

• Minimum Description Length

Clustering

• Expectation Maximization
• Hierarchical k-Means

Feature Extraction

• Singular Value Decomposition
• Explicit Semantic Analysis

Market Basket Analysis

• Apriori – Association Rules

Anomaly Detection

• 1 Class Support Vector Machine

…plus open source Python packages for algorithms in combination with embedded Python execution

Supports integrated partitioned models, text mining

ODB >= 18c

R and Python Object Persistence

Datastore: *.save() and *.load()

Provide database storage to save/restore objects across sessions

Use cases

Pass R/Python predictive models for embedded R/Python execution

Pass arguments to R/Python functions with embedded R/Python execution, especially when non-scalar for SQL invocation

Preserve objects across R/Python sessions

R Datastore

x1 <- ore.lm(...)
x2 <- ore.frame(...)
ore.save(x1, x2, name="ds1")

ore.load(name="ds1")
ls()
"x1" "x2"

Datastore ds1 contains {x1, x2}

R and Python Object Persistence

Datastore: *.save() and *.load()

Provide database storage to save/restore objects across sessions

Use cases

Pass R/Python predictive models for embedded R/Python execution

Pass arguments to R/Python functions with embedded R/Python execution, especially when non-scalar for SQL invocation

Preserve objects across R/Python sessions

Python Datastore

x1 = rf_mod.fit(...)
x2 = oml.push(...)
oml.ds.save(objs={'x1': x1, 'x2': x2}, name="ds1")

oml.ds.load(name="ds1")
['x1', 'x2']

Datastore ds1 contains {x1, x2}
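To make the datastore idea concrete without a database connection, the following stand-in saves named objects under a datastore name and restores them later. It uses the standard-library `sqlite3` and `pickle` modules only as an analogy; the helper names `ds_save` and `ds_load` and the table layout are hypothetical and are not the OML4Py implementation.

```python
import pickle
import sqlite3

def ds_save(conn, objs, name):
    """Store each named object as a serialized row under datastore `name`."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS datastore "
        "(ds TEXT, obj TEXT, payload BLOB, PRIMARY KEY (ds, obj))"
    )
    for obj_name, obj in objs.items():
        conn.execute(
            "INSERT OR REPLACE INTO datastore VALUES (?, ?, ?)",
            (name, obj_name, pickle.dumps(obj)),
        )
    conn.commit()

def ds_load(conn, name):
    """Restore every object saved under datastore `name` as a dict."""
    rows = conn.execute(
        "SELECT obj, payload FROM datastore WHERE ds = ? ORDER BY obj", (name,)
    )
    return {obj: pickle.loads(payload) for obj, payload in rows}

# An in-memory database keeps the sketch self-contained; a file path
# would be needed for the objects to survive across sessions.
conn = sqlite3.connect(":memory:")
ds_save(conn, {"x1": {"coef": [1.5, -0.2]}, "x2": [1, 2, 3]}, name="ds1")
restored = ds_load(conn, "ds1")
print(sorted(restored))
```

The point of the pattern is the same as on the slide: objects are addressed by a datastore name, so a later session or an embedded-execution function can load them by that name instead of receiving them as arguments.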

IoT Use Case: Sensor Data Analysis

Model each customer’s behavior and identify deviations in individual behavior and overall aggregate demand

200 thousand households, each with a utility “smart meter”

1 reading / meter / hr

200K x 8760 hrs / yr = 1.752B readings

3 years' worth of data = 5.256B readings

Each customer has 26,280 readings

If each model takes 10 seconds to build: 555.6 hrs (23.1 days); with 128 DOP: 4.3 hrs

Massive Predictive Modeling
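The sizing on this slide can be reproduced with a few lines of arithmetic. The sketch below uses the slide's own inputs; the variable names are illustrative, and the parallel estimate assumes ideal linear speedup at the stated degree of parallelism (DOP), which real workloads may not reach.

```python
households = 200_000          # one smart meter per household
hours_per_year = 24 * 365     # 8760 readings per meter per year
years = 3
build_seconds = 10            # assumed build time per customer model
dop = 128                     # degree of parallelism

readings_per_year = households * hours_per_year
readings_total = readings_per_year * years
readings_per_customer = hours_per_year * years

serial_hours = households * build_seconds / 3600
serial_days = serial_hours / 24
parallel_hours = serial_hours / dop   # ideal linear speedup assumed

print(readings_per_year, readings_total, readings_per_customer)
print(round(serial_hours, 1), round(serial_days, 1), round(parallel_hours, 1))
```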

IoT Use Case: Sensor Data Analysis

Scalable Sensor Data Analysis – Model Building

Diagram: in Oracle Database, the data is partitioned by customer (c1, c2, …, ci, …, cn). The build-model script f(dat,args,…) from the Script Repository is invoked once per partition, and the resulting models (Model c1, Model c2, …, Model ci, …, Model cn) are saved in the Datastore.

Build models and store in database, partition on CUST_ID

OML4R embedded R execution

ore.groupApply(CUST_USAGE_DATA,
  CUST_USAGE_DATA$CUST_ID,
  function(dat, ds.name) {
    cust_id <- dat$CUST_ID[1]
    mod <- lm(Consumption ~ . -CUST_ID, dat)
    mod$effects <- mod$residuals <- mod$fitted.values <- NULL
    name <- paste("mod", cust_id,sep="")
    assign(name, mod)
    ds.name1 <- paste(ds.name,".",cust_id,sep="")
    ore.save(list=paste("mod",cust_id,sep=""), name=ds.name1, overwrite=TRUE)
    TRUE
  },
  ds.name="myDatastore", ore.connect=TRUE, parallel=TRUE
)

14 lines
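For readers more familiar with Python, the same group-apply pattern can be sketched with the standard library alone: split the rows by customer, fit one model per group in parallel, and keep the results under per-customer names. Everything here is illustrative; `group_apply`, `fit_line`, the single `temp` predictor, and the in-memory dictionary standing in for the datastore are assumptions, not OML4Py calls.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def fit_line(rows):
    """Ordinary least squares for consumption ~ temp on one customer's rows."""
    xs = [r["temp"] for r in rows]
    ys = [r["consumption"] for r in rows]
    n = len(rows)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return {"intercept": my - slope * mx, "slope": slope}

def group_apply(rows, key, func):
    """Partition rows on `key`, apply `func` to each partition in parallel."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r)
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(func, groups.values()))
    # name each result "mod<key>", mirroring the naming used in the R example
    return {"mod" + str(k): m for k, m in zip(groups, models)}

data = [
    {"cust_id": 1, "temp": 10.0, "consumption": 5.0},
    {"cust_id": 1, "temp": 20.0, "consumption": 7.0},
    {"cust_id": 1, "temp": 30.0, "consumption": 9.0},
    {"cust_id": 2, "temp": 10.0, "consumption": 3.0},
    {"cust_id": 2, "temp": 20.0, "consumption": 3.5},
    {"cust_id": 2, "temp": 30.0, "consumption": 4.0},
]
datastore = group_apply(data, "cust_id", fit_line)
print(sorted(datastore))
```

In the R example the database supplies the partitioning and parallelism; the thread pool here only imitates that role on a toy dataset.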

For model build and batch scoring

OML4R embedded R execution – SQL API

-- Model build: create the script, then invoke it to fit and save the model
begin
  --sys.rqScriptDrop('Example2')
  sys.rqScriptCreate('Example2',
    'function(dat,datastore_name) {
       mod <- lm(ARRDELAY ~ DISTANCE + DEPDELAY, dat)
       ore.save(mod,name=datastore_name, overwrite=TRUE)
       TRUE
     }');
end;
/

select *
  from table(rqTableEval(
    cursor(select ARRDELAY,
                  DISTANCE,
                  DEPDELAY
             from ontime_s),
    cursor(select 1 "ore.connect",
                  'myDatastore' as "datastore_name"
             from dual),
    'XML',
    'Example2' ));

-- Batch scoring: load the saved model and return predictions as a table
begin
  --sys.rqScriptDrop('Example3')
  sys.rqScriptCreate('Example3',
    'function(dat, datastore_name) {
       ore.load(datastore_name)
       prd <- predict(mod, newdata=dat)
       prd[as.integer(rownames(prd))] <- prd
       res <- cbind(dat, PRED = prd)
       res}');
end;
/

select *
  from table(rqTableEval(
    cursor(select ARRDELAY, DISTANCE, DEPDELAY
             from ontime_s
            where year = 2003
              and month = 5
              and dayofmonth = 2),
    cursor(select 1 "ore.connect",
                  'myDatastore' as "datastore_name" from dual),
    'select ARRDELAY, DISTANCE, DEPDELAY, 1 PRED from ontime_s',
    'Example3'))
 order by 1, 2, 3;

Automatic Data Preparation

Supported transformations

• Binning: applies a supervised transformation to numeric data to generate categorical bins

• Normalization: normalizes numeric data to fit required range, e.g., 0..1

• Outlier treatment: removes values that deviate significantly from most other values in the column, which can affect normalization and binning

Transformation “instructions” embedded in model for automatic application during scoring

Can turn off automatic data preparation if user needs more control over preparation stages

Automatically performs transformations required by each algorithm
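The idea of embedding transformation "instructions" in the model can be illustrated with a small stand-alone sketch: the preparation parameters are learned once from the training data, stored on the model object, and re-applied automatically at scoring time. The function and class names, the 2-standard-deviation outlier rule, and the clamping of scored values to 0..1 are assumptions chosen for illustration, not the in-database algorithm.

```python
def fit_prep(values, z=2.0):
    """Learn outlier bounds, then a min-max range from the remaining values."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    lo, hi = mean - z * std, mean + z * std
    kept = [v for v in values if lo <= v <= hi]   # outlier treatment: drop extremes
    return {"min": min(kept), "max": max(kept)}

def apply_prep(prep, value):
    """Normalize a value to 0..1 using the stored instructions."""
    span = (prep["max"] - prep["min"]) or 1.0
    scaled = (value - prep["min"]) / span
    return min(max(scaled, 0.0), 1.0)   # clamp values outside the training range

class PreparedModel:
    """Toy model that carries its own data-preparation instructions."""

    def __init__(self, train_values):
        self.prep = fit_prep(train_values)      # instructions embedded in the model

    def prepare(self, value):
        return apply_prep(self.prep, value)     # same preparation reused at scoring

model = PreparedModel([10, 12, 14, 16, 18, 20, 1000])   # 1000 is an outlier
print(model.prep, model.prepare(15), model.prepare(1000))
```

Because the range is stored with the model, the caller never has to repeat the preparation step by hand when scoring new rows.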

Partition Models

Builds ensemble model with multiple sub-models, one for each data partition

Potentially achieve better accuracy through multiple targeted models

Sub-models managed and used as one

Simplified scoring using top-level model only

Proper sub-model chosen by system based on row of data to be scored

Diagram: in Oracle Database, a table is split on the specified partition column(s) into Partition-1, Partition-2, Partition-3, …, Partition-n. The in-DB algorithm builds Sub-Model-1 through Sub-Model-n, which are combined under a top-level model. New data is scored using the top-level model.

Automates typical machine learning tasks for data scientists
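A partitioned model can be pictured as a top-level object that owns one sub-model per partition value and routes each row to the matching sub-model at scoring time. In the sketch below each sub-model is simply the partition's mean target value; the class name, column names, and that choice of sub-model are hypothetical simplifications, not the in-database implementation.

```python
from collections import defaultdict

class PartitionedModel:
    """Top-level model that dispatches to one sub-model per partition value."""

    def __init__(self, partition_col, target_col):
        self.partition_col = partition_col
        self.target_col = target_col
        self.sub_models = {}

    def fit(self, rows):
        groups = defaultdict(list)
        for r in rows:
            groups[r[self.partition_col]].append(r[self.target_col])
        # each sub-model here is just the mean target of its partition
        self.sub_models = {k: sum(v) / len(v) for k, v in groups.items()}
        return self

    def predict(self, row):
        # the proper sub-model is chosen from the row's partition column
        return self.sub_models[row[self.partition_col]]

rows = [
    {"region": "east", "usage": 10.0},
    {"region": "east", "usage": 20.0},
    {"region": "west", "usage": 100.0},
]
model = PartitionedModel("region", "usage").fit(rows)
print(model.predict({"region": "east"}), model.predict({"region": "west"}))
```

The caller only ever handles `model`; the per-partition sub-models stay an internal detail, which is the simplification the slide describes.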

Core APIs Feature Summary

Feature OML4SQL OML4Py OML4R

Transparency Layer n/a

Parallel, Distributed Algorithms

Embedded Execution n/a

Automated Data Preparation

Automated Text Processing

Automated Partitioned Models

Automated Machine Learning (AutoML)

PGX Integration for Graph Analytics implicit

DML table package transparency n/a

Extensible Algorithm Models

SQL plus two most popular open source languages for machine learning

n/a – not applicable