WSO2 Machine Learner - Product Overview
TRANSCRIPT
WSO2 Machine Learner 1.1.0
WSO2 Analytics Platform
WSO2 Analytics Platform uniquely combines simultaneous real-time and batch analysis with predictive analytics to turn data from IoT, mobile and Web apps into actionable insights
WSO2 Analytics Platform
WSO2 Advantages
Highly Pluggable Architecture
Toolboxes for Extensibility
+ Toolboxes = Industry or domain specific analytics
Toolboxes:
• Fraud and Anomaly Detection - Supports fraud and anomaly detection through static rules, Markov chains, and scoring.
• GIS Data Monitoring - Can take any data stream tagged with geographical locations and support visualizations of that data on a map.
• Activity Monitoring - Lets users correlate events related to the same transaction in order to visualize, analyze, and write queries on top of those activities.
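To make the "Markov chains and scoring" idea in the fraud toolbox concrete, here is a minimal Python sketch. It is an illustration only, not the toolbox's actual implementation: the function names, the smoothing constant and the sample event names are all assumptions. It learns first-order transition probabilities from sequences considered normal, then scores a new sequence by its average log-probability, so unusual sequences get lower scores.

```python
import math
from collections import defaultdict


def fit_transitions(sequences, smoothing=1.0):
    """Estimate first-order transition probabilities with additive smoothing."""
    states = sorted({s for seq in sequences for s in seq})
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    probs = {}
    for prev in states:
        total = sum(counts[prev].values()) + smoothing * len(states)
        probs[prev] = {nxt: (counts[prev][nxt] + smoothing) / total for nxt in states}
    return probs


def score(sequence, probs, unseen=1e-6):
    """Average log-probability per transition; lower means more unusual.

    Expects a sequence with at least two events.
    """
    logs = [
        math.log(probs.get(prev, {}).get(nxt, unseen))
        for prev, nxt in zip(sequence, sequence[1:])
    ]
    return sum(logs) / len(logs)


# Hypothetical event names, used only for this example.
normal = [["login", "browse", "pay", "logout"]] * 20
probs = fit_transitions(normal)
typical_score = score(["login", "browse", "pay", "logout"], probs)
suspicious_score = score(["login", "pay", "pay", "pay"], probs)
```

A static rule could then flag any sequence whose score falls below a chosen threshold; where to set that threshold depends on the data.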
Edge Analytics - Mobile and IoT Streams
Event correlation/filtering available at the edge
High Level Languages
• For both batch and real-time, we provide structured, SQL-like query languages.
• No Java programming is required.
• Lowers the adoption entry point.
• Batch analytics relies on Spark SQL.
• Real-time analytics is implemented through Siddhi, WSO2's own solution.
Realtime analytics with Siddhi
• Throttling & Blacklisting users
define stream RequestStream ( correlationID string, serviceID string, userID string, tier string, requestTime long, ... ) ;
define table BlacklistedUserTable(userID string, time long, requestCount long);
from RequestStream[tier=='BRONZE']#window.time(1 min)
select userID, requestTime as time, count(correlationID) as requestCount
group by userID
having requestCount > 5
insert into BlacklistedUserTable ;
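The throttling query above can be read as a sliding one-minute count per user. As a rough illustration in plain Python (not Siddhi, and not how the Siddhi engine works internally), the sketch below keeps the BRONZE requests seen in the last 60 seconds and records a user in a blacklist table once that user's count exceeds 5. The class name, the millisecond timestamps, the dict-based table and the assumption that events arrive in time order are all made up for the example.

```python
from collections import deque

WINDOW_MS = 60_000      # one-minute window, as in #window.time(1 min)
MAX_REQUESTS = 5        # threshold from "having requestCount > 5"


class BronzeThrottle:
    """Hypothetical helper mimicking the query's filter, window and having clause."""

    def __init__(self):
        self.window = deque()     # (requestTime, userID) pairs inside the window
        self.counts = {}          # userID -> requests currently in the window
        self.blacklisted = {}     # userID -> (time, requestCount)

    def on_request(self, user_id, tier, request_time):
        # Filter: only BRONZE tier requests enter the window.
        if tier != "BRONZE":
            return
        # Expire events that are one minute old or older.
        while self.window and self.window[0][0] <= request_time - WINDOW_MS:
            _, old_user = self.window.popleft()
            self.counts[old_user] -= 1
        # Add the new event and update the per-user count.
        self.window.append((request_time, user_id))
        self.counts[user_id] = self.counts.get(user_id, 0) + 1
        # Having clause: record the user once the count exceeds the threshold.
        if self.counts[user_id] > MAX_REQUESTS:
            self.blacklisted[user_id] = (request_time, self.counts[user_id])


throttle = BronzeThrottle()
for i in range(7):                        # 7 BRONZE requests, one per second
    throttle.on_request("alice", "BRONZE", 1_000 * i)
throttle.on_request("bob", "GOLD", 500)   # ignored: not a BRONZE request
```

Here "alice" lands in the blacklist on her sixth request, while "bob" never enters the window because of the tier filter.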
Batch Analytics with Spark SQL
create temporary table product_data using carbonanalytics
options (schema …)
create temporary table products using carbonanalytics
options (schema …)
insert into products select product_name from product_data
group by …
Case Studies
Smart Home
• DEBS (Distributed Event Based Systems) is a premier academic conference, which posts a yearly event processing challenge (http://www.cse.iitb.ac.in/debs2014/?page_id=42)
• Smart Home electricity data: 2000 sensors, 40 houses, 4 billion events
• We posted the fastest single-node solution measured (400K events/sec) and close to one million events/sec distributed throughput.
• The WSO2 CEP based solution was one of the four finalists (with Dresden University of Technology, Fraunhofer Institute, and Imperial College London)
• Only generic solution to become a finalist
Healthcare Data Monitoring
• Lets users search, visualize, and analyze healthcare records (HL7) across 20 hospitals in Italy
• Used in combination with WSO2 ESB
• Custom toolbox tailored to the customer's requirements (to replace an existing system)
Cloud IDE Analytics
• Custom solution created in partnership with Codenvy to bring analytics to the Codenvy management team and its customers
• Developed in less than a month, with a custom plug-in to MongoDB.
• Deployed in the codenvy.com platform.
Additional Customer Use Cases
• Cisco (BAM + CEP) - OEM, Healthcare, Parking Monitoring
• See "Solution patterns based approach to rapidly create IoE solutions across industries", http://us14.wso2con.com/videos/#Coumara-Radja
• Used by a large-scale IoT system provider for use cases including vehicle tracking, smart city, and building monitoring (CEP)
• See "Internet of Big Things: The Story of Pacific Controls", http://us14.wso2con.com/videos/#Sajaad-Chaudry
• Transaction monitoring in a large bank (CEP)
• Knowledge mining and tracking prospective customers through natural language data sources (CEP)
• CEP embedded in edge devices
• See WSO2Con 2013 - Keynote: Emerging Foundations of Next-Generation Business Systems, https://www.youtube.com/watch?v=7CyG3JKUxWw
• Throttling and anomaly detection by a group of telecom companies
WSO2 Machine Learner (Technical Overview)
WSO2 Machine Learner
Overview
o Open source Machine Learning (ML) tool
o Scalable way to perform machine learning
o Visually explore uploaded data sets
o Support for various machine learning algorithms
o Metrics to evaluate and compare built ML models.
o Ability to export ML models
o Extensions for real-time predictions
o REST API exposes all features, i.e. ML jobs are scriptable
Functionality
o Manage and explore your data
o Analyze the data using machine learning algorithms
o Build machine learning models
o Compare and manage generated machine learning models
o Predict using the built models
Manage Data Sets
o Supported data sources
    o CSV/TSV files from local file systems.
    o Files from HDFS.
    o Tables from WSO2 Data Analytics Server
o Supports data set versioning.
    o Version data collected over time from the same data set
    o Generate models from the different versions.
o Manage data sets based on projects and users.
Pre-process & Explore Data
o Find key details from the feature set
o Scatter plots to understand relationships between features
o Supported graphs:
    o Scatter plots, Parallel sets, Trellis charts, Cluster diagram, Histogram
o Missing value handling with mean imputation and discard
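The two missing-value strategies named above can be sketched in a few lines of Python. This is only an illustration of the techniques, not the product's code; the function names and the sample rows are assumptions, and missing values are represented as None.

```python
def mean_impute(rows, column):
    """Replace missing (None) values in one column with the mean of the observed values."""
    observed = [row[column] for row in rows if row[column] is not None]
    mean = sum(observed) / len(observed)
    return [
        {**row, column: mean if row[column] is None else row[column]}
        for row in rows
    ]


def discard_missing(rows, column):
    """Drop every row whose value in the given column is missing."""
    return [row for row in rows if row[column] is not None]


# Made-up sample data with one missing income value.
rows = [
    {"age": 30, "income": 4000.0},
    {"age": 40, "income": None},
    {"age": 50, "income": 6000.0},
]
imputed = mean_impute(rows, "income")    # keeps all three rows
kept = discard_missing(rows, "income")   # keeps only the complete rows
```

Mean imputation preserves the row count at the cost of flattening the column's variance, while discard keeps only observed values but shrinks the data set.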
Analysis with ML Algorithm
o Supports deep learning
o Supports supervised and unsupervised learning.
o Includes algorithms for numerical prediction, classification and clustering.
o Supports anomaly detection algorithm.
o Supports recommendation with the Collaborative Filtering recommendation algorithm
Analysis with ML Algorithm
o Includes algorithms for numerical prediction, classification and clustering.
    Numerical prediction: Linear Regression, Ridge Regression, Lasso Regression
    Classification: Logistic Regression, Naive Bayes, Decision Tree, Random Forest and Support Vector Machines
    Clustering: K-Means
Model Evaluation & Comparison
o Evaluate generated models based on metrics
    o Accuracy
    o Area under ROC curve
    o Confusion Matrix
    o Predicted vs. Actual graphs
    o Feature importance
o Compare models generated from different analyses.
o Set the fraction of data used for training
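For readers unfamiliar with these metrics, the following Python sketch shows how a training fraction, a confusion matrix, accuracy and area under the ROC curve can be computed for a binary classifier. It illustrates the definitions only and is not taken from WSO2 Machine Learner; the function names, the 0.5 decision threshold and the sample labels and scores are assumptions.

```python
import random


def split(rows, train_fraction, seed=0):
    """Shuffle rows and split them into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(actual, predicted):
    """Counts for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(actual, predicted))
    return {
        "tp": sum(1 for a, p in pairs if a == 1 and p == 1),
        "fp": sum(1 for a, p in pairs if a == 0 and p == 1),
        "fn": sum(1 for a, p in pairs if a == 1 and p == 0),
        "tn": sum(1 for a, p in pairs if a == 0 and p == 0),
    }


def accuracy(cm):
    """Fraction of predictions that match the actual label."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


def auc(actual, scores):
    """Area under the ROC curve: share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and model scores for six test records.
actual = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
predicted = [1 if s >= 0.5 else 0 for s in scores]
cm = confusion_matrix(actual, predicted)
train, test = split(list(range(8)), train_fraction=0.75)
```

Comparing two models then comes down to computing the same metrics on the same held-out records for each of them.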
Integration of ML Models
o Models can be used via main transaction flow (WSO2 ESB) or data analysis flow (WSO2 CEP)
o Supports PMML for interoperability.
Deployment Options
o Standalone mode
o With external Spark cluster
o With WSO2 DAS as external Spark cluster
Run Yourself or Let WSO2 Run It for You
Self-Hosted
• Your operations team maintains the deployment with production support from WSO2
WSO2 Managed Cloud
• WSO2 Operations team runs the deployment in a dedicated environment in an AWS datacenter of your choice
• Includes monitoring, backups, patches, updates
• Financially backed SLA on uptime and response time
Thank You!
Download WSO2 Machine Learner at: http://wso2.com/products/machine-learner/