using the latest features from the splunk machine learning ......machine learning toolkit customer...

42
Using The Latest Features From The Splunk Machine Learning Toolkit To Create Your Own Custom Models Adam J. Oliner | Director of Engineering Manish Sainani | Director of Product September 2017 | Washington, DC

Upload: others

Post on 25-Apr-2020

16 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Using The Latest Features From The Splunk Machine Learning Toolkit To Create Your Own Custom ModelsAdam J. Oliner | Director of EngineeringManish Sainani | Director of Product

September 2017 | Washington, DC

Page 2: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC.

The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2017 Splunk Inc. All rights reserved.

Forward-Looking Statements

THIS SLIDE IS REQUIRED FOR ALL 3 PARTY PRESENTATIONS.

Page 3: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Splunk Machine Learning Toolkit▶ Platform Extensions: ML-SPL, etc.▶ The Assistants: Guided Machine Learning▶ What’s New▶ Demo

• DIY Anomaly Detector

Outline

Page 4: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Splunk Machine Learning Toolkit

platform extensions and guided modeling dashboards

Page 5: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ A process for generalizing from examples▶ Examples

• A, B, … → # (regression)

• A, B, ... → a (classification)

• Xpast → Xfuture (forecasting)

• like with like (clustering)

• |Xpredicted – Xactual| >> 0 (anomaly detection)

Machine Learning

Page 6: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Evaluate

Deploy Clean / Transform

Explore / Visualize

Model

Collect DataMachine Learning Process

Page 7: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Data Gathering And PrepSource: CrowdFlower

Page 8: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Machine Learning Process With Splunk

Pivot, Table UI, SPL

props.conf,transforms.conf,Datamodels,Splunkbase

Splunk Machine Learning Toolkit

Alerts,Dashboards,Reports

Page 9: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Overview Of Machine Learning At Splunk

Rich Ecosystem ofApps & Add-Ons

Splunk PremiumSolutions

MainframeData

RelationalDatabasesMobileForwarders Syslog/

TCPIoTDevices

NetworkWire Data Hadoop

Platform for Operational Intelligence

Page 10: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Assistants: Guide model building, testing, & deployment for common tasks

▶ Showcase: 25+ interactive examples from IT, security, business, and IoT

▶ Algorithms: 25+ standard algorithms plus an extensibility API

▶ SPL ML Commands: New commands to fit, test, and operationalize models

▶ Python for Scientific Computing Library: 300+ open-source algorithms

Splunk Machine Learning Toolkitextends Splunk with new tools and guided modeling

Page 11: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Machine Learning Customer Success

Network OptimizationDetect & Prevent Equipment Failure Security / Fraud Prevention

Prioritize Website Issues and Predict Root Cause

Predict Gaming OutagesFraud Prevention

Machine Learning Consulting Services Analytics App built on ML Toolkit

Prevent Cell Tower FailureOptimize Repair Operations

Entertainment Company

Page 12: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Machine Learning Toolkit Customer Use Cases

Speeding website problem resolution by automatically ranking actions for support engineers

Reducing customer service disruption with early identification of difficult-to-detect network incidents

Minimizing cell tower degradation and downtime with improved issue detection sensitivity

Improving uptime and lowering costs by predicting/preventing cell tower failures and optimizing repair truck rolls

Predicting and averting potential gaming outage conditions with finer-grained detection

Ensuring mobile device security by detecting anomalies in ID authentication

Preventing fraud by Identifying malicious accounts and suspicious activitiesEntertainment Company

Page 13: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Platform Extensions: ML-SPL, etc.

custom search commands for machine learning

Page 14: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Commands (ML-SPL)• fit• apply• summary• listmodels• deletemodel• sample

▶ Macros• regressionstatistics• classificationstatistics• classificationreport• confusionmatrix

• forecastviz• histogram• modvizpredict• splitby(1-5)

▶ Viz• Outliers Chart• Forecast Chart• Scatter Line Chart• Histogram Chart• Downsampled Line Chart• Scatterplot Matrix

SPL, Macros, & VizOh, my!

Page 15: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ A suite of SPL commands specifically for machine learning• modeling• sampling

▶ Most are implemented using modules from the Python for Scientific Computing add-on for Splunk• scikit-learn• numpy• pandas• statsmodels• scipy

ML-SPLWhat is it?

Page 16: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶︎ Fit (i.e., train) a model from search results… | fit <ALGORITHM> <TARGET> from <VARIABLES …>

<PARAMETERS> into <MODEL>

▶︎ Apply a model to obtain predictions from (new) search results… | apply <MODEL>

▶︎ Inspect a model (e.g., display coefficients)| summary <MODEL>

ML-SPL Commands

Page 17: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

… | fit <ALGORITHM> <TARGET> from <VARIABLES><PARAMETERS> into <MODEL>

Examples:

… | fit LinearRegressionsystem_temp from cpu_load fan_rpminto temp_model

… | fit KMeans k=10downloads purchases posts days_active visits_per_dayinto user_behavior_clusters

ML-SPL Commands: fit

optional

Page 18: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ 25+ algorithms OotB• prediction, clustering, forecasting, feature engineering

▶ Extensibility API for 300+ more▶ Pipeline for advanced use cases

… | fit TFIDF message| fit StandardScaler files bytes| fit KMeans message_tfidf_* SS_* k=5| fit PCA message_tfidf_* k=2| …

ML-SPL: Algorithms

Page 19: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

… | apply <MODEL>

Examples:

… | apply temp_model… | apply user_behavior_clusters

ML-SPL Commands: apply

Page 20: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

… | summary <MODEL>

Examples:

… | summary temp_model… | summary user_behavior_clusters

ML-SPL Commands: summary

Page 21: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

| listmodels| deletemodel <MODEL>

ML-SPL Commands

Page 22: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Randomly sample or partition events… | sample <PARAMETERS>

▶ Four modes• Ratio … | sample 0.01• Count … | sample 20• Proportional … | sample proportional=“some_field”• Partition … | sample partitions=10

ML-SPL Commands: sample

Page 23: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Custom search commands!▶ Tabular munging!▶ Jacob Leverich!▶ Shang Cai!

Plug: Machine Learning In SPL With The Machine Learning Toolkit

Tuesday, September 26th @ 2:15pm in Ballroom B

Page 24: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

The Assistantsguided machine learning

Page 25: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Guides you through an analytic• Prepare, fit, validate, and deploy

▶ Automatically generates all the relevant SPL

Guided ML With The Assistants

Page 26: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Assistants: Prepare

Page 27: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Assistants: Fit

Page 28: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Assistants: Validate

Page 29: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Assistants: Deploy

Page 30: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Predict Numeric Fields▶ Predict Categorical Fields▶ Detect Numeric Outliers▶ Detect Categorical Outliers▶ Forecast Time Series▶ Cluster Numeric Events

The Assistants

Page 31: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

What’s Newsince last .conf

Page 32: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Detect Numeric Outliers improvements▶ Preprocessing▶ Model Management▶ ML-SPL extensibility API▶ Spark Support (private beta)▶ New algorithms:

• ACF & PACF• ARIMA

▶ Load Existing Settings is per-user▶ Downsampled Line Chart supports drilldown

What’s New(since .conf 2016)

Page 33: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Detect Numeric Outlierssplit-by support

Page 34: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Detect Numeric Outliersdata distribution viz

Page 35: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ In Predict Numeric, Predict Categorical, and Cluster Numeric assistants

Preprocessingbuild a pipeline of data prep

Page 36: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Assign permissions to models to control access

▶ Manage models via the UI▶ Experiments

• Assistant configurations• May produce 1+ models

Model ManagementRBAC & more

Page 37: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Make more algos available to fit / apply• 300+ in PSC• Custom algorithms

▶ Expose new or different parameters▶ Docs include examples

• Correlation Matrix• Agglomerative Clustering• Support Vector Regressor• Savitzky-Golay Filter

▶ Use in your apps / dashboards / etc.!

ML-SPL Extensibility APIfeaturing: primo documentation

Page 38: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Implementation details!▶ Extensibility API!▶ Alexander Johnson!▶ Zidong Yang!

Plug: Advanced Machine Learning using the Extensible ML API

Wednesday, September 27th @ 4:35pm in Ballroom B

Page 39: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

▶ Use your existing Spark cluster with MLTK• Distributed fit on massive datasets• Apply MLlib models for supported algos

▶ sfit / sapply▶ Contact [email protected]

• What is your use case (e.g., predicting server downtime)?• Why do you want / need Spark (i.e., why isn’t MLTK sufficient)?

Spark Supportprivate beta open now

Page 40: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Demo

Page 41: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

Q&AAdam J. Oliner | Director of Engineering

Manish Sainani | Director of Product

Page 42: Using The Latest Features From The Splunk Machine Learning ......Machine Learning Toolkit Customer Use Cases Speeding website problem resolution by automatically ranking actions for

© 2017 SPLUNK INC.

Don't forget to rate this session in the .conf2017 mobile app

Thank You