© Copyright 2000-2016 TIBCO Software Inc. Mike Alperin Kai Waehner August, 2016 Machine Learning in Manufacturing: Data Mining to Real-time Processing


Page 1: Data Mining to Real-time Processing - TIBCO Community · JSON Real Time XML Real Time Action Streaming Analytics Aggregate Rules Analytics Correlate Live Datamart Continuous query

© Copyright 2000-2016 TIBCO Software Inc.

Mike Alperin

Kai Waehner

August, 2016

Machine Learning in Manufacturing:

Data Mining to Real-time Processing

Page 2

Agenda

• Introduction to Machine Learning

• Machine Learning in Manufacturing

• Methods and Architecture

• Data Mining - Demo

• Real-time Processing of Streaming Data - Demo

• Q&A

Page 3

Introduction to Machine Learning

Page 4

Machine Learning

Machine learning is a method of data analysis that automates analytical model building. Using algorithms that iteratively learn from data, machine learning allows computers to find hidden insights without being explicitly programmed where to look.

http://www.sas.com

Page 5

Real World Examples of Machine Learning

• Spam Detection

• Search Results + Product Recommendation

• Picture Detection (Friends, Locations, Products)

Machine Learning is already present in your daily life…

Now, every enterprise is beginning to leverage it!

Page 6

Types of Machine Learning

• Supervised – Solve known problems

    • What factors are driving manufacturing defects?

    • Decision Trees, Random Forest, Gradient Boosting Machine

• Unsupervised – Identify new patterns, Detect anomalies

    • Are there new failure modes emerging?

    • Clustering, Principal Components, Neural Networks, Support Vector Machines

• Optimization – Support Decision-making

    • What is the optimum scheduling of operators or equipment maintenance?

    • Genetic Algorithm

Page 7

Decision Tree – Titanic Survival Rate

[Figure: decision tree for Titanic passenger survival; one visible node label is "family size". Source: Wikipedia]

Page 8

Classical Statistics – Fit parameters to a well-defined model

Page 9

Decision Tree – Product Pass / Fail by Process & Equipment

[Figure: decision tree for an Automobile Paint Process, response "Peeling Clearcoat". Visible splits: Clearcoat Bake Temperature (>= 132 C / < 132 C), Sanding Station (1, 2, 4 / 3), Basecoat Thickness. Leaf labels: Bad Product, Good Product.]

Page 10

Decision Tree – Training and Test Data Sets

Page 11

Ensemble Tree Algorithms

• Random Forest, Gradient Boosting Machine (GBM)

• Method – Average many simple trees

    • Sample the data and fit a simple tree

    • Re-sample the data, up-weighting the observations that weren’t fitted well in the previous model

    • Continue adding trees until the fit is good

    • Save all the trees and average them

• Better fit + prediction than single trees
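The loop described on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in the demo: it fits one-split trees ("stumps") to the current residuals, which is the squared-error form of gradient boosting, rather than re-sampling with explicit weights, and it uses all the data in every round. The function names, the learning rate and the data are assumptions made up for the example.

```python
from statistics import mean


def fit_stump(xs, targets):
    """Fit a one-split tree: return (threshold, left_value, right_value)."""
    best = None
    for t in sorted(set(xs))[1:]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, targets) if x < t]
        right = [r for x, r in zip(xs, targets) if x >= t]
        lv, rv = mean(left), mean(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]


def predict(model, x):
    """Sum the damped contributions of all stumps on top of the base value."""
    base, rate, stumps = model
    return base + rate * sum(lv if x < t else rv for t, lv, rv in stumps)


def boost(xs, ys, n_trees, rate=0.3):
    """Add stumps one at a time, each fitted to what the model still gets wrong."""
    model = (mean(ys), rate, [])
    for _ in range(n_trees):
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        model[2].append(fit_stump(xs, residuals))
    return model


def mse(model, xs, ys):
    """Mean squared error of the model on the given points."""
    return mean((y - predict(model, x)) ** 2 for x, y in zip(xs, ys))


# Made-up data: a curved response that a single split cannot capture.
xs = list(range(20))
ys = [(x - 10) ** 2 / 10 for x in xs]

for n in (0, 1, 50):
    print(n, "trees -> training MSE", round(mse(boost(xs, ys, n), xs, ys), 4))
```

The training error falls as trees are added, which is the "continue adding trees until the fit is good" step; in practice the stopping point would be chosen on held-out data rather than on the training set.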

Page 12

Gradient Boosting Machine (GBM)

• Machine Learning

    • Machine learning algorithms + Big Data sets can produce models that accurately fit complex data patterns.

• GBM: Better results than classic statistical methods

    • Ideal for data-mining / predictions for complex processes & products

    • Performs well for variable reduction

    • Can fit complex nonlinear relationships & interactions

    • Scales to Big Data

• Easier to use

    • No need to specify the data model

    • Accommodates continuous and categorical predictors & responses

    • Handles missing data and outliers well

• Simple user interface

    • Model complexity can be hidden from user

    • Results presented with easy-to-understand visualizations

    • Interface easily customized for your use case and users

Page 13

Advanced Analytics and Big Data Tools

Many more ….

Page 14

Machine Learning in Manufacturing

Page 15

Correlate Product or Equipment Results to Process & Supplier Data

• Supplier - Incoming Materials and Components

    • Measured electrical, chemical, physical characteristics

    • batch-id, lot_id

• Manufacturing Process

    • Physical, chemical or electrical measurements

    • WIP / MES: track-in / track-out date, process equipment id, recipe, operator, …

    • Process equipment sensor data

    • Equipment Maintenance logs

    • Defect Inspections

    • Cost of labor, materials, machines and facilities

• Product Quality and Reliability Test

    • Measured product functional and performance characteristics

    • Accelerated life test results

• Product Field Returns

    • Failure mode, unit / batch / lot ID

    • Failure analysis root cause results

    • Warranty / Repair claim, call center and cost – structured & unstructured
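Correlating these sources comes down to joining records on a shared key such as lot_id. The sketch below shows one way to do that with plain dictionaries. Apart from lot_id, equipment id and recipe, which the slide mentions, the field names and all values are invented for illustration; a real pipeline would more likely do this join in a database or a data frame.

```python
def correlate(process_rows, supplier_by_lot, result_by_lot):
    """Join process rows with supplier measurements and test results on lot_id."""
    joined = []
    for row in process_rows:
        lot = row["lot_id"]
        # Keep only lots that have both supplier data and a test result.
        if lot in supplier_by_lot and lot in result_by_lot:
            joined.append({**row, **supplier_by_lot[lot], "result": result_by_lot[lot]})
    return joined


# Hypothetical records, made up for this example.
supplier_by_lot = {"LOT-1": {"resistivity": 1.02}, "LOT-2": {"resistivity": 0.97}}
result_by_lot = {"LOT-1": "pass", "LOT-2": "fail"}
process_rows = [
    {"lot_id": "LOT-1", "equipment_id": "EQ-7", "recipe": "A"},
    {"lot_id": "LOT-2", "equipment_id": "EQ-9", "recipe": "A"},
    {"lot_id": "LOT-3", "equipment_id": "EQ-7", "recipe": "B"},  # no test result yet
]

table = correlate(process_rows, supplier_by_lot, result_by_lot)
for row in table:
    print(row)
```

Each joined row then carries predictors (supplier and process fields) next to the response (the test result), which is the shape a supervised model needs.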

Page 16

Machine Learning to Predict Equipment or Product Fails

• Problem

    • Product & Equipment problems difficult to accurately diagnose for complex manufacturing processes

    • Big Data problem – millions of units, hundreds / thousands of predictors

    • Response: Product, Process or Equipment Fail data

    • Predictors: in-process equipment, process and product measurements or attributes

• Value

    • Being used by customers to find previously undetected problems. Reduces time-to-market and increases profit.

• Method

    • GBM analysis template to identify significant predictors, interactions and nonlinearities

    • For large datasets, hybrid data access used to perform variable reduction step in-DB

    • Simple interface – easy for business analyst to run and interpret results

[Figure: GBM results for semiconductor yield as a function of in-process equipment & product measurements]

Page 17

Method and Architecture

Page 18

Insight – Action Loop

[Figure: loop connecting INSIGHT and ACTION]

Page 19

Fast Data Reference Architecture

[Figure: architecture diagram. Labels recoverable from the diagram:]

• Data sources: Sensor Data, Transactions, Message Bus, Machine Data, Social Data

• Streaming Analytics (Stream Processing, Complex Event Processing): Aggregate, Rules, Analytics, Correlate, Action

• Operational Analytics (Live Monitoring): Live UI for Operations, Continuous query processing, Alerts, Manual action and escalation

• Historical Analysis (Data Discovery): Data Scientists, BI, Data Sheets, Cleansed Data, History, Machine Learning, Big Data

• Data Storage and Internal Data: ERP, MDM, DB, WMS, SOA, API, Enterprise Service Bus, Integration Bus
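The Aggregate, Rules and Action stages of the streaming layer can be illustrated with a small generic sketch. This is plain Python, not StreamBase or any other TIBCO API; the function name `monitor`, the window length, the limit and the readings are all assumptions chosen for illustration.

```python
from collections import deque


def monitor(readings, window=3, limit=100.0):
    """Yield (index, moving_average) whenever the windowed average exceeds limit."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        recent.append(value)              # Aggregate: keep only the last `window` values
        avg = sum(recent) / len(recent)
        if len(recent) == window and avg > limit:  # Rule: fire only on a full window
            yield i, avg                  # Action: hand the alert to a downstream consumer


# Made-up sensor readings arriving one at a time.
readings = [90, 95, 100, 110, 120, 80]
alerts = list(monitor(readings))
for index, avg in alerts:
    print(f"alert at reading {index}: moving average {avg:.1f}")
```

Because `monitor` is a generator, it processes each reading as it arrives and never needs the full history, which is the property that distinguishes stream processing from the batch analysis on the historical side of the diagram.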

Page 20

Demo: Data Mining

Page 21

Demo Outline

• Model Setup – Parameter Selection

• Model Configuration

• Run Model

• Evaluate model: ROC Curve, AUC

• Visualize Model Results

    • Variable Importance

    • Interactions
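For the "Evaluate model" step, the ROC curve and AUC can be computed directly from labels and predicted scores. The sketch below is a minimal pure-Python version with made-up labels and scores; it is not the evaluation code used in the demo. AUC is computed as the fraction of (positive, negative) pairs that the model ranks correctly, which equals the area under the ROC curve.

```python
def roc_points(labels, scores):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(labels, scores):
    """Probability that a random positive scores higher than a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = failed unit, 0 = good unit, scores = predicted failure risk.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print("ROC points:", roc_points(labels, scores))
print("AUC:", round(auc(labels, scores), 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the value gives a single number for comparing model configurations before looking at variable importance.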

Page 22

Demo Screenshot – Model Configuration and Evaluation

Page 23

Demo Screenshot – Model Results

Page 24

Real-time Processing of Streaming Data

Page 25

Predictive Analytics for Manufacturing

Goal: Scrap parts as early as possible to reduce costs in a manufacturing process.

Question: When to scrap a part in Station 1 instead of sending it to Station 2?

[Figure: assembly line with a "Scrap?" decision at Station 1 and at Station 2]

• Cost before Station 1: 9€

• Station 1: 7€

• Station 2: 13€

• Total cost: 29€ (or more)
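One way to read the question on this slide is as an expected-value comparison at the Station 1 decision point. The sketch below assumes that the 7€ and 13€ labels are the costs added at Station 1 and Station 2 respectively, and it introduces a hypothetical value for a good finished part, which the slide does not give. Under those assumptions, sending a part on is worthwhile only while the expected value of a good part exceeds the Station 2 cost.

```python
COST_BEFORE = 9.0      # EUR spent before Station 1 (from the slide)
COST_STATION_1 = 7.0   # EUR added at Station 1 (assumed reading of the 7€ label)
COST_STATION_2 = 13.0  # EUR added at Station 2 (assumed reading of the 13€ label)


def sunk_cost_after_station_1():
    """Money already spent when the Station 1 scrap decision is made."""
    return COST_BEFORE + COST_STATION_1


def scrap_threshold(good_part_value):
    """Failure probability above which scrapping after Station 1 is the cheaper choice.

    Continuing pays (1 - p) * good_part_value - COST_STATION_2 in expectation;
    scrapping pays 0. The two are equal at p = 1 - COST_STATION_2 / good_part_value.
    """
    return max(0.0, 1.0 - COST_STATION_2 / good_part_value)


def should_scrap(p_fail, good_part_value):
    """Decide for one part, given the model's predicted failure probability."""
    return p_fail > scrap_threshold(good_part_value)


GOOD_PART_VALUE = 52.0  # hypothetical value of a good finished part, in EUR

print("sunk cost after Station 1:", sunk_cost_after_station_1())
print("total cost after Station 2:", sunk_cost_after_station_1() + COST_STATION_2)
print("scrap if predicted failure probability exceeds:", scrap_threshold(GOOD_PART_VALUE))
```

The money already spent does not enter the threshold, because it is lost either way; only the remaining Station 2 cost and the value of a good part matter. A part that is going to fail costs 16€ if scrapped after Station 1 and 29€ if it is carried through Station 2.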

Page 26

Fast Data Architecture for Predictive Maintenance

[Figure: the Fast Data architecture with concrete products. Labels recoverable from the diagram:]

• Inputs: CSV Batch, JSON Real Time, XML Real Time

• Streaming Analytics (StreamBase): Aggregate, Rules, Analytics, Correlate, Action

• Operational Analytics (Live Datamart): Live UI for Operations, Continuous query processing, Alerts, Manual action and escalation

• Historical Analysis (Data Scientists): Spotfire, R / TERR, H2O, PMML

• Storage and ingestion: Flume, HDFS, Hadoop (Cloudera), Oracle RDBMS, Avro, Parquet, …

• Other labels: TIBCO Fast Data Platform, Internal Data

Page 27

TIBCO Spotfire with H2O Integration

Data Discovery / Data Mining (“Are parts that repeat a station more likely scrap parts?”)

Page 28

TIBCO Spotfire with H2O Integration

Advanced Analytics (“Scrap parts as early as possible!”)

Page 29

TIBCO Spotfire with H2O Integration

Advanced Analytics (“Scrap parts as early as possible!”)

Page 30

TIBCO StreamBase + R / TERR

Page 31

TIBCO StreamBase + H2O

Page 32

TIBCO StreamBase + PMML

Page 33

TIBCO Live Datamart

Operational Intelligence (“Monitor the manufacturing process and change rules in real time!”)

Live Datamart Desktop Client

Page 34

TIBCO Live Datamart

Operational Intelligence (“Monitor the manufacturing process and change rules in real time!”)

Live Datamart Web API

Page 35

Demo: Real Time Processing (Predictive Scrapping of Parts in an Assembly Line)

Page 36

Learn & Do More: Machine Learning on the TIBCO Community

Wiki page: https://community.tibco.com/wiki/machine-learning-tibco-spotfire-and-streambase

Component Exchange: https://community.tibco.com/exchange

• Data functions

• Accelerators

• Templates

Page 37

Learn & Do More: Accelerators on the TIBCO Community

Wiki page: https://community.tibco.com/wiki/accelerators

Component Exchange Accelerators: https://community.tibco.com/exchange

• Apache Spark

• Intelligent Equipment

• Connected Vehicles

Page 38

Q & A