an open-source streaming machine learning and real-time ... · an open-source streaming machine...

26
1 William Markito @william_markito Fred Melo @fredmelo_br An Open-Source Streaming Machine Learning and Real-Time Analytics Architecture Using an IoT example (incubating) (incubating)

Upload: others

Post on 22-May-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

1

William Markito@william_markito

Fred Melo@fredmelo_br

An Open-Source Streaming Machine Learning and Real-Time Analytics Architecture

Using an IoT example

(incubating) (incubating)

2

Traditional Data Analytics - Limitations

HDFS

Data Lake

Store Analytics

Hard to change Labor intensive

Inefficient

No real-time information ETL based Data-source specific

3

Stream-based, Real-Time Closed-Loop Analytics

HDFSData LakeExpert System /

Machine Learning

In-Memory Real-Time Data

Continuous Learning Continuous

Improvement Continuous Adapting

Data Stream Pipeline

Multiple Data Sources Real-Time Processing Store Everything

4

A Streaming Machine Learning for IoT Example

Sensor Data

Smart System

Learns with HISTORICAL TRENDS

"How were the temperature and vibration sensors reading when the latest failures happened? "

Live data becomes historical over time

Real-Time

Evaluates LIVE DATA“According to historical trends, there’s an 80% chance this equipment would fail in the next 12 hours"

Historical

Predictive Maintenance Scenario

5

Info

Analysis

Look at past trends (for similar input)

Evaluate current input

Score / Predict

Machine Learning

Streaming Machine Learning

6

Info

Analysis

Filter

[ json ]

Machine Learning

Streaming Machine Learning

7

Info

Analysis

Filter Enrich Machine Learning

Streaming Machine Learning

8

Info

Analysis

Filter Enrich Transform Machine Learning

Streaming Machine Learning

9

Info

Analysis

Filter Enrich TransformML Model

Streaming Machine Learning

10

Info

Analysis

Filter Enrich Transform

Transform

ML Model

Streaming Machine Learning

11

In-Memory Data Grid

Front-end

Update Push

ML Model

Streaming Machine Learning

12

Neural Network

In-Memory Data GridReal-time scoring

Train

Supervised Learning ExampleStreaming Machine Learning

13

Ingest Transform SinkSpringXD

Store / Analyze

Fast Data

Distributed Computing

Predict / Machine Learning

Other Sources and Destinations

JMS

A Streaming Machine Learning Reference Architecture

Indoors Localization - Applied Example

14

Trilateration and its limitations

15

Noisy Data

Physical Barriers

Large Overlap Areas

Moving Targets

Innacuracy

Large Overlap Areas

Particle Filters - Calculating the optimum solution

16

Particle Filters - Calculating the optimum solution

17

The Solution

18

1. Capture signal strength 2. Calculate distance from

antenna 3. Trilaterate different sensors

to predict location in real-time 4. Show on a map with live

updates

Architecture Overview

19

Ingest

SpringXD

Groovy

JSON HTTP

+ Distance

Transform Sink

Calculate Device Distance Predict

Location

Spring Boot

Application Platform

GUI

20

• Cache • Configurable through XML, ,Java

• Region • Distributed j.u.Map on steroids • Highly available, redundant

• Member • Locator, Server, Client

• Callbacks • Listener, Writer, AsyncEventListener, Parallel/Serial

Geode Basic Concepts

Introduction to SpringXD

21

Runs as a distributed application or as a single node

Spring XD

22

A stream is composed from modules. Each module is deployed to a container and its channels are bound to the transport.

Demo

24

Why have we selected those projects

• Iterative & Exploratory model

• Web based REPL • Multiple Interpreters

• Apache Geode • Apache Spark • Markdown • Flink • Python…

• In-memory & Persistent • Highly Consistent • Extreme transaction

processing • Thousands of concurrent

clients • Reliable event model

• Productivity • Built-in connectors • Cloud Agnostic • Highly Scalable • Easy to setup • Streams without coding

25

https://github.com/Pivotal-Open-Source-Hub/WifiAnalyticsIoTSource code and detailed instructions available at:

25

William Markito@william_markito

Fred Melo@fredmelo_br

Follow us on GitHub!

26

26

William Markito@william_markito

Fred Melo@fredmelo_br

Implementing a Highly Scalable In-Memory Stock Prediction System with Apache Geode (incubating), R and Spring XD

Room: Tohotom - 14:30, Sep 30 Fred Melo, Pivotal, William Markito, Pivotal