Page 1

© 2018 HAZELCAST | 1

THE LEADING IN-MEMORY COMPUTING PLATFORM

Low-Latency Machine Learning Applications

John DesJardins – CTO N. America

Page 2

AI Techniques Continue to Expand & Evolve

AI

Machine Learning
• Supervised Learning: Classification, Regression
• Unsupervised Learning: Dimensionality Reduction, Clustering
• Reinforcement Learning

Deep Learning

Simulation

Each Innovation Introduces New Challenges in Scalability of Compute & Storage

• Image/Video Processing, Unstructured Data, Fraud & Anomaly Detection
• Predicting Trends, Structured Numeric Data
• Image Processing, Unstructured Data Feature Extraction
• Data Exploration, Feature Extraction
• Images, Video, Audio, Advanced Machine Learning & AI, Time-Series Analysis

Page 3

Online Machine Learning within an In-Memory Platform

Move from Reactive to Pro-Active

Take action before negative impact or ahead of opportunity

Ingest | Classify | Predict | Pro-Act | Enrich

Context | Meaning

Hazelcast In-Memory Platform: Fast Data
• Hazelcast IMDG: Low-Latency Data Grid - Data at Rest
• Hazelcast Jet: Low-Latency Stream Processing - Data in Motion
• Models

Offline: Slow Data, Data at Rest
• NoSQL
• Data Lake
• Batch ML, Model Training
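The stage labels on this slide can be read as a chain of small processing steps. The following is a minimal plain-Python sketch of that idea, not the Hazelcast Jet API. The stage order (Ingest, Enrich, Classify, Predict, Pro-Act), the `CUSTOMERS` lookup table, the thresholds, and every function name are assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical reference data, standing in for the data grid ("data at rest").
CUSTOMERS: Dict[str, dict] = {
    "c1": {"home_country": "US", "avg_amount": 40.0},
    "c2": {"home_country": "DE", "avg_amount": 900.0},
}


def ingest(raw_events: Iterable[dict]) -> Iterator[dict]:
    """Ingest: accept events from a stream and drop malformed ones."""
    for event in raw_events:
        if "customer" in event and "amount" in event:
            yield dict(event)


def enrich(events: Iterable[dict]) -> Iterator[dict]:
    """Enrich: add context by joining each event with reference data."""
    for event in events:
        event["profile"] = CUSTOMERS.get(event["customer"], {})
        yield event


def classify(events: Iterable[dict]) -> Iterator[dict]:
    """Classify: attach meaning (here, a simple label) to each event."""
    for event in events:
        avg = event["profile"].get("avg_amount", 0.0)
        event["label"] = "unusual" if event["amount"] > 5 * avg else "normal"
        yield event


def predict(events: Iterable[dict]) -> Iterator[dict]:
    """Predict: score the event; a toy rule stands in for a trained model."""
    for event in events:
        abroad = event.get("country") != event["profile"].get("home_country")
        event["risk"] = 0.9 if event["label"] == "unusual" and abroad else 0.1
        yield event


def pro_act(events: Iterable[dict], threshold: float = 0.5) -> Iterator[dict]:
    """Pro-Act: choose an action before the negative impact happens."""
    for event in events:
        event["action"] = "hold" if event["risk"] >= threshold else "approve"
        yield event


def run_pipeline(raw_events: Iterable[dict]) -> list:
    """Chain the stages; each event flows through all of them in order."""
    return list(pro_act(predict(classify(enrich(ingest(raw_events))))))
```

Because each stage is a generator, events move through the chain one at a time rather than in batches, which is the property the slide associates with data in motion.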

Page 4

Online Inference & Nearline Machine Learning

Ingest | Classify | Predict | Pro-Act | Enrich

Context | Meaning

• Hazelcast IMDG: Low-Latency Data Grid - Data at Rest
• Hazelcast Jet: Low-Latency Stream Processing - Data in Motion
• Models
• Nearline Re-training

Offline: Slow Data, Data at Rest
• NoSQL
• Data Lake
• Batch ML, Model Training
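One way to picture online inference combined with nearline re-training is a shared model holder: inference always reads the current model, while a re-training step rebuilds the model from recent events and swaps it in without stopping inference. The sketch below is plain Python under that assumption. `ModelHolder`, `train`, `infer`, and `retrain_nearline` are hypothetical names, and the "model" is a toy threshold, not anything described in the slides.

```python
import threading
from statistics import mean, pstdev


class ModelHolder:
    """Holds the current model; swap() replaces it atomically for all readers."""

    def __init__(self, model: dict):
        self._lock = threading.Lock()
        self._model = model
        self.version = 1

    def get(self) -> dict:
        with self._lock:
            return self._model

    def swap(self, model: dict) -> None:
        with self._lock:
            self._model = model
            self.version += 1


def train(amounts: list) -> dict:
    """Toy 'training': learn a threshold of mean + 3 standard deviations."""
    return {"threshold": mean(amounts) + 3 * pstdev(amounts)}


def infer(holder: ModelHolder, amount: float) -> bool:
    """Online inference: flag an amount using whichever model is current."""
    return amount > holder.get()["threshold"]


def retrain_nearline(holder: ModelHolder, recent_amounts: list) -> None:
    """Nearline re-training: rebuild from recent events, then swap the model in."""
    holder.swap(train(list(recent_amounts)))
```

For example, a holder trained on amounts around 10 flags an amount of 50. After `retrain_nearline` runs on recent amounts around 50, the same call no longer flags it.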

Page 5

© 2019 HAZELCAST Confidential & Proprietary | 5

The Hazelcast Difference

Example: Credit Card Processing

Card-Processing Infrastructure: Swipe, Response, Time-Based SLA

Majority of Time Consumed in Network Transit: Milliseconds

Initial Processing: Microseconds

Fraud Detection Algorithm: Tiny Window of Time for Accurate Processing

Payment Evolution (chart: # of Card Terminals over Time): Traditional, eCommerce, iPhones, Square

Business Challenge (chart: # of Transactions over Time)
• Performance at massive scale
• Increase in fraud attempts

Business Opportunity: TIME

Performance at Scale gives time for Multiple Algorithms
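The point that faster processing leaves time for multiple algorithms can be sketched as a time budget: run fraud scorers one after another until the window allowed by the SLA is used up, then answer with the best information available. This is an illustrative Python sketch only. `score_within_budget`, the scorer functions, and the budget value are hypothetical, and the `clock` parameter exists so the behavior can be checked without real waiting.

```python
import time
from typing import Callable, Iterable, Tuple

Scorer = Callable[[dict], float]


def score_within_budget(
    txn: dict,
    scorers: Iterable[Scorer],
    budget_ms: float,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, int]:
    """Run scorers in order until the time budget is used up.

    Returns (highest score seen, number of scorers that ran).
    The deadline is checked before each scorer starts, so a slow scorer
    can still overrun; a real system would also bound each call.
    """
    deadline = clock() + budget_ms / 1000.0
    best, ran = 0.0, 0
    for scorer in scorers:
        if clock() >= deadline:
            break  # out of time: respond with what we have so far
        best = max(best, scorer(txn))
        ran += 1
    return best, ran
```

With a small budget only the first few scorers run. With a larger budget, or with faster scorers, more algorithms contribute to the decision inside the same SLA.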

Page 6

Hazelcast - High Performance Platform

Secure | Manage | Operate
Embeddable | Scalable | Low-Latency
Secure | Resilient | Distributed

• Ingest & Transform: Events, Connectors, Filtering
• Combine: Join, Enrich, Group, Aggregate
• Stream: Windowing, Event-Time Processing
• Compute & Act: Distributed & Parallel Computations

• Live Streams: Kafka, JMS, Sensors, Feeds
• Databases: JDBC, Relational, NoSQL, Change Events
• Files: HDFS, Flat Files, Logs, File watcher
• Applications: Sockets

Mobile | Apps | Commerce | Communities | Social | Analytics | Visualization | Data Lake

• Integrate: APIs, Microservices, Notifications
• Communicate: Serialization, Protocols
• Store/Update: Caching, CRUD Persistence
• Compute: Query, Process, Execute

IMDG: In-Memory Data Grid

Jet: In-Memory Streams

• Scale: Clustering & Cloud, High Density
• Replicate: WAN Replication, Partitioning

Management Center

• Secure: Privacy, Authentication, Authorization
• Available: Rolling Upgrades, Hot Restart

• Secure: Privacy, Authentication, Authorization
• Available: Job Elasticity, Graceful shutdown
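The "Windowing, Event-Time Processing" capability listed above can be illustrated with a tumbling window that groups events by the timestamp carried in the event, not by arrival order. The sketch below is plain Python, not the Jet API. The `(timestamp_ms, key, value)` event shape and the function name are assumptions, and it leaves out watermarks, which a streaming engine would use to decide when a window is complete.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str, float]  # (event timestamp in ms, key, value)


def tumbling_window_sums(
    events: Iterable[Event], window_ms: int
) -> Dict[Tuple[str, int], float]:
    """Sum values per (key, window start).

    The window is chosen from the event's own timestamp, so an event
    that arrives late is still counted in the window it belongs to.
    """
    sums: Dict[Tuple[str, int], float] = defaultdict(float)
    for ts_ms, key, value in events:
        window_start = (ts_ms // window_ms) * window_ms
        sums[(key, window_start)] += value
    return dict(sums)
```

In a fraud setting the key might be a card identifier and the value a transaction amount, giving per-card spend per second that a downstream rule or model can score.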

Page 7

Source
• Messaging (Kafka, JMS)
• IoT, Social
• Custom Connector
• Enterprise Applications
• Hazelcast IMDG
• Filesystem (files, watcher)
• Object Store (HDFS, S3, ..)
• Database (JDBC, Redis, Mongo, )

Output
• Messaging
• Alerts
• Enterprise Applications
• Object Store
• Filesystem
• Database

Hazelcast

Interactive Analytics

Enrichment: Remote IMDG, Database

Transform | Compute | Stream | Combine | Ingest

Inference

In-mem Operational Storage

Join, Enrich, Group, Aggregate

Windowing, Event-Time Processing

Distributed and Parallel Computations

Filter, Clean, Convert

Predict, Score, Apply ML

ML or business rules

Database Events

Publish

In-memory, Subscriber Notifications

Model Service

Inference: Co-located Model

Local IMDG

Page 8

Evolution of Stream Processing

1st Gen (2000s): Hadoop (batch) or Apama (CEP), hard choices
• Distributed Batch Compute (MapReduce): scaled, parallelized, distributed, resilient, but not real-time
or
• Siloed, Real-time (Complex Event Processing): specialized languages, not resilient, not distributed (single instance), hard to scale; fast, but brittle and proprietary

2nd Gen (2014): Spark, hard to manage
• Micro-batch distributed: heavyweight, complex to manage, not elastic, requires large dedicated environments with many moving parts, not Cloud-friendly, not low-latency

3rd Gen (2017): Jet & Flink, flexible & scalable, true "Fast Data"
• Distributed, real-time streaming: highly parallel, true streams, advanced techniques (Directed Acyclic Graph) enabling reliable distributed job execution
• Flexible deployment: Cloud-native, elastic, embeddable, light-weight, supports serverless, fog & edge
• Low-latency streaming, ETL, and fast-batch processing, built on proven data grid
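The Directed Acyclic Graph idea mentioned for the third generation can be sketched in a few lines: a job is a set of processing vertices connected by edges, and each vertex runs once everything upstream of it has produced output. The example below is a single-process Python illustration using the standard-library `graphlib` module (Python 3.9+). It is not how Jet or Flink implement execution, and `run_dag`, the vertex names, and the list-in/list-out vertex shape are assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, Iterable, List, Tuple

Vertex = Callable[[list], list]


def run_dag(
    vertices: Dict[str, Vertex],
    edges: List[Tuple[str, str]],
    source_items: Iterable,
) -> Dict[str, list]:
    """Run each vertex after all of its upstream vertices.

    vertices: name -> function taking a list of items and returning a list.
    edges: (upstream, downstream) pairs.
    Vertices with no upstream edge receive source_items as input.
    """
    upstream: Dict[str, List[str]] = {name: [] for name in vertices}
    for src, dst in edges:
        upstream[dst].append(src)

    outputs: Dict[str, list] = {}
    # static_order() yields every vertex after its predecessors and
    # raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(upstream).static_order():
        if upstream[name]:
            inputs = [item for up in upstream[name] for item in outputs[up]]
        else:
            inputs = list(source_items)
        outputs[name] = vertices[name](inputs)
    return outputs
```

Because the graph has no cycles, the execution order is well defined, which is one reason engines in this generation can plan, parallelize, and restart jobs predictably.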

Page 9

Streaming Performance

* Spark had all performance options, including Tungsten, turned on

Page 10

Give Hazelcast a try

Hazelcast.org – Data Grid

https://github.com/hazelcast

Jet.hazelcast.org – Streaming Analytics

Demos: https://jet.hazelcast.org/demos/

My info:
• John DesJardins
• [email protected] - email
• @johnmdesjardins - Twitter

Page 11

Thank You