Continuous Machine Learning - AI Ukraine’16
Kostiantyn Bokhan, PhD, Project Lead at Samsung R&D Ukraine. Kharkiv, October 2016


Page 1: Continuous Machine Learning

AI Ukraine’16

Kostiantyn Bokhan, PhD
Project Lead at Samsung R&D Ukraine
Kharkiv, October 2016

Page 2: Agenda

● ML dev. workflows

● ML dev. issues

● ML dev. solutions

● Continuous machine learning (CML)

● Aspects of CML

● CML infrastructure

● CML – delivery

Page 3: ML dev. workflows

[Diagram] Stages: Requirements, Planning, Gathering Datasets, Feature design, Model Training, Model Validation, Model Testing

Train framework (Python/R/Matlab)

Application (C++/Java)

Components: Tests, FE, Model tools, Classifier, UI, QA, Market

Page 4: ML dev. workflows

[Diagram] Steps:

● Define the Problem

● Gather Datasets

● Clean Datasets

● Visualize, Explore

● Measure, Evaluate

● Hypothesize, Model

● Deploy

Sub-steps: Data selection, Data transformation, Feature design

Page 5: ML dev. workflows

[Diagram] Training: Datasets (Train data, Test data, labels), Feature Extraction, Training the model, Eval the model, Model features

[Diagram] Predicting: Input Data, Feature Extraction, Model features, Predict Labels
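The training and predicting flows on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature extractor (mean and spread of a numeric sequence), the nearest-centroid model, and the toy data are all hypothetical and do not come from the talk.

```python
import math
from collections import defaultdict


def extract_features(raw):
    """Hypothetical feature extractor: mean and spread of a numeric sequence."""
    return (sum(raw) / len(raw), max(raw) - min(raw))


def train(samples, labels):
    """Fit a nearest-centroid model: one mean feature vector per label."""
    grouped = defaultdict(list)
    for raw, label in zip(samples, labels):
        grouped[label].append(extract_features(raw))
    return {
        label: tuple(sum(col) / len(col) for col in zip(*feats))
        for label, feats in grouped.items()
    }


def predict(model, raw):
    """Predict the label whose centroid is closest to the input's features."""
    feats = extract_features(raw)
    return min(model, key=lambda label: math.dist(feats, model[label]))


def evaluate(model, samples, labels):
    """Return accuracy on held-out test data."""
    hits = sum(predict(model, raw) == label for raw, label in zip(samples, labels))
    return hits / len(labels)


# Toy train/test split with labels (made-up data).
train_x = [[1, 2, 1], [2, 1, 2], [10, 30, 20], [20, 10, 30]]
train_y = ["low", "low", "high", "high"]
test_x = [[1, 1, 2], [25, 15, 20]]
test_y = ["low", "high"]

model = train(train_x, train_y)          # "Training the model"
accuracy = evaluate(model, test_x, test_y)  # "Eval the model"
```

The same `extract_features` function is used in both paths, which mirrors the slide: feature extraction appears once under Training and once under Predicting.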

Page 6: ML dev. issues

[Chart] Data size (Small, Big, Bigger) and Model complexity (Simple, Complex) over time: 1996, 2006, 2016

Page 7: ML dev. issues

Performance

Machine Learning on PC

Page 8: ML dev. issues

Performance

Machine Learning on PC

My other computer is Amazon EC2

Machine Learning on AWS

Page 9: ML dev. issues

Performance

Machine Learning on PC

Machine Learning on dedicated cluster

My other computer is Amazon EC2

Machine Learning on AWS

Page 10: ML dev. issues

Uncertainty

Page 11: ML dev. issues

Variety

Page 12: ML dev. issues

Reliability

Page 13: ML dev. issues

Resource management

Page 14: ML dev. solutions

Performance

Page 15: ML dev. solutions

Performance + Reliability

Mesos

Page 16: ML dev. solutions

Variety

Page 17: ML dev. solutions

Resource management

[Diagram] Resources: Memory, CPU, Storage, GPU

Kernel → Mesos

Init.rd → Marathon

dcos CLI, Singularity, Aurora

MPI, CUDA, Theano
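To make the resource-management idea concrete, here is a hedged Python sketch that builds a Marathon-style app definition requesting CPU, memory, storage, and optionally a GPU. The field names (`id`, `cmd`, `cpus`, `mem`, `disk`, `instances`, `gpus`) follow Marathon's app definition JSON as commonly documented, but check them against the Marathon version in use; the app id and command are hypothetical, and nothing is actually submitted to a cluster.

```python
import json


def marathon_app(app_id, cmd, cpus, mem_mb, disk_mb=0, gpus=0, instances=1):
    """Build a Marathon-style app definition as a plain dict (not submitted anywhere)."""
    app = {
        "id": app_id,
        "cmd": cmd,
        "cpus": cpus,
        "mem": mem_mb,    # megabytes
        "disk": disk_mb,  # megabytes
        "instances": instances,
    }
    if gpus:
        # Assumption: "gpus" is only honored when GPU support is enabled
        # in the Marathon/Mesos installation.
        app["gpus"] = gpus
    return app


# Hypothetical training job asking for 4 CPUs, 8 GB of memory and one GPU.
train_job = marathon_app("/cml/train-model", "python train.py",
                         cpus=4, mem_mb=8192, gpus=1)
payload = json.dumps(train_job, indent=2, sort_keys=True)
```

The resulting `payload` string is what would be sent to the scheduler's REST endpoint or passed to a CLI; how it is submitted depends on the installation.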

Page 18: ML dev. solutions

Source: https://dcos.io/docs/1.8/overview/architecture/

Mesosphere Enterprise DC/OS is an enterprise-grade, datacenter-scale operating system that provides a single platform for running containers, big data, and distributed apps in production.

Services & Applications

Easily deploy and run datacenter-wide app services such as Docker, Cassandra, and Spark, pooled on a single platform.

DC/OS Powered by Apache Mesos

Runtime, tools, and best practices are built in to simplify operations and deliver a self-healing production infrastructure.

Run Anywhere

Bare-metal, virtual, cloud, or hybrid: DC/OS runs on it all. The only requirement is a modern Linux distro; Windows support coming soon :)

Page 19: ML dev. solutions

[Figure]

Source: https://dcos.io/docs/1.8/overview/architecture/

Page 20: ML dev. solutions

Apache Mesos is the open-source distributed systems kernel at the heart of the Mesosphere DC/OS. It abstracts the entire datacenter into a single pool of computing resources, simplifying running distributed systems at scale.

Sources: http://nvidia.com, http://blog.arungupta.me/docker-apache-mesos-marathon/
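The "single pool of computing resources" idea can be illustrated with a toy first-fit placer in Python. This is a deliberately simplified sketch and not how Mesos itself allocates resources; the agent names, resource amounts, and job names are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """One node in the pool with its currently free resources."""
    name: str
    free: dict  # e.g. {"cpus": 8, "mem": 32768, "gpus": 1}
    tasks: list = field(default_factory=list)

    def fits(self, need):
        return all(self.free.get(res, 0) >= amount for res, amount in need.items())

    def launch(self, task_name, need):
        for res, amount in need.items():
            self.free[res] -= amount
        self.tasks.append(task_name)


def schedule(agents, task_name, need):
    """Place a task on the first agent with enough free resources (first-fit)."""
    for agent in agents:
        if agent.fits(need):
            agent.launch(task_name, need)
            return agent.name
    return None  # nothing in the pool can host the task right now


# Hypothetical mixed pool: one virtual node, one bare-metal node with GPUs.
pool = [
    Agent("virtual-1", {"cpus": 4, "mem": 8192, "gpus": 0}),
    Agent("baremetal-gpu-1", {"cpus": 16, "mem": 65536, "gpus": 2}),
]

spark_node = schedule(pool, "spark-job", {"cpus": 2, "mem": 4096})
cuda_node = schedule(pool, "cuda-job", {"cpus": 4, "mem": 16384, "gpus": 1})
too_big = schedule(pool, "huge-job", {"cpus": 64, "mem": 1024})
```

The caller never names a machine; it only states what the job needs, which is the point of treating the datacenter as one pool.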

Page 21: CML Resources

● Virtual nodes

● Bare-metal nodes with GPU

● Test devices

● Hybrid

Page 22: ML dev. solutions

Uncertainty

Continuous Machine Learning

We can’t remove uncertainty, but we can automate routines, especially delivery and integration.

Page 23: Aspects of CML

● Continuous Development

● Continuous Integration

● Continuous Deployment

● Continuous Delivery

● Continuous Everything

Page 24: CML Infrastructure

[Diagram]

● Users: Developers, Data scientists

● Pipelines: Train, Validation, Test; Build, Test, Verification

● Tooling: GIT, Jenkins, Docker registry

● Cluster: Mesos Master with two Standby Masters; Mesos Agents running Spark jobs, batch Docker jobs, and CUDA jobs

● Frameworks: Singularity, Marathon

● Pool of Devices
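The two stage sequences on this slide (Build, Test, Verification and Train, Validation, Test) suggest a gated pipeline where each stage must pass before the next one runs. The Python sketch below shows that pattern in miniature. The stage functions, the context keys, and the accuracy threshold are hypothetical stand-ins; in the setup on the slide this orchestration would presumably be handled by Jenkins and the cluster schedulers rather than by a script like this.

```python
def run_pipeline(stages, context):
    """Run stages in order; stop at the first one that reports failure."""
    completed = []
    for name, step in stages:
        if not step(context):
            return completed, name  # name of the stage that failed
        completed.append(name)
    return completed, None


def build(ctx):
    # Stand-in for building and tagging a training image.
    ctx["image"] = "trainer:" + ctx["commit"]
    return True


def unit_test(ctx):
    return "image" in ctx


def train(ctx):
    # Stand-in for a real training job; the score is supplied by the caller.
    ctx["accuracy"] = ctx["simulated_accuracy"]
    return True


def validate(ctx):
    # Gate: only models that reach the required accuracy move on.
    return ctx["accuracy"] >= ctx["min_accuracy"]


def deploy(ctx):
    ctx["deployed"] = True
    return True


STAGES = [("build", build), ("test", unit_test), ("train", train),
          ("validation", validate), ("deploy", deploy)]

good_ctx = {"commit": "abc123", "simulated_accuracy": 0.93, "min_accuracy": 0.9}
good_done, good_failed = run_pipeline(STAGES, good_ctx)

bad_ctx = {"commit": "abc124", "simulated_accuracy": 0.5, "min_accuracy": 0.9}
bad_done, bad_failed = run_pipeline(STAGES, bad_ctx)
```

In the second run the validation gate fails, so the deploy stage never executes and `bad_ctx` has no `deployed` key.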

Page 25: CML – deploy

Page 26: CML – deploy

Page 27: CML – deploy

Page 28: CML – deploy

Page 29: Questions?

AI Ukraine’16

K. Bokhan, Samsung R&D Institute Ukraine