TRANSCRIPT
How Deep Learning Will Make Us More Human Again
Arno Candel, PhD, CTO, H2O.ai, @ArnoCandel
AI Frontiers, Santa Clara
Jan 12, 2017
H2O.ai Machine Intelligence
Software Product: H2O - AI for Business Transformation
• Scalable and Distributed Data Science and Machine Learning: Deep Learning, Gradient Boosting, Random Forest, Generalized Linear Modeling, K-Means Clustering, PCA, GLRM, …
• Apache v2 Open Source (github.com/h2oai)
H2O is Easy to Use and Deploy
• h2o.ai/download and run anywhere, immediately
• Client APIs: R, Python, Java, Scala, REST, Flow GUI
• Spark (cf. Sparkling Water), Hadoop, Standalone
• Auto-generated Java/C++ Scoring Code
H2O.ai - Makers of H2O
Powerful, Scalable Techniques for Deep Learning and AI
Win your copy at our booth!
Dec 2016 - brand new!
H2O Book - Written by the Community
User Based Insurance
“H2O is an enabler in how people are thinking about data.”
“We have many plans to use H2O across the different business units.”
Digital Marketing - Campaigns
“H2O gave us the capability to do Big Modeling. There is no limit to scaling in H2O.”
“Working with the H2O team has been amazing.”
“The business value that we have gained from advanced analytics is enormous.”
Matching TV Watching Behavior with Buying Behavior
“Unlike other systems where I had to buy the whole package and just use 10-20%, I can customize H2O to suit my needs.”
“I am a big fan of open source. H2O is the best fit in terms of cost as well as ease of use and scalability and usability.”
Insurance - Risk Assessment
“Predictive analytics is the differentiator for insurance companies going forward in the next couple of decades.”
“Advanced analytics was one of the key investments that we decided to make.”
Fintech - Fraud/Risk/Churn/etc.
“H2O is a great solution because it's designed to be enterprise ready and can operate on very large datasets.”
“H2O has been a one-stop shop that helps us do all our modeling in one framework.”
“H2O is the best solution to be able to iterate very quickly on large datasets and produce meaningful models.”
High Level Architecture of H2O
HDFS
S3
NFS
Distributed In-Memory
Parallel Parser
Lossless Compression
H2O Compute Engine
Production Scoring Environment
Exploratory & Descriptive Analysis
Feature Engineering & Selection
Supervised & Unsupervised Modeling
Model Evaluation & Selection
Predict
Data & Model Storage
Model Export: Standalone Scoring Code
C/C++/Java R/Py/etc.
Data Prep Export: Plain Old Java Object
Local
SQL
LDAP Kerberos SSL HTTPS
HTTP
Native APIs: Java, Scala
REST APIs: R, Python, Flow, JavaScript, Java
library(h2o)
h2o.init()
h2o.deeplearning(x=1:4, y=5, as.h2o(iris))
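import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator
h2o.init()
# iris is assumed to be an H2OFrame with four feature columns followed by "Species"
dl = H2ODeepLearningEstimator()
# Python column indices are 0-based, so 0..3 matches x=1:4 in the R call
dl.train(x=list(range(4)), y="Species", training_frame=iris)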
import _root_.hex.deeplearning.DeepLearning
import _root_.hex.deeplearning.DeepLearningParameters
val dlParams = new DeepLearningParameters()
dlParams._train = iris.hex
dlParams._response_column = 'Species
val dl = new DeepLearning(dlParams)
val dlModel = dl.trainModel.get
All heavy lifting is done by the backend!
Built-in interactive GUI and notebook - no coding needed!
Deep Water Brings State-Of-The-Art Deep Learning on GPUs to H2O
H2O Deep Learning: simple multi-layer networks, CPUs
• Limited to business analytics, statistical models (CSV data)
• 1-5 layers, MBs/GBs of data
H2O Deep Water: arbitrary networks, CPUs or GPUs
• Large networks for big data (e.g. image 1000x1000x3 -> 3M inputs per observation)
• 1-1000 layers, GBs/TBs of data
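The "3M inputs per observation" figure follows directly from the image dimensions. The sketch below reproduces that arithmetic and, as an illustration only, shows how quickly the weight count grows if such an input is fed into a single fully connected layer. The function names and the layer width of 100 units are assumptions made for this example, not values from the talk.

```python
def input_count(width: int, height: int, channels: int) -> int:
    """Number of input values for one image, assuming every pixel
    channel is fed to the network as a separate input."""
    return width * height * channels


def dense_weight_count(inputs: int, units: int) -> int:
    """Weights in one fully connected layer mapping `inputs` to `units`
    (biases not counted). `units` is a hypothetical layer width."""
    return inputs * units


inputs = input_count(1000, 1000, 3)
first_layer_weights = dense_weight_count(inputs, 100)

print(inputs)               # 3000000
print(first_layer_weights)  # 300000000
```

Even a narrow first layer on raw pixels needs hundreds of millions of weights, which is one reason image-scale problems are usually associated with GPUs and specialized network architectures.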
Open-Source - Leverage Community Code, Data and Models
World’s Best Image Classifier (Google + Microsoft, Aug 2016)
https://research.googleblog.com/2016/08/improving-inception-and-image.html
open-source implementation
H2O takes mxnet graph definition as input
Build your own models with Deep Water Today!
https://github.com/h2oai/h2o-3/blob/master/h2o-py/tests/testdir_algos/deepwater/pyunit_inception_resnet_v2_deepwater.py
Deep Water - Easiest To Use GPU Deep Learning Ever!
Yesterday: Small Data (<GB)
Today: Big Data (TeraBytes, ExaBytes)
Data + Skills are good for business
Data + Machine Learning ARE the business
Things are Changing Quickly
CEO: “We will transform our business with AI”
Management: “Hire someone to give us AI”
Senior Data Scientist: “I should look into AI”
Junior Data Scientist: “I use TensorFlow all the time”
High School Kid: “I did my internship on Deep Learning”
Average Joe: “I want a self-driving car (and keep my job)”
Stanford Professors: “focus on interpretability, start with simple models!”
The Hype and Reality of AI
Who Does the Work and on What Infrastructure?
Which one for Development vs Production?
Cloud? Which? On Premise?
Data Lake? Micro-Services?
When is the Model Good Enough?
Crowdsourcing? Trust a Genius? Internal Bake-Off?
What problem are you solving in the first place?
What problem should/could you be solving instead?
What can you learn from the model?
How can you improve the models? More, better data?
How can you characterize the model?
Do you need AI, Deep Learning or just a simple model?
Back to the Drawing Board!
Gradient Boosting Machine
Generalized Linear Modeling
Deep Learning
Distributed Random Forest
Do you need AI, Deep Learning or just a Simple Model?
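One way to approach the "simple model or Deep Learning?" question is to score the simplest possible baseline on held-out data and ask whether a more complex model improves on it enough to justify the extra cost. The following is a minimal sketch of that idea, not a method from the talk: the helper names, the toy labels, the stand-in predictions and the `min_gain` threshold are all hypothetical.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


def majority_baseline(train_labels, n):
    """Simplest possible model: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n


def worth_the_complexity(simple_acc, complex_acc, min_gain=0.02):
    """Hypothetical rule: prefer the complex model only if it beats the
    simple one by at least `min_gain` on held-out data."""
    return complex_acc - simple_acc >= min_gain


# Toy data, invented for illustration only.
train_labels = ["no", "no", "no", "yes", "no", "yes"]
holdout_labels = ["no", "yes", "no", "no", "yes"]
# Stand-in for the held-out predictions of a more complex model.
complex_predictions = ["no", "yes", "no", "no", "no"]

simple_acc = accuracy(majority_baseline(train_labels, len(holdout_labels)), holdout_labels)
complex_acc = accuracy(complex_predictions, holdout_labels)

print(simple_acc, complex_acc, worth_the_complexity(simple_acc, complex_acc))
```

In practice the baseline would more likely be a generalized linear model than a majority vote, and the threshold would depend on what a point of accuracy is worth to the business.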