H2O Advancements - Arno Candel
H2O Advancements
Arno Candel, PhD
Chief Architect, Physicist & Hacker, H2O.ai
@ArnoCandel, July 18, 2016
New Enterprise Features
Auth & Security: LDAP / Kerberos / HTTPS / Encryption for max. compliance, IPv6
Semi-Supervised: Pre-train Deep Learning model on unlabeled data, then fine-tune
Large POJOs: Productionize large Java models (multi-GB source code)
Hyper-Parameter Tuning: Automatically tunes model parameters for the desired metric, with convergence-based early stopping for models and search
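The idea of convergence-based early stopping for the search can be sketched in a few lines. This is a minimal illustration, not H2O's implementation: the function `random_search`, the toy objective `toy_loss`, and the argument names `max_models`, `stopping_rounds` and `stopping_tolerance` are assumptions chosen for this example.

```python
import random


def random_search(evaluate, space, max_models=50, stopping_rounds=5,
                  stopping_tolerance=1e-3, seed=42):
    """Random search that stops once the best score stops improving."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    history = []  # best score seen after each evaluated model
    for _ in range(max_models):
        # Draw one random combination from the search space.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(params)  # lower is better
        if score < best_score:
            best_params, best_score = params, score
        history.append(best_score)
        # Convergence check: compare the best score now with the best
        # score stopping_rounds models ago; stop on negligible improvement.
        if len(history) > stopping_rounds:
            previous = history[-stopping_rounds - 1]
            if previous - best_score <= stopping_tolerance * abs(previous):
                break
    return best_params, best_score, len(history)


def toy_loss(params):
    """Hypothetical stand-in for a validation metric."""
    return (params["learn_rate"] - 0.1) ** 2 + 0.01 * (params["max_depth"] - 5) ** 2


space = {"learn_rate": [0.01, 0.05, 0.1, 0.2], "max_depth": [3, 5, 7, 9]}
best_params, best_score, n_models = random_search(toy_loss, space)
print(best_params, best_score, n_models)
```

The same stopping rule can be applied inside each model (stop adding trees or epochs once the validation metric converges), which is what "for models and search" refers to.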
Steam: Platform for Data Products - next-gen product
Sparkling Water 2.0: The killer app for the latest version of Apache Spark
Advanced Munging: Bringing R’s data.table to H2O - scalable, fast and distributed
Mission-Critical Capabilities for Enterprise Production Use
New Features for H2O Tree Algorithms: GBM + DRF
Highest Accuracy for Smarter Applications
Optimal missing value handling
Missing data has meaning, is rarely missing at random. Optimal splits are found taking missing values into account.
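One way to take missing values into account when searching for a split is to try sending them to either side and keep whichever direction gives the lower error. The sketch below assumes a single numeric feature, a squared-error criterion, and `None` as the missing marker; the helper names are hypothetical and this is not H2O's actual code.

```python
import math


def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_with_missing(xs, ys):
    """Return (threshold, na_direction, error) with the lowest squared error.

    xs may contain None for missing values; each candidate threshold is
    evaluated twice, once with the missing rows on each side.
    """
    present = [(x, y) for x, y in zip(xs, ys) if x is not None]
    missing_y = [y for x, y in zip(xs, ys) if x is None]
    distinct = sorted({x for x, _ in present})
    best = (None, None, math.inf)
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2  # midpoint between neighbouring values
        left = [y for x, y in present if x < threshold]
        right = [y for x, y in present if x >= threshold]
        for direction in ("left", "right"):
            if direction == "left":
                err = sse(left + missing_y) + sse(right)
            else:
                err = sse(left) + sse(right + missing_y)
            if err < best[2]:
                best = (threshold, direction, err)
    return best


# Rows with a missing feature behave like the high-valued rows here.
xs = [1, 2, 3, 4, None, None]
ys = [0, 0, 10, 10, 10, 10]
threshold, na_direction, error = best_split_with_missing(xs, ys)
print(threshold, na_direction, error)
```

In this toy data the missing rows carry signal (their targets match the right-hand group), so routing them right yields a perfect split, whereas dropping or mean-imputing them would not.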
Quantile-based histograms
Finds optimal split points for data with outliers, e.g. -99999, 0, 1, 2, 3, 4, 5
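The effect is easy to see on the slide's own example. The sketch below compares equal-width bin edges with quantile-based edges; the helper names and the nearest-rank quantile rule are assumptions for illustration, not H2O's implementation.

```python
def uniform_edges(values, n_bins):
    """Interior bin edges of an equal-width histogram over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def quantile_edges(values, n_bins):
    """Interior bin edges placed at (nearest-rank) quantiles of the data."""
    ordered = sorted(values)
    n = len(ordered)
    edges = {ordered[min(n - 1, (i * n) // n_bins)] for i in range(1, n_bins)}
    return sorted(edges)


data = [-99999, 0, 1, 2, 3, 4, 5]

u_edges = uniform_edges(data, 4)
q_edges = quantile_edges(data, 4)

# Candidate split points that fall among the inliers 0..5.
u_useful = [e for e in u_edges if 0 <= e <= 5]
q_useful = [e for e in q_edges if 0 <= e <= 5]

print("uniform edges: ", u_edges, "useful:", u_useful)
print("quantile edges:", q_edges, "useful:", q_useful)
```

With equal-width bins the single outlier stretches the range so far that every candidate split lands in empty space between -99999 and 0; the quantile-based edges all fall among the values 0 to 5, where the useful splits are.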
New Algorithm: ExtraTreesClassifier (pick best among random split points)
Robust Regression: Huber loss for GBM
Higher accuracy for models on data with outliers: quadratic loss for inliers, linear loss for outliers
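For reference, the standard Huber loss can be written as below. The threshold name `delta` is an assumption of this sketch (the slide does not say how the inlier/outlier boundary is chosen), and the code shows only the textbook loss and its gradient, not how H2O wires it into GBM.

```python
def huber_loss(residual, delta):
    """Quadratic for |residual| <= delta, linear beyond that."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)


def huber_gradient(residual, delta):
    """Derivative of the loss w.r.t. the residual: clipped to [-delta, delta]."""
    return max(-delta, min(delta, residual))


delta = 1.0
for r in (0.5, 1.0, 3.0, 100.0):
    print(r, huber_loss(r, delta), huber_gradient(r, delta))
```

Because the gradient is clipped at `delta`, a single extreme residual pulls each boosting step no harder than a moderate one, which is where the robustness to outliers comes from.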
More tuning parameters for GBM
col_sample_rate_change_per_level - H2O exclusive
learn_rate_annealing - faster training
min_split_improvement - avoids overfitting
max_abs_leafnode_pred - avoids overfitting
sample_rate_per_class - for imbalanced datasets
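As an illustration of why learn_rate_annealing can speed up training, the sketch below assumes the learning rate is multiplied by the annealing factor after every tree, so one can start with a larger rate and let it decay. The helper `annealed_learn_rates` is hypothetical and only computes the schedule; it does not call H2O.

```python
def annealed_learn_rates(learn_rate, learn_rate_annealing, ntrees):
    """Learning rate used for each tree, assuming a per-tree multiplicative decay."""
    return [learn_rate * learn_rate_annealing ** i for i in range(ntrees)]


rates = annealed_learn_rates(learn_rate=0.1, learn_rate_annealing=0.99, ntrees=100)
print("first tree:", rates[0])
print("last tree: ", rates[-1])
```

Under this assumption, early trees take large steps and later trees make progressively finer corrections, so fewer trees may be needed than with a small constant rate.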
Integration with existing GPU backends
Leverage open-source tools and research for TensorFlow, Caffe, mxnet, Theano, etc.
Scalability and Ease of Use / Deployment of H2O
Distributed training, real-time model inspection. Flow, R, Python, Spark/Scala, Java, REST, POJO, Steam
Convolutional Neural Networks
Image, video, speech recognition, etc.
Recurrent Neural Networks
Sequences, time series, etc. NLP: natural language processing
Hybrid Neural Network Architectures
Speech-to-text translation, image captioning, scene parsing, etc.
Deep Water: Next-Gen Deep Learning in H2O
Enterprise Deep Learning for Business Transformation
Much more about Deep Learning tomorrow afternoon!