H2O Advancements - Arno Candel

H2O Advancements Arno Candel, PhD Chief Architect, Physicist & Hacker, H2O.ai @ArnoCandel July 18 2016

Upload: jo-fai-chow

Post on 21-Apr-2017

TRANSCRIPT

Page 1: H2O Advancements - Arno Candel

H2O Advancements
Arno Candel, PhD

Chief Architect, Physicist & Hacker, H2O.ai
@ArnoCandel
July 18 2016

Page 2: H2O Advancements - Arno Candel

New Enterprise Features

Auth & Security: LDAP / Kerberos / HTTPS / Encryption for max. compliance, IPv6

Semi-Supervised: Pre-train Deep Learning model on unlabeled data, then fine-tune

Large POJOs: Productionize large Java models (multi-GB source code)

Hyperparameter Tuning: Automatically tunes model parameters for the desired metric, with convergence-based early stopping for models and search
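
A minimal sketch of what convergence-based early stopping could look like, assuming a lower-is-better metric that is scored repeatedly during training or search. The names `should_stop` and `run_until_converged` and the parameters `rounds` and `tolerance` are illustrative and do not come from the slides. The rule shown (compare the average of the latest `rounds` scores with the average of the `rounds` scores before them) is one simple formulation, not necessarily the exact rule H2O applies.

```python
def should_stop(scores, rounds=3, tolerance=0.01):
    """Return True once a lower-is-better metric has stopped improving.

    Compares the mean of the latest `rounds` scores with the mean of the
    `rounds` scores before them and stops when the relative improvement
    falls below `tolerance`. (Illustrative rule, not H2O's exact one.)
    """
    if len(scores) < 2 * rounds:
        return False  # not enough history to compare two windows
    recent = sum(scores[-rounds:]) / rounds
    previous = sum(scores[-2 * rounds:-rounds]) / rounds
    if previous == 0:
        return True  # already perfect, nothing left to improve
    improvement = (previous - recent) / abs(previous)
    return improvement < tolerance


def run_until_converged(score_stream, rounds=3, tolerance=0.01):
    """Consume scores one by one and stop as soon as they converge."""
    history = []
    for score in score_stream:
        history.append(score)
        if should_stop(history, rounds, tolerance):
            break
    return history


# Hypothetical validation losses, one per scoring event.
losses = [1.0, 0.8, 0.6, 0.5, 0.45, 0.44, 0.44, 0.44, 0.44, 0.44, 0.44, 0.44]
history = run_until_converged(losses)
print(len(history), "of", len(losses), "scoring events were needed")
```

The same check can be applied at two levels, which is how "for models and search" is read here: to the scoring history of a single model (stop adding trees or epochs) and to the sequence of best scores found so far in a parameter search (stop building more models).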

Steam: Platform for Data Products - next-gen product

Sparkling Water 2.0: The killer app for the latest version of Apache Spark

Advanced Munging: Bringing R's data.table to H2O - scalable, fast and distributed

Mission-Critical Capabilities for Enterprise Production Use

Page 3: H2O Advancements - Arno Candel

New Features for H2O Tree Algorithms: GBM + DRF

Highest Accuracy for Smarter Applications

Optimal missing value handling: Missing data has meaning and is rarely missing at random. Optimal splits are found taking missing values into account.

Quantile-based histograms: Finds optimal split points for data with outliers, e.g. -99999, 0, 1, 2, 3, 4, 5
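
To illustrate why quantile-based histograms help with outliers, here is a small sketch using the example values from the slide. The helper names `uniform_edges`, `quantile_edges` and `occupied_bins` are made up for this illustration, and this is not H2O's implementation. It only shows that equal-width bins put all the inliers into a single bin, while quantile-based bin edges fall between them and so offer useful candidate split points.

```python
import bisect
import statistics

values = [-99999, 0, 1, 2, 3, 4, 5]  # example from the slide


def uniform_edges(data, bins):
    """Interior bin edges for equal-width bins between min and max."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(1, bins)]


def quantile_edges(data, bins):
    """Interior bin edges placed at the empirical quantiles of the data."""
    return statistics.quantiles(data, n=bins, method="inclusive")


def occupied_bins(data, edges):
    """Number of distinct bins that contain at least one value."""
    return len({bisect.bisect_right(edges, x) for x in data})


uniform = uniform_edges(values, 4)
quantile = quantile_edges(values, 4)

print("uniform edges: ", uniform, "->", occupied_bins(values, uniform), "bins used")
print("quantile edges:", quantile, "->", occupied_bins(values, quantile), "bins used")
```

With equal-width bins the single outlier stretches the range so far that every candidate split lies between -99999 and 0, so the values 0 to 5 cannot be separated from each other. The quantile edges land among the inliers instead.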

New Algorithm: Extra Trees Classifier (pick best among random split points)

Robust Regression: Huber loss for GBM. Higher accuracy for models on data with outliers: quadratic loss for inliers, linear loss for outliers
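
The "quadratic loss for inliers, linear loss for outliers" description matches the standard Huber loss. A minimal sketch follows; the threshold name `delta` and its default value are chosen here for illustration, since the slide does not say how the boundary between inliers and outliers is picked.

```python
def squared_loss(residual):
    """Ordinary squared-error loss (with the conventional 1/2 factor)."""
    return 0.5 * residual * residual


def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for |residual| <= delta, linear beyond it.

    The two branches meet at |residual| == delta, so the loss is
    continuous and outliers contribute only linearly.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


# Hypothetical residuals: two inliers and one large outlier.
residuals = [0.5, -1.0, 100.0]
print("squared:", sum(squared_loss(r) for r in residuals))
print("huber:  ", sum(huber_loss(r) for r in residuals))
```

Under squared loss the single outlier dominates the total, so the model is pulled toward it. Under Huber loss its contribution grows only linearly, which is the robustness the slide refers to.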

More tuning parameters for GBM:
col_sample_rate_change_per_level - H2O exclusive
learn_rate_annealing - faster training
min_split_improvement - avoids overfitting
max_abs_leafnode_pred - avoids overfitting
sample_rate_per_class - for imbalanced datasets
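
As a rough illustration of `learn_rate_annealing` from the list above, the sketch below assumes the parameter multiplies the learning rate by a constant factor after each tree. The slide itself only says it gives faster training, so this behaviour is an assumption, and the helper name `annealed_learn_rates` is hypothetical.

```python
def annealed_learn_rates(learn_rate, annealing, ntrees):
    """Learning rate applied to each tree, assuming geometric decay:
    tree k (0-based) uses learn_rate * annealing ** k."""
    return [learn_rate * annealing ** k for k in range(ntrees)]


# Hypothetical settings: start fairly high, shrink by 1% per tree.
rates = annealed_learn_rates(learn_rate=0.1, annealing=0.99, ntrees=200)
print("first tree:", rates[0])
print("last tree: ", rates[-1])
```

Under this assumption, early trees take large steps and later trees make progressively finer corrections. That would let training start with a higher learning rate than would otherwise be safe.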

Page 4: H2O Advancements - Arno Candel

Integration with existing GPU backends: Leverage open-source tools and research for TensorFlow, Caffe, mxnet, Theano, etc.

Scalability and Ease of Use / Deployment of H2O: Distributed training, real-time model inspection. Flow, R, Python, Spark/Scala, Java, REST, POJO, Steam

Convolutional Neural Networks: Image, video, speech recognition, etc.

Recurrent Neural Networks: Sequences, time series, etc. NLP: natural language processing

Hybrid Neural Network Architectures: Speech to text translation, image captioning, scene parsing, etc.

Deep Water: Next-Gen Deep Learning in H2O

Enterprise Deep Learning for Business Transformation

Much more about Deep Learning tomorrow afternoon!