H2O Core Introduction
![Page 1: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/1.jpg)
H2O Core Architecture & Algorithms
Avkash Chauhan
@avkashchauhan
https://www.linkedin.com/in/avkashchauhan
![Page 2: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/2.jpg)
Please visit: http://www.h2o.ai/customers/#
H2O Users List: http://www.h2o.ai/user-list/
![Page 3: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/3.jpg)
H2O Platform(s)
In-Memory, Distributed Machine Learning Algorithms with H2O Flow GUI
H2O AI Open Source Engine Integration with Spark
DEEP WATER
![Page 4: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/4.jpg)
Key features
• Open Source (Apache 2.0)
• All supported ML algorithms are coded by our engineers
• Designed for speed, scalability, and very large datasets
• Same distribution for the open source community & enterprise
• Very active development, with a release every other week
• Vibrant open source community
o https://community.h2o.ai
• Enterprise Support portal
o https://support.h2o.ai
• 70,000 users and 8,000 organizations, and growing daily
![Page 5: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/5.jpg)
Usage: Simple Solution
o Single deployable, compiled Java artifact (jar)
o Ready-to-use, point-and-click Flow interface
o Connection from R and Python once the respective packages are installed
o Use Java and Scala natively, and any other language through the RESTful API
o Deployable models - Binary & Java (POJO & MOJO)
o One-click prediction/scoring engine
![Page 6: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/6.jpg)
Usage: Complex Solution
o Multi-node Deployment
o Spark and Hadoop distributed environment
• Sparkling Water (Spark + H2O)
o Data ingested from various inputs
• S3, HDFS, NFS, JDBC, Object store etc.
• Streaming support in Spark (through Sparkling Water)
o Distributed machine learning for every algorithm in platform
o Prediction service deployment on several machines
![Page 18: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/18.jpg)
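H2O Core
[Diagram: data is loaded from SQL, NFS, and S3 into H2O's distributed in-memory layer, which runs across multiple CPU nodes (shown here under YARN) for model building. The resulting models are exported as Binary, MOJO, or POJO.]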
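The "H2O Distributed In-Memory" model-building idea can be illustrated with a small map/reduce sketch. This is not H2O's code: `Stats`, `split_rows`, `map_chunk`, and `fit` are hypothetical names, the "nodes" are plain Python lists, and the model is a one-feature least-squares line chosen only because its sufficient statistics merge cleanly.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Row = Tuple[float, float]  # (x, y)


@dataclass
class Stats:
    """Sufficient statistics for a one-feature least-squares fit."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def merge(self, other: "Stats") -> "Stats":
        # Reduce step: partial results from two chunks simply add up.
        return Stats(self.n + other.n, self.sx + other.sx, self.sy + other.sy,
                     self.sxx + other.sxx, self.sxy + other.sxy)


def split_rows(rows: Sequence[Row], n_nodes: int) -> List[List[Row]]:
    """Spread rows round-robin over n_nodes in-memory chunks (stand-ins for nodes)."""
    return [list(rows[i::n_nodes]) for i in range(n_nodes)]


def map_chunk(chunk: Sequence[Row]) -> Stats:
    """Map step: each node scans only its own rows."""
    s = Stats()
    for x, y in chunk:
        s.n += 1
        s.sx += x
        s.sy += y
        s.sxx += x * x
        s.sxy += x * y
    return s


def fit(rows: Sequence[Row], n_nodes: int = 3) -> Tuple[float, float]:
    """Fit y = slope * x + intercept from merged per-chunk statistics."""
    total = Stats()
    for chunk in split_rows(rows, n_nodes):
        total = total.merge(map_chunk(chunk))
    denom = total.n * total.sxx - total.sx ** 2
    slope = (total.n * total.sxy - total.sx * total.sy) / denom
    intercept = (total.sy - slope * total.sx) / total.n
    return slope, intercept


if __name__ == "__main__":
    data = [(float(x), 3.0 * x + 2.0) for x in range(10)]
    print(fit(data, n_nodes=3))
```

The point of the sketch is that the fitted line does not depend on how many chunks the rows are split into, which is what makes this style of computation easy to spread over several machines.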
![Page 27: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/27.jpg)
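H2O Clients and H2O Cluster
[Diagram: H2O clients on a local machine talk to the H2O cluster over REST/JSON. The cluster spans JVM 1, JVM 2, ... JVM N, and a data frame with columns x1, x2, x3, ..., xp, y is spread across those JVMs.]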
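The client-to-cluster REST/JSON link can be sketched with nothing but the Python standard library. Assumptions to verify against the H2O REST API reference: the cluster status endpoint `/3/Cloud`, the default port 54321, and the response field names `cloud_name`, `cloud_size`, and `cloud_healthy`. The helper names and the sample payload are hypothetical.

```python
import json
from urllib.parse import urlunsplit
from urllib.request import urlopen

# Hypothetical payload, trimmed to the fields this sketch reads.
SAMPLE_RESPONSE = '{"cloud_name": "demo", "cloud_size": 3, "cloud_healthy": true}'


def cloud_url(host: str = "localhost", port: int = 54321) -> str:
    """Build the URL of the (assumed) cluster status endpoint."""
    return urlunsplit(("http", f"{host}:{port}", "/3/Cloud", "", ""))


def summarize_cloud(payload: str) -> dict:
    """Pick a few fields out of the JSON document returned by the cluster."""
    doc = json.loads(payload)
    return {
        "name": doc.get("cloud_name"),
        "size": doc.get("cloud_size"),
        "healthy": doc.get("cloud_healthy"),
    }


def fetch_cloud_status(host: str = "localhost", port: int = 54321,
                       timeout: float = 5.0) -> dict:
    """Query a running cluster; needs a reachable H2O node to succeed."""
    with urlopen(cloud_url(host, port), timeout=timeout) as resp:
        return summarize_cloud(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration using the sample payload instead of a live cluster.
    print(cloud_url())
    print(summarize_cloud(SAMPLE_RESPONSE))
```

Any language with an HTTP client and a JSON parser could follow the same pattern, which is how non-JVM clients reach the cluster.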
![Page 28: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/28.jpg)
Current Algorithm Overview
Statistical Analysis
• Linear Models (GLM)
• Naïve Bayes
Ensembles
• Random Forest
• Distributed Trees
• Gradient Boosting Machine
• Stacking / Super Learner
Deep Neural Networks
• MLP
• Autoencoder
• Anomaly Detection
• Deep Features
• CNN, RNN (Deep Water)
Clustering
• K-Means (Auto-K)
Dimension Reduction
• Principal Component Analysis
• Generalized Low Rank Models
Word Embedding
• Word2Vec
Time Series
• iSAX
Machine Learning Tuning
• Hyperparameter Search
• Early Stopping
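The two tuning items, hyperparameter search and early stopping, can be combined in a toy sketch. Everything here is an assumption for illustration: the "model" is a single weight trained by gradient descent toward 3.0, and `stopping_rounds` and `tol` only loosely echo the idea of stopping when a metric stops improving. H2O's own stopping rule and grid search API are not reproduced.

```python
from typing import Dict, List, Tuple


def train_with_early_stopping(lr: float, max_iters: int = 200,
                              stopping_rounds: int = 3,
                              tol: float = 1e-6) -> Tuple[float, int]:
    """Minimize (w - 3)^2 by gradient descent; stop once the loss stalls.

    Returns (best loss seen, number of iterations actually run).
    """
    w = 0.0
    best = float("inf")
    stale = 0
    iters = 0
    for iters in range(1, max_iters + 1):
        w -= lr * 2.0 * (w - 3.0)      # gradient step
        loss = (w - 3.0) ** 2          # stand-in for a validation metric
        if best - loss > tol:
            best = loss                # meaningful improvement: reset the counter
            stale = 0
        else:
            stale += 1
            if stale >= stopping_rounds:
                break                  # no improvement for several rounds
    return best, iters


def grid_search(learning_rates: List[float]) -> Dict[str, float]:
    """Try every candidate learning rate and keep the best result."""
    results = []
    for lr in learning_rates:
        loss, iters = train_with_early_stopping(lr)
        results.append({"lr": lr, "loss": loss, "iters": iters})
    # Prefer lower loss, then fewer iterations.
    return min(results, key=lambda r: (r["loss"], r["iters"]))


if __name__ == "__main__":
    print(grid_search([0.01, 0.1, 0.5]))
```

Early stopping keeps each candidate cheap, so the search can afford to cover more of the grid in the same time budget.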
![Page 29: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/29.jpg)
H2O Flow
![Page 30: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/30.jpg)
H2O R Interface
![Page 31: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/31.jpg)
H2O Python Interface
![Page 32: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/32.jpg)
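Deployment Code
[Diagram: the H2O Core pipeline again (data from SQL, NFS, and S3; distributed in-memory model building on CPU nodes under YARN), ending in the models that are exported for deployment.]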
![Page 33: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/33.jpg)
Deployment Code: Plain Old Java Object (POJO)
POJO
![Page 34: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/34.jpg)
Current Algorithm Overview
Statistical Analysis
• Linear Models (GLM)
• Naïve Bayes
Ensembles
• Random Forest
• Distributed Trees
• Gradient Boosting Machine
• R Package - Stacking / Super Learner
Deep Neural Networks
• Multi-layer Feed-Forward Neural Network
• Auto-encoder
• Anomaly Detection
Clustering
• K-Means
Dimension Reduction
• Principal Component Analysis
• Generalized Low Rank Models
Solvers & Optimization
• Generalized ADMM Solver
• L-BFGS (Quasi Newton Method)
• Ordinary Least-Square Solver
• Stochastic Gradient Descent
Data Munging
• Scalable Data Frames
• Sort, Slice, Log Transform
• Data.table (1B rows groupBy record)
Text Processing
• Word2Vec
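Of the solvers listed under Solvers & Optimization, stochastic gradient descent is the simplest to show in a few lines. The sketch below is a generic textbook version for a one-feature linear model, not H2O's solver; the function name, learning rate, epoch count, and seed are all assumptions chosen for the example.

```python
import random
from typing import List, Sequence, Tuple


def sgd_linear_fit(data: Sequence[Tuple[float, float]], lr: float = 0.1,
                   epochs: int = 300, seed: int = 0) -> Tuple[float, float]:
    """Fit y = w * x + b by updating on one row at a time.

    Uses the squared-error loss 0.5 * (prediction - y)^2 per row.
    """
    rng = random.Random(seed)          # fixed seed keeps the run repeatable
    rows: List[Tuple[float, float]] = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(rows)              # visit rows in a new random order each epoch
        for x, y in rows:
            err = (w * x + b) - y      # prediction error on this single row
            w -= lr * err * x          # gradient of the loss with respect to w
            b -= lr * err              # gradient of the loss with respect to b
    return w, b


if __name__ == "__main__":
    points = [(i / 20, 2.0 * (i / 20) + 1.0) for i in range(20)]
    print(sgd_linear_fit(points))
```

On noise-free data like this the weights settle near the generating slope and intercept; on real data the learning rate usually needs tuning or a decay schedule.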
![Page 35: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/35.jpg)
What is new – Driverless AI
• https://techcrunch.com/2017/07/06/h2o-ais-driverless-ai-automates-machine-learning-for-businesses/
![Page 36: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/36.jpg)
Helpful resources
• Docs
o http://docs.h2o.ai/h2o/latest-stable/index.html
• H2O User Guide
o http://docs.h2o.ai/h2o/latest-stable/h2o-docs/index.html
• Source Code
o https://github.com/h2oai/
o https://github.com/h2oai/h2o-3
o https://github.com/h2oai/sparkling-water
o https://github.com/h2oai/deepwater
• Meetup content
o https://github.com/h2oai/h2o-meetups
• Tutorials
o https://github.com/h2oai/h2o-tutorials
![Page 37: H2O Core Introduction](https://reader031.vdocuments.site/reader031/viewer/2022021421/5a669b457f8b9a494c8b4b95/html5/thumbnails/37.jpg)
Thank you so much!!
ありがとうございました (Thank you very much)