Post on 05-May-2022


Page 1: Choosing a Deep Learning Library

Choosing a Deep Learning Library

There are a lot of them

JesseBrizzi{.com,@gmail.com,@curalate.com}

Page 2: Choosing a Deep Learning Library

Who am I / What do I do?
● Research Engineer
○ Focus on computer vision and machine learning
○ CS background
● Work on the Image Intelligence Team @Curalate
○ E-commerce SaaS
○ Platform that enables brands to find image-based social media content to repurpose for e-commerce.
○ The Image Intelligence Team owns the entire pipeline, from researching new ML applications through training and development to getting them into production.
○ Intelligent Product Tagging: technology that analyzes an image and uses machine learning to identify the specific products depicted in it.

Page 5: Choosing a Deep Learning Library

What's a Neural Net?

Page 6: Choosing a Deep Learning Library

What's a Neural Net?
● FCN - Fully Connected Network
○ Multilayer perceptron, the fundamental neural net, where each neuron is connected to all neurons in the previous layer of the network.
● CNN - Convolutional Neural Network
○ Neural net that uses convolutional layers, heavily used in computer vision applications.
● RNN - Recurrent Neural Network
○ Neural net that feeds its output back into itself to process the next input, heavily used in natural language processing applications.
● LSTM - Long Short-Term Memory Recurrent Neural Net
○ Fancy RNNs with additional control over what output is passed to the next input.
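The first three network types above can be sketched in a few lines of plain Python. This is only a toy illustration, not how any of the libraries in this deck implement them: the function names, weights, and inputs are made up, and the LSTM's extra control over its output is left out for brevity.

```python
import math

def dense(x, weights, bias):
    """Fully connected layer: every output neuron sees every input value."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def conv1d(x, kernel):
    """Convolutional layer (1-D, no padding): one small kernel slides over the input."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

def rnn_step(x_t, h_prev, w_x, w_h):
    """Recurrent step: the previous output h_prev is fed back in with the next input."""
    return math.tanh(w_x * x_t + w_h * h_prev)

def rnn(sequence, w_x=0.5, w_h=0.5):
    """Run the recurrent step over a whole sequence and return the final state."""
    h = 0.0
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h)
    return h

print(dense([1.0, 2.0], [[1.0, 0.0], [0.5, 0.5]], [0.0, 1.0]))  # [1.0, 2.5]
print(conv1d([1, 2, 3, 4], [1, -1]))                            # [-1, -1, -1]
print(rnn([1.0, 0.0, -1.0]))
```

The dense layer needs one weight per input/output pair, the convolution reuses the same two weights at every position, and the recurrent net carries a single state value from one input to the next.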

Page 7: Choosing a Deep Learning Library

Important Factors
● Academia vs Industry
○ Who is the target audience?
● Community support
○ Pretrained models?
○ Research paper repos?
○ How googleable are bugs and issues?
● Development speed/barriers for entry
○ Abstractions of low-level concepts
○ Documentation quality
○ Supported programming languages
○ The ability to scale

Page 8: Choosing a Deep Learning Library

Important Factors
● Codebase Quality
○ Is the code actively maintained?
● Performance
○ Benchmarks (oldish): https://arxiv.org/pdf/1608.07249.pdf
○ Performance does not scale very well on CPUs. 16-core CPUs are only slightly faster than 4- or 8-core CPUs.
○ GPUs perform much better than many-core CPUs.
○ Scalability across multiple GPUs.
○ Performance is also affected by the design of configuration files and the implementation paradigm.

Page 9: Choosing a Deep Learning Library

Important Factors
● Train to Production pipeline
○ Support for a fast-to-prototype language (Python, R) and for deployment in your production language (Java/Scala, C++, JS, whatever).
○ Training locally, if you have the hardware, vs training on pre-prepared, simplified cloud services.
○ Ability to run on different platforms, ranging from mobile phones to massive server farms.
○ Ability to transfer your work to other libraries.

Page 10: Choosing a Deep Learning Library

Imperative vs Symbolic paradigms
● Dynamic Computation Graphing (Imperative Programming)
○ Graphs are built at run time, which lets you use standard language statements.
○ At run time the system generates the graph structure.
○ Useful when the graph structure needs to change at run time.
○ Makes debugging easy.
● Imperative programs tend to be more flexible
○ It's easier to use native language features.
○ The graph can follow your program's logical control flow.
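A minimal sketch of the dynamic (define-by-run) idea, in plain Python rather than any real library: the `Value` class and `model` function are hypothetical names invented for this illustration. The graph is recorded as operations execute, so an ordinary `for` loop and `if` statement decide its shape on every call.

```python
class Value:
    """A scalar that records the operations applied to it as they run."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents      # nodes this value was computed from
        self.grad_fns = grad_fns    # local derivative rule for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Topologically order the graph that was recorded during the forward pass.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk the recorded graph from output to inputs, accumulating gradients.
        for node in reversed(order):
            for parent, grad_fn in zip(node.parents, node.grad_fns):
                parent.grad += grad_fn(node.grad)


def model(x, w, steps):
    # Ordinary Python control flow decides the graph shape on every call.
    out = x
    for _ in range(steps):
        out = out * w
    if out.data > 10:
        out = out + x
    return out


x, w = Value(2.0), Value(3.0)
y = model(x, w, steps=2)   # 2*3*3 = 18 > 10, so the extra branch runs: y = 18 + 2
y.backward()
print(y.data, x.grad, w.grad)  # prints 20.0 10.0 12.0
```

Calling `model` with `steps=1` produces a smaller graph with no extra addition, without any change to the graph-building code. Because every intermediate `Value` is a normal Python object, it can be inspected with `print` or a debugger, which is the debugging benefit the slide refers to.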

Page 11: Choosing a Deep Learning Library

Imperative vs Symbolic paradigms
● Symbolic programs tend to be more efficient
○ Both in terms of memory and speed.
○ Can safely reuse memory for in-place computation.
○ Can also apply operation folding optimizations.
● Static Computation Graphing (Symbolic Paradigm)
○ Define the computation graph once, execute the graph many times.
○ Can optimize the graph at the start.
○ Good for fixed-size nets (feed-forward, CNN).
● Easier to manage in terms of loading and resources
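The "define once, optimize at the start, execute many times" workflow can be sketched as follows. This is a toy graph in plain Python with hypothetical names (`Node`, `placeholder`, `fold_constants`, `run`), and constant folding is used only as a simple stand-in for the kind of up-front optimization the slide mentions; real libraries apply far more elaborate passes.

```python
class Node:
    """One operation in a static graph. Nothing is computed when it is created."""

    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op          # "input", "const", "add", or "mul"
        self.inputs = inputs
        self.value = value    # only used by "const"
        self.name = name      # only used by "input"


def placeholder(name):
    return Node("input", name=name)

def const(value):
    return Node("const", value=value)

def add(a, b):
    return Node("add", (a, b))

def mul(a, b):
    return Node("mul", (a, b))


OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def fold_constants(node):
    """Optimize once, before execution: precompute subgraphs made only of constants."""
    if node.op in ("input", "const"):
        return node
    inputs = tuple(fold_constants(i) for i in node.inputs)
    if all(i.op == "const" for i in inputs):
        return const(OPS[node.op](*(i.value for i in inputs)))
    return Node(node.op, inputs)


def run(node, feed):
    """Execute the graph for one set of input values."""
    if node.op == "input":
        return feed[node.name]
    if node.op == "const":
        return node.value
    return OPS[node.op](*(run(i, feed) for i in node.inputs))


def count_nodes(node):
    return 1 + sum(count_nodes(i) for i in node.inputs)


# Define the graph once: y = x * (2 + 3) + (4 * 5)
x = placeholder("x")
graph = add(mul(x, add(const(2), const(3))), mul(const(4), const(5)))

# Optimize once, then execute many times with different inputs.
optimized = fold_constants(graph)
print(count_nodes(graph), count_nodes(optimized))          # 9 5
print([run(optimized, {"x": v}) for v in (1, 2, 3)])       # [25, 30, 35]
```

The optimizer can only do this because the whole graph is known before any data flows through it, which is also why the structure cannot easily change from one run to the next.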

Page 12: Choosing a Deep Learning Library

Libraries That People Should Know About

Page 13: Choosing a Deep Learning Library

Caffe
● IMO the first mainstream production-ready lib.
○ High-performance and well-tested C++ codebase.
● One of the first, and largest, model zoos.
● Large community of open source research projects.
● Able to train a net from your data without writing any code.
● Good for feedforward networks, image processing, and fine-tuning pretrained nets.
● Main advantage was being first to market.
● Can convert models to almost any other relevant lib.
UC Berkeley
Watches: 2,241 | Stars: 27,296 | Forks: 16,454
Avg Issue Resolution: 3 days | Open issues: 13%
Symbolic paradigm
Research citations (2014): 10,159
Model zoo

Page 14: Choosing a Deep Learning Library

Caffe (cont.)
● Has bad design choices inherited from its original use case: conventional CNN applications.
● Not good for recurrent networks.
● Does not support automatic differentiation.
● Very verbose layer and network definitions.
○ The graph is treated as a collection of layers, as opposed to nodes of single tensor operations.

Page 15: Choosing a Deep Learning Library

Keras
● A library that sits on top of other DL libs and provides a single, easy-to-use, high-level interface.
● Very modular, minimal, readable, object-oriented code.
● Great for beginners, with great documentation.
● Lacks optimizations.
● Supported backends
○ TensorFlow, Theano, CNTK, MXNet
● Can export your trained models into the backend's format.
● Fork included in TensorFlow's Python library.
● Not as customizable.
Watches: 1,982 | Stars: 38,796 | Forks: 14,799
Avg Issue Resolution: 23 days | Open issues: 24%
Symbolic paradigm
Model zoo

Page 16: Choosing a Deep Learning Library

TensorFlow
● Currently the most popular option.
○ Largest active community.
○ More open source projects and models.
● Google's attempt to build a single framework for everything deep learning related.
○ Built with massive distributed computing in mind (powers G-apps).
○ Has mobile capabilities in the form of TensorFlow Mobile and TensorFlow Lite.
● TensorBoard is amazing for debugging and training.
● TensorFlow Serving for prod deployments (Python).
● A lot of documentation (official and 3rd party).
Google
Watches: 8,606 | Stars: 121,864 | Forks: 72,545
Avg Issue Resolution: 8 days | Open issues: 16%
Symbolic/Dynamic paradigm
Research citations (2016): 6,233
Model zoo | CNN Example Code (Keras R) | CNN Example Code (Keras Py) | CNN Example Code

Page 17: Choosing a Deep Learning Library

TensorFlow (cont.)
● Deep Google Cloud integration.
● Pretty low level (Keras and Sonnet help solve this).
● Most things outside of the core C/Python library are "experimental".
○ None of the APIs outside of the Python API are covered by the API stability promises.
● Biggest issue with the library is performance.
○ TensorFlow is just slower and more of a resource hog compared to the other libraries.
○ Other libs can perform twice as fast on typical deep net tasks.
○ Avoid for performant RNN or LSTM networks.
○ Worst at scaling efficiency.

Page 18: Choosing a Deep Learning Library

Torch/PyTorch
● Torch was one of the original academic-focused libs.
● Many maintainers went to work at Facebook and created PyTorch.
● They use the same underlying C lib.
○ Provide similar performance.
● They differ in
○ Interface (Lua vs Python)
○ Auto diff capabilities
○ Paradigms
Torch (DeepMind, NYU, IDIAP)
Watches: 665 | Stars: 8,218 | Forks: 2,340
Avg Issue Resolution: 69 days | Open issues: 34%
Symbolic paradigm
Research citations: 1,246
Model zoo
PyTorch (Facebook)
Watches: 1,197 | Stars: 25,450 | Forks: 6,044
Avg Issue Resolution: 6 days | Open issues: 24%
Symbolic/Dynamic paradigm
Research citations: 879
Model zoo | CNN Example Code

Page 19: Choosing a Deep Learning Library

PyTorch
● PyTorch was made with the goal of fixing or modernizing Torch.
● Hybrid frontend for switching between paradigms.
● PyTorch also has its own visualization dashboard, called Visdom.
● Probably best avoided if you want to deploy into production.
○ Facebook maintains a separate lib targeted at developers, Caffe2.
○ Changes are being made to make PyTorch production ready.
○ Caffe2 was recently merged into PyTorch.
● Researchers tend to prefer PyTorch over TensorFlow.
○ Makes prototyping easy.

Page 20: Choosing a Deep Learning Library

MXNet
● Newer and growing option.
● Largest officially supported API selection.
○ High compatibility and consistency.
● Direct competitor to TensorFlow across all applications.
○ It can run on everything from a web browser or a mobile phone to a massive distributed server farm.
○ Amazon has found that you can get up to 85% scaling efficiency with MXNet.
● Has its own serving framework and deep integration with AWS.
● Also has its own TensorBoard forks.
Apache, Amazon
Watches: 1,180 | Stars: 16,450 | Forks: 5,889
Avg Issue Resolution: 40 days | Open issues: 13%
Symbolic/Dynamic paradigm
Research citations: 712
Model zoo | CNN Example Python Code | CNN Example Code (Gluon)
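The slide does not define "scaling efficiency". A common reading, assumed here, is the achieved speedup divided by the ideal linear speedup. The throughput numbers below are hypothetical and were chosen only so that the result matches the 85% figure quoted above.

```python
def scaling_efficiency(throughput_1, throughput_n, n_devices):
    """Fraction of ideal linear speedup achieved when using n_devices."""
    speedup = throughput_n / throughput_1
    return speedup / n_devices


# Hypothetical numbers: 100 images/s on 1 GPU, 1,360 images/s on 16 GPUs.
efficiency = scaling_efficiency(100.0, 1360.0, 16)
print(f"{efficiency:.0%}")  # 85%
```

Under this definition, 100% means that doubling the number of GPUs doubles throughput, and anything lower is lost to communication and synchronization overhead.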

Page 21: Choosing a Deep Learning Library

MXNet Gluon
● Collaboration between AWS and Microsoft.
● Provides a clear, concise, and simple API for deep learning.
○ Full set of plug-and-play neural network building blocks.
■ Predefined layers, optimizers, and initializers.
○ Built-in model zoo.
● Hybridization is awesome
○ Hybrid symbolic/dynamic graph functionality.
○ Offers the benefits of both.
○ Can make Gluon 3x faster than PyTorch.
● Great documentation for absolute beginners.

Page 22: Choosing a Deep Learning Library

MXNet (cont.)
● The non-Python APIs are lacking in certain aspects.
○ The documentation can be weak.
○ Stability issues at full production scale.
● Community is growing, but is still small.
○ Never the first library used for open source projects.

Page 23: Choosing a Deep Learning Library

CNTK
● Microsoft Cognitive Toolkit was originally created by MSR Speech researchers.
○ It has since expanded to all types of deep learning applications.
● Used in Skype, Xbox, Cortana, anything "Azure".
● Focus on NLP, with unbeatable RNN/LSTM performance.
● Supports distributed training, like TensorFlow.
● Only library with first-class support for the Windows ecosystem.
○ No support for OSX
○ Simple Azure deployment
○ .NET language support
Microsoft
Watches: 1,388 | Stars: 15,850 | Forks: 4,217
Avg Issue Resolution: 28 days | Open issues: 15%
Symbolic/Dynamic paradigm
Research citations: 140
Model zoo | CNN Example Code | CNN Example Code (Keras)

Page 24: Choosing a Deep Learning Library

CNTK (cont.)
● Average model zoo size/quality.
● Good documentation, consistent with other Microsoft products.
● Unconventional open source license history.
● Small community.
● Used the least in research.

Page 25: Choosing a Deep Learning Library

ONNX
● https://onnx.ai/
● Open Neural Network Exchange Format
● Created in collaboration with AWS, Facebook, and Microsoft.
● Library and format for converting trained neural net models between libraries.
● Provides a standardized ONNX model format.

Page 26: Choosing a Deep Learning Library

Performance Comparisons Summary
● Benchmarks (oldish, 2017): https://arxiv.org/pdf/1608.07249.pdf
○ Compares CNTK, Torch, Caffe, MXNet, and TensorFlow.
○ Covers CPU through multi-GPU performance on synthetic and real data across various deep learning architectures (CNN, FCN, RNN, LSTM...).
● Single GPU
○ Caffe, CNTK, and Torch perform better than MXNet and TensorFlow on FCNs.
○ MXNet is outstanding on CNNs, especially larger networks, while Caffe and CNTK also perform well on smaller CNNs.
○ On RNNs and LSTMs, CNTK has excellent time efficiency, up to 5-10x better than the rest.

Page 27: Choosing a Deep Learning Library

Performance Comparisons Summary
● Multiple GPUs
○ MXNet and Torch scale the best and TensorFlow scales the worst.
○ CNTK scales better on FCNs specifically.
● Library-specific optimizations
○ CNTK lets you trade GPU memory for better computing efficiency.
○ MXNet can enable model auto-tuning using the NVIDIA cuDNN library.
● Overall, the performance of TensorFlow is lacking compared to the other tools.

Page 28: Choosing a Deep Learning Library

Other Libraries to take note of...

Page 29: Choosing a Deep Learning Library

Theano
● University of Montreal
● Research citations: 290
● Development has ended, may it rest in peace ⚰
● Makes you do a lot of things from scratch, which leads to more verbose code.
● Single GPU support.
● Numerous open source deep learning libraries have been built on top of Theano, including Keras, Lasagne, and Blocks.
● CNN Example Code (Keras) or CNN Example Code (Lasagne)
● No real reason to use it over TensorFlow unless you are working with old code.

Page 30: Choosing a Deep Learning Library

Caffe2
● Facebook
● CNN Example Code
● Merged into the PyTorch codebase.
● Caffe2 targets production applications, with a focus on mobile.
● Caffe2 is built to excel at large-scale deployments.
○ Caffe2 is built to utilize both multiple GPUs on a single host and multiple hosts with GPUs.
● Caffe2 improves on Caffe in a series of directions:
○ first-class support for large-scale distributed training
○ mobile deployment
○ new hardware support (in addition to CPU and CUDA)
○ flexibility for future directions such as quantized computation
○ stress tested by the vast scale of Facebook applications

Page 31: Choosing a Deep Learning Library

Fast.ai
● fastai
● The library is based on research into deep learning best practices.
● Built on top of PyTorch.
● Free, online, yearly updated courses in deep learning.
○ Can even take it in person in SF.
● Quickest at integrating new research examples.
● Great for beginners getting into research.
Watches: 555 | Stars: 12,306 | Forks: 4,479 | Median Issue Resolution: 8 hours | Open issues: 1%

Page 32: Choosing a Deep Learning Library

CoreML
● Apple
● Closed source
● Not a full DL library (you cannot use it to train models at the moment); it is mainly focused on deploying pretrained models to iOS and OSX devices.
○ If you need to train your own model, you will need to use one of the above libraries.
○ Model converters are available for Keras, Caffe, Scikit-learn, libSVM, XGBoost, MXNet, and TensorFlow.

Page 33: Choosing a Deep Learning Library

Deep Learning Toolbox
● https://www.mathworks.com/products/deep-learning.html
● A MATLAB toolbox implementing CNNs and LSTMs.
● GPU support, and cloud GPUs on AWS with MATLAB Distributed Computing Server.
● Create, edit, visualize, and analyze deep learning networks with interactive apps.
● Visualize network topologies, training progress, and activations of the learned features in a deep learning network.
● Import models from Caffe/TensorFlow-Keras/ONNX.
● Not open source
○ $500 annual license
○ $1250 perpetual license

Page 34: Choosing a Deep Learning Library

Deeplearning4j
● Skymind
● Keras support (Python API)
● Written with Java and the JVM in mind.
● Focus on enterprise scale.
● Great documentation.
● DL4J takes advantage of the latest distributed computing frameworks, including Hadoop and Apache Spark, to accelerate training. On multiple GPUs, it is equal to Caffe in performance.
● Can import models from TensorFlow.
Watches: 835 | Stars: 10,431 | Forks: 4,602 | Median Issue Resolution: 6 days | Open issues: 20%

Page 35: Choosing a Deep Learning Library

Chainer
● Preferred Networks
● Research citations (2015): 207
● CNN Example Code
● Dynamic computation graph
● Used by IBM, Intel
● Japanese and English community
Watches: 328 | Stars: 4,626 | Forks: 1,228 | Median Issue Resolution: 44 days | Open issues: 11%

Page 36: Choosing a Deep Learning Library

Darknet
● https://github.com/pjreddie/darknet
● Very small open source effort with a laid-back dev group.
○ Emojis and jokes everywhere.
○ Seems more of an exercise by the developers.
● Not useful for production environments.
● Maintainer wrote my favorite research paper.
Watches: 786 | Stars: 11,980 | Forks: 6,770 | Median Issue Resolution: 26 days | Open issues: 76%

Page 37: Choosing a Deep Learning Library

Sonnet
● Google DeepMind
○ One of the biggest names in industry research.
○ AlphaGo, AlphaStar
● Built on TensorFlow; makes NN construction and training easy and extensible.
Watches: 475 | Stars: 7,362 | Forks: 1,011 | Median Issue Resolution: 14 days | Open issues: 14%

Page 38: Choosing a Deep Learning Library

Knet.jl
● https://github.com/denizyuret/Knet.jl
● The Koç University deep learning framework, implemented in Julia.
● Supports GPU operation, automatic differentiation, and dynamic computational graphs.
● Model code can use the full power and expressivity of Julia.
● CNN Example Code
Watches: 75 | Stars: 833 | Forks: 149 | Median Issue Resolution: 9 days | Open issues: 17%

Page 39: Choosing a Deep Learning Library

Paddle
● Baidu
● PArallel Distributed Deep LEarning
● Chinese documentation with an English translation.
● Originally developed by Baidu scientists and engineers for the purpose of applying deep learning to many products at Baidu.
● Really only worth using if you are in the Chinese market/ecosystem.
Watches: 649 | Stars: 8,224 | Forks: 2,232 | Median Issue Resolution: 14 days | Open issues: 18%

Page 40: Choosing a Deep Learning Library

ConvNetJS
● Stanford
● Train neural networks entirely in your browser.
● Start training a net now!
● Great for visualizing the full network and training process.
● Mainly used for demonstrating and teaching deep learning on the web.
○ See Stanford's CS231n.
Watches: 645 | Stars: 9,563 | Forks: 1,891 | Median Issue Resolution: 59 days | Open issues: 69%

Page 41: Choosing a Deep Learning Library

Neon
● Intel
● Written with Intel Nervana MKL-accelerated hardware in mind (Xeon and Phi processors).
● Intel's reference deep learning framework, committed to best performance on all hardware.
● One of the fastest libraries.
● One of the first libraries to support half-precision floating point.
Watches: 366 | Stars: 3,730 | Forks: 830 | Median Issue Resolution: 25 days | Open issues: 17%
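For readers unfamiliar with half precision, the sketch below uses only the Python standard library (not Neon) to show the trade-off: each value takes 2 bytes instead of 4, at the cost of coarser rounding. The helper name `to_half_and_back` is made up for this illustration.

```python
import struct


def to_half_and_back(x):
    """Round-trip a Python float through IEEE 754 half precision (16 bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Small and large values lose different amounts of precision.
for value in (0.1, 1.0, 2049.0):
    print(value, "->", to_half_and_back(value))

# Storage cost per value: half precision vs single precision.
print(struct.calcsize("<e"), "bytes vs", struct.calcsize("<f"), "bytes")
```

Halving the storage per value halves memory traffic, which is one reason half-precision support matters for performance on hardware that can compute in it natively.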

Page 42: Choosing a Deep Learning Library

DyNet
● Carnegie Mellon University
● Dynamic computation graph
● Small user community
Watches: 200 | Stars: 2,688 | Forks: 626 | Median Issue Resolution: 7 days | Open issues: 12%

Page 43: Choosing a Deep Learning Library

TLDR
● Choose TensorFlow or MXNet-Gluon for industry/production environments.
○ TensorFlow if you prioritize community support and documentation, MXNet if you need performance.
● PyTorch if you are doing research or developing new models/layers.
● Keras if you are new and want to get started quickly.
● Fast.ai + PyTorch if you are here to learn.
● CNTK if you ❤ Windows/Visual Studio/.NET or want to do high-performance NLP.
● CoreML for deploying things to Apple devices.
● Deeplearning4j if you really like to keep things in the JVM.
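The recommendations above can be written down as a simple lookup. The priority keys and the `recommend` function are hypothetical labels invented for this sketch; only the library names come from the slide.

```python
# Priority labels are made up for illustration; values are the slide's picks.
TLDR = {
    "production_community": "TensorFlow",
    "production_performance": "MXNet-Gluon",
    "research": "PyTorch",
    "beginner": "Keras",
    "learning": "Fast.ai + PyTorch",
    "windows_dotnet_or_nlp": "CNTK",
    "apple_devices": "CoreML",
    "jvm": "Deeplearning4j",
}


def recommend(priority):
    """Return the library suggested for a given priority label."""
    try:
        return TLDR[priority]
    except KeyError:
        options = ", ".join(sorted(TLDR))
        raise ValueError(f"unknown priority {priority!r}; choose from: {options}") from None


print(recommend("research"))                # PyTorch
print(recommend("production_performance"))  # MXNet-Gluon
```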
