Apache Mahout: Distributed Matrix Math for Machine Learning
About Me
• Senior Director of Data Science at Lucidworks (Apache Solr/Lucene, Fusion search tools)
• Formerly Chief Data Scientist, Technical Lead of Data Science Practice at Accenture
• Committer and PMC Member, Apache Mahout
• On Twitter @akm
• Email at [email protected], [email protected]
• Adversarial Learning podcast with @joelgrus at http://adversariallearning.com
Apache Mahout: Recent Trends in 0.12/0.13
• Simplify and improve performance of distributed matrix-math programming
• Provide flexible computation options for software and hardware
• Enable easier and quicker new algorithm development
• Allow polyglot programming and plotting in notebooks via Apache Zeppelin
Introduction to Apache Mahout
Apache Mahout is an environment for creating scalable, performant, machine-learning applications
Apache Mahout provides:
• Mathematically expressive Scala DSL
• A collection of pre-canned math and statistics algorithms
• Interchangeable distributed engines
• Interchangeable native solvers (JVM, CPU, GPU, CUDA, or custom)
Feature Highlights in Recent Releases
• v 0.13.1, Soon — CUDA Solvers, Apache Spark 2.1/Scala 2.11 support
• New web site platform, May 2017 — Moved from ASF CMS system to Markdown and Jekyll; allows documentation pull requests to be merged in and published automatically
• v 0.13.0, Apr 2017 — GPU/CPU Solvers, algorithm framework
• v 0.12.2, Nov 2016 — Apache Zeppelin integration for notebooks and visualization
• v 0.12.0, Apr 2016 — Apache Flink backend support
• New Mahout book, Feb 2016 — ‘Apache Mahout: Beyond MapReduce’ by Dmitriy Lyubimov and Andrew Palumbo
• v 0.10.0, Apr 2015 — Mahout-Samsara vector-math DSL, MapReduce jobs soft-deprecated, Spark backend support
Topic Overview
• Mahout-Samsara: Declarative, R-like, domain-specific language (DSL) for matrix math
• Backend-agnostic programming
• Apache Zeppelin notebooks
• Algorithm development framework (modeled after scikit-learn)
• Solve on available CPU cores, single or multiple GPUs, or in the JVM
• Next steps, and how to get involved
Mahout-Samsara
MapReduce is dead; long live the little clip-art blue man!
Mahout-Samsara
• Mahout-Samsara is an easy-to-use domain-specific language (DSL) for large-scale machine learning on distributed systems like Apache Spark and Flink
• Uses Scala as programming/scripting environment
• Algebraic expression optimizer for distributed linear algebra
• Provides a translation layer to distributed engines
• Supports Spark RDDs and Flink DataSets
• System-agnostic, R-like DSL; actual formula from (d)spca:
val G = B %*% B.t - C - C.t + (ksi dot ksi) * (s_q cross s_q)
Mahout-Samsara
• Mahout-Samsara computes C = A’A via row-outer-product formulation:
• Executes in a single pass over row-partitioned A
Example of an algebraic optimization
• Logical optimization
• Optimizer rewrites plan to use logical operator for Transpose-Times-Self matrix multiplication
• Single pass: multiply partitioned rows by themselves as transposed columns
• Computation of A’A:
val C = A.t %*% A
• Naïve execution
• 1st pass: transpose A (requires repartitioning of A)
• 2nd pass: multiply result with A (expensive, potentially requires repartitioning again)
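The one-pass rewrite rests on the row-outer-product identity: for a matrix A with rows aᵢ, A'A = Σᵢ aᵢaᵢᵀ, so each partition can contribute the outer products of its own rows and the partial sums are simply added. A plain-Scala sketch of the idea (illustrative only, not Mahout's actual distributed implementation):

```scala
// Illustrative sketch of the Transpose-Times-Self rewrite on in-core arrays.
object TransposeTimesSelf {
  type Matrix = Array[Array[Double]]

  // Naive: materialize A', then multiply. In a distributed setting this
  // costs two passes and a repartition of A.
  def naive(a: Matrix): Matrix = {
    val n = a(0).length
    val at = Array.tabulate(n, a.length)((i, j) => a(j)(i))
    Array.tabulate(n, n)((i, j) => at(i).indices.map(k => at(i)(k) * a(k)(j)).sum)
  }

  // Rewritten: accumulate the outer product of each row in a single pass
  // over row-partitioned A; per-partition partials would be summed.
  def rowOuter(a: Matrix): Matrix = {
    val n = a(0).length
    val c = Array.fill(n, n)(0.0)
    for (row <- a; i <- 0 until n; j <- 0 until n) c(i)(j) += row(i) * row(j)
    c
  }
}
```

Both paths produce the same C; the optimizer's job is to pick the second plan automatically when it sees `A.t %*% A`.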
Backend-Agnostic Programming
Apache Zeppelin Notebooks
Apache Zeppelin Notebooks
• Notebooks for polyglot programming with all types of data
• Plotting with R and Python off of computed data from other tools in the same notebook
• Share variables between interpreters
• For more: https://zeppelin.apache.org
• Mahout interpreter for Zeppelin released June 2016
• Post by Trevor Grant on how to use it at https://rawkintrevo.org/2016/05/19/visualizing-apache-mahout-in-r-via-apache-zeppelin-incubating
• https://mahout.apache.org/docs/0.13.1-SNAPSHOT/tutorials/misc/mahout-in-zeppelin/
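Once the interpreter is bound, a notebook paragraph mixes the Samsara DSL with Zeppelin directives. A hypothetical paragraph sketch (the `%sparkMahout` interpreter name and exact setup may differ by installation; `dense` and `drmParallelize` are standard Samsara calls):

```
%sparkMahout
// Build a small in-core matrix and distribute it as a DRM.
val inCoreA = dense((1, 2), (3, 4), (5, 6))
val drmA = drmParallelize(inCoreA)
// Transpose-times-self; the optimizer rewrites this to a single pass.
val drmAtA = drmA.t %*% drmA
val result = drmAtA.collect
```

The collected result can then be handed to an R or Python paragraph in the same notebook for plotting.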
Apache Zeppelin Notebooks
Add the Mahout Interpreter
Apache Zeppelin Notebooks
Add the Mahout Interpreter, click “Create”
Apache Zeppelin Notebooks
Example usage
Apache Zeppelin Notebooks
Hand results to R for plotting
Algorithm Development Framework
• Patterned after R and Python (scikit-learn) APIs
• A fitter populates a Model, which contains the parameter estimates, fit statistics, and a summary, and has a predict() method
• https://rawkintrevo.org/2017/05/02/introducing-pre-canned-algorithms-apache-mahout
• https://mahout.apache.org/docs/0.13.1-SNAPSHOT/tutorials/misc/contributing-algos
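The fitter/model split can be sketched in plain Scala. This is a hedged toy stand-in, not the actual Mahout API: a `fit()` call returns a model object holding the parameter estimates, a summary, and a `predict()` method, here for simple linear regression in place of Mahout's distributed fitters.

```scala
// Toy fitter/model pattern: fit() estimates parameters, the returned
// Model carries them plus a summary and a predict() method.
case class RegressionModel(beta0: Double, beta1: Double) {
  def predict(x: Seq[Double]): Seq[Double] = x.map(beta0 + beta1 * _)
  def summary: String = f"y = $beta0%.3f + $beta1%.3f * x"
}

object SimpleLinearRegression {
  // Ordinary least squares for a single feature.
  def fit(x: Seq[Double], y: Seq[Double]): RegressionModel = {
    val (mx, my) = (x.sum / x.size, y.sum / y.size)
    val beta1 = x.zip(y).map { case (a, b) => (a - mx) * (b - my) }.sum /
                x.map(a => (a - mx) * (a - mx)).sum
    RegressionModel(my - beta1 * mx, beta1)
  }
}
```

The scikit-learn parallel: `SimpleLinearRegression.fit(...)` plays the role of `estimator.fit(X, y)`, and the returned model's `predict()` mirrors `estimator.predict(X)`.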
Solve on CPU, GPU, or JVM
Current architecture with native CPU and GPU support and unreleased jCUDA bindings
Solve on CPU, GPU, or JVM
Initial benchmarking on latest release
• Sparse MMul at geometry of 1000 x 1000 %*% 1000 x 1000, density = 0.2, with 5 runs
  Mahout JVM sparse multiplication time: 1501 ms
  Mahout jCUDA sparse multiplication time: 49 ms
  ~30x speedup
• Sparse MMul at geometry of 1000 x 1000 %*% 1000 x 1000, density = 0.02, with 5 runs
  Mahout JVM sparse multiplication time: 34 ms
  Mahout jCUDA sparse multiplication time: 4 ms
  ~8.5x speedup
• Sparse MMul at geometry of 1000 x 1000 %*% 1000 x 1000, density = 0.002, with 5 runs
  Mahout JVM sparse multiplication time: 1 ms
  Mahout jCUDA sparse multiplication time: 1 ms
  No speedup
Solve on CPU, GPU, or JVM
• jCUDA work is still in a branch; it will be merged to master in the next couple of months
• Currently the modes of compute are JVM, CPU (using all available cores), and single GPU
• Multi-GPU is next priority
• Currently multiplication takes place in different solvers based on matrix shape (banding, triangularity, etc.)
• Directing location for data and compute based on shape and density is another priority
• Watch this space for other speedups
Next steps
How to Use Mahout and Get Involved
Web: https://mahout.apache.org
Source code, PRs welcome: https://github.com/apache/mahout
Mailing lists: https://mahout.apache.org/community/mailing-lists.html
Download, install, embed: https://mahout.apache.org/downloads.html
Thank You
Q&A
https://mahout.apache.org
https://github.com/apache/mahout