![Page 1: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/1.jpg)
MLeap: Release Spark ML Pipelines
Mikhail Semeniuk and Hollin Wilkins
![Page 2: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/2.jpg)
Opening Demo
http://spark-summit.combust.ml
How much should I rent my house for on Airbnb?
Yes, open your cell phone and go here :)
![Page 3: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/3.jpg)
Action Reaction
![Page 4: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/4.jpg)
Hard-Coded Models (SQL, Java, Ruby)
PMML
Emerging Solutions (yHat, DataRobot)
Enterprise Solutions (Microsoft, IBM, SAS)
MLeap
Quick to Implement
Open Sourced
Committed to Spark/Hadoop
API Server Infrastructure
![Page 5: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/5.jpg)
mleap-spark
mleap-runtime
mleap-core
Bundle.ML
mleap-serialization
![Page 6: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/6.jpg)
Regressions
![Page 7: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/7.jpg)
![Page 8: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/8.jpg)
Regression Pipeline
Continuous Feature → VectorAssembler → Continuous Feature Vector → StandardScaler → Scaled Continuous Feature Vector
Categorical Feature → StringIndexer → Categorical Feature Index → OneHotEncoder → One Hot Vector (one branch per categorical feature; three shown)
One Hot Vectors → VectorAssembler → Categorical Feature Vector
Scaled Continuous Feature Vector + Categorical Feature Vector → VectorAssembler → Final Feature Vector
Final Feature Vector → LinearRegression → Prediction
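The stages named on this slide can be sketched end to end in plain Python. This is only an illustration of the data flow, not Spark or MLeap code: the function names, the toy Airbnb-style data, and the weights are all hypothetical, the scaler is a simple mean/standard-deviation standardization, and the one-hot vector keeps every category for readability.

```python
from collections import Counter
from statistics import mean, pstdev


def fit_string_indexer(values):
    """Map each distinct string to an index, most frequent first (ties alphabetical)."""
    counts = Counter(values)
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    return {label: i for i, label in enumerate(ordered)}


def one_hot(index, size):
    """Dense one-hot list of length `size` with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def fit_standard_scaler(rows):
    """Per-column mean and standard deviation of the continuous features."""
    cols = list(zip(*rows))
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    return [mean(c) for c in cols], [pstdev(c) or 1.0 for c in cols]


def scale(row, means, stds):
    """Standardize one row of continuous features."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def assemble(*vectors):
    """Concatenate several feature vectors into one."""
    return [x for v in vectors for x in v]


def predict(features, weights, intercept):
    """Linear regression prediction: dot(weights, features) + intercept."""
    return sum(w * x for w, x in zip(weights, features)) + intercept


# Hypothetical training data: one categorical and two continuous features.
room_types = ["entire", "private", "entire", "shared"]
continuous = [[2.0, 1.0], [1.0, 1.0], [3.0, 2.0], [1.0, 1.0]]

index = fit_string_indexer(room_types)
means, stds = fit_standard_scaler(continuous)


def transform(room_type, cont_row):
    """Run one record through the pipeline up to the final feature vector."""
    categorical_vec = one_hot(index[room_type], len(index))
    scaled_vec = scale(cont_row, means, stds)
    return assemble(scaled_vec, categorical_vec)


final_features = transform("private", [1.0, 1.0])
weights = [20.0, 10.0, 100.0, 60.0, 30.0]  # hypothetical, not fitted
price = predict(final_features, weights, intercept=50.0)
```

Each fitted stage reduces to a small piece of state (a lookup table, means and deviations, a weight vector), which is what makes a pipeline like this easy to run outside the cluster that trained it.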
![Page 9: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/9.jpg)
LeapFrame (Categorical Feature) → StringIndexer → LeapFrame (adds Categorical Feature Index) → OneHotEncoder → LeapFrame (adds One Hot Vector)
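The frame-to-frame flow on this slide, where each transformer takes a frame and returns a new one with an extra output column, might look like the sketch below. The dict-of-columns frame, the function names, and the column names are assumptions made for illustration; they are not the MLeap LeapFrame API.

```python
def with_column(frame, name, values):
    """Return a new frame (dict of column name -> list) with one extra column."""
    new_frame = dict(frame)  # shallow copy, so the input frame is left unchanged
    new_frame[name] = list(values)
    return new_frame


def string_indexer(frame, input_col, output_col, labels):
    """Add a column holding the index of each string in `labels`."""
    lookup = {label: float(i) for i, label in enumerate(labels)}
    return with_column(frame, output_col, [lookup[v] for v in frame[input_col]])


def one_hot_encoder(frame, input_col, output_col, size):
    """Add a column holding a one-hot vector for each index."""
    vectors = [
        [1.0 if i == int(ix) else 0.0 for i in range(size)]
        for ix in frame[input_col]
    ]
    return with_column(frame, output_col, vectors)


# Hypothetical input frame with a single categorical column.
frame0 = {"room_type": ["private", "entire"]}
labels = ["entire", "private"]

frame1 = string_indexer(frame0, "room_type", "room_type_index", labels)
frame2 = one_hot_encoder(frame1, "room_type_index", "room_type_vec", len(labels))
```

Because every step returns a new frame, earlier frames stay valid and the transformers can be chained in any order their input columns allow.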
![Page 10: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/10.jpg)
![Page 11: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/11.jpg)
![Page 12: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/12.jpg)
Spark Estimator, Spark Model, MLeap Model
Spark DataFrame, Spark LeapFrame, Spark LeapFrame
Spark DataFrame, MLeap Transformer
Legend: MLeap, Spark
![Page 13: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/13.jpg)
Benchmarks
MLeap: 0.011 ms/transform
Spark: 23.4 ms/transform
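To put the two benchmark figures side by side, the short calculation below derives the speedup ratio and the implied throughput. It assumes transforms run one after another on a single thread, which the slide does not state, so the per-second numbers are only a rough reading of the data.

```python
# Per-transform latencies taken from the benchmark slide, in milliseconds.
mleap_ms = 0.011
spark_ms = 23.4

# How many times faster one MLeap transform is than one Spark transform.
speedup = spark_ms / mleap_ms

# Implied sequential throughput (assumption: single thread, no batching).
mleap_per_second = 1000.0 / mleap_ms
spark_per_second = 1000.0 / spark_ms

print(f"speedup: about {speedup:.0f}x")
print(f"MLeap: about {mleap_per_second:.0f} transforms/s")
print(f"Spark: about {spark_per_second:.1f} transforms/s")
```

On these figures the ratio comes out at roughly 2,100x, or about 90,900 versus about 43 transforms per second under the sequential assumption.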
![Page 14: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/14.jpg)
![Page 15: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/15.jpg)
![Page 16: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/16.jpg)
Combust.ML Overview
Combust.ML
![Page 17: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/17.jpg)
Thank Yous
![Page 18: MLeap: Productionize Data Science Workflows Using Spark](https://reader034.vdocuments.site/reader034/viewer/2022042723/5871559c1a28ab8e5b8b514b/html5/thumbnails/18.jpg)
THANK YOU.
Hollin Wilkins
email: [email protected]
github: https://github.com/hollinwilkins
twitter: https://twitter.com/HollinWilkins
linkedin: https://www.linkedin.com/in/hollinwilkins
Mikhail Semeniuk
email: [email protected]
github: https://github.com/seme0021
twitter: https://twitter.com/MikhailSemeniuk
linkedin: https://www.linkedin.com/in/semeniuk