Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017
TRANSCRIPT
![Page 1: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/1.jpg)
![Page 2: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/2.jpg)
MICROSERVICES & TERAFLOPS
Effortlessly scaling data science #thecloudistoodamnhard
Eric Jonas, Postdoctoral Researcher | [email protected] | @stochastician
![Page 3: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/3.jpg)
![Page 4: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/4.jpg)
A BIG FAN OF ANACONDA
![Page 5: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/5.jpg)
“BIG” DATA

| | (near-by) stars | neurons | nuclei |
|---|---|---|---|
| size | 10^9 m | 10^-5 m | 10^-14 m |
| number | 1 | 10^11 | 10^26 |
| data size | 2 PB | 12 TB/sec | ??/sec |
![Page 6: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/6.jpg)
images courtesy NASA SOHO
Sun in UV (304 Å)
you are here
![Page 7: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/7.jpg)
Solar Flare Prediction Using Photospheric and Coronal Image Data. Jonas, Bobra, Shankar, Recht. American Geophysical Union, 2016
![Page 8: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/8.jpg)
NEUROSCIENCE AT ALL SCALES
![Page 9: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/9.jpg)
Could a Neuroscientist understand a microprocessor? Jonas, Kording. PLOS Computational Biology, 2017
![Page 10: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/10.jpg)
AND I WANT MORE!
![Page 11: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/11.jpg)
Superresolution
Phase contrast
Tomography
Adaptive Optics
![Page 12: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/12.jpg)
How do you get busy physicists and electrical engineers to give up Matlab?
How do we get busy astronomers to give up IDL?
![Page 13: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/13.jpg)
Why is there no “cloud button”?
PREVIOUSLY, ON
![Page 14: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/14.jpg)
The cloud is too damn hard!
Jimmy McMillan, Founder and Chairman, The Rent is Too Damn High Party
Fewer than half of the graduate students in our group have ever written a Spark or Hadoop job
![Page 15: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/15.jpg)
“I hate computers”
–Eric Jonas, 2017
![Page 16: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/16.jpg)
![Page 17: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/17.jpg)
#THECLOUDISTOODAMNHARD
• What type? what instance? What base image?
• How many to spin up? What price? spot?
• wait, Wait, WAIT oh god
• now what? DEVOPS
![Page 18: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/18.jpg)
WHAT DO WE WANT?
1. Very little overhead for setup once someone has an AWS account. In particular, no persistent overhead -- you don't have to keep a large (expensive) cluster up and you don't have to wait 10+ min for a cluster to come up
![Page 19: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/19.jpg)
WHAT DO WE WANT?
2. As close to zero overhead for users as possible. In particular, anyone who can write Python should be able to invoke it through a reasonable interface. It should support all legacy code.
![Page 20: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/20.jpg)
WHAT DO WE WANT?
3. Target jobs that run in the minutes-or-more regime.
![Page 21: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/21.jpg)
WHAT DO WE WANT?
4. I don't want to run a service. That is, I personally don't want to offer the front-end for other people to use, rather, I want to directly pay AWS.
![Page 22: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/22.jpg)
WHAT DO WE WANT?
5. It has to be from a cloud player that's likely to give out an academic grant -- AWS, Google, MS Azure. There are startups in this space that might build cool technology, but often don't want to be paid in AWS research credits.
![Page 23: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/23.jpg)
WHAT WE WANT
1. Very little overhead for setup once someone has an AWS account. In particular, no persistent overhead -- you don't have to keep a large (expensive) cluster up and you don't have to wait 10+ min for a cluster to come up
2. As close to zero overhead for users as possible -- in particular, anyone who can write Python should be able to invoke it through a reasonable interface.
3. Target jobs that run in the minutes-or-more regime.
4. I don't want to run a service. That is, I personally don't want to offer the front-end for other people to use; rather, I want to directly pay AWS.
5. It has to be from a cloud player that's likely to give out an academic grant -- AWS, Google, Azure. There are startups in this space that might build cool technology, but often don't want to be paid in AWS research credits.
![Page 24: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/24.jpg)
Powered by Continuum Analytics
+
![Page 25: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/25.jpg)
“I hate ~~computers~~ servers”
–Eric Jonas, 2017
![Page 26: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/26.jpg)
AWS LAMBDA
• 300 seconds single-core (AVX2)
• 512 MB in /tmp
• 1.5 GB RAM
• Python, Java, Node
![Page 27: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/27.jpg)
THE API
![Page 28: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/28.jpg)
LAMBDA SCALABILITY
[Plots: Compute; Data]
![Page 29: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/29.jpg)
YOU CAN DO A LOT OF WORK WITH MAP!
• ETL
• parameter tuning
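As a rough illustration of why map covers so much ground, a parameter sweep is just one function applied independently to every point on a grid. The sketch below uses the standard library's `concurrent.futures.ThreadPoolExecutor` as a local stand-in for a remote executor; the objective `score` and the grid values are hypothetical and not from the talk.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

def score(params):
    """Hypothetical objective: smaller is better, minimum at lr=0.1, depth=4."""
    lr, depth = params
    return (lr - 0.1) ** 2 + (depth - 4) ** 2

# Every combination of the two hyperparameters is one independent task.
grid = list(itertools.product([0.01, 0.1, 1.0], [2, 4, 8]))

# Local stand-in for a remote executor: map the function over the grid.
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = list(pool.map(score, grid))

# Pick the grid point with the lowest score.
best_score, best_params = min(zip(scores, grid))
```

Because no task depends on any other, the same shape of code works whether the map runs on four local threads or on thousands of remote workers.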
![Page 30: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/30.jpg)
IMAGENET EXAMPLE
• Preprocess 1.4M images from IMAGENET
• Compute GIST image descriptor (some random Python code off the internet)
![Page 31: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/31.jpg)
![Page 32: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/32.jpg)
![Page 33: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/33.jpg)
HOW IT WORKS
your laptop:
future = runner.map(fn, data)
• Serialize func and data
• Put on S3
• Invoke Lambda
the cloud:
• pull job from S3
• download Anaconda runtime
• python to run code
• pickle result
• stick in S3
your laptop:
future.result()
• poll S3
• unpickle and return result
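The flow on this slide can be sketched locally with an in-memory dictionary standing in for S3. This is a minimal sketch under stated assumptions, not PyWren's implementation: `store`, `submit`, `worker`, `result`, and the key layout are hypothetical names, and `marshal` on the code object is a simple stand-in for the richer function serialization a real system needs (it handles only plain functions without closures).

```python
import marshal
import pickle
import types

store = {}  # in-memory dict standing in for an S3 bucket (hypothetical)

def submit(fn, item, key):
    """Laptop side: serialize the function and its data, put both in the store."""
    store[key + "/func"] = marshal.dumps(fn.__code__)
    store[key + "/data"] = pickle.dumps(item)

def worker(key):
    """Cloud side: pull the job, run it, pickle the result back into the store."""
    code = marshal.loads(store[key + "/func"])
    fn = types.FunctionType(code, globals())
    item = pickle.loads(store[key + "/data"])
    store[key + "/result"] = pickle.dumps(fn(item))

def result(key):
    """Laptop side: look for the result in the store, unpickle and return it."""
    blob = store.get(key + "/result")
    if blob is None:
        raise KeyError("result not ready: " + key)
    return pickle.loads(blob)

def square(x):
    return x * x

keys = ["job-%d" % i for i in range(3)]
for k, item in zip(keys, [1, 2, 3]):
    submit(square, item, k)
for k in keys:
    worker(k)  # in the real flow this step would be a Lambda invocation
results = [result(k) for k in keys]
```

The point of the design is that the laptop and the workers never talk to each other directly; everything passes through the shared store.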
![Page 34: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/34.jpg)
A BRIEF HISTORY OF SHARING
(mostly wrong)
[Chart axes: Overhead vs. Isolation]
• Processes: 1960s, MULTICS
• Virtual Machines: 1990s, VMWare, Xen
• Renting/VPS: 1990s, SGE
• HW VMs: 2000s, Intel VT-X
• Containers: 2008, chroot/LXC

• Process isolation
• network isolation
• filesystem isolation
• memory / cpu constraints
![Page 35: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/35.jpg)
(Leptotyphlops carlae)
[Chart: runtime size after each shrinking step. Step labels: Start; Delete non-AVX2 MKL; strip shared libs; conda clean; eliminate pkg; delete pyc. Size labels: 977 MB, 1205 MB, 441 MB, 946 MB, 670 MB, 510 MB.]
Want our runtime to include
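Of the shrinking steps named on this slide, "delete pyc" is the easiest to illustrate. The sketch below walks a directory tree, removes compiled `.pyc` files, and reports how many bytes were freed. It is an illustrative assumption about what such a step could look like, not the script used to build the PyWren runtime; `delete_pyc` and the demo directory are hypothetical.

```python
import os
import tempfile

def delete_pyc(root):
    """Remove every .pyc file under root and return the number of bytes freed."""
    freed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                path = os.path.join(dirpath, name)
                freed += os.path.getsize(path)
                os.remove(path)
    return freed

# Demo on a throwaway directory so nothing real is touched.
demo_root = tempfile.mkdtemp()
cache_dir = os.path.join(demo_root, "pkg", "__pycache__")
os.makedirs(cache_dir)
source_path = os.path.join(demo_root, "pkg", "mod.py")
pyc_path = os.path.join(cache_dir, "mod.cpython-36.pyc")
with open(source_path, "w") as f:
    f.write("x = 1\n")
with open(pyc_path, "wb") as f:
    f.write(b"\0" * 100)

freed_bytes = delete_pyc(demo_root)
```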
![Page 36: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/36.jpg)
MAP IS NOT ENOUGH?
A lot of data analytics looks like:
Data → ETL / preprocessing → featurization → machine learning
Great PyWren Fit
Distributed! Scale! TensorFlow Deep MLBase
![Page 37: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/37.jpg)
“You can have a second computer when you’ve shown you know how to use the first one.”
–Paul Barham, quoted in McSherry, 2015
![Page 38: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/38.jpg)
Scalability! But at what COST? Frank McSherry, Michael Isard, Derek G. Murray. USENIX Hot Topics In Operating Systems, 2015
![Page 39: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/39.jpg)
SINGLE-MACHINE REDUCE
But I don’t have a big server!
futures = exec.map(function, data)
answer = exec.reduce(reduce_func, futures)

| instance | cores | RAM | cost |
|---|---|---|---|
| x1.32xlarge | 64 | 2 TB | $14/hr |
| x1.16xlarge | 32 | 1 TB | $7/hr |
| p2.16xlarge | 32 + 16 GPUs | 750 GB | $14/hr |
| r4.16xlarge | 32 | 500 GB | $4/hr |
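The two calls on this slide follow a map-then-reduce pattern: many small workers produce partial results, and one large machine combines them. A minimal local sketch of that pattern follows, using `concurrent.futures` and `functools.reduce` as stand-ins. The helper `partial_sum` and the chunking are hypothetical, and this is not PyWren's own reduce implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator

def partial_sum(chunk):
    """Map step: each worker computes a sum of squares over its own chunk."""
    return sum(x * x for x in chunk)

data = list(range(1000))
chunks = [data[i:i + 100] for i in range(0, len(data), 100)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Map: one future per chunk.
    futures = [pool.submit(partial_sum, c) for c in chunks]
    # Reduce: runs in one place, over the results of all map tasks.
    answer = reduce(operator.add, (f.result() for f in futures))
```

The reduce step only sees the small partial results, which is why a single large-memory instance rented by the hour can be enough.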
![Page 40: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/40.jpg)
STUPID LAMBDA TRICKS
Shivaram told me today he has this up to 6M transactions/sec (!)
![Page 41: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/41.jpg)
BUT I CAN’T USE THE CLOUD!
![Page 42: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/42.jpg)
PYWREN MAKES SCALE A BIT EASIER
• Do you have a Python function?
• Do you want to scale it?
• Try it out!
• Map : Today
• BigReduce : 1.0 in a week
• Parameter server : Experimental
![Page 43: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/43.jpg)
THANKS!
https://github.com/ericmjonas/pywren
Shivaram Venkataraman
Ben Recht
Ion Stoica
![Page 44: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/44.jpg)
![Page 45: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/45.jpg)
EXTRA SLIDES
![Page 46: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/46.jpg)
BEHIND THE HOOD
![Page 47: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/47.jpg)
UNDERSTANDING HOST ALLOCATION
![Page 48: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/48.jpg)
SO WHEN IS THIS USEFUL?
• Parameter searching
• Last-minute NIPS experiments
• Expensive forward models
[Diagram: alternating phases of “massively parallel compute” and “serial/local” work]
![Page 49: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/49.jpg)
GETTING AROUND THE LIMITATIONS
• Runtime [anaconda]
• Job lifetime [generators]
• Synchronization (memcache/redis?)
• inter-lambda IPC
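One way to read the "Job lifetime [generators]" bullet: if a long job is written as a generator that yields its state after each unit of work, a driver can stop it before the time limit and resume from the last yielded state in a fresh invocation. The sketch below is an assumption about how that might look, not PyWren's implementation. `long_job`, `run_with_budget`, and the step budget (standing in for the 300-second limit) are all hypothetical.

```python
def long_job(state):
    """Sum the integers below state['n'], yielding a resumable state after each step."""
    i, total = state["i"], state["total"]
    while i < state["n"]:
        total += i
        i += 1
        yield {"i": i, "total": total, "n": state["n"]}

def run_with_budget(job, state, max_steps):
    """Run at most max_steps steps of the job; return (last_state, finished)."""
    gen = job(state)
    for _ in range(max_steps):
        try:
            state = next(gen)
        except StopIteration:
            return state, True
    return state, False

# Each loop iteration stands in for one time-limited invocation.
state = {"i": 0, "total": 0, "n": 10}
invocations = 0
finished = False
while not finished:
    state, finished = run_with_budget(long_job, state, max_steps=4)
    invocations += 1
```

The state is a plain dictionary, so it could be pickled and stored between invocations rather than kept in memory as it is here.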
![Page 50: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/50.jpg)
WORKER REUSE
![Page 51: Microservices & Teraflops: Effortlessly Scaling Data Science with PyWren | AnacondaCON 2017](https://reader031.vdocuments.site/reader031/viewer/2022021813/58ce822a1a28ab210a8b5c5d/html5/thumbnails/51.jpg)
COORDINATION?