Putting Hadoop on Any Cloud. Nati Shalom at Big Data Spain 2012
DESCRIPTION
Session presented at the Big Data Spain 2012 Conference, 16th Nov 2012, ETSI Telecomunicacion UPM, Madrid. www.bigdataspain.org More info: http://www.bigdataspain.org/es-2012/conference/putting-hadoop-cloud/nati-shalom
TRANSCRIPT
![Page 1: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/1.jpg)
The Elephant in the Cloud
Putting Hadoop on Any Cloud
@natishalom
![Page 2: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/2.jpg)
Columbus & The Cloud
THE DISCOVERY OF AMERICA THE THING THAT MADE IT POSSIBLE
![Page 3: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/3.jpg)
Why Cloud Portability Matters
![Page 4: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/4.jpg)
Cloud Portability Myth #1
No one really needs cloud portability
![Page 5: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/5.jpg)
Cloud Portability
Facts
Zynga moved ~80% of their workload from Amazon to their private zCloud
“own the base, rent the spike”
http://code.zynga.com/2012/02/the-evolution-of-zcloud/
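The "own the base, rent the spike" idea can be illustrated with a small sketch: demand up to a fixed private capacity stays on owned hardware, and only the overflow is rented from a public cloud. The function name, the capacity figure, and the demand numbers below are all made up for illustration; they are not taken from Zynga's post.

```python
def split_load(demand, base_capacity):
    """Split each period's demand into (served_by_owned_base, rented_spike)."""
    plan = []
    for units in demand:
        on_base = min(units, base_capacity)      # owned capacity is used first
        plan.append((on_base, units - on_base))  # overflow goes to the public cloud
    return plan


if __name__ == "__main__":
    hourly_demand = [40, 55, 120, 80]  # hypothetical load, in arbitrary units
    for on_base, rented in split_load(hourly_demand, base_capacity=80):
        print(f"private: {on_base:>3}  public: {rented:>3}")
```

Being able to move that overflow between providers is exactly where portability pays off.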
![Page 6: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/6.jpg)
Cloud Portability
Facts
Mixpanel started with Linode, then moved to RackSpace, then to AWS
http://code.mixpanel.com/2010/11/08/amazon-vs-rackspace/
![Page 7: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/7.jpg)
Cloud Portability
Facts
• You want the flexibility to choose what’s right for you, when it’s right for you
• Based on pricing, features, availability, performance, etc.
![Page 8: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/8.jpg)
Cloud Portability Myth #2
Cloud Portability == Cloud API Standardization
![Page 9: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/9.jpg)
Cloud APIs, Today
Standard APIs (?): OCCI, vCloud
OSS frameworks: OpenStack, CloudStack, Eucalyptus
Abstraction frameworks: JClouds, Deltacloud, Fog, Libvirt
![Page 10: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/10.jpg)
Cloud APIs, Today
Standard APIs: Not practical in the foreseeable future
OSS Projects: Need a couple more years to converge & mature
Abstraction Frameworks: Probably the only practical (near-term) option
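To make the abstraction-framework option concrete, here is a minimal sketch of the pattern such libraries follow: application code talks to one small interface, and a per-cloud driver translates the calls. The class and method names are hypothetical and the drivers are in-memory fakes; this is not the API of JClouds, Deltacloud, Fog, or Libvirt.

```python
from abc import ABC, abstractmethod


class CloudDriver(ABC):
    """Hypothetical minimal interface the application codes against."""

    @abstractmethod
    def start_node(self, name: str) -> str:
        """Start a machine and return a provider-specific identifier."""


class FakeEc2Driver(CloudDriver):
    # Stand-in for a real driver; it only builds an identifier string.
    def start_node(self, name: str) -> str:
        return f"ec2:{name}"


class FakeOpenStackDriver(CloudDriver):
    # Second stand-in, to show that the calling code does not change.
    def start_node(self, name: str) -> str:
        return f"openstack:{name}"


def deploy_cluster(driver: CloudDriver, node_names):
    """Provider-agnostic deployment: only the driver differs per cloud."""
    return [driver.start_node(name) for name in node_names]


if __name__ == "__main__":
    for driver in (FakeEc2Driver(), FakeOpenStackDriver()):
        print(deploy_cluster(driver, ["master", "worker-1"]))
```

Switching clouds then means swapping the driver object, not rewriting the deployment logic.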
![Page 11: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/11.jpg)
Realization: What You Really Care about Is App Portability
OS is the same on any cloud
Most clouds have compute & storage
Elasticity & scaling have same effects on the app, regardless of the cloud
![Page 12: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/12.jpg)
Cloud Portability Myth #3
All infrastructure clouds were born equal
![Page 13: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/13.jpg)
Food for Thought
Offerings can vary quite a bit:
• Amazon guarantees only 99.5% uptime
• RackSpace will give you $$$ every time they crash
• Joyent claims to be significantly faster than both
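To put an uptime guarantee in perspective, it helps to convert the percentage into permitted downtime. The sketch below does only that arithmetic for the 99.5% figure quoted above, which works out to roughly 43.8 hours per year; the function name is illustrative, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_hours(uptime_percent):
    """Hours per year a service may be down while still meeting the guarantee."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"{allowed_downtime_hours(99.5):.1f} hours of downtime per year")
```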
![Page 14: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/14.jpg)
And Some Features Are Unique…
Amazon is the only major vendor to offer SSD storage. Netflix says it’s:
• ½ the price for the same throughput
• ⅕ the latency on avg.
• Even slowest requests are 6x faster
http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html
@uri1803
![Page 15: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/15.jpg)
Let’s Talk Big Data on the Cloud
![Page 16: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/16.jpg)
A Typical Big Data App…
![Page 17: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/17.jpg)
Managing Big Data on the Cloud
• Auto start VMs
• Install and configure app components
• Monitor
• Repair
• (Auto) Scale
• Burst…
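The monitor, repair, and scale steps in the list above can be pictured as one pass of a reconciliation loop. The sketch below is a simplified illustration under assumed inputs (a health flag per node and a single cluster-wide CPU figure); the function name, thresholds, and action labels are hypothetical and do not come from any specific tool.

```python
def reconcile(nodes, cpu_load, desired_min=2, scale_up_at=0.8):
    """One management pass: return the actions a controller would take.

    nodes    -- dict mapping node name to True (healthy) or False (failed)
    cpu_load -- average cluster CPU utilisation between 0.0 and 1.0
    """
    actions = []
    healthy = [name for name, ok in nodes.items() if ok]

    # Repair: every failed node gets a repair action.
    for name, ok in nodes.items():
        if not ok:
            actions.append(("repair", name))

    # Scale: add a node if too few are healthy or the cluster is overloaded.
    if len(healthy) < desired_min or cpu_load > scale_up_at:
        actions.append(("scale_out", 1))

    return actions


if __name__ == "__main__":
    print(reconcile({"worker-1": True, "worker-2": False}, cpu_load=0.9))
```

A real controller would run such a pass on a timer and feed it live monitoring data.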
![Page 18: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/18.jpg)
The Challenges
Consistent Management
Making deployment, installation, scaling, and fail-over look the same across the entire stack
![Page 19: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/19.jpg)
The Challenges (Cont.)
Cloud Portability
Choosing the Right Cloud for the Job
Running bare metal for high-I/O workloads, public cloud for sporadic workloads
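The placement rule on this slide can be written down as a tiny decision function. This is only a sketch: the two flags and the returned labels are hypothetical, and the fallback to a private cloud for workloads that are neither I/O-heavy nor sporadic is an assumption added for completeness, not something the slide states.

```python
def choose_target(io_intensive: bool, sporadic: bool) -> str:
    """Pick an environment for a workload based on its profile."""
    if io_intensive:
        return "bare-metal"      # high-I/O workloads run on bare metal
    if sporadic:
        return "public-cloud"    # occasional workloads are rented on demand
    return "private-cloud"       # assumed default, not from the slide


if __name__ == "__main__":
    print(choose_target(io_intensive=True, sporadic=False))
    print(choose_target(io_intensive=False, sporadic=True))
```

Such a rule is only useful if the same application can actually be deployed to every target, which is the portability argument the talk is making.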
![Page 20: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/20.jpg)
Hadoop
• Available under different distributions
• Cloudera
• IBM BigInsights
• MapR
• Hortonworks
![Page 21: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/21.jpg)
Big Data Apps, on Any Cloud, Your Way
Open source (Apache 2 license)
![Page 22: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/22.jpg)
Putting Cloudify and Hadoop Together
• Run on any cloud
• Consistent management
• Dynamic scaling
• Auto recovery
• Auto scaling
• Role assignments
• Monitoring
• Simple maintenance
![Page 23: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/23.jpg)
A Few Snippets
![Page 25: Putting Hadoop on any Cloud. NATI SHALOM at Big Data Spain 2012](https://reader033.vdocuments.site/reader033/viewer/2022061218/54b71cec4a7959177f8b4699/html5/thumbnails/25.jpg)
Thank You!
References:
http://www.cloudifysource.org
http://github.com/CloudifySource
https://github.com/CloudifySource/cloudify-recipes/tree/master/services/biginsights