Hong Kong OpenStack Summit: Savanna - Hadoop on OpenStack

Post on 26-Jan-2015

TRANSCRIPT

1. Savanna: Hadoop on OpenStack
   Ilya Elterman (Mirantis), Matthew Farrellee (Red Hat), Sergey Lukjanov (Mirantis)

2. Agenda
  • Savanna Overview
  • Current state
  • EDP overview
  • other features
  • Roadmap
  • Live Demo

3. Agenda (repeated)

4. OpenStack Data Processing - Savanna
   Mission: To provide the OpenStack community with an open, cutting edge, performant and scalable data processing stack and associated management interfaces
  • provision and operate Hadoop clusters
  • schedule and operate Hadoop jobs

5. Hadoop - Big Data Platform

6. Popularity
   [Chart: Google Trends comparison of Hadoop and OpenStack]
   http://www.google.com/trends/explore?q=hadoop+openstack#q=openstack%2C%20hadoop&cmpt=q

7. Use Cases
  • Self-service provisioning of Hadoop clusters
  • Utilization of unused compute capacity for bursty workloads
  • Run Hadoop workloads in few clicks without expertise in Hadoop ops

8. Architecture Overview
   [Diagram; labels in extraction order: Keystone, Hadoop VM (x4), Horizon, Auth, Savanna Pages, Cluster Configuration Manager, REST API, Savanna Python Client, Vendors Plugins, Job Sources, Job Manager, Data Access Layer, Swift, Nova, Trove, DB, Data Sources, Resources Orchestration Manager, Glance, Heat, Cinder, Neutron]

9. Savanna Status
  • Official incubated OpenStack project
  • v0.3 released 17 Oct 2013
  • Supported Hadoop distros:
      Vanilla Apache Hadoop (reference implementation)
      Hortonworks Data Platform 1.3.x
      Intel Distribution on review
      Cloudera Distribution in blueprint
  • Included in OpenStack distros:
      RDO - http://openstack.redhat.com
      Mirantis OpenStack - http://software.mirantis.com

10. Cluster Provisioning Performance

11. Agenda (repeated)

12. EDP Overview
  • End users have data and questions
  • The data lives in a data repository
  • The questions are embodied in code
  • Savanna Elastic Data Processing (EDP) brings the Hadoop ecosystem to the end user
  • Hides all cluster management behind the scenes

13. EDP
   "Customers launch millions of Amazon EMR clusters every year."
   http://aws.amazon.com/elasticmapreduce/

14. EDP
  • Variety and depth of value add offerings on top of clouds are growing
  • Offerings are rarely open, rarely allow for choice
  • Examples - Google Cloud, Azure, AWS

15. EDP
   Savanna and EDP can both match and exceed use cases provided by most public clouds

16. EDP in Savanna v0.3
  • UI, integrated into Horizon, for ad-hoc analytics queries based on Hive or Pig
  • API to execute MapReduce jobs without exposing details of underlying infrastructure
  • Pluggable data sources: Swift
  • Supported job types: Jar, Pig, Hive
  • Integration with Oozie for workflow management

17. Agenda (repeated)

18. Cluster Ops in Savanna 0.3
  • REST API
  • Configuration templates
  • Manual cluster scaling
  • Data node anti-affinity and location control
  • Full support of data locality - rack and 4-level awareness for HDFS and Swift
  • Swift integration

19. OpenStack Integration in Savanna 0.3
  • OpenStack Dashboard plugin
  • Both Neutron and Nova Network support
  • Keystone trusts used for async operations
  • Python client

20. Agenda (repeated)

21. Live Demo

22. Icehouse Roadmap
  • Integration with OpenStack ecosystem: Heat, Tempest, Devstack, Ceilometer, Ironic
  • EDP enhancements
  • Code hardening
  • Polished API v2
  • Performance testing

23. Design Summit Sessions
   Friday, November 8
  • 1:30pm Network and installation topologies
  • 2:20pm Heat integration and scalability
  • 3:10pm Further OpenStack integration
  • 4:10pm Savanna in Icehouse
   http://goo.gl/2iEv8u

24. Q&A
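
The deck lists a REST API and configuration templates among the Savanna 0.3 cluster operations but shows no request. As a rough illustration only, the Python sketch below builds (without sending) a cluster-creation request. The host name, port, URL layout, and JSON field names are assumptions that do not appear in the slides and should be checked against the Savanna API documentation; `build_cluster_request` is a hypothetical helper, not part of the Savanna Python client.

```python
import json
import urllib.request

# Assumption: endpoint host, port, and version prefix are illustrative only.
SAVANNA_URL = "http://savanna.example.com:8386/v1.0"


def build_cluster_request(tenant_id, token, name, template_id, image_id,
                          plugin="vanilla", hadoop_version="1.2.1"):
    """Build, but do not send, a POST request asking for a new Hadoop cluster.

    Hypothetical helper: the field names below are assumed, not taken
    from the slides.
    """
    payload = {
        "name": name,
        "plugin_name": plugin,                # e.g. the vanilla Apache Hadoop plugin
        "hadoop_version": hadoop_version,
        "cluster_template_id": template_id,   # refers to a configuration template
        "default_image_id": image_id,         # Glance image used for the Hadoop VMs
    }
    return urllib.request.Request(
        url=f"{SAVANNA_URL}/{tenant_id}/clusters",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,            # Keystone token
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_cluster_request("tenant-123", "token-abc",
                                "demo-cluster", "template-1", "image-1")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect before it is passed to `urllib.request.urlopen` against a real deployment.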