Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
One click Hadoop clusters - anywhere
October, 2015
Janos Matyas, Senior Director of Engineering
Overview
• Introduction
• Goals and motivations
• Technology stack
• How it works
• Results/achievements/future plans
• Demo and Q&A
Goals and motivations
• Full Hadoop stack provisioning – everywhere
• Automate and unify the process
• Zero-configuration approach
• Same process through a cluster lifecycle (Dev, QA, UAT, Prod)
• Provide tooling - UI, REST API and CLI/shell
• Secure and multi-tenant
• SLA policy based autoscaling
Technology stack
• Docker
• Swarm
• Consul
• Apache Ambari
Docker
• Container based virtualization
• Lightweight and portable
• Build once, run anywhere
• Ease of packaging applications
• Automated and scripted
• Isolated
Docker – How it works
• Containers are isolated, but share OS and bins/libraries
• No need to emulate hardware
Swarm
• Native clustering for Docker
• Distributed container orchestration
• Same API as Docker
Swarm – How it works
• Swarm managers/agents
• Discovery services
• Advanced scheduling
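The scheduling bullet can be made concrete with a small sketch. Standalone Docker Swarm offered placement strategies such as "spread" and "binpack"; the Python below is a simplified illustration of those two ideas, not Swarm's actual implementation. The `Node` class, its fields, and the function names are assumptions made for this example, and memory is the only resource considered.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A hypothetical cluster node, tracked by memory and container count."""
    name: str
    total_mem_mb: int
    used_mem_mb: int = 0
    containers: int = 0

    def fits(self, mem_mb: int) -> bool:
        # A container fits if it does not push the node past its capacity.
        return self.used_mem_mb + mem_mb <= self.total_mem_mb


def _candidates(nodes, mem_mb):
    candidates = [n for n in nodes if n.fits(mem_mb)]
    if not candidates:
        raise RuntimeError("no node has enough free memory")
    return candidates


def pick_spread(nodes, mem_mb):
    """Spread idea: prefer the node running the fewest containers."""
    return min(_candidates(nodes, mem_mb), key=lambda n: n.containers)


def pick_binpack(nodes, mem_mb):
    """Binpack idea: prefer the fullest node that can still hold the container."""
    return max(_candidates(nodes, mem_mb),
               key=lambda n: n.used_mem_mb / n.total_mem_mb)


def place(node, mem_mb):
    """Record the placement on the chosen node."""
    node.used_mem_mb += mem_mb
    node.containers += 1
    return node
```

With one half-full node and one nearly empty node, `pick_spread` selects the emptier node while `pick_binpack` selects the fuller one, which leaves whole nodes free for large containers.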
Consul
• Service discovery/registry
• Health checking
• Key/Value store
• DNS
• Multi datacenter aware
Consul – How it works
• Consul servers/agents
• Consistency through a quorum (RAFT)
• Scalability due to gossip based protocol (SWIM)
• Decentralized and fault tolerant
• Highly available
• Consistency over availability (CP)
• Multiple interfaces - HTTP and DNS
• Support for watches
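The quorum bullet comes down to simple arithmetic: a Raft cluster makes progress only while a majority of its servers can communicate. The sketch below works that out in Python; the function names are made up for this example and are not part of Consul.

```python
def quorum_size(servers: int) -> int:
    """Smallest majority of a Raft cluster with the given number of servers."""
    if servers < 1:
        raise ValueError("a cluster needs at least one server")
    return servers // 2 + 1


def failures_tolerated(servers: int) -> int:
    """How many servers can be lost while a majority is still reachable."""
    return servers - quorum_size(servers)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(n, quorum_size(n), failures_tolerated(n))
```

Three servers need a quorum of two and tolerate one failure; five need three and tolerate two. Four servers still tolerate only one failure, which is why odd server counts are the usual choice.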
Apache Ambari
• Easy Hadoop cluster provisioning
• Management and monitoring
• Key feature - Blueprints
• REST API, CLI shell
• Extensible
  • Stacks
  • Services
  • Views
Apache Ambari – How it works
• Ambari server/agents
• Define a blueprint (blueprint.json)
• Define a host mapping (hostmapping.json)
• Post the cluster create
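The three steps above can be sketched against Ambari's REST API, which accepts a blueprint at `/api/v1/blueprints/<name>` and a cluster creation request at `/api/v1/clusters/<name>`. The Python below only builds the two requests and sends nothing. The server URL, credentials, host names, blueprint name, and stack version are placeholders, and the blueprint is heavily abbreviated (a deployable one lists many more components), so treat it as an outline rather than a working cluster definition.

```python
import base64
import json
from urllib.request import Request

AMBARI_URL = "http://ambari.example.com:8080"  # placeholder server address

# blueprint.json: which components run in which host group (abbreviated).
blueprint = {
    "Blueprints": {
        "blueprint_name": "demo-blueprint",  # placeholder name
        "stack_name": "HDP",
        "stack_version": "2.3",              # assumed version
    },
    "host_groups": [
        {"name": "master", "cardinality": "1",
         "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}]},
        {"name": "worker", "cardinality": "2",
         "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}]},
    ],
}

# hostmapping.json: which concrete hosts fill each host group.
host_mapping = {
    "blueprint": "demo-blueprint",
    "host_groups": [
        {"name": "master", "hosts": [{"fqdn": "master1.example.com"}]},
        {"name": "worker", "hosts": [{"fqdn": "worker1.example.com"},
                                     {"fqdn": "worker2.example.com"}]},
    ],
}


def ambari_post(path, payload, user="admin", password="admin"):
    """Build (but do not send) an authenticated POST to the Ambari API."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        url=f"{AMBARI_URL}/api/v1/{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "X-Requested-By": "ambari",  # Ambari rejects writes without it
        },
    )


# Step 1 and 2 define the documents; step 3 posts the cluster create.
blueprint_request = ambari_post("blueprints/demo-blueprint", blueprint)
cluster_request = ambari_post("clusters/demo-cluster", host_mapping)
```

Against a real server, each request would be passed to `urllib.request.urlopen`, blueprint first, since the cluster create refers to the blueprint by name.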
Cloudbreak
Cloudbreak is a cloud-agnostic Hadoop as a Service API.
It abstracts provisioning and eases the management
and monitoring of on-demand clusters.
Cloudbreak is a powerful left surf break over a coral reef, a mile off the
southwest of the island of Tavarua, Fiji.
Cloudbreak
• Benefits
  • Zero configuration
  • Elastic
  • Secure
  • Infrastructure agnostic
  • Heterogeneous clusters
  • Auto-scaling
• Main REST resources
  • /template – specifies an instance group infrastructure
  • /stack – creates an infrastructure based on a template
  • /blueprint – describes a Hadoop cluster
  • /cluster – creates a Hadoop cluster
Cloudbreak – How it works
• Start VMs - with a running Docker daemon
• Cloudbreak Bootstrap
  • Start Consul Cluster
  • Start Swarm Cluster (Consul for discovery)
• Start Ambari servers/agents - Swarm API
• Ambari services registered in Consul (Registrator)
• Post Blueprint
Cloudbreak - Features
• Extensible – easy to implement Service Provider Interface
• Cloudbreak “recipes”
  • Automate host configuration
  • Pre/post Ambari lifecycle hooks
  • Services reconfiguration
  • Automate/execute custom actions
• Side effects
  • Ambari CLI/shell and Groovy based client
  • Cloud Foundry’s UAA Dockerized
  • Munchausen – bootstrap Swarm with Consul
  • Dockerized full Hadoop stack (Apache Hadoop 60K+, Ambari 12K+, Spark 10K+ downloads)
Cloudbreak - Hadoop as a Service API
• Public tech preview
  • Microsoft Azure
  • Amazon AWS
  • Google Cloud Platform
  • OpenStack
• Private tech preview – R&D
  • Bare metal
  • Rackspace Managed Cloud
  • HP Helion Public Cloud
*integration SPI is available
Cloudbreak – SPI
• Cloud providers have very different APIs, though the underlying model is very similar
• Non-invasive implementation
• One interface to implement - CloudPlatformConnector
[Diagram: common resource model. Labels: Network, Subnet, Security Group, Rules, Image, UserData, Instance, Volume, IP Address]
Periscope
Periscope is a heuristic Hadoop scheduler
associated with a QoS profile. Built on
YARN schedulers and on cloud and VM resource
management APIs, it allows SLAs to be associated
with applications and customers.
Periscope is a powerful, fast, thick and top-to-bottom right-hander, eastward from
Sumbawa's famous west coast.
Periscope
• Benefits
  • Zero configuration
  • Metric and time based alarms
  • SLA policy based autoscaling
  • Secure
  • Hostgroup specific
• Main REST resources
  • /clusters – specifies a cluster to be monitored
  • /alerts – time and metric based
  • /policies – specifies an SLA policy for a cluster based on an alarm
  • /applications – specifies an SLA policy for an application (under development)
Periscope – How it works
• Configures/monitors alarms in Ambari
• Sets up alarm and cooldown periods
• Manages cluster sizes
• Associates SLA scaling policies with alarms
• Orchestrates Cloudbreak to up/downscale the cluster
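The interplay of alarms, cooldown periods, and cluster size limits can be sketched as a single decision function. The Python below is an illustration under stated assumptions, not Periscope's code: `ScalingPolicy`, its fields, and `next_cluster_size` are hypothetical names, and times are plain seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingPolicy:
    """A hypothetical SLA scaling policy attached to one alarm."""
    adjustment: int   # nodes to add (positive) or remove (negative)
    min_nodes: int    # lower bound on cluster size
    max_nodes: int    # upper bound on cluster size
    cooldown_s: int   # minimum seconds between two scaling actions


def next_cluster_size(current: int, policy: ScalingPolicy, alarm_firing: bool,
                      now_s: int, last_scaled_s: Optional[int]) -> int:
    """Return the cluster size to request, or the current size if nothing changes."""
    if not alarm_firing:
        return current
    # Skip scaling while the previous action is still inside its cooldown window.
    if last_scaled_s is not None and now_s - last_scaled_s < policy.cooldown_s:
        return current
    # Apply the adjustment, clamped to the configured size limits.
    return max(policy.min_nodes, min(policy.max_nodes, current + policy.adjustment))
```

A caller would compare the result with the current size and, when they differ, ask the provisioning layer to upscale or downscale by the difference and record the time of the action.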
Demo and Q&A