OpenShift YARN - Strata 2014

Upload: hortonworks

Post on 10-Dec-2014

DESCRIPTION

Learn about the integration work that has been done with YARN, Red Hat OpenShift, and Kubernetes Docker container orchestration. During this presentation we will cover the basics of this YARN integration effort and then launch into a demo. You won’t want to miss seeing a web application Docker container, Storm, and Hive SQL queries all running in the same HDP cluster!

TRANSCRIPT

Page 1: Openshift YARN - strata 2014

Page 1 © Hortonworks Inc. 2014

OpenShift scheduling Docker containers in YARN via Kubernetes

Page 2: Openshift YARN - strata 2014

Docker: a shipping container system for code

Multiplicity of stacks: Static website, Web frontend, User DB, Queue, Analytics DB

Multiplicity of hardware environments: Development VM, QA server, Public Cloud, Contributor’s laptop, Production Cluster, Customer Data Center

Do services and apps interact appropriately?

Can I migrate smoothly and quickly?

An engine that enables any payload to be encapsulated as a lightweight, portable, self-sufficient container that can be manipulated using standard operations and run consistently on virtually any hardware platform

Page 3: Openshift YARN - strata 2014

Why are Docker containers lightweight?

HDP 2.1 Hortonworks Data Platform

I/O performance comparable to Bare Metal

Page 4: Openshift YARN - strata 2014

Kubernetes – Container Orchestrator

•  Service for container cluster management
•  Allows deploying and managing applications running on multiple hosts using Docker
•  Open sourced by Google
•  Supports GCE, CoreOS, Azure, vSphere
•  Manages Docker containers as its default implementation
•  Master – maintains the state of the Kubernetes server runtime
   •  Scheduler, API server, registries, storage
•  Minions – represent the hosts where containers are created
   •  Kubelet – manages pod and container lifecycle
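
The roles listed on this slide (a master whose scheduler places work, minions that host containers, and a kubelet that manages pod lifecycle) can be illustrated with a toy model. This is a minimal Python sketch, not Kubernetes code: the class names, the per-host pod capacity, and the "most free slots" placement policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pod:
    """A group of containers placed together (hypothetical minimal model)."""
    name: str
    images: List[str]


@dataclass
class Minion:
    """A host that runs containers; the kubelet role is folded into run()."""
    name: str
    capacity: int  # maximum number of pods this host accepts (assumed unit)
    pods: List[Pod] = field(default_factory=list)

    def run(self, pod: Pod) -> None:
        # Stand-in for the kubelet starting the pod's containers.
        self.pods.append(pod)


class Master:
    """Holds cluster state and places pods; a toy stand-in for the scheduler."""

    def __init__(self, minions: List[Minion]) -> None:
        self.minions = minions

    def schedule(self, pod: Pod) -> Minion:
        # Assumed policy: choose the minion with the most free pod slots.
        candidates = [m for m in self.minions if len(m.pods) < m.capacity]
        if not candidates:
            raise RuntimeError(f"no capacity for pod {pod.name}")
        target = max(candidates, key=lambda m: m.capacity - len(m.pods))
        target.run(pod)
        return target
```

Calling `Master([...]).schedule(Pod("web", ["nginx"]))` returns the minion that received the pod, and raises `RuntimeError` once every minion is full.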

Page 5: Openshift YARN - strata 2014

OpenShift

•  Red Hat’s platform as a service for applications in the cloud that supports both public and private cloud
•  Provides high level abstraction for applications on top of containers allowing easy scaling, service discovery, and deployment
•  Enables Docker image authors to easily deliver reusable application components, including highly available databases, monitoring and log aggregation tools, service discovery platforms, and prepackaged web applications
•  Allows developers to deeply customize their runtime environments while preserving operational support at scale for those applications

Page 6: Openshift YARN - strata 2014

Kubernetes/YARN/Docker Integration

YARN Node Manager

YARN Node Manager

Kubernetes AppMaster

Page 7: Openshift YARN - strata 2014

Understanding Storm via a Real-World Use Case

A large truck fleet company wants to capture, in real time, events from drivers in its trucks on the road across the US. Sensor devices on the trucks capture all kinds of events, ranging from vehicle diagnostics to driver infractions.
•  E.g.: excessive braking/acceleration, speeding, start/stop, etc.

Initial business requirement:
•  Stream these events in, filter on violations, and do real-time alerting on “lots” of erratic behavior over a short period of time.
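
The initial requirement (filter on violations, then alert on "lots" of erratic behavior in a short period) amounts to a per-driver sliding-window count. The following is a minimal Python sketch of that idea, independent of Storm. The event type names, the threshold of 3, and the 60-second window are assumptions chosen for illustration, and timestamps are assumed to arrive in non-decreasing order per driver.

```python
from collections import defaultdict, deque

# Assumed event type names; the slide only lists the kinds of infractions.
VIOLATIONS = {"speeding", "excessive_braking", "excessive_acceleration"}


class ViolationAlerter:
    """Counts recent violations per driver and flags when a threshold is hit."""

    def __init__(self, threshold=3, window_seconds=60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        # driver_id -> timestamps of that driver's recent violations
        self._recent = defaultdict(deque)

    def process(self, driver_id, event_type, timestamp):
        """Return True if this event brings the driver to the alert threshold."""
        if event_type not in VIOLATIONS:
            return False  # filter step: non-violation events are ignored
        window = self._recent[driver_id]
        window.append(timestamp)
        # Evict violations that fall outside the time window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold
```

In a streaming topology this logic would sit in a monitoring step, with each `True` result producing an alert message.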

Page 8: Openshift YARN - strata 2014

High Level Architecture

Truck Streaming Data: T(1), T(2), … T(N)

•  High Speed Ingestion: Messaging Grid (WMQ, ActiveMQ, Kafka), truckevents topic
•  Stream Processing with Storm: Kafka Spout, HBase Bolt, Monitoring Bolt, HDFS Bolt
•  Distributed Storage: HDFS (Write to HDFS)
•  Alerts: Create Alerts, ActiveMQ Alert Topic, Email
•  Real-time Serving with HBase: driver dangerous events, driver dangerous events count (Write to HBase)
•  Real-Time Streaming Driver Monitoring App: Spring WebApp with SockJS WebSockets; query driver events in real-time; consume alerts in real-time
•  Interactive Query (TEZ): perform ad hoc queries on driver/truck events and other related data sources
•  Batch Analytics (MR2): do batch analysis/models & update HBase with right thresholds for alerts (Update Alert Thresholds)
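
The stream-processing part of this architecture names one Kafka spout and three bolts (HDFS, HBase, monitoring). As a rough illustration of how one event could fan out to those sinks, here is a small Python sketch. It is not Storm code: the exact wiring between spout and bolts is not recoverable from the slide, so the fan-out shown, the `violation` field, and the rule that only violations reach the alert topic are all assumptions.

```python
from collections import defaultdict

# Assumed wiring: the spout feeds all three bolts directly.
TOPOLOGY = {
    "kafka_spout": ["hdfs_bolt", "hbase_bolt", "monitoring_bolt"],
}

# Where each bolt writes, using the sink names from the slide.
SINKS = {
    "hdfs_bolt": "HDFS",
    "hbase_bolt": "HBase",
    "monitoring_bolt": "ActiveMQ alert topic",
}


def route(event, source="kafka_spout"):
    """Return a mapping of sink name to the events delivered for one input."""
    delivered = defaultdict(list)
    for bolt in TOPOLOGY.get(source, []):
        # Assumption: the monitoring bolt forwards only violation events.
        if bolt == "monitoring_bolt" and not event.get("violation"):
            continue
        delivered[SINKS[bolt]].append(event)
    return dict(delivered)
```

Every event is persisted to HDFS and HBase in this sketch, while the alert topic only receives events flagged as violations.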

Page 9: Openshift YARN - strata 2014

HDP Provides a Single Data Platform

(Same architecture diagram as Page 8.)

YARN enables 4 different apps/workloads on a single cluster

HDP Data Lake

Page 11: Openshift YARN - strata 2014

Demo: OpenShift scheduling Docker container in YARN

Running in Docker