choosing the right framework for running docker containers in prod

Post on 11-Jan-2017


Choosing the Right Framework for Running Docker Containers in Prod

Presented by: Josh Padnick

Phoenix DevOps

Docker is a game-changer.

Why Devs

• Lightweight: Containers are just isolated processes. We can start a new container in seconds.

• Portable: My Mac, the Linux EC2 instance, and your Windows PC all run the exact same container.

• Ecosystem: I can easily share images, manage private images, and use “official” images for virtually all open source software.

Why Devs

• Squeeze More Resources out of a Single Server

Did you know this dirty secret of the Infrastructure-as-a-Service world?

[Figure: Typical Data Center Resource Utilization. Pie chart with segments labeled “In Use” and “Free”; the values shown are 85% and 15%.]

SOURCE: http://radar.oreilly.com/2014/12/why-the-data-center-needs-an-operating-system.html

So can I run multiple containers in a single VM?

[Diagram: two VMs, VM-1 and VM-2, each running several containers of Service A, Service B, and Service C]

Something like this?

Yes! But, well…

The Gartner Tech Hype Cycle

Any guesses where the “multi-container VM” paradigm is?

SOURCE: https://setandbma.wordpress.com/2012/05/28/technology-adoption-shift/

My Take on This

In reality, the exact spot varies by team, so this is a bit of a generalization.

Today’s talk is about our options for that red dot.

For each option, we’ll cover:

• Pros

• Cons

• When to use

• Full-stack web-app engineer for 12+ years.

• Since I’ve worked with many different teams, I generally help accelerate the DevOps/AWS learning curve for teams.

• PhxDevOps Clients include: Intel, Infusionsoft, American Bible Society, CÜR Music, plus multiple startups and web design companies.

Josh Padnick

These slides are posted on http://joshpadnick.com

I help software teams scale their app using DevOps and AWS. http://PhoenixDevOps.com

@OhMyGoshJosh

My LinkedIn

My GitHub

Want to know more about building scalable apps on AWS? Check out a 12,000+ word article I wrote on how at http://bit.ly/1EtYRbL.

Disclaimers

• I have a bias toward AWS and may leave out solutions from other IaaS providers such as Azure.

• The solutions we cover today are deep and diverse. This talk reflects my own experiences but your mileage may vary!

Agenda

• CoreOS in 60 seconds

• Theory of Multi-Container VM’s

• The Three Paradigms of Multi-Container VM’s

• Cover all the Major Solutions

We only have 60 minutes. So we’ll move fast.

CoreOS in 60 Seconds

What is CoreOS?

• It’s super stripped-down Linux. You don’t even get a package manager.

• The idea is you run everything as a container.

• CoreOS is based on ChromiumOS, which itself is based on Gentoo Linux.

• Uses systemd for init.

CoreOS and This Presentation

• Because CoreOS is “built for Docker”, many solutions use it as their default OS.

• In reality, you can usually use any OS that runs Docker natively, but CoreOS is often the “recommended” Linux distro for Docker.

Theory of Multi-Container VM’s

What does it take to run this?

[Diagram: VM-1 and VM-2, each running several containers of Service A, Service B, and Service C]

Docker Builder

We need somewhere to build our image.

• If we build from a fresh environment each time, every Docker image/layer is downloaded from scratch.

• Ideally, our “Docker Builder” has pre-downloaded (“seeded”) all our most popular Docker images.

• In practice, this is managed by your build tool, like Jenkins, CircleCI, Shippable, etc.

Docker Registry

We need somewhere to store our built images.

• Main options here are:

• Cloud: Docker Hub, Quay.io

• On-Premise: Docker Trusted Registry (Paid), Docker Distribution (Free), Quay.io (Paid)

Automated Deployment

We need a way to deploy our Docker image into the cluster.

Container Scheduling

We need something to decide which host will launch our new container.

[Diagram: VM-1 and VM-2 running Service A, Service B, and Service C containers]

“Increase Service C container count from 1 to 2.”

[Diagram: the same two VMs, now with a second Service C container]

Container Scheduling

• One of the most important considerations when choosing a host is “who’s got the memory and CPU I need?”

• But we also need to know who’s in a different Availability Zone / Data Center so we can achieve high fault tolerance.
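To make the two scheduling considerations concrete, here is a minimal sketch in Python. It is not how any particular framework schedules containers. The `Host` fields, the `pick_host` function, and the zone names are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    zone: str          # Availability Zone / data center label (hypothetical)
    free_cpu: float    # CPU cores not yet reserved
    free_mem_mb: int   # memory not yet reserved, in MB


def pick_host(hosts, placements, service, cpu, mem_mb) -> Optional[Host]:
    """Return the host that should run one more container of `service`.

    `placements` maps a service name to the hosts already running it.
    """
    # Step 1: "who's got the memory and CPU I need?"
    candidates = [h for h in hosts if h.free_cpu >= cpu and h.free_mem_mb >= mem_mb]
    if not candidates:
        return None

    # Step 2: prefer the zone running the fewest copies of this service,
    # then the host with the most free memory.
    def zone_load(host):
        return sum(1 for p in placements.get(service, []) if p.zone == host.zone)

    best = min(candidates, key=lambda h: (zone_load(h), -h.free_mem_mb))
    best.free_cpu -= cpu
    best.free_mem_mb -= mem_mb
    placements.setdefault(service, []).append(best)
    return best


hosts = [
    Host("vm-1", "us-east-1a", free_cpu=2.0, free_mem_mb=4096),
    Host("vm-2", "us-east-1b", free_cpu=1.0, free_mem_mb=2048),
]
placements = {}
first = pick_host(hosts, placements, "service-c", cpu=0.5, mem_mb=512)
second = pick_host(hosts, placements, "service-c", cpu=0.5, mem_mb=512)
print(first.name, second.name)
```

The second Service C container lands in the other zone even though the first host still has more free memory. That is the fault-tolerance trade-off the second bullet describes.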

Routing / Load Balancing

We need a way to route a request to any of our containers.

[Diagram: VM-1 and VM-2 running Service A, Service B, and Service C containers]

GET ServiceB HTTP/1.1

[Diagram: VM-1 and VM-2 with a “Routing / Load Balancing Solution” box in front of their containers]

Service Discovery

When we launch new containers, we need to tell our router they exist.

[Diagram: VM-1 and VM-2 with the Routing / Load Balancing Solution in front of their containers]

“Increase Service C container count from 1 to 2.”

[Diagram, repeated over several slides: a second Service C container is added alongside the Routing / Load Balancing Solution]
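The relationship between service discovery and routing can be sketched in a few lines of Python. This is an in-memory stand-in, not the API of any real discovery tool. The `ServiceRegistry` class, its method names, and the `host:port` addresses are hypothetical.

```python
import itertools
from collections import defaultdict


class ServiceRegistry:
    """In-memory stand-in for a real service-discovery backend."""

    def __init__(self):
        self._backends = defaultdict(list)   # service name -> ["host:port", ...]
        self._cursors = {}                   # service name -> round-robin iterator

    def register(self, service, address):
        """Called when a new container starts."""
        self._backends[service].append(address)
        # Rebuild the iterator so the new backend joins the rotation.
        self._cursors[service] = itertools.cycle(list(self._backends[service]))

    def deregister(self, service, address):
        """Called when a container stops or fails."""
        self._backends[service].remove(address)
        self._cursors[service] = itertools.cycle(list(self._backends[service]))

    def route(self, service):
        """Pick the next backend for `service`, round-robin."""
        if not self._backends[service]:
            raise LookupError(f"no live containers for {service}")
        return next(self._cursors[service])


registry = ServiceRegistry()
registry.register("service-c", "vm-1:32768")
picks_before = [registry.route("service-c") for _ in range(2)]

# "Increase Service C container count from 1 to 2."
registry.register("service-c", "vm-2:32768")
picks_after = [registry.route("service-c") for _ in range(4)]
print(picks_before, picks_after)
```

Until the second container is registered, the router has no way to send it traffic. That gap is what a service discovery mechanism closes.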

Auto-Restart Failed Containers

Something needs to know that a container failed and auto-restart it.

Extract Container Logs

We need a way to read logs from all containers.

Monitor Everything

We need to monitor cluster resources and individual containers.

Does our cluster have “state”?

• Yes!

• Router needs to know which containers are from which services.

• We need to know which hosts are actually in our cluster.

Storing Cluster State

• This topic alone warrants full books.

• One option for storing state is to simply use a database like PostgreSQL.

• But the more popular option is to replicate state across hosts in the cluster and keep the copies in agreement using a “consensus algorithm.” I call this a cluster datastore.

• The most popular such solutions are: etcd, Consul, and Zookeeper.
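One practical consequence of consensus-based datastores is the majority rule: a write is committed only once more than half of the voting members agree. The arithmetic below is a small sketch of that rule. The function names are my own and do not come from any of these tools.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority still exists."""
    return members - quorum_size(members)


for n in (1, 3, 4, 5):
    print(f"{n} members: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

This is why these datastores are usually run with an odd number of members. Going from 3 to 4 members raises the quorum without tolerating any additional failures.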

Unit of Container Deployment

• We need something that describes what kind of container to deploy.

• This is typically a declarative file in either YAML or JSON that declares all aspects of our docker run command, whether 2+ containers are run together, etc.
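To illustrate what “declares all aspects of our docker run command” means, here is a sketch that turns a small declarative spec into `docker run` arguments. The spec format, its keys, and the image name are hypothetical and do not match any particular framework’s schema. The Docker flags used (`-d`, `--name`, `-p`, `-e`, `--memory`) are standard `docker run` options.

```python
import shlex


def docker_run_args(spec: dict) -> list:
    """Translate a declarative container spec into a `docker run` argument list."""
    args = ["docker", "run", "-d", "--name", spec["name"]]
    for host_port, container_port in spec.get("ports", {}).items():
        args += ["-p", f"{host_port}:{container_port}"]
    for key, value in sorted(spec.get("env", {}).items()):
        args += ["-e", f"{key}={value}"]
    if "memory" in spec:
        args += ["--memory", spec["memory"]]
    args.append(spec["image"])
    args += spec.get("command", [])
    return args


# Hypothetical unit of deployment; in practice this would live in a YAML or JSON file.
spec = {
    "name": "service-b",
    "image": "example/service-b:1.0",
    "ports": {8080: 80},
    "env": {"LOG_LEVEL": "info"},
    "memory": "256m",
}
print(shlex.join(docker_run_args(spec)))
```

The sketch only builds the command and never executes it. Each framework covered below has its own equivalent of this spec.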

The Theory in Summary

• Docker builder

• Docker registry

• Automated deployment

• Container scheduling

• Routing / Load Balancing

• Service discovery

• Auto-restart failed containers

• Logging

• Monitoring

• Cluster datastore

• Unit of Container Deployment

Paradigms of Multi-Container VM’s

Three Paradigms

• Cluster Frameworks

• Platform-as-a-Service (PaaS)

• Data Center Operating Systems

Paradigm #1: Cluster Frameworks

The Big Idea

• You control the infrastructure (e.g. AWS, Azure)

• You’re given an unopinionated set of primitives on top of which you can build your own solution.

• Primitives include launching containers, but not full deployment.

Major Cluster Frameworks

• CoreOS + Fleet

• Amazon EC2 Container Service (ECS)

• Docker Swarm

(We’ll cover each of these today)

Paradigm #2: Platform-as-a-Service (PaaS)

The Big Idea

• You control the infrastructure (e.g. AWS, Azure)

• Install the PaaS tool on top of your own infrastructure.

• PaaS tool typically sits on top of a Cluster Framework.

• You’re not 100% sure how it works, but it solves your needs today and you can always deep dive later, or (hopefully) get commercial support.

Major PaaS Solutions

(We’ll cover Deis in depth shortly)

https://github.com/remind101/empire

Paradigm #3: Data Center Operating Systems

The Big Idea

• You control the infrastructure (e.g. AWS, Azure)

• You’re given an opinionated framework which has everything you need to deploy.

• You operate at the abstraction level of “cluster” and really don’t care when individual hosts die.

• These tend to be the most powerful, and the most complex.

Major Data Center Operating System Frameworks

• Mesos

• Kubernetes

(We’ll cover each of these today)

( + ? )

Hybrid Paradigms: Where A runs on B

Hybrid Combo #1: Run Kubernetes on Mesos

Hybrid Combo #2: Run Kubernetes as a Hosted Service

Hybrid Combo #3: Run the Open Source PaaS “Empire” on top of EC2 Container Service

https://github.com/remind101/empire

Hybrid Combo #4: Use Docker Swarm as the UX to Mesos.

Mentally Managing the Hybrids

• Don’t get too caught up on these exotic combinations.

• Focus first on one of the “non-hybrid” technologies.

• Then evaluate what can be run on top of your choice technology, and whether it will make life easier for you.

The Major Solutions

Disclaimers

• We just don’t have enough time to cover each solution in depth.

• We only get about 3 minutes per solution, so let’s get started!

Cluster Framework Solution: CoreOS + Fleet

How It Works

• Launch a CoreOS cluster on the IaaS platform of your choice (e.g. AWS, Azure, VMware, etc.)

• CoreOS comes with a CLI tool, fleet, that enables launching containers, but does not constitute a full deployment system.

• Best thought of as a set of primitives you can work with, not a full-fledged framework.

• Define systemd unit files to describe the Docker container you want to launch.
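As a rough illustration of the last bullet, the sketch below renders a systemd template unit of the kind fleet schedules. The `render_fleet_unit` function, the service and image names, and the `zone` metadata key are assumptions for illustration. The `[X-Fleet]` options `Conflicts` and `MachineMetadata` are used as I understand them from fleet’s documentation, so check them against the version you run.

```python
def render_fleet_unit(service, image, port, zone=None):
    """Render a systemd template unit (e.g. service-b@.service) for fleet."""
    lines = [
        "[Unit]",
        f"Description={service}",
        "After=docker.service",
        "Requires=docker.service",
        "",
        "[Service]",
        # The leading "-" tells systemd to ignore a failure of this command,
        # for example when no old container exists yet.
        f"ExecStartPre=-/usr/bin/docker rm -f {service}-%i",
        f"ExecStart=/usr/bin/docker run --name {service}-%i -p {port}:{port} {image}",
        f"ExecStop=/usr/bin/docker stop {service}-%i",
        # systemd restarts the container process if it exits.
        "Restart=always",
        "",
        "[X-Fleet]",
        # Ask fleet not to place two instances of this unit on one machine.
        f"Conflicts={service}@*.service",
    ]
    if zone:
        # Only schedule onto machines tagged with this (hypothetical) metadata.
        lines.append(f"MachineMetadata=zone={zone}")
    return "\n".join(lines) + "\n"


unit = render_fleet_unit("service-b", "example/service-b:1.0", 8080, zone="us-east-1a")
print(unit)
```

Nothing in the unit mentions CPU or memory. That is consistent with fleet’s lack of resource-aware scheduling noted below.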

Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Built into fleet, but no resource-aware scheduling
Deployment: Routing | Roll Your Own
Deployment: Service Discovery | Roll Your Own
Auto-Restart Failed Containers | Built into fleet
Monitoring | Roll Your Own
Logging | Roll Your Own
Cluster Data Store | etcd
Unit of Deployment | Systemd unit file

Pros

• Relatively mature/stable among Cluster Frameworks.

• Once you’ve set up etcd, everything else “just works”.

• RESTful API into fleet allows for easily building out your own custom solution.

• Fleet will auto-restart failed containers.

• Tagging cluster nodes allows for clever distribution of containers (e.g. across Availability Zones).

• CoreOS gives us a well-defined method for updating individual cluster nodes to the latest CoreOS.

• Commercial support available.

Cons

• Setting up etcd can be painful.

• Fleet does not allow resource-aware scheduling, so containers may run out of resources.

• Fleet does not expose a primitive for “transferring” a container from one cluster node to another.

• No built-in way to monitor cluster-wide resource consumption.

• Not usable for a production cluster without significant setup overhead (e.g. setting up service discovery).

• Learning fleet ultimately requires learning systemd and discovering what fleet commands actually do.

When To Use It

• You want to learn the foundations of CoreOS.

• You want high customizability over your setup and can tolerate non-resource-aware scheduling.

• You’re willing to manually handle many operations such as launching additional containers.

Cluster Framework Solution: Amazon EC2 Container Service (ECS)

How It Works

• Launch at least 3 EC2 instances in AWS.

• Install the ECS agent on each node (or launch an AMI with the agent pre-installed).

• Cluster setup “just works”.

• Define a “Task Definition” to describe how one or more Docker containers should be launched.

• Define a “Service” that launches one or more instances of the Task Definition, and ECS auto-deploys your Tasks (containers).
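For a feel of what a Task Definition and a Service contain, here is a sketch that builds both as plain Python dictionaries and prints the Task Definition as JSON. The field names follow the ECS task definition and service parameters as I know them. The service name, image, and sizes are made-up values, and nothing here calls AWS.

```python
import json

# Hypothetical Task Definition: one container, with a static host port mapping.
task_definition = {
    "family": "service-b",
    "containerDefinitions": [
        {
            "name": "service-b",
            "image": "example/service-b:1.0",
            "cpu": 256,        # CPU units reserved for the container
            "memory": 512,     # memory limit in MiB
            "essential": True,
            "portMappings": [{"containerPort": 80, "hostPort": 8080}],
        }
    ],
}

# Hypothetical Service: keep two copies of the task running.
service = {
    "serviceName": "service-b",
    "taskDefinition": "service-b",
    "desiredCount": 2,
}

print(json.dumps(task_definition, indent=2))
```

Because the host port is fixed, two copies of this task cannot share one instance. That is one reason a spare node matters for rolling deployments.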

Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Resource-aware, pluggable scheduler. Can be swapped w/ custom one.
Deployment: Routing | Leverages AWS Elastic Load Balancers
Deployment: Service Discovery | Built in to services
Auto-Restart Failed Containers | Built in to services
Monitoring | Basic monitoring included at cluster level.
Logging | Roll Your Own
Cluster Data Store | Zookeeper (but this is hidden to us)
Unit of Deployment | Task Definition

Pros

• Very easy to set up.

• Simple UX.

• Low learning curve.

• Covers most of what you need out of the box, including built-in routing and service discovery.

• Presumably AWS will keep improving it.

• Supported via AWS.

Cons

• Doesn’t support dynamic port mapping from container to host.

• Each service requires its own Elastic Load Balancer, which is $18/month. (Unless you’re willing to expose a service on a port other than 80/443)

• Supports rolling deployments provided you have a spare node to launch a new service instance on. Blue/Green deployments are claimed as a feature, but require out-of-band customization.

• It is not recommended to reuse the Zookeeper cluster datastore that ECS runs internally, so you may end up with two cluster datastores (e.g. Zookeeper + Consul).

• Use of “private subnets” requires two separate clusters, one for public and one for private.

When To Use It

• You use AWS and…

• You want to get up and running quickly with your Docker-based microservices framework.

• You want to run your monolith using containers today, knowing you can migrate to other cluster technologies in the future.

• You want an official solution with official support.

• You want to minimize the number of vendors/technologies you work with.

Cluster Framework Solution:

Docker Swarm

How It Works

• You use “docker-machine” to launch multiple EC2 instances (or other VMs). Each EC2 instance is configured with the Docker daemon and the docker-swarm agent (which is just a container).

• You launch one or more “Swarm Masters”, one of which is the “master leader.” You use this to control your cluster.

• You can now run Docker CLI commands to launch containers on your cluster.

Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Resource-aware, pluggable scheduler. Can be swapped w/ custom one.
Deployment: Routing | Roll Your Own
Deployment: Service Discovery | Roll Your Own
Auto-Restart Failed Containers | Open GitHub Issue: https://github.com/docker/swarm/issues/599
Monitoring | Roll Your Own
Logging | Roll Your Own
Cluster Data Store | Pluggable! See https://docs.docker.com/swarm/discovery/
Unit of Deployment | Docker container, or Docker Compose manifest

Pros

• Use the Docker CLI you’ve come to know and love.

• You can run other Docker tools that call the old Docker CLI directly on top of Swarm and they will “just work”. Does this matter?

• Potentially simpler to program against compared to CoreOS fleet.

• Resource-aware scheduling.

• Open source event bus available allows for interesting possibilities in response to cluster events (esp. around service discovery).

• Official Docker solution.

Cons

• Not yet recommended for production.

• My own experience with docker-machine and docker-swarm has been underwhelming in terms of stability.

• Other than “official Docker solution” and “use Docker CLI”, I don’t see any superior features to alternatives.

When To Use It

• For experiments and curiosity.

• Docker swarm is intriguing, but under heavy development and doesn’t yet present a clear value proposition compared to alternatives.

• But check back in 6 months, and it may be a solid contender. See Project Orca for an exciting opinionated take on the UX. https://youtu.be/8vSPpPSd00w?t=1h25m16s

PaaS Solution: Deis

How Deis Works

[Diagram: a Deis cluster made up of Control Plane Components, Data Plane Components, and a Router per Host. A second slide adds the application containers: Service A, Service B, Service B, and Service C.]

Deis Workflow for Dev’s (basically, Heroku on your own infrastructure)

SOURCE: http://docs.deis.io/en/latest/understanding_deis/concepts/

How Deis Works for Operators

SOURCE: http://docs.deis.io/en/latest/understanding_deis/architecture/

Docker Builder | Included!
Docker Registry | Included!
Deployment: Scheduling | Defaults to resource-unaware fleet. Pluggable schedulers are in tech-preview.
Deployment: Routing | Included!
Deployment: Service Discovery | Included!
Auto-Restart Failed Containers | Included!
Monitoring | Roll Your Own
Logging | Built-in logspout means you can send your logs anywhere.
Cluster Data Store | PostgreSQL + Ceph
Unit of Deployment | Heroku buildpack, Dockerfile, or Docker image.

Pros

• Everything in one package, ready to go. Get up and running pretty quickly.

• Nice workflow for devs.

• Open source

• Great community

• Good paradigm of what you would eventually need to build.

• Commercial support available through Engine Yard.

Cons

• Learning curve for operators can feel steep.

• When the PaaS fails, it’s time to start climbing the learning curve. For example, I once terminated a node, broke the ceph cluster and had to dig into the guts to figure out how to fix it.

• Deis’s architectural opinions may differ from your own.

Data Center Operating System Solution #1: Mesos

How Mesos Works

[Diagram, built up over the following slides: a cluster containing three Zookeeper nodes, three Mesos Masters, ten Mesos Slaves (aka Agents), and four frameworks: Chronos, Marathon, Apache Spark, and Cassandra]

Zookeeper: “Our job is to store cluster state!”

Mesos Masters: “We make ‘resource offers’ to ‘frameworks’.”

Mesos Slaves (aka Agents): “We run ‘tasks’, which are Docker containers. We take our orders from the Master.”

A Resource Offer: “Master, I have 2 CPU cores, 8 GB RAM, and 25GB of disk space available!”

A Resource Offer: “Master, why go for 2 CPUs when I’ve got 3 CPUs and 10 GBs of RAM!”

You’ve just seen Tier 1 of the Mesos resource scheduling algorithm.

Master Forwards Resource Offers: “I have received your offers and they will be forwarded to whomever I please.”

Master Forwards Resource Offers: “Marathon, I choose you first! I can offer you 3 CPUs and 10 GB of RAM.”

Frameworks Accept/Reject Offers: “Yawn. Pass.”

Master Forwards Resource Offers: “Chronos, surely you have need of resources. I offer you the same!”

Frameworks Accept/Reject Offers: “Let’s do this. I need 1 GB of RAM and 1 CPU core.”

Master Schedules Task on a Slave: “It shall be so! Agent #22, you shall run this task.”

Slave Receives the Task: “Strength and honor, sire!”

Slave starts running the task as a Docker container.
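The offer cycle from these slides can be mimicked with a toy simulation. This is a deliberately simplified sketch, not the Mesos API. The `Offer` and `Framework` classes, the task names, and the fixed order in which frameworks are asked are all assumptions. The real master decides who receives an offer using its own allocation policy.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str
    cpus: float
    mem_gb: float


class Framework:
    def __init__(self, name, pending):
        self.name = name
        self.pending = list(pending)   # [(task_name, cpus, mem_gb), ...]

    def consider(self, offer):
        """Return the tasks to launch on this offer; an empty list declines it."""
        accepted, cpus, mem = [], offer.cpus, offer.mem_gb
        for task in list(self.pending):
            _, need_cpu, need_mem = task
            if need_cpu <= cpus and need_mem <= mem:
                accepted.append(task)
                self.pending.remove(task)
                cpus -= need_cpu
                mem -= need_mem
        return accepted


def offer_cycle(offers, frameworks):
    """Forward each offer to frameworks in turn until one accepts it."""
    launched = []
    for offer in offers:
        for framework in frameworks:
            tasks = framework.consider(offer)
            if tasks:
                launched += [(framework.name, offer.agent, t[0]) for t in tasks]
                break   # simplification: the rest of the offer is not re-offered
    return launched


offers = [Offer("agent-22", cpus=3, mem_gb=10)]
frameworks = [
    Framework("marathon", pending=[]),                        # "Yawn. Pass."
    Framework("chronos", pending=[("nightly-report", 1, 1)]),  # "Let's do this."
]
launched = offer_cycle(offers, frameworks)
print(launched)
```

The point of the two tiers is visible even in this toy. The master only decides who sees an offer. Each framework decides what, if anything, to run on it.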

Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Resource-aware scheduler. Can run other “frameworks” side by side.
Deployment: Routing | Roll Your Own
Deployment: Service Discovery | Roll Your Own
Auto-Restart Failed Containers | Included!
Monitoring | Roll Your Own
Logging | Roll Your Own
Cluster Data Store | Zookeeper
Unit of Deployment | Mesos Task (which will usually be a Docker container, usually submitted through Marathon).

But wait, there’s more!

• Setting up Mesos involves coordinating many different moving pieces.

• Also, there’s no immediate way to gain a cluster-wide view of total memory/CPU/disk space use.

• Also, the learning curve can be steep.

Mesosphere DCOS is meant to solve these problems.

• Offers “turn-key” setup (though the setup itself is not really production-grade).

• Offers a fancy UI for viewing cluster resource usage.

• Offers a special CLI for installing frameworks with 1 command.

• It’s very much in active development and would work best with a Mesosphere support plan.

Pros

• I find the Mesos abstraction the most intuitive when it comes to managing cluster resources.

• Scalability is off the charts. Verizon, Siri, Yelp, Twitter and OpenTable all use Mesos.

• Growing community.

• Multiple “frameworks” already supported such as Apache Spark and Cassandra.

• Solomon Hykes called it the “gold standard” for running Docker containers in a cluster.

Cons

• It can take weeks to set up if you need to do it right.

• The learning curve for dev’s is manageable, but for operators there are many moving pieces.

• There are certain edge cases that are rare but that would affect cluster performance over time.

• If you want to run Mesos on CoreOS, either you need to violate the CoreOS way, or run Mesos Master / Slave (Agent) in docker containers which is officially not recommended.

When To Use It

• You’re running multiple microservices, and you anticipate significant scale.

• You want to squeeze as much utilization as possible out of your large cluster.

• You’re ready to adopt the cluster as the primary abstraction and expect to co-mingle prod and dev, multiple services, and multiple frameworks.

• Note: Smallest company I met at MesosCon was ~60 employees. That is probably the lower limit of company size before Mesos makes sense (IMO).

Mesos + Docker Swarm

• At MesosCon (August 2015), Docker showed Docker Swarm as the CLI-based way to control Mesos deployments.

Data Center Operating System Solution #2: Kubernetes

Disclaimers

• Mr. Padnick may or may not have any actual real-world experience with Kubernetes, but felt it necessary to include it here for the sake of completeness.

Kubernetes Pods

• A pod is a group of Docker containers that should be run together.

[Diagram: a Pod containing a Web Server and a Content Management Server]

SOURCE: Illustrations reproduced from https://www.youtube.com/watch?v=Fcb4aoSAZ98

Kubernetes Labels

• Labels are key-value pairs attached to pods that allow Kubernetes to identify groups of pods.

• Concept of labels is baked into most APIs.

[Diagram: three Pods carrying the labels “FE”, “BI, FE”, and “v2”]

SOURCE: Illustrations reproduced from https://www.youtube.com/watch?v=Fcb4aoSAZ98

Kubernetes Replication Controllers

• A replication controller is a definition: “I want to run this pod 5 times.”

• If one of the pods fails, Kubernetes will auto-restart a new one.

SOURCE: Illustrations reproduced from https://www.youtube.com/watch?v=Fcb4aoSAZ98

[Diagram: a ReplicationController with #Pods = 2 and label selector v1, managing two Pods labeled v1]
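Labels and replication controllers fit together as a reconciliation loop: select the pods that match a label selector, compare the count with the desired number, and start replacements for any that are missing. The sketch below shows only that idea with plain dictionaries. It is not Kubernetes code, and the pod fields and the `matches` and `reconcile` names are hypothetical.

```python
import itertools

_replacement_ids = itertools.count(1)


def matches(pod, selector):
    """True when every key/value in the selector appears in the pod's labels."""
    return all(pod["labels"].get(key) == value for key, value in selector.items())


def reconcile(pods, selector, template_labels, replicas):
    """Return a pod list with exactly `replicas` running pods matching `selector`."""
    matching = [p for p in pods if matches(p, selector) and p["running"]]
    others = [p for p in pods if not matches(p, selector)]
    # Too few: start replacements from the pod template.
    while len(matching) < replicas:
        matching.append({
            "name": f"replacement-{next(_replacement_ids)}",
            "labels": dict(template_labels),
            "running": True,
        })
    # Too many: keep only the desired number.
    return others + matching[:replicas]


pods = [
    {"name": "web-a", "labels": {"version": "v1"}, "running": True},
    {"name": "web-b", "labels": {"version": "v1"}, "running": False},  # failed pod
    {"name": "web-c", "labels": {"version": "v2"}, "running": True},
]
result = reconcile(pods, {"version": "v1"}, {"version": "v1"}, replicas=2)
print([p["name"] for p in result])
```

The failed v1 pod is replaced, while the v2 pod is untouched because it does not match the selector.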

Kubernetes Cluster

[Diagram: a Kubernetes Master (etcd, API Server, Controller Manager Server, Scheduler Server) alongside several Nodes. The Node shown in detail has boxes labeled kubelet, agent, and proxy, plus Pods labeled FE, v2, and v1.]

Docker Builder | Roll Your Own
Docker Registry | Roll Your Own
Deployment: Scheduling | Resource-aware scheduler.
Deployment: Routing | Included!
Deployment: Service Discovery | Included!
Auto-Restart Failed Containers | Included!
Monitoring | Optimal support with Google Cloud Engine. Limited support for others.
Logging | Optimal support with Google Cloud Engine. Limited support for others.
Cluster Data Store | etcd
Unit of Deployment | Pod

Pros

• Produced by Google.

• Very well-documented.

• Open source.

• The “successor” to CoreOS + Fleet. Commercially supported by CoreOS as Tectonic.

• If run in Google Cloud Engine, can potentially be quite powerful.

Cons

• Preferential support for Google Cloud Engine.

• Produced by Google but not necessarily the exact system Google uses to run its own cluster (though based on it).

• I may or may not be aware of additional issues.

When To Use It

• You’re running Google Cloud Engine.

• You have prior experience from working at Google.

• I may or may not be aware of add’l use cases.

Mesos + Kubernetes

• You can run Kubernetes on top of Mesos as an alternative to Marathon.

Final Thoughts

Closing Thoughts

• To get started quickly, choose EC2 Container Service.

• To get a feel for the core technologies, choose a PaaS like Deis and slowly learn CoreOS.

• To run multi-container VM’s at (potentially huge) scale, choose Mesos.

• There are many more “satellite” projects I didn’t cover solving unique problems!

Now have fun and docker on!

Q&A
