Kubernetes Immersion

Introduction: Who Am I

Juan Larriba

DevOps Engineer at everis innolab

@compilemymind

Introduction: Containers

Containers are gaining a lot of traction because they isolate different applications on the same physical or virtual hardware

Usually, servers are provisioned for the worst-case scenario, leaving a lot of resources unused most of the time

Containerization lets us securely share that hardware between different applications that are busy at different times, optimizing its utilization

Introduction: Container Orchestrators

Currently there are 4 main container orchestrators fighting to be the market leader

Kubernetes

Mesos

Docker Swarm

Nomad

Kubernetes Architecture

Architecture

Kubernetes is programmed as a monolithic application but deployed as a microservices application

It provides two basic components: Hyperkube and Kubelet

It relies on external services for networking and persistent storage of its own state

All communications, both external and internal, use the HTTPS protocol

Architecture: Software Defined Networking

One of the first problems we face when working with Docker is manual port management

When deploying a number of containers on the same machine, we need to manually track which ports each container exposes

To avoid this problem, Kubernetes uses Software Defined Networking (commonly Flannel, but also Weave or even a cloud provider's native IaaS networking)

Each container is then automatically assigned a different IP, so all of them can expose the same port
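
The effect of giving each container its own IP can be sketched with a toy allocator. This is only an illustration in Python, not how Flannel or Weave work internally; the subnet 10.244.1.0/24, the container names and the function assign_ips are assumptions made for the example.

```python
import ipaddress


def assign_ips(container_names, subnet="10.244.1.0/24"):
    """Give every container its own address from a (hypothetical) node subnet."""
    hosts = iter(ipaddress.ip_network(subnet).hosts())
    return {name: str(next(hosts)) for name in container_names}


# Both containers listen on port 8080, yet their endpoints never collide
# because the IP part of each endpoint is different.
ips = assign_ips(["web-a", "web-b"])
endpoints = {name: f"{ip}:8080" for name, ip in ips.items()}
print(endpoints)
```

With plain Docker port mapping on a single host, the second container would have needed a different host port.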

Architecture: etcd

Kubernetes needs to persist its state in some kind of persistent storage

It uses exclusively etcd as its backend

etcd is a distributed key-value store created by the CoreOS team

Each etcd major version breaks the previous API

As of Kubernetes 1.4, the version used is etcd2

Architecture: Kubelet

The Kubelet is a native Linux daemon that needs to run on each member of a cluster: masters and nodes

It communicates with the Docker API of its node to launch the Docker containers required by other Kubernetes components

It can also work standalone, acting as a supervisord for Docker containers

Before 1.4, it was the only Kubernetes component that did not work as a Docker container. From 1.4 onwards, it is also deployed in a Docker container

Architecture: kube-apiserver

It is deployed only in the master

It is the entrypoint for the Kubernetes cluster

It exposes a REST API

The client sends commands to the apiserver, which validates the information sent and, if it is correct, stores it in etcd

Architecture: kube-scheduler


The Scheduler is aware of the cluster status and decides where new objects must be placed

It is a very complex piece of software, the real “brain” of the Kubernetes cluster

As stated in Kubernetes documentation:

The scheduler needs to take into account individual and collective resource requirements, quality of service requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, deadlines, and so on
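
A heavily simplified sketch of that decision, in Python: first filter out the nodes that cannot fit the pod, then score the remaining ones. The node data, the field names and the scoring rule (most CPU left after placement) are assumptions for illustration; the real scheduler weighs many more criteria, as the quote above lists.

```python
def schedule(pod, nodes):
    """Pick a node for `pod`: drop nodes that cannot fit it, then score the rest."""
    feasible = [
        n for n in nodes
        if n["free_cpu"] >= pod["cpu"] and n["free_mem"] >= pod["mem"]
    ]
    if not feasible:
        return None  # no node fits, so the pod would stay pending
    # Toy score: prefer the node with the most CPU left after placement.
    best = max(feasible, key=lambda n: n["free_cpu"] - pod["cpu"])
    return best["name"]


nodes = [
    {"name": "node-1", "free_cpu": 0.5, "free_mem": 512},
    {"name": "node-2", "free_cpu": 2.0, "free_mem": 2048},
    {"name": "node-3", "free_cpu": 4.0, "free_mem": 128},
]
chosen = schedule({"cpu": 1.0, "mem": 256}, nodes)
print(chosen)
```

Here node-1 lacks CPU and node-3 lacks memory, so only node-2 survives the filtering step.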

Architecture: kube-controller-manager


The Controller-Manager is the control loop of the cluster

The Controller-Manager watches the shared state of the cluster stored in etcd by the API Server

It continuously compares the desired state of the cluster with the current state and notifies the other components of the cluster to perform the actions needed to move the cluster towards the desired state
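
The compare-and-act cycle can be sketched as a small Python loop. This is a toy model, not Kubernetes code: the state is just a list of pod names, and the function names reconcile and control_loop are made up for the example.

```python
def reconcile(desired_replicas, running_pods):
    """Return the actions needed to move the current state towards the desired one."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        return [("start", None)] * diff
    if diff < 0:
        # Too many pods: stop the surplus ones.
        return [("stop", pod) for pod in running_pods[:-diff]]
    return []


def control_loop(desired_replicas, running_pods, max_iterations=10):
    """Apply reconcile() repeatedly until there is nothing left to do."""
    pods = list(running_pods)
    created = 0
    for _ in range(max_iterations):
        actions = reconcile(desired_replicas, pods)
        if not actions:
            break  # current state matches desired state
        for verb, pod in actions:
            if verb == "start":
                created += 1
                pods.append(f"pod-new-{created}")
            else:
                pods.remove(pod)
    return pods


print(control_loop(3, ["pod-a"]))
```

The important property is that the loop only looks at the difference between the two states, so it does not matter how the cluster drifted away from the desired one.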

Architecture: kube-proxy

It is deployed as a static pod on each node of the cluster

Implements Services capabilities

Kubernetes Addons

Addons: Ingress Controller

It provides a way to route external requests to applications in the cluster

It matches DNS names and URL paths (which external clients like browsers can understand) to Kubernetes Services

One specification, multiple implementations

Currently we use the Nginx implementation, but a custom implementation is easily done
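
The matching step can be pictured as a lookup from host and path to a Service name. The sketch below uses a longest-prefix rule; the hostnames, the Service names and the rule format are hypothetical, and real controllers such as the Nginx one apply their own matching semantics.

```python
def route(host, path, rules):
    """Return the Service for a request, using the longest matching path prefix."""
    candidates = [
        r for r in rules
        if r["host"] == host and path.startswith(r["path"])
    ]
    if not candidates:
        return None  # no rule matches this request
    return max(candidates, key=lambda r: len(r["path"]))["service"]


rules = [
    {"host": "shop.example.com", "path": "/", "service": "frontend"},
    {"host": "shop.example.com", "path": "/api", "service": "backend"},
]
print(route("shop.example.com", "/api/orders", rules))
```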

Addons: Dashboard

A web frontend for the cluster

It shows in a graphical UI all the information that can be obtained through the API or the CLI

Addons: Heapster

Reads monitoring data from the Kubelet (extracted from the Docker API and the node it lives in) and exposes it via a REST API

It can be deployed standalone and it will store all the cluster metrics for the last 15 minutes

It can be plugged into different backends, currently supporting Log, InfluxDB, Google Cloud Monitoring, Google Cloud Logging, Hawkular-Metrics, OpenTSDB, Monasca, Kafka, Riemann, Elasticsearch…

When plugged into a backend, it will store unlimited metrics (limited only by the backend policies)

Addons: Kubedash

Exposes a web frontend for Heapster metrics

It can be used as a very basic monitoring tool for Kubernetes clusters

Limited functionality, but very easy to install and operate, so it can be very convenient sometimes

Addons: SkyDNS

Kubernetes uses DNS for service discovery

As each application deployed in the cluster will have its own IP, Kubernetes provides a way to resolve service names to IPs

SkyDNS is an open source DNS server written in Go, with some improvements aimed at service discovery

DNS entries can be managed using a REST API

The DNS entries are stored in etcd, making SkyDNS a cloud-ready DNS server

Kubernetes Objects

Objects: Pod

The most basic unit of computation in Kubernetes is a Pod

A Pod can contain one or more Docker containers, but for simplification, we will only store one container in one Pod

Each Pod definition passed to the Kubelet creates, at least, two Docker containers: the user container and a special Pod container that handles the networking

A Pod has an SDN-assigned IP, so it is only reachable from inside the cluster network, and that IP is lost when the Pod is recreated

Objects: Service

Defines a stable “ClusterIP” through which a set of Pods can be reached from every node of the cluster

All replicas of the same Pod share the same Service, which acts as a load balancer

A Service is not an Nginx or an HAProxy: it does not consume resources, nor is it deployed to a node. It is a kube-proxy configuration

Depending on the IaaS, a Service can acquire an external IP

Objects: Ingress

Exposes a Service with a network wide URL so it can be accessed from the outside world

Provides a much safer and more manageable way of accessing services than directly exposing IPs

The Ingress endpoint is provided by the Ingress Controller Addon

Objects: ReplicationController

Ensures that a specified number of pod “replicas” are running at any one time

If there are too many pods, it will kill some. If there are too few, the replication controller will start more

You can think of a replication controller as something similar to a process supervisor, but rather than individual processes on a single node, the replication controller supervises multiple pods across multiple nodes

Objects: ReplicaSet

It is the next-gen ReplicationController, still in beta.

The main difference is that ReplicaSets support set-based label selectors; in addition, the rolling-update command does not work with ReplicaSets

ReplicaSets can be used standalone, but their main usage is to be used by Deployments to orchestrate pod creation, deletion and updates

When you use Deployments you don’t have to worry about managing the Replica Sets that they create

Objects: Deployment

Provides declarative updates for Pods and ReplicaSets

It provides all the capabilities of a Replication Controller, but adds other powerful features

It adds the versioning feature: a Deployment is able to track the previously deployed versions and perform easy rollbacks

A rollout can be paused and resumed

Updating the Deployment's pod template recreates the pods with the new definition
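
The versioning idea can be sketched as a small history of revisions. This Python class is a toy model with made-up names and image tags; in a real cluster the history lives in the ReplicaSets the Deployment creates, and a rollback is triggered with kubectl rollout undo.

```python
class DeploymentHistory:
    """Toy model of a Deployment that remembers every image it has rolled out."""

    def __init__(self, image):
        self.revisions = [image]

    @property
    def current(self):
        return self.revisions[-1]

    def update(self, image):
        # Only a changed pod template creates a new revision.
        if image != self.current:
            self.revisions.append(image)

    def rollback(self):
        # Simplification: drop the latest revision and return to the previous one.
        if len(self.revisions) > 1:
            self.revisions.pop()
        return self.current


d = DeploymentHistory("app:1.0")
d.update("app:1.1")
print(d.current, "->", d.rollback())
```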

Objects: DaemonSet

It is a special kind of ReplicationController that ensures one replica of a pod is running on each node of the cluster

You do not directly specify how many replicas a DaemonSet deploys

As nodes are added to the cluster, pods are added to them. As nodes are removed from the cluster, those pods are garbage collected
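
Because the replica count follows the node list, the reconciliation boils down to two set differences. A minimal Python sketch, with hypothetical node and pod names:

```python
def reconcile_daemonset(nodes, pods_by_node):
    """Return (nodes that still need a pod, nodes whose pod must be removed)."""
    nodes = set(nodes)
    covered = set(pods_by_node)  # nodes that currently run a daemon pod
    return sorted(nodes - covered), sorted(covered - nodes)


# node-4 has left the cluster, node-2 and node-3 have just joined.
to_create, to_delete = reconcile_daemonset(
    ["node-1", "node-2", "node-3"],
    {"node-1": "agent-abc", "node-4": "agent-xyz"},
)
print(to_create, to_delete)
```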

Objects: Namespace

Every Kubernetes Object name must be unique

This can be a nightmare as the cluster grows

To avoid this problem, each Object is created inside a Namespace, and its name only needs to be unique to that Namespace.

DNS Service Discovery takes into account the Service name and the Namespace when resolving
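
As an illustration, the name a Service gets in cluster DNS is built from its own name and its Namespace. The sketch below assumes the default cluster domain cluster.local, which can be changed per cluster; the Service and Namespace names are taken from the CLI example later in this deck.

```python
def service_dns_name(service, namespace="default", cluster_domain="cluster.local"):
    """Build the fully qualified DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_dns_name("dms-appointments", "everis-dev"))
```

Two Services with the same name in different Namespaces therefore resolve to different DNS names.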

Kubernetes Persistence

Persistence: Volume

A Kubernetes Volume is a temporary data store that lives as long as the pod is alive

It persists through container restarts, but a pod restart will erase the information

It is meant to be shared between different containers of the same Pod

As we take the approach of having just one container per Pod, this kind of volume is of little use to us

Persistence: Persistent Volume

When containers need to store information in a persistent way, we use Persistent Volumes

A Persistent Volume is a piece of networked storage provisioned and made available to the cluster by an administrator

It is not meant to be created during a normal Kubernetes workflow

It is an abstraction of hardware resources (disk storage) so Pods can use it without knowing what underlying technology provides the storage

Persistence: Persistent Volume Claim

When a user of the cluster wants to request storage for their Pods, they create a Persistent Volume Claim

The Claim object will automatically search the pooled and unused Persistent Volumes to find one that matches the request

Once a Persistent Volume has been claimed, its ownership cannot be changed until the Claim is removed from the cluster
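
The matching between Claims and Volumes can be sketched as follows. This is a simplified Python model that only looks at size and access mode and picks the smallest free volume that fits; the volume names and dictionary fields are hypothetical, and the real binding logic considers more attributes.

```python
def bind_claim(claim, volumes):
    """Bind `claim` to the smallest free volume that satisfies it, or return None."""
    matches = [
        v for v in volumes
        if v["claimed_by"] is None
        and v["size_gi"] >= claim["size_gi"]
        and claim["access_mode"] in v["access_modes"]
    ]
    if not matches:
        return None  # nothing fits, so the claim would stay pending
    chosen = min(matches, key=lambda v: v["size_gi"])
    chosen["claimed_by"] = claim["name"]  # ownership is now fixed
    return chosen["name"]


volumes = [
    {"name": "pv-small", "size_gi": 5, "access_modes": ["ReadWriteOnce"], "claimed_by": None},
    {"name": "pv-big", "size_gi": 50, "access_modes": ["ReadWriteOnce", "ReadWriteMany"], "claimed_by": None},
    {"name": "pv-mid", "size_gi": 20, "access_modes": ["ReadWriteOnce"], "claimed_by": None},
]
first = bind_claim({"name": "db-data", "size_gi": 10, "access_mode": "ReadWriteOnce"}, volumes)
second = bind_claim({"name": "cache", "size_gi": 10, "access_mode": "ReadWriteOnce"}, volumes)
print(first, second)
```

Once pv-mid is bound to the first claim it is no longer offered to the second one, which mirrors the ownership rule above.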

Kubernetes CLI

CLI: Frequent Commands

kubectl get namespace

kubectl get pods --namespace default

kubectl describe pod <podname>

kubectl logs <podname>

kubectl exec -it <podname> -- bash

kubectl create -f <filename.yml>

CLI: Frequent Commands

kubectl rolling-update dms-appointments --image=registry.pre.kube.volkswagengroup.es/everis/dms-appointments --image-pull-policy Always --namespace everis-dev

Kubectl

Kubernetes Advanced

Advanced: Secret

It is meant to hold sensitive information, such as passwords. Values are base64-encoded by default, which is not encryption

Putting secret info in a Secret is safer than putting it verbatim in a Pod definition or a Docker image

Pods use Secrets by mounting them as a Volume in a container or by exposing them as environment variables
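
In a Secret manifest, the values under data are base64-encoded, which is an encoding rather than encryption. The short Python sketch below shows how trivially such a value is reversed; the helper names and the sample value are made up for the example.

```python
import base64


def encode_secret_value(plaintext):
    """Encode a value the way it appears under `data` in a Secret manifest."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded):
    """Anyone who can read the Secret object can reverse the encoding."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_secret_value("admin")
print(encoded, "->", decode_secret_value(encoded))
```

This is why access to Secret objects, and to the etcd data behind them, should be restricted.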

Advanced: ConfigMap

It is a standard way of storing generic configuration as a Kubernetes object

It is very similar to a Secret, but meant for strings that do not contain sensitive information

It can be thought of as a HashMap for Kubernetes.

Advanced: Horizontal Pod Autoscaler

It can automatically scale the number of Pods in a ReplicationController, Deployment or ReplicaSet based on observed CPU utilization

The user defines an autoscaling rule referencing CPU: Scale when the Pod is at 80% CPU for 2 minutes with an upper limit of 10 replicas

Then, the autoscaler polls the CPU metric and scales up or down based on that rule

Its functionality is very limited
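
The scaling decision follows the ratio between observed and target utilization. The Python sketch below applies the formula from the Kubernetes documentation, desired = ceil(current replicas × current utilization / target utilization), clamped to the configured bounds. It is a simplification: the tolerance band and the time window the real autoscaler applies are left out, and the function and parameter names are made up.

```python
import math


def desired_replicas(current_replicas, current_cpu, target_cpu,
                     min_replicas=1, max_replicas=10):
    """Compute the replica count that brings average CPU back to the target."""
    wanted = math.ceil(current_replicas * current_cpu / target_cpu)
    # Never go below the minimum or above the upper limit of the rule.
    return max(min_replicas, min(max_replicas, wanted))


# 4 pods averaging 100% CPU with an 80% target
print(desired_replicas(4, 100, 80))
```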

Advanced: Resource Limits

When created without limits, a container inside a Pod can potentially demand all the node’s resources

As not all the containers peak at the same time, this behaviour is sometimes wonderful, as it cuts down infrastructure costs

But for the moments we need hard limits, we can set limits on either a Pod or a Namespace

Advanced: REST API

As stated before, the only interface the Kubernetes components expose, both to the world and to each other, is an HTTPS one

Thus, everything can be achieved by directly accessing the REST API exposed by the apiserver

An extensive API documentation can be found in the Kubernetes documentation page
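
As an illustration of how the CLI maps onto the REST API, the sketch below builds the path of a core (v1) resource. It only constructs the path and makes no request; the helper name resource_path is made up, and authentication and the apiserver address are left out.

```python
def resource_path(resource, namespace=None, name=None):
    """Build the path of a core (v1) resource on the apiserver."""
    parts = ["/api/v1"]
    if namespace:
        parts.append(f"namespaces/{namespace}")
    parts.append(resource)
    if name:
        parts.append(name)
    return "/".join(parts)


# Roughly what `kubectl get pods --namespace default` asks the apiserver for.
print(resource_path("pods", "default"))
```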

Advanced: Downward API

Allows containers to consume information about themselves or the system and expose that information how they want it, without necessarily coupling to the Kubernetes client or REST API

It is a way to declaratively use the Kubernetes API while writing YAML files

Examples of common information retrieved with Downward API are the Pod’s IP or its memory and CPU limits
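
From the application's point of view, this information simply shows up as environment variables or files. A minimal Python sketch, assuming the Pod definition maps status.podIP and limits.cpu to the environment variables MY_POD_IP and MY_CPU_LIMIT (both variable names are hypothetical and chosen in the YAML):

```python
import os


def pod_info(environ=os.environ):
    """Read the values the Downward API injected, with no Kubernetes client involved."""
    return {
        "ip": environ.get("MY_POD_IP", "unknown"),
        "cpu_limit": environ.get("MY_CPU_LIMIT", "unknown"),
    }


print(pod_info())
```

Outside a cluster both values fall back to "unknown", which also makes the code easy to run locally.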

Q&A: Questions and Answers

@compilemymind
