
Bachelor Thesis

Integrating Docker into the BUNGEE cloud elasticity benchmark

Filipp Roos
Department of Computer Science
Chair for Computer Science II (Software Engineering)

Reviewer: Prof. Dr.-Ing. Samuel Kounev
Advisor: André Bauer

Submission: 14. September 2018
www.uni-wuerzburg.de


I declare that I have developed and written the enclosed thesis completely by myself, and have not used sources or means without declaration in the text.

Würzburg, 14. September 2018

(Filipp Roos)


Abstract

Containers are the most prevalent virtualization technology in the server world, allowing workloads to be distributed easily and quickly across a cluster of servers. Determining and predicting the number of resources needed to adequately handle the varying demand of current applications is the task of auto-scalers. To enable the evaluation of auto-scaling solutions, this thesis explores the field of elasticity benchmarking with regard to Docker containers, using the two most popular orchestrators, Kubernetes and Swarm.

We evaluate the orchestrators with respect to several questions: Are measurements repeatable? Does the performance of a container vary in longer tests? How should the machines that make up the cluster be dimensioned? Which orchestrator should be used for elasticity benchmarks? These questions are answered using various benchmarks and test scenarios.

By developing plugins for the elasticity benchmark BUNGEE and extending it to work with Kubernetes and Swarm, we lay the groundwork for further benchmarks of container-based scaling solutions.

We identify multiple issues in various projects that need to be fixed before further benchmarks in this area can be conducted. Finally, we present various opportunities for future expansion of scope and improvement of the evaluation setup.


Zusammenfassung

Containers are currently the most widespread technology for virtualizing servers, in particular because they make it possible to distribute workloads quickly and easily within a cluster of servers. Determining and predicting the number of resources required to handle changing demand smoothly is the task of so-called auto-scalers. To enable the evaluation of auto-scaling solutions, this bachelor thesis explores the field of elasticity benchmarking with Docker containers, using the two most widely used orchestrators, Kubernetes and Swarm.

We examined the orchestrators with respect to several questions: Are the measurements repeatable? Does the performance of a container fluctuate in longer tests? How should the machines of a cluster be dimensioned? Which orchestrator should be used for elasticity benchmarks? These questions are answered through various test scenarios and benchmarks.

By developing plugins for the elasticity benchmark BUNGEE and thereby extending it with connectors for Kubernetes and Swarm, we lay the foundation for further benchmarks of container-based scaling solutions.

We identify several issues in various projects that have to be fixed before further benchmarks in this area can be conducted. Finally, we present various ways of improving the test setup and extending its scope.


Contents

List of Acronyms

1. Introduction
   1.1. Motivation
   1.2. Thematic Introduction
   1.3. Questions and Goals
   1.4. Thesis Overview

2. Basics of Thesis, Related Work
   2.1. Introduction to Scaling and Elasticity
   2.2. The BUNGEE Cloud Elasticity Benchmark
   2.3. Docker
   2.4. Docker Orchestrators
        2.4.1. Swarm Mode
        2.4.2. Kubernetes
        2.4.3. Mesos with Marathon
        2.4.4. Nomad
        2.4.5. Comparison
   2.5. Related Work
        2.5.1. BUNGEE
        2.5.2. Container benchmarking

3. Approach
   3.1. Tests
   3.2. Assumptions

4. Implementation
   4.1. Architecture
        4.1.1. Swarm Plugin
        4.1.2. Kubernetes Plugin
   4.2. Changes to Existing Code
   4.3. Ansible Playbooks

5. Evaluation
   5.1. Setup
   5.2. Configuration
        5.2.1. Swarm
        5.2.2. Kubernetes
        5.2.3. BUNGEE
   5.3. stress-ng
   5.4. System Analysis
   5.5. Long-Time Test
   5.6. Parallelism and Isolation
   5.7. Differences between container orchestrators

6. Conclusion
   6.1. Summary
   6.2. Results
   6.3. Future Work

Bibliography

Appendix
   A. UML Diagrams of Implementation Work


List of Acronyms

VM Virtual Machine

API Application Programming Interface

REST Representational State Transfer

OS Operating System

DLIM Descartes Load Intensity Meta-Model

SMT Simultaneous Multithreading

SUT System Under Test

SLO Service Level Objective

IDE Integrated Development Environment

JVM Java Virtual Machine

JSON JavaScript Object Notation

YAML YAML Ain’t Markup Language

IaaS Infrastructure as a Service

HPC High Performance Computing


1. Introduction

We begin this thesis by describing the reasoning behind virtualization in Section 1.1. In Section 1.2, we then take a look at the different types of virtualized computers: virtual machines and containers. Afterwards, in Section 1.3, we present the questions we investigate and the goals of the thesis. At the end of the chapter, Section 1.4 goes over the general structure of the thesis.

1.1. Motivation

In a fast-paced world, we are always on the lookout for new ways to organize computing resources efficiently. Services need to maintain a certain level of responsiveness even under a high level of load. In web applications, for example, the load is influenced mainly by the number of concurrent users. When there are few users, there are free capacities which should be used for other customers or workloads.

In order to split workloads across multiple computers, we need to isolate these workloads from each other and provide a way to scale them independently and quickly. The most common way to solve this issue is to divide a computer into smaller units, each operating in its own environment. This approach is called virtualization, as each of these units is a virtual computer by itself.

This thesis focuses primarily on evaluating the performance of different implementations of virtualization. Our main focus is the recently developed Docker container platform. Which configuration of containers is best suited for a given workload, and how does this compare to other virtualization solutions? To answer these questions, we extend and use the BUNGEE benchmark tool.

1.2. Thematic Introduction

To scale services quickly and utilize hardware resources better, one or more layers of virtualization are added between the hardware and the software. There are different approaches to server virtualization. We concentrate our efforts on two proven models: platform-level virtualization with Virtual Machines (VMs) and operating-system-level virtualization with Docker containers.

VMs virtualize a complete computer with a dedicated processor for each instance. Resources are managed by a so-called hypervisor, for example KVM, Xen, Hyper-V or VMware. The operating system the hypervisor is running on is called the host. To run programs in a VM, a complete second Operating System (OS), the so-called guest, has to run at the same time. This in turn causes a comparatively large amount of performance to be lost. On the flip side, encapsulating the guest completely allows for independent operation, and mixing different host and guest OSs is easily possible. This corresponds to the left column in Figure 1.1.

Figure 1.1.: Comparison of virtual machines and containers, taken from [Doc15]

Containers take a slightly different approach to the same problem. Only part of the host OS is abstracted away; the containers use the same instance of the system kernel as the host. The user space, which encompasses running applications and files, is an isolated part of the host. Containers as a drop-in replacement for VMs have existed for many years, packing multiple services into one unit. This is supported by software like OpenVZ/Virtuozzo, LXC and FreeBSD Jails. Currently, development is shifting towards more lightweight containers with only one app per container. The projects going in this direction are Docker, which is our sole focus in this work, and the alternative rkt. Both include support for the Open Containers Initiative image format. A visualization of this approach can be seen in the right column of Figure 1.1.

We are focusing on Docker containers in this thesis, as the BUNGEE benchmark previously only supported VMs. To answer our questions, we use different solutions for benchmarking containers and extend the BUNGEE benchmark in the process.

1.3. Questions and Goals

Besides academic research, the primary code product of this thesis is an extended version of the BUNGEE benchmark which supports using different container orchestrators in place of a cloud management system.

Building on this work, we further explore the following questions about virtualization with the Docker platform:

Q1) Which circumstances determine whether it is better to run a container cluster on many smaller systems or on fewer, more powerful hosts?

Q2) What are the main benefits and drawbacks of the Docker cluster management tools?

Q3) Does the performance of a container change measurably over its lifetime?


Q4) To what degree does the performance of a container depend on other containers on the same machine?

Q5) By how much does the performance fluctuate between runs of the same test?

As a side task, we perform general code cleanup and improve the BUNGEE project. The major goals in this direction are:

M1) Make the project compatible with Maven, the build system used for other Descartes Java applications

M2) Improve cross-platform compatibility

M3) Integrate Wilhelm’s multi-tier benchmark tools into the main codebase

1.4. Thesis Overview

Next, Chapter 2 introduces the reader to the concepts necessary for understanding this work. It also goes into detail about the software we use and takes a look at other research in the area.

After the introductory chapters, the focus shifts to our scientific research: which assumptions are we making, and which questions are we trying to answer? This is the topic of Chapter 3.

In Chapter 4, we present the plugins for the BUNGEE benchmark that we developed: how they work, their internal structure, the development process and the challenges we encountered. Additional scripts for performing the benchmarks are also described.

Finally, in Chapter 5, we try to answer the questions we have asked. The results are plotted and their implications are discussed further.

At the end of the thesis, in Chapter 6, we summarize the test results and present areas with potential for further work.


2. Basics of Thesis, Related Work

In this chapter, the key components needed for our research are introduced. To explain the problems our research faces, we begin by defining basic terms of the field in Section 2.1. Afterwards, the software in use is described in more detail, starting in Section 2.2 with the benchmarking tool BUNGEE and going over the containerization tool Docker with its supporting infrastructure in Section 2.3. Later in the chapter, Section 2.5 presents existing work related to this thesis.

2.1. Introduction to Scaling and Elasticity

Two key concepts dominate our field of work and are therefore important for the further understanding of the thesis. We start with scalability, which is defined by [HKR13, p. 3] as:

Scalability is the ability of the system to sustain increasing workloads by making use of additional resources.

Systems can be scaled with two different approaches. The simplest approach, adding resources to a machine, is called vertical scaling. Upgrading hardware speeds up nearly all applications without any software changes and optimizations, but is more time-consuming when executed live. Increasing single-core processor performance and memory on the fly is difficult to support, and decreasing them is even harder. This type of scaling also runs into the limits of the hardware, processor and memory, quickly. In contrast, adding more systems of the same type, which is called horizontal scaling, is harder to implement in applications, as it requires the workloads to be split across multiple machines. In production, however, it is faster to start and stop whole systems than to modify existing ones.

Another important term, elasticity, is defined by [HKR13, p. 2] as

the degree to which a system is able to adapt to workload changes by provisioning and de-provisioning resources in an autonomic manner, such that at each point in time the available resources match the current demand as closely as possible.

High elasticity of a service is paramount to achieving the goal of optimal resource usage. To make use of the scalability of a system and scale it up and down autonomically, a mechanism called an auto-scaler can be used. To compare the results of different approaches to auto-scaling, we use the BUNGEE benchmarking tool, which is explored in detail in the next section.


Figure 2.1.: Overview of BUNGEE’s operating principle, taken from [Wil17, p. 10]

2.2. The BUNGEE Cloud Elasticity Benchmark

BUNGEE is a benchmarking tool for Infrastructure as a Service (IaaS) clouds based on VMs. Its development started in Weber's master's thesis [Web14] and it was later published in the paper [HKWG15]. The standard workflow to benchmark a cloud system is illustrated in Figure 2.1.

To stress a cloud system, BUNGEE uses the load generator JMeter with a custom plugin1 to send requests to a replicated set of web servers running on the cloud system under test. The load intensity is represented internally in BUNGEE as requests per second; before any measurement with JMeter, the load is converted to timestamps and written to a text file. During every measurement run, direct results such as the full response time, the actual time for generating the response, the number of responses and the percentage of failed responses are recorded. These values can later be used for evaluation.

The first step is a performance analysis of the System Under Test (SUT). Here, load is applied to the system at different levels of scaling. A run always takes a fixed length of time, over which the load stays constant. After a run ends, the load is doubled. This procedure is repeated until the system is unable to meet a fixed Service Level Objective (SLO), such as "95 percent of responses must not take longer than 500 milliseconds", during the run. The exact load cut-off is then determined by a binary search algorithm. This cut-off forms a single data point in the top left graph of the illustration (1).
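To illustrate the search for the load cut-off of a single scaling configuration, the following is a simplified sketch rather than BUNGEE's actual implementation; the meetsSLO helper, which would run one constant-load JMeter measurement and evaluate the SLO, is hypothetical.

    // Simplified sketch of the per-configuration load search (illustration only).
    // meetsSLO(load) is a hypothetical helper that would run one constant-load JMeter
    // measurement and report whether the SLO (e.g. 95% of responses under 500 ms) was met.
    public class LoadCutoffSearch {

        public double findCutoff(double startLoad, double precision) {
            double lower = startLoad;
            double upper = startLoad;
            // Phase 1: double the load until the SLO is violated for the first time.
            while (meetsSLO(upper)) {
                lower = upper;
                upper = upper * 2;
            }
            // Phase 2: binary search between the last passing and the first failing load.
            while (upper - lower > precision) {
                double mid = (lower + upper) / 2;
                if (meetsSLO(mid)) {
                    lower = mid;
                } else {
                    upper = mid;
                }
            }
            return lower; // highest tested load that still satisfied the SLO
        }

        private boolean meetsSLO(double requestsPerSecond) {
            // Hypothetical measurement harness: run JMeter at this intensity for a fixed
            // duration and evaluate the recorded response times against the SLO.
            throw new UnsupportedOperationException("measurement harness not shown");
        }
    }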

There are two different types of system analysis possible with BUNGEE: the simple system analysis and the detailed system analysis. In the simple system analysis, the result measured for a single resource is extrapolated by multiplying it with the number of units, up to a specified maximum number. For the detailed system analysis, the system is scaled up and the test repeated: the analysis begins at one resource and continues by scaling the system up one resource at a time. This test is repeated until either a specified maximum number of units is reached or the performance no longer increases when scaling up.

An existing load profile, which is a recorded or synthetic time series of load in the Descartes Load Intensity Meta-Model (DLIM) format (see [vKHK14] for further information), is combined with the system analysis data to form a calibrated load. The first goal of this process is to establish a target number of resources the SUT has to provide at any time to handle the requests. For this purpose, we assume an ideal cloud system with the same number of resources as our SUT. The maximum number of requests this ideal system can handle is the maximum load recorded in the load profile. This ideal cloud system scales linearly: for a cloud system of t total units, the load that a active units can handle is a/t of the maximum load. The number of units this ideal system needs to handle a given load is the ideal number of units the SUT needs to provide. A scaling function is established so that the number of requests to this virtual cloud system can be translated into the number of requests to be sent to the SUT. This function is a linear interpolation of the system analysis data, with the unit count being multiplied by the maximal load of the profile. The load profile is then adjusted with this function; the result is called a calibrated load profile. This process ensures the SUT ideally behaves like the fictional ideal system the load profile is based on. In Figure 2.1, this is the bottom right graph.

1https://github.com/andreaswe/JMeterTimestampTimer
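As a worked illustration of this calibration, the following sketch maps a load value from the profile to the intensity that would be sent to the SUT, assuming the linear ideal system described above and a linearly interpolated system analysis. It is not BUNGEE's actual code, and the data values are invented examples.

    // Sketch of the load calibration under the linear ideal-system assumption.
    // Not BUNGEE's actual implementation; the analysis data below is made up.
    public class LoadCalibration {

        // System analysis result: maximum load the SUT handles with 1..n units.
        private final double[] sutMaxLoad = { 120.0, 235.0, 340.0, 430.0 };
        private final double profileMaxLoad = 400.0; // maximum load in the DLIM profile

        /** Number of units the ideal system needs for a given profile load. */
        public int demandedUnits(double profileLoad) {
            int totalUnits = sutMaxLoad.length;
            // The ideal system scales linearly: a units handle a/t of the profile maximum.
            return (int) Math.ceil(profileLoad / profileMaxLoad * totalUnits);
        }

        /** Load to send to the SUT so that it needs as many units as the ideal system. */
        public double calibratedLoad(double profileLoad) {
            int totalUnits = sutMaxLoad.length;
            double idealUnits = profileLoad / profileMaxLoad * totalUnits; // may be fractional
            if (idealUnits <= 1.0) {
                return idealUnits * sutMaxLoad[0];
            }
            // Linear interpolation between the analysis points for the neighbouring unit counts.
            int lower = (int) Math.floor(idealUnits);
            int upper = Math.min(lower + 1, totalUnits);
            double fraction = idealUnits - lower;
            return sutMaxLoad[lower - 1] + fraction * (sutMaxLoad[upper - 1] - sutMaxLoad[lower - 1]);
        }
    }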

Afterwards, this load is applied to the cloud service. BUNGEE records the number of resources the system scales to at each point in time (the red "Supplied Units" line in the bottom left graph). This is combined with the ideal units time series determined in the last step (the blue "Demanded Units" line). Aggregate values of this time series quantify the elasticity of a system. Various further options for evaluation are present: time-series results can be plotted as a graph and exported to PDF, or exported via CSV for external post-processing and plotting.

As stated in Section 1.3, the main goal of this work is to integrate the BUNGEE benchmark with various Docker orchestrators to lay the groundwork for elasticity benchmarking on the Docker platform.

Figure 2.2.: Results of a multi-tier system analysis, taken from [Wil17, p. 41]

BUNGEE in its original form only supports benchmarking of single-tier cloud applications, where all the components of the application are scaled up or down together. Recently, Wilhelm has extended the application to support benchmarking of multi-tier applications [Wil17]. Multi-tier cloud applications consist of multiple components, called tiers, responsible for different parts of a service. These components can be scaled independently, which leads to the complete system being able to handle a different amount of load for each combination of resource amounts. Figure 2.2 shows the results of running a BUNGEE system analysis on the multi-tier test application COCONuT. Each dot is a scaling configuration located at the intersection of the three axes describing the resource amounts of the components. The measured maximum load is indicated by the color of the dot: the minimum is blue and the maximum is red. Benchmarking multi-tier applications is not part of this thesis. However, as Docker facilitates multi-tier deployments, which makes this type of benchmark all the more important, we are integrating Wilhelm's extensions into our new version of BUNGEE.

Figure 2.3.: Docker architecture, taken from [Doc18]

In the next section, our focus is on the applications that BUNGEE helps to investigate—the Docker runtime and the cluster orchestrators.

2.3. Docker

Docker is currently the market-leading runtime for per-app containerization. The project was started in 2013 by the hosting company dotCloud, which later changed its name to Docker Inc. [Doc13]. First using LXC as its container engine, it later switched to its own project libcontainer. Its image and runtime format were later developed into standards by the Open Containers Initiative [The18b].

The Docker architecture is based around the so-called Docker objects: images, containers, networks, volumes, plugins and services [Doc18]. Images are the templates from which containers can be deployed. Both house the container userspace in a standardized format, the former as a template for deployment and the latter in a running environment. Containers can be connected to multiple virtual networks, which allows for bridging or isolation of services as needed. The runtime state of a container is saved inside an overlay file system (previously aufs, currently overlayfs). For persistent files such as databases, which should not be deleted when removing the container, volumes are the preferred storage option. Plugins come in three varieties—authorization, volume and network—and can be used to extend the capabilities of Docker in these areas. Lastly, services are used with the built-in orchestrator Swarm mode and represent a replicated set of multiple containers based on the same image.

The Docker stack is a client-server architecture consisting of multiple services. The daemon, also called dockerd, manages the Docker objects. It has different areas of functionality relating to the container lifecycle. Figure 2.3 shows examples of this functionality. It includes:

• Building images from so-called Dockerfiles, which include definitions of the tasks to be performed when building. This corresponds to the dotted arrow in the figure.

• Downloading pre-made images from repositories called Docker registries. These images can also be used as a base for custom images. This action corresponds to the striped arrow in the figure.

• Creating and running containers from images, and managing networks and volumes. This is visualized by the dotted striped arrow.

• Orchestrating other instances of dockerd joined in a swarm. This is not shown in the figure.

All of dockerd's functionality can be controlled through a Representational State Transfer (REST) Application Programming Interface (API). Tools and orchestrators communicate with dockerd by means of this API. It is also used by the command line client, which is simply called docker, to administer the system.
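As a small illustration of this API, the sketch below uses the docker-java client library (which also appears in the implementation described in Chapter 4) to connect to the daemon and list the running containers, similar to docker ps on the command line. The daemon address is an assumption and depends on the environment.

    // Minimal sketch of talking to dockerd's REST API via the docker-java library.
    // The daemon endpoint is an assumption; adjust it to the actual environment.
    import com.github.dockerjava.api.DockerClient;
    import com.github.dockerjava.api.model.Container;
    import com.github.dockerjava.core.DefaultDockerClientConfig;
    import com.github.dockerjava.core.DockerClientBuilder;
    import java.util.List;

    public class DockerApiExample {
        public static void main(String[] args) {
            DefaultDockerClientConfig config = DefaultDockerClientConfig.createDefaultConfigBuilder()
                    .withDockerHost("tcp://localhost:2375") // assumed daemon endpoint
                    .build();
            DockerClient client = DockerClientBuilder.getInstance(config).build();

            // List running containers, similar to "docker ps" on the command line.
            List<Container> containers = client.listContainersCmd().exec();
            for (Container c : containers) {
                System.out.println(c.getId() + " " + c.getImage());
            }
        }
    }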

2.4. Docker Orchestrators

Various software projects integrate Docker to manage containers across a cluster of servers. These projects are referred to as Docker orchestrators or task schedulers. In this section, we take a look at four orchestrators and compare them.

The top three solutions are taken from a survey by the Cloud Native Computing Foundation [The18a]. They consist of Kubernetes, which is used by 83 percent of respondents, Swarm mode, used by 21 percent, and Mesos, used by 9 percent. Additionally, we chose Nomad as a newcomer for this comparison because of its "simple yet powerful" approach of doing one thing—orchestrating workloads—well, while the other solutions cram all features relating to containers into one large project. All of these tools provide APIs that we can use to run our tests; however, in this work we focus exclusively on the two most popular solutions, Swarm and Kubernetes.

2.4.1. Swarm Mode

The Swarm mode originated as an external tool, but has since been integrated into the main Docker application. It uses the same API structures as regular containers; thus, it is easy to deploy when extending an existing Docker application to run on multiple servers. Swarm only provides support for replicated, constantly running services, but not for jobs that run to completion. Autoscaling support is provided by an external tool called Orbiter2, which exposes an API to trigger scaling up or down, in combination with an external trigger. Swarm includes support for basic ingress load balancing and is natively supported by Træfik, the load balancer we are using.

2.4.2. Kubernetes

Kubernetes, which is a spin-off from Google's internal Borg tool, was developed for Docker-based containers. In the meantime, it has gained support for other runtimes, such as rkt, as well. The standard process for deploying an app starts with a so-called deployment, which is defined in a YAML Ain't Markup Language (YAML) file. It contains a controller, a piece of software implementing a control loop to provide the defined configuration of pods. Examples are the ReplicaSet, which replicates a continuously running pod across different machines, or the Job, which runs a defined number of instances of the pod to completion. A pod is the smallest unit in the Kubernetes scheduler. It consists of either a single container or multiple containers sharing the same machine and resources. Later changes are made in the deployment, which propagates them to its child objects. Kubernetes also includes an internal autoscaler called Horizontal Pod Autoscaling. Networking and load balancing are handled by one of several overlay network plugins, and Træfik is supported as an ingress load balancer. With its various options, Kubernetes is the industry-leading container orchestrator.

2https://github.com/gianarb/orbiter

2.4.3. Mesos with Marathon

Apache Mesos, started in 2009, is a cluster management tool which later gained support for containers. It allows applications to be run distributed across the machines in a cluster. Marathon adds scaling and fault recovery, similar to Kubernetes' ReplicaSets. One benefit of this system is its plug-and-play nature—other frameworks can be managed by Marathon alongside individual applications. An auto-scaling example script is available for the DC/OS platform, as well as a load balancing tool called Marathon-LB, which generates a configuration for the external tool HAProxy. Træfik also supports this platform. As the complexity of this system is higher than that of the others, this tool is not considered further in this thesis.

2.4.4. Nomad

The latest software, Nomad, is the last step of the HashiStack workflow for application deployment. It has a small feature set by itself, but is designed to work together with the other apps in the stack. These apps are Vagrant, a development and test setup tool; Packer, a packaging tool; Terraform, a server provisioning tool; Vault, a secure store; and Consul, a service discovery and communication app and key/value database. Load balancing is integrated into Consul, and autoscaling is planned for Nomad but appears to be on hold. Because of its smaller market share and due to time constraints, we have not evaluated this orchestrator further.

2.4.5. Comparison

Table 2.1 shows a detailed comparison of the four orchestrators. The solution which is most integrated with Docker is Swarm mode, as it is part of the Docker daemon itself. Kubernetes and Nomad were built for Docker from the beginning; Mesos, as a task scheduler, began development before containerization was widespread. Kubernetes is a modular platform on which all sorts of applications and plugins can be built, as is Docker itself. This makes them both universally usable. In contrast, Mesos with Marathon is integrated into the cluster operating system DC/OS, and Nomad is intended to be used within the HashiStack workflow. Swarm and Kubernetes only operate on containers, while Mesos and Nomad also support native applications, and Nomad even supports virtual machines and Java applications. All solutions have a way of attaching an auto-scaler, but only Kubernetes integrates one, allowing it to either monitor the containers based on their resource consumption or to use custom metrics. Swarm and Kubernetes have an integrated ingress load balancer; Mesos and Nomad require an external tool. The last line of the table describes the history of the relative number of Google searches as reported by Google Trends [Goo18].

2.5. Related Work

As the tools described in this chapter have scientific interest beyond our topic, this section highlights previous work related to this thesis.


Name                    | Swarm mode                     | Kubernetes                                       | Mesos + Marathon                            | Nomad
Main developer          | Docker                         | Google                                           | Apache, Mesosphere                          | HashiCorp
Docker integration      | Part of Docker                 | Built for Docker, part of Docker Enterprise Ed.  | Supports Docker as workload                 | Docker is main focus
Part of platform        | Docker                         | Is itself a platform                             | DC/OS                                       | HashiStack
Unit of organization    | Single container               | Pod (container group)                            | Application (container, native)             | Job (container, VM, native)
Auto-scaling            | Unofficial project             | Internal (resource or custom metrics)            | External with API                           | Planned
Load balancing          | Internal                       | Internal                                         | Using HAProxy                               | Using Consul
Google interest [Goo18] | Slowly rising since late 2014  | Over 10 times more popular than others, rising   | Constant, on par with Swarm since mid-2017  | About half of Swarm

Table 2.1.: Comparison of Docker orchestrators

2.5.1. BUNGEE

Various projects have used BUNGEE to evaluate scaling solutions, such as the predictive auto-scaler Chameleon [Bau16], its multi-tier extension [Les17] and the FOX cost-aware scaling mechanism [LBHK18]. For evaluation, these auto-scaling algorithms were set up to scale virtual machines. After a system analysis, BUNGEE was used to apply a calibrated load to the SUT while monitoring the reactions of the scalers. The supply and demand allocations were then used to compute different elasticity metrics such as the over- and underprovisioning time shares and accuracy.

In comparison, we use BUNGEE in our work to benchmark static clouds. Here, the number of allocated resources, i.e., running containers, depends only on the bounds set by BUNGEE itself and does not change throughout a calibrated load benchmark. The further implications of this difference are discussed in Section 3.2.

2.5.2. Container benchmarking

Several benchmarks for containers have already been published. These works mainly focus on comparisons between VMs and Docker containers.

Raho et al. compared the processor, memory and network performance of Docker containers to KVM and Xen VMs, running on an ARMv7 processor architecture. They come to the conclusion that, in general, the performance of Docker is comparable to both hypervisors [RSPR15].

Another comparison study by Felter et al. focused on database storage performance with MySQL as a benchmark. Their results show that Docker volumes are marginally slower than native storage without any virtualization. The non-volume overlay filesystem, however, is around 20 percent slower, while KVM qcow images have a third of the native performance. Processor and memory performance was equal or better in Docker. Based on their results, they propose to use containers instead of VMs for IaaS clouds, and to rather run VMs inside containers than the other way around [FFRR15].

Dua, Raja, and Kakadia examine different configurations of VMs and containers for Platform-as-a-Service clouds. They see a bright future for containers in this field of application, provided containers become more standardized (which has since happened), more secure and OS-independent. As main advantages they list fast startup times and better performance [DRK14].

In the paper by Xavier et al., Xen-based VMs and containers of various runtimes are compared for usage in high-performance computing environments. The main focus of this paper is the isolation of various resources. The authors conclude that performance isolation in containers is poor for all resources except processor performance. For HPC clusters, this does not matter as much as for other types of usage, as these clusters typically consist of a large number of individual servers and it is possible to isolate simultaneous workloads by assigning different hardware to each [XNR+13].

Leitner and Cito conducted various performance tests in public clouds. Their research questions are formulated as hypotheses, with the evaluation describing the conditions under which each hypothesis is confirmed or rejected. The final recommendation of this paper is that users should perform their own benchmarks before selecting a cloud provider, as different providers offer vastly different conditions and performance [LC16].

The main difference to our work is that we are using BUNGEE, so the implementation work done in this thesis can later be used for elasticity-related benchmarking. In addition, we leave VMs out of our comparisons and instead focus exclusively on Docker containers.


3. Approach

In this chapter, we first introduce the tests we perform and the questions and use cases to which they apply in Section 3.1. Afterwards, in Section 3.2, we take a look at the assumptions that Weber made in [Web14] about elasticity benchmarking in IaaS clouds and discuss the differences to our evaluation of container clusters.

3.1. Tests

Before testing with BUNGEE and the orchestrators, we measure a performance baseline using the stress-ng suite. To stress the machine it runs on, each of its worker threads iterates for a defined time over a variety of tests, called stressors, which consist of comparatively small operation packages. The built-in metrics are described in the manual as unsuitable for benchmarking [16]. Nevertheless, we use them here to get realistic comparative performance results both for the native worker nodes and for "vanilla Docker" running without an orchestrator. We chose stress-ng because its mix of tests allows its average performance numbers to be close to the performance differences seen in real-world processor-bound applications.

We compare different cluster configurations by running few containers per VM on multiple small VMs and more containers per VM on fewer, larger VMs. Running containers on different VMs provides better isolation between workloads, meaning that performance per VM stays constant even when additional machines are added. This test therefore shows whether hardware isolation impacts performance in a positive or negative way.

Figure 3.1.: Different VM configurations for testing (worker-small, worker-medium, worker-large, worker-huge)


Our expectation is that performance drops as the number of VMs goes up, because of increased context switching and scheduling overhead.

One of the tests is a long-time test. We run a container cluster over a period of 30 minutes and take continuous performance measurements with BUNGEE. The goal of this test is to see whether the performance of a container changes with its uptime, as asked as a research question in Section 1.3. We expect the performance to stay constant over time.

Another test explores how different applications running in separate containers affect each other. For this, we use the stress-ng program to generate constant processor load in multiple containers at the same time. As Docker uses cgroups to divide processor power between containers, we expect the performance to divide exactly between the containers, as is the case for multiple applications on a single OS.

3.2. Assumptions

Weber chose the original scope for BUNGEE to be IaaS clouds based on VMs [Web14, pp. 25-26]. Available resources are fixed per unit of scaling and are dedicated to the unit at scaling time. Our scenario differs from this by being based on a container cluster, which has flexible resource limits between the containers. Raw performance depends on the allocation of containers to servers and on the workloads these containers work on. The only raw performance metric used by BUNGEE is processor load, which translates well to containers, as comparisons have shown. The only performance overhead over regular hardware in this metric is additional context switching.

The first part of the benchmarking routine is the system analysis. It generates a demand function to be used for adapting the rest of the benchmark. [Web14, p. 32] describes the limit of this function to be

[...] either caused by a limited amount of available resources or by a limited scalability due to other reasons like a limited bandwidth or increased overhead. In the latter case, additional resources are available, but even after adding resources the SLO cannot be satisfied.

In our test setup, the limiting factor is the number of machines in the Docker cluster. However, this does not manifest as a "limited amount of available resources" in the sense of the original text. Instead, it is still possible to add more units, i.e., start more Docker containers, which in turn share the available resources of the host machines with the already running containers. For example, with four worker servers in the setup, the general scaling assumption of "more units equals more performance" breaks after adding a fifth container. The new container has to share one of the four worker servers with one of the four original containers, halving the performance of both the old and the new container. BUNGEE sees this as a performance decrease, as the overhead of running two containers exceeds the original overhead of running just one container.

We are not integrating an auto-scaler into these plugins, which rules out the possibility and necessity of resource supply monitoring as described in [Web14, p. 38]. As such, all metrics which concern auto-scalers do not apply to the benchmarks done in this thesis. This gives BUNGEE the role of a simple load generator and analysis tool. The system analysis with its binary search algorithm serves as a simple performance metric. Running the calibrated load benchmark serves as a tool to stress the systems to a defined level while checking for plausibility and extracting metrics from the log files using the SLO system.

Like Weber, we are focusing our efforts on single-tier applications. However, we aim to make extending our cloud plugins for multi-tier benchmarks as easy as possible by basing our code on Wilhelm's multitier-bungee branch.


4. Implementation

To be able to run BUNGEE on our chosen container orchestrators, the benchmark had to be extended with plugins for them. In this chapter, the architecture of the existing code and of our new code is described in Section 4.1. Section 4.2 explains the challenges we faced with the original code and our changes to remedy these issues. Lastly, in Section 4.3, we cover the scripts (called playbooks) for the automation tool Ansible that we wrote to aid in test setup.

4.1. Architecture

BUNGEE consists of a number of components, which are used for different parts of the test routine. Figure 4.1 shows these components and their dependencies. The bungee component is the core of the application. It defines the interfaces used by all other components and provides helpers for setting up the SUT and analyzing it. For the measurement phase, it calibrates the load patterns, invokes the load generation and measures the resulting parameters.

Other components, which were not used at all in our thesis, include the following: SimpleHTTP is the original payload installed on each virtualized instance in the SUT. It generates CPU stress by calculating Fibonacci numbers. However, it has been replaced by another application for newer tests, as further described in Section 5.1. The evaluation package contains the specific experiments to be run. It invokes the other components to run specific tests and then post-processes the test results for drawing. chart draws the results using the JFreeChart library, and viewer allows the user to directly view the charts in Eclipse.

Finally, and most importantly for our work, cloud.aws and cloud.cloudstack implement the interfaces defined in the bungee.cloud package to communicate with a specific cloud system. The cloud plugins also handle setting up the cloud systems for testing by creating and starting the necessary VMs. We implemented two such cloud plugins for the tested Docker orchestrators—cloud.swarm and cloud.kubernetes.

Figure 4.1.: BUNGEE package diagram

BUNGEE's architecture separates the cloud plugins into individual Java packages in the namespace tools.descartes.bungee.cloud. Each cloud plugin implements two interfaces—CloudInfo and CloudManagement—with the former allowing the current number of allocated resources to be read and the latter allowing the bounds for scaling to be set and read. This allows for so-called active monitoring as described in [Web14, p. 38]. In our case, the clusters are static, so we ignore the lower bound and take the upper bound as the number of replications of our service (Swarm) or ReplicaSet (Kubernetes). The upper bound only differs from the current number while containers are starting or stopping, i.e., when the bound has recently changed.
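To make the plugin contract more concrete, the following is a minimal sketch of what these two interfaces could look like; the method names and signatures are illustrative assumptions, not the actual declarations in the BUNGEE code base.

    // Illustrative sketch of the plugin contract; method names are assumptions,
    // not the actual signatures in tools.descartes.bungee.cloud.
    // In the actual project these would live in separate files.
    public interface CloudInfo {
        // Current number of allocated resources (running containers or VMs) for a service.
        int getNumberOfResources(String serviceName);
    }

    interface CloudManagement {
        // Set the lower and upper bound for scaling; with our static clusters the upper
        // bound acts as the replica count of the Swarm service or Kubernetes ReplicaSet.
        void setScalingBounds(String serviceName, int lowerBound, int upperBound);

        int getLowerScalingBound(String serviceName);

        int getUpperScalingBound(String serviceName);
    }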

Entry points to run measurements with the cloud plugins are placed in the tools.descartes.bungee.examples package. Of these classes, RunBenchmarkOnSwarm and RunBenchmarkOnKubernetes are not used in our work; future work can use them for running an elasticity measurement. As the underlying RunBenchmark class assumes that the system automatically scales down to 1 unit in idle periods, it times out if the cloud is static, which is currently the case with these plugins. TestSwarm and TestKubernetes can be used to test scaling the system to a certain value.

To run a system analysis, we use the DetailedSwarmAnalysis and DetailedKubernetesAnalysis classes. They load the plugin-specific configuration files (see the respective subsections below), the JMeter configuration (jmeter.prop), the request settings, i.e., problem size and timeout (request.prop), and the host properties file (host.prop), which specifies the hostname (in our case the service name for Swarm or the object name prefix for Kubernetes), port and path. With these parameters, the class then creates the necessary objects via the orchestrator API, based on the specification file. Finally, the program starts the analysis using the DetailedSystemAnalysis class, which takes care of creating the file structure, writing the appropriate timestamps for each intensity level to be tested, starting JMeter with the correct parameters, analyzing the results for the specified SLO, scaling up, and writing the resulting mapping into a CSV file at the end.

In the following subsections, we go over the specific details of each cloud plugin. UML diagrams of both plugins can be found in Appendix A.


4.1.1. Swarm Plugin

Our Swarm plugin resides in the subdirectory and namespace tools.descartes.bungee.cloud.swarm. It uses the docker-java1 library to connect to the Docker daemon which is designated as the Swarm master.

All the settings needed to create services are managed by a class called ServiceSettings. For this plugin there are only two individual settings: the public hostname or IP address of the service and the service specification itself. The class reads these settings from a Java property file, called docker.properties, which contains a reference to a service spec file in the services folder. Services are specified in the Docker API JavaScript Object Notation (JSON) format2. The JSON is converted into the format required by the docker-java library using the Jackson Databind3 JSON parser.
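A minimal sketch of this conversion step is given below; the file name and the choice of the docker-java model class are assumptions made for illustration.

    // Sketch of parsing a Docker API service spec with Jackson Databind into the model
    // class used by docker-java. The file name and class choice are assumptions.
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.github.dockerjava.api.model.ServiceSpec;
    import java.io.File;
    import java.io.IOException;

    public class ServiceSpecLoader {
        public ServiceSpec load(File specFile) throws IOException {
            ObjectMapper mapper = new ObjectMapper();
            // The JSON follows the Docker API "ServiceSpec" schema referenced above.
            return mapper.readValue(specFile, ServiceSpec.class);
        }
    }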

As discussed in the last section, cloud plugins have to implement at least two interfaces, CloudInfo and CloudManagement. Both are implemented in a single class named SwarmManagement. This class implements the creation, scaling and monitoring of Swarm services. To instantiate the class, the properties file for the docker-java library is provided as a parameter to the constructor. It contains the connection information for the Docker daemon and is usually called swarm.properties.

An instance of the ServiceSettings class is loaded into a SwarmManagement instance by means of the setServiceSettings method. For convenience, there is also an overloaded method which allows a settings file to be used directly. A getter is available, too.

Before running a test, the method createService is used to prepare the Swarm service. It takes the service name and the initial number of replicas as parameters, as these depend on the test being run. All other settings are read directly from the spec provided in the ServiceSettings class. First, any existing service with that name is stopped. Then, the spec is adapted to carry the given service name and number of replicas. BUNGEE's IPMap is modified to redirect the requests for this service to the correct host. Finally, the service is created and the method waits for the service to reach the specified number of replicas. If this does not happen within six minutes, the service is deleted again and the service creation fails.
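The waiting step could look roughly like the following sketch; countRunningTasks is a hypothetical helper standing in for the docker-java calls that query the state of the service's tasks.

    // Sketch of waiting for a Swarm service to reach its target replica count.
    // countRunningTasks(...) is a hypothetical helper; the real plugin queries the
    // Docker API via docker-java for the service's running tasks.
    public class SwarmServiceWaiter {

        public boolean waitForReplicas(String serviceName, int targetReplicas) throws InterruptedException {
            long deadline = System.currentTimeMillis() + 6 * 60 * 1000; // six-minute timeout
            while (System.currentTimeMillis() < deadline) {
                if (countRunningTasks(serviceName) >= targetReplicas) {
                    return true;
                }
                Thread.sleep(2000); // poll every two seconds
            }
            return false; // caller deletes the service and reports a failed creation
        }

        private int countRunningTasks(String serviceName) {
            // Hypothetical helper: ask the Docker daemon how many tasks of the service
            // are currently in the "running" state.
            throw new UnsupportedOperationException("Docker API query not shown");
        }
    }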

There is also a stopService method, which stops and deletes a service if it is present.

4.1.2. Kubernetes Plugin

The Kubernetes plugin is contained in the subdirectory and namespace tools.descartes.bungee.cloud.kubernetes. To connect to the Kubernetes master, we use the "Kubernetes & OpenShift 3 Java Client"4 library.

The biggest difference to the Swarm plugin is that, while a Swarm service is a complete unit which defines everything from the container image to the external hostname and port, Kubernetes uses five different objects to describe our service: first, a Deployment that creates and manages a ReplicaSet, which is where, among other things, our number of replicas is set. The ReplicaSet deploys Pods, which contain the container(s) we use for the test. For accepting external connections, we additionally need to create a Service and an Ingress.

To keep track of these different components, we assign specific names to all of them: the specified name prefix, set as a parameter of the createObjects method, and the kind of object in lowercase, separated by a dash. In our case, the components are called lu-deployment, lu-service and lu-ingress.

1https://github.com/docker-java/docker-java
2https://docs.docker.com/engine/api/v1.37/#operation/ServiceCreate
3https://github.com/FasterXML/jackson-databind
4https://github.com/fabric8io/kubernetes-client

Similar to the Swarm plugin, we use a class called KubernetesManagement to implement most of the functionality. The required properties for the client library are passed in as a YAML- or JSON-formatted file, which we call k8s-client.yml. The second class responsible for the settings is called KubernetesObjectSettings and references a Kubernetes-format YAML file holding all the object specifications. All other methods have the same functionality as in the Swarm plugin, with "Service" replaced by "Objects".
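For illustration, setting the replica count of the test Deployment with the fabric8 client library could look roughly like the sketch below; the namespace and the object name follow the naming scheme described above but are assumptions nonetheless.

    // Sketch of scaling the test Deployment with the "Kubernetes & OpenShift 3 Java
    // Client" (fabric8). Namespace and object name are assumptions.
    import io.fabric8.kubernetes.client.DefaultKubernetesClient;
    import io.fabric8.kubernetes.client.KubernetesClient;

    public class KubernetesScaleExample {
        public static void main(String[] args) {
            try (KubernetesClient client = new DefaultKubernetesClient()) {
                // Scale "lu-deployment" to four replicas; the Deployment propagates the
                // change to its ReplicaSet, which adjusts the number of Pods.
                client.apps().deployments()
                        .inNamespace("default")
                        .withName("lu-deployment")
                        .scale(4);
            }
        }
    }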

We only implemented the object kinds described above for createObjects and deleteObjects, but extending the class to other types of objects is easily possible.

Passive monitoring, which involves retrieving a list of resource allocations from the cloud system, could be implemented for Kubernetes in the future by utilizing the watch functionality, where the Kubernetes master sends any changes of the watched resources back to the client. As we do not need this functionality for our work, we only implemented active monitoring in both plugins. Refer to Section 3.2 for more details on this topic.

4.2. Changes to Existing Code

The previous version of BUNGEE assumes the hostname to be a unique identifier or name for the cloud service. Because the orchestrators are run behind a local load balancer, this assumption breaks: we send all requests to this load balancer, which then forwards them to the service. As a result, requests would be sent with the virtual service name, which does not exist as a hostname, instead of the proper one, e.g. "localhost". Fortunately, BUNGEE allows the hostname to be remapped to a different one before passing it to JMeter. This is done using the IPMap singleton class. Originally, to change the mappings, the class itself had to be changed and recompiled before running tests. As this is inconvenient and error-prone, we added a setter method to this class that allows the relevant mapping to be set while creating the service, as configured in the service settings file.

The JMeterController class is responsible for running the load generator in a separate JVM. As BUNGEE was developed on a Windows machine, it operates correctly under the Windows shell by quoting parts of the command. When run under Linux, such as in our test environment, these quotes are taken as part of the parameters themselves, which prevents JMeter from running on Linux with unmodified BUNGEE. Our solution to this issue was to replace the single command string with the ProcessBuilder constructor that takes a list of strings. By using a list, quoting the arguments is no longer necessary and the Java runtime handles the operating-system-specific behavior for us.
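The following sketch illustrates the portable invocation style; the JMeter path and arguments are placeholders, not the actual command that BUNGEE assembles.

    // Sketch of the portable invocation style: each argument is its own list element,
    // so no shell quoting is needed. Paths and arguments are placeholders.
    import java.io.IOException;
    import java.util.Arrays;
    import java.util.List;

    public class JMeterLauncherExample {
        public static void main(String[] args) throws IOException, InterruptedException {
            List<String> command = Arrays.asList(
                    "/opt/jmeter/bin/jmeter",         // placeholder JMeter binary
                    "-n",                             // non-GUI mode
                    "-t", "testplan with spaces.jmx", // spaces are safe without quotes
                    "-l", "results.jtl");
            Process process = new ProcessBuilder(command)
                    .inheritIO()                      // forward JMeter's output to our console
                    .start();
            process.waitFor();
        }
    }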

With the default JMeter configuration, the application tries to access the servlet, i.e., the application running on the containers, without specifying a path. In the bungee-lu-servlet Docker container, the default path is tools.descartes.bungee.Servlet. There are two different approaches to fixing this issue. As we run the load balancer Træfik, the first idea was to tell it to modify the path accordingly. This can be done by assigning the AddPrefix rule to the service, which is set in the JSON service specification (Swarm) or the YAML object specification (Kubernetes) read by BUNGEE. The second approach is to change the JMX file, which contains the test plan for JMeter, to include the path in the request. This is implemented in Wilhelm's multi-tier extensions, which is why we took this approach for our tests.

When analyzing a system, after sending the request to scale the cloud up or down, BUNGEE waits for the system to reach the specified resource amount. Once the system has reached the amount, there is a further fixed delay of 3 minutes where the system waits for the machines to fully start up and stabilize. While this may be necessary for VM-based cloud systems, the servlet Docker containers only take around 15 seconds to start up until requests are returned properly. To shorten the time it takes to start a test run, we extended the responsible classes ResourceWatch, DetailedSystemAnalysis and our cloud plugins to allow us to change the delay to 30 seconds.

BUNGEE in its original version uses the Eclipse Classpath format for managing its dependencies. We switched to the Apache Maven build tool to make installing and running BUNGEE easier on a new machine. Building and installing BUNGEE takes only one invocation of Maven, as the tool installs most of the dependencies automatically from an online repository. It handles starting the application from the command line as well, using the exec:java plugin.

4.3. Ansible Playbooks

In addition to the cloud plugins for Kubernetes, we implemented Ansible playbooks for setting up the cloud environment and benchmarking with stress-ng.

We use the inventory file (hosts) to specify the types and sizes of VMs to start. Mixing different sizes of VMs is possible, as is setting up the master on a different machine than the current one. All Ansible playbooks are started using a shell script (run-ansible.sh), which sets the inventory and extra variables for the setup playbooks, disables SSH host key checking as it interferes with our setup, and specifies to ask for the local root password to install software.

The first step in every playbook is to call prepare-vms.yml, which ensures the correct VMs are created, running and reachable. This helper playbook then assigns the correct IP address to the inventory for subsequent playbooks and upgrades all systems to the newest package versions.

To set up the orchestrators and the Docker registry for the first time, we use the playbook setup.yml. It disables swap, which is a requirement of Kubernetes, and then installs the private Docker registry containing the container image. Next, the playbook ensures that the Docker swarm mode is not active and then creates a new swarm using ansible-dockerswarm. Last, it invokes Kubespray to set up the Kubernetes cluster.

When we are done with the benchmark for a certain cluster configuration, the VMs can be deleted (moved to the trash) and expunged (irreversibly deleted) by delete.yml.

A number of playbooks relate to a failed approach to reducing setup times. After testing, we shut the VMs down with shutdown.yml and save the current state with template.yml. The latter creates a CloudStack template for every VM. For the next VM creation, these templates are used to quickly restore the state of the cluster without having to run setup.yml again. Sadly, this approach breaks the clusters; the problem is described in further detail in Section 5.1.

For stress-ng testing, we use the following playbooks in the subdirectory stress: install-stress-ng.yml installs stress-ng on all workers. stress-loop.yml runs stress-ng five times with Docker and natively, saving results in separate directories with timestamps. stress-single.yml does the same, only once. Both use the helper playbooks stress-without-docker.yml and stress-with-docker.yml. For the isolation benchmark, the variant stress-with-multiple-docker.yml, which uses stress-loop-multicontainer.yml, has been developed.


5. Evaluation

To answer our questions about container virtualization, we ran several tests. This chapter first describes the test setup in Section 5.1, then explains the configuration for the utilities in Section 5.2. In the remainder of the chapter, we present our evaluation results, before we recapitulate our experiences with the orchestrators in Section 5.7.

5.1. Setup

All tests are run using the respective orchestrators on a VM cluster managed by Apache CloudStack. Tests are repeated at least five times to ensure repeatability and to measure the variance of the results.

The layout of our evaluation setup is visualized in Figure 5.1, with servers as rectangles, VMs as rounded rectangles and Docker containers as circles. One VM, the master, takes the role of workspace, orchestrator and load generator for the various tests. The other VMs are used as workers in the benchmark. These are grouped together on a single physical server whose specifications can be found in Table 5.1. This is done to prevent external influences from other applications.

The containers for our tests are created using an image containing a simple Java application called bungee-lu-servlet running on the Apache Tomcat1 application server. It accepts requests and stresses the processor by calculating LU decompositions of big matrices; more requests lead to more processor usage.

Table 5.2 shows the sizes of our worker VMs, which are oriented around common IaaS clouds. The number of VMs that can run in parallel depends on whether Simultaneous Multithreading (SMT) is enabled in the hardware. In this thesis, we ran all our tests with SMT disabled.

1 http://tomcat.apache.org/

Server:    HP DL160 Gen9
Processor: Intel Xeon E5-2630v3, 8 cores, 2.6 GHz
Memory:    32 GB
Storage:   500 GB hard drive, 7200 rpm

Table 5.1.: Server hardware specifications

[Figure: diagram of the CloudStack IaaS cloud; the dedicated server hosts Worker 1-4, each running a servlet container, while another server hosts the master VM with JMeter, BUNGEE and Træfik.]

Figure 5.1.: Evaluation cloud setup

Name                        small    medium   large    huge
Processor                   1 core   2 cores  4 cores  8 cores
Memory                      2 GB     8 GB     8 GB     16 GB
Storage                     10 GB    10 GB    10 GB    10 GB
Maximum number without SMT  8        4        2        1
Maximum number with SMT     16       4        4        2

Table 5.2.: Worker VM specifications

As a load balancer, we use the open source solution Træfik2, as it interacts with the container orchestrators automatically out of the box. Using the same load balancer for all tests eliminates one potential distorting factor.

To set up the VMs in the cluster, we are using the IT automation tool Ansible3 together with plugins for setting up the cloud orchestrators4. There is built-in support for Apache CloudStack as well, which allows us to set up the test cluster with one so-called playbook, a file containing a definition of the VMs and software. The playbooks used for running the tests are described in detail in Section 4.3.

For switching between the different sizes of VMs, we tried to use the template system of CloudStack. We set up our cluster with the small machines and then saved one template per VM. Before we can re-create the machines with another specification (called compute offerings in CloudStack), we need to destroy the original machines, because CloudStack does not allow more VMs to be created on a dedicated server than the server can run at the same time. Recreating a VM with a different compute offering only changes the IP address from the perspective of the orchestrators. Thus, the newly-created machines should still be joined to the clusters managed by the different orchestrators. This templating approach seemed to work for Swarm mode, but caused major issues such as broken overlay networking and Kubernetes being unable to retrieve a list of all nodes. To fix the issues after switching, we had to re-run the setup script to reconfigure the Swarm and Kubernetes clusters. In the end, the only advantage templating has for this work is not needing to completely reinstall the orchestrators and redownload the installation data.

5.2. Configuration

Not only do the tools used in the tests need to be installed, they have various options for configuration as well. In this section we explain our process of configuring the components of the benchmark setups.

5.2.1. Swarm

For the Swarm configuration, we used the example from [18a] as a starting point. Both Træfik and bungee-lu-servlet are deployed as services. We constrain Træfik to only run on the manager node and allow it to have access to the Docker socket, as the example shows. The options need to be transferred into two different formats: both services as YAML for the docker stack command line client and the servlet as JSON for BUNGEE. To ensure no errors are made in conversion, we use the network sniffer Wireshark5 to dump the JSON specification as it is transferred to the server.
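The dumped JSON can then serve as the service specification that the Swarm plugin loads. A minimal sketch, assuming the method signatures shown in Figure A.1 in the appendix (the file names are illustrative):

```java
import java.io.File;

public class SwarmPluginExample {
    public static void main(String[] args) {
        // Docker client configuration (endpoint, certificates) for the plugin.
        SwarmManagement management = new SwarmManagement(new File("docker-client.yml"));

        // JSON service specification of the servlet, e.g. as dumped with Wireshark.
        ServiceSettings settings = ServiceSettings.load(new File("bungee-lu-servlet.json"));
        management.setServiceSettings(settings);

        // Deploy the servlet service with three replicas behind Traefik.
        if (management.createService("bungee-lu-servlet", 3)) {
            System.out.println("Running tasks: "
                    + management.getNumberOfResources("bungee-lu-servlet"));
        }
    }
}
```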

5.2.2. Kubernetes

To configure our Kubernetes objects for BUNGEE, we followed the guide in [18b]. Kubespray enables role-based access control by default, so the RoleBinding has to be set up. We chose to use a DaemonSet for the Træfik containers, as it allows us to easily restrict the load balancer to only run on the master node, as is the case with Swarm. This is achieved by first “tolerating” the master node, so the Træfik container is scheduled to run there, and afterwards using the nodeSelector property in the Pod specification to restrict the Pods from running anywhere else. As nodeSelector does not allow for direct targeting of the master role, we have to manually label the master node using kubectl. The syntax for the DaemonSet has slightly changed in the apps/v1 API; it is now required to specify a selector block with a matchLabels declaration, as is done with a Deployment.

2 http://traefik.io/
3 http://www.ansible.com/
4 ansible-dockerswarm (http://github.com/messer/ansible-dockerswarm) and kubespray (http://github.com/kubernetes-incubator/kubespray)
5 https://www.wireshark.org/

Above, in Subsection 4.1.2, we already went over the model used for the servlet containers. The containers themselves are Pods, which are controlled by a ReplicaSet, which is in turn updated by a Deployment. A Service specifies open ports, while an Ingress allows the ingress controller (Træfik in this case) to send specified outside requests to the Pods.
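The object specification file therefore contains several of these kinds at once. The sketch below shows how such a multi-document YAML file can be parsed into the generic List&lt;HasMetadata&gt; that KubernetesObjectSettings stores (see Figure A.2); it again assumes the fabric8 client, and the file name is illustrative.

```java
import io.fabric8.kubernetes.api.model.HasMetadata;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

import java.io.FileInputStream;
import java.util.List;

public class LoadObjectSpecExample {
    public static void main(String[] args) throws Exception {
        try (KubernetesClient client = new DefaultKubernetesClient();
             FileInputStream yaml = new FileInputStream("lu-objects.yml")) {
            // Parse the Deployment, Service and Ingress documents into one generic list.
            List<HasMetadata> objects = client.load(yaml).get();
            for (HasMetadata object : objects) {
                System.out.println(object.getKind() + "/" + object.getMetadata().getName());
            }
        }
    }
}
```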

5.2.3. BUNGEE

As described in Section 2.2, there are different parameters to be set in the configuration of BUNGEE. Setting these parameters correctly is essential for correct testing.

First, the SLO to use for the system analysis was unchanged from the one used in previous tests, as there is no reason to change it: 95 percent of requests must have a response time of under 500 ms. This time is measured by JMeter from sending the request to receiving an answer, thus it includes network round-trip times.

The request size is a query parameter set in every request that defines the size of the matrix to be transformed on the server. The higher this request size is set, the more load is produced and the longer it takes for the worker containers to generate a response. A lower request size increases the networking overhead, as more requests are sent to achieve the same load level. On the other hand, this allows for better accuracy when comparing different scaling levels, as more requests provide a more granular unit of measurement. Nevertheless, the request size must not be chosen too low and the request timeout must not be chosen too high, otherwise multiple issues occur as the intensity rises. BUNGEE calculates the number of JMeter threads to be used as a function of the timeout parameter and the current intensity. The default process limit on the workspace server, which is between 31000 and 32000, becomes an issue as JMeter can not create all specified threads, which causes the program to hang. For our tests, the request size parameter was set to 400 and the timeout to 1000 ms, which is double the targeted SLO response time.
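To make the effect tangible, the following back-of-the-envelope sketch assumes, as a simplification, that the load generator needs roughly one thread per request that can be in flight at the same time, i.e. about intensity × timeout threads; the exact formula used by BUNGEE may differ.

```java
public class ThreadEstimateExample {
    public static void main(String[] args) {
        double timeoutSeconds = 1.0; // request timeout of 1000 ms
        int processLimit = 31000;    // approximate default process limit on the workspace server

        // Simplified estimate: every request may stay in flight for up to the timeout,
        // so roughly intensity * timeout threads are required.
        for (int intensity : new int[] { 200, 10000, 40000 }) {
            long threads = Math.round(intensity * timeoutSeconds);
            System.out.printf("intensity %6d req/s -> ~%6d threads (limit ~%d)%n",
                    intensity, threads, processLimit);
        }
    }
}
```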

5.3. stress-ng

In our benchmark, we run 8 processor-focused worker threads for a time of 60 seconds. Because the operations have different sizes, these threads do not run for exactly 60 seconds, but rather finish their last task before quitting.

The result we are first taking a look at is the number of loops of the stressor list, i.e., runs of the “all” stressor, run to completion, divided by the wall clock time the thread ran for in seconds. This number is calculated by stress-ng and output as “bogo ops per second real time”. The stacked bars in Figure 5.2 show this metric for every cluster configuration from Table 5.2. The total height of the stack represents the performance of all worker VMs combined, while the individual bars show each worker.

Another metric stress-ng delivers is called “bogo ops per second user + system time”. Instead of dividing by the wall clock time, this metric divides by the average of the processor time that was allocated to each thread by the operating system. It more accurately represents hardware performance without taking the overhead of increased context switching into account. The actual results, plotted as the black line, show that this metric is not quite constant, but unstable, as the error bars show. The maximal value for small-native was higher than medium-native could achieve, while medium-docker had a run where its performance was worse than the average for small VMs.
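Written out, the two metrics for a stressor thread that completed N bogo operations are roughly:

```latex
\[
  \text{bogo ops/s (real time)} = \frac{N}{t_{\text{wall}}},
  \qquad
  \text{bogo ops/s (usr+sys time)} = \frac{N}{\bar{t}_{\text{usr}} + \bar{t}_{\text{sys}}},
\]
```

where \(\bar{t}_{\text{usr}}\) and \(\bar{t}_{\text{sys}}\) denote the average user and system CPU time allocated per thread.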

[Figure: stacked bar chart of bogo ops per second (real time) per worker, with bogo ops per second (usr+sys time) plotted as a line with error bars, for each worker size and run method (small/medium/large/huge, native/docker).]

Figure 5.2.: stress-ng benchmark results using different cluster configurations, without using an orchestrator

As can be seen from both of these metrics, there was no measurable performance difference between running stress-ng directly on the VM and inside a single Docker container per VM. When doubling the number of VMs and simultaneously halving the processor core count, the measured performance overhead doubles. Thus, when the complete system is partitioned into VMs, every additional VM incurs a small linear increase in performance overhead.

5.4. System Analysis

As stated before in Section 2.2, the first step of the BUNGEE benchmarking workflow is a system analysis. We begin by doing a system analysis with Swarm mode. Our results are plotted in Figure 5.3. The system scales almost linearly with the number of deployed containers. As can clearly be seen, the performance takes a sharp hit with smaller VM sizes, with the worker-small configuration only reaching 60 percent of worker-huge. This raised suspicion about the validity of the results. One possible bottleneck that could affect BUNGEE is the network performance, as all requests are sent through Træfik to the workers, passing through an overlay network.

We set out to figure out the reasons for this behavior and found a bug report from 2017 [17]. This discussion thread describes a similar issue, where a large amount of traffic comes from a single host. The Linux IPVS overlay network treats the requests as reconnection attempts and adds a wait time of one second to them. As the timeout for a request is exactly one second long, the request fails. Kubernetes works around this issue by setting the responsible kernel module parameters to reasonable values. With Swarm, turning off connection reuse manually after creating the virtual network is supposed to fix the bottleneck, albeit risking reduced performance when scaling up or connection errors when scaling down.

[Figure: maximum intensity over resource amount (0-9 containers), one curve per worker size (small, medium, large, huge).]

Figure 5.3.: System analysis results using different cluster configurations, with Swarm mode, without any manual tweaks

To validate this claim, we used the recommended instructions to set the module parameter net.ipv4.vs.conn_reuse_mode to 0 for the overlay network between Træfik and the workers. The results of the second system analysis, after this command was run, are plotted in Figure 5.4. Here, the values for the worker-medium and worker-large configurations (green and blue lines) are reasonable, which confirms the fix from [17] had a positive effect. What immediately becomes apparent, though, is the massive instability in the worker-small configuration (violet), which is visualized by the error bars. It is also interesting that full performance in the worker-huge (yellow) configuration was only reached after starting a second container on the same VM, which goes against the theory we established in Section 3.2.

We run a third system analysis, this time using the Kubernetes orchestrator, to see if this new problem is constrained to Swarm mode. Its results are visualized in Figure 5.5. This time there are no larger deviations and the worker-small results seem fine at first glance. However, the worker-medium results for three and four resources are a lot lower than the linear trend, and the worker-large performance for 2 containers is on average lower than worker-small. The interesting effect in the worker-huge measurements (two containers on the same server giving a higher performance than one) continues in this experiment as well.

5.5. Long-Time Test

To explain these issues and see if the issue persists with longer benchmark runs, we run a long-time test of the cloud system. For a period of 30 minutes, we apply a constant load of 161 requests per second to a Kubernetes cluster with the worker-medium configuration. At this point we can take a closer look at the responses recorded by JMeter. Here we can see a certain pattern which is characteristic for all request series we measured: at first only one or two servers are processing data, while the other servers are overloaded and let all requests time out. Over time, the rest of the servers join in.

This phenomenon appears to be caused by a shortage of worker threads in the Apache Tomcat server that powers the servlet. As the processor load is constantly at the maximum level, there is no processor time left to create any new threads which could handle the requests in parallel. To fix this issue, we propose to create a higher number of worker threads initially, so the surge of requests is handled better. Sadly, we did not have the chance to make any changes to the servlet, so we could not test this theory.

[Figure: maximum intensity over resource amount (0-9 containers), one curve per worker size (small, medium, large, huge).]

Figure 5.4.: System analysis results using different cluster configurations, with Swarm mode, connection reuse disabled

[Figure: maximum intensity over resource amount (0-9 containers), one curve per worker size (small, medium, large, huge).]

Figure 5.5.: System analysis results using different cluster configurations, with Kubernetes

[Figure: number of requests over time (0-30 minutes), stacked per host (1-4) and split into passing, failing and timed-out responses.]

Figure 5.6.: Plot of responses for long-time test, grouped by host and SLO compliance

5.6. Parallelism and Isolation

Because of the issues described above in obtaining reliable performance readings by using BUNGEE, we switched to the stress-ng benchmark for exploring running multiple containers on one machine. Once again, we run 8 processor-focused worker threads in Docker containers, but this time for 180 seconds. As with the long-time test, we chose the worker-medium configuration. We have plotted the results similarly to Section 5.3 in Figure 5.7.

Here we can see that all parallel containers generated very similar results to each other. Multiple containers are able to share a system, as their processes and threads are managed by the host system just like native processes. In contrast to the BUNGEE benchmarks above, there is no measurable performance difference between running one to four containers on a single VM.

5.7. Differences between container orchestrators

In the work for this thesis, we have explored the two orchestrators Swarm and Kubernetes and exposed part of their objective benefits and drawbacks. In this last section of the evaluation, we are going to list and briefly explain the positive and negative points of working with the two orchestrators. Each topic has a “winner”, which is based on which orchestrator was better suited for our work in the given category.

• API structure: Kubernetes
By splitting the project API into separate parts, Kubernetes remains clearly structured. Swarm’s API has much less functionality, but is harder to grasp, even more so because of the confusing documentation.

[Figure: stacked bar chart of bogo ops per second (real time) per container, with bogo ops per second (usr+sys time) plotted as a line, for one to four containers per host.]

Figure 5.7.: stress-ng parallelism benchmark results, without using an orchestrator

• Ansible support: Swarm
The ansible-dockerswarm role integrated well into Ansible and allowed us to quickly deploy a cluster. Kubespray is bloated with lots of special cases and support for different overlay networks, which leads to it taking multiple times as long as ansible-dockerswarm.

• Documentation: Kubernetes
Kubernetes’ documentation is logically separated into the objects and categories, while the docs for Swarm are hidden inside the main Docker documentation and only cover command line syntax for the most part.

• Extensibility: Kubernetes
Nearly everything in Kubernetes can be replaced by custom solutions, as its core is built as a modular plugin-style architecture. Docker, and by extension Swarm mode, although supporting simple plugins, is monolithic in comparison.

• Functionality: Kubernetes
Even without plugins, Kubernetes allows for many more use cases with its different types of controllers. Swarm only supports simple always-on services.

• Java library: Swarm
The official Swarm Java library is a pleasure to work with. Kubernetes’ official Java API is largely autogenerated, and the alternative solution we used is built around “fluent DSL” principles instead of actual Java patterns.

• Required setup: Swarm
Kubespray enforces many rules which are irrelevant to our work. ansible-dockerswarm only needs to be fed two different groups and produces a working cluster in a short period of time.

• Stability over re-imaging: Swarm
Kubernetes broke completely when we changed cluster configurations. It needed to be re-installed from scratch to work again. Swarm survived, but the cluster had to be rebuilt, which took a minute with Ansible.


6. Conclusion

In the final chapter, we briefly summarize our work in Section 6.1. Afterwards, in Section 6.2, we answer the research questions from the introduction. Finally, Section 6.3 presents suggestions for future work based on this thesis.

6.1. Summary

This thesis evaluates the suitability of container-based virtualization for elasticity benchmarking of cloud systems. By using container orchestrators, a similar interface to VM clusters has been used to benchmark different scenarios. After comparing the top three orchestrators with the newcomer Nomad, the BUNGEE benchmark has been extended to include benchmarking via the container orchestrators Swarm mode and Kubernetes. BUNGEE provides a common framework for different cloud systems and evaluations, which allows the integration and implementation work to be reused for other types of tests as well. In the process, a number of research questions have been answered and the groundwork for future evaluation and work has been laid down.

6.2. Results

In Section 1.3 we have established a number of research questions for this thesis. Here, we go over these questions once more, answering each one briefly and referring to the related detailed write-ups.

Q1) Which circumstances determine if it is better to run a container cluster on many smaller systems or fewer, more powerful hosts? (see Section 5.3)
The processor performance difference between native applications and Docker containers is negligible. If virtualization is used, larger VMs reduce context switching overhead and improve performance, but sacrifice fine-grained scalability.

Q2) What are the main benefits and drawbacks of the Docker cluster management tools? (see Sections 2.4 and 5.7)
Swarm is simpler and better integrated into Docker, but more restricted and less mature. Kubernetes has tons of features and is trusted by the majority of companies, while requiring more time to set up and configure properly.

Q3) Does the performance of a container change measurably over its lifetime? (see Section 5.5)
The containers themselves stay at constant performance, but care should be taken to allow the server applications to start enough worker threads before being flooded.

Q4) To what degree does the performance of a container depend on other containers on the same machine? (see Section 5.6)
A container shares the processor performance with all other applications on the system, especially other containers. Running more replicas leaves less of this performance available per container.

Q5) By how much does the performance fluctuate between runs of the same test? (see Section 5.4)
In general, all of the benchmarks we have run were very stable. Performance fluctuations and non-linear scaling indicate a problem, as has been the case with Swarm.

For the implementations, we set the following goals:

M1) Make the project compatible with Maven, the build system used for other Descartes Java applications
Dependencies are now managed by Maven, which simplifies the install process and removes clutter from the code repository. Running the benchmark without a running IDE is greatly simplified as well.

M2) Improve cross-platform compatibility
BUNGEE now runs on Windows and Linux. Theoretically it could be run on any platform that supports the required tools and libraries.

M3) Integrate Wilhelm’s multi-tier benchmark tools into the main codebase
We merged our work with the multi-tier codebase. However, we did not test multi-tier benchmarking of any sort.

6.3. Future Work

In the course of this thesis we have evaluated and implemented several options for elasticity benchmarking in container-based clusters. Ideas for future work using the findings and implementations of the thesis are listed below:

• Implementation

– Integration of an alternative load generator: Originally, we had planned to integrate an alternative to JMeter, namely the HTTP load generator by Jóakim von Kistowski. The main advantage of using this project for BUNGEE is that it is built for timestamp-based request sending from the ground up. Its server component could be directly integrated into BUNGEE, making deployment easier and decreasing overhead. This solution increases customizability, too, as the load generator is controlled by Lua scripts.

– More cloud plugins: We have listed four of the most common Docker orchestrators, choosing to evaluate the two most widespread solutions. Additional cloud plugins for other orchestrators such as Mesos and Nomad have not yet been implemented.

– Configurability: It is currently necessary to recompile BUNGEE to change basic configuration parameters such as IP address maps, timeouts and wait times between measurements. Reading these parameters from config files instead of hardcoding them would make it possible to configure BUNGEE without understanding the code.

– Documentation: Currently, the documentation for the BUNGEE benchmark consists of the theses and papers which have implemented and used the benchmark. There is also a quick-start guide available at the homepage, which covers setting up BUNGEE on Amazon AWS clouds. Collecting the necessary information for setting up other cloud plugins and performing multi-tier measurements into a central documentation, for example on the website of the project1, would make integrating BUNGEE easier.

• Evaluation

– Investigate and fix Servlet performance issues: In a large portion of tests the Servlet containers have experienced processor starvation. Before using the servlet container to run more benchmarks, the idle worker thread number should be increased.

– Re-evaluate Swarm mode: We have identified the issue described in [17] as a major issue for benchmarking. Before considering Swarm mode for further benchmarking with BUNGEE, a reliable fix or work-around for this bug has to be found. Afterwards, a re-evaluation of the questions in this thesis may be conducted using the Swarm orchestrator.

– Elasticity Benchmarking: The main reason for our work has been to allow elasticity benchmarking with containers. To benchmark scaling solutions, said solutions have to be compatible with the chosen orchestrator. Furthermore, the BUNGEE plugin of the orchestrator has to be adapted to work with the scaler, which requires only minor changes in our code.

– Multi-tier elasticity benchmarking: Based on the work of [Wil17] and this thesis, multi-tier scaling solutions can be benchmarked as well. As we have not tested this part of the code, tweaks and fixes are likely necessary.

– Compare VMs with containers: We have only compared two container orchestrators to each other, leaving comparisons with VMs out of the picture. Combining the existing cloud plugins with our newly-implemented plugins makes similar evaluations as described in Subsection 2.5.2 possible.

– SMT: In Section 5.1 we have hinted at the possibility of running more VMs on a single machine by turning on SMT on the host. The impact of this change on the performance and isolation of the VMs remains to be explored.

1 http://descartes.tools/bungee


Bibliography

[16] Ubuntu Manpage: stress-ng - a tool to load and stress a computer system, 0.05.23, Apr. 2016. [Online]. Available: http://manpages.ubuntu.com/manpages/xenial/man1/stress-ng.1.html.

[17] (Oct. 2017). [SWARM] Very poor performance for ingress network with lots of parallel requests, [Online]. Available: https://github.com/moby/moby/issues/35082.

[18a] Docker Swarm (mode) cluster - Træfik, v1.7.0-rc2, Containous SAS, Jul. 2018. [Online]. Available: https://docs.traefik.io/user-guide/swarm-mode/.

[18b] Kubernetes Ingress Controller - Træfik, v1.7.0-rc3, Containous SAS, Jul. 2018. [Online]. Available: https://docs.traefik.io/user-guide/kubernetes/.

[Bau16] A. Bauer, “Design and Evaluation of a Proactive, Application-Aware Elasticity Mechanism”, Master Thesis, University of Würzburg, Am Hubland, Informatikgebäude, 97074 Würzburg, Germany, Sep. 2016.

[Doc13] Docker Inc. (Oct. 2013). dotCloud, Inc. is Becoming Docker, Inc. | Docker Blog, [Online]. Available: https://web.archive.org/web/20140330235649/http://blog.docker.io/2013/10/dotcloud-is-becoming-docker-inc/.

[Doc15] ——, (Jul. 2015). What is Docker?, [Online]. Available: https://web.archive.org/web/20150702020248/https://www.docker.com/whatisdocker.

[Doc18] ——, (Jun. 2018). Docker Overview | Docker Documentation, [Online]. Available: https://docs.docker.com/engine/docker-overview/.

[DRK14] R. Dua, A. R. Raja, and D. Kakadia, “Virtualization vs Containerization to Support PaaS”, in 2014 IEEE International Conference on Cloud Engineering, Mar. 2014, pp. 610–614. doi: 10.1109/IC2E.2014.41.

[FFRR15] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual machines and Linux containers”, in 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Apr. 2015, pp. 171–172. doi: 10.1109/ISPASS.2015.7095802.

[Goo18] Google. (Jun. 2018). docker swarm, kubernetes, mesos, hashicorp - Google Trends, [Online]. Available: https://g.co/trends/cyRVG.

[HKR13] N. R. Herbst, S. Kounev, and R. Reussner, “Elasticity in Cloud Computing: What it is, and What it is Not”, (Short Paper), in Proceedings of the 10th International Conference on Autonomic Computing (ICAC 2013), Acceptance Rate (Short Paper): 36.9%, San Jose, CA: USENIX, Jun. 2013. [Online]. Available: https://www.usenix.org/conference/icac13/elasticity-cloud-computing-what-it-and-what-it-not.

[HKWG15] N. R. Herbst, S. Kounev, A. Weber, and H. Groenda, “BUNGEE: An Elasticity Benchmark for Self-Adaptive IaaS Cloud Environments”, in Proceedings of the 10th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS 2015), Firenze, Italy, May 2015.

[LBHK18] V. Lesch, A. Bauer, N. Herbst, and S. Kounev, “FOX: Cost-Awareness for Autonomic Resource Management in Public Clouds”, in Proceedings of the 9th ACM/SPEC International Conference on Performance Engineering (ICPE 2018), (To Appear), Full Paper Acceptance Ratio: 23.7%, Berlin, Germany: ACM, Apr. 2018. doi: 10.1145/3184407.3184415. [Online]. Available: https://doi.org/10.1145/3184407.3184415.

[LC16] P. Leitner and J. Cito, “Patterns in the Chaos - A Study of Performance Variation and Predictability in Public IaaS Clouds”, ACM Trans. Internet Technol., vol. 16, no. 3, 15:1–15:23, Apr. 2016, issn: 1533-5399. doi: 10.1145/2885497. [Online]. Available: http://doi.acm.org/10.1145/2885497.

[Les17] V. Lesch, “Self-Aware Multidimensional Auto-Scaling”, Würzburg Software Engineering Award sponsored by Bosch Rexroth, Master Thesis, University of Würzburg, Am Hubland, Informatikgebäude, 97074 Würzburg, Germany, Sep. 2017.

[RSPR15] M. Raho, A. Spyridakis, M. Paolino, and D. Raho, “KVM, Xen and Docker: a performance analysis for ARM based NFV and Cloud computing”, in 3rd Workshop on Advances in Information, Electronic and Electrical Engineering (AIEEE 2015), IEEE, 2015, pp. 1–8.

[The18a] The Cloud Native Computing Foundation. (Aug. 2018). CNCF Survey: Use of Cloud Native Technologies in Production Has Grown Over 200%, [Online]. Available: https://www.cncf.io/blog/2018/08/29/cncf-survey-use-of-cloud-native-technologies-in-production-has-grown-over-200-percent/.

[The18b] The Linux Foundation. (Jun. 2018). Home - Open Containers Initiative, [Online]. Available: https://www.opencontainers.org/.

[vKHK14] J. G. von Kistowski, N. R. Herbst, and S. Kounev, “Modeling Variations in Load Intensity over Time”, in Proceedings of the 3rd International Workshop on Large-Scale Testing (LT 2014), co-located with the 5th ACM/SPEC International Conference on Performance Engineering (ICPE 2014), Dublin, Ireland: ACM, Mar. 2014, pp. 1–4, isbn: 978-1-4503-2762-6. doi: 10.1145/2577036.2577037. [Online]. Available: http://doi.acm.org/10.1145/2577036.2577037.

[Web14] A. Weber, “Resource Elasticity Benchmarking in Cloud Environments”, Master Thesis, Karlsruhe Institute of Technology (KIT), Am Fasanengarten 5, 76131 Karlsruhe, Germany, Aug. 2014.

[Wil17] M. Wilhelm, “Elasticity Benchmarking for Multi-Tier Cloud Applications”, Würzburg Software Engineering Award sponsored by Bosch Rexroth, Bachelor Thesis, University of Würzburg, Am Hubland, Informatikgebäude, 97074 Würzburg, Germany, Jun. 2017.

[XNR+13] M. G. Xavier, M. V. Neves, F. D. Rossi, T. C. Ferreto, T. Lange, and C. A. F. D. Rose, “Performance Evaluation of Container-Based Virtualization for High Performance Computing Environments”, in 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, Feb. 2013, pp. 233–240. doi: 10.1109/PDP.2013.41.


Appendix

A. UML Diagrams of Implementation Work


[Figure: UML class diagram of the swarm package, together with the CloudInfo and CloudManagement interfaces and the example classes TestSwarm, RunBenchmarkOnSwarm and DetailedSwarmAnalysis, which use the plugin classes listed below.]

SwarmManagement:
+SwarmManagement(dockerConfig: File)
+getServiceSettings(): ServiceSettings
+setServiceSettings(serviceSettings: ServiceSettings): void
+setServiceSettings(serviceSettings: File): void
+createService(serviceName: String, resourceAmount: int): boolean
+stopService(serviceName: String): boolean
+getScalingBounds(serviceName: String): Bounds
+setScalingBounds(serviceName: String, bounds: Bounds): boolean
+getNumberOfResources(serviceName: String): int
+getNumberOfResources(): int

ServiceSettings:
+ServiceSettings(spec: ServiceSpec, publicHost: String)
+getSpec(): ServiceSpec
+setSpec(spec: ServiceSpec): void
+getPublicHost(): String
+setPublicHost(publicHost: String): void
+load(file: File): ServiceSettings
+save(file: File): void

Figure A.1.: Class diagram of the Swarm mode cloud plugin


[Figure: UML class diagram of the kubernetes package, together with the CloudInfo and CloudManagement interfaces and the example classes TestKubernetes, RunBenchmarkOnKubernetes and DetailedKubernetesAnalysis, which use the plugin classes listed below.]

KubernetesManagement:
+KubernetesManagement(k8sClientSettings: File)
+getObjectSettings(): KubernetesObjectSettings
+setObjectSettings(objectSettings: KubernetesObjectSettings): void
+setObjectSettings(objectSettings: File): void
+createObjects(namePrefix: String, resourceAmount: int): boolean
+deleteObjects(namePrefix: String): boolean
+getScalingBounds(namePrefix: String): Bounds
+setScalingBounds(namePrefix: String, bounds: Bounds): boolean
+getNumberOfResources(namePrefix: String): int
+getNumberOfResources(): int

KubernetesObjectSettings:
+KubernetesObjectSettings(spec: List<HasMetadata>, publicHost: String)
+getSpec(): List<HasMetadata>
+setSpec(spec: List<HasMetadata>): void
+getPublicHost(): String
+setPublicHost(publicHost: String): void
+load(file: File): KubernetesObjectSettings
+save(file: File): void

Figure A.2.: Class diagram of the Kubernetes cloud plugin