
ANALYSIS OF FILE SYSTEMS PERFORMANCE IN AMAZON EC2 STORAGE

by

Nagamani Pudtha

A Thesis submitted to the

School of Graduate Studies

in partial fulfilment of the requirements for the degree of

Master of Science

Department of Computer Science

University of Colorado Colorado Springs

December 2014

Colorado Springs Colorado


Copyright by Nagamani Pudtha 2014

All Rights Reserved


This thesis for the Master of Science degree in Computer Science by

Nagamani Pudtha

has been approved for the

Department of Computer Science

By

_______________________________________________________

Advisor: Jia Rao

_______________________________________________________

Dr. C. Edward Chow

_______________________________________________________

Dr. Xiaobo Zhou

____________________

Date


Nagamani Pudtha (M.S. Computer Science)

Analysis of File Systems Performance in Amazon EC2 Storage

Thesis directed by Professor Jia Rao, Department of Computer Science

ABSTRACT

Cloud computing has gained tremendous popularity in recent years. Cloud computing refers to using a third-party network of remote servers hosted on the internet to store and manage your data, rather than storing it locally. In simple words, cloud services provide you with your own hard drive in the cloud, on the internet. Both public and private clouds are available in the market, and whether a cloud is public or private, the key to success is creating an appropriate and efficient server, network and storage infrastructure in which all resources can be efficiently utilized and shared. In cloud computing, data storage becomes even more crucial, since all data resides on the storage systems in a shared infrastructure model. It is therefore very important to understand the performance of a particular storage option before making the transition.

In this paper we perform experiments on Amazon EC2 cloud storage. We select the Amazon public cloud platform since it is one of the most widely adopted public cloud platforms and offers Infrastructure-as-a-Service (IaaS). Prior work has shown that applications with significant communication or I/O tend to perform poorly in virtualized cloud environments. However, there is a limited understanding of the I/O characteristics of cloud environments. In this paper, several tests and benchmarks were used to evaluate I/O under different storage settings of the Amazon cloud with different types of file systems, with the goal of exercising and observing the I/O performance from different perspectives by using different workloads, with a special focus on long-running jobs. We use the FIO, FileBench and blktrace benchmarking tools. Blktrace provides detailed block-level IO analysis of both EBS and Instance Store disks. Through a set of detailed micro- and macro-benchmarking measurements, the tests revealed different levels of performance degradation in EBS and instance storage due to the different types of file systems and different types of workloads.

From our detailed interpretation of the test results, in particular of how time is spent in the different IO stages, we provide user guidance on how to select the storage option when choosing instances.


ACKNOWLEDGEMENTS

First and foremost I would like to sincerely thank my advisor Jia Rao for all the

guidance and interest he took in the progress of this work. I am very grateful to him

for his constructive comments, feedback and also providing the resources like

Amazon Web Services during our research. I will consider myself lucky if I have

imbibed at least a small percentage of his admirable qualities like devotion and

single-minded dedication towards work.

I am extremely grateful to Dr. Edward Chow and Dr. Xiaobo Zhou for their valuable suggestions and advice on the thesis proposal and throughout this thesis work, without which it would have been very difficult for me to complete this work.

Many thanks go to my family members for their constant support and encouragement.

I am grateful to my parents for all their love, affection and blessings without which I

would not have gotten this far in life. Finally, a special note of thanks goes to Venkat, my husband, for his continual encouragement, support, advice and patience, which have enabled me to accomplish things I never thought were possible.


CONTENTS

ABSTRACT...............................................................................................................................3

LIST OF TABLES.....................................................................................................................8

LIST OF GRAPHS....................................................................................................................8

CHAPTER 1............................................................................................................................10

INTRODUCTION...................................................................................................................10

1.1 MOTIVATION...........................................................................................................10

1.2 THESIS GOAL...........................................................................................................11

1.3 THESIS ORGANIZATION........................................................................................12

CHAPTER 2............................................................................................................................13

BACKGROUND.....................................................................................................................13

2.1 AMAZON EC2...........................................................................................................13

2.2 AMAZON EC2 STORAGE........................................................................................14

2.2.1 ELASTIC BLOCK STORE........................................................................................14

2.2.1.1 GENERAL PURPOSE VOLUMES........................................................................14

2.2.2 INSTANCE STORE...................................................................................................15

2.3 IO ANALYSIS TOOLS..............................................................................................16

2.3.1 BLKTRACE/BLKPARSE..........................................................................................16

CHAPTER 3............................................................................................................................18

EXPERIMENT METHODOLOGY........................................................................................18

3.1 OVERVIEW...............................................................................................................18

3.2 PROPERTIES.............................................................................................................18

3.3 INSTANCE TYPE SELECTION...............................................................................18

3.4 FILE SYSTEM SELECTION.....................................................................................19

3.5 TEST BED SETUP.....................................................................................................21

3.6 BENCHMARKING TOOLS......................................................................................22

3.6.1 MACRO BENCHMARKS.........................................................................................22

3.6.2 MICRO BENCHMARKS.........................................................................24

CHAPTER 4............................................................................................................................25


EXPERIMENT RESULTS & ANALYSIS.............................................................................25

4.1. FILEBENCH RESULTS............................................................................................25

4.1.1 EBS.............................................................................................................................26

4.1.2 INSTANCE STORE:..................................................................................................31

4.2 FIO RESULTS............................................................................................................37

4.2.1 EBS:............................................................................................................................38

4.2.2 INSTANCE STORE...................................................................................................51

CHAPTER 6............................................................................................................................62

IO ANALYSIS.........................................................................................................................62

CHAPTER 7............................................................................................................................65

DISCUSSION..........................................................................................................................65

CHAPTER 8............................................................................................................................67

FUTURE WORK.....................................................................................................................67

CHAPTER 9............................................................................................................................67

CONCLUSION........................................................................................................................67

BIBLIOGRAPHY....................................................................................................................69

APPENDIX A..........................................................................................................................71

FIO Benchmark 64k block size with random workload:.........................................................71

APPENDIX B..........................................................................................................................72

SAMPLE FILEBENCH OUTPUT:.........................................................................................72

APPENDIX C..........................................................................................................................74

SAMPLE BLKTRACE OUTPUT:..........................................................................................74

APPENDIX D...........................................................................................................................77

SAMPLE BTT OUTPUT.......................................................................................................77

LIST OF FIGURES

Figure 1: Amazon EC2 Storage...................................................................................14

Figure 2: blktrace General Architecture......................................................................17


LIST OF TABLES

Table 1 Experiment Setup –EBS...............................................................................21

Table 2 FileBench Workloads...................................................................................25

Table 3 IOPS of different filesystems with different workloads on EBS with 8k….36

Table 4 IOPS of different filesystems with different workloads on EBS with 512k 36

Table 5 IOPS of different filesystems with different workloads on Instance Store

with 8k........................................................................................................37

Table 6 IOPS of different filesystems with different workloads on Instance Store

with 512k.................................................................................................37

Table 7 FIO Benchmark Parameters for EBS and Instance store..............................38

Table 8 IOPS for 8 jobs & 4k block size- EBS vs Instance store..............................59

Table 9 IOPS for 16 jobs & 4k block size- EBS vs Instance store............................59

Table 10 IOPS for 32 jobs & 4k block size- EBS vs Instance store............................59

LIST OF GRAPHS

Graph 4.1:1 IOPS of different file systems with different workloads on EBS............27

Graph 4.1:2 Latency of different filesystems with different workloads on EBS........29

Graph 4.1:3 Bandwidth of file systems on EBS..........................................................30

Graph 4.1:4 IOPS of different filesystems with different workloads on Instance Store.............................................................................................................................31

Graph 4.1:5 Latency of different file systems with different workloads on Instance Store.....................................................................................................................33


Graph 4.1:6 Bandwidth of different filesystems with different workloads on Instance Store.....................................................................................................................35

Graph 4.1:7 EXT3 Random Read Write 8 jobs..........................................................38

Graph 4.1:8 EXT3 Random Read Write 16 jobs.........................................................39

Graph 4.1:9 EXT3 Random Read Write 32 jobs.........................................................39

Graph 4.1:10 EXT3 Sequential Read Write 8 jobs......................................................40

Graph 4.1:11 EXT3 Sequential Read Write 16 jobs.....................................................41

Graph 4.1:12 EXT3 Sequential Read Write 32 jobs....................................................41

Graph 4.1:13 EXT4 Rand Read Write 8 jobs.............................................................42

Graph 4.1:14 EXT4 Rand Read Write 16 jobs............................................................43

Graph 4.1:15 EXT4 Rand Read Write 32 jobs............................................................43

Graph 4.1:16 EXT4 Sequential Read Write 8 jobs.....................................................44

Graph 4.1:17 EXT4 Sequential Read Write 16 jobs....................................................44

Graph 4.1:18 EXT4 Sequential Read Write 32 jobs...................................................45

Graph 4.1:19 XFS Rand Read Write 8 jobs................................................................46

Graph 4.1:20 XFS Rand Read Write 16 jobs..............................................................46

Graph 4.1:21 XFS Rand Read Write 32 jobs..............................................................46

Graph 4.1:22 XFS Sequential Read Write 8 jobs........................................................47

Graph 4.1:23 XFS Sequential Read Write 16 jobs......................................................47

Graph 4.1:24 XFS Sequential Read Write 32 jobs.....................................................48

Graph 4.1:25 EXT3 Rand Read Write 8 jobs..............................................................50

Graph 4.1:26 EXT3 Rand Read Write 16 jobs...........................................................50

Graph 4.1:27 EXT3 Rand Read Write 32 jobs...........................................................50

Graph 4.1:28 EXT3 Sequential Read Write 8 jobs......................................................51


Graph 4.1:29 EXT3 Sequential Read Write 16 jobs......................................................52

Graph 4.1:30 EXT3 Sequential Read Write 32 jobs...................................................52

Graph 4.1:31 EXT4 Random Read Write 8 jobs.........................................................53

Graph 4.1:32 EXT4 Random Read Write 16 jobs.......................................................53

Graph 4.1:33 EXT4 Random Read Write 32 jobs.......................................................53

Graph 4.1:34 EXT4 Sequential Read Write 8 jobs.....................................................54

Graph 4.1:35 EXT4 Sequential Read Write 16 jobs....................................................54

Graph 4.1:36 EXT4 Sequential Read Write 32 jobs....................................................55

Graph 4.1:37 XFS Random Read Write 8 jobs...........................................................56

Graph 4.1:38 XFS Random Read Write 16 jobs.........................................................56

Graph 4.1:39 XFS Random Read Write 32 jobs.........................................................56

Graph 4.1:40 XFS Sequential Read Write 8 jobs.......................................................57

Graph 4.1:41 XFS Sequential Read Write 16 jobs......................................................57

Graph 4.1:42 XFS Sequential Read Write 32 jobs.....................................................58


CHAPTER 1

INTRODUCTION

1.1 MOTIVATION

Amazon EC2, the leading IaaS (Infrastructure as a Service) provider and a subset of

offerings from Amazon Web Services, has had a significant impact in the business IT

community and provides reasonable and attractive alternatives to locally-owned

infrastructure. Amazon Elastic Compute Cloud has been used by a host of small and medium-sized enterprises for a variety of purposes. Amazon EC2 was introduced in 2006 and supports a wide range of instance types with different storage settings. Amazon EC2 provides Elastic Block Store (EBS) [2], Instance Storage [3] and the Amazon Simple Storage Service (S3).

There are many discussions and questions in the community about which storage setting a user should choose. This thesis analyzes the performance of Amazon EBS and Instance storage with different combinations of file systems, workloads, and read/write operations. We also extend prior work in which the authors focused on nested file system performance [1] on only one kind of storage on a single instance, whereas this thesis examines the performance of Amazon EC2's EBS and Instance storage by launching two instances, one on each storage type, and testing them with different workloads under different file systems.

Understanding how data makes its way from the application to storage devices is key

to understanding how I/O works. With this knowledge, users can make much better decisions about storage design and storage purchases for their applications. Monitoring

the lowest level of the I/O stack, the block driver, is a crucial part of this overall

understanding of I/O patterns.


1.2 THESIS GOAL

In this thesis, we aim to present a measurement study that characterizes the performance implications of the storage options in the Amazon Elastic Compute Cloud (EC2) [6] data center. Performance has a long tradition in storage research; we measure performance by analyzing the I/O characteristics, workload demand, and storage configuration, attaching General Purpose SSD volumes to instances, and we provide user guidance on how to select the storage option in Amazon EC2.

This research aims to answer the following questions within the bounds of the environments tested.

• Is there a wide range of performance variation between the Amazon EC2 storage options?

• Can the block size be a cause of I/O performance degradation?

• Which option delivers the better peak performance?

• Which option delivers the more consistent performance?

• Is either of these two settings an all-time winner for all workloads, or is the performance workload-dependent?

We ran many experiments on both EBS and Instance store instances, collected

detailed performance measurements, and analyzed them. We found that different

workloads, not too surprisingly, have a large impact on system behavior. No single

file system worked best for all workloads. Some file system features helped

performance and others hurt it.

Our goal is to quickly observe the events in each layer and how they interact, and also to provide enough information to study even small details. To gather information from these components, we use the existing blktrace mechanism [15].

1.3 THESIS ORGANIZATION

The remainder of this thesis is structured as follows.


In Chapter 2, we describe some of the background that led to this project and our goals for the project. Chapter 3 describes the experiment setup and the benchmarking tools used in our experiments. Chapter 4 presents and analyzes the results of those benchmarks on the EXT3, EXT4 and XFS file systems with EBS and Instance Store volumes. Chapter 6 provides a block-level IO analysis, Chapter 7 discusses the results, Chapter 8 focuses on future areas of this thesis work, and Chapter 9 gives the conclusion.


CHAPTER 2

BACKGROUND

2.1 AMAZON EC2

Amazon Elastic Compute Cloud (Amazon EC2) [6] is a component of Amazon’s

Web Services (AWS). EC2 is a central part of Amazon.com’s cloud computing

platform. Amazon EC2 is a Web-based service from which users can rent, for a monthly or hourly fee, virtual servers in the cloud and run custom applications on those servers. Elasticity in EC2 refers to the ease with which users can scale server and application resources as their computing demands change.

Amazon EC2 uses the Xen virtualization technique [5] to manage physical servers.

There might be several Xen virtual machines running on one physical server. Each

Xen virtual machine is called an instance in Amazon EC2. There are several types of

instances. Each type of instance provides a predictable amount of computing

capacity. The input-output (I/O) capacities of these instance types vary according to the storage attached to those instances. Allocated EC2 instances can be

placed at different physical locations. Amazon organizes the infrastructure into

different regions and availability zones.

To use EC2, a subscriber creates an Amazon Machine Image (AMI) containing the

operating system, application programs and configuration settings. Then the AMI is

uploaded to the Amazon Simple Storage Service (Amazon S3) and registered with

Amazon EC2, creating a so-called AMI identifier (AMI ID). Once this has been done,

the subscriber can requisition virtual machines on an as-needed basis. Capacity can be

increased or decreased in real time from as few as one to more than 1000 virtual

machines simultaneously. Billing takes place according to the computing, storage and

network resources consumed.
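As an illustration only (a minimal sketch; the thesis does not prescribe this procedure, and the AMI ID, key pair name and instance type below are placeholders), a registered AMI can also be launched from the AWS command line interface:

aws ec2 run-instances --image-id ami-xxxxxxxx --instance-type m3.2xlarge --count 1 --key-name my-key-pair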


2.2 AMAZON EC2 STORAGE

In the Amazon cloud there are three storage choices for an instance boot disk or its root device: Instance Store, Elastic Block Store (EBS) and the Simple Storage Service (S3). In this section, we briefly discuss these three storage settings. The three types of EC2 storage are depicted in Figure 1 [6].

Figure 1: Amazon EC2 Storage

2.2.1 ELASTIC BLOCK STORE

Amazon's EBS volumes provide persistent block-level storage. Once we attach an EBS volume to an instance, we can create file systems on it and even run a database on top of these volumes. Amazon EBS provides three types of volumes: General Purpose (SSD), Provisioned IOPS (SSD), and Magnetic. The three volume types differ in performance characteristics and cost. In our thesis research, we attach only General Purpose storage volumes to our instances due to cost constraints.
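As a hedged illustration (a minimal sketch; the availability zone, volume ID and instance ID are placeholders, and the EC2 console can equally be used), a General Purpose volume can be created and attached from the AWS command line interface:

# create a 1024 GB General Purpose (gp2) volume
aws ec2 create-volume --availability-zone us-east-1a --size 1024 --volume-type gp2
# attach it to an instance as /dev/xvdb
aws ec2 attach-volume --volume-id vol-xxxxxxxx --instance-id i-xxxxxxxx --device /dev/xvdb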


2.2.1.1 GENERAL PURPOSE VOLUMES

General Purpose (gp2) is currently the default EBS volume type when creating a volume ("Create Volume") in the EC2 console. General Purpose

volumes are backed by Solid-State Drives (SSDs) and are suitable for a broad range

of workloads, including small to medium-sized databases, development and test

environments, and boot volumes. General Purpose volumes provide the ability to

burst up to 3,000 IOPS per volume, independent of volume size, to meet the

performance needs of most applications. General Purpose volumes also deliver a

consistent baseline of 3 IOPS/GB and provide up to 128MBps of throughput per

volume. I/O is included in the price of General Purpose volumes, so you pay only for

each GB of storage you provision.

General Purpose SSD volumes, measured in IOPS, offer roughly 10 times more input/output operations per second than Magnetic volumes, with about one-tenth the latency, as well as greater bandwidth and consistency. When we need a greater number

of IOPS than General Purpose (SSD) volumes provide or we have a workload where

performance consistency is critical, Amazon EBS Provisioned IOPS (SSD) volumes

will help us.

2.2.2 INSTANCE STORE

Instance-store volumes [3] are temporary storage, which survive rebooting an EC2

instance, but when the instance is stopped or terminated (e.g., by an API call, or due

to a failure), this store is lost. EBS volumes are built on replicated storage, so that the

failure of a single component will not cause data loss.

Because instance store volumes are temporary storage, you should not rely on these disks to keep long-term data, or any data you would not want to lose when a failure happens (e.g., stopping and starting the instance, a failure of the underlying hardware, or terminating the instance); for these purposes it is better to choose persistent storage such as EBS or S3. In addition, you cannot upgrade an instance store volume, and it is not scalable. However, despite its non-persistent nature, instance store is faster than EBS.


2.3 IO ANALYSIS TOOLS

Linux has some excellent tools for observing I/O activity in the block layer, for example iostat [18], iotop [20] and sar [19]. iotop [20] is used to get a few I/O statistics for a particular system, but it only gives an overall picture without a great deal of detail; hence it is not recommended for determining how an application is doing its I/O. iotop only gives an idea of how much throughput, not how many IOPS, the application is generating.

Iostat [18] is the go-to tool for Linux storage performance monitoring and allows you

to collect quite a few I/O statistics. It is available nearly everywhere, it works on the

vast majority of Linux machines, and it's relatively easy to use and understand.

Relative to iotop, iostat gives you a much larger array of statistics, but it does not separate out I/O usage on a per-process basis; instead you get an aggregate view of all I/O usage.

Sar [19] is one of the most common tools for gathering information about system performance. Like iotop, it runs on each compute node and gathers I/O statistics, but it examines the I/O pattern of an application only at a high level. To get around these limitations, we need to go deeper into the I/O stack and watch more detailed I/O statistics.
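As a quick illustration (typical invocations only; exact option sets vary by distribution and tool version), these monitors are commonly run as follows:

# extended per-device statistics, refreshed every second
iostat -x 1
# per-process I/O, showing only processes currently doing I/O
iotop -o
# per-device activity, sampled every second for 60 samples
sar -d 1 60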

2.3.1 BLKTRACE/BLKPARSE

The tools that come with the kernel to watch I/O statistics in depth are blktrace and

blkparse. These are very powerful tools. Blktrace is a block layer IO tracing

technique which provides information about request queue operations up to user

space in detail. Blktrace transfers event traces from the kernel into either long-term

storage or provides formatted output via blkparse. Compared to all other tools (which

we discussed above), it provides detailed information about request queue operations.

Blktrace needs no special support or configuration apart from having debugfs

mounted on /sys/kernel/debug. The blkparse utility formats the events stored in files, or directly processes the data collected by blktrace. The general architecture of blktrace is shown in Figure 2 [21].


Figure 2: blktrace General Architecture

There are around 20 different event types produced by blktrace, of which we use only a few. Below we list the events we use; for a full list, refer to the blkparse manual page [17].

Request inserted (I): We use this to tell when an I/O request has been inserted onto the request queue.

Request queued (Q): We use this to track I/O request ownership, because other

events, such as dispatch and completion, run in an arbitrary process context.

Request dispatch (D): This event is issued when the I/O scheduler moves a request

from a queue inside the I/O scheduler and to the dispatch queue, so that the disk can

service the request.

Request completion (C): This happens when the disk has serviced a request. For a

write operation this means that the data has been written to the platter (or at least

buffered), and for a read operation that the data is stored in host memory.
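A minimal sketch of how such a block-level trace might be collected on one of our experiment disks (the device name, duration and output names are illustrative):

# debugfs must be mounted for blktrace to work
mount -t debugfs debugfs /sys/kernel/debug
# trace the data disk for 60 seconds while a benchmark is running
blktrace -d /dev/xvdb -w 60 -o ebs_trace
# turn the raw per-CPU traces into readable events and a merged binary dump
blkparse -i ebs_trace -o ebs_trace.txt -d ebs_trace.bin
# summarize per-request timings (queue, dispatch and completion stages) with btt
btt -i ebs_trace.bin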


CHAPTER 3

EXPERIMENT METHODOLOGY

3.1 OVERVIEW

To evaluate the performance implications of different storage settings of Amazon

EC2, we built a test bed. In this section, we introduce the methodology of our

measurement study. We first explain the properties we measure in our experiments, then the instance type selection, the file system selection, and the methodology we use to measure those properties.

3.2 PROPERTIES

In our experiments, we use two types of benchmarking tools: FileBench [13] and FIO [4]. FileBench is used to generate macro-benchmarks and FIO [4] to generate micro-benchmarks. We analyze the impact of the choice of file system with different kinds of workloads in VMs, and examine any performance degradation or improvement with write-dominated and read-dominated workloads. We also measure the bandwidth of each workload with different file systems on EBS and Instance Store. We run blktrace in parallel with FIO and FileBench to trace the IO statistics.

3.3 INSTANCE TYPE SELECTION

Amazon EC2 provides different types of instances for users. Our measurement

experiments are mainly based on Amazon EC2 M3 instances [14], which provide a balance of compute, memory, and network resources. These instances offer SSD-based instance storage for fast I/O performance. The m3 family is typically used for small and mid-size databases, data processing tasks that require additional memory, caching fleets, and for running backend servers for SAP, Microsoft SharePoint, and other enterprise applications. We compare instances backed by an EBS volume with instances backed by an Instance Store volume, to study the performance implications in virtual machines.

3.4 FILE SYSTEM SELECTION

Linux supports several different file systems. Each has strengths and weaknesses and

its own set of performance characteristics. One important attribute of a filesystem is

journaling, which allows for much faster recovery after a system crash. Generally, a

journaling filesystem is preferred over a non-journaling one when you have a choice.

In our thesis we are using EXT3, EXT4 and XFS file systems. We selected these file

systems for their widespread use on Linux servers and the variation in their features.

Following is a brief summary of these file systems.

EXT3: EXT3 [11] stands for third extended file system. The ext3 file system adds

journaling capability to a standard ext2 file system and is therefore an evolutionary

growth of a very stable file system. Because it adds journaling on top of the proven

ext2 file system, it is possible to convert an existing ext2 file system to ext3 and even

convert back again if required. Journaling has a dedicated area in the file system,

where all the changes are tracked. When the file system crashes, the possibility of file

system corruption is less because of journaling. Maximum individual file size can be

from 16GB to 2 TB and overall EXT3 file system size can be from 2 TB to 32 TB.

Once the hard disk has been partitioned using the fdisk command, a file system can be created on a partition using the mkfs command.

EXT3 supports three types of journaling: journal, ordered and writeback.

Journal – Metadata and content are saved in the journal.

Ordered – Only metadata is saved in the journal. Metadata are journaled only

after writing the content to disk. This is the default.

Writeback – Only metadata is saved in the journal. Metadata might be

journaled either before or after the content is written to the disk.
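As an illustration (a minimal sketch; the device and mount point are only examples), the journaling mode is selected with the data= mount option:

# mount an ext3 partition in the default ordered mode
mount -t ext3 -o data=ordered /dev/xvdb1 /mnt/ext3
# or with full data journaling, where both metadata and content go through the journal
mount -t ext3 -o data=journal /dev/xvdb1 /mnt/ext3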


EXT4: EXT4[12] stands for fourth extended file system. The ext4 file system started

as extensions to ext3 to address the demands of even larger file systems by increasing

storage limits and improving performance. To preserve the stability of ext3, it was

decided in June 2006 to fork the extensions into a new file system, ext4. The ext4 file system was released in December 2008 and included in the 2.6.28 kernel. Some of

the changes from ext3 are:

File systems up to 1 exabyte (EB). 1EB=1024 PB (petabyte). 1PB= 1024 TB

(terabyte), with individual file size from 16 GB to 16 TB.

The use of extents to replace block mapping as a means of improving

performance

Journal checksums to improve reliability

Faster file system checking because unallocated blocks can be skipped during

checks.

Delayed allocation and multi block allocators to improve performance and

reduce file fragmentation.

XFS: XFS is a 64-bit, highly scalable file system that was developed by Silicon

Graphics Inc. (SGI) and first deployed in the Unix-based IRIX operating system (OS)

in 1994. XFS supports large files and large file systems. For a 64-bit implementation,

XFS can handle file systems of up to 18 exabytes, with a maximum file size of 9

exabytes. There is no limit on the number of files. XFS is a journaling file system

and, as such, keeps track of changes in a log before committing the changes to the

main file system. The advantage is guaranteed consistency of the file system and

expedited recovery in the event of power failures or system crashes.

XFS filesystems are divided into a number of equally sized chunks called Allocation

Groups. Each AG can almost be thought of as an individual filesystem that maintains

its own space usage. Each AG can be up to one terabyte in size, regardless of the

underlying device's sector size. Each AG starts with a superblock. The first one is the

primary superblock that stores aggregate AG information. Secondary superblocks are only used by xfs_repair when the primary superblock has been corrupted [8]. On Linux systems, the xfsprogs package must be installed to create XFS file systems.

3.5 TEST BED SETUP

Our experiments were conducted on two 64-bit Amazon EC2 virtual machine instances, both running the Amazon Linux AMI (HVM) 2014.03.1 with 30 GiB of memory and 8 virtual CPUs. The first instance was an m3.2xlarge with a 1024 GB General Purpose EBS volume. The second instance was also an m3.2xlarge, backed by a 1024 GB Instance Store volume.

EBS: The entire host operating system was installed on a single disk (xvda) while

another single disk (xvdb) is used for experiments. We create multiple equal-sized partitions on xvdb, each corresponding to a different host file system. Each partition

is then formatted using the default parameters of the host file system’s mkfs*

command and is mounted using the default parameters of mount.

More hardware and software configuration settings are listed in Table 1.

Devices #Blocks Type

/dev/xvdb1 30720 Ext3

/dev/xvdb2 30720 Ext4

/dev/xvdb3 30720 XFS

Table 1 Experiment Setup –EBS
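A minimal sketch of how the partitions in Table 1 might be prepared (device names follow Table 1; the mount points are illustrative):

# format each partition with the default parameters of its mkfs tool
mkfs.ext3 /dev/xvdb1
mkfs.ext4 /dev/xvdb2
mkfs.xfs /dev/xvdb3
# mount each file system with the default mount options
mkdir -p /mnt/ext3 /mnt/ext4 /mnt/xfs
mount /dev/xvdb1 /mnt/ext3
mount /dev/xvdb2 /mnt/ext4
mount /dev/xvdb3 /mnt/xfs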

3.6 BENCHMARKING TOOLS

This research uses a couple of storage benchmarking tools. Some are quite simple, while others strive to present a more realistic I/O profile. In this section we discuss the macro-benchmarks and micro-benchmarks we use to understand the potential performance impact of file systems on realistic workloads and on basic read/write operations, from which we were able to observe significant performance differences.

3.6.1 MACRO BENCHMARKS

Our main objective is to understand how much of a performance impact different workloads have on different storage settings with different file systems. As mentioned before, we use the EXT3, EXT4 and XFS file systems on both the EBS-backed and Instance Store-backed VMs. For the macro-benchmark tests we use the FileBench benchmarking tool to test each storage option with different workloads.

FileBench Benchmarking

FileBench lets you create realistic profiles, or configurations, that emulate the real-

world behavior of your own applications. FileBench comes with a batch of prefab

profiles for various servers and functions, such as mail, Web, and fileservers, and

tasks such as reading, writing, copying and deleting files.

FileBench profiles are written in its own lean and straightforward language and stored in .f files. We use FileBench [13] to generate macro-benchmarks whose I/O transaction characteristics are controlled by predefined parameters, such as the number of files to be used, the average file size, and the I/O buffer size. FileBench can emulate different workloads with its flexible Workload Model Language (WML) [13].

WML is used to describe a workload; a workload description is called a personality.

Personalities define one or more groups of file system operations (e.g., read, write,

append, stat), to be executed by multiple threads. Each thread performs the group of

operations repeatedly, over a configurable period of time. At the end of the run,

FileBench reports the total number of performed operations. WML allows one to

specify synchronization points between threads and the amount of memory used by

each thread, to emulate real-world application more accurately. Personalities also

describe the directory structure(s) typical for a specific workload: average file size, directory depth, the total number of files, and alpha parameters governing the file and directory sizes, which are based on a gamma random distribution [13].

Since FileBench supports synchronization between threads to simulate concurrent and sequential I/Os, we use this tool to create four server workloads: a file server, a web server, a mail server, and a database server.

• File server: Emulates a file service. File operations are a mixture of create, delete,

append, read, write, and attribute on files of various sizes.

• Web server: Emulates a web service. File operations are dominated by reads: open,

read, and close. Writing to the web log file is emulated by having one append

operation per open.

• Mail server: Emulates an e-mail service. File operations are within a single

directory consisting of I/O sequences such as open/read/close, open/append/close, and

delete.

• Database server: Emulates the I/O characteristic of Oracle 9i. File operations are

mostly read and write on small files. To simulate database logging, a stream of

synchronous writes is used. This workload generates a dataset with a specified

number of directories and files using a gamma distribution to determine the number

of sub-directories and files. It then spawns a specified number of threads where each

thread performs a sequence of open, read entire file and close operations over a

chosen number of files, outputting resulting data to a logfile.

3.6.2 MICRO BENCHMARKS

FIO: In this part, we discuss the micro-level benchmark FIO [4]. With FIO we examine disk I/O workloads. As a highly configurable benchmark, FIO defines a test case based on different I/O transaction characteristics, such as total I/O size, block size, degree of I/O parallelism, and I/O mode. Our focus is on the performance variation of primitive I/O operations, such as read and write. With the combination of these I/O operations and two I/O patterns, random and sequential, we design four test cases: random read, random write, sequential read, and sequential write.

cases: random read, random write, sequential read, and sequential write.

Fio is an I/O tool meant to be used both for benchmark and stress/hardware

verification. It has support for 19 different types of I/O engines (sync, mmap, libaio,

posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio, and more), I/O

priorities, rate I/O, forked or threaded jobs, and much more. It can work on block

devices as well as files. Fio accepts job descriptions in a simple-to-understand text

format. Fio displays all sorts of I/O performance information, including complete IO

latencies and percentiles. It supports Linux, FreeBSD, NetBSD, OpenBSD, OS X,

OpenSolaris, AIX, HP-UX, Android, and Windows.

CHAPTER 4

EXPERIMENT RESULTS & ANALYSIS

4.1. FILEBENCH RESULTS

The specific parameters of each workload are listed in Table 2. The experimental working set size is configured to be much larger than the size of the page cache in the VM.

Services          # Files   # Threads   File size   I/O size
File server       50000     50          128kb       8k-512k
Mail server       50000     16          8k          8k-512k
Web server        50000     100         16k         8k-512k
Database Server   8         200         1g          8k-512k

Table 2 FileBench Workloads

Run filebench: Filebench is quick to set up and use unlike many of the commercial

benchmarks which it can emulate. It is also a handy tool for micro-benchmarking


storage subsystems and studying the relationships of complex applications such as

relational databases with their storage without having to incur the costs of setting up

those applications, loading data and so forth. Filebench uses loadable workload

personalities in a common framework to allow easy emulation of complex

applications upon file systems. The workload personalities use a Workload Definition

Language to define the workload's model.

We first load the required workload using the load command, and then set parameters such as the target directory (a file system must be mounted on this directory beforehand), the file size, the number of files, the number of threads, the I/O size and the number of seconds the workload should run. FileBench first creates a file system tree with the properties defined above, and after the specified time has elapsed it reports results such as the number of operations, operations per second, and latency in milliseconds.
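As an illustration (a minimal sketch; the mount point and parameter values are examples only, and variable names can differ between FileBench versions and workload files), a file server run might look like this:

filebench> load fileserver
filebench> set $dir=/mnt/ext4
filebench> set $nfiles=50000
filebench> set $nthreads=50
filebench> set $iosize=8k
filebench> run 60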

4.1.1 EBS

There are three metrics of storage performance: latency, IOPS (Input/Output Operations Per Second), and throughput. Understanding the relationships between these metrics is the key to understanding storage performance. Latency is the combined delay between an input or command and the desired output. In a computer system, latency is often used to mean any delay or waiting that increases real or perceived response time beyond what is desired. IOPS are simply the number of I/O transactions that can be performed in a single second. Throughput is the amount of data that can be transferred from one place to another, or processed, in a specified amount of time. Data transfer rates for disk drives and networks are measured in terms of throughput, typically in kbps, Mbps and Gbps.
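As a rough worked example of how these metrics relate (illustrative numbers only): a device sustaining 1,000 IOPS at an 8 KB I/O size delivers about 1,000 × 8 KB = 8,000 KB/s, roughly 8 MB/s of throughput, while the same 1,000 IOPS at a 512 KB I/O size would correspond to about 500 MB/s.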

IOPS: How often a storage device can perform IO tasks is measured in Input/output

Operations per Second (IOPS), and varies depending on the type of IO being done.

IOPS tells us how quickly each drive can process IO requests. The greater the number

of IOPS, the better the performance. In this section we see each workload IOPS on

different file systems.

[Graph panels omitted: IOPS versus I/O size (8k, 16k, 32k, 64k, 512k) for EXT3, EXT4 and XFS under the File server, Web server, Mail server and DB server workloads on EBS.]

Graph 4.1:1 IOPS of different file systems with different workloads on EBS


From Graph 4.1:1, we observed that the Mail server and Web server workloads were not affected much by increasing the I/O size. On these two workloads, the three file systems behaved almost the same, with only slight variation in the number of IOPS. Looking more closely, the Web server workload performed better than the File server workload. As the I/O size increased, the number of IOPS decreased for the File server and Web server workloads on EXT4, whereas EXT3 and XFS showed slightly increased IOPS on these workloads.

We saw significant variations in IOPS for the DB server workload. As the I/O size increased from 8k to 512k, the number of IOPS increased for every file system. For XFS in particular, IOPS increased by almost 130% at 512k. EXT3 and EXT4 also gained IOPS at 512k, but the improvement was much smaller.

Looking at each file system, EXT3, EXT4 and XFS all performed well with the Web server workload. At the same I/O size, EXT3 delivered roughly 200 times more IOPS on the Web server workload than on the DB server workload, and for the same number of files and the same I/O size the Web server workload achieved almost twice the performance of the Mail server workload.

Latency: The size of the I/O transfers can also impact the latency of each transfer, because larger I/O transfers take longer to complete. Lower latency therefore means better performance.

[Graph panels omitted: latency in ms for EXT3, EXT4 and XFS at I/O sizes 8k, 16k, 32k, 64k and 512k under the Web server, File server, DB server and Mail server workloads on EBS.]

Graph 4.1:2 Latency of different filesystems with different workloads on EBS

Database Server:

From Graph 4.1:2, the Database server workload works well with EBS storage and results in low latencies. We found that EXT3 and EXT4 produced similar results for the DB server, with low latencies. XFS reached latencies as low as EXT3 and EXT4 once we increased the block size from 16k to 512k. We suggest that when a user wants to run a DB server workload, it is better to use either EXT3 or EXT4.

Mail Server:

For the email (Mail server) workload, EXT4 works better than the other two file systems, EXT3 and XFS. EXT4 behaves differently at different I/O sizes: its latency decreases as the I/O size increases from 8k to 512k. For larger I/O sizes, EXT4 is therefore a good choice when dealing with an email workload.

File Server:

With the File server workload, EXT3 behaves well only for small I/O sizes. XFS was not consistent; its latencies fluctuated up and down. Compared to EXT3 and XFS, EXT4 produced cleaner results, although its latencies were clearly affected by the I/O size: as the I/O size increases, EXT4's latency increases. From the File Server results in the graph above, we can say that for small I/O sizes EXT4 works very well for the File server workload on EBS storage in Amazon EC2.

Webserver:

For the Web server workload on these three file systems, we found that XFS was not affected at all by increasing or decreasing the I/O size. EXT4 shows low latency as the I/O size increases from 8k to 512k, and EXT3 showed no variation from low to high I/O sizes. From these results, we suggest that with EBS storage it is better to host a Web server workload on EXT4 rather than on EXT3 or XFS.

Workloads Comparison: Comparing EXT3, EXT4 and XFS on the EBS-backed instance, we observed, as expected, significant differences in IOPS across workloads. Compared to the File server and Web server workloads, the file systems showed much larger IOPS differences with the Mail server and DB server workloads across the different combinations of I/O size.

Filesystems: Analyzing each file system, EXT3 and XFS hit very high latency on the Mail server workload.

Bandwidth: Bandwidth determines how fast data can be transferred over time. It is

the amount of data that can be transferred per second.

[Graph omitted: bandwidth in MB/s of the File server, Web server, Mail server and DB server workloads at I/O sizes 8k-512k, grouped by file system (EXT4_EBS, XFS_EBS) on EBS.]

Graph 4.1:3 Bandwidth of file systems on EBS


We also observed the bandwidth of the different workloads on each file system. From Graph 4.1:3, the Database server workload performs well, achieving high bandwidth. EXT4 performed particularly well with the Database server workload, transferring more data than the other file systems.

4.1.2 INSTANCE STORE:

In this section we present the results for Amazon EC2 Instance Store performance with different file systems and analyze those results.

IOPS:

[Graph panels omitted: IOPS versus I/O size (8-512k) for EXT3, EXT4 and XFS under the File server, Web server, DB server and Mail server workloads on Instance Store.]

Graph 4.1:4 IOPS of different filesystems with different workloads on Instance Store

On the Amazon Instance Store, we observed different FileBench results. From Graph 4.1:4, we noticed that the Web server workload gave better performance (higher IOPS) than all other workloads. For the Web server workload, as the I/O size increased, the number of IOPS decreased for EXT4 and XFS, whereas EXT3 showed a slight performance improvement. The Web server workload achieved about 30% more IOPS than the Mail server workload on each file system. Similarly, for the File server and DB server workloads, we observed that reducing the I/O size from 512k to 8k increased the overall performance of XFS.

With the DB server workload, we found the maximum IOPS at the small I/O size (8k) with XFS, and XFS behaved better than EXT3 and EXT4 even at the high I/O size (512k). We again saw significant variations in IOPS for the DB server workload: as the I/O size increased from 8k to 512k, EXT3 and EXT4 gained a small number of IOPS, while XFS, although still the best performer, lost some IOPS (Table 5 and Table 6).

Analyzing each file system, EXT3, EXT4 and XFS all performed well with the Web server workload, just as on Amazon EBS storage. For the same number of files and the same I/O size, the Web server workload outperformed the Mail server workload by about 36%. XFS performs similarly on the File server and Mail server workloads. Among all workloads, the Web server workload is best suited to XFS, and for every workload except the DB server workload, EXT4 achieved high performance.

Latency:

[Graph panels omitted: latency in ms for EXT3, EXT4 and XFS at I/O sizes 8k-512k under the Mail server, File server, DB server and Web server workloads on Instance Store.]

Graph 4.1:5 Latency of different file systems with different workloads on Instance Store

Mail Server:

From Graph 4.1:5, for the email (Mail server) workload XFS works best and performs the same at low and high I/O sizes. EXT3 was not affected much by increasing the I/O size, while EXT4's latency increases as the I/O size grows from 8k to 512k. We can say that for low I/O sizes it is better to choose XFS when dealing with an email workload.

File Server:

With the File server workload, EXT4 behaves very well at low I/O sizes, and XFS also did a good job at low I/O sizes. Compared to EXT3 and XFS, EXT4 shows much lower latencies, although its latency grows as the I/O size increases. From the File Server results in the graph above, we can say that for small I/O sizes EXT4 works very well for a File server workload on the Instance Store of Amazon EC2.

Database Server:

From our experiment results (Graph 4.1:5), the Database server workload performs poorly on Instance storage, with higher latencies than the other workloads. EXT4 produced better results for the DB server, with lower latencies, when the I/O size increased from 64k to 512k. XFS showed higher latencies than EXT3 and EXT4 at every I/O size. Here the system is doing high IOPS with high latency; more IOPS would be nice, but the database needs lower latency in order to see significantly improved performance. We suggest that a user who wants to run a DB server workload should use either EXT3 or EXT4.

Webserver Workload:

For the Web server workload on EXT3, EXT4 and XFS, we found that none of the file systems was affected by increasing or decreasing the I/O size. EXT4 shows low latency compared to the other two file systems. Analyzing the results, we suggest that with Instance storage it is better to host a Web server workload on EXT4 rather than on EXT3 or XFS.

Workload Analysis: Comparing the different workloads on Instance storage, we observe some differences in IOPS across workloads. Apart from the DB server workload, all workloads showed low latencies. Among the Mail server, File server and Web server workloads, the File server shows the highest latency with EXT3 and the Web server the lowest latency with EXT4. Except for the Mail server, latencies were affected by the I/O size. From our analysis, we can say the Web server workload works well with Instance storage.

File Systems Analysis: Analyzing each file system, EXT4 was the best among them, with the lowest latencies on the Web server workload. Latencies were affected by the I/O size. EXT4 also did a good job on the File server workload at different I/O sizes, and on the DB server workload it finished the work in less time when the I/O size was increased to 512k. From our analysis, EXT4 was the best file system across workloads.

BANDWIDTH:


[Graph omitted: bandwidth in Mb/s of the File server, Web server, Mail server and DB server workloads at I/O sizes 8k-512k for XFS_Instance, EXT4_Instance and EXT3_Instance.]

Graph 4.1:6 Bandwidth of different filesystems with different workloads on Instance Store

As with EBS storage, the DB server workload achieved the highest bandwidth on Instance storage, but here XFS performs better than EXT4 and EXT3.

EBS VS INSTANCE STORE:

Bandwidth:

When we compare EBS and Instance Store, EXT4 delivered significantly more bandwidth on the Mail server, Web server and File server workloads, and XFS delivered significantly more bandwidth on the DB server workload.

EBS IOPS: The following two tables show the total IOPS obtained on each file system with the different workloads at a small I/O size (8k) and a large I/O size (512k), respectively.

For Small IO Sizes:

8k           EXT3 IOPS   EXT4 IOPS   XFS IOPS
FileServer   12143.62    12313.749   12361.546
WebServer    13271.074   13342.91    13284.554
DBServer     57.99       84.987      17.794
MailServer   8649.666    10976.13    8767.098

Table 3 IOPS of different filesystems with different workloads on EBS with 8k

For High IO sizes:

512k         EXT3 IOPS   EXT4 IOPS   XFS IOPS
FileServer   12146.801   12222.435   12245.031
WebServer    13329.213   13154.089   13295.974
DBServer     58.991      88.985      83.987
MailServer   8220.46     10291.505   8817.825

Table 4 IOPS of different filesystems with different workloads on EBS with 512k

From Table 3 and Table 4, for smaller I/O sizes XFS delivers very good IOPS with the different workloads except the DB server. As we increase the block size from 8k to 512k, we observed that EXT4 was the best among the file systems. We also noticed that with the database workload, XFS gained performance when the I/O size increased from 8k to 512k, an increase of almost 80% in IOPS.

Instance Store IOPS:

For 8k IO size:

8k           EXT3 IOPS   EXT4 IOPS   XFS IOPS
FileServer   7386.07     12181.359   9382.177
WebServer    12758.949   13133.797   10961.689
DBServer     76.987      81.986      109.975
MailServer   8828.579    9761.338    8116.941

Table 5 IOPS of different filesystems with different workloads on Instance Store with 8k

For 512k IO size:

512k          EXT3 IOPS    EXT4 IOPS    XFS IOPS
FileServer    7078.581     12185.311    8751.845
WebServer     12773.772    13098.564    10955.835
DBServer      82.986       83.987       96.983
MailServer    8858.039     9163.494     8149.452

Table 6 IOPS of different filesystems with different workloads on Instance Store with 512k

An Amazon EC2 instance with an Instance Store volume shows significant performance variations across file systems. For smaller IO sizes such as 8k, EXT4 was the best among all file systems for every workload except the DB server workload, where EXT3 performed poorly and XFS achieved very high IOPS compared to the other file systems. For a large IO size such as 512k, we see better performance with the EXT4 file system compared to EXT3 and XFS, with almost a 30-40% variation for the file server workload.


EBS vs Instance Store: For the small (8k) IO size, the XFS file system works very well on EBS with all workloads except the database workload. EXT3 showed the largest IOPS difference between the two settings, around 64% higher on EBS than on Instance Store, followed by XFS with roughly a 31% increase on EBS.
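For example, using Tables 3 and 5 for the file server workload at 8k: EXT3 achieves 12143.62 IOPS on EBS versus 7386.07 on Instance Store, roughly a 64% difference, while XFS achieves 12361.546 on EBS versus 9382.177 on Instance Store, which is about 32% higher.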

4.2 FIO RESULTS

In this section we perform micro-level benchmarking with FIO [4] on different file systems on EBS and Instance Store. The specific I/O characteristics of these test cases are listed in Table 7.

Parameters    Block size    Number of jobs    I/O Pattern          Runtime (s)
EXT3          4k-1024k      8, 16, 32         Random/Sequential    60
EXT4          4k-1024k      8, 16, 32         Random/Sequential    60
XFS           4k-1024k      8, 16, 32         Random/Sequential    60

Table 7 FIO Benchmark Parameters for EBS and Instance store

In the following example, we see how to write and run a FIO benchmark with random read/write on XFS. We used block sizes ranging from 4k to 1024k with different numbers of jobs: 8, 16 and 32. Our focus here is on the performance variations of primitive I/O operations, such as read and write. By combining these I/O operations with the two I/O patterns, random and sequential, we design four test cases: random read, random write, sequential read and sequential write. This thesis runs FIO from 0% reads (i.e., 100% writes) to 100% reads (i.e., 0% writes), both randomly and sequentially.

Example:

./fio --filename=/xfs/4krandreadwrite00 --direct=1 --rw=randrw --refill_buffers \
  --norandommap --randrepeat=0 --size=1024m --bs=4k --rwmixread=70 --iodepth=8 \
  --numjobs=8 --runtime=60 --group_reporting --name=4krandreadwrite00 \
  --output=/npudtha/fioresults_xfs/RandReadWrite/4krandreadwrite00


Here rwmixread=int specifies the percentage of the mixed workload that should be reads; in our example the read percentage is 70, so the write percentage is 30%. The above command creates a 1024 MB file and performs 4 KB reads and writes using a 70%/30% split (roughly 7 reads for every 3 writes) within the file, with 8 jobs running at a time. Here we are measuring how many 4k (4096-byte) operations the drive can handle per second when each block is read or written at a random position. With numjobs=8, eight jobs run their own transfers in parallel, and iodepth=8 additionally allows each job to keep up to eight I/O requests outstanding when an asynchronous I/O engine such as libaio is used. We write similar job descriptions for the other combinations.
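Example (a minimal sketch of how the remaining combinations could be generated with a shell loop; the file names and output paths mirror the example above but are illustrative, not the exact scripts used in this thesis):

for bs in 4k 8k 16k 32k 64k 128k 1024k; do
  for mix in 0 10 20 30 40 50 60 70 80 90 100; do
    # one random read/write job per block size and read percentage
    ./fio --filename=/xfs/${bs}randreadwrite${mix} --direct=1 --rw=randrw \
      --refill_buffers --norandommap --randrepeat=0 --size=1024m \
      --bs=${bs} --rwmixread=${mix} --iodepth=8 --numjobs=8 --runtime=60 \
      --group_reporting --name=${bs}randreadwrite${mix} \
      --output=/npudtha/fioresults_xfs/RandReadWrite/${bs}randreadwrite${mix}
  done
done

For the sequential test cases, the same loop can be used with --rw=rw in place of --rw=randrw.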

4.2.1 EBS:

Random read/write: Random means the accessed blocks are scattered all over the drive, not in neat rows or groups, so they take more work to locate. Random IO is the most difficult and time-consuming type of access a storage device must deal with. In this section we discuss random read/write results on EBS storage with the EXT3 file system and different numbers of jobs.

EXT3 IOPS:

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:7 EXT3 Random Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:8 EXT3 Random Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:9 EXT3 Random Read Write 32 jobs

For write-dominated workloads (say 90% writes), the 8-job, 16-job and 32-job results vary for the 4k, 8k, 16k and 32k block sizes. This tells us that when the workload is write dominated, choosing an 8k block size (8k x 1383 IOPS) writes more data than 4k (4k x 1472 IOPS), and as we increase the block size to 32k (32k x 972 IOPS) the number of IOPS is reduced but the overall amount of data written increases. We also observed that increasing the block size from 64k to 128k decreased the IOPS by almost 30% (to 525 IOPS), and increasing it from 128k to 1024k gave only 109 IOPS, yet still wrote more data than at the 128k block size.
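As a rough sanity check on these numbers (our approximation, ignoring protocol and scheduling overhead), write throughput is approximately IOPS multiplied by block size: 1472 x 4k is about 5.8 MB/s, 1383 x 8k is about 10.8 MB/s, 972 x 32k is about 30.4 MB/s, and 109 x 1024k is about 109 MB/s, which is why fewer IOPS at larger block sizes can still move more data.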

For read-dominated workloads (say 90% reads), the 8-job, 16-job and 32-job runs give similar IOPS for the 4k, 8k, 16k and 32k block sizes. The results also show that when the workload is read dominated, choosing a 32k block size (32k x 3064 IOPS) reads more data, and as we increase the block size to 64k (64k x 2089 IOPS) the number of IOPS is reduced but the overall amount of data read increases. We also observed that increasing the block size from 64k to 128k cut the IOPS exactly in half (1044 IOPS) irrespective of the number of jobs. When we increased the block size from 128k to 1024k the performance suffered: it gave 122 IOPS for 8 jobs, 124 IOPS for 16 jobs and 129 IOPS for 32 jobs, and in all cases it read less data than at the 128k block size.

The main observation is that read-dominated performance was better than write-dominated performance. For example, at a 4k block size, IOPS at 90% reads were more than double the IOPS at 90% writes.

Sequential Read & Write: In this section we discuss sequential read/write results on EBS storage with the EXT3 file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:10 EXT3 Sequential Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:12 EXT3 Sequential Read Write 32 jobs

For read-dominated workloads (say 90% reads), the 8-job, 16-job and 32-job runs give similar IOPS for the 4k, 8k, 16k and 32k block sizes, and we also noticed better IOPS than for the random workloads. We observed results similar to random read/write, except that when the block size was increased from 128k to 1024k the performance degraded. For write-dominated workloads (say 90% writes), the IOPS decreased roughly linearly with increasing block size for all job counts. We also observed that increasing the block size from 64k to 128k reduced the IOPS to 596, and increasing it from 128k to 1024k gave only 103 IOPS but still wrote more data than at the 128k block size.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:11 EXT3 Sequential Read Write 16 jobs

Here we found surprising results. Disks are split into linearly addressable regions called sectors. Because sectors that are close to each other in the linear space are also physically adjacent, it is faster to read two neighboring sectors than two sectors that are far apart, so sequential reads should perform better than random reads; here, however, we saw the opposite. From our observation, we think the degradation occurs because, with many parallel jobs, reading large blocks at random positions can outperform sequential reads. From the results we also found a performance difference between file systems. Common workloads are usually around 60% reads and 40% writes, and for that mix we observed better performance when running 32 jobs on EXT4.
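Such a 60/40 mix can be exercised directly with the same FIO template used earlier; for example (the mount point and job name are illustrative):

./fio --filename=/ext4/4krandreadwrite60 --direct=1 --rw=randrw --refill_buffers \
  --norandommap --randrepeat=0 --size=1024m --bs=4k --rwmixread=60 --iodepth=8 \
  --numjobs=32 --runtime=60 --group_reporting --name=4krandreadwrite60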

EXT4:

Random read/write: In this section we discuss random read/write results on EBS storage with the EXT4 file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:13 EXT4 Rand Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:14 EXT4 Rand Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:15 EXT4 Rand Read Write 32 jobs

With the EXT4 file system, for write-dominated workloads (say 90% writes), the 8-job, 16-job and 32-job results vary for the 8k, 16k and 32k block sizes. This tells us that when the workload is write dominated, increasing the block size to 32k (32k x 1094 IOPS) reduces the number of IOPS but increases the overall amount of data written. We also observed that increasing the block size from 64k to 128k decreased the IOPS, as with EXT3, and increasing it from 128k to 1024k gave only 96 IOPS but still wrote more data than at the 128k block size.

For read-dominated workloads (say 90% reads), 8 jobs and 16 jobs gave similar results, but 32 jobs gave fewer IOPS for the 4k and 8k block sizes. In this case we can push more data when we choose a 32k block size with 8 jobs. This tells us that when the workload is read dominated, choosing a 32k block size (32k x 3063 IOPS) reads more data, and as we increase the block size to 64k (64k x 2089 IOPS) the number of IOPS is reduced but the overall amount of data read increases.

For both read-dominated and write-dominated workloads, increasing the block size moves more data overall even though the IOPS drop. Again, read-dominated workloads show better performance than write-dominated workloads, and the better performance was observed at the smaller numbers of jobs.

Sequential Read/Write: In this section we discuss sequential read/write results on EBS storage with the EXT4 file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:16 EXT4 Sequential Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:17 EXT4 Sequential Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:18 EXT4 Sequential Read Write 32 jobs

For read-dominated workloads (say 90% reads), the 4k, 8k, 16k and 32k block sizes gave similar results at 16 jobs and 32 jobs and somewhat lower results at 8 jobs. We also noticed fewer IOPS compared to the random workloads. We observed results similar to random read/write, except that increasing the block size from 64k to 128k degraded performance and cut the IOPS roughly in half, and increasing it from 128k to 1024k brought the IOPS down drastically. For write-dominated workloads (say 90% writes), the IOPS decreased roughly linearly with increasing block size for all job counts. We also observed that increasing the block size from 64k to 128k reduced the IOPS from 858 to 100 at 32 jobs, and increasing it from 128k to 1024k gave only 97 IOPS but still wrote more data than at the 128k block size. Here we expected sequential reads to perform better than random reads, and with EXT4 that is exactly what we got.

XFS:

Random Read/Write: In this section we discuss random read/write results on EBS storage with the XFS file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:19 XFS Rand Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:20 XFS Rand Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:21 XFS Rand Read Write 32 jobs

With the XFS file system, the number of IOPS decreased roughly linearly with increasing block size for write-dominated workloads. From 50% reads and 50% writes onward the behavior changed: the IOPS were similar for 4k, 8k, 16k and 32k and were reduced from 128k to 1024k. Note, however, that even though the IOPS decreased, we were still able to read more data at 128k and 1024k. The random read/write results with XFS were not very different from the previous file systems.

Sequential read/write: In this section we discuss sequential read/write results on EBS storage with the XFS file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:22 XFS Sequential Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:23 XFS Sequential Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:24 XFS Sequential Read Write 32 jobs


For both write-dominated and read-dominated workloads at 8 jobs, we did not see any variation in IOPS from the 4k block size to the 16k block size. As we increased the block size further, we observed variation in total IOPS. For write-dominated workloads (say at 90% writes), XFS gave 2844 IOPS at a 32k block size, meaning we got fewer IOPS by increasing the block size from 16k to 32k but could write more data at 32k.

With the same block size (32k), for read-dominated workloads (say at 90% reads) we did not find any difference in total IOPS, but when we increased the block size from 32k to 64k at the same 90% reads, performance degraded by 30%; increasing it from 64k to 128k decreased performance by 50%, and increasing it from 128k to 1024k brought the IOPS down drastically.

When we compare 8, 16 and 32 jobs, the XFS file system works well with 8 jobs for write-dominated workloads between the 4k and 64k block sizes, and at the 128k block size we were able to write more data with 16 jobs. At 90% reads, XFS produced higher IOPS with 16 jobs than with 8 or 32 jobs.

When we compare random read/write and sequential read/write, sequential read/write produced more IOPS with 8 jobs for write-dominated workloads, while for read-dominated workloads both behaved in the same manner.

EXT3 vs EXT4 vs XFS on EBS storage:

When we compare the EXT3, EXT4 and XFS file systems on the Amazon EC2 cloud with EBS storage attached for random read/write workloads, all file systems gave almost the same number of IOPS for read-dominated workloads (between 3063 and 3064 at 90% reads), so for any read-dominated workload a user can choose any of these three file systems.

For write-dominated workloads, by contrast, we saw a significant difference between the file systems. Among them, XFS achieved the most IOPS (1623) at 80% writes, and EXT4 performed better than EXT3 at 80% writes. From our analysis, we suggest that for write-dominated workloads it is better to choose the XFS file system rather than EXT3 or EXT4.

When we compare the EXT3, EXT4 and XFS file systems on the Amazon EC2 cloud with EBS storage attached for sequential read/write workloads, the EXT3 file system performed better than the other two for read-dominated workloads (at 90% reads). For write-dominated workloads (at 90% writes), we got more IOPS with the XFS file system, almost a 50% increase over the other file systems. We also found fewer IOPS in each file system as the block size increased. The main observation is that performance varies with the file system and also with the block size.

4.2.2 INSTANCE STORE

In this section we present the results of Amazon EC2 Instance Store performance with different file systems and an analysis of these results.

EXT3

Random read/write: In this section we discuss random read/write results on Instance Store storage with the EXT3 file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:25 EXT3 Rand Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:26 EXT3 Rand Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:27 EXT3 Rand Read Write 32 jobs

For write-dominated workloads (say 90% writes), the 8-job, 16-job and 32-job results vary for the 4k, 8k, 16k and 32k block sizes. This also tells us that when the workload is write dominated, choosing an 8k block size with 8 jobs produces high IOPS, and as we increase the block size the number of IOPS is reduced while the overall amount of data written increases. We also observed that increasing the block size from 64k to 128k decreased the IOPS by about 30% with 8 jobs and by more than 50% with 16 and 32 jobs, and increasing it from 128k to 1024k gave very low IOPS for all job counts. For read-dominated workloads (say 90% reads), we found very high IOPS (40073) with 32 jobs. As in the write-dominated case, the number of IOPS decreased as we increased the block size.

The main observation is that read-dominated performance was better than write-dominated performance. For example, at a 4k block size, IOPS at 90% reads were more than double the IOPS at 90% writes. For 8, 16 and 32 jobs, the number of IOPS started decreasing very rapidly when we increased the block size from 16k to 32k.

Sequential read/write: In this section we discuss sequential read/write results on Instance Store storage with the EXT3 file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:28 EXT3 Sequential Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:29 EXT3 Sequential Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:30 EXT3 Sequential Read Write 32 jobs

For any number of jobs (8, 16, 32), EXT3 performance was much better for write-dominated workloads at small block sizes, and performance degraded as the block size increased. Comparing 8, 16 and 32 jobs, EXT3 works best with 16 jobs for write-dominated workloads at a 4k block size. For read-dominated workloads (90% reads), we found that EXT3 produced higher IOPS with 32 jobs than with 8 or 16 jobs.

EXT4

Random read/write: In this section we discuss random read/write results on Instance Store storage with the EXT4 file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:31 EXT4 Random Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:32 EXT4 Random Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:33 EXT4 Random Read Write 32 jobs

With the EXT4 file system, the performance of both read-dominated and write-dominated workloads is affected by block size: as we increase the block size, IOPS gradually decrease, so the block size directly affects the number of input/output operations. When the workload is dominated by writes (say at 90% writes), the differences between job counts were negligible. For read-dominated workloads (say 90% reads), performance at 32 jobs was better than at 8 and 16 jobs. This tells us that when the workload is read dominated, choosing a large block size reads more data: as we increase the block size the number of IOPS is reduced, but the overall amount of data transferred increases. Read-dominated workloads again showed better performance than write-dominated workloads, and the better performance was observed at the higher numbers of jobs.


Sequential read/write: In this section we discuss sequential read/write results on Instance Store storage with the EXT4 file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:34 EXT4 Sequential Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:35 EXT4 Sequential Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:36 EXT4 Sequential Read Write 32 jobs


For read-dominated workloads (say 90% reads), the 4k, 8k, 16k and 32k block sizes gave similar results at 16 jobs and 32 jobs and somewhat lower results at 8 jobs. We also noticed fewer IOPS compared to the random workloads. We observed results similar to random read/write, except that increasing the block size from 64k to 128k degraded performance and cut the IOPS roughly in half, and increasing it from 128k to 1024k brought the IOPS down drastically.

For write-dominated workloads (say 90% writes), the IOPS decreased roughly linearly with increasing block size for all job counts. We also observed that increasing the block size from 64k to 128k reduced the IOPS from 858 to 100 at 32 jobs, and increasing it from 128k to 1024k gave only 97 IOPS but still wrote more data than at the 128k block size.

Here we expected sequential reads to perform better than random reads, and with EXT4 that is exactly what we got.

XFS:

Random read/write: In this section we discuss random read/write results on Instance Store storage with the XFS file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:37 XFS Random Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:38 XFS Random Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 1024k]

Graph 4.1:39 XFS Random Read Write 32 jobs

With the XFS file system, the number of IOPS decreased roughly linearly with increasing block size at 8 jobs for the 50% read / 50% write mix. When we ran 16 and 32 jobs at 50% reads and writes, the IOPS were almost the same for the 16k, 32k and 64k block sizes. Another observation from these results is that read-dominated workloads did better at 32 jobs with a small block size (4k), while write-dominated workloads did better at 8 jobs with an 8k block size.

Sequential read/write: In this section we discuss sequential read/write results on Instance Store storage with the XFS file system and different numbers of jobs.

[Chart: IOPS vs. read percentage for block sizes 4k to 128k]

Graph 4.1:40 XFS Sequential Read Write 8 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 128k]

Graph 4.1:41 XFS Sequential Read Write 16 jobs

[Chart: IOPS vs. read percentage for block sizes 4k to 64k]

Graph 4.1:42 XFS Sequential Read Write 32 jobs

With sequential workloads, XFS was not affected by the number of threads. XFS performed better with read-dominated workloads than with write-dominated workloads. It is, however, affected by the block size: as we increase the block size, the number of input/output operations the storage can perform per second decreases. Comparing random read/write and sequential read/write, the random read/write workloads performed better than the sequential read/write workloads.

EXT3 vs EXT4 vs XFS on EBS & Instance storage:

IOPS for 8 jobs, 4k block size

                   EXT3_EBS   EXT4_EBS   XFS_EBS   EXT3_Instore   EXT4_Instore   XFS_Instore
90% Random Read    3063       3064       3064      27500          26566          25136
90% Random Write   1472       1558       1773      11954          11334          9960
90% Seq-Read       3070       3064       3059      12954          12370          25136
90% Seq-Write      1685       1516       3059      2931           2396           9960

Table 8 IOPS for 8 jobs & 4k block size- EBS vs Instance store

IOPS for 16 jobs, 4k block size

                   EXT3_EBS   EXT4_EBS   XFS_EBS   EXT3_Instore   EXT4_Instore   XFS_Instore
90% Random Read    3062       3063       3061      35987          33882          33820
90% Random Write   1507       1582       1653      11443          11314          9691
90% Seq-Read       3090       3063       3063      16012          15000          33820
90% Seq-Write      1712       1685       1791      2717           2609           9691

Table 9 IOPS for 16 jobs & 4k block size- EBS vs Instance store

IOPS for 32 jobs, 4k block size

                   EXT3_EBS   EXT4_EBS   XFS_EBS   EXT3_Instore   EXT4_Instore   XFS_Instore
90% Random Read    3063       3045       3064      40073          37307          38422
90% Random Write   1499       1809       1828      11592          11068          9869
90% Seq-Read       3091       3063       3052      15235          14972          38422
90% Seq-Write      1712       1682       1801      2712           2449           9869

Table 10 IOPS for 32 jobs & 4k block size- EBS vs Instance store

Tables 8, 9 and 10 compare the EBS and Instance Store storage settings with different file systems. From our results, we can see that the Instance Store setting delivers better IOPS than EBS. Within the Instance Store volumes, XFS did well with sequential workloads, while for random workloads EXT3 performed well.


CHAPTER 6

IO ANALYSIS

In this section we do a block-level analysis of the EXT4 and XFS file systems with sequential and random reads/writes using the blktrace tool, since EXT4 achieved the highest IOPS on EBS storage and XFS the highest IOPS on Instance storage.

For IO analysis we started with blktrace, a block-layer IO tracing mechanism that provides detailed information about request queues. blktrace must be told which disk to trace (in our case /dev/xvdb) with the -d option. Since the blktrace output itself is not in a human-readable format, we use blkparse to make it readable. As we can see from the blkparse output (Appendix C), each I/O is printed along with a summary of the operations and how they were processed by the I/O scheduler. This information can be used to figure out the I/O patterns (random reads, random writes, sequential reads, sequential writes, etc.), the size of the I/O operations hitting the physical devices, and the type of workload on the system. blktrace produces a series of binary trace files, one per CPU per device. We saved the blktrace traces to disk, then ran the blkparse tool on the traced disk (in our case /dev/xvdb) and specified files to store the combined binary output (in our case bp_0.bin through bp_6.bin for CPU 0 through CPU 6). We then ran btt on a file produced by blkparse (btt -i bp_1.bin).
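A minimal sketch of this tracing pipeline using the standard blktrace, blkparse and btt options (this variant dumps a single combined binary, whereas the runs above kept per-CPU bp_N.bin files):

blktrace -d /dev/xvdb -o xvdb    # trace the device; writes per-CPU files xvdb.blktrace.0, xvdb.blktrace.1, ...
blkparse -i xvdb -d bp.bin       # print human-readable events and dump a combined binary to bp.bin
btt -i bp.bin                    # summarize per-phase latencies (Q2Q, Q2G, G2I, Q2M, I2D, M2D, D2C, Q2C)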

Example: The following is the btt output for bp_1.bin with the EXT4 file system on EBS while running the FIO benchmark.

==================== All Devices ====================

ALL      MIN            AVG            MAX            N
-----    -------------  -------------  -------------  -----------
Q2Q      0.000000302    0.000187836    0.018320058    11608
Q2G      0.000000316    0.000555075    0.021534113    6633
G2I      0.000000537    0.000000884    0.000016285    6633
Q2M      0.000000220    0.000521300    0.020452472    5757
I2D      0.000000262    0.000567899    0.020857786    6825
M2D      0.000003591    0.000305261    0.002367892    5557
D2C      0.000003985    0.009692405    0.021917346    11547
Q2C      0.000685686    0.010587135    0.024264175    11548

Here Q2Q is the time between requests sent to the block layer, Q2G is how long it

takes from the time a block I/O is queued to the time it gets a request allocated for it.

G2I measures how long it takes from the time a request is allocated to the time it is

inserted into the device's queue. Q2M is how long it takes from the time a block I/O

is queued to the time it gets merged with an existing request. I2D is how long it takes

from the time a request is inserted into the device's queue to the time it is actually

issued to the device. M2D is how long it takes from the time a block I/O is merged

with an existing request until the request is issued to the device. D2C is the service time of the request at the device, and Q2C is the total time spent in the block layer for a request.
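As a rough consistency check on this output (the averages mix merged and unmerged requests, so the decomposition is only approximate), Q2G + G2I + I2D + D2C is about 0.000555 + 0.0000009 + 0.000568 + 0.009692, i.e. roughly 0.0108 s, which is close to the measured Q2C average of 0.0106 s and shows that the device service time (D2C) dominates the total time spent in the block layer.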


CHAPTER 7

DISCUSSION

We performed an extensive I/O benchmarking study comparing different storage options on Amazon EC2 instances. An Amazon EC2 virtual machine has an ephemeral local disk and the option to mount an Elastic Block Store volume. Typically, the performance of the local disk tends to be slightly higher than that of the corresponding EBS volumes.

As mentioned at the beginning of this thesis, this research aims to answer several important questions from the thesis outline, such as which setting delivers the better peak performance, which delivers more consistent performance, and whether either of the two settings is an all-time winner for all workloads or the performance is workload dependent. Based on the experimental results, the performance of these hosts shows a fair amount of variability due to the attached storage settings. From the EBS and Instance Store volume results we found I/O performance variations due to block size: as the block size increases, the number of IOPS decreases and hence the I/O performance, measured in IOPS, degrades. EC2 performance also varies significantly under different host file systems, and file system performance is affected much more by write than by read operations.

This thesis also evaluates the file system's impact on EBS and Instance storage performance. We studied several popular Linux file systems, with various mount and format options, using the FileBench workload generator to emulate four server workloads: web, database, mail and file server. File system design, implementation and available features have a significant effect on CPU and disk utilization, and hence on performance. We noticed that default file system options are often suboptimal, and that carefully matching the expected workload to the file system type and options can improve performance.
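For illustration only (these particular options are common Linux tuning knobs and were not individually evaluated in this thesis), matching options to a read-heavy server workload might look like:

mkfs.ext4 /dev/xvdb                             # format the volume with EXT4 defaults
mount -o noatime,nodiratime /dev/xvdb /data     # skip access-time updates on every read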


From Chapter 5, we understood that the Instance Store storage setting was better than EBS in terms of IOPS. However, EBS has some major advantages: it supports proper backups, important data is not lost when an instance stops, and EBS provides scaling. While EBS-backed instances provide a certain level of durability compared to ephemeral storage instances, they can and do fail. We also note that when a server does not need EBS to launch, it is better to choose the Instance Store setting, since it is cheaper than EBS-backed AMIs; EBS-backed AMIs are quite a bit more expensive than using ephemeral storage. EBS volumes are limited to 1 TB of space, so you have to stripe them to get bigger volume sizes. EBS is good, but not for backing instances; it is excellent for storing massive amounts of data, taking quick snapshot backups, and performing quick restores for disaster recovery.
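One common way to stripe EBS volumes beyond the 1 TB limit (a sketch we did not evaluate here; the device names and mount point are hypothetical) is software RAID 0:

mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdf /dev/xvdg   # stripe two attached EBS volumes
mkfs.ext4 /dev/md0                                                       # build a file system on the striped array
mount /dev/md0 /bigdata                                                  # mount point assumed to exist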

In the real world, we typically will not have a single host pumping I/O into a storage array; more likely, many hosts will be performing input/output operations in parallel. We expected performance degradation with write workloads, but we also observed read performance degradation with some file systems.

Hard drives actually provide plenty of sequential speed, especially when aggregated into various RAID implementations. Virtualization overhead is felt more in sequential workloads accessed through smaller block sizes than in random workloads. Sequential read workloads come into play during OLAP, batch processing, content delivery, streaming and backup scenarios, while sequential write performance matters for caching, replication, HPC and database logging workloads. Another key component of analyzing sequential performance is observing latency metrics. Cloud customers can estimate the expected performance based on the characteristics of the workload they deploy.
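For example, the FIO output in Appendix A reports these latency metrics directly: slat is the submission latency (time to submit the I/O), clat the completion latency (time from submission to completion), and lat the total latency (roughly slat plus clat), together with latency percentiles that expose variability rather than just the average.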


CHAPTER 8

FUTURE WORK

We performed an exhaustive study comparing the performance of Amazon EBS and Amazon Instance Store test beds. However, more work is needed to understand the effect of different instance types (small, medium, large and extra-large) combined with different storage settings. We plan to expand our study to include other file systems (e.g., Reiser4 and BTRFS), as we believe they offer greater optimization opportunities. We also want to determine whether instance size has any impact under different storage settings. In addition, we plan to repeat this work on the UCCS cloud, which will help UCCS students and staff understand the performance implications of the UCCS cloud and choose appropriate file systems for different types of workloads in order to increase performance.


CHAPTER 9

CONCLUSION

Proper benchmarking and analysis are tedious, time-consuming tasks. We conducted a comprehensive study of file systems on modern systems, evaluated popular server workloads, and varied many parameters. We collected and analyzed performance metrics, and we discovered and explained significant variations in performance. We found that XFS worked better than EXT3 and EXT4, and that, in terms of storage setting, the Instance Store setting is better than the EBS setting. We also conclude that there are no universally good configurations for all workloads, and we explained complex behaviors that go against common conventions.

Our main objective was to better understand the performance implications of Amazon EBS and Instance Store volumes attached to instances, under real-world workloads, with different file systems mounted on these instances. By examining a large set of combinations of file systems under various workloads, we have demonstrated the significant performance difference between the two types of Amazon EC2 storage; system administrators must therefore be careful in choosing file systems in order to reap the greatest benefit from the Amazon cloud.

Our preliminary investigation of these two storage settings will help researchers better understand critical performance issues in this area and shed light on more efficient methods of utilizing these storage options. We recommend that servers be tested and optimized for their expected workloads before being used in production. Given the long-running nature of busy Internet servers, software-based optimization techniques can have significant, cumulative long-term benefits. Understanding the connection between system-level workloads and the I/O pattern that the drive experiences is essential to optimizing performance. We hope that our work will motivate system designers to analyze performance issues on Amazon EC2 more carefully.


BIBLIOGRAPHY

[1] Le, Duy, Hai Huang, and Haining Wang. “Understanding performance implications

of nested file systems in a virtualized environment." FAST. 2012.

[2] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html

[3] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html

[4] FIO - How to. http://www.bluestop.org/fio/HOWTO.txt

[5] Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T.L., Ho, A., Pratt, I.,

Warfield, A.: Xen and the art of virtualization. In: SOSP. ACM, New York (2003)

[6] Amazon Elastic Compute Cloud –EC2. http://aws.amazon.com/ec2/

[7] http://aws.amazon.com/s3/

[8] IBM Cloud Computing. http://www.ibm.com/cloud-computing/us/en/

[9] http://xfs.org/docs/xfsdocs-xml-dev/XFS_Filesystem_Structure/tmp/en-US/html/

index.html

[10] Filebench: http://linux.die.net/man/1/filebench

[11] EXT3: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/

6/html/Storage_Administration_Guide/ch-ext3.html

[12] EXT4: https://ext4.wiki.kernel.org/index.php/Main_Page

[13] File Bench: http://filebench.sourceforge.net/wiki/index.php/Main_Page

[14] Amazon Instance Types: http://aws.amazon.com/ec2/instance-types/

[15] Alan D. Brunelle. Block I/O layer tracing: blktrace. April 2006. Available from:

http://www.gelato.org/pdf/apr2006/gelato_ICE06apr_blktrace_brunelle_hp.pdf [cited

May 2009].


[16] Jens Axboe. blktrace manual, 2008.

[17] Jens Axboe. blkparse manual, 2008.

[18] http://linux.die.net/man/1/iostat

[19] http://linux.die.net/man/1/sar

[20] http://linux.die.net/man/1/iotop

[21] www.mimuw.edu.pl/.../gelato_ICE06apr_blktrace_brunelle_hp.pdf


APPENDIX A

FIO Benchmark 64k block size with random workload:

[root@ip-10-63-61-88 output]# cat 64krandomreadwrite100

64krandreadwrite100: (g=0): rw=randrw, bs=64K-64K/64K-64K/64K-64K,

ioengine=libaio, iodepth=8

64krandreadwrite100: (g=0): rw=randrw, bs=64K-64K/64K-64K/64K-64K,

ioengine=libaio, iodepth=8

fio-2.1.5

Starting 32 processes

64krandreadwrite100: Laying out IO file(s) (1 file(s) / 1024MB)

64krandreadwrite100: (groupid=0, jobs=32): err= 0: pid=28843: Tue Apr 7 09:27:32

2015

read: io=7473.2MB, bw=127245KB/s, iops=1988, runt= 60140msec

slat (usec): min=5, max=174510, avg=12634.37, stdev=28676.53

clat (msec): min=1, max=409, avg=115.98, stdev=53.82

lat (msec): min=1, max=409, avg=128.61, stdev=50.64

clat percentiles (msec):

| 1.00th=[ 12], 5.00th=[ 17], 10.00th=[ 48], 20.00th=[ 88],

| 30.00th=[ 98], 40.00th=[ 102], 50.00th=[ 106], 60.00th=[ 110],

| 70.00th=[ 123], 80.00th=[ 161], 90.00th=[ 196], 95.00th=[ 202],

| 99.00th=[ 251], 99.50th=[ 265], 99.90th=[ 302], 99.95th=[ 310],

| 99.99th=[ 351]

bw (KB /s): min= 2365, max= 6932, per=3.14%, avg=3989.51, stdev=719.09

lat (msec) : 2=0.01%, 4=0.02%, 10=0.15%, 20=5.97%, 50=4.27%

lat (msec) : 100=24.22%, 250=64.25%, 500=1.11%

cpu : usr=0.01%, sys=0.16%, ctx=124384, majf=0, minf=782

IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=99.8%, 16=0.0%, 32=0.0%, >=64=0.0%


submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%

complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%

issued : total=r=119571/w=0/d=0, short=r=0/w=0/d=0

latency : target=0, window=0, percentile=100.00%, depth=8

Run status group 0 (all jobs):

READ: io=7473.2MB, aggrb=127245KB/s, minb=127245KB/s,

maxb=127245KB/s, mint=60140msec, maxt=60140msec

Disk stats (read/write):

xvdb: ios=118609/12, merge=846/10, ticks=9115184/476, in_queue=9123220,

util=99.82%

APPENDIX B

Sample filebench Output:

FileBench benchmark results on different types of servers with 8k IO size

FileServer:

filebench> load fileserver

$meanfilesize=128k

$nthreads=50

$nfiles=50000

$iosize=8k

run 60

IO Summary: 728709 ops, 12143.621 ops/s, (1104/2208 r/w), 295.1mb/s, 581us cpu/op, 1.1ms latency

WebServer:

filebench> load webserver

filebench> set $dir=/ext3

filebench> set $meanfilesize=16k

filebench> set $nthreads=100


filebench> set $nfiles=50000

filebench> set $iosize=8k

filebench> run 60

IO Summary: 796362 ops, 13271.074 ops/s, (4281/429 r/w), 70.3mb/s, 396us cpu/op, 0.2ms latency

MailServer:

filebench> load varmail

filebench> set $dir=/ext3

filebench> set $meanfilesize=8k

filebench> set $nthreads=16

filebench> set $nfiles=50000

filebench> set $iosize=8k

filebench> run 60

IO Summary: 519044 ops, 8649.666 ops/s, (1331/1331 r/w), 23.1mb/s, 458us cpu/op, 3.6ms latency

Db server:

filebench> load mongo

$dir=/ext3

$filesize=1g

$nthreads=200

$nfiles=8

$iosize=8k

run 60

IO Summary: 58 ops, 57.990 ops/s, (9/10 r/w), 13621.7mb/s, 184737us cpu/op, 194.3ms latency


APPENDIX C

Sample BlkTrace Output:

root@ip-10-63-61-88 bt]# ./blkparse -i xvdb.blktrace.*

Input file xvdb.blktrace.0 added

Input file xvdb.blktrace.1 added

Input file xvdb.blktrace.2 added

Input file xvdb.blktrace.3 added

Input file xvdb.blktrace.4 added

Input file xvdb.blktrace.5 added

Input file xvdb.blktrace.6 added

Input file xvdb.blktrace.7 added

202,16 3 1 0.000000000 0 C WS 589064 + 8 [0]

202,16 3 2 0.000006284 0 D WS 587680 + 16 [swapper/3]

202,16 1 1 0.000019827 3698 Q WS 589128 + 8 [fio]

202,16 1 2 0.000020718 3698 G WS 589128 + 8 [fio]

202,16 1 3 0.000021076 3698 P N [fio]

202,16 1 4 0.000021808 3698 I WS 589128 + 8 [fio]

202,16 1 5 0.000022071 3698 U N [fio] 1

202,16 1 12 0.000797031 3698 M WS 589144 + 8 [fio]

Etc……….

^CCPU0 (xvdb):

Reads Queued: 0, 0KiB Writes Queued: 97, 388KiB

Read Dispatches: 0, 0KiB Write Dispatches: 14, 56KiB

Reads Requeued: 0 Writes Requeued: 0

Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB


Read Merges: 0, 0KiB Write Merges: 47, 188KiB

Read depth: 0 Write depth: 28

IO unplugs: 50 Timer unplugs: 0

CPU1 (xvdb):

Reads Queued: 0, 0KiB Writes Queued: 119, 476KiB

Read Dispatches: 0, 0KiB Write Dispatches: 12, 48KiB

Reads Requeued: 0 Writes Requeued: 0

Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB

Read Merges: 0, 0KiB Write Merges: 34, 136KiB

Read depth: 0 Write depth: 28

IO unplugs: 85 Timer unplugs: 0

CPU2 (xvdb):

Reads Queued: 0, 0KiB Writes Queued: 157, 628KiB

Read Dispatches: 0, 0KiB Write Dispatches: 6, 24KiB

Reads Requeued: 0 Writes Requeued: 0

Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB

Read Merges: 0, 0KiB Write Merges: 46, 184KiB

Read depth: 0 Write depth: 28

IO unplugs: 111 Timer unplugs: 0

CPU3 (xvdb):

Reads Queued: 0, 0KiB Writes Queued: 1,226, 4,904KiB

Read Dispatches: 0, 0KiB Write Dispatches: 805, 6,296KiB

Reads Requeued: 0 Writes Requeued: 0

Reads Completed: 0, 0KiB Writes Completed: 839, 6,444KiB

Read Merges: 0, 0KiB Write Merges: 636, 2,544KiB

Read depth: 0 Write depth: 28

IO unplugs: 588 Timer unplugs: 0

CPU7 (xvdb):


Reads Queued: 0, 0KiB Writes Queued: 8, 32KiB

Read Dispatches: 0, 0KiB Write Dispatches: 2, 8KiB

Reads Requeued: 0 Writes Requeued: 0

Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB

Read Merges: 0, 0KiB Write Merges: 5, 20KiB

Read depth: 0 Write depth: 28

IO unplugs: 3 Timer unplugs: 0

Total (xvdb):

Reads Queued: 0, 0KiB Writes Queued: 1,607, 6,428KiB

Read Dispatches: 0, 0KiB Write Dispatches: 839, 6,432KiB

Reads Requeued: 0 Writes Requeued: 0

Reads Completed: 0, 0KiB Writes Completed: 839, 6,444KiB

Read Merges: 0, 0KiB Write Merges: 768, 3,072KiB

IO unplugs: 837 Timer unplugs: 0

Throughput (R/W): 0KiB/s / 22,220KiB/s

Events (xvdb): 7,405 entries

Skips: 0 forward (0 - 0.0%

Trace actions:

C – Complete: A previously issued request has been completed. The output will detail the sector and size of that request, as well as the success or failure of it.

D – Issued: A request that previously resided on the block layer queue or in the I/O scheduler has been sent to the driver.

I – Inserted: A request is being sent to the I/O scheduler for addition to the internal queue and later service by the driver. The request is fully formed at this time.

Q – Queued: This notes the intent to queue I/O at the given location. No real request exists yet.

M – Back merge: A previously inserted request exists that ends on the boundary of where this I/O begins, so the I/O scheduler can merge them together.

G - Get request: To send any type of request to a block device, a struct request container must be allocated first.


P – Plug: When I/O is queued to a previously empty block device queue, Linux will plug the queue in anticipation of future I/Os being added before this data is needed.

U – Unplug: Some request data is already queued in the device; start sending requests to the driver. This may happen automatically if a timeout period has passed or if a number of requests have been added to the queue.

APPENDIX D

Sample btt output

[root@ip-10-164-200-68 btt]# ./btt -i bp_0.bin

==================== All Devices ====================

ALL      MIN            AVG            MAX            N
-----    -------------  -------------  -------------  -----------
Q2Q      0.000000302    0.000190622    0.018320058    9623
Q2G      0.000000320    0.000646859    0.021534113    5580
G2I      0.000000537    0.000000891    0.000016285    5580
Q2M      0.000000220    0.000586568    0.020452472    4805
I2D      0.000000295    0.000582587    0.020857786    5757
M2D      0.000003591    0.000312210    0.002367892    4618
D2C      0.000003985    0.009535719    0.021917346    9562
Q2C      0.000685686    0.010527590    0.024264175    9563

==================== Device Overhead ====================

DEV        |   Q2G       G2I       Q2M       I2D       D2C
---------- | --------- --------- --------- --------- ---------
(202, 16)  |   3.5853%   0.0049%   2.7996%   3.3315%  90.5689%
Overall    |   3.5853%   0.0049%   2.7996%   3.3315%  90.5689%

==================== Device Merge Information ====================

DEV        |     #Q     #D   Ratio |  BLKmin  BLKavg  BLKmax   Total
---------- | ------ ------ ------- | ------- ------- ------- -------
(202, 16)  |   9624   5579     1.7 |       8      14      64   83088

==================== Device Q2Q Seek Information ====================

DEV        |  NSEEKS       MEAN   MEDIAN | MODE
---------- | ------- ---------- -------- | --------
(202, 16)  |    9624   261907.1        0 | 0(5689)
Overall    |    9624   261907.1        0 | 0(5689)

==================== Device D2D Seek Information ====================

DEV        |  NSEEKS       MEAN   MEDIAN | MODE
---------- | ------- ---------- -------- | --------
(202, 16)  |    5579   451860.7        0 | 0(1306)
Overall    |    5579   451860.7        0 | 0(1306)

==================== Plug Information ====================

DEV        |  # Plugs  # Timer Us | % Time Q Plugged
---------- | --------- ---------- | ----------------
(202, 16)  |  5574( 0)            | 0.241062860%

DEV        |  IOs/Unp  IOs/Unp(to)
---------- | -------- ------------
(202, 16)  |      1.0          0.0
Overall    |      1.0          0.0

==================== Active Requests At Q Information ====================

DEV        | Avg Reqs @ Q
---------- | ------------
(202, 16)  |          0.6

==================== I/O Active Period Information ====================

DEV        |  # Live     Avg. Act     Avg. !Act   % Live
---------- | -------- ------------ ------------ --------
(202, 16)  |        1  0.000000000  0.000000000   100.00
Total Sys  |        1  0.000000000  0.000000000   100.00

# Total System

# Total System : q activity

0.000019827 0.0

0.000019827 0.4

1.834373837 0.4

1.834373837 0.0


# Total System : c activity

0.000773561 0.5

0.000773561 0.9

1.834348980 0.9

1.834348980 0.5

# Per process

# blktrace : q activity

# blktrace : c activity

0.503369468 1.5

0.503369468 1.9

0.507339801 1.9

0.507339801 1.5

1.531444631 1.5

1.531444631 1.9

1.531542381 1.9

1.531542381 1.5

# fio : q activity

0.000019827 2.0

0.000019827 2.4

1.834373837 2.4

1.834373837 2.0

# fio : c activity

0.056336745 2.5


0.056336745 2.9

0.298750693 2.9

0.298750693 2.5

0.438779299 2.5

0.438779299 2.9

1.532828650 2.9

1.532828650 2.5

1.643768755 2.5

1.643768755 2.9

1.732713450 2.9

1.732713450 2.5

# jbd2 : q activity

0.163400885 3.0

0.163400885 3.4

0.185009019 3.4

0.185009019 3.0

# jbd2 : c activity

# kernel : q activity

# kernel : c activity

0.000773561 4.5

0.000773561 4.9

1.834348980 4.9

1.834348980 4.5
