ANALYSIS OF FILE SYSTEMS PERFORMANCE IN AMAZON EC2 STORAGE
by
Nagamani Pudtha
A Thesis submitted to the
School of Graduate Studies
in partial fulfilment of the requirements for the degree of
Master of Science
Department of Computer Science
University of Colorado Colorado Springs
December 2014
Colorado Springs Colorado
This thesis for the Master of Science degree in Computer Science by
Nagamani Pudtha
has been approved for the
Department of Computer Science
By
_______________________________________________________
Advisor: Jia Rao
_______________________________________________________
Dr. C. Edward Chow
_______________________________________________________
Dr. Xiaobo Zhou
____________________
Date
Nagamani Pudtha (M.S. Computer Science)
Analysis of File systems performance in Amazon EC2 storage
Thesis directed by Professor Jia Rao, Department of Computer Science
ABSTRACT
Cloud computing has gained tremendous popularity in recent years. Cloud
computing refers to using a third-party network of remote servers hosted on the
internet to store and manage data, rather than storing it locally. In simple words, cloud
services provide you with your own hard drive in the cloud, on the internet. Both
public and private clouds are available in the market, and whether a cloud is
public or private, the key to success is creating an appropriate and efficient server,
network and storage infrastructure in which all resources can be efficiently utilized
and shared. In cloud computing, data storage becomes even more crucial since all
data resides on storage systems in a shared infrastructure model. It is therefore very
important to understand the performance of a particular storage option before making the
transition.
In this thesis we perform experiments on Amazon EC2 cloud storage. We select the
Amazon public cloud platform since it is one of the most widely adopted public cloud
platforms and offers Infrastructure-as-a-Service (IaaS). Prior work has shown that
applications with significant communication or I/O tend to perform poorly in
virtualized cloud environments. However, there is a limited understanding of the I/O
characteristics of cloud environments. In this thesis, several tests and benchmarks are
used to evaluate I/O under different storage settings of the Amazon cloud with
different types of file systems, with the goal of exercising and observing the
I/O performance from different perspectives by using different workloads, with a
special focus on long-running jobs. We use the FIO, FileBench and blktrace
benchmarking tools. Blktrace provides detailed block-level I/O analysis of both
EBS and Instance Store disks. Through a set of detailed micro and macro
benchmarking measurements, the tests revealed different levels of performance
degradation in EBS and Instance storage due to the different types of file systems and
different types of workloads.
From our detailed interpretation of the test results, including how time is spent in the
different I/O stages, we provide user guidance on how to select the storage option when
choosing instances.
ACKNOWLEDGEMENTS
First and foremost I would like to sincerely thank my advisor Jia Rao for all the
guidance and interest he took in the progress of this work. I am very grateful to him
for his constructive comments, feedback and also providing the resources like
Amazon Web Services during our research. I will consider myself lucky if I have
imbibed at least a small percentage of his admirable qualities like devotion and
single-minded dedication towards work.
I am extremely grateful to Dr. Edward Chow and Dr. Xiaobo Zhou for their valuable
suggestions and advice on the thesis proposal and throughout this thesis work,
without which it would have been very difficult for me to complete this
work.
Many thanks go to my family members for their constant support and encouragement.
I am grateful to my parents for all their love, affection and blessings without which I
would not have gotten this far in life. And finally, a special note of thanks goes to
Venkat, my husband, for his continual encouragement, support, advice and patience
that have enabled me to accomplish things I never thought were possible.
CONTENTS
ABSTRACT...............................................................................................................................3
LIST OF TABLES.....................................................................................................................8
LIST OF GRAPHS....................................................................................................................8
CHAPTER 1............................................................................................................................10
INTRODUCTION...................................................................................................................10
1.1 MOTIVATION...........................................................................................................10
1.2 THESIS GOAL...........................................................................................................11
1.3 THESIS ORGANIZATION........................................................................................12
CHAPTER 2............................................................................................................................13
BACKGROUND.....................................................................................................................13
2.1 AMAZON EC2...........................................................................................................13
2.2 AMAZON EC2 STORAGE........................................................................................14
2.2.1 ELASTIC BLOCK STORE........................................................................................14
2.2.1.1 GENERAL PURPOSE VOLUMES........................................................................14
2.2.2 INSTANCE STORE...................................................................................................15
2.3 IO ANALYSIS TOOLS..............................................................................................16
2.3.1 BLKTRACE/BLKPARSE..........................................................................................16
CHAPTER 3............................................................................................................................18
EXPERIMENT METHODOLOGY........................................................................................18
3.1 OVERVIEW...............................................................................................................18
3.2 PROPERTIES.............................................................................................................18
3.3 INSTANCE TYPE SELECTION...............................................................................18
3.4 FILE SYSTEM SELECTION.....................................................................................19
3.5 TEST BED SETUP.....................................................................................................21
3.6 BENCHMARKING TOOLS......................................................................................22
3.6.1 MACRO BENCHMARKS.........................................................................................22
3.6.2 MICRO BENCHMARKS.........................................................................................24
CHAPTER 4............................................................................................................................25
EXPERIMENT RESULTS & ANALYSIS.............................................................................25
4.1. FILEBENCH RESULTS............................................................................................25
4.1.1 EBS.............................................................................................................................26
4.1.2 INSTANCE STORE:..................................................................................................31
4.2 FIO RESULTS............................................................................................................37
4.2.1 EBS:............................................................................................................................38
4.2.2 INSTANCE STORE...................................................................................................51
CHAPTER 6............................................................................................................................62
IO ANALYSIS.........................................................................................................................62
CHAPTER 7............................................................................................................................65
DISCUSSION..........................................................................................................................65
CHAPTER 8............................................................................................................................67
FUTURE WORK.....................................................................................................................67
CHAPTER 9............................................................................................................................67
CONCLUSION........................................................................................................................67
BIBLIOGRAPHY....................................................................................................................69
APPENDIX A..........................................................................................................................71
FIO Benchmark 64k block size with random workload:.........................................................71
APPENDIX B..........................................................................................................................72
SAMPLE FILEBENCH OUTPUT:.........................................................................................72
APPENDIX C..........................................................................................................................74
SAMPLE BLKTRACE OUTPUT:..........................................................................................74
APPENDIX D...........................................................................................................................77
SAMPLE BTT OUTPUT.......................................................................................................77
LIST OF FIGURES
Figure 1: Amazon EC2 Storage...................................................................................14
Figure 2: blktrace General Architecture......................................................................17
LIST OF TABLES
Table 1 Experiment Setup –EBS...............................................................................21
Table 2 FileBench Workloads...................................................................................25
Table 3 IOPS of different filesystems with different workloads on EBS with 8k….36
Table 4 IOPS of different filesystems with different workloads on EBS with 512k 36
Table 5 IOPS of different filesystems with different workloads on Instance Store
with 8k........................................................................................................37
Table 6 IOPS of different filesystems with different workloads on Instance Store
with 512k.................................................................................................37
Table 7 FIO Benchmark Parameters for EBS and Instance store..............................38
Table 8 IOPS for 8 jobs & 4k block size- EBS vs Instance store..............................59
Table 9 IOPS for 16 jobs & 4k block size- EBS vs Instance store............................59
Table 10 IOPS for 32 jobs & 4k block size- EBS vs Instance store............................59
LIST OF GRAPHS
Graph 4.1:1 IOPS of different file systems with different workloads on EBS............27
Graph 4.1:2 Latency of different filesystems with different workloads on EBS........29
Graph 4.1:3 Bandwidth of file systems on EBS..........................................................30
Graph 4.1:4 IOPS of different filesystems with different workloads on Instance Store.............................................................................................................................31
Graph 4.1:5 Latency of different file systems with different workloads on Instance Store.....................................................................................................................33
Graph 4.1:6 Bandwidth of different filesystems with different workloads on Instance Store.....................................................................................................................35
Graph 4.1:7 EXT3 Random Read Write 8 jobs..........................................................
Graph 4.1:8 EXT3 Random Read Write 16 jobs.........................................................39
Graph 4.1:9 EXT3 Random Read Write 32 jobs.........................................................39
Graph 4.1:10 EXT3 Sequential Read Write 8 jobs......................................................40
Graph 4.1:11 EXT3 Sequential Read Write 16 jobs.....................................................41
Graph 4.1:12 EXT3 Sequential Read Write 32 jobs....................................................41
Graph 4.1:13 EXT4 Rand Read Write 8 jobs.............................................................42
Graph 4.1:14 EXT4 Rand Read Write 16 jobs............................................................43
Graph 4.1:15 EXT4 Rand Read Write 32 jobs............................................................43
Graph 4.1:16 EXT4 Sequential Read Write 8 jobs.....................................................44
Graph 4.1:17 EXT4 Sequential Read Write 16 jobs....................................................44
Graph 4.1:18 EXT4 Sequential Read Write 32 jobs...................................................45
Graph 4.1:19 XFS Rand Read Write 8 jobs................................................................46
Graph 4.1:20 XFS Rand Read Write 16 jobs..............................................................46
Graph 4.1:21 XFS Rand Read Write 32 jobs..............................................................46
Graph 4.1:22 XFS Sequential Read Write 8 jobs........................................................47
Graph 4.1:23 XFS Sequential Read Write 16 jobs......................................................47
Graph 4.1:24 XFS Sequential Read Write 32 jobs.....................................................48
Graph 4.1:25 EXT3 Rand Read Write 8 jobs..............................................................50
Graph 4.1:26 EXT3 Rand Read Write 16 jobs...........................................................50
Graph 4.1:27 EXT3 Rand Read Write 32 jobs...........................................................50
Graph 4.1:28 EXT3 Sequential Read Write 8 jobs......................................................51
Graph 4.1:29 EXT3 Sequential Read Write 8 jobs......................................................52
Graph 4.1:30 EXT3 Sequential Read Write 32 jobs...................................................52
Graph 4.1:31 EXT4 Random Read Write 8 jobs.........................................................53
Graph 4.1:32 EXT4 Random Read Write 16 jobs.......................................................53
Graph 4.1:33 EXT4 Random Read Write 32 jobs.......................................................53
Graph 4.1:34 EXT4 Sequential Read Write 8 jobs.....................................................54
Graph 4.1:35 EXT4 Sequential Read Write 16 jobs....................................................54
Graph 4.1:36 EXT4 Sequential Read Write 32 jobs....................................................55
Graph 4.1:37 XFS Random Read Write 8 jobs...........................................................56
Graph 4.1:38 XFS Random Read Write 16 jobs.........................................................56
Graph 4.1:39 XFS Random Read Write 32 jobs.........................................................56
Graph 4.1:40 XFS Sequential Read Write 8 jobs.......................................................57
Graph 4.1:41 XFS Sequential Read Write 16 jobs......................................................57
Graph 4.1:42 XFS Sequential Read Write 32 jobs.....................................................58
CHAPTER 1
INTRODUCTION
1.1 MOTIVATION
Amazon EC2, the leading IaaS (Infrastructure as a Service) provider and a subset of
the offerings from Amazon Web Services, has had a significant impact on the business IT
community and provides a reasonable and attractive alternative to locally-owned
infrastructure. Amazon Elastic Compute Cloud has been used to host workloads for small and
medium-sized enterprises with various usage patterns. Amazon EC2 was introduced in 2006 and
supports a wide range of instance types with different storage settings. Amazon EC2
provides Elastic Block Store (EBS) [2], Instance Storage [3] and the Amazon Simple
Storage Service (S3).
There are many discussions and questions in the community about which storage
setting a user should choose. This thesis analyzes the performance of Amazon
EBS and Instance storage with different combinations of file systems, different
types of workloads, and different read/write operations. We also extend prior work
in which the authors focused on nested file system performance [1] on only one kind
of storage on a single instance, whereas our work studies the performance of both
Amazon EC2 EBS and Instance storage by launching two instances, one on each storage
type, and testing them with different workloads under different file systems.
Understanding how data makes its way from the application to the storage devices is key
to understanding how I/O works. With this knowledge, users can make much better
decisions about storage design and storage purchases for their applications. Monitoring
the lowest level of the I/O stack, the block driver, is a crucial part of this overall
understanding of I/O patterns.
1.2 THESIS GOAL
In this thesis, we aim to present a measurement study that characterizes the performance
implications of the storage options in the Amazon Elastic Compute Cloud (EC2) [6] data
center. Performance has a long tradition in storage research; we measure
performance by analyzing the I/O characteristics, workload demand, and storage
configuration, attaching General Purpose SSD volumes to instances, and we
provide user guidance on how to select the storage option in Amazon EC2.
This research aims to answer the following questions within the bounds of the
environments tested.
Is there a wide range of performance variation between the Amazon EC2
storage options?
Can the block size be a cause of I/O performance degradation?
Which option delivers the better peak performance?
Which option delivers more consistent performance?
Is either of these two settings an all-time winner for all workloads, or is the
performance workload-dependent?
We ran many experiments on both EBS and Instance store instances, collected
detailed performance measurements, and analyzed them. We found that different
workloads, not too surprisingly, have a large impact on system behavior. No single
file system worked best for all workloads; some file system features helped
performance and others hurt it.
Our goal is to quickly observe the events in each layer and how they interact, and also to
provide enough information to study even small details. To gather information from
these components, we have used the existing blktrace mechanism [15].
1.3 THESIS ORGANIZATION
The remainder of this thesis is structured as follows.
In Chapter 2, we describe some of the background that led to this project and our
goals for the project. Chapter 3 describes the experiment setup, including the
benchmarking tools used in our experiments. Chapter 4 presents and analyzes the results
of those benchmarks on the EXT3, EXT4 and XFS file systems with EBS and
Instance store volumes. Chapter 6 presents the block-level I/O analysis, Chapter 7
discusses the results, Chapter 8 focuses on future areas of this thesis work, and
Chapter 9 gives the conclusion.
CHAPTER 2
BACKGROUND
2.1 AMAZON EC2
Amazon Elastic Compute Cloud (Amazon EC2) [6] is a component of Amazon
Web Services (AWS). EC2 is a central part of Amazon.com's cloud computing
platform. Amazon EC2 is a web-based service from which users can rent, for a
monthly or hourly fee, virtual servers in the cloud and run custom applications
on those servers. Elasticity in EC2 refers to the ease with which users can scale server
and application resources as their computing demands change.
Amazon EC2 uses the Xen virtualization technique [5] to manage physical servers.
There may be several Xen virtual machines running on one physical server. Each
Xen virtual machine is called an instance in Amazon EC2. There are several types of
instances, and each type provides a predictable amount of computing
capacity. The input-output (I/O) capacities of these instance types vary
according to the storage attached to them. Allocated EC2 instances can be
placed at different physical locations; Amazon organizes the infrastructure into
different regions and availability zones.
To use EC2, a subscriber creates an Amazon Machine Image (AMI) containing the
operating system, application programs and configuration settings. Then the AMI is
uploaded to the Amazon Simple Storage Service (Amazon S3) and registered with
Amazon EC2, creating a so-called AMI identifier (AMI ID). Once this has been done,
the subscriber can requisition virtual machines on an as-needed basis. Capacity can be
increased or decreased in real time from as few as one to more than 1000 virtual
machines simultaneously. Billing takes place according to the computing, storage and
network resources consumed.
2.2 AMAZON EC2 STORAGE
In amazon cloud we have three types of storage choices for an instance boot disk or
its root device. They are Instance store, Elastic Block Storage (EBS) and Simple
Storage Service (S3). In this section, we will briefly discuss these three storage
settings. These three types of storages of EC2 is depicted in the following Figure 1 [6]
Figure 1: Amazon EC2 Storage
2.2.1 ELASTIC BLOCK STORE
Amazon's EBS volumes provide persistent block-level storage. Once we attach an EBS
volume to an instance, we can create file systems on it and even run a database on
top of these volumes. Amazon EBS provides three types of volumes: General Purpose
(SSD), Provisioned IOPS (SSD), and Magnetic. The three volume types differ in
performance characteristics and cost. In our thesis research we attach only General
Purpose storage volumes to our instances because of cost constraints.
2.2.1.1 GENERAL PURPOSE VOLUMES
General Purpose (gp2) is currently the default EBS volume type when
launching "Create Volume" in the EC2 console of the Amazon cloud. General Purpose
volumes are backed by Solid-State Drives (SSDs) and are suitable for a broad range
of workloads, including small to medium-sized databases, development and test
environments, and boot volumes. General Purpose volumes provide the ability to
burst up to 3,000 IOPS per volume, independent of volume size, to meet the
performance needs of most applications. General Purpose volumes also deliver a
consistent baseline of 3 IOPS/GB (for example, a 100 GB volume has a baseline of
300 IOPS) and provide up to 128 MB/s of throughput per volume. I/O is included in
the price of General Purpose volumes, so you pay only for each GB of storage you provision.
General Purpose SSD volumes, measured by the IOPS benchmark, offer roughly 10 times more
input/output operations per second, with one-tenth the latency of magnetic
volumes, as well as greater bandwidth and consistency. When we need a greater number
of IOPS than General Purpose (SSD) volumes provide, or we have a workload where
performance consistency is critical, Amazon EBS Provisioned IOPS (SSD) volumes
will help.
2.2.2 INSTANCE STORE
Instance store volumes [3] are temporary storage; they survive a reboot of an EC2
instance, but when the instance is stopped or terminated (e.g., by an API call, or due
to a failure), this store is lost. EBS volumes, in contrast, are built on replicated storage,
so that the failure of a single component will not cause data loss.
Because instance store volumes are temporary storage, you should not rely on these disks to
keep long-term data, or any other data you would not want to lose when a failure
happens (i.e., stopping/starting the instance, a failure of the underlying hardware, or
terminating the instance); for these purposes, it is better to choose persistent storage like
EBS or S3. In addition, an instance store volume cannot be upgraded and is not scalable.
However, instance store is faster than EBS, with its non-persistent characteristic being the trade-off.
2.3 IO ANALYSIS TOOLS
Linux has some excellent tools for tracing I/O request queue operations in the block
layer, for example iostat [18], iotop [20] and sar [19]. iotop [20] can be used to get
quite a few I/O statistics for a particular system, but it only gives you an overall
picture without a great deal of detail, so it is not well suited to determining how an
application is doing its I/O. iotop only gives an idea of how much throughput, not how
many IOPS, the application is generating.
Iostat [18] is the go-to tool for Linux storage performance monitoring and allows you
to collect quite a few I/O statistics. It is available nearly everywhere, it works on the
vast majority of Linux machines, and it is relatively easy to use and understand.
Relative to iotop, iostat gives you a much larger array of statistics, but it does not
separate out I/O usage on a per-process basis; instead you get an aggregate view of all
I/O usage.
Sar [19] is one of the most common tools for gathering information about system
performance. Like iotop, it runs on each compute node and gathers I/O statistics,
but it examines the I/O pattern of an application at a higher level. To get around these
issues, we need to go deeper and watch I/O statistics at the block layer.
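For reference, typical invocations of these monitoring tools look as follows (shown only as an illustrative sketch; options and availability may vary slightly between distributions):
iostat -x 1 10    # extended per-device statistics, ten samples at one-second intervals
iotop -o          # show only the processes currently performing I/O
sar -d 1 10       # block-device activity, ten samples at one-second intervals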
2.3.1 BLKTRACE/BLKPARSE
The tools that come with the kernel to watch I/O statistics in depth are blktrace and
blkparse. These are very powerful tools. Blktrace is a block-layer I/O tracing
mechanism which provides detailed information about request queue operations all the
way up to user space. Blktrace transfers event traces from the kernel either into long-term
storage or into formatted output via blkparse. Compared to all the other tools
discussed above, it provides detailed information about request queue operations.
Blktrace needs no special support or configuration apart from having debugfs
mounted on /sys/kernel/debug. The blkparse utility formats the events stored in files
or directly outputs the data collected by blktrace. The general architecture of blktrace is
shown in Figure 2 [21].
Figure 2: blktrace General Architecture
There are around 20 different events produced by blktrace, of which we only use a
few. Below we list the few events that we use; for a full list, refer to the blkparse manual
page [17].
Request inserted (I): The I/O request is inserted onto the request queue.
Request queued (Q): We use this to track I/O request ownership, because other
events, such as dispatch and completion, run in an arbitrary process context.
Request dispatch (D): This event is issued when the I/O scheduler moves a request
from a queue inside the I/O scheduler to the dispatch queue, so that the disk can
service the request.
Request completion (C): This happens when the disk has serviced a request. For a
write operation this means that the data has been written to the platter (or at least
buffered), and for a read operation that the data is stored in host memory.
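As an illustrative sketch (device and file names are examples, not the exact commands of our runs), a blktrace/blkparse/btt session on one of our partitions could look like this:
mount -t debugfs debugfs /sys/kernel/debug      # blktrace requires debugfs
blktrace -d /dev/xvdb1 -o ebs_ext3 -w 60 &      # trace the device for 60 seconds in the background
# ... run FIO or FileBench against the file system on /dev/xvdb1 here ...
wait
blkparse -i ebs_ext3 -o ebs_ext3.txt -d ebs_ext3.bin   # human-readable events plus a binary dump
btt -i ebs_ext3.bin                                    # per-stage timing summary (e.g., Q2D, D2C)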
CHAPTER 3
EXPERIMENT METHODOLOGY
3.1 OVERVIEW
To evaluate the performance implications of different storage settings of Amazon
EC2, we built a test bed. In this section, we introduce the methodology of our
measurement study. We first explain the properties we measure in our experiments,
the instance type selection, the file system selection, and the methodology we use to
measure these properties.
3.2 PROPERTIES
In our experiments, we use two types of benchmarking tools: FileBench [13]
and FIO [4]. FileBench is used to generate macro-benchmarks and
FIO [4] to generate micro-benchmarks. We analyze the impact of the choice of
file system with different kinds of workloads in VMs, and analyze any performance
degradation or improvement with write-dominated and read-dominated
workloads. We measure the bandwidth of each workload with different file systems on EBS
and Instance store. We run blktrace in parallel with FIO and FileBench to trace
the I/O statistics.
3.3 INSTANCE TYPE SELECTION
Amazon EC2 provides different types of instances for users. Our measurement
experiments are mainly based on Amazon EC2 M3 large instances [14], which provide
a balance of compute, memory, and network resources. They use SSD-based instance
storage for fast I/O performance. The m3 family of instances is used for small and mid-
size databases, data processing tasks that require additional memory, caching fleets,
and running backend servers for SAP, Microsoft SharePoint, and other enterprise
applications. We compare large instances with an EBS volume against large
instances with an Instance store volume to study the performance implications in virtual
machines.
3.4 FILE SYSTEM SELECTION
Linux supports several different file systems. Each has strengths and weaknesses and
its own set of performance characteristics. One important attribute of a filesystem is
journaling, which allows for much faster recovery after a system crash. Generally, a
journaling filesystem is preferred over a non-journaling one when you have a choice.
In our thesis we are using EXT3, EXT4 and XFS file systems. We selected these file
systems for their widespread use on Linux servers and the variation in their features.
Following is a brief summary of these file systems.
EXT3: EXT3 [11] stands for third extended file system. The ext3 file system adds
journaling capability to a standard ext2 file system and is therefore an evolutionary
growth of a very stable file system. Because it adds journaling on top of the proven
ext2 file system, it is possible to convert an existing ext2 file system to ext3 and even
convert back again if required. Journaling has a dedicated area in the file system,
where all the changes are tracked. When the file system crashes, the possibility of file
system corruption is less because of journaling. The maximum individual file size ranges
from 16 GB to 2 TB (depending on the block size), and the overall EXT3 file system size
can range from 2 TB to 32 TB. Once the disk is partitioned using the fdisk command, a file
system can be created on the partition using the mkfs command.
EXT3 supports three journaling modes (see the example mount command after this list):
journal, ordered and writeback.
Journal – Metadata and content are saved in the journal.
Ordered – Only metadata is saved in the journal. Metadata are journaled only
after writing the content to disk. This is the default.
Writeback – Only metadata is saved in the journal. Metadata might be
journaled either before or after the content is written to the disk.
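For example (a minimal sketch with a placeholder device and mount point), an ext3 file system can be created on a partition and mounted with an explicit journaling mode as follows:
mkfs -t ext3 /dev/xvdb1
mkdir -p /ext3
mount -t ext3 -o data=ordered /dev/xvdb1 /ext3   # data=journal | data=ordered | data=writeback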
EXT4: EXT4[12] stands for fourth extended file system. The ext4 file system started
as extensions to ext3 to address the demands of even larger file systems by increasing
storage limits and improving performance. To preserve the stability of ext3, it was
decided in June 2006 to fork the extensions into a new file system, ext4. The ext4 file
system was released in December 2008 and included in the 2.6.28 kernel. Some of
the changes from ext3 are:
File systems of up to 1 exabyte (EB), where 1 EB = 1024 PB (petabytes) and 1 PB = 1024 TB
(terabytes), with individual file sizes from 16 GB to 16 TB.
The use of extents to replace block mapping as a means of improving
performance
Journal checksums to improve reliability
Faster file system checking because unallocated blocks can be skipped during
checks.
Delayed allocation and multi block allocators to improve performance and
reduce file fragmentation.
XFS: XFS is a 64-bit, highly scalable file system that was developed by Silicon
Graphics Inc. (SGI) and first deployed in the Unix-based IRIX operating system (OS)
in 1994. XFS supports large files and large file systems. For a 64-bit implementation,
XFS can handle file systems of up to 18 exabytes, with a maximum file size of 9
exabytes. There is no limit on the number of files. XFS is a journaling file system
and, as such, keeps track of changes in a log before committing the changes to the
main file system. The advantage is guaranteed consistency of the file system and
expedited recovery in the event of power failures or system crashes.
XFS filesystems are divided into a number of equally sized chunks called Allocation
Groups. Each AG can almost be thought of as an individual filesystem that maintains
its own space usage. Each AG can be up to one terabyte in size, regardless of the
underlying device's sector size. Each AG starts with a superblock. The first one is the
primary superblock that stores aggregate AG information. Secondary superblocks are
only used by xfs_repair when the primary superblock has been corrupted [8]. On
Linux systems, the xfsprogs package must be installed to create and manage XFS file systems.
3.5 TEST BED SETUP
Our experiments were conducted on two 64-bit Amazon EC2 virtual machine
instances, both running the Amazon Linux AMI (HVM) 2014.03.1 with 30 GiB of
memory and 8 virtual CPUs. The first instance was an m3.2xlarge with a General
Purpose volume, consisting of a 1024 GB EBS-backed storage volume. The second
instance was also of the m3.2xlarge type, but used a 1024 GB Instance store volume.
EBS: The entire host operating system was installed on a single disk (xvda) while
another disk (xvdb) was used for the experiments. We create multiple equal-sized
partitions on xvdb, each corresponding to a different host file system. Each partition
is then formatted using the default parameters of that file system's mkfs*
command and is mounted using the default parameters of mount; a sketch of these
commands is shown after Table 1.
More hardware and software configuration settings are listed in Table 1.
Devices #Blocks Type
/dev/xvdb1 30720 Ext3
/dev/xvdb2 30720 Ext4
/dev/xvdb3 30720 XFS
Table 1 Experiment Setup –EBS
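A sketch of the corresponding preparation commands (device names and mount points follow Table 1 and are illustrative; each file system is created and mounted with its default parameters):
fdisk /dev/xvdb        # interactively create /dev/xvdb1, /dev/xvdb2 and /dev/xvdb3
mkfs -t ext3 /dev/xvdb1
mkfs -t ext4 /dev/xvdb2
mkfs.xfs /dev/xvdb3    # requires the xfsprogs package
mkdir -p /ext3 /ext4 /xfs
mount /dev/xvdb1 /ext3
mount /dev/xvdb2 /ext4
mount /dev/xvdb3 /xfs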
3.6 BENCHMARKING TOOLS
This research uses a couple of storage benchmarking tools. Some are quite simple
while others strive to show a more real-world I/O profile. In this section we discuss
the macro-benchmarks and micro-benchmarks we use to understand the potential performance
impact of file systems on realistic workloads and on read/write operations, from
which we were able to observe significant performance impacts.
3.6.1 MACRO BENCHMARKS
Our main objective is to understand how much of a performance impact different
workloads have on different storage settings with different file systems. As mentioned
before, we use the EXT3, EXT4 and XFS file systems on both the EBS and Instance store
VMs. For the macro-benchmark tests we use the FileBench benchmarking
tool to test each storage option with different workloads.
FileBench Benchmarking
FileBench lets you create realistic profiles, or configurations, that emulate the real-
world behavior of your own applications. FileBench comes with a batch of prefab
profiles for various servers and functions, such as mail, Web, and fileservers, and
tasks such as reading, writing, copying and deleting files.
FileBench comes with its own lean and straightforward language, .f, for creating
profiles. We use FileBench [13] to generate macro-benchmarks of
different I/O transaction characteristics controlled by predefined parameters, such as
the number of files to be used, the average file size, and the I/O buffer size. FileBench can
emulate different workloads with its flexible Workload Model Language (WML)
[13].
WML describes a workload; a workload description is called a personality.
Personalities define one or more groups of file system operations (e.g., read, write,
append, stat) to be executed by multiple threads. Each thread performs the group of
operations repeatedly over a configurable period of time. At the end of the run,
FileBench reports the total number of performed operations. WML allows one to
specify synchronization points between threads and the amount of memory used by
each thread, to emulate real-world applications more accurately. Personalities also
describe the directory structure(s) typical for a specific workload: average file size,
directory depth, the total number of files, and the alpha parameters governing the file and
directory sizes, which are based on a gamma random distribution [13].
Since FileBench supports synchronization between threads to simulate concurrent
and sequential I/Os, we use this tool to create four server workloads: a file server, a
web server, a mail server, and a database server.
• File server: Emulates a file service. File operations are a mixture of create, delete,
append, read, write, and attribute on files of various sizes.
• Web server: Emulates a web service. File operations are dominated by reads: open,
read, and close. Writing to the web log file is emulated by having one append
operation per open.
• Mail server: Emulates an e-mail service. File operations are within a single
directory consisting of I/O sequences such as open/read/close, open/append/close, and
delete.
• Database server: Emulates the I/O characteristic of Oracle 9i. File operations are
mostly read and write on small files. To simulate database logging, a stream of
synchronous writes is used. This workload generates a dataset with a specified
number of directories and files using a gamma distribution to determine the number
of sub-directories and files. It then spawns a specified number of threads where each
thread performs a sequence of open, read entire file and close operations over a
chosen number of files, outputting resulting data to a logfile.
3.6.2 MICRO BENCHMARKS
FIO: In this part, we discuss the micro-level benchmark FIO [4]. With FIO we examine
disk I/O workloads. As a highly configurable benchmark, FIO defines a test case
based on different I/O transaction characteristics, such as total I/O size, block size,
degree of I/O parallelism, and I/O mode. Our focus is on the performance
variation of primitive I/O operations, such as read and write. With the combination of
these I/O operations and two I/O patterns, random and sequential, we design four test
cases: random read, random write, sequential read, and sequential write.
Fio is an I/O tool meant to be used both for benchmark and stress/hardware
verification. It has support for 19 different types of I/O engines (sync, mmap, libaio,
posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio, and more), I/O
priorities, rate I/O, forked or threaded jobs, and much more. It can work on block
devices as well as files. Fio accepts job descriptions in a simple-to-understand text
format. Fio displays all sorts of I/O performance information, including complete IO
latencies and percentiles. It supports Linux, FreeBSD, NetBSD, OpenBSD, OS X,
OpenSolaris, AIX, HP-UX, Android, and Windows.
CHAPTER 4
EXPERIMENT RESULTS & ANALYSIS
4.1. FILEBENCH RESULTS
The specific parameters of each workload are listed in Table 2. The experimental
working set size is configured to be much larger than the size of the page cache in the VM.
Services          # Files   # Threads   File size   I/O size
File server       50000     50          128kb       8k-512k
Mail server       50000     16          8k          8k-512k
Web server        50000     100         16k         8k-512k
Database server   8         200         1g          8k-512k
Table 2 FileBench Workloads
Run FileBench: FileBench is quick to set up and use, unlike many of the commercial
benchmarks which it can emulate. It is also a handy tool for benchmarking
storage subsystems and studying the relationships of complex applications, such as
relational databases, with their storage, without having to incur the costs of setting up
those applications, loading data and so forth. FileBench uses loadable workload
personalities in a common framework to allow easy emulation of complex
applications upon file systems. The workload personalities use a Workload Definition
Language to define the workload's model.
We first load the required workload with the load command, then set parameters such
as the directory to use (a file system must be mounted on this directory beforehand), the
file size, the number of files, the number of threads, the I/O size and the number of
seconds to run the workload. In response, FileBench first creates a file system tree
with the properties we defined. After the specified time has elapsed, it reports results
such as the number of operations, operations per second, latency in ms, and so on.
An example session is sketched below.
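A minimal sketch of such a session for the file server personality (variable names follow the parameters in Table 2 but may differ slightly between FileBench versions):
filebench> load fileserver
filebench> set $dir=/ext3
filebench> set $nfiles=50000
filebench> set $nthreads=50
filebench> set $filesize=128k
filebench> set $iosize=8k
filebench> run 60
... IO Summary: total operations, ops/s, MB/s and latency (ms) are reported here ...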
4.1.1 EBS
There are three metrics of storage performance: latency, IOPS (Input/Output
Operations Per Second), and throughput. Understanding the relationships between
these metrics is the key to understanding storage performance. Latency is the
combined delay between an input or command and the desired output; in a computer
system, latency is often used to mean any delay or waiting that increases real or
perceived response time beyond what is desired. IOPS are nothing more than the
number of I/O transactions that can be performed in a single second. Throughput is the
amount of data transferred from one place to another, or processed, in a specified amount
of time; data transfer rates for disk drives and networks are measured in terms of
throughput, typically in kbps, Mbps and Gbps.
IOPS: How often a storage device can perform I/O tasks is measured in Input/Output
Operations Per Second (IOPS), and varies depending on the type of I/O being done.
IOPS tells us how quickly each drive can process I/O requests; the greater the number
of IOPS, the better the performance. In this section we examine the IOPS of each workload on
the different file systems.
[Graph panels: File Server, Web server, Mail Server and DB Server; IOPS (y-axis) versus IO size 8k-512k (x-axis) for the EXT3, EXT4 and XFS file systems.]
Graph 4.1:1 IOPS of different file systems with different workloads on EBS
From Graph 4.1:1, we observed that the Mail server and Web server workloads were not
affected much by the increase in IO size. On these two workloads, the three file systems
behaved almost the same, with only slight variation in the number of IOPS. When we
closely observe both workloads, the Web server delivers better performance than the
File server. Also, as the IO size increases, the number of IOPS decreases for the
File server and Web server with the EXT4 file system, whereas for EXT3 and XFS on
the File server and Web server workloads we got slightly increased IOPS.
We saw significant variations in IOPS for the DB server. As the IO size increases, say
from 8k to 512k, the number of IOPS increases for every file system; for the XFS file
system it is almost a 130% increase in IOPS at 512k. EXT3 and EXT4 also got increased
IOPS at 512k, but with much smaller improvements.
When we analyze each file system, EXT3, EXT4 and XFS all performed well with the
Web server workload. At the same IO size, EXT3 performed about 200 times better for
the Web server than for the DB server, and for the same number of files and the same
IO size the Web server performed almost 2 times better than the Mail server.
Latency: The size of the I/O transfers can also impact the latency of a transfer,
because larger I/O transfers take longer to complete. Performance is best when
latency is lowest.
[Graph panels: Web Server, File Server, DB Server and Mail Server; latency (ms, y-axis) per file system (EXT3, EXT4, XFS) for IO sizes 8k-512k.]
Graph 4.1:2 Latency of different filesystems with different workloads on EBS
Database Server:
From Graph 4.1:2, the Database server workload works pretty well with EBS storage and
results in lower latencies than the other workloads. We found that EXT3 and EXT4 produced
similar results for the DB server, with low latencies. XFS also reached latencies as low as
EXT3 and EXT4 when we increased the block size from 16k to 512k. We
suggest that when a user wants to run a DB server workload, it is better to use either
EXT3 or EXT4.
Mail Server:
For the email (Mail server) workload, EXT4 works better than the other two file
systems, EXT3 and XFS. EXT4 behaves differently for different IO sizes: it shows less
latency as we increase the IO size from 8k to 512k. We can say that for higher IO sizes
it is good to go with the EXT4 file system when dealing with an email workload.
File Server:
With the File server workload, EXT3 behaves well only for small IO sizes. XFS was
not consistent; its latencies went up and down. Compared to EXT3 and XFS, the EXT4
results look clean. Latencies were affected by the size of the IO: with increasing IO
size, EXT4 shows higher latencies. From the File Server results in the graph above,
we can say that for small IO sizes, EXT4 works very well for the File server workload
on EBS storage in Amazon EC2.
Webserver:
When a user runs the Web server workload on these three file systems, we found that XFS
was not affected at all by increasing or decreasing the IO size. EXT4 shows low latency
as the IO size increases from 8k to 512k, and EXT3 did not show any variation from
low IO sizes to high IO sizes. From the results, we suggest that with EBS storage it is
better to mount your Web server workload on EXT4 rather than on EXT3 or XFS.
Workloads Comparison: When we compare EXT3, EXT4 and XFS on the EBS
storage instance, as expected we observed significant differences in IOPS between
workloads. Compared to the File server and Web server, the file systems show a large
difference in IOPS for the Mail server and DB server workloads with different
combinations of IO sizes.
Filesystems: When we analyze each file system, EXT3 and XFS hit very high latency
in the mail server workload.
Bandwidth: Bandwidth determines how fast data can be transferred over time; it is
the amount of data that can be transferred per second.
[Graph: bandwidth (MB/s) of the file server, web server, mail server and db server workloads for IO sizes 8k-512k, grouped by file system (EXT4_EBS, XFS_EBS, ...).]
Graph 4.1:3 Bandwidth of file systems on EBS
We also observed the bandwidth for the different workloads on each file system. From
Graph 4.1:3, the Database server workload performs well, delivering high bandwidth.
The EXT4 file system performed particularly well with the database
server, and the amount of data transferred was higher than with the other file systems.
4.1.2 INSTANCE STORE:
In this section we present the results for Amazon EC2 Instance store performance with different file systems, along with an analysis of these results.
IOPS:
[Graph panels: File Server, Web Server, DB Server and Mail Server; IOPS (y-axis) versus IO size 8k-512k (x-axis) for the EXT3, EXT4 and XFS file systems.]
Graph 4.1:4 IOPS of different filesystems with different workloads on Instance Store
On the Amazon Instance store, we observed different FileBench results. From Graph
4.1:4, we noticed that the Web server workload gave better performance (higher IOPS) than
all the other workloads. For the Web server, as the IO size increases, the number of
IOPS decreases for the EXT4 and XFS file systems, whereas for EXT3 we found a
slight performance improvement. The Web server workload got about 30% more IOPS than
the Mail server for each file system. Similarly, for the File server and DB
server workloads, we observed that the overall performance of XFS increased as the IO
size was reduced from 512kb to 8kb.
With the DB server workload, we found the maximum IOPS at the small IO size (8k) with XFS,
and XFS also behaved better than EXT3 and EXT4 even at the high IO size (512k).
EXT3 and EXT4 got somewhat higher IOPS at 512k than at 8k, but the improvement was small.
When we analyze each file system, EXT3, EXT4 and XFS all performed well with the
Web server workload, as on Amazon EBS storage. For the same number of files and
the same IO size, the Web server performed about 36% better than the Mail server.
The XFS file system performs similarly on the File server and Mail server workloads. Among
all workloads, the Web server workload suits XFS best, and for all workloads except
the DB server workload, EXT4 got the highest performance.
Latency:
[Graph panels: Mail Server, File Server, DB Server and Web Server; latency (ms, y-axis) per file system (EXT3, EXT4, XFS) for IO sizes 8k-512k.]
Graph 4.1:5 Latency of different file systems with different workloads on Instance Store
Mail Server:
From Graph 4.1:5, for the email (Mail server) workload XFS works better and behaves the
same at low and high IO sizes. EXT3 is not affected much as the IO size increases, while
EXT4 shows higher latency when we increase the IO size from 8k to 512k. We can
say that for low IO sizes it is better to choose the XFS file system when dealing
with an email workload.
File Server:
With the File server workload, EXT4 behaves very well at low IO sizes, and XFS also
does a good job at low IO sizes. Compared to EXT3 and XFS, EXT4 shows much lower
latencies, although its latencies rise with increasing IO size. From the File Server results
in the graph above, we can say that for small IO sizes, EXT4 works
very well for the File server workload on the Instance store storage of Amazon EC2.
Database Server:
From our experiment results (Graph 4.1:5), the Database server workload works very
badly with Instance storage and shows higher latencies than the other workloads. We found
that EXT4 produced better results for the DB server, with lower latencies, when we increased
the IO size from 64k to 512k. XFS got higher latencies than EXT3
and EXT4 for every IO size. Here the system is doing high IOPS with high latency; doing
more IOPS would be nice, but the DB needs lower latency in order to see significantly
improved performance. We suggest that when a user wants to run a DB server workload,
it is better to use either EXT3 or EXT4.
Webserver Workload:
When a user runs the Web server workload on EXT3, EXT4 and XFS, we found that none
of them was affected by increasing or decreasing the IO size. EXT4 shows low
latency compared to the other two file systems. Analyzing the results, we suggest that with
Instance storage it is better to mount your Web server workload on EXT4 rather than on
EXT3 or XFS.
Workload Analysis: When we compare different workloads on Instance storage,
we observe some differences in IOPS between workloads. Other than the DB server
workload, all workloads gave low latencies. Among the Mail server, File server and
Web server, the File server shows the highest latency with the EXT3 file system and the Web
server shows the lowest latency with EXT4. Except for the Mail server, latencies were
affected by the size of the IO. From our analysis, we can say the Web server works pretty
well with Instance storage.
File Systems Analysis: When we analyze each file system, EXT4 was the best among
them, with low latencies on the Web server workload. Latencies were affected by the size
of the IO. EXT4 also did a good job on the File server with different IO sizes, and on the DB
server it finished the job in less time when we increased the IO size to 512k. From
our analysis, EXT4 was the best file system for any workload.
BANDWIDTH:
[Graph: bandwidth (Mb/s) of the file server, web server, mail server and db server workloads for IO sizes 8k-512k, grouped by file system (XFS_Instance, EXT4_Instance, EXT3_Instance).]
Graph 4.1:6 Bandwidth of different filesystems with different workloads on Instance Store
As with EBS storage, the DB server workload got the highest bandwidth on Instance storage,
but here XFS performs better than EXT4 and EXT3.
EBS VS INSTANCE STORE:
Bandwidth:
When we compare the EBS and Instance store storage, EXT4 achieved significant
bandwidth on the Mail server, Web server and File server workloads, and XFS achieved
significant bandwidth on the DB server workload.
EBS IOPS: The following two tables show the total number of IOPS we got on each file
system with the different workloads at a small IO size (8k) and a large IO size (512k),
respectively.
For Small IO Sizes:
8k            EXT3 IOPS   EXT4 IOPS   XFS IOPS
FileServer    12143.62    12313.749   12361.546
WebServer     13271.074   13342.91    13284.554
DBServer      57.99       84.987      17.794
MailServer    8649.666    10976.13    8767.098
Table 3 IOPS of different filesystems with different workloads on EBS with 8k
For High IO sizes:
512k          EXT3 IOPS   EXT4 IOPS   XFS IOPS
FileServer    12146.801   12222.435   12245.031
WebServer     13329.213   13154.089   13295.974
DBServer      58.991      88.985      83.987
MailServer    8220.46     10291.505   8817.825
Table 4 IOPS of different filesystems with different workloads on EBS with 512k
From Table 3 and Table 4, for smaller IO sizes XFS delivers very good IOPS on the
different workloads except for the DB server. As we increase the block size from
8k to 512k, EXT4 becomes the best among the file systems. We also noticed that with
the database workload, XFS got better performance when we increased the IO size from
8k to 512k, almost an 80% increase in IOPS.
Instance Store IOPS:
For 8k IO size:
8k            EXT3 IOPS   EXT4 IOPS   XFS IOPS
FileServer    7386.07     12181.359   9382.177
WebServer     12758.949   13133.797   10961.689
DBServer      76.987      81.986      109.975
MailServer    8828.579    9761.338    8116.941
Table 5 IOPS of different filesystems with different workloads on Instance Store with 8k
For 512k IO size:
512k          EXT3 IOPS   EXT4 IOPS   XFS IOPS
FileServer    7078.581    12185.311   8751.845
WebServer     12773.772   13098.564   10955.835
DBServer      82.986      83.987      96.983
MailServer    8858.039    9163.494    8149.452
Table 6 IOPS of different filesystems with different workloads on Instance Store with 512k
An Amazon EC2 instance with an Instance store volume shows significant performance
variations across file systems. For smaller IO sizes like 8k, EXT4 was the
best among all file systems for every workload except the DB server workload, for which
EXT3 got poor performance and XFS got very high IOPS
compared to the other file systems. For a high IO size like 512k, we see better
performance with the EXT4 file system; compared to EXT3 and XFS there is almost a 30 to
40% variation for the File server workload.
EBS vs Instance Store: If we look at the small (8k) IO size, the XFS file system works
very well on EBS for all workloads except the database workload. EXT3 shows a large
IOPS difference, around 64%, when run on EBS rather than on the Instance store. XFS is
next, with about a 31% performance increase on EBS.
4.2 FIO RESULTS
In this section we perform micro-level benchmarking with FIO [4] on the different file
systems on EBS and Instance store. The specific I/O characteristics of these test cases
are listed in Table 7.
Parameters Block size Number of jobs I/O Pattern Runtime
EXT3 4k-1024k 8, 16, 32 Random/Sequential 60
EXT4 4k-1024k 8,16, 32 Random/Sequential 60
XFS 4k-1024k 8,16, 32 Random/Sequential 60
Table 7 FIO Benchmark Parameters for EBS and Instance store
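Before running FIO, each file system has to be created on the target volume and mounted. The commands below are a minimal sketch of that preparation, assuming the volume shows up as /dev/xvdb (as in our blktrace runs) and using the mount points referenced elsewhere in this thesis (/ext3, /ext4, /xfs); only one of the three file systems exists on the device at any given time.

# Assumed device name for the attached volume (EBS in this example).
DEV=/dev/xvdb
mkfs.ext3 $DEV && mkdir -p /ext3 && mount $DEV /ext3     # EXT3 run
# After the EXT3 tests: umount /ext3, then reformat for the next file system:
# mkfs.ext4 $DEV && mkdir -p /ext4 && mount $DEV /ext4   # EXT4 run
# mkfs.xfs -f $DEV && mkdir -p /xfs && mount $DEV /xfs   # XFS run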
The following example shows how to write and run a FIO benchmark with random read/write on XFS. We used block sizes ranging from 4k to 1024k with different numbers of jobs: 8, 16 and 32. Our focus is on the performance variation of primitive I/O operations, namely read and write. Combining these operations with the two I/O patterns, random and sequential, gives four test cases: random read, random write, sequential read and sequential write. FIO was run from 0% reads (i.e. 100% writes) to 100% reads (i.e. 0% writes), both randomly and sequentially.
Example:
./fio --filename=/xfs/4krandreadwrite00 --direct=1 --rw=randrw --refill_buffers --
norandommap --randrepeat=0 --size=1024m --bs=4k --rwmixread=70 --iodepth=8 --
numjobs=8 --runtime=60 --group_reporting --name=4krandreadwrite00
--output=/npudtha/fioresults_xfs/RandReadWrite/4krandreadwrite00
Here rwmixread=int is the percentage of the mixed workload that should be reads; in our example the read percentage is 70, so writes make up the remaining 30%. The above command creates a 1024 MB file and performs 4 KB reads and writes using a 70%/30% split (roughly 7 reads for every 3 writes) within the file, with 8 jobs running at a time. We are observing how many 4k (4096 byte) operations the drive can handle per second, with each block being read or written at a random position. With numjobs=8 there are 8 independent jobs issuing I/O to the drive, and iodepth=8 lets each job keep up to 8 requests in flight. We wrote similar job specifications for the other combinations.
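Since the full matrix of block sizes, read percentages and job counts would be tedious to type by hand, the remaining combinations can be generated with a small wrapper script. The loop below is only a sketch of how the random read/write sweep on XFS could be scripted; the file and directory names follow the example above and are otherwise assumptions.

#!/bin/bash
# Sweep block size, read percentage and job count for random read/write on XFS.
OUT=/npudtha/fioresults_xfs/RandReadWrite      # result directory from the example above
for bs in 4k 8k 16k 32k 64k 128k 1024k; do
  for mix in 0 10 20 30 40 50 60 70 80 90 100; do
    for jobs in 8 16 32; do
      name="${bs}randreadwrite${mix}_${jobs}jobs"
      ./fio --filename=/xfs/$name --direct=1 --rw=randrw --refill_buffers \
            --norandommap --randrepeat=0 --size=1024m --bs=$bs \
            --rwmixread=$mix --iodepth=8 --numjobs=$jobs --runtime=60 \
            --group_reporting --name=$name --output=$OUT/$name
    done
  done
done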
4.2.1 EBS:
Random read/write: Random means the accessed blocks are scattered all over the drive rather than laid out in neat rows or groups, so they take more work to locate. Random IO is the most difficult and time-consuming type of access a storage device must deal with. In this section we discuss random read/write results on EBS storage with the EXT3 file system and different numbers of jobs.
EXT3 IOPS:
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:7 EXT3 Random Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:8 EXT3 Random Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:9 EXT3 Random Read Write 32 jobs
For write dominated workloads (say 90% writes), the 8-job, 16-job and 32-job results vary for the 4k, 8k, 16k and 32k block sizes. This tells us that when the workload is write dominated, choosing an 8k block size (8k x 1383 IOPS) writes more data than 4k (4k x 1472 IOPS); as we increase the block size to 32k (32k x 972 IOPS) the number of IOPS drops, but the overall amount of data written increases. When the block size was increased from 64k to 128k, IOPS decreased by almost 30% (to 525 IOPS), and increasing it further from 128k to 1024k gave only 109 IOPS while still writing more data than at 128k.
For read dominated workloads (say 90% reads), the 8-job, 16-job and 32-job runs give similar IOPS for the 4k, 8k, 16k and 32k block sizes. The results also show that when the workload is read dominated, a 32k block size (32k x 3064 IOPS) reads the most data; as the block size increases to 64k (64k x 2089 IOPS) the number of IOPS drops, but the overall amount of data read still increases. When the block size was increased from 64k to 128k, IOPS dropped to exactly half (1044 IOPS) irrespective of the number of jobs, and when it was increased from 128k to 1024k performance suffered, giving 122 IOPS for 8 jobs, 124 IOPS for 16 jobs and 129 IOPS for 32 jobs; in all cases less data was read than at 128k.
The main observation is that read dominated performance was better than write dominated performance. For example, at a 4k block size the read IOPS were more than double the write IOPS at the 90% read/write points.
Sequential Read & Write: In this section we discuss sequential read/write results on EBS storage with the EXT3 file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:10 EXT3 Sequential Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:12 EXT3 Sequential Read Write 32 jobs
For read dominated workloads (say 90% reads), the 8-job, 16-job and 32-job runs give similar IOPS for the 4k, 8k, 16k and 32k block sizes, and the IOPS are better than for the random workloads. We observed results similar to random read/write except that when the block size was increased from 128k to 1024k the performance degraded. For write dominated workloads (say 90% writes), the IOPS decreased roughly linearly with increasing block size for all job counts. When the block size was increased from 64k to 128k, IOPS dropped to 596, and increasing it from 128k to 1024k gave only 103 IOPS while still writing more data than at 128k.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:11 EXT3 Sequential Read Write 16 jobs
Here we found a surprising result. Disks are divided into linearly addressable regions called sectors, and because sectors that are close in the linear address space are also physically adjacent, reading two neighboring sectors is faster than reading two sectors that are far apart. Sequential reads should therefore outperform random reads, but here we saw the opposite. From our observations, we think the reason is that reading large blocks randomly can beat sequential reads when many jobs run in parallel. The results also show a performance difference between file systems: for a common mix of roughly 60% reads and 40% writes, we observed the best performance when running 32 jobs on EXT4.
EXT4:
Random read/write: In this section we discuss random read/write results on EBS storage with the EXT4 file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:13 EXT4 Rand Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:14 EXT4 Rand Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:15 EXT4 Rand Read Write 32 jobs
With the EXT4 file system, for write dominated workloads (say 90% writes), the 8-job, 16-job and 32-job results vary for the 8k, 16k and 32k block sizes. This tells us that when the workload is write dominated, increasing the block size to 32k (32k x 1094 IOPS) reduces the number of IOPS but increases the overall amount of data written. As with EXT3, IOPS dropped when the block size was increased from 64k to 128k, and increasing it from 128k to 1024k gave only 96 IOPS while still writing more data than at 128k.
For read dominated workloads (say 90% reads), 8 and 16 jobs gave similar results, but at 32 jobs the IOPS were lower for the 4k and 8k block sizes. In this case we can push more data by choosing a 32k block size with 8 jobs. This tells us that when the workload is read dominated, a 32k block size (32k x 3063 IOPS) reads the most data; as the block size increases to 64k (64k x 2089 IOPS) the number of IOPS drops, but the overall amount of data read increases.
For both read dominated and write dominated workloads, increasing the block size moves more data even though the IOPS drop. Again, read dominated workloads performed better than write dominated ones, and the best performance was observed at the lower job counts.
Sequential Read/Write: In this section we discuss sequential read/write results on EBS storage with the EXT4 file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:16 EXT4 Sequential Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:17 EXT4 Sequential Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:18 EXT4 Sequential Read Write 32 jobs
For read dominated workloads (say 90% reads), the 4k, 8k, 16k and 32k block sizes gave similar results at 16 and 32 jobs and somewhat lower results at 8 jobs, and the IOPS were lower than for the corresponding random workloads. We observed results similar to random read/write except that increasing the block size from 64k to 128k roughly halved the IOPS, and increasing it from 128k to 1024k brought the IOPS down drastically. For write dominated workloads (say 90% writes), the IOPS decreased roughly linearly with block size for all job counts; increasing the block size from 64k to 128k dropped the IOPS from 858 to 100 at 32 jobs, and increasing it from 128k to 1024k gave only 97 IOPS while still writing more data than at 128k. Here, as expected, sequential reads performed better than random reads, and EXT4 behaved exactly that way.
XFS:
Random Read/Write: In this section we discuss random read/write results on EBS storage with the XFS file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:19 XFS Rand Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:20 XFS Rand Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:21 XFS Rand Read Write 32 jobs
With the XFS file system, the number of IOPS decreased roughly linearly with increasing block size for write dominated workloads. From 50% reads / 50% writes onward the behaviour changes: the IOPS were similar for the 4k, 8k, 16k and 32k block sizes and dropped from 128k to 1024k. Note, however, that even though the IOPS decreased, more data can still be read at 128k and 1024k. Overall, the random read/write results with XFS were not very different from the previous file systems.
Sequential read/write: In this section we discuss sequential read/write results on EBS storage with the XFS file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:22 XFS Sequential Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:23 XFS Sequential Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:24 XFS Sequential Read Write 32 jobs
For both write dominated and read dominated workloads at 8 jobs, we did not see any variation in IOPS from the 4k block size to the 16k block size. Increasing the block size further, we observed variation in total IOPS. For write dominated workloads (say 90% writes), XFS gave 2844 IOPS at a 32k block size; that is fewer IOPS than at 16k, but more data is written at the 32k block size.
With the same 32k block size, for read dominated workloads (say 90% reads), we did not find any difference in total IOPS; but when we increased the block size from 32k to 64k at the same 90% reads, performance degraded by 30%, from 64k to 128k it decreased by 50%, and from 128k to 1024k the IOPS dropped drastically.
Comparing 8, 16 and 32 jobs, XFS works best with 8 jobs for write dominated workloads at block sizes from 4k to 64k, while at a 128k block size 16 jobs write more data. At 90% reads, XFS gave higher IOPS with 16 jobs than with 8 or 32 jobs.
Comparing random and sequential read/write, sequential read/write gave more IOPS with 8 jobs for write dominated workloads, while for read dominated workloads both patterns behaved in the same manner.
EXT3 vs EXT4 vs XFS on EBS storage:
When we compare the EXT3, EXT4 and XFS file systems on Amazon EC2 with an EBS volume attached for random read/write workloads, for read dominated workloads (90% reads) all file systems gave almost the same number of IOPS (between 3063 and 3064), so for any read dominated workload a user can choose any of these three file systems.
For write dominated workloads, however, we saw a significant difference between the file systems. XFS achieved the most IOPS (1623) at 80% writes, and EXT4 performed better than EXT3 at 80% writes. From our analysis, we suggest that for write dominated workloads it is better to choose XFS rather than EXT3 or EXT4.
When we compare the EXT3, EXT4 and XFS file systems on Amazon EC2 with an EBS volume attached for sequential read/write workloads, for read dominated workloads (90% reads) EXT3 performed better than the other two file systems. For write dominated workloads (90% writes), XFS gave the most IOPS, almost a 50% improvement over the other file systems. In every file system the IOPS decreased as the block size increased. The main observation is that performance varies with both the file system and the block size.
4.2.2 INSTANCE STORE
In this section we present the results of Amazon EC2 Instance Store performance with different file systems, together with an analysis of these results.
EXT3
Random read/write: In this section we discuss random read/write results on Instance Store storage with the EXT3 file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:25 EXT3 Rand Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:26 EXT3 Rand Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:27 EXT3 Rand Read Write 32 jobs
For write dominated workloads (say 90% writes), the 8-job, 16-job and 32-job results vary for the 4k, 8k, 16k and 32k block sizes. When the workload is write dominated, an 8k block size with 8 jobs gives high IOPS, and as the block size increases the IOPS drop while the overall amount of data written increases. When the block size was increased from 64k to 128k, IOPS decreased by about 30% with 8 jobs and by more than 50% with 16 and 32 jobs; increasing the block size from 128k to 1024k produced very low IOPS for all job counts. For read dominated workloads (say 90% reads), we found very high IOPS (40073) with 32 jobs. As in the write dominated case, the IOPS decreased as the block size increased.
The main observation is that read dominated performance was better than write dominated performance. For example, at a 4k block size the read IOPS were more than double the write IOPS at the 90% read/write points. For 8, 16 and 32 jobs, the IOPS started to decrease very rapidly when the block size was increased from 16k to 32k.
Sequential read/write: In this section we discuss sequential read/write results on Instance Store storage with the EXT3 file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:28 EXT3 Sequential Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:29 EXT3 Sequential Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:30 EXT3 Sequential Read Write 32 jobs
For every job count (8, 16 and 32), EXT3 performed much better with write dominated workloads at small block sizes, and performance degraded as the block size increased. Comparing 8, 16 and 32 jobs, EXT3 works best with 16 jobs for write dominated workloads at a 4k block size. For read dominated workloads (90% reads), EXT3 produced higher IOPS with 32 jobs than with 8 or 16 jobs.
EXT4
Random read/write: In this section we discuss random read/write results on Instance Store storage with the EXT4 file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:31 EXT4 Random Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:32 EXT4 Random Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:33 EXT4 Random Read Write 32 jobs
With the EXT4 file system, the performance of both read dominated and write dominated workloads is affected by the block size: as the block size increases, IOPS gradually decrease, so the block size directly determines the number of I/O operations. When the workload is dominated by writes (say 90% writes), the differences between job counts were negligible. For read dominated workloads (say 90% reads), 32 jobs performed better than 8 or 16 jobs. This tells us that when the workload is read dominated, choosing a large block size moves more data; the IOPS drop as the block size increases, but the overall number of bytes transferred grows. Read dominated workloads again performed better than write dominated ones, and the best performance was observed at the higher thread counts.
Sequential read/write: In this section we discuss sequential read/write results on Instance Store storage with the EXT4 file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:34 EXT4 Sequential Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:35 EXT4 Sequential Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:36 EXT4 Sequential Read Write 32 jobs
For read dominated workloads (say 90% reads), the 4k, 8k, 16k and 32k block sizes gave similar results at 16 and 32 jobs and somewhat lower results at 8 jobs, and the IOPS were lower than for the corresponding random workloads. We observed results similar to random read/write except that increasing the block size from 64k to 128k roughly halved the IOPS, and increasing it from 128k to 1024k brought the IOPS down drastically.
For write dominated workloads (say 90% writes), the IOPS decreased roughly linearly with block size for all job counts; increasing the block size from 64k to 128k dropped the IOPS from 858 to 100 at 32 jobs, and increasing it from 128k to 1024k gave only 97 IOPS while still writing more data than at 128k.
Here, as expected, sequential reads performed better than random reads, and EXT4 behaved exactly that way.
XFS:
Random read/write: In this section we discuss random read/write results on Instance Store storage with the XFS file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:37 XFS Random Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:38 XFS Random Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:39 XFS Random Read Write 32 jobs
With the XFS file system at 8 jobs, the IOPS decreased roughly linearly with increasing block size at the 50% read / 50% write mix. With 16 and 32 jobs at the same mix, the IOPS were almost the same for the 16k, 32k and 64k block sizes. Another observation is that read dominated workloads did best at 32 jobs with a small (4k) block size, while write dominated workloads did best at 8 jobs with an 8k block size.
Sequential read/write: In this section we discuss sequential read/write results on Instance Store storage with the XFS file system and different numbers of jobs.
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:40 XFS Sequential Read Write 8 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:41 XFS Sequential Read Write 16 jobs
[Figure: IOPS vs. read percentage for each block size.]
Graph 4.1:42 XFS Sequential Read Write 32 jobs
With sequential workloads, XFS was not affected by the number of threads. XFS performed better with read dominated workloads than with write dominated workloads. It is, however, affected by the block size: as the block size increases, the number of I/O operations the storage can perform per second decreases. Comparing random and sequential read/write, the random read/write workloads performed better than the sequential ones.
EXT3 vs EXT4 vs XFS on EBS & Instance storage:
IOPS, 8 jobs, 4k block size
                  EXT3_EBS  EXT4_EBS  XFS_EBS  EXT3_Instore  EXT4_Instore  XFS_Instore
90% Random Read   3063      3064      3064     27500         26566         25136
90% Random Write  1472      1558      1773     11954         11334         9960
90% Seq Read      3070      3064      3059     12954         12370         25136
90% Seq Write     1685      1516      3059     2931          2396          9960
Table 8 IOPS for 8 jobs & 4k block size- EBS vs Instance store
IOPS, 16 jobs, 4k block size
                  EXT3_EBS  EXT4_EBS  XFS_EBS  EXT3_Instore  EXT4_Instore  XFS_Instore
90% Random Read   3062      3063      3061     35987         33882         33820
90% Random Write  1507      1582      1653     11443         11314         9691
90% Seq Read      3090      3063      3063     16012         15000         33820
90% Seq Write     1712      1685      1791     2717          2609          9691
Table 9 IOPS for 16 jobs & 4k block size- EBS vs Instance store
IOPS, 32 jobs, 4k block size
                  EXT3_EBS  EXT4_EBS  XFS_EBS  EXT3_Instore  EXT4_Instore  XFS_Instore
90% Random Read   3063      3045      3064     40073         37307         38422
90% Random Write  1499      1809      1828     11592         11068         9869
90% Seq Read      3091      3063      3052     15235         14972         38422
90% Seq Write     1712      1682      1801     2712          2449          9869
Table 10 IOPS for 32 jobs & 4k block size- EBS vs Instance store
Tables 8, 9 and 10 compare the EBS and Instance Store settings with different file systems. From our results, the Instance Store setting delivers better IOPS than EBS. Within the Instance Store volumes, XFS did well with sequential workloads, while for random workloads EXT3 performed well.
59
CHAPTER 6
IO ANALYSIS
In this section we perform a block-level analysis of the EXT4 and XFS file systems under sequential and random reads/writes using the blktrace tool, since EXT4 achieved the highest IOPS on EBS storage and XFS the highest IOPS on Instance storage.
For the IO analysis we started with blktrace, a block layer IO tracing mechanism that provides detailed information about request queues. Blktrace must be given the disk to trace (in our case /dev/xvdb) with the -d option. Since blktrace output is not in a human readable format, we use blkparse to make it readable.
As the blkparse output (Appendix) shows, each I/O is printed along with a summary of the operations and how they were processed by the I/O scheduler. This information can be used to figure out the I/O patterns (random reads, random writes, sequential reads, sequential writes, etc.), the size of the I/O operations hitting the physical devices, and the type of workload on the system. We saved the blktrace traces to disk, then ran blkparse on the specified disk (in our case /dev/xvdb) and wrote the combined binary stream to a file (in our case bp_0.bin through bp_6.bin for CPU 0 through CPU 6), since blktrace produces a series of binary files, one per CPU per device. We then ran btt on a file produced by blkparse (btt -i bp_1.bin).
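For reference, the whole tracing pipeline can be driven with three commands. This is only a sketch of the steps described above; it uses the same device (/dev/xvdb) but writes a single combined binary file for btt, whereas in our actual runs the binary output was split per CPU (bp_0.bin through bp_6.bin).

# 1. Trace the block device while the benchmark runs (stop with Ctrl-C).
blktrace -d /dev/xvdb -o xvdb

# 2. Turn the per-CPU traces into readable text and dump a binary stream for btt.
blkparse -i xvdb.blktrace.* -d bp.bin > xvdb_parsed.txt

# 3. Summarize the per-request timings (Q2Q, Q2G, G2I, Q2M, I2D, M2D, D2C, Q2C).
btt -i bp.bin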
Ex: The following is the btt output for bp_1.bin with the EXT4 file system on EBS while the FIO benchmark was running.
==================== All Devices ====================
ALL      MIN           AVG           MAX           N
-------  ------------  ------------  ------------  -----
Q2Q      0.000000302   0.000187836   0.018320058   11608
Q2G      0.000000316   0.000555075   0.021534113   6633
G2I      0.000000537   0.000000884   0.000016285   6633
Q2M      0.000000220   0.000521300   0.020452472   5757
I2D      0.000000262   0.000567899   0.020857786   6825
M2D      0.000003591   0.000305261   0.002367892   5557
D2C      0.000003985   0.009692405   0.021917346   11547
Q2C      0.000685686   0.010587135   0.024264175   11548
Here Q2Q is the time between requests sent to the block layer, Q2G is how long it
takes from the time a block I/O is queued to the time it gets a request allocated for it.
G2I measures how long it takes from the time a request is allocated to the time it is
inserted into the device's queue. Q2M is how long it takes from the time a block I/O
is queued to the time it gets merged with an existing request. I2D is how long it takes
from the time a request is inserted into the device's queue to the time it is actually
issued to the device. M2D is how long it takes from the time a block I/O is merged
with an existing request until the request is issued to the device. D2C is the service time of the request at the device, and Q2C is the total time a request spends in the block layer.
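As a quick cross-check on where the time goes, the average D2C can be compared against the average Q2C in the table above: roughly 0.0097 s of the 0.0106 s total, i.e. about 92% of the block-layer time, is spent at the device itself. A minimal awk sketch of that calculation is shown below; the file name is an assumption for wherever the btt summary was saved.

# Share of total block-layer time (Q2C) spent at the device (D2C),
# read from the AVG column of a saved btt summary (file name assumed).
awk '/^D2C/ {d2c=$3} /^Q2C/ {q2c=$3} END {printf "D2C/Q2C = %.1f%%\n", 100*d2c/q2c}' btt_ext4_ebs.txt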
61
CHAPTER 7
DISCUSSION
We performed an extensive I/O benchmarking study comparing different storage options on Amazon EC2 instances. An Amazon EC2 virtual machine has an ephemeral local disk and the option to mount an Elastic Block Store volume. Typically, the performance of the local disk tends to be slightly higher than that of the corresponding EBS volumes.
As mentioned at the beginning of this thesis, this research aims to answer several important questions raised in the thesis outline: which storage setting delivers the better peak performance, which delivers more consistent performance, and whether either setting is an all-time winner for all workloads or the performance is workload-dependent. Based on the experimental results, performance on these hosts shows a fair amount of variability due to the attached storage settings. From the EBS and Instance Store results we found I/O performance variations due to the block size: as the block size increases, the number of IOPS decreases and hence I/O performance degrades. EC2 performance also varies significantly under different host file systems, and file system performance is affected much more by write than by read operations.
This paper also evaluates the file system's impact on EBS and Instance Store performance. We studied several popular Linux file systems, with various mount and format options, using the FileBench workload generator to emulate four server workloads: web, database, mail, and file server. File system design, implementation, and available features have a significant effect on CPU and disk utilization, and hence on performance. We noticed that the default file system options are often suboptimal, and that carefully matching expected workloads to file system types and options can improve performance.
From Chapter 5 we understood that the Instance Store setting was better than EBS in terms of IOPS. But EBS has some major advantages: we can take proper backups so important data is not lost, and EBS supports scaling. While EBS-backed instances provide a certain level of durability compared to ephemeral storage instances, they can and do fail. When a server does not need EBS to launch, it is better to choose the Instance Store setting, as it is cheaper; EBS-backed AMIs are quite a bit more expensive than using ephemeral storage. EBS volumes are limited to 1 TB of space, so they have to be striped to get bigger volume sizes. EBS is good, but not for backing instances; it is excellent for storing large amounts of data, taking quick snapshot backups, and performing quick restores for disaster recovery.
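Where a single EBS volume is not big or fast enough, the usual approach is to stripe several volumes together with software RAID before creating the file system. The commands below are a minimal sketch, assuming four extra EBS volumes attached as /dev/xvdf through /dev/xvdi (the device names are illustrative):

# Stripe four attached EBS volumes into one RAID-0 device and put a file system on it.
mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
mkfs.xfs /dev/md0
mkdir -p /data
mount /dev/md0 /data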
In the real world we typically will not have a single host pumping I/O into a storage array; more likely, many hosts will be doing I/O operations in parallel. We expect performance degradation for write workloads, but we also observed read performance degradation with some file systems.
Hard drives actually provide plenty of sequential speed, especially when aggregated
into various RAID implementations. Virtualization overhead is felt more in sequential workloads accessed through small block sizes than in random workloads. Sequential
read workloads come into play during OLAP, batch processing, content delivery,
streaming and backup scenarios. Sequential write performance comes into play
during caching, replication, HPC and database logging workloads. Another key
component to analyzing sequential performance is observing latency metrics. The
cloud customers can estimate the expected performance based on the characteristics
of the workload they deploy.
63
CHAPTER 8
FUTURE WORK
We performed an exhaustive study comparing the performance of the Amazon EBS and Amazon Instance Store test beds. However, more work is needed to understand the effect of different instance types (small, medium, large, and extra-large) combined with different storage settings. We plan to expand our study to include other file systems (e.g., Reiser4 and BTRFS), as we believe they offer further optimization opportunities. We also want to determine whether instance size has any impact under different storage settings. In addition, we plan to repeat this work on the UCCS cloud, which will help UCCS students and staff understand the performance implications of the UCCS cloud and choose appropriate file systems for different types of workloads in order to increase performance.
64
CHAPTER 9
CONCLUSION
Proper benchmarking and analysis are tedious, time consuming tasks. We conducted
a comprehensive study of file systems on modern systems, evaluated popular server
workloads, and varied many parameters. We collected and analyzed performance
metrics. We discovered and explained significant variations in performance. We found that XFS worked better than EXT3 and EXT4, and that, in terms of storage setting, the Instance Store setting is better than the EBS setting. We also conclude that there are no universally good configurations for all workloads, and we explained complex behaviors that go against common convention.
Our main objective was to better understand the performance implications of instances with Amazon EBS and Instance Store volumes attached, under real-world workloads and with different file systems mounted on those instances. By examining a large set of file system combinations under various workloads, we have demonstrated the significant performance difference between the two Amazon EC2 storage types; hence, system administrators must choose file systems carefully in order to reap the greatest benefit from the Amazon cloud.
Our preliminary investigation on these two storage settings will help researchers to
better understand critical performance issues in this area, and shed light on finding
more efficient methods of utilizing these storage options. We recommend that servers be tested and optimized for expected workloads before being used in production. Given the
long-running nature of busy Internet servers, software-based optimization techniques
can have significant, cumulative long-term benefits. Understanding the connection
between system-level workloads and the I/O pattern that the drive experiences is
essential to optimizing performance. We hope that our work will motivate system
designers to more carefully analyze the performance issues on Amazon EC2.
66
BIBLIOGRAPHY
[1] Le, Duy, Hai Huang, and Haining Wang. "Understanding performance implications of nested file systems in a virtualized environment." FAST, 2012.
[2] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html
[3] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html
[4] FIO - How to. http://www.bluestop.org/fio/HOWTO.txt
[5] Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T.L., Ho, A., Pratt, I.,
Warfield, A.: Xen and the art of virtualization. In: SOSP. ACM, New York (2003)
[6] Amazon Elastic Compute Cloud –EC2. http://aws.amazon.com/ec2/
[7] http://aws.amazon.com/s3/
[8] IBM Cloud Computing. http://www.ibm.com/cloud-computing/us/en/
[9] http://xfs.org/docs/xfsdocs-xml-dev/XFS_Filesystem_Structure/tmp/en-US/html/
index.html
[10] Filebench: http://linux.die.net/man/1/filebench
[11] EXT3: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/
6/html/Storage_Administration_Guide/ch-ext3.html
[12] EXT4: https://ext4.wiki.kernel.org/index.php/Main_Page
[13] File Bench: http://filebench.sourceforge.net/wiki/index.php/Main_Page
[14] Amazon Instance Types: http://aws.amazon.com/ec2/instance-types/
[15] Alan D. Brunelle. Block I/O layer tracing: blktrace. April 2006. Available from: http://www.gelato.org/pdf/apr2006/gelato_ICE06apr_blktrace_brunelle_hp.pdf [cited May 2009].
[16] Jens Axboe. blktrace manual, 2008.
[17] Jens Axboe. blkparse manual, 2008.
[18] http://linux.die.net/man/1/iostat
[19] http://linux.die.net/man/1/sar
[20] http://linux.die.net/man/1/iotop
[21] www.mimuw.edu.pl/.../gelato_ICE06apr_blktrace_brunelle_hp.pdf
68
APPENDIX A
FIO Benchmark 64k block size with random workload:
[root@ip-10-63-61-88 output]# cat 64krandomreadwrite100
64krandreadwrite100: (g=0): rw=randrw, bs=64K-64K/64K-64K/64K-64K,
ioengine=libaio, iodepth=8
64krandreadwrite100: (g=0): rw=randrw, bs=64K-64K/64K-64K/64K-64K,
ioengine=libaio, iodepth=8
fio-2.1.5
Starting 32 processes
64krandreadwrite100: Laying out IO file(s) (1 file(s) / 1024MB)
64krandreadwrite100: (groupid=0, jobs=32): err= 0: pid=28843: Tue Apr 7 09:27:32
2015
read: io=7473.2MB, bw=127245KB/s, iops=1988, runt= 60140msec
slat (usec): min=5, max=174510, avg=12634.37, stdev=28676.53
clat (msec): min=1, max=409, avg=115.98, stdev=53.82
lat (msec): min=1, max=409, avg=128.61, stdev=50.64
clat percentiles (msec):
| 1.00th=[ 12], 5.00th=[ 17], 10.00th=[ 48], 20.00th=[ 88],
| 30.00th=[ 98], 40.00th=[ 102], 50.00th=[ 106], 60.00th=[ 110],
| 70.00th=[ 123], 80.00th=[ 161], 90.00th=[ 196], 95.00th=[ 202],
| 99.00th=[ 251], 99.50th=[ 265], 99.90th=[ 302], 99.95th=[ 310],
| 99.99th=[ 351]
bw (KB /s): min= 2365, max= 6932, per=3.14%, avg=3989.51, stdev=719.09
lat (msec) : 2=0.01%, 4=0.02%, 10=0.15%, 20=5.97%, 50=4.27%
lat (msec) : 100=24.22%, 250=64.25%, 500=1.11%
cpu : usr=0.01%, sys=0.16%, ctx=124384, majf=0, minf=782
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=99.8%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=119571/w=0/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=8
Run status group 0 (all jobs):
READ: io=7473.2MB, aggrb=127245KB/s, minb=127245KB/s,
maxb=127245KB/s, mint=60140msec, maxt=60140msec
Disk stats (read/write):
xvdb: ios=118609/12, merge=846/10, ticks=9115184/476, in_queue=9123220,
util=99.82%
APPENDIX B
Sample filebench Output:
FILEBENCH benchmark results on different types of servers with 8k IO size
FileServer:
filebench> load fileserver
$meanfilesize=128k
$nthreads=50
$nfiles=50000
$iosize=8k
run 60
IO Summary: 728709 ops, 12143.621 ops/s, (1104/2208 r/w), 295.1mb/s, 581us cpu/op, 1.1ms latency
WebServer:
filebench> load webserver
filebench> set $dir=/ext3
filebench> set $meanfilesize=16k
filebench> set $nthreads=100
filebench> set $nfiles=50000
filebench> set $iosize=8k
filebench> run 60
IO Summary: 796362 ops, 13271.074 ops/s, (4281/429 r/w), 70.3mb/s, 396us cpu/op, 0.2ms latency
MailServer:
filebench> load varmail
filebench> set $dir=/ext3
filebench> set $meanfilesize=8k
filebench> set $nthreads=16
filebench> set $nfiles=50000
filebench> set $iosize=8k
filebench> run 60
IO Summary: 519044 ops, 8649.666 ops/s, (1331/1331 r/w), 23.1mb/s, 458us cpu/op, 3.6ms latency
Db server:
filebench> load mongo
$dir=/ext3
$filesize=1g
$nthreads=200
$nfiles=8
$iosize=8k
run 60
IO Summary: 58 ops, 57.990 ops/s, (9/10 r/w), 13621.7mb/s, 184737us cpu/op, 194.3ms latency
71
APPENDIX C
Sample BlkTrace Output:
root@ip-10-63-61-88 bt]# ./blkparse -i xvdb.blktrace.*
Input file xvdb.blktrace.0 added
Input file xvdb.blktrace.1 added
Input file xvdb.blktrace.2 added
Input file xvdb.blktrace.3 added
Input file xvdb.blktrace.4 added
Input file xvdb.blktrace.5 added
Input file xvdb.blktrace.6 added
Input file xvdb.blktrace.7 added
202,16 3 1 0.000000000 0 C WS 589064 + 8 [0]
202,16 3 2 0.000006284 0 D WS 587680 + 16 [swapper/3]
202,16 1 1 0.000019827 3698 Q WS 589128 + 8 [fio]
202,16 1 2 0.000020718 3698 G WS 589128 + 8 [fio]
202,16 1 3 0.000021076 3698 P N [fio]
202,16 1 4 0.000021808 3698 I WS 589128 + 8 [fio]
202,16 1 5 0.000022071 3698 U N [fio] 1
202,16 1 12 0.000797031 3698 M WS 589144 + 8 [fio]
Etc……….
^CCPU0 (xvdb):
Reads Queued: 0, 0KiB Writes Queued: 97, 388KiB
Read Dispatches: 0, 0KiB Write Dispatches: 14, 56KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 47, 188KiB
Read depth: 0 Write depth: 28
IO unplugs: 50 Timer unplugs: 0
CPU1 (xvdb):
Reads Queued: 0, 0KiB Writes Queued: 119, 476KiB
Read Dispatches: 0, 0KiB Write Dispatches: 12, 48KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 34, 136KiB
Read depth: 0 Write depth: 28
IO unplugs: 85 Timer unplugs: 0
CPU2 (xvdb):
Reads Queued: 0, 0KiB Writes Queued: 157, 628KiB
Read Dispatches: 0, 0KiB Write Dispatches: 6, 24KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 46, 184KiB
Read depth: 0 Write depth: 28
IO unplugs: 111 Timer unplugs: 0
CPU3 (xvdb):
Reads Queued: 0, 0KiB Writes Queued: 1,226, 4,904KiB
Read Dispatches: 0, 0KiB Write Dispatches: 805, 6,296KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 839, 6,444KiB
Read Merges: 0, 0KiB Write Merges: 636, 2,544KiB
Read depth: 0 Write depth: 28
IO unplugs: 588 Timer unplugs: 0
CPU7 (xvdb):
Reads Queued: 0, 0KiB Writes Queued: 8, 32KiB
Read Dispatches: 0, 0KiB Write Dispatches: 2, 8KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 0, 0KiB
Read Merges: 0, 0KiB Write Merges: 5, 20KiB
Read depth: 0 Write depth: 28
IO unplugs: 3 Timer unplugs: 0
Total (xvdb):
Reads Queued: 0, 0KiB Writes Queued: 1,607, 6,428KiB
Read Dispatches: 0, 0KiB Write Dispatches: 839, 6,432KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 839, 6,444KiB
Read Merges: 0, 0KiB Write Merges: 768, 3,072KiB
IO unplugs: 837 Timer unplugs: 0
Throughput (R/W): 0KiB/s / 22,220KiB/s
Events (xvdb): 7,405 entries
Skips: 0 forward (0 - 0.0%
Trace actions:
C – Complete: A previously issued request has been completed. The output details the sector and size of that request, as well as its success or failure.
D – Issued: A request that previously resided on the block layer queue or in the I/O scheduler has been sent to the driver.
I – Inserted: A request is being sent to the I/O scheduler for addition to the internal queue and later service by the driver. The request is fully formed at this time.
Q – Queued: This notes the intent to queue I/O at the given location. No real request exists yet.
M – Back merge: A previously inserted request exists that ends on the boundary of where this I/O begins, so the I/O scheduler can merge them together.
G – Get request: To send any type of request to a block device, a struct request container must be allocated first.
P – Plug: When I/O is queued to a previously empty block device queue, Linux will plug the queue in anticipation of future I/Os being added before this data is needed.
U – Unplug: Some request data is already queued in the device; start sending requests to the driver. This may happen automatically if a timeout period has passed or if a number of requests have been added to the queue.
APPENDIX D
Sample btt output
[root@ip-10-164-200-68 btt]# ./btt -i bp_0.bin
==================== All Devices ====================
ALL      MIN           AVG           MAX           N
-------  ------------  ------------  ------------  -----
Q2Q      0.000000302   0.000190622   0.018320058   9623
Q2G      0.000000320   0.000646859   0.021534113   5580
G2I      0.000000537   0.000000891   0.000016285   5580
Q2M      0.000000220   0.000586568   0.020452472   4805
I2D      0.000000295   0.000582587   0.020857786   5757
M2D      0.000003591   0.000312210   0.002367892   4618
D2C      0.000003985   0.009535719   0.021917346   9562
Q2C      0.000685686   0.010527590   0.024264175   9563
==================== Device Overhead ====================
       DEV |     Q2G     G2I     Q2M     I2D      D2C
---------- | ------- ------- ------- ------- --------
 (202, 16) | 3.5853% 0.0049% 2.7996% 3.3315% 90.5689%
   Overall | 3.5853% 0.0049% 2.7996% 3.3315% 90.5689%
==================== Device Merge Information ====================
       DEV |   #Q   #D  Ratio | BLKmin BLKavg BLKmax  Total
---------- | ---- ---- ------ | ------ ------ ------ ------
 (202, 16) | 9624 5579    1.7 |      8     14     64  83088
==================== Device Q2Q Seek Information ====================
       DEV | NSEEKS     MEAN  MEDIAN | MODE
---------- | ------ -------- ------- | -------
 (202, 16) |   9624 261907.1       0 | 0(5689)
   Average |   9624 261907.1       0 | 0(5689)
==================== Device D2D Seek Information ====================
       DEV | NSEEKS     MEAN  MEDIAN | MODE
---------- | ------ -------- ------- | -------
 (202, 16) |   5579 451860.7       0 | 0(1306)
   Average |   5579 451860.7       0 | 0(1306)
==================== Plug Information ====================
       DEV | # Plugs # Timer Us | % Time Q Plugged
---------- | ------------------ | ----------------
 (202, 16) | 5574( 0)           | 0.241062860%
       DEV | IOs/Unp IOs/Unp(to)
---------- | ------- -----------
 (202, 16) |     1.0         0.0
   Average |     1.0         0.0
==================== Active Requests At Q Information ====================
       DEV | Avg Reqs @ Q
---------- | ------------
 (202, 16) |          0.6
==================== I/O Active Period Information ====================
       DEV | # Live     Avg. Act     Avg. !Act  % Live
---------- | ------ ------------- ------------- ------
 (202, 16) |      1   0.000000000   0.000000000 100.00
 Total Sys |      1   0.000000000   0.000000000 100.00
# Total System
# Total System : q activity
0.000019827 0.0
0.000019827 0.4
1.834373837 0.4
1.834373837 0.0
# Total System : c activity
0.000773561 0.5
0.000773561 0.9
1.834348980 0.9
1.834348980 0.5
# Per process
# blktrace : q activity
# blktrace : c activity
0.503369468 1.5
0.503369468 1.9
0.507339801 1.9
0.507339801 1.5
1.531444631 1.5
1.531444631 1.9
1.531542381 1.9
1.531542381 1.5
# fio : q activity
0.000019827 2.0
0.000019827 2.4
1.834373837 2.4
1.834373837 2.0
# fio : c activity
0.056336745 2.5
0.056336745 2.9
0.298750693 2.9
0.298750693 2.5
0.438779299 2.5
0.438779299 2.9
1.532828650 2.9
1.532828650 2.5
1.643768755 2.5
1.643768755 2.9
1.732713450 2.9
1.732713450 2.5
# jbd2 : q activity
0.163400885 3.0
0.163400885 3.4
0.185009019 3.4
0.185009019 3.0
# jbd2 : c activity
# kernel : q activity
# kernel : c activity
0.000773561 4.5
0.000773561 4.9
1.834348980 4.9
1.834348980 4.5