2018
AUTOMATIC BACKUP SYSTEM FOR CEPH IN OPENSTACK
NAZARUL AIMAN BIN NORDIN
BACHELOR OF COMPUTER SCIENCE (NETWORK SECURITY)
WITH HONOURS
UNIVERSITI SULTAN ZAINAL ABIDIN
2018
AUTOMATIC BACKUP SYSTEM FOR CEPH IN OPENSTACK
NAZARUL AIMAN BIN NORDIN
BACHELOR OF COMPUTER SCIENCE (NETWORK SECURITY)
WITH HONOURS
FACULTY OF INFORMATICS AND COMPUTING
UNIVERSITI SULTAN ZAINAL ABIDIN,
TERENGGANU, MALAYSIA
DECLARATION
This dissertation is submitted as partial fulfillment for the award of a Bachelor of Computer Science (Network Security) with Honours at Universiti Sultan Zainal Abidin (UniSZA). The work is the result of my own investigations. All sections of the text and results which have been obtained from other workers/sources are fully referenced. I understand that cheating and plagiarism constitute a breach of University regulations and will be dealt with accordingly.
Signature :
Name : Nazarul Aiman Bin Nordin
Date :
CONFIRMATION
This is to confirm that the research conducted and the writing of this report were carried out under my supervision.
Signature :
Name : Dr. Wan Nor Shuhadah Binti Wan Nik
Date :
ACKNOWLEDGEMENT
In the name of God, Allah, the Most Gracious and the Most Merciful.
I would like to express my deepest appreciation to Dr. Wan Nor Shuhadah Binti Wan Nik, my supervisor, for being willing to spend her time guiding me throughout the preparation of my final year project.
I also want to thank my parents and my family for giving me moral support and encouragement whenever I felt like giving up. I also give special thanks to all the lecturers of the Faculty of Informatics and Computing for their support during my final year project. Sincere thanks to my fellow friends for always being by my side and helping me successfully complete my final year project.
May Allah S.W.T. bless all efforts in completing this final year project.
Thank you.
ABSTRACT
Cloud computing provides space for the use of various services, such as software development platforms, servers, storage and software, over the internet. Cloud storage involves storing data on multiple virtual servers that are generally hosted by a third party. In a managerial context, open-source software such as OpenStack is introduced to build and manage cloud computing platforms for public and private clouds. OpenStack allows users to deploy virtual machines and other instances that provide various services for managing different tasks in cloud computing over the internet. In the context of storage, open-source software such as Ceph exists as one of the best storage solutions for cloud computing and is also compatible with OpenStack, which handles many tasks in the cloud. Ceph is designed to facilitate highly scalable object, block and file-based storage under one unified system. Certain features in Ceph cannot operate automatically, whereas the world nowadays demands automation in everyday work. Automation is the feature that allows a process or procedure to be carried out automatically with minimum human assistance. In the context of Ceph, all data needs to be backed up manually in order to keep the data safe. With an automatic backup component, data can be backed up and retrieved automatically and safely at the time and date that have been set.
ABSTRAK
Cloud computing provides space for the use of various services, such as software development platforms, servers, storage and software, across the internet. Cloud storage, meanwhile, involves storing data on multiple virtual servers that are usually hosted by a third party. In a management context, open-source software such as OpenStack is introduced to build and manage cloud computing platforms for public and private clouds. OpenStack allows users to deploy virtual machines and other instances that provide various services for managing different tasks in cloud computing over the internet. In the context of storage, new open-source software such as Ceph exists as the best storage solution for cloud computing and is also compatible with OpenStack, which handles many tasks in the cloud. Ceph brings highly scalable object, block and file-based storage together under one system. Certain features in Ceph cannot operate automatically, whereas today's world requires automation in daily work. Automation is therefore the feature that allows a process or procedure to be performed automatically with little human assistance. In the context of Ceph, all data needs to be backed up manually to keep it safe. With one of the automation components, namely automatic backup, data can be backed up and recovered automatically and safely at the time and date that have been set.
TABLE OF CONTENTS
CONTENTS PAGE
DECLARATION i
CONFIRMATION ii
ACKNOWLEDGEMENT iii
ABSTRACT iv
ABSTRAK v
TABLE OF CONTENTS vi
LIST OF TABLES viii
LIST OF FIGURES ix
LIST OF ABBREVIATIONS / TERMS / SYMBOLS x
CHAPTER 1 INTRODUCTION
1.1 Introduction 1
1.2 Problem Statement 2
1.3 Objectives 3
1.4 Scopes 3
1.5 Limitation of Works 4
1.6 Expected Result 4
CHAPTER 2 LITERATURE REVIEW
2.1 Introduction 5
2.2 Cloud Computing 5
2.3 OpenStack 6
2.4 Ceph Deployment, Management & Monitoring 8
2.5 Research of Study 10
2.6 Summary 13
CHAPTER 3 METHODOLOGY
3.1 Introduction 14
3.2 Flowchart 14
3.3 Configure OpenStack 16
3.4 Installing Ceph in OpenStack Environment 18
3.4.1 Integrating Ceph with Glance 18
3.4.2 Integrating Ceph with Cinder 22
3.4.3 Integrating Ceph with Nova 26
3.5 OpenStack Configuration for Automatic Backup in Ceph 34
3.6 Summary 35
REFERENCE 36
LIST OF TABLES
Table No. Title Page
Table 2.1 Comparison of Literature Review 13
LIST OF FIGURES
Figure No. Title Page
Figure 2.1 OpenStack Services 7
Figure 3.1 Flowchart of Automatic Backup of Ceph Storage in OpenStack 14
Figure 3.2 Components in OpenStack Connected to the Dashboard 16
Figure 3.3 Architecture of Components in OpenStack 17
Figure 3.4 Architecture of Network Attached Storage (NAS) 34
LIST OF ABBREVIATIONS / TERMS / SYMBOLS
IPT Institute of Higher Learning
UniSZA Universiti Sultan Zainal Abidin
FIK Faculty of Informatics and Computing
IT Information Technology
CHAPTER 1
INTRODUCTION
1.1 Introduction
Cloud storage involves storing data on multiple virtual servers that are generally hosted by a third party. The data stored in a cloud storage system can be accessed from any location that has an internet connection. When cloud storage is used, some files may sit on a physical server in a different country. Since most users do not know where their physical files are, cloud storage can be thought of as a vague, untouchable thing, much like a cloud itself. Most data accessed via the internet that is not stored on our personal computers is part of the cloud. For example, an internet-based email service like Gmail, Yahoo, or Live can be accessed anywhere that has an internet connection. This is because the data is stored on servers owned by the respective e-mail providers, not on the user's local machine. Although cloud storage is more like an idea than something you can physically touch, the computers that make up this system require physical space. The facilities that house the physical equipment used by cloud storage are called data centres. Sometimes you may also hear the term server farm. Data centres can be anywhere in the world. They are generally warehouses with complex cooling systems to keep the computers from overheating.
OpenStack is a free and open-source cloud computing software platform. It also provides services for managing a cloud environment remotely. It makes horizontal scaling easy, which means that tasks that benefit from running concurrently can easily serve more or fewer users on the fly by just spinning up more instances. For example, a mobile application that needs to communicate with a remote server might be able to divide the work of communicating with each user across many different instances, all communicating with one another but scaling quickly and easily as the application gains more users. OpenStack consists of a group of interrelated projects that control pools of processing, storage, and networking resources. It provides users with methods and support to deploy virtual machines in a remote environment. State in OpenStack is maintained in a centrally managed relational database (MySQL or MariaDB). OpenStack provides all the services needed for Infrastructure as a Service (IaaS). Most importantly, OpenStack allows users to access the source code, make any changes or modifications they need, and freely share these changes back out to the community at large. It also means that OpenStack has the benefit of thousands of developers all over the world working in tandem to develop the strongest, most robust, and most secure product that they can.
Ceph is open-source software designed to facilitate highly scalable object, block and file-based storage under one unified system. Ceph provides the Ceph Block Device, a virtual disk that can be attached to bare-metal Linux-based servers or virtual machines. RADOS (Reliable Autonomic Distributed Object Store), an important component of Ceph, provides block storage capabilities such as snapshots and replication, which can be integrated with OpenStack Block Storage. Ceph also offers CephFS, a POSIX-compliant (Portable Operating System Interface) file system that stores data in the Ceph storage cluster. The file system uses the same clustered back end as Ceph block storage and object storage to store large amounts of data.
1.2 Problem Statement
The proposed system in this project aims to solve problems encountered when users store large amounts of data in a Ceph system. The data stored in the Ceph storage system may be lost due to factors such as accidental deletion, file corruption, drive failure, malware, software bugs, theft, and damage. Redundancy in these cases only protects against drive failure, whereas a true backup protects against every one of these factors. However, manual backup cannot fully solve this problem if the user fails to back up their own data. So the system needs to back up its own data based on a schedule that has been set.
1.3 Objectives
To study and do research on the OpenStack cloud platform and the Ceph storage system respectively.
To configure the OpenStack cloud platform.
To configure Ceph attached to OpenStack.
To create one of the automated features in Ceph, which is auto backup.
To configure a task scheduler for backing up the data in Ceph.
To integrate and test the Ceph storage system running on the OpenStack cloud platform with automation, specifically auto backup.
1.4 Scope
This project will configure and implement a cloud computing platform called OpenStack.
This project will configure and implement Ceph, a storage system that integrates with OpenStack.
The proposed framework targets one of the automated features in Ceph, which is auto backup based on a schedule that has been set up.
This project will integrate the automated feature, auto backup, into the Ceph storage system running on the OpenStack cloud platform.
1.5 Limitation of Works
Maintenance and monitoring need to be done regularly to make sure the system is functional and has enough drive space after the auto-backup system is implemented in Ceph.
The system only serves users of an OpenStack cloud.
1.6 Expected Result
The Ceph storage system can back up the stored data by itself based on the schedule that has been set up.
The data that has been backed up can be restored successfully.
The stored data is more secure because the system can back it up by itself against the factors that can cause data loss.
CHAPTER 2
LITERATURE REVIEW
2.1 Introduction
In this chapter we discuss cloud computing, OpenStack, Ceph and the automated features that can be implemented in a Ceph system, based on previous research. This gives more knowledge and a better understanding of cloud computing and OpenStack, Ceph configuration in OpenStack, and methods of integrating Ceph with automated features.
2.2 Cloud Computing
Cloud computing is internet-based processing that enables data to be delivered between computers and other devices via the internet. Public email services are an example of cloud-based usage, as our email Inbox, Sent, Drafts, Deleted and Contacts folders are hosted on the provider's servers. The SaaS, or cloud computing, landscape has turned into a new generation of capabilities, offering web-based word processing, spreadsheets, presentation tools and data storage alongside public email services. Cloud computing has several characteristics, such as scalability, which makes the cloud more flexible; access, which provides quick self-service access to cloud users; interfaces, which define how two applications or data sources communicate with each other; and usage. Cloud computing is usually classified into three categories: Infrastructure as a Service (IaaS), which provides storage and computing resources; Platform as a Service (PaaS), which is a "black box" infrastructure that developers can build applications on top of; and Software as a Service (SaaS), where a service provider hosts software on off-site servers.
2.3 OpenStack
The article by Sefraoui, Aissaoui and Eleuldj (2012) studied the different solutions that exist for the deployment of open-source clouds. The open-source platforms considered in that research are Eucalyptus, OpenNebula and OpenStack. Eucalyptus is a solution that allows the installation of private and hybrid cloud infrastructure, while OpenNebula is open source under the Apache 2 licence and is used for data centres and private clouds. The main focus is OpenStack, which has special open-source characteristics such as being scalable, compatible, flexible and open. OpenStack is made up of a unique architecture built from several components, but three components are most related to this project: OpenStack Compute, Image and Block Storage. OpenStack Compute, called Nova, is a management platform for a large network of virtual machines, with a redundant and scalable architecture that controls the infrastructure of IaaS clouds. There are seven main components in Nova. The API server handles web page requests from users, processes the requests and returns the data. Other components are the Message Queue, which dispatches the exchanged instructions to facilitate communication; the Compute Controller, which creates and manipulates virtual servers; the Object Store, which provides storage services; and the Volume Controller, which handles the assignment, attachment and manipulation of volumes. The Network Controller in Nova creates bridges, VLANs, DHCP, DNS and firewall rules. The last component is the Scheduler, which distributes tasks and determines where they should be executed. The next component in the OpenStack architecture is the OpenStack Imaging Service, called Glance, which provides storage services and records and distributes images to virtual machine disks. Another component in the OpenStack architecture is the OpenStack Block Storage service, called Cinder, which is able to access specific locations on disk drives. Other components in the OpenStack architecture are added to fill current demand. So this article concludes that OpenStack is designed to allow administrators and researchers to deploy IaaS infrastructure and provides tools for creating and managing virtual machines on top of existing resources.
Figure 2.1. OpenStack Services
2.4 Ceph Deployment, Management and Monitoring
Ceph is the best solution for block storage in OpenStack, as its cluster can serve all types of data: objects, block devices and files. Ceph can back the storage for Cinder volumes, Nova ephemeral disks and Glance images. Ceph can also replace Swift, the service that manages storage space, via the compatible Reliable Autonomic Distributed Object Store Gateway (RADOSGW). Ceph deployment requires hardware and network planning. The hardware needed to deploy Ceph includes Object Storage Devices (OSDs), monitors and metadata servers. This article focuses on real-world production clusters, so ceph-deploy is the recommended open-source deployment tool for Ceph, while Ceph plugins are used for the monitoring process.
However, there are several Ceph operational concerns, such as clock drift and network latency, splitting placement groups, low storage density, data locality and the bucket hierarchy for failure domains. Ceph is sensitive to clock drift and network latency; therefore it is preferable to install NTP on Ceph nodes, especially those running Ceph Monitors. Ceph also allows the number of placement groups to be increased in small increments by updating the corresponding pool's parameter in order to shorten recovery operations. Low storage density in Ceph is caused by a replication factor equal to three; as a solution, this article suggests using erasure coding to transform the original object into smaller chunks in order to reduce the replication factor below three. Next, for data locality, Ceph makes sure that the primary Object Storage Device (OSD) is located on the same server as the client. The bucket hierarchy in the Controlled Replication Under Scalable Hashing (CRUSH) map is able to reflect the actual infrastructure of a data centre or federated data centres.
This article also discusses Ceph troubleshooting when power failures occur, as well as OSD problems such as Ceph running full, OSD outages, lost OSD journals and stale placement groups. Whenever a monitor fails to work, this article suggests removing the monitor from the cluster and adding a new monitor. For Ceph performance, an incorrect or non-optimal configuration will result in slow data/journal reads and writes, unresponsive OSDs, and slow backfill and recovery operations. So certain recommendations need to be considered in a Ceph cluster regarding storage, the file system, placement groups, RAM/CPU, the network and RADOSGW.
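As an illustration of the deployment approach described in this article, a minimal ceph-deploy bootstrap of a small cluster might look like the sketch below. The hostnames ceph-node1 to ceph-node3 and the data disk /dev/sdb are assumptions made here only for illustration, and the exact syntax varies between ceph-deploy releases:
$ ceph-deploy new ceph-node1
$ ceph-deploy install ceph-node1 ceph-node2 ceph-node3
$ ceph-deploy mon create-initial
$ ceph-deploy osd create ceph-node1:/dev/sdb ceph-node2:/dev/sdb ceph-node3:/dev/sdb
$ ceph-deploy admin ceph-node1 ceph-node2 ceph-node3
$ ceph health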
2.5 RESEARCH OF STUDIES
A few reviews of existing research papers were made. The research papers on cloud computing, OpenStack and Ceph mainly focus on the techniques used in each system. The table below includes descriptions of the methods used in each article.
Table 2.1: Comparison of Literature Review
AUTHOR: Carl Mazzanti
YEAR: -
TITLE: How Does Cloud Computing Work
SUMMARY: Cloud computing is internet-based processing that enables data to be delivered between computers and other devices via the internet. The article focuses on Infrastructure as a Service (IaaS), which provides storage and computing resources; Platform as a Service (PaaS), which is a "black box" infrastructure that developers can build applications on top of; and Software as a Service (SaaS), where a service provider hosts software on off-site servers.

AUTHOR: Sefraoui, O., Aissaoui, M., & Eleuldj, M.
YEAR: 2012
TITLE: OpenStack: Toward an Open-Source Solution for Cloud Computing
SUMMARY: OpenStack has special open-source characteristics such as being scalable, compatible, flexible and open. OpenStack is built from a unique architecture with three main components: OpenStack Compute, Image and Object. The article introduced me to the many components that exist in OpenStack, but the most relevant to my project are OpenStack Nova, Glance and Cinder. Those three are very compatible for the integration process with Ceph.

AUTHOR: Avishay Traeger
YEAR: -
TITLE: Deploying Ceph with OpenStack
SUMMARY: Ceph is the best solution for block storage in OpenStack, as its cluster can serve all types of data: objects, block devices and files. Ceph can back the storage for Cinder volumes, Nova ephemeral disks and Glance images. These three services in OpenStack are very compatible with Ceph.

AUTHOR: Avishay Traeger
YEAR: -
TITLE: Integrating Ceph Storage with OpenStack
SUMMARY: The step-by-step integration of Ceph with OpenStack is explained well, and all the configuration steps are very easy to understand. Three services in OpenStack, namely Nova, Glance and Cinder, are explained in terms of how suitable they are for integration with Ceph storage.
2.6 SUMMARY
Based on this chapter, it can be concluded that reviewing previous research is an important step before starting a newly proposed project, as it gives more knowledge about the topic and about how previous researchers carried out their work. It gives knowledge about cloud computing, OpenStack and how to install and run OpenStack from start to finish, and it gives knowledge about Ceph and which OpenStack integrations are compatible with Ceph. This is also important so that the same mistakes, ideas and techniques are not repeated.
CHAPTER 3
METHODOLOGY
3.1 INTRODUCTION
The development of this system is carried out by applying a methodology, which refers to a way of finding the result of a problem on a selected topic. The outcome of the method can differ from case to case.
3.2 Flowchart
A flowchart is a diagram of the sequence of movements or actions of people or things involved in a complex system. The proposed method starts with the installation of VirtualBox, because all the work needs to be done in a Linux environment in order to easily configure OpenStack and Ceph. CentOS 7 is chosen as the Linux environment because it provides the right base for this project. CentOS 7 is installed in VirtualBox, and all the agreements regarding CentOS 7 services need to be followed for the installation to succeed. After that, OpenStack Mitaka is deployed on CentOS 7 with the configuration of the three main architectural components prepared in OpenStack, namely Cinder, Nova and Glance. These components support the next configuration in this flowchart, which is Ceph. OpenStack Glance helps to store images in a Ceph Block Device, and Cinder to boot a VM using a copy-on-write clone of an image. That is why OpenStack and the Ceph project are compatible with each other. Lastly, we enter the tuning phase, where configuration of Ceph in OpenStack is needed in order to create an automated feature in the Ceph environment. The backup feature is chosen for automation because Ceph is well known as a storage solution for cloud computing. So, to maintain this strength, Ceph needs to be more efficient in storing data and keeping it available when needed. A minimal sketch of such an OpenStack Mitaka deployment on CentOS 7 is given after the figure below.
Figure 3.1 Flowchart of Automatic Backup of Ceph Storage in OpenStack
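One common way to perform such a deployment, assumed here purely as a sketch, is the RDO Packstack all-in-one installer for OpenStack Mitaka on CentOS 7; a production deployment would instead split services across nodes and use a customised answers file:
$ sudo yum install -y centos-release-openstack-mitaka
$ sudo yum update -y
$ sudo yum install -y openstack-packstack
$ sudo packstack --allinone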
3.3 Configure OpenStack
First of all, we have to configure our own OpenStack platform. We need to configure all of the components required in OpenStack, such as the dashboard, key pairs, identity, network, block storage, compute, image, object storage and a few more components that are needed for the proposed project. All OpenStack compute, storage and network services can be managed through OpenStack Horizon, which is a dashboard prepared for administrators and users to access. Figure 3.2 shows how the dashboard is connected to all the components in OpenStack.
Figure 3.2 Components in OpenStack Connected to the Dashboard
When all the needed components have been set up, a key pair must be created to obtain the public key. The key pair is one of the security mechanisms provided by OpenStack. At this point OpenStack Keystone takes part, as it is the authentication and authorization system used by all components in OpenStack. It provides a central directory that maps users to the various OpenStack services, since the OpenStack environment allows users to access it from anywhere. It also keeps a catalogue of every service that runs in the cloud. An example of creating a key pair from the command line is shown below.
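For example, assuming the OpenStack command-line client is installed and the admin credentials have been sourced, a key pair named mykey (an illustrative name) can be created and listed as follows; the private key is printed to standard output and must be kept safe:
$ openstack keypair create mykey > mykey.pem
$ chmod 600 mykey.pem
$ openstack keypair list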
After that, we configure one by one the components that are connected to the dashboard, as shown in Figure 3.2. We configure the Image Service, OpenStack Glance, to manage the various server images in the cloud. It is a repository of all the images that are available in the cloud to be launched as servers. Then we continue our configuration with OpenStack Neutron in order to create virtual networks and network interfaces; Neutron can also attach to proprietary vendor networking products. Next, we configure OpenStack Cinder, which acts as the block storage component and helps in accessing specific locations on a disk drive. We continue again to configure OpenStack Nova, as it fetches VM images from OpenStack Glance and helps in deploying and managing a large number of virtual machines or instances in order to handle computing tasks. Last but not least, we configure OpenStack Swift, which stores and retrieves objects such as images using REST web services, similar to Amazon S3. Once these components are configured, their registration with Keystone can be verified as in the example below.
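As a quick sanity check, the services registered in the Keystone catalogue can be listed with the OpenStack client; the output below is only illustrative of a deployment where all of the components mentioned above are present:
$ openstack service list
+----------------------------------+----------+--------------+
| ID                               | Name     | Type         |
+----------------------------------+----------+--------------+
| ...                              | keystone | identity     |
| ...                              | glance   | image        |
| ...                              | neutron  | network      |
| ...                              | cinder   | volume       |
| ...                              | nova     | compute      |
| ...                              | swift    | object-store |
+----------------------------------+----------+--------------+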
Figure 3.3 shows the architecture of some of the important components in OpenStack.
Figure 3.3 Architecture of Components in OpenStack
3.4 Installing Ceph in OpenStack Environment
3.4.1 Integrating Ceph with Glance
Glance supports several storage backends, such as a locally mounted filesystem, Amazon S3,
OpenStack Swift, and Ceph RBD. Use the following steps to use Ceph RBD as a Glance storage
backend:
Note: It is good practice to use a dedicated Ceph pool for Glance. A Ceph pool is a logical partition for storing objects. Ceph allows you to have separate settings, such as permissions, number of replicas, CRUSH rules and so on, per pool. By default, this pool will share all the available Ceph OSDs with other pools; however, you can, for example, configure Ceph to use dedicated Ceph OSDs (disks) for this pool.
The following command will create a new Ceph pool, where ‘images’ is the name and ‘64’ is the
number of placement groups for this pool:
$ ceph osd pool create images 64
pool 'images' created
Note that you can use any valid name for the pool instead of 'images'. In that case, use your pool's name in the following commands. Also, the number of placement groups should be appropriate for your deployment (read more about configuring placement groups in the Ceph documentation).
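A commonly used rule of thumb, given here only as a rough sketch, is that the total number of placement groups is about (number of OSDs x 100) divided by the replica count, rounded to a power of two and shared among the pools. For a hypothetical cluster with 9 OSDs and 3 replicas this gives roughly 300 PGs in total, so a single pool might later be grown like this:
$ ceph osd pool set images pg_num 128
$ ceph osd pool set images pgp_num 128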
Create a new Ceph user for Glance and set the appropriate permissions for the user and
pool:
$ ceph auth get-or-create client.images \
mon 'allow r' \
osd 'allow class-read object_prefix rbd_children, \
allow rwx pool=images' \
-o /etc/ceph/ceph.client.images.keyring
Copy the keyring file from the Ceph node to the nodes where Glance services
(specifically glance-api and glance-registry) are deployed:
$ cat /etc/ceph/ceph.client.images.keyring | \
ssh <node> "sudo tee /etc/ceph/ceph.client.images.keyring"
On that OpenStack node, set the appropriate permissions for the keyring file:
$ sudo chgrp glance /etc/ceph/ceph.client.images.keyring
$ sudo chmod 0640 /etc/ceph/ceph.client.images.keyring
On that OpenStack node, edit /etc/ceph/ceph.conf and add the following lines:
[client.images]
keyring = /etc/ceph/ceph.client.images.keyring
On that OpenStack node, edit the file /etc/glance/glance-api.conf.
In the [glance_store] section, set the following parameters, making sure to change the existing parameters and add new ones:
stores = glance.store.rbd.Store,glance.store.http.Store
default_store = rbd
rbd_store_pool = images
rbd_store_user = images
rbd_store_ceph_conf = /etc/ceph/ceph.conf
Then restart glance-api service and verify that the integration with Ceph works:
$ sudo service glance-api restart
Download a test image:
$ wget \
http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img
The downloaded image is in QCOW2 format. Consider converting the image’s format to
“raw”. This format is simply the disk’s raw data, as is, and therefore the image will generally
consume more disk space and take longer to upload. However, if the image is uploaded as
“raw”, boot disks can later be created from the image using copy-on-write snapshots. To
convert the image to RAW format we will use qemu-img tool from qemu-utils package:
$ qemu-img convert \
cirros-0.3.4-x86_64-disk.img \
cirros-0.3.4-x86_64-disk.raw
Upload the image to Glance:
$ openstack image create \
--file cirros-0.3.4-x86_64-disk.raw \
--container-format bare \
--disk-format raw \
--public \
cirros-0.3.4
Check that the image exists in Glance:
$ openstack image list
+--------------------------------------+--------------+--------+
| ID | Name | Status |
+--------------------------------------+--------------+--------+
| b47824a5-89f3-4b1c-8b39-60f0afa5bd70 | cirros-0.3.4 | active |
+--------------------------------------+--------------+--------+
The image ID will be different in your case; make sure you use the ID from your output in the following commands. Check that the image exists in Ceph:
$ rbd -p images ls
b47824a5-89f3-4b1c-8b39-60f0afa5bd70
Use the image ID from the output above to see the details:
$ rbd -p images info b47824a5-89f3-4b1c-8b39-60f0afa5bd70
rbd image 'b47824a5-89f3-4b1c-8b39-60f0afa5bd70':
size 40162 kB in 5 objects
order 23 (8192 kB objects)
block_name_prefix: rbd_data.6eb03f96e580
format: 2
features: layering, striping
flags:
stripe unit: 8192 kB
stripe count: 1
3.4.2 Integrating Ceph with Cinder Services
To begin integrating Ceph with Cinder, first create a dedicated Ceph pool for
Cinder (‘volumes’ is the name, 64 is the number of placement groups):
$ ceph osd pool create volumes 64
pool 'volumes' created
Note that you can use any valid name for the pool instead of ‘volumes’. In this case, use your pool’s
name in the following commands. Also the number of placement groups should be appropriate for your
deployment.
Create a new Ceph user for Cinder and set the appropriate permissions for the user and
pools:
$ ceph auth get-or-create client.volumes \
mon 'allow r' \
osd 'allow class-read object_prefix rbd_children, \
allow rwx pool=volumes, \
allow rx pool=images' \
-o /etc/ceph/ceph.client.volumes.keyring
Copy the keyring file from Ceph node to the OpenStack node, where the Cinder services
(specifically cinder-api and cinder-volume) are deployed:
$ cat /etc/ceph/ceph.client.volumes.keyring | \
ssh <node> "sudo tee /etc/ceph/ceph.client.volumes.keyring"
On that OpenStack node, set the appropriate permissions for the keyring file:
$ sudo chgrp cinder /etc/ceph/ceph.client.volumes.keyring
$ sudo chmod 0640 /etc/ceph/ceph.client.volumes.keyring
On that OpenStack node, edit /etc/ceph/ceph.conf and add the following lines:
[client.volumes]
keyring = /etc/ceph/ceph.client.volumes.keyring
Generate a UUID (universally unique identifier) that we will need for the Ceph integration
with libvirt:
$ uuidgen
a5d0dd94-57c4-ae55-ffe0-7e3732a24455
On that OpenStack node, edit the file /etc/cinder/cinder.conf and set the following parameters in the corresponding sections, making sure to change the existing parameters and add new ones:
[DEFAULT]
enabled_backends = RBD-backend
[RBD-backend]
volume_backend_name=RBD-backend
rbd_pool=volumes
rbd_user=volumes
rbd_secret_uuid=a5d0dd94-57c4-ae55-ffe0-7e3732a24455
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_ceph_conf=/etc/ceph/ceph.conf
Note that we used the generated UUID for the rbd_secret_uuid parameter.
On that OpenStack node, restart cinder-api and cinder-volume services:
$ sudo service cinder-api restart
$ sudo service cinder-volume restart
On a Ceph node, create the key file and copy it to the hypervisor nodes (where nova-compute is deployed):
$ ceph auth get-key client.volumes | \
ssh <node> "tee client.volumes.key"
On the hypervisor nodes, create a new temporary file ceph.xml:
<secret ephemeral="no" private="no">
<uuid>a5d0dd94-57c4-ae55-ffe0-7e3732a24455</uuid>
<usage type="ceph">
<name>client.volumes secret</name>
</usage>
</secret>
Note that we used the generated UUID for the uuid element. Now define a new secret for
libvirt on the hypervisor node:
$ virsh secret-define --file ceph.xml
Secret a5d0dd94-57c4-ae55-ffe0-7e3732a24455 created
$ virsh secret-set-value \
--secret a5d0dd94-57c4-ae55-ffe0-7e3732a24455 \
--base64 $(cat client.volumes.key)
Delete the temporary files:
$ rm client.volumes.key ceph.xml
Verify that the integration with Ceph works. Create a new volume:
$ openstack volume create --size 1 vol1
Check that the volume exists in Cinder:
$ openstack volume list
+------------------+--------------+-----------+------+-------------+
| ID | Display Name | Status | Size | Attached to |
+------------------+--------------+-----------+------+-------------+
| ...-8d3914468ec7 | vol1 | available | 1 | |
+------------------+--------------+-----------+------+-------------+
Check that the volume exists in Ceph:
$ rbd -p volumes ls
volume-...-8d3914468ec7
Use the ID from the output above to see the volume details.
$ rbd -p volumes info volume-...-8d3914468ec7
rbd image 'volume-...-8d3914468ec7':
size 1024 MB in 256 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.641e2dc70192
format: 2
features: layering, striping
flags:
stripe unit: 4096 kB
stripe count: 1
3.4.3 Integrating Ceph with Nova
Firstly, create a new Ceph pool for Nova ephemeral disks:
$ ceph osd pool create compute 64
pool 'compute' created
Note that you can use any valid name for the pool instead of ‘compute’. In this case, use your pool’s
name in the following commands. Also the number of placement groups should be appropriate for
your deployment.
You will need to create a new Ceph user for Nova and set the appropriate permissions for
the user and pools:
$ ceph auth get-or-create client.compute \
mon 'allow r' \
osd 'allow class-read object_prefix rbd_children, \
allow rwx pool=compute, \
allow rwx pool=volumes, \
allow rx pool=images' \
-o /etc/ceph/ceph.client.compute.keyring
On a Ceph node, copy the keyring file from the Ceph node to the hypervisor nodes (where
the nova-compute services are deployed):
$ cat /etc/ceph/ceph.client.compute.keyring | \
ssh <node> "sudo tee /etc/ceph/ceph.client.compute.keyring"
$ ceph auth get-key client.compute | \
ssh <node> "tee client.compute.key"
On the hypervisor node, set the appropriate permissions for the keyring file:
$ sudo chgrp nova /etc/ceph/ceph.client.compute.keyring
$ sudo chmod 0640 /etc/ceph/ceph.client.compute.keyring
On the hypervisor node, edit /etc/ceph/ceph.conf and add the following lines:
[client.compute]
keyring = /etc/ceph/ceph.client.compute.keyring
Generate a UUID that we will need for Ceph integration with libvirt:
$ uuidgen
5b67401f-dc5e-496a-8456-9a5dc40e7d3c
On the hypervisor node, edit the /etc/nova/nova.conf file and set the following parameters in the corresponding sections, making sure to change the existing parameters and add new ones:
[libvirt]
images_rbd_pool=compute
images_type=rbd
rbd_secret_uuid=5b67401f-dc5e-496a-8456-9a5dc40e7d3c
rbd_user=compute
Note that we used the generated UUID for the rbd_secret_uuid parameter. On the hypervisor
node, restart nova-compute service:
$ sudo service nova-compute restart
On the hypervisor node, create a new temporary file ceph.xml:
<secret ephemeral="no" private="no">
<uuid>5b67401f-dc5e-496a-8456-9a5dc40e7d3c</uuid>
<usage type="ceph">
<name>client.compute secret</name>
</usage>
</secret>
Note that we used the generated UUID for the uuid element. Now define a new secret for
libvirt on the hypervisor node:
$ sudo virsh secret-define --file ceph.xml
Secret 5b67401f-dc5e-496a-8456-9a5dc40e7d3c created
$ sudo virsh secret-set-value \
--secret 5b67401f-dc5e-496a-8456-9a5dc40e7d3c \
--base64 $(cat client.compute.key)
Delete the temporary files:
$ rm client.compute.key ceph.xml
Now let's verify that the Nova integration with Ceph works. List the existing networks:
$ nova net-list
+--------------------------------------+--------------------+------+
| ID | Label | CIDR |
+--------------------------------------+--------------------+------+
| a2612069-44a5-4421-aeb2-9b296cad2d44 | admin_floating_net | None |
| c8a9e13e-c95d-466f-96c0-27898818aec3 | admin_internal_net | None |
+--------------------------------------+--------------------+------+
Use the network ID from the output above and the cirros-0.3.4 image we uploaded earlier to boot a new VM using 'Test' as its name:
$ nova boot --flavor m1.tiny --image cirros-0.3.4 \
Test --nic net-id=c8a9e13e-c95d-466f-96c0-27898818aec3
Check that the VM exists in Nova:
$ nova list
+---------------+------+--------+------------+-------------+---------------+
| ID | Name | Status | Task State | Power State | Networks |
+---------------+------+--------+------------+-------------+---------------+
| c0f90bd2-9f8a | Test | ACTIVE | - | Running | 192.168.111.3 |
+---------------+------+--------+------------+-------------+---------------+
Check that VM’s ephemeral volume exists in Ceph:
$ rbd -p compute ls
c0f90bd2-9f8a_disk
c0f90bd2-9f8a_disk.config
Use the ID from the output above to see the details:
$ rbd -p compute info c0f90bd2-9f8a_disk
rbd image 'c0f90bd2-9f8a_disk':
size 1024 MB in 128 objects
order 23 (8192 kB objects)
block_name_prefix: rbd_data.6eb87110b48c
format: 2
features: layering, striping
flags:
parent: images/b47824a5-89f3-4b1c-8b39-60f0afa5bd70@snap
overlap: 40162 kB
stripe unit: 8192 kB
stripe count: 1
The 'parent' in the output above means that the ephemeral volume is a copy-on-write clone of a snapshot of the cirros-0.3.4 image that we uploaded to Glance (read more about Ceph snapshots in the Ceph documentation). The following command shows that there is a new snapshot of the uploaded image (use the ID from the "Integrating Ceph with Glance" section):
$ rbd snap ls images/b47824a5-89f3-4b1c-8b39-60f0afa5bd70
SNAPID NAME SIZE
1 snap 40162 kB
The following command shows that there is one child volume (clone) of the specified
snapshot:
$ rbd children images/b47824a5-89f3-4b1c-8b39-60f0afa5bd70@snap
compute/c0f90bd2-9f8a_disk
Now let’s create a new bootable persistent volume from the existing cirros-0.3.4 image. We
use ‘bootable-cirros-0.3.4’ as the volume name (you can use any other name) and 1 GB as the
volume size (the last parameter):
$ cinder create --image-id b47824a5-89f3-4b1c-8b39-60f0afa5bd70 \
--display_name=bootable-cirros-0.3.4 1
Check that the new volume exists in Cinder:
$ openstack volume list
+------------------+-----------------------+-----------+------+-------------+
| ID | Display Name | Status | Size | Attached to |
+------------------+-----------------------+-----------+------+-------------+
| ...-c2f272bb8594 | bootable-cirros-0.3.4 | available | 1 | |
| ...-8d3914468ec7 | vol1 | available | 1 | |
+------------------+-----------------------+-----------+------+-------------+
Check that the new volume exists in Ceph:
$ rbd -p volumes ls
volume-...-c2f272bb8594
volume-...-8d3914468ec7
Use the ID from the output above to see the volume details:
$ rbd -p volumes info volume-...-c2f272bb8594
rbd image 'volume-...-c2f272bb8594':
size 1024 MB in 256 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.740369fa6d55
format: 2
features: layering, striping
flags:
parent: images/b47824a5-89f3-4b1c-8b39-60f0afa5bd70@snap
overlap: 40162 kB
stripe unit: 4096 kB
stripe count: 1
The 'parent' in the output above means that the new volume is a copy-on-write clone of the snapshot of the cirros-0.3.4 image we uploaded to Glance. Let's boot a new VM from the volume:
$ nova boot --flavor m1.tiny \
--block-device \
source=volume,id=...-c2f272bb8594,\
dest=volume,size=1,shutdown=preserve,bootindex=0 Test2 \
--nic net-id=c8a9e13e-c95d-466f-96c0-27898818aec3
In the command above we used the same network ID that we used to boot a VM from the image. In addition, we used the '--block-device' parameter to specify the volume ID and that we want to preserve the volume if the VM is deleted. Let's check that the VM is running:
$ nova list
+---------------+-------+--------+------------+-------------+---------------+
| ID | Name | Status | Task State | Power State | Networks |
+---------------+-------+--------+------------+-------------+---------------+
| c0f90bd2-9f8a | Test | ACTIVE | - | Running | 192.168.111.3 |
| b40c5c63-6150 | Test2 | ACTIVE | - | Running | 192.168.111.4 |
+---------------+-------+--------+------------+-------------+---------------+
Check that there is no ephemeral volume in Ceph for that VM, because we have booted it
from a volume:
$ rbd -p compute ls
c0f90bd2-9f8a_disk
c0f90bd2-9f8a_disk.config
b40c5c63-6150_disk.config
Check the VM details:
$ nova show Test2
+----------------------+-------------------------------------------------+
| Property | Value |
...
| id | b40c5c63-6150 |
| image | Attempt to boot from volume - no image supplied |
| name | Test2 |
| os-extended-volumes: | [{"id": "b40c5c63-6150", |
volumes_attached | "delete_on_termination": false}] |
...
+----------------------+-------------------------------------------------+
The output above shows that we have booted the VM from the volume, which is a copy-on-write
clone of the snapshot of the Glance image.
3.5 OpenStack Configuration for Automatic Backup in Ceph
First of all, in order to successfully back up data or files automatically in Ceph, we need to make sure that there is a place to store that data or those files. Recently, Network-Attached Storage (NAS) has become popular among users as a place to save their pictures, videos and other kinds of data or files. A NAS provides space for data or files to be stored over the network using an Ethernet connection, so that the stored data can be accessed or retrieved from anywhere that has a connection. Figure 3.4 shows the architecture of Network-Attached Storage (NAS).
To begin the process, we need to configure OpenStack and Ceph in order to create a path for copying data or files from the Ceph storage to the NAS. Once the processes of backing up and retrieving data or files on the NAS succeed, we configure OpenStack and Ceph again so that the backup and retrieval of data or files are performed automatically. With the help of the cron job scheduler, we can integrate it with OpenStack and Ceph and configure the schedule so that data or files are backed up automatically at the time and date set by the user. A sketch of such a scheduled backup is given after Figure 3.4 below.
Figure 3.4 Architecture of Network Attached Storage (NAS)
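As a concrete sketch of this idea, assume the NAS exports an NFS share at nas.local:/backups that is mounted at /mnt/nas-backup, and that the Ceph pools to protect are images, volumes and compute; all of these names are illustrative assumptions rather than fixed parts of the design. A small script can export every RBD image to the NAS, and a cron entry can run it on the schedule set by the user:
$ sudo mount -t nfs nas.local:/backups /mnt/nas-backup

Contents of /usr/local/bin/ceph-auto-backup.sh (illustrative):

#!/bin/bash
# Export every RBD image in the listed pools to a dated folder on the NAS.
set -e
DEST=/mnt/nas-backup/$(date +%Y-%m-%d)
mkdir -p "$DEST"
for pool in images volumes compute; do
    for image in $(rbd -p "$pool" ls); do
        # rbd export writes the raw image data to a file on the NAS mount
        rbd -p "$pool" export "$image" "$DEST/${pool}_${image}.img"
    done
done

Make the script executable and open the root crontab:

$ sudo chmod +x /usr/local/bin/ceph-auto-backup.sh
$ sudo crontab -e

Add the following line to the crontab so the script runs every day at 2.00 a.m.:

0 2 * * * /usr/local/bin/ceph-auto-backup.sh >> /var/log/ceph-auto-backup.log 2>&1

A backed-up image can later be restored with rbd import; the file name below is only an example:

$ rbd -p volumes import /mnt/nas-backup/2018-05-01/volumes_vol1.img vol1-restored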
3.6 SUMMARY
Methodology is very important in system and application development. There are also many different software development methodologies available that can be used to develop any kind of system. The right methodology can help the system to be completed on time. The processes in each phase of the methodology are explained step by step so that they can be understood easily.
REFERENCE
Sefraoui, O., Aissaoui, M., & Eleuldj, M. (2012). OpenStack: Toward an open-source solution for cloud computing. International Journal of Computer Applications, 55(3), 38-42.
von Laszewski, G., Diaz, J., Wang, F., & Fox, G. C. (2012). Towards cloud deployments using FutureGrid. Indiana University, Bloomington, IN, FutureGrid Draft Paper, April 2012.
Fox, A., Griffith, R., Joseph, A., Katz, R., Konwinski, A., Lee, G., & Stoica, I. (2009). Above the clouds: A Berkeley view of cloud computing. Dept. Electrical Eng. and Comput. Sciences, University of California, Berkeley, Rep. UCB/EECS, 28(13), 2009.
OpenStack API Documentation: October 3, 2018 [online], October 2018. Available from: https://developer.openstack.org/api-guide/quick-start/
OpenStack Installation Guide: December 24, 2018 [online], December 2018. Available from: https://docs.openstack.org/install-guide/
Traeger, A. (2016). Integrating Ceph Storage with OpenStack - A Step by Step Guide: July 6, 2016 [online]. Available from: https://www.stratoscale.com/blog/storage/integrating-ceph-storage-openstack-step-step-guide/
Traeger, A. (2016). Deploying Ceph with OpenStack - Challenge and Solution: June 3, 2016 [online]. Available from: https://www.stratoscale.com/blog/storage/deploying-ceph-challenges-solutions/
Seles, R. (2015). How to Schedule a SQL Server Backup: March 31, 2015 [online]. Available from: https://solutioncenter.apexsql.com/how-to-create-a-sql-server-scheduled-backup/
Bergsten, J. R. (2002). U.S. Patent No. 6,363,462. Washington, DC: U.S. Patent and Trademark Office.