White Paper
Server and Storage Virtualization
Sunjay Parikh
MphasiS an EDS Company
September 2008
Table of Contents
1. Why Virtualization
2. An Inside Look at Today’s Data Centers
3. What is Virtualization?
4. Inter-relatedness between Server and Storage Virtualization
5. How is Virtualization Achieved?
   Server Virtualization
   Storage Virtualization
   Array based Virtualization
   Network based Virtualization
6. Conclusion
Why Virtualization
We have all heard the recent news about the rising
energy costs of data centers, about reducing their carbon
footprint (the CO2 emissions resulting from data center
cooling), and about the ineffective utilization of the
servers in those data centers. All this has given rise to
the term “green computing”, also called the “greening of
IT”. Several companies have made green computing one of
their top IT priorities for the next decade and beyond.
This has put one very important technology at center
stage: virtualization. Virtualization is by no means a new
technology; in fact, it is as old as the mainframe. IBM has
used virtualization technology in its mainframes for the
past several decades. In the old IBM VM operating
system (VM/370), VM stood for Virtual Machine.
What is important, however, is that virtualization has
gained prominence of late because energy and
environmental factors have become a serious issue. If not
dealt with appropriately, they could cause unforeseen
damage not only to the environment but to a company’s
bottom line as well.
There are several types of virtualization, or to put it more
correctly, there are several areas in which virtualization
can be realized. In this white paper, we will cover two
types of virtualization – server and storage
virtualization, while other types of virtualization will be
covered in future white papers.
An Inside Look at Today’s Data Centers
Around the mid-nineties, as the internet became mainstream
and accessible to corporations and the general public (and
was no longer limited to universities and research
centers), the volume of applications and information
proliferated. Speed and storage became the key factors.
The end result was a huge demand for servers and for huge
data centers to house those servers.
Earlier, in the days of the mainframe, a single mainframe
could host multiple applications and databases. Modern
data centers, on the other hand, have expanded to include
hundreds or even thousands of servers running diverse
operating systems and applications. Often a server
performs a single task, for instance operating as an
e-mail server, a web server, or a host for shared
applications. This has led to processing inefficiencies,
with many dedicated servers operating at under 20%
utilization.
As the number of servers has grown, so has the cost of
operations, including people, space, power, and cooling.
An IDC report states that the labor required for
maintaining a single small application server can cost
between $500 and $3,000 per month in a production
environment and that figure excludes costs associated
with backup and recovery, network connectivity, power
and air conditioning.
Computing demands are increasing, but companies
cannot afford the escalating hardware, power, labor, and
facilities costs of adding new servers.
Virtualization alleviates these pressures, making the data
center leaner and more efficient. Instead of treating
computing resources as silos, virtualization treats the
underlying computing assets as a resource pool to be
shared by many workloads. Virtualization not only supports
server consolidation, but also enables workloads to be
added and moved automatically to match real-time
computing needs as demand changes. This provides greater
agility and better business continuity.
Many companies have even adopted a “virtualization
first” policy, which means that new projects don’t get
new machines, but virtual resources.
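The consolidation argument above can be made concrete with a rough, back-of-the-envelope sketch. It takes the 20% utilization figure and the IDC labor range quoted earlier, and assumes, purely for illustration, a fleet of 100 dedicated servers and virtualized hosts of equal capacity that are run at a 70% utilization target. The function name, the server count and the 70% target are assumptions, not figures from any report.

```python
import math

def hosts_needed(servers: int, avg_util_pct: int, target_util_pct: int) -> int:
    """Estimate how many equally sized hosts can carry the same total load."""
    # Total work, measured in "percent of one server" units.
    total_load_pct = servers * avg_util_pct
    return math.ceil(total_load_pct / target_util_pct)

# Assumed inputs, for illustration only.
servers = 100          # hypothetical number of dedicated servers
avg_util_pct = 20      # utilization figure quoted in the text (an upper bound)
target_util_pct = 70   # assumed target utilization for a virtualized host

hosts = hosts_needed(servers, avg_util_pct, target_util_pct)

# IDC labor range quoted in the text, per small application server per month.
labor_low, labor_high = 500, 3000
current_labor = (servers * labor_low, servers * labor_high)

print(f"Hosts needed after consolidation: {hosts}")
print(f"Current monthly labor range: ${current_labor[0]:,} to ${current_labor[1]:,}")
```

With these assumed inputs the estimate is 29 hosts in place of 100 servers. The sketch deliberately stops short of a savings figure, because virtual machines still need administration and the labor cost per virtual machine is not given in the text.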
What is Virtualization?
In layman’s terms, virtualization is nothing but
“virtualizing” an actual physical resource such as the
hardware or the storage. To simplify it further, a single
physical resource is “virtually” made available as several
logical resources. That is it. The definition is not any more
complicated than that.
So, in the case of hardware virtualization, a single
piece of hardware is shared by multiple operating systems
instead of by a single operating system (as on a laptop or
desktop). We pool the hardware assets (disk space,
network capacity, memory, processing power) and share
them amongst the operating systems. This is commonly
called “server virtualization”, because it is usually
servers rather than desktop machines that are virtualized.
In the case of “storage virtualization”, on the other hand,
physical storage is presented as several logical storage
spaces to be shared by many servers and applications. At
its most basic, storage virtualization makes scores of
separate hard drives look like one big storage pool. The
virtualization system presents a logical space for data
storage to the user and itself handles the process of
mapping it to the actual physical location.
Inter-relatedness between Server and Storage Virtualization
As mentioned, server virtualization allows for better
utilization of hardware, consolidation of physical servers,
increased availability, and lower data center operating
costs. According to some analyst reports, businesses are
now running production workloads on virtual machines
and gaining significant cost benefits, not to mention
the ecological benefits that result.
To be effective, however, virtualization should be
implemented at multiple levels. This includes
virtualization of server, network and storage
infrastructure. Currently, the spotlight is on server
virtualization due to the sheer number of servers that
have proliferated across the data center. However, once
businesses have gained sufficient improvement in the
server consolidation, they will need to turn their
attention to the storage infrastructure. This is because of
the interdependency between server virtualization and
networked storage. Let’s take a look at that.
Virtual machine images are typically stored on networked
storage so that virtual machines can be moved easily
between physical servers for load balancing, high
availability and maximum resource utilization. Also,
multiple copies of virtual machines are created and
stored on the network for replication and disaster
recovery purposes. Thus, networked storage becomes an
essential component to unlock the full potential of server
virtualization.
As we consolidate multiple virtual machines onto a single
physical server, we run into the limits of that server’s
local disk capacity. There are two extreme scenarios:
either the disk is heavily underutilized (when the VMs do
not require much data storage), or the disk is heavily
utilized and in fact falls short of the required capacity.
A typical virtualized server would fall somewhere in the
middle. Whatever the case, some form of networked
storage, consolidated storage or virtualized storage is
required. (Note that the three terms are used
interchangeably here.)
With storage virtualization, less time is spent managing
storage devices, since some chores can be centralized.
Virtualization also increases the efficiency of storage by
letting files be stored wherever there is room, rather
than leaving some drives underutilized. Drives can be
added or replaced without requiring downtime to
reconfigure the network and the affected servers. Backup
and mirroring are also much faster because only changed
data needs to be copied, which can eliminate the need for
scheduled storage management downtime.
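The idea that only changed data needs to be copied can be illustrated with a minimal sketch. It splits a volume into fixed-size blocks, compares a hash of each block against the existing backup copy, and rewrites only the blocks that differ. The block size, the function names and the hash-comparison approach are assumptions made for illustration; real products typically track changed blocks as writes occur rather than rehashing the whole volume.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4  # bytes; unrealistically small, for illustration only

def block_hashes(data: bytes) -> List[str]:
    """Hash each fixed-size block of the volume."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

def changed_blocks(previous: bytes, current: bytes) -> List[int]:
    """Return the indexes of blocks in `current` that differ from `previous`."""
    old = block_hashes(previous)
    new = block_hashes(current)
    return [i for i, h in enumerate(new) if i >= len(old) or old[i] != h]

def incremental_copy(backup: bytearray, current: bytes) -> int:
    """Bring `backup` up to date with `current`; return blocks copied."""
    changed = changed_blocks(bytes(backup), current)
    del backup[len(current):]                            # shrink if the source shrank
    backup.extend(b"\0" * (len(current) - len(backup)))  # grow if the source grew
    for i in changed:
        start = i * BLOCK_SIZE
        backup[start:start + BLOCK_SIZE] = current[start:start + BLOCK_SIZE]
    return len(changed)

primary = b"AAAABBBBCCCC"
backup = bytearray(primary)        # full copy taken earlier
primary = b"AAAAXXXXCCCC"          # one block changes on the primary
copied = incremental_copy(backup, primary)
print(copied, bytes(backup) == primary)
```

Here three blocks exist but only one is copied, which is where the time saving for backup and mirroring comes from.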
Thus, with the increase in server consolidation, it
becomes necessary to abstract out the storage such that
“virtual servers” can now use “virtual storage” without
each being physically tied to the other. This would result
in maximum scalability and peak resource utilization.
Storage virtualization also brings in centralized
management, increased capacity utilization, improved
availability, enhanced data protection and reduced
back-up time.
Again, the cost benefits are dramatic when you add
storage virtualization to server virtualization.
How is Virtualization Achieved?
This is a very important question, and answering it brings
out some of the advances made in this space in recent times.
Server virtualization
Server virtualization is accomplished via a piece of
software that isolates the operating systems and the
applications from the platform hardware resources and
from each other by placing each in its own partition. Each
such partition is called a “virtual machine”. Each virtual
machine runs its own copy of the operating system and has
its own virtualized CPU, memory, and storage space, and
can share the hardware devices attached to the physical
platform.
The virtualization software, called a virtual machine
monitor or VMM, also called a “hypervisor”, manages OS
requests and application activities, shifting control of the
hardware to the OS as required. The hypervisor acts as
a broker between the VMs and the hardware assets. The
guest operating system makes a call to the hypervisor to
access the native services of the system.
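The broker role described above can be pictured with a toy model. The sketch below is not how any real hypervisor is implemented: it only shows one piece of software owning the physical CPU and memory of a host and handing out shares to isolated virtual machines. The class name, method names and resource figures are hypothetical, and the model ignores overcommitment, which real hypervisors commonly allow for CPU.

```python
from typing import Dict

class Hypervisor:
    """Toy broker between virtual machines and one host's physical resources."""

    def __init__(self, cpus: int, memory_gb: int) -> None:
        self.free: Dict[str, int] = {"cpus": cpus, "memory_gb": memory_gb}
        self.vms: Dict[str, Dict[str, int]] = {}

    def create_vm(self, name: str, cpus: int, memory_gb: int) -> bool:
        """Carve out a partition if the host still has the capacity."""
        if cpus > self.free["cpus"] or memory_gb > self.free["memory_gb"]:
            return False  # request refused: the pool is exhausted
        self.free["cpus"] -= cpus
        self.free["memory_gb"] -= memory_gb
        self.vms[name] = {"cpus": cpus, "memory_gb": memory_gb}
        return True

    def destroy_vm(self, name: str) -> None:
        """Return a virtual machine's resources to the shared pool."""
        vm = self.vms.pop(name)
        self.free["cpus"] += vm["cpus"]
        self.free["memory_gb"] += vm["memory_gb"]

host = Hypervisor(cpus=8, memory_gb=32)
results = [
    host.create_vm("mail", cpus=2, memory_gb=8),
    host.create_vm("web", cpus=4, memory_gb=16),
    host.create_vm("db", cpus=4, memory_gb=16),   # does not fit
]
print(results, host.free)
```

The third request is refused because the guests never touch the hardware directly; every allocation goes through the broker, which is what keeps the virtual machines isolated from each other.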
Hypervisors come in two flavors. A native, or bare-metal,
type I hypervisor is software that runs directly on a given
hardware platform; the guest operating systems (of the
VMs) then run on top of it. A hosted, type II hypervisor,
on the other hand, runs within an operating system; the
guest operating system (of the VM) then runs above it.
There are several products available in the market of both
types. Open source Xen, Citrix XenServer and Oracle VM
are examples of type I hypervisors. Microsoft Hyper-V,
available as a feature of Windows Server 2008, is also a
type I hypervisor: although it is enabled from within the
operating system, once enabled it runs directly on the
hardware, with Windows Server running above it in a parent
partition. VMware Workstation and Microsoft Virtual PC
are examples of hosted type II hypervisors.
Storage virtualization
In storage virtualization, the system presents to the user
a logical space for data storage and itself handles the
process of mapping it to the actual physical location. The
virtualization software maintains the mapping
information, also called the “meta-data” for the
virtualized storage. This mapping information is stored
as a mapping table. The virtualization software uses the
meta-data to re-direct I/O requests. It receives an
incoming I/O request containing information about the
location of the data in terms of the logical disk and
translates this into a new I/O request to the physical disk
location.
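The mapping table and I/O redirection described above can be sketched in a few lines. This is a minimal illustration that maps one logical block at a time; the class, function and disk names are hypothetical, and real systems usually map larger extents rather than single blocks.

```python
from typing import Dict, NamedTuple

class PhysicalLocation(NamedTuple):
    disk: str    # hypothetical physical disk name
    block: int   # block number on that disk

class VirtualVolume:
    """Logical disk whose blocks are mapped to physical locations."""

    def __init__(self) -> None:
        # The mapping table (the "meta-data"): logical block -> physical location.
        self.mapping: Dict[int, PhysicalLocation] = {}

    def map_block(self, logical_block: int, disk: str, block: int) -> None:
        self.mapping[logical_block] = PhysicalLocation(disk, block)

    def translate(self, logical_block: int) -> PhysicalLocation:
        """Turn a logical I/O address into a physical one."""
        return self.mapping[logical_block]

def read(volume: VirtualVolume, disks: Dict[str, Dict[int, bytes]],
         logical_block: int) -> bytes:
    """Redirect a logical read to wherever the data physically lives."""
    loc = volume.translate(logical_block)
    return disks[loc.disk][loc.block]

# Two physical disks presented to the user as one logical volume.
disks = {"disk_a": {0: b"hello"}, "disk_b": {7: b"world"}}
volume = VirtualVolume()
volume.map_block(0, "disk_a", 0)
volume.map_block(1, "disk_b", 7)
first = read(volume, disks, 0)
second = read(volume, disks, 1)

# Migrate logical block 1 to disk_a: copy the data, then update the mapping.
disks["disk_a"][1] = disks["disk_b"].pop(7)
volume.map_block(1, "disk_a", 1)
after_move = read(volume, disks, 1)
print(first, second, after_move)
```

The reader keeps using logical block 1 before and after the migration; only the meta-data changes, which is why physical drives can be replaced without the user of the logical space being aware of it.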
Virtualization of physical storage devices takes two forms:
storage area network (SAN) virtualization and file area
network (FAN) virtualization. SAN virtualization is often
implemented in one of three places: on the host computer,
in server-based appliances that attach to a Fibre Channel
switch, or on the storage controller itself. Virtualization
of the file space is accomplished either with appliances or
with software that pools the data from individual file
servers and network attached storage devices into a single
managed entity.
Storage virtualization technologies are classified based
on the type of storage – array based or network based.
Array based virtualization
The modern breed of array based storage has RAID
controllers that allow downstream attachment of other
storage devices. A primary storage controller provides
the virtualization services, including pooling and
meta-data management, and allows the direct attachment of
other storage controllers. The benefit of this type of
storage is that no additional hardware or other
infrastructure costs are involved. The big disadvantages
are that heterogeneous storage devices cannot be used and
that replication and data migration capabilities are
limited. IBM System Storage SAN Volume Controller is
sometimes cited alongside this type of storage
virtualization technology, although it is deployed as a
cluster of nodes in the SAN and so is more accurately an
in-band, network based product.
Network based virtualization
Network based virtualization is delivered by a device in
the storage network, typically attached to or built into
an intelligent Fibre Channel switch in a SAN (storage area
network). This is the more common form of storage
virtualization. The virtualization device sits in the SAN
and provides the layer of abstraction between the hosts
performing the I/O and the storage controllers providing
the storage capacity.
There are two implementations of network based storage
virtualization – appliance based and switch based.
Appliance based devices are dedicated hardware devices
that provide SAN connectivity. I/O requests are targeted
at the appliance itself, which performs the meta-data
mapping before redirecting the I/O to the underlying
storage. Switch based devices, as the name suggests,
reside in the physical switch hardware used to
connect the SAN devices. They use different techniques
to provide the meta-data mapping, such as packet
cracking to snoop on incoming I/O requests and perform
the I/O redirection.
In-band, also called symmetric, storage virtualization
consists of a cluster of servers (nodes) with Fibre Channel
ports, which run the virtualization software. This software
uses an in-band architecture in which all data and control
information passing between the host computer and the
storage controller flows into a node and then back out
onto the SAN. The software presents an interface to a
server that looks like a storage controller and an interface
to a storage controller that looks like a server. Each
cluster consists of as many as four pairs of nodes, each
pair being called an I/O Group. Write data is cached and
mirrored across the node pair, and a single node manages
the cluster.
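The in-band data path can be pictured with a small sketch. It models one pair of nodes sitting between the host and the back-end storage: every write passes through the pair and is mirrored into both node caches before it is acknowledged, and is written to the back end later. The class and method names are hypothetical, and the model ignores failover, cache limits and cluster management.

```python
from typing import Dict, Tuple

class Node:
    """One virtualization node with its own write cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.write_cache: Dict[int, bytes] = {}

class IOGroup:
    """A pair of nodes in the data path between hosts and storage."""

    def __init__(self, node_a: Node, node_b: Node,
                 backend: Dict[int, bytes]) -> None:
        self.nodes: Tuple[Node, Node] = (node_a, node_b)
        self.backend = backend  # stands in for the real storage controller

    def write(self, block: int, data: bytes) -> str:
        # The data itself flows into the nodes: mirror it into both caches.
        for node in self.nodes:
            node.write_cache[block] = data
        return "ack"  # the host sees what looks like a storage controller

    def read(self, block: int) -> bytes:
        cache = self.nodes[0].write_cache
        return cache[block] if block in cache else self.backend[block]

    def destage(self) -> None:
        """Flush cached writes to the back end, then clear both caches."""
        for block, data in self.nodes[0].write_cache.items():
            self.backend[block] = data
        for node in self.nodes:
            node.write_cache.clear()

backend: Dict[int, bytes] = {}
group = IOGroup(Node("node1"), Node("node2"), backend)
status = group.write(5, b"payload")
cached_on_both = all(5 in n.write_cache for n in group.nodes)
before_destage = dict(backend)
group.destage()
print(status, cached_on_both, before_destage, backend)
```

Because the data passes through the nodes, they can cache it; that is the property the out-of-band approach gives up.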
In out-of-band, or asymmetric, virtualization, data flow is
separated from control flow. The virtualization device
performs only the meta-data mapping functions, so data and
meta-data travel to different places. This requires
additional software on the host, which knows to first
request the location of the actual data. An I/O request
from the host is therefore intercepted before it leaves the
host; a meta-data lookup is requested from the meta-data
server (this may be through an interface other than the
SAN), which returns the physical location of the data to
the host. The information is then retrieved through an
actual I/O request to the storage. Caching in the
virtualization device is not possible, as the data never
passes through it.
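The two-step out-of-band flow can be sketched as follows. A host-side agent first asks a meta-data server where a logical block lives (the control path) and then reads the data directly from storage (the data path). The class names, the volume name and the dictionary standing in for the storage are hypothetical and serve only to illustrate the separation of the two paths.

```python
from typing import Dict, Tuple

class MetadataServer:
    """Holds only the mapping table; no user data ever passes through it."""

    def __init__(self) -> None:
        self.table: Dict[Tuple[str, int], Tuple[str, int]] = {}
        self.lookups = 0

    def locate(self, volume: str, logical_block: int) -> Tuple[str, int]:
        self.lookups += 1
        return self.table[(volume, logical_block)]

class HostAgent:
    """Host-side software that intercepts I/O and asks for the location first."""

    def __init__(self, metadata_server: MetadataServer,
                 storage: Dict[str, Dict[int, bytes]]) -> None:
        self.metadata_server = metadata_server
        self.storage = storage

    def read(self, volume: str, logical_block: int) -> bytes:
        # Control path: look up the physical location.
        disk, block = self.metadata_server.locate(volume, logical_block)
        # Data path: fetch the data directly from the storage.
        return self.storage[disk][block]

storage = {"array_1": {42: b"invoice"}}
mds = MetadataServer()
mds.table[("vol0", 0)] = ("array_1", 42)

agent = HostAgent(mds, storage)
data = agent.read("vol0", 0)
print(data, mds.lookups)
```

The meta-data server answers one lookup and never handles the payload, which matches the point that it has nothing to cache.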
Conclusion
In this white paper, we have seen some of the reasons
why virtualization has gained prominence of late. The
driving forces behind it are the need to limit
environmental damage and the rising energy costs that
companies face today. One of the most prominent ways in
which virtualization is realized is “server
virtualization”, which consolidates and pools hardware
resources and makes them available to operating systems
for effective utilization. Following closely on its heels
is “storage virtualization”, which pools and abstracts the
physical storage and makes it logically available to
servers and applications. Virtualization technologies use
either a combination of hardware and software or software
alone to provide these services. Combined, server and
storage virtualization provide substantial cost benefits
and improved resource utilization.
About MphasiS
MphasiS, an EDS company, delivers Applications Services, Remote Infrastructure
Services, BPO and KPO services through a combination of technology know-how, domain
and process expertise. We service clients in the Manufacturing, Financial Services,
Healthcare, Communications, Energy, Transportation, Consumer & Retail industries and
to Governments around the world. We are certified with ISO 9001:2000, ISO/IEC
27001:2005 (formerly known as BS 7799), assessed at CMMI v 1.2 Level 5 and are
undergoing SAS 70 certification. We also provide SEI CMMI, ISO and Six Sigma related
services support.
MphasiS is a performance based company, dedicated to outstanding customer service.
We offer capabilities to provide innovative solutions by sustainable cost savings and
improved business performance through flexible engagement models. Customer
centricity, transparency in operations, result-oriented activity and flexibility are the
values on which we build long-term relationships with our clients.
Contact us
USA
MphasiS
460 Park Avenue South
Suite # 1101, New York
NY 10016, U.S.A.
Tel: +1 212 686 6655
Fax: +1 212 686 2422
UK
MphasiS
100 Borough High Street
London SE1 1LB
Tel: +44 20 30 057 660
Fax: +44 20 30 311 348
MphasiS
Edinburgh House
43-51 Windsor Road
Slough SL1 2EE, UK
Tel: +44 0 1753 217 700
Fax: +44 0 1753 217 701
INDIA
MphasiS
Bagmane Technology Park
Byrasandra
C.V. Raman Nagar
Bangalore 560 093, India
Tel: +91 80 4042 6000
Fax: +91 80 2534 6760
MphasiS and the MphasiS logo are registered trademarks of MphasiS Corporation. All other
brand or product names are trademarks or registered marks of their respective owners.
MphasiS is an equal opportunity employer and values the diversity of its people. Copyright ©
MphasiS Corporation. All rights reserved.