
WHITE PAPER

SERVER AND STORAGE VIRTUALIZATION

Sunjay Parikh
MphasiS, an EDS Company
September 2008


Table of Contents

1. Why Virtualization
2. An Inside Look at Today's Data Centers
3. What Is Virtualization?
4. Inter-relatedness between Server and Storage Virtualization
5. How Is Virtualization Achieved?
   Server Virtualization
   Storage Virtualization
      Array-based Virtualization
      Network-based Virtualization
6. Conclusion


Why Virtualization

We have all heard the recent news about the rising energy costs of data centers, about reducing the carbon footprint (the CO2 emissions resulting from data center cooling), and about the ineffective utilization of servers in data centers. All of this has given rise to a new term, "green computing", also called the "greening of IT". Several companies have made green computing one of their top IT priorities for the next decade and beyond.

This has put one very important technology at center stage, namely virtualization. Virtualization is not a new technology by any means; in fact, it is as old as the mainframe. IBM has been using virtualization technology in its mainframes for the past several decades: in the old IBM VM operating system (VM/370), VM stood for Virtual Machine.

What is important, however, is that virtualization is gaining prominence of late because of the energy and environmental factors that have become a serious issue today and, if not dealt with appropriately, could cause unforeseen damage not only to the environment but to a company's bottom line as well.

There are several types of virtualization or, to put it more correctly, several areas in which virtualization can be realized. This white paper covers two of them, server virtualization and storage virtualization; other types will be covered in future white papers.

An Inside Look at Today's Data Centers

Around the mid-nineties, as the Internet became mainstream and accessible to corporations and the general public (not just universities and research centers), the volume of applications and information proliferated. Speed and storage became the key factors. The end result was a huge demand for servers and for the huge data centers that would house them.

Earlier, in the days of the mainframe, a single machine could host multiple applications and databases. Modern data centers, on the other hand, have expanded to include hundreds or even thousands of servers running diverse operating systems and applications. Often a server performs a single task, for instance operating as an e-mail server, a web server, or a host for a shared application. This has led to processing inefficiencies, with many dedicated servers operating at under 20% utilization.

As the number of servers has grown, so has the cost of operations, including people, space, power, and cooling. An IDC report states that the labor required to maintain a single small application server can cost between $500 and $3,000 per month in a production environment, and that figure excludes the costs associated with backup and recovery, network connectivity, power, and air conditioning.

Computing demands are increasing, but companies cannot afford the escalating hardware, power, labor, and facilities costs of adding new servers.

Virtualization alleviates these pressures, making the data center leaner and more efficient. Instead of creating silos of computing resources, virtualization treats the underlying computing assets as a resource pool to be shared by many. Virtualization not only supports server consolidation, it also enables workloads to be added and moved automatically to match real-time computing needs as demand changes. This provides greater agility and better business continuity.

Many companies have even adopted a "virtualization first" policy, which means that new projects do not get new machines, but virtual resources.

What Is Virtualization?

In layman's terms, virtualization is nothing but "virtualizing" an actual physical resource such as hardware or storage. To simplify further: one single physical resource is "virtually" made available as several logical resources. That is it; the definition is no more complicated than that.

In the case of hardware virtualization, one single piece of hardware is shared by multiple operating systems instead of a single operating system (as on a laptop or desktop). The hardware assets, such as disk space, network capacity, memory, and processing power, are pooled and shared among the operating systems. This is commonly called "server virtualization", since it is servers rather than desktop machines that are typically virtualized.

In the case of "storage virtualization", on the other hand, the actual physical storage is made available as several logical storage spaces to be shared by many servers and applications. At its most basic, storage virtualization makes scores of separate hard drives look like one big storage pool. The virtualization system presents a logical space for data storage to the user and itself handles the process of mapping it to the actual physical location.

Inter-relatedness between Server and Storage Virtualization

As mentioned, server virtualization allows for better utilization of hardware, consolidation of physical servers, increased availability, and lower data center operating costs. According to several analyst reports, businesses are now running production workloads on virtual machines and gaining phenomenal cost benefits, not to mention the ecological benefits that result.

To be effective, however, virtualization should be implemented at multiple levels, including virtualization of the server, network, and storage infrastructure. Currently the spotlight is on server virtualization, due to the sheer number of servers that have proliferated across the data center. However, once businesses have gained sufficient improvement from server consolidation, they will need to turn their attention to the storage infrastructure, because of the interdependency between server virtualization and networked storage. Let's take a look at that.

Virtual machine images are typically stored on networked storage to make it easy to move virtual machines between physical servers for load balancing, high availability, and maximum resource utilization. In addition, multiple copies of virtual machines are created and stored on the network for replication and disaster recovery purposes. Networked storage thus becomes an essential component for unlocking the full potential of server virtualization.

As we consolidate multiple virtual machines onto a single physical server, we face the limitations of that server's local disk capacity. There are two extreme scenarios: either the disk is heavily underutilized (when the VMs do not require much data storage), or the disk is heavily utilized and in fact falls short of the required capacity. A typical virtualized server will fall somewhere in the middle. Whatever the case, some form of networked, consolidated, or virtualized storage is required (the three terms are used interchangeably here).

With storage virtualization, less time is spent managing storage devices, since some chores can be centralized. Virtualization also increases the efficiency of storage, letting files be stored wherever there is room rather than leaving some drives underutilized. Drives can be added or replaced without requiring downtime to reconfigure the network and the affected servers. Backup and mirroring are also much faster because only changed data needs to be copied, which eliminates the need for scheduled storage management downtime.

Thus, as server consolidation increases, it becomes necessary to abstract the storage so that "virtual servers" can use "virtual storage" without either being physically tied to the other. This yields maximum scalability and peak resource utilization. Storage virtualization also brings centralized management, increased capacity utilization, improved availability, enhanced data protection, and reduced backup time.

Again, the cost benefits are dramatic when you add storage virtualization to server virtualization.

How Is Virtualization Achieved?

This is a very important question, and it brings to bear some of the advances made in this space in recent times.

Server Virtualization

Server virtualization is accomplished via a piece of software that isolates the operating systems and applications from the platform's hardware resources and from each other. Each such partition is called a "virtual machine". Each instance of a virtual machine runs in its own partition with its own copy of the operating system. Each virtual machine has its own virtualized CPU, memory, and storage space, and can share the hardware devices attached to the physical platform.

The virtualization software, called a virtual machine monitor (VMM) or "hypervisor", manages OS requests and application activities, shifting control of the hardware to the OS as required. The hypervisor acts as a broker between the VMs and the hardware assets: the guest operating system calls into the hypervisor to access the native services of the system.

Hypervisors come in two flavors. A native or bare-metal (type I) hypervisor is software that runs directly on a given hardware platform; the guest operating systems of the VMs then run on top of it. A hosted (type II) hypervisor, on the other hand, runs within an operating system, and the guest operating systems of the VMs run above it.

There are several products available in the market in both categories. Open-source Xen, Citrix XenServer, and Oracle VM are examples of type I hypervisors. Microsoft's Hyper-V, the server virtualization feature introduced with Windows Server 2008, is also generally classed as a type I hypervisor, because the hypervisor layer runs directly on the hardware even though it is enabled from within the operating system. Hosted, type II hypervisors include products such as VMware Server and Sun's VirtualBox, which install on top of an existing operating system.
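
To make the broker role of the hypervisor more concrete, the short Python sketch below uses the open-source libvirt management API, which can front both bare-metal hypervisors such as Xen and hosted virtualization stacks, to list the virtual machines on a host along with the virtual CPU and memory allocation each has been given. This is a minimal illustration rather than part of any product discussed above; it assumes the libvirt Python bindings and a running libvirt daemon, and the connection URI shown is only an example.

    # Minimal sketch: enumerate guests on a virtualization host via libvirt.
    # Assumes the libvirt Python bindings are installed and a local libvirt
    # daemon is reachable; 'qemu:///system' is an example connection URI --
    # a Xen host would typically use 'xen:///' instead.
    import libvirt

    def list_guests(uri="qemu:///system"):
        conn = libvirt.open(uri)                 # connect to the hypervisor
        try:
            for dom in conn.listAllDomains():
                state, max_mem_kib, mem_kib, vcpus, _cpu_time = dom.info()
                print(f"{dom.name():20s} vCPUs={vcpus:2d} "
                      f"memory={mem_kib // 1024} MiB "
                      f"running={state == libvirt.VIR_DOMAIN_RUNNING}")
        finally:
            conn.close()

    if __name__ == "__main__":
        list_guests()

Each line of output corresponds to one virtual machine, that is, one partition carved out by the hypervisor with its own virtual CPU and memory allocation as described above.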

Storage Virtualization

In storage virtualization, the system presents a logical space for data storage to the user and itself handles the process of mapping it to the actual physical location. The virtualization software maintains this mapping information, also called the "metadata" for the virtualized storage, in a mapping table. The virtualization software uses the metadata to redirect I/O requests: it receives an incoming I/O request containing the location of the data in terms of the logical disk and translates it into a new I/O request aimed at the physical disk location.
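
The mapping table can be pictured as a simple lookup from logical extents to physical locations. The following Python sketch is a toy model of that idea rather than any vendor's implementation: a virtual volume is assembled from extents on two hypothetical physical disks, and each incoming logical block address is translated into a (physical disk, physical block) pair before the I/O would be reissued.

    # Toy model of storage-virtualization metadata: a virtual volume whose
    # logical extents are mapped onto extents of physical disks. Names and
    # sizes are illustrative only.
    from dataclasses import dataclass

    EXTENT_BLOCKS = 1024  # blocks per extent in this toy example

    @dataclass
    class PhysicalExtent:
        disk: str          # identifier of the backing physical disk
        start_block: int   # first block of the extent on that disk

    class VirtualVolume:
        def __init__(self, mapping):
            # mapping[i] records where logical extent i really lives
            self.mapping = mapping

        def translate(self, logical_block):
            """Translate a logical block address to (disk, physical block)."""
            extent_index, offset = divmod(logical_block, EXTENT_BLOCKS)
            pe = self.mapping[extent_index]
            return pe.disk, pe.start_block + offset

    # A three-extent virtual volume spread over two hypothetical disks.
    volume = VirtualVolume([
        PhysicalExtent("disk-A", 0),
        PhysicalExtent("disk-B", 0),
        PhysicalExtent("disk-A", 4096),
    ])

    print(volume.translate(10))      # -> ('disk-A', 10)
    print(volume.translate(1500))    # -> ('disk-B', 476)
    print(volume.translate(2500))    # -> ('disk-A', 4548)

Because the hosts only ever see the logical addresses, the mapping can be changed (for example, when a drive is added or data is migrated) without the servers being aware of it.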

Virtualization of physical storage devices takes two forms: storage area network (SAN) virtualization and file area network (FAN) virtualization. SAN virtualization is often implemented on the host computer, with server-based appliances that attach to a Fibre Channel switch, or on the storage controller itself. Virtualization of the file space is accomplished either with appliances or with software that pools the data from individual file servers and network-attached storage devices into a single managed entity.

Storage virtualization technologies are further classified by where the virtualization is performed: array based or network based.

Array-based Virtualization

The modern breed of array-based storage has RAID controllers that allow the downstream attachment of other storage devices. A primary storage controller provides the virtualization services and allows the direct attachment of other storage controllers, supplying the pooling and metadata management services. The benefit of this type of storage virtualization is that no additional hardware or other infrastructure costs are involved. The big disadvantages are that heterogeneous storage devices cannot be used and that replication and data migration capabilities are limited.

Network-based Virtualization

Network-based virtualization is delivered via a network device, typically an intelligent Fibre Channel switch, connected to a storage area network (SAN). This is the more common form of storage virtualization. The virtualization device sits in the SAN and provides the layer of abstraction between the hosts performing the I/O and the storage controllers providing the storage capacity.

There are two implementations of network-based storage virtualization: appliance based and switch based. Appliance-based devices are dedicated hardware devices that provide SAN connectivity. I/O requests are targeted at the appliance itself, which performs the metadata mapping before redirecting the I/O to the underlying storage. Switch-based devices, as the name suggests, reside in the physical switch hardware used to connect the SAN devices. They use techniques such as packet cracking to snoop on incoming I/O requests and perform the I/O redirection.

In-band, also called symmetric, storage virtualization consists of a cluster of servers with Fibre Channel ports running the virtualization software. In this in-band architecture, both the data and the control information flowing between the host computer and the storage controller pass through a virtualization node and then back out onto the SAN. The software presents an interface to the server that looks like a storage controller, and an interface to the storage controller that looks like a server. Each cluster consists of as many as four pairs of nodes, called I/O Groups; write data is cached and mirrored across a node pair, and a single node manages the cluster. IBM System Storage SAN Volume Controller is a well-known example of this in-band approach.

In out-of-band, or asymmetric, virtualization, the data flow is separated from the control flow. These devices perform only the metadata mapping functions; data and metadata are sent to different places. This requires additional software on the host that knows to first request the location of the actual data. An I/O request from the host is therefore intercepted before it leaves the host, a metadata lookup is requested from the metadata server (possibly through an interface other than the SAN), and the metadata server returns the physical location of the data to the host. The data is then retrieved through an actual I/O request to the storage. Caching on the virtualization device is not possible, since the data never passes through it.
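
The essential difference between the two approaches is whether user data passes through the virtualization device. The Python sketch below models only the out-of-band flow described above, with hypothetical class and function names: a host-side routine first asks a metadata server where a logical block physically lives, and only then issues the actual read directly to the storage controller.

    # Sketch of the out-of-band (asymmetric) flow: control traffic goes to a
    # metadata server, data traffic goes straight to the storage controller.
    # All names are hypothetical; real products hide this behind a host-side
    # driver or agent.

    class MetadataServer:
        """Holds only the logical-to-physical map; never sees user data."""
        def __init__(self, table):
            self.table = table               # logical block -> (controller, block)

        def lookup(self, logical_block):
            return self.table[logical_block]

    class StorageController:
        """Stands in for the device that actually holds the blocks."""
        def __init__(self, name):
            self.name, self.blocks = name, {}

        def read(self, physical_block):
            return self.blocks.get(physical_block, b"")

    def host_read(logical_block, metadata_server, controllers):
        # 1. Intercept the I/O on the host and ask where the data really lives.
        controller_name, physical_block = metadata_server.lookup(logical_block)
        # 2. Issue the real I/O directly to that controller (data path),
        #    bypassing the metadata server entirely.
        return controllers[controller_name].read(physical_block)

    # Example wiring with one controller and a two-entry map.
    ctrl = StorageController("ctrl-1")
    ctrl.blocks[7] = b"hello"
    mds = MetadataServer({0: ("ctrl-1", 7), 1: ("ctrl-1", 8)})
    print(host_read(0, mds, {"ctrl-1": ctrl}))   # -> b'hello'

In the in-band case, by contrast, both the lookup and the data transfer would pass through the virtualization node itself, which is what makes caching possible there.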

Conclusion

In this white paper, we have seen some of the reasons why virtualization is gaining prominence of late. The driving forces behind virtualization are the rising energy costs and the environmental impact that companies must contain today. One of the most prominent ways in which virtualization is realized is "server virtualization", which consolidates and pools the hardware resources and makes them available to operating systems for effective utilization. Closely following on the heels of server virtualization is "storage virtualization", which pools and abstracts the physical storage and makes it logically available to servers and applications. Virtualization technologies use either a combination of hardware and software, or software alone, to provide these services. Combined, server and storage virtualization deliver spectacular cost benefits and resource utilization.


About MphasiS

MphasiS, an EDS company, delivers Applications Services, Remote Infrastructure Services, BPO and KPO services through a combination of technology know-how, domain and process expertise. We service clients in the Manufacturing, Financial Services, Healthcare, Communications, Energy, Transportation, Consumer & Retail industries and governments around the world. We are certified with ISO 9001:2000, ISO/IEC 27001:2005 (formerly known as BS 7799), assessed at CMMI v1.2 Level 5 and are undergoing SAS 70 certification. We also provide SEI CMMI, ISO and Six Sigma related services support.

MphasiS is a performance-based company, dedicated to outstanding customer service. We offer capabilities to provide innovative solutions with sustainable cost savings and improved business performance through flexible engagement models. Customer centricity, transparency in operations, result-oriented activity and flexibility are the values on which we build long-term relationships with our clients.

Contact us

USA
MphasiS
460 Park Avenue South
Suite # 1101, New York
NY 10016, U.S.A.
Tel: +1 212 686 6655
Fax: +1 212 686 2422

UK
MphasiS
100 Borough High Street
London SE1 1LB
Tel: +44 20 30 057 660
Fax: +44 20 30 311 348

MphasiS
Edinburgh House
43-51 Windsor Road
Slough SL1 2EE, UK
Tel: +44 0 1753 217 700
Fax: +44 0 1753 217 701

INDIA
MphasiS
Bagmane Technology Park
Byrasandra
C.V. Raman Nagar
Bangalore 560 093, India
Tel: +91 80 4042 6000
Fax: +91 80 2534 6760

MphasiS and the MphasiS logo are registered trademarks of MphasiS Corporation. All other brand or product names are trademarks or registered marks of their respective owners. MphasiS is an equal opportunity employer and values the diversity of its people. Copyright © MphasiS Corporation. All rights reserved.