tco - cloud computing

Upload: andreas-michael

Post on 04-Jun-2018


  • 8/13/2019 TCO - Cloud Computing


    198 INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2011

    Yan Han

    Tutorial

    articles: one was to make a case for using the cloud;4 while the other provided more details of moving a library's IT infrastructure (ILS, website, and digital library systems) to a cloud, along with discussing motivation, results, and evaluation in three areas (quality and stability, impact on library services, and cost).5 On the cost discussion, Mitchell mentioned the difficulty of calculating technology Total Cost of Ownership (TCO) and cited two papers suggesting minimal cost savings. Mitchell suggested the same but did not provide detailed cost information. In comparison, this paper has a detailed breakdown cost analysis along with different services, such as web applications and storage. Mirsa and Mondal proposed a suitability index and a Return on Investment (ROI) model by taking into consideration impacts and real value.6 Their suitability index and ROI model are well thought out but consider using the cloud for every aspect of all IT operations as a whole. As a result, a company using this model will reach a final conclusion of suitable, may or may not be suitable, or not suitable. However, modular IT operations and services (e.g., e-mail and storage) can be evaluated individually because these services can be easily upgraded or changed with minimal impact on customers. I/O-intensive services and storage-intensive services have different resource requirements, and thus the same evaluation criteria may not give an accurate picture of costs and benefits. For example, storing digital preservation files for libraries is a one-time data-intensive operation. Given the different nature of these IT operations and services, cloud computing may be suitable for some IT operations but not for others. Healy suggested that many companies did not have a complete financial analysis because they omitted staff retraining and system management. He listed the following areas for TCO: hardware, software, recurring licensing and maintenance, bandwidth,

    a starting point for locating information for research; (2) buyer, the library as a purchaser of resources; and (3) archive, the library as a repository of resources. The 2009 survey indicates a gradual decline in their perception of the importance of gateway, no change in archive, growth in buyer, and increased importance for two new roles: teaching support and research support.1 To meet customers' needs in these roles, libraries are innovating services, including catalogs and home websites (as gateway services), repository and digital library programs (as archive, teaching support, and research support services), and interlibrary loan (as buyer and research support services).

    These services rely on stable and effective IT infrastructure to operate. In the past, the growing needs of these web applications increased IT expenditures and work complexity. More web applications, more storage, and more IT support staff were woven into centralized on-site IT infrastructure, along with huge investments in physical servers, networks, and buildings. However, decreasing budgets in libraries have had a huge impact on all aspects of library operations and staffing. Web applications running on local, managed servers might be neither technically effective nor cost-efficient. Web applications utilizing cloud computing can be much more effective and efficient in some cases.

    Literature Review

    There are a growing number of articles related to cloud computing in libraries. Chudnov described, in an informal tone, his personal experience of using the cloud services Amazon EC2 and S3, which cost him 50 cents.2 Jordan discussed OCLC's strategies for building its next generation of services in the cloud and provided a clear view of OCLC's future directions.3 Mitchell wrote two

    Cloud Computing: Case Studies and Total Costs of Ownership

    This paper consists of four major sections: The first section is a literature review of cloud computing and a cost model. The next section focuses on detailed overviews of cloud computing and its levels of services: SaaS, PaaS, and IaaS. Major cloud computing providers are introduced, including Amazon Web Services (AWS), Microsoft Azure, and Google App Engine. Finally, case studies of implementing web applications on IaaS and PaaS using AWS, Linode, and Google AppEngine are demonstrated. Justifications of running on an IaaS provider (AWS) and running on a PaaS provider (Google AppEngine) are described. The last section discusses costs and technology analysis comparing cloud computing with local managed storage and servers. The total costs of ownership (TCO) of an AWS small instance are significantly lower, but the TCO of a typical 10TB space in Amazon S3 are significantly higher. Since Amazon offers lower storage pricing for huge amounts of data, the TCO might be lower. Readers should do their own analysis on the TCOs.

    A 2009 study from Ithaka suggested that faculty perceive three traditional functions of a library: (1) gateway, the library as

    Yan Han is Associate Librarian, University of Arizona Libraries, Tucson, Arizona.


    fundamental computing resources so that they can deploy and run arbitrary software such as operating systems and applications.13 In this model, the providers only manage the underlying physical cloud infrastructure (e.g., physical servers and network) and provide services via virtualization. The users have maximum control over the infrastructure, as if they owned the underlying physical servers and network. Leading providers of this model include Amazon, Linode, Rackspace, Joyent, and IBM Blue Cloud.
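    The division of responsibility across the three NIST service models (SaaS, PaaS, and IaaS) described in this section can be sketched as a small lookup table. This is only an illustration: the four-layer stack and the names STACK, PROVIDER_MANAGED, and user_managed are simplifying assumptions made here, not part of the NIST definition or of any provider's API.

```python
# Simplified stack, ordered from the bottom up (an assumption for illustration).
STACK = ["physical servers", "network", "OS", "applications"]

# Layers the provider manages under each model, following the text:
# SaaS: almost everything; PaaS: everything except the application;
# IaaS: only the underlying physical infrastructure.
PROVIDER_MANAGED = {
    "SaaS": {"physical servers", "network", "OS", "applications"},
    "PaaS": {"physical servers", "network", "OS"},
    "IaaS": {"physical servers", "network"},
}


def user_managed(model: str) -> list[str]:
    """Return the layers left to the customer under the given service model."""
    return [layer for layer in STACK if layer not in PROVIDER_MANAGED[model]]


if __name__ == "__main__":
    for model in ("SaaS", "PaaS", "IaaS"):
        print(f"{model}: customer manages {user_managed(model) or 'nothing'}")
```

    Read from SaaS to IaaS, the customer takes on more of the stack at each step, which is why IaaS gives the most control and also the most administration work.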

    Major cloud computing providers include Amazon Web Services (AWS), Microsoft Windows Azure, and Google AppEngine. AWS is considered to be an IaaS, PaaS, and SaaS provider, which offers a collection of multiple computing services through the Internet, including a few well-known services such as Amazon Elastic Compute Cloud (EC2),14 Amazon Simple Storage Service (S3), and Amazon SimpleDB. EC2 started as a public beta in 2006. It allows users to pay for computing resources as they use them. With scalable use of computing resources and attractive pricing models, EC2 is one of the biggest brand names in cloud computing. It offers different OS options, including multiple Linux distributions, OpenSolaris, and Windows Server. EC2 uses Xen virtualization; each virtual machine is called an instance. An instance in EC2 has no persistent storage, and data stored will be lost if the instance is terminated. Therefore it is typical to use EC2 along with Amazon Elastic Block Store (EBS) or S3, which provide persistent storage for EC2 instances. Amazon claims that both EBS and S3 are highly available and reliable. A user can create, start, stop, and terminate server instances in multiple geographical locations for the benefits of resource optimization and high availability. For example, a user can start an instance in northern Virginia, a

    potential to transform the IT industry and IT services, shifting the way IT infrastructure and hardware are designed, purchased, and managed. Many experts have their own version of cloud computing, which was discussed before.9 The National Institute of Standards and Technology (NIST) defines cloud computing as a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction.10 NIST also gives its three service models, layered based on computing infrastructure:

    Software as a Service (SaaS) allows users to use the cloud computing provider's applications through a thin client interface such as a web browser.11 In the SaaS model, the cloud computing providers manage almost everything in the cloud infrastructure (e.g., physical servers, network, OS, applications). It is directly targeted at general end users. The end users can directly run applications on the cloud and do not need to install, upgrade, or back up applications and their work. Typical SaaS products are Google Apps and Salesforce Sales CRM.

    Platform as a Service (PaaS) allows users to deploy their own applications on the provider's cloud infrastructure under the provider's environment, such as programming languages, libraries, and tools.12 In this model, the cloud computing providers manage everything except the application in the cloud infrastructure. PaaS is directly targeted at general software developers. They can develop, test, and run their code on a PaaS platform. Typical examples of this model include Google AppEngine, Windows Azure, and Joyent.

    Infrastructure as a Service (IaaS) allows users to manage processing, storage, networks, and other

    staffing allocation, monitoring, backup, failover, security audit and compliance, integration, training, and speed to implementation.7

    The author published his first paper regarding cloud computing in 2010.8 Since then, the author has implemented and has been managing multiple web applications and services using IaaS and PaaS providers. Several web applications of the University of Arizona Libraries (UAL) have been migrated to the cloud. This paper focuses on enterprise-level applications and services, not individual-level cloud applications such as Google Docs. The purposes of this article are to

    define cloud computing and levels of services;

    introduce and compare major cloud computing providers;

    provide case studies of running two web applications (DSpace and a home-grown Java application) utilizing cloud computing with justification;

    provide a comparison of TCO of running web applications comparing a cloud computing provider with a local managed server;

    provide a comparison of TCO of 10TB storage space comparing a cloud computing provider with local managed storage; and

    briefly discuss technology advantages of cloud computing.

    Definition of Cloud Computing and Levels of Services

    Cloud Computing Services and Providers

    Cloud computing is becoming popular in the IT industry. Over the past few years, both the supply of and the demand for this new area have seen a huge increase in infrastructure investment and broader use in the United States. The author believes that it has a


    16GB storage, 200GB transfer, and the cost is $19.95 per month.20 Customers pay up front.
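    The trade-off between a variable (pay-for-usage) plan and a fixed plan can be sketched as a break-even calculation. The sketch below is a rough illustration under stated assumptions: it uses the $19.95 monthly fee quoted above for the fixed plan, reuses the $0.03 per hour reserved-instance rate quoted in the paper's cost analysis for the variable plan, and ignores reservation fees, storage, and data transfer. The function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # assumption: average month (8,760 hours / 12)


def variable_monthly_cost(hours_used: float, hourly_rate: float) -> float:
    """Cost under a pay-for-usage plan: only running hours are billed."""
    return hours_used * hourly_rate


def break_even_hours(fixed_monthly_fee: float, hourly_rate: float) -> float:
    """Running hours per month at which both plans cost the same."""
    return fixed_monthly_fee / hourly_rate


if __name__ == "__main__":
    fixed_fee = 19.95    # fixed plan, per month (from the text)
    hourly_rate = 0.03   # variable plan, per instance-hour (illustrative)

    hours = break_even_hours(fixed_fee, hourly_rate)
    always_on = variable_monthly_cost(HOURS_PER_MONTH, hourly_rate)
    print(f"Break-even at about {hours:.0f} hours per month")
    print(f"Always-on variable cost: ${always_on:.2f} vs fixed ${fixed_fee:.2f}")
```

    Under these assumptions, an instance that runs only part of the month favors the variable plan, while an always-on instance lands close to the fixed fee.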

    Open-Source Cloud Computing Software and Private Cloud

    Open-source cloud computing software is available to any person or organization that wants to set up its own cloud. Eucalyptus is an open-source cloud computing system developed by the University of California at Santa Barbara. Some of its eye-catching features include full compatibility with the Amazon EC2 public infrastructure and support for multiple hypervisors, which allows different virtual machines (e.g., Xen, KVM, VSphere) to run on one platform.21 Its open-source company, Eucalyptus Systems, provides technical support to end users. Building a cloud infrastructure on cloud(s) is also possible and might be desirable in certain situations. Current Linux distributions work with Eucalyptus to provide private cloud services, such as Ubuntu Enterprise Cloud and Red Hat's Deltacloud.

    Some organizations have been setting up private clouds to utilize the advantages of cloud computing. The

    Azure allows non-Windows applications to run on the platform. For example, Apache web server can be run as a worker role.17 There also are a few small-to-medium size providers such as Linode.18 Table 1 lists major cloud computing providers.

    The cloud computing providers operate in two business models: variable (pay-for-your-usage) plans and fixed plans. Variable plans allow customers to pay only for the resources actually consumed (e.g., instance-hours, data transfer). AWS offers a variable plan. Google App Engine works in a similar way. Google App Engine offers two interesting features: daily budgets and free quotas. A daily budget allows customers to control the amount of resources used every day. The free quota is currently set as 6.5 hours of CPU time per day, 1GB of data in and out per day, and 1GB of data storage.19 By the end of each month, customers receive a bill listing the number of running hours, the amount of storage used, the size of data transfers, and other add-on services. Linode only offers a fixed plan. The charge is based on the amount of RAM, data storage, and data transfer, assuming an instance is always running. For example, the smallest instance has 360MB RAM,

    mirroring instance in Ireland, and another mirroring instance in Asia.

    Amazon keeps increasing its offerings by introducing new PaaS and SaaS services, such as SimpleDB, Simple E-mail Service, and e-commerce.

    Google App Engine is a PaaS provider offering a cloud platform for web applications in Google's data centers. It was released as a beta version in 2008 but is currently in full service mode. AppEngine functions like a middle layer, which frees customers from worrying about running OSs, modules, and libraries. It currently supports the Python and Java programming languages and related frameworks, and it is expected to support more languages in the future. Google App Engine uses BigTable with its GQL (a SQL-like language). BigTable15 is Google's proprietary database, used in multiple Google applications such as Google Earth, Google Search, and App Engine. By design, GQL does not support the join statement, in order to optimize for operation across multiple machines.16 Unlike AWS, Google AppEngine has a nice feature that allows customers a taste of the platform: it is free of charge up to a certain level of resource use. After that, fees are charged for additional CPU time, bandwidth, and storage.
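    The free-quota billing model described here (no charge up to the quota, fees only for usage beyond it) can be sketched as follows. The quota values are the ones quoted earlier in this section; the per-unit rates in the example are hypothetical placeholders, not Google's actual prices, and inbound and outbound data are folded into one bandwidth figure for simplicity.

```python
# Daily free quota as quoted in the text (bandwidth simplified to one figure).
FREE_QUOTA = {"cpu_hours": 6.5, "bandwidth_gb": 1.0, "storage_gb": 1.0}


def billable_usage(usage: dict[str, float]) -> dict[str, float]:
    """Return the portion of each resource that exceeds the free quota."""
    return {
        resource: max(0.0, usage.get(resource, 0.0) - quota)
        for resource, quota in FREE_QUOTA.items()
    }


def daily_charge(usage: dict[str, float], rates: dict[str, float]) -> float:
    """Charge only for usage above the free quota, at the given unit rates."""
    over = billable_usage(usage)
    return sum(over[resource] * rates[resource] for resource in over)


if __name__ == "__main__":
    # Hypothetical unit prices, for illustration only.
    rates = {"cpu_hours": 0.10, "bandwidth_gb": 0.12, "storage_gb": 0.01}
    small_app = {"cpu_hours": 2.0, "bandwidth_gb": 0.3, "storage_gb": 0.5}
    busy_app = {"cpu_hours": 8.5, "bandwidth_gb": 1.0, "storage_gb": 0.5}
    print("small app:", daily_charge(small_app, rates))
    print("busy app:", round(daily_charge(busy_app, rates), 2))
```

    A small application that stays under every quota is charged nothing, which matches the paper's later observation that most small applications run free of charge.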

    Windows Azure also is a PaaS provider, which runs on Microsoft data centers. It provides a new way to run applications and store data the Microsoft way. Microsoft customers can install and run applications on the Microsoft cloud. Customers are provided with two different instance types: web role instances and worker role instances. Customers can use a web role instance to accept incoming HTTP/HTTPS requests using ASP.NET, Windows Communication Foundation (WCF), or another .NET technology working with IIS. A worker role instance is not associated with IIS but functions as a background job. The two instance types can be combined to create the desired web services. It is clear that Windows

    Table 1. List of Major Cloud Computing Providers

    Cloud Computing Provider | Layer
    Akamai | PaaS, SaaS
    Amazon Web Services | IaaS, PaaS, SaaS
    EMC | SaaS
    Eucalyptus | IaaS open source software
    Google | PaaS (AppEngine), SaaS
    IBM | PaaS, SaaS
    Linode | IaaS
    Microsoft | PaaS (Azure), SaaS
    Rackspace | IaaS, PaaS, SaaS
    Salesforce.com | PaaS, SaaS
    VMware vCloud | PaaS, IaaS
    Zoho | SaaS


    the work of modifying SQL-style code would have been significant.

    The author has a monthly bill of $40 using an AWS small instance.

    Case Study 2: Japanese GIF Holding Library Finder Application

    The author helped the North American Coordinating Council on Japanese Library Resources (NCC) to develop and maintain a web service to identify Japanese Global ILL Framework (GIF) libraries to facilitate interlibrary loan (ILL) service. The application was developed in Java using the J2EE framework and runs in a typical Java servlet container such as Tomcat. The application was initially operated on a small, locally managed server and was migrated to Linode and Google AppEngine in May 2010.

    Cloud Computing Provider Selection and Implementation

    Unlike case 1, the author tested and installed the application on AWS, Linode, and Google AppEngine. AWS and Linode are IaaS providers, which give users greater control over virtual nodes on their cloud infrastructure. Google AppEngine might be a better choice when applications run on normal OS environments, because system administration tasks are completed by the PaaS provider, saving users time and resources. As a PaaS provider, Google maintains its infrastructure environment, such as the OS, programming languages, and tools. Installing the application in Google AppEngine can be done through an Eclipse plug-in or through the command line.

    In this case, the GIF application is a simple system written in Java without any database transactions. Therefore Google App Engine's proprietary GQL database is not a barrier. However, users should be aware that Google AppEngine has other unique features. For example,

    Cloud Computing Provider Selection and Implementation

    A typical DSpace instance requires Java and related libraries, a J2EE environment, and PostgreSQL as the database backend. Three cloud computing providers were evaluated: AWS, Linode, and Google AppEngine. Two instances were successfully installed and configured in AWS and Linode after a few days of testing. Building a DSpace instance on the cloud is the same process as running it locally, except that it is much quicker to build, restart, rebuild, and back up. For example, an initial OS installation on a traditional server takes a few hours, whereas the same task takes a few minutes using an IaaS provider.

    Installation on AWS EC2 and Linode is almost the same, except for creating a login and setting up security policies. To log on to AWS, command-line tools using an X.509 certificate with a public/private key pair are the default. A generated key pair is required to SSH into an instance, and no password SSH option is provided. In addition, appropriate security groups must be set up to enable network protocols. In this case, protocols such as SSH and HTTP, along with the typical port numbers 80 and 8080, must be enabled. Activities such as managing instances, creating images, and setting up security policies can be performed through the AWS web interface (see figure 1). Steps and commands for running regular operations can be found in the appendix. In Linode, using root to log on is allowed. Users do not need to set network and security policies, as protocols and ports are already open. In system administration practice, running applications without enforcing security policies does present security risks to applications and systems. Linode allows users to set up security policies. The author decided not to proceed with installation in Google AppEngine because of its proprietary GQL database. If implemented in Google AppEngine,

    private cloud eases concerns about the public cloud, such as security of data, control of data, and legal issues. For example, an institution can build its own cloud infrastructure using Eucalyptus (or Ubuntu Cloud) with its own computing resources or simply using Amazon AWS. The private cloud computing service becomes a set of customizable cloud computing resources that can be configured and reconfigured as needed. Why is this valuable? In traditional computing approaches, servers, storage, and networking equipment are purchased, configured, and then used without significant changes for three to five years until end of life. In this case, planning must be done ahead of time, anticipating computing resource needs three to five years out. Additional resources (e.g., RAM, hard disks, CPU) will almost certainly be reserved for future needs and are wasted in the meantime. The private cloud reduces concerns regarding security and data control. However, one must still buy, build, and manage the private cloud, increasing TCO and reducing the cost benefit.

    Case Studies: Applications on the Cloud

    Case Study 1: DSpace Implementation and Analysis

    Many libraries are running their institutional repositories on locally managed servers. UAL has been running its repositories since 2004 as one of the earliest DSpace adopters. One of the DSpace instances was tested on the cloud in January 2010 after comparing costs and support. Later the author chose to run a production DSpace in AWS starting March 2010. The repository (http://www.afghandata.org/) currently holds 1,800 titles of digitized unique Afghan materials. Since then, several content and system updates have been applied.


    a good case for calculating the TCO.25 In the cases below, readers should be aware of the following assumptions:

    Software, training, licensing, and maintenance costs are the same, assuming the same software environment is used on the local managed infrastructure and on the cloud.

    Monitoring costs are the same, based on the fact that monitoring software has to be hosted somewhere.

    Bandwidth and network costs are ignored.

    Security audit and compliance are ignored, assuming all data are open.

    The author runs an instance of 100GB in AWS, and the monthly bill for this node is around $40. In comparison, if running a local managed server, a physical server would have been purchased. In our case, a comparison of TCO shows that the cloud computing model has a significant 50 percent cost saving, assuming a server life expectancy of five years.

    Analysis and Discussions

    Cost Analysis

    Running applications on the cloud gives many technical advantages and results in significant cost savings over running them on local managed servers. In this section, the author presents detailed cost comparisons between virtual managed nodes in cloud computing and local managed storage and servers in the traditional model. The cost savings and low barriers to launching web services using the cloud are significant when considering easy start-up, scalability, and flexibility. One of the biggest advantages of cloud computing lies in its on-demand nature, allowing users to start applications with minimal cost. The current cost of starting an instance on AWS is $0.03 per hour if reserved. Above the Clouds: A Berkeley View of Cloud Computing cites a comparison: It costs $2.56 to rent $2 worth of CPU and costs are $6.00 when purchasing vs. $1.20–$1.50 per month on S3.24 Clearly Healy made

    currently Google AppEngine only allows users to run code written in Python and Java; it uses its own database query language, GQL. This creates an extra step for developers who want to migrate existing code to Google, because existing SQL queries have to be rewritten. In addition, other limitations of Google App Engine include allowing only a subset of the JRE standard edition and preventing users from creating new threads.22 The cost of running the application on Google App Engine is attractive, because Google App Engine is free of charge up to its free quota. Google reported that 90 percent of applications were hosted free.23 This is a great PaaS resource for small web applications.

    Applications on the Cloud

    Since 2009, the author has been running multiple web applications and services on multiple IaaS and PaaS providers and has been very happy regarding services and overall costs. The running applications and services are listed in table 2.

    Figure 1. Amazon AWS Management Console


    The TCO of an AWS small instance for 5 years: $2,750–$3,750.
      Hardware: $0.
      Operation expense: $2,750–$3,750.
        System administrator cost: $0–$1,000?. By eliminating physical infrastructure, there is no need, or only minimal cost, to manage a server.
        $2,750 = $350 (AWS initial subscription fee) + $40/month x 12 months x 5 years. The instance's capacity can be found on AWS, and CPU power can be evaluated by using /proc/cpuinfo. Amazon indicated that One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0–1.2 GHz 2007 Opteron or 2007 Xeon processor.26

    The TCO of a physical server comparable to an AWS small instance for 5 years: $5,858–$7,608.
      An AWS small instance is roughly 50 percent of the computing power of the server quoted. (The TCO here is calculated as 50 percent of $11,715–$15,215.)
      Hardware: $4,525.
        $4,525 = $2,658 (server) + $1,125 (3-year support) + $1,125 x 2/3 (additional 2-year support). Note: Dell PowerEdge server: Intel Xeon E5630 2.53GHz with 5-year support for mission-critical 6-hour repair (source: Dell.com, quoted on Oct. 20, 2010).
      Operation expense: $7,190–$10,690. Ignoring downtime and failure expenses, insurance cost, technology training, and backup process.
        System administrator cost: $3,500–$7,000 = 5 years x 1–2 percent time x ($50,000 salary + $50,000 x 40 percent benefits). 1–2 percent time is about 5–10 minutes per day, assuming this administrator works 8 hours per day, 5 days per week, at 100 percent capacity.
        Space cost: $1,500. Space cost for a book in UAL is $2.80 per year. A physical server is estimated to cost $300 per year for space.
        Electricity cost: $2,190.

    Table 2. Some UAL Web Applications and Cloud Computing Service Providers

    Computing Infrastructure | Functions | Applications | Computing Environment | Instances | Service Providers
    Data Storage | Data storage | N/A | Linux / Windows | Data storage using EBS or S3 | AWS
    Access | Digital repository | DSpace | J2EE, Java, Tomcat, PostgreSQL | Afghanistan Digital Collections | AWS, Linode
    Access | Content Management System | Joomla | Linux, Apache, PHP, MySQL | Afghanistan Digital Libraries | AWS, Linode
    Access | Website | HTML | HTML | Sonoran Desert Knowledge Exchange | AWS, Linode
    Access | Integrated Library System | Koha | Linux, Apache, Perl, MySQL | Afghanistan Higher Education Union Catalog | AWS, Linode
    Access | Web applications | Home-grown J2EE web application | J2EE, Java, Tomcat | Japanese GIF (Global Interlibrary-loan) Holding Finder at Linode and at Google App Engine | AWS, Linode, Google App Engine
    Computing Services | Monitoring | Nagios | Linux, Perl | Internal application | AWS, Linode
    Computing Services | Networked Devices Administration | SSH, SFTP | Linux | N/A | AWS, Linode


    meet users' needs at will.

    Rebuilding nodes and creating images are also easier on the cloud. Server failure resulting from hardware error can result in significant downtime. The UAL has had a few server failures in the past few years. Last year a server's RAID hard drives failed. The time spent on ordering new hard disks, waiting for the server company technician's arrival, and finally rebuilding the software environment (e.g., OS, web servers, application servers, user and group privileges) took six or more hours, not to mention the rising stress among customers due to unavailability of services. Mirroring servers could minimize service downtime, but the cost would be almost doubled. In comparison, in the cloud computing model, the author took a few snapshots using the AWS web management interface. If a node fails, the author can launch an instance using the snapshot within a minute or two.

    Factors such as software and hardware failure, natural disasters, network failure, and human errors are the main causes of system downtime. Cloud computing providers generally have multiple data centers in different regions. For instance, Amazon S3 and Google AppEngine are claimed to be highly available and highly reliable. Both AWS and Google App Engine offer automatic scaling and load balancing. Cloud computing providers have huge advantages in offering high availability to minimize the effects of hardware failure, natural disasters, network failure, and human errors, while the locally managed server and storage approach requires substantial investment to reduce these risks. In 2009 and 2010 the University of Arizona experienced at least two network and server outages, each lasting a few hours; one failure was because of human error and the other was because of a power failure from Tucson Electric Power. When a power line is cut by accident, what can you do? In comparison, over the past two years minimal downtime from

    includes 12TB hard disks (about 10TB usable space after RAID 5 configuration) with 5-year support, assuming 5-year life expectancy.
      Operation expense: $1,438–$2,138 per year.
        System administrator cost: $700–$1,200. See above.
        Space cost: $300. See above.
        Electricity costs: $438 per year. See above.
      Network cost ignored.
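    The 10TB storage comparison can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions rather than the author's own worksheet: 10TB is taken as 10,000GB (which reproduces the $1,400 monthly S3 figure), the local total is taken as two disk copies of the per-copy annual cost, and the upper administrator cost is set to $1,400 per year. The doubling and the $1,400 value are inferences chosen because they reproduce the quoted $11,212 to $12,612 range; the text itself lists $1,200, and the tape copy is not itemized. Function names are hypothetical.

```python
def s3_annual_tco(size_gb: float, price_per_gb_month: float) -> float:
    """Annual S3 cost: storage size times monthly unit price times 12 months."""
    return size_gb * price_per_gb_month * 12


def local_annual_tco(hardware_total: float, life_years: int, admin: float,
                     space: float, electricity: float, disk_copies: int) -> float:
    """Annual local cost: amortized hardware plus operations, per disk copy."""
    per_copy = hardware_total / life_years + admin + space + electricity
    return disk_copies * per_copy


if __name__ == "__main__":
    s3 = s3_annual_tco(10_000, 0.14)
    # admin: $700 (low) and $1,400 (high, inferred); see the note above.
    low = local_annual_tco(20_840, 5, 700, 300, 438, disk_copies=2)
    high = local_annual_tco(20_840, 5, 1_400, 300, 438, disk_copies=2)
    print(f"Amazon S3: ${s3:,.0f} per year")
    print(f"Local storage: ${low:,.0f} to ${high:,.0f} per year")
```

    Changing the unit price passed to s3_annual_tco is a quick way to see how the comparison shifts at Amazon's lower large-volume pricing.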

    Technology Analysis

    There is no need to purchase a server; no need to initialize a cloud node; no need to set up security policies; no need to install Tomcat, Java, and the J2EE environment; and no need to update software. Compared to the traditional approach, PaaS eliminates upfront hardware and software investment, reduces the time and work for setting up a running environment, and removes hardware and software upgrade and maintenance tasks. IaaS eliminates upfront hardware investment along with other technical advantages discussed below.

    The cloud computing model offers much better scalability than the traditional model due to its flexibility and lower cost. In our repository, the initial storage requirement is not significant but can grow over time if more digital collections are added. In addition, the number of visits is not high but can increase significantly later. An accurate estimate of both factors can be difficult. In the traditional model, a purchased server has preconfigured hardware with limited storage. Upgrading storage and processing power can be costly and problematic. Downtime is certain during the upgrade process. In comparison, the cloud computing model provides an easy way to upgrade storage and processing power with no downtime if handled well. Bigger storage and larger instances with high memory or high CPU can be added or removed to

    Electricity cost: $2,190 = 5 years x 365 days/year x 24 hours/day x 0.5 kilowatt x $0.10/kilowatt-hour.
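    The five-year server comparison quoted in this section can be checked with a short script. The sketch below is illustrative only: it takes the administrator's time share as 1 to 2 percent (the reading that reproduces the quoted $3,500 to $7,000), uses the paper's quoted hardware total of $4,525 as given, and the function names are hypothetical.

```python
YEARS = 5


def admin_cost(years: int, time_fraction: float, salary: float,
               benefit_rate: float) -> float:
    """Administrator cost: share of time times salary plus benefits."""
    return years * time_fraction * salary * (1 + benefit_rate)


def electricity_cost(years: int, kilowatts: float, price_per_kwh: float) -> float:
    """Electricity for a server drawing constant power around the clock."""
    return years * 365 * 24 * kilowatts * price_per_kwh


def server_tco(time_fraction: float) -> float:
    """Five-year TCO of the quoted physical server."""
    hardware = 4_525                    # quoted hardware total
    space = 300 * YEARS                 # $300 per year for space
    return (hardware + space
            + admin_cost(YEARS, time_fraction, 50_000, 0.40)
            + electricity_cost(YEARS, 0.5, 0.10))


def aws_tco(admin: float) -> float:
    """Five-year TCO of an AWS small instance at $40 per month."""
    return 350 + 40 * 12 * YEARS + admin


if __name__ == "__main__":
    low, high = server_tco(0.01), server_tco(0.02)
    print(f"Physical server: ${low:,.0f} to ${high:,.0f}")
    # The AWS small instance is rated at about half the server's capacity.
    print(f"Comparable share (50%): ${low / 2:,.1f} to ${high / 2:,.1f}")
    print(f"AWS small instance: ${aws_tco(0):,.0f} to ${aws_tco(1_000):,.0f}")
```

    Because every input is a named parameter or a single literal, readers can substitute their own salary, power draw, or instance price to repeat the analysis for their institution.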

    Most libraries running digital library programs require large storage for preserving digitization files. The analysis below illustrates a comparison of the TCO of 10TB of space. It shows that the TCO of locally managed storage is lower than Amazon S3's storage TCO. Though the cloud computing model still has the advantages of on-demand provisioning and avoiding a big initial investment in equipment, the author believes that locally managed storage may be a better solution if planned well. Since Amazon S3 storage pricing decreases from $0.14/GB to $0.095/GB over 500TB, Amazon S3's TCO might be lower if an organization has huge amounts of data. The author suggests readers do their own analysis.

    The TCO of 10TB in Amazon S3 per year: $16,800. Note: Amazon S3 replicates data at least 3 times. This assumes these preservation files do not need constant changes; otherwise, data transfer fees could be high.
      Operation expense: $16,800 per year.
        $16,800 = $1,400/month x 12 months (based on Amazon S3 pricing of $0.14/GB per month).
      Network cost ignored.

    The TCO of 10TB of physical storage per year: $11,212–$12,612.
      To match the reliability of Amazon S3, locally managed storage needs three copies of data: two on hard disk and one on tape. Note: Dell AX45I SAN storage, quoted on October 26, 2010. Ignoring the time value of money and 3 percent inflation per year based on CPI statistical data.
      Hardware: $4,168 per year.
        $20,840 a SAN storage

    CLOUD COMPUTING: CASE STUDIES AND TOTAL COSTS OF OWNERSHIP | HAN 205

    the cloud computing providers was reported.

    There are some issues when implementing cloud computing. Above the Clouds: A Berkeley View of Cloud Computing discusses ten obstacles and related opportunities for cloud computing.27 All of these obstacles and opportunities are technical. The author's first paper on this topic also discusses legal jurisdiction issues when considering cloud computing.28 Users should be aware of these potential issues when deciding whether to adopt the cloud.

    Summary

    This paper starts with a literature review of articles on cloud computing, some of them describing how libraries are incorporating and evaluating the cloud. The author introduces the definition of cloud computing, identifies three levels of services (SaaS, PaaS, and IaaS), and provides an overview of major players such as Amazon, Microsoft, and Google. Open source cloud software and how a private cloud helps are discussed. Then he presents case studies using different cloud computing providers: case 1 using an IaaS provider, Amazon, and case 2 using a PaaS provider, Google. In case 1, the author justifies the implementation of DSpace on AWS. In case 2, the author discusses advantages and pitfalls of PaaS and demonstrates a small web application hosted in Google AppEngine. A detailed analysis of the TCOs comparing AWS with locally managed storage and servers is presented. The analysis shows that cloud computing has technical advantages and offers significant cost savings when serving web applications. Shifting web applications to the cloud provides several technical advantages over locally managed servers. High availability, flexibility, and cost-effectiveness are some of the most important benefits. However, locally managed storage is still an attractive solution in a typical case of 10TB storage. Since Amazon offers lower storage pricing for huge amounts of data, readers are recommended to do their own analysis on the TCOs.

    References

    1. Roger C. Schonfeld and Ross Housewright, Faculty Survey 2009: Key Strategic Insights for Libraries, Publishers, and Societies, 2010, http://www.ithaka.org/ithaka-s-r/research/faculty-surveys-20002009/faculty-survey-2009 (accessed Apr. 20, 2010).

    2. Daniel Chudnov, "A View From the Clouds," Computers in Libraries 30, no. 3 (2010): 33–35.

    3. Jay Jordan, "Climbing Out of the Box and Into the Cloud: Building Web-Scale for Libraries," Journal of Library Administration 51, no. 1 (2011): 3–17.

    4. Erik Mitchell, "Cloud Computing and Your Library," Journal of Web Librarianship 4, no. 1 (2010): 83–86.

    5. Erik Mitchell, "Using Cloud Services For Library IT Infrastructure," Code4Lib Journal 9 (2010), http://journal.code4lib.org/articles/2510 (accessed Feb. 10, 2011).

    6. Subhas C. Misra and Arka Mondal, "Identification of a Company's Suitability for the Adoption of Cloud Computing and Modelling its Corresponding Return on Investment," Mathematical & Computer Modelling 53 (2011): 504–21, doi: 10.1016/j.mcm.2010.03.037.

    7. Michael Healy, "Beyond CYA as a service," Information Week 1288 (2011): 24–26.

    8. Yan Han, "On the Clouds: A New Way of Computing," Information Technology & Libraries 29, no. 2 (2010): 88–93.

    9. Ibid.

    10. Peter Mell and Tim Grance, The NIST Definition of Cloud Computing, NIST, http://csrc.nist.gov/groups/SNS/cloud-computing/ (accessed Oct. 21, 2010).

    11. Ibid.

    12. Ibid.

    13. Ibid.

    14. Amazon, "Amazon Elastic Compute Cloud (Amazon EC2)," 2010, http://aws.amazon.com/ec2/ (accessed Oct. 21, 2010).

    15. Fay Chang et al., "Bigtable: A Distributed Storage System for Structured Data," in 7th Symposium on Operating Systems Design and Implementation (OSDI '06), Nov. 6–8, 2006, Seattle, Wash., https://www.usenix.org/events/osdi06/tech/chang/chang_html/ (accessed Apr. 21, 2010).

    16. Google, "GQL Reference," 2010, http://code.google.com/appengine/docs/python/datastore/gqlreference.html (accessed Apr. 21, 2010); Google Developers, "Campfire One: Introducing Google App Engine (pt. 3)," 2010, http://www.youtube.com/watch?v=oG6Ac7d-Nx8 (accessed Apr. 21, 2010).

    17. David Chappell, "Introducing Windows Azure," 2009, http://download.microsoft.com/download/e/4/3/e43bb4843b524fa8-a9f9-ec60a32954bc/Azure_Services_Platform.pdf (accessed Apr. 2, 2010).

    18. Linode, "Linode - Xen VPS Hosting," 2010, http://www.linode.com/ (accessed Apr. 7, 2010).

    19. Google, "Quotas - Google App Engine," 2010, http://code.google.com/appengine/docs/quotas.html (accessed Oct. 21, 2010).

    20. Jay Jordan, "Climbing Out of the Box and Into the Cloud: Building Web-Scale for Libraries," Journal of Library Administration 51, no. 1 (2011): 3–17.

    21. Daniel Nurmi et al., "The Eucalyptus Open-Source Cloud-Computing System," in 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009, doi: 10.1109/CCGRID.2009.93.

    22. Google, "The JRE White List - Google App Engine - Google Code," 2010, http://code.google.com/appengine/docs/java/jrewhitelist.html (accessed Apr. 9, 2010); Google, "The Java Servlet Environment," 2010, http://code.google.com/appengine/docs/java/runtime.html (accessed Apr. 9, 2010).

    23. Google, "Changing Quotas To Keep Most Apps Serving Free," 2009, http://googleappengine.blogspot.com/2009/06/changing-quotas-to-keep-most-apps.html (accessed Oct. 21, 2010).

    24. Michael Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing (EECS Department, University of California, Berkeley: Reliable Adaptive Distributed Systems Laboratory, 2009), http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html (accessed July 1, 2009).

    25. Amazon, "Amazon EC2 Pricing," 2010, http://aws.amazon.com/ec2/pricing/ (accessed Feb. 20, 2010).

    26. Michael Healy, "Beyond CYA as a service," Information Week 1288 (2011): 24–26.

    27. Erik Mitchell, "Cloud Computing and Your Library," Journal of Web Librarianship 4, no. 1 (2010): 83–86.

    28. Michael Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing (EECS Department, University of California, Berkeley: Reliable Adaptive Distributed Systems Laboratory, 2009), http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html (accessed July 1, 2009).

    29. Yan Han, "On the Clouds: A New Way of Computing," Information Technology & Libraries 29, no. 2 (2010): 88–93.

    Appendix. Running Instances on Amazon EC2

    Task 1: Building a New DSpace instance

    Build a clean OS: select an Amazon Machine Image (AMI) such as Ubuntu 9.2 to get up and running in a minute or two.
    Install required modules and packages: install Java, Tomcat, PostgreSQL, and mail servers.
    Configure security and network access on the node.
    Install and configure DSpace: install the system and edit its configuration files.

    Task 2: Reloading a New DSpace instance

    Create a snapshot of the current node with EBS if desired: use AWS's management tools to create a snapshot.
    Register the snapshot using AWS's management tools: write down the snapshot ID, then specify the kernel and ramdisk.

    command: ec2-register: registers the AMI specified in the manifest file and generates a new AMI ID (see Amazon EC2 Documentation)

    (example: ec2-register -s snap-12345 -a i386 -d "Description of AMI" -n name-of-image --kernel aki-12345 --ramdisk ari-12345)

    In the future, a new instance can be started from this snapshot image in less than a minute.

    command: ec2-run-instances: launches one or more instances of the specified AMI (see Amazon EC2 Documentation)

    (example: ec2-run-instances ami-a553bfcc -k keypair2 -b /dev/sda1=snap-c3fcd5aa:100:false)

    Task 3: Increasing Storage Size of Current Instance

    To create an instance with desired persistent storage (e.g., 100 GB)

    command: ec2-run-instances: launches one or more instances of the specified AMI (see Amazon EC2 Documentation)

    (example: ec2-run-instances ami-54321 -k ec2-key1 -b /dev/sda1=snap-12345:100:false)

    If you boot up an instance based on one of these AMIs with the default volume size, once it's started up you can do an online resize of the file system:

    command: resize2fs: ext2 file system resizer

    (example: resize2fs /dev/sda1)
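    The -b argument in the examples above packs three values into one string. The Python sketch below assembles that string and the full command line so the pieces are easier to see. The helper names are hypothetical, and reading the third field as a delete-on-termination flag is an assumption based on the legacy EC2 API tools; the script only builds and prints the command and does not contact AWS.

```python
import shlex


def block_device_mapping(device, snapshot_id, size_gb, delete_on_termination):
    """Build a mapping string of the form device=snapshot:size:flag."""
    flag = "true" if delete_on_termination else "false"
    return f"{device}={snapshot_id}:{size_gb}:{flag}"


def run_instances_command(ami_id, key_name, mapping):
    """Return the ec2-run-instances argument list for one instance."""
    return ["ec2-run-instances", ami_id, "-k", key_name, "-b", mapping]


# Placeholder IDs copied from the Task 3 example, not real resources.
mapping = block_device_mapping("/dev/sda1", "snap-12345", 100, False)
cmd = run_instances_command("ami-54321", "ec2-key1", mapping)

# Quote each argument so the printed line is safe to paste into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

    Keeping the command as a list of arguments rather than a single string avoids quoting mistakes if it is later passed to a process launcher.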

    Task 4: Backup

    Go to the AWS web interface and navigate to the Instances panel. Select the instance and then choose Create Image (EBS AMI). The newly created AMI will be a snapshot of the system in its current state.