computing in the clouds weiss

10
COMPUTING IN THE CLOUDS DECEMBER 2007 16 Quick, go outside and look up at the clouds in the sky. What shapes do you see? A ball of cotton? A bunny rabbit? A massively parallel distributed data center? If the latter sounds spot on, you might already be computing in the clouds. At least, that’s the latest catchphrase buzzing around the industry. “Cloud computing,” as it’s being called by everyone from IBM to Google to Amazon to Microsoft, is supposedly the next big thing. But like the clouds themselves, “cloud computing” can take on different shapes depending on the viewer, and often seems a little fuzzy at the edges. COMPUTING IN THE CLOUDS BY AARON WEISS Illustration by James O’Brien Powerful services and applications are being integrated and packaged on the Web in what the industry now calls “cloud computing”

Upload: threesixty

Post on 11-Nov-2014

1.307 views

Category:

Technology


2 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Computing in the clouds weiss

COMPUTING IN THE CLOUDS DECEMBER 200716

Quick, go outside and look up at the clouds in the sky. Whatshapes do you see? A ball of cotton? A bunny rabbit? A massively parallel distributed data center? If the latter soundsspot on, you might already be computing in the clouds.

At least, that’s the latest catchphrase buzzing around theindustry. “Cloud computing,” as it’s being called by everyonefrom IBM to Google to Amazon to Microsoft, is supposedly the next big thing. But like the clouds themselves, “cloud computing” can take on different shapes depending on the viewer, and often seems a little fuzzy at the edges.

COMPUTING IN THE CLOUDS

B Y A A R O N W E I S S

I l l u s t r a t i o n b y J a m e s O ’ B r i e n

Powerful services and applications are being integrated and packaged on the Web in what the

industry now calls “cloud computing”

Page 2: Computing in the clouds weiss

COMPUTING IN THE CLOUDS DECEMBER 2005 17

Page 3: Computing in the clouds weiss

tion, of course. Desktop (and laptop)computing power has been on anaccelerated tear for 30 years. But thenetworked era, and the data centersthat power it, are starting to makethe IT industry—and its investors—take a fresh look at Watson’s per-spective on centralized computing, ifnot his specific enumeration.

But data centers aren’t new, either.In the dot-com boom of the mid-’90s, many a startup invested venturecapital into traditional enterprisesolutions like Sun SPARC servers. Isthis the cloud?

Yes and no. Google is often credit-ed with innovating search on theWeb and, more recently, advertising.But to many, Google’s architecturebehind the scenes has spawned justas significant a revolution.

The data centers of the early dot-com era were, in some respects,direct descendents of Watson’s main-frames. Physically smaller, perhaps,but Sun servers and their ilk contin-ued to represent an exclusive kind ofcomputing—concentrated powerdesigned, and priced, for exclusivecustomers—namely, enterprise.

But Google turned the datacenter model on its head. Ratherthan power a network with a

COMPUTING IN THE CLOUDS DECEMBER 200718

To some, the cloud looks like Web-based applications, a revival of thethin-client. To others, the cloud lookslike utility computing, a grid thatcharges metered rates for processingtime. Then again, the cloud could bedistributed or parallel computing,designed to scale complex processesfor improved efficiency. Maybe every-

one is right. There are many shapesin the clouds.

Cloud Shapes: The Data CenterIt is not news that today’s majorInternet companies have built massivedata centers to power their onlinebusinesses. Decades ago, computingpower was concentrated in main-frames tucked away behind the scenesbecause there was no alternative—only a hulking room-sized box thatcould contain any significant amountof computational power. The ideathat this power could be distributedrather than centralized seemed likesuch folly that in 1943, IBMChairman Thomas Watson saidfamously (or infamously) that “Ithink there’s a world market formaybe five computers.”

The era of the personal computerthat has flourished since the 1970sdirectly contradicted Watson’s predic-

Both Google and IBM have a vested interest in encouragingcloud computing: THEY NEED PEOPLE TO HIRE.

Page 4: Computing in the clouds weiss

small number of high-poweredand very expensive servers, whynot deploy cheap, commodityhardware in large numbers?

Today, Google runs an estimatedhalf-million servers clustered into adozen or so physical locations. Bycreating a network that is spread thinand wide rather than narrow anddeep, Google created a new kind ofconcentrated power—derived morefrom scale of the whole than any oneconstituent part. This, some say,describes the cloud.

As computational and networkingarchitecture, the cloud is very robust.Sometimes described as “self-heal-ing,” a thin, wide network can recov-er gracefully from the most commonailments, such as connection andhardware failures, because there areso many more drones available totake on the work.

But a cloud can consume a lot ofpower to run. Aside from the powerneeded to drive thousands, or hun-dreds of thousands, of processorsand peripherals—hard drives, coolingfans—all these whirring machinesgenerate lots of heat. It is estimatedthat 50 percent of energy costs inrunning a large data center arederived from cooling needs alone.

Worse still for the cloud, the worldis immersed in a global energycrunch, as both demand and specula-tion has driven up pricing for mostconventional energies toward recordlevels.

Reducing the operating costs of aGoogle-inspired data center cloudinvolves both physical and virtualsolutions. Physically, data centers arelike plants arcing toward the sun-light, migrating toward cheap energy.Google is building a major data cen-

ter at a derelict factory site on thebanks of the Columbia River in thePacific Northwest, located both nearcheap hydroelectric energy and amajor trans-Pacific Internet node.

Likewise, Microsoft, IBM, and othersare following suit, scouting sites both inthe Pacific Northwest and Canada wherehydroelectric power is cheaper (and green-er) than the coal-derived power usedthroughout much of the U.S. The eyes ofinvestors are also turning toward China,where new power plants are being rapidlybuilt (and, some are quick to caution,without the “costly” burdens of environ-mental controls).

Besides cheaper power,data centers are makingheavy use of virtualiza-tion to squeeze the mostout of the watts they’reconsuming. With majorvendors like VMWare

and Citrix, which recently acquiredthe Xen virtualization platform,increasingly targeting the data center,virtualization allows a single serverto run multiple operating instancessimultaneously. By sandboxing eachOS inside artificial boundaries, notonly can each instance run indepen-dently of the others, but CPU idletime is minimized.

Just what distinguishes a “cloud”from “a bunch of machines” can be alittle fuzzy. But the next evolutionthat may illuminate the fog is the so-called “data center OS”—or, in thespirit of the buzz, the CloudOS. Inthe fall of this year, VMWare andCisco announced a joint venture todevelop such a “fabric” (to use theirword in spite of the mixedmetaphor).

In a data center like that employed

COMPUTING IN THE CLOUDS DECEMBER 2007 19

Page 5: Computing in the clouds weiss

by Google and the many other enter-prises inspired by their model, eachserver is fundamentally an indepen-dent machine running its own copyof an operating system. For servers towork together on common tasksrequires an abstraction layer of soft-ware—often, highly specialized, cus-tom software—that intelligentlydivvies up jobs. But a more efficientand resource-friendly solution wouldbe a single operating system, whichintrinsically utilizes the resources ofmany machines.

Essentially, an operating system isdesigned to manage resources—harddrive space, memory, and so on. Atrue data center, or cloud, OS willtreat every processing unit availableas just another resource, relying onnetworking channels to replicate thekinds of intra-server channels thatnow coordinate events within a singlephysical machine. Under the com-mand of a single “omniscient” oper-ating system, the cloud becomes amore cohesive entity.

Cloud Shapes: Distributed ComputingWhether “the cloud” represents adata center at a single physical loca-tion or dozens, hundreds, or thou-sands of data centers spread aroundthe world, its speed and efficiency islimited by how intelligently it dele-gates responsibility.

Completing any general computingtask—say, retrieving the results of asearch with relevant contextualadvertisements—requires a longseries of smaller jobs to be complet-ed. Database queries, parsing ofresults, construction of result sets,and formatting of result pages, toname the most common. Even thesetasks can be further broken down

into sub-tasks. Those sub-tasks canbe broken into even smaller tasks,and so on, until you’re nearly downto “bare metal” as they say and deal-ing with disk and memory access.

Ideally, if tasks are broken intotheir smallest constituent jobs,and each job could be complet-ed simultaneously using avail-able processing resourcessomewhere in the cloud, youcould achieve an optimally effi-

cient architecture: the most optimisticdefinition of distributed computing.

In reality, some jobs are dependenton the results of other jobs.Furthermore, designing algorithms tomost effectively divide jobs and dis-tribute them throughout the cloud inreal time is complex, to understatethe case.

But distributed computing, like thedata center itself, is not inherentlynew to the era of the cloud. If weconsider the entire Internet a cloud,one need only look at popular pro-jects like SETI@home andFolding@home to see public exam-ples of distributed computing atwork. In these projects, individualsrun software on their PC which con-nects them to a server that divideslarge jobs among small clients tocrunch numbers toward a particulargoal—in these examples, searchingfor alien life among radio waves andcomputing protein folding simula-tions to aid medical research. Youcould even consider botnets—soft-ware that maliciously infects unwit-ting client machines to send outparcels of spam—a form of nefariousdistributed computing.

The open source project Hadoopprovides a general-purpose frame-

COMPUTING IN THE CLOUDS DECEMBER 200720

Page 6: Computing in the clouds weiss

work for developers to rapidlyemploy distributed computing in awide variety of projects. Activelyunder development within the aus-pices of the Apache Foundation,Hadoop liberates software developersfrom creating specialized customsoftware for taking advantage of dis-tributed computing in a cloud.Capable of distributing petabytes ofdata-processing across thousands ofcomputing nodes, the Hadoop plat-form is itself comprised of several

technologies to ensure efficiency anddata integrity as data swirls aroundthe cloud. Some of these sub-projects, like MapReduce whichdivides jobs into component tasks,and HDFS, the Hadoop DistributedFilesystem, can be employed individ-ually or together by developers intheir applications.

Cloud Shapes: A Utility GridIt seems like every shape one sees inthe cloud is inspired by a computingmodel from the past. Back in thedays of mainframes and fancy super-computers housed at research univer-sities, valuable processing time wasessentially for sale. Researchers

would submit jobs to these number-crunching powerhouses (in theirday), and be billed for the cost ofcycles used. Processing time wasdelivered like electricity—you paidfor what you used.

Today, most medium to large-sizedorganizations invest in their owndata centers and use them at will.Although processing cycles are notmetered like a utility, there are signif-icant costs to building a data cen-ter—real estate, hardware, power,

cooling, and ongoing maintenance.What’s worse for the balance sheet

is that organizations need to plan forworst-case scenarios. In the case of adata center, not only can this includethe costs of backup and redundancy,but in overpowered servers capableof handling loads which can peakhigh but occur infrequently. Bus-inesses may find themselves using 99percent of their computing capacityonly 10 percent of the time, leavingexpensive equipment often idle andangering the accountants.

In the Web space there is a thriv-ing market of hosting providers whoinvest in their own data centers andsell usage to customers, be they indi-

COMPUTING IN THE CLOUDS DECEMBER 2007 21

WHY NOT JUST MOVE ALL PROCESSINGPOWER TO THE CLOUD, and walk around with anultralight imput device with a screen?

Page 7: Computing in the clouds weiss

viduals or businesses. Although host-ing providers can go some waytoward minimizing underutilizationof in-house servers by attaching sur-charges to peak usage, their capacityto stretch can be limited. Providerstypically assign clients to a whole orpartial physical server, and do nottruly replicate the kinds of cloud fea-tures, like distributed computing,

that a dedicated data center can offer.Meanwhile, massive companies

like Amazon, Google, and IBM haveinvested in, innovated, and becomeexpert at housing their own large-scale data centers. Invoking the ideaof the cloud, all three also senseopportunity: Why not scale up theirdata centers—grow the cloud—andcreate business models to supportthird-party use?

In fact, Internet retail giantAmazon is the first out of the gate tocommercialize their cloud. After aperiod of limited access testing,Amazon opened their ElasticCompute Cloud, or EC2, to fee-based public use in October 2007.

To use Amazon’s EC2, customersfirst create a virtual image of theircomplete software environment usingprovided tools. The image is thenused to create an “instance” of a

machine in Amazon’s cloud. Usingvirtualization, Amazon presents themachine image to the end user as if itwere a dedicated server, with thesame degree of access an administra-tor would have to their own server.

Customers can choose configura-tion templates for their machineinstance—for example, a server with1.7GB of memory, one processing

core, and 160GB of storage space.Additional configurations supportmore memory, more cores, and morestorage. But the real magic in EC2 isthat customers can create and destroymachine instances at will. As a result,software can scale itself to exactlythe amount of computing power itrequires.

Consider a Web-based applicationrunning in Amazon’s cloud. Supposethere is a sudden surge in visitors,perhaps thanks to media coverage.Today, many Web applications failunder the load of big traffic spikes.But in the cloud, assuming that theWeb application has been designedintelligently, additional machineinstances can be launched ondemand. The application dynamically,and gracefully, scales up. When trafficslows down, the app can scale down,terminating the extra instances.

COMPUTING IN THE CLOUDS DECEMBER 200722

Software, in a cloud, becomes a service. To a company likeMicrosoft, whose substantial fortunes are built on softwareas a purchasable, local application, THE THREAT ISNOT FALLING ON DEAF EARS.

Page 8: Computing in the clouds weiss

Amazon then charges customersbased on their cumulative “instancehours,” a way of metering usage,currently set at $0.10 per instancehour for the most basic instance type.To provide consistent and predictableperformance, machine instance typesare rated by what Amazon calls“compute units,” which deliver aspecific, known amount of perfor-mance, regardless of the underlyinghardware Amazon is using insidetheir cloud. A second charge appliesto data moved in and out ofAmazon’s network, ranging from$0.10 to $0.18 per gigabyte.

In contrast to Amazon’s commer-cial effort, Google and IBM recentlyannounced a partnership to apply asimilar utility cloud model to com-puter science education. The twocompanies have put together a dedi-cated data center mixing both com-modity and enterprise hardware.Running open source softwareincluding Linux, Xen virtualization,and Apache Hadoop, theGoogle/IBM cloud presents a distrib-uted computing canvas for educa-tional access. Both Google and IBMhave a vested interest in encouragingcloud computing: They need peopleto hire.

Aconsiderable concernamong major enterpriseis that today’s computerscience students lackaccess to the distributedcomputing environ-ments that make up the

cloud. Companies like Google willneed new hires experienced in writingcode designed to run in a cloud ofperhaps thousands of processors. Buteducational institutions don’t have

these kinds of massive, sandboxeddata centers dedicated for studentuse, and many lack instructors withleading-edge experience.

To some industry analysts, com-modified cloud computing likeAmazon’s EC2 will change the faceof enterprise computing. It may pavethe way to where businesses nolonger invest anything into data cen-ters of their own. Out goes the hard-ware, out goes the power bill, andout goes all the processing powerthat rarely gets used. Instead, theybuy cloud computing time from com-mercial providers, who can specializein building massive data centers atsites selected to minimize operatingcosts.

Cloud Shapes: Software as a ServiceAs radical a shift as cloud computingmay represent for the enterprise—who needs servers?—some think itwill radically change personal com-puting even more. Who needs com-puters at all? At least, the kind thateat up battery life and contain churn-ing hard drives. Why not just moveall processing power to the cloud,and walk around with an ultralightinput device with a screen?

Some say it’s already begun.Burgeoning Web applications havebeen the rage for several years.Fueled by technological evolutionslike AJAX, which allows browser-based code to behave more like alocal application, and people’s desirefor mobility and data ubiquity, weincreasingly use the Web for applica-tion functionality. E-mail was boththe first “killer app” for the Internet,and later, on the Web. For manytoday, Web-based e-mail is the onlykind they use.

COMPUTING IN THE CLOUDS DECEMBER 2007 23

Page 9: Computing in the clouds weiss

Taken further, Web-based applica-tions like Google Docs threaten coreproductivity applications on thedesktop. While the features ofGoogle Docs are a slim shadow of amajor, and majorly profitable, appli-cation like Microsoft Word, it pro-vides a taste of a cloud-based future.Your data is reachable anywhere, andthe only software you need to accessit is a browser. Which, in turn, means

that your data and your applicationsare available in one form anotherfrom your desktop PC, your laptop,and your handheld.

Software, in the cloud, becomes aservice. To a company like Microsoft,whose substantial fortunes are builton software as a purchasable, localapplication, the threat is not fallingon deaf ears. Windows Live,Microsoft’s “cloud,” may be today’stake on Windows 1.0—a look intothe future.

Even Adobe has begun to launchstripped-down versions of its compu-tationally-heavy marquee titles asWeb-based services, includingPhotoshop and the video editing suitePremiere.

Pundits and futurists like NicholasCarr love this stuff because the cloudrepresents a paradigm shift and a dis-

ruptive force and a whole new way.Carr predicts that Google and Applewill partner to push the cloud com-puting/software-as-a-service modeleven further. He foresees a lightweightmobile device crafted by Apple thatwill tap into Google’s cloud, bringingtogether the two masters of the frontend and the back end.

In the near-term, a ubiquitouscloud faces obstacles. Critics argue

that visions like Carr’s are simplyrevivals of failed thin-client dreamsof the past. Thin clients like thosetouted by Oracle-founder LarryEllison in the ’90s have not managedto become cost-effective. With pricesplummeting and performance sky-rocketing for full-featured desktopand laptop computers, it has beendifficult to produce a relatively pow-erless thin client at a low enough costto attract buyers.

Advocates for thin client cloudcomputing argue that full-fledgedmachines, however powerful, are alsoa hassle with negative productivity.They have many parts that can failand their software needs constantcare and feeding in a world riddledwith software updates, viruses, andspyware. Centralizing processingpower in the cloud liberates users to

COMPUTING IN THE CLOUDS DECEMBER 200724

THE CLOUD DEMANDS A HIGH DEGREE OFTRUST. Significant amounts of data which were previouslystored only in individual offices and homes would now residein data centers controlled by third-parties.

Page 10: Computing in the clouds weiss

COMPUTING IN THE CLOUDS DECEMBER 2007 25

choose efficient, uncomplicated accessmachines. This argument, and itspotential economic feasibility, gainsweight in today’s environment, wherelaptop sales are outpacing the desk-top, and even more mobile devicesare becoming commonplace.

For cloud computing tomove front and center, thenetworks that tie every-thing together need to beextremely robust. Afterall, the cloud represents asignificant inversion from

personal computing in the 1970s and1980s, when machines stood aloneand derived all of their utility fromtheir own capabilities. Under thecloud, client machines become nearlyuseless on their own. Are our net-works ready to handle the load?

Many would say no, especiallythose living in the large U.S. market,where broadband quality and pene-tration fares poorly relative to manysmaller industrialized nations. In con-trast to local computing power, whichcontinues to boom year after year,broadband advances here have beenslow and may create at least a short-term bottleneck holding back a majorshift toward mainstream cloud com-puting.

Network bandwidth aside, thecloud raises concerns among privacyadvocates. Most significantly, thecloud demands a high degree of trust.Significant amounts of data whichwere previously stored only in indi-vidual offices and homes would nowreside in data centers controlled bythird-parties. Do we have adequateprivacy laws? How should encryptionplay a role?

Software as a service in the cloudcan revive fears of vendor lock-in, asignificant consideration in the main-frame era. Supposing a cloud opera-tor and a thin-client vendor partnertogether, it is possible that each halfwill require the other. Services fromthe cloud may be inaccessible tothose without an access device from asingle brand. Some fear that thecloud could encourage the growth ofwalled-gardens, a potential step backcompared to the relatively openInternet of today.

Partly Cloudy?There are those who scoff that“cloud computing” is just the latestbranding of some old, familiar com-puting models—which is partly true.It’s a buzzword almost designed to bevague, but cloud computing is morethan just a lot of fog.

The cloud concept draws on manyexisting technologies and architec-tures. Centralizing computing poweris not new and, if anything, is areturn to the computer’s roots. Nor isutility computing new, or distributedcomputing, or even software as a service.

But the cloud is new in that it inte-grates all of the above computingmodels. This integration requirescomputing’s center of power to shiftfrom the processing unit to the net-work. Inside the cloud, processorsbecome commodities. It is the net-work that holds the cloud together,and connects the clouds to each otherand the sky to the ground. ~Aaron Weiss is a technology writer and Web developer shivering in upstate New York, aswell as human proprieter of livenudecats.com

PERMISSION TO MAKE DIGITAL OR HARD

COPIES OF ALL OR PART OF THIS WORK FOR

PERSONAL OR CLASSROOM USE IS GRANT-

ED WITHOUT FEE PROVIDED THAT COPIES

ARE NOT MADE OR DISTRIBUTED FOR PROFIT

OR COMMERCIAL ADVANTAGE AND THAT

COPIES BEAR THIS NOTICE AND THE FULL

CITATION ON THE FIRST PAGE. TO COPY

OTHERWISE, TO REPUBLISH, TO POST ON

SERVERS OR TO REDISTRIBUTE TO LISTS,

REQUIRES PRIOR SPECIFIC PERMISSION

AND/OR A FEE.

© ACM 1091-556/07/01200 $5.00