

  • LBNL-5105E

    Computational Needs for the Next Generation Electric Grid

    Proceedings

    April 19-20, 2011

    Editors: Joseph H. Eto, Lawrence Berkeley National Laboratory

    Robert J. Thomas, Cornell University

    The work described in this report was funded by the Office of Electricity Delivery and Energy Reliability of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

  • Disclaimer

    This document was prepared as an account of work sponsored by the United States Government. While this document is believed to contain correct information, neither the United States Government nor any agency thereof, nor The Regents of the University of California, nor any of their employees, makes any warranty, express or implied, or assumes any legal responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by its trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof, or The Regents of the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof, or The Regents of the University of California.

    Ernest Orlando Lawrence Berkeley National Laboratory is an equal opportunity employer.

  • Table of Contents

    Acknowledgements

    Foreword

    Introduction

    Running Smart Grid Control Software on Cloud Computing Architectures

    White Paper. Authors: Kenneth Birman, Lakshmi Ganesh, and Robbert van Renesse, Cornell University

    Discussant Narrative. James Nutaro, Oak Ridge National Laboratory

    Recorder Summary. Ghaleb Abdulla, Lawrence Livermore National Laboratory

    Coupled Optimization Models for Planning and Operation of Power Systems on Multiple Scales

    White Paper. Author: Michael Ferris, University of Wisconsin

    Discussant Narrative. Ali Pinar, Sandia National Laboratory

    Recorder Summary. Alejandro Dominguez-Garcia, University of Illinois at Urbana-Champaign

    Mode Reconfiguration, Hybrid Mode Estimation, and Risk-bounded Optimization for the Next Generation Electric Grid

    White Paper. Authors: Andreas Hofmann and Brian Williams, MIT Computer Science and Artificial Intelligence Laboratory

    Discussant Narrative. Bernard Lesieutre, University of Wisconsin

    Recorder Summary. HyungSeon Oh, National Renewable Energy Laboratory

    Model-based Integration Technology for Next Generation Electric Grid Simulations

    White Paper. Authors: Janos Sztipanovits, Graham Hemingway, Vanderbilt University; Anjan Bose and Anurag Srivastava, Washington State University

    Discussant Narrative. Henry Huang, Pacific Northwest National Laboratory

    Recorder Summary. Victor Zavala, Argonne National Laboratory

    Research Needs in Multi-Dimensional, Multi-Scale Modeling and Algorithms for Next Generation Electricity Grids

    White Paper. Author: Santiago Grijalva, Georgia Institute of Technology

    Discussant Narrative. Roman Samulyak, Brookhaven National Laboratory

    Recorder Summary. Sven Leyffer, Argonne National Laboratory

    Long Term Resource Planning for Electric Power Systems Under Uncertainty

    White Paper. Authors: Sarah M. Ryan and James D. McCalley, Iowa State University; David L. Woodruff, University of California Davis

    Discussant Narrative. Jason Stamp, Sandia National Laboratories

    Recorder Summary. Chao Yang, Lawrence Berkeley National Laboratory

    Framework for Large-scale Modeling and Simulation of Electricity Systems for Planning, Monitoring, and Secure Operations of Next-generation Electricity Grids

    White Paper. Authors: Jinjun Xiong, Emrah Acar, Bhavna Agrawal, Andrew R. Conn, Gary Ditlow, Peter Feldmann, Ulrich Finkler, Brian Gaucher, Anshul Gupta, Fook-Luen Heng, Jayant R. Kalagnanam, Ali Koc, David Kung, Dung Phan, Amith Singhee, Basil Smith, IBM Smarter Energy Research

    Discussant Narrative. Loren Toole, Los Alamos National Laboratory

    Recorder Summary. Jeff Dagle, Pacific Northwest National Laboratory

    Appendix 1: Workshop Agenda

    Appendix 2: Written Comments Submitted after the Workshop

    Appendix 3: Participant Biographies

  • Acknowledgements

    The work described in this report was funded by the Office of Electricity Delivery and Energy Reliability of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

    The editors wish to thank the following individuals for their help in organizing the workshop and preparing the proceedings: Katherine Behrend; Sally Bird; Allan Chen; Nancy J. Lewis; Anthony Ma; Andrew Mills; Annika Todd.

  • Foreword

    The April 2011 DOE workshop, Computational Needs for the Next Generation Electric Grid, was the culmination of a year-long process to bring together some of the Nation's leading researchers and experts to identify computational challenges associated with the operation and planning of the electric power system. The attached papers provide a journey into these experts' insights, highlighting a class of mathematical and computational problems relevant for potential power systems research.

    While each paper defines a specific problem area, there were several recurrent themes. First, the breadth and depth of power system data has expanded tremendously over the past decade. This provides the potential for new control approaches and operator tools that can enhance system efficiencies and improve reliability. However, the large volume of data poses its own challenges, and could benefit from application of advances in computer networking and architecture, as well as database structures.

    Second, the computational complexity of the underlying system problems is growing. Transmitting electricity from clean, domestic energy resources in remote regions to urban consumers, for example, requires broader, regional planning over multi-decade time horizons. Yet, it may also mean operational focus on local solutions and shorter timescales, as reactive power and system dynamics (including fast switching and controls) play an increasingly critical role in achieving stability and ultimately reliability. The expected growth in reliance on variable renewable sources of electricity generation places an exclamation point on both of these observations, and highlights the need for new focus in areas such as stochastic optimization to accommodate the increased uncertainty that is occurring in both planning and operations. Application of research advances in algorithms (especially related to optimization techniques and uncertainty quantification) could accelerate power system software tool performance, i.e. speed to solution, and enhance applicability for new and existing real-time operation and control approaches, as well as large-scale planning analysis.

    Finally, models are becoming increasingly essential for improved decision-making across the electric system, from resource forecasting to adaptive real-time controls to online dynamics analysis. The importance of data is thus reinforced by their inescapable role in validating high-fidelity models that lead to deeper system understanding. Traditional boundaries (reflecting geographic, institutional, and market differences) are becoming blurred, and thus, it is increasingly important to address these seams in model formulation and utilization to ensure accuracy in the results and achieve predictability necessary for reliable operations.

    Each paper also embodies the philosophy that our energy challenges require interdisciplinary solutions drawing on the latest developments in fields such as mathematics, computation, economics, as well as power systems. In this vein, the workshop should be viewed not as the end product, but the beginning of what DOE seeks to establish as a vibrant, ongoing dialogue among these various communities. Bridging communication gaps among these communities will yield opportunities for innovation and advancement.

    The papers and workshop discussion provide the opportunity to learn from experts on the current state of the art on computational approaches for electric power systems, and where one may focus to accelerate progress. It has been extremely valuable to me as I better understand this space, and consider future programmatic activities. I am confident that you too will enjoy the discussion, and certainly learn from the many experts.

    I would like to thank the authors of the papers for sharing their perspectives, as well as the paper discussants, session recorders, and participants. The meeting would not have been as successful without your commitment and engagement. I also would like to thank Joe Eto and Bob Thomas for their vision and leadership in bringing together such a well-structured and productive forum.

    Sincerely,
    Gil Bindewald
    Program Manager, Advanced Computation & Grid Modeling
    Office of Electricity Delivery and Energy Reliability
    United States Department of Energy

  • Introduction

    Background

    The US electric power system has undergone substantial change since the late 1980s and promises to continue to change into the foreseeable future. The changes began in 1989 with the restructuring of the way industry procured electric supply. The restructuring sought to replace centralized decision-making by the traditional vertically integrated utility with decentralized decision-making by market participants that establish prices through market forces, not regulation. Today, that transformation is still in progress. Vertically integrated firms continue to serve customers in regions of the country. A sustainable means for ensuring adequate transmission has not been demonstrated. And the demand side of the equation is not able to compete fully in an open market environment.

    In the midst of this restructuring, advances in electric transportation, a greater awareness of the environmental effects of electricity production, and a desire on the part of the US to eliminate its dependence on foreign oil have prompted a movement to again reinvent the electric system. The new objectives include better accommodation of planning and operational uncertainty, especially that associated with variable renewable generation sources such as wind and solar, accommodation of major reductions in CO2 and other pollutants harmful to air quality, and economic and reliable operation of existing assets with less margin than in the past. It is now agreed that the smart grid that will be needed to achieve these objectives will involve the confluence of new sensing, communication, control, and computing as a unique blend of technologies that must be designed specifically to manage the requirements for a future electric power system based on competitive markets. Fundamental to this agreement is the idea that significant advances are needed in the areas of large-scale computation, modeling, and data handling.

    Approach

    To begin to address the large-scale computation, modeling, and data handling challenges of the future grid, seven survey papers were commissioned in 2010 from eminently qualified authorities conducting research activities in problem areas of interest. Each paper was to define a problem area, concisely review industry practice in this area up to the present time, and provide an objective, critical, and comparative assessment of research needed during the next 5 to 10 years. Authors were asked to identify seminal papers or reports that had motivated later, generic, related work.

    Electric power system computational needs appropriate for discussion in the survey papers were to include:

    1. New algorithms that are scalable and robust for solving large nonlinear mixed-integer optimization problems and methods for efficiently solving (in real time) large sets of ordinary differential equations with algebraic constraints, including delays, parameter uncertainties, and monitored data as inputs. These new algorithms should accommodate randomness for capturing appropriate notions of security and incorporate recent results on improving deterministic and randomized algorithms for computationally hard problems.

    2. A new mathematics for characterizing uncertainty in information created from large volumes of data as well as for characterizing uncertainty in models used for prediction.

    3. New methods to enable efficient use of high-bandwidth networks by dynamically identifying only the data relevant to the current information need and discarding the rest. This would be especially useful for wide-area dynamic control where data volume and latency are barriers.

    4. New software architectures and new rapid development tools for merging legacy and new code without disrupting operation. Software should be open source, modular, and transparent. Security is a high priority.

    We assume that designing and building larger and faster computers and faster communications will not be sufficient to solve the electric grid computational problems, although these improvements might ultimately be helpful. Instead, our expectation is that fundamental advances are needed in the areas of algorithms, computer networking and architecture, databases and data overwhelm, simulation and modeling, and computational security; perhaps most importantly, these advances must be achievable in a time frame that will be useful to the industry.

    On April 19-20, 2011, a two-day workshop was held on the campus of Cornell University to explore critical computational needs for future electric power systems. Workshop participants provided input based on the presentation of the seven papers. The seven papers were not expected to be exhaustive, but acted as a framework within which to explore a rich range of topics associated with the overall issue.

    The collection of materials in this volume is intended to provide as complete a record as possible of the workshop proceedings. The volume contains the final versions of the seven papers that were presented, along with discussions of the papers' focus that were prepared ahead of time to stimulate discussion at the workshop, and the reports of the discussions that took place among workshop participants. The authors of the seven papers reviewed and approved the reports of the workshop discussions, which include the reporters' interpretations.

    Summary of the Papers

    In developing the workshop our focus has been on a class of problems that have been neglected but will have to be solved if we are to move in a timely way to a new smart architecture capable of accommodating our vision for the grid of the future. We summarize below the contributions each of the seven papers makes to the discussion of this class of problems.

    The first paper by Kenneth P. Birman, Lakshmi Ganesh, and Robbert van Renesse explores the relatively new paradigm of cloud computing in relation to future electric power system needs. The authors note that future needs will demand scalability of a kind that only cloud computing can offer. Their thesis is that there will be power system requirements (real-time, consistency, privacy, security, etc.) that cloud computing cannot currently support and that many of these needs, which are specific to the expected future electric power paradigm, will not soon be filled by the cloud industry.

    The second paper by Michael Ferris is about modeling. This paper argues that decision processes are predominantly hierarchical and that, as a result, models to support such decision processes should also be layered or hierarchical. Ferris contends that, although advice can be provided from the perspective of mathematical optimization on managing complex systems, that advice must be integrated into an interactive debate with informed decision makers. He also agrees that treating uncertainties in large-scale planning projects will become even more critical as the smart grid evolves because of the increase in volatility of both supply and demand. Optimization models with flexible systems design can help address these uncertainties not only during the planning and construction phases, but also during the operational phase of an installed system.

    The third paper by Andreas G. Hofmann and Brian C. Williams focuses on the twin problems of: 1) increasing the level of automation in the analysis and planning for contingencies in response to unexpected events, and 2) the problem of incorporating considerations of optimality into contingency planning and the overall energy management process. With regard to the first problem, the authors note that, although the level of anticipated automation is still advisory and humans remain in the loop, use of automation would reduce the drudgery and error-prone nature of the current labor-intensive approach. Automation would also guarantee the completeness of an analysis and validity of the contingency plans. With regard to the second problem, the optimization would include establishing risk bounds on actions taken to achieve optimal performance.

    The fourth paper by Janos Sztipanovits, Graham Hemingway, Anjan Bose, and Anurag Srivastava is also about modeling. The thesis is that the future electric system will require the efficient integration of digital information, communication systems, real-time data delivery, embedded software and real-time control decision-making. The authors posit that no high-fidelity models are capable of simulating electric grid interactions with communication and control infrastructure elements for large systems. They also conclude that it is a challenge to model infrastructure interdependencies related to the power grid, including the networks and software for sensors, controls, and communication.

    The fifth paper by Santiago Grijalva argues that future electric system problems are multi-scale and that there is a need to develop multi-scale simulation models and methods to the level that exists in other engineering disciplines. The paper discusses 18 areas of multi-scale, multi-dimensional power system research that are needed to provide a framework for addressing emerging power system problems.

    The sixth paper by Sarah M. Ryan, James D. McCalley, and David L. Woodruff describes computational tools that are needed in the area of optimization for large-scale planning models that account for uncertainty. The authors present, as an example, a proposed model for electric system planning that includes linkages with transportation systems. The paper addresses multi-objective planning in the presence of uncertainty where decision makers must balance, for example, sustainability, costs (investment and operational), long-term system resiliency, and solution robustness.

    The seventh and final paper, by Jinjun Xiong and his associates, explores computational challenges in the context of security-constrained unit commitment and economic dispatch with stochastic analysis, management of massive data sets, and concepts related to large-scale grid simulation. Although other papers address simulation and optimization, this paper is unique in its exploration of emerging substantive data management issues.

    Conclusions

    The April 2011 workshop touched on important research and development needs for the future electric power system, but was not exhaustive. We hope that it has created the basis for the formation of a community of researchers who will focus on these very substantial and interesting needs.

    We wish to thank the authors of the papers for their outstanding contributions and for providing important food for thought. In addition to the authors of the papers, we wish to acknowledge the contributions of the Paper Discussants: James Nutaro, Ali Pinar, Bernard Lesieutre, Henry Huang, Roman Samulyak, Jason Stamp, and Loren Toole; and the Session Recorders: Ghaleb Abdulla, Alejandro Dominguez-Garcia, HyungSeon Oh, Victor Zavala, Sven Leyffer, Chao Yang, and Jeff Dagle. Finally, we are grateful for the active contributions of the industry participants and invited guests. Everyone's participation led to a very successful enterprise.

    Joseph H. Eto
    Robert J. Thomas

  • Computational Needs for Next Generation Electric Grid Page 1-1

    White Paper

    Running Smart Grid Control Software on Cloud Computing Architectures

    Kenneth P. Birman, Lakshmi Ganesh, and Robbert van Renesse

    Cornell University

    Abstract

    There are pressing economic as well as environmental arguments for the overhaul of the current outdated power grid, and its replacement with a Smart Grid that integrates new kinds of green power generating systems, monitors power use, and adapts consumption to match power costs and system load. This paper identifies some of the computing needs for building this smart grid, and examines the current computing infrastructure to see whether it can address these needs. Under the assumption that the power community is not in a position to develop its own Internet or create its own computing platforms from scratch, and hence must work with generally accepted standards and commercially successful hardware and software platforms, we then ask to what extent these existing options can be used to address the requirements of the smart grid. Our conclusions should come as a wake-up call: many promising power management ideas demand scalability of a kind that only cloud computing can offer, but also have additional requirements (real-time, consistency, privacy, security, etc.) that cloud computing would not currently support. Some of these gaps will not soon be filled by the cloud industry, for reasons stemming from underlying economic drivers that have shaped the industry and will continue to do so. On the other hand, we don't see this as a looming catastrophe: a focused federal research program could create the needed scalability solutions and then work with the cloud computing industry to transition the needed technologies into standard cloud settings. We'll argue that once these steps are taken, the solutions should be sufficiently monetized to endure as long-term options because they are also of high likely value in other settings such as cloud-based health care, financial systems, and for other critical computing infrastructure purposes.


    1 Introduction: The Evolving Power Grid

    The evolution of the power grid has been compared, unfavorably, with the evolution of modern telephony; while Edison, one of the architects of the former, would recognize most components of the current grid, Bell, the inventor of the latter, would find telephony unrecognizably advanced since his time [40]. It is not surprising, then, that the power grid is under immense pressure today from inability to scale to current demands, and is growing increasingly fragile, even as the repercussions of power outages grow ever more serious. Upgrading to a smarter grid has escalated from being a desirable vision, to an urgent imperative. Clearly, the computing industry will have a key role to play in enabling the smart grid, and our goal in this paper is to evaluate its readiness, in its current state, for supporting this vision.

    Figure 1: Summary of findings. A more technical list of specific research topics appears in Figure 6.

    EXECUTIVE SUMMARY

    HIGH ASSURANCE CLOUD COMPUTING REQUIREMENTS OF THE FUTURE SMART GRID

    Support for scalable real-time services. A real-time service will meet its timing requirements even if some limited number of node (server) failures occurs. Today's cloud systems do support services that require rapid responses, but their response time can be disrupted by transient Internet congestion events, or even a single server failure.

    Support for scalable, consistency guaranteed, fault-tolerant services. The term consistency covers a range of cloud-hosted services that support database ACID guarantees, state machine replication behavior, virtual synchrony, or other strong, formally specified consistency models, up to some limited number of server failures. At the extreme of this spectrum one finds Byzantine Fault Tolerance services, which can even tolerate compromise (e.g. by a virus) of some service members. Today's cloud computing systems often embrace inconsistency [31][37], making it hard to implement a scalable consistency-preserving service.

    Protection of Private Data. Current cloud platforms do such a poor job of protecting private data that most cloud companies must remind their employees to not be evil. Needed are protective mechanisms strong enough so that cloud systems could be entrusted with sensitive data, even when competing power producers or consumers share a single cloud data center.

    Highly Assured Internet Routing. In today's Internet, consumers often experience brief periods of loss of connectivity. However, research is underway on mechanisms for providing secured multipath Internet routes from points of access to cloud services. Duplicated, highly available routes will enable critical components of the future smart grid to maintain connectivity with the cloud-hosted services on which they depend.


    We shall start with a brief review to establish common terminology and background. For our purposes here, the power grid can be understood in terms of three periods [34],[10].

    The early grid arose as the industry neared the end of an extended period of monopoly control. Power systems were owned and operated by autonomous, vertically integrated, regional entities that generated power, bought and sold power to neighboring regions, and implemented proprietary Supervisory Control And Data Acquisition (SCADA) systems. These systems mix hardware and software. The hardware components collect status data (line frequency, phase angle, voltage, state of fault-isolation relays, etc.), transmit this information to programs that clean the input of any bad data, and then perform state estimation. Having computed the optimal system configuration, the SCADA platform determines a control policy for the managed region, and then sends instructions to actuators such as generator control systems, transmission lines with adjustable capacity and other devices to increase or decrease power generation, increase or decrease power sharing with neighboring regions, shed loads, etc. The SCADA system also plays key roles in preventing grid collapse by shedding busses if regional security¹ requires such an action.

    The restructuring period began in the 1990s and was triggered by a wave of regulatory reforms aimed at increasing competitiveness [19]. Regional monopolies fragmented into power generating companies, Independent System Operators (ISOs) responsible for long-distance power transmission and grid safety, and exchanges in which power could be bought and sold somewhat in the manner of other commodities (although the details of power auctions are specific to the industry, and the difficulty of storing power also distances power markets from other kinds of commodity markets). Small power producers entered the market, increasing competitive pressures in some regions. Greater inter-regional connectivity emerged as transmission lines were built to facilitate transfer of power from areas with less expensive power, or excess generating capacity, into regions with more costly power, or less capacity.

    One side effect of deregulation was to create new economic pressures to optimize the grid, matching line capacity to the pattern of use. Margins of excess power generating capacity, and excess transmission capacity, narrowed significantly, hence the restructured grid operates much nearer its security limits. SCADA systems play key roles, performing adjustments in real time that are vital for grid security. The cost of these systems can be substantial; even modest SCADA product deployments often represent investments of tens or hundreds of millions of dollars, and because federal regulatory policies require full redundancy, most such systems are fully replicated at two locations, so that no single fault can result in a loss of control.

    ¹ Security here is to mean the safety and stability of the power grid, rather than protection against malice.


    This review was prepared during the very first years of a new era in power production and delivery: the dawn of the smart power grid. Inefficient power generation, unbalanced consumption patterns that lead to underutilization of expensive infrastructure on the one hand, and severe overload on the other, as well as urgent issues of national and global concern such as power system security and climate change are all driving this evolution [40]. As the smart grid concept matures, we'll see dramatic growth in green power production: small production devices such as wind turbines and solar panels or solar farms, which have fluctuating capacity outside of the control of grid operators. Small companies that specialize in producing power under just certain conditions (price regimes, certain times of the day, etc.) will become more and more common. Power consumers are becoming more sophisticated about pricing, shifting consumption from peak periods to off-peak periods; viewed at a global scale, this represents a potentially nonlinear feedback behavior. Electric vehicles are likely to become important over the coming decade, at least in dense urban settings, and could shift a substantial new load into the grid, even as they decrease the national demand for petroleum products. The operation of the grid itself will continue to grow in complexity, because the effect of these changing modalities of generation and consumption will be to further fragment the grid into smaller regions, but also to expand the higher-level grid of long-distance transmission lines. Clearly, a lot of work is required to transition from the 50-year-old legacy grid of today to the smart grid of the future. Our purpose in this paper is to see how far the computing industry is ready to meet the needs of this transition.

    2 The Computational Needs of the Smart Grid

    We present a few representative examples that show how large-scale computing must play a key role in the smart power grid. In the next sections, we shall see whether current computing platforms are well suited to play this role.

    i. The smart home. In this vision, the home of the future might be equipped with a variety of power use meters and monitoring devices, adapting behavior to match cost of power, load on the grid, and activities of the residents. For example, a hot water heater might heat when power is cheap but allow water to cool when hot water is unlikely to be needed. A washing machine might turn itself on when the cost of power drops sufficiently. Air conditioning might time itself to match use patterns, power costs, and overall grid state. Over time, one might imagine ways that a SCADA system could reach directly into the home, for example to coordinate air conditioning or water heating cycles so that instead of being random and uniform, they occur at times and in patterns convenient to the utility.

    ii. Ultra-responsive SCADA for improved grid efficiency and security. In this area, the focus is on improving the security margins for existing regional control systems (which, as noted earlier, are running with slim margins today), and on developing new SCADA paradigms for incorporating micro-power generation into the overall grid. One difficult issue is that the power produced by a wind farm might not be consumed right next to that farm, yet we lack grid control paradigms capable of dealing with the fluctuating production and relatively unpredictable behavior of large numbers of small power generating systems. One recent study [2] suggested that to support such uses, it would be necessary to create a new kind of grid stat system, tracking status at a fine-grained level. Such approaches are likely to have big benefits, hence future SCADA systems may need to deal with orders of magnitude more information than current SCADA approaches handle.

    iii. Wide area grid state estimation. Blackouts such as the Northeast and Swiss/Italian blackouts (both in 2003) originated with minor environmental events (line trips caused by downed trees) but then snowballed through SCADA system confusions that in turn caused operator errors (see Northeast_Blackout_of_2003 and 2003_Italy_blackout in Wikipedia). Appealing though it may be to blame the humans, those operator errors may have been difficult to avoid. They reflected the inability of regional operators to directly observe the state of the broader power grids to which their regions are linked; lacking that ability, a hodgepodge of guesswork and telephone calls is often the only way to figure out what a neighboring power region is experiencing. Moreover, the ability to put a telephone call through during a spreading crisis that involves loss of power over huge areas is clearly not something one can necessarily count upon in any future system design. As the power grid continues to fracture into smaller and smaller entities, this wide area control problem will grow in importance, with ISOs and other operators needing to continuously track the evolution of the state of the grid and, especially important, to sense abnormal events such as bus trips or equipment failures. Data about power contracts might inform decisions, hence the grid state really includes not just the data captured from sensors but also the intent represented in the collection of power production and consumption contracts.
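As a rough illustration of the price-responsive appliance behavior described in example (i), the following minimal sketch models a water heater that heats when power is cheap and otherwise lets the tank cool unless hot water is expected to be needed. The class name, the temperature limits, and the price threshold are all hypothetical values chosen for illustration; none of them comes from the paper.

```python
from dataclasses import dataclass


@dataclass
class WaterHeater:
    """Toy model of a price-responsive water heater (illustrative only)."""

    temp_c: float = 50.0      # current tank temperature (assumed value)
    min_temp_c: float = 40.0  # assumed comfort floor: always heat below this
    max_temp_c: float = 60.0  # assumed ceiling: never heat at or above this

    def should_heat(self, price: float, cheap_price: float,
                    demand_expected: bool) -> bool:
        # Below the comfort floor, heat regardless of price.
        if self.temp_c < self.min_temp_c:
            return True
        # At or above the ceiling, never heat.
        if self.temp_c >= self.max_temp_c:
            return False
        # Otherwise heat when power is cheap or hot water is likely needed;
        # if neither holds, let the water cool.
        return price <= cheap_price or demand_expected


heater = WaterHeater(temp_c=50.0)
print(heater.should_heat(price=0.05, cheap_price=0.10, demand_expected=False))
print(heater.should_heat(price=0.30, cheap_price=0.10, demand_expected=False))
```

A utility-coordinated variant, as the example envisions, would replace the local `cheap_price` threshold with instructions received from the SCADA system, which is where the scalability and consistency requirements discussed next come in.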

    Whatarethecomputationalneedsimpliedbythesekindsofexamples?

    i. Decentralization. Information currently captured and consumed in a singleregionalpowersystemwillincreasinglyneedtobevisibletoneighboringpowersystemsandperhapsevenvisibleonanationalscale.Aninterestingdiscussionofthistopicappearsin[2].

    ii. Scalability. Every smart grid concept we've reviewed brings huge numbers of new controllable entities to the table. In some ideas, every consumer's home or office becomes an independent point for potential SCADA control. In others, the homes and offices behave autonomously but still must tap into dynamic data generated by the power provider, such as pricing or load predictions. Other ideas integrate enormous numbers of small power-producing entities into the grid and require non-trivial control adjustments to keep the grid stable. Thus scalability will be a key requirement: scalability of a kind that dwarfs what the industry has done up to now, and that demands a shift to new computational approaches [25][26][2][40].

    iii. Time criticality. Some kinds of information need to be fresh. For example, studies have shown that correct SCADA systems can malfunction when presented with stale data, and some studies have even shown that SCADA systems operated over Internet standards like the ubiquitous TCP/IP protocols can malfunction [25][26][2][12], because out-of-the-box TCP delays data for purposes of flow control and to correct data loss. Future smart grid solutions will demand real-time response even in the presence of failures.

    iv. Consistency. Some kinds of information will need to be consistent [5][6][7][8][25][19], in the sense that if multiple devices are communicating with a SCADA system at the same time, they should be receiving the same instructions, even if they happen to connect to the SCADA system over different network paths that lead to different servers that provide the control information. Notice that we're not saying that control data must be computed in some sort of radically new, decentralized manner: the SCADA computation itself could be localized, just as today's cloud systems often start with one copy of a video of an important news event. But the key to scalability is to replicate data and computation, and consistency issues arise when a client platform requests data from a service replica: is this really the most current version of the control policy? Further, notice that consistency and real-time guarantees are in some ways at odds. If we want to provide identical data to some set of clients, failures may cause delays: we lose real-time guarantees of minimal delay. If we want minimal delay, we run the risk that a lost packet or a sudden crash could leave some clients without the most recent data.

    v. Data Security. Several kinds of data mentioned above might be of interest to criminals, terrorists, or entities seeking an edge in the power commodities market. Adequate protection will be a critical requirement of future SCADA systems.

    vi. Reliability. Power systems that lose their control layer, even briefly, are at grave risk of damage or complete meltdown. Thus any SCADA solution for the future smart grid needs to have high reliability.

    vii. Ability to tolerate compromise. The most critical subsystems and services may need to operate even while under attack by intruders or viruses, or when some servers are malfunctioning. The technical term for this form of extreme reliability is Byzantine Fault Tolerance; the area is a rich one and many solutions are known, but deployments are rare and little is known about their scalability.
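The consistency question raised in item iv ("is this really the most current version of the control policy?") can be made concrete with a small sketch. It assumes, purely for illustration, that every control policy carries a monotonically increasing version number and that a client can collect replies from several replicas; the `PolicyReply` type, the replica names, and the setpoint values are hypothetical and do not come from any SCADA product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyReply:
    replica: str        # hypothetical replica name
    version: int        # assumed monotonically increasing policy version
    setpoint_mw: float  # illustrative control value


def freshest(replies):
    """Return the reply carrying the highest policy version seen."""
    return max(replies, key=lambda r: r.version)


def stale_replicas(replies):
    """Names of replicas whose policy version lags the newest one seen."""
    newest = freshest(replies).version
    return sorted(r.replica for r in replies if r.version < newest)


replies = [
    PolicyReply("replica-a", 42, 118.0),
    PolicyReply("replica-b", 41, 120.5),  # lagging replica
    PolicyReply("replica-c", 42, 118.0),
]
print("newest version:", freshest(replies).version)
print("stale replicas:", stale_replicas(replies))
```

The sketch also shows the tension with time criticality: a client that waits for every replica before acting gains confidence in freshness but gives up minimal delay, while a client that acts on the first reply may act on stale data.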

    3 The Evolving Computing Industry: An Economic Story

    We shall now describe the current state of the computing industry, and examine its ability to provide the properties described above for the future smart grid. We begin by giving a brief history of the computing industry and the economic drivers of its evolution. These same drivers are likely to determine whether the power community can use current computing platforms for its needs, or not.

    Prior to the late 1990s, the computing industry was a world of client computers that received data and instructions from servers. Client-server computing represented a relatively wrenching transition from an even earlier model (mainframe computing), and the necessary architecture and tools were slow to mature; in some sense, the excitement associated with the area anticipated the actual quality of the technology by five to ten years. Yet the client-server architecture slowly gained acceptance and became the basis of widely adopted standards, until finally, within the last decade or so, software tools for creating these kinds of applications have made it possible for a typical programmer to create and deploy such applications with relative ease.

    Today, client-server computing is the norm, yet the power industry retains legacies from the mainframe computing era. For example, SCADA systems use high performance computing (HPC) techniques but play roles similar to SCADA solutions in older mainframe architectures, which featured a big computer in the middle of a slaved network of sensors and actuators. This is in contrast to cloud architectures, which take the client-server model and push it even further: the client is now supported by multiple data centers, each of which might be composed of a vast number of relatively simple servers, with second and even third tiers of support layered behind them. But the issues are also social: power is a critical infrastructure sector, one that affects nearly every other sector, and understandably, the power community is traditionally risk-averse and slow in adopting new technology trends.

    The computing industry has seen three recent technical epochs, each succeeding the prior one in as little as five years. Looking first at the period up to around the turn of the century, we saw a game-changing transition as the early Internet emerged, blossomed, briefly crashed (the .com boom and bust), and then dramatically expanded again. That first boom and bust cycle could be called the early Internet and was dominated by the emergence of web browsers and by human-oriented Internet enterprises. The Internet architecture became universal during this period. Prior to the period in question, we had a number of networking technologies, with some specialized ones used in settings such as wireless networks, or in support of communications overlaid on power transmission lines. Many power companies still use those old, specialized communication technologies. But today, the Internet architecture has become standard. This standardization is useful. For example, modern power companies visualize the status of sensors and actuators through small web pages that provide quick access to parameter settings and controls. Software on those devices can be quickly and easily patched by upgrading to new versions over the network. But these same capabilities have also created the potential for unintended connectivity to the Internet as a whole. Attackers can exploit these opportunities: we saw this in the widely publicized Eligible Receiver exercises, in which the government demonstrated that a technically savvy but non-expert team could use publicly available information to take control of power systems and inflict serious damage on transformers, generators, and other critical equipment [39].

    We now arrive at a period covering roughly the past five years, which witnessed a breathtaking advance in the penetration and adoption of web technologies. Standardization around web protocols and the ease of adding web interfaces even to older mainframe or client-server applications meant that pretty much any computing entity could access any other computing entity, be it hardware or software. Outsourcing boomed as companies in India, China, and elsewhere competed to offer inexpensive software development services. Penetration of the Internet into the public and private sector triggered explosive revenue growth in all forms of Internet advertising. New computing platforms (mobile phones, tablet computers) began to displace traditional ones, triggering a further boom associated with mobility and app computing models.

    Rarely have so many changes been compressed into so short a period of time. Perhaps most unsettling of all, completely new companies like Facebook and Google displaced well-established ones like IBM, HP, and Microsoft, seemingly overnight. One might reasonably argue that the power industry should be immune to this sort of turmoil, yet the impact of restructuring has caused an equal shakeup on the business side of the power community, even if the technical side remains less impacted. And there is good reason to believe that this will soon change. For example, the team that created Google is prominent among industry leaders promoting a smarter power grid. It is hard to imagine them being content to do things in the usual ways.

    Cloud computing, our primary focus in this paper, is an overarching term covering the technologies that support the most recent five years or so of the Internet. The term means different things to different cloud owner/operators, but some form of cloud computing can be expected in any future Internet. A recent document laying out a Federal Cloud Computing Strategy, drafted by the CIO of the United States government (Dr. Vivek Kundra), called for spending about $20 billion of the $80 billion federal IT budget on cloud computing initiatives [28] and urged all government agencies to develop cloud-based computing strategies. About a third of the cost would come from reductions in infrastructure cost through data center consolidation.

    The perspective that sheds the most light on the form that cloud computing takes today starts by recognizing that cloud computing is an intelligent response to a highly monetized demand, shaped by the economics of the sectors from which that demand emanated [8]. These systems guarantee the properties needed to make money in these sectors; properties not required (or useful only in less economically important applications) tend not to be guaranteed.


    What are these requirements? Perhaps the most important emerges from the pressure to aggregate data in physically concentrated places. The rise of lightweight, mobile devices, and of clients who routinely interact with multiple devices, shifts the emphasis from personal computing (email on the user's own machine, pictures in my private folder, etc.) towards data center hosting models, for example Hotmail, Gmail, Flickr, and YouTube. Social networking sites gained in popularity, for example Facebook, YouTube, Flickr, and Twitter; they revolve around sharing information: my data and your data need to be in the same place if we're to share and to network in a sharing-driven manner. Moreover, because cloud platforms make money by performing search and placing advertising, cloud providers routinely need to index these vast collections of data, creating pre-computed tables that are used to rapidly personalize responses to queries.

    Thus, cloud computing systems have exceptional capabilities for moving data from the Internet into the cloud (web crawlers), indexing and searching that data (MapReduce [16], Chord [3], Dynamo [17], etc.), managing files that might contain petabytes of information (BigTable [13], the Google File System [20], Astrolabe [35]), coordinating actions (Chubby [12], Zookeeper [26], DryadLINQ [38]), and implementing cloud-scale databases (PNUTS [15]). These are just a few of many examples.

    Massive data sets are just one respect in which cloud systems are specialized in response to the economics of the field. Massive data centers are expensive, and this creates a powerful incentive to drive the costs down and to keep the data center as busy and efficient as possible. Accordingly, cost factors such as management, power use, and other considerations have received enormous attention [21]. Incentives can cut both ways: social networking sites are popular, hence cloud computing tools for sharing are highly evolved; privacy is less popular, hence little is known about protecting data once we move it into the cloud [29].

    It should not be surprising that cloud computing has been shaped by the hidden hand of the market, but it is important to reflect on the implications of this observation. The specific attributes of the modern data center and its cloud computing tools are matched closely to the ways that companies like Amazon, Microsoft, Google and Facebook use them: those kinds of companies invested literally hundreds of billions of dollars to enable the capabilities with which they earn revenue. Cloud computing emerged overnight, but not painlessly, and the capabilities we have today reflect the urgent needs of the companies operating the cloud platforms.

    How then will we deal with situations in which the power grid community needs a cloud capability lacking in today's platforms? Our market-based perspective argues for three possible answers. If there is a clear reason that the capability is or will soon be central to an economically important cloud computing application, a watch-and-wait approach would suffice. Sooner or later, the train would come down the track. If a capability somehow would be costly to own and operate, even if it were to exist, it might rapidly be abandoned and actively rejected by the community. We'll see that there is an instance of this nature associated with consistency. Here, only by finding a more effective way to support the property could one hope to see it adopted in cloud settings (hence, using the same economic metrics the community uses to make its own go/no-go decisions). Finally, there are capabilities that the commercial cloud community would find valuable, but hasn't needed so urgently as to incentivize the community to actually create the needed technology. In such cases, solving the problem in a useful prototype form might suffice to see it become part of the standards.

    4 The Case for Hosting the Smart Grid on Cloud Computing Infrastructures

    Cloud computing is of interest to the power community for several business reasons. Some parallel the green energy considerations that have stimulated such dramatic change in the power industry: cloud computing is a remarkably efficient and green way to achieve its capabilities. Others reflect pricing: cloud computing turns out to be quite inexpensive in dollar terms, relative to older models of computing. And still others are stories of robustness: by geographically replicating services, companies like Google and Microsoft are achieving fraction-of-a-second responsiveness for clients worldwide, even when failures or regional power outages occur. Cloud systems can be managed cheaply and in highly automated ways, and protected against attack more easily than traditional systems [31]. Finally, cloud computing offers astonishing capacity and elasticity: a modern cloud computing system is often hosted on a few data centers, any one of which might have more computing and storage and networking capacity than all of the world's supercomputing centers added together, and can often turn on a dime, redeploying services to accommodate instantaneous load shifts. We shall enumerate some of the issues in the debate about using the cloud for building the smart grid.

    4.1 The Cloud Computing Scalability Advantage

    The cloud and its transformation of the computing industry have resulted in the displacement of previous key industry players like Intel, IBM, and Microsoft by new players like Google, Facebook, and Amazon. Technology these new-age companies created is becoming irreversibly dominant for any form of computing involving scalability: a term that can mean direct contact with large numbers of sensors, actuators or customers, but can also refer to the ability of a technical solution to run on large numbers of lightweight, inexpensive servers within a data center. Earlier generations of approaches were often abandoned precisely because they scaled poorly. And this has critical implications for the smart grid community, because it implies that to the extent that we launch a smart grid development effort in the near term, and to the extent that the grid includes components that will be operated at large scale, those elements will be built on the same platforms that are supporting the Facebooks and Amazons of today's computing world. In Figure 2 and Figure 3, we look at the scalability needs of two scenarios representative of the future smart grid.

    4.2 The Cloud Cost Advantage

    The Smart Grid needs a national-scale, pervasive network that connects every electricity producer in the market, from coal and nuclear plants to hydroelectric, solar, and wind farms, and small independent producers, with every electricity consumer, from industrial manufacturing plants to residences, and to every device plugged into the wall. This network should enable the interconnected devices to exchange status information and control power generation and consumption. The scale of such an undertaking is mind-boggling. Yet the key enabler, in the form of the network itself, already exists. Indeed, the Internet already allows household refrigerators to communicate with supermarkets and transact purchases [30]. It won't be difficult to build applications ("apps") that inform the washing machine of the right time to run its load, based on power pricing information from the appropriate generators. Whatever their weaknesses, the public Internet and cloud offer such a strong cost advantage that the power community cannot realistically ignore them in favor of building a private, dedicated network for the smart grid.

    4.3 Migrating High Performance Computing (HPC) to the Cloud

    We noted that SCADA systems are instances of high performance computing applications. It therefore makes sense to ask how the cloud will impact HPC. Prior to the 1990s, HPC revolved around special computing hardware with unique processing capabilities. These devices were simply too expensive, and around 1990 gave way to massive parallelism. The shift represented a big step backward for some kinds of users, because these new systems were inferior to the ones they replaced for some kinds of computation. Yet like it or not, the economics of the marketplace tore down the old model and installed the new one, and HPC users were forced to migrate.

    Today, even parallel HPC systems face a similar situation. A single cloud computing data center might have storage and computing capabilities tens or hundreds of times greater than all of the world's supercomputing facilities combined. Naturally, this incentivizes the HPC community to look to the cloud. Moreover, to the extent that HPC applications do migrate into the cloud, the community willing to pay to use dedicated HPC (non-cloud HPC) shrinks. This leaves a smaller market and, over time, represents a counter-incentive for industry investment in faster HPC systems. The trend is far from clear today, but one can reasonably ask whether some day HPC as we currently know it (on fast parallel computers) will vanish in favor of some new HPC model more closely matched to the properties of cloud computing data centers.


    Scenario One: National-Scale Phasor Data Collection

    A phasor is a complex number representing the magnitude and phase angle of a wave. Phasors are measured at different locations at a synchronized time (within one microsecond of one another). The required accuracy can be obtained from GPS. For 60 Hz systems, each Phasor Measurement Unit (PMU) takes about 10 to 30 such measurements per second. The data from various (up to about 60) PMUs is collected by a Phasor Data Concentrator (PDC) (transmitted over phone lines), and then forwarded along a Wide Area Measurement System (WAMS) to a SCADA system. The SCADA system must receive the data within 2 to 10 seconds.

    It has been suggested that as the future power grid becomes increasingly interconnected to promote sharing so as to reduce wasted power and smooth the regional impact of erratic wind and solar power generation, we will also expose the grid to rolling outages. A possible remedy is for the regional operators to track the national grid by collecting phasor data locally and sharing it globally. We now suggest that the scale of the resulting problem is similar to the scale of computational challenges that motivated web search engines to move to the modern cloud computing model.

    Simple back-of-the-envelope calculations lead to a cloud computing model: Today's largest PMU deployment has about 120 PMUs, but for the purposes outlined here, one could imagine a deployment consisting of at least 10,000 PMUs. If we have 25 PMUs per PDC, then such a system would require 400 PDCs. Each PMU would deliver 30 measurements per second. If a measurement is 256 bytes in size (including magnitude, phase angle, timestamp, origin information, and perhaps a digital signature to protect against tampering or other forms of data corruption), then each PDC would deliver 25 x 256 x 30 = 192 KBytes/sec. The 400 PDCs combined would contribute about 77 Mbytes/sec, or about 615 Mbits/sec. The data would probably have to be shared on a national scale with perhaps 25 regional SCADA systems, located throughout the country, hence the aggregate data transmission volume would be approximately 15 Gbit/sec, more than the full capacity of a state-of-the-art optical network link today (see footnote 2).
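The arithmetic in the preceding paragraph can be reproduced in a few lines. The sketch below uses only the figures quoted in the text (10,000 PMUs, 25 PMUs per PDC, 30 measurements per second, 256 bytes per measurement, 25 regional SCADA systems); the variable names are ours, and decimal units (1 KByte = 1,000 bytes) are assumed because they match the rounded totals given above.

```python
# Figures quoted in the text; names are illustrative only.
PMUS = 10_000
PMUS_PER_PDC = 25
MEASUREMENTS_PER_SEC = 30   # per PMU
BYTES_PER_MEASUREMENT = 256
REGIONAL_SCADA_SYSTEMS = 25

pdcs = PMUS // PMUS_PER_PDC
bytes_per_pdc = PMUS_PER_PDC * BYTES_PER_MEASUREMENT * MEASUREMENTS_PER_SEC
total_bytes = pdcs * bytes_per_pdc          # all PDCs combined, bytes/sec
total_bits = total_bytes * 8                # same stream in bits/sec
aggregate_bits = total_bits * REGIONAL_SCADA_SYSTEMS  # one copy per region

print(f"PDCs required:        {pdcs}")
print(f"Per-PDC rate:         {bytes_per_pdc / 1e3:.0f} KBytes/sec")
print(f"Combined rate:        {total_bytes / 1e6:.1f} MBytes/sec "
      f"({total_bits / 1e6:.1f} Mbits/sec)")
print(f"Nationwide aggregate: {aggregate_bits / 1e9:.2f} Gbit/sec")
```

Changing any single assumption (for example, a smaller measurement record or fewer regional recipients) scales the aggregate linearly, which makes it easy to see how sensitive the 15 Gbit/sec estimate is to each input.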

    While it would be feasible to build a semi-dedicated national-scale phasor data Internet for this purpose, operated solely for and by the power community, we posit that sharing the existing infrastructure would be so much cheaper that it is nearly inevitable that the power community will follow that path. Doing so leverages the huge investment underway in cloud computing systems to distribute movies and Internet video; indeed, the data rates are actually comparable (a single streamed HD DVD is about 40 Mbits/second). But it also forces us to ask what the implications of monitoring and controlling the power grid over the Internet might be; these questions are at the core of our study (we pose them, but don't actually answer them).

    Figure 2: Tracking Phasor Data on a National Scale

    Footnote 2: The 10 Gbit rate quoted is near the physical limits for a single optical network link operated over long distances (as determined by Shannon coding theory). But it is important to keep in mind that Internet providers, having invested in optical networking capacity, can often run multiple side-by-side optical links on the same physical path. Thus, the core Internet backbone runs at 40 Gbits, and this is achieved using 4 side-by-side 10 Gbit optical links. Moreover, network providers often set aside dedicated bandwidth under business arrangements with particular enterprises: Google or MSN, for example, or Netflix. Thus even if the future power grid runs over the Internet, this does not imply that grid control traffic could be disrupted or squeezed out by other kinds of public traffic.


    Scenario Two: Power-Aware Appliances in a Smart Home

    According to the most recent US government census report, the United States had approximately 115 million households in 2010. Appliance ownership is widely but variably estimated. Reports on the web suggest that more than 95% of all households have major kitchen equipment such as a refrigerator and range, that 40 to 60% own a dishwasher, between 60 and 95% have a dedicated washer and dryer, and that as many as 80% or more have their own hot water heaters (the quality of these statistics may be erratic). These homes are heated, air conditioned, artificially lighted, and contain many powered devices (TVs, radios, etc.). Some will soon own electric vehicles.

    Such numbers make clear the tremendous opportunity for smart energy management in the home. Current industry trends suggest the following model: the consumer will probably gravitate towards mobile phone "apps" that provide access to home energy management software, simply because this model has recently gained so much commercial traction through wide adoption of devices such as the iPhone, BlackBerry, and Android phones, all of which adopt this particular model; apps are easy to build, easy to market, have remarkable market penetration, and are familiar to the end user. As they evolve, power-aware apps will coordinate action to operate appliances in intelligent ways that reduce end-user costs but also smooth out power demands, reduce load when the grid comes under stress, etc.

    Thus, one might imagine a homeowner who loads the dishwasher but doesn't mind it running later, needs hot water early in the morning (or perhaps in the evening; the pattern will vary but could be learned on a per-household basis), etc. Ideally, the local power grid would wish to schedule these tasks in a price-aware, capacity-aware, energy-efficient manner.

    In one popular vision the grid simply publishes varying prices, which devices track. But this approach is poorly controlled: it is hard to know how many households will be responsive to price variability, and while one could imagine a poorly subscribed service failing for lack of popularity, one can also imagine the other extreme, in which a small price change drives a massive load shift and actually destabilizes the grid. Some degree of fine-grained control would be better.

    Thus, we suspect that over time, a different model will emerge: utilities will be motivated to create their own power management apps that offer beneficial pricing in exchange for direct grid control over some of these tasks: the grid operator might, for example, schedule dishwashing and clothes washing at times convenient to the grid, vary household heating to match patterns of use, heat water for showers close to when that hot water will be needed, etc.

    But these are cloud computing concepts: the iPhone, BlackBerry, and Android are all so tightly linked to the cloud that it is just not meaningful to imagine them operating in any other way. Smarter homes can save power, but the applications enabling these steps must be designed to run on cloud computing systems, which will necessarily handle sensitive data, be placed into life-critical roles, and must be capable of digital dialog with the utility itself. All of these are the kinds of issues that motivate our recommendation that the power community start now to think about how such problems can be solved in a safe, trustworthy, and private manner.

    Figure 3: Power-Aware Home Using Cloud-Hosted Power Management Applications (Apps)

    The big challenge for HPC in the cloud revolves around what some call the checkpoint barrier. The issue is this: modern HPC tools aren't designed to continue executions during failures. Instead, a computation running on n nodes will typically stop and restart if one of the n fails. To ensure that progress is made, periodic checkpoints are needed. As we scale an application up, it must checkpoint more often to make progress. But checkpointing takes time. It should be clear that there is a number of nodes beyond which all time will be spent checkpointing and hence no progress can be made at all. On traditional HPC hardware platforms, the checkpoint barrier has not been relevant: failure rates are low. But cloud computing systems often have relatively high rates of node and storage server failures: having designed the systems to tolerate failures, it becomes a cost-benefit optimization decision to decide whether to buy a more reliable, but more costly server, or to buy a larger number of cheaper but less reliable ones. This then suggests that HPC in the current form may not migrate easily to the cloud, and also that it may not be possible to just run today's standard SCADA algorithms on large numbers of nodes as the scale of the problems we confront grows in response to the trends discussed earlier. New SCADA solutions may be needed in any case; versions matched closely to the cloud model may be most cost-effective.
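The checkpoint barrier can be illustrated with a rough model. The sketch below is not taken from the paper: it assumes independent node failures (so the mean time between failures of an n-node job is the per-node MTBF divided by n) and uses Young's first-order approximation for the checkpoint interval, sqrt(2 x checkpoint cost x MTBF). That approximation is only accurate when the checkpoint cost is much smaller than the MTBF, so the numbers near the barrier are qualitative. The one-year node MTBF and 300-second checkpoint cost are illustrative values.

```python
import math


def system_mtbf(node_mtbf_s, n_nodes):
    """MTBF of an n-node job, assuming independent node failures."""
    return node_mtbf_s / n_nodes


def young_interval(checkpoint_s, mtbf_s):
    """Young's first-order estimate of the best time between checkpoints."""
    return math.sqrt(2.0 * checkpoint_s * mtbf_s)


def useful_fraction(checkpoint_s, node_mtbf_s, n_nodes):
    """Approximate share of wall-clock time spent on real progress.

    First-order overhead at Young's interval is sqrt(2 * C / M);
    the result is clamped at zero once overhead consumes everything.
    """
    mtbf = system_mtbf(node_mtbf_s, n_nodes)
    return max(0.0, 1.0 - math.sqrt(2.0 * checkpoint_s / mtbf))


def barrier_nodes(checkpoint_s, node_mtbf_s):
    """Node count at which the model predicts zero useful progress."""
    return node_mtbf_s / (2.0 * checkpoint_s)


NODE_MTBF_S = 365 * 24 * 3600   # illustrative: one failure per node-year
CHECKPOINT_S = 300              # illustrative: five minutes per checkpoint

for n in (1, 100, 1_000, 10_000, 50_000):
    interval = young_interval(CHECKPOINT_S, system_mtbf(NODE_MTBF_S, n))
    frac = useful_fraction(CHECKPOINT_S, NODE_MTBF_S, n)
    print(f"n={n:>6}  checkpoint every {interval:8.0f} s  useful={frac:.3f}")
print("model barrier at about", round(barrier_nodes(CHECKPOINT_S, NODE_MTBF_S)), "nodes")
```

In this model the checkpoint interval shrinks with the square root of the node count, which matches the observation that larger jobs must checkpoint more often, and a less reliable (cheaper) server lowers the barrier in direct proportion to its MTBF.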

    4.4 High Assurance Applications and the Cloud Computing Dilemma

    The cloud was not designed for high-assurance applications, and therefore poses several challenges for hosting a critical infrastructure service like the smart grid. One complicating factor is that many of the cost-savings aspects of the cloud reflect forms of sharing: multiple companies (even competitors) often share the same data center, so as to keep the servers more evenly loaded and to amortize costs. Multiple applications invariably run in a single data center. Thus, whereas the power community has always owned and operated its own proprietary technologies, successful exploitation of the cloud will force the industry to learn to share. This is worrying, because there have been episodes in which unscrupulous competition within the power industry has manifested itself through corporate espionage, attempts to manipulate power pricing, etc. (ENRON being only the most widely known example). Thus, for a shared computing infrastructure to succeed, it will need to have ironclad barriers preventing concurrent users from seeing one another's data and network traffic.

    The network, indeed, would be a shared resource even if grid operators were to run private, dedicated data centers. The problem here is that while one might imagine creating some form of separate Internet specifically for power industry use, the costs of doing so appear to be prohibitive. Meanwhile, the existing Internet has universal reach and is highly cost-effective. Clearly, just as the cloud has inadequacies today, the existing Internet raises concerns because of its own deficiencies. But rather than assuming that these rule out the use of the Internet for smart grid applications, we should first ask if those deficiencies could somehow be fixed. If the Internet can be enhanced to improve robustness (for example, with multiple routing paths), and if data is encrypted to safeguard it against eavesdroppers (using different keys for different grid operators), it is entirely plausible that the shared public Internet could emerge as the cheapest and most effective communication option for the power grid. Indeed, so cost-effective is the public Internet that the grid seems certain to end up using it even in its current inadequate form. Thus, it becomes necessary to undertake the research that would eliminate the technical gaps.

    We've discussed two aspects of the cloud in enough detail to illustrate the mindset with which one approaches these kinds of problems, using a market-based perspective to understand why cloud computing takes the form it does, and then using that same point of view to conceive of ways that technical improvements might also become self-sustaining cloud computing options once created, evaluated, and demonstrated in a convincing manner. But it is important to understand that these were just two of many such issues.

    Let's touch briefly on a few other important ones. Cloud computing is also peculiar in its access control and privacy capabilities [18][27][33]. Google's motto is "Don't be Evil", because in the cloud, the providers all must be trusted; if Google (or any of its thousands of employees) actually are evil, we may already be in a difficult situation. The cloud just doesn't have a serious notion of private data and, indeed, many in the industry have gone to lengths to point out that in a detailed, technical, legally binding sense, terms like privacy are very much up in the air today [33]. What precisely does it mean to ensure the privacy of an email, or a video, in a world where people casually send unencrypted messages over the public network, or share details of their personal histories with "friends" they know only as user-names on Facebook? So extreme is this situation, and so pervasive the reach of the cloud, that it is already possible that any technical remedy could be out of reach. At minimum, the law lags the technology [29]. An editorial in the New York Times goes further, suggesting that the era of individual privacy may already be over [27], a sobering thought for those who hope to live unobserved, private lives.

    Today's cloud technology is also weak in the area of reliability: the cloud is always up, but data centers often suffer from brief episodes of amnesia, literally forgetting something as soon as they learn it, and then (perhaps) rediscovering the lost information later. Sometimes, data is uploaded into a cloud, promptly lost, and never rediscovered at all. This can lead to a number of forms of inconsistency, a term used in the distributed computing community to refer to a system that violates intuitive notions of server correctness in ways that reveal the presence of multiple server replicas that are acting in uncoordinated ways, or using stale and incomplete data [4]. A consistency-preserving guarantee would eliminate such issues, but today's cloud systems manage well enough with weak consistency (after all, how much consistency is really required for a search query, or to play a video?). By imposing weak consistency as an industry standard, the cloud platforms become simpler and hence cheaper to build and to manage. Thus, yet again, we see economic considerations emerging as a primary determinant of what the cloud does and does not offer.

    The issue goes well beyond service consistency. Cloud computing also places far greater emphasis on the robustness of the data center as a whole than on the robustness of any of the hundreds of thousands of servers it may have within it: data centers casually shut servers down if they seem to be causing trouble. No reliability assumptions at all are made about client systems, in part because viruses, worms, and other malware have hopelessly compromised the technologies we run on client platforms. By some estimates [14][18], fully 80% of home computers are slaves in one or more Botnets, basically seeming normal (maybe slow) to the owner yet actually under remote control by shadowy forces, who can use the hijacked machines as armies in the Internet's version of warfare (for example, Estonia and Ukraine have both been taken off the network in recent years [14]), use them as host sites for illicit materials, or simply harness them as sources for waves of spam. In his fascinating analysis of the cyber-attack risks associated with network-based terrorism, Richard Clarke discusses the risks to today's power grid at some length [14]. In a nutshell, he shows that power control systems are poorly secured and can be attacked via the Internet or, using public information, attacked by cutting wires. Either outcome could be disastrous. Worst among his scenarios are attacks that use "logic bombs" planted long ahead of the event; he conjectures that such threats may already be widely disseminated in today's power grid control systems.

    Clearly, this situation will need to change. The smart grid will play a wide range of safety- and life-critical roles, and it is completely reasonable to invest more money to create a more robust technology base. For example, it is possible to use automated code verification techniques to prove that modest-sized computing systems are correct. We can use hardware roots of trust to create small systems that cannot be compromised by viruses. By composing such components, we can create fully trustworthy applications. Such steps might not work for the full range of today's cloud computing uses (and might not be warranted for the cloud applications that run Twitter or Facebook), but with targeted investment, the smart grid community can reach a point of being able to create them and to deploy them into cloud environments.

    To summarize, let's again ask what cloud computing is really about. The past few pages should make it clear that the term is really about many things: a great variety of assumptions that can seem surprising, or even shocking, when stated explicitly. We have a model in which all data finds its way into one or more massive storage systems, which are composed of large numbers of individually expendable servers and storage units. Cloud platforms always guarantee that the data center will be operational, and try to keep the main applications running, but are far weaker in their guarantees for individual data items, or individual computations. The cloud security and privacy guarantees are particularly erratic, leaving room for cloud operators to be evil if they were to decide to do so, and even leaving open the worry that in a cloud shared with one's competitors, there might be a way for the competition to spy on one's proprietary data or control activities. Yet there seem to be few hard technical reasons for these limitations: they stem more from economic considerations than from science. Given the life-critical role of the power grid, some way of operating with strong guarantees in all of these respects would be needed, at least for the grid and for other safety-critical purposes.

    Figure 4: Summary of Assurance Properties

    SUMMARY OF CLOUD PROPERTIES

    CHARACTERISTICS OF TODAY'S CLOUD COMPUTING AND INTERNET INFRASTRUCTURE

    Inexpensive to own and operate. Economies of scale, sharing, and automation are pervasive within cloud systems and central to the model.

    Emphasis on rapid response and scalability. Modern cloud computing systems are designed to ensure that every request from the client to the cloud receives a timely response, even if the response might be "incorrect".

    Self-Managed, Power-Efficient, Self-Repairing. Cloud computing systems are astonishingly "green": they use power efficiently, keep machines busy, and dynamically adapt under all sorts of stresses, including load surges, failures, upgrades/downgrades, etc.

    Weak Consistency Guarantees. The embrace of the CAP theorem (see Section 6.4) has been used to justify a number of weak guarantees [31][37]. In a nutshell, most cloud services are capable of using stale data to respond to requests and the client is expected to deal with this. Cloud services are also unable to hide failures: the client must anticipate sudden faults and should reissue requests or otherwise compensate to mask such events.

    Internet as a weak point. The modern Internet experiences a surprising number of brief outages. Cloud computing systems are expected to ride them out. Multi-homing is offered for the cloud but not the average client (a cloud can be addressed by two or more distinct IP addresses), but we lack true multi-path routing options, so even with multi-homing, some clients may experience long periods of disrupted connectivity.
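    The client-side obligation noted under Weak Consistency Guarantees (anticipate sudden faults and reissue requests) can be illustrated with a minimal sketch. Nothing here comes from a specific cloud API: the function name call_with_retry, the number of attempts, and the backoff delays are assumptions chosen for illustration only.

```python
import time


def call_with_retry(request, attempts=4, base_delay=0.05, sleep=time.sleep):
    """Reissue `request()` until it succeeds or `attempts` is exhausted.

    `request` is any zero-argument callable that raises ConnectionError
    on a transient fault (a hypothetical stand-in for a cloud call).
    The delay doubles after each failure (exponential backoff).
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return request()
        except ConnectionError as err:
            last_error = err
            # Wait before reissuing, except after the final attempt.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

    Note that a retry loop of this kind masks only brief faults; it does nothing about a reply that is timely but stale, which the client must handle separately.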


    5 Three Styles of Power Computing

    We now concretize the foregoing discussion by grouping smart grid computing into three loosely defined categories. These are as follows:

    i. Applications with weak requirements. Some applications have relatively relaxed needs. For example, because it takes a long time to install new transmission lines, the maps of the physical infrastructure in a power delivery region that some applications maintain will change relatively rarely, much as road maps rarely change. Such applications can be understood as systems that provide guarantees, but against easy constraints. Today's cloud is well matched to these uses.

    ii. Real-time applications. This group of applications needs extremely rapid communication, for example to move sensor readings or SCADA control information fast enough to avoid actions based on stale data. Some studies suggest that for many SCADA control policies, even 50 ms of excess delay relative to the minimum can be enough to result in incorrect control decisions [23][25][1]. Today's cloud is tuned to provide fast responses, but little attention has been given to maintaining speed during failures of individual server nodes or brief Internet connectivity disruptions.

    iii. Applications with strong requirements. A final class of applications requires high assurance, strong access control and security policy enforcement, privacy, fault-tolerance, consistent behavior over collections of endpoints at which actions occur, or other kinds of properties. We will argue that the applications in this class share common platform requirements, and that those differ from (are incomparable with) the platform properties needed for real-time purposes [4][5][36][23]. Today's cloud lacks the technology for hosting such applications.
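    The real-time category (ii) turns on rejecting data that arrives too late. A minimal sketch of such a staleness check follows. The 50 ms excess-delay budget echoes the figure quoted above; the Reading type, the usable function, and the 10 ms minimum delay are hypothetical values introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    value: float    # e.g. a measured line frequency (hypothetical)
    sent_at: float  # sender timestamp in seconds


def usable(reading, now, min_delay=0.010, excess_budget=0.050):
    """Return True if the reading is fresh enough to act on.

    A reading is accepted only if its age does not exceed the assumed
    minimum network delay plus the excess-delay budget.
    """
    age = now - reading.sent_at
    return age <= min_delay + excess_budget
```

    For instance, usable(Reading(60.0, 0.0), now=0.040) accepts a reading that is 40 ms old, while the same reading at now=0.100 is rejected. A real deployment would also need synchronized clocks, which this sketch simply assumes.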

    We've argued that the cloud takes the form seen today for economic reasons. The industry has boomed, and yet has been so focused on rolling out new competitively exciting technologies and products that it has been limited by the relative dearth of superb engineers capable of creating and deploying new possibilities. The smart grid would have a tough time competing head to head for the same engineers who are focused on inventing the next Google, or the next iPad. However, by tapping into the academic research community, it may be possible to bring some of the brightest minds in the next generation of researchers to focus on these critical needs.

    Figure 5 summarizes our observations. One primary conclusion is that quite a bit of research is needed simply to clarify the choices we confront. Yet the broader picture is one in which a number of significant technology gaps clearly exist. Our strong belief is that these gaps can be bridged, but we also see strong evidence that today's cloud developers and vendors have little incentive to do so and, for that reason, that a watch-and-wait approach would not succeed.


    Figure 5: Cloud-Hosted Smart Grid Applications: Summary of Assurance Requirements

    Notes:

    (1) Some prototypical smart home systems operate by using small computing devices to poll cloud-hosted web sites that track power pricing, then adapt actions accordingly. However, not all proposed home-adaptation mechanisms are this simple; many would require closer coordination and might not fit the current cloud model so closely.

    (2) Concerns here include the risk that disclosure of too much information could give some producers opportunities to manipulate pricing during transient generation shortages, and concerns that without publishing information about power system status it may be hard to implement wide-area contracts, yet that same information could be used by terrorists to disrupt the grid.

    (3) Further research required to answer the question.


    6 Technical Analysis of Cloud Computing Options

    Some technical questions need more justification than was offered in the preceding pages. This section undertakes a slightly deeper analysis on a few particularly important issues. We reiterate claims made earlier, but now offer a more specific explanation of precisely why these claims are valid and what, if anything, might be done about the issues identified.

    6.1 Rebooting a cloud-controlled smart grid

    One place to start is with a question that many readers are no doubt puzzled by: the seeming conundrum of implementing a smart grid control solution on top of an Internet that would be incapable of functioning without power. How could one restart such a system in the event of a loss of regional power? There are two basic elements to our response.

    First: geographic diversity. Cloud computing makes it relatively easy to replicate control functionality at two or more locations that operate far from one another and hence, if one is lost, the other can step in. As for the Internet, it automatically reroutes around failures within a few minutes. Thus, for many kinds of plausible outages that impact a SCADA system at one location, having a software backup at a modest distance is sufficient: shipping photons is cheap and fast. In the Internet, nobody knows if their SCADA system is running next door, or two states over. Geographic diversity is also interesting because, at least for cloud operators, it offers an inexpensive way to obtain redundancy. Rather than building dual systems, as occurs in many of today's SCADA platforms for the existing power grid, one could imagine cloud-hosted SCADA solutions that amortize costs in a similar manner to today's major cloud applications, and in this way halve the cost of deploying a fault-tolerant solution.

    But one can imagine faults in which a remote SCADA platform would be inaccessible because the wide-area network would be down, due to a lack of power to run its routers and switches. Thus, the second part of the answer involves fail-safe designs. The smart grid will need to implement a safe, "dumb" mode of operation that would be used when restarting after a regional outage and require little or no fine-grained SCADA control. As the system comes back up, more sophisticated control technologies could be phased back in. Thus, the seeming cycle of dependencies is broken: first, one restores the power; next, the Internet; last, the more elaborate forms of smart behavior.
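    The two-part answer above (geographic diversity, then a fail-safe fallback) can be sketched as a simple mode-selection rule. This is an illustration under stated assumptions rather than a design from the text: the Mode names, the choose_mode function, and the four boolean health inputs are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    FAILSAFE = "failsafe"  # safe, "dumb" local operation, no fine-grained control
    BACKUP = "backup"      # geographically remote replica of the control service
    PRIMARY = "primary"    # normal cloud-hosted control


def choose_mode(power_ok, network_ok, primary_ok, backup_ok):
    """Pick an operating mode, restoring capabilities in dependency order.

    Power comes first, then the network, and only then the more
    elaborate cloud-hosted control.
    """
    if not power_ok or not network_ok:
        return Mode.FAILSAFE
    if primary_ok:
        return Mode.PRIMARY
    if backup_ok:
        return Mode.BACKUP
    return Mode.FAILSAFE
```

    With power and network restored but the primary site unreachable, choose_mode(True, True, False, True) selects the backup replica; if the wide-area network is down, the device stays in fail-safe mode regardless of server health.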

    6.2 Adapting standard cloud solutions to support more demanding applications

    We've repeatedly asserted that the cloud is cheap. But why is this the case, and to what extent do the features of today's cloud platforms relate to the lower cost of those platforms?


    Cloud computing can be understood as an approach that starts with client-server computing as its basis, and then scales it up dramatically: whereas server systems of the past might have run on 32 nodes, cloud systems often have hundreds of thousands of machines, each of which may have as many as 8 to 16 computational cores. Thus a cloud computing system is a truly massive structure. Some are as large as 4-5 football fields, packed so densely with computing and storage nodes that machines are purchased by the container truckload and the entire container is literally plugged in as a unit. Yet as vast as these numbers may be, they are dwarfed by the even larger number of client systems. Today, it is no exaggeration to say that every laptop, desktop, pad, and even mobile telephone is a cloud-computing client system. Many have literally dozens of cloud applications running at a time. Thus the cloud is a world of billions of end-user systems linked, over the Internet, to tens of millions of servers, residing in data centers that individually house perhaps hundreds of thousands or millions of machines.

    The cost advantage associated with this model relates to economies of scale. First, simply because of their scale, cloud computing systems turn out to be remarkably inexpensive to own and operate when compared with a small rack of servers such as one finds in most power industry control centers. James Hamilton, in his widely cited blog at http://mvdirona.com, has talked about the cost of a cloud. He concludes that relative to other types of scalable infrastructure, the overall cost of ownership is generally a factor of 10 to 15 lower when all costs are considered (human, infrastructure, servers, power, software development, etc.). This is a dramatic advantage.

    Cloud systems also run "hot": with buildings packed with machines, rather than humans, the need for cool temperatures is greatly reduced. The machines themselves are designed to tolerate these elevated temperatures without an increased failure rate. The approach is to simply draw ambient air and blow it through the data center, without any form of air conditioning. Interior temperatures of 100+ °F are common, and there has been talk of running clouds at 120 °F. Since cooling costs money, such options can significantly reduce costs.

    Furthermore, cloud systems often operate in places where labor costs and electric power costs are low: if a large power consumer is close to the generator, the excess power needs associated with transmission line loss are eliminated and the power itself becomes cheaper. Thus, one doesn't find these systems in the basement of the local bank; they would more often be situated near a dam on a river in the Pacific Northwest. The developers reason that moving information (such as data from the client computing system) to the cloud, computing in a remote place, and moving the results back is a relatively cheap and fast option today, and the speed and growth trends of the Internet certainly support the view that as time passes, this approach might even do better and better.

    6.3 The Internet as a weak link

    We've asserted that the Internet is unreliable, yet this may not make sense at first glance; all of us have become dependent on a diversity of Internet-based mechanisms.


    Yet upon reflection, the concern makes more sense: anyone who uses an Internet radio, or who owns a television adapter that supports watching movies on demand, quickly realizes that while these technologies usually are quite robust, sometimes outages do occur. The authors of this white paper own a number of such technologies and have sometimes experienced multiple brief outages daily, some lasting just seconds, and others perhaps minutes. Voice over IP telephony is a similar experience: users of Skype think nothing of needing to try a call a few times before it goes through. Moreover, all of these are consequences of mundane issues: studies reveal that the Internet glitches we've been talking about are mostly triggered by operator error, brief load surges that cause congestion, or by failures of the routers that support the network; a typical network route today passes through 30 or more routers and when one goes offline, the Internet may need as much as 90 seconds to recover full connectivity. Genuinely long Internet outages have occurred more rarely, but they do happen from time to time, and the root causes can be surprising: in one event, an undersea cable got severed off Egypt, and India experienced disrupted network connectivity for several days [1].

    When the Internet has actually come under attack, the situation is much worse. Experience with outright attacks on the network is less limited than one might realize: recent events include so-called distributed denial of service attacks that have taken entire small countries (such as Estonia) off the network for weeks, disrupted government and military web sites, and harassed companies like Google (when that company complained about China's political policies recently). A wave of intrusions into Department of Defense (DOD) classified systems resulted in the theft of what may have been terabytes of data [14]. Researchers who have studied the problem have concluded that the Internet is really a very fragile and trusting infrastructure, even when the most secure protocols are in use. The network could be literally shut down, and there are many ways to do it; some are entirely based on software that can be launched from anywhere in the world (fortunately, complex software not yet in the hands of terrorists); other attacks might deliberately target key components such as high-traffic optical cables, using low-tech methods such as bolt cutters. Thus any system that becomes dependent upon the Internet represents a kind of bet that the Internet itself will be up to the task.

    Thus the Internet is one weak link in the cloud computing story. We tolerate this weak link when we use our web phones to get directions to a good restaurant because glitches are so unimportant in such situations. But if the future smart grid is to be controlled over a network, the question poses itself: would this be the Internet, in a literal sense? Or some other network to be constructed in the future? On this the answer is probably obvious: building a private Internet for the power grid would be a hugely expensive proposition. The nation might well contemplate that option, but when the day comes to make the decision, we are not likely to summon the political will to invest on the needed scale. Moreover, that private Internet would become an extension of the public Internet the moment that some enterprising hacker manages to compromise even a single machine that has an Internet connection and also has a way to talk to the power network.

    This is why we've concluded that the best hope is for a technical advance that would let us operate applications that need a secure, reliable Internet over today's less secure, less reliable one. Achieving such a capability would entail improving handling of failures within today's core Internet routers (which often are built as clusters but can be slow to handle failures of even just a single router component), and also offering enhanced options for building secure routes and for creating redundant routes that share as few links as possible, so that if one route becomes disrupted or overloaded, a second route might still be available. In addition, the power grid can make use of leased connections to further improve reliability and performance.
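    The idea of redundant routes that share as few links as possible can be illustrated with a small sketch. It uses a simple greedy heuristic (find a shortest route, remove its links, search again), which is our choice for illustration and is not guaranteed to find a link-disjoint pair whenever one exists. The function names and the toy topology in the usage note are hypothetical.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search over undirected links given as (a, b) pairs."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def two_routes(links, src, dst):
    """Return a primary route and a backup route sharing no link with it.

    The backup is None when no route survives removal of the primary's links.
    """
    first = shortest_path(links, src, dst)
    if first is None:
        return None, None
    used = {frozenset(pair) for pair in zip(first, first[1:])}
    remaining = [link for link in links if frozenset(link) not in used]
    return first, shortest_path(remaining, src, dst)
```

    On a toy network with links control-r1, r1-substation, control-r2, r2-r3, and r3-substation, the primary route runs through r1 and the backup through r2 and r3, so the loss of any single link leaves one route intact.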

    6.4 Brewer's CAP Conjecture and the Gilbert/Lynch CAP Theorem

    We've discussed the relatively weak consistency properties offered by today's cloud computing platforms and even commented that cloud providers "embrace inconsistency" as a virtue [31][37]. Why is this the case, and can we hope to do anything about it? Cloud computing systems are so massive (and yet built with such relatively "weak" computers) that the core challenge in building cloud applications is to find ways to scale those applications up, so that the application (a term that connotes a single thing) might actually be implemented by thousands or even tens of thousands of computers, with the users' requests vectored to an appropriate machine.

    How can this form of scaling be accomplished? It turns out that the answer depends much on the extent to which different user systems need to share data:

    At the easiest end of the spectrum we find what might be called "shared nothing" applications. A good example would be the Amazon shopping web pages. As long as the server my computer is communicating with has a reasonable approximation of the state of the Amazon warehouse systems, it can give me reasonable answers to my queries. I won't notice if a product shows slightly different popularity answers to two identical queries reaching different servers at the same time, and if the number of copies of a book is shown as 3 in stock, but when I place my order suddenly changes to 1, or to 4, no great harm occurs. Indeed, many of us have had the experience of Amazon filling a single order twice, and a few have seen orders vanish entirely. All are manifestations of what is called "weak consistency" by cloud developers: a model in which pretty good answers are considered to be good enough. Interestingly, the computations underlying web search fall solidly into this category, so much so that entire programming systems aimed at these kinds of computing problems have become one of the hottest topics for contemporary research; examples include MapReduce [16] and other similar systems, file systems such as Google's GFS [20] and the associated BigTable database layered on top of it [13], etc. These are systems designed with loose coupling, asynchronous operation and weak consistency as fundamental parts of their model.

    A slightly harder (but not much harder) problem arises in social networking sites like Twitter or Facebook where groups of users share data, sometimes in real time. Here, the trick turns out to be to control the network routing protocols and the so-called Domain Name Service (DNS) so that people who share data end up talking to the same server. While a server far away might pull up the wrong version of a page, or be slow to report a Tweet, the users talking to that single server would be unaware that the cloud has split its workload into perhaps millions of distinct user groupings.

    Gaming and Virtual Reality systems such as Second Life are similar to this second category of systems: as much as possible, groups of users are mapped to shared servers. Here, a greater degree of sophistication is sometimes needed and computer gaming developers publish extensively on their solutions: one doesn't want to overload the server, and yet one does want to support games with thousands of players. eBay faces a related challenge when an auction draws a large number of bidders. Such systems often play small tricks: perhaps not every bidder sees the identical bid sequence on a hotly contended-for item. As long as we agree on the winner of the auction, the system is probably consistent enough.

    Hardest of all are applications that really can't be broken up in these ways. Air Traffic Control would be one example: while individual controllers do "own" portions of the air space, because airplanes traverse many such portions in short periods of time, only an approach that treats the whole air space as a single place and shows data in a consistent manner can possibly be safe. The "my account" portion of many web sites has a similar flavor: Amazon may use tricks to improve performance while one shops, but when an actual purchase occurs, their system locks down to a much more careful mode of operation.
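    The second and third categories above rely on mapping each group of users who share data onto a single server. A minimal sketch of one way to do this is shown below, assuming a stable hash of a group identifier. The function name, the group identifier, and the server names are hypothetical, and real systems steer clients through DNS and routing rather than a lookup of this kind.

```python
import hashlib


def server_for_group(group_id, servers):
    """Map a sharing group to one server using a stable hash.

    Every member of the same group computes the same answer, so all of
    them talk to one server and see one consistent copy of the data.
    """
    digest = hashlib.sha256(group_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

    Two clients that both call server_for_group("feeder-12-households", servers) with the same server list reach the same machine. Plain modulo hashing reshuffles most groups whenever the server list changes; this sketch ignores that problem, and it offers nothing for the hardest category, where the data cannot be partitioned at all.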

    The trade-offs between consistency and scalable performance are sometimes summarized using what Eric Brewer has called the Consistency, Availability and Partitioning (CAP) theorem [11]. Brewer, a researcher at UC Berkeley and co-founder of Inktomi, argued in a widely cited keynote talk at PODC 2000 that to achieve high performance and for servers to be able to respond in an uncoordinated, independent manner to requests they receive from independent clients, those servers must weaken the consistency properties they offer. In effect, Brewer argues that weak consistency scales well and strong consistency scales poorly. A formalization of CAP was later proved under certain weak assumptions by MIT's Gilbert and Lynch, but data centers can often make stronger assumptions in practice, and consequently provide stronger properties. Moreover, there are many definitions of consistency, and CAP is only a theorem for the specific definition that was used in the proof. Thus CAP is something of a folk theorem: a convenient paradigm that some data centers cite as a reason for offering weak consistency guarantees (guarantees adequate for their own needs, although inadequate for high-assurance purposes), yet not a law of nature that cannot be circumvented under any circumstances.

    We believe that more investigation is needed into the scalability and robustness options that weaker consistency models might offer. CAP holds under specific conditions; perhaps data centers can be designed to invalidate those conditions most closely tied to the impossibility result. Hardware assistance might be helpful, for example in supporting better forms of cloud security. Thus CAP stands as an issue, but not one that should discourage further work.

    6.5 Hidden Costs: Security Implications of Weak Consistency

    Cloud security illustrates one of the dangers of casual acceptance of the CAP principles. We build secure systems starting with specifying a security policy that the system is expected to obey. Typically, these policies consist of rules and those rules are represented as a kind of database; the data in the database gives the logical basis for making security decisions and also identifies the users of the system and the categories of data. As the system runs, it can be thought of as proving theorems: Joe is permitted to access Sally's financial data because they are a couple; Sandra can do so because she is Sally's banker. John, Sally's ex-husband, is not permitted to access those records. The data evolves over time, and correct behavior of the system depends upon correct inference over the current versions of the underlying rules and the underlying data.

    Cloud systems have real difficulty with these forms of security, because the same embrace of weak consistency that makes them so scalable also implies that data may often be stale or even outright wrong when the system tries to operate on it. Perhaps some node will be slow to learn about Sally's divorce; maybe it will never learn of it. Cloud systems don't provide absolute guarantees about such things, on the whole, and this makes them easier to scale up. But it also makes them deeply, perhaps fundamentally, untrustworthy.

    The term "trustworthy" deliberately goes beyond security. Suppose that a smart grid control device needs to handle some event: perhaps line cycles drop or increase slightly, or a current surge is sensed. To coordinate the reaction appropriately, that device might consult with its cloud server. But even if connectivity is not disrupted and the cloud server is running, we run into the risk that the server instance that responds (perhaps one of a bank of instances that could number in the thousands) might have stale data and hence respond in an incorrect manner. Thus it is entirely possible for 99 servers to know about some new load on the grid, and yet for 1 server to be unaware of this, or to have data that is incorrect ("inconsistent") in a plethora of other ways.

    Cloud systems are also quite casual about restarting servers even while they are actively handling client requests; this, too, is part of the scalability model (it reduces the human cost of management, because one doesn't need to gracefully shut things down before restarting them or migrating them). Thus our smart grid control device might find itself working off instructions that reflect faulty data, or deprived of control in an abrupt, silent manner, or suddenly talking to a new controlling server with no memory of the recent past.
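    The scenario of 99 up-to-date servers and 1 stale one can be made concrete with a small sketch. It uses versioned replies and a read quorum, a standard technique that we bring in purely for illustration and that the discussion above does not itself propose. It assumes every update carries a monotonically increasing version number; the Versioned type and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    version: int  # monotonically increasing update counter (assumed)
    value: str


def quorum_read(replicas, read_quorum):
    """Ask `read_quorum` replicas and keep the reply with the highest version.

    `replicas` is a list of zero-argument callables returning Versioned.
    If an update reached W of N replicas and R + W > N, at least one of
    the R replies reflects that update.
    """
    if not 1 <= read_quorum <= len(replicas):
        raise ValueError("read_quorum must be between 1 and len(replicas)")
    replies = [replica() for replica in replicas[:read_quorum]]
    return max(replies, key=lambda reply: reply.version)
```

    With N = 100 and an update that reached W = 99 servers, consulting any R = 2 servers satisfies R + W > N, so the device cannot be misled by the single stale instance. The cost is an extra round of communication on every read, which is exactly the kind of coordination that weakly consistent cloud services avoid.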

    7 Pretty Good is Sometimes Good Enough

    Cloud computing is a world of very large scale systems in which most components are working correctly even if a few are lagging behind, working with stale data, restarting after an unplanned and sudden outage, or otherwise disrupted. Yet it is vital to realize that for many purposes these properties are good enough. Facebook, YouTube, Yahoo, Amazon, Google, MSN Live: all are examples of systems that host vast numbers of services that work perfectly well against this sort of erratic model. Google's difficulties repelling hacker attacks (apparently from China) do give pause; this event illustrates the downside of the cloud model; it is actually quite hard for Google to secure its systems for the same reasons we discussed earlier: security seems to be at odds with the mechanisms that make those systems scalable. Moreover, the cloud model would seem to create loopholes that hackers can exploit (including the massive and remote nature of the cloud centers themselves: ready targets for agents of foreign powers who might wish to intrude and introduce viruses or other undesired technical components).

    The frustration for many in the field today is that we simply don't know enough about what can be solved in the standard cloud model. We also don't know enough about mapping stronger models onto cloud-like substrates or onto the Internet. Could the same hardware that runs the Internet not host software that might have better network security and reliability characteristics? One would be foolish to assert that this cannot be done. Could the same platforms we use in cloud settings not support applications with stronger properties? Very possibly. We simply don't know how to do so, yet, in part for the reason just cited: Google, Microsoft, Yahoo, and others haven't had much need to do this, and so the huge investment that gave us the cloud hasn't seen a corresponding investment to create a highly assured cloud for mission-critical roles.

    Moreover, one can turn the problem on its head and as