Research directions in energy-sustainable cyber–physical systems

Sustainable Computing: Informatics and Systems 1 (2011) 57–74
Journal homepage: www.elsevier.com/locate/suscom

Survey article

Sandeep K.S. Gupta*, Tridib Mukherjee, Georgios Varsamopoulos, Ayan Banerjee
Impact Lab, School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA

Article history: Received 11 October 2010; received in revised form 28 October 2010; accepted 29 October 2010
Keywords: Cyber–physical systems; Sustainability; Model-based engineering

This work was funded in part by NSF (CNS#0855277, CSR#0834797, CNS#0831544), Intel Corp., Science Foundation of Arizona (SFAz), and Raytheon Corp.
* Corresponding author. Tel.: +1 4809653806. E-mail address: [email protected] (S.K.S. Gupta).
2210-5379/$ – see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.suscom.2010.10.003

Abstract

An overview of sustainable computing is provided and different approaches towards design and verification of energy-sustainable computing (i.e., sustainable computing from energy consumption perspective) are discussed for cyber–physical systems (CPSs), i.e., systems with strong coupling between computing components and non-computing processes in physical environment. A major issue in this regard is the inter-dependencies of the non-computing processes on the computing components and vice versa, and the verification of the CPSs' sustainability without real deployment. The trends and dependencies of energy consumption for both computing and non-computing components are conceptualized. Based on this conceptualization, CPS resource management algorithms are categorized according to: (i) computing workload execution and arrival profiles supported, (ii) knowledge of workload profiles during management decision making, (iii) support of power management in the computing components, and (iv) assumptions on non-computing process behavior. These categories are then discussed along with their pros and cons for two representative CPSs: data centers and body sensor networks (BSNs). A model based engineering approach is used to verify CPS sustainability before real deployment. Several research directions and open problems are further discussed for the design and verification of energy-sustainable CPSs.

© 2010 Elsevier Inc. All rights reserved.

1. Introduction

With the ongoing focus on environmental sustainability proliferating in different domains, sustainable computing, a.k.a. green computing, has been getting increased attention in recent years. There are three principal aspects of sustainable computing: (i) reduction of the energy required for running any computing infrastructure [1], e.g., energy-efficient management of data centers; (ii) ensuring longevity of computing equipment to reduce the need for its replacement [1], e.g., avoiding server breakdown in data centers by maintaining safe operating temperature; and (iii) ensuring energy consumption within the energy available from the renewable energy sources in the environment [2], e.g., sensors on the human body being powered by the energy generated from respiration, ambulation, and sunlight. This paper gives an overview of sustainable computing in general and focuses specifically on energy-sustainable computing, i.e., sustainable computing from the energy perspective. A major issue in addressing the different aspects of sustainable computing is the need for awareness of the non-computing processes in the physical environment, e.g., the dependency of the equipment longevity on environmental factors and the availability of energy from the environment.

Computing systems having strong coupling with the physical environment are referred to as cyber–physical systems (CPSs). These systems usually monitor, coordinate, and control non-computing processes. Recent advances in sensor technologies and embedded computing systems have led to a surge of research investigations on CPSs [3–5]. Examples of CPSs include: body sensor networks (BSNs) (i.e., networks of medical sensors worn on or implanted in the human body [6]) that interact with the human physiology (i.e., a non-computing process) to monitor physiological conditions (e.g., heart rate, pulse rate, blood glucose level); medical devices that interact with the human physiology to control physiological conditions (e.g., maintaining a certain level of drug concentration using infusion pumps [5]); autonomous vehicles that interact with the vehicle mechanics (i.e., a non-computing process) to monitor and control the vehicle's trajectory and dynamics; and disaster response systems that interact with various non-computing processes (e.g., human behavior, environment) to monitor critical events and coordinate proper response actions.

In addition to the functional interactions with the non-computing processes as demonstrated by the previous examples, interactions with the non-computing processes can aid in the sustainability from an energy perspective. In this regard, energy scavenging (i.e., a type of interaction) can be performed from various sources in the physical environment such as body heat, sunlight, ambulation, vibration, respiration, and so on [2,7]. Powering the
computing components from these sources reduces the demand for grid power or battery power, thus reducing the carbon emissions and improving the environmental sustainability in general.

Ideally, the CPS operations should be designed such that the required power can always be supplied from the scavenging sources. Towards this objective, one option is to reduce the power demand of the computing operations so that the demand is always within the available power (or the grid power demand is reduced as far as possible). In this regard, strategies for sustainable computing have focused on processor level power management schemes such as frequency control, voltage control, or sleep state scheduling [8], medium access control (MAC) sleep scheduling of the wireless radio [9,10] communication among the sensors, and amortization of the wireless communication energy in sensor networks with less expensive computation [11]. However, design of sustainable CPSs requires a holistic approach which is aware of the limited available energy from the scavenging sources. Also, the power management strategies have to take into consideration the dependency of the power consumption on the intensity of the workload. For example, systems may or may not be energy-proportional, i.e., they may or may not proportionally scale their energy consumption with respect to the workload [50].

The holistic cyber–physical perspective further helps in a proper awareness of the non-computing processes (other than the ones from which energy can be scavenged), which indirectly affect the sustainability of any computing infrastructure. For example, the cooling energy required to maintain safe operating temperatures in data centers can be reduced if the impact of the computation on the cooling need is properly understood [12]. Indeed, the total cost of ownership (TCO) of the data centers can be enormous because of a large amount of recurring energy cost, about half of which can be attributed to cooling [13]. As such, it is imperative to transcend the current sustainable computing practices of power management and server provisioning in data centers to a more holistic approach coordinating with the management of the cooling equipment.

Another major challenge in designing sustainable CPSs is the potential prohibitiveness of real life experimental evaluation. For example, building a data center to verify holistic management strategies can be cumbersome in terms of both time and resources. Further, many CPSs, e.g., BSNs and autonomous vehicles, can be hazardous to test in real situations due to the risks associated with their malfunction. Therefore, automated sustainability verification needs to be facilitated for CPSs. A well established methodology for such verification is model based engineering (MBE) [14]. Application of MBE in CPS, however, would require modeling of both the computing and non-computing processes along with their inter-dependencies. Any computing strategy would have to be analyzed for its effects on the sustainability, e.g., analysis of energy needs of both computing and non-computing processes for verification of energy sustainability. Such modeling and analysis have to capture possible spatio-temporal dynamics of the inter-dependencies, e.g., variation of the available energy from scavenging sources (such as respiration) with respect to time (since respiration rate may depend on physical activities [15]) and space (since different portions of the body may extract different amounts of energy [2]).

This paper intends to:

(i) conceptualize the trends and inter-dependencies of the power consumption in both computing and non-computing processes in a CPS;

(ii) categorize holistic resource management in CPSs that considers computing workload management, power management, and non-computing process management; and

(iii) identify research directions and open problems to design holistic resource management and facilitate model-based analysis of CPSs for sustainability verification.

Section 2 gives a brief overview of sustainable computing by discussing different perspectives towards sustainable computing and surveying the research directions taken with regard to these perspectives. Design and verification of energy-sustainable CPSs is then discussed in Section 3. The power characteristics of CPSs are theoretically conceptualized based on the dependencies of the power consumption among the computing and non-computing components. To achieve any level of energy sustainability, resource management algorithms in CPSs have to be aware of the behavior of non-computing processes and the impact of computing components on these processes. Such resource management algorithms are classified based on the workload arrival and execution profiles supported, knowledge of the workload during management decision making, support of power management in the computing components, and assumptions on the behavior of the non-computing processes. The theoretical conceptualization of the power characteristics and the pros and cons of different resource management classes are discussed for two representative CPSs: data centers (in Section 4) and BSNs (in Section 5). Section 6 discusses how model-based analysis can be performed for energy sustainability verification of CPSs, followed by various open problems and research directions in the design and verification of sustainable CPSs (in Section 7). Finally, Section 8 concludes the paper.

2. Sustainable computing

Sustainable computing can be defined from: (i) the energy perspective; and (ii) the equipment recycling perspective, as described below.

2.1. Energy perspective

From the energy perspective, sustainable computing, i.e., energy-sustainable computing, can be described as the balance between the power required for computation and the power available from renewable or green sources (i.e., sources in the environment such as solar power). For example, as shown in Fig. 1, if the power available from the energy sources is higher than the required power, then the computation can be performed without any power from the grid (or battery). However, both available and required power may vary over time (e.g., solar power is not available during the night and the power requirement depends on the time-varying computing operations performed). Computing operations become unsustainable if the required power is higher than the power available from the green sources (as indicated by the shaded regions in Fig. 1), in which case the extra (or remaining) power needs to be extracted from the grid or battery. Sustainability of computing operations from the energy perspective (i.e., energy sustainability) can be defined as follows:

Energy sustainability is the average percentage of energy used from the green sources to power all computing units.

Fig. 1. Profile of power required and power available from external (green) sources. Unsustainable operation can be caused by imbalance of available and required power, in which case energy needs to be supplied from the power grid or battery.

Fig. 2. Power imbalance in Fig. 1 can be addressed through storage of energy. At a given instance, the stored energy is the accumulation of the slack between available and required power in all the previous time instances. Energy from the external sources can be wasted because of the limit on the storage capacity.

In other words, energy-sustainable computing needs to ensure minimum energy requirement from the grid or battery (i.e., minimizing the areas of the shaded regions in Fig. 1). There can be a different manifestation of the definition depending on the availability of energy from green sources. For example, if there is no energy available from the green sources (i.e., all the computing operations have to be run from the grid or battery), then energy sustainability boils down to reducing the average energy required for the computing operations. In all other cases, the best energy-sustainable computing solution would be to ensure that all the computing operations can be powered by the energy generated from the green sources at all times.
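As a rough illustration of the definition above, the following sketch computes the percentage of the required energy that is covered by green sources over a sampled time horizon. It is only one possible reading of the definition: the function name `energy_sustainability`, the fixed sampling interval `dt`, and the assumption that there is no energy storage (green power not used immediately is lost) are all assumptions made for this example, not part of the paper.

```python
from typing import Sequence


def energy_sustainability(required: Sequence[float],
                          available: Sequence[float],
                          dt: float = 1.0) -> float:
    """Percentage of the required energy that is supplied by green sources.

    required[k] and available[k] are power samples (e.g., in watts) for
    time step k; dt is the (assumed constant) step length. No storage is
    modeled, so surplus green power in one step cannot cover a later deficit.
    """
    if len(required) != len(available):
        raise ValueError("power profiles must have the same length")
    total_energy = sum(r * dt for r in required)
    if total_energy == 0:
        return 100.0  # nothing to power, trivially sustainable
    # In each step the green sources cover at most what is required.
    green_energy = sum(min(r, a) * dt for r, a in zip(required, available))
    return 100.0 * green_energy / total_energy


# Hypothetical profiles: demand peaks while green supply stays flat.
required_w = [2.0, 4.0, 6.0, 4.0]
available_w = [3.0, 3.0, 3.0, 3.0]
share = energy_sustainability(required_w, available_w)
```

With these made-up profiles, 11 of the 16 required energy units come from green sources; the remaining 5 units correspond to the shaded regions of Fig. 1 and would have to come from the grid or battery.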

In such a case, energy sustainability can also be measured as the number of computing units which can be completely powered by the energy from green sources. There are different research directions in achieving energy-sustainable computing such that the need for grid and battery power is minimized:

(i) Energy storage: Energy storage devices can store energy whenever it is available from the green sources. There have been several energy storage techniques such as ultra-capacitors, compressed air storage, batteries, fuel cells, and flywheels [16–18]. Fig. 2 shows the variation of stored energy for the available and required power profiles in Fig. 1. The energy available from the storage device at any instance is the accumulation of the slack between the available and the required power in all the previous instances. Any slack at a given time gets stored in the storage device for later use. The stored energy can be used when the power requirement is higher than the power available from the green sources. During this period, the stored energy reduces. This energy can get replenished when the slack between the available and the required power increases. The stored energy is constrained by the energy capacity limit of the storage device. This limit may incur wastage of energy generated by the green sources (as shown by the shaded region in Fig. 2). The wastage can be significant if there is no replenishment in the later stages. Such a situation can lead to unsustainable operation if the power requirement is higher than the power available for long periods.

(ii) Reducing energy requirement: Another major research direction is to reduce the energy requirement. Reducing the energy requirement can either avoid unsustainable operation (when the energy requirement becomes always less than the energy available) or reduce the energy need from the power grid or battery (reducing the shaded areas in Fig. 1). The following are the different research requirements in this regard:


– Spatio-temporal distribution of operation: One way to achieve the reduction is through distribution of the computing operation in a spatio-temporal manner. Spatial distribution ensures that the computing operations are distributed in multiple computing units such that no unit gets overloaded. Spatial distribution is necessary to avoid high energy requirement in the bottleneck units, i.e., the ones being overloaded. Temporal distribution of computing operation is geared towards delaying the operations until the available power increases. However, such delaying of operations can undesirably affect the performance of the computing operations and is therefore constrained by the performance requirements (e.g., service level agreements (SLAs) in data centers). Spatio-temporal distribution of operation is widely known as workload (or job) management, i.e., the decision making to determine when (job scheduling) and where (job assignment or dispatching) to execute the computing workloads. Job scheduling and assignment problems are in general NP-hard [19]. In the case of online job scheduling, fast heuristic algorithms and policies are extensively used, such as first-come first-serve (FCFS) augmented with backfilling [20]. With respect to energy sustainability, previous research has focused on: (i) including economic models for job schedules [21]; (ii) avoiding or even preventing excessive heat conditions in data centers through job assignment algorithms [3,22,23], thus improving the sustainability (through reduction in cooling power requirement); and (iii) performing spatio-temporal job scheduling (i.e., integrated job scheduling and assignment decision making) in data centers [12,24,25].

– Computing power management: A widely used method to reduce the computing power requirement is by running the computing units at different power modes depending on the operations to perform. For example, a processor not performing any operation can be kept in sleep or hibernate mode to reduce power requirement. At the same time it is important to make enough computing units available so that the required computation can be performed. To consider both performance and energy consumption, the product of energy consumption and computation delay has been used for comparing energy efficiency of general-purpose microprocessors [26]. A survey on server power and energy management techniques has been performed by Bianchini and Rajamony in [27]. In this regard, server provisioning and consolidation have been well known and widely used approaches in modern data centers. For example, Freon-EC is an extension to the Freon power-aware management software, which adds power control [28,29]. In the case of Internet data centers, there are server provisioning schemes [24] that estimate the anticipated workload and use a small active server set while suspending the remaining servers. A similar concept exists in the domain of wireless communication where the wireless radio is turned on and off depending on whether there is any communication to be performed or not, respectively. An example application of this method is radio sleep scheduling in BSNs [9,10].

– Non-computing system management: The power requirement by the computing units is often complemented with the requirement from some associated non-computing processes. For example, the cooling power requirement in data centers is driven by the heat dissipated to run computing workloads in the servers. Sufficient cooling of the data center is needed to maintain a safe temperature (often determined by the redline temperature as indicated by the manufacturer) for server longevity (which in turn is an essential factor for sustainable computing from the equipment recycling perspective as indicated in Section 2.2). Further, the scavenging of power from the non-computing processes, e.g., solar power scavenging through the photo-electric effect, can determine how much power requirement can be sustained (as per Fig. 1).

(iii) Scavenging energy from different sources: Apart from reducing the power requirement, one other complementary option for energy-sustainable computing is to increase the power available. This requires identification of different potential energy sources and investigating different ways to scavenge energy from these sources. Roundy et al. [7] and Paradiso and Thad [2] provide a comprehensive list of sources for energy scavenging, such as body heat, sunlight, ambulation, vibration, respiration, and so on. Recent work by Sharma et al. at the HP labs [30,31] has shown how cow manure from dairy waste can be used to power data centers.
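The storage behavior described under direction (i) above can be sketched as a simple discrete-time simulation. This is a minimal illustration under stated assumptions, not the paper's model: the function name `simulate_storage`, the lossless storage device, the fixed step length `dt`, and the example profiles are all hypothetical.

```python
from typing import List, Sequence, Tuple


def simulate_storage(required: Sequence[float],
                     available: Sequence[float],
                     capacity: float,
                     dt: float = 1.0,
                     initial: float = 0.0) -> Tuple[List[float], float, float]:
    """Track stored energy, grid energy drawn, and green energy wasted.

    Positive slack (available - required) charges the storage up to its
    capacity; anything beyond the capacity is wasted. Negative slack is
    served from storage first, and the remainder comes from the grid.
    The storage is assumed lossless, which real devices are not.
    """
    stored = initial
    grid_energy = 0.0
    wasted_energy = 0.0
    trace = []
    for r, a in zip(required, available):
        slack = (a - r) * dt
        if slack >= 0:
            charge = min(slack, capacity - stored)
            stored += charge
            wasted_energy += slack - charge  # lost to the capacity limit
        else:
            deficit = -slack
            draw = min(deficit, stored)
            stored -= draw
            grid_energy += deficit - draw  # unsustainable part
        trace.append(stored)
    return trace, grid_energy, wasted_energy


# Hypothetical profiles: surplus in the first half, deficit in the second.
trace, grid, wasted = simulate_storage(
    required=[1.0, 1.0, 5.0, 5.0],
    available=[4.0, 4.0, 2.0, 2.0],
    capacity=4.0,
)
```

In this made-up run the storage fills to its capacity in the second step, so 2 units of green energy are wasted, and the later deficit empties the storage and still requires 2 units from the grid. A larger capacity would have avoided both, which is the trade-off that the shaded region of Fig. 2 depicts.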

This paper investigates energy-sustainable computing for CPSs. Before continuing in these directions in Section 3, the following subsection describes sustainable computing from the equipment recycling perspective.

2.2. Equipment recycling perspective

From the equipment recycling perspective, sustainable computing is defined as the reusability and longevity of the computing units. In this regard, the following are the different research directions:

(i) Maintaining safe operating condition: One way to achieve the longevity of the computing equipment is by maintaining safe operating condition. For example, as mentioned previously, the data centers need to ensure an operating temperature within the equipment redline temperatures. This is particularly important to increase (or maintain) the manufacturer specified mean time before failure (MTBF), and hence minimize the requirement for replacing the equipment. Common approaches to maintain safe operating temperature are based on two perspectives:

– Non-computing (cooling) perspective: A major practice over the years in data centers has focused on provisioning the cooling for the worst-case temperature scenarios [23,32,33], thus undesirably consuming high energy. More recent approaches have focused on dynamic control of the cooling units depending on the variation of the generated heat in the data center [25,34].

– Computing perspective: Another major management approach in data centers over the last decade has focused on manipulating the computing operations in the servers. Such manipulation involves resource management such as computing power management [28,29], thermal-aware server provisioning [24], and thermal-aware workload scheduling and dispatching [3,12,22–25]. It should be further noted that these approaches have an impact on the energy sustainability since they reduce the cooling demand and hence the cooling energy. Recent approaches have focused on integrating these approaches with dynamic control of the cooling units [25,34].

(ii) Designing sustainable computing platforms: Computing platforms have been developed to: (i) ensure sustainable and energy-efficient operations [26,35,36], and (ii) use eco-friendly materials [37–39].

Fig. 3. Example CPSs.

3. Energy-sustainable computing in CPS

3.1. Cyber-Physical Systems (CPS)

There are two types of components in any CPS: (i) computing and (ii) physical. Fig. 3 shows the deployment of the two representative CPSs: data centers (Fig. 3a) and BSNs (Fig. 3b). For example, data centers use raised floors and lowered ceilings for cooling air circulation, with the computing equipment (i.e., the computing components) organized in rows of racks arranged in an aisle-based layout, with alternating cold and hot aisles. The cooling of the data center room is done by the computer room air conditioners (CRAC), which supply cool air into the data center through the raised floor vents. The cool air flows through the chassis inlet and gets heated up by convection from the computing equipment, and hot air comes out of the chassis outlet. The hot air goes to the input of the CRAC, which cools it down. The CRAC along with the hot and cold air constitute the physical components.

A body sensor network (BSN) is a network of a heterogeneous set of medical devices that can sense, actuate, compute, and communicate with each other through a wireless channel. The architecture of a BSN is shown in Fig. 3b. The nodes (i.e., the devices) in a BSN can be broadly classified into two categories: (1) worker nodes, which are implanted or wearable medical devices with a low computing capability interfaced with sensors, actuators, and wireless transceivers (e.g., a Photoplethysmogram sensor interfaced with TelosB motes); and (2) base station, which has higher computation and communication capabilities (e.g., PDA) to disseminate and collect information to and from the worker nodes, respectively. Each node in a BSN has a set of neighboring nodes with which it can communicate through a one-hop wireless link. The worker nodes, base station and the inter-communication among them form the computing components, whereas the human body along with its physiology forms the physical component.

In general, Fig. 4 shows the functional architecture of a CPS. The computing components are responsible for executing the workload of a CPS. For example, in a data center, jobs submitted by users constitute the workload. For a BSN, workload includes sensing and communication of physiological signals. Both the computing and the physical components are powered by a set of energy sources, which themselves can be part of the physical environment. In a CPS, there are strong interactions between the computing components and the physical environment. The interactions can be bidirectional. Interactions from the computing components to the physical environment normally involve controlling of certain non-computing processes. An example of such interaction can be the control of blood glucose levels by an insulin pump. Interactions from the physical environment to the computing components can be of two types: direct and indirect. While indirect interactions mean adapting the computing operations depending on the behavior of the non-computing processes, direct interactions put an immediate dependency of the computing components on the physical environment (e.g., energy scavenging from replenishable sources in the physical environment). Ideally, an energy-sustainable CPS needs to minimize the energy requirements for its operations or at least make sure that the energy requirements do not exceed the available energy from the green sources in the physical environment. Therefore, it is important to understand the trends and dependencies of the power consumption in all the CPS components.

Fig. 4. CPS functional architecture.

3.2. CPS power characteristics

The power consumption of a CPS, $p_{cps}$, depends on the power consumption of both computing and non-computing (i.e., physical) components. The total power required by the computing components is referred to as the computing power, whereas the total power required by the non-computing components is referred to as the non-computing power. Based on these power requirements, $p_{cps}$ can be given as follows:

$$p_{cps} = \text{computing power} + \text{non-computing power} = \sum_{i \in C} p^c_i + \sum_{j \in NC} p^{nc}_j, \qquad (1)$$

where $C$ and $NC$ are the sets of computing and non-computing components, respectively, $p^c_i$ is the power consumption of computing component $i$, and $p^{nc}_j$ is the power consumption of the non-computing component $j$. The dependencies of both $p^c_i$ and $p^{nc}_j$ are described below.

(i) Computing power: The power consumption of any computing component, $i \in C$, depends on two factors: (i) the workload being executed at $i$ (i.e., $w_i$); and (ii) the power mode of $i$ (i.e., $\beta_i$). If the function $G^c_i : W \times M \to \mathbb{R}$ returns the power consumption, $p^c_i$, of computing component $i$ (where $W$ is the set of all possible workloads and $M$ is the set of all possible computing power modes) then $p^c_i$ can be obtained as follows:

$$p^c_i = G^c_i(w_i, \beta_i), \qquad (2)$$

where $w_i \in W$ and $\beta_i \in M$.

(ii) Non-computing power: The power consumption of non-computing components (e.g., cooling unit in data centers) depends on a set of non-computing parameters (e.g., air temperature at the input of cooling unit) of the component and a property set of the non-computing process (e.g., amount of heat extracted during the cooling process) performed by the component. If these sets are denoted by $P$ and $S$, respectively, then $p^{nc}_j$ of any component $j \in NC$ can be given as:

$$p^{nc}_j = G^{nc}_j(P, S), \qquad (3)$$

where $G^{nc}_j$ is a function such that $G^{nc}_j : \mathbb{R}^{|P|} \times \mathbb{R}^{|S|} \to \mathbb{R}$.
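Eqs. (1)-(3) can be read as a small composition of per-component power functions. The following Python sketch illustrates that reading only; the `server` and `cooler` models and all numeric values are hypothetical placeholders, not profiles taken from the paper.

```python
def cps_power(computing, non_computing):
    """Eq. (1): total CPS power as computing plus non-computing power.

    computing:     list of (G_c, w, beta) triples, one per component in C
    non_computing: list of (G_nc, P, S) triples, one per component in NC
    """
    p_c = sum(g(w, beta) for g, w, beta in computing)    # Eq. (2) per component
    p_nc = sum(g(P, S) for g, P, S in non_computing)     # Eq. (3) per component
    return p_c + p_nc


# Hypothetical G^c_i: 100 W idle plus 150 W at full load, 0 W when off.
def server(w, beta):
    return 0.0 if beta == "off" else 100.0 + 150.0 * w


# Hypothetical G^nc_j: power needed to extract the heat listed in S.
def cooler(P, S):
    (heat_extracted,) = S
    return heat_extracted / 4.0


total = cps_power(
    [(server, 0.5, "on"), (server, 0.0, "off")],
    [(cooler, [25.0], [175.0])],
)
```

Passing the functions as arguments keeps the sum in Eq. (1) independent of how each $G^c_i$ and $G^{nc}_j$ is eventually characterized.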


3.2.1. Impact of cyber–physical interactions on CPS power consumption

The non-computing parameter set, $P$, is affected by the computing operations because of the cyber–physical interactions. For example, the air temperature at the cooling unit depends on the heat generated in a data center room, which itself depends on the power consumed by the computing servers. Further, this dependency has spatio-temporal dynamics. For example, at a given time instant the impact from a server at one location (to the input air temperature of the cooling unit) may be different than that of a server with the same power characteristics but at a different location. This variation is driven by the recirculation pattern of the air in the data center room. Similarly, at a given instant, the control logic of an insulin pump may have different effect in different parts of the body depending on the drug diffusion rate. Further, at a given location the impact may vary with time. We denote as $F : (W \times \mathbb{R}^3)^n \to \mathcal{F}^{|P|}$ the impact function that maps the workload running on a computing component at a Euclidean location to the non-computing parameters in $P$, where $n$ is the total number of computing components and $\mathcal{F}$ is the set of all possible non-computing parameter functions $f(t, x, y, z)$.

3.2.2. Function characterization requirement

For any CPS, it is required to characterize the $G^c_i$ and $G^{nc}_j$ functions $\forall i \in C$ and $\forall j \in NC$, respectively, as follows:

– The characterization of the $G^c_i$ function can be performed based on experimental profiling with different workload and computing modes.
– The characterization of the $G^{nc}_j$ function has three basic steps:
  - identification of the different elements in the sets $P$ and $S$,
  - characterization of the impact function $F$, and
  - experimental profiling of $G^{nc}_j$ with different workload and computing modes.

Given the power characteristics of CPS, the following subsection discusses how different resource management strategies can impact the CPS power consumption.

3.3. CPS resource management

Given a deployment of the computing components, the CPS power consumption depends on three major types of resource management decision making:

(i) workload management, which determines the amount of workload in each computing component, thus affecting the computing power (as per Eq. (2));

(ii) computing power management, which determines the power modes of the computing components, thus affecting the computing power (as per Eq. (2)); and

(iii) non-computing component management, which determines the property set $S$ of the non-computing processes, thus affecting the non-computing power (as per Eq. (3)).

All the management decision making has to ensure that the service requirements (e.g., job throughput and turnaround time in data centers) meet the user expectations. The determination of workload at each computing component further affects the set of non-computing parameters $P$ as per the impact function $F$; this parameter set in turn affects the non-computing power as per Eq. (3). Thus, workload management can have an indirect effect on the non-computing power because of the cyber–physical interactions. Such effects impose fundamentally different considerations in the decision making of workload management in order to reduce energy consumption in CPSs. A more coordinated resource management is required where the workload management and non-computing component management need to be aware of their impact on the CPS power consumption.

3.4. Resource management algorithm classification

Resource management algorithms, which are aware of the non-computing processes and the impact of the computing units on these processes, can be classified based on: (i) support and knowledge of the workload; (ii) assumptions about the non-computing processes; and (iii) support for power management. The following subsections discuss these different categories and identify the distinctive algorithm classes in each of these categories.

3.4.1. Support and knowledge of different workload characteristics

As described in the previous section, workload plays an important role in the CPS power requirements. Workloads can be categorized based on their arrival and execution profiles. A workload can execute for a long duration in the scale of seconds, minutes, hours, or even days (e.g., long running scientific jobs in HPC data centers, signal processing and cryptographic operations in sensors); or it can be a stream of short requests (in the scale of milliseconds) such as web transactions and database queries. Further, the arrival of any of these workloads can be periodic or aperiodic. For example, in a BSN the workloads on the sensors are mostly periodic in nature, requiring the same operations (e.g., sensing, communication, and cryptographic operations) in predefined periods. An aperiodic workload, on the other hand, has no predefined period of arrival but arrives in an ad hoc manner.

Depending on the knowledge of the workload, a workload management algorithm can be online or offline. An offline algorithm has complete knowledge of the workload arrival and execution profiles, whereas an online algorithm makes decisions based only on the current knowledge. A prediction mechanism can be employed regarding the future workload. However, the accuracy of the prediction mechanism depends heavily on the repeating pattern of the workload. As such, in the rest of the paper we assume that any algorithm that only supports periodic workload (with the exact repetition of arrival rates and execution times) is inherently offline in nature since it can use the past information of the workload as the future knowledge. Any workload that is not exactly periodic but has some periodic pattern (e.g., web requests in Internet data centers) can be thought of as aperiodic; however, online algorithms can have better estimation of the future for such workloads, bringing them closer to their offline counterparts.

Resource management algorithms can be classified by the different types of workload they support and the knowledge of the workload’s arrival sequence. In this regard, there are three classification categories: (i) workload arrival profile (periodic or aperiodic), (ii) workload execution profile (long running or short running), and (iii) workload knowledge (online or offline). Table 1 summarizes these classes as a support matrix indicating the different workload arrival and execution profiles supported and whether the algorithms are online or offline. These different classes are further described as follows:

– Specific algorithms: This class of algorithms supports only a particular type of workload arrival and execution profile. Further, these algorithms can be either online or offline and are not flexible in their decision making when future workload information becomes available or is not available, respectively. The algorithms are named depending on the specific cases supported as shown in Table 1. For example, oNline algorithms supporting Long running and Aperiodic workload are referred to as LAN algorithms. Note that specific algorithms lack general applicability to different workloads since the awareness of the CPS power, $p_{cps}$, is based on a specific knowledge of the workload. An example in the LAF class is the data-centric routing algorithms for BSNs that attempt to minimize the communication energy [40]. The algorithms assume the complete knowledge of the workload and are hence offline. They are mainly used for medical applications featuring long running aperiodic signal processing jobs. However, many workloads for medical monitoring can be periodic in nature. Further, as mentioned previously, since periodic workloads (where the arrival and execution of workloads are repeated) are inherently offline in nature, there is no classification made for online algorithms supporting such workloads. An example algorithm for the LPF class supporting periodic offline workload is a MAC protocol for BSNs [41]. An important aspect in designing offline algorithms for aperiodic workload is the higher complexity while considering complete knowledge of the workload.

Table 1. Classification of CPS resource management algorithms based on their support and knowledge of workload characteristics. The capitalized letters in the sub-categories of supported workload, supported workload arrival, and workload knowledge are used for the abbreviated nomenclature of the algorithm classes. The symbol ‘*’ is used to denote that all cases in a category are supported. The last two columns show the specific algorithms in different classes from the two representative examples of BSNs and data centers (a ‘–’ means no algorithm in the corresponding class).

| Algorithm class | Long running | Short running | Periodic | Aperiodic | oNline | oFfline | Algorithms for BSN | Algorithms for data centers |
|---|---|---|---|---|---|---|---|---|
| Specific algorithms | | | | | | | | |
| LAN | √ | | | √ | √ | | – | – |
| SAN | | √ | | √ | √ | | – | – |
| LAF | √ | | | √ | | √ | Data-centric routing [40] | – |
| SAF | | √ | | √ | | √ | – | – |
| LPF | √ | | √ | | | √ | BSN MAC [41] | – |
| SPF | | √ | √ | | | √ | – | – |
| One* algorithms | | | | | | | | |
| *AN | √ | √ | | √ | √ | | – | – |
| *AF | √ | √ | | √ | | √ | Minimum communication [42] | – |
| *PF | √ | √ | √ | | | √ | P-M, NP-M | – |
| LA* | √ | | | √ | √ | √ | – | NBS-EATA [43] |
| SA* | | √ | | √ | √ | √ | – | – |
| L*F | √ | | √ | √ | | √ | – | SCINT [12], TASA [44] |
| S*F | | √ | √ | √ | | √ | – | – |
| L*N | √ | | √ | √ | √ | | – | ECTC, MaxUtil [45], MinHR [46], Proportional-Share [47] |
| S*N | | √ | √ | √ | √ | | – | GentleCool [48] |
| Two* algorithms | | | | | | | | |
| L** | √ | | √ | √ | √ | √ | – | FCFS-LRH, EDF-LRH, FCFS-XInt, EDF-XInt, FCFS-HTS, EDF-HTS [12] |
| S** | | √ | √ | √ | √ | √ | – | TAWD, TASP+TAWD [24] |
| *A* | √ | √ | | √ | √ | √ | – | – |
| **N | √ | √ | √ | √ | √ | | – | Mercury [28] |
| Three* algorithm | | | | | | | | |
| *** | √ | √ | √ | √ | √ | √ | NP-NM, drug delivery, reconfiguration [49] | – |

– One* algorithms: These algorithms support all possible cases in any one of the classification categories mentioned previously. The symbol ‘*’ is used to denote that it can support all the possibilities in a category. Following are the different types of One* algorithms:

- *AN and *AF: Online algorithms that support aperiodic arrival of both long running and short running workload are referred to as the *AN class of algorithms. Similarly, offline algorithms supporting aperiodic arrival of both long running and short running workloads are referred to as *AF algorithms. An example *AF algorithm for BSNs incorporates more computation in the sensors to minimize communication, which can consume orders of magnitude higher energy than computation [42]. A major challenge for these algorithms is to provide unified decision making for streams of short requests and individual long running jobs.

- *PF: Offline algorithms supporting periodic arrival of both long running and short running workload are referred to as the *PF class of algorithms. Since periodic workloads are inherently offline in nature, the online versions of these algorithms are not categorized. The challenges to address for the *PF algorithms are similar to those of the *AF algorithms. Since the applications in BSNs are mostly periodic in nature (e.g., periodic monitoring of the physiological signals), resource management algorithms for BSNs fall in this class (Table 1). These algorithms will be discussed in further detail in Section 5.3.

- LA* and SA*: Algorithms supporting aperiodic long running workload with or without the complete knowledge of future workload are referred to as LA* algorithms; similar algorithms supporting only short running workloads are called SA* algorithms. The principal challenge for these algorithms is to make independent decisions based on whatever workload information is available. These algorithms can yield higher benefits when more information is available on the workload. NBS-EATA is a sample LA* algorithm for cluster grids in data centers that assumes knowledge of workload execution time to ensure that the respective deadlines are met while assigning jobs to power-efficient servers.

- L*F, S*F, L*N, and S*N: Algorithms supporting long running aperiodic or periodic workload with complete knowledge of the workload are referred to as L*F algorithms; similar algorithms supporting only short running workloads are called S*F algorithms. The TASA [44] and SCINT [12] algorithms in the L*F class have been developed for data centers running long running high performance computing (HPC) jobs. The principal challenge for these classes of algorithms is to support both aperiodic and periodic workload with the same decision making. This challenge also persists for the online versions of these classes of algorithms, i.e., L*N and S*N. Several algorithms have been developed in these classes for data centers (see Table 1).

– Two* algorithms: These algorithms support all possible cases in any two of the three classification categories mentioned previously. Following are the different types of Two* algorithms:

- L** and S**: Algorithms supporting both periodic and aperiodic long running workload irrespective of the future knowledge are referred to as L** algorithms; similar algorithms that support only short running jobs are referred to as S** algorithms. The principal challenge is to make independent decisions based on whatever workload information is available. The problem is exacerbated since the workload can be either periodic or aperiodic. Distinguishing the periodic workload and then being aware of it while making decisions for the aperiodic workload is essential in these algorithms. Many resource management algorithms have been developed for data centers (Table 1). These algorithms will be discussed in Section 4.5.

- *A*: Algorithms supporting both long running and short running aperiodic workload irrespective of the future knowledge are referred to as the *A* algorithms.

- **N: Algorithms supporting both long running and short running periodic and aperiodic workload without any future knowledge are referred to as the **N algorithms. The Mercury software suite [28] for data centers falls under this category.

– Three* algorithms: These algorithms support all the possible cases of the three classification categories mentioned previously. The goal for a CPS designer is to employ a Three* algorithm. The *PF class of resource management algorithms for BSNs is extended to support aperiodic workload in an online fashion to design a Three* algorithm. Further, any automated drug delivery is supported in both periodic and aperiodic online manner. Such delivery can be performed for a long duration or a short duration. So any drug delivery falls under the Three* algorithm class (Table 1). These algorithms will be discussed in further detail in Section 5.3. Another Three* algorithm for online reconfiguration of BSNs [49] allows automatic redistribution of computation and communication among sensors and the base station.
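The naming rule behind the class list above can be sketched in a few lines of Python. This is only an illustration of the nomenclature: the function names are hypothetical, and the sketch does not enforce the convention that periodic-only workloads are treated as offline.

```python
def workload_class(long_running, short_running, periodic, aperiodic,
                   online, offline):
    """Return the three-character class name used in Table 1.

    Each position is a letter when exactly one option of the category is
    supported and '*' when both options are supported.
    """
    def symbol(first, second, first_letter, second_letter):
        if first and second:
            return "*"
        if first:
            return first_letter
        if second:
            return second_letter
        raise ValueError("at least one option per category must be supported")

    return (symbol(long_running, short_running, "L", "S")
            + symbol(periodic, aperiodic, "P", "A")
            + symbol(online, offline, "N", "F"))


def family(name):
    """Map a class name to its family by counting '*' characters."""
    return ["Specific", "One*", "Two*", "Three*"][name.count("*")]
```

For instance, an online algorithm for long running aperiodic workload maps to `LAN`, and an offline algorithm for long running workload with either arrival profile maps to `L*F`, a One* class.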

Apart from the workload based classification schemes, the resource management algorithms can be classified based on the assumption on non-computing state and support for the power management.

3.4.2. Assumption on non-computing state

Assumptions on the non-computing state determine the behavior of the impact function $F$. In this regard, there can be two states: steady-state and transient. A steady-state behavior of the non-computing process assumes that the parameter set $P$ is stabilized to a particular value (e.g., the steady-state temperature at the input of the cooling unit in a data center). A transient behavior is more dynamic in nature, where the continuous variation of the parameters in $P$ over time and space is considered. Transient behavior assumptions often provide more accurate predictions. It should be noted here that transient behavior encompasses steady-state for certain limiting conditions.

3.4.3. Power management support

Power management determines the mode of operation $\beta_i$ of each computing component $i$. These modes can impact the computing power (see Eq. (2)). Resource management algorithms are classified based on whether a constant mode is assumed or power management, i.e., dynamic variation of the modes in the computing components, is performed. The overall notational convention for the algorithm classes follows the three letter notation (for workload based classification) followed by a hyphen and two letters denoting the non-computing state assumption and power management support (in sequence). For example, a LAN class of algorithm in Table 1 that assumes Steady-state behavior and does Not perform power management is referred to as a LAN-SN algorithm. For algorithms that consider transient behavior, a ‘*’ is used since transient behavior encompasses steady-state behavior. Similarly, ‘*’ is used when power management is employed. For example, a LAN class of algorithm that assumes transient behavior and supports power management is referred to as a LAN-** algorithm.

Given the classification and the notations of different classes of resource management algorithms, the goal should be to design a Five* algorithm for resource management, i.e., the algorithm should: (i) be of Three* class when categorized based on workload support (and knowledge), (ii) assume transient behavior of the non-computing processes, and (iii) support power management. Sections 4 and 5 discuss the challenges in designing a Five* algorithm for specific CPSs such as data centers and BSNs, respectively. Also, the pros and cons of various algorithm classes will be discussed.
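As a rough illustration of the full notation, the Python sketch below appends the two-letter suffix to a three-character workload class and counts the stars. The function names are hypothetical; only the letter conventions come from the text.

```python
def full_class(workload_class, transient, power_management):
    """Build the full class name, e.g. 'LAN-SN' or 'LAN-**'.

    workload_class:   three-character name such as 'LAN' or 'L*F'
    transient:        True if transient non-computing behavior is assumed
    power_management: True if power modes are varied dynamically
    """
    if len(workload_class) != 3:
        raise ValueError("workload class must have exactly three characters")
    state = "*" if transient else "S"          # S = steady-state only
    power = "*" if power_management else "N"   # N = no power management
    return f"{workload_class}-{state}{power}"


def star_count(name):
    """Number of '*' symbols; a Five* algorithm has five."""
    return name.count("*")
```

Under this reading, `full_class("***", True, True)` yields `***-**`, the Five* target described above.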

3.5. Verification of CPS for energy sustainability

Verification of CPSs in terms of sustainability involves: (i) determining the energy consumption for long-term CPS operations; and (ii) ensuring that the energy consumption is within the energy supplied from the sources in the environment. An ideal way to perform the verification is through experimentation on actual deployment of a CPS or through accurate simulation of the system. Simulation based verification is widely used since the resources required to build an experimental test-bed may not be affordable. Both simulation and experimentation can also be used to characterize the various functions such as the $G^c_i$, $G^{nc}_j$, and $F$ functions (see Section 3.2). Sections 4 and 5 discuss how these functions can be characterized through real measurement and simulation based profiling for data centers and BSNs, respectively. Further, in many critical CPSs such as BSNs, verification may be required at the design time (without real deployment). Early design time verification has two advantages: (i) it avoids creating real test-scenarios putting lives at risk; and (ii) it provides a way to guarantee and certify the CPS behavior. Such certification methodology can be useful for the various regulating agencies (e.g., FDA approval of the medical devices). One way to perform early design time verification is through model based engineering (MBE). MBE is the method of developing behavioral models of real systems and analyzing the models for requirement verification. There are two main phases in MBE: (1) model development, and (2) model analysis. In the model development phase, a set of expected properties of the system is determined from the system requirements. An abstract modeling is further performed that generally involves capturing appropriate parameters whose variations can reflect the system behavior. Mathematical analysis (model analysis) is then performed on the abstract model to evaluate the expected properties and verify the system requirements. In this paper, we discuss MBE in verifying the CPSs’ energy sustainability at design time.

Fig. 5. Data centers can be modeled using the abstract holistic view of CPS as in Fig. 4.
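The two verification steps listed at the start of this subsection (determine the long-term energy consumption, then compare it with the energy supplied) can be sketched on sampled power traces. The Python sketch below is a simplification under stated assumptions: both traces are sampled at the same fixed interval, and surplus energy is kept in an ideal lossless store, which is not a model taken from the paper.

```python
from itertools import accumulate


def is_energy_sustainable(demand_w, supply_w, dt_s, initial_store_j=0.0):
    """Check that cumulative consumption never exceeds available energy.

    demand_w:        power drawn by the CPS at each sample (W)
    supply_w:        power harvested from the environment at each sample (W)
    dt_s:            sampling interval (s)
    initial_store_j: energy already stored at time zero (J), an assumption
    """
    if len(demand_w) != len(supply_w):
        raise ValueError("traces must have the same number of samples")
    consumed = accumulate(d * dt_s for d in demand_w)   # running energy use (J)
    supplied = accumulate(s * dt_s for s in supply_w)   # running harvest (J)
    return all(c <= initial_store_j + s for c, s in zip(consumed, supplied))
```

The traces themselves could come from deployment measurements, from simulation, or from an abstract model analyzed at design time; the check is the same in each case.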

4. Resource management to ensure energy sustainability of data centers

A data center is a manifestation of the CPS abstraction laid out in the previous sections. A data center consists of computing components (servers), non-computing components (power distribution and chillers), an energy supply most of which comes from the power grid, and of course it is immersed in a practically closed environment in which physical thermal phenomena take place, including the cooling cycle of the servers. Fig. 3a shows a typical layout of a data center along with the air input to and output from the cooling unit (CRAC). The mapping between a data center and the CPS functional architecture (in Fig. 4) is shown in Fig. 5.

4.1. Characterizing $G^c_i$ function

The $G^c_i$ function describes the power consumption of the equipment with respect to its utilization. Power profiling of computing equipment is a standard practice and there are several well-established methodologies for documenting the power consumption of a system with respect to its utilization.

Power profiling usually yields a “power curve” which consists of averaged power measurements at sample utilization points (idle, 10%, 20%, etc.). There are research studies that approximate the yielded curve to a polynomial, mostly as a linear function. Although a linear model is not always accurate [50], it has been extensively used due to its simplicity.

Under a linear model, a compute server $i$ consumes power with respect to its CPU utilization as follows:

$$G^c_i = a_i U_i + b_i,$$

where $b_i$ is the idle power of the server and $a_i$ is the slope of the line to the maximum power. The term $U_i$ denotes the CPU utilization ($0 \le U_i \le 1$). Note that, if $b_i = 0$, then the system exhibits ideal energy proportionality [50]. The total computing portion of the data center power is the sum of the individual components:

$$P^c = \sum_i G^c_i = \sum_i (a_i U_i + b_i).$$
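A minimal Python sketch of the linear model and its sum is given below. The coefficients in the example are made-up values for illustration, not measurements from the paper.

```python
def server_power(a, b, u):
    """Linear model G^c_i = a_i * U_i + b_i (watts).

    a: slope from idle to maximum power (W)
    b: idle power (W)
    u: CPU utilization, 0 <= u <= 1
    """
    if not 0.0 <= u <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    return a * u + b


def total_computing_power(servers):
    """P^c: sum of the linear model over (a_i, b_i, U_i) triples."""
    return sum(server_power(a, b, u) for a, b, u in servers)


# Two hypothetical servers: one half loaded, one fully loaded.
p_c = total_computing_power([(150.0, 100.0, 0.5), (200.0, 120.0, 1.0)])
```

With these example values the first server draws 175 W and the second 320 W, so `p_c` is 495 W.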

4.2. Characterizing $G^{nc}_j$ function

The $G^{nc}_j$ function describes the power consumption of the non-computing equipment with respect to the rest of the data center. For the case of the power distribution equipment, the power consumption depends on the power drawn by the computing equipment, i.e., $P = \{P^c\}$. If we assume a constant efficiency ratio of $\alpha$, then the power consumption of the power distribution equipment is:

$$G^{nc}_j(P, S) = G^{nc}_j(\{P^c\}, \{\alpha\}) = \alpha P^c.$$

For the case of the chillers, a.k.a. computer room air conditioners (CRACs) or heating ventilation air conditioners (HVACs), the power consumption depends on the input heat to be extracted divided by the coefficient of performance (CoP) of the chiller at the operating temperature:

$$G^{nc}_j = \frac{P^c}{\mathrm{CoP}(T_{input})},$$

where $T_{input}$ is the input (sensed) temperature. The coefficient of performance denotes the cooling efficiency of the chiller, and ideally is governed by the Carnot efficiency:

$$\mathrm{CoP} = \frac{T_{input}}{T_{input} - T_{output}},$$

where $T_{output}$ is the output (supply) temperature from the CRAC. For heat extractors that remove a roughly constant amount of heat, this efficiency translates into a quadratic curve (Fig. 6a). From the above, we can denote $G^{nc}_j$ as:

$$G^{nc}_j(P, S) = G^{nc}_j(\{P^c, T_{input}\}, \{\mathrm{CoP}(T)\}).$$
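The two non-computing power expressions above can be sketched in Python as follows. The sketch assumes absolute temperatures in kelvin and uses the ideal CoP expression exactly as stated in the text; a real CRAC would normally be described by a measured CoP curve instead, and the example values are illustrative only.

```python
def distribution_power(p_c, alpha):
    """Power drawn by the power distribution equipment: alpha * P^c."""
    return alpha * p_c


def ideal_cop(t_input_k, t_output_k):
    """Ideal CoP = T_input / (T_input - T_output), temperatures in kelvin."""
    if t_input_k <= t_output_k:
        raise ValueError("input temperature must exceed output temperature")
    return t_input_k / (t_input_k - t_output_k)


def chiller_power(p_c, cop):
    """Power drawn by the chiller to extract heat P^c at a given CoP."""
    return p_c / cop


# Illustrative values: 3 kW of heat, 300 K input air, 290 K supply air.
cop = ideal_cop(300.0, 290.0)
p_chiller = chiller_power(3000.0, cop)
p_dist = distribution_power(3000.0, 0.1)
```

Keeping the CoP as a separate argument of `chiller_power` makes it straightforward to swap the ideal expression for a profiled curve.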

Moreover, CRACs feature multiple modes of cooling, i.e., they can cool at different compression ratios.1 The different compression modes are triggered by a thermostat which senses the input temperature and compares it to its trigger point. Assuming the same constant flow, the trigger point can be easily translated into input heat (Fig. 6b).

1 The standard cooling technology is vapor-compression cooling, where the coolant vapor is compressed at one phase of the cooling cycle and then decompressed to produce the cooling effect.

Fig. 6. Coefficient of performance for a commercial CRAC.

Calculation of $T_{input}$ can be done using the thermodynamic formula:

$$T_{input} = \frac{P^c}{c_q \rho f} + T_{output},$$

where $c_q$ is the specific heat of air (at $T_{input}$), $\rho$ is the mass density of air and $f$ is the flow of air through the CRAC. Using the above equation, we can replace $T_{input}$ on the x axis with the input heat (Fig. 6a).
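The relation between input heat and input temperature follows from the heat balance across the CRAC: the heat carried by the air stream equals $c_q \rho f (T_{input} - T_{output})$. A small Python sketch of that balance is shown below; the default air properties (about 1005 J/(kg·K) and 1.2 kg/m³) are typical room-condition values assumed here for illustration, and the flow is taken in m³/s.

```python
def crac_input_temperature(p_c, t_output, c_q=1005.0, rho=1.2, flow=1.0):
    """Input air temperature from the heat balance across the CRAC.

    p_c:      heat to be extracted (W)
    t_output: supply temperature (same unit as the result)
    c_q:      specific heat of air, J/(kg*K)  (assumed typical value)
    rho:      mass density of air, kg/m^3     (assumed typical value)
    flow:     volumetric air flow, m^3/s
    """
    if flow <= 0.0:
        raise ValueError("air flow must be positive")
    return p_c / (c_q * rho * flow) + t_output


# 12.06 kW of heat at 1 m^3/s raises the air temperature by about 10 degrees.
t_in = crac_input_temperature(12060.0, 15.0)
```

Inverting the same relation gives the input heat for a given thermostat trigger temperature, which is the translation used for Fig. 6.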

4.3. Identifying P, S and F

In the discussion above, $P$ has been defined as:

$$P = \{P^c, T_{input}\}.$$

Although $P^c$ is easy to estimate, $T_{input}$ requires the knowledge of the heat distribution in the room along with the air flow patterns. In general, $T_{input}$ can be expressed as a weighted sum of the air temperatures from the heat sources in the room, namely the servers and the CRACs:

$$T_{input} = w_{c_1} T_{out,c_1} + w_{c_2} T_{out,c_2} + \cdots + w_{c_n} T_{out,c_n} + w_{nc_1} T_{out,nc_1} + \cdots + w_{nc_m} T_{out,nc_m},$$

if there are $n$ computing components and $m$ non-computing components.

The parameter $T_{input}$ in $P$ not only has a quantitative role in determining the CoP of the CRAC, but it also denotes the exergy of the incoming heat, i.e., the quality of the heat. The exergy determines how easy it is for the heat to be converted into useful work, which can in turn be used to cool down the data center or produce electricity. For example, heat supplied at around 98 °C is good to drive an absorption chiller, while heat at lower temperatures (circa 65 °C) can drive an adsorption chiller albeit at a lower CoP.

A factor against running the data center at high exergy temperatures is the redline temperatures specified by each equipment’s manufacturer. For example, most computing servers have a redline air-inlet temperature of 35 °C or less. The CRACs have to be set to an input temperature such that the air-inlet temperatures at the equipment do not exceed the respective redlines.

To estimate the air-inlet temperatures, we use an extension of the $w$ vector. This extension is the heat recirculation matrix $D$, each element $d_{ij}$ of which denotes by how much the temperature of an air inlet at server $j$ is affected by the heat produced at server $i$. Using this matrix and the $G^c_i$ at each server, we can compute the temperature vector at the air inlets of the data center equipment:

$$T_{inlet} = D \langle G^c_i \rangle.$$

One of the elements of the $T_{inlet}$ vector is the CRAC’s inlet temperature.
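The Python sketch below illustrates the heat recirculation computation and the redline check. It follows the element definition in the text, so the contribution to inlet $j$ is summed over the heat sources $i$ using $d_{ij}$. The optional `baseline` temperature and all numbers in the example are assumptions added for illustration; with `baseline=0.0` the function reduces to the product given above.

```python
def inlet_temperatures(D, power, baseline=0.0):
    """Inlet temperature at each position j: baseline + sum_i D[i][j] * power[i].

    D:        heat recirculation matrix, D[i][j] in degrees per watt
    power:    power (heat) produced at each source i, in watts
    baseline: common offset such as the supply temperature (an assumption)
    """
    if len(D) != len(power):
        raise ValueError("D needs one row per heat source")
    n_inlets = len(D[0])
    return [baseline + sum(D[i][j] * power[i] for i in range(len(power)))
            for j in range(n_inlets)]


def within_redline(t_inlet, redline=35.0):
    """True if no inlet temperature exceeds the redline."""
    return all(t <= redline for t in t_inlet)


# Hypothetical 2-server room with a 20 degree supply temperature.
D = [[0.01, 0.05],
     [0.03, 0.01]]
t_inlet = inlet_temperatures(D, [200.0, 100.0], baseline=20.0)
```

A placement algorithm could evaluate candidate workload assignments by recomputing `power` from the per-server power model and rejecting any assignment for which `within_redline` is false.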

Considering the above, a resource management algorithm’s objective (see Section 4.5 below) would be to try and allocate the workload and configure the computing equipment (power mode) and CRAC equipment (power mode $S$) in such a way as to maximize the exergy and keep all temperatures below the redline.

4.4. Profiling for $G^{nc}_j$ function

The $G^{nc}_j$ function can usually be profiled by experimentation. The non-computing component can be equipped with power meters and thermometers, and induced with varying thermal workload. Depending on the technology used, the instrumentation of the experiment could be complemented with flow meters to measure air flow or chilled water supply, or even sunlight intensity if the equipment uses solar power.

4.5. Resource management classes

Traditionally, resource management in data centers mainly constituted scheduling and distributing the workload. There are broadly two classes of data center clusters: those that service batch-based and relatively long-running workload units (jobs), with a time sensitivity of minutes or hours, and those that service short workload units (transactions), with a time sensitivity of a couple of seconds or less. Due to the difference in nature of the workload, data center management algorithms fall into one of the L**, S**, L*F, S*F, L*N, and S*N classes (as shown in Table 1).

In the batch-oriented data centers, jobs may spend considerable time in the queue before they get serviced. In that manner, scheduling algorithms combine a temporal placement logic (temporal scheduling) and a spatial placement logic (spatial placement or server assignment). An example of spatial-only placement logic is that of XInt [3], which assigns jobs to servers in such a way as to minimize the maximum $T_{input}$. An example temporal algorithm is first-come first-served (FCFS) with back-filling of jobs that fit into available servers.

Although there may be periodicity of workload, there is virtually no algorithm that assumes some form of periodicity. Example L** algorithms include FCFS-LRH, where LRH stands for least recirculated heat, FCFS-XInt, EDF-LRH and EDF-XInt, where EDF is an earliest-deadline first ordering of the arrived workload units [12]. These algorithms assume steady-state behavior and can also be integrated with power management. An L*F algorithm is SCINT (scheduling to minimize cross-interference) [12], which assumes an offline knowledge of the arrived jobs.

In the S** class, there is virtually no queuing of the workload units; they are directly passed to the compute nodes for servicing. An example of an S** algorithm is load balancing (LB), a.k.a. equal load balancing (ELB), which stochastically or by round-robin distributes the transactions among the servers. Thermal-aware workload distribution (TAWD) is another S** approach, which distributes the workload in such a way as to reduce the work done by the cooling units.
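The round-robin variant of equal load balancing can be sketched in a few lines of Python. This is a generic illustration of round-robin dispatch with hypothetical names, not the implementation used in the cited work; a thermal-aware scheme such as TAWD would replace the fixed rotation with weights derived from the thermal model.

```python
from itertools import cycle


def round_robin(transactions, servers):
    """Assign transactions to servers in a fixed rotating order.

    Returns a dict mapping each server to the list of transactions it
    receives; there is no queuing, each unit is dispatched on arrival.
    """
    if not servers:
        raise ValueError("at least one server is required")
    assignment = {s: [] for s in servers}
    for transaction, server in zip(transactions, cycle(servers)):
        assignment[server].append(transaction)
    return assignment
```

For example, five transactions dispatched over two servers alternate between them, so the first server receives three and the second receives two.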

Combination of scheduling with power management is a fairly recent trend in resource management in data centers. SCINT can be considered an L*F-S* algorithm, because it can produce a power schedule of what systems to turn off and when to turn them off, under a steady-state physical model. The FCFS-HTS (HTS stands for highest thermostat setting) and EDF-HTS [25] algorithms are L**-*N, because they assume a transient model and do not have a power management scheme. On the other hand, TAWD combined with thermal-aware server provisioning (TASP/TAWD) is an S**-S* algorithm, which makes a prediction of the workload intensity for a future workload and suspends a number of servers.

Other algorithms from the literature include TASA, which is a thermal-aware workload placement algorithm and falls under the L*F-SN class. MinHR is a workload placement algorithm that considers the thermal impact of the operation of servers in the room [46]. However, it does not require offline knowledge of the workload and falls under the L*N-SN class. Mercury is a power management software suite that adjusts the power of servers when they overheat [28]. It can support different workload arrival and execution patterns and falls under the **N-S* class. ECTC and MaxUtil are workload scheduling algorithms that try to consolidate tasks onto servers [45]. Their basic difference is the assumption of non-exclusiveness between tasks and servers. These algorithms fall under the L*N-** class. Proportional-Share in the Libra management software is a scheduling algorithm for assigning tasks to computers [47] and falls under the L*N-*N class. GentleCool [48] is a scheduling algorithm that decides on the CPU share distribution among virtualized machines on a physical computer. This algorithm falls under the S*N-SN class.

All these algorithms intend to reduce the power requirement of the data center. However, none are designed to use green energy sources. As such, from the definition of energy sustainability in Section 2.1, none of these algorithms are completely energy-sustainable in nature. However, these algorithms need to be compared in terms of the grid power consumption (which is the manifestation of energy sustainability when green sources are not available, as discussed in Section 2.1). In this regard, for long running workload, SCINT is the most energy-sustainable because of the offline knowledge of the workload during decision making and support of power management [12]. The L** class of algorithms has to compromise on energy sustainability to support online decision making. For short running workload, TASP/TAWD is the most energy-sustainable since it supports power management when compared to TAWD [24].
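The class strings quoted in this section can be collected into a small lookup to compare algorithms against a given scenario. The following Python sketch is illustrative only. It assumes that `*` in a class string means the algorithm supports any value in that position (the notation itself is defined earlier in the article), and the helper names `covers` and `candidates` as well as the fully specified scenario string are hypothetical.

```python
# Class strings as quoted in the text for each algorithm.
# Assumption: '*' means "any value in this position is supported".
CLASSES = {
    "SCINT": "L*F-S*",
    "FCFS-HTS": "L**-*N",
    "EDF-HTS": "L**-*N",
    "TASP/TAWD": "S**-S*",
    "TASA": "L*F-SN",
    "MinHR": "L*N-SN",
    "Mercury": "**N-S*",
    "ECTC": "L*N-**",
    "MaxUtil": "L*N-**",
    "Proportional-Share": "L*N-*N",
    "GentleCool": "S*N-SN",
}


def covers(class_code: str, scenario: str) -> bool:
    """Return True if each position of class_code is '*' or equals the scenario symbol."""
    if len(class_code) != len(scenario):
        raise ValueError("class strings must have equal length")
    return all(c == "*" or c == s for c, s in zip(class_code, scenario))


def candidates(scenario: str) -> list:
    """Names of the listed algorithms whose class string covers the scenario."""
    return sorted(name for name, code in CLASSES.items() if covers(code, scenario))
```

For instance, `candidates("LPF-SN")` lists the algorithms whose quoted class is compatible with that hypothetical scenario string; it says nothing about which of them consumes the least grid power.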

5. Resource management to ensure energy sustainability of BSNs

Fig. 7 shows the instantiation of the generic CPS functional architecture in Fig. 4 for the specific example of BSNs. The computing components in the BSN consist of sensor nodes or medical devices. The workload in BSNs is generally periodic and is known offline. For example, an infusion pump administers drug into the human body following a fixed schedule. Also, a health monitoring application such as Ayushman [4] has a deterministic workload. In Ayushman (as shown in Fig. 8), the sensors in the BSN sense physiological data for $t_s$ seconds and store them in local memory. After $t_s$ seconds they transfer the data to the base station in a single burst taking time $t_{Tx}$. Every communication is secured by encryption with a secret key which is established between each pair of BSN nodes. Key agreement between any two sensors is performed once a day using the Physiological value based Key Agreement (PKA) [51] protocol, each execution of which takes $t_{PKA}$ time.

Three types of non-computing units are considered in the BSN example: (1) the human body, whose physiology is controlled by the nodes, (2) energy scavenging sources such as a piezoelectric device on a shoe sole, which extract energy from the surrounding environment to provide operating power to the sensor nodes, and (3) medical actuators such as infusion pumps, which cause changes in human physiology according to commands from a computing system. The cyber–physical interactions between the computing and non-computing components are further threefold:

(i) Heat energy transfer from the sensor nodes to the human body, which causes a rise in body temperature.

(ii) Electrical charge transfer from energy scavenging sources to the sensor nodes, which provides the operating power for the nodes.

(iii) Chemical energy interaction such as diffusion of drug in human blood caused by an actuation decision from a computing unit such as an infusion pump controller.

Power profiling of a BSN involves three different aspects:

(i) Sensor node power profiling: The power consumption of the sensor nodes needs to be profiled for the different stages of the workload. This profiling is performed for different power modes of the sensor nodes and for different power management strategies. Thus, this stage requires feedback from the resource management stage as shown in Fig. 7.

(ii) Profiling of non-computing units: Different types of profiling are required for the human body, the scavenging sources and the medical actuators:
(a) The scavenging sources need to be profiled for the average amount of energy available per unit time.
(b) The human body needs to be profiled for its thermal properties, which will govern its temperature rise.
(c) The actuators need to be profiled for power dissipation due to the actuation process. For example, in case of an infusion pump the drug infusion process requires insertion of a needle in the human body. Friction of the needle with the tissue can lead to power dissipation.

(iii) Characterization of the cyber–physical interactions: Three types of cyber–physical interactions require profiling of three different processes: (1) the heat transfer from the sensor nodes to the human body, (2) the charging of a sensor node with scavenged energy, and (3) drug diffusion processes.

Fig. 7. BSNs can be modeled using the abstract holistic view of CPS as in Fig. 4.

The resource management stage consists of several power management strategies on the sensor node hardware, software, and also on the non-computing energy scavenging sources.

i. Sensor hardware power management: Power management strategies that can be used in each sensor are: (1) radio sleep scheduling, (2) processor level sleep mode scheduling, and (3) processor level frequency scheduling. Fig. 8 shows when each of these strategies can be employed for the Ayushman workload. For example, radio and processor can be put to sleep during the sensing period but have to be active during the data transmission and PKA stages. Further, frequency control can be performed during the PKA stage to reduce power consumption.

ii. Sensor software power management: Since communication is more expensive than computation, efficient scheduling of communication can achieve energy efficiency. Thus, in Ayushman we consider storing data locally and transmitting in bulk. This allows radio shutdown during the sensing phase and conservation of energy. Further, the PKA key agreement phase is also scheduled once a day to maintain freshness of the keys.

iii. Non-computing component management: In case of the infusion pump, controlling the frequency of infusion can reduce the power dissipated due to the actuation process.

Fig. 8. BSN workload and application of power management strategies.
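The effect of the sleep scheduling strategies listed above can be illustrated with a time-weighted average power over one sense-and-transmit cycle. The Python sketch below is a minimal illustration under stated assumptions: the `Phase` type, the function name, and all numeric values used with it are hypothetical, and a sleeping radio is assumed to draw no power.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One stage of the workload with its (hypothetical) power states."""
    name: str
    duration_s: float
    radio_on: bool
    cpu_active: bool


def average_power(phases, p_radio, p_cpu_active, p_cpu_sleep):
    """Time-weighted average power (W) over the given phases.

    Assumption: the radio draws p_radio when on and nothing when asleep.
    """
    total_time = sum(p.duration_s for p in phases)
    if total_time <= 0:
        raise ValueError("phases must have positive total duration")
    energy = 0.0
    for p in phases:
        radio = p_radio if p.radio_on else 0.0
        cpu = p_cpu_active if p.cpu_active else p_cpu_sleep
        energy += p.duration_s * (radio + cpu)
    return energy / total_time
```

Comparing a schedule that sleeps during sensing with one that keeps radio and processor on in every phase gives a rough feel for the benefit of the strategies; measured power values would be needed for any real conclusion.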

Given the BSN and its power profile under the several management strategies, model based verification of its sustainability is performed. For this purpose, architectural models of the sensor node and the scavenging sources are developed to verify the sustainability. Further, to determine the thermal effects on the human body, formal models were developed, which characterize the cyber–physical interactions.

5.1. Characterizing $G^{c}_{i}$ function

The function $G^{c}_{i}$ is obtained through experimental profiling of the sensor nodes. In case of a BSN the profiling experiments were performed for two different platforms: Intel Atom and TelosB mote based nodes. The Intel Atom processor provides different modes of operation, which have different clock frequencies, while the TelosB motes only have a single operating mode. For the Atom processor, the power consumption for the most compute intensive operation in Ayushman, PKA, is experimentally obtained for different operating frequencies as shown in Table 2. In the table, percentage throttling means the percentage by which the operating frequency is reduced from the maximum.

Further, the Atom processor supports sleep modes where the power consumption is very low. Table 2 characterizes the function $G^{c}_{i}$ for the two platforms. It can be clearly seen that $G^{c}_{i}$ depends on the workload and the operating mode of the processor.

Table 2
Power consumption of Atom for Ayushman workload ($w_i$).

Percentage throttling    Power consumption (W)
0                        0.191
13                       0.1864
25                       0.17
37, 50, 62, 75           0.167
87                       0.164
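The relative power reduction obtained by throttling can be read off Table 2 directly. The short Python sketch below encodes the table values as read from the table above (the dictionary and function names are hypothetical) and computes the saving relative to the unthrottled case. Note that the table reports power only; the energy of a PKA run also depends on its execution time at each frequency, which is not given here.

```python
# Power consumption (W) per percentage throttling, as read from Table 2.
TABLE2_POWER_W = {
    0: 0.191,
    13: 0.1864,
    25: 0.17,
    37: 0.167,
    50: 0.167,
    62: 0.167,
    75: 0.167,
    87: 0.164,
}


def power_saving(throttle_pct: int) -> float:
    """Fractional power reduction relative to 0% throttling."""
    return 1.0 - TABLE2_POWER_W[throttle_pct] / TABLE2_POWER_W[0]
```

With these values, the largest throttling step reduces power by roughly 14% relative to the maximum frequency.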

5.2. Characterizing $G^{nc}_{j}$ function

Two different examples are considered to explain the function $G^{nc}_{j}$. In the first example, energy scavenging nodes are considered, which act as a source of power, while in the second case the infusion pumps are considered, which dissipate heat energy due to friction with human tissue. For each of these cases we identify the P and S sets in the following section.

5.2.1. Identifying sets P and S

For the non-computing units that scavenge energy, the P set can represent the energy requirement of the computing components. For example, in Ayushman the energy consumption of the BSN with $d$ nodes can be computed as follows.

$$E_{BSN} = d\left[\left\{t_s P^{atom}_{sleep} + t_{Tx}\left(P_{radio} + P^{atom}_{active}\right)\right\} w + \left[t_{PKA}\left(P^{atom}_{PKA} + P_{radio}\right)\right]\left(\frac{d-1}{2}\right)\right]. \qquad (4)$$

Note that $E_{BSN} \in P$ depends on the computation workload as discussed in Section 3.2. The set S is a property of the non-computing component that does not depend on the workload. It can represent the energy obtained from the scavenging sources.
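Eq. (4) can be evaluated term by term, as in the following Python sketch. It is illustrative only: the function and parameter names are hypothetical, and since the meaning of $w$ is not restated in this section, it is assumed here to count the sense-and-transmit cycles in the accounting period. With times in seconds and powers in watts the result is in joules.

```python
def bsn_energy(d, w, t_s, t_tx, t_pka, p_sleep, p_active, p_pka, p_radio):
    """Evaluate Eq. (4) for a BSN with d nodes.

    t_s, t_tx, t_pka: sensing, transmission and key agreement times.
    p_sleep, p_active, p_pka: Atom power in the sleep, active and PKA states.
    p_radio: radio power while transmitting.
    w: assumed here to be the number of sense-and-transmit cycles.
    """
    # Sleep during sensing, then radio plus active processor during the burst.
    sense_and_send = (t_s * p_sleep + t_tx * (p_radio + p_active)) * w
    # Key agreement term, weighted by (d - 1) / 2 as in Eq. (4).
    key_agreement = t_pka * (p_pka + p_radio) * (d - 1) / 2
    return d * (sense_and_send + key_agreement)
```

For a single node (`d = 1`) the key agreement term vanishes, which matches the fact that PKA is performed between pairs of sensors.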

Table 3 [2] gives the power available from the scavenging sources and the expected amount of time each scavenging source can operate. Each of these power and time values can be members of the set S.

Equivalently, for the infusion pump, the infusion rate requested by the controller is a member of the set P. The infusion rate is determined by a control algorithm [5], which attempts to maintain a constant drug level in the blood. Whenever drug infusion is requested by the control algorithm, the infusion pump dissipates energy during the drug injection process. This energy dissipation can be a member of the set S.

5.2.2. Identifying the impact function F

In case of the energy scavenging example, the impact function F can be the difference between the elements in P and S. If the difference is negative, then the amount of scavenged energy is more than required. This indicates that the system is sustainable. If the difference is positive, then the energy required is more than the energy available. Hence the system is unsustainable. In case of the infusion pump, Pennes' bioheat equation [52] can be used as the impact function that relates the temperature rise in the human tissue to the heat dissipated from the infusion pump.
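For the energy scavenging case, the difference-based impact function can be sketched in a few lines of Python using the power and time values of Table 3. The names below are hypothetical, the lower bound of the body heat range is used, and a difference of exactly zero is treated as sustainable, which is an assumption since the text only discusses the negative and positive cases.

```python
# Table 3 values: (available power in W, scavenge time in h).
# Body heat is listed as 0.1-0.15 W; the lower bound is assumed here.
SOURCES = {
    "body_heat": (0.1, 24),
    "ambulation": (1.5, 2),
    "respiration": (0.42, 6),
    "sun_light": (0.1, 3),
}


def scavenged_wh(names):
    """Energy (Wh) obtained from the named sources over their scavenge times."""
    return sum(SOURCES[n][0] * SOURCES[n][1] for n in names)


def impact(required_wh, available_wh):
    """Impact function F as the difference between required and available energy."""
    return required_wh - available_wh


def is_sustainable(required_wh, names):
    """Sustainable when F is not positive (zero is treated as sustainable here)."""
    return impact(required_wh, scavenged_wh(names)) <= 0
```

For example, body heat and respiration together supply about 4.92 Wh under these assumptions, so a hypothetical demand of 4 Wh is sustainable while one of 5 Wh is not.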

5.3. Resource management classes

To ensure sustainability of the Atom based BSN, three different approaches to communication scheduling and processor level sleep scheduling have been employed:

i. with processor level sleep scheduling and communication (radio sleep) scheduling (P-M),

ii. no processor level sleep scheduling but with communication scheduling (NP-M), and

iii. without any processor level sleep scheduling or communication scheduling (NP-NM).

Table 3
Available scavenging power.

Scavenging source    Available power (W)    Scavenge time (h)
Body heat            0.1–0.15               24
Ambulation           1.5                    2
Respiration          0.42                   6
Sun light            0.1                    3

Fig. 9. Sustainability analysis results in terms of the number of computing units powered by the energy available from green sources (Section 2.1).

All the above mentioned strategies consider operation at the lowest frequency so as to achieve the lowest power consumption. Fig. 9 shows the number of BSN nodes sustained for 24 h of Ayushman operation using energy from different combinations of scavenging sources for each design strategy. We consider combinations of scavenging sources according to their applicability in real life situations. Scavenging from body heat and respiration can be applied for monitoring of bedridden patients in hospital. Ambulation and sunlight can be used in military applications or in performance monitoring of outdoor sports like golf [53]. Ambulation and respiration can be used in case of athletes in training. Body heat and ambulation can be used for long term monitoring in a home environment. In Fig. 9 we arrange the combinations of scavenging sources in order of highest to lowest available scavenged energy. The absence of some bars from the figure indicates that the corresponding scavenging combination cannot sustain even a single node for 24 h.

The Ayushman workload is highly periodic. As such, the aforementioned strategies are all offline. The schedules are predetermined and are optimized to achieve energy efficiency. These scheduling algorithms are aware of the workload characteristics. The Ayushman workload has both long running and short running jobs. The sensing job is short running and arrives frequently; however, the PKA execution between two sensors is long running but is performed once a day. Hence the P-M and NP-M scheduling algorithms for Ayushman are both *PF algorithms. However, the NP-NM scheduling algorithm works both online and offline and does not schedule the communication or change the power modes of the processor. It just blindly runs the processor at the lowest operating frequency. Hence it is also not dependent on the periodicity of the workload. Thus it is a *** algorithm. However, from Fig. 9 we see that for certain combinations of the scavenging sources the NP-NM algorithm is not sustainable.

Further, the scheduling algorithms in Ayushman support power management. For the Atom processor, the lowest frequency of operation is chosen and the radio is shut down whenever possible to save energy. All the algorithms at least control the operating frequency of the Atom processor to achieve energy efficiency and do not consider transient behavior of the physiology of the human body. Thus, the P-M and NP-M algorithms are *PF-S* while the NP-NM algorithm is ***-S*. Making the algorithms aware of transient behavior of the human body is a challenging task and is an open problem.

In case of the infusion pump, the control algorithm is online as it calculates the required amount of drug infused as and when the operator commands are delivered. Further, it can also operate in the offline mode to maintain a predetermined drug concentration. The control algorithm is applicable to long term jobs, e.g., keeping a constant drug level for a long period of time. Further, it can also control drug concentration during bolus requests, which are short term infusion requests by the human being. The infusion requests are generally periodic with infusion schedules. However, certain pumps also support intermittent bolus requests from patients. Thus, the workload can be either periodic or aperiodic. Thus the infusion control algorithm is a *** algorithm. The control algorithm is aware of the transient behavior of the human body. It obtains feedback in terms of the current drug concentration in the human blood and then computes the future infusion rate so as to maintain the given drug concentration. However, the infusion control algorithm is not energy aware, making it ***-*N.

6. Verification of CPS operations

This section discusses various alternatives and their trade-offs in verifying the sustainability of the CPS operations. Section 3.5 introduced the alternatives: (i) experimentation on actual deployment, (ii) experimentation using simulation, and (iii) model-based verification. The following subsection discusses the advantages, disadvantages, and trade-offs between these alternatives.

6.1. Trade-offs between the verification approaches

Experimentation on actual deployment of the system is ideal to have high confidence in the verification results. However, actual deployment of CPSs may be: (i) prohibitive in terms of the cost associated with acquiring the requisite resources and the effort needed to put these resources together in a real deployment; and (ii) potentially hazardous because of the potential malfunctioning of the system. For example, building a real data center involves acquiring: (a) computing resources such as hundreds of computing servers, server enclosures or chassis, network switches, and racks to mount the servers, (b) real estate such as a building or a reasonably sized room, (c) power units such as power distribution units (PDUs), and (d) one or more cooling resources such as CRACs. Additionally, all the resources need to be properly deployed in the room to design the desired data center on which experimentations are intended. Further, any change in the target data center for experimentation requires redesigning the data center, which is time consuming. In case of BSNs, a real deployment can be hazardous since any malfunctioning of the sensors or networking operations can cause health hazards. Simulation has been a widely used alternative to real deployments. For example, computational fluid dynamics (CFD) based simulators have been used to see the temperature profile in a data center room.

Both real deployment and simulation based verification require the system to be designed and computing operations implemented. However, as discussed in Section 3.5, CPSs may require verification at an early design phase without real deployment and implementation. Model-based Engineering (MBE) can be used in this regard. Models are used to characterize and abstract the system behavior. The behavior is then analyzed for verification of the desired properties (e.g., sustainability). The correctness of the verification using MBE primarily depends on how accurate the models are. To ensure correctness, models can be developed based on profiling using real experimental results. For example, power models of the computing equipment can be generated by performing regulated experiments varying the utilization level and measuring the power consumption at different utilization. A recent study has focused on developing the power models for various systems based on experimental results from SPEC [50]. Using such models to analyze the computing power consumption of data centers can give credibility and confidence to any sustainability related verification results based on such analysis. One major aspect of model-based verification, as discussed in the following subsection, is a representation of the models that is conducive to analytical studies. For example, the models in [50] can be used to analyze the energy-proportionality of a data center (the property of a system to consume energy in proportion to its utilization level) based on power models of its constituent systems.
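The profiling step described above (vary the utilization, measure power, derive a model) can be sketched as an ordinary least-squares fit. The Python example below is a minimal illustration and not the models of [50]: it assumes a linear power model in utilization, the function names are hypothetical, and the idle-to-peak ratio is introduced here only as one simple indicator of how far a system is from consuming energy in proportion to its utilization.

```python
def fit_linear_power_model(utilization, power_w):
    """Least-squares fit of P(u) = idle + slope * u from measured samples."""
    n = len(utilization)
    if n < 2 or n != len(power_w):
        raise ValueError("need at least two paired samples")
    mean_u = sum(utilization) / n
    mean_p = sum(power_w) / n
    sxx = sum((u - mean_u) ** 2 for u in utilization)
    if sxx == 0:
        raise ValueError("utilization samples must not all be equal")
    sxy = sum((u - mean_u) * (p - mean_p) for u, p in zip(utilization, power_w))
    slope = sxy / sxx
    idle = mean_p - slope * mean_u
    return idle, slope


def idle_to_peak_ratio(idle, slope):
    """Fraction of peak power (at u = 1) drawn when idle; 0 would be ideal."""
    return idle / (idle + slope)
```

Under this linear assumption, a system that draws 100 W when idle and 200 W at full utilization has a ratio of 0.5, whereas a perfectly proportional system would have a ratio of 0.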

6.2. Framework for architectural modeling of CPSs

This section provides an architecture level specification framework. The specification considers a CPS as a global collection of computing and physical components, and the CPS is represented as a Global CPS (GCPS). A GCPS is a collection of distributed and networked cyber–physical subsystems, each of which consists of a single computing component (or node), referred to as a Local CPS (LCPS). The LCPS subsystem considers each computing node in the CPS as an isolated cyber–physical system, enabling modeling and analysis of the interaction of an individual computing node with the physical environment. An LCPS consists of two entities:

– Computing unit: Abstract modeling of the computing unit to characterize the computing as well as the physical behavior of the computing nodes. In this regard, two types of properties of the LCPS are defined: (1) Computing property, which characterizes the computing behavior (e.g., processor speed and available memory) and (2) Physical property, which characterizes the physical behavior of the computing component (e.g., power dissipation).

– Physical environment: This facilitates the modeling of the portion of the physical environment with which the computing unit interacts. Any type of interaction (intended or unintended) is modeled by the transfer of information (data or energy) between the computing unit and the physical environment, and its corresponding effect is modeled by continuous equations. These equations are essentially the impact functions F (see Section 3.2.1). In this regard, two constructs are defined, ROIm (for unintended interactions) and ROIn (for intended interactions), each of which contains: (1) monitored parameter, i.e., the non-computing parameters in P (e.g., temperature and available scavenged energy) that get affected by the interactions, and (2) region boundary, the region of the control volume over which the monitored parameter varies (it depends on the spatio-temporal equations governing the variation of the monitored parameter).

The interactions that are modeled using the ROIn and ROIm constructs are local to a single computing unit. However, the distributed nature of the constituent computing components in a GCPS can lead to cumulative effects of the interactions. These cumulative effects are modeled as global interactions and captured by overlap of ROIns and ROIms of the constituent LCPSs of a GCPS. The variation of the monitored parameter in the overlapped region is an aggregation of the variations in each individual ROIn or ROIm.

We have implemented the aforementioned constructs as part of an extension (annex) to the industry standard Architecture Analysis and Design Language (AADL).2 Fig. 10 gives an outline of the AADL specification of the model for a BSN running the Ayushman health monitoring system. The AADL implementation of the sustainability analysis involves the specification of three entities: (1) the power consumption model of the computing unit (modeled as part of the computing unit construct in Fig. 10) running the Ayushman workload, (2) the power supply models of the scavenging sources (modeled as part of two sample scavenging sources, Ambulation and Body Heat in Fig. 10), and (3) the modeling of the energy scavenged from ROIn (modeled as the ROIn2Cyber construct in Fig. 10). Sensing, data transmission, and PKA protocol steps are modeled as Threads and are characterized by power and energy demand properties. These properties are part of the Computing Property Set.

2 http://www.aadl.info/.

Fig. 10. Sustainability analysis AADL code sample. [The figure outlines the declarations and implementations of GlobalCPS (GCPS), LocalCPS (LCPS), Region Of Interest, Computing Unit, and the Body Heat and Ambulation power sources.]

A sustainable power source is modeled as a System (Power Source). Body Heat and Ambulation are different types of power sources shown in the model. These are modeled as implementations of Power Source. The voltage generated by these power sources is modeled by the voltage property. Some of these stages consume different amounts of current when the radio is turned on or off; these operating characteristics are represented as modes (RadioOn and RadioOff) in a thread. A similar model can be developed for the data center sustainability evaluation, where the data center is the GCPS, each chassis can be modeled as an LCPS, the inlets of all the chassis can form the ROIms, and the output of the CRAC can be the ROIn.
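The structure of the sustainability analysis described in this section (threads with power and execution time, power sources, and a comparison of demand with supply) can be mirrored in a few Python data classes. This is not AADL and not the authors' annex; it is a hypothetical sketch in which all class names, fields and numbers are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Thread:
    """A workload step (e.g. sensing, transmission, PKA) with its demand."""
    name: str
    power_w: float
    exec_time_s: float

    def energy_j(self) -> float:
        return self.power_w * self.exec_time_s


@dataclass(frozen=True)
class PowerSource:
    """A scavenging source supplying power for a given time."""
    name: str
    power_w: float
    active_time_s: float

    def energy_j(self) -> float:
        return self.power_w * self.active_time_s


@dataclass(frozen=True)
class LocalCPS:
    """One computing unit together with the sources in its region of interest."""
    threads: Tuple[Thread, ...]
    sources: Tuple[PowerSource, ...]

    def energy_demand_j(self) -> float:
        # Total demand from individual thread power demands and execution times.
        return sum(t.energy_j() for t in self.threads)

    def energy_supply_j(self) -> float:
        return sum(s.energy_j() for s in self.sources)

    def is_sustainable(self) -> bool:
        return self.energy_supply_j() >= self.energy_demand_j()
```

A global model would then hold a collection of such local units; how overlapping regions of interest share a source is left out of this sketch.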

7. Research directions and open problems

There are five major research directions towards design and verification of sustainable CPSs. Fig. 11 depicts these research directions along with the open problems in each of these directions. The research directions are described below:

(i) Holistic management algorithms: Resource management algorithms for CPSs have to be aware of the non-computing processes and the impact of the computing components on these processes. Such awareness requires predictions based on the impact function F. In most cases, accurate characterization of the impact function requires analysis of complex transient behavior of non-computing processes. For example, transient modeling of the data center as an entire CPS includes transient modeling of cooling equipment and is still work in progress.

In case of BSNs, the transient behavior of the human body is generally non-linear and time variant. For example, in the infusion pump case the drug diffusion rate in the human blood over time follows a non-linear differential equation. Further, the amount of drug diffused is dependent on the drug concentration in the blood at a past time. This is because of the inherent delays in the transport of the drug through the human blood. Formal analysis of properties of such physiology is not well established. Hence, there is no methodology to design an algorithm that considers the transient behavior of the human body.

Further, ***** algorithms need to be developed that can support different types of workload. Handling mixed workload is a challenge, and it requires an appropriate abstraction of the workload to capture different workload types. Another challenge for a ***** algorithm is performing effective power management under online workload arrival. This would require a form of adequate prediction of the workload arrival patterns. In case of data centers, such studies exist for Internet traffic, but studies of high-performance computing (HPC) workload arrival patterns are virtually nonexistent, mainly because their submission has been handled as an offline workload. In case of BSNs, since the workload is predominantly periodic, accurate predictions can be performed. As such, radio sleep scheduling and power management can be employed as described in Section 5.3. Such solutions, however, are ill suited for aperiodic workloads without proper predictions.

Research in network traffic characterization [54] has shown that the traffic has non-stationary and self-similar behavior and that the predominant Markovian assumption on the workload arrival does not hold. In such cases, the use of fractal analysis techniques common in the statistical physics domain is suggested. However, such an approach needs to be studied further, and a methodology for developing and analyzing algorithms has to be developed. Thus, developing algorithms for sustainability considering the self-similar and non-stationary behavior of the input workload is an open problem.

(ii) Formal methods: Formal modeling and analysis of CPSs are required to provide theoretical guarantees on their sustainability. One way to model both the discrete behavior of the computing processes and the continuous dynamics of the non-computing processes in CPSs is by using traditional linear hybrid automata [55]. The variation of the system parameters can be expressed as linear differential equations. Most of the linear hybrid automata reachability analyses [56] implicitly assume that the first derivative of the continuous variables is constant. Such assumptions are not applicable in general for wearable devices since the differential equations representing the continuous dynamics may have higher orders [5]. Hybrid automata supporting higher order differential equations have been proposed in [57]. However, their reachability analysis is based on numerical simulation instead of an analytical evaluation. The hybrid automata proposed in [57] allow specification of continuous dynamics on two dimensions, but their analysis is also limited to only numerical simulations. As such, the following scientific gap needs to be filled:

– A hybrid automaton with the combined capability of specifying time varying, higher order, spatio-temporal continuous dynamics.

– A reachability analysis methodology for such hybrid automata.

One way to fill the gap is to develop Spatio-Temporal Hybrid Automata (STHA). In STHA, discrete states can describe system behavior not only in time but also in space. The events causing the state transitions can occur over time and while traversing through space. For analysis of such a model, discretization can be performed for all but one dimension. In the remaining dimension, existing analysis techniques can be used. However, theoretical bounds need to be provided on any error incurred by the discretization. In this regard, proper tools need to be developed to perform the reachability analysis.

(iii) Performance analysis: There are two principal directions to measure the performance of a CPS in terms of sustainability: (i) experimental, and (ii) model-based. Either simulations or a real test-bed can be used for experimentation. A major issue in this regard is the proper metric for sustainability that can capture all the different perspectives described in Section 2. Secondly, proper benchmarks need to be developed that can be used to measure the performance of a CPS in terms of these metrics. The metrics need to be generic and abstract enough to capture all the different types of workloads mentioned previously in this section. In many CPSs, where real experiments cannot be performed, an alternative is to perform model-based analysis. The modeling constructs required to capture the non-computing processes and the impact of the computing components on these processes have been discussed in Section 6. Recent research has further focused on combining formal models with performance models [58]. Novel model specification languages (or extensions to existing languages such as AADL) need to be investigated for representation of these constructs. Lastly, model-based analysis tools need to be developed.

(iv) Safety: Operations of the CPSs need to be safe, i.e., they should not detrimentally impact the non-computing processes. Such detrimental impact can affect the equipment longevity (as in the case of data centers if safe operating temperatures are not maintained) and can have catastrophic consequences (as in the case of BSNs if proper monitoring or timely drug delivery is not performed). Equipment longevity can further impact the sustainability from the equipment recycling perspective (see Section 2). Safety has an inherent trade-off with energy sustainability. For example, in data centers over-cooling can reduce energy sustainability. Similarly, for BSNs, it needs to be ensured that timely drug delivery is not compromised to reduce energy requirements. Handling the trade-offs between safety and sustainability is an open problem for CPSs.

(v) Security: CPSs pose novel problems and opportunities for information security because of the impact of the computing components on the non-computing processes and the possible awareness of the non-computing processes, respectively. On one hand, the information security mechanisms need to be aligned with the safety and sustainability requirements of the application. For example, in the case of BSNs, when there is a medical emergency, the normal access privileges to physiological information need to be dynamically updated so that any available medical personnel can access the required information [60]. Further, a security operation should not have a high energy requirement. On the other hand, information from the non-computing processes can aid the security operations. For example, physiological signals can be used to generate cryptographic keys in BSNs [51]. However, such key generation techniques can involve complex signal processing operations, which can impact the energy sustainability of the system [59]. As such, there is a trade-off between security and sustainability of CPSs. Any security policy has to handle such trade-offs.
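The dynamic privilege update described under (v) can be illustrated with a minimal sketch. It is not the access control model of [60]; the class `AccessPolicy`, its methods, and the user names are hypothetical and serve only to show how access to physiological data might widen during a medical emergency and narrow again afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical access policy for one patient's physiological data."""

    authorized: set = field(default_factory=set)  # users allowed in normal operation
    emergency: bool = False  # True while a medical emergency is in effect

    def declare_emergency(self) -> None:
        # Widen access: any medical personnel may now read the data.
        self.emergency = True

    def clear_emergency(self) -> None:
        # Restore the normal, narrower privileges.
        self.emergency = False

    def can_read(self, user: str, is_medical_personnel: bool) -> bool:
        # Explicitly authorized users can always read.
        if user in self.authorized:
            return True
        # Everyone else needs both an active emergency and a medical role.
        return self.emergency and is_medical_personnel


# Assumed scenario: one authorized physician, one on-call nurse, one visitor.
policy = AccessPolicy(authorized={"primary_physician"})
before = policy.can_read("on_call_nurse", is_medical_personnel=True)
policy.declare_emergency()
during = policy.can_read("on_call_nurse", is_medical_personnel=True)
outsider = policy.can_read("visitor", is_medical_personnel=False)
policy.clear_emergency()
after = policy.can_read("on_call_nurse", is_medical_personnel=True)
print(before, during, outsider, after)  # False True False False
```

The check itself is a set lookup and a boolean test, which keeps its energy cost low in line with the requirement that security operations should not be energy-hungry. How an emergency is detected and how users are authenticated are outside this sketch.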

8. Conclusions

In this paper, we discussed the various approaches for energy-sustainable computing in CPSs. It is identified that the inherent dependencies among the computing and non-computing processes in a CPS have to be considered for designing energy-sustainable systems and analyzing their energy-sustainability. In this regard, a generic theoretical conceptualization of the power consumption of both computing and non-computing components is provided. This theoretical conceptualization was then demonstrated using two representative CPSs: BSNs and data centers. Further, based on the dependencies of a system’s power consumption on workload characteristics, power management strategies, and management of non-computing units, energy-sustainability management algorithms were classified. Five major research directions have been identified: (i) designing holistic management algorithms, (ii) formal methods for verification and theoretical guarantees, (iii) experimental and model-based performance analysis of CPSs, (iv) safety of CPSs, and (v) information security in CPSs. Various open problems in these directions have also been discussed to guide future research in energy-sustainable cyber-physical systems.

References

[1] NewsLink Spring 08, Tackling Today’s Data Center Energy Efficiency Challenges.

[2] J.A. Paradiso, T. Starner, Energy scavenging for mobile and wireless electronics, IEEE Pervasive Computing 4 (January–March (1)) (2005) 18–27.

[3] Q. Tang, S.K.S. Gupta, G. Varsamopoulos, Energy-efficient thermal-aware task scheduling for homogeneous high-performance computing data centers: a cyber-physical approach, IEEE TPDS 19 (11) (2008) 1458–1472.

[4] K. Venkatasubramanian, G. Deng, T. Mukherjee, J. Quintero, V. Annamalai, S.K.S. Gupta, Ayushman: a wireless sensor network based health monitoring infrastructure and testbed, Distributed Computing in Sensor Systems (July) (2005) 406–407.

[5] D. Wada, D. Ward, The hybrid model: a new pharmacokinetic model for computer-controlled infusion pumps, IEEE Transactions on Biomedical Engineering 41 (February (2)) (1994) 134–142.

[6] L. Schwiebert, S.K.S. Gupta, J. Weinmann, Research challenges in wireless networks of biomedical sensors, in: MobiCom’01: Proceedings of the 7th Annual International Conference on Mobile Computing and Networking, ACM, New York, NY, USA, 2001, pp. 151–165.

[7] S. Roundy, E.S. Leland, J. Baker, E. Carleton, E. Reilly, E. Lai, B. Otis, J.M. Rabaey, V. Sundararajan, P.K. Wright, Improving power output for vibration-based energy scavengers, IEEE Pervasive Computing 4 (1) (2005) 28–36.

[8] P. Rong, M. Pedram, Power-aware scheduling and dynamic voltage setting for tasks running on a hard real-time system, January 2006, pp. 473–478.

[9] H. Li, J. Tan, An ultra-low-power medium access control protocol for body sensor network, 2005, pp. 2451–2454.

[10] S. Ullah, P. Khan, Y.-W. Choi, H.-S. Lee, K.S. Kwak, MAC hurdles in body sensor networks, in: ICACT’09: Proceedings of the 11th International Conference on Advanced Communication Technology, IEEE Press, Piscataway, NJ, USA, 2009, pp. 1151–1155.

[11] K. Ram, V. Tsiatsis, M.B. Srivastava, Computation hierarchy for in-network processing, in: WSNA’03: Proceedings of the 2nd ACM International Conference on Wireless Sensor Networks and Applications, New York, NY, USA, 2003, pp. 68–77.

[12] T. Mukherjee, A. Banerjee, G. Varsamopoulos, S.K.S. Gupta, S. Rungta, Spatio-temporal thermal-aware job scheduling to minimize energy consumption in virtualized heterogeneous data centers, Computer Networks, June 2009.

[13] U.S. Environmental Protection Agency, Report to Congress on server and data center energy efficiency, Public Law 109-431, ENERGY STAR Program, 2007.

[14] A. Banerjee, S. Kandula, T. Mukherjee, S.K.S. Gupta, BAND-AiDe: a tool for cyber-physical oriented analysis and design of body area networks and devices, ACM Transactions on Embedded Computing Systems (TECS), Special issue on Wireless Health Systems (minor revision submitted for review), 2010.

[15] J. Geus, D. Posthuma, N. Kupper, M. Berg, G. Willemsen, A. Beem, P. Slagboom, D. Boomsma, A whole-genome scan for 24-hour respiration rate: a major locus at 10q26 influences respiration during sleep, Tech. Rep., 2005.

[16] H. Ibrahim, A. Ilinca, J. Perron, Energy storage systems-characteristics and comparisons, Renewable and Sustainable Energy Reviews 12 (5) (2008) 1221–1250.

[17] H. Lund, G. Salgi, The role of compressed air energy storage (CAES) in future sustainable energy systems, Energy Conversion and Management 50 (5) (2009) 1172–1179.

[18] Review on thermal energy storage with phase change materials and applications, vol. 13 (2), 2009, pp. 318–345.

[19] M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, 1979.

[20] D. Tsafrir, Y. Etsion, D.G. Feitelson, Backfilling using system-generated predictions rather than user runtime estimates, IEEE Transactions on Parallel and Distributed Systems (TPDS) 18 (June (6)) (2007) 789–803.

[21] J. Burge, P. Ranganathan, Cost-aware scheduling for heterogeneous enterprise machines, in: Workshop on Green Computing (GreenCom), Proceedings of the IEEE Cluster Conference, September 2007.

[22] J. Moore, J. Chase, P. Ranganathan, Weatherman: Automated, online, and predictive thermal mapping and management for data centers, in: 3rd IEEE International Conference on Autonomic Computing, June 2006.

[23] J. Moore, J. Chase, P. Ranganathan, R. Sharma, Making scheduling “cool”: temperature-aware resource assignment in data centers, in: Usenix Annual Technical Conference, April 2005.

[24] Z. Abbasi, G. Varsamopoulos, S.K.S. Gupta, Thermal aware server provisioning and workload distribution for internet data centers, in: ACM International Symposium on High Performance Distributed Computing (HPDC10), June 2010.

[25] A. Banerjee, T. Mukherjee, G. Varsamopoulos, S.K.S. Gupta, Cooling-aware and thermal-aware workload placement for green HPC data centers, in: International Conference on Green Computing Conference (IGCC2010), August 2010.

[26] R. Gonzalez, M. Horowitz, Energy dissipation in general purpose microprocessors, IEEE Journal of Solid-State Circuits 31 (September (9)) (1996) 1277–1284.

[27] R. Bianchini, R. Rajamony, Power and energy management for server systems, Computer 37 (November (11)) (2004) 68–76.

[28] T. Heath, et al., Mercury and Freon: temperature emulation and management for server systems, in: ASPLOS-XII: Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, ACM Press, New York, NY, USA, 2006, pp. 106–116.

[29] L. Ramos, R. Bianchini, C-oracle: predictive thermal management for data centers, in: IEEE 14th International Symposium on High Performance Computer Architecture (HPCA2008), February 2008, pp. 111–122.

[30] R. Sharma, T. Christian, M. Arlitt, C. Bash, C. Patel, Design of farm waste-driven supply side infrastructure for data centers, 2010.

[31] HP designs sustainable datacenter fueled by cow manure, http://www.smartplanet.com/business/blog/smart-takes/hp-designs-sustainable-datacenter-fueled-by-cow-manure/7189/.

[32] P. Ranganathan, P. Leech, D. Irwin, J. Chase, Ensemble-level power management for dense blade servers, in: IEEE Proceedings of the 33rd International Symposium on Computer Architecture (ISCA’06), Boston, MA, May 2006, pp. 66–77.

[33] R.K. Sharma, C.E. Bash, C.D. Patel, R.J. Friedrich, J.S. Chase, Balance of power: dynamic thermal management for internet data centers, IEEE Internet Computing 9 (1) (2005) 42–49 (Online). Available: http://doi.ieeecomputersociety.org/10.1109/MIC.2005.10.

[34] T. Mukherjee, A. Banerjee, G. Varsamopoulos, S.K.S. Gupta, Model-driven co-ordinated management of data centers, Elsevier Computer Networks 54 (16) (2010) 2869–2886, doi:10.1016/j.comnet.2010.08.011 (Online). Available: http://www.sciencedirect.com/science/article/B6VRG-5103X3Y-1/2/bfb2eacf3d1695839a266648141a3296.

[35] L.K. Au, W.H. Wu, M.A. Batalin, D.H. McIntire, W.J. Kaiser, Microleap: Energy-aware wireless sensor platform for biomedical sensing applications, 2007, pp. 158–162.

[36] P.H. Chou, C. Park, Energy-efficient platform designs for real-world wireless sensing applications, in: ICCAD’05: Proceedings of the 2005 IEEE/ACM International Conference on Computer-aided Design, IEEE Computer Society, Washington, DC, USA, 2005, pp. 913–920.

[37] 8 most “eco-friendly” Nokia mobile phones, http://www.environmentteam.com/2010/04/07/8-most-eco-friendly-nokia-mobile-phones/.

[38] L. Yang, R. Vyas, A. Rida, J. Pan, M. Tentzeris, Wearable RFID-enabled sensor nodes for biomedical applications, May 2008, pp. 2156–2159.

[39] Eco gadgets: recyclable paper laptop for sustainable computing, http://www.ecofriend.org/entry/eco-gadgets-recyclable-paper-laptop-for-sustainable-computing/.

[40] H. Ghasemzadeh, R. Jafari, Data aggregation in body sensor networks: a power optimization technique for collaborative signal processing, June 2010, pp. 1–9.


[41] S. Nabar, J. Walling, R. Poovendran, Minimizing energy consumption in body sensor networks via convex optimization, in: BSN’10: Proceedings of the 2010 International Conference on Body Sensor Networks, IEEE Computer Society, Washington, DC, USA, 2010, pp. 62–67.

[42] H. Ghasemzadeh, N. Jain, M. Sgroi, R. Jafari, Communication minimization for in-network processing in body sensor networks: a buffer assignment technique, April 2009, pp. 358–363.

[43] S. Khan, I. Ahmad, A cooperative game theoretical technique for joint optimization of energy consumption and response time in computational grids, IEEE Transactions on Parallel and Distributed Systems 20 (March (3)) (2009) 346–360.

[44] L. Wang, G. von Laszewski, J. Dayal, X. He, A. Younge, T. Furlani, Towards thermal aware workload scheduling in a data center, December 2009, pp. 116–122.

[45] Y. Lee, A. Zomaya, Energy efficient utilization of resources in cloud computing systems, The Journal of Supercomputing (2010) 1–13, doi:10.1007/s11227-010-0421-3 (Online). Available: http://dx.doi.org/10.1007/s11227-010-0421-3.

[46] J. Moore, et al., Making scheduling “cool”: temperature-aware resource assignment in data centers, in: 2005 Usenix Annual Technical Conference, April.

[47] J. Sherwani, N. Ali, N. Lotia, Z. Hayat, R. Buyya, Libra: a computational economy-based job scheduling system for clusters, Software: Practice and Experience 34 (6) (2004) 573–590.

[48] R. Ayoub, S. Sharifi, T. Simunic Rosing, Gentlecool: Cooling aware proactive workload scheduling in multi-machine systems, March 2010, pp. 295–298.

[49] V. Subramanian, M. Gilberti, A. Doboli, Online adaptation policy design for grid sensor networks with reconfigurable embedded nodes, in: DATE, IEEE, 2009, pp. 1273–1278.

[50] G. Varsamopoulos, Z. Abbasi, S.K.S. Gupta, Trends and effects of energy proportionality on server provisioning in data centers, in: International Conference on High Performance Computing Conference (HiPC2010), December 2010.

[51] K.K. Venkatasubramanian, A. Banerjee, S.K.S. Gupta, Plethysmogram-based secure inter-sensor communication in body area networks, in: Military Communications Conference, 2008, MILCOM 2008, IEEE, November 2008, pp. 1–7.

[52] H.H. Pennes, Analysis of tissue and arterial blood temperature in the resting human forearm, Journal of Applied Physiology 1 (1) (1948) 93–122.

[53] H. Ghasemzadeh, V. Loseu, E. Guenterberg, R. Jafari, Sport training using body sensor networks: a statistical approach to measure wrist rotation for golf swing, in: Proceedings of the International Conference on Body Area Networks, ICST, 2009, pp. 1–8.

[54] R. Marculescu, P. Bogdan, The chip is the network: toward a science of network-on-chip design, 2007.

[55] T.A. Henzinger, The Theory of Hybrid Automata, IEEE Computer Society Press, 1996, pp. 278–292.

[56] G. Lafferriere, G.J. Pappas, S. Yovine, Reachability computation for linear hybrid systems, in: Proceedings of the 14th IFAC World Congress, vol. E, Elsevier Science Ltd, 1998, pp. 7–12.

[57] P. Ye, E. Entcheva, S.A. Smolka, M.R. True, R. Grosu, Hybrid automata as a unifying framework for modeling cardiac cells, in: Proceedings of EMBS’06, the 28th IEEE International Conference of the Engineering in Medicine and Biology Society, IEEE Press, 2006, pp. 4151–4154.

[58] C. Baier, B.R. Haverkort, H. Hermanns, J.-P. Katoen, Performance evaluation and model checking join forces, Communications of the ACM 53 (9) (2010) 76–85.

[59] K.K. Venkatasubramanian, A. Banerjee, S.K.S. Gupta, Green and sustainable cyber-physical security solutions for body area networks, in: BSN’09: Proceedings of the 2009 Sixth International Workshop on Wearable and Implantable Body Sensor Networks, IEEE Computer Society, Washington, DC, USA, 2009, pp. 240–245.

[60] S.K.S. Gupta, T. Mukherjee, K. Venkatasubramanian, Criticality Aware Access Control Model for Pervasive Applications, Fourth IEEE International Conference on Pervasive Computing and Communications (PerCom’06), 2006, pp. 251–257.

Sandeep K.S. Gupta is a Professor in the School of Computing, Informatics, and Decision Systems Engineering (SCIDSE), Arizona State University, Tempe, USA. He received the B.Tech. degree in Computer Science and Engineering (CSE) from Institute of Technology, Banaras Hindu University, Varanasi, the M.Tech. degree in CSE from Indian Institute of Technology, Kanpur, and the M.S. and Ph.D. degrees in Computer and Information Science from Ohio State University, Columbus, OH. His current research is focused on cyber–physical systems with emphasis on green computing, pervasive healthcare, and criticality-aware systems. Gupta’s research awards include a best 2009 SCIDSE senior researcher award and a best paper award. His research has been supported by Science Foundation of Arizona, National Science Foundation, National Institutes of Health, Intel Corp., Raytheon Missile Systems, and Northrop Grumman Corp. He currently serves on several editorial boards including IEEE Transactions on Parallel and Distributed Systems, Springer Wireless Networks, Elsevier Sustainable Computing, and IEEE Communication Letters. Gupta has served on several program committees, including Percom and Wireless Health, chaired/co-chaired several workshops and conferences, including Greencom and BodyNets, and co-edited several special issues for various journals and magazines, including IEEE Transactions on Computers and IEEE Pervasive Computing. He is currently co-guest editor for an IEEE Proceedings special issue on cyber–physical systems, and he is on program committees for Body Sensor Networks (BSN 2011) and 2nd IEEE/ACM Int’l Conference on Cyber–Physical Systems (ICCPS 2011). Gupta is a senior member of IEEE and heads the Impact Lab (http://impact.asu.edu) at ASU.

Tridib Mukherjee is a Postdoctoral Research Fellow at the School of Computing, Informatics, and Decision Systems Engineering, Arizona State University (ASU), Tempe, USA. He received the B.E. degree in computer science and engineering from Jadavpur University, Kolkata, India, and the Ph.D. degree in computer science from ASU. He has also worked in the industry on wireless networking and embedded systems. His research interests include cyber-physical systems, distributed systems, embedded systems, wireless networks, and model-based engineering. His publication list is available at http://impact.asu.edu/~tridib/. He is a member of the IEEE.

Georgios Varsamopoulos received the B.S. degree in computer and information engineering from University of Patras, Greece, the M.S. degree in computer science from Colorado State University, Fort Collins, Colorado, and the Ph.D. degree in computer science from Arizona State University, Tempe, Arizona. He is currently a Research Assistant Professor with the School of Computing, Informatics and Decision Systems Engineering at Arizona State University. Dr. Varsamopoulos’ research interests include energy-aware and sustainable computing, mobile and pervasive computing, cyber-physical computing, and performance analysis. His publication list is available at http://impact.asu.edu/george/.

Ayan Banerjee is a PhD student at the Department of Computer Science and Engineering, Arizona State University (ASU), USA. He received his B.E. in Electronics and Telecommunication Engineering from Jadavpur University, Kolkata, India. He joined Impact Lab for his research in Fall 2007. His research is focused on developing and validating energy efficient job scheduling algorithms and safety verification of medical devices.