Thermal Aware Resource Management Framework

Xi He, Gregor von Laszewski, Lizhe Wang

Golisano College of Computing and Information Sciences
Rochester Institute of Technology
Rochester, NY 14623
xi.he@mail.rit.edu

Outline

• Introduction
• Motivation
• Thermal-aware Resource Management Framework
• Motivational Examples
• System Model and Problem Definition
• Thermal-aware Task Scheduling Algorithm
• Conclusion

Introduction

Distributed Collaborative Experiment

Introduction

• Data centers in the US consumed about 61 billion kilowatt-hours of power in 2006, 1.5 percent of all US electricity use, costing around $4.5 billion [1].
• Energy usage doubled between 2000 and 2006.
• Energy usage is projected to double again by 2011 [1].

• [1] http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf

Introduction

• Hardware level: Dynamic Voltage Scaling, Dynamic Frequency Scaling
• Software level: Virtualization
• Middleware level: Job Scheduling, Virtual Machine Scheduling
• Data center level: Cooling System

Motivation

• Why a thermal-aware resource management framework?
– To allow end users to collaborate easily with each other and to access remote resources.
– To implement Green Computing.
– To monitor the temperature in the data center.

Architecture Overview

Different types of task-temperature profiles

Motivational Examples

Task-temperature profile (Buffalo Data Center)

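job1 = (0, 2, 20, f(job1))
job2 = (0, 1, 40, f(job2))

Initial node temperatures: node1 = 40 °C, node2 = 32 °C, node3 = 34 °C, node4 = 32 °C

Schedule 1: job1 on node2 and node4, job2 on node3
Resulting temperatures: node1 = 40 °C, node2 = 40 °C, node3 = 40 °C, node4 = 40 °C (max = 40 °C, σ = 0)

Schedule 2: job1 on node1 and node2, job2 on node3
Resulting temperatures: node1 = 48 °C, node2 = 40 °C, node3 = 40 °C, node4 = 32 °C (max = 48 °C, σ = 5.6)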
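The numbers in the example above can be checked with a short script. This is only a sketch: the per-node temperature rises (8 °C for job1, 6 °C for job2) are not stated on the slide and are inferred here from the before and after temperatures, and the names `apply_schedule` and `summarize` are hypothetical.

```python
from statistics import pstdev

# Initial node temperatures (deg C) taken from the example.
initial = {"node1": 40, "node2": 32, "node3": 34, "node4": 32}

# ASSUMPTION: per-node temperature rise of each job, inferred from the
# example's numbers (32 -> 40 for job1, 34 -> 40 for job2).
rise = {"job1": 8, "job2": 6}


def apply_schedule(temps, schedule):
    """Return node temperatures after adding each job's rise to its nodes."""
    result = dict(temps)
    for job, nodes in schedule.items():
        for node in nodes:
            result[node] += rise[job]
    return result


def summarize(temps):
    """Return (maximum temperature, population standard deviation)."""
    values = list(temps.values())
    return max(values), pstdev(values)


# Placement that uses the coolest nodes.
balanced = apply_schedule(initial, {"job1": ["node2", "node4"], "job2": ["node3"]})
# Placement that puts part of job1 on the already hot node1.
unbalanced = apply_schedule(initial, {"job1": ["node1", "node2"], "job2": ["node3"]})

print(summarize(balanced))
print(summarize(unbalanced))
```

Under these assumptions the first placement leaves every node at 40 °C, while the second gives a maximum of 48 °C and a standard deviation of about 5.66, which the slide reports as 5.6.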

System Model

• Here, nodei denotes the i-th node in the data center. Each node has a temperature-time profile that gives the node's temperature over time.

System Model

• A job is described by the tuple (tstart, nodenum, texe, ftemp(t)), where tstart is the starting time of the job, nodenum is the number of processors the job needs, texe is its execution time, and ftemp(t) is the temperature rise caused by executing the job, as a function of its execution time.
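One possible way to represent such a job in code is sketched below. The class name `Job`, the field names, and the linear profile are all assumptions made for illustration; the slides do not prescribe a data structure.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Job:
    """Hypothetical container mirroring the four job parameters."""
    t_start: float                    # starting time of the job
    node_num: int                     # number of processors the job needs
    t_exe: float                      # execution time of the job
    f_temp: Callable[[float], float]  # temperature rise after t time units

    def temperature_rise(self) -> float:
        """Temperature rise accumulated over the whole execution time."""
        return self.f_temp(self.t_exe)


# ASSUMPTION: a linear task-temperature profile, chosen only for illustration.
job1 = Job(t_start=0, node_num=2, t_exe=20, f_temp=lambda t: 0.4 * t)
print(job1.temperature_rise())
```

Keeping the profile as a callable lets measured curves, such as fitted trend lines, be plugged in without changing the rest of the structure.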

Problem Definition

• Given a set of jobs, find an optimal schedule that assigns each job to nodes so as to minimize the temperature deviation across the computing nodes.
• Here, ΔTemp is the temperature increase that jobk causes.

Problem Definition

•We use standard deviation as the metric for measuring the temperature distribution.
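For reference, the standard deviation over n node temperatures can be written as below. The symbols T_i (temperature of the i-th node) and T̄ (mean temperature) are notation assumed here, and the population form (dividing by n) is an assumption, chosen because it reproduces the σ = 5.6 reported in the motivational example.

```latex
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(T_i - \bar{T}\right)^2},
\qquad
\bar{T} = \frac{1}{n}\sum_{i=1}^{n} T_i
```

For node temperatures 48, 40, 40, and 32 °C, the mean is 40 °C and σ = sqrt((64 + 0 + 0 + 64) / 4) = sqrt(32) ≈ 5.66.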

Algorithm

1. Select the node which has the lowest “current” temperature.
2. Sort jobs in descending order of the temperature rise they cause.
3. For each job
4. Assign the job to the selected node.
5. Update the node’s temperature-time profile.
6. Select the node which has the lowest “current” temperature.
7. End For
8. If a node’s temperature exceeds the threshold, do not choose it in the next round and let it cool down.
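The greedy steps above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: each node is reduced to a single current temperature instead of a temperature-time profile, each job is an assumed pair (number of nodes, temperature rise per node), and step 8 (the threshold rule) is left out because the sketch has no cooling model. The function name and the job values are hypothetical.

```python
from statistics import pstdev


def thermal_aware_schedule(jobs, node_temps):
    """Greedy sketch: the hottest jobs go to the coolest nodes.

    jobs: dict mapping job name -> (node_num, rise), where rise is the
          assumed temperature increase (deg C) on each node the job uses.
    node_temps: dict mapping node name -> current temperature (deg C).
    Returns (assignment, final_temps).
    """
    temps = dict(node_temps)
    assignment = {}
    # Sort jobs in descending order of the temperature rise they cause.
    ordered = sorted(jobs.items(), key=lambda item: item[1][1], reverse=True)
    for name, (node_num, rise) in ordered:
        # Select the node(s) with the lowest current temperature.
        coolest = sorted(temps, key=temps.get)[:node_num]
        assignment[name] = coolest
        # Update the temperature of every node the job runs on.
        for node in coolest:
            temps[node] += rise
    return assignment, temps


# ASSUMPTION: rises of 8 and 6 deg C, inferred from the motivational example.
jobs = {"job1": (2, 8), "job2": (1, 6)}
nodes = {"node1": 40, "node2": 32, "node3": 34, "node4": 32}
assignment, final = thermal_aware_schedule(jobs, nodes)
print(assignment)
print(final, pstdev(list(final.values())))
```

With these inputs the sketch places job1 on node2 and node4 and job2 on node3, the same placement as the balanced schedule in the motivational example.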

Experiment

[Figure: Task temperature profile. x-axis: Execution Time (s); y-axis: Temperature. Series1 with two trend lines:
Logarithmic: f(x) = 6.17136207851786 ln(x) − 16.980854076871
Polynomial: f(x) = −0.000488906926406926 x² + 0.169975108225108 x − 0.543030303030302]
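The two trend lines reported for the task temperature profile can be evaluated directly. The sketch below copies the coefficients from the figure; the function names `log_fit` and `poly_fit` are hypothetical, and the unit of the result is whatever the original plot used for temperature.

```python
import math


def log_fit(t):
    """Logarithmic trend line from the figure; t is execution time in seconds (t > 0)."""
    return 6.17136207851786 * math.log(t) - 16.980854076871


def poly_fit(t):
    """Second-order polynomial trend line from the figure; t in seconds."""
    return -0.000488906926406926 * t**2 + 0.169975108225108 * t - 0.543030303030302


# Compare both fits at a few execution times within the plotted range.
for t in (20, 60, 120, 180):
    print(t, round(log_fit(t), 2), round(poly_fit(t), 2))
```

A function like `poly_fit` could serve as the ftemp(t) term of a job, although the slides do not say which of the two fits was used for scheduling.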

Experiment

[Figure: iCore7 cooling profile. x-axis: Time (s); y-axis: Temperature. Series1 with a polynomial trend line.]

Result

Configuration    σ (thermal-aware task scheduling)    σ (random task scheduling)
N=10, M=30       6.2                                  13.4
N=20, M=30       5.3                                  11.1
N=20, M=40       7.3                                  16.5

N indicates the number of job groups; M indicates the number of jobs in each group.

Related Work

• In [1], [2], power reduction is achieved through power-aware task scheduling on DVS-enabled commodity systems, which can adjust the supply voltage and support multiple operating points.

• [1] K. H. Kim, R. Buyya, and J. Kim, “Power aware scheduling of bag-of-tasks applications with deadline constraints on dvs-enabled clusters,” in CCGRID, 2007, pp. 541–548.
• [2] R. Ge, X. Feng, and K. W. Cameron, “Performance-constrained distributed dvs scheduling for scientific applications on power-aware clusters,” in SC, 2005, p. 34.

Related Work

• In [3], [4], a thermodynamic formulation of steady-state hot spots and cold spots in data centers is examined, and several task scheduling algorithms based on this formulation are presented to reduce cooling energy consumption.

• [3] Q. Tang, S. K. S. Gupta, and G. Varsamopoulos, “Thermal-aware task scheduling for data centers through minimizing heat recirculation,” in CLUSTER, 2007, pp. 129–138.
• [4] J. D. Moore, J. S. Chase, P. Ranganathan, and R. K. Sharma, “Making scheduling “cool”: Temperature-aware workload placement in data centers,” in USENIX Annual Technical Conference, General Track, 2005, pp. 61–75.

CONCLUSION

My accomplishments in this research:

• Literature review of Grid computing and Cloud computing.
• An analysis of Buffalo data center operation.
• Literature review of scheduling algorithms.

Conclusion

• A novel framework to solve the resource management problem.
• A thermal-aware task scheduling algorithm for data centers, intended to reduce cooling energy cost.

• Future work
– Investigate other thermal characteristics of data centers.
– Continue the development of the thermal-aware resource management framework.

PUBLICATION

G. von Laszewski, F. Wang, A. Younge, X. He, Z. Guo, and M. Pierce, “Cyberaide javascript: A javascript commodity grid kit,” in GCE08 at SC’08. Austin, TX: IEEE, Nov. 16 2008. [Online]. Available: http://cyberaide.googlecode.com/svn/trunk/papers/08-javascript/vonLaszewski-08-javascript.pdf

G. von Laszewski, A. Younge, X. He, K. Mahinthakumar, and L. Wang, “Experiment and workflow management using cyberaide shell,” in 4th International Workshop on Workflow Systems in e-Science (WSES 09) in conjunction with 9th IEEE International Symposium on Cluster Computing and the Grid. IEEE, 2009.
