Parallel & Distributed Computing


Upload: rohitainapure

Post on 09-Jun-2015


TRANSCRIPT

Page 1: Parallel & Distributed Computing

Status quo

What I set out to do!

How much I knew what I was getting into

Page 2: Parallel & Distributed Computing

Status quo

What I set out to do!

Fair enough idea of what to do!

Page 3: Parallel & Distributed Computing

Status quo

What I set out to do!

I got a little better!

Page 4: Parallel & Distributed Computing

Status quo

What I set out to do!

Probably where I am now!

Page 5: Parallel & Distributed Computing

Energy, let’s save it!

http://www.youtube.com/watch?v=1-g73ty9v04

Page 6: Parallel & Distributed Computing

Data Center Power Usage Stats

• Prediction: The influential report issued by the E.P.A. in August of 2007 estimated that national energy consumption by computer servers and data centers would nearly double from 2005 to 2010 to roughly 100 billion kilowatt hours of energy at an annual cost of $7.4 billion. It predicted the centers’ demand for power in the United States would rise by 2011 to 12 gigawatts of power, or the output of 25 major power plants, from 7 gigawatts, or about 15 power plants.

• The financial implications are significant; estimates of annual power costs for U.S. data centers now range as high as $3.3 billion. This trend impacts data center capacity as well. According to the Fall 2007 Survey of the Data Center Users Group (DCUG®), an influential group of data center managers, power limitations were cited as the primary factor limiting growth by 46 percent of respondents, more than any other factor. In addition to financial and capacity considerations, reducing data center energy use has become a priority for organizations seeking to reduce their environmental footprint.

Page 7: Parallel & Distributed Computing

Power-Saving, Machine-Learning-Based Scheduler for HPC Data Centers

• Algorithm
  – ML aspects of it
  – Complexity
  – Implementation (Simulation + Real)

• Performance evaluation & prediction

• for the upcoming week)

Page 8: Parallel & Distributed Computing

Algorithm

Poll hosts for information about their jobs and status;
OH := select "Emptiable Machines" [jobs < 4];
For each Machine (h) in OH do:
    For each Job (j) in Machine(h) do:
        CH := select "Fillable Machines" [enough CPU and mem];
        For each Machine (ch) in CH do:
            -- predict effect of moving j from h to ch;
            predict R(h) and R(ch) after movement;
            predict C(h) and C(ch) after movement;
            compute global R and C after movement;
        End For
        Get ch leading to highest R among those that decrease C;
        add movement (j,h,ch) to List_of_movements;
    End For
    If (all jobs in h can be reallocated) then:
        proceed with the List_of_movements;
    End If
End For

Page 9: Parallel & Distributed Computing

Program

https://github.com/codeathon/SchedulerHPC

Page 10: Parallel & Distributed Computing
Page 11: Parallel & Distributed Computing
Page 12: Parallel & Distributed Computing

Hurdles

• Power Usage calculation & prediction
  – Linear regression relation with CPU Usage based on relevant attributes like cpu_time, walltime, mem_used, vmem_used, num_of_jobs

• Task Migration / Job Moving
  – Combine the Performance calculation & CPU Usage calculation to identify a good task candidate for migration

• A Simulation environment
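As a rough illustration of the first hurdle, the sketch below fits a straight line relating power draw to CPU usage. It is simplified to a single predictor: CPU usage is derived from `cpu_time` and `walltime`, while a fuller model would also use `mem_used`, `vmem_used`, and `num_of_jobs` in a multiple regression. The helper names, the 8-core host, and the sample values are assumptions. The wattages are generated from a made-up linear rule, not measured.

```python
from statistics import mean

def cpu_usage(cpu_time: float, walltime: float, cores: int) -> float:
    """Fraction of the host's available core-seconds that the jobs consumed."""
    return cpu_time / (walltime * cores)

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Synthetic accounting samples: (cpu_time in s, walltime in s) on an 8-core host.
CORES = 8
samples = [(0, 3600), (7200, 3600), (14400, 3600), (28800, 3600)]
usage = [cpu_usage(c, w, CORES) for c, w in samples]

# Made-up power readings: 100 W idle plus 150 W at full load.
watts = [100 + 150 * u for u in usage]

slope, intercept = fit_line(usage, watts)
predicted = intercept + slope * 0.75  # estimated draw at 75% CPU usage
print(round(slope, 3), round(intercept, 3), round(predicted, 3))
```

Because the readings come from an exact linear rule, the fit recovers that rule. Real measurements would scatter around the line, and the residuals would show whether a linear model is adequate.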