
Resource management system for distributed environment

B4. Nguyen Tuan Duc

Background

Emerging need for resource management systems for clusters and grids

Several systems exist, but they have problems
  Portable Batch System
  Sun Grid Engine
  …

Flexible resource management system
  Support clusters and grids
  Fair-share scheduling
  Maximize utilization of resources
  Support parallel applications
  Reduce load aggregation

Agenda

Background
Goal
Related works
Proposed method
Problems

Related works

Portable Batch System (MRJ, 1990s)
  Batch queuing system
  Automatic load balancing
  Parallel job support
  Job accounting

Portable Batch System (PBS)

Sun Grid Engine

Batch queuing system by Sun Microsystems
  Same features as PBS, plus job checkpointing
  Several add-ons

Problems of batch queuing systems

Resource utilization
Load aggregation
  The server accepts too many requests from clients
Limits of the execution model
  Jobs cannot fork, since a process created with fork() does not go into the queue

Saito Dai’s system (STDS)

Flexible Resource Management System for Widely Distributed Environment (2006)
  No load aggregation
  Job scheduling on each node
  Independent of the execution model (fork, … OK)
  Supports parallel jobs

STDS structure

Two main components
  Node searching system (graph searching)
  Scheduler (on each node)

Scheduler
  Daemon on each node
  CPU fair-sharing by ‘nice’

Node searching system
  Creates a graph from links
  Node search is done by graph search
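The slides do not say how STDS builds or traverses its graph. As an illustration only, a node search over a link graph can be sketched as a breadth-first search. The graph layout, the `accept` predicate and every name below are assumptions made for this sketch, not the STDS implementation.

```python
from collections import deque


def find_nodes(links, start, wanted, accept):
    """Breadth-first search over a link graph.

    links  : dict mapping a node name to the nodes it links to (assumed layout)
    start  : node where the search begins
    wanted : number of usable nodes to collect
    accept : predicate deciding whether a node is usable (hypothetical)
    """
    seen = {start}
    queue = deque([start])
    found = []
    while queue and len(found) < wanted:
        node = queue.popleft()
        if accept(node):
            found.append(node)
        # Follow links to nodes that have not been visited yet.
        for neighbour in links.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return found


# Hypothetical four-node cluster in which only "b" and "d" are idle.
links = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
idle = {"b", "d"}
print(find_nodes(links, "a", 2, lambda n: n in idle))  # prints ['b', 'd']
```

The search stops as soon as enough nodes are found, so nearby nodes are preferred over distant ones.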

STD node searching system

Our approach

Similar to STD system
  Node searching system
  Scheduler on each node

But different in …
  Node search: no graph searching
  Scheduler: kernel scheduler with user accounting (budget scheduler)

Scheduler: Budget scheduling

Budget scheduling
  Normal queue and budget queue
  Normal queue for interactive processes
    Linux 2.6 default scheduler
  Budget queue for CPU-hogging processes
    Automatic detection of CPU-intensive processes

http://www.logos.ic.i.u-tokyo.ac.jp/~duc/pre/1107.ppt
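The slides do not describe how CPU-intensive processes are detected. One possible reading, sketched here in user-space Python rather than in the kernel, is to compare the CPU time a process used during a sampling window with the length of that window. The 0.8 threshold, the `Proc` record and the `classify` function are all assumptions made for illustration.

```python
from dataclasses import dataclass

CPU_HOG_THRESHOLD = 0.8  # assumed cutoff, not taken from the slides


@dataclass
class Proc:
    pid: int
    cpu_time: float   # seconds of CPU used during the sampling window
    wall_time: float  # length of the sampling window in seconds


def classify(procs, threshold=CPU_HOG_THRESHOLD):
    """Split process IDs into (normal_queue, budget_queue) by CPU share."""
    normal, budget = [], []
    for p in procs:
        share = p.cpu_time / p.wall_time if p.wall_time > 0 else 0.0
        # Processes at or above the threshold count as CPU hogs.
        (budget if share >= threshold else normal).append(p.pid)
    return normal, budget


procs = [Proc(101, 0.05, 1.0), Proc(102, 0.97, 1.0), Proc(103, 0.40, 1.0)]
normal_queue, budget_queue = classify(procs)
print(normal_queue, budget_queue)  # prints [101, 103] [102]
```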

Node searching system

Client-server model
  Daemon on each node
  Daemon reports CPU state (process count, CPU utilization, …) directly to the user
  Reports maximum price

Where can users submit jobs from?
  From anywhere on the clusters and grids
  From their desktop, via the Internet

A job submission system is needed

Node searching system (NSS)

Who will determine the nodes?

The user! Users choose nodes appropriate to their jobs
  Parallel jobs: idle CPUs or CPUs with low-price jobs
  Long-running jobs: idle CPU, set a low price
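A minimal sketch of how a user-side tool might pick nodes for a parallel job from the reported values. It assumes each report carries a CPU utilization between 0 and 1 and a maximum price, and that a node counts as "low-price" when its maximum price is below the user's own price. The field names, the `idle_below` cutoff and the `pick_nodes` function are hypothetical.

```python
def pick_nodes(reports, count, my_price, idle_below=0.1):
    """Choose up to `count` hosts: idle nodes first, then low-price nodes.

    reports : list of dicts with 'host', 'cpu_util' (0..1) and 'max_price'
    """
    idle = [r for r in reports if r["cpu_util"] < idle_below]
    cheap = [
        r for r in reports
        if r["cpu_util"] >= idle_below and r["max_price"] < my_price
    ]
    idle.sort(key=lambda r: r["cpu_util"])     # least loaded first
    cheap.sort(key=lambda r: r["max_price"])   # cheapest competition first
    return [r["host"] for r in (idle + cheap)[:count]]


# Made-up reports from four nodes.
reports = [
    {"host": "n1", "cpu_util": 0.90, "max_price": 5},
    {"host": "n2", "cpu_util": 0.02, "max_price": 0},
    {"host": "n3", "cpu_util": 0.70, "max_price": 1},
    {"host": "n4", "cpu_util": 0.95, "max_price": 9},
]
print(pick_nodes(reports, 2, my_price=3))  # prints ['n2', 'n3']
```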

Node searching system (NSS)

NSS should report to users:
  CPU utilization
  Maximum price
  Load (process count, …)
  …

The daemon on each node sends information about the node to the client.
  The client is on the user's machine
  No heavy load aggregation
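As a rough sketch of what a per-node daemon could send, the following builds one report and encodes it as JSON. The field names and the JSON encoding are assumptions, the 1-minute load average stands in for CPU utilization, and `max_price` is passed in because the slides do not say how it is obtained. `os.getloadavg()` is available on Unix only, and the process count relies on a Linux-style `/proc`.

```python
import json
import os
import socket


def build_report(max_price):
    """Collect the values the slides list: load, process count, maximum price."""
    load1, _load5, _load15 = os.getloadavg()  # Unix only
    try:
        # On Linux every running process has a numeric directory in /proc.
        nproc = sum(1 for name in os.listdir("/proc") if name.isdigit())
    except OSError:
        nproc = None  # /proc is not available on this platform
    return {
        "host": socket.gethostname(),
        "cpus": os.cpu_count(),
        "load_1min": load1,
        "process_count": nproc,
        "max_price": max_price,  # supplied by the caller (assumption)
    }


def encode_report(report):
    """Serialize a report for sending to the client on the user's machine."""
    return json.dumps(report).encode("utf-8")


message = encode_report(build_report(max_price=0))
print(message.decode("utf-8"))
```

The transport is deliberately left out, since how the client reaches the daemon is one of the open problems raised in the talk.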

Problems!!!

Possibly heavy load on the user's client
NAT, firewall
  How can the client connect to the server?
What information is needed?
  Only CPU utilization, maximum price, load, average price?
