Resource management system for distributed environment B4. Nguyen Tuan Duc

Post on 28-Dec-2015

Resource management system for distributed environment

B4. Nguyen Tuan Duc

Background

Emerging need for resource management systems for clusters and grids

Several systems exist, but they have problems…
  Portable Batch System
  Sun Grid Engine
  …

Goal

Flexible resource management system
  Support clusters, grids
  Fair-share scheduling
  Maximize utilization of resources
  Support parallel applications
  Reduce load aggregation

Agenda

Background
Goal
Related work
Proposed method
Problems

Related work

Portable Batch System (MRJ, 1990s)
  Batch queuing system
  Automatic load balancing
  Parallel job support
  Job accounting

Portable Batch System (PBS)

Sun Grid Engine

Batch queuing system by Sun Microsystems
Same features as PBS, plus
  Job checkpointing
  Several add-ons

Problems of batch queuing systems

Resource utilization
Load aggregation
  Server accepts too many requests from clients
Limit of execution model
  Cannot fork, since a process created with fork() does not go into the queue

Saito Dai’s system (STDS)

Flexible Resource Management System for Widely Distributed Environment (2006)
  No load aggregation
  Job scheduling on each node
  Independent from execution model (fork, … OK)
  Support parallel jobs

STDS structure

Two main components
  Node searching system (graph searching)
  Scheduler (on each node)

Scheduler
  Daemon on each node
  CPU fair-sharing by ‘nice’

Node searching system
  Create graph from links
  Node search = graph search
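
The slide does not say how STDS walks its graph. As a minimal sketch, assuming the graph is an adjacency mapping built from node-to-node links and that the search is a plain breadth-first traversal (both are assumptions, as are all names below), treating node search as graph search could look like this:

```python
from collections import deque

def find_nodes(links, start, is_suitable, wanted):
    """Breadth-first search over a link graph (hypothetical sketch).

    links: dict mapping a node name to the nodes it links to.
    is_suitable: predicate deciding whether a node can take a job.
    wanted: number of nodes to collect before stopping.
    """
    found = []
    seen = {start}
    queue = deque([start])
    while queue and len(found) < wanted:
        node = queue.popleft()
        if is_suitable(node):
            found.append(node)
        # Follow links to nodes that have not been visited yet.
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return found

# Hypothetical four-node cluster; node "b" is busy.
links = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
idle = {"a", "c", "d"}
result = find_nodes(links, "a", lambda n: n in idle, wanted=2)
```

The search stops as soon as enough suitable nodes are found, so its cost grows with the number of nodes visited rather than with the size of the whole graph.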

STDS node searching system

Our approach

Similar to the STDS system
  Node searching system
  Scheduler on each node

But different in …
  Node search: no graph searching
  Scheduler: kernel scheduler with user accounting (budget scheduler)

Scheduler: Budget scheduling

Budget scheduling
  Normal queue & budget queue
  Normal queue for interactive processes
    Linux 2.6 default scheduler
  Budget queue for CPU-hogging processes
    Automatic detection of CPU-intensive processes

http://www.logos.ic.i.u-tokyo.ac.jp/~duc/pre/1107.ppt
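
The slide names automatic detection of CPU-intensive processes but gives no rule for it. The sketch below assumes one hypothetical rule: a process whose CPU share over a sampling window exceeds a threshold goes to the budget queue, and everything else stays in the normal queue. The threshold, the field names, and the function are illustrative assumptions, not the kernel scheduler described in the linked slides.

```python
from dataclasses import dataclass

CPU_HOG_THRESHOLD = 0.8  # assumed cutoff; the slides do not give one

@dataclass
class ProcSample:
    pid: int
    cpu_seconds: float     # CPU time used during the window
    window_seconds: float  # length of the sampling window

def pick_queue(sample, threshold=CPU_HOG_THRESHOLD):
    """Return "budget" for CPU-hogging processes, "normal" otherwise."""
    share = sample.cpu_seconds / sample.window_seconds
    return "budget" if share > threshold else "normal"

# Two made-up processes: a mostly idle editor and a busy solver.
editor = ProcSample(pid=101, cpu_seconds=0.2, window_seconds=10.0)
solver = ProcSample(pid=102, cpu_seconds=9.7, window_seconds=10.0)
queues = {p.pid: pick_queue(p) for p in (editor, solver)}
```

Here the editor stays in the normal queue and the solver is moved to the budget queue.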

Node searching system

Client-server model
  Daemon on each node
  Daemon reports CPU state (number of processes, CPU utilization, …) directly to the user
  Reports maximum price

From where can users submit jobs?
  From everywhere on the cluster or grid
  From their desktop, via the Internet

Need for a job submission system

Node searching system (NSS)

[Figure: node searching system (NSS) and the user]

Who will determine nodes?

The user!

Users choose nodes appropriate to their jobs
  Parallel jobs: idle CPUs or CPUs with low-price jobs
  Long-running jobs: idle CPU, set a low price
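
The selection rules above can be sketched as a client-side filter. The report fields, the idle and price cutoffs, and the function name are assumptions made for illustration; the slides only state the rules informally.

```python
from dataclasses import dataclass

@dataclass
class NodeInfo:
    name: str
    cpu_utilization: float  # 0.0 (idle) to 1.0 (fully busy)
    max_price: float        # highest price among jobs on the node

def choose_nodes(nodes, job_kind, idle_below=0.05, low_price=1.0):
    """Pick candidate nodes for a job (hypothetical rule set).

    "parallel": idle CPUs, or CPUs whose jobs all have a low price.
    "long": idle CPUs only.
    """
    chosen = []
    for node in nodes:
        idle = node.cpu_utilization < idle_below
        cheap = node.max_price <= low_price
        if job_kind == "parallel" and (idle or cheap):
            chosen.append(node.name)
        elif job_kind == "long" and idle:
            chosen.append(node.name)
    return chosen

nodes = [
    NodeInfo("n1", 0.00, 0.0),  # idle
    NodeInfo("n2", 0.90, 0.5),  # busy, low-price jobs only
    NodeInfo("n3", 0.95, 8.0),  # busy, high-price job
]
parallel_nodes = choose_nodes(nodes, "parallel")
long_nodes = choose_nodes(nodes, "long")
```

For a long-running job the user would additionally submit with a low price; that step is left out of the sketch.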

Node searching system (NSS)

NSS should report to users:
  CPU utilization
  Maximum price
  Load (number of processes, …)
  …

Daemon on each node sends information about the node to the client.
  Client is on the user’s machine
  No heavy load aggregation
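
As a rough illustration of the per-node report, the sketch below packs the listed fields into one message and round-trips it through JSON, the way a daemon might encode it and a client might decode it. The message format, the field names, and the choice of JSON are all assumptions; the slides do not specify a protocol or a transport.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class NodeReport:
    node: str
    cpu_utilization: float  # fraction of CPU in use, 0.0 to 1.0
    max_price: float        # highest price among running jobs
    process_count: int      # number of processes, a simple load measure

def encode_report(report):
    """Daemon side: turn a report into bytes ready to send."""
    return json.dumps(asdict(report)).encode("utf-8")

def decode_report(payload):
    """Client side: rebuild the report from received bytes."""
    return NodeReport(**json.loads(payload.decode("utf-8")))

# Made-up values for a single node.
sent = NodeReport("node07", cpu_utilization=0.25, max_price=3.0, process_count=42)
received = decode_report(encode_report(sent))
```

Because each client decodes only the reports it asked for, no central server has to collect the state of every node.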

Problems!!!

Possibly heavy load on the user’s client

NAT, firewall
  How can the client connect to the server?

What information is needed?
  Only CPU utilization, maximum price, load, average price?