6d.1 schedulers and resource brokers itcs 4010 grid computing, 2005, unc-charlotte, b. wilkinson


6d.1

Schedulers and Resource Brokers

ITCS 4010 Grid Computing, 2005, UNC-Charlotte, B. Wilkinson.

6d.2

Scheduler

• Job manager submits jobs to scheduler.

• Scheduler assigns work to resources to achieve specified time requirements.

6d.3

Scheduling

From "Introduction to Grid Computing with Globus," IBM Redbooks

6d.4

Executing GT 4 jobs

Globus has the following job execution modes:

• Interactive/interactive-streaming

• Batch

6d.5

GT 4 “Fork” Scheduler

• GT 4 comes with a “fork” scheduler, which attempts to execute the job immediately.

• Provided for starting and controlling a job on a local host when the job does not need any special software loaded and has no other special requirements.

• Other schedulers have to be added separately, using an “adapter.”

6d.6

Batch scheduling

• “Batch” is a term from the old days of computing, when one submitted a pack of punched cards as the program to a computer and came back after the program had been run, maybe overnight.

6d.7

[Figure: Relationship between GT4 GRAM and a local scheduler (I. Foster). Labels: client; GRAM services in the GT4 Java container; GRAM adapter; local job control; job functions; local scheduler (various possible); user job; compute element.]

6d.8

Scheduler adapters included in GT 4

• PBS (Portable Batch System)

• Condor

• LSF (Load Sharing Facility)

Third party adapter provided for:

• SGE (Sun Grid Engine)

6d.9

“Meta-schedulers”

• Loosely defined as a higher-level scheduler that can schedule jobs between sites.

Example

Platform Computing's Globus "Community Scheduler Framework"

6d.10

Platform Computing's Globus 3 "Community Scheduler Framework"

[Figure: three sites, each running an MMJFS on its own node (Site A on node1, Site B on node2, Site C on node3). The local schedulers shown are SGE, PBS, and LSF, each with its own MJS (MJS for SGE, MJS for PBS, MJS for LSF), an MMJFS, and RIPS. An Index Service and managed-job-globusrun clients also appear.]

MMJFS = Master Managed Job Factory Service

MJS = Managed Job Service

Blue indicates a Grid Service hosted in a GT3 container

“Grid Standards: Separating the Vision from the Reality” Chris Smith, Platform Computing

6d.11

(Local) Scheduler Issues

• Distribute job

• Based on load and characteristics of machines, available disk storage, network characteristics, …

• Both globally and locally.

• Runtime scheduling!

• Arrange data in right place (staging)

– Data replication and movement as needed

– Data error checking

6d.12

Scheduler Issues (continued)

• Performance

– Error checking

– Checkpointing

– Monitoring job, progress monitoring

– QoS (quality of service)

– Cost (an area considered by Nimrod-G)

• Security

– Need to authenticate and authorize remote user for job submission

• Fault tolerance

• Transparency Automation

6d.13

Scheduling policies

• First-in, first-out

• Favor certain types of jobs

• Shortest job first

• Smallest (or largest) memory first

• Short (or long) running job first

• Fair sharing or priority to certain users

• Dynamic policies

– Depending upon time of day and load

– Custom, preemptive, process migration
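The static policies in the list above can each be viewed as a different ordering of the same job queue. The following is a minimal Python sketch of that idea, not the implementation of any particular scheduler. The `Job` fields, the job names, and the numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    submit_order: int      # position in which the job arrived
    est_runtime_min: int   # estimated runtime in minutes (hypothetical attribute)
    memory_mb: int         # requested memory in MB (hypothetical attribute)


# Each policy is just a sort key: the queue is ordered by it and served front to back.
POLICIES = {
    "fifo": lambda job: job.submit_order,
    "shortest_job_first": lambda job: job.est_runtime_min,
    "smallest_memory_first": lambda job: job.memory_mb,
    "largest_memory_first": lambda job: -job.memory_mb,
}


def schedule(jobs, policy):
    """Return job names in the order the given policy would start them."""
    return [job.name for job in sorted(jobs, key=POLICIES[policy])]


jobs = [
    Job("render", 0, 120, 2048),
    Job("compile", 1, 10, 512),
    Job("simulate", 2, 45, 4096),
]

for policy in POLICIES:
    print(policy, schedule(jobs, policy))
```

Dynamic policies would change the sort key at run time (for example by time of day or load), which this sketch does not attempt.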

6d.14

Advance Reservation

• Requesting actions at times in future.

• “A service level agreement in which the conditions of the agreement start at some agreed-upon time in the future” [2]

[2] “The Grid 2, Blueprint for a New Computing Infrastructure,” I. Foster and C. Kesselman editors, Morgan Kaufmann, 2004.

6d.15

Resource Broker

• “A scheduler that optimizes the performance of a particular resource. Performance may be measured by such criteria as fairness (to ensure that all requests for the resources are satisfied) or utilization (to measure the amount of the resource used).” [2]

6d.16

Scheduler/Resource Broker Examples

We will consider in detail two Schedulers/Resource Brokers available that work with Globus, as examples:

• Condor/Condor-G

– Used last Fall for Assignment 4

• Sun Grid Engine

– To be covered by James Ruff and to be used in Assignment 4 this year.

6d.17

Condor

• First developed at University of Wisconsin-Madison in mid 1980’s to convert a collection of distributed workstations and clusters into a high-throughput computing facility.

• Key concept - using wasted computer power of idle workstations.

6d.18

Condor

• Converts collections of distributed workstations and dedicated clusters into a distributed high-throughput computing facility.

6d.19

Features

• Include:

– Resource finder

– Batch queue manager

– Scheduler

– Checkpoint/restart

– Process migration

6d.20

Intended to run job even if:

• Machines crash

• Disk space exhausted

• Software not installed

• Machines are needed by others

• Machines are managed by others

• Machines are far away

6d.21

Uses

• Consider the following scenario:

– I have a simulation that takes two hours to run on my high-end computer.

– I need to run it 1000 times with slightly different parameters each time.

– If I do this on one computer, it will take at least 2000 hours (or about 3 months).

From: “Condor: What it is and why you should worry about it,” by B. Beckles, University of Cambridge, Seminar, June 23, 2004

6d.22

– Suppose my department has 100 PCs like mine that are mostly sitting idle overnight (say 8 hours a day).

– If I could use them when their legitimate users are not using them, so that I do not inconvenience them, I could get about 800 CPU hours/day.

– This is an ideal situation for Condor.

• I could do my simulations in 2.5 days.

From: “Condor: What it is and why you should worry about it,” by B. Beckles, University of Cambridge, Seminar, June 23, 2004
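The arithmetic behind the scenario can be checked with a short Python sketch. It uses only the figures quoted on the slides and assumes an ideal case: every PC is idle for the full 8 hours, every run fits inside an idle period, and there is no scheduling or file-transfer overhead.

```python
runs = 1000                      # number of simulation runs
hours_per_run = 2                # each run takes two hours
pcs = 100                        # PCs in the department
idle_hours_per_pc_per_day = 8    # overnight idle time per PC

total_cpu_hours = runs * hours_per_run                    # work to be done
single_machine_days = total_cpu_hours / 24                # one PC running non-stop
pool_cpu_hours_per_day = pcs * idle_hours_per_pc_per_day  # harvested idle time
pool_days = total_cpu_hours / pool_cpu_hours_per_day

print(f"total work:        {total_cpu_hours} CPU hours")
print(f"one machine:       {single_machine_days:.1f} days")
print(f"pool capacity:     {pool_cpu_hours_per_day} CPU hours/day")
print(f"pool of idle PCs:  {pool_days} days")
```

One machine running around the clock needs a little over 83 days, which is the "about 3 months" quoted; the pool needs 2.5 days under the ideal assumptions.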

6d.23

How does Condor work?

• A collection of machines running Condor is called a pool.

• Individual pools can be joined together in a process called flocking.

From: “Condor: What it is and why you should worry about it,” by B. Beckles, University of Cambridge, Seminar, June 23, 2004

6d.24

Machine Roles

• Machines have one or more of 4 roles:

– Central manager

– Submit machine (Submit host)

– Execution machine (Execute host)

– Checkpoint server

6d.25

Central Manager

• Resource broker for a pool.

• Keeps track of which machines are available, what jobs are running, negotiates which machine will run which job, etc.

• Only one central manager per pool.

6d.26

Submit Machine

• Machine which submits jobs to pool.

• Must be at least one submit machine in a pool, and usually more than one.

6d.27

Execute Machine

• Machine on which jobs can be run.

• Must be at least one execute machine in a pool, and usually more than one.

6d.28

Checkpoint Server

• Machine which stores all checkpoint files produced by jobs that checkpoint.

• Can only be one checkpoint machine in a pool.

• Optional to have a checkpoint machine.

6d.29

Possible Configuration

• A central manager.

• Some machines that can only be submit hosts.

• Some machines that can only be execute hosts.

• Some machines that can be both submit and execute hosts.

6d.30

6d.31

1. Central manager monitors the execute hosts, so it knows what is available, what type of machine each execute host is, and what software it has.

2. Execute hosts periodically send a ClassAd describing themselves to the central manager.

6d.32

3. At times, the central manager enters a negotiation cycle where it matches waiting jobs with available execute hosts.

4. Eventually the job is matched with a suitable execute host (hopefully).

6d.33

5. Central manager informs the chosen execute host that it has been claimed and gives it a ticket.

6. Central manager informs the submit host which execute host to use and gives it a matching ticket.

6d.34

7. Submit host contacts the execute host, presenting its matching ticket, and transfers the job’s executable and data files to the execute host if necessary. (A shared file system is also possible.)

8. When job finished, results returned to submit host (unless shared file system in use between submit and execute hosts).
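Steps 1 to 8 can be sketched as a small Python simulation. This is only an illustration of the message flow described on the slides, not Condor's actual protocol or API: the class names, the dictionary-based ads, the `min_memory_mb` attribute, and the use of a random hex string as the "ticket" are all assumptions made for the example.

```python
import secrets


class ExecuteHost:
    def __init__(self, name, memory_mb):
        self.name = name
        self.memory_mb = memory_mb
        self.ticket = None  # set when the central manager claims this host

    def advertise(self):
        # Step 2: describe this machine to the central manager.
        return {"name": self.name, "memory_mb": self.memory_mb}

    def run(self, job, ticket):
        # Step 7: accept the job only if the submit host presents the matching ticket.
        if self.ticket is None or ticket != self.ticket:
            raise PermissionError("ticket does not match")
        self.ticket = None
        # Step 8: the result goes back to the submit host.
        return f"{job['name']} ran on {self.name}"


class CentralManager:
    def __init__(self):
        self.machine_ads = {}  # host name -> latest ad (step 1)
        self.claimed = set()

    def receive_ad(self, ad):
        self.machine_ads[ad["name"]] = ad

    def negotiate(self, waiting_jobs):
        # Steps 3-4: match each waiting job with a free, suitable execute host.
        matches = []
        for job in waiting_jobs:
            for name, ad in self.machine_ads.items():
                if name not in self.claimed and ad["memory_mb"] >= job["min_memory_mb"]:
                    self.claimed.add(name)
                    matches.append((job, name, secrets.token_hex(8)))
                    break
        return matches


hosts = {h.name: h for h in (ExecuteHost("exec1", 256), ExecuteHost("exec2", 1024))}
manager = CentralManager()
for host in hosts.values():
    manager.receive_ad(host.advertise())

waiting = [{"name": "sim-0", "min_memory_mb": 512}]
results = []
for job, host_name, ticket in manager.negotiate(waiting):
    hosts[host_name].ticket = ticket                    # step 5: host is told it is claimed
    results.append(hosts[host_name].run(job, ticket))   # steps 6-8: submit host uses its ticket

print(results)
```

The point of the matching tickets is that the execute host can refuse work from any submit host that was not paired with it by the central manager.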

6d.35

Connections

• Connection between submit and execute host usually done with a TCP connection.

• If connection dies, job resubmitted to Condor pool.

• Some jobs might access files and resources on submit host via remote procedure calls.

6d.36

Checkpointing

• Certain jobs can checkpoint, both periodically for safety and when interrupted.

• If checkpointed job interrupted, it will resume at the last checkpointed state when it starts again.

• Generally no change to source code - need to link Condor’s Standard Universe support library (see later).
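The resume-from-last-checkpoint behaviour can be illustrated with a toy Python sketch. Note the important difference: Condor's Standard Universe checkpoints a process transparently through its linked library, whereas this example saves state explicitly at application level, and only periodically. The function name, the JSON file format, and the `interrupt_at` parameter are hypothetical and exist only to simulate an interruption.

```python
import json
import os
import tempfile


def run(total_steps, checkpoint_path, interrupt_at=None, checkpoint_every=10):
    """Sum the integers 0..total_steps-1, checkpointing every few steps."""
    # Resume from the last checkpoint if one exists, otherwise start from scratch.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "total": 0}

    while state["step"] < total_steps:
        if interrupt_at is not None and state["step"] == interrupt_at:
            return None  # simulated eviction: work since the last checkpoint is lost
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            with open(checkpoint_path, "w") as f:
                json.dump(state, f)
    return state["total"]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "job.ckpt")
    first = run(100, path, interrupt_at=37)  # interrupted; last checkpoint was at step 30
    second = run(100, path)                  # resumes at step 30, not step 0

print(first, second)
```

The second run repeats only steps 30 to 36 rather than the whole job, which is the benefit checkpointing is meant to provide.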

6d.37

Types of Jobs

• Jobs are classified according to the environment Condor provides for them (the “universe”). Currently seven environments:

– Standard

– Vanilla

– PVM

– MPI

– Globus

– Java

– Scheduler

6d.38

Standard

• For jobs compiled with Condor libraries.

• Allows for checkpointing and remote system calls.

• Must be single threaded.

• Not available under Windows.

6d.39

Vanilla

• For jobs that cannot be compiled with Condor libraries, and for shell scripts and Windows batch files.

• No checkpointing or remote system calls.

6d.40

PVM

For PVM programs.

MPI

For MPI programs (MPICH).

Both PVM and MPI are message-passing libraries used in message passing programs.

Used for local clusters of computers.

MPI could be used in grid computing – we may talk about this later in the course.

6d.41

Globus

For submitting jobs to resources managed by Globus (version 2.2 and higher).

6d.42

Java

For Java programs (written for the Java Virtual Machine).

Scheduler

Used with DAG scheduled jobs, see later.

6d.43

Submitting a job

• Job submitted to “submit host” using the condor_submit command.

• Job described in “submit description” file.

• Submit description file includes details such as given in an RSL file in Globus, i.e. the name of the executable, arguments, etc.

6d.44

Condor Submit Description File

# This is a comment, condor submit file

Universe = vanilla

Executable = /home/abw/condor/myProg

Input = myProg.stdin

Output = myProg.stdout

Error = myProg.stderr

Arguments = -arg1 -arg2

InitialDir = /home/abw/condor/assignment4

Queue

Describes job to Condor. Used with the condor_submit command.

Description File Example

6d.45

Submitting Multiple Jobs

• Submit file can specify multiple jobs

– Example: Queue 500 will submit 500 jobs at once

• Condor calls groups of jobs a cluster

• Each job within cluster called a process

• Condor job ID is the cluster number, a period and process number, for example 26.2

• Single jobs also a cluster but with a single process (process 0)
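The ID scheme above is easy to reproduce in a few lines of Python. This is only a sketch of the naming convention described on the slide; the function name is hypothetical, and the cluster numbers are made up.

```python
def job_ids(cluster, queue_count=1):
    """Return the job IDs for one submission: cluster number, a period, process number."""
    return [f"{cluster}.{process}" for process in range(queue_count)]


print(job_ids(26, 3))  # a "Queue 3" submission that was assigned cluster 26
print(job_ids(27))     # a single job is a cluster with only process 0
```

Process numbers start at 0, so job 26.2 is the third job of cluster 26.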

6d.46

Submitting a job with requirements and preferences

• Done using Condor’s “ClassAd” mechanism, which may include:

– What it requires

– What it desires

– What it prefers, and

– What it will accept

• These details are specified in the submit description file.

6d.47

Specifying Requirements

• A C/Java-like Boolean expression that evaluates to TRUE for a match.

# This is a comment, condor submit file
Universe = vanilla
Executable = /home/abw/condor/myProg
InitialDir = /home/abw/condor/assignment4
Requirements = Memory >= 512 && Disk > 10000
queue 500
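To show what "evaluates to TRUE for a match" means, the Requirements expression above can be written as a Python predicate and applied to a few machines. The machine names and attribute values are hypothetical, and the sketch assumes Condor's usual units (Memory in megabytes, Disk in kilobytes). It is an illustration of the Boolean test only, not of Condor's ClassAd evaluator.

```python
def requirements(machine):
    # Python rendering of: Requirements = Memory >= 512 && Disk > 10000
    return machine["Memory"] >= 512 and machine["Disk"] > 10000


machines = [
    {"Name": "pc01", "Memory": 256, "Disk": 500000},   # too little memory
    {"Name": "pc02", "Memory": 1024, "Disk": 8000},    # too little disk
    {"Name": "pc03", "Memory": 512, "Disk": 20000},    # satisfies both conditions
]

eligible = [m["Name"] for m in machines if requirements(m)]
print(eligible)
```

Only machines for which the whole expression is true are candidates for any of the 500 queued jobs.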

6d.48

The condor_submit command creates a “ClassAd” from the submit description file, which is then used in the ClassAd matchmaking mechanism.

Command:

condor_submit submit.prog1

ClassAd file

submit description file

6d.49

ClassAd Matchmaking

Used to ensure that jobs are run according to the constraints of users and machine owners.

Example of user constraints: “I need a Pentium IV with at least 512 Mbytes of RAM and a speed of at least 3.8 GHz.”

Example of machine owner constraints: “Never run jobs owned by Fred.”

6d.50

ClassAd Matchmaking Steps

1. Agents (jobs) and resources (computers) advertise their characteristics and requirements in “classified advertisements.”

2. Matchmaker scans ClassAds and creates pairs that satisfy each other's constraints and preferences.

3. Matchmaker informs both parties of match.

4. Agent and resource make contact.

6d.51

[Figure: a job's ClassAd is compared with the ClassAds of several machines, and a match is made with one of them.]

6d.52

Job ClassAd Example

[
MyType = "Job"
TargetType = "Machine"
Requirements = ((other.Arch == "INTEL" && other.OpSys == "LINUX") && other.Disk > my.DiskUsage)
DiskUsage = 6000
]

DiskUsage = 6000 corresponds to 6 MB. The Requirements statement must evaluate to true.

6d.53

Machine ClassAd Example

[
MyType = "Machine"
TargetType = "Job"
Machine = "coit-grid01.uncc.edu"
Requirements = ((LoadAvg <= 0.300000) && (KeyboardIdle > (15*60)))
Arch = "INTEL"
OpSys = "LINUX"
Disk = 1000000
]

The Requirements expression asks for a low load average and a keyboard that has been idle for more than 15 minutes.
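A match requires both Requirements expressions to be true at once: the job's when evaluated against the machine, and the machine's when evaluated against the job. The Python sketch below models the two example ads as dictionaries to show this two-sided test. It is not the ClassAd language or Condor's matchmaker; the `LoadAvg` and `KeyboardIdle` values are hypothetical current readings added so the machine's expression has something to evaluate.

```python
job_ad = {
    "MyType": "Job",
    "TargetType": "Machine",
    "DiskUsage": 6000,
    # Job side: INTEL/LINUX machine with more disk than the job uses.
    "Requirements": lambda my, other: (
        other["Arch"] == "INTEL"
        and other["OpSys"] == "LINUX"
        and other["Disk"] > my["DiskUsage"]
    ),
}

machine_ad = {
    "MyType": "Machine",
    "TargetType": "Job",
    "Machine": "coit-grid01.uncc.edu",
    "Arch": "INTEL",
    "OpSys": "LINUX",
    "Disk": 1000000,
    "LoadAvg": 0.1,           # hypothetical current reading
    "KeyboardIdle": 20 * 60,  # hypothetical: idle for 20 minutes, in seconds
    # Machine side: low load average and keyboard idle for more than 15 minutes.
    "Requirements": lambda my, other: (
        my["LoadAvg"] <= 0.3 and my["KeyboardIdle"] > 15 * 60
    ),
}


def matches(a, b):
    """True only if the ad types line up and both Requirements hold."""
    return (
        a["TargetType"] == b["MyType"]
        and b["TargetType"] == a["MyType"]
        and a["Requirements"](a, b)
        and b["Requirements"](b, a)
    )


print(matches(job_ad, machine_ad))

# The same machine a minute after its owner touched the keyboard no longer matches.
busy_machine_ad = dict(machine_ad, KeyboardIdle=60)
print(matches(job_ad, busy_machine_ad))
```

This is how the owner's constraint (do not disturb an active user) and the user's constraint (architecture, operating system, disk) are enforced by the same mechanism.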

6d.54

ClassAd’s Rank Statement

• Can be used in a job ClassAd to select between compatible machines. The machine with the highest rank is chosen.

• Rank expression should evaluate to a floating point number.

Example

Rank = (Memory * 10000) + KFlops

where KFlops is a measure of machine speed.
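The effect of this Rank expression can be seen by applying it to a few machines that have already passed the Requirements test. The sketch below is plain Python with hypothetical machine names and attribute values; it only illustrates "choose the highest rank" and is not how Condor evaluates ClassAds internally.

```python
def rank(machine):
    # Python rendering of: Rank = (Memory * 10000) + KFlops
    return float(machine["Memory"] * 10000 + machine["KFlops"])


# Hypothetical machines that all satisfy the job's Requirements.
candidates = [
    {"Name": "a", "Memory": 512, "KFlops": 900000},
    {"Name": "b", "Memory": 1024, "KFlops": 400000},
    {"Name": "c", "Memory": 1024, "KFlops": 650000},
]

for machine in candidates:
    print(machine["Name"], rank(machine))

best = max(candidates, key=rank)
print("chosen:", best["Name"])
```

Because Memory is multiplied by 10000, memory dominates the ranking here and KFlops mainly breaks ties between machines with the same memory: machine a is the fastest but loses to both 1024 MB machines.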

6d.55

Rank Statement

Can also be used in a machine's ClassAd in matchmaking.

Example

Rank = (other.Department == self.Department)

where Department is defined in the job ClassAd, say:

Department = "Computer Science"

6d.56

Using rank in a machine's ClassAd

Job ClassAd

[
MyType = "Job"
TargetType = "Machine"
…
Department = "Computer Science"
…
]

Machine ClassAd

[
MyType = "Machine"
TargetType = "Job"
…
Rank = (other.Department == self.Department)
…
]

6d.57

Directed Acyclic Graph Manager (DAGMan)

Meta-scheduler

Allows one to specify dependencies between Condor Jobs.

6d.58

Example

“Do not run Job B until Job A completed successfully”

Especially important to jobs working together (as in Grid computing).

6d.59

Directed Acyclic Graph (DAG)

• A data structure used to represent dependencies.

• Directed graph.

• No cycles.

• Each job is a node in the DAG.

• Each node can have any number of parents and children as long as there are no loops (acyclic graph).

6d.60

DAG

[Figure: diamond-shaped DAG with Job A at the top, Jobs B and C side by side below it, and Job D at the bottom.]

Do job A.

Do jobs B and C after job A finished

Do job D after both jobs B and C finished.

6d.61

Defining a DAG

• Defined by a .dag file, listing each of the nodes and their dependencies.

• Each “job” statement has an abstract job name (say A) and a submit description file (say a.condor)

• PARENT-CHILD statement describes relationship between two or more jobs

• Other statements available.

6d.62

Example

# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D

[Figure: the resulting diamond, with Job A at the top, Jobs B and C in the middle, and Job D at the bottom.]
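The order implied by the diamond DAG can be worked out with a short Python sketch. It parses only the two statement types used in the example (Job and Parent ... Child ...) and uses the standard-library `graphlib` module (Python 3.9 or later) to group jobs into waves that may run at the same time. The parser is a simplified assumption made for illustration and is not DAGMan's own file parser.

```python
from graphlib import TopologicalSorter

DAG_TEXT = """\
# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D
"""


def parse_dag(text):
    """Return a mapping of job name -> set of parent job names."""
    parents = {}
    for line in text.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        keyword = words[0].lower()
        if keyword == "job":
            parents.setdefault(words[1], set())
        elif keyword == "parent":
            split = [w.lower() for w in words].index("child")
            for child in words[split + 1:]:
                parents.setdefault(child, set()).update(words[1:split])
    return parents


def waves(parents):
    """Group jobs into successive sets whose parents have all finished."""
    sorter = TopologicalSorter(parents)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)  # pretend every job in this wave succeeded
    return result


order = waves(parse_dag(DAG_TEXT))
print(order)
```

Jobs in the same wave (here B and C) have no dependency on each other and could be submitted together once A has completed.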

6d.63

To start a DAG, use condor_submit_dag command with dag file:

condor_submit_dag diamond.dag

condor_submit_dag submits a Scheduler Universe Job with DAGMan as the executable.

6d.64

Running a DAG

• DAGMan acts as a scheduler managing the submission of jobs to Condor based upon DAG dependencies.

• DAGMan holds and submits jobs to Condor queue at appropriate times.

6d.65

Job Failures

• DAGMan continues until it cannot make progress and then creates a rescue file holding current state of DAG.

• When failed job ready to re-run, rescue file used to restore prior state of DAG.

6d.66

Summary of Key Condor Features

• High throughput computing using an opportunistic environment.

• Provides mechanisms for running jobs on remote machines.

• Matchmaking

• Checkpointing

• DAG scheduling

6d.67

Condor-G

• Grid enabled version of Condor.

• Uses Globus Toolkit for:

– Security (GSI)

– Managing remote jobs on grid (GRAM)

– File handling and remote I/O (GSI-FTP)

6d.68

Remote execution by Condor-G on Globus-managed resources

From: “Condor-G: A Computation Management Agent for Multi-Institutional Grids,” by J. Frey, T. Tannenbaum, M. Livny, I. Foster and S. Tuecke. Figure probably refers to Globus version 2.

6d.69

More!

Check out Fall 2004 Assignment 4 write-up.

Fall 2004 course

http://www.cs.uncc.edu/~abw/CS493F04

6d.70

More Information

• http://www.cs.wisc.edu/condor

• Chapter 11, Condor and the Grid, D. Thain, T. Tannenbaum, and M. Livny, Grid Computing: Making The Global Infrastructure a Reality, F. Berman, A. J. G. Hey, and G. Fox, editors, John Wiley, 2003.

• “Condor-G: A Computation Management Agent for Multi-Institutional Grids,” J. Frey, T. Tannenbaum, I. Foster, M. Livny, S. Tuecke, Proc. 10th Int. Symp. High Performance Distributed Computing (HPDC-10) Aug. 2001.
