I2G CrossBroker
DESCRIPTION
I2G CrossBroker. Enol Fernández, UAB. Dublin MPI Course, 10-11 September 2007. Introduction: CrossBroker does automatic scheduling in Grid environments (resource discovery, resource selection, job execution) and handles jobs not treated by gLite, including parallel jobs (MPI).
![Page 1: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/1.jpg)
I2G CrossBroker
Enol Fernández, UAB
Dublin MPI Course, 10-11 September 2007
![Page 2: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/2.jpg)
Introduction
CrossBroker does automatic scheduling in Grid environments:
- Resource discovery
- Resource selection
- Job execution
Handles jobs not treated by gLite:
- Parallel jobs (MPI): run on more than one resource, in a coordinated fashion.
- Interactive jobs: the user interacts with the application during its execution.
![Page 3: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/3.jpg)
Architecture
[Figure: CrossBroker architecture. Diagram labels: Scheduling Agent, Resource Searcher, Application Launcher, Condor-G, DAGMan, Migrating Desktop, Information Index, Replica Manager, and two EGEE/Globus sites, each with a CE and worker nodes (WN).]
![Page 4: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/4.jpg)
Architecture
Scheduling Agent
- Receives each job and keeps it in a persistent queue.
- Contacts the Resource Searcher and gets a list of available resources.
- Selects resources and passes them to the Application Launcher.
Resource Searcher
- Given a job description (JobAd), performs the matchmaking between job needs and available resources.
- Uses the Condor ClassAd library, originally designed for matches of a single job with a single resource.
- A set matching has been developed to support matches of a single job to a group of resources.
Application Launcher
- Responsible for providing a reliable submission service of parallel applications on the Grid.
- Responsible for file staging at the remote site (executable and input/output files).
- Uses the services of Condor-G.
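The division of work among the three components can be sketched in a few lines of Python. This is only an illustration of the flow described above, not CrossBroker code: the `Resource` and `Job` classes, the function names, and the sample values are all hypothetical, and the `requirements` and `rank` callables merely stand in for the JDL `Requirements` and `Rank` expressions.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Resource:
    name: str
    free_cpus: int
    benchmark: int  # hypothetical stand-in for a published benchmark value


@dataclass
class Job:
    name: str
    requirements: Callable[[Resource], bool]  # stand-in for JDL Requirements
    rank: Callable[[Resource], float]         # stand-in for JDL Rank


def resource_searcher(job: Job, resources: List[Resource]) -> List[Resource]:
    """Return the resources that satisfy the job, best rank first."""
    matches = [r for r in resources if job.requirements(r)]
    return sorted(matches, key=job.rank, reverse=True)


def application_launcher(job: Job, resource: Resource) -> str:
    """Pretend to submit the job; the real component relies on Condor-G."""
    return f"{job.name} -> {resource.name}"


def scheduling_agent(queue: deque, resources: List[Resource]) -> List[str]:
    """Make one pass over the queue, launching every job that has a match."""
    submitted = []
    for _ in range(len(queue)):
        job = queue.popleft()
        candidates = resource_searcher(job, resources)
        if candidates:
            submitted.append(application_launcher(job, candidates[0]))
        else:
            queue.append(job)  # no match yet: keep the job queued
    return submitted


resources = [Resource("ce-a", 2, 1000), Resource("ce-b", 4, 3000)]
queue = deque([
    Job("job1", lambda r: r.free_cpus >= 1, lambda r: r.benchmark),
    Job("job2", lambda r: r.free_cpus >= 8, lambda r: r.benchmark),
])
submitted = scheduling_agent(queue, resources)
```

Here `job1` goes to the higher-ranked resource, while `job2` stays in the queue because no single resource satisfies it.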
![Page 5: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/5.jpg)
Parallel Job Support
Support for parallel jobs:
- Open MPI
- PACX-MPI
- MPICH-P4
- MPICH-G2
- Plain (just the machines)
Takes into account site capabilities.
Ability to define starter scripts/processes to start the parallel job:
- mpi-start is configured automatically and used by default.
![Page 6: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/6.jpg)
Parallel Job Support
Changes in JDL:
- JOBTYPE:
  - Normal: sequential jobs, just one CPU
  - Parallel: more than one CPU
- SUBJOBTYPE:
  - openmpi
  - pacx-mpi
  - mpich
  - mpich-g2
  - plain
- JOBSTARTER (if not defined, mpi-start)
- JOBSTARTERARGUMENTS
![Page 7: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/7.jpg)
Parallel Job Support
Type = "Job";
VirtualOrganisation = "imain";
JobType = "Parallel";
SubJobType = "pacx-mpi";
NodeNumber = 5;
Executable = "test-app";
Arguments = "-v";
InputSandbox = {"test-app", "inputfile"};
OutputSandbox = {"std.out", "std.err"};
StdError = "std.err";
StdOutput = "std.out";
Rank = other.GlueHostBenchmarkSI00;
Requirements = other.GlueCEStateStatus == "Production";
![Page 8: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/8.jpg)
MPI Across Sites
CrossBroker searches for and selects sets of resources for the jobs.
There is no guarantee that all tasks of the same job will start at the same time.
- 1st choice: select only sites with free resources. The job will run immediately. Unfortunately, free resources are not always available.
- 2nd choice: allocate a resource temporarily and wait until all other tasks show up. Timeshare the resource with a backfilling policy to avoid resource idleness.
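The two choices above can be read as a simple decision rule. The following Python sketch shows one way to express it; the function name, the strategy labels, and the greedy largest-site-first allocation are assumptions made for illustration and are not taken from CrossBroker.

```python
from typing import Dict, Tuple


def plan_mpi_job(node_number: int, free_cpus: Dict[str, int]) -> Tuple[str, Dict[str, int]]:
    """Return (strategy, allocation) for a job that needs node_number CPUs.

    free_cpus maps a site name to its currently free CPUs.
    """
    allocation: Dict[str, int] = {}
    remaining = node_number
    # First choice: use only sites that have free CPUs right now,
    # visiting the sites with the most free CPUs first.
    for site, free in sorted(free_cpus.items(), key=lambda kv: -kv[1]):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            allocation[site] = take
            remaining -= take
    if remaining == 0:
        return "run_immediately", allocation
    # Second choice: hold what is available, wait for the remaining tasks,
    # and let other work backfill the held CPUs in the meantime.
    return "time_share_and_wait", allocation


enough = plan_mpi_job(5, {"site-a": 3, "site-b": 2, "site-c": 0})
not_enough = plan_mpi_job(5, {"site-a": 3})
```

With enough free CPUs the job can start at once; otherwise the partial allocation is held and the job falls back to time sharing.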
![Page 9: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/9.jpg)
MPI Across Sites
[Figure: the Resource Searcher (RS) queries five CEs, each marked as MPI-enabled or non-MPI-enabled, and returns ranked groups of CEs.]

CE1 = zeus.cyf-kr.edu.pl: FreeCPUs = 2, Disk = 100, AverageSI = 2000
CE2 = aocegrid.uab.es: FreeCPUs = 10, Disk = 100, AverageSI = 4000
CE3 = bee001.ific.uv.es: FreeCPUs = 3, Disk = 100, AverageSI = 1000
CE4 = xgrid.icm.edu.pl: FreeCPUs = 6, Disk = 100, AverageSI = 1000
CE5 = lngrid02.lip.pt: FreeCPUs = 2, Disk = 100, AverageSI = 1000

Resource Searcher output:
[Groups with 1 CEs]
[Rank=2000] aocegrid.uab.es:2119/jobmanager-pbs-workq freeCPUs = 10
[Groups with 2 CEs]
[Rank=1500] zeus.cyf-kr.edu.pl:2119/jobmanager-pbs-workq freeCPUs = 2
            bee001.ific.uv.es:2119/jobmanager-pbs-workq freeCPUs = 3
[Rank=1000] bee001.ific.uv.es:2119/jobmanager-pbs-workq freeCPUs = 3
            lngrid02.lip.pt:2129/jobmanager-pbs-workq freeCPUs = 2
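A rough Python sketch of set matching over the five CEs on this slide follows. Several points are assumptions rather than facts from the slides: that xgrid.icm.edu.pl is the non-MPI-enabled CE, that the job needs 5 CPUs (the NodeNumber of the earlier JDL example), that a group is kept only when every member is needed to reach that count, and that a group's rank is the mean AverageSI of its members. The mean matches the two-CE ranks on the slide (1500 and 1000) but gives 4000 rather than the slide's 2000 for the single-CE group, so the real aggregation is evidently different.

```python
from itertools import combinations
from statistics import mean

# Values copied from the slide; the "mpi" flags are an assumption.
CES = {
    "zeus.cyf-kr.edu.pl": {"free": 2, "si": 2000, "mpi": True},
    "aocegrid.uab.es": {"free": 10, "si": 4000, "mpi": True},
    "bee001.ific.uv.es": {"free": 3, "si": 1000, "mpi": True},
    "xgrid.icm.edu.pl": {"free": 6, "si": 1000, "mpi": False},
    "lngrid02.lip.pt": {"free": 2, "si": 1000, "mpi": True},
}


def match_groups(ces, node_number, max_size=2):
    """Return (size, rank, names) tuples for groups that can host the job."""
    usable = {name: ce for name, ce in ces.items() if ce["mpi"]}
    groups = []
    for size in range(1, max_size + 1):
        for names in combinations(sorted(usable), size):
            frees = [usable[n]["free"] for n in names]
            total = sum(frees)
            # Keep a group only if it has enough CPUs and every member
            # is needed to reach the requested count.
            if total >= node_number and all(total - f < node_number for f in frees):
                rank = mean(usable[n]["si"] for n in names)  # assumed aggregation
                groups.append((size, rank, names))
    # Smaller groups first, then higher rank first.
    return sorted(groups, key=lambda g: (g[0], -g[1]))


groups = match_groups(CES, node_number=5)
```

Under these assumptions the sketch returns the same three groups, in the same order, as the Resource Searcher output shown on the slide.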
![Page 10: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/10.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Grid Resource, LRMS, MPI job.]
![Page 11: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/11.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Application Launcher, Grid Resource, LRMS, MPI job.]
![Page 12: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/12.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Application Launcher, Grid Resource, LRMS, Agent with VM1 and VM2, MPI job.]
![Page 13: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/13.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Application Launcher, Grid Resource, LRMS, Agent with VM1 and VM2, MPI job.]
![Page 14: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/14.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Application Launcher, Grid Resource, LRMS, Agent with VM1 and VM2, MPI task. Annotation: waiting for rest of tasks.]
![Page 15: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/15.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Application Launcher, Grid Resource, LRMS, Agent with VM1 and VM2, MPI task, job.]
![Page 16: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/16.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Application Launcher, Grid Resource, LRMS, Agent with VM1 and VM2, MPI task, job. Annotation: backfilling while the MPI task waits.]
![Page 17: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/17.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Application Launcher, Grid Resource, LRMS, Agent with VM1 and VM2, MPI task, job. Annotation: all tasks ready!]
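The sequence of figures above (an MPI task waits for the rest of its tasks, another job backfills the node, then all tasks are ready) can be mimicked with a small state model. This Python sketch is hypothetical throughout: the `Agent` class and its methods are invented for illustration, and the slides do not say what happens to the backfill job once the MPI job can run, so lowering its priority is an assumption.

```python
class Agent:
    """Hypothetical model of the agent that holds a node for an MPI task."""

    def __init__(self):
        self.mpi_task = None      # task parked in the first slot (VM1)
        self.backfill_job = None  # job running in the second slot (VM2)
        self.log = []

    def start_mpi_task(self, task):
        """Park an MPI task that cannot start until its peers arrive."""
        self.mpi_task = task
        self.log.append(f"{task}: waiting for rest of tasks")

    def backfill(self, job):
        """Run another job on the node while the MPI task waits."""
        if self.mpi_task is None:
            raise RuntimeError("no waiting MPI task to backfill behind")
        self.backfill_job = job
        self.log.append(f"{job}: backfilling")

    def all_tasks_ready(self):
        """Let the MPI task run once every task of the job has shown up."""
        if self.backfill_job is not None:
            # Assumption: the backfill job keeps running at lower priority.
            self.log.append(f"{self.backfill_job}: lowered priority")
        self.log.append(f"{self.mpi_task}: running")


agent = Agent()
agent.start_mpi_task("mpi-task-0")
agent.backfill("batch-job")
agent.all_tasks_ready()
```

The recorded log mirrors the order of the slides: waiting, backfilling, then the MPI task running.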
![Page 18: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/18.jpg)
Interactive Job Support
Scheduling priority:
- Interactive jobs are sent to sites with available machines.
- If no machines are available, time sharing is used.
Support for interactivity in all kinds of jobs:
- Sequential jobs and all the MPI flavors.
CrossBroker injects interactive agents that enable communication between the user and the job:
- Transparent to the user.
- Full integration with glogin and gvid.
![Page 19: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/19.jpg)
Interactive Job Support
Changes in JDL:
- INTERACTIVE: true/false. Indicates that the job is interactive and that the broker should treat it with higher priority.
- INTERACTIVEAGENT and INTERACTIVEAGENTARGUMENTS: these attributes specify the command (and its arguments) used to communicate with the user.
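One simple way to picture "higher priority" for interactive jobs is a priority queue in which interactive jobs are always served before batch jobs. The slides do not describe how the broker implements this, so the following Python sketch, including the `push` and `pop` helpers, is purely an illustrative assumption.

```python
import heapq
from itertools import count

_arrival = count()  # tie-breaker that preserves submission order


def push(queue, name, interactive):
    """Queue a job; interactive jobs (0) sort ahead of batch jobs (1)."""
    heapq.heappush(queue, (0 if interactive else 1, next(_arrival), name))


def pop(queue):
    """Return the name of the next job to schedule."""
    return heapq.heappop(queue)[2]


queue = []
push(queue, "batch-1", interactive=False)
push(queue, "interactive-1", interactive=True)
push(queue, "batch-2", interactive=False)
order = [pop(queue) for _ in range(3)]
```

The interactive job is served first even though it was submitted second, and the batch jobs keep their submission order.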
![Page 20: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/20.jpg)
Interactive Job Support
Type = "Job";
VirtualOrganisation = "imain";
JobType = "Parallel";
SubJobType = "openmpi";
NodeNumber = 11;
Interactive = TRUE;
InteractiveAgent = "glogin";
InteractiveAgentArguments = "-r -p 195.168.105.65:23433";
Executable = "test-app";
InputSandbox = {"test-app", "inputfile"};
OutputSandbox = {"std.out", "std.err"};
StdError = "std.err";
StdOutput = "std.out";
Rank = other.GlueHostBenchmarkSI00;
Requirements = other.GlueCEStateStatus == "Production";
![Page 21: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/21.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Application Launcher, Grid Resource, LRMS, Agent with VM1 and VM2, batch job, interactive job.]
![Page 22: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/22.jpg)
Time Sharing
[Figure: diagram labels: Scheduling Agent, Condor-G, CrossBroker, Application Launcher, Grid Resource, LRMS, Agent with VM1 and VM2, batch job, interactive job. Annotation: startup-time reduction, only one layer involved.]
![Page 23: I2G CrossBroker](https://reader035.vdocuments.site/reader035/viewer/2022062409/56814736550346895db4756e/html5/thumbnails/23.jpg)
Other features
Intelligent job retrial:
- Temporarily disables submission to failing sites.
Fast notification of job status:
- Better interaction with the application.
gLite interoperability:
- Accepts jobs from gLite's UI.
- Able to submit jobs to gLite resources (LCG-CE and gLite CE).
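The intelligent job retrial feature listed above, where failing sites are temporarily excluded from submission, can be illustrated with a small cooldown table. This Python sketch is an assumption about how such a mechanism might look: the `SiteBlacklist` class, its method names, and the 600-second cooldown are hypothetical and do not come from CrossBroker.

```python
class SiteBlacklist:
    """Hypothetical cooldown table for sites where a submission failed."""

    def __init__(self, cooldown_seconds=600.0):
        self.cooldown = cooldown_seconds
        self._disabled_until = {}

    def report_failure(self, site, now):
        """Disable a site for the cooldown period starting at time now."""
        self._disabled_until[site] = now + self.cooldown

    def is_enabled(self, site, now):
        """A site is usable again once its cooldown has expired."""
        return now >= self._disabled_until.get(site, 0.0)

    def filter(self, sites, now):
        """Keep only the sites that may currently receive jobs."""
        return [s for s in sites if self.is_enabled(s, now)]


blacklist = SiteBlacklist(cooldown_seconds=600.0)
sites = ["site-a", "site-b"]
blacklist.report_failure("site-a", now=1000.0)
during = blacklist.filter(sites, now=1200.0)
after = blacklist.filter(sites, now=1600.0)
```

Passing the current time explicitly keeps the sketch deterministic; a real service would more likely read a clock.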