http://www.dorii.eu/
Parametric jobs – facilitation of Instrument Elements usage in Grid
applications
INGRID 2009
Katarzyna Bylec, Szymon Mueller, Mateusz Pabiś, Mariusz Wojtysiak, Paweł Wolniewicz
Poznań Supercomputing and Networking Center
03.04.2009, Alghero, Italy
Outline
Background
DORII and HORUS application
JSDL and its extension
Parameter Sweep capabilities
Technical implementation of HORUS workflow
JAVA library for Parameter Sweep
New opportunities = new problems
Simplify access to the Grid → wider community → new users →
new use cases → new needs → extending classical e-Infrastructure → new problems
Problem: Instrument elements as a virtualisation of data sources - data processing
Input: do the same with many input files
Output: output the same file with changed content
or output many different files as a result of slightly changed processing
Goal – to streamline the process – 2 levels:
Automation
Describing the logic of job
Real life example - DORII
Situation:
well established communities (earthquake, environmental science, experimental science)
applications not or only partially integrated in the European e-Infrastructures
Applications' needs:
To make the daily work more efficient
To automate the jobs' flow
Enhance usage of scientific devices
Solutions:
Integrate applications to e-Infrastructure
Convert applications into Grid workflows
Case study: HORUS_bench
Description:
Instituto de Hidráulica Ambiental, Universidad de Cantabria
Used to process data gathered via HORUS system
Images of Puntal beach, Santander, Spain, used to measure beach user density, calculate the water line, etc.
Requirements:
User chooses a set of processing algorithms which constitute a binary model to run over input images
The same model is run over gigabytes of image data
HORUS_bench workflow
SE – predefined binary models
SE – archived images
SE – HORUS_bench output
IE – cameras on the beach
0. make photos
0.1 send photos to application
0.2 store photos on SE
1. model input data
2. processing model
3. computing
4. store results
HORUS workflow adaptation
DORII
E-Infrastructure (access to IEs)
VCR and Workflow System
Workflows: advantages
Automation of the process
Single Sign-On (MyProxy)
All of the application's tasks managed from one point
Monitoring of workflow execution
Hide the Grid complexity from user
Problems
Execution of thousands of the same jobs for different input data
JSDL specification
JSDL = Job Submission Description Language
JSDL Working Group in Open Grid Forum
Project created: 09/25/2003, version 1.0 available
Goal:
to specify an abstract standard of job description language that is independent of underlying middleware
to replace existing languages (JDL, RSL, etc.)
to make it extensible:
POSIX Application
HPC Profile Application
SPMD Application
Parameter Sweep
Basic JSDL
POSIX Application definition
Data transfer
$ ./algorithm_model_bin -inputFile file.jpg -outputDir .
Parameter Sweep - values
algorithm_model_bin -inputFile file000.jpg
algorithm_model_bin -inputFile file001.jpg
...
algorithm_model_bin -inputFile file999.jpg
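The expansion above can be sketched as a small shell loop over an explicit list of values. This is only an illustration of the idea, not how a JSDL processor implements it; the function name sweep_values is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of JSDL or g-Eclipse): print one command
# line per value passed as an argument, mimicking an explicit value list.
sweep_values() {
    for value in "$@"; do
        printf 'algorithm_model_bin -inputFile %s\n' "$value"
    done
}

# Three of the values from the slide; a full sweep would list all of them.
sweep_values file000.jpg file001.jpg file999.jpg
```

Each printed line corresponds to one job generated from the single parametric description.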
Parameter Sweep - Functions
algorithm_model_bin -inputFile file0.jpg
...
algorithm_model_bin -inputFile file68.jpg
algorithm_model_bin -inputFile file70.jpg
...
algorithm_model_bin -inputFile file665.jpg
algorithm_model_bin -inputFile file667.jpg
...
algorithm_model_bin -inputFile file999.jpg
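The listing above reads like an integer loop from 0 to 999 that skips 69 and 666. A minimal sketch of such a loop with exceptions follows, assuming that reading; the function name loop_integer and its argument order are hypothetical and do not mirror the actual schema.

```shell
#!/bin/sh
# Hypothetical helper: integer loop with excluded values.
# Arguments: start, end (inclusive), step, then any values to skip.
loop_integer() {
    start=$1
    end=$2
    step=$3
    shift 3
    i=$start
    while [ "$i" -le "$end" ]; do
        skip=0
        # Check the current index against every exception value.
        for exception in "$@"; do
            if [ "$i" -eq "$exception" ]; then
                skip=1
            fi
        done
        if [ "$skip" -eq 0 ]; then
            printf 'algorithm_model_bin -inputFile file%d.jpg\n' "$i"
        fi
        i=$((i + step))
    done
}

# Count the generated command lines for the range shown on the slide.
loop_integer 0 999 1 69 666 | wc -l
```

With the two exceptions removed from the 1000 values in the range, 998 command lines remain.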
HORUS JSDL - 2nd approach
Second approach: create one JSDL file and let HORUS take care of the many input files
#!/bin/bash
tar -xf input.tar
./algorithm_model_bin
tar -cf output.tar ./output/
Parameter sweep - FileSweep
#!/bin/bash
# some pre-processing
./algorithm_model_bin -inputFile input_name
# some post-processing
Parameter Sweep - summary
Parameter – specifies the target JSDL element to be parametrised
DocumentNode (whole value, XPath substring)
FileSweep
Function – specifies the values to be substituted for parameters
Values
LoopInteger
DoubleLoop
Assignment – defines the order and dependencies between parameters
Sweep at the same time
Independent sweep
Nested Sweep
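The three assignment modes can be contrasted with plain shell loops. This is a sketch under assumptions: the two parameters, their values, and the function names are hypothetical, and "independent" is read here as each parameter being swept on its own while the other keeps its template value.

```shell
#!/bin/sh
# Hypothetical parameters: an input file and an output directory.
inputs="a.jpg b.jpg"
outputs="outA outB"
default_in="template.jpg"    # value left in the unswept template (assumed)
default_out="template_out"

# Sweep at the same time: the n-th input is paired with the n-th output.
sweep_together() {
    set -- $outputs
    for in_file in $inputs; do
        printf '%s -> %s\n' "$in_file" "$1"
        shift
    done
}

# Independent sweep: one parameter changes while the other stays fixed.
sweep_independent() {
    for in_file in $inputs; do
        printf '%s -> %s\n' "$in_file" "$default_out"
    done
    for out_dir in $outputs; do
        printf '%s -> %s\n' "$default_in" "$out_dir"
    done
}

# Nested sweep: every input is combined with every output.
sweep_nested() {
    for in_file in $inputs; do
        for out_dir in $outputs; do
            printf '%s -> %s\n' "$in_file" "$out_dir"
        done
    done
}

sweep_together
sweep_nested
```

With two values per parameter, the simultaneous sweep yields two jobs and the nested sweep yields four, which is why the choice of assignment matters when the value lists are long.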
Limitations of JSDL and Parameter Sweep
No support for JDL collection jobs (shared sandbox) - JSDL limitation
Problem with middleware (MW) support
Middleware does not support JSDL, let alone Parameter Sweep
No parametrisation at the level of workflow
lack of workflow language that would support parametrisation
User specification awareness
sweep of elements correlated with each other in JSDL
(e.g. DataStaging of POSIX Application elements)
Technical solution
gLite
1. user submits HORUS workflow
2. Workflow manager decomposes the workflow
2.1. Each task is submitted to the Grid through CommonLib
3. Parametric JSDL is translated to JSDLs – one for each iteration
3.1. JSDL is translated to JDL
3.2. JDL is submitted to gLite
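Step 3 above, one concrete job description per iteration, can be sketched as template expansion. This is a simplification under stated assumptions: the @INPUT@ placeholder, the function name expand_template, and the job_N.jsdl file names are hypothetical, and the template is a bare fragment rather than a complete JSDL document. The real extension addresses JSDL elements rather than text placeholders.

```shell
#!/bin/sh
# Hypothetical template fragment; @INPUT@ marks the swept value.
template='<jsdl-posix:Argument>@INPUT@</jsdl-posix:Argument>'

# Arguments: output directory, then one swept value per job.
# Values are assumed to be plain file names without '|' or '&'.
expand_template() {
    out_dir=$1
    shift
    mkdir -p "$out_dir"
    n=0
    for value in "$@"; do
        # Write one expanded description per iteration.
        printf '%s\n' "$template" | sed "s|@INPUT@|$value|" > "$out_dir/job_$n.jsdl"
        n=$((n + 1))
    done
}

# Expand two iterations into a temporary directory and show the second.
demo_dir=$(mktemp -d)
expand_template "$demo_dir" file000.jpg file001.jpg
cat "$demo_dir/job_1.jsdl"
```

Each generated file would then go through the JSDL to JDL translation and submission described in steps 3.1 and 3.2.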
Workflow Editor
Workflow Manager
Common Lib: g-Eclipse
g-Eclipse JSDL and Parameter Sweep
Integrated, middleware independent Grid enabled workbench tool
EC and Eclipse project 07.2006 - 12.2008
JSDL as a default job description language
support for Parameter Sweep extension
multi-page editor for JSDL
...with special page for Parameter Sweep
g-Eclipse's JAVA Parameter Sweep Library
Standalone library extracted from g-Eclipse plug-ins
23kB JAR file
dev.eclipse.org/svnroot/technology/eu.geclipse/trunk/plugins/eu.geclipse.jsdl
g-Eclipse's JAVA Parameter Sweep Library
Work in progress
Current implementation of extension http://schemas.ogf.org/jsdl/2007/01/jsdl-sweep
(newest: 27th draft)
Extend Parameter Sweep support
XPath substring function
FileSweep
DoubleLoop
Processing of valid yet questionable values
Extract JSDL editor as a standalone Eclipse plug-in
Polish API within DORII's Common Lib
Summary
Problems:
IE shortens the time-to-grid for input data
Large amounts of input data have to be processed
Applications demand more powerful data processing
Grid complexity has to be hidden from users
Solutions:
Workflows – abstract, high level job presentation and its automation
Parametrisation – abstracting the job's logic to make processing clearer
Contact information
DORII project
www.dorii.eu
g-Eclipse
www.geclipse.eu
[email protected] or [email protected]
1.0 release: http://www.geclipse.eu/index.php?id=downloads
OGF JSDL WG
http://forge.ogf.org/sf/projects/jsdl-wg
questions/comments? [email protected]
Thank you!