GridAssist, making the Grid invisible Ruud Grim Mark ter Linden Ivan Petiteville CEOS March 2005 Argentina

TRANSCRIPT

Page 1

GridAssist, making the Grid invisible

Ruud Grim

Mark ter Linden

Ivan Petiteville

CEOS March 2005 Argentina

Page 2

GridAssist, March 2005 CEOS Argentina

Contents

• History
• Technical Details
• Operational Experiences
• Future Plans

A user-friendly service to support instrument calibration/validation & data (re-)processing.

Page 3

History

• 1997-2000: EC FP4 OASE project
  – Collaboration environments for the simulation and data processing of Earth Observation data
  – Chains of applications in a distributed environment
  – The CORBA technology used provided only limited functionality and was not properly secure (ports had to be opened in the firewall)

[Diagram: chain of applications (Atmosphere Model, OMI Simulator, Ground Data Processor, Total Ozone Column, UV Prediction), labelled in order with the institutes Dutch Space, Dutch Space, DLR-DFD, KNMI and FMI]

Page 4

GREASE Project 2002-2003 (ESA)

• Same concept, with new chassis (Grid) and powered by new engine (Globus Toolkit 2.x)

• The environment should be easy to use and should hide the underlying Grid technology from the scientific user

• Workflow and service oriented approach – more than simple chains of applications.

[Diagram: workflow connecting Service A, Service B, Service C, Service D, Service E and Service F]

Page 5

Concept

• User-friendly client tools run locally on the users' workstations for constructing workflows and monitoring jobs

• A centralized controller executes the workflows on the Grid

• The controller is implemented as a Web Service for easy and standardized access (even through firewalls)

[Diagram: workstations with client tools (LAN) connected via SOAP to the Controller, which is connected to the Grid resources (Grid)]
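The client-to-controller link described above can be illustrated with a minimal sketch of a SOAP 1.1 request, built here in Python with the standard library. The operation name `submitWorkflow`, its child elements and the namespace `urn:example:gridassist` are assumptions made for illustration; the slides do not document the controller's actual interface.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
GA_NS = "urn:example:gridassist"  # hypothetical namespace, not taken from the slides


def build_submit_request(workflow_name: str, steps: list[str]) -> bytes:
    """Build a SOAP 1.1 envelope for a hypothetical submitWorkflow operation."""
    ET.register_namespace("soapenv", SOAP_ENV)
    ET.register_namespace("ga", GA_NS)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{GA_NS}}}submitWorkflow")
    ET.SubElement(request, f"{{{GA_NS}}}name").text = workflow_name
    # One element per workflow step, in submission order
    for step in steps:
        ET.SubElement(request, f"{{{GA_NS}}}step").text = step
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    print(build_submit_request("omi-test-data", ["simulate", "process"]).decode())
```

A message like this would typically be sent as the body of an HTTP POST, which is the property that lets a Web Service controller be reached through firewalls without opening additional ports.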

Page 6

Use cases within ESA

• Instrument validation
• Mission simulation
• Archive reprocessing
• Instrument test data generation (via simulation)
• Production-on-Demand
• Concurrent design

Satisfying different functional needs:
• Collaboration
• Computing power
• Controlled provision & access of services

Page 7

Grid implementations @ESA

• Instrument validation (#2)
• Mission simulation (#3)
• Archive reprocessing
• Instrument test data generation (#1)
• Production-on-Demand
• Concurrent design

Examples (#)

1. OMI test data generation

2. ENVISAT validation

3. GAIA mission analysis

& Grid-on-Demand

Concurrent Design Facility

Page 8

UC#1: OMI (NASA AURA) (launched summer 2004)

• Main products: Ozone columns, profiles

• 6-7 GB / day (Level 0 data)

[Photos: Optical Assembly, Electronic Assembly]

Page 9

UC#1: Scanning the Earth daily

[Diagram: 2-dimensional CCD with a wavelength axis (500 pixels) and a swath axis (500 pixels); swath of 2800 km, 13 km along track (2 sec. flight); flight direction at 7 km/sec; viewing angle 114 deg]

• Continue global total ozone trends

• Nominal 13 x 24 km spatial resolution or 13 x 13 km for detecting and tracking urban-scale pollution sources

Page 10

UC#1: Test data generation

• Fall 2003: Generation of one month of simulated OMI data for Ground Segment Verification (starting early 2004)
  – 230,000 simulation runs of 2 minutes each (total 7666 hours)
  – Between 50 and 80 CPUs were used in a 6 week period
  – 32 Gb telemetry data produced and transferred to NASA
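The totals quoted above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the CPU hours from the run count and run length, and derives an idealized lower bound on elapsed time, assuming every CPU is busy for the whole campaign (an assumption for illustration; the slides give no utilization figures).

```python
RUNS = 230_000          # simulation runs (from the slide)
MINUTES_PER_RUN = 2     # duration of one run (from the slide)

total_cpu_hours = RUNS * MINUTES_PER_RUN / 60  # about 7666.7, quoted as 7666


def ideal_wall_clock_days(cpu_hours: float, cpus: int) -> float:
    """Lower bound on elapsed days if all CPUs were fully busy the whole time."""
    return cpu_hours / cpus / 24


for cpus in (50, 80):
    days = ideal_wall_clock_days(total_cpu_hours, cpus)
    print(f"{cpus} CPUs: {days:.1f} days")
# prints:
# 50 CPUs: 6.4 days
# 80 CPUs: 4.0 days
```

The reported six-week period is well above this bound, which would be consistent with shared, not fully dedicated resources, although the slides do not state the cause.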

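[Diagram: processing chain with the elements Existing GOME Data, OMI Instr. Simulator, Raw Data Generator, Level 0 Processor, Level 1b Processor and Level 2 Algorithm; data products spectrum, CCD output, telemetry, Level 0 and Level 1; regions labelled Grid and NASA GS]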

Page 11

UC#2: Instrument Validation: What is required?

• Additional validated data
  – In-situ measurements
    • Aircraft
    • Balloon
    • Ground (lidar)
  – Other space instrumentation
• Quality Assurance
• Common data sets
• Algorithms
• Tools, converters, visualization tools
• Good communication & collaboration

Page 12

UC#2: ECV Prototype (ESA THE VOICE project)

Demonstrate possibilities of e-Collaboration for cal / val

• Authorization & Authentication
• Communication (agenda, documentation)
• Access to
  – Meta data catalogue
  – Data store
  – Applications & tools
    • Under configuration control
    • In development
• Workflow Management (GridAssist)
• Publish & Subscribe

Page 13

UC#2: Validation Workflow

• Access to data stores
  – GOME Level 2
  – LIDAR (at IPSL or NILU)
• On-demand processing
• Publish/Subscribe to notify users

[Diagram, basic workflow: PS Agent; Level2 Processing (NNO/OPERA); Level2 Storage; Level1 Storage; LIDAR Storage (.utv20); NNO Collocation; "LIDAR data Processing"; LIDAR data Storage; NNO Validation; Data product Storage (collocated.out); Publish/Subscribe Notifier; Publish/Subscribe; MySQL DB. File listings passed between steps: .utv20 listing and .gol/.mdl listing]

Extended workflow: when no level 2 data is found, try re-collocating using level 1 data (Level1 Collocation).
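The extended workflow above amounts to a fallback: try collocation against level 2 data first, and use level 1 data only when nothing is found. A minimal sketch of that control flow follows; the function names, the stand-in collocators and the file names are hypothetical and only illustrate the ordering, not the actual NNO applications.

```python
from typing import Callable, Sequence

# A collocator takes a LIDAR file name and returns the matching satellite products
Collocator = Callable[[str], list]


def collocate_with_fallback(
    lidar_file: str,
    collocators: Sequence[tuple[str, Collocator]],
) -> tuple[str, list]:
    """Try each collocation step in order; return the first non-empty result."""
    for label, collocate in collocators:
        matches = collocate(lidar_file)
        if matches:
            return label, matches
    raise LookupError(f"no satellite data collocated with {lidar_file}")


# Hypothetical stand-ins for the real collocation applications
def level2_collocation(lidar_file: str) -> list:
    return []  # pretend no level 2 product overlaps this measurement


def level1_collocation(lidar_file: str) -> list:
    return ["level1_product_A"]  # made-up product name


if __name__ == "__main__":
    source, products = collocate_with_fallback(
        "station.utv20",
        [("level2", level2_collocation), ("level1", level1_collocation)],
    )
    print(source, products)
```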

Page 14

UC#2 THE VOICE Workflow Environment

[Screenshot of the workflow environment, annotated: Data stores; Applications; Workflow submission; Drag-and-Drop; Connecting: Click-and-Drop; Access to Data stores]

Page 15

UC#2: VOICE collaboration crossing boundaries

[Map of participating sites: ESAC VillaFranca; ESTEC & Dutch Space; KNMI; RIVM; IPSL; Univ Bremen; Tor Vergata; BIRA/IASB; Genève; NILU; ESRIN; NASA]

Page 16

UC#3: Gaia mission analysis

Science objectives

• Map 10^9 stars in our Galaxy
  – Astrometry
  – Photometry
  – Spectra
• Studies
  – Structure & kinematics of Galaxy
  – Stellar populations
  – Origin, formation & evolution of Galaxy
  – Stellar astrophysics
  – Cosmology
  – Extra-solar planetary science
  – Fundamental physics
• Core Processing (Global Iterative Solution) using subset of 10^8 stars with
  – Raw data
  – Calibrated data
  – Attitude data
  – Science data
• 500 TB over 5 yr
• 10^20 flop CPU

Page 17

UC#3: Gaia Processing: Foreseen architecture (May 2004)

Page 18

UC#3: GAIA collaboration

[Map of collaborating sites and their tasks: Barcelona (Core Tasks); Meudon (RVS); Heidelberg (Quick Looks); Cambridge (Photometry); Leiden (Photometry); Lund (Astrometry); Trieste (RVS); Bruxelles (ABS); Turino (Minor Planets); Geneve (Variable Stars); Nice (Fundamental Algos); Copenhague; ESTEC; Dutch Space; ESRIN; ESAC. Further labels: RVS; Database; CNES?]

• Binary star simulation with the GASS (Gaia Simulator)
  – 5 year period, submitted as 5 jobs covering 1 year each
  – Executed on 23 CPUs in 8 institutes of 5 countries
  – Total of 3.8 million CPU seconds used
  – 16.5 Gb telemetry data produced and transferred to CESCA
  – >1,100 jobs submitted in 6 months
• Data extraction from GDAAS database (Oracle)
  – Very flexible using Java as query language

Page 19

Benefits of GridAssist

• Easy and secure access to applications, data and resources

• Satisfying both collaboration & HPC needs
• Unattended execution of large and/or complex jobs using workflows
• Low failure rate (>95% of jobs are successfully completed)
• Supports logging at three levels
  – Application, GridAssist, Globus
• Few or no modifications needed to existing applications; new applications can be added quickly
• The Grid environment can easily be extended with more resources
• Easy installation

Page 20

Lessons Learned

• The GridAssist Workflow Tool proved to be a very user-friendly and intuitive tool; users can start working with it almost immediately

• It meets both High Performance Computing and collaboration needs within ESA; users are very enthusiastic

• Interface problems between applications can be detected early in the development process

• The approach of using GridAssist to run applications on the Grid is applicable to many fields that have similar scientific data processing needs (Earth Observation, Astronomy, …?)

Page 21

Future plans

• Continue development
  – Improve robustness
  – Improved workflow features, user management
  – Improved access to data stores
  – Interoperability (e.g. gLite)
• Project operations support
  – Mission analysis
  – Instrument calibration / validation
  – Application development
  – Level 3 & 4 product processing
  – Archive re-processing

Page 22

More info?

• Web site: http://www.gridassist.com/
• Contact persons:
  – Ivan Petiteville (ESA ESRIN)
    e-mail: [email protected]
    telephone: +39-06.941.80.567
  – Ruud Grim (GridAssist Project Manager)
    e-mail: [email protected]
    telephone: +31-71-5.245.416
  – Mark ter Linden (GridAssist Developer)
    e-mail: [email protected]
    telephone: +31-71-5.245.557

• Photos: courtesy ESA, NASA, KNMI and Internet

Page 23

Questions?

Develop locally, compute and collaborate globally on the Grid.

Page 24

The Grid

• Around 1998 the Grid concept was introduced:

Sharing resources in Virtual Organizations

• Demand-driven access to computing power
• Increased utilization of idle capacity
• Greater sharing of computational results

Page 25

Grid Environment

• Grid environment based on Globus Toolkit 2.x using:

• Globus Resource Allocation and Management (GRAM)
  – Remote job submission and control
  – Interface to local job management systems (PBS, LSF, Condor)
• GridFTP
  – High performance, secure, reliable data transfer
• Grid Security Infrastructure (GSI)
  – Single sign-on and secure communication
  – Based on Public Key encryption and X.509 certificates
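In Globus Toolkit 2.x, a job submitted through GRAM is described in the Resource Specification Language (RSL). As a rough illustration of what a controller has to produce for each workflow step, the sketch below assembles a simple RSL string. The helper names are hypothetical, only a handful of common attributes (`executable`, `arguments`, `stdout`, `count`) are covered, and the quoting assumes the RSL 1.0 convention of doubling an embedded double quote; how GridAssist itself builds its job descriptions is not described in the slides.

```python
from typing import Optional, Sequence


def rsl_quote(value: str) -> str:
    """Quote a value for RSL; an embedded double quote is written twice."""
    return '"' + value.replace('"', '""') + '"'


def build_rsl(
    executable: str,
    arguments: Sequence[str] = (),
    stdout: Optional[str] = None,
    count: int = 1,
) -> str:
    """Assemble a conjunction (&) of RSL relations for a simple batch job."""
    relations = [f"(executable={rsl_quote(executable)})"]
    if arguments:
        quoted = " ".join(rsl_quote(arg) for arg in arguments)
        relations.append(f"(arguments={quoted})")
    if stdout:
        relations.append(f"(stdout={rsl_quote(stdout)})")
    relations.append(f"(count={count})")
    return "&" + "".join(relations)


if __name__ == "__main__":
    print(build_rsl("/bin/echo", ["hello", "world"], stdout="out.txt"))
```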

Page 26

Features

• Workflow Tool
  – User interface implemented in Java (Windows, Linux, Unix, Mac)
  – To add / modify / remove applications, resources and properties
  – To create, start and monitor workflows
  – Embed additional (new) services, e.g. browsing in database, logging at 3 levels, converters, notification services, visualization
• Embed batch programs, not (yet) interactive
  – No requirements on language (Java, Fortran, C, IDL, …)
  – User can configure runtime parameters
• Central registry
  – Storage of information about applications and resources
  – Configuration control

Page 27

Architecture

• Implementation in Java – cross platform (tested on Windows, Linux and Mac)

[Diagram: three tiers. User Workstation: GridAssist Workflow Tool, Apache AXIS. Controller: Apache Jakarta Tomcat Web Server, Apache AXIS, GridAssist Workflow Engine, Java CoG-kit, JDBC Connector, MySQL Database. Grid Resource: Globus Toolkit, Data Processing Application. Links: SOAP and Globus specific protocols. Network labels: LAN, Grid, Grid]

Page 28

Workflow Tool: Maintaining the registry

[Screenshot, annotated: Resources; Services; Resource or service details]

Page 29

Workflow Tool: Creating the workflow

[Screenshot, annotated: Data stores; Applications; Workflow submission; Drag-and-Drop; Connecting: Click-and-Drop]

Page 30

Workflow Tool: Status Monitoring

[Screenshot, annotated: Availability & Usage; Submitted workflows & status overview]

Page 31

Hiding Grid technology

Intuitive GUI preferred

DAG structured

Dynamic execution

Fault tolerance built-in
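To make "DAG structured" and "fault tolerance" concrete, here is a minimal sketch of running a workflow in dependency order with a simple retry per task. It uses Python's standard `graphlib` module (Python 3.9+). The function name, the retry policy and the sequential execution are assumptions for illustration; the slides do not describe how the GridAssist engine actually schedules or recovers jobs.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Iterable, List


def run_workflow(
    dependencies: Dict[str, Iterable[str]],
    tasks: Dict[str, Callable[[], None]],
    max_attempts: int = 3,
) -> List[str]:
    """Run tasks in dependency order, retrying a failing task up to max_attempts."""
    completed: List[str] = []
    # dependencies maps each task to the tasks that must finish before it
    for name in TopologicalSorter(dependencies).static_order():
        for attempt in range(1, max_attempts + 1):
            try:
                tasks[name]()
                break  # task succeeded, stop retrying
            except RuntimeError:
                if attempt == max_attempts:
                    raise  # give up after the last allowed attempt
        completed.append(name)
    return completed


if __name__ == "__main__":
    deps = {"level0": {"simulator"}, "level1b": {"level0"}}
    jobs = {name: (lambda: None) for name in ("simulator", "level0", "level1b")}
    print(run_workflow(deps, jobs))
```

A real engine would presumably dispatch independent nodes of the graph in parallel on different Grid resources; the sequential loop here only shows the ordering and retry logic.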

Page 32

Data Processing Applications

• Batch programs, not interactive.
• No requirements on language (Java, Fortran, C, IDL, …).
• Applications do not have to be modified.
• Applications can be configured by the user using runtime parameters.
• A simple wrapper shell script can be written to handle the input, output and the runtime parameters.
• The application itself can be stored on the Grid resource but also on a storage node (in this case only the wrapper script needs to be present on the Grid resource).
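The slides describe the wrapper as a shell script; the same idea is sketched here in Python to show what such a wrapper has to do: check the input, prepare the output location, and pass the runtime parameters to an unmodified batch program. The function name and the `--input`, `--output` and `--<parameter>` flag convention are hypothetical, since the real interface depends on each application.

```python
import subprocess
import sys
from pathlib import Path


def run_wrapped(
    executable: list[str],
    input_path: Path,
    output_path: Path,
    params: dict[str, str],
) -> int:
    """Run a batch program, passing input, output and runtime parameters as flags."""
    if not input_path.is_file():
        raise FileNotFoundError(input_path)
    # Make sure the directory for the output file exists before the program runs
    output_path.parent.mkdir(parents=True, exist_ok=True)
    command = [*executable, "--input", str(input_path), "--output", str(output_path)]
    # Runtime parameters are appended in a stable (sorted) order
    for key, value in sorted(params.items()):
        command += [f"--{key}", value]
    # Return the program's exit code so the caller can detect failures
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Usage sketch: wrapper.py <program> <input file> <output file>
    program, src, dst = sys.argv[1:4]
    sys.exit(run_wrapped([program], Path(src), Path(dst), {}))
```

Returning the exit code unchanged lets whatever launched the wrapper decide whether the step succeeded, which is what a workflow engine would need in order to report job status.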