VDT and Interoperable Testbeds Rob Gardner University of Chicago


Page 1: VDT and Interoperable Testbeds Rob Gardner University of Chicago

VDT and Interoperable Testbeds

Rob Gardner

University of Chicago

November 12, 2002

Page 2: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Outline

VDT – Status and Plans

VDT Middleware

VDT Team

VDT release description

VDT and the LCG project

Interoperable Grids – Status and Plans

GLUE working group

Grid operations

Distributed facilities and resource monitoring

ATLAS-kit deployment

WorldGrid: iVDGL-DataTAG grid interoperability project, ATLAS SC2002

Page 3: VDT and Interoperable Testbeds Rob Gardner University of Chicago


VDT Middleware

Joint GriPhyN and iVDGL deliverable

Basic middleware for the US LHC program

VDT 1.1.5 in use by the US CMS testbed

US ATLAS testbed is installing VDT with WorldGrid components for EU interoperability

Release structure

introduces new middleware components from GriPhyN, iVDGL, EDG, and other grid developers and working groups

framework for interoperability software and schema development (e.g. GLUE)

Page 4: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Team

VDT Group

GriPhyN (development) and iVDGL (packaging, configuration, testing, deployment)

Led by Miron Livny of the University of Wisconsin-Madison

Alain Roy (CS staff, Condor team)

Scott Koranda (LIGO, University of Wisconsin, Milwaukee)

Saul Youssef (ATLAS, Boston University)

Scott Gose (iVDGL/Globus, Argonne Lab)

New iVDGL hire (CS) from Madison starting December 1

Plus a community of participants

Dantong Yu, Jason Smith (Brookhaven): mkgridmap, post-install configuration

Patrick McGuigan, UT Arlington: valuable testing/install feedback, PIPPY

Alan Tackett, Bobby Brown (Vanderbilt): installation feedback, documentation

Page 5: VDT and Interoperable Testbeds Rob Gardner University of Chicago


VDT

Basic Globus and Condor, plus EDG software (e.g. GDMP)

Plus lots of extras …

Pacman

VO management

Test harness

Glue schema

Virtual Data Libraries

Virtual Data Catalog

Language and interpreter

Server and Client

Page 6: VDT and Interoperable Testbeds Rob Gardner University of Chicago


VDT Releases

Current version: VDT 1.1.5, released October 29, 2002

Major recent upgrades (since 1.1.2)

Patches to Globus 2.0, including the OpenSSL 0.9.6g security update

A new and improved Globus job manager created by the Condor team that is more scalable and robust than the one in Globus 2.0; this job manager has been integrated into the Globus 2.2 release

Condor 6.4.3 and Condor-G 6.4.3.

New software packages, including:

FTSH (The fault tolerant shell) version 0.9.9

EDG mkgridmap (including perl modules that it depends on)

EDG CRL update

DOE and EDG CA signing policy files, so you can interact with Globus installations and users that use CAs from the DOE and EDG.

A program to tell you what version of the VDT has been installed, vdt-version.

Test programs so that you can verify that your installation works (a rough sanity-check sketch follows at the end of this slide).

The VDT can be installed anywhere--it no longer needs to be installed into the /vdt directory.

VDT configuration

Sets up Globus and GDMP daemons to run automatically

Configures Condor to work as a personal Condor installation or a central manager

Configures Condor-G

Enables the Condor job manager for Globus and performs a few other basic Globus configuration steps

Performs some of the configuration for GDMP

VDT installation logs created, better README files

We now properly set up globus-gram-job-manager-tools.sh, and ensure that it is not overwritten when more Globus bundles are installed.

Fixed mkgridmap so that it can find the perl modules correctly

Most up-to-date CA signing policy files from the EDG and DOE have been included; new signing policy for the new INFN CA
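The release notes above mention bundled test programs for checking an installation. Purely as an illustration (this script is not part of the VDT), a sanity check in that spirit might print the vdt-version output and probe the default Globus service ports; the port numbers (gatekeeper 2119, GridFTP 2811) are the usual Globus defaults and are assumptions here, not values taken from the talk.

```python
# Illustrative only -- not part of the VDT test suite.  A quick post-install
# sanity check in the spirit of the bundled test programs: print the installed
# VDT version and probe the default Globus service ports.
import socket
import subprocess

def port_open(host, port, timeout=5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # vdt-version is shipped with the release; a flagless invocation is assumed here.
    print(subprocess.run(["vdt-version"], capture_output=True, text=True).stdout)
    # Globus default ports (assumed): GRAM gatekeeper 2119, GridFTP 2811.
    for name, port in [("GRAM gatekeeper", 2119), ("GridFTP", 2811)]:
        print(f"{name} (port {port}): {'up' if port_open('localhost', port) else 'DOWN'}")
```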

Page 7: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Deployment with Pacman

Packaging and post-install configuration: Pacman

Key piece required to do anything: not only for middleware but also applications and higher-level toolkits

Tools to easily manage installation and environment

fetch, install, configure, add to login environment, update

Sits on top of many software packaging approaches (rpm, tar.gz, etc.)

Uses a dependency hierarchy, so one command can drive the installation of a complete environment of many packages (see the sketch below)

Packages organized into caches hosted at various sites

Distributes responsibility for support

Has greatly helped in testing/installation of VDT: many new features

Made it possible to quickly set up grids and application packages
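To make the dependency-driven idea above concrete, here is a minimal Python sketch of the install-order walk that a single "install WorldGrid"-style command implies. It is not Pacman's package language or code; the package names and dependency edges are hypothetical.

```python
# Illustrative sketch only: this is not Pacman's package language or code.
# It shows the dependency-driven idea -- ask for one package and the whole
# environment installs in dependency order.  Package names and dependency
# edges below are hypothetical.
deps = {
    "WorldGrid": ["VDT-Server", "ScienceGrid"],
    "VDT-Server": ["Globus", "Condor", "GDMP"],
    "ScienceGrid": ["ATLAS-kit"],
    "Globus": [], "Condor": [], "GDMP": [], "ATLAS-kit": [],
}

def install_order(target, done=None):
    """Depth-first walk that lists dependencies before the package itself."""
    done = [] if done is None else done
    for dep in deps[target]:
        install_order(dep, done)
    if target not in done:
        done.append(target)
    return done

print(install_order("WorldGrid"))
# -> ['Globus', 'Condor', 'GDMP', 'VDT-Server', 'ATLAS-kit', 'ScienceGrid', 'WorldGrid']
```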

Page 8: VDT and Interoperable Testbeds Rob Gardner University of Chicago


VDT and the LCG Project

VDT and EDG are being evaluated for the LCG-1 testbed

EDG: a collected, tagged and supported set of middleware and application software packages and procedures from the European DataGrid project, available as RPMs with a master location

Includes application software for deployment and testing on EDG sites

Most deployments expect most/all packages to be installed with a small set of uniform configurations

The base layer of software and protocols is common

Globus: X509 certificates, GSI authentication, GridFTP, MDS LDAP monitoring and discovery framework (a query sketch follows this slide), GRAM job submission

Authorization extensions: LDAP VO service

Condor: Condor-G job scheduling, matchmaking (ClassAds), directed acyclic graph job/task dependency manager (DAGMan)

File movement and storage management: GDMP, GridFTP

Possible solution: VDT + EDG WP1, WP2

If adopted, the PIs of GriPhyN and iVDGL, and the US CMS and US ATLAS computing managers, will need to define a response for the next steps for support
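As a rough sketch of the shared discovery layer referenced above, the following Python snippet performs an anonymous LDAP search against an MDS GIIS/GRIS using the python-ldap package. The host name is a placeholder, and the port (2135) and base DN ("Mds-Vo-name=local, o=grid") are the customary MDS 2.x defaults rather than values taken from the talk.

```python
# Rough sketch of the common discovery layer: an anonymous LDAP query against
# an MDS GIIS/GRIS.  Requires the python-ldap package.  The host is a
# placeholder; port 2135 and the base DN are the customary MDS 2.x defaults
# and are assumptions here.
import ldap

def query_mds(host="giis.example.org", port=2135,
              base="Mds-Vo-name=local, o=grid"):
    conn = ldap.initialize(f"ldap://{host}:{port}")
    conn.simple_bind_s()  # MDS normally allows anonymous binds
    # Walk every entry under the VO and print a few attributes per DN.
    for dn, attrs in conn.search_s(base, ldap.SCOPE_SUBTREE, "(objectclass=*)"):
        print(dn)
        for key in sorted(attrs)[:3]:
            print("   ", key, attrs[key])

if __name__ == "__main__":
    query_mds()
```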

Page 9: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Grid Operations

Operations areas

registration authority

VO management

Information infrastructure

Monitoring

Trouble tracking

Coordinated Operations

Policy

Full time effort

Leigh Grundhoefer (IU)

New hire (USC)

Part time effort

Ewa Deelman (USC)

Scott Koranda (UW)

Nosa Olmo (USC)

Dantong Yu (BNL)

Page 10: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Distributed Facilities Monitoring

VO-centric Nagios – Sergio Fantinel, Gennaro Tortonne (DataTAG)

VO-centric Ganglia – Catalin Dumitrescu (U of Chicago)

Ganglia

Cluster resource monitoring package from UC Berkeley

Local cluster and meta-clustering capabilities

Meta-daemon storage of machine sensors for CPU load, memory, and I/O (see the sketch after this slide)

Organize sensor data hierarchically

Collect information about job usages

Assign users to VO’s by queries to Globus job manager

Tool for policy development

express, monitor and enforce policies for usage according to VO agreements

Deployment

Both packages deployed on US and DataTAG ATLAS and CMS sites

Nagios plugin work by Shawn McKee – sensors for disk, I/O usage
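As a small illustration of the VO-centric Ganglia monitoring described above, the sketch below reads the XML stream that a gmond daemon publishes and summarizes one metric per host. The host name is a placeholder; the default port (8649) and the HOST/METRIC element names reflect Ganglia's usual output but are assumptions here, not details from the talk.

```python
# Sketch of VO-centric resource monitoring: read the XML that a Ganglia gmond
# daemon publishes and summarise one metric per host.  The host is a
# placeholder; port 8649 and the HOST/METRIC element names reflect Ganglia's
# usual XML stream but are assumptions here.
import socket
import xml.etree.ElementTree as ET

def read_gmond(host="gmond.example.org", port=8649):
    """Return the raw XML dump served by a gmond daemon."""
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        while data := sock.recv(4096):
            chunks.append(data)
    return b"".join(chunks)

def metric_by_host(xml_bytes, metric="load_one"):
    """Map host name -> value of one metric (e.g. 1-minute CPU load)."""
    root = ET.fromstring(xml_bytes)
    return {
        host.get("NAME"): m.get("VAL")
        for host in root.iter("HOST")
        for m in host.iter("METRIC")
        if m.get("NAME") == metric
    }

if __name__ == "__main__":
    print(metric_by_host(read_gmond()))
```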

Page 11: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Screen Shots

Page 12: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Glue Project Background

Joint effort of the iVDGL Interoperability Group and DataTAG WP4

Led by Ruth Pordes (Fermilab, also PPDG Coordinator)

Initiated by HICB (HEP Intergrid Coordination Board)

Goals

Technical

Demonstrate that each Grid service or layer can interoperate

Provide a basis for interoperation of applications

Identify differences in protocols and implementations that prevent interoperability

Identify gaps in architecture and design that prevent interoperability

Sociological

Learn to work across projects and boundaries without explicit mandate or authority, but for the "longer term good of the whole"

Intent to expand from 2 continents to global – inclusive, not exclusive

Strategic

Any Glue code, configuration and documents developed will be deployed and supported through the EDG and VDT release structure

Once interoperability is demonstrated and part of the ongoing culture of global grid middleware projects, Glue should not be needed

Provide short-term experience as input to GGF standards and protocols

Prepare the way for movement to new protocols – web and grid services (OGSA)

(report from Ruth Pordes)

Page 13: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Glue Security

Authentication – X509 Certificate Authorities, policies and trust

The DOE Science Grid SciDAC project CA is trusted by the European DataGrid and vice versa

Experiment testbeds (ATLAS, CMS, ALICE, D0 and BaBar) use cross-trusted certificates

Users are starting to understand that they need multiple certificates as well

Agreed-upon mechanisms for communicating new CAs and CRLs have been shown to work, but more automation for revocation is clearly needed

Authorization

Initial authorization mechanisms are in place everywhere using the Globus gridmap files

Various supporting procedures are used to create gridmap files from LDAP databases of certificates or by other means (see the sketch at the end of this slide)

Identified requirement for more control over access to resources at the time of the request for use, but no accepted or interoperable solutions are in place today

Virtual Organization Management or a Community Authorization Service is under active discussion – see the PPDG SiteAA mail archives (just one mail list of several):

http://www.ppdg.net/pipermail/ppdg-siteaa/2002
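In the spirit of the gridmap-file generation mentioned above (e.g. EDG mkgridmap), here is a hedged Python sketch that builds a Globus grid-mapfile from a VO LDAP server, mapping each certificate subject to one shared local account. The VO server URI, search base, and the attribute assumed to hold the subject DN are all hypothetical; real VO layouts differ.

```python
# Hedged sketch in the spirit of EDG mkgridmap (not its actual code): build a
# Globus grid-mapfile from a VO LDAP server by mapping each certificate
# subject DN to one shared local account.  The URI, search base and the
# attribute assumed to hold the subject DN are hypothetical.
import ldap

def make_gridmap(vo_uri, base, local_account, subject_attr="description"):
    conn = ldap.initialize(vo_uri)
    conn.simple_bind_s()
    lines = []
    for _dn, attrs in conn.search_s(base, ldap.SCOPE_SUBTREE,
                                    "(objectclass=*)", [subject_attr]):
        for subject in attrs.get(subject_attr, []):
            # grid-mapfile syntax: "<certificate subject>" local_user
            lines.append(f'"{subject.decode()}" {local_account}')
    return "\n".join(lines)

if __name__ == "__main__":
    print(make_gridmap("ldap://vo.example.org:389",
                       "ou=People,o=atlas,dc=example,dc=org", "atlasusr"))
```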

Page 14: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Other Glue Work Areas

GLUE Schema (for resource discovery, job submission)

EDG and the MDS LDAP schema and information were initially very different

Commitment made up front to move to common resource descriptions; the effort has taken since February – weekly phone meetings and much email

GLUE Schema compute and storage information is released in V1 in MDS 2.2 and will be in EDG V2.0 – defined with UML and LDIF

Led to a better understanding by all participants of CIM goals, and to collaboration with the CIM schema group through the GGF

File transfer and storage

Interoperability tests using GridFTP and SRM V1.0 within the US have started with some success

Joint demonstrations

Common submission to testbeds based on VDT and EDG in a variety of ways

ATLAS Grappa to iVDGL sites (US ATLAS, CMS, LIGO, SDSS) and EDG (JDL on UI)

CMS-MOP

Page 15: VDT and Interoperable Testbeds Rob Gardner University of Chicago


ATLAS-kit

ATLAS-kit RPMs based on Alessandro DeSalvo's work; distributed to DataTAG and EDG sites; release 3.2.1

Luca Vacarossa packaged a version for VDT sites with Pacman

Distributed as part of the ScienceGrid cache

ATLAS-kit-verify: invokes ATLSIM, does one event

New release 4.0.1 in preparation by AD

Distribution to US sites to be done by Yuri Smirnov (new ATLAS iVDGL hire, started November 1)

Continued work with Flavia Donno and others

Page 16: VDT and Interoperable Testbeds Rob Gardner University of Chicago


WorldGrid

http://www.ivdgl.org/worldgrid

Collaboration between US and EU grid projects

Shared use of global resources – across experiment and Grid domains

Common submission portals: ATLAS-Grappa, EDG-Genius, CMS MOP master

VO-centric “grid” monitoring: Ganglia- and Nagios-based

Infrastructure development project:

Common information index server with Globus, EDG and GLUE schema

18 sites, 6 countries, ~130 CPUs

Interoperability components (EDG schema and information providers for VDT servers, UI and JDL for EDT sites); first steps towards policy instrumentation and monitoring

Packaging with Pacman (VDT sites) and RPMs/LCFG (DataTAG sites)

ScienceGrid: ATLAS, CMS, SDSS and LIGO application suites

Page 17: VDT and Interoperable Testbeds Rob Gardner University of Chicago


WorldGrid Site

VDT installation and startup

Packaging and installation, configuration

Gridmap file generation

GIIS(s) registration

Site configuration, WorldGrid-WorkerNode testing

EDG information providers and testing

Ganglia sensors, instrumentation

Nagios plugins, display clients

EDG packaged ATLAS and CMS code

Sloan applications

Testing with Grappa-ATLAS submission (Joint EDT and US sites)

Page 18: VDT and Interoperable Testbeds Rob Gardner University of Chicago


WorldGrid at IST2002 and SC2002

Page 19: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Grappa

Web-based interface for Athena job submission to Grid resources

First one for ATLAS

Based on XCAT Science Portal technology developed at Indiana

Components Built from Jython scripts using java-based grid tools

Framework for web-based job submission

Flexible: user-definable scripts saved as 'notebooks'

Interest from GANGA team to work collaboratively on grid interfaces

IST2002 and SC2002 demos

Page 20: VDT and Interoperable Testbeds Rob Gardner University of Chicago


XCAT Science Portal

Portal framework for creating “science portals” (application-specific web portals that provide an easy and intuitive interface for job execution on the Grid)

Users compose notebooks to customize the portal (e.g. the Athena Notebook)

Jython scripts (the user notebooks): flexible Python scripting interface, easy incorporation of Java-based toolkits

Java toolkit: IU XCAT technology (component technology), CoG (Java implementation of Globus), Chimera (Java Virtual Data toolkit – GriPhyN)

HTML form interface runs over a Jakarta Tomcat server using https (secure web protocols)

Java integration increases system independence and robustness

Page 21: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Athena Notebook

Jython implementation of Java-based grid toolkits (CoG, etc.)

Job parameter input forms (Grid resources, Athena JobOptions, etc.)

Web-based framework for interactive job submission

--- integrated with ---

Script-based framework for interactive or automatic (e.g. cron-job) job submission (a minimal submission sketch follows this slide)

Remote job monitoring (for both interactive and cron-based jobs)

Atlfast and Atlsim job submission

Visual access to Grid resources

Compute resources, MAGDA catalogue, system health monitors: Ganglia, Nagios, Hawkeye, etc.

Chimera Virtual Data toolkit -- tracking of job parameters
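For the script-based submission path noted above, a Jython-flavoured sketch (Python syntax on the JVM, in the style of a notebook script) might hand an RSL job description to the Java CoG kit's GRAM binding. The GramJob class and request() method follow the Java CoG kit as commonly documented; the gatekeeper contact, RSL attributes and paths are placeholders, not the actual Grappa code.

```python
# Jython-style sketch: submit an Athena/Atlfast job through the Java CoG kit.
# Class/method names are from the Java CoG kit as commonly documented; the
# contact string, RSL and file paths below are hypothetical placeholders.
from org.globus.gram import GramJob

contact = "gatekeeper.example.org/jobmanager-condor"      # hypothetical site
rsl = ("&(executable=/share/atlas/run_atlfast.sh)"
       "(arguments=jobOptions.txt)"
       "(stdout=atlfast.out)(stderr=atlfast.err)")

job = GramJob(rsl)       # wrap the RSL job description
job.request(contact)     # submit via the gatekeeper (valid GSI proxy assumed)
print("submitted job request to " + contact)
```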

Page 22: VDT and Interoperable Testbeds Rob Gardner University of Chicago

Grappa communications flow

[Diagram: a web browsing machine (Netscape/Mozilla/Internet Explorer/PalmScape, JavaScript over https) and a script-based submission path (interactive or cron-job) both drive the Grappa portal machine running the XCAT Tomcat server. The portal uses CoG for job submission/monitoring and for data copy to compute resources A ... Z; MAGDA (spider) registers file locations and file metadata, with data storage on data disk and HPSS; input files and a browsable catalogue (http) complete the flow.]

Page 23: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Grappa Portal

Grappa is not

Just a GUI front end to external grid tools

Just a GUI linking to external web services

But rather

An integrated java implementation of grid tools

The portal itself does many tasks

Job scheduling

Data transfer

Parameter tracking

Page 24: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Grappa Milestones

Design Proposals: Spring 2001

1st Atlsim prototype: Fall 2001

1st Atlas Software Week Demo: March 2002

Selected (May 2002) for US Atlas SC2002 Demo

1st submission across entire US Atlas testbed (Feb 2002)

1st Large Scale Job submission: 50M events (April 2002)

Integration with MAGDA (May 2002)

Registration of metadata with MAGDA (June 2002)

Resource Weighted Scheduling (July 2002)

Script based production cron system (July 2002)

GriPhyN-Chimera VDL Integration (Fall 2002)

EDG Compatibility plus DataTAG integration (Fall 2002)

Page 25: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Spare Slides about Grappa Follow

Page 26: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Various Grappa/Athena modes

Has been run using locally installed libraries

Has been run using AFS libraries

Has been run with static “boxed” versions of Atlfast and Atlsim (where we bring along everything to the remote compute node)

This can translate to input data as well:

Do we bring the input data to the executable?

Bring the executable to the data

Bring everything along

Many possibilities

Page 27: VDT and Interoperable Testbeds Rob Gardner University of Chicago


A Quick Grappa Tour

Demo of running atlfast demo-production

Interactive mode

Automatic mode

GRAM contact monitoring

Links to resources external to the portal

Magda metadata queries

Ganglia cluster monitoring

Page 28: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Typical User session

Start up the portal on a selected machine

Start up a web browser

Configure/Select testbed resources

Configure input files.

Submit job

Page 29: VDT and Interoperable Testbeds Rob Gardner University of Chicago


User Session

Open monitoring window

Auto refresh (user configurable)

Monitor/cancel jobs

Page 30: VDT and Interoperable Testbeds Rob Gardner University of Chicago


User Session

Monitor Cluster health

The new Ganglia version creates an additional level combining different clusters into a “metacluster”

Page 31: VDT and Interoperable Testbeds Rob Gardner University of Chicago


User Session

Browse MAGDA catalogue

Search for personal files

Would like: search for physics collections (based on metadata)

Page 32: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Production Running

Configure cron job

Automatic job submission

Writing to MAGDA cache

Automatic subnotebook naming

Set up cron script location and timing

Script interacts with same portal as web-based

Use interactive mode to monitor production

Page 33: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Production Monitoring

Configure and test scripts using command line tools

Command line submission

Automatic cron submission

View text log files

Page 34: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Production Monitoring

Access portal via web

Cron submissions appear as new subfolders

Click on subfolder to check what was submitted

Monitor job status

Page 35: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Production Monitoring

Select jobs/groups of jobs to monitor

Auto refresh (user configurable)

Cancel job button

Page 36: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Production Monitoring

Browse MAGDA catalogue

Auto registration of files

Check metadata (currently available as a command line tool)

Search for files

Would like: search for physics collections

Page 37: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Metadata in MAGDA

Metadata published along with data files

MAGDA registers metadata

Metadata browsing available as a command line tool: check individual files

Page 38: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Chimera: Virtual Data Toolkit

Provides tracking of all job parameters

Browseable parameter catalogue

Simplified methods for data recreation (see the sketch after this list)

Processes that crash

Data that is lost

Data retrieval slower than re-creation

Condor-G job scheduling

And much more
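As a purely conceptual sketch of the virtual-data idea behind these points (this is not the Chimera VDL or its API), the snippet below records how a file was derived and re-runs the derivation when the file is missing rather than retrieving it; all names are illustrative.

```python
# Conceptual sketch of the virtual-data idea only -- not the Chimera VDL or
# its API.  Record how each file was derived, then re-run the derivation when
# the file is missing (or slower to fetch than to regenerate).  All names are
# illustrative.
import os
import subprocess

# Hypothetical provenance record: output file -> (transformation, arguments)
derivations = {
    "atlfast.ntuple": ("./run_atlfast.sh", ["jobOptions.txt"]),
}

def materialize(path):
    """Return the file path, re-deriving the file if it does not exist."""
    if not os.path.exists(path):
        exe, args = derivations[path]
        subprocess.run([exe, *args], check=True)  # re-create instead of retrieve
    return path

materialize("atlfast.ntuple")
```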

Page 39: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Grappa & US Atlas Testbed

Currently about 60 CPUs available

6 condor pools

IU, OU, UTA, BNL, BU, LBL

2 standalone machines

ANL, UMICH

Grappa/testbed production rate, cron-based: achieved 15M atlfast events/day

Page 40: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Grappa: DC1 -- phase 1

Atlfast-(demo)-production testing successful

Simple atlfast demo with several pythia options

Large scale submission across the grid

Interactive and automatic submission demonstrated

Files and meta-data incorporated into MAGDA

Atlsim production testing initially successful

Atlsim run in both “boxed” and “local” modes.

Only one atlsim mode tested

Page 41: VDT and Interoperable Testbeds Rob Gardner University of Chicago


Grappa: DC1 -- phase 2

Atlsim notebook upgrade:

Could make a notebook tailor-made for phase 2

Launching mechanism for atlsim

Atlsim does a lot that Grappa could do

vdc queries

data transfer (input or output)

leave it in atlsim for now -- or --

incorporate some pieces into grappa

Would require additional manpower to define/incorporate/test the atlsim production notebook