TRANSCRIPT
VDT and Interoperable Testbeds
Rob Gardner
University of Chicago
November 12, 2002
Outline
VDT – Status and Plans
VDT Middleware
VDT Team
VDT release description
VDT and the LCG project
Interoperable Grids – Status and Plans
GLUE working group
Grid operations
Distributed facilities and resource monitoring
ATLAS-kit deployment
WorldGrid: iVDGL-DataTAG grid interoperability project, ATLAS SC2002
VDT Middleware
Joint GriPhyN and iVDGL deliverable
Basic middleware for US LHC program
VDT 1.1.5 in use by US CMS testbed
US ATLAS testbed is installing VDT with WorldGrid components for
EU interoperability
Release structure
introduce new middleware components from GriPhyN, iVDGL, EDG,
and other grid developers and working groups
framework for interoperability software and schema development
(e.g. GLUE)
Team
VDT Group
GriPhyN (development) and iVDGL (packaging, configuration, testing,
deployment)
Led by Miron Livny of the University of Wisconsin-Madison
Alain Roy (CS staff, Condor team)
Scott Koranda (LIGO, University of Wisconsin-Milwaukee)
Saul Youssef (ATLAS, Boston University)
Scott Gose (iVDGL/Globus, Argonne Lab)
New iVDGL hire (CS) from Madison starting December 1
Plus a community of participants
Dantong Yu, Jason Smith (Brookhaven): mkgridmap, post-install configuration
Patrick McGuigan, UT Arlington: valuable testing/install feedback, PIPPY
Alan Tackett, Bobby Brown (Vanderbilt): installation feedback, documentation
VDT
Basic Globus and Condor, plus EDG software (e.g. GDMP)
Plus lots of extras …
Pacman
VO management
Test harness
Glue schema
Virtual Data Libraries
Virtual Data Catalog
Language and interpreter
Server and Client
VDT Releases
Current version: VDT 1.1.5, released October 29, 2002
Major recent upgrades (since 1.1.2)
Patches to Globus 2.0, including the OpenSSL 0.9.6g security update
A new and improved Globus job manager, created by the Condor team, that is more scalable and robust than the one in Globus 2.0; this
job manager has been integrated into the Globus 2.2 release
Condor 6.4.3 and Condor-G 6.4.3
New software packages, including:
FTSH (the fault tolerant shell), version 0.9.9
EDG mkgridmap (including the Perl modules it depends on)
EDG CRL update
DOE and EDG CA signing policy files, so you can interact with Globus installations and users that use CAs from the DOE and EDG
vdt-version, a program to tell you what version of the VDT has been installed
Test programs so that you can verify that your installation works (see the verification sketch at the end of this section)
The VDT can be installed anywhere -- it no longer needs to be installed into the /vdt directory
VDT configuration
Sets up the Globus and GDMP daemons to run automatically
Configures Condor to work as a personal Condor installation or a central manager
Configures Condor-G to work
Enables the Condor job manager for Globus, and a few other basic Globus configuration steps
Performs some of the configuration for GDMP
VDT installation logs created, better README files
globus-gram-job-manager-tools.sh is now set up properly and is not overwritten when more Globus bundles are installed
Fixed mkgridmap so that it can find the Perl modules correctly
The most up-to-date CA signing policy files from the EDG and DOE have been included; new signing policy for the new INFN CA
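As a concrete illustration of the verification step, here is a minimal post-install check a site administrator might script. This is a sketch, not part of the VDT release: it assumes vdt-version and the Globus 2 client tools are on the PATH, and the gatekeeper contact string is a placeholder.
```python
# Illustrative post-install check (a sketch, not part of the VDT release).
# Assumes vdt-version and the Globus 2 client tools are on the PATH;
# the gatekeeper contact string is a placeholder for a real site.
import subprocess

GATEKEEPER = "gatekeeper.example.edu/jobmanager"  # hypothetical contact

def run(cmd):
    """Run a command and return its exit code and captured standard output."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout.strip()

# Report which VDT release is installed.
code, version = run(["vdt-version"])
print("VDT release:", version if code == 0 else "unknown (vdt-version failed)")

# Submit a trivial job through the local gatekeeper to confirm GRAM works.
code, output = run(["globus-job-run", GATEKEEPER, "/bin/hostname"])
print("GRAM test:", "OK" if code == 0 else "FAILED", output)
```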
Deployment with Pacman
Packaging and post-install configuration: Pacman
Key piece required to do anything: not only for middleware but also
applications and higher level toolkits
Tools to easily manage installation and environment
fetch, install, configure, add to login environment, update
Sits over top of many software packaging approaches (rpm, tar.gz, etc.)
Uses a dependency hierarchy, so one command can drive the installation of a
complete environment of many packages (see the sketch at the end of this section)
Packages organized into caches hosted at various sites
Distribute responsibility for support
Has greatly helped in testing/installation of VDT: many new features
Made it possible to quickly set up grids and application packages
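The central idea, one command pulling in a complete environment by walking a dependency hierarchy across caches, can be sketched as below. This is not Pacman's actual code, cache format, or command syntax; the package names and the install step are illustrative only.
```python
# Schematic of dependency-driven installation in the spirit of Pacman
# (not Pacman's real implementation or cache format; names are illustrative).

# A toy "cache": each package lists the packages it depends on.
CACHE = {
    "VDT-Server": ["Globus", "Condor", "GDMP"],
    "Globus":     ["OpenSSL"],
    "Condor":     [],
    "GDMP":       ["Globus"],
    "OpenSSL":    [],
}

def install(package, installed=None):
    """Install a package after recursively installing its dependencies."""
    if installed is None:
        installed = set()
    if package in installed:
        return installed
    for dependency in CACHE.get(package, []):
        install(dependency, installed)
    print("fetch + install + configure:", package)  # real work would happen here
    installed.add(package)
    return installed

# One command drives the installation of the whole environment.
install("VDT-Server")
```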
VDT and LCG Project
VDT and EDG are being evaluated for the LCG-1 testbed
EDG: a collected, tagged, and supported set of middleware and application software packages and
procedures from the European DataGrid project, available as RPMs with a master location
Includes application software for deployment and testing on EDG sites
Most deployments expect most or all packages to be installed with a small set of uniform
configurations
Base layer of software and protocols is common
Globus: X.509 certificates, GSI authentication, GridFTP, MDS LDAP monitoring and discovery
framework, GRAM job submission
Authorization extensions: LDAP VO service
Condor: Condor-G job scheduling, matchmaking (ClassAds), directed acyclic graph job/task
dependency manager (DAGMan); see the submit-file sketch at the end of this section
File movement and storage management: GDMP, GridFTP
Possible solution: VDT + EDG WP1, WP2
If adopted, the PIs of GriPhyN and iVDGL and the US CMS and US ATLAS computing managers will need to define a response on next steps for support
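To make the Condor-G piece of that common layer concrete, the sketch below writes a classic globus-universe submit description and hands it to condor_submit. The gatekeeper contact and file names are placeholders; the Condor-G manual is the authoritative reference for submit-description syntax.
```python
# Sketch: submit a trivial test job to a remote gatekeeper with Condor-G.
# The gatekeeper contact and file names are placeholders.
import subprocess

SUBMIT_DESCRIPTION = """\
universe            = globus
globusscheduler     = gatekeeper.example.edu/jobmanager-condor
executable          = /bin/hostname
transfer_executable = false
output              = test.out
error               = test.err
log                 = test.log
queue
"""

with open("test.submit", "w") as submit_file:
    submit_file.write(SUBMIT_DESCRIPTION)

# condor_submit is the standard Condor client command.
subprocess.run(["condor_submit", "test.submit"], check=True)
```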
Grid Operations
Operations areas
Registration authority
VO management
Information infrastructure
Monitoring
Trouble tracking
Coordinated operations
Policy
Full-time effort
Leigh Grundhoefer (IU)
New hire (USC)
Part-time effort
Ewa Deelman (USC)
Scott Koranda (UW)
Nosa Olmo (USC)
Dantong Yu (BNL)
Distributed Facilities Monitoring
VO-centric Nagios – Sergio Fantinel, Gennaro Tortonne (DataTAG)
VO-centric Ganglia – Catalin Dumitrescu (U of Chicago)
Ganglia
Cluster resource monitoring package from UC Berkeley
Local cluster and meta-clustering capabilities
Meta-daemon storage of machine sensors for CPU load, memory, I/O
Organizes sensor data hierarchically
Collects information about job usage
Assigns users to VOs by queries to the Globus job manager (see the roll-up sketch at the end of this section)
Tool for policy development
Express, monitor, and enforce policies for usage according to VO agreements
Deployment
Both packages deployed on US and DataTAG ATLAS and CMS sites
Nagios plugin work by Shawn McKee – sensors for disk and I/O usage
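A rough sketch of what a VO-centric roll-up involves: Ganglia's gmond publishes host metrics as XML to any client that connects to its TCP port (8649 by default), and a VO view groups those metrics by the virtual organization each host or job belongs to. The host names, the host-to-VO map, and the choice of metric below are assumptions for illustration; Catalin Dumitrescu's actual VO-centric Ganglia tools are not reproduced here.
```python
# Sketch of a VO-centric roll-up of Ganglia host metrics (illustrative only).
import socket
import xml.etree.ElementTree as ET

GMOND_ADDRESS = ("gmond.example.edu", 8649)   # 8649 is gmond's usual XML port
HOST_TO_VO = {                                 # hypothetical host -> VO mapping
    "node01.example.edu": "usatlas",
    "node02.example.edu": "uscms",
}

def fetch_gmond_xml(address):
    """Read the XML dump that gmond writes to any connecting client."""
    sock = socket.create_connection(address)
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
    sock.close()
    return b"".join(chunks)

def load_by_vo(xml_bytes):
    """Sum the one-minute load average per VO using the host-to-VO map."""
    totals = {}
    for host in ET.fromstring(xml_bytes).iter("HOST"):
        vo = HOST_TO_VO.get(host.get("NAME"), "unknown")
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == "load_one":
                totals[vo] = totals.get(vo, 0.0) + float(metric.get("VAL"))
    return totals

print(load_by_vo(fetch_gmond_xml(GMOND_ADDRESS)))
```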
Screen Shots
Glue Project
Background
Joint effort of the iVDGL Interoperability Group and DataTAG WP4
Led by Ruth Pordes (Fermilab, also PPDG Coordinator)
Initiated by the HICB (HEP Intergrid Coordination Board)
Goals
Technical
Demonstrate that each grid service or layer can interoperate
Provide a basis for interoperation of applications
Identify differences in protocols and implementations that prevent interoperability
Identify gaps in architecture and design that prevent interoperability
Sociological
Learn to work across projects and boundaries without explicit mandate or authority, but for
the “longer term good of the whole”
Intent to expand from 2 continents to global - inclusive, not exclusive
Strategic
Any Glue code, configuration, and documents developed will be deployed and supported
through the EDG and VDT release structures
Once interoperability is demonstrated and part of the ongoing culture of global grid
middleware projects, Glue should not be needed
Provide short-term experience as input to GGF standards and protocols
Prepare the way for movement to new protocols - web and grid services (OGSA)
Report from Ruth Pordes
Glue Security
Authentication - X.509 Certificate Authorities, policies, and trust
The DOE Science Grid SciDAC project CA is trusted by the European Data Grid and vice versa
Experiment testbeds (ATLAS, CMS, ALICE, D0, and BaBar) use cross-trusted certificates
Users are starting to understand that they may also need multiple certificates
Agreed-upon mechanisms for communicating new CAs and CRLs have been shown to work, but
more automation for revocation is clearly needed
Authorization
Initial authorization mechanisms are in place everywhere using Globus gridmap files
Various supporting procedures are used to create gridmap files from LDAP databases of
certificates or by other means (see the sketch at the end of this section)
Identified requirement for more control over access to resources at the time of the request for use, but
no accepted or interoperable solutions are in place today
Virtual Organization Management or a Community Authorization Service is under active discussion
- see the PPDG SiteAA mail archives (just one mail list of several):
http://www.ppdg.net/pipermail/ppdg-siteaa/2002
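The gridmap-file generation mentioned above (mkgridmap and similar procedures) amounts to querying a VO's LDAP directory for member certificate subjects and mapping each subject to a local account. A minimal sketch using the python-ldap package follows; the server URL, search base, the attribute assumed to hold the subject DN, and the local account are all assumptions, not the EDG mkgridmap configuration.
```python
# Sketch of building a grid-mapfile from a VO LDAP directory (illustrative;
# not EDG mkgridmap). Requires the python-ldap package. The server, base DN,
# attribute name, and local account below are hypothetical.
import ldap

VO_SERVER     = "ldap://vo.example.org"
BASE_DN       = "ou=usatlas,o=vo,dc=example,dc=org"
LOCAL_ACCOUNT = "usatlas01"

connection = ldap.initialize(VO_SERVER)
connection.simple_bind_s()   # anonymous bind

# Assume each member entry carries the certificate subject in 'description'.
entries = connection.search_s(BASE_DN, ldap.SCOPE_SUBTREE,
                              "(objectClass=person)", ["description"])

with open("grid-mapfile", "w") as gridmap:
    for _dn, attributes in entries:
        for subject in attributes.get("description", []):
            # Each grid-mapfile line maps a certificate subject to an account.
            gridmap.write('"%s" %s\n' % (subject.decode(), LOCAL_ACCOUNT))
```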
Other Glue Work Areas
GLUE Schema (for resource discovery, job submission)
The EDG and MDS LDAP schemas and information were initially very different
Commitment made up front to move to common resource descriptions
The effort has run since February - weekly phone meetings and much email
GLUE Schema compute and storage information is released as V1 in MDS 2.2
and will be in EDG V2.0 - defined with UML and LDIF (see the query sketch at the end of this section)
Led to a better understanding by all participants of CIM goals and to collaboration with the
CIM schema group through the GGF
File transfer and storage
Interoperability tests using GridFTP and SRM V1.0 within the US have started with
some success
Joint demonstrations
Common submission to testbeds based on VDT and EDG in a variety of ways
ATLAS Grappa to iVDGL sites (US ATLAS, CMS, LIGO, SDSS) and EDG (JDL on UI)
CMS-MOP
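Because the GLUE compute and storage descriptions are published through MDS, which is LDAP-based, a consumer can discover resources with an ordinary LDAP search. The sketch below shows the pattern; the GRIS host, the objectClass filter, and the GLUE attribute names are given for illustration only and should be checked against the released GLUE schema documents.
```python
# Sketch: query a GLUE-publishing MDS/GRIS with an ordinary LDAP search.
# Host, port, base DN, objectClass, and attribute names are illustrative.
import ldap

GRIS_URL = "ldap://gris.example.edu:2135"   # 2135 is the usual GRIS port
BASE_DN  = "mds-vo-name=local,o=grid"

connection = ldap.initialize(GRIS_URL)
connection.simple_bind_s()   # MDS typically allows anonymous reads

# Look for compute elements described with the GLUE schema.
results = connection.search_s(BASE_DN, ldap.SCOPE_SUBTREE,
                              "(objectClass=GlueCE)",
                              ["GlueCEUniqueID", "GlueCEInfoTotalCPUs"])

for dn, attributes in results:
    print(dn)
    for name, values in attributes.items():
        print("  %s: %s" % (name, b", ".join(values).decode()))
```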
ATLAS-kit
ATLAS-kit RPMs based on Alessandro DeSalvo's work; distributed to
DataTAG and EDG sites; release 3.2.1
Luca Vacarossa packaged a version for VDT sites with Pacman
Distributed as part of ScienceGrid cache
ATLAS-kit-verify: invokes ATLSIM, does one event
New release 4.0.1 in preparation by AD
Distribution to US sites to be done by Yuri Smirnov (new ATLAS
iVDGL hire, started November 1)
Continued work with Flavia Donno and others
WorldGrid
http://www.ivdgl.org/worldgrid
Collaboration between US and EU grid projects
Shared use of Global resources – across experiment and Grid domains
Common submission portals: ATLAS-Grappa, EDG-Genius, CMS MOP master
VO centric “grid” monitoring: Ganglia and Nagios-based
Infrastructure development project:
Common information index server with Globus, EDG and GLUE schema
18 sites, 6 countries, ~130 CPUs
Interoperability components (EDG schema and information providers for VDT servers; UI and JDL for
EDT sites); first steps towards policy instrumentation and monitoring
Packaging
with Pacman (VDT sites) and RPMs/lcfg (DataTAG sites)
ScienceGrid:
ATLAS, CMS, SDSS and LIGO application suites
WorldGrid Site
VDT installation and startup
Packaging and installation, configuration
Gridmap file generation
GIIS(s) registration
Site configuration, WorldGrid-WorkerNode testing (see the site-check sketch at the end of this section)
EDG information providers and testing
Ganglia sensors, instrumentation
Nagios plugins, display clients
EDG packaged ATLAS and CMS code
Sloan applications
Testing with Grappa-ATLAS submission (Joint EDT and US sites)
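For the site-testing step above, a basic end-to-end check is to run a trivial job through each participating gatekeeper. The site list below is hypothetical; globus-job-run is the standard Globus 2 client used for such checks.
```python
# Sketch: run a trivial fork job through each WorldGrid gatekeeper and report
# pass/fail. The gatekeeper list is hypothetical.
import subprocess

GATEKEEPERS = [
    "atlas-gk.example.edu/jobmanager-fork",
    "cms-gk.example.it/jobmanager-fork",
]

for contact in GATEKEEPERS:
    result = subprocess.run(["globus-job-run", contact, "/bin/hostname"],
                            capture_output=True, text=True)
    status = "OK" if result.returncode == 0 else "FAILED"
    print("%-40s %-6s %s" % (contact, status, result.stdout.strip()))
```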
WorldGrid at IST2002 and SC2002
Grappa
Web-based interface for Athena job submission to Grid resources
First one for ATLAS
Based on XCAT Science Portal technology developed at Indiana
Components: built from Jython scripts using Java-based grid tools (see the sketch at the end of this section)
Framework for web-based job submission
Flexible: user-definable scripts saved as 'notebooks'
Interest from GANGA team to work collaboratively on grid interfaces
IST2002 and SC2002 demos
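As a hedged illustration of the Jython-plus-Java-grid-tools pattern (not actual Grappa notebook code), the fragment below submits a GRAM job through the Java CoG Kit. It assumes the CoG Kit classes are on the Jython classpath; the RSL string and gatekeeper contact are placeholders.
```python
# Illustrative Jython fragment in the Grappa style (not actual Grappa code).
# Assumes the Java CoG Kit is on the classpath; the gatekeeper contact and
# RSL string below are placeholders.
from org.globus.gram import GramJob

CONTACT = "gatekeeper.example.edu/jobmanager-condor"   # hypothetical site
RSL = "&(executable=/bin/hostname)(stdout=hostname.out)"

job = GramJob(RSL)       # wrap the RSL in a GRAM job object
job.request(CONTACT)     # submit through the remote gatekeeper
print("job submitted to " + CONTACT)
```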
XCAT Science Portal
Portal framework for creating “science portals” (application-specific web portals that provide an easy and intuitive interface for job execution on the Grid)
Users compose notebooks to customize the portal (e.g. the Athena Notebook)
Jython scripts (the user notebooks)
Flexible Python scripting interface
Easy incorporation of Java-based toolkits
Java toolkits
IU XCAT technology (component technology)
CoG (Java implementation of Globus)
Chimera (Java virtual data toolkit - GriPhyN)
HTML form interface runs over a Jakarta Tomcat server using https (secure web
protocols)
Java integration increases system independence and robustness
Athena Notebook
Jython implementation of Java-based grid toolkits (CoG, etc.)
Job parameter input forms (grid resources, Athena JobOptions, etc.)
Web-based framework for interactive job submission
--- integrated with ---
Script-based framework for interactive or automatic (e.g. cron-job) job submission
Remote job monitoring (for both interactive and cron-based jobs)
Atlfast and Atlsim job submission
Visual access to grid resources
Compute resources
MAGDA catalogue
System health monitors: Ganglia, Nagios, Hawkeye, etc.
Chimera Virtual Data toolkit -- tracking of job parameters
Grappa communications flow
[Diagram: a web browsing machine (Netscape/Mozilla/Internet Explorer/PalmScape) talks over https/JavaScript to the Grappa portal machine running the XCAT Tomcat server; script-based submission (interactive or cron-job) drives the same portal; CoG handles job submission/monitoring and data copy out to compute resources A ... Z; MAGDA registers file locations and file metadata, with a spider over data storage (data disk, HPSS) and input files; the catalogue can be browsed over http.]
Grappa Portal
Grappa is not
Just a GUI front end to external grid tools
Just a GUI linking to external web services
But rather
An integrated java implementation of grid tools
The portal itself does many tasks
Job scheduling
Data transfer
Parameter tracking
Grappa Milestones
Design Proposals: Spring 2001
1st Atlsim prototype: Fall 2001
1st Atlas Software Week Demo: March 2002
Selected (May 2002) for US Atlas SC2002 Demo
1st submission across entire US Atlas testbed (Feb 2002)
1st large-scale job submission: 50M events (April 2002)
Integration with MAGDA (May 2002)
Registration of metadata with MAGDA (June 2002)
Resource Weighted Scheduling (July 2002)
Script based production cron system (July 2002)
GriPhyN-Chimera VDL Integration (Fall 2002)
EDG Compatibility plus DataTAG integration (Fall 2002)
Spare Slides about Grappa Follow
Various Grappa/Athena modes
Has been run using locally installed libraries
Has been run using AFS libraries
Has been run with static “boxed” versions of Atlfast and Atlsim
(where we bring along everything to the remote compute node)
This can apply to the input data as well (see the sketch after this list):
Do we bring the input data to the executable?
Bring the executable to the data?
Bring everything along?
Many possibilities
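A small sketch of those three staging choices is shown below. The stage() and submit() helpers are hypothetical stand-ins; in a real run the transfers would go through GridFTP/CoG and the submission through GRAM or Condor-G.
```python
# Sketch of the three staging strategies discussed above. stage() and submit()
# are hypothetical stand-ins for GridFTP transfers and GRAM/Condor-G submission.

def stage(item, site):
    print("copy %s -> %s" % (item, site))

def submit(site, executable):
    print("run %s at %s" % (executable, site))

def run_job(site, executable, inputs, mode):
    if mode == "data-to-executable":      # bring the input data to the code
        for input_file in inputs:
            stage(input_file, site)
    elif mode == "executable-to-data":    # bring the code to the data
        stage(executable, site)
    elif mode == "boxed":                 # bring everything along
        stage(executable, site)
        for input_file in inputs:
            stage(input_file, site)
    submit(site, executable)

run_job("remote.example.gov", "atlsim-boxed.tar.gz", ["pythia.in"], mode="boxed")
```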
A Quick Grappa Tour
Demo of running atlfast demo-production
Interactive mode
Automatic mode
GRAM contact monitoring
Links to resources external to the portal
Magda metadata queries
Ganglia cluster monitoring
Typical User session
Start up the portal on a selected machine
Start up a web browser
Configure/Select testbed resources
Configure input files.
Submit job
User Session
Open monitoring window
Auto refresh (user configurable)
Monitor/cancel jobs
User Session
Monitor Cluster health
The new Ganglia version creates an additional level combining different clusters into a “metacluster”
User Session
Browse the MAGDA catalogue
Search for personal files
Would like: search for physics collections (based on metadata)
Production Running
Configure cron job
Automatic job submission
Writing to the MAGDA cache
Automatic subnotebook naming
Set up cron script location and timing
The script interacts with the same portal as the web-based interface
Use interactive mode to monitor production
Production Monitoring
Configure and test scripts using command-line tools
Command-line submission
Automatic cron submission
View text log files
Production Monitoring
Access portal via web
Cron submissions appear as new subfolders
Click on subfolder to check what was submitted
Monitor job status
Production Monitoring
Select jobs/groups of jobs to monitor
Auto refresh (user configurable)
Cancel job button
Production Monitoring
Browse MAGDA catalogue
Auto registration of files
Check metadata (currently available as a command-line tool)
Search for files
Would like: search for physics collections
Metadata in MAGDA
Metadata published along with data files
MAGDA registers metadata
Metadata browsing available as a command-line tool: check individual files
Chimera: Virtual Data Toolkit
Provides tracking of all job parameters
Browseable parameter catalogue
Simplified methods for data re-creation (see the sketch at the end of this section), for cases such as:
Processes that crash
Data that is lost
Data retrieval slower than re-creation
Condor-G job scheduling
And much more
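The re-creation idea can be sketched simply: if the catalogue records, for each derived file, the transformation, inputs, and parameters that produced it, a lost file can be regenerated on demand. This is a conceptual illustration only, not Chimera's VDL, catalogue schema, or API.
```python
# Conceptual sketch of virtual-data re-creation (not Chimera's VDL or API).
import os

CATALOGUE = {
    # derived file : (transformation, inputs, parameters) -- illustrative entry
    "ntuple.root": ("atlfast", ["events.pythia"], {"nevents": 1000}),
}

def run_transformation(name, inputs, parameters, output):
    print("running %s(%s, %s) -> %s" % (name, inputs, parameters, output))
    open(output, "w").close()          # stand-in for the real job

def materialize(filename):
    """Return the file, re-deriving it from the catalogue if it is missing."""
    if not os.path.exists(filename):
        transformation, inputs, parameters = CATALOGUE[filename]
        for input_file in inputs:      # inputs may themselves be virtual data
            if input_file in CATALOGUE:
                materialize(input_file)
        run_transformation(transformation, inputs, parameters, filename)
    return filename

materialize("ntuple.root")
```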
Grappa & US Atlas Testbed
Currently about 60 CPUs available
6 condor pools
IU, OU, UTA, BNL, BU, LBL
2 standalone machines
ANL, UMICH
Grappa/testbed production rate, cron-based,
Achieved 15M atlfast events/day
Grappa: DC1 -- phase 1
Atlfast-(demo)-production testing successful
Simple atlfast demo with several pythia options
Large scale submission across the grid
Interactive and automatic submission demonstrated
Files and meta-data incorporated into MAGDA
Atlsim production testing initially successful
Atlsim run in both “boxed” and “local” modes.
Only one atlsim mode tested
Grappa: DC1 -- phase 2
Atlsim notebook upgrade:
Could make the notebook tailor-made for phase 2
Launching mechanism for Atlsim
Atlsim does a lot that Grappa could do
VDC queries
Data transfer (input or output)
Leave it in Atlsim for now -- or --
incorporate some pieces into Grappa
Would require additional manpower to define, incorporate, and test the
Atlsim production notebook