EU DataGrid TestBed 2 Component Review
Paul Millar (University of Glasgow)
(slides based on a presentation by Erwin Laure)
GridPP 6th Collaboration Meeting – 30-31 Jan. 2003 – n° 1
Outline
Middleware developments for TB 2.0
WP1 - Workload Management System
WP2 & 7 - Architecture, Visualisation tools, Data Management and Replica Management Components
WP5 - The Storage Element
WP3 - R-GMA
WP4 - LCFGng
Release plan on application testbed
Summary
Workload Management System
WMS architecture reviewed
To apply the lessons learned and address the problems found in the first release of the software
e.g. delegate some functionality to pluggable modules, etc.
To ease interoperability with other Grid frameworks
To increase the reliability of the system
To support new functionality:
User APIs (including a Java GUI)
Job checkpointing
Job partitioning
Interactive jobs
Dependencies of jobs
Parallel jobs
WP 1
Network Monitoring Architecture
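[Figure: network monitoring architecture. Layers: Measure, Collect and Storage, Processing, Visualization. Components: PCP; Iperf, UDPmon, GridFTP, PingEr, RTPL; distributed data collector; raw and archive storage; information services (R-GMA); forecaster; NetworkCost; web; MapCenter. Also shown: replica managers and resource brokers, network managers.]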
WP 7
Network Monitoring Architecture
PCP (Probe Coordination Protocol, developed by WP7) schedules all active network measurements and avoids conflicts between them.
Standard and ad-hoc developed tools to measure network metrics
PingEr, IperfEr, UDPmon, rTPL, SNMP proxy agent …
Dedicated R-GMA infrastructure stores all network metrics and GridFTP logging.
Producers for the main network metrics are available. The R-GMA Archiver will keep historical values of the network metrics.
WP 7
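The slides do not describe how PCP resolves conflicts, so the following is only a minimal sketch of the general idea: assign start times so that no host takes part in two active probes at once. The `Measurement` type, the `schedule` function, the site names, and the greedy strategy are all assumptions made for illustration, not the PCP design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    tool: str      # e.g. "iperf" or "udpmon" (illustrative labels)
    source: str    # host that starts the probe
    target: str    # host that receives the probe
    duration: int  # seconds the probe occupies both hosts


def schedule(measurements):
    """Greedily assign start times so no host runs two probes at once."""
    busy_until = {}  # host -> time at which the host is free again
    plan = []
    for m in measurements:
        # A probe may start only when both of its end points are free.
        start = max(busy_until.get(m.source, 0), busy_until.get(m.target, 0))
        end = start + m.duration
        busy_until[m.source] = end
        busy_until[m.target] = end
        plan.append((start, end, m))
    return plan


probes = [
    Measurement("iperf", "siteA", "siteB", 30),
    Measurement("udpmon", "siteA", "siteC", 10),
    Measurement("iperf", "siteC", "siteD", 30),
]
plan = schedule(probes)
for start, end, m in plan:
    print(f"{start:3d}-{end:3d}s  {m.tool:7s} {m.source} -> {m.target}")
```

The greedy pass is simple rather than optimal: it only tracks when each host next becomes free, so a later probe is never slotted into an earlier idle gap.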
Visualization Tools
MapCenter TopoGrid
RTPL
WP 7
GetNetworkCost Suite
getNetworkCost functions assist replica managers and resource brokering
Based on various back-ends for flexibility:
CGI and Globus MDS back-ends in release 1
R-GMA back-end in release 2
Web Services back-end also under development
Based on regular TCP throughput measurements (release 1)
Parameters to be added for enhanced precision:
GridFTP logging information: the more the grid is used, the more precise the results
historical data stored in the R-GMA Archiver
other network metrics (RTT, jitter…)
Forecasting methods will also be tested.
WP 7
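As a rough illustration of a cost derived from throughput measurements, the sketch below estimates a transfer time from the mean of recent throughput samples. The function name `estimate_transfer_time`, its parameters, and the use of a plain mean are assumptions for this example; the actual getNetworkCost interface is not shown in the slides.

```python
from statistics import mean


def estimate_transfer_time(file_size_bytes, throughput_samples_bps):
    """Return the expected transfer time in seconds.

    throughput_samples_bps holds recent TCP throughput measurements
    between two sites, in bits per second.
    """
    if not throughput_samples_bps:
        raise ValueError("no throughput measurements available")
    average_bps = mean(throughput_samples_bps)
    return file_size_bytes * 8 / average_bps


# A 1 GB file over a link measured at 80 and 120 Mbit/s.
seconds = estimate_transfer_time(10**9, [80e6, 120e6])
print(f"expected transfer time: {seconds:.1f} s")
```

More samples, for instance from GridFTP logs, would smooth the average, which is one way to read the remark that precision improves as the grid is used more.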
Data Management
Moving to the Web Service paradigm:
GSI-enabled Web Services
Secure SOAP communication
Components:
Security : GSI-enabling Web Services fitting the EDG security framework
RLS : Replica Location Service
RMS : Replica Management Services
RMC: Replica Metadata Catalogue
Spitfire 2 : Customizable Grid RDBMS access
WP 2
Replica Management Services
[Figure: Replica Management Services. The client (EDG-ReplicaManager) sits in front of the optimisation (RepOptimisationService), replica metadata (RepMetadataCatalogue), replica location (RepLocationService) and file transfer (GridFTP) services.]
WP 2
Replica Manager components
ERM: EDG Replica Manager
Client interface and API
Entry point for all clients
ROS: Replication Optimization Service
Replica selection based on network metrics (WP7)
RLS: Replica Location Service
Local Replica Catalog services (LRC): logical to physical file mappings
Replica Location Index services (RLI): index on logical names
RMC: Replication Metadata Catalogue
An instance of Spitfire with an RDBMS back-end and a specialized schema
WP 2
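The division of labour between the RLI and the LRCs can be sketched as a two-step lookup: the index says which catalogues know a logical name, and each catalogue returns the physical names. The class names, method names, and example file names below are hypothetical and do not reflect the real RLS API.

```python
class LocalReplicaCatalog:
    """Holds the logical-to-physical file mappings for one site."""

    def __init__(self, site):
        self.site = site
        self.mappings = {}  # logical file name -> list of physical file names

    def add(self, lfn, pfn):
        self.mappings.setdefault(lfn, []).append(pfn)

    def lookup(self, lfn):
        return list(self.mappings.get(lfn, []))


class ReplicaLocationIndex:
    """Indexes logical names by the catalogues that hold them."""

    def __init__(self):
        self.catalogs = {}  # site name -> LocalReplicaCatalog
        self.index = {}     # logical file name -> set of site names

    def register(self, lrc):
        self.catalogs[lrc.site] = lrc
        for lfn in lrc.mappings:
            self.index.setdefault(lfn, set()).add(lrc.site)

    def locate(self, lfn):
        # Step 1: the index names the sites; step 2: each LRC resolves the name.
        replicas = []
        for site in sorted(self.index.get(lfn, ())):
            replicas.extend(self.catalogs[site].lookup(lfn))
        return replicas


cern = LocalReplicaCatalog("cern")
cern.add("lfn:run42.dat", "gsiftp://se.cern.example/data/run42.dat")
ral = LocalReplicaCatalog("ral")
ral.add("lfn:run42.dat", "gsiftp://se.ral.example/store/run42.dat")

rli = ReplicaLocationIndex()
rli.register(cern)
rli.register(ral)
found = rli.locate("lfn:run42.dat")
print(found)
```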
Replica Selection
WP7 provides the network information that serves as the basis for replica selection
Monitoring of network traffic between the 5 main TB sites
Calculation of expected transfer time for files between given sites.
Optimisation component of Reptor Services (WP2)
Implemented as a standalone web service
Selection of best replicas based on network latencies
Assist the resource broker in selecting the best CE for job scheduling based on access costs of replicas
WP 2 & 7
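A minimal sketch of cost-based replica selection, assuming the cost is an estimated transfer time computed from per-link throughput: the function name, the data layout, and the site names are invented for illustration, and the real optimisation service may weigh other metrics such as latency.

```python
def select_best_replica(replicas, file_size_bytes, throughput_bps, destination):
    """Pick the replica with the lowest estimated transfer time.

    replicas:       list of (site, physical_file_name) pairs
    throughput_bps: dict mapping (source_site, destination_site) to bits/s
    """
    def cost(replica):
        site, _ = replica
        if site == destination:
            return 0.0  # a local replica needs no wide-area transfer
        bps = throughput_bps.get((site, destination))
        if not bps:
            return float("inf")  # no measurement: treat the link as unusable
        return file_size_bytes * 8 / bps

    return min(replicas, key=cost)


replicas = [
    ("cern", "gsiftp://se.cern.example/data/run42.dat"),
    ("ral", "gsiftp://se.ral.example/store/run42.dat"),
]
throughput = {("cern", "glasgow"): 50e6, ("ral", "glasgow"): 200e6}
best = select_best_replica(replicas, 10**9, throughput, "glasgow")
print(best)
```

A resource broker could apply the same cost function per candidate CE and prefer the CE whose cheapest replica has the lowest cost.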
Replica Location Service
RLS: A Framework for Constructing Scalable Replica Location Services
Joint collaboration between WP2 and Globus
Independent local state maintained in Local Replica Catalogues (LRCs)
Unreliable collective state maintained in Replica Location Indices (RLIs)
Soft state maintenance of RLI state
relaxed consistency in the RLI, full state information in LRC
Membership and partitioning information maintenance
RLS components change over time : failure, new components added
Service discovery and system policies
[Figure: several Local Replica Catalogues (LRC) and Replica Location Indices (RLI)]
WP 2
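Soft-state maintenance can be illustrated with index entries that expire unless an LRC refreshes them, so a failed catalogue simply drops out of the index over time. This is a sketch under assumed names (`SoftStateIndex`, `update`, `lookup`) and an assumed fixed time-to-live; it is not the RLS implementation.

```python
class SoftStateIndex:
    """RLI-like index whose entries vanish unless periodically refreshed."""

    def __init__(self, ttl):
        self.ttl = ttl     # seconds an entry stays valid after an update
        self.entries = {}  # (logical file name, lrc name) -> expiry time

    def update(self, lrc, lfns, now):
        # An LRC periodically re-sends the logical names it holds.
        for lfn in lfns:
            self.entries[(lfn, lrc)] = now + self.ttl

    def lookup(self, lfn, now):
        # Only entries that have not yet expired count as current state.
        return sorted(
            lrc
            for (name, lrc), expiry in self.entries.items()
            if name == lfn and expiry > now
        )


index = SoftStateIndex(ttl=60)
index.update("lrc-a", ["lfn:run42.dat"], now=0)
index.update("lrc-b", ["lfn:run42.dat"], now=50)

both = index.lookup("lfn:run42.dat", now=55)
only_b = index.lookup("lfn:run42.dat", now=70)   # lrc-a was never refreshed
none_left = index.lookup("lfn:run42.dat", now=120)
print(both, only_b, none_left)
```

The index may briefly list a replica that no longer exists or miss a new one, which is the relaxed consistency the slide refers to; the LRC remains the authoritative source.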
The Storage Element
TB2.0 will see the first production release of the SE control system
For this first release the three interfaces to the outside world are:
Data: GridFTP will be used to transfer files over the WAN, and the files will optionally be available to local nodes via NFS.
Information: existing MDS information providers will be extended to provide the extra information in the GLUE storage schema.
Control: functions such as reservation, pinning, deletion, and transfer time estimation.
The SE control interface to a generic MSS has already been tailored for CERN and RAL and tested there.
Work is under way with IN2P3, WP10, and WP9 to adapt it to their MSS.
WP 5
R-GMA: Information & Monitoring
A relational implementation of GMA
The relational model is better able to describe real-world systems than the hierarchical model
Applied to both information and monitoring
Creates the impression that you have one RDBMS per VO
Not a general distributed RDBMS system, but a way to use the relational model in a distributed environment where global consistency is not important.
Supports streaming
[Figure: Producer, Consumer and Registry. Arrows: store location, look up location, execute or stream.]
Producers announce: SQL "CREATE"
Producers publish: SQL "INSERT"
Consumers collect: SQL "SELECT"
WP 3
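The announce, publish, and collect steps map onto ordinary SQL statements. The sketch below uses Python's built-in `sqlite3` module as a single local stand-in for the virtual database; it is not the R-GMA API, and the table name, column names, and values are invented for illustration.

```python
import sqlite3

# One in-memory database stands in for the "one RDBMS per VO" impression.
db = sqlite3.connect(":memory:")

# Producer announces the table it will publish to (SQL CREATE).
db.execute("CREATE TABLE cpu_load (site TEXT, host TEXT, load_avg REAL)")

# Producers publish tuples (SQL INSERT).
db.execute("INSERT INTO cpu_load VALUES (?, ?, ?)", ("glasgow", "node01", 0.75))
db.execute("INSERT INTO cpu_load VALUES (?, ?, ?)", ("ral", "node07", 0.20))

# Consumer collects the tuples it is interested in (SQL SELECT).
rows = db.execute(
    "SELECT host, load_avg FROM cpu_load WHERE site = ?", ("glasgow",)
).fetchall()
print(rows)
```

In R-GMA the producers are distributed and a registry matches consumers to them, so the single connection here hides exactly the part that the architecture exists to provide.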
Installation & Configuration
LCFGng tailored to EDG (“EDG LCFGng”):
Component guidelines for EDG middleware developers
Produced new system management and Grid middleware components
Ported new client configuration access library to LCFGng
Non-intrusive version of the package installation subsystem (updaterpms)
[Figure: Configuration Architecture. Labels: admins, HLDL, Pan compiler, central config DB, XML, client nodes, cache, access API, applications.]
WP 4
Authorisation via VOMS
[Figure: a user, VOMS and a service. Labels: authenticate, dn, dn + attrs, authz, map, pre-proc, LCAS, LCMAPS, acl, Java, C. Coarse-grained examples: Spitfire, Gatekeeper. Fine-grained examples: RepMeC, SE.]
SCG
Release Plan on Application Testbed
Port to RH 7.3 & LCFGng: February
Upgrade to Globus 2.2.3: end of February
R-GMA, RLS, Storage Element, NetworkCost Suite, ERM: March & April
New Resource Broker + GLUE schema
GridFTP access to CASTOR
VOMS
Testbed 2.0: May 2003
Summary
Major new developments in all middleware areas
Addressing the key shortcomings identified:
WMS stability and scalability
Replica catalogue stability and scalability
Data management usability
Information system stability and scalability
Unified access to MSS
Providing new functionality
Upgrading the underlying software