wp3 the status of the eu datagrid's r-gma system steve fisher / ral 24/4/2003

27
WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003 <s.m.fisher@ rl .ac. uk >

Upload: vernon-cross

Post on 06-Jan-2018

215 views

Category:

Documents


0 download

DESCRIPTION

WP3 Steve Fisher/RAL - 24/4/2003R-GMA3 GMA From GGF Very simple model Does not define: –Data model –How data are moved from Producer to Consumer –What registry looks like Producer Consumer Registry Store location Lookup location execute or stream

TRANSCRIPT

Page 2: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 2

WP3Who we are• Heriot-Watt, Edinburgh

– Andrew Cooke, Werner Nutt • IBM-UK

– James Magowan, (Manfred Oevers), Paul Taylor• INFN

– Roberto Barbera, Giuseppe Save, Gennaro Tortone• Queen Mary, University of London

– Roney Cordenonsi, (Ari Datta)• CCLRC/PPARC

– Rob Byrom, Laurence Field, Steve Hicks, Manish Soni, Antony Wilson, (Xiaomei Zhu), Jason Leake

– Linda Cornwall, Abdeslem Djaoui, Steve Fisher, Robin Middleton• SZTAKI, Hungary

– Peter Kacsuk, Norbert Podhorszki• Trinity College Dublin

– Brian Coghlan, Stuart Kenny, David O’Callaghan, (John Ryan)

Page 3: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 3

WP3GMA

• From GGF• Very simple model• Does not define:

– Data model– How data are

moved from Producer to Consumer

– What registry looks like

Producer

Consumer

Registry

Store location

Lookup

locatio

n

execute or

stream

Page 4: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 4

WP3R-GMA

• Use the GMA from GGF

• A relational implementation– Powerful data model

and query language

• Applied to both information and monitoring

• Creates impression that you have one RDBMS per VO

Producer

Consumer

Registry

Store location

Lookup

locatio

n

execute or

stream

Page 5: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 5

WP3Relational Data Model

• Not a general distributed RDBMS system, but a way to use the relational model in a distributed environment where global consistency is not important.

• Producers announce: SQL “CREATE TABLE” publish: SQL “INSERT”

• Consumers collect: SQL “SELECT” • Some producers, the Registry and Schema make use

of RDBMS as appropriate – but what is central is the relational model.

Page 6: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 6

WP3Producer Consumer• Consumer can issue one-off queries

– Similar to normal database query• Consumer can also start a continuous query

– Requests all data published which matches the query

• Can be seen as an alert mechanism

Page 7: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 7

WP3Registry choices

• Decided early to keep them separate• In fact they have different requirements for

distribution/replication• Each implemented with one RDBMS per

instance

Registry (of Producers

and Consumers)

Schema (descriptions

of tables)

Page 8: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 8

WP3Virtual RDBMS• Creates impression that you have one

RDBMS per VO– This makes it very easy to use– 1 integrated system– 1 query language

• Users like it• But how will it fit in with GridServices?

Page 9: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 9

WP3Producers

• DataBaseProducer – Supports History Queries– Information not lost– Supports joins – Clean up strategy

• StreamProducer – Supports Continuous Queries– In memory data structure– Can define minimum retention period

• ResilientStreamProducer – Supports Continuous Queries– Like the StreamProducer but won’t lose data if system crashes– So slightly slower

• LatestProducer – Supports Latest Queries– Just holds the latest information for any “primaryish” key– Supports joins

• CanonicalProducer – Supports anything– Offers anything as relations

Page 10: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 10

WP3Archiver (Re-publisher)

• It is a combined Consumer-Producer • You just have to tell it what to collect and it

does so on your behalf• Re-publishes to any kind of “Insertable” (i.e.

not to the CanonicalProducer)

Page 11: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 11

WP3Canonical Producer• Allows user defined code to be invoked to respond to

SQL query• Developed in collaboration with CrossGrid

CPAPI

User Code

CanonicalProducerServlet

Files

CreateTable, Port, Protocol, Security, SQL Support, Multiple Query Support

Security

Insert

Query

Port

Register

Page 12: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 12

WP3Functionality - mediator• Queries posed against a virtual data base• The Mediator must:

– find the right Producers– combine information from them

• Hidden component – but vital to R-GMA• Can now merge information from several

producers • The final mediator will take “any” SQL

statement and do the right thing

See Werner Nutt’s talk

Page 13: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 13

WP3Topologies

• Normally publish via SP

• Archivers instantiated with a Producer and a Predicate

• Must avoid cycles in the graph

A SP

A SP

A LP

A HP

SP

SP

SP

SP

Page 14: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 14

WP3Schema & Contributions

CPULoad (Global Schema)Country Site Facility Load TimestampUK RAL CDF 0.3 19055711022002UK RAL ATLAS 1.6 19055611022002UK GLA CDF 0.4 19055811022002UK GLA ALICE 0.5 19055611022002CH CERN ALICE 0.9 19055611022002CH CERN CDF 0.6 19055511022002

CPULoad (Producer 3)

CH CERN ATLAS 1.6 19055611022002

CH CERN CDF 0.6 19055511022002

CPULoad (Producer 1)

UK RAL CDF 0.3 19055711022002

UK RAL ATLAS 1.6 19055611022002

CPULoad (Producer 2)

UK GLA CDF 0.4 19055811022002

UK GLA ALICE 0.5 19055611022002

Page 15: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 15

WP3Contributions are Views

CPULoad (Producer 1)

UK RAL CDF 0.3 19055711022002

UK RAL ATLAS 1.6 19055611022002

CPULoad (Producer 2)

UK GLA CDF 0.4 19055811022002

UK GLA ALICE 0.5 19055611022002

SELECT * FROM cpuLoad

WHERE country = ’UK’ AND site = ’RAL’

SELECT * FROM cpuLoad

WHERE country = ’UK’ AND site = ’GLA’

Page 16: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 16

WP3R-GMA Tools• R-GMA CLI

– Command Line Interface (similar to MySQL)– Supports single query and interactive modes– Can perform simple operations with Consumers,

Producers and Archivers

• R-GMA Browser– JSP application dynamically generating web pages– Supports pre-defined and user-defined queries

• Pulse– R-GMA Java client-based GUI– Supports streaming and simple graphical displays

Page 17: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 17

WP3GIN and GOUT (Gadget IN and Gadget OUT)

R-GMA Consumers

LDAPInfoProvider

GIN

LDAPServer

LDAPInfoProvider

CircularBuffer Producer

GIN

Consumer (CE)

Consumer (SE)

Consumer (SiteInfo) RDBMS

DataBase Producer

GOUT

ConsumerAPI

Archiver

CircularBuffer Producer

R-GMA

GLUESchema

Page 18: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 18

WP3R-GMA – How? • Currently based on servlet technology

– Behind every API there is a Servlet– Multiple hand crafted APIs

• Java, C++, C, Python and Perl

– Tomcat– Soft state registration– Uniform exception handling

• To ensure that useful messages and stack traces are preserved.

Page 19: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 19

WP3OGSIfication• Have recently started the migration to web

and grid services– Apache axis– WSDL generated APIs– Will provide a wrapper for backwards compatibility

Page 20: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 20

WP3

• All Grid Services• OGSA Factories, GSH, GSR• Registry includes HandleMapper• SQL as Service Data Element Query Language

ConsumerFactory

ProducerInstance

OGSIfied R-GMA

Sensor

ProducerAPI

Application

ConsumerAPI

Schema

RegistryConsumerInstance

ProducerFactory

Page 21: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 21

WP3OGSIfication issues• Consider XML as internal representation of

service data elements– Depends on other developments

• Consider XQuery as service data elements query language– Depends on how XQuery develops

• X-GMA ??– Will this be distinguishable from what is in GT3

Page 22: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 22

WP3Resilience - Registry

• Will have one logical registry and schema per VO

• Each logical registry will have multiple physical “copies”

• Each entry in registry has 3 possible states

• Transmit new records and deleted records and checksum after records deleted locally

• Self healing even supports new registry instances

• Consumer uses any instance• Fail over mechanism not yet

implemented• Schema more tricky

Producer1

Producer2

Registry2Info mastered by Registry2

Copy of info from Registry1

Copy of info from Registry3

Registry3Info mastered by Registry3

Copy of info from Registry1

Copy of info from Registry2

Registry1Info mastered by Registry1

Copy of info from Registry2

Copy of info from Registry3

See

poster

Page 23: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 23

WP3

Soft-state Registration and the Registry• Registry records existence of Producers and

Consumers• Registry holds last contact time and ‘expiry’

time• Producers and Consumers periodically

refresh their time stamps• Producer and Consumer servlets avoid

unnecessary traffic to Registry• Scheduled removal of entries that have

timed-out

Page 24: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 24

WP3Resilience Testing• Taking 7 components

– Schema– 2 registry instances– Producer API– Consumer API– Producer Servlet with other APIs– Consumer Servlet with other APIs

• Consider each component in turn– Break the network and bring it back– Close the component down and bring it back– Crash the component and bring it back

• Will also consider real life scenarios

Page 25: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 25

WP3Performance• By design:

– Very flexible - to avoid bottlenecks– Powerful queries allow a single query to be made

• Performance and Optimisation– Will use NetLogger and profiling tools to identify

possible bottlenecks• Internally not high speed because of XML etc

Page 26: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 26

WP3Summary• R-GMA is a combined Grid information and

monitoring system• Supports notion of Virtual Database• Recently deployed in the EDG development

testbed• Now focusing on reliability, stability and

performance

Thanks to the EU and our national funding agencies for their support of this work

http://hepunx.rl.ac.uk/edg/wp3/

Page 27: WP3 The status of the EU DataGrid's R-GMA system Steve Fisher / RAL 24/4/2003

Steve Fisher/RAL - 24/4/2003R-GMA 27

WP3And finally GGF8…• RGIS-RG

– The two short sessions will be held:• Session 1: Database use cases and best practices in the grid

environment (outside the traditional data areas)» Using databases to store application metadata» Using databases to store monitoring information» Using databases as a grid registry» Creating grid registries for locating relational and XML

databases• Session 2: Data discovery in the grid environment

– We will also discuss our milestones and future directions. (e.g should we include XML as well as Relational models.)

– See http://hepunx.rl.ac.uk/ggf/rgis-rg

• A GMA BOF is planned for GGF8