The gLite Information System(s)

IST-2006-026409, www.eu-eela.org. E-infrastructure shared between Europe and Latin America. The gLite Information System(s). Domenico Vicinanza, CERN. EELA Tutorial, Santiago, September 2006


DESCRIPTION

The gLite Information System(s). Domenico Vicinanza, CERN. EELA Tutorial, Santiago, September 2006. Information System. What? System to collect information on the state of resources. Why? To discover resources of the grid and their nature.

TRANSCRIPT

IST-2006-026409 www.eu-eela.org

E-infrastructure shared between Europe and Latin America

The gLite Information System(s)

Domenico Vicinanza, CERN

EELA Tutorial, Santiago, September 2006

Santiago, Chile, EELA Tutorial, 06-07.09.2006


Information System

• What?
– System to collect information on the state of resources

• Why?
– To discover the resources of the grid and their nature
– To give whoever is in charge of managing the workload useful data to do it more efficiently
– To check the health status of resources

• How?
– Monitoring the state of resources locally and publishing fresh data to the information system
– Adopting a data model that MUST be well known to all components that want to access the monitored information
– Using different approaches, which are covered in the next slides


Uses of the IS in Grid

If you are a middleware developer
– Workload Management System: matching job requirements and Grid resources
– Monitoring Services: retrieving information on the status and availability of Grid resources

If you are a user
– Retrieve information on Grid resources and their status
– Get information on the status of your jobs

If you are a site manager or a service
– You "generate" the information, for example related to your site or to a given service


LCG Information System

• LCG adopts a combination of solutions
– Globus MDS
    – At the lowest level of the information system
    – Used to discover and monitor resources and to publish information
    – Grid Security Infrastructure (GSI) credentials
    – Caching
– BDII
    – At the highest level of the system
    – Introduced because MDS had scalability problems
    – Used by the Resource Broker for the matchmaking process
    – Can be configured by each VO
    – Queries the underlying systems periodically (every 2 minutes)

• Hierarchical system
– Information is collected at the leaves of a hierarchical tree and travels towards the root
– Clients can query the hierarchical tree at every level
– The higher the level against which a query is made, the older the information obtained


Information Systems in gLite

• The BDII (Berkeley DB Information Index)
– Has been adopted in the LCG middleware as the Information System provider
– It is an evolution of the Globus Monitoring and Discovery Service (MDS)
– LCG-2 currently adopts BDII as its Information System
– It is based on a Lightweight Directory Access Protocol (LDAP) server

• The Relational Grid Monitoring Architecture (R-GMA)
– Is an implementation of the Grid Monitoring Architecture (GMA) standardized by the Global Grid Forum (GGF)
– It is a relational implementation of the GMA
– It is strongly Web Services oriented
– It uses standard SQL query syntax


Collecting Information

• Gathering of information at different levels
– Lower level: Grid Resource Information Server (GRIS) - MDS
    – Collects information on the state of a given resource
    – One GRIS on top of each resource: CE, SE, RB, MyProxy
    – A set of scripts and sensors that try to extract useful info on the resource
– Medium level: Grid Index Information Server (GIIS) – Local BDII
    – Collects information on the resources of a given site
    – One GIIS for each site
– Higher level: Top-level BDII
    – Collects information on the resources of a given VO
    – One BDII for each VO (suggested solution)

• Way of collecting info
– Pull model (higher-level servers periodically query lower-level servers)
– LDAP query model


The hierarchy

• Way of working
– One GRIS for each resource
– One GIIS for each site, collecting info from the GRISes below it
– One BDII for a given VO, collecting information from the GIISes below it
– Two LDAP servers, one for write access and one for read access
– Every two minutes a cron job runs a script that collects info from a list of GIISes
– The list of GIISes is placed in the configuration file of the BDII

[Figure: the hierarchy. GRISes (Globus MDS) feed the site GIISes (INFN sez. CT; Merida (gilda); other GIIS (gilda)), which in turn feed the BDII (gilda).]
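The pull model and its consequence (the higher the level, the older the data) can be illustrated with a small simulation. This is only a sketch: the class names `Gris` and `Index`, the `free_cpus` field, and the use of integer seconds for time are assumptions made for illustration, and no LDAP is involved. Each index level caches what it last pulled from the level below and refreshes only when its own period has elapsed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Gris:
    """Leaf of the tree: reports the current state of one resource (hypothetical model)."""
    resource: str
    free_cpus: int = 0

    def query(self, now: int) -> dict:
        # A GRIS answers with the state measured at query time.
        return {self.resource: {"free_cpus": self.free_cpus, "measured_at": now}}


@dataclass
class Index:
    """A GIIS or BDII: caches what it last pulled from the level below."""
    name: str
    children: list
    refresh_period: int = 120  # seconds; the slides quote two minutes for the BDII
    cache: dict = field(default_factory=dict)
    last_pull: Optional[int] = None

    def pull(self, now: int) -> None:
        # Pull model: the higher level queries every server listed below it.
        merged = {}
        for child in self.children:
            merged.update(child.query(now))
        self.cache, self.last_pull = merged, now

    def query(self, now: int) -> dict:
        # Refresh only when the cached copy is older than the refresh period.
        if self.last_pull is None or now - self.last_pull >= self.refresh_period:
            self.pull(now)
        return dict(self.cache)


# Illustrative set-up: one resource, one site GIIS, one VO-level BDII.
gris = Gris("ce01", free_cpus=10)
giis = Index("site-giis", [gris], refresh_period=60)
bdii = Index("vo-bdii", [giis], refresh_period=120)

first_view = bdii.query(now=0)      # everything is pulled for the first time
gris.free_cpus = 3                  # the resource changes state
giis_view = giis.query(now=60)      # the GIIS refreshes and sees the change
bdii_view = bdii.query(now=60)      # the BDII still serves its older copy
```

In this sketch `giis_view` already reflects the new state while `bdii_view` does not, which is the trade-off the hierarchy accepts in exchange for scalability.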


The LDAP Protocol

► LDAP structures data as a tree (the Directory Information Tree, DIT)

► The values of each entry are uniquely named

► Following a path from the node back to the root of the DIT, a unique name is built (the DN):
"id=dv,ou=IT,or=CERN,st=Geneva,c=Switzerland,o=grid"

[Figure: example DIT, level by level]
o = grid (root of the DIT)
    c = US | c = Switzerland | c = Spain
        st = Geneva
            or = CERN
                ou = IT | ou = EP
                    id = dv | id = gv | id = fd

Example entry attributes: objectClass: person; cn: Vicinanza D.; phone: 5555666; office: 28-r026
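How a DN is assembled from the path through the tree can be shown in a few lines. This is a minimal sketch: the helper names `build_dn` and `parent_dn` are made up for illustration, and the naive comma handling assumes that no value contains an escaped comma, which real LDAP libraries do handle.

```python
def build_dn(path_from_root):
    """Build a DN from (attribute, value) pairs listed from the root down to the entry.

    A DN is written leaf first, so the path is reversed before joining.
    """
    rdns = [f"{attr}={value}" for attr, value in path_from_root]
    return ",".join(reversed(rdns))


def parent_dn(dn):
    """Return the DN of the parent entry, or None for the root (naive split)."""
    _, sep, rest = dn.partition(",")
    return rest if sep else None


# Path taken from the example tree in the slide, root first.
path = [("o", "grid"), ("c", "Switzerland"), ("st", "Geneva"),
        ("or", "CERN"), ("ou", "IT"), ("id", "dv")]
dn = build_dn(path)
print(dn)
print(parent_dn(dn))
```

Walking upwards with `parent_dn` until it returns `None` visits every ancestor of the entry, ending at `o=grid`.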


R-GMA

• The Relational Grid Monitoring Architecture (R-GMA)
– It is the relational implementation of the GMA defined by the GGF
– Adopts a database model with tables and relations between tables
– Implements a virtual database
– The user queries R-GMA as if he/she were querying a conventional database (SQL string)
– Implements different types of queries

• The information
– Is produced and accessed locally at its site
– Is always up to date
– Can be collected by an entity (secondary producer) so that it can be accessed faster

[Figure: several R-GMA clients connect through R-GMA front ends to a single Virtual Database.]


GMA Architecture and Relational Model

• The Producer stores its location (URL) in the Registry.

• The Consumer looks up producer URLs in the Registry.

• The Consumer contacts the Producer to get all the data.

• Or the Consumer can listen to the Producer for new data.

[Figure: Producer, Registry and Consumer. Producer to Registry: "Store Location". Consumer to Registry: "Look up Location". Consumer to Producer: "Execute or Stream data".]

Relational model example:

SELECT * FROM people WHERE group='HR'

name  ID  birth       Group
Tom   4   1977-08-20  HR
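The interaction between Registry, Producer and Consumer can be sketched in plain Python. This is an illustration of the pattern only, not the R-GMA API: the class and method names are assumptions, producers are registered as in-process objects rather than URLs, and a Python predicate stands in for the SQL WHERE clause.

```python
from collections import defaultdict


class Registry:
    """Maps a table name to the producers that publish it (stand-in for URLs)."""

    def __init__(self):
        self._producers = defaultdict(list)

    def register(self, table, producer):
        self._producers[table].append(producer)

    def lookup(self, table):
        return list(self._producers[table])


class Producer:
    def __init__(self, registry, table):
        self.table, self.rows, self.listeners = table, [], []
        registry.register(table, self)       # "Store Location"

    def insert(self, row):
        self.rows.append(row)
        for callback in self.listeners:      # "Stream data" to listening consumers
            callback(row)

    def execute(self, predicate):
        # One-shot query: return the rows already held that match.
        return [row for row in self.rows if predicate(row)]

    def subscribe(self, callback):
        self.listeners.append(callback)


class Consumer:
    def __init__(self, registry):
        self.registry = registry

    def query(self, table, predicate=lambda row: True):
        rows = []
        for producer in self.registry.lookup(table):   # "Look up Location"
            rows.extend(producer.execute(predicate))    # "Execute"
        return rows


registry = Registry()
people = Producer(registry, "people")
people.insert({"name": "Tom", "ID": 4, "birth": "1977-08-20", "Group": "HR"})

consumer = Consumer(registry)
hr_rows = consumer.query("people", lambda row: row["Group"] == "HR")

# Streaming: the consumer listens for rows published after it subscribed.
streamed = []
people.subscribe(streamed.append)
people.insert({"name": "Ann", "ID": 5, "birth": "1980-01-01", "Group": "IT"})  # made-up row
```

The first query corresponds to the "Execute" arrow in the figure, the subscription to the "Stream data" alternative.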


Multiple Producers

• The Consumer will get all the URLs that could satisfy the query.

• The Consumer will connect to all the Producers.

• Producers that can satisfy the query will send the tuples to the Consumer.

• The Consumer will merge these tuples to form one result set.

[Figure: the Registry maps TableName to URL 1 and URL 2. Producer 1 holds the row (Value 1, Value 2) of TableName, Producer 2 holds the row (Value 3, Value 4), and the Consumer receives both rows as a single TableName result.]


Select * from CPULoad

CPULoad (Producer 1)
UK  RAL   CDF    0.3  19055711022002
UK  RAL   ATLAS  1.6  19055611022002

CPULoad (Producer 2)
UK  GLA   CDF    0.4  19055811022002
UK  GLA   ALICE  0.5  19055611022002

CPULoad (Producer 3)
CH  CERN  ATLAS  1.6  19055611022002
CH  CERN  CDF    0.6  19055511022002

CPULoad (Consumer)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002
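The merge performed on the consumer side can be sketched as follows. This is a simplified illustration, assuming that each producer is just a list of tuples in memory and that a Python predicate stands in for an optional WHERE clause; only the rows of Producer 1 and Producer 2 from the slide are used, and the names `select_all` and `COLUMNS` are made up.

```python
COLUMNS = ("Country", "Site", "Facility", "Load", "Timestamp")

# Rows copied from the Producer 1 and Producer 2 tables of the slide.
producers = {
    "producer-1": [("UK", "RAL", "CDF", 0.3, "19055711022002"),
                   ("UK", "RAL", "ATLAS", 1.6, "19055611022002")],
    "producer-2": [("UK", "GLA", "CDF", 0.4, "19055811022002"),
                   ("UK", "GLA", "ALICE", 0.5, "19055611022002")],
}


def select_all(producers, where=None):
    """Emulate SELECT * FROM CPULoad [WHERE ...] across several producers."""
    merged = []
    for name in sorted(producers):            # contact every producer in turn
        for values in producers[name]:
            row = dict(zip(COLUMNS, values))
            if where is None or where(row):   # the producer sends only matching tuples
                merged.append(row)
    return merged                             # one result set for the consumer


result = select_all(producers)
cdf_only = select_all(producers, lambda row: row["Facility"] == "CDF")
for row in result:
    print(row)
```

From the consumer's point of view the result looks like a single CPULoad table, even though no such table is stored anywhere as a whole.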


Joins

Service
URI          VO     type  emailContact       site
gppse01      alice  SE    [email protected]  RAL
gppse01      atlas  SE    [email protected]  RAL
gppse02      cms    SE    [email protected]  RAL
lxshare0404  alice  SE    [email protected]  CERN
lxshare0404  atlas  SE    [email protected]  CERN

ServiceStatus
URI          VO     type  up  status
gppse01      alice  SE    y   SE is running
gppse01      atlas  SE    y   SE is running
gppse02      cms    SE    n   SE ERROR 101
lxshare0404  alice  SE    y   SE is running
lxshare0404  atlas  SE    y   SE is running

SELECT S.URI, S.emailContact FROM Service S, ServiceStatus SS WHERE (S.URI = SS.URI AND SS.up = 'n')

Result Set (Consumer)
URI      emailContact
gppse02  [email protected]
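The join can be tried out locally with an in-memory SQLite database. This is only a sketch of the relational query, not of R-GMA itself: SQLite stands in for the virtual database, and the e-mail addresses are placeholders on the reserved `example.org` domain because the real addresses are not legible in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Service (URI TEXT, VO TEXT, type TEXT, emailContact TEXT, site TEXT);
CREATE TABLE ServiceStatus (URI TEXT, VO TEXT, type TEXT, up TEXT, status TEXT);
""")

# emailContact values are placeholders, not the addresses from the slide.
conn.executemany("INSERT INTO Service VALUES (?, ?, ?, ?, ?)", [
    ("gppse01", "alice", "SE", "alice-contact@example.org", "RAL"),
    ("gppse01", "atlas", "SE", "atlas-contact@example.org", "RAL"),
    ("gppse02", "cms", "SE", "cms-contact@example.org", "RAL"),
    ("lxshare0404", "alice", "SE", "alice-contact@example.org", "CERN"),
    ("lxshare0404", "atlas", "SE", "atlas-contact@example.org", "CERN"),
])
conn.executemany("INSERT INTO ServiceStatus VALUES (?, ?, ?, ?, ?)", [
    ("gppse01", "alice", "SE", "y", "SE is running"),
    ("gppse01", "atlas", "SE", "y", "SE is running"),
    ("gppse02", "cms", "SE", "n", "SE ERROR 101"),
    ("lxshare0404", "alice", "SE", "y", "SE is running"),
    ("lxshare0404", "atlas", "SE", "y", "SE is running"),
])

# Contacts of every service that is currently down.
rows = conn.execute("""
    SELECT S.URI, S.emailContact
    FROM Service S, ServiceStatus SS
    WHERE S.URI = SS.URI AND SS.up = 'n'
""").fetchall()
print(rows)
conn.close()
```

Joining on `URI` alone is enough here because the only service that is down appears once in each table; with several VOs per failing URI, adding `S.VO = SS.VO` to the condition would avoid duplicate rows.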


GLUE Schema


Definition and main goals

• Schema: a description of the objects and attributes needed to describe Grid resources, and of the relationships between the objects.

Main goals:

• Define a minimum common schema requirement for interoperability
– Compute Elements, Network Elements, Storage Elements

• Address the need for common schemas between projects
– Framework independent (LDAP, SQL, XML)


Glue Schema

• Grid Laboratory Uniform Environment (GLUE) Schema
– It is a data model for describing information on grid resources in a meaningful way (static and dynamic info)
– It is the result of a collaboration between the EU-DataTAG and iVDGL projects
– EGEE, NorduGrid, LCG and Grid3/OSG contributed to the definition of the schema

• XML Schema
– The GLUE Schema is now being mapped to an XML representation
– http://infnforge.cnaf.infn.it/glueinfomodel/Spec/V12/R1


Examples of attributes

• Operating System
– OSName
– OSRelease
– OSVersion

• QueueState
– RunningJobs
– TotalJobs
– QueueStatus
– WaitQueueLength
– WorstResponseTime
– EstimatedResponseTime


Site Element

Site
A collection of resources owned by the same organization and managed by the same administrator. Contains info on the location, the administrator, the web homepage and so on.

Service
The description of a deployed Web Service. Contains the URI endpoint of the WS, the WSDL document, the list of owners and so on.

[Figure: one Site (1) is related to many (*) Service, StorageElement and Cluster elements.]


Cluster Element

Cluster
A set of heterogeneous resources. Contains information on shared temporary directories.

SubCluster
A set of similar resources. Contains the number of logical and physical CPUs.

Host
Contains detailed static information on the type of hosts and the related installed software. The data cover the CPU architecture, memory sizes, the installed operating system and the type of network adapter. It also contains performance measurements obtained by running well-known benchmark software.

Location
Information on installed software, its path and version.

[Figure: one-to-many (1 to *) relationships among Cluster, SubCluster, Host, Location and ComputingElement.]
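The one-to-many containment can be mirrored with a few data classes. This is a rough sketch, assuming that a Cluster contains SubClusters and a SubCluster contains Hosts; the Python field names and all the values are invented for illustration and are not the attribute names defined by the GLUE Schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    """Static description of one type of host (illustrative fields only)."""
    os_name: str
    architecture: str
    memory_mb: int


@dataclass
class SubCluster:
    """A set of similar resources."""
    name: str
    logical_cpus: int
    physical_cpus: int
    hosts: List[Host] = field(default_factory=list)


@dataclass
class Cluster:
    """A set of heterogeneous resources, grouped into homogeneous subclusters."""
    name: str
    tmp_dir: str
    subclusters: List[SubCluster] = field(default_factory=list)

    def total_logical_cpus(self) -> int:
        # Aggregate a per-subcluster value at the cluster level.
        return sum(sc.logical_cpus for sc in self.subclusters)


# Hypothetical site: two homogeneous subclusters inside one cluster.
cluster = Cluster(
    name="example-cluster",
    tmp_dir="/tmp",
    subclusters=[
        SubCluster("old-nodes", logical_cpus=16, physical_cpus=16,
                   hosts=[Host("Scientific Linux", "i686", 1024)]),
        SubCluster("new-nodes", logical_cpus=64, physical_cpus=32,
                   hosts=[Host("Scientific Linux", "x86_64", 4096)]),
    ],
)
total = cluster.total_logical_cpus()
```

Keeping heterogeneous hardware in separate subclusters is what lets a consumer of the schema reason about one kind of node at a time.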


Computing Element

ComputingElement
Abstraction of a queue of jobs.

Policy
Contains info on configuration policies: MaxWallClockTime, MaxRunningJobs, MaxCPUTime ...

AccessControlPolicyBase
Set of rules defining the access control policy.

Info
Static information on the resource, such as the type of local scheduler adopted, the default Storage Element and so on.

VOview
View for a given Virtual Organization. Contains authorization details for VO members and the amount of available resources.

State
Dynamic information on the status of this queue, such as the number of free CPUs and the Estimated Traversal Time (ETT).

Job
Information on the jobs in this queue: owner, local and global ID, and status.

[Figure: relationships between ComputingElement and the elements above; two of them are marked one-to-many (*).]
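These attributes are what a matchmaker such as the Workload Management System needs in order to pair a job with a queue. The following is a toy sketch of that idea, not the actual matchmaking algorithm: the class, the function `match`, the units, the CE identifiers and the choice of ranking by Estimated Traversal Time are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputingElement:
    ce_id: str
    max_wall_clock_time: int        # Policy (minutes, unit assumed)
    free_cpus: int                  # State
    estimated_traversal_time: int   # State (seconds, unit assumed)
    allowed_vos: frozenset          # stand-in for the access control policy


def match(ces, vo, wall_clock_needed):
    """Return the CEs usable by `vo` for the job, best candidate first."""
    usable = [ce for ce in ces
              if vo in ce.allowed_vos                          # access control
              and ce.max_wall_clock_time >= wall_clock_needed  # policy limit
              and ce.free_cpus > 0]                            # dynamic state
    # Rank by how quickly a job is expected to start (assumed ranking rule).
    return sorted(usable, key=lambda ce: ce.estimated_traversal_time)


# Hypothetical queues published in the information system.
ces = [
    ComputingElement("ce-a", 2880, 4, 600, frozenset({"alice", "atlas"})),
    ComputingElement("ce-b", 60, 10, 30, frozenset({"atlas"})),
    ComputingElement("ce-c", 1440, 2, 120, frozenset({"atlas", "cms"})),
    ComputingElement("ce-d", 1440, 0, 10, frozenset({"atlas"})),
]

candidates = match(ces, vo="atlas", wall_clock_needed=720)
print([ce.ce_id for ce in candidates])
```

Here `ce-b` is rejected by its policy limit and `ce-d` because it has no free CPUs, so only static and dynamic information published by the sites decides the outcome.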


References

• gLite 3.0 User Guide
– https://edms.cern.ch/file/722398/1.1/gLite-3-UserGuide.pdf

• R-GMA home page
– http://www.r-gma.org/

• GLUE Schema
– http://infnforge.cnaf.infn.it/glueinfomodel/


Questions…

Thanks to Roberto Barbera, who first developed these slides