Network in EGEE: Building end-to-end network services for the Grid


Page 1: Network in EGEE Building end-to-end network services for the Grid

EGEE-II INFSO-RI-031688

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

Network in EGEE

Building end-to-end network services for the Grid

Mathieu Goutelle – CNRS UREC, France
EGEE-II SA2 “Networking support”
[email protected]

Page 2: Network in EGEE Building end-to-end network services for the Grid

GridNets 2006 – 2006-10-01 – San Jose, CA, USA 2

Outline

• Short presentation of EGEE,
• The network in EGEE:
  – Network services?
  – EGEE focus on end-to-end services in a multi-domain context.
• Network services:
  – Resource reservation,
  – Service Level Agreement.
• Operational services:
  – Monitoring,
  – EGEE Network Operational Centre.
• Summary & conclusion

Page 3: Network in EGEE Building end-to-end network services for the Grid

EGEE in a nutshell…

• EGEE:
  – 1 April 2004 – 31 March 2006
  – 71 partners in 27 countries, federated in regional Grids
• EGEE-II:
  – 1 April 2006 – 31 March 2008
  – 91 partners in 32 countries
  – 13 Federations
• Objectives:
  – Large-scale, production-quality infrastructure for e-Science
  – Attracting new resources and users from industry as well as science
  – Improving and maintaining “gLite” Grid middleware

Page 4: Network in EGEE Building end-to-end network services for the Grid

EGEE in a nutshell…

• More than 20 applications from 7 domains:
  – Astrophysics: MAGIC, Planck
  – Computational Chemistry
  – Earth Sciences: Earth Observation, Solid Earth Physics, Hydrology, Climate
  – Financial Simulation: E-GRID
  – Fusion
  – Geophysics: EGEODE
  – High Energy Physics: 4 LHC experiments (ALICE, ATLAS, CMS, LHCb), BaBar, CDF, DØ, ZEUS
  – Life Sciences:
      · Bioinformatics (Drug Discovery, GPS@, Xmipp_MLrefine, etc.)
      · Medical imaging (GATE, CDSS, gPTM3D, SiMRI 3D, etc.)
  – Multimedia
  – Material Sciences
  – …

Page 5: Network in EGEE Building end-to-end network services for the Grid

EGEE Infrastructure

[Map legend: countries participating in EGEE]

Scale (June 2006):

~ 200 sites in 40 countries

~ 25 000 CPUs

> 10 PB storage

> 35 000 jobs per day

> 100 Virtual Organizations

Page 6: Network in EGEE Building end-to-end network services for the Grid

Network infrastructure

Connects 32 NRENs

Over 3M users

Page 7: Network in EGEE Building end-to-end network services for the Grid

Network infrastructure (cont.)

Page 8: Network in EGEE Building end-to-end network services for the Grid

End-to-end network services?

• What type of services?
  – Network services are available to the EGEE sites:
      · Premium IP and similar (e.g. QBSS),
      · “Lightpath” or network resource reservation,
      · IPv6, multicast…
  – Operational services are available to the EGEE sites:
      · Monitoring of the network (local & backbone),
      · Operational data (incident, maintenance).
• How to ensure service continuity along the path?
  – In the last mile?
  – In a multi-domain context?
• What about service availability, interface standardization, inter-domain agreements, etc.?

Page 9: Network in EGEE Building end-to-end network services for the Grid

EGEE focus

• Network services:
  – Network resource reservation:
      · Bandwidth Allocation and Reservation (BAR),
      · Dedicated talk on that subject (see session 1, “End to End Bandwidth Allocation and Reservation for Grid applications”).
  – Service Level Agreements (SLAs):
      · End-to-end SLAs?
• Operational services:
  – Monitoring:
      · Network Performance Monitoring (NPM),
      · Dedicated talk on that subject (see session 2, “Federated Network Performance Monitoring for the Grid”).
  – Coordination of operational actions:
      · Concept of the EGEE Network Operational Centre (ENOC).

Page 10: Network in EGEE Building end-to-end network services for the Grid

Network resource reservation

• Based on the framework currently being built by the GÉANT2 project:
  – Hides the multi-domain and multi-technology issues;
  – Provides at the Grid level:
      · A seamless interface for service requests at the “customer” layer;
      · A high-level view of the network, where a request specifies characteristics rather than a particular service;
      · Reduced configuration lead-time;
      · A description of the service level.
• Issues remain:
  – A component (BAR, see dedicated talk) gives access to these interfaces at the middleware layer, but the application layer is not yet ready;
  – The macroscopic reserved resource needs to be sub-managed at the Grid level;
  – What about domains outside the GÉANT2 cloud?

Page 11: Network in EGEE Building end-to-end network services for the Grid

Quick look at the BAR architecture

• Clear demarcation between the Grid and the network:
  – The network is hidden from the Grid (technology, multi-domain issues…);
  – The Grid is hidden from the network (which only knows one “EGEE” user);
  – Allows a two-stage process (reservation & activation) suitable in a Grid context.

[Figure: BAR architecture diagram. Labels: Extended QoS Network; HLM; BAR (×2); Site 1, Site 2; NSAP and L-NSAP; EGEE and Network layers; Network 1, Network 2, Network 3; L-Network (×2).]
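The two-stage process mentioned on this slide (reservation first, activation later) can be pictured as a small state machine. The sketch below is only an illustration under stated assumptions: the `Reservation` class, its field names and the `State` values are hypothetical and are not taken from the BAR component or from any GÉANT2 interface.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    REQUESTED = "requested"
    RESERVED = "reserved"
    ACTIVE = "active"
    RELEASED = "released"


@dataclass
class Reservation:
    """Hypothetical record of a bandwidth reservation between two Grid sites."""

    source_site: str
    dest_site: str
    bandwidth_mbps: int
    state: State = State.REQUESTED

    def reserve(self) -> None:
        # Stage 1: the network side commits resources ahead of time.
        if self.state is not State.REQUESTED:
            raise ValueError(f"cannot reserve from state {self.state.value}")
        self.state = State.RESERVED

    def activate(self) -> None:
        # Stage 2: the Grid side is ready to transfer, so the path is switched on.
        if self.state is not State.RESERVED:
            raise ValueError(f"cannot activate from state {self.state.value}")
        self.state = State.ACTIVE

    def release(self) -> None:
        # A reservation can be given back whether or not it was ever activated.
        if self.state not in (State.RESERVED, State.ACTIVE):
            raise ValueError(f"cannot release from state {self.state.value}")
        self.state = State.RELEASED
```

Separating `reserve()` from `activate()` reflects why the split may suit a Grid: resources can be booked when a job is scheduled and only switched on when the transfer actually starts.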

Page 12: Network in EGEE Building end-to-end network services for the Grid

SLAs

• “SLAs”?
  – Description of the characteristics of the service provided (e.g. after a successful resource reservation request);
  – Provided by each domain crossed by the data path;
  – Either filled in manually or generated automatically if the request is handled entirely by software;
  – Definition of templates in cooperation with GÉANT2:
      · Based on previous work inside EGEE and answers from GÉANT2 to some open issues (procedures, demarcation point…)
• SLA template:
  – Administrative part (contact, duration, troubleshooting procedures);
  – SLS (Service Level Specification) part.
• The end-to-end SLA is formed from the individual SLAs provided by all domains along the end-to-end path.

Page 13: Network in EGEE Building end-to-end network services for the Grid

SLAs (cont.)

• EGEE end-to-end SLA template:
  – Concatenation of the individual SLAs of each participating domain;
  – SLA between the borders of the NREN cloud (border-to-border SLA).
• Difficulty in accommodating the “last mile”:
  – The “last-mile” network may not be participating (no resource reservation system, no SLA, etc.);
  – Try to address this with static information on these networks, to provide service characteristics to the user/application.

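[Figure: EGEE RC A and EGEE RC B, each behind a Campus/MAN network, connected through NREN 1, GEANT and NREN 2. SLA 1, SLA 2 and SLA 3 are marked along the path, with two spans labelled “border-to-border connectivity” and “end-to-end connectivity”.]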
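To make the idea of concatenating per-domain SLAs concrete, here is a minimal sketch of how the SLS parts might be combined into one border-to-border figure. The combination rules are assumptions chosen for illustration, not the EGEE or GÉANT2 template: bandwidth is taken as the minimum along the path, delay as the sum, and availability as the product (which presumes independent failures). The `DomainSLS` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence


@dataclass(frozen=True)
class DomainSLS:
    """Hypothetical Service Level Specification offered by one domain."""

    domain: str
    bandwidth_mbps: float
    one_way_delay_ms: float
    availability: float  # fraction between 0 and 1


def concatenate(path: Sequence[DomainSLS]) -> DomainSLS:
    """Combine per-domain SLS entries into a single path-wide SLS.

    Assumed rules: the tightest bandwidth wins, delays add up, and
    availabilities multiply (independent failures).
    """
    if not path:
        raise ValueError("the path must cross at least one domain")
    return DomainSLS(
        domain=" + ".join(s.domain for s in path),
        bandwidth_mbps=min(s.bandwidth_mbps for s in path),
        one_way_delay_ms=sum(s.one_way_delay_ms for s in path),
        availability=prod(s.availability for s in path),
    )
```

A non-participating last-mile network could be represented by a `DomainSLS` entry filled in from static information, which is one way to read the workaround suggested on the slide.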

Page 14: Network in EGEE Building end-to-end network services for the Grid

SLA institution

• All domains involved in provisioning network services to EGEE, as part of the existing network infrastructure hierarchy, have to be categorized as one of:
  – Compliant with the Premium IP service,
  – Supportive of the Premium IP service,
  – Indifferent to the Premium IP service.

Page 15: Network in EGEE Building end-to-end network services for the Grid

EGEE focus

• Network services:
  – Network resource reservation:
      · Bandwidth Allocation and Reservation (BAR),
      · Dedicated talk on that subject (see session 1, “End to End Bandwidth Allocation and Reservation for Grid applications”).
  – Service Level Agreements (SLAs):
      · End-to-end SLAs?
• Operational services:
  – Monitoring:
      · Network Performance Monitoring (NPM),
      · Dedicated talk on that subject (see session 2, “Federated Network Performance Monitoring for the Grid”).
  – Operational interface with the network:
      · Concept of the EGEE Network Operational Centre (ENOC).

Page 16: Network in EGEE Building end-to-end network services for the Grid

Monitoring

• Not Yet Another Monitoring Framework!
  – Role of a Mediator between the various monitoring frameworks and the various clients (diagnostic tools, middleware, etc.);
  – Network Performance Monitoring (NPM) gives access to data collected at existing monitoring frameworks (site, backbone);
  – Use of the NMWG interface to access those frameworks and republish data;
  – Special requirements for some middleware components for faster access to data.
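The mediator role described above can be sketched as a thin dispatch layer: clients ask one object, which forwards the query to whichever existing framework holds the data. Everything in this sketch is hypothetical, including the `Mediator` class, the adapter signature and the in-memory cache standing in for the "faster access" requirement; it does not reproduce the NPM implementation or the NMWG schema.

```python
from typing import Callable, Dict, Tuple

# Hypothetical adapter signature: (metric, source, destination) -> measured value.
Adapter = Callable[[str, str, str], float]


class Mediator:
    """Routes client queries to existing monitoring frameworks."""

    def __init__(self) -> None:
        self._adapters: Dict[str, Adapter] = {}
        self._cache: Dict[Tuple[str, str, str, str], float] = {}

    def register(self, framework: str, adapter: Adapter) -> None:
        # Each site or backbone framework is wrapped behind the same signature.
        self._adapters[framework] = adapter

    def query(self, framework: str, metric: str, src: str, dst: str) -> float:
        if framework not in self._adapters:
            raise KeyError(f"no adapter registered for {framework!r}")
        key = (framework, metric, src, dst)
        # Cached answers let latency-sensitive middleware avoid a remote call.
        if key not in self._cache:
            self._cache[key] = self._adapters[framework](metric, src, dst)
        return self._cache[key]
```

The cache here never expires, which would be wrong for live measurements; a real mediator would need a freshness policy.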

Page 17: Network in EGEE Building end-to-end network services for the Grid

Operational Interface

• The network infrastructure of EGEE is mainly served by a set of NRENs via GÉANT2;
• Need for an entity coordinating all the NOCs involved and the Grid Operations:
  – Concept of an end-to-end Coordination Unit (GÉANT2);
  – Providing end-to-end operational support.
• A single point of contact as an operational interface between EGEE and GÉANT2/NRENs, dealing with:
  – Network problem troubleshooting,
  – Interactions with network providers and Grid sites,
  – Notifications from NRENs,
  – Network SLA installation and monitoring.
• Two functional entities inside EGEE:
  – EGEE Network Operational Centre (ENOC);
  – A Network Trouble Ticket Manager: GGUS.

Page 18: Network in EGEE Building end-to-end network services for the Grid

ENOC

• From the EGEE point of view:
  – GGUS acts as the first-line support (interacts with the user);
  – Support units are the second-level support.
• From the NRENs’ point of view:
  – EGEE (via the ENOC) is a single entity;
  – The ENOC is the only point of contact for the NRENs (submitter of the problem).

[Figure: support chain with Users, GGUS, Support Units and ENOC on the EGEE side, and NRENs and GÉANT2 on the network side.]

Page 19: Network in EGEE Building end-to-end network services for the Grid

ENOC (cont.)

• Main challenges:
  – To create a network support structure inside EGEE;
  – To define the associated network operational procedures.
• The ENOC is the user support for network failures:
  – End-to-end network problem troubleshooting;
  – Coordination unit for the actions of all the entities involved in a network incident;
  – Try to have an overall view of the end-to-end service, gathering information from all the involved domains;
  – SLA management: installation and monitoring.
• ENOC operational procedures have been defined and validated during the first phase of EGEE;
• EGEE-II will fully implement ENOC.

Page 20: Network in EGEE Building end-to-end network services for the Grid

ENOC (cont.)

• ENOC Service:
  – Collect tickets from NRENs which agree to provide them to the ENOC;
  – Forward to GGUS the ones that seem relevant (possible impact on the Grid infrastructure);
  – Receive tickets assigned to ENOC by the GGUS 1st level support;
  – Troubleshoot them with the help of monitoring tools;
  – Contact identified faulty domains, or reassign the ticket to the associated site if there is no evidence of a backbone problem (e.g. LAN issue).
• Main issues:
  – Load on the ENOC team (amount of info, etc.);
  – Heterogeneity of the systems the ENOC has to deal with (languages, trouble ticket format, monitoring, etc.).
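The first two steps of the ENOC service (collect NREN tickets, forward the relevant ones to GGUS) amount to a filter. The sketch below assumes one possible relevance test, namely that a ticket matters if it touches a domain known to serve at least one EGEE site. The `Ticket` type, its fields and both function names are hypothetical, and the real criterion used by the ENOC may well differ.

```python
from dataclasses import dataclass
from typing import AbstractSet, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Ticket:
    """Hypothetical, already-normalised trouble ticket received from an NREN."""

    ticket_id: str
    origin: str  # the NREN that issued the ticket
    affected_domains: FrozenSet[str]


def relevant_for_grid(ticket: Ticket, domains_serving_egee: AbstractSet[str]) -> bool:
    # Assumed rule: relevant if any affected domain carries EGEE traffic.
    return bool(ticket.affected_domains & domains_serving_egee)


def select_for_ggus(
    tickets: Iterable[Ticket], domains_serving_egee: AbstractSet[str]
) -> List[Ticket]:
    """Return the tickets worth forwarding to GGUS, keeping arrival order."""
    return [t for t in tickets if relevant_for_grid(t, domains_serving_egee)]
```

A filter of this kind is one way to limit the load on the ENOC team, since tickets with no possible Grid impact never reach a human.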

Page 21: Network in EGEE Building end-to-end network services for the Grid

ENOC status

• ENOC team is ready!
  – 5 people (2 FTE), including one dedicated to it.
• ENOC receives operational information from GÉANT2 and 10 NRENs (more to come):
  – About 80% of all the EGEE sites covered;
  – An average of 5 tickets handled per day;
  – 8 different languages.
• Building tools to follow up or enhance the network support:
  – Network Operational Database (interconnection of administrative domains between the EGEE resource centres);
  – TT parsing and filtering tool;
  – Dashboard to present the overall status of the “EGEE network”.

Page 22: Network in EGEE Building end-to-end network services for the Grid

EGEE expectations

• Towards a better solution to our “multi-domain” and “end-to-end” issues
• Seamless access to network monitoring data:
  – GÉANT2 will provide such access (perfSONAR), from multiple domains, aggregating data from multiple frameworks.
• Network resource reservation:
  – Requests expressed not in terms of service but of characteristics;
  – The choice of the underlying technology to fulfil them is up to the network;
  – Answer to a request = SLA (depending on the current network status & load);
  – What about the last mile? The non-NREN domains?
• Standardization of the operational interface:
  – Trouble Ticket format (data schema and exchange format);
  – Access method.
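As an illustration of what a standardized trouble ticket could look like, the sketch below pairs a data schema (a dataclass) with an exchange format (JSON). Both choices are assumptions made for the example: the slide names no schema or format, and the `TroubleTicket` fields are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TroubleTicket:
    """Hypothetical common schema for tickets exchanged between domains."""

    ticket_id: str
    origin_domain: str
    kind: str  # e.g. "incident" or "maintenance"
    start_utc: str  # ISO 8601 timestamp
    end_utc: Optional[str]  # None while the event is still open
    summary: str


def to_json(ticket: TroubleTicket) -> str:
    # Sorted keys keep the serialised form stable across senders.
    return json.dumps(asdict(ticket), sort_keys=True)


def from_json(text: str) -> TroubleTicket:
    return TroubleTicket(**json.loads(text))
```

With a shared schema, the parsing and filtering tools mentioned earlier would need one parser rather than one per NREN format.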

Page 23: Network in EGEE Building end-to-end network services for the Grid

Summary & conclusion

• Focus on providing end-to-end services in a multi-domain context:
  – Hiding the network complexity from the Grid (users, middleware, Grid support);
  – Hiding the Grid complexity from the network (single point of contact, operational interface).
• Many building blocks depend on the providers:
  – Resource reservation frameworks, SLA installation, backbone monitoring;
  – Fortunately, EGEE and GÉANT2 built up a strong collaboration!
• Many things remain pending:
  – Mainly on the operational side (homogenization of the network interface);
  – How to cope with domains outside the GÉANT2 cloud?
• The two infrastructures need to collaborate on these aspects.

Page 24: Network in EGEE Building end-to-end network services for the Grid

Thank you for your attention!