

Grid Resource Bazaar: Efficient SLA Management

Tomasz Szepieniec, Małgorzata Tomanek, and Tomasz Twaróg

Academic Computer Center CYFRONET AGH, ul. Nawojki 11, 30-950 Kraków, Poland

email: [t.szepieniec, m.tomanek, t.twarog]@cyfronet.pl

Abstract

In large grid environments like EGEE/EGI and PL-Grid there are hundreds of actors providing and using computational resources who need some principles in order to satisfy the requirements and expectations of both sides. For that reason it seems indispensable to define quality-of-service agreements, usually called SLAs, between a VO and a resource provider. Due to the large number of interactions, as well as the many details that need to be negotiated for each such contract, a formalized, traceable process of resource allocation is crucial for successful operation. The efficiency of the process should be supported by a collaboration platform assisting resource management by both users and providers.

Grid Resource Bazaar is a web portal that aims to facilitate SLA management related to resource allocation in grids. Its functionality allows users to define and broadcast calls for resources, negotiate and manage existing SLAs, and communicate with partners. The tool is maintained by PL-Grid and deployed as part of the Polish NGI operational architecture.

1 Introduction

The EGEE and EGI [3] initiatives gather many institutions and associations that use or provide computational resources. Formalizing resource provision and consumption is important to make the activities of these institutions efficient and suited to reasonable scheduling. A fundamental issue is the existence of an agreement between the resource provider and the user. The agreement should contain essential and measurable metrics. The parties should be able to negotiate the agreement before signing it and to renegotiate it later. In some cases the parameters of SLAs can be determined automatically, but often human interaction during negotiations is required. There is also a need to track how a resource provider fulfills an agreement and what impact signed agreements have on the overall needed or accessible resources.

PL-Grid implements grid operation according to a novel architecture [6], which can be called resource-allocation-centric. Resource allocation is done according to an SLA signed between users and resource providers.

The paper is organized as follows: Section 2 explains how resource allocation in EGEE and PL-Grid proceeds. Section 3 focuses on SLA properties and the SLA life cycle. Section 4 introduces a tool, called Grid Resource Bazaar, that has already implemented the basic assumptions of the model for the EGEE project, and also describes the next version of Grid Resource Bazaar, which is planned to fulfill specific PL-Grid requirements.

Fig. 1: Process of resource allocation for VO by RP.

2 Resource Allocation Process

Actors involved in the system are the following:
• VOs and user groups, later called Resource User Groups (RUGs). They use assigned resources in computational experiments.
• Resource Providers (RPs). They need to plan efficient usage of their resources while meeting requirements from many RUGs.
• National Grid Initiative (NGI). The NGI brokers resources provided by RPs. An NGI can act as a proxy provider offering resources to the users, and as a proxy user for some RPs; a similar role can be applied to other NGIs in the European Grid Initiative (EGI).

Fig. 1 illustrates the simple process of resource allocation between users and providers. The process starts when a RUG announces a resource demand in the form of a call. Next, an RP can answer this call with an SLA proposal to fulfill all or part of the needed resources. In the next stage, negotiations between the involved partners, all the details of the SLA metrics are set up. In the execution stage the agreement is implemented in the configuration of resources, and the resources are used according to this agreement. Finally, a reward for delivering services is provided. Usually, such processes are carried out concurrently with many partners, and decisions are taken according to RUG or RP policies.

Allowing an NGI to operate in the resource allocation process introduces some new requirements. The NGI Operator can negotiate an SLA as a RUG or RP agent, but with some restrictions. Thus, two new SLA types are required:
• SLA between NGI and RUG, called Grant SLA (the NGI guarantees grant resources),
• SLA between NGI and RP, called NGI Infrastructure SLA (the RP guarantees support for the PL-Grid project).

The third kind of SLA is the elementary one: an SLA between an RP and a RUG (the SLA takes effect as real resources in concrete locations), with the possibility of treating the NGI as a representative of the RUG or RP. That SLA is called Basic SLA.
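The three SLA kinds above can be summarized as a mapping from the pair of contracting parties to the SLA type. The following is only a minimal sketch: the names `Role`, `SlaType` and `sla_type` are hypothetical and are not taken from the Bazaar code base.

```python
from enum import Enum


class Role(Enum):
    RUG = "Resource User Group"
    RP = "Resource Provider"
    NGI = "National Grid Initiative"


class SlaType(Enum):
    GRANT = "Grant SLA"
    NGI_INFRASTRUCTURE = "NGI Infrastructure SLA"
    BASIC = "Basic SLA"


# (consumer side, provider side) -> SLA type, as listed in the text.
_SLA_TYPES = {
    (Role.RUG, Role.NGI): SlaType.GRANT,
    (Role.NGI, Role.RP): SlaType.NGI_INFRASTRUCTURE,
    (Role.RUG, Role.RP): SlaType.BASIC,
}


def sla_type(consumer: Role, provider: Role) -> SlaType:
    """Return the SLA type for a pair of parties (hypothetical helper)."""
    try:
        return _SLA_TYPES[(consumer, provider)]
    except KeyError:
        raise ValueError(
            f"no SLA type defined between {consumer.name} and {provider.name}"
        ) from None
```

In this sketch a Basic SLA negotiated by the NGI Operator on behalf of a RUG or RP is still classified by the represented parties, not by the agent.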

Different NGIs can interact with each other, and an RP from one NGI can support a RUG from another NGI.

A basic allocation scenario is when a RUG wants to perform computations using the PL-Grid infrastructure and has not chosen any RPs. The RUG presents a Grant SLA to the NGI, which means that the RUG applies for resources in the NGI infrastructure. The NGI Operator and the RUG negotiate and both agree to the Grant SLA conditions. The NGI Operator can attach to that Grant SLA subSLAs in which RPs declare support for the RUG. These subSLAs can be created before or after the Grant SLA is signed. If all RPs decline to support a given RUG, the NGI Operator can sign a subSLA on behalf of a chosen RP.

3 Resource-related SLA Specification

In an SLA, the parties declare the conditions under which they will use or provide computational resources. An SLA should be set up on the basis of a set of clearly defined, measurable metrics that describe the quality of specific services.

As mentioned in the previous section, there are three kinds of SLA in EGI: Grant SLA, NGI Infrastructure SLA and Basic SLA. The Basic SLA is usually a subSLA of a Grant SLA and an NGI Infrastructure SLA. This SLA can be created by a RUG, an RP, or the NGI Operator playing the role of an RP/RUG agent. It is important that the sum of guaranteed resources in subSLAs in a given period cannot be greater than the value of the related resource in the superior SLA. When a given SLA has subSLAs, it also holds information about the amount of resources that is contracted in the SLA but not guaranteed in any subSLA.
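The constraint on subSLAs can be illustrated with a short sketch for one metric in one period. The function name `unguaranteed_amount` and the use of plain integers for resource amounts are assumptions made for illustration only.

```python
def unguaranteed_amount(superior_value: int, sub_sla_values: list[int]) -> int:
    """Amount contracted in the superior SLA but not guaranteed in any subSLA.

    Both arguments refer to the same metric and the same period.
    Raises ValueError when the subSLAs together exceed the superior SLA.
    """
    guaranteed = sum(sub_sla_values)
    if guaranteed > superior_value:
        raise ValueError(
            f"subSLAs guarantee {guaranteed}, "
            f"but the superior SLA contracts only {superior_value}"
        )
    return superior_value - guaranteed
```

For example, a superior SLA contracting 1000 units with subSLAs guaranteeing 400 and 250 units leaves 350 units not yet guaranteed by any provider.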

An SLA has several configuration states: NOT CONFIGURED, PREPARED, VERIFIED and MISCONFIGURED. These states depend on the configuration status of the contracted resources (Fig. 2). A change of configuration state is connected with ticket submission to the request tracker system. When a user changes the SLA state to MISCONFIGURED, s/he enters the data required for ticket creation and the ticket is generated. The configuration state applies to any kind of SLA: Grant SLA, NGI Infrastructure SLA and Basic SLA. A change of the configuration state of subSLAs does not affect the configuration states of superior SLAs, and vice versa.
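A minimal sketch of the configuration-state handling described above follows. It assumes that every state change records a ticket and that the allowed transitions (shown only in Fig. 2) are not enforced; the `Sla` class, its `tickets` list standing in for the request tracker, and the ticket fields are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ConfigState(Enum):
    NOT_CONFIGURED = "NOT CONFIGURED"
    PREPARED = "PREPARED"
    VERIFIED = "VERIFIED"
    MISCONFIGURED = "MISCONFIGURED"


@dataclass
class Sla:
    name: str
    config_state: ConfigState = ConfigState.NOT_CONFIGURED
    sub_slas: list["Sla"] = field(default_factory=list)
    # Stand-in for the external request tracker (assumption).
    tickets: list[dict] = field(default_factory=list)

    def set_config_state(
        self, new_state: ConfigState, ticket_data: Optional[str] = None
    ) -> None:
        """Change the state of this SLA only and record a ticket for it."""
        if new_state is ConfigState.MISCONFIGURED and not ticket_data:
            raise ValueError("MISCONFIGURED requires data for the ticket")
        self.tickets.append(
            {
                "sla": self.name,
                "from": self.config_state.value,
                "to": new_state.value,
                "data": ticket_data,
            }
        )
        # Deliberately no propagation to sub_slas or to a superior SLA.
        self.config_state = new_state
```

The method touches only the SLA it is called on, which mirrors the rule that subSLA and superior SLA configuration states are independent.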

Fig. 2: State transitions of SLA: main states (upper-left), configuration states (upper-right), and activity states (below).

An SLA in EGI can have many computational periods. Each period has a declared set of metric values. The currently chosen set of metrics includes items such as total wall time to be used monthly in the specified time period, max. wall-clock time of a single job, max. job parallelism, min. asymptotic job parallelism, storage quota on disks, storage quota on shared file system, storage quota on tapes, backup feature and services support (Top Level BDII, WMS, LFC, VOMS).

General QoS parameters are assigned to metrics: Minimum Availability [in %] and Minimum Reliability [in %]; both are optional. Each QoS parameter belongs to a specific group of metrics.
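The metric set of a single computation period could be represented roughly as the record below. This is an illustrative sketch only: the field names and the units (hours, gigabytes) are assumptions, since the text lists the metrics without fixing names or units.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PeriodMetrics:
    total_wall_time_hours_per_month: float
    max_job_wall_clock_hours: float
    max_job_parallelism: int
    min_asymptotic_job_parallelism: int
    disk_quota_gb: float
    shared_fs_quota_gb: float
    tape_quota_gb: float
    backup: bool
    # e.g. ("Top Level BDII", "WMS", "LFC", "VOMS")
    supported_services: tuple[str, ...] = ()
    # Optional QoS parameters, expressed in percent.
    min_availability_pct: Optional[float] = None
    min_reliability_pct: Optional[float] = None

    def __post_init__(self) -> None:
        # QoS values are optional, but when given they must be percentages.
        for name in ("min_availability_pct", "min_reliability_pct"):
            value = getattr(self, name)
            if value is not None and not 0 <= value <= 100:
                raise ValueError(f"{name} must be between 0 and 100")
```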

An SLA can be in one of three main states during its life cycle: PROPOSAL, AGREED or CANCELLED (Fig. 2). An SLA in the AGREED main state cannot be changed in a simple way. Because a signed SLA is an agreement, users cannot modify conditions that were in force in the past. Only future SLA metric values can be changed. When both sides agree to new conditions (possibly including a new end date of computations), the time frames of the old computation periods are shortened, and from the current date (or a declared date) a new computation period with the new conditions comes into force. When an SLA with attached subSLAs is renegotiated and the proposed values violate the relation between the subSLAs and the SLA, the RUG/RP must first renegotiate the affected subSLAs and then modify the superior SLA.
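The renegotiation rule for an AGREED SLA can be sketched as an operation on the list of computation periods. The sketch below assumes half-open periods (the end date is exclusive) and represents metric values as a plain dictionary; the names `Period` and `renegotiate` are hypothetical, and the subSLA consistency check is left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Period:
    start: date
    end: date  # exclusive (assumption)
    metrics: dict


def renegotiate(
    periods: list[Period],
    change_date: date,
    new_metrics: dict,
    today: date,
    new_end: Optional[date] = None,
) -> list[Period]:
    """Shorten old periods at change_date and append one new period."""
    if not periods:
        raise ValueError("an agreed SLA needs at least one period")
    if change_date < today:
        raise ValueError("conditions in force in the past cannot be modified")
    end = new_end if new_end is not None else max(p.end for p in periods)
    if end <= change_date:
        raise ValueError("the new period must end after it starts")

    result = []
    for p in periods:
        if p.end <= change_date:
            result.append(p)  # entirely in the past of the change: untouched
        elif p.start < change_date:
            result.append(Period(p.start, change_date, p.metrics))  # shortened
        # periods starting on or after change_date are superseded
    result.append(Period(change_date, end, new_metrics))
    return result
```

Returning a new list rather than editing periods in place keeps the previously agreed conditions available for accounting of past usage.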

Besides the main state, an SLA also has an activity state, which depends on the current date and the computation periods defined in the SLA (Fig. 2).
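Since the activity state is derived only from the current date and the computation periods, it can be computed on demand rather than stored. The sketch below is illustrative: the text does not name the activity states, so the labels `NOT_STARTED`, `ACTIVE`, `IDLE` and `FINISHED` are hypothetical, and periods are assumed to be half-open (start inclusive, end exclusive).

```python
from datetime import date


def activity_state(periods: list[tuple[date, date]], today: date) -> str:
    """Derive a (hypothetically named) activity state from the periods."""
    if not periods:
        raise ValueError("at least one computation period is required")
    if any(start <= today < end for start, end in periods):
        return "ACTIVE"
    if today < min(start for start, _ in periods):
        return "NOT_STARTED"
    if today >= max(end for _, end in periods):
        return "FINISHED"
    return "IDLE"  # today falls in a gap between two periods
```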

4 The Grid Resource Bazaar

The Grid Resource Bazaar for PL-Grid is based on the Bazaar that was developed for the EGEE project. The Bazaar for EGEE has less complex functionality and a simpler model of work, but provides call and SLA management, the possibility of site configuration assessment, and notification management (about new calls and SLA proposals). Bazaar for EGEE is integrated with the resource allocation process in EGEE CE ROC and has special features for EGEE seed resources operations. The tool uses information about sites and users from GOCDB and about VOs from CIC DB. It is ready to use at the CIC Operational Portal [2].

Fig. 3: Bazaar module dependencies.

Bazaar for PL-Grid (Fig. 4) introduces the needed modifications in the process model, functionality, GUI and collaboration with external tools (Fig. 3). Bazaar can be integrated with other operational and user tools. Resource structure and contact information are fetched from an external database (GOCDB). To achieve flexibility and to suit users of other NGIs as well, Bazaar retrieves information about Virtual Organizations from the VO configuration database (CIC database).

Fig. 4: New graphical interface of Grid Resource Bazaar.


The Bazaar GUI was designed to support easy resource management. As can be seen in Fig. 4, the portal is organized as a dashboard with views of resource allocation for a given RUG. An RP has an analogous dashboard. The resources are visualized on a graph and SLAs are listed below. The user can experiment with an SLA proposal to check how it influences the overall view of resources. Detailed views of SLAs and calls can be opened in a self-managed area in the right part of the window.

Bazaar makes SLA data from its database accessible through Web Services, which will be designed and provided according to new requirements. The first release of the Bazaar for PL-Grid should be available in June 2010. More information is available on the project web page [1].

5 Summary

Efficient collaboration between grid resource users and providers is very important in the EGEE and EGI projects. To make it a reality, a formalization of the collaboration process and a tool that enables convenient interaction between actors are needed. The importance of a clear SLA definition was described in [5], where the first version of SLA metrics was proposed. Because time has verified which metrics are really reasonable and measurable, in this paper we presented a modified set of SLA metrics. The specific requirements of the EGI project and the Polish NGI, PL-Grid, force changes in the process model and the instruments used in it. This paper also introduces the new version of the tool, called Grid Resource Bazaar, which will be used in PL-Grid and will implement the assumptions of the resource allocation model described above.

Acknowledgements. This work is partially funded by the Polish NGI, PL-Grid Project.

References

1. Bazaar Project Web Page, http://grid.cyfronet.pl/bazaar
2. EGEE CIC Operations Portal, http://cic.in2p3.fr
3. Foster I., Kesselman C. and Tuecke S. (2001). The Anatomy of the Grid – Enabling Scalable Virtual Organizations. International Journal of Supercomputer Applications, 15(3), pp. 200-222.
4. Polish National Grid Initiative – PL-Grid, http://www.plgrid.pl/en
5. Szepieniec T., Pagacz A.: SLA negotiation and resource allocation in Grids. Cracow'08 Grid Workshop: October 13-15, 2008, Cracow, Poland: proceedings, pp. 221-228, Kraków 2009.
6. Szepieniec T., Radecki M., Flis L., Krakowian M., Tomanek M., and Ziajka W. (2009). Operational Architecture of PL-Grid Project. Cracow'09 Grid Workshop: proceedings.