DAIS Grid 1 Database Access and Integration Services on the Grid * * http://www.cs.man.ac.uk/grid-db/papers/dbtf.pdf Authors: N. Paton, M. Atkinson, V. Dialiani, D. Pearson, T. Storey, P. Watson Florida International University School of Computing and Information Sciences Summer 2006 Presented by: Ariel Cary
DAIS Grid 1

Database Access and Integration Services on the Grid*

* http://www.cs.man.ac.uk/grid-db/papers/dbtf.pdf

Authors: N. Paton, M. Atkinson, V. Dialiani, D. Pearson, T. Storey, P. Watson

Florida International University

School of Computing and Information Sciences

Summer 2006

Presented by: Ariel Cary

Agenda

Introduction

Scope and Context of Proposal

Proposed Database Services

DS in OGSA

Current DAIS Standards and Systems

Conclusion

Introduction

• Grid research generally focuses on applications where data is stored in files

• DBMSs have a central role in data organization for numerous applications, including e-Science: particle physics (LHC@CERN), earth sciences, bioinformatics

• There is a need to interconnect pre-existing and independently operated databases

Introduction (cont)

This work seeks to encourage the development of standards that can meet those needs.

A (preliminary) proposal is made for the staged development of a collection of Grid Database Services that allow access to existing, autonomous databases within a Grid

The proposal follows a service-based approach to DBMS integration within the OGSA framework

Introduction (cont)

Service definitions essentially state what functionality is to be supported

How that functionality is supported is left open, so it may be implemented in different ways (with different performance characteristics, etc.)

Scope and Context of Proposal

Scope

The proposal has several characteristics:

– It is independent of any specific Grid toolkit (tying it to one could skew and restrict the proposal)

– It does not propose the development of a new DBMS for the Grid, but rather wrapping existing systems behind a consistent interface and developing distributed managers

– It is independent of any specific data model or access language

Context

Relevant terms related to Databases

– A Database Service is any service that supports a database interface (WSDL)

– Service interfaces are abstract and not prescriptive on how they are supported, or the data model that underpins a DBMS

– Specific DBMS services could provide access to relational or object DBMS, XML repositories, specialist storage systems …

Context (cont.)

– A Grid Database Service (GDS) provides capabilities for querying, updating and evolving a database

– The interface also describes:

• Data delivery: transmitting structured data

• Transactions: coordinating collections of operations

• Database Metadata: accessing information about the data a DB service provides

Proposed Database Services

Database Discovery

It is assumed that a registry lookup returns a Grid Service Handle (GSH), a globally unique name for a service instance

A service provider publishes a description (WSDL) of a service to a service registry

The registry is later consulted by a requestor, and a binding is created that allows calls to the service
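
The publish and lookup steps can be sketched as a toy in-memory registry. This is only an illustration under stated assumptions: the class ServiceRegistry, its publish and lookup methods, the "gsh:" prefix and the example.org URL are all hypothetical, and a real Grid registry would be a remote service rather than a local dictionary.

```python
import uuid
from typing import Dict, Tuple


class ServiceRegistry:
    """Hypothetical in-memory stand-in for a Grid service registry."""

    def __init__(self) -> None:
        # service name -> (handle, location of the WSDL description)
        self._entries: Dict[str, Tuple[str, str]] = {}

    def publish(self, name: str, wsdl_url: str) -> str:
        # Confer a globally unique handle (a stand-in for a GSH) on the service.
        gsh = f"gsh:{uuid.uuid4()}"
        self._entries[name] = (gsh, wsdl_url)
        return gsh

    def lookup(self, name: str) -> Tuple[str, str]:
        # Returns (handle, WSDL location); raises KeyError for unknown services.
        return self._entries[name]


# Provider side: publish the service description.
registry = ServiceRegistry()
published = registry.publish("employees-db", "http://example.org/employees.wsdl")

# Requestor side: consult the registry before creating a binding.
gsh, wsdl = registry.lookup("employees-db")
```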

Database Statements

Statements allow queries or change operations to be sent to a DBMS

This implies that the underlying DBMS supports a query or command language, which differs for every database model

Thus, statements are a point of tension with the proposal being independent of the data model

Database Statements (cont.)

The pairs (queryNotation, query), … are introduced to allow flexibility (like MIME types for e-mail attachments)

For example:

– queryNotation=“SQL’92”

– query=“Select * from EMP Where Salary>1000”
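
A minimal sketch of how a service might use the notation tag to route a statement, much as a mail client picks a viewer from a MIME type. The names Statement, make_dispatcher and perform are hypothetical, and the handler below only echoes the query instead of contacting a real DBMS.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Statement:
    query_notation: str  # names the language of the query, e.g. "SQL'92"
    query: str           # the statement text, written in that notation


def make_dispatcher(handlers: Dict[str, Callable[[str], str]]):
    """Build a function that routes a statement by its notation."""
    def perform(stmt: Statement) -> str:
        handler = handlers.get(stmt.query_notation)
        if handler is None:
            raise ValueError(f"unsupported notation: {stmt.query_notation}")
        return handler(stmt.query)
    return perform


# Assumed handler: a placeholder for a wrapped relational DBMS.
perform = make_dispatcher({"SQL'92": lambda q: f"sql-engine received: {q}"})

stmt = Statement("SQL'92", "Select * from EMP Where Salary>1000")
result = perform(stmt)
```

Because the service interface only sees the pair, a new notation can be supported by registering another handler, without changing the interface.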

Database Statements (cont.)

The optional txHandle indicates if the operation is part of a transaction, provided the DBMS supports transactions

The final results of an operation are managed via:

– resultHandle: generated dynamically

– expires: an expiry time by which the result must be claimed
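
One way to picture resultHandle and expires is a store that hands out a handle and refuses to release the result after its deadline. This is a sketch under assumptions: ResultStore, put and claim are hypothetical names, and the optional now argument exists only so the behaviour can be shown without waiting.

```python
import time
import uuid
from typing import Any, Dict, Optional, Tuple


class ResultStore:
    """Hypothetical holder for results awaiting collection by the caller."""

    def __init__(self) -> None:
        # handle -> (deadline as a timestamp, stored result)
        self._results: Dict[str, Tuple[float, Any]] = {}

    def put(self, result: Any, expires_in: float, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        handle = str(uuid.uuid4())  # handle generated dynamically
        self._results[handle] = (now + expires_in, result)
        return handle

    def claim(self, handle: str, now: Optional[float] = None) -> Any:
        now = time.time() if now is None else now
        # pop() raises KeyError if the handle is unknown or already claimed.
        deadline, result = self._results.pop(handle)
        if now > deadline:
            raise TimeoutError("result expired before it was claimed")
        return result


store = ResultStore()
handle = store.put([("Smith", 1200)], expires_in=60.0)
rows = store.claim(handle)
```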

Database Statements (cont.)

The operations on a GDS will be atomic:

– Preparation and Validation: consistency check

– Application: operation is performed

– Result Delivery: results available to the caller

Operations usually involve the transfer of large amounts of data, which may take a long time to execute (prone to interruptions!)

The implementation of the DBMS service should handle such failures to achieve atomicity
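
As an illustration only, atomicity in the face of an interruption can be approximated by applying the operation to a private copy and publishing the copy only on success. The function run_atomically and its parameters are hypothetical, and a real DBMS service would rely on its own logging and recovery rather than copying state.

```python
import copy
from typing import Any, Callable, Dict


def run_atomically(state: Dict[str, Any],
                   validate: Callable[[Dict[str, Any]], None],
                   apply: Callable[[Dict[str, Any]], Any]) -> Any:
    # Preparation and Validation: reject the request before touching anything.
    validate(state)
    # Application: work on a private copy so a failure leaves `state` intact.
    working = copy.deepcopy(state)
    result = apply(working)
    # Publish the changes only after the operation has fully succeeded.
    state.clear()
    state.update(working)
    # Result Delivery: hand the result back to the caller.
    return result


def raise_salary(db: Dict[str, Any]) -> int:
    db["salary"] += 100
    return db["salary"]


database = {"salary": 1000}
new_salary = run_atomically(database, lambda db: None, raise_salary)
```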

Delivery System

The means by which (potentially large amounts of) structured data is moved from one location to one or more others

Should be considered complementary to protocols such as GridFTP, which could be used as a delivery mechanism

Delivery System (cont.)

Single data source to be delivered, represented as a URI

Several destinations represented by URI with delivery mechanisms associated

The deliver operation initiates delivery of the data from the single source to multiple destinations

A more elaborate delivery system would include encryption, progress monitoring, etc.
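
The shape of the deliver operation, one source URI fanned out to several (destination URI, mechanism) pairs, might look like the sketch below. Everything here is assumed for illustration: the deliver signature, the "copy" mechanism name and the mem:// URIs are hypothetical, and a real mechanism could be a protocol such as GridFTP.

```python
from typing import Callable, Dict, List, Tuple

# A mechanism moves a payload to a destination URI.
Mechanism = Callable[[bytes, str], None]


def deliver(source_uri: str,
            destinations: List[Tuple[str, str]],
            fetch: Callable[[str], bytes],
            mechanisms: Dict[str, Mechanism]) -> List[str]:
    """Send the data behind one source URI to every listed destination."""
    data = fetch(source_uri)  # read the single source once
    delivered = []
    for dest_uri, mechanism_name in destinations:
        mechanisms[mechanism_name](data, dest_uri)  # KeyError if unknown
        delivered.append(dest_uri)
    return delivered


# Assumed in-memory mechanism standing in for a real transfer protocol.
inbox: Dict[str, bytes] = {}


def copy_to_memory(data: bytes, dest_uri: str) -> None:
    inbox[dest_uri] = data


delivered = deliver(
    "mem://results/42",
    [("mem://site-a", "copy"), ("mem://site-b", "copy")],
    fetch=lambda uri: b"rows",
    mechanisms={"copy": copy_to_memory},
)
```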

Distributed Transactions

A minimal transaction interface performs the role of conferring a guaranteed unique identity on the transaction

Given a transaction handle, other operations over a database service can be put explicitly within the context of a transaction, using the txHandle parameter

Distributed Transactions (cont.)

For a transaction to span multiple DBMS services, they must provide operations for use by the transaction manager that is overseeing the distributed transaction

startTransaction includes an expires parameter to limit the consumption of resources

prepareCommit operation can be used by a two-phase commit protocol to ensure that all participating database services commit
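
A textbook two-phase commit can be sketched as follows. Only startTransaction, prepareCommit, txHandle and expires come from the slides (rendered here in Python naming); the Participant class, the commit and rollback methods, and the coordinator function two_phase_commit are assumptions made for illustration.

```python
from typing import List


class Participant:
    """Hypothetical database service taking part in a distributed transaction."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def start_transaction(self, tx_handle: str, expires: float) -> None:
        # `expires` would bound how long resources are held; unused in this toy.
        self.state = "active"

    def prepare_commit(self, tx_handle: str) -> bool:
        # Phase 1 vote: promise to commit, or refuse.
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self, tx_handle: str) -> None:
        self.state = "committed"

    def rollback(self, tx_handle: str) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant], tx_handle: str) -> bool:
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare_commit(tx_handle) for p in participants]
    # Phase 2: commit everywhere only if every vote was yes.
    if all(votes):
        for p in participants:
            p.commit(tx_handle)
        return True
    for p in participants:
        p.rollback(tx_handle)
    return False


services = [Participant("db-a"), Participant("db-b")]
for service in services:
    service.start_transaction("tx-1", expires=30.0)
committed = two_phase_commit(services, "tx-1")
```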

Database Metadata

Metadata that could be useful to have access to includes:

– Content description: DB schema – data model, logical & physical structures, stats (could be obtained from the data dictionary)

– Capability description: language (query/update operations supported), transactional capabilities, protocols supported

The metadata should be described in a standard representation, e.g. an XML document provided by the data service provider
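
As a sketch of what such an XML document could contain, the function below serializes a content description and a capability description. The element and attribute names (databaseMetadata, contentDescription, capabilityDescription, and so on) are invented for this example and do not come from any DAIS schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_metadata(schema: Dict[str, List[str]],
                   capabilities: Dict[str, str]) -> str:
    """Serialize schema and capability information as one XML document."""
    root = ET.Element("databaseMetadata")

    # Content description: tables and their columns.
    content = ET.SubElement(root, "contentDescription")
    for table, columns in schema.items():
        table_el = ET.SubElement(content, "table", name=table)
        for column in columns:
            ET.SubElement(table_el, "column", name=column)

    # Capability description: languages, transactions, protocols.
    caps = ET.SubElement(root, "capabilityDescription")
    for key, value in capabilities.items():
        ET.SubElement(caps, "capability", name=key, value=value)

    return ET.tostring(root, encoding="unicode")


document = build_metadata(
    {"EMP": ["Name", "Salary"]},
    {"queryNotation": "SQL'92", "transactions": "supported"},
)
```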

Distributed Query Service (DQS)

[Figure: a query is parsed & optimized by the DQS; sub-queries are sent to the relevant DBs; results are collected & joined by the DQS. Figure label: DS1]
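
The flow in the figure can be imitated with two independent in-memory SQLite databases. This is a toy under stated assumptions: the tables, rows and the function distributed_query are made up, and the decomposition into sub-queries is hard-coded, whereas a real DQS would parse and optimize the incoming query.

```python
import sqlite3
from typing import List, Tuple


def make_db(ddl: str, insert: str, rows: List[tuple]) -> sqlite3.Connection:
    """Create an independent in-memory database with one populated table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert, rows)
    return conn


# Two autonomous databases (assumed example data).
emp_db = make_db(
    "CREATE TABLE EMP (name TEXT, dept_id INTEGER, salary INTEGER)",
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [("Ann", 1, 1500), ("Bob", 2, 900), ("Cy", 2, 2000)],
)
dept_db = make_db(
    "CREATE TABLE DEPT (dept_id INTEGER, dept_name TEXT)",
    "INSERT INTO DEPT VALUES (?, ?)",
    [(1, "Physics"), (2, "Biology")],
)


def distributed_query(min_salary: int) -> List[Tuple[str, str]]:
    # Sub-query 1: push the salary filter down to the employee database.
    emps = emp_db.execute(
        "SELECT name, dept_id FROM EMP WHERE salary > ? ORDER BY name",
        (min_salary,),
    ).fetchall()
    # Sub-query 2: fetch department names from the other database.
    depts = dict(dept_db.execute("SELECT dept_id, dept_name FROM DEPT").fetchall())
    # Results collected and joined by the query service itself.
    return [(name, depts[dept_id]) for name, dept_id in emps]


well_paid = distributed_query(1000)
```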

Database Services in OGSA

DS in OGSA

The Open Grid Services Architecture (OGSA) represents an evolution towards a Grid system architecture based on Web services concepts and technologies*

* http://www.globus.org/ogsa

The described interfaces can be used as the basis of database services through participation in the OGSA

Thus many features of this architectural framework, such as service creation, authorization and notification, become available to database services

Requirements from OGSA

The secure connection and authentication mechanism underpins all GDS security and authentication

The lifetime management model carries over unchanged as the lifetime management model for GDS

The notification mechanism specified in OGSA appears to satisfy the GDS needs

Requirements from OGSA (cont.)

Information about the user's authorization is required (potentially passed through many intermediate grid services)

– User identification services, referenced from a certificate

Certification of the services themselves may be necessary. Otherwise a discovery service could be tricked into returning a service that mimics the intended GDS, and the data would be sent to it

Some databases charge for their use, so it is necessary to support a digital payment process

Current DAIS Standards and Systems

DAIS Standards

Global Grid Forum

– “The Global Grid Forum (GGF) is the community of users, developers, and vendors leading the global standardization effort for grid computing.” http://www.ggf.org/

Part of the GGF: DAIS-WG

– “The group seeks to promote standards for the development of grid database services, focusing principally on providing consistent access to existing, autonomously managed databases.” https://forge.gridforum.org/projects/dais-wg

OGSA-DAI System

OGSA-DAI Overview http://www.ggf.org/GGF17/materials/303/Overview.ppt

Architecture + Extensibility http://www.ggf.org/GGF17/materials/303/GGF17ArchitectureExtensibility.ppt

Supported Data Resources http://www.ggf.org/GGF17/materials/303/GGF17ArchitectureExtensibility.ppt

“The aim of the OGSA-DAI project is to develop middleware to assist with access and integration of data from separate sources via the grid…and is working closely with the Global Grid Forum DAIS-WG...” http://www.ogsadai.org/

Conclusion

This document has made a preliminary, service-oriented proposal for integrating database functionality into a Grid setting

It is hoped that the document will provoke discussion on how best databases can be integrated with Grid middleware

There is an established community dedicated to defining DBMS service standards, and emerging systems are adopting them