MCAT: A Metadata Catalog San Diego Supercomputing Center Part of the Storage Resource Broker (SRB)

Upload: george-lucas

Post on 28-Dec-2015

MCAT: A Metadata Catalog

San Diego Supercomputing Center

Part of the Storage Resource Broker (SRB)

Overview

What is metadata
MCAT architecture
List of (many!) MCAT attributes
MAPS

Elements of Data Intensive computing environments

Resources
  – Hardware: computing platforms, networks, storage systems
  – Software: DBs, file systems, operating systems, schedulers, applications
Methods
  – Access methods, APIs, data access and conversion
Data objects
  – Data sets and collections of data sets
Users and groups
  – Who is allowed to create/update/access resources, methods and data sets

Elements of MCAT

MAPS initialization (Metadata Attribute Presentation Structure)
Schema initialization
MAPS to schema converter to dynamic query generator
DB2 and Oracle Query systems
Answer extractor
Convert back to MAPS format

Metadata Stored in MCAT

Metadata = information about data objects
  – Describes properties and attributes of objects
Examples
1. Identifier (internal, not seen by user)
2. Name
3. Types and formats
4. Size

MCAT attributes (cont.)

5. Comments
6. Liveness (i.e., current state): deleted, exists, locked, or under construction
7. Replica-number
  – SRB supports cloning of data
  – An object may have many clones
  – SRB controls replica selection
8. Creation time-stamp
9. Creation-owner

MCAT attributes (cont.)

10. Collection name
  – Every data object must be associated with a collection
  – A collection contains data objects and other collections (i.e., sub-collections)
  – Objects may only belong to one collection
11. Physical resource where object is located
12. Location inside the resource (e.g., a directory on a file system)
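The collection rules above (collections nest, and each object belongs to exactly one collection) can be sketched as follows. This is a minimal illustration, not MCAT's schema: the `Catalog` class, its method names, and the root collection `"/"` are all assumptions made for the example.

```python
class Catalog:
    """Toy catalog (hypothetical): each data object sits in exactly one collection."""

    def __init__(self):
        self.parent = {"/": None}       # collection name -> parent collection
        self.object_collection = {}     # data object name -> its single collection

    def add_collection(self, name, parent="/"):
        if parent not in self.parent:
            raise KeyError(f"unknown parent collection: {parent}")
        self.parent[name] = parent

    def add_object(self, obj, collection):
        if collection not in self.parent:
            raise KeyError(f"unknown collection: {collection}")
        if obj in self.object_collection:
            # Enforces the "only one collection" rule from the slide.
            raise ValueError(f"{obj} already belongs to a collection")
        self.object_collection[obj] = collection

    def path(self, obj):
        """Return the chain of collections from the root down to obj's collection."""
        chain = []
        node = self.object_collection[obj]
        while node is not None:
            chain.append(node)
            node = self.parent[node]
        return list(reversed(chain))
```

Storing a single collection per object (rather than a list) makes the one-collection constraint structural instead of something that has to be checked on every query.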

MCAT attributes (cont.)

13. Access control list (ACL)
  – Entry is: <dataObjectId, userID, PermissionID>
  – Each user is given one permission per data object
  – Each permissionID has an associated list of actions that are permitted
    – Read
    – Write
    – Control
    – grantTicket
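A minimal sketch of how such ACL entries could be checked. The numeric permission IDs and the particular action sets assigned to them are assumptions for illustration; the slides name the four actions but do not say which permissionID grants which.

```python
# Hypothetical permission IDs and their permitted actions (not from the source).
PERMISSION_ACTIONS = {
    1: {"read"},
    2: {"read", "write"},
    3: {"read", "write", "control", "grantTicket"},
}

# ACL entries <dataObjectId, userID, PermissionID>, keyed so that each
# (object, user) pair can hold only one permission at a time.
acl = {}


def grant(data_object_id, user_id, permission_id):
    """Set the single permission a user holds on a data object."""
    if permission_id not in PERMISSION_ACTIONS:
        raise KeyError(f"unknown permission: {permission_id}")
    acl[(data_object_id, user_id)] = permission_id


def is_allowed(data_object_id, user_id, action):
    """True if the user's permission on the object includes the action."""
    permission_id = acl.get((data_object_id, user_id))
    if permission_id is None:
        return False
    return action in PERMISSION_ACTIONS[permission_id]
```

Granting a second permission to the same user on the same object replaces the first, which is one way to honor the "one permission per data object" rule.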

MCAT attributes (cont.)

14. Audit record
  – <objectID, userID, actionID, timeStamp, Comments>
  – Each action on a data object can be audited
  – Action success or failure noted in audit trail
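One way to picture the audit trail is a wrapper that records every action, whether it succeeds or fails. The sketch below assumes an in-memory list and a hypothetical `audited` helper; putting the success/failure note in the comments field is also an assumption, since the slides do not say where the outcome is stored.

```python
import time
from typing import NamedTuple


class AuditRecord(NamedTuple):
    # Fields follow the tuple on the slide.
    object_id: int
    user_id: str
    action_id: str
    time_stamp: float
    comments: str


audit_trail = []  # stand-in for the catalog's audit table


def audited(object_id, user_id, action_id, operation):
    """Run operation() and append an audit record on success or failure."""
    try:
        result = operation()
    except Exception as exc:
        audit_trail.append(
            AuditRecord(object_id, user_id, action_id, time.time(), f"failure: {exc}")
        )
        raise  # the caller still sees the original error
    audit_trail.append(
        AuditRecord(object_id, user_id, action_id, time.time(), "success")
    )
    return result
```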

MCAT attributes (cont.)

15. Ticket
  – Provides holder with an action permit on the data object
  – Currently only read
  – Ticket-giver can impose restrictions: who can use it, when, how many times it can be used
  – <ticketValue, objectId, userId, actionId, beginTime, endTime, accessCount, ticketGiver>
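The ticket restrictions (who, when, how many times) map naturally onto a validity check. The sketch below is an illustration under assumptions: `redeem` is a hypothetical helper, `access_count` is interpreted as the number of uses remaining, and a `user_id` of `None` is taken to mean "any holder".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ticket:
    # Fields follow the tuple on the slide.
    ticket_value: str
    object_id: int
    user_id: Optional[str]  # assumption: None means any holder may use it
    action_id: str          # the slides say only "read" is currently supported
    begin_time: float
    end_time: float
    access_count: int       # assumption: uses remaining
    ticket_giver: str


def redeem(ticket, object_id, user_id, action_id, now):
    """Return True and consume one use if the ticket covers this request."""
    ok = (
        ticket.object_id == object_id
        and ticket.action_id == action_id
        and ticket.user_id in (None, user_id)
        and ticket.begin_time <= now <= ticket.end_time
        and ticket.access_count > 0
    )
    if ok:
        ticket.access_count -= 1
    return ok
```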

Attributes not yet supported

Partitioning of data objects
Versioning
Lineage (of data objects and methods)
Derivatives
Locking
Public and private keys on data objects or collections
Summaries or aggregations
Measurements

Resource-related metadata

1. Name
2. Type
3. Access address
4. Default location template (URL??)
5. Replica-number: copies of the same resource; any of the copies are equivalent
6. Comments

Replicated resource concept

Logical resource
  – Formed as set of physical, possibly heterogeneous resources
Create a data object on a replicated resource:
  – object automatically replicated in each one of the component resources
Provides fault tolerance
Other logical resources: striped resources (round-robin), write-once resources, read-only resources
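The replicated-resource idea can be sketched in a few lines: a write to the logical resource lands on every component, and a read can fall back to any surviving copy. The class names and the dictionary-backed storage below are assumptions for illustration, not SRB's implementation.

```python
class PhysicalResource:
    """Stand-in for one storage system (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def put(self, key, data):
        self.store[key] = data

    def get(self, key):
        return self.store[key]  # raises KeyError if this copy is missing


class ReplicatedResource:
    """Logical resource formed from a set of physical resources."""

    def __init__(self, components):
        self.components = list(components)

    def put(self, key, data):
        # Creating an object replicates it in each component resource.
        for resource in self.components:
            resource.put(key, data)

    def get(self, key):
        # Fault tolerance: try each component until one has the object.
        for resource in self.components:
            try:
                return resource.get(key)
            except KeyError:
                continue
        raise KeyError(key)
```

A striped resource would differ only in `put`, choosing one component per object in round-robin order instead of writing to all of them.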

User-related attributes

1. Name
2. Type (privileged, normal, projects)
3. Address
4. Email
5. Phone
6. Pass phrase
7. Domain: e.g., ucsd, sdsc, caltech
8. User-groups: provides group ID and access control

Data Models and Data Exchange

Data models: standards for structuring information (e.g., Dublin Core)

Data exchange formats: standard means to communicate metadata (e.g., XML)

MCAT uses its own data model and exchange format: MAPS
  – MAPS = metadata attribute presentation structure
Working on mappings to other formats

MAPS

MAPS query format derived from SQL
  – Large metadata catalogs require database systems
  – Metadata are normally given as attribute-value pairs whose search can easily be translated into SQL-like queries
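To illustrate how a search over attribute-value pairs can be turned into SQL, here is a minimal sketch. It is not the MAPS format or MCAT's query generator: the `data_object` table, its columns, the sample rows, and the `build_query` helper are all hypothetical, and SQLite stands in for the DB2 and Oracle back ends mentioned earlier.

```python
import sqlite3

# Hypothetical set of searchable attributes (column names in the toy table).
ALLOWED_ATTRIBUTES = {"name", "type", "size", "owner", "collection"}


def build_query(select, conditions):
    """Translate attribute-value conditions into a parameterized SELECT."""
    for attr in list(select) + list(conditions):
        if attr not in ALLOWED_ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attr}")
    sql = f"SELECT {', '.join(select)} FROM data_object"
    where = " AND ".join(f"{attr} = ?" for attr in conditions)
    if where:
        sql += f" WHERE {where}"
    return sql, list(conditions.values())


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_object "
    "(name TEXT, type TEXT, size INTEGER, owner TEXT, collection TEXT)"
)
conn.executemany(
    "INSERT INTO data_object VALUES (?, ?, ?, ?, ?)",
    [
        ("m31.fits", "image", 2048, "alice", "sky-survey"),
        ("notes.txt", "text", 12, "bob", "sky-survey"),
    ],
)

sql, params = build_query(["name", "size"], {"type": "image", "collection": "sky-survey"})
rows = conn.execute(sql, params).fetchall()
```

Attribute names are checked against a fixed set and values are passed as bound parameters, so user-supplied search terms never get spliced into the SQL text.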