gbrds workshop sept09 metadata identifiers

Global Biodiversity Information Facility: Metadata in context of GBRDS. Éamonn Ó Tuama. GBRDS Workshop, Copenhagen, 17-18 Sept 2009

Upload: vishwas-chavan

Post on 18-Jan-2015


Page 1: Gbrds Workshop Sept09 Metadata Identifiers

GLOBAL BIODIVERSITY INFORMATION FACILITY

Metadata in context of GBRDS

Éamonn Ó Tuama

GBRDS Workshop, Copenhagen, 17-18 Sept 2009

Page 2: Gbrds Workshop Sept09 Metadata Identifiers

Outline

- Overview: where metadata fits in

- Role of metadata

- Task groups

- Metadata task group recommendations

- LSID-GUID task group recommendations

Page 3: Gbrds Workshop Sept09 Metadata Identifiers

Where metadata fits in ...

Page 4: Gbrds Workshop Sept09 Metadata Identifiers

Where metadata fits in ...

Page 5: Gbrds Workshop Sept09 Metadata Identifiers

Why metadata?

William K. Michener, Meta-information concepts for ecological data management, Ecological Informatics, Volume 1, Issue 1, January 2006, Pages 3-7, ISSN 1574-9541, DOI: 10.1016/j.ecoinf.2005.08.004 (http://www.sciencedirect.com/science/article/B7W63-4HJRS57-3/2/ea2e08412c6776456f540e66983546c0)

Information about datasets deteriorates over time!

Page 6: Gbrds Workshop Sept09 Metadata Identifiers

GBIF Nodes Survey 2009

Page 7: Gbrds Workshop Sept09 Metadata Identifiers

GBIF Nodes Survey 2009

Page 8: Gbrds Workshop Sept09 Metadata Identifiers

Current GBIF metadata handling

Data providers register their data and services in the GBIF UDDI Registry

UDDI lists “business” information and the binding template, i.e., the URL by which the provider installation can be accessed

All further metadata are derived via DiGIR/TAPIR requests

No separate metadata catalogue with a dedicated client for searching or browsing
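The second step above (deriving metadata from the access point listed in the registry) can be sketched as follows. This is a minimal sketch, assuming a TAPIR provider that accepts the key-value-pair form of the metadata operation (`op=metadata`); the binding URL is a hypothetical stand-in for one looked up in the UDDI registry, and no network request is made.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tapir_metadata_url(access_point: str) -> str:
    """Build the URL of a TAPIR metadata request for a provider access point.

    Any query parameters already present on the access point are kept,
    and the operation selector is added (or overwritten).
    """
    parts = urlsplit(access_point)
    query = dict(parse_qsl(parts.query))
    query["op"] = "metadata"  # assumed KVP selector for the metadata operation
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical binding URL, standing in for an entry in the UDDI registry.
binding_url = "http://provider.example.org/tapir.php/herbarium"
print(tapir_metadata_url(binding_url))
```

The point of the sketch is that the registry only supplies the URL; everything else has to be asked of each provider in turn, which is why there is no single place to search or browse.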

Page 9: Gbrds Workshop Sept09 Metadata Identifiers

Current GBIF dataset metadata

via the UDDI registry, DiGIR, TAPIR:

Data provider details (Name; Website; GBIF participant; Description; Country; Added to portal; Information updated)

Provider (DiGIR, BioCASe, TAPIR) binding:

- Name
- Website
- Description
- Citation
- How to cite this dataset
- Basis of record
- Access point URL
- Added to portal
- Information updated
- Contacts (Name, Role, Address, Email, Telephone)
- Data networks

and via the indexing process:

- Number of occurrence records indexed
- Number of records shared by provider
- Number of occurrences with coordinates
- Number of occurrences with no geospatial issues
- Number of species
- Number of taxa

Page 10: Gbrds Workshop Sept09 Metadata Identifiers

Metadata Implementation Framework Task Group (MIFTG)

To provide recommendations and guidelines on implementation of a metadata framework for the GBIF network.

http://www.gbif.org/communications/news-and-events/showsingle/article/gbif-convenes-task-group-on-metadata/

- Define a metadata model

- Define optimal network design

- Advise on use of controlled terminology

- Focus on implementation issues

- Review current GBIF metadata system

Page 11: Gbrds Workshop Sept09 Metadata Identifiers

MIFTG: summary

The Global Biodiversity Information Facility (GBIF) aspires to … become a major provider of discovery and access services for a wide variety of biodiversity data types. A distributed metadata catalog system that describes and makes accessible general information on datasets of primary biodiversity data is recognised as an essential component of GBIF to achieve this objective.

Page 12: Gbrds Workshop Sept09 Metadata Identifiers

MIFTG: recommendations

Controlled vocabularies

R53. Providers should use controlled vocabularies in any metadata field for which an appropriate vocabulary exists, and should use a multi-lingual thesaurus when appropriate

R54. The GBIF vocabularies registry is a valuable service, but should be extended to include a canonical identifier for each vocabulary, and should work to be consistent with other vocabulary registries (e.g., oasis, info, srw)

Page 13: Gbrds Workshop Sept09 Metadata Identifiers

MIFTG: recommendations

Metadata specifications

R6. Metadata should be able to describe multiple types of primary biodiversity data.

R7. Metadata should support data discovery, interpretation, and analytical reuse.

R8. Metadata should support search/browse by space, time, taxa, and theme.

R9. Metadata should support search/browse by name of provider/name of organization.

R10. Metadata should support search by related publications.
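What R8 and R9 ask of a metadata record can be illustrated with a small in-memory search. This is only a sketch under stated assumptions: the `DatasetMetadata` fields, the `search` function and the sample records are all invented for illustration and do not correspond to any GBIF metadata profile.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical minimal record; field names are illustrative only."""
    title: str
    provider: str
    bbox: tuple   # (min_lon, min_lat, max_lon, max_lat)
    years: tuple  # (start_year, end_year)
    taxa: set = field(default_factory=set)
    keywords: set = field(default_factory=set)


def bbox_overlaps(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records, bbox=None, years=None, taxon=None, keyword=None, provider=None):
    """Return records matching every criterion that was supplied."""
    hits = []
    for r in records:
        if bbox is not None and not bbox_overlaps(r.bbox, bbox):
            continue  # space
        if years is not None and not (r.years[0] <= years[1] and years[0] <= r.years[1]):
            continue  # time
        if taxon is not None and taxon not in r.taxa:
            continue  # taxa
        if keyword is not None and keyword not in r.keywords:
            continue  # theme
        if provider is not None and provider.lower() not in r.provider.lower():
            continue  # provider / organization name
        hits.append(r)
    return hits


# Invented sample records.
catalogue = [
    DatasetMetadata("Alpine flora survey", "Example Herbarium",
                    (5.0, 44.0, 11.0, 48.0), (1990, 2005), {"Plantae"}, {"vegetation"}),
    DatasetMetadata("Baltic bird counts", "Example Bird Observatory",
                    (10.0, 53.0, 30.0, 66.0), (2000, 2009), {"Aves"}, {"monitoring"}),
]
print([r.title for r in search(catalogue, taxon="Aves")])
```

Each search axis only works if the record carries the corresponding field in a structured form, which is the practical content of the recommendations.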

Page 14: Gbrds Workshop Sept09 Metadata Identifiers

MIFTG: recommendations

Metadata catalog system recommendations

R27. The metadata catalog system must support multiple metadata models natively.

R28. The metadata catalog system must be able to return the original contributed metadata object.

R29. The metadata catalog system must support unique versioning of metadata and data objects using globally unique identifiers to differentiate revisions.

R42. The metadata catalog system should register with one or more node registries to advertise services available.

R44. The metadata catalog system should provide attribution and branding for original metadata providers.
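R27 to R29 can be read together as a storage contract: keep each contributed document untouched, remember which model it uses, and give every revision its own identifier. The following is a minimal sketch of that contract, assuming UUID URNs as the globally unique identifiers; the class and method names are hypothetical and not taken from any catalog software.

```python
import uuid


class MetadataCatalog:
    """Toy catalog: stores contributed documents verbatim, one GUID per revision."""

    def __init__(self):
        self._objects = {}  # revision GUID -> (model name, original bytes)
        self._history = {}  # series id -> revision GUIDs, oldest first

    def contribute(self, series_id: str, model: str, document: bytes) -> str:
        """Store a document as a new revision and return its GUID."""
        guid = uuid.uuid4().urn  # e.g. 'urn:uuid:...'; one per revision (R29)
        self._objects[guid] = (model, document)  # any model accepted as-is (R27)
        self._history.setdefault(series_id, []).append(guid)
        return guid

    def get_original(self, guid: str):
        """Return the (model, bytes) pair exactly as contributed (R28)."""
        return self._objects[guid]

    def revisions(self, series_id: str):
        """Return all revision GUIDs of a series, oldest first."""
        return list(self._history.get(series_id, []))


catalog = MetadataCatalog()
first = catalog.contribute("herbarium-dataset", "eml", b"<eml>v1</eml>")
second = catalog.contribute("herbarium-dataset", "eml", b"<eml>v2</eml>")
print(catalog.revisions("herbarium-dataset") == [first, second])
```

Storing the original bytes rather than a parsed form is what makes R28 cheap to satisfy; any normalised view for searching would be derived alongside, not instead.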

Page 15: Gbrds Workshop Sept09 Metadata Identifiers

MIFTG: recommendations

Network Architecture Recommendations

R20. GBIF should build a distributed system of regional nodes, each containing a replica of all metadata.

R21. Each regional node must replicate metadata to other regional nodes when record changes occur using a GBIF-prescribed replication protocol.

R22. Each regional node should also provide a harvesting interface that exposes metadata via their unique identifiers.

R25. GBIF needs a registry to maintain list of regional nodes and their relevant service endpoints.
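The replication idea in R20 to R22 can be sketched with a toy push scheme. The recommendations do not specify the GBIF-prescribed replication protocol, so everything here is an assumption made for illustration: per-record version counters, direct in-process calls between nodes, and no handling of concurrent edits.

```python
class RegionalNode:
    """Toy regional node holding a full replica of all metadata records."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # identifier -> (version, metadata)
        self.peers = []

    def connect(self, other):
        """Link two nodes so that each pushes changes to the other."""
        if other not in self.peers:
            self.peers.append(other)
            other.peers.append(self)

    def publish(self, identifier, metadata):
        """Create or update a record locally, then replicate it (R21)."""
        version = self.records.get(identifier, (0, None))[0] + 1
        self._apply(identifier, version, metadata)

    def _apply(self, identifier, version, metadata):
        current = self.records.get(identifier)
        if current is not None and current[0] >= version:
            return  # already up to date; this also stops forwarding loops
        self.records[identifier] = (version, metadata)
        for peer in self.peers:
            peer._apply(identifier, version, metadata)

    def harvest(self):
        """Expose all records keyed by their identifiers (R22)."""
        return {ident: meta for ident, (_, meta) in self.records.items()}


europe, asia, africa = RegionalNode("europe"), RegionalNode("asia"), RegionalNode("africa")
europe.connect(asia)
asia.connect(africa)
europe.publish("urn:uuid:example-1", {"title": "Alpine flora survey"})
print(africa.harvest())
```

A node registry as in R25 would replace the hand-wired `connect` calls: each node would look up its peers and their endpoints instead of being told about them.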

Page 16: Gbrds Workshop Sept09 Metadata Identifiers

LSID-GUID Task Group (LGTG)

To provide recommendations and guidelines on deployment of LSIDs and other GUIDs on the GBIF network with particular reference to the potential role of GBIF as a stable, long term provider of GUID resolution services.

http://www.gbif.org/communications/news-and-events/showsingle/article/gbif-convenes-task-group-on-lsids-1/

- Review the plans for a decentralised GBIF informatics architecture to ascertain requirements for GUID technologies

- Evaluate a role for GBIF in provision of LSID hosted services

- Identify the data models/vocabularies for use in metadata returned on GUID resolution

- Propose a business model for adopting LSIDs

- Review the main GUID technologies

- Identify solutions for integrating GUIDs (e.g., LSIDs, Handles, DOIs) with the Semantic Web and the Linked Data model

Page 17: Gbrds Workshop Sept09 Metadata Identifiers

LSID-GUID Task Group: summary

Effective identification of data objects is essential for linking the world’s biodiversity data. If GBIF is to enable the exchange of biodiversity data it must promote identifier adoption through:

- education, training, outreach

- leadership

- practical services

Page 18: Gbrds Workshop Sept09 Metadata Identifiers

LGTG: recommendations

Recommendation 4

GBIF should encourage, support and advise on the use of appropriate identifier technologies, in particular LSIDs and HTTP URIs, but not impose a requirement for one at the expense of the other. GBIF should provide specific advice for the issuing and use of LSIDs and for HTTP URIs.
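To make the two identifier forms concrete, here is a minimal sketch that parses an LSID of the form `urn:lsid:authority:namespace:object[:revision]` and wraps it in an HTTP URI through a proxy. The proxy base URL and the sample identifier are hypothetical, and the validation is deliberately shallow (it checks only the number and presence of the colon-separated parts).

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> LSID:
    """Split an LSID into its components; raise ValueError if malformed."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if not all(parts[2:5]):
        raise ValueError(f"empty LSID component: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


def http_proxy_uri(text: str, proxy: str = "http://lsid.example.org/") -> str:
    """Return an HTTP URI that a (hypothetical) proxy could resolve for the LSID."""
    parse_lsid(text)  # validate first
    return proxy + text


sample = "urn:lsid:example.org:specimens:12345:2"  # invented identifier
print(parse_lsid(sample))
print(http_proxy_uri(sample))
```

Wrapping rather than replacing the LSID is one way to follow the recommendation of not imposing either technology at the expense of the other: the same identifier stays usable by LSID clients and by ordinary web clients.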

Page 19: Gbrds Workshop Sept09 Metadata Identifiers

LGTG: recommendations

Recommendation 5

GBIF should support a promotional programme, including:

- workshops for data providers on awareness of identifiers and choosing and implementing persistent identifiers;

- technical and deployment training programmes;

- maintaining a system of “quality marks” for compliant collaborators (data providers, aggregators, etc.).

Page 20: Gbrds Workshop Sept09 Metadata Identifiers

LGTG: recommendations

Recommendation 6

The GBIF data portal should demonstrate good practice by:

- maintaining fields for identifiers, including those from data providers;

- assigning GBIF identifiers to cached objects;

- using persistent resolvable identifiers as property values in GBIF records where possible.

Page 21: Gbrds Workshop Sept09 Metadata Identifiers

LGTG: recommendations

Recommendation 8

GBIF should make data more inter-connected by:

- adopting current best practice for interconnected data (Linked Data principles);

- outputting RDF graphs;

- using existing vocabularies and GUIDs wherever possible.
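As an illustration of what "RDF graphs" and "existing vocabularies" mean in practice, the sketch below serialises a few statements about a dataset as N-Triples, reusing Dublin Core terms for the predicates. The subject and publisher URIs are hypothetical, the helper names are invented, and the literal escaping covers only backslashes, quotes and newlines.

```python
DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace


class URI(str):
    """Marks a value as a resource identifier rather than a plain literal."""


def _term(value) -> str:
    """Render one object term in N-Triples syntax."""
    if isinstance(value, URI):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_ntriples(subject: str, properties) -> str:
    """Serialise (predicate, value) pairs about one subject, one triple per line."""
    return "\n".join(
        f"<{subject}> <{predicate}> {_term(value)} ." for predicate, value in properties
    )


# Hypothetical dataset and publisher identifiers.
dataset = "http://data.example.org/dataset/42"
print(to_ntriples(dataset, [
    (DCTERMS + "title", "Alpine flora survey"),
    (DCTERMS + "publisher", URI("http://data.example.org/provider/7")),
]))
```

The publisher is given as a URI rather than a name, which is the Linked Data part: a client can follow it to find out more, and two datasets from the same provider become connected without any string matching.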

Page 23: Gbrds Workshop Sept09 Metadata Identifiers

LGTG: recommendations

Recommendation 10

GBIF should provide services to support identifier resolution, redirection, metadata hosting, and caching.
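Three of the four services named here (resolution, redirection and caching) can be sketched in a few lines. This is an assumption-laden toy, not a description of any GBIF service: the class name, the injected `fetch_metadata` callable, the use of status 303 for redirects and the time-to-live policy are all choices made for the example. Metadata hosting is not covered.

```python
import time


class IdentifierResolver:
    """Toy resolver: maps identifiers to locations and caches fetched metadata."""

    def __init__(self, fetch_metadata, ttl_seconds=3600.0, clock=time.monotonic):
        self._fetch = fetch_metadata  # callable(location) -> metadata document
        self._ttl = ttl_seconds
        self._clock = clock
        self._locations = {}  # identifier -> current location (URL)
        self._cache = {}      # identifier -> (expiry time, metadata)

    def register(self, identifier, location):
        """Record or update where an identifier currently points."""
        self._locations[identifier] = location
        self._cache.pop(identifier, None)  # cached copy may now be stale

    def resolve(self, identifier):
        """Return (status, location): a redirect if known, else not found."""
        location = self._locations.get(identifier)
        if location is None:
            return 404, None
        return 303, location

    def metadata(self, identifier):
        """Return metadata, fetching from the provider only when the cache expired."""
        now = self._clock()
        cached = self._cache.get(identifier)
        if cached is not None and cached[0] > now:
            return cached[1]
        document = self._fetch(self._locations[identifier])  # KeyError if unknown
        self._cache[identifier] = (now + self._ttl, document)
        return document


# Stand-in fetcher; a real one would issue a request to the provider.
resolver = IdentifierResolver(lambda url: {"source": url})
resolver.register("urn:lsid:example.org:specimens:12345", "http://provider.example.org/12345")
print(resolver.resolve("urn:lsid:example.org:specimens:12345"))
```

Keeping the identifier-to-location table separate from the provider is what lets an identifier stay stable when the provider moves: only `register` needs to be called again.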

Page 24: Gbrds Workshop Sept09 Metadata Identifiers

LGTG: recommendations

Recommendation 11

GBIF should provide additional services, including persistent identifier monitoring services.

Page 25: Gbrds Workshop Sept09 Metadata Identifiers

LGTG: recommendations

Recommendation 12

GBIF should extend the role of its data portal by hosting resources related to the use of identifiers, such as the TDWG vocabularies.

Page 26: Gbrds Workshop Sept09 Metadata Identifiers

How to contact GBIF:

Web site: www.gbif.org
Data portal: www.gbif.net

GBIF Secretariat
Universitetsparken 15
DK-2100 Copenhagen Ø
Denmark

E-mail: [email protected]
Tel: +45 3532 1470
Fax: +45 3532 1480

GBIF Secretariat building, supported by a grant from the Aage V. Jensens Fonde