using resource description framework (rdf) to carry metadata for datasets

37
M.Benno Blumenthal and John del Corral International Research Institute for Climate and Society OpenDAP 2007 http:// iridl.ldeo.columbia.edu/ontologie s / Using Resource Description Framework (RDF) to carry metadata for datasets

Upload: aliza

Post on 12-Jan-2016

62 views

Category:

Documents


0 download

DESCRIPTION

Using Resource Description Framework (RDF) to carry metadata for datasets. M.Benno Blumenthal and John del Corral International Research Institute for Climate and Society OpenDAP 2007 http://iridl.ldeo.columbia.edu/ontologies/. RDF is important for OpenDAP because. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Using Resource Description Framework (RDF) to carry metadata for datasets

M.Benno Blumenthal and John del Corral

International Research Institute for Climate and Society

OpenDAP 2007

http://iridl.ldeo.columbia.edu/ontologies/

Using Resource Description Framework (RDF) to carry

metadata for datasets

Page 2: Using Resource Description Framework (RDF) to carry metadata for datasets

RDF is important for OpenDAP because

• By embedding OpenDAP in an RDF document, metadata (a.k.a. attributes) not understood by OpenDAP code are easily carried in a semantically-valid way

• Explicit relationships between OpenDAP variables can cleanly solve netcdf common name vs OpenDAP GRID/MAP structures, while avoiding retransmission of common independent variables

• Explicit mapping between the different data models of the different OpenDAP APIs

Page 3: Using Resource Description Framework (RDF) to carry metadata for datasets

RDF is important for OpenDAP because

• Support for different languages can be built on top of RDF object support, e.g. Ruby ActiveRDF

Page 4: Using Resource Description Framework (RDF) to carry metadata for datasets

Why RDF?

Web-based system for interoperating semantics

A key part of the Semantic Web

RDF/OWL is an interesting technology, but it is even more interesting when it is clear that it can help solve our problems

Page 5: Using Resource Description Framework (RDF) to carry metadata for datasets

Standard Metadata

Users

Datasets

Tools

Standard Metadata Schema/Data Services

Page 6: Using Resource Description Framework (RDF) to carry metadata for datasets

Many Data Communities

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Page 7: Using Resource Description Framework (RDF) to carry metadata for datasets

Super Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Standard metadata schema

Page 8: Using Resource Description Framework (RDF) to carry metadata for datasets

Super Schema: direct

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Standard metadata schema/data service

Page 9: Using Resource Description Framework (RDF) to carry metadata for datasets

Flaws

• A lot of work

• Super Schema/Service is the Lowest-Common-Denominator

• Science keeps evolving, so that standards either fall behind or constantly change

Page 10: Using Resource Description Framework (RDF) to carry metadata for datasets

RDF Standard Data Model Exchange

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Tools

Users

Datasets

Standard Metadata Schema

Standard metadata schema

RDF

RDF

RDF

RDF

RDF

RDF

Page 11: Using Resource Description Framework (RDF) to carry metadata for datasets

Standard metadata schema

Tools

Users

Datasets

Standard Metadata Schema

RDF

RDFRDF

Tools

Users

Datasets

Standard Metadata Schema

RDF

RDFRDF

Tools

Users

Datasets

Standard Metadata Schem

RDF

RDFRDF

RDF Data Model Exchange

RDF

Tools

Users

Datasets

Standard Metadata Schema

RDF

RDFRDF

Tools

Users

Datasets

Standard Metadata Schema

RDF

RDFRDF

Page 12: Using Resource Description Framework (RDF) to carry metadata for datasets

RDF Architecture

RDF

RDF RDF

RDF

RDF RDF

RDF

RDF RDF

RDF

RDF

RDF RDF

RDF

RDF RDF

Virtual (derived) RDF

queries queries queries

Page 13: Using Resource Description Framework (RDF) to carry metadata for datasets

Why is this better?

• Maps the original dataset metadata into a standard format that can be transported and manipulated

• Still the same impedance mismatch when mapped to the least-common-denominator standard metadata, but

• When a better standard comes along, the original complete-but-nonstandard metadata is already there to be remapped, and “late semantic binding” means everyone can use the new semantic mapping

• Can uses enhanced mappings between models that have common concepts beyond the least-common-denominator

• EASIER – tools to enhance the mapping process, mappings build on other mappings

Page 14: Using Resource Description Framework (RDF) to carry metadata for datasets

CF attributes

SWEET Ontologies

Search Terms

CF Standard Names

IRIDL Terms

NC basic attributes

IRIDL attributes

SWEET as Terms

CF Standard NamesAs Terms

Gazetteer Terms

Page 15: Using Resource Description Framework (RDF) to carry metadata for datasets

Sample Tool: Faceted Searchhttp://iridl.ldeo.columbia.edu/ontologies/query2.pl?...

Page 16: Using Resource Description Framework (RDF) to carry metadata for datasets

Distinctive Features of the search

• Search terms are interrelated

• terms that describe the set of returns are displayed (spanning and not)

• Returned items also have structure (sub-items and superseded items are not shown)

Page 17: Using Resource Description Framework (RDF) to carry metadata for datasets

Architectural Features of the search

• Multiple search structures possible

• Multiple languages possible

• Search structure is kept in the database, not in the code

http://iridl.ldeo.columbia.edu/ontologies/query2.pl

Page 18: Using Resource Description Framework (RDF) to carry metadata for datasets

Triplets of • Subject• Property (or Predicate)• Object

URI’s identify things, i.e. most of the aboveNamespaces are used as a convenient

shorthand for the URI’s

RDF: framework for writing connections

Page 19: Using Resource Description Framework (RDF) to carry metadata for datasets

Datatype Properties

{WOA} dc:title “NOAA NODC WOA01”

{WOA} dc:description “NOAA NODC WOA01: World Ocean Atlas 2001, an atlas of objectively analyzed fields of major ocean parameters at monthly, seasonal, and annual time scales. Resolution: 1x1; Longitude: global; Latitude: global; Depth: [0 m,5500 m]; Time: [Jan,Dec]; monthly”

Page 20: Using Resource Description Framework (RDF) to carry metadata for datasets

Object Properties

{WOA} iridl:isContainerOf {Grid-1x1},

{Grid-1x1} iridl:isContainerOf {Monthly}

Page 21: Using Resource Description Framework (RDF) to carry metadata for datasets

WOA01 diagram

Page 22: Using Resource Description Framework (RDF) to carry metadata for datasets

Standard Properties

{WOA} dcterm:hasPart {Grid-1x1},{Grid-1x1} dcterm:hasPart {MONTHLY}

Alternatively

{WOA} iridl:isContainerOf {Grid-1x1},{iridl:isContainerOf} rdfs:subPropertyOf

{dcterm:hasPart}

Page 23: Using Resource Description Framework (RDF) to carry metadata for datasets

{SST} rdf:type {cfatt:non_coordinate_variable}, {SST} cfatt:standard_name {cf:sea_surface_temperature}, {SST} netcdf:hasDimension {longitude}

netcdf/CF in RDF

Object properties provide a framework for explicitly writing down relationships between data objects/components, e.g. vague meaning of nesting is made explicit

Properties also can be related, since they are objects too

Page 24: Using Resource Description Framework (RDF) to carry metadata for datasets

RDF Tools

• Transport/Exchange (RDF/XML)

• Storage

• RDF APIs (Redland,Jena,Sesame)

• Query (SPARQL,SeRQL, …)

• Basic Semantics

Page 25: Using Resource Description Framework (RDF) to carry metadata for datasets

Search Interface Term

• http://iri.columbia.edu/~benno/sampleterm.pdf

Page 26: Using Resource Description Framework (RDF) to carry metadata for datasets

Ontologies

Use Conventions to connect concepts to established sets of concepts

Generate additional “virtual” triples from the original set and semantics

RDFS – some property/class semantics

OWL – additional property/class semantics: more sophisticated (ontological) relationships

Page 27: Using Resource Description Framework (RDF) to carry metadata for datasets

OWL

Language for expressing ontologies, i.e. the semantics are very important. However, even without a reasoner to generate the implied RDF statements, OWL classes and properties represent a sophistication of the RDF Schema

However, there is a serious split in world view from what we have been talking about: concepts as classes vs concepts as individuals

Page 28: Using Resource Description Framework (RDF) to carry metadata for datasets

Faceted Search Explicated

Page 29: Using Resource Description Framework (RDF) to carry metadata for datasets

Search Interface

• Items (datasets/maps)

• Terms

• Facets

• Taxa

Page 30: Using Resource Description Framework (RDF) to carry metadata for datasets

Search Interface Semantic API

{item} dc:title dc:description rss:link iridl:icon dcterm:isPartOf {item2} dcterm:isReplacedBy {item2}

{item} trm:isDescribedBy {term}

{term} a {facet} of {taxa} of {trm:Term},{facet} a {trm:Facet}, {taxa} a {trm:Taxa},{term} trm:directlyImplies {term2}

Page 31: Using Resource Description Framework (RDF) to carry metadata for datasets

Faceted Search w/Querieshttp://iridl.ldeo.columbia.edu/ontologies/query2.pl?...

Page 32: Using Resource Description Framework (RDF) to carry metadata for datasets

RDF Architecture

RDF

RDF RDF

RDF

RDF RDF

RDF

RDF RDF

RDF

RDF

RDF RDF

RDF

RDF RDF

Virtual (derived) RDF

queries queries queries

Page 33: Using Resource Description Framework (RDF) to carry metadata for datasets

Data ServersOntologies

MMI

JPL

StandardsOrganizations

Start Point

RDF Crawler

RDFS SemanticsOwl SemanticsSWRL Rules

SeRQL CONSTRUCT

Search Queries

LocationCanonicalizer

TimeCanonicalizer

Sesame

Search Interface

bibliography

IRI RDF Architecture

Page 34: Using Resource Description Framework (RDF) to carry metadata for datasets

CF attributes

SWEET Ontologies

Search Terms

CF Standard Names

IRIDL Terms

NC basic attributes

IRIDL attributes

SWEET as Terms

CF Standard NamesAs Terms

Gazetteer Terms

Page 35: Using Resource Description Framework (RDF) to carry metadata for datasets

RDF is important for OpenDAP because

• By embedding OpenDAP in an RDF document, metadata (a.k.a. attributes) not understood by OpenDAP code are easily carried in a semantically-valid way

• Explicit relationships between OpenDAP variables can cleanly solve netcdf common name vs OpenDAP GRID/MAP structures, while avoiding retransmission of common independent variables

• Explicit mapping between the different data models of the different OpenDAP APIs

• Build on language support of RDF objects

Page 36: Using Resource Description Framework (RDF) to carry metadata for datasets

Embedded OpenDAP Ontology

Page 37: Using Resource Description Framework (RDF) to carry metadata for datasets

Topics/Issues

• OpenDAP and RDF: can we transport data semantics without fixing the entire schema?

• netcdf/HDF and RDF: do we need non-contextual modeling in our metadata transport/storage?

• Concepts as classes vs concepts as individuals

• Sub-classes vs sub-categories