A water information R & D alliance between the Bureau of Meteorology and CSIRO’s Water for a Healthy Country Flagship Vocabulary Services, RDF, SKOS and REST Peter Fitch

Post on 11-Jan-2016


A water information R & D alliance between the Bureau of Meteorology and CSIRO’s Water for a Healthy Country Flagship

Vocabulary Services, RDF, SKOS and REST

Peter Fitch

Outline

• Outline the problem
• Background on
  • Linked data
  • RDF
  • SKOS
  • REST – Linked Data API
• Vocabulary Service
  • What is it
  • How to develop a vocabulary service
• Test case with USGS code list
• Demo

Warning

Frequent use of XML

Motivations

"Xlink is all well and good, but the real problem is what is at the end of the link and how to use it."

"Agreed, and I wish I knew more about the semantic technologies."

The need for semantic context

From Lemon OSDM Linked Data workshop 2010

• Semantic Context
  • Black and White
  • Bessie
  • Good Milker

Machines need it too.

From Lemon OSDM Linked Data workshop 2010

Information Needed

• Internal structure – the information model
• Supported functions – the operations
• Semantics
  • What are the concepts
  • What are the vocabularies
  • How are they related
  • Where are they defined
• Where did it come from?
• How was it created?

Current Metadata

Adapted from Lemon OSDM Linked Data workshop 2010

Semantic Context

The need for semantic information in Hydro-Domain data exchange

Don’t Information Models solve the problem?

Take a closer look

http://www.bom.gov.au/std/water/xml/wio0.2/property//bom/WaterCourseLevel_m

The O word

You need an ontology!

"O what? I know one O word and it's not that. I'd better find out more."

4 Rules of Linked Data (Tim Berners-Lee) – key take-home!

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

4. Include links to other URIs, so that they can discover more things.

Comment:

By following rule 3 we might be able to get some useful information, but we still need semantic context.

Tim Berners-Lee http://www.w3.org/DesignIssues/LinkedData.html
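
Rules 2 and 3 can be illustrated with a small sketch: a client looks up an HTTP URI and asks for an RDF representation through the Accept header. This is only an illustration using the Python standard library; the URI is a hypothetical placeholder, the media-type preference is an assumption, and the request is built but not sent.

```python
from urllib.request import Request

def rdf_lookup_request(uri: str) -> Request:
    """Build (but do not send) a GET request that prefers RDF representations."""
    # Prefer Turtle, fall back to RDF/XML (an assumed preference order).
    accept = "text/turtle, application/rdf+xml;q=0.9"
    return Request(uri, headers={"Accept": accept}, method="GET")

# Hypothetical URI naming a "thing"; not a real service.
req = rdf_lookup_request("http://example.org/id/thing/1")
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual lookup against a server that supports content negotiation.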

Linked data quality scheme (Sir Tim Berners-Lee)

Rating Description

★ Available on the web (whatever format), but with an open license

★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)

★★★ As (2), plus non-proprietary format (e.g. CSV instead of Excel)

★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff

★★★★★ All the above, plus: Link your data to other people’s data to provide context

Tim Berners-Lee http://www.w3.org/DesignIssues/LinkedData.html

Intro to RDF

• RDF is a data model for describing resources
• Resource Description Framework

Subject – Predicate → Object
(Things) – (Have properties) → (Property Value)

• The object of one statement can become the subject in another.
• The set of linked statements forms a directed graph.
• Subject, Object and Predicate are all Resources*

A set of Subject, Predicate, Object entities is called a Triple.

RDF Example

Remember: Resources are URIs

Peter – http://www.csiro.au/people/PeterFitch.html

hasColleague – http://mydefinitions/definitions#hasColleague

Nate - http://cida.usgs.gov/professional-pages/booth.html

Peter – hasColleague → Nate

Subject: Peter; Predicate: hasColleague; Object: Nate
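
As a rough illustration of the triple model, the statement above can be held as a (subject, predicate, object) tuple and a list of such tuples walked as a directed graph. This is a minimal sketch, not an RDF library: the predicate URI is the slide's placeholder, and the second statement and its URI are hypothetical, added only to show an object becoming a subject.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

PETER = "http://www.csiro.au/people/PeterFitch.html"
NATE = "http://cida.usgs.gov/professional-pages/booth.html"
HAS_COLLEAGUE = "http://mydefinitions/definitions#hasColleague"  # placeholder URI
THIRD = "http://example.org/people/ThirdPerson"  # hypothetical resource

graph = [
    Triple(PETER, HAS_COLLEAGUE, NATE),
    Triple(NATE, HAS_COLLEAGUE, THIRD),  # hypothetical: object of one, subject of another
]

def outgoing(graph, subject):
    """Return the (predicate, object) pairs stated about a subject."""
    return [(t.predicate, t.object) for t in graph if t.subject == subject]

def reachable(graph, start):
    """Follow statements from start and collect every resource reached."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, obj in outgoing(graph, node):
            if obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen

print(outgoing(graph, PETER))
print(reachable(graph, PETER))
```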

RDF Landscape

• RDF – basic resource descriptions
• RDFS – express resources as classes, with properties and class relationships
• OWL (Web Ontology Language) – exact description and relationships
• SKOS (Simple KOS) – simple description and relationships
• SPARQL – RDF query

Diagram labels: Expressivity; Basic Building Blocks

Intro to SKOS

• SKOS: Simple Knowledge Organization System. A KOS provides semantic context.
• Built on RDF and RDFS
• Designed to bridge the current chaotic, poorly described web and the full semantic web (OWL).
• See the SKOS primer at http://www.w3.org/TR/skos-primer
• Limited vocabulary, e.g.:
  • skos:ConceptScheme
  • skos:Concept
  • skos:prefLabel
  • skos:scopeNote
• And some limited standard relationships
  • skos:exactMatch
  • skos:narrower, skos:broader
• Allows for limited inference
• Because of its limited vocabulary, really useful for thesauri, classification lists, taxonomies etc.
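
One example of the limited inference SKOS allows: skos:broader and skos:narrower are inverses, so one can be derived from the other. The sketch below shows that single rule over plain tuples; the concept URIs are hypothetical and no RDF library is assumed.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BROADER = SKOS + "broader"
NARROWER = SKOS + "narrower"

EX = "http://example.org/vocab/"  # hypothetical vocabulary namespace

triples = {
    (EX + "turbidity", BROADER, EX + "physical-property"),
    (EX + "temperature", BROADER, EX + "physical-property"),
}

def infer_inverse(triples):
    """Add the inverse statement for every skos:broader / skos:narrower triple."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == BROADER:
            inferred.add((o, NARROWER, s))
        elif p == NARROWER:
            inferred.add((o, BROADER, s))
    return inferred

closed = infer_inverse(triples)
narrower = sorted(o for s, p, o in closed
                  if s == EX + "physical-property" and p == NARROWER)
print(narrower)
```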

SPARQL Queries

Purpose: query an RDF triple store; works by matching triples to patterns.

Example:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept
WHERE { ?concept rdf:type skos:Concept }

Returns all resources that are of rdf:type skos:Concept.

Other queries

CONSTRUCT – returns an RDF graph

ASK – returns a boolean indicating whether the pattern is matched

DESCRIBE – returns a graph describing a resource.
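
To make "matching triples to patterns" concrete, here is a toy matcher in Python. It is not a SPARQL engine: it handles a single triple pattern, treats terms starting with "?" as variables, and runs over hypothetical sample data written with prefixed names as plain strings.

```python
# Hypothetical sample data; prefixed names are just strings here.
triples = [
    ("ex:c1", "rdf:type", "skos:Concept"),
    ("ex:c1", "skos:prefLabel", "First concept"),
    ("ex:scheme", "rdf:type", "skos:ConceptScheme"),
    ("ex:c2", "rdf:type", "skos:Concept"),
]

def match(pattern, triples):
    """Yield variable bindings for each triple that fits the pattern."""
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must bind to the same value everywhere it appears.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings

def ask(pattern, triples):
    """Rough analogue of ASK: True if at least one triple matches."""
    return next(match(pattern, triples), None) is not None

# Rough analogue of: SELECT ?concept WHERE { ?concept rdf:type skos:Concept }
concepts = [b["?concept"] for b in match(("?concept", "rdf:type", "skos:Concept"), triples)]
print(concepts)
print(ask(("ex:scheme", "rdf:type", "skos:ConceptScheme"), triples))
```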

Intro to Linked Data API

• Familiar with RESTful services, right?
• The LD API is designed as a bridge between the complexity of SPARQL endpoints and a standard REST API
• Provides standard URI matching patterns and additional specification for behaviors, e.g.:
  • /doc/school/12345 should respond with a document that includes information about /id/school/12345
  • /doc/school should respond with a document of schools
  • /doc/school/12345.json should respond with a JSON document.
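
The URI patterns above could be routed along these lines. This is a sketch of the idea only, not the Linked Data API specification: the regular expression, the supported extensions and the default format are all assumptions made for illustration.

```python
import re

# Assumed mapping from URI extension to response media type.
MEDIA_TYPES = {
    "json": "application/json",
    "ttl": "text/turtle",
    "rdf": "application/rdf+xml",
}

# /doc/{kind}[/{id}][.{ext}]
DOC_PATTERN = re.compile(
    r"^/doc/(?P<kind>[^/.]+)(?:/(?P<id>[^/.]+))?(?:\.(?P<ext>\w+))?$"
)

def resolve(path, default_ext="ttl"):
    """Map a /doc/... path to (view, described /id/ URI, media type)."""
    m = DOC_PATTERN.match(path)
    if m is None:
        raise ValueError(f"unsupported path: {path}")
    ext = (m.group("ext") or default_ext).lower()
    if ext not in MEDIA_TYPES:
        raise ValueError(f"unsupported format: {ext}")
    kind, item_id = m.group("kind"), m.group("id")
    view = "item" if item_id else "list"
    about = f"/id/{kind}/{item_id}" if item_id else None
    return view, about, MEDIA_TYPES[ext]

print(resolve("/doc/school/12345"))
print(resolve("/doc/school"))
print(resolve("/doc/school/12345.json"))
```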

Intro to linked data API

URI pattern → SPARQL for result set → SPARQL for view on result set → Response in RDF, Turtle, etc.

Vocabulary Service

• In the semantic web, a vocabulary is defined as a set of URIs

• Functionally we want:
  • Ability to look up definitions of terms and/or code lists – skos:Concept, skos:definition, skos:prefLabel
  • Ability to resolve synonyms – skos:exactMatch, skos:broader, skos:narrower
  • Ability to deal with different languages – skos:prefLabel with a language tag, e.g. lang=en
  • Standard API – Linked Data API and REST
  • Standard information/data model – SKOS
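
The lookup functions listed above might look roughly like this against an in-memory store. The concept, its labels, its definition and the external URI are hypothetical sample data, and a real service would query a triple store rather than a dictionary.

```python
# Hypothetical in-memory vocabulary keyed by concept URI.
concepts = {
    "http://example.org/vocab/turbidity": {
        "prefLabel": {"en": "Turbidity", "fr": "Turbidité"},
        "definition": {"en": "Cloudiness of water caused by suspended particles."},
        "exactMatch": ["http://example.org/other-vocab/turb"],
    },
}

def pref_label(uri, lang="en", fallback="en"):
    """Return the preferred label in the requested language, else the fallback."""
    labels = concepts[uri]["prefLabel"]
    return labels.get(lang, labels.get(fallback))

def resolve_synonym(uri):
    """Return the local concept URI for uri, following skos:exactMatch links."""
    if uri in concepts:
        return uri
    for local, data in concepts.items():
        if uri in data.get("exactMatch", []):
            return local
    return None

local = resolve_synonym("http://example.org/other-vocab/turb")
print(local, pref_label(local, "fr"), pref_label(local, "de"))
```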

Simon Cox vocab proposal

Proposal by S. Cox https://www.seegrid.csiro.au/wiki/Siss/VocabularyService3

Vocab development process

1. Select code list or vocab for service

2. Map code list to SKOS

3. Check code list for web compatibility and harmonise with other code lists or vocabs

   a. e.g. use a standard units vocab

   b. remove any non-conforming content

4. Convert code list to SKOS RDF

5. Validate RDF using W3C RDF validator

6. Import to Triple Store

7. Publish Service

8. Use: Link to in documents!

Case study: USGS Parameter Code List – Proof of Concept

• Code List is a CSV table of parameter codes.

Mapping code list to SKOS

Parameter Code         SKOS Mapping
Parameter Code List    skos:ConceptScheme
Parameter Code         skos:Concept
Group Name             skos:broader
Parameter Name         skos:prefLabel
cas Name               skos:exactMatch
srs Name               skos:broadMatch
Units                  usgs:hasUnits (needs an additional relationship)

Content conformance-harmonization

• Issues
  • Need skos:definition – use Parameter Name?
  • Invalid characters for web in Parameter Name, e.g. &, <, > must be escaped as &amp;, &lt;, &gt;
  • Units – non-standard representation, e.g. Mi2 (square miles), mgd (million gallons per day), %, nu (number of bad characters TX by DCP)
    • Fix up
    • Leave as a literal?
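
The character issue can be handled at conversion time by escaping the parameter name before it is written into the RDF/XML. A minimal sketch using the Python standard library; the parameter name is a made-up example, not an entry from the USGS list.

```python
from xml.sax.saxutils import escape

def pref_label_element(name: str) -> str:
    """Wrap a parameter name in skos:prefLabel, escaping &, < and >."""
    return f"<skos:prefLabel>{escape(name)}</skos:prefLabel>"

# Hypothetical parameter name containing characters that are invalid in XML text.
element = pref_label_element("Sample A & B, water, <0.45 um filter")
print(element)
```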

Comments on code list

• Conflation of information
  • Chromium(VI), water, unfiltered, recoverable, micrograms per liter
    • Observable phenomenon – Chromium(VI)
    • Procedure – unfiltered/recoverable
    • Media – water
    • Units – ug/L
  • Phosphorus, suspended sediment, total digestion, dry weight, percent
    • Observable phenomenon – Phosphorus
    • Procedure – total digestion (but not linked to a standard method)
    • Media – suspended sediment
    • Units – dry weight percent
• Some meaningless codes – Precipitation, cumulative at given time, location 6, inches

Duplication

• Turbidity, water, unfiltered, broad band light source (400-680 nm), detection angle 90 +/- 30 degrees to incident light, nephelometric turbidity units (NTU)
• Turbidity, water, unfiltered, laboratory, Hach 2100AN, nephelometric turbidity units
  • USEPA method 180.1???
  • Are they the same?
• Why not link to standard methods?
• Needs work by domain experts to resolve.

Conversion to SKOS – Excel2Skos

• .NET utility to convert a table into SKOS using NVelocity

Spreadsheet

Template

Office Interop

Formatter

Excel2Skos

NVelocity Template – Mapping to SKOS

## Emits one skos:Concept per spreadsheet row
#foreach($row in $excelsheet)
<skos:Concept rdf:about="$Globals.get_item("URI-Base")/$row.get($code)">
    <skos:inScheme rdf:resource="$Globals.get_item("URI-Base")"/>
    <skos:definition>Definition of parameter code $row.get($code)</skos:definition>
    <skos:prefLabel>$row.get($name)</skos:prefLabel>
    ## Cross-vocabulary links are written only when the source column is non-empty
    #if($row.get($casrn) != "")
    <skos:exactMatch rdf:resource="http://casrn.namespace/$row.get($casrn)"/>
    #end
    #if($row.get($srsname) != "")
    <skos:broadMatch rdf:resource="http://srs.namespace/$row.get($srsname)"/>
    #end
    ## Note: skos:broader normally points at a concept URI via rdf:resource; the group name is passed here as a literal
    <skos:broader rdfs:literal="$row.get($groupname)"/>
    <usgs:preferredUnit rdf:resource="http://usgs.gov/vocabularies/units#$row.get($units)"/>
</skos:Concept>
#end

• Classes passed in
  • Globals – ConceptScheme definitions
  • excelsheet – 2D table of values
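
For readers without the .NET tooling, the same row-to-concept mapping can be sketched with the Python standard library. This is an illustration, not the Excel2Skos utility: the base URI, the column names and the sample rows are hypothetical, only a subset of the template's properties is emitted, and http://casrn.namespace/ is the placeholder namespace from the template.

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)

URI_BASE = "http://example.org/vocab/ParameterCodeList"  # hypothetical base URI

# Hypothetical rows standing in for the code list (columns: code, name, casrn).
SAMPLE_CSV = """code,name,casrn
P0001,"Sample parameter A & B, water",
P0002,Sample parameter C,0000-00-0
"""

def rows_to_skos(csv_text: str) -> str:
    """Convert code-list rows into an RDF/XML document of skos:Concept elements."""
    root = ET.Element(f"{{{RDF}}}RDF")
    for row in csv.DictReader(io.StringIO(csv_text)):
        concept = ET.SubElement(
            root, f"{{{SKOS}}}Concept",
            {f"{{{RDF}}}about": f"{URI_BASE}/{row['code']}"},
        )
        ET.SubElement(concept, f"{{{SKOS}}}inScheme", {f"{{{RDF}}}resource": URI_BASE})
        # ElementTree escapes &, < and > in text when serialising.
        ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = row["name"]
        if row["casrn"]:  # only link when the column is non-empty
            ET.SubElement(
                concept, f"{{{SKOS}}}exactMatch",
                {f"{{{RDF}}}resource": "http://casrn.namespace/" + row["casrn"]},
            )
    return ET.tostring(root, encoding="unicode")

rdf_xml = rows_to_skos(SAMPLE_CSV)
print(rdf_xml)
```

Building the document through an XML API rather than string templating means character escaping is handled by the serialiser, which sidesteps the invalid-character issue noted earlier.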

Conversion to SKOS

Converting list to RDF

RDF Validation: http://www.w3.org/RDF/Validator/

Import to Triple store

Test Services

• Developed REST services using Microsoft WCF, dotNetRDF and NVelocity libraries

• Test API:

/Vocab/ParameterCodeList – responds with a document of skos:ConceptScheme

/Vocab/ParameterCodeList/ParameterCode – should respond with the first page of parameter codes (my implementation returns all!)

/Vocab/ParameterCodeList/ParameterCode/{ID}

Process reminder

USGS Code List → Harmonise and Map → Excel2RDF → RDF Validator (validate RDF) → Load → Sesame RDF triple store (SPARQL API) → dotNetRDF → WCF REST test services

http://localhost:8080/VocabService/ParameterCodes/

Demo

Next Steps

• Try the AuScope tooling
  • The process is the same; it uses the Sesame RDF store
  • Has different tooling for Excel to RDF
  • Different service interface; LD not quite ready
• If we have time, we should set up a test service before I leave.
• Below is an example of what LD vocabs in WaterML2.0 might look like.

Conclusions

• The need for semantic context to assist with data integration is pressing.

• Vocabularies are foundation services and need to be put in place for data mediation.

• Technologies and approaches are now mature enough to use
  • RDF, SKOS, SPARQL, LD API

• There is tooling available through AuScope, but it needs assessment.

• USGS & CIDA have the opportunity to make a range of standard vocabularies available for the hydro community.

Pillars and foundations of Interoperability

System of Systems Interoperability

• Identity and Registration
• Service Standards
• Application Schema
• Network Standards
• Community Profiles
• Feature Catalogs
• Agreed Vocabularies and Ontologies
• Semantic Brokering

Final word

"I don’t look it, but I’m so happy, I know what is at the end of the xlink!"

Thank you

Peter Fitch
Program Leader, Environmental Information Systems
Phone: +61 2 6246 5763
Email: [email protected]
Web: www.csiro.au/clw/eis

Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: [email protected]
Web: www.csiro.au

Lessons of Climate Gate

• Theft of e-mails from UEA, Nov 2009
• E-mails indicated manipulation of data, and suppression of raw data
• Investigations found
  • methods disorganised
  • bunker mentality
  • lack of transparency
• Researchers promised to
  • improve scientific data management
  • open access to data
  • improve transparency

Climatic Research Unit, University of East Anglia

From RDF Primer W3C

<rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
  <ex:editor>
    <rdf:Description>
      <ex:homePage>
        <rdf:Description rdf:about="http://purl.org/net/dajobe/">
        </rdf:Description>
      </ex:homePage>
    </rdf:Description>
  </ex:editor>
</rdf:Description>

SWT in other domains

Eco-Informatics

Bio-Informatics

HWB – WIRADA Symposium August 2011

Terminology

• Activity – single process block which can perform a useful task and which can be linked to another process block

• Workflow – a linked set of process blocks

HWB – WIRADA Symposium August 2011