Christophe Guéret: Publish Web data - an interactive session
Data Archiving and Networked Services
Publishing data on the Web
Christophe Guéret (@cgueret)
Evolution and variation of classification systems March 4-5, 2015 Amsterdam
Publishing data on the Web
● It's easy! Everybody does it!
– … in very different ways :-/
● Several issues, even if not very big ones:
– Several competing standards and formats
– Data hard to compare across sources
– Lack of documentation
– Limited capacities to assess trust
– Missing dialog publisher ↔ consumer
– etc
http://www.w3.org/2013/dwbp/
● W3C working group “Data on the Web best practices”
● Part of the Data Activity
– Also in this activity: Working group for CSV on the Web
● Chartered until July 2016, running since January 2014
● Focus on defining best practices for publishing and using open data via the Web
– Agnostic to technologies
– Scope: government data, research data, cultural heritage data
Goals
● Publish a set of best practices for publishing and consuming open data
– and supporting list of use-cases and requirements
● Publish a vocabulary for quality and granularity description
● Publish a vocabulary for data usage description
Plans for the remaining time
● Go quickly through the best practices
● Split up in groups of 3 or 4 persons
● Each group reviews the best practices (BP) and says what is missing, what should be deleted, what should be added, …
– Write everything on post-its!
● We collect and cluster the input. This will be reported back to the group on Friday
Grouped in topics (1/4)
● Metadata
– What kind of metadata should be considered when describing data on the Web?
– How can metadata be provided in a machine readable way?
● Data Identification
– How can unique identifiers be provided for data resources?
– How should URIs be designed and managed for persistence?
● Data Formats
– What kind of data formats should be considered when publishing data on the Web?
(List based on https://www.linkedin.com/pulse/open-data-standards-steven-adler )
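As one possible answer to the machine-readable metadata question, here is a minimal sketch that describes a dataset as JSON-LD using terms from the W3C DCAT vocabulary and Dublin Core Terms. The working group is technology-agnostic, so this is only an illustration: the dataset, its URIs and all values are hypothetical placeholders, and `to_jsonld` is a helper name invented for this example.

```python
import json

# Hypothetical dataset description: every URI and value is a placeholder.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "http://example.org/dataset/classifications",
    "@type": "dcat:Dataset",
    "dct:title": "Example classification data",
    "dct:description": "Placeholder description of the dataset.",
    "dct:issued": "2015-03-04",
    "dcat:distribution": [
        {
            "@type": "dcat:Distribution",
            "dcat:downloadURL": {
                "@id": "http://example.org/dataset/classifications.csv"
            },
            "dcat:mediaType": "text/csv",
        }
    ],
}


def to_jsonld(description: dict) -> str:
    """Serialise a metadata description to a JSON-LD string."""
    return json.dumps(description, indent=2, sort_keys=True)


print(to_jsonld(dataset))
```

The same description could equally be published as Turtle or embedded in an HTML page; JSON-LD is used here only because it needs nothing beyond the standard library.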
Grouped in topics (2/4)
● Data Vocabularies
– How can existing vocabularies be used to provide semantic interoperability?
– How can a new vocabulary be designed if needed?
● Data Licenses
– How can data licenses be made machine readable?
– How can license information about data published on the Web be provided/gathered?
● Data Provenance
– How can data provenance information about data published on the Web be provided/gathered?
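To make the license questions concrete, here is a small sketch, assuming dataset metadata is kept as a JSON-LD-style dictionary: the publisher links the dataset to a license URI with the Dublin Core `dct:license` property, and a consumer reads it back. The dataset URI and the helper names `with_license` and `license_of` are hypothetical; the license URI is the Creative Commons Attribution 4.0 one.

```python
CC_BY_4 = "https://creativecommons.org/licenses/by/4.0/"

# Hypothetical, minimal dataset description (placeholder URI).
dataset = {
    "@context": {"dct": "http://purl.org/dc/terms/"},
    "@id": "http://example.org/dataset/classifications",
    "dct:title": "Example classification data",
}


def with_license(description: dict, license_uri: str) -> dict:
    """Return a copy of the description with a dct:license link added."""
    updated = dict(description)
    updated["dct:license"] = {"@id": license_uri}
    return updated


def license_of(description: dict):
    """Return the license URI of a description, or None if none is stated."""
    value = description.get("dct:license")
    if isinstance(value, dict):
        return value.get("@id")
    return value


licensed = with_license(dataset, CC_BY_4)
print(license_of(licensed))
```

Pointing to a license by URI, rather than pasting its text, is what lets a consumer compare licenses across sources automatically.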
Grouped in topics (3/4)
● Data Quality
– How can data quality information about data on the Web be provided/gathered?
● Sensitive Data
– How can data be published without infringing a person's right to privacy or an organization's security?
● Data Access
– What kind of data access should be considered when publishing data on the Web?
– What requirements should be taken into account when deciding how to make data available on the Web?
Grouped in topics (4/4)
● Data Versions
– How can different versions of a dataset be tracked and managed?
● Data Preservation
– How can publishers decide when and how data on the Web should be archived?
● Feedback
– How can user feedback about data consumed from the Web be gathered?