Dissemination and visualisation of linked statistical data
– a practical approach –
Agenda
Background
Open data and linked data
RDF vocabularies for statistical data
Data visualization tool
More resources and way forward
Q&A
European Environment Agency transparency https://taskman.eionet.europa.eu/projects
open source software https://github.com/eea
open data http://www.eea.europa.eu/data-and-maps#tab-datasets exposed as RDF via Semantic Data Service:
http://semantic.eea.europa.eu/ 16.000 datasets SPARQL endpoint: http://semantic.eea.europa.eu/sparql
Global Open Data Index http://index.okfn.org/place
5 stars open data
* on the web, open license
** structured
*** non-proprietary open format
**** uses URIs to denote things
***** provides context through links to other data
http://5stardata.info/en/#costs-benefits
Linked data www.w3.org/TR/ld-glossary/#linked-data linkeddatafragments.org resolvable HTTP URIs
http://dbpedia.org/resource/Bucharest http://dbpedia.org/data/Bucharest.jsonld (application/ld+json) http://dbpedia.org/page/Bucharest (text/html)
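The three DBpedia URLs above show one resource URI with several representations, selected through the HTTP Accept header. A minimal Python sketch of that idea, using only the standard library. The helper name build_request is hypothetical, and nothing is sent over the network unless the commented lines are enabled.

```python
from urllib.request import Request

RESOURCE = "http://dbpedia.org/resource/Bucharest"

def build_request(uri: str, media_type: str) -> Request:
    """Prepare a GET request that asks for one representation of a resource."""
    return Request(uri, headers={"Accept": media_type})

# Same URI, two different representations requested.
json_ld = build_request(RESOURCE, "application/ld+json")
html = build_request(RESOURCE, "text/html")

# To actually dereference the URI (needs network access):
# from urllib.request import urlopen
# with urlopen(json_ld) as response:
#     print(response.geturl(), response.headers.get("Content-Type"))
```

With network access, the server is expected to redirect each request to the matching document (the /data/ or /page/ URL listed above), but that behaviour depends on the live service.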
Open Data Portals Linked data producers by research projects and web enthusiasts conversion of existing data to RDF increasingly adopted by the primary data producers
Open Data Portals most of them based on CKAN (by OKFN) open-data.europa.eu catalog.data.gov data.gov.ccTLD, e.g. data.gov.ro datahub.io
http://lod-cloud.net
Statistical data structurally different from other [linked] data typically distributed as datasets concerned with measures, indicators, series,
time periods, statistical/geographical regions multiple dimensions and measures generic - place, time or domain-specific
slices, aggregations, totals, denominators various RDF vocabularies for modeling datasets
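To make the terms concrete: a slice fixes one or more dimensions, and a total sums a measure over the remaining ones. A small sketch in plain Python. The observations, dimension names and values are invented toy data, and the helpers slice_cube and aggregate are hypothetical.

```python
from collections import defaultdict

# Toy observations (invented values): two dimensions and one measure.
observations = [
    {"area": "A", "period": 2014, "value": 1.0},
    {"area": "A", "period": 2015, "value": 2.0},
    {"area": "B", "period": 2014, "value": 3.0},
    {"area": "B", "period": 2015, "value": 4.0},
]

def slice_cube(obs, **fixed):
    """Keep only the observations whose dimensions match every fixed value."""
    return [o for o in obs if all(o[dim] == val for dim, val in fixed.items())]

def aggregate(obs, by, measure="value"):
    """Sum a measure, grouped by a single dimension."""
    totals = defaultdict(float)
    for o in obs:
        totals[o[by]] += o[measure]
    return dict(totals)

slice_2014 = slice_cube(observations, period=2014)   # fix the time dimension
totals_by_area = aggregate(observations, by="area")  # total over all periods
```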
XML-based standards SDMX Statistical Data and Metadata eXchange probably the most widely used standard for
statistical data exchange adopted by major producers of statistical data (ECB,
Eurostat, IMF, OECD, UNSD, UNESCO, World Bank) http://ec.europa.eu/eurostat/web/sdmx-web-services
DSPL Dataset Publishing Language can be processed by Google Public Data Explorer
RDF vocabularies Dublin Core Terms DCAT http://www.w3.org/TR/vocab-dcat catalogues, dataset metadata, distribution
VoID http://www.w3.org/TR/void linked datasets
RDF Data Cube Vocabulary http://www.w3.org/TR/vocab-data-cube built on top of existing vocabularies focused on statistical data integrates dataset metadata, structure, codelists and
observations uses the core SDMX Information Model
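As an illustration of what a single Data Cube observation looks like in RDF, the sketch below writes one as N-Triples using plain Python strings. The qb: and sdmx- terms are taken from the published vocabularies. The http://example.org/ namespace, the identifiers and the value 1.5 are placeholders, and the helper names are hypothetical.

```python
QB = "http://purl.org/linked-data/cube#"
SDMX_DIMENSION = "http://purl.org/linked-data/sdmx/2009/dimension#"
SDMX_MEASURE = "http://purl.org/linked-data/sdmx/2009/measure#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
EX = "http://example.org/"  # placeholder namespace, not a real dataset

def uri(value):
    """Format an IRI as an N-Triples term."""
    return f"<{value}>"

def decimal(value):
    """Format a typed xsd:decimal literal as an N-Triples term."""
    return f'"{value}"^^<{XSD_DECIMAL}>'

def observation_triples(obs_id, dataset, area, period, value):
    """One qb:Observation: dataset link, two dimensions, one measure."""
    subject = uri(EX + "obs/" + obs_id)
    return [
        (subject, uri(RDF_TYPE), uri(QB + "Observation")),
        (subject, uri(QB + "dataSet"), uri(dataset)),
        (subject, uri(SDMX_DIMENSION + "refArea"), uri(area)),
        (subject, uri(SDMX_DIMENSION + "refPeriod"), uri(period)),
        (subject, uri(SDMX_MEASURE + "obsValue"), decimal(value)),
    ]

def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

triples = observation_triples(
    "o1", EX + "dataset/demo", EX + "area/A", EX + "period/2015", "1.5"
)
ntriples = to_ntriples(triples)
```

A complete dataset would also need a qb:DataSet and a qb:DataStructureDefinition describing the dimensions and measures. The sketch covers only the observation.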
Semantic interoperability SPARQL - native query language of RDF knowledge bases http://worldbank.270a.info/sparql select * where { ?s ?p <http://dbpedia.org/resource/Temperature> }
http://worldbank.270a.info/classification/variable/tas
http://worldbank.270a.info/classification/variable
world-bank-climates/month-average-historical.html
SPARQL
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX property: <http://worldbank.270a.info/property/>
PREFIX d-climates: <http://worldbank.270a.info/dataset/world-bank-climates/>
select ?o ?value where {
  ?o a qb:Observation;
     qb:dataSet d-climates:month-average-historical;
     sdmx-dimension:refArea [owl:sameAs <http://dbpedia.org/resource/Romania>];
     property:variable [skos:exactMatch <http://dbpedia.org/resource/Temperature>];
     property:recurring-interval <http://reference.data.gov.uk/def/intervals/March>;
     property:month-average ?value
}
http://worldbank.270a.info/dataset/world-bank-climates/month-average-historical/1900-2009/03/RO/tas "3.781612"^^<http://www.w3.org/2001/XMLSchema#decimal>
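Queries like these can also be sent programmatically. The sketch below assumes the endpoint follows the standard SPARQL Protocol (GET with a query parameter) and returns the standard SPARQL JSON results format; this has not been checked against the live service. The helper names are hypothetical. The request is only built, not sent, and the sample response is hand-written from the observation URI and value shown in the slides.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://worldbank.270a.info/sparql"
QUERY = "select * where { ?s ?p <http://dbpedia.org/resource/Temperature> }"

def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})

def rows(results_json):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    doc = json.loads(results_json)
    names = doc["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in doc["results"]["bindings"]
    ]

request = build_sparql_request(ENDPOINT, QUERY)

# Hand-written sample in the SPARQL JSON results shape (not a live response).
SAMPLE = json.dumps({
    "head": {"vars": ["o", "value"]},
    "results": {"bindings": [{
        "o": {
            "type": "uri",
            "value": "http://worldbank.270a.info/dataset/world-bank-climates/"
                     "month-average-historical/1900-2009/03/RO/tas",
        },
        "value": {
            "type": "literal",
            "datatype": "http://www.w3.org/2001/XMLSchema#decimal",
            "value": "3.781612",
        },
    }]},
})
parsed = rows(SAMPLE)
```

To run it against the endpoint, pass request to urllib.request.urlopen and feed the decoded response body to rows.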
Observation page
Romania page
http://digital-agenda-data.eu European Commission, DG for Communications
Networks, Content & Technology Digital Agenda, Europe 2020 strategy
https://ec.europa.eu/digital-agenda/
The Digital Economy & Society Index (DESI)
User requirements
target statistical data / Data Cube Vocabulary
non-technical audience
simple and intuitive navigation
lots of explanatory notes, labels and metadata
100+ indicators inside a single large dataset
no SPARQL
embedding, export, share, bookmark, etc.
moderated comments
good-looking charts
Dataset metadata
Dataset contents
Sample column chart
Visualisations
Chart configurator
Chart configurator type of chart single/multiple selection for dimensions series layout of filters order and grouping in each filter sorting of values chart titles, tooltips, explanatory texts legend and metadata from code lists and more...
Chart configurator
Hierarchical code list
Sample column chart
Sample line charts
Additional explanations and user interaction
Navigation widget
Digital Economy and Society Index
http://digital-agenda-data.eu Digital Agenda Key Indicators 170 indicators, 700k observations
Digital Economy and Society Index 50 indicators, 8k observations
Lead Indicators for DG Connect policy priorities 31 indicators, 35k observations
Other resources Use Cases and Lessons for the Data Cube
Vocabulary http://www.w3.org/TR/vocab-data-cube-use-cases
Technical information http://digital-agenda-data.eu/documentation
Try it test.digital-agenda-data.eu github.com/digital-agenda-data Vagrant build
Linked statistical data – our experience
good: URIs break the data silos of statistical offices; cross-dataset queries; metadata and methodology
not so good: aggregate values are missing from the RDF Data Cube vocabulary; size of the RDF serialized form; performance
challenging: generating a valid DSD; data maintenance
Find out more joinup.ec.europa.eu interoperability solutions for public administrations share and reuse communities, guidelines, source code
SEMIC 2016 Semantic Interoperability Conference Rome, Italy – 12 May 2016
StatDCAT application profile for data portals joinup.ec.europa.eu/asset/stat_dcat_application_profile/home