20-3-2017

ESS Workshop Dissemination of Official Statistics as open data

ESS Workshop: dissemination of official statistics as open data Malta 2017

The information and views set out in the report are those of the author(s) and do not necessarily reflect the official opinion of the European Union. Neither the European Union institutions and bodies nor any person acting on their behalf may be held responsible for the use which may be made of the information contained therein.

Authors:

Oscar CORCHO (Full Professor at the Department of Artificial Intelligence (Universidad Politécnica de Madrid))

Evangelos KALAMPOKIS (Research fellow with the Information Technologies Institute of the Centre for Research & Technology - Hellas (CERTH-ITI))

Eoin MACCUIRC (Webmaster at the Data dissemination Unit of CSO Ireland)

Joan Miquel PIQUÉ (External expert of DevStat)

Paola VOTTA (DevStat)

Table of Contents

Introduction
1. Key conclusions and Recommendations from the Workshop
1.1 Opportunities and Use Cases for the adoption of (Linked) Open Data in Official Statistics
1.2 The current landscape of challenges and tools
1.3 Strategy and Policy for an effective (Linked) Open Data adoption in official statistics
Annexes
Annex 1: Programme of the Workshop
Annex 2: Description of workshop sessions
2.1 Opening Session
2.1.1 Summary of presentations
2.2 Session 1: Opportunities and Use Cases
2.2.1 Objective of the session
2.2.2 Summary of presentations
2.3 Session 2: Challenges and Tools
2.3.1 Objective of the session
2.3.2 Summary of presentations
2.4 Session 3: Strategy and Policy
2.4.1 Objective of the session
2.4.2 Summary of presentations
2.5 Closing Session
Annex 3: Methodology of the Group Discussions
3.1 Objective of the group discussions
3.2 Methodology
3.2.1 Session 1 - Activity: Enriched Double SWOT analysis
3.2.2 Session 2 - Activity: Lotus Flower
3.2.3 Session 3 - Activity: From…to

Introduction

This document describes the findings and conclusions of the ESS workshop on dissemination of official statistics as open data, held in St. Julian's (Malta) on 18-19 January 2017.

Open data is an emerging discipline with a huge potential for the (re-)dissemination of official statistics. There are numerous questions for the statistical community to address. The workshop aimed to bring together various stakeholders to explore and prepare the future orientation of the ESS in this area of work. 56 delegates from 27 countries, including representatives from private companies and academia, came together to learn about the ESS vision of disseminating official statistics as (Linked) Open Data, and to discuss the potential benefits of such an approach, the limitations and challenges for a broader uptake, the key ingredients involved in the different topics to be addressed, and the steps to be taken to achieve this vision.

The event was preceded by a preparatory meeting in Luxembourg, held on 28 November 2016, which brought together a group of 25 experts from Eurostat, the Office for Official Publications of the European Communities, several national statistical offices, and two private companies (Devstat and PriceWaterHouseCoopers). During this meeting, several discussions took place and agreements were reached, such as the selection of three Proofs of Concept (PoC)¹ to continue focusing on, from those proposed in the initial study done by PwC. The first PoC selected was on using LOD as a vehicle for the harmonisation, standardisation and management of classifications handled by national statistical offices; the second was on the use of Linked Data principles for the publication of data and metadata from national statistical offices; and the last was on using LOD technologies for the integration of multiple data sets from multiple data providers.

In addition, several needs were identified: the need for clear guidelines for the adoption of an LOD approach in official statistics, the need to study better the benefits that LOD provides in this area, the need to characterise further the users of such data, and the need to address the perceived complexity of linked data across data providers and users, including the development of strategy roadmaps, recommendations about the governance of the generated LOD and the generation of technical guidelines.

¹ A proof of concept is considered to be an experiment or pilot project that demonstrates the feasibility of an approach or concept. In this case, PoCs will demonstrate the feasibility of using Linked Open Data for the dissemination and exploitation of official statistics.

The results of the preparatory meeting shaped the ESS Workshop, which was structured around three main sessions:

• Session 1: Opportunities and Use cases

• Session 2: Challenges and tools

• Session 3: Strategy and policy

Each session combined expert lectures around the main topic, presentations of national experiences and group discussions.

The expert lectures and presentations aimed to show the key ingredients and opportunities that open data in general, and Linked Open Data in particular, offer for the dissemination of official statistics, as well as the organisational and technological support and challenges associated with such adoption. They also illustrated examples of the work already done by different Member States, which in some cases is already deployed and available in their production systems.

These presentations also identified challenges and barriers for a wider and faster adoption of these approaches in national statistical institutes (NSIs) from EU member and associate states. These benefits, challenges and barriers were all reflected in the discussions held in the group discussion activities.

The group discussions (organised as round tables) were based on different activities designed to let participants first develop their ideas individually and then share them with the other participants in order to define a common view. At the end of every activity, a representative of every team presented to the whole audience a short conclusion of the discussion, the results, and the common agreements reached.

To make it easier to read and use, this report first presents the key conclusions from the Workshop in the following chapter. The Workshop sessions are then described in Annex 2, summarising the presentations and explaining the methodology used for the group activities under each session. Annex 1 contains the programme of the Workshop.

All presentations are available on the CROS portal².

² http://ec.europa.eu/eurostat/cros/content/presentations-11_en

1. Key conclusions and Recommendations from the Workshop

The main conclusions from the workshop are the following:

• There is a shared understanding of the benefits of LOD among NSIs that have experimented with it. LOD supports more flexible means of data dissemination, enhances data exploration across datasets and enables linking with other sources (e.g. within a national statistical system) while keeping the information on data provenance. Indirect benefits are that LOD projects foster internal coherence of data and metadata, reinforce the role of NSIs as standard setters and stimulate partnerships.

• LOD is an area in which NSIs are still largely experimenting. It is not yet perceived as mature for full production but there was broad agreement on the advantages of developing further steps in a coordinated way at ESS level. This can best be achieved through concrete results and pilots to further demonstrate the feasibility and benefits from LOD. The following priorities were identified:

o There is a need to build capacities at NSI and ESS level, through training, common pilot projects and collaboration across multidisciplinary teams (IT, dissemination, content and classifications).

o Governance is a key element in LOD. Common governance approaches and processes for LOD should be developed collaboratively and embedded in existing structures of the ESS, Eurostat and NSIs. The existing governance structures and communities will first be examined.

o The technology is available but the ESS would benefit from an evaluation and selection of standard tools and from guidelines concerning performance issues.

o The ESS should liaise more systematically with standards setters beyond the EU, and learn from experiments outside the EU (e.g. Australia, Japan).

This section presents a brief analysis of the discussions held during the working groups and the results achieved by each activity, which will help in setting up the communication, training and work strategy to achieve the ESS vision 2020 for improving dissemination of official statistics. The event was structured around three sessions, each of them characterised by a different activity.

1.1 Opportunities and Use Cases for the adoption of (Linked) Open Data in Official Statistics

The conclusions presented in this section are based on the analysis of the results of the group activity carried out during session 1. This activity was intended to establish a framework for visualising the key elements needed to start developing actions, setting a starting point with regard to different perspectives on the situation and reaching a consensus about causes, effects and main challenges.

The group discussion on the enriched SWOT analysis, in which participants were organised in seven tables, produced seven consolidated enriched SWOTs. These SWOTs were analysed after the meeting, and the following main conclusions can be drawn (as a summary SWOT):

• Strengths are generally shared across all participants, who understand the main benefits of adopting a (Linked) Open Data policy: better interoperability, improving access to data and metadata, the possibility of handling multiple formats and standardisation.

• Weaknesses are also identified and generally shared across tables, including the lack of appropriate skills for data production and dissemination inside NSIs, and for data (re)use by users, the perceived immaturity of tooling despite existing success stories, the lack of clear guidelines and how-to, and lack of awareness and buy-in from top management in some NSIs.

• In terms of opportunities, such an approach is perceived as a good way to give more visibility and relevance to the work done by NSIs, and to address some of the current limitations, for example by providing more trust, making it possible to deal with provenance, improving data literacy overall and opening up new opportunities for business and society.

• Threats identified were: the risk that this is just a fashionable approach that may not last, and the possibility of private companies releasing data if NSIs do not do it quickly, with NSIs hence losing relevance in the data value chain landscape.

Many of the points raised in the SWOTs were accompanied by proposals for measures and actions that would be required, according to participants, in order to boost and enhance strengths and opportunities, as well as to mitigate the potential negative effects of weaknesses and threats. Most of these proposals were oriented towards the need and opportunity for good training, for the development of good simple use cases that would act as a showcase and allow easier replication, for the establishment of good technological, organisational and, in general, governance guidelines for all to follow, and for the common development of a joint fabric of utilities and software to be reused and adapted throughout the NSIs.

1.2 The current landscape of challenges and tools

The second group activity focused on identifying the main ingredients needed to address challenges identified in the previous session: (1) Interoperability to join data silos and standardisation, (2) Codelists, (3) Technical complexity of Linked Data for Users, (4) Technical complexity of Linked Data for the Producers (skills gap), (5) Naming policies and handling of temporal references and versions in identifiers and data, (6) Reputational risk, data quality and confidentiality, (7) Risk of low performance / immature tooling.

This activity was performed using the Lotus Flower technique, with the main objective of identifying the main places where existing technologies and approaches can help, and those where more investment is needed.

Given that each table discussed a different topic, it is hard to provide a consolidated view on all the ingredients that were addressed. However, they may be grouped into the following three main areas:

• Governance/organisational: need for standardisation (common vocabularies, level of aggregation, metadata, common codelists, etc.), need to understand better users and their needs, need to start with low hanging fruit first, need to establish clear naming policies, need to address confidentiality, need to maintain visibility of NSIs’ work.

• Technological: use of cloud services, create catalogues of mature tools, ensure the sustainability of tools and create a common set of tools to be used by all NSIs.

• Training: need for documentation (developer-oriented, user-oriented, management-oriented), guidelines and cookbooks.

1.3 Strategy and Policy for an effective (Linked) Open Data adoption in official statistics

The final activity focused on making participants think and agree collectively about the concrete steps to be taken in order to make the ESS vision 2020 a reality, based on the previous identification of the elements to be tackled. This was performed using a “From…to” activity, aimed at defining the next steps in the short term, and guidelines and priorities for an action plan. Key aspects and trends were integrated to draft future scenarios. Each group was given a project to develop, for which it prepared a list of key aspects (including trends and possible external factors) and drafted future scenarios.

Participants were divided into 8 groups, each of which discussed either one of the following five topics: (1) Strategy and Policy, (2) People and capabilities, (3) Data and Metadata, (4) Linked Data Governance, (5) Technology Infrastructure, or one of the following three proposed “Proofs of Concept” (PoC): (6) Codelists, (7) Data publishing and provenance and (8) Data integration.

The main discussions held under this activity can be summarised as follows:

• LOD is an area in which NSIs are still largely experimenting. There was broad agreement on the advantages of developing further steps in a coordinated way at ESS level. This can best be achieved through concrete pilots to further demonstrate the feasibility and benefits from LOD and a maturity model to assess advances across NSIs.

• The general lack of skills needs to be addressed through training.

• Governance is a key element in LOD. Common governance approaches and processes for LOD should be developed collaboratively and embedded in existing structures of the ESS, Eurostat and NSIs. The existing governance structures and communities will first be examined.

• Uptake of the LOD approach will be possible if the landscape of formats is clearly understood, a limited number of sources is clearly identified, supporting scripts and ETLs are shared, and naming is agreed.

• Codelists need to be standardised via international collaborations, and good governance needs to be established to ensure quality.

• For the PoC on codelists, the steps towards the development of a Linked Open Classification repository were discussed: an XKOS profile, a central store for codelists, and liaisons with different organisations.

• For the PoC on data publication, the need to select already harmonised datasets, export in different formats (including JSON-Stat) and the need for software were identified.

• For the PoC on data integration, similar aspects were also discussed.
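As an illustration of the JSON-Stat export mentioned for the PoC on data publication, the following is a minimal sketch of what a small dataset could look like in JSON-stat 2.0 and how a single value can be read back from it. It is not taken from the workshop material: the dimension names, labels and figures are invented for the example, and the helper function `lookup` is hypothetical.

```python
import json

# A tiny JSON-stat 2.0 dataset with two dimensions.
# All identifiers, labels and values are invented (not real statistics).
dataset = {
    "version": "2.0",
    "class": "dataset",
    "label": "Illustrative population counts (not real data)",
    "id": ["geo", "time"],   # order of the dimensions
    "size": [2, 2],          # number of categories per dimension
    "dimension": {
        "geo": {
            "label": "Country",
            "category": {
                "index": {"IE": 0, "MT": 1},
                "label": {"IE": "Ireland", "MT": "Malta"},
            },
        },
        "time": {
            "label": "Year",
            "category": {"index": {"2015": 0, "2016": 1}},
        },
    },
    # Values are stored flat, in row-major order of the "id" list:
    # (IE, 2015), (IE, 2016), (MT, 2015), (MT, 2016)
    "value": [100, 110, 10, 12],
}


def lookup(ds, **coords):
    """Return the value for one category per dimension, e.g. geo="MT"."""
    position = 0
    for dim_id, dim_size in zip(ds["id"], ds["size"]):
        cat_index = ds["dimension"][dim_id]["category"]["index"][coords[dim_id]]
        # Row-major addressing: shift by the size of the current dimension.
        position = position * dim_size + cat_index
    return ds["value"][position]


malta_2016 = lookup(dataset, geo="MT", time="2016")
print(malta_2016)
print(json.dumps(dataset, indent=2)[:80])
```

The flat `value` array together with `id` and `size` is what keeps the format compact; a consumer only needs the category indexes to address any cell of the cube.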

Annexes

Annex 1: Programme of the Workshop

Wednesday 18 January 2017

11:30 – 12:30 Registration, Networking

12:30 – 13:15 Buffet Lunch

13:15 – 14:00 Welcome address: Mr Reuben Fenech (NSI Malta)
    Opening address: Ms Martina Hahn (Eurostat)
    Presentation of the Workshop Objectives: Mr Jose Enrique Vila (DevStat) & Ms Christine Kormann (Eurostat)

Session 1: Opportunities and Use Cases

The purpose of this session is to establish the added value of LOD for official statistics and its users, based on existing experience

14:00 – 14:45 Expert Lecture I: Linked Statistical Data 101
    Dr Oscar Corcho, Full Professor at the Department of Artificial Intelligence (Madrid)

14:45 – 15:15 Study on LOD requirements: main findings and proofs of concept
    Dr Nikolaos Loutas, PWC Data and Analytics

15:15 – 15:45 Coffee Break

15:45 – 16:30 National Experiences
    Experience in Italy: Mr Giovanni Barbieri, ISTAT
    Experience in the UK: Mr Darren Barnes, ONS
    Experience in Ireland: Mr Eoin MacCuirc, NSI Ireland

16:30 – 18:00 Group discussion (First Part)

20:00 Social Dinner

Thursday 19 January 2017 - Morning

Session 2: Challenges & Tools

The purpose of this session is to examine the technical and organisational challenges raised by LOD, available tools and areas for further development

9:30 – 10:15 Expert Lecture II: Linked Statistical Data: challenges and tools
    Dr Evangelos Kalampokis, research fellow with the Information Technologies Institute of the Centre for Research & Technology - Hellas (CERTH-ITI)

10:15 – 10:45 Coffee Break

10:45 – 11:45 National Experiences
    Experience in France: Mr Franck Cotton, INSEE
    LOD approaches in NSIs – Geographic Information perspective: Mr Hannes Reuter, Eurostat

11:45 – 13:00 Group discussion (Second Part)

13:00 – 14:00 Buffet Lunch

Thursday 19 January 2017 - Afternoon

Session 3: Strategy and Policy

The purpose of this session is to discuss the possible strategic orientations for the ESS in the field of LOD, level of ambition, common projects and governance issues

14:00 – 14:30 Expert Lecture III: Where are we going? – Delivering open data in Europe
    Mr Eoin MacCuirc, Webmaster at the Data dissemination Unit of CSO Ireland

14:30 – 14:45 Building blocks for an open data strategy of the European Statistical System
    Dr Nikolaos Loutas, PWC Data and Analytics

14:45 – 15:45 Group discussion (Third Part)

15:45 – 16:00 Coffee Break

16:00 – 17:00 Presentation of Conclusions
    Closing address: Mr Emanuele Baldacci (Eurostat)

Annex 2: Description of workshop sessions

2.1 Opening Session

Chair: Dr José Enrique Vila

2.1.1 Summary of presentations

The welcome address was given by Mr Reuben Fenech (Director General at the National Statistical Office of Malta). Ms Martina Hahn (Head of Unit Methodology and corporate Architecture in Eurostat) then made the opening presentation, acknowledging the presence of representatives of several National Statistical Institutes (NSIs). She presented the main objectives of the workshop, which were focused on developing a shared understanding of the opportunities and challenges of open data and Linked Open Data for official statistics, as well as contributing to the elaboration of the ESS strategy in this field.

She then described the ESS vision 2020 (a common strategic response of the ESS to the challenges that official statistics are facing, such as the data revolution, the development of new metrics, or the price of statistics) and the DIGICOM project, which provides the framework for the organisation of this event. DIGICOM Work Package 3 (WP3) deals with open data dissemination, addressing all open data-related aspects for the area of official statistics, facilitating and harmonising APIs to European data, and improving re-dissemination.

Following this workshop, the contractor PriceWaterHouseCoopers will finalise its study on LOD for official statistics (February 2017) and an initial draft of the ESS open data strategy will be produced (also for February 2017). An ESSnet is then planned to be launched in 2017.

A general presentation on open data was also given by Ms Christine Kormann (DIGICOM project manager in Eurostat), covering the definition of open data, the 5 stars of open data proposed by Tim Berners-Lee, the existence of open data barometers that help to understand the maturity of open data adoption in Europe, and the availability of the pan-European open data portal. An interesting point was made about the fact that open data is now a policy requirement that matches perfectly with the NSIs' mission to disseminate official statistics, while LOD is a set of design principles that the ESS may decide to use or apply. A final reflection was made on the high importance of official statistics in this open data landscape, since statistics are considered high-value datasets for open data portals, according to the G8.

Dr José Enrique Vila (Devstat) then gave a quick presentation of the workshop logistics and agenda, paying special attention to the fact that the workshop would contain three group activities led by a mediator (Dr Oscar Corcho) and several facilitators.

2.2 Session 1: Opportunities and Use Cases

Chair: Dr José Enrique Vila

2.2.1 Objective of the session

The objective of this session was to provide an initial tutorial on the main characteristics to be considered when applying Linked Open Data principles for the dissemination of official statistics, and then to reflect on the main opportunities, benefits and added value that LOD can provide for official statistics and its users, based on the existing experience from some early adopters.

2.2.2 Summary of presentations

Dr Oscar Corcho (Universidad Politécnica de Madrid) started this session with an initial tutorial-oriented presentation (an expert lecture) on how to apply Linked Open Data for the dissemination of official statistics, which he entitled Linked Statistical Data 101. The presentation was structured in four main blocks: an initial part describing the main foundations for (Linked) Open Data, then a set of examples from the application of LOD principles in a regional statistical office in Spain (Aragón), then the foundations of LOD for official statistics (more specifically, the W3C RDF Data Cube recommendation) and finally an initial set of ideas to ignite the discussions during the group activity.
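To give a flavour of what the W3C RDF Data Cube vocabulary looks like in practice, the following minimal sketch serialises a single statistical observation as Turtle using only the Python standard library. It is an illustration and not material from the lecture: the `ex` namespace, the dataset identifier, the area code and the value are all invented, and the choice of the SDMX-RDF properties `refArea`, `refPeriod` and `obsValue` is an assumption about how such a cube might be modelled.

```python
# Prefix declarations for the vocabularies used below. The qb, sdmx-* and xsd
# URIs are the published namespaces; "ex" is a made-up example namespace.
PREFIXES = {
    "qb": "http://purl.org/linked-data/cube#",
    "sdmx-dimension": "http://purl.org/linked-data/sdmx/2009/dimension#",
    "sdmx-measure": "http://purl.org/linked-data/sdmx/2009/measure#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "http://example.org/stats/",
}


def observation_to_turtle(dataset_id, area_code, year, value):
    """Serialise one observation (area, year, value) as a Turtle snippet."""
    obs_id = f"ex:{dataset_id}-{area_code}-{year}"
    lines = [f"@prefix {prefix}: <{uri}> ." for prefix, uri in PREFIXES.items()]
    lines += [
        "",
        f"{obs_id} a qb:Observation ;",
        f"    qb:dataSet ex:{dataset_id} ;",
        f"    sdmx-dimension:refArea ex:area-{area_code} ;",
        f'    sdmx-dimension:refPeriod "{year}"^^xsd:gYear ;',
        f'    sdmx-measure:obsValue "{value}"^^xsd:decimal .',
    ]
    return "\n".join(lines)


# Hypothetical dataset, area code and figure, for illustration only.
turtle = observation_to_turtle("population", "ES24", 2016, 1300000)
print(turtle)
```

Each observation gets its own identifier built from the dataset and its dimension values, so that reusers can refer to an individual figure directly.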

[Figure: presentation slide on the W3C Data Cube]

Some questions raised after this presentation concerned whether data reusers are explicitly asking for the use of RDF. Dr Oscar Corcho answered that what they normally ask for is good, permanent identifiers to refer to the data, and that if RDF is provided, they do not find it really problematic to deal with such data, since there are development libraries in many programming languages.

After this initial presentation, Mr Daniel Brulé (PwC) discussed the study that had been carried out in the previous months by PwC on LOD requirements for official statistics. The study objectives are the following:

• The identification and study of LOD initiatives in the ESS

• The identification and assessment of implementations, supported use cases and public learning resources for LOD in statistics

• The identification of current initiatives and projects on LOD in Statistics

• The proposal of proofs of concept that demonstrate the benefits of LOD for official statistics

• The identification of a high-level architecture for LOD in statistics

• The definition of a joint LOD strategy at ESS level

The approach taken for the study was also presented, based on desk research, onsite visits and audioconferences with nine stakeholders coming from NSIs, standardisation organisations like W3C, and academia.

Some of the initial insights obtained were related to:

• Why NSIs use LOD. Mostly for interconnecting datasets within the NSI, with other NSIs and Eurostat, and to publish official statistics in machine-readable, linkable formats.

• Several use cases are available: metadata catalogues at the EU and national levels, integrated access to EU and BEA data, the Scottish Index of Multiple Deprivation, LOD for fact checking, selecting the best places to invest, or the Digital Agenda Scoreboard.

• Cost structure, for development, maintenance, promotion and licensing.

• Channels, like NSI portals, endpoints and APIs, and mobile apps.

• Customer relationships, like contests (e.g. hackathons), and feedback.

After these two initial presentations, several presentations on national experiences were made by Mr Giovanni Barbieri from ISTAT, Mr Darren Barnes from ONS United Kingdom and Mr Eoin MacCuirc from CSO Ireland.

Mr Giovanni Barbieri described the principles and main design decisions behind the datiopen.istat.it portal. Some of the benefits highlighted in his presentation were the opportunity to reinforce trust, getting closer to users, making it easier for users to retrieve data, providing richer services to users, reaching new users, giving information back and improving metadata. He then presented use cases from Italy on spatial querying on mobile apps, federated queries on ISTAT and ISFRA datasets, and connecting with social media. He concluded that recent technological advances in the open data community enable new advanced dissemination channels for official statistics.

Mr Darren Barnes described the current landscape of Linked Data from a UK perspective.

The number of Linked Data providers from the UK (e.g., Government, Geographical data

producers) is relatively small. The main challenges identified are not technical but rather

the lack of skills and training inside NSIs to deal with this type of approach. He also

raised the need for agreed standards and vocabularies to deploy an LOD approach

conveniently: identifiers, data models, vocabularies for dimensions, attributes and

measures, reference data, metadata and API methods. Next steps include proof of

concept demos.

Finally, in the last presentation Mr Eoin MacCuirc discussed the status of open

data and linked open data in Ireland, and not just at the National Statistical Office.

Statements about how to produce knowledge from data were discussed, referring to

the CSO statement of strategy 2016-2018. The presentation also pointed out the

importance of a good collaboration with the academic sector. In an analogy to how the

Web was created, the presentation also reflected on the fact that what is now seen

as difficult (publishing linked data and linking it to other datasets) is similar to what was

done at the beginning of the Web.

The next steps for CSO are related to data audits, assisting data publishers (in fact, one

of the last points covered in this presentation was the fact that now CSO is being

contacted by other organisations for help in the publication of their data), and publishing

high value datasets. The main benefits identified for the adoption of LOD were:

• More transparency and accountability of public bodies

• Better data discipline in public bodies, providing for greater efficiency and effectiveness of service delivery

• More citizen participation and inclusion

• Business innovation, business creation and business efficiency, leading to economic growth

• Opportunity for CSO to play a pivotal role

During the question-and-answer session for this presentation, Mr MacCuirc was asked which high value datasets should be considered for publication first. He answered that the datasets already proposed by the Open Knowledge Foundation (OKFN) for the whole world are one example. It is not yet completely clear who the users are and how they are making use of Linked Data. In any case, it is extremely important to work on achieving good data literacy among the population, starting especially with young people and children.

2.3 Session 2: Challenges and Tools

Chair: Ms Martina Hahn

2.3.1 Objective of the session

The purpose of this session was to examine the technical and organisational challenges

raised by LOD, available tools and areas for further development.

The first presentation by Evangelos Kalampokis (University of Macedonia) described

existing tools that can be used for publishing, combining and exploiting Linked Statistical

Data as well as limitations of these tools that should be addressed in order to achieve

the vision of Linked Data Cube Analytics. The second presentation by Franck Cotton

(INSEE) described the experience of INSEE in France with regard to publishing Linked

Statistical Data. Finally, Hannes Reuter (Eurostat) presented the experiences of member

states in linking geospatial information with statistics.

2.3.2 Summary of presentations

In the first presentation of the session Evangelos Kalampokis initially motivated the

value of Linked Data technologies in statistics by introducing the vision of Linked Data

Cube Analytics. This vision describes a paradigm where multi-dimensional statistical

datasets are connected through Web standards and users are able to perform

innovative data analytics scenarios on top of multiple datasets.

To this end, various software tools have already been developed. These tools address requirements related to publishing (e.g. Grafter, TARQL from the OpenCube toolkit, QBer, CSV2DataCube from the LOD2 Statistical Workbench), combining (e.g. the OpenCube Compatibility Explorer and the StatSpace Explorer) and exploiting (e.g. the OpenCube OLAP Browser, the CODE Linked Data Query Wizard) Linked Statistical Data. Most of the tools cover exploitation-related functionality, while only a few aim at data integration.

The characteristics that differentiate the publishing tools include, among others, the user interface (i.e. graphical user interface or command line interface), the technical format of the raw datasets (e.g. CSV, JSON-stat, RDBMS etc.), and the structure of the produced linked data cube. Exploitation tools, on the other hand, are characterized by the type of analysis (e.g. OLAP operations and map or graph visualisations), the domain of application (e.g. tourism and health), and the way of provision (e.g. web application or standalone tool). The most important category of tools presented was that of tools enabling the integration and analysis of multiple datasets. The main characteristics of these tools are the way they identify the datasets to join and the way they handle the integrated data.
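As a rough illustration of the integration step described above, the following minimal Python sketch treats two cubes as joinable when they share dimensions and combines the observations whose values for those dimensions match. The data structures, the function names (`shared_dimensions`, `join_cubes`) and the toy figures are assumptions made for illustration only; they are not taken from any of the tools named in the text.

```python
from typing import Dict, List

# An observation maps dimension names and one measure name to values.
Observation = Dict[str, object]


def shared_dimensions(dims_a: List[str], dims_b: List[str]) -> List[str]:
    """Return the dimensions both cubes have in common, in dims_a order."""
    return [d for d in dims_a if d in dims_b]


def join_cubes(cube_a: List[Observation],
               cube_b: List[Observation],
               dims: List[str]) -> List[Observation]:
    """Inner-join observations that agree on every shared dimension."""
    # Index the second cube by its values for the shared dimensions.
    index = {tuple(obs[d] for d in dims): obs for obs in cube_b}
    joined = []
    for obs in cube_a:
        match = index.get(tuple(obs[d] for d in dims))
        if match is not None:
            joined.append({**match, **obs})
    return joined


# Toy cubes with made-up values (hypothetical areas "A", "B", "C").
unemployed = [
    {"refArea": "A", "refPeriod": 2015, "unemployed": 10},
    {"refArea": "B", "refPeriod": 2015, "unemployed": 20},
]
population = [
    {"refArea": "A", "refPeriod": 2015, "population": 100},
    {"refArea": "C", "refPeriod": 2015, "population": 300},
]

dims = shared_dimensions(["refArea", "refPeriod"], ["refPeriod", "refArea"])
joined = join_cubes(unemployed, population, dims)
print(joined)
```

Only area "A" appears in both toy cubes, so the join keeps a single observation carrying both measures. Real tools would also have to check that the shared dimensions use compatible code lists, which this sketch does not attempt.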

Although these tools are a major step towards achieving the vision of linked data cube analytics, Evangelos Kalampokis also presented some challenges that need to be addressed. Statistical data are modelled using the RDF Data Cube (QB) vocabulary, which is a W3C standard, but the degrees of freedom of the QB vocabulary allow data publishers to follow different practices in applying it. For example, they define the unit of a measured variable at different levels (that is, qb:DataSet, qb:MeasureProperty, and qb:Observation) and using different types of properties (for example, qb:AttributeProperty and qb:MeasureProperty). This results in Linked Statistical Data silos and in software tools that cannot be reused across different datasets. A set of guidelines for publishing linked statistical data, currently being developed in the course of the H2020 OpenGovIntelligence project, was presented to address these challenges.
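The reuse problem described above can be made concrete with a small Python sketch. It models two hypothetical publishers that attach the unit of measure at different levels, and a consumer that has to look in all three places to find it. The plain dictionaries stand in for RDF resources, the `ex:` and `unit:` names are invented, and the lookup order (observation, then measure property, then data set) is an assumption of this sketch rather than a rule stated in the text; `sdmx-attribute:unitMeasure` is the attribute commonly used for units alongside the QB vocabulary.

```python
from typing import Dict, Optional

UNIT = "sdmx-attribute:unitMeasure"


def resolve_unit(observation: Dict[str, object],
                 measure: Dict[str, object],
                 dataset: Dict[str, object]) -> Optional[str]:
    """Return the unit from the most specific level that declares one.

    The order observation -> measure -> dataset is an assumption of
    this sketch, not a rule taken from the source text.
    """
    for resource in (observation, measure, dataset):
        if UNIT in resource:
            return str(resource[UNIT])
    return None


# Hypothetical publisher 1: unit declared once, on the qb:DataSet.
dataset_1 = {"rdf:type": "qb:DataSet", UNIT: "unit:Percent"}
measure_1 = {"rdf:type": "qb:MeasureProperty"}
observation_1 = {"rdf:type": "qb:Observation", "ex:rate": 7.5}

# Hypothetical publisher 2: unit repeated on every qb:Observation.
dataset_2 = {"rdf:type": "qb:DataSet"}
measure_2 = {"rdf:type": "qb:MeasureProperty"}
observation_2 = {"rdf:type": "qb:Observation", "ex:rate": 7.5,
                 UNIT: "unit:Percent"}

unit_1 = resolve_unit(observation_1, measure_1, dataset_1)
unit_2 = resolve_unit(observation_2, measure_2, dataset_2)
print(unit_1, unit_2)
```

A tool that only checks one of the three levels would find the unit for one publisher and miss it for the other, which is the kind of incompatibility that shared publishing guidelines aim to remove.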

Moreover, the complexity of linked data technologies often hampers the wide exploitation of these software tools and thus the wide adoption and exploitation of Linked Statistical Data. One effort to address this challenge is the JSON-QB API, developed during the H2020 OpenGovIntelligence project. This API is designed to help developers use Linked Statistical Data while assuming minimal knowledge of linked data.

Finally, the performance of the tools has been presented as a challenge, especially in

web-based applications, in large datasets and in the execution of federated queries.

In the second presentation of the session Franck Cotton (INSEE) described the

experience of France in publishing official statistics using linked data technologies.

INSEE, which is the official statistics authority in France, has been experimenting with

RDF since 2006, starting with an official geographic code, a static website

(http://rdf.insee.fr) and an identification namespace (http://id.insee.fr). INSEE

introduced the XKOS vocabulary that extends SKOS and enables a richer description of

code lists and published statistical classifications linked to other datasets (e.g. DBpedia

and Geonames) through a SPARQL endpoint. In 2012 INSEE used RDF for their internal

metadata repository and published their first census dataset as linked data that is

updated every year. INSEE is also very active in the research community: it participates in research projects (e.g. Datalift) and co-organises the SemStats workshop, which has taken place every year alongside the International Semantic Web Conference since 2013. The Linked Statistical Data of INSEE are available through zip files, SPARQL endpoints, and dereferenceable URIs.
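As a hedged sketch of the last two access routes, the Python fragment below prepares (but does not send) a content-negotiated request for a URI and a SPARQL query URL, using only the standard library. The endpoint `http://example.org/sparql` and the resource path are placeholders, since the text does not give them; only the `http://id.insee.fr` namespace comes from the paragraph above. The helper names are likewise hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def dereference_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare a GET request that asks for an RDF serialisation of a URI."""
    return Request(uri, headers={"Accept": media_type})


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL-protocol GET URL carrying the query as a parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Placeholder resource path under the identification namespace.
req = dereference_request("http://id.insee.fr/hypothetical/resource")

# Placeholder endpoint; the real endpoint URL is not given in the text.
url = sparql_get_url("http://example.org/sparql",
                     "SELECT * WHERE { ?s ?p ?o } LIMIT 5")

print(req.full_url, req.get_header("Accept"))
print(url)
```

Passing `req` or `url` to `urllib.request.urlopen` would perform the actual call; that step is left out here so the sketch runs without network access.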

INSEE participated in the UNECE-HLG Implementing ModernStats Standards project

(2016) with two concrete projects related to linked open metadata. The first one was

about the creation of an RDF store with international and national classifications from

numerous organisations (e.g. UN and Eurostat) and countries (e.g. France, Slovenia, Italy

etc.). The second project refers to UNECE models and standards such as GSIM, CSPA etc.

Next steps at INSEE include publishing descriptive metadata as linked data, also taking quality into account, as well as promoting existing datasets through INSEE’s new dissemination policy and the new open data legislation in France.

After all these years of experimentation with linked data INSEE believes that the

technology is mature and that publishing not only data but also metadata with linked

data is very useful. Linked Statistical Data provide instant data services to users both

inside and outside INSEE, facilitate data integration, support collaboration with other

publishers, and enable distributed data storage, thus limiting data replication. However, INSEE has identified several challenges, such as naming things with URIs (including modelling and versioning of the data) and the need for more “profiled” standards.

In the final presentation of the session Hannes Reuter (Eurostat) presented the

experiences of individual member states in linking geospatial information. This type of

information is important because an action on “merging statistics and geospatial

information” has been active since 2012.

In particular, Hannes Reuter presented three cases from the UK, Poland, and Finland. In the UK, data related to geographic codes, postcode lookup and geographic boundaries was transformed to RDF and published through a portal, which provides a user interface in addition to the RDF data. In the Polish case, the statistical units for which data can be published were identified, with their geometries harmonised for the respective years, and published as RDF. In Finland, the INSPIRE datasets were published with URIs and a technical infrastructure has been created for dereferencing the URIs.

Beyond the work of NSIs, Hannes Reuter described the EEA Semantic Data Platform

(http://semantic.eea.europa.eu).

In conclusion, Hannes Reuter emphasized the need to bring together the data, metadata,

and geospatial worlds and mentioned that different approaches and methods can be

followed in the implementation of linked geospatial data.

2.4 Session 3: Strategy and Policy

Chair: Ms Martina Hahn

2.4.1 Objective of the session

The objective of this session was to discuss the possible strategic directions for the ESS

in LOD, the level of ambition, common projects and governance issues.

The session had three parts: first, an expert lecture; second, a presentation of a study by PwC on developing an ESS LOD strategy, covering the necessary building blocks and recommending a way forward; and third, an opportunity for a group discussion.

While LOD can deliver benefits for the ESS, NSIs and data (re)users, there are many

elements to consider in ensuring its ready implementation. NSIs and countries in the ESS

are at different starting points, having different data ecosystems and different levels of

engagement with LOD.

A maturity model approach to delivering LOD nationally and throughout the ESS could

prove optimal. Early adopters can pave the way for a successful implementation of LOD

in the ESS. Strategic test bed projects can deliver well defined outcomes. Lessons

learned, strategies, tools and resources can be shared. Gradually, in a structured and systematic way, the new LOD ecosystem would permeate the ESS and deliver its benefits.

From the session, it was apparent that the LOD journey has already started in the ESS.

The question the session primarily addressed was how the ESS can best guide this journey.

2.4.2 Summary of presentations

The session started with an expert lecture from Mr Eoin MacCuirc (CSO, Ireland) entitled

“Where are we going? Delivering open data in Ireland and Europe”.

The lecture shared the journey of the CSO from early days in LOD, through publishing

Irish Census 2011 as LOD, to Irish involvement in the OpenCube project and current LOD

projects.

The lecture continued with an outline of how Ireland adopted the Open Government Partnership and the story of Ireland's open data portal, data.gov.ie. Mr MacCuirc shared the resources published to guide the strategic development of open data in Ireland. Though published in 2014 and 2015, the Best Practice Handbook, the Open Data Publication Handbook and the Technical Framework provide essential information for publishing open data nationally, and the Roadmap lays out initial steps on such a journey. Mr MacCuirc finished his lecture with some lessons learned and the open data charter principles.

The key points raised in the lecture were that publishing LOD is possible in a national and NSI context, and that resources and tools are available to guide producers in publishing LOD. Collaboration is key: as with any innovation, there are new skillsets, tools, technical and infrastructural challenges, and data and metadata challenges, and most NSOs in the ESS do not currently possess the people, capacity and capabilities required in the LOD sphere.

Mr MacCuirc encouraged those present to embrace an incremental and experimental

approach to publishing LOD and stated that the ESS could play a key role in coordinating

and liaising with international LOD developments.

The expert lecture was followed by the presentation of the study conducted by PwC

“Towards a joint linked open data strategy for the European Statistical System” and

presented by Dr Nikolaus Loutas. The presentation outlined the essential building blocks

of a LOD strategy for NSIs and proposed LOD proofs of concept for collaborative

development in the ESS.

For each building block Dr Loutas began with recommendations and concluded with key

strategic questions.

The presentation outlined a practical approach to developing a joint LOD strategy in the ESS. Based on the key players in the ESS LOD landscape and a high-level architecture for LOD for official statistics, the essential building blocks of an ESS LOD strategy were identified:

• Strategy and policy

• People and Capabilities

• Data and Metadata

• Linked Data Governance

• Technology and infrastructure

The list of recommendations and the key strategic questions for each building block were proposed for further discussion by the group.

The presentation concluded with a set of proposed proofs of concept for collaborative development within the ESS, publishing linked open (meta)data:

1. Linking official statistics within an NSO to improve data dissemination

2. Publishing standardised nomenclatures as linked open metadata

3. Linking official statistics with other data (across ESS) to develop value added services and apps

Each of these proofs of concept was viewed as a quick project involving Eurostat, NSIs

and other LOD actors, allowing some NSIs to dip their toes into LOD while

simultaneously building collaboration, capability and capacity with the ESS LOD

community.

Dr Loutas’ presentation was followed by a group discussion based on the activity

“From...to” selected for this session.

2.5 Closing Session

The closing address was given by Ms Martina Hahn (Eurostat), who underlined that the

objective of raising awareness on LOD potential and challenges for official statistics was

clearly met. She mentioned the fact that besides the 56 onsite participants from 27

countries, approximately 100 additional persons attended the meeting online, through

streaming, following the talks and seeing the results of the group discussions.

She thanked all the attendees for their active participation throughout the whole

workshop and for their ideas and suggestions, which undoubtedly contributed to the

success of the Workshop.

Ms Hahn recalled the schedule ahead for DIGICOM WP3 during 2017 and provided the

following general conclusions:

1) LOD is an area in which NSIs are still largely experimenting. It is not yet perceived as mature for full production but there was broad agreement on the advantages of developing further steps in a coordinated way at ESS level.

2) This can best be achieved through a “low-hanging fruit” approach and concrete pilots to further demonstrate the feasibility and benefits from LOD and collect more feedback from users.

3) Existing experience (as compiled in the inventory of practices from PwC) can be shared and reused. Learning from experience beyond the EU (e.g. Japan, Australia) will also be useful.

4) There is a shared understanding of the benefits of LOD among NSIs who have experimented with it.

5) It is time to engage other NSIs, redisseminators, researchers and other actors involved in the data value chain ecosystem. They need to be well informed about the benefits of this approach, and this requires good "marketing" and training material.

6) The combination of LOD and official statistics requires a multidisciplinary approach where different skills need to be taken into account and trained.

7) Further work is needed to develop good governance strategies, e.g. on standard models, vocabularies, codelists and URIs, building on existing models such as those of the SDMX community.

Annex 3: Methodology of the Group Discussions

Group discussions were considered an important part of the reflection to be carried out during the conference. For that reason, a specific team was devoted to analysing the programme and preparing specific methods and techniques to stimulate debate, obtain the most input from the participants and achieve the best results in terms of effectiveness and productivity.

The preparation of the discussion groups was performed in parallel to the definition of

the different items and topics of the programme, and according to the profiles of

participants and the objectives of every part of the Workshop. Several meetings from

October 2016 to January 2017 were held to fine tune the objectives and expected

results, to adapt and adjust methodologies of every session to specific needs.

3.1 Objective of the group discussions

Formal content and presentations are usually only part of what such events are expected to achieve, since there is probably as much knowledge in the audience as that provided by the speakers. It is key to provide the environment and the tools that create trust and a fruitful atmosphere in which participants can contribute ideas and feedback, and eventually go one step further in identifying and designing solutions to the needs and challenges posed by the different topics addressed.

A specific document with detailed information about the objectives for every discussion

was delivered and used for the preparation of the discussions. It is relevant, though, to

cite here the following general aims:

• Set the starting point regarding different perspectives of the situation, reaching a consensus with regard to causes, effects, and main challenges of the current context.

• Establish a framework to visualize the key elements in order to start developing action.

• Harmonize visions and expectations.

• Agree and set up challenges to be addressed.

• Work towards key aspects to face and overcome different challenges, to bring them into reality.

• Define next steps in the short term, and guidelines and priorities for an action plan.

3.2 Methodology

Each group discussion was developed with a specific methodology according to the

objectives and tasks to be performed. The logic of the group discussions followed a “Diverge-Converge” framework, in order to gather as many ideas as possible and then focus them on the most relevant ones or those gathering the highest consensus.

From this, 4 methodologies were implemented:

• Enriched double SWOT (Session 1).

• Lotus Flower (Session 2).

• From-To (Session 3).

• Elevator pitch (to present the results of every team for each of the 3 activities, in 2 or 3 steps versions).

In addition, a specific document was provided for the facilitators to follow so that every technique was applied correctly. Its aim was to harmonise the development of the sessions, to work from the same rules and parameters, and to make the results as comparable as possible, taking into account the different issues discussed and the particular profile and dynamics of every discussion. The instructions previously provided to the team of moderators and the organisation, and also presented during the briefing of the conference, included the following:

1 Have a flipchart close to each table/group to work comfortably with all the members participating at the same time.

2 Be extremely free to get completely focused on every demanding task (question-mirror-focus-funnel-scanning-everyone’s happy tasks). Please no comments and no last-minute issues: no “please inform the group that…”, no “tell them to pick the menu for dinner…”, etc.

3 Be extremely sensitive to the group’s mood, and be entitled to act accordingly, being flexible with timings, results, etc. If a group feels better working a bit differently, let them get results.

4 Coordinate with the other facilitators. In the end, they are seeking a result for the whole group working from smaller pieces. Facilitators are a team themselves.

5 Collect and keep pictures of every wall, result, etc. When doing the activity it seems trivial, but the day after, details get overlapped, mixed and confusing. Keep a record of everything.

6 Think about recording dialogue or parts of the discussion. When you hear a conversation a second time, you pick up a lot of new details and nuances. Audio recording can easily be done with a phone, and it is always a useful tool (where possible, and always asking for permission).

7 Controlling time is key. Not for counting every second, but to be able to do everything that should be done. A pattern helps to advance, to cover, and not to forget. Smartphones are also a decent tool to do that.

8 Have enough resources to be relaxed and ready to get the most out of the group. When the group activity is on, everyone is working for the facilitators; they must feel backed and supported.

3.2.1 Session 1 - Activity: Enriched Double SWOT analysis

The activity Enriched double SWOT analysis consisted of the usual elements of a SWOT

grid, plus a proposal to enrich positive aspects, or keep the negative under control. It

focused on understanding the benefits that Linked Data may bring in and why it is good

to invest in it. Participants first developed their SWOT individually and then, as a group,

worked on a common vision.

Each participant had 15 minutes to develop a SWOT individually and then 2 minutes to present his/her vision to the group, without debate and trying not to repeat issues and concepts already raised. Meanwhile, the facilitator wrote down the different elements on the canvas available next to each table, trying to set up links with the ideas of the other participants. A joint discussion of 15 minutes followed, in which the group tried to define a common scenario.

The final presentation of the conclusions by each group was made following the

methodology of the “Elevator pitch” (duration: 2 minutes). This methodology implied

that a representative of each group (mainly the facilitator) presented to the whole

audience a short conclusion of the discussion, results, and common agreements reached

by his/her group. The “Elevator pitch” was organised as a climb of “3 floors”:

• 1st Floor (duration: 30 seconds): Composition of the team, who they are, profiles and backgrounds.

• 2nd Floor (duration: 45 seconds): The issues discussed and reflections, presented in bullets.

• 3rd Floor (duration: 45 seconds): Conclusions, key issues and needs to address.

3.2.2 Session 2 - Activity: Lotus Flower

This activity stimulated participants to carry out an open-minded exercise to identify as many key factors as possible for addressing challenges, making it easier to share and work together on specific topics and detailed issues. The Lotus Flower technique makes it possible to represent and organise ideas, easing the discussion towards consensus and calls for action.

Under this activity every participant elaborated the relevant aspects to address a specific

challenge, which were afterwards presented to the group and a final joint proposal was

prepared.

The participants had 15 minutes to develop their Lotus flower individually, taking into

account the challenge assigned, and trying to define the maximum number of elements

needed to face the challenge.

The following step consisted of dividing the group into two smaller groups, in which the members presented the aspects they wished to highlight and started a joint discussion on the key aspects for dealing with the assigned challenge (duration: 25 minutes). Then, the two smaller groups came together again and shared the key aspects chosen in order to define a final joint proposal (duration: 15 minutes).

The final presentation of the conclusions by each group was made following the

methodology of the “Elevator pitch” (duration: 2 minutes per group), this time in a climb

of “2 floors”:

• 1st Floor: Brief description of the challenge.

• 2nd Floor: Needed elements to address the challenge.

3.2.3 Session 3 - Activity: From…to

The final activity was focused on making participants think and agree collectively about

the concrete steps to be taken in order to make the ESS vision 2020 a reality, based on

the previous identification of the elements to be tackled.

The “From…To” technique aims to produce concrete statements, linked to reality and strongly focused on the future actions needed to achieve the goals, following a classical past-present-future timeline logic for discussing and shaping strategy and priorities.

Each group received a project to work on and the following template:

From… Current situation To…

Working as a team, the group had to define a list of key aspects for the development of

the project (duration: 15 minutes), where trends and external factors were also

identified (duration: 10 minutes). Then, working in pairs, the key aspects and trends

were integrated to draft future scenarios, using the template received (duration: 15

minutes).

Then, the group came together again for 20 minutes to define a common scenario. The

presentation of the final conclusions to the other groups followed again the Elevator

pitch, organized in a climb of “3 floors”:

• 1st Floor (duration: 1 minute): From…To: where we come from and where we want to get to.

• 2nd Floor (duration: 1 minute): Key success factors for the project.

• 3rd Floor (duration: 1 minute): First steps to be taken in an Action Plan.
