CLARUS – H2020-ICT-2014 – G.A. 644024

© CLARUS Consortium 1 / 80

A framework for user centred privacy and security in the cloud

Definition of Application Cases

Type (distribution level): Public
Contractual date of delivery: 30-04-2015
Actual date of delivery: 12-06-2015
Deliverable number: D2.1
Deliverable name: Definition of Application Cases
Version: V1.1
Number of pages: 80
WP/Task related to the deliverable: Task 2.1
WP/Task responsible: AKKA
Author(s): AKKA and FCRB Teams
Partner(s) contributing: All
Document ID: CLARUS-D2.1-DefinitionOfApplicationCases-v1.1

Abstract

This document analyses and specifies the application cases targeting e-Health and publication of geo-referenced data on the Internet. The goal of this analysis is the identification of a number of demonstration cases that are the main input for the refinement of CLARUS requirements (WP2) and for the CLARUS implementation (WP5). The demonstration cases cover all major aspects of the CLARUS results. The demonstrations developed on the basis of this specification will enable integration testing and support the final evaluation of the project results to be carried out in WP6. In addition, the application cases provide working examples that will support the exploitation and dissemination activities (WP7).

Disclaimer

CLARUS (G.A. 644024) is a Research and Innovation Actions project funded by the EU Framework Programme for Research and Innovation Horizon 2020. This document contains information on CLARUS core activities, findings and outcomes. Any reference to content in this document should clearly indicate the authors, source, organisation and publication date. The content of this publication is the sole responsibility of the CLARUS consortium and cannot be considered to reflect the views of the European Commission.

Table of Contents

1 INTRODUCTION
1.1 SCOPE OF THE DOCUMENT
1.2 METHODOLOGY / DOCUMENT PLAN
1.3 DELIVERABLE OUTCOME AND FITTING IN THE PROJECT WORKFLOW
1.4 APPLICABLE AND REFERENCE DOCUMENTS
1.5 REVISION HISTORY
1.6 NOTATIONS, ABBREVIATIONS AND ACRONYMS
1.6.1 Acronyms
1.6.2 Definitions
2 GEO PUBLICATION APPLICATION CASE
2.1 OVERVIEW
2.2 ACTORS
2.2.1 Data Providers
2.2.2 Data Consumers
2.2.3 Application Providers
2.2.4 IT Team
2.2.5 Security Manager
2.2.6 Cloud Service Provider
2.3 DATASETS
2.3.1 Geospatial data
2.3.2 Geospatial datasets for CLARUS
2.4 SERVICES
2.4.1 Introduction
2.4.2 Publication Services
2.4.3 Access Services
2.4.4 Computation Services
2.4.5 Exploitation & Operation Services
2.5 SECURITY EXPECTATIONS
2.5.1 Expectations regarding geospatial data
2.5.2 Expectations regarding geospatial services
2.5.3 Expectations regarding personal data
2.5.4 Expectations regarding Cloud-based hosting
2.6 DEMONSTRATION CASES FOR CLARUS
2.6.1 Geodata storage in the Cloud
2.6.2 Geodata publication in the Cloud
2.6.3 Collaboration on geodata in the Cloud
2.7 SECURING DEMONSTRATION CASES
2.7.1 Securing geodata storage in the Cloud
2.7.2 Securing geodata publication in the Cloud
2.7.3 Securing geodata collaboration in the Cloud
2.7.4 Summary
3 E-HEALTH APPLICATION CASE
3.1 OVERVIEW
3.2 ACTORS
3.2.1 Data Providers
3.2.2 Data Consumers
3.2.3 IT Team
3.2.4 Security Manager
3.2.5 Cloud Service Provider
3.3 DATASETS
3.3.1 Introduction
3.3.2 Standards used in the e-Health use case
3.3.3 E-Health Dataset
3.4 SERVICES
3.4.1 Introduction
3.4.2 Data Publication
3.4.3 Metadata Management
3.4.4 Search
3.4.5 Advanced Queries
3.4.6 Statistics Computation
3.4.7 Transformation Services
3.4.8 Exploitation Services
3.5 SECURITY EXPECTATIONS
3.5.1 Expectations in terms of data
3.5.2 Expectations in terms of services
3.6 DEMONSTRATION CASES FOR CLARUS
3.6.1 Securing Passive Medical Health Records storage in the cloud
3.6.2 Securing Passive Medical Health Records access and retrieval from the cloud
3.6.3 Securing Passive Medical Health Record for the Advanced Query
3.6.4 Securing Passive Medical Health Record for Statistics Computation query
3.7 SUMMARY
4 CONCLUSION

Table of Figures

Figure 1 - INSPIRE network services ([R8])
Figure 2 - Kriging interpolation formula
Figure 3 - Kriging semi-variance formula
Figure 4 - Illustration of semivariogram components [R11]
Figure 5 - Geodata Storage Activity Diagram
Figure 6 - Geodata Publication Diagram
Figure 7 - Geo-processing Activity Diagram
Figure 8 - Server configuration of WPS Kriging implementation [R10]
Figure 9 - Geodata Collaboration (Consultation) Activity diagram
Figure 10 - Geodata Collaboration (Modification) Activity diagram
Figure 11 - Geo Data Demonstration Cases with regard to Security Expectations
Figure 12 - CLARUS Generic Scenarios with regard to Geo Data Demonstration Cases
Figure 13 - Medical Health Records “Passivation” process
Figure 14 - Medical Health Records access and data retrieval process
Figure 15 - Advanced Query to the CLARUS Cloud process
Figure 16 - Medical Record Statistics Computation query
Figure 17 - Medical Health Records storage in the cloud diagram
Figure 18 - Medical Health Records access and retrieval diagram
Figure 19 - Medical Health Record Advanced Query diagram
Figure 20 - Medical Health Record Statistics Computation query diagram
Figure 21 - e-Health Demonstration Cases with regard to Security Expectations

1 Introduction

1.1 Scope of the document

The requirements elicitation and specification process for CLARUS methodologies, technologies and tools is supported by the analysis of two main application cases that will serve to demonstrate the appropriateness and applicability of project results (see [A1]).

This document defines in more detail the application cases that will inform all further development in the project. They come from two main domains:

- Publication of geo-referenced data on the Internet
- e-Health

The definition identifies the main actors involved and the services they use, maintain and/or develop. These services rely on various types of datasets, which we categorize from trust and security perspectives. The result is an overview of the domains at stake that serves as a basis for further work, including the specification of CLARUS requirements.

1.2 Methodology / Document Plan

For this document, we reviewed several of the most prominent projects and initiatives in each application domain. This review allowed us to converge on lists of Actors, Datasets and Application Cases that appear relevant for demonstrating the appropriateness and applicability of CLARUS solutions.

Chapter 2 describes the domain of geospatial data publication on the Internet. It presents actors, datasets, services and security expectations, and details three cloud-based scenarios where data confidentiality is an issue. These scenarios will serve as demonstration cases for CLARUS.

Chapter 3 describes the domain of e-Health. It presents actors, datasets, services and security expectations, and details two cloud-oriented scenarios where data privacy is an issue. These scenarios will serve as demonstration cases for CLARUS.

Chapter 4 concludes the document and paves the way for the definition of CLARUS requirements.

1.3 Deliverable outcome and fitting in the project workflow

The publication of this deliverable should help in the specification of CLARUS requirements and in defining the evaluation, testing and validation undertaken in WP6.

1.4 Applicable and Reference documents

App./Ref. | Title
[A1] | CLARUS Grant Agreement 644024
[A2] | CLARUS Consortium Agreement v3.0, Dec. 2014
[A3] | Internet Security Glossary, RFC 4949
[A4] | CLARUS D2.1 Annex I - Review of EU geo-publication projects
[R1] | INSPIRE Thematic Working Group Geology, Data Specification on Geology – Technical Guidelines, Dec. 2013
[R2] | INSPIRE Thematic Working Group Mineral Resources, Data Specification on Mineral Resources – Technical Guidelines, Dec. 2013
[R3] | INSPIRE Thematic Working Group EMF, Data Specification on Environmental Monitoring Facilities (EMF) – Technical Guidelines
[R4] | INSPIRE Thematic Working Group Utility and governmental services, Data Specification on Utility and governmental services – Technical Guidelines
[R5] | Official Journal of the European Union, Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 establishing an Infrastructure for Spatial Information in the European Community (INSPIRE)
[R6] | Provisions contained in Article 4(2) of Directive 2003/4/EC on public access to environmental information
[R7] | Provisions contained in Article 8(1) of EU Data Protection Directive 95/46 on the protection of individuals with regard to the processing of personal data and on the free movement of such data
[R8] | InGeoCloudS D2.1, Use Cases for InGeoCloudS Data and Services, Version 1.1, May 2013
[R9] | InGeoCloudS D2.2, Interface of web services and models of data, Version 1.0
[R10] | Kriging implementation documentation, InGeoCloudS communication sheet, E. Grinias, EKBAA
[R11] | ArcGIS Help 10.1: How Kriging works. Available at: http://resources.arcgis.com/en/help/main/10.1/index.html#//00q90000001t000000
[R12] | EGDI-Scope D5.1, Report on trust and authentication, Katleen Janssen, Jos Dumortier (KU Leuven), August 2013
[R13] | Design security and geo-rights management services in spatial data infrastructure, T. Kubik et al.
[R14] | Estimating Kriging-Based predictions with privacy, B. Tugrul, H. Polat, Oct. 2012
[R15] | OpenStreetMap website. Available at: https://www.openstreetmap.org
[R16] | Successful Response Starts with a Map, Improving Geospatial Support for Disaster Management, The National Academy of Sciences, 2007
[R17] | Editing map client for geodata updating in emergency situations, M. Konecny et al., Masaryk University
[R18] | Geographic Information Systems (GIS) for Disaster Management, Brian Tomaszewski, 2015
[R19] | InGeoCloudS D3.1.3, Analysis and Monitoring of Clouds for Geo-Data Services, Version 1.0
[R20] | InGeoCloudS D3.3, Maintenance plan and Service Profiling, Version 1.0
[R21] | InGeoCloudS D4.2, Fully Operational InGeoCloudS Pilot, Version 1.0
[R22] | Available at: http://www.hl7.org
[R23] | Available at: http://www.who.int/classifications/icd/en/
[R24] | Available at: https://loinc.org/
[R25] | Available at: http://www.ihtsdo.org/snomed-ct
[R26] | Available at: http://dicom.nema.org/

1.5 Revision History

Version | Date | Author | Description
0.1 | 20/02/2015 | AKKA | AKKA internal initial iteration
0.2 | 09/03/2015 | AKKA | AKKA internal iteration: integration of harmonized structure for Use Cases description
0.3 | 13/03/2015 | AKKA, FCRB | Incorporation of 1st FCRB inputs, re-structuration of §2.5, share with consortium
0.4 | 14/04/2015 | AKKA | General comments on v0.3 from partners (mails + WebMeeting of 19/03) lead to more details about treated datasets, to activity diagrams for better describing business processes and to sections focused on demonstrators.
0.5 | 30/04/2015 | AKKA, FCRB | Mapping security expectations with demonstrators in the geo case summary section, incorporation of FCRB inputs, share with consortium for WebMeeting 30/04
0.6 | 13/05/2015 | AKKA | Finalizing geo-publication part: added Kriging description, explanation on activity diagrams, securing demonstration cases section, reference to Annex I, conclusion
0.7 | 29/05/2015 | AKKA, FCRB | Finalizing e-Health part: added security manager actor, securing for advanced query and statistics computation queries part, references, figure title. Final version of deliverable D2.1.
0.8 | 01/06/2015 | FCRB | Added advanced query and statistics computation explanations.
0.9 | 09/06/2015 | AKKA | New release following reviews (MTI, THALES) and KUL comments. Changes in §1.6.2 definitions, §1.4 added R6, R7, R15 references, §Appendix, ToC, minor format revisions.
1.0 | 12/06/2015 | AKKA, FCRB | New release following MTI review for e-Health section. Final version of D2.1
1.1 | 12/06/2015 | Jesús A. Manjón (URV) | Style modifications

1.6 Notations, Abbreviations and Acronyms

1.6.1 Acronyms

ANSI: American National Standards Institute
API: Application Programming Interface
BRGM: Bureau de Recherches Géologiques et Minières
CDA: Clinical Document Architecture
CSW: Catalogue Service for the Web
CT: Computed Tomography
DICOM: Digital Imaging and Communications in Medicine
DoW: Description of Work
DP: Data Provider
EC: European Commission
ESRI: Environmental Systems Research Institute
FP7: Seventh Framework Programme for Research
FTPS: File Transfer Protocol Secure
GA: Grant Agreement
GEUS: Geological Survey of Denmark and Greenland
GIS: Geographical Information System
GML: Geography Markup Language
HCCC: Història Clínica Compartida de Catalunya
HIS: Hospital Information System
HL7: Health Level 7 International
IaaS: Infrastructure as a Service
ICD: International Classification of Diseases
IGN: Institut National de l'Information Géographique et Forestière
IHTSDO: International Health Terminology Standards Development Organisation
InGeoCloudS: Inspired GEOdata CLOUD Services
INSPIRE: Infrastructure for Spatial Information in Europe
LIS: Laboratory Information Systems
LOINC: Logical Observation Identifiers Names and Codes
LOPD: Personal Data Protection Law in Spain (Ley Orgánica de Protección de Datos)
MHR: Medical Health Record
OGC: Open Geospatial Consortium
OSM: OpenStreetMap
OWL: Web Ontology Language
PaaS: Platform as a Service
PACS: Picture Archiving and Communication System
PCIS: Primary Care Information Systems
PDF: Portable Document Format
PMB: Project Management Board
RDBMS: Relational Database Management System
RDF: Resource Description Framework
REST: Representational State Transfer
RIF: Rule Interchange Format
SaaS: Software as a Service
SAML: Security Assertion Markup Language
SCP: Secure Copy
SFTP: Secure File Transfer Protocol
SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms
ToC: Table of Contents
WFS: Web Feature Service
WHO: World Health Organization
WMS: Web Map Service
WP: Work Package
WPS: Web Processing Service

1.6.2 Definitions

Critical

A condition of a system resource such that denial of access to, or lack of availability of, that resource would jeopardize a system user’s ability to perform a primary function or would result in other serious consequences, such as human injury or loss of life [A3].

Confidential

Confidential information refers to information that is restricted from public dissemination for reasons related, inter alia, to legal restrictions on access, for example national security, intellectual property, trade secrets, international relations, public security or national defence [R5][R6].

Sensitive

Sensitive information related to personal data is defined as “personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, and the processing of data concerning health or sex life” [R7].

2 Geo Publication Application Case

2.1 Overview

Earth systems are coupled and tightly integrated; that is why discovering and sharing geo-referenced

data is critical for environmental professionals. However finding data and processing resources across

disciplines is often an issue for this community.

All around the world, initiatives aim to remove technical obstacles to institutional information sharing

and to facilitate the adoption of open, spatially enabled reference architectures in enterprise

environments. Most relevant frameworks for (geo-)data professionals are the OGC (Open Geospatial

Consortium) that acts worldwide and INSPIRE initially designed and developed by the European

Commission through its JRC (Joint Research Centre):

The INSPIRE (Infrastructure for Spatial Information in the EC) Directive establishes rules for

geographic and environmental data (geodata) supporting environmental policies or relating to

any activities which might have an impact on the European environment. This Directive aims at

ensuring that geodata are consistently available, interpretable and usable across European

regional and state boundaries. The consequence of the Directive is a requirement that geodata

definitions follow agreed and established norms, standards and that the data be readily

available online. Many of the standards promoted by INSPIRE currently come from the OGC.

The Open Geospatial Consortium (OGC) is an international industry consortium of 506

companies, government agencies and universities participating in a consensus process to

develop publicly available interface standards. OGC members work together in Standards

Working Groups and Domain Working Groups (such as Earth Systems Science, Hydrology,

Metadata, Meteorology & Oceanography, Sensor Web Enablement) to provide free and openly

available standards to the market. The OGC leads worldwide in the creation and establishment

of standards that enable global infrastructures for delivery and integration of geospatial content

and services into business and civic processes.

While these initiatives do not impose any solution in terms of data infrastructures, the cloud has received ever-growing interest in recent years because of its ability to address common requirements such as huge data volumes, ubiquitous access, quality of service, computation power and economic competitiveness. Security issues are therefore very relevant for the different actors in the field, in particular when confidential or critical data are at stake.

2.2 Actors

2.2.1 Data Providers

This designates actors producing or collecting information into a system. They could be commercial or non-profit organizations (e.g. meteorological agency, geological survey, transport organization, Google, IGN...), academic institutions, government agencies, scientific laboratories or citizens (amateur scientists).

Data providers use the system in an authenticated manner.

2.2.2 Data Consumers

These are actors consuming data from the system. They could be end-users such as citizens, commercial or non-profit organisations, academic institutions, government agencies, or scientific laboratories. They could also be a third-party client application or system, such as an orchestration framework or an INSPIRE service.

Some services might not require any authentication from the user (e.g. public data published in line with INSPIRE recommendations). Some data consumers might be granted access to data that are not public.

2.2.3 Application Providers

In the context of a platform as a service (PaaS), these actors have the capability of integrating new data management applications and services into the system. They could be commercial or non-profit organizations (e.g. meteorological agency, geological survey, transport organization, Google, IGN...), academic institutions, government agencies or scientific laboratories. They use the system in an authenticated manner.

2.2.4 IT Team

This is the technical team in charge of managing, monitoring and maintaining the (geo-publication) system in operational conditions. They notably ensure that all technical components operate in nominal conditions and that system issues are solved as quickly as possible.

2.2.5 Security Manager

This actor is in charge of defining, provisioning and maintaining the security policies of an organisation for the system under consideration. This can notably include account management and authorisation management.

2.2.6 Cloud Service Provider

This designates the institution providing an IaaS, PaaS and/or SaaS type of service that is used by Data Providers for pushing and managing their own data in the cloud.

As examples of IaaS services, we can cite Amazon Web Services (AWS), Microsoft Azure, etc.

For PaaS, we can cite ESRI Managed Cloud Services and InGeoCloudS, which notably allow application providers to extend geo-spatial applications already online with additional services and technical capabilities.

SaaS examples in the field include web mapping services such as MangoMap, ArcGIS Online, InGeoCloudS, GISCloud…

2.3 Datasets

2.3.1 Geospatial data

2.3.1.1 Data types

Geographical information is encoded following the standards defined by international consortiums such as the Open Geospatial Consortium (OGC) or by GIS software vendors.

The GIS data types can be classified into two different categories: vector and raster.

Vector data are geographical features considered as geometrical shapes. Different geographical features

are expressed by different types of geometry:

Points are geographical features with zero dimensions and can best be expressed by a single

point reference. Points are used as simple location for wells, peaks, points of interest, etc.

Lines are geographical features with one dimension and are used for linear features such as

rivers, roads, railroads, topographic lines, etc.

Polygons are geographical features with two dimensions and are used to cover a particular area

of the earth's surface such as the boundary of a city (on a large scale map), lake, or forest.

Raster datasets represent geographic features by dividing the world into discrete square or rectangular

cells laid out in a grid. Each cell has a value that is used to represent some characteristic of that location.

Raster datasets are commonly used for representing and managing imagery, digital elevation models,

and numerous other phenomena.
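
The distinction between the two categories can be illustrated with a minimal sketch in plain Python. The feature names, coordinates and elevation values below are invented for illustration; the dictionary layout loosely follows GeoJSON naming but is only an assumption, not a format used by CLARUS.

```python
# Vector data: each feature is a geometry described by coordinates.
# All names and values here are hypothetical illustration data.
well = {"type": "Point", "coordinates": (2.35, 48.85)}
river = {"type": "LineString",
         "coordinates": [(2.0, 48.0), (2.5, 48.5), (3.0, 48.7)]}
lake = {"type": "Polygon",
        "coordinates": [[(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]]}


def topological_dimension(feature):
    """Return 0 for points, 1 for lines and 2 for polygons."""
    return {"Point": 0, "LineString": 1, "Polygon": 2}[feature["type"]]


# Raster data: a regular grid of cells, each holding one value.
elevation = {
    "origin": (0.0, 3.0),   # x, y of the upper-left corner of the grid
    "cell_size": 1.0,       # square cells, same unit as the coordinates
    "cells": [
        [10, 12, 15, 11],   # first row = northernmost row
        [9, 11, 14, 10],
        [8, 10, 12, 9],
    ],
}


def raster_value_at(raster, x, y):
    """Return the value of the cell that contains the location (x, y)."""
    origin_x, origin_y = raster["origin"]
    col = int((x - origin_x) // raster["cell_size"])
    row = int((origin_y - y) // raster["cell_size"])
    cells = raster["cells"]
    if not (0 <= row < len(cells) and 0 <= col < len(cells[0])):
        raise ValueError("location lies outside the raster extent")
    return cells[row][col]
```

A vector feature carries its own geometry, whereas a raster value is looked up from the grid position of the location.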

2.3.1.2 Formats

Formats vary and are designed either for data exchange or for data rendering and serving. There are also very significant differences in performance between formats. Serving big coverage data sets with good performance requires some knowledge and tuning.

The raster data type is well suited to data serving: it allows multi-resolution extraction and supports quick subset extraction at native resolutions. Popular raster formats are GeoTIFF or BLOBs in an RDBMS.

The vector data type, on the other hand, is better suited to data processing: it allows visually smooth and easy implementation of overlay operations, displays data as vector graphics, and simplifies combining vector layers from different sources. Moreover, vector data are simpler to update and maintain, are more compatible with relational database environments and are usually smaller than raster data. Popular vector formats are shapefile, MapInfo TAB and PostGIS.

2.3.1.3 Storage types

There are two common ways of storing geospatial data:

2.3.1.3.1 GIS files

Geographical information is encoded in a standardized manner into a file. There are many options for GIS files. Among the popular ones, we can mention the following:

The shapefile is a very common format for the storage of vector data. It is developed and regulated by Esri as a (mostly) open specification for data interoperability. The shapefile format is a digital vector storage format for storing geometric location and associated attribute information. It can be read and written by a wide variety of software. The shapefile format consists of a collection of files with a common filename prefix, stored in the same directory. Three mandatory files contain binary data: the main file (.shp) that contains geometry data, the shape index file (.shx) and the feature attribute data file (.dbf) stored in dBASE format.

MapInfo TAB is another popular format for the storage of vector data. It is developed and regulated by MapInfo Corporation as a proprietary format. The minimum files required are the main file (.tab) that holds in ASCII format the information about the type of data, the feature attribute data file (.dat) stored in dBASE format, the map file (.map) that stores the graphic and geographic information needed to display each vector feature on a map, and its associated index file (.id).

GeoTIFF is a public domain metadata standard which allows georeferencing information to be

embedded within a TIFF file. It results from an effort by over 160 different remote sensing, GIS,

cartographic, and surveying related companies and organizations to establish a TIFF based interchange

format for geo-referenced raster imagery. GeoTIFF has emerged as a standard image file format for

various GIS applications worldwide.
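
Because a shapefile or a MapInfo TAB dataset is spread over several files, a dataset pushed to the cloud is only usable if all mandatory components are present. The following is a minimal sketch of such a completeness check; the function name `missing_components` and the file names in the test data are hypothetical, and the extension lists simply restate the mandatory files described above.

```python
from pathlib import PurePath

# Mandatory component files, as listed in the format descriptions above.
REQUIRED_EXTENSIONS = {
    "shapefile": (".shp", ".shx", ".dbf"),
    "mapinfo_tab": (".tab", ".dat", ".map", ".id"),
}


def missing_components(filenames, stem, kind):
    """Return the mandatory extensions that are absent for a dataset.

    filenames: iterable of file names found in one directory
    stem:      common filename prefix of the dataset (e.g. "wells")
    kind:      "shapefile" or "mapinfo_tab"
    """
    present = {
        PurePath(name).suffix.lower()
        for name in filenames
        if PurePath(name).stem == stem
    }
    return [ext for ext in REQUIRED_EXTENSIONS[kind] if ext not in present]
```

An empty result means that the dataset is complete with respect to its mandatory files; optional files (for example a projection file) are deliberately ignored in this sketch.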

2.3.1.3.2 Spatial databases

Spatial databases are alternative data sources to GIS files and are essential if GIS applications (either web applications or rich-client applications) perform transactions.

Spatial databases are designed to store and query geospatial data, using spatial indexes to speed up database operations. In addition to typical SELECT statements, spatial databases provide a wide variety of spatial operations such as set operations (e.g. union, difference), predicates (e.g. overlap between two features) and functions (e.g. area, distance, perimeter, location of the centre of a geometrical shape).
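
To give an idea of what the spatial functions mentioned above compute, here is a minimal sketch in plain Python for planar coordinates. It is not how a spatial database implements them; in PostGIS the corresponding functions are `ST_Area`, `ST_Perimeter`, `ST_Centroid` and `ST_Distance`. The function names below and the closed-ring input convention (first vertex repeated at the end) are assumptions made for the example.

```python
import math


def _signed_area(ring):
    """Shoelace formula; ring is a closed list of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def polygon_area(ring):
    """Area of a simple polygon given by its closed outer ring."""
    return abs(_signed_area(ring))


def polygon_perimeter(ring):
    """Sum of the lengths of all edges of the closed ring."""
    return sum(math.dist(p, q) for p, q in zip(ring, ring[1:]))


def polygon_centroid(ring):
    """Centre of gravity of a simple polygon with non-zero area."""
    area = _signed_area(ring)
    cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        cross = x1 * y2 - x2 * y1
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return cx / (6.0 * area), cy / (6.0 * area)


def point_distance(p, q):
    """Euclidean distance between two points."""
    return math.dist(p, q)
```

For a 4 by 3 rectangle with one corner at the origin, these functions give an area of 12, a perimeter of 14 and a centroid at (2, 1.5).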

Although there are many options for spatial databases, PostgreSQL/PostGIS is usually recommended. Alternatives are RDBMS such as H2GIS, Oracle Spatial, DB2, SQL Server with spatial extensions, SpatiaLite or ArcSDE, but also non-relational databases such as MongoDB.

The GIS objects supported by PostGIS are all the vector types defined in the "Simple Features for SQL

1.2.1" standard defined by the OpenGIS Consortium (OGC), and the ISO "SQL/MM Part 3: Spatial"

document. In addition, PostGIS supports a raster type (no standards exist to follow), and a topology

model (following an early draft ISO standard for topology that has not been published as yet).

2.3.2 Geospatial datasets for CLARUS

2.3.2.1 Introduction

2.3.2.1.1 Critical/confidential datasets

In this section we introduce the environmental datasets that require a certain level of security / privacy.

Qualification of datasets regarding security and confidentiality is of high importance for the definition of

CLARUS application cases. However, this information is not directly available in the different projects we

studied (e.g. as metadata or in data documentation).

Qualification is deduced from interviews of significant stakeholders and analysis of key projects in the domain of publication of geo-referenced data (e.g. EGDI-Scope, Minerals4EU, InGeoCloudS) [A4]. In this document, we deliberately consider several types of datasets in order to address the security issues from a relatively broad perspective, while taking into account some specificities of each dataset type.

These datasets are further described in the Demonstration Cases section (2.6), where we try to categorize the data manipulated in terms of criticality/sensitivity (taking the viewpoint of data owners / service providers) and of the corresponding data types.

2.3.2.1.2 INSPIRE classification

The INSPIRE (Infrastructure for Spatial Information in Europe) Directive mandates all European Union

Member States to provide environmentally related datasets so that they can be easily accessed by

other public organisations within their own country, in surrounding European countries and by the

European Commission for Europe-wide policy making.

INSPIRE Regulations apply to data:

With a geographic reference (i.e. “geodata”)

Relating to an area where the Member states have or exercise jurisdictional rights

Held by a public authority, third party or others working on behalf of either

In electronic format

Relating to one of the 34 themes, classified in the 3 INSPIRE groups (annexes) below:

1. Addresses, Geographical names, Administrative units, Hydrography, Cadastral parcels,

Protected sites, Coordinate reference systems, Transport networks, Geographical grid

systems.

2. Elevation, Geology, Land cover, Orthoimagery.

3. Agricultural and aquaculture facilities, Habitats and biotopes, Population distribution

and demography, Area management, Human health and safety, Production and

industrial facilities, Atmospheric conditions, Land use, Sea regions, Bio-geographical

regions, Meteorological geographical features, Soil, Buildings, Mineral Resources,

Species distribution, Energy Resources, Natural risk zones, Statistical units,

Environmental monitoring Facilities, Oceanographic geographical features, Utility and

governmental services.

2.3.2.2 Human Health and Safety

The INSPIRE Human Health and Safety (HH) theme describes the geographical distribution of dominance of pathologies and the effects on the health or well-being of humans linked to the quality of the environment.

It does not address personal data, whereas the e-Health Application Case (section 3) specifically addresses the issues related to the protection of personal health data stored in the cloud.

Because the e-Health application case is therefore more relevant for demonstrating the CLARUS solutions, we will not consider the INSPIRE Human Health and Safety theme in the field of geo publication.

2.3.2.3 Geology

The INSPIRE Geology (GE) theme is split into the following sub-themes: Geology, Hydrogeology and

Geophysics.

2.3.2.3.1 Geology and boreholes from the oil industry

The particular field of Geology provides basic knowledge about the physical properties and composition

of geologic materials, their structure and their age as depicted in geological maps, as well as landforms

(geomorphological features). The model also covers boreholes - another important source of

information for interpreting the subsurface geology. [R1]

Boreholes falling under the Geology theme are categorized according to hydrocarbon production (i.e. production of petroleum oil and/or gas). Borehole data from the oil industry have high commercial value and security is a concern, as emerged from our interview with the national geological survey of Denmark and Greenland (GEUS).

2.3.2.3.2 Hydrogeology and groundwater boreholes

The particular field of Hydrogeology describes the flow, occurrence, and behaviour of water in the

subsurface environment. The two basic elements are the rock system (including aquifers) and the

groundwater system (including groundwater bodies). Man-made or natural hydrogeological

objects/features (such as groundwater wells and natural springs) are also included. [R1]

As emerged from the EGDI-Scope stakeholders’ survey, security is critical for boreholes, wells and groundwater data (cf. Datasets section in the EGDI-Scope annex [A4]).

2.3.2.4 Environmental Monitoring Facilities

Location and operation of environmental monitoring facilities (EMF) includes observation and

measurement of emissions, of the state of environmental media and of other ecosystem parameters

(biodiversity, ecological conditions of vegetation, etc.) by or on behalf of public authorities. [R3]

2.3.2.4.1 EMF and groundwater boreholes

Groundwater boreholes can be modelled under either the Geology (GE) or the Environmental

Monitoring Facilities (EMF) INSPIRE themes. (see above for security aspects)

2.3.2.5 Mineral Resources

The Mineral resources data (MR) theme refers to the description of natural concentrations of very

diverse mineral resources of potential or proven economic interest. Mineral resources are used in

various domains [R2] :

Management of resources and exploitation activities: Providing information on inventoried

mineral resources.

Environmental impact assessments: mapping and measuring environmental geological

parameters at desk, in the field and in laboratory, for assessing geological material to be used

for construction and rehabilitation at the mine site.

Mineral exploration: the quantitative assessment of undiscovered mineral resources, the

modelling of mineral deposits, the mapping of lithological areas and units potentially hosting

mineral deposits, the use of by-products from natural stone quarrying as "secondary

aggregates" or as raw material for other industries.

Promotion of private sector investment: providing geodata and services for mining and

exploration companies.

2.3.2.5.1 Mineral resources and Rare Earth Elements

Rare Earth Elements (REE) are a group of critical raw materials.

As emerged from the EGDI-Scope and Minerals4EU projects, rare earths are critical to many defense, energy, and other high-tech products. China today controls about 95 % of the world’s REE production, making the sustainable supply of these elements to European industry highly vulnerable and therefore a concern for the national security of Member States. [A4]

2.3.2.6 Utility and Governmental Services

The INSPIRE Utility and Governmental Services (US) theme includes both utility networks (such as

electricity, oil & gas, sewage, telecommunication and water networks) and administrative/social

governmental services (such as public administrations, civil protection sites, schools and hospitals).

2.3.2.6.1 Utility networks

The scope of this sub-theme covers 6 distinct categories of network:

Water Network,

Sewer Network,

Electricity Network,

Oil, Gas & Chemicals Network,

Thermal Network,

Telecommunications.

The utility networks sub-theme overlaps other INSPIRE themes such as Hydrography, Production and industrial facilities, and Energy resources. [R4]

The INSPIRE Directive states that “in every particular case, the public interest served by disclosure shall be weighed against the interest served by limiting or conditioning the access” [R5]. In some instances, spatial datasets covered by the INSPIRE Directive may therefore not be made available to the public, for example when these datasets are linked to public security or when they belong to a third party that does not give permission for re-use.

Utility networks are a particularly major concern for public security. On the gas distribution networks of the Member States alone, numerous incidents of damage occur every day, sometimes with very serious consequences for the safety of both workers and residents and for the protection of the environment and the economy. At the same time, utility network datasets have significant business value. Data on pipelines are the property of their operators, often private companies, and cannot be used without permission.

These datasets are therefore particularly interesting in the case of emergency and geo-hazard risk

management. Utility data must be quickly available for disaster-response personnel, but confidentiality

of these data should be guaranteed to the organisations to which they belong. Otherwise these

organisations might be reluctant to share them.

2.4 Services

2.4.1 Introduction

The INSPIRE directive also applies to spatial data services through which it is possible to access or use

the data described in the previous section.

These services (called geodata services, or network services), are operations that can be executed on

the Web thanks to software applications managing geospatial data and/or geospatial metadata. These

services are:

Discovery,

Download,

View,

Transformation.

INSPIRE makes available a number of technical guidance documents intended to help government and public bodies make their information available using standards such as OGC services. The recommended standards include:

OGC Catalogue Service for the Web (CSW) for discovery,

OGC Web Map Service (WMS) for viewing,

OGC Web Feature Service (WFS) for direct access download,

ATOM or WFS for pre-defined dataset download.

Figure 1 - INSPIRE network services ([R8])

These INSPIRE architecture services form the building blocks of some of the services offered by a Geodata Cloud Service Provider. The combination of such services forms a SaaS application. Moreover, a Geodata Cloud Service Provider can provide the solution as PaaS, allowing consumers to deploy their own applications. An example is given by the FP7/CIP InGeoCloudS project, which provides an open-source, cloud-based platform (PaaS) with Geodata services (SaaS). [A4]

These Geodata cloud services can be divided into:

Publication services,

Access services,

Computation services,

Recovery services.

We describe these services in general terms in the following sections. The demonstrator description section (Section 2.6) shows in more detail the data workflows that we will have to consider in CLARUS.

2.4.2 Publication Services

The data publication use case gathers services for the publication of geospatial datasets [R8]. These services make it possible to:

Install custom applications in the cloud,

Upload datasets into the cloud,

Publish the datasets through OGC/INSPIRE services,

Manage metadata.

2.4.2.1 Custom Application

The application providers have the capability to install their custom application on the cloud.

The cloud provider delivers a computing platform (PaaS), including operating system, programming

language execution environment, database, and web server.

Two kinds of applications can be installed by application providers on the cloud:

A Web application dedicated to the domain of publication of geo-referenced data addressed by

the application provider.

An application utility dedicated to the synchronization of the dataset stored on the cloud with

the dataset stored on premises.

2.4.2.2 Data Import / Synchronization

The data providers have complete knowledge of their datasets and select the data to push into the cloud. Each data provider has a dedicated and secured storage space on the file system and on the database server. Only the owner of the data can access it.

The data providers also have complete control over the procedures used to manage their data in the cloud, deciding how and when data are pushed, updated or deleted. The cloud solution provides basic features (SaaS) to help data providers:

Services to manually manage or synchronize their datasets from their own premises (FTP, FTPS, SCP, SFTP),

Services to register synchronization tasks running on the cloud.

2.4.2.3 Service Provisioning

This use case defines how data providers publish the datasets through interoperable OGC/INSPIRE

compliant services (SaaS). They create or edit layers, configure Web Map Service (WMS) for raster data

and/or Web Feature Service (WFS) for vector data.

The data providers define access rights to their published data for critical, confidential or marketable

data.

2.4.2.4 Metadata Management

Metadata allow data providers to describe their data, maps and services. Metadata are published and

used by data consumers who are looking for particular datasets and services.

These metadata are available and managed through a so-called Catalogue compliant with interoperable

OGC/INSPIRE CSW service.

Standard tools for managing and exposing discovery services (SaaS) include Geonetwork, Geosource,

Deegree, etc.

2.4.3 Access Services

2.4.3.1 Search

Discovery Services allow GIS software to search for spatial datasets and spatial data services on the basis of the content of the corresponding metadata, and to display the metadata content.

These operations are performed through HTTP(S), following the CSW or OpenSearch standards.

2.4.3.2 View

GIS client applications invoke interoperable OGC/INSPIRE compliant services (SaaS) to display, navigate,

zoom in/out, pan, or overlay spatial datasets and display legend information and any relevant content of

metadata. They retrieve layers (raster data) and legend through Web Map Service (WMS, WMS-C,

WMTS) and/or features (vector data) through Web Feature Service (WFS).
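
As an illustration of how a GIS client retrieves a map image, the following sketch builds a WMS `GetMap` request in key-value-pair form. It assumes WMS version 1.3.0; the endpoint `https://example.org/wms` and the layer names are hypothetical, and the helper `wms_getmap_url` is not part of any CLARUS component.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width, height,
                   crs="EPSG:4326", image_format="image/png",
                   version="1.3.0"):
    """Build a WMS GetMap URL (key-value-pair encoding).

    bbox is (min1, min2, max1, max2) in the axis order of the CRS.
    Note that in WMS 1.3.0 the axis order of EPSG:4326 is latitude first.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": version,
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                      # empty value = default styles
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer names, for illustration only.
example_url = wms_getmap_url(
    "https://example.org/wms",
    layers=["geology", "boreholes"],
    bbox=(41.0, -5.0, 51.5, 10.0),
    width=800,
    height=600,
)
```

The server answers such a request with a rendered image, so the client never receives the underlying raw data; this is one reason why view services and download services raise different confidentiality questions.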

2.4.3.3 Download

Download Services (SaaS) enable copies of complete spatial datasets, or of parts of such sets, to be

downloaded. Download Services could be of two types:

Pre-defined dataset download service(s): A pre-defined dataset download service provides for the simple download of pre-defined datasets (or pre-defined parts of a dataset) with no ability to query datasets or select user-defined subsets of datasets. A pre-defined dataset or a pre-defined part of a dataset could be (for example) a file stored in a dataset repository, which can be downloaded as a complete unit with no possibility of changing its content, whether the encoding, the CRS of the coordinates, etc. Pre-defined datasets are usually downloaded through INSPIRE compliant services (SaaS) such as WFS or ATOM (optionally extended with GeoRSS or OpenSearch).

Direct access download service(s): A direct access download service extends the functionality of

a pre-defined dataset download service to include the ability to query and download subsets of

datasets. The direct access download service allows more control over the download than the

simple download of a pre-defined dataset or pre-defined part of a dataset. In this case, the

spatial information is typically stored in a repository (e.g. a database) and only accessible

through a middleware data management system (although the precise implementation may

vary). The query can be based upon spatial or temporal criteria, or by specific properties of the

instances of the spatial object types contained in the repository. Direct access download is

usually done through OGC/INSPIRE compliant WFS service (SaaS).
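
A direct access download can be sketched as a WFS `GetFeature` request that restricts the result by a bounding box and a maximum number of features. The sketch assumes WFS version 2.0.0 parameter names (`TYPENAMES`, `COUNT`); the endpoint and the feature type name `ge:Borehole` are hypothetical.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url, type_name, bbox=None, count=None,
                       version="2.0.0"):
    """Build a WFS GetFeature URL (key-value-pair encoding).

    bbox:  optional (min1, min2, max1, max2) spatial filter
    count: optional upper limit on the number of returned features
    """
    params = {
        "SERVICE": "WFS",
        "VERSION": version,
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
    }
    if bbox is not None:
        params["BBOX"] = ",".join(str(v) for v in bbox)
    if count is not None:
        params["COUNT"] = count
    return f"{base_url}?{urlencode(params)}"
```

Calling the function without `bbox` and `count` requests the whole feature type, which is closer in spirit to a pre-defined dataset download; adding the filters selects a user-defined subset.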

2.4.4 Computation Services

2.4.4.1 Transformation Services

Transformation services (SaaS) enable spatial datasets to be transformed with a view to achieving

interoperability. Examples of transformations are:

Data format transformation. It is used, for example, to transform a dataset in a data provider’s proprietary format into a dataset in a standard format such as GML.

Coordinate reference system transformation. Examples of coordinate systems frequently used in geospatial datasets are the “Universal Transverse Mercator” coordinate system, the “British national grid” reference system or the “United States National Grid”.
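
A first step of many coordinate reference system transformations towards UTM is to determine the target zone. The following minimal sketch derives the UTM zone number, its central meridian and the matching EPSG code of the WGS 84 / UTM family from a longitude and latitude. It deliberately ignores the special zones around Norway and Svalbard and does not perform the projection itself; the function names are chosen for this example only.

```python
def utm_zone(lon_deg):
    """UTM zone number (1 to 60) for a longitude in degrees.

    Standard 6-degree zones only; the Norway/Svalbard exceptions
    are not handled in this sketch.
    """
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude must be within [-180, 180] degrees")
    return min(int((lon_deg + 180.0) // 6) + 1, 60)


def utm_central_meridian(zone):
    """Central meridian of a UTM zone, in degrees east."""
    return zone * 6 - 183


def utm_epsg_code(lon_deg, lat_deg):
    """EPSG code of the WGS 84 / UTM zone covering the location.

    326xx is used for the northern hemisphere, 327xx for the southern.
    """
    base = 32600 if lat_deg >= 0 else 32700
    return base + utm_zone(lon_deg)
```

For a location near Paris (longitude 2.35, latitude 48.85) this gives zone 31, central meridian 3° E and EPSG code 32631. An actual transformation service would then hand such a code to a projection library.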

Data and/or Application providers provision, configure and publish transformation services in order to

allow data consumers to retrieve their geospatial datasets in a format that is compliant to a specific

standard or their own application needs.

Application providers:

o Deploy transformation services implementation to the cloud,

o Publish transformation services and related metadata.

Data providers:

o Manage transformation configurations (source and target data schema, mapping

definition) and related metadata

o Publish transformation configurations and related metadata

The implementation of transformation services relies on geospatial domain standards (RIF, GML) and/or on other standards (XSD, XSLT, OWL (RDF), SQL). Commercial solutions can also be used (e.g. Talend).

Data consumers invoke a transformation service in order to retrieve datasets provided by data providers in a format that conforms to their specific needs.

Data consumers:

Gather information about available transformation services

Invoke a dataset transformation service providing

o Source dataset and schema (by reference or by value)

o Target dataset destination and schema

Retrieve transformed dataset

Applicable standards: WSDL, WADL, WS-Addressing.

2.4.4.2 GeoProcessing / Invocation Services

Geospatial data often need to be processed before the information can be used effectively. The

geographic calculations run on the cloud and are provided as a service (SaaS). Web Processing Service

(WPS) is an OGC standard which provides rules for standardizing the implementation of geographic

calculations ("processes") as a Web Service.

The WPS standard defines three types of operations:

GetCapabilities: describes the contents and properties of the service, and lists the available processes and the supported operations and methods,

DescribeProcess: provides the description of a process (description of inputs and outputs, i.e. requests and responses),

Execute: invokes the process/calculation with the supplied input parameters and returns the corresponding result.
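
The three operations can be sketched as HTTP GET requests in key-value-pair form. The sketch assumes WPS version 1.0.0, where `Execute` inputs may be passed in a `DataInputs` parameter as `name=value` pairs separated by semicolons; many deployments use an XML POST body for `Execute` instead. The endpoint, the process identifier `kriging` and its input names are hypothetical.

```python
from urllib.parse import urlencode

WPS_OPERATIONS = ("GetCapabilities", "DescribeProcess", "Execute")


def wps_request_url(base_url, request, identifier=None, data_inputs=None,
                    version="1.0.0"):
    """Build a WPS request URL (key-value-pair encoding, WPS 1.0.0)."""
    if request not in WPS_OPERATIONS:
        raise ValueError(f"unknown WPS operation: {request}")
    params = {"service": "WPS", "request": request}
    if request != "GetCapabilities":
        if identifier is None:
            raise ValueError("a process identifier is required")
        params["version"] = version
        params["identifier"] = identifier
    if request == "Execute" and data_inputs:
        # Inputs are encoded as name=value pairs joined by semicolons.
        params["DataInputs"] = ";".join(
            f"{name}={value}" for name, value in data_inputs.items()
        )
    return f"{base_url}?{urlencode(params)}"
```

A client would typically call `GetCapabilities` once to discover the processes, `DescribeProcess` to learn the expected inputs of one process, and then `Execute` with concrete values.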

2.4.4.2.1 Kriging Example

Kriging (named after D. Krige) is an example of a widespread geographic calculation that could be implemented via WPS. More precisely, Kriging is a spatial interpolation method, “which relies on the fact that as distance between points increases, their similarity, defined by the covariance or correlation between points, decreases.” [R10]

2.4.4.2.1.1 Principles

Let us consider a Regionalized Variable (Z), e.g. the concentration of some mineral in a geographical

zone, the temperature in a region, etc. Given the value of Z at a set of sample points (𝑆1 … 𝑆𝑛), a spatial

interpolation method aims at predicting its value at any other point (𝑆0) of the region. Broadly speaking,

the principle is to weight the sample points:

$$Z(S_0) = \sum_{i=1}^{n} w_i \, Z(S_i)$$

Figure 2 - Kriging interpolation formula

When using interpolation in geography, we assume that locations that are close to each other tend to

be more similar than locations that are far apart. The weight (𝑤𝑖) given to values measured at distant

points will therefore be less than the weight given to values measured at points near the prediction

location.

For instance we can consider the weight to be directly a function of distance (or inverse distance)

between the prediction location and the sample points. This deterministic interpolation is called Inverse

Distance Weighting (IDW).
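
A minimal sketch of IDW in plain Python is given below. It uses weights proportional to the inverse of the distance raised to a power; the default power of 2 is a common choice but an assumption here, as are the function name and the sample layout (a list of `((x, y), value)` pairs).

```python
import math


def idw_predict(samples, target, power=2.0):
    """Inverse Distance Weighting prediction at the target location.

    samples: list of ((x, y), value) pairs with measured values
    target:  (x, y) location where the value is predicted
    power:   exponent applied to the distance (assumed default: 2)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in samples:
        distance = math.dist(point, target)
        if distance == 0.0:
            # The target coincides with a sample: return its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    # Dividing by the total weight normalizes the weights to sum to 1.
    return weighted_sum / weight_total
```

With two samples of value 10 at (0, 0) and 20 at (2, 0), the prediction halfway between them is 15, and it moves towards 10 as the target approaches the first sample.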

Unlike IDW, Kriging also takes into account the statistical relationships of the measured points between

themselves (i.e. autocorrelation) – using statistical methods. Kriging weights are based not only on the

distance between the measured points and the prediction location, but also on the overall spatial

arrangement of the measured points. This is particularly useful when measurements are spatially

correlated or when they show a directional bias (e.g. N-S vs. W-E), as is often the case in geology and

soil science.

Kriging is a multi-step process:

The first step (“Variography”) is to quantify and depict the spatial autocorrelation of the measured sample points, i.e. to express the degree of relationship between them. To this end, Kriging uses the semi-variance, which is simply half the variance of the differences between all possible points spaced a constant distance apart.

γ(si,sj) = ½ var(Z(si) - Z(sj))

Figure 3 - Kriging semi-variance formula

The relationship between semi-variance and distance can be shown in a graph called a semi-variogram.
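
As an illustration, the experimental semi-variance can be estimated per distance class (lag) as half the mean squared difference between the values of all pairs of sample points whose separation falls in that class. The sketch below is a minimal pure-Python version; the function name, the binning rule and the sample values are assumptions made for this example.

```python
import math
from collections import defaultdict


def experimental_semivariogram(samples, lag_width):
    """Return {lag_index: semi-variance}.

    Lag index k groups the pairs whose distance lies in
    [k * lag_width, (k + 1) * lag_width).
    """
    squared_diff_sums = defaultdict(float)
    pair_counts = defaultdict(int)
    for i in range(len(samples)):
        (xi, yi), zi = samples[i]
        for j in range(i + 1, len(samples)):
            (xj, yj), zj = samples[j]
            lag = int(math.hypot(xi - xj, yi - yj) // lag_width)
            squared_diff_sums[lag] += (zi - zj) ** 2
            pair_counts[lag] += 1
    # Semi-variance: half the mean squared difference within each lag class.
    return {
        lag: squared_diff_sums[lag] / (2 * pair_counts[lag])
        for lag in sorted(pair_counts)
    }


# Three illustrative sample points on a line, one distance unit apart.
samples = [((0.0, 0.0), 1.0), ((1.0, 0.0), 3.0), ((2.0, 0.0), 7.0)]
semivariogram = experimental_semivariogram(samples, lag_width=1.0)
```

Plotting the returned values against distance gives the experimental semi-variogram.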

Once the squared difference between the values for all pairs of locations is plotted on the experimental variogram, the next step (“Spatial Autocorrelation Modeling”) is to fit a model through them. Indeed the experimental variogram, being irregular, cannot be used directly for calculating the Kriging weights. Instead a smooth mathematical function (model) must be used, e.g. Spherical, Exponential, Circular, Gaussian, Linear, etc. Even though they are different, these variogram models share certain characteristics (namely the range, the sill, and the nugget).

Figure 4 - Illustration of semivariogram components [R11]
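
To make the role of the three parameters concrete, here is a sketch of the Spherical model. It assumes the common convention in which `sill` denotes the total sill (nugget included) and `range_` the distance at which the sill is reached; the parameter values are invented for the example.

```python
def spherical_variogram(h, nugget, sill, range_):
    """Semi-variance predicted by a spherical model at distance h."""
    if h == 0:
        return 0.0  # no variance between a point and itself
    if h >= range_:
        return sill  # beyond the range, points are no longer correlated
    ratio = h / range_
    # Rises from the nugget (for h just above 0) to the sill (at h = range_).
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)


# Illustrative parameters: nugget 1, sill 5, range 10 distance units.
half_range_value = spherical_variogram(5.0, nugget=1.0, sill=5.0, range_=10.0)
```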

The last step (“Prediction”) is to predict the unknown values, either at a specific location (single point) or for a continuous surface (map grid).

In the case of a prediction for a specific location: the semi-variance values between the prediction point

(𝑆0) and the surrounding sample measurements (𝑆1 … 𝑆𝑘) (sample subset within a given search radius)

are computed using the variogram model; then the prediction value is calculated from a series of linear

equations (cf. Kriging document [R10]).
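
The "series of linear equations" can be illustrated with ordinary Kriging, one common variant; the source does not state which variant is meant, so this choice is an assumption, as are the linear variogram model and the sample values. The system constrains the weights to sum to 1 through a Lagrange multiplier, and the prediction is the weighted sum of the sample values.

```python
import math


def linear_variogram(h):
    """Illustrative model: semi-variance grows linearly with distance."""
    return h


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def ordinary_kriging(samples, target, variogram):
    """Return (prediction at target, weights) for samples ((x, y), z)."""
    n = len(samples)
    points = [p for p, _ in samples]
    # Semi-variances between sample points, bordered by the sum-to-one constraint.
    matrix = [
        [variogram(distance(points[i], points[j])) for j in range(n)] + [1.0]
        for i in range(n)
    ]
    matrix.append([1.0] * n + [0.0])
    # Semi-variances between each sample point and the prediction point S0.
    rhs = [variogram(distance(p, target)) for p in points] + [1.0]
    solution = solve(matrix, rhs)
    weights = solution[:n]  # the last unknown is the Lagrange multiplier
    prediction = sum(w * z for w, (_, z) in zip(weights, samples))
    return prediction, weights


samples = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
prediction, weights = ordinary_kriging(samples, (1.0, 0.0), linear_variogram)
```

With a variogram that is zero at distance zero, the prediction at a sample location reproduces the measured value, which is a useful sanity check for any implementation.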

Actually Kriging uses the same data twice: the first time to estimate the spatial autocorrelation of the

dataset (variography + spatial modeling) and the second time to make the prediction.

2.4.4.2.1.2 Implementation

Kriging can be implemented with different technical solutions. An example of a WPS Kriging

Execute operation will be described in order to demonstrate the process. Please refer to section

2.6.2.2 for more details.

2.4.5 Exploitation & Operation Services

The data providers have confidence in a geo publication solution running on the cloud if integrity of

their geospatial data and availability of the provided geo-services are preserved.

Data recovery, monitoring and user management must be addressed in order to achieve an acceptable

level of quality.

2.4.5.1 Back-Up and Data Recovery

Minimizing data loss and improving performance are opposing goals. CSPs usually provide different

services (IaaS) to target those goals:

object storage services (e.g. S3 on Amazon, Swift on OpenStack) are designed to reduce the risk

of data loss,

disk storage services (e.g. Instance Store or EBS on Amazon, Ephemeral Disk or Cinder on

OpenStack) focus on performance (high I/O).

The geo publication solution must rely on both kinds of services to achieve an acceptable performance

level while avoiding data loss.

Disk storage services (IaaS) improve performance so that client requests are served with a

good response time. However, even if CSPs offer solutions to minimize the loss of data stored on disk (e.g.

Amazon EBS volumes are tied to one data center and are automatically replicated within that data

center), they do not provide turnkey solutions to ensure data recovery (e.g. EBS volumes tied to a data

center are lost if the whole data center fails).

The geo publication solution must define backup and data recovery procedures that rely on the object

storage service (IaaS) provided by the CSP (or on another solution independent of the CSP) [R19].

Object storage services (IaaS) are suitable for storing backups because they replicate data to

minimize data loss. Some CSPs go further with replication across multiple data centers (e.g. Amazon S3

replicates data across multiple data centers in the same region). An alternative is to rely on object storage

services provided by other CSPs.

The geo publication solution must at least back up all data providers’ data and optionally provide a

service allowing data providers to explicitly back up and restore their datasets.

2.4.5.2 Monitoring

CSPs (IaaS) do not ensure high availability of the services provided by applications running on the cloud.

A service may fail for any reason: application bug (e.g. disk full), cyber attack, CSP hardware failure or

CSP maintenance task, etc.

CSPs usually provide SLAs with redundant IaaS services running in multiple data centers to achieve high

availability. However, customers do not automatically benefit from this strategy for their applications. The

architecture of the geo publication solution must be adapted, following the best-practice guides

provided by the CSP.

If the geo publication solution does not achieve high availability, monitoring services should allow the IT team to

react as soon as possible and restart the faulty service. The IT team must rely on the monitoring and

support services provided by the CSP [R21]:

Health dashboard publishes information on availability of the CSP’s services (IaaS),

Monitoring service for cloud resources provisioned by the geo publication solution running on

the cloud (SaaS),

Support service must notify the IT team (e.g. by mail) of failure events and scheduled

maintenance tasks that will imply stopping the service,

Support service must also help the IT team, according to the subscribed level of support.

In order to achieve monitoring of the services provided by the geo publication solution, the IT team

must use a dedicated application that relies on specific metrics [R20].

Moreover, when the geo publication solution acts as a PaaS or SaaS platform managing the

geospatial data of multiple data providers, sharing the costs due to the cloud resources provisioned on

the underlying IaaS platform is difficult. Costs should be shared according to the usage by each data

provider of the geo publication solution: storage (e.g. volume of data), access services (e.g. number of

client requests), computation services (e.g. custom geo processing) [R19].
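
A usage-based split could, for instance, be computed as follows. This is only a sketch under stated assumptions: the metric names, the share of the total bill attributed to each metric (`weights`) and the usage figures are invented for the example and are not prescribed by [R19].

```python
def share_costs(total_cost, usage, weights):
    """Split total_cost between providers in proportion to their usage.

    usage:   {provider: {metric: measured value}}
    weights: {metric: fraction of total_cost attributed to that metric}
    """
    totals = {m: sum(u[m] for u in usage.values()) for m in weights}
    shares = {}
    for provider, provider_usage in usage.items():
        shares[provider] = sum(
            total_cost * weights[m] * provider_usage[m] / totals[m]
            for m in weights
            if totals[m] > 0
        )
    return shares


# Assumed cost structure of the monthly IaaS bill.
weights = {"storage_gb": 0.5, "requests": 0.3, "compute_hours": 0.2}
usage = {
    "provider_a": {"storage_gb": 300, "requests": 1000, "compute_hours": 30},
    "provider_b": {"storage_gb": 100, "requests": 3000, "compute_hours": 10},
}
shares = share_costs(1000.0, usage, weights)
```

With these figures, provider_a pays for 75% of the storage part, 25% of the request part and 75% of the computation part of the bill.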

2.4.5.3 User Authentication and Authorization

Data providers expect the geo publication solution to authenticate users and authorize access to data

and services according to their needs in terms of security. The geo publication solution must therefore

provide an authentication mechanism.

Although the geo publication solution usually integrates multiple software components, the end user

should have to authenticate only once and then be able to access all the services and applications (according

to his or her rights). The geo publication solution must therefore provide an authorization mechanism that

implements single sign-on (SSO). Applicable standards are OAuth and SAML.

Data providers may need to audit access to their data or to restrict access to their data to

authorized people. The geo publication solution must therefore control and log access to the services

that operate on the data provider’s data.
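
The "control and log" requirement can be pictured with a minimal sketch in which every access decision is recorded, whether it is granted or denied. The permission table, the user and dataset names and the logger name are hypothetical; a real deployment would obtain the user identity from the SSO mechanism rather than from a function argument.

```python
import logging

# Hypothetical logger name; handlers and storage of the audit trail are
# deployment choices outside the scope of this sketch.
audit_log = logging.getLogger("geopub.audit")

# Hypothetical permission table: (user, dataset) -> allowed actions.
PERMISSIONS = {
    ("alice", "minerals"): {"read", "write"},
    ("bob", "minerals"): {"read"},
}


def check_access(user, dataset, action):
    """Decide whether the action is allowed and record the decision."""
    allowed = action in PERMISSIONS.get((user, dataset), set())
    audit_log.info(
        "user=%s dataset=%s action=%s allowed=%s", user, dataset, action, allowed
    )
    return allowed
```

Logging denied attempts as well as granted ones is what makes the trail usable for auditing.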

Moreover, a data provider that needs to audit access or to restrict access to some features of its

application has to integrate its application with the authentication mechanism (SSO) provided by the

geo publication solution.

2.4.5.4 User Management

User authentication and authorization require a service that lets the IT team manage users and

their rights [R21].

In some cases, the geo publication solution may also provide restricted access to user management to

allow data providers to manage users’ permissions on their data and on their application.

2.5 Security Expectations

When considering the use of a geospatial data infrastructure, it is essential that a certain level of

confidence is guaranteed to both data providers and users. This level of confidence relies mainly on

security measures.

As a prerequisite, we should therefore assess which security measures are required. The aim of this

section is to sum up security requirements coming from various projects representative of the

“publication of geo-referenced data on the internet” domain.

These requirements cover three main aspects:

The geospatial data,

The geospatial services,

The protection of personal data.

We should note that confidence also relies on transparency. Transparency means that the user finds it

easy to answer questions such as "- where are my data?"; "- what do the services do?"; "- how is the

privacy of my personal data guaranteed?"

2.5.1 Expectations regarding geospatial data

The users need to have confidence in the environmental datasets they can access. This confidence

derives to a great extent from guarantees given about the security of the data:

To the user – the data are not altered (data integrity, authenticity),

To the data provider – the data are protected (access control).

2.5.1.1 Preserving accessibility and protecting data against alteration

In the case of public data, the security requirements are limited because there is no need to restrict data

access for some people. We should nevertheless be careful about threats to data accessibility which can

seriously damage the business and affect the image of the organization and/or institution.

Protecting data accessibility implies being able to ensure the authenticity and integrity of the data (e.g.

give guarantees about data integrity using hash codes). Indeed the data that are not protected can

easily be altered in an undetectable way. This protection is applicable to all data, including public data.

Possible solution(s) are:

Implementing Data Origin Authentication,

Ensuring traceability: it is not enough to know that data has been changed; it is necessary to

know whether it has been modified by an authorized person, through authentication and authorization

mechanisms.
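
As a sketch of the hash-code approach mentioned above, a dataset can be published together with its SHA-256 digest so that any alteration becomes detectable, and a keyed hash (HMAC) can additionally tie the data to the holder of a secret key. The key and the data are placeholders. Note that an HMAC only convinces parties who share the key; a digital signature would be needed for verification by third parties.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Integrity: a hash code that changes if the data is altered."""
    return hashlib.sha256(data).hexdigest()


def sign(data: bytes, key: bytes) -> str:
    """Origin authentication between parties sharing `key` (keyed hash)."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, key: bytes, tag: str) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(sign(data, key), tag)


# Placeholder dataset and key, for illustration only.
dataset = b"borehole_id;x;y;grade\n42;2.35;48.85;0.7\n"
shared_key = b"example-key-do-not-reuse"
published_digest = digest(dataset)
published_tag = sign(dataset, shared_key)
```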

2.5.1.2 Publish access limitations

In order to improve confidence in their data, data providers are recommended to define and publish information on

the legal aspects and on restrictions on the use of data.

2.5.1.2.1 Using metadata to define access limitations

Geospatial metadata allow the users to retrieve the datasets or services that best fit their needs. They

may contain legal information and information on the restrictions on the use of data.

According to the INSPIRE Metadata implementing rules*, ISO 19115 provides a general mechanism for

documenting different categories of constraints applicable to the resource or its metadata. There are

two major requirements expressed in the Directive in terms of documentation of the constraints as part

of the metadata:

The limitations on public access: the Member States may limit public access to spatial datasets

and spatial data services in a set of cases defined in Article 13. These cases include public

security or national defense, i.e. more generally the existence of a security constraint.

The conditions applying to access and use of the resource, and where applicable, the

corresponding fees (Articles 5-2(b) and 11-2(f)).

2.5.1.2.1.1 Limitations on public access (AccessConstraint)

The contents of this property must be an XML fragment corresponding to the AccessConstraints element

as defined in the OGC WMS 1.1.1 DTD.

Metadata restriction code list (source: ISO 19115)

| MD_RestrictionCode | Limitation(s) placed upon the access or use of the data |
| --- | --- |
| copyright | exclusive right to the publication, production, or sale of the rights to a literary, dramatic, musical, or artistic work, or to the use of a commercial print or label, granted by law for a specified period of time to an author, composer, artist, distributor |
| patent | government has granted exclusive right to make, sell, use or license an invention or discovery |
| patentPending | produced or sold information awaiting a patent |
| trademark | a name, symbol, or other device identifying a product, officially registered and legally restricted to the use of the owner or manufacturer |
| license | formal permission to do something |
| intellectualPropertyRights | rights to financial benefit from and control of distribution of non-tangible property that is a result of creativity |
| restricted | withheld from general circulation or disclosure |
| otherRestrictions | limitation not listed |

2.5.1.2.1.2 Conditions applying to access and use (useLimitation)

This free text metadata is used to describe:

Terms and conditions, including where applicable, the corresponding fees that shall be provided

through this element,

A link (URL) where these terms and conditions are described.

To our knowledge, there is no widespread technical solution that enforces access limitations from

metadata declarations; ad hoc implementations are used.

2.5.1.3 Protecting data confidentiality

This security measure aims at ensuring that the content of the information remains secret, except for

authorized people. In order to ensure the confidentiality of a dataset, we should define who has the

right to access the corresponding data, divide users into different groups or classes, and assign a role to

these groups.

A particular type of data whose confidentiality should always be protected is personal data. But as

geological data cannot be qualified as personal data, the question is: are there geological data for which

confidentiality must be ensured? The answer is yes, at least for the personal data relating to

geological data, that is: who has read or downloaded a particular dataset, and who has used a particular

service.

There are several ways to ensure confidentiality, among which:

Access control,

Cryptography.

2.5.1.4 Ensuring data quality by assessing authenticity of sources

Trust in the geospatial data also depends on guarantees about the quality (i.e. how the user can ensure

that the data is reliable, of good quality and corresponds to the expected use), which can be derived from

metadata. However it can be difficult for non-expert users to assess the quality of geospatial data

directly from metadata, as metadata contain little or no information on the expected use of data.

On the other hand, quality of geospatial data can be assessed through the use of authoritative data and

the ability to check the authenticity of sources. Quality is assumed when for instance the data come

from government agencies responsible for collecting data (e.g. national geological surveys). The concept

of authentic sources is related to one of the basic principles of INSPIRE: collect the data once at the most

suitable place, and re-use the data multiple times [R12].

2.5.2 Expectations regarding geospatial services

In order to set up a security policy for the publication of geo-referenced data, we must pay attention to

services – and more precisely to service continuity and to access management.

The security policy developed for the data needs to be extended to the services. It is important to

ensure continuity and to protect services against (distributed) Denial of Service attacks, power failures

and other external incidents.

2.5.2.1 Protecting services against unauthorized access

Data providers need to control the dissemination of their data in the geospatial value chain.

Access management is essential: it helps ensure that only authorized individuals have access to the

service and can use the service for the purposes identified as part of their role.

2.5.2.1.1 Implementing GeoRM services to manage rights

The INSPIRE directive specifies payment regulations for accessing spatial services provided by public

authorities (article 14) and mentions the use of services for electronic commerce, licensing, and other

mechanisms to ensure the preservation of rights and security of transactions.

Ideally, implementations of geospatial data infrastructure should follow open standards defined by the

OGC, ISO and INSPIRE. These standards include the OGC specification of GeoDRM architecture (digital

rights management), for geographic information. This reference model aims at simplifying the

management and protection of intellectual property on geospatial data.

Examples of features offered by GeoRM services are:

Authorization,

Authentication,

Pricing,

Billing,

Licensing access to the data (limited in time, spatial extent, specific position, particular user,

etc.).

The model makes it possible to adapt licensing to different types of relationships between participants (direct

licensing, indirect licensing, B2B, B2C, licensing for WMS and WFS, with modules for RM, REL,

encryption, license verification, authentication, authorization, etc.).

Authentication standards and rights management are not in the CLARUS scope; however, the GeoRM

specification is important to consider in parallel when securing geospatial services.

2.5.2.2 Publish access limitations

The access constraints for the data can be extended to the geospatial services, e.g. WMS, WFS or WPS

services.

2.5.2.2.1 Using service metadata to define access limitations

Geospatial metadata allow the users to retrieve the services that best fit their needs. They may contain

legal information and information on the restrictions on the use of a service, even though authorization

mechanisms based on such metadata are rarely implemented [R13].

For instance the wms-service-AccessConstraints element specifies any access constraints associated with

the WMS service. This information applies to the whole WMS instance, that is to say: it is not particular

to any layer or piece of data published by the WMS instance.

Metadata-based access limitations should be implemented by the application itself.
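
As an illustration of how an application might read these declarations before applying its own rules, the sketch below extracts the Fees and AccessConstraints elements from the Service section of a WMS 1.1.1 capabilities document. The capabilities fragment is an invented, heavily shortened example, and the function name is an assumption.

```python
import xml.etree.ElementTree as ET

# Invented, shortened WMS 1.1.1 capabilities document for illustration.
CAPABILITIES = """<WMT_MS_Capabilities version="1.1.1">
  <Service>
    <Name>OGC:WMS</Name>
    <Title>Example geological map service</Title>
    <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink"
                    xlink:href="https://example.org/wms"/>
    <Fees>none</Fees>
    <AccessConstraints>Restricted to registered users</AccessConstraints>
  </Service>
</WMT_MS_Capabilities>"""


def service_constraints(capabilities_xml):
    """Return the service-level fees and access constraints as declared."""
    service = ET.fromstring(capabilities_xml).find("Service")
    return {
        "fees": service.findtext("Fees"),
        "access_constraints": service.findtext("AccessConstraints"),
    }


constraints = service_constraints(CAPABILITIES)
```

The declared text is informative only: deciding what "Restricted to registered users" means in practice, and enforcing it, remains the responsibility of the application.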

2.5.2.3 Additional considerations

Service Level Agreement: The users depend on certain services in order to retrieve their data

(cf. INSPIRE); they therefore expect a certain level of service (SLA) from the Cloud Service Provider.

Ease of understanding: Trust in the geospatial services relies on ease of understanding. Ease of

understanding means that security should not complicate the user experience. Security and

user-friendliness should be properly balanced, i.e. a security policy should not deter potential

users from using the service.

Using secure communication protocols: In this use case, Cloud Service Providers provide

secured communication protocols to manage metadata.

2.5.3 Expectations regarding personal data

Restricting access to a given set of data implies collecting and processing personal data. More precisely:

information on the persons authorized to access this data or service, on the time at which they access this

data or service, on the volume they used, etc.

Data consumers: It is important for data consumers to know where the data that they want to

access come from, whether they can access it or not, and why. Furthermore, they want to

ensure that information about their identity and their use of the data will not be misused by the

data provider.

Data providers: It may be important for data providers to know who is using their data/services

and how they are used.

2.5.3.1 Ensuring identity data protection

Ensuring personal data protection may imply managing identity (through registration, identification,

authentication, and the administration of rights and privileges).

Managing identity could also lead to identity federation matters, i.e.:

Outsourcing identification and authentication processes to third parties,

Making life easier for users through a Single Sign-On for multiple services rather than using

many user IDs and passwords (cf. use case below).

In addition, any solution aiming at ensuring personal data protection should be harmonized with the EU’s

Data Protection Directive. Such directives may change over time, i.e. data which are considered public

may become private.

2.5.3.2 Ensuring access auditing

As stated before, knowing who is using their data/services and how they are used is an important

concern for data providers.

2.5.4 Expectations regarding Cloud-based hosting

The transition to the cloud leads to specific security risks that should be compared with risks linked to

information systems outsourcing. The CSPs traditionally implement their own governance rules with

limited visibility and choice for the customer/user. It is one of the main objectives of CLARUS to give

back control to the user.

2.5.4.1 Risks regarding data location

In Europe, the legal framework for personal data protection is based on the principle that it should

be possible at any time to control the data location (territoriality principle).

Yet in a public cloud service, getting this information is not possible. This raises the question of the

jurisdiction of courts and applicable law. The fact that it is not possible to perform audits hinders

control over the implementation of security measures.

Similarly, the transfer of personal data outside the EU’s borders is regulated. Without a consistent level of data

protection and guarantees as to the security measures implemented, data confidentiality is uncertain.

2.5.4.2 Risks of information system loss of control

Governance: by using the services of a cloud provider, the customer grants the provider full

control, including the management of security incidents.

Portability/Reversibility: cloud services do not always ensure reversibility of data, applications

or services. In these conditions it seems difficult to consider changing provider.

2.5.4.3 Risks related to multi-tenancy and resource sharing

Faulty isolation: resource separation mechanisms (storage, memory) may be faulty and integrity

and confidentiality of data compromised;

Incomplete or insecure deletion: there is no guarantee that the data is actually deleted or that

there are no other copies stored in the cloud.

2.6 Demonstration Cases for CLARUS

The aim of this section is to describe precisely those application cases that will be used as

demonstrators for CLARUS (especially in the frame of WP6 work), i.e. cases that will help in specifying

CLARUS, designing its architecture and validating it all along the project.

We have seen that the geopublication domain is wide and that numerous scenarios of use of datasets and

services can be identified. We have listed a wide range of security expectations, even if some of them

cannot be fully addressed by CLARUS as currently foreseen. We nevertheless try to focus on the most common

scenarios where cloud technologies are used and where CLARUS could bring breakthrough solutions to

important security expectations.

Both IaaS and PaaS delivery models are covered as well as situations where distinct CSPs can be

involved.

In particular three main cases are detailed, that could be mapped to the different CLARUS typical

scenarios:

Storage of geospatial data,

Publication and processing of geospatial data,

Collaboration on geospatial data.

For each of these cases, a four-sided perspective has been adopted, answering key questions about:

Why the corresponding demonstration case is relevant for both CLARUS and the geospatial

domain,

Who is likely to use it in a real-life context,

What are the sample data selected, their type and their sensitivity/criticality,

How the application case is usually designed and implemented.

2.6.1 Geodata storage in the Cloud

2.6.1.1 Purpose (Why?)

Exploiting the cloud’s storage facilities is crucial for the actors in the geospatial field, as the volume of

data they use (data with a spatial reference, usually called geodata) keeps growing

over time. In addition to storage capabilities, the cloud provides the users with ubiquitous access to and

sustainability of their data.

However securing the storage of geodata is a challenge.

For instance, applying ordinary measures to secure data access like disabling external connections to the

database could be very detrimental for the geospatial data users. Indeed geodata viewing, editing, and

analysis sometimes require the use of dedicated rich-client applications (e.g. QGIS) running on premises.

2.6.1.2 Actors (Who?)

People who create, interpret, and use geodata: geomatics/Geo-IT professionals, geodata and content

providers (e.g. public authority, national survey, territorial authority, etc.).

2.6.1.3 Data (What?)

The data that need to be secured in the typical context of a geospatial infrastructure are:

Geographical coordinates, and/or,

Scientific data (e.g. measurement value, measurement type, etc.).

For this demonstration case, any geospatial dataset requiring a certain level of security / privacy may fit

(cf. §2.2).

Example: Rare Earth and Minerals Resources Data.

This dataset includes information about the nature, genesis, location, extent, mining and distribution of

mineral resources, presence of rare earth, etc.

2.6.1.4 Design (How?)

In this typical case the data is accessed through a rich-client application supporting spatial databases

and GIS file formats (e.g. QGIS). A typical use case of QGIS is to extract data from a spatial database (e.g.

PostGIS) and convert them into a shapefile in order to use them in another Geographical Information

System.

The rich-client application runs on premises whereas Geodata are stored on the cloud.

Figure 5 - Geodata Storage Activity Diagram

Cloud storage is partitioned into multiple dataspaces in order to structure the hierarchy of geo-datasets.

A dataspace can be either stored on a file storage system (with one or more GIS files depending on the

GIS file format), or on a database server (as a spatial database).

GIS files: the rich-client application (e.g. QGIS) needs to access GIS files locally via the file

system, whereas file storage is on the cloud. Therefore a mount point (on Unix) or a virtual drive

(on Windows) must be created and mapped to the remote file storage server (AWS S3,

OpenStack Swift, Ceph or instances exposing a remote directory through NFS, GlusterFS or CIFS).

Spatial databases: the rich-client application directly accesses remote spatial databases

running on the cloud. Therefore the remote database server must support spatial databases

(PostgreSQL with PostGIS extension).

Settings of a dataspace include configuration specific to the type of storage or specific to GIS format but

also include security settings:

GIS file(s): in order to make management efficient, a dataspace is a directory where GIS file(s) is

(are) stored. The administrator manages the security settings of the GIS directories as if they were

located on premises, through the file system.

Spatial database: a dataspace is necessarily a spatial database on the database server. The

administrator manages the security settings of the spatial databases directly on the database

server.
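
For the GIS file case on a Unix system, managing a dataspace "through the file system" can be pictured with the sketch below. It assumes that the remote storage is already mounted under some root directory; the function names and the permission mode are illustrative choices (owner: full access, group: read-only, others: none).

```python
import os
import stat


def create_gis_dataspace(root, name, mode=0o750):
    """Create the directory backing a dataspace and set POSIX permissions.

    `root` is assumed to be the mount point of the remote file storage.
    Changing the owner (os.chown) usually requires administrator rights
    and is left out of this sketch.
    """
    path = os.path.join(root, name)
    os.makedirs(path, exist_ok=True)
    os.chmod(path, mode)
    return path


def group_is_read_only(path):
    """True if the group may read but not write the dataspace directory."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & stat.S_IRGRP) and not mode & stat.S_IWGRP
```

On Windows the same intent would be expressed with ACLs rather than POSIX permission bits.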

2.6.1.4.1 Dataspace management (use case)

The way the administrator creates, modifies or deletes a dataspace depends on the type of storage:

GIS file(s):

o Create dataspace: the administrator creates a dedicated directory (using any file tool),

then the administrator may configure specific settings if required by the GIS file

format,

finally the administrator sets the owner of the dedicated directory and sets

access rights for all users (POSIX permissions on Unix, ACLs on Windows). Access

rights can be read-only, write or none,

o Modify settings: the administrator can modify name, path and access rights of a GIS

directory,

o Delete dataspace: the administrator simply deletes the GIS directory dedicated to the

dataspace.

Spatial database:

o Create dataspace: the administrator creates a dedicated spatial database directly on the

database server (using any database administration tool),

then the administrator configures the policy of client authentication for the new database,

finally the administrator sets the owner of the spatial database and grants

privileges to other users. Access rights can be read-only, write or none,

o Modify settings: the administrator can modify any setting of a spatial database,

including the policy of client authentication and granted privileges,

o Delete dataspace: the administrator deletes the spatial database dedicated to the

dataspace directly on the database server (using any database administration tool). The

administrator also deletes the associated policy of client authentication.
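The GIS file variant of the create and delete steps above can be sketched as follows. This is a minimal illustration for a Unix file system only; the function names and the mapping of "none", "read-only" and "write" to permission bits are assumptions made for the example, not part of the CLARUS design. Real deployments would usually rely on groups or ACLs rather than permissions for all other users.

```python
import os
import stat
import tempfile

# Hypothetical mapping of the three access levels named in the text to
# POSIX permission bits (the owner always keeps full access).
ACCESS_MODES = {
    "none": 0o700,       # only the owner can enter the directory
    "read-only": 0o755,  # other users may list and read
    "write": 0o777,      # other users may also create and modify files
}


def create_dataspace(root: str, name: str, access: str) -> str:
    """Create a dedicated GIS directory and apply the requested access level."""
    path = os.path.join(root, name)
    os.makedirs(path, exist_ok=False)
    os.chmod(path, ACCESS_MODES[access])  # explicit chmod, not affected by umask
    return path


def delete_dataspace(path: str) -> None:
    """Delete the (empty) GIS directory dedicated to the dataspace."""
    os.rmdir(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        ds = create_dataspace(root, "boreholes", "read-only")
        print(oct(stat.S_IMODE(os.stat(ds).st_mode)))
        delete_dataspace(ds)
```

Setting the owner (for example with `os.chown`) is left out because it normally requires administrator privileges.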


2.6.1.4.2 Data management (use case)

2.6.1.4.2.1 Upload:

The way the provider uploads a dataset depends on the type of storage:

GIS file(s): the data provider copies the GIS file(s) directly to the GIS directory (using any file tool). The GIS file(s) are transparently transferred into the cloud storage.

Spatial database: through any database client tool, the data provider initializes the remote

spatial database with a database script that defines the schema of the database and/or contains

geodata.

2.6.1.4.2.2 Modify geodata:

The way the provider creates, modifies or deletes geodata of a dataspace depends on the type of

storage:

GIS file(s):

o Connection: through the rich-client application (e.g. QGIS), the data provider navigates in the file system to the GIS directory and selects the main GIS file to open,

the rich-client application opens the GIS file(s),

o View geodata: the data provider can browse, search and view all geodata stored in the

GIS file(s) using the rich-client application,

o Modify geodata: the data provider can modify geodata stored in the GIS file(s) using the

rich-client application,

o Delete geodata: the data provider can delete geodata stored in the GIS file(s) using the

rich-client application,

o Create geodata: the data provider can create new geodata in the GIS file(s) using the

rich-client application,

o Disconnection: through the rich-client application (e.g. QGIS), the data provider closes

the GIS file(s).

Spatial database:

o Connection: through the rich-client application (e.g. QGIS), the data provider connects

directly to the remote spatial database using a user account,

the data provider must specify all connection information: host, port, database

name, user and password,

the rich-client application connects to the remote spatial database,

o View geodata: the data provider can browse, search and view all geodata stored in the

spatial database using the rich-client application,

o Modify geodata: the data provider can modify geodata stored in the spatial database

using the rich-client application,

o Delete geodata: the data provider can delete geodata stored in the spatial database

using the rich-client application,

o Create geodata: the data provider can create new geodata in the spatial database using

the rich-client application,

o Disconnection: through the rich-client application (e.g. QGIS), the data provider closes

the connection to the spatial database.
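For the spatial database case, the five connection parameters listed above (host, port, database name, user and password) can be combined into a single PostgreSQL connection URI. The sketch below only builds the string; the function name and the sample values are hypothetical, and `sslmode=require` is added as an assumption, since the connection crosses from the premises to the cloud.

```python
from urllib.parse import quote, urlencode


def postgis_uri(host: str, port: int, dbname: str, user: str, password: str,
                sslmode: str = "require") -> str:
    """Build a libpq-style connection URI from the connection parameters."""
    # Percent-encode credentials so that characters such as '@' or '/'
    # cannot be confused with URI delimiters.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{auth}@{host}:{port}/{quote(dbname, safe='')}?{query}"


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    print(postgis_uri("db.example.org", 5432, "geodata", "alice", "p@ss/word"))
```

A rich client such as QGIS asks for the same parameters through its connection dialog; the URI form is simply convenient for scripts and command-line tools.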


2.6.2 Geodata publication in the Cloud

2.6.2.1 Web mapping (metadata and map creation)

2.6.2.1.1 Purpose (Why?)

Geospatial data providers such as public authorities are required to comply with the European INSPIRE Directive, which states that the data they are responsible for must be:

accessible on the internet and reusable,

through search, consultation and downloading,

thanks to metadata and services.

Securing the geo-publication is an important concern as data providers often want to limit the access to

some of their spatial datasets and data services.

Access to data and/or services could be restricted due to public security or national defense,

and more generally to the existence of a security constraint.

Data could also be withheld from general disclosure for business reasons (i.e. when the data is sold). In this case the data has a high commercial value, not only for the provider (who sells it) but even more for the companies who acquired the data and reported it to the provider.

Furthermore, some data are confidential (information about which companies are buying which data).

2.6.2.1.2 Actors (Who?)

Public authorities acting as geo-referenced data providers, such as national surveys, municipalities, territorial authorities, etc.

2.6.2.1.3 Data (What?)

Example: Boreholes, Wells and Groundwater Data

In the case of a restricted access due to national security concerns, the dataset used to

demonstrate the publication scenario could be the groundwater boreholes data in France (i.e.

groundwater bank of basement maintained by the BRGM, the French national geological

survey). This dataset is subject to restricted access due to national security matters (at least from the 1:100,000 scale).

Data category     Data name / data type
Critical          Geom (spatial geometry), Gisement (varchar)
Confidential      (none)

In the case of a restricted access due to business concerns, the dataset used to demonstrate

the publication scenario could be the borehole data from the oil industry in Denmark and

Greenland (maintained by GEUS, the Danish geological survey).


Data category     Data name / data type
Critical          (none)
Confidential      The whole dataset

The publication of geodata should be done through the definition of metadata, which helps users to find the data they are looking for. Metadata need not be secured themselves; however, they contain information about the category of constraints applicable to the data or the service itself. For instance, the AccessConstraints element specifies any access constraints associated with the corresponding resource (see below).

Screenshot - Access constraints metadata on groundwater dataset

2.6.2.1.4 Design (How?)

Publication relies on services used by geo web applications (built with a JavaScript library, e.g. OpenLayers/Leaflet):

Geo catalogue services: CSW (provided by a cataloguing application, e.g. GeoNetwork),

Geo web services: WMS, WFS (provided by a geospatial server, e.g. MapServer/GeoServer),

Download services: ATOM or WFS (for simple download, i.e. whole dataset), WFS (for direct download).
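To make the role of the geo web services more concrete, the sketch below builds the key-value requests a client would send: a WMS 1.3.0 GetMap request for a map image and a WFS 2.0.0 GetFeature request for the features themselves. The endpoint URL, layer name and bounding box are hypothetical; only the standard request parameters are taken from the OGC specifications.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layer, bbox, width, height, crs="EPSG:3857"):
    """Build a WMS 1.3.0 GetMap request returning a PNG image of one layer."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value: use the default style of the layer
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{endpoint}?{urlencode(params)}"


def wfs_getfeature_url(endpoint, type_name, count=100):
    """Build a WFS 2.0.0 GetFeature request returning at most `count` features."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "2.0.0",
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
        "COUNT": count,
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and layer names, for illustration only.
    base = "https://geo.example.org/ows"
    print(wms_getmap_url(base, "background", (0, 0, 10000, 10000), 256, 256))
    print(wfs_getfeature_url(base, "boreholes"))
```

A JavaScript library such as OpenLayers or Leaflet issues equivalent requests on behalf of the web application.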


Figure 6 - Geodata Publication Diagram


Different data are stored on the cloud:

Geo-spatial data storage is in charge of storing and serving geodata. Geodata are scientific data associated with geographical features (points, lines, polygons, etc.).

Maps and layers storage is in charge of storing and serving the information needed to build an image for a map: source of geodata (GIS files or spatial database), symbology used to represent the data, etc.

Metadata storage is in charge of storing and serving metadata on geodata and on maps and

layers.

Tile cache is in charge of storing and serving tiles built from an image of a map.

All data are exclusively accessed from the computation services.

Computation services and storage services are dissociated on the diagram, which means that

different CSPs could be used: one for running the services and one (or more) for data storage.

However, in order to have an efficient solution, we should consider using a single CSP for both

running the services and for data storage. The combination of computation services running on

the cloud and storage on the cloud forms a SaaS application provided by the CSP.

2.6.2.1.4.1 Publish data (use case)

On premises, the data providers prepare geodata and maps for the geopublication:

Import data: the data provider uses the file transfer service to upload the dataset files to the cloud through any file transfer tool which supports the SCP or SFTP protocols.

Publish data: on premises, the data provider publishes geodata:

o Creating a map with all its layers:

Defining the source of geodata and the symbology,

Publishing layers of the map: the background layer is published as a WMS service and the other layers are published as WFS services.

All information (map, layers, symbology, WMS service, WFS service, etc.) is stored in a single .map file,

o Then, using the file transfer service, the .map file is uploaded to the cloud through any file transfer tool which supports the SCP or SFTP protocols,

o Finally the data provider creates and publishes the metadata for the geodata and for the

map using the metadata catalog running on the cloud.

After that, geodata can be found using the dedicated Web application running on the cloud or using the

CSW service. Geodata, maps and tiles are accessible using the WFS, WMS and ATOM services.

2.6.2.1.4.2 Access data (use case)

On premises, the data consumer uses an application compatible with OGC services to search and view

geodata:

Search: through the GIS application, the data consumer invokes the CSW service running on the

cloud to search for geodata and discover how to access it (metadata gives endpoints for WFS,

WMS and ATOM services),

o Then the GIS application displays the list of available layers.


Display map: then the GIS application calls the WMS service and displays the background image

of the map:

o On the cloud, the tile service builds the background image from the cached tiles,

o If a tile does not exist in cache, the tile service invokes the map service via WMS. The

returned image is split into multiple tiles and put in the tile cache.

Selection of layers: then the data consumer selects one or more layers,

o For each selected layer, the GIS application calls the WFS service and displays the

features (geodata) on the map.

Navigation: each time the data consumer zooms in/out or pans, the map must be refreshed. The display steps above are executed again.

Download: the data consumer may download geodata:

o Only a subset through the WFS service,

o Or the entire dataset through the ATOM service or through the WFS service.
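The tile handling described in the "Display map" step (serve from the cache, otherwise call the map service, split the returned image and store the tiles) can be sketched as a simple cache-aside structure. The class, the `fake_wms_render` stand-in and the 2x2 grouping of tiles per map image are assumptions made for the example; a real tile service would call the WMS GetMap operation and cut an actual image.

```python
from typing import Callable, Dict, Tuple

TileKey = Tuple[int, int, int]  # (zoom level, column, row)


class TileCache:
    """Cache-aside tile store: a miss triggers one map rendering for a block of tiles."""

    def __init__(self, render: Callable[[int, int, int, int], Dict[Tuple[int, int], bytes]],
                 block: int = 2):
        self._render = render  # stand-in for the call to the map service via WMS
        self._block = block    # one rendered image covers block x block tiles
        self._tiles: Dict[TileKey, bytes] = {}
        self.wms_calls = 0

    def get_tile(self, z: int, x: int, y: int) -> bytes:
        key = (z, x, y)
        if key not in self._tiles:
            # Cache miss: render the block containing the tile, then split and store it.
            x0 = (x // self._block) * self._block
            y0 = (y // self._block) * self._block
            self.wms_calls += 1
            for (dx, dy), tile in self._render(z, x0, y0, self._block).items():
                self._tiles[(z, x0 + dx, y0 + dy)] = tile
        return self._tiles[key]


def fake_wms_render(z: int, x0: int, y0: int, n: int) -> Dict[Tuple[int, int], bytes]:
    """Hypothetical renderer: returns n*n labelled byte strings instead of image tiles."""
    return {(dx, dy): f"tile {z}/{x0 + dx}/{y0 + dy}".encode()
            for dx in range(n) for dy in range(n)}


if __name__ == "__main__":
    cache = TileCache(fake_wms_render)
    cache.get_tile(3, 4, 5)   # miss: one rendering call fills four tiles
    cache.get_tile(3, 5, 4)   # hit: served from the cache
    print(cache.wms_calls)
```

Rendering a block of neighbouring tiles per request is one common way to reduce the number of calls to the map service; the text itself only states that the returned image is split into multiple tiles.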

2.6.2.2 Geo-processing (Kriging computation)

The following demonstration case is about securing a geo-processing service for commercial purpose.

This scenario might not be found in a single organization but at least all the elements of the scenario are

known to occur in different organizations: for instance providers selling geospatial data on one hand,

and providers offering computational services on the other.

In this demonstration case, we have selected a dataset (borehole data) known to have a high commercial value, and a computational service that requires significant effort and investment, suggesting that the service provider may want a return on it (at least to compensate for what has been spent): namely, a Kriging service.

Kriging is an interpolation method which is widely used to estimate the value at an unmeasured location from known measurements observed at nearby locations (cf. 2.4.4.2).
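As an illustration of what such a service computes, the following sketch implements ordinary Kriging in plain Python: the weights of the known measurements are obtained by solving a linear system built from a variogram, under the constraint that the weights sum to one. The exponential variogram model and its parameters (`sill`, `rng`) are assumptions chosen for the example; a production service would fit the variogram to the data, as the R gstat package does.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def variogram(h: float, sill: float = 1.0, rng: float = 10.0) -> float:
    """Exponential variogram model without nugget (illustrative parameters)."""
    return sill * (1.0 - math.exp(-h / rng))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(samples: Sequence[Point], values: Sequence[float],
                     target: Point) -> float:
    """Estimate the value at `target` from measurements at `samples`."""
    n = len(samples)
    # Variogram between every pair of samples, bordered by the
    # unbiasedness constraint (weights must sum to 1).
    a = [[variogram(math.dist(samples[i], samples[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(s, target)) for s in samples] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values))


if __name__ == "__main__":
    pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
    vals = [1.0, 2.0, 3.0, 4.0]
    print(ordinary_kriging(pts, vals, (5.0, 5.0)))
```

Note that the computation needs both the sample coordinates and measurements (held by the provider) and the target location (held by the consumer), which is why confidentiality is a concern for both parties.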

2.6.2.2.1 Purpose (Why?)

Why secure a commercial computational service?

In a transaction involving a Kriging interpolation, the service provider holds measurements together with their related coordinates and wants to provide the computational service in return for some benefit. On the other side of the transaction, the data consumer wants to obtain an estimated prediction for a specific location without making measurements.

According to B. Tugrul and H. Polat, the problem is that although Kriging is becoming increasingly popular and widely used for estimating predictions, it fails to protect confidentiality. Data providers and data consumers could therefore hesitate to participate in Kriging transactions. [R14]

2.6.2.2.2 Actors (Who?)

Data providers: public authorities proposing commercial services, business data providers,

Data consumers: companies that are ready to pay for using the geoprocessing service.


2.6.2.2.3 Data (What?)

From the provider perspective, measurements collected for Kriging interpolations and their related

coordinates are considered as confidential. They are its valuable assets. The provider “might lose

competitive edge over other rival companies” in case of data disclosure. [R14]

From the consumer perspective, the location where a prediction is requested and the estimated

prediction for that location are considered confidential. Based on the outcome of the Kriging

interpolation, data consumers plan investments and they do not want to reveal the location and the

related estimated prediction to the service provider.

To sum up, the confidential data that should be protected from the parties involved are:

Coordinate values of the sample locations (provider-side),

Observed measurements (provider-side),

Coordinate values of the research location (consumer-side),

Estimated prediction (consumer-side).

In addition privacy should also be preserved, i.e.:

Commercial transaction (which company buys/sells which data).

Kriging interpolation is used in various areas (mine reservoirs, petroleum industry, environmental

sciences, agriculture, etc.).

In this case, we plan to demonstrate a privacy-preserving Kriging interpolation on the groundwater

borehole data in Denmark and Greenland.

Data category     Data name / data type
Critical          (none)
Confidential      Geom (spatial geometry), Owners and buyers of data

2.6.2.2.4 Design (How?)

Geo web services: WPS, used by geo web applications.


Figure 7 - Geo-processing Activity Diagram

The Geo-processing activity diagram extends the Geodata publication diagram by adding a Kriging service running on the cloud.

2.6.2.2.4.1 Publish kriging (use case)

The data provider is in charge of publishing the Kriging application:

Install Kriging application: the data provider uses the file transfer service to upload the Kriging application to the cloud through any file transfer tool which supports the SCP or SFTP protocols.

Publish Kriging service: the data provider creates and publishes the metadata for the Kriging service using the metadata catalog running on the cloud.

After that, the Kriging service can be found using the CSW service and is accessible using the WPS service.

2.6.2.2.4.2 Invoke a kriging service (use case)

On premises, the data consumer uses an application compatible with OGC services to search and invoke

a Kriging service:

Search: through the GIS application, the data consumer invokes the CSW service running on the

cloud to search for the kriging service and discover how to access it (metadata gives endpoints

for WPS services).

Selection of features: on the map, the data consumer selects the features to be used by the

Kriging service.


Estimation: then the GIS application invokes the kriging service with the selected point through

the WPS service.

Display estimated points: the GIS application displays the estimated points on the map.
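The "Estimation" step can be illustrated by the request the GIS application would send. The sketch below builds a WPS 1.0.0 Execute request in key-value form; the endpoint, the process identifier `org.example.Kriging` and the input names `x` and `y` are hypothetical, since the actual process description is not given here.

```python
from typing import Mapping
from urllib.parse import urlencode


def wps_execute_url(endpoint: str, process_id: str, inputs: Mapping[str, object]) -> str:
    """Build a WPS 1.0.0 Execute request using key-value encoding."""
    # WPS 1.0.0 encodes inputs as a semicolon-separated list of name=value pairs.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    params = {
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "DataInputs": data_inputs,
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint, process identifier and input names.
    print(wps_execute_url("https://geo.example.org/wps", "org.example.Kriging",
                          {"x": 12.5, "y": 55.7}))
```

Sent as is, such a request reveals the location of interest to the service, which is one of the consumer-side confidentiality concerns listed in 2.6.2.2.3.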

How is the Kriging WPS implemented?

The WPS Kriging process can be implemented in different ways: for instance within a GIS server like GeoServer (which provides WPS support and Sextante integration) or within a Java-based WPS container like the 52°North Web Processing Service.

The following diagram shows an implementation of the Kriging process through a Java class on a Linux machine, using the Apache Tomcat Java web server and the 52°North WPS 3.1.1 implementation. Ordinary Kriging is applied to the input data using the R gstat package. The interconnection between the Java module located in the WPS container and R is handled by the TCP/IP server Rserve. [R10]

Figure 8 - Server configuration of WPS Kriging implementation [R10]

2.6.3 Collaboration on geodata in the Cloud

Aside from storage, publication and computation, the collaborative editing of geographic data is another important feature of a Geographic Information System (at least in some cases). One of the most popular initiatives to collaborate on geo-referenced data online is OpenStreetMap (OSM), an open-source project which provides online JavaScript/Flash-based editors and a RESTful editing API for collaborating on geodata [R15]. Thanks to OSM, any mapper (whether amateur or professional) can add or edit any feature in any area, such as roads, but more generally any geographic object one may wish to reference (phone boxes, bus stops, parks, public toilets, etc.).

OpenStreetMap mainly relies on volunteers to work on the task of collecting geodata and providing

maps, free to use, without restriction, i.e. it promotes a crowdsourcing approach to geography. The

benefit is that while in other systems, the update cycle of maps is often very slow, OSM is continually

updated.


This project is very popular, but this popularity comes with drawbacks, from deliberate vandalism to inaccuracies or mistakes. As OSM keeps a full editing history for every object, any mistake or deliberate vandalism can be rolled back. However, these drawbacks may become critical in some domains, for instance disaster management. Indeed, map projects sometimes take place in an “extremely sensible environment like a war zone, where tolerance for inaccuracy is low, security and privacy concerns for respondents are too high. In these cases, encouraging the crowd to submit sensitive data might not be the best strategy.” [R18]

For the geodata collaboration demonstration case, we have decided to focus on disaster management. Despite the vital role that OSM plays in this domain (as exemplified by the Humanitarian OpenStreetMap Team), we set aside its crowdsourcing approach and choose a standard GIS solution based on the WFS-T service.

2.6.3.1 Purpose (Why?)

The geospatial community can meet the challenge of disaster management, and providing support to

situations of emergency is one of the key roles of a GIS. This is especially the case for complex

emergency actions where geo-information plays an important role. It is crucial that information on the

spatial extent and consequences of disasters are made available within a short time to decision makers,

and shared between disaster-response personnel for the coordination of emergency efforts. Mobile

applications are obviously desirable in such a context as they could help in taking near real-time

decisions in the field. However these tasks require more than ‘classic’ GIS functionalities, as not only

visualization is needed, but also quick update of geo-information.

A common approach for supporting this geodata collaboration scenario is to provide a web-based application with a user interface to select and update map features. This interface (a mapping client application) allows the user to draw new features and to move, modify or delete features in the map according to the current situation. This editing client is primarily designed for an operational officer in the field, for emergency tasks with a spatial reference such as marking evacuation routes, marking a helicopter landing site, etc. [R17]

However, security is an important concern. It is “one of the major reasons cited by organizations for

failing to share data in support of emergency response.” While there are “enormous amounts of data”

essential to emergency management, “many organizations are unwilling to share their data or will

provide it only under very restrictive agreements because of concerns about data security or liability”.

[R16]

It is therefore crucial to develop a set of security requirements for data to be shared in the event of a

local, regional, or national emergency. These guidelines should be implemented by the parties involved.

2.6.3.2 Actors (Who?)

Geohazard experts (in the field), Disaster-response personnel.

2.6.3.3 Data (What?)

Utility and governmental services liable to be damaged in the event of a disaster, and whose security and confidentiality are critical for authorities and/or private companies: oil and gas pipelines, water supply networks, electricity transmission lines, and transmission networks for different kinds of data/signals.


This broad dataset is part of the INSPIRE data specifications, whose typical use example is risk

management. In the context of the Seveso Directive, which is of major importance in regulating

management of risk, access to utility data is needed.

Data to be secured in this context:

Geospatial coordinates of critical points

2.6.3.4 Design (How?)

Through mobile applications (mobile-friendly JavaScript map library)

Using WFS-T* (for transactions)

With geospatial servers supporting WFS-T (e.g. GeoServer/deegree)

Publishing data from a PostgreSQL/PostGIS database (layers) with OGC web services

The editing (mobile) application allows users to display WFS layers from a geospatial server (e.g. GeoServer or deegree). The user can edit a selected WFS layer by clicking on a dedicated button ("Edit"). The layer being edited is highlighted in the layer list and, simultaneously, an editing toolbox is displayed. In this toolbox, the user can choose to move existing features of the edited layer, create new ones, or delete existing ones. When the user edits features, a WFS transaction is created (i.e. an XML file specifying which features have been modified and how). When the user clicks the Save button in the editing toolbar, the generated WFS transaction (WFS-T) is sent to the server and the data is updated. The modifications are then displayed on any other device connecting to the application.

* WMS and WFS services are specified as read-only. WFS-T is the part of the WFS specification that allows the underlying data to be updated (it is rarely implemented because XML over HTTP is inefficient for uploading large GIS datasets).
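The XML transaction mentioned above can be sketched with the Python standard library. The example builds a WFS 1.1.0 Transaction document containing Update and Delete actions identified by feature id; the layer name, feature ids and property names are hypothetical, and Insert actions are omitted because they additionally require GML geometry encoding.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Mapping

WFS = "http://www.opengis.net/wfs"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("wfs", WFS)
ET.register_namespace("ogc", OGC)


def _feature_filter(parent: ET.Element, fid: str) -> None:
    """Append an ogc:Filter selecting a single feature by its identifier."""
    flt = ET.SubElement(parent, f"{{{OGC}}}Filter")
    ET.SubElement(flt, f"{{{OGC}}}FeatureId", {"fid": fid})


def build_transaction(type_name: str,
                      updates: Mapping[str, Mapping[str, object]],
                      deletes: Iterable[str]) -> str:
    """Return a WFS-T Transaction document as a string.

    updates maps a feature id to the properties to change;
    deletes lists the feature ids to remove.
    """
    tx = ET.Element(f"{{{WFS}}}Transaction", {"service": "WFS", "version": "1.1.0"})
    for fid, props in updates.items():
        upd = ET.SubElement(tx, f"{{{WFS}}}Update", {"typeName": type_name})
        for name, value in props.items():
            prop = ET.SubElement(upd, f"{{{WFS}}}Property")
            ET.SubElement(prop, f"{{{WFS}}}Name").text = name
            ET.SubElement(prop, f"{{{WFS}}}Value").text = str(value)
        _feature_filter(upd, fid)
    for fid in deletes:
        dele = ET.SubElement(tx, f"{{{WFS}}}Delete", {"typeName": type_name})
        _feature_filter(dele, fid)
    return ET.tostring(tx, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical layer, feature ids and property, for illustration only.
    print(build_transaction("routes", {"routes.12": {"status": "closed"}}, ["routes.7"]))
```

The resulting document is what the editing client would POST to the WFS-T endpoint when the user clicks Save.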


Figure 9 - Geodata Collaboration (Consultation) Activity diagram


Figure 10 - Geodata Collaboration (Modification) Activity diagram

Different data are stored on the cloud, and all these data are exclusively accessed from the computation services (see the previous case for details).

2.6.3.4.1 Publish data (use case)

The data providers prepare geodata for the geopublication on premises and create maps on the cloud:

Import data: the data provider uses the file transfer service to upload the dataset files to the cloud through any file transfer tool which supports the SCP or SFTP protocols.

Publish data: on the cloud, the data provider publishes geodata:

o Creating a map with all its layers:

Defining the source of geodata and the symbology,

Publishing layers of the map: background layer is published as WMS service and

other layers are published as WFS service.

o Then the data provider creates and publishes the metadata for the geodata and for the

map using the metadata catalog running on the cloud.


After that, geodata can be found using the dedicated Web application running on the cloud or

using the CSW service. Geodata, maps and tiles are accessible using the WFS, WMS and ATOM

services.

2.6.3.4.2 Access data (use case)

On devices, the data consumer uses an application compatible with OGC services to search and view

geodata:

Search: through the GIS application, the data consumer invokes the CSW service running on the

cloud to search for geodata and discover how to access it (metadata gives endpoints for WFS,

WMS and ATOM services),

o Then the GIS application displays the list of available layers.

Display map: then the GIS application calls the WMS service and displays the background image

of the map:

o On the cloud, the tile service builds the background image from the cached tiles,

o If a tile does not exist in cache, the tile service invokes the map service via WMS.

Returned image is split into multiple tiles and put in the tile cache.

Selection of layers: then the data consumer selects one or more layers,

o For each selected layer, the GIS application calls the WFS service and displays the

features on the map.

Navigation: each time the data consumer zooms in/out or pans, the map must be refreshed. The display steps above are executed again.

2.6.3.4.3 Modify data (use case)

On devices, the data provider or mandated people use an application compatible with OGC services to

modify features:

Select the layer: through the GIS application, the data provider or mandated people select a layer to modify:

o The GIS application displays the edit toolbox,

o The GIS application calls the WFS-T service to lock the features.

Modify the layer: then the data provider or mandated people modify the layer

o Either selecting the feature to modify:

Modifying the feature value,

Deleting the feature,

o Or creating a new feature (specifying the coordinates and the scientific data),

o Then the data provider or mandated people can continue to modify the layer.

Save modifications: then the data provider or mandated people save the modified layer:

o The GIS application calls the WFS-T service with all modifications (new features, features

modified and features deleted):

The WFS-T service modifies all the features in one transaction,

The WFS-T service releases the lock on the data,

o The GIS application closes the edit toolbox,

Update metadata: then the data provider or mandated people update the metadata for the

map using the metadata catalog running on the cloud.
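The lock, modify and save sequence above can be modelled in memory to show the intended behaviour: a lock is taken when editing starts, all changes are applied in one transaction, and the lock is released afterwards. The class below is a hypothetical stand-in for the server side and is not the WFS-T protocol itself; feature ids and properties are invented for the example.

```python
import uuid
from typing import Dict, Iterable, Mapping, Optional


class LockedLayer:
    """In-memory stand-in for a layer edited through lock / transaction / release."""

    def __init__(self, features: Mapping[str, dict]):
        self.features: Dict[str, dict] = {k: dict(v) for k, v in features.items()}
        self._lock_id: Optional[str] = None

    def lock(self) -> str:
        """Lock the features; a second editor is refused until the lock is released."""
        if self._lock_id is not None:
            raise RuntimeError("layer is already locked by another editor")
        self._lock_id = uuid.uuid4().hex
        return self._lock_id

    def transaction(self, lock_id: str, inserts: Mapping[str, dict],
                    updates: Mapping[str, dict], deletes: Iterable[str]) -> None:
        """Apply all modifications at once, then release the lock."""
        if self._lock_id is None or lock_id != self._lock_id:
            raise PermissionError("invalid or missing lock")
        deletes = list(deletes)
        missing = [fid for fid in list(updates) + deletes if fid not in self.features]
        if missing:
            raise KeyError(f"unknown features: {missing}")
        # Work on a copy so that the layer changes only if every step succeeds.
        staged = {k: dict(v) for k, v in self.features.items()}
        staged.update({k: dict(v) for k, v in inserts.items()})
        for fid, props in updates.items():
            staged[fid].update(props)
        for fid in deletes:
            del staged[fid]
        self.features = staged
        self._lock_id = None  # release the lock on the data


if __name__ == "__main__":
    layer = LockedLayer({"r1": {"status": "open"}, "r2": {"status": "open"}})
    token = layer.lock()
    layer.transaction(token, {"r3": {"status": "new"}}, {"r1": {"status": "closed"}}, ["r2"])
    print(sorted(layer.features))
```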


2.7 Securing demonstration cases

The aim of this section is to give an overview of the security expectations relating to the different demonstration cases described in the previous part of the document. Particular attention is paid to matters that are important for the CLARUS requirements, i.e.:

Protecting the data, knowing:

o Where the data is

o How it should be protected

o Who is accessing it (audit)

Protecting confidentiality against:

o The CSP,

o Potential attackers.

2.7.1 Securing geodata storage in the Cloud

2.7.1.1 Protecting the data

Access to a dataspace must be protected, so that only users with granted privileges can access it. The way a dataspace is protected depends on the type of storage:

GIS files: because users access GIS files locally via the file system, thanks to a mount point (on

Unix) or a virtual drive (on Windows), access control must rely on the system user account and

on the rights defined on the GIS directory. Only the administrator is allowed to manage the GIS

directories (create, modify, delete).

Spatial database: because users connect directly to the spatial database, access control must

rely on the database user, on the defined policy of client authentication, and on the privileges

given for the spatial database. Only the administrator, using a privileged account, is allowed to manage the spatial databases (create, modify, delete).

A user that can connect to a dataspace can access all data of the dataspace.

Data is accessed by:

Dataset administrators are in charge of managing datasets using suitable tools (e.g. file browser,

database admin tool) running on premises.

Data providers manage the content of dataspace on which they have write permissions using

suitable tools (e.g. QGIS) running on premises.

2.7.1.2 Protecting confidentiality

Because dataspaces are located on the cloud whereas the rich-client application runs on premises, traffic must be filtered and communications protected:

Storage service must be configured to allow access only from the premises.

Users must be authenticated and/or requests must be signed.

The protocol(s) used to access the remote file storage server (AWS S3, OpenStack Swift, Ceph or instances exposing a remote directory through NFS, GlusterFS or CIFS) and the protocol(s) used to access the database server (PostgreSQL protocol) must be encrypted.
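The requirement that requests be signed can be illustrated with a keyed hash (HMAC). This is a generic sketch, not the signing scheme of any particular storage service: the choice of signed fields (method, path, timestamp) and the shared secret are assumptions made for the example.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str, timestamp: str) -> str:
    """Return a hex HMAC-SHA256 signature over the request's identifying fields."""
    message = "\n".join([method.upper(), path, timestamp]).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, path: str, timestamp: str,
                   signature: str) -> bool:
    """Recompute the signature on the server side and compare in constant time."""
    expected = sign_request(secret, method, path, timestamp)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"shared-secret"  # hypothetical key shared between premises and storage service
    sig = sign_request(key, "GET", "/dataspaces/boreholes/wells.shp", "2015-06-12T10:00:00Z")
    print(verify_request(key, "GET", "/dataspaces/boreholes/wells.shp",
                         "2015-06-12T10:00:00Z", sig))
```

Including a timestamp in the signed message lets the server reject old requests, which limits replay; signing complements transport encryption and does not replace it.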


In case of data leakage out of the cloud storage, data must be unusable. As mentioned previously, data

that need to be particularly secured are geographical features and/or scientific data (e.g. measurement

value, measurement type, etc.). The way those data are encoded depends on the type of storage and on

the GIS format. However the data types are generally the following ones:

Geographical features are encoded with a composite type which varies according to the GIS

format. A geographical feature is by nature difficult to interpret but is easy to identify.

Scientific data are generally encoded with a primitive type (boolean, number, string). A scientific

value may be difficult to identify because it is buried among the other data, but is easy to

interpret.

2.7.2 Securing geodata publication in the Cloud

2.7.2.1 Protecting the data

2.7.2.1.1 Constraints and limitations

Geodata, maps and tiles are intended for publication. However data providers may want to restrict

access to authorized people only, or give public access to low-quality data only, degrading exactness and

accuracy of maps and data:

Restrict access to the entire dataset and maps: only authorized data consumers can access all

the geodata, maps and tiles.

Degrade exactness and accuracy of maps and data: any data consumer can access the geodata,

maps and tiles, but data is made imprecise and inoperable at small scale levels.
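One simple way to picture the degradation option is to snap coordinates to a coarse grid before publication. The sketch below is an assumption made for illustration: the document does not say how accuracy is to be degraded, and the names `snap_to_grid` and `degrade_points` as well as the grid approach itself are hypothetical.

```python
def snap_to_grid(value: float, cell_size: float) -> float:
    """Round a coordinate to the nearest multiple of cell_size."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    return round(value / cell_size) * cell_size


def degrade_points(points, cell_size: float):
    """Return a copy of (x, y) points with both coordinates snapped to the grid.

    The larger cell_size is, the less precise the published positions are.
    """
    return [(snap_to_grid(x, cell_size), snap_to_grid(y, cell_size))
            for x, y in points]
```

With a cell size of 0.25 degrees, for instance, every published point moves by at most 0.125 degrees on each axis, so the data stays readable on an overview map but loses its value for detailed work.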

Metadata do not need to be protected and can be used by anyone to have information on data

(including information on security).

Data integrity may be important for data consumers. Maps and tiles could be signed in order to

assure the data consumers of the origin and of the integrity of the returned images.

2.7.2.1.2 Roles and access

In the case of geo-publication, data is accessed by:

Data providers, in charge of

o Managing geodata and maps through the transfer services (SCP, FTP, etc.) (write),

o Managing metadata through the dedicated Web application running on the cloud or

through CSW (write).

Data consumers, likely to

o Request images of map or tiles through the WMS service and request features (geodata)

through the WFS service (read-only),

o Download the entire dataset through the ATOM service or through the WFS service

(read-only).

Services (on behalf of the data consumer)

o Get tile service, that creates and caches tiles when requested tiles do not exist (write),

o Get map service, that creates an image of the requested map (write).


2.7.2.1.3 Geo-processing peculiarities (Kriging)

The Kriging data is accessed by:

Data providers, in charge of

o Installing and maintaining the application (write) that serves the Kriging service through

the WPS service,

o Managing metadata on WPS through the dedicated Web application running on the

cloud or through the CSW service (write).

Data consumers, who request Kriging processing running on the cloud through the WPS service

(read-only)

Kriging service (on behalf of the data consumer), which estimates points from the

geodata samples (read-only).

The data consumer may request a private estimation, so the result should be considered as private data

that the data provider and the CSP should neither store nor use.

Samples of geodata must not be directly accessible on the cloud storage. All users must use

computation services to access data.

Where data degradation is required for security reasons, imprecise data must not lead to

incorrect estimations in the Kriging service.
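To make the estimation step concrete, here is a minimal sketch of ordinary Kriging in pure Python. It rests on assumptions not found in the document: a linear variogram (`variogram`, with a hypothetical `slope` parameter), a small dense solver, and the function names `solve_linear` and `ordinary_kriging`. A production service would fit the variogram to the samples rather than assume one.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def variogram(h: float, slope: float = 1.0) -> float:
    """Assumed linear variogram; a real service would fit this to the data."""
    return slope * h


def ordinary_kriging(samples, target):
    """Estimate the value at target (x, y) from samples [(x, y, value), ...]."""
    n = len(samples)
    a = [[0.0] * (n + 1) for _ in range(n + 1)]
    b = [0.0] * (n + 1)
    for i, (xi, yi, _) in enumerate(samples):
        for j, (xj, yj, _) in enumerate(samples):
            a[i][j] = variogram(math.hypot(xi - xj, yi - yj))
        a[i][n] = 1.0  # Lagrange multiplier column
        a[n][i] = 1.0  # constraint row: weights sum to one
        b[i] = variogram(math.hypot(xi - target[0], yi - target[1]))
    b[n] = 1.0
    weights = solve_linear(a, b)[:n]
    return sum(w * z for w, (_, _, z) in zip(weights, samples))
```

Running the same estimation on exact and on degraded sample positions, and comparing the two results, is one way to check whether a given level of degradation still yields acceptable estimates.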

2.7.2.2 Protecting confidentiality

Because geodata, maps and tiles are intended for publication, confidentiality is generally not an issue.

However, geodata, maps and tiles must not be directly accessible on the cloud storage. All users must

use computation services to access data.

Moreover, confidentiality must be protected when a restricted access to the entire dataset and maps is

required.

Traffic must be filtered and communications protected:

Compute service could be configured to allow access only from a white list of IP networks.

Users must be authenticated and requests could be signed.

Protocols used to access geodata, maps and tiles must be encrypted and protected (OGC services

and ATOM service rely on HTTP/S).
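The white-list filtering mentioned in the list above can be sketched with the Python standard library. The networks used here are reserved documentation ranges chosen for illustration, and the names `ALLOWED_NETWORKS` and `is_allowed` are hypothetical; in practice this filtering would more likely be done by a firewall or security group than in application code.

```python
import ipaddress

# Hypothetical white list; these are reserved documentation ranges.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed(client_ip: str, networks=ALLOWED_NETWORKS) -> bool:
    """Return True when client_ip belongs to one of the white-listed networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)
```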

In case of data leakage out of the cloud storage, data must be unusable. As mentioned previously, data

that need to be particularly secured are geographical features and/or scientific data (e.g. measurement

value, measurement type, etc.). The way those data are encoded depends on the type of storage and on

the GIS format. However the data types are generally the following ones:

Geographical features are encoded with a composite type which varies according to the GIS

format. A geographical feature is by nature difficult to interpret but is easy to identify.

Scientific data are generally encoded with a primitive type (boolean, number, string). A scientific

value may be difficult to identify because it is buried among the other data, but is easy to

interpret.


When degraded accuracy of maps and data is required, Geodata must be imprecise and maps

generated for small levels must be inaccurate in case of data leakage out of the Cloud storage.

2.7.3 Securing geodata collaboration in the Cloud

2.7.3.1 Protecting the data

Geodata, maps and tiles are intended for publication. They can be viewed or downloaded by any data

consumer but only data providers or mandated people can modify them.

Metadata do not need to be protected and can be used by anyone to have information on data

(including information on security). However, only data providers or authorised people can modify

metadata.

Data integrity is very important for people mandated for data modification. Geodata, maps and tiles

could be signed in order to assure the mandated people of the origin and of the integrity of the

returned data and images.

2.7.3.1.1 Roles and Access

In the case of geodata collaboration, data is accessed by:

Data providers, in charge of managing

o Geodata through the transfer services (SCP, FTP, etc.) (write),

o Maps and layers through the dedicated Web application running on the cloud (write),

o Metadata on geodata and on maps through the dedicated Web application running on

the cloud (write).

Data providers or mandated people, likely to:

o Modify features through the WFS-T service (write)

o Update metadata associated to modified features through the dedicated Web

application running on the cloud (write).

Data consumers, likely to:

o Request images of map or tiles through the WMS service and request features (geodata)

through the WFS service (read-only).

Services (on behalf of data providers)

o Get tile service, that creates and caches tiles when requested tiles do not exist (write).

o Get map service, that creates an image of the requested map (write).

2.7.3.2 Protecting confidentiality

Because geodata, maps and tiles are intended for publication, confidentiality is usually not an issue.

However, confidentiality is an issue while modifying geodata (see previous section for expectations

regarding the protection of confidentiality).


2.7.4 Summary

As a brief summary, we have mapped the various demonstration cases proposed in §

2.6 to the security expectations on which they depend.

Note that these security expectations will not necessarily all be covered by the CLARUS solution. In addition,

CLARUS may be constrained by other parameters (performance, availability, etc.). These non-

functional constraints will be expressed in deliverable D2.2 of the same Work Package (WP2), related

to the requirements specification. The only expectations considered here are the security expectations

regarded as prerequisites of confidence.

Security Expectations (rows) against Geo Data Demonstration Cases (columns):
Storage (§ 2.6.1), Publication (§ 2.6.2.1), Processing (§ 2.6.2.2), Collaboration (§ 2.6.3)

Geospatial subject    Security expectation                        Refers to §
Data                  accessibility and alteration protection     2.5.1.1
                      metadata-based access limitations           2.5.1.2
                      confidentiality                             2.5.1.3
                      authenticity of sources (data quality)      2.5.1.4     []
Services              access authentication (e.g. GeoRM)          2.5.2.1     []
                      metadata-based access limitations           2.5.2.2.1
                      secure communication protocols              2.5.2.3
Personal Data         personal data protection                    2.5.3.1
                      access auditing                             2.5.3.2
Cloud-based hosting   data location risks                         2.5.4.1
                      information system loss of control risks    2.5.4.2
                      multi-tenancy and resource sharing risks    2.5.4.3


Relevance Levels : : relevant [] : potentially relevant

Figure 11 - Geo Data Demonstration Cases with regard to Security Expectations

For each of these demonstration cases, we have additionally tried to propose the following mapping for

the different CLARUS scenarios.

CLARUS Generic Scenarios (refers to [A2] §2.1.2.1, Technical approach)

Scenario configuration   Scenario A   Scenario B      Scenario C
Cloud customer(s)        1            More than one   More than one
CSP                      1            1               More than one (>1 CSP, or 1 CSP and >1 user account)

Geo Data Demonstration Cases   Refers to §   Scenarios
Storage                        2.6.1         A, C
Publication                    2.6.2.1       B, C
Processing                     2.6.2.2       A, B, C
Collaboration                  2.6.3         B, C

Figure 12 - CLARUS Generic Scenarios with regard to Geo Data Demonstration Cases


3 E-Health Application Case

3.1 Overview

Due to the high degree of digitalization of the Medical Health Record in Hospitals nowadays, and

because of many legal regulations related to long term data preservation, there is a need to transform

active Medical Records, which contain the medical information of active patients (the ones who keep

contact with the Hospital for Healthcare purposes), to passive ones, based on different criteria, i.e.

cease of activity (when for any reason the patient stops having contact with the Hospital) and/or death,

with a time constraint that varies amongst the different European countries due to their specific laws (5

years in Spain).
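The passivation criteria described above can be expressed as a small rule. This is a sketch under assumptions: the function names `years_before` and `is_passive` are hypothetical, death is treated as an immediate criterion independent of the inactivity period, and the five-year default reflects only the Spanish case cited in the text.

```python
from datetime import date


def years_before(day: date, years: int) -> date:
    """Return the date `years` years before `day` (29 February maps to 28 February)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # `day` is 29 February and the target year is not a leap year.
        return day.replace(year=day.year - years, day=28)


def is_passive(last_contact: date, deceased: bool, today: date,
               inactivity_years: int = 5) -> bool:
    """Decide whether a Medical Record qualifies for passivation.

    Assumption: a record becomes passive if the patient has died, or if the
    last contact with the Hospital is at least `inactivity_years` years old.
    """
    if deceased:
        return True
    return last_contact <= years_before(today, inactivity_years)
```

Other countries would pass a different `inactivity_years` value, since the time constraint varies with national law.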

The Hospital Healthcare professionals need to search and retrieve passive Medical Health Records data

for different reasons, chiefly for healthcare purposes but also for legal requirements,

typically one Medical Health Record at a time.

Another need we should take into account, specifically in the research field, is the possibility to make

Advanced Queries with or without Statistics Computation, over the passive Medical Health Records

data, in order to retrieve datasets (with multiple Medical Health Records extraction) meeting certain

conditions specified by the Hospital Healthcare professionals and to apply different statistical

calculations when required by the physician and/or the researcher.

In big Hospitals this creates an increasing demand on storage capacity, which pushes up the required

technology assets and related costs of ownership. Within this scenario it is highly recommended to

move towards cloud storage services in order to cover these growing needs, which at the same time

raises security issues, because we are dealing with very sensitive personal health data.

Cloud services, as we know them currently, are not able to fulfil the security needs of such sensitive

data, as they cannot assure the integrity and confidentiality of the datasets they store. This is the main

reason why the CLARUS approach, understood in this case as a layer of solutions on top of cloud storage

services, must be a suitable answer to the issues previously mentioned.

3.2 Actors

3.2.1 Data Providers

Patients’ Medical Health Records stored in the Hospital Information System (HIS), as active Medical

Records, are the source of data to be transformed into passive Medical Records, and then to be stored

in the cloud services using the CLARUS methodologies, technologies and tools.

The Medical Records Department of the Hospital will be responsible for the task of transforming the

active Medical Records into passive ones, so they will periodically trigger the Medical Records

“Passivation” process, based on different time constraints criteria as mentioned in the overview.


3.2.2 Data Consumers

Hospital Healthcare professionals (mostly Physicians and Nurses) are the main potential users of the

passive Medical Records that will be stored in the cloud services using the CLARUS solutions. They need

to access those records, for data search and retrieval, in order to respond to Healthcare needs

(access to a patient’s or a patient’s relatives’ historical data, for instance to check family background), legal

requirements (access to any patient record derived from a Court order), research studies (access to

historical clinical data of groups of patients with certain diseases for data analysis and statistical

calculations), among others. The access to this data would always be done through the current Hospital

Clinical Workstation interface, keeping the Hospital security policies related to Medical Records to grant

access to the Healthcare professionals.

3.2.3 IT Team

Application developers of the Hospital IT Team, who are in charge of the enhancement and deployment

of the HIS functionalities, will have to ensure that the Healthcare professionals can

access (search, retrieve, query and compute), through the Clinical Workstation interface, the passive

Medical Records of the Hospital patients stored in the cloud services, using the CLARUS functionalities

provided through its API. The Clinical Workstation interface actually is the HIS application interface, as

mentioned above. Nevertheless, for testing and simulation purposes, in the CLARUS project framework,

it will be a stand-alone web based interface module with only part of the functionalities of the HIS,

developed for that single purpose, which is a simpler and easier approach.

3.2.4 Security Manager

This actor is in charge of defining, provisioning and maintaining security policies in the Hospital for the HIS.

This usually includes accounts management, profiles management, and authorisations management.

3.2.5 Cloud Service Provider

This is the company or companies that will provide an IaaS and/or PaaS type of service, which will be used by

the Data Providers and Data Consumers when storing and accessing the passive Medical Records of the

Hospital patients through the Clinical Workstation interface developed by the IT Team of the Hospital.

3.3 Datasets

3.3.1 Introduction

There are several different types of standards commonly used in the Healthcare “business” aimed at

allowing the semantic interoperability of the Information Systems present in the ecosystem of any

community of Healthcare Providers (Hospital Information Systems – HIS, Primary Care Information

Systems – PCIS, Social Care Information Systems, Mental Health Information Systems, Laboratory

Information Systems – LIS, Departmental Information Systems, etc.). They are also aimed at ensuring that

old Datasets can still be accessed in the future and their contents understood (semantics).

All those standards allow us to represent and exchange the Healthcare data stored in the mentioned

Information Systems, in a commonly understood manner (semantics), which eases again the

information exchange amongst all of them, and with other communities.


In the e-Health domain we find different types of standards, and the most important of them are:

Messaging standards: which define the structure and workflow of the messages required to

exchange Healthcare Information between Healthcare Information Systems

Clinical documentation standards: which define the structure of clinical documents to be

stored in and/or exchanged between Healthcare Information Systems

Terminology standards: which define the codification or classification of different clinical terms

(diseases, clinical observations/results, drugs, etc.) that will be embedded as specific segments

in the standardized messages or in the standardized clinical documents.

As mentioned in the overview, all data contained in the e-Health use case Dataset, the patients’ passive

Medical Records, is considered by the Personal Data Protection Law (LOPD in Spain, expecting new

legislation for all Europe in the near future) as personal data of high level of sensitivity, which means

that it requires the highest possible level of security and privacy.

3.3.2 Standards used in the e-Health use case

In the following sections, we describe the different standards currently in use by the Healthcare

Information Systems that are in production in Hospital Clínic de Barcelona.

3.3.2.1 HL7 : Health Level 7 International [R22]

HL7 is a not-for-profit ANSI (American National Standards Institute) accredited standards developing

organization dedicated to providing a comprehensive framework and related standards for the

exchange, integration, sharing, and retrieval of electronic health information that supports clinical

practice and the management, delivery and evaluation of health services.

HL7 Version 2.X messaging standard is the workhorse of electronic data exchange in the clinical domain

and arguably the most widely implemented standard for healthcare in the world. This messaging

standard allows the exchange of clinical data between systems. It is designed to support a central

patient care system as well as a more distributed environment where data resides in departmental

systems. Hospital Clínic de Barcelona is using this standard to connect to the internal departmental

systems and also for the exchange of clinical information with other Healthcare providers in the

community, Atenció Integral de Salut de Barcelona Esquerra (AISBE), using the standards based

interoperability platform provided by Departament de Salut - TicSalut (iSISS.Cat).
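To give an idea of what an HL7 Version 2.X message looks like on the wire, the sketch below splits a message into segments and fields. It handles only the default segment and field separators, ignores escape sequences and repetitions, and uses a made-up sample message: the patient, identifiers and the function name `parse_hl7_v2` are all hypothetical.

```python
def parse_hl7_v2(message: str) -> dict:
    """Split an HL7 v2 message into {segment name: [list of field lists]}.

    Segments are separated by carriage returns and fields by "|".
    For the MSH segment the list index is one less than the HL7 field
    number, because MSH-1 is the field separator itself.
    """
    segments = {}
    for line in message.replace("\n", "\r").split("\r"):
        if not line:
            continue
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


# Hypothetical admission message with invented identifiers.
SAMPLE_MESSAGE = "\r".join([
    "MSH|^~\\&|HIS|HCB|LIS|HCB|20150612100000||ADT^A01|MSG0001|P|2.5",
    "PID|1||12345^^^HCB||DOE^JOHN||19700101|M",
])

parsed = parse_hl7_v2(SAMPLE_MESSAGE)
family_name, given_name = parsed["PID"][0][5].split("^")[:2]  # PID-5, patient name
message_type = parsed["MSH"][0][8]                            # MSH-9, message type
```

The example shows why such messages carry directly identifying data (name, date of birth, identifiers) in well-known positions, which is relevant when deciding what must be protected before storage in the cloud.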

The HL7 Version 3 Clinical Document Architecture (CDA®) is a document markup standard that specifies

the structure and semantics of "clinical documents" for the purpose of exchange between healthcare

providers and patients. It defines a clinical document as having the following six characteristics: 1)

Persistence, 2) Stewardship, 3) Potential for authentication, 4) Context, 5) Wholeness and 6) Human

readability. Hospital Clínic de Barcelona is using this standard to publish structured clinical documents to

the Shared Medical Record of Catalonia (HCCC – Història Clínica Compartida de Catalunya), but we must

say that this standard is currently under discussion because of its complexity, which has led us to use it only

when mandatory.

3.3.2.2 International Classification of Diseases (ICD)[R23]

The International Classification of Diseases (ICD) is the standard diagnostic tool for epidemiology, health

management and clinical purposes. This includes the analysis of the general health situation of


population groups. It is used to monitor the incidence and prevalence of diseases and other health

problems, providing a picture of the general health situation of countries and populations.

ICD is used by physicians, nurses, other providers, researchers, health information managers and coders,

health information technology workers, policy-makers, insurers and patient organizations to classify

diseases and other health problems recorded on many types of health and vital records, including death

certificates and health records. In addition to enabling the storage and retrieval of diagnostic

information for clinical, epidemiological and quality purposes, these records also provide the basis for

the compilation of national mortality and morbidity statistics by World Health Organization (WHO)

Member States. Finally, ICD is used for reimbursement and resource allocation decision-making by

countries.

The ICD is developed and administered collaboratively between World Health Organization (WHO) and

international centers. All the countries involved in the CLARUS project are currently members of WHO.

The ICD is revised regularly to incorporate changes in the medical domain, in order to reflect advances in

medical knowledge. The latest revision of the ICD is ICD-10, published between 1992 and 1994, which

doubles the number of diagnosis codes compared to ICD-9, providing more precise descriptions.

Hospital Clínic de Barcelona is using this standard, in its revision ICD-9, for the codification of diagnoses

all over the Hospital, as it is a mandatory requirement from the Departament de Salut – CatSalut

(Catalan public insurer), for official reporting purposes. Revision ICD-10 is becoming compulsory in Spain,

and specifically in Catalonia, starting January 2016. Again, the new standard increases complexity, and

the more complex it becomes, the more resistance we find from our Physicians, as coding is a time-demanding

activity. Codification is therefore a process done by the Medical Records Department professionals (coders),

always after the clinical encounters have occurred, which means that higher complexity will increase

the time demand and costs of this Service.

3.3.2.3 Logical Observation Identifiers Names and Codes (LOINC) [R24]

LOINC terminology was initiated in 1994 by the Regenstrief Institute associated with the US Indiana

University, and it is used worldwide.

LOINC is a rich catalog (terminology) of measurements, including laboratory tests, clinical measures like

vital signs and anthropometric measures, standardized survey instruments, and more. LOINC enables

the exchange and aggregation of clinical results for care delivery, outcomes management, and research

by providing a set of universal codes and structured names to unambiguously identify things you can

measure or observe. The LOINC codes are universal identifiers for laboratory tests and other clinical

observations.
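A LOINC code is written as a number followed by a hyphen and a check digit. The sketch below validates that structure, assuming the check digit follows the Mod 10 (Luhn) scheme; the function names `luhn_check_digit` and `is_valid_loinc` are hypothetical, and the check says nothing about whether a code actually exists in the LOINC catalog.

```python
def luhn_check_digit(number: str) -> int:
    """Compute the Mod 10 (Luhn) check digit for a string of decimal digits."""
    total = 0
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 0:  # double every second digit, starting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def is_valid_loinc(code: str) -> bool:
    """Check the 'digits-checkdigit' shape and the check digit of a LOINC-style code."""
    base, separator, check = code.partition("-")
    if not (separator and base.isdigit() and len(check) == 1 and check.isdigit()):
        return False
    return luhn_check_digit(base) == int(check)
```

A structural check of this kind can catch transcription errors when laboratory results are mapped from internal codes to universal identifiers.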

The main issue solved by LOINC is related to the interoperability of laboratory test results, mainly from

the Laboratory Information Systems (LIS) to the Hospital Information Systems (HIS), allowing the use of

universal identifiers instead of their own internal code values, embedded in standard HL7 messages.

Hospital Clínic de Barcelona is using this standard to publish structured clinical documents (CDA)

containing the laboratory test results to the Shared Medical Health Record of Catalonia (HCCC –

Història Clínica Compartida de Catalunya), as it is mandatory, and taking advantage of this requirement

we are also using this codification internally in the patients’ Medical Health Records.


3.3.2.4 Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT) [R25]

SNOMED CT is a comprehensive, multilingual clinical healthcare terminology, with scientifically validated

clinical content that enables consistent, processable representation of clinical information in electronic

health records. It allows meaning-based retrieval of the clinical information, which increases the

opportunities to support evidence based care, real time decision support, and more accurate

retrospective reporting for research and management.

It is supported and developed by the International Health Terminology Standards Development

Organisation (IHTSDO), in a collaborative manner to ensure that it meets the diverse needs and

expectations of the worldwide medical profession. Several countries of the EU are members of the

Organisation (CLARUS partners: Belgium, Spain and United Kingdom).

SNOMED CT contains concepts with unique meanings and formal logic based definitions and is

organized into hierarchies, and is represented using three types of components: concepts, descriptions

and relationships (among concepts). It is also mapped to other health-related classifications and

terminologies in use around the world: ICD-9, ICD-10, ICD-11 (foundation layer) and LOINC.
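The three component types and the hierarchy they form can be pictured with a small data model. This is only a sketch: the identifiers `C1` to `C3` and `C9`, the `IS_A` marker and the function names are hypothetical placeholders, not real SNOMED CT content or release-file formats.

```python
from dataclasses import dataclass

IS_A = "is-a"  # placeholder for the hierarchical relationship type


@dataclass(frozen=True)
class Concept:
    concept_id: str


@dataclass(frozen=True)
class Description:
    concept_id: str
    term: str


@dataclass(frozen=True)
class Relationship:
    source_id: str
    type_id: str
    destination_id: str


def ancestors(concept_id: str, relationships) -> set:
    """Return every concept reachable from concept_id through is-a relationships."""
    parents = {}
    for rel in relationships:
        if rel.type_id == IS_A:
            parents.setdefault(rel.source_id, set()).add(rel.destination_id)
    found, stack = set(), [concept_id]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_subsumed_by(concept_id: str, ancestor_id: str, relationships) -> bool:
    """Meaning-based test: is concept_id the same as, or a kind of, ancestor_id?"""
    return concept_id == ancestor_id or ancestor_id in ancestors(concept_id, relationships)
```

Meaning-based retrieval then amounts to selecting the records whose coded concept is subsumed by the concept the clinician searched for, rather than matching free text.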

Hospital Clínic de Barcelona is using this standard to code the Pathology diagnoses that are published in

structured clinical documents (CDA) to the Shared Medical Health Record of Catalonia (HCCC – Història

Clínica Compartida de Catalunya), as it is mandatory, and as a means to have standard codification of

those diagnoses in the patients’ Medical Health Record. There are no plans for now to widen the use of

this standard, again because of its high degree of complexity, and the efforts will be put in the evolution

from ICD-9 to ICD-10 in the near future.

3.3.2.5 Digital Imaging and Communication in Medicine (DICOM) [R26]

DICOM is the international standard for medical images and related information (ISO 12052). It defines

the formats for medical images that can be exchanged with the data and quality necessary for clinical

use. DICOM is implemented in almost every radiology, cardiology imaging, and radiotherapy device (X-

ray, CT, MRI, ultrasound, etc.), and increasingly in devices in other medical domains such as

ophthalmology and dentistry. With tens of thousands of imaging devices in use, DICOM is one of the

most widely deployed healthcare messaging standards in the world. DICOM has revolutionized the

practice of radiology, allowing the replacement of X-ray film with a fully digital workflow.

Hospital Clínic de Barcelona has been using this standard since 2002 for the storage in the PACS¹ of any medical

image generated within the Hospital walls, or even coming from other Healthcare providers when

necessary, always tied to the patients’ Medical Records in the Hospital Information System. Since then,

there is an on-going process of integrating any medical device that generates images to the PACS.

¹ PACS stands for Picture Archiving and Communication System. It is in fact the medical imaging repository, which

typically links the medical images of a patient to his/her Medical Records stored in the Healthcare Information

Systems; access to those images is granted from the interface of the latter.


3.3.2.6 OntoFarma

OntoFarma is an ontology of Pharmaceutical products, basically drugs or medicines, which has been

entirely developed by the Hospital Clínic de Barcelona Medical Informatics team from the Information

Systems Department, using the Web Ontology Language (OWL), under a Project commissioned by the Spanish

Ministerio de Sanidad (Health Ministry) by means of an official tender in 2014. This ontology allows us

to have a standardized codification of the medication prescribed to the patients by the Physicians in the

Hospital in-patient premises. It also allows “intelligent” querying to search for non-explicit equivalent

drugs based on their active principles and to check drug interactions based on the implicit knowledge

that it contains.

3.3.2.7 OntoRegClin

OntoRegClin is an ontology of clinical observations, basically the ones that come from clinical monitors,

but not excluding the manually obtained ones. It is under development by the Hospital Clínic de

Barcelona Medical Informatics team from the Information Systems Department, using OWL, and is self-funded

by Hospital Clínic de Barcelona in its own interest, as an evolution of the Hospital Information System

and the web-based Clinical Workstation Interface that we have in production (HIS, CWI). The natural

next step of this ontology, once finished, will be to map it to the LOINC standard. This ontology will allow

us to assure that wherever a clinical observation appears and is shown and/or stored throughout our

HIS and our CWI, it will be correctly classified for future understanding of the data, follow-up and

comparison purposes across time (historical view), and of course for interoperability requirements with

other Information Systems and with other Healthcare Providers or Healthcare Authorities.

3.3.3 E-Health Dataset

The Passive Medical Records Database stored in a cloud service using the CLARUS services and tools,

which we expect will enhance the security and privacy of the Dataset, will contain all the historical

clinical information of the patients of Hospital Clínic de Barcelona that have become “Passive”, due to

lack of encounters over a specific period of time (5 years according to Spanish specific legislation), due

to death or change of residence, etc. The patients’ historical clinical information is composed of a set of

different types of data, starting from PDF documents, unstructured data (free text), structured data

(based on standard terminologies or not), and finally medical images.

The goal is to keep all those types of data as standardized as possible in the Passive Medical

Records Database, using the standards described in the previous sections, in order to guarantee that

they remain accessible in the future and that their contents can be fully understood clinically (semantics), and

obviously to facilitate their meaningful use and exploitation.

We describe the different types of data we will have in the Passive Medical Records Database in the

following sections.

3.3.3.1 PDF documents

The PDF documents found in the Medical Health Records are Discharge and Results reports. They are laid out in a patient-friendly format to make them easy to read, as they are usually given to the patients, and they must be stored in the Hospital Information System (HIS) as a legal requirement.

They must be stored “as is” in the Passive Medical Health Records Database to meet that legal requirement; their contents, however, are mostly based on other types of data also present in the HIS, which are covered in the next sections. No standards apply in this case.

3.3.3.2 Unstructured data (free text fields)

Many data fields in the HIS contain unstructured data as free text, which means that we will not be able to map their contents to any standard terminology. Nevertheless, we will have to store them in the Passive Medical Records Database, embedding them in CDA standard documents (or any equivalent, commonly used clinical document standard) and using the segment associated with the semantics of each specific field when mapping it into the standard document.

The unstructured data fields in our HIS include, among others: clinical notes, anamnesis, reason for encounter, allergies, family background, personal background and physical examination.

3.3.3.3 Structured data

The structured data found in the Medical Health Records are of two types: data already mapped to a standard terminology or ontology, and data not mapped to any standard.

Data already mapped to a terminology will have to be stored in the Passive Medical Records Database either embedded in CDA standard documents (where applicable) or directly as specific database fields or sets of fields, using the segment that corresponds to the terminology used when mapping them into the standard document.

Data not mapped to any standard will have to be stored in the Passive Medical Records Database either embedded in CDA standard documents (where applicable) or directly as specific database fields or sets of fields, using the segment associated with the semantics of each specific field when mapping it into the standard document.

The structured data fields in our HIS are: diagnoses (ICD-9), pathology findings (SNOMED-CT), laboratory test results (LOINC), in-patient medication prescriptions (OntoFarma), and vital signs and other clinical registers (OntoClinReg).

3.3.3.4 DICOM medical images

The DICOM images, as they are already standardized in our Picture Archiving and Communication

System (PACS), will have to be stored in the Passive Medical Records Database “as is”, keeping the

accession number which links any image to the corresponding patient’s Medical Health Record in our

HIS.

3.4 Services

3.4.1 Introduction

The Clinical Workstation interface of the Hospital Information System (HIS) will be the means by which the Medical Records Department “Passivates” active Medical Records for storage, and by which the other Hospital Healthcare professionals access the passive Medical Records for search, retrieval, query and computation. This means that the CLARUS Services will always be invoked through that application.

The following figures show this, with one diagram for each of the four main processes involved in this use case.

The first diagram shows the process in which the Data Providers are involved, Medical Health Records “Passivation”:

Figure 13 - Medical Health Records “Passivation” process

[Diagram labels: Patient becomes passive (healing, change of residence, death, lack of encounters with a time lapse of 5 years); Data Provider; HIS; Active patient Medical Health Record; Patient MHR Passivation Process; HIS Passive MHR DB (PDF documents; CDA documents: unstructured data, structured data; DICOM images); Cloud + CLARUS.]

The second diagram shows the process in which the Data Consumers are involved, passive Medical Health Records access and data retrieval:

Figure 14 - Medical Health Records access and data retrieval process

[Diagram labels: Healthcare Professional needs to access a patient’s Passive Medical Health Record for consultation; Data Consumer; HIS; Passive patient Medical Health Record Query; Passive patient MHR data retrieval; HIS Passive MHR DB (PDF documents; CDA documents: unstructured data, structured data; DICOM images); Cloud + CLARUS.]

The third diagram shows another process in which the Data Consumers are involved, the Advanced Query to the CLARUS Cloud:

Figure 15 - Advanced Query to the CLARUS Cloud process

[Diagram labels: Data Consumer; HIS; Active patient Medical Health Record; Advanced Query; Passive MHR DB; Cloud + CLARUS.]

The last diagram shows another process in which the Data Consumers are involved, the Medical Record Statistics Computation query:

Figure 16 - Medical Record Statistics Computation query

[Diagram labels: Data Consumer; HIS; Active patient Medical Health Record; Statistics computation; Passive MHR DB; Cloud + CLARUS.]

In the following sections we describe the services required to execute the processes described above.

3.4.2 Data Publication

Data publication services will be used by the “Passivation” process of the Clinical Workstation interface, which transforms active Medical Records into passive ones: the records are moved from the HIS application database (or a fake dataset for testing purposes) to the cloud data storage space using the CLARUS project API.

3.4.3 Metadata Management

During the publication process, i.e. “Passivation”, the required metadata will be added to the records moved from the HIS application database (or a fake dataset for testing purposes) to the cloud data storage space using the CLARUS tool, in order to keep the same clinical documentation categorization structure as in the HIS application database.

3.4.4 Search

The Clinical Workstation interface will provide a search capability that allows Healthcare professionals to access, by Patient ID, the passive Medical Records of a given patient previously recorded in the cloud data storage space using CLARUS.

Once the search is done, the Healthcare professionals will be able to view the passive Medical Records of that patient, always presented with the same clinical documentation categorization structure as in the HIS application database. When necessary, they will also be able to download any of the Medical Records of that patient, for any purpose they may require.
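As a rough illustration of how publication metadata and the Patient ID search fit together, the sketch below tags each published item with a documentation category and returns a patient's items grouped by that category. It is an in-memory stand-in written for this example: the class `PassiveStore`, its methods and the category names are hypothetical, and the actual CLARUS API is not described in this document.

```python
from collections import defaultdict
from typing import Dict, List


class PassiveStore:
    """Toy stand-in for the cloud storage space (hypothetical interface)."""

    def __init__(self) -> None:
        self._by_patient: Dict[str, List[dict]] = defaultdict(list)

    def publish(self, patient_id: str, category: str, payload: str) -> None:
        # The category is the metadata added at "Passivation" time so that
        # the HIS documentation structure can be rebuilt on retrieval.
        self._by_patient[patient_id].append(
            {"category": category, "payload": payload}
        )

    def search(self, patient_id: str) -> Dict[str, List[str]]:
        """Return the patient's items grouped by documentation category."""
        grouped: Dict[str, List[str]] = defaultdict(list)
        for item in self._by_patient.get(patient_id, []):
            grouped[item["category"]].append(item["payload"])
        return dict(grouped)


store = PassiveStore()
store.publish("P1", "Discharge reports", "discharge-2009.pdf")
store.publish("P1", "Laboratory results", "cda-lab-001.xml")
store.publish("P1", "Discharge reports", "discharge-2010.pdf")
result = store.search("P1")
```

Searching for an unknown Patient ID returns an empty mapping, which corresponds to the “Not Found” outcome of the search.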

3.4.5 Advanced Queries

The Clinical Workstation interface will also provide an advanced search capability that allows Healthcare professionals to run Advanced Queries on the structured data contained in the passive Medical Records. These queries use multi-field search criteria, i.e. conditions to be met by several structured data fields, such as specific values or ranges of values, the presence or absence of certain medication records, durations, etc.

Once the search is done, the Healthcare professionals will be able to view the dataset resulting from the Advanced Query and to download it, for any purpose they may require (population assessment, scientific research, etc.).

Two examples of this type of Advanced Query would be:

Example 1: List of patient IDs that meet the following conditions:
o Diagnosis: HIV (ICD9 = 042, V08, 042+079.53, etc.)
o Way of infection: infected (field INF4B = 2)
o Prescribed drugs: 2 nucleotides and 1 IP (OntoFarma codes)
o Prescription duration: > 1 year
o Follow-up duration: > 1 year
o Result: Checking of the viral load evolution over time: CD4 (LOINC = 3325)

Example 2: List of patient IDs that meet the following conditions:
o Diagnosis: HIV + HPC
o Follow-up duration: > 12 weeks
o Combination of drugs prescribed (specific combination)
o Result: Checking of the RNA HPC, comparing the level at the beginning with the level at the end of the treatment
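The multi-field criteria of Example 1 can be expressed as a list of conditions that a record must all satisfy. The following is a minimal sketch on clear-text data: the record layout, the field names other than `INF4B`, the use of days for durations and the helper names are assumptions made for illustration, and it says nothing about how CLARUS would evaluate such a query on protected data.

```python
from typing import Callable, Iterable, List

Record = dict  # one flattened structured record per patient (hypothetical layout)
Condition = Callable[[Record], bool]


def any_code_in(field: str, codes: Iterable[str]) -> Condition:
    """True when the record holds at least one of the given codes."""
    wanted = set(codes)
    return lambda r: bool(wanted & set(r.get(field, [])))


def equals(field: str, value) -> Condition:
    return lambda r: r.get(field) == value


def greater_than(field: str, threshold: float) -> Condition:
    return lambda r: r.get(field) is not None and r[field] > threshold


def advanced_query(records: Iterable[Record], conditions: List[Condition]) -> List[str]:
    """Return the patient IDs of the records meeting every condition."""
    return [r["patient_id"] for r in records if all(c(r) for c in conditions)]


records = [
    {"patient_id": "P1", "icd9": ["042"], "INF4B": 2,
     "prescription_days": 400, "follow_up_days": 500},
    {"patient_id": "P2", "icd9": ["250.00"], "INF4B": 2,
     "prescription_days": 400, "follow_up_days": 500},
    {"patient_id": "P3", "icd9": ["V08"], "INF4B": 2,
     "prescription_days": 100, "follow_up_days": 500},
]

example_1 = [
    any_code_in("icd9", {"042", "V08"}),     # Diagnosis: HIV
    equals("INF4B", 2),                      # Way of infection
    greater_than("prescription_days", 365),  # Prescription duration > 1 year
    greater_than("follow_up_days", 365),     # Follow-up duration > 1 year
]
matches = advanced_query(records, example_1)
```

Only `P1` meets all four conditions here: `P2` has no HIV code and `P3` has a prescription shorter than one year.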

3.4.6 Statistics Computation

Finally, the Clinical Workstation interface will allow Healthcare professionals to request Statistics Computations on the structured data contained in the passive Medical Records. These computations will be performed in the cloud, i.e. without downloading any data to the user's Information System. They will typically consist of calculating means, standard deviations, frequencies (%), etc., always on structured data fields.

Once the calculations are done, the Healthcare professionals will be able to view the resulting figures and to download them, for any purpose they may require.

An example of this type of Statistics Computation would be:

For the patients identified in any of the previous examples of advanced queries:
o Age: mean and standard deviation of treated patients
o Sex: % of males and females
o Way of infection: % of each predefined category
o Follow-up time: mean and standard deviation of treated patients
o Outpatient: mean and standard deviation of number of visits
o Inpatient: mean and standard deviation of number of stays in Hospital
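The figures listed above reduce to two kinds of summary: mean and standard deviation for numeric fields, and percentage per category for coded fields. A minimal sketch with Python's standard library follows; the function names and sample values are invented for illustration, and the use of the sample (rather than population) standard deviation is an assumption.

```python
from collections import Counter
from statistics import mean, stdev
from typing import Dict, Hashable, Sequence


def numeric_summary(values: Sequence[float]) -> Dict[str, float]:
    """Mean and sample standard deviation (needs at least two values)."""
    return {"mean": mean(values), "sd": stdev(values)}


def percentages(values: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Share of each category, in percent of all values."""
    counts = Counter(values)
    total = len(values)
    return {category: 100.0 * n / total for category, n in counts.items()}


ages = [30, 40, 50]            # invented sample data
sexes = ["M", "F", "M", "M"]   # invented sample data

age_stats = numeric_summary(ages)
sex_shares = percentages(sexes)
```

In the use case these calculations would run in the cloud and only the resulting figures would be returned to the Clinical Workstation interface.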

3.4.7 Transformation Services

In order to preserve the integrity, and above all the confidentiality, of the passive Medical Records stored in the cloud services using CLARUS, the Clinical Workstation interface will have to use CLARUS services to mask or anonymize the data when it pushes them into the cloud at “Passivation” time.

At search and retrieval time, the Clinical Workstation interface will have to use CLARUS services to unmask or de-anonymize the retrieved data.
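The mask-on-upload, unmask-on-retrieval round trip can be illustrated with a simple reversible token table kept on premises. This is a toy sketch of the idea only: the class and function names are hypothetical, random tokens are just one possible masking technique, and the document does not specify which techniques the CLARUS services actually use.

```python
import secrets
from typing import Dict, Set


class Pseudonymizer:
    """Toy reversible masking; the token table never leaves the premises."""

    def __init__(self) -> None:
        self._to_token: Dict[str, str] = {}
        self._to_value: Dict[str, str] = {}

    def mask(self, value: str) -> str:
        # Reuse the same token for a repeated value so records stay linkable.
        if value not in self._to_token:
            token = secrets.token_hex(8)
            self._to_token[value] = token
            self._to_value[token] = value
        return self._to_token[value]

    def unmask(self, token: str) -> str:
        return self._to_value[token]


def mask_record(record: Dict[str, str], sensitive: Set[str], p: Pseudonymizer) -> Dict[str, str]:
    """Mask only the sensitive fields before sending the record to the cloud."""
    return {k: p.mask(v) if k in sensitive else v for k, v in record.items()}


def unmask_record(record: Dict[str, str], sensitive: Set[str], p: Pseudonymizer) -> Dict[str, str]:
    """Restore the sensitive fields of a record retrieved from the cloud."""
    return {k: p.unmask(v) if k in sensitive else v for k, v in record.items()}


pseudonymizer = Pseudonymizer()
sensitive_fields = {"patient_id", "name"}
original = {"patient_id": "P1", "name": "Jane Doe", "icd9": "042"}

stored = mask_record(original, sensitive_fields, pseudonymizer)      # goes to the cloud
restored = unmask_record(stored, sensitive_fields, pseudonymizer)    # back on premises
```

The non-sensitive `icd9` field is left untouched in `stored`, which is what would allow queries on structured data to run on the cloud side in this simplified picture.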

3.4.8 Exploitation Services

In order to preserve the integrity and availability of the passive Medical Records stored in the cloud

services using the CLARUS framework, we expect to be able to use Back-up, Data Recovery and

Monitoring services, which should be offered by the Cloud Service Provider.

3.5 Security Expectations

3.5.1 Expectations in terms of data

As mentioned in the overview, we expect assurance that the integrity, confidentiality and accessibility of the passive Medical Records stored in the cloud data storage space using the CLARUS methodologies and tools are preserved, and that the legal regulations on personal data protection (LOPD in Spain) are complied with; in our case the data are very sensitive (high level according to the law).

It is important to mention that, because all users will execute their processes directly from the Clinical Workstation Interface of the Hospital Information System, every user security check will be done by the HIS, on which all security issues related to Medical Health Record access, Active or Passive, rely. In other words, the HIS will be in charge of user access control for any data stored in the Passive Medical Health Records in the Cloud Service Storage, before any CLARUS service or tool is invoked.
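The ordering described above, an access check by the HIS first and the CLARUS call only afterwards, can be sketched as a small wrapper. Both callables are placeholders supplied by the caller: `his_allows` stands for the HIS access control decision and `clarus_retrieve` for whatever CLARUS retrieval service is eventually used; neither name comes from the project.

```python
from typing import Callable, Set


class AccessDenied(Exception):
    """Raised when the HIS refuses access to a record."""


def retrieve_passive_record(
    user_roles: Set[str],
    patient_id: str,
    his_allows: Callable[[Set[str], str], bool],
    clarus_retrieve: Callable[[str], dict],
) -> dict:
    """Ask the HIS for permission, then (and only then) call CLARUS."""
    if not his_allows(user_roles, patient_id):
        raise AccessDenied(f"access to {patient_id} refused by the HIS")
    return clarus_retrieve(patient_id)
```

With this structure a refused request never reaches the cloud, so no CLARUS service is invoked for users the HIS has not authorized.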

The main benefit we expect from the CLARUS framework is a guarantee of the security and privacy of all the Passive Medical Health Records stored in the Cloud Service. This is achieved by masking and/or anonymizing the data sent to the cloud when “Passivating” them, and by unmasking and de-anonymizing them when they are accessed and retrieved.

3.5.2 Expectations in terms of services

In addition, we expect the services to ensure the confidentiality and availability of the Medical Records stored in the cloud data storage space using the CLARUS framework.

3.6 Demonstration Cases for CLARUS

The aim of this section is to describe precisely those application cases that will be used as

demonstrators for CLARUS (especially in the frame of WP6 work), i.e. cases that will help in specifying

CLARUS, designing its architecture and validating it all along the project.

In particular, four main cases are detailed, which can be mapped to the different typical CLARUS scenarios:

- Storage of Passive Medical Health Records (Data providers)
- Access and retrieval of Passive Medical Health Records (Data consumers)
- Advanced Queries on structured data contained in Passive Medical Health Records (Data consumers)
- Statistics Computation on structured data contained in Passive Medical Health Records (Data consumers)

For each of these cases, a four-sided perspective has been adopted, answering key questions about:

- Why the corresponding demonstration case is relevant for both CLARUS and the e-Health

domain

- Who is likely to use it in a real-life context

- What are the sample data selected, their type and their sensitivity/criticality

- How the application case is usually designed and implemented


3.6.1 Securing Passive Medical Health Records storage in the cloud

Figure 17 - Medical Health Records storage in the cloud diagram

3.6.1.1 Purpose (Why?)

Demonstrate that sensitive information (Health Medical Records) could be stored in a secure cloud.

3.6.1.2 Actors (Who?)

Data Providers: experts in collecting or producing medical data in the Hospital Information System. These experts are responsible for the Medical Record “Passivation” process and for storing the passive medical data in the CLARUS cloud through the Clinical Workstation interface.

3.6.1.3 Data (What?)

Non-structured information: Discharge reports, Results Reports, Allergies, Clinical Notes. Structured information: Diagnostics (ICD-9), Lab Results (LOINC), in-patient medication, etc.

3.6.1.4 Design (How?)

Fake medical records with the same structure as the real ones will be stored in the CLARUS cloud through the Clinical Workstation interface.

[Figure 17 diagram labels: Data Provider; On Premises; On Cloud; Create Dataset; Upload Dataset; Search Data; Browse Data; Found; Not Found; View Data; Modify Data; Delete Data; Cloud Service; Healthcare Data Storage.]

3.6.2 Securing Passive Medical Health Records access and retrieval from the cloud

Figure 18 - Medical Health Records access and retrieval diagram

3.6.2.1 Purpose (Why?)

In many cases, it is necessary to retrieve the passive medical health records that could be used for

medical practice, medical research, legal purposes, and others. Data will be searched, accessed and

retrieved only by Patient ID.

3.6.2.2 Actors (Who?)

Medical doctors who need the Passive Health Records for the purposes mentioned above.

3.6.2.3 Data (What?)

The data stored in the CLARUS cloud. Non-structured information: Discharge reports, Results Reports, Allergies, Clinical Notes. Structured information: Diagnostics (ICD-9), Lab Results (LOINC), in-patient medication, etc.

3.6.2.4 Design (How?)

Medical doctors will access this information through the Clinical Workstation interface, which will manage the access to and retrieval of the data stored in the CLARUS cloud.

[Figure 18 diagram labels: Data Consumer; On Premises; On Cloud; Search Data; Browse Data; Found; Not Found; View Data; Download Data; Cloud Service; Healthcare Data Storage.]

3.6.3 Securing Passive Medical Health Record for the Advanced Query

Figure 19 - Medical Health Record Advanced Query diagram

3.6.3.1 Purpose (Why?)

In many cases, it is necessary to retrieve specific data for purposes such as local organization (distribution of its efforts), research and others. This type of search uses multiple fields as search criteria, i.e. conditions to be met by several structured data fields, such as specific values or ranges of values, the presence or absence of certain medication records, durations, etc.

3.6.3.2 Actors (Who?)

Medical doctors who need the Passive Health Records for the purposes mentioned above.

3.6.3.3 Data (What?)

The data stored in the CLARUS cloud. Non-structured information: Discharge reports, Results Reports, Allergies, Clinical Notes. Structured information: Diagnostics (ICD-9), Lab Results (LOINC), in-patient medication, etc. The expected result is a dataset containing one or more Passive Health Records that meet the conditions established by the Healthcare professionals when making the Advanced Query, or an empty dataset if no record meets them.

3.6.3.4 Design (How?)

Medical doctors will access the Advanced Query results through the Clinical Workstation interface, which will manage the access to and retrieval of the data stored in the CLARUS cloud.


3.6.4 Securing Passive Medical Health Record for Statistics Computation query

Figure 20 - Medical Health Record Statistics Computation query diagram

3.6.4.1 Purpose (Why?)

In many cases, it is necessary to run certain Statistics Computations on Medical Records for research purposes, for quality analysis, or where an improvement is needed in certain health areas.

3.6.4.2 Actors (Who?)

Medical doctors or Medical Managers who need the Passive Health Records for the purposes mentioned above.

3.6.4.3 Data (What?)

The calculations of the Statistics Computations will be performed over structured data stored in CLARUS

cloud: Diagnostics (ICD-9), Lab Results (LOINC), in-patient medication records, etc.

3.6.4.4 Design (How?)

Medical doctors will access the Statistics Computation results through the Clinical Workstation interface, which will request the execution of the calculation in the cloud and then manage the access to and retrieval of the figures calculated by the CLARUS cloud.


3.7 Summary

For comparison purposes, we have also mapped the four demonstration cases proposed for the e-Health part to the corresponding security expectations, as presented in §2.7.4.

Security Expectations and E-Health Demonstration Cases for CLARUS (part 1)

Security Expectations | Security item | Refers to § | MHR Storage (§3.6.1) | MHR Access and Retrieval (§3.6.2)
Data | accessibility and alteration protection / data integrity | 2.5.1.1, [Error! Reference source not found.] | |
  Note: Even though MHR are restricted data access
Data | metadata-based access limitations | 2.5.1.2.1 | NA | NA
Data | confidentiality | 2.5.1.3 | |
Data | authenticity of sources (data quality) | 2.5.1.4 | NA | NA
Data | Anonymisation, masking and inverse | [Error! Reference source not found.] | |
Services | access authentication (e.g. GeoRM) | 2.5.2.1 | NA | NA
Services | metadata-based access limitations | 2.5.2.2.1 | NA | NA
Services | secure communication protocols | 2.5.2.3 | NA | NA
Personal Data | personal data protection | 2.5.3.1, [Error! Reference source not found.] | * | *
  * It is assumed that all data contained in MHR is under personal data protection
Personal Data | access auditing | 2.5.3.2 | NA | NA
Cloud-based hosting | data location risks | 2.5.4.1 | NA | NA
Cloud-based hosting | information system loss of control risks | 2.5.4.2 | NA | NA
Cloud-based hosting | multi-tenancy and resource sharing risks | 2.5.4.3 | NA | NA

Security Expectations and E-Health Demonstration Cases for CLARUS (part 2)

Security Expectations | Security item | Refers to § | Passive MHR for the Advanced Query (§3.6.3) | Passive MHR for Statistics Computation Query (§3.6.4)
Data | accessibility and alteration protection / data integrity | 2.5.1.1, [Error! Reference source not found.] | |
  Note: Even though MHR are restricted data access
Data | metadata-based access limitations | 2.5.1.2.1 | NA | NA
Data | confidentiality | 2.5.1.3 | |
Data | authenticity of sources (data quality) | 2.5.1.4 | NA | NA
Data | Anonymisation, masking and inverse | [Error! Reference source not found.] | |
Services | access authentication (e.g. GeoRM) | 2.5.2.1 | NA | NA
Services | metadata-based access limitations | 2.5.2.2.1 | NA | NA
Services | secure communication protocols | 2.5.2.3 | NA | NA
Personal Data | personal data protection | 2.5.3.1, [Error! Reference source not found.] | * | *
  * It is assumed that all data contained in MHR is under personal data protection
Personal Data | access auditing | 2.5.3.2 | NA | NA
Cloud-based hosting | data location risks | 2.5.4.1 | NA | NA
Cloud-based hosting | information system loss of control risks | 2.5.4.2 | NA | NA
Cloud-based hosting | multi-tenancy and resource sharing risks | 2.5.4.3 | NA | NA

Legend : : relevant ; [] : potentially relevant ; NA: Not applicable Security item : common Health/Geodata ; e-Health only ; Geodata only.

Figure 21 - e-Health Demonstration Cases with regard to Security Expectations


4 Conclusion

In this document, two main application cases have been analysed in order to demonstrate the

appropriateness and applicability of the CLARUS methodologies, technologies and tools, namely, e-

Health and publication of geodata on the Internet. This analysis resulted in the definition of actors,

datasets and services in the corresponding domains, as well as a comprehensive list of security

expectations.

Preserving privacy in the cloud is a rather intuitive concern when thinking about the e-Health scenario.

Healthcare information systems are dealing with very sensitive personal health data, and at the same

time there is a growing demand for storage and computing capacity, which makes transition to the

cloud almost inevitable in the near future.

In comparison, geospatial data (owned by European institutions and environmental actors) cannot be considered very sensitive personal data. However, geospatial data are sensitive for other reasons: some are mission-critical for public safety and national security, others have strong business potential (datasets and/or services are sold), etc.

As shown in this document, the needs for security and confidentiality when moving to the cloud differ somewhat depending on whether one considers a Healthcare or a Geospatial information system. Providing demonstration cases in each of these two domains therefore gives us assurance that a broad range of application scenarios for the CLARUS solution is covered.

On the other hand, this detailed analysis highlights similarities and common security expectations.

Ultimately the demonstration cases proposed also provide a different perspective (through distinct

actors, datasets and services) on these common issues.

While some of the security expectations identified can be covered by existing solutions, implemented either by the CSP or by the application itself, others cannot be fulfilled today without impairing typical cloud functionality or reducing its benefits. This fully justifies the development of an innovative set of techniques improving trust in cloud environments.

The way to integrate such innovative set of security techniques in an application is considered

differently by the two main application cases. In the e-Health application case, CLARUS is “a layer of

solutions on top of a cloud storage service”. With this approach, CLARUS is a proxy running exclusively

on premises. It is a kind of cloud storage service with advanced security features, and could be

transparently integrated in any application (provided that application relies on transfer protocols

supported by CLARUS). On the other hand, in the Geo Publication application case, some services

operate geospatial data and are running on the cloud, which involves the use of CLARUS innovative

techniques directly on the cloud, either explicitly using some CLARUS libraries, or transparently using

CLARUS as a proxy.

During the further course of WP2, the application cases described will support the definition of requirements for such a solution, namely the CLARUS framework, which seeks to comply with the legal requirements and technical standards set up or approved by European authorities.


Appendix A. Review of Related Project

A review of several projects in the field of Geodata publication is provided in the accompanying document “CLARUS D2.1 Annex I - Review of EU geo-publication projects” [A4].

*** End of the Document ***


A framework for user

centred privacy and

security in the cloud

Definition of Application Cases – Annex I:

Review of EU geo-publication projects

Type (distribution level) Public

Contractual date of Delivery 30-04-2015

Actual date of delivery 12-06-2015

Deliverable number D2.1-AnnexI

Deliverable name Definition of Application Cases – Annex I: Review of EU geo-

publication projects

Version V1.0

Number of pages 15

WP/Task related to the

deliverable Task 2.1

WP/Task responsible AKKA

Author(s) AKKA Team

Partner(s) Contributing

Document ID CLARUS-D2.1-DefinitionOfApplicationCases-AnnexI-v1.0


Abstract This document analyses relevant projects representative of

publication of Geo-referenced data on the Internet.


Disclaimer

CLARUS (G.A. 644024) is a Research and Innovation Actions project funded by the EU Framework Programme for Research and Innovation Horizon 2020. This document contains information on CLARUS core activities, findings and outcomes. Any reference to content in this document should clearly indicate the authors, source, organization and publication date. The content of this publication is the sole responsibility of the CLARUS consortium and cannot be considered to reflect the views of the European Commission.


Table of Contents

1 INTRODUCTION
1.1 SCOPE OF THE DOCUMENT
1.2 APPLICABLE AND REFERENCE DOCUMENTS
1.3 REVISION HISTORY
1.4 NOTATIONS, ABBREVIATIONS AND ACRONYMS (OPTIONAL)
2 INGEOCLOUDS
2.1 OVERVIEW
2.2 BUSINESS USE CASES OF THE INGEOCLOUDS PARTNERS
2.2.1 UC1: Publication of Geospatial Dataset in the Environmental Field
2.2.2 UC2: Susceptibility Map of Triggering Landslides Due to Rainfall Forecast
2.2.3 UC3: Shakemaps
2.2.4 UC4: Pesticides in Groundwater
2.2.5 UC5: Ground Water Resources Management in Granular Aquifers
2.2.6 UC6: Active Landslide Inventory Mapping and Susceptibility Zoning
2.3 ACTORS
2.4 SERVICES
2.4.1 Account management
2.4.2 Workspace
2.4.3 Application installation and maintenance
2.4.4 Data import/Synchronization
2.4.5 Data publication services
2.4.6 Metadata management
2.4.7 Search service
2.4.8 View service
2.4.9 Download service
2.4.10 Processing service
2.4.11 Account billing

2.4.12 Support, maintenance and monitoring service
2.5 SECURITY IN INGEOCLOUDS
2.5.1 Protecting data against unauthorized access
2.5.2 Protecting services against unauthorized access
2.5.3 Ensuring identity data protection
3 EGDI-SCOPE
3.1 OVERVIEW
3.2 ACTORS
3.2.1 EGDI stakeholder panel
3.2.2 EGDI user categories
3.3 DATASETS
3.3.1 EGDI methodology for prioritizing datasets
3.3.2 Relevant answers to EGDI stakeholder survey
3.3.3 Compilation of the INSPIRE dataset inventory
3.4 USE CASES
3.4.1 Geohazards: Ground Instability in densely populated areas
3.4.2 Raw Materials: Rare Earth Element potential within the European Union
3.4.3 Renewable Energy - Planning for offshore Wind Farms
3.4.4 Geology and Soils – Ecosystem Mapping
3.5 TRUST AND AUTHENTICATION
3.5.1 Elements of trust
3.5.2 Moving to the cloud
4 MINERALS4EU
4.1 THE RAW MATERIALS INITIATIVE
4.2 THE PROJECT
4.3 ACTORS: DATA OWNERS AND CONSUMERS
4.4 SERVICES
4.4.1 Mineral Statistics

1 Introduction

1.1 Scope of the document

The aim of this document is to describe different projects representative of the “publication of geo-

referenced data on the internet” domain, providing further details on the Geo-Publication application

case in the CLARUS Deliverable D2.1 (Definition of Application cases).

1.2 Applicable and reference documents

This document refers to the following documents:

CLARUS Grant Agreement

CLARUS-D2.1-Definition of Application Cases-V1.0

INGC-D2.1-Use Cases for InGeoCloudS Data and Services-v1.1

INGC-D2.2-Interface of web services and models of data-v1.0

INGC-D2.3-InGeoCloudS Web Services Covering Use Cases-v1.1

INGC-D4.2-Fully Operational InGeoCloudS Pilot-v1.0

EGDI-scope User requirements and use cases

EGDI-scope D2.3 Functional User Requirements and Use Cases

EGDI-scope D5.2 Report on trust and authentication

1.3 Revision History

Version Date Author Description

0.1 15/01/2015 AKKA Initial version

1.0 12/05/2015 AKKA V1.0 Version

1.4 Notations, Abbreviations and Acronyms (optional)

API: Application Programming Interface

CRUD: Create/Read/Update/Delete

CSP: Cloud Service Provider

CSW: Catalog Service for the Web

DG ENTR: Directorate General for Enterprise and Industry

DG JRC: Directorate General Joint Research Centre

DoW: Description of Work

EC: European Commission

EEA: European Environment Agency

EGDI: European Geological Data Infrastructure

ENISA: European Network and Information Security Agency

EPPO: Earthquake Planning and Protection Organisation

ETPSMR: European Technology Platform on Sustainable Mineral Resources

FTP: File Transfer Protocol

FTPS: File Transfer Protocol Secure

GA: Grant Agreement

GEMAS: Geochemical Mapping of Agricultural Soils of Europe

GeoZS: Geological Survey of Slovenia

GIS: Geographical Information System

GML: Geography Markup Language

HTTP: Hypertext Transfer Protocol

IaaS: Infrastructure as a Service

InGeoCloudS: INspired GEOdata CLOUD Services

NGO: Non-governmental Organization

OGC: Open Geospatial Consortium

OWL: Web Ontology Language

PaaS: Platform as a Service

PMB: Project Management Board

PSI: Public Sector Information

RDF: Resource Description Framework

REE: Rare Earth Elements

RIF: Rule Interchange Format

SAML: Security Assertion Markup Language

SCP: Secure Copy

SFTP: Secure File Transfer Protocol

SQL: Structured Query Language

ToC: Table of Contents

UC: Use Case

WADL: Web Application Description Language

WFS: Web Feature Service

WMS: Web Map Service

WMS-C: Web Mapping Service Caching

WMST: Web Map Service Time

WP: Work Package

WS-Addressing: Web Services Addressing

WSDL: Web Service Description Language

XSD: XML Schema Definition

XSLT: eXtensible Stylesheet Language Transformations

2 InGeoCloudS

2.1 Overview

The INspired GEOdata CLOUD Services (InGeoCloudS) project aimed at demonstrating the feasibility of

employing a cloud-based infrastructure coupled with the necessary services to provide seamless access

to geospatial public sector information, especially targeting the geological, geophysical and other

geoscientific information.

Project partners' data and services, available under more traditional infrastructures, are easily deployed

to the cloud. One of the project challenges has been the linking of the partners’ data among themselves

and with relevant external datasets.

On top of the cloud services implemented (mostly according to the European Directive INSPIRE), the

project demonstrates the ability to build more intelligent services by using Linked Open Data principles

and technologies to combine data seamlessly integrated through the cloud.

The project has been co-funded by the European Commission under the Information and

Communication Technologies Policy Support Programme.

2.2 Business use cases of the InGeoCloudS partners

As described in InGeoCloudS project documentation, the use cases implemented by InGeoCloudS have

been proposed by the data providers and have been elaborated through interactions between data

providers and interested stakeholders. The implemented use cases cover the areas of a) hydrogeology

and b) hazards due to earth phenomena.

The following gives an overview of each use case.

2.2.1 UC1: Publication of Geospatial Dataset in the Environmental Field

Public authorities producing or collecting geo-information in the environmental field have to publish

and share data with the public and other authorities: geological maps, piezometric maps, geo-hazards maps

and other geospatial datasets as defined in the annexes of the INSPIRE Directive.

The use case allows a provider to publish their dataset on the web, to create a specific map and to be

compliant with the INSPIRE technical requirements without the intervention of an IT team or INSPIRE

experts.

A provider is any public authority (local, national, associations, etc.) that has to publish and share

geo-information in the environmental field. The public or other authorities can view and/or download the

shared data.

The use case is split into 2 business processes:

the process for the provider in charge of the creation of a map and dataset publication,

the process for the user who searches, visualizes datasets, re-uses data created by the system.

2.2.2 UC2: Susceptibility Map of Triggering Landslides Due to Rainfall Forecast

The system predicts (as accurately as possible) the areas where the probability of triggering landslides is

increased due to higher precipitation levels. The endangered zones are predicted using the combination

of the landslide susceptibility map, the precipitation forecast and the landslide triggering threshold

values.
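
The combination of the three inputs can be illustrated with a minimal sketch. It assumes, purely for illustration, that every map cell carries a susceptibility class and a forecast precipitation in millimetres, and that each class has its own triggering threshold. The class names, threshold values and function name are hypothetical and are not taken from the InGeoCloudS documentation.

```python
# Hypothetical triggering thresholds (mm of forecast rain) per susceptibility class.
THRESHOLDS_MM = {"low": 120.0, "medium": 80.0, "high": 40.0}


def endangered_cells(susceptibility, forecast_mm, thresholds=THRESHOLDS_MM):
    """Return the indices of cells whose forecast rainfall reaches the
    triggering threshold of their susceptibility class."""
    if len(susceptibility) != len(forecast_mm):
        raise ValueError("both maps must cover the same cells")
    return [
        index
        for index, (cls, rain) in enumerate(zip(susceptibility, forecast_mm))
        if rain >= thresholds[cls]
    ]


cells = ["low", "high", "medium", "high"]
rain = [100.0, 45.0, 60.0, 10.0]
print(endangered_cells(cells, rain))  # only the second cell exceeds its threshold
```

Running the function again whenever a new precipitation forecast arrives corresponds to the periodic execution mentioned for the provider process.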

GeoZS (Geological Survey of Slovenia) provides information about geo-hazards induced by mass

movement processes. Intended users are the Administration of the Republic of Slovenia for Civil Protection and

Disaster Relief, the Slovenian Environment Agency, municipalities, planners, infrastructure owners and

operators, first responders, the general public/citizens and real estate agencies.

The use case is split into 2 business processes:

the process for the provider in charge of pushing raw data and periodic execution of the

calculation of landslide susceptibility map,

the process for the user who searches and re-uses data created by the system.

2.2.3 UC3: Shakemaps

The Shakemaps service provides shake-maps for major earthquakes in the Greek region. Shake-maps are

maps showing ground movement and shaking intensity following major earthquakes. The shake-maps

are calculated using information about earthquakes that is extracted automatically in near-real time

(in a few minutes) from accelerogram records.

EPPO (Earthquake Planning and Protection Organisation) provides shake-maps for major earthquakes in

the Greek region. Other data providers can also use the Shakemaps service to calculate their own shake-maps.

The use case is split into 2 business processes:

the production process for the provider in charge of pushing initial data parameters, uploading

sensor data and executing calculation,

the process for the user who searches and re-uses data created by the system.

2.2.4 UC4: Pesticides in Groundwater

This use case makes it possible for users to find areas where there are high concentrations of pesticides

in the groundwater. The search can cover pesticides in general or specific pesticides. It is also possible to

restrict the output to pesticides found at a certain depth interval and/or from certain geology (lithology

or lithostratigraphy).

Intended users include NGOs, EEA, national environmental authorities, national or European

environmental portals and researchers.

The use case is split into 2 business processes:

the production process for the provider in charge of the configuration of the map and the

synchronization with local DB,

the process for the user who searches and re-uses data created by the system.

2.2.5 UC5: Ground Water Resources Management in Granular Aquifers

The use case provides data from field measurements, from chemical analyses in accredited laboratories

and from a database of various geospatial data. Given these data and the sound scientific knowledge of

the geological, hydrological, hydro-geological and hydro-chemical characteristics and properties of the

study areas, important interdisciplinary and multi-layered conclusions can be drawn, such as a) water

balance estimation and assessment, b) piezometric surface maps for dry and wet periods, c) hydro-chemical maps.

The use case is split into 2 business processes:

the production process to run groundwater statistical operations with geo-processing tools

(e.g. kriging) on the fly,

the process for the user to access data published.

2.2.6 UC6: Active Landslide Inventory Mapping and Susceptibility Zoning

The use case provides an active inventory map of the landslides that have occurred (e.g. location, classification,

volume, activity, date of occurrence of landslide, etc.) updated after every new event recorded in the

system. It is possible to retrieve data concerning the landslides’ characteristics (type of movement,

causes, season and year of occurrence, etc.), as well as any information available for the region of

occurrence (geology, precipitation, altitude, slope, etc.).

The calculation of the spatial probability produces a susceptibility zoning map available to the system.

The map results from analysing the spatial distribution of the landslides (landslides’ density) against

a group of generative causes (geological, topographical, hydrological and other characteristics of the

area), on the assumption that future landslides will occur under the same circumstances as those in

the past. The density map is recalculated every time new data for the

study area are provided to the system.
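
One simple way to relate landslide density to a single generative cause is to count the recorded landslides per class of that cause and divide by the area of the class. The sketch below follows that idea only as an illustration; the actual method used in UC6 is not detailed here, and the class names, areas and function name are hypothetical.

```python
from collections import Counter


def landslide_density(class_area_km2, landslide_classes):
    """Landslides per square kilometre for each class of one causal factor
    (for example lithology), given the class in which each landslide occurred."""
    counts = Counter(landslide_classes)
    return {
        cls: counts.get(cls, 0) / area
        for cls, area in class_area_km2.items()
        if area > 0  # classes without a mapped area are skipped
    }


areas = {"clay": 10.0, "limestone": 40.0}   # hypothetical class areas
events = ["clay", "clay", "limestone", "clay"]  # class of each recorded landslide
print(landslide_density(areas, events))
```

Appending a newly recorded event to the list and calling the function again mirrors the recalculation that takes place when new data reach the system.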

The use case is split into 2 business processes:

the production process to perform the calculation of the spatial probability to produce a

susceptibility zoning map,

the process for the user to access data published.

2.3 Actors

Data consumer: Actor consuming information provided by the InGeoCloudS platform. This

category of actors interacts with the system to find and view information, download data, and

access processing or geoscience modelling. The user can also consume Web services or APIs. In

most cases, this access requires no authentication, but authentication can be requested for

certain functions for safety or traceability reasons.

Data provider: Actor producing or collecting information (or having a function of data

collection) in relation to the geoscience and/or environmental themes of the project. This category

of actors interacts with the system to push information or launch the harvest of information

from their local system, configure processing dedicated to these data, develop models, etc. The

provider interacts with the platform in authenticated mode over secure access.

Application provider: Actor with the capability to integrate new applications and services in the

InGeoCloudS infrastructure. The application provider can use the API infrastructure and build

specific services; the use cases and their implementation can serve as examples of

integration.

Registered user: Actor that needs to be authenticated on the system to access additional

functionalities in providers’ applications (e.g. register for notification). The application provider

is fully responsible for access control to its application.

Public: Actor that is not authenticated. The InGeoCloudS platform sees public as a guest.

Stakeholder: Actor that does not interact directly with the system but has a direct or indirect

interest in the features offered by the system.

Administrator: Actor managing, monitoring and maintaining the InGeoCloudS platform. The

administrator is also in charge of the management of the accounts of type data provider and

application provider and also of the management and backup of workspaces.

Cloud provider: Actor responsible for the management of the underlying cloud platform. A

cloud provider provides the necessary infrastructure (IaaS) to run the InGeoCloudS platform but

does not interact directly with it (the InGeoCloudS platform is seen as a black box).

2.4 Services

The InGeoCloudS project provides cloud services for the environmental field on a scalable and efficient

infrastructure.

Within the frame of the public infrastructure, users consume the various services offered by the

platform, whether for viewing, downloading or reuse, through technical and interoperable services

or other Application Programming Interfaces (APIs).

2.4.1 Account management

Administrator creates/deletes/modifies accounts of type data provider or application provider. (All UCs)

Application provider creates/deletes/registers/unregisters accounts of type registered user. (All UCs)

Administrator, data provider, application provider, registered user modify their profile. (All UCs)
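
The account management rules above amount to a small permission table. The following sketch encodes them as a lookup; the role identifiers and the function name are hypothetical, chosen only for illustration.

```python
# Which role may manage (create, delete, modify or register) which account types,
# following the account management rules listed above.
MANAGES = {
    "administrator": {"data_provider", "application_provider"},
    "application_provider": {"registered_user"},
}


def can_manage(actor_role, target_account_type):
    """True if actor_role may manage accounts of the target type."""
    return target_account_type in MANAGES.get(actor_role, set())


print(can_manage("administrator", "data_provider"))
print(can_manage("data_provider", "registered_user"))
```

Profile modification is not part of the table because every authenticated role may modify its own profile.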

2.4.2 Workspace

Data/Application providers have a dedicated and secured workspace on the InGeoCloudS platform to

store their datasets. Workspace is mainly a dedicated directory on the file system, but a dedicated

database can optionally be created. (All UCs)

Administrator automatically and transparently creates a dedicated directory when a new account of

type Data provider or Application provider is created. (All UCs)

Administrator creates a dedicated database when an Application provider requests it. (UC1, UC2, UC4)
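
The automatic creation of a workspace directory at account creation can be sketched as follows. This is a minimal illustration assuming one directory per account under a common root; the directory layout, the account type identifiers and the function name are assumptions, not the InGeoCloudS implementation.

```python
import tempfile
from pathlib import Path

# Account types that receive a workspace (identifiers are hypothetical).
PROVIDER_TYPES = {"data_provider", "application_provider"}


def create_account(root, name, account_type):
    """Create the workspace directory for provider accounts and return it;
    return None for account types that get no workspace."""
    if account_type not in PROVIDER_TYPES:
        return None
    workspace = Path(root) / name
    # Owner-only permissions are requested; the effective mode depends on
    # the process umask and on the operating system.
    workspace.mkdir(mode=0o700, parents=True, exist_ok=False)
    return workspace


with tempfile.TemporaryDirectory() as root:
    workspace = create_account(root, "geozs", "data_provider")
    print(workspace.is_dir())
```

Using `exist_ok=False` makes a name clash between two accounts fail loudly instead of silently sharing a directory.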

2.4.3 Application installation and maintenance

The InGeoCloudS solution is a computing platform (PaaS) including operating system, programming

language execution environment, database, and web server.

The application providers have the capability to install their custom application on the cloud (web

applications dedicated to the publication of geo-referenced data or application utilities dedicated to the

synchronization of the dataset stored on the cloud with the dataset stored on premises).

- APPLICATION INSTALLATION

This use case defines how application providers manage the installation of application files stored in

their dedicated space on the cloud.

Application providers upload their application files using a file transfer utility (FTP, FTPS), a file transfer

API or a web application provided by InGeoCloudS. (All UCs)

Application providers connect to the InGeoCloudS platform using SSH to configure their application and

initialize its database. (UC1, UC2, UC4)
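
As an illustration of the upload step with a file transfer utility, the sketch below uses FTPS through Python's standard `ftplib` module. The host name, credentials and remote directory are placeholders, and the InGeoCloudS file transfer API and web application are not shown.

```python
from ftplib import FTP_TLS
from pathlib import Path


def open_session(host, user, password):
    """Open an FTPS session with an encrypted data channel."""
    ftp = FTP_TLS(host)
    ftp.login(user, password)
    ftp.prot_p()  # switch the data connection to TLS as well
    return ftp


def upload_files(ftp, local_paths, remote_dir="."):
    """Upload files over an already authenticated FTP(S) session and
    return the names of the uploaded files."""
    ftp.cwd(remote_dir)
    uploaded = []
    for path in map(Path, local_paths):
        with path.open("rb") as handle:
            ftp.storbinary(f"STOR {path.name}", handle)
        uploaded.append(path.name)
    return uploaded
```

A typical call would be `upload_files(open_session("ftp.example.org", "provider", "secret"), ["app.war"], "apps")`, where all arguments are hypothetical values.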

- APPLICATION MAINTENANCE

This use case defines how application providers maintain their application running in the cloud.

Application providers download (for analysis) the log files of their application using a file transfer utility,

a file transfer API or a web application provided by the InGeoCloudS platform.

2.4.4 Data import/Synchronization

The data providers choose the data to be pushed into InGeoCloudS. Each data provider has a dedicated

and secured storage space on the file system and on the database server. Only the owner of the data

can access it.

Thanks to dedicated features, data providers manage their data in InGeoCloudS, controlling how and

when data are pushed, updated or deleted.

- MANUAL MANAGEMENT OF DATASET FILES

This use case defines how data providers and/or application providers manage dataset files stored in

their dedicated space in InGeoCloudS.

o Data providers manage their dataset files (i.e. upload, move, delete, etc.) using a file

transfer utility, a file transfer API (the InGeoCloudS API), or a web application provided

by the platform (the InGeoCloudS workspace explorer) (All UCs)

o Application providers manage their dataset files using their own application.

- MANUAL MANAGEMENT OF DATASETS STORED IN A DATABASE

In this use case, data providers manage the content of their dedicated database using a web application

provided by the InGeoCloudS platform.

On the other hand, application providers manage the content of their dedicated database through their

own application.

- SYNCHRONIZATION OF DATASET FILES FROM DP’S OWN PREMISES

This use case allows data providers to update dataset files stored on InGeoCloudS, in their dedicated

space, using a synchronization process running on their own premises. Data providers typically configure

on their premises an automatic process that runs locally and updates the dataset files in InGeoCloudS,

through a standard transfer protocol (FTP, FTPS) or through a file transfer API provided by InGeoCloudS.
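
A local synchronization process of this kind first has to decide which files need to be pushed. The sketch below shows one possible approach, assuming the process keeps a record of the content digest of each file at its last upload; this record, the function names and the use of SHA-256 are assumptions, and the transfer itself (FTP, FTPS or API) is left out.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 hex digest of a file's content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def files_to_push(local_dir, remote_digests):
    """Names of local files that are new or differ from the remote copy.

    remote_digests maps a file name to the digest recorded at its last upload.
    """
    changed = []
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file() and remote_digests.get(path.name) != file_digest(path):
            changed.append(path.name)
    return changed
```

Comparing digests rather than timestamps avoids re-uploading files that were touched locally without any change of content.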

- SYNCHRONIZATION OF DATASETS IN INGEOCLOUDS

This use case allows application providers to register synchronization tasks that periodically update

dataset files stored in their dedicated space, and/or datasets stored in their dedicated database.

A synchronization task is an application installed by an application provider that is executed by

InGeoCloudS on behalf of the application provider:

o Application providers manage synchronization jobs (to list, register/unregister or modify

data) using a web application or a task management API provided by InGeoCloudS (UC2,

UC4)

The InGeoCloudS platform executes the registered tasks at the scheduled time on behalf of the

application provider.
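
The registration and scheduled execution of synchronization tasks can be pictured with a small in-memory register. This is a sketch under stated assumptions: the class, its method names and the fixed-interval scheduling are hypothetical and do not describe the InGeoCloudS task management API.

```python
from datetime import datetime, timedelta


class TaskRegistry:
    """In-memory register of periodic tasks, keyed by task name."""

    def __init__(self):
        self._tasks = {}

    def register(self, name, action, every, first_run):
        """Register a callable to run at first_run and then at each interval."""
        self._tasks[name] = {"action": action, "every": every, "next": first_run}

    def unregister(self, name):
        self._tasks.pop(name, None)

    def names(self):
        return sorted(self._tasks)

    def run_due(self, now):
        """Run every task whose scheduled time has passed; return their names."""
        ran = []
        for name in self.names():
            task = self._tasks[name]
            if task["next"] <= now:
                task["action"]()
                task["next"] = now + task["every"]
                ran.append(name)
        return ran


registry = TaskRegistry()
registry.register("sync-db", lambda: None, timedelta(hours=1), datetime(2015, 1, 1))
print(registry.run_due(datetime(2015, 1, 1)))
```

Passing the current time to `run_due` as an argument, instead of reading the clock inside the method, keeps the scheduling logic easy to test.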

2.4.5 Data publication services

This use case defines how data providers publish the datasets through interoperable OGC/INSPIRE

compliant services. They create or edit layers, configure Web Map Service (WMS) for raster data and/or

Web Feature Service (WFS) for vector data.

Data providers:

o create/open a map in their environment and define characteristics of the map. (UC1,

UC2, UC5, UC6)

o create layers in a map with their files or other public WMS/WFS services and define

characteristics of the layers. (UC1, UC2, UC5, UC6)

o publish each layer as WMS and/or WFS services. (UC1, UC2, UC5, UC6)

o publish the map on the Internet. (UC1, UC2, UC5, UC6)

In addition, data providers define access rights to their published data for critical, sensitive or

marketable data.

2.4.6 Metadata management

Metadata allow data providers to describe their data, maps and services. Metadata are published and

used by data consumers who are looking for particular datasets and services. These metadata are

available and managed through a so-called Catalogue.

In the InGeoCloudS solution, the standard tool used for managing and exposing discovery services is

GeoNetwork.

- METADATA CREATION AND STORAGE

In the use cases (UC1, UC2, UC5, UC6), Data providers:

- perform CRUD operations on data metadata, service metadata (protocols, security constraints,

etc.) and/or map metadata (i.e. map layers and portrayal parameters).

- publish these metadata (optionally compliant with OGC/INSPIRE standards: CSW)

- define access rights to their published metadata.

Application providers may publish their data/service metadata using other web service mechanisms.

In addition the data providers may register their own discovery service to a global discovery service (e.g.

Member State discovery service). This might help in improving trust in services and data.

2.4.7 Search service

Discovery services make it possible to search for spatial datasets and spatial data services through an

interoperable OGC/INSPIRE CSW service, on the basis of the content of the corresponding metadata, and to

display the metadata content.

- SEARCH and CONSULTATION THROUGH METADATA

In these cases (UC1, UC2, UC5, UC6), Data consumers:

o invoke the GetCapabilities operation in order to design and build queries on metadata

o browse a geo-portal by area of interest

o perform a keyword-based search on a geo metadata catalogue

These operations are performed through HTTP(S), following the CSW standard.
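
As an example of such an operation, a GetCapabilities request can be expressed as a plain HTTP(S) URL with key-value parameters. The sketch below assumes CSW version 2.0.2 and a placeholder catalogue endpoint; neither is stated in the InGeoCloudS documentation.

```python
from urllib.parse import urlencode


def csw_get_capabilities_url(endpoint):
    """URL of a CSW 2.0.2 GetCapabilities request (key-value-pair encoding)."""
    query = urlencode(
        {"service": "CSW", "version": "2.0.2", "request": "GetCapabilities"}
    )
    return f"{endpoint}?{query}"


# The endpoint below is a placeholder, not a real catalogue.
print(csw_get_capabilities_url("https://catalogue.example.org/csw"))
```

The response describes the operations and queryable properties of the catalogue, which is what a consumer needs before building metadata queries.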

- SEARCH and CONSULTATION THROUGH OTHER MEANS

In this use case (UC1, UC2, UC5, UC6), data consumers use the search functionalities of the application

provided by the application provider (InGeoCloudS SmartQueries, Shake Maps application, full text

search).

2.4.8 View service

View services make it possible, as a minimum, to display, navigate, zoom in/out, pan or overlay spatial

datasets and to display legend information and any relevant content of metadata.

- VISUALIZATION THROUGH STANDARDIZED SERVICES

In these use cases (UC1, UC2, UC5, UC6), Data consumers:

o display spatial data and interact with them (zoom in/out, navigate, pan, overlay)

through a web application provided by the cloud platform

o reuse the spatial data in their own applications or services

o consult legend information about the spatial data displayed.

These operations are performed through HTTP(S), following the WMS standard.
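The visualization operations listed above can be sketched as follows. In this minimal example, zooming and panning are expressed as changes to a bounding box, and an overlay is expressed as several layers in one WMS 1.3.0 GetMap request. The endpoint URL and layer names are hypothetical; the sketch assumes the CRS:84 coordinate reference system (longitude, latitude order) and default styles.

```python
from urllib.parse import urlencode

# Hypothetical map server endpoint, used for illustration only.
WMS_ENDPOINT = "https://maps.example.org/wms"


def zoom(bbox, factor):
    """Scale a (min_x, min_y, max_x, max_y) box around its centre.

    A factor greater than 1 zooms in; a factor between 0 and 1 zooms out.
    """
    min_x, min_y, max_x, max_y = bbox
    centre_x, centre_y = (min_x + max_x) / 2, (min_y + max_y) / 2
    half_w = (max_x - min_x) / (2 * factor)
    half_h = (max_y - min_y) / (2 * factor)
    return (centre_x - half_w, centre_y - half_h,
            centre_x + half_w, centre_y + half_h)


def pan(bbox, dx, dy):
    """Shift a bounding box by (dx, dy) in map units."""
    min_x, min_y, max_x, max_y = bbox
    return (min_x + dx, min_y + dy, max_x + dx, max_y + dy)


def get_map_url(layers, bbox, width=800, height=600, endpoint=WMS_ENDPOINT):
    """Build a WMS 1.3.0 GetMap URL; listing several layers overlays them."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),   # drawn in the order given
        "STYLES": "",                 # empty value requests default styles
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TRANSPARENT": "TRUE",
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    view = zoom((0.0, 40.0, 10.0, 50.0), 2)
    print(get_map_url(["geology", "boreholes"], view))
```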

- VISUALIZATION THROUGH OTHER MEANS

In this use case, Data consumers display spatial data and interact with them through a web application

provided by the application provider (inGeoCloudS SmartQueries, Shake Maps application).

2.4.9 Download service

Download Services enable copies of complete spatial datasets, or of parts of such sets, to be

downloaded.

- DOWNLOAD THROUGH STANDARDIZED SERVICES


In these use cases (UC1, UC2, UC5, UC6), Data consumers:

o download complete (or a part of) spatial datasets as static resources/URL (through

ATOM, WFS)

o query subsets of spatial datasets.

- DOWNLOAD THROUGH OTHER MEANS

In this use case, Data consumers download complete (or a part of) spatial datasets using:

o a file transfer utility

o the file transfer API provided by the InGeoCloudS platform

o the web application provided by the InGeoCloudS platform

o the web application provided by the application provider (UC3, UC4).

2.4.10 Processing service

This use case defines how:

o Application providers invoke a custom geo-processing service through the InGeoCloudS

API (HTTPS) to calculate or process their data. (UC2, UC3)

o The cloud provider creates one temporary instance to run the custom geo-processing

service. The instance is terminated after the geo-processing service has completed. (UC3)

o Application provider, data provider and data consumer invoke a shared geo-processing

service provided by the InGeoCloudS platform (e.g. kriging). (UC6)

2.4.11 Account billing

- The cloud provider periodically provides cost details for the usage of its services. (All UCs)

- The administrator splits costs between data providers and application providers. The split is based on

usage statistics by users. (All UCs)

- Application providers and data providers bear the costs according to the usage

statistics. (All UCs)
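The cost split described above can be sketched as a proportional allocation. This is a minimal example that assumes a single usage figure per provider (the source does not state which usage metric is used) and hypothetical provider names.

```python
from typing import Dict


def split_costs(total_cost: float, usage_by_provider: Dict[str, float]) -> Dict[str, float]:
    """Split a period's cloud cost between providers in proportion to usage.

    `usage_by_provider` maps a provider name to its recorded usage for the
    period (any consistent unit, e.g. CPU hours; the unit is an assumption).
    """
    total_usage = sum(usage_by_provider.values())
    if total_usage <= 0:
        raise ValueError("no recorded usage to split costs against")
    return {
        provider: total_cost * usage / total_usage
        for provider, usage in usage_by_provider.items()
    }


if __name__ == "__main__":
    shares = split_costs(100.0, {"data_provider_a": 3, "app_provider_b": 1})
    print(shares)
```

With a cost of 100.0 and usage figures of 3 and 1, the first provider is charged three quarters of the total and the second one quarter.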

2.4.12 Support, maintenance and monitoring service

- Cloud provider:

o provides services to administrator for monitoring the cloud services. (All UCs)

- Administrator:

o monitors the usage using the monitoring service provided by the InGeoCloudS platform.

(All UCs)

o adjusts the computation and storage capacities (scalability) of the InGeoCloudS platform

according to the usage statistics. (All UCs)


o automatically and transparently performs a periodic backup (every night) of the

workspaces (file system and database) of the data providers and application providers.

(All UCs)

- Application providers:

o explicitly back up their database using the InGeoCloudS API (HTTPS). (UC1,

UC2, UC4)

o perform basic maintenance operations (rebuild indexes, vacuum) on their database

using the InGeoCloudS API (HTTPS). (UC1, UC2, UC4)


2.5 Security in InGeoCloudS

2.5.1 Protecting data against unauthorized access

- AUTHENTICATION WITH FILE TRANSFER UTILITIES

In this use case, data providers and application providers need to authenticate to access their dedicated

space using a file transfer utility and a standard transfer protocol (FTPS, SCP, SFTP).

2.5.2 Protecting services against unauthorized access

- AUTHENTICATION FOR PUBLICATION

In this use case, data providers need to authenticate to publish their data.

2.5.3 Ensuring identity data protection

- SINGLE SIGN ON TO WEB APPLICATIONS AND APIs PROVIDED BY THE SYSTEM (USE CASE)

In this use case, users need to authenticate once in order to have access to the Web applications and

APIs provided by InGeoCloudS (according to their permissions).

- Users:

o log on once using a login page provided by the InGeoCloudS platform, or by calling a session

management API provided by the InGeoCloudS platform (valid session + HTTPS)

o can access web applications and/or call APIs provided by the InGeoCloudS platform

according to their permissions

o log out using a logout page provided by the InGeoCloudS platform, or by calling a session

management API provided by the InGeoCloudS platform (valid session + HTTPS)

- Security Manager:

o invalidates a user session using a web application provided by the InGeoCloudS platform,

or by calling a session management API provided by the InGeoCloudS platform
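The single sign-on behaviour described above (log on once, access according to permissions, log out, invalidation by the Security Manager) can be sketched with an in-memory session store. This is a minimal illustration, not the InGeoCloudS implementation: the class, method names, users and service names are hypothetical, and passwords are compared in plain text only to keep the example short.

```python
import secrets


class SessionManager:
    """In-memory sketch of a session management API."""

    def __init__(self, credentials, permissions):
        self._credentials = credentials  # user -> password (hash these in practice)
        self._permissions = permissions  # user -> set of permitted service names
        self._sessions = {}              # session token -> user

    def login(self, user, password):
        """Authenticate once and return a session token for later calls."""
        if user not in self._credentials or self._credentials[user] != password:
            raise PermissionError("invalid credentials")
        token = secrets.token_urlsafe(32)
        self._sessions[token] = user
        return token

    def can_access(self, token, service):
        """Check that the session is valid and the user may use the service."""
        user = self._sessions.get(token)
        return user is not None and service in self._permissions.get(user, set())

    def logout(self, token):
        """End one session at the user's request."""
        self._sessions.pop(token, None)

    def invalidate_user(self, user):
        """Security Manager operation: drop every session of the given user."""
        for token in [t for t, u in self._sessions.items() if u == user]:
            del self._sessions[token]


if __name__ == "__main__":
    manager = SessionManager({"alice": "s3cret"}, {"alice": {"view", "download"}})
    token = manager.login("alice", "s3cret")
    print(manager.can_access(token, "view"))
    manager.invalidate_user("alice")
    print(manager.can_access(token, "view"))
```

In a deployed system the token would travel over HTTPS only, matching the "valid session + HTTPS" condition stated above.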


3 EGDI-Scope

3.1 Overview

The EGDI-Scope project was a 2012-2014 feasibility study on the creation of a European Geological

Data Infrastructure (EGDI), co-funded by the European Commission under the 7th Framework

Programme for Research and Technological Development, and executed by a consortium of four

National Geological Surveys (NL, UK, FR, DK), the Catholic University of Leuven (BE) and EuroGeoSurveys

– with the direct contribution of the 32 member European Geological Surveys.

The EGDI-Scope study set the basis of a pan-European Geological Service for facilitating easy open

access to digital geological data at the European scale.

In the context of the CLARUS application case definition, the EGDI-Scope project has been identified as

representative of the “publication of geo-referenced data on the internet” domain, along with other EU

projects and directives (inGeoCloudS, Minerals4EU, INSPIRE, etc.).

3.2 Actors

3.2.1 EGDI stakeholder panel

A core group of pan-European institutions and projects was assembled in a “stakeholder panel”, which

the consortium consulted regularly in order to ensure that the project was on the right track. Among

members of this panel were the European Environment Agency (EEA), the Directorate General for

Enterprise and Industry (DG ENTR), the Directorate General Joint Research Centre (DG JRC) and

Insurance Europe.

3.2.2 EGDI user categories

Four general types of EGDI users have been identified, each of them being potentially subdivided

according to the sector they represent:

High level end-users: Users such as policy makers that will not need direct access to the

geological data infrastructure, but who depend on the ability for experts to have access to up-

to-date, reliable data to respond quickly to requests for information.


System end-users: Users that access the geological data infrastructure directly in order to find

data and information.

For example:

- end-users of inGeoCloudS, Minerals4EU, OneGeology-Europe, Promine, etc.

- some members of the stakeholder panel (i.e. the European Environment Agency and Insurance

Europe) as well as geological experts from different domains (EGS expert groups)

Data providers: Stakeholders that will feed data into the geological data infrastructure.

Representatives of all EuroGeoSurveys (the association of the European Geological Surveys)

members are involved.

Other stakeholders: Organizations that have an interest in EGDI‐Scope to ensure integration with

other projects and programs (on a political or technical level). This concerns past and ongoing

European projects, for example Minerals4EU.

3.3 Datasets

3.3.1 EGDI methodology for prioritizing datasets

In order to prioritize datasets (and identify infrastructure needs) the EGDI consortium developed a

questionnaire and distributed it to all engaged members of the Stakeholder Panel. In addition, a larger

number of stakeholders were identified and invited to fill in the questionnaire in order to get input

from as many different types of organizations as possible.

Furthermore, identification of geospatial datasets has been performed via the INSPIRE Monitoring and

Reporting web portal.

3.3.2 Relevant answers to EGDI stakeholder survey

The questionnaire was relatively simple, and aimed at asking people about their need for geological data

and services.

Among others, one question seems particularly relevant for CLARUS:

Do you have any current legal barriers relating to your use of geological data?

- “Occasionally, standard copyright policies might apply”

- “Data from wells”

- “For any private data, in particular borehole data”

- “Yes, reserves/resources data are confidential”

- “classified and confidential data”

- “Yes, issues of national security”

- “National legislation in other countries”


- “Yes. Law restriction.”

- “All geological maps freely available. Generated geological information from site investigations

depends on confidentiality.”

- “Regarding geological reports and borehole data (rights to inspection, copy rights) - Mineral

royalty - Intellectual property rights (IPR)”

- “Some data are not public and is hard to have access to them, even if I am part of the Geological

Survey”

- “Yes – legal regulations are not clear”

3.3.3 Compilation of the INSPIRE dataset inventory

In addition, key INSPIRE Indicators were selected to help create a synthesized overview of existing

spatial datasets (INSPIRE is an EU Directive aiming at standardizing infrastructures for spatial information

and data across Europe).

3.4 Use cases

Four use cases were selected by the EGDI‐Scope project consortium in order to study in detail

the future EGDI user needs. These use cases were chosen to represent different policy areas and

different stakeholder groups.

The aim of these use cases was to:

- Describe how geological data are used

- Assess the user requirements for data and functionality

- Assess the availability of geological datasets to fulfil the requirements

- Study the dependencies towards previous and ongoing projects

- Demonstrate interfaces to other e‐Infrastructures

- Assess legal, licensing and governance aspects

- Study issues of relevance for the EGDI architecture

- Study issues of relevance for the implementation of EGDI and conversion of existing datasets.

3.4.1 Geohazards: Ground Instability in densely populated areas

A number of European projects have been dealing with ground stability assessments in densely

populated areas. The present use case focuses on the PanGeo project and data.

Datasets

The PanGeo project has developed a ground stability GIS layer and a geohazard description report for a

number of cities around Europe (Urban atlas).

Users

- End-users: European politicians, general public or private companies

- Professional users: geological experts from within the national geological surveys, mining or

finance companies, geological scientists or scientists from non-geological domains.


Examples of required functionality

- Geohazard and demographic information for ground stability polygons (click‐info).

- Display of PSI data:

o Average annual velocities

o Cumulative displacements

- GIS Inquiry tools such as the visualization of ground motion time series for individual PSI points

(click‐info, click a point and visualize a graph of movement in time)

- Download of geohazard descriptions

- WMS/WFS

- Visualization in Google Earth

3.4.2 Raw Materials: Rare Earth Element potential within the European Union

This use case is closely connected to Minerals4EU.

Datasets

Rare Earth Elements (REE) are a group of critical raw materials. The sustainable supply of these elements

for European Industry is highly vulnerable.

Users

- End-users: European politicians, general public or private companies

- Professional users: geological experts from within the national geological surveys, mining or

finance companies, geological scientists or scientists from non-geological domains.

Examples of required functionality

- Interactive map functionality

- Download of printable maps

- WMS/WFS functionality

- Download in Excel format

- Download in GIS format

- Various search facilities (to be specified)

3.4.3 Renewable Energy - Planning for offshore Wind Farms

Users

- Private companies planning for the construction of wind farms

- Governmental agencies preparing calls for tender and evaluating applications

- Environmental agencies doing e.g. habitat mapping

Datasets


A wind farm project is planned taking into account a number of geological and geophysical data: e.g. the nature of the sea bottom, the potential impacts on the environment and potential submerged archaeological sites. The following data have been collected:

- Ground investigation

- Mapping of marine physical processes

- Marine water and sediment quality

- Marine and intertidal ecology

- Impact on other marine users

- Marine and coastal archaeology

Free and open geological and geophysical data are easily available and are of a high value when

preparing for wind farm applications. A number of European projects have for many years worked on

putting together harmonized geological and geophysical data and making these available via the web.

The most important of these projects to be considered by EGDI‐Scope are EMODnet‐geology and Geo‐Seas.

3.4.4 Geology and Soils – Ecosystem Mapping

The ecosystem mapping and assessment described within the current use case relates to the EU Biodiversity strategy. The part presented here concentrates on geoscientific studies/data sets about soils.

Users

- The European Environment Agency (EEA)

Datasets

The EGS geochemical mapping project (GEMAS) has recently produced a high quality pan‐European

dataset of geochemistry from agricultural and grazing soils. The dataset can supply important

information for the ecosystem assessment performed by EEA.

3.5 Trust and authentication

The study paid particular attention to legal and organizational aspects (particularly trust and

authentication matters).

3.5.1 Elements of trust

Trust in the EGDI may take different levels and different forms.

Trust in the data


According to the report on trust and authentication delivered in the 5th EGDI-Scope work package, “the

user has to feel comfortable in using the geological data sets offered”. “He or she has to have enough

guarantees and safeguards that the data are reliable and of sufficient quality and proper format for the

objectives he wants to obtain”. This could be done through:

- Metadata

- Transparent quality assessment procedures

- Security measures for maintaining the authenticity and integrity of the data

Trust in the services

If a user has to rely on obtaining data via services such as the INSPIRE network services, he or she has to

be able to rely on the availability of these services whenever they are needed. This implies:

- A sufficient level of service has to be guaranteed by the service providers in the EGDI

- The offered level of service has to be communicated clearly to the users of the services via what is generally referred to as service level agreements or terms of service.

- The required level of service is to a large extent determined by the INSPIRE implementing rules relating to the network services.

Trust in the people

Trust in the people covers different aspects, such as authentication and identity management, rights management and appropriate personal data protection. It may also require an appropriate identity management system to be set up, allowing for cross-border transactions while not imposing too heavy a burden on the users of the system. Furthermore, the EGDI consortium has differentiated:

- Trust from the perspective of data providers: depending on the data and use conditions, it may be important for them to know who is using their data and how they are using it.

- Trust from the perspective of users: it is important for them to know who the data stem from and that access to and use of the data are not unnecessarily restricted. They also need to be sure that the information on their identity and their use of the data is not misused by the data provider.

3.5.2 Moving to the cloud

As stated in the EGDI-Scope report, moving to the cloud “may have considerable benefits relating to

scalability and efficiency”. “However, there are a number of risks and possible disadvantages that need

to be taken into account.”

These risks may even be intensified if the EGDI uses multiple cloud service providers.

In order to identify these risks, one may refer to ENISA's extensive report “Cloud computing. Benefits,

risks and recommendations for information security.”


Risks about security

The user will depend on the cloud service provider’s security measures, and in most cases he or she will

not be able to impose any security requirements.

Inadequate security may lead to “loss of data”, “corruption of data”, “problems in extracting the data

from the cloud service”, “unintended exposure of data”, or “continuity problems”.

Risks about personal data protection

A particular type of data requiring confidentiality is personal data. The data sets included in the EGDI

cannot be qualified as personal data, and the only personal data that will be processed in the EGDI will

most likely be the data regarding the persons accessing the data and services in case of restricted data.

The processing of such data has to comply with specific requirements under the Data Protection

Directive.

The study for an EGDI concluded that a key question in the context of cloud computing concerns “the

division of responsibilities and liability between the different actors in the cloud computing value chain,

and the determining of the processors and controllers of the data processing operations and their

obligations.” Putting personal data in the cloud implies a “transfer of personal data to third parties

under the Data Protection Directive, possibly to countries outside of the European Union”.

Risks about control and ownership

For example, it should be ensured that the data are properly deleted after the end of the contract,

following an appropriate transition period.


4 Minerals4EU

4.1 The Raw Materials Initiative

Raw materials are fundamental to Europe’s economy, and they are essential for maintaining and

improving our quality of life. Recent years have seen a rapid growth in the number of materials used

across products. Securing reliable and undistorted access to certain raw materials is of growing concern

within the EU and across the globe. As a consequence of these circumstances, the Raw Materials

Initiative was instigated to manage responses to raw materials issues at an EU level. Critical raw

materials have a high economic importance to the EU combined with a high risk associated with their

supply.

4.2 The Project

The Minerals4EU project is designed to meet the recommendations of the Raw Materials Initiative and

will develop an EU Mineral intelligence network structure delivering a web portal, a European Minerals

Yearbook and foresight studies.

The network will provide data, information and knowledge on mineral resources around Europe, based

on an accepted business model, making a fundamental contribution to the European Innovation

Partnership on Raw Materials (EIP RM), seen by the Competitiveness Council as key for the successful

implementation of the major EU2020 policies.

4.3 Actors: Data owners and consumers

Initially, the network membership will be drawn from the consortium, whose members are expected

to be the data owners at EU level. The consortium mainly comprises the National Geological Surveys

and other key EU players, including all the main European raw materials industry associations

(EUROMINES, IMA, EUROMETAUX and UEPG). The network must also be extended to include academics,

and the main technology and related platforms. The involvement of the European Technology Platform

on Sustainable Mineral Resources (ETP SMR), which brings together all the European industrial players,

is guaranteed by EuroGeoSurveys [1].

[1] www.eurogeosurveys.org


An Industrial Consultation Committee has been formed of main stakeholders and end user industries.

Academic institutes are not normally the owners of large, regional data sets, but may be able to

contribute some data. A major party interested in the outcome of data processing is the European

Commission itself. The network should therefore strive to become the authoritative source on which the

European Commission can base its judgments and policy making.

The following types of stakeholders have been identified as recipients of information about the results of

the project, since they may become users of the data collected and referenced:

- Experts/professionals from the industry sector or their representative associations, as well as research and innovation institutes, that are involved in a Sustainable Consumption and Production approach following a circular economy, from natural resources, design, manufacturing, distribution and use to collection and reuse, recycling and recovery.

- Environmental organizations/professionals, regarding the existing initiatives on good practices on raw materials sustainability and resource efficiency. Special attention will be given to clarifying the production, transformation and recycling routes followed by ‘Raw Materials’ during their lifespan. The alternatives for specific raw materials in terms of natural occurrences, needs, toxicity, hazard and risk will be presented in parallel with the relevant regulations.

- Social and labor bodies, which will be informed on the existing work practices for the production/transformation of raw materials by the different main producers.

4.4 Services

4.4.1 Mineral Statistics

A central part of any “Mineral Intelligence Network” is the ‘intelligence’, in other words the quality

controlled data and information derived from that data.

Existing datasets relating to mineral production and trade will be enhanced. New datasets for

exploration activity, resources and reserves and secondary raw materials will be developed. As data are

received, an initial assessment will be conducted, which will consider data quality, requirements for

harmonization, and identify key data and knowledge gaps.