OpenAIRE and EUDAT services and tools to support FAIR DMP implementation


Upload: research-data-alliance

Post on 16-Apr-2017


TRANSCRIPT

Page 1: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Credits: OpenAIRE team
Sarah Jones, Digital Curation Centre, DCC (UK)
Marjon Grootweld, DANS (NL)
Natalia Manola, ATR (GR)

RDA National Event in Italy, 14-15 November 2016
FAIR Data Management: best practices and open issues

Paola Gargiulo, OpenAIRE NOAD/Cineca

Page 2: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Agenda
• The Open Research Data Pilot
• The data management plan
• OpenAIRE tools and services for the Data Pilot
• EUDAT data services

Page 3: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Open Research Data Pilot (2015-2016): aims

To make the research data generated by selected Horizon 2020 projects accessible with as few restrictions as possible, while at the same time protecting sensitive data from inappropriate access.

EC: information already paid for by the public should not be paid for again.

Open data is data that is free to access and reuse.

Page 4: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Whom does the Data Pilot concern?

Current situation 2015-2016:
• Researchers funded by Horizon 2020 within 9 specified call areas.
• Opting out and opting in are both possible, and both are being used.
• Call areas: https://www.openaire.eu/opendatapilot

As of 2017:
• European Cloud Initiative to give Europe a global lead in the data-driven economy.
• Open data will become the default option. The pilot will be extended to cover all call areas. Opting out remains possible.
• Press release: http://europa.eu/rapid/press-release_IP-16-1408_en.htm

Daniel Spichtinger (EC) at OpenCon 14-11-15: of 3,699 signed Horizon 2020 grant agreements, 149 of 431 projects in core areas opted out and 409 of 3,268 projects in other areas opted in.
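The opt-out and opt-in figures above are easier to compare as percentages. The following is a minimal sketch of that arithmetic; the constant and function names are illustrative, and the "participating overall" figure is a derived number, assuming that every core-area project that did not opt out takes part.

```python
# Figures quoted on the slide (Daniel Spichtinger, OpenCon).
CORE_PROJECTS = 431      # projects in the core (pilot) call areas
CORE_OPTED_OUT = 149
OTHER_PROJECTS = 3268    # projects in all other call areas
OTHER_OPTED_IN = 409


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


opt_out_rate = share(CORE_OPTED_OUT, CORE_PROJECTS)
opt_in_rate = share(OTHER_OPTED_IN, OTHER_PROJECTS)

# Assumption: core projects that did not opt out participate,
# and so do the projects from other areas that opted in.
participating = (CORE_PROJECTS - CORE_OPTED_OUT) + OTHER_OPTED_IN
total = CORE_PROJECTS + OTHER_PROJECTS
overall_rate = share(participating, total)

print(f"Core-area opt-out rate: {opt_out_rate}%")
print(f"Other-area opt-in rate: {opt_in_rate}%")
print(f"Participating overall:  {participating} of {total} ({overall_rate}%)")
```

The two groups add up to the 3,699 signed grant agreements mentioned on the slide, which is a quick consistency check on the quoted numbers.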

Page 5: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Which research has to participate in the pilot? (2015-2016)
• Future and Emerging Technologies
• Research infrastructures (new: coverage of the whole area)
• Leadership in enabling and industrial technologies – Information and Communication Technologies
• Nanotechnologies, Advanced Materials, Advanced Manufacturing and Processing, and Biotechnology: ‘nanosafety’ and ‘modelling’ topics (new)
• Societal Challenge: Food security, sustainable agriculture and forestry, marine and maritime and inland water research and the bioeconomy - selected topics as specified in the work programme (new)
• Societal Challenge: Climate Action, Environment, Resource Efficiency and Raw materials – except raw materials
• Societal Challenge: Europe in a changing world – inclusive, innovative and reflective Societies
• Science with and for Society
• Cross-cutting activities - focus areas – part Smart and Sustainable Cities (moved from Energy WP)

Page 6: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Horizon 2020 Open Data by Default from 2017

Just announced!

Page 7: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Two types of data:
• Data, including metadata, needed to validate the results in scientific publications
• Other data, including metadata, as specified in the Data Management Plan, like raw data

Page 8: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

The following slides come from the EC’s open access team and provide an overview of the key points. Content from Jean-Francois Dechamp and colleagues.

Mail: [email protected] Web: http://ec.europa.eu/research/openscience/index.cfm Twitter: @OpenAccessEC


Page 9: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Publications
Openly accessible and minable. Eligible costs for APCs.

Research data
Openly accessible research data can typically be accessed, mined, exploited, reproduced and disseminated free of charge for the user.

Page 10: OpenAIRE and Eudat services and tools to support FAIR DMP implementation
Page 11: OpenAIRE and Eudat services and tools to support FAIR DMP implementation
Page 12: OpenAIRE and Eudat services and tools to support FAIR DMP implementation
Page 13: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Three top reasons to opt out

Whether a (proposed) project participates in the Open Research Data (ORD) Pilot or chooses to opt out does not affect the evaluation of that project. Proposals will not be penalised for opting out.

Page 14: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Reasons for opting out:
• participation is incompatible with the Horizon 2020 obligation to protect results that can reasonably be expected to be commercially or industrially exploited;
• participation is incompatible with the need for confidentiality in connection with security issues;
• participation is incompatible with rules on protecting personal data;
• the project will not generate / collect any research data; or
• there are other legitimate reasons not to take part in the Pilot.

Note that partial opt out is possible, and preferable to full opt out!

Page 16: OpenAIRE and Eudat services and tools to support FAIR DMP implementation
Page 17: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

FAIR data
• Findable
  – Assign persistent IDs, provide rich metadata, register in a searchable resource...
• Accessible
  – Retrievable by their ID using a standard protocol; metadata remain accessible even if data aren’t...
• Interoperable
  – Use formal, broadly applicable languages, use standard vocabularies, qualified references...
• Reusable
  – Rich, accurate metadata, clear licences, provenance, use of community standards...

www.force11.org/group/fairgroup/fairprinciples
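"Retrievable by their ID using a standard protocol" can be illustrated with a DOI, which resolves over HTTPS through the public resolver at doi.org. The sketch below only builds the URL and makes no network call; the function name `doi_to_url` is hypothetical, and the example DOI is that of the FAIR Guiding Principles article in Scientific Data.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI (or one prefixed with 'doi:') into an HTTPS URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Every DOI starts with the directory indicator '10.' and has a '/' separator.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep '/' readable; percent-encode anything unsafe in a URL path.
    return DOI_RESOLVER + quote(doi, safe="/")


example = doi_to_url("doi:10.1038/sdata.2016.18")
print(example)
```

Because the identifier is persistent, the same URL should keep working even if the publisher or repository later moves the landing page.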

Page 18: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Findable
• Use metadata and specify standards for metadata creation (if any). If there are no standards in your discipline, describe what type of metadata will be created and how
• Search keywords
• Persistent and unique identifiers such as DOIs or other handles
• File and folder naming conventions
• Versioning of the datasets and clear version numbers
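A file naming convention with clear version numbers is easiest to keep when names are generated rather than typed. The sketch below shows one possible convention (project, content, ISO date, version); the pattern and the helper name `dataset_filename` are assumptions for illustration, not a convention prescribed by the pilot.

```python
import re
from datetime import date


def dataset_filename(project: str, content: str, created: date,
                     version: tuple[int, int], extension: str) -> str:
    """Build a name of the form project_content_YYYY-MM-DD_vMAJOR.MINOR.ext."""
    def slug(text: str) -> str:
        # Lower-case, and collapse every run of other characters into one '-'.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    major, minor = version
    return (f"{slug(project)}_{slug(content)}_{created.isoformat()}"
            f"_v{major}.{minor}.{extension.lstrip('.').lower()}")


name = dataset_filename("My Project", "Soil samples (raw)",
                        date(2016, 11, 14), (1, 2), ".CSV")
print(name)
```

Whatever convention a project chooses, it should be described in the DMP so that others can interpret the file names.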

Page 19: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Metadata and documentation
• Metadata and documentation are needed to find and understand research data
• Think about what others would need in order to find, evaluate, understand, and reuse your data
• Get others to check the metadata to improve quality
• Use standards to enable interoperability

Page 20: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Where to find metadata standards

Metadata Standards Directory
Broad, disciplinary listing of standards and tools
Maintained by RDA group
http://rd-alliance.github.io/metadata-directory

Biosharing
A portal of data standards, databases, and policies for life, environmental and biomedical sciences
https://biosharing.org

Page 21: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Accessible
• Explain which data can’t be shared openly, if any
• Specify how access will be provided in case of restrictions, e.g., through a data committee, a license, or arranged with the repository
• Will methods or software tools needed to access the data (if any) be included or documented?
• Deposit the data and associated metadata, documentation and code preferably in certified repositories which support Open Access

Page 22: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Where to find a repository?
More information: https://www.openaire.eu/opendatapilot-repository

What to deposit?
a. the data needed to validate the results presented in scientific publications, including the metadata;
b. any other data, including the metadata, as specified in the DMP;
c. for both a and b, the documentation and the tools that are needed to validate the results, e.g. specialised software or software code, algorithms and analysis protocols (when possible, these instruments themselves).

Page 23: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Shop around

http://databib.org

http://service.re3data.org

Re3data is a registry of data repositories

RDM Seminar @ ISERD, Tel Aviv - Oct 1, 2016

Searching with Re3data.org
www.fosteropenscience.eu/content/re3data-demo

Page 24: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Interoperable
• Interoperability on data and metadata, on data exchange formats and protocols
• Specify what data and metadata vocabularies, standards or methodologies you will follow to facilitate interoperability
• Standard vocabulary to allow inter-disciplinary interoperability or a mapping from your vocabulary to more commonly used ontologies?

Aim for compliance with globally accepted practices

Page 25: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Reusable
• Clarify licences early on
• License the data to permit the widest reuse possible
• Specify a data embargo, if needed
• If data re-use by third parties is restricted, explain why
• How long will the data remain reusable?
• Describe data quality assurance processes

Page 26: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

License research data openly
www.dcc.ac.uk/resources/how-guides/license-research-data
DCC guide outlines the pros and cons of each approach and gives practical advice on how to implement your licence

CREATIVE COMMONS LIMITATIONS
• NC (Non-Commercial): what counts as commercial?
• ND (No Derivatives): severely restricts use
Licences with these clauses are not open licences

Horizon 2020 Open Access guidelines point to: [two licence logos, not preserved in this transcript]
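The point about NC and ND can be turned into a simple check on a Creative Commons licence code. This is a minimal sketch, assuming licence codes written in the usual "CC BY-NC 4.0" style; the function name `is_open_licence` is hypothetical and the check deliberately covers Creative Commons codes only.

```python
import re

# Clauses that, per the slide, keep a Creative Commons licence from being open.
RESTRICTIVE_CLAUSES = {"NC", "ND"}


def is_open_licence(code: str) -> bool:
    """Return True when a Creative Commons code has neither NC nor ND."""
    parts = re.split(r"[\s\-]+", code.strip().upper())
    if not parts[0].startswith("CC"):
        raise ValueError(f"not a Creative Commons licence code: {code!r}")
    return RESTRICTIVE_CLAUSES.isdisjoint(parts)


for licence in ("CC BY 4.0", "CC0 1.0", "CC BY-NC 4.0", "CC BY-NC-ND 4.0"):
    verdict = "open" if is_open_licence(licence) else "not open"
    print(f"{licence}: {verdict}")
```

A check like this could run before deposit to flag datasets whose chosen licence would restrict reuse more than intended.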

Page 27: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

PLANNING DATA MANAGEMENT


Page 28: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

What is a data management plan?
A plan written at the start of a project to define:
• how the data will be created
• how it will be documented
• who will access it
• where it will be stored
• who will back it up
• whether (and how) it will be shared & preserved

DMPs are often submitted as part of grant applications, but are useful whenever researchers are creating data.

The DMP is a living document. You are not required to provide detailed answers to all the questions in the first version of the DMP (due M6).

Explain any selection criteria in the DMP

Page 29: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

When to submit the DMP
• Note that the Commission does NOT require applicants to submit a DMP at the proposal stage.
• A DMP is therefore NOT part of the evaluation.
• DMPs are a deliverable.
• Note that the Commission requires updates. A DMP is a living or “active” document.

Page 30: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

What aspects of RDM should be in a DMP?
• What data will be created (format, types, volume...)
• Standards and methodologies to be used (incl. metadata)
• How ethics and Intellectual Property will be addressed
• Plans for data sharing and access
• Strategy for long-term preservation

Data lifecycle: Create, Document, Use, Store, Share, Preserve

A DMP is a plan to share!
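Since a DMP is a living document, it can help to track which of these aspects have been answered in the current version. The sketch below models the five aspects from the slide as fields of a small data class; the class and field names are illustrative assumptions and do not correspond to an official Horizon 2020 template.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """One free-text answer per aspect; an empty string means 'not answered yet'."""
    data_description: str = ""        # what data will be created (format, types, volume)
    standards_and_metadata: str = ""  # standards and methodologies, incl. metadata
    ethics_and_ip: str = ""           # ethics and Intellectual Property
    sharing_and_access: str = ""      # plans for data sharing and access
    long_term_preservation: str = ""  # strategy for long-term preservation
    version: int = 1                  # bumped with every update of the plan

    def open_questions(self) -> list[str]:
        """Names of the text aspects that still need an answer."""
        return [f.name for f in fields(self)
                if isinstance(getattr(self, f.name), str)
                and not getattr(self, f.name).strip()]


plan = DataManagementPlan(
    data_description="CSV survey data, about 2 GB",
    sharing_and_access="Deposit in a certified repository",
)
print(plan.open_questions())
```

A first version may legitimately leave aspects open; the list simply shows what a later version of the plan still has to cover.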

Page 31: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

What should be deposited?
• The data needed to validate results in scientific publications (minimally!).
• The associated metadata: the dataset’s creator, title, year of publication, repository, identifier etc.
  – Follow a metadata standard in your line of work, or a generic standard, e.g. Dublin Core or DataCite, and be FAIR.
  – The repository will assign a persistent ID to the dataset: important for discovering and citing the data.
• Documentation like code books, lab journals, informed consent forms: domain-dependent, and important for understanding the data and combining them with other data sources.
• Software, hardware, tools, syntax queries, machine configurations: domain-dependent, and important for using the data. (Alternative: information about the software etc.)

Basically, everything that is needed to replicate a study should be available for others. But more is welcome! More data, more information in the package… and described in the DMP.

Research Data Alliance (RDA): http://rd-alliance.github.io/metadata-directory/standards/
FAIR Guiding Principles for scientific data management: http://www.nature.com/articles/sdata201618
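The minimal metadata named above (creator, title, year of publication, repository, identifier) can be checked for completeness before deposit. This is a sketch under stated assumptions: the field names are illustrative and are not the official element names of Dublin Core or DataCite, and the example record, including its identifier, is a made-up placeholder.

```python
# Minimal descriptive fields named on the slide (illustrative key names).
REQUIRED_FIELDS = ("creator", "title", "publication_year", "repository", "identifier")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Hypothetical record; every value is a placeholder for illustration.
record = {
    "creator": "Doe, Jane",
    "title": "Example survey dataset",
    "publication_year": 2016,
    "repository": "",  # not chosen yet
}
print(missing_fields(record))
```

In practice the repository will usually supply the identifier on deposit, so an empty `identifier` before submission is expected.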

Page 32: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Archive the data openly, unless…

• Confidentiality and security issues can be good reasons not to publish or share all of the data. State in the DMP the reasons for not giving access, and deposit that part of the data under a Restricted Access regime.

• When regenerating data would be cheaper than archiving, don’t archive. Spend time on selecting what data you’ll need and want to retain. Justify your criteria in the DMP.

See http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
For selection criteria see https://www.openaire.eu/opendatapilot

Grant Agreement, Art. 29.3, Open Access to research data:

Page 33: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

A DMP is about ‘keeping’ data
• Storing data ≠ archiving data
• Archived data ≠ findable data
• Findable ≠ accessible
• Accessible ≠ understandable
• Understandable ≠ usable

• A USB stick is not safe
• Figshare is not a Trustworthy Digital Repository
• A persistent identifier is essential but no guarantee for usability
• Data in a proprietary format are not sustainable

Page 34: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

How much does it cost? Who pays?
• What are the costs for making data FAIR in your project?
• Resources needed for long term preservation
• Check the UK Data Service Costing model
• The High Level Expert Group on the European Open Science Cloud recommends that “well budgeted data stewardship plans should be made mandatory and we expect that on average about 5% of research expenditure should be spent on properly managing and stewarding data”
• Who pays? How?

UKDS model: http://www.data-archive.ac.uk/create-manage/planning-for-sharing/costing
HLEG report: http://ec.europa.eu/research/openscience/pdf/realising_the_european_open_science_cloud_2016.pdf#view=fit&pagemode=none
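The 5% figure from the High Level Expert Group gives a quick first estimate for a data stewardship budget line. The sketch below applies that share to a hypothetical project budget; the function name and the EUR 2,000,000 example are assumptions, and a real costing exercise (for instance with the UK Data Service model) would itemise the costs instead.

```python
def stewardship_budget(total_research_budget: float, share: float = 0.05) -> float:
    """Estimate the data stewardship budget as a share of research expenditure.

    The default share of 0.05 is the "about 5%" average quoted by the HLEG.
    """
    if not 0 <= share <= 1:
        raise ValueError("share must be between 0 and 1")
    return round(total_research_budget * share, 2)


# Hypothetical project with a research budget of EUR 2,000,000.
budget = stewardship_budget(2_000_000)
print(f"Estimated data stewardship budget: EUR {budget:,.2f}")
```

Treat the result as an average across projects: data-intensive work may need a larger share, and projects that produce little data a smaller one.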

Page 35: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

http://www.curationexchange.org/

Costs?

Page 36: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

OPENAIRE SUPPORT KIT FOR THE DATA PILOT


Page 37: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

OPEN ACCESS

OpenAIRE implements the EC requirements & SUPPORTS THE OPEN DATA PILOT

Page 38: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Open Access infrastructure for research in Europe
THE POINT OF REFERENCE FOR OPEN ACCESS IN EUROPE

Human Network
• NOADs: National Open Access Desks
• Monitor and foster the adoption of Open Access policies at the local level
• Support researchers in the implementation of the Open Data Pilot
• FP7 post-grant APCs Pilot

e-infrastructure
• e-infrastructure for monitoring the impact of OA mandates and research projects
• OpenAIRE guidelines for metadata exchange
• Zenodo: repository for the deposition of research products

50 Partners: EU countries, data centers, universities, libraries, repositories

Page 39: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Integrated Scientific Information System
• 17.3 million unique publications
• 760+ validated data providers
• 370K publications linked to projects from 6 funders
• 28K datasets linked to publications
• 3.5K links to software repositories
• 33K organizations

Diagram labels: Organizations, Projects, Authors, Datasets, Publications, Data Providers, Software, Facilities, Methods, Research Communities

OpenAIRE-Connect: from January 2017

Page 40: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

OpenAIRE support materials
Briefing papers, factsheets, webinars, workshops, FAQs

Information on
• Open Research Data Pilot
• Creating a data management plan
• Selecting a data repository
• Personal data

Developing guidance to add to DMPonline

https://www.openaire.eu/opendatapilot
https://www.openaire.eu/support

Page 41: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Information at the OpenAIRE website
• Open Research Data Pilot: https://www.openaire.eu/opendatapilot
  – What is the pilot? Which H2020 strands are required to participate? What practical steps should the researcher take?
• Create a Data Management Plan: https://www.openaire.eu/opendatapilot-dmp
  – Information about how to create a Data Management Plan. First steps; when to write and revise your Data Management Plan
• Select a Data Repository: https://www.openaire.eu/opendatapilot-repository
  – Information about how to select a repository
• Frequently Asked Questions about the Open Research Data Pilot: https://www.openaire.eu/support/faq

Page 42: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Webinars
Webinar page: https://www.openaire.eu/webinars/
• E.g. about Pilot, Zenodo and data management, but also about OA publications and how to make your repository OpenAIRE compatible.
• Various presenters; slides included.

Page 43: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

OpenAIRE factsheets


Page 44: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Briefing Paper RDM
OpenAIRE Research Data Management Briefing Paper
• https://www.openaire.eu/briefpaper-rdm-infonoads
• This extensive briefing paper gives an overview of Research Data Management with practical sections about data management planning, and archiving the research data for reuse.

Page 45: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

OpenAIRE services
• Researchers
  – Zenodo for all types of publications, data and software
  – Claiming: linking research results
  – Amnesia, an anonymization tool for all
• Data providers: Interoperability Guidelines, validation,…
• Project coordinators: reporting
• Funders and institutions: monitoring
• Research communities: gathering, monitoring all research

DASHBOARDS

Page 46: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Zenodo
Multi-disciplinary repository used for the long tail of research data
• An OpenAIRE-CERN joint effort
• Multidisciplinary repository accepting
  – Multiple data types
  – Publications
  – Software, with a link to GitHub
• Assigns a Digital Object Identifier (DOI), up to 50 GB per dataset
• Links funding, publications, data & software

www.zenodo.org
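Before choosing Zenodo for a dataset, it is worth checking the size of the folder to be deposited against the 50 GB per dataset figure above. The sketch below sums file sizes locally and does not talk to Zenodo at all; the helper names are hypothetical, and reading "50 GB" as decimal gigabytes is an assumption.

```python
import tempfile
from pathlib import Path

# "Up to 50 GB per dataset", read here as decimal gigabytes (an assumption).
ZENODO_LIMIT_BYTES = 50 * 10**9


def dataset_size(folder: Path) -> int:
    """Total size in bytes of all regular files below the folder."""
    return sum(p.stat().st_size for p in folder.rglob("*") if p.is_file())


def fits_limit(size_bytes: int, limit_bytes: int = ZENODO_LIMIT_BYTES) -> bool:
    """True when a dataset of this size stays within the limit."""
    return size_bytes <= limit_bytes


# Small self-contained demonstration using a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data.csv").write_text("id,value\n1,42\n")
    demo_size = dataset_size(Path(tmp))

print(f"{demo_size} bytes, fits within limit: {fits_limit(demo_size)}")
```

The `limit_bytes` parameter makes the same check reusable for other services with a per-dataset or per-user quota.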

Page 47: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

What is DMPonline?

•A web-based tool to help researchers write Data Management and Sharing Plans

•Includes requirements and guidance from funders, universities and other groups

•Developed by the Digital Curation Centre

Page 48: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

How to write a DMP
• Template available from https://dmponline.dcc.ac.uk/
• And from a few national DMPonline sites, e.g. in Spain and Belgium
  – Spain: http://pgd.consorciomadrono.es/
  – Belgium: pilot and therefore limited to authorised persons

See https://www.openaire.eu/opendatapilot-dmp

Page 49: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

https://dmponline.dcc.ac.uk

DMPonline (free tool)

Page 50: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

DMPonline

A web-based tool to help researchers write DMPs

Includes a template for Horizon 2020

Guidance from EUDAT and OpenAIRE being added

https://dmponline.dcc.ac.uk

Page 51: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

New H2020 template


Page 52: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Deliver the DMP

EC: “Since DMPs are expected to mature during the project, more developed versions of the plan can be included as additional deliverables at later stages. (…) New versions of the DMP should be created whenever important changes to the project occur due to inclusion of new data sets, changes in consortium policies or external factors.”

Page 53: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

Where to find a repository?

More information: https://www.openaire.eu/opendatapilot-repository
Zenodo: http://www.zenodo.org/
Re3data.org: http://www.re3data.org/

Page 54: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

How to select a repository? 1/2
Main criteria for choosing a data repository:
• Certification as a ‘Trustworthy Digital Repository’, with an explicit ambition to keep the data available in the long term.
• Network of trustworthy digital repositories for long-term preservation of the data after the research is finished.
• Three common certification standards for TDRs:
  – Data Seal of Approval: http://datasealofapproval.org/en/
  – nestor seal for DIN 31644: http://www.langzeitarchivierung.de/Subsites/nestor/EN/nestor-Siegel/siegel_node.html
  – ISO 16363: http://www.iso16363.org/

ICSU-WDS: https://www.icsu-wds.org/

Page 55: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

How to select a repository? 2/2
Main criteria for choosing a data repository:
• Certification as a ‘Trustworthy Digital Repository’, with an explicit ambition to keep the data available in the long term.
• Matches your particular data needs and is FAIR compliant: e.g. certain file formats; mixture of Open and Restricted Access. So contact the repository of your choice when writing the first version of your DMP, or earlier.
• Provides guidance on metadata and on how to cite the data that has been deposited.
• Gives your submitted dataset a persistent and globally unique identifier: for sustainable citations, both for data and publications, and to link back to particular researchers and grants.

https://www.openaire.eu/opendatapilot-repository
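The selection criteria can be used as a simple checklist when comparing candidate repositories. The sketch below encodes the four criteria from this slide and reports which ones a candidate does not meet; the dictionary keys, the function name and the example repository are all hypothetical.

```python
# The four criteria from the slide, keyed by illustrative short names.
CRITERIA = {
    "certified_tdr": "Certified as a Trustworthy Digital Repository",
    "fits_data_needs": "Matches your data needs (file formats, Open and Restricted Access)",
    "metadata_and_citation_guidance": "Provides guidance on metadata and data citation",
    "persistent_identifier": "Assigns a persistent, globally unique identifier",
}


def unmet_criteria(repository: dict[str, bool]) -> list[str]:
    """Return the descriptions of all criteria the repository does not meet."""
    return [text for key, text in CRITERIA.items()
            if not repository.get(key, False)]


# Hypothetical candidate: answers collected while writing the first DMP version.
candidate = {
    "certified_tdr": False,
    "fits_data_needs": True,
    "metadata_and_citation_guidance": True,
    "persistent_identifier": True,
}
for gap in unmet_criteria(candidate):
    print("Not met:", gap)
```

A criterion that is missing from the answers counts as not met, which keeps open questions visible until the repository has been contacted.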

Page 56: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

EUDAT DATA SERVICES


Page 57: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

EUDAT project

EUDAT offers common data services to both research communities and individuals through a network of 35 European organisations.

https://eudat.eu/

Page 58: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

EUDAT offers data services
EUDAT services are designed, built and implemented based on user community requirements.

Labels from the slide graphic: Physical Sciences & Engineering; Social Sciences & Humanities; Materials & Analytical Facilities; Environmental Sciences; Biomedical & Medical Sciences; MAPPER

Page 59: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

EUDAT services

B2 SERVICE SUITE

B2ACCESS
B2HANDLE

https://eudat.eu/services

Page 60: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

e.g. B2DROP – a solution for researchers and scientists to:
• store and exchange data with colleagues and team members, including research data not finalized for publishing
• share data with fine-grained access controls
• synchronize multiple versions of data across different devices

Features:
• 20 GB storage per user
• Living objects, so no PIDs
• Versioning and offline use
• Desktop synchronisation
• B2DROP is hosted at the Jülich Supercomputing Centre
• Daily backups of all files in B2DROP are taken and kept on tape

b2drop.eudat.eu

Page 61: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

e.g. B2STAGE – facilitating communities to:
• move large amounts of data between data stores and high-performance compute resources
• re-ingest computational results back into EUDAT
• deposit large data sets into EUDAT resources for long-term preservation

Features:
• high-speed transfer
• reliable and light-weight
• manages permanent PIDs

eudat.eu/b2stage

Page 62: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

EUDAT training

https://eudat.eu/training
https://eudat.eu/events/webinars

Page 63: OpenAIRE and Eudat services and tools to support FAIR DMP implementation

www.openaire.eu
@openaire_eu
facebook.com/groups/openaire
linkedin.com/groups/OpenAIRE-3893548

[email protected]

Questions?