![Page 1: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/1.jpg)
www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
EUDAT: How to manage data in the
Collaborative Data Infrastructure: a general
overview of EUDAT services
Giovanni Morelli
![Page 2: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/2.jpg)
Outline
What kinds of problems we want (try) to solve
Different management systems for different communities
Quality of data sets
Classes of users
What about our solutions (B2<services>)
B2DROP, B2SHARE, B2SAFE, B2STAGE, B2HANDLE, B2ACCESS, …
B2<service> integration
Project and Service Enabling
Community / EUDAT interaction
Practical use cases
![Page 3: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/3.jpg)
Where Does EUDAT Fit In? (in a data quality view)
Community repositories
Institute repositories
Scientists' personal data
Homeless scientists
Citizen scientists
![Page 4: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/4.jpg)
Where Does EUDAT Fit In? (in a multilayer view of Data Management)
Data Generators and Users: user functionalities, data capture & transfer, virtual research environments
Community Support Services: data discovery & navigation, workflow generation, annotation, interpretability
Common Data Services: persistent storage, identification, authenticity, workflow execution, mining
Across all layers: Trust, Data Curation
![Page 5: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/5.jpg)
Who can use EUDAT services
Single researcher: upload and download
Team: upload, add metadata, share
Community: periodic transfers, quality checks …
Different strategies for different usage scenarios
![Page 6: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/6.jpg)
Community-Driven Solutions
PHYSICAL SCIENCES & ENGINEERING
MATERIALS & ANALYTICAL FACILITIES
MAPPER
BIOMEDICAL & MEDICAL SCIENCES
EUDAT services are designed, built
and implemented based on user
community requirements.
![Page 7: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/7.jpg)
EUDAT Collaborative Data Infrastructure (a general CDI architecture overview)
Community Repositories (thematic data centres)
EUDAT generic data service provider: storage, workflows, processing, archive
![Page 8: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/8.jpg)
EUDAT Collaborative Data Infrastructure (using vs. joining)
Community “use” EUDAT
![Page 9: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/9.jpg)
EUDAT Collaborative Data Infrastructure (using vs. joining)
Community “join” EUDAT
![Page 10: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/10.jpg)
If there are hundreds of Research
Infrastructures, how many different data
management systems can be sustained?
![Page 11: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/11.jpg)
B2 Service (modular) Suite
B2ACCESS
B2HANDLE
![Page 12: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/12.jpg)
B2DROP
EUDAT2020
Further integration with EUDAT CDI (e.g. B2SHARE)
Integration with B2ACCESS to enable access by many different Identity Providers
Cloud Storage Federation, collaboration with GEANT in OpenCloudMesh
Assess B2DROP as a workspace area for computing facilities
Who
Citizen scientists and small teams
What
Store and exchange data
Synchronize multiple versions
Ensure automatic desktop synchronization
Why
Ease of use
Trusted European service
![Page 13: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/13.jpg)
B2SHARE
EUDAT2020
Further integration with EUDAT CDI (e.g. B2DROP, B2SAFE)
Integration with B2ACCESS (incl. eduGAIN), focus on authorization
Embargo period
Editing of metadata
Data versioning and annotation
Extended HTTP RESTful API interface
Easily installable software package
Who
Small to medium teams
What
Store data (incl. software) and add domain metadata
Share registered research data worldwide
Preserve (small-scale) research data for the long term
Why
Register data for publications
Make data known to the wider community
![Page 14: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/14.jpg)
Collection of official RDA documents
![Page 15: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/15.jpg)
Service Integration
Bidirectional Integration
![Page 16: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/16.jpg)
B2SAFE
EUDAT2020
Support iRODS v4
Support metadata
Optimize and extend policies to support data curation and provenance
Further integration with B2ACCESS
Support authorization on the basis of community access rules
Assess B2SAFE as a workspace area for computing facilities
Who
Community data managers
‘Sophisticated’ organisations
What
Provide an abstraction layer which virtualizes large-scale data resources
Guard against data loss in long-term archiving and preservation
Optimize access for users from different regions
Bring data closer to powerful computers
Why
Performance
Replication between trusted sites
Data preservation
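Guarding against data loss across replicated sites usually comes down to comparing checksums of each replica against a reference copy. The sketch below illustrates that idea with local files only. It is not the B2SAFE or iRODS implementation; the function names (`sha256_of`, `verify_replicas`, `demo`) and the choice of SHA-256 are assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_replicas(master: Path, replicas: list[Path]) -> dict[str, bool]:
    """Map each replica path to True if it exists and matches the master checksum."""
    expected = sha256_of(master)
    return {str(r): r.is_file() and sha256_of(r) == expected for r in replicas}


def demo() -> list[bool]:
    """Hypothetical scenario: one intact replica, one corrupted, one missing."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        master = root / "master.dat"
        good = root / "replica_site_a.dat"
        bad = root / "replica_site_b.dat"
        master.write_bytes(b"research data" * 1000)
        good.write_bytes(master.read_bytes())
        bad.write_bytes(b"research data" * 999 + b"corrupted!!!!")
        report = verify_replicas(master, [good, bad, root / "missing.dat"])
        return list(report.values())
```

In a real deployment the replicas would live at different trusted sites and the check would run periodically; here everything sits in one temporary directory so the sketch stays self-contained.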
![Page 17: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/17.jpg)
Data Policy Manager
Data policies are centrally managed
Policy rules are implemented and enforced by site-local rule engines
Policies are described in an abstract language
Community data managers must authenticate to provide trust
Support policies for data replication and integrity checking
Central logging for auditable data policies to monitor execution
Active collaboration with the RDA Practical Policy WG
EUDAT2020
Handover to operations
Extend number of policies supported
Focus on data curation and provenance policies
Integrate with B2ACCESS
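The split between centrally managed policies and site-local enforcement can be sketched as a small data model. This is a toy illustration, not the Data Policy Manager's actual policy language or API: the `Policy` fields, the two action names, the site names and the `distribute` helper are all hypothetical.

```python
from dataclasses import dataclass, field

ALLOWED_ACTIONS = {"replicate", "verify_checksum"}  # assumed action vocabulary


@dataclass(frozen=True)
class Policy:
    """An abstract, site-independent policy description (hypothetical schema)."""
    policy_id: str
    action: str
    collection: str
    source_site: str
    target_site: str

    def __post_init__(self):
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unsupported action: {self.action}")


@dataclass
class SiteRuleEngine:
    """Stands in for a site-local rule engine that enforces relevant policies."""
    site: str
    log: list = field(default_factory=list)

    def enforce(self, policy: Policy) -> bool:
        # A site only acts on policies that name it as source or target.
        if self.site not in (policy.source_site, policy.target_site):
            return False
        self.log.append((policy.policy_id, self.site, policy.action, policy.collection))
        return True


def distribute(policy: Policy, engines: list) -> list:
    """Send one central policy to every site and collect a central audit log."""
    central_log = []
    for engine in engines:
        if engine.enforce(policy):
            central_log.append(engine.log[-1])
    return central_log


ENGINES = [SiteRuleEngine("site-a"), SiteRuleEngine("site-b"), SiteRuleEngine("site-c")]
POLICY = Policy("policy-001", "replicate", "/community/collection-x", "site-a", "site-b")
CENTRAL_LOG = distribute(POLICY, ENGINES)
```

Only `site-a` and `site-b` record an entry, so the central log gives an auditable trace of which sites executed which policy.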
![Page 18: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/18.jpg)
B2STAGE
EUDAT2020
Further develop HTTP to a mature interface and extend functionality to metadata
Native support for PIDs within GridFTP transfers
Extend EUDAT client API library to other B2 services (e.g. B2SHARE, B2FIND, PID)
Further integration with B2ACCESS
Who
Users and communities with significant computational needs
What
Transfer large data collections from EUDAT storage to external HPC facilities for processing
Copy large data sets, ingesting them onto EUDAT storage resources
Why
Integration/collaboration with PRACE
Simplify data transfer
![Page 19: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/19.jpg)
B2FIND
EUDAT2020
Harvesting of metadata stored in B2SAFE
Community customizations
Annotation of datasets
Further assess RDF and Linked Data
Further assess scalability and performance
Who
Anyone
What
Find collections of scientific data quickly and easily, irrespective of their origin, discipline or community
Get quick overviews of available data
Browse through collections using standardized facets
Why
Unique collection
Ease of searching
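Faceted browsing means counting how many records fall under each value of a standardized field, then narrowing the result set by the values a user selects. The following sketch shows that mechanism on a few made-up metadata records. The field names (`community`, `discipline`) and the records themselves are invented for illustration and do not come from the actual catalogue or its API.

```python
from collections import Counter

# Made-up sample records; a real catalogue would hold harvested metadata.
RECORDS = [
    {"title": "Ocean temperature profiles", "community": "community-a", "discipline": "Earth sciences"},
    {"title": "Seismic waveforms", "community": "community-b", "discipline": "Earth sciences"},
    {"title": "Spoken language corpus", "community": "community-c", "discipline": "Humanities"},
]


def facet_counts(records, facet):
    """Count records per value of one facet, skipping records that lack it."""
    return Counter(record[facet] for record in records if facet in record)


def filter_by_facets(records, **selected):
    """Keep only records whose facet values match every selected value."""
    return [
        record
        for record in records
        if all(record.get(name) == value for name, value in selected.items())
    ]


DISCIPLINE_COUNTS = facet_counts(RECORDS, "discipline")
EARTH_TITLES = [r["title"] for r in filter_by_facets(RECORDS, discipline="Earth sciences")]
```

Because every record uses the same facet names regardless of its source community, the same two functions work across the whole collection.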
![Page 20: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/20.jpg)
B2HANDLE
Development plan
Develop the policies for the B2HANDLE service (e.g. PID namespace management)
Migrate service from Handle v7 to v8
Define PID Information Types for data, metadata, collection records
Integrate with Data Type Registry service
Consolidate B2HANDLE API library with EUDAT API library
EUDAT 6M EC Review, 28th October 2015, Brussels
Who
Groups or communities who want to make their data citable
What
Follows policies to register data and make it referable and citable in the long term
Reliability through mutual PID mirroring
Provides an abstraction layer between a globally unique persistent identifier and the physical location of data objects
Machine readable via HTTP RESTful API
Why
Simple integration
Technology agnostic
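The abstraction layer between identifier and location can be illustrated with the JSON record format returned by the public Handle System proxy REST API (`https://hdl.handle.net/api/handles/<handle>`). The sketch below parses such a record offline. The handle `21.T00000/example-object`, the target URL and the checksum value are hypothetical, and whether a given EUDAT deployment exposes exactly this endpoint is an assumption.

```python
from __future__ import annotations

import json
from urllib.parse import quote

HANDLE_PROXY = "https://hdl.handle.net/api/handles/"


def resolution_url(handle: str) -> str:
    """Build the REST URL that would return the handle record as JSON."""
    return HANDLE_PROXY + quote(handle, safe="/")


def extract_locations(record: dict) -> list[str]:
    """Return the values of all entries of type URL in a handle record."""
    return [
        value["data"]["value"]
        for value in record.get("values", [])
        if value.get("type") == "URL"
    ]


# Hypothetical record shaped like a Handle proxy JSON response.
SAMPLE_RESPONSE = json.loads("""
{
  "responseCode": 1,
  "handle": "21.T00000/example-object",
  "values": [
    {"index": 1, "type": "URL",
     "data": {"format": "string", "value": "https://repository.example.org/objects/42"}},
    {"index": 2, "type": "CHECKSUM",
     "data": {"format": "string", "value": "0123abcd"}}
  ]
}
""")

LOCATIONS = extract_locations(SAMPLE_RESPONSE)
```

If the object moves, only the URL value in the record changes; the identifier that users cite stays the same.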
EUDAT M6 Review - Services and Operations
![Page 21: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/21.jpg)
B2ACCESS
EUDAT2020
Integration with operational and all B2 services: B2SHARE, B2DROP, B2STAGE, B2SAFE, B2HANDLE, DPM, CREG, TTS
Integration with community IdP domains and portal environments
Enabling access via eduGAIN, social IDs
Enabling access via ORCID, CLARIN IdPs
Focus on authorization
Collaborate on cross e-infrastructure access (e.g. PRACE, EGI)
Extend European collaboration via AARC (e.g. GEANT, TERENA)
Who
Anyone wanting to use the B2 services
What
Complies with community ownerships and access rights, basis of trust
Credential conversion approach (e.g. SAML, OpenID, X.509, username/password)
Identity provider for citizen scientists
Why
Use your own ID in a federated environment
![Page 22: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/22.jpg)
![Page 23: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/23.jpg)
Operational tools & Central Services
creg.eudat.eu: CDI Config DB (Sites, Service Comp.)
cmon.eudat.eu: Monitoring (cmon), to be replaced: A&R M.
rct.eudat.eu: RCT (Project Coord.), to be replaced by DPCP
http://eudat.eu/support-request, helpdesk.eudat.eu: Helpdesk (TTS)
Service Hosting Framework: EUDAT Wiki, JIRA, CROWD (AAI), SVN
![Page 24: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/24.jpg)
Understanding the enabling process: all the actors
Phases: Presale, Deploy, Production
Data pilot document (WP4)
Data Project Coordination Portal
Service Portfolio (WP2)
Small/Large Customization (WP5)
Service & Resource Provisioning (WP6 – T6.2)
Data Project X, Data Project Y, Data Project Z
Service X Enabling Team, Service Y Enabling Team, Service Z Enabling Team
WP6 – T6.3
TTS (one per enabling team)
Community
GOCDB Interface
Production
User Support, Monitoring
![Page 25: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/25.jpg)
Understanding the enabling: Deploy actors
Phase: Deploy
Data Project X
Service X Enabling Team
WP6 – T6.3
Project Enabler
TTS
TTS
Service Integrator
Service integration into community
![Page 26: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/26.jpg)
Understanding the enabling: project lifecycle and relationship with Project Enablers and Service Integrators
Actors: Service Integrator(s), Project Enabler(s)
Planned: data project/service enabling still under discussion
Enabling (repos): service enabling at community side (repository) only; EUDAT provider selected, but storage service not yet provided
Enabling: service enabling at community and EUDAT side
Pre-Production: service is operational, but there are still some issues, e.g. initial data transfer not complete, security or quality assessment pending, community or provider has not confirmed production readiness
Production: service deployed and integrated across all participating project partners (community repository and EUDAT nodes); community confirmed production readiness
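The five lifecycle stages and their descriptions can be read as a simple classification over a few yes/no facts about a data project. The sketch below encodes that reading. The stage names come from the slide, but the boolean flags, the `classify` rules and the strictly linear `next_stage` order are assumptions made for illustration, not an official EUDAT state machine.

```python
from enum import Enum


class Stage(Enum):
    PLANNED = "Planned"
    ENABLING_REPOS = "Enabling (repos)"
    ENABLING = "Enabling"
    PRE_PRODUCTION = "Pre-Production"
    PRODUCTION = "Production"


def classify(community_side_started, eudat_side_started, operational, production_confirmed):
    """Derive a stage from four hypothetical status flags of a data project."""
    if operational and production_confirmed:
        return Stage.PRODUCTION
    if operational:
        # Running, but readiness is not confirmed or checks are pending.
        return Stage.PRE_PRODUCTION
    if community_side_started and eudat_side_started:
        return Stage.ENABLING
    if community_side_started:
        # Only the community repository side is being enabled so far.
        return Stage.ENABLING_REPOS
    return Stage.PLANNED


def next_stage(stage):
    """Return the following stage in slide order, or None after Production."""
    stages = list(Stage)
    position = stages.index(stage)
    return stages[position + 1] if position + 1 < len(stages) else None
```

For example, a project whose service is already running but whose community has not yet confirmed readiness is classified as Pre-Production, and `next_stage` reports Production as the remaining step.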
Documentation
User Documentation
![Page 27: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/27.jpg)
23 data pilots selected for enabling in EUDAT2020
Data pilots overview
[Chart: pilots by scientific domain: Biomedical and life sciences; Earth sciences, energy and environment; Physical Sciences and Engineering; Social Sciences and Humanities; Other]
[Chart: pilots by applicant community: Research Community; Research Infrastructure]
![Page 28: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/28.jpg)
Data pilots overview
[Chart: reference sites for storage]
[Chart: requested EUDAT services: Data synchronization and exchange; Data repository and data sharing; Data replication and preservation; Data staging for analysis and processing; Data discovery and search; Data typing & visualization; New services or tools for Big Data; New services or tools for Semantic web]
Total storage request: 1220-4300 TB
![Page 29: EUDAT CDI Its Origins and Evolution · B2 services (e.g. B2SHARE, B2FIND, PID) Further integration with B2ACCESS EUDAT2020 Who Users and Communities with Significant Computational](https://reader030.vdocuments.site/reader030/viewer/2022041100/5ed73014c30795314c175c6d/html5/thumbnails/29.jpg)
Questions…