www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
Coupling HPC and Data services together for large scale use cases
DI4R Conference - Krakow
28th September 2016
Giuseppe [email protected]
Background
• The amount of data produced in any particular discipline is starting to exceed the ability to manage them individually, leading to the development of a new analytic field named High Performance Data Analytics where data and computing services need to interact closely with each other.
• A large variety of services already exists but e-Infrastructures have evolved along different dimensions creating separate offerings for computing and data.
• Scientific communities do not access low level computing and data services directly, but rather work with portals and workflows to perform complex tasks.
Need for interoperability
● Ensure the interoperability of EUDAT with other public and private e-Infrastructures, lowering technical and policy barriers by piloting concrete use cases with user communities
● Provide European researchers and industries with seamless access to data and computing resources for cross-utilization use cases
● Implement the Open Science vision where resources of any kind and size are accessible without any technical barrier.
Interoperability
• Joint Access to Data and HPC Services
• Joint Access to Data, HTC and Cloud Computing Resources
• Collaboration with Commercial Stakeholders
E-Infrastructure Collaboration
Partners: RDA, OpenAIRE, PRACE, LERU, LIBER, EGI, Helix Nebula, Data Cloud, GEANT
Collaboration topics (diagram labels): Cross-infra services & ops; Common protocols, APIs; HPC/HTC/Clouds; Policy & networking; Output adoption; Test beds; Policy & guidelines; Data management plans; Service integration; Cloud Catalogue; DECI calls; DFT, data fabric, PID, metadata, practical policy
• Four interoperability pilots fostering the coupling of data and cloud resources. Large communities involved: BBMRI, ICOS, EISCAT-3D, ELIXIR.
• Contribution to regular PRACE calls by providing medium/long-term storage capacity and services. 5 pilots granted for DECI Call (13th).
How?
• Joint Open Calls for proposals
  • EUDAT offering data services and resources through regular PRACE calls
  • Review process is transparent to users
• Joint training activities
  • PRACE project investigators involved in EUDAT Data Management webinars and courses
• Continuous technical discussion and development of new components
  • Definition of the EUDAT Workspace area
  • Synchronization of authentication credentials for single sign-on
  • EUDAT clients as part of the PRACE Common Production Environment
Areas of e-Infras harmonization
• Technical
  • Cross-utilization use cases, e.g. data transfer, workflow execution, data discoverability and provenance (PID), federated AAI, etc.
  • Combination of the respective service catalogues
• Policy
  • Harmonization of access policy fostering the uptake of services in the long term
• Operational
  • Harmonization and cross-fertilization of operational tools, technologies, practices and policies
  • Security for Collaboration among Infrastructures (https://www.eugridpma.org/sci/)
  • Security collaboration: WISE (https://wise-community.org/)
PRACE Research Infrastructure
• PRACE – the Partnership for Advanced Computing in Europe – Research Infrastructure enables high impact European scientific discovery and engineering research and development across all disciplines to enhance European competitiveness for the benefit of society.
• PRACE seeks to realize this mission through world class computing and data management resources and services open to all European public research through a peer review process.
B2 Service Suite
B2ACCESS
B2HANDLE
CDI Data Domain
EUDAT Data Domain modeled on the ANDS¹ Data Curation Continuum
1. Australian National Data Service organization – www.ands.org.au
Some Facts about the collaboration
• 5 pilots out of 10 granted resulting from the 13th PRACE DECI Call
  • 20% of all applicants requested access to EUDAT data services
  • ~350TB of storage space
• Different scientific fields
  • Engineering, Material Science, Astrophysics, Earth Science
• Relevant requirements
  • Temporary storage of large collections -> Workspace area
  • Sharing of intermediate results, simulation input -> B2DROP
  • Deposit of relevant results for publication -> B2SHARE
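The pairing of requirements and EUDAT offerings listed above can be written down as a small lookup table. This is only an illustrative sketch: the service names come from the slide, while the dictionary keys and the `suggest_service` helper are hypothetical names chosen for this example and are not part of any EUDAT tool.

```python
# Requirement -> EUDAT offering, as paired on the slide.
# The keys are shorthand labels invented for this sketch.
SERVICE_FOR_REQUIREMENT = {
    "temporary_storage": "Workspace area",
    "sharing_intermediate_results": "B2DROP",
    "deposit_for_publication": "B2SHARE",
}


def suggest_service(requirement: str) -> str:
    """Return the offering the slide pairs with the given requirement."""
    try:
        return SERVICE_FOR_REQUIREMENT[requirement]
    except KeyError:
        raise ValueError(f"no mapping listed for {requirement!r}") from None


print(suggest_service("sharing_intermediate_results"))
```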
Some numbers
| Code | Project Name | Field | Country | Data requirements during the PRACE project (TB) | Data requirements after the PRACE project (TB) | Duration | EUDAT Site | PRACE Site |
|---|---|---|---|---|---|---|---|---|
| HybTurb3D | Hybrid 3D simulations of turbulence and kinetic instabilities at ion scales in the expanding solar wind | Astro Sciences | IT | 140 | 140 | 24 m | CINECA | SURFsara |
| MULTINANO | Multiscale simulations of nanoparticle suspensions | Engineering | IT | 30 | 30 | 24 m | CINECA | MPDCF |
| CHARTERED | Charge transfer dynamics by time dependent density functional theory (CHARTERED) | Materials Science | SE | 30 | 20 | 24 m | KTH/PDC | IT4I |
| HiResClimate | High Resolution EC-Earth Simulations | Earth Sciences | IE | 150 | 150 | 12 m | EPCC | KTH |
| AFiD | Effect of rotation and surface roughness on heat transport in turbulent flow | Engineering | NL | 11 | 10 for 24 months, 1 for long-term storage | 24 m (10 TB); long-term (1 TB) | SURFsara | EPCC |
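As a quick cross-check, the per-project storage figures in the table can be summed. This is a minimal sketch: the values are transcribed from the table, the variable names are illustrative, and the AFiD "after" figure is assumed to be 11 TB (10 TB for 24 months plus 1 TB for long-term storage).

```python
# Storage requirements in TB, transcribed from the table.
# Tuple layout: (during the PRACE project, after the PRACE project).
# Assumption: AFiD's "after" figure is 10 TB (24 months) + 1 TB (long-term).
requirements_tb = {
    "HybTurb3D": (140, 140),
    "MULTINANO": (30, 30),
    "CHARTERED": (30, 20),
    "HiResClimate": (150, 150),
    "AFiD": (11, 10 + 1),
}

total_during = sum(during for during, _ in requirements_tb.values())
total_after = sum(after for _, after in requirements_tb.values())

print(f"During the PRACE projects: {total_during} TB")
print(f"After the PRACE projects:  {total_after} TB")
```

Under these assumptions the totals are 361 TB during and 351 TB after the projects, in line with the ~350TB of storage space quoted on the "Some Facts about the collaboration" slide.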
Requirements
• Communities and users (e.g. PRACE) want a deposit area for digital entities from computing simulations
• Connect to existing, community-specific access services
• Support multiple protocols: GridFTP, WebDAV, POSIX (full, light, like)
• Integrated within the CDI domain and services
[Diagram: User space, Workspace and Registered Data Domain. Labels: Data, MD, Digital Object, Module EUDAT, Post processing, User scripts, Workflows]
Challenges
• Synchronizing people, funds, and resources of projects which have their own implementation plan is challenging
• PRACE does not target RIs but rather individual researchers or research groups
• Transferring large amounts of data across the internet is difficult
  • Large archives maintained close to computational power, at least for HPC applications
For more info:
https://b2drop.eudat.eu
https://eudat.eu/services/userdoc/b2drop
https://b2share.eudat.eu
https://eudat.eu/services/userdoc/b2share
https://eudat.eu/services/userdoc/b2safe
https://eudat.eu/services/userdoc/b2stage
http://b2find.eudat.eu
https://eudat.eu/services/userdoc/b2find
http://b2access.eudat.eu