eScience on distributed infrastructure in Poland… · 5/03/2015 · remote transparent access...
1
eScience on Distributed Infrastructure in Poland
Marian Bubak
AGH University of Science and Technology
ACC Cyfronet
Krakow, Poland
dice.cyfronet.pl
PLAN-E, the Platform of National eScience/Data Research Centers in Europe, 29-30 September 2014, Amsterdam
2
ACC Cyfronet AGH
PL-Grid Consortium and Programme
Focus on users: training and support
Platforms and tools: towards PL-ecosystem
International cooperation, conferences
Summary
Outline
3
Credits
ACC Cyfronet AGH
Michał Turała
Krzysztof Zieliński
Karol Krawentek
Agnieszka Szymańska
Maciej Twardy
Angelika Zaleska-Walterbach
Andrzej Oziębło
Zofia Mosurska
Marcin Radecki
Renata Słota
Tomasz Gubała
Darin Nikolow
Aleksandra Pałuk
Patryk Lasoń
Marek Magryś
Łukasz Flis
ICM
Marek Niezgódka
Piotr Bała
Maciej Filocha
PCSS
Maciej Stroiński
Norbert Meyer
Krzysztof Kurowski
Bartek Palak
Tomasz Piontek
Dawid Szejnfeld
Paweł Wolniewicz
WCSS
Paweł Tykierko
Paweł Dziekoński
Bartłomiej Balcerek
TASK
Rafał Tylman
Mścisław Nakonieczny
Jarosław Rybicki
… and many others
domain experts ….
4
ACC Cyfronet AGH
High Performance Computing
High Performance Networking
Centre of Competence
Participation and coordination of national and
international scientific projects.
Computational power, storage and libraries
for scientific research.
Coordinator of PL-Grid Infrastructure Development.
Main node of Cracow MAN.
South Poland main node of PIONIER network.
Access to GEANT network.
40 years of expertise
TOP500 (VI.2014)
Rank | Site | System | Cores | Rmax (Tflops) | Rpeak (Tflops)
176 | Cyfronet, Poland | Cluster Platform, Infiniband, Hewlett-Packard | 25,468 | 266.9 | 373.9
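The Rmax and Rpeak figures quoted for the TOP500 entry give a rough idea of how much of the theoretical peak the system sustains on the LINPACK benchmark. A minimal sketch of that arithmetic follows; the variable names are illustrative and do not come from the slides.

```python
# Figures quoted on the slide (TOP500 list, June 2014).
cores = 25_468
rmax_tflops = 266.9   # measured LINPACK performance
rpeak_tflops = 373.9  # theoretical peak performance

# Fraction of the theoretical peak sustained on the benchmark.
efficiency = rmax_tflops / rpeak_tflops

# Average sustained performance per core, converted from Tflops to Gflops.
gflops_per_core = rmax_tflops * 1000 / cores

print(f"LINPACK efficiency: {efficiency:.1%}")
print(f"Sustained Gflops per core: {gflops_per_core:.2f}")
```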
5
Motivation and background
Experiments in silico:
advanced, distributed computing
big international collaboration
e-Science and e-Infrastructure interaction
World progress in Big Science:
Theory, Experiment, Simulation
1st: Theory
2nd: Experiment
3rd: Simulation
4th Paradigm: Data Intensive Scientific Discovery
Data intensive computing
Numerically intensive computing
Computational Science problems to be addressed:
algorithms, environments and deployment
4th paradigm, Big Data, Data Farming
Needs:
increase of resources
support for making science
6
PL-Grid Consortium
Consortium creation – January 2007
a response to requirements from Polish scientists
due to ongoing Grid activities in Europe (EGEE, EGI_DS)
Aim: significant extension of amount of computing resources provided to the scientific community (start of the PL-Grid Programme)
Development based on:
projects funded by the European Regional Development Fund as part of the Innovative Economy Program
close international collaboration (EGI, ….)
previous projects (5FP, 6FP, 7FP, EDA…)
National Network Infrastructure available: Pionier National Project
computing resources: Top500 list
Polish scientific communities: ~75% highly rated Polish publications in 5 Communities
PL-Grid Consortium members: 5 High Performance Computing Polish Centres, representing Communities, coordinated by ACC
Cyfronet AGH
7
PL-Grid and PLGrid Plus in short
PL-Grid Project (2009–2012)
Budget: total 21 M€, from EU 17 M€
Outcome: Common base infrastructure
National Grid Infrastructure (NGI_PL)
Resources: 230 Tflops, 3.6 PB
PLGrid Plus Project (2011–2014)
Budget: total ca. 18 M€, from EU: ca. 15 M€
Expected outcome:
focus on users
specific computing environments
QoS by SLM
Extension of resources and services by:
500 Tflops, 4.4 PB
Keeping diversity for users
Clusters (thin and thick nodes, GPU)
SMP, vSMP, Clouds
8
PL-Grid project
PL-Grid aimed at significantly extending the amount of computing resources provided to the Polish scientific community (by approximately 215 TFlops of computing power and 2500 TB of storage capacity) and constructing a Grid system that would facilitate effective and innovative use of the available resources.
Polish Infrastructure for Supporting
Computational Science in the
European Research Space – PL-Grid
Budget: total 21 M€, from EC 17 M€
Duration: 1.1.2009 – 31.3.2012
Managed by the PL-Grid Consortium made up of 5 Polish supercomputing and networking centres
Project coordinator: Academic Computer Centre Cyfronet AGH, Krakow, Poland
Project web site: projekt.plgrid.pl
Main Project Objectives:
Common (compatible) base infrastructure
Capacity to construct specialized, domain Grid systems for specific applications
Efficient use of available financial resources
Focus on HPC and Scalability Computing for domain specific Grids
9
PL-Grid project – results
Publication of the book presenting the scientific
and technical achievements of the Polish NGI
by Springer, in March 2012:
„Building a National Distributed e-Infrastructure – PL-Grid”
In Lecture Notes in Computer Science, Vol. 7136,
subseries: Information Systems and Applications
Content: 26 articles
describing the experience and
the scientific results obtained
by the PL-Grid project
partners as well as the
outcome of research and
development activities carried
out within the Project.
First working NGI in Europe in the framework of
EGI.eu (since March 31, 2010)
Number of users (March 2012): 900+
Number of jobs per month: 750,000 - 1,500,000
Resources available:
Computing power: ca. 230 TFlops
Storage: ca. 3600 TBytes
High level of availability and reliability of the
resources
Facilitating effective use of these resources by
providing:
innovative grid services and end-user tools like Efficient
Resource Allocation, Experimental Workbench and Grid
Middleware
Scientific Software Packages
User support: helpdesk system, broad training offer
Various well-executed dissemination activities,
carried out at national and international levels, which
helped raise awareness and knowledge of the
Project and of grid technology in Poland.
10
PLGrid Plus project
Domain-oriented services and resources of
Polish Infrastructure for Supporting
Computational Science in the European
Research Space – PLGrid Plus
Budget: total ca. 18 M€ including funding from the EC: ca.15 M€
Duration: 1.10.2011 – 31.12.2014
Five PL-Grid Consortium Partners
Project Coordinator: ACC CYFRONET AGH
The main aim of the PLGrid Plus project is to increase the potential of Polish Science by providing the necessary IT services for research teams in Poland, in line with European solutions.
Preparation of specific computing environments, so-called domain grids, i.e. solutions, services and extended infrastructure (including software) tailored to the needs of different groups of scientists.
Domain-specific solutions created for 13 groups of users, representing the strategic areas and important topics for the Polish and international science:
11
PLGrid Plus project – activities
Integration Services
National and International levels
Dedicated Portals and Environments
Unification of distributed Databases
Virtual Laboratories
Remote Visualization
Service value = utility + warranty
SLA management
Computing Intensive Solutions
Specific Computing Environments
Adoption of suitable algorithms and solutions
Workflows
Cloud computing
Porting Scientific Packages
Data Intensive Computing
Access to distributed Scientific Databases
Homogeneous access to distributed data
Data discovery, process, visualization, validation….
4th Paradigm of scientific research
Instruments in Grid
Remote Transparent Access to instruments
Sensor networks
Organizational
Organizational backbone
Professional support for specific disciplines and topics
12
PLGrid Plus project – results
New domain-specific services for 13 identified scientific domains
Extension of the resources available in the PL-Grid Infrastructure by ca. 500 TFlops of computing power and ca. 4.4 PBytes of storage capacity
Design and start-up of support for new domain grids
Deployment of a Quality of Service system for users by introducing Service Level Agreements (SLA)
Deployment of new infrastructure services
Deployment of Cloud infrastructure for users
Broad consultancy, training and dissemination offer
Publication of the book presenting the scientific and technical achievements of PLGrid Plus by Springer, in September 2014:
„eScience on Distributed Computing Infrastructure”
In Lecture Notes in Computer Science, Vol. 8500, subseries: Information Systems and Applications
Content: 36 articles describing the experience and the scientific results obtained by the PLGrid Plus project partners as well as the outcome of research and development activities carried out within the Project.
Huge effort of 147 authors, 76 reviewers and the editorial team at Cyfronet
13
PLGrid NG project
New generation domain-specific
services in the PL-Grid infrastructure
for Polish Science
Budget: total ca. 14 889 773,23 PLN, including funding from the EC: 12 651 715,38 PLN
Duration: 01.01.2014 – 31.10.2015
Five PL-Grid Consortium Partners
Project Coordinator: ACC CYFRONET AGH
The aim of the PLGrid NG project is to provide a set of dedicated,
domain-specific computing services for 14 new groups of
researchers and implementation of these services in the PL-Grid
national computing infrastructure.
14
PLGrid NG project – activities
Tasks:
Additional groups of experts involved −
identified 14 communities/scientific topics
Development and maintenance of the IT
infrastructure
In line with the best IT Service Management (ITSM)
practices, such as ITIL or ISO-20000
Security on new applications, audits
In the development stage, before deployment and during
exploitation
Optimization of resource usage − IT experts
Operation Center
Optimization of application porting
User support
First-line support, Helpdesk, domain experts, training
[Figure: layered architecture diagram. Labels: Application (×4); Domain Grid (×4); New Advanced Service Platforms; Grid infrastructure (Grid services) PL-Grid; Clusters, High Performance Computers, Data repositories; National Computer Network PIONIER]
15
PLGrid Core project
Competence Centre in the Field of
Distributed Computing Grid
Infrastructures
Budget: total 104 949 901,16 PLN, including funding from the EC: 89 207 415,99 PLN
Duration: 01.01.2014 – 31.11.2015
Project Coordinator: Academic Computer Centre CYFRONET AGH
The main objective of the project is to support the development of
ACC Cyfronet AGH as a specialized competence centre in the field
of distributed computing infrastructures, with particular emphasis
on grid technologies, cloud computing and infrastructures
supporting computations on big data.
16
PLGrid Core project – services
Basic infrastructure services
Uniform access to distributed data
PaaS Cloud for scientists
Applications maintenance environment of MapReduce type
End-user services
Technologies and environments implementing the Open Science paradigm
Computing environment for interactive processing of scientific data
Platform for development and execution of large-scale applications organized in a workflow
Automatic selection of scientific literature
Environment supporting data farming mass computations
17
Focus on users
Computer centres
Hardware/Software
User friendly
Services
Domain
Experts
Real Users
Help Desk QoS/SLM
Grants
18
User support
Interdisciplinary team of IT experts with
extensive knowledge of
different programming methods used in research:
parallel, distributed and GPGPU programming
various scientific software
the specifics of work with HPC/Cloud systems
various aspects of work with large data sets
Support methods
PL-Grid Infrastructure user support systems
(Helpdesk, User’s Forum)
documentation services, PL-Grid User’s Manual
face-to-face meetings and consultations in ACC Cyfronet AGH
and users' home institutions
International cooperation
cooperation with various institutions and
initiatives dedicated to scientists’ training:
Software Sustainability Institute (UK),
Software Carpentry, Data Carpentry,
Mozilla Science Lab, ELIXIR UK
Cyfronet is making every effort to become
a Software Carpentry regional center in
Poland or Central Europe
Users of the Cyfronet computing
resources are provided with support
and professional help in solving any
problems related to access and
effective use of these resources.
19
Training
Training on basic and advanced services
traditional − in ACC Cyfronet AGH or in the interested users’ home scientific institutions
remote − using a teleconference platform (Adobe Connect) and e-learning platforms
(Blackboard Learn – currently; Moodle – planned)
Courses are prepared based on the experts' experience gained, among others, during
previous projects
A survey assessing the training is performed after each course
20
PL-Grid Infrastructure users
PL-Grid Users
All accounts
Employees
21
Grid users of global services
22
PL-Grid users of domain-specific services
23
GridSpace: a platform for e-Science applications
Experiment: an e-science application
composed of code fragments (snippets),
expressed in either general-purpose scripting
programming languages, domain-specific
languages or purpose-specific notations.
Each snippet is evaluated by a corresponding
interpreter.
GridSpace2 Experiment Workbench: a web
application - an entry point to GridSpace2. It
facilitates exploratory development, execution
and management of e-science experiments.
Embedded Experiment: a published
experiment embedded in a web site.
GridSpace2 Core: a Java library providing
an API for development, storage,
management and execution of experiments.
Records all available interpreters and their
installations on the underlying computational
resources.
Computational Resources: servers,
clusters, grids, clouds and e-infrastructures
where the experiments are computed.
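The idea of an experiment as a sequence of snippets, each handed to its own interpreter, can be sketched in a few lines. This is only an illustration of the concept: the `Snippet` class, the `INTERPRETERS` registry and both toy interpreters are hypothetical and are not part of the GridSpace2 Core API, which is a Java library.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Snippet:
    interpreter: str  # name of the interpreter that evaluates this snippet
    code: str         # the code fragment itself


def run_python(code: str) -> str:
    """Evaluate a Python snippet and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue()


def run_sum(code: str) -> str:
    """Toy purpose-specific notation: whitespace-separated numbers to add up."""
    return str(sum(float(token) for token in code.split()))


# Hypothetical registry of available interpreters (an assumption of this sketch).
INTERPRETERS: Dict[str, Callable[[str], str]] = {
    "python": run_python,
    "sum": run_sum,
}


def run_experiment(snippets: List[Snippet]) -> List[str]:
    """Evaluate each snippet with its corresponding interpreter, in order."""
    outputs = []
    for snippet in snippets:
        evaluate = INTERPRETERS[snippet.interpreter]
        outputs.append(evaluate(snippet.code))
    return outputs


experiment = [
    Snippet("python", "print(2 ** 10)"),
    Snippet("sum", "1.5 2.5 4"),
]
results = run_experiment(experiment)
print(results)
```

Keeping the interpreter lookup in a registry mirrors the description of the core recording which interpreters are available; a real platform would dispatch to remote installations rather than to local functions.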
24
Collage: executable e-Science publications
Goal:
Extending the traditional scientific publishing model with computational access and interactivity mechanisms; enabling readers (including reviewers) to replicate and verify experimentation results and browse large-scale result spaces.
Challenges:
Scientific: A common description schema for primary data (experimental data, algorithms, software, workflows, scripts) as part of publications; deployment mechanisms for on-demand reenactment of experiments in e-Science.
Technological: An integrated architecture for storing, annotating, publishing, referencing and reusing primary data sources.
Organizational: Provisioning of executable paper services to a large community of users representing various branches of computational science; fostering further uptake through involvement of major players in the field of scientific publishing.
25
DataNet: collaborative metadata management
Objectives
Provide means for ad-hoc metadata model
creation and deployment of corresponding
storage facilities
Create a research space for metadata
model exchange and discovery with
associated data repositories with access
restrictions in place
Support different types of storage sites and
data transfer protocols
Support the exploratory paradigm by making
the models evolve together with data
Architecture
Web Interface is used by users to create,
extend and discover metadata models
Model repositories are deployed in the PaaS
Cloud layer for scalable and reliable access
from computing nodes through REST
interfaces
Data items from Storage Sites are linked
from the model repositories
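The notion of an ad-hoc metadata model that evolves together with the data, with entries linking to data items kept elsewhere, might look roughly like the in-memory sketch below. The `MetadataModel` class, its methods and the example URLs are assumptions made for illustration; they do not reflect DataNet's actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class MetadataModel:
    name: str
    fields: Dict[str, type] = field(default_factory=dict)
    entries: List[Dict[str, Any]] = field(default_factory=list)

    def add_field(self, field_name: str, field_type: type) -> None:
        """Extend the model after creation; earlier entries are left untouched."""
        self.fields[field_name] = field_type

    def add_entry(self, data_location: str, **metadata: Any) -> Dict[str, Any]:
        """Store metadata together with a link to the data item on a storage site."""
        for key, value in metadata.items():
            if key not in self.fields:
                raise KeyError(f"unknown field: {key}")
            if not isinstance(value, self.fields[key]):
                raise TypeError(f"{key} must be {self.fields[key].__name__}")
        entry = {"data_location": data_location, **metadata}
        self.entries.append(entry)
        return entry


# Create a model ad hoc and register a first data item (placeholder URL).
model = MetadataModel("simulation_run", {"temperature_k": float})
model.add_entry("https://storage.example.org/run1.dat", temperature_k=300.0)

# The model evolves together with the data: a new field is added later.
model.add_field("solver", str)
model.add_entry(
    "https://storage.example.org/run2.dat", temperature_k=310.0, solver="fem"
)
print(model.entries)
```

In a deployed system the entries would live in a model repository reached over REST rather than in a Python list.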
26
Cloud Platform: resource allocation management
[Figure: Atmosphere architecture diagram. Labels: Admin, Developer, Scientist; VPH-Share Master Int.; Development Mode; VPH-Share Core Services Host; Cloud Facade (secure RESTful API); Atmosphere Management Service (AMS); Atmosphere Internal Registry (AIR); Cloud Manager; Cloud stack plugins (Fog); Generic Invoker; Workflow management; External application; Cloud Facade client; OpenStack/Nova Computational Cloud Site with Head Node, Worker Nodes and Image store (Glance); Other CS; Amazon EC2]
Customized applications may interface directly
with Atmosphere via its RESTful
API, called the Cloud Facade.
The Atmosphere Cloud Platform is a one-stop management
service for hybrid cloud resources, ensuring optimal deployment
of application services on the underlying hardware.
27
InSilicoLab science gateway framework
Goals
Complex computations done in a non-complex
way
Separating users from the concept of jobs
and the infrastructure
Modelling the computation scenarios in an
intuitive way
Different granularity of the computations
Interactive nature of applications
Dependencies between applications
Summary
The framework proved to be an easy way to
integrate new domain-specific scenarios
Even if done by external teams
Natively supports multiple types of
computational resources
Including private resources – e.g. private clouds
Supports various types of computations
Architecture of the InSilicoLab framework: Domain Layer,
Mediation Layer with its Core Services, and Resource
Layer. In the Resource Layer, Workers ('W') of different
kinds (marked with different colors) are shown.
28
Scalarm
Scalarm overview
Self-scalable platform adapting to experiment size and simulation type
Exploratory approach for conducting experiments
Supporting online analysis of experiment partial results
Integrates with clusters, Grids, Clouds
What problems are addressed with Scalarm?
Data farming experiments with an exploratory approach
Parameter space generation with support of design of experiment methods
Accessing heterogeneous computational infrastructure
Self-scalability of the management part
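Parameter space generation for a data farming experiment can be illustrated with the simplest design-of-experiments method, a full factorial design, in which every combination of parameter values becomes one simulation run. The sketch below is not Scalarm code; the parameter names and the `run_simulation` stand-in are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence


def full_factorial(parameters: Dict[str, Sequence]) -> List[Dict[str, object]]:
    """Return one dict per combination of the given parameter values."""
    names = list(parameters)
    return [dict(zip(names, values)) for values in product(*parameters.values())]


def run_simulation(point: Dict[str, object]) -> float:
    """Stand-in for a real simulation: a made-up function of the inputs."""
    return point["velocity"] * point["temperature"]


parameter_space = full_factorial(
    {"velocity": [0.5, 1.0, 2.0], "temperature": [300, 350]}
)

# Each point is independent, so a platform could distribute the runs across workers.
results = [(point, run_simulation(point)) for point in parameter_space]

for point, value in results:
    print(point, value)
```

A full factorial grows multiplicatively with the number of parameters, which is why other design-of-experiment methods that sample the space more sparsely are often preferred for large studies.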
29
VeilFS
Functionalities provided by VeilFS
A system operating in the user
space (i.e. FUSE), which virtualizes
organizationally distributed,
heterogeneous storage systems to
obtain uniform and efficient access
to data.
End users access the data stored
within VeilFS through one of the
provided user interfaces:
FUSE client, which implements a
file system in user space to hide
the data location and exposes a
standard POSIX file system
interface,
Web-based GUI, which allows data
management via any Internet
browser,
REST API.
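Because the FUSE client exposes a standard POSIX file system interface, an application needs no special library to work with the data: ordinary file operations on the mounted directory suffice. The sketch below assumes a mount point and uses a temporary directory as a stand-in for it, so it demonstrates only the access pattern, not VeilFS itself.

```python
import os
import tempfile

# Stand-in for the directory where the FUSE client would be mounted (assumption).
mount_point = tempfile.mkdtemp(prefix="veilfs-demo-")

# Writing uses plain file I/O; where the bytes physically land is hidden from the caller.
path = os.path.join(mount_point, "results.csv")
with open(path, "w", encoding="utf-8") as handle:
    handle.write("step,energy\n0,1.25\n1,1.10\n")

# Reading and listing also go through the usual POSIX-style calls.
with open(path, encoding="utf-8") as handle:
    rows = handle.read().splitlines()

listing = os.listdir(mount_point)
print(listing, rows)
```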
30
Chemistry
InSilicoLab for chemistry
The service aims to support the launch of complex
computational quantum chemistry experiments in
the PL-Grid Infrastructure.
Experiments of this service facilitate planning
sequential computation schemes that require the
preparation of series of data files, based on a
common schema.
31
Metallurgy
Simulations of extrusion process in 3D
Main Objective: Optimization of the metallurgical process of profiles extrusion.
Optimization includes:
shape of foramera,
channel position on a die,
calibration stripes,
extrusion velocity, ingot temperatures, tools.
The proposed grid-based software simulates extrusion of thin profiles and rods of special alloys of magnesium, containing calcium supplements.
These alloys are characterized by extremely low technological plasticity during metal forming. A FEM mathematical model was developed.
32
Life Science
Integromics – a system for researchers from biomedicine
and biotechnology
The system was developed to allow:
data collection from experiments, laboratory diagnostics, diagnostic imaging, instrumental
analysis and from medical interview,
integration, management, processing and analysis of the collected data using specialized
software and selected data mining techniques,
hypotheses generation,
data sharing and presentation of the results.
Example:
A diagram of an artificial neural network used
to classify patients based on the expression of
selected genes. The method makes it possible to
raise new hypotheses about the influence of
individual genes on changes in the organisms.
33
SynchroGrid
Elegant − the service for those involved in the design and operation of Synchrotron
The developed service consists of:
provision of the elegant (ELEctron Generation ANd Tracking) application in the parallel version on a cluster,
configuring the Matlab software to read output files produced by this application in the Self Describing Data Sets (SDDS) format and to generate the final results in the form of drawings.
Objectives:
Preparation of tools needed for Synchrotron deployment and operation, aimed at operations and research of the beam line.
Addressing the estimated users’ needs in this scientific area, focusing on data access and management – especially the metadata for the experimental data gathered during the beam time.
34
International cooperation – EU funded projects
ACC Cyfronet AGH is involved in numerous
projects co-financed by the EU funds and the
Polish government.
Research conducted in Cyfronet focuses on:
grid and cloud environments,
programming paradigms,
research portals,
efficient use of computing and storage
resources,
reconfigurable FPGA and GPGPU
computing systems.
36
Organization of conferences
For many years Cyfronet has been organizing national and international conferences,
workshops and seminars, which bring together computer scientists and researchers
involved in the creation, development and application of information technologies, as
well as the users of these technologies.
The Centre has also initiated a series of conferences:
CGW Workshop, held yearly since 2001
ACC Cyfronet AGH Users' Conference, held yearly since 2008
as well as International Conference on Computational Science (ICCS),
organized twice: in 2004 and 2008
http://www.cyfronet.krakow.pl/cgw14/
37
Organization of conferences
CGW Workshop
Proceedings
38
Summary: what we offer
We develop and deploy research e-infrastructure in three dimensions:
Network & Future Internet
HPC/GRID/CLOUDs
Data & Knowledge layer
Deployments have a national scope, but with close European links
Developments oriented towards end-users & projects
Achieving synergy between research projects and e-infrastructures by close
cooperation and offering relevant services
Durability at least 5 years after finishing the projects - confirmed in contracts
Future plans: continuation of current policy with support from EU
Structural Funds
Center of Excellence in Life Science
CGW as a place to exchange experience and for collaboration between
eScience centers in Europe
39
More information
• www.cyfronet.krakow.pl/en
• www.plgrid.pl/en
• www.cyfronet.krakow.pl/cgw14
• www.cyfronet.krakow.pl/kdm14
• dice.cyfronet.pl