Workflow
e-Science
Virtual Laboratory for e-Science (VL-e)
New e-Science plans (FES 2009)
Generic e-Science research in FES 2009
Introduction
● System level science
  ● The integration of diverse sources of knowledge about the constituent parts of a complex system with the goal of obtaining an understanding of the system's properties as a whole [Ian Foster]
● Multidisciplinary research
  ● Each discipline can solve only part of a problem
  ● Collaborations between distributed research groups
● Research driven by (distributed) data
  ● Data explosion, both volume and complexity
Examples
● Functioning of the cell for systems biology
● Cognition
● Cancer research
● Cohort studies in medicine (biobanking)
● Discovery of biomarkers for drug design
● Ecosystems/biodiversity
● Studies of water/air pollution
● Study of dark matter
e-Science
● Goal: enable scientists to collaborate on experiments and integrate research
  ● Enable system level science
● Design methods to optimally exploit underlying infrastructure
  ● Hardware (network, computing, data storage)
  ● Software (web, grid middleware)
e-Science in context
System level experiments
e-Science
Infrastructure
Web/grid software
Virtual Laboratory for e-Science (VL-e)
● Generic application support
● Application cases are drivers for computer science
● Rationalization of experimental process
● Midterm review:
  ● “e-Science that works; gives Grid its correct role”
Some VL-e results
● Virtual Resource Browser:
  ● Integration platform for different applications
● Mirage & Virtual Lab for Medical Imaging:
  ● Used autonomously at AMC for large-scale experiments
● EcoGRID:
  ● Species observation records from many organizations
● Virtual Lab for Bird Migration Modeling:
  ● Access military/meteorological radars & weather forecasts
● Ibis:
  ● Programming/deployment of large-scale grid applications
[Figure: Virtual Lab for Bird Migration Modeling. Labels: RADAR; MODELS; dynamic bird behaviour; bird behaviour in relation to weather and landscape; calibration and data assimilation; bird distributions, ensembles; predictions and on-line warnings]
Example of generic approach
[Figure labels: Multimedia; Computing; Astronomy; Semantic Web; SCALE 2008; DACH 2008 - BS; DACH 2008 - FT; AAAI-VC 2007; ISWC 2008]
● Runs simultaneously on clusters (DAS-3, Japan, Australia), Desktop Grid, Amazon EC2 Cloud
● Connectivity problems solved automatically by Ibis SmartSockets
Multimedia Content Analysis
[Figure labels: Client; Broker; Servers; Ibis (Java)]
eyeDentify
● Object recognition on an Android smartphone
● Smartphone is a limited device:
  ● Can run only 64 x 48 pixels (memory bound)
  ● 1024 x 768 pixels would take 5 minutes
● Distributed Ibis version:
  ● 1024 x 768 pixels: 2.0 seconds
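As a rough check of the eyeDentify figures, the sketch below works out how many more pixels a 1024 x 768 frame has than a 64 x 48 frame, and the speed-up of the distributed Ibis version over the stand-alone phone. Only the two resolutions and the two timings come from the slide; the variable names are illustrative.

```python
# Figures taken from the eyeDentify slide.
small_frame = 64 * 48          # pixels the phone can handle on its own
large_frame = 1024 * 768       # pixels handled by the distributed version
phone_seconds = 5 * 60         # projected stand-alone time for the large frame
distributed_seconds = 2.0      # time reported for the distributed Ibis version

pixel_ratio = large_frame // small_frame
speedup = phone_seconds / distributed_seconds

print(f"{pixel_ratio}x more pixels, {speedup:.0f}x faster")
```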
Outline
● e-Science
● Virtual Laboratory for e-Science (VL-e)
● New e-Science plans (FES 2009)
  ● Part of ICTregie FES proposal COMICT
  ● Connecting, Mastering complexity, and Innovating by Cooperation
● Generic e-Science research in FES 2009
Research questions
● How can we design, develop and build an adaptive e-Science environment that in a flexible way enables global collaboration in key areas of science?
● How can we establish an e-Science environment that is capable of handling the data explosion?
● How can we manage complexity via integration, at the application level and the generic e-Science level?
e-Science application projects
● e-Food & Flowers: WUR/VU (Top), TIFN, TIGG
● e-BioScience & Life Sciences: UvA (Breit), RIVM
● e-Biobanks: LUMC (Kok), AMOLF, AMC, Philips, Schering-Plough
● e-COAST & analytical science: TI-COAST (vd Brink)
● e-Ecology: UvA (Bouten), GAN
● e-Data-intensive sciences: Nikhef (Templon), RUG
Two “real-life” environments
● e-Food
  ● With TI Food and Nutrition (TIFN), TI Green Genetics (TIGG), NBIC
● e-Biobanking
  ● With Pearl String Initiative (Parelsnoer), Biobanking and Bio-molecular Resources Research Infrastructure (BBMRI-NL) and NBIC
  ● Under umbrella Netherlands Genomics Initiative (NGI)
  ● FES2009 application Life Sciences & Health
Generic e-Science
Panel VL-e workshop (29 Oct 2008)
Generic e-Science projects
I. Scientific data management
II. Information- and knowledge-management
III. Visualization
IV. Computing and resource management
V. e-Science infrastructure engineering
VI. Workflow management & application integration
VII. Reliability and security
Scientific data management
CWI/UvA (Kersten), RUG (Valentijn)
● Support large array-data (using MonetDB)
● Multi-scale query execution
  ● With increasing precision, more and more data in the warehouse is used to answer queries
● Astrosensor warehouse
● Data lineage (back tracing to origin of data)
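The idea of multi-scale query execution can be illustrated with a minimal sketch: an aggregate is first answered from a small sample and then refined as a larger share of the data is read. This is plain Python and not MonetDB code; the function name `multi_scale_mean`, the sample fractions, and the synthetic `readings` are assumptions made for illustration.

```python
import random
import statistics


def multi_scale_mean(values, fractions=(0.01, 0.1, 1.0), seed=0):
    """Yield (fraction, estimate) pairs, each computed from a larger sample."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)          # random order, so every prefix is a random sample
    for fraction in fractions:
        k = max(1, int(len(shuffled) * fraction))
        yield fraction, statistics.fmean(shuffled[:k])


# Synthetic stand-in for a large warehouse column (hypothetical data).
readings = [float(i % 100) for i in range(10_000)]
estimates = list(multi_scale_mean(readings))

for fraction, estimate in estimates:
    print(f"{fraction:>5.0%} of the data -> mean is about {estimate:.2f}")
```

The last step reads all the data, so its answer equals the exact mean; earlier steps trade precision for less data touched.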
Information- and knowledge-management
VU (van Harmelen, Schreiber), UvA (Adriaanse, Marshall)
● Robust & large scale techniques for accessing & reasoning over distributed data-sources
● Tools for integration of data with scientific publications; data provenance, lineage, trust
● Tools for data-sharing: entity naming, semantic enrichment, interlinking across semantically heterogeneous vocabularies
Large Scale Data Visualization
CWI (van Liere), UvA (Belleman), SARA (Berg)
● Knowledge Assisted Feature Visualization
  ● How to provide semantically meaningful interactive visualizations for very large and complex data?
● User Driven Exploratory Visual Analysis
  ● If automatic analysis fails
● Applications: mass spectrometry, CFD, HEP, CosmoGrid, IJkdijk, …
Computing & resource management
VU (Bal, Seinstra), TU Delft (Epema)
● Map e-Science applications onto hybrid systems, optimize performance & energy
  ● DAS-4: Multicores, GPUs, FPGAs, MPSoCs (Cell/BE)
● Scheduling algorithms supporting co-allocation of compute, data, and network resources
  ● Builds on Ibis & KOALA software
  ● Many applications (VUMC, AMOLF, MultimediaN, astronomy)
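Co-allocation means reserving several kinds of resources for the same time window. The sketch below is a toy illustration of that idea only, not the KOALA scheduler: given hypothetical sets of free time slots per resource, it finds the earliest window in which all of them are free at once. The function name and the slot data are assumptions.

```python
def earliest_co_allocation(free_slots, duration):
    """Return the first start slot at which every resource is free for
    `duration` consecutive slots, or None if no such window exists."""
    common = set.intersection(*free_slots.values())
    for start in sorted(common):
        if all(start + offset in common for offset in range(duration)):
            return start
    return None


# Hypothetical free time slots (for example, hours) per resource type.
free = {
    "compute": {0, 1, 2, 3, 4, 5},
    "data": {2, 3, 4, 5, 6},
    "network": {1, 3, 4, 5},
}

print(earliest_co_allocation(free, duration=2))
print(earliest_co_allocation(free, duration=4))
```

With this data only slots 3, 4 and 5 are free on all three resources, so a two-slot job can start at slot 3 and a four-slot job cannot be placed.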
e-Science infrastructure engineering
UvA (de Laat), VU (Bal), TUD (Langendoen), TNO (Meijer)
● Resource information system based on Semantic Web and RDF (Resource Description Framework)
● Highly mobile data sensors
● User Programmable Networks
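A resource information system based on RDF describes resources as subject-predicate-object triples that can be queried by pattern. The following is a minimal sketch in plain Python with no RDF library; the `ex:` identifiers, the predicates, and the `match` helper are hypothetical and do not come from the project.

```python
# Hypothetical resource descriptions as (subject, predicate, object) triples.
TRIPLES = {
    ("ex:node1", "rdf:type", "ex:ComputeNode"),
    ("ex:node1", "ex:hasCores", "8"),
    ("ex:store1", "rdf:type", "ex:StorageElement"),
    ("ex:node1", "ex:connectedTo", "ex:store1"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# "Which resources are compute nodes?"
compute_nodes = [s for s, _, _ in match(TRIPLES, predicate="rdf:type", obj="ex:ComputeNode")]
print(compute_nodes)
```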
“I want” approach
[Figure labels: content; RDF/CG; RDF/ST; RDF/NDL; RDF/VIZ; RDF/CPU]
Application: find video containing x, then transcode it to view on a Tiled Display
PG&CdL
Workflow management & application integration
UvA (Bubak, Belloum), VU (Kielmann)
● Improve interoperability, sustainability and platform convergence in Scientific Workflow
  ● Define a shared “standard” for workflow metadata
  ● Workflow provenance models
● Make middleware-independent APIs for applications & programming environments
  ● Cf. JavaGAT, SAGA, XtreemOS
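One simple way to picture a workflow provenance model: every output records the step that produced it and the inputs it was derived from, so any result can be traced back to its raw sources. The sketch below is an illustration under that assumption and is not a proposed standard; the file names, step names, and `trace_origins` are hypothetical.

```python
# Hypothetical provenance records: output -> (producing step, inputs consumed).
PROVENANCE = {
    "report.pdf": ("render", ["stats.csv"]),
    "stats.csv": ("aggregate", ["clean_a.csv", "clean_b.csv"]),
    "clean_a.csv": ("filter", ["raw_a.csv"]),
    "clean_b.csv": ("filter", ["raw_b.csv"]),
}


def trace_origins(provenance, item):
    """Walk the provenance records backwards and return the items that
    were not produced by any recorded step (the original sources)."""
    origins, seen, stack = set(), set(), [item]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        record = provenance.get(current)
        if record is None:
            origins.add(current)       # nothing produced it: an original input
        else:
            _step, inputs = record
            stack.extend(inputs)
    return origins


origins = trace_origins(PROVENANCE, "report.pdf")
print(sorted(origins))
```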
Reliability and security
NIKHEF (Groep), UvA (van ’t Noordende), VU (Fokkink), TU Twente (van de Pol), Logica (Mulder), SARA (Bouwhuis)
● Improve availability, stability, and reliability of the infrastructure
  ● Monitoring, failure analysis
  ● Self-healing
● Use formal verification techniques
  ● Large-scale Model Checking becomes feasible on grids (as shown on wide-area DAS-3)
● Provide security
  ● Security policy enforcement, auditing
Summary
● Real-life application environments for e-Food and e-Biobanking
● New partners (TI-COAST, RUG, UT, ….) and new groups (Kersten, van Harmelen, Langendoen, Fokkink, vd Pol, ….)
● DAS-4
● Many new research topics
Questions?