Big data and HPC on-demand: Large-scale genome analysis on Helix Nebula the Science Cloud Rupert Lueck Head of IT Services, EMBL ISC Cloud 12 Mannheim 24 September 2012



Big data and HPC on-demand:

Large-scale genome analysis on

Helix Nebula – the Science Cloud

Rupert Lueck

Head of IT Services, EMBL

ISC Cloud ’12 Mannheim

24 September 2012

EMBL: European Molecular Biology Laboratory

• Intergovernmental Research Organization

• Supported by 20 Member States (+1 associated: )

• One of the world's foremost life science institutions

• EIROforum member

• 1500 staff, >70 nationalities

The Five Branches of EMBL

• Heidelberg: Main Lab / Headquarters, Basic Molecular Biology Research

• Hamburg: Structural Biology (DESY)

• Grenoble: Structural Biology (ILL, ESRF, IBS, UVHCI)

• Monterotondo: Mouse Biology (CNR, EMMA)

• Hinxton: European Bioinformatics Institute (EBI) (Sanger Centre)

EMBL’s Missions

• Basic Research

• Services

• Advanced Training

• Instrument and Technology Development

• Technology Transfer

• European Integration

Systems Biology: From Molecules to Organisms

[Figure labels: Protein/DNA, Genome, Cell, Embryo, Organisms; Development, Complexity, Aging, Disease]

DNA and Life on Earth

The Sequence Holds the Code for the Organism

NEXT GENERATION SEQUENCING (NGS)

Exemplary Big Data Challenge


Next Generation Sequencing (NGS) Revolution

NGS Impact on Human Genome Sequencing

2000: Human genome project

• 10 years

• Large International Consortium

• Thousands of Sequencers

• $3,000,000,000

2010: Sequencing today

• < $10,000

• A few hours

• One machine

Cost of Sequencing Decreasing Rapidly

Genomic Sequencing is Now an Affordable Solution

• Agricultural Research

• Pharmaceutical Companies

• Medical Research

• Academic Research Groups

Genomic sequencing is now an affordable solution, but ...

Read the Sequence to Study the Organism

• Lab: Extract DNA, Prepare, Sequence

• In Silico: Assemble, Annotate ("Gene here")

Requires Computing Infrastructure & Expertise

Problem – 1: Assembly

• NGS output: millions of very short sequence reads

• Genomes contain long strings of bases

• The short reads have to be assembled into genomes

• Up to 1 TB RAM and many weeks of computation required to solve the puzzle

... GTATTCC 105 ATGCATT...

...TGCGGATC 200,000,000 ATGCATT...

Assembly
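The assembly puzzle described on this slide can be illustrated with a toy greedy overlap merge. This is only a sketch under simplifying assumptions (error-free reads, a single strand, no repeats). It is not the method of the production assemblers named later in the talk (SGA, Velvet), which build string or de Bruijn graphs. The names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example reads are illustrative and do not come from the slides.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


reads = ["TACCTGA", "ATGGCGT", "GCGTACC"]  # hypothetical short reads
print(greedy_assemble(reads))  # ['ATGGCGTACCTGA']
```

The all-pairs comparison is quadratic in the number of reads, which hints at why the slide quotes up to 1 TB of RAM and weeks of computation once the input grows to hundreds of millions of reads.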

Problem – 2: Annotation

• Strings of assembled bases need to be annotated

• 3 billion bases, ~25k genes

• Looking for genes and regulatory elements

• Requires multiple pipelines and databases

Gene here

Annotate
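To make "looking for genes" concrete, here is a deliberately simple sketch that scans the forward strand for open reading frames (an `ATG` start codon followed by an in-frame stop codon). Real annotation, as the slide notes, needs multiple pipelines and databases; this shows only the most basic signal. The function name `find_orfs`, the `min_codons` cut-off, and the example sequence are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) pairs of forward-strand open reading frames.

    An ORF runs from an ATG to the first in-frame stop codon.
    `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None
    return sorted(orfs)


sequence = "CCATGAAATTTTAGGG"  # hypothetical assembled fragment
print(find_orfs(sequence))  # [(2, 14)]
```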

Problem - Technology Explosion with NGS

[Chart: Bases Sequenced / Sample / Run @ EMBL (Illumina); x-axis: quarterly from Feb 08 to Aug 11; y-axis: 0 to 35,000,000,000 bases]

Sequence Production & IT Infrastructure at EMBL

Compute Power: 2000+ CPU Cores, 6+ TB RAM

Storage: 1+ PB High Performance Disk

4 x Illumina HiSeq2000

25 TB data each week

2 x Illumina GAIIx
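A back-of-the-envelope calculation shows what the figures on this slide imply. It assumes decimal units (1 PB = 1000 TB), reads "1+ PB" as exactly 1 PB, and assumes no data is ever deleted or archived. None of these assumptions is stated in the talk.

```python
TB_PER_WEEK = 25   # from the slide
DISK_TB = 1000     # "1+ PB" taken as 1 PB = 1000 TB (assumption)

tb_per_day = TB_PER_WEEK / 7
mb_per_second = TB_PER_WEEK * 1e6 / (7 * 24 * 3600)
weeks_to_fill = DISK_TB / TB_PER_WEEK

print(f"{tb_per_day:.1f} TB/day, {mb_per_second:.0f} MB/s sustained, "
      f"disk full after {weeks_to_fill:.0f} weeks")
# 3.6 TB/day, 41 MB/s sustained, disk full after 40 weeks
```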

NGS - The Big Picture

• ~ 8.7 million species in the world (estimate)

• ~ 7 billion people

• Sequencers exist in both large centres & small research groups

• 200+ Illumina HiSeq sequencers in Europe alone

• capacity to sequence 1600 human genomes / month

• Largest centre: Beijing Genomics Institute (BGI)

• 167 sequencers, 130 HiSeq

• 2,000 human genomes / day

• 500-1000 HiSeq devices worldwide today

• 3-6 PB / day

• 1.1 – 2.2 Exabytes / year
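The yearly range quoted on this slide follows directly from the daily one. The short check below assumes decimal units (1 EB = 1000 PB) and a 365-day year; the per-device figure is derived from the slide's numbers and is not stated in the talk.

```python
PB_PER_EB = 1000  # decimal units assumed


def yearly_exabytes(pb_per_day: float, days: int = 365) -> float:
    """Convert a daily volume in PB to a yearly volume in EB."""
    return pb_per_day * days / PB_PER_EB


low, high = yearly_exabytes(3), yearly_exabytes(6)
print(f"{low:.1f} - {high:.1f} EB/year")  # 1.1 - 2.2 EB/year

# Implied output per device: 3 PB/day over 500 devices, 6 PB/day over 1000
tb_per_device_day = 3 * 1000 / 500
print(tb_per_device_day)  # 6.0
```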

EMBL Flagship project: Whole-Genome Assembly

[Diagram labels: NGS Labs (data acquisition); Cloud Service (Cloud Storage, on-demand processing); Scientists (access); integration with other cloud services / archiving]

PROOF OF CONCEPT IMPLEMENTATIONS

EMBL Flagship Pilot Project

Proof of Concept Setup

• Multiple Cloud providers

– ATOS / SixSq

– CloudSigma

– T-Systems

• Each tested 3 major steps with increasing complexity

• Major software components to test

– Assembly pipeline

– Annotation pipeline

– Shared file system

– StarCluster

EMBL Dynamic Architecture

• Storage: GlusterFS, x TBs shared across all nodes, 7+ Gbit/s data throughput

• Compute: Sun Grid Engine HPC Cluster

• Control: StarCluster master (Deploy)

• Customer data: x 100 GBs

StarCluster & Sun Grid Engine

Dynamic cluster provisioning

• StarCluster – Dealing with the Fluctuating Workload

– Manages provisioning of images and setting up of cluster

– Requires sets of EC2 APIs to work

– Monitors the number of jobs in the queue and launches more instances

– Terminates them when no longer required

• Sun Grid Engine

– Single image running in two modes – master/worker

– Post-launch configuration
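The queue-driven behaviour described on this slide (launch instances when jobs pile up, terminate them when idle) can be sketched as a small decision function. This is not StarCluster's code or API. The names `target_nodes` and `scaling_action` and the `jobs_per_node` ratio are hypothetical, and the 50-node ceiling is borrowed from the cluster size reported in the PoC results.

```python
import math


def target_nodes(queued_jobs: int, jobs_per_node: int = 8,
                 min_nodes: int = 1, max_nodes: int = 50) -> int:
    """Nodes needed to drain the queue, clamped to [min_nodes, max_nodes]."""
    needed = math.ceil(queued_jobs / jobs_per_node)
    return max(min_nodes, min(max_nodes, needed))


def scaling_action(queued_jobs: int, running_nodes: int,
                   **limits) -> tuple[str, int]:
    """Return ("launch" | "terminate" | "hold", number_of_nodes)."""
    delta = target_nodes(queued_jobs, **limits) - running_nodes
    if delta > 0:
        return ("launch", delta)
    if delta < 0:
        return ("terminate", -delta)
    return ("hold", 0)


print(scaling_action(400, 10))  # ('launch', 40)
print(scaling_action(0, 50))    # ('terminate', 49)
print(scaling_action(80, 10))   # ('hold', 0)
```

A production load balancer would also have to weigh factors such as job wait times and instance start-up delay, which this sketch ignores.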

PoC Results

Successful tests of all vendors deployed so far

• StarCluster API integration

• auto-provision 50-node cluster setups

• real world large genome sequencing data

• 100,000s of jobs

• mix of quick parallel jobs and long running serial jobs

• GlusterFS stability under high I/O levels

• Initial hurdles (e.g. image deployment, StarCluster integration, network setup) solved

SGE cluster throughput

[Chart: jobs (y-axis, 0 to 400) over time (x-axis, 5-Apr 22:08 to 6-Apr 00:10, April 2012)]

20,000 annotation jobs / h on 50 nodes

GlusterFS throughput

[Chart: inbound block operations (y-axis, 0 to 5,000,000) over time (x-axis, 5-Apr 22:08 to 6-Apr 00:10, April 2012)]

60,000 inbound block I/Os / sec from annotation jobs on 50 nodes
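Dividing the two headline throughput figures (20,000 annotation jobs per hour and 60,000 inbound block I/Os per second, both on 50 nodes) gives a feel for the per-node and per-job load. The derived numbers below are estimates. They assume both figures describe the same steady state and that the load is spread evenly across nodes, which the slides do not state.

```python
JOBS_PER_HOUR = 20_000       # SGE throughput slide
BLOCK_IOS_PER_SEC = 60_000   # GlusterFS throughput slide
NODES = 50

jobs_per_node_hour = JOBS_PER_HOUR / NODES
ios_per_node_sec = BLOCK_IOS_PER_SEC / NODES
ios_per_job = BLOCK_IOS_PER_SEC * 3600 / JOBS_PER_HOUR
hours_for_100k_jobs = 100_000 / JOBS_PER_HOUR

print(jobs_per_node_hour)   # 400.0 jobs per node per hour
print(ios_per_node_sec)     # 1200.0 block I/Os per node per second
print(ios_per_job)          # 10800.0 block I/Os per job
print(hours_for_100k_jobs)  # 5.0 hours for a 100,000-job run
```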

Next steps

• Identify a suitable model for a future federated Helix Nebula cloud

• Preparations for putting the EMBL genome analysis pipeline into production are ongoing

• Attract other flagships from within and outside EMBL

– Through initial success with current genome analysis flagship

– After implementation of federated cloud model

Helix Nebula PoC Acknowledgements

EMBL: Michael Wahlers, Jonathon Blake, Tobias Rausch, Jürgen Zimmermann, Vladimir Benes, Christian Boulin, Rupert Lueck

EMBL-EBI: Stephen Keenan, Paul Flicek

Initial flagship use cases

• ATLAS High Energy Physics Cloud Use: To support the computing capacity needs for the ATLAS experiment

• Genomic Assembly in the Cloud: A new service to simplify large scale genome analysis; for a deeper insight into evolution and biodiversity

• SuperSites Exploitation Platform: To create an Earth Observation platform, focusing on earthquake and volcano research

ESA’s experience with Helix Nebula and outlook

Wolfgang Lengert, ERS and ADM-Aeolus Mission Manager

presented by Rupert Lueck (EMBL)

ESA UNCLASSIFIED - For Official Use 05/07/2012

Why the Cloud?

• Data deluge

• Many users

PoC: Questions to answer

1. Can HN cloud computing serve ESA Earth Observation (EO) processing ICT needs?

2. Can ESA deploy an end-to-end platform for Earth Observation exploitation on Helix Nebula?

3. Can an ecosystem of value added service providers develop around such a platform?

Approach: SuperSites Exploitation Platform (SSEP)

1. SSEP is vested as a Helix Nebula “flagship”, alongside the other flagships at CERN and EMBL.

2. CNES, DLR and CNR agreed to participate in Helix Nebula. CNR/IREA (Italian Research Council), as a non-space agency, contributes its radar processor adapted for the cloud.

3. Helix Nebula Proof of Concept participants:

   1. ATOS

   2. CloudSigma

   3. Interoute

   4. T-Systems


Cloud Computing at ESA's fingertips, integrated in the Grid Processing environment

Phase 1: PoC Evaluation Approach

• Performance evaluation (via test scripts)

– Functions: Data dissemination (upload-cataloguing-download), Data Processing (InSAR, SAR-IPF)

– Tests: Availability (24x7), Stress, Scalability

• Terms & Conditions evaluation (via questionnaires)

– Architecture

– Service Levels

– Security

Tests have successfully been concluded!


Earth Observation Application Platform exploiting 20 years of satellite data

• EO Application Platform

– OpenNebula

– Data Catalogue and Access

– Map-Reduce computing model

– Software repository

– Utilities for software development and testing

• Cloudification of application

– CNR / IREA (Italian Research Council in Naples) developed an application (SBAS) that measures the vertical movement of the ground from space at sub-centimetre scale.

– SBAS targets

• Time series over 20 years with ESA archive

• Points of Interest are at world scale

• TBytes of data to process

CNR SBAS Processing

~150 Satellite images: 1.5 TB

Processing time: 150 h

• Earthquakes

• Volcanoes

• Oil & Gas

• Water Resources
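The SBAS figures on this slide (about 150 satellite images, 1.5 TB, 150 h of processing) translate into simple per-image averages. The calculation assumes decimal units (1 TB = 1000 GB) and treats the approximate counts as exact, so the results are rough orders of magnitude only.

```python
IMAGES = 150      # "~150 Satellite images"
TOTAL_TB = 1.5
HOURS = 150

gb_per_image = TOTAL_TB * 1000 / IMAGES
hours_per_image = HOURS / IMAGES
mb_per_second = TOTAL_TB * 1e6 / (HOURS * 3600)

print(gb_per_image)             # 10.0 GB per image
print(hours_per_image)          # 1.0 hour per image
print(f"{mb_per_second:.1f}")   # 2.8 MB/s average processing rate
```

The low average rate suggests the run is limited by computation rather than by raw data movement, although the slides do not give a breakdown.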

Opportunity: Natural Resources

[Map: Los Angeles Area, ERS-1/2 DATA (1995-2002); colour scale: Mean Velocity (mm/a), from < -10 through 0 to > 10]

• Water Resources

• Agriculture

• Sustainable and social development

Opportunity: Natural Hazards

[Map: Los Angeles Area, ERS-1/2 DATA (1995-2002); colour scale: Mean Velocity (mm/a), from < -10 through 0 to > 10]

• Seismic Activity

• Civil Protection Risk Management

• Insurance

Opportunity: Energy Resources

[Map: Los Angeles Area, ERS-1/2 DATA (1995-2002); colour scale: Mean Velocity (mm/a), from < -10 through 0 to > 10]

• Oil & Gas Field


Super Site Exploitation Platform (SSEP)

Different actors and different environments, helping to understand the geophysics of earthquakes and volcanoes

The Geohazard Supersites partnership pools and coordinates the existing space-based and ground-based observation resources of GEO members to mitigate geologic disasters and to improve preparedness for them.

Supersite Exploitation Platform: benefits for potential actors

EO data provider benefits:

• Enlarge EO data exploitation (space agencies)

• Increase EO data sales (commercial distributors), in particular EO data archives

IT companies (computational facilities) benefits:

• New business

• Access to a global user community

• Contribution to science

End-user benefits:

• More data, either free or at low cost

• Processing capabilities free or at low cost

• Processing software free or at low cost

• Forum for discussing/exchanging results

• More science

Processing software provider benefits:

• Low investment

• Increase sales

• Increase software visibility

[Diagram labels: Processing software; Processing software providers; EO-derived information]

Examples of R&D feedback between ESA and the EO services industry

• Jun 2003: Renewable Energy Industry (33 companies)

• Oct 2007: EO services Industry (100 companies)

• Sep 2009: Insurance (15 companies)

• May 2008 + 2010: World Bank Group

• Oct 2009: SwissRe (Flood Risk)

• Jul 2010: 1st Global Business Biodiversity symposium

• Sep 2010: Oil & Gas (104 participants)

Conclusions

• Successful PoC with IaaS Providers

– Able to perform tests with 3-4 providers

– Weaknesses could be addressed during following project phases

• Federated HN vs single cloud providers

– Large differences among providers

– Multi-sourcing approach recommended for next phase

• Using cloud as a grid vs using native PaaS

– Evaluation run à la GRID with static provisioning

– but future use of cloudified applications and dynamic provisioning

• Application cloudification challenges

– SBAS cloudification required significant effort and deep application expertise

• A business model for the ecosystem is still being elaborated

A European cloud computing partnership: big science teams up with big business

Strategic Plan

• Establish multi-tenant, multi-provider cloud infrastructure

• Identify and adopt policies for trust, security and privacy

• Create governance structure

• Define funding schemes

• To support the computing capacity needs for the ATLAS experiment

• Setting up a new service to simplify analysis of large genomes, for a deeper insight into evolution and biodiversity

• To create an Earth Observation platform, focusing on earthquake and volcano research

[email protected] @HelixNebulaSC HelixNebula.TheScienceCloud

World Map of High-throughput Sequencers

PoC Steps

Step 0 – Infrastructure setup and code test

– Transfer of images and content

– Set up shared file system: GlusterFS with 4 nodes (1.2TB net)

– Assembly: SGA assembler tested using small data set

– Annotation: Manual small batch run against small data set

Step 2 – Big genome & elastic scalability

– StarCluster essential in this step

• automated provisioning of Sun Grid Engine cluster up to 50 nodes

– Assembly: Large genome sequencing data, mix of quick parallel jobs and long running serial jobs

– Annotation: Pipeline tested using big data set, 50k - 100k jobs run

– Validation against run on EMBL infrastructures

Step 3 (Optional) – Large genome on big box

– Process a large genome through the Velvet assembly software

– Using a high-RAM server with 1 TB
