CloudMC: A cloud computing map-reduce implementation for radiotherapy. Rubén Jiménez Marrufo, Héctor Miras del Río, Carlos Miras del Río, Carles Gomà Estadella. Big Data Spain (http://www.bigdataspain.org), Madrid, November 16th, 2012.

DESCRIPTION

Session presented at the Big Data Spain 2012 Conference, 16th Nov 2012, ETSI Telecomunicacion UPM, Madrid (www.bigdataspain.org). More info: http://www.bigdataspain.org/es-2012/conference/cloudMC-a-cloud-computing-map-reduce-implementation-for-radiotherapy/ruben-jimenez-and-hector-miras

TRANSCRIPT

Page 1: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012

CloudMC: A cloud computing map-reduce implementation for radiotherapy

Rubén Jiménez Marrufo
Héctor Miras del Río
Carlos Miras del Río
Carles Gomà Estadella

Big Data Spain
http://www.bigdataspain.org

Madrid, November 16th, 2012

Page 2

Contents

Introduction
Radiotherapy
Monte Carlo simulations for radiation transport
Monte Carlo parallelization
Clustering vs. Cloud Computing
Cloud Computing for clinical radiation transport
CloudMC
DEMO START
Architecture
Map Reduce
Elasticity
How did Radarc help us?
Results
Is it reinventing the wheel?
Roadmap
DEMO RESULTS

Questions & Answers

Page 3

Introduction

Héctor Miras del Río, Department of Medical Physics, Virgen Macarena Hospital, Seville, Spain

Rubén Jiménez Marrufo, R&D Division, Icinetic TIC S.L., Seville, Spain

Carlos Miras del Río, R&D Division, Wedoit Innovacion Tecnologica, Seville, Spain

Carles Gomà, Centre for Proton Therapy, Paul Scherrer Institute, Villigen PSI, Switzerland

Page 4

Introduction

Monte Carlo Simulations

Radiotherapy

Cloud Computing

Page 5

Radiotherapy

Radiotherapy: the medical use of ionizing radiation, generally as part of cancer treatment, to control or kill malignant cells.

Radiotherapy treatment planning: the process of calculating, prior to radiotherapy, the radiation dose that the irradiated object will absorb.

Page 6

Monte Carlo simulations for radiation transport

Page 7

Monte Carlo simulations for radiation transport

Page 8

Monte Carlo simulations for radiation transport

Monte Carlo Simulations:

+ 👍 Gold standard algorithms for radiation calculations

- 👎 Extremely computationally intensive and very time-consuming.

Page 9

Monte Carlo parallelization

Parallelization: run one simulation simultaneously on several nodes and merge the results.

Monte Carlo simulations are highly parallelizable since the primary events are independent.
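
A minimal sketch of this idea, assuming a toy Monte Carlo estimate rather than a real radiation transport code: the histories are split across several workers, each with its own random seed, and the partial tallies are merged. The function names, the seed scheme and the toy scoring rule are illustrative assumptions, not part of CloudMC.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_fragment(seed, histories):
    """Simulate `histories` independent primary events with a private RNG.

    Toy model: a history scores 1 if a random point falls inside the unit
    quarter-circle, so the merged tally estimates pi / 4.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(histories):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, histories


def run_parallel(total_histories, n_workers, base_seed=12345):
    """Split one simulation into n_workers fragments and merge the tallies."""
    per_worker = total_histories // n_workers
    seeds = [base_seed + i for i in range(n_workers)]  # one distinct seed per fragment
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(run_fragment, seeds, [per_worker] * n_workers))
    hits = sum(h for h, _ in partials)
    histories = sum(n for _, n in partials)
    return hits / histories


estimate = run_parallel(total_histories=200_000, n_workers=4)
print(f"pi is approximately {4 * estimate:.3f}")
```

Because no fragment depends on another, the merge step is just a sum of tallies, which is what makes this kind of simulation easy to distribute.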

Page 10

Parallelization: Clustering vs. Cloud Computing

Page 11

Cloud Computing for clinical radiation calculations

100-core cluster ≈ 20 000 €

tCPU = 100 h

Number of instances: n = 100

T(n) = 1.44 h

Extra-small instance: 0.0142 € / h

Cost / plan: 2 €

1000 patients / year

Cost / year: 2 000 €

160 years of computing time in an extra-small instance
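
The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes that a plan is billed as n instances running for T(n) hours at the extra-small hourly price, and that the "160 years" figure is the price of the cluster divided by that hourly price. Both readings are assumptions, but they reproduce the numbers above.

```python
HOURS_PER_YEAR = 24 * 365

price_per_hour = 0.0142   # EUR per extra-small instance hour (from the slide)
n_instances = 100         # n (from the slide)
wall_time_h = 1.44        # T(n) for n = 100 (from the slide)
plans_per_year = 1000     # patients per year (from the slide)
cluster_cost = 20_000     # EUR for a 100-core cluster (from the slide)

# Assumed billing model: every instance is paid for the whole wall-clock time.
cost_per_plan = n_instances * wall_time_h * price_per_hour
cost_per_year = cost_per_plan * plans_per_year

# Assumed reading of "160 years": instance-hours the cluster budget would buy.
instance_years = cluster_cost / price_per_hour / HOURS_PER_YEAR

print(f"cost per plan: {cost_per_plan:.2f} EUR")
print(f"cost per year: {cost_per_year:.0f} EUR")
print(f"cluster budget in instance-years: {instance_years:.1f}")
```

Under these assumptions a plan costs about 2 €, a year of 1000 plans about 2 000 €, and the cluster budget corresponds to roughly 160 years of a single extra-small instance.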

Page 12

CloudMC

CloudMC offers an implementation of map/reduce on the Windows Azure cloud computing platform for parallelizing MC simulations of radiation therapy dose distributions.

Non-intrusive

Multi-application: PENELOPE, Geant4, EGSnrc

Elasticity: resources are not reserved; a 1-hour simulation costs 1 hour.

Page 13

CloudMC: DEMO

Page 14

CloudMC Architecture

[Architecture diagram. Labels: UI; Worker Roles; Service Management; Simulation files; Message Queues; Cloud Storage; Cloud Hosted Services; SQL Azure; Users & Simulation; Repositories; Provisioning; MapReduce; Factory; Entities; Services]

Page 15

CloudMC: MapReduce

Sequence of actions when carrying out an MC simulation on n instances:

1. New simulation
- Simulation metadata is saved on SQL Azure.
- Simulation files are uploaded to Azure Storage.

2. Map
- Generation of n initial independent seeds.
- Mapper: modification of the simulation config to divide the histories by n.
- Provisioning of the n worker roles.
- Sending of n "start" messages.

3. Parallel execution
Every worker role:
1. Reads a message from the queue and downloads the simulation files.
2. Executes the "fragmented" simulation.
3. Sends the results to the storage.
4. Sends an "end of simulation" message.

4. Reduce
- When the web role has read the n "end of simulation" messages, the Resolver merges the n results uploaded to the storage.
- n-1 worker roles are scaled down.

5. End of simulation
- Finished simulation metadata is saved on SQL Azure.
- The user is notified by mail that the simulation has ended and that the results can be downloaded.
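
The Map, Parallel execution and Reduce steps of this sequence can be sketched with in-memory stand-ins. In the sketch below, `queue.Queue` objects play the role of the message queues, a dictionary plays the role of the cloud storage, and the "simulation" is a toy tally; none of the names come from CloudMC, and the workers run one after another instead of on separate instances. Steps 1 and 5 (metadata and mail notification) are left out.

```python
import queue
import random


def run_simulation(total_histories, n, master_seed=2012):
    storage = {}             # stand-in for the cloud storage
    start_q = queue.Queue()  # stand-in for the "start" message queue
    done_q = queue.Queue()   # stand-in for the "end of simulation" queue

    # 2. Map: one seed and one share of the histories per worker,
    #    announced with one "start" message each.
    seed_rng = random.Random(master_seed)
    for worker_id in range(n):
        start_q.put({"id": worker_id,
                     "seed": seed_rng.getrandbits(32),
                     "histories": total_histories // n})

    # 3. Parallel execution (sequential here): read a message, run the
    #    fragment, store the result, report the end of the fragment.
    for _ in range(n):
        msg = start_q.get()
        rng = random.Random(msg["seed"])
        tally = sum(rng.random() for _ in range(msg["histories"]))  # toy score
        storage[f"result-{msg['id']}"] = (tally, msg["histories"])
        done_q.put(msg["id"])

    # 4. Reduce: after reading n "end of simulation" messages, merge results.
    finished = [done_q.get() for _ in range(n)]
    total = sum(storage[f"result-{i}"][0] for i in finished)
    count = sum(storage[f"result-{i}"][1] for i in finished)
    return total / count


print(f"merged mean score: {run_simulation(40_000, 8):.3f}")
```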

Page 16

CloudMC: Map

Most MC applications for radiation transport simulation read their configuration from text files.

Mapper: parametrized mapper that sets the number of histories and the seeds in the input files.

Input A: Configuration Files
• Simulation parameters
• Histories count
• Geometry & materials files
• …
• MapReduce Parameters

[Diagram. Labels: Executable, Histories: 1015; Input B, Histories: 215; Executable (repeated); Mapped Executable]
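
As an illustration of what such a mapper does, the sketch below rewrites a text configuration so that each fragment gets its share of the histories and its own seed. The file layout and the `HISTORIES` and `SEED` keywords are hypothetical; real MC codes use their own input formats, which is why the mapper has to be parametrized.

```python
import re


def map_config(config_text, n, seeds, histories_key="HISTORIES", seed_key="SEED"):
    """Return n config texts, each with histories divided by n and its own seed."""
    match = re.search(rf"^{histories_key}\s+(\d+)", config_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError(f"no {histories_key} line found")
    per_fragment = int(match.group(1)) // n

    fragments = []
    for seed in seeds[:n]:
        # Replace the history count, keeping the keyword and spacing intact.
        text = re.sub(rf"^({histories_key}\s+)\d+", rf"\g<1>{per_fragment}",
                      config_text, flags=re.MULTILINE)
        # Replace the seed in the same way.
        text = re.sub(rf"^({seed_key}\s+)\d+", rf"\g<1>{seed}",
                      text, flags=re.MULTILINE)
        fragments.append(text)
    return fragments


# Hypothetical input file with three settings.
example = "HISTORIES 1000000\nSEED 1\nGEOMETRY phantom.geo\n"
for fragment in map_config(example, n=4, seeds=[11, 22, 33, 44]):
    print(fragment.splitlines()[:2])
```

Lines that the mapper does not know about, such as the geometry entry here, pass through unchanged, which keeps the approach non-intrusive for the MC application.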

Page 17

CloudMC: Reduce

The results of MC applications for radiation transport simulation are files, formatted in columns, with the distribution of dose, energy or any other magnitude.

Reducer: parametrized reducer to combine columns depending on the column type:
- Magnitude column
- Uncertainty column

[Diagram. Labels: Mapped Executable (repeated); Dose distribution files; Output]
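
The slide does not say how each column type is combined. One common choice, assumed here, is to average magnitude columns and to combine uncertainty columns in quadrature, which is valid when every fragment ran the same number of histories and the fragments are statistically independent. The function and type names are illustrative.

```python
import math


def reduce_columns(fragments, column_types):
    """Merge per-fragment result tables row by row.

    fragments: list of tables, each a list of rows of floats.
    column_types: "magnitude" or "uncertainty", one entry per column.
    Assumes every fragment ran the same number of histories.
    """
    n = len(fragments)
    merged = []
    for rows in zip(*fragments):  # the same row taken from every fragment
        out = []
        for col, kind in enumerate(column_types):
            values = [row[col] for row in rows]
            if kind == "magnitude":
                out.append(sum(values) / n)  # plain average
            elif kind == "uncertainty":
                # Absolute uncertainties of independent results add in quadrature.
                out.append(math.sqrt(sum(v * v for v in values)) / n)
            else:
                raise ValueError(f"unknown column type: {kind}")
        merged.append(out)
    return merged


# Two fragments, one row each: (dose, absolute uncertainty).
parts = [[[1.0, 0.2]], [[3.0, 0.2]]]
print(reduce_columns(parts, ["magnitude", "uncertainty"]))
```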

Page 18

CloudMC: MapReduce DSL

CloudMC uses a MapReduce DSL whose parameters adapt the Mapper and Reducer to specific MC applications.

[Figure panels: Mapper parameters; Reducer parameters]

Page 19

CloudMC: Elasticity

Users choose the number of instances to use for each simulation.

CloudMC scales up the worker roles to run a simulation and scales them down when it finishes.

Windows Azure Service Management allows role scaling:

👍 REST API
👍 Based on XML config files
👎 Minimum of 1 instance
👎 Impossible to scale down specific instances (Multi-tenant)
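
Scaling through an XML config file amounts to changing an instance count and submitting the new configuration. The sketch below only edits the count in a simplified document modeled loosely on an Azure service configuration; the role names are hypothetical, a real configuration file also carries an XML namespace, and the call to the Service Management REST API is not shown.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for a service configuration file.
CONFIG = """<ServiceConfiguration serviceName="CloudMC">
  <Role name="WebRole"><Instances count="1" /></Role>
  <Role name="WorkerRole"><Instances count="1" /></Role>
</ServiceConfiguration>"""


def set_instance_count(config_xml, role_name, count):
    """Return the configuration with `role_name` scaled to `count` instances."""
    if count < 1:
        # Mirrors the limitation on the slide: a role keeps at least 1 instance.
        raise ValueError("a role needs at least 1 instance")
    root = ET.fromstring(config_xml)
    for role in root.iter("Role"):
        if role.get("name") == role_name:
            role.find("Instances").set("count", str(count))
            return ET.tostring(root, encoding="unicode")
    raise KeyError(role_name)


scaled_up = set_instance_count(CONFIG, "WorkerRole", 64)    # before a simulation
scaled_down = set_instance_count(scaled_up, "WorkerRole", 1)  # after it finishes
print(scaled_up)
```

Note that the count applies to the role as a whole, which matches the limitation listed above: the configuration cannot name which instances to remove.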

Page 20

CloudMC: How did Radarc help us?

[Architecture diagram. Labels: UI; Worker Roles; Service Management; Simulation files; Message Queues; User accounts; Cloud Storage; Cloud Hosted Services; SQL Azure; Users & Simulation; Repositories; Provisioning; MapReduce; Factory; Entities; Services; Formula Azure]

≃ 50% generated code:
• ASP.Net MVC 3 UI
• C# App Services
• C# POCO Entities
• EF CodeFirst
• SQL Azure DB

Focus on domain core: map/reduce, provisioning, fault tolerance, etc.

Page 21

CloudMC: Results

Case Study:
Simulation: ¹²⁵I seed in ophthalmic applicator.
Number of histories: 3·10⁹
MC Code: PENELOPE, main program PenEasy.

Results:
Worker instances size: extra-small
Clock time in 1 instance: 30 h
Clock time in 64 instances: 48 min (speed-up = 37x)
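
The speed-up follows directly from the two clock times. The short calculation below also derives the parallel efficiency (speed-up divided by the number of instances), a figure that is not on the slide but follows from the same numbers.

```python
t_1 = 30 * 60   # clock time on 1 instance, in minutes
t_64 = 48       # clock time on 64 instances, in minutes
n = 64

speed_up = t_1 / t_64       # how many times faster than a single instance
efficiency = speed_up / n   # fraction of the ideal 64x speed-up achieved

print(f"speed-up = {speed_up:.1f}x, efficiency = {efficiency:.0%}")
```

The ratio is 37.5, which the slide reports as 37x, and corresponds to an efficiency of about 59%.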

Page 22

CloudMC: Results

Time vs number of instances study

T(n): Clock time for 1 simulation in n instances.

tcpu: Overall time used only in the simulation of n histories.

Δt0: Non-parallelizable time for 1 instance.

a: Non-parallelizable part of time proportional to n.
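
The formula relating these quantities is not in the transcript. A model consistent with the definitions, assumed here, is T(n) = tcpu / n + dt0 + a·n: the simulation time is shared among the instances, a fixed overhead is paid once, and a further overhead grows with the number of instances. The sketch evaluates that assumed model; the values of dt0 and a are hypothetical, chosen only so that T(100) matches the 1.44 h quoted earlier for tcpu = 100 h.

```python
import math


def clock_time(n, t_cpu, dt0, a):
    """Assumed model: T(n) = t_cpu / n + dt0 + a * n (all times in hours)."""
    return t_cpu / n + dt0 + a * n


def best_instance_count(t_cpu, a):
    """Instance count minimizing the assumed model: dT/dn = 0 at sqrt(t_cpu / a)."""
    return math.sqrt(t_cpu / a)


t_cpu = 100.0   # hours (from the cost slide)
dt0 = 0.2       # hours, hypothetical
a = 0.0024      # hours per instance, hypothetical

print(f"T(100) = {clock_time(100, t_cpu, dt0, a):.2f} h")
print(f"model minimum near n = {best_instance_count(t_cpu, a):.0f}")
```

Under a model of this shape, adding instances stops paying off once the overhead term a·n outweighs the gain from tcpu / n.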

Page 23

CloudMC: Is it reinventing the wheel?

Why not use Amazon Elastic MapReduce? (http://aws.amazon.com/es/elasticmapreduce)

• Our mapper and reducer were written for .Net

http://stackoverflow.com/questions/1190520/is-it-possible-to-write-map-reduce-jobs-for-amazon-elastic-mapreduce-using-net

Why not use Hadoop On Azure? (http://www.hadooponazure.com)

• First preview released in 2012.
• The cluster size must be reserved.

Page 24

Roadmap

Testing with more MC applications: Geant4, EGSnrc, etc.

Support packages with specific MapReduce implementations
• Application to different domains
• Use of MEF to provide Mappers and Reducers in simulation packages

SDK to develop specific MapReduce implementation packages.
• Visual Studio Templates could facilitate the development of CloudMC packages

Enable multi-tenant environments
• Concurrent simulations require scaling down specific instances, which is not possible on Windows Azure.

Page 25

Questions

Page 26

CloudMC soon available at:

https://cloudmontecarlo.cloudapp.net

Thank you for your attention …

[email protected] @hmiras

[email protected] @rjimenez