enabling(cloud(interoperability(( with(compss(...structural(analysis(for(civil engineering(...
TRANSCRIPT
Enabling Cloud interoperability with COMPSs
Daniele Lezzi, Francesc Lordan, Roger Rafanell, Rosa M. Badia Barcelona Supercompu/ng Center, Spain
Fabrizio Marozzo, Domenico Talia
University of Calabria, Italy
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
• The COMPSs framework • The VENUS-‐C plaPorm • COMPSs and Azure • ApplicaQon evaluaQon
Outline
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
q COMP Superscalar (COMPSs) is a programming framework that provides a programming model and a runQme that ease the development of applicaQons for distributed environments and their execuQon on a wide range of computaQonal infrastructures. • COMPSs has been recently extended in order to be interoperable with
several cloud technologies like Amazon, OpenNebula, EmoQve and other OCCI compliant offerings.
q Goal: This work presents the extensions of this interoperability layer to
support the execuQon of COMPSs applicaQons into the Windows Azure Pla9orm.
Goal of This Work
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
Resource 1 ... for (i=0; i<N; i++){ T1 (data1, data2); T2 (data4, data5); T3 (data2, data5, data6); T4 (data7, data8); T5 (data6, data8, data9); } ...
SequenQal Code
T10 T20
T30 T40
T50 T11 T21
T31 T41
T51
T12
…
(c) Scheduling, data transfer, task execuQon
(d) Task compleQon, synchronizaQon
Parallel Resources (a) Task selecQon +
parameters direcQon
(input, output, inout)
Resource 2
Resource N
. . .
(b) Task graph creaQon based on data dependencies
COMPSs Programming model
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
Custom Loader
Javassist
initialize(f1);
for (int i = 0; i < 2; i++) {
genRandom(f2);
add(f1, f2);
}
print(f2);
Annotated interface
User code
T1 T3
T2 T4
Grids Clusters Clouds
Files
COMPSs Programming model
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
COMPSs RunQme
Cloud
OCCI EC2
Azure
Job Storage AccounQng Adaptors
Usage Records
CDMI S3
GAT GRAM SSH
Azure gLite
FTP
gridFTP
OpenNebula
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
Structural Analysis for
Civil Engineering
Building InformaQon Management
Data for Science -‐ AquaMaps
Civil ProtecQon
and Emergencies
BioinformaQcs
System Biology
Drug Discovery
VENUS-‐C: SupporQng the Long Tail of Science
COMPSs in VENUS-‐C allows the easy porQng of scienQfic applicaQons to Clouds • Minimum effort required to the user • No need of change for an already exisQng COMPSs applicaQon • Provides elasQcity features • ExecuQon results demonstrate scalability
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
VENUS-‐C Enactment Services
VENUS-‐C API
COMPSs Enactment Service
OCCI CDMI
The VENUS-‐C PlaPorm
CDMI
Generic Worker
Enactment Service
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
OpenNebula
OCCI
OCCI
FTP
VM1
VM2 VMn
VM1
VM2 VM2 VMn Output Storage
Output Output
Job Dispatcher J1 J2 … Jn … Jm
Thread pool
T1 T2 Tn
Cloud Provider Adapter (OCCI+OVF, Azure) SAGA + CDMI
1. VMs are created on demand through the actual cloud adaptor
2. The COMPSs application is remotely executed through SAGA and the data is staged in
3. The tasks are executed on the worker nodes
COMPSs Enactment Service
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
Azure Java GAT Adaptor
Task Status Table
Task Queue
Windows Azure
Worker Role instances
Tables
Worker Worker Worker Worker Queues
Blobs Client User
libraries taskdata
COMPSs GAT
Azure GAT
Adaptor
SSH GAT Adaptor
IaaS Clouds
Amazon EC2
Eucalyptus
Emotive Cloud
OpenNebula
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
Azure GAT Adaptor: Sub components
q The Azure JavaGAT Adaptor enriches COMPSs with data management and execuQon capabiliQes that make it interoperable with Azure and implemented using two subcomponents.
q Data management is supported by a subcomponent called Azure File Adaptor. Read and write data on the Azure Blob Storage (Blobs), to deploy the libraries needed to execute on Azure and to store the input and output data (taskdata) for the tasks.
q The Azure Resource Broker Adaptor, on the other side, is responsible for the task submission. Following the Azure Work Queue pacern, this subcomponent adds into a Task Queue the tasks that must be executed on an Azure resource by a Worker. In order to keep the runQme informed about each task execuQon, the status of the tasks is updated in a Task Status Table.
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
Azure GAT Adaptor: Exec. mechanisms
Task Status Table
Task Queue
Windows Azure
Worker Role instances
Tables
Worker Worker Worker Worker Queues
Blobs Client User
libraries
worker.jar
taskdata
job1.out job1.err
output1
COMPSs GAT
Azure GAT
Adaptor
SSH GAT Adaptor
T T T
matmul.jar matmul.jar
input1 input1
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
Azure GAT Adaptor: Architecture
Task Status Table
Task Queue
Windows Azure
Worker Role instances
Tables
Worker Worker Worker Worker Queues
Blobs Client User
1
2
3
libraries
kMeans.jar matmul.jar
blast.jar
worker.jar
sparseLu.jar
taskdata
job1.out job1.err
input1 output1
4.1
5
4.2
COMPSs GAT
Azure GAT
Adaptor 4.3
4.4
IaaS Clouds
SSH GAT Adaptor
Amazon EC2
Eucalyptus
Emotive Cloud
OpenNebula The task status is dynamically updated whenever the status of a task changes.
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
EvaluaQon: Goals
q EvaluaQng the performance of Azure GAT Adaptor through the execuQon of a classier-‐based workflow applicaQon on a pool of virtual servers hosted by Microsoe Cloud datacenters. • Scalability that can be achieved through the parallel execuQon on a
pool of virtual servers. q EvaluaQng the performance of the workflow applicaQon in a hybrid
scenario • A private cloud environment managed by EmoQve Cloud middleware • A public cloud testbed made of Azure instances • An hybrid configuraQon using both private and public clouds
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
EvaluaQon Workflow applicaQon
Dataset
Dataset Partitioner
TestSet
Classifier
Model Selector
Model
Model
Model Evaluator
Evaluator Evaluatedmodel
Evaluatedmodel
Evaluatedmodel
Best model
Evaluator Classifier
Classifier
TrainSet
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
EvaluaQon: Performance
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
Conclusions
q COMPSs is plaPorm unaware programming model that simplifies the development of applicaQons in distributed environments • Transparent data managemet, task execuQon • ParallelizaQon at task level • Independent of plaPorm: clusters, grids, clouds
q The Azure adaptor extends COMPSs interoperability q The proposed approach has been validated through the execuQon of a
data mining workflow ported to COMPSs and executed on a hybrid testbed composed of a private cloud in VENUS-‐C
q Future work includes the support of the dynamic resource provisioning in Azure and enhancements to the Azure JavaGAT adaptor to opQmize data transfers among different clouds, and the possibility to specify input files already available on the Azure storage.
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
Contacts & acknowledgments
Daniele Lezzi Roger Rafanell Francesc Lordan Rosa M. Badia Fabrizio Marozzo Domenico Talia
ICAR-‐CNR, University of Calabria
@
@
Cloud Futures 2012, Berkeley, CA -‐ May 7-‐8, 2012
www.bsc.es/compss www.venus-‐c.eu