
Page 1: Large-Scale Science Through Workflow Management

Large-Scale Science Through Workflow

Management

Ewa Deelman

Center for Grid Technologies

USC Information Sciences Institute

Page 2: Large-Scale Science Through Workflow Management

Acknowledgements

Ewa Deelman, Carl Kesselman, Gaurang Mehta, Gurmeet Singh, Mei-Hui Su, Karan Vahi (Center for Grid Technologies, ISI)

James Blythe, Yolanda Gil (Intelligent Systems Division, ISI)

http://pegasus.isi.edu Research funded as part of the NSF

GriPhyN, NVO and SCEC projects and EU-funded GridLab

Page 3: Large-Scale Science Through Workflow Management

Today’s Scientific Applications

Increasing in complexity:
Use of individual application components
Reuse of individual intermediate data products (files)
Description of data products using metadata attributes

The execution environment is complex and very dynamic:
Resources come and go
Data is replicated
Components can be found at various locations or staged in on demand

Separation between the application description and the actual execution description

Page 4: Large-Scale Science Through Workflow Management

Workflow Definitions

Workflow template: shows the main steps in the scientific analysis and their dependencies without specifying particular data products

Abstract workflow: depicts the scientific analysis including the data used and generated, but does not include information about the resources needed for execution

Concrete workflow: an executable workflow that includes details of the execution environment
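
To make the three definitions concrete, here is a minimal sketch of how the three levels might be represented as data structures; the class and field names are illustrative assumptions, not the actual Pegasus or Chimera data model.

# Illustrative sketch of the three workflow levels; names are hypothetical,
# not the Pegasus or Chimera data model.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TemplateStep:                  # workflow template: steps and dependencies only
    name: str
    depends_on: List[str] = field(default_factory=list)

@dataclass
class AbstractTask(TemplateStep):    # abstract workflow: adds logical data
    inputs: List[str] = field(default_factory=list)     # logical file names
    outputs: List[str] = field(default_factory=list)

@dataclass
class ConcreteJob:                   # concrete workflow: adds execution details
    name: str
    executable: str                  # physical path on the chosen resource
    arguments: List[str]
    site: str                        # execution resource
    depends_on: List[str] = field(default_factory=list)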

Page 5: Large-Scale Science Through Workflow Management

[Figure: Workflow evolution, from scientific analysis to execution. Construct the Analysis produces the Workflow Template; Select the Input Data yields the Abstract Workflow; Map the Workflow onto Available Resources produces the Concrete Workflow; Execute the Workflow turns it into tasks executed on Grid resources.]

Page 6: Large-Scale Science Through Workflow Management

[Figure: Workflow evolution within the execution environment. The same chain (Construct the Analysis -> Workflow Template; Select the Input Data -> Abstract Workflow; Map the Workflow onto Available Resources -> Concrete Workflow; Execute the Workflow -> tasks executed on Grid resources) is annotated with the information each step draws on: a library of application components (component characteristics), data catalogs (data properties), and information services (resource availability and characteristics). Refinement ranges from user-guided to automated.]

Page 7: Large-Scale Science Through Workflow Management

Concrete Workflow Generation and Mapping

[Figure: Concrete workflow generation and mapping. An Input Data Selector and the Compositional Analysis Tool (CAT) produce the Workflow Template; Chimera and the Montage Abstract Workflow Service produce the Abstract Workflow (the application-dependent part). Pegasus maps the Abstract Workflow to the Concrete Workflow, which Condor DAGMan executes as jobs on Grid resources and returns the results (the application-independent part).]

Page 8: Large-Scale Science Through Workflow Management

Pegasus: Planning for Execution in Grids

Maps from abstract to concrete workflow using algorithmic and AI-based techniques

Automatically locates physical locations for both workflow components and data

Finds appropriate resources to execute the components

Reuses existing data products where applicable

Publishes newly derived data products

Provides provenance information

Page 9: Large-Scale Science Through Workflow Management

Generating a Concrete Workflow

Information needed: the location of files and component instances, and the state of the Grid resources

Select specific resources and files

Add the jobs required to form a concrete workflow that can be executed in the Grid environment: data movement and data registration

Each component in the abstract workflow is turned into an executable job (see the sketch below)

[Figure: The abstract task "FFT filea" becomes the concrete job "/usr/local/bin/fft /home/file1", preceded by a data transfer job ("Move filea from host1://home/filea to host2://home/file1") and followed by a data registration job.]
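
A minimal sketch of this expansion, following the example in the figure above; the catalog dictionaries, the expand() helper, and the output file name are assumptions for illustration, not Pegasus internals.

# Hypothetical expansion of one abstract task into concrete jobs; the catalog
# dictionaries and helper below are illustrative, not Pegasus internals.
transformation_catalog = {("FFT", "host2"): "/usr/local/bin/fft"}
replica_catalog = {"filea": "host1://home/filea"}        # logical -> physical

def expand(transform, in_lfn, out_lfn, site, local_name):
    in_pfn = f"{site}://home/{local_name}"
    return [
        # stage the input to the execution site
        {"type": "transfer", "from": replica_catalog[in_lfn], "to": in_pfn},
        # run the physical executable found for this site
        {"type": "execute", "site": site,
         "cmd": [transformation_catalog[(transform, site)], f"/home/{local_name}"]},
        # register the newly derived data product
        {"type": "register", "lfn": out_lfn, "pfn": f"{site}://home/{out_lfn}"},
    ]

for job in expand("FFT", "filea", "filea.fft", "host2", "file1"):
    print(job)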

Page 10: Large-Scale Science Through Workflow Management

Information Components used by Pegasus

Globus Monitoring and Discovery Service (MDS): locates available resources and finds resource properties
Dynamic: load, queue length
Static: location of GridFTP server, RLS, etc.

Globus Replica Location Service (RLS): locates data that may be replicated and registers new data products

Transformation Catalog (TC): locates installed executables

(A toy stand-in for these three services is sketched below.)
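
The sketch below uses plain dictionaries as stand-ins for the three services and a deliberately naive site-selection policy; the site names, paths, and policy are assumptions, not what Pegasus actually does.

# Toy stand-ins for the three information sources; a real deployment queries
# Globus MDS, the RLS, and the Transformation Catalog instead of these dicts.
mds = {"siteA": {"queue_length": 12, "gridftp": "gsiftp://siteA"},
       "siteB": {"queue_length": 3,  "gridftp": "gsiftp://siteB"}}
rls = {"b": ["siteA://data/b"]}                    # logical file -> replicas
tc  = {("d2", "siteA"): "/opt/bin/d2", ("d2", "siteB"): "/opt/bin/d2"}

def pick_site(transform):
    # among sites where the executable is installed, pick the one with the
    # shortest queue (a deliberately naive selection policy)
    candidates = [site for (t, site) in tc if t == transform]
    return min(candidates, key=lambda s: mds[s]["queue_length"])

print(pick_site("d2"))   # -> siteB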

Page 11: Large-Scale Science Through Workflow Management

Example Workflow Reduction

Original abstract workflow: task d1 takes file a and produces file b; task d2 takes file b and produces file c.

If "b" already exists (as determined by a query to the RLS), the workflow can be reduced: d1 is dropped and only d2 (taking b, producing c) remains. A minimal reduction sketch follows.
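
A minimal sketch of this reduction under the assumption that the workflow is a simple job-to-files mapping; a full reducer would also prune ancestors transitively, but for this two-job example the simple rule suffices.

# Minimal reduction sketch: drop any job whose outputs all exist already
# (as an RLS query would report); the dict-based workflow is illustrative.
existing = {"a", "b"}                       # files already registered

workflow = {                                # job -> (inputs, outputs)
    "d1": (["a"], ["b"]),
    "d2": (["b"], ["c"]),
}

def reduce_workflow(wf, available):
    return {job: (ins, outs) for job, (ins, outs) in wf.items()
            if not set(outs) <= available}

print(reduce_workflow(workflow, existing))  # only d2 remains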

Page 12: Large-Scale Science Through Workflow Management

Mapping from abstract to concrete

Query the RLS, MDS, and TC, then schedule the computation and data movement. For the reduced workflow (d2 taking b and producing c), the concrete workflow contains:

Move b from A to B
Execute d2 at B
Move c from B to U
Register c in the RLS

Page 13: Large-Scale Science Through Workflow Management

Pegasus Research

Resource discovery and assessment

Resource selection

Resource provisioning

Workflow restructuring: tasks merged together or reordered to improve overall performance (a minimal clustering sketch follows)

Adaptive computing: workflow refinement adapts to the changing execution environment
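
One simple way to picture the task-merging restructuring is wrapping a chain of small tasks into a single submitted job; the sketch below is an illustration of that idea, and the task commands are placeholders, not real application binaries.

# Illustrative clustering: wrap a chain of small tasks into one submitted job
# so scheduling overhead is paid once instead of once per task.
def cluster_to_script(tasks):
    lines = ["#!/bin/sh", "set -e"]          # abort the cluster on the first failure
    lines += [" ".join(cmd) for cmd in tasks]
    return "\n".join(lines)

# placeholder task chain, not real application binaries
chain = [["gunzip", "part1.gz"], ["process_part", "part1"], ["gzip", "part1.out"]]
print(cluster_to_script(chain))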

Page 14: Large-Scale Science Through Workflow Management

Benefits of the workflow & Pegasus approach

The workflow exposes the structure of the application and its maximum parallelism

Pegasus can take advantage of that structure to:
Set a planning horizon (how far into the workflow to plan)
Cluster a set of workflow nodes to be executed as one (for performance)

Pegasus shields the user from the Grid details

Page 15: Large-Scale Science Through Workflow Management

Benefits of the workflow & Pegasus approach

Pegasus can run the workflow on a variety of resources

Pegasus can run a single workflow across multiple resources

Pegasus can opportunistically take advantage of available resources (through dynamic workflow mapping)

Pegasus can take advantage of pre-existing intermediate data products

Pegasus can improve the performance of the application

Page 16: Large-Scale Science Through Workflow Management

Mosaic of M42 created on TeraGrid resources using Pegasus

Pegasus improved the runtime of this application by 90% over the baseline case

Bruce Berriman, John Good (Caltech)

Joe Jacob, Dan Katz (JPL)

Page 17: Large-Scale Science Through Workflow Management

Future Directions

Support for workflows with real-time feedback to scientists. Providing intermediate analysis results so that the experimental setup can be adjusted while the short-lived samples or human subjects are available.

Page 18: Large-Scale Science Through Workflow Management

Cognitive Grids: Distributed Intelligent Reasoners that Incrementally Generate the Workflow

[Figure: Over time, and across levels of abstraction, a user's request is refined using application-level knowledge into relevant components, then a full abstract workflow, then logical tasks, and finally tasks bound to resources and sent for execution; parts of the workflow may already be executed, partially executed, or not yet executed. Workflow refinement and workflow repair are driven by an ontology-based matchmaker and a policy reasoner.]

Page 19: Large-Scale Science Through Workflow Management

BLAST: a set of sequence comparison algorithms used to search sequence databases for optimal local alignments to a query

Led by Veronika Nefedova (ANL) as part of the PACI Data Quest Expedition program

Two major runs were performed using Chimera and Pegasus:

1) 60 genomes (4,000 sequences each) processed in 24 hours. The genomes were selected from DOE-sponsored sequencing projects. The run delivered 67 CPU-days of processing time, ~10,000 Grid jobs, >200,000 BLAST executions, and 50 GB of generated data.

2) 450 genomes processed.

Speedups of 5-20 times were achieved because the compute nodes were used efficiently, by keeping the submission of jobs to the compute cluster constant (a throttling sketch follows).
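
One reading of "constant job submission" is a throttled submitter that keeps a fixed number of jobs queued at the cluster so compute nodes never sit idle; the sketch below is an assumption about that mechanism, not the actual script used for these runs.

# Sketch of keeping job submission to the cluster constant: top the queue up
# to a fixed depth. The submit and count_queued callables are placeholders
# for whatever batch system interface is in use.
import time

def run_throttled(pending, submit, count_queued, max_queued=50, poll_seconds=30):
    pending = list(pending)
    while pending:
        while pending and count_queued() < max_queued:
            submit(pending.pop())            # refill the queue to the limit
        time.sleep(poll_seconds)             # wait before re-checking the queue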

Page 20: Large-Scale Science Through Workflow Management

Tomography (NIH-funded project): derivation of 3D structure from a series of 2D electron microscopic projection images

Reconstruction and detailed structural analysis of complex structures like synapses and large structures like dendritic spines

Acquisition and generation of huge amounts of data

A large amount of state-of-the-art image processing is required to segment structures from the extraneous background

[Figure: Dendrite structure to be rendered by tomography.]

Work performed with Mark Ellisman, Steve Peltier, Abel Lin, Thomas Molina (SDSC)

Page 21: Large-Scale Science Through Workflow Management

LIGO’s pulsar search at SC 2002

The pulsar search conducted at SC 2002:
Used LIGO data collected during the first scientific run of the instrument
Targeted a set of 1,000 locations of known pulsars as well as random locations in the sky
Results of the analysis were published via LDAS (LIGO Data Analysis System) to the LIGO Scientific Collaboration
Performed using LDAS and compute and storage resources at Caltech, the University of Southern California, and the University of Wisconsin-Milwaukee

ISI people involved: Gaurang Mehta, Sonal Patil, Srividya Rao, Gurmeet Singh, Karan Vahi. Visualization by Marcus Thiebaux.

Page 22: Large-Scale Science Through Workflow Management

Southern California Earthquake Center

• Southern California Earthquake Center (SCEC), in collaboration with the USC Information Sciences Institute, San Diego Supercomputer Center, the Incorporated Research Institutions for Seismology, and the U.S. Geological Survey, is developing the Southern California Earthquake Center Community Modeling Environment (SCEC/CME).

• Create fully three-dimensional (3D) simulations of fault-system dynamics.

• Physics-based simulations can potentially provide enormous practical benefits for assessing and mitigating earthquake risks through Seismic Hazard Analysis (SHA).

• The SCEC/CME system is an integrated geophysical simulation modeling framework that automates the process of selecting, configuring, and executing models of earthquake systems.

Figure 1: Fréchet sensitivity kernel showing the travel path between a Yorba Linda earthquake and the TriNet station DLA.

Acknowledgments:

Philip Maechling and Vipin Gupta

University Of Southern California