a definition and analysis of the role of meta-workflows in workflow interoperability junaid arshad,...

20
A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center for Parallel Computing, University of Westminster, London, UK (J.Arshad, G.Z.Terstyanszky, T.Kiss, Weingan)@westminster.ac.uk

Upload: gabriel-flowers

Post on 15-Jan-2016

221 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow

Interoperability

Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam WeingartenCenter for Parallel Computing, University of Westminster, London, UK

(J.Arshad, G.Z.Terstyanszky, T.Kiss, Weingan)@westminster.ac.uk

Page 2: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

2

Scientific Workflows

• Complex computational experiments involving

• Analysis of large volumes of data• Use of distributed/high performance

computing infrastructures (DCIs)

• Individual tasks performing control and data analytics functions

Collect Data

Transform Data

Analytics

Data Store

Flush Outputs

External Aggregation

Page 3: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

3

Meta-workflows

• Composed of one or more entities such as jobs and/or sub-workflows

• Re-use these existing entities as components

• Faster and more efficient development

Collect Data

Transform Data

Data Store

Flush Outputs

External Aggregation

Analytics

Join

Aggregate

results

Run algorith

m

Page 4: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

4

Types of Workflows (1)• Single Workflow

• all the nodes of the workflow are individual jobs that are executed and managed by a single workflow system such as P-GRADE [1], Galaxy [2] or Taverna [3]

N1

CN2

CN1

N2

N3

Embedded WF

ce1

e1

e3

e2

Host Workflow System (WS1)

N1

N2

N3

e1

e3

e2

Host Workflow System (WS1)

Page 5: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

5A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

N1

CN2

CN1

N2

N3

Embedded WF

ce1

e1

e3

e2

Host Workflow System (WS1)

N1

N2

N3

e1

e3

e2

Host Workflow System (WS1)

Single workflow Native meta-workflow

N1

CN2

CN1

N2N3

e1

e3

e2

Host Workflow System (WS1)

N4e4

CN3

CN2

CN1

WS3 WS2

ce1

ce1

ce2

Non-native meta-workflow

Types of Workflows (2)

• Native meta-workflow• the embedded sub-workflows are all from the host workflow system (WS1)

• Non-native meta-workflow• at least one of the embedded workflows is from external workflow systems

Page 6: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

6

Non-native meta-workflows

• Two approaches have been established to execute non-native meta-workflows

• Coarse Grained Interoperability (CGI)• Host workflow engine and workflows are considered native whereas all other workflows

are considered non-native• The non-native workflows and workflow engines are managed as legacy applications

• Fine Grained Interoperability (FGI)• A workflow is considered to have two distinguished parts i.e. abstract and concrete• Abstract includes workflow graph and abstract I/O whereas concrete part contains

implementation details• FGI approach creates a IWIR bundle containing IWIR representation of abstract part and

the concrete part.

Page 7: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

7

Meta-workflow Execution Approaches

Workflow Engine A

Workflow Repository

Submission Service

Workflow Engine B

DCIWF2WF1

WF3

Meta-workflowNative WF

Engine

Non-native workflow

Native WFNative WF

Non-native WF Engine

Coarse-grain Workflow Interoperability usage scenario Transformation of workflows for FGI approach

Page 8: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

8

Formalizing CGI based Approach

• Based on existing formalization presented in [4]• Objective to improve the usability of existing meta-workflows by

highlighting the requirements of CGI based workflow interoperability approach

• Formalization achieved using the Z-notations

Page 9: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

9

WFmeta =: {J1…Jk, WFnat1…WFnatl, WFnnt1,..WFnntm}, where Jnath - native job h = 1 . . . k

Wfnati - native workflow i = 1 . . . lWFnntj - non-native workflow j = 1 . . . m.

Existing formalization

Formal description of the CGI approach using Z

Page 10: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

10

Types of CGI-based meta-workflow

• Focused on CGI-based meta-workflows• Aimed to facilitate workflow developers to design new workflows by

enabling them to comprehend with attributes and semantics of each type of workflow

• Following types of meta-workflows consideredi. Single job meta-workflowii. Linear multi-job meta-workflowiii. Parallel multi-job meta-workflow iv. Parameter sweep meta-workflow

Page 11: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

11

Single job meta-workflow

• A job representing this workflow can be a simple native job, a native workflow or even a non-native workflow

Jn = {Jabs, Jcnf, Jcnr, Jennat}

where Jn : native job, Jabs : abstract description of the job, Jcnf : configuration of the job, Jcnr : implementation of the job Jennat : native workflow engine

WFn = {WFabs, WFcnr, WFcnf, WFenn}

where WFabs : abstract part of workflow WFcnr : concrete part of workflow WFcnf : workflow configuration WFenn : native workflow engine

WFnn = {TASKS, WFabs, WFcnr, WFcnf, WFennn}

where Jnn = {Jabs, Jcnf, Jcnr, Jennnat} Jnn ∈ TASKS WFnn : non-native workflow WFennn : non-native workflow engine

Single native job Single native workflow Single non-native workflow

1 2

Job0

Single Job Meta-Workflow

Page 12: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

12

Linear multi-job meta-workflow

• A pipeline of multiple jobs in the native workflow system where any (or even all) of these jobs can be native jobs, native or non-native workflows

WFlmj = {J1…, Jm}={Jn1…, Jnk, WFn1, …,WFny, WFnn1, …,WFnnz} where Jj job of the linear multi-job meta-workflow, j = 1 … m Jni native job J∈ j, i = 1 … k WFnp native workflow J∈ j, p = 1… y WFnnt non-native workflow, J∈ j, t = 1… z

If Jx and Jy are two consecutive jobs and Jx is not the last job of the workflow, there must be an edge ei = (Pxo, Pyi) where Pxo : output port of job Jx Pyi : input port of job Jy

Job0

1 2 12 1 2

Job1 Job2

Linear multi-job meta-workflow

Page 13: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

13

Parallel multi-job meta-workflow• A native workflow including parallel branches which can include either single jobs

or native and non-native workflows, i.e. composed of several single and/or linear multi-job meta-workflows

WFpmj = {J1…, Jm}={Jn1…, Jnk, WFn1, …,WFny, WFnn1, …,WFnnz} where Jj job of the parallel multi-job meta-workflow, j = 1 … m Jni native job J∈ j, i = 1 … k WFnp native workflow J∈ j, p = 1… y WFnnt non-native workflow, J∈ j, t = 1… z

A parallel multi-job meta-workflow must contain an edge that connects split and worker jobs es = (Pgo, Pwxi), where Pso : output port of the split job Jwx : worker job and x = 1 … k Pwxi : input port of the worker job Jwx

and, an edge em such that connects worker and merge jobs em = (Pwxo, Pmi), where Pwxo : output port of a worker job Jwx and x = 1 … k Pmi : input port of the merge job Jm and i = 1 … k

1

2

1

1 1

1

2

2

32

4

Job0

Job1

Job2 Job3

Job4

Parallel multi-job meta-workflow

Page 14: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

14

Parameter-sweep meta-workflow• A parameter sweep workflow in the native workflow system that

includes one or more non-native workflows

1

1

1

1

2 2

2

Generator

Job2Job1

Collector

WFps = = {J1…, Jm}={Jn1…, Jnk, WFn1, …,WFny, WFnn1, …,WFnnz} where Jj job of the parallel sweep meta-workflow, j = 1 … m Jni native job J∈ j, i = 1 … k WFnp native workflow J∈ j, p = 1… y WFnnt non-native workflow, J∈ j, t = 1… z

Connections between the generator, collector and the worker jobs of the parameter sweep meta-workflow are described by two specific types of edges eg = (Pgo, Pwxi) where

Pgo : output port of the generator job Jg

Pwxi : input port of the worker job Jwx and x = 1 … kand ec = (Pwxo, Pci) where

Pwxo : output of a worker job Jwx and x = 1 … kPci : input port of collector job Jc and i = 1 … k Parameter-sweep meta-workflow

Page 15: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

15

SHIWA Simulation Platform (1)• The simulation platform contains

• science gateway SHIWA Science Gateway• a submission service SHIWA Submission Service• a workflow repository SHIWA Repository• a data transfer service Data Avenue Service

• Enables data, infrastructure and workflow interoperability at both coarse- and fine-grained level

• Supports the whole workflow life-cycle addressing the challenges of• executing workflows of different workflow systems as non-native workflows• combining workflows of different workflow systems into meta-workflows• running these workflows on different DCIs

• The gateway is integrated with the WS-PGRADE Workflow System which is used as native workflow engine in the simulation platform

• Supports ASKALON, Galaxy, GWES, Kepler, MOTEUR, Pegasus, ProActive, Taverna, Triana

Page 16: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

16

SHIWA Simulation Platform (2)

Page 17: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

17

Meta-workflows in Heliophysics (1)• Simulating complex scenarios such as propagation of Coronal Mass

Ejection (CME)• Two types of meta-workflows used

• Linear multi-job meta-workflow• Investigation of the fastest CME in a given period of time using an embedded Taverna

workflow• Parameter sweep meta-workflow

• finds and propagates the fastest CME over a given time range• The WS-PGRADE parameter sweep extension of the Taverna meta-workflow performs the

same operation over many time ranges

• Experiments performed using meta-workflow support in SHIWA Simulation Platform

Page 18: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

18

Linear multi-job meta-workflow for CME propagation Parameter sweep meta-workflow in WS-PGRADE

Meta-workflows in Heliophysics (2)

Page 19: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

19

Conclusions and Further Ahead

• Meta-workflows represent an exciting but challenging prospect within the context of workflow execution and interoperability

• Formalization for workflow types and CGI based approach for workflow interoperability has been achieved

• Significant contribution to facilitate design and development of new and existing workflows and workflow engines

• An example use case from Heliophysics has been presented demonstrating the effectiveness of the efforts for broader scientific domains

• Envisage expanding the scope of formalization to fine grained workflow interoperability approach

Page 20: A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability

20

References

• [1] A. Kertész, G. Sipos and P. Kacsuk; Brokering multi-grid workflows in the P-GRADE portal, in: Euro-Par 2006: Parallel Processing, vol. 4375, Springer, Berlin, 138–149, 2007.

• [2] Giardine, B., Riemer et al.; Galaxy: A platform for interactive large-scale genome analysis. Genome Research, 15(10):1451-5, 2005.

• [3] Oinn, T., Addis et al.; Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics, 20(17):3045-54, 2004

• [4] Terstyanszky, G. et al; Enabling Scientific Workflow Sharing Through Coarse Grained Interoperability in the Future Generation Com[puter Systems Vol 37, 46-59, 2014.