comparison and versioning of scientific workfloweogasawara/wp-content/... · versioning of...

Post on 27-Jul-2020

10 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

1

Comparison and Versioning of Scientific

Workflow Eduardo Ogasawara1 Pablo Rangel1 Leonardo Murta2 Cláudia Werner1 Marta Mattoso1

1Federal University of Rio de Janeiro 2Fluminense Federal University

2

Summary

l Scientific Workflows l Versioning of Scientific Workflows l Diff/Merge of Scientific Workflows l Conclusion

3

Scientific Workflows and

Scientific Workflows Management Systems 1

4

1. In vitro experiment

2. Data analyzed by program X

3. Large Volume of Data Produced ...

4. ...which need to be processed by program Y in a cluster

5. Results are analyzed by program Z

Experiment Scenario Laboratory

In silico experiments assisted by scientific workflows

5

In Silico Experiment Process

6

Sharing and collaborating scientific workflows

7

Comparison with software development Programmer’s IDE E-Scientist’s IDE

Version control system with a repository that includes diff/merge facilities for collaborative software development

Absence of repository offering adequate version control and diff/

merge infra-structure

8

Goals l Define a version model for scientific

workflows l Define a diff/merge strategy for

scientific workflows

9

Versioning of Scientific Workflows 2

10

Versioning of scientific workflows

l  Software process can be compared to software (Fusaro et. al., 1998) → workflows can be compared to software

l  CM for workflows demands: l  Repository with access control to register

workflows and separate stable from under development versions

l  Mechanism to represent and store versions l  Presence of workspace concept to support

the modeling and maintenance of workflows

11

Version model Version space

Product space

Objects to be versioned and version identification (Conradi, 1998)

12

Product space

l  Coarse-grained units l  Structural information of the workflow. The

graph decomposition of the workflow l  Inherit “VersionedElement” l  Workflow, Activity, Relationship, Ports

l  Fine-grained units l  Internal information of each class

13

Version space

l  Each ConfigurationItem is composed by Version

l  Each VersionedElement has a version identifier

l  Version have next* and previous version l  A version may have branches and may be

merged

14

A

Workflow representation in version space

1

C A→C

1 1 1

OR node

AND node

OR edge

AND edge sample selections

A Version 1

C

Interplay between version and product space

15

A

Workflow representation in version space

1

B C A→C C→B

1 1 1

OR node

AND node

OR edge

AND edge sample selections

2 2 2 2

2

A Version 1

C

A

C B

Version 2 Workflow structure

Workflow evolution

16

Diff / Merge of Scientific Workflows 3

17

collaborative scenario

A

E

F

A

B E

J

D

H

I

B

C

E User 2 workspace

Baseline (v1)

User 1 workspace

A

X

G

K

D B

C

X

D

V

T

check-out check-out

18

User 1 makes check-in

User 1 tries to check-in first. No problem.

A

E

F

A

B E

J

D

H

I

B

C

E User 2 workspace

Baseline (v1)

Current version (v2) A

X

G K

D

B

C

X

D

V T

check-out

19

User 2 tries to check-in

A

E

F

A

B E

J

D

H

I

B

C

E User 2 workspace

Baseline (v1)

Current version (v2) A

X

G K

D

B

C

X

D

V T

check-out

When User 2 tries to check-in, a diff/merge needs to be executed.

20

diff / merge l  Configuration management tools usually

supports: l  2-way merge l  3-way merge

line  1  line  2  line  3  

line  1’  line  2  

<line  1>  ou  <line  1’>?  line  2  <line  3>  ou  nothing?  

line  1  line  2  line  3  

line  1’  line  2  

line  1’  line  2  line  3  

line  1  line  2  

21

2 way merge

A

B E

D

H

I

F

G

J

K

C

X

V

T

All conflicts need to be solved by the user.

A

E

F

A

B E

J

D

H

I

User 2 workspace Current version (v2)

G K

B

C

X

D

V T Conflicts to be solved during a check-in

22

3 way merge

A

B E

D

H

I

F

G

J

K

V

T

Conflicts to be solved during a check-in

A

E

F

A

B E

J

D

H

I

B

C

E User 2 workspace

Baseline (v1)

Current version (v2)

A

X

G K

D

B

C

X

D

V T

Lesser conflicts need to be solved by the user. The conflict involving activities

C and X is solved.

23

3-way sub graph diff/merge

l  Workflows have a dual behavior of being a model and executable code at the same time

l  Goal is to support a syntax merge, which means that a candidate conflict is not just a coarse grain unit, but a sub graph from the initial coarse grain conflict unit

24

3-way sub graph diff/merge

A

B E

D

H

I

F

G

J

K

V

T

step 3 A

B E

D

H

I

F

G

J

K

V

T

Exploring from E: No simplification

step 1

A

B E

D

H

I

F

G

J

K

V

T

step 2

Exploring from B: Simplification by

syntax merge Re-exploring from E:

decrease in conflict size

25

3-way sub graph diff/merge - two final conflicts

A

B E

D

H

I

F

G

J

K

V

T

Conflict one

Exploring from E In forward direction

A

B E

D

H

I

F

G

J

K

V

T

Conflict two

Exploring from E in backward direction

26

Conclusions 4

27

Conclusions

l  Contributions l  Version Model for scientific workflows l  Syntax diff/merge for scientific workflows

l  Prototype under development: l  Evaluate presented concepts l  Developed on top of Java Workflow Editor

(http://www.enhydra.org) using Postgresql DBMS

28

Eduardo Ogasawara ogasawara@cos.ufrj.br

Visit our Web site http://gexp.nacad.ufrj.br

Comparison and Versioning of Scientific Workflows

Thank you!

top related