scientific workflow management system based on ptolemy ii allows scientists to visually design and...


Upload: blaze-steven-sanders

Post on 25-Dec-2015


Page 1: Scientific workflow management system based on Ptolemy II  Allows scientists to visually design and execute scientific workflows  Actor-oriented

Scientific workflow management system based on Ptolemy II

Allows scientists to visually design and execute scientific workflows

Actor-oriented model with directors acting as the main workflow engine

Enables different models of computation


Modeling the flow of data from one step to another in a series of computations to achieve some scientific goal


Software system for modeling, simulation, and design of concurrent, real-time, embedded systems developed at UC Berkeley

Objective: “The focus is on assembly of concurrent components. The key underlying principle in the project is the use of well-defined models of computation that govern the interaction between components. A major problem area being addressed is the use of heterogeneous mixtures of models of computation.”


Directors, Actors, Ports, Relations

[Figure: actors connected through ports. Each port is joined by a link to a relation, and the relation forms the connection between actors. Each actor has attributes.]


Directors control execution of a workflow (scheduling, dispatching threads, etc.)

Actors are executable components of a workflow

Directors govern execution of Actors


Actor-/Dataflow Orientation vs. Object-/Control flow Orientation


Every Kepler workflow needs a director

Execute networks of components under multiple execution models
› Synchronous vs. parallel vs. dataflow vs. time-based vs. event-based vs. all combined

Computation model dictates semantics for component interaction


Make use of separation of concerns
› e.g., component execution, workflow execution, and provenance tracking

Managers act like a “common execution environment”
› governing different concerns related to execution of network and services


CT – continuous time modeling

DE – discrete event systems

FSM – finite state machines

PN – process networks

SDF – synchronous dataflow

DDF – dynamic dataflow

SR – synchronous/reactive systems


Reusable components that execute variety of functions

Communicate with other actors in workflow through ports

Composite actor – aggregation of actors

Composite actor may have a local director
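
The aggregation idea can be sketched in a few lines of Python. This is a toy illustration only, not the Kepler or Ptolemy II API (which is Java): the `Scale`, `Offset`, and `CompositeActor` classes and the single-argument `fire(token)` method are hypothetical names chosen for the example, and the loop inside the composite merely stands in for a local director.

```python
class Scale:
    """Hypothetical atomic actor: multiplies its input token by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def fire(self, token):
        return token * self.factor


class Offset:
    """Hypothetical atomic actor: adds a constant to its input token."""

    def __init__(self, amount):
        self.amount = amount

    def fire(self, token):
        return token + self.amount


class CompositeActor:
    """Aggregation of actors that looks like a single actor from outside."""

    def __init__(self, actors):
        self.actors = list(actors)

    def fire(self, token):
        # Stand-in for a local director: fire the inner actors in order,
        # passing each output on as the next input.
        for actor in self.actors:
            token = actor.fire(token)
        return token


inner = CompositeActor([Scale(2), Offset(3)])    # computes 2 * x + 3
outer = CompositeActor([inner, Scale(10)])       # composites nest like actors

inner_result = inner.fire(5)
outer_result = outer.fire(5)
print(inner_result, outer_result)
```

Because the composite exposes the same `fire` method as an atomic actor, it can be placed inside another composite without the outer level knowing the difference.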


Top level workflows can be conceptual representation of science process

Drilling down reveals increasing levels of detail

Composing models using hierarchy promotes development of re-usable components


Each actor implements several methods
› initialize() – initializes state variables
› prefire() – indicates if actor wants to fire
› fire() – main point of execution: read inputs, produce outputs, read parameter values
› postfire() – update persistent state, see if execution complete
› wrapup()

Each director calls these methods according to its model
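
The call order of these methods can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual Ptolemy II interface: the `Counter` actor, its `calls` log, and the `run` function (a stand-in for a very simple director) are all hypothetical.

```python
class Counter:
    """Hypothetical actor that emits 1, 2, ..., limit and then stops."""

    def __init__(self, limit):
        self.limit = limit          # parameter value
        self.calls = []             # records the order of lifecycle calls

    def initialize(self):
        self.calls.append("initialize")
        self.count = 0              # state variable

    def prefire(self):
        self.calls.append("prefire")
        return True                 # this actor is always willing to fire

    def fire(self):
        self.calls.append("fire")
        return self.count + 1       # produce an output, state not yet updated

    def postfire(self):
        self.calls.append("postfire")
        self.count += 1             # update persistent state
        return self.count < self.limit   # False means execution is complete

    def wrapup(self):
        self.calls.append("wrapup")


def run(actor):
    """Toy stand-in for a director: drives one actor through its lifecycle."""
    outputs = []
    actor.initialize()
    try:
        while actor.prefire():
            outputs.append(actor.fire())
            if not actor.postfire():
                break
    finally:
        actor.wrapup()              # called even if firing raised an error
    return outputs


counter = Counter(limit=3)
result = run(counter)
print(result)
print(counter.calls)
```

A different director could call the same methods in a different pattern (for example, firing several actors in a precomputed schedule), which is the sense in which the calling order depends on the model of computation.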


Copy actor – copy files from one resource to another during execution
› Stage actor – local to remote host
› Fetch actor – remote to local host

Job execution actor – submit and run a remote job

Monitoring actor – notify user of failures

Service discovery actor – import web services from a service repository or web site

RExpression actors

MatlabExpression actors

Web services actors – given WSDL and name of an operation of a web service, dynamically customizes itself to implement and execute that method

Database connection and query actors


Ports used to produce and consume data and communicate with other actors in workflow
› Input port – data consumed by actor
› Output port – data produced by actor
› Input/output port – data both produced and consumed


Direct same input or output to more than one port

Example: direct output to
1. display actor to show intermediate results, and
2. operational actor for further processing
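
The fan-out in this example can be sketched in Python. This is an illustrative toy, not Kepler code: the `Relation` class, its `link` and `send` methods, and the two receiving functions are hypothetical names, with plain functions standing in for the input ports of the display and operational actors.

```python
class Relation:
    """Hypothetical relation: forwards each token to every linked receiver."""

    def __init__(self):
        self.receivers = []

    def link(self, receiver):
        self.receivers.append(receiver)

    def send(self, token):
        for receiver in self.receivers:
            receiver(token)


displayed = []   # what the display actor would show
processed = []   # what the operational actor computes


def display(token):
    displayed.append(f"intermediate result: {token}")


def square(token):
    processed.append(token * token)


relation = Relation()
relation.link(display)   # 1. display actor
relation.link(square)    # 2. operational actor for further processing

for token in [1, 2, 3]:
    relation.send(token)  # one output, delivered to both receivers

print(displayed)
print(processed)
```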


Execution Options:
› inside GUI
› at command-line
› distributed computing


Kepler components can be shared by exporting workflow or component into a Kepler Archive (KAR) file (extension of JAR file format)

Component Repository is centralized system for sharing Kepler workflows

Users can search for components from repository from within Vergil


Kepler provides direct access to scientific data archived in many of the commonly used data archives.
› Ex. access to data stored in Knowledge Network for Biocomplexity (KNB) Metacat server and described using Ecological Metadata Language.

Additional supported data sources
› DiGIR protocol, OPeNDAP protocol, GridFTP, JDBC, SRB, and others.


Kepler ships by default with:
› Globus actors
› GridFTP actors

No BES implementation*

Job submission to openPBS, gLite

Kepler actors capable of using Unicore by Euforia (Poznań SC)

TeraGrid gateways exist that use Kepler


Actor Data Polymorphism:
› Add numbers (int, float, double, complex)
› Add strings (concatenation)
› Add complex types (arrays, records, matrices)
› Add user-defined types
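
A rough Python sketch of what one polymorphic "Add" operation looks like is given below. It is not how Kepler implements its actors: the `add` function and the `Money` type are hypothetical, and the element-wise rule for arrays and the field-wise rule for records are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Hypothetical user-defined type that defines its own addition."""
    cents: int

    def __add__(self, other):
        return Money(self.cents + other.cents)


def add(a, b):
    """Toy 'Add' actor body: one definition, many token types."""
    if isinstance(a, list) and isinstance(b, list):
        # Assumption: arrays are added element by element.
        if len(a) != len(b):
            raise ValueError("arrays must have the same length")
        return [add(x, y) for x, y in zip(a, b)]
    if isinstance(a, dict) and isinstance(b, dict):
        # Assumption: records are added field by field on shared fields.
        return {key: add(a[key], b[key]) for key in a.keys() & b.keys()}
    # Numbers, strings, and user-defined types fall back to the '+' operator.
    return a + b


print(add(1, 2))
print(add("ab", "cd"))
print(add([1, 2], [3, 4]))
print(add({"x": 1, "y": 2}, {"x": 10, "y": 20}))
print(add(Money(150), Money(50)))
```

The point of the sketch is that the workflow designer wires up a single "Add" step and the behavior follows from the type of the data that arrives.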


Distributed execution of workflow parts (peer to peer)

Efficient data transfer

Provenance tracking of data and processes

Tracking workflow evolution

Streaming data analysis

Easy-to-deploy batch interfaces

Intuitive workflow design

Customizable semantic typing

Interoperability with other workflow and analytical environments (at exec level)


Ecology
› SEEK: Ecological Niche Modeling and climate change
› REAP: Modeling parasite invasions in grasslands using sensor networks
› NEON: Ecological sensor networks; COMET: Environmental science

Geosciences
› GEON: LiDAR data processing, Geological data integration
› NEESit: Earthquake engineering

Molecular biology
› SDM: Gene promoter identification and ScalaBLAST
› ChIP-chip: Genome-scale research; CAMERA: Metagenomics

Oceanography
› REAP: SST data processing; LOOKING/OOI CI: ocean observing CI
› ROADNet: real-time data modeling and analysis
› ATOL: Processing Phylodata; CiPRES: Phylogenetic tools

Chemistry
› Resurgence: Computational chemistry; DART/ARCHER: X-Ray crystallography

Library science
› DIGARCH: Digital preservation; UK Text Mining Center: Cheshire feature and archival

Conservation biology
› SanParks: Thresholds of Potential Concerns

Physics
› SDM: astrophysics TSI-1 and TSI-2; CPES: Plasma fusion simulation; ITER-EU: ITM fusion workflows