Univerza v Ljubljani Fakulteta za gradbeništvo in geodezijo Open Workflow Infrastructure: A Research Agenda Vlado Stankovski, Paolo Missier, Carole Goble, Ian Taylor WANDS workshop @SIGMOD 2010 Indianapolis, June 6, 2010

Introduction

University of Ljubljana, Faculty of Civil and Geodetic Engineering, Department of Construction Informatics
Assist. Prof. in Computer Science
Research interests
• Semantic grid
• Workflow technologies

Complex engineering problems

Highly structured and/or massive data
• product models, accelerograms, ...

Heterogeneous devices available on the Internet
• processors, clusters, storage, sensors
• IPv6

Computationally intensive algorithms
• highly specialized, new developments
• Finite Elements, simulations, data mining

Collaboration needs
• knowledge exchange within the engineering community
• dynamic integration of distributed resources

Some end-user requirements

A system for distributed applications should
• Support development by domain specialists, who are not IT experts
• Hide details of the underlying infrastructures from the end-users
• Make open resource search and discovery possible
• Integrate users, services, resources, workflows beyond a single organization boundary
• Allow secure access to internal infrastructure
• Support scenarios to offer “software, data, workflows, ... as a service”
• Support ID management and different user types
• Provide scalability and performance for data- or process-intensive applications

Workflows and SOA

[Figure: diagram with the labels Workflow Editor & Manager (Workflow Editing, Workflow Manager), Information Service, Ontology Service, and nodes V1, V2, V3, V5, V6, V42, ..., VN]

Research topics

Independence from the underlying Distributed Computation Platforms (clusters, servers, grid/cloud infrastructures, peer-to-peer networks)
Customization
Harvesting and harnessing the intelligence of the communities
Security support, ID management and provenance

Workflows

Raise the level of abstraction when developing distributed applications
Accessible to non-programmers
More effective means for sharing knowledge, processes, communication, storage and content than individual services

Workflow systems

A dozen workflow systems have been developed in the past years
• Triana, Taverna, Askalon, Kepler, Windows WF Foundation, Discovery Net etc.

They address complex problems in a range of scientific and engineering fields

However, a comprehensive methodology or tool that facilitates open, generic and rapid development of distributed applications represented through the workflow methodology is still missing

Taverna

Askalon

Kepler

Windows WF Foundation

Recent trends for open workflow development

Yahoo Pipes
• http://pipes.yahoo.com

IBM QEDWiki
• http://services.alphaworks.ibm.com/qedwiki

Google Mashup Editor
• http://code.google.com/gme/

Microsoft Popfly
• www.popfly.com

Technology convergence

eScience workflows Business Process Modelling

Research agenda

Towards a science gateway
Unified views on workflow system capabilities
Web-wide search and discovery of resources (workflows, data and services)
Extending collaborative practices in science and industry

[Figure: layered architecture of the open workflow infrastructure. Labels recovered from the diagram:]
Communities: Distributed Audio Retrieval Community; Medical & bioinformatics communities; AEC Professionals
Data Explorer, Service Explorer, Workflow Explorer and so on (for end-users)
Workflow Composition Modules (for workflow developers)
Community Infrastructures Enabler (for infrastructure operators)
Collaboration Spaces (A Science Gateway)
Workflow hosting and service interfaces: Taverna WF Enactor, Triana WF Manager, interfaces to other WF systems
Knowledge Layer: Domain Knowledge; Infrastructures & environment; Services Registry; Application Usage, Provenance; Data Registry; Metadata schemata; Application Usage Knowledge Generation; Social Networking Knowledge Generation
Data services; Software utilities; Security, ID management
Infrastructures Services (security, logging, monitoring etc.)
Resources, middleware services, clusters, and other DCPs
Legend: Operational; Research

Proofs of concept from projects like:
• OntoGrid
• InteliGrid
• DataMiningGrid
• K-WfGrid
• Discovery Net
• ...

End-users: engineers, scientists
• Enhanced interfaces for resource search and discovery

Workflow developers
• Modules for workflow development, sharing and use
• Modules for the management of community shared services and infrastructures

System administrators
• Distributed metadata and knowledge sharing system
• Data management services
• Overlay security and identity management services
• Provenance Services

Workflow composition modules

Application Explorer (A)
Data Manipulators (B)
Parameter Control (C)
Execution and Monitoring Units (D)
Provenance Unit (E)

An Internet-wide Application and Service Explorer

Applications available at University of Ljubljana, Slovenia

Applications available at Jožef Stefan Institute, Slovenia

Applications available at Fraunhofer Institute, Germany

Manipulating distributed data resources

files, databases, directories

Data available in Skopje, Macedonia

Data available in Belfast, UK

Parameter Control

Dynamically formed GUI
• metadata describing the end-user interface

Used to set up execution parameters
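
A minimal sketch of how a GUI could be formed from metadata and then used to set up execution parameters. The metadata layout, the parameter names and both helper functions are assumptions made for illustration; the slides do not specify the actual schema.

```python
from typing import Any, Dict, List

# Hypothetical metadata describing the end-user interface of one application.
PARAMETER_METADATA: List[Dict[str, Any]] = [
    {"name": "iterations", "label": "Number of iterations",
     "type": "int", "default": 100, "min": 1, "max": 10000},
    {"name": "tolerance", "label": "Convergence tolerance",
     "type": "float", "default": 1e-6},
    {"name": "solver", "label": "Solver",
     "type": "choice", "options": ["direct", "iterative"], "default": "direct"},
]

_CASTS = {"int": int, "float": float, "str": str}


def describe_form(metadata: List[Dict[str, Any]]) -> List[str]:
    """Derive one form field description per parameter (stand-in for a real GUI)."""
    fields = []
    for param in metadata:
        widget = "dropdown" if param["type"] == "choice" else "text box"
        fields.append(f'{param["label"]} [{widget}, default={param["default"]}]')
    return fields


def resolve_parameters(metadata: List[Dict[str, Any]],
                       user_input: Dict[str, Any]) -> Dict[str, Any]:
    """Merge user input with defaults and validate it against the metadata."""
    values: Dict[str, Any] = {}
    for param in metadata:
        raw = user_input.get(param["name"], param["default"])
        if param["type"] == "choice":
            if raw not in param["options"]:
                raise ValueError(f'{param["name"]}: {raw!r} is not an allowed option')
            value = raw
        else:
            value = _CASTS[param["type"]](raw)
            if "min" in param and value < param["min"]:
                raise ValueError(f'{param["name"]}: {value} is below the minimum')
            if "max" in param and value > param["max"]:
                raise ValueError(f'{param["name"]}: {value} is above the maximum')
        values[param["name"]] = value
    return values
```

Because the form is derived from metadata, adding a parameter to an application only requires a new metadata entry and no change to the client code.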

Available computing clusters are found in Slovenia, Ireland and Germany

Resource brokering and execution monitoring

Provenance

Collects data about workflow execution
• where? when? which algorithm? what parameters? what data? who? for which application domain?

Positive and negative examples
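
The questions above map naturally onto a record type. The following is a sketch only: the field names, the in-memory store and the sample values are illustrative assumptions, and "positive" is interpreted here as a run that is kept as a good example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass(frozen=True)
class ProvenanceRecord:
    """One workflow execution, described by the questions listed on the slide."""
    where: str                   # execution site
    when: datetime               # start time
    algorithm: str               # which algorithm
    parameters: Dict[str, Any]   # what parameters
    data: List[str]              # what data (identifiers of the inputs)
    who: str                     # user who started the run
    domain: str                  # application domain
    positive: bool               # positive (True) or negative (False) example


class ProvenanceStore:
    """Hypothetical in-memory store; a real service would persist its records."""

    def __init__(self) -> None:
        self._records: List[ProvenanceRecord] = []

    def add(self, record: ProvenanceRecord) -> None:
        self._records.append(record)

    def examples(self, positive: bool) -> List[ProvenanceRecord]:
        return [r for r in self._records if r.positive == positive]


# Made-up sample runs, for illustration only.
store = ProvenanceStore()
store.add(ProvenanceRecord(
    where="kanin.fgg.uni-lj.si",
    when=datetime(2010, 6, 6, 9, 0, tzinfo=timezone.utc),
    algorithm="finite-element-solver",
    parameters={"mesh_size": 0.5},
    data=["model-001"],
    who="engineer-a",
    domain="civil engineering",
    positive=True,
))
store.add(ProvenanceRecord(
    where="matrix.ulster.ac.uk",
    when=datetime(2010, 6, 6, 10, 0, tzinfo=timezone.utc),
    algorithm="decision-tree",
    parameters={"max_depth": 3},
    data=["dataset-007"],
    who="scientist-b",
    domain="data mining",
    positive=False,
))
```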

[Figure: test-bed deployment of the grid system. Labels translated from Slovenian:]
Grid system user (Triana or grid portal)
Grid application (basic or composite workflow)
Virtual organisation
Resource broker
Ontology services
Information integration service
Executable program and libraries; data; multi-job
Servers: grid1.fraunhofer.de, grid2.fraunhofer.de, matrix.ulster.ac.uk, kanin.fgg.uni-lj.si (GT4, WS-GRAM)
Clusters (Condor master nodes): cluster of 4 computers; cluster of 20-40 computers; two clusters, 80 computers

Location                         Number of processors
University of Ulster             50–70
University of Ulster             4–10
Germany, Fraunhofer Institute    4
University of Ljubljana          20–40
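
A sketch of the kind of decision a resource broker could take over these sites. The processor ranges are the ones listed above for Ulster, Fraunhofer and Ljubljana; the `Site` type and the selection policy (the smallest site whose minimum processor count covers the request) are assumptions, not the broker used in the test bed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Site:
    location: str
    min_processors: int
    max_processors: int


# Processor ranges as listed in the table above.
SITES: List[Site] = [
    Site("University of Ulster", 50, 70),
    Site("University of Ulster", 4, 10),
    Site("Fraunhofer Institute, Germany", 4, 4),
    Site("University of Ljubljana", 20, 40),
]


def select_site(sites: List[Site], required: int) -> Optional[Site]:
    """Return the smallest site whose minimum processor count covers the request.

    Assumed policy: keep larger sites free for larger jobs. Returns None
    when no site can guarantee the requested number of processors.
    """
    candidates = [s for s in sites if s.min_processors >= required]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.min_processors)
```

Under this policy a 16-processor job is placed in Ljubljana, while a 100-processor request is rejected because no site guarantees that many processors.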

Research agenda

Large scale integration of resources, services and infrastructures
• Identify infrastructures and define models for their integration, sharing and use
• Define public interfaces for workflow enactment

Metadata schemata
• Design and implement an annotation pipeline for harnessing collective (domain) intelligence
• Describe applications and services in a uniform way
• Describe data
• Provenance models for distributed applications
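
To illustrate what a uniform description of applications, services and data could enable, here is a minimal sketch of one schema for all three resource kinds, with a search over a registry. The schema fields, the registry entries and the `search` function are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class ResourceDescription:
    """One uniform description for workflows, data and services."""
    kind: str                  # "workflow", "data" or "service"
    name: str
    provider: str
    keywords: FrozenSet[str]   # lower-case annotation terms


def search(registry: Iterable[ResourceDescription],
           kind: Optional[str] = None,
           keyword: Optional[str] = None) -> List[ResourceDescription]:
    """Return the descriptions that match the optional kind and keyword filters."""
    hits = []
    for resource in registry:
        if kind is not None and resource.kind != kind:
            continue
        if keyword is not None and keyword.lower() not in resource.keywords:
            continue
        hits.append(resource)
    return hits


# Made-up registry entries, for illustration only.
REGISTRY = [
    ResourceDescription("service", "fe-solver", "University of Ljubljana",
                        frozenset({"finite elements", "simulation"})),
    ResourceDescription("data", "accelerograms", "Skopje",
                        frozenset({"earthquake", "time series"})),
    ResourceDescription("workflow", "seismic-analysis", "University of Ljubljana",
                        frozenset({"earthquake", "simulation"})),
]
```

Because every resource kind shares one schema, a single query such as `search(REGISTRY, keyword="earthquake")` can return data and workflows together.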

Key question

How to make the knowledge about distributed applications explicit so that it can be shared within and among the communities?

Workflow technologies have the potential to address many of the identified requirements

Thank you!

Q&A

Workflows

Ian Taylor et al., 2007
• “A workflow is a sequence of connected services that is executed in suitably orchestrated steps”

They connect data, analytical tools and services for simulation and visualisation of results/models
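
The definition above can be sketched in a few lines: a workflow is a list of named services, executed one step at a time, with each step consuming the output of the previous one. The three services (data, analysis, visualisation) and their names are hypothetical stand-ins for remote services.

```python
from typing import Any, Callable, List, Tuple

Service = Callable[[Any], Any]


def run_workflow(steps: List[Tuple[str, Service]], data: Any = None) -> Tuple[Any, List[str]]:
    """Execute the services in order, passing each output to the next step."""
    trace: List[str] = []
    for name, service in steps:
        data = service(data)
        trace.append(name)   # record the order in which the steps ran
    return data, trace


# Hypothetical services: a data source, an analytical tool and a visualisation step.
def load_samples(_: Any) -> List[float]:
    return [1.0, 2.0, 3.0, 6.0]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def render(value: float) -> str:
    return f"mean = {value:.2f}"


WORKFLOW = [("data", load_samples), ("analysis", mean), ("visualisation", render)]
result, trace = run_workflow(WORKFLOW)
```

Swapping one service for another with the same input and output type leaves the rest of the workflow unchanged.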

Technology convergence

.. from two fields:

Business process modelling
Scientific and engineering computations

WS-BPEL 2.0

A rich language for orchestrating business as well as scientific and engineering workflows
A BPEL process defines the exact order in which the participating services are invoked
It supports loops, variable declarations, value assignment, exception/fault handling, parallel and sequential execution etc. It does not support conditional execution of loops!
It supports the development of ACID (Atomicity, Consistency, Isolation, Durability) transactions
Most systems that support BPEL also provide persistent workflow states
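
The sequential execution, parallel execution and fault handling mentioned above correspond to the WS-BPEL `<sequence>`, `<flow>` and `<scope>` (with fault handlers) constructs. The sketch below is a Python analogy of those three ideas, not BPEL and not a BPEL engine; the combinator names simply mirror the BPEL elements.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List

Activity = Callable[[Any], Any]


def sequence(*activities: Activity) -> Activity:
    """Run the activities one after another, chaining their outputs."""
    def run(value: Any) -> Any:
        for activity in activities:
            value = activity(value)
        return value
    return run


def flow(*activities: Activity) -> Activity:
    """Run the activities in parallel on the same input; results keep declaration order."""
    def run(value: Any) -> List[Any]:
        with ThreadPoolExecutor(max_workers=len(activities)) as pool:
            futures = [pool.submit(activity, value) for activity in activities]
            return [future.result() for future in futures]
    return run


def scope(activity: Activity, fault_handler: Callable[[Any, Exception], Any]) -> Activity:
    """Run an activity and pass any raised exception to the fault handler."""
    def run(value: Any) -> Any:
        try:
            return activity(value)
        except Exception as fault:
            return fault_handler(value, fault)
    return run


# Usage: increment, then double and triple in parallel, then add the two results.
process = sequence(lambda x: x + 1, flow(lambda x: x * 2, lambda x: x * 3), sum)
safe_inverse = scope(lambda x: 1 / x, lambda value, fault: 0.0)
```

Here `process(1)` evaluates to 10 (2, then 4 and 6 in parallel, then their sum), and `safe_inverse(0)` returns the handler's fallback value instead of raising.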

Project proposals

YouCreate: Create, Share and Enact Pervasive Workflow-Centric Applications with Colleagues and Friends
WF-Tube: A Workflow Tube
WorkflowOmnibus, OurFactory
... and so on.

Open architecture

[Figure: three-layer architecture. Labels translated from Slovenian:]

Client layer: workflow editor and manager, and grid portals
• Application explorer
• Data explorer
• Data manipulator
• Application settings
• Execution manager
• Monitoring module
• Provenance module
• Credential generator
• Resource upload
• Other units
• Grid portals

Web and grid services layer
• Information integration service
• Data services
• Resource broker
• Other services
• Security
• Monitoring and Discovery System (MDS4)
• RFT and GridFTP
• Execution management (WS-GRAM)

Resource layer: hardware and software
• Networks
• Servers (central processing units)
• Computing clusters
• Local job scheduler
• Storage
• Files and directories
• Databases
• Executable programs

Triana

Workflow editor and manager
• graphical
• flexible
• simple composition and execution of workflows

open-source solution http://trianacode.org

Triana supports (summary)

Web (SOAP, WSDL) and grid (WSRF) services
Peer-to-peer services (Jxta, P2PS)
Grid services via the Grid Application Toolkit (e.g. SAGA)
Services for knowledge discovery in data (e.g. DataMiningGrid)
Local Java tools and executable programs
Flexible access to HTTP services

Mash-up (hybrid) services are both simple and trivial to implement

Workflow manager

Light clients

Conclusion

complex applications
• workflow technologies provide greater control

expensive development
• we do not have to start from scratch

valuable acquired knowledge, expensive resources
• knowledge can be shared
• better utilisation of resources

applications mostly depend on individual systems
• a workflow can be adopted regardless of the system on which it will run

user demands for greater flexibility and adaptability
• higher level of abstraction
• rapid development of innovative and powerful applications

requirements for security and provenance tracking
• workflows enable a fully open approach to provenance (Open Provenance)

Q&As

[Figure: layered architecture, extended version. Labels recovered from the diagram:]
Communities: Distributed Audio Retrieval Community; Medical & bioinformatics communities; Private EDA Industrial Applications; AEC Professionals
User Interfaces / Domain oriented views on resources and applications
Application Composition Modules (for application developers)
Community Infrastructures Enabler (for system administrators)
Collaboration Spaces
(New) Distributed Applications
Application (Workflow) Hosting Layer: Taverna WF Enactor, Triana WF Manager, other WF systems
Metadata Bus
Knowledge Layer: Domain Knowledge; Infrastructures & environment metadata; Service Utilities Registry; Application Usage, Provenance & Trust; Data Registry; Application Usage Knowledge Generation; Social Networking Knowledge Generation
Data Services; Data and Software; Security, ID management
Infrastructures Services (security, logging, monitoring etc.)
Resources, Middleware Services and Infrastructures Layer

Workflow example (music)