Post on 13-Jan-2016


Univerza v Ljubljani

Fakulteta za gradbeništvo in geodezijo

Open Workflow Infrastructure:

A Research Agenda

Vlado Stankovski, Paolo Missier, Carole Goble, Ian Taylor

WANDS workshop @ SIGMOD 2010, Indianapolis, June 6, 2010

2

Introduction

University of Ljubljana, Faculty of Civil and Geodetic Engineering, Department of Construction Informatics

Assist. Prof. in Computer Science

Research interests
• Semantic grid
• Workflow technologies

3

Complex engineering problems

Highly structured and/or massive data
• product models, accelerograms, ..

Heterogeneous devices available on the Internet
• processors, clusters, storage, sensors
• IPv6

Computationally intensive algorithms
• highly specialized, new developments
• finite elements, simulations, data mining

Collaboration needs
• knowledge exchange within the engineering community
• dynamic integration of distributed resources

4

Some end-user requirements

A system for distributed applications should
• Support development by domain specialists who are not IT experts
• Hide details of the underlying infrastructures from the end-users
• Enable open resource search and discovery
• Integrate users, services, resources and workflows beyond a single organization boundary
• Allow secure access to internal infrastructure
• Support scenarios that offer “software, data, workflows, .. as a service”
• Provide ID management for different user types
• Deliver scalability and performance for data- or process-intensive applications

5

Workflows and SOA

[Figure: Workflow Editor & Manager (Workflow Editing, Workflow Manager); Information Service; Ontology Service; nodes V1, V2, V3, V5, V6, V42, .., VN]

6

Research topics

• Independence from the underlying Distributed Computation Platforms (clusters, servers, grid/cloud infrastructures, peer-to-peer networks)
• Customization
• Harvesting and harnessing the intelligence of the communities
• Security support, ID management and provenance

7

Workflows

• Raise the level of abstraction when developing distributed applications
• Accessible to non-programmers
• More effective means for sharing knowledge, processes, communication, storage and content than individual services

8

Workflow systems

A dozen workflow systems have been developed in the past years
• Triana, Taverna, Askalon, Kepler, Windows WF Foundation, Discovery Net etc.

They address complex problems in a range of scientific and engineering fields

However, a comprehensive methodology or tool that facilitates open, generic and rapid development of distributed applications represented through the workflow methodology is still missing

9

Taverna

10

Askalon

11

Kepler

12

Windows WF Foundation

13

Recent trends for open workflow development

Yahoo Pipes • http://pipes.yahoo.com

IBM QEDWiki • http://services.alphaworks.ibm.com/qedwiki

Google Mashup Editor • http://code.google.com/gme/

Microsoft Popfly • www.popfly.com

14

Technology convergence

eScience workflows Business Process Modelling

15

Research agenda

• Towards a science gateway
• Unified views on workflow system capabilities
• Web-wide search and discovery of resources (workflows, data and services)
• Extending collaborative practices in science and industry

16

[Figure: layered architecture of the proposed open workflow infrastructure. Recoverable labels:
• Communities: Distributed Audio Retrieval Community; Medical & bioinformatics communities; AEC Professionals
• Collaboration Spaces (A Science Gateway): Data Explorer, Service Explorer, Workflow Explorer and so on (for end-users); Workflow Composition Modules (for workflow developers); Community Infrastructures Enabler (for infrastructure operators)
• Knowledge Layer: Domain Knowledge; Infrastructures & environment; Services Registry; Application Usage, Provenance; Data Registry; Metadata schemata; Software utilities; Application Usage Knowledge Generation; Social Networking Knowledge Generation
• Workflow hosting and service interfaces: Taverna WF Enactor; Triana WF Manager; Interfaces to other WF systems
• Data services; Security, ID management; Infrastructures Services (security, logging, monitoring etc.)
• Resources, middleware services, clusters, and other DCPs
• Status markers: Operational; Research]

Proofs of concept from projects like:
• OntoGrid
• InteliGrid
• DataMiningGrid
• K-WfGrid
• Discovery Net
• ..

17

End-users (engineers, scientists)
• Enhanced interfaces for resource search and discovery

Workflow developers
• Modules for workflow development, sharing and use
• Modules for the management of community shared services and infrastructures

System administrators
• Distributed metadata and knowledge sharing system
• Data management services
• Overlay security and identity management services
• Provenance services

18

Workflow composition modules

• Application Explorer (A)
• Data Manipulators (B)
• Parameter Control (C)
• Execution and Monitoring Units (D)
• Provenance Unit (E)

19

An Internet-wide Application and Service Explorer

Application available at University of Ljubljana, Slovenia

Applications available at Jožef Stefan Institute, Slovenia

Applications available at Fraunhofer Institute, Germany

20

Manipulating distributed data resources

files, databases, directories

Data available in Skopje, Macedonia

Data available in Belfast, UK

21

Parameter Control

Dynamically formed GUI
• metadata describing the end-user interface

Used to set up execution parameters
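A minimal sketch of how metadata could drive a dynamically formed interface. Everything here is an assumption made for illustration: the `ParamSpec` fields, the finite-element parameter names and the console-style "form" are hypothetical and do not reproduce the metadata schema used in the talk.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ParamSpec:
    """Hypothetical metadata entry describing one field of the end-user interface."""
    name: str
    label: str
    kind: type                      # type used to parse the raw text input
    default: Any
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def render_form(specs):
    """Return one display line per parameter, as a text stand-in for a GUI."""
    return [f"{s.label} [{s.kind.__name__}, default={s.default}]" for s in specs]


def collect_parameters(specs, user_input):
    """Turn raw user input (strings) into validated execution parameters."""
    params = {}
    for s in specs:
        raw = user_input.get(s.name)
        value = s.default if raw is None else s.kind(raw)
        if s.minimum is not None and value < s.minimum:
            raise ValueError(f"{s.name} must be >= {s.minimum}")
        if s.maximum is not None and value > s.maximum:
            raise ValueError(f"{s.name} must be <= {s.maximum}")
        params[s.name] = value
    return params


# Hypothetical metadata for a finite-element run (not taken from the slides).
FEM_SPECS = [
    ParamSpec("mesh_size", "Mesh size (m)", float, 0.5, minimum=0.01),
    ParamSpec("iterations", "Max iterations", int, 100, minimum=1, maximum=10000),
    ParamSpec("solver", "Solver name", str, "cg"),
]

params = collect_parameters(FEM_SPECS, {"mesh_size": "0.25", "iterations": "500"})
```

Because the form is generated from the metadata, adding a parameter to an application only requires a new `ParamSpec` entry, not a change to the interface code.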

22

Available computing clusters are found in Slovenia, Ireland and Germany

Resource brokering and execution monitoring

23

Provenance

Collects data about workflow execution
• where? when? which algorithm? what parameters? what data? who? for which application domain?

Positive and negative examples
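The questions above map naturally onto a record structure. The sketch below is illustrative only: the field names, the JSON serialization and the `succeeded` flag (one possible reading of "positive and negative examples") are assumptions, not the provenance model used in the projects mentioned in the talk.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class ProvenanceRecord:
    """One record per workflow execution; field names are hypothetical."""
    where: str          # host that executed the workflow
    when: str           # ISO 8601 timestamp
    algorithm: str      # which algorithm was run
    parameters: dict    # what parameters were used
    data: list          # what input data was used
    who: str            # user who started the run
    domain: str         # application domain
    succeeded: bool     # assumed meaning of positive / negative example


def record_run(where, algorithm, parameters, data, who, domain, succeeded, when=None):
    """Build a record, stamping the current UTC time unless one is supplied."""
    when = when or datetime.now(timezone.utc).isoformat()
    return ProvenanceRecord(where, when, algorithm, dict(parameters),
                            list(data), who, domain, succeeded)


def to_json(record):
    """Serialize a record so that it can be stored or shared."""
    return json.dumps(asdict(record), sort_keys=True)


# Illustrative values; only the host name appears elsewhere in the slides.
example = record_run(
    where="kanin.fgg.uni-lj.si",
    algorithm="finite-element-solver",
    parameters={"mesh_size": 0.25},
    data=["bridge_model.xml"],
    who="engineer01",
    domain="AEC",
    succeeded=True,
    when="2010-06-06T12:00:00+00:00",
)
```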

24

[Figure: grid testbed forming a virtual organization. Recoverable labels, translated from Slovenian:
• Grid system user (Triana or grid portal); grid application (basic or composite workflow); multi-job
• Resource broker; ontology services; information integration service
• Servers grid1.fraunhofer.de, grid2.fraunhofer.de, matrix.ulster.ac.uk and kanin.fgg.uni-lj.si; GT4; WS-GRAM
• Condor masters; cluster of 4 computers; cluster of 20-40 computers; two clusters, 80 computers
• Executable program and libraries; data]

Location                         Number of processors
University of Ulster             50–70
University of Ulster             4–10
Germany, Fraunhofer Institute    4
University of Ljubljana          20–40
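A minimal sketch of what a resource broker could do with the processor counts listed above. The selection rule (pick the smallest cluster whose lower bound still covers the request) is an assumption chosen for illustration; the talk does not describe the actual brokering policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    location: str
    min_processors: int   # lower bound of the range given in the table
    max_processors: int   # upper bound of the range given in the table


# Processor ranges as listed in the table above.
CLUSTERS = [
    Cluster("University of Ulster", 50, 70),
    Cluster("University of Ulster", 4, 10),
    Cluster("Germany, Fraunhofer Institute", 4, 4),
    Cluster("University of Ljubljana", 20, 40),
]


def select_cluster(clusters, processors_needed):
    """Return the smallest cluster that covers the request, or None.

    Assumed policy: only the lower bound of each range is treated as
    guaranteed, and ties are broken by the smaller upper bound.
    """
    candidates = [c for c in clusters if c.min_processors >= processors_needed]
    if not candidates:
        return None
    return min(candidates, key=lambda c: (c.min_processors, c.max_processors))


chosen = select_cluster(CLUSTERS, 16)
```

Under this policy a job that needs 16 processors goes to the University of Ljubljana cluster, while a job that needs more than 50 is rejected rather than queued.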

25

Research agenda

Large scale integration of resources, services and infrastructures
• Identify infrastructures and define models for their integration, sharing and use
• Define public interfaces for workflow enactment

Metadata schemata
• Design and implement an annotation pipeline for harnessing collective (domain) intelligence
• Describe applications and services in a uniform way
• Describe data
• Provenance models for distributed applications

26

Key question

How to make the knowledge about distributed applications explicit so that it can be shared within and among the communities?

Workflow technologies have the potential to address many of the identified requirements

27

Thank you!

Q&A

28

Workflows

Ian Taylor et al., 2007
• “A workflow is a sequence of connected services that is executed in suitably orchestrated steps”

They connect data, analytical tools and services for the simulation and visualization of results/models
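The definition of a workflow as a sequence of connected services can be sketched in a few lines. The three services below are hypothetical stand-ins (a real deployment would call remote data, analysis and visualization services), and `run_workflow` is an illustrative helper, not part of Triana or any other system named in this talk.

```python
from typing import Any, Callable, List, Tuple

Service = Callable[[Any], Any]


def run_workflow(services: List[Service], initial_input: Any) -> Tuple[Any, list]:
    """Execute services in order, feeding each output into the next service."""
    value = initial_input
    trace = []
    for service in services:
        value = service(value)
        trace.append((service.__name__, value))   # keep a simple execution log
    return value, trace


def load_data(source):
    """Stand-in for a data service; returns fixed samples for the sketch."""
    return [3.0, 1.0, 2.0]


def analyse(samples):
    """Stand-in for an analytical tool."""
    return {"min": min(samples), "max": max(samples),
            "mean": sum(samples) / len(samples)}


def visualise(summary):
    """Stand-in for a visualization service; renders a one-line report."""
    return f"min={summary['min']} max={summary['max']} mean={summary['mean']}"


final, trace = run_workflow([load_data, analyse, visualise], "sensor-archive")
```

The trace collected along the way is the kind of information a provenance unit could store for each step.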

29

Technology convergence

.. from two fields:

• Business process modelling
• Scientific and engineering computations

30

WS-BPEL 2.0

• A rich language for orchestrating business as well as scientific and engineering workflows
• A BPEL process specifies the exact order in which the participating services are invoked
• Supports loops, variable declarations, value assignment, exception/fault handling, parallel and sequential execution, etc.
• Does not support conditional execution of loops!
• Supports the development of ACID (Atomicity, Consistency, Isolation, Durability) transactions
• Most systems that support BPEL also provide persistent workflow states
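The constructs listed above can be illustrated with a small Python analogy. This is not BPEL and not a BPEL engine: the helpers `sequence`, `flow` and `scope` only mimic the roles of the WS-BPEL `<sequence>`, `<flow>` and `<scope>`/`<faultHandlers>` elements, and the service functions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def sequence(*steps):
    """Run steps one after another, passing the context along."""
    def run(ctx):
        for step in steps:
            ctx = step(ctx)
        return ctx
    return run


def flow(*branches):
    """Run branches in parallel on copies of the context, then merge results."""
    def run(ctx):
        with ThreadPoolExecutor(max_workers=len(branches)) as pool:
            results = list(pool.map(lambda branch: branch(dict(ctx)), branches))
        merged = dict(ctx)
        for result in results:
            merged.update(result)
        return merged
    return run


def scope(activity, fault_handler):
    """Run an activity; on failure hand the context and error to the handler."""
    def run(ctx):
        try:
            return activity(ctx)
        except Exception as exc:
            return fault_handler(ctx, exc)
    return run


# Hypothetical services used only to exercise the constructs.
def assign_load(ctx):
    return {**ctx, "load": 10}

def stress(ctx):
    return {**ctx, "stress": ctx["load"] * 2}

def deflection(ctx):
    return {**ctx, "deflection": ctx["load"] / 4}

def failing_service(ctx):
    raise RuntimeError("service unavailable")

def on_fault(ctx, exc):
    return {**ctx, "fault": str(exc)}


process = sequence(assign_load, flow(stress, deflection),
                   scope(failing_service, on_fault))
result = process({})
```

The fault raised inside the scope is recorded in the context instead of aborting the whole process, which is the behaviour fault handlers are meant to provide.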

31

Project proposals

• YouCreate: Create, Share and Enact Pervasive Workflow-Centric Applications with Colleagues and Friends
• WF-Tube: A Workflow Tube
• WorkflowOmnibus, OurFactory
• ... and so on.

32

Open architecture

[Figure: three-layer open architecture. Recoverable labels, translated from Slovenian:
• Client layer (workflow editor and manager, and grid portals): Application explorer; Data explorer; Data manipulator; Application settings; Execution manager; Monitoring module; Provenance module; Credential generator; Resource upload; Other units; Grid portals
• Web and grid services layer: Information integration service; Data services; Resource broker; Other services; Security; Monitoring and discovery system (MDS4); RFT and GridFTP; Execution management (WS-GRAM)
• Resource layer (hardware and software): Networks; Servers (central processing units); Files and directories; Computing clusters; Local job scheduler; Storage; Databases; Executable programs]

33

Triana

Workflow editor and manager
• graphical
• flexible
• simple composition and execution of workflows

Open-source solution: http://trianacode.org

34

Triana supports (summary)

• Web (SOAP, WSDL) and grid (WSRF) services
• Peer-to-peer services (Jxta, P2PS)
• Grid services through the Grid Application Toolkit (e.g. SAGA)
• Services for knowledge discovery in data (e.g. DataMiningGrid)
• Local Java tools and executable programs
• Flexible access to HTTP services

Mash-up (hybrid) services are both simple and trivial to implement

35

Workflow manager

36

Light clients

37

Conclusion

• Complex applications: workflow technologies provide greater control
• Costly development: no need to start from scratch
• Valuable acquired knowledge, expensive resources: knowledge can be shared; resources are better utilized
• Applications mostly depend on individual systems: a workflow can be taken over regardless of the system on which it will run
• User demands for greater flexibility and adaptability: a higher level of abstraction; rapid development of innovative and powerful applications
• Requirements for security and provenance tracking: workflows enable a fully open approach to provenance (Open Provenance)

38

Q&As

39

[Figure: extended layered architecture. Recoverable labels:
• Communities: Distributed Audio Retrieval Community; Medical & bioinformatics communities; Private EDA Industrial Applications; AEC Professionals
• User Interfaces / Domain oriented views on resources and applications; Collaboration Spaces; Application Composition Modules (for application developers); Community Infrastructures Enabler (for system administrators)
• Knowledge Layer: Domain Knowledge; Infrastructures & environment metadata; Service Utilities Registry; Application Usage, Provenance & Trust; Data Registry; Metadata Bus; Application Usage Knowledge Generation; Social Networking Knowledge Generation
• Application (Workflow) Hosting Layer: Taverna WF Enactor; Triana WF Manager; Other WF systems; (New) Distributed Applications
• Data Services; Data and Software; Security, ID management; Infrastructures Services (security, logging, monitoring etc.)
• Resources, Middleware Services and Infrastructures Layer]

40

Workflow example (music)
