University of Ljubljana
Faculty of Civil and Geodetic Engineering
Open Workflow Infrastructure:
A Research Agenda
Vlado Stankovski, Paolo Missier, Carole Goble, Ian Taylor
WANDS workshop @ SIGMOD 2010, Indianapolis, June 6, 2010
2
Introduction
University of Ljubljana, Faculty of Civil and Geodetic Engineering, Department of Construction Informatics
Assist. Prof. in Computer Science
Research interests
• Semantic grid
• Workflow technologies
3
Complex engineering problems
Highly structured and/or massive data
• product models, accelerograms, ..
Heterogeneous devices available on the Internet
• processors, clusters, storage, sensors
• IPv6
Computationally intensive algorithms
• highly specialized, new developments
Finite Elements, simulations, data mining
Collaboration needs
• knowledge exchange within the engineering community
• dynamic integration of distributed resources
4
Some end-user requirements
A system for distributed applications should
• Support development by domain specialists who are not IT experts
• Hide details of the underlying infrastructures from the end-users
• Make open resource search and discovery possible
• Integrate users, services, resources and workflows beyond a single organization boundary
• Allow secure access to internal infrastructure
• Support scenarios that offer "software, data, workflows, .. as a service"
• Provide ID management for different user types
• Scale and perform well for data- or process-intensive applications
5
Workflows and SOA
[Figure: a Workflow Editor & Manager (Workflow Editing, Workflow Manager) connected to an Information Service, an Ontology Service and nodes labelled V1, V2, V3, V5, V6, V42, .. VN]
6
Research topics
• Independence from the underlying Distributed Computation Platforms (clusters, servers, grid/cloud infrastructures, peer-to-peer networks)
• Customization
• Harvesting and harnessing the intelligence of the communities
• Security support, ID management and provenance
7
Workflows
• Raise the level of abstraction when developing distributed applications
• Accessible to non-programmers
• More effective means for sharing knowledge, processes, communication, storage and content than individual services
8
Workflow systems
A dozen workflow systems have been developed in the past years
• Triana, Taverna, Askalon, Kepler, Windows WF Foundation, Discovery Net etc.
They address complex problems in a range of scientific and engineering fields
However, a comprehensive methodology or tool that facilitates open, generic and rapid development of distributed applications represented through the workflow methodology is still missing
9
Taverna
10
Askalon
11
Kepler
12
Windows WF Foundation
13
Recent trends for open workflow development
Yahoo Pipes • http://pipes.yahoo.com
IBM QEDWiki • http://services.alphaworks.ibm.com/qedwiki
Google Mashup Editor • http://code.google.com/gme/
Microsoft Popfly • www.popfly.com
14
Technology convergence
eScience workflows and Business Process Modelling
15
Research agenda
Towards a science gateway
• Unified views on workflow system capabilities
• Web-wide search and discovery of resources (workflows, data and services)
• Extending collaborative practices in science and industry
16
[Figure: architecture of the science gateway. Recoverable labels, grouped approximately:]
Communities: Distributed Audio Retrieval Community; Medical & bioinformatics communities; AEC Professionals
Collaboration Spaces (A Science Gateway)
• Data Explorer, Service Explorer, Workflow Explorer and so on (for end-users)
• Workflow Composition Modules (for workflow developers)
• Community Infrastructures Enabler (for infrastructure operators)
Workflow hosting and service interfaces
• Taverna WF Enactor, Triana WF Manager, interfaces to other WF systems
Knowledge Layer
• Domain Knowledge; Infrastructures & environment; Metadata schemata
• Services Registry; Data Registry; Application Usage, Provenance
• Application Usage Knowledge Generation; Social Networking Knowledge Generation
Data services; Software utilities; Security, ID management
Infrastructures Services (security, logging, monitoring etc.)
Resources, middleware services, clusters, and other DCPs
Legend: Operational, Research
Proofs of concept from projects like:
• OntoGrid
• InteliGrid
• DataMiningGrid
• K-WfGrid
• Discovery Net
• ..
17
End-users
Engineers, scientists
• Enhanced interfaces for resource search and discovery
Workflow developers
• Modules for workflow development, sharing and use
• Modules for the management of community shared services and infrastructures
System administrators
• Distributed metadata and knowledge sharing system
• Data management services
• Overlay security and identity management services
• Provenance Services
18
Workflow composition modules
• Application Explorer (A)
• Data Manipulators (B)
• Parameter Control (C)
• Execution and Monitoring Units (D)
• Provenance Unit (E)
19
An Internet-wide Application and Service Explorer
Application available at University of Ljubljana, Slovenia
Applications available at Jožef Stefan Institute, Slovenia
Applications available at Fraunhofer Institute, Germany
20
Manipulating distributed data resources
files, databases, directories
Data available in Skopje, Macedonia
Data available in Belfast, UK
21
Parameter Control
Dynamically formed GUI
• metadata describing the end-user interface
Used to set up execution parameters
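
The idea of forming the interface from metadata can be sketched as follows. This is only an illustrative sketch in Python, not the mechanism used in the presented system: `ParameterSpec`, `build_form`, `collect_parameters` and the two example parameters are hypothetical names introduced here, and a list of text lines stands in for a real GUI.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ParameterSpec:
    """Hypothetical metadata entry describing one end-user input field."""
    name: str
    kind: type      # Python type used to parse the raw user input
    default: Any    # value used when the user leaves the field empty
    label: str      # text shown next to the field


def build_form(specs: List[ParameterSpec]) -> List[str]:
    """Render one text line per field, standing in for a generated GUI."""
    return [f"{s.label} [{s.kind.__name__}, default={s.default}]" for s in specs]


def collect_parameters(specs: List[ParameterSpec], raw: Dict[str, str]) -> Dict[str, Any]:
    """Parse user input against the metadata, falling back to defaults."""
    values: Dict[str, Any] = {}
    for s in specs:
        values[s.name] = s.kind(raw[s.name]) if s.name in raw else s.default
    return values


# Illustrative metadata for an imaginary application (assumption, not from the slides).
SPECS = [
    ParameterSpec("iterations", int, 100, "Number of iterations"),
    ParameterSpec("tolerance", float, 1e-3, "Convergence tolerance"),
]

if __name__ == "__main__":
    print("\n".join(build_form(SPECS)))
    print(collect_parameters(SPECS, {"iterations": "250"}))
```

Because the form is derived from the metadata alone, a new application only needs a new list of specifications, not new interface code.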
22
Available computing clusters are found in Slovenia, Ireland and Germany
Resource brokering and execution monitoring
23
Provenance
Collects data about workflow execution
• where? when? which algorithm? what parameters? what data? who? for which application domain?
Positive and negative examples
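
A provenance record answering the questions above might look like the following Python sketch. The class and field names are hypothetical, and reading "positive and negative examples" as successful and failed executions is an assumption made here for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class ProvenanceRecord:
    """One workflow execution, described by the questions on the slide."""
    where: str                  # execution site
    algorithm: str              # which algorithm
    parameters: Dict[str, Any]  # what parameters
    data: List[str]             # what data
    who: str                    # which user
    domain: str                 # which application domain
    succeeded: bool             # assumption: positive vs. negative example
    when: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class ProvenanceLog:
    """In-memory store; a real service would persist the records."""

    def __init__(self) -> None:
        self.records: List[ProvenanceRecord] = []

    def record(self, **fields: Any) -> ProvenanceRecord:
        entry = ProvenanceRecord(**fields)
        self.records.append(entry)
        return entry

    def examples(self, positive: bool) -> List[ProvenanceRecord]:
        """Return the successful (positive) or failed (negative) executions."""
        return [r for r in self.records if r.succeeded == positive]


if __name__ == "__main__":
    log = ProvenanceLog()
    # All values below are made up for the example.
    log.record(where="site-a", algorithm="clustering", parameters={"k": 3},
               data=["input.csv"], who="alice", domain="engineering",
               succeeded=True)
    log.record(where="site-b", algorithm="clustering", parameters={"k": 0},
               data=["input.csv"], who="alice", domain="engineering",
               succeeded=False)
    print(len(log.examples(positive=True)), len(log.examples(positive=False)))
```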
24
[Figure: test bed of the virtual organization. Recoverable labels:]
Grid system user (Triana or grid portal)
Grid application (basic or composite workflow); multi-job
Virtual organization: ontology services, resource broker, information integration service
Servers: kanin.fgg.uni-lj.si (Kanin), matrix.ulster.ac.uk (Matrix), grid1.fraunhofer.de (Grid1), grid2.fraunhofer.de (Grid2)
Middleware labels: GT4, WS-GRAM, Condor master
Clusters: a cluster of 4 computers; a cluster of 20-40 computers; two clusters, 80 computers
Deployed on the resources: executable program and libraries; data
Location                         Number of processors
University of Ulster             50–70
University of Ulster             4–10
Germany, Fraunhofer Institute    4
University of Ljubljana          20–40
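
How a resource broker might choose among these clusters can be sketched in Python as below. The processor ranges are taken from the table above; the selection rule (smallest cluster whose minimum processor count covers the request) and all names are assumptions for illustration, not the broker used in the test bed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cluster:
    location: str
    min_processors: int  # lower bound of the range in the table
    max_processors: int  # upper bound of the range in the table


CLUSTERS = [
    Cluster("University of Ulster", 50, 70),
    Cluster("University of Ulster", 4, 10),
    Cluster("Germany, Fraunhofer Institute", 4, 4),
    Cluster("University of Ljubljana", 20, 40),
]


def select_cluster(clusters: List[Cluster], needed: int) -> Optional[Cluster]:
    """Pick the smallest cluster whose minimum processor count covers the request.

    Returns None when no cluster can guarantee the requested number.
    """
    candidates = [c for c in clusters if c.min_processors >= needed]
    return min(
        candidates,
        key=lambda c: (c.min_processors, c.max_processors),
        default=None,
    )


if __name__ == "__main__":
    for request in (16, 45, 60):
        print(request, select_cluster(CLUSTERS, request))
```

Using the lower bound is a conservative choice: a job is only placed where the requested processors are expected to be available even at the low end of the range.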
25
Research agenda
Large scale integration of resources, services and infrastructures
• Identify infrastructures and define models for their integration, sharing and use
• Define public interfaces for workflow enactment
Metadata schemata
• Design and implement an annotation pipeline for harnessing collective (domain) intelligence
• Describe applications and services in a uniform way
• Describe data
• Provenance models for distributed applications
26
Key question
How to make the knowledge about distributed applications explicit so that it can be shared within and among the communities?
Workflow technologies have the potential to address many of the identified requirements
27
Thank you!
Q&A
28
Workflows
Ian Taylor et al., 2007
• "A workflow is a sequence of connected services that is executed in suitably orchestrated steps"
They connect data, analytical tools and services for simulation and visualization of results/models
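
The definition above can be illustrated with a minimal Python sketch in which each service is a plain function and the workflow passes the output of one step to the next. The `Workflow` class and the three toy services (`load`, `analyse`, `visualise`) are hypothetical and stand in for remote data, analysis and visualization services.

```python
from typing import Any, Callable, Dict, List, Tuple

Service = Callable[[Any], Any]


class Workflow:
    """A sequence of connected services executed step by step."""

    def __init__(self) -> None:
        self.steps: List[Tuple[str, Service]] = []

    def then(self, name: str, service: Service) -> "Workflow":
        """Append a service; returning self allows chained composition."""
        self.steps.append((name, service))
        return self

    def run(self, data: Any = None) -> Tuple[Any, List[str]]:
        """Feed each step's output into the next and record the order."""
        trace: List[str] = []
        for name, service in self.steps:
            data = service(data)
            trace.append(name)
        return data, trace


def load(_: Any) -> List[float]:
    # Stand-in for a data service; the values are made up.
    return [3.0, 1.0, 2.0]


def analyse(values: List[float]) -> Dict[str, float]:
    # Stand-in for an analytical tool.
    return {"min": min(values), "max": max(values),
            "mean": sum(values) / len(values)}


def visualise(stats: Dict[str, float]) -> str:
    # Stand-in for a visualization service: a one-line text summary.
    return ", ".join(f"{k}={v:.1f}" for k, v in sorted(stats.items()))


WORKFLOW = (Workflow()
            .then("load", load)
            .then("analyse", analyse)
            .then("visualise", visualise))

if __name__ == "__main__":
    print(WORKFLOW.run())
```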
29
Technology convergence
.. from two fields:
Business process modelling
Scientific and engineering computations
30
WS-BPEL 2.0
• A rich language for orchestrating business as well as scientific and engineering workflows
• A BPEL process defines the exact order in which the participating services are invoked
• Supports loops, variable declarations, value assignment, exception/fault handling, parallel and sequential execution etc. Does not allow conditional execution of loops!
• Supports the development of ACID (Atomicity, Consistency, Isolation, Durability) transactions
• Most systems that support BPEL also provide persistent workflow state
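
The orchestration constructs listed above (sequential execution, parallel execution, value assignment, fault handling) can be mimicked in a few lines of Python. This is an analogy only, not WS-BPEL syntax: the helper names `sequence`, `flow`, `scope` and `assign` are borrowed loosely from BPEL terminology, and the activities are hypothetical local functions rather than Web service invocations.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

Variables = Dict[str, Any]
Activity = Callable[[Variables], None]


def assign(name: str, value: Any) -> Activity:
    """Set a process variable."""
    def run(variables: Variables) -> None:
        variables[name] = value
    return run


def sequence(*activities: Activity) -> Activity:
    """Run activities one after another, in the given order."""
    def run(variables: Variables) -> None:
        for activity in activities:
            activity(variables)
    return run


def flow(*activities: Activity) -> Activity:
    """Run activities in parallel and wait for all of them."""
    def run(variables: Variables) -> None:
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(a, variables) for a in activities]
            for future in futures:
                future.result()  # re-raises a fault from any branch
    return run


def scope(activity: Activity,
          fault_handler: Callable[[Variables, Exception], None]) -> Activity:
    """Run an activity and pass any raised fault to the handler."""
    def run(variables: Variables) -> None:
        try:
            activity(variables)
        except Exception as fault:
            fault_handler(variables, fault)
    return run


def failing_service(variables: Variables) -> None:
    # Hypothetical partner service that always fails.
    raise RuntimeError("service unavailable")


def run_example() -> Variables:
    process = sequence(
        assign("input", 2),
        flow(
            lambda v: v.update(a=v["input"] + 1),
            lambda v: v.update(b=v["input"] * 10),
        ),
        scope(failing_service, lambda v, fault: v.update(fault=str(fault))),
    )
    variables: Variables = {}
    process(variables)
    return variables


if __name__ == "__main__":
    print(run_example())
```

The two parallel branches write to different variables, so they do not need explicit locking in this small example.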
31
Project proposals
• YouCreate: Create, Share and Enact Pervasive Workflow-Centric Applications with Colleagues and Friends
• WF-Tube: A Workflow Tube
• WorkflowOmnibus, OurFactory ... and so on.
32
Open architecture
[Figure: three-layer architecture. Recoverable labels, grouped approximately:]
Client layer: workflow editor and manager, and grid portals
• Application explorer, data explorer, data manipulator, application settings
• Execution manager, monitoring module, provenance module
• Credentials generator, resource upload, other units, grid portals
Web and grid services layer
• Information integration service, data services, resource broker, other services
• Security, monitoring and discovery system (MDS4), RFT and GridFTP, execution management (WS-GRAM)
Resource layer: hardware and software
• Networks, servers (central processing units), computing clusters, local job scheduler
• Storage, databases, files and directories, executable programs
33
Triana
Workflow editor and manager
• graphically supported
• flexible
• simple composition and execution of workflows
open-source solution http://trianacode.org
34
Triana supports (summary)
• Web (SOAP, WSDL) and grid (WSRF) services
• Peer-to-peer services (Jxta, P2PS)
• grid services through the Grid Application Toolkit (e.g. SAGA)
• services for knowledge discovery in data (e.g. DataMiningGrid)
• local Java tools and executable programs
• flexible access to HTTP services
mash-up (hybrid) services are both simple and trivial to implement
35
Workflow manager
36
Light clients
37
Conclusion
• complex applications: workflow technologies provide greater control
• costly development: we do not need to start from scratch
• valuable acquired knowledge, expensive resources: knowledge can be shared; resources are better utilized
• applications mostly depend on individual systems: a workflow can be adopted regardless of the system on which it will run
• user demands for greater flexibility and adaptability: a higher level of abstraction; rapid development of innovative and powerful applications
• requirements for security and provenance tracking: workflows enable a fully open approach to provenance (Open Provenance)
38
Q&As
39
[Figure: extended architecture. Recoverable labels, grouped approximately:]
Communities: Distributed Audio Retrieval Community; Medical & bioinformatics communities; Private EDA Industrial Applications; AEC Professionals
User Interfaces / Domain oriented views on resources and applications
Collaboration Spaces
• Application Composition Modules (for application developers)
• Community Infrastructures Enabler (for system administrators)
(New) Distributed Applications
Application (Workflow) Hosting Layer
• Taverna WF Enactor, Triana WF Manager, other WF systems
Knowledge Layer
• Domain Knowledge; Infrastructures & environment metadata
• Service Utilities Registry; Data Registry; Application Usage, Provenance & Trust
• Application Usage Knowledge Generation; Social Networking Knowledge Generation
Metadata Bus
Data Services; Data and Software; Security, ID management
Infrastructures Services (security, logging, monitoring etc.)
Resources, Middleware Services and Infrastructures Layer
40
Workflow example (music)