UK e-Science All Hands Meeting 2005
Paul Groth, Simon Miles, Luc Moreau
Outline
Process Documentation for Provenance
Power of the P-Structure
P-assertion Recording Protocol
PReServ’s Functionality
Performance
Pitch
Provenance
The Provenance Question
– Lots of definitions…
– Boil it down to a question.
– What is the process that led to a particular result?
How do we answer this question?
– Search through documentation.
Documentation
Process Documentation
– encompasses all other documentation
SOA based model of process
Actors communicate via message passing
Actors make ASSERTIONS to document process. Termed p-assertions.
How to organise these p-assertions?
Benefits
Domain independent queries
– That are provenance specific
P-structure is a shared logical organisation of p-assertions
Does not prescribe how p-assertions are exactly stored in an implementation.
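To make the idea of a shared logical organisation concrete, here is a minimal Java sketch. It assumes, purely for illustration, that p-assertions are grouped by the interaction (message exchange) they document. The class names `PAssertion` and `PStructure` and all of their fields are hypothetical and are not PReServ's API. The in-memory map is only one possible storage choice, since the p-structure itself does not prescribe one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical p-assertion: one actor's statement about one interaction.
final class PAssertion {
    final String interactionId; // which message exchange is documented (assumed key)
    final String asserter;      // the actor making the assertion
    final String content;       // opaque, application-specific payload

    PAssertion(String interactionId, String asserter, String content) {
        this.interactionId = interactionId;
        this.asserter = asserter;
        this.content = content;
    }
}

// Hypothetical logical organisation: p-assertions grouped by interaction.
final class PStructure {
    private final Map<String, List<PAssertion>> byInteraction = new LinkedHashMap<>();

    void add(PAssertion p) {
        byInteraction.computeIfAbsent(p.interactionId, k -> new ArrayList<>()).add(p);
    }

    // Domain-independent query: everything asserted about one interaction.
    List<PAssertion> assertionsFor(String interactionId) {
        List<PAssertion> found = byInteraction.get(interactionId);
        return found == null
                ? Collections.<PAssertion>emptyList()
                : Collections.unmodifiableList(found);
    }

    // Domain-independent query: which actors documented an interaction.
    List<String> assertersFor(String interactionId) {
        List<String> names = new ArrayList<>();
        for (PAssertion p : assertionsFor(interactionId)) {
            if (!names.contains(p.asserter)) {
                names.add(p.asserter);
            }
        }
        return names;
    }
}
```

Neither query inspects `content`, so the same queries work whatever application domain produced the p-assertions.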
PReP Introduces the Provenance Store
– A Separate entity for maintaining process documentation
PReP specifies how an actor can communicate with the Provenance Store.
PReP has a number of nice properties.
– Statelessness
– Idempotence
– Termination
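The slide does not spell out PReP's message formats, so the following is only a rough Java sketch of what statelessness and idempotence mean for a recording actor. It assumes that every record message carries a complete identifying key, so the store keeps no per-client session, and that re-sending a message leaves the store unchanged. `RecordMessage`, `SketchProvenanceStore` and their fields are hypothetical names, not the PReServ client library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical record message: self-describing, so no session state is needed.
final class RecordMessage {
    final String interactionId;
    final String asserter;
    final String localId;  // assumed unique per asserter and interaction
    final String content;

    RecordMessage(String interactionId, String asserter, String localId, String content) {
        this.interactionId = interactionId;
        this.asserter = asserter;
        this.localId = localId;
        this.content = content;
    }

    // Simplified composite key; assumes the parts never contain '|'.
    String key() {
        return interactionId + "|" + asserter + "|" + localId;
    }
}

final class SketchProvenanceStore {
    private final Map<String, String> assertions = new HashMap<>();

    // Returns true when the message's content is stored under its key after the call.
    // Repeating an identical message is acknowledged again and changes nothing.
    // A different content under an existing key is rejected (returns false).
    boolean record(RecordMessage m) {
        String existing = assertions.putIfAbsent(m.key(), m.content);
        return existing == null || existing.equals(m.content);
    }

    int size() {
        return assertions.size();
    }
}
```

Under these assumptions an actor that times out waiting for an acknowledgement can simply resend the same message, without first checking whether the original attempt arrived.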
An Implementation
What is PReServ?
– A Web Services implementation of a Provenance Store
– Implements
  • PReP for recording
  • XQuery for querying
– Provides libraries and wrappers for making applications provenance aware.
PReServ Implementation Diagram
[Figure labels: WS Client; Web Service; Query Actor WS; PS Client Side Library (three instances); AxisHandler (two instances); Provenance Store; Backend Store Interface; Backend Stores (DatabaseStore, In-MemoryStore, …). Connections are marked as WS Calls or Java Calls.]
Implementation cont.
[Figure labels: SOAP Msg; Dispatcher; Store Plug In, Query Plug In, …; Backend Store Interface; Java Object, Database, Memory, …]
Caching mechanism to improve performance
Berkeley Java Database 2.0
• No setup required
• Completely Transactional
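The labels on this slide (Dispatcher, plug-ins, Backend Store Interface) suggest a layered design. The Java sketch below is one guess at how such layers could fit together, not PReServ's actual code. The interfaces `BackendStore` and `PlugIn`, the action-name routing, and the use of plain strings in place of SOAP messages are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical backend abstraction; plug-ins see only this interface.
interface BackendStore {
    void put(String key, String value);
    String get(String key); // returns null when the key is absent
}

// One interchangeable backend, here a plain in-memory map.
final class InMemoryStore implements BackendStore {
    private final Map<String, String> data = new HashMap<>();

    public void put(String key, String value) {
        data.put(key, value);
    }

    public String get(String key) {
        return data.get(key);
    }
}

// Hypothetical plug-in contract: handle one kind of incoming message.
interface PlugIn {
    String handle(String body, BackendStore store);
}

// Routes a message body to the plug-in registered for its action name.
final class Dispatcher {
    private final Map<String, PlugIn> plugIns = new HashMap<>();
    private final BackendStore store;

    Dispatcher(BackendStore store) {
        this.store = store;
    }

    void register(String action, PlugIn plugIn) {
        plugIns.put(action, plugIn);
    }

    String dispatch(String action, String body) {
        PlugIn p = plugIns.get(action);
        if (p == null) {
            throw new IllegalArgumentException("No plug-in for action: " + action);
        }
        return p.handle(body, store);
    }
}
```

In this arrangement new functionality is added by registering another `PlugIn`, and a different backend is swapped in by passing another `BackendStore` to the `Dispatcher`, without touching the existing plug-ins.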
Requirements
Apache Tomcat 5.0
Apache Ant 1.6.2
Java 1.5 (1.4 supported with some help)
Pure Java, tested on
– Windows
– Mac OS X
– Debian Linux
Evaluation Deployment
Protein Compressibility Experiment
– HPDC’05
Workflow runs under VMWare
– deployment consistency
– ease of development
Workflow is executed on one machine
PReServ runs on another machine
– Version 0.1.5 of PReServ
Conclusion
The p-structure allows for domain independent, provenance specific queries using XQuery.
Both recording and query times are linear
PReServ has an extensible architecture allowing for further functionality to be easily added.
Download! Try it out!
Download PReServ 0.2:
– The AHM release
– Released under Open Source MIT License
www.pasoa.org
– Click software
Contact us, we will try to help you make your application provenance-aware.