
UK e-Science All Hands Meeting 2005

Paul Groth, Simon Miles, Luc Moreau

Outline

Process Documentation for Provenance
Power of the P-Structure
P-assertion Recording Protocol
PReServ’s Functionality
Performance
Pitch

Provenance

The Provenance Question
  – Lots of definitions…
  – Boil it down to a question.
  – What is the process that led to a particular result?
How do we answer this question?
  – Search through documentation.

Documentation

Process Documentation
  – encompasses all other documentation
SOA-based model of process
Actors communicate via message passing
Actors make assertions to document the process; these are termed p-assertions.
How should these p-assertions be organised?
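
As a rough illustration of the model on this slide, the sketch below shows two actors exchanging a message and each making a p-assertion about it. All class and field names (`PAssertion`, `Actor`, `interaction_key`, `asserter`, `content`) are hypothetical and chosen for this example; they are not PReServ's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PAssertion:
    """One assertion made by an actor to document part of a process."""
    interaction_key: str  # hypothetical id of the message exchange being documented
    asserter: str         # name of the actor making the assertion
    content: str          # what the actor asserts (here: the message it saw)


class Actor:
    """An actor that exchanges messages and documents each exchange."""

    def __init__(self, name):
        self.name = name
        self.assertions = []

    def send(self, receiver, interaction_key, message):
        # The sender documents the message it sent...
        self.assertions.append(
            PAssertion(interaction_key, self.name, f"sent: {message}"))
        # ...and the receiver documents the message it received.
        receiver.receive(interaction_key, message)

    def receive(self, interaction_key, message):
        self.assertions.append(
            PAssertion(interaction_key, self.name, f"received: {message}"))


client = Actor("client")
service = Actor("service")
client.send(service, "interaction-1", "compress(sequence)")
```

Both actors end up holding one p-assertion about the same interaction, which is what raises the organisation question posed above.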


P-Structure


P-Structure View

Benefits

Domain-independent queries that are provenance specific
P-structure is a shared logical organisation of p-assertions
Does not prescribe exactly how p-assertions are stored in an implementation.
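
To make the last two points concrete, here is a minimal sketch in which two stores keep the same p-assertions in different physical layouts, while one query written against the shared logical view works on both. The tuple layout `(interaction_key, view, content)`, the class names and the query function are assumptions made for this example, and the query is plain Python rather than the XQuery that PReServ uses.

```python
from collections import defaultdict


class ListStore:
    """Keeps p-assertions in a flat list."""

    def __init__(self):
        self._items = []

    def add(self, interaction_key, view, content):
        self._items.append((interaction_key, view, content))

    def assertions(self):
        return iter(self._items)


class IndexedStore:
    """Keeps the same p-assertions grouped by interaction key."""

    def __init__(self):
        self._by_key = defaultdict(list)

    def add(self, interaction_key, view, content):
        self._by_key[interaction_key].append((view, content))

    def assertions(self):
        for key, entries in self._by_key.items():
            for view, content in entries:
                yield (key, view, content)


def documentation_for(store, interaction_key):
    """Query that mentions only interactions and views, never the domain
    of the application or the layout of the store."""
    return sorted((view, content)
                  for key, view, content in store.assertions()
                  if key == interaction_key)


stores = [ListStore(), IndexedStore()]
for store in stores:
    store.add("i1", "sender", "sent request")
    store.add("i1", "receiver", "received request")
    store.add("i2", "sender", "sent result")

results = [documentation_for(store, "i1") for store in stores]
```

Both entries of `results` are identical, because the query depends only on the logical organisation and not on how each store lays out its data.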

PReP

Introduces the Provenance Store
  – A separate entity for maintaining process documentation
PReP specifies how an actor can communicate with the Provenance Store.
PReP has a number of nice properties.
  – Statelessness
  – Idempotence
  – Termination
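
The slides do not say how PReP achieves these properties. As one common way to obtain idempotence, the sketch below keys each recorded p-assertion by identifying fields carried in the message itself, so a retried message has no further effect. The `ProvenanceStore` class, its `record` signature and the `"ack"` reply are hypothetical and are not the PReP message format.

```python
class ProvenanceStore:
    """Toy store whose record operation is idempotent."""

    def __init__(self):
        self._records = {}

    def record(self, interaction_key, asserter, local_id, content):
        # Each call carries everything needed to identify the p-assertion,
        # so the store keeps no per-client session state between calls.
        key = (interaction_key, asserter, local_id)
        # Recording the same p-assertion again (for example a retry after a
        # lost acknowledgement) leaves the store unchanged.
        self._records.setdefault(key, content)
        return "ack"

    def count(self):
        return len(self._records)


store = ProvenanceStore()
for _ in range(3):  # simulate a client retrying the same message
    ack = store.record("i1", "client", 1, "sent request")
store.record("i1", "service", 1, "received request")
```

After three identical submissions from the client and one from the service, the store holds two p-assertions.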

An Implementation

What is PReServ?
  – A Web Services implementation of a Provenance Store
  – Implements
    • PReP for recording
    • XQuery for querying
  – Provides libraries and wrappers for making applications provenance aware.

PReServ Implementation Diagram

[Figure: architecture diagram. Labels: AxisHandler; Provenance Store; Backend Store Interface; Backend Stores (DatabaseStore, In-MemoryStore, …); PS Client Side Library; Web Service; WS Client; Query Actor WS. Legend: WS Calls, Java Calls.]

Implementation cont.

[Figure: layered diagram. Labels: SOAP Msg; Dispatcher; Store Plug In, Query Plug In, …; Backend Store Interface; Java Object Database Memory …]

Caching mechanism to improve performance
Berkeley Java Database 2.0
  • No setup required
  • Completely transactional
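
The plug-in arrangement named on this slide can be sketched roughly as follows: a dispatcher routes each incoming message to a plug-in, and the plug-ins share one backend behind a common interface, with a small cache in front of queries. This is an assumed reading of the diagram labels, not PReServ's code; plain dictionaries stand in for SOAP messages, and every class and method name here is hypothetical.

```python
class MemoryBackend:
    """In-memory backend; a database backend would offer the same methods."""

    def __init__(self):
        self._data = {}
        self.reads = 0  # counts backend lookups, to show the cache working

    def put(self, key, value):
        self._data.setdefault(key, []).append(value)

    def get(self, key):
        self.reads += 1
        return list(self._data.get(key, []))


class StorePlugIn:
    def __init__(self, backend, cache):
        self.backend, self.cache = backend, cache

    def handle(self, body):
        self.backend.put(body["key"], body["assertion"])
        self.cache.pop(body["key"], None)  # drop any stale cached result
        return "recorded"


class QueryPlugIn:
    def __init__(self, backend, cache):
        self.backend, self.cache = backend, cache

    def handle(self, body):
        key = body["key"]
        if key not in self.cache:  # only hit the backend on a cache miss
            self.cache[key] = self.backend.get(key)
        return self.cache[key]


class Dispatcher:
    """Routes each incoming message to the plug-in registered for its action."""

    def __init__(self):
        self._plug_ins = {}

    def register(self, action, plug_in):
        self._plug_ins[action] = plug_in

    def dispatch(self, message):
        return self._plug_ins[message["action"]].handle(message["body"])


backend, cache = MemoryBackend(), {}
dispatcher = Dispatcher()
dispatcher.register("store", StorePlugIn(backend, cache))
dispatcher.register("query", QueryPlugIn(backend, cache))

dispatcher.dispatch({"action": "store",
                     "body": {"key": "i1", "assertion": "sent request"}})
first = dispatcher.dispatch({"action": "query", "body": {"key": "i1"}})
second = dispatcher.dispatch({"action": "query", "body": {"key": "i1"}})
```

Adding functionality in this arrangement means registering another plug-in with the dispatcher, and swapping the backend means supplying another object with the same `put` and `get` methods.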

Requirements

Apache Tomcat 5.0
Apache Ant 1.6.2
Java 1.5 (1.4 supported with some help)
Pure Java, tested on
  – Windows
  – Mac OS X
  – Debian Linux

Evaluation Deployment

Protein Compressibility Experiment
  – HPDC’05
Workflow runs under VMWare
  – deployment consistency
  – ease of development
Workflow is executed on one machine
PReServ runs on another machine
  – Version 0.1.5 of PReServ


Record Performance


Query Performance


Applications


Conclusion

The p-structure allows for domain-independent, provenance-specific queries using XQuery.

Both recording and query times are linear.

PReServ has an extensible architecture, allowing further functionality to be added easily.

Download! Try it out!

Download PReServ 0.2:
  – The AHM release
  – Released under the open source MIT License
www.pasoa.org
  – Click software
Contact us and we will try to help you make your application provenance-aware.

Configuration

Red Hat Linux 9.1 on VMWare on Windows XP
Pentium P4 2.8 GHz
1.5 GB RAM
PReServ on another machine
  – Database backend: Berkeley JDB
100 Mb local Ethernet