
myExperiment – A Web 2.0 Virtual Research Environment

David De Roure

Carole Goble


NeSC VRE Workshop 26/2/2007 | myExperiment | Slide 2

Overview

e-Science is about scientists doing science

– A Tale of Two Projects

myExperiment

Design Patterns for a VRE


CombeChem pilot project

[Diagram labels: X-Ray e-Lab; Properties e-Lab; Diffractometer; Analysis; Properties; Simulation; Video; Grid Middleware; Structures Database]

www.combechem.org


Entire e-Science Cycle: encompassing experimentation, analysis, publication, research, learning

[Diagram labels: E-Scientists; E-Experimentation; Graduate Students; Undergraduate Students; Virtual Learning Environment; Digital Library; Institutional Archive; Local Web Publisher; Holdings; Technical Reports; Reprints; Peer-Reviewed Journal & Conference Papers; Preprints & Metadata; Certified Experimental Results & Analyses; Data, Metadata & Ontologies]

http://www.ukoln.ac.uk/projects/ebank-uk/

Reducing time-to-experiment


The key observation!

“Publication at Source” describes the need to capture data and its context from the outset and maintain a complete end-to-end connection between the laboratory bench and the intellectual chemical knowledge that is published as a result of the investigation

Provenance

The details of the origins of data are just as important to understanding as their actual values
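
The idea that a value should never travel without its origin can be sketched in a few lines. This is an illustrative sketch only, not the CombeChem or myGrid provenance model: the class names, field names, source labels and the placeholder value are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Where a value came from (all field names are hypothetical)."""
    source: str                 # instrument or workflow that produced the value
    operator: str               # person responsible for the step
    recorded_at: datetime       # when the step was recorded
    derived_from: tuple = ()    # provenance of the inputs, oldest first


@dataclass(frozen=True)
class Measurement:
    """A value that is always stored together with its origin."""
    value: float
    unit: str
    provenance: Provenance


def lineage(m: Measurement) -> list:
    """Return the chain of sources that led to this measurement."""
    chain = [p.source for p in m.provenance.derived_from]
    chain.append(m.provenance.source)
    return chain


stamp = datetime(2007, 2, 26, tzinfo=timezone.utc)
raw = Provenance("diffractometer", "hypothetical-operator", stamp)
refined = Measurement(
    value=1.0,                  # placeholder number, not real data
    unit="arbitrary",
    provenance=Provenance("analysis-workflow", "hypothetical-operator",
                          stamp, derived_from=(raw,)),
)
print(lineage(refined))  # ['diffractometer', 'analysis-workflow']
```

Making the records immutable is a deliberate choice in this sketch: once captured at source, the origin of a value should not be silently rewritten later.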



My Chemistry Experiment

Box of Chemists


e-Crystals Federation model

[Diagram labels: e-Research workflows; Aggregator services; Institutional data repositories; Data curation & preservation: databases & databanks; Data creation & capture in “Smart lab”; Laboratory repository; Publishers: peer-review journals, conference proceedings; Data analysis, transformation, mining, modelling; Presentation services: portals; Data discovery, linking, citation; Validation; Harvest; Deposit; Publication; Search, harvest; Linking, citation; (Chemistry Central)]

This work is licensed under a Creative Commons Licence: Attribution-ShareAlike 2.0


Bioinformatics is not Chemistry

There are many pieces, from many boxes, but no box, and no lid with a complete picture of what the puzzle is supposed to be.

Planning? No. Metadata an afterthought


myGrid

Open Source middleware for Life Scientists that enables them to undertake in silico experiments and share those experiments and their results.

Machinery for linking together datasets and tools

Individual scientists, in under-resourced labs, who use other people’s datasets and applications.

Ad hoc & exploratory workflows (data flows)

To support sharing and collaboration between scientists to disseminate best practice and improve the quality of science

33,000 downloads; 200+ user sites; 400+ workflows; 3500 third-party external services accessible.

Moved from prototype to production quality.

Open Middleware Infrastructure Institute UK

http://www.mygrid.org.uk

Taverna Workflow Workbench


Widespread Adoption

Users in US, Asia, UK, Europe, Australia

– Systems biology

– Proteomics

– Gene/protein annotation

– Microarray data analysis

– Medical image analysis

– Heart simulation orchestration

– High throughput screening of chemical compounds

– Phenotypical studies

– Public Health studies

– Clinical trial analysis

– Plants, Mouse, Human

– Astronomy

– Cultural Heritage


Identified a pathway whose correlating gene (Daxx) is believed to play a role in trypanosomiasis resistance.

Manual analysis on the microarray and QTL data failed to identify this gene as a candidate.

Repetitive, unbiased analysis.

Paul Fisher et al., “A Systematic Strategy for Large-Scale Unbiased Analysis of Genotype-Phenotype Correlations”, Bioinformatics, in review.

Trypanosomiasis cattle workflow reused without change to identify the biological pathways involved in sex dependence in the mouse model, previously believed to be involved in the ability of mice to expel the parasite.

Previously, a manual two-year study of candidate genes had failed to do this.

Recycling, Reuse, Repurposing


Service and workflow annotation

Ontology 710 classes

Full time curator

Tagging by the masses

3500 services, 350 curated

Provenance

Ontology 35 classes

Enriched with domain ontologies and service ontologies. Possibly.

Export with data. Desirably.


New Scientific Digital Artefacts

Design

Workflow design history

Experiment purpose

Scientist

LogBook

Workflow run log

Data lineage

Results interpretation log


Kepler

Triana

New digital artefacts


myExperiment.org Portal Party

28th & 29th Sept 2006

Hand-picked Taverna users + Taverna development team

Facilitated by NCeSS.

AJAX-based development

CombeChem xfer

1. A social networking environment for sharing any workflow

2. A Taverna workflow run environment

3. A multi-workflow launch environment


openwetware.org


What are we trying to do?

Enabling scientists to be (more) creative.

Enabling scientists to be scientists. And not programmers.

Enabling mediocre scientists to become better and thus have better science.

Enabling smart scientists to be smarter and propagate their smartness.

Accelerate dissemination, pooling, insight.

Encouraging sanctioned plagiarism.


Principles

Focus on making it easy to publish information

– Discovering and sharing experimental artefacts

– Publishing results to standard community repositories

– Publishing scholarly output

Familiar social networking / web paradigms

– Keeping it free and fluid and creative. Me-Science.

Crossing system boundaries

– Trans-workflow

Crossing discipline boundaries

– Multi-disciplinary, Inter-disciplinary, Trans-disciplinary

– Clustering expertise

– Intellectual fusion outside discipline. We-Science.

– Life Science, Social Science, Astronomy, Chemistry


Scoping exercise

Workflow warehouse / federation of repositories

– Open Archives Initiative. Federated myExperiments. Sharepoint.

Social space + organised rich site

– Social discourse + organised service / workflow space using curated semantics.

Granularity and identifiers

– Rolling-up provenance. Id resolution

Open vs protected content

– Quality, Reliability, Validation, Safety, Intellectual Property, Ownership, Secrecy, A duty of guardianship. Curation? Policing? Local data mixed with shared resources

Desktop integration

– Google gadgets for workflows. Interacting with workflows through Office products.

Workflow execution

– (WHIP) Workflows Hosted in Portals project

Evolving the myExperiment software

– Community development

Enabling Scientists

– added value through applications and collaborative tagging


Hack Fest


Q1. Workflow Warehouse or Federation of Repositories?

Everything on the myExperiment.org web site

vs

Distributed stores

Multiple myExperiments
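
To make the federation option concrete, here is a minimal sketch of how an aggregator might plan a harvest across several distributed stores. It assumes each store exposes the Open Archives Initiative harvesting protocol (OAI-PMH), which the scoping slide mentions only by name. The repository URLs and function names are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical base URLs; a real federation would publish its own endpoints.
REPOSITORIES = [
    "https://repo-a.example.org/oai",
    "https://repo-b.example.org/oai",
]


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL for one repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # harvest only records changed since then
    return f"{base_url}?{urlencode(params)}"


def harvest_plan(base_urls, from_date=None):
    """One request URL per federated store; a warehouse would need only one."""
    return [list_records_url(u, from_date=from_date) for u in base_urls]


for url in harvest_plan(REPOSITORIES, from_date="2007-02-26"):
    print(url)
```

The trade-off shows up even at this scale: a single warehouse needs no harvest plan at all, while a federation lets each store keep its content local, at the cost of periodic harvesting.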


Q2. Social Space or Shoe Shop?


Shopping for Workflows and Services and Data should be as easy as shopping for shoes.

Organic growth is good and bad.

Social tagging might help discover workflows but we need good metadata for automated use.
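
The difference between the two kinds of description can be illustrated with a small sketch. The workflow records, tag spellings and metadata keys below are invented for the example and do not reflect the myExperiment data model.

```python
from collections import defaultdict

# Hypothetical records: free-form user tags plus curated, structured metadata.
workflows = [
    {"id": "wf1", "tags": {"microarray", "mouse"},
     "metadata": {"input_type": "expression-matrix", "engine": "Taverna"}},
    {"id": "wf2", "tags": {"micro-array", "QTL"},
     "metadata": {"input_type": "expression-matrix", "engine": "Taverna"}},
    {"id": "wf3", "tags": {"astronomy"},
     "metadata": {"input_type": "image", "engine": "Triana"}},
]


def build_tag_index(records):
    """Map each lower-cased tag to the ids of the workflows carrying it."""
    index = defaultdict(set)
    for record in records:
        for tag in record["tags"]:
            index[tag.lower()].add(record["id"])
    return index


def find_by_metadata(records, **required):
    """Return ids whose structured metadata matches every required field."""
    return [r["id"] for r in records
            if all(r["metadata"].get(k) == v for k, v in required.items())]


tag_index = build_tag_index(workflows)
print(sorted(tag_index["microarray"]))  # ['wf1'], wf2 is spelled "micro-array"
print(find_by_metadata(workflows, input_type="expression-matrix"))  # ['wf1', 'wf2']
```

Tags are cheap to add and good enough for a person browsing, but the spelling variant hides one workflow from the tag lookup. The structured field finds both, which is what an automated consumer would rely on.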


Q3. How open is the content?

OpenWetware is open

Our users don’t want this

Provenance helps


Q4. Integration

Bringing users to the Web site

vs

Bringing myExperimentness to existing interfaces


Web 2.0 Design Patterns

http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html


1. The Long Tail

2. Data is the Next Intel Inside

3. Users Add Value

4. Network Effects by Default

5. Some Rights Reserved

6. The Perpetual Beta

7. Cooperate, Don't Control

8. Software Above the Level of a Single Device


1. The Long Tail

Our target users are not just the specialist e-Scientists using computing resources to tackle major scientific breakthroughs, but also the large number of scientists conducting the routine processes of science on a daily basis.

Through sharing we have the potential to enable smart scientists to be smarter and propagate their smartness, in turn enabling other scientists to become better and conduct better science.


2. Data is the Next “Intel Inside”

myExperiment understands that scientists are focused on data, not software or one particular workflow engine.

Workflows are components of customised applications, many of which are data-oriented rather than process-oriented.

Users manipulate, through their own applications, the product (data, model) yielded by the workflow.

Furthermore, workflows themselves are the data of myExperiment and provide its unique value.


3. Users Add Value

myExperiment makes it easy to find workflows and is designed to make it useful and straightforward to share workflows and add workflows to the pool.

To succeed we draw on the insights into the incentive models of scientists gained through experience with Taverna.


4. Network Effects by Default

myExperiment aggregates user data as a side-effect of using the VRE.

The ability to execute workflows from myExperiment, and the integration of tools such as Taverna with myExperiment, further enable us to achieve increased value through usage.
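
A minimal sketch of "aggregating user data as a side-effect" might look like the following. The class and method names are hypothetical and this is not the myExperiment implementation. The point is only that usage is recorded by the ordinary fetch path rather than by a separate reporting step.

```python
from collections import Counter


class WorkflowStore:
    """Hypothetical in-memory store that learns from normal use."""

    def __init__(self):
        self._workflows = {}
        self.usage = Counter()

    def add(self, workflow_id, definition):
        self._workflows[workflow_id] = definition

    def fetch(self, workflow_id):
        definition = self._workflows[workflow_id]
        self.usage[workflow_id] += 1  # recorded as a by-product of fetching
        return definition

    def most_used(self, n=3):
        """Ids of the most fetched workflows, most popular first."""
        return [wid for wid, _ in self.usage.most_common(n)]


store = WorkflowStore()
store.add("wf1", "<workflow one>")
store.add("wf2", "<workflow two>")
for wid in ["wf1", "wf2", "wf2"]:
    store.fetch(wid)
print(store.most_used())  # ['wf2', 'wf1']
```

Users never have to rate or report anything; the ranking improves simply because people fetch and run workflows.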


5. Some Rights Reserved

myExperiment users require protection as well as sharing, but the environment is designed for maximum ease of sharing to achieve collective benefits – workflows are "hackable" and "remixable".

Initiatives such as Science Commons provide a useful context for this.


6. The Perpetual Beta

myExperiment is an online service (a collection of online services) and is continually evolving in response to its users.

To support this, the project commenced with developers being embedded in the user community.

Through day-to-day contact between designers and researchers, design is both inspired and validated.


7. Cooperate, Don't Control

myExperiment is a network of cooperating data services with simple interfaces which make it easy to work with content.

It both provides services and reuses the services of others.

It aims to support lightweight programming models so that it can easily be part of loosely coupled systems.


8. Software Above the Level of a Single Device

The current model of Taverna running on the scientist’s desktop PC or laptop is evolving into myExperiment being available through a variety of interfaces and supporting workflow execution.


Closing

e-Science is difficult – workflows and Web 2.0 make it easier.

Our design workshops and the review against Web 2.0 design patterns have revealed the relationship between myExperiment and Web 2.0.

The collective benefits of participation arise not only from the users but also from the developers – ease of use and ease of development.

It might be useful to review other VREs against the design patterns.


Take homes

myExperiment is a Web 2.0 Environment for Scientists to share experiments

Join us!

David De Roure – [email protected]

Carole Goble – [email protected]


Credits

myGrid and CombeChem

Matt Lee

David Withers

Don Cruickshank

Rob Procter

Alex Voss

June Finch

Ed Zaluska

All the users inc. embedders