Feasting on Brains with Workflows

25/10/22 BioAID 1 Feasting on Brains with Taverna and myExperiment Marco Roos, Katy Wolstencroft acknowledging Carole Goble, Dave de Roure, Alan Williams, Jiten Bhagat, Martijn Schuemie, Edgar Meij, Sophia Katrenko, Willem van Hage, M. Scott Marshall, Pieter Adriaans, NBIC, OMII-UK, the myGrid team

Upload: leiden-university-medical-center

Post on 11-May-2015


Category: Education



DESCRIPTION

About how the workflow paradigm helps collaborative bioinformatics.

TRANSCRIPT

Page 1: Feasting on Brains with Workflows

Wednesday 12 April 2023 BioAID 1

Feasting on Brains with Taverna and myExperiment

Marco Roos, Katy Wolstencroft

acknowledging Carole Goble, Dave de Roure, Alan Williams, Jiten Bhagat, Martijn Schuemie, Edgar Meij, Sophia Katrenko, Willem van Hage, M. Scott Marshall, Pieter Adriaans, NBIC, OMII-UK, the myGrid team

Page 2: Feasting on Brains with Workflows


Why should a biologist be interested in workflows?

Leiden students say…
• …

Page 3: Feasting on Brains with Workflows


Hold those thoughts…

Page 4: Feasting on Brains with Workflows


A biologist

Page 5: Feasting on Brains with Workflows


Mouse fibroblast (skin) cells

My prime interest: structure and function of DNA in the nucleus

Escherichia coli

Page 6: Feasting on Brains with Workflows


Connecting the dots (example: protein interaction network in yeast)

Page 7: Feasting on Brains with Workflows


Thousands of Databases

Low & High Throughput

Proteomics, Genomics, Transcriptomics, Protein sequence prediction, Phenotypic studies, Phylogeny, Sequence analysis, Protein Structure prediction, Protein-protein interaction, Metabolomics, Model organism collections, Systems Biology, Epidemiology, etcetera …

All with a splendid interface… all different, of course

Page 8: Feasting on Brains with Workflows


Biomedical knowledge repository

PubMed statistics (http://www.ncbi.nlm.nih.gov/entrez)

• >20 million citations
• >400,000 added/year
• ~70,000 searches/month

Does not fit

Does not compute

Page 9: Feasting on Brains with Workflows


My brain is too small!

Page 10: Feasting on Brains with Workflows


A typical biologist…

A needy biologist

Tiny brain

Lots of data to deal with

Lots of methods and algorithms to try and combine

No computational superpowers

Lots of knowledge to deal with

Page 11: Feasting on Brains with Workflows


Start at the beginning

I have a computational question…

Page 12: Feasting on Brains with Workflows


‘Old school’ Bioinformatics

A typical bioinformatician

Page 13: Feasting on Brains with Workflows


‘Old school’ Bioinformatics

A biologist behind a computer who (just) learned Perl

Page 14: Feasting on Brains with Workflows


/*
 * determines ridges in htm expression table
 */

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>
#include "ridge.h" /* assumed to define TRUE/FALSE and to declare the functions below */

/* runs the htm query for one chromosome; the result is handed back through *htmtable,
   so the prototype in ridge.h has to take a PGresult ** as well */
int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
{
    char querystring[256];

    /* build the query in the buffer; chromname is passed as a quoted literal */
    snprintf(querystring, sizeof querystring,
             "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
             htmtablename, chromname);
    *htmtable = PQexec(conn, querystring);

    return validquery(*htmtable, querystring);
}

/* determines if mincount genes in a row are (part of) a ridge */
/* pre: htmtable is valid and sorted on genStart (ascending) */
/* post: */
int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
{
    if (mincount <= 0) return TRUE;
    if (row >= PQntuples(htmtable)) return FALSE;

    /* PQgetvalue returns text, so convert it before comparing with the threshold */
    if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold) {
        return FALSE;
    }
    return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
}

int main(void)
{
    PGconn *conn;          /* holds database connection */
    char querystring[256]; /* query string */
    PGresult *result;

    conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
    if (PQstatus(conn) == CONNECTION_BAD) {
        fprintf(stderr, "connection to database failed.\n");
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        return EXIT_FAILURE;
    }
    printf("Connection ok\n");

    snprintf(querystring, sizeof querystring, "SELECT * FROM chromosomes");
    printf("%s\n", querystring);
    result = PQexec(conn, querystring);

    if (!validquery(result, querystring)) {
        PQclear(result);
        PQfinish(conn);
        return EXIT_FAILURE;
    }
    printresults(result);

    PQclear(result);
    PQfinish(conn);
    return EXIT_SUCCESS;
}

/* prints the first column of at most the first 10 rows */
int printresults(PGresult *tuples)
{
    int i;

    for (i = 0; i < PQntuples(tuples) && i < 10; i++) {
        printf("%d, ", i);
        printf("%s\n", PQgetvalue(tuples, i, 0));
    }
    return TRUE;
}

/* returns TRUE if the query produced a tuple set, FALSE otherwise */
int validquery(PGresult *result, char *querystring)
{
    printf(" in validquery\n");
    if (PQresultStatus(result) != PGRES_TUPLES_OK) {
        printf("Query %s failed.\n", querystring);
        fprintf(stderr, "Query %s failed.\n", querystring);
        return FALSE;
    }
    return TRUE;
}

Page 15: Feasting on Brains with Workflows


The ‘spaghetti’ approach

Page 16: Feasting on Brains with Workflows


Computational tools graveyard (rephrasing David Shotton, University of Oxford)

Page 17: Feasting on Brains with Workflows


Which diseases are associated with my protein of interest, ‘EZH2’?

Page 18: Feasting on Brains with Workflows


Combining expertise

Edgar Meij

Information retrieval expert

Page 19: Feasting on Brains with Workflows


Combining expertise

Sophia Katrenko

Machine learning expert

Page 20: Feasting on Brains with Workflows


Combining expertise

Willem van Hage

Semantic web expert (and bass guitar player)

Page 21: Feasting on Brains with Workflows


“Collaboration through Web Services”

Bio-text mining expert, BioSemantics group, Erasmus University Rotterdam

Martijn Schuemie

Page 22: Feasting on Brains with Workflows


“Collaboration through Web Services”

Biological Database expert

Hideaki Sugawara

Page 23: Feasting on Brains with Workflows


AIDA toolbox

e-Science collaboration

Page 24: Feasting on Brains with Workflows


“Collaboration through Web Services”

e-bioscientist

Page 25: Feasting on Brains with Workflows


A nice experiment design

Page 26: Feasting on Brains with Workflows


A not so nice experiment design

Page 27: Feasting on Brains with Workflows


A workflow: protocol for a computational experiment

Page 28: Feasting on Brains with Workflows


Feasting on brilliant brains with Taverna!

Want this…

Page 29: Feasting on Brains with Workflows


The enhanced Biologist

Page 30: Feasting on Brains with Workflows


What are the benefits of the workflow approach?

Leiden students think…
• …