Feasting on brains with workflows

Post on 11-May-2015

Category:

Education

DESCRIPTION

About how the workflow paradigm helps collaborative bioinformatics.

TRANSCRIPT

Wednesday 12 April 2023 BioAID 1

Feasting on Brains with Taverna and myExperiment

Marco Roos, Katy Wolstencroft

Acknowledging: Carole Goble, Dave de Roure, Alan Williams, Jiten Bhagat, Martijn Schuemie, Edgar Meij, Sophia Katrenko, Willem van Hage, M. Scott Marshall, Pieter Adriaans, NBIC, OMII-UK, the myGrid team

Why should a biologist be interested in workflows?

Leiden students say…
• …

Hold those thoughts…

A biologist

Mouse fibroblast (skin) cells

My prime interest: structure and function of DNA in the nucleus

Escherichia coli

Connecting the dots (example: protein interaction network in yeast)

Thousands of Databases

Low & High Throughput

Proteomics, Genomics, Transcriptomics, Protein sequence prediction, Phenotypic studies, Phylogeny, Sequence analysis, Protein Structure prediction, Protein-protein interaction, Metabolomics, Model organism collections, Systems Biology, Epidemiology, etcetera …

All with a splendid interface… all different, of course

Biomedical knowledge repository

PubMed statistics (http://www.ncbi.nlm.nih.gov/entrez)

>20 million citations
>400,000 added/year
~70,000 searches/month

Does not fit

Does not compute

My brain is too small !

A typical biologist…

A needy biologist

Tiny brain

Lots of data to deal with

Lots of methods and algorithms to try and combine

No computational superpowers

Lots of knowledge to deal with

Start at the beginning

I have a computational question…

‘Old school’ Bioinformatics

A typical bioinformatician

‘Old school’ Bioinformatics

A biologist behind a computer who (just) learned Perl

/*
 * determines ridges in htm expression table
 */

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>
#include "ridge.h"  /* not shown; assumed to define TRUE/FALSE and declare the functions below */

/* Fetches the rows for one chromosome; the result is handed back via *htmtable. */
int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
{
    char querystring[256];

    /* chromname is quoted as a string literal; both names are assumed to be trusted input */
    snprintf(querystring, sizeof querystring,
             "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
             htmtablename, chromname);
    *htmtable = PQexec(conn, querystring);

    return validquery(*htmtable, querystring);
}

/* determines if mincount genes in a row are (part of) a ridge */
/* pre: htmtable is valid and sorted on genStart (ascending) */
/* post: */
int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
{
    if (mincount <= 0) return TRUE;
    if (row >= PQntuples(htmtable)) return FALSE;

    /* PQgetvalue returns text, so convert it before comparing with the threshold */
    if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold) {
        return FALSE;
    }
    return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
}

int main(void)
{
    PGconn *conn;           /* holds database connection */
    char querystring[256];  /* query string */
    PGresult *result;

    conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
    if (PQstatus(conn) == CONNECTION_BAD) {
        fprintf(stderr, "connection to database failed.\n");
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        exit(EXIT_FAILURE);
    } else
        printf("Connection ok\n");

    snprintf(querystring, sizeof querystring, "SELECT * FROM chromosomes");
    printf("%s\n", querystring);

    result = PQexec(conn, querystring);
    if (validquery(result, querystring)) {
        printresults(result);
    } else {
        PQclear(result);
        PQfinish(conn);
        return EXIT_FAILURE;
    }

    PQclear(result);
    PQfinish(conn);
    return EXIT_SUCCESS;
}

/* Prints the first column of at most the first 10 rows. */
int printresults(PGresult *tuples)
{
    int i;

    for (i = 0; i < PQntuples(tuples) && i < 10; i++) {
        printf("%d, ", i);
        printf("%s\n", PQgetvalue(tuples, i, 0));
    }
    return TRUE;
}

/* Returns TRUE if the query produced a tuple set, FALSE otherwise. */
int validquery(PGresult *result, char *querystring)
{
    printf(" in validquery\n");
    if (PQresultStatus(result) != PGRES_TUPLES_OK) {
        printf("Query %s failed.\n", querystring);
        fprintf(stderr, "Query %s failed.\n", querystring);
        return FALSE;
    }
    return TRUE;
}

The ‘spaghetti’ approach

Computational tools graveyard (rephrasing David Shotton, University of Oxford)

Which diseases are associated with my protein of interest, ‘EZH2’?

Combining expertise

Edgar Meij

Information retrieval expert

Combining expertise

Sophia Katrenko

Machine learning expert

Combining expertise

Willem van Hage

Semantic web expert (and bass guitar player)

“Collaboration through Web Services”

Bio-text mining expert, BioSemantics group, Erasmus University Rotterdam

Martijn Schuemie

“Collaboration through Web Services”

Biological Database expert

Hideaki Sugawara

AIDA toolbox

e-Science collaboration

“Collaboration through Web Services”

e-bioscientist

A nice experiment design

A not so nice experiment design

A workflow: protocol for a computational experiment

Feasting on brilliant brains with Taverna!

Want this…

The enhanced Biologist

What are the benefits of the Workflow approach?

Leiden students think…
• …
