SysMO-DB: A pragmatic approach to sharing information amongst Systems Biology projects in Europe
http://www.sysmo-db.org
Carole Goble, University of Manchester, UK
Pan European collaboration. Systems Biology of Microorganisms.
The transition from growing to non-growing Bacillus subtilis cells
Energy and Saccharomyces cerevisiae
Biology of Clostridium acetobutylicum
Gene interaction networks and models of cation homeostasis in Saccharomyces cerevisiae
http://www.sysmo.net
Eleven individual projects, 91 institutes.
Different research outcomes.
A cross-section of microorganisms, incl. bacteria, archaea and yeast.
Record and describe the dynamic molecular processes occurring in microorganisms in a comprehensive way
Present these processes in the form of computerized mathematical models.
Pool research capacities and know-how.
Running since April 2007. Two phases – more later!
BaCell-SysMO, COSMIC, SUMO, KOSMOBAC, SysMO-LAB, PSYSMO, Valla, MOSES, TRANSLUCENT, STREAM, SulfoSYS
Types of stuff
Multiple ‘omics: genomics, transcriptomics, proteomics, metabolomics
Images
Reaction Kinetics
Models
Relationships between data sets/experiments
Procedures, experiments, data, results and models
Analysis of data
The same across many Systems Biology projects
The Problem (1)
No one concept of experimentation or modelling
No planned, shared infrastructure for pooling
SysMO-DB
Started July 2008, 3 years + 3 years.
4 people, 3 teams over 3 sites.
Sensitively retrofit a data access, model handling and data integration platform.
Support and manage the diversity of data, models and competencies.
Web-based solution:
exchange of data, models and processes.
search across the initiative's assets.
dissemination of results.
The Problem (2)
Own solutions: Own data solutions and collaboration environments. Wikis, e-Groupware, PHProjekt, BaseCamp, PLONE, Alfresco, bespoke commercial … files and spreadsheets.
Suspicion: Suspicion and caution over sharing. Interesting interplay between modellers, experimentalists and bioinformaticians.
Data issues: Many do not have data, do not follow the standards that exist, or do not know who is doing what. Much of the data cannot be compared. Different organisms, different strains.
Resource issues: No extra resources for the consortiums. 91 institutes, 11 consortiums, some overlapping.
Principles…
A series of small victories
Realistic
Don't reinvent
Sustainable and extensible
Migrate to community standards
Provide instant gratification
Address doubt and anxiety
Keep barriers low.
Social Approach
PALS - Power Contributors!
18 Postdocs and PhD students
All three kinds of people
Design and technical collaboration team
Very intense collaboration
UK and Continental PALS
Chapters
Audits and Sharing: Methods, data, models, standards, software, schemas, spreadsheets, SOPs…
20 questions they want answered
Summer Schools
Picking Pain Points. Keeping it Real.
Project Directors: Data remains with us. We control who sees what. Just enough exchange. Responsibility.
PALs: Spreadsheets. Yellow Pages. Standard Operating Procedures.
SysMO SEEK
Interlinking Assets Catalogue
Assets Catalogue. Archive. Social Network. Sharing Space. Gateway.
Yellow Pages: People. Expertise. Projects. Institutions. Facilities. Studies.
Data: Experimental data sets and analysed results. Gateway to data stores – SABIO-RK, ‘omics.
Models: Store. Stimulate. Publish. Curate. Gateway to COPASI, JWS Online, BioModels.
Processes:
Laboratory protocols – Standard Operating Procedures
Bioinformatics analyses – computational workflows - Taverna
Model population and validation – workflows – Taverna
Gateway to myExperiment, MolMeth, OpenWetWare…
SysMO SEEK
Is there any group generating kinetic data?
Is this data available?
Who is working with which organism?
What methods are being used to determine enzyme activity?
Under which experimental conditions are my partners measuring glucose concentration?
Social Networks
Access Permissions
Protect: Just Enough Sharing
Reusing myExperiment
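The "just enough sharing" idea on this slide, where contributors decide who sees what, could be sketched roughly as below. The slides do not describe the actual permission model, so the four scope levels, the class names and the `can_view` function are illustrative assumptions, not the SysMO SEEK or myExperiment implementation.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional, Set


class Scope(IntEnum):
    """Hypothetical sharing levels, from most to least restrictive."""
    OWNER = 0       # only the contributor
    PROJECT = 1     # members of the contributor's project
    CONSORTIUM = 2  # any registered member of any project
    PUBLIC = 3      # anyone


@dataclass(frozen=True)
class User:
    name: str
    project: Optional[str] = None  # None means not a consortium member


@dataclass
class Asset:
    title: str
    owner: User
    scope: Scope = Scope.OWNER  # private by default: sharing is opt-in
    shared_with: Set[str] = field(default_factory=set)  # explicit grants by user name


def can_view(user: User, asset: Asset) -> bool:
    """Return True if the user may see the asset under the assumed policy."""
    if user.name == asset.owner.name or user.name in asset.shared_with:
        return True
    if asset.scope == Scope.PUBLIC:
        return True
    if asset.scope == Scope.CONSORTIUM:
        return user.project is not None
    if asset.scope == Scope.PROJECT:
        return user.project is not None and user.project == asset.owner.project
    return False
```

Defaulting every asset to `Scope.OWNER` mirrors the "we control who sees what" position quoted from the project directors: nothing is visible until the contributor widens the scope or names a collaborator.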
Attribution
Credit
Reward and Provenance
Reusing myExperiment
Just Enough Results Model
Harvest standards, e.g. MIAME (MIBBI.org), consortium schemas and spreadsheets
JERMs for each data type – microarray, metabolomics, proteomics
Map to projects
Distribute as spreadsheet templates
“I only want to collect and share just enough results”
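One way to picture a JERM distributed as a spreadsheet template is a minimal list of required columns per data type, plus a check that a submitted row fills them in. The field names below are placeholders invented for illustration; the slides do not list the real JERM fields.

```python
import csv
import io

# Hypothetical minimal field sets per data type (not the real JERM definitions).
JERM_FIELDS = {
    "microarray": ["project", "organism", "strain", "condition", "platform"],
    "metabolomics": ["project", "organism", "strain", "condition", "metabolite", "unit"],
}


def template_csv(data_type: str) -> str:
    """Return a header-only CSV that could be handed out as a template."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(JERM_FIELDS[data_type])
    return buf.getvalue()


def missing_fields(data_type: str, row: dict) -> list:
    """List the required fields that are absent or blank in a submitted row."""
    return [name for name in JERM_FIELDS[data_type]
            if not str(row.get(name, "")).strip()]
```

Keeping the required set small is the point of "just enough": a row passes as soon as the handful of comparison-critical columns are present, and any extra columns a lab already uses are simply ignored.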
Keeping data safe at home
COSMIC and BaCell (Alfresco, document management system)
[Diagram labels: Project X, Content Management System, harvest, Harvester, Extractor, Register, Assets Catalogue, Search, Fetch]
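The harvesting arrangement named in the diagram labels can be sketched as follows: data stays in the project's own store, the central catalogue registers only metadata and a pointer, and a fetch goes back to the home store. Everything here (class names, method names, the in-memory store) is a hypothetical stand-in, not the SysMO-DB code or the Alfresco API.

```python
class ProjectStore:
    """Stand-in for a project's own content management system."""

    def __init__(self, name):
        self.name = name
        self._files = {}  # path -> (content, metadata)

    def put(self, path, content, **metadata):
        self._files[path] = (content, metadata)

    def paths(self):
        return list(self._files)

    def metadata(self, path):
        return dict(self._files[path][1])

    def fetch(self, path):
        return self._files[path][0]


class AssetsCatalogue:
    """Central index: holds metadata and pointers, never the data itself."""

    def __init__(self):
        self._entries = []

    def register(self, store, path, metadata):
        self._entries.append({"store": store, "path": path, "metadata": metadata})

    def search(self, term):
        term = term.lower()
        return [entry for entry in self._entries
                if any(term in str(value).lower()
                       for value in entry["metadata"].values())]


def harvest(store, catalogue):
    """Extract metadata for every file in the store and register it centrally."""
    for path in store.paths():
        catalogue.register(store, path, store.metadata(path))


def fetch(entry):
    """Resolve a catalogue entry by asking the home store for the content."""
    return entry["store"].fetch(entry["path"])
```

Because the catalogue keeps only a pointer, a project could in principle refuse or filter the fetch at its own end, which fits the "data remains with us" requirement raised earlier in the talk.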
Quality of Data – Reliable Interpretation
Publication standards by stealth
Controlled vocabulary plug in
BioPortal
Observations - PALs
Dissemination of standards
Debunking myths
Tools exchange
Modeller – Experimentalist
Trust
Like, talking together
Transcended the projects
Project power politics
PALs did their jobs….
Observations - Sharing
Methods sharing.
Protective of models. In-progress vs published models. Access and version management. Curator-Rival conflict.
Reluctant to share data. Even within their own projects.
Legacy spreadsheets dominate.
Curation practices vary.
Centralised archive take-up.
Point to Point Exchange.
Nature 461, 145 (10 Sept 2009)
SysMO2 Musical Chairs
Incentive Model for Sharing
Future Funding
Phase 2 - SysMO2: Projects dropped and added. People dropped and added. Institutions dropped and added. Others reconstituted and added.
Incentive Model for Sharing? Convenience, Added Value? Personal benefit? Consortium Policies?
A Platform for Systems Biology Exchange
Preservation and archiving. Widen participation of mothership.
Community Exchange Bazaar: Widen adoption of platform and enable exchange.
Accelerant to standards: Adoption of JERM. Curation tools. CMS + JERM bundling.
Widen access to External Resources, incl. publication: Added value and convenience. Preparation for publishing.
EMBL-EBI ‘omics datasets
Public Model repositories
ISA-Tab, SBML
Research Objects and e-Laboratories
Packaged Assets: Workflows linked to models linked to data linked to SOPs.
Community standards.
Mixed resources: External and central.
Trust.
Spreadsheets: Integration via RDF linked data.
myExperiment, MethodBox, NEMA, BioCatalogue
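As a rough illustration of "integration via RDF linked data" for spreadsheets, each row can be turned into one subject with one triple per column, serialised as N-Triples. The base URI and the row/field naming scheme are assumptions made for this sketch; the slides do not say which vocabulary or tooling was used.

```python
from urllib.parse import quote


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def row_to_ntriples(row_id, row, base="http://example.org/sysmo/"):
    """Turn one spreadsheet row (column name -> cell value) into N-Triples lines.

    The base URI is a placeholder; column names are percent-encoded so they
    are safe to use inside a predicate URI.
    """
    subject = f"<{base}row/{quote(str(row_id), safe='')}>"
    lines = []
    for column, value in row.items():
        predicate = f"<{base}field/{quote(column, safe='')}>"
        lines.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return lines
```

Once rows from different consortium spreadsheets share predicates (for example the same `organism` field), they can be loaded into one triple store and queried together, which is the appeal of the linked-data route over merging the spreadsheets by hand.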
Summary
http://www.sysmo-db.org
Reality is messy.
Extreme Technology Determinism vs Voluntarist Sociocultural shaping.
Extreme and continuous partnership with users. Act Local, Think Global.
Agile development environment facilitated a stream of features to tackle pain points. Leverage other e-Laboratories. Maintain scientists’ buy-in.
Socio-Political Axis dominates the Technical Axis. Collaboration evolutions. Confidence in exchange. Consortium Policies.
SysMO-DB Team
University of Stellenbosch, South Africa
University of Manchester, UK
Jacky Snoep
EML Research gGmbH, Germany
Isabel Rojas
University of Manchester, UK
Olga Krebs
Wolfgang Müller
Sergejs Aleksejevs
Carole Goble
Stuart Owen
Katy Wolstencroft
Finn Bacall
Acknowledgements
myExperiment: http://www.myexperiment.org
Taverna: http://www.mygrid.org.uk
JWS Online: http://jjj.biochem.sun.ac.za/
SABIO-RK: http://sabio.villa-bosch.de/