Using pathway information to understand omics data
NuGO: the European Nutrigenomics Organisation
Using pathway information to understand omics data
Chris Evelo, NuGO WP7
BiGCaT Bioinformatics Maastricht
NuGO partners (map slide): Un Oslo, Un Munich, Un Florence, Un Balearic Illes, Un Cork, Trinity, Un Ulster, Rowett, Un Newcastle, Un Reading, IFR, DiFE, Un Krakow, Inserm Marseille, TNO, Un Wageningen, Un Maastricht, EBI, Un Lund, Rikilt, Rivm
Understanding Array data
Typical procedure
1. Annotate the reporters with something useful (UniProt!)
2. Sort based on fold change
3. Search for your favorite genes/proteins
4. Throw away 95% of the array
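A minimal sketch of steps 2 and 4 of the typical procedure, to make concrete how much of the array is discarded. The reporter names and log2 fold changes are made-up illustration data, and `top_reporters` and `keep_fraction` are hypothetical names, not part of any tool mentioned in this talk.

```python
def top_reporters(fold_changes, keep_fraction=0.05):
    """Rank reporters by absolute log2 fold change and keep the top fraction."""
    ranked = sorted(fold_changes.items(),
                    key=lambda item: abs(item[1]),
                    reverse=True)
    # Always keep at least one reporter, even on tiny arrays.
    n_keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Hypothetical array with 40 reporters and log2 fold changes in [-1.5, 1.5].
example = {f"reporter_{i}": (i % 7 - 3) * 0.5 for i in range(40)}

kept = top_reporters(example)
discarded = len(example) - len(kept)
print(f"kept {len(kept)} reporters, threw away {discarded}")
```

Everything outside the kept fraction never reaches the biologist, which is the problem that pathway-based views try to address.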
Understanding Array data
“Advanced” procedures
o Gene clustering or principal component analysis
o Get groups of genes with parallel expression patterns
o Useful for diagnosis
o Not adding much to understanding (unless combined)
Mapping
Annotation/coupling
Best known: GenMAPP
Free, academic initiative with editable mapps; collaborates with NuGO
Best known: GenMAPP
• Full content of GO database
• Textbook like local mapps
• Geneboxes with active backpages, coupled to online databases
• Visualize anything numerical (fold changes on arrays, p-values, present calls, proteomics results)
• Update mapps yourself
GenMAPP: Full GO content
GenMAPP: Textbook like maps
Extensive backpages present, with links to online databases
2D gels of 3T3-L1 (pre)-adipocytes
Enlarged sections; gels derived from:
A: 3T3-L1 pre-adipocytes
B: 3T3-L1 adipocytes
C: 3T3-L1 adipocytes with caloric restriction
D: 3T3-L1 adipocytes with caloric restriction and TNF-α
GenMAPP: visualize anything numerical
Example
Proteomics results (2D gels with GC-MS identification). Fasting/feeding study shows regulation of glycolysis (data from Johan Renes, UM).
Other useful things:
- p-values, present calls
- presence in clusters
- presence in QTLs
Update mapps yourself
You can do anything. E.g. add genes, annotation, backpage information, graphics
Next page shows a combination of metabolic mapps.
“The Nutrigenomics Masterpiece” created by Milka Sokolović (AMC Amsterdam)
MAPPfinder
• Ranks mapps where relatively many changes occur
• Useful to find unexpected pathways
• Statistics hardly developed
MAPPfinder z-score
z-score = (number of genes/proteins changed on this mapp − expected number of changes) / standard deviation of observed number
Many dependencies to overcome
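The z-score on this slide (observed changes on a mapp minus expected changes, divided by the standard deviation of the observed number) can be sketched as below. The slide does not say how the expectation and standard deviation are obtained; this sketch assumes a hypergeometric null model (genes on a mapp drawn without replacement from all measured genes). The function and argument names are hypothetical. Note that the model treats genes as independent, which is exactly the kind of dependency issue the slide warns about.

```python
import math


def mapp_z_score(changed_on_mapp, measured_on_mapp, changed_total, measured_total):
    """z = (observed - expected) / sd, assuming a hypergeometric null model."""
    if measured_total < 2:
        raise ValueError("need at least two measured genes")
    fraction_changed = changed_total / measured_total
    # Expected number of changed genes on a mapp of this size.
    expected = measured_on_mapp * fraction_changed
    # Hypergeometric variance, including the finite-population correction.
    variance = (measured_on_mapp
                * fraction_changed
                * (1 - fraction_changed)
                * (1 - (measured_on_mapp - 1) / (measured_total - 1)))
    if variance <= 0:
        raise ValueError("z-score undefined: no variation under the null model")
    return (changed_on_mapp - expected) / math.sqrt(variance)


# Hypothetical numbers: 1000 genes measured, 100 changed overall,
# 20 of the measured genes sit on this mapp and 6 of those changed.
z = mapp_z_score(changed_on_mapp=6, measured_on_mapp=20,
                 changed_total=100, measured_total=1000)
print(round(z, 2))
```

A mapp with exactly the expected number of changes scores 0; large positive values flag mapps with relatively many changes.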
MAPPfinder
• Next example from heart failure study (Schroen et al. Circ Res 2004; 95: 506-514)
GenMAPP: Full GO content
Scientists know GenMapp
Advantages:
• Free
• Runs on (high end) MS Windows
• Relatively easy to use
• Reasonable visualization
• Some pathway statistics
• Interesting content (including GO, KEGG)
• Content editable
• Adopting standards (e.g. BioPAX)
• Soon to become open source
Scientists know GenMapp
Disadvantages:
• Small academic initiative, uncertain lifespan
• No info on reactions, metabolites, location
• No change (e.g. time course) visualization
• Hard to cope with ambiguous reporters (we are working on that)
• Content could be better!
MetaCore example
GeneGo, Inc
• Systems Reconstruction™ Technology
www.genego.com
Concurrent visualization of different data types: Agilent, Affymetrix, proteomic, SAGE
GeneGo: primitive view of multiple conditions
Can you really see what happens?
Build new network using MetaCore™ from GeneGo
• Around p53 protein
• Making use of biological DB
• Filtered to reduce complexity:
– for ‘rat ortholog’
– for ‘transcriptional regulation’
– for ‘liver’
Filtering needed to reduce complexity
Datasources 1
GenMAPP local MAPPs:
Largely created by a single postdoc (Dr. Kam Dahlquist).
Datasources 2
KEGG:
Older pathway database (Kyoto, Japan), at the enzyme code (EC) level.
Example: the Homo sapiens urea cycle Mapp
A converted KEGG Mapp
Note that not all ECs were converted and that they don’t have backpages.
Datasources 2
KEGG Conversion:
How would you convert EC codes to Swissprot codes?
1) Go to Swissprot, look for EC code
2) Add all proteins with that EC code to GenMapp backpage
Example: the superoxide dismutase reaction would have Cu/Zn-SOD, Mn-SOD and Ex-SOD in the backpage… (and that is not what we usually want).
Note that many other tools use KEGG converted pathways (e.g. Spotfire DecisionSite, GeneGo, Ingenuity)
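The two conversion steps above can be sketched as follows, to show why a plain EC lookup produces ambiguous backpages. The lookup table is a hand-made stand-in for a real Swissprot query and holds only the superoxide dismutase example from this slide (EC 1.15.1.1). The arginase code (EC 3.5.3.1) is deliberately left out of the table to illustrate an unconverted EC. All function names are hypothetical.

```python
# Hypothetical, hand-made stand-in for "go to Swissprot, look for EC code".
EC_TO_PROTEINS = {
    "1.15.1.1": ["Cu/Zn-SOD", "Mn-SOD", "Ex-SOD"],  # superoxide dismutase
}


def backpage_entries(ec_codes, ec_to_proteins):
    """Step 2: attach every protein carrying an EC code to that genebox."""
    return {ec: list(ec_to_proteins.get(ec, [])) for ec in ec_codes}


def ambiguous(entries):
    """EC codes that expanded to more than one protein."""
    return sorted(ec for ec, proteins in entries.items() if len(proteins) > 1)


def unconverted(entries):
    """EC codes for which no protein was found, so no backpage exists."""
    return sorted(ec for ec, proteins in entries.items() if not proteins)


entries = backpage_entries(["1.15.1.1", "3.5.3.1"], EC_TO_PROTEINS)
print("ambiguous:", ambiguous(entries))
print("unconverted:", unconverted(entries))
```

The single superoxide dismutase genebox ends up pointing at three different proteins, which is the one-to-many effect the example warns about.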
Datasources 2
KEGG:
Another example: Apoptosis KEGG Mapp
A contributed Mapp
Somebody manually converted this Mapp!
Great work… But, there are only four of these
Datasources 3
Gene Ontology Database:
Simple tree structure database with a lot of biological content (biologists know and like it).
Automatic annotation possible even for ESTs
See structure in MappFinder (1) (or use GO browser)
Datasources 4
Alternative programs like GeneGo:
Based on expert knowledge (20 Russian biochemists).
NuGO pathway data collection workflow
Combine and forward existing maps to limited group of experts
Text mining from key genes/metabolites
Forward improved maps to limited group of experts
Collect back page info
Forward new draft to a larger group of experts within NuGO
Develop storage format plus tools
Think of best way to store pathway information
Develop/adapt entry tools plus converters
Test resulting maps
Make maps available
Working with Reactome, GenMapp and BioPax
Diagram labels: Expert data, Reactome, BioPAX, GMML, BioPAX Plus/GMML 2, Current GenMapp, GenMapp 2, NUGO/EBI, EBI MDP4/GenMapp
With Philippe Rocca and Imre Vastrik (EBI/Reactome) we will define a way to get Reactome views and export them to GenMapp 2.
BiGCaT students created GenMapp 2 – GMML converters with help from Lynn Ferrante (GenMapp.org).
Rachel van Haaften (BiGCaT/NuGO) and Marjan van Erk (TNO/NuGO) visited EBI early 2005 to learn how to do this.
This step has not been taken care of as of yet…
Rachel van Haaften (BiGCaT/NuGO) and Marjan van Erk (TNO/NuGO) will test this and give user feedback.
GMML (GenMapp Markup Language) is a superset of BioPAX 1. BioPAX could contain graphical views (GMML 2 = BioPAX 2).
But how do we make that happen?
Can it help you?
Seeing the errors and getting useful information
A NuGO example: Red Wine Polyphenols (Dr Cristina Luceri)
Clusters in control group representing pathways
[Figure: hierarchical clustering of rat 5, rat 12, rat 14, rat 13, rat 4, rat 3, rat 2, rat 1, rat 15, rat 11, with a scatter plot]
Caused by bad technology and bad design
After adapted normalization:
The bioinformatics
BiGCaT Bioinformatics
• Chris Evelo
• Rachel van Haaften
• Arie van Erk
• Stan Gaj
• Magali Jaillard
• Kitty ter Stege
• Thomasz Kelder
• Gijs Huisman
TNO Zeist
• Rob Stierum
• Marjan van Erk
EBI Hinxton
• Susanna Sansone
• Philippe Rocca
• Imre Vastrik
University Firenze
• Duccio Cavalieri
GenMAPP.org
• Bruce Conklin
• Lynn Ferrante
The Biology
Proteomics
• Johan Renes (UM)
• Chris Evelo (BiGCaT)
The masterpiece
• Milka Sokolović (AMC)
• Wout Lamers (AMC)
• Magali Jaillard (BiGCaT)
Heart Failure
• Blanche Schroen (UM)
• Yigal Pinto (UM)
• Arie van Erk (BiGCaT)
Red Wine Polyphenols
• Cristina Luceri (Firenze)
• At BiGCaT!
RhoA: stolen from
• Rob Stierum (TNO)
Financial contributions: UM, TUe, Senter IOP, WCFS/ICN, Dutch Heart Foundation, NuGO