Cross-species data integration Lars Juhl Jensen EMBL Heidelberg

DESCRIPTION

Centre for Molecular Biology and Neuroscience, Rikshospitalet-Radiumhospitalet, Oslo, Norway, October 25, 2006

TRANSCRIPT

Page 1: Cross-species data integration

Cross-species data integration

Lars Juhl Jensen

EMBL Heidelberg

Page 2: Cross-species data integration

Lars Juhl Jensen

Page 3: Cross-species data integration
Page 4: Cross-species data integration
Page 5: Cross-species data integration

promoter analysis

Page 6: Cross-species data integration

Jensen et al., Bioinformatics, 2000

Page 7: Cross-species data integration

genome visualization

Page 8: Cross-species data integration

Pedersen et al., Journal of Molecular Biology, 2000

Page 9: Cross-species data integration

protein function prediction

Page 10: Cross-species data integration
Page 11: Cross-species data integration
Page 12: Cross-species data integration
Page 13: Cross-species data integration

data integration

Page 14: Cross-species data integration

Jensen et al., Drug Discovery Today: Targets, 2004

Page 15: Cross-species data integration

cell cycle

Page 16: Cross-species data integration

temporal interaction network

Page 17: Cross-species data integration

de Lichtenberg et al., Science, 2005

Page 18: Cross-species data integration

cross-species comparison

Page 19: Cross-species data integration

Jensen et al., Nature, 2006

Page 20: Cross-species data integration

STRING

Page 21: Cross-species data integration
Page 22: Cross-species data integration

373 proteomes

Page 23: Cross-species data integration

Genome Reviews

Page 24: Cross-species data integration

RefSeq

Page 25: Cross-species data integration

Ensembl

Page 26: Cross-species data integration

model organism databases

Page 27: Cross-species data integration

functional interactions

Page 28: Cross-species data integration

genomic context methods

Page 29: Cross-species data integration

gene neighborhood

Page 30: Cross-species data integration
Page 31: Cross-species data integration

gene fusion

Page 32: Cross-species data integration
Page 33: Cross-species data integration

phylogenetic profiles

Page 34: Cross-species data integration
Page 35: Cross-species data integration
Page 36: Cross-species data integration
Page 37: Cross-species data integration
Page 38: Cross-species data integration

Cell

Cellulosomes

Cellulose

Page 39: Cross-species data integration

correct interactions

Page 40: Cross-species data integration

wrong associations

Page 41: Cross-species data integration

gene neighborhood

Page 42: Cross-species data integration

sum of intergenic distances
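The slide names only the raw measure, so the following is a minimal sketch of one plausible reading: add up the gaps between consecutive genes lying between two genes on the same stretch of genome. The function name, the `(name, start, end)` tuple layout, and the coordinates are assumptions made for illustration; how the talk converts this sum into a score is not stated in the transcript.

```python
def sum_intergenic_distances(genes, first, second):
    """Sum the gaps between consecutive genes from `first` to `second`.

    `genes` is a list of (name, start, end) tuples sorted by start
    coordinate (a hypothetical layout, not taken from the talk).
    """
    names = [g[0] for g in genes]
    i, j = sorted((names.index(first), names.index(second)))
    total = 0
    for left, right in zip(genes[i:j], genes[i + 1:j + 1]):
        # Gap between the end of one gene and the start of the next;
        # overlapping genes contribute zero rather than a negative value.
        total += max(0, right[1] - left[2])
    return total


# Made-up coordinates for four genes on one strand.
genes = [("g1", 100, 400), ("g2", 450, 900), ("g3", 920, 1500), ("g4", 3000, 3600)]
close_pair = sum_intergenic_distances(genes, "g1", "g3")
distant_pair = sum_intergenic_distances(genes, "g1", "g4")
```

A small sum suggests the genes sit in a tight run, a large sum that they are only loosely neighbors.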

Page 43: Cross-species data integration
Page 44: Cross-species data integration

gene fusion

Page 45: Cross-species data integration

sequence similarity

Page 46: Cross-species data integration
Page 47: Cross-species data integration

phylogenetic profiles

Page 48: Cross-species data integration

SVD (Singular Value Decomposition)

Page 49: Cross-species data integration

Euclidean distance
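As a rough illustration of the distance step, the sketch below compares presence/absence profiles directly. The SVD reduction named on the previous slide is omitted here, and the gene names and profiles are invented for the example.

```python
import math


def profile_distance(profile_a, profile_b):
    """Euclidean distance between two equal-length phylogenetic profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# Hypothetical profiles over five genomes (1 = gene present, 0 = absent).
profiles = {
    "geneA": [1, 0, 1, 1, 0],
    "geneB": [1, 0, 1, 1, 0],
    "geneC": [0, 1, 0, 1, 1],
}
d_ab = profile_distance(profiles["geneA"], profiles["geneB"])
d_ac = profile_distance(profiles["geneA"], profiles["geneC"])
```

Genes with identical profiles get distance 0; the more genomes in which one gene is present and the other absent, the larger the distance.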

Page 50: Cross-species data integration
Page 51: Cross-species data integration

raw quality scores

Page 52: Cross-species data integration

not comparable

Page 53: Cross-species data integration

sum of intergenic distances

Page 54: Cross-species data integration

sequence similarity

Page 55: Cross-species data integration

Euclidean distance

Page 56: Cross-species data integration

benchmarking

Page 57: Cross-species data integration

calibrate vs. gold standard

Page 58: Cross-species data integration
Page 59: Cross-species data integration

raw quality scores

Page 60: Cross-species data integration

probabilistic scores
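One simple way to turn raw scores into probabilistic ones, sketched under assumptions: bin the raw scores and report, per bin, the fraction of pairs that appear in a gold standard. The bin edges, the toy pairs, and the function name are hypothetical, and the transcript does not say which calibration procedure the talk used.

```python
def calibrate(scored_pairs, gold_standard, bin_edges):
    """Per raw-score bin, return the fraction of pairs in the gold standard.

    scored_pairs: iterable of ((protein_a, protein_b), raw_score)
    gold_standard: set of frozenset pairs assumed to be true associations
    bin_edges: ascending edges; bin k covers [edge_k, edge_k+1)
    """
    counts = [[0, 0] for _ in range(len(bin_edges) - 1)]  # [hits, total]
    for pair, raw in scored_pairs:
        for k in range(len(bin_edges) - 1):
            if bin_edges[k] <= raw < bin_edges[k + 1]:
                counts[k][1] += 1
                if frozenset(pair) in gold_standard:
                    counts[k][0] += 1
                break
    # Empty bins have no estimate, so they map to None.
    return [hits / total if total else None for hits, total in counts]


# Toy data: six scored pairs and a three-pair gold standard.
scored_pairs = [
    (("a", "b"), 0.9), (("a", "c"), 0.8), (("b", "c"), 0.7),
    (("a", "d"), 0.3), (("b", "d"), 0.2), (("c", "d"), 0.1),
]
gold_standard = {frozenset(("a", "b")), frozenset(("b", "c")), frozenset(("c", "d"))}
calibrated = calibrate(scored_pairs, gold_standard, [0.0, 0.5, 1.0])
```

Because every evidence type is mapped onto the same scale (fraction of gold-standard pairs), scores that were not comparable as raw values become comparable after calibration.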

Page 61: Cross-species data integration

curated knowledge

Page 62: Cross-species data integration

KEGG (Kyoto Encyclopedia of Genes and Genomes)

Page 63: Cross-species data integration

Reactome

Page 64: Cross-species data integration

MIPS (Munich Information Center for Protein Sequences)

Page 65: Cross-species data integration

STKE (Signal Transduction Knowledge Environment)

Page 66: Cross-species data integration

primary experimental data

Page 67: Cross-species data integration

many sources

Page 68: Cross-species data integration

many parsers

Page 69: Cross-species data integration

physical protein interactions

Page 70: Cross-species data integration

BIND (Biomolecular Interaction Network Database)

Page 71: Cross-species data integration

GRID (General Repository for Interaction Datasets)

Page 72: Cross-species data integration

MINT (Molecular Interactions Database)

Page 73: Cross-species data integration

DIP (Database of Interacting Proteins)

Page 74: Cross-species data integration

HPRD (Human Protein Reference Database)

Page 75: Cross-species data integration

merge data by publication
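A minimal sketch of what merging by publication could look like: records from different interaction databases that cite the same paper for the same protein pair are collapsed into one entry. The record layout, the database labels in the toy data, and the placeholder publication IDs are assumptions for illustration only.

```python
from collections import defaultdict


def merge_by_publication(records):
    """Collapse records that report the same pair from the same publication.

    records: iterable of (database, publication_id, protein_a, protein_b).
    Returns {(publication_id, frozenset pair): set of source databases}.
    """
    merged = defaultdict(set)
    for database, publication_id, a, b in records:
        # The pair is unordered, so (a, b) and (b, a) land on the same key.
        merged[(publication_id, frozenset((a, b)))].add(database)
    return dict(merged)


# Hypothetical records; "pub1" and "pub2" stand in for real publication IDs.
records = [
    ("BIND", "pub1", "P1", "P2"),
    ("DIP", "pub1", "P2", "P1"),
    ("MINT", "pub2", "P1", "P3"),
]
merged = merge_by_publication(records)
```

Grouping this way keeps one experiment from being counted several times just because several databases imported it.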

Page 76: Cross-species data integration

topology-based scores

Page 77: Cross-species data integration

von Mering et al., Nucleic Acids Research, 2005

Page 78: Cross-species data integration

co-expression

Page 79: Cross-species data integration

GEO (Gene Expression Omnibus)

Page 80: Cross-species data integration

correlation coefficient
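The slide does not say which correlation coefficient is meant; the sketch below assumes the Pearson coefficient, computed over two expression profiles measured across the same conditions. The profiles are invented numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # A constant profile has zero variance, so this raises ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)


# Made-up expression values for three genes across four conditions.
gene_x = [1.0, 2.0, 3.0, 4.0]
gene_y = [2.0, 4.0, 6.0, 8.0]   # rises with gene_x
gene_z = [8.0, 6.0, 4.0, 2.0]   # falls as gene_x rises
r_xy = pearson(gene_x, gene_y)
r_xz = pearson(gene_x, gene_z)
```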

Page 81: Cross-species data integration

literature mining

Page 82: Cross-species data integration

different gene identifiers

Page 83: Cross-species data integration

synonym lists

Page 84: Cross-species data integration

MEDLINE

Page 85: Cross-species data integration

SGD (Saccharomyces Genome Database)

Page 86: Cross-species data integration

The Interactive Fly

Page 87: Cross-species data integration

OMIM (Online Mendelian Inheritance in Man)

Page 88: Cross-species data integration

co-mentioning
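Co-mentioning can be sketched as counting how often two gene names occur in the same abstract. The tokenizer here is deliberately naive, the abstracts are invented sentences, and synonym handling (the previous slides' point) is left out.

```python
from collections import Counter
from itertools import combinations


def count_comentions(abstracts, gene_names):
    """Count, per gene pair, the abstracts that mention both names."""
    counts = Counter()
    for text in abstracts:
        # Naive tokenization: strip commas and periods, split on whitespace.
        tokens = set(text.replace(",", " ").replace(".", " ").split())
        mentioned = sorted(g for g in gene_names if g in tokens)
        for pair in combinations(mentioned, 2):
            counts[pair] += 1
    return counts


# Invented example sentences, not real MEDLINE abstracts.
abstracts = [
    "CYC1 and CYC7 are controlled by HAP1.",
    "HAP1 regulates CYC1 expression.",
    "GAL4 activates transcription.",
]
comentions = count_comentions(abstracts, {"CYC1", "CYC7", "HAP1", "GAL4"})
```

Pairs are stored with their names in alphabetical order, so `comentions[("CYC1", "HAP1")]` is the lookup key for that pair.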

Page 89: Cross-species data integration

NLP (Natural Language Processing)

Page 90: Cross-species data integration

Gene and protein names

Cue words for entity recognition

Verbs for relation extraction

[nxgene The GAL4 gene]

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

Page 91: Cross-species data integration

calibrate vs. gold standard

Page 92: Cross-species data integration
Page 93: Cross-species data integration

combine all evidence
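Once each evidence type carries a probabilistic score, one common way to combine them, assuming the evidence types are independent, is to compute the probability that at least one of them is right. This is a sketch under that assumption; it ignores any correction for the prior probability of an association, and the function name is hypothetical.

```python
def combine_scores(scores):
    """Combine per-evidence probabilities as 1 - prod(1 - s_i).

    Assumes the evidence types are independent of each other.
    """
    p_all_wrong = 1.0
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores must be probabilities in [0, 1]")
        p_all_wrong *= 1.0 - s
    return 1.0 - p_all_wrong


# Example: three weak-to-moderate lines of evidence for one protein pair.
combined = combine_scores([0.6, 0.5, 0.2])
```

The combined score is never lower than the strongest single score, so several independent weak signals can add up to a confident association.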

Page 94: Cross-species data integration

spread over many species

Page 95: Cross-species data integration

transfer by orthology
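A minimal sketch of transferring evidence by orthology: each scored pair in a source species is mapped onto every combination of its orthologs in a target species. The protein names and the one-to-many mapping are hypothetical, and the sketch copies scores unchanged, whereas a real scheme would presumably down-weight transferred evidence.

```python
def transfer_by_orthology(interactions, orthologs):
    """Map scored pairs onto ortholog pairs in another species.

    interactions: {(protein_a, protein_b): score} in the source species
    orthologs: {source_protein: [target_protein, ...]}
    Returns {frozenset({target_a, target_b}): best transferred score}.
    """
    transferred = {}
    for (a, b), score in interactions.items():
        for a2 in orthologs.get(a, []):
            for b2 in orthologs.get(b, []):
                if a2 == b2:
                    continue  # skip self-pairs
                key = frozenset((a2, b2))
                # Keep the strongest evidence when several pairs map here.
                transferred[key] = max(score, transferred.get(key, 0.0))
    return transferred


# Hypothetical yeast-to-human mapping; yB has two human orthologs.
interactions = {("yA", "yB"): 0.8, ("yA", "yC"): 0.4}
orthologs = {"yA": ["hA"], "yB": ["hB1", "hB2"]}
transferred = transfer_by_orthology(interactions, orthologs)
```

The pair involving `yC` is dropped because `yC` has no ortholog in the mapping.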

Page 96: Cross-species data integration

von Mering et al., Nucleic Acids Research, 2005

Page 97: Cross-species data integration

two modes

Page 98: Cross-species data integration
Page 99: Cross-species data integration

orthologous groups

Page 100: Cross-species data integration

von Mering et al., Nucleic Acids Research, 2005

Page 101: Cross-species data integration
Page 102: Cross-species data integration

fuzzy orthology

Page 103: Cross-species data integration

von Mering et al., Nucleic Acids Research, 2005

Page 104: Cross-species data integration

Bayesian scoring scheme

Page 105: Cross-species data integration

Bork et al., Current Opinion in Structural Biology, 2005

Page 106: Cross-species data integration

predicting “mode of action”

Page 107: Cross-species data integration

Jensen et al., Drug Discovery Today: Targets, 2004

Page 108: Cross-species data integration

Jensen et al., Drug Discovery Today: Targets, 2004

Page 109: Cross-species data integration

NetworKIN

Page 110: Cross-species data integration

the idea

Page 111: Cross-species data integration

mass spectrometry

Page 112: Cross-species data integration
Page 113: Cross-species data integration

phosphorylation sites

Page 114: Cross-species data integration

in vivo

Page 115: Cross-species data integration

kinases are unknown

Page 116: Cross-species data integration

sequence motifs

Page 117: Cross-species data integration
Page 118: Cross-species data integration

kinase families

Page 119: Cross-species data integration

overprediction

Page 120: Cross-species data integration

in vitro

Page 121: Cross-species data integration

protein networks

Page 122: Cross-species data integration

STRING

Page 123: Cross-species data integration
Page 124: Cross-species data integration

context

Page 125: Cross-species data integration

in vivo

Page 126: Cross-species data integration

the algorithm

Page 127: Cross-species data integration
Page 128: Cross-species data integration

benchmarking

Page 129: Cross-species data integration

Phospho.ELM

Page 130: Cross-species data integration
Page 131: Cross-species data integration
Page 132: Cross-species data integration

ATM signaling

Page 133: Cross-species data integration
Page 134: Cross-species data integration

ATM phosphorylates Rad50

Page 135: Cross-species data integration
Page 136: Cross-species data integration

summary

Page 137: Cross-species data integration

integration

Page 138: Cross-species data integration

high-throughput data

Page 139: Cross-species data integration

computational methods

Page 140: Cross-species data integration

biological discoveries

Page 141: Cross-species data integration

hypotheses

Page 142: Cross-species data integration

highly specific

Page 143: Cross-species data integration

testable

Page 144: Cross-species data integration

Acknowledgments

The STRING team (EMBL)

– Christian von Mering

– Berend Snel

– Martijn Huynen

– Sean Hooper

– Samuel Chaffron

– Julien Lagarde

– Mathilde Foglierini

– Peer Bork

Literature mining project (EML Research)

– Jasmin Saric

– Rossitza Ouzounova

– Isabel Rojas

Cell cycle project (CBS)

– Ulrik de Lichtenberg

– Thomas Skøt Jensen

– Søren Brunak

The NetworKIN project

– Rune Linding

– Gerard Ostheimer

– Francesca Diella

– Karen Colwill

– Jing Jin

– Rob Russell

– Michael Yaffe

– Tony Pawson