

Developing a protein-interactions ontology

Esther Ratsch

European Media Laboratory

PIOG

• Protein Interactions Ontology Group
• Computer scientists:
– Philipp Cimiano Lavin (IMS Stuttgart, EML Heidelberg)
– Isabel Rojas (EML Heidelberg)
• Computational linguists:
– Uwe Reyle (IMS Stuttgart)
– Jasmin Saric (EML Heidelberg)
• Biologists:
– Esther Ratsch (EML Heidelberg)
– Jörg Schultz (MPI for Molecular Genetics Berlin)
– Ulrike Wittig (EML Heidelberg)

Motivation

• Why protein interactions?
– protein function analysis
– larger datasets

• Why an ontology?
– clear domain model
– storage and understanding of data
– information retrieval from text
– retrieve hidden information, inferencing

What is a signal transduction pathway?

• signal from outside is transduced to the nucleus

• often phosphorylation cascade

[Figure labels: signal, change, transcription]

Why are they important?

• control of cellular processes

• communication between cells

• response to environmental changes

• regulatory network

• stable system: single mutations may be overridden by other pathways

• complex network enables complex behaviour

Jak-Stat pathway

[Figure: Jak-Stat pathway. Labels: ligand, cytokine receptors, JAKs, STAT monomers, phosphorylated (P) tyrosine residues, nucleus, target genes]

General approach

• Identify scope of the ontology

• Identify concepts involved and their properties

• How to represent them?

• Define rules and constraints

• Formalisation

Scope Concepts Representation Rules/Constraints Formalisation

The scope

• Ontology that represents interactions between proteins and other cellular compounds

• Restriction on molecular detail: amino acids

• Concentration on signal transduction pathways in initial phase

• No quantitative properties are modeled


Identify concepts: Interacting compounds

• Different kinds of compounds: proteins, genes/DNA, ions, ...

• Composition of compounds, e.g. amino acids, domains

[Figure: Domain organisation of Stat proteins. Labels: TAD, LZ, DBD, SH3, SH2, Y; DNA region, Jak, Stat]


Properties of compounds

• Characteristics: molecular weight, sequence, isoelectric point...

• Interaction potential: modifications, location, binding partners


Identify concepts: Interactions I

• Different types of interactions: phosphorylation, binding, translocation ...

• Other classification: grouping of > 100 verbs (Swissprot) into 11 non-disjoint classes:
– Control/Regulation
– Biochemical Interactions
– Logical Interactions
– Bind/Dissociate
– Formation
– Integrity
– Availability
– Change of Location
– Modification of Structure
– Special Processes/Reactions
– Order


Representation of proteins

• General characteristics: sequence, molecular weight, ...

• Protein state:
– location
– list of modifications
– list of binding partners

StatState3(cytoplasm, phosphorylatedAtResidue701, Stat)

JakState1(cytoplasm, none, cytokine-receptor)
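A protein state with the three components named above (location, modifications, binding partners) can be sketched as a small immutable record. This is only an illustrative Python sketch, not the group's implementation: the class name `ProteinState` and its field names are assumptions, while the two example values mirror the states shown on the slide.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ProteinState:
    """Hypothetical record: a protein plus the three state components from the slide."""
    protein: str
    location: str
    modifications: FrozenSet[str] = frozenset()      # empty set plays the role of "none"
    binding_partners: FrozenSet[str] = frozenset()


# StatState3(cytoplasm, phosphorylatedAtResidue701, Stat)
stat_state3 = ProteinState(
    protein="Stat",
    location="cytoplasm",
    modifications=frozenset({"phosphorylatedAtResidue701"}),
    binding_partners=frozenset({"Stat"}),
)

# JakState1(cytoplasm, none, cytokine-receptor)
jak_state1 = ProteinState(
    protein="Jak",
    location="cytoplasm",
    binding_partners=frozenset({"cytokine-receptor"}),
)
```

Making the record immutable means a change of state yields a new state object, which fits the idea of describing interactions as transitions between states.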


Representation of interactions

• Event with pre- and postconditions

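[Figure: a state in which p does not hold ("not p") is followed by event e, which results in a state in which p holds. Example: phosphorylation turns "not phosphorylated" into "phosphorylated" (P).]

$$ e,\, s,\, s' : \quad s : \neg p \;\wedge\; s \prec e \;\wedge\; s' : p \;\wedge\; \mathrm{Res}(s', e) $$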
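The view of an interaction as an event with a precondition ("not p") and a postcondition ("p") can be sketched in a few lines of Python. The names `State`, `Event`, and `apply` are hypothetical and chosen only for this illustration; a state is simplified to the set of propositions that hold in it.

```python
from typing import FrozenSet, NamedTuple

State = FrozenSet[str]  # the propositions that hold in a state


class Event(NamedTuple):
    name: str
    p: str  # proposition that is false before the event and true after it


def apply(event: Event, s: State) -> State:
    """Return the result state s' of `event`, enforcing the precondition 'not p'."""
    if event.p in s:
        raise ValueError(f"precondition violated: {event.p!r} already holds")
    return s | {event.p}  # postcondition: p holds in s'


phosphorylation = Event("phosphorylation", p="phosphorylated")
before: State = frozenset()
after = apply(phosphorylation, before)
```

Applying the same event to `after` raises an error, because the precondition "not phosphorylated" no longer holds there.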


Rules and constraints

• Simple hierarchies: nucleolus inside nucleus, Stat1 is a Stat is a protein

• Rules for the definition of interactions

• Consistency checking

• Knowledge retrieval
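The simple hierarchies mentioned above (nucleolus inside nucleus; Stat1 is a Stat is a protein) are transitive, so a query only has to follow parent links. The following Python sketch assumes each concept has at most one parent per relation; the dictionary names `IS_A` and `INSIDE` and the function `reachable` are hypothetical.

```python
from typing import Dict, Optional

# Single-parent links taken from the examples on the slide.
IS_A: Dict[str, str] = {"Stat1": "Stat", "Stat": "protein"}
INSIDE: Dict[str, str] = {"nucleolus": "nucleus"}


def reachable(relation: Dict[str, str], start: str, target: str) -> bool:
    """Follow parent links transitively from `start`, looking for `target`.

    The check is reflexive: every concept reaches itself.
    """
    seen = set()
    node: Optional[str] = start
    while node is not None and node not in seen:  # `seen` guards against cycles
        if node == target:
            return True
        seen.add(node)
        node = relation.get(node)
    return False


stat1_is_protein = reachable(IS_A, "Stat1", "protein")
nucleus_in_nucleolus = reachable(INSIDE, "nucleus", "nucleolus")
```

Here `stat1_is_protein` is true because the chain Stat1, Stat, protein exists, while `nucleus_in_nucleolus` is false because the relation is only followed upwards.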


Rules and constraints: example

• "Protein A is phosphorylated by B at position X."
– A and B are located in the same compartment
– A was not modified at X before
– A is phosphorylated at X afterwards
– B is a protein kinase, which is a protein
– dependent on X, B is either a S/T-kinase or a Y-kinase
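The constraints attached to "Protein A is phosphorylated by B at position X" can be read as a consistency check on a candidate statement. The Python sketch below is an assumption-laden illustration: the class `Protein`, the table `ACCEPTED`, and the function `check_phosphorylation` are hypothetical names, and the residue kind at X is passed in explicitly rather than looked up from a sequence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Protein:
    name: str
    compartment: str
    kinase_type: Optional[str] = None  # "S/T", "Y", or None if not a kinase
    modified_residues: Dict[str, str] = field(default_factory=dict)  # position -> modification


# Residue kinds each kinase type accepts (serine/threonine vs. tyrosine).
ACCEPTED = {"S/T": {"serine", "threonine"}, "Y": {"tyrosine"}}


def check_phosphorylation(a: Protein, b: Protein, position: str, residue: str) -> List[str]:
    """Return the violated constraints for 'A is phosphorylated by B at position X'."""
    problems = []
    if a.compartment != b.compartment:
        problems.append("A and B are not in the same compartment")
    if position in a.modified_residues:
        problems.append("A is already modified at X")
    if b.kinase_type is None:
        problems.append("B is not a protein kinase")
    elif residue not in ACCEPTED.get(b.kinase_type, set()):
        problems.append("kinase type of B does not match the residue at X")
    return problems


stat = Protein("Stat", "cytoplasm")
jak = Protein("Jak", "cytoplasm", kinase_type="Y")
ok = check_phosphorylation(stat, jak, "701", "tyrosine")      # no violations
mismatch = check_phosphorylation(stat, jak, "727", "serine")  # Y-kinase on a serine
```

An empty result means the statement is consistent with the rules; the postcondition (A is phosphorylated at X afterwards) would then be recorded by the caller.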


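Formalisation: phosphorylation

• Phosphorylation of a protein by a kinase at a distinct residue

$$ \forall s, s', e, P, Q, R \; \big( e : \mathit{phosphorylation}(P, Q, R) \rightarrow \mathit{protein}(P) \wedge \mathit{proteinKinase}(Q) \wedge \mathit{isResidueOf}(R, P) \wedge s : \neg \mathit{modified}(R) \wedge s' : \mathit{phosphorylated}(R) \wedge \mathit{Result}(e) = s' \wedge s \prec e \big) $$

$$ \forall e, P, Q, R \; \big( e : \mathit{phosphorylation}(P, Q, R) \rightarrow e : \mathit{interaction}(P, Q) \big) $$

$$ \forall e, P, Q \; \big( e : \mathit{interaction}(P, Q) \rightarrow \mathit{compoundLocation}(P) = \mathit{compoundLocation}(Q) \big) $$

$$ \forall P \; \big( \mathit{proteinKinase}(P) \rightarrow \mathit{protein}(P) \big) $$

$$ \forall R, s \; \big( s : \mathit{modified}(R) \rightarrow ( \mathit{phosphorylated}(R) \vee \mathit{glycosylated}(R) \vee \dots ) \big) $$

• S/T-kinase phosphorylation

$$ \forall P, Q, R, e \; \big( e : \text{S/T-}\mathit{phosphorylation}(P, Q, R) \rightarrow e : \mathit{phosphorylation}(P, Q, R) \wedge ( \mathit{serine}(R) \vee \mathit{threonine}(R) ) \big) $$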

Challenges met

• Multidisciplinarity of the group
– Different vocabularies → clear expression, fewer ambiguities
– Different goals, different needs → not restricted to one goal
– Different experiences → mutual benefit

• Domain

Complexity of the domain

• Granularity of information
– detail of compound part
  • protein: Stat
  • domain: SH2-domain
  • amino acid: tyrosine701
– detail of protein identity
  • protein family: Jak, Stat
  • protein type: Jak2, Stat5
  • organism specific protein: Jak2_human, Jak2_rat

Complexity of the domain II

• Description detail:
– not known: no data available
– doesn't have: no binding partners
– don't care: not important for a certain interaction
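The three cases (not known, doesn't have, don't care) need distinct values, since an empty list of binding partners is a positive statement while missing data is not. One possible encoding, sketched in Python with hypothetical names (`Detail`, `Partners`, `matches`), uses two sentinel values next to ordinary sets; treating unknown data as unable to confirm a requirement is an assumption of this sketch, not something stated on the slide.

```python
from enum import Enum
from typing import FrozenSet, Union


class Detail(Enum):
    NOT_KNOWN = "not known"    # no data available
    DONT_CARE = "don't care"   # not important for a certain interaction


# An empty frozenset means "doesn't have": the protein has no binding partners.
Partners = Union[Detail, FrozenSet[str]]


def matches(required: Partners, actual: Partners) -> bool:
    """Does a state's `actual` binding-partner value satisfy an interaction's `required` one?"""
    if required is Detail.DONT_CARE:
        return True
    if actual is Detail.NOT_KNOWN:
        return False  # conservative assumption: unknown data cannot confirm a requirement
    return required == actual


no_partners: Partners = frozenset()
irrelevant = matches(Detail.DONT_CARE, Detail.NOT_KNOWN)
unconfirmed = matches(no_partners, Detail.NOT_KNOWN)
```

With this encoding, `irrelevant` is true and `unconfirmed` is false, so "doesn't have" and "not known" are never confused.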

What comes next?

• Go on with development of ontology

• Projects using the ontology:
– integration in larger ontology on metabolic pathways
– application to TIGERSearch (see poster)

Acknowledgements

• Protein Interactions Ontology Group
• Computer scientists:
– Philipp Cimiano Lavin (IMS Stuttgart, EML Heidelberg)
– Isabel Rojas (EML Heidelberg)
• Computational linguists:
– Uwe Reyle (IMS Stuttgart)
– Jasmin Saric (EML Heidelberg)
• Biologists:
– Esther Ratsch (EML Heidelberg)
– Jörg Schultz (MPI for Molecular Genetics Berlin)
– Ulrike Wittig (EML Heidelberg)