cROP Plant Ontologies & Protein Ontology (PRO). PRO-PO-GO Meeting, Amherst, NY, May 16, 2013. Cathy H. Wu, Ph.D.

Page 1: cROP  Plant  Ontologies  & Protein Ontology (PRO)

cROP Plant Ontologies & Protein Ontology (PRO)

Amherst, NY, May 16, 2013

Cathy H. Wu, Ph.D.

PRO-PO-GO Meeting


Page 3: cROP  Plant  Ontologies  & Protein Ontology (PRO)


PRO Communities

• Ontology Developers
  – GO ontology: Interfaces of GO/PRO complexes; GO definition (e.g., GO:0005109)
  – GO annotation: precise annotation of protein forms in PomBase
  – Cell Ontology: Define cell types based on protein types
  – Annotation Ontology for annotating scientific documents on the web
  – Brucellosis Ontology (IDOBRU), extension of the Infectious Disease Ontology (IDO)
• Semantic Resources
  – Semantic Web Applications in Neuromedicine (SWAN); Neuroscience Information Framework (NIF)
• Pathway/Process-Modeling Resources
  – Reactome, MouseCyc, EcoCyc, MaizeCyc
• Chemical/Proteomic Resources: PubChem, IUPHAR, P3DB, Top-Down Proteomics, PDB
• Pharma/Clinical Communities: Drug Discovery & Disease Biomarker
  – Alzforum
  – Salivaomics KB/SALO (Saliva Ontology): Saliva Biomarkers
  – Clinical flow cytometry, immunology (ImmPort) community

Page 4: cROP  Plant  Ontologies  & Protein Ontology (PRO)

Biological Questions

• List all the genes expressed differentially in the leaves of rice plant varieties IRBB5 and IR24 at the 5-leaf visible growth stage, when the plants were infected with Xanthomonas oryzae pv. oryzae and grown in a growth chamber. IRBB5 is resistant and IR24 is susceptible to rice bacterial blight disease.
• Filter the differentially expressed gene set for those with
  – LRR domains
  – Transmembrane domains (e.g., more than 1)
  – Receptor-like kinase function
  – Plasma membrane cellular location
  – OR those having tryptophan decarboxylase function
  – Tryptophan metabolism
  – Known alleles and homologs with a disease resistance phenotype
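The filter described on this slide can be sketched in a few lines of Python. This is only an illustration: the record fields, the gene names, and the way the criteria are combined (a receptor branch OR a tryptophan branch, each also requiring resistance-linked alleles or homologs) are assumptions, since the slide does not spell out the exact Boolean logic.

```python
from dataclasses import dataclass, field


@dataclass
class GeneRecord:
    """Hypothetical per-gene annotation summary; field names are illustrative."""
    gene_id: str
    domains: set = field(default_factory=set)      # protein domain labels, e.g. {"LRR"}
    transmembrane_count: int = 0                   # number of transmembrane domains
    functions: set = field(default_factory=set)    # molecular function labels
    processes: set = field(default_factory=set)    # biological process labels
    locations: set = field(default_factory=set)    # cellular location labels
    resistance_linked: bool = False                # alleles/homologs with a resistance phenotype


def is_receptor_candidate(g):
    # LRR domain, more than one transmembrane domain, receptor-like kinase, plasma membrane
    return ("LRR" in g.domains
            and g.transmembrane_count > 1
            and "receptor-like kinase" in g.functions
            and "plasma membrane" in g.locations)


def is_tryptophan_candidate(g):
    # Tryptophan decarboxylase function within tryptophan metabolism
    return ("tryptophan decarboxylase" in g.functions
            and "tryptophan metabolism" in g.processes)


def filter_candidates(genes):
    # Assumed combination: (receptor branch OR tryptophan branch) AND resistance-linked
    return [g for g in genes
            if g.resistance_linked
            and (is_receptor_candidate(g) or is_tryptophan_candidate(g))]


# Placeholder genes, not real rice loci
genes = [
    GeneRecord("gene_a", {"LRR"}, 2, {"receptor-like kinase"}, set(), {"plasma membrane"}, True),
    GeneRecord("gene_b", {"LRR"}, 1, {"receptor-like kinase"}, set(), {"plasma membrane"}, True),
    GeneRecord("gene_c", set(), 0, {"tryptophan decarboxylase"}, {"tryptophan metabolism"}, set(), True),
    GeneRecord("gene_d", {"LRR"}, 2, {"receptor-like kinase"}, set(), {"plasma membrane"}, False),
]
candidates = filter_candidates(genes)
```

In practice each field would be populated from ontology-based annotations (GO, InterPro, and so on) rather than free-text labels.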

Page 5: cROP  Plant  Ontologies  & Protein Ontology (PRO)

Annotation: Ontology Requirements

Object: XX
Object type: Plant/gene/gene-product/allele/population
Feature type and ontology:
  – Molecular Function: GO
  – Biological Process: GO
  – Cellular component: GO
  – Plant structure: PO
  – Plant growth stage: PO
  – (bio)chemical: ChEBI
  – Disease: DO
  – Protein Domains: PRO, InterPro
  – Pathway: Pathway??
  – Trait: TO
Attribute and score: PATO
Context: Any of the ontologies, including the environment ontology, for adding context to the annotation.

E.g. PEP carboxylase activity (GO-MF) in maize is required for C4 carbon assimilation (GO-BP). The process occurs in the plastid (GO-CC) of the leaf mesophyll cell (PO).
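One way to picture these requirements is as a record whose features are checked against the ontology expected for each feature type. The sketch below is an assumption-laden illustration: the class names and the validation rule are hypothetical, the labels are free text rather than real term identifiers, and "Pathway" is left out because the slide does not settle on an ontology for it.

```python
from dataclasses import dataclass

# Feature type -> ontologies accepted for it, following the slide's listing.
# "Pathway" is omitted: the slide leaves its ontology open.
FEATURE_ONTOLOGIES = {
    "Molecular Function": {"GO"},
    "Biological Process": {"GO"},
    "Cellular Component": {"GO"},
    "Plant structure": {"PO"},
    "Plant growth stage": {"PO"},
    "(bio)chemical": {"ChEBI"},
    "Disease": {"DO"},
    "Protein domain": {"PRO", "InterPro"},
    "Trait": {"TO"},
}


@dataclass(frozen=True)
class Feature:
    feature_type: str
    ontology: str
    label: str  # free-text label; a real annotation would carry a term ID

    def is_valid(self) -> bool:
        # Valid when the ontology is one of those accepted for this feature type
        return self.ontology in FEATURE_ONTOLOGIES.get(self.feature_type, set())


@dataclass(frozen=True)
class Annotation:
    obj: str
    obj_type: str
    features: tuple

    def invalid_features(self):
        return [f for f in self.features if not f.is_valid()]


# The PEP carboxylase example from the slide, expressed as one annotation
pepc = Annotation(
    obj="PEP carboxylase",
    obj_type="gene-product",
    features=(
        Feature("Molecular Function", "GO", "PEP carboxylase activity"),
        Feature("Biological Process", "GO", "C4 carbon assimilation"),
        Feature("Cellular Component", "GO", "plastid"),
        Feature("Plant structure", "PO", "leaf mesophyll cell"),
    ),
)
```

Calling `pepc.invalid_features()` returns an empty list here, while a feature such as `Feature("Trait", "GO", "...")` would be flagged because traits are expected to come from TO.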

Page 6: cROP  Plant  Ontologies  & Protein Ontology (PRO)

Disease Ontology Example Building genotype-phenotype associations

[Figure: diagram with the nodes Gene: XA21, Allele-A, Allele-B, Oryza genotype, GO: Receptor like Kinase, and GO: response to pathogen, linked by the relations has_function, maps_to, inheres_in, and belongs_to.]

Page 7: cROP  Plant  Ontologies  & Protein Ontology (PRO)

PRO Workflow Data Sources

• Manual annotation (curator, collaborator, user): SourceForge tracker; RACE-PRO
• Semi-automated processing of external databases (e.g., UniProtKB, Reactome, MouseCyc, EcoCyc); coverage of 12 reference genomes in progress
• Integration with text mining: RLIMS-P/eFIP (Phosphorylation and Functional Impact)

RACE-PRO Annotation Interface: Capture knowledge of protein forms/complexes of interest to support integrated analysis

Page 8: cROP  Plant  Ontologies  & Protein Ontology (PRO)

Use of the protein ontology for multi-faceted analysis of biological processes: a case study of the spindle checkpoint. Ross et al. (2013) Front Genet. 4:62. [PMID: 23637705]

PRO representation of the spindle checkpoint

PRO search query to retrieve PRO terms that contain the phrases “spindle checkpoint” or “spindle assembly checkpoint” or “mitotic checkpoint” and combined Cytoscape web view of the search results.

Nodes retrieved by the search are blue; related nodes (parents and children) are gray.
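The search-and-color behavior described here can be mimicked on a toy hierarchy. This is a rough sketch, not the PRO search implementation: the term IDs and texts are made up, and case-insensitive substring matching is an assumption about how the phrases are applied.

```python
PHRASES = ("spindle checkpoint", "spindle assembly checkpoint", "mitotic checkpoint")

# Hypothetical toy hierarchy: term id -> (text, parent ids). Not real PRO content.
TERMS = {
    "T1": ("protein", []),
    "T2": ("kinase involved in the spindle assembly checkpoint", ["T1"]),
    "T3": ("phosphorylated form of T2", ["T2"]),
    "T4": ("mitotic checkpoint component", ["T1"]),
    "T5": ("unrelated transporter", ["T1"]),
}


def search(terms, phrases):
    """Return ids of terms whose text contains any of the phrases."""
    return {tid for tid, (text, _) in terms.items()
            if any(p in text.lower() for p in phrases)}


def neighbors(terms, hits):
    """Return parents and children of the hits, excluding the hits themselves."""
    related = set()
    for tid, (_, parents) in terms.items():
        if tid in hits:
            related.update(parents)      # parents of a hit
        if any(p in hits for p in parents):
            related.add(tid)             # children of a hit
    return related - hits


def color_nodes(terms, phrases):
    """Blue for search hits, gray for their parents and children."""
    hits = search(terms, phrases)
    colors = {tid: "blue" for tid in hits}
    colors.update({tid: "gray" for tid in neighbors(terms, hits)})
    return colors


colors = color_nodes(TERMS, PHRASES)
```

Terms that are neither hits nor direct neighbors (T5 in this toy data) get no color and would not appear in the network view.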

Page 9: cROP  Plant  Ontologies  & Protein Ontology (PRO)

[PMID: 23637705]

Phosphorylated forms of BUB1B in PRO

Four species-independent BUB1B phosphorylated forms (blue nodes).

Display options set to show parents and all children, including organism level terms.

Sequence alignment of human, frog, and mouse BUB1B highlighted to indicate experimentally determined phosphorylation sites (blue) and predicted phosphorylation sites (red).

Page 10: cROP  Plant  Ontologies  & Protein Ontology (PRO)


PRO in iPTMnet Framework

• Data Mining: iProClass database for molecular and omics data integration
• Text Mining: RLIMS-P/eFIP system for knowledge extraction from literature
• Ontology: PRO for knowledge representation of PTM forms
• Web portal linking data and analysis/visualization tools for scientific queries (http://proteininformationresource.org/iPTMnet)

PTM network of enzyme-substrate relationships and protein-protein interactions => iPTMnet with rich relations

Page 11: cROP  Plant  Ontologies  & Protein Ontology (PRO)

PTM Enzyme-Substrate Database

• Literature-curated kinase-substrate data
  – PhosphoSitePlus, Phospho.ELM, HPRD, PhosphoGRID, P3DB, PhosPhAt, UniProtKB, PRO
• Database content
  – Substrates: 28,000; P-Sites: 126,000; Kinases: 700
  – Substrate/site-kinase pairs: 13,000
  – Covering: human, mouse, rat, other vertebrates, Drosophila, C. elegans, yeast and plants
  – Curated phosphorylation papers: 10,000
• Full-scale processing of PubMed abstracts: 22 million
  – Phosphorylation papers identified by RLIMS-P: 143,000
  – Phosphorylation-PPI related papers identified by eFIP: 10,000

Page 12: cROP  Plant  Ontologies  & Protein Ontology (PRO)

iPTM Network

Coupled with functional annotation and biological context (homology, disease, tissue/cell, ...) => Hypothesis generation and discovery

Exploring Relations

• Substrate-centric: What PTM forms of a protein and their modifying enzymes are known?
• Enzyme-centric: What substrates are known for a given PTM enzyme?
• Interaction: What interacting partners are known for each PTM form of a given protein?
• Pathway: What modifications and enzymes are known in a given signaling pathway?

Page 13: cROP  Plant  Ontologies  & Protein Ontology (PRO)

Human BUB1B Phosphorylation Network

• 73 nodes
• 24 phosphorylated forms
• 9 protein kinases
• 10 phospho-specific PPIs

Construction of protein phosphorylation networks by data mining, text mining, and ontology integration: analysis of the spindle checkpoint. Ross et al. (2013) Database (Oxford) (in press).

• BUB1B/Phos:2 interacts specifically with PPP2R5A

• BUB1B/Phos:2 is phosphorylated by two important mitotic kinases: CDK1 and PLK1

• BUB1B interacts with both phosphorylated and unphosphorylated CDC27

• Phosphorylation on CDC27/Phos:1 sites does not regulate CDC27 interaction with BUB1B
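The substrate-centric, enzyme-centric, and interaction query styles can be illustrated on the handful of relations listed on this slide. The sketch below is not how iPTMnet is implemented: the triple layout, the relation names, and the function names are hypothetical, and equating "phosphorylated CDC27" with CDC27/Phos:1 is an assumption.

```python
from collections import defaultdict

# (subject, relation, object) triples taken from the bullets on this slide.
# Relation names are illustrative; mapping "phosphorylated CDC27" to
# CDC27/Phos:1 is an assumption.
RELATIONS = [
    ("CDK1", "phosphorylates", "BUB1B/Phos:2"),
    ("PLK1", "phosphorylates", "BUB1B/Phos:2"),
    ("BUB1B/Phos:2", "interacts_with", "PPP2R5A"),
    ("BUB1B", "interacts_with", "CDC27"),
    ("BUB1B", "interacts_with", "CDC27/Phos:1"),
]


def parent_protein(form):
    """'BUB1B/Phos:2' -> 'BUB1B'; an unmodified name is returned unchanged."""
    return form.split("/")[0]


def substrate_centric(protein):
    """PTM forms of a protein, each mapped to its known modifying enzymes."""
    forms = defaultdict(set)
    for subj, rel, obj in RELATIONS:
        if rel == "phosphorylates" and parent_protein(obj) == protein:
            forms[obj].add(subj)
    return dict(forms)


def enzyme_centric(enzyme):
    """Substrate forms known for a given PTM enzyme."""
    return {obj for subj, rel, obj in RELATIONS
            if rel == "phosphorylates" and subj == enzyme}


def interaction_partners(form):
    """Interacting partners of one specific protein form (symmetric lookup)."""
    partners = set()
    for subj, rel, obj in RELATIONS:
        if rel == "interacts_with":
            if subj == form:
                partners.add(obj)
            elif obj == form:
                partners.add(subj)
    return partners
```

For example, `substrate_centric("BUB1B")` groups CDK1 and PLK1 under BUB1B/Phos:2, and `interaction_partners("BUB1B/Phos:2")` returns only PPP2R5A, reflecting the form-specific interaction noted above.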

Page 14: cROP  Plant  Ontologies  & Protein Ontology (PRO)

BR Signaling

• Brassinosteroids (BRs): a class of growth-promoting hormones that plays a role in plant growth and development.
• BR signaling is highly integrated with the light, gibberellin, and auxin pathways, and crosstalks with other receptor kinase pathways to modulate stomata development and innate immunity.

BR signaling curation

Step 1: Search RLIMS-P with core genes (bri1, bak1, bin2, bsu1, bzr1, bes1) and “brassinosteroid mediated signaling pathway” to identify papers with phosphorylation information (kinase, substrate, site)
Step 2: Use RACE-PRO to curate phosphorylated protein forms, their kinases, PPIs, and associated GO functions, processes, and subcellular components

Page 15: cROP  Plant  Ontologies  & Protein Ontology (PRO)

BR Signaling Pathway

Core proteins and other associated proteins annotated with GO related to BR signaling pathway (blue)

Page 16: cROP  Plant  Ontologies  & Protein Ontology (PRO)

SCF complexes & Auxin/Jasmonate Signaling

• SCF Complexes formed in response to auxin and jasmonate signaling

• Link to ChEBI for small molecule-containing complexes

Cullin-1 Rubylated
