phenotype and trait ontology (pato) and plant phenotypes

Post on 24-Feb-2016

57 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

Phenotype And Trait Ontology (PATO) and plant phenotypes. George Gkoutos. Phenotype And Trait Ontology (PATO). The meaningful cross species and across domain translation of phenotype is essential  phenotype-driven gene function discovery and comparative pathobiology - PowerPoint PPT Presentation

TRANSCRIPT

Phenotype And Trait Ontology (PATO) and plant phenotypes

George Gkoutos

Phenotype And Trait Ontology (PATO)

The meaningful cross species and across domain translation of phenotype is essential phenotype-driven gene function discovery and comparative pathobiology

Goal - “A platform for facilitating mutual understanding and interoperability of phenotype information across

• species, • domains of knowledge,

and amongst people and machines” …..

PATO today

PATO is now being used as a community standard for phenotype description

– many consortia (e.g. Phenoscape, The Virtual Human Physiology project (VPH), IMPC, BIRN, NIF)

– most of the major model organism databases, (e.g. example Flybase, Dictybase, Wormbase, Zfin, Mouse genome database (MGD))

– international projects

PATO’s Semantic Framework

• Conceptual Layer

• Semantic Components Layer

• Unification Layer

• Integration Layer

PATO Conceptual Layer

PATO

Process

Object

EQ

EQ Modellink Entities (E) from GO, CheBI, FMA etc. to Qualities (Q) from PATO

EQ statements

PATO

NBO

MPATH

GO

CHEBIUBERON

CL

UO

Semantic Components Layer

EQ

• Behavior• Physiology• Cell Phenotype• Pathology• UBERON• Measurements(Units Ontology)

Unification Layer

PATO

EQ

CPO

YPO

TO

HPO

FBCV

MP

WBPhenotype

Provision of PATO based equivalence definitions

PATO

EQ

CPO

YPO

TO

HPO

FBCV

MP

WBPhenotype

Integration Layer

Cross Species Data Integration

Cross species integration framework

• A PATO-based cross species phenotype network based on experimental data from 5 model organisms yeast, fly, worm, fish and mouse and human disease phenotypes (OMIM, OrphaNet)

• integration of anatomy and phenotype ontologies– more than 1,000,000 classes and 2,500,000 axioms

• PhenomeNET forms a network with more than 300.000 complex phenotype nodes representing complex phenotypes, diseases, drug indications and adverse reactions

• Semantic similarity measures pairwise comparison of disease and animal phenotypes

• quantitative evaluation based on predicting orthology, pathway, disease• enhance the network e.g.

– semantics e.g Behavior and pathology related phenotypes etc.– methods e.g. text mining, machine learning etc.– resources

Area Under Curve (AUC) = 0.9

Candidate disease gene prioritization

• Predict all known human and mouse disease genes• Adam19 and Fgf15 mouse genes • using zebrafish phenotypes - mammalian homologues of

Cx36.7 and Nkx2.5 are involved in TOF

E: Pulmonary valve (MA)Q: constricted (PATO)

E: heart ventricle wall(MA)Q: hypertrophic (PATO)

E1: Aorta(MA)Q: overlap with (PATO)E2: Membranous interventricular septum (MA)

E1: ventricular septum (MA)Q: closure incomplete (PATO)

Mouse (MP)E1: Aorta(FMA)Q: overlap with (PATO)E2: Membranous part of the interventricular septum (FMA)

E: Interventricular septum (FMA)Q: closure incomplete (PATO)

E: Pulmonary valve (FMA)Q: constricted (PATO)

E: Wall of right ventricle (FMA)Q: hypertrophic (PATO)

Human (HPO)

Data Integration Across Domains of Knowledge

Gene function determination

• bridge the gap between the availability of phenotype infer the functions (impaired given a phenotype observation)

• example: delayed cartilage development

EQ = GO:0051216 ! cartilage development PATO:0000502! delayed

• phenotype data from 5 different species more than 40000 novel gene functions

• evaluation:• manually for biological correctness • predict genetic interactions based on gene functions (semantic

similarity between genes)

• improvement for every species, and significantly improvement for fruitfly, worm and mouse

• GO-based gene function annotations analysis gene expression data• large volume of data from high-throughput mutagenesis screens

Can we do the same for plants ?

PPPP pilot project

• Dataset– manual EQ annotations for various species

• Framework for plant phenotype analyses– build a Plant PhenomeNet

• Analysis– analyze the dataset and quantify the extent of phenotype

similarity among certain groups of alleles/genes– prediction of gene identity for QTL or genetically defined locus– etc

Plant Phenotyping Centers

• Participating centres– National Plant Phenomics Centre– European Plant Phenomics Centre– Julich Plant Phenotyping Centre– Rothamsted Research– INRA

• Goal: PATO-based plant phenotyping analysis framework– Link to PPPP and NSF proposal

Solanaceae Phenotypic Atlas (SPA)

PATO-based methodology to systematically study associations between genotype, phenotype and environment in Solanum

–Nightshade Phenotype Ontology (NPO)– Compare phenotypic similarity across flowering

plants–Generate a Solanum geospatial map– Link environmental features to the Solanum

geospatial map

Questions?

top related