Ontologies for representing, integrating and analyzing phenotypes
Robert Hoehndorf
Department of Genetics, University of Cambridge
21 June 2011
Robert Hoehndorf (University of Cambridge) Phenotype ontologies 21 June 2011 1 / 40
Introduction Motivation
Motivation
Introduction Ontology
Open Biomedical Ontologies (OBO)
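[Figure: overview of OBO ontologies. Granularity labels: Population, Individual, Body, Organ, Tissue, Cell, Organelle, Molecule, Gene, Transcript. Category labels: Physical object, Quality, Function, Process. Ontologies shown: Gene Ontology, Celltype, Sequence Ontology, GO-CC, ChEBI Ontology, Anatomy Ontology, Phenotype Ontology.]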
Ontology: Phenotype and anatomy ontologies
anatomy ontologies: > 100,000 classes
FMA, MA, WA, ZFA, FA, GO-CC, ...
phenotype ontologies: > 20,000 classes
HPO, MP, WBPhenotype, FBcv, APO, ...
quality ontology: > 2,000 classes
PATO
process and function ontologies: > 25,000 classes
Gene Ontology, ...
alignments between anatomy ontologies
UBERON, various mappings
Ontology: Challenges for interoperability
“merely using ontologies [...] does not reduce heterogeneity: it just raises heterogeneity problems to a higher level” [Euzenat, 2007]
implicit knowledge
implicit semantics
weakly formalized
very large
Ontology: Example query
Find all regions in the human and mouse genome sequences that are associated with Tetralogy of Fallot.
Phenotype ontology Tetralogy of Fallot
Tetralogy of Fallot
Tetralogy of Fallot: Human phenotypes
Overriding aorta (HP:0002623)
Ventricular septal defect (HP:0001629)
Pulmonic stenosis (HP:0001642)
Right ventricular hypertrophy (HP:0001667)
Tetralogy of Fallot: Phenotype description syntax
Overriding aorta (HP:0002623):
Q: overlap with (PATO:0001590)
E1: Aorta (FMA:3734)
E2: Membranous part of interventricular septum (FMA:7135)
HP:0002623 EquivalentTo:
phene-of some (has-part some (FMA:3734 and
has-quality some (PATO:0001590 and towards some
FMA:7135)))
Alternative formulation, expressing the overlap as a relation instead of a quality:
HP:0002623 EquivalentTo:
phene-of some (has-part some (FMA:3734 and
overlaps-with some FMA:7135))
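The step from an entity-quality (EQ) description to the quality-based class expression (the one using has-quality and towards) can be sketched in Python. This is only an illustration, not the tooling used for the talk: the `EQ` class and its `to_class_expression` method are hypothetical names, and the output simply mirrors the pattern shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQ:
    """Entity-quality description of a phenotype (hypothetical helper)."""

    quality: str  # PATO class, e.g. "PATO:0001590" (overlap with)
    entity1: str  # primary entity, e.g. "FMA:3734" (Aorta)
    entity2: str  # related entity, e.g. "FMA:7135"

    def to_class_expression(self) -> str:
        # Pattern from the slide:
        # phene-of some (has-part some (E1 and
        #   has-quality some (Q and towards some E2)))
        return (
            f"phene-of some (has-part some ({self.entity1} and "
            f"has-quality some ({self.quality} and towards some "
            f"{self.entity2})))"
        )


overriding_aorta = EQ("PATO:0001590", "FMA:3734", "FMA:7135")
print("HP:0002623 EquivalentTo:", overriding_aorta.to_class_expression())
```

Keeping the EQ triple as data and generating the class expression from it means the same triple could be rendered with a different pattern, such as the relation-based one.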
Tetralogy of Fallot: UBERON human-mouse anatomy equivalences
Overriding aorta (HP:0002623):
Q: overlap with (PATO:0001590)
E1: Aorta (FMA:3734)
FMA:3734 EquivalentTo: MA:0000062
E2: Membranous part of interventricular septum (FMA:7135)
FMA:7135 EquivalentTo: MA:0002939
Tetralogy of Fallot: Phenotype equivalence
Overriding aorta (MP:0000273):
Q: overlap with (PATO:0001590)
E1: Aorta (MA:0000062)
E2: Membranous interventricular septum (MA:0002939)
MP:0000273 EquivalentTo:
phene-of some (has-part some (MA:0000062 and
has-quality some (PATO:0001590 and towards some
MA:0002939)))
Consequence: MP:0000273 EquivalentTo: HP:0002623
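Why the two phenotype classes come out equivalent can be illustrated with a small Python sketch. It is deliberately simplified: it compares EQ definitions syntactically after rewriting anatomy identifiers, whereas the approach described in the talk relies on OWL reasoning. The helper names `normalize` and `equivalent` are hypothetical; the identifiers are the ones given on the slides.

```python
# Anatomy equivalences taken from the slides (UBERON-based mappings).
FMA_TO_MA = {"FMA:3734": "MA:0000062", "FMA:7135": "MA:0002939"}

# EQ definitions as (quality, entity1, entity2) triples.
DEFINITIONS = {
    "HP:0002623": ("PATO:0001590", "FMA:3734", "FMA:7135"),
    "MP:0000273": ("PATO:0001590", "MA:0000062", "MA:0002939"),
}


def normalize(eq):
    """Rewrite human anatomy identifiers to their mouse equivalents."""
    quality, e1, e2 = eq
    return (quality, FMA_TO_MA.get(e1, e1), FMA_TO_MA.get(e2, e2))


def equivalent(a, b):
    """Syntactic check: identical definitions after applying the mapping."""
    return normalize(DEFINITIONS[a]) == normalize(DEFINITIONS[b])


print(equivalent("HP:0002623", "MP:0000273"))  # True
```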
Phenotype ontology Absence
Absence: Absent appendix
Absent appendix:
Q: lacks all parts of type (PATO:0002000)
E1: Human body (FMA:20394)
E2: Appendix (FMA:14542)
AbsentAppendix ≡ LacksParts ⊓ ∃towards.Appendix ⊓ ∃inheresIn.HumanBody (Horrocks, 2007)
AbsentAppendix ≡ LacksParts ⊓ ∃towards.{Appendix} ⊓ ∃inheresIn.HumanBody (Mungall, 2007)
AbsentAppendix ⊑ ∃pheneOf.(HumanBody ⊓ ¬∃hasPart.Appendix) (H et al., 2007, 2011)
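Absence: Absent appendix
AbsentAppendix ⊑ ∃pheneOf.(HumanBody ⊓ ¬∃hasPart.Appendix)
FMA: HumanBody ⊑ ∃hasPart.Appendix
HumanBody(John), AbsentAppendix(x), hasPhene(John, x)
Conflict: the FMA axiom makes HumanBody ⊓ ¬∃hasPart.Appendix unsatisfiable, so asserting AbsentAppendix(x) renders the knowledge base inconsistent.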
Absence: Absent appendix
Removal of conflicting axioms (has-part/part-of in anatomy)
Contextualize anatomy:
Normal ⊓ HumanBody ⊑ ∃hasPart.(Normal ⊓ Appendix)
Use of non-monotonic reasoning:
Normally: HumanBody ⊑ ∃hasPart.Appendix
Circumscription of ¬Normal
Implementation in dlvhex:
IC-has-part(X,Y) :- ind(X), class(Y), inst(X,Z),
CC-normally-has-part(Z,Y), not IC-lacks-has-part(X,Y),
class(Z).
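The effect of the default rule can be imitated in a few lines of Python. This is a rough sketch under simplifying assumptions: the "lacks" facts are given as a fixed set rather than derived by an answer-set solver, and the individuals `john` and `mary` as well as the function name `derive_has_part` are made up for illustration.

```python
def derive_has_part(individuals, classes, inst, normally_has_part, lacks_has_part):
    """Return (individual, class) pairs playing the role of IC-has-part(X,Y)."""
    derived = set()
    for x, z in inst:  # inst(X,Z)
        if x not in individuals or z not in classes:  # ind(X), class(Z)
            continue
        for z2, y in normally_has_part:  # CC-normally-has-part(Z,Y)
            if z2 != z or y not in classes:  # class(Y)
                continue
            if (x, y) not in lacks_has_part:  # not IC-lacks-has-part(X,Y)
                derived.add((x, y))
    return derived


individuals = {"john", "mary"}
classes = {"HumanBody", "Appendix"}
inst = {("john", "HumanBody"), ("mary", "HumanBody")}
normally_has_part = {("HumanBody", "Appendix")}
lacks_has_part = {("john", "Appendix")}  # john has the absent-appendix phenotype

print(derive_has_part(individuals, classes, inst, normally_has_part, lacks_has_part))
```

Only `mary` is concluded to have an appendix. The default is blocked for `john` by the explicit "lacks" fact, which is the non-monotonic behaviour the rule is meant to capture.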
Ontology of phenotypes
Different formal expressions for phenotypes based on
qualities,
anatomical parts,
functions,
processes
enable cross-species integration of phenotypes.
Phenotype ontology Discovering mouse models
Tetralogy of Fallot
Phenotype alignments: Mouse model: Phc1
Phenotype alignments: Tetralogy of Fallot: Phc1
Knowledge representation Modularization
Complexity of automated reasoning
ontologies based on OWL
OWL 2 is based on description logic (SROIQ)
satisfiability in SROIQ is 2NEXPTIME-complete
Modularization
tractable subsets of OWL 2: EL, QL, RL
problem: identify a large (EL, QL, RL)-module of an OWL ontology
AbnormalityOfAppendix ≡ ∃pheneOf.(¬∃hasPart.(Normal ⊓ Appendix)) (not EL)
AbsentAppendix ≡ ∃pheneOf.(¬∃hasPart.Appendix) (not EL)
Inference: AbsentAppendix ⊑ AbnormalityOfAppendix (EL)
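The inferred subclass axiom follows from the two definitions in a few steps. The derivation below is a reconstruction added for clarity and does not appear on the slide. It uses only the monotonicity of existential restrictions and contraposition:

```latex
\begin{align*}
\mathit{Normal} \sqcap \mathit{Appendix}
  &\sqsubseteq \mathit{Appendix} \\
\exists \mathit{hasPart}.(\mathit{Normal} \sqcap \mathit{Appendix})
  &\sqsubseteq \exists \mathit{hasPart}.\mathit{Appendix} \\
\neg \exists \mathit{hasPart}.\mathit{Appendix}
  &\sqsubseteq \neg \exists \mathit{hasPart}.(\mathit{Normal} \sqcap \mathit{Appendix}) \\
\exists \mathit{pheneOf}.(\neg \exists \mathit{hasPart}.\mathit{Appendix})
  &\sqsubseteq \exists \mathit{pheneOf}.(\neg \exists \mathit{hasPart}.(\mathit{Normal} \sqcap \mathit{Appendix})) \\
\mathit{AbsentAppendix}
  &\sqsubseteq \mathit{AbnormalityOfAppendix}
\end{align*}
```

Both definitions use negation and therefore fall outside EL, but the conclusion relates two named classes and is itself an EL axiom.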
Modularization: EL Vira
http://el-vira.googlecode.com
ontology modularization
retain signature of ontology
identify EL, QL, RL axioms in deductive closure
completeness is an open problem
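The filtering step, keeping only axioms that fit a profile, can be sketched in Python for the EL case. This is not EL Vira's implementation: the nested-tuple encoding of class expressions and the names `is_el_expression` and `el_axioms` are assumptions made for illustration, only conjunction and existential restriction are treated as EL constructors, and the inferred axiom is supplied by hand where a reasoner would compute it from the deductive closure.

```python
def is_el_expression(expr):
    """True if expr uses only named classes, 'and', and 'some'."""
    if isinstance(expr, str):  # named class
        return True
    op = expr[0]
    if op == "and":  # ("and", C1, C2, ...)
        return all(is_el_expression(arg) for arg in expr[1:])
    if op == "some":  # ("some", role, filler)
        return is_el_expression(expr[2])
    return False  # e.g. ("not", C) is outside EL


def el_axioms(axioms):
    """Keep axioms (kind, lhs, rhs) whose two sides are both EL expressions."""
    return [ax for ax in axioms
            if is_el_expression(ax[1]) and is_el_expression(ax[2])]


absent_def = ("equivalent", "AbsentAppendix",
              ("some", "pheneOf", ("not", ("some", "hasPart", "Appendix"))))
abnormal_def = ("equivalent", "AbnormalityOfAppendix",
                ("some", "pheneOf",
                 ("not", ("some", "hasPart", ("and", "Normal", "Appendix")))))
inferred = ("subclass", "AbsentAppendix", "AbnormalityOfAppendix")

print(el_axioms([absent_def, abnormal_def, inferred]))
```

Only the inferred subclass axiom survives the filter, matching the appendix example: the definitions are discarded, while their EL-expressible consequence is retained.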
Modularization: EL Module
AbnormalityOfAppendix ≡ ∃pheneOf.(¬∃hasPart.(Normal ⊓ Appendix))
AbsentAppendix ≡ ∃pheneOf.(¬∃hasPart.Appendix)
AbsentAppendix ⊑ AbnormalityOfAppendix
H et al., 2011. A common layer of interoperability for biomedical ontologies based on OWL EL. Bioinformatics, 27(7), 1001–1008.
Knowledge representation Applications and evaluation
Phenotype alignments: PhenomeBLAST
apply to yeast, fly, worm, fish, mouse and human phenotypes
phenotype alignment through OWL reasoning
more than 300,000 classes and 1,000,000 axioms
combination of HermiT (for modularization), CB and CEL reasoner
classification time: 7 minutes
http://phenomeblast.googlecode.org
Application: Comparison of phenotypes
direct comparison of phenotypes:
disease phenotypes, e.g., tetralogy of Fallot
phenotypes associated with genetic mutations (genotypes in mouse, fish, etc.)
Application: Comparison of phenotypes
a mutation phenotype that is a subclass of a disease phenotype allows inference of a gene-disease association if
the disease phenotypes are sufficient for having the disease
the mutation phenotypes are necessary for having a specific genotype
Application: Similarity-based comparison
pairwise comparison of phenotypes
semantic similarity: weighted Jaccard index
result: similarity matrix between phenotypes
(quantitative) evaluation based on predicting orthology, pathway, disease
identify novel gene-disease associations
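A weighted Jaccard index over two sets of phenotype classes can be sketched as follows. The slides do not say how the weights are chosen, so this sketch assumes each class carries a non-negative weight (for example an information-content score). The weights and the mouse-model phenotype set below are invented for illustration, and `weighted_jaccard` is a hypothetical helper name.

```python
def weighted_jaccard(a, b, weight):
    """Sum of weights of shared classes divided by sum of weights of all classes."""
    union = a | b
    total = sum(weight[c] for c in union)
    if total == 0:  # empty sets or all-zero weights
        return 0.0
    shared = sum(weight[c] for c in a & b)
    return shared / total


# Illustrative weights only; not values from the talk.
weight = {
    "HP:0002623": 2.0,  # Overriding aorta
    "HP:0001629": 1.0,  # Ventricular septal defect
    "HP:0001642": 1.0,  # Pulmonic stenosis
    "HP:0001667": 0.5,  # Right ventricular hypertrophy
}

disease = {"HP:0002623", "HP:0001629", "HP:0001642", "HP:0001667"}
model = {"HP:0002623", "HP:0001629"}  # hypothetical mouse model, already mapped to HP

print(weighted_jaccard(disease, model, weight))
```

Computing this score for every pair of phenotype sets yields the similarity matrix mentioned above. In practice each set would presumably also contain the superclasses inferred by the reasoner, so that related but non-identical phenotypes still overlap.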
Application: Similarity-based comparison: ROC
[Figure: ROC curves (true positive rate against false positive rate) for the Disease, Orthology and Pathway predictions]
Application: Similarity-based comparison: gene-disease associations
Adam19 and Fgf15 genes in mice may be involved in Tetralogy of Fallot
Aberrant pathways
Cytokine-cytokine receptor interaction pathway (ko04060) is significantly correlated with Tetralogy of Fallot (p = 5 · 10^−7, Wilcoxon signed-rank test)
Gene-disease associations for orphan diseases
Slc34a1 (MGI:1345284) and Fanconi renotubular syndrome 1 (OMIM:134600)
Application: PhenomeBrowser
Conclusions
Summary: Aspects of ontology-based information systems in biology
knowledge representation language
expressiveness
non-monotonicity
complexity of inferences
ontological decisions
anatomy (parthood, connectedness)
physiology (function)
pathology, disease (normality, abnormality)
statistical/similarity-based framework
semantic similarity
account for incomplete information
account for noisy data
Challenges and future research: Knowledge representation
establish reasoning infrastructure (OWLlink, ...)
improve reasoning performance (OWL profiles, modularity, approximate reasoning)
OWL reasoning with prototypes, non-monotonic reasoning, abduction
explore alternatives to OWL
Challenges and future research: Ontology
[Figure: overview of Open Biomedical Ontologies (OBO), repeated from the Introduction]
Challenges and future research: Biology
add phenotype information
20,000 knockout mice
dog, rat, slime mold, ...
define disease phenotypes
extension to other domains
functional genomics
pharmacology, drug discovery
systems biology
clinical research, decision support
quantifiable evaluation
Acknowledgements
George Gkoutos
Heinrich Herre
Janet Kelso
Michel Dumontier
Dietrich Rebholz-Schuhmann
Nico Adams
Dan Cook
Bernard de Bono
John Gennari
Pierre Grenon
Pascal Hitzler
Frank Loebe
Anika Oellrich
Kay Pruefer
Paul Schofield
Stefan Schulz
Robert Stevens
Sarala Wimalaratne
...
Thank you!