1 Introduction to Ontology Barry Smith http://ontology.buffalo.edu/smith

Post on 21-Dec-2015

TRANSCRIPT

Page 1: 1 Introduction to Ontology Barry Smith

1

Introduction to Ontology

Barry Smith

http://ontology.buffalo.edu/smith

Page 2: 1 Introduction to Ontology Barry Smith

Who am I?

NCBO: National Center for Biomedical Ontology (NIH Roadmap Center)

2

• Stanford Medical Informatics

• University of San Francisco Medical Center

• Berkeley Drosophila Genome Project

• Cambridge University Department of Genetics

• The Mayo Clinic

• University at Buffalo Department of Philosophy

Page 3: 1 Introduction to Ontology Barry Smith

Who am I?

NYS Center of Excellence in Bioinformatics and Life Sciences Ontology Research Group

Buffalo Clinical and Translational Science Institute (CTSI)

3

Page 4: 1 Introduction to Ontology Barry Smith

Who am I?

Cleveland Clinic Semantic Database

Gene Ontology

Ontology for Biomedical Investigations

Open Biomedical Ontologies Consortium

Institute for Formal Ontology and Medical Information Science

BIRN Ontology Task Force

...

Page 5: 1 Introduction to Ontology Barry Smith

5

Page 6: 1 Introduction to Ontology Barry Smith

6

natural language labels

to make the data cognitively accessible to human beings

and algorithmically tractable to computers

Page 7: 1 Introduction to Ontology Barry Smith

7

compare: legends for maps

Page 8: 1 Introduction to Ontology Barry Smith

8

compare: legends for maps

common legends allow (cross-border) integration

Page 9: 1 Introduction to Ontology Barry Smith

9

ontologies are legends for data

Page 10: 1 Introduction to Ontology Barry Smith

10

legends

help human beings use and understand complex representations of reality

help human beings create useful complex representations of reality

help computers process complex representations of reality

help glue data together

Page 11: 1 Introduction to Ontology Barry Smith

11

annotations using common ontologies can yield integration of image data

Page 12: 1 Introduction to Ontology Barry Smith

12

computationally tractable legends

help human beings find things in very large complex representations of reality

Page 13: 1 Introduction to Ontology Barry Smith

13

where in the body ? where in the cell ?

what kind of organism ?

what kind of disease process ?

Page 14: 1 Introduction to Ontology Barry Smith

14

to yield:

distributed accessibility of the data to humans

reasoning with the data

cumulation for purposes of research

incrementality and evolvability

integration with clinical data

Creating broad-coverage semantic annotation systems for biomedicine

Page 15: 1 Introduction to Ontology Barry Smith

15

Page 16: 1 Introduction to Ontology Barry Smith

16

Page 17: 1 Introduction to Ontology Barry Smith

The Gene Ontology

Page 18: 1 Introduction to Ontology Barry Smith

18

The Gene Ontology

Page 19: 1 Introduction to Ontology Barry Smith

19

Page 20: 1 Introduction to Ontology Barry Smith

20

Page 21: 1 Introduction to Ontology Barry Smith

21

The Idea of Common Controlled Vocabularies

MouseEcotope GlyProt

DiabetInGene

GluChem

sphingolipid transporter activity

Page 22: 1 Introduction to Ontology Barry Smith

22

The Idea of Common Controlled Vocabularies

MouseEcotope GlyProt

DiabetInGene

GluChem

Holliday junction helicase complex

Page 23: 1 Introduction to Ontology Barry Smith

Multiple kinds of data in multiple kinds of silos

Lab / pathology data

Electronic Health Record data

Clinical trial data

Patient histories

Medical imaging

Microarray data

Protein chip data

Flow cytometry

Mass spec

Genotype / SNP data

23

Page 24: 1 Introduction to Ontology Barry Smith

How to find your data?

How to find other people’s data?

How to reason with data when you find it?

How to work out what data does not yet exist?

24

Page 25: 1 Introduction to Ontology Barry Smith

Multiple kinds of standardization for data

• Terminologies (SNOMED, UMLS)

• CDEs (Clinical research)

• Information Exchange Standards (HL7 RIM)

• LIMS (LOINC)

• MGED standards for microarray data, etc.

25

Page 26: 1 Introduction to Ontology Barry Smith

26

how to solve the problem of making such data queryable and re-usable by others, to address NIH mandates?

part of the solution must involve: standardized terminologies and coding schemes

Page 27: 1 Introduction to Ontology Barry Smith

27

most successful, thus far: UMLS

collection of separate terminologies built by trained experts

massively useful for information retrieval and information integration

UMLS Metathesaurus: a system of post hoc mappings between overlapping source vocabularies

Page 28: 1 Introduction to Ontology Barry Smith

28

for UMLS

local usage respected

regimentation frowned upon

cross-framework consistency not important

no concern to establish consistency with basic science

different grades of formal rigor, different degrees of completeness, different update policies

Page 29: 1 Introduction to Ontology Barry Smith

caBIG approach: BRIDG (top-down imposition)

29

[Figure: BRIDG "Comprehensive Class-only diagram", a UML class diagram. Classes include Person, Organization, Material, Participant, Investigator, StudySubject, AdverseEvent and LabResult, grouped under Entity, Role, Participation, Activity/Act and Complex Datatype, and linked by associations with multiplicities such as 0..* and 1.]

Page 30: 1 Introduction to Ontology Barry Smith

[Figure: BRIDG "Comprehensive Class-only diagram" (the same UML class diagram as on the previous slide)]

Page 31: 1 Introduction to Ontology Barry Smith

31

where do you find scientifically validated information linking gene products and other entities represented in biochemical databases to semantically meaningful terms pertaining to disease, anatomy, development in different model organisms?

A new approach

for science

Page 32: 1 Introduction to Ontology Barry Smith

caBIG

BRIDG

32

Top-down (master-model-based)

Bottom-up (evidence-based)

prospective standardization

caBIG, SNOMED, HL7

OBO Foundry

retrospective mapping

UMLS (multiple authorities)

NLP / data + text-mining

Page 33: 1 Introduction to Ontology Barry Smith

SNOMED

Ultimately as data become attached to the samples (e.g., pathology data, genotypes) these will be linked to the patient records.

33

Page 34: 1 Introduction to Ontology Barry Smith

34

where in the body ? where in the cell ?

what kind of organism ?

what kind of disease process ?

Page 35: 1 Introduction to Ontology Barry Smith

35

ontologies = high quality controlled structured vocabularies for the annotation (description) of data

Page 36: 1 Introduction to Ontology Barry Smith

36

compare: legends for diagrams

Page 37: 1 Introduction to Ontology Barry Smith

or chemistry diagrams

Prasanna, et al. Chemical Compound Navigator: A Web-Based Chem-BLAST, Chemical Taxonomy-Based Search Engine for Browsing Compounds

PROTEINS: Structure, Function, and Bioinformatics 63:907–917 (2006)

legends for chemistry diagrams

Page 38: 1 Introduction to Ontology Barry Smith

Ramirez et al. Linking of Digital Images to Phylogenetic Data Matrices Using a Morphological Ontology. Syst. Biol. 56(2):283–294, 2007

Page 39: 1 Introduction to Ontology Barry Smith

39

The Network Effects of Synchronization

MouseEcotope GlyProt

DiabetInGene

GluChem

Holliday junction helicase complex

Page 40: 1 Introduction to Ontology Barry Smith

40

Five bangs for your GO buck

1. based in biological science

2. incremental approach (evidence-based evolutionary pathway)

3. cross-species data comparability (human, mouse, yeast, fly ...)

4. cross-granularity data integration (molecule, cell, organ, organism)

5. cumulation of scientific knowledge in algorithmically tractable form, links people to software

Page 41: 1 Introduction to Ontology Barry Smith

41

Model organism databases employ scientific curators who use the experimental observations reported in the biomedical literature to associate GO terms with entries in gene product and other molecular biology databases

($4 mill. p.a. NIH funding)

The methodology of annotations

Page 42: 1 Introduction to Ontology Barry Smith

42

How to extend the GO methodology to other domains of clinical and translational medicine?

Page 43: 1 Introduction to Ontology Barry Smith

43

the problem

existing clinical vocabularies are of variable quality and low mutual consistency

current proliferation of tiny ontologies by different groups with urgent annotation needs

Page 44: 1 Introduction to Ontology Barry Smith

44

the solution

establish common rules governing best practices for creating ontologies in coordinated fashion, with an evidence-based pathway to incremental improvement

Page 45: 1 Introduction to Ontology Barry Smith

45

How to build an ontology

work with scientists to create an initial top-level classification

find ~50 most commonly used terms corresponding to types in reality

arrange these terms into an informal is_a hierarchy according to the universality principle

A is_a B means: every instance of A is an instance of B

fill in missing terms to give a complete hierarchy

(leave it to domain scientists to populate the lower levels of the hierarchy)
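The universality principle behind the is_a hierarchy can be illustrated with a minimal sketch. The type names, the instance identifiers, and the helper functions below are hypothetical illustrations and are not taken from any Foundry ontology; the sketch only shows what "every instance of A is an instance of B" amounts to when checked against data.

```python
from typing import Dict, List, Optional, Set

# Hypothetical informal is_a hierarchy: child type -> parent type (None for the root).
IS_A: Dict[str, Optional[str]] = {
    "anatomical entity": None,
    "cell": "anatomical entity",
    "osteoblast": "cell",
    "neuron": "cell",
}

# Hypothetical instance data: type -> instances asserted directly under that type.
DIRECT_INSTANCES: Dict[str, Set[str]] = {
    "osteoblast": {"osteoblast#1", "osteoblast#2"},
    "neuron": {"neuron#1"},
    "cell": {"cell#7"},
}


def ancestors(term: str) -> List[str]:
    """Return all types reached from `term` by following is_a upward."""
    chain: List[str] = []
    parent = IS_A[term]
    while parent is not None:
        chain.append(parent)
        parent = IS_A[parent]
    return chain


def instances_of(term: str) -> Set[str]:
    """Instances of `term`: its direct instances plus those of all its subtypes."""
    found = set(DIRECT_INSTANCES.get(term, set()))
    for child, parent in IS_A.items():
        if parent == term:
            found |= instances_of(child)
    return found


def satisfies_universality(child: str, parent: str) -> bool:
    """A is_a B is acceptable only if every instance of A is an instance of B."""
    return instances_of(child) <= instances_of(parent)
```

With this data, `satisfies_universality("osteoblast", "cell")` holds, while the reverse direction fails because some cells are not osteoblasts.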

Page 46: 1 Introduction to Ontology Barry Smith

46

a shared portal for (so far) 58 ontologies (low regimentation)

http://obo.sourceforge.net NCBO BioPortal

First step (2003)

Page 47: 1 Introduction to Ontology Barry Smith

47

Page 48: 1 Introduction to Ontology Barry Smith

48

OBO now the principal entry point for creation of web-accessible biomedical data

OBO and OBOEdit low-tech to encourage users

Simple (web-service-based) tools created to support the work of biologists in creating annotations (data entry)

OBO-to-OWL DL converters make OBO Foundry annotated data immediately accessible to Semantic Web data integration projects

Page 49: 1 Introduction to Ontology Barry Smith

49

Second step (2004): reform efforts initiated, e.g. linking GO formally to other ontologies and data sources

id: CL:0000062
name: osteoblast
def: "A bone-forming cell which secretes an extracellular matrix. Hydroxyapatite crystals are then deposited into the matrix to form bone."
is_a: CL:0000055
relationship: develops_from CL:0000008
relationship: develops_from CL:0000375

GO + Cell type = New Definition

Osteoblast differentiation: Processes whereby an osteoprogenitor cell or a cranial neural crest cell acquires the specialized features of an osteoblast, a bone-forming cell which secretes extracellular matrix.
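The cell-type entry above is written as tag-value lines, which makes it easy to read by machine. The following is a minimal sketch, not a full OBO format parser: it assumes one `tag: value` pair per line and ignores escapes, comments, and trailing modifiers. The function names `parse_stanza` and `relationships` are hypothetical helpers introduced for illustration.

```python
from typing import Dict, List

# The osteoblast entry from the slide, one tag-value pair per line.
STANZA = '''id: CL:0000062
name: osteoblast
def: "A bone-forming cell which secretes an extracellular matrix. Hydroxyapatite crystals are then deposited into the matrix to form bone."
is_a: CL:0000055
relationship: develops_from CL:0000008
relationship: develops_from CL:0000375
'''


def parse_stanza(text: str) -> Dict[str, List[str]]:
    """Collect `tag: value` lines into a dict of tag -> list of values."""
    tags: Dict[str, List[str]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        tag, _, value = line.partition(": ")
        tags.setdefault(tag, []).append(value)
    return tags


def relationships(tags: Dict[str, List[str]], relation: str) -> List[str]:
    """Return the target identifiers of every `relationship` line using `relation`."""
    targets: List[str] = []
    for value in tags.get("relationship", []):
        rel, _, target = value.partition(" ")
        if rel == relation:
            targets.append(target)
    return targets


tags = parse_stanza(STANZA)
precursors = relationships(tags, "develops_from")
```

Here `precursors` holds the two cell types from which an osteoblast develops, which is the information the combined GO definition of osteoblast differentiation draws on.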

Page 50: 1 Introduction to Ontology Barry Smith

50

The OBO Foundry

http://obofoundry.org/

Third step (2006)

Page 51: 1 Introduction to Ontology Barry Smith

51

Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck

Page 52: 1 Introduction to Ontology Barry Smith

52

Building out from the original GO

Columns: RELATION TO TIME. Rows: GRANULARITY.

GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

Page 53: 1 Introduction to Ontology Barry Smith

53

initial OBO Foundry coverage

Columns: RELATION TO TIME. Rows: GRANULARITY.

GRANULARITY | INDEPENDENT CONTINUANT | DEPENDENT CONTINUANT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

Page 54: 1 Introduction to Ontology Barry Smith

54

Continuants (aka endurants)

have continuous existence in time

preserve their identity through change

exist in toto whenever they exist at all

Occurrents (aka processes)

have temporal parts

unfold themselves in successive phases

exist only in their phases

Page 55: 1 Introduction to Ontology Barry Smith

55

You are a continuant

Your life is an occurrent

You are 3-dimensional

Your life is 4-dimensional

Page 56: 1 Introduction to Ontology Barry Smith

56

Dependent entities

require independent continuants as their bearers

There is no run without a runner

There is no grin without a cat

Page 57: 1 Introduction to Ontology Barry Smith

57

Dependent vs. independent continuants

Independent continuants (organisms, buildings, environments)

Dependent continuants (quality, shape, role, propensity, function, status, power, right)

Page 58: 1 Introduction to Ontology Barry Smith

58

All occurrents are dependent entities

They are dependent on those independent continuants which are their participants (agents, patients, media ...)

Page 59: 1 Introduction to Ontology Barry Smith

59

BFO Top-Level Ontology

Continuant

Occurrent (always dependent on one or more independent continuants)

IndependentContinuant

DependentContinuant
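The dependence relations in this top-level division can be sketched as a small class hierarchy. This is an illustrative model only, not the BFO specification: the class layout, the `bearer` and `participants` fields, and the example entities (taken from the "no grin without a cat" and "no run without a runner" slides) are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List


class Entity:
    """Root of this illustrative hierarchy."""


class Continuant(Entity):
    """Exists in toto whenever it exists at all."""


@dataclass
class IndependentContinuant(Continuant):
    name: str


@dataclass
class DependentContinuant(Continuant):
    name: str
    bearer: IndependentContinuant  # a dependent entity requires a bearer

    def __post_init__(self) -> None:
        if not isinstance(self.bearer, IndependentContinuant):
            raise TypeError("a dependent continuant needs an independent continuant as bearer")


@dataclass
class Occurrent(Entity):
    name: str
    participants: List[IndependentContinuant]  # occurrents depend on their participants

    def __post_init__(self) -> None:
        if not self.participants:
            raise ValueError("an occurrent needs at least one participant")
        for participant in self.participants:
            if not isinstance(participant, IndependentContinuant):
                raise TypeError("participants must be independent continuants")


cat = IndependentContinuant("cat")
runner = IndependentContinuant("runner")
grin = DependentContinuant("grin", bearer=cat)    # no grin without a cat
run = Occurrent("run", participants=[runner])     # no run without a runner
```

Constructing an `Occurrent` with an empty participant list raises `ValueError`, which mirrors the claim that all occurrents are dependent entities.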

Page 60: 1 Introduction to Ontology Barry Smith

60

= A representation of top-level types

Continuant Occurrent

IndependentContinuant

DependentContinuant

cell component

biological process

molecular function

Page 61: 1 Introduction to Ontology Barry Smith

61

Top-Level Ontology

Continuant Occurrent

IndependentContinuant

DependentContinuant

Functioning

Side-Effect, Stochastic Process, ...

Function

Page 62: 1 Introduction to Ontology Barry Smith

62

Top-Level Ontology

Continuant Occurrent

IndependentContinuant

DependentContinuant

Functioning Side-Effect, Stochastic Process, ...

Function

Page 63: 1 Introduction to Ontology Barry Smith

63

Top-Level Ontology

Continuant Occurrent

IndependentContinuant

DependentContinuant

Quality Function Spatial Region

Functioning Side-Effect, Stochastic Process, ...

instances (in space and time)

Page 64: 1 Introduction to Ontology Barry Smith

64

CRITERIA

The ontology is open and available to be used by all.

The ontology is in, or can be instantiated in, a common formal language.

The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.

CRITERIA

Page 65: 1 Introduction to Ontology Barry Smith

65

CRITERIA

UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.

ORTHOGONALITY: They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.

Page 66: 1 Introduction to Ontology Barry Smith

66

communities must work together to ensure consistency, orthogonality, and modular development, plus additivity of annotations:

if we annotate a database or body of literature with one OBO Foundry ontology, we should be able to add annotations from a second such ontology without conflicts

ontologies do not need to create tiny theories of anatomy or chemistry within themselves

ORTHOGONALITY
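Additivity of annotations can be sketched as follows. The sketch assumes that each ontology has its own identifier prefix (such as `GO:` or `CL:`) and that an annotation is a pair of a record identifier and a term identifier; the record name `gene_product_1` and the function `add_annotations` are hypothetical. Real conflict detection would involve more than comparing prefixes.

```python
from typing import Set, Tuple

# An annotation links a record to an ontology term: (record id, term id).
Annotation = Tuple[str, str]


def prefix(term_id: str) -> str:
    """Return the identifier space of a term id, e.g. 'GO' for 'GO:0004872'."""
    return term_id.split(":", 1)[0]


def add_annotations(existing: Set[Annotation], new: Set[Annotation]) -> Set[Annotation]:
    """Add annotations from a second ontology, refusing overlapping identifier spaces."""
    overlap = {prefix(t) for _, t in existing} & {prefix(t) for _, t in new}
    if overlap:
        raise ValueError(f"identifier spaces overlap: {sorted(overlap)}")
    return existing | new


# Hypothetical record annotated first with a GO term, then with a Cell Ontology term.
go_annotations = {("gene_product_1", "GO:0004872")}
cl_annotations = {("gene_product_1", "CL:0000062")}
merged = add_annotations(go_annotations, cl_annotations)
```

Because the two ontologies cover separate domains and use separate identifier spaces, the second set of annotations is simply added to the first.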

Page 67: 1 Introduction to Ontology Barry Smith

67

IDENTIFIERS: The ontology possesses a unique identifier space within OBO.

VERSIONING: The ontology provider has procedures for identifying distinct successive versions.

The ontology includes textual definitions for all terms.

CRITERIA

Page 68: 1 Introduction to Ontology Barry Smith

68

CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.

DOCUMENTATION: The ontology is well-documented.

USERS: The ontology has a plurality of independent users.

CRITERIA

Page 69: 1 Introduction to Ontology Barry Smith

69

COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology

CRITERIA

Page 70: 1 Introduction to Ontology Barry Smith

70

OBO Foundry is serving as a benchmark for improvements in discipline-focused terminology resources

yielding calibration of existing terminologies and data resources and alignment of different views

Consequences

Page 71: 1 Introduction to Ontology Barry Smith

71

Foundry ontologies all work in the same way

all are built to represent the types existing in a pre-existing domain and the relations between these types in a way which can support reasoning

– we have data

– we need to make this data available for semantic search and algorithmic processing

– we create a consensus-based ontology for annotating the data

– and ensure that it can interoperate with Foundry ontologies for neighboring domains

Page 72: 1 Introduction to Ontology Barry Smith

72

Mature OBO Foundry ontologies (now undergoing reform)

Cell Ontology (CL)

Chemical Entities of Biological Interest (ChEBI)

Foundational Model of Anatomy (FMA)

Gene Ontology (GO)

Phenotypic Quality Ontology (PaTO)

Relation Ontology (RO)

Sequence Ontology (SO)

Page 73: 1 Introduction to Ontology Barry Smith

73

Ontologies being built to satisfy Foundry principles ab initio

Ontology for Clinical Investigations (OCI)

Common Anatomy Reference Ontology (CARO)

Ontology for Biomedical Investigations (OBI)

Protein Ontology (PRO)

RNA Ontology (RnaO)

Subcellular Anatomy Ontology (SAO)

Page 74: 1 Introduction to Ontology Barry Smith

74

Ontologies in planning phase

Biobank/Biorepository Ontology (BrO, part of OBI)

Environment Ontology (EnvO)

Immunology Ontology (ImmunO)

Infectious Disease Ontology (IDO)

Mouse Adult Neurogenesis Ontology (MANGO)

Page 75: 1 Introduction to Ontology Barry Smith

OBO Foundry provides a method for handling legacy databases

75

Page 76: 1 Introduction to Ontology Barry Smith

Senselab/NeuronDB*

NeuronDB comprehends three types of neuronal properties:

voltage gated conductances

neurotransmitter receptors

neurotransmitter substances

Many questions immediately arise: what are receptors? Proteins? Protein complexes? The Foundry framework provides an opportunity to evaluate such choices.

76

* http://senselab.med.yale.edu/

Page 77: 1 Introduction to Ontology Barry Smith

Senselab/NeuronDB

The GO Molecular Function (MF) ontology already has classes such as receptor activity (GO_0004872) plus subclasses describing receptor activities already referred to in NeuronDB.

This provides a roadmap for further development. Review the 130 receptor classes to see whether they exist in MF; where they do not, create subclasses and submit them to GO for future inclusion. We can then, for example, take advantage of GO Annotations to find the proteins that correspond to these receptor classes in different species.
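The review step described here amounts to partitioning the NeuronDB receptor classes into those already present in MF and those still to be submitted. The sketch below is illustrative only: the receptor names and the `EX:` identifier are placeholders, and matching by appending " activity" to the class name is an assumed convention, not NeuronDB's or GO's actual procedure. Only `GO:0004872` (receptor activity) comes from the slide.

```python
from typing import Dict, List, Tuple

# Labels of Molecular Function terms mapped to their identifiers.
MF_LABELS: Dict[str, str] = {
    "receptor activity": "GO:0004872",            # from the slide
    "example receptor A activity": "EX:0000001",  # hypothetical placeholder
}

# Hypothetical stand-ins for the receptor classes listed in NeuronDB.
NEURONDB_RECEPTORS: List[str] = ["example receptor A", "example receptor B"]


def review(receptor_classes: List[str],
           mf_labels: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Split receptor classes into those found in MF and those still missing."""
    found: Dict[str, str] = {}
    missing: List[str] = []
    for name in receptor_classes:
        label = f"{name} activity"  # assumed naming convention for MF terms
        if label in mf_labels:
            found[name] = mf_labels[label]
        else:
            missing.append(name)
    return found, missing


found, missing = review(NEURONDB_RECEPTORS, MF_LABELS)
```

The `missing` list corresponds to the classes for which new subclasses would be drafted and submitted to GO.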

77

Page 78: 1 Introduction to Ontology Barry Smith

OBO Foundry Success Story

Model organism research seeks results valuable for the understanding of human disease.

This requires the ability to make reliable cross-species comparisons, and for this anatomy is crucial.

But different MOD communities have developed their anatomy ontologies in uncoordinated fashion.

78

Page 79: 1 Introduction to Ontology Barry Smith

Multiple axes of classification

Functional: cardiovascular system, nervous system

Spatial: head, trunk, limb

Developmental: endoderm, germ ring, lens placode

Structural: tissue, organ, cell

Stage: developmental staging series

Page 80: 1 Introduction to Ontology Barry Smith

80

Developmental terms are often lumped together for lack of a way to categorize them

Stages are represented in a variety of ways. Terms can be children of superstages, stages can be integrated into each term, or stages can be assigned to terms from a separate ontology

Page 81: 1 Introduction to Ontology Barry Smith

Ontologies facilitate grouping of annotations

brain: 20

hindbrain: 15

rhombomere: 10

Query brain without ontology: 20

Query brain with ontology: 45
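The difference between the two query results can be reproduced with a short sketch. The annotation counts are those on the slide; the hierarchy (rhombomere under hindbrain, hindbrain under brain) is an assumption about how the three terms are related, and the function names are hypothetical.

```python
from typing import Dict, List

# Annotation counts per term, as given on the slide.
DIRECT_COUNTS: Dict[str, int] = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# Assumed hierarchy: child term -> parent term.
PARENT: Dict[str, str] = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def descendants(term: str) -> List[str]:
    """Return every term below `term` in the hierarchy."""
    found: List[str] = []
    for child, parent in PARENT.items():
        if parent == term:
            found.append(child)
            found.extend(descendants(child))
    return found


def query_without_ontology(term: str) -> int:
    """Count only the annotations made directly to `term`."""
    return DIRECT_COUNTS.get(term, 0)


def query_with_ontology(term: str) -> int:
    """Count annotations to `term` and to all terms grouped under it."""
    return sum(DIRECT_COUNTS.get(t, 0) for t in [term] + descendants(term))


print(query_without_ontology("brain"))  # 20
print(query_with_ontology("brain"))     # 45
```

The ontology-aware query returns 45 because the 15 hindbrain and 10 rhombomere annotations are grouped under brain.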

81

Page 82: 1 Introduction to Ontology Barry Smith

CARO – Common Anatomy Reference Ontology

for the first time provides guidelines for model organism researchers who wish to achieve comparability of annotations

for the first time provides guidelines for those new to ontology work

See Haendel et al., “CARO: The Common Anatomy Reference Ontology”, in: Burger (ed.), Anatomy Ontologies for Bioinformatics: Springer, in press.

82

Page 83: 1 Introduction to Ontology Barry Smith

83

CARO-conformant ontologies already in development:

Fish Multi-Species Anatomy Ontology (NSF funding received)

Ixodidae and Argasidae (Tick) Anatomy Ontology

Mosquito Anatomy Ontology (MAO)

Spider Anatomy Ontology

Xenopus Anatomy Ontology (XAO)

undergoing reform: Drosophila and Zebrafish Anatomy Ontologies

Page 84: 1 Introduction to Ontology Barry Smith

OBI / OCI

Ontology for Biomedical Investigations

overarching terminology resource for MIBBI Foundry

Ontology for Clinical Investigations

collaboration with EPOCH ontology for clinical trial management

and with CDISC (FDA mandated vocabulary for clinical trial reports)

84

Page 85: 1 Introduction to Ontology Barry Smith

85

INDEPENDENT

CONTINUANTS

organism

system

organ

organ part

tissue

cell

acellular anatomical structure

biological molecule

genome

DEPENDENT CONTINUANTS

physiology

(functions)

pathology

acute stage

progressive stage

resolution stage

next step: repertoire of disease ontologies built out of OBO Foundry elements

Page 86: 1 Introduction to Ontology Barry Smith

86

Scope of Draft Ontology for Multiple Sclerosis