What developers need to know about ontologies? Barry Smith, http://ontology.buffalo.edu/smith


Post on 19-Jan-2016


Page 1: What developers need to know about ontologies? Barry Smith  1

What developers need to know about ontologies?

Barry Smith

http://ontology.buffalo.edu/smith

1

Page 2: What developers need to know about ontologies? Barry Smith  1

HL7 Watch (blog)

Microsoft HealthVault:

Allergic Episode is_a Health Record Item

Health Record Item =def. A single piece of data in a health record that is accessible through the HealthVault service

2

Page 3: What developers need to know about ontologies? Barry Smith  1

3

Problem of ensuring sensible cooperation in a massively interdisciplinary community

concept, type, instance, model, representation, data

Page 4: What developers need to know about ontologies? Barry Smith  1

4

What do these mean?

‘conceptual data model’

‘semantic knowledge model’

‘reference information model’

Page 5: What developers need to know about ontologies? Barry Smith  1

You’re interested in which genes control heart muscle development

17,536 results

5

Page 6: What developers need to know about ontologies? Barry Smith  1

[Figure: microarray gene tree (Pearson correlation), gene list: all genes (14010). Panel labels: attacked, control, time. Labelled gene clusters: puparial adhesion, molting cycle, hemocyanin; defense response, immune response, response to stimulus, Toll regulated genes, JAK-STAT regulated genes; amino acid catabolism, lipid metabolism; peptidase activity, protein catabolism, immune response.]

Microarray data shows changed expression of thousands of genes.

How will you spot the patterns?

6

Page 7: What developers need to know about ontologies? Barry Smith  1

Lab / pathology data

EHR data

Clinical trial data

Family history data

Medical image data

Microarray data

Model organism data

Flow cytometry

Mass spec

Genotype / SNP data

How will you find the data you need?

7

Page 8: What developers need to know about ontologies? Barry Smith  1

− Human

− Mouse

− Rat

− Fish

− Yeast

− E. coli

How will you compare the data? How will you integrate the data?

8

Page 9: What developers need to know about ontologies? Barry Smith  1

The GO Idea

[Diagram: the databases MouseEcotope, GlyProt, DiabetInGene and GluChem, linked through the term ‘sphingolipid transporter activity’]

Page 10: What developers need to know about ontologies? Barry Smith  1

annotation using common ontologies yields integration of databases

[Diagram: the databases MouseEcotope, GlyProt, DiabetInGene and GluChem, linked through the term ‘Holliday junction helicase complex’]
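The integration idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `integrate` helper and the `TERM:...` identifiers are hypothetical placeholders, not real GO identifiers or any real database schema.

```python
from collections import defaultdict

# Hypothetical records: each database annotates its own entries with an
# identifier drawn from one shared ontology. The IDs are placeholders.
MOUSE_DB = [
    {"entry": "mouse-gene-1", "term": "TERM:0001"},
    {"entry": "mouse-gene-2", "term": "TERM:0002"},
]
PROTEIN_DB = [
    {"entry": "protein-A", "term": "TERM:0001"},
    {"entry": "protein-B", "term": "TERM:0003"},
]


def integrate(**databases):
    """Group entries from every database under the ontology term they share."""
    by_term = defaultdict(list)
    for db_name, records in databases.items():
        for record in records:
            by_term[record["term"]].append((db_name, record["entry"]))
    return dict(by_term)


merged = integrate(mouse=MOUSE_DB, protein=PROTEIN_DB)
print(merged["TERM:0001"])
```

Because both databases use the same term identifier, their entries land in the same bucket without any database-to-database mapping table.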

Page 11: What developers need to know about ontologies? Barry Smith  1

• For this to work, ontologies cannot be allowed to proliferate uncontrollably

• Rather, we need as far as possible non-overlapping ontology modules (OBO Foundry)

• How should we build these modules in such a way as to ensure glue-ability of annotations?

Page 12: What developers need to know about ontologies? Barry Smith  1

12

Glue-ability / integration

• rests on the existence of a common benchmark called ‘reality’

• the ontologies we want to glue together are representations of what exists in the world

• not of what exists in the heads of different groups of people

Page 13: What developers need to know about ontologies? Barry Smith  1

13

two kinds of annotations

Page 14: What developers need to know about ontologies? Barry Smith  1

14

names of types

Page 15: What developers need to know about ontologies? Barry Smith  1

15

names of instances

Page 16: What developers need to know about ontologies? Barry Smith  1

16

First basic distinction

type vs. instance

(science text vs. diary)

(human being vs. Tom Cruise)

Page 17: What developers need to know about ontologies? Barry Smith  1

17

For ontologies

it is generalizations that are important = ontologies are

about types, kinds, universals

Page 18: What developers need to know about ontologies? Barry Smith  1

18

Ontology types Instances

Page 19: What developers need to know about ontologies? Barry Smith  1

19

Ontology = A Representation of types

Page 20: What developers need to know about ontologies? Barry Smith  1

20

An ontology is a representation of types

We learn about types in reality from looking at the results of scientific experiments in the form of scientific theories

experiments relate to what is particular

science describes what is general

Page 21: What developers need to know about ontologies? Barry Smith  1

21

Inventory vs. Catalog

Two kinds of representational artifact

Very roughly:

Databases represent instances

Ontologies represent types
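A rough way to picture the distinction in code, assuming a toy parts store: the catalog holds one entry per type, the inventory holds one record per particular item. The part numbers are taken from the catalog slide that follows; the class names, the `serial` field and the `instances_of` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """A type: what kind of part exists (one row per kind)."""
    part_number: str
    name: str


@dataclass(frozen=True)
class InventoryItem:
    """An instance: one particular physical item of a catalogued type."""
    serial: str        # hypothetical per-item identifier
    part_number: str   # points at the type this item instantiates


CATALOG = {
    "521683": CatalogEntry("521683", "Gilmer Belt"),
    "521682": CatalogEntry("521682", "Motor Drive Belt"),
}
INVENTORY = [
    InventoryItem("item-001", "521683"),
    InventoryItem("item-002", "521683"),
    InventoryItem("item-003", "521682"),
]


def instances_of(part_number):
    """Return every inventory item that instantiates the given type."""
    return [item for item in INVENTORY if item.part_number == part_number]


print(len(instances_of("521683")))
```

The catalog stays the same when stock changes; only the inventory grows or shrinks, which is the sense in which ontologies are closer to catalogs than to databases.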

Page 22: What developers need to know about ontologies? Barry Smith  1

22

A 515287 DC3300 Dust Collector Fan

B 521683 Gilmer Belt

C 521682 Motor Drive Belt

Catalog vs. inventory

Page 23: What developers need to know about ontologies? Barry Smith  1

23

Catalog vs. inventory

Page 24: What developers need to know about ontologies? Barry Smith  1

24

Catalog of types/Types

Page 25: What developers need to know about ontologies? Barry Smith  1

25

[Diagram: a hierarchy of types (object, organism, animal, mammal, cat, siamese, frog) drawn above a layer of instances]

Page 26: What developers need to know about ontologies? Barry Smith  1

26

Ontologies are here

Page 27: What developers need to know about ontologies? Barry Smith  1

27

or here

Page 28: What developers need to know about ontologies? Barry Smith  1

28

ontologies represent general structures in reality (leg)

Page 29: What developers need to know about ontologies? Barry Smith  1

29

Ontologies do not represent concepts in people’s heads

Page 30: What developers need to know about ontologies? Barry Smith  1

30

They represent types in reality

Page 31: What developers need to know about ontologies? Barry Smith  1

31

which provide the benchmark for integration

Page 32: What developers need to know about ontologies? Barry Smith  1

32

Entity =def

anything which exists, including things and processes, functions and qualities, beliefs and actions, documents and software (Levels 1, 2 and 3)

Page 33: What developers need to know about ontologies? Barry Smith  1

33

what are the kinds of entity?

Page 34: What developers need to know about ontologies? Barry Smith  1

34

First basic distinction

type vs. instance

(science text vs. diary)

(human being vs. Tom Cruise)

Page 35: What developers need to know about ontologies? Barry Smith  1

35

Ontology Types Instances

Page 36: What developers need to know about ontologies? Barry Smith  1

36

Ontology = A Representation of types

Page 37: What developers need to know about ontologies? Barry Smith  1

37

Domain =def

a portion of reality that forms the subject-matter of a single science or technology or mode of study or administrative practice ...;

proteomics

HIV

epidemiology

Page 38: What developers need to know about ontologies? Barry Smith  1

38

Representation =def

an image, idea, map, picture, name or description ... of some entity or entities.

Page 39: What developers need to know about ontologies? Barry Smith  1

39

Ontologies are representational artifacts

comparable to science texts and subject to the same sorts of constraints (including need for update)

Page 40: What developers need to know about ontologies? Barry Smith  1

40

Representational units =def

terms, icons, alphanumeric identifiers ... which refer, or are intended to refer, to entities

and which are minimal (atoms)

Page 41: What developers need to know about ontologies? Barry Smith  1

41

Composite representation =def.

a representation

(1) built out of representational units

which

(2) form a structure that mirrors, or is intended to mirror, the entities in some domain

Page 42: What developers need to know about ontologies? Barry Smith  1

42

Analogue representations

no representational units, no ‘atoms’

Page 43: What developers need to know about ontologies? Barry Smith  1

43

Periodic Table

The Periodic Table

Page 44: What developers need to know about ontologies? Barry Smith  1

44

Class =def.

a maximal collection of particulars determined by a general term (‘cell’, ‘electron’ but also: ‘restaurant in Palo Alto’, ‘Italian’)

the class A = the collection of all particulars x for which ‘x is A’ is true

Page 45: What developers need to know about ontologies? Barry Smith  1

45

types vs. their extensions

types

{a,b,c,...} collections of particulars

Page 46: What developers need to know about ontologies? Barry Smith  1

46

Extension =def

The extension of a type A is the class: instance of the type A

(it is the class of A’s instances)

(the class of all entities to which the term ‘A’ applies)
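As a small illustration of the definition, the extension of a type can be computed as the set of particulars to which the general term applies. The particulars below are hypothetical apart from the Tom Cruise example used earlier in the slides, and the dictionary encoding is just one possible sketch.

```python
# Hypothetical particulars, each tagged with the type it instantiates.
PARTICULARS = {
    "Tom Cruise": "human being",
    "Barry Smith": "human being",
    "Felix": "cat",
}


def extension(type_name, particulars=PARTICULARS):
    """Return the class of all particulars x for which 'x is A' is true."""
    return {x for x, t in particulars.items() if t == type_name}


print(sorted(extension("human being")))
```

The type ‘human being’ stays the same while its extension changes whenever a particular is added or removed, which is why the two should not be conflated.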

Page 47: What developers need to know about ontologies? Barry Smith  1

47

Problem

The same general term can be used to refer both to types and to collections of particulars. Consider:

HIV is an infectious retrovirus

HIV is spreading very rapidly through Asia

Page 48: What developers need to know about ontologies? Barry Smith  1

48

types vs. classes

types

{c,d,e,...} classes

Page 49: What developers need to know about ontologies? Barry Smith  1

49

types vs. classes

types

~ defined classes

Page 50: What developers need to know about ontologies? Barry Smith  1

50

types vs. classes

types

e.g. populations, ...

Page 51: What developers need to know about ontologies? Barry Smith  1

51

Defined class =def

a class defined by a general term which does not designate a type

the class of all diabetic patients in Leipzig on 4 June 1952

Page 52: What developers need to know about ontologies? Barry Smith  1

52

OWL is a good representation of defined classes

• sibling of Finnish spy

• member of Abba aged > 50 years

• pizza with > 4 different toppings
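A defined class of this kind behaves like a membership test over particulars. The sketch below uses plain Python rather than OWL, and the `Pizza` class and topping names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pizza:
    name: str
    toppings: frozenset


def has_more_than_four_toppings(pizza):
    """Membership test for the defined class 'pizza with > 4 different toppings'."""
    return len(pizza.toppings) > 4


pizzas = [
    Pizza("margherita", frozenset({"tomato", "mozzarella", "basil"})),
    Pizza("loaded", frozenset({"tomato", "mozzarella", "ham", "olive", "onion"})),
]
defined_class = [p.name for p in pizzas if has_more_than_four_toppings(p)]
print(defined_class)
```

The predicate picks out a collection of particulars without claiming that there is a corresponding type in reality.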

Page 53: What developers need to know about ontologies? Barry Smith  1

53

Terminology =def.

a representational artifact whose representational units are natural language terms (with IDs, synonyms, comments, etc.) which are intended to designate types together with defined classes, with no particular attention to composite representations

Page 54: What developers need to know about ontologies? Barry Smith  1

54

types, classes, concepts

types

defined classes

‘concepts’ ?

Page 55: What developers need to know about ontologies? Barry Smith  1

55

types < defined classes < ‘concepts’

‘concepts’ which do not correspond to defined classes:

‘Surgical or other procedure not carried out because of patient's decision’

‘Congenital absent nipple’

because they do not correspond to anything

Page 56: What developers need to know about ontologies? Barry Smith  1

Gene Ontology: The Very Top

cellular component

molecular function

biological process

56

Page 57: What developers need to know about ontologies? Barry Smith  1

Gene Ontology: The Very Top

continuant

cellular component

molecular function

occurrent

biological process

57

Page 58: What developers need to know about ontologies? Barry Smith  1

BFO: The Very Top

continuant
    independent continuant
        cellular component
    dependent continuant
        molecular function
occurrent
    biological processes

58

Page 59: What developers need to know about ontologies? Barry Smith  1

Basic Formal Ontology

continuant
    independent continuant
        organism
    dependent continuant
occurrent

59

Page 60: What developers need to know about ontologies? Barry Smith  1

Basic Formal Ontology

continuant
    independent continuant
        anatomical structure
    dependent continuant
occurrent

60

Page 61: What developers need to know about ontologies? Barry Smith  1

Continuants

• continue to exist through time, preserving their identity while undergoing different sorts of changes

• independent continuants – objects, things, ...

• dependent continuants – qualities, attributes, shapes, potentialities ...

61

Page 62: What developers need to know about ontologies? Barry Smith  1

Qualities

temperature

blood pressure

mass

...

are continuants

they exist through time while undergoing changes

62

Page 63: What developers need to know about ontologies? Barry Smith  1

Qualities

temperature / blood pressure / mass ...

are dimensions of variation within the structure of the entity; a quality is something which can change while its bearer remains one and the same

63

Page 64: What developers need to know about ontologies? Barry Smith  1

A chart representing how John’s temperature changes

65

Page 65: What developers need to know about ontologies? Barry Smith  1

John’s temperature

the temperature he has throughout his entire life, cycles through different determinate temperatures from one time to the next

John’s temperature is a physiology variable which, in thus changing, exerts an influence on other physiology variables through time

66

Page 66: What developers need to know about ontologies? Barry Smith  1

BFO: The Very Top

continuant
    independent continuant
    dependent continuant
        quality
            temperature
occurrent

67

Page 67: What developers need to know about ontologies? Barry Smith  1

Blinding Flash of the Obvious

types:
    independent continuant
        organism
    dependent continuant
        quality
            temperature

instances:
    John (instance of organism)
    John’s temperature (instance of temperature)

68

Page 68: What developers need to know about ontologies? Barry Smith  1

Blinding Flash of the Obvious

types:
    independent continuant
        organism
    dependent continuant
        quality
            temperature

instances:
    John (instance of organism)
    John’s temperature (instance of temperature)

69

Page 69: What developers need to know about ontologies? Barry Smith  1

Blinding Flash of the Obvious

types:
    organism
    temperature

instances:
    John (instance of organism)
    John’s temperature (instance of temperature)

John’s temperature inheres_in John

70

Page 70: What developers need to know about ontologies? Barry Smith  1

types:
    temperature
        37ºC, 37.1ºC, 37.5ºC, 37.2ºC, 37.3ºC, 37.4ºC

instances:
    John’s temperature

John’s temperature instantiates a different one of these determinate temperature types at each of the times t1, t2, t3, t4, t5, t6

71
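Time-indexed instantiation can be sketched as a mapping from times to the determinate type instantiated at that time. The pairing of times with readings below is illustrative only (the slide does not make the pairing recoverable), and the helper names are hypothetical.

```python
# John's temperature is one instance (a quality of John) that instantiates
# different determinate temperature types at different times.
# The pairing of times with readings is illustrative, not taken from the slide.
READINGS = {"t1": 37.0, "t2": 37.1, "t3": 37.5}


def instantiates_at(readings, time):
    """Return the determinate type instantiated at `time`, as a label."""
    return f"{readings[time]:.1f}ºC"


def determinate_types(readings):
    """All determinate types the single quality instance passes through."""
    return sorted({instantiates_at(readings, t) for t in readings})


print(instantiates_at(READINGS, "t2"))
print(determinate_types(READINGS))
```

There is one key point in this design: a single object (`READINGS`, standing in for John’s temperature) persists, while the type it instantiates varies with the time argument.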

Page 71: What developers need to know about ontologies? Barry Smith  1

types:
    human
        embryo, fetus, neonate, infant, child, adult

instances:
    John

John instantiates these types in turn, at the times t1, t2, t3, t4, t5, t6

72

Page 72: What developers need to know about ontologies? Barry Smith  1

• the lower level of types does not ‘carry identity’ in OntoClean terms

• these types are threshold divisions (hence we do not have sharp boundaries, and we have a certain degree of choice, e.g. in how many subtypes to distinguish, though not in their ordering)

73

Page 73: What developers need to know about ontologies? Barry Smith  1

types:
    independent continuant
        organism
    dependent continuant
        quality
            temperature

instances:
    John (instance of organism)
    John’s temperature (instance of temperature)

74

Page 74: What developers need to know about ontologies? Barry Smith  1

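types:
    independent continuant
        organism
    dependent continuant
        quality
            temperature
    occurrent
        process
            course of temperature changes

instances:
    John (instance of organism)
    John’s temperature (instance of temperature)
    John’s temperature history (instance of course of temperature changes)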

75

Page 75: What developers need to know about ontologies? Barry Smith  1

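types:
    independent continuant
        organism
    dependent continuant
        quality
            temperature
    occurrent
        process
            life of an organism

instances:
    John (instance of organism)
    John’s temperature (instance of temperature)
    John’s life (instance of life of an organism)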

76

Page 76: What developers need to know about ontologies? Barry Smith  1

BFO/GO: The Very Top

continuant
    independent continuant
        cellular component
    dependent continuant
        molecular function
occurrent
    biological processes

77

Page 77: What developers need to know about ontologies? Barry Smith  1

BFO: The Very Top

continuant
    independent continuant
    dependent continuant
        quality
        function
        role
        disposition
occurrent
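The top-level hierarchy can be sketched as a table of parent links with a transitive is_a check. The flat placement of quality, function, role and disposition directly under dependent continuant is an assumption read off the slide layout, and the helper names are hypothetical.

```python
# Parent links for the top-level terms named on the slide.
# Assumption: quality, function, role and disposition all sit directly
# under "dependent continuant", as the slide layout suggests.
IS_A = {
    "independent continuant": "continuant",
    "dependent continuant": "continuant",
    "quality": "dependent continuant",
    "function": "dependent continuant",
    "role": "dependent continuant",
    "disposition": "dependent continuant",
}


def ancestors(term):
    """Return every type above `term`, nearest parent first."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def is_a(child, parent):
    """True if `parent` is reachable from `child` by following is_a links."""
    return parent in ancestors(child)


print(ancestors("role"))
```

Because is_a is transitive, anything asserted of continuants in general carries over to roles, qualities and the rest without being restated.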

78

Page 78: What developers need to know about ontologies? Barry Smith  1

Function

- of liver: to store glycogen

- of birth canal: to enable transport

- of eye: to see

- of mitochondrion: to produce ATP

not optional; reflection of physical makeup of bearer; can malfunction

79

Page 79: What developers need to know about ontologies? Barry Smith  1

Role

optional: exists because the bearer is in some special natural, social, or institutional set of circumstances in which the bearer does not have to be

80

Page 80: What developers need to know about ontologies? Barry Smith  1

Role

- bearers can have more than one role

person as student / as staff member

- roles often form systems of mutual dependence

husband / wife

first in queue / last in queue

doctor / patient

host / pathogen

81

Page 81: What developers need to know about ontologies? Barry Smith  1

Role

of some chemical compound: to serve as analyte in an experiment

of a dose of penicillin in this human child: to treat a disease

of this bacterium in a primary host: to cause infection

82

Page 82: What developers need to know about ontologies? Barry Smith  1

Qualities are categorical features of reality – you just have them

Functions, roles and dispositions are potential features of reality: they are realizable dependent continuants, realized in certain associated processes

83

Page 83: What developers need to know about ontologies? Barry Smith  1

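types:
    independent continuant
        portion of chemical compound
    dependent continuant
        role
            drug role
    occurrent
        process
            process of drug administration

instances:
    this portion of aspirin (instance of portion of chemical compound)
    role of this portion of aspirin (instance of drug role)
    John’s taking this portion of aspirin (instance of process of drug administration)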

84

Page 84: What developers need to know about ontologies? Barry Smith  1

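types:
    independent continuant
        portion of chemical compound
    dependent continuant
        role
            drug role
    occurrent
        process
            process of drug administration

instances:
    this portion of aspirin (instance of portion of chemical compound)
    role of this portion of aspirin (instance of drug role)
    John’s taking this portion of aspirin (instance of process of drug administration)

role of this portion of aspirin inheres_in this portion of aspirin

role of this portion of aspirin realized_in John’s taking this portion of aspirin

85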
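The two relations on this slide can be sketched as fields on a small record: a role inheres in its bearer and is realized in processes. The `Role` class and `realize` helper are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    inheres_in: str                                    # the bearer (an independent continuant)
    realized_in: list = field(default_factory=list)    # processes that realize the role


aspirin_role = Role(
    name="drug role of this portion of aspirin",
    inheres_in="this portion of aspirin",
)


def realize(role, process):
    """Record that `process` realizes `role`."""
    role.realized_in.append(process)
    return role


realize(aspirin_role, "John's taking this portion of aspirin")
print(aspirin_role.realized_in)
```

Before `realize` is called the role already exists with an empty `realized_in` list, which mirrors the point that a realizable entity can exist without being realized.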

Page 85: What developers need to know about ontologies? Barry Smith  1

The Open Biomedical Ontologies (OBO) Foundry

Columns: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT). Rows: GRANULARITY.

ORGAN AND ORGANISM
    independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
    dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
    occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT
    independent continuant: Cell (CL); Cellular Component (FMA, GO)
    dependent continuant: Cellular Function (GO)

MOLECULE
    independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
    dependent continuant: Molecular Function (GO)
    occurrent: Molecular Process (GO)

86

Page 86: What developers need to know about ontologies? Barry Smith  1

• The Road to Convergence

All ontologies for each given domain (anatomy, chemistry…) should be part of a single suite of interoperable ontologies

should use a common top-level core

for subdomains with many variants, should follow the strategy of canonical ontologies with extensions

should require acceptance of common, tested guidelines on all subscribing ontology developers

87

Page 87: What developers need to know about ontologies? Barry Smith  1

initial OBO Foundry coverage, ontologies automatically semantically coupled

Columns: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT). Rows: GRANULARITY.

ORGAN AND ORGANISM
    independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
    dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
    occurrent: Organism-Level Process (GO)

CELL AND CELLULAR COMPONENT
    independent continuant: Cell (CL); Cellular Component (FMA, GO)
    dependent continuant: Cellular Function (GO)
    occurrent: Cellular Process (GO)

MOLECULE
    independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
    dependent continuant: Molecular Function (GO)
    occurrent: Molecular Process (GO)

88

Page 88: What developers need to know about ontologies? Barry Smith  1

Disposition (Internally-Grounded Realizable Entity)

disposition =def.

a realizable entity which if it ceases to exist, then its bearer is physically changed, and whose realization occurs when this bearer is in some special physical circumstances, in virtue of the bearer’s physical make-up

89

Page 89: What developers need to know about ontologies? Barry Smith  1

Function

• A Disposition (Internally-Grounded Realizable Entity) that is designed or selected for

90

Page 90: What developers need to know about ontologies? Barry Smith  1

OGMS

• Ontology for General Medical Science

http://code.google.com/p/ogms

91

Page 91: What developers need to know about ontologies? Barry Smith  1

Physical Disorder

– independent continuant (fiat object part)

92

Page 92: What developers need to know about ontologies? Barry Smith  1

Big Picture

93

Page 93: What developers need to know about ontologies? Barry Smith  1

A disease is a disposition rooted in a physical disorder in the organism and realized in pathological processes.

etiological process
    produces
disorder
    bears
disposition
    realized_in
pathological process
    produces
abnormal bodily features
    recognized_as
signs & symptoms
    used_in
interpretive process
    produces
diagnosis

94

Page 94: What developers need to know about ontologies? Barry Smith  1

Elucidation of Primitive Terms

• ‘bodily feature’ - an abbreviation for a physical component, a bodily quality, or a bodily process.

• disposition - an attribute describing the propensity to initiate certain specific sorts of processes when certain conditions are satisfied.

• clinically abnormal - some bodily feature that

– (1) is not part of the life plan for an organism of the relevant type (unlike aging or pregnancy),

– (2) is causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and

– (3) is such that the elevated risk exceeds a certain threshold level.*

*Compare: baldness

95

Page 95: What developers need to know about ontologies? Barry Smith  1

Definitions - Foundational Terms

• Disorder =def. – A causally linked combination of physical components that is clinically abnormal.

• Pathological Process =def. – A bodily process that is a manifestation of a disorder and is clinically abnormal.

• Disease =def. – A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism.
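The chain from etiological process to diagnosis shown on the preceding ‘Big Picture’ slide, which these definitions underpin, can be written down as a list of triples. The tuple encoding and the `trace` helper are illustrative choices, not part of OGMS itself.

```python
# Each triple is (subject, relation, object), following the chain on the
# "Big Picture" slide. The tuple encoding is an illustrative choice.
DISEASE_MODEL = [
    ("etiological process", "produces", "disorder"),
    ("disorder", "bears", "disposition"),
    ("disposition", "realized_in", "pathological process"),
    ("pathological process", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "signs & symptoms"),
    ("signs & symptoms", "used_in", "interpretive process"),
    ("interpretive process", "produces", "diagnosis"),
]


def trace(start, triples=DISEASE_MODEL):
    """Follow the chain from `start` and return the entities visited in order."""
    step = {subject: obj for subject, _, obj in triples}
    path = [start]
    while path[-1] in step:
        path.append(step[path[-1]])
    return path


print(trace("disorder"))
```

Keeping disorder, disposition (the disease) and pathological process as separate nodes is the point of the model: a diagnosis is about the disorder and the disease it bears, not about the symptoms alone.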

96

Page 96: What developers need to know about ontologies? Barry Smith  1

Dispositions and Predispositions

• All diseases are dispositions; not all dispositions are diseases.

• A predisposition is a disposition.

• Predisposition to Disease of Type X =def. – A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing the disease X.

• HNPCC is caused by a

– disorder (mutation) in a DNA mismatch repair gene that

– disposes to the acquisition of additional mutations from defective DNA repair processes, and thus is a

– predisposition to the development of colon cancer.

97

Page 97: What developers need to know about ontologies? Barry Smith  1

Cirrhosis - environmental exposure

• Etiological process - phenobarbital-induced hepatic cell death

– produces

• Disorder - necrotic liver

– bears

• Disposition (disease) - cirrhosis

– realized_in

• Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death

– produces

• Abnormal bodily features

– recognized_as

• Symptoms - fatigue, anorexia

• Signs - jaundice, splenomegaly

Symptoms & Signs used_in

Interpretive process produces

Hypothesis - rule out cirrhosis suggests

Laboratory tests produces

Test results - elevated liver enzymes in serum used_in

Interpretive process produces

Result - diagnosis that patient X has a disorder that bears the disease cirrhosis

98

Page 98: What developers need to know about ontologies? Barry Smith  1

Influenza - infectious

• Etiological process - infection of airway epithelial cells with influenza virus

– produces

• Disorder - viable cells with influenza virus

– bears

• Disposition (disease) - flu

– realized_in

• Pathological process - acute inflammation

– produces

• Abnormal bodily features

– recognized_as

• Symptoms - weakness, dizziness

• Signs - fever

Symptoms & Signs used_in

Interpretive process produces

Hypothesis - rule out influenza suggests

Laboratory tests produces

Test results - elevated serum antibody titers used_in

Interpretive process produces

Result - diagnosis that patient X has a disorder that bears the disease flu

But the disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course).

99

Page 99: What developers need to know about ontologies? Barry Smith  1

Huntington’s Disease - genetic

• Etiological process - inheritance of >39 CAG repeats in the HTT gene

– produces

• Disorder - chromosome 4 with abnormal mHTT

– bears

• Disposition (disease) - Huntington’s disease

– realized_in

• Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum

– produces

• Abnormal bodily features

– recognized_as

• Symptoms - anxiety, depression

• Signs - difficulties in speaking and swallowing

Symptoms & Signs used_in

Interpretive process produces

Hypothesis - rule out Huntington’s suggests

Laboratory tests produces

Test results - molecular detection of the HTT gene with >39 CAG repeats used_in

Interpretive process produces

Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease

100

Page 100: What developers need to know about ontologies? Barry Smith  1

HNPCC - genetic pre-disposition

• Etiological process - inheritance of a mutant mismatch repair gene

– produces

• Disorder - chromosome 3 with abnormal hMLH1

– bears

• Disposition (disease) - Lynch syndrome

– realized_in

• Pathological process - abnormal repair of DNA mismatches

– produces

• Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2)

– bears

• Disposition (disease) - non-polyposis colon cancer

– realized_in

• Symptoms (including pain)

101

Page 101: What developers need to know about ontologies? Barry Smith  1

The OBO Foundry Initiative

102

Page 102: What developers need to know about ontologies? Barry Smith  1

A good solution to the data integration problem must be:

• modular

• incremental

• bottom-up

• evidence-based

• revisable

• incorporate a strategy for motivating potential developers and users

103

Page 103: What developers need to know about ontologies? Barry Smith  1

GO is amazingly successful – but covers only three sorts of biological entities:

– cellular components

– molecular functions

– biological processes

and does not provide representations of disease-related phenomena

104

Page 104: What developers need to know about ontologies? Barry Smith  1

The Open Biomedical Ontologies (OBO) Foundry

Columns: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT). Rows: GRANULARITY.

ORGAN AND ORGANISM
    independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
    dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
    occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT
    independent continuant: Cell (CL); Cellular Component (FMA, GO)
    dependent continuant: Cellular Function (GO)

MOLECULE
    independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
    dependent continuant: Molecular Function (GO)
    occurrent: Molecular Process (GO)

105

Page 105: What developers need to know about ontologies? Barry Smith  1

OBO Foundry provides

• tested guidelines enabling new groups to develop the ontologies they need in ways which counteract forking and dispersion of effort

• an incremental bottom-up approach to evidence-based terminology practices in medicine that is rooted in basic biology

• automatic web-based linkage between medical terminologies and biological knowledge resources

• traffic laws and traffic police

106

Page 106: What developers need to know about ontologies? Barry Smith  1

the strategy

establish common rules governing best practices for creating ontologies in coordinated fashion, with an evidence-based pathway to incremental improvement

107

Page 107: What developers need to know about ontologies? Barry Smith  1

The methodology of cross-products

compound terms in ontologies to be defined as cross-products of simpler terms: e.g. elevated blood glucose is a cross-product of PATO: increased concentration with FMA: blood and ChEBI: glucose.

= factoring out of ontologies into discipline-specific modules (orthogonality)
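The cross-product idea can be sketched as a compound term assembled from simpler terms, each owned by a different ontology module. The field names (`quality`, `location`, `substance`) and the classes are hypothetical; only the three source terms come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    ontology: str
    label: str


@dataclass(frozen=True)
class CrossProduct:
    quality: Term     # drawn from PATO
    location: Term    # drawn from FMA
    substance: Term   # drawn from ChEBI

    def ontologies(self):
        """The set of ontology modules this compound term depends on."""
        return {self.quality.ontology, self.location.ontology, self.substance.ontology}


elevated_blood_glucose = CrossProduct(
    quality=Term("PATO", "increased concentration"),
    location=Term("FMA", "blood"),
    substance=Term("ChEBI", "glucose"),
)
print(sorted(elevated_blood_glucose.ontologies()))
```

No module has to define ‘elevated blood glucose’ itself; each contributes only the part that belongs to its own discipline.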

108

Page 108: What developers need to know about ontologies? Barry Smith  1

The methodology of cross-products

enforcing use of common relations in linking terms drawn from Foundry ontologies serves

• to ensure that the ontologies are maintained and revised in tandem

• logically defined relations serve to bind terms in different ontologies together to create a network

109

Page 109: What developers need to know about ontologies? Barry Smith  1

CRITERIA

openness

common formal language.

collaborative development

evidence-based maintenance

identifiers

versioning

textual and formal definitions


110

Page 110: What developers need to know about ontologies? Barry Smith  1

Orthogonality = modularity

• one ontology for each domain

• no need for mappings (which are in any case too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change)

• everyone knows where to look to find out how to annotate each kind of data
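One way to picture orthogonality is as a check that no two modules claim the same term. The sketch below is a toy under stated assumptions: the module names and labels are placeholders, and `overlapping_terms` is a hypothetical helper, not an OBO Foundry tool.

```python
from itertools import combinations


def overlapping_terms(modules):
    """Return {(module_a, module_b): shared labels} for every overlapping pair."""
    overlaps = {}
    for (name_a, terms_a), (name_b, terms_b) in combinations(sorted(modules.items()), 2):
        shared = set(terms_a) & set(terms_b)
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# Toy modules; the labels are placeholders, not real ontology content.
modules = {
    "anatomy": {"liver", "blood"},
    "chemistry": {"glucose", "blood"},   # deliberate overlap for the demo
    "quality": {"increased concentration"},
}
print(overlapping_terms(modules))
```

An empty result would mean each kind of data has exactly one place to look for its annotation terms, so no mappings between modules are needed.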

111

Page 111: What developers need to know about ontologies? Barry Smith  1

Ontologies and research groups

using BFO and RO

– OBO Foundry (60 biomedical ontologies, including GO, OBI, Protein Ontology, Cell Ontology, IDO …)

– National Cancer Institute (BiomedGT)

– NIF (NIH Neuroscience Information Framework)

– Cleveland Clinic Semantic Database

– Siemens

– AstraZeneca

– EU (ACGT Cancer Ontology, RAPS, …)

112

Page 112: What developers need to know about ontologies? Barry Smith  1

Because the ontologies in the Foundry

are built as orthogonal modules which form an incrementally evolving network

• scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network

• users are motivated by the assurance that the ontologies they turn to are maintained by experts

113

Page 113: What developers need to know about ontologies? Barry Smith  1

More benefits of orthogonality

• helps those new to ontology to find what they need

• to find models of good practice

• ensures mutual consistency of ontologies (trivially)

• and thereby ensures additivity of annotations

114

Page 114: What developers need to know about ontologies? Barry Smith  1

More benefits of orthogonality

• it rules out the sorts of simplification and partiality which may be acceptable under more pluralistic regimes

• thereby brings an obligation on the part of ontology developers to commit to scientific accuracy and domain-completeness

115

Page 115: What developers need to know about ontologies? Barry Smith  1

More criteria of a successful standard

1. intelligibility to users (consistent use of terms like ‘term’, ‘class’, ‘entity’, ‘object’ …)

2. track record of lessons learned (GO has 10 years of hard user testing)

3. lots of existing users (ontologies are like telephone networks)

116

Page 116: What developers need to know about ontologies? Barry Smith  1

The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the Basic Formal Ontology (BFO) including the Relation Ontology (RO)

http://ifomis.org/bfo

http://www.obofoundry.org/ro/

COMMON ARCHITECTURE

117

Page 117: What developers need to know about ontologies? Barry Smith  1

OBO Foundry Modular Organization

top level:
    Basic Formal Ontology (BFO)

mid-level:
    Information Artifact Ontology (IAO)
    Ontology for Biomedical Investigations (OBI)
    Spatial Ontology (BSPO)

domain level:
    Anatomy Ontology (FMA*, CARO)
    Environment Ontology (EnvO)
    Infectious Disease Ontology (IDO*)
    Biological Process Ontology (GO*)
    Cell Ontology (CL)
    Cellular Component Ontology (FMA*, GO*)
    Phenotypic Quality Ontology (PaTO)
    Subcellular Anatomy Ontology (SAO)
    Sequence Ontology (SO*)
    Molecular Function (GO*)
    Protein Ontology (PRO*)

118

Page 118: What developers need to know about ontologies? Barry Smith  1

BFO:continuant

Page 119: What developers need to know about ontologies? Barry Smith  1

BFO:occurrent

Page 120: What developers need to know about ontologies? Barry Smith  1

Example: The Cell Ontology

Page 121: What developers need to know about ontologies? Barry Smith  1

OBO Foundry Modular Organization

top level:
    Basic Formal Ontology (BFO)

mid-level:
    Information Artifact Ontology (IAO)
    Ontology for Biomedical Investigations (OBI)
    Spatial Ontology (BSPO)

domain level:
    Anatomy Ontology (FMA*, CARO)
    Environment Ontology (EnvO)
    Infectious Disease Ontology (IDO*)
    Biological Process Ontology (GO*)
    Cell Ontology (CL)
    Cellular Component Ontology (FMA*, GO*)
    Phenotypic Quality Ontology (PaTO)
    Subcellular Anatomy Ontology (SAO)
    Sequence Ontology (SO*)
    Molecular Function (GO*)
    Protein Ontology (PRO*)

122