Building the Ontology Landscape for Cancer Big Data Research
Barry Smith
May 12, 2015
Addressing cancer big data challenges
Session 1: through imaging ontologies (BS)
Session 2: by capturing metadata for data integration and analysis (Chris Stoeckert)
Session 3: through the Ontology of Disease (Lynn Schriml and Lindsay Cowell)
Public Session: Cancer Big Data to Knowledge (BS)
National Center for Biomedical Ontology (NCBO)
NIH Roadmap Center 2005-2015
Gene Ontology
Semantic Web
NCBO
Old biology data
MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDV
New biology data
How to do biology across the genome?
MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVMKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVMKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVMKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMR
TFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDV
how to link the kinds of phenomena represented here
MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPIPSKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDSFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVVAGEAASSNHHQKISRVTRKRPREPKSTNDILVAGQKLFGSSFEFRDLHQLRLCYEIYMADTPSVAVQAPPGYGKTELFHLPLIALASKGDVEYVSFLFVPYTVLLANCMIRLGRRGCLNVAPVRNFIEEGYDGVTDLYVGIYDDLASTNFTDRIAAWENIVECTFRTNNVKLGYLIVDEFHNFETEVYRQSQFGGITNLDFDAFEKAIFLSGTAPEAVADAALQRIGLTGLAKKSMDINELKRSEDLSRGLSSYPTRMFNLIKEKSEVPLGHVHKIRKKVESQPEEALKLLLALFESEPESKAIVVASTTNEVEELACSWRKYFRVVWIHGKLGAAEKVSRTKEFVTDGSMQVLIGTKLVTEGIDIKQLMMVIMLDNRLNIIELIQGVGRLRDGGLCYLLSRKNSWAARNRKGELPPKEGCITEQVREFYGLESKKGKKGQHVGCCGSRTDLSADTVELIERMDRLAEKQATASMSIVALPSSFQESNSSDRYRKYCSSDEDSNTCIHGSANASTNASTNAITTASTNVRTNATTNASTNATTNASTNASTNATTNASTNATTNSSTNATTTASTNVRTSATTTASINVRTSATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSATTTASINVRTSATTTESTNSSTSATTTASINVRTSATTTKSINSSTNATTTESTNSNTNATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSAATTESTNSNTSATTTESTNASAKEDANKDGNAEDNRFHPVTDINKESYKRKGSQMVLLERKKLKAQFPNTSENMNVLQFLGFRSDEIKHLFLYGIDIYFCPEGVFTQYGLCKGCQKMFELCVCWAGQKVSYRRIAWEALAVERMLRNDEEYKEYLEDIEPYHGDPVGYLKYFSVKRREIYSQIQRNYAWYLAITRRRETISVLDSTRGKQGSQVFRMSGRQIKELYFKVWSNLRESKTEVLQYFLNWDEKKCQEEWEAKDDTVVVEALEKGGVFQRLRSMTSAGLQGPQYVKLQFSRHHRQLRSRYELSLGMHLRDQIALGVTPSKVPHWTAFLSMLIGLFYNKTFRQKLEYLLEQISEVWLLPHWLDLANVEVLAADDTRVPLYMLMVAVHKELDSDDVPDGRFDILLCRDSSREVGELIGLFYNKTFRQKLEYLLEQISEVWLLPHWLDLANVEVLAADDTRVPLYMLMVAVHKELDSDDVPDGRFDILLCRDSSREVGELIGLFYNKTFRQKLEYLLEQISEVWLLPHWLDLANVEVLAADDTRVPLYMLMVAVHKELDSDDVPDGRFDILLCRDSSREVGE
to data like this?
Answer
Tag the data with meaningful labels which together form an ontology
~ Semantic enhancement
An ontology is a controlled structured vocabulary to support annotation of data
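To make the idea of tagging concrete, here is a minimal Python sketch of semantic enhancement. It is an illustration only: the vocabulary, the term IDs (prefixed EX:), the record names, and the helpers `annotate` and `records_tagged_with` are all hypothetical and are not taken from any real ontology or tool.

```python
# Hypothetical controlled vocabulary: term ID -> preferred label.
# The EX: identifiers are invented for illustration, not real ontology IDs.
VOCABULARY = {
    "EX:0001": "cell",
    "EX:0002": "beta cell",
    "EX:0003": "cardiac muscle development",
}


def annotate(annotations, record_id, term_id):
    """Tag a data record with a term, rejecting IDs outside the vocabulary."""
    if term_id not in VOCABULARY:
        raise KeyError(f"unknown term: {term_id}")
    annotations.setdefault(record_id, set()).add(term_id)


def records_tagged_with(annotations, term_id):
    """Return the sorted IDs of all records carrying the given tag."""
    return sorted(r for r, terms in annotations.items() if term_id in terms)


# Two otherwise opaque sequence records receive the same meaningful label.
annotations = {}
annotate(annotations, "sequence_A", "EX:0003")
annotate(annotations, "sequence_B", "EX:0003")
annotate(annotations, "sequence_B", "EX:0002")
```

Because both records carry the same tag, a query by term retrieves them together even though the raw sequences themselves say nothing about what they are data about. Restricting tags to the vocabulary is what makes it "controlled".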
Questions
How to build an ontology?
How to bring it about that all scientists in each domain use the same ontology to annotate their data?
How to bring it about that scientists in neighboring domains use ontologies that are interoperable?
By far the most successful: GO (Gene Ontology)
GO provides a controlled vocabulary of terms for use in annotating (describing, tagging) data
• multi-species, multi-disciplinary, open source
• built by biologists, maintained and improved by biologists
• contributes to the cumulativity of scientific results obtained by distinct research communities
International System of Units (SI)
Gene products involved in cardiac muscle development in humans
Prerequisites for ontology success
• Aggressive use in tagging data across multiple communities
• Feedback cycle between ontology editors and ontology users to ensure continuous update
• Logically and biologically coherent definitions
– logical = to allow computational reasoning and quality assurance
– biological = to ensure consistency between ontologies
GO is amazingly successful
but it covers only generic biological entities of three sorts:
– cellular components
– molecular functions
– biological processes
and it does not provide representations of diseases, symptoms, anatomy, pathways, experiments …
Ontology success stories, and some reasons for failure
So people started building the needed extra ontologies more or less at random
Definition: Reaching a decision through the application of an algorithm designed to weigh the different factors involved.
Definition: Reaching a decision through the application of an algorithm designed to weigh the different factors involved.
Confuses an algorithm with an act of reaching a decision
Defines ‘algorithm’ as a special kind of application of an algorithm. (This is worse than circular.)
John Fox (Director, OpenClinical)
As a user and teacher of ontological methods in medicine and engineering I have for years warned my students that the design of domain ontologies is a black art with no theoretical foundations and few practical principles.
Ontology success stories, and some reasons for failure
Linked Open Data, from Musicbrainz to Mouse Genome Informatics
What are the criteria of success for ontologies in supporting reasoning over Big Data?
1. logically and biologically correct subsumption hierarchies
– correct: Beta cell is_a cell
– incorrect: allergy is_a allergy record in Microsoft Healthvault
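A short Python sketch can show why subsumption errors matter for reasoning. It treats is_a as a transitive relation and computes everything a term is inferred to be. The intermediate class "endocrine cell" and the parent "record" are assumptions added for illustration; only the two is_a statements quoted on the slide come from the source.

```python
# Asserted is_a links, child -> set of parents.
IS_A = {
    "beta cell": {"endocrine cell"},  # intermediate class assumed for illustration
    "endocrine cell": {"cell"},
    "allergy": {"allergy record"},    # the incorrect assertion quoted on the slide
    "allergy record": {"record"},     # parent assumed for illustration
}


def ancestors(term, is_a=IS_A):
    """Return every class reachable from term by following is_a links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in is_a.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumed_by(term, candidate, is_a=IS_A):
    """True if term is the candidate class or is inferred to be one of its subclasses."""
    return term == candidate or candidate in ancestors(term, is_a)
```

With the correct hierarchy, a query for data about cells also returns data tagged with "beta cell". With the incorrect link, the same mechanism concludes that an allergy is a kind of record, so every downstream inference inherits the mistake.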
John Fox, again
As a user and teacher of ontological methods in medicine and engineering I have for years warned my students that the design of domain ontologies is a black art with no theoretical foundations and few practical principles. … I now have a much more positive story for my students. … In the journey from black art to a truly scientific theory for ontology design this book is an important milestone.
Original OBO Foundry ontologies (Gene Ontology in yellow)
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT) by GRANULARITY
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
– CHEBI: Chemical Entities of Biological Interest
– CL: Cell Ontology
– GO: Gene Ontology
– OBI: Ontology for Biomedical Investigations
– PATO: Phenotypic Quality Ontology
– PO: Plant Ontology
– PRO: Protein Ontology
– XAO: Xenopus Anatomy Ontology
– ZFA: Zebrafish Anatomy Ontology
http://obofoundry.org
Extension Strategy + Modular Organization
Levels: top level, mid-level, domain level
Basic Formal Ontology (BFO): INDEPENDENT CONTINUANT (~THING); DEPENDENT CONTINUANT (~ATTRIBUTE); OCCURRENT (~PROCESS)
Ontologies shown: Anatomy Ontology (FMA*, CARO); Disease Ontology (OGMS, IDO, HDO, HPO); Biological Process Ontology (GO); Cell Ontology (CL); Subcellular Anatomy Ontology (SAO); Phenotypic Quality Ontology (PATO); Sequence Ontology (SO); Molecular Function Ontology (GO); Protein Ontology (PRO)
Example: The Cell Ontology
Rationale of OBO Foundry coverage
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT) by GRANULARITY
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO); Cellular Process (GO)
MOLECULE: Molecule (ChEBI, SO, RNAO, PRO); Molecular Function (GO); Molecular Process (GO)
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT) by GRANULARITY
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
Environments: Environment Ontology (EnvO)
Examples of the OBO Foundry approach extended into other domains
NIF Standard: Neuroscience Information Framework
IDO Consortium: Infectious Disease Ontology Suite
cROP: Common Reference Ontologies for Plants
UNEP Ontology Framework: United Nations Environment Program Ontologies
Common Reference Ontologies for Plants (cROP)
The second important criterion of ontology success in supporting reasoning over Big Data is: keeping track of provenance
= recording how data was generated and processed in a way external users can understand, to enhance
• combinability
• reproducibility
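As a rough illustration of what recording provenance can look like in practice, the Python sketch below stores, for each dataset, the process and protocol that produced it and the datasets it was derived from, then traces a result back to its sources. The class `ProvenanceRecord`, its field names, the function `lineage`, and the example dataset names are all hypothetical and do not reproduce any OBI or IAO structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    """How one dataset was produced. All field names are hypothetical."""
    dataset_id: str
    generated_by: str         # the assay or processing step that produced it
    protocol: str             # identifier of the protocol that was followed
    derived_from: tuple = ()  # IDs of the input datasets, if any


def lineage(dataset_id, records):
    """Trace a dataset back through its inputs, starting with the dataset itself."""
    chain = []
    pending = [dataset_id]
    while pending:
        current = pending.pop()
        if current in chain:
            continue
        chain.append(current)
        record = records.get(current)
        if record is not None:
            pending.extend(record.derived_from)
    return chain


# Invented three-step pipeline, keyed by dataset ID.
records = {
    r.dataset_id: r
    for r in [
        ProvenanceRecord("raw_images", "imaging assay", "protocol_1"),
        ProvenanceRecord("segmented", "segmentation", "protocol_2", ("raw_images",)),
        ProvenanceRecord("measurements", "feature extraction", "protocol_3", ("segmented",)),
    ]
}
```

An external user who receives only "measurements" can call `lineage("measurements", records)` and see which steps and protocols stand behind it, which is the precondition for combining it with other data or reproducing it.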
Recognizing a new family of protocol-driven processes (investigation, assay, …)
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT CONTINUANT, DEPENDENT CONTINUANT; OCCURRENT) by GRANULARITY
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Biological Process (GO); Ontology for Biomedical Investigations (OBI)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
Vertical labels: Environment Ontology (ENVO); Phenotypic Quality (PATO)
Extension Strategy + Modular Organization
Basic Formal Ontology (BFO): INDEPENDENT CONTINUANT (~THING); DEPENDENT CONTINUANT (~ATTRIBUTE); OCCURRENT (~PROCESS)
Ontologies shown: Anatomy Ontology (FMA*, CARO); Disease Ontology (OGMS, IDO, HDO, HPO); Biological Process; Protocol-driven process (OBI); Cell Ontology (CL); Subcellular Anatomy Ontology (SAO); Phenotypic Quality Ontology (PATO); Sequence Ontology (SO); Molecular Function Ontology (GO); Protein Ontology (PRO)
Structure of a typical investigation as viewed by OBI (from http://obi-ontology.org/page/Investigation)
The Ontology for Biomedical Investigations
Recognizing a new family of information entities: data, publications, images, algorithms …
Axes: RELATION TO TIME (CONTINUANT: INDEPENDENT CONTINUANT, DEPENDENT CONTINUANT, INFORMATION ARTIFACT; OCCURRENT) by GRANULARITY
ORGAN AND ORGANISM: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Biological Process (GO)
CELL AND CELLULAR COMPONENT: Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO)
MOLECULE: Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO)
INFORMATION ARTIFACT (IAO): Software, Algorithms, …; Sequence Data, EHR Data …; Images, Image Data, Flow Cytometry Data, …
OBI; OBI: Imaging
Vertical labels: Environment Ontology (ENVO); Phenotypic Quality (PATO)
Extension Strategy + Modular Organization
Basic Formal Ontology (BFO): INDEPENDENT CONTINUANT (~THING); DEPENDENT CONTINUANT (~ATTRIBUTE); INFORMATION ARTIFACT (~DATA); OCCURRENT (~PROCESS)
Ontologies shown: Anatomy Ontology (FMA*, CARO); Disease Ontology (OGMS, IDO, HDO, HPO); Data; Biological Process; Assays; Cell Ontology (CL); Subcellular Anatomy Ontology (SAO); Phenotypic Quality Ontology (PATO); Sequence Ontology (SO); Molecular Function Ontology (GO); Protein Ontology (PRO)
Even here, things are not as bad as they seem
http://purl.obolibrary.org/obo/IAO_0000064: algorithm
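The identifier on this slide follows the common OBO pattern of a shared PURL base, an ontology prefix, an underscore, and a local number. The Python sketch below converts between that PURL form and the compact prefix:number form. The function names are hypothetical, and the sketch assumes the prefix itself contains no underscore.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def purl_to_curie(purl):
    """Turn an OBO PURL such as .../obo/IAO_0000064 into 'IAO:0000064'."""
    if not purl.startswith(OBO_PURL_BASE):
        raise ValueError(f"not an OBO PURL: {purl}")
    local = purl[len(OBO_PURL_BASE):]
    # Split at the first underscore: prefix on the left, local number on the right.
    prefix, sep, number = local.partition("_")
    if not (prefix and sep and number):
        raise ValueError(f"unexpected local part: {local}")
    return f"{prefix}:{number}"


def curie_to_purl(curie):
    """Turn a compact identifier such as 'IAO:0000064' back into its PURL."""
    prefix, sep, number = curie.partition(":")
    if not (prefix and sep and number):
        raise ValueError(f"not a prefix:number identifier: {curie}")
    return f"{OBO_PURL_BASE}{prefix}_{number}"
```

For the term shown on the slide, `purl_to_curie("http://purl.obolibrary.org/obo/IAO_0000064")` gives `"IAO:0000064"`, and `curie_to_purl` reverses it. A single stable identifier per term is what lets different ontologies and datasets point at the same definition of "algorithm".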
IAO = Information Artifact Ontology:
https://code.google.com/p/information-artifact-ontology/
A list of ontologies using IAO
– Adverse Event Reporting Ontology (AERO)
– Bioinformatics Web Service Ontology
– Biological Collections Ontology (BCO)
– Chemical Methods Ontology (CHMO)
– Cognitive Paradigm Ontology (COGPO)
– Comparative Data Analysis Ontology
– Computational Neuroscience Ontology
– Core Clinical Protocol Ontology (C2PO)
– Document Act Ontology
– Eagle-I Research Resource Ontology (ERO)
– The Email Ontology
– Emotion Ontology (MFOEM)
– Experimental Factor Ontology (EFO)
– Exposé Ontology
– IAO-Intel
– Infectious Disease Ontology (IDO)
– Influenza Research Database (IRD)
– Information Entity Ontology
– Mental Functioning Ontology (MF)
– Ontology for Biomedical Investigations
– Ontology for Drug Discovery Investigations
– Ontology for General Medical Science (OGMS)
– Ontology for Newborn Screening Follow-up and Translational Research (ONSTR)
– Ontology of Clinical Research (OCRE)
– Ontology of Data Mining (OntoDM)
– Ontology of Medically Related Social Entities (OMRSE)
– Ontology of Vaccine Adverse Events
– Oral Health and Disease Ontology (OHDO)
– Population and Community Ontology
– Proper Name Ontology
– Semanticscience Integrated Ontology
– Software Ontology (SWO)
– Translational Medicine Ontology (TMO)
– Twitter Ontology
– Vaccine Ontology (VO)
Basic Formal Ontology (BFO): INDEPENDENT CONTINUANT (~THING); DEPENDENT CONTINUANT (~ATTRIBUTE); OCCURRENT (~PROCESS)
IAO; OBI
Patient Demographics; Phenotype (Disease, …); Disease processes
Anatomy; Histology; Genotype (GO); Biological processes (GO); Chemistry
Data about all of these things including image data …
algorithms, software, protocols, …
Instruments, Biomaterials, Functions
Parameters, Assay types, Statistics …
aboutness
Basic Formal Ontology (BFO): INDEPENDENT CONTINUANT (~THING); DEPENDENT CONTINUANT (~ATTRIBUTE); OCCURRENT (~PROCESS)
IAO; OBI
Patient Demographics; Phenotype (Disease, …); Disease processes
Anatomy; Histology; Genotype (GO); Biological processes (GO); Chemistry
Data about all of these things including image data …
algorithms, software, protocols, …
Instruments, Biomaterials, Functions
Parameters, Assay types, Statistics
biomedical imaging ontology
The third important criterion of ontology success in supporting reasoning over Big Data is: use the framework of modular, general-purpose reference ontologies as starting points for creating families of purpose-specific application ontologies in ever widening circles (scalability)
BFO
Ontology for General Medical Science (OGMS)
Cardiovascular Disease Ontology
Genetic Disease Ontology
Cancer Disease Ontology
Immune Disease Ontology
Environmental Disease Ontology
Oral Disease Ontology
Infectious Disease Ontology
IDO Staph Aureus
IDO MRSA
IDO Australian MRSA
IDO Australian Hospital MRSA
…
Problems with:
Denys-Drash syndrome is_a rare non-neoplastic disorder
1. Denys-Drash syndrome involves nephroblastoma and is therefore neoplastic
2. X is_a rare Y does not track biology
What are the criteria of success for ontologies in supporting reasoning over Big Data?
correct: Beta cell is_a cell
incorrect: rare disease is_a disease
If the ontology hierarchy is to support biologically useful reasoning it must track biology