Hiding ontologies under the carpet - OntoSIB 2013
Presentation for the OntoSIB 2013 meeting in Geneva
Hiding ontologies under the carpet
Frederic Bastian – Bgee team
OntoSIB - August 2013
© 2013 SIB
Big players of the Internet never exhibit ontologies
Amazon:
Big players of the Internet never exhibit ontologies
Google Knowledge Graph:
Big players of the Internet never exhibit ontologies
Facebook Open Graph:
Meanwhile… Bgee is ontology-centric
Expression of mouse Hoxa5
But lists are not usable…
Expression of mouse Hoxa5
Neither are tables
Expression of mouse Hoxa5
Solutions are…
On the analytical side:
- identify the most pertinent biological signal
On the visualization side:
- display the most pertinent information (e.g., remove redundancy)
- guide users to the information they look for
But an ontology-centric display is terrible at guidance and pertinence.
The solution adopted by neXtProt
Revamp the ontologies to make the organization clearer, and the hierarchy simpler.
Bgee needs a different approach
- Bgee includes several species; revamping ontologies would be too time-consuming.
- Bgee includes in situ hybridizations with great granularity; we don’t want to lose it.
- Things are getting even worse with the use of the Uberon ontology.
We are now trying the approach of completely hiding the ontologies from the users!
Solution 1
Summarize information on the fly:
Walk the ontology from the root to find the x most general terms
1. Start walk from the root
2. Walk 1st level: 2 terms identified
3a. Walk 2nd level, 1st term: 2 terms identified
3b. Walk 2nd level, 2nd term: 2 terms identified
4a. Walk 3rd level, 1st term: 4 terms identified
4b. Walk 3rd level, 2nd term: 11 terms identified
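One way to read the walk above is as a frontier expansion: starting from the root, each term is replaced by its children that carry data, most general first, until at least x terms are held. The sketch below is written under that assumption and is not the Bgee implementation; the function name, the `children` mapping, the `with_data` set and the toy term names are all hypothetical.

```python
from collections import deque

def most_general_terms(children, with_data, root, x):
    """Expand terms from the root, most general first, until at least x terms are held.

    children:  dict mapping a term to the list of its direct children (assumed input)
    with_data: set of terms having expression data, directly or through a descendant
    """
    queue = deque([root])   # terms that may still be expanded, most general first
    final = []              # terms with no unvisited child carrying data
    seen = {root}           # guards against terms reachable through several parents
    while queue and len(queue) + len(final) < x:
        term = queue.popleft()
        kids = [c for c in children.get(term, ())
                if c in with_data and c not in seen]
        if kids:
            seen.update(kids)
            queue.extend(kids)   # replace the term by its more specific children
        else:
            final.append(term)   # nothing more specific: keep the term itself
    return final + list(queue)

# Toy ontology (hypothetical names, not real EMAPA/MA terms).
toy = {
    "root": ["adult", "embryo"],
    "adult": ["adult_body"],
    "embryo": ["embryo_body"],
    "adult_body": ["a1", "a2", "a3"],
    "embryo_body": ["e1", "e2", "e3", "e4"],
}
all_terms = set(toy) | {c for kids in toy.values() for c in kids}
print(most_general_terms(toy, all_terms, "root", x=5))
```

With this toy data the number of terms held after each expansion is 2, 2, 2, 4, then 7. As on the slides, the walk can overshoot x, because a term is always replaced by all of its children at once.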
Solution 1
EMAPA:16060 cavities and their linings
EMAPA:16072 primitive streak
EMAPA:16097 mesenchyme
EMAPA:16103 organ system
EMAPA:16405 limb
EMAPA:16748 tail
EMAPA:17213 skeleton
EMAPA:17743 vertebral axis muscle system
MA:0000003 organ system
MA:0002405 postnatal mouse
MA:0002433 anatomic region
11 terms identified
Uberon includes “subsets” that would allow filtering out meaningless terms
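As a rough illustration of subset-based filtering, the sketch below drops every term tagged with an excluded subset. The `subsets_of` mapping, the function name and the term names are hypothetical toy data; only the idea of filtering on subset membership comes from the slide.

```python
def filter_by_subset(terms, subsets_of, excluded):
    """Keep the terms that belong to none of the excluded subsets."""
    return [t for t in terms if not (subsets_of.get(t, set()) & excluded)]

# Toy data: which subsets each term is tagged with (hypothetical assignments).
subsets_of = {
    "anatomic_region": {"upper_level"},
    "organ_system": {"upper_level"},
    "limb": set(),
}
terms = ["anatomic_region", "organ_system", "limb", "tail"]
print(filter_by_subset(terms, subsets_of, excluded={"upper_level"}))  # ['limb', 'tail']
```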
Solution 2
Find most precise and independent terms:
Walk the ontology from the leaves to the root.
Data found in a term will prevent its ancestors from being displayed (redundancy), but not its siblings (independence).
1. Start the walk from the leaves
2. Remove ancestors
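The two steps can be sketched as follows: every ancestor of a term with data is marked redundant, and only the unmarked terms are kept. This is a minimal sketch, assuming a `parents` mapping from each term to its direct parents; the function name and the toy hierarchy are hypothetical and do not reflect the real EMAPA structure.

```python
def most_precise_terms(parents, annotated):
    """Return the annotated terms that are not an ancestor of another annotated term."""
    redundant = set()
    for term in annotated:
        stack = list(parents.get(term, ()))
        while stack:
            ancestor = stack.pop()
            if ancestor not in redundant:
                redundant.add(ancestor)   # data in a descendant hides this term
                stack.extend(parents.get(ancestor, ()))
    return set(annotated) - redundant

# Toy hierarchy (hypothetical): hand -> forelimb -> limb -> body, hindlimb -> limb.
toy_parents = {
    "hand": ["forelimb"],
    "forelimb": ["limb"],
    "hindlimb": ["limb"],
    "limb": ["body"],
}
annotated = {"limb", "forelimb", "hand", "hindlimb"}
print(sorted(most_precise_terms(toy_parents, annotated)))  # ['hand', 'hindlimb']
```

Siblings never hide each other here, since only strict ancestors are marked as redundant.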
Solution 2
Limb example:
EMAPA:17459 footplate
EMAPA:17428 handplate
EMAPA:17713 humerus cartilage condensation
EMAPA:16779 hindlimb bud
EMAPA:16406 forelimb bud
- The visualization tool should make it easy to get information about the ancestors of selected terms.
- This solution could still lead to an unorganized list of many terms.
Solution 3
Display most precise and independent terms, organized by most general terms
Solution 3
Example: general terms
EMAPA:16405 limb
EMAPA:16748 tail
EMAPA:17213 skeleton
Solution 3
Example: general terms + precise and independent terms
EMAPA:16405 limb
    EMAPA:17459 footplate
    EMAPA:17428 handplate
    EMAPA:17713 humerus cartilage condensation
    EMAPA:16779 hindlimb bud
    EMAPA:16406 forelimb bud
EMAPA:16748 tail
    EMAPA:16752 unsegmented mesenchyme
EMAPA:17213 skeleton
    EMAPA:18344 sternum
    EMAPA:19387 S1
    EMAPA:19388 S2
    EMAPA:19364 S3
    EMAPA:19365 S4
    EMAPA:18010 rib
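The grouping shown in this example could be produced by attaching each precise term to every general term found among its ancestors. The sketch below assumes a `parents` mapping from each term to its direct parents; the function names and the toy hierarchy are hypothetical, not the Bgee code or the real EMAPA structure.

```python
def ancestors(parents, term):
    """Collect all ancestors of a term by following the parent links."""
    found = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found

def group_by_general(parents, general, precise):
    """Map each general term to the precise terms located under it."""
    groups = {g: [] for g in general}
    for term in precise:
        reachable = ancestors(parents, term) | {term}
        for g in general:
            if g in reachable:
                groups[g].append(term)
    return groups

# Toy hierarchy (hypothetical names).
toy_parents = {
    "hand": ["forelimb"], "forelimb": ["limb"], "hindlimb": ["limb"],
    "rib": ["skeleton"], "limb": ["body"], "skeleton": ["body"],
}
grouped = group_by_general(toy_parents, ["limb", "skeleton"], ["hand", "hindlimb", "rib"])
print(grouped)  # {'limb': ['hand', 'hindlimb'], 'skeleton': ['rib']}
```

A precise term with several general ancestors appears under each of them, which seems acceptable for a display but is a design choice rather than something stated on the slides.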
Owltools enhancement – OWLGraphManipulator
OWLGraphManipulator can perform:
• enhanced relation reductions
• class removal and relation propagation
• relation mapping to parent
• relation filtering or removal
• subgraph filtering or removal
• relation removal to subset if non orphan
• combination of these methods for generating basic ontologies
Used to simplify the Uberon ontology.
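To illustrate what "class removal and relation propagation" can mean, here is a minimal Python sketch on a plain child-to-parents mapping: the removed class disappears and its children are re-attached to its parents. This is not the OWLGraphManipulator API (which is part of owltools, in Java); the function name and the untyped toy graph are assumptions made for the example.

```python
def remove_class(parents, removed):
    """Return a new child -> parents mapping without `removed`.

    Children of the removed class inherit its parents, so no path is lost.
    """
    inherited = parents.get(removed, [])
    result = {}
    for term, term_parents in parents.items():
        if term == removed:
            continue
        new_parents = []
        for parent in term_parents:
            # A link to the removed class is replaced by links to its parents.
            for replacement in (inherited if parent == removed else [parent]):
                if replacement not in new_parents:
                    new_parents.append(replacement)
        result[term] = new_parents
    return result

# Toy graph (hypothetical): hand -> forelimb -> limb.
toy_graph = {"hand": ["forelimb"], "forelimb": ["limb"], "limb": []}
print(remove_class(toy_graph, "forelimb"))  # {'hand': ['limb'], 'limb': []}
```

Real propagation has to take relation types into account (for instance how is_a and part_of combine), which this untyped sketch deliberately ignores.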
Uberon tweaking
Use of OWLGraphManipulator on Uberon:
• From the global version, keep only relevant species.
• Clear relations to upper_level (obscure) terms
• Remove subgraphs of obscure terms, keeping shared classes
• Keep only is_a, part_of, develops_from, and sub-relations
• Simplify graph structure over is_a/part_of
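The relation filtering step can be pictured as follows, with edges stored as (subject, relation, object) triples. This is only a sketch of the idea, not the OWLGraphManipulator implementation; the sub-relation table, its `example_sub_relation` entry and the toy edges are hypothetical.

```python
KEPT_RELATIONS = {"is_a", "part_of", "develops_from"}

# Hypothetical table mapping a sub-relation to its parent relation.
SUB_RELATION_OF = {"example_sub_relation": "part_of"}

def is_kept(relation):
    """True if the relation, or one of its parent relations, is in the kept set."""
    while relation is not None:
        if relation in KEPT_RELATIONS:
            return True
        relation = SUB_RELATION_OF.get(relation)
    return False

def filter_relations(edges):
    """Keep only the edges whose relation is allowed."""
    return [edge for edge in edges if is_kept(edge[1])]

# Toy edges (hypothetical terms).
toy_edges = [
    ("hand", "part_of", "forelimb"),
    ("forelimb", "is_a", "limb"),
    ("hand", "adjacent_to", "wrist"),
    ("finger", "example_sub_relation", "hand"),
]
print(filter_relations(toy_edges))
```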
Quality codes
Development of an ontology to capture confidence in an annotation, following the Biocuration 2012 meeting.
http://wiki.isb-sib.ch/biocuration/Confidence_information_draft
Homology annotation
We annotate homology using Uberon, by providing for each annotation:
• Uberon ID: UBERON:0003126 trachea
• NCBI taxon ID: 32523 Tetrapoda
• HOM ID: HOM:0000007 historical homology
• Evidence Code ID: ECO:0000060 positional similarity evidence
• Confidence Code ID: CONF:0000003 High confidence
• References: ISBN:978-0030223693 "Liem KF, Bemis WE, Walker WF, Grande L, Functional Anatomy of the Vertebrates: An Evolutionary Perspective (2001) p.591-592"
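The fields of such an annotation map naturally onto a small record type. The sketch below shows one possible representation of the trachea example; the class and field names are hypothetical and do not describe the actual Bgee annotation file format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HomologyAnnotation:
    """One homology annotation, with the six fields listed on the slide."""
    uberon_id: str        # anatomical structure
    taxon_id: int         # NCBI taxon at which the homology holds
    hom_id: str           # type of homology
    evidence_code: str    # ECO term supporting the annotation
    confidence_code: str  # CONF term giving the confidence level
    reference: str        # supporting publication

trachea = HomologyAnnotation(
    uberon_id="UBERON:0003126",
    taxon_id=32523,
    hom_id="HOM:0000007",
    evidence_code="ECO:0000060",
    confidence_code="CONF:0000003",
    reference="ISBN:978-0030223693",
)
print(trachea)
```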
Conclusion
Some big players on the Internet make intensive use of ontologies (Amazon, Google, Facebook, …)
They invest a lot in usability and user-friendliness. What we can learn from them is:
They never, ever, display ontologies as such
In bioinformatics, we have much more information to capture in ontologies.
But the lack of usability prevents biologists from accessing this knowledge.
It is now time to invest more in usability.
We hope that our approach will make it easier to use ontologies to analyse gene expression data.
Thank You
Marta Rosikiewicz
Sébastien Moretti
Anne Niknejad
Mathieu Seppey
Marc Robinson-Rechavi