Scientific Names and Descriptions for Organisms on the Semantic Web

Nathan Wilson 1, Han Wang 2, and Deborah McGuinness 3

1 Marine Biological Laboratory, 7 MBL St., Woods Hole, MA 02556, USA, nwilson@mbl.edu

2 Tetherless World Constellation, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, USA, [email protected]

3 Tetherless World Constellation, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, USA, [email protected]


For Example…

Candidate Names:

• Chroogomphus vinicolor
• Chroogomphus rutilus
• Chroogomphus ochraceus
• Chroogomphus
• Pine Spike

Photo: Chroogomphus vinicolor, California, USA. © 2007 Darvin DeShazer, CC-BY-NC

Photo: Pine Spike, California, USA. © 1993 Nathan Wilson, CC-BY

11/12/2012


Motivation

• Take advantage of the crowd to understand the world’s biodiversity.

• Clarify connection between observations and scientific literature.

• Provide accurate and machine-interpretable definitions for groups of organisms.

• Build a central repository for semantic descriptions of groups of organisms.


What is a Species?

• Concept: “A group of organisms capable of interbreeding and producing fertile offspring”

• Definition: A type specimen, a name, and a circumscription believed to describe the species to which that specimen belongs


Definition of a Scientific Name

• Latin Name

• Reference

• Type

• Circumscription, e.g. "Cap: overlapping, wavy; multicolored concentric"

Problem: Circumscriptions change frequently


Definition of a Scientific Name

• Latin Name

• Reference

• Type

• Multiple Circumscriptions:

  - Cap: 2-10 cm wide; usually overlapping or in a row or a rosette; kidney shaped, also described as fan shaped, sometimes fused laterally; can be flat to wavy; multicolored concentric

  - Cap: 2-10 cm wide; overlapping or in a row or a rosette; kidney shaped, also described as fan shaped; can be flat to wavy; multicolored concentric

  - Cap: 2-10 cm wide; overlapping, also rosette; can be flat to wavy; multicolored concentric; zones alternate

  - Cap: overlapping, wavy; multicolored concentric
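
As a rough illustration of the structure on this slide, the sketch below models a scientific name as a fixed Latin name, reference, and type, plus a growing list of circumscriptions. The class and field names, the Latin name, the reference, and the specimen identifier are all hypothetical placeholders, not part of the original work; only the two circumscription texts come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Circumscription:
    """One free-text description of the group of organisms."""
    text: str


@dataclass
class ScientificName:
    """Hypothetical model: name, reference, and type stay fixed,
    while circumscriptions accumulate over time."""
    latin_name: str
    reference: str
    type_specimen: str
    circumscriptions: List[Circumscription] = field(default_factory=list)

    def add_circumscription(self, text: str) -> None:
        # A new circumscription does not replace the earlier ones.
        self.circumscriptions.append(Circumscription(text))


# All three identifying values below are made-up placeholders.
name = ScientificName(
    latin_name="Examplea hypothetica",
    reference="Hypothetical Author (2012)",
    type_specimen="TYPE-0001",
)
name.add_circumscription("Cap: overlapping, wavy; multicolored concentric")
name.add_circumscription(
    "Cap: 2-10cm wide; overlapping, also rosette; can be flat to wavy; "
    "multicolored concentric; zones alternate"
)
print(len(name.circumscriptions))
```

The point of the sketch is only that the same name ends up attached to several differing descriptions, which is the problem the slide raises.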


Scientific Names & Observations

[Diagram: Scientific Names and Observations, with the labels Type, Circumscription(s), People, and Genetic Barcode]


Semantic Vernacular Description (SVD)

An SVD has:

• Identifier, e.g. SV1234

• Name, e.g. PineSpike

• Description, e.g. EquivalentTo: Fungus and (hasOverallShape some StipitateAgaric) and (hasHymenophoreShape some Gilled) and ((hasPileusDiscColor some Brown)...

• Scientific Name(s), e.g. Chroogomphus rutilus, Chroogomphus vinicolor, Chroogomphus ochraceus
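
The four parts of an SVD listed above can be pictured as a simple record. This is a minimal sketch, assuming a flat record with the description kept as opaque text; the class name `SVD`, its field names, and the helper `candidate_names` are hypothetical and are not taken from the prototype. The description string is truncated exactly as on the slide.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SVD:
    """Hypothetical record mirroring the four parts shown on the slide."""
    identifier: str
    name: str
    description: str  # OWL class expression, stored here as plain text
    scientific_names: Tuple[str, ...]


pine_spike = SVD(
    identifier="SV1234",
    name="PineSpike",
    # Truncated on the slide; kept truncated here.
    description=(
        "Fungus and (hasOverallShape some StipitateAgaric) and "
        "(hasHymenophoreShape some Gilled) and "
        "((hasPileusDiscColor some Brown)..."
    ),
    scientific_names=(
        "Chroogomphus rutilus",
        "Chroogomphus vinicolor",
        "Chroogomphus ochraceus",
    ),
)


def candidate_names(svd: SVD) -> List[str]:
    """Return the vernacular name followed by the associated scientific names."""
    return [svd.name, *svd.scientific_names]


print(candidate_names(pine_spike))
```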


Definition of a Scientific Name

• Latin Name

• Reference

• Specimen

• Multiple Circumscriptions: SV5678, SV4567, SV3456, SV2345


SVDs

[Diagram: SVDs, Scientific Names, and Observations, with the labels Type, Circumscription(s), People, and Genetic Barcode]


Fungal Ontology

• Provides a controlled vocabulary for describing an observation.

• Associates an observation with one or more scientific names.

• Starts with macroscopic features.

• Moving into microscopic, chemical, and molecular features.

• Generated by an open collaborative process.


Observational Features

ObjectProperty: 'has surface color'
    Annotations: label "has surface color"^^Literal
    SubPropertyOf: 'has color'
    Domain: Fungus and (('has overall shape' some earthstar) or ('has overall shape' some gasteroid))
    Range: 'Color Value Partition'


Descriptions

Class: SV1112
    Annotations: hasID "1112"^^positiveInteger
    EquivalentTo: Fungus
        and (('has surface color' some white) or ('has surface color' some gray) or ('has surface color' some off-white))
        and ('has hymenophore shape' some 'spore mass')
        and ('has overall shape' some gasteroid)
        and ('has substrate attachment' some pileate-sessile)
    SubClassOf: 'proposed at' value "2012-07-03T12:00:00-05:00"^^dateTime,
        'Vernacular Feature Description',
        'proposed by' value SV1090
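
To make the SV1112 definition concrete, the sketch below checks an observation against the same feature conditions in plain Python. It is a closed-world simplification: an OWL reasoner works under open-world semantics and would classify individuals from the ontology itself. The `Observation` type and the helpers `has_some` and `matches_sv1112` are hypothetical names, and the observation is simply assumed to be a Fungus.

```python
from typing import Dict, Set

# Hypothetical representation: property label -> set of observed value labels.
Observation = Dict[str, Set[str]]


def has_some(obs: Observation, prop: str, *values: str) -> bool:
    """True if the observation has at least one of the given values for prop."""
    return bool(obs.get(prop, set()) & set(values))


def matches_sv1112(obs: Observation) -> bool:
    """Mirror of the EquivalentTo expression for SV1112 (Fungus is assumed)."""
    return (
        has_some(obs, "has surface color", "white", "gray", "off-white")
        and has_some(obs, "has hymenophore shape", "spore mass")
        and has_some(obs, "has overall shape", "gasteroid")
        and has_some(obs, "has substrate attachment", "pileate-sessile")
    )


puffball_like = {
    "has surface color": {"off-white"},
    "has hymenophore shape": {"spore mass"},
    "has overall shape": {"gasteroid"},
    "has substrate attachment": {"pileate-sessile"},
}
brown_one = dict(puffball_like, **{"has surface color": {"brown"}})

print(matches_sv1112(puffball_like))
print(matches_sv1112(brown_one))
```

An observation with no recorded surface color fails the check here, whereas under open-world reasoning a missing value would just leave the classification undetermined.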


SVDs

Class: SV1012
    Annotations: hasID "1012"^^positiveInteger
    SubClassOf: 'has SVD name' value WhitePuffball,
        'Semantic Vernacular Description',
        'has definition' some SV1112,
        'has associated scientific name' some 'Bovista pila',
        'has associated scientific name' some 'Lycoperdon perlatum'


Fungal Ontology


Highlights

• Explicit, fixed collection of observational features

• ‘Duck’ Typing: If it ‘looks’ like a PineSpike, it is a PineSpike

• Amenable to peer review/codification

• Inherently unique

• Stable for observers


Peer-review Process

• Every SVD needs review before use

• Alternative names and definitions can be proposed

• Discussion/voting happens where there are alternatives

• Votes are weighted according to users’ past contributions to the process
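
The weighted voting step could look roughly like the sketch below. The slides do not say how weights are derived from past contributions, so the weights here are made-up numbers passed in directly; the function names `tally` and `winner`, the user names, and the alternative proposal name are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tally(votes: Iterable[Tuple[str, str]],
          weights: Dict[str, float]) -> Dict[str, float]:
    """Sum the weight of each voter behind the proposal they voted for.

    votes: (user, proposal) pairs.
    weights: per-user weight; users without an entry count as 1.0 (assumption).
    """
    totals: Dict[str, float] = defaultdict(float)
    for user, proposal in votes:
        totals[proposal] += weights.get(user, 1.0)
    return dict(totals)


def winner(totals: Dict[str, float]) -> str:
    """Return the proposal with the highest weighted total."""
    return max(totals, key=totals.get)


# Hypothetical users and weights.
votes = [
    ("alice", "PineSpike"),
    ("bob", "PineSpikeMushroom"),
    ("carol", "PineSpikeMushroom"),
]
weights = {"alice": 3.0, "bob": 1.0, "carol": 1.0}

totals = tally(votes, weights)
print(totals, winner(totals))
```

In this made-up case one experienced contributor outweighs two newer ones, which is the effect the last bullet describes.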


Implementation Prototype


• A Ruby on Rails application

• Triple store powered by Jena TDB

• RESTful web service ready for use

http://mushroomobserver.org/semantic_vernacular


Questions? Comments?



Acknowledgements

• Katie Dunn

• Jason Hollinger

• Encyclopedia of Life

• Tetherless World Constellation

• Marine Biological Laboratory

• Rensselaer Polytechnic Institute

• Mushroom Observer Community