Page 1: Complex Data Modeling for Simpler Data Access

Complex Data Modeling for Simpler Data Access

TDWG 2014, Jönköping, Sweden

Ramona Walls

Robert Guralnick

Page 2: Complex Data Modeling for Simpler Data Access

A Canonical Example of “Opportunistic Collecting” typical in biocollections

Page 3: Complex Data Modeling for Simpler Data Access

plot

sub-plot

transect (within plot)

individual (within plot)

individual (within sub-plot)

Page 4: Complex Data Modeling for Simpler Data Access
Page 5: Complex Data Modeling for Simpler Data Access

transect

depth

* sample collection point

water sample at depth X

aliquot

metagenome
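The labels on this slide suggest a derivation chain running from the transect down to the metagenome. A minimal sketch of that chain follows, assuming the order read off the figure labels (transect, sample collection point, water sample, aliquot, metagenome). The `Entity` class and the `derived_from` field are hypothetical names chosen for illustration, not terms from BCO or DwC.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    """One node in the sampling chain (hypothetical illustration class)."""
    label: str
    derived_from: Optional["Entity"] = None  # assumed parent link, not a BCO term

    def lineage(self) -> List[str]:
        """Return labels from the root of the chain down to this entity."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.label)
            node = node.derived_from
        return chain[::-1]


# Chain order is an assumption based on the slide's labels.
transect = Entity("transect")
point = Entity("sample collection point", transect)
water = Entity("water sample at depth X", point)
aliquot = Entity("aliquot", water)
metagenome = Entity("metagenome", aliquot)

print(" -> ".join(metagenome.lineage()))
```

Keeping an explicit parent link on every entity means the full sampling context of a metagenome can be recovered by walking the chain, rather than being flattened into a single record.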

Page 6: Complex Data Modeling for Simpler Data Access

DwC

Bag of terms

Page 7: Complex Data Modeling for Simpler Data Access

http://vegbank.org/vegbank/general/faq.html#datamodel

Page 8: Complex Data Modeling for Simpler Data Access

http://vegbank.org/vegbank/general/faq.html#datamodel

?

Page 9: Complex Data Modeling for Simpler Data Access

Madin et al. 2007 Ecol. Informatics doi: 10.1016/j.ecoinf.2007.05.004

OBO-E:

O&M:

Page 10: Complex Data Modeling for Simpler Data Access

Most biology requires work at the intersection of disciplines

MUSEUM COLLECTIONS

ECOLOGY

GENOMICS

Page 11: Complex Data Modeling for Simpler Data Access

Material entities, information entities, and processes in the Basic Formal Ontology

Page 12: Complex Data Modeling for Simpler Data Access

observations versus specimens

Page 13: Complex Data Modeling for Simpler Data Access

Specimen data from a Darwin Core Archive: VertNet

Page 14: Complex Data Modeling for Simpler Data Access

specimen collection process

sampling process

material sampling process

sampling process logical definition: assay and (achieves_planned_objective some 'biological feature identification objective')

has_specified_input some 'sampling feature'
has_specified_output some 'sample data item'

specimen collection process logical definition: 'planned process' and (achieves_planned_objective some 'specimen collection objective')

has_specified_input some 'material entity'
has_specified_output some 'specimen'

material sampling process logical definition: 'planned process' and (achieves_planned_objective some 'material sampling objective')

has_specified_input some 'material sampling feature'
has_specified_output some 'material sample'
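The three logical definitions above share one pattern: a parent class, a planned objective, a specified input, and a specified output. A minimal sketch of that pattern as plain data follows, using only the class labels given on the slide. The `ProcessDefinition` class and its helper functions are hypothetical names for illustration and are not part of BCO or any ontology library.

```python
from dataclasses import dataclass


def quote(term: str) -> str:
    """Quote multi-word labels, as in the definitions on the slide."""
    return f"'{term}'" if " " in term else term


@dataclass(frozen=True)
class ProcessDefinition:
    """Hypothetical container for one process class from the slide."""
    label: str
    parent: str
    objective: str
    specified_input: str
    specified_output: str

    def logical_definition(self) -> str:
        """Render the class expression in the style used on the slide."""
        return (
            f"{quote(self.parent)} and "
            f"(achieves_planned_objective some {quote(self.objective)})"
        )


# Values copied from the slide; nothing here is inferred beyond it.
DEFINITIONS = [
    ProcessDefinition(
        "sampling process", "assay",
        "biological feature identification objective",
        "sampling feature", "sample data item",
    ),
    ProcessDefinition(
        "specimen collection process", "planned process",
        "specimen collection objective",
        "material entity", "specimen",
    ),
    ProcessDefinition(
        "material sampling process", "planned process",
        "material sampling objective",
        "material sampling feature", "material sample",
    ),
]


def output_of(label: str) -> str:
    """Look up the specified output for a process label."""
    for definition in DEFINITIONS:
        if definition.label == label:
            return definition.specified_output
    raise KeyError(label)


for d in DEFINITIONS:
    print(f"{d.label}: {d.logical_definition()}")
```

Laid out this way, the contrast between the three processes is easy to see: the sampling process yields a data item, while the other two yield material things (a specimen or a material sample).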

Page 15: Complex Data Modeling for Simpler Data Access

ROB

BCO Taxonomic Inventory Process Class and Sub-classes of different kinds of processes

Page 16: Complex Data Modeling for Simpler Data Access
Page 17: Complex Data Modeling for Simpler Data Access
Page 18: Complex Data Modeling for Simpler Data Access
Page 19: Complex Data Modeling for Simpler Data Access
Page 20: Complex Data Modeling for Simpler Data Access

Conclusions

• BCO occupies the middle ground between the high-level OBO-E world view and the flat representation of a process with a single output, which lets us represent many different kinds of content.

• BCO can serve as a sandbox to test out new models and terms for describing sampling processes and data, to inform standards like DwC.

Page 21: Complex Data Modeling for Simpler Data Access

Acknowledgments

• Dozens of participants at BCO workshops and hackathons over the past two years

• NSF-EAGER: An Interoperable Information Infrastructure for Biodiversity Research (I3BR)

• NSF: Research Coordination Network for GSC (RCN4GSC)

• VertNet and University of Kansas Biodiversity Institute

