The challenge of biodiversity: Plot, organism and taxonomic databases
Robert K. Peet, University of North Carolina
The National Plots Database Committee
John Harris, NCEAS
A case study: VegBank - The ESA Vegetation Plot Archive
Project supported by:
• National Center for Ecological Analysis & Synthesis
• U.S. National Science Foundation
• USGS-BRD Gap Analysis Program
• ABI / The Nature Conservancy
Project organized and directed by:
• Robert K. Peet, University of North Carolina
• Marilyn Walker, USDA Forest Service & U. Alaska
• Dennis Grossman, The Nature Conservancy / ABI
• Michael Jennings, USGS-BRD & UCSB
Biodiversity data structure (diagram). Elements shown: Observation/Collection Event; Object or specimen; Taxon; Locality.
Database types shown: Taxonomic databases; Plot/Inventory databases; Specimen databases.
Information flow in the US National Vegetation Classification (diagram). Elements shown: Raw Plot Data; VegBank; Taxonomic Database; Veg Classification Database; Proposal; Web interface; Vegetation/Biodiversity.
Taxonomic database challenge
The problem:
Integration of data potentially representing different times, places, investigators and taxonomic standards
The traditional solution: A standard list of kinds of organisms.
There exist numerous compilations of organism names.
For example:
• Species 2000 http://www.sp2000.org/default.html (Composed of 18 participant databases)
• All Species http://www.all-species.org
• ITIS http://www.itis.usda.gov/ (The US government standard list, plus Canada & Mexico)
• Index to organism names http://www.biosis.org.uk/triton/indexfm.htm
Taxon-specific standard lists are available.
Representative examples for higher plants include:
North America / US
• USDA Plants http://plants.usda.gov/
• ITIS http://www.itis.usda.gov/
• NatureServe http://www.natureserve.org
World
• IPNI International Plant Names Index http://www.ipni.org/
• IOPI Global Plant Checklist http://www.bgbm.fu-berlin.de/IOPI/GPC/
Most standardized plant lists fail to allow effective integration of datasets.
The reasons include:
• The user cannot reconstruct the database as viewed at an arbitrary time in the past,
• Taxonomic concepts are not defined (just lists),
• Multiple party perspectives on taxonomic concepts and names cannot be supported or reconciled.
Current standards
• Biological organisms are named following international rules of nomenclature.
• Database standards are being developed by TDWG, GBIF, IOPI, etc.
• Metadata standards have been developed. For example, the Darwin Core is a profile describing the minimum set of standards for search and retrieval of natural history collections and observation databases. (http://tsadev.speciesanalyst.net/DarwinCore/)
Three concepts of shagbark hickory
• sec. Gleason 1952: Carya ovata (Miller) K. Koch
• sec. Radford et al. 1968: Carya carolinae-sept. (Ashe) Engler & Graebner
• sec. Radford et al. 1968: Carya ovata (Miller) K. Koch
Splitting one species into two illustrates the ambiguity often associated with scientific names. If you encounter the name “Carya ovata (Miller) K. Koch” in a database, you cannot be sure which of two meanings applies.
Multiple concepts of Rhynchospora plumosa s.l. (diagram). Treatments compared: Elliot 1816; Gray 1834; Chapman 1860; Kral 1998; Peet 2002?. Taxa shown across the treatments: R. plumosa; R. plumosa v. plumosa; R. plumosa v. intermedia; R. plumosa v. interrupta; R. plumosa v. pineticola; R. intermedia; R. pineticola; R. sp. 1.
Name, Reference, Assertion
An assertion represents a unique combination of a name and a reference
“Assertion” is equivalent to “Potential taxon” & “taxonomic concept”
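The name-plus-reference idea can be sketched in a few lines of Python. This is only an illustration, not the VegBank schema: the class name `Assertion` and its `label` method are assumptions made for the example, and the two references are the ones used in the shagbark hickory slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """A taxon concept: one name as used in one reference (illustrative only)."""
    name: str
    reference: str

    def label(self) -> str:
        # "sec." (secundum) reads as "in the sense of"
        return f"{self.name} sec. {self.reference}"


gleason = Assertion("Carya ovata", "Gleason 1952")
radford = Assertion("Carya ovata", "Radford et al. 1968")

# Same name, different references: two distinct assertions.
same_name = gleason.name == radford.name
distinct = gleason != radford
print(gleason.label(), "|", radford.label())
```

Because the pair is the identity, a database that stores only the name string cannot tell these two records apart.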
Six shagbark hickory assertions
Names
• Carya ovata
• Carya carolinae-septentrionalis
• Carya ovata v. australis
References
• Gleason 1952 Britton & Brown
• Radford et al. 1968 Flora Carolinas
• Stone 1997 Flora North America
Assertions (possible taxonomic synonyms are listed together)
• (One shagbark) C. ovata sec Gleason ’52; C. ovata (sl) sec FNA ’97
• (Southern shagbark) C. carolinae-s. sec Radford ’68; C. ovata v. australis sec FNA ’97
• (Northern shagbark) C. ovata sec Radford ’68; C. ovata (v. ovata) sec FNA ’97
Name, Assertion, Usage
A usage represents a unique combination of an assertion and a name.
Usages can be used to track nomenclatural synonyms
Names
1. Carya ovata
2. C. carolinae
3. C. ovata var. australis
Assertions
A. ovata sec. Gleason
B. ovata sl sec. FNA
C. carolinae sec. Radford
D. ovata australis sec. FNA
E. ovata sec. Radford
F. ovata ovata sec. FNA
ITIS Usage
• 1-F OK
• 2-D OK
• 3-D Syn
ITIS views the linkage of the assertion “Carya ovata var. australis sec. FNA 1997” with the name “Carya ovata var. australis” as a nomenclatural synonym.
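The ITIS usage table above can be sketched as a small lookup. The `Usage` class and `names_for` helper are hypothetical names for this example, and reading "OK" as the accepted name and "Syn" as a synonym is an assumption about the slide's abbreviations.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Usage:
    """One party's link between a name and an assertion (illustrative only)."""
    name: str
    assertion: str
    status: str  # "OK" or "Syn", as abbreviated on the slide


NAMES = {1: "Carya ovata", 2: "C. carolinae", 3: "C. ovata var. australis"}
ASSERTIONS = {"D": "ovata australis sec. FNA", "F": "ovata ovata sec. FNA"}

# The three ITIS usages from the slide: 1-F OK, 2-D OK, 3-D Syn.
ITIS_USAGES = [
    Usage(NAMES[1], ASSERTIONS["F"], "OK"),
    Usage(NAMES[2], ASSERTIONS["D"], "OK"),
    Usage(NAMES[3], ASSERTIONS["D"], "Syn"),
]


def names_for(assertion: str, usages: List[Usage]) -> Tuple[List[str], List[str]]:
    """Return (accepted names, synonyms) that a party attaches to an assertion."""
    accepted = [u.name for u in usages if u.assertion == assertion and u.status == "OK"]
    synonyms = [u.name for u in usages if u.assertion == assertion and u.status == "Syn"]
    return accepted, synonyms


accepted, synonyms = names_for(ASSERTIONS["D"], ITIS_USAGES)
print(accepted, synonyms)
```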
A usage (name assignment) and an assertion (taxon concept) can be combined in a single model.
Diagram elements: Name; Usage; Assertion; Reference; Party Perspective.
The Party Perspective on an Assertion includes:
• Status – Standard, Nonstandard, Undetermined.
• Correlation with other assertions – Equal, Greater, Lesser, Overlap, Undetermined.
• Lineage – Predecessor and Successor assertions.
• Start & Stop dates.
Parties: ITIS; FNA Committee; ABI
Assertions:
• Carya ovata sec Gleason 1952
• Carya ovata (sl) sec FNA 1997
• Carya ovata sec Radford 1968
• Carya carolinae sec Radford 1968
• Carya ovata (ovata) sec FNA 1997
• Carya ovata australis sec FNA 1997
Party perspectives (Status):
Party   Assertion           Status   Start   Name
ITIS    ovata – G52         NS       1996
ITIS    ovata – R68         St       1996    ovata
ITIS    carolinae – R68     St       1996    carolinae
ITIS    carolinae – R68     NS       2000
ITIS    ovata aust – FNA    St       2000    carolinae
ITIS    ovata – R68         NS       2000
ITIS    ovata ovata – FNA   St       2000    ovata
VegBank taxonomic data model
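Dated party perspectives are what make it possible to reconstruct a party's view at an arbitrary time in the past. The sketch below uses the ITIS rows from the status slide. The class and function names are hypothetical, "St" and "NS" are read as Standard and Nonstandard, and the rule that a later start date supersedes an earlier row for the same assertion is an assumption, since the slide shows start dates only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Perspective:
    party: str
    assertion: str
    status: str              # "St" (Standard) or "NS" (Nonstandard), assumed reading
    start: int               # year the perspective took effect
    name: Optional[str] = None


ITIS = [
    Perspective("ITIS", "ovata - G52", "NS", 1996),
    Perspective("ITIS", "ovata - R68", "St", 1996, "ovata"),
    Perspective("ITIS", "carolinae - R68", "St", 1996, "carolinae"),
    Perspective("ITIS", "carolinae - R68", "NS", 2000),
    Perspective("ITIS", "ovata aust - FNA", "St", 2000, "carolinae"),
    Perspective("ITIS", "ovata - R68", "NS", 2000),
    Perspective("ITIS", "ovata ovata - FNA", "St", 2000, "ovata"),
]


def view_as_of(records: List[Perspective], party: str, year: int) -> Dict[str, str]:
    """Return {assertion: status} as the party saw it in the given year."""
    current: Dict[str, Perspective] = {}
    for rec in sorted(records, key=lambda r: r.start):
        if rec.party == party and rec.start <= year:
            current[rec.assertion] = rec  # assumed: later rows supersede earlier ones
    return {assertion: rec.status for assertion, rec in current.items()}


def standard_in(records: List[Perspective], party: str, year: int) -> List[str]:
    """List the assertions the party treated as standard in the given year."""
    view = view_as_of(records, party, year)
    return sorted(a for a, status in view.items() if status == "St")


standard_1998 = standard_in(ITIS, "ITIS", 1998)
standard_2001 = standard_in(ITIS, "ITIS", 2001)
print(standard_1998, standard_2001)
```

Under these assumptions the 1998 view accepts the two Radford concepts, while the 2001 view accepts the two FNA concepts in their place.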
Concept-based taxonomy is coming!
• All organisms/specimens in databases should be identified by linkage to an assertion = name and reference!
• Various standards are being developed by FGDC, TDWG, IOPI, GBIF, etc.
• Most major databases are working toward inclusion of assertions (e.g. ITIS, IOPI, HDMS).
• Until standard assertion lists are available, databases that track organisms should include couplets containing both a scientific name and a reference.
(Inter)National Taxonomic Database?
• Concept-based
• Party-neutral
• Synonymy and lineage tracking
• Perfectly archived
An upgrade for ITIS & Species 2000?
Specimen/object databases
Information on specimens/objects should be tracked by reference to
• Place (place or collection)
• Unique identifier (accession number)
• Time
A museum is a place
Annotation should be by assertion (concept)!
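As a rough illustration of these tracking rules, a specimen record might be keyed by place and accession number and carry its annotation as a name-and-reference pair. Everything in this sketch is hypothetical: the record layout, the `register` helper, and the example herbarium, accession number and date are invented for illustration.

```python
from datetime import date
from typing import Dict, NamedTuple, Tuple


class SpecimenRecord(NamedTuple):
    place: str                   # museum or collection holding the object
    accession: str               # unique identifier within that place
    collected: date              # time of the collection event
    annotation: Tuple[str, str]  # assertion: (name, reference)


def register(catalog: Dict[Tuple[str, str], SpecimenRecord],
             rec: SpecimenRecord) -> None:
    """Store a record under (place, accession), refusing duplicates."""
    key = (rec.place, rec.accession)
    if key in catalog:
        raise ValueError(f"duplicate specimen key: {key}")
    catalog[key] = rec


catalog: Dict[Tuple[str, str], SpecimenRecord] = {}
sheet = SpecimenRecord(
    place="Example Herbarium",     # hypothetical
    accession="EX-0001",           # hypothetical
    collected=date(1968, 5, 14),   # hypothetical
    annotation=("Carya ovata", "Radford et al. 1968"),
)
register(catalog, sheet)
```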
Database systems for tracking specimens
The following are a few of the many available
• BioLink http://www.ento.csiro.au/biolink/index.html
• Specify http://usobi.org/specify/default.htm
• Biota http://viceroy.eeb.uconn.edu/Biota
• Taxis http://taxis.virtualave.net/
TDWG maintains links to multiple software systems
http://www.bgbm.fu-berlin.de/TDWG/acc/Software.htm
Plots Database Systems
Several plot database systems are available. Among the best known and most widely used are:
• TurboVeg http://www.alterra.nl/onderzoek/producten/websites/turboveg/ (Over 1,000,000 plots stored using TurboVeg)
• Plots (ABI NPS Mapping Project)
A vegetation plot archive?
There is currently no standard repository for plot data.
A repository is needed for:
• Plot storage
• Plot access and identification
• Plot documentation in literature/databases
This would be equivalent to GenBank for vegetation science
Core elements of VegBank (diagram): Project; Plot; Plot Observation; Taxon Observation; Taxon Interpretation; Plot Interpretation.
Support multiple interpretations of which concept applies to an organism or community.
Various observers will associate different taxonomic concepts with records in a database.
Provision must be made for inclusion of these taxonomic interpretations.
Minimal attributes include:
• Concept applied
• Date applied
• Who made the interpretation
• Links to supporting information
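The four minimal attributes map naturally onto a small record type. The sketch below is an assumption about how such a record could look, not VegBank's actual table. The field names, the `latest` helper, the observers, dates and evidence notes are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class TaxonInterpretation:
    concept: str       # concept applied, written as "name sec. reference"
    applied_on: date   # date applied
    interpreter: str   # who made the interpretation
    evidence: str      # link or note pointing to supporting information


def latest(interpretations: List[TaxonInterpretation]) -> TaxonInterpretation:
    """Return the most recently applied interpretation without discarding the rest."""
    return max(interpretations, key=lambda i: i.applied_on)


# Hypothetical history for one taxon observation.
history = [
    TaxonInterpretation("Carya ovata sec. Gleason 1952",
                        date(1990, 6, 1), "Observer A", "field notes"),
    TaxonInterpretation("Carya ovata sec. Radford et al. 1968",
                        date(2001, 3, 15), "Observer B", "voucher re-examined"),
]
current = latest(history)
print(current.concept)
```

Keeping the whole list rather than overwriting the identification is what allows several interpretations of the same record to coexist.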
Interface tools
• Desktop client for data preparation and local use.
• Loaders for legacy data.
• Flexible data import.
• Tools for linking to taxonomic and community concepts.
• Standard query, flexible query, SQL query.
• Flexible data export.
• Local data refresh.
• Easy web access with consistent interface.
Conclusions for database designers
1. Records of organisms should always contain (or point to) couplets consisting of a scientific name and a reference where the name was used.
2. Design for future annotation of organism concepts.
3. Track specimens/objects by location, unique identifier & time.
4. Design for reobservation. Separate permanent from transient attributes.
5. Archival databases should provide multiple or continuous time-specific views.
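Conclusion 4 can be sketched by keeping a plot's permanent attributes apart from what is recorded on each visit. Which attributes count as permanent is an assumption here (location is treated as permanent, the species list as transient), and the class names, plot identifier, coordinates and dates are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Observation:
    """Transient attributes: what was seen on one visit."""
    observed_on: date
    taxa: Tuple[str, ...]


@dataclass
class Plot:
    """Permanent attributes, plus the list of visits made to the plot."""
    plot_id: str
    latitude: float
    longitude: float
    observations: List[Observation] = field(default_factory=list)


def reobserve(plot: Plot, observed_on: date, taxa: List[str]) -> Observation:
    """Add a new visit without touching the plot's permanent attributes."""
    obs = Observation(observed_on, tuple(taxa))
    plot.observations.append(obs)
    return obs


plot = Plot("PLOT-001", 35.9, -79.0)  # hypothetical plot
reobserve(plot, date(1990, 6, 1), ["Carya ovata sec. Gleason 1952"])
reobserve(plot, date(2001, 6, 1), ["Carya ovata sec. Radford et al. 1968"])
print(len(plot.observations))
```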