Plant Systematics databases: Users perspectives Robert K. Peet, University of North Carolina In collaboration with The National Center for Ecological Analysis & Synthesis VegBank Development Team Ecological Society of America Vegetation Panel Science Environment for Ecological Knowledge

Post on 19-Dec-2015


Plant Systematics databases: Users perspectives

Robert K. Peet, University of North Carolina

In collaboration with:

The National Center for Ecological Analysis & Synthesis

VegBank Development Team

Ecological Society of America Vegetation Panel

Science Environment for Ecological Knowledge

International Association for Vegetation Science

Biodiversity data structure

[Diagram labels. Databases: Taxonomic database, Plot/Inventory database, Occurrence database, Community type database. Entities: Plot Observation/Collection Event, Specimen or Object, Bio-Taxon, Locality, Community Type.]

New data sources & EcoInformatics

• Site data: climate, soils, topography, etc.

• Taxon attribute data: identification, phylogeny, distribution, life-history, functional attributes, etc.

• Occurrence data: attributes of individuals (e.g., size, age, growth rate) and taxa (e.g., cover, biomass) that occur or co-occur at a site.

How do we get there? Standards, tools & access

• Standard protocols.

• Standard data structures & exchange formats.

• Public data archives and databases.

• Tools for data discovery & semantic mediation.

VegBank

• The ESA Vegetation Panel is developing VegBank (www.vegbank.org) as a public vegetation plot archive.

• VegBank is expected to function for vegetation data in a manner analogous to GenBank.

• Data deposited for storage & preservation, references & documentation, access & identification, novel synthesis & reanalysis.

• Millions of co-occurrence records

SEEK-Taxon

• Tools for data discovery and semantic mediation

• Tools and standards for data markup

• Methods and tools for describing more precisely the meaning of concepts associated with organism names

• Demonstration databases that maintain mappings of taxonomic concepts

Types of systematics databases

• Comprehensive lists (compilers: e.g. IPNI, Zoo Record)

• Authoritative checklists (aggregators: e.g. ITIS/USDA, Species2000)

• Concepts and perspectives (e.g. EuroMed, VegBank)

• Taxon attributes (e.g. USDA, BioFlor, LEDA, IRIS, TreeBase)

• Specimens (distributed, various standards and protocols)

Compilations (e.g. IPNI)

• Semi-comprehensive – no registration requirement

• Duplications of names – no rectification?

• Inconsistencies between names in the list and in references (names or protonyms?)

• Web services needed for validating names

• No standard for exchange or unique identification of names or references
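The duplication problem noted above can be illustrated with a small check. This is only a sketch: the functions `normalize` and `find_duplicates` are hypothetical helpers invented here, not part of IPNI or any existing name-validation service, and real name matching would also need to handle authorship and orthographic variants.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Collapse whitespace and case so trivially different spellings compare equal."""
    return " ".join(name.split()).casefold()


def find_duplicates(names: list[str]) -> list[list[str]]:
    """Return groups of name strings that normalize to the same key."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    # Keep only keys that were reached by more than one input string.
    return [group for group in groups.values() if len(group) > 1]


# Illustrative input: three spellings of one name plus one distinct name.
names = [
    "Carya ovata",
    "Carya  ovata",
    "carya ovata",
    "Carya carolinae-septentrionalis",
]
print(find_duplicates(names))
```

Without a shared identifier for each name, every aggregator has to repeat this kind of string matching on its own.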

Standard checklists for taxa

Representative examples for North American higher plants

• USDA Plants: http://plants.usda.gov

• ITIS: http://www.itis.usda.gov

• NatureServe Biotics: http://www.natureserve.org

• BONAP: http://www.bonap.org/

• Flora North America: http://hua.huh.harvard.edu/FNA/

These are intended to be checklists wherein the taxa recognized perfectly partition all plants. Most of the lists are dynamic.

Taxonomic database challenge: Standardizing organism names

The problem: Integration of data potentially representing different times, places, investigators and taxonomic standards

The traditional solution: A standard checklist of organisms

Most taxon checklists fail to allow effective dataset integration

The reasons include:

• The user cannot reconstruct the database as viewed at an arbitrary time in the past,

• Taxonomic concepts are not defined (just lists),

• Multiple party perspectives on taxonomic concepts and names cannot be supported or reconciled.

Taxonomic theory

[Diagram: Name, Reference, Concept]

A taxon concept represents a unique combination of a name and a reference

“Taxon concept” roughly equivalent to “Potential taxon” & “assertion”

Three concepts of shagbark hickory

• Carya ovata (Miller) K. Koch sec. FNA 1997

• Carya carolinae-septentrionalis (Ashe) Engler & Graebner sec. Kartesz 1999

• Carya ovata (Miller) K. Koch sec. Kartesz 1999

Splitting one species into two illustrates the ambiguity often associated with scientific names.
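The definition of a concept as a name paired with a reference can be sketched in Python. This is an illustration only: the class `TaxonConcept` and its field names are assumptions made here, not taken from VegBank or any published schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonConcept:
    """A taxon concept: a name as circumscribed in one reference."""
    name: str        # scientific name string
    reference: str   # the "sec." (according to) reference

    def label(self) -> str:
        return f"{self.name} sec. {self.reference}"


# The three shagbark hickory concepts discussed in the slides.
ovata_fna = TaxonConcept("Carya ovata", "FNA 1997")
ovata_kartesz = TaxonConcept("Carya ovata", "Kartesz 1999")
carolinae_kartesz = TaxonConcept("Carya carolinae-septentrionalis", "Kartesz 1999")

# Same name, different reference: the names match but the concepts do not.
print(ovata_fna.name == ovata_kartesz.name)
print(ovata_fna == ovata_kartesz)
print(ovata_fna.label())
```

Equality is decided by the (name, reference) pair, so a dataset labelled only "Carya ovata" cannot be assigned to one of these concepts without knowing which reference was followed.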

[Diagram: Name, Concept, Usage]

A usage represents an association of a concept with a name.

• Usage does not appear in the IOPI model, but instead is a special case of concept

• Desirable for stability in recognized concepts when strictly nomenclatural synonyms are created.

• Usage can be used to apply multiple name systems to a concept
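A usage record can be sketched as a small link between a concept and a name. The sketch below is hypothetical: the `Usage` class, the concept identifier "c1", and the name-system labels "scientific" and "common" are all invented for illustration and are not the VegBank schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Associates one concept with one name in one name system."""
    concept_id: str    # hypothetical identifier for a concept
    name: str
    name_system: str   # illustrative values: "scientific", "common"


usages = [
    Usage("c1", "Carya ovata", "scientific"),
    Usage("c1", "shagbark hickory", "common"),
]


def names_for(concept_id: str, usages: list[Usage]) -> dict[str, list[str]]:
    """Group the names applied to a concept by name system."""
    result: dict[str, list[str]] = {}
    for usage in usages:
        if usage.concept_id == concept_id:
            result.setdefault(usage.name_system, []).append(usage.name)
    return result


print(names_for("c1", usages))
```

Because names attach through usage records, adding a nomenclatural synonym means adding a row, while the concept itself and its identifier stay unchanged.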

Six shagbark hickory concepts

Names:

• Carya ovata

• Carya carolinae-septentrionalis

• Carya ovata v. ovata

• Carya ovata v. australis

Concept groups (possible synonyms are listed together):

• One shagbark: C. ovata sec. Gleason ’52; C. ovata sec. FNA ’97

• Southern shagbark: C. carolinae-s. sec. Kartesz ’99; C. ovata v. australis sec. FNA ’97

• Northern shagbark: C. ovata sec. Kartesz ’99; C. ovata (v. ovata) sec. FNA ’97

References:

• Gleason 1952. Britton & Brown

• Kartesz 1999. Synthesis

• Stone 1997. Flora North America 3
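The six concepts and their groups can be written down as a lookup table to show why a bare name is ambiguous. This is a sketch under assumptions: the dictionary layout and the function names `groups_for_name` and `group_for_concept` are hypothetical, and "C. ovata (v. ovata)" is simplified here to the name string "Carya ovata v. ovata".

```python
# Each group lists (name, reference) pairs treated as possible synonyms.
CONCEPT_GROUPS = {
    "One shagbark": [
        ("Carya ovata", "Gleason 1952"),
        ("Carya ovata", "FNA 1997"),
    ],
    "Southern shagbark": [
        ("Carya carolinae-septentrionalis", "Kartesz 1999"),
        ("Carya ovata v. australis", "FNA 1997"),
    ],
    "Northern shagbark": [
        ("Carya ovata", "Kartesz 1999"),
        ("Carya ovata v. ovata", "FNA 1997"),  # simplified name string
    ],
}


def groups_for_name(name: str) -> list[str]:
    """All groups in which a bare name (no reference) appears."""
    return sorted(
        group
        for group, concepts in CONCEPT_GROUPS.items()
        if any(n == name for n, _ in concepts)
    )


def group_for_concept(name: str, reference: str):
    """The single group for a full concept, or None if unknown."""
    for group, concepts in CONCEPT_GROUPS.items():
        if (name, reference) in concepts:
            return group
    return None


print(groups_for_name("Carya ovata"))
print(group_for_concept("Carya ovata", "Kartesz 1999"))
```

The bare name "Carya ovata" matches two groups, while the same name with its reference resolves to exactly one.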

Data relationships: VegBank taxonomic data model

Single party, dynamic perspective

[Diagram entities: Name; Concept; Reference; Usage (Start, Stop, NameStatus, Name system); Status (Start, Stop, ConceptStatus, Level, Parent)]

Party Perspective

The Party Perspective on a concept includes:

• Status – Standard, Nonstandard, Undetermined

• Correlation with other concepts – e.g. Equal, Greater, Lesser, Overlap, Undetermined

• Start & Stop dates for tracking changes
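Start and stop dates are what make it possible to reconstruct a party's view at an arbitrary past time. The sketch below shows one way this could work; the `PartyStatus` class, the party name "Party A", and the dates are hypothetical values chosen for illustration, not VegBank data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PartyStatus:
    """One party's status for one concept over a date interval."""
    party: str
    concept: str
    status: str                   # "Standard", "Nonstandard" or "Undetermined"
    start: date
    stop: Optional[date] = None   # None means still in effect


# Hypothetical history: the party stops accepting a concept in 2000.
records = [
    PartyStatus("Party A", "Carya ovata sec. FNA 1997", "Standard",
                date(1997, 1, 1), date(2000, 1, 1)),
    PartyStatus("Party A", "Carya ovata sec. FNA 1997", "Nonstandard",
                date(2000, 1, 1)),
]


def status_on(records, party: str, concept: str, when: date) -> Optional[str]:
    """Return the status the party assigned to the concept on a given date."""
    for r in records:
        in_effect = r.start <= when and (r.stop is None or when < r.stop)
        if r.party == party and r.concept == concept and in_effect:
            return r.status
    return None  # the party had no recorded view on that date


print(status_on(records, "Party A", "Carya ovata sec. FNA 1997", date(1998, 6, 1)))
print(status_on(records, "Party A", "Carya ovata sec. FNA 1997", date(2001, 1, 1)))
```

Records are closed with a stop date and never overwritten, so earlier views stay queryable.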

Data relationships: VegBank taxonomic data model

With party correlations and lineages

[Diagram entities: Name; Concept; Party; Usage (Start, Stop, NameStatus, Name system); Status (Start, Stop, ConceptStatus, Level, Parent); Correlation; Reference]

Some core elements of VegBank

[Diagram entities: Plot Observation; Taxon Observation; Taxon Interpretation; Taxon Assignment; Taxon Concept]

Plant systematics databases: What do we need?

• What has been done?

• What is going on?

• What additional work is needed?

General data model and data exchange standard

• Numerous data models incorporate concepts. The IOPI, VegBank, and Taxonomer models are optimized for different uses.

• Jessie Kennedy, representing SEEK, GBIF, and TDWG, is seeking a consensus model to be presented in May 2004 and revised for TDWG.

• A unique opportunity to build on other efforts. Kennedy’s results will need to be reviewed prior to TDWG in October.

True concept-based checklists

• Equivalent of ITIS but with concept documentation and including how other concepts map onto the concepts accepted by the party.

• Fully archived so that the checklist can be viewed as it existed at any given time.

• Several are operative or in development, including EuroMed, IOPI-GPC, Biotics, and VegBank. Planned for ITIS/USDA.

Population of concept-based checklists

• For concept-based taxonomy to be widely adopted an initial set of accepted concepts must be identified.

• VegBank and NatureServe are collaborating to develop concepts for the 2004 revision of the Kartesz list. The concepts will be used to populate VegBank, Biotics, ITIS and USDA PLANTS.

• The IOPI Global Plant Checklist is gradually incorporating concepts.

Registration system and standard identifiers for names, references, and concepts

• Essential for data exchange.

• SEEK is in the early design stages for an identifier system and central database.

• IPNI and GBIF would be ideal parties to host a names registry.

Tools to develop and map concepts

• Taxonomists need mapping and visualization tools for relating concepts of various authors. SEEK will build prototypes for review and possible adoption.

• Aggregators need tools for mapping relationships among concepts.

• Users need tools for entering legacy concepts. Several are in development.

Publishers, curators and data managers need to tag taxon interpretations with concepts

• Precedent exists in the tagging of literature citations and GenBank accessions.

• Allen Press is linking scientific names in many e-journals to ITIS (e.g. Evolution, Ecology).

• Much work remains to be done here. SEEK is developing recommendations.

Standard protocols for recording plant traits and exchanging plant trait data

• TDWG standards.

• European ecological initiatives:

BioFlor: www.ufz.ed/bioflor/index.jsp

LEDA: www.leda-traitbase.org

IRIS: www.synbiosys.alterra.nl/IRIS/

Where are we?

• Standards, tools and databases are essential for advancement of our fields

• Much is going on

• Much needs to be done

• Resources are scarce

• Collaboration is essential

Primary differences between the VegBank model and the IOPI (Berendsohn) models

The VB model is optimized for:

• stability in accepted concepts,

• support of multiple dynamic party perspectives,

• support of multiple name systems.

The IOPI model is optimized for:

• describing taxonomic decisions represented in literature.

Core elements of the IOPI (Berendsohn) model

[Diagram entities: Name; Interpretation; Assertion; Rank; Source; Correlation; Author; Assertion Status; Reference]