Post on 08-Jan-2018
Biological nomenclature in the postgenomic era: Biological and computational issues
George Garrity and Catherine Lyons
Bergey’s Manual Trust and Explicatrix, LLC
Imagine...
• A clinical microbiologist’s predicament
• The microbial ecologist’s dilemma
• The case of Francisella novicida
• The history of the Alteromonadaceae
  – Genus described in 1972
    • 15 emendations, 20 species
  – 19 moved to four genera
  – 5 synonyms, two subspecies
  – 64 names, five genera, three families, two classes
• The common thread in all these stories…
Stan Falkow’s Underwear
“Given a choice, most taxonomists would rather wear each other’s underwear than use each other’s names”
Why is this so?
My objective
• Share some insights on problems in three areas
  – Nomenclature and taxonomy
  – Publishing taxonomic information
  – A generalized taxonomic model
    • Finite state machine
    • Simple grammar
  – Global issues
    • Data equivalence
    • Data provenance
    • Data curation
Problems in nomenclature
• Systematic biologists
  – Marking territory
  – Personal achievement
• Other biologists
  – End-users
    • Unfamiliar with literature
  – Unique aspects
    • Unaware of Codes of Nomenclature
      – Legalistic framework
        » Formation and assignment of names
        » Circumscription and emendation of taxa
        » Priority and citation
        » Synonymy and homonymy
        » Correction of orthographic errors
        » Adjudication of nomenclatural disputes
      – But
        » Do not govern classification or identification
Problems in nomenclature (cont.)
  – Biological names
    • Primary entry point into STM literature
    • Prominent role in laws/regulations
      – Commerce, public safety, public health
    • Primary entry point into scientific databases
    • Poor identifiers
      – Fixed in time and scope
      – May not be revised
      – Synonymies generally not addressed
      – Persist, but
        » obsolesce in relation to taxon
        » An archival record of a taxonomic definition for a single point in time
The name/taxon disjunction
• Impact
  – Accumulation of dubious names in literature/databases
  – Affects assertions of:
    • Identity, commonality of pathways, common ancestry, homology, paralogy, xenology
    • Legal consequences
Problems in print publishing
• Key requirement
  – Proposals and emendations must appear in print
    • Code specific
      – Prokaryotic Code
        » Effective, legitimate, and valid
        » Registration
• Taxonomies are retrospective
  – Can only cite earlier publications
  – Cannot cite future emendations
  – Increasingly based on molecular sequence data
    • Deposit of sequence data in public databases
      – Not conveniently referenced in print
Problems with electronic publishing
• No formal publishing mechanisms
  – Does not fulfill fundamental requirement of the Code(s)
  – Lack bibliographic information
    • Not citable
    • Not persistent
      – Subject to uncontrolled change
      – May disappear
        • Link rot
          – 404 Link not found
A brief glimpse at where we’re headed
• The Bergamot/N4L model
  – Separates names from taxa
    • Taxa nameless
      – Uniquely, persistently identified
  – Supports multiple, overlapping taxonomies
    • Accumulation of new data vs. new methodologies
    • Rank agnostic
  – Unique from all other approaches
    • An identifier resolution service, not an information space in which to practice taxonomy.
  – Names provide an entry point into the literature
    • Reliably
    • Persistently
• A lightweight information layer
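The idea of separating names from taxa can be illustrated with a small sketch. This is not the Bergamot/N4L implementation; the `Resolver` class, its method names, the identifier format (`taxon:0001`), and the placeholder names are all hypothetical. It only shows how a name can act as an entry point that resolves to a persistently identified, nameless taxon, which in turn points at whatever name is current.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Resolver:
    """Toy name-to-taxon resolver; every identifier here is hypothetical."""

    # name string -> persistent taxon identifiers the name has been attached to
    name_index: dict[str, set[str]] = field(default_factory=dict)
    # persistent taxon identifier -> name currently attached to that taxon
    current_name: dict[str, str] = field(default_factory=dict)

    def register(self, taxon_id: str, name: str, current: bool = True) -> None:
        """Attach a name to a taxon; the taxon identifier itself never changes."""
        self.name_index.setdefault(name, set()).add(taxon_id)
        # The first name seen for a taxon is treated as current until replaced.
        if current or taxon_id not in self.current_name:
            self.current_name[taxon_id] = name

    def resolve(self, name: str) -> list[tuple[str, str]]:
        """Return (taxon_id, current_name) pairs for a name, sorted by identifier."""
        ids = sorted(self.name_index.get(name, set()))
        return [(taxon_id, self.current_name[taxon_id]) for taxon_id in ids]


resolver = Resolver()
resolver.register("taxon:0001", "Aus oldus", current=False)  # placeholder older name
resolver.register("taxon:0001", "Bus newus")                 # placeholder current name
print(resolver.resolve("Aus oldus"))
```

Because a name maps to a set of identifiers, the same string can point at more than one taxon, which is one way to accommodate overlapping taxonomies without choosing between them.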
A simple grammar
species -> current.name.pointer, exemplar.deposit.pointer+, sequence.deposit.pointer+
taxon -> current.name.pointer, nomos.defined.data, (taxon+|species+)
nomos.defined.data -> (sequence|phenotypic.feature|text)+
name -> (citation, bibliographic.record, name.status)
exemplar -> exemplar.id, source
sequence -> gene, sequence.deposit
source -> exemplar|exemplar.deposit|text
exemplar.deposit -> brc.id.pointer, deposit.id.pointer, source
sequence.deposit -> brc.id.pointer, deposit.id.pointer, source
phenotypic.feature -> feature.name, feature.value, deposit.id.pointer
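A few of the grammar's productions can be sketched as data types to show how the "+" (one or more) and "|" (alternative) constraints might be enforced. This is a minimal sketch under stated assumptions, not the authors' schema: the class and field names are invented for illustration, pointers are modeled as direct object references, `source` is reduced to text, and `nomos.defined.data` is reduced to a list of text items.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass
class Name:
    # name -> (citation, bibliographic.record, name.status)
    citation: str
    bibliographic_record: str
    name_status: str


@dataclass
class Deposit:
    # exemplar.deposit / sequence.deposit ->
    #     brc.id.pointer, deposit.id.pointer, source
    brc_id: str
    deposit_id: str
    source: str  # simplification: the grammar also allows an exemplar or a deposit


@dataclass
class Species:
    # species -> current.name.pointer, exemplar.deposit.pointer+,
    #            sequence.deposit.pointer+
    current_name: Name
    exemplar_deposits: list[Deposit]
    sequence_deposits: list[Deposit]

    def __post_init__(self) -> None:
        # "+" in the grammar means one or more.
        if not self.exemplar_deposits or not self.sequence_deposits:
            raise ValueError("a species needs at least one exemplar and one sequence deposit")


@dataclass
class Taxon:
    # taxon -> current.name.pointer, nomos.defined.data, (taxon+ | species+)
    current_name: Name
    defined_data: list[str]  # simplification: text items only
    members: list[Union[Taxon, Species]]

    def __post_init__(self) -> None:
        if not self.defined_data:
            raise ValueError("nomos.defined.data must contain at least one item")
        if not self.members:
            raise ValueError("a taxon needs at least one member")
        # The alternation means members are all taxa or all species, never mixed.
        if len({type(member) for member in self.members}) != 1:
            raise ValueError("members must be all taxa or all species")


def count_species(taxon: Taxon) -> int:
    """Count the species reachable from a taxon, descending through nested taxa."""
    total = 0
    for member in taxon.members:
        total += 1 if isinstance(member, Species) else count_species(member)
    return total
```

Note that `Taxon` carries no rank field, which mirrors the rank-agnostic character of the grammar: nesting alone expresses the hierarchy.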
[Figure: diagram of the taxonomic model, built up over several slides. Labels: Taxon; Species+; Name+; Exemplar+; Sequence+; Practitioner+; Literature; Governing bodies; GenBank, DDBJ, EMBL, others; Collections, BRC; genotypic, “omics”; phenotypic; direct, indirect; Proposal, STM, Legal; Databases; Priority, Validity, Synonymy, Exemplar req.; BRC, Public, Private, General]
A properly formed species: Species+, Name+, Exemplar+, Sequence+
Candidatus or exemplar lost: Species+, Name+, Sequence+
Environmental sequence: Sequence+
Old type strain, not yet sequenced: Species+, Name+, Exemplar+
Old type, exemplar based on drawing or description: Species+, Name+
Misidentified taxon: “Name”+, Sequence+, Exemplar*
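The cases above differ mainly in which of the three components (name, exemplar, sequence) are present, so they can be sketched as a simple lookup on those three flags. The function name and signature are hypothetical. A misidentified taxon cannot be detected from presence alone, so it is deliberately left out, and any combination not listed on the slides is reported as unclassified.

```python
def classify(has_name: bool, has_exemplar: bool, has_sequence: bool) -> str:
    """Map the presence of name, exemplar, and sequence to a case from the slides."""
    if has_name and has_exemplar and has_sequence:
        return "properly formed species"
    if has_name and has_sequence:
        return "Candidatus or exemplar lost"
    if has_name and has_exemplar:
        return "old type strain, not yet sequenced"
    if has_name:
        return "old type, exemplar based on drawing or description"
    if has_sequence and not has_exemplar:
        return "environmental sequence"
    # Combinations the slides do not cover, e.g. an unnamed exemplar.
    return "unclassified"


print(classify(has_name=True, has_exemplar=False, has_sequence=True))
```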
[Figure: the model diagram, labeled N4L/Bergamot. Labels: Taxon; Species+; Name+; Exemplar+; Sequence+; Literature; Governing bodies; GenBank, DDBJ, EMBL, others; Collections, BRC]
A bit of background information
• Bergey’s Manual Trust
  – Principal information source
    • Bergey’s Manual of Determinative Bacteriology
    • Bergey’s Manual of Systematic Bacteriology
    • Taxonomic Outline of the Procaryotes
  – Expertise in content packaging/delivery
    • SGML/XML publishing
      – The Systematics
        » XML compliant SGML instance
      – The Outline
        » An experiment in SGML/XML publishing
      – Derivative projects
        » Bergamot/N4L
        » The Determinative