uBio presentation to UMLS group of NLM / NIH

Universal Biological Indexer and Organizer Research Funded by the Andrew W. Mellon Foundation MBL / WHOI LIBRARY

Upload: david-remsen

Post on 13-Apr-2017

TRANSCRIPT

Page 1: uBio presentation to UMLS group of NLM / NIH

Universal Biological Indexer and Organizer

Research Funded by the Andrew W. Mellon Foundation

MBL / WHOI LIBRARY

Page 2: uBio presentation to UMLS group of NLM / NIH

Newt: as concept

• Triturus viridescens Rafinesque 1820
• String
• a single specimen
• Nomenclatural concept

viridis - to become green

Page 3: uBio presentation to UMLS group of NLM / NIH

Concepts: Nomenclatural

• Triturus viridescens Rafinesque 1820
• Notopthalmus viridescens Baird 1850
• Notophthalmus viridescens Gray 1850
• Notophthalma viridescens Gray 1858 msp.
• Diemyctylus viridescens Hallowell 1856
• Triton viridescens Strauch, 1870
• Molge viridescens Boulanger, 1872
• Diemyctylus minatus viridescens Yarrow
• …

Common origin in a single real specimen (homotypic)

Creation of the new nomen concept is subjective

Relationship among them is not

Page 4: uBio presentation to UMLS group of NLM / NIH

Concepts: Nomenclatural

• Triturus viridescens dorsalis - Bishop, 1943
• Diemyctylus viridescens dorsalis Schmidt 1953
• Notophthalmus viridescens dorsalis - Smith, 1953

• Triturus viridescens louisianae - Strecker 1928
• Triturus viridescens louisianensis - Bishop, 1943
• Diemyctylus viridescens louisianensis Schmidt 1953
• Notophthalmus viridescens louisianensis - Smith, 1953

Page 5: uBio presentation to UMLS group of NLM / NIH

Concepts: Taxonomic

Frost 2005 AMNH:

• Notophthalmus viridescens (valid name)
  • Triturus viridescens
  • Notopthalmus viridescens
  • Notophthalmus viridescens
  • Notophthalma viridescens
  • Diemyctylus viridescens
  • Triton viridescens
  • Molge viridescens
  • Diemyctylus minatus viridescens
  • Triturus viridescens dorsalis
  • Diemyctylus viridescens dorsalis
  • Notophthalmus viridescens dorsalis
  • … 24 others

Dolbe 2004:

• Notophthalmus viridescens viridescens
  • Triturus viridescens
  • Notopthalmus viridescens
  • Notophthalmus viridescens
  • Notophthalma viridescens
  • Diemyctylus viridescens
  • Triton viridescens
  • Molge viridescens
• Notophthalmus viridescens dorsalis
  • Triturus viridescens dorsalis
  • Diemyctylus viridescens dorsalis
• Notophthalmus viridescens louisianensis

Expert interpretation of the original specimens
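
The contrast between the two sources can be sketched in a few lines of Python. This is only an illustration: the member lists are abbreviated from the slide, and the dictionary layout and the `resolve` helper are assumptions made for the example, not uBio's data model. It shows that the same name string resolves to a different valid name depending on whose taxon concept is consulted.

```python
# Two taxon concepts for the same names, keyed by the source that published them.
# The member lists are abbreviated from the slide; the dictionary layout is an
# illustrative assumption, not uBio's data model.
TAXON_CONCEPTS = {
    "Frost 2005 AMNH": {
        "Notophthalmus viridescens": [
            "Triturus viridescens",
            "Diemyctylus viridescens",
            "Triturus viridescens dorsalis",
            "Notophthalmus viridescens dorsalis",
        ],
    },
    "Dolbe 2004": {
        "Notophthalmus viridescens viridescens": [
            "Triturus viridescens",
            "Diemyctylus viridescens",
        ],
        "Notophthalmus viridescens dorsalis": [
            "Triturus viridescens dorsalis",
            "Diemyctylus viridescens dorsalis",
        ],
    },
}


def resolve(name):
    """Return {source: valid name} for every taxon concept that contains `name`."""
    hits = {}
    for source, concepts in TAXON_CONCEPTS.items():
        for valid_name, members in concepts.items():
            if name == valid_name or name in members:
                hits[source] = valid_name
    return hits


for source, valid in resolve("Triturus viridescens dorsalis").items():
    print(source, "->", valid)
```

Under the first source the subspecies name is absorbed into the species; under the second it maps to a separate subspecies. The name string itself is identical in both cases, which is why the opinion has to be stored apart from the name.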

Page 6: uBio presentation to UMLS group of NLM / NIH

Concepts: Taxonomic

Frost 2005 AMNH:

• Amphibia
• Urodela
• Salamandridae
• Notophthalmus
• Notophthalmus viridescens

NCBI 2005:

• Amphibia
• Batrachia
• Caudata
• Salamandroidea
• Salamandridae
• Notophthalmus
• Notophthalmus viridescens
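
A small Python sketch makes the difference between the two lineages explicit. The list values are taken from the slide; the `compare_lineages` helper is a hypothetical name introduced only for this illustration.

```python
# Lineages as listed on the slide, ordered from highest to lowest rank.
FROST_2005 = [
    "Amphibia", "Urodela", "Salamandridae",
    "Notophthalmus", "Notophthalmus viridescens",
]
NCBI_2005 = [
    "Amphibia", "Batrachia", "Caudata", "Salamandroidea", "Salamandridae",
    "Notophthalmus", "Notophthalmus viridescens",
]


def compare_lineages(a, b):
    """Return (shared taxa, taxa only in a, taxa only in b), preserving order."""
    shared = [taxon for taxon in a if taxon in b]
    only_a = [taxon for taxon in a if taxon not in b]
    only_b = [taxon for taxon in b if taxon not in a]
    return shared, only_a, only_b


shared, only_frost, only_ncbi = compare_lineages(FROST_2005, NCBI_2005)
print("shared:    ", shared)
print("Frost only:", only_frost)
print("NCBI only: ", only_ncbi)
```

Both classifications agree from the family downward and disagree only on the intermediate ranks, so a system keyed on names can serve both without choosing between them.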

Page 7: uBio presentation to UMLS group of NLM / NIH

Concepts: Summary

Nomenclatural Concepts

• Factual
• Interrelationships are objective
• No new science required
  • (except to make new ones)
• Stable
• Expert scrutiny useful, not required
• Compilation potentially FAST
  • uBio 1 million/year
• share (no opinion attached)

Taxonomic Concepts

• Opinion
• Interrelationships are subjective
• Derived from nomenclatural concepts
• Expert scrutiny is required
• Unstable
• Compilation slow
  • CoL 50K / year
  • Diptera 200K/15 years
• sharing concerns - opinions attached

Page 8: uBio presentation to UMLS group of NLM / NIH

Why this is a problem

Page 9: uBio presentation to UMLS group of NLM / NIH

Don’t forget common names

Page 10: uBio presentation to UMLS group of NLM / NIH

And additionally…

5-10% of scientific names become invalid per decade

Scientific names aren’t unique

Acalyptus

Page 11: uBio presentation to UMLS group of NLM / NIH

Names Challenges within PubMed

476 unique

Name (Nomenclatural Synonyms)   PMID   Date   Unique
Notophthalmus viridescens        350   1965      349
Diemictylus viridescens           36   1959       36
Triturus viridescens              87   1949       86
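
Because each synonym retrieves a largely separate set of records, a search has to cover every name at once. The sketch below builds one boolean query from the three synonyms in the table. The `or_query` helper is hypothetical, and the quoted-phrase-plus-`OR` form is generic boolean search syntax rather than anything PubMed-specific; field tags are deliberately left out.

```python
def or_query(names):
    """Join name strings into one boolean OR query of quoted phrases."""
    cleaned = [n.strip() for n in names if n.strip()]
    unique = list(dict.fromkeys(cleaned))  # drop duplicates, keep first-seen order
    return " OR ".join('"{}"'.format(n) for n in unique)


# Nomenclatural synonyms from the table above.
SYNONYMS = [
    "Notophthalmus viridescens",
    "Diemictylus viridescens",
    "Triturus viridescens",
]

print(or_query(SYNONYMS))
```

The list of synonyms is the part a name server would supply; the query construction itself is trivial once the names are known.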

Page 12: uBio presentation to UMLS group of NLM / NIH

Names Challenges within PubMed

4208 unique PMID

Name (Taxonomic Synonyms)               Total   Unique       %
Brucella melitensis                      1078      840   78.1%
Brucella abortus (Bacterium abortus)     3109     2852   91.7%
Brucella canis                            178      146   82.0%
Brucella neotomae                          12        4   33.3%
Brucella ovis                             233      168   84.9%
Brucella suis                             286      198   69.2%

Brucella melitensis DSMZ 2005

Page 13: uBio presentation to UMLS group of NLM / NIH

How big is the problem?

• Not sure
• No comprehensive listing of names

• 1.75M valid species names
• 2-?M+ invalid names
• 2-?M+ vernacular names
• + Misspellings, lexical forms
• 14,000 avian genera

Page 14: uBio presentation to UMLS group of NLM / NIH

uBio

• Library service
• “System” must account for all names
• Any classifications
• Biological Name Server

• 2 million nomenclatural concepts
• 1.7 million taxon concepts
• (60 classifications)

• SOAP/WSDL web services

Page 15: uBio presentation to UMLS group of NLM / NIH

Page 16: uBio presentation to UMLS group of NLM / NIH

Page 17: uBio presentation to UMLS group of NLM / NIH

uBio

Page 18: uBio presentation to UMLS group of NLM / NIH

Major Impediment to progress

• Different taxon concepts/needs
• Same nomenclatural concepts
• No obj/subj distinction

• Duplication
• Interconnectivity issues

Page 19: uBio presentation to UMLS group of NLM / NIH

NameBank

• Repository for all nomen concepts
• Insulates taxonomic systems from “bad” nomen concepts
• Consensus data only
• Common to any taxon concept
• Shareable
• Distributable

Page 20: uBio presentation to UMLS group of NLM / NIH

NameBank

• NameBank is not a nomenclator, nor are nomenclators NameBank
• NameBank is an index of factually derived name concepts that includes a much broader definition of names
• It overlaps with, and is supported by, nomenclators, which should, I think, provide a service on top of NameBank
• NameBank provides an underlying unified index to systems like IF that contain authoritative nomenclatural metadata
• NameBank accommodates strings outside the scope of nomenclators

Page 21: uBio presentation to UMLS group of NLM / NIH

NameBank: UMLS Metathesaurus style

Page 22: uBio presentation to UMLS group of NLM / NIH

NameBank: Current

Page 23: uBio presentation to UMLS group of NLM / NIH

NameBank

• Repository for all nomen concepts
• Insulates taxonomic systems from “bad” nomen concepts
• Consensus data only
• Layered
• Shareable
• Distributable
• Independent compilation
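
One way to picture the layering is sketched below. It is an assumption-laden illustration, not NameBank's actual schema: the class names, the integer ids and the group labels `g1` and `g2` are all hypothetical. The shared layer holds name strings and the objective (homotypic) links between them; each classification is a separate opinion layer that only refers to name ids.

```python
from dataclasses import dataclass, field


@dataclass
class NameBank:
    """Shared layer: name strings plus the objective links between them."""
    names: dict = field(default_factory=dict)      # name id -> name string
    homotypic: dict = field(default_factory=dict)  # name id -> type-specimen group

    def add(self, name_id, string, group):
        self.names[name_id] = string
        self.homotypic[name_id] = group

    def same_type(self, a, b):
        """True when two names trace back to the same type specimen."""
        return self.homotypic[a] == self.homotypic[b]


@dataclass
class Classification:
    """Opinion layer: maps name ids to the id of the name it treats as valid."""
    source: str
    valid_for: dict = field(default_factory=dict)


bank = NameBank()
bank.add(1, "Triturus viridescens Rafinesque 1820", "g1")
bank.add(2, "Notophthalmus viridescens Gray 1850", "g1")
bank.add(3, "Triturus viridescens dorsalis - Bishop, 1943", "g2")
bank.add(4, "Notophthalmus viridescens dorsalis - Smith, 1953", "g2")

# Two opinions about name 3, stored outside the shared layer.
frost = Classification("Frost 2005 AMNH", {3: 2})
dolbe = Classification("Dolbe 2004", {3: 4})

print(bank.same_type(1, 2))
for view in (frost, dolbe):
    print(view.source, "->", bank.names[view.valid_for[3]])
```

Adding or discarding a classification never touches the name layer, which is the property the "Layered", "Shareable" and "Independent compilation" bullets describe.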

Page 24: uBio presentation to UMLS group of NLM / NIH

Share

• NameBank is a big job
• Catalog all names
• Map all factually derived relationships
• Share them for increased data access

• Proactive
• NCBI
• CBOL
• new submissions

Page 25: uBio presentation to UMLS group of NLM / NIH

Federate

• Layered architecture
• Common Foundation
• Diverse expression
• Enhanced Interchange
• Cooperation
• Efficient
