uBio presentation to Jim Edwards, 2006

universal Biological indexer and organizer 1

New Dimensions in Managing Biological Information @ the MBLWHOI LIBRARY

David Remsen

June 27, 2006

universal Biological indexer and organizer 2

All accumulated information of a species is tied to a scientific name, a name that serves as a link between what has been learned in the past and what we today add to the body of knowledge.

- Grimaldi & Engel, 2005, Evolution of the Insects

universal Biological indexer and organizer 3

a name that serves as a link between what has been learned in the past

From T.E. Glover, The Fishes of Southwestern Japan, c.1870

universal Biological indexer and organizer 4

Universal Biological Indexer and Organizer

Research Funded by the Andrew W. Mellon Foundation

MBL / WHOI LIBRARY

…and what we today add to the body of knowledge.

universal Biological indexer and organizer 5

universal Biological indexer and organizer 6

The challenge of names as keywords

Finding this…

Type keyword…

With this…

universal Biological indexer and organizer 7

Names – the only universal metadata for Biology

Names offer a logical way to search for and index content

• Names annotate data objects
• All names annotate all data objects
• A compilation of all names ever used is the foundation of a universal index for biology
• or for a semantic web for biology

universal Biological indexer and organizer 8

• Many names refer to one concept
– Vernacular concept
– Lexical or Nominal synonym
– Nomenclatural synonym
– Taxonomic Synonym

• Single name refers to many concepts
– Homonyms
– Taxonomic concepts
– Vernacular concepts
– Taxonomic Groups/Classifications

The Taxonomic Names Problem in Biology

universal Biological indexer and organizer 9

Many to One: Vernacular Concepts

• Equivalence implicit through co-occurrence

universal Biological indexer and organizer 10

Many to One: Lexical Synonyms

Many to One: Nomenclatural Synonyms

universal Biological indexer and organizer 11

Retention of lexical & nomenclatural variation

Loligo pealeii

Loligo pealii

Loligo pealei

Doryteuthis pealei
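
The retention of variants shown on this slide can be sketched as a small index that keeps every name string and groups the strings rather than discarding any of them. This is a minimal sketch, not uBio's actual schema: the group ids, the `NOMENCLATURAL_LINKS` table, and the function names are all assumptions made for illustration, and treating Doryteuthis pealei as the nomenclatural variant of the three Loligo spellings is a reading of the slide.

```python
from collections import defaultdict

# Hypothetical records: (name string, lexical group id). The ids are invented
# for illustration and are not real NameBank identifiers.
RECORDS = [
    ("Loligo pealeii", 1),
    ("Loligo pealii", 1),
    ("Loligo pealei", 1),
    ("Doryteuthis pealei", 2),
]

# Assumed nomenclatural link between the two lexical groups
# (same species epithet, different genus).
NOMENCLATURAL_LINKS = {1: {2}, 2: {1}}


def build_index(records):
    """Index every name string by group without dropping any variant."""
    by_name = {}
    by_group = defaultdict(list)
    for name, group in records:
        by_name[name] = group
        by_group[group].append(name)
    return by_name, by_group


def all_variants(name, by_name, by_group, links):
    """Return every retained string linked to `name`, lexically or nomenclaturally."""
    group = by_name[name]
    groups = {group} | links.get(group, set())
    return sorted(n for g in groups for n in by_group[g])


BY_NAME, BY_GROUP = build_index(RECORDS)
```

Looking up any one spelling, for example `all_variants("Loligo pealii", BY_NAME, BY_GROUP, NOMENCLATURAL_LINKS)`, returns all four strings, so content indexed under any variant stays reachable.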

universal Biological indexer and organizer 12

Peranema – the fern

One to Many: Homonyms

Peranema – the euglenid

universal Biological indexer and organizer 13

Taxonomic Concept

universal Biological indexer and organizer 14

Libraries

Publishers

Museums

Federal Agencies

Name IR impediments in current systems: NLM, JSTOR

universal Biological indexer and organizer 15

Name IR impediments in current systems: OBIS

One organism

4 scientific names

4 maps

We want one map
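
The "four maps to one map" goal can be illustrated by grouping occurrence records under a single preferred name before plotting. This is a sketch under stated assumptions: the slide does not name the organism, so the names below are placeholders, the coordinates are invented, and `merge_maps` is a hypothetical helper rather than anything OBIS or uBio exposes.

```python
from collections import defaultdict

# Hypothetical synonym table: every recorded name points at one preferred name.
SYNONYMS = {
    "Name A": "Name A",
    "Name B": "Name A",
    "Name C": "Name A",
    "Name D": "Name A",
}

# Hypothetical occurrence records: (name as recorded, latitude, longitude).
OCCURRENCES = [
    ("Name A", 41.5, -70.7),
    ("Name B", 40.1, -72.3),
    ("Name C", 38.9, -74.0),
    ("Name D", 42.0, -69.5),
]


def merge_maps(occurrences, synonyms):
    """Group points by preferred name; names with no synonym entry map to themselves."""
    merged = defaultdict(list)
    for name, lat, lon in occurrences:
        preferred = synonyms.get(name, name)
        merged[preferred].append((lat, lon))
    return dict(merged)


MAPS = merge_maps(OCCURRENCES, SYNONYMS)
```

With the synonym table applied, `MAPS` holds one entry with four points instead of four entries with one point each.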

universal Biological indexer and organizer 16

• Basis for Relationship: Facts
– Vernacular concept
– Lexical or Nominal synonym
– Nomenclatural synonym
– Homonyms

• Basis for Relationship: Opinion
– Taxonomic Synonym
– Vernacular concepts
– Taxonomic Groups/Classifications

Division of Concepts

universal Biological indexer and organizer 17

Lexical Synonyms

Nomenclatural Synonyms

Vernacular Names

Taxonomic Hierarchies

Taxonomic Synonyms

Primary Components of uBio

Indexes to content

Indexes to taxonomic views

universal Biological indexer and organizer 18

NameBank: An index of names and sources

universal Biological indexer and organizer 19

ClassificationBank

An index of taxon concepts

universal Biological indexer and organizer 20

Fitting In

universal Biological indexer and organizer 21

Fitting In: A datacentric perspective

universal Biological indexer and organizer 22

Network Service: Attribution

• Every datum sent out via service is logged
– nameBankID
– datestamp
– Client IP
– Calling method
– requestorIP

• <client optional>
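
The logged fields listed on this slide could be modeled as one record per datum served. The sketch below is illustrative only: the class name, field types, and `log_datum` helper are assumptions, and which field the "client optional" note applies to is a guess (here, the requestor IP).

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AttributionLogEntry:
    namebank_id: int                    # nameBankID of the datum sent out
    datestamp: str                      # ISO 8601 timestamp of the request
    client_ip: str                      # address of the calling client
    calling_method: str                 # service method that was invoked
    requestor_ip: Optional[str] = None  # assumed to be the client-optional field


def log_datum(log, namebank_id, client_ip, calling_method,
              requestor_ip=None, now=None):
    """Append one attribution record to `log` (a plain list) and return it."""
    now = now or datetime.now(timezone.utc)
    entry = AttributionLogEntry(
        namebank_id, now.isoformat(), client_ip, calling_method, requestor_ip
    )
    log.append(asdict(entry))
    return entry
```

A caller would pass its own log list, for example `log_datum(log, 12345, "192.0.2.10", "namebank_search")`, where the id and method name are made up for the example.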

universal Biological indexer and organizer 23

Tools and Applications: FindIT

• Is trainable

• Locates names & authorities

• Finds names it doesn’t know

• Finds names mangled by OCR
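
The slides do not say how FindIT recovers OCR-mangled names. One common approach, shown here purely as an illustration and not as FindIT's method, is approximate string matching against a list of known names. The sketch uses Python's standard `difflib`; the name list, the 0.85 cutoff, and the function name are assumptions.

```python
import difflib

# Small stand-in for a name list; a real index would be far larger.
KNOWN_NAMES = ["Loligo pealeii", "Doryteuthis pealei", "Peranema"]


def match_ocr_name(candidate, known_names=KNOWN_NAMES, cutoff=0.85):
    """Return the closest known name above `cutoff`, or None if nothing is close."""
    matches = difflib.get_close_matches(candidate, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A string such as "Lo1igo pealeii", where OCR has read the letter l as the digit 1, still resolves to "Loligo pealeii", while unrelated text falls below the cutoff and returns `None`.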

universal Biological indexer and organizer 24

Tools and Applications: LinkIT

universal Biological indexer and organizer 25

Applications

universal Biological indexer and organizer 27

Taxonomic intelligence applied to search

Synonymies expand the scope of queries
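
Query expansion through synonymy can be sketched as follows: the searched name is replaced by the full set of names linked to it, and a document matches if it contains any of them. The synonym set reuses the squid names from an earlier slide; the documents, function names, and simple substring matching are assumptions made for the example.

```python
# One synonym set, taken from the Loligo / Doryteuthis names shown earlier.
SYNONYM_SETS = [
    {"Loligo pealeii", "Loligo pealii", "Loligo pealei", "Doryteuthis pealei"},
]

# Hypothetical documents, invented for illustration.
DOCUMENTS = {
    "doc1": "Giant axon physiology in Loligo pealei.",
    "doc2": "Fisheries landings of Doryteuthis pealei, 2005.",
    "doc3": "Notes on Peranema, a euglenid.",
}


def expand_query(name, synonym_sets):
    """Return the name plus every name that shares a synonym set with it."""
    expanded = {name}
    for group in synonym_sets:
        if name in group:
            expanded |= group
    return expanded


def search(name, documents, synonym_sets):
    """Return ids of documents that mention the name or any of its synonyms."""
    terms = {term.lower() for term in expand_query(name, synonym_sets)}
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(term in text.lower() for term in terms)
    ]
```

Searching for "Loligo pealeii" with the synonym set returns `doc1` and `doc2`, although neither document contains that exact spelling; the same search with an empty synonym list returns nothing.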

universal Biological indexer and organizer 28

uBioRSS: Embedding taxonomies into literature retrieval

universal Biological indexer and organizer 29

Embedding uBio into remote services: uBioRSS

universal Biological indexer and organizer 30

Taxonomic hierarchies enhance data browsing

• Birds of the Belgian Congo

• 4500 pages
• One page has a species of dipteran
• How would someone interested find it?
• 50,000+ Diptera species to choose from

Both enhancements apply to all name-annotated content

universal Biological indexer and organizer 31

uBio Portal: Building communities, enabling connections

universal Biological indexer and organizer 32

Elements of the Portal

Indexing power from NameBank

universal Biological indexer and organizer 33

Alternative names

Vernacular names

Expert view

More or less specific

Suggestions & corrections

Indexing power from NameBank

universal Biological indexer and organizer 34

Results from an array of resources

universal Biological indexer and organizer 35

Additional information from specific projects

universal Biological indexer and organizer 36

content certified linkouts to authoritative resources

XML source

Additional information from specific projects

universal Biological indexer and organizer 37

Text source

Additional information from specific projects

universal Biological indexer and organizer 38

• data from various sources may be merged

• red dots on the map link back to the website that provided the geographical co-ordinates

Specimen distribution data from remote sources
