LSIDs in a Nutshell
Jun Zhao
University of Manchester
1st December, 2005
Any idea?
• 30350027
• gi:30350027 THE LSID
Outline
• What is an LSID
• Why do we need LSIDs
• How does it work
• What is available from your LSID comrades
• How is it working in myGrid
• Questions
LSID: Life Science Identifier
Clark T., Martin S., Liefeld T. Globally Distributed Object Identification for Biological Knowledgebases. Briefings in Bioinformatics 5.1:59-70, March 1, 2004. http://lsid.sourceforge.net/
• A URN (Uniform Resource Name)
• A standard from the OMG LSR group
• A detailed specification: http://www.omg.org/cgi-bin/doc?lifesci/2003-12-02
URN
• URI
– Uniform Resource Identifiers
– Can be further classified as URL & URN
• URL
– Uniform Resource Locators
– Identifying a place where a resource may reside
– A representation of a primary access mechanism
• URN
– Required to remain globally unique and persistent, even when the resource ceases to exist or becomes unavailable
Tim Berners-Lee. Uniform Resource Identifiers (URI): Generic Syntax http://www.ietf.org/rfc/rfc2396.txt
Five part schema
• A five-part format: urn:lsid:Authority:Namespace:Object_ID[:Revision-ID]
For example:
• urn:lsid:ncbi.nlm.nih.gov:pubmed:12571434 refers to a PubMed article
• urn:lsid:ncbi.nlm.nih.gov:genbank:T48601:2 refers to the second version of an entry in GenBank
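The format above can be illustrated with a minimal parsing sketch. The `LSID` tuple and `parse_lsid` function are hypothetical names introduced here for illustration, not part of any LSID toolkit, and the sketch only checks the colon-separated layout rather than the full specification.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Components of an LSID (hypothetical helper type for this sketch)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split urn:lsid:Authority:Namespace:Object_ID[:Revision-ID] into parts."""
    parts = text.split(":")
    # Expect urn, lsid, authority, namespace, object id, and an optional revision.
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated fields: {text!r}")
    # The "urn:lsid" prefix is compared case-insensitively in this sketch.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    # Authority, namespace and object id must all be non-empty.
    if not all(parts[2:5]):
        raise ValueError(f"empty component in LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


article = parse_lsid("urn:lsid:ncbi.nlm.nih.gov:pubmed:12571434")
entry = parse_lsid("urn:lsid:ncbi.nlm.nih.gov:genbank:T48601:2")
```

Here `article.revision` is `None` because the PubMed example carries no revision, while `entry.revision` is the string `"2"`.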
Motivation
• Making your local publications globally available
• Persistent
• Open source
– Anyone can become an LSID registration agency
– No central third-party registration agency is required, and there are no fees to pay
• Linking with other database sources: NCBI protein/nucleotide DBs, PubMed, UniProt/SwissProt, GO terms ...
How does it work
[Diagram. Components: Client, LSID Authority, Data Store, Metadata Store. Labels: urn:lsid:www.mygrid.org.uk; http://hostname:80/authority; WSDL script; operation calls (http, ftp and soap); returned results.]
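One possible reading of the diagram, sketched with in-memory dictionaries instead of real network calls. Everything here is an assumption made for illustration: the table names, the `resolve` function, the example LSID and its payloads are hypothetical, and the dictionary standing in for the WSDL script only mimics the idea that the authority tells the client where the data and metadata services live.

```python
# Step 1 (assumed): the authority part of the LSID maps to an authority endpoint.
AUTHORITY_ENDPOINTS = {
    "www.mygrid.org.uk": "http://hostname:80/authority",
}

# Step 2 (assumed): the authority describes its services; this dictionary
# stands in for the WSDL script shown in the diagram.
SERVICE_DESCRIPTIONS = {
    "http://hostname:80/authority": {
        "data": "data_store",
        "metadata": "metadata_store",
    },
}

# Step 3 (assumed): the stores answer operation calls. The LSID and the
# payloads below are made-up examples.
EXAMPLE_LSID = "urn:lsid:www.mygrid.org.uk:example:1"
STORES = {
    "data_store": {EXAMPLE_LSID: b"ACGT"},
    "metadata_store": {EXAMPLE_LSID: "a made-up description of the data"},
}


def resolve(lsid: str, kind: str):
    """Follow authority -> service description -> store for one LSID."""
    authority = lsid.split(":")[2]
    endpoint = AUTHORITY_ENDPOINTS[authority]
    services = SERVICE_DESCRIPTIONS[endpoint]
    store = STORES[services[kind]]
    return store[lsid]


data = resolve(EXAMPLE_LSID, "data")
metadata = resolve(EXAMPLE_LSID, "metadata")
```

The point of the sketch is the indirection: the client only knows the name, and each lookup step could be re-pointed to a different host or protocol without changing the LSID.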
LSID resources
• http://lsid.sourceforge.net/
• http://lsid.biopathways.org/
• http://cvs.sourceforge.net/viewcvs.py/lsid/
• Who is using them
– BioMOBY (www.biomoby.org)
– Aventis
– BioImage (www.bioimage.org)
– Haystack, the first Semantic Web browser, based on Eclipse (haystack.lcs.mit.edu)
myGrid
• An e-Science project for bioinformaticians and biologists http://www.mygrid.org
• A set of middleware services
• Based on 3 molecular scenarios
• A successful workflow workbench Taverna http://taverna.sourceforge.net
• Hosting 1,800 bio-services
• We finished but we will continue
http://www.mygrid.org.uk/ontology#contains_similar_sequence_to
LSIDs in myGrid
• Motivation
– Uniquely and persistently identifying myGrid internal resources
– Separating data and metadata
– Applying a compatible standard
– Integrating with resources in the open world
• LSIDs and RDF (Resource Description Framework)
[Diagram: RDF graph. Nodes: urn:lsid:taverna.sf.net:datathing:45fg6 and urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank_gi:5851672. Labels: report, sequence, http://www.mygrid.org.uk/ontology#DNA_sequence.]
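A minimal sketch of how LSIDs can serve as node identifiers in RDF statements, using plain tuples and an N-Triples-style serialisation rather than an RDF library. The two LSIDs and the ontology terms come from the slides; which LSID is the report, which is the sequence, and how the terms connect them is an assumption made for illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MYGRID = "http://www.mygrid.org.uk/ontology#"

# Assumed roles: the Taverna data item is the report, the GenBank entry
# is the sequence.
report = "urn:lsid:taverna.sf.net:datathing:45fg6"
sequence = "urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank_gi:5851672"

# Each statement is a (subject, predicate, object) triple of URIs.
triples = [
    (report, MYGRID + "contains_similar_sequence_to", sequence),
    (sequence, RDF_TYPE, MYGRID + "DNA_sequence"),
]


def to_ntriples(statements):
    """Serialise URI-only triples, one '<s> <p> <o> .' statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)


print(to_ntriples(triples))
```

Because the metadata only refers to the data by LSID, the statements can be stored and queried separately from the data they describe.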
LSIDs in action
[Diagram. Components: Taverna Workbench, Workflow design, User context, View LSIDs, Freefluo Enactor, Services, LSID Assigning Service, Store plug-in, Metadata plug-in, Metadata Store, mIR, LSID Authority, LSID Metadata Resolver, LSID Data Resolver, Client application.]
1. Data sent to / received from services
2. New LSIDs assigned to data
3. Data / metadata stored
4. Data and metadata retrieved
LSID ≠ URL
• An LSID is a URN
– Identifying a resource by its name, instead of its location
– Persistency (theoretically??)
– Legacy support
• Multiple protocols: http, ftp, file systems, soap…
Your responsibility
• Unique authority id
• Unique object and revision ids within your namespace
• Never reassign an LSID
• Persistently identifying your data
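The responsibilities above can be sketched as a small registry that refuses to rebind an identifier once it has been issued. The `LSIDRegistry` class, its method names and the `example.org` authority are hypothetical and exist only for this illustration; a real authority would also need durable storage.

```python
from typing import Dict, Optional


class LSIDRegistry:
    """Hypothetical in-memory registry for one authority and namespace."""

    def __init__(self, authority: str, namespace: str) -> None:
        self.authority = authority
        self.namespace = namespace
        self._assigned: Dict[str, bytes] = {}

    def register(self, object_id: str, data: bytes,
                 revision: Optional[str] = None) -> str:
        """Bind data to a new LSID; an already issued LSID is never reused."""
        lsid = f"urn:lsid:{self.authority}:{self.namespace}:{object_id}"
        if revision is not None:
            lsid += f":{revision}"
        if lsid in self._assigned:
            raise ValueError(f"LSID already assigned: {lsid}")
        self._assigned[lsid] = data
        return lsid

    def lookup(self, lsid: str) -> bytes:
        """Return the data bound to an LSID."""
        return self._assigned[lsid]


registry = LSIDRegistry("example.org", "demo")
first = registry.register("42", b"original bytes")
# Changed data gets a new revision id instead of overwriting the old binding.
second = registry.register("42", b"changed bytes", revision="2")
```

Calling `registry.register("42", b"other bytes")` again raises `ValueError`, which is the behaviour the "never reassign" rule asks for.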
What is not working
• Security
• Access control
• LSID synonyms
Questions?
Thank you!