The Protein Identifier Cross-Reference (PICR) service
EBI Roadshow, Rotterdam, 12 June 2012
Juan A. Vizcaíno
Overview
• The problem…
• What is PICR?
• Access via web and web services
The problem…
• The two groups used different protein databases to report their results.
• As a result, the two lists cannot be compared directly.
Protein list A
(DB Search vs. IPI)
IPI000001
IPI000002
IPI000003
…
Protein list B
(DB Search vs. UniProt)
P00001
P00002
P00003
…
The PICR tool is needed to map the two lists to a common identifier space so that they can be compared directly.
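As a rough illustration of why a mapping step is needed, the Python sketch below compares the two result lists. The accessions are the placeholder values from the slide, and the `ipi_to_uniprot` table is entirely hypothetical: it only stands in for the kind of cross-reference a mapping service such as PICR might return.

```python
# Placeholder accessions copied from the slide; not real database entries.
list_a = ["IPI000001", "IPI000002", "IPI000003"]  # search vs. IPI
list_b = ["P00001", "P00002", "P00003"]           # search vs. UniProt

# Without mapping, the two identifier spaces share no accessions.
direct_overlap = set(list_a) & set(list_b)

# Hypothetical cross-reference table (assumption, for illustration only).
ipi_to_uniprot = {
    "IPI000001": {"P00001"},
    "IPI000002": {"P00009"},
    # IPI000003 deliberately has no mapping.
}


def map_to_common_space(accessions, xrefs):
    """Return (set of mapped target accessions, list of unmapped inputs)."""
    mapped, unmapped = set(), []
    for acc in accessions:
        targets = xrefs.get(acc)
        if targets:
            mapped |= targets
        else:
            unmapped.append(acc)
    return mapped, unmapped


mapped_a, unmapped_a = map_to_common_space(list_a, ipi_to_uniprot)
shared = mapped_a & set(list_b)
print(sorted(direct_overlap), sorted(shared), unmapped_a)
```

Keeping the unmapped accessions in a separate list makes it easy to see which entries dropped out of the comparison.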
Why do you need ID mapping?
• Merging datasets to a common identifier space (data integration – submissions!)
• Finding all aliases/synonyms for an identifier
• Mapping from secondary IDs to more recent primary IDs (data “freshness”)
• Preparing data sets for specific tools (data format requirements)
• Querying in various primary databases
Protein identifier mapping is hard
• The basic problem: the same protein sequence is referred to by multiple accession numbers assigned by multiple databases.
• No universal identifier scheme
• Redundant databases – multiple identifiers for the same sequence in the same database
• Unstable identifiers (e.g. GI numbers)
• Obsolete and deleted identifiers (hypothetical proteins)
• Different production cycles for major databases
• Tools exist, but are limited in their database and species coverage and in their usability and availability.
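The basic problem above (one sequence, many accessions) can be illustrated with a small Python sketch that groups accessions by a checksum of the sequence. The records, database prefixes and the choice of MD5 are all assumptions made for this example; it is not a description of how PICR works internally.

```python
import hashlib
from collections import defaultdict

# Made-up (accession, sequence) records, for illustration only.
records = [
    ("DB1:A0001", "MKTAYIAKQR"),
    ("DB2:X9", "MKTAYIAKQR"),
    ("DB2:X10", "mktayiakqr"),   # same sequence, different case
    ("DB3:Q7", "MSSHEGGKKK"),
]


def sequence_key(seq):
    """Checksum of the sequence after removing whitespace and case differences."""
    normalized = "".join(seq.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def group_by_sequence(recs):
    """Map each sequence checksum to the accessions that share that sequence."""
    groups = defaultdict(list)
    for acc, seq in recs:
        groups[sequence_key(seq)].append(acc)
    return dict(groups)


groups = group_by_sequence(records)
# Largest alias groups first.
aliases = sorted(groups.values(), key=len, reverse=True)
print(aliases)
```

Keying on the sequence rather than on any one database's accession sidesteps the lack of a universal identifier scheme, although it does nothing for obsolete or deleted entries.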
PICR Home Page
• Submit accessions OR sequences (FASTA), with a 500-entry interactive limit (no batch limit)
• Select output format
• Select one or many databases to map to in one request
• Limit search by taxonomy (pessimistic)
• Choose to return all mappings or only active ones
• Run search
• BLAST functionality for protein fragments
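For lists longer than the 500-entry interactive limit, one option is to split the input before pasting it into the form. The sketch below is a minimal Python helper for that; the function name `chunk` and the generated accessions are hypothetical.

```python
def chunk(items, size=500):
    """Split a list into consecutive pieces of at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Made-up accessions, for illustration only.
accessions = [f"ACC{i:05d}" for i in range(1200)]
batches = chunk(accessions)
print([len(b) for b in batches])
```

Since the slide states that batch submission has no limit, splitting is only relevant for interactive use.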
PICR Result Page – simple view
• Logical xref (hyperlinked)
• Inactive xref
• Secondary identifier
• Active xref (hyperlinked)
PICR Result Page – detailed view
PICR services
• PICR offers both SOAP and REST web service interfaces.
• Documentation is available online:
SOAP: http://www.ebi.ac.uk/Tools/picr/WSDLDocumentation.do
REST: http://www.ebi.ac.uk/Tools/picr/RESTDocumentation.do
• Sample client code and URL examples are available on the PICR website.
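A REST request can be sketched as a URL built from an accession and one or more target databases. In the Python example below, the base path, the method name `getUPIForAccession`, the parameter names (`accession`, `database`, `onlyactive`) and the database labels are assumptions that should be checked against the REST documentation linked above; the accession is the placeholder from the earlier slide. The code only builds the URL and does not contact the service.

```python
from urllib.parse import urlencode

# Assumed base URL and method name; verify against the REST documentation.
PICR_REST_BASE = "http://www.ebi.ac.uk/Tools/picr/rest"


def build_mapping_url(accession, databases, only_active=True):
    """Build a query URL asking for mappings of one accession to several databases."""
    params = [("accession", accession)]
    # One `database` parameter per target database (assumed convention).
    params += [("database", db) for db in databases]
    params.append(("onlyactive", "true" if only_active else "false"))
    return f"{PICR_REST_BASE}/getUPIForAccession?{urlencode(params)}"


# Placeholder accession from the slides; database labels are assumptions.
url = build_mapping_url("P00001", ["SWISSPROT", "IPI"])
print(url)
```

The `only_active` flag mirrors the home page option to return all mappings or only active ones.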
Do you want to know more?
Wein et al., NAR, 2012