The Protein Identifier Cross-Reference (PICR) service



EBI Roadshow, Rotterdam, 12 June 2012

Juan A. Vizcaíno

Overview

• The problem…

• What is PICR?

• Access via web and web services


The problem…

- The two groups searched against different protein databases to report their results.

- As a result, the two result lists cannot be compared directly.

Protein list A (DB search vs. IPI)

• IPI000001
• IPI000002
• IPI000003

Protein list B (DB search vs. UniProt)

• P00001
• P00002
• P00003


To compare the two lists directly, we need the PICR tool to map both of them to a common set of identifiers.
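The comparison step can be sketched as follows. This is only an illustration: the mapping table `TO_COMMON` and the shared identifiers `U1` to `U4` are invented here, and in practice the table would come from a PICR query rather than being written by hand.

```python
# Hypothetical mapping from each source identifier to a shared identifier.
# The values are invented for illustration; a real table would be built
# from PICR output.
TO_COMMON = {
    "IPI000001": "U1",
    "IPI000002": "U2",
    "IPI000003": "U3",
    "P00001": "U1",
    "P00002": "U4",
    "P00003": "U3",
}


def to_common(ids, mapping):
    """Translate identifiers to the common space, dropping unmapped ones."""
    return {mapping[i] for i in ids if i in mapping}


list_a = ["IPI000001", "IPI000002", "IPI000003"]  # DB search vs. IPI
list_b = ["P00001", "P00002", "P00003"]           # DB search vs. UniProt

common_a = to_common(list_a, TO_COMMON)
common_b = to_common(list_b, TO_COMMON)

# Once both lists share one identifier space, set operations compare them.
shared = sorted(common_a & common_b)
only_a = sorted(common_a - common_b)
only_b = sorted(common_b - common_a)

print("shared:", shared)
print("only in A:", only_a)
print("only in B:", only_b)
```

With the invented table above, two proteins are reported by both groups and one is unique to each list.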


Databases are all different


Databases evolve


Why do you need ID mapping?

• Merging datasets to a common identifier space (data integration – submissions!)

• Mapping from secondary IDs to more recent primary IDs (data “freshness”)

• Preparing data sets for specific tools (data format requirements)

• Finding all aliases/synonyms for an identifier

• Querying in various primary databases


Protein identifier mapping is hard

• The basic problem: the same protein sequence is referred to by multiple accession numbers assigned by multiple databases.

• No universal identifier scheme

• Redundant databases – multiple identifiers for the same sequence in the same database

• Unstable identifiers (e.g. gi numbers)

• Obsolete and deleted identifiers (hypothetical proteins)

• Different production cycles for major databases

• Tools exist, but are limited in their database and species coverage and in their usability and availability.
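The basic problem in the list above (one sequence, many accessions) can be illustrated by grouping records on a checksum of the sequence. This is a minimal sketch, not a description of how PICR is implemented: the records and sequences are invented, and SHA-256 is chosen here only as a convenient checksum.

```python
import hashlib
from collections import defaultdict

# Invented records: (database, accession, sequence). Not real entries.
RECORDS = [
    ("IPI", "IPI000001", "MKTAYIAKQR"),
    ("UniProt", "P00001", "MKTAYIAKQR"),
    ("UniProt", "P00002", "MSDNELLQKA"),
    ("IPI", "IPI000002", "mktayiakqr"),  # same sequence, different case
]


def sequence_key(seq):
    """Normalise a sequence (trim, upper-case) and return its checksum."""
    return hashlib.sha256(seq.strip().upper().encode("ascii")).hexdigest()


def group_by_sequence(records):
    """Collect every (database, accession) pair that shares a sequence."""
    groups = defaultdict(list)
    for database, accession, seq in records:
        groups[sequence_key(seq)].append((database, accession))
    return dict(groups)


groups = group_by_sequence(RECORDS)
aliases = groups[sequence_key("MKTAYIAKQR")]
print(aliases)
```

Keying on the sequence rather than on any one database's accession makes redundant entries within a single database (here the two invented IPI records) fall into the same group.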


PICR Home Page

• Submit accessions OR sequences (FASTA), with a 500-entry interactive limit (no batch limit)

• Select output format

• Select one or many databases to map to in one request

• Limit search by taxonomy (pessimistic)

• Choose to return all mappings or only active ones

• Run search

• BLAST functionality for protein fragments
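Before pasting sequences into the form, it can help to count the FASTA entries against the 500-entry interactive limit. The sketch below assumes plain FASTA text; the function names and the example sequences are hypothetical, and the only figure taken from the slide is the limit itself.

```python
INTERACTIVE_LIMIT = 500  # interactive limit stated on the slide


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    entries = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                entries.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        entries.append((header, "".join(chunks)))
    return entries


def fits_interactive(entries, limit=INTERACTIVE_LIMIT):
    """True when the submission is small enough for the interactive form."""
    return len(entries) <= limit


# Invented example input, not real proteins.
example = ">seq1 hypothetical\nMKTAYIAK\nQRQISFVK\n>seq2 hypothetical\nMSDNELLQKA\n"
entries = parse_fasta(example)
print(len(entries), fits_interactive(entries))
```

Larger inputs would go through batch submission, which the slide notes has no limit.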


PICR Result Page – simple view

• Logical xref (hyperlinked)

• Inactive xref

• Secondary identifier

• Active xref (hyperlinked)


PICR Result Page – detailed view


PICR Result Page – XLS view


PICR services

• PICR offers both SOAP and REST web service interfaces.

• Documentation is available online:

SOAP: http://www.ebi.ac.uk/Tools/picr/WSDLDocumentation.do

REST: http://www.ebi.ac.uk/Tools/picr/RESTDocumentation.do

• Sample client code and URL examples are provided on the PICR website.
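A REST request is essentially a URL with query parameters, so a client can start by building that URL. The sketch below is hedged throughout: the endpoint path `rest/getUPIForAccession`, the parameter names `accession`, `database` and `onlyactive`, and the database labels are assumptions that should be checked against the REST documentation linked above. Only the base URL comes from the slide, and no request is actually sent.

```python
from urllib.parse import urlencode

# Base URL taken from the documentation links on the slide.
BASE = "http://www.ebi.ac.uk/Tools/picr"


def build_mapping_url(accession, databases, only_active=True,
                      endpoint="rest/getUPIForAccession"):
    """Build a query URL for one accession and one or more target databases.

    The endpoint and parameter names are assumptions, not confirmed by the
    slides; verify them against the REST documentation before use.
    """
    params = [("accession", accession)]
    # One 'database' parameter per target database (assumed convention).
    params += [("database", db) for db in databases]
    params.append(("onlyactive", "true" if only_active else "false"))
    return f"{BASE}/{endpoint}?{urlencode(params)}"


# Placeholder accession from the slides; database labels are assumed.
url = build_mapping_url("P00001", ["SWISSPROT", "IPI"])
print(url)
```

Passing a list of pairs to `urlencode` keeps the repeated `database` parameter, which mirrors the option of mapping to many databases in one request.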


Do you want to know more?

Wein et al., NAR, 2012


Questions?