PRIDE resources and ProteomeXchange
Dr. Juan Antonio Vizcaíno
PRIDE Group Coordinator, Proteomics Services Team, EMBL-EBI, Hinxton, Cambridge, UK
WT Proteomics Bioinformatics Course 2015, Hinxton, 10 December 2015
Data resources at EMBL-EBI
• Genes, genomes & variation: Ensembl, Ensembl Genomes, European Nucleotide Archive, European Variation Archive, European Genome-phenome Archive, GWAS Catalog, Metagenomics portal
• Gene, protein & metabolite expression: RNA Central, ArrayExpress, Expression Atlas, MetaboLights, PRIDE
• Protein sequences, families & motifs: InterPro, Pfam, UniProt
• Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
• Chemical biology: ChEMBL, ChEBI
• Reactions, interactions & pathways: IntAct, Reactome, MetaboLights
• Systems: BioModels, Enzyme Portal, BioSamples
• Literature & ontologies: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology
• PRIDE Archive (in the context of ProteomeXchange and the PSI standards)
• How to submit data to PRIDE: PRIDE tools
• How to access data in PRIDE Archive
• PRIDE Cluster and PRIDE Proteomes
Overview
ProteomeXchange Consortium
• Goal: Development of a framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories.
• Includes PeptideAtlas (ISB, Seattle), PRIDE (Cambridge, UK) and (very recently) MassIVE (UCSD, San Diego).
• Common identifier space (PXD identifiers)
• Two supported data workflows: MS/MS and SRM.
• Main objective: Make life easier for researchers
http://www.proteomexchange.org
PRIDE (PRoteomics IDEntifications) database
• PRIDE stores mass spectrometry (MS)-based proteomics data:
  • Peptide and protein expression data (identification and quantification)
  • Post-translational modifications
  • Mass spectra (raw data and peak lists)
  • Technical and biological metadata
  • Any other related information
• Full support for tandem MS approaches
http://www.ebi.ac.uk/pride/archive
Martens et al., Proteomics, 2005; Vizcaíno et al., NAR, 2013
PRIDE Mission
• To archive all types of proteomics mass spectrometry data for the purpose of supporting reproducible research, allowing the application of quality control metrics and enabling the reuse of these data by other researchers.
• To integrate MS-based data in a protein-centric manner to provide information on protein variants, modifications, and expression.
• To provide mass spectrometry-based expression data to the Expression Atlas.
Data content in PRIDE Archive
• Submission-driven resource
• PRIDE is organized into datasets (groups of assays)
• An assay represents one MS run (in most cases).
• No data reprocessing at present. PRIDE aims to represent the author’s view on the data
• Supported formats: PRIDE XML and mzIdentML.
• Raw data are now stored as well
What is a proteomics publication in 2015?
• Proteomics studies generate potentially large amounts of data and results.
• Ideally, a proteomics publication needs to:
  • Summarize the results of the study
  • Provide supporting information for reliability of any results reported
• Information in a publication:
  • Manuscript
  • Supplementary material
  • Associated data submitted to a public repository
Journal Submission Recommendations
• Journal guidelines recommend submission to proteomics repositories: Proteomics, Nature Biotechnology, Nature Methods, Molecular and Cellular Proteomics
• Funding agencies are enforcing public deposition of data to maximize the value of the funds provided.
PRIDE: Source of MS proteomics data
• PRIDE Archive already provides or will soon provide MS proteomics data to other EMBL-EBI resources such as UniProt, Ensembl and the Expression Atlas.
http://www.ebi.ac.uk/pride
ProteomeXchange data workflow
[Figure: workflow diagram. Labels: submitted files (metadata / manuscript, raw data*, results); receiving repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), PASSEL (SRM data), other DBs; ProteomeCentral (metadata); journals; UniProt/neXtProt; PeptideAtlas; GPMDB; other DBs; researcher’s results; reprocessed results; raw data*.]
Vizcaíno et al., Nat Biotechnol, 2014
• PRIDE Archive (in the context of ProteomeXchange and the PSI standards)
• How to submit data to PRIDE: PRIDE tools
• How to access data in PRIDE Archive
• A sneak peek at other PRIDE resources
Overview
Complete vs Partial submissions: processed results
For complete submissions, it is possible to connect the spectra with the processed identification results, and both can be visualized.
[Screenshots: Complete, Partial]
Complete vs Partial submissions: experimental metadata
[Screenshots: Complete, Partial]
The general experimental metadata about the projects is similar. However, at the assay level the information in partial submissions is less detailed.
How to perform a complete PX submission to PRIDE
• Decide between a complete/partial submission.
• File conversion/export to PRIDE XML or mzIdentML
• File check before submission (PRIDE Inspector)
• Experimental annotation and actual file submission (PX submission tool)
• Post-submission steps
PX Data workflow for MS/MS data
1. Mass spectrometer output files: raw data (binary files) or peak list spectra in a standardized format (mzML, mzXML).
2. Result files:
a. Complete submissions: Result files can be converted to PRIDE XML or the mzIdentML data standard.
b. Partial submissions: For workflows not yet supported by PRIDE, search engine output files will be stored and provided in their original form.
3. Metadata: Sufficiently detailed description of sample origin, workflow, instrumentation, submitter.
4. Other files: Optional files (the list can be extended):
a. QUANT: Quantification related results
b. PEAK: Peak list files
c. GEL: Gel images
d. OTHER: Any other file type
e. FASTA
f. SP_LIBRARY
[Figure labels: Published, Raw files, Other files]
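The complete/partial distinction in item 2 can be sketched in a few lines of Python. This is only an illustration and not the logic of the PX submission tool: the function name `submission_type` and the idea of passing the result-file formats as plain strings are assumptions made for the example.

```python
# Formats that the slide names as acceptable for a complete submission.
COMPLETE_FORMATS = {"PRIDE XML", "mzIdentML"}


def submission_type(result_formats):
    """Classify a submission from the formats of its result files.

    Hypothetical helper: returns "complete" only when at least one result
    file is given and every result file is PRIDE XML or mzIdentML.
    Anything else (e.g. raw search engine output) counts as "partial".
    """
    formats = set(result_formats)
    if formats and formats <= COMPLETE_FORMATS:
        return "complete"
    return "partial"


print(submission_type(["mzIdentML"]))               # complete
print(submission_type(["search engine output"]))    # partial
```

Treating a mixed set of formats as partial is a design choice of this sketch, not something the slide states.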
PRIDE Components: Submission Process
[Figure: submission pipeline. Labels: PRIDE Converter 2, PRIDE XML, mzIdentML, PRIDE Inspector, PX Submission Tool.]
Before: only file conversion to PRIDE XML
[Figure: original data files (search output files, spectra files) → ‘RESULT’ file generation by file conversion (PRIDE Converter; other tools, e.g. hEIDI) → final ‘RESULT’ file (PRIDE XML).]
Barsnes et al., Nat Biotechnol, 2009; Cote et al., MCP, 2012
Now: native file export to mzIdentML
[Figure: tools (Mascot, ProteinPilot, Scaffold, PEAKS, MSGF+, others) → ‘RESULT’ file generation by native file export → final ‘RESULT’ file (mzIdentML), together with the spectra files (mzML, mzXML, mzData, mgf, pkl, ms2, dta, apl).]
Complete submissions
[Figure: search engine results + MS files → search engines → mzIdentML]
An increasing number of tools support export to mzIdentML 1.1:
- Mascot
- MSGF+
- MyriMatch and related tools from D. Tabb’s lab
- OpenMS
- PEAKS
- PeptideShaker
- ProCon (ProteomeDiscoverer, Sequest)
- Scaffold
- TPP via the idConvert tool (ProteoWizard)
- ProteinPilot (from version 5.0)
- X!Tandem native conversion (Beta, PILEDRIVER)
- Crux
- Others: library for X!Tandem conversion, lab internal pipelines, …
Referenced spectral files need to be submitted as well (all open formats are supported).
Updated list: http://www.psidev.info/tools-implementing-mzIdentML#.
PRIDE Inspector Toolsuite
PRIDE Inspector 2 supports:
- PRIDE XML
- mzIdentML + all types of spectra files
- mzML
- mzTab identification and quantification
https://github.com/PRIDE-Toolsuite/
Wang et al., Nat. Biotechnology, 2012; Perez-Riverol et al., MCP, 2016, in press
PRIDE Inspector 2
https://github.com/PRIDE-Toolsuite/
New visualisation functionality for Protein Groups
PX submission tool
• Capture the mappings between the different types of files.
• Make the file upload process straightforward to the submitter (it transfers all the files using Aspera or FTP).
• Command line alternative: using the Aspera file transfer protocol.
http://www.proteomexchange.org/submission
[Figure labels: Published, Raw, Other files, PX submission tool]
PX submission tool: screenshots
Fast file transfer with Aspera
- Aspera is the default file transfer protocol to PRIDE:
  - PX Submission tool
  - Command line
- Up to 50X faster than FTP
File transfer speed should not be a problem!!
Manuscript published detailing the process
Ternent et al., Proteomics, 2014
http://www.proteomexchange.org/submission
Example dataset: PXD000764
- Title: “Discovery of new CSF biomarkers for meningitis in children”
- 12 runs: 4 controls and 8 infected samples
- Identification and quantification data
PRIDE Archive: Number of submitted datasets in 2015
[Figure: number of submitted datasets to PRIDE Archive per month (November 1st 2015). X-axis: month, 2012-01 to 2015-09. Y-axis: number of datasets, 0 to 200.]
ProteomeXchange: 2,774 datasets up until 1st September, 2015
Type:
- 1681 PRIDE partial
- 813 PRIDE complete
- 173 MassIVE
- 84 PeptideAtlas/PASSEL complete
- 23 Reprocessed
Publicly accessible: 1372 datasets, 49% of all (90% PRIDE, 6% PASSEL, 4% MassIVE)
Data volume:
- Total: ~150 TB
- Number of all files: ~400,000
- PXD000320-324: ~4 TB
- PXD002319-26: ~2.4 TB
- PXD001471: ~1.6 TB
Datasets/year: 2012: 102; 2013: 527; 2014: 963; 2015: 1182
Top species studied by at least 20 datasets (~500 species in total): 1080 Homo sapiens, 335 Mus musculus, 110 Saccharomyces cerevisiae, 98 Arabidopsis thaliana, 75 Rattus norvegicus, 58 Escherichia coli, 29 Bos taurus, 23 Glycine max, 20 Caenorhabditis elegans, 20 Oryza sativa
Origin: 714 USA, 313 Germany, 252 United Kingdom, 163 China, 146 France, 121 Netherlands, 108 Switzerland, 103 Canada, 81 Denmark, 73 Spain, 68 Japan, 67 Australia, 63 Sweden, 57 Belgium, 43 Austria, 39 India, 34 Taiwan, 33 Norway, 26 Italy, 24 Ireland, 24 Finland, 21 Republic of Korea, 20 Brazil, 20 Russia, 18 Israel, 18 Singapore, …
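As a quick sanity check, the breakdowns on this slide can be added up in Python. The numbers are copied from the slide; the variable names are only illustrative.

```python
# Figures copied from the slide (ProteomeXchange, up to 1st September 2015).
total_datasets = 2774

by_type = {
    "PRIDE partial": 1681,
    "PRIDE complete": 813,
    "MassIVE": 173,
    "PeptideAtlas/PASSEL complete": 84,
    "Reprocessed": 23,
}
by_year = {2012: 102, 2013: 527, 2014: 963, 2015: 1182}
public_datasets = 1372

# Both breakdowns should account for every dataset.
type_sum = sum(by_type.values())
year_sum = sum(by_year.values())

# Share of datasets that are publicly accessible, as a whole percentage.
public_percent = round(100 * public_datasets / total_datasets)

print(type_sum, year_sum, public_percent)
```

Both sums come to 2,774 and the public share rounds to 49%, which matches the slide.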
Public data release: when does it happen?
• When the author tells us to do it (the authors can do it by themselves)
• When we find out that a dataset has been published
• We look for PXD identifiers in PubMed abstracts.
• If your PXD identifier is not in the abstract, the paper may have been published while the data are still private. Let us know!
• New web form on the PRIDE website to facilitate the process
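Looking for PXD identifiers in abstract text could be done along these lines. This is a minimal sketch and not PRIDE's actual pipeline: it assumes identifiers have the form `PXD` followed by six digits, as in the examples in these slides, and the sample abstract sentence is invented for the illustration.

```python
import re

# Assumption: a PXD identifier is "PXD" plus exactly six digits (e.g. PXD000764).
PXD_PATTERN = re.compile(r"\bPXD\d{6}\b")


def find_pxd_identifiers(text):
    """Return the distinct PXD identifiers found in text, in order of appearance."""
    found = []
    for accession in PXD_PATTERN.findall(text):
        if accession not in found:
            found.append(accession)
    return found


# Invented example sentence, in the style of a data availability statement.
abstract = "Data are available via ProteomeXchange with identifier PXD000764."
print(find_pxd_identifiers(abstract))
```

An abstract that never mentions the identifier returns an empty list, which is the case where the submitter needs to tell PRIDE about the publication.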
Partial submissions can be used to store other data types
• Everything can be stored, not only MS/MS data: very flexible mechanism to be able to capture all types of datasets
• PRIDE does not store SRM data (it goes to PASSEL)
• Top down proteomics datasets.
• Mass Spectrometry Imaging datasets.
• Data independent acquisition techniques: e.g. SWATH-MS datasets.
A public repository for mass spectrometry imaging data
Römpp et al., 2015
[Figure: panels C and D compare an image from the original publication [13] with one reconstructed from ProteomeXchange data.
From original publication [13]: 1. Thermo RAW data / UDP; 2. Mirion Software (JLU).
Reconstructed ProteomeXchange data: 1. Thermo RAW data / UDP; 2. Convert to imzML; 3. Upload to PRIDE (PRIDE database, European Bioinformatics Institute, Cambridge, UK); 4. Download from PRIDE; 5. Display in MSiReader.]
- Vendor-independent data format
- Freely available software (open source)
- ‘open data’ – free to reuse
- Anybody can do this!
No file size limit!
Data access to PRIDE Archive
• Look for particular datasets of interest:
• For data reuse: which particular proteins and peptides (including PTMs) have been detected.
• Data reinterpretation or re-analysis.
• Validation of the experimental results reported.
• Specific use cases for proteomics: spectral libraries, fragmentation models, SRM transitions,…
ProteomeCentral: Portal for all PX datasets
http://proteomecentral.proteomexchange.org/cgi/GetDataset
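A link to a single dataset page can be built from its PXD identifier. The sketch below assumes the portal accepts the accession in an `ID` query parameter; that parameter name does not appear on the slide, so it should be checked against the portal before relying on it. The function name `dataset_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL as given on the slide.
PROTEOMECENTRAL_URL = "http://proteomecentral.proteomexchange.org/cgi/GetDataset"


def dataset_url(accession):
    """Build a ProteomeCentral link for one dataset.

    Assumption: the accession is passed in a query parameter named "ID".
    """
    return f"{PROTEOMECENTRAL_URL}?{urlencode({'ID': accession})}"


# PXD000764 is the example dataset used earlier in these slides.
print(dataset_url("PXD000764"))
```

The function only builds the string and makes no network request.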
RSS feed for public datasets
http://groups.google.com/group/proteomexchange/feed/rss_v2_0_msgs.xml
Ways to access data in PRIDE Archive
• PRIDE web interface
• File repository
• REST web service
• PRIDE Inspector tool
PRIDE Archive web interface
PRIDE Archive web interface (2)
CompOmics Open Source Analysis Pipeline
https://github.com/compomics/searchgui
Vaudel M, Barsnes H, Berven FS, Sickmann A, Martens L: Proteomics 2011;11(5):996-9.
https://github.com/compomics/peptide-shaker
Vaudel M, Burkhart J, Zahedi RP, Berven FS, Sickmann A, Martens L, Barsnes H: Nature Biotechnology 2015;33(1):22-24.
Reshake PRIDE data!
Find the desired PRIDE project …
… inspect the project details …
… and start re-analyzing the data!
PRIDE resources
[Figure: relationship between the PRIDE resources. Labels: PRIDE Archive (original submissions, reprocessed datasets); aggregation; basic QC checks for PSMs (for original results); PRIDE Cluster; PRIDE Proteomes; link to the original evidence.]
Sneak peek
• Provide an aggregated and QC-filtered peptide-centric and protein-centric view on PRIDE Archive data.
http://www.ebi.ac.uk/pride/cluster/
http://wwwdev.ebi.ac.uk/pride/proteomes/
PRIDE Cluster - Concept
• Use spectral clustering to reliably group spectra coming from the same peptide
• Infer reliable identifications by comparing submitted identifications of spectra within a cluster
• Increases quality through data increase (taking advantage of the wealth of data in PRIDE).
• Inherently adapts to new (labelling) techniques
Griss et al., Nat Methods, 2013
PRIDE Cluster - Concept
[Figure: originally submitted identified spectra, labelled with the peptide identifications NMMAACDPR and PPECPDFDPPR, are grouped into clusters, each with a consensus spectrum.]
Threshold: At least 10 spectra in a cluster and ratio >70%.
Griss et al., Nat Methods, 2013
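The threshold on this slide can be illustrated with a small Python sketch. It is not the PRIDE Cluster implementation: it assumes that "ratio" means the fraction of spectra in a cluster that share the most common submitted peptide identification, and the function name `cluster_summary` is hypothetical.

```python
from collections import Counter

MIN_SPECTRA = 10   # "At least 10 spectra in a cluster"
MIN_RATIO = 0.70   # "ratio >70%"


def cluster_summary(peptide_ids):
    """Summarise one cluster from the peptide IDs submitted for its spectra.

    Returns (most common sequence, its ratio, whether the cluster passes
    both thresholds). Assumes one submitted identification per spectrum.
    """
    if not peptide_ids:
        raise ValueError("a cluster needs at least one identified spectrum")
    top_sequence, top_count = Counter(peptide_ids).most_common(1)[0]
    ratio = top_count / len(peptide_ids)
    reliable = len(peptide_ids) >= MIN_SPECTRA and ratio > MIN_RATIO
    return top_sequence, ratio, reliable


# Ten spectra, eight of them identified as NMMAACDPR: ratio 0.8, passes.
cluster = ["NMMAACDPR"] * 8 + ["PPECPDFDPPR"] * 2
print(cluster_summary(cluster))   # ('NMMAACDPR', 0.8, True)
```

A cluster with fewer than 10 spectra fails the check even if all its identifications agree, which is why large clusters drawn from many datasets are the most useful.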
PRIDE Cluster Home page
http://www.ebi.ac.uk/pride/cluster/#/
PRIDE Cluster: result of searches
http://www.ebi.ac.uk/pride/cluster/#/
A couple of examples …
Examples: one perfect cluster
- 880 PSMs give the same peptide ID
- 4 species
- 28 datasets
- Same instruments
Examples: one perfect cluster (2)
Examples: one perfect cluster (3)
What does that peptide sequence correspond to?
Examples: very good cluster
Examples: very good cluster (2)
PRIDE Cluster – Spectral libraries
http://www.ebi.ac.uk/pride/cluster/#/libraries
PRIDE Proteomes: reusing PRIDE Cluster data
• Condensed and cross-dataset view of PRIDE Archive for identification data:
  • Data filtering of PSMs is performed at the level of the submitted data.
  • PSMs are grouped as peptide sequences.
  • The peptide sequences are remapped to a recent version of UniProtKB (at present UniProtKB “complete proteome”).
• Linked to the original supporting evidence.
• “PRIDE Cluster” is used as extra evidence for the PSMs.
http://wwwdev.ebi.ac.uk/pride/proteomes/
PRIDE: Using it for giving reliability to IDs
Link to PRIDE Cluster web
http://wwwdev.ebi.ac.uk/pride/proteomes/
Conclusions
• Main characteristics of PRIDE Archive and ProteomeXchange
• PX/PRIDE submission workflow for MS/MS data
  • PRIDE Inspector
  • PX submission tool
• PRIDE/ProteomeXchange has become the de facto standard for data submission and data availability in proteomics
• PRIDE Proteomes and PRIDE Cluster: new resources
Do you want to know a bit more…?
http://www.slideshare.net/JuanAntonioVizcaino
Acknowledgements: The PRIDE Team and collaborators
People: Attila Csordas, Tobias Ternent, Noemi del Toro, Johannes Griss, Yasset Perez-Riverol, Henning Hermjakob
All past team members, especially Rui Wang, Florian Reisinger and Jose A. Dianes
All ProteomeXchange partners, especially Eric Deutsch and Nuno Bandeira
Questions?