Data Collection and Analysis for High Throughput Quantitative Proteomics:
Current Status and Challenges
Ruedi Aebersold, Ph.D.
Institute for Systems Biology
Seattle, Washington
email: [email protected]
Proteomics:
The systematic (quantitative) analysis of the proteins expressed in a cell at a time
Proteome as database:
• Enumerate all the components of a proteome
• Proteome analyzed once
Proteomics as biol. or clin. assay:
• Detect dynamic changes in proteome following external or internal perturbations
• Proteome analyzed multiple (infinite) times
Protein Identification Strategy
[Figure: protein mixture -> peptides -> 1D, 2D, 3D peptide separation (chromatogram, time axis 12-16 min) -> tandem mass spectrometer (Q1, Q2 collision cell, Q3) -> tandem mass spectrum (m/z 200-1200) -> correlative sequence database searching (theoretical vs. acquired spectra) -> protein identification]
Accurate Quantitation Using Isotope Dilution
• Heavy/light (h/l) analytes are chemically identical -> identical specific signal in MS
• Ratio of h/l signals indicates ratio of analytes
[Figure: Sample 1 and Sample 2 (reference); one sample incorporates a stable light isotope, the other a stable heavy isotope -> combine samples -> analyze by mass spectrometer]
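The ratio statement can be made concrete with a minimal sketch. The peak areas below are hypothetical numbers chosen for illustration, and the function name is not from the talk; the only idea taken from the slide is that the heavy/light signal ratio stands in for the ratio of the two analyte amounts.

```python
def heavy_light_ratio(heavy_signal: float, light_signal: float) -> float:
    """Return the heavy/light signal ratio for one chemically identical analyte pair."""
    if light_signal <= 0:
        raise ValueError("light signal must be positive")
    return heavy_signal / light_signal


# Hypothetical integrated peak areas (arbitrary units) for one peptide pair.
ratio = heavy_light_ratio(heavy_signal=3.0e6, light_signal=1.5e6)
print(f"h/l ratio = {ratio:.2f}")  # the heavy-labeled analyte is twice as abundant
```

Because both forms of the analyte behave identically in the instrument, no external calibration curve is needed for the relative comparison.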
Isotope Coded Affinity Tags (ICAT)
[Figure: ICAT reagent structure with three parts: biotin tag, linker (heavy or light), thiol-reactive group. X marks the eight isotope-coded positions in the linker.]
Heavy reagent: d8-ICAT (X = deuterium)
Light reagent: d0-ICAT (X = hydrogen)
Detection of Cys-containing peptides and accurate quantification using stable isotope dilution
Quantitative proteomics by isotope labeling-LC-MS/MS
[Figure: workflow with the labeled steps: isotope-label (Mixture 1, Mixture 2), combine and proteolyze, optional fractionation, avidin affinity enrichment, quantitation and protein identification. Panels: MS spectrum (m/z 550-580) showing a light/heavy peak pair; MS/MS spectrum (m/z 200-800) of peptide NH2-EACDPLR-COOH]
Compatible with any separation/fractionation method at protein/peptide level.
Stable Isotope Labeling Strategies
[Figure: three labeling strategies, each passing through the stages protein labeling, data collection (mass spectrometry; intensity vs. m/z) and data analysis. Steps labeled on the slide: Digest, Label]
• Metabolic stable isotope labeling
• Isotope tagging by chemical reaction
• Stable isotope incorporation via enzyme reaction
Quantitative Proteomics Technology
Protein identification: Automated peptide tandem mass spectrometry of complex peptide mixtures
Protein quantification: Isotope dilution
Selective chemical reactions: reduction of sample complexity; selective analyte isolation
Results
Identification of proteins in sample and quantitative profiles
Current capacity: ~1000 proteins per day/instrument
Total yeast lysate: ~2000 proteins identified and quantified
In 1991, all the world’s labs combined had identified just about 2000 genes
Current Limitations (and Potential Solutions)
• The efficiency problem
• The validation problem
• The biological inference problem
Standard Method for Complex Peptide Mixture Analysis
Cation Exchange -> RP-HPLC -> ESI-MS/MS
Proteome Analysis: The Analytical Challenges
Yeast Proteome
• Expected number of ORFs: 6118
• Expected number of tryptic peptides: ~350,000
Timepoint Samples from Yeast Cells Synchronously Transiting the Cell Cycle
Synchronous Timepoint Samples Compared to Asynchronous Reference Sample
        T0    T30   T60   T90   T120
T0      678
T30     320   998
T60     342   555   1006
T90     340   604   571   1243
T120    319   626   587   684   1047
Data Summary
[Figure: diagram with unlabeled counts; values as extracted: 1648, 1523, 1448, 1713, 1229, 1095, 1184, 1112, 892, 1055, 1140, 921, 1051, 871, 960]
• 2735/6562 proteins quantified across all timepoints (42%)
• 696 proteins quantified in every experiment
• 1513 proteins quantified in at least one timepoint
• 34,400 peptides quantified on average per timepoint
• >1 million mass spectra collected
Pep3D: Xiao-jun Li et al. submitted
Features: 2720
CIDs: 1633
IDs: 363
ID/CID: 22%
ID/feature: 13%
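The two percentages follow directly from the three counts on the slide. The short sketch below only redoes that arithmetic; the variable names are illustrative.

```python
# Counts reported on the Pep3D slide.
features = 2720  # peptide features detected in the LC-MS map
cids = 1633      # features selected for fragmentation (CID)
ids = 363        # fragmentation events that led to an identification

id_per_cid = ids / cids
id_per_feature = ids / features

print(f"ID/CID:     {id_per_cid:.0%}")
print(f"ID/feature: {id_per_feature:.0%}")
```

Read this way, roughly four out of five MS/MS cycles in this run did not produce an identification, which is the efficiency problem the following slides address.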
Possible Solutions
• Better separation technology
• Selective peptide isolation
• Smart precursor ion selection
[Figure: Number of peptides identified in each FFE fraction (average overlap: 29%). x-axis: fraction number (1-27); y-axis: number of peptides (0-400); series: peptides overlapping with the previous fraction, unique peptides in the fraction]
[Figure: Number of peptides identified in each SCX fraction (average overlap: 52%). x-axis: fraction number (1-27); y-axis: number of peptides (0-700); series: peptides overlapping with the previous fraction, unique peptides in the fraction]
• Tryptic yeast digest separated by FFE-IEX or SAX
• 30 fractions collected and analyzed by capLC-MS/MS
• Overlap: same peptide identified in adjacent fractions
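One way to compute such an overlap figure is sketched below. The slide does not state the exact denominator, so this version assumes overlap is the share of a fraction's peptides that were also identified in the preceding fraction; the peptide strings are made-up placeholders.

```python
def adjacent_overlap(fractions):
    """For each fraction after the first, return the share of its peptides
    that were also identified in the previous fraction."""
    overlaps = []
    for previous, current in zip(fractions, fractions[1:]):
        current_set = set(current)
        if not current_set:
            overlaps.append(0.0)
            continue
        shared = current_set & set(previous)
        overlaps.append(len(shared) / len(current_set))
    return overlaps


# Hypothetical peptide identifications in three consecutive fractions.
fractions = [
    {"AAK", "CDR"},
    {"CDR", "EFK", "GHR"},
    {"GHR", "LMK"},
]
per_fraction = adjacent_overlap(fractions)
average = sum(per_fraction) / len(per_fraction)
print(per_fraction, f"average overlap: {average:.0%}")
```

A lower average means each peptide is confined to fewer fractions, so fewer MS/MS cycles are spent re-sequencing the same peptide.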
[Figure: Peptide overlap in SCX. x-axis: number of fractions one peptide distributes to (1-26); y-axis: number of peptides (0-1400)]
[Figure: Peptide overlap in FFE. x-axis: number of fractions one peptide distributes to (1-26); y-axis: number of peptides (0-2800)]
Percentages annotated on the two panels: 92% and 68%
Possible Solutions
• Better separation technology
• Selective peptide isolation
– Zhang H, et al. Curr. Op. Chem. Biol. (2004) 8: 66-75
– Aebersold R. Nature (2003) 422(6928):115-6.
• Smart precursor ion selection
– Griffin T et al. Anal Chem. (2003) 75:867-74.
– Griffin et al. J Am Soc Mass Spectrom. (2001) 12:1238-46.
Summary: Efficiency Problem
• Only a (small) subset of peptides present is identified
• Current separation strategies do not have sufficient resolving power
• MS/MS of every peptide in every experiment is a bottleneck of current MS based proteomics
• LC-ESI MS/MS wastes a high fraction of MS/MS cycles sequencing precursor ions that do not lead to a positive identification
• Most positive identifications are not informative in profiling experiments
• Smart precursor ion selection is required
Current Limitations (and Potential Solutions)
• The efficiency problem
• The validation problem
• The biological inference problem
Protein Identification by MS/MS
[Figure: protein sample -> peptide mixture -> MS/MS spectra -> peptide identifications -> protein identifications. Proteins are labeled A B C D and A B C on the slide. Levels marked: protein level, peptide level, MS/MS spectrum level]
Database search tools:
- SEQUEST
- Mascot
- SpectrumMill
- Etc.
Threshold Model
Output from search algorithm -> sort by search score -> apply threshold: results above it are treated as "correct", results below it as incorrect
SEQUEST: Xcorr > 2.0, ΔCn > 0.1
MASCOT: Score > 47
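The threshold model amounts to a fixed cutoff filter, sketched below. The cutoffs are the SEQUEST values shown on the slide, with the second one read as the delta-Cn score. The `SequestHit` record, its field names and all delta-Cn values are assumptions made for illustration; only the peptide strings and scores echo the example table later in the talk.

```python
from dataclasses import dataclass


@dataclass
class SequestHit:
    spectrum: str
    peptide: str
    xcorr: float
    delta_cn: float


def passes_threshold(hit: SequestHit, min_xcorr: float = 2.0, min_delta_cn: float = 0.1) -> bool:
    """Accept a hit only if both scores exceed their fixed cutoffs."""
    return hit.xcorr > min_xcorr and hit.delta_cn > min_delta_cn


# delta_cn values are invented; the talk only lists one score per spectrum.
hits = [
    SequestHit("Spectrum 1", "LGEYGH", 4.5, 0.35),
    SequestHit("Spectrum 2", "FQSEEQ", 3.4, 0.08),
    SequestHit("Spectrum 3", "FLYQE", 1.3, 0.20),
    SequestHit("Spectrum N", "EIQKKF", 2.2, 0.12),
]
accepted = [h.spectrum for h in hits if passes_threshold(h)]
print(accepted)
```

Every accepted hit is treated as equally trustworthy, which is why this filter cannot report a per-identification confidence.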
Difficulty Interpreting Protein Identifications based on MS/MS
• Different search score thresholds used to filter data
• Unknown and variable false positive error rates
• No reliable measures of confidence
Statistical Model
entire dataset: for each MS/MS spectrum, the best match and its database search score
Spectrum     Peptide   Score
Spectrum 1   LGEYGH    4.5
Spectrum 2   FQSEEQ    3.4
Spectrum 3   FLYQE     1.3
…            …         …
Spectrum N   EIQKKF    2.2
Statistical Model
entire dataset:
Spectrum     Peptide   Score   Probability
Spectrum 1   LGEYGH    4.5     1.0
Spectrum 2   FQSEEQ    3.4     0.97
Spectrum 3   FLYQE     1.3     0.01
…            …         …
Spectrum N   EIQKKF    2.2     0.3
EM mixture model algorithm learns the most likely distributions among correct and incorrect peptide assignments given the observed data (unsupervised learning)
[Figure: score distributions of incorrect and correct assignments with the resulting probability curve; p=0.5 marked]
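The idea of learning the two score distributions without labels can be sketched with a small expectation-maximization loop. This is a simplified illustration, not the published model: it assumes both the correct and the incorrect assignments follow a Gaussian on a single score, and it runs on synthetic scores generated in the script. All function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_correct(score, model):
    """P(correct | score) under the fitted two-component mixture."""
    a = model["weight_correct"] * normal_pdf(score, *model["correct"])
    b = (1.0 - model["weight_correct"]) * normal_pdf(score, *model["incorrect"])
    return a / (a + b) if a + b > 0.0 else 0.5


def fit_mixture(scores, iterations=100):
    """Unsupervised EM fit of an 'incorrect' and a 'correct' Gaussian to 1-D scores."""
    lo, hi = min(scores), max(scores)
    spread = max((hi - lo) / 4.0, 1e-3)
    # Crude start: incorrect assignments score low, correct ones score high.
    model = {
        "weight_correct": 0.5,
        "incorrect": (lo + spread, spread),
        "correct": (hi - spread, spread),
    }
    for _ in range(iterations):
        # E-step: responsibility of the 'correct' component for every score.
        resp = [posterior_correct(s, model) for s in scores]
        n_cor = sum(resp)
        n_inc = len(scores) - n_cor
        if n_cor < 1e-9 or n_inc < 1e-9:
            break  # one component has collapsed; keep the last usable model
        # M-step: re-estimate the weight, means and standard deviations.
        mean_cor = sum(r * s for r, s in zip(resp, scores)) / n_cor
        mean_inc = sum((1.0 - r) * s for r, s in zip(resp, scores)) / n_inc
        var_cor = sum(r * (s - mean_cor) ** 2 for r, s in zip(resp, scores)) / n_cor
        var_inc = sum((1.0 - r) * (s - mean_inc) ** 2 for r, s in zip(resp, scores)) / n_inc
        model = {
            "weight_correct": n_cor / len(scores),
            "incorrect": (mean_inc, max(math.sqrt(var_inc), 1e-3)),
            "correct": (mean_cor, max(math.sqrt(var_cor), 1e-3)),
        }
    return model


# Synthetic scores (not data from the talk): 300 incorrect and 100 correct assignments.
rng = random.Random(0)
scores = [rng.gauss(1.0, 0.4) for _ in range(300)] + [rng.gauss(3.5, 0.5) for _ in range(100)]
model = fit_mixture(scores)
for s in (4.5, 3.4, 2.2, 1.3):
    print(f"score {s:.1f} -> P(correct) = {posterior_correct(s, model):.2f}")
```

The output of such a fit is a probability per assignment rather than a pass/fail flag, so a dataset can be filtered at any chosen error rate after the fact.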
Threshold Model: Bad Discrimination and Inconsistency
Sensitivity: fraction of all correct results passing filter
Error Rate: fraction of all results passing filter that are incorrect
[Figure: sensitivity vs. error rate for SEQUEST thresholds (from literature), with the ideal spot marked]
test data: A. Keller et al. OMICS 6(2), 207 (2002)
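The two definitions translate directly into code when the true status of each result is known, as it is for a test dataset. The sketch below uses a handful of invented (score, is_correct) pairs; the function name is hypothetical.

```python
def sensitivity_and_error_rate(results, passes):
    """results: list of (score, is_correct) pairs; passes: predicate on the score.

    Sensitivity = correct results passing the filter / all correct results.
    Error rate  = incorrect results passing the filter / all results passing.
    """
    total_correct = sum(1 for _, ok in results if ok)
    passing = [(score, ok) for score, ok in results if passes(score)]
    correct_passing = sum(1 for _, ok in passing if ok)
    sensitivity = correct_passing / total_correct if total_correct else 0.0
    error_rate = (len(passing) - correct_passing) / len(passing) if passing else 0.0
    return sensitivity, error_rate


# Invented test set: search score plus the known truth for each assignment.
results = [(4.5, True), (3.4, True), (2.5, True), (2.2, False), (2.1, False), (1.3, False)]

loose = sensitivity_and_error_rate(results, lambda s: s > 2.0)
strict = sensitivity_and_error_rate(results, lambda s: s > 3.0)
print("score > 2.0:", loose)
print("score > 3.0:", strict)
```

Moving the cutoff trades one quantity against the other, which is what the sensitivity vs. error rate plots visualize.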
Discriminating Power of Peptide Prophet
Sensitivity: fraction of all correct results passing filter
Error Rate: fraction of all results passing filter that are incorrect
[Figure: sensitivity vs. error rate for SEQUEST thresholds (from literature) and for the probability model, with the ideal spot marked]
Improved discrimination: more identifications (for the same error rate). Keller et al. Anal. Chem. 2003
>sp|P02754|LACB_BOVIN BETA-LACTOGLOBULIN PRECURSOR (BETA-LG) (ALLERGEN BOS D 5) - Bos taurus (Bovine).
MKCLLLALALTCGAQALIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTKIPAVFKIDALNENKLVLDTDYKKYLLFCMENSAEPEQSLACQCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI
TPEVDDEALEK : p = 0.96
TPEVDDEALEKFDK : p = 0.96
KPTPEGDLEILLQK : p = 0.83
LSFNPTQLEEQCHI : p = 0.65
LSFNPTQLEEQCHI : p = 0.76
sp|P02754|LACB_BOVIN
Probability = ???
ProteinProphet™ software combines probabilities of peptides assigned to MS/MS spectra to compute accurate probabilities that corresponding proteins are present
Protein Identification
Nesvizhskii et al Anal Chem. (2003)75:4646-58.
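A simplified way to combine the peptide probabilities listed for LACB_BOVIN is sketched below. It assumes the peptide assignments are independent, keeps only the highest probability when the same peptide was assigned to several spectra, and computes the chance that at least one assignment is correct. The actual ProteinProphet software does more than this (for example adjusting peptide probabilities and handling shared peptides), so the result is illustrative only.

```python
from math import prod  # Python 3.8+


def protein_probability(peptide_assignments):
    """Probability that at least one distinct peptide assignment is correct."""
    best = {}
    for peptide, p in peptide_assignments:
        # Keep the best-scoring spectrum for each distinct peptide.
        best[peptide] = max(p, best.get(peptide, 0.0))
    return 1.0 - prod(1.0 - p for p in best.values())


# Peptide probabilities shown on the slide for sp|P02754|LACB_BOVIN.
assignments = [
    ("TPEVDDEALEK", 0.96),
    ("TPEVDDEALEKFDK", 0.96),
    ("KPTPEGDLEILLQK", 0.83),
    ("LSFNPTQLEEQCHI", 0.65),
    ("LSFNPTQLEEQCHI", 0.76),
]
print(f"P(protein present) ~ {protein_probability(assignments):.4f}")
```

With several confident, distinct peptides the combined probability is close to 1 under these assumptions, whereas a protein supported by a single p = 0.65 assignment would stay at 0.65.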
Issues for Protein Identification
• Many peptides are present in more than a single database protein entry
– ProteinProphet apportions such peptides among all corresponding proteins to derive simplest list of proteins that explain observed peptides
• Peptides corresponding to ‘single-hit’ proteins are less likely to be correct than those corresponding to ‘multi-hit’ proteins
– ProteinProphet learns by how much peptide probabilities should be adjusted to reflect this protein grouping information
[Figure: ten peptide assignments mapped to proteins. In the sample (enriched for ‘multi-hit’ proteins): Prot A with Peptides 1-2 and Prot B with Peptides 3-5. Not in the sample (enriched for ‘single hits’): five further proteins, each matched by one of Peptides 6-10. 5 assignments are correct (+)]
Amplification of False Positive Error Rate from Peptide to Protein Level
Peptide Level: 50% False Positives
Protein Level: 71% False Positives
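The 50% and 71% figures can be reproduced by counting, assuming the layout drawn on the slide: two proteins present in the sample carry the five correct peptides, and each of the five incorrect peptides points to a different protein. The labels "Prot C" to "Prot G" are hypothetical names for the five single-hit proteins.

```python
# Peptide-to-protein assignments; "Prot C" to "Prot G" are made-up labels.
assignments = {
    "Peptide 1": "Prot A", "Peptide 2": "Prot A",
    "Peptide 3": "Prot B", "Peptide 4": "Prot B", "Peptide 5": "Prot B",
    "Peptide 6": "Prot C", "Peptide 7": "Prot D", "Peptide 8": "Prot E",
    "Peptide 9": "Prot F", "Peptide 10": "Prot G",
}
# Assumption: the 5 correct (+) peptides are those of the proteins in the sample.
correct_peptides = {f"Peptide {i}" for i in range(1, 6)}


def false_positive_rates(assignments, correct_peptides):
    """Return (peptide-level, protein-level) false positive fractions."""
    wrong_peptides = [p for p in assignments if p not in correct_peptides]
    peptide_fp = len(wrong_peptides) / len(assignments)
    proteins = set(assignments.values())
    # A protein counts as correct if at least one correct peptide maps to it.
    true_proteins = {prot for pep, prot in assignments.items() if pep in correct_peptides}
    protein_fp = len(proteins - true_proteins) / len(proteins)
    return peptide_fp, protein_fp


peptide_fp, protein_fp = false_positive_rates(assignments, correct_peptides)
print(f"Peptide level: {peptide_fp:.0%} false positives")
print(f"Protein level: {protein_fp:.0%} false positives")
```

Correct peptides cluster on few proteins while incorrect ones scatter across the database, so the same peptide list yields a worse error rate once it is collapsed to proteins.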
Serum Protein Identifications from Large-scale (~375 run) Experiment
Data Filter                  # ids   # non-single hits   # single-hits
Publ. Threshold model #1     2257    359                 1898
Publ. Threshold model #2     2742    441                 2301
ProteinProphet, p 0.5        713     511                 202 (predicted error rate: 7%)
Reference: H. Zhang et al., in prep
Consistency of Manual Validation of SEQUEST Search Results
[Figure: search results as judged by manual authenticators; categories: Correct Validation, Incorrect Validation, Validation Withheld]
Data Analysis Pipeline
Tasks for a proteomic analysis pipeline (with the tool used for each):
• Suitable input: mzXML
• Peptide assignment: COMET, ProbID
• Validation: Peptide Prophet
• Protein assignment: Protein Prophet
• Quantitation: ASAPRatio
• Interpretation: SBEAMS, Cytoscape
Data Analysis Summary:
• Processing of data collected from different platforms, samples, experiments, operators requires transparent methods to score data
• Publication and relational database analysis require consistently scored data
• Tools assigning probability based scores are essential
• Openly accessible, transparent (OS) tools bring in new talent and lead to community improved tools
Nesvizhskii and Aebersold (2004) Drug Discov Today. 9:173-81
http://www.proteomecenter.org/software.php
Current Limitations (and Potential Solutions)
• The efficiency problem
• The validation problem
• The biological inference problem
[Figure: IFN-treated and mock-treated samples are ICAT labeled (C12 and C13 reagents), combined (C12/C13) and analyzed by HPLC-MS/MS]
Wei Yan et al
Name | Cellular pathway | Probability | ASAPRatio Mean | ASAPRatio Std.
GSA7: ubiquitin activating enzyme E1-like protein | ubiquitination and protein degradation | 0.98 | 2.844 | 0.663
NMI: N-myc (and STAT) interactor | signaling pathway; transcription; apoptosis | 0.57 | 2.875 | 0.138
ETFA: electron-transfer-flavoprotein, alpha polypeptide (glutaric aciduria II) | electron transfer | 0.45 | 2.890 | 0.484
M96: likely ortholog of mouse metal response element binding transcription factor 2 | transcription | 0.98 | 2.923 | 0.390
LOC151636: rhysin 2 | cytoskeleton and intracellular transport? | 1 | 2.975 | 0.231
KNS2: kinesin 2 60/70kDa | cytoskeleton and intracellular transport | 1 | 3.140 | 0.335
ACACA: acetyl-Coenzyme A carboxylase alpha | lipid and fatty acid metabolism | 1 | 3.351 | 0.259
CABC1: chaperone, ABC1 activity of bc1 complex like (S. pombe) | chaperone and protein folding | 0.99 | 3.533 | 1.659
NUDT2: nudix (nucleoside diphosphate linked moiety X)-type motif 2 | nucleobase, nucleoside, nucleotide and nucleic acid metabolism | 0.98 | 3.684 | 0.224
KIAA1276: KIAA1276 protein | unknown and ESIs | 0.62 | 3.815 | 0.058
PRKR protein kinase, interferon-inducible double stranded RNA dependent | translation and ribosomal protein; anti-viral response | 1 | 3.963 | 0.659
EEF1A protein [Fragment] | translation and ribosomal protein; GTP binding | 1 (1) (1) | 4.164 (2.741) (2.2) | 1.284 (0.195) (0.394)
CD7: CD7 antigen (p41) | immune response | 0.57 | 4.523 | 2.204
PLCD1: phospholipase C, delta 1 | signaling pathway; lipid metabolism | 0.69 | 4.569 | 0.116
MTP: microsomal triglyceride transfer protein (large polypeptide, 88kDa) | lipid and fatty acid metabolism | 0.97 | 4.748 | 0.751
G1P2: interferon, alpha-inducible protein (clone IFI-15K) | cytoskeleton and intracellular transport | 0.98 | 4.858 | 0.661
GNB1: guanine nucleotide binding protein (G protein), beta polypeptide 1 | G-protein coupled receptor and G-protein signaling | 1 | 5.845 | 0.133
FLJ32915: hypothetical protein FLJ32915 | unknown | 0.73 | 6.054 | 4.883
IFIT4: interferon-induced protein with tetratricopeptide repeats 4 | unknown | 1 | 6.230 | 0.794
AHCY: S-adenosylhomocysteine hydrolase | one-carbon compound metabolism | 0.93 | 6.279 | 0.936
CSA_PPIase PEPTIDYL PROLYL CIS TRANS ISOMERASE | chaperone and protein folding | 1 | 8.104 | 1.070
ADRM1: adhesion regulating molecule 1 | adhesion molecule and extracellular matrix protein | 0.79 | 9.508 | 1.043
PASK PAS domain containing serine/threonine kinase | signaling pathway | 0.49 | 12.006 | 1.024
GPR111: G protein-coupled receptor 111 | G-protein coupled receptor and G-protein signaling | 0.63 | 21.270 | 4.741
IFIT1: interferon-induced protein with tetratricopeptide repeats 1 | unknown and ESIs | 0.48 | 9999 | -1
UBE2L6: ubiquitin-conjugating enzyme E2L 6 | ubiquitination and protein degradation | 0.57 | 9999 | -1
DNAH11: dynein, axonemal, heavy polypeptide 11 | motor protein complex | 0.94 | 9999 | -1
Name | Cellular pathway | Probability | ASAPRatio Mean | ASAPRatio Std.
HIC1: hypermethylated in cancer 1 | transcription suppression | 0.94 | 0.000 | -1.250
MIG-6: Gene 33/Mig-6 | signaling pathway | 0.99 | 0.000 | -1.250
KRT6: keratin 6 | cytoskeleton and intracellular transport | 1 | 0.003 | 0.008
Hypothetical protein FLJ21140 | unknown | 0.71 | 0.043 | 0.064
G2AN: alpha glucosidase II alpha subunit | carbohydrate metabolism | 1 | 0.198 | 0.033
Ig lambda chain C regions | immune response | 0.98 | 0.265 | 0.110
FASN: fatty acid synthase | lipid and fatty acid metabolism | 0.99 | 0.304 | 0.100
WNT9A: wingless-type MMTV integration site family, member 9A | signaling pathway | 0.99 | 0.316 | 0.091
PPGB: protective protein for beta-galactosidase (galactosialidosis) | chaperone and protein folding; protein protection | 1 | 0.323 | 0.084
OAT: ornithine aminotransferase (gyrate atrophy) | amino acid and peptide metabolism | 0.98 | 0.331 | 0.018
MAPRE1: microtubule-associated protein, RP/EB family, member 1 | cytoskeleton and intracellular transport | 0.98 | 0.339 | 0.016
CPSF4: cleavage and polyadenylation specific factor 4, 30kDa | RNA splicing and processing | 0.96 | 0.378 | 0.154
TUBA6: tubulin alpha 6 | cytoskeleton and intracellular transport; GTP binding | 1 | 0.383 | 0.165
FXR2: fragile X mental retardation, autosomal homolog 2 | RNA binding and ribosomal association | 0.95 | 0.391 | 0.074
FACL4: fatty-acid-Coenzyme A ligase, long-chain 4 | lipid and fatty acid metabolism | 0.98 | 0.416 | 0.081
KIAA0007: KIAA0007 protein | unknown | 1 | 0.426 | 0.014
SRRM2: serine/arginine repetitive matrix 2 | RNA splicing and processing | 0.82 | 0.434 | 0.224
GPS1: G protein pathway suppressor 1 | G-protein coupled receptor and G-protein signaling | 0.98 | 0.455 | 0.138
TRA1: tumor rejection antigen (gp96) 1 | chaperone and protein folding | 1 | 0.452 | 0.165
SARDH: sarcosine dehydrogenase | electron transfer | 0.98 | 0.484 | 0.008
KRT10: keratin 10 (epidermolytic hyperkeratosis; keratosis palmaris et plantaris) | cytoskeleton and intracellular transport | 0.97 | 0.495 | 0.055
SPK: symplekin | unknown | 1 | 0.496 | 0.029
MGC3207: hypothetical protein MGC3207 | translation and ribosomal protein | 0.61 | 0.499 | 0.071
[Figure: identification counts per fraction at two probability cutoffs]
         P3    S100   P100   Sum    Unique ID
P 0.9    671   270    523    1464   1113
P 0.4    748   330    590    1668   1272
54 IFN-induced proteins (2-fold)
15 previously reported
39 novel
23 IFN-repressed proteins (0.5-fold)
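The grouping into induced and repressed proteins is a fold-change cutoff on the IFN / Mock ratio. The sketch below assumes the cutoffs are inclusive (ratio of at least 2 or at most 0.5); the slide only gives "2-fold" and "0.5-fold". The PRKR and FASN ratios are taken from the protein table in this talk, and the third entry is a made-up unchanged protein.

```python
def classify_ratio(ratio: float, induced_at: float = 2.0, repressed_at: float = 0.5) -> str:
    """Label an IFN / Mock abundance ratio by fold change."""
    if ratio >= induced_at:
        return "induced"
    if ratio <= repressed_at:
        return "repressed"
    return "unchanged"


ratios = {
    "PRKR": 3.963,           # from the table of IFN-regulated proteins
    "FASN": 0.304,           # from the table of IFN-regulated proteins
    "hypothetical X": 1.1,   # invented example of an unchanged protein
}
labels = {name: classify_ratio(r) for name, r in ratios.items()}
print(labels)
```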
Lots of data: what does it mean?
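Interferon (IFN) Pathway
Katze et al (2002) 2: 675
[Figure: IFN pathway components (PKR, 2’,5’-OAS, Mx, ADAR, IRFs, MHC) annotated with IFN / Mock ratios. Legible assignments: β-2-microglobulin (MHC I) 2.768 ± 0.583; IFI-30 (MHC II) 2.219 ± 0.183. Other values on the slide: 2.215 ± 0.079, 3.963 ± 0.659, 2.460 ± 0.076, 2.359 ± 0.149, 1.398 ± 0.118, and one component marked "Not identified"]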
GO Analysis of Interferon regulated proteins
[Figure: Gene Ontology tree spanning GO levels 3 to 12. Terms shown: Physiological process; Response to stress; Response to external stimulus; Pathogenesis; Metabolism; Death; Cell growth and/or maintenance; Cellular defense response; Defense response; Immune response; Cell death; Transport; Cell organization; Cell growth; Cytoplasm organization; Nuclear organization; Fatty acid metabolism; Amino acid metabolism; Nitrogen metabolism; DNA metabolism; Catabolism]
Islands of intense knowledge in ocean of unknown
[Figure: separate islands labeled Hormone responses, Cell motility, Energy metabolism, Transcription]
Charting the path between landmarks
[Figure: the landmarks Hormone responses, Cell mobility, Energy metabolism and Transcription, linked through unassigned observations]
Walking down the interaction map
[Figure: interaction map with nodes A to I]
First round of TAP-tagging: Identification of IGBP1 and TIP41 interactors
[Figure: interaction network around the baits IGBP1 and TIP41. Catalytic subunits of PP2A-type phosphatases: PPP2CA, PPP2CB, PPP4C, PPP6C. CCT complex: TCP1, CCT2, CCT3, CCT4, CCT5, CCT6A, CCT7, CCT8. Uncharacterized proteins (marked *): PPP6R1, PPP6R2A, PPP4R2]
Anne-Claude Gingras
Human phosphatase-interaction network: Segregation into functional modules
[Figure: network around PP2A a, PP2B, PP2C, PP4C and PP6C, with modules labeled Centrosome; Meiosis; Exit from mitosis; Actin cytoskeleton; G1/S transition]
Acknowledgements
Separation strategies: Hookeun Lee, Eugene Yi, Mingliang Yi
Abundance dependent MS/MS: Tim Griffin, Chris Lock (Sciex)
Software development and statistical models: Eric Deutsch, Xiao-Jun Li, Jimmy Eng, Alex Nesvizhskii, Andy Keller, Benno Schwikowski, Patrick Pedrioli, Ning Zhang
Inference of biological function: Wei Yan, Anne-Claude Gingras, Cytoscape project (www.cytoscape.org)
Funding: NIH (NCI, NCRR, NIDA, NHLBI), Merck, ABI