Page 1:

Proteomics Informatics Workshop
Part II: Protein Characterization

David Fenyö

February 18, 2011

• Top-down/bottom-up proteomics
• Post-translational modifications
• Protein complexes
• Cross-linking
• The Global Proteome Machine Database

Page 2:

Proteomics Informatics

[Diagram elements: Biological System, Samples, Measurements (MS, MS/MS), Information about each sample, Information about the biological system. Steps: Experimental Design, Sample Preparation, Data Analysis, Information Integration. Questions addressed: What does the sample contain? How much?]

Page 3:

[Diagram elements: Biological System, Samples, Measurements (MS, MS/MS), Information about each sample, Information about the biological system. Steps: Experimental Design, Sample Preparation, Data Analysis, Information Integration. Sample Preparation is expanded into: Enrichment, Separation etc.; Digestion; Top down (Proteins); Bottom up (Peptides); Fragmentation; Fragments. Questions addressed: What does the sample contain? How much?]

Page 4:

Top down / bottom up

[Figure: Top down and Bottom up mass spectra, intensity against mass/charge.]

Page 5:

Charge distribution

[Figure: Top down and Bottom up mass spectra, intensity against mass/charge, with charge states 1+, 2+, 3+, 4+, 27+ and 31+ labeled.]

Page 6:

Isotope distribution

[Figure: Top down and Bottom up mass spectra, intensity against mass/charge, with isotope distributions labeled m = 1035 Da, m = 1878 Da and m = 2234 Da.]

Page 7:

Fragmentation

[Figure: Top down and Bottom up fragmentation.]

Page 8:

Correlations between modifications

Top down

Bottom up

Page 9:

Alternative Splicing

Top down

Bottom up

Exon 1 2 3

Page 10:

Top down

Kellie et al., Molecular BioSystems 2010

[Figure: Protein mass spectra and Fragment mass spectra.]

Page 11:

Non-Covalent Protein Complexes

Schreiber et al., Nature 2011

Page 12:

Dynamic Range in Proteomics

Large discrepancy between the experimental dynamic range and the range of amounts of different proteins in a proteome

The goal is to identify and characterize all components of a proteome

[Figure: Distribution of Protein Amounts, Number of Proteins against Log (Protein Amount), with the Experimental Dynamic Range and the Desired Dynamic Range marked.]

Page 13:

Experimental Designs

[Diagram of the simulated workflow. Steps shown: Sample, Protein Separation, Digestion, Peptide Separation, Mass Spectrometry. The peptide separation panels plot # of peptides per bin against "Retention time" (bin), for bins 1 to k. The mass spectrometry panels show peaks m1 to m6 against Protein Abundance, with the MS dynamic range marked.]

Page 14:

Parameters in Simulation

● Distribution of protein amounts in sample
● Loss of peptides before binding to the column
● Loss of peptides after elution off the column
● Distribution of mass spectrometric response for different peptides present at the same amount
● Total amount of peptides that are loaded on column (limited by column loading capacity)
● # of peptide fractions
● # of proteins in each fraction
● Dynamic range of mass spectrometer
● Detection limit of mass spectrometer
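The parameters listed above can be wired into a toy Monte Carlo model. The following is a minimal sketch, not the simulation used in the workshop: the function name `simulate_detection`, the log-normal shapes of the amount and response distributions, and every default value are assumptions made for illustration. Each protein is represented by a single peptide, separation is reduced to random assignment of proteins to fractions, and column loading capacity is not modeled.

```python
import random


def simulate_detection(n_proteins=1000, n_fractions=1, loss_fraction=0.5,
                       ms_dynamic_range=100.0, detection_limit=1.0, seed=0):
    """Return the fraction of simulated proteins that are detected."""
    rng = random.Random(seed)
    # Distribution of protein amounts in the sample (assumed log-normal).
    amounts = [10 ** rng.gauss(3.0, 1.0) for _ in range(n_proteins)]
    # One peptide per protein: apply losses, then a peptide-specific
    # mass spectrometric response (assumed log-normal around 1).
    signals = [a * (1.0 - loss_fraction) * 10 ** rng.gauss(0.0, 0.5)
               for a in amounts]
    # Separation: assign each protein at random to one fraction.
    fractions = [[] for _ in range(n_fractions)]
    for signal in signals:
        fractions[rng.randrange(n_fractions)].append(signal)
    detected = 0
    for fraction in fractions:
        if not fraction:
            continue
        # A signal is seen if it is above the detection limit and within
        # the MS dynamic range of the strongest signal in its fraction.
        floor = max(detection_limit, max(fraction) / ms_dynamic_range)
        detected += sum(1 for s in fraction if s >= floor)
    return detected / n_proteins


unfractionated = simulate_detection(n_fractions=1)
fractionated = simulate_detection(n_fractions=10)
```

In this sketch, splitting the sample into more fractions can only lower the intensity floor within each fraction, so `fractionated` is never smaller than `unfractionated` for the same seed.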

[Diagram: the simulated workflow from the Experimental Designs slide, repeated alongside the parameters. Steps shown: Sample, Protein Separation, Digestion, Peptide Separation, Mass Spectrometry.]

Page 15:

Simulation Results for 1D-LC-MS

Workflow elements: Complex Mixtures of Proteins, Digestion, RPC, MS Analysis

[Figure: four histograms of Number of Proteins against log(Protein Amount), for Tissue (log amount 0 to 6) and Body Fluid (log amount 0 to 10), each shown with No Protein Separation and with Protein Separation: 10 fractions.]

Page 16:

Success Rate of a Proteomics Experiment

DEFINITION: The success rate of a proteomics experiment is defined as the number of proteins detected divided by the total number of proteins in the proteome.

[Figure: Distribution of Protein Amounts, Number of Proteins against Log (Protein Amount), with the Proteins Detected region marked.]
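The success rate definition reduces to a ratio of set sizes. A minimal sketch follows; the function name `success_rate` and the protein identifiers are hypothetical, and detections that are not part of the proteome list are ignored here by assumption.

```python
def success_rate(detected_proteins, proteome):
    """Number of proteins detected divided by the number in the proteome."""
    proteome = set(proteome)
    if not proteome:
        raise ValueError("proteome must contain at least one protein")
    # Only count detections that belong to the proteome under study.
    detected = set(detected_proteins) & proteome
    return len(detected) / len(proteome)


# Hypothetical identifiers, for illustration only.
proteome = {"P1", "P2", "P3", "P4", "P5"}
detected = {"P1", "P3"}
rate = success_rate(detected, proteome)
```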

Page 17:

Relative Dynamic Range of a Proteomics Experiment

DEFINITION: RELATIVE DYNAMIC RANGE, RDRx, where x is e.g. 10%, 50%, or 90%

[Figure: Distribution of Protein Amounts and Proteins Detected, plotted as Number of Proteins and Fraction of Proteins Detected against Log (Protein Amount), with RDR90, RDR50 and RDR10 marked.]

Page 18:

Repeat Analysis

[Figure: panels for 1 Analysis, 2 Analyses, 3 Analyses, 4 Analyses, 5 Analyses, 6 Analyses, 7 Analyses and 8 Analyses.]

Page 19:

Repeat Analysis: Comparison of Simulations and Experiments

[Figure: two plots against Number of Repeats (0 to 10), each comparing Experiment and Simulation: Success Rate (0 to 0.3) and RDR10 (0 to 0.5).]

Page 20:

[Figure: histograms of Number of Proteins against log(Protein Amount) for Tissue (log amount 0 to 6) and Body Fluid (log amount 0 to 10), together with plots of Success Rate and of Relative Dynamic Range (RDR50) against Number of Proteins in Mixture (1 to 100000, logarithmic axis), each on a 0 to 1 scale. Numbered markers 1 and 2 appear on the Tissue and Body Fluid panels.]

Page 21:

Amount loaded and peptide separation

[Figure: plots of Relative Dynamic Range against Success Rate (both 0 to 1.0) for Tissue, with numbered histograms of Number of Proteins against log(Protein Amount, 0 to 6) showing the effect of each step: Protein separation, Amount loaded, Peptide separation.]

Order:
1. Protein separation
2. Amount loaded
3. Peptide separation

Order:
1. Protein separation
2. Peptide separation
3. Amount loaded

Ranges:
Protein separation: 30000 – 3000 proteins in each fraction
Amount loaded: 0.1 µg – 10 µg
Peptide separation: 100 – 1000 fractions

Page 22:

Localization of modifications

Phosphopeptide identification

m_precursor = 2000 Da
Δm_precursor = 1 Da
Δm_fragment = 0.5 Da
Phosphorylation

[Figure: Probability of Localization (0 to 1.2) against Number of fragment ions (0 to 25).]

Page 23:

Localization of modifications

m_precursor = 2000 Da
Δm_precursor = 1 Da
Δm_fragment = 0.5 Da
Phosphorylation

[Figure: Probability of Localization (0 to 1.2) against Number of fragment ions (0 to 25), with curves for ID and Localization (dmin=3).]

dmin>=3 for 47% of human tryptic peptides

Page 24:

Localization of modifications

m_precursor = 2000 Da
Δm_precursor = 1 Da
Δm_fragment = 0.5 Da
Phosphorylation

[Figure: Probability of Localization (0 to 1.2) against Number of fragment ions (0 to 25), with curves for ID, 3 and Localization (dmin=2).]

dmin=2 for 33% of human tryptic peptides

Page 25:

Localization of modifications

m_precursor = 2000 Da
Δm_precursor = 1 Da
Δm_fragment = 0.5 Da
Phosphorylation

[Figure: Probability of Localization (0 to 1.2) against Number of fragment ions (0 to 25), with curves for ID, 3, 2 and Localization (dmin=1).]

dmin=1 for 20% of human tryptic peptides

Page 26:

Localization of modifications

m_precursor = 2000 Da
Δm_precursor = 1 Da
Δm_fragment = 0.5 Da
Phosphorylation

[Figure: Probability of Localization (0 to 1.2) against Number of fragment ions (0 to 25), with curves for ID, 3, 2, 1 and Localization (d=1*).]

Page 27:

Peptide with two possible modification sites

Localization of modifications

Page 28:

Localization of modifications

Peptide with two possible modification sites

MS/MS spectrum

[Figure: Intensity against m/z.]

Page 29:

Localization of modifications

Peptide with two possible modification sites

MS/MS spectrum

[Figure: Intensity against m/z.]

Matching

Page 30:

Localization of modifications

Peptide with two possible modification sites

MS/MS spectrum

[Figure: Intensity against m/z.]

Matching

Which assignment does the data support?

1, 1 or 2, or 1 and 2?

Page 31:

AAYYQK

Visualization of evidence for localization

AAYYQK

Page 32:

Visualization of evidence for localization

AAYYQK

AAYYQK

Page 33:

Visualization of evidence for localization

[Figure: two panels, each labeled 3, 2, 1.]

Page 34:

Estimation of global false localization rate using decoy sites

By counting how many times the phosphorylation is localized to amino acids that cannot be phosphorylated, we can estimate the false localization rate as a function of amino acid frequency.

[Figure: two plots of False localization frequency (0 to 0.02) against Amino acid frequency (0 to 0.15); Y is marked.]
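One way to turn the decoy-site idea into a number is sketched below. It assumes, as an illustration only, that the false localization frequency grows in proportion to amino acid frequency, fits that proportionality on the decoy residues, and reads off the value at the frequency of a real site such as Y. The function names, the proportional model and all counts and frequencies are hypothetical and are not taken from the slides.

```python
def fit_slope_through_origin(xs, ys):
    """Least-squares slope of y = slope * x."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("need at least one non-zero x value")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


def estimate_false_localization(decoy_counts, total_localizations,
                                aa_frequency, target_aa):
    """Estimate the false localization frequency for target_aa.

    decoy_counts: localizations assigned to residues that cannot be
    phosphorylated, keyed by amino acid.
    """
    xs = [aa_frequency[aa] for aa in decoy_counts]
    ys = [decoy_counts[aa] / total_localizations for aa in decoy_counts]
    slope = fit_slope_through_origin(xs, ys)
    return slope * aa_frequency[target_aa]


# Hypothetical numbers, for illustration only.
aa_frequency = {"A": 0.08, "G": 0.07, "L": 0.10, "Y": 0.03}
decoy_counts = {"A": 8, "G": 7, "L": 10}
flr_y = estimate_false_localization(decoy_counts, 1000, aa_frequency, "Y")
```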

Page 35:

How much can we trust a single localization assignment?

If we can generate the distribution of scores for assignment 1 when 2 is the correct assignment, it is possible to estimate the probability of obtaining a certain score by chance for a given peptide sequence and MS/MS spectrum assignment.

[Figure and equation: score distribution F(S) with the scores S21 and Sm1 marked, in two steps (1. and 2.). The probability p is written as a ratio of two integrals of F(S) dS; the integration limits and indices are not recoverable from this transcript.]
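The probability of obtaining a score by chance can be approximated empirically once a set of chance scores is available. The sketch below is an assumption-laden illustration rather than the formula from the slide: it assumes that a higher score means a better match, it takes the chance scores as given (how they are generated is not specified here), and it uses the common add-one correction so that the estimate is never exactly zero. The name `empirical_p_value` and the example scores are hypothetical.

```python
def empirical_p_value(chance_scores, measured_score):
    """Fraction of chance scores at least as high as the measured score."""
    if not chance_scores:
        raise ValueError("chance_scores must not be empty")
    n_at_least = sum(1 for s in chance_scores if s >= measured_score)
    # Add-one correction: the measured score counts as one more draw.
    return (n_at_least + 1) / (len(chance_scores) + 1)


# Hypothetical scores for assignment 1 when assignment 2 is correct.
chance_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
p = empirical_p_value(chance_scores, measured_score=9.5)
```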

Page 36:

Is it a mixture or not?

If we can generate the distribution of scores for assignment 2 when 1 is the correct assignment, it is possible to estimate the probability of obtaining a certain score by chance for a given peptide sequence and MS/MS spectrum assignment.

[Figure and equation: score distribution F(S) with the scores S12 and Sm2 marked, in two steps (1. and 2.). The probability p is written as a ratio of two integrals of F(S) dS; the integration limits and indices are not recoverable from this transcript.]

Page 37:

Localization of modifications

Peptide with two possible modification sites

MS/MS spectrum

[Figure: Intensity against m/z.]

Matching

Which assignment does the data support?

1, 1 or 2, or 1 and 2?

[Decision table: the two p-values are each compared with a threshold pth, and the combination of the two comparisons selects among the outcomes 1, 2, 1 and 2, and 1 or 2. A further condition involves the measured scores Sm. The inequality signs are not recoverable from this transcript.]

Page 38:

Protein Complexes

AB

A

CD

Digestion

Mass spectrometry

Page 39:

Tackett et al. JPR 2005

Protein Complexes – specific/non-specific binding

Page 40:

Sowa et al., Cell 2009

Protein Complexes – specific/non-specific binding

Page 41:

Protein Complexes – specific/non-specific binding

Choi et al., Nature Methods 2010

Page 42:

Analysis of Non-Covalent Protein Complexes

Taverner et al., Acc Chem Res 2008

Page 43:

Determining the architectures of macromolecular assemblies

Alber et al., Nature 2007

Page 44:

Interaction Partners by Chemical Cross-Linking

[Diagram elements: Protein Complex, Chemical Cross-Linking, Cross-Linked Protein Complex, Isolation, Enzymatic Digestion, Proteolytic Peptides, MS (Peptides), Fragmentation, MS/MS (Fragments); spectra plotted against M/Z.]

Page 45:

Interaction Sites by Chemical Cross-Linking

[Diagram elements: Protein Complex, Chemical Cross-Linking, Cross-Linked Protein Complex, Isolation, Enzymatic Digestion, Proteolytic Peptides, MS (Peptides), Fragmentation, MS/MS (Fragments); spectra plotted against M/Z.]

Page 46:

Cross-linking

protein

n peptides with reactive groups

(n-1)n/2 potential ways to cross-link peptides pairwise

+ many additional uninformative forms

Protein A + IgG heavy chain: 990 possible peptide pairs

Yeast NPC: ~10^6 possible peptide pairs
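The pair count grows quadratically with the number of reactive peptides, which a few lines of arithmetic make concrete. In this sketch the function names are hypothetical; the value n = 45 is not stated on the slide but is the peptide count for which (n-1)n/2 gives the quoted 990 pairs.

```python
import math


def peptide_pairs(n):
    """Number of ways to cross-link n peptides pairwise: (n-1)n/2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


def peptides_needed(pairs):
    """Smallest n for which peptide_pairs(n) reaches at least `pairs`."""
    n = (1 + math.isqrt(1 + 8 * pairs)) // 2
    while peptide_pairs(n) < pairs:
        n += 1
    return n


pairs_for_45 = peptide_pairs(45)          # matches the 990 pairs quoted above
n_for_million = peptides_needed(10 ** 6)  # peptides giving about 10^6 pairs
```

Under this formula, roughly 1400 reactive peptides are enough to reach the order of 10^6 pairs, which illustrates why limiting the number of possible reactions matters for large assemblies.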

Page 47:

Cross-linking

Mass spectrometers have a limited dynamic range, and it is therefore important to limit the number of possible reactions so as not to dilute the cross-linked peptides.

For identification of a cross-linked peptide pair, both peptides have to be sufficiently long and have to give informative fragmentation.

High mass accuracy MS/MS is recommended because the spectrum will be a mixture of fragment ions from two peptides.

Because the cross-linked peptides are often large, CAD is not ideal; ETD is recommended instead.

Page 48:

Search Results

Page 49:

Search Results

Page 50:

Search Results

Page 51:

GPMDB

Page 52:

Sequence-spectrum assignments in GPMDB

[Figure: Assigned spectra (0 to 250,000,000) against Year (as of Jan 1st), 2005 to 2011.]

Page 53:

Human Genes Observed in GPMDB

[Figure: bar chart of % genes (0 to 100) for chromatin, cytoskeleton, E.R., Golgi, lysosome, mitochondrion, nuclear membrane, plasma membrane and ribosome.]

Page 54:

Proteotypic peptide relative composition

[Figure: composition difference (percent), -40 to 40, for the amino acids N G P D E A V I S T L Y M F H Q K C R W.]

Page 55:

Comparison with GPMDB

Most proteins show very reproducible peptide patterns

Page 56:

Comparison with GPMDB

Page 57:

Global frequency of observing a peptide

Peptide Sequence          Observations
FSTVAGESGSADTVR           2633
FNTANDDNVTQVR             2432
AFYVNVLNEEQR              1722
LVNANGEAVYCK              1701
GPLLVQDVVFTDEMAHFDR       1637
LSQEDPDYGIR               1560
LFAYPDTHR                 1499
NLSVEDAAR                 1400
FYTEDGNWDLVGNNTPIFFIR     1386
ADVLTTGAGNPVGDK           1338

Page 58:

Global frequency of observing a peptide

If the number of times a peptide sequence (i) has been observed is $n_i$, then for a particular protein:

$$N_{\text{total}} = \sum_i n_i$$

Page 59:

Global frequency of observing a peptide (ω)

Define a normalized global frequency of observation for a particular peptide sequence from a particular protein as:

$$\omega_i = \frac{n_i}{N_{\text{total}}}$$
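The normalization can be sketched in a few lines. The function name `global_frequencies` is hypothetical. The counts are the ten peptide observation counts listed in this deck; because the sum here runs only over those ten peptides rather than over every peptide observed for the protein, the resulting values are larger than the ω values tabulated in the deck.

```python
def global_frequencies(counts):
    """Map each peptide to omega_i = n_i / N_total."""
    n_total = sum(counts.values())
    if n_total == 0:
        raise ValueError("at least one observation is required")
    return {peptide: n / n_total for peptide, n in counts.items()}


# Observation counts for the ten listed peptides (a partial list, so
# N_total here is smaller than the protein's true total).
counts = {
    "FSTVAGESGSADTVR": 2633,
    "FNTANDDNVTQVR": 2432,
    "AFYVNVLNEEQR": 1722,
    "LVNANGEAVYCK": 1701,
    "GPLLVQDVVFTDEMAHFDR": 1637,
    "LSQEDPDYGIR": 1560,
    "LFAYPDTHR": 1499,
    "NLSVEDAAR": 1400,
    "FYTEDGNWDLVGNNTPIFFIR": 1386,
    "ADVLTTGAGNPVGDK": 1338,
}
omega = global_frequencies(counts)
```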

Page 60:

Global frequency of observation (ω), catalase

Peptide Sequence          ω
FSTVAGESGSADTVR           0.08
FNTANDDNVTQVR             0.07
AFYVNVLNEEQR              0.05
LVNANGEAVYCK              0.05
GPLLVQDVVFTDEMAHFDR       0.05
LSQEDPDYGIR               0.04
LFAYPDTHR                 0.04
NLSVEDAAR                 0.04
FYTEDGNWDLVGNNTPIFFIR     0.04
ADVLTTGAGNPVGDK           0.04

Page 61:

Global frequency of observation (ω), catalase

[Figure: ω (0.00 to 0.08) for peptide sequences 1 to 20.]

Page 62:

Omega (Ω) value for a protein identification

For any set of peptides observed in an experiment assigned to a particular protein (1 to j):

$$\Omega(\text{protein}) = \sum_j \omega_j$$

$$\Omega(\text{protein}) \le 1$$
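As a worked illustration, Ω for a protein is the sum of the ω values of the peptides observed for it in one experiment. The sketch below uses three of the catalase ω values tabulated in this deck; the function name `protein_omega` and the choice to let peptides without a known ω contribute nothing are assumptions.

```python
def protein_omega(omega, observed_peptides):
    """Sum omega over the distinct peptides observed for one protein."""
    # Peptides with no known omega contribute nothing (an assumption).
    return sum(omega[p] for p in set(observed_peptides) if p in omega)


# Catalase omega values as tabulated in this deck.
catalase_omega = {
    "FSTVAGESGSADTVR": 0.08,
    "FNTANDDNVTQVR": 0.07,
    "AFYVNVLNEEQR": 0.05,
}
observed = ["FSTVAGESGSADTVR", "FNTANDDNVTQVR", "AFYVNVLNEEQR"]
omega_catalase = protein_omega(catalase_omega, observed)
```

Observing these three peptides gives Ω of about 0.20; a peptide seen several times in the same experiment is counted once.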

Page 63:

Protein Ω’s for a set of identifications

Protein ID    Ω (z=2)    Ω (z=3)
SERPINB1      0.88       0.82
SNRPD1        0.88       0.59
CFL1          0.81       0.87
SNRPE         0.8        0.81
PPIA          0.79       0.64
CSTA          0.79       0.36
PFN1          0.76       0.61
CAT           0.71       0.78
GLRX          0.66       0.8
CALM1         0.62       0.76
FABP5         0.57       0.17

Page 64:

Proteomics Consultation

Part of Best Practices Integrative Informatics Consultation Service (BPIC) at the NYU Center for Health Informatics and Bioinformatics (CHIBI)

[email protected]

[email protected]

Walk-in Clinic: Wednesday, February 23, 3-5 pm

227 E 30th Street, 7th Floor, Room #739

Page 65:

Proteomics Informatics Workshop
Part III: Protein Quantitation

February 25, 2011

• Metabolic labeling – SILAC
• Chemical labeling
• Label-free quantitation
• Spectrum counting
• Stoichiometry
• Protein processing and degradation
• Biomarker discovery and verification

Page 66:

Proteomics Informatics Workshop

Part I: Protein Identification, February 4, 2011

Part II: Protein Characterization, February 18, 2011

Part III: Protein Quantitation, February 25, 2011