welcome to: case studies in quantitative proteomics · chromatin proteomics (jake jaffe)...

Welcome to:

Case Studies in Quantitative Proteomics

ASMS 2018

San Diego

1

Schedule

2

Time Day 1 Day 2

9:00-9:30Intro to Protein Quantification (Mike

MacCoss)

Case Study 4: Is my mass spectrometer

working? System Suitability. How we

assess this? (Pino)9:30-10:15

10:15-10:45 Coffee Break Coffee Break

10:45-11:15Introduction to Skyline (Brendan

MacLean)Intro to DIA (MacCoss)

11:15-12:00

Case Study 1: Label free targeted assay

development in a rodent model of heart

failure (Brendan MacLean)

Case Study 5: When PRM is not

enough -- hybrid assays (Jake Jaffe)

12:00-12:30Lunch Lunch

12:30-13:00

13:00-13:30

What happens when an experiment is

designed poorly: Case studies from failed

studies (Olga Vitek)

Case Study 6: When do you need a

PRM assay -- a case study in

Chromatin proteomics (Jake Jaffe)13:30-14:00

14:15-14:45 Coffee Break Coffee Break

14:45-15:15

Case Study 2: Group Comparsions in

Skyline (Brendan MacLean)

I have peak areas, what do I do now?

Intro to MSStats (Olga)14:15-15:45

15:45-16:15

Case Study 3: ABRF iPRG MS1

Quantification in Skyline (Brendan

MacLean)

Hands on Analysis of Proteomics

Datasets (Meena)16:15-16:45

What you should expect

• You should expect to be challenged!

– There should be material that you know but also there should

be material that we present that you don’t know.

• You should expect to learn the fundamentals of targeted

proteomics.

• You should learn that quantitative proteomics is not possible

without extensive quality control.

– We will teach you ways that have worked for us.

• You should learn how we have handled real-life problems.

• You should have a good introduction to the use of Skyline for

targeted proteomics.

• You may hear contradictory things from different instructors.

• The methods presented are not meant to be comprehensive. We

will present strategies we used to solve problems in our labs.

• You should expect to have fun!!

3

What we expect.

• We expect you to challenge us!

– You won’t hurt our feelings – we have thick skin. We

want feedback and also to hear that you want more.

• We expect different course participants to come

from different backgrounds.

– We expect to learn from you!

• We expect you to participate. The more you do,

the more we will all get out of the course.

– Don’t hesitate to ask a question. Feel free to interrupt

us.

4

Introduction to Quantitative Analysis of

Proteins

5

Selective Pressure is Provided on the Protein and not the Transcript

Schrimpf et al. PLOS Bio. 2009

Problems with clinical immunoassays

Single-plex

Poor standardization

Hook effect

Anti-reagent antibodies

Autoantibodies

Why Use Mass Spectrometry?

Hoofnagle and Wener, J Immunol Methods (2009)

Can We Multiplex Immunoassays?

Bead ID

Protein quantification

Luminex beads


vs.

Not easily.


Traditional Automated Immunoassays

Capture antibody selects sandwiches

detected by enzyme-conjugated secondary

Generally one at a time

Poor Standardization

Immunoassays start with epitope selection

Whole antigen, domain, or peptide

Different assays use different epitopes

Tumor antigens vary in post-translational modifications

Result: Variable results between assay platforms

Even with an international standard (e.g. BCR-457)

Hook Effect

Too much antigen can lead to negative result

Schiettecatte, Johan; Anckaert, Ellen; Smitz, Johan (2012-03-23). Interferences in Immunoassays. InTech.

https://en.wikipedia.org/wiki/Hook_effect

Autoantibody Interference

Autoantibodies can sequester antigen away from

detectable complexes

Falsely negative results with sandwich assays

Direct detection of analyte by mass spectrometry

Interlaboratory calibration

Improved specificity

No sandwiches formed

No hook effect

Saturation of signal – plateau of calibration curve

Digestion of all proteins

Destroy autoantibodies

Destroy anti-reagent antibodies

Mass SpectrometryA Potential Solution

Quantitative Analysis 101

Introduction Quantitative Analysis - Michael J. MacCoss 16

Amount (moles)

Measure

d I

nte

nsity (

Are

a)

LOD

LLOQ

ULOQ

Linear

Dynamic

Range

LOD: Limit of Detection

LLOQ: Lower Limit of Quantitation

ULOQ: Upper Limit of Quantitation

Quantitative Analysis 101: Using an Internal Std


Amount (moles)

Inte

nsity R

atio (

Are

a R

atio)

LOD

LLOQ

ULOQ

Linear

Dynamic

Range




Use of calibration curve to convert signal to quantity

18

Analyte (m/z = M)

Where: ki= is the slope of the standard curve

Ab 0 Area from a blank

−=

k

AAn b

Inte

nsity M

Inte

nsity

timem/z

bAnkA +=

A

Known AmountM

easure

d S

ignal (A

rea)

𝐴 ∝ 𝑚𝑜𝑙𝑒𝑠

Use of calibration curve to convert signal to quantity

19

Sample (m/z = M) Spiked with

Internal Standard (m/z = M+i)

Where: k0/i= is the slope of the standard curve

Ri 0 Area ratio from a blank … only internal std

−=

i

i

k

RRn

/0

00

Inte

nsity

M

M+ i

Inte

nsity

timem/z

R0 =A0

Ai

ii RnkR += 0/00

n0

ni

Known AmountMeasure

d S

ignal (A

rea R

atio)

What does it mean if these are the two points?

20

Amount (moles)

Measure

d I

nte

nsity (

Are

a)

LOD

LLOQ

ULOQ

Linear

Dynamic

Range




Questions:

• Design an approach to confirm linearity and

LLOQ for peptides in a complex matrix?

– Assume you have < 50 peptides in your assay.

– Assume you have >1000 peptides in your assay?

• Design an approach to “calibrate” your method

– The purpose of the calibration is:

• To correct for differences between instruments and platforms

• Minimize between day batch effects

• Enable another lab to get similar results to your lab

21

Method to measure both LOQ and Linearity

• Possible Samples to Use as a Diluent Matrix

– Stable isotope labeled version of the matrix.

• 15N or SILAC labeled cells

– A diverged species

• For human plasma we use chicken plasma.

22

Diluent

Sample

Matrix

Pooled

Reference

Sample

Matrix

Reference Yeast BY4742 Diluted in 15N Yeast (S288c)

23

Fractional Dilution

0 0.2 0.4 0.6 0.8 1

Pe

ak a

rea

(1e

9)

0

0.5

1

1.5

2Protein SSA1 (2.69E+05 copies/cell)

Fractional Dilution

0 0.2 0.4 0.6 0.8P

eak

are

a (1

e7

)

0

1.0

1.2

0.6

0.8

Protein LSM2 (2.21E+03 copies/cell)

0.2

0.4

Mixing two samples of extreme conditions

• Assume that you have a healthy and a disease

cohort.

• You can take a pool of the healthy and a pool of

the disease to demonstrate that the

measurement is linear between the two

conditions

• Example

– 100% condition A

– 80% condition A, 20% condition B



– 100% condition B

24

Sample

Sample + Spike

Known SpikePre-Existing

Femtomoles Peptide

-6 -4 -2 0 2 4 6 8 10 12 14 16

0

20000

40000

60000

80000

100000

120000

AU

C (

Counts

)

Method of Standard Addition

0

2500

5000

7500

10000

12500

15000

Counts

Time

Sample

Sample + Spike

Can’t assess Limit of Quantification. Can only

assess whether the sample us above the LOQ

OTHER THINGS TO CONSIDER

26

We are not doing protein quantitation we are

doing peptide quantitation!

27

P

PP

P

Level 1

Level 2

Level 3

Level 4

Level 5

Peptide Intensity Varies Over

Course of Digestion

Agger, ClinChem, 2010

Do you know the enrichment of the standard?

29

Peptide: DIDIEYHQNK N = 15

1270 1275 1280 1285 1290 1295

Rela

tive A

bundance

m/z

100 ape 15N

1:1

1270 1275 1280 1285 1290 1295

Rela

tive A

bundance

m/z

98 ape 15N

1:1

1270 1275 1280 1285 1290 1295

Rela

tive A

bundance

m/z

95 ape 15N

1:1

1270 1275 1280 1285 1290 1295

Rela

tive A

bundance

m/z

90 ape 15N

1:1

Do you know the enrichment of the standard?


Peptide: DIDIEYHQNK N = 15

1270 1275 1280 1285 1290 1295

Rela

tive A

bundance

m/z

100 ape 15N

1:1

1270 1275 1280 1285 1290 1295

Rela

tive A

bundance

m/z

98 ape 15N

1:1

1270 1275 1280 1285 1290 1295

Rela

tive A

bundance

m/z

95 ape 15N

1:1

1270 1275 1280 1285 1290 1295

Rela

tive A

bundance

m/z

90 ape 15N

1:1

Effect of 13C Labeling on Quantitative Accuracy


628 629 630 631 632 633 634 635 6360

20

40

60

80

100

Rela

tive A

bun

dance

m/z

YAGILDCICATFK

9 x 13C at >99.9 APE

Effect of 13C Labeling on Quantitative Accuracy


YAGILDCICATFK

9 x 13C at >99.9 APE

628 629 630 631 632 633 634 635 6360

20

40

60

80

100

Rela

tive A

bun

dance

m/z

Effect of 98 ape Labeling on 2H8-ICAT Accuracy


814 815 816 817 818 819 820 821 8220

20

40

60

80

100

Re

lative

Ab

un

da

nce

m/z

FGTWQCICATLMR

0.984

1885 1886 1887 1888 1889 1890 1891 1892 1893 18940

20

40

60

80

100

Rela

tive A

bundance

m/z

SYIEGTAVSQADFGLGS

GHCICATQALDPWVTVFK

0.968

1:1 Mole Ratio 1:1 Mole Ratio

34

Challenges with Discovery Proteomics

and Possible Alternatives

µLC

Ab

un

da

nce

Time

Shotgun Proteomics

Protein

Mixture

proteolysis

Peptide

Mixture

Ab

un

da

nc

e

m/z

MS

MS/MS1

Ab

un

da

nce

m/z

1

2

Ab

un

da

nce

m/z

2

3

Ab

un

da

nc

e

m/z

3

These Mixtures are Complicated!

Time (Min)

m/z

MS ScanMS/MS MS/MS MS/MS

These Mixtures are Complicated!

Time (Min)

m/z

Acquisition methods in shotgun LC-MS/MS

Data Dependent Acquisition (DDA)

Liquid chromatography

MS/MS analysisIsolation &

fragmentationMS analysis

Sampling Every Precursor Using DDA

• To get every precursor using DDA, we would have to collect 121 MS/MS spectra on average per chromatographic peak

200

150

50

100

0Scan index

300

250

Number of precursors from m/z 400 to 1,000

Nu

mb

er

of

pre

curs

ors

Average: 120.58

Data Dependent Acquisition will Always

Undersample

Data Dependent Acquisition will Always

Under Sample

Red: Detected Persistent Peptide Isotope Distributions

Blue: MS/MS Spectra

Green: Peptide Identifications

DDA always misses and is least

reproducible on the low abundant signals

Imagine Visiting Kyoto and Visiting the Tallest Buildings First

43

Analogy Nate Yates

You would miss …

44

If you wouldn’t learn much about Kyoto by going to the tallest

structures why would we use this approach to understand protein

mixtures???

So what do we do?

45

Build on Other People’s Knowledge

So what do we do?

46

So what do we do?

47

So what do we do?

48

C. elegans as an In Vivo Model System for Proteomics

The only multicellular

organism with a genome

sequenced to completeness

Complete Lineage for all 959 Somatic Cells

Insulin

ReceptorDAF-2

IRS1-4 IST-1

pi 3-kinase AGE-1

PDK1 PDK-1

AKT/PKB AKT-1/AKT-2

FKHRL1 DAF-16FKHR AFX

Humans C. elegans

Soluble Proteins Insoluble Proteins

Density Centrifugation

10 Fractions

Reversed

Phase

10 Fractions

(NH4)2SO4

Cuts

8 Fractions

10 MudPITs 8 MudPITs 10 MudPITs

>6,000,000 MS/MS Spectra, 67,047 Unique Peptide

Identifications, and 6,779 Unique Protein Identifications

0 Proteins Identified that are

Known to be Involved in the Insulin

/ IGF-1 Signaling Pathway

Fractionation to Improve the Coverage of Proteins Involved in Insulin / IGF-1 Signaling

200 400 600 800 10000

20

40

60

80

100

Re

lative

Ab

un

da

nce

m/z

521.0 521.5 522.0 522.5 523.0 523.5 524.00

20

40

60

80

100

Re

lative

Ab

un

da

nce

m/z

Ion

Source

CID

521.7 757.6

Q1 Q3

32 33 34 35 36 37 38 39 40

0

2x104

4x104

6x104

8x104

1x105

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

3x103

6x103

9x103

1x104

2x104

Co

un

ts

Time (min)

32 33 34 35 36 37 38 39 40

0

3x103

6x103

9x103

1x104

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

4x103

8x103

1x104

2x104

Co

un

ts

Time (min)

32 33 34 35 36 37 38 39 40

0

3x103

5x103

8x103

1x104

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

3x104

6x104

9x104

1x105

Co

un

ts

Time (min)

32 33 34 35 36 37 38 39 40

0

2x103

4x103

6x103

8x103

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

1x104

2x104

3x104

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

1x104

2x104

3x104

Counts

Time (min)

Precursor Ion > Product Ion

TGASEAVPSEGK > GK

TGSAEAVPSEGK > EGK

TGSAEAVPSEGK > SEGK

TGSAEAVPSEGK > PSEGK

TGSAEAVPSEGK > VPSEGK

TGSAEAVPSEGK > AVPSEGK

TGSAEAVPSEGK > EAVPSEGK

TGSAEAVPSEGK > AEAVPSEGK

TGSAEAVPSEGK > SAEAVPSEGK

566.78 > 204.13

566.78 > 333.18

566.78 > 420.21

566.78 > 517.26

566.78 > 616.33

566.78 > 687.37

566.78 > 816.41

566.78 > 903.44

566.78 > 1031.5

y2

y3

y4

y5

y6

y7

y8

y9

y10

SRM Chromatograms

Acquisition methods in shotgun LC-MS/MS

Data Dependent Acquisition (DDA)

Selected/Parallel Reaction Monitoring (S/PRM)

MS/MS analysisIsolation &

fragmentationMS analysis

Liquid chromatography

20 21 22 23 24 250

20

40

60

80

100

Re

lative

Ab

un

da

nce

Time (min)

Synth

etic P

eptide S

tandard

s

20 21 22 23 24 250

20

40

60

80

100

Re

lative

Ab

un

da

nce

Time (min)

Worm

Hom

ogenate

25 µ

g

20 21 22 23 24 250

20

40

60

80

100

Re

lative

Ab

un

da

nce

Time (min)

20 21 22 23 24 250

20

40

60

80

100

Re

lative

Ab

un

da

nce

Time (min)

75 attomoleIDATTHIGGVQIK

7.5 femtomoleQEDVVIEGWLHK

IDATTHIGGVQIK

DAF-16QEDVVIEGWLHK

AKT-1

0 20 40 60 80 100

0.0

5.0x106

1.0x107

1.5x107

2.0x107

2.5x107

3.0x107

3.5x107

4.0x107

Peak A

rea

Moles Injected (fmol)

Dilution of the Yeast Protein PGI1 in Undepleted Human Plasma

Peptide: ILFDYSK

R = 0.99992


Peptide: ILFDYSK

R = 0.99992

0 5 10 15

0

1x106

2x106

3x106

4x106

5x106

Peak A

rea



Peptide: ILFDYSK

R = 0.99992

0.0 0.2 0.4 0.6 0.8 1.0

0

1x105

2x105

3x105

4x105P

eak A

rea


12 Attomoles of ILFDYSK on Column in 10 µg of Unfractionated Human Plasma

27 28 29 30 31 32 330.0

2.0x103

4.0x103

6.0x103

8.0x103

1.0x104

1.2x104

1.4x104

Inte

nsity

Retention Time (min)

*

SRM of Ceruplasmin – ETFTYEWTVPK

30 31 32 33 34 35 36 37 38 39 40

0.0

5.0x105

1.0x106

1.5x106

2.0x106

2.5x106

3.0x106

3.5x106

Counts

Time (min)

SRM Transitions

700.84 > 244.17

700.84 > 759.4

700.84 > 922.47

SRM Triggered

Product Ion Spectrum

200 400 600 800 1000 12000

20

40

60

80

100

Rela

tive A

bu

nd

an

ce

Time (min)

Ceruplasmin – ETFTYEWTVPK

1170.9

1023.7

922.5759.31630.3443.1

119.8

243.9

461.1343.0 y9y8

y7y6y5y4

y3

y2

32 33 34 35 36 37 38 39 40

0

2x105

4x105

6x105

8x105

1x106

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

2x104

4x104

6x104

8x104

1x105

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

3x103

6x103

9x103

1x104

2x104

Co

un

ts

Time (min)

32 33 34 35 36 37 38 39 40

0

3x103

6x103

9x103

1x104

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

4x103

8x103

1x104

2x104

Co

un

ts

Time (min)

32 33 34 35 36 37 38 39 40

0

3x103

5x103

8x103

1x104

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

3x104

6x104

9x104

1x105

Co

un

ts

Time (min)

32 33 34 35 36 37 38 39 40

0

2x103

4x103

6x103

8x103

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

1x104

2x104

3x104

Counts

Time (min)

32 33 34 35 36 37 38 39 40

0

1x104

2x104

3x104

Counts

Time (min)

Chromogranin A

TGASEAVPSEGK

566.78 > 147.11

566.78 > 204.13

566.78 > 333.18

566.78 > 420.21

566.78 > 517.26

566.78 > 616.33

566.78 > 687.37

566.78 > 816.41

566.78 > 903.44

566.78 > 1031.5

32 33 34 35 36 37 38 39 40

0.0

2.0x104

4.0x104

6.0x104

8.0x104

1.0x105

1.2x105

Counts

Time (min)

y1

y2

y3

y4

y5

y6

y7

y8

y9

y10

SRM Triggered

Product Ion Spectrum

Chromogranin A is ~10 ng/mL in plasma

Anderson et al. J. Physiology 2005

200 400 600 800 10000

20

40

60

80

100

Re

lative

Ab

un

da

nce

Time (min)

Chromogranin A

TGASEAVPSEGK

566.78 > 147.11

566.78 > 204.13

566.78 > 333.18

566.78 > 420.21

566.78 > 517.26

566.78 > 616.33

566.78 > 687.37

566.78 > 816.41

566.78 > 903.44

566.78 > 1031.5

y1

y2

y3

y4

y5

y6

y7

y8

y9

y10

85.9

146.9

218.0

401.8

494.9

574.9639.4

741.2854.3

967.42

696.1

331.1

y1

y2 y3 y4 y5 y6

y7 y8 y9

y10

Should we be putting in all this effort toward measuring reliable peptide peak areas?

Spike-in Validation Experiment

Human4µg 0.1µg

E. coli

Human4µg

0.8µgE. coli

4 replicates

con

tro

ls

8x change

#Proteins

Detected

#E. coli

Proteins

#Human

Proteins

#E. coli

peptides

#Human

peptides

Spectra /

PSMs at q-

value < 0.01

2073 507 1566 1633 2346 172241; 29163 (16.9%)

Human E. coli

Log2 Protein Fold Change

Protein ID Summary

Relative Quantitation

4 replicates

MS1 Filtering compared with Spectral Counting

4 orders of magnitudeof dynamic range

2.5 orders of magnitudeof dynamic range

MS1 Filtering and Spectral Counting Fold-Change Distributions

Crawdad – AUC quantitation

Raw Ratio: n1 / n2

NSAF Ratio: ratio of SPC normalized by protein length

f = 0.5

𝑅𝑆𝐶 = log2𝑛2 + 𝑓

𝑛1 + 𝑓+ log2

𝑡1 − 𝑛1 + 𝑓

𝑡2 − 𝑛2 + 𝑓

E. Coli Proteins(8-fold spike-in

difference)

Limited to proteins where SPC for both1x, 8x sets are > 0

Measured Protein Fold-Change Ratios: MS1 Filtering vs. Spectral Counting

E. coli protein

Human protein

E. coli protein

Human protein

8x change

1x change

8x change

1x change

Measured Protein Fold-Change Ratios: MS1 Filtering vs. Spectral Counting

E. coli protein

Human protein

E. coli protein

Human protein

8x change

1x change

8x change

1x change

Limited to Proteins with >= 10 MS/MS Spectra

Summary

• Immunoassays are an imperfect solution to the protein quantitation

problem.

• Add the internal standard as early as possible to account for sample

preparation errors

• Systematic errors do occur and need to be corrected.

• High sequence coverage improves the comprehensiveness of shotgun

proteomics data

• Know the isotope enrichment of the internal standard.

– Can be measured by a number of different techniques

• 13C labeled internal standards aren’t perfect and require special

considerations to get accurate isotopomer ratios

• Think about how you are going to calculate the significance before you

do the study. Get help early on if you are not sure.

• Is your sample size and experimental design going to answer your

questions?

• Don’t tolerate experiments with an N of 1!

70

Schedule

71

Time Day 1 Day 29:00-9:30

Intro to Protein Quantification (Mike

MacCoss)

Intro to Data Independent

Acquisition (Mike MacCoss)9:30-10:00

10:00-10:30 Principles of Study Design:

Possibly examples of failed

studies? (Olga Vitek)10:30-11:00

Introduction to Skyline (Brendan

MacLean)

11:00-11:30

Case Study 1a: Targeted Method

Refinement (Brendan MacLean)

Case Study 3b: Using Sentinel

Profiles for Drug Characterization:

Clustering and connectivity (Jake

Jaffe)11:30-12:00

12:00-12:30Lunch Lunch

12:30-13:00

13:00-13:30 Case Study 1b: Group Comparsions

in Skyline (Brendan MacLean)Intro to MSstats (Olga Vitek)

13:30-14:00

14:00-14:30 Case Study 2: MS1 Quantification in

Skyline (Brendan MacLean)

Statistical Analysis of Proteomics

Datasets (Olga Vitek)14:30-15:00

15:00-15:30Case Study 3a: Intro to Sentinel

Assay Concept, Development of

P100 Discovery Data Set (Jake Jaffe)

System Suitability and Quality

Control. How we assess this,

AutoQC/Panorama/MSStatsQC

(Lindsay/Eralp)15:30-16:00

welcome to: case studies in quantitative proteomics · chromatin proteomics (jake jaffe)...

Documents