Proteomics Informatics – Protein identification III: de novo sequencing (Week 6)

Upload: emi-washington

Post on 04-Jan-2016


Page 1: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Proteomics Informatics – Protein identification III: de novo sequencing (Week 6)

Page 2: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

De Novo Sequencing of MS Spectra

Only a manually confirmed spectrum is a correct spectrum

Beatrix Ueberheide, March 12th 2013

Page 3: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Biological Mass Spectrometry

[Figure: workflow. Protein(s) → proteolytic digestion → peptides → mass spectrometer. MS: base peak chromatogram (x-axis: time, min). MS/MS: fragment spectrum (x-axis: m/z), interpreted by database search or manual interpretation.]

Page 4: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Peptide Sequencing using Mass Spectrometry

[Figure: MS/MS spectrum of a peptide, drawn on the slide as KLEDEELFGS; x-axis m/z (250 to 1000), y-axis % relative abundance (0 to 100).]

Page 5: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Peptide Sequencing using Mass Spectrometry

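b ions (residue labels as drawn on the slide, each with the m/z printed next to it):

K 1166, L 1020, E 907, D 778, E 663, E 534, L 405, F 292, G 145, S 88

[Figure: MS/MS spectrum; x-axis m/z (250 to 1000), y-axis % relative abundance (0 to 100).]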

Page 6: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Peptide Sequencing using Mass Spectrometry

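y ions (residue labels as drawn on the slide, each with the m/z printed next to it):

K 147, L 260, E 389, D 504, E 633, E 762, L 875, F 1022, G 1080, S 1166

[Figure: MS/MS spectrum; x-axis m/z (250 to 1000), y-axis % relative abundance (0 to 100).]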

Page 7: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Peptide Sequencing using Mass Spectrometry

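Residue (as drawn):  K     L     E     D     E     E     L     F     G     S
y ions:              147   260   389   504   633   762   875   1022  1080  1166
b ions:              1166  1020  907   778   663   534   405   292   145   88

The peptide is drawn with its C-terminal K on the left (y1 = 147 = 128 + 19 for K; b1 = 88 = 87 + 1 for S), so the N-to-C sequence reads SGFLEEDELK. 1166 is the [M+H]+1 of the intact peptide.

[Figure: annotated MS/MS spectrum; x-axis m/z (250 to 1000), y-axis % relative abundance (0 to 100). Labeled peaks: [M+2H]2+; 260, 389, 504, 633, 762, 875, 1022, 1080; 292, 405, 534, 663, 778, 907, 1020.]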

Page 8: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Peptide Sequencing using Mass Spectrometry

[Figure: the same annotated spectrum and b/y ion ladder as on the previous slide, with a mass difference of 113 marked twice (113 is the nominal residue mass of L or I).]

Page 9: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Peptide Sequencing using Mass Spectrometry

[Figure: the same annotated spectrum and b/y ion ladder, with a mass difference of 129 marked twice (129 is the nominal residue mass of E).]

Page 13: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

How to Sequence: CAD

Residue Mass (RM)

The very first N- and C-terminal fragment ions are not just their corresponding residue masses. The peptide's N- or C-terminus has to be taken into account:

b ion: b1 = RM + 1

y ion: y1 = RM + 19

Page 14: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Example of how to calculate theoretical fragment ions

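Residue:  S    A    M    P    L    E    R
b ions:   88   159  290  387  500  629  803
y ions:   803  716  645  514  417  304  175

(803 is the [M+H]+1 of the intact peptide. Annotations on the slide: residue mass, the first b ion, the first y ion.)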

Page 15: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

How to calculate theoretical fragment ions

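Residue:  S    A    M    P    L    E    R
b ions:   88   159  290  387  500  629  803
y ions:   803  716  645  514  417  304  175

The first b ion: RM + 1. Each following b ion: + RM. The last step, + RM + 18, gives the [M+H]+1 (803).

The first y ion: RM + 19. Each following y ion: + RM, ending at the [M+H]+1 (803).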
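The two rules (first b ion = RM + 1, first y ion = RM + 19, every later ion adds one residue mass) can be sketched in a few lines of Python. This is only an illustration with nominal residue masses; the table `NOMINAL_RM` and the function names are assumptions made for this example and do not come from the slides.

```python
# Nominal (integer) residue masses, as used for manual sequencing.
NOMINAL_RM = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def b_ions(seq):
    """Singly charged b ions: b1 = RM + 1, each next ion adds one residue mass."""
    ions, total = [], 1
    for aa in seq[:-1]:          # the full-length peptide is not a fragment
        total += NOMINAL_RM[aa]
        ions.append(total)
    return ions

def y_ions(seq):
    """Singly charged y ions: y1 = RM + 19, built up from the C-terminus."""
    ions, total = [], 19
    for aa in reversed(seq[1:]):
        total += NOMINAL_RM[aa]
        ions.append(total)
    return ions

def mh_plus(seq):
    """Nominal [M+H]+1: residue masses + 18 (water) + 1 (proton)."""
    return sum(NOMINAL_RM[aa] for aa in seq) + 19

if __name__ == "__main__":
    print(b_ions("SAMPLER"))   # [88, 159, 290, 387, 500, 629]
    print(y_ions("SAMPLER"))   # [175, 304, 417, 514, 645, 716]
    print(mh_plus("SAMPLER"))  # 803
```

The printed values match the SAMPLER numbers on the slide, which is a quick way to check a hand calculation.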

Page 16: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Finding ‘pairs’ and ‘biggest’ ions

If trypsin was used for digestion, one can assume that the peptide terminates in K or R. Therefore the biggest observable b ion should be:

Mass of peptide [M+H]+1 - 128 (K) - 18
Mass of peptide [M+H]+1 - 156 (R) - 18

y ions are truncated peptides. Therefore subtract a residue mass from the parent ion [M+H]+1. The highest possible ion could be at [M+H]+1 - 57 (G) and the lowest possible ion at [M+H]+1 - 186 (W).

b and y ion pairs: Complementary b and y ions should add up to the mass of the intact peptide. Because both the b and the y ion carry 1 H+, their sum is 1 H+ too high, therefore: b (m/z) + y (m/z) - 1 = [M+H]+1

Check the SAMPLER example
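As a hedged sketch of that check, the helpers below apply the three rules from this slide to a nominal [M+H]+1. The function names are made up for this example; the numbers 128, 156, 18, 57 and 186 are the nominal masses quoted on the slide.

```python
def biggest_b_candidates(mh):
    """Largest b ion of a tryptic peptide: [M+H]+1 minus C-terminal K or R, minus water."""
    return {"K": mh - 128 - 18, "R": mh - 156 - 18}

def biggest_y_window(mh):
    """Range for the largest y ion: loss of W (186) up to loss of G (57)."""
    return (mh - 186, mh - 57)

def partner(mh, ion):
    """Complementary ion of a b/y pair, from b + y - 1 = [M+H]+1."""
    return mh + 1 - ion

if __name__ == "__main__":
    mh = 803  # nominal [M+H]+1 of SAMPLER
    print(biggest_b_candidates(mh))  # {'K': 657, 'R': 629}
    print(biggest_y_window(mh))      # (617, 746)
    print(partner(mh, 88))           # 716
```

For SAMPLER the R candidate (629) is the observed b6, the largest y ion (716) falls inside the window, and b1 = 88 pairs with y6 = 716.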

Page 17: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

How to start sequencing

• Know the charge of the peptide
• Know the sample treatment (i.e. alkylation, other derivatizations that could change the mass of amino acids)
• Know what enzyme was used for digestion
• Calculate the [M+1H]+1 charge state of the peptide
• Find and exclude non-sequence-type ions (i.e. unreacted precursor, neutral loss from the parent ion, neutral loss from fragment ions)
• Try to find the biggest y or b ion in the spectrum. Note: if you used trypsin, your C-terminal ion should end in lysine or arginine
• Try to find sequence ions by finding b/y pairs
• You can usually conclude you found the correct sequence if you can explain the major ions in a spectrum

Page 18: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Commonly observed neutral losses and mass additions:

• Ammonia -17
• Water -18
• Carbon monoxide from b ions -28
• Phosphoric acid from phosphorylated serine and threonine -98
• Carbamidomethyl modification on cysteines upon alkylation with iodoacetamide +57
• Oxidation of methionine +16

Calculate with nominal masses during sequencing, but use the monoisotopic masses to check if the parent mass fits. For high-resolution MS/MS, check that the residue mass difference is correct.
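To illustrate the nominal versus monoisotopic distinction, here is a small sketch that computes a monoisotopic [M+H]+1 and compares it with the nominal value used while sequencing. The mass table holds standard monoisotopic residue masses rounded to five decimals; the function name is an assumption for this example.

```python
# Standard monoisotopic residue masses (Da), rounded to 5 decimals.
MONO_RM = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # H+

def mono_mh(seq):
    """Monoisotopic [M+H]+1: residue masses + water + one proton."""
    return sum(MONO_RM[aa] for aa in seq) + WATER + PROTON

if __name__ == "__main__":
    mh = mono_mh("SAMPLER")
    # Nominal sequencing gives 803; the monoisotopic value sits about 0.4 above it.
    print(round(mh, 3))
```

The gap between the nominal and the monoisotopic value grows with peptide mass, which is why the parent mass should be checked against monoisotopic masses once a candidate sequence is found.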

Page 19: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)
Page 20: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)
Page 21: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Mixed Phospho spectra

[Figure: mixed spectra; panels labeled: unmodified, 1 phospho site, 1 phospho site.]

Page 22: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

First ‘on your own example’

Remember what you need to know first!

Page 23: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

What is the charge state?

• Neutral loss of water?
• Any ions about (z * parent mass)?
• Confirm with b/y pairs!

Neutral loss of water: water = 18, so the loss peak appears 18/z below the precursor m/z (9 for z = 2).
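The water-loss rule can be turned into a one-line estimate of the charge. The sketch below assumes nominal masses and a clean water-loss peak; the function name and the m/z values in the example are hypothetical and are not read from the spectrum on the slide.

```python
def charge_from_water_loss(precursor_mz, loss_peak_mz):
    """Estimate z from the spacing between the precursor and its water-loss peak.

    The neutral loss of water (18) appears 18/z below the precursor m/z,
    so z is approximately 18 / spacing.
    """
    spacing = precursor_mz - loss_peak_mz
    if spacing <= 0:
        raise ValueError("loss peak must lie below the precursor m/z")
    return round(18.0 / spacing)

if __name__ == "__main__":
    print(charge_from_water_loss(722.0, 713.0))    # 2 (spacing of 9)
    print(charge_from_water_loss(1443.0, 1425.0))  # 1 (spacing of 18)
```

As the slide says, the estimate should still be confirmed with b/y pairs.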

Page 24: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Search for ‘biggest ion’

1443 - 18 - RM, where RM is a residue after which the enzyme cleaves:

1443 - 18 - 156 (R) = 1269
1443 - 18 - 128 (K) = 1297

Page 25: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

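[Figure: spectrum annotation. [M+H]+1 = 1443; C-terminal K with y1 = 147 and the complementary biggest b ion at 1297 (147 + 1297 - 1 = 1443).]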

Page 26: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

[Figure: annotation extended by S (87). b ions 1210 and 1297; y ions 147 and 234; [M+H]+1 = 1443.]

Page 27: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

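[Figure: same annotation as on the previous slide (K and S assigned; b ions 1210 and 1297; y ions 147 and 234; [M+H]+1 = 1443).]

Find the biggest y ion!

Peptide mass - RM
Highest possible ion = loss of glycine
Lowest possible ion = loss of tryptophan

Glycine = 1443 - 57 = 1386
Tryptophan = 1443 - 186 = 1257

1443 - 163 = 1280, so the N-terminal residue is Y (biggest y ion at 1280, b1 at 164).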

Page 28: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

And the sequence is……..

Page 29: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)
Page 30: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

What is the difference?

Fewer b ions

A bit of precursor is left

Accurate mass

Page 31: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

What if we do not get good fragmentation?

Page 32: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Try a different mode of dissociation

Page 33: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

ETD

Page 34: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Electron Transfer Dissociation

[Figure: reaction scheme. A fluoranthene anion reacts with a multiply protonated peptide (+ peptide); the peptide backbone then cleaves into fragments labeled c, c'' and z' on the slide.]

Page 35: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Tandem MS - Dissociation Techniques

CAD: Collision Activated Dissociation (b, y ions): increase of internal energy through collisions

ETD: Electron Transfer Dissociation (c, z ions): electron transfer from reagent anions to the peptide cations (radical-driven fragmentation)

Page 36: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

The Prototype Instrument

[Figure: photo of the instrument; labels: HPLC, LTQ front, modified rear / CI source.]

Page 37: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Modifications For Ion/Ion Experiments

[Figure: instrument schematic. Labels: cations from the ESI source; three-section RF linear quadrupole trap; 3 mTorr He; ion detector (1 of 2); NICI source with filament (e-) and methane; ~700 mTorr; anions from the anion precursor (fluoranthene); secondary RF supply, 0-150 Vpeak @ 600 kHz.]

Page 38: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Injection of Positive Ions (ESI)

[Figure: linear trap with front lens, front section, center section, back section and back lens; potentials 0 V and -10 V. Peptide cations are injected and accumulate.]

Page 39: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Precursor Storage in Front Section

[Figure: linear trap with front lens, front section, center section, back section and back lens; potentials 0 V and -10 V.]

Precursor ions moved to front section

Page 40: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Injection of Negative Ions (CI)

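Precursor ions held in front section

Negative reagent ions accumulate in the center section

[Figure: linear trap with front lens, front section, center section, back section and back lens; potentials 0 V and +5 V; cations (+) in the front section, anions (-) in the center section.]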

Page 41: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Charge-Sign Independent Trapping

Pseudo-potential created by +150 Vp, 600 kHz, applied to lenses

Positive and negative ions react while trapped in axial pseudo-potential

[Figure: cations (+) and anions (-) held together at 0 V.]

Page 42: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Charge sign independent radial confinement

[Figure: cations (+) and anions (-) in the trap at 0 V.]

Axial confinement with DC potentials: trapping is charge sign dependent

Page 43: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Charge sign independent axial confinement with combined RF quadrupole and end lens RF pseudo-potentials

[Figure: cations (+) and anions (-) in the trap at 0 V.]

Page 44: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

End ion/ion reactions, prepare for product ion analysis

Reagent anions removed axially

Product cations trapped in center section for scan out

[Figure: trap potentials 0 V and -12 V; anions (-) leaving, cations (+) held in the center section.]

Page 45: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Electron Transfer - Proton Transfer

[Figure: spectrum, x-axis m/z (200 to 1400), y-axis 0 to 100. Fragmentation (ETD) of the +3 precursor.]

Page 46: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Electron Transfer - Proton Transfer

[Figure: two spectra, x-axis m/z (200 to 1400), y-axis 0 to 100. One panel: fragmentation (ETD) of the +3 precursor. Other panel: charge reduction (PTR) of the +3 precursor, with +2 and +1 species labeled.]

Page 47: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

The two types of ion reactions

Page 48: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

[Diagram: ET or ETD gives either intact charge-reduced products, [M + 3H]2+•, or fragmentation products (c, z, etc.). Factors listed with question marks: mass, m/z, charge (z), sequence, temperature, anion, He pressure.]

Page 49: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

[Diagram: ET or ETD gives either intact charge-reduced products, [M + 3H]2+•, or fragmentation products (c, z, etc.); CAD is added to the scheme. Factors listed with question marks: mass, m/z, charge (z), sequence, temperature, anion, He pressure.]

Page 50: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Charge dependence in fragmentation

Page 51: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Gentle off resonance activation

Page 52: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

How to Sequence ETD

Residue Mass (RM)

c1 ion: c1 = RM + 18

z1 ion: z1 = RM + 3

[Figure: structures of the c1 and z1 ions.]

Page 53: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Example of how to calculate theoretical fragment ions

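Residue:  S    A    M    P    L    E    R
c ions:   105  176  307  404  517  646  803
z ions:   803  700  629  498  401  288  159

(803 is the [M+H]+1 of the intact peptide. Annotations on the slide: residue mass, the first c ion, the first z ion.)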
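The same ladder logic used for b and y ions carries over to ETD with different offsets (c1 = RM + 18, z1 = RM + 3). The sketch below is an illustration with nominal residue masses; `NOMINAL_RM` and the function names are assumptions made for this example.

```python
# Nominal (integer) residue masses, as used for manual sequencing.
NOMINAL_RM = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def c_ions(seq):
    """Singly charged c ions: c1 = RM + 18, each next ion adds one residue mass."""
    ions, total = [], 18
    for aa in seq[:-1]:
        total += NOMINAL_RM[aa]
        ions.append(total)
    return ions

def z_ions(seq):
    """Singly charged z ions: z1 = RM + 3, built up from the C-terminus."""
    ions, total = [], 3
    for aa in reversed(seq[1:]):
        total += NOMINAL_RM[aa]
        ions.append(total)
    return ions

if __name__ == "__main__":
    print(c_ions("SAMPLER"))  # [105, 176, 307, 404, 517, 646]
    print(z_ions("SAMPLER"))  # [159, 288, 401, 498, 629, 700]
```

Each c ion and its complementary z ion add up to [M+H]+1 + 2, for example c1 + z6 = 105 + 700 = 805 for SAMPLER ([M+H]+1 = 803).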

Page 54: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Largest c and z ions

Page 55: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Background necessary to know

Show a hand-annotated spectrum from one of my toxin talks.

Show how to calculate a charge state, also in a spectrum; give a spectrum where they should find out the charge state for CAD and then for ETD, and how they confirm that by finding pairs (must have mentioned that in the previous slide for the example). Ask them if they know how to calculate a charge state etc.

Page 56: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Characteristics of CAD and ETD spectra

Show the precursor issue, show the neutral loss from the parent, show the neutral loss from peptides

Page 57: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Sample: Antigens from MHC molecules:

Proline in the 2nd position!

Page 58: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

1. What charge state?

[M + H]+1 = m/z (m = m + H)

Number of Hs = z

Remember to calculate with nominal masses

Page 59: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

(389.5 · 3) – 2 = 1166

(m · z) – (z-1) = [M + H]+1

Page 60: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

(m · z) – (z-1) = [M + H]+1

(389.5 · 3) – 2 = 1166

Check: [M + 2H]+2 = ([M + H]+1 + 1H) / 2 = (1166 + 1) / 2 = 584
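The charge-state arithmetic can be written as two small conversions. This is a sketch that uses a nominal proton mass of 1, as on the slides; the function names are assumptions made for this example. Note that the slides round to whole numbers, while the code keeps the half-integer values.

```python
def mh_from_mz(mz, z):
    """[M+H]+1 from an observed m/z and charge z: (m/z * z) - (z - 1)."""
    return mz * z - (z - 1)

def mz_from_mh(mh, z):
    """m/z of the [M+zH]+z ion from [M+H]+1: ([M+H]+1 + (z - 1)) / z."""
    return (mh + (z - 1)) / z

if __name__ == "__main__":
    mh = mh_from_mz(389.5, 3)
    print(mh)                    # 1166.5 (the slide rounds this to 1166)
    print(mz_from_mh(1166, 2))   # 583.5 (the slide rounds this to 584)
    print(mz_from_mh(mh, 3))     # 389.5, back at the observed m/z
```

If the doubly charged ion predicted this way is present in the spectrum, the assumed charge state of 3 is supported.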

Page 61: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

[Figure: spectrum with peaks labeled [M + 2H]+2 and [M + 3H]+3·; charge states +3, +2 and +1 marked.]

Page 62: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

2. Eliminate non-sequencing relevant ions!

3. biggest c ion?

4. biggest z ion?

Page 63: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

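[Figure: ETD spectrum annotation. [M + H]+1 = 1166; biggest c ion (cbiggest) at 1052; residue labels P and I/L; peak label 130.]

No suitable zbiggest ion! Remember, the 2nd position is a proline!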

Page 64: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

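[Figure: same annotation as on the previous slide ([M + H]+1 = 1166; cbiggest at 1052; residue labels P and I/L; peak label 130).]

Now find the first c ion (c1), look for c and z pairs, and sequence ladders

Let's find c and z pairs!

z + c = [M + H]+1 + 2

[M + H]+1 – c + 2 = z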
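Pair finding is easy to automate once [M + H]+1 is known. The sketch below searches a peak list for pairs that satisfy z + c = [M + H]+1 + 2. The function names are assumptions made for this example; the peak values are the nominal labels from the annotated spectrum on the next slide ([M + H]+1 = 1166).

```python
from itertools import combinations

def z_partner(mh, c):
    """Complementary z ion for a c ion: z = [M+H]+1 - c + 2."""
    return mh + 2 - c

def find_cz_pairs(mh, peaks, tol=0):
    """Return all peak pairs whose sum equals [M+H]+1 + 2 within tol."""
    target = mh + 2
    return [
        (low, high)
        for low, high in combinations(sorted(peaks), 2)
        if abs(low + high - target) <= tol
    ]

if __name__ == "__main__":
    peaks = [174, 271, 358, 445, 573, 701, 802,
             366, 467, 595, 723, 810, 897, 994]
    for pair in find_cz_pairs(1166, peaks):
        print(pair)
    print(z_partner(1166, 1052))  # 116
```

The sum rule alone does not say which member of a pair is the c ion and which is the z ion; that follows from building the ladders from c1 and z1.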

Page 65: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

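[Figure: fully annotated ETD spectrum, [M + H]+1 = 1166. Residue labels printed on the slide: P, I/L; R S S Q/K S Q/K T I/L. Peak labels: 116, 174, 203, 271, 358, 445, 573, 701, 802, 965, 1052 and 366, 467, 595, 723, 810, 897, 994. Complementary labels sum to 1168, e.g. 174 + 994, 271 + 897, 116 + 1052.]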

Page 66: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

2nd Example

Page 67: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Important to know: This sample was converted to methyl esters (+14 on C-term and D/E side chains) and analyzed after IMAC!

Sample: Antigens from MHC molecules: 9mer with proline in the 2nd position!

Page 68: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

1. What charge state?

(372.5 · 3) – 2 = 1115

Check: [M + 2H]+2 = ([M + H]+1 + 1H) / 2 = (1115 + 1) / 2 = 558

Page 69: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

[Figure: spectrum with peaks labeled [M + 2H]+2 and [M + 3H]+3·; charge states +3, +2 and +1 marked.]

Page 70: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

2. Eliminate non-sequencing relevant ions!

3. biggest c ion?

4. biggest z ion?

For c-ions remember to also subtract 14!
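The extra 14 follows from the methyl ester on the C-terminus: the biggest c ion lies RM + 1 below [M + H]+1 for an unmodified peptide, and RM + 1 + 14 below it once the C-terminal carboxyl is esterified. The sketch below lists the candidate positions; `NOMINAL_RM`, `METHYL_ESTER` and the function name are assumptions made for this example, and the additional +14 for a C-terminal D or E reflects the side-chain esterification mentioned for this sample.

```python
# Nominal (integer) residue masses, as used for manual sequencing.
NOMINAL_RM = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}
METHYL_ESTER = 14  # nominal mass added per esterified carboxyl group

def biggest_c_candidates(mh, methyl_ester=False):
    """Expected biggest c ion for each possible C-terminal residue."""
    candidates = {}
    for aa, rm in NOMINAL_RM.items():
        shift = 0
        if methyl_ester:
            shift = METHYL_ESTER          # ester on the C-terminal carboxyl
            if aa in "DE":
                shift += METHYL_ESTER     # ester on the acidic side chain
        candidates[aa] = mh - (rm + 1 + shift)
    return candidates

if __name__ == "__main__":
    print(biggest_c_candidates(1166)["L"])                     # 1052
    print(biggest_c_candidates(1115, methyl_ester=True)["L"])  # 987
```

Both values agree with the biggest c ions marked in the two worked examples (1052 for the unmodified sample, 987 for the methyl ester sample), each pointing to a C-terminal I/L.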

Page 71: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

[Figure: spectrum with the c8 ion labeled.]

No suitable z8 ion! Remember, the 2nd position is a proline!

Page 72: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

[Figure: spectrum with the c8 ion labeled.]

Let's find c and z pairs!

z + c = [M + H]+1 + 2

[M + H]+1 – c + 2 = z

Page 73: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

[Figure: spectrum with the c8 ion and the peaks at 130 and 201 labeled; two peaks marked +2.]

Let's find c and z pairs!

z + c = [M + H]+1 + 2

[M + H]+1 – c + 2 = z

Page 74: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

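[Figure: annotated ETD spectrum, [M + H]+1 = 1115. Residue labels printed on the slide: P, I/L, A, R, Q/K, R. Peak labels, grouped into pairs that sum to 1117: 130 / 987, 201 / 916, 357 / 760, 243 / 874, 146 / 971, 399 / 718.]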

Page 75: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

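[Figure: fully annotated ETD spectrum, [M + H]+1 = 1115. Residue labels printed on the slide: P, I/L, A, R, Q/K, R, pS, P, P. Peak labels, grouped into pairs that sum to 1117: 130 / 987, 201 / 916, 357 / 760, 243 / 874, 146 / 971, 399 / 718, 551 / 566, 454 / 663.]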

Page 76: Proteomics Informatics –  Protein  identification III:  de novo  sequencing (Week 6)

Proteomics Informatics – Protein identification III: de novo sequencing (Week 6)