Transcript
Page 1: Interpreting  MS\\MS Results

Interpreting MS/MS Proteomics Results

Brian C. Searle

Proteome Software Inc., Portland, Oregon, USA

[email protected]

NPC Progress Meeting (February 2nd, 2006)

The first thing I should say is that none of the material presented is original research done at Proteome Software, but we do strive to make the tools presented here available in our software product Scaffold. With that caveat aside…

Illustrated by Toni Boudreault

Page 2: Interpreting  MS\\MS Results

Organization

• Identify
• SEQUEST
• X! Tandem/Mascot
• Differ
• Combine

This is foremost an introduction, so we’re first going to talk about how you go about identifying proteins with tandem mass spectrometry in the first place.

Then we’re going to talk about the motivations behind the development of the first really useful bioinformatics technique in our field, SEQUEST.

This technique has been extended by two other tools called X! Tandem and Mascot.

We’re also going to talk about how these programs differ and how we can use that to our advantage by considering them simultaneously using probabilities.

Page 3: Interpreting  MS\\MS Results

So, this is proteomics, so we’re going to use tandem mass spectrometry to identify proteins-- hopefully many of them, and hopefully very quickly.

[Figure: a protein drawn as a chain of single-letter amino acid residues]

Start with a protein

Page 4: Interpreting  MS\\MS Results

And to use this technique you generally have to digest the protein into peptides about 8 to 20 amino acids in length and…

[Figure: the same protein chain, cut into peptides]

Cut with an enzyme

Page 5: Interpreting  MS\\MS Results

[Figure: the protein chain cut into peptides]

Select a peptide

Look at each peptide individually.

We select the peptide by mass using the first half of the tandem mass spectrometer.

Page 6: Interpreting  MS\\MS Results

A E P T I R H2O

Impart energy in collision cell

The mass spectrometer imparts energy into the peptide causing it to fragment at the peptide bonds between amino acids.

Page 7: Interpreting  MS\\MS Results

[Figure: spectrum, intensity vs. m/z, with fragment peaks A (72.0), A E (201.1), A E P (298.1), and A E P T (399.2)]

Measure mass of daughter ions

The masses of these fragment ions are recorded using the second mass spectrometer.

Page 8: Interpreting  MS\\MS Results

B-type Ions

[Figure: spectrum, intensity vs. m/z, for A E P T I R (H2O); the spacings between successive peaks are 72.0, 129.0, 97.0, 101.0, 113.1, and 174.1]

These ions are commonly called B ions, based on nomenclature you don’t really want to know about…

But the mass difference between the peaks corresponds directly to the amino acid sequence.

Page 9: Interpreting  MS\\MS Results

B-type Ions

[Figure: spectrum, intensity vs. m/z, for A E P T I R (H2O); each spacing is labeled with the peak difference it comes from: A-0 (72.0), AE-A (129.0), AEP-AE (97.0), AEPT-AEP (101.0), AEPTI-AEPT (113.1), AEPTIR-AEPTI (174.1)]

For example, the A-E peak minus the A peak should produce the mass of E.

You can build these mass differences up and derive a sequence for the original peptide.

This is pretty neat and it makes tandem mass spectrometry one of the best tools out there for sequencing novel peptides.
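As a rough sketch of the mass-difference idea (in Python), the snippet below walks up the four B-ion peaks from the earlier slide and looks each difference up in a small table. The residue masses are standard monoisotopic values; the function name, the 0.2 tolerance, and the decision to list only I (which has the same mass as L) are illustrative assumptions, not something from the talk.

```python
# Monoisotopic residue masses (Da) for the residues in the example peptide.
RESIDUE_MASS = {
    "A": 71.037, "E": 129.043, "P": 97.053,
    "T": 101.048, "I": 113.084, "R": 156.101,
}


def sequence_from_b_ladder(peaks, tolerance=0.2):
    """Read residues off consecutive B-ion peaks, lowest m/z first."""
    residues = []
    previous = 1.007  # a bare proton: the charge on a singly charged B ion
    for peak in peaks:
        diff = peak - previous
        # Find a residue whose mass matches the gap between the two peaks.
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(mass - diff) <= tolerance]
        residues.append(matches[0] if matches else "?")
        previous = peak
    return "".join(residues)


print(sequence_from_b_ladder([72.0, 201.1, 298.1, 399.2]))  # prints AEPT
```

This only works when every peak in the ladder is present and already known to be a B ion, which is exactly the assumption the next slides take apart.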

Page 10: Interpreting  MS\\MS Results

So, it seems pretty easy, doesn’t it?

But there are a couple confounding factors.

For example…

Page 11: Interpreting  MS\\MS Results

B-type Ions

[Figure: spectrum, intensity vs. m/z, for A E P T I R (H2O), with a CO loss marked at each B-ion peak]

B ions have a tendency to degrade and lose carbon monoxide, producing…

Page 12: Interpreting  MS\\MS Results

A-type Ions

[Figure: spectrum vs. m/z for A E P T I R (H2O), with each peak shifted by the loss of CO]

A ions.

Furthermore…

Page 13: Interpreting  MS\\MS Results

Y-type Ions

[Figure: spectrum, intensity vs. m/z, read in the order R I T P E A (H2O)]

… The second half are represented as Y ions that sequence backwards.

And, unfortunately, this is the real world, so…

Page 14: Interpreting  MS\\MS Results

Y-type Ions

[Figure: spectrum, intensity vs. m/z, read in the order R I T P E A (H2O), with uneven peak heights and missing peaks]

… All the peaks have different measured heights and many peaks can often be missing.

Page 15: Interpreting  MS\\MS Results

B-type, A-type, Y-type Ions

[Figure: spectrum, intensity vs. m/z, with the B-, A-, and Y-type peaks overlaid]

All these peaks are seen together simultaneously and we don’t even know…

Page 16: Interpreting  MS\\MS Results

[Figure: spectrum, intensity vs. m/z, with unlabeled peaks]

What type of ion they are, making the mass differences approach even more difficult.

Finally, as with all analytical techniques,

Page 17: Interpreting  MS\\MS Results

[Figure: spectrum, intensity vs. m/z, with noise added]

There’s noise, producing a final spectrum that looks like…

Page 18: Interpreting  MS\\MS Results

[Figure: realistic noisy spectrum, intensity vs. m/z]

… This, on a good day. And so it’s actually fairly difficult to…

Page 19: Interpreting  MS\\MS Results

[Figure: noisy spectrum, intensity vs. m/z, overlaid with the A E P T I R (H2O) spacings 72.0, 129.0, 97.0, 101.0, 113.1, 174.1]

… compute the mass differences to sequence the peptide, certainly in a computer automated way.

Page 20: Interpreting  MS\\MS Results

So the community needed a new technique.

Now, it wasn’t all without hope…

Page 21: Interpreting  MS\\MS Results

Known Ion Types

B-type ions

A-type ions

Y-type ions

We knew a couple of things about peptide fragmentation.

Not only do we know to expect B, A, and Y ions, but…

Page 22: Interpreting  MS\\MS Results

Known Ion Types

B-type ions

A-type ions

Y-type ions

B- or Y-type +2H ions

B- or Y-type -NH3 ions

B- or Y-type -H2O ions

… We also know a couple of other variations on those ions that come up.

We even know something about the…

Page 23: Interpreting  MS\\MS Results

Known Ion Types

• B-type ions: 100%
• A-type ions: 20%
• Y-type ions: 100%
• B- or Y-type +2H ions: 50%
• B- or Y-type -NH3 ions: 20%
• B- or Y-type -H2O ions: 20%

… likelihood of seeing each type of ion, where generally B and Y ions are most prominent.

Page 24: Interpreting  MS\\MS Results

If we know the amino acid sequence of a peptide, we can guess what the spectra should look like!

So it’s actually pretty easy to guess what a spectrum should look like if we know what the peptide sequence is.

Page 25: Interpreting  MS\\MS Results

ELVISLIVESK

Model Spectrum

*Courtesy of Dr. Richard Johnson, http://www.hairyfatguy.com/

So as an example, consider the peptide ELVIS LIVES K that was synthesized by Rich Johnson in Seattle.

Page 26: Interpreting  MS\\MS Results

Model Spectrum

We can create a hypothetical spectrum based on our rules

Page 27: Interpreting  MS\\MS Results

• B/Y type ions (100%)
• B/Y +2H type ions (50%)
• A type ions, B/Y -NH3/-H2O (20%)

Where B and Y ions are estimated at 100%, +2H ions are estimated at 50%, and other stragglers are at 20%.
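Those rules are enough to sketch a hypothetical spectrum in a few lines of Python. This is only an illustration under stated assumptions: the residue and neutral-loss masses are standard monoisotopic values, "+2H" is read here as the doubly charged form of a B or Y ion, and the function name and two-decimal rounding are made up for the example.

```python
# Monoisotopic residue masses (Da) for the residues in ELVISLIVESK.
RESIDUE_MASS = {
    "E": 129.04259, "L": 113.08406, "V": 99.06841,
    "I": 113.08406, "S": 87.03203, "K": 128.09496,
}
PROTON, WATER, AMMONIA, CO = 1.00728, 18.01056, 17.02655, 27.99491


def model_spectrum(peptide):
    """Return {m/z: relative intensity} using the 100% / 50% / 20% rules."""
    spectrum = {}

    def add(mz, intensity):
        key = round(mz, 2)
        # If two ions land on the same m/z, keep the taller peak.
        spectrum[key] = max(spectrum.get(key, 0.0), intensity)

    masses = [RESIDUE_MASS[aa] for aa in peptide]
    for i in range(1, len(peptide)):
        b = sum(masses[:i]) + PROTON           # N-terminal fragment
        y = sum(masses[i:]) + WATER + PROTON   # C-terminal fragment
        add(b - CO, 0.2)                       # A ion: B ion minus CO
        for ion in (b, y):
            add(ion, 1.0)                      # B and Y ions at 100%
            add((ion + PROTON) / 2, 0.5)       # assumed doubly charged form
            add(ion - AMMONIA, 0.2)            # -NH3 straggler
            add(ion - WATER, 0.2)              # -H2O straggler
    return spectrum


for mz, intensity in sorted(model_spectrum("ELVISLIVESK").items())[:5]:
    print(mz, intensity)
```

A real search engine would bin these peaks to match the instrument's resolution; the dictionary here is just the simplest thing that shows the rules at work.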

Page 28: Interpreting  MS\\MS Results

Model Spectrum

So if we consider the spectrum that was derived from the ELVIS LIVES K peptide…

Page 29: Interpreting  MS\\MS Results

Model Spectrum

We can find where the overlap is between the hypothetical and the actual spectra…

Page 30: Interpreting  MS\\MS Results

Model Spectrum

And say conclusively based on the evidence that the spectrum does belong to the ELVIS LIVES K peptide.

Page 31: Interpreting  MS\\MS Results

But who cares?

The more important question is

“what about situations where we don’t know the sequence?”

Page 32: Interpreting  MS\\MS Results

We guess!

Page 33: Interpreting  MS\\MS Results

PepSeq

AAAAAAAAAA
AAAAAAAAAC
AAAAAAAACC
AAAAAAACCC
ELVISLIVESK
WYYYYYYYYY
YYYYYYYYYY
…

J. Rozenski et al., Org. Mass Spectrom., 29 (1994) 654-658.

And so this was an approach followed by a program called PepSeq, which would guess every combination of amino acids possible, build a hypothetical spectrum, and find the best matching hypothetical.

Page 34: Interpreting  MS\\MS Results

PepSeq

• Impossibly hard after 7 or 8 amino acids!

• High false positive rate because you consider so many options

This was a start, but it’s clearly impossibly hard with larger peptides and there’s a lot of room to overfit the data.

Page 35: Interpreting  MS\\MS Results

PepSeq

• Impossibly hard after 7 or 8 amino acids!

• High false positive rate because you consider so many options

Another strategy is needed!

So obviously this isn’t going to work in the long run.

Page 36: Interpreting  MS\\MS Results

Sequencing Explosion

• 1977 Shotgun sequencing invented, bacteriophage φX174 sequenced
• 1989 Yeast Genome project announced
• 1990 Human Genome project announced
• 1992 First chromosome (Yeast) sequenced
• 1995 H. influenzae sequenced
• 1996 Yeast Genome sequenced
• 2000 Human Genome draft

We needed a new invention to come around, and that was shotgun Sanger-sequencing.

In 89 and 90 the Yeast and Human Genome projects were announced, followed by the first chromosome in 92, et cetera, et cetera.

Page 37: Interpreting  MS\\MS Results

Sequencing Explosion

• 1977 Shotgun sequencing invented, bacteriophage φX174 sequenced
• 1989 Yeast Genome project announced
• 1990 Human Genome project announced
• 1992 First chromosome (Yeast) sequenced
• 1995 H. influenzae sequenced
• 1996 Yeast Genome sequenced
• 2000 Human Genome draft

Eng, J. K.; McCormack, A. L.; Yates, J. R. III J. Am. Soc. Mass Spectrom. 1994, 5, 976-989.

In 1994 Jimmy Eng and John Yates published a technique to exploit genome sequencing for use in tandem mass spectrometry.

And the idea was …

Page 38: Interpreting  MS\\MS Results

SEQUEST

… instead of searching all possible peptide sequences, search only those in genome databases.

Now, in the post-genomic world this seems like a pretty trivial idea, but back then there was a lot of assumption placed on the idea that we’d actually have a complete Human genome in a reasonable amount of time.

Page 39: Interpreting  MS\\MS Results

SEQUEST

• 2×10^14: all possible 11mers (ELVISLIVESK)
• 2×10^10: all possible peptides in NR
• 1×10^8: all tryptic peptides in NR
• 4×10^6: all Human tryptic peptides in NR

So, in terms of 11 amino acid peptides, we’re talking about a 10 thousand fold difference between searching every possible 11mer and searching those in the current non-redundant protein database from the NCBI,

and about a 50 million fold difference for searching human tryptic peptides.

So that was huge; it made hypothetical spectrum matching feasible.
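The arithmetic behind those fold differences is easy to check. The short Python sketch below takes the database sizes exactly as quoted on the slide (they are the slide's round figures, not independently verified counts) and compares each against the number of possible 11mers; the variable names are illustrative.

```python
# Every sequence of 11 residues drawn from 20 amino acids.
all_11mers = 20 ** 11

# Search-space sizes as quoted on the slide (round figures).
search_spaces = {
    "all peptides in NR": 2e10,
    "tryptic peptides in NR": 1e8,
    "human tryptic peptides in NR": 4e6,
}

print(f"all possible 11mers: {all_11mers:.3e}")
for label, size in search_spaces.items():
    fold = all_11mers / size
    print(f"{label}: about {fold:,.0f} fold fewer candidates")
```

The first ratio comes out near ten thousand and the last near fifty million, which is the scale of savings that made database searching practical.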

Page 40: Interpreting  MS\\MS Results

SEQUEST Model Spectrum

SEQUEST made a couple of other interesting improvements as well.

Jimmy and John noted that there was a discontinuity between the intensities of the hypothetical spectrum and the actual spectrum.

Instead of trying to make a better model, they decided just to make the actual spectrum look like the model with normalization…

Page 41: Interpreting  MS\\MS Results

SEQUEST Model Spectrum

Like so.

For a scoring function they decided to use Cross-Correlation, which basically sums the peaks that overlap between the hypothetical and the actual spectra.

Page 42: Interpreting  MS\\MS Results

SEQUEST Model Spectrum

And then they shifted the spectra back and ….

Page 43: Interpreting  MS\\MS Results

SEQUEST Model Spectrum

… forth so that the peaks shouldn’t align.

They used this number, also called the Auto-Correlation, as their background.

Page 44: Interpreting  MS\\MS Results

SEQUEST XCorr

[Figure: correlation score vs. offset (AMU), marking the Cross Correlation (direct comparison) and the Auto Correlation (background). Gentzel M. et al., Proteomics 3 (2003) 1597-1610]

This is another representation of the Cross Correlation and the Auto Correlation.

Page 45: Interpreting  MS\\MS Results

SEQUEST XCorr

$$\mathrm{XCorr} = \frac{\mathrm{CrossCorr}}{\operatorname{avg}\left(\mathrm{AutoCorr},\ \text{offset} = -75 \text{ to } 75\right)}$$

[Figure: correlation score vs. offset (AMU), marking the Cross Correlation (direct comparison) and the Auto Correlation (background). Gentzel M. et al., Proteomics 3 (2003) 1597-1610]

The XCorr score is the Cross Correlation divided by the average of the auto correlation over a 150 AMU range.

The XCorr is high if the direct comparison is significantly greater than the background, which is obviously good for peptide identification.
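Here is a minimal Python sketch of that score exactly as the slide words it: the correlation at zero offset divided by the average correlation over offsets from -75 to 75. It assumes both spectra are already binned into equal-length lists of intensities at 1 AMU per bin, and the function names are invented for the example. Published descriptions of SEQUEST often state the background correction as a subtraction rather than a ratio, so treat this as an illustration of the idea, not of the program's code.

```python
def correlation(observed, model, offset):
    """Dot product of two binned spectra with the model shifted by `offset` bins."""
    total = 0.0
    for i, intensity in enumerate(observed):
        j = i + offset
        if 0 <= j < len(model):
            total += intensity * model[j]
    return total


def xcorr(observed, model, max_offset=75):
    """Direct comparison divided by the average over all shifted comparisons."""
    direct = correlation(observed, model, 0)
    shifted = [correlation(observed, model, k)
               for k in range(-max_offset, max_offset + 1)]
    background = sum(shifted) / len(shifted)
    # With non-negative intensities, a zero background means nothing overlaps.
    return direct / background if background > 0 else 0.0


# Toy spectra: three peaks in 200 one-AMU bins.
observed = [0.0] * 200
for position in (10, 50, 120):
    observed[position] = 1.0
print(xcorr(observed, observed))
```

A model whose peaks all sit a few bins away from the observed ones scores zero here, because nothing lines up at offset zero while the shifted comparisons still find overlap.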

Page 46: Interpreting  MS\\MS Results

SEQUEST DeltaCn

$$\mathrm{DeltaCn} = \frac{\mathrm{XCorr}_1 - \mathrm{XCorr}_2}{\mathrm{XCorr}_1}$$

And this XCorr is actually a pretty robust method for estimating how accurate the match is, and so far, there really haven’t been any significant improvements on it.

The DeltaCn is another score that scientists often use.

It measures how good the XCorr is relative to the next best match.

As you can see, this is actually a pretty crude calculation.
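To show just how small that calculation is, here is a Python sketch that computes the relative gap between the best and second-best XCorr for one spectrum. The function name and the example scores are made up for illustration.

```python
def delta_cn(xcorrs):
    """Relative gap between the best and second-best XCorr for one spectrum."""
    ranked = sorted(xcorrs, reverse=True)
    best, second = ranked[0], ranked[1]
    return (best - second) / best


# Hypothetical XCorr values for the top candidate peptides of one spectrum.
print(round(delta_cn([3.0, 2.4, 1.1]), 3))  # prints 0.2
```

Only the top two candidates matter; everything below second place is ignored, which is part of why it is a weak relative score.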

Page 47: Interpreting  MS\\MS Results

             Accuracy Score    Relative Score
SEQUEST      Strong (XCorr)    Weak (DeltaCn)

Here’s another representation of that sentiment.

The XCorr is a strong measure of accuracy, whereas the DeltaCn is a weak measure of relative goodness.

Page 48: Interpreting  MS\\MS Results

                    Accuracy Score    Relative Score
SEQUEST             Strong (XCorr)    Weak (DeltaCn)
Alternate Method    Weak              Strong

Obviously, there could be an alternative method that focuses more on the success of the relative score.

Mascot and X! Tandem fit that bill.

Page 49: Interpreting  MS\\MS Results

X! Tandem Scoring

by-Score = sum of intensities of peaks matching B-type or Y-type ions

$$\mathrm{HyperScore} = \text{by-Score} \cdot N_y! \cdot N_b!$$

Fenyo, D.; Beavis, R. C. Anal. Chem., 75 (2003) 768-774

Now the X! Tandem accuracy score is rather crude. It only considers B and Y ions and attaches these factorial terms with an admittedly hand waving argument.
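In Python the whole accuracy score fits in one small function. This sketch assumes you have already decided which observed peaks match B ions and which match Y ions; the function name and the example intensities are illustrative.

```python
from math import factorial


def hyperscore(matched_b, matched_y):
    """by-Score times the factorials of the matched B and Y ion counts.

    matched_b / matched_y: intensities of observed peaks that matched
    a predicted B-type or Y-type ion.
    """
    by_score = sum(matched_b) + sum(matched_y)
    return by_score * factorial(len(matched_b)) * factorial(len(matched_y))


# Two matched B ions and three matched Y ions (made-up intensities).
print(hyperscore([1.0, 0.5], [1.0, 1.0, 0.5]))  # prints 48.0
```

The factorials grow very quickly, so matching one more ion changes the score far more than a taller peak does.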

Page 50: Interpreting  MS\\MS Results

Distribution of “Incorrect” Hits

[Figure: histogram of # of matches vs. hyper score, with the best hit and the second best hit marked]

But instead of just considering the best match to the second best, it looks at the

distribution of lower scoring hits, assuming that they are all wrong.

This is somewhat based on ideas pioneered with the BLAST algorithm.

Here, every bar represents the number of matches at a given score.

The X! Tandem creators found that the distribution decays (or slopes down)

exponentially…

Page 51: Interpreting  MS\\MS Results

Estimate Likelihood (E-Value)

[Figure: log(# of matches) vs. hyper score, with the best hit marked]

… and the log of the distribution is relatively linear because of the exponential decay.

Page 52: Interpreting  MS\\MS Results

Estimate Likelihood (E-Value)

[Figure: log(# of matches) vs. hyper score, with a linear fit labeled “Expected Number Of Random Matches” and the best hit marked]

If the distribution represents the number of random matches at any given score, the linear fit should correspond to the expected number of random matches.

Page 53: Interpreting  MS\\MS Results

Estimate Likelihood (E-Value)

[Figure: log(# of matches) vs. hyper score, with the linear fit extended to the best hit and the annotation “Score of 60 has 1/10 chance of occurring at random”]

This is called an E-Value, or Expected-Value.

And from this, you can calculate the likelihood that the best match is random.

In this case, a score of 60 corresponds with a log number of matches being -1, which means the estimated number of random matches for that score is 0.1.
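The extrapolation can be sketched in Python as an ordinary least-squares line through the log of the histogram counts. This is a simplification that follows the slide's picture: the histogram values below are invented so that the fit lands on the slide's example, and the function names are assumptions, not X! Tandem's actual implementation.

```python
import math


def fit_log_counts(histogram):
    """Least-squares line through (score, log10(count)) for non-empty bins."""
    points = [(score, math.log10(count))
              for score, count in histogram.items() if count > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def e_value(score, slope, intercept):
    """Expected number of random matches at `score`, read off the fitted line."""
    return 10 ** (slope * score + intercept)


# Made-up histogram of low-scoring (assumed incorrect) hits: score -> # of matches.
histogram = {20: 1000, 30: 100, 40: 10, 50: 1}
slope, intercept = fit_log_counts(histogram)
print(e_value(60, slope, intercept))  # close to 0.1
```

With these numbers the line drops one log unit every 10 score points, so a best hit at 60 sits at log(# of matches) = -1, the same situation as on the slide.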

Page 54: Interpreting  MS\\MS Results

X! Tandem and Mascot

• E-Value = likelihood that match is incorrect relative to N guesses. Empirical (X! Tandem)
• P-Value = likelihood that match is incorrect (E ≈ P·N). Theoretical (Mascot)

Now, X! Tandem calculates this E-Value empirically.

Another search engine, Mascot, tries to get at the same kind of number using theoretical calculations, most likely based on the number of identified peaks and the likelihood of finding certain amino acids in the genome database.

They’ve never explicitly published their algorithm, so we’ll never really know, but I suspect it’s something smart.

I just want to bring up a point that we’ll touch on a little later…

Page 55: Interpreting  MS\\MS Results

X! Tandem and Mascot

• E-Value = likelihood that match is incorrect relative to N guesses. Empirical (X! Tandem)
• P-Value = likelihood that match is incorrect (E ≈ P·N). Theoretical (Mascot)
• Probability = likelihood that match is correct

Note (Probability ≠ 1-P)!

… the E-Value that X! Tandem calculates and the P-Value that Mascot calculates are probabilistically based, but they can only estimate the likelihood that the match is wrong.

This is realistically not nearly as useful as knowing the probability that a peptide identification is right, which is NOT 1 minus the P-Value.

Page 56: Interpreting  MS\\MS Results

             Accuracy Score    Relative Score
SEQUEST      XCorr             DeltaCn
X! Tandem    HyperScore        E-Value

Now, let’s go back and fill in the X! Tandem part of our accuracy/relativity scoring grid.

Page 57: Interpreting  MS\\MS Results

             Accuracy Score    Relative Score
SEQUEST      XCorr             DeltaCn
X! Tandem    HyperScore        E-Value

To reiterate, the XCorr is an excellent measure of accuracy…

Page 58: Interpreting  MS\\MS Results

             Accuracy Score    Relative Score
SEQUEST      XCorr             DeltaCn
X! Tandem    HyperScore        E-Value

… whereas the E-Value is an excellent measure of how good the best score is relative to the rest.

If we assume that accuracy and relativity scores are independent measures of goodness, could we use both SEQUEST’s XCorr and X! Tandem’s E-Value together?

Page 59: Interpreting  MS\\MS Results

10 Protein Control Sample

[Figure: scatter plot with axes SEQUEST: Discriminant Score and X! Tandem: -log(E-Value)]

And the answer is a resounding yes.

Each point on this graph is a spectrum, where correct identifications are marked in red, while incorrect identifications are marked in blue.

We know what’s correct and incorrect because this is a control sample.

Although in general the spectra SEQUEST scores well are spectra X! Tandem also scores well, there is considerable scatter between the search engines.

Page 60: Interpreting  MS\\MS Results

10 Protein Control Sample

[Figure: scatter plot with axes Mascot: Ion-Identity Score and X! Tandem: -log(E-Value)]

One might wonder, if X! Tandem and Mascot use similar scoring approaches, would they benefit as much? But the answer is surprisingly still yes!

Now, why are the scores so different?

Page 61: Interpreting  MS\\MS Results

Why So Different?

• SEQUEST
  – Considers relative intensities
• X! Tandem
  – Considers semi-tryptic peptides
  – Considers only B/Y-type Ions
• Mascot
  – Considers theoretical P-Value relative to search space

Well, here are a couple of possible reasons.

SEQUEST is the only method to consider relative intensities.

Page 62: Interpreting  MS\\MS Results

Why So Different?

• SEQUEST
  – Considers relative intensities
• X! Tandem
  – Considers semi-tryptic peptides
  – Considers only B/Y-type Ions
• Mascot
  – Considers theoretical P-Value relative to search space

X! Tandem is the only method to consider peptides outside the standard search space by default,

such as semi-tryptic peptides.

However, it’s the only score that considers only B and Y ions,

as opposed to a complete model.

Page 63: Interpreting  MS\\MS Results

Why So Different?

• SEQUEST
  – Considers relative intensities
• X! Tandem
  – Considers semi-tryptic peptides
  – Considers only B/Y-type Ions
• Mascot
  – Considers theoretical P-Value relative to search space

And Mascot is the only search engine to compute a completely theoretical P-Value

Page 64: Interpreting  MS\\MS Results

Consider Multiple Algorithms?

[Figure: scatter plot with axes Mascot: Ion-Identity Score and X! Tandem: -log(E-Value)]

So we clearly want to consider multiple search engines simultaneously, but how?

Page 65: Interpreting  MS\\MS Results

How To Compare Search Engines?

– SEQUEST: XCorr > 2.5, DeltaCn > 0.1
– Mascot: Ion Score - Identity Score > 0
– X! Tandem: E-Value < 0.01

You can’t use a thresholding system because it’s impossible to find corresponding thresholds.

For example, a SEQUEST match with an XCorr of 2.5 doesn’t mean the same thing as an X! Tandem match with an E-Value of 0.01.

Page 66: Interpreting  MS\\MS Results

How To Compare Search Engines?

– SEQUEST: XCorr > 2.5, DeltaCn > 0.1
– Mascot: Ion Score - Identity Score > 0
– X! Tandem: E-Value < 0.01

Need to convert scores to probabilities!

The simplest way would be to convert the scores into probabilities and compare those.

We advocate for Andrew Keller and Alexey Nesvizhskii’s Peptide Prophet approach because it actually calculates a true probability, not just a p-value.

Page 67: Interpreting  MS\\MS Results

10 Protein Control Sample (Q-ToF)

X! Tandem approach

[Figure: histogram of # of matches vs. Mascot: Ion-Identity Score, marking “Other Incorrect IDs for Spectrum” and the “Possibly Correct?” match]

So if you remember, X! Tandem considers the best peptide match for a spectrum against a distribution of incorrect matches

Page 68: Interpreting  MS\\MS Results

10 Protein Control Sample (Q-ToF)

Peptide Prophet approach

[Figure: histogram of # of matches vs. Mascot: Ion-Identity Score, marking “ALL Other ‘Best’ Matches” and the “Possibly Correct?” match. Keller, A. et al., Anal. Chem. 74, 5383-5392]

Well, Peptide Prophet looks across the entire sample, and not at just one spectrum at a time.

It compares the best match against all of the other best matches in the sample, which is clearly bimodal.

Page 69: Interpreting  MS\\MS Results

10 Protein Control Sample (Q-ToF)

Peptide Prophet approach

[Figure: histogram of # of matches vs. Mascot: Ion-Identity Score, marking “ALL Other ‘Best’ Matches” and the “Possibly Correct?” match. Keller, A. et al., Anal. Chem. 74, 5383-5392]

The low mode represents matches that are most likely wrong while the high mode represents matches that are probably right.

Page 70: Interpreting  MS\\MS Results

10 Protein Control Sample (Q-ToF)

Peptide Prophet approach

[Figure: histogram of # of matches vs. Mascot: Ion-Identity Score, with fitted “Correct” and “Incorrect” distributions and the “Possibly Correct?” match marked]

Peptide Prophet curve fits two distributions to the modes, following the assumption that the low scoring distribution is “incorrect” and that the higher scoring distribution is “correct”.

Page 71: Interpreting  MS\\MS Results

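10 Protein Control Sample (Q-ToF)

$$p(+ \mid D) = \frac{p(D \mid +)\,p(+)}{p(D \mid +)\,p(+) + p(D \mid -)\,p(-)}$$

(D: the observed score; +: the match is correct; −: the match is incorrect)

[Figure: histogram of # of matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match marked]

These two distributions can be analyzed using Bayesian statistics with this formula.

Now that formula looks pretty complex, but…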

Page 72: Interpreting  MS\\MS Results

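10 Protein Control Sample (Q-ToF)

$$p(+ \mid D) = \frac{p(D \mid +)\,p(+)}{p(D \mid +)\,p(+) + p(D \mid -)\,p(-)}$$

(D: the observed score; +: the match is correct; −: the match is incorrect)

[Figure: histogram of # of matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions]

It just calculates the height of the correct distribution at a particular score, divided by the summed height of both distributions.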

Page 73: Interpreting  MS\\MS Results

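10 Protein Control Sample (Q-ToF)

$$p(+ \mid D) = \frac{p(D \mid +)\,p(+)}{p(D \mid +)\,p(+) + p(D \mid -)\,p(-)}$$

(D: the observed score; +: the match is correct; −: the match is incorrect)

• Numerator: prob of having score and being correct
• Denominator: prob of having score

[Figure: “Correct” and “Incorrect” distributions over Mascot: Ion-Identity Score]

This is essentially the probability of having that score and being correct, divided by the probability of just having that score.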
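As a small Python sketch of that ratio: take the height of the “correct” curve at a score, weighted by how common correct matches are, and divide by the summed weighted heights of both curves. Everything numeric here is an assumption for illustration. The two curves are modeled as normal distributions with made-up means and widths, and the function names are invented; Peptide Prophet fits its own distributions to the data.

```python
import math


def normal_pdf(x, mean, sd):
    """Height of a normal curve at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def probability_correct(score, fraction_correct,
                        correct=(4.0, 1.5), incorrect=(-2.0, 1.5)):
    """p(+|D): weighted height of the correct curve over both weighted heights.

    correct / incorrect: (mean, sd) of each assumed score distribution.
    fraction_correct: assumed share of all best matches that are correct, p(+).
    """
    height_correct = normal_pdf(score, *correct) * fraction_correct
    height_incorrect = normal_pdf(score, *incorrect) * (1.0 - fraction_correct)
    return height_correct / (height_correct + height_incorrect)


for score in (-2.0, 1.0, 4.0):
    print(score, round(probability_correct(score, 0.5), 3))
```

With equal weights and equal widths, a score exactly halfway between the two means comes out at 0.5, and the probability climbs toward 1 as the score moves into the correct mode.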

Page 74: Interpreting  MS\\MS Results

[Figure: histogram of # of matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match marked]

This is a neat method because it actually considers the likelihood of being correct, unlike X! Tandem and Mascot, which only calculate the probability of being incorrect.

It’s because of this that Peptide Prophet can produce a true probability, which is important when the sample characteristics change.

Page 75: Interpreting  MS\\MS Results

Q-ToF:

[Figure: histogram of # of matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match marked]

For example, the control sample we’ve been looking at was derived from Q-ToF data, which produces pretty high quality results

Page 76: Interpreting  MS\\MS Results

[Figure: two histograms of # of matches vs. Mascot: Ion-Identity Score, one labeled Q-ToF and one labeled Ion Trap, each with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match marked]

If you compare that to the same sample run on an Ion Trap, the probability of being correct is greatly diminished.

If you’ll note, the Incorrect distribution doesn’t change very much between the two analyses; however, the likelihood that the identification is right changes dramatically!

Page 77: Interpreting  MS\\MS Results

Ion Trap:

[Figure: histogram of # of matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match marked]

As Peptide Prophet considers the correct distribution, it is immune to fluctuations between samples.

P-Values and E-Values don’t consider this information, so they can’t be compared across multiple samples, or different examinations of the same sample,

which is why we need to use Peptide Prophet for comparing two different search engines

Page 78: Interpreting  MS\\MS Results

Consider Multiple Algorithms?

[Figure: scatter plot with axes Mascot: Ion-Identity Score and X! Tandem: -log(E-Value)]

So going back to the scatter plot between X! Tandem and Mascot, we can use Peptide Prophet to compute the score threshold that represents a 95% cut-off…

Page 79: Interpreting  MS\\MS Results

Consider Multiple Algorithms?

• X! Tandem: 2.6 = 95%
• Mascot: -2.5 = 95%

[Figure: scatter plot with axes Mascot: Ion-Identity Score and X! Tandem: -log(E-Value), with the two 95% thresholds drawn in]

Like so.

This allows you to fairly consider the answers from both search engines simultaneously.

The important thing to note is that if you looked at a different sample, these thresholds should change depending on the height of the correct distributions
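To see why the threshold moves, here is a Python sketch that searches for the score at which the probability of being correct reaches 95%, once for a sample where half of the best matches are correct and once where only a tenth are. As before, the normal-shaped curves, their parameters, and all function names are illustrative assumptions rather than Peptide Prophet’s fitted model.

```python
import math


def normal_pdf(x, mean, sd):
    """Height of a normal curve at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def probability_correct(score, fraction_correct, correct, incorrect):
    """Weighted height of the correct curve over both weighted heights."""
    up = normal_pdf(score, *correct) * fraction_correct
    down = normal_pdf(score, *incorrect) * (1.0 - fraction_correct)
    return up / (up + down)


def threshold_for(target, fraction_correct,
                  correct=(4.0, 1.5), incorrect=(-2.0, 1.5),
                  low=-10.0, high=20.0):
    """Bisect for the score where the probability of being correct hits `target`.

    Relies on the probability rising steadily with score, which holds for
    two normal curves of equal width.
    """
    for _ in range(60):
        middle = (low + high) / 2
        if probability_correct(middle, fraction_correct, correct, incorrect) < target:
            low = middle
        else:
            high = middle
    return high


print(round(threshold_for(0.95, 0.5), 2))   # many correct matches in the sample
print(round(threshold_for(0.95, 0.1), 2))   # few correct matches in the sample
```

The incorrect curve is identical in both runs; only the share of correct matches changes, and the 95% cut-off shifts to a higher score when correct matches are rarer.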

Page 80: Interpreting  MS\\MS Results

Conclusion

• All search engines use different criteria, producing different scores
• Using multiple search engines simultaneously yields better results
• Peptide Prophet can normalize search engine results

So in conclusion, all of the search engines look at different criteria

Page 81: Interpreting  MS\\MS Results

Conclusion

• All search engines use different criteria, producing different scores
• Using multiple search engines simultaneously yields better results
• Peptide Prophet can normalize search engine results

And we can leverage this to identify more peptides

Page 82: Interpreting  MS\\MS Results

Conclusion

• All search engines use different criteria, producing different scores
• Using multiple search engines simultaneously yields better results
• Peptide Prophet can normalize search engine results

And that Peptide Prophet is a great mechanism for doing that because it calculates true probabilities, instead of p-values

Page 83: Interpreting  MS\\MS Results

The End

