Mass Spectrometry in a drug discovery setting

Claus Andersen, Senior Scientist, Sienabiotech Spa

Post on 12-Feb-2016

TRANSCRIPT

Page 1: Mass Spectrometry in a drug discovery setting

Mass Spectrometry in a drug discovery setting

Claus Andersen, Senior Scientist

Sienabiotech Spa

Page 2: Mass Spectrometry in a drug discovery setting

Bioinformatics and statistics in a drug discovery company, Claus Andersen

Overview
• From genes to phenotype
• Proteins: an introduction
• Mass Spec for protein identification, quantification, characterization
• Mass Spec data
• Mass Spec data analysis
• Mass Spec database searching
• Recent advances

Page 3: Mass Spectrometry in a drug discovery setting

From genes to phenotype

[Diagram. Levels: genes, proteins, functions, pathways, metabolites, phenotypes]

[Diagram labels: mRNA expression, Regulation, Degradation, Activation/inactivation, Interactions, Kinematics, Protein abundance, Metabolite levels, ADME/Tox, Structure, Pharmacophore, Genome comparison]

Page 4: Mass Spectrometry in a drug discovery setting

Proteins as functional units

[Figure labels: Glucose, ATP, Myosin]

Image credits: D.S. Goodsell, pdb.org; Vale and Milligan, Science 2000

Page 5: Mass Spectrometry in a drug discovery setting

What affects the proteome

[Diagram centred on the cellular proteome, with these labels:]
• Interactions
• Temperature
• Stress
• Environment
• Physiological role
• Pharmaceutical substances
• Proteasome (protein degradation)
• mRNA
• Ribosome (protein production)
• Genome

Page 6: Mass Spectrometry in a drug discovery setting

Mass Spec on proteins

[Workflow diagram labels: Treated/Sick and Control/Healthy samples, Protein extraction and digestion, Protein peptides, HPLC, Mass Spectrometer, MS spectra, Peptides and MS/MS spectra]

• Identification
• Quantification
• Characterization: post-translational modifications (PTM) such as phosphorylation (P), oxidation (O), …

Example peptide sequences shown on the slide: KKYAAELHLV, KAVQQPDGLA, QFHFHWGSLDQPDGLA

Page 7: Mass Spectrometry in a drug discovery setting

Mass Spec data

5 g

• 3000 MS spectra: 500 MB
• 400 MS/MS spectra: 200 MB
• Total: 700 MB

Gygi et al., Mol. Cell Bio. (1999)

Page 8: Mass Spectrometry in a drug discovery setting

Mass Spec data analysis
• Fourier transformation (noise filtering)
• Gaussian peak fitting (peak detection)
• Generation of theoretical spectra (sequence to spectra)
• Large scale spectral comparison (DB searching)
• Spectral deconvolution (de-novo sequencing)
• Large scale sequence searching (DB searching)
• Data fitting (quantitation)
• Statistics and probability theory (reliability estimation)
• Linear discriminant analysis (quality assessment)
• … and lots more

Page 9: Mass Spectrometry in a drug discovery setting

Large scale spectral comparison

Mass spec data
• MS spectrum: (Mpeptide+H)+ ± Δ
• MS/MS spectrum

In-silico data
• Protein sequence DB, ~2 mil, e.g.
  FLIDSSRFSYPERPIIFLSMCYNIYSIAYIVRLTVGRERISCDFEEAAEPVLIQEGLKNT
• Protein peptides, ~60 mil, e.g.
  ERPIIFLSMCYNIYSIAYIV
• Peptide fragments, ~2000 mil, e.g.
  ERPIIFLSMCYNIYSIAYIV
  ERPIIFLSMCYNIYSIAYI
  ERPIIFLSMCYNIYSIAY
  ERPIIFLSMCYNIYSIA
  ERPIIFLSMCYNIYSI
  ERPIIFLSMCYNIYS
  ERPIIFLSMCYNIY
  ERPIIFLSMCYNI
  ERPIIFLSMCYN
  ERPIIFLSMCY
  ERPIIFLSMC
  ERPIIFLSM
  …
  V, IV, YIV, AYIV, IAYIV, SIAYIV, YSIAYIV, IYSIAYIV, NIYSIAYIV, …
  etc. etc…

Counts per candidate peptide i: N_i, K_i
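The in-silico side of this slide (peptide, fragment ladders, precursor mass window) can be sketched in a few lines of Python. This is an illustration, not the pipeline used in the talk. It assumes unmodified peptides, singly charged b- and y-ions, and standard monoisotopic residue masses. The function names and the tolerance parameter `delta` are hypothetical.

```python
# Standard monoisotopic residue masses in Da (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def precursor_mh(peptide):
    """Monoisotopic (M+H)+ of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def fragment_ladders(peptide):
    """N-terminal prefixes and C-terminal suffixes, full peptide excluded."""
    n = len(peptide)
    prefixes = [peptide[:i] for i in range(n - 1, 0, -1)]
    suffixes = [peptide[-i:] for i in range(1, n)]
    return prefixes, suffixes


def by_ions(peptide):
    """Singly charged b-ion (prefix) and y-ion (suffix) m/z values."""
    prefixes, suffixes = fragment_ladders(peptide)
    b = [sum(RESIDUE_MASS[a] for a in f) + PROTON for f in prefixes]
    y = [sum(RESIDUE_MASS[a] for a in f) + WATER + PROTON for f in suffixes]
    return b, y


def candidates(peptides, observed_mh, delta):
    """Peptides whose (M+H)+ lies within observed_mh +/- delta."""
    return [p for p in peptides if abs(precursor_mh(p) - observed_mh) <= delta]


if __name__ == "__main__":
    pep = "ERPIIFLSMCYNIYSIAYIV"
    prefixes, suffixes = fragment_ladders(pep)
    print(prefixes[:3])
    print(suffixes[:4])
    print(len(prefixes) + len(suffixes), "fragments")
```

Only peptides that pass the precursor window need their fragment ladders compared with the MS/MS spectrum, which keeps the comparison tractable despite the size of the fragment space.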

Page 10: Mass Spectrometry in a drug discovery setting

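Large scale spectral comparison

PEP_PROBE by Sadygov and Yates, Anal. Chem. 75, 2003

Hypergeometric probability model

\[
P_{K,N}(K_i, N_i) = \frac{\binom{K}{K_i}\,\binom{N-K}{N_i-K_i}}{\binom{N}{N_i}}
\]

where

\[
\binom{n}{k} = \frac{n!}{k!\,(n-k)!}
\]

is the binomial coefficient, and

\[
K = \sum_i K_i, \qquad N = \sum_i N_i
\]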
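The hypergeometric model can be evaluated numerically. The sketch below assumes the standard hypergeometric form, in which K of the N fragments in the candidate pool match the spectrum and K_i of the N_i fragments of peptide i are among them. It works in log space so that database-sized values of N do not overflow. The function names are illustrative and not taken from PEP_PROBE.

```python
from math import exp, lgamma


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    if k < 0 or k > n:
        return float("-inf")  # C(n, k) = 0 outside 0 <= k <= n
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def hypergeom_log_pmf(K, N, K_i, N_i):
    """log P(K_i matches among N_i fragments | K matches among N)."""
    return (log_binom(K, K_i)
            + log_binom(N - K, N_i - K_i)
            - log_binom(N, N_i))


def hypergeom_pmf(K, N, K_i, N_i):
    """Point probability of the hypergeometric model."""
    return exp(hypergeom_log_pmf(K, N, K_i, N_i))


if __name__ == "__main__":
    # Toy numbers, chosen only so the result is easy to check by hand:
    # C(4,2) * C(6,1) / C(10,3) = 6 * 6 / 120
    print(hypergeom_pmf(K=4, N=10, K_i=2, N_i=3))
```

A small probability means that so many matched fragments are unlikely to arise by chance, which is what makes the value usable as a match score.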

Page 11: Mass Spectrometry in a drug discovery setting

Large scale spectral comparison

Expectation value (E-value)

\[
E = N_{(M+H)} \, P(P \le P_0)
\]

where \(P(P \le P_0)\) is the cumulative distribution function given by the hypergeometric model, and \(N_{(M+H)}\) is the number of all peptides in the database matching the (M+H)+ mass value.

The E-value tells you how many peptides from the database are expected to have the same or better matches to the experimental spectrum by chance alone.

Sadygov and Yates, Anal. Chem. 2003
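One way to read the E-value definition is as a candidate count multiplied by a tail probability. The sketch below makes that reading explicit under two assumptions that are not confirmed by the slide: "same or better" is taken to mean at least K_i matched fragments, and the tail is computed directly from the hypergeometric model. The function names and `n_candidates` are hypothetical.

```python
from math import comb


def hypergeom_tail(K, N, K_i, N_i):
    """P(X >= K_i): K_i or more of the N_i fragments match by chance."""
    total = comb(N, N_i)
    upper = min(K, N_i)
    hits = sum(comb(K, k) * comb(N - K, N_i - k)
               for k in range(K_i, upper + 1))
    return hits / total  # exact integers, one rounding step at the end


def e_value(n_candidates, K, N, K_i, N_i):
    """Expected number of candidates matching at least this well by chance."""
    return n_candidates * hypergeom_tail(K, N, K_i, N_i)


if __name__ == "__main__":
    # Toy numbers for a hand check: tail = (36 + 4) / 120 = 1/3
    print(e_value(n_candidates=3, K=4, N=10, K_i=2, N_i=3))
```

An E-value well below 1 indicates that a match of this quality is not expected from the candidate set by chance alone.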

Page 12: Mass Spectrometry in a drug discovery setting

Large scale spectral comparison

An example from yeast (Saccharomyces cerevisiae)

Sadygov and Yates, Anal. Chem. 2003

MS/MS spectrum: (M+H)+ = 2076.010 ± 0.002 AMU

Yeast proteins:     6 200
Yeast peptides:     ~200 000
Peptide fragments:  ~5 mil

N = 569 160
K = 84 150

Top candidate peptides

Peptide                  Protein name   N_i   K_i   E-value
ATHILDFGPGGASGLGVLTHR    FAS1           40    34    10^-26.62
LTPPQLPPQLENVILNKY       SIP2           34    15    10^-5.25
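The fragment counts in the yeast example (40 and 34) are what one would get by counting one b-ion and one y-ion per peptide bond. That counting rule is an assumption here, not something the slide states. The short check below applies it to the two candidate peptides and reports the fraction of fragments matched.

```python
def n_by_fragments(peptide):
    """Number of singly charged b- and y-ions: two per peptide bond."""
    return 2 * (len(peptide) - 1)


# Matched fragment counts as given on the slide.
matched = {
    "ATHILDFGPGGASGLGVLTHR": 34,
    "LTPPQLPPQLENVILNKY": 15,
}

if __name__ == "__main__":
    for peptide, k_i in matched.items():
        n_i = n_by_fragments(peptide)
        print(peptide, n_i, k_i, round(k_i / n_i, 2))
```

The top candidate matches a much larger share of its fragments than the runner-up, which is consistent with the large gap between the two E-values.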

Page 13: Mass Spectrometry in a drug discovery setting

Large scale spectral comparison

The protein FAS1 is part of the fatty acid biosynthesis pathway of yeast. Its Enzyme Commission number is EC 2.3.1.86.

[Pathway figure with FAS1 marked. Source: www.kegg.org]

Protein identification

In general, several peptides (3-10) are found for each protein.

Page 14: Mass Spectrometry in a drug discovery setting

Large scale spectral comparison

Most recent advances
• Inverted sequence DB used for background distribution estimation (PRISM). Emili's group, Mol. Cell Proteomics 2(2), p. 96-106, 2003
• Number of sibling peptides (ProteinProphet). Aebersold's group, Anal. Chem. 74, p. 5383-5392, 2004
• Suffix tree searching. Lu and Chen, Bioinformatics 19(2), p. ii113-ii121, 2003
• Bayesian approach. Chen, Biosilico, in press 2004

Other approaches
• An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. Yates' group, J. Am. Soc. Mass Spec. 5(11), 1994
• ProbID: a probabilistic algorithm to identify peptides through sequence database searching using tandem mass spectral data. Aebersold's group, Proteomics 2(10), 2002
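The idea behind an inverted sequence database can be illustrated generically. Reversing every protein sequence gives entries with the same length and composition that should not match real spectra except by chance, so their scores approximate a background distribution. The sketch below is not PRISM's implementation. The `INV_` prefix, the function names, and the convention that higher scores are better are all assumptions.

```python
def inverted_database(proteins, prefix="INV_"):
    """Decoy database: every sequence reversed, names tagged with a prefix."""
    return {prefix + name: seq[::-1] for name, seq in proteins.items()}


def empirical_p_value(score, decoy_scores):
    """Fraction of decoy scores at least as good as `score` (higher = better)."""
    if not decoy_scores:
        raise ValueError("need at least one decoy score")
    return sum(s >= score for s in decoy_scores) / len(decoy_scores)


if __name__ == "__main__":
    # Hypothetical entry name; the sequence is the example from the slides.
    targets = {"example_protein": "ERPIIFLSMCYNIYSIAYIV"}
    print(inverted_database(targets))
    # Made-up scores, purely to show the call.
    print(empirical_p_value(5.0, [1.0, 2.0, 5.0, 7.0]))
```

In practice the same search and scoring would be run against both the target and the inverted database, and the decoy scores collected from that second search.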