a neural network-based method for predicting protein stability changes upon single point mutations

30
A neural network-based method for predicting protein stability changes upon single point mutations Emidio Capriotti, Piero Fariselli and Rita Casadio Biocomputing Unit Department of Biology, University of Bologna, Italy www.biocomp.unibo.it

Upload: novia

Post on 21-Mar-2016

43 views

Category:

Documents


4 download

DESCRIPTION

Biocomputing Unit Department of Biology, University of Bologna, Italy www.biocomp.unibo.it. A neural network-based method for predicting protein stability changes upon single point mutations. Emidio Capriotti, Piero Fariselli and Rita Casadio. Problem Definition - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: A neural network-based method for predicting protein stability changes upon single point mutations

A neural network-based method for predicting

protein stability changes upon single point

mutationsEmidio Capriotti, Piero Fariselli and Rita Casadio

Biocomputing Unit

Department of Biology,

University of Bologna,

Italy

www.biocomp.unibo.it

Page 2: A neural network-based method for predicting protein stability changes upon single point mutations

• Problem Definition• The State of the Art• Data Base• Neural Network Predictor• Results• Comparison with other Methods• I-Mutant

Page 3: A neural network-based method for predicting protein stability changes upon single point mutations

Problem Definition (I)

Native

A

Mutant

L

If we change Alanine 35 with a Leucine,is the protein stability increased ? Decreased? A35L

Page 4: A neural network-based method for predicting protein stability changes upon single point mutations

Problem Definition (I)

If we change Alanine 35 with a Leucine,is the protein stability increased ? Decreased?

Gf=Gu-Gf

Free Energy

Gf mut

U

F

Mutant

Gf nat

U

F

NativeGf = Gf mut - Gf nat

Page 5: A neural network-based method for predicting protein stability changes upon single point mutations

Problem Definition (II)The sign of Gfu

identifies the direction of the stability change

The sign is more informative than the |G|

Gf < 0 => the mutation increases the protein stability

Gf > 0 => the mutation decreases the protein stability

Our Neural Networks are trained to predict the sign of the stability change

Page 6: A neural network-based method for predicting protein stability changes upon single point mutations

• Problem Definition• The State of the Art• Data Base• Neural Network Predictor• Results• Comparison with other Methods• I-Mutant

Page 7: A neural network-based method for predicting protein stability changes upon single point mutations

Energy-based predictive methods

3) empirical energiesG = Wvdw Gvdw + Wsolv Gsolv + Wsc TSsc +...

2) statistical potentialsE(i,j) = - KT log ( f(i,j) )

1) physical effective energy potentials (classical MM force fields)

E= ½ks,ij(rij -ro)2 + ½kb,ij(ij –o)

2 +...

The State of the Art

Page 8: A neural network-based method for predicting protein stability changes upon single point mutations

+

+

-

-

OK

OK

Over/Under-predictions

Over/Under-predictions

Page 9: A neural network-based method for predicting protein stability changes upon single point mutations

• Problem Definition• The State of the Art• Data Base• Neural Network Predictor• Results• Comparison with other Methods• I-Mutant

Page 10: A neural network-based method for predicting protein stability changes upon single point mutations

The Data Base

http://www.rtc.riken.go.jp/jouhou/Protherm/

ProTherm is a collection of numerical data of thermodynamic parameters including Gibbs free energy change, enthalpy change, heat capacity change, transition temperature etc. for wild type and mutant proteins

Total number of entries 15379 Number of unique proteins 471 Total number of all proteins 668 Number of Proteins with mutants 195 Number of Single Mutations 7586 Number of Double Mutations 1192 Number of Multiple Mutations 563 Number of Wild Type 6038

Gromiha et al. (2000). Nucleic Acids Res. 28, 283-285

Page 11: A neural network-based method for predicting protein stability changes upon single point mutations

Training/testing Data set (I)

The data set of proteins was extracted from ProTherm, with the following constraints:

i) the G value was experimentally detected and reported in the data base;

ii) the protein structure is known with atomic resolution (and deposited in the PDB (Berman et al., 2000));

iii) the data are relative to single mutations (no multiple mutations have been taken into account).

Page 12: A neural network-based method for predicting protein stability changes upon single point mutations

After this filtering procedure, we ended up with 2 data sets

S1615 : 1615 different single mutations

S388 : 388 mutations from containing only experiments performed at physiological conditions (T 20-40 °C, pH 6-8)

Training/testing Data set (II)

S388S1615

Page 13: A neural network-based method for predicting protein stability changes upon single point mutations

• Problem Definition• The State of the Art• Data Base• Neural Network Predictor• Results• Comparison with other Methods• I-Mutant

Page 14: A neural network-based method for predicting protein stability changes upon single point mutations

Neural Network Predictor (I)

• N1: A 20 element vector that describes the aminoacid mutation, pH and T

• N2: adds to the N1 input one more neuron for the relative accessibility surface of the mutated residue

• N3: adds to N2 20 more input neurons (43 in total) encoding the three-dimensional residue environment

Page 15: A neural network-based method for predicting protein stability changes upon single point mutations

A C D E F G H I K L M N P Q R S T V W Y1-1

Mutation E->A

pHT

Network N1

A

Relative Solvent Accessibility

N2

Neural Network Predictor (II)

E->A

A

A

L

G

G

LE

I

L

E

A C D E F G H I K L M N P Q R S T V W Y22 1 32

Environment

N3Radius

Page 16: A neural network-based method for predicting protein stability changes upon single point mutations

• Problem Definition• The State of the Art• Data Base• Neural Network Predictor• Results• Comparison with other Methods• I-Mutant

Page 17: A neural network-based method for predicting protein stability changes upon single point mutations

Cross-validation performance of the different

neural networks on S1615

+ and – : the index is evaluated for positive and negative signs of protein energy stability change, respectively.

Method Q2 P(+) Q(+) P(-) Q(-) C

N1 0.74 0.59 0.23 0.76 0.94 0.24N2 0.75 0.57 0.45 0.80 0.87 0.34N3 0.81 0.71 0.52 0.83 0.91 0.49

Page 18: A neural network-based method for predicting protein stability changes upon single point mutations

Cross-validation performance of N3 as a function of different protein environments (different radius) centred on the mutated residue

Method Radius Q2 P(+) Q(+) P(-) Q(-) CN3-4.5 4.5 0.79 0.63 0.55 0.83 0.88 0.45N3-6.0 6.0 0.79 0.63 0.57 0.84 0.87 0.46N3-9.0 9.0 0.81 0.71 0.52 0.83 0.91 0.49N3-12.0 12.0 0.79 0.63 0.59 0.84 0.87 0.47

Page 19: A neural network-based method for predicting protein stability changes upon single point mutations

Q2 accuracy of neural network (N3-9.0) as a function of the reliability index (Rel)

Page 20: A neural network-based method for predicting protein stability changes upon single point mutations

0

0.2

0.4

0.6

0.8

1

<0.5 1 2 >2

| Stability Change |

Q2DB(%)

Q2 accuracy of neural network (N3-9.0) as a function of the absolute value of protein stability changes upon mutation (|Stability Change|)

Kcal/mol

Page 21: A neural network-based method for predicting protein stability changes upon single point mutations

• Problem Definition• The State of the Art• Data Base• Neural Network Predictor• Results• Comparison with other Methods• I-Mutant

Page 22: A neural network-based method for predicting protein stability changes upon single point mutations

Comparison of neural network with other methods on S388

Method Q2 P(+) Q(+) P(-) Q(-) C

FOLDX(1) 0.75 0.26 0.56 0.93 0.78 0.25DFIRE(2) 0.68 0.18 0.44 0.90 0.71 0.11PoPMuSiC(3) 0.85 0.33 0.25 0.90 0.93 0.20N3-9.0 0.87 0.44 0.21 0.90 0.96 0.25

(1) http://fold-x.embl-heidelberg.de. (2) http://phyyz4.med.buffalo.edu/hzhou/dmutation.html(3) http://babylone.ulb.ac.be/popmusic/

Page 23: A neural network-based method for predicting protein stability changes upon single point mutations

Accuracy of joint-methods on subsets of S388

Method Agreement Q2 P(+) Q(+) P(-) Q(-) CN3-9.0 72% 0.93 0.88 0.28 0.93 0.99 0.47+ FOLDX(1) N3-9.0 69% 0.90 0.36 0.16 0.92 0.97 0.19+ DFIRE(2)

N3-9.0 86% 0.91 0.67 0.07 0.92 0.99 0.19+PoPMuSiC(3)

Page 24: A neural network-based method for predicting protein stability changes upon single point mutations

• Problem Definition• The State of the Art• Data Base• Neural Network Predictor• Results• Comparison with other Methods• I-Mutant

Page 25: A neural network-based method for predicting protein stability changes upon single point mutations

I-Mutant

Page 26: A neural network-based method for predicting protein stability changes upon single point mutations

I-Mutant Web Server

http://gpcr.biocomp.unibo.it/cgi/predictors/I-Mutant/I-Mutant.cgi/

Page 27: A neural network-based method for predicting protein stability changes upon single point mutations

thank you for your attention that’s all !

Biocomputing Unit

Department of Biology,

University of Bologna,

I taly

www.biocomp.unibo.it

Emidio Capriotti, Stability test

Piero Fariselli

Rita Casadio

Page 28: A neural network-based method for predicting protein stability changes upon single point mutations

Measures of Accuracy

1/2iiiiiiii

iiii

)]o (n )u (n )o (p )u [(p)ou - n(p

C

ii

ii up

pQ

ii

ii op

pP

N

i

i

NpQ

1

2Overall Accuracy

The efficiency of the predictor is scored using the statistical indexes defined following.

Correlation coefficient

Probability correct Prediction

Coverage

Where N is the total number of prediction, p the correct number of predictions, u and o are the numbers of under and over predictions.

Page 29: A neural network-based method for predicting protein stability changes upon single point mutations

0

0.2

0.4

0.6

0.8

1

0 10 20 30 40 50 60 70 >70

Relative Solvent Accessibility (RSA)

Q2DB(%)

Q2 accuracy of the neural network (N3-9.0) as a function of the relative accessibility value of the mutated residue

Page 30: A neural network-based method for predicting protein stability changes upon single point mutations

Q2 accuracy as a function of the residue mutation type

native \ new Charged Polar Apolar

Charged

Polar

Apolar

0.62 (4%) 0.77 (8%) 0.72 (9%)

0.69 (6%) 0.82 (10%) 0.77 (17%)

0.75 (3%) 0.92 (12%) 0.87 (31%)