Page 1: BioExcel Webinar Series #7: PRODIGY, a web server to predict binding affinities in protein-protein complexes

bioexcel.eu

Partners Funding

PRODIGY, a web server to predict binding affinities in protein-protein complexes

Presenters: Anna Vangone

Host: Adam Carter

BioExcel Educational Webinar Series #7

12 October, 2016

Page 2

This webinar is being recorded

Page 3

BioExcel Overview

• Excellence in Biomolecular Software

- Improve the performance, efficiency and scalability of key codes

• Excellence in Usability

- Devise efficient workflow environments with associated data integration

• Excellence in Consultancy and Training

- Promote best practices and train end users

[Figure: workflow architecture diagram. Labelled components: Portal / Workbench, DMI Request, DMI Gateway, DMI Planner, DMI Optimiser, DMI Validator, DMI Enactor, DMI Executor, DMI Monitor, Repository, Registry, Data Source, Data Delivery Point. Labelled roles: Domain Expert, DMI Expert, DADC Engineer. Labelled flows: monitoring flow, data flow, service invocation.]

Page 4

Interest Groups

• Integrative Modeling IG

• Free Energy Calculations IG

• Best practices for performance tuning IG

• Hybrid methods for biomolecular systems IG

• Biomolecular simulations entry level users IG

• Practical applications for industry IG

• Training

• Workflows

Support platforms

http://bioexcel.eu/contact

Forums Code Repositories Chat channel Video Channel

Page 5

Today’s Presenter

Anna Vangone studied Chemistry at the University of Salerno (Italy) and received her MSc cum laude in 2009. She then joined the doctoral program in bioinformatics at the University of Salerno and worked as a visiting PhD student at the University of Oxford (UK), obtaining her PhD in 2013.

After a few visits to King Abdullah University of Science and Technology (Saudi Arabia) as an invited scientist, she joined the computational structural biology group at Utrecht University as a post-doc researcher. In 2015 she obtained a prestigious Marie Curie individual fellowship that has supported her research at Utrecht University since October 2015.

Her research interests focus on the study of protein-protein interactions, in particular the characterization and analysis of biological complexes by computational modeling and docking. Her work has resulted in 15 peer-reviewed publications and several international conference invitations.

Page 6

Structural Biology Group

www.bonvinlab.org

Anna Vangone

Computational Structural Biology group

Utrecht University

PRODIGY

A web-server to predict binding affinity in protein-protein complexes

Page 7

1. INTRODUCTION

2. THE METHOD

3. RESULTS

4. CONCLUSION

Binding affinity

Computational prediction methods

Page 9

Protein-protein interactions

DNA replication

Immune response

Signaling cascade

… many more

Pages 10-11

Protein-protein interactions: binding energy

[Figure: energy versus conformational space, with ΔG marked]

$\Delta G_{\mathrm{bind}} = G_{\mathrm{bound}} - G_{\mathrm{free}} = RT \ln(K)$

EXACT METHODS

• MD direct counting

• Umbrella Sampling

EMPIRICAL SCORING FUNCTIONS

Combination of energetic terms (En)

• Electrostatic

• Van der Waals

• Exclusion solvent

• ……

STRUCTURAL PROPERTIES

Inaccurate/fast
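The relation ΔG = RT ln(K) can be illustrated with a short conversion between a dissociation constant and a binding free energy. This is a minimal sketch, assuming that K is the dissociation constant Kd in mol/L, a temperature of 298.15 K, and energies in kcal/mol; the function names are hypothetical and not part of PRODIGY.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def dg_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant (M)."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def kd_from_dg(dg_kcal, temperature_k=298.15):
    """Dissociation constant (M) from a binding free energy (kcal/mol)."""
    return math.exp(dg_kcal / (R_KCAL * temperature_k))


# A nanomolar binder (assumed value, for illustration only)
dg = dg_from_kd(1e-9)
print(f"Kd = 1 nM  ->  dG = {dg:.2f} kcal/mol")
print(f"round trip ->  Kd = {kd_from_dg(dg):.2e} M")
```

Under these assumptions a tighter complex (smaller Kd) gives a more negative ΔG, which matches the "stronger" end of the affinity scale used in the benchmark.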

Pages 12-14

Structural properties

ΔG = f(BSA)

BSA (buried surface area): Chothia & Janin, Nature (1975); Horton & Lewis (1992)

ΔG = f(BSA, NIS)

NIS (Non Interacting Surface): Kastritis et al., J Mol Biol (2014)

Pages 16-18

Contacts at the interface

Interfacial contacts (ICs): number of residue pairs within a distance cut-off (5.5 Å)

Vangone and Bonvin, eLife (2015)

Classification of residues based on their physico-chemical properties

P1 is #ICs between charged-polar residues

P2 is #ICs between polar-apolar residues

……

ICs total    ICs Property P1    ICs Property P2    r

6            2                  4                  …

Performance: reported as Pearson’s Correlation Coefficient (r)
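The contact definition above (residue pairs within a 5.5 Å cut-off, then grouped by residue class) can be sketched in a few lines. This is an illustration only, not the PRODIGY implementation: the input format, the function names and in particular the assignment of each amino acid to the charged, polar or apolar class are assumptions made for the example.

```python
import math
from collections import Counter
from itertools import product

# Assumed residue classes, for illustration only.
RESIDUE_CLASS = {
    "ASP": "charged", "GLU": "charged", "LYS": "charged", "ARG": "charged",
    "SER": "polar", "THR": "polar", "ASN": "polar", "GLN": "polar",
    "HIS": "polar", "TYR": "polar",
    "ALA": "apolar", "CYS": "apolar", "GLY": "apolar", "PHE": "apolar",
    "ILE": "apolar", "LEU": "apolar", "MET": "apolar", "PRO": "apolar",
    "TRP": "apolar", "VAL": "apolar",
}


def interfacial_contacts(atoms_a, atoms_b, cutoff=5.5):
    """Residue pairs (one per chain) with any atom pair within `cutoff` Å.

    Each atom is ((residue_number, residue_name), (x, y, z)).
    """
    contacts = set()
    for (res_a, xyz_a), (res_b, xyz_b) in product(atoms_a, atoms_b):
        if math.dist(xyz_a, xyz_b) <= cutoff:
            contacts.add((res_a, res_b))
    return contacts


def classify_contacts(contacts):
    """Count contacts per class pair, e.g. 'charged/polar'."""
    counts = Counter()
    for (_, name_a), (_, name_b) in contacts:
        classes = sorted((RESIDUE_CLASS[name_a], RESIDUE_CLASS[name_b]))
        counts["/".join(classes)] += 1
    return counts


# Toy coordinates (made up): two atoms on chain A, three on chain B.
chain_a = [((1, "LYS"), (0.0, 0.0, 0.0)), ((2, "ALA"), (10.0, 0.0, 0.0))]
chain_b = [((1, "ASP"), (3.0, 0.0, 0.0)), ((2, "SER"), (12.0, 0.0, 0.0)),
           ((3, "LEU"), (30.0, 0.0, 0.0))]

contacts = interfacial_contacts(chain_a, chain_b)
counts = classify_contacts(contacts)
print(len(contacts), dict(counts))
```

The all-pairs loop is quadratic in the number of atoms; for real structures a spatial index (for example a k-d tree) would usually replace it.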

Pages 19-21

Training the predictor

LINEAR REGRESSION MODEL

$\Delta G_{\mathrm{predicted}} = w_1 P_1 + w_2 P_2 + \dots$

Candidate properties (N terms, weights w1 … w6):

P1 Charged/Charged

P2 Charged/Polar

P3 Charged/Apolar

P4 Polar/Polar

P5 Polar/Apolar

P6 Apolar/Apolar

FEATURE SELECTION (AIC, Akaike Information Criterion): the N candidate terms are reduced to M selected terms

CROSS-VALIDATION: 4-fold cross-validation, repeated 10 times

The dataset is split into Fold_1, Fold_2, Fold_3 and Fold_4; in each round three folds (75%) are used for training and the remaining fold (25%) for prediction
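The cross-validation scheme on this slide (4 folds, 75% training and 25% prediction, repeated 10 times) can be sketched as follows. To keep the example short and dependency-free it fits a single-feature least-squares line rather than the multi-term model of the webinar, and it omits the AIC feature selection step; the data and all function names are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def repeated_kfold_rmse(xs, ys, k=4, repeats=10, seed=0):
    """Mean held-out RMSE over `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    rmses = []
    for _ in range(repeats):
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]  # k disjoint folds
        for held_out in folds:
            held = set(held_out)
            train = [i for i in indices if i not in held]
            slope, intercept = fit_line([xs[i] for i in train],
                                        [ys[i] for i in train])
            sq_err = [(slope * xs[i] + intercept - ys[i]) ** 2
                      for i in held_out]
            rmses.append((sum(sq_err) / len(sq_err)) ** 0.5)
    return mean(rmses)


# Synthetic, noise-free data: a made-up linear link between a contact
# count and a free energy, so the held-out error should be about zero.
xs = [float(x) for x in range(20, 60, 2)]
ys = [-0.1 * x - 4.0 for x in xs]
cv_rmse = repeated_kfold_rmse(xs, ys)
print(f"mean held-out RMSE: {cv_rmse:.3g}")
```

Reshuffling before each repeat changes which complexes share a fold, so the averaged error depends less on one particular split.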

Page 22

The dataset: protein-protein binding affinity benchmark

Benchmark in: Kastritis et al., Protein Sci 2011

122 complexes with complete crystallographic structure

• Functional classes (antibody 12%, enzymes 41%, other 47%)

• ΔG (-4.3 / -18.6) kcal mol⁻¹

• BSA (808 – 3370) Å²

• Methods (Kd) (SPR, fluorescence, ITC…)

• Conformational changes (0.17-4.90) Å

Binding affinity scale: stronger (-18.6 kcal mol⁻¹) to weaker (-4.3 kcal mol⁻¹)

Pages 23-25

Conformational changes upon binding

Rigid complex: i_rmsd 0.42 Å

Flexible complex: i_rmsd 3.28 Å

Flexible complexes: i_rmsd > 1.0 Å

Pages 27-30

ICs and impact of experimental techniques

Technique           r_ICs    r_BSA    #cases

All                 -0.50    -0.32    122

Stopped-flow        -0.70    -0.55    8

Spectroscopy        -0.65    -0.27    14

ITC                 -0.55    -0.64    20

SPR                 -0.53    -0.44    39

Inhibition Assay     0.05    -0.08    17

Fluorescence         0.04     0.34    19

[Plots: ICs (y axis, 10 to 80) versus experimental ΔGs (kcal mol⁻¹, x axis, -20 to -4)]

First panel: R = -0.50, p-value < 0.0001

Inhibition Assay + Fluorescence panel: R = 0.05, p-value < 0.4

Third panel: R = -0.59, p-value < 0.0001
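The r values reported for each technique are Pearson correlation coefficients between a structural descriptor (ICs or BSA) and the experimental ΔG. A minimal sketch of that calculation is given below; the five data points are invented for illustration and are not taken from the benchmark.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Invented example: more contacts paired with more negative free energies.
ics = [30, 45, 50, 62, 75]
dg_exp = [-6.0, -8.5, -9.0, -11.0, -13.5]
r = pearson_r(ics, dg_exp)
print(f"r = {r:.2f}")
```

A negative r is the expected sign here: a larger number of contacts goes with a lower (stronger) binding free energy.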

Pages 31-34

The predictor

Vangone and Bonvin, eLife (2015)

$\Delta G_{\mathrm{pred}} = w_1 P_1 + w_2 P_2 + \dots$

Terms included in the model and resulting Pearson's r:

ICs total: r = 0.59

ICs property-based: r = 0.67

ICs property-based + NIS: r = 0.73

$$
\begin{aligned}
\Delta G_{\mathrm{predicted}} ={} & -0.09459\,\mathrm{ICs}_{\mathrm{charged/charged}} \\
& -0.10007\,\mathrm{ICs}_{\mathrm{charged/apolar}} \\
& +0.19577\,\mathrm{ICs}_{\mathrm{polar/polar}} \\
& -0.22671\,\mathrm{ICs}_{\mathrm{polar/apolar}} \\
& +0.18681\,\%\mathrm{NIS}_{\mathrm{apolar}} \\
& +0.13810\,\%\mathrm{NIS}_{\mathrm{charged}} \\
& -15.9433
\end{aligned}
$$

[Plot: predicted ΔGs versus experimental ΔGs (kcal mol⁻¹), both axes from -20 to -4; r = 0.73, RMSE = 1.89 kcal mol⁻¹]
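The final linear model can be evaluated directly once the four contact counts and the two NIS percentages are known. The sketch below is not the PRODIGY code: the function name and the input values are hypothetical, and the %NIS_charged coefficient is taken as 0.13810, the value of the published model (Vangone and Bonvin, eLife 2015).

```python
def predict_dg(ic_charged_charged, ic_charged_apolar, ic_polar_polar,
               ic_polar_apolar, nis_apolar_pct, nis_charged_pct):
    """Predicted binding free energy in kcal/mol from the linear model."""
    return (-0.09459 * ic_charged_charged
            - 0.10007 * ic_charged_apolar
            + 0.19577 * ic_polar_polar
            - 0.22671 * ic_polar_apolar
            + 0.18681 * nis_apolar_pct
            + 0.13810 * nis_charged_pct   # assumed published coefficient
            - 15.9433)


# Hypothetical interface: contact counts and NIS percentages are made up.
dg_pred = predict_dg(ic_charged_charged=4, ic_charged_apolar=23,
                     ic_polar_polar=2, ic_polar_apolar=14,
                     nis_apolar_pct=38.0, nis_charged_pct=28.0)
print(f"predicted dG = {dg_pred:.2f} kcal/mol")
```

With the reported RMSE of 1.89 kcal mol⁻¹, a single prediction of this kind is best read as an estimate with an uncertainty of roughly that size.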

Pages 35-36

Comparison with other methods

Performance compared with 105 functions reported in CCharPPI [1], calculated on the same set of structures (“composite scoring functions” reported in the plot)

[Plot: Pearson's correlation (y axis, 0.00 to 0.80) per scoring function, shown for ALL, RIGID and FLEXIBLE complexes]

Vangone and Bonvin, eLife (2015)

[1] CCharPPI web-server: Moal et al., Bioinformatics 2015

Page 38

PRODIGY: the web-server

Xue, Rodrigues, Kastritis, Bonvin, Vangone. Bioinformatics (2016)

Page 40

Take home message

The number and nature of interface contacts is a simple but robust predictor of binding affinity

http://milou.science.uu.nl/services/PRODIGY/

Page 41

ACKNOWLEDGMENTS

CSB group @ UU

Li Xue

João Rodrigues

Panagiotis Kastritis

Alexandre Bonvin

Page 42

Further info

Page 44

Cross-validation data

$$
\begin{aligned}
\Delta G_{\mathrm{predicted}} ={} & -0.09459\,\mathrm{ICs}_{\mathrm{charged/charged}} \\
& -0.10007\,\mathrm{ICs}_{\mathrm{charged/apolar}} \\
& +0.19577\,\mathrm{ICs}_{\mathrm{polar/polar}} \\
& -0.22671\,\mathrm{ICs}_{\mathrm{polar/apolar}} \\
& +0.18681\,\%\mathrm{NIS}_{\mathrm{apolar}} \\
& +0.13810\,\%\mathrm{NIS}_{\mathrm{charged}} \\
& -15.9433
\end{aligned}
$$

Vangone and Bonvin, eLife (2015)

Page 47

The advantage of using contacts instead of surfaces?

[Figure panels: ICs, BSA]

• ICs perform better than BSA

The BSA of each residue depends strongly on its ASA in the free form:

$\mathrm{BSA}_{aa} = \mathrm{ASA}_{aa,\mathrm{free}} - \mathrm{ASA}_{aa,\mathrm{complex}}$

For an amino acid at the interface: $\mathrm{ASA}_{aa,\mathrm{complex}} \approx 0$, so $\mathrm{BSA}_{aa} \approx \mathrm{ASA}_{aa,\mathrm{free}}$

Page 48

Audience Q&A session

Please use the Questions function in the GoToWebinar application

Any other questions or points to discuss after the live webinar? Join the discussion at http://bit.ly/2dc69Ur