Protein Docking and Molecular Shape Recognition Using Polar Fourier Correlations

Dave Ritchie, LORIA, Nancy

What is Protein Docking?

Protein docking = shape recognition in 3D space

However, ... proteins are flexible (more complexity)!

Contents

• Motivation – Importance of Protein-Protein Interactions (PPIs)

• Polar Fourier Protein Shape Representation

• Application to Protein Docking – Hex

• The CAPRI Blind Docking Experiment

• New Developments – Multi-Dimensional FFTs, Using GPUs

• Application to Molecular Shape Recognition – ParaSurf & ParaFit

• Conclusions & Future Prospects

PPI Networks are Fundamental to Biological Mechanisms

• If genomes provide the “blue-print” for life ...

• ... then proteins provide the “machinery”

• Understanding PPIs could lead to immense scientific advances and therapeutic benefits

Yeast network figure from: J Hallinan & G Smith ICCS 2002 article 584


Recent Growth of Protein-Protein Interaction (PPI) Literature

Citations of key yeast functional genomics papers (per year):

• Red: Ito et al., Uetz et al. (Y2H)

• Blue: Ho et al., Gavin et al. (TAP-MS)

• Black: All protein-protein interaction papers

Figure from: Bork et al., Curr Op. Struct. Biol. (2004) 14 292–299

Docking - Predicting PPIs at the 3D Molecular Level

Ab Initio

• Soft Docking – FFT, Polar Fourier Correlations (∼ hours)

• MC/MD – Flexible side chains + backbones (∼ days)

Re-Scoring

• Knowledge-based potentials

Data-Driven

• Biochemical: mutagenesis hot-spot residues

• Biophysical: NMR CSP/RDC, H/D exchange, 13C labeling, ...

• ET + Correlated mutations

• Structural Databases (docking by homology)

The Basic Goal of Protein-Protein Docking

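Find the minimum potential energy of the system as rapidly as possible:

$$E = \int \phi(\mathbf{r})\,\rho(\mathbf{r})\,dV$$

For two proteins

$$\phi(\mathbf{r}) = \phi_A(\mathbf{r}) + \phi_B(\mathbf{r}), \qquad \rho(\mathbf{r}) = \rho_A(\mathbf{r}) + \rho_B(\mathbf{r})$$

and so, keeping only the cross terms between A and B,

$$E = \int \left(\phi_A(\mathbf{r})\,\rho_B(\mathbf{r}) + \phi_B(\mathbf{r})\,\rho_A(\mathbf{r})\right) dV$$

• With brute-force search, typically need ∼ 10⁹ such integrals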

• Current algorithms often sum several such potential/density terms...

• ... and often use 3D Cartesian FFTs to accelerate the calculation

Real Spherical Harmonic Basis Functions

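Orthogonality:

$$\int y_{lm}(\theta,\phi)\,y_{l'm'}(\theta,\phi)\,d\Omega = \delta_{ll'}\,\delta_{mm'}$$

Rotation:

$$y_{lm}(\theta',\phi') = \sum_{m'=-l}^{l} R^{(l)}_{m'm}(\alpha,\beta,\gamma)\,y_{lm'}(\theta,\phi)$$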
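The orthogonality relation for the real spherical harmonics can be checked numerically. The following is a minimal sketch, assuming one common sign convention for the l ≤ 1 functions and a simple midpoint quadrature; the names `real_sh` and `overlap` are illustrative and do not come from Hex.

```python
import math

def real_sh(l, m, theta, phi):
    """Real spherical harmonics for l <= 1 (one common sign convention, assumed)."""
    if (l, m) == (0, 0):
        return math.sqrt(1.0 / (4.0 * math.pi))
    c = math.sqrt(3.0 / (4.0 * math.pi))
    if (l, m) == (1, -1):
        return c * math.sin(theta) * math.sin(phi)
    if (l, m) == (1, 0):
        return c * math.cos(theta)
    if (l, m) == (1, 1):
        return c * math.sin(theta) * math.cos(phi)
    raise ValueError("only l <= 1 is tabulated in this sketch")

def overlap(lm_a, lm_b, n=64):
    """Midpoint-rule estimate of the integral of y_a * y_b over the unit sphere."""
    d_theta = math.pi / n
    d_phi = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            total += (real_sh(*lm_a, theta, phi)
                      * real_sh(*lm_b, theta, phi)
                      * math.sin(theta))          # dOmega = sin(theta) dtheta dphi
    return total * d_theta * d_phi

# Diagonal overlaps should be close to 1, off-diagonal ones close to 0.
print(overlap((1, 0), (1, 0)), overlap((0, 0), (1, 0)))
```

The quadrature is deliberately crude; it is only meant to show what the Kronecker deltas in the orthogonality relation assert.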


Spherical Harmonic Surfaces

Example: 2D Radial Expansions (256 Basis Functions)

$$r(\theta,\phi) = \sum_{l=0}^{15}\,\sum_{m=-l}^{l} a_{lm}\,y_{lm}(\theta,\phi)$$

• Good for matching similar shapes, not so good for docking...

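Radial Basis Functions: $R_{nl}(r)$

HO-type (shape): $R_{nl}(r) = N^{(q)}_{nl}\,e^{-\rho/2}\,\rho^{l/2}\,L^{(l+1/2)}_{n-l-1}(\rho)$; $\rho = r^2/q$, $q = 20$.

Coulomb (electro): $R_{nl}(r) = N^{(\Lambda)}_{nl}\,e^{-\rho/2}\,\rho^{l}\,L^{(2l+2)}_{n-l-1}(\rho)$; $\rho = 2\Lambda r$, $\Lambda = 1/2$.

Orthogonality:

$$\int_0^\infty R_{nl}(r)\,R_{n'l}(r)\,r^2\,dr = \delta_{nn'}$$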

[Figure: plots of the radial functions R15,0(r), R20,0(r), R25,0(r) and R30,0(r)]

3D Protein Shape Density Representations (Ritchie & Kemp (2000) Proteins 39 178–194)

• Sample surface skins onto a (0.75 Å)³ grid...

[Figure labels: molecular surface, solvent accessible surface, surface skin, protein interior, sampling spheres, surface normals]

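Surface Skin:

$$\sigma(\mathbf{r}) = \begin{cases} 1; & \mathbf{r} \in \text{surface skin} \\ 0; & \text{otherwise} \end{cases}$$

Interior:

$$\tau(\mathbf{r}) = \begin{cases} 1; & \mathbf{r} \in \text{protein atom} \\ 0; & \text{otherwise} \end{cases}$$

Parametrise as:

$$\sigma(\mathbf{r}) = \sum_{nlm}^{N} a^{\sigma}_{nlm}\,R_{nl}(r)\,y_{lm}(\theta,\phi), \text{ etc.}$$

Estimate as:

$$a^{\sigma}_{nlm} \simeq \sum_{c} R_{nl}(r_c)\,y_{lm}(\theta_c,\phi_c)\,\Delta V$$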

• Only need to do this once for each protein...

Polar Fourier Shape Density Reconstruction - Antibody CDRs

Image   Order       Coefficients
A       Gaussians   -
B       N = 16      1,496
C       N = 25      5,525
D       N = 30      9,455

DW Ritchie (2003) Proteins Struct. Funct. Bioinf. 52 98–106
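The coefficient counts quoted for each expansion order can be reproduced by counting index triples. This is a small sketch, assuming the index ranges 1 ≤ n ≤ N, 0 ≤ l < n and −l ≤ m ≤ l (the n > l restriction appears later in the slides); the function names are illustrative.

```python
def polar_fourier_coefficient_count(order):
    """Count (n, l, m) triples with 1 <= n <= order, 0 <= l < n, -l <= m <= l."""
    return sum(2 * l + 1 for n in range(1, order + 1) for l in range(n))

def closed_form_count(order):
    """Same count via the sum of squares 1^2 + 2^2 + ... + order^2."""
    return order * (order + 1) * (2 * order + 1) // 6

for order in (16, 25, 30):
    print(order, polar_fourier_coefficient_count(order), closed_form_count(order))
# prints:
# 16 1496 1496
# 25 5525 5525
# 30 9455 9455
```

Each shell n contributes n² terms, which is why the total grows roughly as N³/3.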


3D Shape Density Reconstruction – CAPRI T21: Orc1/Sir1

DW Ritchie (2008) Curr. Prot. Pep. Sci. 9(1) 1-15

Docking Using 3D Polar Fourier Density Functions - “Hex”

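[Figure: surface skin density σ(r) and interior density τ(r)]

Densities:

$$\sigma(\mathbf{r}) = \sum_{nlm}^{N} a^{\sigma}_{nlm}\,R_{nl}(r)\,y_{lm}(\theta,\phi), \qquad \tau(\mathbf{r}) = \sum_{nlm}^{N} a^{\tau}_{nlm}\,R_{nl}(r)\,y_{lm}(\theta,\phi)$$

Favourable:

$$\int \left(\sigma_A(\mathbf{r}_A)\,\tau_B(\mathbf{r}_B) + \tau_A(\mathbf{r}_A)\,\sigma_B(\mathbf{r}_B)\right) dV$$

Unfavourable:

$$\int \tau_A(\mathbf{r}_A)\,\tau_B(\mathbf{r}_B)\,dV$$

Score:

$$S_{AB} = \int \left(\sigma_A\tau_B + \tau_A\sigma_B - Q\,\tau_A\tau_B\right) dV$$

Penalty Factor: Q = 11

DW Ritchie & GJL Kemp (2000) Proteins Struct. Funct. Bioinf. 39 178–194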
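To make the score concrete, here is a toy sketch that evaluates S_AB directly on sampled densities. It assumes both proteins have already been sampled onto the same list of grid cells for one fixed pose. Hex itself works with expansion coefficients rather than grids, so this illustrates only the scoring formula. The function name and the four-cell example are hypothetical.

```python
def shape_score(sigma_a, tau_a, sigma_b, tau_b, cell_volume, penalty=11.0):
    """Sum (sigma_A*tau_B + tau_A*sigma_B - Q*tau_A*tau_B) * dV over grid cells."""
    total = 0.0
    for sa, ta, sb, tb in zip(sigma_a, tau_a, sigma_b, tau_b):
        total += sa * tb + ta * sb - penalty * ta * tb
    return total * cell_volume

dv = 0.75 ** 3                      # volume of one (0.75 A)^3 grid cell

# Protein A: interior in cell 0, surface skin in cell 1.
sigma_a, tau_a = [0, 1, 0, 0], [1, 0, 0, 0]

# Pose 1: B's interior sits in A's skin (favourable contact).
contact = shape_score(sigma_a, tau_a, [0, 0, 1, 0], [0, 1, 0, 0], dv)

# Pose 2: B's interior overlaps A's interior (steric clash, penalised by Q).
clash = shape_score(sigma_a, tau_a, [0, 1, 0, 0], [1, 0, 0, 0], dv)

print(contact, clash)
```

With Q = 11, a single interior-interior overlap outweighs eleven skin-interior contacts, so clashing poses score poorly.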

Correlations - Overlap as a Function of Coordinate Operations

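Rotation:

$$R(\alpha,\beta,\gamma)\,\sigma_A(\mathbf{r}) = \sum_{nlm}^{N} a^{\sigma\prime}_{nlm}\,R_{nl}(r)\,y_{lm}(\theta,\phi)$$

Rotated Coefficients:

$$a^{\sigma\prime}_{nlm} = \sum_{m'=-l}^{l} R^{(l)}_{mm'}(\alpha,\beta,\gamma)\,a^{\sigma}_{nlm'}$$

Translation:

$$T_z(R)\,\sigma_A(\mathbf{r}) = \sum_{nlm}^{N} a^{\sigma\prime\prime}_{nlm}\,R_{nl}(r)\,y_{lm}(\theta,\phi)$$

Translated Coefficients:

$$a^{\sigma\prime\prime}_{nlm} = \sum_{n'l'}^{N} T^{(|m|)}_{nl,n'l'}(R)\,a^{\sigma}_{n'l'm}$$

Hence:

$$\int \sigma'_A(\mathbf{r})\,\tau''_B(\mathbf{r})\,dV = \sum_{nlm}^{N} a^{\sigma\prime}_{nlm}\,b^{\tau\prime\prime}_{nlm}, \text{ etc.}$$

Search Space: ∼ 10⁹ orientations (∼ 10⁶ orientations/sec)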

DW Ritchie (2005) J. Appl. Cryst. 38 808–818

Translation Matrices From Fourier-Bessel Transform Theory

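Using spherical Bessel transforms:

$$\tilde{R}_{nl}(\beta) = \sqrt{\frac{2}{\pi}} \int_0^\infty R_{nl}(r)\,j_l(\beta r)\,r^2\,dr; \qquad R_{nl}(r) = \sqrt{\frac{2}{\pi}} \int_0^\infty \tilde{R}_{nl}(\beta)\,j_l(\beta r)\,\beta^2\,d\beta$$

it can be shown that

$$T^{(|m|)}_{n'l',nl}(R) = \sum_{k=|l-l'|}^{l+l'} A^{(ll'|m|)}_{k} \int_0^\infty \tilde{R}_{nl}(\beta)\,\tilde{R}_{n'l'}(\beta)\,j_k(\beta R)\,\beta^2\,d\beta$$

where

$$A^{(ll'|m|)}_{k} = (-1)^{\frac{k+l'-l}{2}+m}\,(2k+1)\left[(2l+1)(2l'+1)\right]^{1/2} \begin{pmatrix} l & l' & k \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} l & l' & k \\ m & -m & 0 \end{pmatrix}$$

• Can derive analytic formulae for both GTO and ETO radial functions

• Requires high precision math library (GMP)...

• Calculate once for R = 1, 2, 3, ..., 50 Å and store on disk (∼ 200 MB)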


6D Docking Search as a Nested Sequence of Transformations

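Get 4 rotations from icosahedral tessellations ...

[Figure: proteins A and B on the z axis, separated by distance R, with rotation angles (β1, γ1) for A, (β2, γ2) for B and twist angle α2]

Rotate A (×812 @ 7.5°): $A'(\mathbf{r}) = R(0,\beta_1,\gamma_1)\,A(\mathbf{r})$

Translate A (×50 @ 0.75 Å): $A''(\mathbf{r}) = T_z(-R)\,A'(\mathbf{r})$

Rotate B (×812 @ 7.5°): $B'(\mathbf{r}) = R(0,\beta_2,\gamma_2)\,B(\mathbf{r})$

Twist B (×64 @ 5.6°): $B''(\mathbf{r}) = R(\alpha_2,0,0)\,B'(\mathbf{r})$

1D FFT:

$$S_{AB}(\alpha_2) = \sum_{m=1-N}^{N-1} \left(P_m \cos m\alpha_2 + Q_m \sin m\alpha_2\right)$$

Search Space: 812 × 50 × 812 × 64 ≃ 2 × 10⁹ (∼ 10⁶/s on a 1 GHz PIII Xeon)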
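The twist scan and the size of the search space can be illustrated with a short sketch. It evaluates the trigonometric series by direct summation at 64 equally spaced twist angles, where the slides use a 1D FFT to obtain the same samples more quickly. The coefficient dictionary and the function names are hypothetical.

```python
import math

def twist_scores(coeffs, samples=64):
    """Evaluate S(alpha) = sum_m P_m cos(m*alpha) + Q_m sin(m*alpha) on a regular grid.

    coeffs maps the integer frequency m to the pair (P_m, Q_m).
    Direct summation is used here; an FFT gives the same samples faster.
    """
    step = 2.0 * math.pi / samples
    scores = []
    for k in range(samples):
        alpha = k * step
        scores.append(sum(p * math.cos(m * alpha) + q * math.sin(m * alpha)
                          for m, (p, q) in coeffs.items()))
    return scores

def best_twist(coeffs, samples=64):
    """Return (twist angle in degrees, score) of the highest-scoring sample."""
    scores = twist_scores(coeffs, samples)
    k = max(range(samples), key=scores.__getitem__)
    return math.degrees(k * 2.0 * math.pi / samples), scores[k]

# Toy series S(alpha) = sin(alpha), which peaks at a twist of 90 degrees.
angle, score = best_twist({1: (0.0, 1.0)})
print(angle, score)

# Size of the nested 6D search described above.
search_space = 812 * 50 * 812 * 64
print(search_space)                  # 2109900800, i.e. about 2 x 10^9
```

Only the innermost twist loop is handled by the series; the outer rotations and translations each update the coefficients P_m and Q_m.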

Shape Correlation Score as a Function of Twist Angle α2 (Antibody HyHel-5/Lysozyme Complex)

[Figure: score S as a function of twist angle α2, for N = 16, N = 20 and N = 25]

Re-Docking Known Protein Complexes

        N = 16           N = 20           N = 25
Case    Top      RMS     Top      RMS     Top      RMS
SIC     3,407    0.00    2        0.22    1        0.82
KAI     17       0.41    3        0.69    7        0.81
PTC     132      0.52    2        0.48    1        0.48
CGI     1        0.38    1        0.38    1        0.38
CHO     1        0.45    1        0.55    1        0.55
BGS     1        0.82    1        0.82    1        0.88
GGI     1        2.47    1        0.90    1        0.90
TET     5        1.48    1        1.16    1        1.03
FPT     102      1.04    1        0.42    1        0.42
IGF     3        0.71    1        0.77    1        0.77
JEL     4,867    0.81    1,060    0.81    2        0.81
BQL     524      1.85    12       0.96    1        0.39
HFL     318      1.01    5        1.00    1        1.00
HFM     7        2.19    27       1.09    10       1.09
VFB     8,344    1.49    216      0.20    9        0.20
MLC     1,401    0.00    116      0.00    187      0.84
MEL     9,898    1.03    27       1.03    3        1.03
JHL     385      0.62    8        0.38    1        1.08
FBI     14       1.09    1        1.09    1        0.38
NCA     68       1.53    1        0.32    1        0.32
NMB     160      2.43    1,630    1.39    1,009    1.39
NSN     19,992   1.11    716      0.75    1,130    2.29
IAI     1,381    1.48    111      0.37    20       1.39
DVF     11,145   0.00    88       1.38    49       0.44
KB5     140      0.34    1        0.34    78       1.38
IGC     1,328    1.74    269      0.81    1        0.34

Show Docking Movie!


CAPRI – Critical Assessment of Predicted Interactions

• Started in 2001/2 following CASP with 19 groups & 7 targets...

• At least one protein presented in its unbound form

• Any predictive approach allowed: homology/literature, etc.

Target  Receptor        Ligand        Type   Complex                 Lab
1       HPr Kinase      HPr           U/U    Fieulaine et al.        Janin
2       Rotavirus VP6   MCV           U/B    Vaney et al.            Rey
3       Hemagglutinin   HC63          U/B    Barbey-Martin et al.    Knossow
4       α-Amylase       AMD10         U/B    Desmyter et al.         Cambillau
5       α-Amylase       AMB7          U/B    Desmyter et al.         Cambillau
6       α-Amylase       AMD9          U/B    Desmyter et al.         Cambillau
7       SpeA            TCR 14.3.D    U/U    Sundberg et al.         Mariuzza

• Now > 40 groups; Currently on Targets 28 ...

• 3 Sections - Predictors, Servers, Scorers

J Janin et al. (2003) Proteins Struct. Funct. Bioinf. 52 2–9

CAPRI Target 1 - Lactobacillus HPr / HprK

CAPRI Results: Targets 1–7

Predictor Software Algorithm T1 T2 T3 T4 T5 T6 T7

Abagyan ICM FF ** *** **

Camacho CHARMM FF * *** ***

Eisenstein MolFit FFT * * ***

Sternberg FTDOCK FFT * ** *

Ten Eyck DOT FFT * * **

Gray MC ** ***

Ritchie Hex SPF ** ***

Weng ZDOCK FFT ** **

Wolfson BUDDA/PPD GH * ***

Bates Guided Docking FF - - - ***

Palma BIGGER GF - - ** *

Gardiner GAPDOCK GA * * - - - - -

Olson Surfdock SH * - - - -

Valencia ANN * - - - - - -

Vakser GRAMM FFT * - - - -

* low, ** medium, *** high accuracy prediction; - no prediction

R Mendez et al. (2003) Proteins Struct. Funct. Bioinf. 52 51–67

Docked Orientation (Hex) for Target 3 - Hemagglutinin/HC63

• CAPRI “medium accuracy” (1 Å ≤ Ligand RMSD ≤ 5 Å)


Docked Orientation (Hex) for Target 6 - Amylase/AMD9

• CAPRI “high accuracy” (Ligand RMSD ≤ 1 Å)

Subsequent CAPRI Targets (Rounds 3 – 5)

Target  Description                   Comments
T8      Nidogen-γ3 - Laminin          U/U
T9      LiCT homodimer                build from monomer – 12 Å RMS deviation
T10     TBEV trimer                   build from monomer – 11 Å RMS deviation
T11     Cohesin - dockerin            U/U; model-build dockerin
T12     Cohesin - dockerin            U/B
T13     SAG1 - antibody Fab           SAG1 conformational change: 10 Å RMS
T14     MYPT1 - PP1δ                  U/U; model-build PP1α → PP1δ
T18     TAXI - xylanase               U/B
T19     Ovine prion - antibody Fab    model-build prion

• T15-T17 cancelled: structures released prematurely - Google!!!

• T11, T14, T19 involved homology model-building step...

CAPRI Results: Targets 8–19

Predictor Software T8 T9 T10 T11 T12 T13 T14 T18 T19

Abagyan ICM ** * ** *** * *** ** **

Wolfson PatchDock ** * * * * - ** ** *

Weng ZDOCK/RDOCK ** * *** *** *** ** **

Bates FTDOCK * * ** * ** ** *

Baker RosettaDock - ** *** ** *** ***

Camacho SmoothDock ** *** *** ** ** *

Gray RosettaDock *** - - ** *** **

Bonvin Haddock - - ** ** *** ***

Comeau ClusPro ** *** * *

Sternberg 3D-DOCK ** * * ** *

Eisenstein MolFit *** * *** **

Ritchie Hex ** *** * *

Zhou - - - *** ** * *

Ten Eyck DOT *** *** **

Zacharias ATTRACT ** - - - - *** **

Valencia * * * - -

Vakser GRAMM - - - - - ** **

Umeyama ** *

Kaznessis - - ***

Fano Grid-Hex - - *

R Mendez et al. (2005) Proteins Struct. Funct. Bioinf. 60 150–169

Docked Orientation (Hex) for Target 12 - Cohesin/Dockerin

• Here, we assumed “molecular mimicry”

• First superposed dockerin onto cohesin dimer, then docked...

• CAPRI “high accuracy” (Interface RMSD ≤ 1 Å)


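5D FFT Correlations from Complex Overlap Expressions (Ritchie, Kozakov, Vajda (2008) Bioinformatics 24 1865–1873)

Complex SHs, $Y_{lm}$:

$$y_{lm}(\theta,\phi) = \sum_{t} U^{(l)}_{mt}\,Y_{lt}(\theta,\phi)$$

Complex coefficients:

$$A_{nlm} = \sum_{t} a_{nlt}\,U^{(l)}_{tm}$$

Complex overlap:

$$S = \sum_{kjsmnlv} D^{(j)*}_{ms}(0,\beta_A,\gamma_A)\,A^{*}_{kjs}\,T^{(|m|)}_{kj,nl}(R)\,D^{(l)}_{mv}(\alpha_B,\beta_B,\gamma_B)\,B_{nlv}$$

Collect coefficients:

$$S^{(|m|)}_{js,lv}(R) = \sum_{kn} A^{*}_{kjs}\,T^{(|m|)}_{kj,nl}(R)\,B_{nlv}, \qquad k > j;\ n > l$$

To give:

$$S = \sum_{jsmlv} D^{(j)*}_{ms}(0,\beta_A,\gamma_A)\,S^{(|m|)}_{js,lv}(R)\,D^{(l)}_{mv}(\alpha_B,\beta_B,\gamma_B)$$

Expand as exponentials:

$$D^{(l)}_{mv}(\alpha,\beta,\gamma) = \sum_{t} \Gamma^{tm}_{lv}\,e^{-im\alpha}\,e^{-it\beta}\,e^{-iv\gamma}$$

Hence:

$$S = \sum_{jsmlvrt} \Gamma^{rm}_{js}\,S^{(|m|)}_{js,lv}(R)\,\Gamma^{tm}_{lv}\,e^{-i(r\beta_A - s\gamma_A + m\alpha_B + t\beta_B + v\gamma_B)}$$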

Comparing FFT Correlation Speeds

N=25 Correlations, 2.6 × 10⁹ Orientations (Single CPU, 1.8 GHz Xeon, 1 GB RAM)

        Set-up    FFT Rate     Total Rate   Total Time
        (mins)    (10⁶/sec)    (10⁶/sec)    (mins)
1D      8.0       1.0          0.8          43
3D      13.5      17.0         1.8          15
5D      9.8       4.5          2.2          21

The difference in 5D/3D FFT rates seems to be due to CPU-cache/main-memory thrashing

• For two-property correlations, 5D FFT is ∼ 2x faster and 3D is ∼ 3x faster than 1D

• BUT for multi-property correlations, 5D gives almost NO extra cost per property

Porting Hex to a GPU using CUDA

• Modern GPUs have very high compute performance

• SIMT architecture = single instruction, multiple threads

• NVIDIA GPUs:

• Up to 4 GB memory

• Up to 240 arithmetic “cores”

• Up to Tflop performance

• Easy API with C++ syntax

• Grid of threads SIMT model

• BUT – for best results, need to understand the hardware...

CUDA Device Architecture

• Typically 8–16 multiprocessor blocks, each with 16 thread units

[Figure: a multiprocessor block containing thread processors, thread-local memory and shared memory (16 KB, fast), connected to global memory (256 MB – 4 GB, slow) and to the host via PCIe]

• NB. global memory is ∼ 80x slower than shared memory

• Strategy: aim for “high arithmetic intensity” in shared memory


CUDA Example - Matrix Multiplication

• Matrix multiplication C = A * B

• Each thread is responsible for calculating one element: C[i,k]

• Threads cooperate by reading & sharing sub-blocks of A & B

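[Figure: C = A * B, shown for the conventional algorithm (row i of A, column k of B) and for GPU thread-blocks (block indices bx, by; thread indices tx, ty)]

• Conventional algorithm

  • C[i,k] = A[i] * B[k] (row i of A times column k of B)

• GPU thread-blocks

  • Multiprocessor launches multiple blocks to compute all of C

  • Running thread-blocks concurrently hides memory latency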

CUDA Programming - Matrix Multiplication Kernel

// Assumes 16x16 thread blocks and matrix dimensions that are multiples of 16.
__global__ void matmul(int wA, int wB, float *A, float *B, float *C)
{
    float Cik = 0.0f;                               // thread-local result variable
    int bx = blockIdx.x, tx = threadIdx.x;          // thread subscripts
    int by = blockIdx.y, ty = threadIdx.y;          // ("this" thread is one of a 2-D grid)
    __shared__ float a_sub[16][16], b_sub[16][16];  // declare shared memory

    for (int j = 0; j < wA; j += 16) {              // thread-local loop
        int ij = (16*by+ty)*wA + (j+tx);            // thread-local array subscripts
        int jk = (j+ty)*wB + (16*bx+tx);
        a_sub[ty][tx] = A[ij];                      // copy global data -> shared memory ("I/O")
        b_sub[ty][tx] = B[jk];
        __syncthreads();                            // wait until all memory I/O finished
        for (int jj = 0; jj < 16; jj++) Cik += a_sub[ty][jj] * b_sub[jj][tx];
        __syncthreads();                            // wait until all threads finished
    }
    int ik = (16*by+ty)*wB + (16*bx+tx);            // array subscript of result element
    C[ik] = Cik;                                    // copy local result -> global memory
}
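The index arithmetic of the kernel can be checked on the CPU before running it on a device. The sketch below is a sequential Python emulation, not CUDA code: the loops over `by`, `bx`, `ty` and `tx` stand in for the thread grid, and the helper names `tiled_matmul` and `naive_matmul` are illustrative. It assumes row-major flat arrays and dimensions that are multiples of 16, as the kernel does.

```python
TILE = 16

def tiled_matmul(a, b, h_a, w_a, w_b):
    """Emulate the tiled kernel: a is h_a x w_a, b is w_a x w_b, both row-major."""
    assert h_a % TILE == 0 and w_a % TILE == 0 and w_b % TILE == 0
    c = [0.0] * (h_a * w_b)
    for by in range(h_a // TILE):                 # one iteration per thread block
        for bx in range(w_b // TILE):
            acc = [[0.0] * TILE for _ in range(TILE)]   # one Cik per thread
            for j in range(0, w_a, TILE):
                # "Shared memory" tiles, loaded with the kernel's subscripts.
                a_sub = [[a[(TILE * by + ty) * w_a + (j + tx)]
                          for tx in range(TILE)] for ty in range(TILE)]
                b_sub = [[b[(j + ty) * w_b + (TILE * bx + tx)]
                          for tx in range(TILE)] for ty in range(TILE)]
                for ty in range(TILE):
                    for tx in range(TILE):
                        acc[ty][tx] += sum(a_sub[ty][jj] * b_sub[jj][tx]
                                           for jj in range(TILE))
            for ty in range(TILE):
                for tx in range(TILE):
                    c[(TILE * by + ty) * w_b + (TILE * bx + tx)] = acc[ty][tx]
    return c

def naive_matmul(a, b, h_a, w_a, w_b):
    """Reference: C[i,k] = sum_j A[i,j] * B[j,k]."""
    return [sum(a[i * w_a + j] * b[j * w_b + k] for j in range(w_a))
            for i in range(h_a) for k in range(w_b)]

# Small integer-valued test matrices, so both orderings give identical sums.
h_a, w_a, w_b = 16, 32, 16
a = [float((3 * n + 1) % 5) for n in range(h_a * w_a)]
b = [float((7 * n + 2) % 4) for n in range(w_a * w_b)]
print(tiled_matmul(a, b, h_a, w_a, w_b) == naive_matmul(a, b, h_a, w_a, w_b))
```

The same reference result could be compared against the output copied back from the GPU, allowing for floating-point rounding on non-integer data.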

CUDA Porting Strategy

• Only port compute-intensive steps e.g. matrix multiply ...

• Consider using provided CUDA libraries: cuFFT, cuBLAS...

• Perform recursion, random access calculations on CPU first...

• Re-write complex/clever data structures as vectors, arrays...

• ... and round-up array dimensions to multiples of 16

• Re-write loops on 1D vectors as 2D array operations, etc.

• Access array elements in natural order for best memory “I/O”

Preliminary CUDA Results for Hex Docking

• Overall speed-up depends on how you measure it!

• Currently, 30x–50x (128-core GTX-9800 vs. 1.8 GHz Xeon)

• In cuFFT, 3D FFT is slow compared to 1D FFT

• For Hex, best relative improvement is 1D FFTs using N=25

• Key Hex functions implemented using 5 or 6 CUDA kernels

• Total learning + programming effort = 4 weeks

• Modern GPUs are now very powerful and easy to program!

• New FX-5800 (240 core) should give “interactive” docking...


Fast 2D Surface Envelope Matching (Ritchie & Kemp (1999) J Comp Chem 20 383–395)

• 2D surface comparisons are much faster than 3D:

$$S_{AB} = \int \left|r_A(\theta,\phi) - r_B(\theta,\phi)\right|^2 d\Omega$$

• Expansions to L=7 (64 coeffs) take ∼ 0.05 s per superposition...

ParaSurf – SH Surfaces & Properties from Semi-Empirical QM (Lin & Clark (2005) J Chem Inf Model 45 1010–1016; Clark (2004) J Mol Graph 22 519–525)

• From MOPAC or VAMP calculate:

• Density contours of 2 × 10⁻⁴ e/Å³ (∼ SAS)

• MEP, IEL, EAL, αL as expansions to L=15

• Concise/convenient non-atomistic descriptors for CoMFA/QSAR?

ParaFit - High Throughput SH Surface & Property Matching

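Distance: $D = \int \left(r_A(\theta,\phi) - r_B(\theta,\phi)'\right)^2 d\Omega$

Orthogonality: $D = |a|^2 + |b|^2 - 2\,a \cdot b'$

Rotation: $b'_{lm} = \sum_{m'} R^{(l)}_{mm'}(\alpha,\beta,\gamma)\,b_{lm'}$

Hodgkin: $S = 2\,a \cdot b' / (|a|^2 + |b|^2)$

Carbo: $S = a \cdot b' / (|a|\,|b|)$

Tanimoto: $S = a \cdot b' / (|a|^2 + |b|^2 - a \cdot b')$

Multi-property: $S = p\,S_{shape} + q\,S_{MEP} + r\,S_{IEL} + s\,S_{EAL} + t\,S_{\alpha_L}$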
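Because the basis is orthonormal, every score above reduces to dot products of coefficient vectors. A minimal sketch, assuming the coefficients of the two surfaces are given as flat lists in the same (l, m) order and that b has already been rotated; the function names are illustrative rather than ParaFit's own:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def distance(a, b):
    """D = |a|^2 + |b|^2 - 2 a.b"""
    return dot(a, a) + dot(b, b) - 2.0 * dot(a, b)

def hodgkin(a, b):
    return 2.0 * dot(a, b) / (dot(a, a) + dot(b, b))

def carbo(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def tanimoto(a, b):
    ab = dot(a, b)
    return ab / (dot(a, a) + dot(b, b) - ab)

# Two toy coefficient vectors with the same "shape" but different size.
a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]
print(distance(a, b), hodgkin(a, b), carbo(a, b), tanimoto(a, b))
```

In this toy case the Carbo index is 1 because it ignores overall scale, while the Hodgkin and Tanimoto indices fall below 1, which is the practical difference between the three.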

Fast Brute-Force Superposition Searches

• Euler rotations generated from icosahedral tessellation of sphere

• 22,500 samples (500(β, γ) × 45(α)) of about 8 degree steps

• Refine with 16 × 16 × 16 equatorial grid of 1 degree steps

• Approx 0.05 seconds / superposition on 1.8GHz P-III Xeon CPU...


Canonical Orientations – Aligning Molecules to Principal Axes

• Find principal radii by brute force search to L=6

• similar to finding moments of inertia

• but no ambiguity with respect to 180 degree flips

z

x

• Canonical orientations of similar molecules often overlay very well

Clustering the Odour Dataset using 2D Surface Shape (Takane et al. (2004) Org. Biomol. Chem. 2 3250–3255)

• Seven classes: bitter, ambergris, camphoraceous, rose, jasmine, muguet, musk

• Following Takane et al., cluster into 10 groups using ParaSurf & ParaFit:

unix% PS mopac run

unix% PS parasurf run

unix% parafit -matrix -dif odour data.dif * p.sdf

unix% dif2jpg -n 10 -d odour data.dif

unix% eog odour data.jpg

Visualisation of Odour Dataset Clustering Results (Mavridis et al. (2007) J. Chem. Inf. Model. 47(5) 1787–1796)

[Figure panels: Clustering Superposed Pairs; Clustering Canonical Orientations]

Shape-Based Virtual Screening of CXCR4 & CCR5 Antagonists (V. Perez-Nueno et al. (2008) J. Chem. Inf. Model. 48(3) 509–533)

• Assembled 602 known actives (TAK779, AMD3100, etc.) against CXCR4 & CCR5

• Performed virtual screening against 4700 inactives (with TAK779, AMD3100 as queries)


Comparing Ligand-Based & Docking-Based Virtual Screening

• Docking enrichments are better for CXCR4 than CCR5 (better CXCR4 homology model)

• But shape-based scoring generally gives better enrichments overall...

Conclusions & Future Prospects

• Protein Docking (“Hex”):

• Novel, fast, & fairly accurate docking algorithm

• Multi-dimensional FFT gives good speed-up, especially 3D

• Polar Fourier FFT maps v. well to GPU, with v. good speed-up

• Main challenge is now scoring & flexibility, not search...

• Small-Molecule Applications:

• SH shape-matching is at least as good as ROCS and v. fast...

• The Future?

• Extensible to CoMFA/QSAR & ligand docking...?

• High throughput 2D/3D database screening now feasible...?

Acknowledgments

ANR 2009-2010

BBSRC 1996-2000, 2006-07

EPSRC 2002-06

Tim Clark, Brian Hudson & Vishwesh Venkatraman

Sandor Vajda & Dima Kozakov

Lazaros Mavridis & Violeta Perez-Nueno

Software & Preprints: http://www.loria.fr/~ritchied/

PSFB special issue: Third CAPRI Evaluation Meeting Dec 2007 (Google: Proteins Wiley)

Review: DW Ritchie (2008) Curr. Prot. Pep. Sci. 9(1) 1–15.

Extra Slides


Using Low Resolution Docking to Cross-Validate Predicted PPIs?

Low resolution docking of Trypsin + BPTI

a) Crystal structure + low res FFT

• Gold: BPTI location in crystal

• Red: centroid of calculated BPTI solutions

b) Model-built structure (green) + low res FFT

Figure from: Tovchigrechko et al., Prot. Sci. (2002) 11 1888–1896

Multi-Sample Docking for Very Large Molecules - Antibody-VP2