Coarse and Reliable Geometric Alignment
for Protein Docking
Yusu Wang
Stanford University
Joint Work with P. K. Agarwal, P. Brown, H. Edelsbrunner, J. Rudolph
Duke University
Motivation
How do proteins interact with each other?
Docking problem: predict the docking configuration
Challenges
Physical and biochemical mechanism: binding sites? energy function (hydrophobicity, electrostatics, etc.)
High complexity: thousands of atoms; high dimension, flexibility
Two-step Approach
Coarse alignment: rigid molecules, small sets of candidates
Refinement: flexibility, chemical information
Coarse Alignment
Goal
A relatively small set of possible configurations
not too tight, global fitting …
Lock and Key Principle
… To use an image, I would say that enzyme and glycoside have to fit into each other like a lock and a key, in order to exert a
chemical effect on each other…
--- Emil Fischer, 1894
Geometric complementarity at a coarse level
Coarse Alignment Algorithm
Capture features (protrusions, cavities)
Align these features
Capture Features
Previous work: feature points, Connolly function [Connolly, 83]
Our work: feature pairs; describe more global features, specify importance
Capture Feature Pairs
Height function as example
Extend to all directions -- Elevation function
k-legged Maxima of Elevation
1-legged 2-legged 3-legged 4-legged
[Agarwal, Edelsbrunner, Harer, Wang, SOCG’04]
Examples
2-legged, 3-legged, 4-legged
3-Legged Maximum
In short:
Each maximum captures a feature on surface
Four types of features
Collect feature pairs: Any two points within the same maximum
A concise representation of meaningful features!
Surface Representation using Elevation
Coarse Alignment Algorithm
Describe protrusions and cavities (via feature pairs)
Align features
PairMatch algorithm:
Take a feature pair from each set
Align the two feature pairs to get a transformation T
Rank the T's by their scores
Output a ranked sequence of configurations
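The pair-then-rank loop can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: each feature pair is treated as a segment between two 3D points, the transformation maps midpoint to midpoint and direction to direction (the remaining rotation about the segment axis is ignored), and the score is a placeholder that counts moved points landing near points of the other set. The slides do not specify the scoring function, so `contact_score` and its `contact` parameter are hypothetical.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
def unit(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))

def rotate(v, axis, cos_t, sin_t):
    """Rodrigues' formula: rotate v about a unit axis by the given angle."""
    return add(add(scale(v, cos_t), scale(cross(axis, v), sin_t)),
               scale(axis, dot(axis, v) * (1.0 - cos_t)))

def align_pairs(pair_a, pair_b):
    """Rigid motion T taking segment pair_a onto segment pair_b."""
    da = unit(sub(pair_a[1], pair_a[0]))
    db = unit(sub(pair_b[1], pair_b[0]))
    ma = scale(add(*pair_a), 0.5)
    mb = scale(add(*pair_b), 0.5)
    axis = cross(da, db)
    sin_t, cos_t = math.sqrt(dot(axis, axis)), dot(da, db)
    if sin_t < 1e-12:
        # Parallel directions: identity, or a half-turn about a perpendicular axis.
        helper = (1.0, 0.0, 0.0) if abs(da[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = unit(cross(da, helper))
        sin_t, cos_t = 0.0, (1.0 if cos_t > 0 else -1.0)
    else:
        axis = scale(axis, 1.0 / sin_t)
    return lambda p: add(rotate(sub(p, ma), axis, cos_t, sin_t), mb)

def contact_score(moved, points_b, contact):
    """Placeholder score (assumption): moved points within `contact` of some point of B."""
    return sum(1 for p in moved
               if any(math.dist(p, q) <= contact for q in points_b))

def pair_match(points_a, pairs_a, points_b, pairs_b, contact=0.25):
    """Try every (pair from A, pair from B) alignment; return (score, i, j), best first."""
    ranked = []
    for i, pa in enumerate(pairs_a):
        for j, pb in enumerate(pairs_b):
            T = align_pairs(pa, pb)
            moved = [T(p) for p in points_a]
            ranked.append((contact_score(moved, points_b, contact), i, j))
    ranked.sort(key=lambda r: -r[0])
    return ranked

# Hypothetical data: B is A shifted by 5 along x, so matching pairs score highest.
points_a = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
pairs_a = [((0, 0, 0), (1, 0, 0)), ((0, 0, 0), (0, 2, 0))]
shift = (5, 0, 0)
points_b = [add(p, shift) for p in points_a]
pairs_b = [(add(p, shift), add(q, shift)) for p, q in pairs_a]
ranked = pair_match(points_a, pairs_a, points_b, pairs_b)
```

In a docking setting the score would reward complementary contact between the two surfaces and penalize overlap; the overlap count above only serves to show how candidate transformations are generated and ranked.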
Align Features
Reassembly of Known Complexes
A test set of 25 protein complexes
CoarseAlign: take the top 100 ranked coarse alignments
Refinement: using local improvement (Choi et al.)
Docking Results (CoarseAlign)
pdb-id   Rank   RMSD (Å)
1BRS     1      1.59
1A22     2      2.75
2PTC     1      4.55
1MEE     1      1.33
1CHO     1      2.71
1JLT     8      3.64
1CSE     2      2.21
3SGB     1      3.21
3HLA     1      1.87
Docking Results (Refinement)
pdb-id   Rank.refine   RMSD.refine
1BRS     1             0.54
1A22     1             1.08
2PTC     1             0.66
1MEE     1             0.57
1CHO     1             0.99
1JLT     1             1.57
1CSE     1             0.82
3SGB     1             2.24
3HLA     1             0.78
Overall
23/25 return a near-native configuration w/o false positives
Unbound Protein Docking
Docking benchmark by [Chen et al.’03]
Take 49 out of 59 complexes
Sample Results
         Top 2,000          All outputs
pdb-id   RMSD (Å)   Hits    RMSD (Å)   Size
1ACB     3.70       20      1.75       14,426
1AVW     5.51       8       5.42       23,565
1BRC     4.66       35      4.66       12,770
1BRS     1.60       7       1.60       11,607
1CGI     3.04       5       3.04       10,135
1CHO     2.35       27      2.35       11,815
1CSE     3.15       7       2.74       21,068
1DFJ     6.44       2       6.44       35,231
1MAH     2.78       4       2.78       25,402
More Results
Output size: all < ~50,000, most < 25,000
Quality:
Among top 2,000 ranked configurations: 38/49 produce at least one with < 6 Å
Among all outputs: 47/49 produce at least one with < 6 Å
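The quality figures are RMSD values in Å. As a minimal sketch of how such a check could be computed, the function below takes two equally ordered coordinate lists and returns their root-mean-square deviation, and a helper counts candidates under a cutoff. Which atoms enter the comparison is not stated in the slides, so the inputs here are placeholders; the helper names and the default `cutoff` of 6.0 (chosen to mirror the 6 Å figure above) are assumptions.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered 3D point lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((x - y) ** 2
                for p, q in zip(coords_a, coords_b)
                for x, y in zip(p, q))
    return math.sqrt(total / len(coords_a))

def count_below(candidates, native, cutoff=6.0):
    """Number of candidate placements whose RMSD to the native one is under cutoff."""
    return sum(1 for c in candidates if rmsd(c, native) < cutoff)

# Hypothetical two-atom example: one close candidate, one far candidate.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
near = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
far = [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]
print(rmsd(near, native), count_below([near, far], native))
```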
Summary
Elevation function -> meaningful features
Useful coarse alignments
Combine with refinement for the unbound case
Sample proteins
1A22 1JLT 3HLA
Related Docking Packages
FTDock, DOT, ZDock: FFT-based; shape complementarity, electrostatics
HEX: Fourier correlation
GRAMM: FFT (focus on low-resolution docking)
BiGGER
PPD: geometric hashing