
ANRV311-BB36-02 ARI 3 April 2007 20:8

Calculation of Protein-Ligand Binding Affinities∗

Michael K. Gilson1 and Huan-Xiang Zhou2

1Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville, Maryland 20850; email: [email protected]

2Department of Physics and Institute of Molecular Biophysics and School of Computational Science, Florida State University, Tallahassee, Florida 32306; email: [email protected]

Annu. Rev. Biophys. Biomol. Struct. 2007. 36:21–42

First published online as a Review in Advance on January 2, 2007

The Annual Review of Biophysics and Biomolecular Structure is online at biophys.annualreviews.org

This article's doi: 10.1146/annurev.biophys.36.040306.132550

Copyright © 2007 by Annual Reviews. All rights reserved

1056-8700/07/0609-0021$20.00

∗MKG dedicates this article to the memory of his father, Saul Bernard Gilson, MD: scholar, poet, and physician, 1922–2006.

Key Words

computation, drug-design, entropy, free energy, modeling

Abstract

Accurate methods of computing the affinity of a small molecule with a protein are needed to speed the discovery of new medications and biological probes. This paper reviews physics-based models of binding, beginning with a summary of the changes in potential energy, solvation energy, and configurational entropy that influence affinity, and a theoretical overview to frame the discussion of specific computational approaches. Important advances are reported in modeling protein-ligand energetics, such as the incorporation of electronic polarization and the use of quantum mechanical methods. Recent calculations suggest that changes in configurational entropy strongly oppose binding and must be included if accurate affinities are to be obtained. The linear interaction energy (LIE) and molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) methods are analyzed, as are free energy pathway methods, which show promise and may be ready for more extensive testing. Ultimately, major improvements in modeling accuracy will likely require advances on multiple fronts, as well as continued validation against experiment.

Annu. Rev. Biophys. Biomol. Struct. 2007.36:21-42. Downloaded from arjournals.annualreviews.org by University Of Maryland on 05/07/07. For personal use only.

Ligand: in general, any molecule that binds another; used here for a drug-like organic molecule, typically of molecular weight less than ∼600 Da

Contents

INTRODUCTION
MODELING THE PHYSICAL CHEMISTRY OF BINDING: FORCES AND FLEXIBILITY
    Potential Energy
    Effects of Water, Dissolved Electrolytes, and pH
    Theoretical Framework for Calculating Affinities
    Strain, Configurational Entropy, and Preorganization
APPROACHES TO CALCULATING AFFINITIES
    Docking and Scoring
    Free Energy Methods
PROSPECTS AND CHALLENGES

INTRODUCTION

Molecular recognition is central to biology, forming the foundation for precise hormonal control of distant organs, immune targeting of nonself proteins, and the specificity of enzymatic catalysis, for example. It also offers a powerful mechanism for exogenous control of biological systems, and many medications and biological probes act by binding and usually inhibiting a specific macromolecular target. Indeed, the discovery of a ligand that binds a targeted protein with high affinity is a major preoccupation of early-stage drug discovery and also of chemical genomics projects seeking small-molecule inhibitors to elucidate gene function. However, discovering a small molecule that binds a specific protein tightly, while retaining favorable pharmacological properties, can be a major and costly challenge.

Computation can speed this process (62) by artificial intelligence and machine-learning technologies and through simulation and modeling. In particular, structure-based modeling uses the three-dimensional atomic coordinates of the targeted protein to calculate the binding free energy of a proposed ligand, or at least to rank candidate ligands according to their predicted affinities for the target. This information is then available to guide molecular design and synthesis. Ideally, computational methods would be as accurate as experiment, with binding affinities correct to within 1 kcal mol⁻¹, but this goal has not been achieved. Indeed, calculating protein-ligand affinities is difficult. One reason is that the free energy of binding is typically a small difference between large numbers, namely the interactions of the ligand and protein with each other versus their interactions with water and counterions in their unbound state. Also, the interatomic forces are strong and short ranged, making for an energy function sharply dependent on the details of molecular conformation. Finally, the protein and the ligand are flexible and have many degrees of freedom; therefore exploring all potentially relevant conformations is a large computational task.

Despite these difficulties, computer modeling has enormous potential to make accurate predictions and provide insights that can guide molecular design. In this paper, we review recent progress and discuss prospects for further advances. We begin with a discussion of the forces involved in molecular recognition and how they are modeled. In particular, many approaches to calculating binding affinities can be analyzed by identifying how they treat the potential energy of the protein and ligand as a function of conformation, the influence of the aqueous solvent, and the conformational flexibility of the two molecules. This discussion provides the basis for a review of approaches to modeling binding, which are broadly categorized as docking and scoring and free energy methods. A short conclusion touches on the prospects for further progress. Space limitations preclude addressing many interesting topics, such as the treatment of protonation equilibria, the possible consequences of macromolecular crowding on binding, the automated design of ligands, and the special challenges posed by nucleic acid targets. For the same reason, the literature is summarized with a broad brush, using representative, rather than exhaustive citations, in hopes of conveying in a few pages the structure of the field and providing a framework that helps the reader make sense of a range of papers.

MODELING THE PHYSICAL CHEMISTRY OF BINDING: FORCES AND FLEXIBILITY

Perhaps someday there will be computers fast enough to compute binding affinities from first-principles quantum mechanical simulations of all the nuclei and electrons in a femtoliter solvent droplet, with dissolved protein and ligand molecules. For now, however, one must rely on a theoretical framework that yields the affinity from calculations on just one protein and one ligand, and on approximations that allow energies and forces to be computed more rapidly than does an ab initio quantum mechanical treatment. In general, such approximations entail reducing the number of degrees of freedom that are treated explicitly and replacing them with new forces that account for the implicit degrees of freedom. For example, one can model the water around a protein by explicitly simulating the motions of thousands of water molecules, allowing the properties of water to emerge automatically from the simulation. Alternatively, one can discard the explicit water molecules and instead use an implicit water model consisting of new energy terms for the hydrophobic effect and dielectric screening. Such models can provide a useful intuitive understanding of the system, but it is also important to remain alert to their potential weaknesses.

Below we outline current thinking on key energy components that drive or oppose binding, breaking them into potential energy (as if in vacuo) and the effects of solvent. Then a theoretical framework for computing affinities based on these forces is described, followed by some of the implications of the theory for binding thermodynamics, such as strain and entropy.

Strain: the energy cost of adopting the bound conformation

Potential Energy

The potential energy of an isolated molecule or complex in a given conformation can be computed by quantum mechanical methods that explicitly account for the electronic structure. Such quantum calculations can be done in various ways, but a central division is that between more time-consuming ab initio methods and faster semiempirical methods, which substitute fitted parameters for quantities that would otherwise need to be calculated. The details of these approaches are beyond the scope of this review. Quantum methods in principle are broadly applicable, but they are time-consuming relative to the force-field models discussed in the next paragraph, typically requiring minutes to months of computer time, depending upon the size of the system and the level of the calculation. Therefore, it is difficult to use them in methods that calculate the energy of many different conformations. It is also important to be aware that many ab initio methods markedly underestimate the attractive forces due to electron-electron correlations between atoms, i.e., dispersion forces, the attractive part of the van der Waals interaction.

Electronic polarization: induced redistribution of electronic charge in an atom or molecule

Empirical force fields (8) represent a computationally fast alternative to quantum mechanical calculations of the potential energy, because they encapsulate the chief consequences of electronic structure in a few energy terms that are easily calculated, often in well under 1 s for one conformation of a system. Current examples include CHARMM (15), AMBER (30), and GROMOS (121). A typical functional form is given by Equation 1:

\[
\begin{aligned}
U ={} & \sum_{i=1}^{N_{\mathrm{bonds}}} k_{b,i}\,(b_i - b_{0,i})^2
      + \sum_{i=1}^{N_{\mathrm{angles}}} k_{\theta,i}\,(\theta_i - \theta_{0,i})^2 \\
    & + \sum_{i=1}^{N_{\mathrm{dihedrals}}} k_{\phi,i}\,\cos(n_i\phi_i - \phi_{0,i})
      + \sum_{i=1}^{N_{\mathrm{imprp\text{-}dhd}}} k_{\omega,i}\,(\omega_i - \omega_{0,i})^2 \\
    & + \sum_{i}^{N_{\mathrm{atoms}}} \sum_{j \neq i}^{N_{\mathrm{atoms}}}
        \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right)
      + \sum_{i}^{N_{\mathrm{atoms}}} \sum_{j > i}^{N_{\mathrm{atoms}}}
        \frac{q_i q_j}{D\,r_{ij}}
\end{aligned}
\tag{1}
\]

The first two terms treat bond stretching and angle bending by harmonic springs with large force constants. The third term accounts for bond torsions (dihedrals), and the fourth, improper dihedral, term is used to keep planar groups planar. The fifth term accounts for the van der Waals interactions among atoms with a short-ranged hard repulsion that prevents atoms from overlapping and a slightly longer-ranged weak attraction attributable to correlated motions of the electrons (dispersion forces). The final term accounts for Coulombic interactions between atoms, which result chiefly from the fact that electrons tend to concentrate around atoms with large electronegativities and become depleted elsewhere, leading to negative and positive partial atomic charges; those with charges of large magnitude are polar atoms. It is evident from Equation 1 that an empirical energy function contains many parameters, such as the spring force constants k, the van der Waals parameters A and B, and the atomic charges q. These parameters can often be obtained by fitting to high-level quantum calculations, but it is best if they can be tested and adjusted by computing physical properties, such as solvation energies of model compounds, and comparing them with experimental data. When suitable experimental data are not available, it can be difficult to generate trustworthy parameters.
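As a rough illustration of how the two nonbonded terms of Equation 1 might be evaluated, the following minimal Python sketch sums the van der Waals and Coulombic energies over atom pairs. The function name, the data layout, and the conversion constant of 332.06 kcal Å mol⁻¹ e⁻² are assumptions made for this example and are not taken from CHARMM, AMBER, or GROMOS. Each pair is counted once here, which presumes that A and B are defined per unique pair.

```python
import math

# Assumed unit convention: energies in kcal/mol, distances in Angstrom,
# charges in units of the elementary charge e.
COULOMB_CONST = 332.06

def nonbonded_energy(coords, charges, A, B, D=1.0):
    """Van der Waals and Coulomb terms of an Equation 1 style energy.

    coords  : list of (x, y, z) tuples in Angstrom
    charges : list of partial atomic charges
    A, B    : dicts keyed by (i, j) with i < j (hypothetical pair parameters)
    D       : dielectric constant (1 for vacuum)
    """
    e_vdw = 0.0
    e_coul = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):  # each pair is counted once
            r = math.dist(coords[i], coords[j])
            e_vdw += A[(i, j)] / r**12 - B[(i, j)] / r**6
            e_coul += COULOMB_CONST * charges[i] * charges[j] / (D * r)
    return e_vdw, e_coul
```

For two atoms placed at the minimum of their van der Waals curve, the first returned value equals minus the well depth, while the second scales as 1/D, which is where the solvent screening discussed in the next section enters.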

Newer force fields are becoming more complex in order to achieve greater accuracy. More detailed models of the charge distribution can be employed (87, 102, 115), as touched on below, and there is also great interest in accounting for electronic polarization, the redistribution of atomic charges induced by changes in the electrical fields to which the atoms are subjected (66, 77).

Effects of Water, Dissolved Electrolytes, and pH

Protein-ligand binding normally occurs in a salt-water environment, which has a strong effect on energetics. Water has a dielectric constant (D in Equation 1) of about 80, whereas the dielectric constant of vacuum is 1. This leads to a new, one-body solvation energy for each atomic charge, which arises from the favorable interaction between the charge and the high-dielectric environment (12). As a consequence, there can be a substantial energy penalty for moving the polar part of a ligand out of water and into the binding site. In addition, water screens the charge-charge interactions of fully hydrated atoms by ∼80-fold, as indicated in the Coulombic term in Equation 1. However, atoms in a protein-ligand interface are sequestered from the solvent and therefore interact with an effective dielectric constant less than 80. In general, atoms that are further apart are more likely to interact through solvent, and this idea led to the introduction of a crude screening model consisting of a distance-dependent dielectric. For atoms i and j in Equation 1, D_ij = C r_ij, where C is a constant often set to 4 and r_ij is the interatomic distance. This model allows one of the chief effects of the solvent to be accounted for in a computationally efficient manner, and it is used in a number of ligand-protein docking algorithms. However, it fails to account for the one-body solvation energy of each atom. In addition, the electrostatic interaction of two atoms actually depends not only on their mutual distance, but also on the positions of all the other protein and ligand atoms, because these determine where the high-dielectric solvent can penetrate.
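To make the screening models concrete, the short Python sketch below compares the Coulombic term for a uniform dielectric constant with the distance-dependent form, in which the dielectric equals C times the interatomic distance. The function names and the conversion constant of 332.06 kcal Å mol⁻¹ e⁻² are assumptions introduced for illustration only.

```python
# Assumed unit convention: kcal/mol, Angstrom, charges in units of e.
COULOMB_CONST = 332.06

def coulomb_constant_dielectric(qi, qj, r, D=80.0):
    """Coulomb term of Equation 1 with a uniform dielectric constant D."""
    return COULOMB_CONST * qi * qj / (D * r)

def coulomb_distance_dependent(qi, qj, r, C=4.0):
    """Same term with the dielectric set to C * r, so the energy falls off as 1 / r^2."""
    return COULOMB_CONST * qi * qj / (C * r * r)

if __name__ == "__main__":
    # Opposite unit charges at a few separations.
    for r in (3.0, 6.0, 12.0):
        vac = coulomb_constant_dielectric(1.0, -1.0, r, D=1.0)
        wat = coulomb_constant_dielectric(1.0, -1.0, r, D=80.0)
        ddd = coulomb_distance_dependent(1.0, -1.0, r)
        print(f"r = {r:5.1f} A  vacuum {vac:9.2f}  water {wat:7.2f}  D = 4r {ddd:7.2f}")
```

With C = 4, the distance-dependent form interpolates between weak screening at contact distances and full water-like screening at 20 Å, where 4r reaches 80. It remains a pairwise function of distance only, so it cannot capture the one-body solvation energy or the dependence on surrounding atoms noted above.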

These additional complexities can be captured by continuum electrostatics models, which treat the protein and ligand as low-dielectric regions with embedded atomic charges, surrounded by the high-dielectric aqueous solvent (55). In particular, numerical solutions of the Poisson-Boltzmann (PB) equation offer a reasonably efficient method of accounting for the subtleties of molecular electrostatics, including the additional electrostatic screening produced by dissolved salts. The PB equation usually can be solved for a biomolecular system in seconds to minutes of computer time, and it yields the interaction energy of the protein, ligand, or complex with the water in which it is immersed. This electrostatic part of the solvation energy can be added to the potential energy obtained by a force field or other method to provide a more complete accounting of conformational energetics. When many conformations need to be processed, faster generalized Born (GB) models are valuable as approximations to the full PB model (34, 51, 58, 59, 80, 97). These are based on a recognition that dielectric screening of a charge-charge interaction correlates with the degree to which each charge interacts with surrounding water (43).

Poisson-Boltzmann (PB) equation: a partial differential equation solved to obtain the molecular electrostatic potential, which depends on the locations and values of atomic charges, the dielectric constant as a function of position, and the ionic strength

Generalized Born (GB) model: a fast approximation to the Poisson-Boltzmann equation

PBSA: Poisson-Boltzmann surface area

GBSA: generalized Born surface area

Implicit solvent model: a model that accounts for solvent without explicitly modeling individual water molecules

MD: molecular dynamics

MC: Monte Carlo

Water also gives rise to the hydrophobic effect, the tendency of water molecules to drive nonpolar solutes together (68). This promotes the association of nonpolar surfaces of the ligand and the protein. The hydrophobic effect is often accounted for by an additional solvation energy term that is proportional to molecular surface area, with a positive coefficient. The effect is to add a positive (unfavorable) solvation energy to conformations with more surface area and thus to favor binding, which reduces surface area. Combining the PB or GB electrostatics models with such a surface area term yields the PBSA (112) and GBSA (97) solvation models, respectively. These are called implicit solvent models because they do not treat any water molecules explicitly. Once parameterized, the PBSA and GBSA models provide rather good agreement with experimental solvation energies of small model compounds (10, 112), but they may be less accurate for more complex molecules, such as proteins, that can bind or completely sequester individual water molecules. Unfortunately, it is not straightforward to generate experimental data that directly address this issue, but computational studies are beginning to be applied to it (50, 83, 91, 118).
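The surface area term described above can be sketched in a few lines of Python. The coefficient value of 0.005 kcal mol⁻¹ Å⁻² and the surface areas in the usage note are placeholders chosen for illustration, not parameters from any published PBSA or GBSA model, and the function names are hypothetical.

```python
def nonpolar_solvation(sasa, gamma=0.005):
    """Nonpolar solvation energy taken as proportional to surface area.

    sasa  : solvent-accessible surface area in square Angstrom
    gamma : positive coefficient in kcal/(mol A^2); illustrative value only
    """
    return gamma * sasa

def nonpolar_binding_contribution(sasa_complex, sasa_protein, sasa_ligand, gamma=0.005):
    """Change in the surface area term when protein and ligand form a complex."""
    return (nonpolar_solvation(sasa_complex, gamma)
            - nonpolar_solvation(sasa_protein, gamma)
            - nonpolar_solvation(sasa_ligand, gamma))
```

If binding buries 200 Å² in total, for example areas of 10400, 10000, and 600 Å² for the complex, protein, and ligand, the contribution is negative (about −1 kcal mol⁻¹ with the placeholder coefficient), reflecting how a positive coefficient favors association.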

The influence of solvent on binding can also be treated with molecular dynamics (MD) or Monte Carlo (MC) simulations that include thousands of explicit water molecules modeled with an empirical force field (6, 64, 102). Dielectric screening, the solvation of polar groups, and the hydrophobic effect all emerge automatically with this approach. In addition, it should provide a better treatment of bound and sequestered water molecules, at least in principle. However, an explicit treatment of solvent is substantially more costly computationally than an implicit model, by perhaps an order of magnitude, depending on the specifics of the comparison.

Physiological solutions also buffer protons at a specific pH and thus control the electrical charges of acidic and basic groups of both the protein and the ligand. As a result, the pH can strongly influence conformational preferences and binding affinities. Indeed, protonation and binding are thermodynamically linked, so that the pH affects the apparent binding affinity and binding in turn affects protonation states. A number of research groups are working on methods of predicting protonation states, which will undoubtedly be important for computing accurate affinities. The potential for many drug-like molecules to exist in multiple tautomeric states poses similar challenges.


Theoretical Framework for Calculating Affinities

Given a method of computing the potential and solvation energies of a protein, a ligand, and their complex, a theory of binding is still needed to compute their affinity. One widely used approach (42) is based on recognition that the free energy of binding is the change in free energy when one protein and one ligand react to form one complex. The change in free energy of a solution when one complex is added to the system is defined as the complex's chemical potential, μ_PL; similarly, the change in free energy when one protein and one ligand are removed is −μ_P − μ_L. Thus, the free energy of binding is ΔG = μ_PL − μ_P − μ_L. Statistical thermodynamics provides an expression for the chemical potential of the protein, for example, as

\[
\mu_P = -RT \ln\left( \frac{Q_{N_P+1,\,N_S}}{Q_{N_P,\,N_S}} \right)
      = -RT \ln\left( \frac{8\pi^2}{C_P} \int e^{-(U(r_P) + W(r_P))/RT}\, dr_P \right),
\tag{2}
\]

where R is the gas constant; T is the absolute temperature; Q_{N_P+1,N_S} and Q_{N_P,N_S} are the canonical partition functions of solutions of N_P + 1 proteins and N_S solvent molecules and N_P proteins and N_S solvent molecules, respectively; the factor of 8π² results from the fact that overall rotation of the protein does not affect the integral over the other coordinates of the system; C_P is the concentration of the protein; U(r_P) is the potential energy of the protein as a function of its internal coordinates r_P and thus of its conformation; and W(r_P) is the corresponding solvation energy. (A mass-dependent prefactor that cancels in the final expression for the binding affinity has been omitted, as has a small pressure-volume term.) The Boltzmann factor in the integral is larger for more stable conformations of the protein, so that the more low-energy conformations the protein can adopt, the greater the integral and the more negative its standard chemical potential, implying greater stability and less reactivity. Also, the higher the concentration C_P, the more positive the standard chemical potential, implying greater reactivity, consistent with the law of mass action.
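The link between the number of accessible low-energy conformations and the chemical potential can be checked numerically on a toy problem. The sketch below is only an illustration under stated assumptions: a single coordinate x stands in for the internal coordinates, the two energy functions are invented, the solvation energy is taken as folded into the energy, and the 8π²/C_P prefactor of Equation 2 is dropped because it cancels when two molecules are compared at the same concentration.

```python
import math

RT = 0.593  # kcal/mol, approximately RT at 298 K

def config_integral(energy, lo, hi, n=20001):
    """Trapezoid-rule estimate of the integral of exp(-E(x)/RT) over [lo, hi]."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(-energy(lo + k * h) / RT)
    return total * h

def one_well(x):
    """Hypothetical molecule with a single stable conformation near x = 1."""
    return 5.0 * (x - 1.0) ** 2

def two_wells(x):
    """Hypothetical molecule with two equally stable conformations near x = -1 and x = 1."""
    return min(5.0 * (x - 1.0) ** 2, 5.0 * (x + 1.0) ** 2)

z1 = config_integral(one_well, -4.0, 4.0)
z2 = config_integral(two_wells, -4.0, 4.0)

# Difference in chemical potential at equal concentration, from Equation 2.
delta_mu = -RT * math.log(z2 / z1)
```

Doubling the number of equally stable wells very nearly doubles the integral, so `delta_mu` comes out close to −RT ln 2: the molecule with more low-energy conformations has the more negative chemical potential.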

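Supplying analogous formulae for the chemical potentials of the ligand and the complex allows the free energy of binding to be written as

\[
\Delta G = -RT \ln\left( \frac{1}{8\pi^2}\, \frac{C_P C_L}{C_{PL}}\,
  \frac{\int e^{-(U(r_{PL}) + W(r_{PL}))/RT}\, dr_{PL}}
       {\left( \int e^{-(U(r_P) + W(r_P))/RT}\, dr_P \right)
        \left( \int e^{-(U(r_L) + W(r_L))/RT}\, dr_L \right)} \right).
\tag{3}
\]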

For the most part, the integrals in this expression are considered to range over all conformations, but only conformations that are low in energy actually require attention; for example, conformations with highly stretched bonds can be neglected. However, a subtlety arises for the protein-ligand complex, because the integral should include only configurations in which the ligand is considered bound to the protein. For tight complexes, any integration domain that includes the lowest-energy configurations and not too much more suffices (42); for weak binding, a more involved definition is needed (88).

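Replacing each concentration in Equation 3 by the standard concentration C° (1 M) yields the standard free energy of binding:

\[
\Delta G^{\circ} = -RT \ln\left( \frac{C^{\circ}}{8\pi^2}\,
  \frac{\int e^{-(U(r_{PL}) + W(r_{PL}))/RT}\, dr_{PL}}
       {\left( \int e^{-(U(r_P) + W(r_P))/RT}\, dr_P \right)
        \left( \int e^{-(U(r_L) + W(r_L))/RT}\, dr_L \right)} \right).
\tag{4}
\]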

The factor of C° makes it apparent that the higher the standard concentration, the more negative the standard free energy of binding, again consistent with the law of mass action. The relationship of ΔG° to the equilibrium constant K is established by recognizing that ΔG = 0 when the system is at equilibrium. Equation 3 then gives the equilibrium ratio of concentrations as:

\[
\left( \frac{C_{PL}}{C_P C_L} \right)_{\mathrm{eq}} = \frac{1}{8\pi^2}\,
  \frac{\int e^{-(U(r_{PL}) + W(r_{PL}))/RT}\, dr_{PL}}
       {\left( \int e^{-(U(r_P) + W(r_P))/RT}\, dr_P \right)
        \left( \int e^{-(U(r_L) + W(r_L))/RT}\, dr_L \right)}.
\tag{5}
\]

Multiplying both sides by C° and using Equation 4 yields that

\[
K = e^{-\Delta G^{\circ}/RT} = \left( \frac{C_{PL}\, C^{\circ}}{C_P C_L} \right)_{\mathrm{eq}}.
\tag{6}
\]
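Equation 6 is the relation most often used in practice to convert between measured constants and free energies. The following small Python sketch shows the conversion in both directions; the function names are hypothetical, energies are assumed to be in kcal mol⁻¹, and K is the dimensionless association constant of Equation 6, so that the dissociation constant in molar units is C°/K.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def standard_free_energy(K, T=298.15):
    """Equation 6 rearranged: standard free energy of binding from the dimensionless K."""
    return -R_KCAL * T * math.log(K)

def dissociation_constant(dG0, T=298.15, C0=1.0):
    """Dissociation constant in molar units, Kd = C0 / K = C0 * exp(dG0 / RT)."""
    return C0 * math.exp(dG0 / (R_KCAL * T))
```

For a ligand with a dissociation constant of 1 nM, K is 10⁹ and `standard_free_energy(1e9)` gives roughly −12.3 kcal mol⁻¹ at 298 K.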

Equation 4 can be used to derive various methods of computing binding affinities, including Double Decoupling and Mining Minima (23, 52). Furthermore, Equation 4 can be manipulated into the form

\[
\Delta G^{\circ} = \langle U_{PL} \rangle - \langle U_P \rangle - \langle U_L \rangle
  + \langle W_{PL} \rangle - \langle W_P \rangle - \langle W_L \rangle
  - T \Delta S^{\circ}_{\mathrm{config}},
\tag{7}
\]

where ⟨U_PL⟩ and ⟨W_PL⟩, for example, are the Boltzmann-averaged potential and solvation energies of the complex; and ΔS°_config, the change in configurational entropy, is the change in the entropy associated with ligand and protein motions, including translational motion, for binding at standard concentration. The solvation terms are actually solvation free energies and include both energetic and entropic components. Equation 7 is recognizable as the basis of most physics-based scoring functions as well as the molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) method (46, 114).
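The bookkeeping implied by Equation 7 can be sketched as follows. This is a minimal illustration, not the MM-PBSA protocol of any particular package: the function name and argument layout are hypothetical, the per-snapshot energies are assumed to come from simulations that already sample the Boltzmann distribution (so that plain means serve as Boltzmann averages), and the entropy term is assumed to be supplied by a separate calculation.

```python
from statistics import mean

def endpoint_binding_free_energy(U_PL, U_P, U_L, W_PL, W_P, W_L, minus_TdS_config):
    """Assemble Equation 7 from per-snapshot energies, all in kcal/mol.

    U_*              : lists of potential energies for complex, protein, ligand
    W_*              : lists of solvation free energies for the same snapshots
    minus_TdS_config : the -T * dS(config) term, computed elsewhere
    """
    dU = mean(U_PL) - mean(U_P) - mean(U_L)
    dW = mean(W_PL) - mean(W_P) - mean(W_L)
    return dU + dW + minus_TdS_config
```

A usage example with invented numbers: energies averaging −151, −101, and −11 for the complex, protein, and ligand, solvation terms averaging −61, −71, and −9, and an entropy term of +12 combine to −39 + 19 + 12 = −8 kcal mol⁻¹, showing how a modest net value arises from larger opposing contributions.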

Configurational entropy: the entropy associated with motions of the ligand and/or protein

Some methods of computing binding affinities yield predicted values of ΔG° that do not depend on the standard concentration. This is problematic because the standard concentration is arbitrary. Indeed, if it were changed from 1 M to 1 nM, for example, then every experimental standard free energy of binding would be reported as RT ln 10⁹ ≈ 12 kcal mol⁻¹ more positive. Therefore, a case can be made that the success of a model is fortuitous if the model's predictions do not depend on standard concentration. On the other hand, a model that omits standard concentration can legitimately provide the difference between two binding affinities, because the standard concentration cancels in such differences.
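The arithmetic behind this statement follows directly from Equation 4 and can be written out as a short worked example. The primed symbol for the alternative standard concentration, the labels A and B for two ligands, the shorthand I for the ratio of configuration integrals, and the value RT ≈ 0.593 kcal mol⁻¹ (about 298 K) are notational assumptions added here.

```latex
% Shift in the standard free energy when the standard concentration changes
% from C° to C°' (the configuration integrals in Equation 4 are unaffected):
\[
\Delta G^{\circ}\big|_{C^{\circ\prime}} - \Delta G^{\circ}\big|_{C^{\circ}}
  = -RT \ln\frac{C^{\circ\prime}}{C^{\circ}}
\]
% For C°' = 1 nM and C° = 1 M at about 298 K:
\[
-RT \ln\left(10^{-9}\right) = RT \ln 10^{9}
  \approx 0.593 \times 20.72 \approx 12.3\ \mathrm{kcal\ mol^{-1}}
\]
% For two ligands A and B binding the same protein, with I_A and I_B denoting
% the ratios of configuration integrals in Equation 4, C° cancels:
\[
\Delta G^{\circ}_{A} - \Delta G^{\circ}_{B}
  = -RT \ln\frac{C^{\circ} I_A / 8\pi^2}{C^{\circ} I_B / 8\pi^2}
  = -RT \ln\frac{I_A}{I_B}
\]
```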

Strain, Configurational Entropy, and Preorganization

Equations 4 and 7 not only provide a basis for computational models, but also have important implications that can be understood intuitively. In particular, they bear directly on the concepts of strain, configurational entropy, and preorganization. If the ligand adopts a strained conformation in the bound state, the rise of its internal energy on binding will make a positive contribution to ⟨U_PL⟩ − ⟨U_P⟩ − ⟨U_L⟩ and thus oppose binding. Alternatively, if the ligand accesses a wide range of equally stable conformations in the free state, but only a narrow range of conformations in the bound state, its configurational entropy will drop and thus oppose binding. Finally, if the bound conformation of the ligand corresponds to an especially stable conformation of the free ligand, then the ligand is preorganized for binding, and both the strain and entropic penalties will be low. The same considerations apply to the protein.
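A crude way to see the size of the entropic effect is to treat the ligand as occupying a discrete set of conformers and to compare the free and bound populations. The Python sketch below does this; it is a simplification introduced here for illustration, since the configurational entropy discussed in this review is defined over continuous coordinates and also includes motions within each energy well. The function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def conformer_entropy(populations):
    """Entropy -R * sum(p ln p) of a discrete set of conformer populations."""
    return -R_KCAL * sum(p * math.log(p) for p in populations if p > 0.0)

def entropy_penalty(free_pops, bound_pops, T=298.15):
    """Contribution -T * dS (kcal/mol) on going from free to bound populations."""
    return -T * (conformer_entropy(bound_pops) - conformer_entropy(free_pops))
```

A ligand with nine equally populated conformers when free and a single conformer when bound pays RT ln 9, about 1.3 kcal mol⁻¹ at 298 K, whereas a preorganized ligand with one dominant conformer in both states pays essentially nothing in this simplified picture.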

Changes in configurational entropy upon binding are the subject of a literature too complex and extensive to be summarized here; a key problem has been the lack of a method to compute these quantities. The Mining Minima technique does allow such calculations, and applications to host-guest model systems (23, 26) and a protein-ligand system (22) indicate that the entropic penalty of Equation 7 is large, opposing binding nearly as strongly as the energy changes favor it. Thus, the loss in configurational entropy cancels most of the binding energy, and the net binding free energy is a small difference between larger numbers. Importantly, leaving out the configurational entropy, and thus focusing only on energy, leads to significantly worse correlation with experimental free energies of binding. Nonetheless, there is significant entropy-energy correlation, and scaling the energy terms by a factor of perhaps 0.1 to 0.3 yields numbers in the experimental range. Similar results should hold for protein-ligand binding.

APPROACHES TO CALCULATING AFFINITIES

Docking and Scoring

Docking methods try to identify the most stable conformation of the ligand-protein complex using a simplified energy model, frequently an empirical force field with a simple solvent model for the sake of computational speed. The conformation is then used to assign a binding energy or score to the ligand, sometimes using the same energy model that was used during docking and sometimes with a more sophisticated and time-consuming model. Reliance on a single bound conformation in effect assumes that this conformation is the only one that is significantly occupied, so that it dominates the averages in Equation 7. Pioneering initial approaches (32, 45, 76) have been followed by further advances from the same groups and many others, as recently reviewed (14, 113), and a number of methods are actively used for computer-aided drug discovery (110).

The proliferation of docking and scoring methods also has inspired evaluations and comparisons (25, 69, 125, 127), which are of particular interest to drug companies making purchasing decisions. Some of the most commonly used criteria are a method's ability to reproduce the correct bound conformation of a ligand-protein complex, its ability to assign better energy scores to known high-affinity ligands than to a large set of decoy compounds (enrichment), and the ability to generate scores that correlate with measured binding affinities of known ligands. Overall, existing methods frequently yield accurate bound complexes and can usually provide significant enrichment. However, the results are system dependent, and different methods perform better on different systems. In addition, scores are at best weakly predictive of affinities across a series of known ligands. Carrying out a fair evaluation poses its own challenges (29); for example, enrichment studies provide different results when the known ligands have nanomolar versus micromolar affinities. Docking results also depend on procedural details outside the actual docking protocol, such as the choice of protonation sites on the ligands (75, 120). Results are sometimes improved when multiple scoring functions are combined in a single consensus score, with the idea that a ligand that ranks high on multiple scoring functions is more likely to bind tightly (7, 24, 90). Nevertheless, the gains of this approach are limited by the simplicity of typical scoring functions, which are designed for computational throughput.
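Two of the evaluation ideas in this paragraph, enrichment and consensus scoring, are simple enough to sketch in Python. The definitions below are common textbook forms chosen for illustration (an enrichment factor over a top-ranked fraction and an average-rank consensus); they are assumptions and do not reproduce the exact protocols of the cited studies. Lower scores are assumed to mean better predicted binding, and ties are not treated specially.

```python
def enrichment_factor(scores, is_active, top_fraction=0.1):
    """Active rate in the top-scoring fraction divided by the overall active rate.

    scores    : list of docking scores, lower meaning better (assumed convention)
    is_active : list of booleans, True for known ligands and False for decoys
    """
    n = len(scores)
    n_top = max(1, int(round(n * top_fraction)))
    order = sorted(range(n), key=lambda i: scores[i])
    hits_top = sum(is_active[i] for i in order[:n_top])
    return (hits_top / n_top) / (sum(is_active) / n)

def consensus_ranks(score_lists):
    """Average rank of each compound across several scoring functions."""
    n = len(score_lists[0])
    totals = [0.0] * n
    for scores in score_lists:
        order = sorted(range(n), key=lambda i: scores[i])
        for rank, i in enumerate(order, start=1):
            totals[i] += rank
    return [t / len(score_lists) for t in totals]
```

An enrichment factor of 1 corresponds to random selection, and the maximum attainable value depends on the ratio of actives to decoys, which is one reason comparisons across studies require care.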

As a consequence, there is great interest in more sophisticated and hence more accurate scoring functions. If these prove to be slow, it will still be possible to prescreen compounds with a fast scoring function and apply the more accurate methods only to compounds that pass the prescreen. The fast method also can provide initial docked conformations to initiate calculations with the slower one. The following subsections review efforts to increase the sophistication of docking and scoring methods.

Improved potential energy models. One important direction involves using potential energy models that are more advanced than a typical empirical force field. A particularly interesting study provides evidence that docking accuracy is substantially better when the energy model accounts for the redistribution of ligand charges due to the electric field of the binding site, i.e., for electronic polarization (28). The procedure involves iteratively adjusting partial atomic charges via quantum mechanical calculations during a docking procedure. Rather large shifts in charge were observed, up to an average of 0.4 e across all atoms for some ligands. A number of other studies also explore force fields with more detailed treatments of electrostatics (31, 37, 87), and a different style of force field, designed for use with metalloenzymes, has now been used to model protein-ligand systems (48).

Another approach has been to use a quantum mechanical treatment of the ligand, and even the protein, rather than rely on an empirical force field (47, 71, 98, 122). Because such methods capture dispersion forces poorly, a van der Waals dispersion term has been combined with the quantum approach, as done in an intriguing trial application to 165 protein-ligand systems (99). A recent review provides a useful perspective on the use of quantum calculations in structure-based drug discovery (95).

Improved solvent models. Many scoring functions have relied on simple models of the solvent, such as the distance-dependent dielectric. These fail to capture the sensitivity of the interactions between atomic charges to the details of their locations relative to the solvent, as well as the fact that polar groups have strong, favorable interactions with the solvent that may be lost upon binding. This issue has been addressed by incorporating PBSA and GBSA solvation models into docking and scoring, and a number of papers report improvements in accuracy (56, 84, 86, 87, 111, 135, 136). However, increased accuracy in some systems requires careful assignment of protonation and tautomer states (38, 56). It seems that, in these cases, removing one source of error exposes another one.
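To make the limitation of the distance-dependent dielectric concrete, the following minimal sketch evaluates a Coulomb interaction with a dielectric that grows linearly with distance. The form ε(r) = 4r and the conversion factor 332.06 kcal mol⁻¹ Å e⁻² are common molecular-mechanics conventions assumed here for illustration; the function names are hypothetical.

```python
COULOMB_KCAL = 332.06  # assumed conversion factor: kcal mol^-1 Å e^-2


def coulomb_energy(q1: float, q2: float, r: float, dielectric: float = 1.0) -> float:
    """Coulomb energy (kcal/mol) of two point charges (in e) separated by r (in Å)."""
    return COULOMB_KCAL * q1 * q2 / (dielectric * r)


def coulomb_distance_dependent(q1: float, q2: float, r: float,
                               slope: float = 4.0) -> float:
    """Same interaction with eps(r) = slope * r, so the energy decays as 1/r**2."""
    return coulomb_energy(q1, q2, r, dielectric=slope * r)
```

Because the result depends only on the charges and their separation, two charges at a given distance receive the same energy whether they are buried in the protein or exposed to water, and no desolvation cost is assigned to polar groups. These are the deficiencies that the PBSA and GBSA models are meant to address.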

Several studies also address the formation of water binding sites during docking (41, 100, 123, and references therein), but the methods tried so far do not seem to markedly improve accuracy. In addition, it is not yet clear how to adjust the score of a ligand if it is predicted to bind with a water molecule.

Accounting for conformational changes of the receptor. The first docking algorithms typically treated the protein as rigid, although binding sites in reality adjust to accommodate different ligands. This simplification can be problematic, because even a small steric overlap can make a good ligand look like a misfit if it is docked into a single conformation of the receptor that was solved with a different ligand bound. However, because the receptor possesses many internal degrees of freedom, treating it as flexible can vastly expand the problem of discovering the most stable conformation of the protein-ligand complex. Moreover, as the number of candidate energy minima rises, the energy model must become increasingly accurate in order to identify the global minimum correctly. A wide variety of approaches have been devised to address this problem, as previously reviewed (18). Methods range from using a single conformation of the receptor, but softening it to approximate the consequences of flexibility (60); to docking into a single representation of the binding site obtained by averaging across multiple receptor structures (16, 74, 93); to docking into multiple discrete receptor conformations (1, 20, 39, 130, 131); to reoptimizing the conformation of the binding site during the docking calculation itself (65, 70, 108, 133, 134). The relative merits of the various approaches are not yet clear, though one study provides evidence that soft docking yields worse enrichment of known ligands than does docking into multiple structures of the receptor (39).

Configurational entropy. Even if a docking algorithm always identifies the global energy minimum of a protein-ligand complex with a perfectly accurate potential and solvation energy function, it will not yield an accurate binding free energy. One reason is that the binding free energy includes a loss in configurational entropy due to the decrease in freedom of the ligand and the protein upon binding. As noted above, the loss of configurational entropy is nearly as large as the gain in favorable binding energy and correlates roughly with the energy change. Therefore, it is expected that a model that accounts for changes in potential and solvation energy, but neglects the loss of configurational entropy, will tend to overestimate binding affinities. In fact, the energy terms in a physics-based scoring function had to be scaled down roughly 10-fold to bring the results into range with experiment (89), as previously noted (26); a similar result is observed in calculations with the MM-PBSA method when configurational entropy is neglected (17).

A number of scoring methods use the rigid-rotor/harmonic-oscillator formalism to separate the ligand’s loss of entropy into a rotational and translational part that results from trapping in the binding site and another part associated with the ligand’s loss of flexibility. It is generally assumed that any entropy change associated with the hard degrees of freedom (bond stretches and angle bends) is negligible. This is reasonable because bond lengths and bond angles are unlikely to be perturbed much by the noncovalent forces that drive protein-ligand binding. It is also widely assumed that changes in rotational and translational entropy are either negligible or constant across all binding reactions, but this view is not well founded. In fact, the translational and rotational motions of a bound ligand depend on exactly how tightly it is held (23, 26), and this can vary not only from one complex to another, but also among alternative bound conformations of a single complex (26). This topic poses nontrivial theoretical issues and is now the subject of increasing computational scrutiny (2, 33, 78, 85, 117, 138). For example, a recent study reports improved docking accuracy with a novel method of estimating the entropy of a bound ligand from the ranges of rotational, translational, and torsional coordinates associated with multiple docked conformations (107).

Changes in entropy due to the decreased flexibility of the ligand upon binding are commonly regarded as resulting from a reduction in the number of energetically accessible ligand rotamers upon binding and are often accounted for with empirical terms proportional to the number of rotatable bonds of the ligand affected by binding (9, 36, 44, 101, 124). The corresponding free energy penalties are typically on the order of 0.4 kcal mol⁻¹ torsion⁻¹ but run at least to 1 kcal mol⁻¹ torsion⁻¹ (99). However, a simple argument suggests that such estimates markedly overestimate the loss of entropy due to the drop in accessible rotamers. For a ligand with 10 rotatable bonds, the torsional penalty would be 4 kcal mol⁻¹, corresponding to a drop of $e^{4/RT} \approx 800$-fold in the number of equally stable ligand rotamers upon binding (with $RT \approx 0.6$ kcal mol⁻¹ at room temperature). This implies a minimum of 800 equally stable, rotamerically distinct ligand conformations in the free state. This number seems improbably large. It is not clear whether the literature has directly addressed this issue, but programs designed to generate a large number of diverse ligand conformations typically yield far fewer conformations (13, 72). Moreover, of these conformations, only a few are likely to be within 1 kcal mol⁻¹ of the global energy minimum. Therefore, it is unlikely that rotamer counting accounts for much of a ligand’s entropy loss upon binding. Recent studies of model systems do reveal substantial losses in configurational entropy upon binding, but they attribute these losses to the fact that the energy wells of a bound complex are narrower than those of the free species, rather than to a reduction in the number of stable rotamers (22, 26). Overall, it appears that a deeper understanding of changes in configurational entropy upon binding could lead to improved accuracy in docking and scoring.
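The rotamer-counting arithmetic in the preceding paragraph can be checked with a short calculation. The sketch below assumes that the entire torsional penalty is entropic and arises from reducing N equally stable rotamers to one, so that the penalty equals RT ln N; the function name and the default temperature of 298.15 K are illustrative choices.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def implied_rotamer_count(penalty_per_torsion: float, n_torsions: int,
                          temperature: float = 298.15) -> float:
    """Number of equally stable free-state rotamers implied by a torsional penalty.

    Solves n_torsions * penalty_per_torsion = RT ln N for N.
    """
    total_penalty = penalty_per_torsion * n_torsions
    return math.exp(total_penalty / (R_KCAL * temperature))
```

With 0.4 kcal mol⁻¹ per torsion and 10 torsions this gives a value on the order of 800 to 900, in line with the estimate quoted above, and the count grows exponentially with the per-torsion penalty.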

Free Energy Methods

Free energy methods use conformational sampling to generate thermodynamic averages, in contrast with docking methods, which focus on a single bound conformation. The use of conformational averaging is advantageous because it removes sensitivity to the details of the single representative conformation relied on by docking methods. On the other hand, free energy methods need more computer time in order to generate unambiguous, converged results. Also, the conformational searches tend not to be wide ranging and thus may need to be seeded with several plausible docked conformations. Two general approaches are considered here: end-point methods, which generate conformations of only the free and bound species and compute the binding free energy by taking a difference, and pathway methods, which compute the sum of small changes along a multistep pathway connecting the initial and final states.

End-point methods. The linear interaction energy (LIE) method (3) involves running two MD simulations with an empirical force field: one for the ligand in solution, and the other for the ligand in the protein binding site. Snapshots saved from the simulations represent Boltzmann ensembles of conformations and are used to compute the Boltzmann-averaged electrostatic ($U_{\mathrm{elec}}$) and van der Waals ($U_{\mathrm{vdw}}$) interaction energies of the ligand with its environment in the bound and free states. The binding free energy is then estimated as

$$\Delta G^{\circ} \approx \beta\left(\langle U_{\mathrm{elec}}\rangle_{\mathrm{bound}} - \langle U_{\mathrm{elec}}\rangle_{\mathrm{free}}\right) + \alpha\left(\langle U_{\mathrm{vdw}}\rangle_{\mathrm{bound}} - \langle U_{\mathrm{vdw}}\rangle_{\mathrm{free}}\right), \qquad (8)$$

where the angle brackets indicate Boltzmann averages. Changes in the internal energy of the solvent and the protein are considered to be accounted for by the factors α and β, on the basis of their reorganization energy in response to the ligand (3). As recently summarized (19), good correlations between calculation and experiment have been reported in a number of publications, both with the standard coefficients α = 0.18 and β = 0.33 (19) and with system-specific values of these coefficients (126). An additional surface-area-dependent term has also been incorporated (61), and more recent variants speed the MD calculations by using implicit treatments of the solvent (19, 135). In general, it does not seem that a single set of coefficients can be applied to all systems; therefore some experimental binding data for the system of interest may be needed before this method can be used effectively.
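As a minimal sketch of how Equation 8 is evaluated in practice, the function below takes per-snapshot ligand-environment interaction energies from the bound and free simulations and combines their averages with the standard coefficients quoted above. The function name and argument layout are assumptions made for illustration, and plain arithmetic means are used on the premise that the snapshots already form Boltzmann ensembles.

```python
from statistics import fmean
from typing import Sequence


def lie_binding_free_energy(elec_bound: Sequence[float], elec_free: Sequence[float],
                            vdw_bound: Sequence[float], vdw_free: Sequence[float],
                            alpha: float = 0.18, beta: float = 0.33) -> float:
    """Equation 8: beta * d<U_elec> + alpha * d<U_vdw>, in the units of the inputs."""
    d_elec = fmean(elec_bound) - fmean(elec_free)  # change in mean electrostatic energy
    d_vdw = fmean(vdw_bound) - fmean(vdw_free)     # change in mean van der Waals energy
    return beta * d_elec + alpha * d_vdw
```

System-specific coefficients can be passed through `alpha` and `beta` when they have been fitted to experimental data for the target of interest.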

Nonetheless, the quality of the results is somewhat surprising, because LIE does not account explicitly for standard concentration or for changes in the configurational entropy or the internal energy of the ligand. The method may be successful in part because it is generally used to compare ligands within a single chemical series. Entropy-energy compensation (23, 26), and cancellation of the neglected drop in solute entropy by the neglected rise in the solvation entropy, may also help.

A second end-point approach, the MM-PBSA method, along with its GB variant (MM-GBSA), uses MD simulations of the free ligand, free protein, and their complex as a basis for calculating the average potential and solvation energies in Equation 7 (46, 114). The MD runs typically use an empirical force field and an explicit solvent model. The resulting snapshots i are postprocessed by stripping them of their explicit solvent molecules and computing their potential energies $U_i$ with the empirical force field and their solvation energies $W_i$ with either the PBSA or GBSA implicit solvent model. Averaging over each trajectory allows the changes in mean potential and solvation energy to be calculated, as per Equation 7. The change in configurational entropy is then estimated by energy-minimizing a few snapshots of the free and bound molecules, computing their entropy with the rigid-rotor/harmonic-oscillator approximation, and obtaining the average entropy change over these snapshots. This approach neglects any contribution to the entropy change that may result from a change in the number of thermodynamically accessible energy minima. The quasiharmonic approximation (67) also has been used to estimate the change in configurational entropy from the MD simulations (57, 79), despite concerns regarding its accuracy (21). It is not always clear from the literature whether or how the standard concentration has been incorporated into MM-PBSA or MM-GBSA calculations, but a correct approach is available (equation 3 in Reference 96); therefore the MM-PBSA method allows a clear connection with the theory of binding.
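The bookkeeping of a three-trajectory MM-PBSA estimate can be sketched as follows. Equation 7 is not reproduced in this part of the text, so the combination used here, the change in mean potential plus solvation energy together with a separately supplied −TΔS term, is an assumption about its general form; the function names are hypothetical and no standard-concentration correction is included.

```python
from statistics import fmean
from typing import Sequence, Tuple

Trajectory = Tuple[Sequence[float], Sequence[float]]  # (U_i values, W_i values)


def mean_energy(potential: Sequence[float], solvation: Sequence[float]) -> float:
    """Trajectory average of U_i + W_i over postprocessed snapshots i."""
    return fmean([u + w for u, w in zip(potential, solvation)])


def mmpbsa_estimate(complex_uw: Trajectory, protein_uw: Trajectory,
                    ligand_uw: Trajectory, minus_t_delta_s: float) -> float:
    """Change in <U + W> on binding plus the entropic term -T*dS (same energy units)."""
    delta_energy = (mean_energy(*complex_uw)
                    - mean_energy(*protein_uw)
                    - mean_energy(*ligand_uw))
    return delta_energy + minus_t_delta_s
```

Because the energy difference is taken between large, fluctuating totals from three separate trajectories, small relative errors in each average can translate into large errors in the final estimate.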

Although early applications of MM-PBSA appeared promising, it apparently is difficult to converge the energy averages reliably. One reason for this difficulty may be that they encompass energy fluctuations not only of the ligand and the binding site, but also of parts of the protein remote from the binding site, which are less relevant to binding. Convergence has been accelerated by a single-trajectory approach (17, 79), in which only one MD simulation of the protein-ligand complex is carried out. Conformations of the nominally free ligand are then derived simply by deleting the protein from the resulting snapshots, and likewise for the “free” protein. In this single-trajectory approach, only ligand-protein distances contribute to the computed change in potential energy, because all other interatomic distances stay exactly the same. The single-trajectory approach resembles LIE because it similarly neglects changes in internal energy and configurational entropy of both ligand and protein. However, it differs from LIE in that it assumes the same conformational distribution for the free and bound ligand, whereas these change in LIE.
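The statement that only ligand-protein distances contribute in the single-trajectory approach can be verified on a toy model. The sketch below assumes a pairwise-additive potential (a reduced Lennard-Jones form chosen purely for illustration) and shows that the energy of the complex minus the energies of the "free" species cut from the same snapshot equals the protein-ligand cross term.

```python
import itertools
import math
from typing import List, Tuple

Atom = Tuple[float, float, float]


def pair_energy(r: float) -> float:
    """Toy pairwise potential in reduced units (illustrative only)."""
    return r ** -12 - r ** -6


def total_energy(atoms: List[Atom]) -> float:
    """Sum of the pair potential over all distinct atom pairs."""
    return sum(pair_energy(math.dist(a, b))
               for a, b in itertools.combinations(atoms, 2))


def interaction_energy(protein: List[Atom], ligand: List[Atom]) -> float:
    """Sum of the pair potential over protein-ligand pairs only."""
    return sum(pair_energy(math.dist(a, b)) for a in protein for b in ligand)


def single_trajectory_delta_u(protein: List[Atom], ligand: List[Atom]) -> float:
    """U(complex) - U(protein) - U(ligand), all taken from one complex snapshot."""
    return total_energy(protein + ligand) - total_energy(protein) - total_energy(ligand)
```

All intramolecular pair terms appear once with a plus sign and once with a minus sign, so they cancel exactly and their fluctuations no longer add noise to the computed energy change.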

Free energy pathway methods. The change in free energy between two states of a system, such as before and after binding, can be written formally as $-RT \ln \langle e^{-\Delta U/RT} \rangle$, where $\Delta U$ is the change in the energy function between the initial and final states, and the angle brackets indicate a Boltzmann average taken in the initial state (137). For binding, $\Delta U$ would reflect the introduction of interactions between the protein and ligand, and the average could, in principle, be obtained by MD or MC simulations. In reality, such a simulation is extraordinarily difficult to converge unless the initial and final states are similar. The free energy perturbation method solves this problem by breaking the change into N small steps $\delta U$ (perturbations) and running a separate simulation for each resulting energy function $U_i$ to obtain the stepwise free energy changes $-RT \ln \langle e^{-\delta U/RT} \rangle_i$ associated with each step, where $\langle\,\rangle_i$ indicates a Boltzmann average with the energy function $U_i$. Thus, the initial and final states are linked by a pathway of small steps for which the small free energy differences can be computed. Another pathway approach, thermodynamic integration (TI) (73, 116), uses MD or MC to compute the first derivative of the free energy with respect to the distance along the path and then estimates the total change in free energy via a numerical integral of the derivative along the path. Because the first derivative of the free energy is effectively a force, TI effectively involves a work integral. Either free energy perturbation or TI can be applied to a given free energy calculation, and both methods are widely used.
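Both pathway estimators can be illustrated on a one-dimensional harmonic well whose spring constant is changed from k0 to k1, a toy system chosen here because its free energy change, (RT/2) ln(k1/k0), is known exactly. Everything in this sketch is an assumption made for illustration: the linear path in the spring constant, the direct Gaussian sampling that stands in for MD or MC, the value of RT, and the function names.

```python
import math
import random

RT = 0.593  # kcal/mol, approximately room temperature


def spring_constant(lam: float, k0: float, k1: float) -> float:
    """Linear path between the initial (lam = 0) and final (lam = 1) energy functions."""
    return k0 + lam * (k1 - k0)


def sample_positions(k: float, n: int, rng: random.Random) -> list:
    """Boltzmann samples for U = k x^2 / 2 (stand-in for an MD or MC simulation)."""
    sigma = math.sqrt(RT / k)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def fep(k0: float, k1: float, n_windows: int = 10,
        n_samples: int = 20000, seed: int = 0) -> float:
    """Free energy perturbation: sum of -RT ln <exp(-dU/RT)>_i over small steps."""
    rng = random.Random(seed)
    total = 0.0
    for i in range(n_windows):
        k_i = spring_constant(i / n_windows, k0, k1)
        k_next = spring_constant((i + 1) / n_windows, k0, k1)
        xs = sample_positions(k_i, n_samples, rng)
        boltz = [math.exp(-0.5 * (k_next - k_i) * x * x / RT) for x in xs]
        total += -RT * math.log(sum(boltz) / n_samples)
    return total


def ti(k0: float, k1: float, n_points: int = 11,
       n_samples: int = 20000, seed: int = 0) -> float:
    """Thermodynamic integration: trapezoid rule over <dU/dlam> along the path."""
    rng = random.Random(seed)
    lams = [i / (n_points - 1) for i in range(n_points)]
    derivs = []
    for lam in lams:
        xs = sample_positions(spring_constant(lam, k0, k1), n_samples, rng)
        derivs.append(sum(0.5 * (k1 - k0) * x * x for x in xs) / n_samples)
    return sum(0.5 * (derivs[i] + derivs[i + 1]) * (lams[i + 1] - lams[i])
               for i in range(n_points - 1))


def exact(k0: float, k1: float) -> float:
    """Analytic free energy change for the harmonic well."""
    return 0.5 * RT * math.log(k1 / k0)
```

For k0 = 1 and k1 = 4 both estimators should land close to the analytic value; in a real protein-ligand calculation the sampling step is the expensive part, and no exact answer is available for comparison.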

The first applications of such methods to binding did not seek to compute the standard free energy of binding, but rather to compute the difference between the binding free energies of two similar ligands, $\Delta\Delta G^{\circ}$, by a method termed computational alchemy (119). This involves using pathways to compute the change in free energy when ligand A is changed to ligand B within the binding site, as well as in solution. A free energy cycle shows that the difference between these two free energy changes equals the difference between the binding free energies of the two ligands. The calculations typically employ lengthy MD or MC simulations (hours to many days, depending on the calculation) with an empirical force field and an explicit treatment of solvent. Relative to computing full-fledged standard free energies of binding, in which a ligand must be removed from the binding site, computational alchemy has the merit of focusing on a relatively small perturbation, in which one ligand is changed to another similar one, and thus of converging relatively rapidly. Computational alchemy also avoids the problem of accounting for standard concentration, because it can be shown from Equation 4 that $C^{\circ}$ cancels in $\Delta\Delta G^{\circ}$. The method is theoretically sound and is supported by a literature reporting relatively accurate results, often within 1 to 2 kcal mol⁻¹ of experimental data. Still, computational alchemy has not yet been subjected to extensive validation studies like those used to evaluate docking and scoring methods.
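The free energy cycle behind computational alchemy can be written out explicitly. In the sketch below, the superscripts "site" and "soln" are labels introduced here for the alchemical changes of A into B in the binding site and in solution; the first line states that the two routes from the unbound protein and ligand A to the complex with ligand B must give the same total free energy change, and the numerical values in the last line are invented for illustration.

```latex
\begin{align*}
\Delta G^{\circ}_{A} + \Delta G^{\mathrm{site}}_{A \to B}
  &= \Delta G^{\mathrm{soln}}_{A \to B} + \Delta G^{\circ}_{B} \\
\Delta\Delta G^{\circ} \equiv \Delta G^{\circ}_{B} - \Delta G^{\circ}_{A}
  &= \Delta G^{\mathrm{site}}_{A \to B} - \Delta G^{\mathrm{soln}}_{A \to B} \\
\text{e.g.}\quad \Delta\Delta G^{\circ}
  &= (-3.1) - (-1.6) = -1.5\ \text{kcal mol}^{-1}
\end{align*}
```

With $RT \approx 0.593$ kcal mol⁻¹, a value of −1.5 kcal mol⁻¹ would correspond to ligand B binding roughly 12-fold more tightly than ligand A.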

The use of computational alchemy has been limited in part by the amount of computer time it requires, especially when ligand A and ligand B are markedly different. However, advances in computer technology and simulation algorithms allow practical application of this method to drug discovery (4, 27, 49, 104). Also, a clever extension of computational alchemy by TI yields the relative affinities of multiple similar ligands from a single simulation run, at least for rigid ligands (92). At least two studies suggest that the method is more accurate than MM-PBSA (46, 94), as noted above. Its convergence properties also may be better than those of the original implementation of MM-PBSA because, unlike MM-PBSA, the work terms it computes are not directly influenced by fluctuations far from the binding site.

Pathway methods can also be used to compute the standard binding free energy of a protein and ligand. The Double Decoupling approach (42), which was in effect first used to compute the affinity of xenon for myoglobin (54), draws on related approaches (53, 106), notably the Double Annihilation method (63). It involves using a pathway technique to compute the work of gradually decoupling the ligand from the binding site and then effectively coupling it with an energy well or trap of defined size in bulk solution. The work of allowing the ligand to escape from the trap into a volume equal to $1/C^{\circ}$ can be computed analytically, so that the simulation results can be connected with the appropriate standard state. This approach has been further developed and applied to several protein-ligand systems (11, 35). A computationally massive application of the similar but nonrigorous (42) Double Annihilation method to the association of eight ligands with the protein FKBP is also of considerable interest (40).
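The analytic step that connects the trap to the standard state can be sketched for the simplest case. The code assumes an ideal solute confined to a hard-walled region of known volume, treats only translational freedom (any orientational restraint would need its own term), and takes 1/C° for a 1 mol/L standard concentration as about 1660.54 Å³ per molecule; the function name is hypothetical.

```python
import math

RT = 0.593                     # kcal/mol, approximately room temperature
STANDARD_VOLUME_A3 = 1660.54   # 1/C° per molecule at 1 mol/L, in Å^3


def release_free_energy(trap_volume_a3: float) -> float:
    """Free energy change (kcal/mol) when the ligand escapes from the trap into 1/C°.

    Ideal-solute, translation-only estimate: -RT ln(V_standard / V_trap).
    """
    return -RT * math.log(STANDARD_VOLUME_A3 / trap_volume_a3)
```

The result is negative whenever the trap is smaller than the standard volume, reflecting the translational entropy gained on release, and it changes by only RT ln 2 when the trap volume is doubled.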

It has been argued that Double Decoupling is difficult to converge for highly charged ligands because the final result must accurately balance strong interactions between the ligand and the solvent against similarly strong interactions between the ligand and the protein (57). Alternative approaches compute the free energy of binding via a pathway in which the ligand is gradually extracted from the binding site. Preliminary results from techniques of this type appear promising (57, 79), though their applicability to ligands that bind in a closed cavity within the protein is not yet defined.

Because the pathway methods are so promising and yet so difficult to run to convergence, there is great interest in accelerating them or developing useful shortcuts, as recently reviewed (105). Approaches include improved methods of extracting free energies from the simulations (109, 132), novel sampling methods to speed convergence (82, 103), and hybrid implicit/explicit solvent models that reduce the computational load (5, 81).

PROSPECTS AND CHALLENGES

Accurate calculation of protein-ligand binding affinities is an important and still unsolved problem. Nonetheless, significant progress is being made through a deepening understanding of the physical chemistry and the implementation of tractable algorithms based on this understanding. Advances in computer hardware also continue to extend the level of detail that can be modeled. There is room for improvement in many aspects of existing methods, including models of the potential energy and of solvent, treatment of protonation and tautomer equilibria, and algorithms and heuristics for conformational sampling. Indeed, one should not expect marked improvement in affinity calculations from progress against any single part of the problem because the remaining parts still generate large errors. Rather, success likely requires a concerted effort on all fronts. Understanding the nature of the approximations used is essential in this process, so that successes or failures will be interpretable and a path of improvement can be discerned. Otherwise, model development risks devolving to trial and error.

Solvent entropy: entropy associated with motions of the solvent molecules

Comparison with experimental data remains indispensable. Validation studies are easier for computationally faster approaches, such as docking and scoring, but are also needed for the more time-consuming methods. Simple model systems can play an important role in validation. For example, artificial protein cavities that bind small, rigid ligands allow a focus on intermolecular interactions rather than on issues of ligand flexibility (128, 129). Similarly, studies of host-guest model systems (23, 26) have provided insights into configurational entropy that bear on the development of improved scoring functions. Experimental data also are necessary as a basis for improving specific components of binding models, and expansion of certain data sets would be welcome and valuable. For example, solvation models cannot be applied confidently to many candidate ligands because the existing database of solvation free energies does not include many of the chemical moieties that frequently appear in drug-like compounds.

Finally, today’s understanding of the physical chemistry of molecular recognition may prove to be incomplete. Specific issues likely to be of continuing interest include changes in ligand, protein, and solvent entropy upon binding; the puzzling and apparently erroneous tendency of many binding models to yield affinities that correlate strongly with the molecular weight of the ligand; electronic polarization; the modeling of metal-containing binding sites; and the thermodynamic implications of water binding sites at the ligand-protein interface.

SUMMARY POINTS

1. Recent evaluations show that docking methods have considerable success in correctly identifying bound conformations and in enriching known ligands against a background of decoy compounds, but yield less impressive results for ranking known ligands according to affinity.

2. Important advances are being made in calculating the potential energy as a function of conformation, through more sophisticated force fields and the incorporation of quantum mechanical treatments in ligand-protein modeling.

3. Recent calculations suggest that changes in configurational entropy on binding are large, cancel much of the energy change that drives binding, and must be accounted for to obtain good correlations with measured affinities.

4. Free energy pathway calculations with explicit solvent appear to yield good agreement with experiment, and it may be time to undertake more comprehensive validation studies.

5. Achieving accurate affinity calculations will likely require advances on multiple fronts: potential energy functions, solvation models, and the treatment of conformational flexibility.

DISCLOSURE STATEMENT

MKG is a member of the company VeraChem LLC.

ACKNOWLEDGMENTS

The authors thank Dr. Hillary Gilson for valuable comments on the manuscript. This publication was made possible by grant no. GM061300 to MKG and grant no. GM058187 to HXZ, from the National Institute of General Medical Sciences of the National Institutes of Health. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Institute of General Medical Sciences.

LITERATURE CITED

1. Alberts IL, Todorov NP, Dean PM. 2005. Receptor flexibility in de novo ligand design and docking. J. Med. Chem. 48:6585–96

2. Alsallaq R, Zhou HX. 2007. Energy landscape and transition state of protein-protein association. Biophys. J. 92:1486–502

3. Aqvist J, Medina C, Samuelsson JE. 1994. A new method for predicting binding affinity in computer-aided drug design. Protein Eng. 7:385–91

4. Archontis G, Watson KA, Xie Q, Andreou G, Chrysina ED, et al. 2005. Glycogen phosphorylase inhibitors: a free energy perturbation analysis of glucopyranose spirohydantoin analogues. Protein Struct. Funct. Bioinform. 61:984–98

5. Banavali NK, Im W, Roux B. 2002. Electrostatic free energy calculations using the generalized solvent boundary potential method. J. Chem. Phys. 117:7381–88

6. Berendsen HJC, Grigera JR, Straatsma TP. 1987. The missing term in effective pair potentials. J. Phys. Chem. 91:6269–71

7. Betzi S, Suhre K, Chetrit B, Guerlesquin F, Morelli X. 2006. GFscore: a general nonlinear consensus scoring function for high-throughput docking. J. Chem. Inf. Model. 46:1704–12

8. Bixon M, Lifson S. 1967. Potential functions and conformations in cycloalkanes. Tetrahedron 23:769–84

9. Bohm HJ. 1994. The development of a simple empirical scoring function to estimate the binding constant for a protein-ligand complex of known three-dimensional structure. J. Comput. Aided Mol. Des. 8:243–56

10. Bordner AJ, Cavasotto CN, Abagyan RA. 2002. Accurate transferable model for water, n-octanol, and n-hexadecane solvation free energies. J. Phys. Chem. B 106:11009–15

11. Boresch S, Tettinger F, Leitgeb M, Karplus M. 2003. Absolute binding free energies: a quantitative approach for their calculation. J. Phys. Chem. B 107:9535–51

12. Born M. 1920. Volumes and heats of hydration of ions. Z. Phys. 1:45–48

13. Bostrom J. 2001. Reproducing the conformations of protein-bound ligands: a critical evaluation of several popular conformational searching tools. J. Comput. Aided Mol. Des. 15:1137–52

14. Brooijmans N, Kuntz ID. 2003. Molecular recognition and docking algorithms. Annu. Rev. Biophys. Biomol. Struct. 32:335–73

15. Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. 1983. CHARMM: a program for macromolecular energy, minimization and dynamics calculations. J. Comput. Chem. 4:187–217

16. Broughton HB. 2000. A method for including protein flexibility in protein-ligand docking: improving tools for database mining and virtual screening. J. Mol. Graph. Model. 18:247–57

17. Brown SP, Muchmore SW. 2006. High-throughput calculation of protein-ligand bind-ing affinities: modification and adaptation of the MM-PBSA protocol to enterprise gridcomputing. J. Chem. Inf. Model. 46:999–1005

18. Carlson HA, McCammon JA. 2000. Accommodating protein flexibility in computationaldrug design. Mol. Pharmacol. 57:213–18

19. Carlsson J, Ander M, Nervall M, Aqvist J. 2006. Continuum solvation models in thelinear interaction energy method. J. Phys. Chem. B 110:12034–41

20. Cavasotto CN, Kovacs JA, Abagyan RA. 2005. Representing receptor flexibility in liganddocking through relevant normal modes. J. Am. Chem. Soc. 127:9632–40

21. Chang C, Chang W, Gilson MK. 2005. Evaluating the accuracy of the quasiharmonicapproximation. J. Chem. Theory Comput. 1:1017–28

22. Chang C, Chen W, Gilson MK. 2007. Ligand configurational entropy and protein bind-ing. Proc. Natl. Acad. Sci. USA. In press

23. Chang CE, Gilson MK. 2004. Free energy, entropy, and induced fit in host-guest recog-nition: calculations with the second-generation mining minima algorithm. J. Am. Chem.Soc. 126:13156–64

24. Charifson PS, Corkery JJ, Murcko MA, Walters WP. 1999. Consensus scoring: a methodfor obtaining improved hit rates from docking databases of three-dimensional structuresinto proteins. J. Med. Chem. 42:5100–9

25. Chen H, Lyne PD, Giordanetto F, Lovell T, Li J. 2005. On evaluating molecular-dockingmethods for pose prediction and enrichment factors. J. Chem. Inf. Model. 46:401–15

26. Computational study of host-guest systems, with insights regarding configurational entropy that are applicable to protein-ligand binding.

26. Chen W, Chang C, Gilson MK. 2004. Calculation of cyclodextrin binding affinities: energy, entropy, and implications for drug design. Biophys. J. 87:3035–49

27. Chipot C, Rozanska X, Dixit SB. 2005. Can free energy calculations be fast and accurate at the same time? Binding of low-affinity, nonpeptide inhibitors to the SH2 domain of the Src protein. J. Comput. Aided Mol. Des. 19:765–70

28. Evidence that redistribution of ligand charge is an important determinant of conformation.

28. Cho AE, Guallar V, Berne BJ, Friesner R. 2005. Importance of accurate charges in molecular docking: quantum mechanical/molecular mechanical (QM/MM) approach. J. Comput. Chem. 26:915–31

29. Cole JC, Murray CW, Nissink JWM, Taylor RD, Taylor R. 2005. Comparing protein-ligand docking programs is difficult. Proteins Struct. Funct. Bioinform. 60:325–32

30. Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM, et al. 1995. A second generation force-field for the simulation of proteins, nucleic-acids, and organic-molecules. J. Am. Chem. Soc. 117:5179–97

31. Curioni A, Mordasini T, Andreoni W. 2004. Enhancing the accuracy of virtual screening: molecular dynamics with quantum-refined force fields. J. Comput. Aided Mol. Des. 18:773–84

32. DesJarlais RL, Sheridan RP, Dixon JS, Kuntz ID, Venkataraghavan R. 1986. Docking flexible ligands to macromolecular receptors by molecular shape. J. Med. Chem. 29:2149–53

33. Dixit SB, Chipot C. 2001. Can absolute free energies of association be estimated from molecular mechanical simulations? The biotin-streptavidin system revisited. J. Phys. Chem. A 105:9795–99

34. Dominy BN, Brooks CL III. 1999. Development of a generalized Born model parametrization for proteins and nucleic acids. J. Phys. Chem. B 103:3765–73

35. Donnini S, Juffer AH. 2004. Calculation of affinities of peptides for proteins. J. Comput. Chem. 25:393–411


36. Eldridge MD, Murray CW, Auton TR, Paolini GV, Mee RP. 1997. Empirical scoring functions: I. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes. J. Comput. Aided Mol. Des. 11:425–45

37. Ferrara P, Curioni A, Vangrevelinghe E, Meyer T, Mordasini T, et al. 2006. New scoring functions for virtual screening from molecular dynamics simulations with a quantum-refined force-field (QRFF-MD). Application to cyclin-dependent kinase 2. J. Chem. Inf. Model. 46:254–63

38. Ferrara P, Gohlke H, Price DJ, Klebe G, Brooks CL III. 2004. Assessing scoring functions for protein-ligand interactions. J. Med. Chem. 47:3032–47

39. Ferrari AM, Wei BQ, Costantino L, Shoichet BK. 2004. Soft docking and multiple receptor conformations in virtual screening. J. Med. Chem. 47:5076–84

40. Fujitani H, Tanida Y, Ito M, Jayachandran G, Snow CD, et al. 2005. Direct calculation of the binding free energies of FKBP ligands. J. Chem. Phys. 123:084108

41. Garcia-Sosa AT, Mancera RL, Dean PM. 2003. WaterScore: a novel method for distinguishing between bound and displaceable water molecules in the crystal. J. Mol. Model. 9:172–82

42. Gilson MK, Given JA, Bush BL, McCammon JA. 1997. The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophys. J. 72:1047–69

43. Gilson MK, Honig B. 1991. The inclusion of electrostatic hydration energies in molecular mechanics calculations. J. Comput. Aided Mol. Des. 5:5–20

44. Giordanetto F, Cotesta S, Catana C, Trosset JY, Vulpetti A, et al. 2004. Novel scoring functions comprising QXP, SASA, and protein side-chain entropy terms. J. Chem. Inf. Comput. Sci. 44:882–93

45. Goodsell DS, Olson AJ. 1990. Automated docking of substrates to proteins by simulated annealing. Proteins Struct. Funct. Genet. 8:195–202

46. Gouda H, Kuntz ID, Case DA, Kollman PA. 2003. Free energy calculations for theophylline binding to an RNA aptamer: comparison of MM-PBSA and thermodynamic integration methods. Biopolymers 68:16–34

47. Grater F, Schwarzl SM, Dejaegere A, Fischer S, Smith JC. 2005. Protein/ligand binding free energies calculated with quantum mechanics/molecular mechanics. J. Phys. Chem. B 109:10474–83

48. Gresh N. 2006. Development, validation, and applications of anisotropic polarizable molecular mechanics to study ligand and drug-receptor interactions. Curr. Pharmacol. Des. 12:2121–58

49. Guimaraes CRW, Boger DL, Jorgensen WL. 2005. Elucidation of fatty acid amide hydrolase inhibition by potent alpha-ketoheterocycle derivatives from Monte Carlo simulations. J. Am. Chem. Soc. 127:17377–84

50. Hamelberg D, McCammon JA. 2004. Standard free energy of releasing a localized water molecule from the binding pockets of proteins: double-decoupling method. J. Am. Chem. Soc. 126:7683–89

51. Hawkins GD, Cramer CJ, Truhlar DG. 1995. Pairwise solute descreening of solute charges from a dielectric medium. Chem. Phys. Lett. 246:122–29

52. Head MS, Given JA, Gilson MK. 1997. “Mining minima”: direct computation of conformational free energy. J. Phys. Chem. 101:1609–18

53. Helms V, Wade R. 1995. Thermodynamics of water mediating protein-ligand interactions in cytochrome P450cam: a molecular dynamics study. Biophys. J. 69:810–24

54. Hermans J, Shankar S. 1986. The free energy of xenon binding to myoglobin from molecular dynamics simulation. Isr. J. Chem. 27:225–27


55. Honig B, Sharp K, Yang AS. 1993. Macroscopic models of aqueous solutions: biological and chemical applications. J. Phys. Chem. 97:1101–9

56. Use of a physics-based scoring function in large-scale virtual screening.

56. Huang N, Kalyanaraman C, Irwin JJ, Jacobson MP. 2006. Physics-based scoring of protein-ligand complexes: enrichment in large-scale virtual screening. J. Chem. Inf. Model. 46:243–53

57. Calculation of binding affinity with a pathway method that involves pulling the ligand from the binding site.

57. Woo HJ, Roux B. 2005. Calculation of absolute protein-ligand binding free energy from computer simulations. Proc. Natl. Acad. Sci. USA 102:6825–30

58. Im W, Lee MS, Brooks CL III. 2003. Generalized Born model with a simple smoothing function. J. Comput. Chem. 24:1691–702

59. Jayaram B, Liu Y, Beveridge DL. 1998. A modification of the generalized Born theory for improved estimates of solvation energies and pK shifts. J. Chem. Phys. 109:1465–71

60. Jiang F, Kim SH. 1991. “Soft docking”: matching of molecular surface cubes. J. Mol. Biol. 219:79–102

61. Jones-Hertzog DK, Jorgensen WL. 1997. Binding affinities for sulfonamide inhibitors with human thrombin using Monte Carlo simulations with a linear response method. J. Med. Chem. 40:1539–49

62. Jorgensen WL. 2004. The many roles of computation in drug discovery. Science 303:1813–18

63. Jorgensen WL, Buckner JK, Boudon S, Tirado-Rives J. 1988. Efficient computation of absolute free energies of binding by computer simulations. Application to methane dimer in water. J. Chem. Phys. 89:3742–46

64. Jorgensen WL, Jenson C. 1998. Temperature dependence of TIP3P, SPC, and TIP4P water from NPT Monte Carlo simulations: seeking temperatures of maximum density. J. Comput. Chem. 19:1179–86

65. Kairys V, Gilson MK. 2002. Enhanced docking with the mining minima optimizer: acceleration and side-chain flexibility. J. Comput. Chem. 23:1656–70

66. Kaminski GA, Stern HA, Berne BJ, Friesner RA. 2004. Development of an accurate and robust polarizable molecular mechanics force field from ab initio quantum chemistry. J. Phys. Chem. A 108:621–27

67. Karplus M, Kushick J. 1981. Method for estimating the configurational entropy of macromolecules. Macromolecules 14:325–32

68. Kauzmann W. 1959. Some factors in the interpretation of protein denaturation. Adv. Protein Chem. 14:1–63

69. Kellenberger E, Rodrigo J, Muller P, Rognan D. 2004. Comparative evaluation of eight docking tools for docking and virtual screening accuracy. Proteins Struct. Funct. Bioinform. 57:225–42

70. Keseru GM, Kolossvary I. 2001. Fully flexible low-mode docking: application to induced fit in HIV integrase. J. Am. Chem. Soc. 123:12708–9

71. Khandelwal A, Lukacova V, Comez D, Kroll DM, Raha S, Balaz S. 2005. A combination of docking, QM/MM methods, and MD simulation for binding affinity estimation of metalloprotein ligands. J. Med. Chem. 48:5437–47

72. Kirchmair J, Wolber G, Laggner C, Langer T. 2006. Comparative performance assessment of the conformational model generators Omega and Catalyst: a large-scale survey on the retrieval of protein-bound ligand conformations. J. Chem. Inf. Model. 46:1848–61

73. Kirkwood JG. 1935. Statistical mechanics of fluid mixtures. J. Chem. Phys. 3:300–13

74. Knegtel RMA, Kuntz ID, Oshiro CM. 1997. Molecular docking to ensembles of protein structures. J. Mol. Biol. 266:424–40


75. Knox AJS, Meegan MJ, Carta G, Lloyd DG. 2005. Considerations in compound database preparation - “hidden” impact on virtual screening results. J. Chem. Inf. Model. 45:1908–19

76. Kuntz ID, Blaney JM, Oatley SJ, Langridge R, Ferrin TE. 1982. A geometric approach to macromolecule-ligand interactions. J. Mol. Biol. 161:269–88

77. Lamoureux G, Harder E, Vorobyov IV, Roux B, MacKerell AD. 2006. A polarizable model of water for molecular dynamics simulations of biomolecules. Chem. Phys. Lett. 418:245–49

78. Lazaridis T, Masunov A, Gandolfo F. 2002. Contributions to the binding free energy of ligands to avidin and streptavidin. Proteins Struct. Funct. Genet. 47:194–208

79. Lee MS, Olson MA. 2006. Calculation of absolute protein-ligand binding affinity using path and endpoint approaches. Biophys. J. 90:864–77

80. Lee MS, Salsbury FR Jr, Brooks CL III. 2002. Novel generalized Born methods. J. Chem. Phys. 116:10606–14

81. Theoretical andpractical analysis ofMM-PBSA andfree energypathway methods.

81. Lee MS, Salsbury FR, Olson MA. 2004. An efficient hybrid explicit/implicit solventmethod for biomolecular simulations. J. Comput. Chem. 25:1967–78

82. Leitgeb M, Schroder C, Boresch S. 2005. Alchemical free energy calculations and multipleconformational substates. J. Chem. Phys. 122:084109

83. Li Z, Lazaridis T. 2005. The effect of water displacement on binding thermodynamics:concanavalin A. J. Phys. Chem. B 109:662–70

84. Liu HY, Kuntz ID, Zou X. 2004. Pairwise GB/SA scoring function for structure-baseddrug design. J. Phys. Chem. B 108:5453–62

85. Luo H, Sharp K. 2002. On the calculation of absolute macromolecular binding freeenergies. Proc. Natl. Acad. Sci. USA 99:10399–404

86. Majeux N, Scarsi M, Apostolakis J, Ehrhardt C, Caflisch A. 1999. Exhaustive docking ofmolecular fragments with electrostatic solvation. Protein Struct. Funct. Genet. 37:88–105

87. Maple JR, Cao Y, Damm W, Halgren TA, Kaminski GA, et al. 2005. A polarizable forcefield and continuum solvation methodology for modeling of protein-ligand interactions.J. Chem. Theor. Comput. 1:694–715

88. Mihailescu M, Gilson MK. 2004. On the theory of noncovalent binding. Biophys. J.87:23–26

89. Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, et al. 1998. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J. Comput. Chem. 19:1639–62

90. Oda A, Tsuchida K, Takakura T, Yamaotsu N, Hirono S. 2006. Comparison of consensus scoring strategies for evaluating computational models of protein-ligand complexes. J. Chem. Inf. Model. 46:380–91

91. Olano LR, Rick SW. 2004. Hydration free energies and entropies for water in protein interiors. J. Am. Chem. Soc. 126:7991–8000

92. A method of predicting the affinities of multiple similar ligands based upon a single simulation.

92. Oostenbrink C, van Gunsteren WF. 2005. Free energies of ligand binding for structurally diverse compounds. Proc. Natl. Acad. Sci. USA 102:6750–54

93. Osterberg F, Morris GM, Sanner MF, Olson AJ, Goodsell DS. 2002. Automated docking to multiple target structures: incorporation of protein mobility and structural water heterogeneity into AutoDock. Proteins Struct. Funct. Genet. 46:34–40

94. Critical comparison of MM-PBSA and free energy pathway methods.

94. Pearlman DA. 2005. Evaluating the molecular mechanics Poisson-Boltzmann surface area free energy method using a congeneric series of ligands to p38 MAP kinase. J. Med. Chem. 48:7796–807

95. Peters MB, Raha K, Merz KM Jr. 2006. Quantum mechanics in structure-based drug design. Curr. Opin. Drug. Discov. 9:370–79


96. Potter MJ, Gilson MK. 2002. Coordinate systems and the calculation of molecular properties. J. Phys. Chem. A 126:563–66

97. Qiu D, Shenkin PS, Hollinger FP, Still WC. 1997. The GB/SA continuum model for solvation: a fast analytical method for the calculation of approximate Born radii. J. Phys. Chem. 101:3005–14

98. Raha K, Merz KM Jr. 2004. A quantum mechanics-based scoring function: study of zinc ion-mediated ligand binding. J. Am. Chem. Soc. 126:1020–21

99. Raha K, Merz KM Jr. 2005. Large-scale validation of a quantum mechanics based scoring function: predicting the binding affinity and the binding mode of a diverse set of protein-ligand complexes. J. Med. Chem. 48:4558–75

100. Rarey M, Kramer B, Lengauer T. 1999. The particle concept: placing discrete water molecules during protein-ligand docking predictions. Proteins Struct. Funct. Genet. 34:17–28

101. Rarey M, Kramer B, Lengauer T, Klebe G. 1996. A fast flexible docking method using an incremental construction algorithm. J. Mol. Biol. 261:470–89

102. Ren PY, Ponder JW. 2004. Temperature and pressure dependence of the AMOEBA water model. J. Phys. Chem. B 108:13427–37

103. Rick SW. 2006. Increasing the efficiency of free energy calculations using parallel tempering and histogram reweighting. J. Chem. Theory Comput. 2:939–46

104. Rizzo R, Wang DP, Tirado-Rives J, Jorgensen WL. 2000. Validation of a model for the complex of HIV-1 reverse transcriptase with sustiva through computation of resistance profiles. J. Am. Chem. Soc. 122:12898–900

105. Rodinger T, Pomes R. 2005. Enhancing the accuracy, the efficiency and the scope of free energy simulations. Curr. Opin. Struct. Biol. 15:164–70

106. Roux B, Nina M, Pomes R, Smith JC. 1996. Thermodynamic stability of water molecules in the bacteriorhodopsin proton channel: a molecular dynamics free energy perturbation study. Biophys. J. 71:670–81

107. Ruvinsky AM, Kozintsev AV. 2005. New and fast statistical-thermodynamic method for computation of protein-ligand binding entropy substantially improves docking accuracy. J. Comput. Chem. 26:1089–95

108. Sherman W, Day T, Jacobson MP, Friesner RA, Farid R. 2006. Novel procedure for modeling ligand/receptor induced fit effects. J. Med. Chem. 49:534–53

109. Shirts MR, Pande VS. 2005. Comparison of efficiency and bias of free energies computed by exponential averaging, the Bennett acceptance ratio, and thermodynamic integration. J. Chem. Phys. 122:144107

110. Shoichet BK. 2004. Virtual screening of chemical libraries. Nature 432:862–65

111. Shoichet BK, Leach AR, Kuntz ID. 1999. Ligand solvation in molecular docking. Proteins Struct. Funct. Genet. 34:4–16

112. Sitkoff D, Sharp KA, Honig B. 1994. Accurate calculation of hydration free energies using macroscopic solvation models. J. Phys. Chem. 98:1978–88

113. Sousa SF, Fernandes PA, Ramos MJ. 2006. Protein-ligand docking: current status and future challenges. Proteins Struct. Funct. Bioinform. 65:15–26

114. Srinivasan J, Cheatham TE III, Cieplak P, Kollman PA, Case DA. 1998. Continuum solvent studies of the stability of DNA, RNA, and phosphoramidate-DNA helices. J. Am. Chem. Soc. 120:9401–9

115. Stone AJ, Price SL. 1988. Some new ideas in the theory of intermolecular forces - anisotropic atom potentials. J. Phys. Chem. 92:3325–35

116. Straatsma TP, McCammon JA. 1991. Multiconfiguration thermodynamic integration. J. Chem. Phys. 95:1175–88


117. Theoretical analysis of MM-PBSA, and calculation of contributions from rotational and translational degrees of freedom.

117. Swanson JMJ, Henchman RH, McCammon JA. 2004. Revisiting free energy calculations: a theoretical connection to MM/PBSA and direct calculation of the association free energy. Biophys. J. 86:67–74

118. Tashiro M, Stuchebrukhov AA. 2005. Thermodynamic properties of internal water molecules in the hydrophobic cavity around the catalytic center of cytochrome c oxidase. J. Phys. Chem. B 109:1015–22

119. Tembe BL, McCammon JA. 1984. Ligand-receptor interactions. Comput. Chem. 8:281–83

120. Todorov NP, Monthoux PH, Alberts IL. 2006. The influence of variations of ligand protonation and tautomerism on protein-ligand recognition and binding energy landscape. J. Chem. Inf. Model. 46:1134–42

121. van Gunsteren WF, Berendsen HJC. 1987. Groningen Molecular Simulation (GROMOS) Library Manual. Groningen, The Nether.: BIOMOS

122. Vasilyev V, Bliznyuk A. 2004. Application of semiempirical quantum chemical methods as a scoring function in docking. Theor. Chem. Acc. 112:313–17

123. Verdonk ML, Chessari G, Cole JC, Hartshorn MJ, Murray CW, et al. 2005. Modeling water molecules in protein-ligand docking using GOLD. J. Med. Chem. 48:6504–15

124. Wang R, Lai L, Wang S. 2002. Further development and validation of empirical scoring functions for structure-based binding affinity prediction. J. Comput. Aided Mol. Des. 16:11–26

125. Wang R, Lu Y, Fang X, Wang S. 2004. An extensive test of 14 scoring functions using the PDBbind refined set of 800 protein-ligand complexes. J. Chem. Inf. Comput. Sci. 44:2114–24

126. Wang W, Wang J, Kollman PA. 1999. What determines the van der Waals coefficient beta in the LIE (linear interaction energy) method to estimate binding free energies using molecular dynamics simulations? Proteins Struct. Funct. Genet. 34:395–402

127. Broad and detailed evaluation of docking and scoring methods.

127. Warren GL, Andrews CW, Capelli A, Clarke B, LaLonde J, et al. 2005. A critical assessment of docking programs and scoring functions. J. Med. Chem. 49:5912–31

128. Wei BQ, Baase WA, Weaver LH, Matthews BW, Shoichet BK. 2002. A model binding site for testing scoring functions in molecular docking. J. Mol. Biol. 322:339–55

129. Wei BQ, Weaver LH, Ferrari AM, Matthews BW, Shoichet BK. 2004. Testing a flexible-receptor docking algorithm in a model binding site. J. Mol. Biol. 337:1161–82

130. Wong CF, Kua J, Zhang Y, Straatsma TP, McCammon JA. 2005. Molecular docking of balanol to dynamics snapshots of protein kinase A. Proteins Struct. Funct. Bioinform. 61:850–58

131. Yang AY, Kallblad P, Mancera RL. 2004. Molecular modelling prediction of ligand binding site flexibility. J. Comput. Aided Mol. Des. 18:235–50

132. Yang W, Bitetti-Putzer R, Karplus M. 2004. Free energy simulations: use of reverse cumulative averaging to determine the equilibrated region and the time required for convergence. J. Chem. Phys. 120:2618–28

133. Zacharias M. 2004. Rapid protein-ligand docking using soft modes from molecular dynamics simulations to account for protein deformability: binding of FK506 to FKBP. Proteins Struct. Funct. Bioinform. 54:759–67

134. Zavodszky M, Kuhn L. 2005. Side-chain flexibility in protein-ligand binding: the minimal rotation hypothesis. Protein Sci. 14:1104–14

135. Zhou R, Friesner RA, Ghosh A, Rizzo RC, Jorgensen WL, Levy RM. 2001. New linear interaction method for binding affinity calculations using a continuum solvent model. J. Phys. Chem. B 105:10388–97


136. Zou X, Sun YX, Kuntz ID. 1999. Inclusion of solvation in ligand binding free energy calculations using the generalized-Born model. J. Am. Chem. Soc. 121:8033–43

137. Zwanzig RW. 1954. High-temperature equation of state by a perturbation method. I. Nonpolar gases. J. Chem. Phys. 22:1420–26

138. Lu B, Wong CF. 2005. Direct estimation of entropy loss due to reduced translational and rotational motions upon molecular binding. Biopolymers 79:277–85

RELATED REVIEWS

Brooijmans N, Kuntz ID. 2003. Molecular recognition and docking algorithms. Annu. Rev. Biophys. Biomol. Struct. 32:335–73

Peters MB, Raha K, Merz KM Jr. 2006. Quantum mechanics in structure-based drug design. Curr. Opin. Drug. Discov. 9:370–79

Rodinger T, Pomes R. 2005. Enhancing the accuracy, the efficiency and the scope of free energy simulations. Curr. Opin. Struct. Biol. 15:164–70


