Empirical Force Fields: Overview and Parameter Optimization
Alexander D. MacKerell, Jr.
Department of Pharmaceutical Sciences
School of Pharmacy
University of Maryland, Baltimore
Baltimore, MD 21201, USA
Adopted as Additional Instructions on F.F. for CHEM5021/8021 (J. Gao, Computational Chemistry)
Original material: http://www.pharmacy.umaryland.edu/faculty/amackere/param/force_field_dev.htm
Potential energy function (mathematical equations)
Empirical force field (equations and parameters that relate chemical structure and conformation to energy)
Common empirical force fields
Class I
  CHARMM
  CHARMm (Accelrys)
  AMBER
  OPLS/AMBER/Schrödinger
  ECEPP (free energy force field)
  GROMOS
Class II
  CFF95 (Biosym/Accelrys)
  MM3
  MMFF94 (CHARMM, Macromodel, elsewhere)
  UFF, DREIDING
Extended atom models (omit non-polar hydrogens)
CHARMM PARAM19 (proteins)
use with implicit solvent models
ACE, EEF, GB
improper term to maintain chirality
loss of cation - pi interactions
OPLS
AMBER
GROMOS
Transition State Force Field Parameters
  Same approach as standard force field parameterization
  Require target data for transition state of interest: ab initio
Metal Force Field Parameterization
  Only interaction parameters, or
  Include intramolecular terms
Parameterization of QM atoms for QM/MM calculations
Polarizable (non-additive) force fields
Include explicit term in potential energy function to treat polarization of charge distribution by the environment.
Still under development.
  PIPF (Gao and coworkers)
  AMBER
  Friesner/Berne et al. (Schrödinger Inc.)
  CHARMM
  TINKER
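Class I Potential Energy Function
Intramolecular (internal, bonded terms)
$$\sum_{bonds} K_b (b - b_o)^2 + \sum_{angles} K_\theta (\theta - \theta_o)^2 + \sum_{torsions} K_\phi \left[1 + \cos(n\phi - \delta)\right] + \sum_{impropers} K_\varphi (\varphi - \varphi_o)^2 + \sum_{Urey-Bradley} K_{UB} (r_{1,3} - r_{1,3,o})^2$$
Intermolecular (external, nonbonded terms)
$$\sum_{nonbonded} \left\{ \frac{q_i q_j}{4\pi D r_{ij}} + \varepsilon_{ij} \left[ \left(\frac{R_{min,ij}}{r_{ij}}\right)^{12} - 2\left(\frac{R_{min,ij}}{r_{ij}}\right)^{6} \right] \right\}$$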
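Class II force fields (e.g. MM3, MMFF, UFF, CFF)
$$\sum_{bonds} \left[ K_{b,2}(b - b_o)^2 + K_{b,3}(b - b_o)^3 + K_{b,4}(b - b_o)^4 \right]$$
$$+ \sum_{angles} \left[ K_{\theta,2}(\theta - \theta_o)^2 + K_{\theta,3}(\theta - \theta_o)^3 + K_{\theta,4}(\theta - \theta_o)^4 \right]$$
$$+ \sum_{dihedrals} \left[ K_{\phi,1}(1 - \cos\phi) + K_{\phi,2}(1 - \cos 2\phi) + K_{\phi,3}(1 - \cos 3\phi) \right]$$
$$+ \sum_{impropers} K_\chi \chi^2$$
$$+ \sum_{bonds} \sum_{bonds'} K_{bb'} (b - b_o)(b' - b_o') + \sum_{angles} \sum_{angles'} K_{\theta\theta'} (\theta - \theta_o)(\theta' - \theta_o')$$
$$+ \sum_{bonds} \sum_{angles} K_{b\theta} (b - b_o)(\theta - \theta_o)$$
$$+ \sum_{bonds} \sum_{dihedrals} (b - b_o) \left[ K_{\phi,b1}\cos\phi + K_{\phi,b2}\cos 2\phi + K_{\phi,b3}\cos 3\phi \right]$$
$$+ \sum_{bonds'} \sum_{dihedrals} (b' - b_o') \left[ K_{\phi,b'1}\cos\phi + K_{\phi,b'2}\cos 2\phi + K_{\phi,b'3}\cos 3\phi \right]$$
$$+ \sum_{angles} \sum_{dihedrals} (\theta - \theta_o) \left[ K_{\phi,\theta 1}\cos\phi + K_{\phi,\theta 2}\cos 2\phi + K_{\phi,\theta 3}\cos 3\phi \right]$$
$$+ \sum_{angles} \sum_{angles'} \sum_{dihedrals} (\theta - \theta_o)(\theta' - \theta_o')\cos\phi$$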
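Intramolecular parameters
Equilibrium terms
  $b_o$: bonds
  $\theta_o$: angles
  $n$: dihedral multiplicity
  $\delta$: dihedral phase
  $\varphi_o$: impropers
  $r_{1,3,o}$: Urey-Bradley
Force constants
  $K_b$: bonds
  $K_\theta$: angles
  $K_\phi$: dihedral
  $K_\varphi$: impropers
  $K_{UB}$: Urey-Bradley
$$\sum_{bonds} K_b (b - b_o)^2 + \sum_{angles} K_\theta (\theta - \theta_o)^2 + \sum_{torsions} K_\phi \left[1 + \cos(n\phi - \delta)\right] + \sum_{impropers} K_\varphi (\varphi - \varphi_o)^2 + \sum_{Urey-Bradley} K_{UB} (r_{1,3} - r_{1,3,o})^2$$
[Figure: atoms labeled 1, 2, 3 and 4]
$$V_{bond} = K_b (b - b_o)^2$$
$$V_{angle} = K_\theta (\theta - \theta_o)^2$$
$$V_{dihedral} = K_\phi \left[1 + \cos(n\phi - \delta)\right]$$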
Intermolecular interactions between bonded atoms
  1,2 interactions: 0
  1,3 interactions: 0
  1,4 interactions: 1 or scaled
  > 1,4 interactions: 1
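The scaling rule above can be sketched in code. This is a minimal illustration, not any program's actual exclusion-list implementation: the function names, the bond-list representation and the default `scale_14` value are assumptions made for the example.

```python
from collections import deque

def bond_separation(bonds, start, end):
    """Number of bonds on the shortest path between two atoms (None if unconnected)."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        atom, dist = queue.popleft()
        if atom == end:
            return dist
        for nxt in neighbors.get(atom, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def nonbonded_scale(bonds, i, j, scale_14=1.0):
    """Scale factor applied to the nonbonded interaction of atoms i and j."""
    if i == j:
        raise ValueError("i and j must be different atoms")
    sep = bond_separation(bonds, i, j)
    if sep in (1, 2):      # 1,2 and 1,3 pairs are excluded
        return 0.0
    if sep == 3:           # 1,4 pairs: full or scaled (force-field dependent)
        return scale_14
    return 1.0             # beyond 1,4, or atoms in different molecules

# Hypothetical five-carbon chain used only for illustration
chain = [("C1", "C2"), ("C2", "C3"), ("C3", "C4"), ("C4", "C5")]
factors = [nonbonded_scale(chain, "C1", other, scale_14=0.5)
           for other in ("C2", "C3", "C4", "C5")]
```

For the chain above, `factors` holds the 1,2, 1,3, 1,4 and 1,5 scale factors in that order.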
Bond Energy versus Bond length
[Figure: bond energy versus bond length, Å]

Chemical type   Kbond               bo
C-C             100 kcal/mole/Å²    1.5 Å
C=C             200 kcal/mole/Å²    1.3 Å
C≡C             400 kcal/mole/Å²    1.2 Å

$$V_{bond} = K_b (b - b_o)^2$$
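As a small worked example, the harmonic bond term can be evaluated with the illustrative parameters from the table above. The dictionary layout and function name are assumptions for this sketch; `C#C` is just an ASCII key for the triple bond.

```python
# (Kbond in kcal/mole/Å^2, bo in Å), taken from the table above
BOND_PARAMS = {
    "C-C": (100.0, 1.5),
    "C=C": (200.0, 1.3),
    "C#C": (400.0, 1.2),
}

def bond_energy(b, k_b, b0):
    """Harmonic bond energy V = Kb * (b - bo)^2, in kcal/mole."""
    return k_b * (b - b0) ** 2

# Stretching each bond 0.1 Å beyond its equilibrium length
stretch_energies = {
    name: bond_energy(b0 + 0.1, k_b, b0) for name, (k_b, b0) in BOND_PARAMS.items()
}
```

A 0.1 Å stretch costs about 1, 2 and 4 kcal/mole for the single, double and triple bond, which is the stiffening visible in the plot.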
Dihedral energy versus dihedral angle
[Figure: dihedral energy versus dihedral angle, 0 to 360 degrees, for δ = 0°]
$$V_{dihedral} = K_\phi \left[1 + \cos(n\phi - \delta)\right]$$
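A minimal sketch of the dihedral term, useful for seeing how the multiplicity n sets the number of minima over 360 degrees. The function name and the sample force constant are assumptions for illustration.

```python
import math

def dihedral_energy(phi_deg, k_phi, n, delta_deg=0.0):
    """Dihedral energy V = K_phi * (1 + cos(n*phi - delta)), angles in degrees."""
    return k_phi * (1.0 + math.cos(math.radians(n * phi_deg - delta_deg)))

# Three-fold term (n = 3, delta = 0): maxima at 0, 120, 240 and minima at 60, 180, 300
k_phi = 2.0  # illustrative force constant, kcal/mole
profile = [(phi, dihedral_energy(phi, k_phi, n=3)) for phi in range(0, 360, 60)]
```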
$$V_{improper} = K_\varphi (\varphi - \varphi_o)^2$$
$$V_{Urey-Bradley} = K_{UB} (r_{1,3} - r_{1,3,o})^2$$
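Intermolecular parameters
$$\sum_{nonbonded} \left\{ \frac{q_i q_j}{4\pi D r_{ij}} + \varepsilon_{ij} \left[ \left(\frac{R_{min,ij}}{r_{ij}}\right)^{12} - 2\left(\frac{R_{min,ij}}{r_{ij}}\right)^{6} \right] \right\}$$
  $q_i$: partial atomic charge
  $D$: dielectric constant
  $\varepsilon$: Lennard-Jones (LJ, vdW) well-depth
  $R_{min}$: LJ radius ($R_{min}/2$ in CHARMM)
Combining rules (CHARMM, Amber)
$$R_{min,ij} = R_{min,i} + R_{min,j}, \qquad \varepsilon_{ij} = \sqrt{\varepsilon_i \varepsilon_j}$$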
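The combining rules can be written as a short function. This sketch assumes per-atom radii are supplied as Rmin/2 (the CHARMM convention noted above) and well depths as positive numbers; the function name and the sample values are hypothetical.

```python
import math

def combine_lj(rmin_half_i, eps_i, rmin_half_j, eps_j):
    """Pair LJ parameters from per-atom values.

    Rmin,ij is the sum of the per-atom Rmin/2 values and
    eps,ij is the geometric mean of the per-atom well depths.
    """
    rmin_ij = rmin_half_i + rmin_half_j
    eps_ij = math.sqrt(eps_i * eps_j)
    return rmin_ij, eps_ij

# Illustrative (not force-field) values: Rmin/2 in Å, eps in kcal/mole
rmin_ij, eps_ij = combine_lj(2.0, 0.1, 1.5, 0.4)
```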
Electrostatic Energy versus Distance
[Figure: interaction energy (kcal/mol, -100 to 100) versus distance (0 to 8 Å) for q1=1, q2=1 and for q1=-1, q2=1]
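The two curves in the plot can be regenerated with a few lines. In this sketch the $1/4\pi$ prefactor and unit conversion are folded into a single constant of 332.0716 kcal·Å/(mol·e²), the value commonly used in CHARMM; treat that constant and the function name as assumptions of the example rather than part of the slide.

```python
COULOMB_CONSTANT = 332.0716  # kcal*Å/(mol*e^2); assumed conversion factor

def coulomb_energy(q_i, q_j, r_ij, dielectric=1.0):
    """Coulomb interaction energy in kcal/mol for charges in e and distance in Å."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r_ij)

distances = [1.0, 2.0, 4.0, 8.0]
like_charges = [coulomb_energy(1.0, 1.0, r) for r in distances]       # repulsive
opposite_charges = [coulomb_energy(-1.0, 1.0, r) for r in distances]  # attractive
```

The slow 1/r decay is apparent: two unit charges still interact by roughly 40 kcal/mol at 8 Å with a dielectric constant of 1.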
Treatment of hydrogen bonds???
Partial atomic charges
  C: 0.5, O: -0.5, H: 0.35, N: -0.45
Lennard-Jones Energy versus Distance
[Figure: interaction energy (kcal/mol) versus distance (Å) for ε = 0.2, Rmin = 2.5, with $R_{min,ij}$ and $\varepsilon_{ij}$ marked on the curve]
$$\varepsilon_{ij} \left[ \left(\frac{R_{min,ij}}{r_{ij}}\right)^{12} - 2\left(\frac{R_{min,ij}}{r_{ij}}\right)^{6} \right]$$
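A short sketch of the LJ term with the plotted parameters (ε = 0.2, Rmin = 2.5) shows why the two parameters are called well depth and radius: the minimum lies at r = Rmin with energy -ε. The function name is an assumption for the example.

```python
def lj_energy(r_ij, eps_ij, rmin_ij):
    """Lennard-Jones energy eps * [(Rmin/r)^12 - 2 (Rmin/r)^6]."""
    ratio6 = (rmin_ij / r_ij) ** 6
    return eps_ij * (ratio6 * ratio6 - 2.0 * ratio6)

EPS, RMIN = 0.2, 2.5  # parameters of the plotted curve

energy_at_minimum = lj_energy(RMIN, EPS, RMIN)          # equals -EPS
zero_crossing = RMIN / 2.0 ** (1.0 / 6.0)               # where the energy is zero
curve = [(r / 10.0, lj_energy(r / 10.0, EPS, RMIN)) for r in range(20, 81, 5)]
```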
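Alternate intermolecular terms
$$V_{vdw} = \sum_{vdw} \varepsilon_{ij} \left[ e^{-a R_{min,ij}/r_{ij}} - \left(\frac{R_{min,ij}}{r_{ij}}\right)^{6} \right]$$
$$V_{vdw} = \sum_{vdw} \varepsilon_{ij} \left[ \left(\frac{R_{min,ij}}{r_{ij}}\right)^{9} - \left(\frac{R_{min,ij}}{r_{ij}}\right)^{6} \right]$$
$$V_{Hbond} = \sum_{Hbonds} \varepsilon_{HB} \left[ \left(\frac{R_{HB,A-H}}{r_{A-H}}\right)^{12} - \left(\frac{R_{HB,A-H}}{r_{A-H}}\right)^{10} \right] \cos(\theta_{A-H-D})$$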
Extent of Parameter Optimization
Minimal optimization
  by analogy (i.e. direct transfer of known parameters)
  dihedrals only??
Maximal optimization
  time-consuming
  requires appropriate target data
Choice based on goal of the calculations
  Minimal
    database screening
    NMR/X-ray structure determination
  Maximal
    free energy calculations (perturbations, potential of mean force)
    mechanistic studies
    subtle environmental effects lead optimization
"Interaction Triad"
  Solute-Solute
  Solvent-Solute
  Solvent-Solvent
Intermolecular
Intramolecular
Self-Consistent Empirical Parameterization
Initial Geometry
Intermolecular Optimization
  Partial Atomic Charges
  VDW Parameters
  if intermolecular change > conv. crit.: repeat intermolecular optimization
  if intermolecular microscopic and macroscopic change < convergence criteria: continue to intramolecular optimization
Intramolecular Optimization
  Bonds
  Angles
  Torsions
  Impropers, Urey-Bradley
  if intramolecular change > conv. crit.: repeat intramolecular optimization
  if intramolecular and intermolecular change > conv. crit.: return to intermolecular optimization
Parameter Optimization Complete
  if intermolecular and intramolecular changes < convergence criteria
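The control flow of this scheme can be sketched as an alternating loop. This is a simplified illustration under stated assumptions: the two optimization stages are passed in as caller-supplied functions that return updated parameters plus the size of the change, and the inner repeat loops of the flowchart are collapsed into one pass per cycle. The toy stages below are hypothetical and only mimic the coupling between the two parameter classes.

```python
def self_consistent_optimization(params, optimize_inter, optimize_intra,
                                 tol=1e-3, max_cycles=100):
    """Alternate intermolecular and intramolecular optimization until both converge."""
    for cycle in range(1, max_cycles + 1):
        params, inter_change = optimize_inter(params)
        params, intra_change = optimize_intra(params)
        if inter_change < tol and intra_change < tol:
            return params, cycle
    raise RuntimeError("parameters did not converge")

# Toy stages (hypothetical): each parameter's optimum depends weakly on the other.
def toy_inter(params):
    target = 1.0 + 0.1 * params["k"]
    new_q = 0.5 * (params["q"] + target)
    return {**params, "q": new_q}, abs(new_q - params["q"])

def toy_intra(params):
    target = 2.0 + 0.1 * params["q"]
    new_k = 0.5 * (params["k"] + target)
    return {**params, "k": new_k}, abs(new_k - params["k"])

final_params, n_cycles = self_consistent_optimization(
    {"q": 0.0, "k": 0.0}, toy_inter, toy_intra
)
```

Because each stage shifts the optimum of the other, a single pass through both is not enough; the loop stops only when neither stage moves the parameters by more than the tolerance.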
Linking model compounds
[Figure: chemical structures of the full compound and of model compounds A, B and C]
  A) Indole
  B) Hydrazine
  C) Phenol
When creating a covalent link between model compounds, move the charge on the deleted H into the carbon to maintain integer charge (i.e. methyl (qC=-0.27, qH=0.09) to methylene (qC=-0.18, qH=0.09)).
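The arithmetic behind this rule is simple enough to check in code. The sketch below uses the methyl and methylene charges quoted above; the function name is an assumption for the example.

```python
def absorb_deleted_hydrogen(q_heavy, q_hydrogen):
    """Charge of the heavy atom after the deleted hydrogen's charge is moved onto it."""
    return round(q_heavy + q_hydrogen, 3)

Q_H = 0.09
q_methyl_carbon = -0.27
q_methylene_carbon = absorb_deleted_hydrogen(q_methyl_carbon, Q_H)

# The group charge is unchanged by the deletion: CH3 before, CH2 after
methyl_total = round(q_methyl_carbon + 3 * Q_H, 3)
methylene_total = round(q_methylene_carbon + 2 * Q_H, 3)
```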
1) Identify previously parameterized compounds
2) Access topology information
   i) Assign atom types
   ii) Connectivity (bonds)
   iii) Charges

CHARMM topology (parameter files)
  top_all22_model.inp (par_all22_prot.inp)
  top_all22_prot.inp (par_all22_prot.inp)
  top_all22_sugar.inp (par_all22_sugar.inp)
  top_all27_lipid.rtf (par_all27_lipid.prm)
  top_all27_na.rtf (par_all27_na.prm)
  top_all27_na_lipid.rtf (par_all27_na_lipid.prm)
  top_all27_prot_lipid.rtf (par_all27_prot_lipid.prm)
  top_all27_prot_na.rtf (par_all27_prot_na.prm)
  toph19.inp (param19.inp)

NA and lipid force fields have new LJ parameters for the alkanes, representing increased optimization of the protein alkane parameters. Tests have shown that these are compatible (e.g. in protein-nucleic acid simulations). For new systems it is suggested that the new LJ parameters be used. Note that only the LJ parameters were changed; the internal parameters are identical.
From top_all22_model.inp
RESI PHEN 0.00      ! phenol, adm jr.
GROUP
ATOM CG  CA  -0.115 !
ATOM HG  HP   0.115 !     HD1  HE1
GROUP               !      |    |
ATOM CD1 CA  -0.115 !     CD1--CE1
ATOM HD1 HP   0.115 !    //      \\
GROUP               ! HG--CG      CZ--OH
ATOM CD2 CA  -0.115 !     \      /     \
ATOM HD2 HP   0.115 !     CD2==CE2      HH
GROUP               !      |    |
ATOM CE1 CA  -0.115 !     HD2  HE2
ATOM HE1 HP   0.115
GROUP
ATOM CE2 CA  -0.115
ATOM HE2 HP   0.115
GROUP
ATOM CZ  CA   0.11
ATOM OH  OH1 -0.54
ATOM HH  H    0.43
BOND CD2 CG CE1 CD1 CZ CE2 CG HG CD1 HD1
BOND CD2 HD2 CE1 HE1 CE2 HE2 CZ OH OH HH
DOUBLE CD1 CG CE2 CD2 CZ CE1
HG will ultimately be deleted. Therefore, move HG (hydrogen) charge into CG, such that the CG charge becomes 0.00 in the final compound.
Use remaining charges/atom types without any changes.
Do the same with indole
top_all22_model.inp contains all protein model compounds. Lipid, nucleic acid and carbohydrate model compounds are in the full topology files.
Creation of topology for central model compound
Resi Mod1           ! Model compound 1
Group               ! specifies integer charge group of atoms (not essential)
ATOM C1  CT3  -0.27
ATOM H11 HA3   0.09
ATOM H12 HA3   0.09
ATOM H13 HA3   0.09
GROUP
ATOM C2  C     0.51
ATOM O2  O    -0.51
GROUP
ATOM N3  NH1  -0.47
ATOM H3  H     0.31
ATOM N4  NR1   0.16 ! new atom
ATOM C5  CEL1 -0.15
ATOM H51 HEL1  0.15
ATOM C6  CT3  -0.27
ATOM H61 HA    0.09
ATOM H62 HA    0.09
ATOM H63 HA    0.09
BOND C1 H11 C1 H12 C1 H13 C1 C2 C2 O2 C2 N3 N3 H3
BOND N3 N4 C5 H51 C5 C6 C6 H61 C6 H62 C6 H63
DOUBLE N4 C5        ! DOUBLE only required for MMFF
Start with alanine dipeptide. Note use of new aliphatic LJ parameters and, importantly, atom types. NR1 from histidine unprotonated ring nitrogen. Charge (very bad) initially set to yield unit charge for the group.
CEL1/HEL1 from propene (lipid model compound). See top_all27_prot_lipid.rtf
Note use of large group to allow flexibility in charge optimization.
Intermolecular Optimization Target Data
Local/Small Molecule
  Experimental
    Interaction enthalpies (MassSpec)
    Interaction geometries (microwave, crystal)
    Dipole moments
  Quantum mechanical
    Mulliken Population Analysis
    Electrostatic potential (ESP) based
      CHELPG (g98: POP=(CHELPG,DIPOLE))
      Restricted ESP (AMBER)
    Dimer Interaction Energies and Geometries (OPLS, CHARMM)
Global/condensed phase (all experimental)
  Pure solvents (heats of vaporization, density, heat capacity, isocompressibility)
  Aqueous solution (heats/free energies of solution, partial molar volumes)
  Crystals (heats of sublimation, lattice parameters, interaction geometries)
Additive Models: account for lack of explicit inclusion of polarizability via "overcharging" of atoms.
  RESP: HF/6-31G* overestimates dipole moments (AMBER)
  Interaction based optimization (CHARMM, OPLS)
    local polarization included
    scale target interaction energies (CHARMM)
      1.16 for polar neutral compounds
      1.0 for charged compounds
Partial Atomic Charge Determination
For a particular force field do NOT change the QM level of theory. This is necessary to maintain consistency with the remainder of the force field.
Starting charges??
  peptide bond
  methylimidazole (N-N=C)?
  Mulliken population analysis
Final charges (methyl, vary qC to maintain integer charge, always qH = 0.09)
  interactions with water (HF/6-31G*, monohydrates!)
  dipole moment
Model compound 1-water interaction energies/geometries

                  Interaction Energies (kcal/mole)     Interaction Distances (Å)
Interaction       Ab initio   Analogy   Optimized      Ab initio   Analogy   Optimized
1) O2...HOH         -6.12      -6.56      -6.04           2.06       1.76       1.78
2) N3-H..OHH        -7.27      -7.19      -7.19           2.12       1.91       1.89
3) N4...HOH         -5.22      -1.16      -5.30           2.33       2.30       2.06
4) C5-H..OHH        -3.86      -3.04      -3.69           2.46       2.51       2.44

Energetic statistical analysis
                              Analogy   Optimized
Ave. Difference                 1.13       0.06
RMS Difference                  1.75       0.09

Dipole Moments (debye)
                  Ab initio   Analogy   Optimized
                     5.69       4.89       6.00

Ab initio interaction energies scaled by 1.16.
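The energetic statistics can be recomputed from the four interaction energies. The tabulated "RMS Difference" values are reproduced when the spread is taken about the average difference (a population standard deviation); that reading is an inference from the numbers, not something the slide states. Variable and function names are assumptions for this sketch.

```python
from statistics import mean, pstdev

# Interaction energies (kcal/mole) for interactions 1-4, from the table above
AB_INITIO = [-6.12, -7.27, -5.22, -3.86]   # already scaled by 1.16
ANALOGY = [-6.56, -7.19, -1.16, -3.04]
OPTIMIZED = [-6.04, -7.19, -5.30, -3.69]

def difference_stats(model, reference):
    """Average difference and its spread about the average (model minus reference)."""
    diffs = [m - r for m, r in zip(model, reference)]
    return mean(diffs), pstdev(diffs)

analogy_avg, analogy_rms = difference_stats(ANALOGY, AB_INITIO)
optimized_avg, optimized_rms = difference_stats(OPTIMIZED, AB_INITIO)
```

The analogy charges are dominated by the single large error at the N4...HOH interaction (about 4 kcal/mole), which is why both statistics collapse once N4 is reoptimized.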
Comparison of analogy and optimized charges

Name   Type   Analogy   Optimized
C1     CT3     -0.27      -0.27
H11    HA3      0.09       0.09
H12    HA3      0.09       0.09
H13    HA3      0.09       0.09
C2     C        0.51       0.58
O2     O       -0.51      -0.50
N3     NH1     -0.47      -0.32
H3     H        0.31       0.33
N4     NR1      0.16      -0.31
C5     CEL1    -0.15      -0.25
H51    HEL1     0.15       0.29
C6     CT3     -0.27      -0.09
H61    HA       0.09       0.09
H62    HA       0.09       0.09
H63    HA       0.09       0.09

Note charge on C6 methyl carbon. Non-integer charge is typically placed on the adjacent aliphatic carbon.
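A quick consistency check on any charge set is that the molecule's net charge is still an integer after optimization. The sketch below sums both columns of the table above and also the C6 methyl on its own; the data layout and function name are assumptions for the example.

```python
# name: (analogy charge, optimized charge), copied from the table above
CHARGES = {
    "C1": (-0.27, -0.27), "H11": (0.09, 0.09), "H12": (0.09, 0.09), "H13": (0.09, 0.09),
    "C2": (0.51, 0.58), "O2": (-0.51, -0.50),
    "N3": (-0.47, -0.32), "H3": (0.31, 0.33),
    "N4": (0.16, -0.31), "C5": (-0.15, -0.25), "H51": (0.15, 0.29),
    "C6": (-0.27, -0.09), "H61": (0.09, 0.09), "H62": (0.09, 0.09), "H63": (0.09, 0.09),
}

def net_charge(charges, column, atoms=None):
    """Sum of the charges in one column, optionally restricted to a subset of atoms."""
    names = charges.keys() if atoms is None else atoms
    return round(sum(charges[name][column] for name in names), 2)

analogy_total = net_charge(CHARGES, 0)
optimized_total = net_charge(CHARGES, 1)
c6_methyl_optimized = net_charge(CHARGES, 1, atoms=["C6", "H61", "H62", "H63"])
```

Both sets are neutral overall, while the optimized C6 methyl carries the 0.18 left over from the rest of the molecule.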
LJ (vdw) parameters
Direct transfer from available parameters is generally adequate
Test via
  Heat of vaporization
  Density (Molecular Volume)
  Partial molar volume
  Crystal simulations
For details of LJ parameter optimization see Chen, Yin and MacKerell, JCC, 23:199-213 (2002)
Intramolecular optimization target data
Geometries (equilibrium bond, angle, dihedral, UB and improper terms)
  microwave, electron diffraction, ab initio
  small molecule x-ray crystallography (CSD)
  crystal surveys of geometries
Vibrational spectra (force constants)
  infrared, raman, ab initio
Conformational energies (force constants)
  microwave, ab initio
Comparison of atom names (upper) and atom types (lower)
[Figure: structure labeled with atom names (C2, O2, N3, H3, N4, C5, H5) and with atom types (C, O, NH1, NR1, CEL1, HEL1)]
Bond and angle equilibrium term optimization
Only for required parameters!
Experimental
  Crystal structure, electron diffraction, microwave
  Crystal survey of specific moieties
QM
  Full compound if possible (HF or MP2/6-31G*)
  Fragments if necessary
Bonds (list doesn't include lipid-protein alkane nomenclature differences)
  NH1-NR1, NR1-CEL1
Angles
  NR1-NH1-H, NR1-NH1-C, NH1-NR1-CEL1
  NR1-CEL1-CTL3, NR1-CEL1-HEL1
Dihedrals
  CTL3-C-NH1-NR1, C-NH1-NR1-CEL1, O-C-NH1-NR1, NH1-NR1-CEL1-HEL1, NH1-NR1-CEL1-CTL3
  H-NH1-NR1-CEL1, NR1-CEL1-CTL3-HAL3
Bonds and angles for model compound 1
               MP2/6-31G*         CSD                        Analogy   Optimized
               1        2         1            2
Bond lengths
C-N (a)        1.385    1.382     1.37±0.03    1.35±0.01     1.342     1.344
N-N            1.370    1.366     1.38±0.02    1.37±0.01     1.386     1.365
N=C            1.289    1.290     1.29±0.02    1.28±0.01     1.339     1.289
Angles
C-N-N          120.8    122.4     120.7±5.8    119.7±2.9     124.5     121.4
N-N=C          116.0    116.6     114.5±5.3    115.8±1.6     119.6     115.6
N=C-C          119.9    120.0     120.7±4.7    121.2±2.2     122.4     121.0

The MP2/6-31G* results are for the 1) all-trans and 2) 0°, 180°, 180° global minimum energy structures. The Cambridge structural database results represent mean±standard deviation for all structures with R-factor < 0.1 and 1) the N7 and C10 sites undefined and 2) the N7 and C10 sites explicitly protonated. a) Not optimized as part of the present study.
Parameter changes (force constant/equilibrium value):
  Bonds: NH1-NR1 from 400/1.38 to 550/1.36; NR1=CEL1 from 500/1.342 to 680/1.290
  Angles: C-NH1-NR1 from 50.0/120.0 to 50.0/115.0; NH1-NR1-CEL1 from 50.0/120.0 to 50.0/115.0; NR1-CEL1-CT3 from 48.0/123.5 to 48.0/122.5
For planar systems, keep the sum of the equilibrium angle parameters equal to 360.0.
Bond, angle, dihedral, UB and improper force constants
Vibrational spectra
  Frequencies
  Assignments
Conformational Energetics
  Relative energies
  Potential energy surfaces
Vibrations are generally used to optimize the bond, angle, UB and improper FCs while conformational energies are used for the dihedral FCs. However, vibrations will also be used for a number of the dihedral FCs, especially those involving hydrogens and in rings.
#    Freq   Assign   %     Assign   %     Assign   %
1    62     tC2N     64    tN3N     46
2    133    tC1H3    50    tN3N     18    tC2N     17
3    148    tC1H3    46    tC6H3    25
4    154    dC2NN    44    dN3NC    28    dN4CC    16
5    205    tC6H3    59    tN4C     22    tN3N     21
6    333    tN4C     73    tC2N     22
7    361    dC1CN    45    dN4CC    21    dN3NC    16
8    446    rC=O     32    dN4CC    20
9    568    wNH      77
10   586    dC1CN    21    dC2NN    20    rC=O     18
11   618    wC=O     83    wNH      28    tC2N     -26
12   649    rC=O     27    dN4CC    19
13   922    sC1-C    62
14   940    wC5H     80
15   1031   rCH3'    33    sC5-C    31
16   1114   rCH3     66
17   1139   rCH3'    76    wC=O     20
18   1157   rCH3     61    wC5H     21
19   1234   sC5-C    33    sN-N     32
20   1269   sN-N     36    rCH3'    18
21   1446   rNH      35
22   1447   rC5H     47    sC-N     18
23   1527   dCH3     77
24   1532   dCH3     88
25   1599   dCH3a'   50    dCH3a    17
26   1610   dCH3a    71    dCH3a'   24
27   1612   dCH3a'   30
28   1613   dCH3a    70    dCH3a'   23
29   1622   dCH3a'   57    dCH3a    19
30   1782   sN=C     71
31   1901   sC=O     78
32   3250   sCH3     76    sC5-H    21
33   3258   sC5-H    78    sCH3     21
34   3280   sCH3     99
35   3330   sCH3a    75    sCH3a'   25
36   3372   sCH3a'   100
37   3377   sCH3a'   73    sCH3a    24
38   3403   sCH3a    99
39   3688   sN-H     100
Vibrational Spectra of Model Compound 1 from MP2/6-31G* QM calculations
Frequencies in cm-1. Assignments and % are the modes and their respective percent contributions to each vibration.
Comparison of the scaled ab initio, by analogy and optimized vibrations for selected modes

Mode    g98                        Analogy                    Optimized
        #   Freq  Assign  %        #   Freq  Assign  %        #   Freq  Assign
sN=C    30  1782  sN=C    71       21  1228  sN=C    37       31  1802  sN=C
                                             rC5H    36                 sN-N
                                   30  1646  sN=C    28
                                             sC5-C   24
                                             rC5H    18
sN-N    19  1234  sC5-C   33       20  1113  sN-N    53       20  1200  rNH
                  sN-N    32                 rNH     26                 sN-N
        20  1269  sN-N    36                                            rC5H
                  rCH3'   18                                  23  1395  dCH3
                                                                        sN-N
                                                              31  1802  sN=C
                                                                        sN-N
dC2NN   4   154   dC2NN   44       5   207   dC2NN   36       4   158   dC2NN
                  dN3NC   28                 tN4C    31                 dN3NC
                  dN4CC   16                                            dN4CC
        10  586   dC1CN   21       12  607   dC1CN   26       11  574   dC1CN
                  dC2NN   20                 dC2NN   25                 dC2NN
                  rC=O    18                                            dN4CC
Dihedral optimization based on QM potential energy surfaces (HF/6-31G* or MP2/6-31G*).
Potential energy surfaces on compounds with multiple rotatable bonds
1) Full geometry optimization
2) Constrain n-1 dihedrals to minimum energy values or trans conformation
3) Sample selected dihedral surface
4) Repeat for all rotatable bonds
5) Repeat 2-4 using alternate minima if deemed appropriate
Note that the potential energy surface about a given torsion is the sum of the contributions from ALL terms in the potential energy function, not just the dihedral term
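One practical consequence is that the dihedral force constant is fitted to what remains of the target surface after all other terms are accounted for. The sketch below is a minimal one-parameter least-squares fit under stated assumptions: the target (e.g. QM) surface and the empirical surface computed with the dihedral term set to zero are given on the same grid, a constant offset between them is allowed, and the multiplicity and phase are fixed in advance. The function name and the synthetic data are hypothetical.

```python
import math

def fit_dihedral_k(phis_deg, target_energies, mm_energies_without_dihedral,
                   n, delta_deg=0.0):
    """Least-squares K_phi for K_phi*(1 + cos(n*phi - delta)) plus a free offset."""
    basis = [1.0 + math.cos(math.radians(n * phi - delta_deg)) for phi in phis_deg]
    residual = [t - m for t, m in zip(target_energies, mm_energies_without_dihedral)]
    mean_b = sum(basis) / len(basis)
    mean_r = sum(residual) / len(residual)
    numerator = sum((b - mean_b) * (r - mean_r) for b, r in zip(basis, residual))
    denominator = sum((b - mean_b) ** 2 for b in basis)
    return numerator / denominator

# Synthetic surfaces: the "other terms" plus a known two-fold dihedral and an offset
phis = list(range(0, 360, 30))
mm_rest = [0.5 * (1.0 + math.cos(math.radians(p))) for p in phis]
target = [m + 1.5 * (1.0 + math.cos(math.radians(2 * p - 180.0))) + 0.3
          for m, p in zip(mm_rest, phis)]

k_fit = fit_dihedral_k(phis, target, mm_rest, n=2, delta_deg=180.0)
```

With the synthetic data the fit recovers the force constant of 1.5 that was used to build the target surface.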
Creation of full compound
1) Obtain indole and phenol from top_all22_model.inp
2) Rename phenol atom types to avoid conflicts with indole (add P)
3) Delete model 1 terminal methyls and perform charge adjustments
   i) Move HZ2 charge (0.115) into CZ2 (-0.115 -> 0.000) and move total charge on deleted C1 methyl (0.00) onto CZ2 (0.00 -> 0.00)
   ii) Move HPG charge (0.115) into CPG (-0.115 -> 0.000) and move total charge on the C6 methyl (0.18) onto CPG (0.00 -> 0.18)
4) Add parameters by analogy (use CHARMM error messages)
5) Generate IC table (IC GENErate)
6) Generate cartesian coordinates based on IC table (check carefully!)
[Figure: full compound with atoms CZ2, C2, C5 and CPG and the C1H3 and C6H3 methyls labeled]
Check novel connectivities
1) dihedrals!
2) bonds
3) angles
MP2/6-31G* versus HF/6-31G*
Addition of simple functional groups is generally straightforward once the full compound parameters have been optimized.
Lead Optimization
1) Delete appropriate hydrogens (i.e. at site of covalent bond)
2) Shift charge of deleted hydrogen into carbon being functionalized.
3) Add functional group
4) Offset charge on functionalized carbon to account for functional group charge requirements
   1) Aliphatics: just neutralize added functional group, qH=0.09
   2) Phenol OH: qC=0.11, qO=-0.54, qH=0.43
   3) Aliphatic OH: qC=-0.04, qO=-0.66, qH=0.43
   4) Amino: qC=0.16, qCH=0.05, qN=-0.30, qH=0.33
   5) Carboxylate: qC=-0.37, qCO=0.62, qO=-0.76
5) Internal parameters should be present. Add by analogy if needed.
6) Optimize necessary parameters.
Perform above via the CHARMM PATCH (PRES) command
Summary
1) Junk in, junk out: Parameter optimization effort based on application requirements.
2) Follow standard protocol for the force field of interest (higher level QM is not necessarily better).
3) Careful parameter optimization of lead molecules
4) Simple substitutions often require minimal or no optimization.
Acknowledgements
Daxu Yin
Ijen Chen
Nicolas Foloppe
Nilesh Banavali
NIH
DoD HPC
NPACI
PSC Terascale Computing
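Class I potential energy function
Amber, CHARMM, GROMOS, OPLS
$$V_T = \sum_{bonds} K_b (b - b_0)^2 + \sum_{angles} K_\theta (\theta - \theta_0)^2 + \sum_{dihedrals} K_\phi \left[1 + \cos(n\phi - \delta)\right] + \sum_{1,3\ pairs} K_{ub} (S - S_0)^2 + \sum_{improper} K_w (w - w_0)^2 + \sum_{nonbonded} \left\{ \varepsilon_{ij} \left[ \left(\frac{R_{min,ij}}{r_{ij}}\right)^{12} - 2\left(\frac{R_{min,ij}}{r_{ij}}\right)^{6} \right] + \frac{q_i q_j}{4\pi D r_{ij}} \right\}$$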