new building biomolecular models in amber · 2020. 9. 30. · workflow for building 1uf0 –pdb...

Building Biomolecular Models in AMBER

Concepts and practicalities

AMBER concepts

What you need to know to get started

The AMBER Suite of Programs

AMBERTOOLS is a free software toolkit for building,

running and analysing MD simulations of biomolecules.

antechamber: For parameterisation of non-standard

residues or small molecules.

xLeap: Associates a biomolecular structure with an MD

force-field and creates the atomistic model.

sander/sander.MPI: Runs MD on a single CPU core or, with the MPI version, on multiple cores.

pmemd/pmemd.MPI/pmemd.cuda: Runs MD on a single CPU core, on multiple cores (MPI version), or on a GPU (cuda version). Requires an AMBER licence.

cpptraj: Toolkit for manipulating and analysing MD

trajectories.

https://ambermd.org/Manuals.php


Using the AMBER Leap Module

There are two ways of using the Leap module:

tLeap

Command line only. Will read a list of Leap instructions

from a file, e.g.:

$ tleap -f leapscript

xLeap

Interactive GUI enabling you to visualise your protein as

you build it. You need to type in each command sequentially.

Using xLeap

The xLeap module connects the protein structure

with the AMBER force-field.

AMBER contains residue templates for standard

biological units (e.g. amino acid residues) that the

program uses to assign force-field parameters from

a pdb file. It understands chemistry!

xLeap produces two files, called “prmtop” (or “top”) and “rst”.

“prmtop” contains the parameters and connectivities.

“rst” contains the starting coordinates.

Histidines within AMBER

In crystal structures, histidine is normally denoted by HIS, because the protonation state is ambiguous. There are three residue templates for histidine:

HIE (default, H on the epsilon nitrogen)

HID (H on the delta nitrogen)

HIP (hydrogens on both nitrogens; this is positively charged).

[Figure: HIE and HID tautomers; HIP resonance structures]

Choosing your AMBER Force-field

AMBER force-fields:

The “Molecular mechanics force fields” chapter of the AMBER

manual provides a detailed description of each available force-field,

with their respective advantages and caveats (on pg 35 onwards for

AMBER20).

Specifying a Force-field within Leap

The following Leap script will load in the recommended protein/DNA/lipid/solvent and general components:

# Protein
source leaprc.protein.ff14SB
# DNA
source leaprc.DNA.bsc1
# Lipid
source leaprc.lipid17
# Water and ions
source leaprc.water.tip3p
# General
source leaprc.gaff2

Additional parameters for common co-factors (e.g. ATP/NADH), amino acid modifications (e.g. phosphoserine/histidine/threonine/tyrosine) and alternative solvent boxes (e.g. DMSO/chloroform) are available from:

http://research.bmh.manchester.ac.uk/bryce/amber/


Structures that Require Extra Set-up

Biomolecular complexes - correct position of TER flag in

pdb and force-field parameterisation (for ligands).

Disulphide bonds (covalent linkage can be added in Leap)

Co-ordinated metal ions (e.g. Zn/Mg/Mn). Chimera has a

metal centre builder.

Post-translational modifications (e.g. phospho-

serine/glycosylation).

Membrane proteins embedded in a bilayer

AMBER Demonstration

Workflow transcripts – for reference/clarification

AMBER commands to build 1UF0

Antechamber command to (very crudely) parameterise the

ligand:

Run antechamber using the bcc charge model. Ligand has a net charge of -1 (-nc flag). Use

gaff2 atom types (-at flag).

antechamber -i 6uf0_ligand.pdb -fi pdb -o 6uf0_ligand.mol2 -fo mol2 -c bcc -at gaff2 -nc -1

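Run parmchk2 to generate missing parameters. Because gaff2 atom types were requested from antechamber, select the gaff2 parameter set explicitly (-s flag).

parmchk2 -i 6uf0_ligand.mol2 -f mol2 -o 6uf0_ligand.frcmod -s gaff2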

Reliable force-field parameterisation for small molecules is a “dark

art”, and requires expertise. For example, you may wish to use the

Gaussian program for a better treatment of the QM (antechamber

uses a cheap semi-empirical method).

AMBER commands to build 1UF0

Leap commands to build the solvated complex:

# Load in the ff14SB, gaff2 and solvent force-fields
source leaprc.protein.ff14SB
source leaprc.gaff2
source leaprc.water.tip3p
# Load the ligand mol2 and frcmod
Q5V = loadmol2 6uf0_ligand.mol2
loadamberparams 6uf0_ligand.frcmod
# Load the pdb of the complex
pdb = loadpdb 6uf0_final_cleanH_HID.pdb
# Save the unsolvated prmtop/rst
saveamberparm pdb 6uf0_Q5Y.prmtop 6uf0_Q5Y.rst
# Neutralise with Na+
addions pdb Na+ 0
# Add a 10.0 Å water box
solvatebox pdb TIP3PBOX 10.0
# Add 28 Na+ ions
addions pdb Na+ 28
# Neutralise with Cl-
addions pdb Cl- 0
# Save the solvated prmtop/rst
saveamberparm pdb 6uf0_Q5Y_wat.prmtop 6uf0_Q5Y_wat.rst

AMBER Tip: To add excess salt (e.g. 150mM), note the number of water molecules added by Leap (e.g. 10424). Water is 55.5M. Therefore, calculate ((0.15/55.5) × num_waters). For 10424 waters, we need to add 28 excess ions.
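The arithmetic in the tip above can be sketched in a few lines of Python. This is only an illustration: the function name excess_ion_pairs is hypothetical, and rounding to the nearest whole ion is an assumption (the tip itself only gives the formula and the result).

```python
WATER_MOLARITY = 55.5  # mol/L, the value quoted in the tip


def excess_ion_pairs(num_waters, salt_molar=0.15):
    """Estimate how many excess ions give roughly salt_molar, from the water count.

    Uses (salt_molar / 55.5) * num_waters, rounded to the nearest integer
    (the rounding rule is an assumption, not part of the original tip).
    """
    return round(salt_molar / WATER_MOLARITY * num_waters)


# Worked example from the tip: 10424 waters at 150 mM
print(excess_ion_pairs(10424))
```

The result is the number you would pass to the second “addions pdb Na+ ...” command, before neutralising with Cl-.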

Workflow for Building 1UF0 – pdb preparation

1. Download pdb and PDB-REDO files. Compare the two, and note the changes during re-refinement. Assess the resolution of the structures, to know how much you can trust the reported sidechain/rotamer positions.

2. Load the complex into VMD and assess the contents of the pdb file. Create a new representation (“Graphics → Representations → Create Rep”), use the “Licorice” drawing method, and in the “Selected atoms” box type “not protein”. This will reveal everything in the file that is not your protein, and allow you to inspect your ligand.

3. Delete anything that is a buffer component and therefore an artefact of the crystallisation (e.g.

by manually editing your pdb file). You can generally delete the crystalline waters so long as

they do not form bridging interactions with the ligand. You must check this carefully in

VMD, as water bridges can be key to specificity.

4. Check whether your protein contains disulphide bonds. (If it does, refer to, for example,

https://ambermd.org/tutorials/pengfei/index.php).

5. Ensure that there is a TER flag between your protein and your ligand (and any ions or waters

you choose to retain).

6. Check for missing residues (Chimera is best for this, and links directly to the Modeller

program which will build missing loops).

7. Protonate your structure in Chimera (or with H++ http://biophysics.cs.vt.edu/). In Chimera

use “Tools → Structure Editing → AddH”. Make sure that the “also consider H-bonds”

option is switched on. Pay careful attention to any titratable groups on your ligand and their

neighbouring interactions which may favour a particular protonation state (e.g. induce a pKa

shift). Save this new pdb file.
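Steps 2 and 5 above can also be checked from a script. The sketch below is a minimal example, assuming a standard fixed-column pdb file; the function names, the PROTEIN_RESIDUES set and the three-line example are all hypothetical, and only protein to non-protein boundaries are checked.

```python
# Residue names treated as "protein" here; extend this set for your own system.
PROTEIN_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "CYX", "GLN", "GLU", "GLY", "HIS",
    "HID", "HIE", "HIP", "ILE", "LEU", "LYS", "MET", "PHE", "PRO", "SER",
    "THR", "TRP", "TYR", "VAL",
}


def pdb_line(record, serial, atom, res, chain, resseq):
    """Build the first 26 columns of an ATOM/HETATM record (for the example only)."""
    return f"{record:<6}{serial:>5} {atom:^4} {res:>3} {chain}{resseq:>4}"


def non_protein_residues(pdb_text):
    """Return the sorted residue names that are not in PROTEIN_RESIDUES."""
    names = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[17:20].strip()  # residue name, columns 18-20
            if name not in PROTEIN_RESIDUES:
                names.add(name)
    return sorted(names)


def missing_ter(pdb_text):
    """Return (resname, resnum) for non-protein residues that directly
    follow a protein atom with no TER record in between."""
    problems = []
    prev_is_protein = False
    for line in pdb_text.splitlines():
        if line.startswith("TER"):
            prev_is_protein = False
        elif line.startswith(("ATOM", "HETATM")):
            name = line[17:20].strip()
            is_protein = name in PROTEIN_RESIDUES
            if prev_is_protein and not is_protein:
                problems.append((name, line[22:26].strip()))
            prev_is_protein = is_protein
    return problems


# Tiny made-up example: ligand Q5V follows the protein without a TER record.
EXAMPLE = "\n".join([
    pdb_line("ATOM", 1, "CA", "ALA", "A", 1),
    pdb_line("HETATM", 2, "C1", "Q5V", "A", 301),
    "TER",
    pdb_line("HETATM", 3, "O", "HOH", "A", 401),
])
print(non_protein_residues(EXAMPLE))
print(missing_ter(EXAMPLE))
```

This does not replace the visual inspection in VMD; it only gives a quick list of what to look at.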


Workflow for Building 6UF0 - xLeap

1. Execute all of the leap commands up to and including the step where the pdb file of the complex is loaded in.

2. Type “list” to check that your ligand (Q5V) has been read into the residue template list.

3. You will see that xLeap adds a new atom to HIE residue number 436, in this case. By typing “edit pdb” you will be able to identify where this atom is, and understand where the problem has come from. You will see that this residue should be of type HID, not HIE, which is the default state assumed for the HIS residue type. This deviation from the default occurs in this case because Chimera has identified a backbone H-bond interaction with the carbonyl oxygen of ILE 435.

4. Close xLeap and edit the pdb file to change the name of residue 436 from HIS to HID.

Notice that there are multiple other HIS in the pdb file that are in the default, HIE state.

5. Repeat step 1. You should see that there are now no xLeap errors. Save the parameter and restart files.

6. Type “charge pdb” to find out the charge on your molecule (in this case -4).

7. Neutralise with counterions.

8. Add waters, and note the number of water molecules added.

9. Use the number of waters added to calculate the number of excess ions you require to

obtain the salt concentration required (generally around 150mM).

10. Add the necessary number, then neutralise. Save solvated parameter and restart files.

AMBER tip: When AMBER gives you warnings, it is most often telling you about things it has fixed. If it will save a topology file, this normally means that everything is ok. Try this first (e.g. “saveamberparm”), and then problem-solve if it refuses to write out topology/parameter information.
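The pdb edit in step 4 (renaming residue 436 from HIS to HID) can be done by hand in a text editor, or with a short script. The following is a sketch only, assuming a standard fixed-column pdb file; rename_residue is a hypothetical helper, not part of AMBER.

```python
def rename_residue(pdb_text, resseq, new_name, old_name="HIS", chain=None):
    """Rename one residue (e.g. HIS 436 -> HID) in ATOM/HETATM records.

    Columns follow the fixed-width pdb layout: residue name in columns
    18-20, chain in column 22, residue number in columns 23-26.
    """
    out = []
    for line in pdb_text.splitlines():
        if (line.startswith(("ATOM", "HETATM"))
                and line[17:20] == old_name
                and line[22:26].strip() == str(resseq)
                and (chain is None or line[21] == chain)):
            line = line[:17] + f"{new_name:>3}" + line[20:]
        out.append(line)
    return "\n".join(out) + "\n"


# Made-up two-atom example: only residue 436 is renamed.
example = (
    "ATOM   3401  ND1 HIS A 436\n"
    "ATOM   1200  ND1 HIS A 150\n"
)
print(rename_residue(example, 436, "HID"), end="")
```

Histidines left as HIS are read by Leap as the default HIE state, so only the residues that deviate from the default need renaming.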

Workflow for Building 1UF0 – run and visualise

1. Equilibrate your structure using pmemd.MPI on a parallel CPU machine. The AMBER CPU version is more numerically stable, and is recommended for equilibration.

2. Perform your production run using pmemd.cuda on your GPU. Note that implicitly

solvated GB/SA simulations are not compatible with the GPU version of AMBER, and

require parallel CPU.

3. Use “cpptraj” to catenate your trajectories (if you have restarted your production run

multiple times). If you have a biomolecular complex, you may need to use the

“image” command to ensure that both biomolecules are located in the same periodic

box. Remove water and counterions (unless you are specifically interested in ion or

solvation shells around your protein, which you may well be).

4. Read your “.prmtop” and “.nc” files into VMD to visualise your trajectory. Be careful that there is the same number of atoms in your trajectory (.nc) and topology file, or your visualisation will be a horrible mess (e.g. using a solvated topology file but an unsolvated trajectory output by “cpptraj”, or vice versa, gives dramatically terrifying results!!)

5. Inspect your ligand/protein interactions very carefully. Do the key interactions you

observed in the original pdb file persist? Are there any problems with regions of the

protein changing shape during the trajectory? Are there any other closely homologous

protein/ligand interactions in the literature that you can compare with? History repeats

itself in structural biology, and key motifs occur repeatedly in unexpected places, so

these comparisons can be invaluably helpful.

[Figure annotations: the diamond shows the δ-hydrogen correctly added into the pdb file by Chimera (but currently unbonded). The default ε-hydrogen associated with the HIS residue name is bonded in accordance with the residue template. The backbone carbonyl oxygen forms an H-bond with the HID δ-hydrogen.]

AMBER cpptraj commands to process 1UF0

cpptraj commands to image then dehydrate the trajectory:

# Read in the solvated trajectory (containing 500000 structures), keeping every 100th conformer
trajin 6uf0_Q5Y_watmd9.x 1 500000 100
# Put all periodic images back into the principal simulation box
autoimage
# Remove water molecules
strip :WAT
# Remove global translation and rotation of the protein (residues 1-287)
rms first :1-287
# Write out new trajectory
trajout image_6uf0_Q5Y_ions.nc
go

How to make the corresponding prmtop file for visualisation:

# Remove the waters from the topology file
parmstrip :WAT
# Write out your new topology file
parmwrite out 6uf0_Q5Y_ions.prmtop
go

Run cpptraj on your solvated prmtop/trajectory with:

$ cpptraj -p 6uf0_Q5Y_wat.prmtop -i myptrajscript
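It is worth checking how many frames the trajin line (start 1, stop 500000, offset 100) actually keeps before you load the result into VMD. A small sketch of that arithmetic, assuming the stop frame is inclusive; frames_read is a hypothetical helper, not a cpptraj function.

```python
def frames_read(start, stop, offset=1):
    """1-based frame numbers kept by 'trajin <file> start stop offset'."""
    return range(start, stop + 1, offset)


frames = frames_read(1, 500000, 100)
print(len(frames), frames[0], frames[-1])
```

The dehydrated trajectory therefore holds a hundredth of the original frames, which is usually plenty for visual inspection.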

Visualisation of 1UF0 Trajectory

VMD visualisation showing the protein (new cartoon), the ligand (licorice), residues

containing an atom within 5Å of the ligand (lines) and Na+ (blue VdW) and Cl- (red VdW).

AMBER practical

Build and run a peptide MD simulation

Keep your Workspace Tidy!

Make a directory for your current work.

$ mkdir peptide_model

Enter that directory.

$ cd peptide_model

It is important to keep different simulations in

different directories or you will get into a horrible

mess.

Building Peptides in xLeap

Start xLeap:

$ xleap

You should see the xLeap Universe Editor window.

For this practical, we only need the protein force-field (leaprc.protein.ff14SB).

> source leaprc.protein.ff14SB

To see which molecules you have available in ff14SB, type:

> list

You should see a list of amino acid residues and their C- and

N-terminal counterparts.

To build your peptide, think of ~5 amino acids (substitute RES with the amino acid of your choice; they can all be different!!). You need the N-terminal (NXXX) and C-terminal (CXXX) residue templates at the ends of the peptide to correctly chemically cap the ends:

> peptide = sequence {NRES RES RES RES CRES}

Use the edit command to visualise your molecule:

> edit peptide

Close the editor window using the drop down menu.

Now save your molecule:

> savepdb peptide peptide.pdb

> saveamberparm peptide peptide.prmtop peptide.rst

Playing with xLeap

Look at your molecule again:

> edit peptide

Now you have saved your files you can play with the

select, draw, erase commands. Play with the mouse

buttons and work out how to rotate the molecule and

zoom in and out.

Select a SMALL group of atoms. Use the drop-down

menu to edit selected atoms. What do you see? What do

the columns mean?

Try relax selection, check unit, calculate net charge.

Close the molecule editor and close xLeap.

MD Simulations in Implicit Solvent

In this example, we perform an implicitly solvated simulation of our peptide using the GB/SA (Generalised Born/Surface Area) model (pg 67 of the AMBER20 manual).

BEWARE!!

For simplicity, we have used the igb = 1 option, which may not be the best choice for your system. Please refer to the relevant section of the manual for a detailed discussion (Section 4.1, pg 69-70 of the AMBER20 manual).

GB/SA models are often used in post-processing to

calculate interaction energies.

When used for MD, conformational sampling is enhanced

because of the absence of solvent damping.

Running MD with Sander (or PMEMD)

To run sander you need:

1) The topology (prmtop) and input coordinate (rst) files

2) Sander input files (for implicit solvent just min1.in, min2.in, md1.in, md2.in, md3.in)

3) A shell script which tells the computer to run sander.

If you are fortunate enough to have a GPU, replace sander with pmemd.cuda (not available in AmberTools - sorry)

Example sander command, with the role of each file:

l=rst
f=min1
sander -O -i $f.in -o $f.out -inf $f.inf -c peptide.$l -ref peptide.$l -r peptide.$f \
       -p peptide.prmtop -x peptide$f.nc -e peptide$f.ene

-i: Input file (e.g. min1.in)

-o/-inf: Output files (e.g. min1.out/min1.inf)

-c/-ref: Input coordinates file (e.g. peptide.rst), also used here as the reference structure for restraints

-r: Restart file from this run (e.g. peptide.min1)

-p: Topology file

-x: Output trajectory file

-e: Output file containing energy information

Equilibration

1. Energy minimisation (with then without

restraints).

2. Heat molecule from 0 to 300K (with

restraints).

3. Restrained MD at 300K

4. Production run

Equilibration is essential to obtaining a stable trajectory.

Running Sander (Implicit Solvent)

Put your input files in your working directory. You need

the .in/.sh/.prmtop and .rst files.

You will first need to make your script that runs sander

executable in Linux with chmod.

In the gbsa_peptide_run.sh script, you need to make

sure that the filenames for your ****.prmtop and

****.rst files are the same as those you have in your

working directory or AMBER won’t find them!

$ chmod +x gbsa_peptide_run.sh    # Make the script executable
$ ./gbsa_peptide_run.sh           # Execute the script
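The contents of gbsa_peptide_run.sh are not reproduced in these slides, so the following is only a sketch of what such a script plausibly does: run the five stages in order, each one starting from the restart file written by the previous stage. The chaining, the stage list and the function name sander_commands are assumptions based on the sander command and file names shown earlier (peptide.rst, peptide.min1, peptidemd3.nc).

```python
STAGES = ["min1", "min2", "md1", "md2", "md3"]


def sander_commands(stages=STAGES, system="peptide", first_coords="rst",
                    engine="sander"):
    """Return one argument list per stage, chaining the restart files.

    Stage n reads <system>.<previous stage> and writes <system>.<stage>;
    the first stage reads <system>.<first_coords>.
    """
    commands = []
    previous = first_coords
    for stage in stages:
        commands.append([
            engine, "-O",
            "-i", f"{stage}.in", "-o", f"{stage}.out", "-inf", f"{stage}.inf",
            "-c", f"{system}.{previous}", "-ref", f"{system}.{previous}",
            "-r", f"{system}.{stage}",
            "-p", f"{system}.prmtop",
            "-x", f"{system}{stage}.nc", "-e", f"{system}{stage}.ene",
        ])
        previous = stage
    return commands


# Print the commands rather than running them, so you can inspect them first.
for command in sander_commands():
    print(" ".join(command))
```

Each list could be passed to subprocess.run once you are happy that the file names match those in your working directory.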

Implicit Solvent MD input file for Sander

Here is a specimen input file for an implicitly

solvated MD run (time in ps = nstlim × dt).

How long will this simulation run for?

How many MD snapshots will it output?

Production MD run at 300K

 &cntrl
  ntc=2,         ! Enable SHAKE to constrain all bonds involving hydrogen
  ntf=2,         ! Do not calculate forces for SHAKE-constrained bonds
  cut=12.0,      ! Nonbonded cutoff distance in Angstroms (for PME, limit of the direct space sum - do NOT reduce this below 8.0.)
  igb=1,         ! Pairwise generalized Born (implicit) solvent
  gbsa=1,        ! Carry out generalised Born/surface area simulations
  saltcon=0.1,   ! Set concentration of 1-1 mobile counterions
  ntpr=500,      ! Print to the Amber mdout output file every ntpr steps
  ntwx=500,      ! Write to the Amber trajectory file every ntwx steps
  nstlim=500000, ! Number of MD steps in run (nstlim * dt = run length in ps)
  dt=0.002,      ! Time step in picoseconds (ps). The time length of each MD step
  ntt=1,         ! Temperature control with the Berendsen weak-coupling thermostat (Langevin dynamics would be ntt=3)
  temp0=300.0,   ! Target thermostat temperature in K
  ntx=5,         ! Read coordinates and velocities from the restart (inpcrd) file
  irest=1,       ! Restart previous MD run [This means velocities are expected in the inpcrd file and will be used to provide initial atom velocities]
  nscm=1000,     ! Remove translational and rotational center-of-mass movement at regular intervals
 /
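The two questions above follow directly from nstlim, dt and ntwx. A minimal worked example (run_summary is a hypothetical helper; it assumes one snapshot is written every ntwx steps):

```python
def run_summary(nstlim, dt, ntwx):
    """Return (run length in ps, number of trajectory snapshots)."""
    length_ps = nstlim * dt      # time in ps = nstlim x dt
    snapshots = nstlim // ntwx   # one frame written every ntwx steps
    return length_ps, snapshots


length_ps, snapshots = run_summary(nstlim=500000, dt=0.002, ntwx=500)
print(f"{length_ps:.0f} ps ({length_ps / 1000:.0f} ns), {snapshots} snapshots")
```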

Visualising the Results

Visualising your trajectories is the most effective way to

assess if anything obvious has gone wrong.

Start VMD and read in your topology (e.g. peptide.prmtop)

file (this is of file type amber7parm) and your trajectory file

(e.g. peptidemd3.nc) (of file type amber netcdf).

Watch your dynamics!!!

You cannot currently view AMBER netcdf (.nc)

trajectories with VMD in Windows.

You can change the format of the trajectory file output by AMBER using the

“ioutfm” flag in sander/pmemd. Or you could post-process the trajectory with

cpptraj and write it out in a different format (e.g. mdcrd).

“Repeat” MD Simulations

One way to perform an independent “repeat” of a

simulation is to reassign the velocities at a chosen

point in the trajectory.

These input flags change! Use ntx=5 and irest=1 for a restart where velocities are retained, or ntx=1 and irest=0 to assign new velocities.

Assign new velocities then run MD at 300K

 &cntrl
  ntc=2,         ! Enable SHAKE to constrain all bonds involving hydrogen
  ntf=2,         ! Do not calculate forces for SHAKE-constrained bonds
  cut=12.0,      ! Nonbonded cutoff distance in Angstroms (for PME, limit of the direct space sum - do NOT reduce this below 8.0.)
  igb=1,         ! Pairwise generalized Born (implicit) solvent
  gbsa=1,        ! Carry out generalised Born/surface area simulations
  saltcon=0.1,   ! Set concentration of 1-1 mobile counterions
  ntpr=500,      ! Print to the Amber mdout output file every ntpr steps
  ntwx=500,      ! Write to the Amber trajectory file every ntwx steps
  nstlim=250000, ! Number of MD steps in run (nstlim * dt = run length in ps)
  dt=0.002,      ! Time step in picoseconds (ps). The time length of each MD step
  ntt=1,         ! Temperature control with the Berendsen weak-coupling thermostat (Langevin dynamics would be ntt=3)
  temp0=300.0,   ! Target thermostat temperature in K
  ntx=1,         ! Read coordinates but NOT velocities from the restart (inpcrd) file
  irest=0,       ! Assign new velocities
  nscm=1000,     ! Remove translational and rotational center-of-mass movement at regular intervals
 /
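If you want to derive a repeat input from an existing restart input rather than edit it by hand, the two velocity flags can be switched with a short script. This is a sketch under the assumption that each flag sits on its own line as in the specimen files; set_flag and make_repeat_input are hypothetical helpers.

```python
import re


def set_flag(namelist_text, flag, value):
    """Replace the value of one 'flag=value,' entry, keeping any comment."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(flag)}[ \t]*=[ \t]*)[^,\s!]+", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + str(value),
                                   namelist_text)
    if count != 1:
        raise ValueError(f"expected exactly one '{flag}' entry, found {count}")
    return new_text


def make_repeat_input(namelist_text):
    """Switch from 'restart with velocities' to 'assign new velocities'."""
    return set_flag(set_flag(namelist_text, "ntx", 1), "irest", 0)


SPECIMEN = """ &cntrl
  ntx=5,    ! Read coordinates and velocities
  irest=1,  ! Restart previous MD run
 /
"""
print(make_repeat_input(SPECIMEN))
```

The comments after each flag are left untouched, so remember to update them if you keep the generated file.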

Run a Repeat MD Simulation

Make a new directory, and copy the files you need into here.

Running on from your restart file from md2.in (e.g. peptide.md2), run an independent repeat of your previous simulation by asking md3.in to reassign a new set of velocities (e.g. use md3_repeat.in).

You can use the file “gbsa_peptide_run_repeat.sh” to help you.

Call this trajectory something different (e.g. peptide_repeatmd3.nc).

Compare the two trajectories in VMD, and convince

yourself that they are different.

For example, you could plot a graph of the end to end

distance of the peptide in the two simulations.
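The end-to-end distance itself is just the distance between one atom at each end of the peptide in every frame. The sketch below assumes you have already extracted the coordinates (for example with cpptraj or a trajectory library, not shown here) as a list of frames, each a list of (x, y, z) tuples in Å; the function name and the choice of end atoms are hypothetical.

```python
import math


def end_to_end_distances(frames, first_atom=0, last_atom=-1):
    """Distance between two chosen atoms in every frame.

    frames: list of frames, each a list of (x, y, z) coordinates.
    first_atom / last_atom: indices of the atoms that define the two ends.
    """
    return [math.dist(frame[first_atom], frame[last_atom]) for frame in frames]


# Two made-up frames of a three-atom "peptide"
toy_frames = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 2.0)],
]
print(end_to_end_distances(toy_frames))
```

Plotting the two resulting lists against frame number for the original and repeat runs should show that the trajectories diverge.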

AMBER practical appendix

Most Common Simulation Problems

Very Common Simulation Problems

P: When I look at my trajectory in VMD, there

are funny lines all over the place!

S: The topology file you have loaded does not

contain the same number of atoms as your

trajectory. You may be using a topology file

with water, when your trajectory has been

dehydrated.

Alternatively, you have loaded the trajectory as “with periodic box” when no box information is present, or vice versa.
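A quick way to test the first explanation is to read the atom count from the topology file and compare it with the number of atoms in your trajectory. The sketch below relies on the first integer of the POINTERS section of a prmtop file being the number of atoms (NATOM); natom_from_prmtop is a hypothetical helper and the example text is a made-up fragment, not a complete topology.

```python
def natom_from_prmtop(prmtop_text):
    """Return NATOM, the first integer in the %FLAG POINTERS section."""
    lines = prmtop_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("%FLAG POINTERS"):
            for data in lines[i + 1:]:
                if data.startswith("%"):  # skip %FORMAT / %COMMENT lines
                    continue
                return int(data.split()[0])
    raise ValueError("no POINTERS section found")


# Made-up fragment of a topology file
FRAGMENT = """%FLAG TITLE
%FORMAT(20a4)
default_name
%FLAG POINTERS
%FORMAT(10I8)
    4650      14    2300    2380    5300    3200   10500    7900       0       0
"""
print(natom_from_prmtop(FRAGMENT))
```

If this number differs from the atom count reported for your trajectory (for example by cpptraj), you are pairing the wrong prmtop with it.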

P: I built my molecule in leap, but when I try and

run a simulation it explodes!

S: You have a bad starting configuration, probably with some nasty VdW clashes (e.g. atoms on top of one another).

You may inadvertently have multiple conformers in your file (e.g. for

sidechains) which are sitting on top of each other. Delete all but one

of these and try again!!

Otherwise, use the visualisation package Chimera to identify clashes (“Tools → Surface/Binding Analysis → Find Clashes/Contacts”), and either change the way you build your molecule or relax that section separately in Chimera (very slow), in xLeap/sander or, better, in both.
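Multiple conformers are flagged in a pdb file by the alternate-location character in column 17, so they are easy to find before you build. A minimal sketch, assuming a standard fixed-column pdb file; alternate_locations is a hypothetical helper and the example lines are made up.

```python
def alternate_locations(pdb_text):
    """Map (chain, residue number, residue name) to the set of
    alternate-location labels found for that residue."""
    found = {}
    for line in pdb_text.splitlines():
        if (line.startswith(("ATOM", "HETATM"))
                and len(line) > 26 and line[16] != " "):
            key = (line[21], line[22:26].strip(), line[17:20].strip())
            found.setdefault(key, set()).add(line[16])
    return found


# Made-up example: SER 10 has two conformers (A and B), ALA 11 has one.
example_lines = [
    "ATOM  " + "    1" + " " + " CB " + "A" + "SER" + " " + "A" + "  10" + "    ",
    "ATOM  " + "    2" + " " + " CB " + "B" + "SER" + " " + "A" + "  10" + "    ",
    "ATOM  " + "    3" + " " + " CB " + " " + "ALA" + " " + "A" + "  11" + "    ",
]
print(alternate_locations("\n".join(example_lines)))
```

Any residue listed here needs all but one conformer deleted before the file is read into Leap.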

P: When I run a simulation of DNA (or

drug/protein complex), I find that one of the

molecules jumps out of the box!!

S: AMBER has saved the coordinates of different

periodic boxes for the different parts of your

complex. You can fix this with the “image”

command in “cpptraj” (but it can be fiddly!)

P: When I read my pdb file into Leap, AMBER complains that it doesn’t recognise the atom names. It also adds lots of atoms that I didn’t expect.

S: AMBER requires very specific atom names to compare with its residue templates. If those in your pdb file are not AMBER compatible, it won’t understand them. You can either use “addPdbAtomMap” in Leap or change them in the original pdb file.

P: When I read my drug-DNA complex/ phosphorylated

protein etc that I downloaded from the pdb into xLeap, it

won’t save a topology file because there are “missing

parameters”.

S: The most usual cause is a missing TER flag between your ligand

and your protein/metal ions etc. AMBER tries to bond these when

the TER is missing, but will be unable to find the relevant parameters

because they are not chemically relevant.

If this is not the problem, then maybe you are trying to run a simulation of a residue that is not standard, and AMBER does not know the parameters by default. You need to look and see if there are AMBER parameters available for the simulation you are trying to perform. If not, you need to calculate your own with antechamber/Gaussian etc. You might be able to use gaff2 (this is a dirty solution!)
