Tutorial 1
source leaprc.ff14SB load parameters for the AMBER ff14SB force field, which by
default also loads the parameters for the TIP3P water model
wliq = loadpdb wat_box.pdb load the PDB file with the water box coordinates into a
unit called wliq
setbox wliq vdw calculate the dimensions of the periodic box by finding the minimum and
maximum coordinates of the water box system and adding the applicable van der Waals radii
as a buffer
We can now run tleap to create the desired files from the command line:
tleap -f setup_water.leap
At this point it is worthwhile to create a directory structure that will help us in keeping various
things organized. Here is one suggestion:
mkdir analysis mdcrd mdin mdout rst
This creates five subdirectories inside the tutorial directory. All of the popular MD packages,
Amber included, create many different output files, and it is a good idea to organize things so
that you can find them easily.
Minimization
In general energy minimization is a necessary prerequisite to running a molecular dynamics
simulation. If a simulation is started while the system is in a particularly unfavorable
configuration, it will almost surely crash due to numerical instabilities caused by large forces. In
this case minimization is essential because our water box was built without any regard to
relative molecular orientations (although the program that created the box does respect
intermolecular spacing).
To minimize this system we will use the Amber executable pmemd.MPI, a parallel version of the
pmemd MD program. (Quick aside: AmberTools is distributed with sander, a free executable
that can carry out all of the same calculations and in fact has greater functionality than pmemd,
which is distributed with the non-free Amber package. However, pmemd is roughly 2x faster than
sander, and even more in some cases, so we will use it extensively here.)
Using a text editor create a file mdin/min.in with the following content:
liquid minimization
&cntrl
imin = 1,
ntb = 1, cut = 9.0,
ntc = 1, ntf = 1,
maxcyc = 1000, ncyc = 500
/
The first line of the file is a title line. Pretty much anything can be written here. Beyond that,
here is a brief translation of this file:
&cntrl this specifies that everything that follows belongs to the cntrl namelist (a list of
parameters for controlling the simulation)
imin = 1 flag for doing minimization (vs. real MD)
ntb = 1 use constant volume periodic boundaries
cut = 9.0 set the real space cut-off for both electrostatics and van der Waals forces to
9.0 Å
ntc = 1 SHAKE is not used to constrain any bond lengths
ntf = 1 compute all bonded interactions (needed when SHAKE is not used)
maxcyc = 1000 the maximum number of minimization steps to be carried out
ncyc = 500 carry out steepest descent minimization for the first 500 steps, followed
by conjugate gradient minimization (provided that ntmin = 1, which is the default if not
otherwise specified)
We can now perform the energy minimization from the command line (type everything below on
the same line):
ibrun -np 16 pmemd.MPI -O -i mdin/min.in -o mdout/wat_box_min.out -p
wat_box.prmtop -c wat_box.inpcrd -r rst/wat_box_min.rst
This tells pmemd.MPI to run on 16 cores (the max number in each node) using the specified
input script (-i), prmtop file (-p), and inpcrd file (-c). It will output final coordinates to a restart file
(-r) and a variety of simulation output to a text file (-o). (Note: On the majority of computers the
command to run an executable in parallel is mpirun, not ibrun.)
Take a look inside mdout/wat_box_min.out to see what the output looks like. In general you
should find the potential energy (ENERGY) and root-mean-square gradient of the potential
energy (RMS) decreasing as the minimization progresses. (The RMS gradient provides some
sense of the average magnitude of the force acting on each atom.)
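To check the trend without scrolling through the file, the energy and RMS gradient can be pulled out with a short script. The following is a minimal Python sketch, assuming each minimization record consists of a header line beginning with NSTEP, ENERGY and RMS followed by a line holding the corresponding values. The function name and the sample text are made up for illustration; the real file may also repeat the final step in its summary section.

```python
def parse_min_output(text):
    """Return a list of (nstep, energy, rms) tuples found in minimization output text."""
    records = []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # Assumed header layout: NSTEP ENERGY RMS GMAX NAME NUMBER
        if line.split()[:3] != ["NSTEP", "ENERGY", "RMS"]:
            continue
        # The values are expected on the next non-blank line.
        for follow in lines[i + 1:]:
            values = follow.split()
            if values:
                records.append((int(values[0]), float(values[1]), float(values[2])))
                break
    return records


# Made-up sample in the assumed layout (not real simulation output).
sample = """\
   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -4.1000E+03     1.5000E+01     6.0000E+01     O         100

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
    250      -7.9000E+03     9.0000E-01     5.0000E+00     H1        221
"""

for nstep, energy, rms in parse_min_output(sample):
    print(nstep, energy, rms)
```

To use it on the actual output, read mdout/wat_box_min.out into a string and pass it to parse_min_output.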
Heating
Now we want to bring our simulation box up to the desired temperature. In this case we'll run
the simulation at room temperature (298.15 K). Create a new text file mdin/heat.in with the
following content:
heat liquid
&cntrl
imin = 0, irest = 0, ntx = 1,
ntb = 1, cut = 9.0,
ntc = 2, ntf = 2,
tempi = 100.0, temp0 = 298.15, ntt = 3, gamma_ln = 1.0, ig = -1,
nstlim = 15000, dt = 0.002,
ntpr = 250, ntwx = 250, ntwr = 5000,
ioutfm = 1, nmropt = 1,
/
&wt
type='TEMP0', istep1=0, istep2=10000, value1=100.0, value2=298.15
/
&wt
type='END'
/
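Note that many of the terms are the same as in our minimization script, but some have
different values. There are also many new terms. Here are brief descriptions of all of the terms
that are either new or changed:
imin = 0 run molecular dynamics rather than minimization
irest = 0, ntx = 1 start a new simulation (not a restart), reading only the coordinates
from the input coordinate file
ntc = 2, ntf = 2 use SHAKE to constrain the lengths of bonds involving hydrogen, and
omit the force evaluations for those bonds
tempi = 100.0 assign initial velocities corresponding to a temperature of 100.0 K
temp0 = 298.15 set the target temperature of the thermostat to 298.15 K
ntt = 3, gamma_ln = 1.0 use a Langevin thermostat with a collision frequency of 1.0 ps^-1
ig = -1 seed the random number generator from the current date and time
nstlim = 15000, dt = 0.002 run 15000 MD steps with a time step of 0.002 ps (30 ps in total)
ntpr = 250, ntwx = 250, ntwr = 5000 write energies to the output file and coordinates to
the trajectory file every 250 steps, and write a restart file every 5000 steps
ioutfm = 1 write the trajectory in binary NetCDF format
nmropt = 1 enable reading of the &wt namelists that follow the cntrl namelist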
The remaining lines are separate namelists that enable us to vary the thermostat settings on the
fly. In particular from steps 0 to 10000, we linearly increase the thermostat temperature from
100.0 to 298.15 K. Then from steps 10001 to 15000 we maintain the thermostat temperature at
298.15 K. With modern MD packages it is not clear that we need to be so conservative with our
heating protocol, but starting from a lower temperature can potentially improve simulation
stability for systems that are difficult to minimize.
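The heating schedule described above amounts to a linear interpolation followed by a plateau. As a small illustration, the Python sketch below computes the thermostat target at a given step; the function name is hypothetical and the default arguments simply mirror the values in the &wt namelist.

```python
def thermostat_target(step, istep1=0, istep2=10000, value1=100.0, value2=298.15):
    """Target temperature (K) at a given MD step for a linear ramp, then a plateau."""
    if step <= istep1:
        return value1
    if step >= istep2:
        # After the ramp ends, the target stays at the final value.
        return value2
    fraction = (step - istep1) / (istep2 - istep1)
    return value1 + fraction * (value2 - value1)


for step in (0, 2500, 5000, 10000, 15000):
    print(step, round(thermostat_target(step), 3))
```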
Run this NVT heating simulation with the following command:
ibrun -np 16 pmemd.MPI -O -i mdin/heat.in -o mdout/wat_box_heat.out -p
wat_box.prmtop -c rst/wat_box_min.rst -r rst/wat_box_heat.rst -x
mdcrd/wat_box_heat.mdcrd
Note that we use the minimized coordinates restart file as our starting coordinates (-c) and we
now also output a simulation trajectory file (-x), which contains the coordinates of all of the
atoms in our system (and optionally the velocities) at every 250th step of the simulation.
Equilibration (round #1)
Now that our system is up to temperature, we can bring it to the correct pressure/density. By
now you've hopefully gotten the hang of the organization scheme we're using. Create a new
text file called mdin/eq1.in with the following content:
equilibrate liquid (Berendsen barostat)
&cntrl
imin = 0, irest = 1, ntx = 5,
ntb = 2, cut = 9.0,
This input script contains only a few new or changed terms that are not self-explanatory:
irest = 1, ntx = 5 continue this simulation from a restart file, reading both the
coordinates and velocities from that restart file
ntb = 2 use constant pressure (i.e., variable volume) periodic boundaries
pres0 = 1.013, ntp = 1, barostat = 1, taup = 2.0 set the desired target pressure of the
simulation to 1.013 bar (1 atm) using an isotropic Berendsen barostat with a time
constant of 2.0 ps
iwrap = 1 ensure that all molecules are inside the original system box when writing
out restart and trajectory files
Let's analyze the output file to monitor the convergence of the density of the system. On the
command line, input the following command:
grep "Density" mdout/wat_box_eq1.out
The grep command outputs lines in a file (or files) that contain the expression in quotes.
(Strictly speaking, quotes aren't necessary if the expression is a single word, number, etc.) You
should see that the density trends toward ~1.0 g/cm³ and then fluctuates near that value, with
the exception of the last two numbers. (Why? Hint: Go look at the simulation output file!)
If we wanted to extract this data in a form that is useful for plotting, we could do
something like (type everything below on the same line):
grep "Density" mdout/wat_box_eq1.out | head -100 | awk '{print $3}' > analysis/eq1_density.dat
The awk command is quite versatile in parsing text files and even performing some basic data
manipulation. Here we've used it to print out the 3rd column of the grep output, which we've
truncated to include only the first 100 lines. You should now be able to plot the contents of the
resulting file (analysis/eq1_density.dat) using the tool of your choice. In this case it takes
roughly 20 ps for the system to reach the equilibrium density. (Note that the simulation output
is written every 250 steps or 0.5 ps.)
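The same extraction can be done in Python if you would rather keep the analysis in one place. The sketch below assumes, as the awk command does, that each density line contains the fields "Density", "=" and the value in that order. The function names and the sample text are made up for illustration, and the last two entries are dropped in the same spirit as the head command.

```python
def extract_densities(text, drop_last=2):
    """Collect the number that follows 'Density =' on each matching line."""
    values = []
    for line in text.splitlines():
        fields = line.split()
        if "Density" in fields:
            index = fields.index("Density")
            # Assumed layout: Density = <value>
            values.append(float(fields[index + 2]))
    keep = max(0, len(values) - drop_last)
    return values[:keep]


def sample_times(n_values, ntpr=250, dt=0.002):
    """Elapsed time (ps) of each record, given the output interval and time step."""
    return [(i + 1) * ntpr * dt for i in range(n_values)]


# Made-up sample lines (not real simulation output).
sample = """\
 Density    =         0.9512
 Density    =         0.9820
 Density    =         0.9700
 Density    =         0.0150
"""

densities = extract_densities(sample)
for t, rho in zip(sample_times(len(densities)), densities):
    print(t, rho)
```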
Equilibration (round #2)
Now that our system is at the correct temperature and pressure, we can switch the type of
barostat to one that is more suited to equilibrium conditions (and generates a rigorously correct
NPT ensemble!). Create a new text file mdin/eq2.in with the following content:
equilibrate liquid (Monte Carlo barostat)
&cntrl
imin = 0, irest = 1, ntx = 5,
ntb = 2, cut = 9.0,
pres0 = 1.013, ntp = 1, barostat = 2, mcbarint = 100,
ntc = 2, ntf = 2,
temp0 = 298.15, ntt = 3, gamma_ln = 1.0, ig = -1,
nstlim = 25000, dt = 0.002,
ntpr = 500, ntwx = 500, ntwr = 5000,
ioutfm = 1, iwrap = 1
/
This input script contains just two new terms in this one line: pres0 = 1.013, ntp = 1, barostat =
2, mcbarint = 100. Instead of using a Berendsen barostat, we have now switched to a Monte
Carlo barostat in which new system volumes are attempted every 100 steps.
Run this NPT simulation with the following command:
ibrun -np 16 pmemd.MPI -O -i mdin/eq2.in -o mdout/wat_box_eq2.out -p
wat_box.prmtop -c rst/wat_box_eq1.rst -r rst/wat_box_eq2.rst -x
mdcrd/wat_box_eq2.mdcrd
We won't take too much time to analyze this output, but it wouldn't hurt to do a quick sanity
check of the density to ensure that it is fluctuating about an equilibrium value and not drifting
(which would indicate that more equilibration is necessary). You can also see the outcomes of
the Monte Carlo barostat moves in the simulation output. On average, we would like these to be
accepted about 50% of the time for the most efficient sampling of phase space.
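One simple way to make the drift check concrete is to fit a straight line to the density series and to compare the means of its two halves. The Python sketch below is only one possible check, not a prescribed protocol. The function names are hypothetical and the input is assumed to be a list of densities already extracted from the output file.

```python
from statistics import mean


def drift_slope(values):
    """Least-squares slope of the values against their index (units per record)."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(values))
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    return sxy / sxx


def half_means(values):
    """Mean of the first half and mean of the second half of the series."""
    middle = len(values) // 2
    return mean(values[:middle]), mean(values[middle:])


# Made-up density values (g/cm^3) for illustration only.
densities = [0.985, 0.990, 0.984, 0.989, 0.986, 0.988]
print(drift_slope(densities))
print(half_means(densities))
```

A slope that is small compared with the size of the fluctuations, together with similar half means, suggests that the density is no longer drifting.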
Production
Now we are ready to do a production simulation from which we can get statistics that are
rigorously correct with regard to the NPT ensemble. Create a new text file mdin/prod.in with
the following content:
production simulation
&cntrl
imin = 0, irest = 1, ntx = 5,
ntb = 2, cut = 9.0,
pres0 = 1.013, ntp = 1, barostat = 2, mcbarint = 100,
ntc = 2, ntf = 2,
temp0 = 298.15, ntt = 3, gamma_ln = 1.0, ig = -1,
nstlim = 1000000, dt = 0.002,
ntpr = 500, ntwx = 500, ntwr = 50000,
ioutfm = 1, iwrap = 1
/
The main difference between this input script and mdin/eq2.in is that the length of the
simulation has been greatly extended (to 1.0 × 10⁶ steps or 2.0 ns); restart files are also
written less often (ntwr = 50000). In general it is a good idea for your final
equilibration and production scripts to be identical in their parameters aside from simulation
length and output frequencies.
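The relationship between the step count, time step, and output frequencies is simple arithmetic, sketched below in Python. The function name is hypothetical; the arguments are the values from the input scripts, with dt in picoseconds.

```python
def simulation_summary(nstlim, dt, ntpr, ntwx):
    """Simulated time and number of output records implied by the cntrl settings."""
    return {
        "length_ps": nstlim * dt,             # total simulated time in ps
        "energy_records": nstlim // ntpr,     # energy records written to the output file
        "trajectory_frames": nstlim // ntwx,  # frames written to the trajectory
    }


production = simulation_summary(nstlim=1000000, dt=0.002, ntpr=500, ntwx=500)
print(round(production["length_ps"] / 1000.0, 3), "ns")
print(production["trajectory_frames"], "frames")
```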
Run the production simulation with the following command:
ibrun -np 16 pmemd.MPI -O -i mdin/prod.in -o mdout/wat_box_prod.out -p
wat_box.prmtop -c rst/wat_box_eq2.rst -r rst/wat_box_prod.rst -x
mdcrd/wat_box_prod.mdcrd
This simulation will take several minutes to run. If you want an estimate of when it will
complete, open the file named mdinfo that is automatically created by pmemd. (You can control
the name and location of this file using the flag -inf. This will come in handy for later tutorials.)
Computing the average density
If you examine your production simulation output, you should find that the average density is
~0.987 g/cm³, which differs slightly from the known experimental value at these conditions
(0.997 g/cm³). Is this just a statistical fluke due to our finite sampling or a real anomaly? We
can use some basic data analysis techniques to find out.
First, extract the densities from the production simulation output file. (Remember to exclude the
last two densities, as these correspond to the mean and RMS values.) Next, we need to create
a program to compute the standard error of the mean density. Standard errors are generally
calculated as the sample standard deviation ($s_x$) divided by the square root of the number of
samples ($n$):
$$\mathrm{SE}_x = \frac{s_x}{\sqrt{n}}$$
This formula assumes that each sample (i.e., simulation snapshot) is statistically independent,
but it is important to note that successive observations taken from MD simulations can be
strongly correlated because the atomic positions at time $t + \Delta t$ depend on the positions
at time $t$. This is counteracted somewhat by the fact that we do not record observables at every
time step, but rather at every $j$th step (such that $n = t_{\mathrm{sim}}/j$), as specified in
our input script, where $t_{\mathrm{sim}}$ is the total simulation length (in steps) and $j$ is
the frequency at which output is written out (again in steps).
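As a starting point, the uncorrected standard error can be computed in a few lines of Python. This sketch treats every snapshot as independent, so for correlated MD data it should be read as a lower bound on the true uncertainty. The function name and the sample values are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def naive_standard_error(samples):
    """Sample standard deviation divided by sqrt(n), assuming independent samples."""
    return stdev(samples) / sqrt(len(samples))


# Made-up density values (g/cm^3) for illustration only.
densities = [0.985, 0.990, 0.984, 0.989, 0.986, 0.988]
print(mean(densities), naive_standard_error(densities))
```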
We therefore need a method to estimate how many independent samples of the density we
actually have, and one way to do this is by determining the characteristic timescale of the
autocorrelation function of the density. We will not cover the theory here (more can be found in
Frenkel and Smit's book, among others), but the standard error for simulation observables can
be estimated as:
$$\mathrm{SE}_x \approx \frac{s_x}{\sqrt{n / (2 n_{\mathrm{corr},x})}}$$
where $n_{\mathrm{corr},x}$ is the characteristic number of snapshots for the autocorrelation in
the observable of interest to decay to (approximately!) zero. (If you are unfamiliar with
autocorrelation functions, there is a nice explanation available on Wikipedia, as well as in the
free energy simulations review by Shirts and Mobley that is part of our Suggested Reading list.)
In practice $n_{\mathrm{corr},x}$ is usually calculated by assuming that the autocorrelation
function has a simple exponential form:
$$\mathrm{ACF}_x(\tau) \approx e^{-\tau / n_{\mathrm{corr},x}}$$
where $\tau$ is the lag between snapshots.
Therefore, by fitting a function of this form to the autocorrelation function, we can estimate this
characteristic timescale. Because this analysis is nontrivial, we have placed an example
IPython notebook on the workshop website (water_analysis.ipynb) that uses the scipy and
numpy libraries to carry out such an analysis. (Don't worry, you'll get a chance to adapt it for
the next task!)
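For readers who want to see the idea before opening the notebook, here is a minimal sketch that uses only the Python standard library. It is not the workshop notebook: instead of a nonlinear curve fit it fits a straight line through the origin to the logarithm of the autocorrelation function, using only lags before the function first falls below a cutoff. The function names, the 0.05 cutoff, and the rule that the effective sample count never exceeds n are all assumptions made for illustration.

```python
from math import exp, log, sqrt
from statistics import mean, stdev


def autocorrelation(samples, max_lag):
    """Normalized autocorrelation for lags 0..max_lag (in snapshots); needs max_lag < n."""
    n = len(samples)
    x_bar = mean(samples)
    dev = [x - x_bar for x in samples]
    variance = sum(d * d for d in dev) / n
    acf = []
    for lag in range(max_lag + 1):
        cov = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n - lag)
        acf.append(cov / variance)
    return acf


def fit_ncorr(acf, cutoff=0.05):
    """Fit ln(ACF) = -lag / n_corr through the origin; returns n_corr in snapshots."""
    lags, logs = [], []
    for lag, value in enumerate(acf):
        if lag == 0:
            continue
        if value <= cutoff:
            break  # stop at the first lag where the ACF has decayed into the noise
        lags.append(lag)
        logs.append(log(value))
    if not lags:
        return 0.0  # decays within one snapshot: effectively uncorrelated
    slope = sum(l * y for l, y in zip(lags, logs)) / sum(l * l for l in lags)
    if slope >= 0.0:
        raise ValueError("autocorrelation does not decay; cannot fit n_corr")
    return -1.0 / slope


def corrected_standard_error(samples, n_corr):
    """Standard error using n / (2 n_corr) effective samples (never more than n)."""
    n_eff = len(samples) / max(1.0, 2.0 * n_corr)
    return stdev(samples) / sqrt(n_eff)


# Check the fit on an exactly exponential ACF with a decay constant of 5 snapshots.
ideal_acf = [exp(-lag / 5.0) for lag in range(30)]
print(fit_ncorr(ideal_acf))
```

With real data, pass the extracted densities to autocorrelation, fit n_corr from the result, and then call corrected_standard_error with the same densities.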
Computing the enthalpy of vaporization
In addition to checking that our simulation obtains the correct density for liquid water at 298.15
K, you may also be curious about the accuracy of the intermolecular interactions. One way to
measure this is to calculate the enthalpy of vaporization, which is the energetic cost of
converting one mole of a substance from the liquid phase to the gas phase.
The enthalpy of vaporization can be calculated from MD simulations as:
cpptraj can be run interactively or with a script; we will choose the former option for now. At
cpptraj -p wat_box.prmtop
You will now see a prompt where you can type in commands. Enter the following:
trajin mdcrd/wat_box_prod.mdcrd
radial analysis/wat_OO_rdf.dat 0.05 8.0 @O volume
run
quit
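Once the file analysis/wat_OO_rdf.dat has been written, it can be inspected with a short script. The Python sketch below assumes the file is plain text with optional header lines starting with "#" followed by two whitespace-separated columns (distance and g(r)); that layout, the function names, and the sample values are assumptions made for illustration. The peak finder is a simple heuristic that returns the first local maximum above 1.0.

```python
def read_two_columns(text):
    """Parse two whitespace-separated numeric columns, skipping '#' and blank lines."""
    r_values, g_values = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        first, second = stripped.split()[:2]
        r_values.append(float(first))
        g_values.append(float(second))
    return r_values, g_values


def first_peak(r_values, g_values, threshold=1.0):
    """Return (r, g) at the first local maximum whose height exceeds the threshold."""
    for i in range(1, len(g_values) - 1):
        is_maximum = g_values[i] > g_values[i - 1] and g_values[i] >= g_values[i + 1]
        if is_maximum and g_values[i] > threshold:
            return r_values[i], g_values[i]
    return None


# Made-up coarse sample in the assumed layout (not real cpptraj output).
sample = """\
#Distance  g(r)
2.4  0.0
2.6  1.2
2.8  2.9
3.0  1.8
3.2  1.0
"""

r_values, g_values = read_two_columns(sample)
print(first_peak(r_values, g_values))
```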
Appendix: Creating PDB files of a single water molecule and water box
Use the molecule editor of your choice to save a PDB file of a single water molecule. One
editor that we recommend is Avogadro (available at http://avogadro.cc/wiki/Main_Page). In
Avogadro this task can be done in the following way:
(1) Select the Draw Tool (looks like a pencil).
(2) In the Element: drop-down menu, select Oxygen (8).
(3) Click anywhere in the black View window. You should see a water molecule appear. If
you see a lone oxygen atom, then undo this action, make sure the Adjust Hydrogens
box is checked, and click in the View window again.
(4) File → Save As → wat.pdb (into the tutorial directory)
Note: You will need to do some editing of this file with a text editor to make sure that tleap
recognizes this as a water molecule. In particular you must:
Change HETATM to ATOM (with two spaces after the word ends)
Change the residue number of the hydrogen atoms (5th column) from 0 to 1
Change the names of the two hydrogen atoms (3rd column) from H (with one space
after the letter) to H1 and H2, respectively
At room temperature, water has a density of ~1 g/cm³, but we want to convert this to molecules
per cubic ångström (molecules/Å³) so that we can determine an appropriate box size. Using the
following formula, we find:
1 g/cm³ × (10⁻⁸ cm/Å)³ × (6.022 × 10²³ molecules/mol) × 1/(18.015 g/mol) = 0.0334 molecules/Å³
Therefore we need a box of volume 640 molecules/(0.0334 molecules/Å³) ≈ 19200 Å³ to fit
these molecules at the appropriate density. This implies a side length of 26.8 Å, which we will
conservatively round up to 27 Å.
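The same arithmetic is easy to script if you want to size a box for a different molecule count or liquid. The following Python sketch reproduces the calculation above; the function names are hypothetical and the constants are the ones quoted in the text.

```python
AVOGADRO = 6.022e23        # molecules/mol, as used in the text
CM_PER_ANGSTROM = 1.0e-8   # 1 angstrom expressed in cm


def number_density(mass_density_g_cm3, molar_mass_g_mol):
    """Convert a mass density in g/cm^3 to molecules per cubic angstrom."""
    return mass_density_g_cm3 * CM_PER_ANGSTROM ** 3 * AVOGADRO / molar_mass_g_mol


def cubic_box(n_molecules, molecules_per_cubic_angstrom):
    """Return (volume, side length) of a cube holding n_molecules at the given density."""
    volume = n_molecules / molecules_per_cubic_angstrom
    return volume, volume ** (1.0 / 3.0)


rho = number_density(1.0, 18.015)
volume, side = cubic_box(640, rho)
print(round(rho, 4), round(volume), round(side, 1))
```

Because the script does not round the number density before dividing, its volume differs slightly from the rounded figure quoted above; the side length agrees to the precision given.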
We can then build the water box using any number of tools. One popular, free, and open-source
tool is called packmol (http://www.ime.unicamp.br/~martinez/packmol/). Once you have
downloaded and compiled this software, create a text file wat_box.inp with the following content:
tolerance 3.00
output wat_box.pdb
filetype pdb
add_amber_ter
structure wat.pdb
number 640
inside cube 0. 0. 0. 27.
end structure
You can then build the water box by typing the following at the command line:
$path-to-packmol/packmol < wat_box.inp
where $path-to-packmol should be replaced with the full path to wherever packmol has been
installed.
An alternative way to approach this problem is to use the built-in solvate function of tleap. If
you want to go this route, create and modify the wat.pdb file as directed above. Then we create
the following script to be executed by tleap:
source leaprc.ff14SB
wgas = loadpdb wat.pdb
saveamberparm wgas wat_gas.prmtop wat_gas.inpcrd
wliq = copy wgas
solvatebox wliq TIP3PBOX 12.11
saveamberparm wliq wat_box.prmtop wat_box.inpcrd
quit
Unfortunately it is not possible to specify the number of desired solvent molecules with tleap,
but creating a box that extends at least 12.11 Å from the starting water molecule will yield ~640
water molecules.