Tutorial Genesis
Release 1.0
RIKEN AICS
2015/05/08
Citation Information
J. Jung, T. Mori, C. Kobayashi, Y. Matsunaga, T. Yoda, M. Feig, and Y. Sugita, "GENESIS: A hybrid-parallel and multi-scale molecular dynamics simulator with enhanced sampling algorithms for biomolecular and cellular simulations", WIREs Computational Molecular Science (DOI: 10.1002/wcms.1220)
Copyright Notice
GENESIS is distributed under the GNU General Public License version 2.
Copyright 2014 RIKEN.
GENESIS is free software; you can redistribute it and/or modify it under the terms of the
GNU General Public License as published by the Free Software Foundation; either version
2 of the License, or (at your option) any later version.
GENESIS is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with GENESIS;
see the file COPYING. If not, write to the Free Software Foundation, Inc., 59 Temple
Place - Suite 330, Boston, MA 02111-1307, USA.
This package also contains the following software for convenience. Please note that
this software is not covered by the license under which a copy of GENESIS is licensed to you. Neither
composition nor distribution of any derivative work of GENESIS with this software violates the terms
of each license, provided that it meets every condition of the respective licenses.
This library is free software; you can redistribute it and/or modify it under the terms of the
GNU Library General Public License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU Library General Public License for more details. You should have received a copy of the GNU Library General Public License along
with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330,
Boston, MA 02111-1307 USA
Note:
There is another simulator named "GENESIS (General Neuron Simulation System)" (http://www.genesis-sim.org/), which has
a much longer development history.
[The Book of GENESIS]
http://www.amazon.com/The-Book-GENESIS-Exploring-SImulation/dp/0387949380/ref=sr_1_1?ie=UTF8&qid=1394425468&sr=8-1&keywords=bower+beeman
Our "GENESIS" has been developed independently to investigate conformational dynamics of proteins,
nucleic acids, biological membranes, and other biomolecules.
CONTENTS
1 Introduction
2 Getting Started
2.1 How to install GENESIS
2.2 Basic Usages
3 Available Programs
3.1 Simulators
3.2 Analysis tools
3.3 Parallel I/O tools
5 Energy
5.1 General keywords
5.2 Non-bonded interaction related keywords
5.3 Particle mesh Ewald related keywords
5.4 Lookup table related keywords
6 Dynamics
6.1 General keywords
6.2 Simulated annealing related keywords
7 Minimize
8 Constraints
9 Ensemble
10 Boundary
11 Selection
12 Restraints
13 REMD
CHAPTER
ONE
INTRODUCTION
GENESIS (Generalized-Ensemble Simulation System) is a suite of computer programs for carrying out
molecular dynamics (MD) simulations of biomolecular systems. MD simulations of biomolecules such
as proteins, nucleic acids, lipid bilayers, N-glycans, and so on, are used as important research tools
in structural and molecular biology. Many useful MD simulation packages [1] [2] [3] [4] [5] are now
available together with accurate molecular force field parameter sets [6] [7] [8] [9] [10]. Most MD
software has been optimized and parallelized for distributed-memory parallel supercomputers or PC clusters. Therefore, hundreds of CPUs or CPU cores can be used efficiently for a single MD simulation
of a relatively large biomolecular system, typically composed of several hundred thousand atoms. In
recent years, the number of available CPUs or CPU cores has been increasing rapidly, so modern and
more efficient parallel schemes need to be implemented in MD simulation programs.
One of our major motivations is to develop MD simulation software whose performance scales on
such modern supercomputers. For this purpose, we have developed the software from scratch, introducing hybrid (MPI + OpenMP) parallelism and several new parallel algorithms [11] [12]. Another
major motivation is to develop an MD simulation code that can be easily understood and modified
for methodological development. These two goals (high parallel performance and simplicity) usually
conflict with each other in computer software. Therefore, we decided to develop two MD simulators
simultaneously.
They are SPDYN (Spatial decomposition dynamics) and ATDYN (Atomic decomposition dynamics).
Although these two MD codes share almost the same data structures, subroutines, and modules, each simulator uses a different parallelization scheme. In SPDYN, the spatial decomposition
scheme is implemented with new parallel algorithms [11] [12]. Its performance is therefore better than
that of ATDYN and most other previously developed MD simulators. In ATDYN, the atomic decomposition
scheme is introduced for simplicity of the source code, and enhanced conformational sampling
algorithms such as the replica-exchange molecular dynamics (REMD) method are available.
Because of its simple parallelization, ATDYN performs worse than SPDYN. However, ATDYN
is simpler and therefore easier to modify when developing new algorithms or novel molecular models.
We hope that ambitious users will first develop new methodologies in ATDYN and eventually
move to SPDYN for better performance. Because we try to maintain consistency between the source
codes of ATDYN and SPDYN, such a switch from ATDYN to SPDYN is feasible.
Other features of GENESIS are as follows:
• Not only an atomistic molecular force field (CHARMM) but also a Go-model is available in ATDYN.
• For extremely large biomolecular systems (more than 10 million atoms), a parallel input/output (I/O) scheme is implemented and available.
• GENESIS is optimized for the K computer (developed by RIKEN and Fujitsu), but it is also available on Intel-based supercomputers and PC clusters.
• GENESIS is written in modern Fortran 90/95/2003 using modules and dynamic memory allocation. No common blocks are used!
• GENESIS is free software under the GNU General Public License (GPL) version 2 or later. We allow any user to use and modify GENESIS and to redistribute the modified version under the same license.
This manual consists of 18 chapters, including information on how to get started, explanations of each keyword in control files, and tutorials for MD simulations, coarse-grained (CG) MD simulations with the Go-model, REMD simulations, and so on. We recommend that new users of GENESIS start from the next
chapter, Getting Started, to learn the general idea, installation, and workflow of the program.
Compared to other MD software such as AMBER, CHARMM, and NAMD, GENESIS is a very
young MD simulator. Before releasing the program, the developers and contributors in the GENESIS development team worked hard to fix all bugs in the program and performed many test simulations.
Still, GENESIS may contain defects or minor bugs. Since we cannot bear any responsibility for the simulation results produced by GENESIS, we recommend that new users check their
results carefully and, if necessary, compare them with results from other MD programs.
We, the GENESIS development team, have many plans for future development of methodology and
molecular models. Some of them are already ongoing projects. We would like to grow GENESIS
into one of the most powerful and practical MD software packages, contributing to computational
chemistry and biophysics. We believe that computational studies in the life sciences
are still at a very early stage (like GENESIS) compared to established experimental research.
We hope that GENESIS can push computational science forward and contribute to life-science and
medical applications in the near future.
CHAPTER
TWO
GETTING STARTED
GENESIS consists of two simulators and several analysis tools. The simulators, called ATDYN
and SPDYN, can perform minimization, molecular dynamics, and other advanced simulations of
biomolecules. The analysis tools (trj_analysis, crd_convert, pcrd_convert, and remd_convert) are used
for post-processing trajectories produced by the simulators. prst_setup generates GENESIS-original
multiple restart I/O data from the conventional CHARMM input data.
A description of each program is given in the next chapter (Available Programs), and detailed usage
(including references for input parameters) is explained in Tutorials (from Chapter 14). This chapter
is intended to orient new users who have just downloaded the GENESIS package. The first half of this
chapter describes compilation and installation of GENESIS. The latter half gives users a
general idea of how to use GENESIS for their own purposes.
2.1.2 Download
The GENESIS package is available at http://www.riken.jp/TMS2012/cbp/en/research/software/genesis/index.html
The files are organized according to their purpose.
User Guide (genesis.pdf): this file
Source code (genesis.tgz): source code for the simulators and the tools
Test files (test.tgz): script and input files for regression tests (see below)
Tutorials (tutorial.tgz): input files for tutorials, see Tutorials (from Chapter 14)
2.1.3 Installation
First, extract the archive of the source code (genesis.tgz).
# untar the package file and change the working directory
$ tar xvfz genesis.tgz
$ cd genesis/
$ ls -lF
total 8
-rw-r--r--  1 user staff  79  9 12 22:27 README    # README file
-rw-r--r--  1 user staff  79  9 12 22:27 COPYING   # License agreement
drwxr-xr-x 10 user staff 340 10 20 18:16 src/      # Source codes
GENESIS uses the GNU autotools build system. Change the current directory to src/ and run the
./configure script to create a Makefile for your system. The ./configure script tries to detect all
of the requirements needed to compile GENESIS.
The ./configure script may take a while to complete. While running, it prints messages reporting which
items are being checked.
If you need to add compilation options and/or library paths, you can set the following options:
options                   meaning
--prefix=PREFIX           install architecture-independent files in PREFIX
--exec-prefix=EPREFIX     install architecture-dependent files in EPREFIX
--program-prefix=PREFIX   prepend PREFIX to installed program names
--program-suffix=SUFFIX   append SUFFIX to installed program names
--enable-debug=LEVEL      enable debugging (time consuming)
--disable-mpi             disable MPI parallelization
--disable-openmp          disable OpenMP parallelization
--enable-fft3d            enable FFT3D calculation (see [ENERGY] section)
--with-fftw[=PATH]        use FFTW library instead of embedded FFTE routine
--with-lapack             use LAPACK
--host=host               disable compiler check (required on cross-compiler systems)
FC                        Fortran compiler command
FCFLAGS                   Fortran compiler flags
LAPACK_PATH               LAPACK library path (recommended when given --with-lapack)
LAPACK_LIBS               LAPACK library path (recommended when given --with-lapack)
Once ./configure has finished successfully, a Makefile is created for your system. To compile and
install GENESIS, type the make install command:
$ make install
make install compiles all of the GENESIS programs (the simulators and the analysis tools). The
compiled binary files are copied to the bin/ directory. If the compilation finishes successfully, the
following binary files should be in your bin/:
$ ls -lF ../bin/
total 57840
-rwxr-xr-x 1 user staff 6271072 10 20 18:19 atdyn
-rwxr-xr-x 1 user staff 1804980 10 20 18:24 crd_convert
-rwxr-xr-x 1 user staff 1911360 10 20 18:24 pcrd_convert
-rwxr-xr-x 1 user staff 6575348 10 20 18:23 prst_setup
-rwxr-xr-x 1 user staff 1946552 10 20 18:24 remd_convert
-rwxr-xr-x 1 user staff 1180328 10 20 18:24 rst_convert
-rwxr-xr-x 1 user staff 7259252 10 20 18:23 spdyn
-rwxr-xr-x 1 user staff 1472264 10 20 18:24 trj_analysis
[Directory listing of the test files: test_spdyn, test_parallel_IO, test_common, test_atdyn, test.py, genesis.py, charmm.py, build, param]
Here, genesis_command is a command line for executing ATDYN or SPDYN (default: mpirun -np
8 atdyn). parallel_io needs to be appended to check the parallel I/O facility.
# execute atdyn test
$ ./test.py "mpirun -np 8 /path/to/atdyn"
# execute spdyn test
$ ./test.py "mpirun -np 8 /path/to/spdyn"
# execute spdyn test on FUJITSU compiler
$ ./test.py "mpirun -np 8 /path/to/spdyn" 0.03
Note: The regression tests are designed to be executed with 8 MPI processes. In particular, the parallel
I/O test will fail with other MPI conditions. For the ATDYN and SPDYN tests, other MPI conditions
may give invalid results.
Note: Since the optimization scheme of the FUJITSU compiler differs from those of other compilers,
the tolerance of the spdyn test should be increased when using the FUJITSU compiler.
Note: prst_setup does not work when built with the Fujitsu compiler. However, SPDYN has no problem
reading parallel restart files generated by prst_setup built with other compilers.
Each GENESIS program prints a template of the control file when executed with the -h ctrl template_name
option. A list of available template_name values can be obtained by running it with the -h option only. For example,
SPDYN prints the template names for md (molecular dynamics), min (minimization), and remd (replica
exchange molecular dynamics):
$ spdyn -h
# normal usage
% ./spdyn INP
# check control parameters of md
% ./spdyn -h ctrl md
# check control parameters of remd
% ./spdyn -h ctrl remd
For example, the template for minimization is shown by issuing the following command:
$ spdyn -h ctrl min
[INPUT]
topfile = sample.top      # topology file
parfile = sample.par      # parameter file
psffile = sample.psf      # protein structure file
pdbfile = sample.pdb      # PDB file

[ENERGY]
forcefield    = CHARMM
electrostatic = PME       # [CUTOFF,PME]
switchdist    = 10.0      # switch distance
(skipped...)
If you are interested in all available options, including detailed parameters for advanced users, add
_all to template_name. For example, running with -h ctrl min_all prints all available options for minimization.
2.2.3 Minimization
A control file for minimization of a molecule is shown below. [INPUT] section contains the input file
names, and the parameters of [ENERGY] specify energy and force evaluation. [MINIMIZE] section
enables the minimization algorithm.
[INPUT]
topfile = top_all27_prot_lipid.top   # topology file
parfile = top_all27_prot_lipid.par   # parameter file
psffile = mol.psf                    # protein structure file
pdbfile = mol.pdb                    # PDB file

[OUTPUT]
dcdfile = min.dcd
rstfile = min.rst

[ENERGY]
forcefield    = CHARMM   # [CHARMM,KBGO]
electrostatic = PME      # [CUTOFF,PME]
switchdist    = 10.0     # switch distance
cutoffdist    = 12.0     # cutoff distance
pairlistdist  = 13.0     # pairlist distance

[MINIMIZE]
nsteps        = 100      # number of steps
eneout_period = 10       # energy output period
crdout_period = 10       # coordinates output period
rstout_period = 100      # restart output period

[BOUNDARY]
type          = PBC      # [NOBC,PBC]

[INPUT]
topfile = top_all27_prot_lipid.top   # topology file
parfile = top_all27_prot_lipid.par   # parameter file
psffile = mol.psf                    # protein structure file
pdbfile = mol.pdb                    # PDB file
rstfile = min.rst                    # restart file

[OUTPUT]
dcdfile = md.dcd
rstfile = md.rst

[ENERGY]
forcefield    = CHARMM   # [CHARMM,KBGO]
electrostatic = PME      # [CUTOFF,PME]
switchdist    = 8.0      # switch distance
cutoffdist    = 12.0     # cutoff distance
pairlistdist  = 13.0     # pair-list cutoff distance

[DYNAMICS]
integrator    = LEAP     # [LEAP,VVER]
nsteps        = 1000     # number of MD steps
timestep      = 0.002    # timestep (ps)
eneout_period = 10       # energy output period
crdout_period = 10       # coordinates output period
rstout_period = 1000     # restart output period

[CONSTRAINTS]
rigid_bond    = YES

[ENSEMBLE]
ensemble      = NVE      # [NVE,NVT,NPT]
tpcontrol     = NO       # no thermostat
temperature   = 300      # initial temperature

[BOUNDARY]
type          = PBC      # [NOBC,PBC]
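The control files shown in this chapter share a simple layout: [SECTION] headers followed by key = value lines with optional # comments. The following is a minimal Python sketch of how such a file could be read by a user-side script, for example to check settings before submitting a job. The function name parse_control is hypothetical and is not part of GENESIS; the sketch assumes one key per line and does not handle comma-separated lists or backslash continuation.

```python
def parse_control(text):
    """Parse control-file text into {section: {key: value}} (hypothetical helper)."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().upper()
            sections[current] = {}
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            sections[current][key.strip()] = value.strip()
    return sections


example = """
[INPUT]
topfile = sample.top   # topology file
pdbfile = sample.pdb   # PDB file

[ENERGY]
forcefield    = CHARMM
electrostatic = PME    # [CUTOFF,PME]
switchdist    = 10.0   # switch distance
"""

ctrl = parse_control(example)
print(ctrl["ENERGY"]["electrostatic"], ctrl["ENERGY"]["switchdist"])
```

All values are returned as strings; converting them to numbers is left to the caller because the expected type depends on the keyword.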
CHAPTER
THREE
AVAILABLE PROGRAMS
GENESIS consists of two simulators with different algorithms and several analysis and conversion tools
for trajectories.
3.1 Simulators
GENESIS has two simulators with different decomposition schemes, but their input/output files adhere
to common formats (except for the parallel I/O scheme). However, some simulation options are
specific to ATDYN or SPDYN.
ATDYN (ATomic decomposition DYNamics simulator)
Molecular dynamics, energy minimization, and replica exchange molecular dynamics
(REMD) [18] [19] are available. The simulator uses the atomic decomposition scheme
with hybrid (MPI/OpenMP) parallelization. The simulator is designed for easy prototyping and implementation of new methods (force fields, generalized-ensemble methods, external forces,
coarse-grained models, etc.).
SPDYN (SPatial decomposition DYNamics simulator)
Molecular dynamics, energy minimization, and replica exchange molecular dynamics
(REMD) are available. The simulator uses the spatial decomposition scheme with our new
algorithms for parallel scaling; and parallel I/O scheme is available. The simulator is more
complex and designed for high performance and good scalability on massively parallel computers.
CHAPTER
FOUR
topfile
Topology file which contains information about atom connectivity in residues and monomer
unit. GENESIS reads topfile in CHARMM format. For details on the format, see the
CHARMM web site [14]. GENESIS can read multiple topfiles (see Note). (Required)
parfile
Parameter file which contains force field parameters, e.g., force constants and equilibrium
geometries. GENESIS reads parfile in CHARMM format. GENESIS can read multiple
parfiles (see Note). (Required)
strfile
CHARMM stream file which contains additional force field parameters. GENESIS can
read multiple strfiles (see Note). (Optional)
psffile
PSF file which contains system information such as atomic mass, charge, and atom connectivity.
GENESIS reads psffile in X-PLOR and CHARMM formats. (Required)
pdbfile
Atomic coordinates in PDB (Protein Data Bank) format. (Required)
crdfile
Atomic coordinates in CHARMM format. If crdfile is specified, coordinates in crdfile are
used as the initial coordinates in preference to those in pdbfile. (Optional)
rstfile
This file contains atomic coordinates, velocities, simulation box size, and other simulation
variables in double-precision floating-point representation. The rstfile, which
is given in GENESIS-original format, is specified for restarting a simulation. If rstfile
exists, coordinates in pdbfile/crdfile and the simulation box size in the control file are ignored.
(Optional)
reffile
Reference coordinates (PDB file format) for positional restraints. This file should contain
the same total number of atoms as pdbfile. (Optional)
localresfile (available for SPDYN only)
The LocalRes file is used to apply external forces as local restraints. (Optional)
Local restraints are restraining forces implemented within the spatial decomposition scheme. They are more computationally efficient than the ordinary ones, but the restrained atoms
have to be located in the same cell.
These energy terms are harmonic potentials:
\[ U(r) = k (r - r_0)^2 \quad \text{for bonds} \]
\[ U(\theta) = k (\theta - \theta_0)^2 \quad \text{for bond angles} \]
\[ U(\phi) = k (\phi - \phi_0)^2 \quad \text{for dihedral angles} \]
Here, $r$, $\theta$, and $\phi$ are the bond distance, angle, and dihedral angle, respectively; the subscript 0
denotes their reference values; and $k$ is the force constant. The restraint energies are added
to the corresponding bond or angle/dihedral angle energies in the output file.
The file contains the following information:
[BOND/ANGLE/DIHEDRAL] atom atom [atom [atom]] k r0
Atom indices start from 1.
Example of a localres file:
BOND     139 143        2.0 10.0
ANGLE    233 231 247    3.0 10.0
DIHEDRAL 22 24 41 43    2.0 10.0
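As an illustration of the record layout described above (restraint type, atom indices, then k and r0), the following Python sketch reads such lines and evaluates the harmonic restraint energy U = k (x - x0)^2. The helper names parse_localres and harmonic_energy are hypothetical, and the sketch assumes one whitespace-separated restraint per line; it is not the reader used inside GENESIS.

```python
# Number of atom indices expected for each restraint type.
N_ATOMS = {"BOND": 2, "ANGLE": 3, "DIHEDRAL": 4}


def parse_localres(text):
    """Return a list of (type, atom_indices, k, reference) tuples."""
    restraints = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind = fields[0].upper()
        n = N_ATOMS[kind]
        atoms = tuple(int(a) for a in fields[1:1 + n])  # 1-based indices
        k = float(fields[1 + n])
        ref = float(fields[2 + n])
        restraints.append((kind, atoms, k, ref))
    return restraints


def harmonic_energy(k, value, ref):
    """Harmonic restraint energy U = k * (value - ref)**2."""
    return k * (value - ref) ** 2


sample = """BOND 139 143 2.0 10.0
ANGLE 233 231 247 3.0 10.0
DIHEDRAL 22 24 41 43 2.0 10.0"""

for kind, atoms, k, ref in parse_localres(sample):
    print(kind, atoms, k, ref)
```

Note that the energy expression has no factor of 1/2, following the formulas given for local restraints.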
Note: Ordinary restraint functions are set in the [RESTRAINTS] section (see Restraints). Calculations equivalent
to local restraints can be performed with the ordinary restraint functions of ATDYN/SPDYN
without the spatial decomposition scheme.
Note: To specify multiple files for parfile, topfile, and strfile, write a comma-separated list (e.g.,
parfile = par_all36_prot.prm, par_all36_na.prm, par_all36_lipid.prm). If the character string becomes
long, a backslash can be used for line break and continuation.
dcdfile
Trajectory is written in DCD format (also used by X-PLOR and CHARMM). The
double-precision floating-point number representation is used in the file. (Required if
crdout_period > 0)
dcdvelfile
Velocities are also written in DCD format. (Required if velout_period > 0)
rstfile
ReSTart file contains coordinates, velocities, simulation box size, and other dynamic information. An MD simulation can be restarted using an rstfile generated by either a previous MD
or a MIN simulation; an REMD simulation can be restarted from any (MIN/MD/REMD)
rstfile. (Required if rstout_period > 0)
remfile (generated in REMD simulations)
The file provides parameter ID for each replica to match replica ID and parameter ID. It is
used as an input file for remd_convert utility. (Required if exchange_period > 0)
CHAPTER
FIVE
ENERGY
In [ENERGY] section, there are several options to set keywords related to energy and force evaluation.
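\[
E(r) = \sum_{\mathrm{bond}} K_b (b - b_0)^2
     + \sum_{\mathrm{UB}} K_{ub} (S - S_0)^2
     + \sum_{\mathrm{angle}} K_\theta (\theta - \theta_0)^2
     + \sum_{\mathrm{dihedral}} K_\phi (1 + \cos(n\phi - \delta))
     + \sum_{\mathrm{improper}} K_\omega (\omega - \omega_0)^2
     + \sum_{\mathrm{nonbond}} \epsilon \left[ \left( \frac{R_{min,ij}}{r_{ij}} \right)^{12} - 2 \left( \frac{R_{min,ij}}{r_{ij}} \right)^{6} \right]
     + \sum_{\mathrm{nonbond}} \frac{q_i q_j}{\epsilon_1 r_{ij}}
\]
where $K_b$, $K_{ub}$, $K_\theta$, $K_\phi$, and $K_\omega$ are force constants of bond, UB 1-3 distance, angle, dihedral angle,
and improper dihedral angle potentials, respectively; $b_0$, $S_0$, $\theta_0$, and $\omega_0$ are the corresponding equilibrium
values; and $\delta$ is the phase shift of the dihedral angle potential. $\epsilon$ is the Lennard-Jones potential well depth,
$R_{min,ij}$ is the distance of the Lennard-Jones potential minimum, $q_i$ is an atomic charge, $\epsilon_1$ is an effective
dielectric constant, and $r_{ij}$ is the distance between two atoms. The parameters are set according to the
atom type.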
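To make the non-bonded terms described above concrete, the following Python sketch evaluates the Lennard-Jones and Coulomb energies of a single atom pair. The function names are hypothetical, and the sketch assumes reduced units: the unit conversion constant that a production code applies to the Coulomb term, as well as cut-off and switching functions, are deliberately left out.

```python
def lj_energy(eps, rmin, r):
    """Lennard-Jones pair energy in the Rmin form: eps*[(Rmin/r)^12 - 2 (Rmin/r)^6]."""
    ratio6 = (rmin / r) ** 6
    return eps * (ratio6 * ratio6 - 2.0 * ratio6)


def coulomb_energy(qi, qj, r, dielec=1.0):
    """Coulomb pair energy q_i q_j / (dielec * r), unit factor omitted."""
    return qi * qj / (dielec * r)


# At r = Rmin the Lennard-Jones term reaches its minimum value, -eps.
print(lj_energy(0.1, 3.5, 3.5))
print(coulomb_energy(1.0, -1.0, 2.0))
```

Writing the Lennard-Jones term with Rmin rather than sigma means the well depth and the position of the minimum can be read off the parameters directly.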
The form of potential energy function is force field dependent. In GENESIS, TIP3P explicit water [20]
with CHARMM22 and CHARMM27 force fields are available (proteins: [7][8], nucleic acids: [21][22],
and lipids: [23][24]). Recently developed CHARMM36 force fields [25] [26] are also available.
For coarse-grained simulations, the model proposed by Karanicolas and Brooks [27] [28] is supported.
It is mainly based on the Go-like model [29] with some sequence-based parameters incorporated. The
potential energy function of the model is:
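\[
E(r) = \sum_{\mathrm{bond}} K_b (b - b_0)^2
     + \sum_{\mathrm{angle}} K_\theta (\theta - \theta_0)^2
     + \sum_{\mathrm{dihedral}} K_\phi (1 + \cos(n\phi - \delta))
     + \sum_{\mathrm{native\ contacts}} \epsilon \left[ 13 \left( \frac{R_{min,ij}}{r_{ij}} \right)^{12} - 18 \left( \frac{R_{min,ij}}{r_{ij}} \right)^{10} + 4 \left( \frac{R_{min,ij}}{r_{ij}} \right)^{6} \right]
     + \sum_{\mathrm{non\text{-}native\ pairs}} \epsilon \left( \frac{R_{min,ij}}{r_{ij}} \right)^{12}
\]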
where the last two terms are non-bonded interactions between native contact pairs and non-native contact
pairs, respectively. Roughly speaking, native contacts are defined for pairs that are close to each other in
the PDB structure of the target molecule. Rmin,ij is the reference distance between a pair of atoms in the PDB
structure.
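The two non-bonded terms of the coarse-grained model can be sketched in Python as follows. This is an illustrative sketch with hypothetical function names and a single well depth eps passed in by the caller; how GENESIS assigns the per-pair parameters is not shown here.

```python
def native_contact_energy(eps, rmin, r):
    """12-10-6 native contact term: eps*[13 x^12 - 18 x^10 + 4 x^6], x = Rmin/r."""
    x = rmin / r
    return eps * (13.0 * x**12 - 18.0 * x**10 + 4.0 * x**6)


def nonnative_energy(eps, rmin, r):
    """Purely repulsive term for non-native pairs: eps * (Rmin/r)^12."""
    return eps * (rmin / r) ** 12


# At r = Rmin the native term equals eps * (13 - 18 + 4) = -eps.
print(native_contact_energy(1.0, 5.0, 5.0))
print(nonnative_energy(1.0, 5.0, 10.0))
```

Because the coefficients sum to -1 and the derivative vanishes at r = Rmin, the native term has its minimum of depth eps at the reference distance taken from the PDB structure.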
electrostatic CUTOFF/PME
Type of long range electrostatic energy/force evaluations (Default : PME)
switchdist Real
Switch cut-off distance (unit : Angstrom) (Default : 10.0)
cutoffdist Real
Potential energy/force cut-off distance (unit : Angstrom) (Default : 12.0)
pairlistdist Real
Cut-off distance of Verlet pair list for non-bonded energy/force evaluations [31] (unit :
Angstrom) (Default : 13.5)
dielec_const Real
Dielectric constant (Default : 1.0)
vdw_force_switch YES / NO
Usage of the force switch function for van der Waals interactions (Default : NO)
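\[
E_{elec} = \sum_{i<j} \frac{q_i q_j}{\epsilon_1} \frac{\mathrm{erfc}(\alpha r_{ij})}{r_{ij}}
         + \frac{2\pi}{V} \sum_{|G|^2 \neq 0} \frac{\exp(-|G|^2 / 4\alpha^2)}{|G|^2} \sum_{ij} \frac{q_i q_j}{\epsilon_1} \cos(G \cdot r_{ij})
         - \sum_{i} \frac{q_i^2}{\epsilon_1} \frac{\alpha}{\sqrt{\pi}}
\]
Here, the first term is calculated with a cut-off because it decreases rapidly as pair distances increase.
The third term is the so-called self-energy, which is calculated only once. The sum over G in the second term is rewritten as:
\[
\sum_{|G|^2 \neq 0} \frac{\exp(-|G|^2 / 4\alpha^2)}{|G|^2} |S(G)|^2
\]
where
\[
S(G) = b_1(G_1) b_2(G_2) b_3(G_3) F(Q)(G_1, G_2, G_3)
\]
For the evaluation of S(G), b1(G1), b2(G2), and b3(G3) are approximated by the cardinal B-splines of
order n, and F(Q) is calculated by the fast Fourier transformation (FFT) of charge values.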
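The statement that the first (real-space) term can be truncated with a cut-off is easy to check numerically. The Python sketch below assumes the standard Ewald real-space form, in which the bare Coulomb pair energy is multiplied by the complementary error function erfc(alpha r), and uses 0.34 for alpha and 12.0 Angstrom for the cut-off, which are the defaults quoted for pme_alpha and cutoffdist. The function names are hypothetical and unit conversion factors are omitted.

```python
import math


def ewald_real_pair(qi, qj, r, alpha=0.34, dielec=1.0):
    """Real-space Ewald energy of one pair (unit conversion factor omitted)."""
    return qi * qj * math.erfc(alpha * r) / (dielec * r)


def screening_factor(r, alpha=0.34):
    """Ratio of the screened pair energy to the bare Coulomb energy."""
    return math.erfc(alpha * r)


for r in (2.0, 6.0, 12.0):
    print(f"r = {r:5.1f}  erfc(alpha*r) = {screening_factor(r):.3e}")
```

With these values the screening factor at the cut-off distance is many orders of magnitude below one, which is why neglecting pairs beyond the cut-off introduces only a very small error in this term.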
pme_alpha Real
Exponent of complementary error function (Default : 0.34)
pme_ngrid_x Integer
Number of FFT grid points along x dimension (Required if PME is used)
pme_ngrid_y Integer
Number of FFT grid points along y dimension (Required if PME is used)
pme_ngrid_z Integer
Number of FFT grid points along z dimension (Required if PME is used)
pme_nspline Integer
B-spline order of charges for the evaluation from b1 (G1 ) to b3 (G3 ) (Default : 4)
pme_multiple YES/NO
Enable partitioning of processors for real and reciprocal PME term computation (available
in ATDYN only) (Default : NO)
pme_mul_ratio Integer
Ratio of process numbers for real and reciprocal PME term computation (available in ATDYN only and used when PME_multiple=YES only) (Default : 1)
where
\[ L = \mathrm{INT}(\mathrm{Density} \times r_v^2 / r^2) \]
and
\[ t = \mathrm{Density} \times r_v^2 / r^2 - L \]
Density is the number of points per unit interval. The lookup table using cubic interpolation is different
from that using linear interpolation. In the case of cubic interpolation, monotonic cubic Hermite polynomial
interpolation is used to impose the monotonicity of the energy value. Energy/gradients are evaluated as
a function of $r^2$ [36] using four basis functions for the cubic Hermite spline: $h_{00}(t)$, $h_{10}(t)$, $h_{01}(t)$, and
$h_{11}(t)$:
\[
F(r^2) \approx F_{tab}(L-1) h_{00}(t) + \frac{F_{tab}(L-2) + F_{tab}(L-1)}{2} h_{10}(t) + F_{tab}(L) h_{01}(t) + \frac{F_{tab}(L-1) + F_{tab}(L)}{2} h_{11}(t)
\]
where
\[ L = \mathrm{INT}(\mathrm{Density} \times r^2) \]
and
\[ t = \mathrm{Density} \times r^2 - L \]
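The index L and the fractional offset t used for the cubic table, together with the four cubic Hermite basis functions, can be sketched in Python as follows. The basis functions are the standard textbook definitions; the function names are hypothetical, the default density of 20 is the value quoted for table_density, and the way GENESIS combines the basis functions with its tabulated values is not reproduced here.

```python
def table_index(r2, density=20):
    """Return (L, t) with L = INT(density * r2) and t = density * r2 - L."""
    x = density * r2
    L = int(x)
    return L, x - L


def hermite_basis(t):
    """Standard cubic Hermite basis functions h00, h10, h01, h11 on [0, 1]."""
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00, h10, h01, h11


L, t = table_index(2.5)
print(L, t)
print(hermite_basis(0.5))
```

At t = 0 only h00 is non-zero and at t = 1 only h01 is non-zero, so the interpolant passes through the tabulated values at the grid points.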
table YES / NO
Enable the lookup table (Default : YES)
table_density Integer
Number of table points per unit interval (Default : 20)
table_order 0 / 1 / 3
Order of interpolation used in the lookup table (Default : 1 (Electrostatic=PME), 3 (Electrostatic=Cutoff))
water_model TIP3 / NONE
Type of water model used for the lookup table approach (Default : TIP3)
CHAPTER
SIX
DYNAMICS
In [DYNAMICS] section, we choose integrator and time step options for MD simulations.
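\[ v(t + \Delta t) = v(t) + \frac{\Delta t}{2m} [F(t) + F(t + \Delta t)] \]
Leapfrog (LEAP) integrator has one stage. After force evaluation, velocities at the half time step and coordinates at the next integer time step are updated as follows:
\[ v(t + \tfrac{1}{2}\Delta t) = v(t - \tfrac{1}{2}\Delta t) + \frac{\Delta t}{m} F(t) \]
\[ r(t + \Delta t) = r(t) + \Delta t \, v(t + \tfrac{1}{2}\Delta t) \]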
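The leapfrog update described above can be written in a few lines. The following Python sketch applies it to a one-dimensional harmonic oscillator with unit mass and unit force constant; the function names and the test system are illustrative assumptions, not code taken from GENESIS.

```python
import math


def leapfrog_step(r, v_half, force, mass, dt):
    """One leapfrog step: update the half-step velocity, then the position."""
    v_half_new = v_half + dt * force(r) / mass
    r_new = r + dt * v_half_new
    return r_new, v_half_new


def run_oscillator(dt, nsteps):
    """Integrate a unit harmonic oscillator starting from r = 1, v = 0."""
    force = lambda r: -r
    mass = 1.0
    r = 1.0
    # Velocity at t = -dt/2, obtained from v(0) = 0 by a backward half step.
    v_half = 0.0 - 0.5 * dt * force(r) / mass
    for _ in range(nsteps):
        r, v_half = leapfrog_step(r, v_half, force, mass, dt)
    return r


r_final = run_oscillator(0.01, 628)
print(r_final, math.cos(6.28))
```

Only one force evaluation per step is needed, and positions and velocities are stored at time points staggered by half a time step.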
annealing YES/NO
Enable simulated annealing (Default : NO)
annealing_period Integer
Number of time steps between temperature changes (Default : N/A)
dtemperature Real
Temperature change per temperature step (unit : Kelvin) (Default : 0.0)
CHAPTER
SEVEN
MINIMIZE
In the [MINIMIZE] section, we choose options for energy minimization. Currently, only the steepest descent
algorithm is available.
method SD
Algorithm of minimization (Default : SD).
nsteps Integer
Number of minimization steps (Default : 100)
eneout_period Integer
Period of energy outputs in minimization steps (Default : 10)
crdout_period Integer
Period of coordinates outputs in minimization steps (Default : N/A)
rstout_period Integer
Period of restart file updates in minimization steps (Default : N/A)
nbupdate_period Integer
Period of non-bonded pair list updates in minimization steps (Default : 10)
Note: If the initial structure deviates strongly from equilibrium and there are several non-bonded
interactions at short distances, it is recommended to set table_order=0 to avoid large energy/force
values. Please note that constraints are not available in minimization.
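For readers unfamiliar with the steepest descent (SD) method, the following Python sketch shows the basic idea: coordinates are moved repeatedly along the negative gradient of the energy. It assumes a fixed step size and a simple quadratic test function; the function name is hypothetical, and the step-size control actually used in GENESIS is not described in this chapter and is not reproduced here.

```python
def steepest_descent(x, grad, nsteps=100, step=0.1):
    """Move x along the negative gradient for nsteps iterations (fixed step size)."""
    for _ in range(nsteps):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def quadratic_grad(x):
    """Gradient of the test energy E = (x0 - 1)^2 + (x1 + 2)^2."""
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]


x_min = steepest_descent([5.0, 5.0], quadratic_grad)
print(x_min)
```

The default of 100 iterations mirrors the default value of nsteps listed above.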
CHAPTER
EIGHT
CONSTRAINTS
In the [CONSTRAINTS] section, constraint keywords are described. The SHAKE scheme is used for bonds
between heavy and hydrogen atoms [37]. For the velocity Verlet integrator, the RATTLE constraint is also
available [38]. Bond constraints between heavy atoms are not imposed. If explicit water molecules
(TIP3P) are present, the SETTLE algorithm is enabled automatically [39].
rigid_bond YES/NO
Enable constraints (not available for minimization) (Default : NO).
shake_iteration Integer
Number of iterations for coordinates/velocities updates. If it does not converge within the
given number of iterations, the program terminates with an error message. (Default : 500)
shake_tolerance Real
Tolerance of SHAKE convergence (unit : Angstrom) (Default : 1.0e-10)
fast_water YES/NO
Usage of SETTLE algorithm for constraints in water molecules (Default : YES)
(fast_water=NO is not available in SPDYN).
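The roles of shake_iteration and shake_tolerance can be illustrated with a textbook SHAKE iteration for a single bond. The Python sketch below is an assumption-laden illustration, not the GENESIS implementation: the function name is hypothetical, only one constraint is treated, and the convergence test is applied to the difference of squared distances.

```python
def shake_bond(r1_old, r2_old, r1_new, r2_new, m1, m2, d0,
               tolerance=1.0e-10, max_iter=500):
    """Iteratively correct r1_new and r2_new so that |r1 - r2| equals d0."""
    r1, r2 = list(r1_new), list(r2_new)
    # Bond vector before the unconstrained update; corrections act along it.
    ref = [a - b for a, b in zip(r1_old, r2_old)]
    for _ in range(max_iter):
        s = [a - b for a, b in zip(r1, r2)]
        diff = sum(c * c for c in s) - d0 * d0
        if abs(diff) < tolerance:
            return r1, r2
        dot = sum(a * b for a, b in zip(s, ref))
        g = diff / (2.0 * (1.0 / m1 + 1.0 / m2) * dot)
        r1 = [a - g * b / m1 for a, b in zip(r1, ref)]
        r2 = [a + g * b / m2 for a, b in zip(r2, ref)]
    raise RuntimeError("SHAKE did not converge within max_iter iterations")


p1, p2 = shake_bond([0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                    [0.0, 0.05, 0.0], [1.1, 0.0, 0.0],
                    m1=12.0, m2=1.0, d0=1.0)
print(p1, p2)
```

The corrections are weighted by the inverse masses, so the heavy atom moves much less than the hydrogen atom and the center of mass of the pair is unchanged. As in the keyword description, the sketch stops with an error when the iteration limit is reached.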
CHAPTER
NINE
ENSEMBLE
In [ENSEMBLE] section, type of ensemble, temperature and pressure control algorithm, and parameters used in the algorithms are specified.
In the Langevin thermostat algorithm (ensemble=NVT with tpcontrol=LANGEVIN), every particle is coupled with a viscous background and a stochastic heat bath [40]:
\[ \frac{dv(t)}{dt} = \frac{F(t) + R(t)}{m} - \gamma v(t) \]
where $\gamma$ is the thermostat friction parameter (gamma_t keyword) and R(t) is the stochastic force. In
the Langevin thermostat and barostat method (ensemble=NPT with tpcontrol=LANGEVIN), the
equations of motion are given by [41]:
\[ \frac{dr(t)}{dt} = v(t) + v_\epsilon r(t) \]
\[ \frac{dv(t)}{dt} = \frac{F(t) + R(t)}{m} - \left[ \gamma_p + \left( 1 + \frac{3}{f} \right) v_\epsilon \right] v(t) \]
\[ \frac{dv_\epsilon(t)}{dt} = \left[ 3V (P(t) - P_0(t)) + \frac{3K}{f} - \gamma_p v_\epsilon + R_p \right] / p_{mass} \]
where K is the kinetic energy, $\gamma_p$ is the barostat friction parameter (gamma_p keyword), and $R_p$ is the
stochastic pressure variable.
LEAP  NVT   O   O
LEAP  NPT   O   O
VVER  NVT   O   O
VVER  NPT   X   O
CHAPTER
TEN
BOUNDARY
In [BOUNDARY] section, boundary conditions of the system such as simulation box size are specified.
type PBC/NOBC
Type of boundary condition (Default : PBC).
PBC: Periodic boundary condition (rectangular or cubic box)
NOBC: No boundary condition (vacuum system) (available in ATDYN only).
box_size_x Real
Box size along the x dimension (unit : Angstrom) (Required if PBC is used)
box_size_y Real
Box size along the y dimension (unit : Angstrom) (Required if PBC is used)
box_size_z Real
Box size along the z dimension (unit : Angstrom) (Required if PBC is used)
domain_x Integer
Number of domains along the x dimension (Optional; available in SPDYN only)
domain_y Integer
Number of domains along the y dimension (Optional; available in SPDYN only)
domain_z Integer
Number of domains along the z dimension (Optional; available in SPDYN only)
Note: If the numbers of domains (domain_x, domain_y, and domain_z) are not specified in the control file,
they are determined automatically from the number of MPI processes. If the user specifies the
numbers of domains explicitly, their product (domain_x x domain_y x domain_z) must
be equal to the total number of MPI processes.
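The constraint in the note above can be checked before submitting a job. The sketch below is only an illustration: the function names are hypothetical helpers, not part of GENESIS, and the automatic choice made by SPDYN when the keywords are omitted is not reproduced here.

```python
def check_domains(domain_x, domain_y, domain_z, n_mpi):
    """Return True if the explicit domain grid matches the MPI process count."""
    return domain_x * domain_y * domain_z == n_mpi


def candidate_grids(n_mpi):
    """List every (domain_x, domain_y, domain_z) whose product equals n_mpi."""
    grids = []
    for x in range(1, n_mpi + 1):
        if n_mpi % x:
            continue
        rest = n_mpi // x
        for y in range(1, rest + 1):
            if rest % y == 0:
                grids.append((x, y, rest // y))
    return grids


print(check_domains(2, 2, 2, 8))   # a 2 x 2 x 2 grid on 8 MPI processes
print(check_domains(2, 2, 3, 8))   # 12 domains cannot be mapped onto 8 processes
print(candidate_grids(8))
```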
Note: If the simulation system has a periodic boundary condition, the user must specify the box size
initially. During simulations, box size is saved in a restart file. If the restart file is used as an input of the
subsequent restart MD, the initial box size is replaced with it, even if another value is specified in the
control file.
CHAPTER
ELEVEN
SELECTION
To perform MD simulations with restraints like umbrella sampling, the user must select atoms to be
restrained. Once the group of selected atoms is defined in this section, the group index can be called
from [RESTRAINTS] section. Nothing is selected as default.
[SELECTION] section is also included in a control file of some analysis tools, in which selected atoms
are used as analysis and fitting atoms.
group expression
Select atoms by expression and define them as a group. Available keywords and operators in expression are listed in Table II. Note that mname (or moleculename, molname) in
expression is a molecule name that is defined by mole_name below.
mole_name molecule starting-residue ending-residue
Define a molecule by its starting-residue and ending-residue. Each residue is specified as
[segment id]:residue number:residue name
meaning             example
atom name           an:CA
atom index          ai:1-5
atom number         atno:6
residue name        rnam:GLY
residue number      rno:1-5
molecule name       mname:molA
segment index       segid:PROA
water molecule
hydrogen atoms
heavy atoms
protein backbone
all atoms
conjunction
logical add
negation
assemble
Note: ai, atomindex, and atomidx indicate the index of an atom that is sequentially renumbered over
all atoms in the system, while atno and atomno are the index that is assigned to each atom in the
PDB file. The atom index in a PDB file (column 2) does not always start from 1 and is not always numbered sequentially. If
the user wants to use such a PDB file as an input file, although it should be a rare case, atno and atomno
are useful to select atoms.
CHAPTER
TWELVE
RESTRAINTS
The [RESTRAINTS] section contains keywords to set external harmonic restraint functions. The restraint
functions are applied to the selected atom groups in [SELECTION] section.
External harmonic restraint functions are applied to the selected atom groups in order to restrict the
motions of the atoms. The potential energy of a restraint is:

$$U(x) = k\,(x - x_0)^2$$
nfunctions Integer
Number of ordinary restraint functions. Note that the number of local restraints is not included.
The following parameters are set with a serial number (the maximum is nfunctions). (Default: 0)
function POSI /DIST[MASS] /ANGLE[MASS] /DIHED[MASS] /RMSD[MASS]
Type of harmonic restraint. (Default: N/A)
Each keyword is used with a serial number up to nfunctions.
POSI: positional restraint. The reference coordinates are set by reffile in [INPUT]. (see Input and Output files)
DIST[MASS]: distance restraint.
ANGLE[MASS]: angle restraint.
DIHED[MASS]: dihedral angle restraint.
RMSD[MASS] (available in ATDYN only): RMSD restraint. MASS means mass-weighted
RMSD. The calculation is done without superimposing the reference coordinates.
DIST, ANGLE, and DIHED can calculate the distance/angle/dihedral among representative points of the groups. MASS indicates that the force is applied to the center of mass
of a selected group. Without MASS, the force is applied to the arithmetic average of the
coordinates.
constant Real
[RESTRAINTS]
nfunctions    = 1
function1     = DIST
reference1    = 10.0
constant1     = 2.0
select_index1 = 1 2
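To make the restraint potential concrete, the sketch below evaluates U(x) = k (x - x0)^2 and the corresponding force for a distance restraint. It assumes that the values 10.0 and 2.0 in the DIST example are the reference distance x0 (in Angstrom) and the force constant k, respectively; the function names are illustrative only. Note that the potential is written without a factor of 1/2, so the force is -2k(x - x0).

```python
def restraint_energy(x, k, x0):
    """U(x) = k * (x - x0)**2, as written above (no factor of 1/2)."""
    return k * (x - x0) ** 2


def restraint_force(x, k, x0):
    """Force along x: F = -dU/dx = -2 * k * (x - x0)."""
    return -2.0 * k * (x - x0)


# Assumed reading of the example: reference x0 = 10.0, force constant k = 2.0
for distance in (9.0, 10.0, 11.5):
    u = restraint_energy(distance, k=2.0, x0=10.0)
    f = restraint_force(distance, k=2.0, x0=10.0)
    print(f"x = {distance:5.1f}  U = {u:6.2f}  F = {f:6.2f}")
```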
CHAPTER
THIRTEEN
REMD
In [REMD] section, the users can specify keywords for Replica-Exchange Molecular Dynamics
(REMD) simulation. REMD method is one of the enhanced conformational sampling methods used
for systems with rugged free-energy landscapes. The original temperature-exchange method (T-REMD)
is the most widely used in simulations of bio-molecules [18] [42]. Here, replicas (or copies) of the
original system are prepared, and different temperatures are assigned to each replica. Each replica is run
in a canonical (NVT) or isobaric-isothermal (NPT) ensemble, and target temperatures are exchanged
between a pair of replicas during a simulation. Exchanging temperature enforces a random walk in
temperature space, resulting in the simulation surmounting energy barriers and the sampling of a much
wider conformational space of target molecules.
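In REMD methods, the transition probability for the replica exchange process is given by the usual
Metropolis criterion,

$$w(X \rightarrow X') = \min\left(1, \frac{P(X')}{P(X)}\right) = \min\left(1, \exp(-\Delta)\right).$$

For temperature exchange, $\Delta = (\beta_m - \beta_n)\left(E(q^{[j]}) - E(q^{[i]})\right)$, where $E$ is the potential energy, $q$ is the position of atoms, $\beta$ is the inverse temperature defined by
$\beta = 1/k_B T$, $i$ and $j$ are the replica indexes, and $m$ and $n$ are the parameter indexes. After the replica
exchange, atomic momenta are rescaled as follows:

$$p^{[i]\prime} = \sqrt{\frac{T_n}{T_m}}\, p^{[i]}, \qquad p^{[j]\prime} = \sqrt{\frac{T_m}{T_n}}\, p^{[j]}.$$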
T-REMD is performed with the Langevin method in NPT, NPAT, and NPgT ensembles. For other cases,
only atomic momenta are rescaled.
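The exchange step for temperature REMD can be sketched as follows. This is an illustration, not the GENESIS implementation: it assumes the standard temperature-exchange form Delta = (beta_m - beta_n)(E(q[j]) - E(q[i])) for replica i at temperature T_m and replica j at T_n, energies in kcal/mol, and hypothetical function names. The energies in the usage lines are made-up numbers.

```python
import math
import random

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exchange_delta(e_i, e_j, t_m, t_n):
    """Delta = (beta_m - beta_n) * (E_j - E_i), replica i at T_m and j at T_n."""
    beta_m = 1.0 / (KB * t_m)
    beta_n = 1.0 / (KB * t_n)
    return (beta_m - beta_n) * (e_j - e_i)


def acceptance_probability(delta):
    """Metropolis criterion: w = min(1, exp(-delta))."""
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def momentum_scale_factors(t_m, t_n):
    """Factors applied to p[i] (T_m -> T_n) and p[j] (T_n -> T_m) after a swap."""
    return math.sqrt(t_n / t_m), math.sqrt(t_m / t_n)


def attempt_exchange(e_i, e_j, t_m, t_n, rng):
    """Return True if the exchange is accepted."""
    return rng.random() < acceptance_probability(exchange_delta(e_i, e_j, t_m, t_n))


rng = random.Random(3141592)
delta = exchange_delta(e_i=-1000.0, e_j=-990.0, t_m=300.0, t_n=310.0)
print("delta =", delta, " w =", acceptance_probability(delta))
print("accepted:", attempt_exchange(-1000.0, -990.0, 300.0, 310.0, rng))
print("scale factors:", momentum_scale_factors(300.0, 310.0))
```

When the replica at the higher temperature happens to have the lower potential energy, Delta is negative and the exchange is always accepted.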
In GENESIS, not only temperature REMD but also pressure REMD, surface-tension REMD, REUS (or
Hamiltonian REMD), and their multi-dimensional versions are available in both ATDYN and SPDYN.
REMD simulations in GENESIS require an MPI environment. At least one MPI process must be
assigned to each replica. For example, if the user wants to prepare 32 replicas, (32 x n) MPI
processes are needed.
dimension Integer
Number of dimensions (i.e. number of parameter types to be exchanged) (Default: 1).
exchange_period Integer
Period of replica exchange in time steps (Default: 100). If exchange_period = 0 is
specified, REMD simulation without parameter exchange is executed.
iseed
Random number seed in the replica exchange scheme (Default: 3141592).
type TEMPERATURE / PRESSURE / GAMMA / RESTRAINT
Type of parameter to be exchanged in a dimension (Default: TEMPERATURE).
TEMPERATURE: Temperature REMD
PRESSURE: Pressure REMD
GAMMA: Surface-tension REMD
RESTRAINT: REUS (or Hamiltonian REMD)
nreplica Integer
Number of replicas (or parameters) in a dimension (Default: 0).
parameters real / (real, real)
List of parameters in a dimension. Parameters are separated by white spaces, and the total number of parameters must be equal to the above nreplica. In REUS (type =
RESTRAINT), parameters should be specified in parentheses, where the first and second
values separated by a comma correspond to a force constant and reference value in the
restraint potential function, respectively. Even if force constants and reference values are
specified here, dummy values must be set in the corresponding restraint function in [RESTRAINTS] section.
rest_function (only available for REUS)
Index of restraint function, which is pointing to the restraint function defined in [RESTRAINTS] section (see Restraints).
If rest_function is a positional restraint, multiple PDB files with different coordinates
should be prepared by specifying reffile = filename{}.pdb in [INPUT] section.
In this case, reference values in the above parameters are ignored but should be specified
as dummy values.
cyclic_params YES / NO
Note: When multi-dimensional REMD is carried out, parameters are exchanged alternately. For
example, in TP-REMD (type(1) = TEMPERATURE and type(2) = PRESSURE), temperature is exchanged at the first exchange trial, and pressure is exchanged at the second trial. This is repeated during
the simulations.
Note: All parameters in [REMD] section except for exchange_period must be kept unchanged
when a run is restarted.
[REMD]
dimension       = 1
exchange_period = 1000
type1           = TEMPERATURE
nreplica1       = 4
parameters1     = 300 310 320 330
[REMD]
dimension       = 2
exchange_period = 1000
type1           = TEMPERATURE
nreplica1       = 8
parameters1     = 300 310 320 330 340 350 360 370
type2           = RESTRAINT
nreplica2       = 4
parameters2     = (2.0,10.0) (2.0,10.5) (2.0,11.0) (2.0,11.5)
rest_function2  = 1

[SELECTION]
group1 = ai:1
group2 = ai:2

[RESTRAINTS]
nfunctions    = 1
function1     = DIST
reference1    = 10.0     # dummy
constant1     = 2.0      # dummy
select_index1 = 1 2
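The resource requirement of a multi-dimensional run follows from the number of replicas in each dimension. The sketch below assumes that one replica is prepared for every combination of parameters, so the total is the product of the per-dimension counts (8 temperatures x 4 restraint parameters in the two-dimensional example). The helper names are hypothetical and n is the number of MPI processes per replica.

```python
from math import prod


def total_replicas(nreplica_per_dimension):
    """Total number of replicas, assuming one replica per parameter combination."""
    return prod(nreplica_per_dimension)


def mpi_processes(nreplica_per_dimension, procs_per_replica):
    """MPI processes needed when every replica gets procs_per_replica processes."""
    if procs_per_replica < 1:
        raise ValueError("at least one MPI process per replica is required")
    return total_replicas(nreplica_per_dimension) * procs_per_replica


print(total_replicas([4]))        # one-dimensional example: 4 temperatures
print(total_replicas([8, 4]))     # 8 temperatures x 4 restraint parameters
print(mpi_processes([8, 4], 2))   # n = 2 processes per replica
```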
CHAPTER
FOURTEEN
Let's look at the header entries of the file. They tell a lot about this molecule: for example, the structure
was determined by X-ray diffraction (described in the EXPDTA entry in the file), there are no missing residues
(no MISSING entries), and there are three disulfide bonds between the cysteine residue
pairs 5-55, 14-38, and 30-51 (SSBOND entries).
Since the atom naming conventions of PDB and CHARMM are slightly different, we need to manually
edit the file with a text editor (such as vi or Emacs) to comply with the CHARMM convention. We need
to replace the O and OXT atoms of the C-terminal residue (ALA58) with OT1 and OT2, respectively. After
the edit, ALA58 should look as follows:
$ less 4pti.pdb
...skipped...
ATOM    449  N   ALA A  58      25.146  29.681  -6.493  1.00 46.21           N
ATOM    450  CA  ALA A  58      25.617  30.840  -7.256  1.00 45.05           C
ATOM    451  C   ALA A  58      25.248  30.735  -8.729  1.00 46.90           C
ATOM    452  OT1 ALA A  58      24.962  31.791  -9.369  1.00 39.78           O
ATOM    453  CB  ALA A  58      27.160  30.980  -7.146  1.00 50.07           C
ATOM    454  OT2 ALA A  58      24.919  29.594  -9.172  1.00 43.54           O
TER     455      ALA A  58
...skipped...
Since GENESIS does not provide any programs for building simulation systems, we use other packages
such as CHARMM [14] or psfgen [15] (supplied with NAMD [16]). In this chapter, we use VMD
[17], and psfgen is called via the plug-in interface within VMD. Other VMD plug-ins (solvate [47] and
autoionize [48]) are also used for solvation and neutralization, respectively.
A VMD script for building the simulation system containing a BPTI molecule in a box with water
molecules and ions is at tutorial/bpti/1_setup/setup.tcl. We can run it with VMD:
$ vmd -dispdev text <setup.tcl | tee run.out
Here, with the -dispdev text option, we inhibit any graphical display windows from opening (we do
not need any visualization for this set-up). The redirection <setup.tcl instructs VMD to process the
script. The content of the script, written in Tcl, is shown below.
The script consists of four parts. In the first part, the crystal water molecules are removed by using the
atom selection facility of VMD. The protein structure without crystal water molecules is written in a pdbfile (4pti_protein.pdb). In the second part, psfgen plugin is called. psfgen matches the residues in
the original file with the residues defined in topfile (top_all27_prot_lipid.rtf), and then creates disulfide bonds by patching special residues for them. Also, the coordinates of missing hydrogen
atoms are guessed from the topology. Then, a psffile (protein.psf) and a pdbfile (protein.pdb)
of BPTI without solvent are written. In the third part, solvate plugin is invoked to solvate BPTI with
TIP3P bulk water molecules and fills the box for simulations under the periodic boundary conditions.
The resulting system is written in solvate.psf and solvate.pdb. In the fourth part, counter ions
are added to neutralize the system using autoionize plug-in. The final results are saved as ionize.psf
and ionize.pdb.
It is always a good idea to check the final structure by a visual inspection. We can visualize the structure
with VMD by specifying psffile (ionize.psf) and pdbfile (ionize.pdb) together:
$ vmd -psf ionize.psf -pdb ionize.pdb
If the building has finished successfully, the structure looks as follows:
Note: The size (70 A x 83 A x 68 A) of the simulation box built at this step is rather large for a
typical globular protein, because we are going to use SPDYN. For the spatial decomposition scheme in
SPDYN, the simulation box size in each direction has to be more than 5 times the pair list distance
(pairlistdist=13.5; distances are in Angstrom). If you prefer a smaller simulation box, decrease the
padding distance (-t 22.5) in the solvate plug-in, and use ATDYN for your simulations instead.
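The box-size requirement in the note can be verified with a few lines of arithmetic. The sketch below uses the pair list distance of 13.5 Angstrom and the box dimensions of this tutorial system; the function names and the `factor` argument are illustrative, and the rule itself is taken from the note rather than from the SPDYN source.

```python
def min_box_length(pairlistdist, factor=5.0):
    """Smallest box edge implied by the rule 'box > factor * pairlistdist'."""
    return factor * pairlistdist


def box_is_large_enough(box, pairlistdist, factor=5.0):
    """True if every box edge exceeds factor * pairlistdist."""
    limit = min_box_length(pairlistdist, factor)
    return all(edge > limit for edge in box)


box = (70.8250, 83.2579, 69.0930)     # box size (Angstrom) of the BPTI system
print(min_box_length(13.5))            # 5 x 13.5 Angstrom
print(box_is_large_enough(box, 13.5))
print(box_is_large_enough((60.0, 60.0, 60.0), 13.5))
```

The shortest edge of the tutorial box is only slightly above the limit, which is why reducing the padding distance requires switching to ATDYN.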
The control file (run.inp, shown below) consists of several sections, such as [INPUT], [OUTPUT],
and [ENERGY], etc., where we can specify the control parameters for the calculation.
In [INPUT] section, input files for the minimization are specified. topfile (topology file), parfile (parameter file), psffile (PSF file), pdbfile (PDB file with an initial structure) are always required. As an
optional input, reffile (reference file) is given as the reference structure for positional restraints (see Input
and Output files for an explanation of each input file).
In [OUTPUT] section, output files are specified. SPDYN does not create any output file unless we
explicitly specify the files. Here, rstfile (restart file) is used for the restart of the simulation (see Input
and Output files for an explanation of each output file).
In [ENERGY] section, we set the keywords related to the energy and force evaluation. Here, the particle
mesh Ewald (PME) method is selected for computing electrostatic interactions, usually combined with
the periodic boundary condition (PBC) in [BOUNDARY] section. By default, an interpolation scheme
with a lookup table is used for the evaluation of non-bonded interactions.
[MINIMIZE] section turns on the minimization engine of SPDYN. Here, we specify 1000 steps of the
steepest descent algorithm (SD). See Minimize for details.
In [BOUNDARY] section, the boundary condition and the simulation box size are set. Here, we use the
periodic boundary condition (PBC). The values of the box size were taken from the previous building
step. See Boundary for details.
In [SELECTION] section, we define a group (group1) consisting of backbone atoms of the protein to
impose positional restraints. For the detailed expressions for selection, see Selection.
In [RESTRAINTS] section, positional restraints are specified for the group of backbone atoms defined
in the previous [SELECTION] section. The positional restraints are imposed with respect to the reference coordinates given by reffile in [INPUT] section. See Restraints for details.
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ionize.psf                 # protein structure file
pdbfile = ../1_setup/ionize.pdb                 # PDB file
reffile = ../1_setup/ionize.pdb                 # reference for restraints

[OUTPUT]
dcdfile = run.dcd
rstfile = run.rst

[ENERGY]
electrostatic = PME        # [CUTOFF,PME]
switchdist    = 10.0       # switch distance
cutoffdist    = 12.0       # cutoff distance
pairlistdist  = 13.5       # pair-list cutoff distance
pme_ngrid_x   = 72         # grid size_x in [PME]
pme_ngrid_y   = 80         # grid size_y in [PME]
pme_ngrid_z   = 72         # grid size_z in [PME]

[MINIMIZE]
nsteps        = 1000       # number of steps
eneout_period = 100        # energy output period
crdout_period = 100        # coordinates output period
rstout_period = 1000       # restart output period

[BOUNDARY]
type          = PBC        # [PBC,NOBC]
box_size_x    = 70.8250    # box size (x) in [PBC]
box_size_y    = 83.2579    # box size (y) in [PBC]
box_size_z    = 69.0930    # box size (z) in [PBC]

[SELECTION]
group1        = backbone

[RESTRAINTS]
nfunctions    = 1          # number of functions
function1     = POSI       # restraint function type
constant1     = 10.0       # force constant
select_index1 = 1          # restraint group
After the minimization, it is recommended to confirm the decrease of the potential energy. The output
format of GENESIS is simple, so we can easily extract the potential energy term with standard UNIX
tools:
$ grep "^INFO" run.out | tail -n +2 | awk '{print $2, $5}' >pot.dat
$ gnuplot
gnuplot> set xlabel "step"
gnuplot> set ylabel "potential energy [kcal/mol]"
gnuplot> plot "pot.dat" w lp
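The same extraction can be done in Python when the data are needed for further processing. The function below is a hypothetical helper that mimics the grep/tail/awk pipeline (select lines starting with INFO, drop the first one, pick awk-style fields). The log text in the example is synthetic and its column names are placeholders; only the field numbers 2 and 5 are taken from the command above.

```python
def extract_columns(log_text, columns, skip=1):
    """Mimic: grep "^INFO" | tail -n +2 | awk '{print $a, $b}'.

    columns holds 1-based awk-style field numbers; skip drops the first
    matching line(s), i.e. the header row with the column names.
    """
    info_lines = [ln for ln in log_text.splitlines() if ln.startswith("INFO")]
    rows = []
    for line in info_lines[skip:]:
        fields = line.split()
        rows.append(tuple(float(fields[c - 1]) for c in columns))
    return rows


# Synthetic log for illustration only; real GENESIS output has more columns.
sample = """\
INFO: STEP A B POTENTIAL
INFO: 0 1.0 2.0 -1000.0
INFO: 100 1.0 2.0 -1500.0
[STEP5] some other output line
INFO: 200 1.0 2.0 -1600.0
"""

rows = extract_columns(sample, columns=(2, 5))
for step, energy in rows:
    print(int(step), energy)

# The potential energy should decrease during minimization.
energies = [e for _, e in rows]
print(all(b <= a for a, b in zip(energies, energies[1:])))
```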
In [INPUT] section of the control file (run.inp), rstfile is explicitly set for the restart from the previous
minimization.
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ionize.psf                 # protein structure file
pdbfile = ../1_setup/ionize.pdb                 # PDB file
reffile = ../1_setup/ionize.pdb                 # reference for restraints
rstfile = ../2_minimization/run.rst             # restart file
Major differences from the previous minimization are [DYNAMICS], [CONSTRAINTS], and [ENSEMBLE] sections with the parameters for molecular dynamics simulations.
[DYNAMICS] section turns on the molecular dynamics engine of SPDYN. In this section, the parameters related to molecular dynamics are specified. For the heating up of the system, a simulated annealing
protocol is enabled with annealing=YES. The simulated annealing algorithm increases the target
temperature by dtemperature K every anneal_period steps. In this case, the temperature is
[DYNAMICS]
integrator    = LEAP       # [LEAP,VVER]
nsteps        = 5000       # number of MD steps
timestep      = 0.002      # time step (ps)
eneout_period = 50         # energy output period
crdout_period = 50         # coordinates output period
rstout_period = 5000       # restart output period
annealing     = YES        # simulated annealing
anneal_period = 50         # annealing period
dtemperature  = 3          # temperature change at annealing (K)

[CONSTRAINTS]
rigid_bond    = YES

[ENSEMBLE]
ensemble      = NVT        # [NVE,NVT,NPT]
tpcontrol     = LANGEVIN   # thermostat
temperature   = 0.1        # initial temperature (K)

[BOUNDARY]
type          = PBC        # [PBC,NOBC]
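The heating protocol implied by these values (5000 steps of 0.002 ps, an annealing period of 50 steps, a temperature change of 3 K, and an initial temperature of 0.1 K) can be worked out as below. The sketch assumes that the target temperature rises by dtemperature once per completed anneal_period; the exact step at which GENESIS applies each update is not stated here, and the function name is illustrative.

```python
def annealing_schedule(t_start, dtemperature, anneal_period, nsteps):
    """Target temperature after each annealing update.

    Assumes one increment of dtemperature per completed anneal_period steps.
    """
    n_updates = nsteps // anneal_period
    return [t_start + dtemperature * (i + 1) for i in range(n_updates)]


schedule = annealing_schedule(t_start=0.1, dtemperature=3.0,
                              anneal_period=50, nsteps=5000)
print(len(schedule))     # number of temperature updates
print(schedule[-1])      # final target temperature (K)
print(5000 * 0.002)      # simulated time (ps)
```

Under this assumption the run performs 100 updates of 3 K each, so the target temperature ends close to 300 K after 10 ps.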
After the heating up simulation, the increase in temperature can be confirmed in this way:
$ grep "^INFO" run.out | tail -n +2 | awk '{print $3, $21}' >temp.dat
$ gnuplot
gnuplot> set xlabel "time [ps]"
gnuplot> set ylabel "temperature [K]"
gnuplot> plot "temp.dat" w lp
Also, it is recommended to visualize the trajectory to confirm that the protein structure is not disrupted.
We can visualize the trajectory with VMD by the following command:
$ vmd -psf ../1_setup/ionize.psf -dcd run.dcd
Note: For real applications, 10 ps may be too short for the heating-up process. For your research, runs
longer than 1 ns are recommended. The same is true for the next equilibration step.
In the control file, [SELECTION] and [RESTRAINTS] sections are removed, since we perform a
restraint-free simulation. Most of the control parameters are the same as those in the previous heat-up
simulation, except for rstfile, for which we specify the output of the previous simulation.
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ionize.psf                 # protein structure file
pdbfile = ../1_setup/ionize.pdb                 # PDB file
reffile = ../1_setup/ionize.pdb                 # reference for restraints
rstfile = ../3_heating/run.rst                  # restart file
After the equilibration, it is always good practice to check the convergence of quantities related to
thermodynamic conditions (such as temperature, pressure, or volume). This can be done in a similar
way as described before.
In [DYNAMICS] section of the control file, the total number of simulation steps is set as
nsteps=500000, and the time step is set as timestep=0.002 ps. This combination results in
the length of nsteps*time step = 1000 ps = 1 ns simulation. Since the output period for coordinates is crdout_period=500, we have the total of nsteps/crdout_period = 1000 snapshots
in the output trajectory.
[DYNAMICS]
integrator    = LEAP       # [LEAP,VVER]
nsteps        = 500000     # number of MD steps
timestep      = 0.002      # time step (ps)
eneout_period = 500        # energy output period
crdout_period = 500        # coordinates output period
rstout_period = 500000     # restart output period
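The relation between nsteps, timestep, and crdout_period can be checked with a short calculation, which is handy when planning longer runs. The function names below are illustrative helpers and not GENESIS keywords.

```python
def run_length_ps(nsteps, timestep_ps):
    """Simulated time in picoseconds: nsteps * timestep."""
    return nsteps * timestep_ps


def n_snapshots(nsteps, crdout_period):
    """Number of frames written to the trajectory: nsteps / crdout_period."""
    return nsteps // crdout_period


# Production run of this tutorial: nsteps=500000, timestep=0.002, crdout_period=500
print(run_length_ps(500000, 0.002))   # length in ps
print(n_snapshots(500000, 500))       # frames in run.dcd
```

The number of snapshots obtained here is the value later passed to the trajectory analysis as the number of frames to read.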
When the production simulation finishes, let's visualize the trajectory with VMD. From the visualization,
you may get some insights into the fluctuations of the protein structure. In the next step, we will quantify
the size of the fluctuations by analyzing the trajectory.
The control file for calculating the RMSDs is shown below. In [TRAJECTORY] section, we set the
input trajectory file as trjfile1=../5_production/run.dcd.
Note: crd_convert supports multiple input files, so we can specify another file as trjfile2=XXX.
In [SELECTION] section, we define a group (group1) of the Cα atoms of the protein, which are fitted
for the RMSD calculation. In the [FITTING] section, the fitting method is specified. In this case,
translations (TR) and rotations (ROT) are allowed. Finally, the RMSD output file (rmsfile=run.rms) is
set in [OUTPUT] section.
[INPUT]
psffile = ../1_setup/ionize.psf
pdbfile = ../1_setup/ionize.pdb

[OUTPUT]
rmsfile = run.rms                           # RMSD file

[TRAJECTORY]
trjfile1       = ../5_production/run.dcd    # trajectory file
md_step1       = 1000                       # number of MD steps
mdout_period1  = 1                          # MD output period
ana_period1    = 1                          # analysis period
trj_format     = DCD                        # (PDB/DCD)
trj_type       = COOR+BOX                   # (COOR/COOR+BOX)

[SELECTION]
group1                                      # selection group 1

[FITTING]
fitting_method = TR+ROT                     # method
fitting_atom   = 1                          # atom group

[OPTION]
check_only     = NO                         # (YES/NO)
After running crd_convert, we get the RMSD file (run.rms). The first column of the file is for the
time step, and the second column is for the RMSD values in unit of Angstrom. This can be plotted with
gnuplot:
$ gnuplot
gnuplot> set xlabel "step"
gnuplot> set ylabel "RMSD [angstrom]"
gnuplot> plot "run.rms" w lp
Most of the RMSD values are lower than 1 Angstrom throughout the simulation, indicating that this
protein is very stable in solution.
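Besides plotting, the RMSD file can be summarized numerically. The sketch below assumes the two-column layout described above (time step, RMSD in Angstrom); the helper names and the sample values are made up for illustration and are not output from this tutorial.

```python
def read_rms(text):
    """Parse two-column (step, RMSD) text into a list of (int, float) pairs."""
    data = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        data.append((int(float(fields[0])), float(fields[1])))
    return data


def summarize(data, threshold=1.0):
    """Mean, maximum, and fraction of frames with RMSD below threshold."""
    values = [rmsd for _, rmsd in data]
    below = sum(1 for rmsd in values if rmsd < threshold)
    return {
        "mean": sum(values) / len(values),
        "max": max(values),
        "fraction_below": below / len(values),
    }


# Synthetic example; with real data use: read_rms(open("run.rms").read())
sample = "0 0.00\n500 0.82\n1000 1.10\n1500 0.95\n"
print(summarize(read_rms(sample)))
```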
CHAPTER
FIFTEEN
The submission to the MMTSB web service consists of three stages: (1) upload the cleaned PDB file
(1pgb_edited.pdb), (2) give a reference tag (i.e. 1pgb), and (3) enter your e-mail address. After
the submission, you'll get an e-mail with a tar-ball file containing parfile and topfile.
After extracting the tar-ball file into the working directory, you can see parfile and topfile
(GO_1pgb.param and GO_1pgb.top, respectively) and other files.
# change to the set-up directory
$ cd tutorial/go/1_setup/
# extract the tar-ball file
Here, with the -dispdev text option, we inhibit any graphical display windows from opening (we do
not need any visualization for this set-up). The redirection <setup.tcl instructs VMD to process the
script. The content of the script, written in Tcl, is shown below.
The script consists of three parts. In the first part, the PDB file created by the MMTSB web service is
read, and the molecule is moved so that its center of mass is at the origin. In the second part, residue names
are replaced with special names for the Go-model (G1, G2, G3,...). These replacements are required to
match the residue names to those defined in topfile (GO_1pgb.top). In the third part, the psfgen plugin
is called, and psffile (go.psf) and pdbfile (go.pdb) are generated.
##### read pdb and remove center of mass
mol load pdb GO_1pgb.pdb
set all [atomselect top all]
# invert the center of mass
$all moveby [vecinvert [measure center $all weight mass]]
$all moveby [vecinvert [measure center $all]]

##### replace residue names with G1, G2, G3, ...
set residue_list [lsort -unique [$all get resid]]
foreach i $residue_list {
    set resname_go [format "G%d" $i]
    set res [atomselect top "resid $i" frame all]
    $res set resname $resname_go
}
# tmp.pdb carries the shifted coordinates and the new residue names
$all writepdb tmp.pdb

##### generate PSF and PDB files
package require psfgen
resetpsf
topology GO_1pgb.top
segment PROT {
    first none
    last none
    pdb tmp.pdb
}
regenerate angles dihedrals
coordpdb tmp.pdb PROT

# write psf and pdb files
writepsf go.psf
writepdb go.pdb
exit
It is always a good idea to inspect the final structure visually. We can visualize the structure with VMD
15.1. Building a simulation system
If the building has successfully finished, the structure looks like this:
The control file (run.inp, shown below) is different from those for atomistic simulations.
The control file (run.inp) contains several sections, such as [INPUT], [OUTPUT], and [ENERGY],
where we can specify the control parameters for the simulation. In [INPUT] section, topfile (topology
file), parfile (parameter file), psffile (PSF file), pdbfile (the initial structure) are set (see Input and Output
files for an explanation of each input file).
In [OUTPUT] section, output files are set. ATDYN does not create any output file unless we explicitly
specify the files. Here, rstfile (restart file) and dcdfile (binary trajectory file) are set (see Input and Output
files for an explanation of each output file).
In [ENERGY] section, we can specify the parameters related to the energy and force evaluation. Here,
KBGO is specified for the Go-model of Karanicolas and Brooks (forcefield=KBGO). This value turns
on the special 12-10-6 type vdW interactions for native contacts. For the distances, very large values
are set (switchdist=997, cutoffdist=998, pairlistdist=999) to perform a non-cutoff
simulation.
[DYNAMICS] section turns on the molecular dynamics engine of ATDYN. For the Go-model with
SHAKE constraints, the time step can be 20 fs.
In [CONSTRAINTS] section, we enable the SHAKE algorithm on all bonded pairs (rigid_bond=YES).
To suppress the SETTLE algorithm, which would otherwise be applied to non-existent TIP3P water molecules, we have to disable it
explicitly (fast_water=NO). The tolerance for SHAKE is rather large compared to atomistic simulations (shake_tolerance=1.0e-6).
In [ENSEMBLE] section, the LANGEVIN thermostat is chosen for an isothermal simulation with the friction constant of 1.0 ps-1.
Finally, in [BOUNDARY] section, no boundary condition is imposed in this system.
[INPUT]
topfile = ../1_setup/GO_1pgb.top       # topology file
parfile = ../1_setup/GO_1pgb.param     # parameter file
psffile = ../1_setup/go.psf            # protein structure file
pdbfile = ../1_setup/go.pdb            # PDB file

[OUTPUT]
dcdfile = run.dcd
rstfile = run.rst

[ENERGY]
forcefield      = KBGO
electrostatic   = CUTOFF
switchdist      = 997.0        # switch distance
cutoffdist      = 998.0        # cutoff distance
pairlistdist    = 999.0        # pair-list cutoff distance
table           = NO           # usage of lookup table

[DYNAMICS]
integrator      = LEAP         # [LEAP,VVER]
nsteps          = 100000000    # number of MD steps
timestep        = 0.020        # timestep (ps)
eneout_period   = 10000        # energy output period
rstout_period   = 10000        # restart output period
crdout_period   = 10000        # coordinates output period
nbupdate_period = 10000

[CONSTRAINTS]
rigid_bond      = YES
fast_water      = NO
shake_tolerance = 1.0e-6

[ENSEMBLE]
ensemble        = NVT          # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN     # thermostat
temperature     = 325          # initial and target temperature (K)
gamma_t         = 0.01         # thermostat friction (ps-1) in [LANGEVIN]

[BOUNDARY]
type            = NOBC         # [PBC, NOBC]
The control file for calculating the RMSDs is shown below. In [TRAJECTORY] section, we set the
input trajectory file as trjfile1=../2_production/run.dcd.
Note: crd_convert supports multiple input files, so we can specify another file as trjfile2=XXX.
In [SELECTION] section, we define a group (group1) of all beads in the model which are fitted for the RMSD calculation. In the [FITTING] section, the fitting method is specified. In this
case, the translations TR and rotations ROT are allowed for the fitting. Finally the RMSD output file
(rmsfile=run.rms) is set in [OUTPUT] section.
[INPUT]
psffile = ../1_setup/go.psf
reffile = ../1_setup/go.pdb

[OUTPUT]
rmsfile = run.rms                           # RMSD file

[TRAJECTORY]
trjfile1       = ../2_production/run.dcd    # trajectory file
md_step1       = 100000000                  # number of MD steps
mdout_period1  = 10000                      # MD output period
ana_period1    = 1                          # analysis period
trj_format     = DCD                        # (PDB/DCD)
trj_type       = COOR                       # (COOR/COOR+BOX)

[SELECTION]
group1         = all

[FITTING]
fitting_method = TR+ROT                     # method
fitting_atom   = 1                          # atom group

[OPTION]
check_only     = NO                         # (YES/NO)
After running crd_convert, we get the RMSD file (run.rms). The first column of the file is for the
time step, and the second column is for the RMSD values in unit of Angstrom. This can be plotted with
gnuplot:
$ gnuplot
gnuplot> set xlabel "step"
gnuplot> set ylabel "RMSD [angstrom]"
gnuplot> plot "run.rms" w lp
The noticeable fluctuations of RMSD values correspond to folding/unfolding events of the protein.
CHAPTER
SIXTEEN
16.1 Minimization
As the initial structures often contain non-physical steric clashes or non-equilibrium geometries, it is
recommended to relax the system before the REMD simulation by minimizing the potential energy of
the system. ATDYN and SPDYN support the steepest descent algorithm for minimization, which moves
atoms proportionally to their negative gradients of the potential energy.
The following commands perform a 1000-step minimization with ATDYN.
The control file (remd_min.inp, shown below) consists of several sections, such as [INPUT], [OUTPUT], and [ENERGY], etc., where we can specify the control parameters for the simulation.
In [INPUT] section, input files for the minimization are specified. topfile (topology file), parfile (parameter file), psffile (PSF file), pdbfile (initial structure) are always required. An optional input, reffile
(reference file), is the reference structure for positional restraints (see Input and Output files for an
explanation about each input file).
In [OUTPUT] section, output files are specified. ATDYN does not create any output file unless we
explicitly specify the files. Here, rstfile (restart file) is used for the restart of the simulation (see Input
and Output files for an explanation about each output file).
In [ENERGY] section, we set the keywords related to the energy and force evaluation. Here, the particle
mesh Ewald (PME) method is selected for computing electrostatic interactions, usually combined with the
periodic boundary condition (PBC) in [BOUNDARY] section. table_order=0 is specified for nonbonded interactions (by default, an interpolation scheme with a lookup table is used for the evaluation
of non-bonded interactions). When table_order=0, the force values of the nearest grid point in the
table are used (without any interpolation), and thus large forces which may occur at the beginning of
minimization can be safely truncated. See Energy for details.
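The difference between a nearest-grid-point lookup and an interpolated lookup can be illustrated with a toy table. Everything in this sketch is an assumption made for illustration: the uniform grid in r, the clamping at the table edge, the steep repulsive test function, and the function names. The actual layout of the GENESIS lookup table is not described here.

```python
def build_table(func, r_min, r_max, n_points):
    """Tabulate func on a uniform grid of n_points between r_min and r_max."""
    step = (r_max - r_min) / (n_points - 1)
    return [func(r_min + i * step) for i in range(n_points)], step


def lookup(table, step, r_min, r, order):
    """order=0: value of the nearest grid point; order=1: linear interpolation."""
    pos = (r - r_min) / step
    pos = min(max(pos, 0.0), float(len(table) - 1))   # clamp to the table range
    if order == 0:
        return table[int(round(pos))]
    lo = min(int(pos), len(table) - 2)
    frac = pos - lo
    return table[lo] + frac * (table[lo + 1] - table[lo])


def repulsive_force(r):
    """Steep 12/r^13 repulsion used only as a test function."""
    return 12.0 / r ** 13


table, step = build_table(repulsive_force, 1.0, 9.0, 81)

# A clashing pair at r = 0.5 lies below the first grid point: the order-0
# lookup returns the tabulated edge value instead of the much larger true force.
print(repulsive_force(0.5), lookup(table, step, 1.0, 0.5, order=0))

# Inside the table, interpolation follows the function more closely.
print(repulsive_force(2.03), lookup(table, step, 1.0, 2.03, order=0),
      lookup(table, step, 1.0, 2.03, order=1))
```

In this toy model the force felt by a badly clashing pair is bounded by the largest tabulated value, which is the behaviour that makes the first minimization steps stable.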
[MINIMIZE] section turns on the minimization engine of ATDYN. Here, we specify 1000 steps of the
steepest descent algorithm (SD). See Minimize for details.
In [BOUNDARY] section, the boundary condition and the simulation box size are set. Here, we use the
periodic boundary condition (PBC). See Boundary for details.
In [SELECTION] section, we define a group (group1) consisting of backbone atoms of the protein to
impose positional restraints. For the detailed expressions for selection, see Selection.
In [RESTRAINTS] section, positional restraints are specified for the group of backbone atoms defined
in the previous [SELECTION] section. The positional restraints are imposed with respect to the reference coordinates given by reffile in [INPUT] section. See Restraints for details.
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ala.psf                    # protein structure file
pdbfile = ../1_setup/ala.pdb                    # PDB file
reffile = ../1_setup/ala.pdb                    # reference for restraints

[OUTPUT]
dcdfile = ./remd_min.dcd
rstfile = ./remd_min.rst

[ENERGY]
electrostatic   = PME      # [CUTOFF,PME]
switchdist      = 7.5      # switch distance
cutoffdist      = 8.0      # cutoff distance
pairlistdist    = 9.0      # pair-list distance
table           = YES      # usage of lookup table
table_order     = 1        # order of lookup table
pme_ngrid_x     = 64       # grid size_x in [PME]
pme_ngrid_y     = 64       # grid size_y in [PME]
pme_ngrid_z     = 64

[MINIMIZE]
nsteps          = 1000     # number of steps
eneout_period   = 10       # energy output period
crdout_period   = 10       # coordinates output period
rstout_period   = 1000     # restart output period
nbupdate_period = 5        # nonbond update period

[BOUNDARY]
type            = PBC      # [PBC,NOBC]
box_size_x      = 64.0     # box size (x) in [PBC]
box_size_y      = 64.0     # box size (y) in [PBC]
box_size_z      = 64.0     # box size (z) in [PBC]

[SELECTION]
group1          = backbone

[RESTRAINTS]
nfunctions      = 1        # number of functions
function1       = POSI     # restraint function type
constant1       = 10.0     # force constant
select_index1   = 1        # restraint group
16.2 Equilibration
In this step, we equilibrate the system under the condition of 300 K and 1 atm by a 10-picosecond
molecular dynamics simulation. In the first equilibration (eq1), positional restraints are imposed on the
backbone, and at the second equilibration (eq2) all the restraints are removed to relax the whole system
for the production simulation.
The following commands perform the first 10-picosecond NPT molecular dynamics simulation under
the condition of 300 K and 1 atm with restraints (eq1).
In [INPUT] section of the control file (remd_eq1.inp), rstfile is set to the restart file from the previous
minimization.
The main differences from the previous minimization setup are the [DYNAMICS], [CONSTRAINTS],
and [ENSEMBLE] sections, which contain the keywords for molecular dynamics simulations.
[DYNAMICS] section enables the molecular dynamics engine of ATDYN. Parameters related to the molecular dynamics integrator are specified in this section. See Dynamics for details.
[CONSTRAINTS] section specifies constraints during a simulation. rigid_bond is a keyword for
the SHAKE algorithm. For TIP3P waters, a faster algorithm (SETTLE) is automatically applied if
rigid_bond=YES. See Constraints for details.
In [ENSEMBLE] section, we can choose a thermostat and a barostat from several standard options. For
typical simulations, LANGEVIN or BERENDSEN is recommended. See Ensemble for details.
In [BOUNDARY] section, we no longer need to specify the simulation box size, as the box size is read
from the restart file.
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ala.psf                    # protein structure file
pdbfile = ../1_setup/ala.pdb                    # PDB file
rstfile = ../2_minimization/remd_min.rst        # restart file
reffile = ../1_setup/ala.pdb                    # reference for restraints

[OUTPUT]
dcdfile = ./remd_eq1.dcd
rstfile = ./remd_eq1.rst

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 7.5     # switch distance
cutoffdist      = 8.0     # cutoff distance
pairlistdist    = 9.0     # pair-list distance
table           = YES     # usage of lookup table
table_order     = 1       # order of lookup table
pme_ngrid_x     = 64      # grid size_x in [PME]
pme_ngrid_y     = 64      # grid size_y in [PME]
pme_ngrid_z     = 64      # grid size_z in [PME]

[DYNAMICS]
integrator      = LEAP    # [LEAP,VVER]
nsteps          = 5000    # number of MD steps
timestep        = 0.002   # time step (ps)
eneout_period   = 50      # energy output period
crdout_period   = 50      # coordinates output period
rstout_period   = 5000    # restart output period
nbupdate_period = 5       # nonbond update period

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NPT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat and barostat
temperature     = 300.0     # initial and target temperature (K)
pressure        = 1.0       # target pressure (atm)

[BOUNDARY]
type            = PBC     # [PBC,NOBC]

[SELECTION]
group1          = backbone

[RESTRAINTS]
nfunctions      = 1       # number of functions
function1       = POSI    # restraint function type
constant1       = 5.0     # force constant
select_index1   = 1       # restraint group
In the second equilibration, the [SELECTION] and [RESTRAINTS] sections in the control file are removed, as we perform a restraint-free simulation (eq2). Most of the control parameters are the same as in
the previous equilibration (eq1), except rstfile, which is set to the restart file of the first equilibration.
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ala.psf                    # protein structure file
pdbfile = ../1_setup/ala.pdb                    # PDB file
rstfile = ./remd_eq1.rst                        # restart file
reffile = ../1_setup/ala.pdb                    # reference for restraints

[OUTPUT]
dcdfile = ./remd_eq2.dcd
rstfile = ./remd_eq2.rst

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 7.5     # switch distance
cutoffdist      = 8.0     # cutoff distance
pairlistdist    = 9.0     # pair-list distance
table           = YES     # usage of lookup table
table_order     = 1       # order of lookup table
pme_ngrid_x     = 64      # grid size_x in [PME]
pme_ngrid_y     = 64      # grid size_y in [PME]
pme_ngrid_z     = 64      # grid size_z in [PME]

[DYNAMICS]
integrator      = LEAP    # [LEAP,VVER]
nsteps          = 5000    # number of MD steps
timestep        = 0.002   # time step (ps)
eneout_period   = 50      # energy output period
crdout_period   = 50      # coordinates output period
rstout_period   = 5000    # restart output period
nbupdate_period = 5       # nonbond update period

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NPT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat and barostat
temperature     = 300.0     # initial and target temperature (K)
pressure        = 1.0       # target pressure (atm)

[BOUNDARY]
type            = PBC     # [PBC]
Also, it is recommended to visualize the trajectory to confirm that the protein structure is not disrupted. We
can visualize the trajectory with VMD by the following command:
Note: For real applications, 10 ps may be too short for equilibration. For your research, an equilibration longer than 1 ns is recommended.
After the equilibration, it is always good practice to check the convergence of quantities related to
thermodynamic conditions (such as temperature, pressure, or volume). This can be done in a similar
way to that described before.
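The 10 ps length of each equilibration run follows directly from the [DYNAMICS] keywords (nsteps = 5000, timestep = 0.002 ps). The short sketch below reproduces that arithmetic and estimates how many steps a 1 ns run would need at the same time step. The helper names are hypothetical, and the frame count is only approximate, since it ignores whether the initial frame is written.

```python
def run_length_ps(nsteps, timestep_ps):
    """Simulated time in picoseconds."""
    return nsteps * timestep_ps


def frames_written(nsteps, crdout_period):
    """Approximate number of coordinate frames (ignores a possible initial frame)."""
    return nsteps // crdout_period


def steps_for(length_ps, timestep_ps):
    """Number of MD steps needed to cover length_ps."""
    return round(length_ps / timestep_ps)


# Values taken from the eq1/eq2 control files above
length = run_length_ps(5000, 0.002)
frames = frames_written(5000, 50)
steps_1ns = steps_for(1000.0, 0.002)

print(f"{length:.1f} ps, about {frames} frames, {steps_1ns} steps for 1 ns")
```

With these numbers, extending the equilibration to 1 ns would mean raising nsteps (and probably rstout_period) by a factor of 100.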
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ala.psf                    # protein structure file
pdbfile = ../1_setup/ala.pdb                    # PDB file
rstfile = ../3_equilibration/remd_eq2.rst       # restart file

[OUTPUT]
logfile = remd_run_eq{}.log
dcdfile = remd_run_eq{}.dcd
rstfile = remd_run_eq{}.rst
remfile = remd_run_eq{}.rem

[REMD]
dimension       = 1
exchange_period = 0
type(1)         = TEMPERATURE
nreplica(1)     = 32
parameters(1)   = 300 301 302 303 304 305 306 307 308 309 310 311 312 \
                  313 314 315 316 317 318 319 320 321 322 323 324 325 \
                  326 327 328 329 330 331
cyclic_params(1)= NO

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 7.5     # switch distance
cutoffdist      = 8.0     # cutoff distance
pairlistdist    = 9.0     # pair-list distance
table_order     = 1       # order of lookup table
pme_ngrid_x     = 64      # grid size_x in [PME]
pme_ngrid_y     = 64      # grid size_y in [PME]
pme_ngrid_z     = 64      # grid size_z in [PME]

[DYNAMICS]
integrator      = LEAP    # [LEAP,VVER]
nsteps          = 10000   # number of MD steps
timestep        = 0.002   # time step (ps)
eneout_period   = 50      # energy output period
crdout_period   = 50      # coordinates output period
rstout_period   = 10000   # restart output period
nbupdate_period = 5       # nonbond update period

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NVT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat
temperature     = 300.0     # initial and target temperature (K)

[BOUNDARY]
type            = PBC     # [NOBC,PBC]
box_size_x      = 64.0    # box size (x) in [PBC]
box_size_y      = 64.0    # box size (y) in [PBC]
box_size_z      = 64.0
After equilibrating the system, we move on to the production simulation of REMD. The following
commands perform 3-nanosecond REMD simulations (nsteps = 1500000 with a 0.002 ps time step) in the NVT ensemble:
# perform production simulation of REMD with ATDYN by submitting a batch job script
$ qsub remd_run.sh
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ala.psf                    # protein structure file
pdbfile = ../1_setup/ala.pdb                    # PDB file
rstfile = ./remd_run_eq{}.rst                   # restart file

[OUTPUT]
logfile = remd_run{}.log
dcdfile = remd_run{}.dcd
rstfile = remd_run{}.rst
remfile = remd_run{}.rem

[REMD]
dimension       = 1
exchange_period = 1000
type(1)         = TEMPERATURE
nreplica(1)     = 32
parameters(1)   = 300 301 302 303 304 305 306 307 308 309 310 311 312 \
                  313 314 315 316 317 318 319 320 321 322 323 324 325 \
                  326 327 328 329 330 331
cyclic_params(1)= NO

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 7.5     # switch distance
cutoffdist      = 8.0     # cutoff distance
pairlistdist    = 9.0     # pair-list distance
table_order     = 1       # order of lookup table
pme_ngrid_x     = 64      # grid size_x in [PME]
pme_ngrid_y     = 64
pme_ngrid_z     = 64

[DYNAMICS]
integrator      = LEAP     # [LEAP,VVER]
nsteps          = 1500000  # number of MD steps
timestep        = 0.002    # time step (ps)
eneout_period   = 1000     # energy output period
crdout_period   = 1000     # coordinates output period
nbupdate_period = 5        # nonbond update period
rstout_period   = 1500000  # restart output period

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NVT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat
temperature     = 300.0     # initial and target temperature (K)

[BOUNDARY]
type            = PBC     # [NOBC,PBC]
box_size_x      = 64.0    # box size (x) in [PBC]
box_size_y      = 64.0    # box size (y) in [PBC]
box_size_z      = 64.0    # box size (z) in [PBC]
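The parameters(1) entry in the [REMD] section lists one temperature per replica, and exchange_period together with nsteps determines how many exchange attempts occur. As a convenience sketch only (the helper name `format_ladder` and the 13-values-per-line layout are assumptions that mirror the listing above), the ladder and the attempt count could be generated like this:

```python
def format_ladder(t_min, n_replica, step=1, per_line=13):
    """Return an evenly spaced temperature list as control-file text.

    Lines are joined with a trailing backslash, as in the listing above.
    """
    temps = [str(t_min + i * step) for i in range(n_replica)]
    rows = [" ".join(temps[i:i + per_line])
            for i in range(0, len(temps), per_line)]
    return " \\\n".join(rows)


ladder = format_ladder(300, 32)
print("parameters(1) = " + ladder)

# Exchange attempts in the production run: nsteps / exchange_period
n_attempts = 1500000 // 1000
print("exchange attempts:", n_attempts)
```

An evenly spaced ladder is simply what this tutorial uses; other spacings are possible but are not covered here.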
The control file for remd_convert (Inp_remd_conv) is shown below. In the [INPUT] section, we specify
the output files (dcdfile and remfile) from the REMD simulation. In the [SELECTION] section, we
define a group (group1) containing the heavy atoms of ALAD, which are used for fitting. In the [FITTING] section,
the fitting method is set; in this case, translations (TR) and rotations (ROT) are allowed.
In the [OPTION] section, the type of conversion is set to convert_type=PARAMETER, and the
indices of the parameters to be converted are set to convert_ids=1 17 32. The output period of the dcdfile
trajectory and the output format are set to dcd_md_period=1000 and trjout_format=DCD,
respectively. The selection of output atoms is set to trjout_atom=1, and the periodic boundary correction is set to pbc_correct=MOLECULE.
[INPUT]
psffile = ../1_setup/ala.psf
reffile = ../1_setup/ala.pdb
dcdfile = ../4_production/remd_run{}.dcd
remfile = ../4_production/remd_run{}.rem

[OUTPUT]
pdbfile = ./remd_run_param.pdb       # PDB file
trjfile = ./remd_run_param{}.trj     # trajectory file

[SELECTION]
group1
mole_name1

[FITTING]
fitting_method  = TR+ROT     # method
fitting_atom    = 1          # atom group
zrot_ngrid      = 10         # number of z-rot grids
zrot_grid_size  = 1.0        # z-rot grid size

[OPTION]
check_only      = NO         # (YES/NO)
convert_type    = PARAMETER  # (REPLICA/PARAMETER)
convert_ids     = 1 17 32
dcd_md_period   = 1000
trjout_format   = DCD
trjout_type     = COOR
trjout_atom     = 1
pbc_correct     = MOLECULE
To examine (1) the time series of replica exchange, we use the standard output file of the REMD simulation to extract time series of ReplicaID:
$
We can observe random walks in replica space at parameterID=1 (300 K), 17 (316 K) and 32 (331
K), colored red, green and blue, respectively.
To examine (2) the time series of temperature exchange of three arbitrarily chosen replicas (replica id = 1,
17 and 32), we use the standard output file of the REMD simulation to extract the time series of Parameter:
$ grep "Parameter :" ./remd_run.param | awk '/Parameter :/{print $3, $19, $34}' > ./remd_run.param
$ gnuplot
gnuplot> set xlabel "Time (ns)"
gnuplot> set ylabel "Temperature [K]"
gnuplot> plot "remd_run.param" u ($0*0.002):1 w l lw 2 lt 1 title "replicaID=1",\
              "remd_run.param" u ($0*0.002):2 w l lw 2 lt 2 title "replicaID=17",\
              "remd_run.param" u ($0*0.002):3 w l lw 2 lt 3 title "replicaID=32"
To examine (3) the time series of the total potential energy of three arbitrarily chosen replicas (replica id = 1,
17 and 32), we use the logfile of the REMD simulation to extract the time series of POTENTIAL_ENE:
$ grep INFO: remd_run01.log | tail -n +2 | awk '{print $5}' > ./remd_run01.ene
$ grep INFO: remd_run17.log | tail -n +2 | awk '{print $5}' > ./remd_run17.ene
$ grep INFO: remd_run32.log | tail -n +2 | awk '{print $5}' > ./remd_run32.ene
$ gnuplot
gnuplot> set xlabel "Time (ns)"
gnuplot> set ylabel "Potential energy (kcal/mol)"
gnuplot> plot "remd_run01.ene" u ($0*0.002):1 w l lw 1.5 lt 1 title "replicaID=1",\
              "remd_run17.ene" u ($0*0.002):1 w l lw 1.5 lt 1 title "replicaID=17",\
              "remd_run32.ene" u ($0*0.002):1 w l lw 1.5 lt 1 title "replicaID=32"
We can observe random walks in temperature and potential energy spaces at replicaID=1, 17 and
32, colored by red, green and blue, respectively. These results indicate that the REMD simulation
performed satisfactorily.
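The grep/awk extraction of the Parameter columns and the `$0*0.002` factor used on the gnuplot time axis can also be written out explicitly. The following sketch is an illustrative Python equivalent, not a GENESIS tool. It assumes that each record looks like `Parameter : v1 v2 ... vN` with whitespace-separated values in replica order, which is what the awk field numbers ($3 for replica 1, $19 for replica 17) imply. The function names are hypothetical.

```python
def parameter_series(lines, replica_ids, label="Parameter"):
    """Collect the values of selected replicas from 'Parameter : ...' records.

    Assumed record layout: label, a separate ':' field, then one value per
    replica, so replica r sits in awk field r + 2.
    """
    series = []
    for line in lines:
        fields = line.split()
        if len(fields) > 2 and fields[0] == label and fields[1] == ":":
            values = fields[2:]
            series.append([float(values[r - 1]) for r in replica_ids])
    return series


def exchange_times_ns(n_rows, exchange_period=1000, timestep_ps=0.002):
    """Time of each record in ns: row index * exchange_period * timestep."""
    return [i * exchange_period * timestep_ps / 1000.0 for i in range(n_rows)]


# Synthetic three-replica example (not real GENESIS output)
log_lines = [
    "  Parameter :   300.00  301.00  302.00",
    "  RepIDtoParmID :  1  2  3",
    "  Parameter :   301.00  300.00  302.00",
]
rows = parameter_series(log_lines, replica_ids=[1, 3])
times = exchange_times_ns(len(rows))
print(list(zip(times, rows)))
```

With exchange_period = 1000 and a 0.002 ps time step, consecutive records are 2 ps (0.002 ns) apart, which is where the factor in the gnuplot commands comes from.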
Finally, we plot the PMF (potential of mean force) surface as a function of two torsion angles: CLP-NL-CA-CRP
(φ) and NL-CA-CRP-NR (ψ).
#
$
#
$
The control file for trj_analysis (Inp_remd_torsion) is shown below. In the [OPTION] section, we specify
the four atoms that define each torsion angle. Each atom is specified by its segment name, residue number,
residue name, and atom name.
[INPUT]
psffile = ../1_setup/alad.psf
reffile = ../1_setup/alad.pdb

[OUTPUT]
torfile = ./remd_run_param01.tor    # torsion file

[TRAJECTORY]
trjfile1        = ./remd_run_param01.trj   # trajectory file
md_step1        = 1500000                  # number of MD steps
mdout_period1   = 1000                     # MD output period
ana_period1     = 1                        # analysis period
repeat1         = 1
trj_format      = DCD                      # (PDB/DCD)
trj_type        = COOR                     # (COOR/COOR+BOX)
trj_natom       = 10                       # number of atoms in trajectory file

[OPTION]
check_only      = NO                       # (YES/NO)
torsion1        = PROT:1:ALAD:CLP PROT:1:ALAD:NL \
                  PROT:1:ALAD:CA  PROT:1:ALAD:CRP
torsion2        = PROT:1:ALAD:NL  PROT:1:ALAD:CA \
                  PROT:1:ALAD:CRP PROT:1:ALAD:NR
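Each torsion keyword above names four atoms, and the torsion angle is the dihedral between the plane of the first three atoms and the plane of the last three. For readers who want to see the geometry, here is a minimal, self-contained sketch of a signed dihedral calculation. It is a textbook formula, not the trj_analysis implementation, and the sign convention of the GENESIS output is not verified here.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees for four points p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [c / norm for c in b1]              # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond
    v = [b0[i] - _dot(b0, b1) * b1[i] for i in range(3)]
    w = [b2[i] - _dot(b2, b1) * b1[i] for i in range(3)]
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Idealized geometries with the central bond along z
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
gauche = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, trans, gauche)
```

For torsion1, the four points would be the coordinates of CLP, NL, CA, and CRP in a given trajectory frame.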
CHAPTER
SEVENTEEN
2. To choose umbrella potentials, a user needs to find appropriate reaction coordinates describing the
conformational changes of the system. ALAD is a common model for backbone conformational
analyses, so we use dihedral angles as the reaction coordinates.
=
=
=
=
=
reffile = ../1_setup/ala.pdb
[OUTPUT]
logfile = reus_run_eq{}.log
dcdfile = reus_run_eq{}.dcd
rstfile = reus_run_eq{}.rst
remfile = reus_run_eq{}.rem

[REMD]
dimension       = 2
exchange_period = 0
type(1)         = TEMPERATURE
nreplica(1)     = 4
parameters(1)   = 300 301 302 303
cyclic_params(1)= NO
type(2)         = RESTRAINT
nreplica(2)     = 4
parameters(2)   = (1.0 144.0) (1.1 146.0) (1.2 148.0) (1.3 150.0)
cyclic_params(2)= NO
rest_function(2)= 1

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 7.5     # switch distance
cutoffdist      = 8.0     # cutoff distance
pairlistdist    = 9.0     # pair-list cutoff distance
table_order     = 1       # order of lookup table
pme_ngrid_x     = 64      # grid size_x in [PME]
pme_ngrid_y     = 64      # grid size_y in [PME]
pme_ngrid_z     = 64      # grid size_z in [PME]

[DYNAMICS]
integrator      = LEAP    # [LEAP,VVER]
nsteps          = 10000   # number of MD steps
timestep        = 0.002   # time step (ps)
eneout_period   = 50      # energy output period
crdout_period   = 50      # coordinates output period
nbupdate_period = 5       # nonbond update period
rstout_period   = 10000   # restart output period

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NVT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat
temperature     = 300.0     # initial and target temperature (K)

[BOUNDARY]
type            = PBC     # [NOBC,PBC]
box_size_x      = 64.0    # box size (x) in [PBC]
box_size_y      = 64.0    # box size (y) in [PBC]
box_size_z      = 64.0    # box size (z) in [PBC]

[SELECTION]
group1          = an:NL   # restraint group 1
group2          = an:CA   # restraint group 2
group3          = an:CRP  # restraint group 3
group4          = an:NR   # restraint group 4

[RESTRAINTS]
nfunctions      = 1        # number of functions
function1       = DIHED    # restraint function type
constant1       = 1.0      # force constant
reference1      = 144.0    # reference
select_index1   = 1 2 3 4  # restraint groups
After equilibrating the system, we move on to the production simulation of REUS. The following commands perform 6-nanosecond REUS simulations in NVT ensemble:
# perform production simulation of REUS with ATDYN by submitting a batch job script
$ qsub reus_run.sh
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/alad.psf                   # protein structure file
rstfile = ./reus_run_eq{}.rst                   # restart file
pdbfile = ../1_setup/alad.pdb                   # PDB file

[OUTPUT]
logfile = reus_run{}.log
dcdfile = reus_run{}.dcd
rstfile = reus_run{}.rst
remfile = reus_run{}.rem

[REMD]
dimension       = 2
exchange_period = 1000
type(1)         = TEMPERATURE
nreplica(1)     = 4
parameters(1)   = 300 301 302 303
cyclic_params(1)= NO
type(2)         = RESTRAINT
nreplica(2)     = 4
parameters(2)   = (1.0 144.0) (1.1 146.0) (1.2 148.0) (1.3 150.0)
cyclic_params(2)= NO
rest_function(2)= 1

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 7.5     # switch distance
cutoffdist      = 8.0     # cutoff distance
pairlistdist    = 9.0     # pair-list distance
table_order     = 1       # order of lookup table
pme_ngrid_x     = 64      # grid size_x in [PME]
pme_ngrid_y     = 64      # grid size_y in [PME]
pme_ngrid_z     = 64      # grid size_z in [PME]

[DYNAMICS]
integrator      = LEAP     # [LEAP,VVER]
nsteps          = 3000000  # number of MD steps
timestep        = 0.002    # time step (ps)
eneout_period   = 1000     # energy output period
crdout_period   = 1000     # coordinates output period
rstout_period   = 3000000  # restart output period
nbupdate_period = 5        # nonbond update period

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NVT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat
temperature     = 300.0     # initial and target temperature (K)

[BOUNDARY]
type            = PBC     # [NOBC,PBC]
box_size_x      = 64.0    # box size (x) in [PBC]
box_size_y      = 64.0    # box size (y) in [PBC]
box_size_z      = 64.0    # box size (z) in [PBC]

[SELECTION]
group1          = an:NL   # restraint group 1
group2          = an:CA   # restraint group 2
group3          = an:CRP  # restraint group 3
group4          = an:NR   # restraint group 4

[RESTRAINTS]
nfunctions      = 1        # number of functions
function1       = DIHED    # restraint function type
constant1       = 1.0      # force constant
reference1      = 144.0    # reference
select_index1   = 1 2 3 4  # restraint groups
The control file for remd_convert (Inp_reus_conv) is shown below. In [INPUT] section, we specify
the output files (dcdfile and remfile) from the REUS simulation. In [SELECTION] section, we
define a group (group1) of the heavy atoms of ALAD which are fitted. In the [FITTING] section, the
fitting method is set. In this case, the translations TR and rotations ROT are allowed for the fitting.
In the [OPTION] section, the type of conversion is set to convert_type=PARAMETER, and the indices of the type to be converted are set to convert_ids=1 9 16. Period of the dcdfile trajectory
and the output format are set to dcd_md_period=1000 and trjout_format=DCD, respectively.
The selection of output atoms is set to trjout_atom=1, and a periodic boundary correction is set to
pbc_correct=MOLECULE.
[INPUT]
psffile = ../1_setup/ala.psf
reffile = ../1_setup/ala.pdb
dcdfile = ../4_production/reus_run{}.dcd
remfile = ../4_production/reus_run{}.rem

[OUTPUT]
pdbfile = ./reus_run_param.pdb       # PDB file
trjfile = ./reus_run_param{}.trj     # trajectory file

[SELECTION]
group1
mole_name1

[FITTING]
fitting_method  = TR+ROT     # method
fitting_atom    = 1          # atom group
zrot_ngrid      = 10         # number of z-rot grids
zrot_grid_size  = 1.0        # z-rot grid size

[OPTION]
check_only      = NO         # (YES/NO)
convert_type    = PARAMETER  # (REPLICA/PARAMETER)
convert_ids     = 1 9 16
dcd_md_period   = 1000
trjout_format   = DCD
trjout_type     = COOR
trjout_atom     = 1
pbc_correct     = MOLECULE
To examine (1) the time series of replica exchange, we use the standard output file of the REUS simulation to extract time series of ReplicaID:
$
We can observe random walks in replica space at parameterID=1 (300 K, (1.0 144.0)),
parameterID=9 (302 K, (1.0 144.0)) and parameterID=16 (303 K, (1.3 150.0)), colored red,
green and blue, respectively.
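The three parameter IDs quoted above are consistent with a 4 x 4 grid in which the restraint parameter varies fastest and the temperature slowest. The sketch below reproduces that mapping. The ordering is an assumption inferred only from those three examples, not from the GENESIS documentation, and the function name `parameter_set` is hypothetical.

```python
from itertools import product

# Values from parameters(1) and parameters(2) in the [REMD] section
temperatures = [300, 301, 302, 303]
restraints = [(1.0, 144.0), (1.1, 146.0), (1.2, 148.0), (1.3, 150.0)]

# Assumed ordering: temperature is the slow index, restraint the fast index
grid = list(product(temperatures, restraints))


def parameter_set(parameter_id):
    """Return (temperature, (force constant, reference)) for a 1-based ID."""
    return grid[parameter_id - 1]


for pid in (1, 9, 16):
    print(pid, parameter_set(pid))
```

If the real ordering differs, only the argument order passed to `product` would need to change.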
To examine (2) the time series of parameter exchange of three arbitrarily chosen replicas (replica id = 1,
9 and 16), we use the standard output file of the REUS simulation to extract the time series of ParameterID:
$ grep "ParameterID :" ./remd_run.param | awk '/ParameterID :/{print $2, $10, $17}' > ./reus_run.param
$ gnuplot
gnuplot> set xlabel "Time (ns)"
gnuplot> set ylabel "Parameter ID"
gnuplot> plot "reus_run.param" u ($0*0.002):1 w l lw 2 lt 1 title "replicaID=1",\
              "reus_run.param" u ($0*0.002):2 w l lw 2 lt 2 title "replicaID=9",\
              "reus_run.param" u ($0*0.002):3 w l lw 2 lt 3 title "replicaID=16"
To examine (3) the time series of the total potential energy of three arbitrarily chosen replicas (replica id = 1,
9 and 16), we use the logfile of the REUS simulation to extract the time series of POTENTIAL_ENE:
$ grep INFO: reus_run01.log | tail -n +2 | awk '{print $5}' > ./reus_run01.ene
$ grep INFO: reus_run09.log | tail -n +2 | awk '{print $5}' > ./reus_run09.ene
$ grep INFO: reus_run16.log | tail -n +2 | awk '{print $5}' > ./reus_run16.ene
$ gnuplot
gnuplot> set xlabel "Time (ns)"
gnuplot> set ylabel "Potential energy (kcal/mol)"
gnuplot> plot "reus_run01.ene" u ($0*0.002):1 w l lw 1.5 lt 1 title "replicaID=1",\
              "reus_run09.ene" u ($0*0.002):1 w l lw 1.5 lt 1 title "replicaID=9",\
              "reus_run16.ene" u ($0*0.002):1 w l lw 1.5 lt 1 title "replicaID=16"
We can observe random walks in temperature and potential energy spaces at replicaID=1, 9 and 16,
colored by red, green and blue, respectively. These results indicate that the REUS simulation performed
satisfactorily.
To plot the PMF (potential of mean force) surface as a function of two torsion angles, CLP-NL-CA-CRP (φ) and
NL-CA-CRP-NR (ψ), at 300 K with the restraint bias removed, WHAM (Weighted Histogram Analysis
Method) could be applied to the REUS trajectory. However, WHAM is beyond the scope of this tutorial.
CHAPTER
EIGHTEEN
TUTORIAL 5: PARALLEL
INPUT/OUTPUT SCHEME
If a simulation system is huge and a large number of processors is available, gathering data to update
the restart and trajectory files takes a significant amount of time. It is therefore preferable for each process
to write its local information separately, rather than having a single process write everything. In GENESIS,
the parallel input/output (I/O) scheme is available to deal effectively with large restart or trajectory files.
Note: The parallel I/O scheme is available in SPDYN only.
In this tutorial, we explain an example of the parallel I/O with the BPTI protein used in Tutorial 1: Building
and Simulating BPTI in Water. The simulation procedures are identical, except that we use the parallel I/O,
with the number of input files equal to the number of MPI processes.
The input files of this tutorial are at tutorial/bpti_parallel_io/ of the GENESIS package. In
all cases, we use 8 MPI processes.
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ionize.psf                 # protein structure file
pdbfile = ../1_setup/ionize.pdb                 # PDB file
reffile = ../1_setup/ionize.pdb                 # reference for restraints

[OUTPUT]
dcdfile = run.dcd
rstfile = run.rst                               # restart file

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 10.0    # switch distance
cutoffdist      = 12.0    # cutoff distance
pairlistdist    = 13.5    # pair-list cutoff distance
table           = YES     # usage of lookup table
table_order     = 0       # order of lookup table
pme_ngrid_x     = 72      # grid size_x in [PME]
pme_ngrid_y     = 80      # grid size_y in [PME]
pme_ngrid_z     = 72      # grid size_z in [PME]

[MINIMIZE]
nsteps          = 1000    # number of steps
eneout_period   = 100     # energy output period
crdout_period   = 100     # coordinates output period
rstout_period   = 1000    # restart output period

[BOUNDARY]
type            = PBC      # [PBC,NOBC]
box_size_x      = 70.8250  # box size (x) in [PBC]
box_size_y      = 83.2579  # box size (y) in [PBC]
box_size_z      = 69.0930  # box size (z) in [PBC]

[SELECTION]
group1          = backbone

[RESTRAINTS]
nfunctions      = 1       # number of functions
function1       = POSI    # restraint function type
constant1       = 10.0    # force constant
select_index1   = 1       # restraint group
The control file for generating the parallel restart files is very similar to the one above, with a few changes. First, in the [OUTPUT] section,
the parallel restart file for minimization should be specified instead of the output restart file. Second, you do not
need the [MINIMIZE] section (you can simply leave the same information as written in the control file of the
minimization). Third, you should specify the number of domains in each direction or the total number
of domains (which is identical to the number of MPI processes).
Below is the input (setup.inp) that generates multiple restart files for minimization. Because it is difficult
to distinguish the necessary keywords from the unnecessary ones, we wrote all the keywords exactly as in
the control file for minimization. Only the [OUTPUT] and [BOUNDARY] sections need to be changed.
In the [BOUNDARY] section, you need to set the number of sub-domains in each direction using
domain_x, domain_y, and domain_z, or the total number of sub-domains using domain_xyz. In
this example, we set the number of sub-domains in each direction:
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ionize.psf                 # protein structure file
pdbfile = ../1_setup/ionize.pdb                 # PDB file
reffile = ../1_setup/ionize.pdb                 # reference for restraints

[OUTPUT]
rstfile   = setup().rst
cachepath = ./cache

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 10.0    # switch distance
cutoffdist      = 12.0    # cutoff distance
pairlistdist    = 13.5    # pair-list cutoff distance
table_order     = 0       # order of lookup table
pme_ngrid_x     = 72      # grid size_x in [PME]
pme_ngrid_y     = 80      # grid size_y in [PME]
pme_ngrid_z     = 72      # grid size_z in [PME]

[MINIMIZE]
nsteps          = 1000    # number of steps
eneout_period   = 100     # energy output period
crdout_period   = 100     # coordinates output period
rstout_period   = 1000    # restart output period

[BOUNDARY]
type            = PBC      # [PBC,NOBC]
box_size_x      = 70.8250  # box size (x) in [PBC]
box_size_y      = 83.2579  # box size (y) in [PBC]
box_size_z      = 69.0930  # box size (z) in [PBC]
domain_x        = 2
domain_y        = 2
domain_z        = 2

[SELECTION]
group1          = backbone

[RESTRAINTS]
nfunctions      = 1       # number of functions
function1       = POSI    # restraint function type
constant1       = 10.0    # force constant
select_index1   = 1       # restraint group
After executing these commands, you generate 8 restart files named setup0.rst ~ setup7.rst.
These files will be used instead of PDB and PSF files for minimization.
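The count of eight restart files follows from the domain decomposition (2 x 2 x 2 sub-domains, one per MPI process), and the `()` in the file name is replaced by the process index. The sketch below only illustrates that bookkeeping. The helper name `rank_files` is hypothetical, and the numbering from 0 to 7 is taken from the file names quoted above.

```python
def rank_files(pattern, domain_x, domain_y, domain_z):
    """Expand a '()' file-name pattern into one name per sub-domain.

    Assumes one file per MPI process, numbered from 0.
    """
    n_domains = domain_x * domain_y * domain_z
    return [pattern.replace("()", str(rank)) for rank in range(n_domains)]


files = rank_files("setup().rst", 2, 2, 2)
print(len(files), files[0], files[-1])

# The number of MPI processes must match the number of sub-domains
n_mpi = 8
assert len(files) == n_mpi, "domain_x * domain_y * domain_z must equal the MPI process count"
```

Changing any of the domain counts changes the number of files, so the restart files have to be regenerated whenever the process layout changes.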
[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 10.0    # switch distance
cutoffdist      = 12.0    # cutoff distance
pairlistdist    = 13.5    # pair-list cutoff distance
table_order     = 0       # order of lookup table
pme_ngrid_x     = 72      # grid size_x in [PME]
pme_ngrid_y     = 80      # grid size_y in [PME]
pme_ngrid_z     = 72      # grid size_z in [PME]

[MINIMIZE]
nsteps          = 1000    # number of steps
eneout_period   = 100     # energy output period
crdout_period   = 100     # coordinates output period
rstout_period   = 1000    # restart output period

[BOUNDARY]
type            = PBC      # [PBC,NOBC]
box_size_x      = 70.8250  # box size (x) in [PBC]
box_size_y      = 83.2579  # box size (y) in [PBC]
box_size_z      = 69.0930  # box size (z) in [PBC]
domain_x        = 2
domain_y        = 2
domain_z        = 2

[SELECTION]
group1          = backbone

[RESTRAINTS]
nfunctions      = 1       # number of functions
function1       = POSI    # restraint function type
constant1       = 10.0    # force constant
select_index1   = 1       # restraint group
The main difference between this and the control input of minimization without the parallel I/O lies in the
keywords for input and output files. If there are multiple files, one for each MPI process, ()
should be added to the file name as above.
You can perform minimization with the following command:
The outputs with the parallel I/O are identical to those without it.
[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 10.0    # switch distance
cutoffdist      = 12.0    # cutoff distance
pairlistdist    = 13.5    # pair-list cutoff distance
table_order     = 1       # order of lookup table
pme_ngrid_x     = 72      # grid size_x in [PME]
pme_ngrid_y     = 80      # grid size_y in [PME]
pme_ngrid_z     = 72      # grid size_z in [PME]

[DYNAMICS]
integrator      = LEAP    # [LEAP,VVER]
nsteps          = 5000    # number of MD steps
timestep        = 0.002   # time step (ps)
eneout_period   = 50      # energy output period
crdout_period   = 50      # coordinates output period
rstout_period   = 5000    # restart output period
annealing       = YES     # simulated annealing
anneal_period   = 50      # annealing period
dtemperature    = 3       # temperature change at annealing (K)

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NVT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat
temperature     = 0.1       # initial temperature (K)

[BOUNDARY]
type            = PBC     # [PBC,NOBC]
domain_x        = 2
domain_y        = 2
domain_z        = 2

[SELECTION]
group1          = backbone

[RESTRAINTS]
nfunctions      = 1       # number of functions
function1       = POSI    # restraint function type
constant1       = 10.0    # force constant
select_index1   = 1       # restraint group
However, if you use this control input, you get the following error message:
Pio_Check_Compatible>
Pio_Check_Compatible>
Pio_Check_Compatible>
Pio_Check_Compatible>
This is due to the [CONSTRAINTS] section, which does not exist in the previous control input file. Unlike
restart files without the parallel I/O, the parallel restart files include the potential function information
(that is why PSF files are not necessary for the parallel I/O). Therefore, any change to the potential functions requires
consistent restart files to be regenerated.
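Before launching a run, it can help to see which sections were added relative to the control file that produced the restart files. The sketch below is only a user-side aid under simple assumptions (section headers sit alone on a line, in upper case). It does not reproduce the check that GENESIS performs internally, and the function name `section_names` is hypothetical.

```python
import re


def section_names(control_text):
    """Return the set of [SECTION] headers found at the start of a line."""
    return set(re.findall(r"^\[([A-Z_]+)\]\s*$", control_text, flags=re.MULTILINE))


# Section headers of the two control files discussed above (bodies omitted)
minimization = (
    "[INPUT]\n[OUTPUT]\n[ENERGY]\n[MINIMIZE]\n"
    "[BOUNDARY]\n[SELECTION]\n[RESTRAINTS]\n"
)
heating = (
    "[INPUT]\n[OUTPUT]\n[ENERGY]\n[DYNAMICS]\n[CONSTRAINTS]\n"
    "[ENSEMBLE]\n[BOUNDARY]\n[SELECTION]\n[RESTRAINTS]\n"
)

added = sorted(section_names(heating) - section_names(minimization))
print("sections added for heating:", added)
```

A newly added [CONSTRAINTS] section, as here, is a hint that the parallel restart files should be regenerated with prst_setup before the run.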
The procedure for regenerating the restart files is very similar to the case of minimization. The control input
to generate the restart files is as follows:
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
psffile = ../1_setup/ionize.psf                 # protein structure file
pdbfile = ../1_setup/ionize.pdb                 # PDB file
reffile = ../1_setup/ionize.pdb                 # reference for restraints
rstfile = ../2_minimization/run().rst           # restart file

[OUTPUT]
rstfile   = setup().rst
cachepath = ./cache

[OUTPUT]
dcdfile = run().dcd
rstfile = run().rst

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 10.0    # switch distance
cutoffdist      = 12.0    # cutoff distance
pairlistdist    = 13.5    # pair-list cutoff distance
table_order     = 1       # order of lookup table
pme_ngrid_x     = 72      # grid size_x in [PME]
pme_ngrid_y     = 80      # grid size_y in [PME]
pme_ngrid_z     = 72      # grid size_z in [PME]

[DYNAMICS]

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NVT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat

[BOUNDARY]
type            = PBC      # [PBC,NOBC]
box_size_x      = 70.8250  # box size (x) in [PBC]
box_size_y      = 83.2579  # box size (y) in [PBC]
box_size_z      = 69.0930  # box size (z) in [PBC]
domain_x        = 2
domain_y        = 2
domain_z        = 2

[SELECTION]
group1          = backbone

[RESTRAINTS]
nfunctions      = 1       # number of functions
function1       = POSI    # restraint function type
constant1       = 10.0    # force constant
select_index1   = 1       # restraint group
In this case, the restart files generated after minimization are given in the [INPUT] section. Here, we generated
restart files without setting the number of sub-domains in each direction, but setting the total number of
processes only with the keyword domain_xyz.
After running prst_setup setup.inp | tee setup.out, the restart files for the heat-up simulation are regenerated. Except for the [INPUT], [OUTPUT], and [BOUNDARY] sections, the control file for the heat-up run is unchanged:
rstfile = run().rst                             # restart file

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 10.0    # switch distance
cutoffdist      = 12.0    # cutoff distance
pairlistdist    = 13.5    # pair-list cutoff distance
table_order     = 1       # order of lookup table
pme_ngrid_x     = 72      # grid size_x in [PME]
pme_ngrid_y     = 80      # grid size_y in [PME]
pme_ngrid_z     = 72      # grid size_z in [PME]

[DYNAMICS]
integrator      = LEAP    # [LEAP,VVER]
nsteps          = 5000    # number of MD steps
timestep        = 0.002   # time step (ps)
eneout_period   = 50      # energy output period
crdout_period   = 50      # coordinates output period
rstout_period   = 5000    # restart output period
annealing       = YES     # simulated annealing
anneal_period   = 50      # annealing period
dtemperature    = 3       # temperature change at annealing (K)

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NVT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat
temperature     = 0.1       # initial temperature (K)

[BOUNDARY]
type            = PBC     # [PBC,NOBC]
domain_x        = 2
domain_y        = 2
domain_z        = 2

[SELECTION]
group1          = backbone

[RESTRAINTS]
nfunctions      = 1       # number of functions
function1       = POSI    # restraint function type
constant1       = 10.0    # force constant
select_index1   = 1       # restraint group
By running mpirun -np 8 spdyn run.inp | tee run.out, the heat-up with
restraints on the protein can be performed. Unlike the control file in the previous section on minimization,
some information in the [BOUNDARY] section is missing. This is because the information of
the [BOUNDARY] section is already saved in the parallel restart files, setup0.rst~setup7.rst.
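The annealing keywords in the heat-up control file determine the temperature reached at the end of the run. Assuming that the target temperature is raised by dtemperature every anneal_period steps (an interpretation of the keyword comments, not a statement about the SPDYN implementation), the final temperature can be estimated as follows. The function name is hypothetical.

```python
def annealed_temperature(t_start, nsteps, anneal_period, dtemperature):
    """Estimate the final target temperature of a simulated-annealing run.

    Assumption: the target temperature increases by dtemperature once
    every anneal_period steps.
    """
    n_updates = nsteps // anneal_period
    return t_start + n_updates * dtemperature


# Values from the heat-up control file: 0.1 K start, 5000 steps,
# anneal_period = 50, dtemperature = 3 K
final_t = annealed_temperature(0.1, 5000, 50, 3.0)
print(f"{final_t:.1f} K")
```

With these values the run performs 100 temperature updates of 3 K each, ending close to the 300 K used in the subsequent equilibration.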
18.3 Equilibration
To do equilibration, the following command is necessary:
In this step, there is a change in the [ENSEMBLE] section (from NVT to NPT). Thus, we need to regenerate
the RST files according to the change:
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf   # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm   # parameter file
rstfile = ../3_heating/run().rst                # restart file

[OUTPUT]
rstfile = setup().rst                           # restart file

[ENERGY]
electrostatic   = PME     # [CUTOFF,PME]
switchdist      = 10.0    # switch distance
cutoffdist      = 12.0    # cutoff distance
pairlistdist    = 13.5    # pair-list cutoff distance
pme_ngrid_x     = 72      # grid size_x in [PME]
pme_ngrid_y     = 80      # grid size_y in [PME]
pme_ngrid_z     = 72      # grid size_z in [PME]

[DYNAMICS]
integrator      = LEAP    # [LEAP,VVER]
nsteps          = 5000    # number of MD steps
timestep        = 0.002   # time step (ps)
eneout_period   = 50      # energy output period
crdout_period   = 50      # coordinates output period
rstout_period   = 5000    # restart output period

[CONSTRAINTS]
rigid_bond      = YES

[ENSEMBLE]
ensemble        = NPT       # [NVE,NVT,NPT]
tpcontrol       = LANGEVIN  # thermostat and barostat

[BOUNDARY]
type            = PBC     # [PBC,NOBC]
domain_x        = 2
domain_y        = 2
domain_z        = 2
Running prst_setup with this input generates multiple restart files. After running prst_setup
setup.inp | tee setup.out, the parallel I/O files are regenerated, and these are used for the
equilibration. In this case, we set the number of sub-domains to 2 in each dimension.
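Before regenerating the restart files, it can help to confirm that the control file really requests the NPT ensemble and the intended number of sub-domains. The following is a minimal sketch using Python's standard configparser module. It assumes that the control file is simple enough to be read as an INI-style file (bracketed sections, key = value pairs, # comments), which holds for the listings in this tutorial but is not guaranteed for every GENESIS control file. The function name read_control and the embedded text are illustrative only.

```python
import configparser

# Excerpt of a control file in the layout used in this tutorial
CONTROL_TEXT = """
[ENSEMBLE]
ensemble  = NPT       # [NVE,NVT,NPT]
tpcontrol = LANGEVIN  # thermostat and barostat

[BOUNDARY]
type     = PBC        # [PBC,NOBC]
domain_x = 2
domain_y = 2
domain_z = 2
"""


def read_control(text):
    """Parse INI-style control text into {section: {key: value}}."""
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#",),  # strip trailing '# ...' comments
        interpolation=None,              # treat '%' in values literally
    )
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


control = read_control(CONTROL_TEXT)

# Total number of sub-domains requested in [BOUNDARY]
n_domains = 1
for axis in ("x", "y", "z"):
    n_domains *= int(control["BOUNDARY"]["domain_" + axis])

print(control["ENSEMBLE"]["ensemble"], n_domains)
```

Such a check is a convenience outside GENESIS itself; prst_setup and spdyn read the control file with their own parser.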
The control file for the simulation run is as follows:
[INPUT]
topfile = ../1_setup/top_all27_prot_lipid.rtf # topology file
parfile = ../1_setup/par_all27_prot_lipid.prm # parameter file
rstfile = ./setup().rst                       # restart file

[OUTPUT]
dcdfile = run().dcd
rstfile = run().rst

[ENERGY]
electrostatic = PME       # [CUTOFF,PME]
switchdist    = 10.0      # switch distance
cutoffdist    = 12.0      # cutoff distance
pairlistdist  = 13.5      # pair-list cutoff distance
pme_ngrid_x   = 72        # grid size_x in [PME]
pme_ngrid_y   = 80        # grid size_y in [PME]
pme_ngrid_z   = 72        # grid size_z in [PME]

[DYNAMICS]
integrator    = LEAP      # [LEAP,VVER]
nsteps        = 5000      # number of MD steps
timestep      = 0.002     # time step (ps)
eneout_period = 50        # energy output period
crdout_period = 50        # coordinates output period
rstout_period = 5000      # restart output period

[CONSTRAINTS]
rigid_bond    = YES

[ENSEMBLE]
ensemble      = NPT       # [NVE,NVT,NPT]
tpcontrol     = LANGEVIN  # barostat and thermostat
temperature   = 300.0     # initial and target temperature (K)
pressure      = 1.0       # target pressure (atm)

[BOUNDARY]
type          = PBC       # [PBC,NOBC]
As in the previous heat-up simulation, we omit the simulation box size from the [BOUNDARY]
section, because this information is already stored in the restart files.
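The [DYNAMICS] settings above fix the length of the equilibration and the amount of output. The arithmetic can be sketched as follows. The function name dynamics_summary is hypothetical, and the output counts assume one record per full output period, without counting a possible record at step 0, which this tutorial does not specify.

```python
def dynamics_summary(nsteps, timestep_ps, eneout_period, crdout_period, rstout_period):
    """Summarize run length and output counts implied by [DYNAMICS] values.

    Assumption: one record is written each time a full period has elapsed;
    whether an extra record is written at step 0 is not considered here.
    """
    return {
        "length_ps": nsteps * timestep_ps,
        "energy_records": nsteps // eneout_period,
        "trajectory_frames": nsteps // crdout_period,
        "restart_writes": nsteps // rstout_period,
    }


# Values taken from the equilibration control file in this section
summary = dynamics_summary(5000, 0.002, 50, 50, 5000)
for name, value in summary.items():
    print(name, value)
```

With these values the equilibration covers 5000 x 0.002 ps = 10 ps, and under the stated assumption it stores 100 energy records, 100 coordinate frames per trajectory file, and one restart write at the final step.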
rstfile = run().rst                           # restart file
As with the production simulation, the control file for parallel I/O is almost identical to the one
without it, except for the [TRAJECTORY] section. First, we add () to trjfile1 because there are
multiple trajectory files. Second, COOR+BOX is not necessary because the box size information is
already written to each trajectory file. The control file for the RMSD calculation is shown below.
[INPUT]
psffile = ../1_setup/ionize.psf
reffile = ../1_setup/ionize.pdb

[OUTPUT]
rmsfile = run.rms                            # RMSD file

[TRAJECTORY]
trjfile1       = ../5_production/run().dcd   # trajectory file
md_step1       = 500000                      # number of MD steps
mdout_period1  = 500                         # MD output period
ana_period1    = 500                         # analysis period

[SELECTION]
group1

[FITTING]
fitting_method = TR+ROT                      # method
fitting_atom   = 001                         # atom group

[OPTION]
check_only     = NO                          # (YES/NO)
With this control file, we obtain the same RMSD values as in Tutorial 1: Building and Simulating BPTI in
Water.
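The number of snapshots that enter the RMSD calculation follows from the [TRAJECTORY] values. The sketch below is an illustration under two assumptions that are not stated in this tutorial: that ana_period1 is given in MD steps, and that it must be a multiple of mdout_period1. The function name frames_to_analyze is hypothetical.

```python
def frames_to_analyze(md_step, mdout_period, ana_period):
    """Return (stored_frames, analyzed_frames) for one trajectory entry.

    Assumptions: ana_period is expressed in MD steps and is a multiple
    of mdout_period, so every (ana_period // mdout_period)-th stored
    frame is analyzed.
    """
    if ana_period % mdout_period != 0:
        raise ValueError("ana_period should be a multiple of mdout_period")
    stored = md_step // mdout_period
    stride = ana_period // mdout_period
    return stored, stored // stride


# Values from the RMSD control file above
stored, analyzed = frames_to_analyze(500000, 500, 500)
print(stored, analyzed)
```

Here both periods are 500 steps, so under these assumptions all 1000 stored snapshots are analyzed.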
BIBLIOGRAPHY
[8] A. D. MacKerell, M. Feig, and C. L. Brooks. Improved treatment of the protein backbone in empirical force fields. J. Am. Chem. Soc., 126(3):698-699, 2004.
[9] W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives. Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids. J. Am. Chem. Soc., 118(45):11225-11236, 1996.
[10] C. Oostenbrink, A. Villa, A. E. Mark, and W. F. Van Gunsteren. A biomolecular force field based on the free enthalpy of hydration and solvation: The GROMOS force-field parameter sets 53A5 and 53A6. J. Comput. Chem., 25(13):1656-1676, 2004.
[11] J. Jung, T. Mori, and Y. Sugita. Efficient lookup table using a linear function of inverse distance squared. J. Comput. Chem., 34(28):2412-2420, 2013.
[12] J. Jung, T. Mori, and Y. Sugita. Midpoint cell method for hybrid (MPI+OpenMP) parallelization of molecular dynamics simulations. J. Comput. Chem., 35(14):1064-1072, 2014.
[13] Open MPI. http://www.open-mpi.org/.
[14] CHARMM. http://www.charmm.org/.
[15] psfgen. http://www.ks.uiuc.edu/Research/vmd/plugins/psfgen/ug.pdf.
[16] NAMD. http://www.ks.uiuc.edu/Research/namd/.
[17] VMD. http://www.ks.uiuc.edu/Research/vmd/.
[18] Y. Sugita and Y. Okamoto. Replica-exchange molecular dynamics method for protein folding. Chem. Phys. Lett., 314(1-2):141-151, 1999.
[19] Y. Sugita, A. Kitao, and Y. Okamoto. Multidimensional replica-exchange method for free-energy calculations. J. Chem. Phys., 113(15):6042-6051, 2000.
[20] W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, and M. L. Klein. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys., 79(2):926-935, 1983.
[21] N. Foloppe and A. D. Mackerell. All-atom empirical force field for nucleic acids: I. parameter optimization based on small molecule and condensed phase macromolecular target data. J. Comput. Chem., 21(2):86-104, 2000.
[22] A. D. Mackerell and N. K. Banavali. All-atom empirical force field for nucleic acids: II. application to molecular dynamics simulations of DNA and RNA in solution. J. Comput. Chem., 21(2):105-120, 2000.
[23] S. D. Feller, D. X. Yin, R. W. Pastor, and A. D. Mackerell. Molecular dynamics simulation of unsaturated lipid bilayers at low hydration: parameterization and comparison with diffraction studies. Biophys. J., 73(5):2269-2279, 1997.
[24] J. B. Klauda, R. M. Venable, J. A. Freites, J. W. O'Connor, D. J. Tobias, C. Mondragon-Ramirez, I. Vorobyov, and R. W. Pastor. Update of the CHARMM all-atom additive force field for lipids: validation on six lipid types. J. Phys. Chem. B, 114(23):7830-7843, 2010.
[25] R. B. Best, X. Zhu, J. Shim, P. E. M. Lopes, J. Mittal, M. Feig, and A. D. MacKerell. Optimization of the additive CHARMM all-atom protein force field targeting improved sampling of the backbone φ, ψ and side-chain χ1 and χ2 dihedral angles. J. Chem. Theo. Comput., 8(9):3257-3273, 2012.
[26] J. Huang and A. D. MacKerell. CHARMM36 all-atom additive protein force field: Validation based on comparison to NMR data. J. Comput. Chem., 34(25):2135-2145, 2013.
[27] J. Karanicolas and C. L. Brooks III. The origins of asymmetry in the folding transition states of protein L and protein G. Protein Sci., 11(10):2351-2361, 2002.
[28] J. Karanicolas and C. L. Brooks III. Improved Go-like models demonstrate the robustness of protein folding mechanisms towards non-native interactions. J. Mol. Biol., 334(2):309-325, 2003.
[29] N. Go. Theoretical studies of protein folding. Annu. Rev. Biophys. Bioeng., 12:183-210, 1983.
[30] P. J. Steinbach and B. R. Brooks. New Spherical-Cutoff Methods for Long-Range Forces in Macromolecular Simulation. J. Comput. Chem., 15(7):667-683, 1994.
[31] L. Verlet. Computer Experiments on Classical Fluids. I. Thermodynamical Properties of Lennard-Jones Molecules. Phys. Rev., 159(1):98-103, 1967.
[32] T. Darden, D. York, and L. Pedersen. Particle mesh Ewald: An Nlog(N) method for Ewald sums in large systems. J. Chem. Phys., 98(12):10089-10092, 1993.
[33] U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee, and L. G. Pedersen. A smooth particle mesh Ewald method. J. Chem. Phys., 103(19):8577-8593, 1995.
[34] D. Takahashi. FFTE: A Fast Fourier Transform Package. http://www.ffte.jp/.
[35] FFTW. http://www.fftw.org/.
[36] L. Nilsson. Efficient Table Lookup Without Inverse Square Roots for Calculation of Pair Wise Atomic Interactions in Classical Simulations. J. Comput. Chem., 30(9):1490-1498, 2009.
[37] J. P. Ryckaert, G. Ciccotti, and H. J. C. Berendsen. Numerical Integration of Cartesian Equations of Motion of a System with Constraints - Molecular Dynamics of N-Alkanes. J. Comput. Phys., 23(3):327-341, 1977.
[38] H. C. Andersen. Rattle - a Velocity Version of the Shake Algorithm for Molecular Dynamics Calculations. J. Comput. Phys., 52(1):24-34, 1983.
[39] S. Miyamoto and P. A. Kollman. Settle - an Analytical Version of the Shake and Rattle Algorithm for Rigid Water Models. J. Comput. Chem., 13(8):952-962, 1992.
[40] S. A. Adelman and J. D. Doll. Generalized Langevin Equation Approach for Atom-Solid-Surface Scattering - General Formulation for Classical Scattering Off Harmonic Solids. J. Chem. Phys., 64(6):2375-2388, 1976.
[41] D. Quigley and M. I. J. Probert. Langevin dynamics in constant pressure extended systems. J. Chem. Phys., 120(24):11432-11441, 2004.
[42] A. Mitsutake, Y. Sugita, and Y. Okamoto. Generalized-ensemble algorithms for molecular simulations of biopolymers. Biopolymers, 60(2):96-123, 2001.
[43] J. A. McCammon, B. R. Gelin, and M. Karplus. Dynamics of folded proteins. Nature, 267(5612):585-590, 1977.
[44] D. E. Shaw, P. Maragakis, K. Lindorff-Larsen, S. Piana, R. O. Dror, M. P. Eastwood, J. A. Bank, J. M. Jumper, J. K. Salmon, Y. Shan, and W. Wriggers. Atomic-level characterization of the structural dynamics of proteins. Science, 330(6002):341-346, 2010.
[45] gnuplot. http://www.gnuplot.info/.
[46] RCSB Protein Data Bank. http://www.rcsb.org/.
[47] solvate plugin. http://www.ks.uiuc.edu/Research/vmd/plugins/solvate/.
[48] autoionize plugin. http://www.ks.uiuc.edu/Research/vmd/plugins/autoionize/.