GCEM

The document describes extensions made to the MEAD molecular electrostatics software. The extensions allow for modeling of biological macromolecules including membranes with transmembrane potentials. Additional extensions enable visualization of electrostatic potentials, charge distributions, ion distributions, and dielectric distributions. An example is shown of the transmembrane potential across an ammonia transporter protein embedded in a lipid bilayer membrane.

1 Purpose of MEAD and our extensions

This page describes our extensions to the molecular electrostatics software suite MEAD. The
original version of MEAD was written by Donald Bashford and can be found at http://www.stjuderesearch.org/site/
MEAD consists of a library of C++ objects and some applications that use these objects
for modeling electrostatic properties of molecules. MEAD is an acronym for macroscopic
electrostatics with atomic detail. That is, macroscopic continuum electrostatics is applied at a
molecular scale, partitioning the system in regions with different dielectric constants (molecule,
solvent, membrane, ...). Solvent regions can contain mobile ions. The electrostatic potential
is computed according to the linearized Poisson-Boltzmann equation. Despite its simplicity,
this model can be used to compute molecular properties with very good accuracy (pKa values,
binding constants, electrostatic solvation energies, ...).
Our extensions allow a detailed modeling of (biological) macromolecules including the
possibility to account for a membrane environment with an electrostatic trans-membrane potential.
Further extensions allow the visualization of electrostatic potentials, charge distributions,
electrolyte distributions and dielectric distributions. The data can be written as three-dimensional
volumetric data in OpenDX format, or within a cut plane or along a line in ASCII format for
plotting with your favorite software. Figure 1 below shows the trans-membrane potential
across the ammonia transporter Amt-1 from Archaeoglobus fulgidus as an example application
for the visualization extensions (structure: Andrade, Susana L. A.; Dickmanns, Antje; Ficner,
Ralph and Einsle, Oliver, 2005, PNAS, 102, 14994-14999).
• A lipid membrane can be represented by three dielectric slabs that model the hydrophobic,
ion-inaccessible core region and the hydrophilic headgroup regions that can be penetrated
by mobile ions. Regions within the membrane boundaries that are not part of the protein
or the membrane can be specified (e.g., water filled protein cavities or a channel through
the membrane).
• The protein interior can comprise different dielectric regions that can be used, for example,
to distinguish protein regions that are modeled classically from regions that are modeled
with a quantum chemical approach. In addition, there can be regions that model protein
cavities or channels.
• The solvent phase(s) are modeled by separate dielectric regions that can also contain
mobile ions.

2 Documentation
The documentation for the original MEAD package is the file README found in the root
directory of the distribution. Below, you can find the information from this README file and
information about our changes and additional application programs.
The MEAD library is best explored by looking at the source code. The best starting point
may be to look at some of the simpler programs like potential or solvate and then one of
the solver programs (for instance my_xyz_solver). Don't start with multiflex or gcem,
because these programs and the objects used by them are too complex for a start. A brief outline
of the design of the MEAD library can be found in [?].

Figure 1: The trans-membrane potential distribution across the ammonium transporter Amt-1.
The extracellular side is shown at the top and the intracellular side at the bottom. The dark
outer contour denotes a projection of the solvent accessible surface of a transporter trimer into
a plane perpendicular to the membrane. The lighter inner contour shows a projection of a slice
of Amt-1 of 5 Å thickness into the same projection plane. The slice shows the trans-membrane
pore including putative ammonia positions and the twin-histidine motif determined by X-ray
crystallography. a) The trans-membrane potential is plotted in a slice plane cutting through the
transporter’s trans-membrane pore. The potential is projected into a plane perpendicular to the
membrane, while the slice plane is slightly tilted relative to the membrane normal to follow
the course of the trans-membrane pore. The values at the white contours denote the fraction
of the trans-membrane potential at the respective coordinate. It can be seen that the membrane
potential distribution within the protein does not show a simple linear dependency on the z
coordinate. b) The mobile source charge distribution of the trans-membrane potential in the
same slice and projection planes as in a). Darker red or blue shades denote higher negative
or positive charge density, respectively. It can be seen that most of the unbalanced charge is
concentrated close to the membrane and in the depressions of the protein surface.

The calculation of binding properties with a continuum electrostatics/molecular mechanics
model is described in a short version on the page about GMCT and GCEM. A more detailed
description is found in the user manual of GCEM (found in the directory doc of the distribution)
and the corresponding paper. See [?] for a review of titration calculations with continuum
electrostatics within a classical two-state model.

3 General information on MEAD usage
Here, you find information on general options and input files common to all MEAD programs.

3.1 Program Usage


Program settings are in most cases specified with command line options. Input files are
described below. The last command line arguments specify the name(s) of the molecule(s). The
program is called with:
program_dir/program [options] [molname(s)]
where molname(s) are the names of the molecule(s) for which the calculation will be done.
molname(s) are used as prefix for some input files.

3.2 Program Options and Their Defaults


• -epsin float: relative dielectric constant of a molecular interior region (no default
value; 4.0 is recommended)

• -epsin[1-3] float: for multidielectric usage

• -epsext float (80): dielectric constant of an exterior/solvent region (in potential,
my_xyz_solver)

• -epssol float (80): dielectric constant of an exterior/solvent region (multiflex
and solvate)

• -epsvac float (1): dielectric constant of vacuum (solvate)

• -T float (298.15): absolute temperature in K. ATTENTION: this number
only influences the ion distribution, not the dielectric constant. Calculated
temperature effects are not realistic!

• -solrad float (1.4): solvent probe radius for determining a solvent accessible
surface / the solvent inaccessible volume

• -sterln float (2.0): thickness of an ion-exclusion layer (Stern layer) for
determining the ion-inaccessible volume

• -ionicstr float (0.0): ionic strength in mol/l (The physiological ionic strength
is about 0.1 to 0.2 mol/l. Using a very low ionic strength of 0.01 or even 0.0 mol/l heavily
affects the results!)

• -kBolt float (5.984e-06): Boltzmann constant in units of e₀²/(Å K) (elementary
charges squared per Ångström and Kelvin). This constant must be adjusted if other
units of charge, length or temperature are used.

• -econv float (332.063202): conversion factor from e₀²/Å (elementary charges
squared per Ångström) to kcal/mol. This constant must be adjusted if units of
energy and electrostatic potentials other than kcal/mol and kcal/(mol e₀) are needed in
the output.

• -conconv float (6.022214e-04): conversion factor from mol/l (moles per liter)
to 1/Å³ (particles per cubic Å). This constant must be adjusted if other units of length or
concentration are used.

• -epssave_oldway: flag that triggers the old style of averaging the dielectric constant
between two potential lattice points with differing dielectric constants in the finite
difference method. The new way involves inverse averaging and is similar to
a proposal by McCammon. The old way is a simple mean. The new way is significantly
more accurate; the option to do it the old way is only provided for the sake of reproducing
old results. It is not recommended otherwise.

• -converge_oldway: Revert to the old way of testing for convergence of the successive
over-relaxation (SOR) method of solving the finite difference representation of the
linearized Poisson-Boltzmann equation. The new method gives improved long range
accuracy for large lattices, but at a sometimes substantial computational cost. See the
NEWS file for further discussion.

• -blab[1-3]: flag that controls the verbosity of the programs while writing to stdout.
Specifying no blab flag at all is least verbose; in that case, only essential output is
written to stdout. Writing to output files is not affected.
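
The default values of -kBolt, -econv and -conconv can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of MEAD: the function names are hypothetical, and the Debye length formula (kappa² = 8π I / (ε k_B T) in units where the elementary charge is 1) is the standard textbook expression, used here only to illustrate how the unit constants fit together.

```python
import math

# Default values of the MEAD options quoted in the list above.
K_BOLT = 5.984e-06       # -kBolt: Boltzmann constant in e^2/(Å K)
E_CONV = 332.063202      # -econv: converts e^2/Å to kcal/mol
CON_CONV = 6.022214e-04  # -conconv: converts mol/l to particles/Å^3


def thermal_energy_kcal(temperature=298.15):
    """Return k_B*T in kcal/mol (hypothetical helper, not a MEAD function)."""
    return K_BOLT * temperature * E_CONV


def debye_length(ionic_strength, eps_solvent=80.0, temperature=298.15):
    """Return the Debye screening length in Å for an ionic strength in mol/l.

    Uses kappa^2 = 8*pi*I / (eps * k_B*T), with I in particles/Å^3 and
    k_B*T in e^2/Å (elementary charge set to 1).
    """
    if ionic_strength <= 0.0:
        return math.inf  # no mobile ions, hence no screening
    conc = ionic_strength * CON_CONV
    kappa_sq = 8.0 * math.pi * conc / (eps_solvent * K_BOLT * temperature)
    return 1.0 / math.sqrt(kappa_sq)


if __name__ == "__main__":
    print(f"k_B*T at 298.15 K: {thermal_energy_kcal():.4f} kcal/mol")
    print(f"Debye length at 0.1 mol/l: {debye_length(0.1):.2f} Å")
```

With the defaults, k_B*T comes out close to 0.59 kcal/mol and the screening length at 0.1 mol/l is a little under 10 Å, which illustrates why an ionic strength of 0.0 mol/l (no screening at all) changes the results so strongly.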

visualization options (my_xyz_solver):

• -write_pot: flag that triggers the output of the electrostatic potential

• -write_rho: flag that triggers the output of the charge distribution

• -write_eps: flag that triggers the output of the dielectric constant distribution

• -write_ely: flag that triggers the output of the ionic strength distribution

• -x float: x coordinate of the grid center used for the OpenDX volumetric data file.

• -y float: y coordinate of the grid center used for the OpenDX volumetric data file.

• -z float: z coordinate of the grid center used for the OpenDX volumetric data file.

• -space float: grid spacing used for the OpenDX volumetric data file.

• -count int [int int]: edge length of the grid used for the OpenDX volumetric
data output in grid points. If one number is specified, all edges have the same length. If
three numbers are specified, they correspond to the edge lengths in x, y and z direction.
For the output of the electrostatic potential, the grid must be entirely covered by the grid
used in the calculation of the electrostatic potential (specified in the .ogm or .mgm file).

• -gmt 7 floats 1 int [6 more floats]: Specifying this option triggers
the output of the quantities requested by write_pot, write_rho, write_eps and write_ely as
two-dimensional curves for plotting with the GMT program suite or other plotting software.
The arguments are xcenter, ycenter, zcenter, xnormal, ynormal, znormal, grid spacing,
number of grid points [xcenter2, ycenter2, zcenter2, xnormal2, ynormal2, znormal2].
The first three numbers define the x, y and z coordinates of the grid/plane center. The
next three numbers define the components of the normal vector of the plane. The seventh
number is the grid spacing. The following integer defines the edge length of the curve in
grid points. Optionally, six more values can be specified to define the base point and the
normal vector of a second plane. In this case, the values of the requested functions are
calculated in this second plane and then projected onto the first plane. This feature can,
for example, be useful for a trans-membrane pore that is not perfectly normal to the
membrane plane, as shown in Figure 1.
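
The requirement stated for -count, that the OpenDX output grid must lie entirely inside the grid used for the potential calculation, can be checked before a run. The sketch below is only an illustration with hypothetical helper names. It assumes that grid points sit on the lattice nodes, so that a grid with n points per edge and spacing h spans (n - 1) * h, centered on the grid center; MEAD's own convention may differ in detail.

```python
def grid_bounds(center, spacing, counts):
    """Return (lower, upper) corners of a grid given its center, spacing and
    number of grid points per edge (assumes points on the lattice nodes)."""
    half = [0.5 * (n - 1) * spacing for n in counts]
    lower = tuple(c - h for c, h in zip(center, half))
    upper = tuple(c + h for c, h in zip(center, half))
    return lower, upper


def is_covered(out_grid, calc_grid, tol=1e-9):
    """Return True if out_grid lies entirely within calc_grid.

    Each grid is a (center, spacing, counts) tuple.
    """
    out_lo, out_hi = grid_bounds(*out_grid)
    calc_lo, calc_hi = grid_bounds(*calc_grid)
    return all(
        cl - tol <= ol and oh <= ch + tol
        for ol, oh, cl, ch in zip(out_lo, out_hi, calc_lo, calc_hi)
    )


if __name__ == "__main__":
    # Illustrative values only: an 81^3 calculation grid with 1.0 Å spacing.
    calc = ((0.0, 0.0, 0.0), 1.0, (81, 81, 81))
    # Output grid as it might be given by -x/-y/-z, -space and -count.
    out = ((0.0, 0.0, 0.0), 0.5, (101, 101, 61))
    print(is_covered(out, calc))
```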

3.3 Input Files


• name.pqr:
contains a molecule structure in a format similar to that of a PDB file but with atomic
partial charges and radii in the occupancy and B factor columns, respectively. More
specifically, lines beginning with either "ATOM" or "HETATM" (no leading spaces) are
interpreted as a set of tokens separated by one or more spaces or TAB characters. Note
that the .pqr format does not support some PDB features such as altLoc fields and a one
character chainID between resName and resSeq. Supporting them would break the whitespace
separated tokens convention that allows for easy processing with perl scripts, etc. Instead,
we optionally allow for additional integers at the end of the line to specify global
conformation, chain, residue and instance, where an instance is a certain form of a site or
residue. If you have a PDB file and need to generate a PQR file, this implies making some
choices about the charges and radii. This is similar to making a choice about what force field
to use in an MD simulation. MEAD per se doesn't make the choice for you. However,
in the utilities subdir are some tools that may be useful if you want to use CHARMM
or PARSE parameters. The Amber program suite comes with a program, ambpdb, that
allows you to generate a PDB or PQR file given Amber format files. Another option is
the program pdb2pqr. Format:
ATOM/HETATM atnum atname resname resnum x y z charge radius {confid chainid siteid instid}
...

The contents of the first two columns are ignored. The last four columns are optional
(used, e.g., by GCEM). atname is the atom name. resname is the residue name. resnum
is the residue number. x,y,z are the x, y, and z coordinates of the atom (floating point
numbers). charge is the atomic partial charge of the atom. radius is the atom radius.
confid, chainid, siteid and instid are integer numbers that identify the global conformation,
the polymer chain, the site and the instance to which the atom belongs, respectively.
• name.ogm and name.mgm:
define the cubic grids for the computation of electrostatic potentials with the finite
difference method. Format:

grid_center_1 grid_spacing_1 grid_dimension_1
...
grid_center_N grid_spacing_N grid_dimension_N

grid_center is the grid center and can be given as three floating point numbers
specifying the coordinates directly or as centering style. Optionally, the coordinate values
can be enclosed by parentheses and separated by commas. For the centering style, three
options are available.

– ON_ORIGIN specifies that the grid center is placed on the origin of the coordinate
system (0.0, 0.0, 0.0).
– ON_GEOM_CENT specifies that the grid center is to be placed on the geometric
center of the molecule / receptor or site denoted by name.
– ON_CENT_OF_INTEREST specifies that the grid center is to be placed on the
geometric center of a site of interest.

grid_spacing is the spacing between two grid points in Å. grid_dimension is the edge
length of the cubic grid in grid points and must be an odd integer. Each grid must be
smaller than the previous grid, and normally it will also have a smaller grid spacing.

• name.fpt:
is a file containing the coordinates at which the electrostatic potential shall be
calculated (traditional format) or a file containing the atomic partial charges and their
coordinates (extended format I for the application programs named my_xyz_solver)
or a file containing the atomic partial charges and their coordinates for each instance of
each site found in a receptor (extended format II for the application programs named
my_xyz_solver)

– traditional format (for command line option -ProteinField/-ReactionField):

coordinate_1
...
coordinate_N

where coordinate consists of three floating point values that denote the x, y and
z coordinates of a point, respectively. Optionally, the coordinate values can be
enclosed by parentheses and separated by commas. The line breaks can be substituted
with any number of whitespace characters.
– extended format I (for command line option -pf):

coordinate_1 charge_1
...
coordinate_N charge_N

coordinate consists of three floating point values that denote the x, y and z
coordinates of a point, respectively. Optionally, the coordinate values can be enclosed
by parentheses and separated by commas. charge is the atomic partial charge at
the preceding coordinate. The line breaks can be substituted with any number of
whitespace characters.
– extended format II (for command line option -fpt):

site_of_1 instance_of_1 coordinate_1 charge_1
...
site_of_N instance_of_N coordinate_N charge_N

site and instance denote the site and instance to whose charge distribution the
following atomic partial charge belongs. coordinate consists of three floating
point values that denote the x, y and z coordinates of a point, respectively. Optionally,
the coordinate values can be enclosed by parentheses and separated by commas.
charge is the atomic partial charge at the preceding coordinate. The line breaks
can be substituted with any number of whitespace characters.

• name.potat:
This is a binary file produced by some programs to avoid costly recalculation of
electrostatic potentials. It contains the potential at each atom of name generated by some set
of charges. Variations of name may denote charge states, sites, conformers, solvated or
vacuum or uniform dielectric environments, depending on the application. Atomic
coordinates and radii and the generating charges are also included for the sake of consistency
checking. These files allow multiflex, etc. to avoid unnecessary recalculations when
all you want to do is add or alter some site, but you must be careful about which .potat
files you keep. Specify the -blab2 flag for a blow-by-blow account of attempts to read and
write .potat files in multiflex.
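
The whitespace-separated token layout of the .pqr format described above is easy to process in a script. The following is a minimal sketch of a reader for single lines, not the parser used by MEAD: the class and function names are hypothetical, and it assumes that resnum and the four optional trailing fields are plain integers.

```python
from typing import NamedTuple, Optional, Tuple


class PqrAtom(NamedTuple):
    atname: str
    resname: str
    resnum: int
    xyz: Tuple[float, float, float]
    charge: float
    radius: float
    ids: Tuple[int, ...]  # optional confid, chainid, siteid, instid


def parse_pqr_line(line: str) -> Optional[PqrAtom]:
    """Parse one .pqr line; return None for lines that are not atom records."""
    # Only lines starting with ATOM or HETATM (no leading spaces) are used.
    if not line.startswith(("ATOM", "HETATM")):
        return None
    tokens = line.split()  # splits on any run of spaces or TABs
    if len(tokens) < 10:
        raise ValueError(f"too few tokens in .pqr line: {line!r}")
    # tokens[0] (record name) and tokens[1] (atom number) are ignored.
    atname, resname = tokens[2], tokens[3]
    resnum = int(tokens[4])
    x, y, z, charge, radius = (float(t) for t in tokens[5:10])
    ids = tuple(int(t) for t in tokens[10:14])
    return PqrAtom(atname, resname, resnum, (x, y, z), charge, radius, ids)


if __name__ == "__main__":
    # Made-up example line, not taken from a real structure.
    print(parse_pqr_line("ATOM 1 N ALA 7 1.0 -2.5 3.25 -0.3 1.85 0 0 5 1"))
```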

4 GCEM – Generalized Continuum Electrostatic Model
GCEM is a program for the automated preparation of the necessary input for GMCT from
a continuum electrostatics/molecular mechanics model. Details about the applications and the
underlying model of GCEM can be found in a separate documentation. Examples for the usage
of the program can be found in the directory examples/gcem.

4.1 Program Usage


GCEM computes the energy terms used in the microstate energy function of GMCT using a
continuum electrostatics/molecular mechanics model. GCEM has a number of new features as
compared to multiflex and is based on a generalized formulation of binding theory, which offers
a wider application range.
GCEM considers one global conformation at a time. That is, a separate calculation needs to
be set up for each global conformation.
Program settings are specified with command line options. Input files are described below.
The last command line argument is the name of the molecule. The program is called with:
program_dir/gcem options molname
where molname is the name of the molecule/receptor. molname is used as prefix for some
input files.
Each GCEM calculation consists of at least two program runs. The preprocessing run
generates sidechain rotamers, calculates molecular mechanics energy terms, writes a restart file for
the postprocessing run and sets up the necessary input for the continuum electrostatics
calculations that are done by independent MEAD programs (my_xyz_solver). This approach
allows a very efficient and simple parallelization without communication overhead (see the Perl
scripts rst-gcem-xyz.pl provided with the examples). The postprocessing can
also be run several times, alternating with recomputation of the electrostatic energies, to
eliminate energetically unfavorable conformers. Thereby, the continuum electrostatics calculations
are refined, which decreases the error introduced by the inflation of the low dielectric
molecule interior by the high-energy conformers.
The intended purpose of the interior dielectric regions eps1set, eps2set and eps3set
was for modeling a quantum chemically treated region, a classically treated region and a region
of solvent filled cavities or pores inside the molecular structure that are to be excluded from the
membrane dielectric, but they might as well be used for other purposes. The interior regions
are inaccessible to mobile ions. The intended purpose of the region elycavset was to model ion
accessible cavities or depressions in the protein surface reaching into the membrane boundaries
(e.g. gorges leading to a channel entrance and trans-membrane pores with large diameter).
A membrane can be modeled by a three-layer dielectric slab representing the polar,
ion-accessible headgroup regions (optional) and the apolar, ion-inaccessible core region of the
membrane. The membrane is requested with membz.
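
The three-layer slab can be pictured as a simple classification of the z coordinate. The sketch below is an illustration only, with a hypothetical function name; it takes the four membz boundaries in the order in which the option expects them, ignores the protein and all cavity regions, and treats the slab boundaries as inclusive, which is an assumption.

```python
def membrane_region(z, z_lower_core, z_upper_core, z_lower_head, z_upper_head):
    """Classify a z coordinate as 'core', 'head' or 'solvent'.

    The headgroup layers enclose the core: the boundaries must satisfy
    z_lower_head <= z_lower_core < z_upper_core <= z_upper_head.
    """
    if not (z_lower_head <= z_lower_core < z_upper_core <= z_upper_head):
        raise ValueError("inconsistent membrane boundaries")
    if z_lower_core <= z <= z_upper_core:
        return "core"     # apolar, ion-inaccessible
    if z_lower_head <= z <= z_upper_head:
        return "head"     # polar, ion-accessible
    return "solvent"


if __name__ == "__main__":
    # Illustrative boundaries in Å: a 30 Å core with 5 Å headgroup layers.
    bounds = (-15.0, 15.0, -20.0, 20.0)
    for z in (0.0, 17.5, -17.5, 25.0):
        print(z, membrane_region(z, *bounds))
```

Setting the head boundaries equal to the core boundaries leaves no headgroup layer, so every point is either in the core or in the solvent.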

4.2 Program Options and Their Defaults
• -grid_detail int (2): This option influences the detail of the auto-generated
grids for the computation of the electrostatic potentials with the finite difference method.
A higher value will result in larger and finer grids. The default value of 2 will in most
cases suffice for high quality results; sometimes a value of 3 can be advantageous (especially
for membrane proteins, where the apolar environment leads to more far-reaching
electrostatic interactions).

0 coarse, not advisable for production
1 economic, similar to the setting in the old multiflex examples, can result in significant
discretization errors
2 high, somewhat finer and larger grids than for 1, solvation energies and interaction
energies should be largely converged with respect to grid spacing
3 very high, even finer and larger innermost grids, ensures very high-quality solvation
and interaction energies also for long-range interactions in membrane proteins
4 ultra high, extremely fine grids, should not result in significant numeric differences
of the result relative to 3, needs very much memory and computation time

If user supplied grids exist, the setting affects the automatic adjustment of the innermost
grid’s size to the size of each site. A higher value will result in a larger spacing of the site
to the grid boundary.

• -epsext float: Floating point number that defines the dielectric constant of the
solvent region and the region specified by elycavset (if applicable).

• -epshead float: Floating point number that defines the dielectric constant of the
membrane head group region (if applicable / membz is defined).

• -epscore float: Floating point number that defines the dielectric constant of the
membrane core region (if applicable / membz is defined).

• -epsin1 float: Floating point number that defines the dielectric constant of the
region specified by eps1set (if applicable).

• -epsin2 float: Floating point number that defines the dielectric constant of the
region specified by eps2set (if applicable).

• -epsin3 float: Floating point number that defines the dielectric constant of the
region specified by eps3set (if applicable).

• -eps_ff float: Floating point number that defines the dielectric constant of the
reference environment used in force field parametrization.

• -eps1set string: String giving the prefix of a .pqr file that defines the dielectric
region 1 with the dielectric constant epsin1 (if applicable).

• -eps2set string: String giving the prefix of a .pqr file that defines the dielectric
region 2 with the dielectric constant epsin2 (if applicable).

• -eps3set string: String giving the prefix of a .pqr file that defines the dielectric
region 3 with the dielectric constant epsin3 (if applicable).

• -elycavset string: String giving the prefix of a .pqr file that defines an
exterior region with the dielectric constant epsext and ionic strength defined by ionicstr or
ionicstr1 and ionicstr2 according to the membrane side (if applicable).

• -membz z_lower_core z_upper_core z_lower_head z_upper_head: Boundaries
of a three-layer dielectric slab representing the polar, ion-accessible headgroup
regions and the apolar, ion-inaccessible core region of the membrane (if applicable).
The membrane is perpendicular to the z-direction, hence the components of its normal
vector are given by (0,0,1). The headgroup region extends from z=z_lower_head to
z=z_upper_head excluding the core region. The core region extends from z=z_lower_core
to z=z_upper_core. The headgroup region can be omitted by
setting z_lower_core = z_lower_head and z_upper_core = z_upper_head. The ionic strength
on the membrane sides can be set to equal values by specifying ionicstr or set to distinct
values by specifying ionicstr1 and ionicstr2. The ionic strength within the boundaries
of the core region (e.g., in a channel pore defined by membhole or elycavset) is linearly
interpolated between ionicstr1 and ionicstr2.

• -ionicstr float: Ionic strength in the exterior region(s) (by default in mol/l).

• -ionicstr1 float: Ionic strength in the upper exterior region(s) (z > z_upper_core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr2 must also be specified.

• -ionicstr2 float: Ionic strength in the lower exterior region(s) (z < z_lower_core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr1 must also be specified.

• -membhole radius cent_x cent_y: Exclude a cylindrical hole from the
membrane. The radius of the hole is given by radius. The optional arguments cent_x and
cent_y specify the x and y coordinates of the cylinder center, respectively (default values
0.0 and 0.0). This option seldom works satisfactorily for real proteins with non-cylindrical
shape. The use of elycavset and/or one of the interior dielectric regions, for example
eps3set, is to be preferred.

• -inside (1): This option defines which membrane side is “inside” (cytoplasmic) in
the electrophysiological sense, where the membrane potential is measured at the inside
relative to the outside. A value of 1 states that the lower membrane side is inside, while a
value of 0 states that the upper side is inside.

• -capacitance: This flag triggers the calculation of the capacitance of the system.
The capacitance is the accumulated charge in Ampere seconds (= Coulomb) per membrane
potential in Volt. Hence, the capacitance is given by default in Farad (F = As/V).
The calculation of the capacitance requires tighter convergence criteria for the electrostatic
potentials, which increases the required computation time. The capacitive energy
is only needed if there are multiple global conformations. The difference between the
capacitive energy terms of different global conformations is negligibly small under
normal conditions of naturally occurring values of the membrane potential, but depends
quadratically on the membrane potential. Caution: The capacitance and hence the capacitive
energy depend on the system size (mainly on the area of the covered membrane region).
Therefore, the same grid size must be used for the innermost grid in the calculation of the
capacitance for all global conformations (normally automatically taken care of).

• -charmm_par string: Name of a CHARMM parameter file containing the CHARMM
force field parameters. Examples are provided in the subdirectories of
examples/gcem. The file is not read if there are only non-flexible sites (according to
the definitions in molname.rot) and skip_stat_mmg and skip_stat_mmint are given on the
command line.

• -charmm_top string: Name of a CHARMM topology file containing the residue
topology definitions. Double bonds must be specified as such for correct automatic
identification of the sidechain torsions. Examples are provided in the subdirectories of
examples/gcem. The file is not read if there are only non-flexible sites (according to
the definitions in molname.rot) and skip_stat_mmg and skip_stat_mmint are given on the
command line.

• -rotlib string (rotlib): Prefix of the name of a file containing the Squirell
backbone dependent rotamer library (tested with 2002 and 2010 versions). First, it is
attempted to read rotlibname.dat as binary boost archive. If the binary file is not found, it
is attempted to read rotlibname.txt. The file is not read if there are only non-flexible sites
(according to the definitions in molname.rot).

• -skip_stat_mmint: Flag that triggers the omission of molecular mechanics contributions
to the intrinsic energies of each non-flexible site (according to the specification of
the site type in molname.rot).

• -skip_stat_mmg: Flag that triggers the omission of molecular mechanics contributions
to each interaction energy that involves a non-flexible site (according to the specification
of the site type in molname.rot).

• -empirical_energy: Flag that triggers the use of an empirical energy function for
the part of the intrinsic energy that involves atoms of the site itself and nearby backbone
residues that are thought to determine the rotamer propensity in the backbone dependent
rotamer library. The empirical energy is defined as −β⁻¹ ln[p], where p is the propensity for
the corresponding sidechain rotamer in the rotamer library.

• -print_chm: Flag that triggers detailed output regarding the molecular mechanics
terms to stdout (independent of the blab level).

• -write_binary_rotlib: Flag that triggers writing of the rotamer library as binary
archive (using the Boost serialization library).
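
The treatment of the ionic strength described for membz, ionicstr1 and ionicstr2 above can be written as a piecewise function of z. The following is a minimal sketch with a hypothetical function name, not code from GCEM. It only gives the bulk value assigned to a z coordinate; whether a point is ion-accessible at all (solvent, elycavset or membhole regions) is decided separately and is not modeled here.

```python
def ionic_strength_at(z, z_lower_core, z_upper_core, ionicstr1, ionicstr2):
    """Return the ionic strength (mol/l) assigned to height z.

    ionicstr1 applies above the core (z > z_upper_core), ionicstr2 below it
    (z < z_lower_core); in between, the value is linearly interpolated.
    """
    if z_upper_core <= z_lower_core:
        raise ValueError("z_upper_core must be larger than z_lower_core")
    if z >= z_upper_core:
        return ionicstr1
    if z <= z_lower_core:
        return ionicstr2
    frac = (z - z_lower_core) / (z_upper_core - z_lower_core)
    return ionicstr2 + frac * (ionicstr1 - ionicstr2)


if __name__ == "__main__":
    # Illustrative values: 0.15 mol/l on the upper and 0.05 mol/l on the lower side.
    for z in (-30.0, -15.0, 0.0, 7.5, 15.0, 30.0):
        print(z, ionic_strength_at(z, -15.0, 15.0, 0.15, 0.05))
```

Specifying a single ionicstr corresponds to calling the function with ionicstr1 equal to ionicstr2, in which case the interpolation returns the same value everywhere.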

other program specific options:

• -gcem_dir string: Name of the subdirectory to which gcem will write input files
for the continuum electrostatics solvers and the corresponding job script(s). In addition,
the directory will contain some deprecated files that are no longer used but may be helpful
in debugging while making changes to the source code.

• -mead_path string: Path to the root directory of the MEAD distribution used for
the calls to the continuum electrostatics solvers in the job script(s). The option is useful
if the solver jobs are run on a different system, for example a remote computer cluster.

• -filter_ctof float (1e99): Intrinsic energy cutoff (in kcal/mol) for eliminating
conformers with excessively high intrinsic energy, which makes them unlikely to be
populated significantly in equilibrium. A conformer (for example a sidechain rotamer)
consisting of N instances (for example different binding forms of the sidechain) is eliminated
if the minimum intrinsic energy of any instance of the conformer is larger than the minimum
intrinsic energy of all instances of the site plus filter_ctof.

• -filter_ctof_gs float (1e99): Energy cutoff (in kcal/mol) for eliminating
conformers with excessively high energy, which makes them unlikely to be populated
significantly in equilibrium. The elimination criterion is almost identical to the Goldstein
criterion of dead end elimination, with the exception that the hypothetical minimum
energy must be larger by filter_ctof_gs rather than by 0. A conformer (for example a sidechain
rotamer) consisting of N instances (for example different binding forms of the sidechain) is
eliminated if all of its instances fulfill the modified Goldstein criterion.

• -cap_max_nin int: Option that triggers the reduction of the number of instances of any
site to a number ≤ cap_max_nin. The cap_max_nin instances of a site with the
lowest sum of intrinsic energy and minimum interaction energies with all other
sites are kept. The method is equivalent to repeated application of the modified Goldstein criterion
to all conformers with successively decreasing filter_ctof_gs until the number of instances
is equal to or smaller than cap_max_nin.

• -print_debug: Flag that triggers detailed output regarding the read and generated
data to stdout (independent of the blab level). The flag is mainly intended for verification
purposes while debugging new program features.
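The intrinsic energy cutoff described for -filter_ctof can be sketched in a few lines of Python. This is only an illustration of the stated criterion, not GCEM code: the function name conformers_to_keep and the dictionary layout (keys are pairs of conformer id and instance id) are assumptions made for this example.

```python
def conformers_to_keep(intrinsic_energy, filter_ctof=1e99):
    """Return the ids of the conformers of one site that survive the cutoff.

    intrinsic_energy maps (conformer_id, instance_id) -> energy in kcal/mol.
    The data layout is hypothetical and chosen for illustration only.
    """
    # Minimum intrinsic energy over all instances of the site.
    site_minimum = min(intrinsic_energy.values())

    # Minimum intrinsic energy over the instances of each conformer.
    conformer_minimum = {}
    for (conformer_id, _instance_id), energy in intrinsic_energy.items():
        previous = conformer_minimum.get(conformer_id, float("inf"))
        conformer_minimum[conformer_id] = min(previous, energy)

    # A conformer is eliminated if its minimum exceeds site minimum + cutoff.
    return {
        conformer_id
        for conformer_id, energy in conformer_minimum.items()
        if energy <= site_minimum + filter_ctof
    }


example = {
    ("rot1", "protonated"): 0.0,
    ("rot1", "deprotonated"): 2.5,
    ("rot2", "protonated"): 12.0,
    ("rot2", "deprotonated"): 15.0,
}
kept = conformers_to_keep(example, filter_ctof=10.0)
```

With a cutoff of 10 kcal/mol only rot1 survives in this toy data set, whereas the default value of 1e99 effectively disables the filter.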

4.3 Input Files


• molname.pqr:
The receptor structure. The file format is described above under general options and input
files.

• molname.ogm and molname.mgm (optional):


define the electrostatics grids for the site in the receptor environment and for the model
compound in the reference environment (bulk solution), respectively. The file format is
described above under general options and input files.

• molname.rot:
defines the sites of molname and their types. For each site, the file contains a line of the
format:

site_label site_type

The site label is constructed from sitename-chainid-resid, where chainid and resid cor-
respond to the data found in molname.pqr (chainid set to 0 if absent in molname.pqr).
site_type is one of:

0 ignored as site
1 flexible and titratable (sitename.est required, see below). User-defined conformers
are read from molname.pqr; the flexible parts must be present for each conformer with
unique instance IDs differing from 0. For amino acid residues, additional sidechain
rotamers are generated (currently not implemented for Pro) using dihedral angles
from the Squirell backbone dependent rotamer library. For each conformer, the
different forms defined in sitename.est are generated.
2 flexible and non-titratable (sitename.est not required). User-defined conformers
are read from molname.pqr; the flexible parts must be present for each conformer with
unique instance IDs differing from 0. For amino acid residues, additional sidechain
rotamers are generated (currently not implemented for Pro) using dihedral angles
from the Squirell backbone dependent rotamer library.
3 non-flexible and titratable (sitename.est required, see below). For the conformer
found in molname.pqr, the different forms defined in sitename.est are generated.
4 quantum mechanically treated site (QM site; site_label.qst required, see below).
Conformational flexibility and orientational polarization effects are assumed to be
accounted for within the QM treatment. The model energy should also contain any energy
terms due to bound ligands and any other energy contribution apart from the interactions
with the rest of the protein. The model energy of a QM site is expected
to be completely defined in site_label.qst, where site_label is given by
sitename-chainid-resid. The coordinates and atomic partial charges of the QM site are taken
from molname.pqr or separate .pqr files as described below for site_label.qst.

• molname.con:
defines the connectivity of the sites to the membrane sides. The file is only required for
membrane proteins. For each site, the file contains a line of the format:

site_label connectivity

The site label is constructed from sitename-chainid-resid, where chainid and resid cor-
respond to the data found in molname.pqr (chainid set to 0 if absent in molname.pqr).
connectivity is one of:

0 The site binds ligands from the outer membrane side.

1 The site binds ligands from the inner membrane side.

• sitename.est:
An .est file defines forms of a titratable site. Example:

label    NTP    NTD1    NTD2    NTD3
Gmodel   0      10.881  10.881  10.881
proton   1      0       0       0
center   N
LYS CA   0.21   0.18    0.18    0.18
LYS HA   0.10   0.10    0.10    0.10
LYS N    -0.30  -0.96   -0.96   -0.96
LYS HT1  0.33   NaN     0.34    0.34
LYS HT2  0.33   0.34    NaN     0.34
LYS HT3  0.33   0.34    0.34    NaN

The keywords have the following meaning:

– label: labels for the forms (instances)


– Gmodel: model energy, (relative) chemical potentials of the forms (see also the
program manual of GMCT)
– EpsRef: specification is optional, relative dielectric constant of the reference envi-
ronment to which the Gmodel value refers
– proton, electron ...: Any keyword or line pattern not matching one of
the other entries of this list names a ligand type. The numbers that follow specify
how many ligands of this type are bound by each form.
– center: currently ignored. It was formerly used to define the atom of the model
compound to be used as center of the finite difference grids. GCEM uses the geometric
center of the model compound instead.
– remaining lines: residue name, atom name, atomic partial charges of the atoms in
each form

• site_label.qst: A .qst file defines forms of a QM site. Example:

label     FES1010  FES1111
Gmodel    -1.4232  0
EpsRef    1        1
proton    1        2
electron  1        2

– label: labels for the forms


– Gmodel: model energy, (relative) chemical potentials of the forms (see also the
program manual of GMCT)

– EpsRef: specification is optional, relative dielectric constant of the reference envi-
ronment to which the Gmodel value refers
– proton, electron ...: Any keyword not found among the preceding
keywords names a ligand type. The numbers that follow specify how many ligands
of this type are bound by each form.

Conformational flexibility and orientational polarization effects are assumed to be accounted
for within the QM treatment. The model energy should also contain any energy terms due to
bound ligands and any other energy contribution apart from the interactions with the
rest of the protein. For each instance, a structure file named site_label_instance_label.pqr
is expected to exist. The model energy of a QM site is expected to be completely defined
in extended_site_label.qst, where extended_site_label is constructed from the site label as
found in molname.rot by inserting the conformer id confid. The extended site label is
given by sitename-confid-chainid-resid. Atomic coordinates and partial
charges of atoms of the site found in any structure of the QM site are ignored if they are also
found in molname.pqr. Capping atoms and groups are assumed not to be present in the
structures; the user has to remove them during structure preparation. Bonded molecular
mechanics terms involving link atoms are assumed to be equal for all instances of the
QM site and are thus neglected. If you wish to include such energy contributions, you can
add them to the model energy. There are unavoidable ambiguities and inconsistencies in
the treatment of the link regions. Therefore, it is advisable to choose the extent of the QM
site such that any significant conformational flexibility occurs well within the QM site.
In this way, the above assumption (equal energy contributions for all instances of the QM
site due to the link region) is justified just as for the other site types.

• rotlibname.txt/dat:
Squirell backbone dependent rotamer library in ASCII format or as Boost binary archive.
The file format is described on the rotamer library website of the Dunbrack group (http://dunbrack.fccc.edu/b

• CHARMM parameter and residue topology files:


The file format is described on www.charmm.org. The automatic determination of the
amino acid sidechain torsion angles requires that double bonds are specified as such in
the residue topology file.
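The .est layout described above (one column per form, keyword lines followed by atom lines) can be read with a short script. The following Python sketch is an illustration only and not the parser used by GCEM: the function name parse_est and the returned dictionary are assumptions, and the sketch further assumes that the label line precedes all other lines. NaN entries are kept as floating point NaN without interpreting them.

```python
import math

# Shortened copy of the .est example from this section (two of the atom lines).
EST_EXAMPLE = """\
label NTP NTD1 NTD2 NTD3
Gmodel 0 10.881 10.881 10.881
proton 1 0 0 0
center N
LYS CA 0.21 0.18 0.18 0.18
LYS HT1 0.33 NaN 0.34 0.34
"""


def parse_est(text):
    """Parse .est text into a dictionary (hypothetical helper, not part of GCEM)."""
    est = {"forms": [], "gmodel": [], "eps_ref": None, "ligands": {}, "charges": {}}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        key, n_forms = fields[0], len(est["forms"])
        if key == "label":
            est["forms"] = fields[1:]
        elif key == "Gmodel":
            est["gmodel"] = [float(v) for v in fields[1:]]
        elif key == "EpsRef":
            est["eps_ref"] = [float(v) for v in fields[1:]]
        elif key == "center":
            continue  # the manual states that this keyword is currently ignored
        elif len(fields) == n_forms + 2:
            # residue name, atom name, then one partial charge per form
            est["charges"][(fields[0], fields[1])] = [float(v) for v in fields[2:]]
        elif len(fields) == n_forms + 1:
            # any other keyword names a ligand type with one count per form
            est["ligands"][key] = [int(v) for v in fields[1:]]
        else:
            raise ValueError(f"unexpected line: {line!r}")
    return est


est = parse_est(EST_EXAMPLE)
# Atoms whose charge entry for the second form (NTD1) is not NaN.
defined_in_ntd1 = [
    atom for (_res, atom), q in est["charges"].items() if not math.isnan(q[1])
]
```

The same function would also accept the .qst example, because .qst files use the same keyword lines and simply contain no atom lines.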

5 my_2diel_solver
This program computes the electrostatic potential and the corresponding electrostatic energy
terms of a site in a two-dielectric environment. Additional features enable the use of this pro-
gram for visualization purposes and as helper program for GCEM.
The program calculates the electrostatic interaction energy of sitename with backgroundname
and the Born solvation energy of sitename. The calculation of the Born solvation energy
requires a second calculation with one of the my_xyz_solver programs, which serves as
reference point and cancels the grid artifacts.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.

5.1 Program Usage


Program settings are specified with command line options. Input files are described below. The
program is called with:
program_dir/my_2diel_solver {options} sitename backgroundname
where sitename is the name of the site used as prefix for the corresponding .pqr and .ogm
files. backgroundname is the name of the background (structure containing all atoms not be-
longing to any site) and used as prefix for the corresponding .pqr file.
The interior dielectric regions eps1set and eps2set were intended for modeling
a quantum chemically treated region and a classically treated region, but they might as well
be used for other purposes. The interior regions are inaccessible to mobile ions. The
region elycavset was intended for modeling ion-accessible cavities or depressions in the protein
surface reaching into the membrane boundaries (e.g. gorges leading to a channel entrance and
trans-membrane pores with large diameter).
Examples for the program usage can be found in the subdirectories located in examples/gcem.
GCEM will create the input files and a job script job.sh using the my_xyz_solvers in the directory
gcem_dir/gcem.

5.2 Program options and their defaults


general options are described above
program specific options for the continuum electrostatics calculation:
• -epsext float: Floating point number that defines the dielectric constant of the sol-
vent region outside the solvent inaccessible volume of backgroundname.
• -epsin float: Floating point number that defines the dielectric constant of the inte-
rior of backgroundname.

• -ionicstr float: Ionic strength in the exterior region outside the solvent
inaccessible volume of backgroundname (by default in mol/l).

5.3 Input Files


• sitename.pqr, backgroundname.pqr and additional .pqr files used to define the dielectric
regions are structure files in .pqr format. The file format is described above under general
options and input files.

• sitename.mgm defines the electrostatics grids for the site in the receptor environment.
The file format is described above under general options and input files.

6 my_3diel_solver
This program computes the electrostatic potential of a site in a three-dielectric environment and
the corresponding electrostatic energy terms. Additional features enable the use of this program
for visualization purposes and as helper program for GCEM.
The program calculates the electrostatic interaction energy of sitename with backgroundname
and the Born solvation energy of sitename. The calculation of the Born solvation energy
requires a second calculation with one of the my_xyz_solver programs, which serves as
reference point and cancels the grid artifacts. If the option -fpt is specified, the
electrostatic interaction energies with other sites can be calculated.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.

6.1 Program usage


Program settings are specified with command line options. Input files are described below. The
program is called with:

program_dir/my_3diel_solver {options} sitename backgroundname

where sitename is the name of the site used as prefix for the corresponding .pqr and .ogm
files. backgroundname is the name of the background (structure containing all atoms not be-
longing to any site) and used as prefix for the corresponding .pqr file.
The interior dielectric regions eps1set and eps2set were intended for modeling
a quantum chemically treated region and a classically treated region, but they might as well
be used for other purposes. The interior regions are inaccessible to mobile ions. The
region elycavset was intended for modeling ion-accessible cavities or depressions in the protein
surface reaching into the membrane boundaries (e.g. gorges leading to a channel entrance and
trans-membrane pores with large diameter).
Examples for the program usage can be found in the subdirectories located in examples/gcem.
GCEM will create the input files and a job script job.sh using the my_xyz_solvers in the directory
gcem_dir/gcem.

6.2 Program options and their defaults


general options are described above
program specific options for the continuum electrostatics calculation:

• --fpt string: Prefix of a file fptname.fpt for the calculation of site-site interaction
energies

• --epsext float: Floating point number that defines the dielectric constant of the
solvent region and the region specified by elycavset (if applicable).

• --epsin1 float: Floating point number that defines the dielectric constant of the
region specified by eps1set (if applicable).

• --epsin2 float: Floating point number that defines the dielectric constant of the
region specified by eps2set (if applicable).

• --epshomo float: Floating point number that defines the dielectric constant of an
additional homogeneous dielectric (optional).

• --eps1set string: String giving the prefix of a .pqr file that defines the dielectric
region 1 with the dielectric constant epsin1 (if applicable).

• --eps2set string: String giving the prefix of a .pqr file that defines the dielectric
region 2 with the dielectric constant epsin2 (if applicable).

• --ionicstr float: Ionic strength in the exterior region(s) (by default in mol/l).

6.3 Input Files


• sitename.pqr, backgroundname.pqr and additional .pqr files used to define the dielectric
regions are structure files in .pqr format. The file format is described above under general
options and input files.

• sitename.ogm defines the electrostatics grids for the site in the receptor environment. The
file format is described above under general options and input files.

• fptname.fpt is a file containing the atomic partial charges and their coordinates for each
instance of each site found in a receptor.
Extended format II (for command line option -fpt):

site_of_1 instance_of_1 coordinate_1 charge_1
...
site_of_N instance_of_N coordinate_N charge_N

site and instance denote the site and instance to whose charge distribution the following
atomic partial charge belongs. coordinate consists of three floating point values that denote
the x, y and z coordinates of a point, respectively. Optionally, the coordinate values
can be enclosed in parentheses and separated by commas. charge is the atomic partial
charge at the preceding coordinate. The line breaks can be substituted with any number
of whitespace characters.
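Because the .fpt format allows optional parentheses and commas and treats line breaks like any other whitespace, it is convenient to read it as a flat token stream. The Python sketch below illustrates this under two assumptions that are not stated in this manual: site and instance identifiers contain no parentheses, commas or whitespace, and they are kept as plain strings. The function name parse_fpt is hypothetical.

```python
def parse_fpt(text):
    """Read 'site instance x y z charge' records from .fpt text.

    Hypothetical helper for illustration; not the reader used by the solvers.
    """
    # Parentheses and commas around the coordinate are optional, so they are
    # turned into whitespace before splitting.
    cleaned = text.replace("(", " ").replace(")", " ").replace(",", " ")
    tokens = cleaned.split()
    if len(tokens) % 6 != 0:
        raise ValueError("expected 6 values per charge: site instance x y z charge")

    records = []
    for start in range(0, len(tokens), 6):
        site, instance, x, y, z, charge = tokens[start:start + 6]
        records.append((site, instance, (float(x), float(y), float(z)), float(charge)))
    return records


# Three charges; the second line holds two records to show that line breaks
# are interchangeable with other whitespace.
FPT_EXAMPLE = """\
0 0 (1.0, 2.0, 3.0) -0.30
0 1 1.0 2.0 3.0 -0.96 1 0 (4.5,0.0,-1.25) 0.33
"""
records = parse_fpt(FPT_EXAMPLE)
```

Each record is returned as a tuple of site, instance, coordinate triple and charge, which keeps the order of the columns in the file.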

7 my_Ndiel_solver
This program computes the electrostatic potential of a site in an environment with an arbitrary
number of dielectric and electrolyte regions and the corresponding electrostatic energy terms.
Additional features enable the use of this program for visualization purposes and as helper
program for GCEM.
The program calculates the electrostatic interaction energy of sitename with backgroundname
and the Born solvation energy of sitename. The calculation of the Born solvation energy
requires a second calculation with one of the my_xyz_solver programs, which serves as
reference point and cancels the grid artifacts. If the option -fpt is specified, the
electrostatic interaction energies with other sites can be calculated.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.

7.1 Program usage


Program settings are specified with command line options. Input files are described below. The
program is called with:

program_dir/my_Ndiel_solver {options} sitename backgroundname

where sitename is the name of the site used as prefix for the corresponding .pqr and .ogm
files. backgroundname is the name of the background (structure containing all atoms not be-
longing to any site) and used as prefix for the corresponding .pqr file.
An example of the program usage can be found in the directory examples/my_Ndiel_solver.

7.2 Program options and their defaults


general options are described above

7.3 Input files


• sitename.pqr, backgroundname.pqr and additional .pqr files used to define the dielectric
regions are structure files in .pqr format. The file format is described above under general
options and input files.

• sitename.ogm defines the electrostatics grids for the site in the receptor environment.
The file format is described above under general options and input files.
• backgroundname.diel is a file defining the dielectric regions.
Format of a single line corresponding to a dielectric region:

pqrname eps solrad

pqrname is the prefix of a structure file that determines the ion-inaccessible volume of
the region.
eps is the relative dielectric constant of the dielectric region.
solrad is the solvent probe sphere radius used to define the dielectric region.
The priority of the regions increases from the top to the bottom of the list. Solvent inac-
cessible regions of previous entries are overridden by the solvent inaccessible regions of
following entries.
• backgroundname.ely is a file defining the electrolyte regions.
Format of a single line corresponding to an electrolyte region:

pqrname istr ionrad

pqrname is the prefix of a structure file that determines the ion-inaccessible volume of
the region.
istr is the ionic strength in the ion-accessible volume of the electrolyte region.
ionrad is the ion radius (Stern layer radius) used to define the electrolyte region.
The priority of the regions increases from the top to the bottom of the list. Ion accessi-
ble regions of previous entries are overridden by the ion-accessible regions of following
entries.
• fptname.fpt is a file containing the atomic partial charges and their coordinates for
each instance of each site found in a receptor.
Extended format II (for command line option -fpt):

site_of_1 instance_of_1 coordinate_1 charge_1
...
site_of_N instance_of_N coordinate_N charge_N

site and instance denote the site and instance to whose charge distribution the following
atomic partial charge belongs. coordinate consists of three floating point values that denote
the x, y and z coordinates of a point, respectively. Optionally, the coordinate values
can be enclosed in parentheses and separated by commas. charge is the atomic partial
charge at the preceding coordinate. The line breaks can be substituted with any number
of whitespace characters.
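The priority rule for backgroundname.diel (later entries override earlier ones) can be illustrated with a small Python sketch. Everything geometric in it is an assumption made for the example: the real program derives the region volumes from the atoms in the .pqr files and the probe radius, whereas the sketch replaces each region by a single hypothetical sphere. The exterior dielectric constant of 80 and the names parse_diel, dielectric_at and SPHERES are likewise illustrative.

```python
def parse_diel(text):
    """Read 'pqrname eps solrad' lines in file order (hypothetical helper)."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        pqrname, eps, solrad = fields
        regions.append({"pqrname": pqrname, "eps": float(eps), "solrad": float(solrad)})
    return regions


def dielectric_at(point, regions, is_inside, eps_exterior):
    """Return the dielectric constant at a point.

    Regions are visited in file order, so a later entry that contains the
    point overrides every earlier entry, as described for the .diel file.
    """
    eps = eps_exterior
    for region in regions:
        if is_inside(region["pqrname"], point):
            eps = region["eps"]
    return eps


# Stand-in geometry: one sphere (center, radius) per region instead of the
# volume computed from a .pqr structure. Purely illustrative.
SPHERES = {"protein": ((0.0, 0.0, 0.0), 10.0), "qmsite": ((0.0, 0.0, 0.0), 3.0)}


def inside_sphere(pqrname, point):
    center, radius = SPHERES[pqrname]
    return sum((p - c) ** 2 for p, c in zip(point, center)) <= radius ** 2


DIEL_EXAMPLE = "protein 4.0 1.4\nqmsite 1.0 1.4\n"
regions = parse_diel(DIEL_EXAMPLE)
eps_in_qm = dielectric_at((1.0, 0.0, 0.0), regions, inside_sphere, 80.0)
eps_in_protein = dielectric_at((5.0, 0.0, 0.0), regions, inside_sphere, 80.0)
eps_outside = dielectric_at((20.0, 0.0, 0.0), regions, inside_sphere, 80.0)
```

Swapping the two lines of the example file would let the protein entry override the QM site, which shows why the order of the entries matters.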

8 my_memb_solver
This program computes the electrostatic potential and the corresponding electrostatic energy
terms of a site in an environment that models a lipid membrane, the receptor and the solvent
phases above and below the membrane with up to 6 dielectric and 2 electrolyte regions.
Additional features enable the use of this program for visualization purposes and as helper
program for GCEM.
The program calculates the electrostatic interaction energy of sitename with backgroundname
and the Born solvation energy of sitename. The calculation of the Born solvation energy
requires a second calculation with one of the my_xyz_solver programs, which serves as
reference point and cancels the grid artifacts. If the option -fpt is specified, the
electrostatic interaction energies with other sites can be calculated.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.

8.1 Program Usage


Program settings are specified with command line options. Input files are described below. The
program is called with:

program_dir/my_memb_solver {options} sitename backgroundname

where sitename is the name of the site used as prefix for the corresponding .pqr and .ogm
files. backgroundname is the name of the background (structure containing all atoms not be-
longing to any site) and used as prefix for the corresponding .pqr file.
The intended purpose of the interior dielectric regions eps1set, eps2set and eps3set was
for modeling a quantum chemically treated region, a classically treated region and a region of
solvent filled cavities or pores inside the molecular structure that are to be excluded from the
membrane dielectric, but they might as well be used for other purposes. The interior regions
are inaccessible to mobile ions. The intended purpose of the region elycavset was to model ion
accessible cavities or depressions in the protein surface reaching into the membrane boundaries
(e.g. gorges leading to a channel entrance and trans-membrane pores with large diameter).
A membrane can be modeled by a three-layer dielectric slab representing the polar,
ion-accessible headgroup regions (optional) and the apolar, ion-inaccessible core region of the
membrane. The membrane is requested with membz.
Examples for the program usage can be found in the directory examples/gcem/bR of the
distribution. GCEM will create the input files and a job script job.sh using the my_xyz_solvers
in the directory gcem_dir/gcem.

8.2 Program options and their defaults
general options are described above
program specific options for the continuum electrostatics calculation:

• -fpt string: Prefix of a file fptname.fpt for the calculation of site-site interaction
energies

• -epsext float: Floating point number that defines the dielectric constant of the sol-
vent region and the region specified by elycavset (if applicable).

• -epshead float: Floating point number that defines the dielectric constant of the
membrane head group region (if applicable / membz is defined).

• -epscore float: Floating point number that defines the dielectric constant of the
membrane core region (if applicable / membz is defined).

• -epsin1 float: Floating point number that defines the dielectric constant of the re-
gion specified by eps1set (if applicable).

• -epsin2 float: Floating point number that defines the dielectric constant of the re-
gion specified by eps2set (if applicable).

• -epsin3 float: Floating point number that defines the dielectric constant of the re-
gion specified by eps3set (if applicable).

• -epshomo float: Floating point number that defines the dielectric constant of an
additional homogeneous dielectric (optional).

• -eps1set string: String giving the prefix of a .pqr file that defines the dielectric
region 1 with the dielectric constant epsin1 (if applicable).

• -eps2set string: String giving the prefix of a .pqr file that defines the dielectric
region 2 with the dielectric constant epsin2 (if applicable).

• -eps3set string: String giving the prefix of a .pqr file that defines the dielectric
region 3 with the dielectric constant epsin3 (if applicable).

• -elycavset string: String giving the prefix of a .pqr file that defines an
exterior region with the dielectric constant epsext and ionic strength defined by ionicstr or
ionicstr1 and ionicstr2 according to the membrane side (if applicable).

• -membz z_lower_core z_upper_core z_lower_head z_upper_head: Boundaries
of a three-layer dielectric slab representing the polar, ion-accessible headgroup
regions and the apolar, ion-inaccessible core region of the membrane (if applicable).
The membrane is perpendicular to the z-direction, hence the components of its normal
vector are given by (0,0,1). The headgroup region extends from z=z_lower_head to
z=z_upper_head excluding the core region. The core region extends from z=z_lower_core
to z=z_upper_core. The headgroup region can be omitted by
setting z_lower_core = z_lower_head and z_upper_core = z_upper_head. The ionic strength
on the membrane sides can be set to equal values by specifying ionicstr or set to distinct
values by specifying ionicstr1 and ionicstr2. The ionic strength within the boundaries
of the core region (e.g., in a channel pore defined by membhole or elycavset) is linearly
interpolated between ionicstr1 and ionicstr2.

• -ionicstr float: Ionic strength in the exterior region(s) (by default in mol/l).

• -ionicstr1 float: Ionic strength in the upper exterior region(s) (z > z_upper_core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr2 must also be specified.

• -ionicstr2 float: Ionic strength in the lower exterior region(s) (z < z_lower_core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr1 must also be specified.

• -membhole radius {cent_x cent_y}: Exclude a cylindrical hole from the membrane.
The radius of the hole is given by radius. The optional arguments cent_x and
cent_y specify the x and y coordinates of the cylinder center, respectively (default values
0.0 and 0.0). This option seldom works satisfactorily for real proteins with non-cylindrical
shape. The use of elycavset and/or one of the interior dielectric regions, for example
eps3set, is to be preferred.
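The slab geometry requested with -membz and the interpolation of the ionic strength inside the core boundaries can be sketched as follows. The sketch is in Python and rests on assumptions: the function names are hypothetical, boundary points are counted as belonging to the inner layer, and the interpolation is taken to run linearly from ionicstr2 at z_lower_core to ionicstr1 at z_upper_core, which is one plausible reading of the description above. It only makes sense for grid points that are ion-accessible (e.g. inside a pore).

```python
def membrane_region(z, z_lower_core, z_upper_core, z_lower_head, z_upper_head):
    """Classify a z coordinate as 'core', 'head' or 'solvent' (illustrative)."""
    if z_lower_core <= z <= z_upper_core:
        return "core"
    # The headgroup layers are the parts of the head interval outside the core.
    if z_lower_head <= z <= z_upper_head:
        return "head"
    return "solvent"


def ionic_strength(z, z_lower_core, z_upper_core, ionicstr1, ionicstr2):
    """Ionic strength at height z for an ion-accessible grid point.

    ionicstr1 applies above the core, ionicstr2 below it; in between the
    value is interpolated linearly (assumed form of the interpolation).
    """
    if z >= z_upper_core:
        return ionicstr1
    if z <= z_lower_core:
        return ionicstr2
    fraction = (z - z_lower_core) / (z_upper_core - z_lower_core)
    return ionicstr2 + fraction * (ionicstr1 - ionicstr2)


# Core from -10 to 10, headgroups from -15 to -10 and from 10 to 15.
slab = (-10.0, 10.0, -15.0, 15.0)
regions = [membrane_region(z, *slab) for z in (0.0, 12.0, 20.0)]
mid_pore = ionic_strength(0.0, -10.0, 10.0, ionicstr1=0.2, ionicstr2=0.1)
```

Setting the head boundaries equal to the core boundaries makes the "head" branch unreachable for points outside the core, which mirrors the way the headgroup region is omitted on the command line.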

8.3 Input Files


• sitename.pqr, backgroundname.pqr and additional .pqr files used to define the dielectric
regions are structure files in .pqr format. The file format is described above under general
options and input files.

• sitename.ogm defines the electrostatics grids for the site in the receptor environment. The
file format is described above under general options and input files.

• fptname.fpt is a file containing the atomic partial charges and their coordinates for each
instance of each site found in a receptor.
Extended format II (for command line option -fpt):

site_of_1 instance_of_1 coordinate_1 charge_1
...
site_of_N instance_of_N coordinate_N charge_N

site and instance denote the site and instance to whose charge distribution the following
atomic partial charge belongs. coordinate consists of three floating point values that denote
the x, y and z coordinates of a point, respectively. Optionally, the coordinate values
can be enclosed in parentheses and separated by commas. charge is the atomic partial
charge at the preceding coordinate. The line breaks can be substituted with any number
of whitespace characters.

9 my_membpot_solver
This program computes the electrostatic trans-membrane potential and the corresponding
electrostatic energy terms in an environment that models a lipid membrane, the protein and the
solvent phases above and below the membrane with up to 6 dielectric and 2 electrolyte regions
(same as for my_memb_solver). Additional features enable the use of this program for
visualization purposes and as helper program for GCEM.
The program calculates the electrostatic interaction energy of backgroundname with the
charge distribution causing the electrostatic trans-membrane potential. If the option -fpt is
specified, the electrostatic interaction energies of sites with the charge distribution causing the
electrostatic trans-membrane potential are calculated. If the option -capacitance is specified, the
program calculates the capacitance of the receptor-membrane system.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.
The theoretical basis of computing the electrostatic trans-membrane potential is described
in (Roux, 1997). A constant offset potential of ±0.5Ψ is added to all ion-accessible grid points
on the inner and outer side of the membrane, respectively. Equivalently, an effective uniform
charge distribution can be assigned to the ion-accessible volume on either membrane side. For
the finite difference representation, an effective charge of
$q_{\mathrm{eff}} = \pm \frac{\bar{\kappa}^2 h^3 \Psi}{8\pi}$ is assigned to each
ion-accessible grid point of the inner or outer membrane side, respectively. Here, $\bar{\kappa}$
is the inverse Debye length and $h$ is the grid spacing. As it turned out, this alternative way
of adding the offset potential was also implemented by Roux and coworkers for the PBEQ module of
CHARMM.
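The effective charge per grid point given above is straightforward to evaluate. The following Python sketch only restates the formula; the function name, the argument names and the use of the strings "inner" and "outer" are assumptions made for illustration, and the caller is responsible for supplying the squared inverse Debye length, grid spacing and potential in mutually consistent units.

```python
import math


def effective_charge(kappa_bar_sq, h, psi, side):
    """Effective charge assigned to one ion-accessible grid point.

    kappa_bar_sq: square of the inverse Debye length used in the formula
    h:            grid spacing
    psi:          trans-membrane potential
    side:         'inner' (positive sign) or 'outer' (negative sign)
    """
    magnitude = kappa_bar_sq * h ** 3 * psi / (8.0 * math.pi)
    if side == "inner":
        return magnitude
    if side == "outer":
        return -magnitude
    raise ValueError("side must be 'inner' or 'outer'")


# Arbitrary illustrative numbers, not taken from a real calculation.
q_inner = effective_charge(kappa_bar_sq=0.01, h=1.0, psi=2.0, side="inner")
q_outer = effective_charge(kappa_bar_sq=0.01, h=1.0, psi=2.0, side="outer")
```

The charges on the two sides are equal in magnitude and opposite in sign, and they scale with the cube of the grid spacing because each grid point represents a volume of h cubed.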

9.1 Program Usage


Program settings are specified with command line options. Input files are described below. The
last command line argument is the name of the molecule. The program is called with:

program_dir/my_membpot_solver {options} backgroundname

where sitename is the name of the site used as prefix for the corresponding .pqr and .ogm
files. backgroundname is the name of the background (structure containing all atoms not be-
longing to any site) and used as prefix for the corresponding .pqr file.
The intended purpose of the interior dielectric regions eps1set, eps2set and eps3set was
for modeling a quantum chemically treated region, a classically treated region and a region of
solvent filled cavities or pores inside the molecular structure that are to be excluded from the
membrane dielectric, but they might as well be used for other purposes. The interior regions
are inaccessible to mobile ions. The intended purpose of the region elycavset was to model ion
accessible cavities or depressions in the protein surface reaching into the membrane boundaries
(e.g. gorges leading to a channel entrance and trans-membrane pores with large diameter).

A membrane can be modeled by a three-layer dielectric slab representing the polar,
ion-accessible headgroup regions (optional) and the apolar, ion-inaccessible core region of the
membrane. The membrane is requested with membz.
Examples for the program usage can be found in the directories examples/my_membpot_solver
and examples/gcem/bR of the distribution. GCEM will create the input files and a job script
job.sh using the my_membpot_solver in the directory gcem_dir/membpot.

9.2 Program options and their defaults


general options are described above
program specific options for the continuum electrostatics calculation:

• -fpt string: Prefix of a file fptname.fpt for the calculation of site-site interaction
energies

• -epsext float: Floating point number that defines the dielectric constant of the sol-
vent region and the region specified by elycavset (if applicable).

• -epshead float: Floating point number that defines the dielectric constant of the
membrane head group region (if applicable / membz is defined).

• -epscore float: Floating point number that defines the dielectric constant of the
membrane core region (if applicable / membz is defined).

• -epsin1 float: Floating point number that defines the dielectric constant of the re-
gion specified by eps1set (if applicable).

• -epsin2 float: Floating point number that defines the dielectric constant of the re-
gion specified by eps2set (if applicable).

• -epsin3 float: Floating point number that defines the dielectric constant of the re-
gion specified by eps3set (if applicable).

• -epshomo float: Floating point number that defines the dielectric constant of an
additional homogeneous dielectric (optional).

• -eps1set string: String giving the prefix of a .pqr file that defines the dielectric
region 1 with the dielectric constant epsin1 (if applicable).

• -eps2set string: String giving the prefix of a .pqr file that defines the dielectric
region 2 with the dielectric constant epsin2 (if applicable).

• -eps3set string: String giving the prefix of a .pqr file that defines the dielectric
region 3 with the dielectric constant epsin3 (if applicable).

• -elycavset string: String giving the prefix of a .pqr file that defines an
exterior region with the dielectric constant epsext and an ionic strength defined by ionicstr or
by ionicstr1 and ionicstr2 according to the membrane side (if applicable).

• -membz z_lower_core z_upper_core z_lower_head z_upper_head: Boundaries
of a three-layer dielectric slab representing the polar, ion-accessible headgroup
regions and the apolar, ion-inaccessible core region of the membrane (if applicable).
The membrane is perpendicular to the z-direction, hence the components of its normal
vector are given by (0,0,1). The core region extends from z=z_lower_core to
z=z_upper_core. The headgroup region extends from z=z_lower_head to
z=z_upper_head excluding the core region. The headgroup region can be omitted by
setting z_lower_core = z_lower_head and z_upper_core = z_upper_head. The ionic strength
on the membrane sides can be set to equal values by specifying ionicstr or set to distinct
values by specifying ionicstr1 and ionicstr2. The ionic strength within the boundaries
of the core region (e.g., in a channel pore defined by membhole or elycavset) is linearly
interpolated between ionicstr1 and ionicstr2.

• -ionicstr float: Ionic strength in the exterior region(s) (by default in mol/l).

• -ionicstr1 float: Ionic strength in the upper exterior region(s) (z > z_upper_core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr2 must also be specified.

• -ionicstr2 float: Ionic strength in the lower exterior region(s) (z < z_lower_core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr1 must also be specified.

• -membhole radius {cent_x cent_y}: Exclude a cylindrical hole from the membrane.
The radius of the hole is given by radius. The optional arguments cent_x and
cent_y specify the x and y coordinates of the cylinder center, respectively (default values
0.0 and 0.0). This option rarely works satisfactorily for real proteins with non-cylindrical
shape. The use of elycavset and/or one of the interior dielectric regions, for example
eps3set, is to be preferred.

• -inside 1: This option defines which membrane side is "inside" in the electrophysiological
sense, where the membrane potential is measured at the inside relative to the outside.
A value of 1 states that the lower membrane side is inside, while a value of 0 states that
the upper side is inside.

• -capacitance: This flag triggers the calculation of the capacitance of the system.
The capacitance is the accumulated charge in Ampere seconds = Coulomb per membrane
potential in Volt. Hence, the capacitance is given by default in Farad (F = As/V).
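
The options above can also be assembled into a command line programmatically. The following Python sketch builds an argument list for my_membpot_solver with a three-layer membrane and checks the ordering of the -membz boundaries beforehand. All numerical values, the program_dir path and the background name are hypothetical placeholders, and the ordering check is an assumption derived from the description of -membz, not from the program source.

```python
def membz_args(z_lower_core, z_upper_core, z_lower_head, z_upper_head):
    """Return the -membz option as a list of strings after a sanity check."""
    # Assumption: the headgroup boundaries must enclose the core boundaries.
    if not (z_lower_head <= z_lower_core < z_upper_core <= z_upper_head):
        raise ValueError(
            "expected z_lower_head <= z_lower_core < z_upper_core <= z_upper_head"
        )
    # Argument order as given in the option description: core first, then head.
    values = (z_lower_core, z_upper_core, z_lower_head, z_upper_head)
    return ["-membz"] + [str(v) for v in values]


def build_command(program_dir, backgroundname, options):
    """The background name is always the last command line argument."""
    return [f"{program_dir}/my_membpot_solver", *options, backgroundname]


# Hypothetical example values (lengths in Angstrom, ionic strengths in mol/l)
options = membz_args(-15.0, 15.0, -20.0, 20.0)
options += ["-epscore", "2.0", "-epshead", "20.0", "-epsext", "80.0"]
options += ["-ionicstr1", "0.15", "-ionicstr2", "0.10", "-inside", "1"]
command = build_command("program_dir", "background", options)
print(" ".join(command))
```

Setting the head boundaries equal to the core boundaries in membz_args corresponds to omitting the headgroup region.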

9.3 Input Files


• backgroundname.pqr and additional .pqr files used to define the dielectric regions are
structure files in .pqr format. The file format is described above under general options
and input files.

• backgroundname.ogm defines the electrostatics grids for the site in the receptor environ-
ment. The file format is described above under general options and input files.

• fptname.fpt is a file containing the atomic partial charges and their coordinates for each
instance of each site found in a receptor. It is written in the

extended format II (for command line option -fpt):

site_of_1 instance_of_1 coordinate_1 charge_1


...
site_of_N instance_of_N coordinate_N charge_N

site and instance denote the site and instance to whose charge distribution the following
atomic partial charge belongs. coordinate consists of three floating point values that denote
the x, y and z coordinates of a point, respectively. Optionally, the coordinate values
can be enclosed in parentheses and separated by commas. charge is the atomic partial
charge at the preceding coordinate. The line breaks can be substituted with any number
of whitespace characters.
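
The format rules above can be turned into a small reader. The following Python sketch is not part of the distribution; it treats the optional parentheses and commas as whitespace and then groups the tokens into entries of six fields. The function name parse_fpt and the example data are hypothetical, and site and instance are kept as strings because their type is not specified here.

```python
def parse_fpt(text):
    """Parse .fpt content (extended format II) into a list of records.

    Each record is (site, instance, (x, y, z), charge).
    """
    # Parentheses and commas around the coordinates are optional,
    # so they are replaced by whitespace before splitting.
    for ch in "(),":
        text = text.replace(ch, " ")
    tokens = text.split()
    if len(tokens) % 6 != 0:
        raise ValueError("expected six fields per entry: site instance x y z charge")
    records = []
    for i in range(0, len(tokens), 6):
        site, instance, x, y, z, charge = tokens[i:i + 6]
        records.append((site, instance, (float(x), float(y), float(z)), float(charge)))
    return records


# Hypothetical example with both coordinate notations
example = """
1 1 (1.0, 2.0, 3.0) -0.5
1 2 4.0 5.0 6.0 0.5
"""
records = parse_fpt(example)
for record in records:
    print(record)
```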

10 get-curves-gmct
This program analyzes the output of GMCT and extracts information such as net binding
probabilities of sites summed over all instances and pK1/2 values. The MEAD library is only
used for reading and writing of instance structures from the GCEM input.
The program reads a GMCT output file, generates derived data and writes the results to the
directory curve_dir.
get-curves-gmct writes N-dimensional curves that describe the dependence of a quantity
(e.g., a binding probability) on the chemical potentials of the ligands and the membrane
potential. One-, two-, and three-dimensional curves can, for example, be used for plotting with
the GMT program suite or other plotting software. The curves are written to subdirectories of
curve_dir whose names correspond to the respective keywords in molname.gc.setup (example:
the output of the midpoints of binding titrations is triggered with the input option ibindhalf and
the corresponding curves are written to curve_dir/bind_half).
The site and background structures in .pqr format needed to construct assembled structures
of molname with all sites in their most highly populated instances are read from gcem_dir (only
needed if the input option itraj is set to 1). Molecular structures in .pqr format can be visualized
with molecular structure viewers such as VMD or PyMol.

10.1 Program Usage


Program settings are specified within a setup file. Input files are described below. The last
command line argument is the name of the molecule. The program is called with:

program_dir/get-curves-gmct {options} molname

where molname is the name of the molecule, used as prefix for the corresponding setup file
molname.gc.setup.
Examples for the program usage can be found in the directories examples/DTPA_titra and
examples/HEWL_titra of the GMCT distribution.

10.2 Program options and their defaults


• temp float (298.15): absolute temperature in K.

• gmct_outfile string (molname.gmct.out): Name of the GMCT output file.

• curve_dir string (my_curves): Name of the subdirectory that will contain the
curves and structure files created by get-curves-gmct (created automatically if it does not
exist).

• gcem_dir string (..): Name of the directory containing the GCEM input. The .pqr
files are expected to be found in subdirectories with the name gcem_input or qmpb_input.
If there are multiple conformations, the program expects to find the corresponding .pqr
files in the subdirectories (gcem|qmpb)_dir/confname.

• blab int (0): Program verbosity as described under general program options.

• nformat int: Keyword that triggers the reading of a block of format descriptions for
the chemical potentials (same order as in gmct_outfile). The number gives the number of
format lines in the block. Each line consists of the keyword format followed by a conversion
factor and the number of decimal places to which the converted
chemical potential, pmf or membrane potential should be rounded. If there is a proton
chemical potential and an electron chemical potential which should appear in the form of
a pH value and a reduction potential in V in the program output, the block could look like
this:

nformat 2
format -0.733004200213 3
format -0.043364101728026 4

• ipmf int (0): Flag whether or not (1 = yes, 0 = no) to output a proton motive
force ($\mathrm{pmf} = \mu_{\mathrm{H}^+}^{\mathrm{out}} - \mu_{\mathrm{H}^+}^{\mathrm{in}} - F\Delta\Psi = \bar{\mu}_{\mathrm{H}^+}^{\mathrm{out}} - \bar{\mu}_{\mathrm{H}^+}^{\mathrm{in}}$)
in the curves instead of separate proton
chemical potentials and an electric membrane potential. If ipmf is set to 1, the membrane
potential is substituted with the pmf and the proton chemical potential on the outside is
omitted.

• iprob int (0): Flag whether or not (1 = yes, 0 = no) to output occupation probability
curves for each instance.

• ibprob int (0): Flag whether or not (1 = yes, 0 = no) to output occupation prob-
ability curves for each possible number of ligands bound to a site (permutation of the
possible numbers of bound ligands of each type).

• ifprob int (0): Flag whether or not (1 = yes, 0 = no) to output occupation proba-
bility curves for each (binding) form of a site.

• icprob int (0): Flag whether or not (1 = yes, 0 = no) to output occupation probability
curves for each global conformation and each conformer of a site. Conformers
are expected to be ordered sequentially in gmct_outfile (automatically done if the input for
GMCT was generated with GCEM), where each conformer possesses the same (binding)
forms.

• ibmean int (0): Flag whether or not (1 = yes, 0 = no) to output the average number
of bound ligands of each type bound by a site.

• imaxp int (0): Flag whether or not (1 = yes, 0 = no) to output the most highly
populated instance of a site.

• ibmaxp int (0): Flag whether or not (1 = yes, 0 = no) to output the most highly
populated number of bound ligands of each ligand type for a site.

• ifmaxp int (0): Flag whether or not (1 = yes, 0 = no) to output the most highly
populated (binding) form of a site.

• icmaxp int (0): Flag whether or not (1 = yes, 0 = no) to output the most highly
populated global conformation and the most highly populated conformer of a site.

• ihalf int (0): Flag whether or not (1 = yes, 0 = no) to output the midpoints for
each transition between the possible pairs of instances of a site. One chemical potential
is varied, while the others have fixed values. The midpoint is the value of the varied
chemical potential at which the population of the two instances is equal at fixed values of
all other chemical potentials.

• ibhalf int (0): Flag whether or not (1 = yes, 0 = no) to output the midpoints
for each transition between consecutive binding numbers of each ligand type and each
site. The midpoint is the chemical potential of the corresponding ligand type at which
the population of the consecutive binding numbers is equal at fixed values of all other
chemical potentials, the pmfs and the membrane potential. An example application of
this option is the determination of pK1/2 values, which in general depend on the chemical
potentials of all other ligands present in the system and a possibly present membrane
potential.

• ifhalf int (0): Flag whether or not (1 = yes, 0 = no) to output the midpoints for
each transition between the possible pairs of (binding) forms of a site. One chemical
potential is varied, while the others have fixed values. The midpoint is the value of the
varied chemical potential at which the population of the two (binding) forms is equal at
fixed values of all other chemical potentials.

• ichalf int (0): Flag whether or not (1 = yes, 0 = no) to output the midpoints for
each transition between the possible pairs of global conformations and of conformers of
a site. One chemical potential is varied, while the others have fixed values. The midpoint
is the value of the varied chemical potential at which the population of the two global
conformations or site conformers is equal at fixed values of all other chemical potentials.

• iginst int (0): Flag whether or not (1 = yes, 0 = no) to output the free energy of
the instances of a site calculated from $G = -\beta^{-1} \ln[p]$, where p is the occupation probability
of the instance.

• igbind int (0): Flag whether or not (1 = yes, 0 = no) to output the free energy of
each possible number of ligands bound to a site (permutation of the possible numbers
of bound ligands of each type) calculated from $G = -\beta^{-1} \ln[p]$, where p is the occupation
probability of the binding number.

• igconf int (0): Flag whether or not (1 = yes, 0 = no) to output the free energy of
each global conformation and each site conformer calculated from $G = -\beta^{-1} \ln[p]$, where p
is the occupation probability of the global conformation or site conformer.

• itraj int (0): Flag whether or not (1 = yes, 0 = no) to write assembled structures
of molname with all sites in their most highly populated instances for each permutation
of chemical potential values.
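
The two conversion factors in the nformat example block given earlier in this list can be reproduced numerically. The following Python sketch assumes that GMCT reports chemical potentials in kcal/mol and that the temperature is the default of 298.15 K; under these assumptions the first factor is -1/(RT ln 10), which converts a proton chemical potential into a pH value, and the second is the negative of the kcal/mol-to-eV conversion, which yields a reduction potential in V. The constants are CODATA values and the variable names are hypothetical, so the last digits differ slightly from the factors quoted in the example block.

```python
import math

R_KCAL = 8.314462618 / 4184.0   # gas constant in kcal/(mol K)
FARADAY = 96485.33212           # Faraday constant in C/mol
T = 298.15                      # default of the temp option in K

# Assumption: mu_H+ in kcal/mol, pH = -mu_H+ / (RT ln 10)
factor_ph = -1.0 / (R_KCAL * T * math.log(10.0))

# Assumption: mu_e- in kcal/mol, E = -mu_e- / F (4184 J per kcal)
factor_volt = -4184.0 / FARADAY

# Write a format block with 3 and 4 decimal places, as in the example block
print("nformat 2")
print(f"format {factor_ph:.9f} 3")
print(f"format {factor_volt:.9f} 4")
```

For a different temperature only factor_ph changes, since the conversion from kcal/mol to V does not depend on T.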

10.3 Input Files
• molname.gc.setup contains the program options as described above.

• molname.gmct.out (or user specified name) contains the calculation results written by
GMCT to stdout. The results comprise the populations of the instances and the global
conformations for each permutation of chemical potential values as computed from a
Monte Carlo simulation with the program gmct or an analytical calculation with smt. The
format is described in the user manual of the program suite GMCT.

• .pqr files for the instances of all sites and the background (only needed if the input option
itraj is set to 1). The file format is described above under general options and input files.

11 pqr2SolvAccVol
This program computes an analytical representation of the solvent (in)accessible volume of a
molecular structure and writes it to a binary or ASCII file which can be read by the
my_xyz_solver programs to avoid time-consuming recomputation of the volume.
Program settings are specified with command line options. Input files are described below.
The program uses a .pqr file as input (file format described above). The command line option
-solrad specifies the solvent probe sphere radius (described under general options above). The
last command line argument is the name of the molecule. The program is called with:
program_dir/pqr2SolvAccVol {options} molname
where molname is the name of the molecule, used as prefix for the corresponding .pqr file.

12 pqr2IonAccVol
This program computes an analytical representation of the ion inaccessible volume (solvent
inaccessible volume with Stern layer radius as probe radius and extended by the Stern layer
radius) of a molecular structure and writes it to a binary or ASCII file which can be read by the
my_xyz_solver programs to avoid time-consuming recomputation of the volume.
Program settings are specified with command line options. Input files are described below.
The program uses a .pqr file as input (file format described above). The command line option
-sterln specifies the ion probe sphere radius that defines the thickness of the Stern layer. The
command line option -gmt specifies the dimensions of the curve, the projection plane and
optionally a second slice plane (described under general options above). The last command
line argument is the name of the molecule. The program is called with:
program_dir/pqr2IonAccVol {options} molname
where molname is the name of the molecule, used as prefix for the corresponding .pqr file.

13 pqr2crv
This program computes a projection of the solvent inaccessible volume of a molecular structure
or a slice thereof into a user defined plane for visualization purposes. An example is shown in
the above figure showing the trans-membrane potential distribution across Amt-1.
Program settings are specified with command line options. Input files are described below.
The program uses a .pqr file as input (file format described above). The command line option
-solrad specifies the solvent probe sphere radius. The command line option -gmt specifies the
dimensions of the curve, the projection plane and optionally a second slice plane (described
under general options above). The last command line argument is the name of the molecule.
The program is called with:
program_dir/pqr2crv {options} molname
where molname is the name of the molecule, used as prefix for the corresponding .pqr file.

14 potential
This program calculates the electrostatic potential due to the charge distribution of molname.
Potential calculates the potential due to the molecule molname, whose coordinates, radii and
charges are specified in a file molname.pqr, and writes to standard output its values at points
specified by the molname.fpt file. The units of the output potentials are input charge units
divided by input length units. For typical PDB derived input files, this would be elementary
charges/Ångström. In that case, to get (kcal/mol)/elementary charge, multiply by 332.063202.
The CoarseFieldOut and CoarseFieldInit options will cause the coarsest potential lattice to be
written to, or initialized from, an AVS (Advanced Visualization System) "field" file. By default,
the units are as above for the output potentials, but if you give the option "-AvsScaleFactor
f", where f is a floating point number, the field will be scaled by that factor.
The program is called with:

program_dir/potential {options} {molname}

Potential requires the files molname.pqr and molname.ogm. molname.fpt is not strictly
required, but if neither the .fpt file nor the CoarseFieldOut option are used, the program produces
no output of the calculated potentials. The epsin option is mandatory.

• -CoarseFieldOut string: prefix of output file name.fld. This option triggers writ-
ing of electrostatic potential for the coarsest (first) grid level in AVS ”field” format to
name.fld.

• -CoarseFieldInit string: prefix of input file name.fld. This option triggers
reading of initial values for the electrostatic potential for the coarsest (first) grid level
in AVS "field" format from name.fld.

• -AvsScaleFactor float: Scale factor for unit conversion of the electrostatic
potential for the coarsest (first) grid level in AVS "field" format.
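
The unit conversion mentioned in the description of this program can be applied to the values read from standard output. The following Python sketch multiplies potentials given in elementary charges/Ångström by the factor 332.063202 quoted above to obtain (kcal/mol)/elementary charge. The function name and the example values are hypothetical; how the values are read from the program output is not shown.

```python
# Conversion factor quoted in the program description:
# (elementary charge / Angstrom) -> (kcal/mol) / elementary charge
KCAL_PER_MOL_PER_E = 332.063202


def to_kcal_per_mol_per_e(potentials):
    """Convert potentials from e/Angstrom to (kcal/mol)/e."""
    return [KCAL_PER_MOL_PER_E * phi for phi in potentials]


# Hypothetical potential values in elementary charges / Angstrom
raw = [0.0, 0.01, -0.25]
converted = to_kcal_per_mol_per_e(raw)
for phi, phi_kcal in zip(raw, converted):
    print(f"{phi:8.3f} e/A  ->  {phi_kcal:10.4f} (kcal/mol)/e")
```

The same factor could presumably be passed to -AvsScaleFactor to obtain a field file in these units, but that use is an assumption and not confirmed by the description.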

15 solvate
This program calculates the electrostatic solvation energy of a molecule in a solvent.
Solvate calculates the Born solvation energy of a molecule - that is, the difference in the
electrostatic work required to bring its atom charges from zero to their full values in solvent
versus vacuum.
The program is called with:

program_dir/solvate {options} {molname}

molname is the molecule for which the calculation is to be done and whose coordinates,
radii and charges are specified in a file molname.pqr. Solvate requires a molname.pqr file and a
molname.ogm file (see below) as inputs. The -epsin option is mandatory.
The Born solvation energy is written to standard output in kcal/mol. Physical conditions and
units for I/O can be set by flags on the command line (see general options above). By default
solvate assumes we are going from vacuum (eps=1) to water (eps=80). Note that solvate uses
the epssol and epsvac flags rather than the epsext option to control solvent conditions. You
can try it out on a sphere to check agreement with the Born formula. See the example in the
directory example/solvate/born.
An example for the program usage can be found in the directory examples/solvate of the
distribution.
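
For the check against the Born formula mentioned above, the analytical reference value can be computed by hand. The following Python sketch evaluates the Born energy for transferring a charged sphere from vacuum to solvent, using the conversion factor 332.063202 quoted in the description of the potential program. The charge and radius are hypothetical example values, and the sketch is independent of the example shipped in the distribution.

```python
# kcal/mol per (elementary charge squared / Angstrom), as quoted for `potential`
COULOMB_KCAL = 332.063202


def born_energy(charge, radius, eps_sol=80.0, eps_vac=1.0):
    """Born solvation energy in kcal/mol.

    charge in elementary charges, radius in Angstrom:
    dG = -(q^2 / (2 a)) * (1/eps_vac - 1/eps_sol)
    """
    return -0.5 * COULOMB_KCAL * charge ** 2 / radius * (1.0 / eps_vac - 1.0 / eps_sol)


# Hypothetical test ion: unit charge on a sphere of radius 2 Angstrom
reference = born_energy(charge=1.0, radius=2.0)
print(f"Born reference energy: {reference:.3f} kcal/mol")
```

A solvate run on a single-atom .pqr file with the same charge and radius should approach this value as the grid is refined.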
program specific options:
• -ReactionField: flag that triggers the output of the electrostatic reaction potential
(difference between the electrostatic potential in protein + solvent environment vs. vac-
uum) at the coordinates specified in molname.fpt (solvate and solinprot only). The corre-
sponding potential values will be written to molname.rf.

16 solinprot
Solinprot calculates the electrostatic part of the transfer energy for bringing a compound from
the bulk solvent to a molecular environment.
The program calculates the "solvation" energy of the molecule named by solute (for which there must be
solute.pqr and solute.ogm files) inside the molecule named by protein (for which there must be a
protein.pqr file) which is, in turn, in some solvent (water, by default). The interior of the solute is
presumed to have the dielectric constant epsin1 (given by the epsin1 flag), regions interior to
the protein but exterior to the solute are presumed to have the dielectric constant epsin2 (given
by the epsin2 flag), and regions exterior to both protein and solute are presumed to have the dielectric
constant epsext (80.0 by default). The protein may contain charges, and their interaction with
the solute will contribute to the "solvation" energy. The solvation energy is calculated relative to
a vacuum calculation in which the dielectric constant has a value of epsin1 inside the solute
and epsvac (1.0 by default) outside.
The calculation works like this: First, the potential due
to the solute charges (call them rho_solute) in the above described dielectric environment is
calculated. Call this potential phi_sol. Next, the potential due to rho_solute is calculated in
the vacuum dielectric environment described above. Call this potential phi_vac. The reaction
field component of the solvation energy is then (rho_solute*phi_sol - rho_solute*phi_vac)/2,
where "*" indicates a suitable sum or integral of charge times potential. So far, this is the
same as the solvate program except for the three-dielectric environment on the solvated side.
We also need the contribution due to the protein charges, rho_protein, interacting with the solute:
rho_protein*phi_sol.
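
The energy bookkeeping described above can be written down compactly. The following Python sketch assumes point charges, so that the "*" operation becomes a plain sum of charge times potential at the charge positions. The function names and all numerical values are hypothetical, and the potentials are taken as given rather than computed by a Poisson-Boltzmann solver.

```python
def charge_potential_sum(charges, potentials):
    """Discrete version of the '*' operation: sum of charge times potential."""
    return sum(q * phi for q, phi in zip(charges, potentials))


def solinprot_energy(rho_solute, phi_sol_at_solute, phi_vac_at_solute,
                     rho_protein, phi_sol_at_protein):
    """Reaction field term plus protein charge term, in charge x potential units."""
    reaction_field = 0.5 * (
        charge_potential_sum(rho_solute, phi_sol_at_solute)
        - charge_potential_sum(rho_solute, phi_vac_at_solute)
    )
    # phi_sol evaluated at the positions of the protein charges
    protein_term = charge_potential_sum(rho_protein, phi_sol_at_protein)
    return reaction_field + protein_term


# Hypothetical values: two solute charges, one protein charge
energy = solinprot_energy(
    rho_solute=[1.0, -1.0],
    phi_sol_at_solute=[0.2, -0.1],
    phi_vac_at_solute=[0.5, -0.4],
    rho_protein=[0.5],
    phi_sol_at_protein=[0.1],
)
print(energy)
```

With charges in elementary charges and potentials in elementary charges/Ångström, multiplying the result by 332.063202 gives kcal/mol.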
The program is called with:
program_dir/solinprot {options} solute protein
The program expects to find the file solute.pqr containing the solute's coordinates, charges
and radii and the file solute.ogm containing the grid definitions. In addition, the file protein.pqr is
needed, which contains the coordinates, charges and radii of the protein. The options are similar
to those for the solvate program except that the epsin1 and epsin2 options are mandatory and
the epsin option is forbidden. The epsext option is used instead of epssol for specifying the
solvent dielectric. The ProteinField flag is also available.
program specific options:

• -ReactionField: flag that triggers the output of the electrostatic reaction potential
(difference between the electrostatic potential in protein + solvent environment vs. vac-
uum) at the coordinates specified in molname.fpt. The corresponding potential values
will be written to molname.rf.

• -ProteinField: flag that triggers the output of the electrostatic potential due to the
protein charge distribution at the coordinates specified in molname.fpt. The correspond-
ing potential values will be written to molname.pf.

17 multiflex
Multiflex is a program for the automated preparation of the necessary input for continuum
electrostatic calculations on binding equilibria within a traditional two-state model.
Multiflex does the electrostatic part of a titration calculation for a multi-site titrating molecule.
It can do single conformer calculations based on the methods described in Karplus and Bashford
(1990) and Bashford and Gerwert (1992), which assume a rigid molecule, or it can include
limited conformational flexibility by the method of You and Bashford (1995). In the latter case,
the user must supply the coordinates for the conformational variants and the corresponding
non-electrostatic energies. This can be done with a program like CHARMM.
See Ullmann1999 for a review of titration calculations with continuum electrostatics within
a classical two-state model. GCEM (see program description above) provides an alternative to
multiflex with additional features and a more detailed receptor model.
For single conformer calculations, multiflex works like the old multimead program. It takes
molname.pqr, molname.ogm, molname.mgm, molname.sites and molname.st files as inputs (see
below). As its main outputs, it produces a molname.pkint file, which contains the calculated
intrinsic pKa values, a molname.g file, which contains site-site interactions in units of charge squared
per length, and a molname.summ file, which summarizes the self and background contributions
to the intrinsic pK of each site.
The molname.pkint file and the molname.g file can be used directly as input to redti, the
program for calculating titration curves. Alternatively, these files can be used with Monte Carlo
simulation programs (e.g., XMCTI) to compute titration curves for systems, in which the com-
putational cost of redti is prohibitive.
Multiflex also produces a file, molname.sitename.summ for each titrating site which con-
tains some summary information that is mainly interesting for multi conformer (flexible) cal-
culations and it produces a large number of .potat files, which are binary files that are useful if
a job is interrupted and restarted (see below). The epsin flag is mandatory. Other flags can be
used to change units and/or physical conditions or include a membrane.
For the "flexible" calculations, which involve multiple conformers, multiflex needs the input
files described above and some additional files. For each flexible site there must be a file
molname.sitename.confs. These .confs files tell about the possible conformers of
that site and their non-electrostatic energies (see below under input files). For each conformer
named in a .confs file, there must be a .pqr file having the coordinates, charges and radii for the
whole protein with the site sitename in conformer confname (see below under input files). You
might need a lot of .pqr files. For example, the You and Bashford calculations on lysozyme
had 36 conformers for each of 12 sites, and one molname.pqr file is always needed, so
36 × 12 + 1 = 433 .pqr files were needed for each titration calculation. Flexible and single
conformer sites can coexist in the same molecule. Multiflex tells the difference by the .confs
files: flexible sites have one and single conformer sites do not.
Program settings are specified with command line options. Special input files are described
below. Example applications are found in the directory examples/multiflex of the MEAD
distribution. The last command line argument specifies the name of the molecule. The program is
called with:

program_dir/multiflex {options} molname

where molname is the name of the molecule for which the calculation will be done. molname
is used as prefix for some input files.

17.1 Program options and their defaults


program specific options:

• -epsin float: dielectric constant of a molecular interior region

• -epssol float (80): dielectric constant of an exterior / solvent region

• -ionicstr float (0.0): ionic strength in mol/l

• -nopotat: setting this flag prevents multiflex from writing .potat files for saving com-
puted electrostatic potentials as restart files for future runs.

• -membz z_lower z_upper: Set up a membrane parallel to the x-y plane (membrane
normal in z direction). The membrane is modeled as a low-dielectric slab whose lower and
upper boundaries are given by z_lower and z_upper, respectively.

• -membhole radius {cent_x cent_y}: Exclude a cylindrical hole from the membrane.
The radius of the hole is given by radius. The optional arguments cent_x and
cent_y specify the x and y coordinates of the cylinder center, respectively (default values
0.0 and 0.0).

• -site int: Integral number N whose specification causes multiflex to do the electro-
statics calculations only for the Nth site specified in the .sites file.

17.2 Input Files


• molname.pqr: is the structure of molname. The file format is described above under
general options and input files.

• molname.sites: contains a list of the titratable sites of molname. For each site, the
file contains a line of the following format:

res_num site_type {chain_id}

res_num is the residue number of the site, which will be matched to the residue numbers
of the atoms in molname.pqr. site_type is the type of the site (usually similar to the residue
name). A file named site_type.st is needed for each site type (see below). The optional
argument chain_id is the chain id of the site. chain_id is matched to the corresponding
chain id in molname.pqr and must consequently also be specified in molname.pqr if it is
specified here.

• site_type.st: multiflex expects a file with the name site_type.st to specify details
concerning each site type that appears in the molname.sites file. The first line contains
one floating point number, which is the model compound pK for that type of site. All
remaining lines are of the format:

resname atomname prot_charge deprot_charge

where resname and atomname, along with the res_num given in the molname.sites file,
match an atom in the molname.pqr file that is part of a titrating site. prot_charge is the
charge of this atom in the protonated state and deprot_charge is its charge in the deprotonated
state. It is expected that the sum of all prot_charge values minus the sum of
all deprot_charge values equals one (the charge of the bound proton). An extension of this
file's syntax is planned to allow
a regular expression to be given that will specify how a model compound is to be made
from the atom coordinates in molname.pqr.

• molname.sitename.confs: multiflex expects a file with the name molname.sitename.confs
to exist for each flexible site (otherwise the site is treated as non-flexible). Here, sitename
is constructed from the entry in the molname.sites file by the procedure "7 GLU" → "GLU-7".
These .confs files tell about the possible conformers of that site. They contain lines of the
form

confname mac_non_elstat_energy mod_non_elstat_energy

where confname is the name of the conformer and the next two entries are the non-electrostatic
energies of the conformer in the macromolecule and in the model compound
corresponding to this site. The energies must be given in kcal/mol unless the econv flag
(see below) has been given to change the energy units.

• molname.sitename.confname.pqr: For each conformer named in a .confs file,
there must be a .pqr file having the coordinates, charges and radii for the whole protein
with the site sitename in conformer confname. The file format is described above under
general options and input files.

• molname.ogm and molname.mgm: define the electrostatics grids for the site in
the receptor environment and for the model compound in the reference environment (bulk
solution), respectively. The file format is described above under general options and input
files.
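
The naming conventions of the input files can be illustrated with a short script. The following Python sketch reads lines in the molname.sites format, builds the site name by the "7 GLU" to "GLU-7" rule and derives the names of the .confs and conformer .pqr files that multiflex would look for. The function names, the molecule name lysozyme and the conformer name conf1 are hypothetical, and how a chain id enters the site name is not specified here, so it is ignored.

```python
def parse_sites(text):
    """Parse lines of the form 'res_num site_type {chain_id}'."""
    sites = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        res_num, site_type = int(fields[0]), fields[1]
        chain_id = fields[2] if len(fields) > 2 else None
        sites.append((res_num, site_type, chain_id))
    return sites


def site_name(res_num, site_type):
    """Site name rule from the text: '7 GLU' becomes 'GLU-7'."""
    return f"{site_type}-{res_num}"


def confs_file(molname, res_num, site_type):
    return f"{molname}.{site_name(res_num, site_type)}.confs"


def conformer_pqr(molname, res_num, site_type, confname):
    return f"{molname}.{site_name(res_num, site_type)}.{confname}.pqr"


# Hypothetical .sites content; the second site carries a chain id
sites = parse_sites("7 GLU\n35 GLU A\n")
for res_num, site_type, chain_id in sites:
    print(confs_file("lysozyme", res_num, site_type),
          conformer_pqr("lysozyme", res_num, site_type, "conf1"))
```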

17.3 Output Files


• molname.pkint contains the calculated intrinsic pKa values of the sites in pK units.
Each site has no more than two protonation forms (protonated and deprotonated). The
protonation forms are called instances in the nomenclature of GCEM. The intrinsic pKa
value $pK_{a,i}^{\mathrm{int}}$ of a site i is equal to the relative intrinsic energy of the two
protonation forms in pK units:

$pK_{a,i}^{\mathrm{int}} = \frac{1}{RT \ln 10} \left( E_{i,\mathrm{deprot}}^{\mathrm{int}} - E_{i,\mathrm{prot}}^{\mathrm{int}} \right)$ (1)

• molname.g contains the (relative) interaction energies between each pair of sites in
units of $e_\circ^2$/Å (charge squared per Ångström). The relative interaction energy between the
sites i and j in terms of instance-instance interaction energies as used by GCEM is given
by

$g_{i,j} = W_{i,\mathrm{prot},j,\mathrm{prot}} - W_{i,\mathrm{deprot},j,\mathrm{deprot}} - \left( W_{i,\mathrm{prot},j,\mathrm{deprot}} - W_{i,\mathrm{deprot},j,\mathrm{deprot}} + W_{i,\mathrm{deprot},j,\mathrm{prot}} - W_{i,\mathrm{deprot},j,\mathrm{deprot}} \right)$ (2)

which becomes after collection of terms

$g_{i,j} = W_{i,\mathrm{prot},j,\mathrm{prot}} + W_{i,\mathrm{deprot},j,\mathrm{deprot}} - W_{i,\mathrm{prot},j,\mathrm{deprot}} - W_{i,\mathrm{deprot},j,\mathrm{prot}}$ (3)

For each pair of sites, the file contains a line of the following format:

i j g_{i,j}

where i and j are the site ids in the order of the molname.sites file and g_{i,j} is the interaction
energy between the sites.
• molname.summ summarizes the self (Born solvation) and background
contributions to the intrinsic pKa value of each site. See Ullmann1999 for a review of
titration calculations with continuum electrostatics within a classical two-state model.
• molname.sitename.summ summarizes the energy contributions to the intrinsic pKa
value of each conformer confname of the flexible site sitename.
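
The relations (1) and (3) for the contents of molname.pkint and molname.g can be evaluated with a few lines of code. The following Python sketch assumes that the intrinsic energies are given in kcal/mol and that the pKa follows from the energy difference between the deprotonated and the protonated form; the function names and all numerical values are hypothetical. It also checks numerically that the long form (2) and the collected form (3) of the interaction energy agree.

```python
import math

R_KCAL = 8.314462618 / 4184.0  # gas constant in kcal/(mol K)


def intrinsic_pka(e_deprot, e_prot, temperature=298.15):
    """Eq. (1): energy difference (assumed in kcal/mol) converted to pK units."""
    return (e_deprot - e_prot) / (R_KCAL * temperature * math.log(10.0))


def relative_interaction(w_pp, w_dd, w_pd, w_dp):
    """Eq. (3): g_ij from the four instance-instance interaction energies."""
    return w_pp + w_dd - w_pd - w_dp


def relative_interaction_long(w_pp, w_dd, w_pd, w_dp):
    """Eq. (2): the same quantity before collecting terms."""
    return w_pp - w_dd - (w_pd - w_dd + w_dp - w_dd)


# Hypothetical energies
print(intrinsic_pka(e_deprot=5.0, e_prot=0.0))
w = dict(w_pp=1.0, w_dd=0.2, w_pd=0.5, w_dp=0.4)
g_short = relative_interaction(**w)
g_long = relative_interaction_long(**w)
print(g_short, g_long)
```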

18 redti
This program calculates titration curves with an approximate analytical method (reduced site
method).
Redti solves the multiple site titration curve problem given a set of intrinsic pKa values
(molname.pkint) and a site-site interaction matrix (molname.g) using the reduced site method
described by Bashford & Karplus (1991) J. Phys. Chem. vol. 95, pp. 9556-61. The input files can
be obtained from multiflex (see above for a description of the files). Redti is written in C rather
than C++. Its command line syntax is:

program_dir/redti {options} molname

where molname is the name of the molecule for which the calculation will be done. mol-
name is used as prefix for some input files.
program specific options:

• -cutoff float (20.5): Cutoff for the reduced site method.

• -pHrange pH_min pH_max (-20 30.1): pH range for the titration calculation.

• -dry: Flag that causes redti to do a "dry run" in which it prints the number of sites to
be included in the reduced site calculation at each pH point. This is useful for checking
whether a calculation will require a prohibitive amount of CPU time, since the CPU time
grows exponentially with the number of reduced sites.
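
The exponential growth mentioned for the -dry option can be made concrete. Since each site has two protonation forms, a set of n reduced sites has 2^n protonation states. The following Python sketch only counts these states; it is not redti's algorithm, and the function name is hypothetical.

```python
def protonation_state_count(n_sites):
    """Number of protonation states of n two-state sites."""
    return 2 ** n_sites


# State counts for a few hypothetical numbers of reduced sites
for n in (5, 10, 20, 30):
    print(f"{n:3d} reduced sites -> {protonation_state_count(n)} states")
```

Each additional reduced site doubles the number of states, which is why the numbers reported by a dry run are worth checking before a long calculation.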

18.1 Input Files


• molname.pkint contains the calculated intrinsic pKa values of the sites in pK units.

• molname.g contains the (relative) interaction energies in units of $e_\circ^2$/Å (charge squared
per Å).
