GCEM

The document describes extensions made to the MEAD molecular electrostatics software. The extensions allow for modeling of biological macromolecules including membranes with transmembrane potentials. Additional extensions enable visualization of electrostatic potentials, charge distributions, ion distributions, and dielectric distributions. An example is shown of the transmembrane potential across an ammonia transporter protein embedded in a lipid bilayer membrane.

Uploaded by

Bis Chem

1 Purpose of MEAD and our extensions

This page describes our extensions to the molecular electrostatics software suite MEAD. The
original version of MEAD was written by Donald Bashford and can be found at http://www.stjuderesearch.org/site/
MEAD consists of a library of C++ objects and some applications that use these objects
for modeling electrostatic properties of molecules. MEAD is an acronym for macroscopic
electrostatics with atomic detail. That is, macroscopic continuum electrostatics is applied at a
molecular scale, partitioning the system in regions with different dielectric constants (molecule,
solvent, membrane, ...). Solvent regions can contain mobile ions. The electrostatic potential
is computed according to the linearized Poisson-Boltzmann equation. Despite its simplicity,
this model can be used to compute molecular properties with very good accuracy (pKa values,
binding constants, electrostatic solvation energies, ...).
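For orientation, the linearized Poisson-Boltzmann equation mentioned above is commonly written in the following form. This is a sketch in Gaussian-type units, as suggested by the use of elementary charges and Ångström; the symbols are chosen here for illustration and are not taken from the MEAD sources.

```latex
\nabla \cdot \left[ \varepsilon(\mathbf{r}) \, \nabla \phi(\mathbf{r}) \right]
  - \varepsilon_{\mathrm{solv}} \, \kappa^{2}(\mathbf{r}) \, \phi(\mathbf{r})
  = -4 \pi \rho(\mathbf{r}),
\qquad
\kappa^{2}(\mathbf{r}) =
\begin{cases}
  \dfrac{8 \pi e^{2} I}{\varepsilon_{\mathrm{solv}} k_{\mathrm{B}} T} & \text{in ion-accessible regions} \\[2ex]
  0 & \text{elsewhere}
\end{cases}
```

Here \phi is the electrostatic potential, \varepsilon(\mathbf{r}) the position-dependent dielectric constant, \rho the fixed charge density of the molecule, and I the ionic strength expressed as a number density.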
Our extensions allow a detailed modeling of (biological) macromolecules, including the
possibility to account for a membrane environment with an electrostatic trans-membrane potential.
Further extensions allow the visualization of electrostatic potentials, charge distributions,
electrolyte distributions and dielectric distributions. The data can be written as three-dimensional
volumetric data in OpenDX format, or within a cut plane or along a line in ASCII format for
plotting with your favorite software. The figure below shows the trans-membrane potential
across the ammonia transporter Amt-1 from Archaeoglobus fulgidus as an example application
of the visualization extensions (structure: Andrade, Susana L. A.; Dickmanns, Antje; Ficner,
Ralph and Einsle, Oliver, 2005, PNAS, 102, 14994-14999).
• A lipid membrane can be represented by three dielectric slabs that model the hydrophobic,
ion-inaccessible core region and the hydrophilic headgroup regions that can be penetrated
by mobile ions. Regions within the membrane boundaries that are not part of the protein
or the membrane can be specified (e.g., water filled protein cavities or a channel through
the membrane).
• The protein interior can comprise different dielectric regions that can be used, for example,
to account for protein regions that are modeled classically and those that are modeled
with a quantum chemical approach. In addition, there can be regions that model protein
cavities or channels.
• The solvent phase(s) are modeled by separate dielectric regions that can also contain
mobile ions.

2 Documentation
The documentation for the original MEAD package is the file README found in the root
directory of the distribution. Below, you can find the information from this README file and
information about our changes and additional application programs.
The MEAD library is best explored by looking at the source code. The best starting point
may be to look at some of the simpler programs like potential or solvate and then one of
the solver programs (for instance my xyz solver). Don’t start with multiflex or gcem,
because these programs and the objects used by them are too complex for a start. A brief outline
of the design of the MEAD library can be found in [?].

Figure 1: The trans-membrane potential distribution across the ammonium transporter Amt-1.
The extracellular side is shown at the top and the intracellular side at the bottom. The dark
outer contour denotes a projection of the solvent accessible surface of a transporter trimer into
a plane perpendicular to the membrane. The lighter inner contour shows a projection of a slice
of Amt-1 of 5 Å thickness into the same projection plane. The slice shows the trans-membrane
pore including putative ammonia positions and the twin-histidine motif determined by X-ray
crystallography. a) The trans-membrane potential is plotted in a slice plane cutting through the
transporter’s trans-membrane pore. The potential is projected into a plane perpendicular to the
membrane, while the slice plane is slightly tilted relative to the membrane normal to follow
the course of the trans-membrane pore. The values at the white contours denote the fraction
of the trans-membrane potential at the respective coordinate. It can be seen that the membrane
potential distribution within the protein does not show a simple linear dependency on the z
coordinate. b) The mobile source charge distribution of the trans-membrane potential in the
same slice and projection planes as in a). Darker red or blue shades denote higher negative
or positive charge density, respectively. It can be seen that most of the unbalanced charge is
concentrated close to the membrane and in the depressions of the protein surface.

The calculation of binding properties with a continuum electrostatics/molecular mechanics
model is described in a short version on the page about GMCT and GCEM. A more detailed
description is found in the user manual of GCEM (found in the directory doc of the distribution)
and the corresponding paper. See [?] for a review of titration calculations with continuum
electrostatics within a classical two-state model.

3 General information on MEAD usage
Here, you find information on general options and input files common to all MEAD programs.

3.1 Program Usage

Program settings are in most cases specified with command line options. Input files are
described below. The last command line arguments specify the name of the molecule. The
program is called with:
program dir/program [options] [molname(s)]
where molname(s) are the names of the molecule(s) for which the calculation will be done.
molname(s) are used as prefix for some input files.

3.2 Program Options and Their Defaults

• -epsin float: relative dielectric constant of a molecular interior region (no default
value; 4.0 is recommended)

• -epsin[1-3] float: for multidielectric usage

• -epsext float (80): dielectric constant of an exterior/solvent region (in potential,
my xyz solver)

• -epssol float (80): dielectric constant of an exterior/solvent region (in multiflex
and solvate)

• -epsvac float (1): dielectric constant of vacuum (solvate)

• -T float (298.15): absolute temperature in K. Note that this value only influences
the ion distribution, not the dielectric constant. Calculated temperature effects are
therefore not realistic.

• -solrad float (1.4): solvent probe radius for determining a solvent accessible
surface / the solvent inaccessible volume

• -sterln float (2.0): thickness of an ion-exclusion layer (Stern layer) for
determining the ion-inaccessible volume

• -ionicstr float (0.0): ionic strength in mol/l. The physiological ionic strength
is about 0.1 to 0.2 mol/l. Using a very low ionic strength of 0.01 or even 0.0 mol/l heavily
affects the results.

• -kBolt float (5.984e-06): Boltzmann constant in units of e◦²/(Å K) (elementary
charges squared per Ångström and Kelvin). This constant must be adjusted if other
units of charge, length or temperature are used.

• -econv float (332.063202): conversion factor from e◦²/Å (elementary charges
squared per Ångström) to kcal/mol. This constant must be adjusted if units of
energy and electrostatic potential other than kcal/mol and kcal/(mol e◦) are needed in
the output.

• -conconv float (6.022214e-04): conversion factor from mol/l (moles per liter)
to 1/Å³ (particles per cubic Å). This constant must be adjusted if other units of length or
concentration are used.

• -epssave oldway: flag that triggers the old style of averaging the dielectric constant
between two potential lattice points with differing dielectric constants in the finite
difference method. The new way involves inverse averaging and is similar to
a proposal by McCammon. The old way is a simple mean. The new way is significantly
more accurate; the option to do it the old way is only provided for the sake of reproducing
old experimental results. It is not recommended otherwise.

• -converge oldway: Revert to the old way of testing for convergence of the successive
over-relaxation (SOR) method of solving the finite difference representation of the
linearized Poisson-Boltzmann equation. The new method gives improved long range
accuracy for large lattices, but at a sometimes substantial computational cost. See the
NEWS file for further discussion.

• -blab[1-3]: flag that controls the verbosity of the programs while writing to stdout.
Specifying no blab flag at all is least verbose; in that case, only essential output is
written to stdout. Writing to output files is not affected.
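The default values of the three unit constants above can be cross-checked from SI constants. The following is a minimal sketch, assuming CODATA 2018 values and the thermochemical calorie; the variable names are chosen here for illustration and are not MEAD identifiers. Small deviations from the documented defaults in the last digits are expected, because the defaults were presumably derived from an older set of constants.

```python
import math

# SI constants (CODATA 2018); assumptions of this sketch, not MEAD source values
E_CHARGE = 1.602176634e-19   # elementary charge in C
K_B = 1.380649e-23           # Boltzmann constant in J/K
N_A = 6.02214076e23          # Avogadro constant in 1/mol
EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
J_PER_KCAL = 4184.0          # thermochemical kilocalorie in J
ANGSTROM = 1.0e-10           # 1 Å in m

# Coulomb energy of two elementary charges at 1 Å distance, in J
e2_per_angstrom = E_CHARGE**2 / (4.0 * math.pi * EPS0 * ANGSTROM)

# Boltzmann constant in e^2/(Å K)
kbolt = K_B / e2_per_angstrom

# Conversion factor from e^2/Å to kcal/mol
econv = e2_per_angstrom * N_A / J_PER_KCAL

# Conversion factor from mol/l to particles per Å^3 (1 l = 1e27 Å^3)
conconv = N_A / 1.0e27

print(f"kBolt   = {kbolt:.4e}")
print(f"econv   = {econv:.4f}")
print(f"conconv = {conconv:.6e}")
```

If other units of charge, length, temperature or energy are needed, the same three expressions can be re-evaluated with the corresponding unit factors.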

visualization options (my xyz solver):

• -write pot: flag that triggers the output of the electrostatic potential

• -write rho: flag that triggers the output of the charge distribution

• -write eps: flag that triggers the output of the dielectric constant distribution

• -write ely: flag that triggers the output of the ionic strength distribution

• -x float: x coordinate of the grid center used for the OpenDX volumetric data file.

• -y float: y coordinate of the grid center used for the OpenDX volumetric data file.

• -z float: z coordinate of the grid center used for the OpenDX volumetric data file.

• -space float: grid spacing used for the OpenDX volumetric data file.

• -count int [int int]: edge length of the grid used for the OpenDX volumetric
data output in grid points. If one number is specified, all edges have the same length. If
three numbers are specified, they correspond to the edge lengths in x, y and z direction.
For the output of the electrostatic potential, the grid must be entirely covered by the grid
used in the calculation of the electrostatic potential (specified in the .ogm or .mgm file).

• -gmt 7 floats 1 int [6 more floats]: Specification of this option triggers
the output of the quantities requested by write pot, write rho, write eps and write ely as
two-dimensional curves for plotting with the GMT program suite or other plotting
software. The arguments are xcenter, ycenter, zcenter, xnormal, ynormal, znormal, grid
spacing, number of grid points [, xcenter2, ycenter2, zcenter2, xnormal2, ynormal2, znormal2].
The first three numbers define the x, y and z coordinates of the grid/plane center. The
next three numbers define the components of the normal vector of the plane. The seventh
number is the grid spacing, and the following integer defines the edge length of the output
grid in grid points. Optionally, six more values can be specified to define the base point
and the normal vector of a second plane. In this case, the values of the requested functions
are calculated in this second plane and then projected onto the first plane. This feature
can, for example, be useful for a trans-membrane pore that is not perfectly normal to the
membrane plane, as shown in Figure 1.
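The requirement stated for -count, that the output grid must lie entirely within the grid used for the potential calculation, can be checked before starting a run. The sketch below assumes that a grid with n points per edge and spacing h extends h(n-1)/2 to either side of its center; the function names are hypothetical and not part of MEAD.

```python
def grid_bounds(center, spacing, count):
    """Return the lower and upper corner of a grid with count[i] points along axis i."""
    half = [spacing * (n - 1) / 2.0 for n in count]
    lower = [c - h for c, h in zip(center, half)]
    upper = [c + h for c, h in zip(center, half)]
    return lower, upper


def is_covered(inner, outer, tol=1e-9):
    """True if the grid `inner` lies entirely within the grid `outer`.

    Each grid is a tuple (center, spacing, count).
    """
    inner_lo, inner_hi = grid_bounds(*inner)
    outer_lo, outer_hi = grid_bounds(*outer)
    return all(
        o_lo - tol <= i_lo and i_hi <= o_hi + tol
        for i_lo, i_hi, o_lo, o_hi in zip(inner_lo, inner_hi, outer_lo, outer_hi)
    )


# Example values chosen for illustration only
output_grid = ((0.0, 0.0, 0.0), 1.0, (11, 11, 21))   # as given by -x -y -z -space -count
calc_grid = ((0.0, 0.0, 0.0), 0.5, (81, 81, 81))     # one line of an .ogm or .mgm file
print(is_covered(output_grid, calc_grid))
```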

3.3 Input Files

• name.pqr:
contains a molecule structure in a format similar to that of a PDB file but with atomic
partial charges and radii in the occupancy and B factor columns, respectively. More
specifically, lines beginning with either "ATOM" or "HETATM" (no leading spaces) are
interpreted as a set of tokens separated by one or more spaces or TAB characters. Note
that the .pqr format does not support some PDB features, such as altLoc fields and a
one-character chainID between resName and resSeq. Supporting them would break the
whitespace-separated tokens convention that allows for easy processing with perl scripts, etc.
Instead, we optionally allow for additional digits at the end of the line to specify global
conformation, chain, residue and instance, where instance is a certain form of a site or residue.
If you have a PDB file and need to generate a PQR file, this implies making some choices
about the charges and radii. This is similar to making a choice about what force field
to use in an MD simulation. MEAD per se doesn't make the choice for you. However,
in the utilities subdir are some tools that may be useful if you want to use CHARMM
or PARSE parameters. The Amber program suite comes with a program, ambpdb, that
allows you to generate a PDB or PQR file given Amber format files. Another option is
the program pdb2pqr. format:
ATOM/HETATM atnum atname resname resnum x y z charge radius {confid chainid siteid instid}
...

The contents of the first two columns are ignored. The last four columns are optional
(used, e.g., by GCEM). atname is the atom name. resname is the residue name. resnum
is the residue number. x,y,z are the x, y, and z coordinates of the atom (floating point
numbers). charge is the atomic partial charge of the atom. radius is the atom radius.
confid, chainid, siteid and instid are integer numbers that identify the global conformation,
the polymer chain, the site and the instance to which the atom belongs, respectively.
• name.ogm and name.mgm:
define the cubic grids for the computation of electrostatic potentials with the finite
difference method. format:

grid_center_1 grid_spacing_1 grid_dimension_1
...
grid_center_N grid_spacing_N grid_dimension_N

grid_center is the grid center and can be given as three floating point numbers
specifying the coordinates directly or as a centering style. Optionally, the coordinate values
can be enclosed in parentheses and separated by commas. For the centering style, three
options are available.

– ON_ORIGIN specifies that the grid center is placed on the origin of the coordinate
system (0.0, 0.0, 0.0).
– ON_GEOM_CENT specifies that the grid center is to be placed on the geometric
center of the molecule / receptor or site denoted by name.
– ON_CENT_OF_INTEREST specifies that the grid center is to be placed on the
geometric center of a site of interest.

grid_spacing is the spacing between two grid points in Å. grid_dimension is the edge
length of the cubic grid in grid points and must be an odd integer. Each grid must be
smaller than the previous grid, and normally it will also have a smaller grid spacing.

• name.fpt:
is a file containing the coordinates at which the electrostatic potential shall be
calculated (traditional format), or a file containing the atomic partial charges and their
coordinates (extended format I for the application programs named my xyz solver),
or a file containing the atomic partial charges and their coordinates for each instance of
each site found in a receptor (extended format II for the application programs named
my xyz solver).

– traditional format (for command line option -ProteinField/-ReactionField):

coordinate_1
...
coordinate_N

where coordinate consists of three floating point values that denote the x, y and
z coordinates of a point, respectively. Optionally, the coordinate values can be
enclosed in parentheses and separated by commas. The line breaks can be substituted
with any number of whitespace characters.
– extended format I (for command line option -pf):

coordinate_1 charge_1
...
coordinate_N charge_N

coordinate consists of three floating point values that denote the x, y and z
coordinates of a point, respectively. Optionally, the coordinate values can be enclosed
in parentheses and separated by commas. charge is the atomic partial charge at
the preceding coordinate. The line breaks can be substituted with any number of
whitespace characters.
– extended format II (for command line option -fpt):

site_of_1 instance_of_1 coordinate_1 charge_1

...
site_of_N instance_of_N coordinate_N charge_N

site and instance denote the site and instance to whose charge distribution the
following atomic partial charge belongs. coordinate consists of three floating
point values that denote the x, y and z coordinates of a point, respectively. Optionally,
the coordinate values can be enclosed in parentheses and separated by commas.
charge is the atomic partial charge at the preceding coordinate. The line breaks
can be substituted with any number of whitespace characters.

• name.potat:
This is a binary file produced by some programs, to avoid costly recalculation of elec-
trostatic potentials. It contains the potential at each atom of name generated by some set
of charges. Variations of name may denote charge states, sites, conformers, solvated or
vacuum or uniform dielectric environments, depending on the application. Atomic coor-
dinates and radii and the generating charges are also included for the sake of consistency
checking. These files allow multiflex, etc. to avoid unnecessary recalculations when
all you want to do is add or alter some site, but you must be careful about which .potat
files you keep. Specify the -blab2 flag for a blow-by-blow account of attempts to read and
write .potat files in multiflex.
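Because the .pqr format described above is whitespace-delimited, it can be read with a few lines of code. The following is a minimal sketch, not the parser used by MEAD; the class and function names are hypothetical, and it assumes that the four optional trailing integers are either all present or all absent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PqrAtom:
    atname: str
    resname: str
    resnum: int
    xyz: Tuple[float, float, float]
    charge: float
    radius: float
    # (confid, chainid, siteid, instid) if the optional columns are present
    ids: Optional[Tuple[int, int, int, int]] = None


def parse_pqr_line(line: str) -> Optional[PqrAtom]:
    """Parse one line of a .pqr file; return None for non-atom lines."""
    if not line.startswith(("ATOM", "HETATM")):  # no leading spaces allowed
        return None
    tok = line.split()                           # spaces or TABs as separators
    if len(tok) not in (10, 14):                 # assumption: 0 or 4 optional columns
        raise ValueError(f"unexpected number of tokens: {len(tok)}")
    ids = tuple(int(t) for t in tok[10:14]) if len(tok) == 14 else None
    return PqrAtom(
        atname=tok[2],                           # tok[0] and tok[1] are ignored
        resname=tok[3],
        resnum=int(tok[4]),
        xyz=(float(tok[5]), float(tok[6]), float(tok[7])),
        charge=float(tok[8]),
        radius=float(tok[9]),
        ids=ids,
    )


# Made-up example line for illustration
atom = parse_pqr_line("ATOM 1 N ALA 1 1.000 2.000 3.000 -0.300 1.500 0 1 2 3")
print(atom)
```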

4 GCEM – Generalized Continuum Electrostatic Model
GCEM is a program for the automated preparation of the necessary input for GMCT from
a continuum electrostatics/molecular mechanics model. Details about the applications and the
underlying model of GCEM can be found in separate documentation. Examples for the usage
of the program can be found in the directory examples/gcem.

4.1 Program Usage

GCEM computes the energy terms used in the microstate energy function of GMCT using a
continuum electrostatics/molecular mechanics model. GCEM has a number of new features as
compared to multiflex and is based on a generalized formulation of binding theory, which offers
a wider application range.
GCEM considers one global conformation at a time. That is, a separate calculation needs to
be set up for each global conformation.
Program settings are specified with command line options. Input files are described below.
The last command line argument is the name of the molecule. The program is called with:
program dir/gcem options molname
where molname is the name of the molecule/receptor. molname is used as prefix for some
input files.
Each GCEM calculation consists of at least two program runs. The preprocessing run
generates sidechain rotamers, calculates molecular mechanics energy terms, writes a restart file for
the postprocessing run and sets up the necessary input for the continuum electrostatics
calculations that are done by independent MEAD programs (my xyz solver). This approach
allows a very efficient and simple parallelization without communication overhead (see the Perl
script rst-gcem-xyz.pl provided with the examples). The postprocessing can
also be run several times, alternating with recomputation of the electrostatic energies, to
eliminate energetically unfavorable conformers. Thereby, the continuum electrostatics calculations
are refined, resulting in a decrease of the error introduced by the inflation of the low dielectric
molecule interior by the high-energy conformers.
The intended purpose of the interior dielectric regions eps1set, eps2set and eps3set
was for modeling a quantum chemically treated region, a classically treated region and a region
of solvent filled cavities or pores inside the molecular structure that are to be excluded from the
membrane dielectric, but they might as well be used for other purposes. The interior regions
are inaccessible to mobile ions. The intended purpose of the region elycavset was to model ion
accessible cavities or depressions in the protein surface reaching into the membrane boundaries
(e.g. gorges leading to a channel entrance and trans-membrane pores with large diameter).
A membrane can be modeled by a three-layer dielectric slab representing the polar,
ion-accessible headgroup regions (optional) and the apolar, ion-inaccessible core region of the
membrane. The membrane is requested with membz.
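The three-layer slab can be pictured as a simple classification of the z coordinate. The sketch below is an illustration only: the function name is hypothetical, the boundary names follow the arguments of the membz option, the dielectric constants are example values rather than program defaults, and protein, cavity and pore regions are ignored.

```python
def membrane_region(z, z_lower_core, z_upper_core, z_lower_head, z_upper_head):
    """Classify a z coordinate (in Å) as 'core', 'head' or 'solvent'."""
    if not (z_lower_head <= z_lower_core <= z_upper_core <= z_upper_head):
        raise ValueError("core boundaries must lie within the headgroup boundaries")
    if z_lower_core <= z <= z_upper_core:
        return "core"        # apolar, ion-inaccessible
    if z_lower_head <= z <= z_upper_head:
        return "head"        # polar, ion-accessible
    return "solvent"


# Example dielectric constants for illustration only, not program defaults
EPS = {"core": 2.0, "head": 20.0, "solvent": 80.0}

# Core from -15 to 15 Å, headgroup layers from -20 to -15 Å and from 15 to 20 Å
bounds = (-15.0, 15.0, -20.0, 20.0)
regions = [membrane_region(z, *bounds) for z in (0.0, 17.5, 25.0)]
print(regions, [EPS[r] for r in regions])
```

Setting the headgroup boundaries equal to the core boundaries makes the "head" branch unreachable, which corresponds to omitting the headgroup layers.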

4.2 Program Options and Their Defaults
• -grid detail int (2): This option influences the detail of the auto-generated
grids for the computation of the electrostatic potentials with the finite difference method.
A higher value will result in larger and finer grids. The default value of 2 will in most
cases suffice for high quality results; sometimes a value of 3 can be advantageous
(especially for membrane proteins, where the apolar environment leads to more far-reaching
electrostatic interactions).

0 coarse, not advisable for production

1 economic, similar to the setting in the old multiflex examples, can result in significant
discretization errors
2 high, somewhat finer and larger grids than for 1, solvation energies and interaction
energies should be largely converged with respect to grid spacing
3 very high, even finer and larger innermost grids, ensures very high-quality solvation
and interaction energies also for long-range interactions in membrane proteins
4 ultra high, extremely fine grids, should not result in significant numeric differences
of the result relative to 3, needs very much memory and computation time

If user supplied grids exist, the setting affects the automatic adjustment of the innermost
grid’s size to the size of each site. A higher value will result in a larger spacing of the site
to the grid boundary.

• -epsext float: Floating point number that defines the dielectric constant of the solvent
region and the region specified by elycavset (if applicable).

• -epshead float: Floating point number that defines the dielectric constant of the
membrane head group region (if applicable / membz is defined).

• -epscore float: Floating point number that defines the dielectric constant of the
membrane core region (if applicable / membz is defined).

• -epsin1 float: Floating point number that defines the dielectric constant of the region
specified by eps1set (if applicable).

• -epsin2 float: Floating point number that defines the dielectric constant of the region
specified by eps2set (if applicable).

• -epsin3 float: Floating point number that defines the dielectric constant of the region
specified by eps3set (if applicable).

• -eps ff float: Floating point number that defines the dielectric constant of the reference
environment used in force field parametrization.

• -eps1set string: String giving the prefix of a .pqr file that defines the dielectric
region 1 with the dielectric constant epsin1 (if applicable).

• -eps2set string: String giving the prefix of a .pqr file that defines the dielectric
region 2 with the dielectric constant epsin2 (if applicable).

• -eps3set string: String giving the prefix of a .pqr file that defines the dielectric
region 3 with the dielectric constant epsin3 (if applicable).

• -elycavset string: String giving the prefix of a .pqr file that defines an exterior
region with the dielectric constant epsext and ionic strength defined by ionicstr or
ionicstr1 and ionicstr2 according to the membrane side (if applicable).

• -membz z lower core z upper core z lower head z upper head: Boundaries
of a three-layer dielectric slab representing the polar, ion-accessible headgroup
regions and the apolar, ion-inaccessible core region of the membrane (if applicable).
The membrane is perpendicular to the z-direction, hence the components of its normal
vector are given by (0,0,1). The headgroup region extends from z=z lower head to
z=z upper head excluding the core region. The core region extends from z=z lower core
to z=z upper core. The headgroup region can be omitted by
setting z lower core = z lower head and z upper core = z upper head. The ionic strength
on the membrane sides can be set to equal values by specifying ionicstr or set to distinct
values by specifying ionicstr1 and ionicstr2. The ionic strength within the boundaries
of the core region (e.g., in a channel pore defined by membhole or elycavset) is linearly
interpolated between ionicstr1 and ionicstr2.

• -ionicstr float: Ionic strength in the exterior region(s) (by default in mol/l).

• -ionicstr1 float: Ionic strength in the upper exterior region(s) (z > z upper core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr2 must also be specified.

• -ionicstr2 float: Ionic strength in the lower exterior region(s) (z < z lower core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr1 must also be specified.

• -membhole radius cent x cent y: Exclude a cylindrical hole from the membrane.
The radius of the hole is given by radius. The optional arguments cent x and
cent y specify the x and y coordinates of the cylinder center, respectively (default values
0.0 and 0.0). This option rarely works satisfactorily for real proteins with non-cylindrical
shape. The use of elycavset and/or one of the interior dielectric regions, for example
eps3set, is to be preferred.

• -inside (1): This option defines which membrane side is “inside” (cytoplasmic) in
the electrophysiological sense, where the membrane potential is measured at the inside
relative to the outside. A value of 1 states that the lower membrane side is inside, while a
value of 0 states that the upper side is inside.

• -capacitance: This flag triggers the calculation of the capacitance of the system.
The capacitance is the accumulated charge (in ampere seconds = coulomb) per membrane
potential (in volt). Hence, the capacitance is given by default in farad (F = As/V).
The calculation of the capacitance requires tighter convergence criteria for the electrostatic
potentials, which increases the required computation time. The capacitive energy

is only needed if there are multiple global conformations. The difference between the
capacitive energy terms of different global conformations is negligibly small at
naturally occurring values of the membrane potential, but depends
quadratically on the membrane potential. Caution: The capacitance and hence the capacitive
energy depend on the system size (mainly on the area of the covered membrane region).
Therefore, the same grid size must be used for the innermost grid in the calculation of the
capacitance for all global conformations (normally automatically taken care of).

• -charmm par string: Name of a CHARMM parameter file containing the CHARMM
force field parameters. Examples are provided in the subdirectories of
examples/gcem. The file is not read if there are only non-flexible sites (according to
the definitions in molname.rot) and skip stat mmg and skip stat mmint are given on the
command line.

• -charmm top string: Name of a CHARMM topology file containing the residue
topology definitions. Double bonds must be specified as such for correct automatic
identification of the sidechain torsions. Examples are provided in the subdirectories of
examples/gcem. The file is not read if there are only non-flexible sites (according to
the definitions in molname.rot) and skip stat mmg and skip stat mmint are given on the
command line.

• -rotlib string (rotlib): Prefix of the name of a file containing the Squirell
backbone dependent rotamer library (tested with 2002 and 2010 versions). First, it is
attempted to read rotlibname.dat as binary boost archive. If the binary file is not found, it
is attempted to read rotlibname.txt. The file is not read if there are only non-flexible sites
(according to the definitions in molname.rot).

• -skip stat mmint: Flag that triggers the omission of molecular mechanics contributions
to the intrinsic energies of each non-flexible site (according to the specification of
the site type in molname.rot).

• -skip stat mmg: Flag that triggers the omission of molecular mechanics contributions
to each interaction energy that involves a non-flexible site (according to the specification
of the site type in molname.rot).

• -empirical energy: Flag that triggers the use of an empirical energy function for
the part of the intrinsic energy that involves atoms of the site itself and nearby backbone
residues that are thought to determine the rotamer propensity in the backbone dependent
rotamer library. The empirical energy is defined as −β⁻¹ ln[p], where p is the propensity for
the corresponding sidechain rotamer in the rotamer library.

• -print chm: Flag that triggers detailed output regarding the molecular mechanics
terms to stdout (independent of the blab level).

• -write binary rotlib: Flag that triggers writing of the rotamer library as binary
archive (using the Boost serialization library).

other program specific options:

• -gcem dir string: Name of the subdirectory to which gcem will write input files
for the continuum electrostatics solvers and the corresponding job script(s). In addition,
the directory will contain some deprecated files that are no longer used but may be helpful
in debugging while making changes to the source code.

• -mead path string: Path to the root directory of the MEAD distribution used for
the calls to the continuum electrostatics solvers in the job script(s). The option is useful
if the solver jobs are run on a different system, for example a remote computer cluster.

• -filter ctof float (1e99): Intrinsic energy cutoff (in kcal/mol) for eliminating
conformers with excessively high intrinsic energy, which makes them unlikely to be
populated significantly in equilibrium. A conformer (for example a sidechain rotamer)
consisting of N instances (for example different binding forms of the sidechain) is eliminated
if the lowest intrinsic energy among its instances is larger than the lowest
intrinsic energy among all instances of the site plus filter ctof.

• -filter ctof gs float (1e99): Energy cutoff (in kcal/mol) for eliminating conformers
with excessively high energy, which makes them unlikely to be populated significantly
in equilibrium. The elimination criterion is almost identical to the Goldstein
criterion of dead end elimination, with the exception that the hypothetical minimum energy
must be larger by filter ctof gs rather than by 0. A conformer (for example a sidechain
rotamer) consisting of N instances (for example different binding forms of the sidechain) is
eliminated if all of its instances fulfill the modified Goldstein criterion.

• -cap max nin int: Flag that triggers the reduction of the number of instances of any
site to a number ≤ cap max nin. Keep no more than cap max nin instances of a site with
lowest sum of intrinsic energy and sum of minimum interaction energies with all other
sites. The method is equivalent to repeated application of the modified Goldstein criterion
to all conformers with successively decreasing filter ctof gs until the number of instances
is equal to or smaller than cap max nin.

• -print debug: Flag that triggers detailed output regarding the read and generated
data to stdout (independent of the blab level). The flag is mainly intended for verification
purposes while debugging new program features.
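The intrinsic energy cutoff described for filter ctof can be illustrated with a few lines of code. This is a sketch of the criterion as worded above, not the GCEM implementation; the function name, the data layout and the energies are made up for illustration.

```python
def filter_conformers(intrinsic, cutoff=1e99):
    """Drop conformers whose best instance lies more than `cutoff` above the site minimum.

    intrinsic: dict mapping a conformer label to the list of intrinsic
               energies (kcal/mol) of its instances.
    """
    # Lowest intrinsic energy among all instances of the site
    site_min = min(min(energies) for energies in intrinsic.values())
    # Keep a conformer if its lowest-energy instance is within the cutoff
    return {
        label: energies
        for label, energies in intrinsic.items()
        if min(energies) <= site_min + cutoff
    }


# Hypothetical site with three rotamers, each with two binding forms
site = {"rot1": [0.0, 2.0], "rot2": [5.0, 7.5], "rot3": [12.0, 30.0]}
kept = filter_conformers(site, cutoff=10.0)
print(sorted(kept))
```

With the default cutoff of 1e99 no conformer is removed, which matches the documented default behavior of the option.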

4.3 Input Files

• molname.pqr:
The receptor structure. The file format is described above under general options and input
files.

• molname.ogm and molname.mgm (optional):

define the electrostatics grids for the site in the receptor environment and for the model
compound in the reference environment (bulk solution), respectively. The file format is
described above under general options and input files.

• molname.rot:
defines the sites of molname and their types. For each site, the file contains a line of the
format:

site_label site_type

The site_label is constructed from sitename-chainid-resid, where chainid and resid
correspond to the data found in molname.pqr (chainid set to 0 if absent in molname.pqr).
site_type is one of:

0 ignored as site
1 flexible and titratable (sitename.est required, see below). User-defined conformers
are read from molname.pqr; flexible parts must be present for each conformer with
unique instance IDs differing from 0. For amino acid residues, additional sidechain
rotamers are generated (currently not implemented for Pro), using dihedral angles
from the Squirell backbone dependent rotamer library. For each conformer, the
different forms defined in sitename.est are generated.
2 flexible and non-titratable (sitename.est not required). User-defined conformers are
read from molname.pqr; flexible parts must be present for each conformer with
unique instance IDs differing from 0. For amino acid residues, additional sidechain
rotamers are generated (currently not implemented for Pro), using dihedral angles
from the Squirell backbone dependent rotamer library.
3 non-flexible and titratable (sitename.est required, see below). For the conformer
found in molname.pqr, the different forms defined in sitename.est are generated.
4 quantum mechanically treated site (QM site, site_label.qst required, see below).
Conformational flexibility and orientational polarization effects are assumed to be
accounted for within the QM treatment. The model energy should also contain any
energy terms due to bound ligands, and any other energy contribution apart from the
interactions with the rest of the protein. The model energy of a QM site is expected
to be completely defined in site_label.qst, where site_label is given by
sitename-chainid-resid. The coordinates and atomic partial charges of the QM site are
taken from molname.pqr or separate .pqr files as described below for site_label.qst.

• molname.con:
defines the connectivity of the sites to the membrane sides. The file is only required for
membrane proteins. For each site, the file contains a line of the format:

site_label connectivity

The site_label is constructed from sitename-chainid-resid, where chainid and resid
correspond to the data found in molname.pqr (chainid set to 0 if absent in molname.pqr).
connectivity is one of:

0 The site binds ligands from the outer membrane side.

1 The site binds ligands from the inner membrane side.

• sitename.est:
An .est file defines forms of a titratable site. Example:

label NTP NTD1 NTD2 NTD3

Gmodel 0 10.881 10.881 10.881
proton 1 0 0 0
center N
LYS CA 0.21 0.18 0.18 0.18
LYS HA 0.10 0.10 0.10 0.10
LYS N -0.30 -0.96 -0.96 -0.96
LYS HT1 0.33 NaN 0.34 0.34
LYS HT2 0.33 0.34 NaN 0.34
LYS HT3 0.33 0.34 0.34 NaN

The keywords have the following meaning:

– label: labels for the forms (instances)

– Gmodel: model energy, (relative) chemical potentials of the forms (see also the
program manual of GMCT)
– EpsRef: specification is optional, relative dielectric constant of the reference envi-
ronment to which the Gmodel value refers
– proton, electron ...: Any keyword or line pattern not matching one of the
other entries of this table names a ligand type. The numbers that follow specify
how many ligands of this type are bound by each form.
– center: currently ignored. Formerly used to define the atom of the model compound
to be used as center of the finite difference grids; GCEM now uses the geometric
center of the model compound.
– remaining lines: residue name, atom name, atomic partial charges of the atoms in
each form

• site_label.qst: A .qst file defines forms of a QM site. Example:

label FES1010 FES1111

Gmodel -1.4232 0
EpsRef 1 1
proton 1 2
electron 1 2

– label: labels for the forms

– Gmodel: model energy, (relative) chemical potentials of the forms (see also the
program manual of GMCT)

– EpsRef: specification is optional, relative dielectric constant of the reference envi-
ronment to which the Gmodel value refers
– proton, electron ...: Any keyword not found among the preceding keywords
names a ligand type. The numbers that follow specify how many ligands of this
type are bound by each form.

Conformational flexibility and orientational polarization effects are assumed to be
accounted for within the QM treatment. The model energy should also contain any energy
terms due to bound ligands, and any other energy contribution apart from the interactions
with the rest of the protein. For each instance, a structure file named site label instance label.pqr
is expected to exist. The model energy of a QM site is expected to be completely defined
in extended site label.qst, where extended site label is constructed from the site label as
found in molname.rot by inserting the conformer id confid, which gives
sitename-confid-chainid-resid. Atomic coordinates and partial
charges of atoms of the site found in any structure of the QM site are ignored if also
found in molname.pqr. Capping atoms and groups are assumed not to be present in the
structures; the user has to remove them during structure preparation. Bonded molecular
mechanics terms involving link-atoms are assumed to be equal for all instances of the
QM site, and are thus neglected. If you wish to include such energy contributions, you can
add them to the model energy. There are unavoidable ambiguities and inconsistencies in
the treatment of the link regions. Therefore, it is advisable to choose the extent of the QM
site such that any significant conformational flexibility occurs well within the QM site.
In this way, the above assumption (equal energy contributions for all instances of the QM
site due to the link region) is justified just as for the other site types.

• rotlibname.txt/dat:
Squirell backbone dependent rotamer library in ASCII format or as Boost binary archive.
The file format is described on the rotamer library website of the Dunbrack group (http://dunbrack.fccc.edu/b

• CHARMM parameter and residue topology files:

The file format is described on www.charmm.org. The automatic determination of the
amino acid sidechain torsion angles requires that double bonds are specified as such in
the residue topology file.
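
The layout of the .est file shown above for sitename.est can be read with a few lines of Python. The following is only a sketch under stated assumptions, not the parser used by GCEM: the function name parse_est and the returned dictionary layout are hypothetical, the label line is assumed to come first, and atom lines are recognized by having two leading names followed by one value per form. NaN entries are kept as floating point NaN without further interpretation.

```python
import math


def parse_est(text):
    """Parse the text of an .est file into a dictionary (illustrative sketch)."""
    est = {"ligands": {}, "charges": {}}
    n_forms = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        key = fields[0]
        if key == "label":
            est["label"] = fields[1:]
            n_forms = len(est["label"])
        elif key in ("Gmodel", "EpsRef"):
            est[key] = [float(x) for x in fields[1:]]
        elif key == "center":
            est["center"] = fields[1]
        elif n_forms is not None and len(fields) == n_forms + 2:
            # residue name, atom name, one partial charge per form
            est["charges"][(fields[0], fields[1])] = [float(x) for x in fields[2:]]
        else:
            # any other keyword names a ligand type with one count per form
            est["ligands"][key] = [int(x) for x in fields[1:]]
    return est


example = """\
label   NTP   NTD1   NTD2   NTD3

Gmodel  0     10.881 10.881 10.881
proton  1     0      0      0
center  N
LYS CA   0.21  0.18  0.18  0.18
LYS N   -0.30 -0.96 -0.96 -0.96
LYS HT1  0.33  NaN   0.34  0.34
"""
est = parse_est(example)
print(est["label"], est["ligands"])
```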

5 my_2diel_solver
This program computes the electrostatic potential and the corresponding electrostatic energy
terms of a site in a two-dielectric environment. Additional features enable the use of this pro-
gram for visualization purposes and as helper program for GCEM.
The program calculates the electrostatic interaction energy of sitename with
backgroundname and the Born solvation energy of sitename. The calculation of the Born
solvation energy requires a second calculation with one of the my_xyz_solver programs,
which serves as reference point and cancels the grid artifacts.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.

5.1 Program Usage

Program settings are specified with command line options. Input files are described below. The
program is called with:
program_dir/my_2diel_solver {options} sitename backgroundname
where sitename is the name of the site used as prefix for the corresponding .pqr and .ogm
files. backgroundname is the name of the background (structure containing all atoms not be-
longing to any site) and used as prefix for the corresponding .pqr file.
The intended purpose of the interior dielectric regions eps1set and eps2set was for modeling
a quantum chemically treated region and a classically treated region, but they might as well
be used for other purposes. The interior regions are inaccessible to mobile ions. The intended
purpose of the region elycavset was to model ion accessible cavities or depressions in the protein
surface reaching into the membrane boundaries (e.g. gorges leading to a channel entrance and
trans-membrane pores with large diameter).
Examples for the program usage can be found in the subdirectories located in examples/gcem.
GCEM will create the input files and a job script job.sh using the my_xyz_solvers in the directory
gcem_dir/gcem.

5.2 Program options and their defaults

general options are described above
program specific options for the continuum electrostatics calculation:
• -epsext float: Floating point number that defines the dielectric constant of the sol-
vent region outside the solvent inaccessible volume of backgroundname.
• -epsin float: Floating point number that defines the dielectric constant of the inte-
rior of backgroundname.

• -ionicstr float: Ionic strength in the exterior region outside the solvent inaccessible
volume of backgroundname (by default in mol/l).

5.3 Input Files

• sitename.pqr, backgroundname.pqr and additional .pqr files used to define the dielectric
regions are structure files in .pqr format. The file format is described above under general
options and input files.

• sitename.mgm defines the electrostatics grids for the site in the receptor environment.
The file format is described above under general options and input files.

6 my_3diel_solver
This program computes the electrostatic potential of a site in a three-dielectric environment and
the corresponding electrostatic energy terms. Additional features enable the use of this program
for visualization purposes and as helper program for GCEM.
The program calculates the electrostatic interaction energy of sitename with
backgroundname and the Born solvation energy of sitename. The calculation of the Born
solvation energy requires a second calculation with one of the my_xyz_solver programs,
which serves as reference point and cancels the grid artifacts. If the option -fpt is
specified, the electrostatic interaction energies with other sites can be calculated.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.

6.1 Program usage

Program settings are specified with command line options. Input files are described below. The
program is called with:

program_dir/my_3diel_solver {options} sitename backgroundname

where sitename is the name of the site used as prefix for the corresponding .pqr and .ogm
files. backgroundname is the name of the background (structure containing all atoms not be-
longing to any site) and used as prefix for the corresponding .pqr file.
The intended purpose of the interior dielectric regions eps1set and eps2set was for modeling
a quantum chemically treated region and a classically treated region, but they might as well
be used for other purposes. The interior regions are inaccessible to mobile ions. The intended
purpose of the region elycavset was to model ion accessible cavities or depressions in the protein
surface reaching into the membrane boundaries (e.g. gorges leading to a channel entrance and
trans-membrane pores with large diameter).
Examples for the program usage can be found in the subdirectories located in examples/gcem.
GCEM will create the input files and a job script job.sh using the my_xyz_solvers in the directory
gcem_dir/gcem.

6.2 Program options and their defaults

general options are described above
program specific options for the continuum electrostatics calculation:

• --fpt string: Prefix of a file fptname.fpt for the calculation of site-site interaction
energies

• --epsext float: Floating point number that defines the dielectric constant of the
solvent region and the region specified by elycavset (if applicable).

• --epsin1 float: Floating point number that defines the dielectric constant of the
region specified by eps1set (if applicable).

• --epsin2 float: Floating point number that defines the dielectric constant of the
region specified by eps2set (if applicable).

• --epshomo float: Floating point number that defines the dielectric constant of an
additional homogeneous dielectric (optional).

• --eps1set string: String giving the prefix of a .pqr file that defines the dielectric
region 1 with the dielectric constant epsin1 (if applicable).

• --eps2set string: String giving the prefix of a .pqr file that defines the dielectric
region 2 with the dielectric constant epsin2 (if applicable).

• --ionicstr float: Ionic strength in the exterior region(s) (by default in mol/l).

6.3 Input Files

• sitename.ogm defines the electrostatics grids for the site in the receptor environment. The
file format is described above under general options and input files.

• fptname.fpt is a file containing the atomic partial charges and their coordinates for each
instance of each site found in a receptor
extended format II (for command line option -fpt):

site_of_1 instance_of_1 coordinate_1 charge_1

...
site_of_N instance_of_N coordinate_N charge_N

site and instance denote the site and instance to whose charge distribution the following
atomic partial charge belongs. coordinate consists of three floating point values that
denote the x, y and z coordinates of a point, respectively. Optionally, the coordinate values
can be enclosed in parentheses and separated by commas. charge is the atomic partial
charge at the preceding coordinate. The line breaks can be substituted with any number
of whitespace characters.
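
Because line breaks in a .fpt file are interchangeable with other whitespace, the file can be read as a flat token stream. The sketch below illustrates this under stated assumptions and is not the reader used by the solvers: the function name parse_fpt is hypothetical, site and instance labels are assumed to contain no parentheses or commas, and the labels in the sample text are invented for the example.

```python
def parse_fpt(text):
    """Parse .fpt text into (site, instance, (x, y, z), charge) records."""
    # Parentheses and commas around coordinates are optional, so they are
    # simply turned into whitespace before splitting into tokens.
    for ch in "(),":
        text = text.replace(ch, " ")
    tokens = text.split()
    if len(tokens) % 6 != 0:
        raise ValueError("expected 6 values per charge: site instance x y z q")
    records = []
    for i in range(0, len(tokens), 6):
        site, instance = tokens[i], tokens[i + 1]
        x, y, z, charge = (float(t) for t in tokens[i + 2:i + 6])
        records.append((site, instance, (x, y, z), charge))
    return records


# Hypothetical sample: one record with bracketed coordinates, one record
# spread over two lines.
sample = """
GLU-A-35 prot   (1.0, 2.0, 3.0) -0.5
GLU-A-35 deprot 1.0 2.0 3.0
-0.8
"""
records = parse_fpt(sample)
print(records)
```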

7 my_Ndiel_solver
This program computes the electrostatic potential of a site in an environment with an arbitrary
number of dielectric and electrolyte regions and the corresponding electrostatic energy terms.
Additional features enable the use of this program for visualization purposes and as helper
program for GCEM.
The program calculates the electrostatic interaction energy of sitename with
backgroundname and the Born solvation energy of sitename. The calculation of the Born
solvation energy requires a second calculation with one of the my_xyz_solver programs,
which serves as reference point and cancels the grid artifacts. If the option -fpt is
specified, the electrostatic interaction energies with other sites can be calculated.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.

7.1 Program usage

Program settings are specified with command line options. Input files are described below. The
program is called with:

program_dir/my_Ndiel_solver {options} sitename backgroundname

7.2 Program options and their defaults

general options are described above

7.3 Input files

• sitename.ogm defines the electrostatics grids for the site in the receptor environment.
The file format is described above under general options and input files.
• backgroundname.diel is a file defining the dielectric regions.
Format of a single line corresponding to a dielectric region:

pqrname eps solrad

pqrname is the prefix of a structure file that determines the ion-inaccessible volume of
the region.
eps is the relative dielectric constant of the dielectric region.
solrad is the solvent probe sphere radius used to define the dielectric region.
The priority of the regions increases from the top to the bottom of the list. Solvent inac-
cessible regions of previous entries are overridden by the solvent inaccessible regions of
following entries.
• backgroundname.ely is a file defining the electrolyte regions.
Format of a single line corresponding to an electrolyte region:

pqrname istr ionrad

pqrname is the prefix of a structure file that determines the ion-inaccessible volume of
the region.
istr is the ionic strength in the ion-accessible volume of the electrolyte region.
ionrad is the ion radius (Stern layer radius) used to define the electrolyte region.
The priority of the regions increases from the top to the bottom of the list. Ion accessi-
ble regions of previous entries are overridden by the ion-accessible regions of following
entries.
• fptname.fpt is a file containing the atomic partial charges and their coordinates for
each instance of each site found in a receptor
extended format II (for command line option -fpt):

site_of_1 instance_of_1 coordinate_1 charge_1

...
site_of_N instance_of_N coordinate_N charge_N

site and instance denote the site and instance to whose charge distribution the following
atomic partial charge belongs. coordinate consists of three floating point values that
denote the x, y and z coordinates of a point, respectively. Optionally, the coordinate values
can be enclosed in parentheses and separated by commas. charge is the atomic partial
charge at the preceding coordinate. The line breaks can be substituted with any number
of whitespace characters.
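
The priority rule for the region lists in backgroundname.diel (later entries override earlier ones) can be illustrated with a short sketch. It rests on assumptions that are not part of the program: the names parse_diel and eps_at_point are hypothetical, the geometric test whether a point lies inside the volume of a region is replaced by a set of region names supplied by the caller, and eps_default stands in for the dielectric constant outside all listed regions.

```python
from typing import NamedTuple


class DielRegion(NamedTuple):
    pqrname: str   # prefix of the structure file defining the region
    eps: float     # relative dielectric constant of the region
    solrad: float  # solvent probe sphere radius


def parse_diel(text):
    """Read 'pqrname eps solrad' lines, keeping the order of the file."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        regions.append(DielRegion(fields[0], float(fields[1]), float(fields[2])))
    return regions


def eps_at_point(regions, inside, eps_default):
    """Dielectric constant at a point lying inside the regions named in `inside`."""
    eps = eps_default
    for region in regions:      # top to bottom of the file
        if region.pqrname in inside:
            eps = region.eps    # a later entry overrides an earlier one
    return eps


# Hypothetical file content: a protein region followed by a QM region.
regions = parse_diel("protein 4.0 1.4\nqmsite 1.0 1.4\n")
print(eps_at_point(regions, {"protein", "qmsite"}, eps_default=80.0))
```

The same top-to-bottom override logic would apply to the electrolyte regions in backgroundname.ely, with istr and ionrad in place of eps and solrad.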

8 my_memb_solver
This program computes the electrostatic potential and the corresponding electrostatic energy
terms of a site in an environment that models a lipid membrane, the receptor and the solvent
phases above and below the membrane with up to 6 dielectric and 2 electrolyte regions. Addi-
tional features enable the use of this program for visualization purposes and as helper program
for GCEM.
The program calculates the electrostatic interaction energy of sitename with
backgroundname and the Born solvation energy of sitename. The calculation of the Born
solvation energy requires a second calculation with one of the my_xyz_solver programs,
which serves as reference point and cancels the grid artifacts. If the option -fpt is
specified, the electrostatic interaction energies with other sites can be calculated.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.

8.1 Program Usage

Program settings are specified with command line options. Input files are described below. The
program is called with:

program_dir/my_memb_solver {options} sitename backgroundname

where sitename is the name of the site used as prefix for the corresponding .pqr and .ogm
files. backgroundname is the name of the background (structure containing all atoms not be-
longing to any site) and used as prefix for the corresponding .pqr file.
The intended purpose of the interior dielectric regions eps1set, eps2set and eps3set was
for modeling a quantum chemically treated region, a classically treated region and a region of
solvent filled cavities or pores inside the molecular structure that are to be excluded from the
membrane dielectric, but they might as well be used for other purposes. The interior regions
are inaccessible to mobile ions. The intended purpose of the region elycavset was to model ion
accessible cavities or depressions in the protein surface reaching into the membrane boundaries
(e.g. gorges leading to a channel entrance and trans-membrane pores with large diameter).
A membrane can be modeled by a three-layer dielectric slab representing the polar,
ion-accessible headgroup regions (optional) and the apolar, ion-inaccessible core region
of the membrane. The membrane is requested with membz.
Examples for the program usage can be found in the directory examples/gcem/bR of the
distribution. GCEM will create the input files and a job script job.sh using the my_xyz_solvers
in the directory gcem_dir/gcem.

8.2 Program options and their defaults
general options are described above
program specific options for the continuum electrostatics calculation:

• -fpt string: Prefix of a file fptname.fpt for the calculation of site-site interaction
energies

• -epsext float: Floating point number that defines the dielectric constant of the sol-
vent region and the region specified by elycavset (if applicable).

• -epshead float: Floating point number that defines the dielectric constant of the
membrane head group region (if applicable / membz is defined).

• -epscore float: Floating point number that defines the dielectric constant of the
membrane core region (if applicable / membz is defined).

• -epsin1 float: Floating point number that defines the dielectric constant of the re-
gion specified by eps1set (if applicable).

• -epsin2 float: Floating point number that defines the dielectric constant of the re-
gion specified by eps2set (if applicable).

• -epsin3 float: Floating point number that defines the dielectric constant of the re-
gion specified by eps3set (if applicable).

• -epshomo float: Floating point number that defines the dielectric constant of an
additional homogeneous dielectric (optional).

• -eps1set string: String giving the prefix of a .pqr file that defines the dielectric
region 1 with the dielectric constant epsin1 (if applicable).

• -eps2set string: String giving the prefix of a .pqr file that defines the dielectric
region 2 with the dielectric constant epsin2 (if applicable).

• -eps3set string: String giving the prefix of a .pqr file that defines the dielectric
region 3 with the dielectric constant epsin3 (if applicable).

setting z lower core = z lower head and z upper core = z upper head. The ionic strength
on the membrane sides can be set to equal values by specifying ionicstr or set to distinct
values by specifying ionicstr1 and ionicstr2. The ionic strength within the boundaries
of the core region (e.g., in a channel pore defined by membhole or elycavset) is linearly
interpolated between ionicstr1 and ionicstr2.

• -ionicstr float: Ionic strength in the exterior region(s) (by default in mol/l).

• -ionicstr1 float: Ionic strength in the upper exterior region(s) (z > z upper core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr2 must also be specified.

• -ionicstr2 float: Ionic strength in the lower exterior region(s) (z < z lower core)
(by default in mol/l). Overrides ionicstr. If specified, ionicstr1 must also be specified.

• -membhole radius {cent x cent y}: Exclude a cylindrical hole from the membrane.
The radius of the hole is given by the first argument, radius. The optional arguments cent x and
cent y specify the x and y coordinates of the cylinder center, respectively (default values
0.0 and 0.0). This option seldom works satisfactorily for real proteins with non-cylindrical
shape. The use of elycavset and/or one of the interior dielectric regions, for example
eps3set, is to be preferred.
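
The linear interpolation of the ionic strength across the core region can be sketched as a function of the z coordinate. This is only an illustration of the rule stated above, not the solver's code: the function name and argument names are hypothetical, the function is meant to be evaluated only at ion-accessible points, and the numbers in the usage example are invented.

```python
def ionic_strength(z, z_lower_core, z_upper_core, ionicstr1, ionicstr2):
    """Ionic strength at height z for an ion-accessible point.

    ionicstr1 applies above the core (z > z_upper_core), ionicstr2 below
    it (z < z_lower_core); in between the value is interpolated linearly.
    """
    if z >= z_upper_core:
        return ionicstr1
    if z <= z_lower_core:
        return ionicstr2
    # Fraction of the way from the lower to the upper core boundary.
    frac = (z - z_lower_core) / (z_upper_core - z_lower_core)
    return ionicstr2 + frac * (ionicstr1 - ionicstr2)


# Hypothetical membrane core between z = -15 and z = +15,
# 0.2 mol/l above and 0.1 mol/l below the membrane.
for z in (-20.0, 0.0, 20.0):
    print(z, ionic_strength(z, -15.0, 15.0, 0.2, 0.1))
```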

8.3 Input Files

• sitename.ogm defines the electrostatics grids for the site in the receptor environment. The
file format is described above under general options and input files.

• fptname.fpt is a file containing the atomic partial charges and their coordinates for each
instance of each site found in a receptor
extended format II (for command line option -fpt):

site_of_1 instance_of_1 coordinate_1 charge_1

...
site_of_N instance_of_N coordinate_N charge_N

site and instance denote the site and instance to whose charge distribution the following
atomic partial charge belongs. coordinate consists of three floating point values that
denote the x, y and z coordinates of a point, respectively. Optionally, the coordinate values
can be enclosed in parentheses and separated by commas. charge is the atomic partial
charge at the preceding coordinate. The line breaks can be substituted with any number
of whitespace characters.

9 my_membpot_solver
This program computes the electrostatic trans-membrane potential and the corresponding
electrostatic energy terms in an environment that models a lipid membrane, the protein and
the solvent phases above and below the membrane with up to 6 dielectric and 2 electrolyte
regions (same as for my_memb_solver). Additional features enable the use of this program for
visualization purposes and as helper program for GCEM.
The program calculates the electrostatic interaction energy of backgroundname with the
charge distribution causing the electrostatic trans-membrane potential. If the option -fpt is
specified, the electrostatic interaction energies of sites with the charge distribution causing the
electrostatic trans-membrane potential are calculated. If the option -capacitance is specified, the
program calculates the capacitance of the receptor-membrane system.
All quantities that are discretized on the finite difference grid can be written as OpenDX
three-dimensional volume data, two-dimensional curves in slice planes or one-dimensional
curves along lines. The generated data can be used for visualization and verification purposes.
One might, for example, want to verify the correct assignment of dielectric regions in a conve-
nient way using a molecular structure viewer that can read OpenDX volume data, as for example
VMD or PyMol. The data can also be written as two-dimensional curves for plotting with the
GMT program suite or other plotting software.
The theoretical basis of computing the electrostatic trans-membrane potential is described
in (Roux, 1997). A constant offset potential of ±0.5Ψ is added to all ion-accessible grid points
on the inner and outer side of the membrane, respectively. Equivalently, an effective uniform
charge distribution can be assigned to the ion accessible volume on either membrane side. For
the finite difference representation, an effective charge of $q_{\mathrm{eff}} = \pm \frac{\bar{\kappa}^2 h^3 \Psi}{8\pi}$ is assigned to each
ion accessible grid point of the inner or outer membrane side, respectively. Here, $\bar{\kappa}^2$ is the
squared inverse Debye length and $h$ is the grid spacing. As it turned out, this alternative way of adding
the offset potential was also implemented by Roux and coworkers for the PBEQ module of
CHARMM.
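
The effective grid charge described above, the product of the squared inverse Debye length, the cubed grid spacing and Ψ divided by 8π, can be evaluated with a small helper. This is a sketch only: the function name and the "inner"/"outer" side labels are hypothetical, the sign convention follows the ±0.5Ψ offset stated in the text (positive on the inner side), and all quantities are assumed to be given in one consistent unit system.

```python
import math


def effective_charge(kappa_bar_sq, h, psi, side):
    """Effective charge assigned to one ion-accessible grid point.

    kappa_bar_sq : squared inverse Debye length at the grid point
    h            : grid spacing
    psi          : trans-membrane potential
    side         : "inner" (positive sign) or "outer" (negative sign)
    """
    sign = {"inner": 1.0, "outer": -1.0}[side]
    return sign * kappa_bar_sq * h**3 * psi / (8.0 * math.pi)


# Illustrative numbers only; halving h reduces the charge by a factor of 8.
q_coarse = effective_charge(0.1, 1.0, 1.0, "inner")
q_fine = effective_charge(0.1, 0.5, 1.0, "inner")
print(q_coarse, q_fine)
```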

9.1 Program Usage

Program settings are specified with command line options. Input files are described below. The
last command line argument is the name of the molecule. The program is called with:

program_dir/my_membpot_solver {options} backgroundname

A membrane can be modeled by a three-layer dielectric slab representing the polar,
ion-accessible headgroup regions (optional) and the apolar, ion-inaccessible core region
of the membrane. The membrane is requested with membz.
Examples for the program usage can be found in the directories examples/my_membpot_solver
and examples/gcem/bR of the distribution. GCEM will create the input files and a job script
job.sh using the my_membpot_solver in the directory gcem_dir/membpot.

9.2 Program options and their defaults