Methods For Molecular Dynamics Simulations of Protein Folding/unfolding in Solution
www.elsevier.com/locate/ymeth
Abstract
All atom molecular dynamics simulations have become a standard method for mapping equilibrium protein dynamics and non-equilibrium events like folding and unfolding. Here, we present detailed methods for performing such simulations. Generic protocols for minimization, solvation, simulation, and analysis derived from previous studies are also presented. As a measure of validation, our water model is compared with experiment. As an example of current applications of these methods, simulations of the ultrafast folding protein Engrailed Homeodomain are presented, including the experimental evidence used to verify their results. Ultrafast folders are an invaluable tool for studying protein behavior because folding and unfolding events measured by experiment occur on timescales accessible with the high-resolution molecular dynamics methods we describe. Finally, to demonstrate the prospect of these methods for folding proteins, a temperature quench simulation of a thermal unfolding intermediate of the Engrailed Homeodomain is described.
© 2004 Elsevier Inc. All rights reserved.
Keywords: Protein folding; Protein unfolding; Molecular dynamics; Force field; Water model
* Corresponding author. Fax: 1-206-685-7420.
E-mail address: [email protected] (V. Daggett).
1046-2023/$ - see front matter © 2004 Elsevier Inc. All rights reserved.
doi:10.1016/j.ymeth.2004.03.008
D.A.C. Beck, V. Daggett / Methods 34 (2004) 112–120

1. Introduction

Molecular dynamics (MD) is a theoretical physics technique for the examination of molecular systems at atomic detail. It has a sound basis in statistical mechanics and classical physics [1–3]. MD has been used in areas as diverse as materials sciences [4], atmospherics [5], and in the biosciences for systems with lipids [6], nucleic acids [7–9], and proteins [10–13].

Accurate simulation of biomolecules in solution (i.e., the condensed phase) requires as much detail as possible in the internal representation of the system under study. For this reason, ‘all atom’ MD, where all of the atoms (including hydrogens) are treated explicitly during the calculations, is the most realistic approach, and generally prevails over ‘united atom’ (e.g., methyl groups treated as a single unit) [14], implicit solvent (e.g., distance dependent dielectric or other approximations to account for the lack of water molecules) [15], and other methods with reduced complexity. Another approach is to study very small proteins because their small size decreases the computational requirements [16]. More recently, increases in computer speed and the proliferation of inexpensive multi-processor machines have given all-atom simulations of full proteins access to long simulation time scales [17].

All atom simulation techniques provide atomic resolution of equilibrium protein dynamic behavior and non-equilibrium events like protein folding and unfolding. When used in conjunction with experiment, simulations provide an enhanced view of the system under study. Recent work has examined aspects of protein folding and unfolding [18–23], and the mechanisms of chemical denaturants and co-solvents [18,24–26].

There are several well-known methods and implementations for molecular dynamics simulations of proteins and other biomolecules [14,27–30]. Here, we present our methods for all atom MD simulation of
proteins in solution based on the force field and protocols described by Levitt et al. [27,28]. The known implementations of these methods include the ENCAD program [27] and in lucem Molecular Mechanics (known as ilmm, our scalable parallel in-house program). The protocols presented are generic renditions of those used in a variety of protein studies from our laboratory.

2. Molecular dynamics simulation methods

Molecular dynamics is the time dependent integration of the classical equations of motion for molecular systems. The equations of motion, for all but the simplest systems, are of sufficient complexity that the integration must be done numerically over a large number of very small discrete timesteps rather than analytically in a continuous fashion. This treatment of time assumes that at any given discrete time step the atomic coordinates are fixed. This assumption holds if the magnitude of the time step is sufficiently small (e.g., approximately 2 fs, or less). At any given time step, these ‘fixed’ coordinates are used to calculate the potential energy and its first derivative, the force, using a molecular mechanics force field.

Generally, for any atom, evolution in time proceeds from step n to n + 1 as described in Eq. (1), where subscripts denote the time step, Δt is the magnitude of the integration time step, a is the acceleration, f is the force on the atom, m is the atomic mass, v is the velocity, and x refers to the atomic coordinates:

\[
\begin{aligned}
a_n &= \frac{f_n}{m},\\
v_{n+1} &= v_n + a_n\,\Delta t,\\
x_{n+1} &= x_n + v_n\,\Delta t + \frac{1}{2}\,a_n\,\Delta t^2.
\end{aligned}
\tag{1}
\]

A long series of these steps generates a trajectory through phase space, the 6N dimensional space (where N is the number of atoms) defined by the three space vectors of the atoms’ positions, and velocities. In general, post-simulation analysis is concerned with the atomic position (coordinates) subspace of phase space.

2.1. The microcanonical ensemble

The microcanonical (NVE) ensemble fixes the number of atoms, the volume of the periodic box, and the total energy (potential and kinetic) of the system. Energy conservation is naturally satisfied for NVE when the classical equations of motion are used [1]. There are several other advantages to performing simulations in the NVE ensemble: there is no need to couple the microscopic system to macroscopic thermodynamic properties such as pressure and temperature on a step-to-step basis. As a result, the implementation is computationally efficient. An efficient computational approach implies fewer numerical operations, reducing the drift (from round-off errors) in the conserved property (energy), thereby maximizing the integrity of the simulation. In addition, attempting to control the properties of macroscopic variables, such as temperature and pressure, for distinctly microscopic systems is fundamentally flawed, and difficult to achieve.

2.2. Numerical integration

Stepwise numerical integration of the equations of motion can be performed in a variety of ways [1,2,14,27–31]; we use the Beeman algorithm as modified by Brooks (Eq. (2) [27,31]). Energy conservation with the Brooks–Beeman algorithm is better than that of the commonly used Verlet method. A range of integration timesteps was tested for stability (i.e., conservation of energy). For all but the most extreme cases, a Δt of 2 fs was found to be appropriate [28]. Larger values of Δt disrupt the continuity of the simulation and conservation of energy, while smaller values do not make efficient use of the computational resources:

\[
\begin{aligned}
x_i(t+\Delta t) &= x_i(t) + v_i(t)\,\Delta t + \left[5a_i(t) - a_i(t-\Delta t)\right]\frac{\Delta t^2}{8},\\
v_i(t+\Delta t) &= v_i(t) + \left[3a_i(t+\Delta t) + 6a_i(t) - a_i(t-\Delta t)\right]\frac{\Delta t}{8}.
\end{aligned}
\tag{2}
\]

2.3. Molecular mechanics force field

An all atom molecular mechanics force field analytically describes the potential energy of a system in terms of the geometries of atomic centers. The energy calculation and dynamics (ENCAD) force field was originally described by Levitt [31] and was subsequently updated [27] and augmented to include the flexible three-center (F3C) water model [28]. As with other biomolecular force fields such as those in CHARMM [14,30] and AMBER [29], the potential energy parameters (e.g., ideal bond length, bond vibration energies) in the ENCAD force field are derived empirically from ab initio quantum mechanics, spectroscopy, and crystallography. Curious readers are directed to the history of the ENCAD force field and its genealogy [32] which includes a description of the original work from Lifson’s group; the ECEPP force field; and protocols from Scheraga’s lab [33–35]; the Kollman group force fields implemented within AMBER [36,37]; the GROMOS force field from van Gunsteren [38]; the hydrocarbon force fields (MM2-4) of Allinger et al. [39]; and a historical account of molecular dynamics and CHARMM by Karplus [40].
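As an aside on the integrator of Section 2.2, the update in Eq. (2) can be sketched for a single particle in one dimension. The Python below is a minimal illustration and not the ENCAD or ilmm implementation: the names `brooks_beeman_step` and `run_oscillator`, the harmonic test force, and the unit mass, force constant, and step size are all assumptions made for the example. Because the acceleration at t + Δt depends only on the new coordinates, it is evaluated after the position update and before the velocity update.

```python
import math


def brooks_beeman_step(x, v, a, a_prev, dt, accel):
    """Advance one step using the position and velocity updates of Eq. (2).

    accel is a hypothetical callable returning the acceleration f/m at x.
    Returns the new position, velocity, acceleration, and the acceleration
    that becomes a(t - dt) on the next call.
    """
    x_new = x + v * dt + (5.0 * a - a_prev) * dt * dt / 8.0
    a_new = accel(x_new)  # forces depend on coordinates only
    v_new = v + (3.0 * a_new + 6.0 * a - a_prev) * dt / 8.0
    return x_new, v_new, a_new, a


def total_energy(x, v, k=1.0, m=1.0):
    """Kinetic plus potential energy of a 1D harmonic oscillator."""
    return 0.5 * m * v * v + 0.5 * k * x * x


def run_oscillator(n_steps=10000, dt=0.01, k=1.0, m=1.0):
    """Integrate a harmonic oscillator and report the worst relative
    deviation of the total energy from its starting value."""
    accel = lambda pos: -k * pos / m
    omega = math.sqrt(k / m)
    x, v = 1.0, 0.0
    a = accel(x)
    # The scheme needs a(t - dt); here it is seeded from the exact solution
    # x(t) = cos(omega * t), which is only possible for this toy system.
    a_prev = accel(math.cos(-omega * dt))
    e0 = total_energy(x, v, k, m)
    max_dev = 0.0
    for _ in range(n_steps):
        x, v, a, a_prev = brooks_beeman_step(x, v, a, a_prev, dt, accel)
        max_dev = max(max_dev, abs(total_energy(x, v, k, m) - e0))
    return x, v, max_dev / e0
```

Calling `run_oscillator()` returns the final position, the final velocity, and the largest relative energy deviation seen along the trajectory, which gives a quick way to compare candidate step sizes on a toy system before committing to one.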
Eq. (3) contains the ENCAD potential function, V. It describes the potential energy as a function of internal coordinates which are calculated from the Cartesian coordinates. It enters Eq. (1) through the force f. V is expressed in two, three, and four body interaction terms. The first three terms represent the intramolecular interactions due to bond lengths, bond angles, and dihedral/torsion angles. The fourth term accounts for the ‘non-bonded’ energies attributed to the van der Waals and electrostatic interactions of atom pairs. Idealized plots of the constituent terms of the potential energy function are provided in Fig. 1.

\[
\begin{aligned}
f &= -\frac{\partial V}{\partial x},\\
V &= \sum_{i}^{\text{bonds}} K_{b,i}\,(b_i - b_{0,i})^2
   + \sum_{i}^{\text{bond angles}} K_{\theta,i}\,(\theta_i - \theta_{0,i})^2
   + \sum_{i}^{\text{torsion angles}} K_{\phi,i}\,\{1 - \cos[n_i(\phi_i - \phi_{0,i})]\}
   + V_{nb},\\
V_{nb} &= \sum_{\text{pairs } i,j} \left[\epsilon_{ij}\left(\frac{r_{0,ij}}{r_{ij}}\right)^{12} - 2\,\epsilon_{ij}\left(\frac{r_{0,ij}}{r_{ij}}\right)^{6}\right]
   + \sum_{\text{pairs } i,j} 332\,\frac{q_i q_j}{r_{ij}}.
\end{aligned}
\tag{3}
\]
The energy of a covalent bond is treated as a harmonic oscillator with an energy minimum at b_0 (Fig. 1) and force constant of K_b. Bond angles are treated similarly with ideal angle θ_0 and force constant K_θ. The third term, used for dihedral and out-of-plane torsion angles, is represented by a cosine with n periods with a minimum energy at φ_0, and barrier to rotation force constant of K_φ.
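The three bonded terms of Eq. (3) can be written as small functions of a single internal coordinate. The sketch below is illustrative only: the function names are hypothetical, angles are assumed to be in radians, and any numerical parameters passed in are placeholders rather than ENCAD values, which are tabulated in [27].

```python
import math


def bond_energy(b, b0, k_b):
    """Harmonic bond term K_b (b - b0)^2 from Eq. (3)."""
    return k_b * (b - b0) ** 2


def angle_energy(theta, theta0, k_theta):
    """Harmonic bond-angle term K_theta (theta - theta0)^2 from Eq. (3)."""
    return k_theta * (theta - theta0) ** 2


def torsion_energy(phi, phi0, k_phi, n):
    """Periodic torsion term K_phi {1 - cos[n (phi - phi0)]} from Eq. (3)."""
    return k_phi * (1.0 - math.cos(n * (phi - phi0)))
```

With this form the torsion energy is zero at φ_0 and repeats every 2π/n, reaching its maximum of 2K_φ halfway between minima, which matches the periodic shape described for panel (C) of Fig. 1.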
The van der Waals interaction energy of the atomic pair i and j is treated with a 12/6 Lennard-Jones function. When the pair distance r_ij is less than r_0, the geometric mean of r_i and r_j, the function is highly repulsive (Fig. 1). At r_ij greater than r_0, the interaction is attractive with a minimum value of ε_0, the geometric mean of ε_i and ε_j. The interatomic attraction decreases as the separation distance approaches infinity. Pairs of atoms within the same molecule separated by fewer than four bonds are not included in this term.

The electrostatic interaction energy of the atomic pair i and j, with partial charges q_i and q_j, respectively, separated by distance r_ij is expressed with a Coulomb style potential. In this model the energy of interaction is favorable when the signs of the partial charges are different and unfavorable when they are the same. As with the Lennard-Jones potential, the energy of interaction gradually decreases to zero as r_ij approaches infinity.
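The non-bonded part of Eq. (3) for a single atom pair can be sketched as follows. This is a hedged illustration rather than ENCAD code: the function names are hypothetical, the parameter values a caller supplies are placeholders, and the units are an assumption based on the usual convention for the factor 332 (distances in Å, charges in units of the electron charge, energies in kcal/mol).

```python
import math

# Conversion factor that appears in Eq. (3); with charges in electron units
# and distances in angstroms it gives energies in kcal/mol (assumed units).
COULOMB_FACTOR = 332.0


def pair_parameters(r0_i, r0_j, eps_i, eps_j):
    """Combine per-atom parameters by geometric mean, as described in the text."""
    return math.sqrt(r0_i * r0_j), math.sqrt(eps_i * eps_j)


def lennard_jones(r, r0, eps):
    """12/6 term eps (r0/r)^12 - 2 eps (r0/r)^6; equals -eps at r = r0."""
    s6 = (r0 / r) ** 6
    return eps * (s6 * s6 - 2.0 * s6)


def coulomb(r, q_i, q_j):
    """Coulomb style term 332 q_i q_j / r."""
    return COULOMB_FACTOR * q_i * q_j / r


def nonbonded_pair_energy(r, r0_i, r0_j, eps_i, eps_j, q_i, q_j):
    """Sum of the van der Waals and electrostatic energies for one pair."""
    r0, eps = pair_parameters(r0_i, r0_j, eps_i, eps_j)
    return lennard_jones(r, r0, eps) + coulomb(r, q_i, q_j)
```

Evaluating `lennard_jones` at separations just below and just above r_0 reproduces the steep repulsive wall and the shallow attractive tail sketched in panel (D) of Fig. 1, while `coulomb` changes sign with the product of the charges as in panel (E).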
The set of parameters for protein atoms including force constants, equilibrium values, r_i, ε_i, and partial charges q_i is available elsewhere [27]. The parameters for the flexible three-center (F3C) water model, an explicit solvent model designed for the ENCAD potential, are also available [28]. Additions for chemical denaturants [24–26] and other co-solvents and ions can be found in the references that describe their applications [18,19,25].

Fig. 1. Idealized plots of constituent terms of the ENCAD potential function, V. The potential energy of each term is the y-axis. (A) The harmonic term that describes the interaction energy of two bonded atoms as a function of the distance of their atomic centers with ideal distance b_0. (B) The harmonic term, similar in form to (A), but of lower energy, that describes the interaction of two atoms bonded to a third atom as a function of the angle between them with the ideal angle θ_0. (C) A typical periodic (n = 2) cosine term with a minimum at φ_0 used to describe both in- and out-of-plane (i.e., proper and improper) dihedral angle energies. Plots (A–C) share the same range for energy. (D) The van der Waals interaction energy of two atoms with ε and r_0 the geometric mean of their respective ε and r_0. (E) Three typical electrostatic interactions. The top line idealizes the interaction of charges with like signs while the bottom line idealizes the interaction of two charges with different signs. The sum of (D–E) constitutes the non-bonded interaction energy of two atomic centers.

2.4. Non-bonded interaction cutoff

To mimic the solution state of a system, the simulation volume is treated as an infinitely repeating cell or ‘periodic box.’ Conceptually, this is similar to an orthorhombic unit cell in crystallography. The result is an infinite solution with a protein concentration approaching (but usually below) those in vivo. In practice, it is neither necessary nor computationally possible to
consider all of the non-bonded interactions arising from an infinite solution, as in Eq. (3). In fact, the dielectric constant becomes large at fairly short distances: ε is about 50 at 10 Å separation of the charges and about 70 at 15 Å [41]. Therefore, it is common practice to use a non-bonded pair distance cutoff, r_c. Pairs separated by distances greater than r_c (e.g., 8, 10 Å, etc.) are not considered; that is, the energy of interaction beyond r_c is zero.

The NVE ensemble relies on the continuity of potential terms to conserve energy. Without further modification, such a scheme would be discontinuous at r_c. To maintain the integrity of the NVE ensemble and to preserve the energies and forces of interaction, the ENCAD force field uses a force-shifting cutoff, V_fs. This method smoothly and continuously shifts the energies and forces by subtracting from the original potential term, V_nb, its first order Taylor expansion about r_c, as in Eq. (4). The non-bonded potential terms and a more complete discussion can be found elsewhere [27]:

\[
V_{fs}(r) = V_{nb}(r) - \left[ V_{nb}(r_c) + (r - r_c)\,\frac{dV_{nb}(r_c)}{dr} \right].
\tag{4}
\]

The choice of cutoff is not arbitrary. Clearly a very short cutoff (4 Å) does not adequately model the electrostatic screening properties of systems. However, slightly longer cutoff ranges such as 8 and 10 Å have been shown in model peptide systems to behave very similarly to much longer cutoffs such as 12, 14, and 16 Å ([27,28] and [D.A.C. Beck, R.S. Armen, V. Daggett, 2003, manuscript in preparation]). In general, very long cutoffs (20 Å) do not improve the fidelity of the calculations and take significantly more computational time (as r_c increases, the number of pairs within the cutoff grows roughly as the cube of r_c). Additional problems with very long cutoffs arise when they exceed half the periodic box dimensions. In this case, an atomic pair could have multiple degenerate interactions, which if evaluated would overestimate the energy of interaction and lead to perturbations in the atomic interactions.

3. Molecular dynamics simulation protocols

inadequate preparation. The specific number of steps of minimization and dynamics may vary between applications, but the general protocol remains consistent.

For native state and unfolding simulations, an experimental structure (derived from crystallography or NMR experiments) is used. The crystal structures require that hydrogens be added. For refolding simulations, a starting structure can be taken from a thermal, or chemical unfolding MD trajectory. Pre- and post-transition state structures, folding intermediates, and structures from the denatured ensemble have all been used for this purpose [18–22]. The potential energy of the complete structure is minimized briefly (usually 200–1500 steps) with respect to the atomic coordinates, usually with a mix of steepest-descent and conjugate-gradient techniques [42]. The resulting ‘minimized structure’ is then ready to be solvated for simulation in solution.

The minimized structure is placed in an empty periodic box, the walls of which extend a specified distance (typically 8–12 Å) from the protein. This box is then filled with solvent. It is necessary to extend the box to or beyond r_c to eliminate any direct interactions between the protein’s first and second solvation shells. Such interactions might alter the process under study. At distances past these shells, the water has been shown to behave as bulk [43].

Water molecules are added from periodic boxes pre-equilibrated to the appropriate density for the desired simulation temperature. Waters from the pre-equilibrated box are not added to the system if they are within a specified ‘radius of exclusion’ (typically 1.67–2.10 Å) from the protein. The waters (only) are minimized to smooth the solvent network before a short (typically 1–5 ps) MD simulation of the water (only) is performed. The protein is fixed during this process to encourage water to populate relevant hydration sites on the protein surface without causing disruption to the protein structure. In the final steps of preparation, the protein (only) is minimized followed by a minimization of the entire system (water and protein).

3.2. Modifications for higher order systems
readily converges, the process is terminated, and the solution is ready for simulation.

3.3. Temperature

Studies performed with these methods are more concerned with behavior at a given temperature (e.g., 298 K) than at a given energy (e.g., −24,532 kcal/mol). However, in the NVE ensemble, the step-to-step kinetic energy (and thus the temperature) of the system may vary. At each step, the temperature T is calculated from the atomic velocities according to Eq. (5). In this expression, the sum is over all atoms, each with mass m_i, and instantaneous velocity of v_i. N is the number of atoms and K_b is the Boltzmann constant. Due to the step-to-step fluctuations, the mean of these instantaneous temperature samplings for a time interval (typically 100–500 steps) is a more appropriate measure of T.

\[
T = \frac{\sum_i m_i v_i^2}{3 N K_b}.
\tag{5}
\]

At the beginning of a simulation, small, equal, and opposite impulses are applied to randomly selected pairs of atoms. This process is continued until the Maxwellian velocity distribution for the system has a mean within a few Kelvin of the desired simulation temperature. The system must be brought to temperature slowly enough such that it is not shocked. Our current protocols heat the system by 0.05–0.1 K per step.

Simulation of protein native states frequently occurs at 298 K. For simulations of thermal unfolding, any temperature above the protein of interest’s melting temperature, T_m, can be used. Our past studies have shown, however, that thermal unfolding is an activated process obeying the rules of Arrhenius behavior [21–23,44]. That is, increasing temperature does not alter the pathway of unfolding, only the rate. As a result, it is possible to simulate unfolding at temperatures significantly above a protein’s T_m. The increased rate of unfolding allows short unfolding simulations to sample not only the transition state and early intermediates of unfolding but large regions of the denatured ensemble. With the Maxwellian temperature distribution, a 200 K increase in temperature corresponds to only a 30% increase in the mean atomic velocities.

In addition to the increased rate of unfolding, high temperature simulations benefit from reduced system density. As stated previously, the density of a system during preparation is set to the value obtained from experiment. At 498 K, the density of liquid water from experiment is 0.829 g/ml [45]. Contrast this with 0.997 g/ml for 298 K [45] and it is readily apparent that there will be significantly fewer non-bonded interaction partners at 498 K. The reduced number of partners translates into yet faster simulation run-times without disrupting the integrity of the study.

The energy drift arising with this method is primarily kinetic and due to numerical round-off. As a result, the mean system temperature over a large number of steps can be used to monitor energy conservation; when the mean temperature drifts, the velocities are rescaled. Using double precision (64 bit) operations, systems of modest complexity simulated at 298 K rescale once per 5.0 × 10^6 steps or every 10 ns.

4. Validation and results

As mentioned above, poorly prepared starting structures can introduce fictitious behavior into what is otherwise a correct biophysical simulation. Similarly, incorrect parameterization can cause improper dynamics. For these reasons, it is critical that rigorous comparisons with experiment be conducted to validate the simulation methodology. Here, we present a minimal set of experimental comparisons as means of validation, a brief synopsis of the most commonly used MD simulation analysis methods, and a glance at some current results.

4.1. Water

Explicit water models have a number of experimental observations against which they can be validated. The F3C model has been thoroughly tested and documented [28,43,46]. Here, we have chosen two of the most important bulk properties known from experiment: water self-diffusion and the radial distribution function, which reflect the dynamic behavior and structure of the solvent, respectively. These properties are well reproduced with the methods described above and a commonly used non-bonded cutoff of 8 Å. The simulation used for these comparisons had 502 F3C water molecules at the experimental density of 0.997 g/ml and was run for 11 ns. The first nanosecond was allocated to system equilibration.

The self-diffusion of water as a function of simulation time is presented in Fig. 2. The mean diffusion over the last nanosecond is 0.23 Å²/ps, in agreement with experiment (0.23–0.25 Å²/ps) [47]. The diffusion calculation converges with simulation time [2]. This convergence is common with much of MD analysis and reflects the need for averaging over long time scales to approximate sampling from ensembles. Another approach to long time scale sampling is to use numerous short simulations performed in parallel. Each simulation has a slight perturbation to its starting structure or a different random number seed during the heating stage. By the ergodic principle, the sampling of these multiple short simulations is equivalent to the sampling of a single long simulation [1].

Another commonly used property for validation of water models is the radial distribution function (RDF),
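The text does not state how the self-diffusion coefficient was computed. One standard route, assumed here purely for illustration, is the Einstein relation, in which D is the mean squared displacement (MSD) divided by 6 times the elapsed time. The sketch below uses hypothetical function names, expects coordinates in Å that have already been unwrapped across periodic boundaries, and uses a single lag rather than a fit over many lags.

```python
def mean_squared_displacement(trajectory, lag):
    """Average squared displacement over all atoms and all time origins.

    trajectory is a list of frames; each frame is a list of (x, y, z)
    tuples in angstroms, unwrapped across periodic boundaries (assumed).
    lag is the separation between compared frames, in frames.
    """
    n_frames = len(trajectory)
    if lag < 1 or lag >= n_frames:
        raise ValueError("lag must be between 1 and the number of frames - 1")
    total = 0.0
    count = 0
    for start in range(n_frames - lag):
        for (x0, y0, z0), (x1, y1, z1) in zip(trajectory[start],
                                              trajectory[start + lag]):
            total += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
            count += 1
    return total / count


def diffusion_coefficient(trajectory, frame_spacing_ps, lag):
    """Einstein-relation estimate D = MSD / (6 t), in square angstroms per ps."""
    msd = mean_squared_displacement(trajectory, lag)
    return msd / (6.0 * lag * frame_spacing_ps)
```

Averaging over every available time origin, as `mean_squared_displacement` does, is one simple way to obtain the long-time averaging that the convergence discussion above calls for; a production analysis would typically also fit the MSD over a range of lags.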
Fig. 4. Cα RMSD to crystal structure as a function of time for three MD simulations of the En-HD. En-HD is a three helix bundle (1enh [50]) with a fair amount of helical structure in its unfolding intermediate, and denatured ensemble. The native state and thermal unfolding simulations have been fully characterized and verified against experiment [21,22]. The 298 K native state simulation (light grey) is a reference against which to compare the 498 K thermal unfolding (dark grey) and 298 K folding/quench (black) simulations. The structures are colored according to the native state helices: helix I in red; helix II in green; and helix III in blue. In the native state simulation, the RMSD ranges from 2.0 to 3.5 Å. The fluctuations reflect the degree of mobility in the loops between helices. The transition state in the thermal unfolding simulation was seen at 0.26 ns. The 5 ns (10.47 Å Cα RMSD to crystal) structure from the unfolding simulation was used to seed the folding/temperature quench simulation. Within the first 5 ns, the folding system undergoes an initial collapse and refolds by the framework model to a final structure that is 3.58 Å from the crystal, and 2.88 Å from the final structure of the native state simulation. These simulations were performed with ENCAD and ilmm, and used an 8 Å cutoff range [27]. The Cα RMSD calculation excludes the highly mobile N- and C-termini.
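The Cα RMSD plotted in Fig. 4 is the root-mean-square distance between matched Cα atoms of two structures. The sketch below is a simplified illustration with a hypothetical function name. It assumes the two coordinate sets have already been optimally superimposed (the fitting step is omitted), and the number of terminal residues to exclude is left as a parameter because the caption does not say how many were dropped.

```python
import math


def ca_rmsd(coords_a, coords_b, n_exclude=0):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    n_exclude residues are dropped from each terminus before the calculation.
    The structures are assumed to be superimposed already.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same length")
    end = len(coords_a) - n_exclude
    core_a = coords_a[n_exclude:end]
    core_b = coords_b[n_exclude:end]
    if not core_a:
        raise ValueError("no residues left after excluding termini")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(core_a, core_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(core_a))
```

Applying the function to each trajectory frame against the crystal structure coordinates would give a time series of the kind shown in the figure, provided each frame is first fitted to the reference.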
intermediate state is native, although transient non-native helices seen in the unfolding simulation are consistent with NMR chemical shift deviations of the L16A mutant. The extrapolated temperature dependent rates of unfolding from temperature jump experiments are in good agreement with those from simulation at high temperature, especially when considering the ‘single molecule’ aspect of simulation [22]. At 373 K, for example, the half-life of folding was 2 ns from simulation, and about 5 ns by extrapolation of the experimental data.

A post-transition state starting structure from the thermal unfolding run was used for a temperature quench/refolding simulation. It is 10.47 Å Cα root-mean-square deviation (RMSD) from the crystal structure. This intermediate is non-native in that very few tertiary contacts are present, each helix lacks several turns, and the N-terminus contains a non-native helical segment. Protein refolding occurs very much as the reverse of denaturation: after quenching at 298 K, transient non-native helical segments are lost, and much of the native helical structure quickly returns (<5 ns). Subsequently, the I, II scaffold returns (see ‘helix docking’ in Fig. 4), and the swing arm of helix III begins to move toward the core (see ‘refolding final’ in Fig. 4). Although the final structure is similar to structures in the native state ensemble, the refolding simulation is on-going in order to capture the complete atomic detail of the end-stages in helix docking.

For experimental comparisons, there are several other important computational analyses that must be performed. For example, one must demonstrate that the potential function and simulation protocols reproduce the structure, and dynamic behavior of the native state under folding conditions. The starting structure, or crystal structure in this case, is a useful reference against
which simulation can be compared. The native state RMSD to the crystal structure in Fig. 4 ranges from 2 to 4 Å. These fluctuations reflect the degree of mobility in the loops between the helices. In the thermal unfolding simulations, the RMSD rapidly diverges from the range of values experienced by the native ensemble to a value of 18.6 Å at 60 ns. The refolding simulation starts from the 5 ns, 10.47 Å, unfolding intermediate. The final structure of the folding simulation after a 55 ns simulation at 298 K has an RMSD of 3.58 Å. The final structure of the native state is similar, 3.57 Å. These two final structures are 2.88 Å from each other. The similarity in RMSD to the crystal structure of the final native and ‘refolded’ structures, in conjunction with their relatively low RMSD to each other, is an indication that the protein in the quenched, refolding simulation has become very native-like.

The RMSD alone is not a sufficient description of protein structure. Other relevant analyses include the calculation of solvent accessible surface area (SASA) and the number and persistence of residue–residue contacts. The mean and standard deviations of the total SASA (by the NACCESS method [49]) for the final nanosecond of the native (4753 ± 127 Å²) and the refolding simulation (4803 ± 168 Å²) overlap. The statistical similarity of these values further suggests the refolding run is adopting a very native like conformation. The SASA for the final nanosecond of the 498 K unfolding simulation (6335 ± 204 Å²), however, is very different from the values for the native state and the folding run. Also of interest in studies of this type are the SASA breakdowns by residue, hydrophobicity, and side/main-chain (data not shown).

The total number of side-chain to side-chain contacts for the last nanosecond of these simulations was calculated. As with the SASA, the refolding simulation mean and SD (138.8 ± 2.9) is within the fluctuations of the native state (143.6 ± 2.3), a further indication of refolding. In contrast, the unfolding simulation (83.0 ± 3.4) has about 60% of the contacts populated in the native state simulation. The denatured state of En-HD contains considerable residual helical structure in both the simulations and as assessed by experiment [22]. The high degree of contacts in the denatured state reflects intra-helical contacts, not contacts for docking of the helices. More detailed analysis of precisely which native contacts are preserved and which are lost is typical for such studies (data not shown).

5. Concluding remarks

Molecular dynamics is a useful tool for enhancing the information obtained from experiment about protein native states, thermal and chemical unfolding events, and folding pathways. These methods permit reliable unfolding of proteins in agreement with experiments probing both folding and unfolding. The discovery of ultrafast folding proteins bridges the gap between MD and experiment and illustrates the synergy between the two approaches: theorists get validation from experiment and experimentalists get atomic level detail from theory.

Acknowledgments

This work was supported by the National Institutes of Health (GM 50789 to V.D.). D.B. is supported by an NIH Molecular Biophysics Training Grant (National Research Service Award 5 T32 GM 08268). Some of the simulations presented were computed on hardware donated by Intel. University of California, San Francisco, MIDASPLUS, was used to prepare Fig. 4.

References

[1] M.P. Allen, D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, 1987.
[2] J.M. Haile, Molecular Dynamics Simulation: Elementary Methods, Wiley, New York, 1992.
[3] J.A. McCammon, B.R. Gelin, M. Karplus, Nature 267 (1977) 585–590.
[4] K. Kremer, Macromol. Chem. Phys. 204 (2003) 257–264.
[5] P. Jungwirth, D. Tobias, J. Phys. Chem. B 106 (2002) 6361–6373.
[6] L. Saiz, S. Bandyopadhyay, M.L. Klein, Biosci. Rep. 22 (2002) 151–173.
[7] W. Wang, O. Donini, C.M. Reyes, P.A. Kollman, Annu. Rev. Biophys. Biomol. Struct. 30 (2001) 211–243.
[8] E. Giudice, R. Lavery, Acc. Chem. Res. 35 (2002) 350–357.
[9] J. Norberg, L. Nilsson, Acc. Chem. Res. 35 (2002) 465–472.
[10] M. Karplus, J.A. McCammon, Nat. Struct. Biol. 9 (2002) 646–652.
[11] T. Hansson, C. Oostenbrink, W. van Gunsteren, Curr. Opin. Struct. Biol. 12 (2002) 190–196.
[12] V. Daggett, Acc. Chem. Res. 35 (2002) 422–429.
[13] A. Warshel, Acc. Chem. Res. 35 (2002) 385–395.
[14] B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan, M. Karplus, J. Comput. Chem. 4 (1983) 187–217.
[15] D. Bashford, D.A. Case, Annu. Rev. Phys. Chem. 51 (2000) 129–152.
[16] Y. Duan, P.A. Kollman, Science 282 (1998) 740–744.
[17] V. Daggett, Curr. Opin. Struct. Biol. 10 (2000) 160–164.
[18] D.O.V. Alonso, V. Daggett, J. Mol. Biol. 247 (1995) 501–520.
[19] D.O.V. Alonso, V. Daggett, Protein Sci. 7 (1998) 860–874.
[20] D. De Jong, R. Riley, D.O.V. Alonso, V. Daggett, J. Mol. Biol. 319 (2002) 229–342.
[21] U. Mayor, C.M. Johnson, V. Daggett, A.R. Fersht, Proc. Natl. Acad. Sci. USA 97 (2000) 13518–13522.
[22] U. Mayor, N.R. Guydosh, C.M. Johnson, J.G. Grossmann, S. Sato, G.S. Jas, S.M. Freund, D.O.V. Alonso, V. Daggett, A.R. Fersht, Nature 421 (2003) 863–867.
[23] R. Day, B.J. Bennion, S. Ham, V. Daggett, J. Mol. Biol. 322 (2002) 189–203.
[24] K.E. Laidig, V. Daggett, J. Phys. Chem. 100 (1996) 5616–5619.
[25] Q. Zou, B.J. Bennion, V. Daggett, K.P. Murphy, J. Am. Chem. Soc. 124 (2002) 1192–1202.
[26] B.J. Bennion, V. Daggett, Proc. Natl. Acad. Sci. USA 100 (2003) 5142–5147.
[27] M. Levitt, M. Hirshberg, R. Sharon, V. Daggett, Comput. Phys. Commun. 91 (1995) 215–231.
[28] M. Levitt, M. Hirshberg, R. Sharon, K.E. Laidig, V. Daggett, J. Phys. Chem. B 101 (1997) 5051–5061.
[29] D.A. Pearlman, D.A. Case, J.W. Caldwell, W.S. Ross, T.E. Cheatham, S. Debolt, D. Ferguson, G. Seibel, P. Kollman, Comput. Phys. Commun. 91 (1995) 1–41.
[30] A.D. Mackerrell, B. Brooks, C.L. Brooks, L. Nilsson, B. Roux, Y. Won, M. Karplus, in: P. Schleyer (Ed.), The Encyclopedia of Computational Chemistry, vol. 1, John Wiley, Chichester, 1998, pp. 271–77.
[31] M. Levitt, J. Mol. Biol. 168 (1983) 595–620.
[32] M. Levitt, Nat. Struct. Biol. 8 (2001) 392–393.
[33] F.A. Momany, R.F. Mcguire, A.W. Burgess, H.A. Scheraga, J. Phys. Chem. 79 (1975) 2361–2381.
[34] G. Nemethy, M.S. Pottle, H.A. Scheraga, J. Phys. Chem. 87 (1983) 1883–1887.
[35] M.J. Sippl, G. Nemethy, H.A. Scheraga, J. Phys. Chem. 88 (1984) 6231–6233.
[36] S.J. Weiner, P.A. Kollman, D.T. Nguyen, D.A. Case, J. Comput. Chem. 7 (1986) 230–252.
[37] S.J. Weiner, P.A. Kollman, D.A. Case, U.C. Singh, C. Ghio, G. Alagona, S. Profeta, P. Weiner, J. Am. Chem. Soc. 106 (1984) 765–784.
[38] W. van Gunsteren, X. Daura, A. Mark, in: P.V.R. Schleyer (Ed.), Encyclopedia of Computational Chemistry, John Wiley, New York, Chichester, 1998.
[39] N.L. Allinger, K.S. Chen, J.H. Lii, J. Comput. Chem. 17 (1996) 642–668.
[40] M. Karplus, Biopolymers 68 (2003) 350–358.
[41] V. Daggett, P.A. Kollman, I.D. Kuntz, Biopolymers 31 (1991) 285–304.
[42] W.H. Press, Numerical Recipes in C: The Art of Scientific Computing, Cambridge University Press, Cambridge, New York, 1992.
[43] D.A. Beck, D.O. Alonso, V. Daggett, Biophys. Chem. 100 (2003) 221–237.
[44] N. Ferguson, J.R. Pires, F. Toepert, C.M. Johnson, Y.P. Pan, R. Volkmer-Engert, J. Schneider-Mergener, V. Daggett, H. Oschkinat, A. Fersht, Proc. Natl. Acad. Sci. USA 98 (2001) 13008–13013.
[45] G.S. Kell, J. Chem. Eng. Data 12 (1967) 66.
[46] K.E. Laidig, J.L. Gainer, V. Daggett, J. Am. Chem. Soc. 120 (1998) 9394–9395.
[47] K. Krynicki, C.D. Green, D.W. Sawyer, Discuss. Faraday Soc. 66 (1978) 199–208.
[48] A.K. Soper, Chem. Phys. 258 (2000) 121–137.
[49] S.J. Hubbard, J.M. Thornton, Department of Biochemistry and Molecular Biology, University College London, 1993.
[50] N.D. Clarke, C.R. Kissinger, J. Desjarlais, G.L. Gilliland, C.O. Pabo, Protein Sci. 3 (1994) 1779–1787.