Dill - Polymer Principles and Protein Folding - 1999 - Protein Science

Protein Science (1999), 8:1166-1180. Cambridge University Press. Printed in the USA.
Copyright 1999 The Protein Society
REVIEW
Polymer principles and protein folding
KEN A. DILL 1
University of California, San Francisco, 3333 California Street, Ste. 415, San Francisco, California 94118
(Received January 25, 1999; Accepted March 4, 1999)
Abstract
This paper surveys the emerging role of statistical mechanics and polymer theory in protein folding. In the polymer
perspective, the folding code is more a solvation code than a code of local φψ propensities. The polymer perspective
resolves two classic puzzles: (1) the Blind Watchmaker's Paradox that biological proteins could not have originated from
random sequences, and (2) Levinthal's Paradox that the folded state of a protein cannot be found by random search. Both
paradoxes are traditionally framed in terms of random unguided searches through vast spaces, and vastness is equated
with impossibility. But both processes are partly guided. The searches are more akin to balls rolling down funnels than
balls rolling aimlessly on flat surfaces. In both cases, the vastness of the search is largely irrelevant to the search time
and success. These ideas are captured by energy and fitness landscapes. Energy landscapes give a language for bridging
between microscopics and macroscopics, for relating folding kinetics to equilibrium fluctuations, and for developing
new and faster computational search strategies.
Keywords: new view; polymer; protein folding; statistical mechanics
This paper describes a perspective on protein folding that derives in part from simple statistical mechanical and polymer models. As with any perspective, this one is a personal opinion, with all the limitations that implies. The first part of this paper explores the folding code. (1) Structure: How is the native structure encoded in the amino acid sequence? (2) Thermodynamics: Why is folding so cooperative? (3) Kinetics: What determines the speed and the rate-limiting steps of folding? Polymer modeling suggests that the folding code is more a solvation code and less a linear encoding of torsion angles along the peptide bond, even though the latter is not negligible. The second part explores the energy landscape perspective on folding kinetics. Polymer modeling suggests that the folding process more closely resembles balls rolling down bumpy funnels than balls rolling aimlessly on flat surfaces or rolling single file along identical trajectories.

Reprint requests to: Ken A. Dill, University of California, San Francisco, 3333 California Street, Ste. 415, San Francisco, California 94118; e-mail: [email protected].
1 The author is grateful to Hans Neurath, the Protein Society, and Protein Science, for the opportunity to present this overview, which is largely taken from a talk given on the occasion of the Hans Neurath Award lecture, at the Protein Society meeting, July 27, 1998.

DISCUSSION

Side-chain interactions contribute to architecture, just as backbone interactions do

The backbone forces of folding

Table 1 compares two different perspectives on the folding code. A backbone-centric, helix-centric perspective arose over the 50 year time frame, from the 1930s to 1980s. It originated with Mirsky and Pauling in 1936 (Mirsky & Pauling, 1936), who proposed that backbone hydrogen bonding is a prominent folding force. During the next 15 years, Pauling's group used the structures of small molecule hydrogen-bonding compounds to predict that folded proteins would have α-helical and β-sheet structures (Pauling & Corey, 1951a, 1951b, 1951c, 1951d; Pauling et al., 1951). The first X-ray crystal structures of globular proteins gave strong support to this view by confirming the existence of the predicted α-helices and β-sheets (Kendrew et al., 1958). Hydrogen bonding was seen to be an important structure-causing force in proteins.

During the same period, a step was taken toward understanding folding cooperativity through an understanding of the helix-coil transition. For many years it had been known that protein folding is cooperative, i.e., that there is a dramatic transition from denatured to native states upon only small changes in solvent, pH, or temperature. In the 1950s and 1960s, theoretical work particularly of Schellman (1958), Zimm and Bragg (1959), Poland and Scheraga (1970), and experiments (Doty & Yang, 1956; Doty et al., 1956) showed that long peptide chains can undergo a helix-coil transition that is cooperative. The helix-coil transition is driven by hydrogen bonding and φψ propensities among near-neighbor groups along the chain. For many years, this has been the main model for conformational cooperativity in biomolecules.

To complete the picture of structure, thermodynamics, and kinetics, experiments beginning in the 1970s showed that helices can form rapidly (Kim & Baldwin, 1982; Williams et al., 1996). One inference was that folding is hierarchical and can be explained by a scheme 1° → 2° → 3°: the primary structure leads to secondary structure (fast), which is then assembled into tertiary structure (slower). Hierarchical assembly was seen as a solution to the problem of how the protein sorts through conformational space "haystack" quickly on its way to finding the native state "needle." The same hierarchy has been widely explored as a computational strategy for predicting native states from amino acid sequences: use local helix and sheet propensities to predict secondary structures, then assemble them into tertiary structures.

Table 1.

                              Backbone-centric view         Side-chain-centric view
Dominant force                φψ, hydrogen bonds            Hydrophobicity, hydrogen bonds
Thermodynamic cooperativity   Helix-coil transition         Collapse transition
Kinetics                      Helix formation is fast       Desolvation is fast
Role of hydrophobicity        Nonspecific                   Drives specific architecture
Folding code                  φψ-centric (1° → 2° → 3°)     Solvation code

The upshot was a perspective in which the backbone interactions, hydrogen bonding and φψ propensities, have been seen as a large part of the explanation of the structures, thermodynamics, and kinetics of protein folding (Honig & Cohen, 1996; Aurora et al., 1997; Baldwin & Rose, 1999a, 1999b). The φψ propensities are not equivalent to hydrogen bonding, since hydrogen bonds are involved in nonlocal interactions, whereas φψ interactions, by definition, are not. Nevertheless, from the perspective of the sequence-dependent interactions, and sequence-structure relationships, a backbone-centric view is largely a φψ-centric view, since there has been little basis for believing one amide-carbonyl backbone hydrogen bond has a substantially different strength than another in a sequence-dependent way. On the other hand, hydrophobic interactions, which were first identified as important for protein folding by Kauzmann (1959), were seen as a nonspecific glue that aided collapse but otherwise played little role in dictating the specific architectures of native proteins (Anfinsen & Scheraga, 1975). Hydrophobic interactions are mainly expressed by the side chains. They are through-space and solvent-mediated contact interactions, rather than through-nearest-neighbor-bonds, as are φψ interactions. This distinction between torsion-based nearest-neighbor through-chain interactions that involve φψ angles, and contact-based through-space interactions, that involve displacement or exchange of solvent, seems less ambiguous than distinctions between secondary vs. tertiary forces, or local vs. nonlocal forces. The φψ interactions are mainly steric torsional constraints captured in Ramachandran plots. Contact interactions, such as side-chain contacts, include hydrogen bonding and hydrophobic interactions and van der Waals interactions among non-neighboring monomers.

The φψ perspective does not address a key issue. As with most other polymers, a large conformational space is a consequence of weak preferences of each monomer unit for one region of torsion-angle space relative to another region. But if they are to account for the folding code, φψ propensities must be different in the native state than in the denatured state. In particular, φψ interactions must change when folding conditions are turned on. But there is little evidence that tri- and tetra-peptides adopt native-like conformations and overcome the chain entropy, when the solvent or temperature are changed.

The side-chain forces of folding

A different perspective has developed from polymer modeling over about the past 15 years. The polymer perspective is side-chain centric, rather than backbone centric. The idea is that folding is dictated not so much by the propensities for nearest neighbor amino acids to favor particular φψ values (α-helix or β-sheet propensities), even though there is abundant evidence for such preferences (Honig & Cohen, 1996; Aurora et al., 1997). Rather, in the side-chain-centric view the greater contribution to the free energy of folding is encoded in a more delocalized solvation code: there are very few conformations of the full chain that can bury nonpolar amino acids to the greatest possible degree (Dill, 1985; Dill et al., 1995). Even short peptides, such as amphipathic helices, can be driven by solvation. Hydrophobic interactions, however they are defined, are among the strongest interactions among amino acids in water. And in large proteins, there are many of them. In this view, hydrophobic interactions are not nonspecific glue, but a crucial structure-determining driving force. In this view, folding cooperativity more closely resembles a process of polymer collapse in a poor solvent than a helix-coil transformation. In this view, fast secondary structure formation is less a consequence of strong helix propensities, and more an indirect consequence of a drive toward nonpolar desolvation.

The true balance between side-chain and backbone forces is not yet known. The side-chain-centric view has been based on the following logic. Simplified models that include side-chain interactions, but have the φψ preferences turned off, predict many properties of globular proteins. In contrast, models that keep φψ propensities and turn off side-chain interactions predict only helices or strands and no compact folded state (Thomas & Dill, 1993). Indeed, helix-coil experiments show that φψ propensities control structures for sequences that are unable to collapse. For example highly charged poly-benzyl-L-glutamate is the classical helix-former.

It follows that a minimal model of globular protein behavior can be constructed from a side-chain-centric perspective but not from a backbone-centric perspective. This means that it may be possible to design polymers that could fold and perform protein-like functions, even without peptide backbones. RNA molecules already provide some proof of this principle. Minimal models are guides for such general principles.

But minimal models do not tell us the actual balance of forces in real proteins. If our goal is an accurate model of proteins, we undoubtedly cannot ignore backbone interactions (Honig & Cohen, 1996) or details of steric packing, or the different φψ interactions among the amino acids. In the end, since protein stability is a small difference of large interactions, all interactions can contribute to structure, thermodynamics, and kinetics.

What is the evidence for the side-chain-centric view? (1) A backbone centric view does not predict collapse. A coordination of φψ choices to cause collapse would be extraordinarily fortuitous. (2) Helix and strand propensities tend to be weak. Excepting polyalanine-based sequences (Scholtz & Baldwin, 1992), peptides that are found to be in helices or strands in globular proteins are unstable when isolated in solution. Moreover, most helices and strands are amphipathic (Eisenberg et al., 1984; Bowie et al., 1990; Branden & Tooze, 1999), implicating solvation forces. The α-helical and β-strand propensities are context dependent (Kabsch & Sander, 1984; Minor & Kim, 1994), and the nonlocal interactions in β-sheets are large (Smith & Regan, 1995) and numerous. (3) In a globular protein, the number of local interactions is proportional to the
number of amino acids N, but the number of nonlocal interactions is proportional to about 2N, so the latter should dominate in larger proteins. (4) Helices and strands often take their conformational instructions from their context or from the solvent (Kuroda et al., 1996; Predki et al., 1996). (5) To a first approximation, a fold is determined by the binary sequence of hydrophobic/polar monomers, even when φψ propensities are largely chosen randomly (Reidhaar-Olson & Sauer, 1988; Bowie et al., 1990; Lim & Sauer, 1991; Gassner et al., 1992; Lim et al., 1992; Kamtekar et al., 1993; Matthews, 1993; Munson et al., 1994, 1996; Lazar et al., 1997; Roy et al., 1997; Schafmeister et al., 1997; Wu & Kim, 1997). (6) Protein folds are less affected by mutations on their surfaces than in their hydrophobic cores (Lim & Sauer, 1991; Matthews, 1993). (7) Some experiments show that protein folding is not hierarchical, implying that secondary structures are not preassembled and used as building blocks in tertiary assembly. For example, a β-sheet protein can fold via a helical intermediate (Shiraki et al., 1995; Hamada et al., 1996). (8) Hydrophobic clustering, like secondary structure formation, can be very fast (Chan et al., 1997; Ramachandra Shastry & Roder, 1998; Ramachandra Shastry et al., 1998), and it can drive helix and sheet formation.

Simplified models are hypothesis generators

The predictions described above come, in part, from models that involve considerable simplification. An example is the HP model, in which each amino acid is represented as a bead, each bond is a straight line, bond angles are a few discrete options rather than a continuum, different conformations conform to lattices in two or three dimensions, and the 20 amino acids are condensed into a two-letter alphabet: H (hydrophobic) or P (polar) (Dill, 1985; Dill et al., 1995).

While statistical mechanical models are simplified in their representation of energies and atomic details, they are more refined in other respects (Camacho & Thirumalai, 1993; Bryngelson et al., 1995; Dill et al., 1995; Karplus, 1997; Onuchic et al., 1997): (1) their full conformational space can be explored extensively, sometimes without sampling or approximation, and (2) sometimes the full sequence space can be explored. For some questions, it is more important to get right the representation of conformational or sequence spaces than it is to get right the atomic details. Many questions of structure, stability, and kinetics are not about the locations of the hydrogen bonds in native lysozyme. They are not questions that are answerable by crystallography. They are about distributions and ensembles, flexibilities and entropies, energy landscapes, folding kinetics, big conformational changes, or sequence space. They are low-resolution questions that have low-resolution answers. Apart from X-ray crystallography and NMR, the workhorses of biomolecule science for many years have been low-resolution experiments: CD, fluorescence, small-angle scattering, some NMR experiments, calorimetry, chromatography, ANS binding, melting curves, etc.

For questions involving conformational ensembles, conformational entropies, sequence space, long time and large spatial scales, ensemble averaging, or non-native states like transition states, molten globules, intermediates or denatured states, there is currently little alternative to some degree of simplification in models. It is sometimes helpful not to have atomic details, picosecond by picosecond, because it is hard to see the forest of principles through the trees of detail. It would be a mistake to believe that any model is improvable by adding structural detail. Sometimes details are the problem, not the solution. This is a key message from the successes of the two-dimensional lattice Ising model in the revolution that took place in understanding critical phenomena (Stanley, 1971). The inability of earlier models of phase transitions to capture subtle critical behavior was attributable, not to the lack of realism and atomic detail, but to a lack of rigor in the mathematics of the models.

Mathematician Mark Kac once said that the purpose of models is to polarize our thinking, to help us formulate questions. A model manifests a point of view; it regards certain components of a problem as relevant, important, or dominant, and other components as irrelevant, unimportant, or negligible, and then devises a chain of logic leading to predictions from those premises. Most broadly, the point of a model is to make decisive and testable predictions, regardless of whether its fine structure looks realistic. A key advantage of simplified models is that their parameters are physical and minimal in number. The chain of logic from premises to conclusions is direct. Simplified models serve to generate hypotheses that often cannot be generated in any other way, but that can then be tested by experiments or refined simulations.

Simplified models have been useful for exploring entropies and combinatoric principles of conformational and sequence spaces. Two problematic paradoxes of protein science have been shown by polymer modeling to be neither problematic nor paradoxical. (1) The Blind Watchmaker Paradox: The probability that natural proteins could be found in a random search of sequence space was seen to be impossibly small. (2) The Levinthal Paradox: The probability that a protein could find its native state by random search was seen to be impossibly small. Both paradoxes have been framed in terms of random unguided processes that search for a single point, the endstate, in a vast space. Biological evolution searches through sequence space; the endstate is a single protein having a particular function. Protein folding searches through conformational space; the endstate is the single native structure of a given protein. In both cases, the vastness of the search (i.e., the size of the search space) is taken, according to the paradox, to be the key to the impossibility of reaching the endstate.

Creationists, evolutionists, and blind watchmakers: Can proteins arise from random sequences?

Fig. 1. Sequence space is large. There are 20^100 different 100-mer sequences. The probability of finding one particular sequence is 20^-100, but the probability of finding any sequence that folds to a particular structure is predicted to be more than 100 orders of magnitude larger.

Sequence space is a large place. For protein chains of 100 amino acids, the number of possible sequences of the 20 different amino acids is 20^100 = 10^130 (see Fig. 1). Creationists have used such numbers to argue the impossibility that proteins, and life, could
have arisen from the random sequences that were plausibly on the prebiotic earth. Creationists solve the large numbers problem by invoking divine intervention. Evolutionists solve the large numbers problem instead by the accretion of advantage that happens through natural selection (Dawkins, 1996). But both evolutionists and creationists start from the same premise, the large numbers problem. Evolutionists too assume that natural proteins are infinitesimal specks in an impossibly vast and meaningless sequence space, as indicated by the following quotes: "Only a very small fraction of this unimaginably large number of polypeptide chains would adopt a single stable three-dimensional conformation" (Alberts et al., 1998), and "It is certain that we need a hefty measure of cumulative selection in our explanations of life" (Dawkins, 1996). It is this numbers problem that I refer to, with the help of the wonderful metaphor of Richard Dawkins, as the Blind Watchmaker's paradox.

But statistical mechanical modeling (Lau & Dill, 1990; Chan & Dill, 1991; Lipman & Wilbur, 1991) shows that there is very little numbers problem in the first place. Reach into a soup of random amino acid sequences. The chance of pulling out a biologically important molecule depends on what is meant by biologically important. The following two questions are vastly different (see Fig. 1): (1) What is the probability of pulling from that soup a specific sequence? (2) What is the probability of pulling from that soup any sequence that folds to a specific structure? The answer to question (1) is 10^-130 for a 100-mer. The chance of pulling out a polypeptide having, say, the lysozyme sequence, is essentially zero. But to achieve biological function, we care only about finding a particular fold, not a particular sequence. And modeling shows that the probability of finding a structure is likely to be more than 100 orders of magnitude larger than the probability of finding a sequence (Lau & Dill, 1990; Chan & Dill, 1991). The chance of pulling out any sequence that folds to roughly lysozyme's structure is closer to 10^-10 to 10^-20. While this number too may seem impossibly small, nature works with these sorts of numbers all the time. These numbers would imply about one such sequence in a liter of random sequences at nanomolar concentrations! And the probability of finding any chain fold, not just lysozyme, is even higher.

Why does the numbers problem disappear when we seek structures rather than sequences? There is an enormous degeneracy in sequence space: many different sequences can fold to the same native structure. A protein can be mutated substantially without changing its fold. The explanation for this is simple. If, as noted above, a fold is primarily determined by the binary sequence of hydrophobic/polar monomers, then the essential features of the full 20^100 sequence space are found by searching a space of only about the 2^100 = 10^30 sequences that are written in a binary alphabet (H = hydrophobic, P = polar), a reduction of 100 orders of magnitude. Degeneracy means that hydrophobic monomers are largely interchangeable with each other, and polar monomers are interchangeable with each other, for determining a fold. (Function may have additional requirements, but estimates indicate that these do not change the numbers much (Lau & Dill, 1990).)

Moreover, modeling (Lau & Dill, 1990; Chan & Dill, 1991) and experiments (Reidhaar-Olson & Sauer, 1988; Matthews, 1993) show that the relevant space is even smaller, because only about N/3 of the residues are crucial for folding: those that define the hydrophobic core. To first approximation, most surface sites can be mutated without changing structure or function. (The factor of N/3 comes from the geometry of surface/volume ratios. Pack 100 amino acid spheres together to make a protein; 67 of them will be on the surface and 33 will be in the core.) Therefore, the real search for protein structure takes place, not in a space of 20^N, but in a space nearer in size to 2^(N/3) = 2^33 = 10^10 for N = 100. The other 120 orders of magnitude in sequence space are highly degenerate; the folded states of those sequences will look much like ones already found in the search of the smaller space.

Sequence space is therefore not likely to be vast darkness with infinitesimal specks of protein-like light spots. It is not perfectly light either. On a logarithmic scale, sequence space is predicted to be more like a beige sea in which virtually all molecules are nearly folded. A typical random chain of 100 amino acids is predicted to be highly compact in water, have considerable secondary structure, and be structured much like a molten globule (Lau & Dill, 1990; Chan & Dill, 1991). This is a far better starting point for natural selection than are specks in astronomically large sequence spaces.

But whatever the starting point, there remains the need for a process of improvement. Evolution can improve proteins by natural selection. Richard Dawkins has explained natural selection using the metaphor of a Blind Watchmaker. By invoking "Watchmaker," he means the endstate has the appearance of having been designed. The traditional inference is that an object that appears designed was built by a systematic step by step procedure, as in building a watch. But Dawkins' term "Blind" means that, on the contrary, the natural selection process is not so systematic and does not involve a specific pre-ordained sequence of events. Natural selection is a Blind Watchmaker that improves proteins through incremental steps, each of which involves some bias, however small, at the same time it also involves considerable random choice among alternatives.

The Blind Watchmaker is also a useful metaphor for protein folding kinetics, the systematic accretion of native structure over the time course of a folding experiment (Zwanzig et al., 1992). A native protein has the appearance of design. For example, the steric fit of side chains in a protein core is as precise as that of a jigsaw puzzle. But tight packing of irregular objects can also be achieved by shaking up nuts and bolts in a jar, with no design involved (Bromberg & Dill, 1994). The appearance of design does not mean that folding happens in serial step by step fashion. Even a jigsaw puzzle can be constructed through different parallel sequences of events (Harrison & Durbin, 1985). The native structure can be reached, over the time course of folding, by a process that (1) starts from different initial conformations and (2) proceeds by incremental improvements, each of which has some bias but also involves considerable random choice among alternatives. Even a very small bias (deviation from randomness) in choosing among alternatives can speed up the search time (compared to a random search) by tens to hundreds of orders of magnitude (see below). When perfect randomness is not the driver, the vastness of the search becomes irrelevant to the search time (Dill, 1993). These ideas are captured in landscapes: fitness landscapes in sequence space or energy landscapes in conformational space. We now focus on the latter.

The old and new views of folding kinetics: Different questions

Protein folding kinetics has been described in terms of so-called Old and New Views (Baldwin, 1994, 1995; Dill & Chan, 1997). To define these views, we first distinguish models, old and new, from
views of microscopic folding processes, old and new. For models, the terms old and new is too stark a contrast. It leads to the perception that we should be asking: Why do we need new models? What was wrong with the old ones? Which models are better? But these questions miss the mark. Old and new models do not address the same questions. The old models are mass action models used to fit experimental data on folding relaxation times and amplitudes. The new view is not a denial of these models. Mass action models remain valid for representing such data. While mass action modeling gives a macroscopic description of experimental data, the new statistical mechanical models give a microscopic framework to explain that data.

Where old and new views differ, however, is in their interpretation of the microscopic processes of folding. In the hope that changing terminology can help untangle some confusion, I will replace "old view" with "Sequential Micropath view" and "New view" with "Ensemble view."

Table 2 summarizes the differences between the two views of folding kinetics (Dill & Chan, 1997). The language of the Sequential Micropath view (pathways, transition states, reaction coordinates, on-path and off-path intermediates) is intended to explain what exponentials do (i.e., what you see in experiments). Experimental relaxation data are interpreted in terms of mass-action diagrams having arrows that connect symbols like D (denatured), I (intermediate), and N (native). Nothing in this language says what any one molecule is doing at any given time, or how the kinetics of folding is related to the monomer sequence, or how to assign microscopic chain conformations to labels I or D or transition state, etc. But the language of the Ensemble view (landscapes, folding funnels) is intended to describe what molecules do (i.e., how individual molecules progress toward the folded state), and how different monomer sequences lead to different kinetics.

Table 2.

                      Sequential micropath view            Ensemble view
Language              Paths, intermediates, transition     Landscapes, funnels
                      states, reaction coordinates
Explains              What exponentials do                 What molecules do
                      (what you see)                       (how it works)
Main problem          Search problem                       Trap problem
Proposed solution     Sequential pathways                  Funnels
Intermediates         Mileposts                            Traps
Two-state kinetics    No information                       Implies fast folding

Sequential micropath perspective

The main problem, according to the Sequential Micropath view, was the search problem, which has been called the Levinthal paradox. As Levinthal posed it (Levinthal, 1968; Wetlaufer, 1973), a random search of conformations would take a protein forever. Levinthal saw folding as a search through a vast conformational space, the haystack, for the native structure, the needle. Suppose the conformational space is represented by four preferred φψ angles for each peptide bond: α-helical, β-strand, and two others. In terms of those discrete options, the size of the space for a 100 residue chain is 4^100 ≈ 10^60 chain conformations. Only one of these is the native structure. Levinthal's proposed solution for finding the needle in the haystack was that all chains must follow the same microscopic pathway, like ants single file on a trail (Levinthal, 1968). By same pathway, he specified that every chain follows the same sequence of bond angle changes, in the same order, to reach the native state. In the Sequential Micropath view, kinetic intermediates (if they were on-pathway) were seen as helpful mileposts because they would show what route was taken, and therefore what routes were avoided, and therefore how the haystack was searched efficiently. Two-state kinetics was seen as uninformative about the mileposts of folding.

Ensemble perspective

In the Ensemble view, the vastness of the search is largely irrelevant. The more important problem is kinetic traps (Chan & Dill, 1994, 1998). Chains can sort very quickly through vast stretches of conformational space. In this view, chains fall energetically downhill, as when balls roll down bumpy funnels. Chains do not fold by random searches on level energy landscapes. In this view, two-state kinetics often means the chain is folding at nearly its maximum possible diffusion-limited speed, without kinetic traps. In this view, stable intermediates are mainly seen as kinetic traps that slow down the folding process.

Here's why the vastness of the search is irrelevant. Even a very small bias, in the form of the forces of protein folding, can be the difference between folding times measured in lifetimes of the universe vs. milliseconds (Bryngelson & Wolynes, 1987; Dill, 1987; Zwanzig et al., 1992). (On a large flat golf course, a golf ball will never find the hole by random processes, but if the golf course has even a small tilt that funnels toward the hole, no problem!)

What causes the funnel-like tilt on a folding landscape? The first estimates of the shapes of folding energy landscapes were based on mean-field theories (see Fig. 2) (Dill, 1985; Bryngelson & Wolynes, 1987). Hydrophobic collapse leads to compact chain conformations. The funnel arises because the drive to collapse is also a drive toward a reduced ensemble of conformations. There are many non-native states (high energy), but only one native state (low energy). The fraction of conformations that are compact is infinitesimal compared to the total conformational space. If there are 4^100 conformations of a chain, Flory-Huggins-like theories predict that only about (4/e)^100 ≈ 10^17 of those conformations are compact (Dill, 1985). More accurate recent estimates predict a number that is even smaller (Yue & Dill, 1995): the number of compact conformations having a hydrophobic core may even be as small as 1 for some sequences. This estimate is supported by experiments on reduced alphabets based on hydrophobicity codes; a small fraction of sequences appear to fold relatively uniquely (Riddle et al., 1997; Roy et al., 1997; Schafmeister et al., 1997).

In short, the Ensemble view is a reversal of the sequential micropath view. What was seen as the slow step (the search through the huge haystack of non-native chain conformations) is now seen as happening at near diffusion-limited speed. Collapse can be fast. Sifting through most of the haystack is fast; the slow part is the endgame of reconfiguring a very small set of near-native conformations. In the past few years, new fast experimental methods (Burton et al., 1997; Callendar et al., 1999) have shown that proteins can fold at nearly diffusion-limited rates, on submillisecond time scales (Huang & Oas, 1995; Ballew et al., 1996a, 1996b; Pascher et al., 1996; Burton et al., 1997; Chan C-K et al., 1997;
Gilmanshin et al., 1997a, 1997b, 1998; Ramachandra Shastry & Roder, 1998; Ramachandra Shastry et al., 1998). Energy landscapes provide the language that can help describe folding events at any level, from the microscopic to the macroscopic.

Fig. 2. Smooth funnel landscape (bottom). Denatured conformations follow different folding routes to the native state. The top figure shows the Flory-Huggins excluded volume estimate for the landscape shape (turned sideways): Ω ~ (N/ρ)!/[(N/ρ)^N (N/ρ − N)!], where Ω is the number of chain conformations, N is chain length, and 0 ≤ ρ ≤ 1 is the compactness of the chain, an approximate measure of the depth on the landscape (Dill, 1985).

Fig. 3. Simple mass-action schemes describe observed relaxation rates and amplitudes, using symbols such as N (native), D (denatured), and I (intermediate).

Energy landscapes connect single-chain microscopics
to experimental macroscopics

One long-term goal of protein folding experiments has been to help understand the microscopic basis for the folding code. But this remains a promise, not a reality. Why? Prior to energy landscapes, there has been no way to connect the macroscopics that experiments measure to the microscopics that are needed in computational folding algorithms. Here is the problem.

Figure 3 illustrates the kind of folding kinetics data that is traditionally measured, and the mass-action models that are used to interpret them. When a single exponential decay is observed in both folding and unfolding directions, it is described as two-state kinetics, because two mass action symbols, such as N (native) and D (denatured), and an arrow interconnecting them, provide the simplest scheme that can model the data. But when multiple exponentials are observed, at least one additional symbol must be invoked in a mass-action law. When there are three such symbols, say N, D, and I (intermediate), there are two main ways those symbols have been interconnected by arrows: I is called an On-pathway intermediate or I is an Off-pathway intermediate.

But even when experiments provide a perfectly accurate mass action model for the folding and unfolding kinetics of a particular protein, it does not give enough information to make a microscopic model of how folding takes place. Experimental data are too averaged to inform the local decisions that must be made in conformational searching. A microstate is a single chain conformation. A macrostate (such as the unfolded state U, an intermediate state I, a molten globule M, or a transition state T) is some collection of individual conformations. The native state N is often appropriately regarded as both a microstate and a macrostate. To construct a folding algorithm requires a computational recipe that will begin with a microstate (some particular chain conformation), then evaluate its energy, then choose which specific bonds to change and by how much, in order to take a computational step to make it a more native conformation.

But experimentally obtained mass action models give only recipes for dealing with macrostates, such as I (intermediate), D (denatured state), T (transition state), etc., and not for dealing with microstates. Here are macrorecipes for how to move a chain conformation toward the native structure. From an on-pathway intermediate state, move uphill along the reaction coordinate in the forward direction. From a transition state conformation, move downhill. From an off-pathway intermediate, go back to the denatured state and try again to go forward along the reaction coordinate.

But these macrorecipes do not answer the following questions. (1) What is the macrostate to which a particular chain conformation should be assigned? Or choose a macrostate: what microscopic conformations are in it? What is the ensemble called "intermediate state," or "denatured state," or "transition state"? Currently, such assignments must be made arbitrarily. Macrostates are averages over many microscopic conformations; they are not descriptions of single chain conformations. (2) What series of chain conformations defines the reaction coordinate? A reaction coordinate is a macrovariable, not a microvariable (see below). For protein folding, the reaction coordinate is not known in microscopic terms. (3) Even if we knew the reaction coordinate, how do we know which way is "forward"? Which specific bond angles should we change to progress toward the native state? Every protein folding algorithm must make these kinds of microdecisions at every step. But no experiment yet gives such microinformation. Energy landscapes can provide the common language to bridge between micro- and macrodescriptions.

What is an energy landscape?

According to the principles of thermodynamics, if a system has n degrees of freedom φ = [φ1, φ2, ..., φn], the stable state of the system can be found by determining the set of values φ* = [φ1*, φ2*, ..., φn*] that gives the minimum value of the free energy function F(φ) = F(φ1, φ2, ..., φn), when explored over all possible values of φ (see Fig. 4). Such functions F(φ) are called energy landscapes. Energy landscapes, per se, are neither new, nor controversial, nor limited to proteins. "Energy landscape" is nothing more than a name for this function. For protein folding, φ may be the backbone and sidechain bond angles, for example.

Distinguishing between microscopics and macroscopics

The distinction between the old and new views is the distinction between an energy landscape and a reaction coordinate diagram, which, in turn, is a distinction between microscopics and macroscopics (see Fig. 5). The Sequential Micropath view postulates a simple relationship between these two types of diagrams. In the Ensemble view, the relationship can be complex, but, in general, is not known. A microstate is a single point on an energy landscape and has free energy F_micro = F(φ), which is also called the internal free energy (Dill & Chan, 1997). A macrostate has free energy F_macro = F(ξ), where ξ is just a scalar quantity, such as a reaction coordinate or a progress variable. A given value of ξ represents some particular ensemble of microscopic conformations.

Figure 6 illustrates the difference between F_micro and F_macro, in a simple model. Suppose we choose as a progress variable the number of hydrophobic contacts, ξ = 0, 1, 2, ..., m, to reflect the extent of folding. This is just one of many possible progress variables; it is just chosen here for illustration because it simplifies the math. The density of states g(ξ) is a count of the number of different microstates φ that define a particular macrostate ξ. Figure 6 shows one of the g(0) ≈ 500,000 conformations that have ξ = 0 hydrophobic contacts, one of the g(4) = 67 conformations that have ξ = 4 hydrophobic contacts, and the g(6) = 1 conformation that has ξ = 6 hydrophobic contacts; this is the native structure in this model.

To determine F_micro, focus on a particular conformation. For that conformation, sum all the energies due to bond angles, torsions, stretches, van der Waals interactions, hydrogen bonds, electrostatics, and include the solvation free energy due to the relative amounts of buried and exposed hydrophobic and polar surface. F_micro is a free energy, rather than just an energy, because it includes solvation and desolvation entropies and the hydrophobic effect. F_micro is not the total free energy, however, because it does not include the chain conformational entropy: it treats only a single conformation. In the HP model, in which hydrophobicity dominates, a given chain conformation has ξ hydrophobic contacts, so F_micro = ξE, where E < 0 is the free energy of desolvating two nonpolar groups and bringing them into contact.
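As a minimal sketch of the HP-model relation between a conformation's free energy and its number of hydrophobic contacts, the Python below counts the hydrophobic contacts of one chain conformation on a two-dimensional square lattice and multiplies by a contact free energy. The square lattice, the four-monomer example sequence and its coordinates, the value E = −1 (arbitrary units), and the function names are all illustrative assumptions, not taken from the paper.

```python
def count_hh_contacts(sequence, coords):
    """Count H-H contacts: pairs of H monomers on neighboring lattice
    sites that are not bonded neighbors along the chain."""
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    contacts = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):  # skip the bonded pair (i, i + 1)
            if sequence[i] == "H" and sequence[j] == "H":
                dx = abs(coords[i][0] - coords[j][0])
                dy = abs(coords[i][1] - coords[j][1])
                if dx + dy == 1:  # nearest neighbors on the lattice
                    contacts += 1
    return contacts


def f_micro(sequence, coords, contact_energy=-1.0):
    """Internal free energy of ONE conformation: (number of contacts) * E."""
    return count_hh_contacts(sequence, coords) * contact_energy


sequence = "HPPH"                                # hypothetical 4-mer
folded = [(0, 0), (1, 0), (1, 1), (0, 1)]        # ends are lattice neighbors
extended = [(0, 0), (1, 0), (2, 0), (3, 0)]      # no contacts

print(count_hh_contacts(sequence, folded), f_micro(sequence, folded))  # 1 -1.0
print(count_hh_contacts(sequence, extended))                           # 0
```

Note that the function evaluates a single conformation only; nothing in it counts how many conformations share the same number of contacts, which is why this quantity carries no chain conformational entropy.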
The relationship between energy landscapes and reaction diagrams is a relationship between F_macro and F_micro. F_macro does include the chain conformational entropy,

$$F_{\mathrm{macro}}(\xi) = -kT \ln \Omega = -kT \ln\left[ g(\xi)\, e^{-F_{\mathrm{micro}}/kT} \right] = F_{\mathrm{micro}}(\xi) - kT \ln g(\xi), \qquad (1)$$

where Ω is the partition function, F_micro(ξ) is the internal free energy for each conformation that has ξ hydrophobic contacts, and g(ξ) is the number of conformations having ξ hydrophobic contacts. Other progress variables can be more complex, but this simple model is sufficient for present purposes. If we express the conformational entropy of the macrostate ξ as S_conformational(ξ) = k ln g(ξ), then Equation 1 becomes

$$F_{\mathrm{macro}}(\xi) = F_{\mathrm{micro}}(\xi) - T S_{\mathrm{conformational}}(\xi). \qquad (2)$$
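Equation 1 can be evaluated directly for the three rungs of the energy ladder quoted from Figure 6. The sketch below is illustrative only: it works in units of kT, uses the HP-model form in which the internal free energy is the number of contacts times E, and assumes a contact free energy E = −2.5 kT, a value chosen for the example rather than taken from the paper. The function name is likewise hypothetical.

```python
import math

# Density of states from the Figure 6 discussion; g(0) is approximate in the text.
DENSITY_OF_STATES = {0: 500_000, 4: 67, 6: 1}


def f_macro(xi, contact_energy_kt):
    """Equation 1 in units of kT: F_macro = F_micro - ln g(xi)."""
    f_micro = xi * contact_energy_kt
    return f_micro - math.log(DENSITY_OF_STATES[xi])


E = -2.5  # assumed contact free energy, in kT
for xi in sorted(DENSITY_OF_STATES):
    print(f"xi={xi}  F_micro={xi * E:7.2f} kT  F_macro={f_macro(xi, E):7.2f} kT")
```

With these assumed numbers, the native rung lies 15 kT below the fully unfolded rung in internal free energy but only about 1.9 kT below it once the conformational entropy term is included, which shows how strongly the density of states reshapes the reaction diagram relative to the landscape.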
The main point is that F_micro is the free energy of a single chain conformation whereas F_macro is the free energy of some ensemble of conformations that collectively have some macroscopic meaning, such as an intermediate, transition state, molten globule, or the denatured state. F_macro includes a conformational entropy (k ln g(ξ)), due to the number of microscopic conformations in the particular macrostate. F_macro(ξ) is a function of a single variable ξ, and therefore it corresponds to just an ordinary two-dimensional plot, of the folding free energy vs. reaction coordinate ξ. This is the traditional reaction coordinate diagram (see Fig. 5B). In contrast, F_micro(φ) is the energy landscape; it is a function of many degrees of freedom. Landscapes are usually plotted in three dimensions, as a simplification, since it is impossible to draw high-dimensional surfaces. Traditional terms such as intermediate state, pathway, transition state, and free energy barrier refer to F_macro(ξ). In contrast, computer simulations usually explore F_micro(φ).

Fig. 4. Energy landscapes are free energies, F_micro(φ1, φ2, ...), as a function of the degrees of freedom, φ1, φ2, ..., such as backbone and side-chain bond angles.

Fig. 5. (A) Energy landscape vs. (B) reaction diagram. A landscape is a free energy F_micro of each individual chain conformation vs. the many microscopic degrees of freedom. A reaction diagram is a free energy F_macro of an ensemble of molecules, and includes the chain conformational entropy. Here F_macro is a function of a single variable, ξ, such as a reaction coordinate. The reaction coordinate is usually not known for protein folding. The red arrow on the landscape indicates a possible micropath, an individual folding trajectory. In this case, the micropath never involves an uphill step, and yet the reaction diagram has a free energy barrier. The barrier is due to the slow entropic search of many different chains seeking the entry to the central steep funnel.
What are folding pathways? Micropaths and
microbarriers vs. macropaths and macrobarriers

The distinction between micro and macro also applies to folding kinetics. A micropath is one trajectory that one protein follows as it folds. At time t, the degrees of freedom have the value φ(t). That is, φ(0) = [φ1(0), φ2(0), ..., φn(0)] at time t = 0, then φ(t1) = [φ1(t1), φ2(t1), ..., φn(t1)] at time t = t1, etc. φ(t) describes the path a fly might take in a multidimensional space. Most computer simulations have explored one or a few micropaths, although a few modeling efforts have been able to explore more complete ensemble averages (Chan & Dill, 1994, 1998). Because proteins are subject to Brownian motion, a micropath involves much motion that would seem pointless to an observer. For example, a chain can pass back and forth through a given configuration many times. In contrast, a macropath describes some progress variable ξ(t), which involves different ensembles at different times during the folding or unfolding process. Experiments have given information only about macropaths, whereas simulations usually give only information about micropaths.
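The difference between a micropath and a macropath can be made concrete with a toy simulation. The sketch below is not a protein model: it assumes a two-dimensional integer lattice as the conformational space, a cone-shaped free energy (in units of kT) with the "native" point at the origin, Metropolis moves as a stand-in for Brownian motion, and a progress variable equal to the lattice distance from the native point (so it decreases as folding proceeds). All names and parameter values are hypothetical.

```python
import math
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def f_micro(x, y):
    """Assumed toy funnel: free energy (kT) rises with distance from the origin."""
    return abs(x) + abs(y)


def micropath(start, n_steps, rng):
    """One Metropolis trajectory; returns the progress variable at each time."""
    x, y = start
    trace = [f_micro(x, y)]
    for _ in range(n_steps):
        dx, dy = rng.choice(MOVES)
        delta = f_micro(x + dx, y + dy) - f_micro(x, y)
        # Metropolis rule: always accept downhill, accept uphill with exp(-delta).
        if delta <= 0 or rng.random() < math.exp(-delta):
            x, y = x + dx, y + dy
        trace.append(f_micro(x, y))
    return trace


def simulate(n_chains=200, n_steps=400, seed=0):
    """Run many chains from different 'denatured' starts, all at distance 20."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_chains):
        k = rng.randint(0, 20)
        start = (rng.choice([-1, 1]) * (20 - k), rng.choice([-1, 1]) * k)
        paths.append(micropath(start, n_steps, rng))
    # Macropath: the ensemble average of the progress variable at each time.
    macropath = [sum(p[t] for p in paths) / n_chains for t in range(n_steps + 1)]
    return paths, macropath


paths, macropath = simulate()
print("macropath at t = 0, 100, 400:", macropath[0], macropath[100], macropath[400])
uphill = sum(1 for p in paths for a, b in zip(p, p[1:]) if b > a)
print("uphill steps summed over all micropaths:", uphill)
```

In this toy, individual trajectories start from different points, take many uphill steps, and revisit positions, while the ensemble average relaxes toward the bottom of the funnel. An experiment-like observable corresponds to the averaged curve, not to any single trajectory.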
The key to the Sequential Micropath view is an implicit assumption of equivalence between macropaths and micropaths. The premise of the Sequential Micropath view is that there is a simple and direct relationship between φ(t) and ξ(t), just as there is in traditional chemical kinetics (see Fig. 7). If an energy landscape has an energy well corresponding to reactants A, another energy well corresponding to products B, and a lowest-energy superhighway,
which defines the route that most molecules take from A to B, then a one-dimensional reaction pathway ξ can be obtained by painting a stripe along the centerline of the superhighway through the multidimensional φ space from A to B. In some cases, particularly near the end of the folding process, this way of defining reaction pathway may be useful and adequate. But this direct relationship between φ and ξ is valid only when molecules, like ants along a single file trail, all follow essentially the same route. That is, if one molecule folds by first forming a helix at the N-terminal end, then forming a contact between residues 1 and 27, then undoing the helix, then forming a contact between monomers 3 and 18, etc., then the equivalence of micropaths and macropaths would mean that all the other molecules will undergo exactly the same sequence of events too.

Fig. 6. The density of states g(ξ) is a count of the number of chain conformations, in this case having ξ hydrophobic contacts. On the energy ladder, more hydrophobic contacts corresponds to lower energy.

Fig. 7. Classical energy landscape for chemical reactions. Reactants, products, and intermediate states are low-energy depressions. The reaction pathway is a lowest energy highway from reactants to products. Transition states are peaks along the pathway. For simple chemical reactions, most molecules follow essentially the same reaction path.

But while micropaths in chemical reaction kinetics overwhelmingly overlap with each other, the micropaths in protein folding can be very different. Chemical bonding involves energies much greater than kT, whereas each interaction in a folding process is not much larger than kT, so thermal motions can cause much larger variations in folding than in chemical reactions. One molecule may form its N-terminal helix first, while another molecule in solution, bombarded differently by Brownian motion, may form its C-terminal contacts first. In the end, both molecules will fold, but each follows a different micropath. Simulations usually show some degree of preference among microroutes, particularly in late stages of folding, but there cannot be perfect registry in the early stages of the micropaths because the starting points (the denatured conformations) are so different.

Figure 8A illustrates that for traditional chemical reactions (Fig. 7), there is a direct correspondence between the macrolevels (reaction coordinate diagram) and microlevels (energy landscape). Peaks and valleys along the reaction profile represent microscopic milestones along the energy landscape. But Figure 8B illustrates that sometimes there may be no such correspondence for some folding processes. For folding, a given experimental observation (as manifested in the reaction coordinate profile) can arise from many different landscape shapes. A landscape uniquely specifies a reaction diagram, but a reaction diagram does not uniquely specify a landscape. Figure 8B illustrates that folding becomes increasingly pathway-like at late stages, because the molecules become localized near the native state in conformational space. When chemical reactions have a single exponential time dependence, it implies an identifiable energy barrier. But for a single exponential, or any other particular time dependence in folding processes, no direct inference about microscopic bottlenecks is possible, as shown below.

Sometimes micropaths can be very different from macropaths. Figures 5 and 9 show two landscape features in which micropaths do not coincide with macropaths. (1) A downhill micropath contributes to an uphill macropath. A downhill micropath means that the chain does not break favorable contacts, say hydrophobic contacts. But this can involve a barrier on a reaction diagram because the microscopic meandering on flat plains on an energy diagram can be slow (Fig. 5). This will be manifested as a conformational entropy barrier (uphill) on the reaction diagram but only as a slight downhill slope on the energy landscape. (2) A downhill macropath can include some uphill micropaths. An uphill micropath can arise when one chain breaks favorable contacts, while most other chains find lower energy routes that avoid breaking contacts (Fig. 9). It is because we do not yet know the relationships between micropaths and macropaths that we cannot use experimental data and mass-action models (macropaths), to help us forge folding algorithms, which require knowledge of microscopic details.

Energy landscapes are funnels:
The bottom is smaller than the top

While the shapes of folding energy landscapes are not yet known in detail, it is uncontestable that they are funnel-like, in the sense of the term that we use here (Dill & Chan, 1997). Here, funnel means that many conformations have high energy and few have low energy. More specifically, conformations having high F_micro (denatured states) have high conformational entropy and states having low F_micro (native state and other deep minima) have low conformational entropy. (By some definitions, funnel also carries the connotation of smooth landscapes, so it also has implications about dynamics and time dependence, namely that barriers are small so the process happens quickly. Here, the term funnel carries no such implication about kinetics or barrier heights or smoothness or any landscape shape feature other than: there are many conformations of high free energy (F_micro) and few conformations of low free energy.)

Energy landscapes also have funnel-like shapes for processes of ligand binding to biomolecules: there are few tightly bound conformations, and many unbound or weakly bound conformations (Frauenfelder et al., 1991; Miller & Dill, 1997; Tsai et al., 1999).

The chicken-egg problem: Collapse first or
secondary structure first?

Which comes first in the folding process, collapse or secondary structure? Just as the answer to where chickens come from is more complex than "chicken" or "egg," so also folding is undoubtedly
Fig. 8. A: For chemical reactions (energies >> kT), the macrostates on reaction coordinate diagrams correspond to the time series of microstates on the energy landscape. B: For folding processes (energies per interaction ≈ kT), the observed macrostates may not uniquely specify the time series of microstates on the energy landscape.
more complex than "collapse first" or "secondary structure first." Collapse, secondary structure, and hierarchical assembly (Baldwin & Rose, 1999a, 1999b) are macroterms, like reaction coordinate, since each describes an ensemble property. Hierarchic folding has been recently defined as a process in which folding begins with structures that are local in sequence and marginal in stability; these local structures interact to produce intermediates of ever-increasing complexity and grow, ultimately, into the native conformation (Baldwin & Rose, 1999a, 1999b). It is proposed that hierarchical folding involves multiple folding routes, rather than a unique sequential pathway. By these criteria, there is little to distinguish hierarchical folding from the Ensemble view. Growing stability corresponds to a downhill flow on a landscape, and the early preference for local contacts is similar to that found in energy-based microscopic models, such as the following. The diffusion-collision model is based on assuming that φψ preferences are established early, then secondary structures assemble into tertiary structures (Karplus & Weaver, 1976; Lee et al., 1987; Yapa et al., 1992). A zippers model also proposes that local contacts form earlier, on average, than the nonlocal contacts. But the zippers model supposes that structure development is driven by solvation forces (Dill et al., 1993; Fiebig & Dill, 1993).

Fig. 9. An uphill micropath (red line) is surrounded by more favorable routes that do not involve uphill steps to reach the native state.
Alternatives to the sequential micropath
and ensemble views?

There have been efforts to marry old and new views (Pande et al., 1998; Pande & Rokhsar, 1999). Those efforts aim to reconcile how there can be preferred folding routes at the same time that individual chains follow different micropaths. But no marriage is needed. Preferred routes and states are part and parcel of the Ensemble perspective. In my opinion, the Ensemble perspective is not one model, one result, or one energy landscape shape. It is not a denial of patterns, pathways, uniqueness, or structure. It is just a perspective based on recognizing the general funnel-like nature of the energy function F_micro(φ) with bumps and wiggles and shapes that have yet to be determined. The funnel perspective is universally captured in many different models, monomer sequences, potential functions, move sets, and definitions of transition states. While particular results can depend on model details, the funnel concept is a broad brush picture of how a large ensemble leads to a small ensemble, how an unstructured population changes through time to become a single structure, and how the degrees of freedom diminish from being many and uncoupled and unsynchronized to being few and coupled and synchronized. This process is bound to involve preferences.

Indeed, at the end of the folding process, it would be remarkable (and maybe impossible) to have large diversity in conformations or trajectories. Most of the simulations that have led to the Ensemble perspective have found preferred folding routes in the late stages of folding (Miller et al., 1992; Lazaridis & Karplus, 1997). What is new in the new view, and what was the essence of Levinthal's problem, was what happens in the early stages of folding, not the late stages. Levinthal's concern was how to search the huge space of denatured conformations. The Ensemble view merely asserts that molecules cannot be synchronized at the beginning of folding because different chains have such different unfolded conformations. Although the denatured state is a single macrostate, it is a very heterogeneous collection of microstates.

It will surely remain a matter of opinion for any given simulation whether what is interesting is the pathway or the variance from it. In either case, energy landscapes provide the basis for calculating any property of interest.

Why do we need energy landscapes?

Once energy landscapes are better understood, particularly for more realistic models of proteins, they should be able to serve several purposes. First, they should provide a consistent and rigorous language for interrelating macroscopics to microscopics. Simulators' micropaths, when properly averaged, can teach us about experimentalists' macropaths, and experimentalists' macropaths can test simulators' models. Second, landscapes provide a link between thermodynamics and kinetics, described below. And third, landscapes may provide the bridge so that folding kinetics can be brought to bear on speeding up conformational search strategies, also described below.

Relating thermodynamics and kinetics:
A fluctuation-dissipation relationship for proteins?

A most remarkable result in statistical mechanics is the fluctuation-dissipation theorem (Chandler, 1987). This theorem relates a kinetic property of systems (the rate of approach to equilibrium), to an equilibrium property (the nature of the equilibrium fluctuations). Energy landscapes provide the framework for relating the thermodynamics and kinetics of protein folding. Figure 10 shows two landscapes: one is a smooth funnel, the other is a rugged funnel. For the smooth funnel, folding kinetics should be fast and two state. For the rugged landscape, folding kinetics will be slower and more complex.

The shape of the landscape also describes the fluctuations at equilibrium. Fluctuations are interesting for two reasons. First, these are the motions that are important for protein function, such as when an enzyme enters a transition state for catalyzing a reaction. Second, fluctuations can be measured by NMR or thermal factors in X-ray crystallography. The fluctuations are those conformations having energies only one or two kT higher than the native conformation and are therefore transiently populated due to occasional Brownian bombardments. If a protein has a smooth landscape, the motions of the protein are mostly small wiggles, never deviating much from the native structure because to do so would require a high energy. But for a bumpy landscape, very non-native-like conformations can occasionally be populated under native conditions because the energies of such conformations are not much higher than those of the native molecule (Miller & Dill, 1995; Tang & Dill, 1998). During those fluctuations, protons or ligands could exchange in or out, or the protein could have other transiently different properties than the native molecule. If we knew the shapes of energy landscapes, we could better understand the relationship between folding kinetics and equilibrium fluctuations around the native state.

Fig. 10. Comparing fluctuations on smooth vs. rugged landscapes. The state of lowest free energy is native (N), indicated as the lower tick mark on the y-axis. Normal fluctuations increase the energy, as indicated by the higher tick mark. Thermal fluctuations lead to only small conformational deviations from the native structure on smooth landscapes, but can lead to larger deviations on rough landscapes. The native lattice conformation has six hydrophobic contacts, whereas a conformation having only one unit higher energy (five hydrophobic contacts), has a completely different conformation. Rugged landscapes mean that small excursions in energy (from native) can lead to large excursions in structure.

Landscape-ology can help in developing
new conformational search strategies

Knowing the shapes of energy landscapes should also help to create faster computer conformational search methods. In the Sequential Micropath view, on-pathway intermediates are held in special regard because of how they might illuminate the folding
path. But, as noted above, computer folding algorithms have not

been able to use such macro-information. The landscape view is
more egalitarian. Every conformation, no matter how distant from
the native state, can give some useful information about the native
state, as described below. New and faster conformational search
strategies are emerging that are based on what rudimentary knowl-
edge is currently available of the shapes of energy landscapes.
Current search methods, such as Monte Carlo ~MC!, Simulated
Annealing ~SA!, and Molecular Dynamics ~MD!, explore energy
surfaces and are slow because they get caught in kinetic traps. We
call these local search methods; they do not make use of global
information about the shape of the underlying energy landscape. In
a local search method, some very small change in a conformation
is considered. Such changes are highly localized on the energy
landscape. An energy is evaluated, and some decision is made
whether to take that step uphill or downhill. The move is accepted
or rejected, usually either based on Metropolis criteria or Newtons Fig. 11. Convex global underestimator ~CGU! conformational search strat-
laws. Such strategies are very slow because they are unguided by egy. Traditional methods, such as Monte Carlo ~MC!, molecular dynamics
~MD!, and simulated annealing ~SA!, search over the tops of energy land-
global information, they involve much randomness, and they usu- scapes and can get caught in kinetic traps. The CGU searches underneath
ally terminate in kinetic traps. the landscape instead by using a few sampled local minima ~indicated by
Here is an analogy. One way to find the lowest point on the dots! to generate a series of underestimating parabolic surfaces to locate
Himalayan mountains is to always walk downhill until you can go the global minimum ~Dill et al., 1997!.
down no further. Then go uphill until you can go down again. This
is the Monte Carlo and SA approach. Such random walking is
much slower and more haphazard than if you used a contour map
to guide your journey. For example, if a protein folding algorithm J.B. Rosen, & K.A. Dill, unpubl. comm.!. The only knowledge the
creates a structure that does not have a hydrophobic core, it should CGU currently uses is just that landscapes are funnel-like. As we
not keep changing one bond angle at a time as many current learn more about the shapes of protein energy landscapes, it should
methods do; it should stop wasting its time and move to a very different part of conformation space.

New methods are developing for speeding up conformational searching, based on emerging knowledge
of the shapes of energy landscapes (Maranas et al., 1995; Phillips et al., 1995). For example, the
idea behind the Convex Global Underestimator (CGU) method (Phillips et al., 1995; Dill et al., 1997;
K.W. Foreman, A.J. Phillips, J.B. Rosen, & K.A. Dill, unpubl. comm.) is to sample a few conformations
chosen randomly from the conformational space, find the nearest local energy minimum for each one,
then construct a multi-dimensional parabolic underestimator surface U(φ) underneath all the minima
that are known so far (see Fig. 11). U(φ) serves as a predictor for where the global minimum might be
found, if the energy landscape is funnel-like. Subsequent underestimator surfaces are constructed
iteratively for increasingly narrowed regions around the native state. In this way, every chain
conformation that is sampled, no matter how non-native, contributes some information about the
landscape shape, and contributes to an estimate of where the native state will be found. In contrast,
local search methods make no such use of collective information about all other conformations that
have been sampled before a given step.

The CGU and other underestimator methods look promising, on the following bases. (1) Starting from
different initial points on the landscape, the CGU usually reaches the same final point, indicating
that it finds global minima and does not get stuck in kinetic traps. (2) An advantage of the CGU over
MC and SA is that no problem-dependent adjustment is required, as when devising temperature schedules
or proper move sets. (3) Tests so far in a simple protein folding model and on van der Waals clusters
up to 21 atoms show that the CGU reaches much lower on energy landscapes in a given time than MC or SA
(see Fig. 12), and the advantage increases with chain length (K.W. Foreman, A.J. Phillips,
be possible to create faster conformational search strategies.

Fig. 12. Relative search depth (energy) of simulated annealing compared to the CGU, for different
lengths of model proteins, up to 36 amino acids (K.W. Foreman et al., in prep.). For short chains, SA
reaches the same depth on energy landscapes as the CGU, but for longer chains, SA gets stuck at
increasingly higher altitudes on the energy landscape, where the relative depth is indicated by the
cartoon on the right.

Historical parallels with polymer science?

For the past 40 years, a defining paradigm of protein science has been Structural Biology. Structural
Biology has provided a framework for deciding what questions are important and how to answer them.
Two key imperatives of Structural Biology are (1) high resolution, the importance of atomic detail
and (2) unique architectures, the importance of specific geometric interrelationships among atoms.
Protein structures have atomic resolution, and every
atom has its place. It has been considered important to know THE native structure, THE transition
state, or THE intermediate structure. Of course, it is clear that proteins wiggle and move; they are
not perfectly static (Karplus, 1997). But even so, such fluctuations are often regarded as a sort of
footnote to the main message, much like error bars in experimental data. According to this logic,
folding pathways are less like a perfect train track, where no lateral variation is allowed, and more
like a highway, where some small degree of weaving and lateral meandering can take place.

But in the polymer view, statistics can play a fundamentally different and deeper role. It is more
like replacing a train track, not with a highway, but with a ski bowl. Driving on a highway from
point A to point B can be described by average velocities, positions, and altitudes along the
reaction coordinate, the highway. But tracking an ensemble of skiers is quite a different business
than tracking the flow of cars on a highway. Skiers can take different routes. The average position
of skiers on a mountainside is a much more heterogeneous property, with less apparent meaning. What
is THE structure, or even THE averaged structure, at any given time, is not yet clear, or necessarily
always meaningful.

The importance of one particular structure and the neglect of statistics has a parallel in the
history of polymer science. The breakthrough that founded polymer science was the macromolecular
hypothesis, the idea that there were long chains covalently linked together (Flory, 1953; Morawetz,
1985). The huge resistance to this idea prior to the 1920s was due to a faith in the importance of
specific structures and a reluctance to fully appreciate the statistics. According to Flory (1953):

"Organic chemists were motivated by the desire to devise concise formulas and to isolate pure
substances, the term pure . . . invariably implying a formula of convenient size. Hence the quest for
the cellulose molecule or the rubber molecule continued. . . . By the turn of the century this
objective had crystallized to a discipline which dominated synthetic organic chemistry. To be
eligible for acceptance in the chemical kingdom, a newly created substance . . . had to be separated
in such a state that it could be characterized by a molecular formula. The investigator was obliged
to adduce elementary analyses to confirm the composition, and to supplement these with molecular
weight determinations for the purpose of showing that the substance was neither more nor less complex
than the formula proposed. Otherwise the fruits of his labors would not be elevated to an honored
place in the immortal pages of the chemical compendiums. The successes of synthetic organic chemistry
in creating the hundreds of thousands of different combinations and permutations of atoms must not be
discounted. In magnitude of creative achievement, they are scarcely surpassed in any other field of
science. While this discipline was strikingly successful, it also tended to narrow the outlook of
contemporary researchers. They came to believe that every definable substance could be classified in
terms of a single definite molecule capable of being represented by a concise formula."

With the macromolecular hypothesis came the recognition that N polymer molecules in solution, even
when they are called by the same name, such as polyethylene, are not identical to each other. Each
molecule in solution can have a different conformation and even, for synthetic polymers, a different
chain length. Hence different experiments see different ensemble averages and give different
perspectives on the same molecule, polyethylene. Within only about a decade after the macromolecular
hypothesis was accepted, quantitative statistical mechanical models began to successfully explain
rubber elasticity, the viscosities and viscoelasticities of chain molecule liquids, the dependence of
the physical properties of polymeric materials on molecular weight distributions, and the unusual
thermodynamics of polymer solutions. Such statistical ideas now provide the foundation of modern
polymer theory. For many properties of proteins, too, it seems clear that statistics is not just a
caveat about small details but is at the very heart of the problems that proteins must solve.

Conclusions

Statistical mechanical models can give useful insights about proteins. While all-atom models
sacrifice conformational sampling to gain atomic detail, statistical mechanical models do the
reverse. Because simple models explore non-native states so effectively, have few parameters, and
cost little computer time, they have been useful for exploring folding forces and principles. They
have led to the perspective that the folding code is primarily a solvation code, rather than a local
propensities code. Statistical mechanical models are well suited to addressing combinatoric problems,
such as the Levinthal and Blind Watchmaker paradoxes. The conclusion is that we should beware of
needle-in-a-haystack arguments, because nature does not seem to work that way. Each step is not
unguided. Conformational and sequence spaces are more like landscapes. Landscapes are funnel-like,
wide at the top and narrow at the bottom, sometimes with hills and valleys. All conformations, not
just on-pathway intermediates for example, can give some guidance toward the global minimum. New
computational search methods are drawing on this information to make better folding and docking
algorithms. The energy landscape perspective may help connect the currently disjoint areas of
kinetics experiments and conformational search strategies.

Acknowledgments

I thank Sarina Bromberg and Jack Schonbrun for very helpful comments. I also thank Hue Sun Chan for
his insightful comments on this manuscript, for his many contributions over the past decade that are
summarized in this work, and for teaching me a great deal about proteins. I appreciate support from
the Lawrence Berkeley Labs and from NIH grant GM34993.

References

Alberts B, Bray D, Johnson A, Raff M, Roberts K, Walter P. 1998. Essential cell biology: An introduction to the molecular biology of the cell. New York: Garland.
Anfinsen CB, Scheraga HA. 1975. Experimental and theoretical aspects of protein folding. Adv Prot Chem 29:205-300.
Aurora R, Creamer TP, Srinivasan R, Rose GD. 1997. Local interactions in protein folding: Lessons from the α-helix. J Biol Chem 272:1413-1416.
Baldwin RL. 1994. Protein folding: Matching speed and stability. Nature 369:183-184.
Baldwin RL. 1995. The nature of protein folding pathways: The classic versus the new view. J Biomol NMR 5:103-109.
Baldwin RL, Rose GD. 1999a. Is protein folding hierarchic? I. Local structure and peptide folding. Trends Biochem Sci 24:26-33.
Baldwin RL, Rose GD. 1999b. Is protein folding hierarchic? II. Folding intermediates and transition states. Trends Biochem Sci 24:77-83.
Ballew RM, Sabelko J, Gruebele M. 1996a. Direct observation of fast protein folding: The initial collapse of apomyoglobin. Proc Natl Acad Sci USA 93:5759-5764.
Ballew RM, Sabelko J, Gruebele M. 1996b. Observation of distinct nanosecond and microsecond protein folding events. Nature Struct Biol 3:923-926.
Bowie J, Reidhaar-Olson J, Lim WA, Sauer RT. 1990. Deciphering the messages in protein sequences: Tolerance to amino acid substitutions. Science 247:1306-1310.
Branden C, Tooze J. 1999. Introduction to protein structure, 2nd ed. New York: Garland.
Bromberg S, Dill KA. 1994. Side chain entropy and packing in proteins. Protein Sci 3:997-1009.
Bryngelson J, Onuchic J, Socci ND, Wolynes PG. 1995. Funnels, pathways, and the energy landscape of protein folding: A synthesis. Proteins 21:167-195.
Bryngelson J, Wolynes PG. 1987. Spin-glass and the statistical mechanics of protein folding. Proc Natl Acad Sci USA 84:7524-7528.
Burton RE, Huang GS, Daugherty MA, Calderone TL, Oas TG. 1997. The energy landscape of a fast-folding protein mapped by Ala→Gly substitutions. Nature Struct Biol 4:305-310.
Callendar RH, Gilmanshin R, Dyer RB, Woodruff WH. 1999. Annu Rev Phys Chem. Forthcoming.
Camacho CJ, Thirumalai D. 1993. Kinetics and thermodynamics of folding in model proteins. Proc Natl Acad Sci USA 90:6369-6372.
Chan C-K, Hu Y, Takahashi S, Rousseau DL, Eaton WA, Hofrichter J. 1997. Submillisecond protein folding kinetics studied by ultrarapid mixing. Proc Natl Acad Sci USA 94:1779-1784.
Chan HS, Dill KA. 1991. Sequence space soup of protein and copolymers. J Chem Phys 95:3775-3787.
Chan HS, Dill KA. 1994. Transition states and folding dynamics of proteins and heteropolymers. J Chem Phys 100:9238-9257.
Chan HS, Dill KA. 1998. Protein folding in the landscape perspective: Chevron plots and non-Arrhenius kinetics. Proteins 30:2-33.
Chandler D. 1987. Introduction to modern statistical mechanics. New York: Oxford Press.
Dawkins R. 1996. The blind watchmaker: Why the evidence of evolution reveals a universe without design. New York: Norton.
Dill KA. 1985. Theory for the folding and stability of globular proteins. Biochemistry 24:1501-1509.
Dill KA. 1987. The stabilities of globular proteins. In: Oxender DL, Fox CF, eds. Protein Engineering. Alan R. Liss Inc. pp 187-192.
Dill KA. 1993. Folding proteins: Finding a needle in a haystack. Curr Op Struct Biol 3:99-103.
Dill KA, Bromberg S, Yue K, Fiebig KM, Yee DP, Thomas PD, Chan HS. 1995. Principles of protein folding: A perspective from simple exact models. Protein Sci 4:561-602.
Dill KA, Chan HS. 1997. From Levinthal to pathways to funnels: The new view of protein folding kinetics. Nature Struct Biol 4:10-19.
Dill KA, Fiebig KM, Chan HS. 1993. Cooperativity in protein-folding kinetics. Proc Natl Acad Sci USA 90:1942-1946.
Dill KA, Phillips AT, Rosen JB. 1997. Protein structure and energy landscape dependence on sequence using a continuous energy function. J Comp Biol 4:227-239.
Doty P, Bradbury JH, Holtzer AM. 1956. The molecular weight, configuration and association of poly-γ-benzyl-L-glutamate in various solvents. J Am Chem Soc 78:947-954.
Doty P, Yang JT. 1956. Polypeptides VII. Poly-gamma-benzyl-L-glutamate: The helix-coil transition in solution. J Am Chem Soc 78:498.
Eisenberg D, Weiss RM, Terwilliger C. 1984. The hydrophobic moment detects periodicity in protein hydrophobicity. Proc Natl Acad Sci USA 81:140-144.
Fiebig KM, Dill KA. 1993. Protein core assembly processes. J Chem Phys 98:3475-3487.
Flory PJ. 1953. Principles of polymer chemistry. Ithaca, New York: Cornell University Press. pp 8-19.
Frauenfelder H, Sligar SG, Wolynes PG. 1991. The energy landscapes and motions of proteins. Science 254:1598-1603.
Gassner NC, Baase WA, Matthews BW. 1992. A test of the jigsaw puzzle model for protein folding by multiple methionine substitutions within the core of T4 lysozyme. Proc Natl Acad Sci USA 93:12155-12158.
Gilmanshin R, Callendar RH, Dyer RB. 1998. Fast events in protein folding: The time evolution of primary processes. Nature Struct Biol 5:363-365.
Gilmanshin R, Williams S, Callendar RH, Woodruff WH, Dyer RB. 1997a. Fast events in protein folding: Relaxation dynamics of secondary and tertiary structure in native apomyoglobin. Proc Natl Acad Sci USA 94:3709-3713.
Gilmanshin R, Williams S, Callendar RH, Woodruff WH, Dyer RB. 1997b. Fast events in protein folding: Relaxation dynamics and structure of the I form of apomyoglobin. Biochemistry 36:15006-15012.
Hamada D, Segawa S, Goto Y. 1996. Non-native alpha-helical intermediate in the refolding of beta-lactoglobulin: A predominantly beta-sheet protein. Nature Struct Biol 3:868-873.
Harrison SC, Durbin R. 1985. Is there a single pathway for the folding of a polypeptide chain? Proc Natl Acad Sci USA 82:4028-4030.
Honig B, Cohen FE. 1996. Adding backbone to protein folding: Why proteins are polypeptides. Folding Design 1:R17-R20.
Huang GS, Oas TG. 1995. Submillisecond folding of monomeric λ repressor. Proc Natl Acad Sci USA 92:6878-6882.
Kabsch W, Sander C. 1984. On the use of sequence homologies to predict protein structure: Identical pentapeptides can have completely different conformations. Proc Natl Acad Sci USA 81:1075-1078.
Kamtekar S, Schiffer JM, Xiong H, Babik JM, Hecht MH. 1993. Protein design by binary patterning of polar and nonpolar amino acids. Science 262:1680-1685.
Karplus M. 1997. The Levinthal paradox: Yesterday and today. Folding Design 2:S69-S75.
Karplus M, Weaver DL. 1976. Protein-folding dynamics. Nature 260:404-406.
Kauzmann W. 1959. Some factors in the interpretation of protein denaturation. Adv Prot Chem 14:1-63.
Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC. 1958. A three-dimensional model of the myoglobin molecule obtained by X-ray analysis. Nature 181:662-666.
Kim PS, Baldwin RL. 1982. Specific intermediates in the folding reactions of small proteins and the mechanism of protein folding. Annu Rev Biochem 51:459-489.
Kuroda Y, Hamada D, Tanaka T, Goto Y. 1996. High helicity of peptide fragments corresponding to beta-strand regions of beta-lactoglobulin observed by 2D-NMR spectroscopy. Folding Design 4:255-263.
Lau KF, Dill KA. 1990. Theory for protein mutability and biogenesis. Proc Natl Acad Sci USA 87:638-642.
Lazar GA, Desjarlais JR, Handel TM. 1997. De novo design of the hydrophobic core of ubiquitin. Protein Sci 6:1167-1178.
Lazaridis T, Karplus M. 1997. New view of protein folding reconciled with the old through multiple unfolding simulations. Science 278:1928-1931.
Lee S, Bashford D, Karplus M, Weaver DL. 1987. Brownian dynamics simulation of protein folding: A study of the diffusion-collision model. Biopolymers 26:481-509.
Levinthal C. 1968. Are there pathways for protein folding? J Chim Phys 65:44-45.
Lim WA, Farrugio DC, Sauer RT. 1992. Structural and energetic consequences of disruptive mutations in a protein core. Biochemistry 31:4324.
Lim WA, Sauer RT. 1991. The role of internal packing interactions in determining the structure and stability of a protein. J Mol Biol 219:359-376.
Lipman DJ, Wilbur WJ. 1991. Modeling neutral and selective evolution of protein folding. Proc Roy Soc B245:7-11.
Maranas DC, Androulakis IP, Floudas CA. 1995. A deterministic global optimization approach for the protein folding problem. DIMACS series in Discrete Mathematics and Theoretical Computer Science 23:133-150.
Matthews BW. 1993. Structural and genetic analysis of protein stability. Annu Rev Biochem 62:139-160.
Miller DW, Dill KA. 1995. A statistical mechanical model for hydrogen exchange in globular proteins. Protein Sci 4:1860-1873.
Miller DW, Dill KA. 1997. Ligand binding to proteins: The binding landscape model. Protein Sci 6:2166-2179.
Miller R, Danko CA, Fasolka MJ, Balacz AC, Chan HS, Dill KA. 1992. Folding kinetics of proteins and copolymers. J Chem Phys 96:768-780.
Minor DL, Kim PS. 1994. Context is a major determinant of beta-sheet propensity. Nature 371:264-267.
Mirsky AE, Pauling L. 1936. On the structure of native, denatured, and coagulated proteins. Proc Natl Acad Sci USA 22:439-447.
Morawetz H. 1985. Polymers: The origins and growth of a science. New York: Wiley.
Munson M, Balasubramanian S, Fleming KG, Nagi AD, et al. 1996. What makes a protein a protein: Hydrophobic core designs that specify stability and structural properties. Protein Sci 5:1584-1593.
Munson M, O'Brien R, Sturtevant JM, Regan L. 1994. Redesigning the hydrophobic core of a four-helix-bundle-protein. Protein Sci 3:2015-2022.
Onuchic JN, Luthy-Schulten Z, Wolynes PG. 1997. Theory of protein folding: The energy landscape perspective. Annu Rev Phys Chem 48:545-600.
Pande VS, Grosberg AY, Tanaka T, Rokhsar DS. 1998. Pathways for protein folding: Is a new view needed? Curr Op Struc Biol 8:68-79.
Pande VS, Rokhsar DS. 1999. Folding pathway of a lattice model for proteins. Proc Natl Acad Sci USA 96:1273-1278.
Pascher T, Chesick JP, Winkler JR, Gray HB. 1996. Protein folding triggered by electron transfer. Science 271:1558-1560.
Pauling L, Corey RB. 1951a. Atomic coordinates and structure factors for two helical configurations of polypeptide chains. Proc Natl Acad Sci USA 37:235-240.
Pauling L, Corey RB. 1951b. The pleated sheet, a new layer configuration of polypeptide chains. Proc Natl Acad Sci USA 37:251-256.
Pauling L, Corey RB. 1951c. The structure of fibrous proteins of the collagen-gelatin group. Proc Natl Acad Sci USA 37:272-281.
Pauling L, Corey RB. 1951d. Configurations of polypeptide chains with favored orientations around single bonds: Two new pleated sheets. Proc Natl Acad Sci USA 37:729-740.
Pauling L, Corey RB, Branson HR. 1951. The structure of proteins: Two hydrogen-bonded helical configurations of the polypeptide chain. Proc Natl Acad Sci USA 37:205-211.
Phillips AT, Rosen JB, Walke VH. 1995. Molecular structure determination by convex global underestimation. DIMACS series in Discrete Mathematics and Theoretical Computer Science 23:181.
Poland DC, Scheraga HA. 1970. Theory of the helix-coil transition. New York: Academic Press.
Predki PF, Agrawal V, Brunger AT, Regan L. 1996. Amino-acid substitutions in a surface turn modulate protein stability. Nature Struct Biol 3:54-58.
Ramachandra Shastry MC, Roder H. 1998. Evidence for barrier-limited protein folding kinetics on the microsecond time scale. Nature Struct Biol 5:385-392.
Ramachandra Shastry MC, Saunder JM, Roder H. 1998. Kinetic and structural analysis of submillisecond folding events in cytochrome c. Acc Chem Res 31:717-725.
Reidhaar-Olson JF, Sauer RT. 1988. Combinatorial cassette mutagenesis as a probe of the informational content of protein sequences. Science 241:53-57.
Riddle DS, Santiago JV, Bray-Hall ST, Doshi N, Grantcharova VP, Yi Q, Baker D. 1997. Functional rapidly folding proteins from simplified amino acid sequences. Nature Struct Biol 4:805-809.
Roy S, Ratnaswamy G, Boice JA, Fairman D, McLendon G, Hecht MH. 1997. A protein designed by binary patterning of polar and nonpolar amino acids displays native-like properties. J Am Chem Soc 116:5302-5306.
Schafmeister CE, LaPorte SL, Miercke LJW, Stroud RM. 1997. A designed four helix bundle protein with native-like structure. Nature Struct Biol 4:1039-1046.
Schellman JA. 1958. The factors affecting the stability of hydrogen-bonded polypeptide structures in solution. J Chem Phys 62:1485.
Scholtz JM, Baldwin RL. 1992. The mechanism of α-helix formation by peptides. Annu Rev Biophys Biomol Struct 21:95-118.
Shiraki K, Nishikawa K, Goto Y. 1995. Trifluoroethanol-induced stabilization of the α-helical structure of β-lactoglobulin: Implication for non-hierarchical protein folding. J Mol Biol 245:180-194.
Smith CK, Regan L. 1995. Guidelines for protein design: The energetics of beta-sheet side chain interactions. Science 270:980-982.
Stanley HE. 1971. Introduction to phase transitions and critical phenomena. Oxford: Oxford Press.
Tang KES, Dill KA. 1998. Native protein fluctuations: The conformational-motion temperature and the inverse correlation of protein flexibility with protein stability. J Biomol Struct Dyn 16:397-411.
Thomas PD, Dill KA. 1993. Local and nonlocal interactions in globular proteins and mechanisms of alcohol denaturation. Protein Sci 2:2050-2065.
Tsai C-J, Kumar S, Ma B, Nussinov R. 1999. Folding funnels, binding funnels, and protein function. Protein Sci 8:1179-1188.
Wetlaufer DB. 1973. Nucleation, rapid folding, and globular intrachain regions in proteins. Proc Natl Acad Sci USA 70:697-701.
Williams S, Causgrove TP, Gilmanshin R, Fang KS, Callendar RH, Woodruff WH, Dyer RB. 1996. Fast events in protein folding: Helix melting and formation in a small peptide. Biochemistry 35:691-697.
Wu LC, Kim PS. 1997. Hydrophobic sequence minimization of the α-lactalbumin molten globule. Proc Natl Acad Sci USA 94:14314-14319.
Yapa K, Weaver DL, Karplus M. 1992. β-sheet coil transitions in a simple polypeptide model. Proteins 12:237-265.
Yue K, Dill KA. 1995. Forces of tertiary structural organization in globular proteins. Proc Natl Acad Sci USA 92:146-150.
Zimm BH, Bragg J. 1959. Theory of the phase transition between helix and random coil in polypeptide chains. J Chem Phys 31:526.
Zwanzig R, Szabo A, Bagchi B. 1992. Levinthal's paradox. Proc Natl Acad Sci USA 89:20-22.
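
Illustrative sketch. The underestimator idea behind the CGU method, as described in the discussion of conformational search strategies, can be illustrated in one dimension. What follows is a minimal sketch under stated assumptions, not the published CGU implementation. The landscape `toy_energy`, the helper names (`local_minimum`, `parabola_through`, `fit_underestimator`, `merge_duplicates`, `cgu_search`), and all numerical settings are hypothetical. The fit is assumed to minimise the summed gap between the known minima and a convex parabola that lies at or below all of them; here that small linear program is solved by brute-force enumeration of its vertices, which is only practical for a handful of points.

```python
import math
import random
from itertools import combinations


def toy_energy(phi):
    """Hypothetical rugged 1-D landscape: a broad funnel plus ripples."""
    return 0.5 * (phi - 1.0) ** 2 + 0.4 * math.cos(6.0 * phi)


def local_minimum(energy, phi, step=0.01, iters=2000, h=1e-5):
    """Walk downhill from phi using a central-difference gradient."""
    for _ in range(iters):
        grad = (energy(phi + h) - energy(phi - h)) / (2.0 * h)
        phi -= step * grad
    return phi, energy(phi)


def parabola_through(p1, p2, p3):
    """Coefficients (a, b, c) of a*x**2 + b*x + c through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d12 = (y2 - y1) / (x2 - x1)
    d23 = (y3 - y2) / (x3 - x2)
    a = (d23 - d12) / (x3 - x1)
    b = d12 - a * (x1 + x2)
    return a, b, y1 - a * x1 ** 2 - b * x1


def fit_underestimator(minima, a_min=1e-6, tol=1e-6):
    """Convex parabola U at or below every (phi, E) point that minimises
    sum(E_j - U(phi_j)). Candidates are the vertices of that linear program:
    parabolas through three points, or through two points with a = a_min."""
    candidates = []
    for trio in combinations(minima, 3):
        a, b, c = parabola_through(*trio)
        if a >= a_min:
            candidates.append((a, b, c))
    for (x1, y1), (x2, y2) in combinations(minima, 2):
        b = (y2 - y1) / (x2 - x1) - a_min * (x1 + x2)
        candidates.append((a_min, b, y1 - a_min * x1 ** 2 - b * x1))
    best = None
    for a, b, c in candidates:
        gaps = [y - (a * x * x + b * x + c) for x, y in minima]
        if min(gaps) >= -tol and (best is None or sum(gaps) < best[0]):
            best = (sum(gaps), a, b, c)
    if best is None:
        raise ValueError("need at least two minima with distinct phi")
    return best[1:]


def merge_duplicates(minima, min_sep=1e-2):
    """Keep one point per cluster of nearly identical phi values."""
    kept = []
    for phi, e in sorted(minima):
        if not kept or phi - kept[-1][0] > min_sep:
            kept.append((phi, e))
    return kept


def cgu_search(energy, lo, hi, samples=6, rounds=4, shrink=0.5, seed=0):
    """Sample, minimise locally, fit U under all known minima, then narrow
    the sampling window around the minimum of U and repeat."""
    rng = random.Random(seed)
    known = []
    for _ in range(rounds):
        known += [local_minimum(energy, rng.uniform(lo, hi))
                  for _ in range(samples)]
        known = merge_duplicates(known)
        if len(known) < 2:
            continue  # not enough distinct minima to fit a surface yet
        a, b, _ = fit_underestimator(known)
        guess = min(max(-b / (2.0 * a), lo), hi)  # minimum of U, clipped
        half = shrink * (hi - lo) / 2.0
        lo, hi = guess - half, guess + half
    return min(known, key=lambda p: p[1])
```

A call such as `best_phi, best_e = cgu_search(toy_energy, -3.0, 5.0)` returns the lowest local minimum found. In this sketch every sampled minimum, however high in energy, constrains the fitted parabola, which is the sense in which all conformations contribute information about the landscape shape. Fitting each round under all known minima, rather than only those inside the narrowed window, is a simplification chosen here for brevity.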
