Dill - Polymer Principles and Protein Folding - 1999 - Protein Science
REVIEW
KEN A. DILL 1
University of California, San Francisco, 3333 California Street, Ste. 415, San Francisco, California 94118
(Received January 25, 1999; Accepted March 4, 1999)
Abstract
This paper surveys the emerging role of statistical mechanics and polymer theory in protein folding. In the polymer
perspective, the folding code is more a solvation code than a code of local φψ propensities. The polymer perspective
resolves two classic puzzles: (1) the Blind Watchmaker's Paradox that biological proteins could not have originated from
random sequences, and (2) Levinthal's Paradox that the folded state of a protein cannot be found by random search. Both
paradoxes are traditionally framed in terms of random unguided searches through vast spaces, and vastness is equated
with impossibility. But both processes are partly guided. The searches are more akin to balls rolling down funnels than
balls rolling aimlessly on flat surfaces. In both cases, the vastness of the search is largely irrelevant to the search time
and success. These ideas are captured by energy and fitness landscapes. Energy landscapes give a language for bridging
between microscopics and macroscopics, for relating folding kinetics to equilibrium fluctuations, and for developing
new and faster computational search strategies.
Keywords: new view; polymer; protein folding; statistical mechanics
This paper describes a perspective on protein folding that derives in part from simple statistical mechanical and polymer models. As with any perspective, this one is a personal opinion, with all the limitations that implies. The first part of this paper explores the folding code. (1) Structure: How is the native structure encoded in the amino acid sequence? (2) Thermodynamics: Why is folding so cooperative? (3) Kinetics: What determines the speed and the rate-limiting steps of folding? Polymer modeling suggests that the folding code is more a solvation code and less a linear encoding of torsion angles along the peptide bond, even though the latter is not negligible. The second part explores the energy landscape perspective on folding kinetics. Polymer modeling suggests that the folding process more closely resembles balls rolling down bumpy funnels than balls rolling aimlessly on flat surfaces or rolling single file along identical trajectories.

DISCUSSION

Side-chain interactions contribute to architecture, just as backbone interactions do

The backbone forces of folding

Table 1 compares two different perspectives on the folding code. A backbone-centric, helix-centric perspective arose over the 50-year time frame, from the 1930s to 1980s. It originated with Mirsky and Pauling in 1936 (Mirsky & Pauling, 1936), who proposed that backbone hydrogen bonding is a prominent folding force. During the next 15 years, Pauling's group used the structures of small molecule hydrogen-bonding compounds to predict that folded proteins would have α-helical and β-sheet structures (Pauling & Corey, 1951a, 1951b, 1951c, 1951d; Pauling et al., 1951). The first X-ray crystal structures of globular proteins gave strong support to this view by confirming the existence of the predicted α-helices and β-sheets (Kendrew et al., 1958). Hydrogen bonding was seen to be an important structure-causing force in proteins.

During the same period, a step was taken toward understanding folding cooperativity through an understanding of the helix-coil transition. For many years it had been known that protein folding is cooperative, i.e., that there is a dramatic transition from denatured to native states upon only small changes in solvent, pH, or temperature. In the 1950s and 1960s, theoretical work particularly of Schellman (1958), Zimm and Bragg (1959), Poland and Scheraga (1970), and experiments (Doty & Yang, 1956; Doty et al., 1956) showed that long peptide chains can undergo a helix-coil transition that is cooperative. The helix-coil transition is driven by hydrogen bonding and φψ propensities among near-neighbor groups along the chain. For many years, this has been the main model for conformational cooperativity in biomolecules.

To complete the picture of structure, thermodynamics, and kinetics, experiments beginning in the 1970s showed that helices can form rapidly (Kim & Baldwin, 1982; Williams et al., 1996). One inference was that folding is hierarchical and can be explained by a scheme 1° → 2° → 3°: the primary structure leads to secondary structure (fast), which is then assembled into tertiary structure (slower). Hierarchical assembly was seen as a solution to the prob-

Reprint requests to: Ken A. Dill, University of California, San Francisco, 3333 California Street, Ste. 415, San Francisco, California 94118; e-mail: [email protected].

1 The author is grateful to Hans Neurath, the Protein Society, and Protein Science, for the opportunity to present this overview, which is largely taken from a talk given on the occasion of the Hans Neurath Award lecture, at the Protein Society meeting, July 27, 1998.
number of amino acids N, but the number of nonlocal interactions is proportional to about 2N, so the latter should dominate in larger proteins. (4) Helices and strands often take their conformational instructions from their context or from the solvent (Kuroda et al., 1996; Predki et al., 1996). (5) To a first approximation, a fold is determined by the binary sequence of hydrophobic/polar monomers, even when φψ propensities are largely chosen randomly (Reidhaar-Olson & Sauer, 1988; Bowie et al., 1990; Lim & Sauer, 1991; Gassner et al., 1992; Lim et al., 1992; Kamtekar et al., 1993; Matthews, 1993; Munson et al., 1994, 1996; Lazar et al., 1997; Roy et al., 1997; Schafmeister et al., 1997; Wu & Kim, 1997). (6) Protein folds are less affected by mutations on their surfaces than in their hydrophobic cores (Lim & Sauer, 1991; Matthews, 1993). (7) Some experiments show that protein folding is not hierarchical, implying that secondary structures are not pre-assembled and used as building blocks in tertiary assembly. For example, a β-sheet protein can fold via a helical intermediate (Shiraki et al., 1995; Hamada et al., 1996). (8) Hydrophobic clustering, like secondary structure formation, can be very fast (Chan et al., 1997; Ramachandra Shastry & Roder, 1998; Ramachandra Shastry et al., 1998), and it can drive helix and sheet formation.

Simplified models are hypothesis generators

The predictions described above come, in part, from models that involve considerable simplification. An example is the HP model, in which each amino acid is represented as a bead, each bond is a straight line, bond angles are a few discrete options rather than a continuum, different conformations conform to lattices in two or three dimensions, and the 20 amino acids are condensed into a two-letter alphabet: H (hydrophobic) or P (polar) (Dill, 1985; Dill et al., 1995).

While statistical mechanical models are simplified in their representation of energies and atomic details, they are more refined in other respects (Camacho & Thirumalai, 1993; Bryngelson et al., 1995; Dill et al., 1995; Karplus, 1997; Onuchic et al., 1997): (1) their full conformational space can be explored extensively, sometimes without sampling or approximation, and (2) sometimes the full sequence space can be explored. For some questions, it is more important to get right the representation of conformational or sequence spaces than it is to get right the atomic details. Many questions of structure, stability, and kinetics are not about the locations of the hydrogen bonds in native lysozyme. They are not questions that are answerable by crystallography. They are about distributions and ensembles, flexibilities and entropies, energy landscapes, folding kinetics, big conformational changes, or sequence space. They are low-resolution questions that have low-resolution answers. Apart from X-ray crystallography and NMR, the workhorses of biomolecule science for many years have been low-resolution experiments: CD, fluorescence, small-angle scattering, some NMR experiments, calorimetry, chromatography, ANS binding, melting curves, etc.

For questions involving conformational ensembles, conformational entropies, sequence space, long time and large spatial scales, ensemble averaging, or non-native states like transition states, molten globules, intermediates or denatured states, there is currently little alternative to some degree of simplification in models. It is sometimes helpful not to have atomic details, picosecond by picosecond, because it is hard to see the forest of principles through the trees of detail. It would be a mistake to believe that any model is improvable by adding structural detail. Sometimes details are the problem, not the solution. This is a key message from the successes of the two-dimensional lattice Ising model in the revolution that took place in understanding critical phenomena (Stanley, 1971). The inability of earlier models of phase transitions to capture subtle critical behavior was attributable, not to the lack of realism and atomic detail, but to a lack of rigor in the mathematics of the models.

Mathematician Mark Kac once said that the purpose of models is to polarize our thinking, to help us formulate questions. A model manifests a point of view; it regards certain components of a problem as relevant, important, or dominant, and other components as irrelevant, unimportant, or negligible, and then devises a chain of logic leading to predictions from those premises. Most broadly, the point of a model is to make decisive and testable predictions, regardless of whether its fine structure looks realistic. A key advantage of simplified models is that their parameters are physical and minimal in number. The chain of logic from premises to conclusions is direct. Simplified models serve to generate hypotheses that often cannot be generated in any other way, but that can then be tested by experiments or refined simulations.

Simplified models have been useful for exploring entropies and combinatoric principles of conformational and sequence spaces. Two problematic paradoxes of protein science have been shown by polymer modeling to be neither problematic nor paradoxical. (1) The Blind Watchmaker Paradox: The probability that natural proteins could be found in a random search of sequence space was seen to be impossibly small. (2) The Levinthal Paradox: The probability that a protein could find its native state by random search was seen to be impossibly small. Both paradoxes have been framed in terms of random unguided processes that search for a single point, the endstate, in a vast space. Biological evolution searches through sequence space; the endstate is a single protein having a particular function. Protein folding searches through conformational space; the endstate is the single native structure of a given protein. In both cases, the vastness of the search (i.e., the size of the search space) is taken, according to the paradox, to be the key to the impossibility of reaching the endstate.

Creationists, evolutionists, and blind watchmakers: Can proteins arise from random sequences?

Sequence space is a large place. For protein chains of 100 amino acids, the number of possible sequences of the 20 different amino acids is $20^{100} = 10^{130}$ (see Fig. 1). Creationists have used such numbers to argue the impossibility that proteins, and life, could

Fig. 1. Sequence space is large. There are $20^{100}$ different 100-mer sequences. The probability of finding one particular sequence is $20^{-100}$, but the probability of finding any sequence that folds to a particular structure is predicted to be more than 100 orders of magnitude larger.
have arisen from the random sequences that were plausibly on the prebiotic earth. Creationists solve the large numbers problem by invoking divine intervention. Evolutionists solve the large numbers problem instead by the accretion of advantage that happens through natural selection (Dawkins, 1996). But both evolutionists and creationists start from the same premise, the large numbers problem. Evolutionists too assume that natural proteins are infinitesimal specks in an impossibly vast and meaningless sequence space, as indicated by the following quotes: "Only a very small fraction of this unimaginably large number of polypeptide chains would adopt a single stable three-dimensional conformation" (Alberts et al., 1998), and "It is certain that we need a hefty measure of cumulative selection in our explanations of life" (Dawkins, 1996). It is this numbers problem that I refer to, with the help of the wonderful metaphor of Richard Dawkins, as the Blind Watchmaker's paradox.

But statistical mechanical modeling (Lau & Dill, 1990; Chan & Dill, 1991; Lipman & Wilbur, 1991) shows that there is very little numbers problem in the first place. Reach into a soup of random amino acid sequences. The chance of pulling out a biologically important molecule depends on what is meant by "biologically important." The following two questions are vastly different (see Fig. 1): (1) What is the probability of pulling from that soup a specific sequence? (2) What is the probability of pulling from that soup any sequence that folds to a specific structure? The answer to question (1) is $10^{-130}$ for a 100-mer. The chance of pulling out a polypeptide having, say, the lysozyme sequence, is essentially zero. But to achieve biological function, we care only about finding a particular fold, not a particular sequence. And modeling shows that the probability of finding a structure is likely to be more than 100 orders of magnitude larger than the probability of finding a sequence (Lau & Dill, 1990; Chan & Dill, 1991). The chance of pulling out any sequence that folds to roughly lysozyme's structure is closer to $10^{-10}$ to $10^{-20}$. While this number too may seem impossibly small, nature works with these sorts of numbers all the time. These numbers would imply about one such sequence in a liter of random sequences at nanomolar concentrations! And the probability of finding any chain fold, not just lysozyme, is even higher.

Why does the numbers problem disappear when we seek structures rather than sequences? There is an enormous degeneracy in sequence space: many different sequences can fold to the same native structure. A protein can be mutated substantially without changing its fold. The explanation for this is simple. If, as noted above, a fold is primarily determined by the binary sequence of hydrophobic/polar monomers, then the essential features of the full $20^{100}$ sequence space are found by searching a space of only about the $2^{100} = 10^{30}$ sequences that are written in a binary alphabet (H = hydrophobic, P = polar), a reduction of 100 orders of magnitude. Degeneracy means that hydrophobic monomers are largely interchangeable with each other, and polar monomers are interchangeable with each other, for determining a fold. (Function may have additional requirements, but estimates indicate that these do not change the numbers much (Lau & Dill, 1990).)

Moreover, modeling (Lau & Dill, 1990; Chan & Dill, 1991) and experiments (Reidhaar-Olson & Sauer, 1988; Matthews, 1993) show that the relevant space is even smaller, because only about $N/3$ of the residues are crucial for folding: those that define the hydrophobic core. To first approximation, most surface sites can be mutated without changing structure or function. (The factor of $N/3$ comes from the geometry of surface/volume ratios. Pack 100 amino acid spheres together to make a protein; 67 of them will be on the surface and 33 will be in the core.) Therefore, the real search for protein structure takes place, not in a space of $20^N$, but in a space nearer in size to $2^{N/3} = 2^{33} = 10^{10}$ for $N = 100$. The other 120 orders of magnitude in sequence space are highly degenerate; the folded states of those sequences will look much like ones already found in the search of the smaller space.

Sequence space is therefore not likely to be vast darkness with infinitesimal specks of protein-like light spots. It is not perfectly light either. On a logarithmic scale, sequence space is predicted to be more like a beige sea in which virtually all molecules are nearly folded. A typical random chain of 100 amino acids is predicted to be highly compact in water, have considerable secondary structure, and be structured much like a molten globule (Lau & Dill, 1990; Chan & Dill, 1991). This is a far better starting point for natural selection than are specks in astronomically large sequence spaces.

But whatever the starting point, there remains the need for a process of improvement. Evolution can improve proteins by natural selection. Richard Dawkins has explained natural selection using the metaphor of a Blind Watchmaker. By invoking "Watchmaker," he means the endstate has the appearance of having been designed. The traditional inference is that an object that appears designed was built by a systematic step-by-step procedure, as in building a watch. But Dawkins' term "Blind" means that, on the contrary, the natural selection process is not so systematic and does not involve a specific pre-ordained sequence of events. Natural selection is a Blind Watchmaker that improves proteins through incremental steps, each of which involves some bias, however small, while also involving considerable random choice among alternatives.

The Blind Watchmaker is also a useful metaphor for protein folding kinetics, the systematic accretion of native structure over the time course of a folding experiment (Zwanzig et al., 1992). A native protein has the appearance of design. For example, the steric fit of side chains in a protein core is as precise as that of a jigsaw puzzle. But tight packing of irregular objects can also be achieved by shaking up nuts and bolts in a jar, with no design involved (Bromberg & Dill, 1994). The appearance of design does not mean that folding happens in serial step-by-step fashion. Even a jigsaw puzzle can be constructed through different parallel sequences of events (Harrison & Durbin, 1985). The native structure can be reached, over the time course of folding, by a process that (1) starts from different initial conformations and (2) proceeds by incremental improvements, each of which has some bias but also involves considerable random choice among alternatives. Even a very small bias (deviation from randomness) in choosing among alternatives can speed up the search time (compared to a random search) by tens to hundreds of orders of magnitude (see below). When perfect randomness is not the driver, the vastness of the search becomes irrelevant to the search time (Dill, 1993). These ideas are captured in landscapes: fitness landscapes in sequence space or energy landscapes in conformational space. We now focus on the latter.

The old and new views of folding kinetics: Different questions

Protein folding kinetics has been described in terms of so-called Old and New Views (Baldwin, 1994, 1995; Dill & Chan, 1997). To define these views, we first distinguish models, old and new, from
views of microscopic folding processes, old and new. For models, the terms "old" and "new" are too stark a contrast. They lead to the perception that we should be asking: Why do we need new models? What was wrong with the old ones? Which models are better? But these questions miss the mark. Old and new models do not address the same questions. The old models are mass action models used to fit experimental data on folding relaxation times and amplitudes. The new view is not a denial of these models. Mass action models remain valid for representing such data. While mass action modeling gives a macroscopic description of experimental data, the new statistical mechanical models give a microscopic framework to explain that data.

Where old and new views differ, however, is in their interpretation of the microscopic processes of folding. In the hope that changing terminology can help untangle some confusion, I will replace "old view" with "Sequential Micropath view" and "New view" with "Ensemble view."

Table 2 summarizes the differences between the two views of folding kinetics (Dill & Chan, 1997). The language of the Sequential Micropath view (pathways, transition states, reaction coordinates, on-path and off-path intermediates) is intended to explain what exponentials do (i.e., what you see in experiments). Experimental relaxation data are interpreted in terms of mass-action diagrams having arrows that connect symbols like D (denatured), I (intermediate), and N (native). Nothing in this language says what any one molecule is doing at any given time, or how the kinetics of folding is related to the monomer sequence, or how to assign microscopic chain conformations to labels I or D or transition state, etc. But the language of the Ensemble view (landscapes, folding funnels) is intended to describe what molecules do (i.e., how individual molecules progress toward the folded state), and how different monomer sequences lead to different kinetics.

Sequential micropath perspective

The main problem, according to the Sequential Micropath view, was the search problem, which has been called the Levinthal paradox. As Levinthal posed it (Levinthal, 1968; Wetlaufer, 1973), a random search of conformations would take a protein forever. Levinthal saw folding as a search through a vast conformational space, the haystack, for the native structure, the needle. Suppose the conformational space is represented by four preferred φψ angles for each peptide bond: α-helical, β-strand, and two others. In terms of those discrete options, the size of the space for a 100-residue chain is $4^{100} \approx 10^{60}$ chain conformations. Only one of these is the native structure. Levinthal's proposed solution for finding the needle in the haystack was that all chains must follow the same microscopic pathway, like ants single file on a trail (Levinthal, 1968). By "same pathway," he specified that every chain follows the same sequence of bond angle changes, in the same order, to reach the native state. In the Sequential Micropath view, kinetic intermediates (if they were on-pathway) were seen as helpful mileposts because they would show what route was taken, and therefore what routes were avoided, and therefore how the haystack was searched efficiently. Two-state kinetics was seen as uninformative about the mileposts of folding.

Ensemble perspective

In the Ensemble view, the vastness of the search is largely irrelevant. The more important problem is kinetic traps (Chan & Dill, 1994, 1998). Chains can sort very quickly through vast stretches of conformational space. In this view, chains fall energetically downhill, as when balls roll down bumpy funnels. Chains do not fold by random searches on level energy landscapes. In this view, two-state kinetics often means the chain is folding at nearly its maximum possible diffusion-limited speed, without kinetic traps. In this view, stable intermediates are mainly seen as kinetic traps that slow down the folding process.

Here's why the vastness of the search is irrelevant. Even a very small bias, in the form of the forces of protein folding, can be the difference between folding times measured in lifetimes of the universe vs. milliseconds (Bryngelson & Wolynes, 1987; Dill, 1987; Zwanzig et al., 1992). (On a large flat golf course, a golf ball will never find the hole by random processes, but if the golf course has even a small tilt that funnels toward the hole, no problem!)

What causes the funnel-like tilt on a folding landscape? The first estimates of the shapes of folding energy landscapes were based on mean-field theories (see Fig. 2) (Dill, 1985; Bryngelson & Wolynes, 1987). Hydrophobic collapse leads to compact chain conformations. The funnel arises because the drive to collapse is also a drive toward a reduced ensemble of conformations. There are many non-native states (high energy), but only one native state (low energy). The fraction of conformations that are compact is infinitesimal compared to the total conformational space. If there are $4^{100}$ conformations of a chain, Flory-Huggins-like theories predict that only about $(4/e)^{100} \approx 10^{17}$ of those conformations are compact (Dill, 1985). More accurate recent estimates predict a number that is even smaller (Yue & Dill, 1995): the number of compact conformations having a hydrophobic core may even be as small as 1 for some sequences. This estimate is supported by experiments on reduced alphabets based on hydrophobicity codes; a small fraction of sequences appear to fold relatively uniquely (Riddle et al., 1997; Roy et al., 1997; Schafmeister et al., 1997).

In short, the Ensemble view is a reversal of the sequential micropath view. What was seen as the slow step, the search through the huge haystack of non-native chain conformations, is now seen as happening at near diffusion-limited speed. Collapse can be fast. Sifting through most of the haystack is fast; the slow part is the endgame of reconfiguring a very small set of near-native conformations. In the past few years, new fast experimental methods (Burton et al., 1997; Callendar et al., 1999) have shown that proteins can fold at nearly diffusion-limited rates, on submillisecond time scales (Huang & Oas, 1995; Ballew et al., 1996a, 1996b; Pascher et al., 1996; Burton et al., 1997; Chan C-K et al., 1997;

Table 2.

                     Sequential micropath view    Ensemble view
Language             Paths, intermediates,        Landscapes, funnels
                     transition states,
                     reaction coordinates
Explains             What exponentials do         What molecules do
                     (what you see)               (how it works)
Main problem         Search problem               Trap problem
Proposed solution    Sequential pathways          Funnels
Intermediates        Mileposts                    Traps
Two-state kinetics   No information               Implies fast folding
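The order-of-magnitude counts quoted in the preceding sections (sequence space for a 100-mer, the binary hydrophobic/polar alphabet, the core-only estimate, the four-state conformational space, and the Flory-Huggins-like count of compact conformations) can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original paper; the helper name `log10_power` and the dictionary keys are illustrative assumptions. It works with base-10 logarithms so that no astronomically large number is ever formed.

```python
import math

def log10_power(base, exponent):
    """Return log10(base ** exponent) without forming the huge number itself."""
    return exponent * math.log10(base)

n = 100  # chain length used throughout the text

# Keys are illustrative labels; each value is the base-10 exponent of the count.
estimates = {
    "sequences_20_letter": log10_power(20, n),             # 20^N
    "sequences_binary_hp": log10_power(2, n),              # 2^N
    "sequences_core_only": log10_power(2, n / 3),          # 2^(N/3)
    "conformations_4_state": log10_power(4, n),            # 4^N
    "conformations_compact": log10_power(4 / math.e, n),   # (4/e)^N
}

for label, exponent in estimates.items():
    print(f"{label}: about 10^{exponent:.0f}")
```

Rounded to the nearest integer, the exponents agree with the figures given in the text: 130, 30, 10, 60, and 17.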
tion should be assigned? Or choose a macrostate: what micro- which, in turn, is a distinction between microscopics and macro-
scopic conformations are in it? What is the ensemble called scopics ~see Fig. 5!. The Sequential Micropath view postulates a
intermediate state, or denatured state, or transition state? simple relationship between these two types of diagrams. In the
Currently, such assignments must be made arbitrarily. Macrostates Ensemble view, the relationship can be complex, but, in general, is
are averages over many microscopic conformations; they are not not known. A microstate is a single point on an energy landscape
descriptions of single chain conformations. ~2! What series of chain and has free energy Fmicro 5 F~f!, which is also called the internal
conformations defines the reaction coordinate? A reaction coordi- free energy ~Dill & Chan, 1997!. A macrostate has free energy
nate is a macrovariable, not a microvariable ~see below!. For pro- Fmacro 5 F~j!, where j is just a scalar quantity, such as a reaction
tein folding, the reaction coordinate is not known in microscopic coordinate or a progress variable. A given value of j represents
terms. ~3! Even if we knew the reaction coordinate, how do we some particular ensemble of microscopic conformations.
know which way is forward? Which specific bond angles should Figure 6 illustrates the difference between Fmicro and Fmacro , in
we change to progress toward the native state? Every protein fold- a simple model. Suppose we choose as a progress variable the
ing algorithm must make these kinds of microdecisions at every number of hydrophobic contacts, j 5 0, 1, 2, . . . , m, to reflect the
step. But no experiment yet gives such microinformation. Energy extent of folding. This is just one of many possible progress vari-
landscapes can provide the common language to bridge between ables; it is just chosen here for illustration because it simplifies the
micro- and macrodescriptions. math. The density of states g~j! is a count of the number of
different microstates f that define a particular macrostate j. Fig-
What is an energy landscape? ure 6 shows one of the g~0! ' 500,000 conformations that have
According to the principles of thermodynamics, if a system has j 5 0 hydrophobic contacts, one of the g~4! 5 67 conformations
n degrees of freedom f 5 @f1 , f2 , . . . , fn #, the stable state of the that have j 5 4 hydrophobic contacts, and the g~6! 5 1 confor-
system can be found by determining the set of values f* 5 @f*1 , mation that has j 5 6 hydrophobic contacts; this is the native
f2*, . . . , fn*# that gives the minimum value of the free energy func- structure in this model.
tion F~f! 5 F~f1 , f2 , . . . , fn !, when explored over all possible To determine Fmicro , focus on a particular conformation. For that
values of f ~see Fig. 4!. Such functions F~f! are called energy conformation, sum all the energies due to bond angles, torsions,
landscapes. Energy landscapes, per se, are neither new, nor con- stretches, van der Waals interactions, hydrogen bonds, electrostat-
troversial, nor limited to proteins. Energy landscape is nothing ics, and include the solvation free energy due to the relative amounts
more than a name for this function. For protein folding, f may be of buried and exposed hydrophobic and polar surface. Fmicro is a
the backbone and sidechain bond angles, for example. free energy, rather than just an energy, because it includes solvation
and desolvation entropies and the hydrophobic effect. Fmicro is not
the total free energy, however, because it does not include the
Distinguishing between microscopics and macroscopics
chain conformational entropy: it treats only a single conformation.
The distinction between the old and new views is the distinction between an energy landscape and a reaction coordinate diagram.

In the HP model, in which hydrophobicity dominates, a given chain conformation has j hydrophobic contacts, so Fmicro = jE, where E < 0 is the free energy of desolvating two nonpolar groups and bringing them into contact.

The relationship between energy landscapes and reaction diagrams is a relationship between Fmacro and Fmicro. Fmacro does include the chain conformational entropy.

Fig. 4. Energy landscapes are free energies, Fmicro(f1, f2, ...), as a function of the degrees of freedom, f1, f2, ..., such as backbone and side-chain bond angles.

Fig. 5. (A) Energy landscape vs. (B) reaction diagram. A landscape is a free energy Fmicro of each individual chain conformation vs. the many microscopic degrees of freedom. A reaction diagram is a free energy Fmacro of an ensemble of molecules, and includes the chain conformational entropy. Here Fmacro is a function of a single variable, j, such as a reaction coordinate. The reaction coordinate is usually not known for protein folding. The red arrow on the landscape indicates a possible micropath, an individual folding trajectory. In this case, the micropath never involves an uphill step, and yet the reaction diagram has a free energy barrier. The barrier is due to the slow entropic search of many different chains seeking the entry to the central steep funnel.

The main point is that Fmicro is the free energy of a single chain conformation whereas Fmacro is the free energy of some ensemble of conformations that collectively have some macroscopic meaning, such as an intermediate, transition state, molten globule, or the denatured state. Fmacro includes a conformational entropy (k ln g(j)), due to the number of microscopic conformations in the particular macrostate. Fmacro(j) is a function of a single variable j, and therefore it corresponds to just an ordinary two-dimensional plot of the folding free energy vs. reaction coordinate j. This is the traditional reaction coordinate diagram (see Fig. 5B). In contrast, Fmicro(f) is the energy landscape; it is a function of many degrees of freedom. Landscapes are usually plotted in three dimensions, as a simplification, since it is impossible to draw high-dimensional surfaces. Traditional terms such as intermediate state, pathway, transition state, and free energy barrier refer to Fmacro(j). In contrast, computer simulations usually explore Fmicro(f).
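The difference between Fmicro and Fmacro can be made concrete with a small enumeration. The sketch below is illustrative only and is not taken from the paper: it assumes a two-dimensional square lattice, a short hypothetical HP sequence, and the relation Fmacro(j) = jE - kT ln g(j), which is one reading of the statement that Fmacro adds a conformational entropy k ln g(j) to the contact free energy jE. All function names and the values used for E and kT are assumptions made for the example.

```python
import math
from collections import Counter

NEIGHBOR_STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def enumerate_walks(n_beads):
    """Yield every self-avoiding walk of n_beads on a square lattice.

    The first bond is fixed along +x to remove the four trivial rotations.
    """
    def extend(path, occupied):
        if len(path) == n_beads:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in NEIGHBOR_STEPS:
            site = (x + dx, y + dy)
            if site not in occupied:
                path.append(site)
                occupied.add(site)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(site)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))

def count_contacts(path, seq):
    """Count H-H pairs that are lattice neighbors but not bonded along the chain."""
    index_at = {site: i for i, site in enumerate(path)}
    contacts = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for site in ((x + 1, y), (x, y + 1)):
            k = index_at.get(site)
            if k is not None and seq[k] == "H" and abs(k - i) > 1:
                contacts += 1
    return contacts

def density_of_states(seq):
    """Return g(j): the number of conformations having j hydrophobic contacts."""
    g = Counter()
    for walk in enumerate_walks(len(seq)):
        g[count_contacts(walk, seq)] += 1
    return dict(g)

def f_micro(j, eps):
    """Free energy of one conformation with j contacts (no chain entropy)."""
    return j * eps

def f_macro(j, g_j, eps, kT):
    """Free energy of the whole macrostate j, including the entropy k ln g(j)."""
    return j * eps - kT * math.log(g_j)

if __name__ == "__main__":
    eps, kT = -1.0, 1.0          # assumed values, in arbitrary energy units
    g = density_of_states("HPHPPH")
    for j in sorted(g):
        print(j, g[j], f_micro(j, eps), round(f_macro(j, g[j], eps, kT), 3))
```

For the four-bead sequence HPPH, fixing the first bond leaves 9 conformations, of which 2 have one H-H contact and 7 have none. Every individual step toward the contact-rich conformations lowers Fmicro, yet Fmacro can still be higher there because so few conformations belong to that macrostate.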
Fig. 8. A: For chemical reactions (energies ≫ kT), the macrostates on reaction coordinate diagrams correspond to the time series of microstates on the energy landscape. B: For folding processes (energies per interaction ≈ kT), the observed macrostates may not uniquely specify the time series of microstates on the energy landscape.
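The contrast in Figure 8 turns on how large the relevant energies are compared with kT. As a rough numerical illustration (a sketch under stated assumptions, not a calculation from the paper), the code below computes Boltzmann populations for two hypothetical two-state systems, one with an energy gap of 1 kT and one with a gap of 20 kT. The function name and the chosen gaps are assumptions.

```python
import math

def boltzmann_populations(energies_in_kT):
    """Return equilibrium populations for states whose energies are given in units of kT."""
    e_min = min(energies_in_kT)
    # Shift by the minimum energy for numerical stability; populations are unchanged.
    weights = [math.exp(-(e - e_min)) for e in energies_in_kT]
    z = sum(weights)
    return [w / z for w in weights]

# Hypothetical two-state systems: a ground state plus one excited state.
small_gap = boltzmann_populations([0.0, 1.0])    # gap comparable to kT
large_gap = boltzmann_populations([0.0, 20.0])   # gap much larger than kT

if __name__ == "__main__":
    print("excited-state population, gap = 1 kT: ", small_gap[1])
    print("excited-state population, gap = 20 kT:", large_gap[1])
```

With a gap of 1 kT the excited state holds roughly a quarter of the population, so many alternative microstates are visited routinely. With a gap of 20 kT its population is of order 10^-9, which is why a single sequence of microstates can stand in for the macrostate in the chemical-reaction case.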
Alternatives to the sequential micropath and ensemble views?

There have been efforts to marry old and new views (Pande et al., 1998; Pande & Rokhsar, 1999). Those efforts aim to reconcile how there can be preferred folding routes at the same time that individual chains follow different micropaths. But no marriage is needed. Preferred routes and states are part and parcel of the Ensemble perspective. In my opinion, the Ensemble perspective is not one model, one result, or one energy landscape shape. It is not a denial of patterns, pathways, uniqueness, or structure. It is just a perspective based on recognizing the general funnel-like nature of the energy function Fmicro(f) with bumps and wiggles and shapes that have yet to be determined. The funnel perspective is universally captured in many different models, monomer sequences, potential functions, move sets, and definitions of transition states. While particular results can depend on model details, the funnel concept is a broad brush picture of how a large ensemble leads to a small ensemble, how an unstructured population changes through time to become a single structure, and how the degrees of freedom diminish from being many and uncoupled and unsynchronized to being few and coupled and synchronized. This process is bound to involve preferences.

Indeed, at the end of the folding process, it would be remarkable (and maybe impossible) to have large diversity in conformations or trajectories. Most of the simulations that have led to the Ensemble perspective have found preferred folding routes in the late stages of folding (Miller et al., 1992; Lazaridis & Karplus, 1997). What is new in the new view, and what was the essence of Levinthal's problem, was what happens in the early stages of folding, not the late stages. Levinthal's concern was how to search the huge space of denatured conformations. The Ensemble view merely asserts that molecules cannot be synchronized at the beginning of folding because different chains have such different unfolded conformations. Although the denatured state is a single macrostate, it is a very heterogeneous collection of microstates.

It will surely remain a matter of opinion for any given simulation whether what is interesting is the pathway or the variance from it. In either case, energy landscapes provide the basis for calculating any property of interest.

fluctuations). Energy landscapes provide the framework for relating the thermodynamics and kinetics of protein folding. Figure 10 shows two landscapes: one is a smooth funnel, the other is a rugged funnel. For the smooth funnel, folding kinetics should be fast and two state. For the rugged landscape, folding kinetics will be slower and more complex.

The shape of the landscape also describes the fluctuations at equilibrium. Fluctuations are interesting for two reasons. First, these are the motions that are important for protein function, such as when an enzyme enters a transition state for catalyzing a reaction. Second, fluctuations can be measured by NMR or thermal factors in X-ray crystallography. The fluctuations are those conformations having energies only one or two kT higher than the native conformation and are therefore transiently populated due to occasional Brownian bombardments. If a protein has a smooth landscape, the motions of the protein are mostly small wiggles, never deviating much from the native structure because to do so would require a high energy. But for a bumpy landscape, very non-native-like conformations can occasionally be populated under native conditions because the energies of such conformations are not much higher than those of the native molecule (Miller & Dill, 1995; Tang & Dill, 1998). During those fluctuations, protons or ligands could exchange in or out, or the protein could have other transiently different properties than the native molecule. If we knew the shapes of energy landscapes, we could better understand the relationship between folding kinetics and equilibrium fluctuations around the native state.

Landscape-ology can help in developing new conformational search strategies

Knowing the shapes of energy landscapes should also help to create faster computer conformational search methods. In the Sequential Micropath view, on-pathway intermediates are held in special regard because of how they might illuminate the folding
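The point that the size of a search space need not set the search time can be illustrated with a toy comparison of unguided guessing and a strictly downhill walk. This is a hypothetical sketch, not a model from the paper: it assumes a chain of independent two-state units, an energy equal to the number of non-native units, and a move set that proposes one unit at a time and rejects uphill moves. The function names are assumptions.

```python
import random

def unguided_expected_trials(n_units):
    """Mean number of independent random guesses needed to hit one target among 2**n_units states."""
    return 2 ** n_units

def guided_search_steps(n_units, rng):
    """Count proposed single-unit moves until every unit is native, rejecting uphill moves.

    True marks a unit that is already native-like; the energy is the number of False entries.
    """
    state = [rng.random() < 0.5 for _ in range(n_units)]
    steps = 0
    while not all(state):
        i = rng.randrange(n_units)
        steps += 1
        if not state[i]:
            # Fixing a non-native unit lowers the energy, so the move is accepted.
            state[i] = True
        # Proposals that would break a native unit are uphill and are rejected.
    return steps

if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so the run is repeatable
    n = 20
    print("unguided, expected guesses:", unguided_expected_trials(n))
    print("guided, proposed moves:    ", guided_search_steps(n, rng))
```

For 20 units the unguided search needs about a million guesses on average, whereas the downhill walk typically finishes in tens of proposed moves. The space of states is identical in both cases; only the guidance differs.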
atom has its place. It has been considered important to know THE native structure, THE transition state, or THE intermediate structure. Of course, it is clear that proteins wiggle and move; they are not perfectly static (Karplus, 1997). But even so, such fluctuations are often regarded as a sort of footnote to the main message, much like error bars in experimental data. According to this logic, folding pathways are less like a perfect train track, where no lateral variation is allowed, and more like a highway, where some small degree of weaving and lateral meandering can take place.

But in the polymer view, statistics can play a fundamentally different and deeper role. It is more like replacing a train track, not with a highway, but with a ski bowl. Driving on a highway from point A to point B can be described by average velocities, positions, and altitudes along the reaction coordinate, the highway. But tracking an ensemble of skiers is quite a different business than tracking the flow of cars on a highway. Skiers can take different routes. The average position of skiers on a mountainside is a much more heterogeneous property, with less apparent meaning. What is THE structure, or even THE averaged structure, at any given time, is not yet clear, or necessarily always meaningful.

The importance of one particular structure and the neglect of statistics has a parallel in the history of polymer science. The breakthrough that founded polymer science was the macromolecular hypothesis, the idea that there were long chains covalently linked together (Flory, 1953; Morawetz, 1985). The huge resistance to this idea prior to the 1920s was due to a faith in the importance of specific structures and a reluctance to fully appreciate the statistics. According to Flory (1953):

Organic chemists were motivated by the desire to devise concise formulas and to isolate pure substances, the term "pure" . . . invariably implying a formula of convenient size. Hence the quest for the cellulose molecule or the rubber molecule continued. . . . By the turn of the century this objective had crystallized to a discipline which dominated synthetic organic chemistry. To be eligible for acceptance in the chemical kingdom, a newly created substance . . . had to be separated in such a state that it could be characterized by a molecular formula. The investigator was obliged to adduce elementary analyses to confirm the composition, and to supplement these with molecular weight determinations for the purpose of showing that the substance was neither more nor less complex than the formula proposed. Otherwise the fruits of his labors would not be elevated to an honored place in the immortal pages of the chemical compendiums. The successes of synthetic organic chemistry in creating the hundreds of thousands of different combinations and permutations of atoms must not be discounted. In magnitude of creative achievement, they are scarcely surpassed in any other field of science. While this discipline was strikingly successful, it also tended to narrow the outlook of contemporary researchers. They came to believe that every definable substance could be classified in terms of a single definite molecule capable of being represented by a concise formula.

With the macromolecular hypothesis came the recognition that N polymer molecules in solution, even when they are called by the same name, such as polyethylene, are not identical to each other. Each molecule in solution can have a different conformation and even, for synthetic polymers, a different chain length. Hence different experiments see different ensemble averages and give different perspectives on the same molecule, polyethylene. Within only about a decade after the macromolecular hypothesis was accepted, quantitative statistical mechanical models began to successfully explain rubber elasticity, the viscosities and viscoelasticities of chain molecule liquids, the dependence of the physical properties of polymeric materials on molecular weight distributions, and the unusual thermodynamics of polymer solutions. Such statistical ideas now provide the foundation of modern polymer theory. For many properties of proteins, too, it seems clear that statistics is not just a caveat about small details but is at the very heart of the problems that proteins must solve.

Conclusions

Statistical mechanical models can give useful insights about proteins. While all-atom models sacrifice conformational sampling to gain atomic detail, statistical mechanical models do the reverse. Because simple models explore non-native states so effectively, have few parameters, and cost little computer time, they have been useful for exploring folding forces and principles. They have led to the perspective that the folding code is primarily a solvation code, rather than a local propensities code. Statistical mechanical models are well suited to addressing combinatoric problems, such as the Levinthal and Blind Watchmaker paradoxes. The conclusion is that we should beware of "needle in a haystack" arguments, because nature does not seem to work that way. Each step is not unguided. Conformational and sequence spaces are more like landscapes. Landscapes are funnel like, wide at the top and narrow at the bottom, sometimes with hills and valleys. All conformations (not just on-pathway intermediates, for example) can give some guidance toward the global minimum. New computational search methods are drawing on this information to make better folding and docking algorithms. The energy landscape perspective may help connect the currently disjoint areas of kinetics experiments and conformational search strategies.

Acknowledgments

I thank Sarina Bromberg and Jack Schonbrun for very helpful comments. I also thank Hue Sun Chan for his insightful comments on this manuscript, for his many contributions over the past decade that are summarized in this work, and for teaching me a great deal about proteins. I appreciate support from the Lawrence Berkeley Labs and from NIH grant GM34993.

References

Alberts B, Bray D, Johnson A, Raff M, Roberts K, Walter P. 1998. Essential cell biology: An introduction to the molecular biology of the cell. New York: Garland.
Anfinsen CB, Scheraga HA. 1975. Experimental and theoretical aspects of protein folding. Adv Prot Chem 29:205-300.
Aurora R, Creamer TP, Srinivasan R, Rose GD. 1997. Local interactions in protein folding: Lessons from the α-helix. J Biol Chem 272:1413-1416.
Baldwin RL. 1994. Protein folding: Matching speed and stability. Nature 369:183-184.
Baldwin RL. 1995. The nature of protein folding pathways: The classic versus the new view. J Biomol NMR 5:103-109.
Baldwin RL, Rose GD. 1999a. Is protein folding hierarchic? I. Local structure and peptide folding. Trends Biochem Sci 24:26-33.
Baldwin RL, Rose GD. 1999b. Is protein folding hierarchic? II. Folding intermediates and transition states. Trends Biochem Sci 24:77-83.
Ballew RM, Sabelko J, Gruebele M. 1996a. Direct observation of fast protein folding: The initial collapse of apomyoglobin. Proc Natl Acad Sci USA 93:5759-5764.
Ballew RM, Sabelko J, Gruebele M. 1996b. Observation of distinct nanosecond and microsecond protein folding events. Nature Struct Biol 3:923-926.
Bowie J, Reidhaar-Olson J, Lim WA, Sauer RT. 1990. Deciphering the messages in protein sequences: Tolerance to amino acid substitutions. Science 247:1306-1310.
Branden C, Tooze J. 1999. Introduction to protein structure, 2nd ed. New York: Garland.
Bromberg S, Dill KA. 1994. Side chain entropy and packing in proteins. Protein Sci 3:997-1009.
Bryngelson J, Onuchic J, Socci ND, Wolynes PG. 1995. Funnels, pathways, and the energy landscape of protein folding: A synthesis. Proteins 21:167-195.
Bryngelson J, Wolynes PG. 1987. Spin-glass and the statistical mechanics of protein folding. Proc Natl Acad Sci USA 84:7524-7528.
Burton RE, Huang GS, Daugherty MA, Calderone TL, Oas TG. 1997. The energy landscape of a fast-folding protein mapped by Ala → Gly substitutions. Nature Struct Biol 4:305-310.
Callendar RH, Gilmanshin R, Dyer RB, Woodruff WH. 1999. Annu Rev Phys Chem. Forthcoming.
Camacho CJ, Thirumalai D. 1993. Kinetics and thermodynamics of folding in model proteins. Proc Natl Acad Sci USA 90:6369-6372.
Chan C-K, Hu Y, Takahashi S, Rousseau DL, Eaton WA, Hofrichter J. 1997. Submillisecond protein folding kinetics studied by ultrarapid mixing. Proc Natl Acad Sci USA 94:1779-1784.
Chan HS, Dill KA. 1991. Sequence space soup of protein and copolymers. J Chem Phys 95:3775-3787.
Chan HS, Dill KA. 1994. Transition states and folding dynamics of proteins and heteropolymers. J Chem Phys 100:9238-9257.
Chan HS, Dill KA. 1998. Protein folding in the landscape perspective: Chevron plots and non-Arrhenius kinetics. Proteins 30:2-33.
Chandler D. 1987. Introduction to modern statistical mechanics. New York: Oxford Press.
Dawkins R. 1996. The blind watchmaker: Why the evidence of evolution reveals a universe without design. New York: Norton.
Dill KA. 1985. Theory for the folding and stability of globular proteins. Biochemistry 24:1501-1509.
Dill KA. 1987. The stabilities of globular proteins. In: Oxender DL, Fox CF, eds. Protein Engineering. Alan R. Liss Inc. pp 187-192.
Dill KA. 1993. Folding proteins: Finding a needle in a haystack. Curr Op Struct Biol 3:99-103.
Dill KA, Bromberg S, Yue K, Fiebig KM, Yee DP, Thomas PD, Chan HS. 1995. Principles of protein folding: A perspective from simple exact models. Protein Sci 4:561-602.
Dill KA, Chan HS. 1997. From Levinthal to pathways to funnels: The new view of protein folding kinetics. Nature Struct Biol 4:10-19.
Dill KA, Fiebig KM, Chan HS. 1993. Cooperativity in protein-folding kinetics. Proc Natl Acad Sci USA 90:1942-1946.
Dill KA, Phillips AT, Rosen JB. 1997. Protein structure and energy landscape dependence on sequence using a continuous energy function. J Comp Biol 4:227-239.
Doty P, Bradbury JH, Holtzer AM. 1956. The molecular weight, configuration and association of poly-γ-benzyl-L-glutamate in various solvents. J Am Chem Soc 78:947-954.
Doty P, Yang JT. 1956. Polypeptides VII. Poly-gamma-benzyl-L-glutamate: The helix-coil transition in solution. J Am Chem Soc 78:498.
Eisenberg D, Weiss RM, Terwilliger C. 1984. The hydrophobic moment detects periodicity in protein hydrophobicity. Proc Natl Acad Sci USA 81:140-144.
Fiebig KM, Dill KA. 1993. Protein core assembly processes. J Chem Phys 98:3475-3487.
Flory PJ. 1953. Principles of polymer chemistry. Ithaca, New York: Cornell University Press. pp 8-19.
Frauenfelder H, Sligar SG, Wolynes PG. 1991. The energy landscapes and motions of proteins. Science 254:1598-1603.
Gassner NC, Baase WA, Matthews BW. 1992. A test of the jigsaw puzzle model for protein folding by multiple methionine substitutions within the core of T4 lysozyme. Proc Natl Acad Sci USA 93:12155-12158.
Gilmanshin R, Callendar RH, Dyer RB. 1998. Fast events in protein folding: The time evolution of primary processes. Nature Struct Biol 5:363-365.
Gilmanshin R, Williams S, Callendar RH, Woodruff WH, Dyer RB. 1997a. Fast events in protein folding: Relaxation dynamics of secondary and tertiary structure in native apomyoglobin. Proc Natl Acad Sci USA 94:3709-3713.
Gilmanshin R, Williams S, Callendar RH, Woodruff WH, Dyer RB. 1997b. Fast events in protein folding: Relaxation dynamics and structure of the I form of apomyoglobin. Biochemistry 36:15006-15012.
Hamada D, Segawa S, Goto Y. 1996. Non-native alpha-helical intermediate in the refolding of beta-lactoglobulin: A predominantly beta-sheet protein. Nature Struct Biol 3:868-873.
Harrison SC, Durbin R. 1985. Is there a single pathway for the folding of a polypeptide chain? Proc Natl Acad Sci USA 82:4028-4030.
Honig B, Cohen FE. 1996. Adding backbone to protein folding: Why proteins are polypeptides. Folding Design 1:R17-R20.
Huang GS, Oas TG. 1995. Submillisecond folding of monomeric λ repressor. Proc Natl Acad Sci USA 92:6878-6882.
Kabsch W, Sander C. 1984. On the use of sequence homologies to predict protein structure: Identical pentapeptides can have completely different conformations. Proc Natl Acad Sci USA 81:1075-1078.
Kamtekar S, Schiffer JM, Xiong H, Babik JM, Hecht MH. 1993. Protein design by binary patterning of polar and nonpolar amino acids. Science 262:1680-1685.
Karplus M. 1997. The Levinthal paradox: Yesterday and today. Folding Design 2:S69-S75.
Karplus M, Weaver DL. 1976. Protein-folding dynamics. Nature 260:404-406.
Kauzmann W. 1959. Some factors in the interpretation of protein denaturation. Adv Prot Chem 14:1-63.
Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC. 1958. A three-dimensional model of the myoglobin molecule obtained by X-ray analysis. Nature 181:662-666.
Kim PS, Baldwin RL. 1982. Specific intermediates in the folding reactions of small proteins and the mechanism of protein folding. Annu Rev Biochem 51:459-489.
Kuroda Y, Hamada D, Tanaka T, Goto Y. 1996. High helicity of peptide fragments corresponding to beta-strand regions of beta-lactoglobulin observed by 2D-NMR spectroscopy. Folding Design 4:255-263.
Lau KF, Dill KA. 1990. Theory for protein mutability and biogenesis. Proc Natl Acad Sci USA 87:638-642.
Lazar GA, Desjarlais JR, Handel TM. 1997. De novo design of the hydrophobic core of ubiquitin. Protein Sci 6:1167-1178.
Lazaridis T, Karplus M. 1997. New view of protein folding reconciled with the old through multiple unfolding simulations. Science 278:1928-1931.
Lee S, Bashford D, Karplus M, Weaver DL. 1987. Brownian dynamics simulation of protein folding: A study of the diffusion-collision model. Biopolymers 26:481-509.
Levinthal C. 1968. Are there pathways for protein folding? J Chim Phys 65:44-45.
Lim WA, Farrugio DC, Sauer RT. 1992. Structural and energetic consequences of disruptive mutations in a protein core. Biochemistry 31:4324.
Lim WA, Sauer RT. 1991. The role of internal packing interactions in determining the structure and stability of a protein. J Mol Biol 219:359-376.
Lipman DJ, Wilbur WJ. 1991. Modeling neutral and selective evolution of protein folding. Proc Roy Soc B245:7-11.
Maranas DC, Androulakis IP, Floudas CA. 1995. A deterministic global optimization approach for the protein folding problem. DIMACS series in Discrete Mathematics and Theoretical Computer Science 23:133-150.
Matthews BW. 1993. Structural and genetic analysis of protein stability. Annu Rev Biochem 62:139-160.
Miller DW, Dill KA. 1995. A statistical mechanical model for hydrogen exchange in globular proteins. Protein Sci 4:1860-1873.
Miller DW, Dill KA. 1997. Ligand binding to proteins: The binding landscape model. Protein Sci 6:2166-2179.
Miller R, Danko CA, Fasolka MJ, Balacz AC, Chan HS, Dill KA. 1992. Folding kinetics of proteins and copolymers. J Chem Phys 96:768-780.
Minor DL, Kim PS. 1994. Context is a major determinant of beta-sheet propensity. Nature 371:264-267.
Mirsky AE, Pauling L. 1936. On the structure of native, denatured, and coagulated proteins. Proc Natl Acad Sci USA 22:439-447.
Morawetz H. 1985. Polymers: The origins and growth of a science. New York: Wiley.
Munson M, Balasubramanian S, Fleming KG, Nagi AD, et al. 1996. What makes a protein a protein: Hydrophobic core designs that specify stability and structural properties. Protein Sci 5:1584-1593.
Munson M, O'Brien R, Sturtevant JM, Regan L. 1994. Redesigning the hydrophobic core of a four-helix-bundle-protein. Protein Sci 3:2015-2022.
Onuchic JN, Luthy-Schulten Z, Wolynes PG. 1997. Theory of protein folding: The energy landscape perspective. Annu Rev Phys Chem 48:545-600.
Pande VS, Grosberg AY, Tanaka T, Rokhsar DS. 1998. Pathways for protein folding: Is a new view needed? Curr Op Struc Biol 8:68-79.
Pande VS, Rokhsar DS. 1999. Folding pathway of a lattice model for proteins. Proc Natl Acad Sci USA 96:1273-1278.
Pascher T, Chesick JP, Winkler JR, Gray HB. 1996. Protein folding triggered by electron transfer. Science 271:1558-1560.
Pauling L, Corey RB. 1951a. Atomic coordinates and structure factors for two helical configurations of polypeptide chains. Proc Natl Acad Sci USA 37:235-240.
Pauling L, Corey RB. 1951b. The pleated sheet, a new layer configuration of polypeptide chains. Proc Natl Acad Sci USA 37:251-256.
Pauling L, Corey RB. 1951c. The structure of fibrous proteins of the collagen-gelatin group. Proc Natl Acad Sci USA 37:272-281.
Pauling L, Corey RB. 1951d. Configurations of polypeptide chains with favored
orientations around single bonds: Two new pleated sheets. Proc Natl Acad Sci USA 37:729-740.
Pauling L, Corey RB, Branson HR. 1951. The structure of proteins: Two hydrogen-bonded helical configurations of the polypeptide chain. Proc Natl Acad Sci USA 37:205-211.
Phillips AT, Rosen JB, Walke VH. 1995. Molecular structure determination by convex global underestimation. DIMACS series in Discrete Mathematics and Theoretical Computer Science 23:181.
Poland DC, Scheraga HA. 1970. Theory of the helix-coil transition. New York: Academic Press.
Predki PF, Agrawal V, Brunger AT, Regan L. 1996. Amino-acid substitutions in a surface turn modulate protein stability. Nature Struct Biol 3:54-58.
Ramachandra Shastry MC, Roder H. 1998. Evidence for barrier-limited protein folding kinetics on the microsecond time scale. Nature Struct Biol 5:385-392.
Ramachandra Shastry MC, Saunder JM, Roder H. 1998. Kinetic and structural analysis of submillisecond folding events in cytochrome c. Acc Chem Res 31:717-725.
Reidhaar-Olson JF, Sauer RT. 1988. Combinatorial cassette mutagenesis as a probe of the informational content of protein sequences. Science 241:53-57.
Riddle DS, Santiago JV, BrayHall ST, Doshi N, Grantcharova VP, Yi Q, Baker D. 1997. Functional rapidly folding proteins from simplified amino acid sequences. Nature Struct Biol 4:805-809.
Roy S, Ratnaswamy G, Boice JA, Fairman D, McLendon G, Hecht MH. 1997. A protein designed by binary patterning of polar and nonpolar amino acids displays native-like properties. J Am Chem Soc 116:5302-5306.
Schafmeister CE, LaPorte SL, Miercke LJW, Stroud RM. 1997. A designed four helix bundle protein with native-like structure. Nature Struct Biol 4:1039-1046.
Schellman JA. 1958. The factors affecting the stability of hydrogen-bonded polypeptide structures in solution. J Chem Phys 62:1485.
Scholtz JM, Baldwin RL. 1992. The mechanism of α-helix formation by peptides. Annu Rev Biophys Biomol Struct 21:95-118.
Shiraki K, Nishikawa K, Goto Y. 1995. Trifluoroethanol-induced stabilization of the α-helical structure of β-lactoglobulin: Implication for non-hierarchical protein folding. J Mol Biol 245:180-194.
Smith CK, Regan L. 1995. Guidelines for protein design: The energetics of beta-sheet side chain interactions. Science 270:980-982.
Stanley HE. 1971. Introduction to phase transitions and critical phenomena. Oxford: Oxford Press.
Tang KES, Dill KA. 1998. Native protein fluctuations: The conformational-motion temperature and the inverse correlation of protein flexibility with protein stability. J Biomol Struct Dyn 16:397-411.
Thomas PD, Dill KA. 1993. Local and nonlocal interactions in globular proteins and mechanisms of alcohol denaturation. Protein Sci 2:2050-2065.
Tsai C-J, Kumar S, Ma B, Nussinov R. 1999. Folding funnels, binding funnels, and protein function. Protein Sci 8:1179-1188.
Wetlaufer DB. 1973. Nucleation, rapid folding, and globular intrachain regions in proteins. Proc Natl Acad Sci USA 70:697-701.
Williams S, Causgrove TP, Gilmanshin R, Fang KS, Callendar RH, Woodruff WH, Dyer RB. 1996. Fast events in protein folding: Helix melting and formation in a small peptide. Biochemistry 35:691-697.
Wu LC, Kim PS. 1997. Hydrophobic sequence minimization of the α-lactalbumin molten globule. Proc Natl Acad Sci USA 94:14314-14319.
Yapa K, Weaver DL, Karplus M. 1992. β-sheet coil transitions in a simple polypeptide model. Proteins 12:237-265.
Yue K, Dill KA. 1995. Forces of tertiary structural organization in globular proteins. Proc Natl Acad Sci USA 92:146-150.
Zimm BH, Bragg J. 1959. Theory of the phase transition between helix and random coil in polypeptide chains. J Chem Phys 31:526.
Zwanzig R, Szabo A, Bagchi B. 1992. Levinthal's paradox. Proc Natl Acad Sci USA 89:20-22.