A Survey of Left-handed Helices in Protein Structures
Marian Novotny1,2 and Gerard J. Kleywegt1*
Linnaeus Centre for Bioinformatics, Biomedical Centre, Box 598, SE-751 24 Uppsala, Sweden

All naturally occurring amino acids with the exception of glycine contain one or more chiral carbon atoms and can therefore occur in two different configurations, L (levo, left-handed) and D (dextro, right-handed). Proteins are almost exclusively built from L-amino acids. The stereochemical bias of nature is further reflected at the secondary structure level where right-handed helices are strongly preferred over left-handed helices.

The handedness of helices has not received much attention in the past and is often overlooked during the analysis, description and deposition of experimentally solved protein structures. Therefore, an extensive survey of left-handed helices in the Protein Data Bank (PDB) was undertaken to analyse their frequency of occurrence, length, amino acid composition, conservation and possible structural or functional role.

All left-handed helices (of four or more residues) in a non-redundant subset of the PDB were identified using hydrogen-bonding analysis, comparison of related structures, and experimental electron density assessment to filter out likely spurious and artefactual hits. This analysis yielded 31 verified left-handed helices in a set of 7284 proteins. The φ angles of the residues in the left-handed helices lie between 30° and 130° and the ψ angles lie between −50° and 100°. Most of the helices are short (four residues) and for 87% of them, it was possible to determine that they are important for the stability of the protein, for ligand binding, or as part of the active site. This suggests that, even though left-handed helices are rare, when they do occur, they are structurally or functionally significant.

Four secondary structure assignment programs were tested for their ability to identify the handedness of the helices. Of these programs, only DSSP correctly assigns the handedness.

Keywords: left-handed helix; protein fold; protein structure; secondary structure; structure-function relations
*Corresponding author

Nature shows a profound right-left asymmetry.1 DNA appears mainly in the right-handed B-conformation as do α-helices in proteins. This right-handed preference of biological macromolecules is a consequence of the selective incorporation of L-amino acids into proteins and of D-monosaccharides into DNA. The basis for this selectivity remains unknown, although a number of explanations have been proposed.1

α-Helices composed of L-amino acids are energetically more favourable in a right-handed conformation than in the left-handed mirror image of this arrangement due to steric hindrance between side-chain atoms and the main-chain carbonyl moiety.2 Conversely, D-amino acids will form more stable α-helices with a left-handed than with a right-handed conformation. This phenomenon is perhaps most convincingly illustrated by the fact that the structures of all-D proteins, such as D-rubredoxin3 and D-monellin,4 are the mirror images of the corresponding all-L proteins.

α-Helices are characterised by a typical hydrogen bonding pattern in which the main-chain carbonyl oxygen of residue i forms a hydrogen bond with the main-chain amide hydrogen of residue i+4. Other types of helices can also occur in proteins, namely 3₁₀ and π-helices with (i, i+3) and (i, i+5)
232 Left-handed Helices
functional importance due to their unique structural parameters.5 To our knowledge, the handedness of helices in proteins has not been studied systematically, and not much is known about left-handed helices other than that they are very rare.

Stretches of amino acids with unusual backbone conformations (e.g. left-handed helices) often appear at ligand-binding sites, protein-protein interfaces or other functional sites. It has been suggested that proteins may sacrifice a part of their stability to form an effective functional site.6,7 It has further been suggested that searches for regions with unusual backbone conformations could be used for the annotation of novel protein structures, since such regions are candidates for being functional sites.7

Little is known about how residues with left-handed helical conformations affect the stability of proteins. It is often assumed that such residues (with the exception of glycine residues) would suffer steric clashes between main-chain and Cβ atoms, thereby reducing the stability of the protein.2 However, Takano et al. found that five of six non-glycine left-handed residues in lysozyme do not significantly impair the stability of lysozyme.8 On the other hand, these residues are scattered throughout the lysozyme sequence and do not form a left-handed helix. Scattered amino acids with a left-handed helical conformation also occur in type I′ turns, where two amino acids have this conformation, and in helix stop signal9 and Schellman motifs.10

To our knowledge, there are currently only three protein structure entries in the PDB that have a left-handed helix assigned. Two of these are four residues long and they are found in thermolysin (PDB code 8tln)11 and neutral protease (1npc).12 The third one is a three-residue long helix in granulocyte-colony-stimulating factor (1rhg).13 A PubMed (http://pubmed.gov/) search yielded one more left-handed helix, namely in spinach glycolate oxidase (1gox).14 Interestingly, this left-handed helix is a part of the active site of the enzyme. The left-handed helices in glycolate oxidase and granulocyte-colony-stimulating factor both have a hydrogen-bonding pattern consistent with a 3₁₀-helix.

A few years ago we discovered a left-handed helix that plays an important role in alanine racemase15 and that had previously been classified incorrectly as right-handed.16 We also noted at the time that several secondary structure assignment programs failed to annotate the handedness of this helix. These findings prompted us to undertake a survey of left-handed helices in the PDB with respect to their frequency of occurrence, their length, their sequence characteristics (if any), their conservation and, in particular, their possible functional or structural roles. We also investigated in more detail if (and how successfully) four secondary structure assignment programs assign the handedness of left-handed helices.

Detection and verification of left-handed helices

Initially, ideal left-handed helices of different lengths (four to ten residues) were constructed and used as templates in SPASM searches15 against subsets of the PDB.17 The hits obtained in these searches were used to define putative left-handed helices as being continuous stretches of at least four residues whose φ angles are all between 30° and 130° and whose ψ angles are all between −50° and 100°. Subsequently, a non-redundant subset of the PDB (version of September, 2003) was generated with the PISCES server.18 To produce a large enough subset, relaxed criteria were used in the generation of the subset: none of the pairs of protein chains had more than 90% sequence identity, and all crystal structures with a resolution better than 3.5 Å and an R-value less than 0.4 were included, and so were all proteins whose structure had been solved by nuclear magnetic resonance (NMR) spectroscopy. The minimum chain length was set to ten amino acid residues. The resulting subset contained 7284 protein chains from 6535 PDB entries and included 1,687,315 amino acids. A Perl program was written to find all instances of left-handed helices (according to the definition above) in this subset. Initially, wider ranges for the torsion angles were used (φ between −20° and 145° and ψ between −70° and 145°), and the hits that were obtained were examined to check if their hydrogen-bonding pattern was compatible with left-handed α- or 3₁₀-helices. The final ranges yielded no false negatives and relatively few false positives.

If any part of a structure satisfied our definition of a left-handed helix it was designated a putative hit. The hits were visualised with Deep View,19 and hydrogen-bonding patterns were analysed with Deep View, DSSP20 and HBPLUS.21 Putative left-handed helices were subjected to additional checks to validate them or to dismiss them as probable artefacts. Any hits in NMR structures were only accepted if the left-handed helix occurred in at least 50% of the models of the ensemble. For those crystal structures for which electron-density maps were available from the Uppsala Electron Density Server (EDS),22 the density for the putative helices was inspected. In cases where no map was available (including all hits found in NMR structures), the PDB was checked for structures of the same or related proteins and, if any were found, it was checked if the corresponding residues in the related structures had φ and ψ angles similar to those of the residues in the putative left-handed helix. If no related structures could be found in the PDB, we trusted the authors and accepted the hit as a true left-handed helix.
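
The torsion-angle definition and the NMR ensemble check described above can be illustrated with a minimal Python sketch. This is not the authors' Perl program: the function names, the representation of a chain as a list of (φ, ψ) pairs in degrees, the treatment of the range limits as inclusive, and the assumption that all models of an ensemble share the same residue indexing are choices made here purely for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Final torsion-angle windows quoted in the text (degrees).
PHI_RANGE = (30.0, 130.0)
PSI_RANGE = (-50.0, 100.0)
MIN_LENGTH = 4  # minimum number of consecutive residues

# One residue: (phi, psi); None marks an undefined angle (e.g. chain termini).
Torsion = Tuple[Optional[float], Optional[float]]


def is_left_handed(phi: Optional[float], psi: Optional[float]) -> bool:
    """True if both angles fall inside the windows (limits assumed inclusive)."""
    if phi is None or psi is None:
        return False
    return (PHI_RANGE[0] <= phi <= PHI_RANGE[1]
            and PSI_RANGE[0] <= psi <= PSI_RANGE[1])


def find_putative_helices(torsions: Sequence[Torsion],
                          min_length: int = MIN_LENGTH) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs (end exclusive) of qualifying stretches."""
    hits: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, (phi, psi) in enumerate(torsions):
        if is_left_handed(phi, psi):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                hits.append((start, i))
            start = None
    # A stretch may run to the very end of the chain.
    if start is not None and len(torsions) - start >= min_length:
        hits.append((start, len(torsions)))
    return hits


def accepted_in_ensemble(models: Sequence[Sequence[Torsion]],
                         segment: Tuple[int, int],
                         min_fraction: float = 0.5) -> bool:
    """True if the segment is left-handed in at least min_fraction of the models."""
    if not models:
        return False
    start, end = segment
    supporting = sum(
        1 for model in models
        if all(is_left_handed(phi, psi) for phi, psi in model[start:end])
    )
    return supporting >= min_fraction * len(models)
```

Such a scan only produces putative hits. The hydrogen-bonding analysis, electron-density inspection and comparison with related structures described in the text remain manual validation steps that this sketch does not attempt to reproduce.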

For each of the accepted left-handed helices, many sources of information (literature, contacts with authors, and bioinformatics resources such as SWISS-PROT,23 ProSite,24 PDBsum25 and Omim (http://www.ncbi.nlm.nih.gov/omim/)) were consulted to find information about the possible functional or structural significance of the residues in the helix.

We also investigated if the left-handed helices are recognised as such by the secondary structure assignment tools DSSP,20 STRIDE,26 Promotif27 and SecStruct,5 and in the annotation in the original PDB files. The frequencies of occurrence of the 20 natural amino acids in the left-handed helices were calculated for comparison with their frequencies in the entire subset of the PDB. The sequences of the left-handed helices were used to search a 95% sequence identity subset of the PDB with the PATINPROT server28 to find out if identical sequences occur in any other proteins whose structure is known. Any hits were then analysed to determine if the sequence displayed strong secondary structure preferences.

We also generated a non-redundant subset of the PDB with PISCES18 using a 25% cut-off for the sequence identity (all other criteria were the same as in the generation of the 90% subset) and located all left-handed helices in that subset. For comparison purposes, the set of proteins from the 90% subset that contained a left-handed helix was also pruned with PISCES to yield a set of left-handed-helix containing proteins with no more than 25% sequence identity.

Analysis of the left-handed helices in the PDB

Ideal left-handed helices of four to ten residues were used in SPASM searches for similar motifs in the PDB. This yielded 21 left-handed helices that were used to define the initial ranges of φ and ψ angle values for residues in left-handed helices, namely 0° to 125° for φ and −50° to 75° for ψ. Later, these ranges were extended (−20° to 145° for φ and −70° to 145° for ψ) to make sure that no true left-handed helices would be missed (false negatives). Eventually, the ranges were narrowed down to the final values (30° to 130° for φ and −50° to 100° for ψ), which ensured that there were no false negatives, and not too many false positives. A minimum length of four residues was imposed on potential left-handed helices to ensure that any such helices would contain at least one full turn.

The criteria for detecting left-handed helices were implemented in a Perl program that was run on all the members of the non-redundant subset of the PDB described earlier, an exercise that yielded 56 putative hits. All hits in structures determined by NMR were required to have a left-handed helical conformation in more than half of the models in the ensemble (to filter out spurious hits with little support in the experimental NMR data), which reduced the number of hits to 38. In six of the 38 cases, one or more closely related structures were available in which no support could be found for any left-handed helical conformation, and in one case there was no support in the electron density (although the structure had been determined at 1.9 Å resolution and refined to a crystallographic R-value of 0.2). Finally, one entry was accepted in good faith. After these validation steps, there remained 31 hits that were deemed to be genuine left-handed helices (Table 1). Based on hydrogen-bonding analysis, the left-handed helices were further divided into α-helices (11 cases) and 3₁₀-helices (20 cases). Two of the hits occurred in NMR structures.

The left-handed helices were short; the longest helix was six residues long, but the majority were just four residues long (Table 1). The distribution of the φ and ψ torsion angle values of all the left-handed helices is shown in Figure 1. For α-helices, the average values of φ and ψ were 59° (σ = 12°) and 42° (σ = 13°), respectively; for 3₁₀-helices, these values were 67° (σ = 21°) and 23° (σ = 25°), respectively. The 3₁₀-helices thus had a wider distribution of φ and ψ angles than the α-helices. The observed torsion angle ranges for the left-handed helices are similar to those defined by Gunasekaran et al. (20° to 125° for φ and −45° to 90° for ψ).9 The average φ and ψ angles for the left-handed helices are also similar to those observed for right-handed helices (ignoring the sign changes).29 The proteins that contain left-handed helices show no preference for overall secondary structure contents; they belong to the classes of mainly-alpha, mainly-beta and mixed alpha-beta proteins. Most of the left-handed helices are located on the protein surface, but there are no obvious patterns in the types and spatial orientations of flanking secondary structure elements.

All amino acid types except proline were encountered in left-handed helices. Proline residues are unlikely to appear in helices in general, because their imido nitrogen atom cannot donate a hydrogen bond and because the bulky Cδ methylene group attached to the nitrogen introduces severe steric clashes. Moreover, due to the inability of proline residues to assume positive φ values, they are not expected to occur in any left-handed helices at all, and this expectation is borne out by our observations. The second position in the helix is occupied by asparagine in 11 cases and by glutamate or glutamine in six more cases. The last two positions in the helix are often occupied by two identical residues, usually Gly-Gly (14 cases), and an even higher prevalence of glycine is apparent at the last position in the helices (19 cases). This is most likely because glycine is achiral and lacks a side-chain, so that left-handed and right-handed conformations are equally favourable. Although the sample size is very small, tryptophan has a surprisingly high propensity for being in left-handed helices, especially compared to its propensity for assuming a left-handed backbone conformation. Some additional data can be found on our web site (http://xray.bmc.uu.se/~marian/left).

Sometimes right-handed helices have a Schellman turn as a C-cap.10,30 In such a motif, the helix is terminated by a residue with a left-handed conformation, and 5→2 or 6→1 hydrogen bonds are usually formed. Of the present set of left-handed helices, 13 contain a left-handed version of the Schellman turn. Whenever two consecutive residues have a helical conformation of opposite handedness (i.e. RL or LR), they form a motif called a nest.31 Analysis of the 31 left-handed helices and their flanking residues revealed that six of them contain both an RL nest at their N terminus and an LR nest at their C terminus, whereas ten of them contain neither. One helix only contains an RL nest, whereas 14 helices contain only an LR nest.

To assess if the short amino acid sequences of the left-handed helices always occur in such secondary structure elements, all occurrences of each of the 31 sequences were located in a 95% sequence-identity subset of the PDB with the PATINPROT server.28 As expected,32 none of the sequences found in the left-handed helices showed any strong secondary structure preferences. Most were found to occur in all three secondary structure types (helix, strand and loop), e.g. the sequence AQGG originally found in P1 nuclease (PDB code 1ak0) as a left-handed helix was found in ten other proteins in the PDB subset where it appeared as a helix, a strand, a loop and combinations thereof. The sequence of the longest left-handed helix (HAGEGG) was not found in any other protein in the PDB.
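
The nest bookkeeping described above (an RL pair at the N terminus of a left-handed helix, an LR pair at its C terminus) can be sketched in a few lines of Python. The input format is an assumption made for illustration: a string with one character per residue, "R" for right-handed helical, "L" for left-handed helical and any other character for everything else. How each residue is assigned to one of these regions is not specified here and is left to the caller. The function name `classify_nests` is likewise hypothetical.

```python
from typing import Dict


def classify_nests(handedness: str, start: int, end: int) -> str:
    """Classify the nests flanking a left-handed helix at [start, end).

    `handedness` holds one character per residue ("R", "L" or other);
    this encoding is an illustrative assumption, not the authors' format.
    """
    # RL nest: the residue just before the helix is right-handed helical.
    has_rl = start > 0 and handedness[start - 1] == "R"
    # LR nest: the residue just after the helix is right-handed helical.
    has_lr = end < len(handedness) and handedness[end] == "R"
    if has_rl and has_lr:
        return "both"
    if has_rl:
        return "RL only"
    if has_lr:
        return "LR only"
    return "neither"


# Counts reported in the text for the 31 verified helices.
REPORTED_COUNTS: Dict[str, int] = {
    "both": 6,
    "RL only": 1,
    "LR only": 14,
    "neither": 10,
}

# The four categories are mutually exclusive, so they should add up to 31.
assert sum(REPORTED_COUNTS.values()) == 31
```

For example, `classify_nests("RLLLLR", 1, 5)` gives "both", because the four-residue left-handed stretch is flanked by a right-handed residue on each side.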

The extent to which left-handed helices are conserved between different sequence families at the same homologous superfamily level of the CATH classification33 was also investigated. It was found that none of the helices that occurred in proteins that had been classified in CATH are conserved across all corresponding sequence families. This suggests that the left-handed helix motifs have evolved relatively recently and that they serve a specific purpose.

Figure 1. Distribution of φ and ψ torsion angles of residues found in left-handed helices. Pink squares represent residues in 3₁₀-helices and blue diamonds represent residues in α-helices. (For interpretation of the references to colour in this Figure legend, the reader is referred to the web version of this article.)

To assess the effect of using different cut-offs for the allowed level of sequence identity during the generation of reduced subsets of the PDB, the analysis was repeated with a 25% subset. Whereas the 90% subset yielded 31 left-handed helices, only 13 such helices were found in the 25% subset. However, if the set of 31 proteins that contain a left-handed helix is in turn reduced to a subset of proteins that have no more than 25% sequence identity, the number of hits is as high as 18. This discrepancy can be explained by realising that the two processes of selecting a subset of the PDB using a cut-off on the allowable sequence identity level and that of testing a set of proteins for a particular property (in this case, the presence of left-handed helices) are not commutative. In this case, 13 is the number of left-handed helices found in a 25% subset of the PDB, whereas 18 is the size of a 25% PDB subset of the proteins that are known to contain a left-handed helix.

Finally, four secondary structure assignment programs (DSSP, STRIDE, PROMOTIF and SecStruct) were tested for their ability to recognise the handedness of left-handed helices (Table 2). In addition, the assignments of these helices in the parent PDB files were retrieved. Significant differences in the results of the programs were observed. In fact, in none of the cases did the programs produce an identical result. Only two of the programs, DSSP and SecStruct, provided any information at all about the handedness of the secondary structure elements (both of them in a column labelled chirality in the output). The results obtained with the oldest program, DSSP, agree best with our own manual assignments. Good agreement was also obtained with the results of SecStruct, but the chirality assignments of this program were often misleading. Interestingly, the handedness of the helices appears to have escaped the attention of most of the depositors and annotators of the structures. As far as we could determine, only four of the 31 left-handed helices were described as such in the corresponding PDB file (1npc, 8tln) or original papers (1b9w, 1n1i).

Function of left-handed helices

For each of the 31 left-handed helices it was investigated if they are important for the structure or function of the parent protein. The results are as follows (the hits have been grouped according to sequence and structure similarity):

Alanine racemase

This enzyme mediates the interconversion of L- and D-alanine, which is indispensable for cell wall formation. Alanine racemase (PDB code 1bd0) was already known to contain a left-handed helix.15 The catalytic residue is Lys39 and the left-handed α-helix covers residues 40 to 44, where Tyr43 is the covalent attachment site for the cofactor pyridoxal phosphate (Figure 2(a)). The carbonyl oxygen atom of Gly44 forms an interdomain hydrogen bond with Arg366.16 The entire helix is part of the PROSITE pattern (PS00395) for the alanine racemase family.

EGF-like family

A left-handed helix was identified in seven proteins with EGF-like domains (PDB codes 1aut, 1b9w, 1g2l, 1kli, 1n1i, 1ob1 and 1rfn). Four of these are blood-clotting proteins (factor VII, factor IX, factor X and protein C) and three are merozoite surface proteins from various Plasmodium species. These two functionally distinct groups show weak but detectable sequence similarity, mainly due to conserved cysteine residues that form disulphide bridges, but also in a short stretch of sequence containing the left-handed helix. The hydrogen-bonding pattern of these helices classifies them as 3₁₀-helices rather than α-helices.

The seven proteins have a conserved Asn residue in their left-handed helices that forms an important hydrogen bond34 either with residue i+5 (protein C and factor VII) or i+7 (merozoite surface proteins) in the EGF-like domain, or with residues in the protease domain (factor IX and factor X). There is also some evidence that the left-handed helix in factor IX is involved in the binding of factor VIII, which is its natural interaction partner in the blood-clotting cascade. Double mutation of Asn89 and Asn92 to alanine completely abolishes the interaction between factor IX and factor VIII.35 However, the single mutation of Asn89 to alanine has only a marginal effect on binding.36,37

The two consecutive glycine residues in the left-handed helix have been implicated as protein-stabilising factors in the merozoite surface proteins.38 The first amino acid in the helix in protein C (Asp101) participates in an unusual hydrogen bonding interaction with another acidic residue (Glu85).39 Mutations of three residues in the left-handed helix, namely Asp101, Gly103 and Gly104, cause protein C deficiency which may lead to recurrent venous thrombosis, which in turn may cause neonatal death.40-42

residue is the substrate-binding residue in this family of proteins as well.

Endostatins

Left-handed α-helical segments (with identical sequences) were identified in three different endostatins (PDB codes 1bnl, 1dy2, 1koe). A conserved cysteine residue in these helices participates in a disulphide bridge that is crucial for the stability of these proteins.52-54

Proteins from peptidase family M14

A left-handed α-helix was found in three related proteases, namely, in thermolysin (PDB code 8tln), aurolysin (1bqb) and neutral protease (1npc), and the sequence of the helix is identical in all three. The side-chain of the first amino acid in the helix, aspartate, forms a hydrogen bond with a histidine that coordinates the zinc ion in the active site.12 There is also a hydrogen bond between the backbone of this aspartate and the zinc-coordinating histidine, which further underscores the importance of the helical segment for the catalytic mechanism of these proteases.

RNA and DNA. The enzyme contains a short left-handed α-helix (Ala131 to Gly134) that forms a part of the active site cleft of the enzyme. Asn135, immediately following the helix, forms a hydrogen bond with the substrate.57 This enzyme contains zinc ions, and one of them is coordinated by His126, which lies five residues upstream of the left-handed helix.

2,5-Diketo-D-gluconic acid reductase (DkgA)

DkgA (PDB code 1mzr) is an enzyme in the metabolic pathway to vitamin C. A short 3₁₀-helix was identified involving amino acids 191 to 194. Leu190 of this enzyme coordinates the phosphate group of the cofactor NADPH (S. Jeudy, personal communication).

Remaining cases

For the remaining proteins that contain a left-handed helix (PDB codes 1hxx, 1kws, 1jv1 and 1h21) no indication could be found as to their possible structural or functional roles.

Concluding remarks
