
Journal of Computer-Aided Molecular Design (2020) 34:231–252

https://doi.org/10.1007/s10822-020-00277-2

Conformational analysis of macrocycles: comparing general and specialized methods

Gustav Olanders1 · Hiba Alogheli1 · Peter Brandt2 · Anders Karlén1

Received: 29 July 2019 / Accepted: 3 January 2020 / Published online: 21 January 2020
© The Author(s) 2020

Abstract 
Macrocycles represent an important class of medicinally relevant small molecules due to their interesting biological properties.
Therefore, a firm understanding of their conformational preferences is important for drug design. Given the importance of
macrocycle-protein modelling in drug discovery, we envisaged that a systematic study of both classical and recent specialized
methods would provide guidance for other practitioners within the field. In this study we compare the performance of the
general, well established conformational analysis methods Monte Carlo Multiple Minimum (MCMM) and Mixed Torsional/
Low-Mode sampling (MTLMOD) with two more recent and specialized macrocycle sampling techniques: MacroModel mac-
rocycle Baseline Search (MD/LLMOD) and Prime macrocycle conformational sampling (PRIME-MCS). Using macrocycles
extracted from 44 macrocycle-protein X-ray crystallography complexes, we evaluated each method based on its ability to (i)
generate unique conformers, (ii) generate unique macrocycle ring conformations, (iii) identify the global energy minimum, (iv)
identify conformers similar to the X-ray ligand conformation after Protein Preparation Wizard treatment (X-rayppw), and (v) to
the X-rayppw ring conformation. Computational speed was also considered. In addition, conformational coverage, as defined
by the number of conformations identified, was studied. To study the relative energies of the bioactive conformations, we
calculated and analysed two energy differences: between the global energy minima and the energy minimized X-rayppw structures,
and between the global energy minima and the MCMM-Exhaustive (1,000,000 search steps) generated conformers closest to the
X-rayppw structure. All searches were performed using relatively short run times (10,000 steps for MCMM, MTLMOD
and MD/LLMOD). To assess the performance of the methods, they were compared to an exhaustive MCMM search using
1,000,000 search steps for each of the 44 macrocycles (requiring ca 200 times more CPU time). Prior to our analysis, we also
investigated if the general search methods MCMM and MTLMOD could also be optimized for macrocycle conformational
sampling. Taken together, our work concludes that the more general methods can be optimized for macrocycle modelling by
slightly adjusting the settings around the ring closure bond. In most cases, MCMM and MTLMOD with either standard or
enhanced settings performed well in comparison to the more specialized macrocycle sampling methods MD/LLMOD and
PRIME-MCS. When using enhanced settings for MCMM and MTLMOD, the X-rayppw conformation was regenerated with
the greatest accuracy. MD/LLMOD emerged as the most efficient method for generating the global energy minima.
Graphic abstract

Keywords  Macrocycles · Conformational sampling · Drug design

Electronic supplementary material  The online version of this article (https://doi.org/10.1007/s10822-020-00277-2) contains
supplementary material, which is available to authorized users.

Extended author information available on the last page of the article


Introduction

Computational modelling has transformed the strategic decision making process in drug discovery, both reducing costs and improving efficiency [1]. Prominent areas of contribution include pharmacophore-based and shape-based virtual screening [2-4], and docking [5] of drug candidates to their protein targets. Although these methods use different approaches, they all require conformational data as an input. Conformational sampling is also required for other computational techniques employed in medicinal chemistry, for example drug design [6], drug permeability prediction [7], NMR data interpretation [8-11], and fitting molecules to X-ray electron density maps [12]. Therefore, it is of great importance to have reliable and efficient methods for conformer generation.

Recently, macrocycles (herein defined as cyclic compounds with a ring size of 10 atoms or more) have gained increased importance in drug discovery because of their unique properties [6, 13-17]. Macrocycles may possess cell permeability better than expected by the “rule-of-five” [6, 17, 18], improved metabolic stability [11], enhanced binding properties to featureless binding sites [19], as well as the ability to disrupt protein–protein interactions [19-22]. However, macrocycles often present a significant synthetic challenge [13, 23-25]. It is therefore of great importance to develop and improve computational methods, such as conformational analysis, to focus the design of new macrocyclic ligands [26, 27].

In 1990, Saunders et al. performed a conformational analysis study on cycloheptadecane, aiming to identify the best method for searching large ring structures [28]. After evaluating systematic and random search methods, as well as molecular dynamics and a distance geometry method, they concluded that cycloheptadecane was lying at the boundary of what could be addressed with the technology of the time. In recent times, many new algorithms for exploring molecular potential energy surfaces have been developed, e.g. LMOD [8], LLMOD [29], MTLMOD [30], LowModeMD [31], MD/LLMOD [32], PRIME-MCS [33], ForceGen [34], BRIKARD [35], PLOP [36], a DFT-D3/COSMO-RS based method [37], and, most recently, Conformator [38]. However, conformational sampling of macrocycles is still considered a challenging task [36, 39]. To provide guidance for other practitioners within the field we compare the conformational search capabilities of four different methods with respect to sampling the conformational space of macrocycles.

In the current study, we use a data set of 44 protein-macrocycle complexes (38 unique ring systems) [40], where the majority of the structures originated from the commonly used data set of Watts et al. [32]. In terms of sampling methods, we decided to include the general Monte Carlo Multiple Minimum (MCMM) method since it has not yet been extensively applied towards macrocycle sampling. The MCMM algorithm was published by Chang et al. in 1989 [41] and is implemented in the Schrödinger software MacroModel. In 1989, yet another conformational search algorithm called random incremental pulse search (RIPS) was published by Ferguson and Raber [42]. Today, a similar approach to RIPS, called stochastic search, is implemented in the Chemical Computing Group’s Molecular Operating Environment (MOE) software [43]. The MCMM and MOE stochastic search methods are not built upon the same search algorithm and, therefore, we expect differences in their performance. Whilst the MOE stochastic search algorithm has often been used in conformational analysis comparison studies, MCMM has not been utilized in this capacity. Thus, we wanted to investigate the performance of MCMM applied on macrocycles. Another more general method, the Mixed Torsional/Low-Mode (MTLMOD) sampling conformational search method, was also included in this study. This method has been used in several recent publications to sample the conformational space of both macrocycles and non-macrocyclic structures [44]. Finally, we wanted to compare these more general conformational search methods with two more recent specialized macrocycle sampling methods: MacroModel macrocycle Baseline Search (MD/LLMOD) [32] and Prime macrocycle conformational sampling (PRIME-MCS) [33]. MD/LLMOD combines a short molecular dynamics simulation with Large-Scale Low-Mode steps. In comparison, PRIME-MCS splits the macrocycle backbone into two pieces, sampling them independently using predefined angle libraries before reconnecting the pieces again [33]. Before performing the method comparison, we investigated if the general methods (MCMM and MTLMOD) could also be optimized for macrocycle conformational sampling. Based on the findings in the optimization step, we performed the method comparison study where all search methods (including both standard and enhanced MCMM and MTLMOD) were benchmarked against an exhaustive MCMM search using 1,000,000 search steps. After comparatively evaluating these methods, we addressed the conformational coverage, as well as the energy difference between the conformer closest to the X-rayppw conformation and the “global energy minimum”. The workflow of the study is summarized in Fig. 1.

Methods

Unless otherwise stated, all calculations were performed within the Schrödinger Small-Molecule Drug Discovery Suite 2017–1 [45] using the OPLS3 [46] force field with the GB/SA continuum solvation model for water [47].


When setting up a calculation in the Maestro GUI, some methods use kJ mol−1 and others kcal mol−1. Therefore, we decided to present the settings in the unit used, along with the alternative unit in parenthesis. All graphs presented herein were made in Python [48] and R [49]. Figures were made in Microsoft PowerPoint [50]. PCA-models (including score and loading plots) were made in SIMCA [51]. Molecular modeling figures were made in PyMOL [52].

Fig. 1  A graphical summary of the study design

Data set selection

The macrocycles in the 47 protein–ligand complexes previously published by Alogheli et al. [40] were used in the present study. The three duplicate structures 2IYF, 1FKJ, and 1YET (Erythromycin in 2IYF/3FRQ, Tacrolimus in 1FKJ/4NNR, and Geldanamycin in 1YET/2ESA) present in the Alogheli data set were removed from the data set to give 44 macrocycles.

X-ray structures preparation

All 44 X-ray structures were downloaded from the Protein Data Bank (PDB) [53, 54] and prepared using the Protein Preparation Wizard [55, 56] in Maestro [57] using default options as described below. In the previous docking study by Alogheli et al. [40] the OPLS-2005 force field was used for preparing the structures, while in the present work we used the OPLS3 force field. The PDB structures were therefore reprocessed using OPLS3. Missing side-chains were added by Prime side-chain predictions [58-60]. In cases where residues had alternate positions, the first listed position, or the position with the highest average occupancies, was selected. The ligand tautomer and ionization state, as well as protein protonation states, were the same as used in the study by Alogheli et al. [40] (see the tautomer and ionization state of the structures in Table 1). Furthermore, the hydrogen bond networks of the protein–ligand complexes were optimized and water molecules forming fewer than three hydrogen bonds to non-waters were removed. Finally, the protein–ligand complexes were energy minimized using default settings where heavy atoms were displaced no more than 0.3 Å Root Mean Square Deviation (RMSD).

Table 1  Structures of the Macrocycles in the Tautomer/Ionization States Used for Conformational Analysis
The PDB code of the complex structure is shown to the lower left, whereas the ligand name (when given) and the ligand code are shown to the upper left and lower right, respectively. Macrocycles that were included in the subset are marked with “subset” to the upper right.

Selecting a non-biased starting conformation

To compare strategies for generating a non-biased starting conformation, two approaches were used. The most commonly employed method involves converting the X-ray structure to SMILES format, whilst retaining stereochemical information, and then converting it back to the 3D structure. Accordingly, all macrocycle X-ray structures in the current study were converted to SMILES codes and then back to their 3D structures using LigPrep, before these conformational geometries were compared to their corresponding X-ray structures. Alternatively, we also applied a more elaborate approach where we performed an MCMM conformational sampling of each macrocycle (starting from the SMILES generated structure) using 10,000 search steps and an energy window for keeping conformers of 62.8 kJ mol−1 (15.01 kcal mol−1). The conformer with the highest RMSD to the X-ray conformation was selected and further analyzed (hereafter called “starting conformer”). To compare the two strategies, we used a conformational clustering tool and calculated the torsional RMSD, including all dihedral angles except those involving terminal atoms, i.e. methyl hydrogens. The two strategies were also compared by calculating the number of torsional angles that deviated more than 120° or 60°, respectively, from the torsional angles found in the X-rayppw conformation (excluding
terminal atom dihedral angles). We also included a comparison to the energy minimized X-rayppw structure. All calculations were performed by running a script in the command script editor followed by a separate Python script.

The conformer most dissimilar to the X-rayppw structure after an MCMM search was selected as the “starting conformer” and was used for all conformational sampling studies performed in this study. A similar approach for generating the starting conformation was used by Coutsias et al. [35].

Conformational sampling

All methods except MD/LLMOD and MCMM-Exhaustive were run in triplicate using different seeds.

Conformational sampling using MCMM and MTLMOD

The Monte Carlo Multiple Minimum (MCMM) and the Mixed torsional/Low-Mode sampling (MTLMOD) search methods implemented in MacroModel [61] were run with 10,000 steps in total for each compound. The option of using a fixed number of steps per rotatable bond, as well as the Multi-Ligand option, were deselected. Torsional sampling of amides, esters, as well as all C–N and C–O single bonds and C=N and N=N double bonds, was allowed in the search (“extended sampling”). For energy minimizations, up to 50,000 steps of Truncated Newton minimization (TNCG) [62] with a gradient convergence criterion of 0.05 kJ Å−1 mol−1 were used (the minimization terminates when the convergence criterion is met). A 0.5 Å distance threshold between any pair of heavy atoms (and O–H, S–H) was used for elimination of redundant conformers. The energy window for keeping conformers was set to 62.8 kJ mol−1 (15.01 kcal mol−1). For MTLMOD, the probability of a torsion rotation/molecule translation was set to the default value of 0.5. Also, the minimum and maximum distance for low-mode moves were kept at default values of 3.0 and 6.0 Å, respectively. Random seeding was achieved by modifying the .com files, see section “Random seeding for MCMM and MTLMOD” in supporting information.

Conformational sampling using one million search steps (MCMM-Exhaustive)

In this MCMM search the same settings as mentioned above were used except that stereocenters adjacent to ring closures were avoided and a wider ring-opening criterion was used (0–100 Å). This search will be referred to as MCMM-Exhaustive and 1,000,000 search steps were used for each compound.

Conformational sampling using MD/LLMOD

For MacroModel Macrocycle Baseline Search (MD/LLMOD) the energy window for keeping conformers was set to 15.01 kcal mol−1 (62.8 kJ mol−1) and the torsion sampling option was set to extended mode, enabling ester and amide sampling. The remaining settings were left at their default values: elimination of redundant conformers using an RMSD of 0.75 Å, 5000 molecular dynamics simulation cycles and 5000 LLMOD (Large-scale Low-mode) search steps. Eigenvectors were determined for each new global minimum.

Conformational sampling using PRIME-MCS

PRIME-MCS was run from the command line. In short, PRIME-MCS was run in vacuum using the sampling intensity “thorough”, generating up to 1000 conformations. For more details about the PRIME-MCS syntax used, see “PRIME-MCS sampling-syntax” section in supporting information.

Exploring energy minimization method and ring closure settings on a diverse subset of 10 macrocycles

Selection of a diverse subset

Ten diverse macrocycles were chosen from the 44 macrocycles to represent the full data set using a principal component analysis (see section “Selection of a Diverse Subset.” in supporting information). The macrocycles in the subset are marked with “subset” in Table 1.

Conformational sampling of the macrocycles in the subset using MCMM and MTLMOD

The same settings as described above using 10,000 search steps were used, except that this study was performed using only one seed.

Minimization method

The PRCG and TNCG minimization methods were compared using the MCMM and MTLMOD methods. For energy minimizations, up to 50,000 steps of PRCG or TNCG minimization with a gradient convergence criterion of 0.05 kJ Å−1 mol−1 were used (the minimization terminates when the convergence criterion is met).
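
The energy windows quoted in the sampling settings above are given in both kJ mol−1 and kcal mol−1. As a minimal sketch (not the MacroModel implementation), the conversion and the window filter could be written as follows. The function names and the example energies are hypothetical, and the thermochemical calorie (1 kcal = 4.184 kJ) is assumed.

```python
# Assumption: thermochemical calorie, 1 kcal = 4.184 kJ.
KJ_PER_KCAL = 4.184


def kj_to_kcal(energy_kj):
    """Convert an energy from kJ/mol to kcal/mol."""
    return energy_kj / KJ_PER_KCAL


def kcal_to_kj(energy_kcal):
    """Convert an energy from kcal/mol to kJ/mol."""
    return energy_kcal * KJ_PER_KCAL


def within_window(energies_kj, window_kj=62.8):
    """Return indices of conformers whose energy lies within
    `window_kj` of the lowest energy in the list (illustrative only)."""
    e_min = min(energies_kj)
    return [i for i, e in enumerate(energies_kj) if e - e_min <= window_kj]


if __name__ == "__main__":
    # The two forms of the window used in the text agree after rounding.
    print(round(kj_to_kcal(62.8), 2))
    print(round(kcal_to_kj(15.01), 1))
    # Hypothetical conformer energies in kJ/mol.
    print(within_window([-100.0, -50.0, -37.0, 0.0]))
```

Keeping every energy in one unit internally and converting only when a setting is entered or reported avoids mixing the two scales when methods with different GUI units are compared.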


Ring closure criterion

The default ring closure criterion closes the opened ring systems if the distance between the ring-opened atoms is between 0.5 and 2.5 Å. It is recommended to use a wider ring closure criterion of ca. 0.1–5.0 Å for larger ring systems; therefore, this distance was evaluated [63]. A very wide ring closure criterion, 0–100 Å, was also evaluated.

Evaluation of the conformational search methods

The performance of the conformational search methods was evaluated with respect to the number of unique conformers generated, number of unique ring conformations, computational speed, ability to find the global minimum, and the ability to identify conformers similar to the experimentally determined X-ray conformation after Protein Preparation Wizard treatment (X-rayppw) and to the X-rayppw ring conformation.

Number of generated conformers and ring conformations

The number of generated conformers was extracted from the conformational search log files (.log files). The number of ring conformers generated by each method was investigated via the Redundant Conformation Elimination method implemented in MacroModel (for specialized settings see “Calculating the Number of Generated Conformers and Ring Conformations.” in supporting information). The heavy atoms in the macrocyclic ring were superimposed and redundant conformers were eliminated based on a maximum atom deviation cut-off of 0.5 Å. The Retain Mirror Image conformation option was used. The energy window for conformer selection was set to 62.8 kJ mol−1 (15.01 kcal mol−1).

Computational speed

To compare computational times between methods, the CPU times were extracted from the log files (.log files).

Identifying the global energy minimum

The global energy minimum conformer was considered as identified if a method generated a conformer with an energy difference not greater than 1 kJ mol−1 compared to the lowest energy conformer found by any method for that macrocycle (here assumed to correspond to the global energy minimum). As an additional evaluation, the similarity in geometry between the global energy minimum and the lowest energy conformer within 1 kJ mol−1 from the global energy conformer generated by the other methods was analyzed. Here two different RMSD values using only the heavy atoms in the macrocyclic ring and all heavy atoms, respectively, were calculated.

Producing a conformation similar to the X-rayppw conformation

The ability of different search methods to generate conformers similar to the experimentally determined conformation was evaluated by calculating the RMSD between the heavy atoms in the ligand X-ray structure after Protein Preparation Wizard treatment (called X-rayppw) and the generated conformers using the superposition tool in Maestro.

Producing a conformation similar to the X-rayppw ring conformation

The ability of the different search methods to generate ring conformations similar to the experimentally determined X-ray ring conformation was evaluated by calculating the RMSD_RING between the heavy ring atoms in the X-ray structure after Protein Preparation Wizard treatment (X-rayppw) and the generated conformers using the superposition tool in Maestro.

Results and discussion

This study aimed to evaluate the performance of the more general and well-established conformational analysis methods MCMM and MTLMOD in comparison with the new specialized macrocycle sampling techniques MD/LLMOD and PRIME-MCS. Given the importance of macrocycle-protein modelling in drug discovery, we envisaged that a systematic study of both classical and recent specialized methods would provide guidance for other practitioners within the field. In addition to assessing the relative performance of these conformational search methods, we also wanted to address the challenge of performing conformational analysis of large macrocyclic structures with many rotatable bonds. This included studying the degree of conformational space covered in a conformational search. The energy differences between the conformers most similar to the X-rayppw conformation and the lowest energy conformation identified were also studied. However, the default settings of the general methods have not necessarily been optimized to perform well on macrocycles [32]. Therefore, using 10 macrocycles as a representative subset of the full data set, we first investigated whether small changes to the MCMM and MTLMOD methods could enhance conformational sampling of these challenging ring systems and improve the X-rayppw reproduction accuracy. Thereafter, we used the full data set with macrocycles extracted from 44 crystal structures and compared the two general methods (with and without enhanced
settings) with the two more specialized methods PRIME-MCS and MD/LLMOD.

Conformational sampling can be run using many different settings. To enable fair comparison of the current work with previous studies, we opted to employ 10,000 search steps per structure, an amount that should be feasible for most modelling projects [33, 34, 41, 64]. To further align our work with the literature, an energy window of 15 kcal mol−1 [44] and up to 50,000 minimization steps [32] were used. For all methods except MD/LLMOD and MCMM-Exhaustive, three runs with different seeds were performed to assess how the stochastic element of the searches affected the results [44]. Finally, we ran an exhaustive conformational search using the MCMM-Enhanced settings and 1,000,000 search steps per structure to compare with the results obtained from the searches using only 10,000 search steps per structure.

It should be noted that when the MD/LLMOD method was developed it was trained on about two-thirds of the macrocycles used in this study, which could potentially bias the results [32].

Data set selection

In general, macrocycle conformational analysis studies have used structures obtained from both the PDB and the Cambridge Structural Database (CSD) [65], with the majority of the structures retrieved from the latter. A significant difference between these databases is that reported macrocycle structures are typically crystallised either with or without protein partners in the PDB and CSD, respectively. Whilst conformations of macrocycles reported in both the “free” and protein-bound state can be similar, they may also diverge significantly [36]. Given our emphasis on biologically relevant macrocycles, we directed efforts towards X-ray conformations extracted from the PDB as these protein–ligand complexes can be considered as the bioactive, bound-state conformations. This allowed us to exclusively study if the conformational analysis methods could produce conformations similar to the protein-bound macrocycle conformations.

Currently, there are several macrocycle data sets publicly available. Two of the most frequently used were published by Chen and Foloppe [44] and Watts et al. [32]. In the present study we used the 47 protein-macrocycle complexes previously published by Alogheli et al. [40]. Briefly, the Alogheli et al. data set originates from the 150 structures collected by Watts et al. However, all 83 structures obtained from the CSD were removed. The remaining 67 PDB structures were further filtered where structures with either a ring size below 10 atoms, an overall resolution above 2.5 Å, poor ligand resolution, uncertain stereochemistry, predominantly solvent exposed ligands, extensive ligand-ligand interactions and structures that did not overlap with the binding site produced by SiteMap [66], were removed. After this, 31 structures remained. Subsequently, using the same criteria, 16 PDB structures containing macrocycles were added, which gave 47 structures altogether. As described in the Methods section, there are three duplicate structures in the Alogheli data set (Erythromycin in 2IYF/3FRQ, Tacrolimus in 1FKJ/4NNR, and Geldanamycin in 1YET/2ESA). To avoid duplicate sampling, 2IYF, 1FKJ, and 1YET were removed from the present data set. Thus, the number of unique macrocycles in our data set is 44 and the number of unique ring systems is 38.

The macrocycle ring sizes varied between 11 and 29 ring atoms (median 16), and the number of rotatable bonds ranged from 8 to 47 (median 23), see Table 1 for 2D structures and Table 2 for a summary of some characteristics of the data set.

Table 2  Characteristics of the Full Data set Consisting of 44 Macrocycles

Property                         Average   Median   Minimum   Maximum
PDB resolution (Å)               1.88      1.88     0.95      2.50
Ring size                        17        16       11        29
#Torsional angles sampled (a)    25        23       8         47
Molecular weight                 571       538      280       1041
DonorHB (b)                      2.5       2.0      0         9.3
AcceptHB (c)                     12.0      11.2     5.3       26.9
QPlogPo/w (d)                    2.7       2.8      −2.6      6.8
PSA (e)                          142       124      71        411

All descriptors were calculated using QikProp, except for ring size and the number of torsional angles sampled, which were calculated by hand. For 2XYT descriptors were calculated using Instant JChem [83]
(a) Number of torsional angles sampled during the MCMM and MTLMOD conformational searches
(b) Number of hydrogen bond donors
(c) Number of hydrogen bond acceptors
(d) Calculated octanol/water partition coefficient
(e) Polar surface area

Generating a non-biased starting conformation

In a conformational analysis study, it is less of a challenge to generate a conformation close to the X-ray conformation if the starting conformation is geometrically similar. Therefore, to ensure a starting geometry sufficiently dissimilar from the X-ray structure, it is typical to convert the X-ray structure to a 2D SMILES string and then convert it back to a 3D structure (keeping stereochemical information). In this study we used an alternative approach where conformational ensembles were generated via a 10,000 step MCMM search for each macrocycle and compared
to the corresponding X-rayppw conformation. The conformer with the highest atomic RMSD to the X-rayppw conformation was then chosen as the starting conformation for all subsequent conformational analysis studies. Using this approach no starting conformers had RMSD values below 1 Å to the X-rayppw conformation, see Table 3 (“starting conformer”) and Table S1 for detailed results. This should be compared with the generally accepted procedure of converting SMILES strings to 3D structures, which had four structures below the 1 Å RMSD threshold. However, it is well-known that one can obtain a high RMSD between two structures by altering only one or a few torsional angles, leaving all other geometric parameters unchanged. Therefore, to further evaluate the similarity between the starting conformers and the X-ray conformation, the torsional RMSD values were calculated.

Comparing the torsional RMSD values between the two different approaches, the median torsional RMSDs were 64.3° and 68.9° for the SMILES generated conformers and the conformational ensemble generated starting conformers, respectively (Table S1). For comparison, the energy minimized X-rayppw conformations had a median torsional RMSD of 8.9°. We also compared the number of torsional angles that differed by more than 120° in comparison to the X-rayppw conformation, which were similar for the two approaches. The median number of torsional angles differing by more than 120° was five for both the starting conformer and the SMILES generated conformer. The median number of torsional angles differing by more than 60° was slightly higher for the starting conformers (16 torsional angles) compared to the SMILES conformers (14 torsional angles). Thus, no major difference was observed between the two approaches for generating a starting conformation of sufficient dissimilarity to the X-rayppw structure. Taken together, we used the most dissimilar structure based on atomic RMSD as the starting conformation in the conformational analysis studies. Overall, this conformation was more dissimilar to the X-rayppw conformation as compared to the SMILES generated conformer, since four of the SMILES generated conformers had atomic RMSD values below the 1 Å threshold. As a general note, evaluating the dissimilarity between starting and X-rayppw conformations is advisable irrespective of the generating method.

Table 3  Comparing different strategies to generate non-biased starting conformers

                                    RMSD (Å) (a)
Conformer                           < 1 Å   1 Å–2 Å   > 2 Å
Energy minimized X-rayppw ligand    40      5         0
Starting conformer                  0       7         38
SMILES conformer                    4       15        26

(a) RMSD for the conformer identified with the lowest RMSD value to the X-rayppw ligand. The conformers are, dependent on their calculated RMSD values, divided into three different groups with RMSD values: below 1 Å, between 1–2 Å, and greater than 2 Å

Exploring energy minimization method and ring closure settings for MCMM and MTLMOD on a diverse subset of 10 macrocycles

Energy minimization method

Chen and Foloppe have shown that the settings for sampling macrocycles can be enhanced for improved search performance [44]. As it was observed that a major part of the conformational searches was spent on energy minimization of the generated conformations, the choice of minimization method was investigated using the diverse 10 macrocycle subset. The conformational search methods in MacroModel offer a wide variety of minimization methods, where the Polak-Ribiere Conjugated Gradient method (PRCG) is the default method while the truncated-Newton conjugate-gradient (TNCG) is described as a superb method for flexible structures [63]. Newton based minimization methods have also been frequently employed [30, 41, 67]. The PRCG and TNCG minimization methods were compared for the two conformational analysis methods MCMM and MTLMOD. To align all minimizations, up to 50,000 minimization steps were chosen since this is the default setting for the MD/LLMOD method (the minimization terminates when the convergence criterion is met). In summary, applied on the diverse macrocycle subset, MCMM and MTLMOD ran 1.6 times and 7.5 times faster, respectively, using TNCG instead of PRCG, see Table 4. Therefore, all minimizations with the MCMM and MTLMOD methods were run with TNCG instead of PRCG in this study.

Ring closure criterion for MCMM and MTLMOD conformational searches

The MCMM and MTLMOD search methods are implemented in MacroModel. The MTLMOD method uses either LowMode steps or MCMM steps [67]. To generate a new conformation for ring-containing compounds using the MCMM or MTLMOD methods, the ring needs to be temporarily opened; thus, a cleavage site must be identified. Using the default settings the acceptance criterion for ring closure after torsional variation is 0.5–2.5 Å. To evaluate the performance if all opened rings are re-closed, a very wide ring closure criterion between 0 and 100 Å was investigated. Similar wide ring closure distances have been used in previous studies (e.g., 0.1–30 Å) [30]. Cleavage sites adjacent to a stereocenter can potentially present complications as reconnecting the two atoms after the torsional movement


may induce inversion of the original stereocenter. Accordingly, a chirality check is used to reject conformations where this occurs. Therefore, avoiding stereocenters as ring closure atoms might increase the number of generated conformations by reducing the number rejected due to altered stereochemistry. Applied on the 10 diverse macrocycles, the performance of both MCMM and MTLMOD when changing the ring opening width and ring-opening placement was evaluated. As expected, avoiding ring-closures adjacent to chiral centers and increasing the ring opening width provided the highest number of unique ring conformations (Table 5). This combination also generated at least one conformer within 2 Å RMSD to the X-rayppw conformation for all 10 macrocycles. We reasoned that the ability to generate many ring conformations when analysing macrocycles is of key importance and therefore these settings were used in all subsequent studies (termed MCMM-Enhanced and MTLMOD-Enhanced). A more detailed walkthrough of the modified parameters is described in section “Method Optimization Using a Diverse Subset of 10 Macrocycles” in the supporting information.

Table 4  Computational times used in the conformational analysis of the ten macrocycles in the diverse subset using two different minimization methods

| Conformational search method | Energy minimization method | Computational timeᵃ |
| --- | --- | --- |
| MCMMᵇ | PRCGᶜ | 856 |
| MCMMᵇ | TNCGᵈ | 544 |
| MTLMODᵉ | PRCGᶜ | 4109 |
| MTLMODᵉ | TNCGᵈ | 551 |

ᵃ The sum total of computational time (minutes) consumption for conformational analysis of ten macrocycles
ᵇ Monte Carlo Multiple Minimum
ᶜ Polak-Ribiere Conjugated Gradient
ᵈ Truncated Newton Conjugated Gradient
ᵉ Mixed torsional/Low-mode

Comparing all search methods using the full data set of 44 macrocycles

To compare the general conformational search methods (MCMM, MTLMOD) with the more specialized macrocycle-sampling methods (MD/LLMOD, PRIME-MCS) we applied these methods on all macrocycles contained in the full data set. As the MCMM-Enhanced and MTLMOD-Enhanced methods performed well for the diverse subset, these were also included in the comparison study. Methods were evaluated based on the following criteria: the ability to identify (i) unique conformers, (ii) unique macrocycle ring conformations, and (iii) the global energy minimum; (iv) the methods’ computational speed; and the ability to identify conformers similar to (v) the X-rayppw conformation and (vi) the X-rayppw ring conformation. To evaluate how well the different conformational search methods performed and to get an approximation of the search efficiency, it would also be interesting to compare the generated conformational ensembles with the complete set of all possible conformers. However, as the number of conformers increases almost exponentially with the number of rotatable bonds, generation of such complete ensembles is difficult [64]. To at least address this challenge, the conformational

Table 5  Summary of conformational analysis settings and results of the 10 macrocycles in the diverse subset

| Method | Ring opening | Ring closure distance (Å) | No. confᵃ | No. unique ring confᵇ | CPU time (min)ᶜ | Global energy minimum found for no. macrocyclesᵈ | Best fit RMSDᵉ < 1 Å | Best fit RMSDᵉ 1 Å–2 Å | Best fit RMSDᵉ > 2 Å |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MCMM | Standard | 0.5–2.5 | 40,507 | 2482 | 666 | 4 | 7 | 0 | 3 |
| MCMM | Standard | 0.1–5.0 | 40,946 | 5158 | 646 | 3 | 8 | 1 | 1 |
| MCMM | Standard | 0–100 | 36,034 | 9326 | 813 | 5 | 7 | 3 | 0 |
| MCMM | Moved | 0.5–2.5 | 41,538 | 2811 | 614 | 3 | 6 | 2 | 2 |
| MCMM-Enhanced | Moved | 0–100 | 38,037 | 9402 | 800 | 5 | 7 | 3 | 0 |
| MTLMOD | Standard | 0.5–2.5 | 29,417 | 4082 | 714 | 4 | 6 | 1 | 3 |
| MTLMOD-Enhanced | Moved | 0–100 | 32,489 | 7574 | 704 | 3 | 8 | 2 | 0 |

ᵃ The sum total of conformers generated
ᵇ The sum total of unique ring conformations generated
ᶜ The sum total of computational time (minutes) used for conformational analysis
ᵈ Number of macrocycles where the lowest energy conformer was identified or a conformer with an energy difference no greater than 1 kJ mol−1
ᵉ RMSD for the conformer identified with the lowest RMSD value to the X-ray ligand after protein preparation treatment (X-rayppw). The conformers are, dependent on their RMSD values, divided into three different groups with RMSD values: below 1 Å, between 1 Å–2 Å, and greater than 2 Å
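The ring-closure acceptance windows compared in Table 5 amount to a distance test on the two atoms that flank the temporarily opened bond. The following is a minimal sketch of that test only; the function name, the coordinates, and the inclusive bounds are illustrative assumptions and do not reproduce MacroModel's implementation.

```python
import math


def ring_closure_accepted(atom_a, atom_b, d_min=0.5, d_max=2.5):
    """Return True if the ring-closure atoms lie within the accepted distance window (Å).

    The default window mirrors the 0.5-2.5 Å standard setting quoted in the text;
    treating both bounds as inclusive is an assumption of this sketch.
    """
    distance = math.dist(atom_a, atom_b)
    return d_min <= distance <= d_max


# Hypothetical coordinates (Å) of the two closure atoms after torsional variation.
atom_a = (0.0, 0.0, 0.0)
atom_b = (3.0, 0.0, 0.0)

standard = ring_closure_accepted(atom_a, atom_b)         # standard window 0.5-2.5 Å
wide = ring_closure_accepted(atom_a, atom_b, 0.0, 100.0)  # wide window 0-100 Å
print(standard, wide)
```

With the wide window essentially every opened ring is re-closed, which is the behaviour the 0–100 Å setting was chosen to test; the subsequent energy minimization is then responsible for restoring a sensible bond length.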


search methods examined in this work were benchmarked against an exhaustive MCMM run using 1,000,000 search steps. The MCMM method was used in this study because it is an efficient method for generating conformers and the most efficient in generating ring conformations. MCMM, MCMM-Enhanced, MTLMOD, MTLMOD-Enhanced and PRIME-MCS were run three times independently using different seeds. The settings used for the different methods are summarized in Table 6 and all results are summarized in Table 7. For those methods where 3 different seeds were used, the Max, Min and Average results for each method are presented.

PRIME-MCS employs, in comparison with other methods that use 15 kcal mol−1, a much wider energy window of 100 kcal mol−1 for saving conformers, which may result in a larger number of conformers generated. Furthermore, PRIME-MCS minimizations are performed in vacuum as compared to the GB/SA water solvation model that is used by the other methods. Therefore, whilst the results of PRIME-MCS are not directly comparable with the MCMM, MTLMOD and MD/LLMOD methods, they are still included as a comparison in all results except in the search for the global energy minimum.

Total number of conformers generated

For each of the three runs using different seeds, the sum of all conformers identified for all 44 macrocycles was calculated for each search method. Since all methods except MD/LLMOD were run three times, the average number of conformers per run is presented to allow a comparison between the methods. This number was calculated as the sum of identified conformers for each method divided by the number of runs that were made for that method. Across all search methods, MCMM generated the highest average number of conformers over all 44 macrocycles (155,296), see Table 7 and Table S9. MCMM-Enhanced identified 149,831 conformers on average followed by MTLMOD-Enhanced (134,396), MTLMOD (117,490), MD/LLMOD (45,917), and PRIME-MCS (31,118).

In the 10,000 step runs, the number of identified conformations did not vary considerably between the three different seeds within the different methods for each macrocycle. The largest observed difference was for MCMM-Enhanced with a variation of 7% between the highest and lowest number of generated conformations (see max/min in Table 7). For example, for 1BXO, which contains 24 rotatable bonds, MCMM-Enhanced generated 6141/5345/5963 conformations out of 10,000 possible for each run.

With 155,296 identified conformers, MCMM produces a new conformer within the given energy window approximately every third iteration (440,000 possible conformers if every Monte Carlo step generated a new conformer). Using 100 times as many search steps, the MCMM-Exhaustive search identified about 48 times more conformers in total (7,528,356), as compared to the shorter MCMM protocol (discussed further in section “Conformational coverage—are we reaching convergence”). However, the MCMM-Exhaustive searches were intended to serve as a benchmark in this study, and generating large conformational ensembles can be problematic in terms of disk space, data handling and downstream processing such as visual inspection, docking, pharmacophore modelling and quantum mechanical optimizations.

Number of unique ring conformations generated

As not all conformational sampling software supports macrocyclic ring sampling and since ring sampling in itself is not always easily performed [36], we evaluated the different methods’ ability to generate unique ring conformations. This was defined as the sum of identified ring conformations for each method divided by the number of runs that were made for that method. Whilst one could hypothesize that generating more conformers would also provide more ring conformations, the sum of all ring conformations identified by each method did not parallel the total number of generated conformers for the full data set. Instead, MCMM-Enhanced produced the largest number of ring conformations (49,324) followed by MTLMOD-Enhanced (41,040), PRIME-MCS (24,953), MTLMOD (23,367), MD/LLMOD (19,189) and MCMM (18,950) (Table 7 and Table S10). Thus, running MCMM and MTLMOD using adjusted settings regarding the ring opening bond (the enhanced settings) increased the number of generated ring conformations drastically. The MCMM-Exhaustive searches generated 967,844 ring conformations in total, showing that with increased sampling more ring conformations could be found.

Identifying the global energy minimum

The ability of the methods to identify the global energy minimum was also evaluated. As minimization uses a GB/SA solvation model in all methods except for PRIME-MCS (vacuum), this method was not evaluated in this section. As MCMM-Exhaustive (1,000,000 search steps) identified the lowest energy conformer for all macrocycles, this was considered as the global minimum. For the other methods, the global energy minimum was considered identified if a conformer within 1 kJ mol−1 of the global energy minimum was generated. We also set out to investigate if these conformers were geometrically identical to the global energy minimum conformer. This was determined by first analyzing whether the global energy ring conformation was identified and, secondly, whether the whole conformer was identified. The global energy conformer

Table 6  Summary of the conformational analysis settings for the evaluated methods and literature protocols

| Method | Number of search steps | Energy window (kcal mol−1) | Torsion sampling optionᵃ | Elimination of redundant conformationsᵇ | Ring closure distance (Å)ᶜ | Placement of ring openingᵈ | Energy minimization methodᵉ | Maximum energy minimization iterations | Energy minimization threshold (kJ Å−1 mol−1) | Force field | Solvent |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MCMM defaultᶠ | 1000 | 5.02 | Intermediate | AD 0.5 Å | 0.5–2.5 | Standard | PRCG | 2500 | 0.05 | OPLS-3 | water |
| MTLMOD defaultᶠ | 1000 | 5.02 | Intermediate | AD 0.5 Å | 0.5–2.5 | Standard | PRCG | 2500 | 0.05 | OPLS-3 | water |
| MD/LLMOD defaultᶠ | 5000 MD, 5000 LLMOD | 10 | Enhanced | RMSD 0.75 Å | NAᵍ | NAᵍ | NAVʰ | 50,000 | 0.01 | OPLS-3 | water |
| MD/LLMOD | 5000 MD, 5000 LLMOD | 15.01 | Extended | RMSD 0.75 Å | NAᵍ | NAᵍ | NAVʰ | 50,000 | 0.01 | OPLS-3 | water |
| MCMM | 10,000 | 15.01 | Extended | AD 0.5 Å | 0.5–2.5 | Standard | TNCG | 50,000 | 0.05 | OPLS-3 | water |
| MCMM-Enhanced | 10,000 | 15.01 | Extended | AD 0.5 Å | 0–100 | Stereocenters avoided | TNCG | 50,000 | 0.05 | OPLS-3 | water |
| MCMM-Exhaustive | 1,000,000 | 15.01 | Extended | AD 0.5 Å | 0–100 | Stereocenters avoided | TNCG | 50,000 | 0.05 | OPLS-3 | water |
| MTLMOD | 10,000 | 15.01 | Extended | AD 0.5 Å | 0.5–2.5 | Standard | TNCG | 50,000 | 0.05 | OPLS-3 | water |
| MTLMOD-Enhanced | 10,000 | 15.01 | Extended | AD 0.5 Å | 0–100 | Stereocenters avoided | TNCG | 50,000 | 0.05 | OPLS-3 | water |
| PRIME-MCS | Spinroot 10ⁱ | 100 | Peptide bonds | Torsional fingerprint | NAᵍ | NAᵍ | TNCG | Chain minimizationʲ | 0.04 | OPLS-2005 | vacuum |
| CF-MTLMODᵏ | 10,000 (400 RotStep)ˡ | 15 | Intermediate | RMSD 0.25 Å | 0.5–2.5 | Standard | PRCG | 3000 | 0.05 | OPLS-2005 | water |
| CF-MD/LLMODᵏ | 5000 MD, 5000 LLMOD | 15 | Enhanced | RMSD 0.25 Å | NAᵍ | NAᵍ | NAVʰ | 50,000 | 0.01 | OPLS-2005 | water |
| CF-LowModeMD MOEᵏ | 10,000 | 15 | NAᵍ | RMSD 0.25 Å | NAᵍ | NAᵍ | NAVʰ | 500 | 0.021 | MMFF94x | water |

ᵃ Intermediate: sample C–N and C–O single bonds other than in standard amides and esters; Enhanced: sample all C–N and C–O single bonds; Extended: sample all C–N and C–O single bonds and C=N and N=N double bonds. Sampling of peptide bonds is allowed (“peptide bonds”)
ᵇ Atom deviation (AD): A conformation is unique if one (or several) of the defined atoms deviates more than specified from the compared conformations after superposition. Root Mean Square Deviation (RMSD): A conformation is unique if the RMSD value between two conformations exceeds the specified value. Torsional Fingerprint: Two conformations are considered redundant if they have identical torsional fingerprints
ᶜ Re-close ring system if the ring closure atoms are within the defined distance range
ᵈ Placement of the macrocyclic ring opening bond could be adjacent to a stereocenter using the automatic setup (standard)
ᵉ Energy minimization method. Polak-Ribiere Conjugated Gradient (PRCG). Truncated Newton Conjugated Gradient (TNCG)
ᶠ Default refers to the predefined values in Schrödinger
ᵍ Not Applicable
ʰ Not Available
ⁱ Spinroot 1, and 10 generating up to 100 and 1000 conformations, respectively
ʲ Chain energy minimization, starts with a conjugate gradient followed by a Truncated Newton minimization
ᵏ Enhanced settings presented by I-Chen and Foloppe
ˡ Limits the total number of search steps as a function of the number of rotatable bonds. Only active if multiple ligands are sampled simultaneously
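The two geometric redundancy filters defined in the Table 6 footnotes (atom deviation, AD, and RMSD) can give different answers for the same pair of conformations. The sketch below illustrates this with hypothetical three-atom coordinates; it assumes the conformations are already superimposed and share the same atom order, and the function names are illustrative rather than taken from any of the programs discussed.

```python
import math


def max_atom_deviation(conf_a, conf_b):
    """Largest per-atom distance (Å) between two pre-superimposed conformations."""
    return max(math.dist(p, q) for p, q in zip(conf_a, conf_b))


def rmsd(conf_a, conf_b):
    """Root mean square deviation (Å) over matched atoms, without re-superposition."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(conf_a, conf_b))
    return math.sqrt(squared / len(conf_a))


def is_unique_ad(candidate, kept, cutoff=0.5):
    """AD criterion: unique if at least one atom deviates by more than the cutoff
    from every conformation already kept."""
    return all(max_atom_deviation(candidate, k) > cutoff for k in kept)


def is_unique_rmsd(candidate, kept, cutoff=0.75):
    """RMSD criterion: unique if the RMSD to every kept conformation exceeds the cutoff."""
    return all(rmsd(candidate, k) > cutoff for k in kept)


# Hypothetical coordinates: only the third atom moves, by 0.6 Å.
kept = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]]
candidate = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.6, 0.0)]

unique_by_ad = is_unique_ad(candidate, kept)      # 0.6 Å exceeds the 0.5 Å AD cutoff
unique_by_rmsd = is_unique_rmsd(candidate, kept)  # RMSD is about 0.35 Å, below 0.75 Å
print(unique_by_ad, unique_by_rmsd)
```

A single displaced atom is enough to pass the AD filter, whereas the RMSD filter averages the displacement over all atoms. This is one reason why conformer counts obtained with different redundancy criteria are not directly comparable.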


Table 7  Summary of conformational analysis results for the full data set of 44 macrocycles

| Method | No. confᵃ | No. unique ring confᵇ | Computational time (min)ᶜ | Global energy minimum found for no. macrocyclesᵈ | Best fit RMSDᵉ < 1 Å | Best fit RMSDᵉ 1 Å–2 Å | Best fit RMSDᵉ > 2 Å | RMSDRINGᶠ < 0.5 Å | RMSDRINGᶠ 0.5 Å–1 Å | RMSDRINGᶠ > 1 Å |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Energy minimized X-rayppw ligand | NAᵍ | NAᵍ | NAᵍ | NAᵍ | 40 | 5 | 0 | 45 | 1 | 0 |
| Starting conformer | NAᵍ | NAᵍ | NAᵍ | NAᵍ | 0 | 7 | 38 | 3 | 19 | 24 |
| SMILES conformer | NAᵍ | NAᵍ | NAᵍ | NAᵍ | 4 | 15 | 26 | 5 | 25 | 16 |
| MCMMʰ (Average) | 155,296 | 18,950 | 3740 | 48% (63/132) | 37 | 6 | 2 | 38 | 8 | 0 |
| MCMMʰ (Max) | 159,973 | 20,274 | 4138 | NAᵍ | 42 | 0 | 3 | 42 | 2 | 2 |
| MCMMʰ (Min) | 150,558 | 17,840 | 3340 | NAᵍ | 32 | 13 | 0 | 35 | 11 | 0 |
| MTLMODʰ (Average) | 117,490 | 23,367 | 4125 | 48% (64/132) | 35 | 8 | 2 | 39 | 6 | 1 |
| MTLMODʰ (Max) | 121,147 | 25,015 | 4468 | NAᵍ | 39 | 5 | 1 | 42 | 1 | 3 |
| MTLMODʰ (Min) | 113,512 | 21,804 | 3806 | NAᵍ | 31 | 10 | 4 | 37 | 9 | 0 |
| MCMM-Enhancedʰ (Average) | 149,831 | 49,324 | 4642 | 52% (68/132) | 40 | 5 | 0 | 44 | 2 | 0 |
| MCMM-Enhancedʰ (Max) | 155,304 | 52,181 | 5022 | NAᵍ | 41 | 4 | 0 | 45 | 1 | 0 |
| MCMM-Enhancedʰ (Min) | 144,227 | 46,906 | 4325 | NAᵍ | 35 | 10 | 0 | 40 | 6 | 0 |
| MTLMOD-Enhancedʰ (Average) | 134,396 | 41,040 | 4182 | 50% (66/132) | 36 | 9 | 0 | 45 | 1 | 0 |
| MTLMOD-Enhancedʰ (Max) | 137,988 | 42,589 | 4515 | NAᵍ | 41 | 3 | 1 | 46 | 0 | 0 |
| MTLMOD-Enhancedʰ (Min) | 130,954 | 39,443 | 3932 | NAᵍ | 32 | 13 | 0 | 41 | 5 | 0 |
| MD/LLMODⁱ (Average) | 45,917 | 19,189 | 4163 | 55% (24/44) | 31 | 11 | 3 | 38 | 8 | 0 |
| MD/LLMODⁱ (Max) | NA | NA | NA | NAᵍ | NA | NA | NA | NA | NA | NA |
| MD/LLMODⁱ (Min) | NA | NA | NA | NAᵍ | NA | NA | NA | NA | NA | NA |
| PRIME-MCSʰ (Average) | 31,118 | 24,953 | 9791 | NAᵍ | 24 | 19 | 2 | 35 | 11 | 0 |
| PRIME-MCSʰ (Max) | 31,286 | 25,122 | 9804 | NAᵍ | 24 | 19 | 2 | 37 | 9 | 0 |
| PRIME-MCSʰ (Min) | 30,952 | 24,795 | 9779 | NAᵍ | 22 | 21 | 2 | 34 | 12 | 0 |
| MCMM-Exhaustive | 7,528,356 | 967,844 | 925,026 | 45 | 44 | 1 | 0 | 46 | 0 | 0 |

ᵃ The sum total of conformers generated
ᵇ The sum total of unique ring conformations
ᶜ The sum total of computational time (minutes) used for conformational analysis
ᵈ Number of macrocycles where the lowest energy conformer was identified or a conformer with an energy difference not greater than 1 kJ mol−1 and an RMSD below 0.1 Å to the global energy conformer using all heavy atoms
ᵉ RMSD for the conformer identified with the lowest RMSD value to the X-ray ligand after protein preparation treatment (X-rayppw). The conformers are, dependent on their RMSD values, divided into three different groups with RMSD values: below 1 Å, between 1–2 Å, and greater than 2 Å
ᶠ RMSDRING for the conformer identified with the lowest RMSDRING value to the heavy ring atoms in the X-ray ligand after protein preparation treatment. The conformers are, dependent on their RMSD values, divided into three different groups with RMSD values: below 0.5 Å, between 0.5–1 Å, and greater than 1 Å. 3BXR consists of two macrocyclic rings, therefore 46 (instead of 45) RMSDRING values are presented
ᵍ Not Applicable
ʰ Run three times using different seeds
ⁱ MD/LLMOD was run one time
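The per-run averages and success percentages in Table 7 follow from simple arithmetic on the counts quoted in the text. The sketch below reproduces a few of them; the helper names are illustrative, and the only inputs are figures stated in this section.

```python
def average_per_run(per_run_totals):
    """Sum over independent runs divided by the number of runs."""
    return sum(per_run_totals) / len(per_run_totals)


def hit_rate_percent(hits, searches):
    """Share of searches that found the global minimum, rounded to a whole percent."""
    return round(100 * hits / searches)


# 44 macrocycles x 3 seeds = 132 searches for the methods run in triplicate.
mcmm_rate = hit_rate_percent(63, 132)
# MD/LLMOD was run once, so only 44 searches.
mdllmod_rate = hit_rate_percent(24, 44)

# Fraction of MCMM steps that yielded a conformer inside the energy window:
# 155,296 conformers out of 44 macrocycles x 10,000 steps.
yield_per_step = 155_296 / (44 * 10_000)

# MCMM-Enhanced conformer counts for 1BXO over the three seeds.
bxo_average = average_per_run([6141, 5345, 5963])

print(mcmm_rate, mdllmod_rate, round(yield_per_step, 3), round(bxo_average))
```

The per-step yield of roughly 0.35 corresponds to the statement that MCMM produces a new conformer approximately every third iteration.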

and ring conformation were considered geometrically identical if the RMSD between the two conformers was below 0.1 Å when using either all heavy atoms present or just those of the macrocyclic ring, respectively. The mirror-image conformers of the global energy minimum were considered identical in this analysis.

To compare the ability of different methods to identify the global energy minimum using 10,000 search steps, the


percentage of runs in which these conformations were identified was calculated. By only considering the energy, MD/LLMOD identified the global energy minimum in 64% of the conformational searches (28/44), whereas MTLMOD-Enhanced and MCMM-Enhanced identified a conformer within 1 kJ mol−1 of the global energy minimum for 60% (79/132) and 59% (78/132) of the runs, respectively, see Table 7 and Table S11. MCMM and MTLMOD found the global energy minimum for 57% (75/132) and 56% (74/132) of the runs, respectively. Consequently, when using the enhanced settings, both MCMM and MTLMOD performed slightly better. Since the success rate of finding a conformer within 1 kJ mol−1 of the global minimum ranged between 56 and 64%, this suggests that 10,000 search steps might not be adequate for finding the global minimum for macrocycles. However, considering the number of rotatable bonds a macrocycle may have and, consequently, the large number of conformers that may exist, the probability of generating the global energy minimum should be rather low. Thus, the low success rates of generating the global energy minimum are not surprising.

By requiring that the ring conformation should be identical to that of the global energy minimum, MD/LLMOD instead identified the global energy minimum in 61% (27/44) of the conformational searches. This was followed by MCMM-Enhanced 57% (75/132), MTLMOD-Enhanced 55% (73/132), MTLMOD 54% (71/132) and MCMM 53% (70/132). Using the strictest definition, where the whole (all heavy atoms) global energy minimum conformer needs to be identical, MD/LLMOD identified the global energy minimum in 55% (24/44) of the conformational searches, followed by MCMM-Enhanced 52% (68/132), MTLMOD-Enhanced 50% (66/132), MTLMOD 48% (64/132) and MCMM 48% (63/132). In summary, MD/LLMOD generated the global energy minimum most frequently of the evaluated methods across all three definitions of the global energy minimum.

The challenge of identifying the global energy minimum for a given macrocycle can be estimated by the number of times it is found over 13 different runs (3 MCMM runs, 3 MCMM-Enhanced runs, 3 MTLMOD runs, 3 MTLMOD-Enhanced runs and 1 MD/LLMOD run) [44]. For each macrocycle, the relationship between the number of times the methods found the global energy minimum (using the all heavy atoms definition) and the number of rotatable bonds is shown in Fig. 2 below. For example, all 13 runs identified the global energy minimum for 1S9D (15 rotatable bonds), while none of the runs found the global energy minimum for 1NSG (46 rotatable bonds). As expected, the methods are more successful at identifying the global energy minimum for less flexible macrocycles (less than 20 rotatable bonds, dotted line in Fig. 2). In contrast, this is more challenging for flexible macrocycles with more than 33 rotatable bonds (grey area in Fig. 2). This suggests that if the aim is to generate the global energy minimum for a macrocycle with many rotatable bonds, more extensive conformational sampling than 10,000 search steps is required. For nine macrocycles, none of the methods except MCMM-Exhaustive were able to identify the global energy minimum (1FKD, 1FKI, 1NMK, 1NSG, 1TPS, 2ASP, 2DG4, 3BXS and 4NNR). These compounds contain some of the largest ring structures in the data set, except 3BXS. In 3BXS none of the methods identified the correct valine side-chain orientation.

Fig. 2  Relationship between rotatable bonds defined as the number of torsion angles sampled per macrocycle and the number of times the global energy minima were identified using the different methods (13 runs per macrocycle using 5 different methods)

Computational speed

The computational speed per method is given as the average CPU-time to sample the 44 macrocycles (total CPU time/number of seeds). The fastest method overall was MCMM (3740 min) followed by MTLMOD (4125 min), MD/LLMOD (4163 min), MTLMOD-Enhanced (4182 min), MCMM-Enhanced (4642 min) and PRIME-MCS (9791 min), see Table 7 and Table S12. PRIME-MCS (9791 min) was thereby roughly twice as slow in comparison to the second slowest method MCMM-Enhanced (4642 min). MCMM-Exhaustive ran for 925,026 min, i.e. almost 2 years, in total (median 17,412 min, approximately 12 days per macrocycle). Other conformational sampling methodologies requiring up to 7 days on a 100 CPU cluster have been published [37]. However, this amount of conformational sampling is rarely described. Thus, using MCMM-Exhaustive is probably unreasonably lengthy for most modelling projects in the field of drug discovery.
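The three definitions of a global-minimum hit used above (energy only, identical ring, identical whole conformer) can be expressed as a small classifier. This is a minimal sketch under stated assumptions: the conformer tuples, their energies, and their RMSD values are hypothetical, the RMSDs are assumed to be precomputed against the reference global-minimum conformer, and the function name is illustrative.

```python
def global_minimum_hits(conformers, e_global, e_tol=1.0, rmsd_tol=0.1):
    """Classify one search against the three definitions used in the text.

    conformers: iterable of (energy_kJ_mol, rmsd_ring, rmsd_all_heavy), with both
    RMSDs (Å) measured against the reference global-minimum conformer.
    """
    # Keep only conformers within the energy tolerance of the reference minimum.
    near = [c for c in conformers if abs(c[0] - e_global) <= e_tol]
    return {
        "energy": bool(near),
        "ring": any(c[1] < rmsd_tol for c in near),
        "all_heavy_atoms": any(c[2] < rmsd_tol for c in near),
    }


# Hypothetical search result: one conformer lies 0.6 kJ/mol above the reference
# and shares its ring conformation, but differs in a side chain.
conformers = [
    (-250.4, 0.05, 0.32),
    (-245.0, 0.02, 0.03),  # geometrically close but 6 kJ/mol too high in energy
]
result = global_minimum_hits(conformers, e_global=-251.0)
print(result)
```

A search like this one counts as a hit under the energy and ring definitions but not under the strictest all-heavy-atoms definition, which is consistent with the success rates dropping as the definition tightens.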


Generating a conformation similar to the X-rayppw conformation

The ability to generate conformers similar to the bioactive conformation in the X-ray structure (best-fit conformation in Table 7) was analysed by calculating the RMSD between all conformers for each macrocycle and the X-ray ligand after Protein Preparation Wizard treatment (the X-rayppw conformation). Structures downloaded from the PDB are interpretations of electron densities and therefore models that may contain errors, especially in the ligand structures [68-71]. Therefore, we allowed for “adjusting” distorted bond angles and lengths, etc., in protein–ligand complexes and thereby aligning the structures to the force field. Thus, we used the X-rayppw conformation as the X-ray structure in all comparisons. When available, the fit to the electron density was evaluated for the protein-macrocycle complexes after the restrained energy minimization. The conformer most similar to the X-rayppw conformation was grouped in one of the following categories based on the RMSD value: below 1 Å, between 1 Å–2 Å and greater than 2 Å. All non-hydrogen atoms were considered for the RMSD calculations. Furthermore, since one of the complexes (3BXS) contains the macrocycle bound to two separate binding sites with different conformations, 45 instead of 44 macrocycles extracted from protein-macrocycle complexes were evaluated in this section. The median RMSD value between the X-ray ligand structure before and after Protein Preparation Wizard treatment was 0.19 Å. For methods run in triplicate, the mean value is presented.

Furthermore, to explore the local minimum closest to the X-rayppw conformation we energy minimized the X-rayppw structure and compared it with the X-rayppw conformation (“energy minimized X-rayppw ligand” in Table 7). 40 out of the 45 macrocycles had a local minimum with RMSD values below 1 Å when superimposed on the X-rayppw structure, and five energy minimized X-rayppw conformers had RMSD values between 1 and 2 Å, see also Table S13.

The different methods’ ability to generate conformers similar to the X-rayppw conformation can be seen in Table 7 (“best fit conformation”), Table S13 and Table S14. Analysing the results below 1 Å RMSD, the MCMM-Exhaustive searches were able to identify such conformers for 44 out of 45 macrocycles. The RMSD for 1TPS was 1.22 Å and this macrocycle contained the highest number of rotatable bonds (47), which could be a reason for this. The second best search method was MCMM-Enhanced (40 of 45 macrocycles), followed by MTLMOD-Enhanced (36 out of 45 macrocycles). The more specialized methods MD/LLMOD and PRIME-MCS generated a conformer below 1 Å RMSD for 31 and 24 macrocycles, respectively. The results from the MCMM-Exhaustive search imply that conformers very close to the X-rayppw conformations can be generated if the sampling is sufficiently increased. Surprisingly, when comparing the MCMM-Enhanced and the MTLMOD-Enhanced searches with the MCMM-Exhaustive search the results are not dramatically different. Thus, 10,000 search steps seem to be adequate for generating a conformer close to the X-rayppw conformation for the enhanced methods. Overall, the general methods seemed more efficient at generating conformers close to the X-rayppw conformations in comparison to the more specialized methods. A visual overview of how the different methods performed is depicted in Fig. 3.

Fig. 3  All heavy-atoms RMSD. The mean RMSD value was used for those methods that were run more than one time. The cumulative performance describing how successful the methods are at generating a conformer close to the X-rayppw conformation is shown. The performance is benchmarked against the energy minimized X-rayppw conformations (shown by the pink line near the bottom) and an exhaustive MCMM run (1,000,000 search steps shown by the purple line at the bottom)

Generating the X-rayppw ring conformation

Most of the published macrocycle sampling studies have focused on the RMSD to the macrocyclic ring atoms as found in the X-ray. Thus, as a final comparison, we wanted to evaluate if the methods examined here could identify a ring conformation similar to the X-rayppw ring conformation (RMSDRING). These results are summarized in Table 7 and are presented in greater detail in Tables S15 and S16. For 3BXR, which consists of two macrocyclic rings, an RMSDRING value was calculated for each ring separately. Therefore, 46 RMSD values instead of 45 are presented in this section. For methods run in triplicate, the mean value is presented.

Ideally, a conformational search method should be able to identify a conformer below 0.5 Å RMSDRING (a commonly used threshold [33, 36, 38]). Analysing the results of the benchmarking methods MCMM-Exhaustive and the energy


minimized X-rayppw structures, these methods were able to identify a conformer below 0.5 Å RMSDRING in 46 and 45 cases out of the 46 macrocyclic rings, respectively. Analysing the results obtained with the other methods, they all had a median RMSDRING below 0.5 Å. However, MCMM-Enhanced and MTLMOD-Enhanced most accurately regenerated the macrocyclic ring conformation, identifying a conformer below 0.5 Å RMSDRING in 44 and 45 out of the 46 cases, respectively (Table 7). For comparison, the third best method MTLMOD identified such a conformer for 39 out of the 46 macrocyclic rings. Compared to the other methods in Fig. 4, MCMM-Enhanced and MTLMOD-Enhanced seem to be more efficient at generating ring conformations close to the X-rayppw ring conformation.

Fig. 4  Ring atoms RMSD. The mean RMSD value was used for those methods that were run more than one time. The cumulative performance describing how successful the methods are at generating a conformer close to the X-rayppw conformation is shown. The performance is benchmarked against the energy minimized X-rayppw conformations (shown by the pink line near the bottom) and an exhaustive MCMM run (1,000,000 search steps shown by the purple line at the bottom)

Comparing the results with prior work in the field

Several recent publications have studied the conformational preference of macrocycles and presented new conformational sampling methods. However, a direct comparison of these methods with those presented in this study is difficult since the datasets differ and the energies, number of conformers and ring conformations may change significantly depending on, for example, what force field has been used. Another challenge with such comparisons is that small changes in programme settings can change the internal rankings between the methods. Despite this, we compared the ability of the methods presented in each paper to reproduce the X-ray structure (X-ray accuracy). Specifically, we analyzed the methods’ ability to re-generate the X-ray conformation with respect to all heavy-atoms RMSD, as well as the backbone (ring) atoms RMSDRING (see Table 8). As it is not meaningful to directly compare the results between different studies, we only looked at the internal order of the different methods in each paper with respect to RMSD values. The comparison starts with the well-cited work from Chen and Foloppe [44] where MD/LLMOD and MTLMOD, among other methods, were evaluated, followed by an analysis of the publications where the MD/LLMOD [32] and PRIME-MCS [33] methods were published. Finally, we included the publications that introduce the new methods BRIKARD [35], PLOP [36], and ForceGen [34]. In all the publications above, with the exception of ForceGen, the MD/LLMOD method has been included, which allows it to serve as a reference method in the current analysis (the results of MD/LLMOD are shown in bold in Table 8).

Chen and Foloppe evaluated several different settings on a series of conformational sampling methods with the aim of optimizing search efficiency. The optimal settings derived are presented with the prefix “CF-” below and in Table 6. Unfortunately, not all heavy-atoms RMSD values were reported in this study but instead the number of X-ray structures that were reproduced within an RMSD of 1 Å. The CF-MTLMOD method was the most accurate in generating conformers below 1 Å RMSD (79%) followed by CF-MD/LLMOD (77%), MOE LowModeMD (72%), and MOE stochastic search (53%) [44].

Watts et al. presented the MD/LLMOD method and a macrocycle data set consisting of 67 PDB structures (150 structures in total) [32]. This data set, as well as the MD/LLMOD method, has been used in several other studies. Applied on the 67 PDB structures, Watts et al. reported a median heavy-atom RMSD of 1.14 Å compared to the X-ray structure (values reported in supporting information in ref [32]).

Sindhikara et al. used 60 out of the 67 macrocycles in the Watts et al. data set and included the MD/LLMOD method, a molecular dynamics simulation method (simulations were run for 24 ns), and MOE LowModeMD [31] as reference methods when they presented and evaluated the PRIME-MCS method [33]. Applied on those 60 PDB structures, Sindhikara et al. reported the median all heavy-atom RMSD values to the X-ray structure to be the lowest for MD/LLMOD (1.10 Å), followed by PRIME-MCS (1.49 Å), MOE LowModeMD (1.69 Å), and molecular dynamics (1.89 Å) (values reported in supporting information in ref [33]). The median RMSDRING values followed a pattern analogous to the all heavy-atom RMSD values. The lowest median RMSDRING value was obtained for MD/LLMOD (0.38 Å) followed by PRIME-MCS (0.40 Å), MOE LowModeMD (0.41 Å), and molecular dynamics (0.56 Å) (values reported in supporting information in ref [33]). Thus, in both cases, the MD/


Table 8  Summary of the X-ray accuracy reported in the literature

| Method | Authors | Data set (total no. structures/PDB structures) | % below 1 Å RMSD (a) | Median RMSD (Å) (b) | Median RMSDRING (Å) (c) |
|---|---|---|---|---|---|
| MD/LLMOD | Chen and Foloppe [44] | Chen and Foloppe (30/30) | 77 | NA | NA |
| CF-MTLMOD (e) | | | 79 | NA | NA |
| MOE LowModeMD | | | 72 | NA | NA |
| Stochastic search | | | 53 | NA | NA |
| MD/LLMOD | Watts et al. [32] | Watts et al. (150/67) | NA | 1.14 | NA |
| MD/LLMOD | Sindhikara et al. [33] | Watts et al. (208/60); 60 PDB structures from Watts et al. | NA | 1.10 (PDB) | 0.38 (PDB) |
| PRIME-MCS | | | NA | 1.49 (PDB) | 0.40 (PDB) |
| MOE LowModeMD | | | NA | 1.69 (PDB) | 0.41 (PDB) |
| Molecular dynamics simulation (24 ns) | | | NA | 1.89 (PDB) | 0.56 (PDB) |
| BRIKARD | Coutsias et al. [35] | Coutsias et al. (67/39) | NA | NA | 0.47 (all), 0.42 (PDB) |
| CF-MD/LLMOD (e) | | | NA | NA | 0.54 (all), 0.47 (PDB) |
| MD/LLMOD | | | NA | NA | 0.63 (all), 0.49 (PDB) |
| CF-LowModeMD (e) | | | NA | NA | 0.64 (all), 0.54 (PDB) |
| PLOP | Wang et al. [36] | Wang et al. (37/12) | NA | NA | 0.25 (70% below 0.5 Å) |
| MD/LLMOD | | | NA | NA | NA (64% below 0.5 Å) |
| CF-MTLMOD (e) | Cleves and Jain [34] | Chen and Foloppe (30/30) | NA | NA | NA |
| CF-LowModeMD (e) | | | NA | NA | NA |
| ForceGen | | | NA | NA | NA |
| MCMM | Current work 2019 | Alogheli and Watts et al. (44/44); 31 PDB structures from Watts et al. | 78 | 0.58 | 0.16 |
| MCMM-Enhanced | | | 84 | 0.58 | 0.16 |
| MTLMOD-Enhanced | | | 89 | 0.59 | 0.17 |
| MTLMOD | | | 78 | 0.77 | 0.18 |
| MD/LLMOD | | | 69 | 0.78 | 0.20 |
| PRIME-MCS spinroot 30 | | | 51 | 0.98 | 0.27 |

An empty Authors or Data set cell means the entry of the row above applies. In all the publications above (with the exception of the ForceGen study), the MD/LLMOD method has been included, which allows it to serve as a reference method.
(a) Percentage of macrocycles in the data set for which the method generated a conformer below 1 Å RMSD to the X-ray conformation using all heavy atoms
(b) Median RMSD using all heavy atoms
(c) Median RMSD using only the heavy atoms in the macrocyclic ring
(d) NA: Not applicable
(e) Optimized settings presented by Chen and Foloppe
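The two accuracy measures tabulated above (the percentage of macrocycles with a conformer below 1 Å RMSD, and the median RMSD) can be illustrated with a minimal Python sketch. This is not the script used in the study. It assumes that, for each macrocycle, the lowest RMSD to the X-ray conformation among all generated conformers is already available. The function name `xray_accuracy`, the dictionary layout, and the example values are hypothetical.

```python
from statistics import median


def xray_accuracy(best_rmsd_by_structure, threshold=1.0):
    """Summarise per-structure best RMSD values (in Å).

    best_rmsd_by_structure maps a structure label to the lowest RMSD
    found between any generated conformer and the X-ray conformation.
    The layout is an assumption made for this illustration.
    """
    values = list(best_rmsd_by_structure.values())
    if not values:
        raise ValueError("no RMSD values supplied")
    # Count structures with at least one conformer below the threshold.
    below = sum(1 for value in values if value < threshold)
    return {
        "percent_below": 100.0 * below / len(values),
        "median_rmsd": median(values),
    }


# Hypothetical values, not taken from the paper.
example = {"mac_A": 0.42, "mac_B": 0.95, "mac_C": 1.30, "mac_D": 0.58}
summary = xray_accuracy(example)
print(summary)
```

The same function applies to ring-only RMSD values if those are supplied instead of all heavy-atom values.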

LLMOD method had the best accuracy for reproducing the X-ray conformation.

In a study by Coutsias et al., a new method called BRIKARD was presented [35]. To evaluate BRIKARD, Coutsias et al. collected a data set of 67 structures, of which 39 originated from the PDB. BRIKARD was benchmarked against MD/LLMOD, as well as the optimized methods CF-MD/LLMOD and CF-LowModeMD from the work of Chen and Foloppe (see Table 6 for settings). Using all 67 structures, BRIKARD had a median RMSDRING value of 0.47 Å followed by CF-MD/LLMOD (0.54 Å), MD/LLMOD (0.63 Å) and CF-LowModeMD (0.64 Å) (values calculated from supporting information in ref [35]). Thus, BRIKARD seems to reproduce the ring conformation slightly better than MD/LLMOD. However, when comparing the X-ray ring accuracy for the PDB structures only, the differences between the methods were smaller. For the 39 structures originating from the PDB, the median RMSDRING was 0.42 Å for BRIKARD, followed by CF-MD/LLMOD (0.47 Å), MD/LLMOD (0.49 Å), and CF-LowModeMD (0.54 Å) (median values calculated from supporting information in ref [35]).

Wang et al. developed the PLOP method based on 37 macrocycles originating from both the PDB and CSD [36]. In their study, MD/LLMOD and PLOP were compared based on how well the backbone (ring atoms) was reproduced. The optimized protocol of PLOP was able to reproduce the crystal structure within 0.50 Å RMSDRING for 31


out of 37 macrocycles with a median RMSDRING value of 0.25 Å. Wang et al. concluded that the performance of PLOP was not statistically different compared to MD/LLMOD.

Cleves and Jain compared the performance of ForceGen, CF-LowModeMD and CF-MTLMOD using the Chen and Foloppe data set (30 PDB structures) [34]. Applied to those macrocycles, CF-LowModeMD showed equivalent reproduction of the X-ray conformation (all heavy-atoms) compared to ForceGen, whereas CF-MTLMOD showed marginally better results compared to ForceGen.

To summarize, as shown from the studies above, the specialized macrocycle sampling method MD/LLMOD is able to reproduce the X-ray structures accurately, generating better or comparable results to other methods in prior publications. Interestingly, we have shown that by using slightly tweaked versions of the general methods, such as MCMM-Enhanced and MTLMOD-Enhanced, X-rayppw accuracy results comparable to, or even better than, MD/LLMOD may be obtained, at least for the data set and parameters used in this study. Therefore, to further explore the general methods' abilities, including MCMM-Enhanced and MTLMOD-Enhanced in future method comparison studies could be of interest.

Conformational coverage: are we reaching convergence

The absolute degree of conformational space covered in a conformational search is often difficult to describe [72, 73]. In the literature, parameters such as the number of conformations identified, the number of times the lowest energy conformation is visited, the range of compactness/extendedness covered by the conformations as described by the radius of gyration, and the number of visited 3D pharmacophore points have been considered [31, 44]. Full conformational coverage can also be defined as when all possible conformers within a specified energy window have been found. As macrocycles are said to be conformationally restricted, we aimed to explore the conformational space in a more exhaustive way than is typical.

Since the shape of the energy hypersurface is force field dependent, the number of possible low-energy conformers varies between the force fields. Quantum mechanical methods would also most likely change the number of possible low-energy conformers. However, since most drug design projects are carried out in a molecular mechanics force field environment, we were interested to explore how many conformers that energy hypersurface contains. Therefore, we ran the MCMM-Enhanced search with 1,000,000 search steps (MCMM-Exhaustive) for the full data set (44 macrocycles). The results were visualized by plotting the number of search steps against the number of conformations generated within 15 kcal mol−1 from the global energy minimum (Fig. 5a depicts the 10 examples, chosen to represent the structurally diverse range of macrocycles). Our results show that the number of conformers generated increases steadily for all macrocycles, with the exception of 3JRX and 2HFK/2J9M, which are on top of each other. These three macrocycles have relatively small macrocyclic ring systems (12, 14 and 15, respectively) and are not extensively substituted. As these macrocycles have the smallest number of torsional angles, they are therefore expected to have fewer conformations, for example, in comparison with 1S22. This macrocycle contains a much larger ring system, substituted with a large flexible side-chain, thereby increasing the degrees of torsional freedom. Thus, after 1,000,000 search steps, full conformational coverage has, as expected, not been reached for the majority of the ten displayed macrocycles. For the full data set, MCMM produced 155,846 conformers whereas MCMM-Exhaustive identified 7,528,356, corresponding to roughly 48 times more conformers than MCMM.

As expected, when examining the number of generated conformations within a narrower energy window (e.g. 10 and 5 kcal mol−1, in Fig. 5b and c, respectively) the rate of conformer generation decreases for most of the macrocycles. As seen in Fig. 5b, for six out of ten macrocycles (2HFK, 2J9M, 2DG4, 2XBK, 3I6O and 3JRX) the rate of conformer generation decreases. Within 5 kcal mol−1, all but one (1S22) of the ten conformational searches asymptotically approach full coverage (Fig. 5c).

Looking further at the energy distribution for all conformers of the 10 macrocycles, the vast majority of the conformers have a relative energy above 10 kcal mol−1. For a more thorough discussion about the energy distribution, see section “Conformational coverage and conformational energy distribution” in the supporting information.

Energy window for sampling macrocycles

Numerous studies have been performed to understand the conformational energy cost when a ligand binds to its target. Some argue that the acceptable conformational energy penalty is relatively low (below 3 kcal mol−1 [74, 75], mostly below 5 kcal mol−1 [76], between 4 and 6 kcal mol−1 [77], and mostly below 6 kcal mol−1 [78]). However, energies above 9 kcal mol−1 [76], around 15.9 ± 11.5 kcal mol−1 [79], between 0 and 25 kcal mol−1 [80], and even up to 27 kcal mol−1 [81], have been suggested as feasible for protein-bound ligands. For conformational sampling of macrocycles, Chen and Foloppe noticed an improved reproduction of the X-ray conformation for MTLMOD using an increased energy window of up to 15 kcal mol−1 [44]. Also, Alogheli et al. used a 15 kcal mol−1 energy window and found that the conformer closest to the energy minimized X-ray structure averaged around 5 kcal mol−1 from the global minimum with the largest difference being 10.8 kcal mol−1 (almost the same


Fig. 5  Number of generated conformers for the ten macrocycles in the diverse subset within: a 15 kcal mol−1; b 10 kcal mol−1; and c 5 kcal mol−1 from the lowest energy conformer using 1,000,000 search steps in total. The discontinuities in the lines are due to elimination of high energy conformers when a new “global energy minimum” is generated during the search. The line for 1S22 in plot b does not reach 1 million steps because only up to 100,000 conformers within 10 kcal mol−1 are registered in the .log-file

data set as in this study and energies were calculated with OPLS-2005) [40].

Calculating the conformational energy penalty upon binding has been discussed rigorously in the literature and has recently been summarized by Peach [82]. Thus, no attempts at tackling this subject will be made herein. However, as many modeling methods utilize an energy window for generating conformers, the width of this window is of great importance. Therefore, two energy differences were calculated and analyzed. The first was between the global energy minimum and the energy minimized X-rayppw structure, and the second was between the global energy minimum and the MCMM-Exhaustive generated conformer closest to the X-rayppw structure. As mentioned in section "Generating a Conformation Similar to the X-rayppw Conformation", the conformer derived by energy minimizing the X-rayppw conformation should


correspond to the minimum closest to the X-rayppw conformation. Therefore, with the aim of generating conformers closest to the bioactive conformation, the energy difference to this conformation could serve as an upper cut-off value for keeping conformers. The energy difference between the minimized X-rayppw conformation and the global minimum varied between 0 and 13.7 kcal mol−1, with a median value of 5.6 kcal mol−1 (Table S17). Therefore, the energy window of 15 kcal mol−1 used herein seems appropriate. It should be noted that only five out of the 45 macrocycles had energy differences exceeding 10 kcal mol−1.

As previously mentioned, the MCMM-Exhaustive searches were able to generate conformations similar to the X-rayppw structure (< 1 Å) for all but one of the 45 protein-macrocycle complexes, see Tables 7 and S13. When the energy difference between these conformations and the corresponding global energy minimum (generated by any method) was analyzed, it varied between 0 and 21.4 kcal mol−1 with a median of 6.6 kcal mol−1 (see Table S17). The conformation closest to the X-rayppw conformation was found within 5 and 10 kcal mol−1 for 19 and 33 of the macrocycles, respectively.

Considering both energy differences, a 15 kcal mol−1 energy window for keeping conformers seems appropriate. However, there are many cases where an energy window of 10 kcal mol−1, or even 5 kcal mol−1, could be used instead.

Conclusions

The present work has addressed macrocycle conformational sampling. We evaluated the performance of two of the commonly used, general-purpose conformational sampling methods and compared them with two more recent and specialized macrocycle sampling approaches. We also determined that using TNCG as the energy minimization method and combining it with wider ring closure distance settings (0–100 Å) and avoiding ring-opening bonds adjacent to chiral carbons for MCMM and MTLMOD may be used to enhance macrocycle sampling. Moreover, we have shown that generating a starting conformation from a SMILES-string to a 3D-structure, in some cases, might produce a conformation similar to the X-rayppw conformation. Thus, in all studies attempting to reproduce experimental data such as X-ray structures, the structural similarities between the starting conformation and the X-ray conformation should be analysed. Furthermore, as verified in the current study and by others, the stochastic nature of conformational sampling may influence the results. Consequently, we recommend assessing how stochastic elements inherent to a given search method affect these outcomes by either employing different starting conformations or different seeding.

Our comparative analysis of different sampling methods showed that, in most cases, the general conformational search methods (MCMM, MTLMOD) with standard and enhanced settings compared well with the more specialized macrocycle sampling methods (MD/LLMOD and PRIME-MCS). However, if the aim is to generate a large pool of conformers or a conformer close to the X-rayppw structure, any of the general methods could be recommended. The encouraging results of MCMM-Enhanced and MTLMOD-Enhanced suggest that conformational sampling of macrocycles might be manageable when it comes to the generation of conformers close to the X-rayppw conformation. However, if the aim is to identify the global minimum, more than 10,000 steps are required. Of the methods evaluated, the MD/LLMOD method performed the best in generating the global energy minimum.

Acknowledgements  Open access funding provided by Uppsala University. This work was supported by the Swedish Research Council (521–2014-6711). The authors would like to thank Dr Lindon Moodie for constructive criticism of the manuscript. The authors would also like to thank the reviewers for their thoughtful comments and efforts towards improving our manuscript.

Open Access  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Sliwoski G, Kothiwale S, Meiler J, Lowe EW Jr (2014) Computational methods in drug discovery. Pharmacol Rev 66:334–395. https://doi.org/10.1124/pr.112.007336
2. Leach AR, Gillet VJ, Lewis RA, Taylor R (2010) Three-dimensional pharmacophore methods in drug discovery. J Med Chem 53:539–558. https://doi.org/10.1021/jm900817u
3. Nicholls A, McGaughey GB, Sheridan RP et al (2010) Molecular shape and medicinal chemistry: a perspective. J Med Chem 53:3862–3886. https://doi.org/10.1021/jm900818s
4. Chen I-J, Foloppe N (2008) Conformational sampling of druglike molecules with MOE and catalyst: implications for pharmacophore modeling and virtual screening. J Chem Inf Model 48:1773–1791. https://doi.org/10.1021/ci800130k
5. Kitchen DB, Decornez H, Furr JR, Bajorath J (2004) Docking and scoring in virtual screening for drug discovery: methods and applications. Nat Rev Drug Discov 3:935–949. https://doi.org/10.1038/nrd1549
6. Giordanetto F, Kihlberg J (2014) Macrocyclic drugs and clinical candidates: what can medicinal chemists learn from their


properties? J Med Chem 57:278–295. https://doi.org/10.1021/jm400887j
7. Rezai T, Bock JE, Zhou MV et al (2006) Conformational flexibility, internal hydrogen bonding, and passive membrane permeability: successful in silico prediction of the relative permeabilities of cyclic peptides. J Am Chem Soc 128:14073–14080. https://doi.org/10.1021/ja063076p
8. Kolossváry I, Guida WC (1996) Low mode search. An efficient, automated computational method for conformational analysis: application to cyclic and acyclic alkanes and cyclic peptides. J Am Chem Soc 118:5011–5019. https://doi.org/10.1021/ja952478m
9. Karlén A, Johansson AM, Arvidsson LE et al (1986) Conformational analysis of the dopamine-receptor agonist 5-hydroxy-2-(dipropylamino)tetralin and its C(2)-methyl-substituted derivative. J Med Chem 29:917–924. https://doi.org/10.1021/jm00156a008
10. Blundell CD, Packer MJ, Almond A (2013) Quantification of free ligand conformational preferences by NMR and their relationship to the bioactive conformation. Bioorg Med Chem 21:4976–4987. https://doi.org/10.1016/j.bmc.2013.06.056
11. Bell IM, Gallicchio SN, Abrams M et al (2002) 3-Aminopyrrolidinone farnesyltransferase inhibitors: design of macrocyclic compounds with improved pharmacokinetics and excellent cell potency. J Med Chem 45:2388–2409. https://doi.org/10.1021/jm010531d
12. Wlodek S, Skillman AG, Nicholls A (2006) Automated ligand placement and refinement with a combined force field and shape potential. Acta Crystallogr Sect D 62:741–749. https://doi.org/10.1107/S0907444906016076
13. Driggers EM, Hale SP, Lee J, Terrett NK (2008) The exploration of macrocycles for drug discovery - an underexploited structural class. Nat Rev Drug Discov 7:608–624. https://doi.org/10.1038/nrd2590
14. Wessjohann LA, Ruijter E, Garcia-Rivera D, Brandt W (2005) What can a chemist learn from nature’s macrocycles? - A brief, conceptual view. Mol Divers 9:171–186. https://doi.org/10.1007/s11030-005-1314-x
15. Mallinson J, Collins I (2012) Macrocycles in new drug discovery. Future Med Chem 4:1409–1438. https://doi.org/10.4155/fmc.12.93
16. Marsault E, Peterson ML (2011) Macrocycles are great cycles: applications, opportunities, and challenges of synthetic macrocycles in drug discovery. J Med Chem 54:1961–2004. https://doi.org/10.1021/jm1012374
17. Doak BC, Zheng J, Dobritzsch D, Kihlberg J (2016) How beyond rule of 5 drugs and clinical candidates bind to their targets. J Med Chem 59:2312–2327. https://doi.org/10.1021/acs.jmedchem.5b01286
18. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ (2001) Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev 46:3–26. https://doi.org/10.1016/S0169-409X(00)00129-0
19. Villar EA, Beglov D, Chennamadhavuni S et al (2014) How proteins bind macrocycles. Nat Chem Biol 10:723–731. https://doi.org/10.1038/nchembio.1584
20. Dougherty PG, Qian Z, Pei D (2017) Macrocycles as protein–protein interaction inhibitors. Biochem J 474:1109–1125. https://doi.org/10.1042/BCJ20160619
21. Gao M, Cheng K, Yin H (2015) Targeting protein–protein interfaces using macrocyclic peptides. Biopolymers 104:310–316. https://doi.org/10.1002/bip.22625
22. Gavenonis J, Sheneman BA, Siegert TR et al (2014) Comprehensive analysis of loops at protein–protein interfaces for macrocycle design. Nat Chem Biol 10:716–722. https://doi.org/10.1038/nchembio.1580
23. Madsen CM, Clausen MH (2011) Biologically active macrocyclic compounds - from natural products to diversity-oriented synthesis. Eur J Org Chem 2011:3107–3115. https://doi.org/10.1002/ejoc.201001715
24. Beckmann HSG, Nie F, Hagerman CE et al (2013) A strategy for the diversity-oriented synthesis of macrocyclic scaffolds using multidimensional coupling. Nat Chem 5:861–867. https://doi.org/10.1038/nchem.1729
25. Yu X, Sun D (2013) Macrocyclic drugs and synthetic methodologies toward macrocycles. Molecules 18:6230–6268. https://doi.org/10.3390/molecules18066230
26. Poulsen A, William A, Blanchard S et al (2013) Structure-based design of nitrogen-linked macrocyclic kinase inhibitors leading to the clinical candidate SB1317/TG02, a potent inhibitor of cyclin dependant kinases (CDKs), Janus kinase 2 (JAK2), and Fms-like tyrosine kinase-3 (FLT3). J Mol Model 19:119–130. https://doi.org/10.1007/s00894-012-1528-7
27. Bowers AA, Greshook TJ, West N et al (2009) Synthesis and conformation-activity relationships of the peptide isosteres of FK228 and largazole. J Am Chem Soc 131:2900–2905. https://doi.org/10.1021/ja807772w
28. Saunders M, Houk KN, Wu YD et al (1990) Conformations of cycloheptadecane. A comparison of methods for conformational searching. J Am Chem Soc 112:1419–1427. https://doi.org/10.1021/ja00160a020
29. Kolossváry I, Keserü GM (2001) Hessian-free low-mode conformational search for large-scale protein loop optimization: application to c-jun N-terminal kinase JNK3. J Comput Chem 22:21–30. https://doi.org/10.1002/1096-987X(20010115)22:1%3c21:AID-JCC3%3e3.0.CO;2-I
30. Parish C, Lombardi R, Sinclair K et al (2002) A comparison of the low mode and monte carlo conformational search methods. J Mol Graph Model 21:129–150. https://doi.org/10.1016/S1093-3263(02)00144-4
31. Labute P (2010) LowModeMD - implicit low-mode velocity filtering applied to conformational search of macrocycles and protein loops. J Chem Inf Model 50:792–800. https://doi.org/10.1021/ci900508k
32. Watts KS, Dalal P, Tebben AJ et al (2014) Macrocycle conformational sampling with macromodel. J Chem Inf Model 54:2680–2696. https://doi.org/10.1021/ci5001696
33. Sindhikara D, Spronk SA, Day T et al (2017) Improving accuracy, diversity, and speed with prime macrocycle conformational sampling. J Chem Inf Model 57:1881–1894. https://doi.org/10.1021/acs.jcim.7b00052
34. Cleves AE, Jain AN (2017) ForceGen 3D structure and conformer generation: from small lead-like molecules to macrocyclic drugs. J Comput Aided Mol Des 31:419–439. https://doi.org/10.1007/s10822-017-0015-8
35. Coutsias EA, Lexa KW, Wester MJ et al (2016) Exhaustive conformational sampling of complex fused ring macrocycles using inverse kinematics. J Chem Theory Comput 12:4674–4687. https://doi.org/10.1021/acs.jctc.6b00250
36. Wang Q, Sciabola S, Barreiro G et al (2016) Dihedral angle-based sampling of natural product polyketide conformations: application to permeability prediction. J Chem Inf Model 56:2194–2206. https://doi.org/10.1021/acs.jcim.6b00237
37. Gutten O, Bím D, Řezáč J, Rulíšek L (2018) Macrocycle conformational sampling by DFT-D3/COSMO-RS methodology. J Chem Inf Model 58:48–60. https://doi.org/10.1021/acs.jcim.7b00453
38. Friedrich NO, Flachsenberg F, Meyder A et al (2019) Conformator: a novel method for the generation of conformer ensembles. J Chem Inf Model 59:731–742. https://doi.org/10.1021/acs.jcim.8b00704


39. Hawkins PCD (2017) Conformation generation: the state of the art. J Chem Inf Model 57:1747–1756. https://doi.org/10.1021/acs.jcim.7b00221
40. Alogheli H, Olanders G, Schaal W et al (2017) Docking of macrocycles: comparing rigid and flexible docking in glide. J Chem Inf Model 57:190–202. https://doi.org/10.1021/acs.jcim.6b00443
41. Chang G, Guida WC, Still WC (1989) An internal coordinate monte carlo method for searching conformational space. J Am Chem Soc 111:4379–4386. https://doi.org/10.1021/ja00194a035
42. Ferguson DM, Raber DJ (1989) A new approach to probing conformational space with molecular mechanics: random incremental pulse search. J Am Chem Soc 111:4371–4378. https://doi.org/10.1021/ja00194a034
43. 2018 Chemical Computing Group ULC MOE User Guide, Generating and Analyzing Conformations, Stochastic Search. In: MOE 2018.01
44. Chen I-J, Foloppe N (2013) Tackling the conformational sampling of larger flexible compounds and macrocycles in pharmacology and drug discovery. Bioorg Med Chem 21:7898–7920. https://doi.org/10.1016/j.bmc.2013.10.003
45. Small-Molecule Drug Discovery Suite 2017–1, Schrödinger, LLC, New York, NY, 2017.
46. Harder E, Damm W, Maple J et al (2016) OPLS3: a force field providing broad coverage of drug-like small molecules and proteins. J Chem Theory Comput 12:281–296. https://doi.org/10.1021/acs.jctc.5b00864
47. Still WC, Tempczyk A, Hawley RC, Hendrickson T (1990) Semianalytical treatment of solvation for molecular mechanics and dynamics. J Am Chem Soc 112:6127–6129. https://doi.org/10.1021/ja00172a038
48. Python Software Foundation. Python Language Reference, version 3.6.5. Available at https://www.python.org
49. The R Project for Statistical Computing. https://www.r-project.org/
50. Microsoft Office PowerPoint. Computer software. Vers. 2010. Microsoft Corporation, 2010.
51. SIMCA® 14, part of the Umetrics™ Suite of Data Analytics Solutions, from Sartorius Stedim Data Analytics
52. The PyMOL Molecular Graphics System, Version 2.0, Schrödinger, LLC.
53. Berman HM, Westbrook J, Feng Z et al (2000) The protein data bank. Nucleic Acids Res 28:235–242. https://doi.org/10.1093/nar/28.1.235
54. RCSB Protein Data Bank. https://www.rcsb.org/ (Accessed January 2017).
55. Release S, 2017–1: Schrödinger Suite 2017–1 Protein preparation wizard; Epik, Schrödinger, LLC, New York, NY, (2016) Impact, Schrödinger, LLC, New York, NY, 2016. Prime, Schrödinger, LLC, New York, NY, p 2017
56. Sastry GM, Adzhigirey M, Day T et al (2013) Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J Comput Aided Mol Des 27:221–234. https://doi.org/10.1007/s10822-013-9644-8
57. Schrödinger Release 2017–1: Maestro, Schrödinger, LLC, New York, NY, 2017.
58. Schrödinger Release 2017–1: Prime, Schrödinger, LLC, New York, NY, 2017.
59. Jacobson MP, Pincus DL, Rapp CS et al (2004) A hierarchical approach to all-atom protein loop prediction. Proteins Struct Funct Genet 55:351–367. https://doi.org/10.1002/prot.10613
60. Jacobson MP, Friesner RA, Xiang Z, Honig B (2002) On the role of the crystal environment in determining protein side-chain conformations. J Mol Biol 320:597–608. https://doi.org/10.1016/S0022-2836(02)00470-9
61. Schrödinger Release 2017–1: MacroModel, Schrödinger, LLC, New York, NY, 2017.
62. Ponder JW, Richards FM (1987) An efficient newton-like method for molecular mechanics energy minimization of large molecules. J Comput Chem 8:1016–1024. https://doi.org/10.1002/jcc.540080710
63. Schrödinger. MacroModel Command Reference Manual; New York, NY, 2017.
64. Beusen DD, Shands EFB, Karasek SF et al (1996) Systematic search in conformational analysis. J Mol Struct THEOCHEM 370:157–171. https://doi.org/10.1016/S0166-1280(96)04565-4
65. Allen FH (2002) The cambridge structural database: a quarter of a million crystal structures and rising. Acta Crystallogr Sect B 58:380–388. https://doi.org/10.1107/S0108768102003890
66. Schrödinger: SiteMap, Schrödinger, LLC, New York, NY.
67. Kolossváry I, Guida WC (1999) Low-mode conformational search elucidated: application to C39H80 and flexible docking of 9-deazaguanine inhibitors into PNP. J Comput Chem 20:1671–1684. https://doi.org/10.1002/(SICI)1096-987X(19991130)20:15%3c1671:AID-JCC7%3e3.3.CO;2-P
68. Davis AM, Teague SJ, Kleywegt GJ (2003) Application and limitations of x-ray crystallographic data in structure-based ligand and drug design. Angew Chemie Int Ed 42:2718–2736. https://doi.org/10.1002/anie.200200539
69. Deller MC, Rupp B (2015) Models of protein-ligand crystal structures: trust, but verify. J Comput Aided Mol Des 29:817–836. https://doi.org/10.1007/s10822-015-9833-8
70. Liebeschuetz J, Hennemann J, Olsson T, Groom CR (2012) The good, the bad and the twisted: a survey of ligand geometry in protein crystal structures. J Comput Aided Mol Des 26:169–183. https://doi.org/10.1007/s10822-011-9538-6
71. Kleywegt GJ (2006) Crystallographic refinement of ligand complexes. Acta Crystallogr Sect D 63:94–100. https://doi.org/10.1107/S0907444906022657
72. Borodina YV, Bolton E, Fontaine F, Bryant SH (2007) Assessment of conformational ensemble sizes necessary for specific resolutions of coverage of conformational space. J Chem Inf Model 47:1428–1437. https://doi.org/10.1021/ci7000956
73. Smellie A, Kahn SD, Teig SL (1995) Analysis of conformational coverage. 1. Validation and estimation of coverage. J Chem Inf Comput Sci 35:285–294. https://doi.org/10.1021/ci00024a018
74. Boström J, Greenwood JR, Gottfries J (2003) Assessing the performance of OMEGA with respect to retrieving bioactive conformations. J Mol Graph Model 21:449–462. https://doi.org/10.1016/S1093-3263(02)00204-8
75. Boström J, Norrby P-O, Liljefors T (1998) Conformational energy penalties of protein-bound ligands. J Comput Aided Mol Des 12:383–396. https://doi.org/10.1023/A:1008007507641
76. Perola E, Charifson PS (2004) Conformational analysis of drug-like molecules bound to proteins: an extensive study of ligand reorganization upon binding. J Med Chem 47:2499–2510. https://doi.org/10.1021/jm030563w
77. Avgy-David HH, Senderowitz H (2015) Toward focusing conformational ensembles on bioactive conformations: a molecular mechanics/quantum mechanics study. J Chem Inf Model 55:2154–2167. https://doi.org/10.1021/acs.jcim.5b00259
78. Foloppe N, Chen I-J (2016) Towards understanding the unbound state of drug compounds: implications for the intramolecular reorganization energy upon binding. Bioorg Med Chem 24:2159–2189. https://doi.org/10.1016/j.bmc.2016.03.022
79. Nicklaus MC, Wang S, Driscoll JS, Milne GWA (1995) Conformational changes of small molecules binding to proteins. Bioorg Med Chem 3:411–428. https://doi.org/10.1016/0968-0896(95)00031-B
80. Sitzmann M, Weidlich IE, Filippov IV et al (2012) PDB ligand conformational energies calculated quantum-mechanically. J Chem Inf Model 52:739–756. https://doi.org/10.1021/ci200595n


81. Wembridge P, Robinson H, Novak I (2008) Computational study of ligand binding to protein receptors. Bioorg Chem 36:288–294. https://doi.org/10.1016/j.bioorg.2008.08.001
82. Peach ML, Cachau RE, Nicklaus MC (2017) Conformational energy range of ligands in protein crystal structures: the difficult quest for accurate understanding. J Mol Recogn 30:1–14. https://doi.org/10.1002/jmr.2618
83. Instant JChem 15.9.14.0, ChemAxon. https://www.chemaxon.com/

Publisher’s Note  Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Affiliations

Gustav Olanders1 · Hiba Alogheli1 · Peter Brandt2 · Anders Karlén1

* Anders Karlén
[email protected]

1 Department of Medicinal Chemistry, Uppsala University, BMC, Box 574, 751 23 Uppsala, Sweden
2 Present Address: Medicinal Chemistry, Research and Early Development Cardiovascular, Renal and Metabolism, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
