Morris 2009 Autodockv4
Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037
Department of Cognitive Science, University of California, San Diego, La Jolla, California 92093
Received 10 December 2008; Revised 27 January 2009; Accepted 3 February 2009
DOI 10.1002/jcc.21256
Published online 27 April 2009 in Wiley InterScience (www.interscience.wiley.com).
Abstract: We describe the testing and release of AutoDock4 and the accompanying graphical user interface AutoDockTools. AutoDock4 incorporates limited flexibility in the receptor. Several tests are reported here, including a
redocking experiment with 188 diverse ligand-protein complexes and a cross-docking experiment using flexible sidechains in 87 HIV protease complexes. We also report its utility in analysis of covalently bound ligands, using both a
grid-based docking method and a modification of the flexible sidechain technique.
© 2009 Wiley Periodicals, Inc.
Key words: AutoDock; computational docking; protein flexibility; covalent ligands; computer-aided drug design
Introduction
Automated docking is widely used for the prediction of biomolecular complexes in structure/function analysis and in molecular
design. Dozens of effective methods are available, incorporating
different trade-offs in molecular representation, energy evaluation, and conformational sampling to provide predictions with a
reasonable computational effort.1-8 AutoDock combines an empirical free energy force field with a Lamarckian Genetic Algorithm, providing fast prediction of bound conformations with
predicted free energies of association.9
In our hands, AutoDock3 has proven to be effective in
roughly half of the complexes that we have studied. The remaining half show significant motion of the receptor upon binding,
and thus have required a more sophisticated model of motion in
the receptor, typically performed outside of AutoDock3. The
new version of AutoDock described here, AutoDock4, incorporates explicit conformational modeling of specified sidechains
in the receptor to address this problem. This capability also provides an effective method for analysis of covalently attached
ligands.
Methods
Overview of AutoDock4
Since its release in 1990,10 AutoDock has proven to be an effective tool capable of quickly and accurately predicting bound
conformations and binding energies of ligands with macromolecular targets.9,11-14 To allow searching of the large conformational space available to a ligand around a protein, AutoDock
uses a grid-based method to allow rapid evaluation of the binding energy of trial conformations. In this method, the target protein is embedded in a grid. Then, a probe atom is sequentially
placed at each grid point, the interaction energy between the
probe and the target is computed, and the value is stored in the
grid. This grid of energies may then be used as a lookup table
during the docking simulation.
The primary method for conformational searching is a
Lamarckian genetic algorithm, described fully in Morris et al.9
A population of trial conformations is created, and then in successive generations these individuals mutate, exchange conformational parameters, and compete in a manner analogous to biological evolution, ultimately selecting individuals with lowest
binding energy. The Lamarckian aspect is an added feature
that allows individual conformations to search their local conformational space, finding local minima, and then pass this information to later generations. A simulated annealing search
method and a traditional genetic algorithm search method are
also available in AutoDock4.
AutoDock4 uses a semiempirical free energy force field to
predict binding free energies of small molecules to macromolecular targets. Development and testing of the force field has been
described elsewhere.11 The force field is based on a comprehensive thermodynamic model that allows incorporation of intramolecular energies into the predicted free energy of binding. This
is performed by evaluating energies for both the bound and
unbound states. It also incorporates a new charge-based desolvation method that uses a typical set of atom types and charges.
The method has been calibrated on a set of 188 diverse protein-ligand
complexes of known structure and binding energy, showing a standard error of about 2-3 kcal/mol in prediction of binding free energy in cross-validation studies.
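As a rough illustration of the search strategy described in this overview, the following minimal Python sketch runs a Lamarckian genetic algorithm on a toy two-parameter energy function. The energy function, population size, mutation rate, and the simple hill-climbing local search are all assumptions chosen for illustration; they are not AutoDock4's force field, default parameters, or local search implementation.

```python
import random

def toy_energy(x):
    """Hypothetical stand-in for a docking energy: minimum of 0.0 at (1.0, -2.0)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

def local_search(x, energy, rng, steps=20, step_size=0.1):
    """Simple hill climbing: keep random perturbations that lower the energy."""
    best, best_e = list(x), energy(x)
    for _ in range(steps):
        trial = [v + rng.gauss(0.0, step_size) for v in best]
        trial_e = energy(trial)
        if trial_e < best_e:
            best, best_e = trial, trial_e
    return best, best_e

def lamarckian_ga(energy, n_params=2, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(n_params)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Lamarckian step: each individual is replaced by its locally
        # optimized form, so the improvement is inherited by its offspring.
        scored = sorted((local_search(ind, energy, rng) for ind in pop),
                        key=lambda pair: pair[1])
        parents = [ind for ind, _ in scored[: pop_size // 2]]  # selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(genes) for genes in zip(a, b)]  # crossover
            if rng.random() < 0.2:                              # mutation
                i = rng.randrange(n_params)
                child[i] += rng.gauss(0.0, 1.0)
            children.append(child)
        pop = parents + children
    best = min(pop, key=energy)
    return best, energy(best)

best_params, best_energy = lamarckian_ga(toy_energy)
print(best_params, best_energy)
```

In a docking setting the parameter vector would encode ligand position, orientation, and torsion angles, and the energy would come from the force field rather than a quadratic bowl; the control flow is the part this sketch is meant to convey.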
Receptor Flexibility
DOI 10.1002/jcc
Figure 1. Redocking results. Results of redocking of 188 diverse ligand-protein complexes. Open
squares represent complexes where the docked conformation with best predicted energy was less than
3.5 Å RMSD relative to the experimentally observed conformation. Dots are complexes where the best
docked conformation was greater than 3.5 Å RMSD from the experimental structure.
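The success criterion used in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, assuming both conformations list the same atoms in the same order and are already in the receptor's coordinate frame (no superposition and no handling of symmetric atoms); the coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((a - b) ** 2
                  for atom_a, atom_b in zip(coords_a, coords_b)
                  for a, b in zip(atom_a, atom_b))
    return math.sqrt(squared / len(coords_a))

def is_successful_redocking(docked, crystal, cutoff=3.5):
    """Apply the 3.5 Å threshold from the Figure 1 caption."""
    return rmsd(docked, crystal) < cutoff

# Hypothetical three-atom ligand: the docked pose is shifted 1 Å along z.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(x, y, z + 1.0) for x, y, z in crystal]
far_pose = [(x, y, z + 4.0) for x, y, z in crystal]

docked_rmsd = rmsd(docked, crystal)
print(docked_rmsd, is_successful_redocking(docked, crystal))
```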
near the root), the other adds torsions progressively from the
leaves, moving the fewest number of atoms and leaving the core
of the molecule rigid.
Validation Data Sets
Two sets of complexes were used for the validation of AutoDock4, both of which have been described previously.11 A set
of 188 diverse protein-ligand complexes was taken from the
Ligand-Protein Database (http://lpdb.scripps.edu) and a set of 87
HIV protease complexes was taken from the PDBBind database (http://www.pdbbind.org). All coordinates were checked
manually for the proper biological unit and consistency in naming schemes. Hydrogen atoms and charges were added in AutoDockTools, using Babel for hydrogens and the Gasteiger
PEOE18 method for charges. Several misassigned charges were
modified manually, as described in the previous report.
The 87 HIV protease complexes were aligned to allow easy
comparison of docked conformations during cross dockings. An
analysis of steric clashes was performed by swapping ligands
within the set of 87 aligned complexes. ARG8 showed the largest number of bad contacts, with 877 cases where a ligand atom
contacted a sidechain atom with less than 2 Å distance. Other
contacts occurred with ILE50 (140 cases), PRO81 (95 cases),
and PHE82 (88 cases, in two V82F mutants).
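The clash analysis above amounts to counting close ligand-sidechain atom pairs per residue. The following Python sketch shows one way this could be done; the data layout, the choice to count every atom pair, and the coordinates are assumptions for illustration, not the procedure actually used to obtain the numbers reported here.

```python
import math
from collections import Counter

def count_bad_contacts(ligand_atoms, sidechain_atoms, cutoff=2.0):
    """Count ligand/sidechain atom pairs closer than `cutoff` Å, per residue.

    ligand_atoms:    list of (x, y, z) tuples
    sidechain_atoms: list of (residue_label, (x, y, z)) tuples
    """
    counts = Counter()
    for lig_xyz in ligand_atoms:
        for residue, sc_xyz in sidechain_atoms:
            if math.dist(lig_xyz, sc_xyz) < cutoff:
                counts[residue] += 1
    return counts

# Hypothetical coordinates for a swapped-in ligand and three sidechains.
ligand = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
sidechains = [
    ("ARG8", (1.0, 0.0, 0.0)),
    ("ARG8", (0.0, 1.5, 0.0)),
    ("ILE50", (5.0, 0.0, 1.9)),
    ("PRO81", (10.0, 10.0, 10.0)),
]

contacts = count_bad_contacts(ligand, sidechains)
print(contacts.most_common())
```

Summing such counts over every ligand placed into every aligned protease structure would give per-residue totals of the kind quoted in the text.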
Docking Experiments
Docking experiments were performed with AutoDock4 and compared with docking experiments with AutoDock3. For each complex, 50 docking experiments were performed using the
Lamarckian genetic algorithm with the default parameters from
AutoDock3. A maximum of 25 million energy evaluations was
Figure 2. Cross docking results. (A) Difference in energy between rigid docking and flexible docking.
White points are docking experiments where the flexible docking showed 20 kcal/mol more favorable
predicted free energy of docking and black points are docking experiments where the rigid docking
showed similar or worse free energy of docking. Inhibitors are ordered from small to large, with the
cyclic urea inhibitors separated at the bottom. The protease structures are ordered similarly. (B) Cross
docking with rigid protease structures. Each point is colored by the RMSD of ligand atoms from the
crystallographic structure, with RMSD = 0 Å in white and RMSD > 5 Å in black. (C) Cross docking
with ARG8 treated as flexible in the protease. Each point is colored with the same scale as in (B).
Figure 3. Results of covalent docking. (A) Using a Gaussian map centered on serine OG. The crystallographic structure is shown in large bonds and the best docked conformation is shown in thinner
bonds. The blue sphere surrounds the region of most favorable energy in the Gaussian map. (B) Using
a Gaussian map centered on serine CB. (C) Using two Gaussian maps. (D) Using a flexible sidechain
to model the covalent ligand.
Conclusions
Dependence on grid-based energy evaluation is a major limitation of AutoDock4. It is required to allow rapid evaluation of
binding energies during the docking simulation, but it places a
severe restriction on the representation of the target macromolecule: all of the atoms included in the grid must be treated as
rigid. The off-grid modeling of specific sidechains is a method
for incorporating limited flexibility within this paradigm, and the
results presented here show that it will be effective in some
cases. However, adding flexibility presents several problems: (1)
the calculation of the receptor energy is more computationally
intensive since flexible regions must be evaluated by a full pairwise energy evaluation, and (2) the conformational space is
larger, and hence, there is more potential for false positives.
Future solutions to these problems will require advances in energetic functions, simplifying and/or speeding up pairwise
approaches, and use of hierarchical approaches19,20 that allow
different levels of sophistication in a docking simulation.
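The trade-off discussed above can be made concrete with a small Python sketch: rigid receptor atoms are folded into a precomputed grid that is consulted by lookup, while flexible sidechain atoms are evaluated pairwise for every trial pose. The 12-6 pair potential, its parameters, the nearest-grid-point lookup (used here in place of interpolation), and all coordinates are assumptions made for illustration and do not reproduce AutoDock4's force field or grid code.

```python
import math

def pair_energy(a, b, eps=0.2, rmin=3.5):
    """Toy 12-6 Lennard-Jones term (hypothetical parameters)."""
    r = max(math.dist(a, b), 0.5)  # clamp to avoid division by zero
    ratio = rmin / r
    return eps * (ratio ** 12 - 2.0 * ratio ** 6)

def build_grid(rigid_atoms, origin, spacing, n):
    """Precompute the probe energy against all rigid atoms at n**3 grid points."""
    grid = {}
    for i in range(n):
        for j in range(n):
            for k in range(n):
                p = (origin[0] + i * spacing,
                     origin[1] + j * spacing,
                     origin[2] + k * spacing)
                grid[(i, j, k)] = sum(pair_energy(p, atom) for atom in rigid_atoms)
    return grid

def grid_lookup(grid, point, origin, spacing, n):
    """Nearest-grid-point lookup; cost does not depend on the receptor size."""
    idx = tuple(min(max(round((point[d] - origin[d]) / spacing), 0), n - 1)
                for d in range(3))
    return grid[idx]

def ligand_energy(ligand, grid, flexible_atoms, origin, spacing, n):
    # Rigid receptor: one table lookup per ligand atom.
    rigid_part = sum(grid_lookup(grid, p, origin, spacing, n) for p in ligand)
    # Flexible sidechain: full pairwise sum, repeated for every trial pose.
    flexible_part = sum(pair_energy(p, f) for p in ligand for f in flexible_atoms)
    return rigid_part + flexible_part

# Hypothetical system: two rigid atoms, one flexible atom, one ligand atom
# that happens to sit exactly on a grid point.
rigid = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
flexible = [(2.0, 4.0, 0.0)]
ligand = [(2.0, 1.0, 0.0)]
origin, spacing, n = (-2.0, -2.0, -2.0), 1.0, 9

grid = build_grid(rigid, origin, spacing, n)
mixed = ligand_energy(ligand, grid, flexible, origin, spacing, n)
direct = sum(pair_energy(p, a) for p in ligand for a in rigid + flexible)
print(mixed, direct)
```

Because the ligand atom lies on a grid point, the mixed grid-plus-pairwise energy matches the fully pairwise value; off-grid positions would differ by the lookup error, which a real implementation reduces by interpolating between grid points.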
Acknowledgments
This is manuscript 19885 from the Scripps Research Institute.
References
1. Leach, A. R.; Shoichet, B. K.; Peishoff, C. E. J Med Chem 2006,
49, 5851.
2. Coupez, B.; Lewis, R. A. Curr Med Chem 2006, 13, 2995.
3. Sousa, S. F.; Fernandes, P. A.; Ramos, M. J. Proteins 2006, 65, 15.
4. Mohan, V.; Gibbs, A. C.; Cummings, M. D.; Jaeger, E. P.;
DesJarlais, R. L. Curr Pharm Des 2005, 11, 323.
5. Kitchen, D. B.; Decornez, H.; Furr, J. R.; Bajorath, J. Nat Rev Drug
Discov 2004, 3, 935.
6. Brooijmans, N.; Kuntz, I. D. Annu Rev Biophys Biomol Struct
2003, 32, 335.
7. Taylor, R. D.; Jewsbury, P. J.; Essex, J. W. J Comput Aided Mol
Design 2002, 16, 151.
8. Halperin, I.; Ma, B.; Wolfson, H.; Nussinov, R. Proteins: Struct
Funct Genet 2002, 47, 409.
9. Morris, G. M.; Goodsell, D. S.; Halliday, R. S.; Huey, R.; Hart, W.
E.; Belew, R. K.; Olson, A. J. J Comput Chem 1998, 19, 1639.
10. Goodsell, D. S.; Olson, A. J. Proteins: Struct Funct Genet 1990, 8,
195.
11. Huey, R.; Morris, G. M.; Olson, A. J.; Goodsell, D. S. J Comput
Chem 2007, 28, 1145.
12. Osterberg, F.; Morris, G. M.; Sanner, M. F.; Olson, A. J.; Goodsell,
D. S. Proteins: Struct Funct Genet 2002, 46, 34.
13. Morris, G. M.; Goodsell, D. S.; Huey, R.; Olson, A. J. J Comput
Aided Mol Design 1996, 10, 293.
14. Goodsell, D. S.; Morris, G. M.; Olson, A. J. J Mol Recogn 1996, 9,
1.
15. Lutz, M.; Asher, D. Learning Python; O'Reilly & Associates: Sebastopol, CA, 1999.
16. Sanner, M. F. J Mol Graphics Mod 1999, 17, 57.
17. Sanner, M. F. Structure 2005, 13, 447.
18. Gasteiger, J.; Marsili, M. Tetrahedron 1980, 36, 3219.
19. Zhao, Y.; Stoffler, D.; Sanner, M. F. Bioinformatics 2006, 22, 2768.
20. Zhao, Y.; Sanner, M. F. J Comput Aided Mol Des 2008, 22, 673.