Atikah
AutoDock is a suite of automated docking tools. It is designed to predict how small molecules, such as
substrates or drug candidates, bind to a receptor of known 3D structure.
Current distributions of AutoDock consist of two generations of software: AutoDock 4 and AutoDock
Vina.
AutoDock 4 actually consists of two main programs: autodock performs the docking of the ligand to a
set of grids describing the target protein; autogrid pre-calculates these grids.
In addition to using them for docking, the atomic affinity grids can be visualised. This can help, for
example, to guide organic synthetic chemists in designing better binders.
AutoDock Vina does not require choosing atom types and pre-calculating grid maps for them. Instead,
it calculates the grids internally, for the atom types that are needed, and it does this virtually instantly.
We have also developed a graphical user interface called AutoDockTools, or ADT for short, which
amongst other things helps to set up which bonds will be treated as rotatable in the ligand and to analyze
dockings.
AutoDock has applications in:
• X-ray crystallography;
• structure-based drug design;
• lead optimization;
• virtual screening (HTS);
• combinatorial library design;
• protein-protein docking;
• chemical mechanism studies.
AutoDock 4 is free and is available under the GNU General Public License. AutoDock Vina is
available under the Apache license, allowing commercial and non-commercial use and redistribution.
Click on the "Downloads" tab. And Happy Docking!
Because the scoring functions used by AutoDock 4 and AutoDock Vina are different and inexact, on
any given problem, either program may provide a better result.
Detailed information can be found on the AutoDock Vina web site.
What's new?
June 1, 2009
AutoDock 4.2 is faster than earlier versions, and it allows sidechains in the macromolecule to be
flexible. As before, rigid docking is blindingly fast, and high-quality flexible docking can be done in
around a minute. Up to 40,000 rigid dockings can be done in a day on one CPU.
AutoDock 4.2 now has a free-energy scoring function that is based on a linear regression analysis, the
AMBER force field, and an even larger set of diverse protein-ligand complexes with known inhibition
constants than we used in AutoDock 3.0. The best model was cross-validated with a separate set of
HIV-1 protease complexes, which confirmed that the standard error is around 2.5 kcal/mol. This is enough
to discriminate between leads with milli-, micro- and nano-molar inhibition constants.
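To see why an error of about 2.5 kcal/mol can still separate these classes, the standard relation ΔG = RT ln(Ki) can be evaluated for the three concentration ranges. The sketch below is only an illustration: the temperature of 298.15 K and the function name are assumptions, not something stated by the AutoDock authors.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
T_ASSUMED = 298.15    # assumed temperature in kelvin (not given in the text)

def delta_g_from_ki(ki_molar, temperature=T_ASSUMED):
    """Binding free energy in kcal/mol for an inhibition constant in mol/L."""
    return R_KCAL * temperature * math.log(ki_molar)

# Compare the three classes of inhibition constants named in the text.
for label, ki in [("1 mM", 1e-3), ("1 uM", 1e-6), ("1 nM", 1e-9)]:
    print(f"{label}: {delta_g_from_ki(ki):7.2f} kcal/mol")

# Each 1000-fold change in Ki shifts the free energy by the same amount.
step = delta_g_from_ki(1e-3) - delta_g_from_ki(1e-6)
print(f"spacing per 1000-fold change in Ki: {step:.2f} kcal/mol")
```

Under this assumption each 1000-fold step in Ki corresponds to roughly 4 kcal/mol, which is larger than the quoted standard error.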
You can read more about the new features in AutoDock 4.2 and how to use them in the AutoDock4.2
User Guide.
Ligplot
Dear Vishal,
Most docking programs use scoring functions to estimate the probable binding affinity between a protein and its ligands. For instance,
if you are using AutoDock Vina, you will get RMSD values for each ligand pose. The lowest binding energy indicates the
best affinity, and that pose has an RMSD value of 0. For more detail, go through this article
(http://www.ncbi.nlm.nih.gov/pubmed/19499576).
Moreover, docking results can be visualized in PyMOL to inspect atomic binding distances, bonds, and which amino
acids or nucleotides are involved in the interaction between protein and ligand.
Good luck
Please enlighten me about docking. Some months ago I read a paper showing that, across several available docking programs,
the results look like the output of a random number generator. I'm not sure if there is some program which provides systematically coherent
results for several problems... I would like to see how several probes dock to the same protein and also how the same
probe docks to several proteins. Has such a study been published? If not: boys, start doing it!
Open the .dlg file in a text editor and search for RMSD TABLE in it,
as shown below:
RMSD TABLE
Rank |Sub- Rank|Run|Binding-Energy|Cluster-RMSD|Reference-RMSD|Grep-Pattern
1 1 45 -6.24 0.00 83.60 RANKING
1 2 25 -5.50 1.11 83.60 RANKING
1 3 20 -5.36 1.53 84.30 RANKING
2 1 4 -4.03 0.00 90.64 RANKING
2 2 23 -4.03 0.26 90.62 RANKING
2 3 2 -4.00 0.21 90.62 RANKING
2 4 42 -3.98 0.38 90.70 RANKING
2 5 33 -3.96 0.23 90.60 RANKING
2 6 27 -3.96 0.26 90.45 RANKING
2 7 18 -3.96 0.25 90.69 RANKING
2 8 12 -3.96 0.41 90.59 RANKING
2 9 31 -3.94 0.24 90.48 RANKING
2 10 47 -3.94 0.63 90.38 RANKING
2 11 5 -3.93 0.64 90.62 RANKING
2 12 28 -3.93 0.19 90.54 RANKING
2 13 10 -3.93 0.53 90.68 RANKING
2 14 48 -3.93 0.54 90.72 RANKING
2 15 26 -3.92 0.50 90.74 RANKING
2 16 13 -3.92 0.61 90.63 RANKING
2 17 15 -3.91 0.54 90.61 RANKING
2 18 29 -3.91 0.62 90.67 RANKING
2 19 17 -3.91 0.40 90.64 RANKING
2 20 43 -3.90 0.70 90.62 RANKING
2 21 35 -3.89 0.38 90.53 RANKING
2 22 36 -3.89 0.81 90.57 RANKING
2 23 24 -3.89 0.35 90.66 RANKING
2 24 38 -3.88 0.61 90.60 RANKING
2 25 44 -3.87 0.59 90.63 RANKING
2 26 46 -3.86 0.50 90.86 RANKING
2 27 50 -3.86 0.43 90.62 RANKING
2 28 1 -3.85 0.61 90.67 RANKING
2 29 37 -3.83 0.56 90.77 RANKING
2 30 39 -3.80 0.47 90.78 RANKING
2 31 21 -3.78 1.62 89.59 RANKING
2 32 34 -3.77 1.59 89.60 RANKING
2 33 11 -3.75 0.56 90.82 RANKING
2 34 40 -3.74 1.64 89.68 RANKING
2 35 6 -3.74 1.45 89.77 RANKING
2 36 49 -3.64 0.78 90.68 RANKING
3 1 30 -3.81 0.00 89.88 RANKING
3 2 14 -3.71 0.25 89.90 RANKING
3 3 9 -3.69 0.20 89.90 RANKING
4 1 16 -3.69 0.00 92.38 RANKING
4 2 8 -3.65 1.84 91.88 RANKING
4 3 3 -3.58 0.84 92.64 RANKING
4 4 7 -3.52 0.83 92.51 RANKING
4 5 41 -3.52 1.00 92.72 RANKING
4 6 19 -3.50 0.99 92.65 RANKING
4 7 32 -3.47 1.15 92.82 RANKING
4 8 22 -3.39 0.70 92.53 RANKING
In this particular .dlg file, run 45 has the lowest binding energy (-6.24 kcal/mol), and you can visualize it to check whether the molecule sits in the binding site or not.
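Reading the table by eye works for one file, but the same lookup can be scripted. The following is a minimal sketch that assumes each data row has the seven whitespace-separated columns shown above and ends with the word RANKING; the function name and the shortened sample are illustrative only.

```python
from collections import Counter

def parse_rmsd_table(text):
    """Return one dict per RANKING row of an AutoDock 4 RMSD TABLE."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have seven columns and end with the grep pattern RANKING.
        if len(fields) == 7 and fields[-1] == "RANKING":
            rows.append({
                "rank": int(fields[0]),        # cluster rank
                "sub_rank": int(fields[1]),    # rank inside the cluster
                "run": int(fields[2]),         # docking run number
                "energy": float(fields[3]),    # binding energy
                "cluster_rmsd": float(fields[4]),
                "reference_rmsd": float(fields[5]),
            })
    return rows

# A shortened excerpt of the table above, used only as sample input.
sample = """\
1 1 45 -6.24 0.00 83.60 RANKING
1 2 25 -5.50 1.11 83.60 RANKING
1 3 20 -5.36 1.53 84.30 RANKING
2 1 4 -4.03 0.00 90.64 RANKING
2 2 23 -4.03 0.26 90.62 RANKING
3 1 30 -3.81 0.00 89.88 RANKING
4 1 16 -3.69 0.00 92.38 RANKING
"""

rows = parse_rmsd_table(sample)
best = min(rows, key=lambda r: r["energy"])          # lowest binding energy
cluster_sizes = Counter(r["rank"] for r in rows)     # members per cluster
largest_cluster, size = cluster_sizes.most_common(1)[0]
print("lowest-energy run:", best["run"], best["energy"])
print("largest cluster in this excerpt:", largest_cluster, "with", size, "members")
```

On the full table the largest cluster would be cluster 2 (36 members), not cluster 1 as in this shortened excerpt, so the lowest-energy run and the most populated cluster need not coincide.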
I agree about using AutoDock Vina instead of AutoDock. The developers themselves underline the better performance of the former.
Concerning evaluation of the binding mode, I prefer to rely on visual analysis and reproducibility rather than on ranking.
First, are you using AutoDock 4.0 or AutoDock Vina? They are completely different in terms of speed, execution and analysis. Assuming you are using the latter, AutoDock Vina, it is the simplest software for docking: the output you get with the --log option (also printed above) gives you the pose number, the energy (the more negative the value, the better the interaction; it is taken as an estimate of dG), and then the RMSD values. The RMSDs are of little use here, since they only tell you the structural difference between each pose and the best one.
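The same three numbers per pose are also written into the output PDBQT file as `REMARK VINA RESULT:` lines, so they can be collected with a few lines of code. This is a sketch under that assumption; the function name and the sample values are made up for illustration.

```python
def parse_vina_results(pdbqt_text):
    """Collect (mode, affinity, rmsd_lb, rmsd_ub) from REMARK VINA RESULT lines."""
    poses = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields after the three-word tag: affinity, rmsd l.b., rmsd u.b.
            affinity, rmsd_lb, rmsd_ub = (float(x) for x in line.split()[3:6])
            poses.append((len(poses) + 1, affinity, rmsd_lb, rmsd_ub))
    return poses

# Hypothetical excerpt of a Vina output file (values invented for the example).
sample = """\
MODEL 1
REMARK VINA RESULT:      -7.5      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -7.1      1.842      2.377
ENDMDL
"""

for mode, affinity, lb, ub in parse_vina_results(sample):
    print(f"mode {mode}: {affinity} kcal/mol, rmsd l.b. {lb}, u.b. {ub}")
```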
Ideally, the first pose would always be the best one (that is the goal), so you must consider whether your correct pose might instead be the 2nd or the 3rd. How many poses do you need to consider in your analysis? That is another tricky question, but it can be answered by taking known inhibitors with X-ray structures, re-docking them into your binding site, and checking whether the software successfully reproduces the crystallographic binding pose. If your X-ray inhibitor matches the first pose, then consider only the first pose of every molecule. If it matches the 2nd pose, then consider the first two poses of every molecule, and so on. Doing this filters out a lot of raw data you won't need to process, so you can focus on the poses where you may realistically have a computational hit.
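The re-docking rule above can be written down in a few lines. In this sketch the RMSD of each ranked pose to the crystallographic pose is assumed to have been computed already, and the 2.0 Å cutoff for "matches" is a commonly used convention that is assumed here rather than taken from the post.

```python
def poses_to_keep(rmsd_to_crystal, threshold=2.0):
    """Return the 1-based rank of the first pose within `threshold` angstroms
    of the crystal pose, or None if no pose reproduces it."""
    for rank, rmsd in enumerate(rmsd_to_crystal, start=1):
        if rmsd <= threshold:
            return rank
    return None

# Hypothetical RMSDs (in angstroms) of ranked re-docked poses to the X-ray pose.
redock_rmsds = [5.8, 1.3, 6.2, 7.0]
n_top = poses_to_keep(redock_rmsds)
print("consider the top", n_top, "poses of every screened molecule")
```

A result of None would mean the protocol does not reproduce the known binding mode at all, in which case the docking setup itself should be revisited before screening.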
Afterwards, it is highly recommended to re-score the filtered ligands with another program, preferably one with a force-field-based scoring function. There are many force fields to choose from (MMFF94, PFROSST...); which one to use is up to you. When you apply a force-field scoring function, the dG value of some pre-selected ligands will drop very dramatically, and you can reject those compounds if you want. Only compounds for which both AutoDock Vina and the other scoring function give good values should be considered for further analysis.
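The consensus step amounts to intersecting two score tables. A minimal sketch, assuming both scores are "lower is better"; the ligand names, score values and cutoffs are all hypothetical and would have to be chosen for the actual system.

```python
def consensus_hits(vina_scores, rescored, vina_cutoff, rescore_cutoff):
    """Return ligands whose Vina score and re-scored value both pass their cutoffs."""
    return sorted(
        name
        for name, score in vina_scores.items()
        if score <= vina_cutoff
        and name in rescored
        and rescored[name] <= rescore_cutoff
    )

# Hypothetical scores in kcal/mol; lower (more negative) is better in both tables.
vina_scores = {"lig_a": -9.2, "lig_b": -8.7, "lig_c": -6.1}
force_field_scores = {"lig_a": -35.0, "lig_b": -4.0, "lig_c": -30.0}

hits = consensus_hits(vina_scores, force_field_scores,
                      vina_cutoff=-8.0, rescore_cutoff=-20.0)
print(hits)
```

Here lig_b passes the Vina cutoff but fails on re-scoring, and lig_c does the opposite, so only lig_a is kept.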
Additionally, you should not trust the docking prediction completely. You have to evaluate whether the generated docked poses are realistic, as Viachaslau Bernat pointed out (e.g., that there are no positively or negatively charged groups near neutral or aliphatic regions, that your predicted poses are complementary to your binding site, that the poses you obtain cover a known pharmacophoric point described in the literature, etc.). The protocol itself is an art and cannot be applied in the same way to every protein.
If you have more computational resources, the poses of those ligands that look promising can be passed on to a more computationally demanding approach, such as molecular dynamics, but that is another issue.
Apr 2, 2013
As Mr. Moral has said, the more negative the values, the better the interaction. But always go for the best score of the largest cluster, because the largest cluster indicates the most suitable docking poses of the ligand. You are also given the inhibition constant, which can be used for validation by comparing it with experimental data.
As Mr. Moral has said, re-dock the inhibitor from the X-ray structure, analyze the run in terms of binding energy and hydrogen bonding to amino acids for each pose, and select the pose that shows the same result as the original one. Use this as a filter for your new ligands.
Sorry! But nobody has answered my previous questions... except perhaps Jesus Seco two days ago, who said: 'You don't have to trust completely the docking prediction'! Then my mind can easily say: if you cannot trust the prediction, why do you keep using such untrustable programs? That is another question I will be happy to have answered by anyone, in any way you wish...
Jesus Seco Moral · University of Barcelona
Like any simulation program, docking software only tries to fit the molecule within the active site, without considering any external factor that could dramatically affect the binding process (presence of structural waters, protein mobility, presence of other co-factors, etc.). That is the reason you should not trust docking results unless you have experimental validation. The reason everybody uses docking software is tricky; probably it is because docking is the easiest way to start any drug discovery project. Otherwise, the computational effort goes up dramatically (imagine having to run molecular dynamics simulations for every ligand you want to consider, in every pose the docking software predicts; that would be senseless and crazy!).
About the random number: it is similar to what happens in molecular dynamics, where a random number determines the initial velocities of the atoms. Here, the random number determines how the docking software starts to place the molecule within the binding site. This is not a major limitation, because you are not running only one iteration; you are supposed to run as many iterations (or runs, or whatever they are called) as you want. Ideally, the more iterations, the better the prediction, but also the longer you wait for results. There is no consensus, and every program recommends a different number of iterations, but with 50 or so you are carrying out an exhaustive search of how to place the ligand in the protein binding site. Moreover, some docking programs are "exhaustive", which means the docking does not depend on a specific, predefined number of runs (i.e., the program places the ligand in as many positions as the algorithm determines, then clusters and prints the results), so no random effect occurs. As an example, AutoDock Vina, which for me is the best one (also because it is free), uses a random number, but you can specify a parameter called "exhaustiveness" to increase the quality of the results. However, regardless of the combination of random number and exhaustiveness you use, the prediction is essentially the same (some minor structural differences may be noticed, but the binding mode is practically identical; you might try this as a practical and illustrative example).
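The suggested experiment, repeating the same docking with different random seeds and exhaustiveness values, can be prepared as a batch of Vina command lines. The sketch below only builds the commands and does not run them; the file names, box centre and box size are placeholders, and it assumes the `vina` executable accepts the usual `--seed` and `--exhaustiveness` options.

```python
import shlex

def vina_command(receptor, ligand, center, size, seed, exhaustiveness=8, out=None):
    """Build an AutoDock Vina argument list for one docking run (not executed here)."""
    cmd = ["vina", "--receptor", receptor, "--ligand", ligand]
    # Search box centre and size along each axis.
    for axis, c, s in zip("xyz", center, size):
        cmd += [f"--center_{axis}", str(c), f"--size_{axis}", str(s)]
    cmd += ["--seed", str(seed), "--exhaustiveness", str(exhaustiveness)]
    if out is not None:
        cmd += ["--out", out]
    return cmd

# Placeholder inputs: repeat one docking with three seeds at two exhaustiveness levels.
commands = [
    vina_command("receptor.pdbqt", "ligand.pdbqt",
                 center=(10.0, 12.5, -3.0), size=(20, 20, 20),
                 seed=seed, exhaustiveness=ex,
                 out=f"ligand_seed{seed}_ex{ex}.pdbqt")
    for ex in (8, 32)
    for seed in (1, 2, 3)
]

for cmd in commands:
    print(shlex.join(cmd))
```

Comparing the top pose of each output file (for example by RMSD) then shows directly how much the binding mode depends on the seed and on exhaustiveness.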
About the other question you raised (atom probes used in docking), I don't recommend tackling this problem through docking, because there is other software for that called GRID, which is based on Molecular Interaction Fields (MIF) and is pretty good. Another possibility would be to perform molecular dynamics with a binary solvent. If you want, I recommend the paper I wrote, "Binding site detection and druggability index from first principles", J. Med. Chem.; if you read it carefully (along with some of the references therein), you will see how you can run atom probes over the protein surface in different ways (by the way, this is called computational solvent mapping).
I hope I answered everything. If not, feel free to ask again.
Cheers!!
Wow! Now I'm a bit better informed. Still, you have not fully answered my question.
But anyway, that's fine for the moment...
Apr 4, 2013
I'm not an expert on the topic, but I know that different codes (i.e. different algorithms) may give quite different docking predictions. In some cases, the predictions are far from perfect if you allow the molecules to relax, which is not necessarily considered in docking software.
There are many different metrics out there for determining which docked poses for a given compound are "good". The previous posts do a good job of explaining those (RMSD, free energy of binding, predicted inhibition constants). You can also consider the presence or absence of potential interactions if you have information about the binding mechanism. It's helpful to know something about how things associate, but docking can still be useful if you don't have that information. Overall, the metrics listed are most helpful if you have docking results for several ligands to compare. It's hard to tell the good from the bad if you don't have the bad to set the bar.
Docking, specifically with AutoDock, AutoDock Vina and the like, is useful in various situations. The four I can immediately think of are (1) screening thousands of compounds/ligands against one target/macromolecule, (2) screening one or a small set of ligands against multiple macromolecules (copies of the same target with structural variations, or different targets), (3) binding cavity sampling (if the ligand is known but the binding site is not), and (4) interaction prediction (known ligand and target but no structural information; this also requires homology modeling).
Some more detailed statistics can be calculated if you conduct a large-scale docking study with a large number of ligands (hundreds to thousands) that includes both known ligands and decoy ligands (a useful online database for identifying these: http://dud.docking.org/). Those include accuracy, precision, specificity, selectivity, and some others (a Wiki page that does a good job of explaining this: http://en.wikipedia.org/wiki/Sensitivity_and_specificity). These are helpful in assessing predictability, which means how well your protein structure model allows you to correctly identify things that bind. There are a lot of papers out there that talk about this, and many use receiver operating characteristic (ROC) plots. Off the top of my head:
Huang, N., B. K. Shoichet, et al. (2006). "Benchmarking Sets for Molecular Docking." Journal of Medicinal Chemistry 49(23): 6789-6801.
Zhang, X., S. E. Wong, et al. (2013). "Message Passing Interface and Multithreading Hybrid for Parallel Molecular Docking of Large Databases on Petascale High Performance Computing Machines." Journal of Computational Chemistry 34: 915-927.
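As a rough illustration of these statistics, the sketch below computes the area under the ROC curve and a few cutoff-based metrics from docking scores of known ligands and decoys. It assumes that more negative scores are better; the score lists and the cutoff are invented for the example and the function names are not from any particular package.

```python
def roc_auc(active_scores, decoy_scores):
    """Area under the ROC curve: the probability that a randomly chosen active
    has a lower (better) docking score than a randomly chosen decoy."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5  # ties count as half
    return wins / (len(active_scores) * len(decoy_scores))

def confusion_metrics(active_scores, decoy_scores, cutoff):
    """Sensitivity, specificity, precision and accuracy when every compound
    scoring at or below `cutoff` is predicted to bind."""
    tp = sum(s <= cutoff for s in active_scores)
    fn = len(active_scores) - tp
    fp = sum(s <= cutoff for s in decoy_scores)
    tn = len(decoy_scores) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Invented docking scores in kcal/mol for known binders and decoys.
actives = [-9.1, -8.4, -7.9, -6.0]
decoys = [-8.0, -6.5, -6.1, -5.2, -4.8]

print("AUC:", roc_auc(actives, decoys))
print(confusion_metrics(actives, decoys, cutoff=-7.0))
```

An AUC near 0.5 would mean the scores cannot tell binders from decoys, while values approaching 1.0 indicate good enrichment for that protein model.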
It sounds like the question regarding why docking methods are used is outstanding. I wouldn't say that docking results are "untrustable", but I would say they are "variable". Different docking programs use different algorithms and scoring functions, so the output will differ between programs. Differences also arise between multiple docking iterations of the same ligand. This stems in part from the use of random numbers, but also from what we know about energy and enzyme kinetics. The algorithms for the programs are derived from the body of knowledge on how ligands bind to macromolecules. There are certain rules that apply in nature, and we have to consider those rules when building theoretical models to predict the binding process. There is no clear consensus on which theoretical method is best for predicting binding, but regardless of the differences in algorithm, one should see convergence/agreement of results if the methods are used appropriately. Binding interactions measured experimentally should give one an idea of the energy necessary for activity. Docking programs in general attempt to sample all the possible energies that result in activity in order to find the ideal energy, which should be the minimum of an energy well within some energy landscape. Some algorithms/programs are better than others at quickly sampling possibilities to find the energy minimum for a given protein-ligand pair. I think that, as far as free software goes, Vina gets us closer to reproducibility when performing multiple docking iterations of the same ligand. An important point is that docking is theoretical (as most computational methods still are). There are bench-top experiments that can be done, and have been done in various cases, to verify what was predicted with docking (for example: Lewis, S. N., L. Brannan, et al. (2011). "Dietary α-Eleostearic Acid Ameliorates Experimental Inflammatory Bowel Disease in Mice by Activating Peroxisome Proliferator-Activated Receptor-γ." PLoS One 6(8): e24031). A search of the literature would turn up countless studies where various computational methods were used to predict ligands and were complemented with experimental validation. I would use search terms that include the protein of interest and the terms "docking" or "virtual screening".
An additional resource that summarizes considerations and problems for docking is:
Klebe, G. (2006). "Virtual ligand screening: strategies, perspectives and limitations." Drug Discovery Today 11(13-14): 580-594.
As far as I know, a crystal structure is just a snapshot of the protein-ligand complex. Comparison with the interactions present between protein and ligand in the crystal will increase the reliability of the result, but will it not eliminate ligands with unique poses?
How to analyze AutoDock Vina results?
How to interpret AutoDock Vina score values?
What are the basic cutoff parameters for AutoDock Vina results?
RMSD
RMSD values are calculated relative to the best mode and use only movable heavy atoms.
Two variants of RMSD metrics are provided, rmsd/lb (RMSD lower bound) and rmsd/ub
(RMSD upper bound), differing in how the atoms are matched in the distance calculation:
• rmsd/ub matches each atom in one conformation with itself in the other conformation, ignoring any symmetry;
• rmsd' matches each atom in one conformation with the closest atom of the same element type in the other conformation;[1]
• rmsd/lb is defined as follows: rmsd/lb(c1, c2) = max(rmsd'(c1, c2), rmsd'(c2, c1)).
-----------------------------------------------------------------------
So, for example, a highly symmetrical rigid ligand could be rotated relative to a reference
conformation such that the new conformation is exactly equivalent to the reference
conformation (think of a benzene ring flipped 180 degrees). However, the internal
numbering of atoms doesn't change during the docking run, so the rmsd/ub algorithm,
which matches identically labeled atoms (rather than similar or equivalent atoms) would
yield a significant rmsd, whereas the rmsd/lb algorithm would yield a more realistic rmsd of
zero in this case. On the other hand, for very flexible, asymmetric molecules, rmsd/lb would likely give
unreasonably low rmsd's and rmsd/ub would be a better model.
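The definitions above are short enough to try out directly. The following is a minimal sketch, not Vina's own implementation: a conformation is assumed to be a list of (element, (x, y, z)) tuples for the heavy atoms, and the six-carbon ring with a radius of 1.4 Å is an idealised stand-in for the benzene example, rotated by 180 degrees in its own plane.

```python
import math

def rmsd_ub(c1, c2):
    """Upper bound: atom i in c1 is matched with atom i in c2, ignoring symmetry."""
    total = sum(math.dist(a[1], b[1]) ** 2 for a, b in zip(c1, c2))
    return math.sqrt(total / len(c1))

def rmsd_prime(c1, c2):
    """Each atom in c1 is matched with the closest atom of the same element in c2."""
    total = 0.0
    for element, xyz in c1:
        total += min(math.dist(xyz, other) ** 2
                     for other_element, other in c2 if other_element == element)
    return math.sqrt(total / len(c1))

def rmsd_lb(c1, c2):
    """Lower bound: rmsd/lb(c1, c2) = max(rmsd'(c1, c2), rmsd'(c2, c1))."""
    return max(rmsd_prime(c1, c2), rmsd_prime(c2, c1))

# Idealised planar six-carbon ring (assumed radius of 1.4 angstroms).
ring = [("C", (1.4 * math.cos(k * math.pi / 3), 1.4 * math.sin(k * math.pi / 3), 0.0))
        for k in range(6)]
# Rotating the ring by 180 degrees moves atom k to the position of atom k + 3,
# while the atom numbering stays the same.
rotated = [ring[(k + 3) % 6] for k in range(6)]

print("rmsd/ub:", round(rmsd_ub(ring, rotated), 3))
print("rmsd/lb:", round(rmsd_lb(ring, rotated), 3))
```

The two conformations occupy exactly the same positions, so rmsd/lb is zero, while rmsd/ub reports the full ring diameter because it compares identically numbered atoms.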
Generally, the pose with the lowest docking energy (binding free energy) is considered the best conformation, but you
should also check other criteria: RMSD and the number of interacting residues/hydrogen bonds, which are
very important. The best conformation should satisfy all three criteria. Also verify your best pose with
other molecular docking tools, such as DOCK 6, SwissDock, Hex, or AutoDock Vina (the newer generation of AutoDock).
Dear All,
First of all, I am very new to the molecular docking field.
I am using AutoDock Vina with the PyRx Virtual Screening Tool.
In the "analysis results" section, there are: Binding Affinity (kcal/mol); mode; RMSD lower bound
and RMSD upper bound.
Cheers.
-Adrian K-
The more negative the value, the better an indicator it is. Below -1 it should be OK. You have to select the most
negative one.
Note: I asked my friend about it and that is what he told me. I have used AutoDock only once or twice; if I learn
more I will let you know.