Using AutoDock 4 for Virtual Screening v4
Virtual Screening
29 January 2008, v2
Contents
Introduction
  Before We Start…
Exercise Six: Calculating atomic affinity maps for a ligand library using AutoGrid
  Procedure
Files for exercises
  Input Files
  Results Files
    Ligand
    Macromolecule
    AutoGrid
    AutoDock
Introduction
[Figure: overview of the files used in this tutorial. Ligand column: ZINC, *.mol2, *.pdbqt. Receptor column: x1hpv.pdb, x1hpv.pdbqt, x1hpv_*.gpf, x1hpv*map*.]
Before We Start…
System requirements: this tutorial requires that you have cvs on your computer as well as MGLTools 1.4.6, autogrid4 and autodock4.
We’ll use the directory /usr/tmp for the tutorial today. In practice you’ll use a directory of your own choosing.
Open a Terminal window and then type this at the UNIX, Mac OS X or Linux prompt:
Type this:
cd /var/tmp
mkdir tutorial
cd tutorial
pwd
Note: when you are dealing with large volumes of data, you want to keep it local so that you don’t overburden the file system.
We will represent directories as shaded boxes connected with lines to illustrate the data structure built in these exercises. This box represents the ‘tutorial’ directory you have just created.
[Figure: a shaded box labeled ‘tutorial’]
Set up for today’s exercises by checking out the VSTutorial files from CVS, the Concurrent Versions System. First set up access to CVS:
Type this:
setenv CVSROOT :pserver:[email protected]:/opt/cvs
echo $CVSROOT
cvs login
(When asked for a password, just press return.)
TSRI only: on Apple computers you may need to access cvs like this:
/sw/bin/cvs login
Also: ignore any message about .cvspass errors.
Next, in the /usr/tmp/tutorial directory on the computer you are using here in the training room, check out the tutorial:
Type this:
cvs co VSTutorial
cd VSTutorial
$VSTROOT: a short cut to the directory in which your Virtual Screening Tutorial activities will take place.
#!/bin/csh
#
# $Id: ex00.csh,v 1.3 2005/01/31 18:11:28 lindy Exp $
#
# Because this script uses "pwd" to set VSTROOT it matters
# where (which directory) you run it from. This script
# should be run as "source ./scripts/ex00.csh" So,
# after you did your "cvs co VSTutorial" a "VSTutorial"
# directory was created and that's the one that should be
# your working directory when you source this script.
#
# Set up the root directory of the Virtual Screening
# Tutorial
#
setenv VSTROOT `pwd`
Note: here we use the backward-slanted single quotation mark. UNIX replaces strings enclosed by this character by the result of executing them. Here `pwd` is replaced by ‘/usr/tmp/tutorial/VSTutorial’ before setenv is executed.
Type this:
source scripts/ex00.csh
echo $VSTROOT
[Figure: directory tree. tutorial contains VSTutorial ($VSTROOT), which contains Results and scripts.]
For TSRI users:
Type this:
source scripts/setpath4.csh
(others need to edit the script for their local file systems)
Note: you will need to source this script in any new terminal you open during this tutorial to properly set up the environment in that new terminal.
FAQ – Frequently Asked Questions
Sort them by lowest energy first, then use ADT to inspect the quality of the binding.
The first thing to check is that the ligand is docking into some kind of pocket on the receptor. The second is that there is a chemical match between the atoms in the ligand and those in the receptor. For example, check that carbon atoms in the ligand are near hydrophobic atoms in the receptor while nitrogens and oxygens in the ligand are near similar atoms in the binding pocket. Check for charge complementarity. Check whatever else you may know about your particular system: for instance, if you know that the enzymatic action of your protein involves a particular residue, examine how the ligand binds to that residue. In the case of HIV protease, good inhibitors bind in a mode which mimics the transition state.
http://autodock.scripps.edu
Exercise One: Populating the Ligand Directory:
obtaining mol2 files
The size of the library which can be screened depends on the available
computational resources. Typically libraries number in the tens to hundreds
of thousands of files. It is practically impossible to test exhaustively any
large chemical database. Libraries are constructed to maximize the chances
of obtaining good ‘hits’ by focusing on ligand diversity.
ZINC
ZINC ("ZINC Is Not Commercial") is a free database of over 4.6 million
commercially-available compounds for virtual screening
(blaster.docking.org/zinc). The first exercise illustrates setting up a data
structure and populating the Ligands directory with 115 mol2 files from
ZINC.
Documentation
Documenting each step of a computational experiment in enough detail to
reproduce it is essential. README files are one common form of
documentation. Important sections in a README file for computational
experiments include: Project, Author, Date, Task, Data sources,
Files in this directory, Output files, Running Scripts and other notes on the
location of the executable and environmental settings.
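One way to keep README entries consistent is to generate a skeleton with the sections listed above and fill it in by hand. The following is a minimal sketch in Python; the helper name `readme_skeleton`, the "Notes" heading and the one-line-per-section layout are illustrative assumptions, not part of the tutorial scripts.

```python
from datetime import date

# Section names follow the list in the text above; "Notes" stands in for the
# notes on executable location and environment settings (an assumption).
SECTIONS = [
    "Project", "Author", "Date", "Task", "Data sources",
    "Files in this directory", "Output files", "Running Scripts", "Notes",
]


def readme_skeleton(project, author, task, today=None):
    """Return README text with one line per section (hypothetical helper)."""
    today = today or date.today().isoformat()
    filled = {"Project": project, "Author": author, "Date": today, "Task": task}
    lines = []
    for name in SECTIONS:
        # Sections without a known value are left empty for manual editing.
        lines.append(f"{name}: {filled.get(name, '')}".rstrip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(readme_skeleton("VSTutorial", "your name", "Exercise One"))
```

The output can be redirected into README and then completed in an editor after each exercise.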
In the “Before We Start…” section, you set up local copies of the input files
and executable scripts we will use today.
Procedure:
1. In ex01.csh, we create a working directory called VirtualScreening and two subdirectories: one called Ligands, where we will do all the preparation of the ligand files, and a second called etc, where we’ll keep a few extra, useful files.
Next we populate the Ligands directory by splitting a multimolecule file from ZINC into 115 separate files.
Finally, we add a positive control, ind.pdb, to the list of ligands.
Note: here we split a file with many molecules in mol2 format into separate files to be processed in the next exercises.
cd $VSTROOT
mkdir VirtualScreening
# Rename each split piece (tmp*) after the ZINC identifier it contains.
foreach f (tmp*)
    echo $f
    set zid = `grep ZINC $f`
    if (! -e "$zid".mol2) then
        set filename = "$zid".mol2
    else
        # Name already taken: use the first free two-digit suffix.
        foreach n (`seq -w 1 99`)
            if (! -e "$zid"_"$n".mol2) then
                set filename = "$zid"_"$n".mol2
                break
            endif
        end
    endif
    mv -v $f $filename
end
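The renaming logic in the loop above can also be expressed in Python. This is a sketch only, assuming the molecule identifier (`zid`) has already been read from the split file; `unique_mol2_name` is a hypothetical helper and is not part of ex01.csh.

```python
import os


def unique_mol2_name(zid, directory=".", max_suffix=99):
    """Pick '<zid>.mol2', or '<zid>_NN.mol2' if that name is already taken."""
    candidate = f"{zid}.mol2"
    if not os.path.exists(os.path.join(directory, candidate)):
        return candidate
    # Mirror `seq -w 1 99`: zero-padded suffixes 01..99.
    for n in range(1, max_suffix + 1):
        candidate = f"{zid}_{n:02d}.mol2"
        if not os.path.exists(os.path.join(directory, candidate)):
            return candidate
    # Unlike the csh loop, this sketch fails loudly when every name is used.
    raise RuntimeError(f"no free file name for {zid}")
```

Raising an error when all 99 suffixes are taken is a design choice made here so that an existing file is never silently overwritten.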
Type this:
source $VSTROOT/scripts/ex01.csh
On Mac OS X: source $VSTROOT/scripts/ex01_mac.csh
[Figure: directory tree after Exercise 1. tutorial contains VSTutorial ($VSTROOT); etc contains ligand.list; Ligands contains ZINC*.mol2 and ind.pdb.]
3. To confirm that the foreach loop did what we expected, list the mol2 files. Use wc (word count) for counting. Check that the number of mol2 files (115) plus the number of pdb files (1, i.e. ind.pdb, the positive control) matches the number of ligands in the ligand.list file (116).
Type this:
\ls *.mol2 | wc -l
\ls *.pdb | wc -l
wc -l ../etc/ligand.list
4. Document the experiment:
Type this:
cd $VSTROOT
vim README
Exercise Two: Processing the ligands: mol2 to
pdbqt.
In the tutorial “Using AutoDock 4 with ADT”, you prepared the ligand
file using ADT, a graphical user interface. It is not reasonable to try to
prepare thousands of ligand files using a graphical user interface.
Tasks of this magnitude must be automated. In this exercise, we
introduce prepare_ligand4.py, a python script in the AutoDockTools
module, and show you how to use it in a Unix foreach loop. Details of
its usage can be found in the Appendix.
Procedure:
Note: The prepare_ligand4.py script takes as input a pdb or mol2 file, which is specified on the command line with the ‘-l’ switch, and writes a pdbqt file with charges, root, and rotatable bonds defined. The ‘-d’ switch specifies the filename of a python dictionary that describes the atom types and other attributes of the set of input files processed. This information will be used in the next exercise.
#!/bin/csh
# $Id: ex02.csh,v 1.2 2005/01/31 00:48:01 lindy Exp $
#
# use the prepare_ligand4.py script to create pdbqt files
cd $VSTROOT/VirtualScreening/Ligands
foreach f (`ls *`)
    echo $f
    pythonsh ../../prepare_ligand4.py -l $f -d ../etc/ligand_dict.py
end
Type this:
source $VSTROOT/scripts/ex02.csh
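The same batch step could be driven from Python instead of a csh foreach loop. The sketch below only uses the `pythonsh` launcher and the `-l` and `-d` switches that appear in ex02.csh; the helper names, and the choice to filter on the .mol2 and .pdb extensions rather than processing every file as `ls *` does, are assumptions made for illustration.

```python
import os
import subprocess


def ligand_commands(ligand_dir, script="../../prepare_ligand4.py",
                    dict_file="../etc/ligand_dict.py"):
    """Build one prepare_ligand4.py command per .mol2 or .pdb file."""
    commands = []
    for name in sorted(os.listdir(ligand_dir)):
        if name.endswith((".mol2", ".pdb")):
            commands.append(["pythonsh", script, "-l", name, "-d", dict_file])
    return commands


def run_all(ligand_dir):
    """Run the commands from inside ligand_dir; stop at the first failure."""
    for cmd in ligand_commands(ligand_dir):
        print(" ".join(cmd))
        subprocess.run(cmd, cwd=ligand_dir, check=True)
```

Building the command list separately from running it makes it possible to print and inspect the commands before launching thousands of conversions.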
2. Examine the results of this script:
Type this:
\ls *.pdbqt | wc
\ls ../etc
[Figure 1.5 Exercise 2 Result: directory tree. etc contains ligand_dict.py; Ligands contains ZINC*.pdbqt and ind.pdbqt.]
Note: ligand_dict.py is generated by prepare_ligand4.py and used in Exercise 3.
3. Document:
Add an entry for this section’s procedure to the README file. Record
warning messages.
The UNIX ‘script filename’ command is an alternative to the
README file convention. It copies all the text from the terminal into
the specified transcript file. Here, you could start a transcript before the
foreach loop. To stop recording the transcript file, type Control D.
Exercise Three: Profiling the library: determining
the covering set of Atom Types:
In docking a set of ligands against a single receptor, you need only one grid map for each atom type in the covering set of atom types present in the ligands. In this exercise we write a summary of the ligand library in order to determine the covering set of atom types and to exclude ligands with too many atoms, atom types, rotatable bonds, etc.
Note: AutoDock4 limits the number of atoms in the ligand to 2048 and the number of rotatable bonds in a ligand to 32.
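The covering set is simply the union of the atom types found across all ligands that pass the size limits. The sketch below illustrates the idea on a hypothetical summary dictionary; the keys 'atom_types', 'atoms' and 'torsions' are assumptions, and the real layout of ligand_dict.py may differ.

```python
MAX_ATOMS = 2048     # AutoDock4 atom limit quoted in the note above
MAX_TORSIONS = 32    # AutoDock4 rotatable-bond limit quoted in the note above


def profile_library(ligands):
    """Return (covering_set, rejected) for a dict of ligand summaries.

    `ligands` maps a ligand name to a dict with the hypothetical keys
    'atom_types' (list of str), 'atoms' (int) and 'torsions' (int).
    """
    covering, rejected = set(), []
    for name, info in sorted(ligands.items()):
        if info["atoms"] > MAX_ATOMS or info["torsions"] > MAX_TORSIONS:
            rejected.append(name)   # too large for AutoDock4, skip its types
            continue
        covering.update(info["atom_types"])
    return sorted(covering), rejected
```

Each atom type in the returned covering set needs one grid map; ligands in the rejected list would be removed from the library before docking.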
Procedure:
#!/bin/csh
# $Id: ex03.csh,v 1.4 2005/01/31 02:23:44 lindy Exp $
# The examine_ligand_dict.py script reads the
# ligand_dict.py written in Exercise 2 and writes a summary
# describing the set of ligands to stdout.
cd $VSTROOT/VirtualScreening/etc
cp ../../examine_ligand_dict.py .
./examine_ligand_dict.py > summary.txt
Notice the covering set of atoms. You may decide to remove some
stems based on this information.
Type this:
source $VSTROOT/scripts/ex03.csh
[Figure: directory tree after Exercise 3. tutorial contains VSTutorial ($VSTROOT); etc, which sits next to Ligands, now contains summary.txt.]
Exercise Four: Preparing the receptor: pdb to
pdbqt.
The receptor file used by AutoDock must be in pdbqt format, which is pdb plus ‘q’ charge and ‘t’ autodock_type. To conform to the AutoDock atom types, polar hydrogens should be present, whereas non-polar hydrogens and lone pairs should be merged. Each atom should be assigned a Gasteiger partial charge.
The Receptor directory is where we process the receptor once and only once. All the ligands will refer to this single receptor.
AutoDockTools should be familiar to you from the AutoDockTools tutorial.
Note: For most atoms, the autodock_type is the same as the element. The autodock_type for aromatic carbons, which for AutoDock are carbons in planar cycles, is A, to distinguish them from aliphatic carbons, C. All oxygens are assumed to be able to accept two hydrogen bonds and have the autodock_type OA. All hydrogens are assumed to be able to be hydrogen bond donors and have the autodock_type HD. Sulfur and nitrogen atoms which can accept hydrogen bonds are autodock_types SA and NA respectively and are distinguished from those which cannot, which have autodock_types S and N.
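The typing rules in the note above can be summarized as a small lookup. This is a sketch of those rules only, not the AutoDockTools implementation; it assumes the caller already knows whether a carbon is aromatic and whether a nitrogen or sulfur can accept hydrogen bonds, and the function name is hypothetical.

```python
def autodock_type(element, aromatic=False, accepts_hbond=False):
    """Map an element symbol to an AutoDock atom type per the note above."""
    if element == "C":
        return "A" if aromatic else "C"     # aromatic vs. aliphatic carbon
    if element == "O":
        return "OA"                         # all oxygens are acceptors
    if element == "H":
        return "HD"                         # remaining hydrogens are donors
    if element in ("N", "S"):
        return element + "A" if accepts_hbond else element
    return element                          # most atoms: type == element


if __name__ == "__main__":
    print(autodock_type("C", aromatic=True))       # A
    print(autodock_type("N", accepts_hbond=True))  # NA
```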
Procedure:
#!/bin/csh
# $Id: ex04.csh,v 1.2 2005/01/31 00:48:01 lindy Exp $
# Create a directory called Receptor and populate it
# with the supplied x1hpv.pdb file. On your own, use
# AutoDockTools to create the pdbqt file.
cd $VSTROOT/VirtualScreening
mkdir Receptor
cp ../x1hpv.pdb Receptor
cd Receptor
Type this:
source $VSTROOT/scripts/ex04.csh
* When processing is complete, type x1hpv.pdbqt into the file browser which opens. Be sure to write the file in the Receptor directory. Don’t close adt because we’ll use it in the next exercise.
Note: Alternatively, this preparation could be done via the prepare_receptor4.py script. However, if you are working with a single receptor, you should prepare it interactively to optimize selecting the search space.
3. Add an entry for this section’s procedure to the README file.
Exercise Five: Preparing AutoGrid Parameter Files
for the library
The grid parameter file tells AutoGrid the types of maps to compute, the location and extent of those maps, and the pair-wise potential energy parameters. In general, one map is calculated for each atom type in the ligand, plus an electrostatics map. Self-consistent 12-6 Lennard-Jones energy parameters (Rij, the equilibrium internuclear separation, and epsij, the energy well depth) are specified for each map based on the types of atoms in the macromolecule. If you want to model hydrogen bonding, this is done by specifying 12-10 instead of 12-6 parameters in the gpf.
For a library of ligands, only one map per ligand atom type is required. Each AutoGrid4 calculation creates the set of required atom maps plus an electrostatics map and a desolvation map.
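To make the roles of Rij and epsij concrete, the sketch below evaluates the bare 12-6 and 12-10 functional forms, written so that the energy minimum is -epsij at r = Rij. It illustrates the pair potential only: AutoGrid additionally applies its own weighting, smoothing and hydrogen-bond directionality, none of which is modelled here, and the function name and sample numbers are hypothetical.

```python
def pairwise_energy(r, rij, epsij, m=12, n=6):
    """m-n pair potential with its minimum of -epsij at r == rij.

    m=12, n=6 gives the Lennard-Jones form; m=12, n=10 gives the
    hydrogen-bonding form mentioned in the text.
    """
    c_m = n / (m - n) * epsij * rij ** m   # repulsive coefficient
    c_n = m / (m - n) * epsij * rij ** n   # attractive coefficient
    return c_m / r ** m - c_n / r ** n


if __name__ == "__main__":
    # Illustrative numbers only (distance in Angstrom, energy in kcal/mol).
    for r in (3.5, 4.0, 4.5):
        print(r, pairwise_energy(r, rij=4.0, epsij=0.15))
```

For 12-6 the coefficients reduce to epsij·Rij^12 and 2·epsij·Rij^6; for 12-10 they are 5·epsij·Rij^12 and 6·epsij·Rij^10.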
Procedure:
#!/bin/csh
# $Id: ex05.csh,v 1.2 2005/01/31 00:48:01 lindy Exp $
echo "Use adt to complete this exercise (05)"
2. Examine the grid parameter file you have prepared:
Type this:
cat x1hpv.gpf | more
Exercise Six: Calculating atomic affinity maps for a
ligand library using AutoGrid.
Procedure:
Note: the echo utility allows you to ‘see’ what commands are executed by a script. Start it by typing ‘set echo’. Turn it off by typing ‘unset echo’. Try it here by starting it before sourcing ex06.csh.
#!/bin/csh
# $Id: ex06.csh,v 1.2 2007/05/09 00:48:01 lindy Exp $
# 1. Use autogrid4 to create the grid map files:
cd $VSTROOT/VirtualScreening/Receptor
autogrid4 -p x1hpv.gpf -l x1hpv.glg
Type this:
source $VSTROOT/scripts/ex06.csh
2. Check that the maps are there and that there are 10 of them.
Type this:
cd $VSTROOT/VirtualScreening/Receptor
ls -alt *map
ls -alt *map | wc -l
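The map count can also be checked by name rather than by number. The sketch below lists the files AutoGrid4 is expected to write for a given set of ligand atom types, one `<stem>.<type>.map` per type plus the electrostatics (`.e.map`) and desolvation (`.d.map`) maps. The helper names are hypothetical, and the atom type list must come from your own covering set; none is assumed here.

```python
def expected_map_files(stem, atom_types):
    """List the map files AutoGrid4 should write for the given atom types."""
    names = [f"{stem}.{t}.map" for t in atom_types]
    names += [f"{stem}.e.map", f"{stem}.d.map"]   # electrostatics, desolvation
    return names


def missing_maps(stem, atom_types, present):
    """Return expected map files that are not in `present` (iterable of names)."""
    have = set(present)
    return [n for n in expected_map_files(stem, atom_types) if n not in have]
```

With eight atom types in the covering set this gives the ten maps checked above; `missing_maps("x1hpv", types, os.listdir("."))` would report any that AutoGrid failed to produce.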
Exercise Seven: Validating the Protocol with a
Positive Control
Before we go on and make larger and larger resource and time
commitments to the virtual screening experiment, let's make sure in
this exercise that the input files are valid.
Procedure:
#!/bin/csh
# $Id: ex07.csh,v 1.3 2005/01/31 02:23:04 lindy Exp $
Type this:
source $VSTROOT/scripts/ex07.csh
[Figure: directory tree after Exercise 7, showing ind_x1hpv, x1hpv.pdbqt and x1hpv*map*.]
Type this:
cat ind_x1hpv.dpf | more
3. Also, you can follow the execution of the autodock job using tail. The ‘-f’ flag makes it follow as new output is written.
Type this:
tail -f ind_x1hpv.dlg
Exercise Eight: Preparing the Docking Directories
and Parameter Files for each ligand in a library.
In this exercise, we repeat the steps we used for the positive control in
the last exercise for each ligand to be screened. There is a separate
directory for each ligand. Each ligand directory contains symbolic
links to the autogrid maps and to the receptor. Each ligand directory
has its unique ligand.pdbqt and ligand.dpf files.
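The per-ligand layout described above can be sketched in Python as follows. This is an illustration of the directory structure only, not a transcription of ex08.csh: the function name is hypothetical, copying (rather than linking) the ligand file is an assumption, and writing the .dpf file is left out.

```python
import glob
import os
import shutil


def make_docking_dir(ligand_pdbqt, receptor_dir, dockings_dir, receptor="x1hpv"):
    """Create <ligand>_<receptor>/ with the ligand file and links to shared inputs."""
    stem = os.path.splitext(os.path.basename(ligand_pdbqt))[0]
    target = os.path.join(dockings_dir, f"{stem}_{receptor}")
    os.makedirs(target, exist_ok=True)
    shutil.copy(ligand_pdbqt, target)          # each directory gets its own ligand
    # Shared inputs: the AutoGrid maps and the receptor pdbqt.
    shared = glob.glob(os.path.join(receptor_dir, f"{receptor}*map*"))
    shared.append(os.path.join(receptor_dir, f"{receptor}.pdbqt"))
    for src in shared:
        link = os.path.join(target, os.path.basename(src))
        if not os.path.lexists(link):
            os.symlink(os.path.abspath(src), link)
    return target
```

Symbolic links keep a single copy of the maps on disk, which matters when the library contains thousands of ligands.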
Procedure:
Suggestion: unset echo here if it is set, because this script involves many steps for many ligands.
#!/bin/csh
# $Id: ex08.csh,v 1.5 2007/05/31 16:33:49 lindy Exp $
# Create the Dockings directory:
cd $VSTROOT/VirtualScreening
mkdir Dockings
cd Dockings
Type this:
source $VSTROOT/scripts/ex08.csh
[Figure 2.1 Exercise 8 Result: directory tree. The receptor files x1hpv.pdbqt and x1hpv*map* are shared. Each docking directory (diversity*_x1hpv, ind_x1hpv) contains its ligand file (diversity*.pdbqt or ind.pdbqt), x1hpv.pdbqt, x1hpv*map* and its parameter file (diversity*_x1hpv.dpf or ind_x1hpv.dpf).]
Type this:
pwd
ls
ls | wc -l
ls ZINC00000480_x1hpv
Exercise Nine: Launching many AutoDock jobs.
Procedure:
#!/bin/csh
#$Id: ex09.csh,v 1.4 2004/12/09 02:25:23 lindy Exp $
# 1. Create a file with a list of the dockings to run:
cd $VSTROOT/VirtualScreening/Dockings
/bin/ls > ../etc/docking.list
Type this:
source $VSTROOT/scripts/ex09.csh
[Figure: directory tree after Exercise 9. The docking directories now contain ZINC*_x1hpv.dlg and ind_x1hpv.dlg.]
2. Check that the docking logs exist in the directories under the Dockings directory:
Type this:
cd $VSTROOT/VirtualScreening/Dockings
ls -alt ZINC00000480_x1hpv
Exercise Ten: Identifying the Interesting Results to
Analyze.
Procedure:
#!/bin/csh
# $Id: ex10.csh,v 1.4 2007/06/31 02:27:03 lindy Exp $
#
# Extract the Free Energy of Binding for the lowest energy
# in the largest cluster from the dlg files using the python
# script summarize_results4.py:
cd $VSTROOT/VirtualScreening/Dockings
foreach d (`/bin/ls`)
    echo $d
    pythonsh ../../summarize_results4.py -d $d -t 2. -L -a -o ../etc/summary_2.0.txt
end
# Sort the summary_2.0.txt file based on the lowest energy conformation in
# the largest cluster to find your best dockings:
cd ../etc
cat summary_2.0.txt | sort -k5n -t, > summary_2.0.sort
Note: -k5n means sort numerically on field 5, here the lowest energy in the largest cluster. -t, means to use commas as field separators.
Type this:
source $VSTROOT/scripts/ex10.csh
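For readers less familiar with sort, the `-k5n -t,` step amounts to splitting each line on commas and ordering by the fifth field as a number. The following is a rough Python equivalent; the function name is hypothetical, and sending lines without a numeric fifth field to the end is a choice made in this sketch rather than the behavior of sort.

```python
def sort_by_field(lines, field=5, sep=","):
    """Sort delimited lines numerically on a 1-based field, like `sort -k5n -t,`."""
    def key(line):
        try:
            return float(line.split(sep)[field - 1])
        except (IndexError, ValueError):
            return float("inf")   # header or malformed lines go last
    return sorted(lines, key=key)


if __name__ == "__main__":
    with open("summary_2.0.txt") as handle:
        for line in sort_by_field(handle.readlines())[:10]:
            print(line.rstrip())
```

Because binding energies are negative for favorable dockings, an ascending numeric sort puts the best binders at the top.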
[Figure: directory tree after Exercise 10. etc now contains summary_2.0.txt and summary_2.0.sort.]
2. Find the ligands which bind with the lowest energy (the best binders) at the top of the list in summary_2.0.sort. Locate the positive control. Note the ligands that have better energies than it.
cd ../etc
head summary_2.0.sort
Exercise Eleven: Examine Top Dockings.
3. Set up the receptor:
* Read in the file:
File->Read Molecule
-click on "PDB files (*.pdb)"
-select "AutoDock files (pdbqt) (*.pdbqt)"
-select "x1hpv.pdbqt"
Note: If x1hpv.pdbqt is already read in, do not read it in again.
-click on Material: Front
-change Opacity to .7
-click on Material: None
-select root as current object in viewer
Note: You cannot change the Material properties of a geometry (such as its opacity) if it inherits Materials from its parent. To change this, set the inheritMaterial flag to False:
-click on Current Geom Properties button to display a list of checkbuttons for different attributes of the current geometry.
-click on inheritMaterial if necessary to turn it off.
* Close the DejaVu GUI:
-click on Sphere/Cube/Cone (DejaVuGUI) Button
* ADJUST the view:
-SHIFT-middle button to zoom in on x1hpv’s water
4. Repeat the following steps for each docking to be evaluated. Here we show the procedure using ZINC00057384_x1hpv.dlg as an example:
1. Analyze-> Docking Logs->Open
-select ZINC00057384_x1hpv.dlg
-click on Open
-click on OK
2. Analyze-> Clusterings->Show
write a printable version of histogram:
-click on histogram’s Edit-> Write
-type in this filename: “ZINC00057384_x1hpv.ps”
-click on Save
Using the TSRI cluster: garibaldi
All input file preparation should be done on your local computer. The
interactive head node on the garibaldi cluster is used to transfer
the files from your computer to the cluster where the calculations will
be carried out. For today’s tutorial, we will demonstrate launching a
sample docking and then use previously computed results.
We create a tar file of the VSTutorial directory tree:
Type this:
cd /usr/tmp/tutorial
tar -czvf VSTutorial.tar.gz VSTutorial
Note: In our usage here of the tar command, we include the verbose flag, -v, to show what is going on.
Next transfer it to garibaldi using sftp.
sftp garibaldi
put VSTutorial.tar.gz
exit
Log on to garibaldi:
ssh garibaldi
The PBS command qdel removes a job from the queue:
qdel ######.garibaldi
You will receive an email when each job finishes that includes
information about whether the job finished successfully or not.
For sanity reasons, we will not be launching all the jobs. To do so you
would use a foreach loop like this:
Files for exercises:
Input Files:
x1hpv.pdb, ZINC.mol2, x1hpv.gpf
Results Files
Ligand
<ligand>.pdbqt
Macromolecule
x1hpv.pdbqt
AutoGrid
x1hpv.gpf
x1hpv.*.map, x1hpv.maps.fld, x1hpv.maps.xyz
AutoDock
ind_x1hpv.dpf, ind_x1hpv.dlg,
<ligand>_x1hpv.dpf, <ligand>_x1hpv.dlg
Appendix A: Usage for AutoDockTools Scripts
The python scripts in the AutoDockTools/Utilities24 module are customizable via input flags.
Note: You can generate any of these usage statements by typing the script name with no input, e.g. prepare_ligand.py
prepare_ligand4.py -l ligand_filename
-l ligand filename (required)
prepare_receptor4.py -r filename
-r receptor_filename
Optional parameters:
-v verbose output
-o pdbqt_filename (receptor_name.pdbqt)
-A type(s) of repairs to make (" "):
'bonds_hydrogens': build bonds and add hydrogens
'bonds': build a single bond from each nonbonded atom
to its closest neighbor
'hydrogens': add hydrogens
'checkhydrogens': add hydrogens only if there are none already
'None': do not make any repairs
(default is 'checkhydrogens')
-C preserve all input charges ie do not add new charges
(default is addition of gasteiger charges)
-p preserve input charges on specific atom types, eg -p Zn -p Fe
-U cleanup type:
'nphs': merge charges and remove non-polar hydrogens
'lps': merge charges and remove lone pairs
'waters': remove water residues
'nonstdres': remove chains composed entirely of
residues of types other than the standard 20 amino acids
'deleteAltB': remove XX@B atoms and rename XX@A atoms->XX
(default is 'nphs_lps_waters_nonstdres')
-e delete every nonstd residue from any chain
'True': any residue whose name is not in this list:
['CYS','ILE','SER','VAL','GLN','LYS','ASN',
'PRO','THR','PHE','ALA','HIS','GLY','ASP',
will be deleted from any chain. NB: there are no
nucleic acid residue names at all in the list.
(default is False which means not to do this)
-M mode (automatic)
interactive (do not automatically write outputfile)
summarize_results4.py -d directory
-d directory
Optional parameters:
-t rmsd tolerance (default is 1.0)
-f rmsd reference filename
(default is to use input ligand coordinates from docking log)
-b print best docking info only (default is print all)
-L print largest cluster info only (default is print all)
-B print best docking and largest cluster info only
(default is print all)
-o output filename
(default is 'summary_of_results')
-a append to output filename
(default is to open output filename 'w')
-k build hydrogen bonds
-r receptor filename
-u report unbound energy
-v verbose output