POAP Docking Protocol
PII: S1476-9271(17)30575-3
DOI: https://doi.org/10.1016/j.compbiolchem.2018.02.012
Reference: CBAC 6795
Please cite this article as: Samdani, A., Vetrivel, Umashankar, POAP: A GNU
Parallel based multithreaded pipeline of Open Babel and AutoDock suite for
boosted High Throughput Virtual Screening. Computational Biology and Chemistry
https://doi.org/10.1016/j.compbiolchem.2018.02.012
This is a PDF file of an unedited manuscript that has been accepted for publication.
As a service to our customers we are providing this early version of the manuscript.
The manuscript will undergo copyediting, typesetting, and review of the resulting proof
before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that
apply to the journal pertain.
POAP: A GNU Parallel based multithreaded pipeline of Open Babel and AutoDock suite for boosted High Throughput Virtual Screening
Vision Research Foundation, Sankara Nethralaya, Chennai - 600 006, Tamil Nadu, India.
2 School of Chemical and Biotechnology, SASTRA University, Thanjavur, India.
* Corresponding author Email: [email protected], [email protected]
Dr.V.Umashankar,
HOD & Principal Scientist
Centre for Bioinformatics,
Kamalnayan Bajaj Institute for
Research in Vision and Ophthalmology,
Vision Research Foundation,
Sankara Nethralaya,
Chennai - 600 006,
Tamil Nadu, India.
Graphical abstract
Highlights:
POAP is a GNU Parallel based pipeline that enables optimally parallelized HTVS run by
POAP provides high scalability and optimal usage of CPU cores leading to significant
POAP features multi receptor docking and comparative analysis enabling drug repurposing studies
Abstract
High throughput virtual screening plays a crucial role in hit identification during the drug
discovery process. With the rapid increase in the size of chemical libraries, the virtual screening process
becomes computationally challenging, thereby posing a demand for efficiently parallelized software
pipelines. Here we present a GNU Parallel based pipeline, POAP, that is programmed to run Open
Babel and AutoDock suite under highly optimized parallelization. The ligand preparation module is
a unique feature in POAP, as it offers extensive options for geometry optimization, conformer
generation, parallelization and also quarantines erroneous datasets for seamless operation. POAP
also features multi receptor docking that can be utilized for comparative virtual screening and drug
efficient pipeline that enables high scalability, seamless operability, dynamic file handling and
optimal utilization of CPUs for computationally demanding tasks. POAP is distributed freely under
Keywords
GNU Parallel; AutoDock; Open Babel; Virtual screening; Parallel processing; ligand preparation
1. Introduction
Drug discovery and development has undergone phenomenal changes over the years.
Computer aided drug design strategies like molecular modelling and structure-based virtual
screening have been the reason for this progressive change. Moreover, discovery of new therapeutic
moieties in a swift and cost effective manner is the need of the hour, and can only be achieved by
computational aspects to biological and chemical space has extremely influenced modern drug
development chain (Rahman et al., 2012). Computational tools are widely applied for predicting hit
molecules against the target of interest, and many such predictions have proven to be highly accurate
on experimental validation (Kuhn et al., 2016; Sliwoski et al., 2014; Xia, 2017). In the current scenario,
structure based drug design involving structural refinement, molecular docking and virtual screening
has become an indispensable part of the drug discovery process (Ferreira et al., 2015; Kalyani G, 2013;
The “Open source” concept has revolutionized the software industry worldwide. A ten-point
standard was announced by the Open Source Initiative to define the term “open source” (Årdal and
Røttingen, 2012). Among these ten points, three are considered to be significant: access to source
code, free redistribution and creation of derived works (Årdal and Røttingen, 2012). Open source
based drug design software like AutoDock and Open Babel have played a major role in accelerating
the drug discovery process (Umashankar, V. G. S., & Gurunathan, S., 2015) and also abide well by
the key standards of the Open Source Initiative. However, the complete potential of these open source tools in
high throughput drug discovery can only be unleashed by means of massive parallelization, and
workflows are available only in expensive commercially licensed software. Thus, an open source
based virtual screening pipeline which is parallelized efficiently will attract many scientists with
limited resources to pursue virtual screening of large chemical libraries in an efficient
way.
Recent benchmarking studies on free and commercial docking tools have shown AutoDock
Vina to be an optimal performer in identifying the best ligand bound pose (Wang et al., 2016). Virtual
screening of ligands using AutoDock and AutoDock Vina is extensively used for
lead identification against target proteins. Many useful tools like PyRx, Raccoon, DOVIS,
VSdocker, AUDocker LE and PyMOL plugins are available for performing virtual screening studies
with AutoDock and AutoDock Vina (Chen, 2015, 2015; Lill and Danielson, 2011; Prakhov et al.,
2010; Sandeep et al., 2011; Zhang et al., 2008). However, there is a need for pipelines that efficiently
utilize the simple yet powerful GNU Parallel for parallelizing the complete virtual screening
workflow, right from ligand preparation to post docking analysis. In particular, there is a dearth of open
source pipelines which can handle the ligand preparation process in a parallelized manner. Inverse
docking and multiple protein docking protocols have been proven to be powerful methods to assign
the targets for the ligand of interest (Li et al., 2006; Medina-Franco et al., 2013). These protocols
Though there have been highly appreciable attempts to develop these sorts of open source
pipelines, there are concerns with installation, configuration and guided workflow. Moreover, many
such pipelines have not attempted to fully utilize the highly efficient GNU Parallel tool to
parallelize the virtual screening process, from ligand preparation to post docking analysis.
Hence, this study attempts to develop a parallelized virtual screening pipeline,
Parallelized Open Babel & AutoDock suite Pipeline (POAP), which integrates popular tools like
Open Babel, AutoDock, AutoDock Vina and AutoDockZn in an easily configurable bash shell based
text interface. POAP offers modules for ligand preparation, single receptor virtual screening,
multiple receptor virtual screening and consensus scoring. All these modules are engineered to run
in a GNU Parallel based multi CPU environment. In POAP, well optimized dynamic file handling
is also implemented, thereby enabling optimal RAM usage, quarantining of erroneous ligand datasets
to keep the workflow running undisturbed, and structured access to input, output and
intermediary files. The developed pipeline demonstrates that GNU Parallel can be used effectively
in the development of a complete virtual screening workflow.
POAP was developed using the bash programming language, integrating the most popular tools:
Open Babel-2.4.0 for ligand optimization and AutoDock-4.2.6, AutoDock Vina-1.1.2, AutoDockZn
executions of the jobs were achieved by utilizing the GNU Parallel tool.
GNU Parallel is a command line tool that can be used to run jobs in parallel. It contains most
of the options which are present in xargs. It is widely used to run the same command
a number of times over the given input. GNU Parallel executes jobs in parallel mode depending on the
number of CPU threads assigned by the user. The usage of this tool enables complete and powerful
utilization of CPU resources. Efficient file handling and parallel execution of processes make GNU
Open Babel is a freely accessible cheminformatics tool that is used for ligand optimization,
format conversion, 3D co-ordinate generation, feature based filtration etc. (O'Boyle et al., 2011b;
O'Boyle et al., 2011a). However, this tool handles only one ligand at a time on a single
CPU. Hence, in the ligand preparation module developed, Open Babel is parallelized using
appropriate GNU Parallel flags for conversion of the ligands in SMILES or 2D co-ordinate files to a
3D co-ordinate file (pdbqt format). Moreover, ligand conformer generation methods such as
Genetic Algorithm, Random rotor search, Weighted rotor search, Obconformer and Confab are also
invoked to produce the number of conformations chosen by the user. Furthermore, this module
also provides options for fixing stereochemical errors and for energy minimization of the ligands with
conjugate gradient or steepest descent methods.
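As a rough illustration of this fan-out (it is not POAP's own bash and GNU Parallel code), the Python sketch below maps a one-ligand conversion over a pool of workers. The `obabel` command line with `--gen3d` and `-O` is an assumption about the installed Open Babel, `build_command` and `convert_all` are hypothetical helper names, and the runner is injectable so the scheduling logic can be exercised without Open Babel present.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(ligand, out_dir):
    """Build one single-ligand Open Babel call (2D/SMILES -> 3D pdbqt).

    Assumes the `obabel` executable with its --gen3d and -O options.
    """
    ligand = Path(ligand)
    target = Path(out_dir) / (ligand.stem + ".pdbqt")
    return ["obabel", str(ligand), "-O", str(target), "--gen3d"]


def run_command(cmd):
    """Run one command and return its exit status."""
    return subprocess.run(cmd, capture_output=True, text=True).returncode


def convert_all(ligands, out_dir, jobs, runner=run_command):
    """Convert every ligand, keeping at most `jobs` conversions in flight."""
    commands = [build_command(lig, out_dir) for lig in ligands]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() preserves input order, so statuses line up with ligands.
        statuses = list(pool.map(runner, commands))
    return dict(zip(map(str, ligands), statuses))
```

Threads are sufficient here because each worker only waits on an external process; the `jobs` argument plays the role of the user-chosen number of parallel jobs.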
By default, ligands with erroneous data are also processed by Open Babel, which results in
archiving of erroneous, incomplete files. This might disrupt the ligand preparation workflow. Thus,
during the ligand preparation process, these types of ligand data are identified and quarantined. At
each step in the ligand preparation module, the user is prompted to enter the desired
values, enabling a modular run. A detailed flowchart of the ligand preparation pipeline is
shown in Fig.1.
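The quarantining idea can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the POAP implementation: it assumes each ligand has an output file plus the log text captured during its conversion, and it treats a logged error or a missing or empty output file as grounds for quarantine. The function names and directory layout are hypothetical.

```python
import shutil
from pathlib import Path


def is_erroneous(output_file, log_text):
    """Flag a ligand whose conversion logged an error or produced no data."""
    output_file = Path(output_file)
    if "ERROR" in log_text.upper():
        return True
    # Silent failure: the output file is missing or empty.
    return not output_file.exists() or output_file.stat().st_size == 0


def quarantine(results, quarantine_dir):
    """Move flagged outputs aside and return the names of the ligands kept.

    `results` maps each output path to the log text captured for it.
    """
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    kept = []
    for output_file, log_text in results.items():
        output_file = Path(output_file)
        if is_erroneous(output_file, log_text):
            if output_file.exists():
                shutil.move(str(output_file), str(quarantine_dir / output_file.name))
        else:
            kept.append(output_file.name)
    return sorted(kept)
```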
Fig.1: Detailed flowchart of Ligand preparation module. The dotted red line indicates the
AutoDock is a popular and widely used software for protein-ligand docking. It commits only
to a single CPU per docking run. AutoDock implements a Lamarckian Genetic Algorithm and
empirical free energy scoring to calculate the ligand binding energy (Morris et al., 2009).
AutoDock Vina is an improved version of AutoDock that uses a gradient optimization process for
scoring the binding affinity of the ligands. It also features multi-threading capability and more
accurate prediction of the ligand binding energy, thus making it a preferred tool for multiple ligand
screening processes (Trott and Olson, 2010). However, multiple CPU usage of AutoDock Vina is
dependent upon the level of exhaustiveness set during the search. POAP offers the flexibility of
choosing the maximum number of CPUs at the desired level of exhaustiveness, implemented through
GNU Parallel. This feature enables optimal and maximal resource planning during the AutoDock
Vina run. Recently, AutoDockZn, a charge-independent and directional model, has been
released, which can be used for docking of ligands to zinc metalloproteins (Santos-Martins et al., 2014).
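The resource planning described above reduces to simple integer arithmetic. The sketch below is illustrative only: `plan_vina_jobs` is a hypothetical helper, not part of POAP, and it assumes every AutoDock Vina job is given the same number of threads.

```python
def plan_vina_jobs(total_threads, cpus_per_job):
    """Return how many Vina jobs fit at once and how many threads stay idle."""
    if cpus_per_job < 1 or total_threads < cpus_per_job:
        raise ValueError("cpus_per_job must be between 1 and total_threads")
    parallel_jobs = total_threads // cpus_per_job
    idle_threads = total_threads - parallel_jobs * cpus_per_job
    return parallel_jobs, idle_threads
```

For example, `plan_vina_jobs(24, 8)` returns `(3, 0)`: three concurrent jobs with no idle threads.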
2.4 Dynamic file handling for reducing hard disk space and faster execution
In general, when AutoDock is run for single protein-ligand docking, the user needs to run autogrid
separately to map the ligand atom types in the grid parameter file (gpf). During this step, ligands
with unidentified atom types will not be processed. This becomes an important concern during the
parallelized run of AutoDock. To address this concern, the prepared ligand datasets are crosschecked
for atom types supported by AutoDock through an automated script, so that ligands with
unidentified atom types are also quarantined before the execution of autogrid. This feature is
extremely useful for seamless grid preparation during the execution of AutoDock based virtual
screening. Moreover, in order to reduce time and hard disk space occupancy, all the map files
representing the entire ligand dataset are directed to a common hub directory. Otherwise, for each
ligand, map files would be created in separate directories, leading to redundant map files for identical
atom types and thereby consuming huge disk space and computational time. Hence, during the execution
of this module, the non-redundant map files are kept in a central hub for performing the docking
process. Further, to enable easier visualization and interpretation of data using MGLTOOLS, the atom
map files and .dlg files are directed to the working directory in an automated manner. A detailed
flowchart of the developed virtual screening pipeline is shown in Fig.2.
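A minimal Python sketch of the atom type cross-check and of the shared map idea is given below. It is an illustration under assumptions rather than POAP's script: the atom type is read as the last whitespace-separated field of each ATOM/HETATM record, `SUPPORTED_TYPES` is only an illustrative subset of the AutoDock 4 atom types, and the function names are hypothetical.

```python
# Illustrative subset of AutoDock 4 atom types; the authoritative list
# is the parameter file shipped with the AutoDock release in use.
SUPPORTED_TYPES = {"C", "A", "N", "NA", "OA", "S", "SA", "P", "F", "Cl", "Br", "I", "H", "HD"}


def atom_types(pdbqt_text):
    """Collect the atom types from the last column of ATOM/HETATM records."""
    types = set()
    for line in pdbqt_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            types.add(line.split()[-1])
    return types


def split_by_support(ligands, supported=SUPPORTED_TYPES):
    """Separate usable ligands from those with unknown atom types.

    `ligands` maps a ligand name to its pdbqt text. Returns the usable
    names, the quarantined names, and the union of atom types for which
    a single shared set of grid maps would be needed.
    """
    usable, quarantined, map_types = [], [], set()
    for name, text in ligands.items():
        types = atom_types(text)
        if types and types <= supported:
            usable.append(name)
            map_types |= types
        else:
            quarantined.append(name)
    return usable, quarantined, map_types
```

The third return value is the point of the common hub: one map per atom type in the union serves the whole dataset, instead of one set of maps per ligand.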
Fig.2: Detailed flowchart of Virtual screening module.
2.5 Execution of Ligand Preparation Module
To prepare the ligand datasets in the required format, the ligand preparation script
“POAP_lig.bash” needs to be executed. To start this module in interactive mode, the -s flag
should be used. During the initial step of ligand preparation, the user needs to specify the
directory path of the ligands that are to be processed. Currently, POAP supports sdf, mol2, smi, mol, sd,
sy2, ml2 and pdb formats. The ligand datasets can be provided as a single compressed file or as
individual files. Next, the user must specify the number of jobs to be run in parallel. This can be
chosen according to the available computing resources. This module also offers the option to
choose the desired forcefield (Ghemical/GAFF/MMFF94/MMFF94s/UFF) for ligand optimization
and 3D conversion. Moreover, the user will also be prompted to select the method for ligand conformer
search, which includes Genetic Algorithm, Random rotor search, Weighted rotor search,
Obconformer and Confab. Here, the user needs to specify the search method and also provide the
number of conformations to be generated and other RMSD or energy cut-offs to filter the ligand
conformations. This module also enables the user to skip this step and proceed directly to the energy
minimization step. In the minimization parameter interface, the user needs to specify the choice
of algorithm, number of steps and other cut-off values along with the desired output file format.
After fetching all the inputs from the user, the ligand preparation module will run the jobs in
parallel. During the parallelized run, the ligands with erroneous data will be quarantined and excluded.
This quarantining process is triggered by the detection of ERROR messages emitted by obabel
or by the prepare_ligand4.py script during pdbqt conversion. The ligands for which error messages
were emitted will be automatically quarantined in the respective process folders. In some cases,
Open Babel will generate an empty file without any error message; POAP is also programmed to
dynamically identify and quarantine these types of erroneous ligands. POAP also screens the ligand
datasets for AutoDock atom types and quarantines the ligands with unknown atom types. Only the
ligands passing all these checks will be directed to the output directory. If an optimized
ligand dataset is already available, the user can directly proceed to pdbqt conversion using the -pdbqt flag
during the initiation of POAP_lig.bash. Similarly, various shortcut flags like -three for 3D
conversion, -conf for conformer generation, -min for minimization etc. can be used during the
initiation of the script to skip the interactive mode. Kindly refer to the detailed operating manual and
To perform parallelized virtual screening with the AutoDock suite, POAP_vs.bash should be
executed in the bash terminal. To start the screening in interactive mode, the -s flag should be added
during the execution of the script. In the initial step of ligand assignment, the user needs to specify the
directory path of the prepared ligands in pdbqt format. If the user intends to use AutoDock Vina,
the directory path to the configuration file also needs to be provided. For AutoDock and AutoDockZn,
the user needs to provide the path to reference .gpf and .dpf files containing details on customized grid
and docking parameters, respectively. Specifically, in the case of AutoDockZn, the forcefield .dat file
needs to be kept in the directory where the .gpf and .dpf files are located. Further, in the interactive
mode, the user needs to specify the number of CPU threads to be utilized for running a single AutoDock
Vina job, whilst in the case of AutoDock and AutoDockZn, the number of jobs to be run in parallel needs
to be specified. Further, the user needs to provide the number of top hits for which the protein-ligand
complexes need to be generated as pdb files. On completion of all these steps, virtual screening
will be executed in parallel. Finally, the results of virtual screening will be tabulated in the toplist.txt file
containing different energy scores, theoretical pI values etc. Moreover, the docked complexes of the
top hits will be parsed as pdb files, and can be retrieved from the Results folder in the working
directory.
2.7 Multi receptor Virtual screening
The virtual screening module also provides a very useful option for screening multiple
ligands against multiple receptors in a parallelized manner. This option allows the user to choose a set
of proteins with different or similar active sites and perform virtual screening of chemical libraries in
a single run. The results of multiple receptor virtual screening are provided as a tab delimited
file with docking scores of all ligands vs. all receptors. Moreover, these results are automatically sorted
according to binding score similarity across the receptors, with the respective standard deviation values.
This feature enables the user to explore the ligands that target multiple receptors, and also aids in
shortlisting the ligands with off target effects.
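The sorting by score similarity can be pictured with the short Python sketch below. It is a hedged illustration, not the POAP output routine: the column layout, the use of the population standard deviation, and the tie-break on mean energy are all assumptions, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def rank_by_consistency(scores):
    """Rank ligands by how similar their scores are across receptors.

    `scores` maps ligand -> {receptor: binding energy in kcal/mol}.
    Rows are sorted by standard deviation (most consistent first), with
    the mean energy breaking ties (more negative first).
    """
    rows = []
    for ligand, per_receptor in scores.items():
        values = list(per_receptor.values())
        rows.append((ligand, mean(values), pstdev(values)))
    return sorted(rows, key=lambda row: (row[2], row[1]))


def to_tsv(rows):
    """Render the ranked rows as tab delimited text."""
    lines = ["ligand\tmean\tsd"]
    lines += [f"{lig}\t{avg:.3f}\t{sd:.3f}" for lig, avg, sd in rows]
    return "\n".join(lines)
```

A low standard deviation combined with a strongly negative mean points to a ligand that binds several receptors comparably, which is the multi-target or off-target signal the text refers to.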
2.9 Validation of POAP for performance using sample datasets
different databases in SDF format were chosen: FDA approved drugs from DrugBank (1,288 ligands)
(Wishart et al., 2017), and Myriascreen-II (10,000 ligands), Natural Derivative Library (NDL) (3,040
ligands), Extended Flavonoids Derivative (EFD) (4,053 ligands) from the Timtec database
(http://www.timtec.net/). The dataset of FDA approved drugs comprised 2,073 ligands, which
includes inorganic ligands, metallic ligands and other ligands with atom types which are not supported
by AutoDock. Hence, after excluding all these ligands, the resulting 1,288 ligands were
processed by the Ligand Preparation module of POAP. As these datasets were in 2D SDF format, 3D
conversion was performed with the medium speed option. Different ligand conformers were generated
with weighted rotor conformational search, wherein the number of conformers was set to 50, among
which the lowest energy conformer was picked and taken further. Energy minimization of the
lowest energy conformer was performed with 5000 steps of conjugate gradient minimization with the
MMFF94 force field. The distance cut-offs for van der Waals, electrostatic and non-bonded
interactions were set to default. The pdbqt conversion of the energy minimized structure was
done by calling the prepare_ligand4.py script available in MGLTOOLS-1.5.7. To compare the
efficiency of the Ligand preparation module, a single CPU committed script was run by retaining the
above discussed parameters. Finally, the ligand preparations were carried out both in the parallelized
pipeline and in GNU Parallel disabled mode. The test runs, from ligand preparation to virtual
screening, were carried out in an Ubuntu-12.04 64-bit environment installed on a T5510 DELL workstation
with Intel Xeon(R) CPU E5-2620V2, 2.10 GHz clock speed (12 Cores, 24 threads) with 62.9GB
RAM. The software integrated in the pipelines include Open Babel-2.4.1, MGLTOOLS-1.5.7,
AutoDock-4.2.6, AutoDock Vina-1.1.2 and GNU Parallel-20170122. In parallel mode, 24
jobs were allocated to be executed in parallel at a given time based on the available CPU threads. The
log files of parallelized and serial modes were analysed for assessing the performance in terms of
speedup ratio. The log details from parallelized preparation of FDA approved datasets from
DrugBank were used for measuring the parallelization efficiency with Amdahl’s Law, as serial
execution with other larger datasets (Myriascreen-II, NDL, and EFD) would lead to enormous
computational time with the limited hardware.
To demonstrate the speedup efficiency and drug repurposing applicability of POAP, crystal
structures of four popular and diverse drug targets (N=4) were chosen: Human ROCK I (Rho
associated protein kinase I) (PDB ID: 2ETR, 2.6 Å), which is a potential target implicated in many
diseases like cancer, atherosclerosis, glaucoma etc. (Riento and Ridley, 2003); HTH-type
transcriptional regulator (EthR) (PDB ID: 5J1U, 1.8 Å) of M. tuberculosis, which is responsible for
modulating ethionamide (ETH) drug resistance (Willand et al., 2009); Pks13 (Polyketide
synthase) of M. tuberculosis (Thioesterase domain: PDB ID: 5V3X, 1.94 Å) involved in mycolic acid
synthesis (Aggarwal et al., 2017; Takayama et al., 2005); PqsA (Anthranilate-coenzyme A ligase)
(PDB ID: 5OE3, 1.43 Å) of Pseudomonas aeruginosa, which is involved in quorum sensing through
Further, to demonstrate the drug enrichment analysis, a different set of proteins
(N=3) from the DUD-E database, sharing only ROCK1 with the set above, was chosen: FAK1 (Focal Adhesion Kinase) (PDB
ID: 3BZ3, 2.2 Å), which is a prominent target in many metastatic cancers; FABP4 (Fatty Acid Binding
protein) (PDB ID: 2NNQ, 1.8 Å), a well-documented target playing a key role in modulation of diabetes,
insulin resistance and atherosclerosis. The choice of these proteins was based on the availability of
As all these structures were co-crystallized with ligands, the chains harbouring
the ligands were chosen and optimized by addition of hydrogen atoms, fixing of missing side chains,
and overall geometry optimization using the WHAT IF server. Later, the optimized structures were
prepared for docking using default options in MGLTOOLS-1.5.7. Here, the non-polar hydrogens
were merged, Gasteiger charges were added, AutoDock atom types were defined and the structures were saved in
pdbqt format. The receptor grids for performing docking were set around the active site regions.
2.9.3 Parallelized run of AutoDock Vina
The directory paths leading to the working directory, the AutoDock Vina configuration file, the prepared
FDA ligand dataset from DrugBank in pdbqt format, and the structural coordinates of the protein datasets
(ROCK1, EthR, Pks13 and PqsA) in pdbqt format were provided as input when invoking the virtual
screening module in interactive mode. Moreover, docking parameters with exhaustiveness of 8,
number of modes set to 20 and number of CPUs set to 8 were assigned for running a single job.
Hence, these parameters will run 3 AutoDock Vina jobs in parallel, with the 24 CPU threads divided as
8 threads per job. To cross validate the modified parallel job execution, AutoDock Vina
was also run with default parameters, retaining the same exhaustiveness as above.
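For orientation, the per-job settings quoted above could be written into an AutoDock Vina configuration file along the following lines. This Python sketch only renders the text of such a file; the key names follow AutoDock Vina's documented options as understood here, while `vina_config`, the receptor file name and the search box values are hypothetical placeholders, since the actual configuration files used in the study are not reproduced in the text.

```python
def vina_config(receptor, center, size, exhaustiveness=8, num_modes=20, cpu=8):
    """Render a Vina configuration file body as text.

    `center` and `size` are (x, y, z) tuples in angstroms describing the
    search box; the defaults mirror the values quoted in the text.
    """
    lines = [f"receptor = {receptor}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value}")
    lines += [
        f"exhaustiveness = {exhaustiveness}",
        f"num_modes = {num_modes}",
        f"cpu = {cpu}",
    ]
    return "\n".join(lines) + "\n"
```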
The directory paths containing the prepared FDA ligand datasets from DrugBank in pdbqt format,
structural coordinates of the protein datasets (ROCK1, EthR, Pks13 and PqsA) in pdbqt format, the working
directory, and the .gpf and .dpf files were provided as input in the interactive mode. Further, the parallelized
virtual screening using AutoDock was initiated, wherein the initial grid calculation for all the ligand
atom types was performed, followed by execution of the ligand docking runs as 24 parallel jobs. To cross
validate the parallel job execution, the docking runs were also run in GNU Parallel disabled mode.
The time taken for completion of the parallelized jobs of ligand preparation and virtual
screening was captured from the POAP log files. Further, the time values from the log files were
converted to speedup ratio using the equation (Karmani et al., 2011):
$$\text{speedup} = \frac{\text{Time on 1 processor}}{\text{Time on } P \text{ processors}} \qquad \text{(Eqn 1)}$$
The speedup ratios for all the above mentioned parallelized tasks (N=4 per
parallelized module) were averaged and the corresponding standard deviations were calculated.
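Eqn 1 and the averaging step can be expressed compactly. The following Python sketch assumes timings already converted to seconds and uses the sample standard deviation, which is an assumption since the text does not state which estimator was used; `speedup` and `summarize` are hypothetical names.

```python
from statistics import mean, stdev


def speedup(serial_seconds, parallel_seconds):
    """Eqn 1: time on 1 processor divided by time on P processors."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return serial_seconds / parallel_seconds


def summarize(timings):
    """Return (mean, sample SD) of the speedup over (serial, parallel) pairs."""
    ratios = [speedup(serial, parallel) for serial, parallel in timings]
    return mean(ratios), stdev(ratios)
```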
2.9.6 Evaluation of Parallelization efficiency of POAP with Amdahl’s Law:
The efficiency of parallelization in POAP was evaluated by direct measurement of observed
speedup and comparison with the ideal speedup based on Amdahl’s law. For these evaluations, the ligand
preparation and virtual screening runs were performed with different CPU allocations. As POAP runs
only in parallelized mode and does not contain any serial execution scripts, the theoretical ideal speedup
value for Amdahl’s law was derived by substituting the serial fraction f with 0 in Amdahl’s equation
(Eqn-2) (Amdahl GM 1967; Karmani et al., 2011). To calculate the observed speedup ratio for the ligand
preparation module, the FDA approved ligand dataset (1,288) from DrugBank was used. Initially, the
ligands were processed in serial mode with a single CPU core. Following this, parallelized runs with
4, 8, 12, 16, 20 and 24 CPUs were performed. In a similar fashion, the speedup ratio for parallelized
virtual screening with AutoDock was calculated with ROCK1 vs. FDA approved ligand datasets from
DrugBank.
Amdahl’s Law was applied using the Equation:
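In its standard textbook form (the symbols S and P are assumptions about the authors' notation; f is the serial fraction named in the text), Amdahl's law gives the speedup on P processors as:

```latex
S(P) = \frac{1}{f + \dfrac{1 - f}{P}}, \qquad S(P)\big|_{f = 0} = P
```

With f = 0, as adopted for the theoretical ideal value, the ideal speedup equals the number of processors, for example S(24) = 24.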
Amdahl’s calculation, a comparative speedup ratio analysis between the default AutoDock Vina run and
the POAP mediated AutoDock Vina run was performed. Here, the time values from ROCK1 vs. FDA
approved ligands virtual screening were used for comparative speedup ratio analysis. For the default
AutoDock Vina run, four sets of evaluations were performed, wherein the first set involved a direct
serial run with no explicit CPU allocation, and the other three involved CPU allocations of 8, 16
and 24 with the --cpu flag of AutoDock Vina. In the POAP mediated run, every single AutoDock
Vina job was restricted to 8 CPUs, as the hardware used had only 24 cores; hence, 8 cores for a single
job, 16 cores for two jobs and 24 cores for three jobs were run in parallel. For all the AutoDock Vina
executions, an exhaustiveness value of 8 was applied globally. Finally, the speedup ratios of the default
AutoDock Vina and POAP mediated runs were compared to evaluate the efficiency.
The crystal structures of ROCK1, EthR, PqsA and Pks13 in complex with their respective ligands
were processed for redocking with MGLTOOLS-1.5.7. The receptor grids for all the proteins were set
around the actual binding site observed in the crystallized form. Redocking studies were performed
with AutoDock Vina retaining the default parameters. Finally, the RMSD (Root Mean Square Deviation)
on structural superposition of the co-crystallized forms with the corresponding redocked forms was
calculated to infer the predictive accuracy. RMSD values were calculated using rmsd.py available in the
Schrodinger Maestro script library. This redocking analysis was performed to utilize the binding
energy score as a positive control for demonstrating the drug repurposing applicability of POAP.
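The RMSD measure itself is straightforward to state in code. The sketch below is a plain Python illustration and not the Schrodinger rmsd.py script: it assumes the two poses list the same atoms in the same order, already in a common reference frame, and it applies no fitting or symmetry correction.

```python
import math


def rmsd(reference, docked):
    """Root mean square deviation between two matched coordinate lists.

    Both inputs are sequences of (x, y, z) tuples in the same atom order;
    no fitting or symmetry correction is applied.
    """
    if len(reference) != len(docked) or not reference:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for ref_atom, dock_atom in zip(reference, docked)
        for a, b in zip(ref_atom, dock_atom)
    )
    return math.sqrt(squared / len(reference))
```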
2.9.9 Multi receptor virtual screening and drug repurposing:
To demonstrate the applicability of the multi receptor virtual screening module of POAP in drug
repurposing, ROCK1, EthR, Pks13 and PqsA structures were prepared, and their respective docking
configuration files were kept in a common folder with the corresponding protein names. The directory
paths containing the prepared FDA datasets from DrugBank and the configuration files were given
as input for multi receptor virtual screening. This module was demonstrated only with AutoDock
Vina, with an exhaustiveness value of 8 and the number of CPUs per job set to 8. The top 10 hits for each
protein were ranked based on the binding energy. Further, these hits were analysed for intermolecular
interactions using Protein-Ligand Interaction Profiler (Salentin et al., 2015). Finally, a comparative
analysis of binding energy and intermolecular interactions of redocked complexes vs. top hits was
performed, so as to shortlist the FDA ligands from DrugBank that could be repurposed for targeting
To demonstrate the docking enrichment analysis of POAP based virtual screening, the crystal
structures of three prominent protein targets were chosen: FAK1 (PDB ID: 3BZ3, 2.2 Å), which is targeted in many
metastatic cancer types (Lin et al., 2017); ROCK1 (PDB ID: 2ETR, 2.6 Å), a potential target in many
cancers, glaucoma, atherosclerosis and vascular diseases (Defert and Boland, 2017); and FABP4 (PDB
ID: 2NNQ, 1.8 Å), a well-documented target for diabetes and atherosclerosis (Floresta et al., 2017).
The active ligands and decoys corresponding to these targets were retrieved from the DUD-E
dataset (http://dude.docking.org/) (Mysinger et al., 2012) for performing the enrichment analysis.
To proceed with POAP based virtual screening, the protein structures were prepared using
MGLTOOLS-1.5.7. Since the ligand datasets from DUD-E were in mol2 format, these were
converted to pdbqt format using the POAP_lig.bash script. Further, virtual screening of these prepared
ligands vs. the chosen protein targets was performed using the POAP_vs.bash script. For the virtual
screening runs, area under curve (AUC) and Enrichment Factor (EF) calculations were performed to
validate the efficiency of discriminating actives from decoys. The AUC for the Receiver operating
characteristic (ROC) curve was calculated using the Rocker tool (http://users.jyu.fi/~satalatt/cgi-bin/rocker.cgi),
wherein the ligand binding energy scores were given as input (Lätti et al., 2016). A
stringent EF cut-off of 1% was applied to assess the percentage of actives picked among the ranked
datasets.
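Both metrics can be computed directly from the docking scores. The Python sketch below is an independent illustration and not the Rocker tool: it assumes binding energies where lower is better, computes the AUC by pairwise comparison of actives and decoys, and defines EF as the active rate in the top fraction divided by the active rate in the whole dataset. The function names are hypothetical.

```python
def roc_auc(active_scores, decoy_scores):
    """AUC as the probability that an active outscores a decoy.

    Scores are binding energies, so lower (more negative) is better;
    ties count as half.
    """
    wins = 0.0
    for active in active_scores:
        for decoy in decoy_scores:
            if active < decoy:
                wins += 1.0
            elif active == decoy:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def enrichment_factor(active_scores, decoy_scores, fraction=0.01):
    """EF at the given top fraction of the energy-ranked dataset."""
    # Stable sort: on tied scores, actives stay ahead of decoys (optimistic).
    ranked = sorted(
        [(score, True) for score in active_scores]
        + [(score, False) for score in decoy_scores],
        key=lambda item: item[0],
    )
    top_n = max(1, round(len(ranked) * fraction))
    actives_in_top = sum(1 for _, is_active in ranked[:top_n] if is_active)
    overall_rate = len(active_scores) / len(ranked)
    return (actives_in_top / top_n) / overall_rate
```

The pairwise AUC is quadratic in dataset size, which is acceptable for a sketch; a rank-based formulation would be preferable for very large decoy sets.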
3.1 Speedup performance of Ligand Preparation Module in POAP:
The efficiency of the ligand preparation module was validated using four (N=4) different ligand
databases: FDA approved drugs from DrugBank, Myriascreen-II, NDL and EFD. The validation runs
operated seamlessly, leading to generation of ligand datasets with optimal geometry
favourable for virtual screening. Among the ligands prepared, Open Babel showed error
messages for a few ligands during 2D to 3D conversion, conformer generation, minimization and pdbqt
conversion. POAP efficiently quarantined these ligands from the rest of the ligand datasets, leading to
seamless parallel operability and thereby eliminating perturbations during the ligand conversion
process. The runs resulted in optimized ligand datasets in pdbqt file format archived in the pdbqt folder
of the working directory. The time values of each parallelized run were obtained from the corresponding
log files. Further, these values were converted to seconds, from which the speedup factor was calculated
using Eqn 1. Subsequently, the speedup factors from these four datasets (N=4) were averaged and the
standard deviation (SD) across these runs was calculated. It was observed that the parallelized mode
conferred an average speedup of 14.788 times over the serial mode, with a least SD of 0.148
The speedup performance of POAP based AutoDock and AutoDock Vina parallelization was demonstrated by virtual screening of FDA drugs from DrugBank against four proteins, namely ROCK1, EthR, Pks13 and PqsA. During AutoDock based virtual screening, 24 jobs were executed in parallel for each protein, and the corresponding time values were obtained and processed as discussed in section 3.1. The parallelized AutoDock run showed a speedup of 12.464 times over the serial mode (N=4, SD=0.081). In the case of AutoDock Vina, virtual screening was parallelized to run three jobs in parallel, each utilizing 8 CPU threads, which resulted in a speedup of 2.397 times over the default mode (N=4, SD=0.066) (Fig.3).
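The processing of time values into a mean speedup and SD, as used here and in section 3.1, can be sketched as follows. Eqn 1 is not reproduced in this section, so the sketch assumes the usual definition (serial time divided by parallel time), an `H:MM:SS` time format in the log files, and the sample standard deviation. The time values in the example are made up for illustration.

```python
from statistics import mean, stdev


def to_seconds(hms):
    """Convert an 'H:MM:SS' string to seconds (assumed log time format)."""
    hours, minutes, seconds = (float(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(serial_time, parallel_time):
    """Speedup factor, assumed to be serial time over parallel time."""
    return to_seconds(serial_time) / to_seconds(parallel_time)


def summarize(pairs):
    """Mean and sample SD of the speedup over several (serial, parallel) runs."""
    factors = [speedup(serial, parallel) for serial, parallel in pairs]
    return mean(factors), stdev(factors)


if __name__ == "__main__":
    # Hypothetical (serial, parallel) wall-clock times for two datasets.
    runs = [("10:00:00", "0:40:00"), ("7:00:00", "0:30:00")]
    avg, sd = summarize(runs)
    print(f"mean speedup = {avg:.3f}, SD = {sd:.3f}")
```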
Fig. 3: Speedup ratio comparison for the POAP based runs of AutoDock, AutoDock Vina and Open Babel vs. corresponding serial execution.
The speedup ratios for ligand preparation (FDA DrugBank database) and AutoDock (virtual screening of FDA drugs from DrugBank against ROCK1) are shown in Fig.4. The speedup increases steadily with the number of processors. The speedup ratio tends to drop below the theoretical ideal value beyond 8 processors, but a sub-linear increase in speedup ratio was still observed as further processors were added. Hence, performance continues to improve with the number of CPUs, although less than proportionally beyond 8 processors.
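One way to quantify the sub-linear scaling noted above is parallel efficiency (speedup divided by processor count), together with the Karp-Flatt serial fraction, which inverts Amdahl's law. This analysis is not part of the original study. The sketch below applies it to the reported AutoDock speedup of 12.464, under the assumption that the 24 parallel jobs ran on 24 processors.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of the ideal (linear) speedup that was achieved."""
    return speedup / processors


def serial_fraction(speedup, processors):
    """Karp-Flatt metric: the serial fraction implied by a measured speedup."""
    return (1 / speedup - 1 / processors) / (1 - 1 / processors)


def amdahl_speedup(serial_frac, processors):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1 / (serial_frac + (1 - serial_frac) / processors)


if __name__ == "__main__":
    s, p = 12.464, 24  # reported AutoDock speedup; processor count assumed
    frac = serial_fraction(s, p)
    print(f"efficiency      = {parallel_efficiency(s, p):.3f}")
    print(f"serial fraction = {frac:.4f}")
    for n in (8, 16, 24):
        print(f"Amdahl speedup on {n} processors = {amdahl_speedup(frac, n):.2f}")
```

A serial fraction that stays roughly constant as processors are added would suggest an inherently sequential portion of the workflow, whereas a growing value would point to parallel overhead.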
Fig.4: Parallel performance comparison for AutoDock and Open Babel speedup ratio.
3.4 Speedup ratio analysis of default vs. POAP mediated AutoDock Vina run
Speedup ratio analysis of the AutoDock Vina default mode vs. the POAP triggered mode indicates that increasing the number of processors (8, 16, 24) through the inbuilt --cpu flag of AutoDock Vina does not affect the speedup factor, which remains similar to that of the default mode (Fig.5). In contrast, with POAP mediated CPU allocation, a significant increase in the speedup ratio was observed (Fig.5). Since the hardware used had 24 cores, POAP allocated 8 cores per job and ran one job on 8 cores, two jobs on 16 cores and three jobs on 24 cores in parallel. This clearly signifies that a POAP mediated AutoDock Vina run will lead to
Fig.5: Speedup ratio comparison between default and POAP mediated run of AutoDock Vina.
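The core allocation used for the POAP mediated runs (8 cores per job, hence one, two or three concurrent jobs on 8, 16 or 24 cores) can be sketched as follows. This is an illustrative Python analogue, not POAP's GNU Parallel implementation; `concurrent_jobs`, `run_screen` and the `dock` callable are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def concurrent_jobs(total_cores, threads_per_job):
    """Number of docking jobs that fit on the machine at the same time."""
    if threads_per_job < 1 or total_cores < threads_per_job:
        raise ValueError("need at least one job's worth of cores")
    return total_cores // threads_per_job


def run_screen(ligands, dock, total_cores=24, threads_per_job=8):
    """Dock every ligand while keeping a fixed number of jobs running.

    `dock(ligand, threads)` is a hypothetical callable that would launch
    one docking run restricted to `threads` CPU threads.
    """
    n_jobs = concurrent_jobs(total_cores, threads_per_job)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(lambda lig: dock(lig, threads_per_job), ligands))


if __name__ == "__main__":
    for cores in (8, 16, 24):
        print(cores, "cores ->", concurrent_jobs(cores, 8), "parallel job(s)")
```

The point of the sketch is that throughput is gained by running several independent docking jobs side by side, each with a fixed thread budget, rather than by giving a single job more threads.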
3.5 Enrichment analysis:
ROC curve analysis helps to assess how successfully active compounds are discriminated from the inactive compounds in a dataset (Fawcett, 2006). The area under the curve (AUC) of the ROC provides information on the enrichment of active compounds in a virtual screening run. Generally, an AUC value greater than 0.5 indicates better-than-random enrichment of actives among the larger decoy sets (Empereur-Mot et al., 2016). Both AutoDock and AutoDock Vina were able to discriminate the active compounds from the large dataset of inactives. On comparing the AUC values of AutoDock (AD) and AutoDock Vina (ADV) from the virtual screening results of three proteins, FAK1 (AD=0.60, ADV=0.78), ROCK1 (AD=0.78, ADV=0.73) and FABP4 (AD=0.70, ADV=0.81) (Fig.6), it could be inferred that AutoDock Vina is a slightly better discriminator of actives than AutoDock. The Enrichment Factor provides information on the early recognition of active compounds in the ranked list. Except for ROCK1, AutoDock Vina identified a higher number of actives within the top 1% of the dataset than AutoDock: FAK1 (AD=0.88%, ADV=17.59%), ROCK1 (AD=7.86%, ADV=4.91%) and FABP4 (AD=5.28%, ADV=35.23%) (Table.S1).
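The two enrichment measures discussed above can be computed from a ranked docking list as in the following minimal sketch. It assumes that a lower (more negative) docking energy means a better rank and that labels are 1 for actives and 0 for decoys. The enrichment factor uses the common definition (hit rate in the top fraction divided by the overall hit rate), which may differ from the percentage values reported in Table.S1. The scores are made up.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random active outranks a random decoy.

    Lower score means better rank; ties count as half a win.
    """
    actives = [s for s, lab in zip(scores, labels) if lab]
    decoys = [s for s, lab in zip(scores, labels) if not lab]
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def enrichment_factor(scores, labels, fraction=0.01):
    """Hit rate in the top `fraction` of the ranked list over the overall hit rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0])
    n_top = max(1, int(round(fraction * len(ranked))))
    hits_top = sum(lab for _, lab in ranked[:n_top])
    return (hits_top / n_top) / (sum(labels) / len(ranked))


if __name__ == "__main__":
    # Hypothetical docking energies (kcal/mol) and activity labels.
    scores = [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]
    labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(f"AUC = {roc_auc(scores, labels):.3f}")
    print(f"EF(20%) = {enrichment_factor(scores, labels, 0.2):.2f}")
```

The pairwise AUC is quadratic in the dataset size, which is acceptable for a sketch; a rank-based formulation would be preferable for large decoy sets.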
Fig.6: ROC curve for the Docking Enrichment analysis: (a) AutoDock (b) AutoDock Vina.
Colour representation: FAK1-blue; ROCK1-red; FABP4-green.
In order to demonstrate the applicability of the AutoDock Vina based multi receptor docking module of POAP in drug repurposing, FDA drugs from DrugBank were screened against four targets, namely ROCK1, EthR, Pks13 and PqsA. Prior to virtual screening, the co-crystallized ligands (Y27 for ROCK1, P93 for EthR, I28 for Pks13 and 3UK for PqsA) were redocked to their respective proteins using AutoDock Vina, and the respective binding energies on redocking were noted. During the redocking process, the allowed torsional rotations for the co-crystallized ligands were set to 0, so as to conserve the native conformation. Further, the redocked complexes were compared to the respective co-crystallized forms by structural superposition and the corresponding ligand RMSD values were tabulated. This was performed to assess the predictive accuracy of the docking method. The ligand RMSD values were found to be in the range of 0.16 Å to 0.48 Å, suggesting the reliability of the method adopted. The protein-ligand interactions in the redocked complexes were analysed using Protein Ligand Interaction Profiler (PLIP) (Salentin et al., 2015). The binding energies and the protein-ligand interaction details of the redocked complexes were used as reference for identifying newer hits that could be repurposed from the FDA datasets. Further, multi receptor docking was performed using POAP, which resulted in a tab delimited file containing the binding energies of all the ligands of the FDA datasets vs. the four proteins studied. Here, the ligands were sorted according to the maximal binding energy exhibited across all the proteins. The file also featured the standard deviation of binding energies for each ligand across all the proteins. Finally, the probable repurposable ligands from FDA datasets for each protein were concluded based on the top scoring
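A tab delimited summary of the kind described above (binding energies per protein, sorted by the best energy, with a per-ligand standard deviation) could be assembled as in this minimal sketch. The exact column layout of POAP's output is not given here, so the columns are assumptions, as are the interpretation of "maximal binding energy" as the most negative value and the use of the population standard deviation. Ligand names, protein names and energies are placeholders.

```python
from statistics import pstdev


def rank_ligands(energies, proteins):
    """Build rows (ligand, values, best, sd) sorted by best binding energy.

    `energies` maps ligand -> {protein: binding energy in kcal/mol}.
    "Best" is taken as the most negative energy (an assumption).
    """
    rows = []
    for ligand, by_protein in energies.items():
        values = [by_protein[p] for p in proteins]
        rows.append((ligand, values, min(values), pstdev(values)))
    rows.sort(key=lambda row: row[2])
    return rows


def to_tsv(rows, proteins):
    """Render the ranked rows as tab delimited text with a header line."""
    lines = ["\t".join(["ligand", *proteins, "best", "sd"])]
    for ligand, values, best, sd in rows:
        cells = [ligand] + [f"{v:.1f}" for v in values]
        cells += [f"{best:.1f}", f"{sd:.2f}"]
        lines.append("\t".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    proteins = ["P1", "P2"]  # placeholder receptor names
    energies = {
        "ligA": {"P1": -7.0, "P2": -9.0},
        "ligB": {"P1": -10.0, "P2": -10.0},
    }
    print(to_tsv(rank_ligands(energies, proteins), proteins))
```

A low standard deviation flags a ligand that binds all receptors similarly, while a high value flags one that is selective for a subset of them.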
Redocking of Y27 to ROCK1 yielded a binding energy of -7.5 Kcal/mol, and the redocked pose remained close to the co-crystallized form (RMSD 0.48 Å), retaining the key interactions: H-bonds with M156 in the hinge region and D216 of the activation loop, and hydrophobic interactions with the glycine-rich loop residues I82 and V90, thereby indicating the correct binding orientation at the ATP binding site (Jacobs et al., 2006). Based on the comparative analysis of binding energy and residue interactions with Y27, two top ranking hits, namely DB06210 (Eltrombopag) and DB09280 (Lumacaftor), were found to have lower binding energies of -10.7 Kcal/mol and -10.5 Kcal/mol, respectively (Fig.S1). These two ligands were found to form H-bonds with N203, D216 and K105 of the activation loop, and also π-cation and salt bridge interactions with K105. It should be noted that all these residues together play a key role in the coordination of Mg2+ and ATP during catalytic activity (Jacobs et al., 2006). These two ligands also formed H-bonds and hydrophobic interactions with residues in the activation loop, the glycine-rich loop and the hinge region (Table. S3). Moreover, DB09280 (Lumacaftor) formed a halogen bond with M156 of the hinge region, which plays an important role in ATP binding in the pocket. Hence, based on these inferences, DB06210, which is used to stimulate platelet production, and DB09280, which is used to treat cystic fibrosis (CF), should be validated for repurposing as inhibitors in ROCK1 activity related diseases.
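The ligand RMSD values quoted for the redocked poses compare atom positions between the redocked and co-crystallized ligand. A minimal sketch of that calculation is given below. It assumes both coordinate sets are already in the same frame (for example after superposing the proteins), list the same atoms in the same order, and need no symmetry correction. The coordinates are made up.

```python
import math


def ligand_rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched lists of (x, y, z)."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical heavy-atom coordinates in angstroms.
    crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    redocked = [(0.1, 0.0, 0.0), (1.5, 0.2, 0.0), (1.4, 1.5, 0.1)]
    print(f"ligand RMSD = {ligand_rmsd(crystal, redocked):.2f} A")
```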
EthR vs. FDA dataset
Redocking of P93 with EthR yielded a binding energy of -8.9 Kcal/mol, and the redocked pose remained structurally close to the co-crystallized form (RMSD 0.25 Å), retaining the key interactions: H-bonds with residues N176 and N179, hydrophobic interactions with the residues spanning the tunnel region (F110, W138, W145, W207), and π-π stacking with F110. The hydrophobic residues lining the tunnel region play a key role in orienting the HTH motif during DNA binding of this transcription repressor (Nikiforov et al., 2017). Based on the binding energies and key residue interactions, the top two hit compounds, DB00795 (Sulfasalazine, -11.4 Kcal/mol) and DB00450 (Droperidol, -10.7 Kcal/mol), were found to be highly promising (Fig.S2). These ligands formed H-bonds with N176 and N179, which are key residues for targeting EthR and are also involved in the P93 interaction. Moreover, these ligands were also found to form hydrophobic and π-π stacking interactions with the residues in the subpocket I, II, II which are shown to interact with the most potent combinatorial drugs targeting EthR (Nikiforov et al., 2016) (Table.S4). Taken together, these two hits were found to be highly promising, as they interact with the druggable hotspot regions of EthR. Sulfasalazine is used in the treatment of Crohn's disease and rheumatoid arthritis, whilst Droperidol is used to treat nausea and vomiting. These drugs shall be further validated for
Redocking of I28 (TAM1) with Pks13 yielded a binding energy of -13.3 Kcal/mol, and the redocked pose remained structurally close to the corresponding co-crystallized structure (RMSD of 0.16 Å), retaining the key intermolecular interactions: hydrophobic and π-π stacking interactions with F1670, a salt bridge interaction with D1644, and hydrophobic interactions with residues Y1637, N1640, Y1663, A1667, F1670 and T1674, which line the substrate binding groove (Fig.S3). Based on binding energies and residue interactions, the top three hits, DB09280 (Lumacaftor, used in cystic fibrosis treatment; -11.8 Kcal/mol), DB00972 (Azelastine, used in the treatment of allergic and non-allergic rhinitis; -11.7 Kcal/mol) and DB06210 (Eltrombopag, used to stimulate platelet count; -11.7 Kcal/mol), from the FDA datasets were shortlisted as potential inhibitors of Pks13 (Fig.S3). All three ligands formed hydrophobic and π-π stacking interactions with F1670, similar to TAM1. Moreover, these three ligands showed hydrophobic interactions with residues in the substrate binding groove, similar to TAM1 (Table.S5). Based on these inferences, these three hits can be considered as potential repurposable inhibitors of Pks13 targeting mycolic acid synthesis in Mycobacterium tuberculosis (Aggarwal et al., 2017).
PqsA vs. FDA dataset
The redocking analysis of anthraniloyl-AMP (3UK) with PqsA yielded a binding energy of -14.6 Kcal/mol, and the redocked pose remained structurally close to the co-crystallized form (RMSD of 0.29 Å), retaining the key intermolecular interactions: H-bonds with residues G279, G300, G302 and T304, and hydrophobic interactions with another set of residues that play a key role in substrate binding (Fig.S4). From the virtual screening run, DB09074 (Olaparib, used in the treatment of ovarian cancer) and DB06817 (Raltegravir, used in the treatment of HIV-1 infection) were found to be stable binders, with binding energies of -10.4 and -10.2 Kcal/mol, respectively. These compounds were found to fit well into the active site region (anthraniloyl-AMP and substrate binding site) through H-bonds with G279, G300 and G302, and hydrophobic interactions with F209 and Y211. Moreover, these ligands also formed π-π stacking interactions with H308, mimicking the interacting ligands to potential inhibitors of PqsA, and shall be validated for repurposing capability towards targeting the PQS mediated quorum signalling pathway in Pseudomonas aeruginosa infections (Witzgall et al., 2017).
4. Conclusion:
POAP is distinctive in utilizing GNU Parallel to parallelize ligand preparation by Open Babel and virtual screening using the AutoDock suite. It features the important function of quarantining erroneous ligands, which is essential for an unperturbed parallelized run. The efficiency of the POAP modules in handling different datasets has been demonstrated in this study. POAP is distributed freely under the GNU GPL license with an extensive manual and supporting tutorials, thereby enabling ease of use. Future versions are intended to include advanced parallelization involving splitting of sub jobs, and inbuilt modules for drug enrichment analysis and parsing of intermolecular interactions. POAP will be of significant use to researchers who intend to use open source based high throughput drug discovery methods.
Conflict of interest
The authors declare that there are no conflicts of interest.
Acknowledgements
Technology, Government of India, for providing financial assistance through DBT-JRF Fellowship
[DBT/2015/VRF/363] to Samdani for carrying out this work. The authors also thank DBT Rapid
Grant for Young Investigator (RGYI) scheme [BT/PR6476/ GBD/27/496/2013, 05/09/2013] for the
hardware support.
References
Aggarwal, A., Parai, M.K., Shetty, N., Wallis, D., Woolhiser, L., Hastings, C., Dutta, N.K., Galaviz,
S., Dhakal, R.C., Shrestha, R., Wakabayashi, S., Walpole, C., Matthews, D., Floyd, D., Scullion,
P., Riley, J., Epemolu, O., Norval, S., Snavely, T., Robertson, G.T., Rubin, E.J., Ioerger, T.R.,
Sirgel, F.A., van der Merwe, R., van Helden, P.D., Keller, P., Böttger, E.C., Karakousis, P.C.,
Lenaerts, A.J., Sacchettini, J.C., 2017. Development of a Novel Lead that Targets M. tuberculosis
Amdahl, G.M. Validity of the single processor approach to achieving large scale computing
capabilities, in: the April 18-20, 1967, spring joint computer conference, Atlantic City, New
Jersey, p. 483.
Årdal, C., Røttingen, J.-A., 2012. Open source drug discovery in practice: a case study. PLoS
Chen, Y.-C., 2015. Beware of docking! Trends in pharmacological sciences 36 (2), 78–95.
10.1016/j.tips.2014.12.001.
Defert, O., Boland, S., 2017. Rho kinase inhibitors: a patent review (2014 - 2016). Expert opinion on
therapeutic patents 27 (4), 507–515. 10.1080/13543776.2017.1272579.
Empereur-Mot, C., Zagury, J.-F., Montes, M., 2016. Screening Explorer-An Interactive Tool for the
Analysis of Screening Results. Journal of chemical information and modeling 56 (12), 2281–2286.
10.1021/acs.jcim.6b00283.
Fawcett, T., 2006. An introduction to ROC analysis. Pattern Recognition Letters 27 (8), 861–874.
10.1016/j.patrec.2005.10.010.
Ferreira, L.G., Dos Santos, R.N., Oliva, G., Andricopulo, A.D., 2015. Molecular docking and
10.3390/molecules200713384.
Floresta, G., Pistarà, V., Amata, E., Dichiara, M., Marrazzo, A., Prezzavento, O., Rescifina, A., 2017.
Adipocyte fatty acid binding protein 4 (FABP4) inhibitors. A comprehensive systematic review.
Jacobs, M., Hayakawa, K., Swenson, L., Bellon, S., Fleming, M., Taslimi, P., Doran, J., 2006. The
structure of dimeric ROCK I reveals the mechanism for ligand selectivity. The Journal of
Ji, C., Sharma, I., Pratihar, D., Hudson, L.L., Maura, D., Guney, T., Rahme, L.G., Pesci, E.C.,
Coleman, J.P., Tan, D.S., 2016. Designed Small-Molecule Inhibitors of the Anthranilyl-CoA
biology 11 (11), 3061–3067. 10.1021/acschembio.6b00575.
Kalyani G, 2013. A review on drug designing, methods, its applications and prospects. Int J Pharm
Karmani, R.K., Agha, G., Squillante, M.S., Seiferas, J., Brezina, M., Hu, J., Tuminaro, R., Sanders,
P., Träffe, J.L., Geijn, R.A., Träff, J.L., Sander, M.B., Gustafson, J.L., Dror, R.O., Young, C.,
Shaw, D.E., Lin, C., Lee, J.-K., Chang, R.-G., Kuan, C.-B., Kollias, G., Grama, A.Y., Li, Z.,
Whaley, R.C., Vuduc, R.W., 2011. Amdahl’s Law, in: Padua, D. (Ed.), Encyclopedia of Parallel
Computing. Springer US, Boston, MA, pp. 53–60.
Kuhn, B., Guba, W., Hert, J., Banner, D., Bissantz, C., Ceccarelli, S., Haap, W., Körner, M.,
Kuglstatter, A., Lerner, C., Mattei, P., Neidhart, W., Pinard, E., Rudolph, M.G., Schulz-Gasch, T.,
Woltering, T., Stahl, M., 2016. A Real-World Perspective on Molecular Design. Journal of
medicinal chemistry 59 (9), 4087–4102. 10.1021/acs.jmedchem.5b01875.
Lätti, S., Niinivehmas, S., Pentikäinen, O.T., 2016. Rocker: Open source, easy-to-use tool for AUC
and enrichment calculations and ROC visualization. Journal of cheminformatics 8 (1), 45.
10.1186/s13321-016-0158-y.
Lesic, B., Lépine, F., Déziel, E., Zhang, J., Zhang, Q., Padfield, K., Castonguay, M.-H., Milot, S.,
Stachel, S., Tzika, A.A., Tompkins, R.G., Rahme, L.G., 2007. Inhibitors of pathogen intercellular
10.1371/journal.ppat.0030126.
Li, H., Gao, Z., Kang, L., Zhang, H., Yang, K., Yu, K., Luo, X., Zhu, W., Chen, K., Shen, J., Wang,
X., Jiang, H., 2006. TarFisDock: a web server for identifying drug targets with docking approach.
Lill, M.A., Danielson, M.L., 2011. Computer-aided drug design platform using PyMOL. Journal of
Lin, V.T.G., Pruitt, H.C., Samant, R.S., Shevde, L.A., 2017. Developing Cures: Targeting
Medina-Franco, J.L., Giulianotti, M.A., Welmaker, G.S., Houghten, R.A., 2013. Shifting from the
single to the multitarget paradigm in drug discovery. Drug discovery today 18 (9-10), 495–501.
10.1016/j.drudis.2013.01.008.
Morris, G.M., Huey, R., Lindstrom, W., Sanner, M.F., Belew, R.K., Goodsell, D.S., Olson, A.J., 2009.
AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. Journal
Mysinger, M.M., Carchia, M., Irwin, J.J., Shoichet, B.K., 2012. Directory of useful decoys, enhanced
(DUD-E): better ligands and decoys for better benchmarking. Journal of medicinal chemistry 55
(14), 6582–6594. 10.1021/jm300687e.
Nikiforov, P.O., Blaszczyk, M., Surade, S., Boshoff, H.I., Sajid, A., Delorme, V., Deboosere, N.,
Brodin, P., Baulard, A.R., Barry, C.E., Blundell, T.L., Abell, C., 2017. Fragment-Sized EthR
Inhibitors Exhibit Exceptionally Strong Ethionamide Boosting Effect in Whole-Cell
Mycobacterium tuberculosis Assays. ACS chemical biology 12 (5), 1390–1396.
10.1021/acschembio.7b00091.
Nikiforov, P.O., Surade, S., Blaszczyk, M., Delorme, V., Brodin, P., Baulard, A.R., Blundell, T.L.,
Abell, C., 2016. A fragment merging approach towards the development of small molecule
inhibitors of Mycobacterium tuberculosis EthR for use as ethionamide boosters. Organic &
Tange, O., 2011. GNU Parallel - The Command-Line Power Tool. ;login: The USENIX Magazine,
42–47.
O'Boyle, N.M., Banck, M., James, C.A., Morley, C., Vandermeersch, T., Hutchison, G.R., 2011a.
3-33.
O'Boyle, N.M., Vandermeersch, T., Flynn, C.J., Maguire, A.R., Hutchison, G.R., 2011b. Confab -
10.1186/1758-2946-3-8.
Prakhov, N.D., Chernorudskiy, A.L., Gainullin, M.R., 2010. VSDocker: a tool for parallel high-
Rahman, M.M., Karim, M.R., Ahsan, M.Q., Khalipha, A.B.R., Chowdhury, M.R., Saifuzzaman, M.,
2012. Use of computer in drug design and drug discovery: A review. Int. J. Pharma Life Sci. 1
(2). 10.3329/ijpls.v1i2.12955.
Riento, K., Ridley, A.J., 2003. Rocks: multifunctional kinases in cell behaviour. Nature reviews.
Molecular cell biology 4 (6), 446–456. 10.1038/nrm1128.
Salentin, S., Schreiber, S., Haupt, V.J., Adasme, M.F., Schroeder, M., 2015. PLIP: fully automated
protein-ligand interaction profiler. Nucleic acids research 43 (W1), W443-7. 10.1093/nar/gkv315.
Sandeep, G., Nagasree, K.P., Hanisha, M., Kumar, M.M.K., 2011. AUDocker LE: A GUI for virtual
screening with AUTODOCK Vina. BMC research notes 4, 445. 10.1186/1756-0500-4-445.
Santos-Martins, D., Forli, S., Ramos, M.J., Olson, A.J., 2014. AutoDock4(Zn): an improved
AutoDock force field for small-molecule docking to zinc metalloproteins. Journal of chemical
Śledź, P., Caflisch, A., 2017. Protein structure-based drug design: from docking to molecular
Sliwoski, G., Kothiwale, S., Meiler, J., Lowe, E.W., 2014. Computational methods in drug discovery.
Surade, S., Ty, N., Hengrung, N., Lechartier, B., Cole, S.T., Abell, C., Blundell, T.L., 2014. A
structure-guided fragment-based approach for the discovery of allosteric inhibitors targeting the
lipophilic binding site of transcription factor EthR. The Biochemical journal 458 (2), 387–394.
10.1042/BJ20131127.
Takayama, K., Wang, C., Besra, G.S., 2005. Pathway to synthesis and processing of mycolic acids in
10.1128/CMR.18.1.81-101.2005.
Trott, O., Olson, A.J., 2010. AutoDock Vina: improving the speed and accuracy of docking with a
Umashankar, V. G. S., & Gurunathan, S., 2015. Drug discovery: an appraisal. Int J Pharm
Wang, Z., Sun, H., Yao, X., Li, D., Xu, L., Li, Y., Tian, S., Hou, T., 2016. Comprehensive evaluation
of ten docking programs on a diverse set of protein-ligand complexes: the prediction accuracy of
sampling power and scoring power. Physical chemistry chemical physics : PCCP 18 (18), 12964–
12975. 10.1039/c6cp01555g.
Willand, N., Dirié, B., Carette, X., Bifani, P., Singhal, A., Desroses, M., Leroux, F., Willery, E.,
Mathys, V., Déprez-Poulain, R., Delcroix, G., Frénois, F., Aumercier, M., Locht, C., Villeret, V.,
Déprez, B., Baulard, A.R., 2009. Synthetic EthR inhibitors boost antituberculous activity of
ethionamide. Nature medicine 15 (5), 537–544. 10.1038/nm.1950.
Wishart, D.S., Feunang, Y.D., Guo, A.C., Lo, E.J., Marcu, A., Grant, J.R., Sajed, T., Johnson, D., Li,
C., Sayeeda, Z., Assempour, N., Iynkkaran, I., Liu, Y., Maciejewski, A., Gale, N., Wilson, A.,
Chin, L., Cummings, R., Le, D., Pon, A., Knox, C., Wilson, M., 2017. DrugBank 5.0: a major
update to the DrugBank database for 2018. Nucleic acids research. 10.1093/nar/gkx1037.
Witzgall, F., Ewert, W., Blankenfeldt, W., 2017. Structures of the N-Terminal Domain of PqsA in
Xia, X., 2017. Bioinformatics and Drug Discovery. Current topics in medicinal chemistry 17 (15),
1709–1726. 10.2174/1568026617666161116143440.
Zhang, S., Kumar, K., Jiang, X., Wallqvist, A., Reifman, J., 2008. DOVIS: an implementation for
2105-9-126.
Figure legends:
Fig.1: Detailed flowchart of Ligand preparation module. The dotted red line indicates the
Fig. 3: Speedup ratio comparison for the POAP based runs of AutoDock, AutoDock Vina and Open Babel vs. corresponding serial execution.
Fig.4: Parallel performance comparison for AutoDock and Open Babel speedup ratio.
Fig.5: Speedup ratio comparison between default and POAP mediated run of AutoDock Vina.
Fig.6: ROC curve for the Docking Enrichment analysis: (a) AutoDock (b) AutoDock Vina.
Colour representation: FAK1-blue; ROCK1-red; FABP4-green.