
Nucleic Acids Research, 2024, 1–10

https://doi.org/10.1093/nar/gkae236
Web Server Issue

ADMETlab 3.0: an updated comprehensive online ADMET prediction platform enhanced with broader coverage, improved performance, API functionality and decision support

Downloaded from https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkae236/7640525 by guest on 06 April 2024


Li Fu 1,†, Shaohua Shi 2,†, Jiacai Yi 3,†, Ningning Wang 4, Yuanhang He 1, Zhenxing Wu 5, Jinfu Peng 1, Youchao Deng 1, Wenxuan Wang 1, Chengkun Wu 3, Aiping Lyu 2, Xiangxiang Zeng 6, Wentao Zhao 3, Tingjun Hou 5,* and Dongsheng Cao 1,*

1 Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P.R. China
2 School of Chinese Medicine, Hong Kong Baptist University, Kowloon, Hong Kong SAR, 999077, P.R. China
3 School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, P.R. China
4 Xiangya Hospital of Central South University, Changsha, Hunan 410008, P.R. China
5 College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, P.R. China
6 Department of Computer Science, Hunan University, Changsha, Hunan 410082, P.R. China

* To whom correspondence should be addressed. Tel: +86 731 89824761; Email: oriental-cds@163.com
Correspondence may also be addressed to Tingjun Hou. Tel: +86 571 88208412; Email: tingjunhou@zju.edu.cn

† The first three authors should be regarded as Joint First Authors.

Abstract
ADMETlab 3.0 is the second updated version of the web server that provides a comprehensive and efficient platform for evaluating ADMET-
related parameters as well as physicochemical properties and medicinal chemistry characteristics involved in the drug discovery process. This
new release addresses the limitations of the previous version and offers broader coverage, improved performance, API functionality, and decision
support. For supporting data and endpoints, this version includes 119 features, an increase of 31 compared to the previous version. The updated
number of entries is 1.5 times larger than the previous version with over 400 000 entries. ADMETlab 3.0 incorporates a multi-task DMPNN
architecture coupled with molecular descriptors, a method that not only guaranteed calculation speed for each endpoint simultaneously, but also
achieved a superior performance in terms of accuracy and robustness. In addition, an API has been introduced to meet the growing demand for
programmatic access to large amounts of data in ADMETlab 3.0. Moreover, this version includes uncertainty estimates in the prediction results,
aiding in the confident selection of candidate compounds for further studies and experiments. ADMETlab 3.0 is publicly accessible without the
need for registration at: https://admetlab3.scbdd.com.

Graphical abstract

Received: January 29, 2024. Revised: March 10, 2024. Editorial Decision: March 18, 2024. Accepted: March 21, 2024
© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License
(http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the
original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Introduction

Drug development is a long journey full of high risk, high investments and extremely high attrition rates. The preclinical stage lays the crucial groundwork and, as a demanding phase in drug development, confronts a considerable attrition rate of approximately 93% (1). Even if certain drug candidates progress to clinical studies, over 75% of them are likely to fail in phase I, II and III clinical trials and during the subsequent drug approval process (2,3). In this process, undesirable pharmacokinetic properties, including absorption, distribution, metabolism, and excretion (ADME) properties, play a crucial role, leading to the failure of approximately 40% of candidate molecules. In addition, toxicity, another important evaluation indicator in the drug development process, accounts for up to 30% of drug development failures (2,4). This underscores the substantial impact of ADME and toxicity (ADMET) properties on the overall success or failure of drug development efforts, and the importance of early assessment of ADMET properties for drug development.

To evaluate the ADMET properties of new molecular entities (NMEs) as early as possible, various in vitro and in vivo methods including medium- and high-throughput screening have been developed, which also facilitate the rapid accumulation of experimental data. However, as the number of NMEs continues to increase, these experimental approaches have shown several inherent shortcomings: they are time-consuming and costly and raise animal welfare issues, which has greatly limited their application and spurred the emergence of in silico methods for predicting ADMET properties. In recent decades, with the rapid development of computer science and the accumulation of ADMET experimental data, in silico predictive models and derived web tools aimed at facilitating the efficient evaluation of ADMET properties have been greatly developed. Representative tools include admetSAR (5), SwissADME (6) and ProTox-II (7), as well as several web-based platforms developed after the release of ADMETlab 2.0 (8), such as ADMETboost (9) and Interpretable-ADMET (10). Despite their evident utility, current platforms still have certain limitations, including narrower coverage of endpoints and lower calculation efficiency. Understanding and addressing these limitations is crucial for improving the effectiveness and reliability of computer tools in drug development.

Since the first release of ADMETlab in 2018 and the updated version in 2021, we have been dedicated to enhancing the performance and user experience of the webserver. As one of the most popular ADMET prediction platforms, it has not only garnered much appreciative feedback from ordinary users but has also earned recognition in the field of AI-driven drug discovery thanks to its superior performance and wider coverage of ADMET predictions (11–13). To date, the article describing ADMETlab 2.0 has been cited 955 times according to Google Scholar, and the web server has been visited more than 1.7 million times. Inspired by increasingly frequent access and user expectations for enhancements, including the development of Application Programming Interface (API) capabilities to facilitate batch evaluation, we have upgraded ADMETlab to version 3.0 to address existing issues and further optimize its user experience. ADMETlab 3.0 currently covers a more comprehensive set of ADMET endpoints, including over 400 000 high-quality entries and 119 endpoints, 31 more endpoints than its predecessor. A multi-task directed message passing neural network (DMPNN) framework combined with molecular descriptors was applied to construct predictive models for various endpoints, which significantly improved the performance and robustness of these models (14,15). Moreover, in response to the escalating demand for batch evaluation and analysis, we implemented API capabilities in ADMETlab 3.0, specifically designed for those dealing with substantial volumes of data. Additionally, an uncertainty estimation module using evidential deep learning techniques is deployed in ADMETlab 3.0, which can evaluate the reliability of prediction results by quantifying the uncertainty in the outcomes of ADMET predictions, thereby facilitating informed decision-making regarding candidate prioritization during virtual screening. In summary, we expect the updated ADMETlab 3.0 to provide drug developers and chemists with a more comprehensive, reliable, and accurate service based on its broader coverage, improved performance, API functionality, and decision support. ADMETlab 3.0 is freely available to all users without login at: https://admetlab3.scbdd.com.

Methods and webserver description

Data collection

To provide a broader and deeper insight into the ADMET profiles, molecular physicochemical properties, and medicinal chemistry rules for query compounds, our team performed an extensive re-collection and reorganization based on the existing ADMET dataset. We incorporated a wide spectrum of open-access bioactivity databases including ChEMBL (16), PubChem (17) and OCHEM (18), as well as rigorously peer-reviewed literature (19–24). To ensure the quality, consistency and reliability of the data, and to build more accurate and generalizable models based on it, a series of preprocessing steps was performed: organometallic compounds, isomeric mixtures and chemical mixtures were removed, salts were neutralized, counterions were eliminated, and canonical SMILES strings were adopted as the model input format. After this preprocessing, we obtained over 400 000 molecules covering 77 ADMET-related endpoints for model building. Details of the 77 datasets and their data splits can be found in Supplementary Table S1.

Directed message passing neural network (DMPNN)

ADMETlab 2.0 employed the multi-task graph attention (MGA) model, a multi-task framework based on relational graph convolutional networks (RGCN). Although version 2.0 showed significant improvement over its predecessor, we continually seek further enhancements to achieve superior performance without compromising computational speed. In ADMETlab 3.0, we leveraged the directed message passing neural network (DMPNN), a subclass of graph convolutional neural networks (GCNN), by utilizing Chemprop (25,26), a DMPNN-based package developed by a team from the Massachusetts Institute of Technology. DMPNN considers directed edges and learns molecular encodings through bond-centered convolutions instead of atom-centered convolutions, thus avoiding unnecessary loops during the message passing phase. ADMETlab 2.0's superior prediction ability over its predecessor and other webservers

was primarily due to its multi-task graph neural network structure. To inherit the multi-tasking benefits of the previous version, simultaneously improving model performance and computational speed, we implemented Chemprop’s multi-task training approach in ADMETlab 3.0.

Recent studies have demonstrated that incorporating external features, rather than relying solely on SMILES, can further enhance the performance of DMPNN (14,27). The advantage of this combination likely stems from the complementary relationship between the global information of molecules from descriptors and the local information extracted by DMPNN. RDKit 2D descriptors offer a comprehensive overview of molecular characteristics, including molecular size and shape, topological information, and functional groups. Our approach adopted the combination of DMPNN with RDKit 2D descriptors (hereinafter referred to as DMPNN-Des), an optional combination provided in Chemprop, to significantly enhance predictive performance. Presently, DMPNN methods have been successfully applied in various fields of drug discovery, including antibiotic discovery (28–30), reaction property prediction (31), infrared spectra prediction (32), solute parameter prediction (14), and molecular optical peak prediction (15).

The multi-task DMPNN-Des framework in this work consists of four key modules. First, a feature-extraction function transforms an input molecule into both a molecular graph and an RDKit 2D descriptor vector. Second, a DMPNN structure learns atom and bond embeddings from the molecular graph features. Third, an aggregation function concatenates the graph readout feature and the RDKit 2D descriptors, creating a more comprehensive molecular embedding. Lastly, a standard feed-forward neural network transforms these molecular embeddings into target property values. An overview of the ADMET profiles and the framework of the DMPNN-Des model is given in Figure 1.

Model validation methods

In this latest update, a comprehensive set of 77 prediction models, including 59 classification models and 18 regression models, has been implemented. Additionally, the platform includes 34 other directly computable endpoints and 8 drug-related rules, which cover the computation of physicochemical properties utilizing the RDKit package, the generation of medicinal chemistry and toxicophore rules using Scopy (33), and the incorporation of six frequent hitter detection methods developed by our team (34–37). For each ADMET endpoint, the dataset was randomly split into training, validation, and test sets at a ratio of 8:1:1. The larger part was used for training, and the validation and test sets were used to optimize hyperparameters and test the predictive capacity of each model, respectively. We built and compared models based on DMPNN alone and on DMPNN-Des for all 77 endpoints with a view to selecting the best prediction model to deploy on the website. The Adam optimizer (38) was applied for training the models, and Bayesian optimization was adopted for hyperparameter optimization. The optimal hyperparameters of the DMPNN and DMPNN-Des models are listed in Supplementary Tables S2 and S3, respectively. To evaluate the performance of the models, R-squared (R²), root mean squared error (RMSE), and mean absolute error (MAE) were applied for regression tasks, while the area under the ROC curve (AUC), accuracy (ACC), and Matthews correlation coefficient (MCC) (39) were applied for classification tasks. To obtain robust and accurate prediction models, each training process was repeated five times with randomly partitioned data, and the best-performing models were incorporated into the online platform.

Webserver implementation

Similar to its predecessor, the ADMETlab 3.0 application has been built using Django as its backbone. The front end, using frameworks such as Bootstrap and jQuery, is responsible for obtaining user queries, communicating with the back end, and selecting visual components in a user-friendly interface. Unlike ADMETlab 1.0 and 2.0, ADMETlab 3.0 introduces an API that can be easily integrated into various research pipelines. This API, implemented using Django Ninja, is responsible for crucial tasks such as storing computational data, overseeing the execution and management of predictive analyses for ADMET-related compounds, and enabling interoperability with external applications through a public API. Furthermore, a noteworthy addition to the latest version is the incorporation of a caching mechanism. This module temporarily stores user calculation results on the server through database storage, which aims to improve the overall speed and efficiency of the web service. Chemprop was utilized for the implementation of prediction models. The web server has been successfully tested on the latest versions of popular browsers, including Mozilla Firefox, Google Chrome, Microsoft Edge and Apple Safari.

New features and updates

Comprehensive coverage of ADMET endpoints

To obtain as much ADMET-related data as possible and further improve the practicability of the ADMETlab webserver, we collected new data from two aspects: new data for new endpoints and new data for existing endpoints. We have extended the number of predictable endpoints from 88 in ADMETlab 2.0 to 119 in ADMETlab 3.0, that is, we have incorporated 31 new endpoints. In addition, we also collected new molecules for four previously existing endpoints. As a result, the number of updated entries is 1.5 times larger than in the previous version, at over 400 000. A comparison between the training datasets available in ADMETlab versions 2.0 and 3.0 is shown in Figure 2.

The 119 ADMET endpoints in ADMETlab 3.0 are composed of 21 physicochemical, 20 medicinal chemistry, 9 absorption, 9 distribution, 14 metabolism, 2 excretion and 36 toxicity properties and 8 toxicophore rules. Detailed information on the 119 endpoints, including their data description, results interpretation, and empirical decision, can be found in the ‘Help’ section of the website. Compared with ADMETlab 2.0, the 31 newly added endpoints are as follows: 4 physicochemical, 7 medicinal chemistry, 2 absorption, 5 distribution, 4 metabolism and 9 toxicity endpoints. Among these, the 7 medicinal chemistry endpoints are directly computable, requiring no additional data support. The data for the remaining 24 endpoints are used to expand the ADMET database and construct more comprehensive and precise models.

The update includes new physicochemical properties such as pKa(acid), pKa(base), melting point and boiling point.

Figure 1. Overview scheme of ADMETlab 3.0 data and DMPNN-Des model framework.
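
The four modules of the DMPNN-Des framework summarized in Figure 1 can be illustrated with a toy example. The following is a minimal sketch under loud assumptions, not the Chemprop implementation: atom, bond and descriptor features are hand-written lists instead of RDKit outputs (so module 1 is skipped), the learned weight matrices are replaced by a single fixed scalar `w`, and all function names are hypothetical. What it does preserve is the defining step of directed message passing: the hidden state of a directed bond v->u is updated from the bonds entering v, excluding the reverse bond u->v.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Bond = Tuple[int, int, Vector]  # (atom index, atom index, bond features)


def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def graph_readout(atom_feats: List[Vector], bonds: List[Bond],
                  steps: int = 3, w: float = 0.5) -> Vector:
    """Module 2 (toy version): directed message passing, then a sum readout."""
    if not bonds:
        raise ValueError("this sketch needs at least one bond")
    # One initial hidden state per directed bond v->u:
    # source-atom features concatenated with the bond features.
    h0: Dict[Tuple[int, int], Vector] = {}
    for i, j, e in bonds:
        h0[(i, j)] = relu(atom_feats[i] + e)
        h0[(j, i)] = relu(atom_feats[j] + e)
    dim = len(next(iter(h0.values())))
    h = dict(h0)
    for _ in range(steps):
        new_h = {}
        for (v, u), start in h0.items():
            msg = [0.0] * dim
            for (k, t), state in h.items():
                # Bonds k->v entering v, skipping the reverse bond u->v.
                if t == v and k != u:
                    msg = add(msg, state)
            # Assumption: a fixed scalar w stands in for a learned matrix.
            new_h[(v, u)] = relu(add(start, [w * m for m in msg]))
        h = new_h
    # Summing all directed-bond states equals summing, over atoms,
    # the states of their incoming bonds.
    readout = [0.0] * dim
    for state in h.values():
        readout = add(readout, state)
    return readout


def predict(atom_feats: List[Vector], bonds: List[Bond], descriptors: Vector,
            ffn_weights: Vector, ffn_bias: float = 0.0) -> float:
    # Module 3: concatenate graph readout and descriptor vector.
    embedding = graph_readout(atom_feats, bonds) + descriptors
    # Module 4: a single linear layer stands in for the feed-forward network.
    return sum(wi * xi for wi, xi in zip(ffn_weights, embedding)) + ffn_bias
```

For a three-atom chain with unit features, only the two bonds leaving the middle atom receive a message, because every other candidate message would travel straight back along the reverse bond; this is the loop avoidance the text attributes to bond-centered convolutions.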

For absorption, two endpoints, human oral bioavailability (50%) and the Parallel Artificial Membrane Permeability Assay (PAMPA), were integrated. Distribution data cover transmembrane transporter inhibitors. Metabolism updates feature CYP inhibition and human liver microsomal stability data. The toxicity section now includes five types of toxicity (nephrotoxicity, neurotoxicity, ototoxicity, hematotoxicity, genotoxicity) and in vitro toxicity assessments (hERG blockers, RPMI-8226 immunotoxicity, A549 and Hek293 cytotoxicity). Notably, the hematotoxicity data stem from recent research on chemical structure and hematotoxicity relationships.

Increased diversity and quantity of data contribute to a more general understanding of the relationships between molecular features and the corresponding properties to be learned, leading to more robust and accurate predictions. Therefore, in addition to the above new endpoints and their datasets, we also expanded the volume of data for four important datasets for drug absorption, distribution, and excretion, namely Caco-2, logD7.4, VDss and half-life, with increases of 20%, 20%, 125% and 20%, respectively. We made a thorough analysis of data distribution and diversity for each endpoint. Users can access this information in the ‘Diversity


Figure 2. (A) The dataset sizes for absorption, distribution, metabolism, and physicochemical properties. (B) The dataset sizes for toxicity properties.
The wathet blue (lighter) bars represent the volume of renewed data, while the Aegean blue (darker) bars indicate the existing data volume.

& Distribution’ item under the ‘Help’ section of the website to gain a deeper understanding of the data composition for the model.

Overall, the high-quality updates to the data have made ADMETlab 3.0 the most comprehensive ADMET online prediction platform known to date. These processed datasets lay a solid foundation for subsequent model construction, accurate prediction, and in-depth in vitro or in vivo study of the ADMET properties of new compounds.

Robust and accurate multi-task DMPNN models

There are a total of 77 predictive models based on DMPNN-Des or DMPNN, of which 18 are regression models and 59 are classification models. The performance scores of the classification and regression models for both DMPNN and DMPNN-Des are summarized in Supplementary Tables S4 and S5, respectively. For the regression models, the R² values of the 13 existing endpoints mainly ranged from 0.75 to 0.95. Even for the endpoint with the lowest performance, LC50FM, the R² value still reaches a satisfactory 0.68. Regarding the five newly added regression endpoints, the R² values for four of them fell between 0.8 and 0.9. The remaining one, T1/2 (a classification endpoint in ADMETlab 2.0), although hindered by its complex pharmacological mechanism and a small dataset, still achieved an R² of nearly 0.7. For the classification models, the AUC values of the 40 existing endpoints ranged from 0.72 to 0.99, and the AUC values of the 20 newly added classification endpoints were between 0.73 and 0.96, which indicates that they are also practical and reliable to a certain extent. In conclusion, the overall performance of the ADMET predictive models based on DMPNN was excellent for both regression and classification endpoints, and we believe that the online predictive webserver developed based on these models can provide extensive, accurate and detailed ADMET information for drug developers and medicinal chemists.

To further assess the performance and efficacy of our models, we conducted a comprehensive comparison of the two types of models embedded in ADMETlab 3.0, namely DMPNN-Des and DMPNN, against MGA models. All these


Figure 3. Comparison among DMPNN-Des, DMPNN and MGA in regression and classification tasks. (A) R2 of regression tasks. (B) AUC of
pharmacokinetics parameters (ADME) in classification tasks. (C) AUC of toxicity parameters in classification tasks.
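
For readers who want to reproduce the kind of scores compared in Figure 3, the metrics named in the model validation section (R², RMSE and MAE for regression; AUC, ACC and MCC for classification) can be written out from their textbook definitions. This is an illustrative sketch only: the function names are hypothetical, the AUC uses the simple pairwise (rank) formulation, and nothing here is taken from the authors' evaluation code.

```python
import math
from typing import List, Sequence


def r2(y: Sequence[float], pred: Sequence[float]) -> float:
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def rmse(y: Sequence[float], pred: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y))


def mae(y: Sequence[float], pred: Sequence[float]) -> float:
    return sum(abs(a - b) for a, b in zip(y, pred)) / len(y)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion(labels: Sequence[int], predicted: Sequence[int]) -> List[int]:
    tp = sum(1 for l, p in zip(labels, predicted) if l == 1 and p == 1)
    tn = sum(1 for l, p in zip(labels, predicted) if l == 0 and p == 0)
    fp = sum(1 for l, p in zip(labels, predicted) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, predicted) if l == 1 and p == 0)
    return [tp, tn, fp, fn]


def accuracy(labels: Sequence[int], predicted: Sequence[int]) -> float:
    tp, tn, fp, fn = confusion(labels, predicted)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(labels: Sequence[int], predicted: Sequence[int]) -> float:
    tp, tn, fp, fn = confusion(labels, predicted)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0 when a row or column of the confusion matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

AUC is computed from continuous scores, whereas ACC and MCC need hard labels, so a cut-off (0.5 is a common but arbitrary choice) has to be applied to the scores first.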

models were trained and tested on the same datasets and with the same split method used in this study. The outcomes of this comparative analysis are depicted in Figure 3. As illustrated in Figure 3A, for most regression tasks, both the DMPNN-Des and DMPNN models exhibited significantly superior R² performance compared to MGA.

However, it is noteworthy that for certain specific tasks, such as melting point, LC50FM and half-life, MGA demonstrated slightly superior performance compared to the naïve DMPNN. Nevertheless, its performance remained lower than that achieved by DMPNN-Des. In classification tasks, both DMPNN-Des and DMPNN consistently showed significantly higher AUC scores when compared with MGA. Notably, these methods outperformed MGA in 47 out of 59 tasks, underscoring their superior performance in the classification domain. In terms of the AUC of ADME endpoints (Figure 3B), MGA demonstrated superior AUC scores for only four of the 27 tasks, including two cytochrome enzyme endpoints and two absorption endpoints. For the toxicity tasks (Figure 3C), MGA only achieved higher AUC scores for seven out of 32 endpoints, including AMES, respiratory toxicity, nephrotoxicity, hERG 10 μM, and three others. The above results suggest that DMPNN-Des or DMPNN generally exhibited a more dominant and superior classification performance over


Figure 4. A Python code example for making predictions by calling an API.
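
The script shown in Figure 4 did not survive extraction, so the following is only a hedged sketch of what a batch client might look like, not the authors' code. The route `API_PATH` and the `"SMILES"` payload field are placeholders assumed for illustration; the real route and request schema should be taken from the ‘API Tutorial’ section of the website. The one detail drawn from the text is the recommended ceiling of five requests per second, which the sketch respects by pausing between batches.

```python
import json
import time
from typing import Callable, Iterator, List
from urllib import request

BASE_URL = "https://admetlab3.scbdd.com"
API_PATH = "/api/admet"  # HYPOTHETICAL placeholder route, see the API Tutorial


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response (standard library only)."""
    body = json.dumps(payload).encode("utf-8")
    req = request.Request(url, data=body, method="POST",
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


def predict_batches(smiles: List[str],
                    batch_size: int = 100,
                    max_per_second: float = 5.0,
                    send: Callable[[str, dict], dict] = post_json,
                    sleep: Callable[[float], None] = time.sleep,
                    url: str = BASE_URL + API_PATH) -> List[dict]:
    """Send SMILES in batches, never faster than `max_per_second` requests."""
    min_interval = 1.0 / max_per_second
    responses = []
    for n, batch in enumerate(chunked(smiles, batch_size)):
        if n:
            sleep(min_interval)  # stay under the recommended request rate
        # Assumption: the payload key "SMILES" is illustrative, not confirmed.
        responses.append(send(url, {"SMILES": batch}))
    return responses
```

Passing `send` and `sleep` as parameters keeps the network call and the delay replaceable, so the batching logic can be checked offline before pointing the client at the live server.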

the MGA method. The overall performance of DMPNN-Des timates for each prediction result are exclusively available
was slightly better than DMPNN; however, the addition of through the API. To maintain server stability, we recom-
descriptors can slightly reduce processing speed (see in the mend no more than five requests per second. For optimal
following runtime analysis). To provide users with flexibil- user experience, web interface queries take precedence over
ity, both DMPNN-Des, and DMPNN are available as options API requests. Notably, users can choose between DMPNN
in the webserver API, allowing users to select their preferred or DMPNN-Des as the prediction model. The API’s flexi-
model for compound evaluation. bility also encourages developers to leverage its functional-
ity for broader applications or integrations, including the de-
velopment of repositories, graphic user interfaces, and web
API integration and architecture upgrades applications for ADMET evaluation. By making full use of
The Application Programming Interface (API) in ADMETlab API functionality, the efficiency of high throughout ADMET
3.0 provides the option for researchers to use the command evaluation by medicinal chemists is expected to be greatly
line for efficient access. This is particularly beneficial for tasks improved.
involving large volume of data. The API achieves accessibil-
ity by employing well-established protocols that are compat-
ible with widely used programming languages, thus simplify- Uncertainty evaluation for predicted results
ing interactions with the core functionalities of the web server. In predictive modelling, offering uncertainty estimates for pre-
Users can retrieve comprehensive calculation results conve- dictive results is a crucial metric to evaluate the accuracy and
niently via a simple script (Figure 4). A tutorial introducing reliability of predictions. This metric reflects the model’s confi-
the utility and providing detailed code examples can be found dence, with lower uncertainty indicating increased confidence
in the ‘API Tutorial’ section on the website. and higher uncertainty signalling an unreliable prediction for
ADMETlab 3.0 s API provides two key functions: (i) a given molecule.
Molecule Wash. This includes practical functions crucial For the regression model, an evidence-based deep learning
for generating a reliable outcome, including standardizing approach is utilized to predict target properties and deduce
molecules, addressing fragments, assigning charges, handling the parameters of the underlying evidential distribution (40).
tautomerism, isotope correction, and managing stereochem- The method integrates an evidential layer after feature extrac-
istry, which are crucial for generating a reliable outcome. tion in a strategic position to capture nuanced support for
(ii) Off-website Batch Prediction. the API returns all the 119 each prediction. Implemented in Chemprop, we adopted the
ADMET-related properties for a request molecule. Users can ‘evidential_total’ estimation method for the regression model,
follow the tutorial or use the example scripts to obtain the full which goes beyond conventional uncertainty assessments by
prediction results in CSV format. calculating both evidential epistemic and aleatoric uncer-
Additionally, the API can return the ‘structure’ as an SVG tainty. For the classification model, Monte Carlo dropout
string representing the molecular structure and a unique iden- is employed to assess uncertainty across different properties
tifier ‘taskid’ for result retrieval. Note that uncertainty es- (41). Monte Carlo dropout assesses uncertainty across differ-
8 Nucleic Acids Research, 2024

Table 1. Comparison of the main features of ADMETlab (2.0 and 3.0 version) with other web-based platforms
ADMETlab ADMETlab admetSAR Interpretable-
Features 3.0 2.0 SwissADME 2.0 FAF-Drugs4 pkCSM vNN-ADME ADMETboost ADMET

Physicochemical 21 17 12 5 20 6 0 7 10
property
Medicinal 20 13 10 0 16 0 0 0 0
chemistry
ADME 34 23 9 35 0 20 9 18 20
Toxicity 36 27 0 12 0 10 6 4 29
Toxicophoric rule 8 8 0 0 4 0 0 0 0
PAINS included Yes Yes Yes No Yes No No No Yes
Batch +++ ++ + + ++ ++ ++ + +
evaluation/API

Downloaded from https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkae236/7640525 by guest on 06 April 2024


support
Explanation +++ ++ ++ + ++ ++ + + +++
Uncertainty Yes No No No No No No No No
estimation
Availability Free Free Free Free Free Free Registration Free Restricted visit
required
Computation time 87 84 1560 267 967 1845 2400 5 (per Restricted
(1000 molecules) molecule) evaluation
*Medicinal chemistry contains drug-likeness rules, chemical friendly measures, and substructural rules of frequent hitters; ADME contains absorption, distribution,
metabolism, and excretion related endpoints; Toxicity contains human toxicity, animal toxicity, environmental toxicity, and toxic pathways. A higher number of ‘+’
symbols indicates better support in the respective item. Runtime assessment for each platform was conducted ten times, and the average runtime value in seconds was
demonstrated.
URL links:
ADMETlab 2.0: https://admetmesh.scbdd.com/
SwissADME: http://www.swissadme.ch/
admetSAR 2.0: http://lmmd.ecust.edu.cn/admetsar2/
FAF-Drugs4: https://fafdrugs4.rpbs.univ- paris- diderot.fr/
pkCSM: http://biosig.unimelb.edu.au/pkcsm/prediction
vNN-ADMET: https://vnnadmet.bhsai.org/vnnadmet/login.xhtml
FAF-Drugs4: https://fafdrugs4.rpbs.univ- paris- diderot.fr/
ADMETboost: https://ai-druglab.smu.edu/admet
Interpretable-ADMET: http://cadd.pharmacy.nankai.edu.cn/interpretableadmet/

ent properties by simulating an ensemble of sub-models within that has proven powerful on moderate or smaller training
a single neural network. It generates a distribution of outcomes, offering a prediction mean as the predicted value and a variance as an uncertainty score for each property.

We presented the model’s RMSE range within distinct uncertainty intervals for regression tasks and provided uncertainty thresholds with their corresponding maximum Youden’s index for classification tasks. In classification, an uncertainty score exceeding the threshold at the maximum Youden’s index designates the model’s prediction as ‘low confidence’, while an uncertainty score below this threshold indicates ‘high confidence’ in the model’s prediction. Users can retrieve prediction results with these confidence labels through the API, and Supplementary Tables S6 and S7 provide additional details on the confidence values.

Comparison with other web-based tools
We compared endpoint information and processing efficiency among ADMETlab 3.0, ADMETlab 2.0, and several popular ADMET prediction platforms, including SwissADME, admetSAR2.0, FAF-Drugs4, pkCSM, vNN-ADMET, ADMET-boost and Interpret-ADMET. Details are summarized in Table 1. As indicated by the results, ADMETlab 3.0 and 2.0 exhibit superior data support and evaluation performance compared with SwissADME, admetSAR2.0, FAF-Drugs4, pkCSM and vNN-ADMET. In comparison with the two latest platforms, ADMET-boost and Interpret-ADMET, ADMETlab 3.0 also showed overall better coverage and utility. ADMET-boost covered 22 ADMET-related parameters plus 7 physicochemical properties. Its optimal model turned out to be extreme gradient boosting, a method

datasets (42). Interpret-ADMET, on the other hand, can predict 59 ADMET and physicochemical properties. Specifically, Interpretable-ADMET not only made predictions but also offered an interpretation module and an optimization module to identify key substructures for specific properties and optimize the structure afterwards. However, due to its limitations on user access and the lack of an interface to perform batch calculations, its practicality is largely limited. ADMETlab 3.0, on the other hand, demonstrates interpretability by providing uncertainty estimation scores for prediction results, using coloured dots to represent the empirical decision state of each output, and highlighting alert substructures. Additionally, we have significantly improved the user experience by introducing elaborate documentation, better guidance, and enhanced web design compared with version 2.0. Detailed information on endpoint explanation, data distribution and other aspects is summarized in the Supplementary Document and the ‘help’ section of the website. In runtime analysis, ADMETlab 3.0 exhibited a slightly longer runtime than ADMETlab 2.0 in the web portal. This is considered a commendable performance, given that ADMETlab 3.0 has 31 more endpoints to calculate. Detailed runtime assessment results of ADMETlab 3.0 with different modelling options can be found in Supplementary Table S8 and Supplementary Figure S1. For users who prioritize uncertainty reference, the combination of DMPNN-Des and uncertainty demonstrated optimal performance on runtime. However, for those primarily concerned with runtime efficiency, DMPNN-Des can be the preferred choice, as it exhibited superior computational efficiency across varying dataset sizes while maintaining better performance than DMPNN alone.

Conclusions and future plans
ADMETlab 3.0 marks a significant advance in the field of in silico tools for drug development. Building upon the successes of its predecessor, ADMETlab 3.0 not only rectifies limitations related to narrower coverage, uncertainty estimation deficiencies, and integration capabilities but also introduces innovative features to align with the dynamic requirements of the field. By incorporating the multi-task DMPNN modeling method to aggregate local molecular information, and further enhancing it by combining RDKit 2D descriptors with global molecular information, the updated model structure ensures the precision, efficiency, and reliability of ADMET predictions. The augmentation of the dataset, encompassing over 400 000 entries and 31 additional endpoints, firmly establishes ADMETlab 3.0 as a comprehensive and potent platform. The integration of API capabilities streamlines batch evaluation, which is especially beneficial for users handling substantial data volumes. Furthermore, the implementation of an uncertainty estimation module addresses challenges associated with interpretation, out-of-domain regimes, and robustness guarantees. This enhancement provides a valuable tool for informed decision-making in candidate prioritization during virtual screening. With these advancements, ADMETlab 3.0 is poised to play a vital role as an indispensable resource for researchers and practitioners in accelerating drug research and development.

Data availability
ADMETlab 3.0 is publicly accessible without registration at https://admetlab3.scbdd.com or via API. Results are promptly displayed on the website and available for download in optional formats.

Supplementary data
Supplementary Data are available at NAR Online.

Acknowledgements
We acknowledge Haikun Xu and the High-Performance Computing Center of Central South University for support. The study was approved by the university’s review board.

Funding
National Key Research and Development Program of China [2021YFF1201400]; National Natural Science Foundation of China [22173118, 22220102001]; Hunan Provincial Science Fund for Distinguished Young Scholars [2021JJ10068]; Science and Technology Innovation Program of Hunan Province [2021RC4011]; Natural Science Foundation of Hunan Province [2022JJ80104]; 2020 Guangdong Provincial Science and Technology Innovation Strategy Special Fund [2020B1212030006, Guangdong-Hong Kong-Macau Joint Lab]. Funding for open access charge: HKBU Strategic Development Fund project [SDF19-0402-P02].

Conflict of interest statement
None declared.

References
1. Pammolli,F., Magazzini,L. and Riccaboni,M. (2011) The productivity crisis in pharmaceutical R&D. Nat. Rev. Drug Discov., 10, 428–438.
2. Dowden,H. and Munro,J. (2019) Trends in clinical success rates and therapeutic focus. Nat. Rev. Drug Discov., 18, 495–496.
3. Takebe,T., Imai,R. and Ono,S. (2018) The current status of drug discovery and development as originated in United States academia: the influence of industrial and academic collaboration on drug discovery and development. Clin. Transl. Sci., 11, 597–606.
4. Harrison,R.K. (2016) Phase II and phase III failures: 2013-2015. Nat. Rev. Drug Discov., 15, 817–818.
5. Yang,H., Lou,C., Sun,L., Li,J., Cai,Y., Wang,Z., Li,W., Liu,G. and Tang,Y. (2019) admetSAR 2.0: web-service for prediction and optimization of chemical ADMET properties. Bioinformatics, 35, 1067–1069.
6. Daina,A., Michielin,O. and Zoete,V. (2017) SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci. Rep., 7, 42717.
7. Banerjee,P., Eckert,A.O., Schrey,A.K. and Preissner,R. (2018) ProTox-II: a webserver for the prediction of toxicity of chemicals. Nucleic Acids Res., 46, W257–W263.
8. Xiong,G., Wu,Z., Yi,J., Fu,L., Yang,Z., Hsieh,C., Yin,M., Zeng,X., Wu,C., Lu,A., et al. (2021) ADMETlab 2.0: an integrated online platform for accurate and comprehensive predictions of ADMET properties. Nucleic Acids Res., 49, W5–W14.
9. Tian,H., Ketkar,R. and Tao,P. (2022) ADMETboost: a web server for accurate ADMET prediction. J. Mol. Model., 28, 408.
10. Wei,Y., Li,S., Li,Z., Wan,Z. and Lin,J. (2022) Interpretable-ADMET: a web service for ADMET prediction and optimization based on deep neural representation. Bioinformatics, 38, 2863–2871.
11. Dulsat,J., Lopez-Nieto,B., Estrada-Tejedor,R. and Borrell,J.I. (2023) Evaluation of free online ADMET tools for academic or small biotech environments. Molecules, 28, 776–792.
12. Hasselgren,C. and Oprea,T.I. (2023) Artificial intelligence for drug discovery: are we there yet? Annu. Rev. Pharmacol. Toxicol., 64, 527–550.
13. Tran,T.T.V., Surya Wibowo,A., Tayara,H. and Chong,K.T. (2023) Artificial intelligence in drug toxicity prediction: recent advances, challenges, and future perspectives. J. Chem. Inf. Model., 63, 2628–2643.
14. Chung,Y., Vermeire,F.H., Wu,H., Walker,P.J., Abraham,M.H. and Green,W.H. (2022) Group contribution and machine learning approaches to predict Abraham Solute parameters, solvation free energy, and Solvation enthalpy. J. Chem. Inf. Model., 62, 433–446.
15. Greenman,K.P., Green,W.H. and Gomez-Bombarelli,R. (2022) Multi-fidelity prediction of molecular optical peaks with deep learning. Chem. Sci., 13, 1152–1162.
16. Mendez,D., Gaulton,A., Bento,A.P., Chambers,J., De Veij,M., Felix,E., Magarinos,M.P., Mosquera,J.F., Mutowo,P., Nowotka,M., et al. (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res., 47, D930–D940.
17. Kim,S., Chen,J., Cheng,T., Gindulyte,A., He,J., He,S., Li,Q., Shoemaker,B.A., Thiessen,P.A., Yu,B., et al. (2023) PubChem 2023 update. Nucleic Acids Res., 51, D1373–D1380.
18. Sushko,I., Novotarskyi,S., Korner,R., Pandey,A.K., Rupp,M., Teetz,W., Brandmaier,S., Abdelaziz,A., Prokopenko,V.V., Tanchuk,V.Y., et al. (2011) Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information. J. Comput. Aided Mol. Des., 25, 533–554.
19. Arnot,J.A., Brown,T.N. and Wania,F. (2014) Estimating screening-level organic chemical half-lives in humans. Environ. Sci. Technol., 48, 723–730.
20. Wang,Y., Liu,H., Fan,Y., Chen,X., Yang,Y., Zhu,L., Zhao,J., Chen,Y. and Zhang,Y. (2019) In Silico prediction of Human intravenous pharmacokinetic parameters with improved accuracy. J. Chem. Inf. Model., 59, 3968–3980.
21. Duan,Y.J., Fu,L., Zhang,X.C., Long,T.Z., He,Y.H., Liu,Z.Q., Lu,A.P., Deng,Y.F., Hsieh,C.Y., Hou,T.J., et al. (2023) Improved GNNs for log D(7.4) prediction by transferring knowledge from low-fidelity data. J. Chem. Inf. Model., 63, 2345–2359.
22. Dong,J., Wang,N.-N., Liu,K.-Y., Zhu,M.-F., Yun,Y.-H., Zeng,W.-B., Chen,A.F. and Cao,D.-S. (2017) ChemBCPP: a freely available web server for calculating commonly used physicochemical properties. Chemom. Intell. Lab. Syst., 171, 65–73.
23. Wu,J., Wan,Y., Wu,Z., Zhang,S., Cao,D., Hsieh,C.Y. and Hou,T. (2023) MF-SuP-pK(a): multi-fidelity modeling with subgraph pooling mechanism for pK(a) prediction. Acta Pharm. Sin. B, 13, 2572–2584.
24. Yu,J., Wang,J., Zhao,H., Gao,J., Kang,Y., Cao,D., Wang,Z. and Hou,T. (2022) Organic compound synthetic accessibility prediction based on the graph attention mechanism. J. Chem. Inf. Model., 62, 2973–2986.
25. Yang,K., Swanson,K., Jin,W., Coley,C., Eiden,P., Gao,H., Guzman-Perez,A., Hopper,T., Kelley,B., Mathea,M., et al. (2019) Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model., 59, 3370–3388.
26. Heid,E., Greenman,K.P., Chung,Y., Li,S.C., Graff,D.E., Vermeire,F.H., Wu,H., Green,W.H. and McGill,C.J. (2024) Chemprop: a machine learning package for chemical property prediction. J. Chem. Inf. Model., 64, 9–17.
27. Cai,H., Zhang,H., Zhao,D., Wu,J. and Wang,L. (2022) FP-GNN: a versatile deep learning architecture for enhanced molecular property prediction. Brief. Bioinform., 23, bbac408.
28. Stokes,J.M., Yang,K., Swanson,K., Jin,W., Cubillos-Ruiz,A., Donghia,N.M., MacNair,C.R., French,S., Carfrae,L.A., Bloom-Ackermann,Z., et al. (2020) A deep learning approach to antibiotic discovery. Cell, 180, 688–702.
29. Liu,G., Catacutan,D.B., Rathod,K., Swanson,K., Jin,W., Mohammed,J.C., Chiappino-Pepe,A., Syed,S.A., Fragis,M., Rachwalski,K., et al. (2023) Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii. Nat. Chem. Biol., 19, 1342–1350.
30. Wong,F., Zheng,E.J., Valeri,J.A., Donghia,N.M., Anahtar,M.N., Omori,S., Li,A., Cubillos-Ruiz,A., Krishnan,A., Jin,W., et al. (2023) Discovery of a structural class of antibiotics with explainable deep learning. Nature, 626, 177–185.
31. Heid,E. and Green,W.H. (2022) Machine learning of reaction properties via learned representations of the condensed graph of reaction. J. Chem. Inf. Model., 62, 2101–2110.
32. McGill,C., Forsuelo,M., Guan,Y. and Green,W.H. (2021) Predicting infrared spectra with message passing neural networks. J. Chem. Inf. Model., 61, 2594–2609.
33. Yang,Z.Y., Yang,Z.J., Lu,A.P., Hou,T.J. and Cao,D.S. (2021) Scopy: an integrated negative design python library for desirable HTS/VS database design. Brief. Bioinform., 22, bbaa194.
34. Yang,Z.Y., Yang,Z.J., Dong,J., Wang,L.L., Zhang,L.X., Ding,J.J., Ding,X.Q., Lu,A.P., Hou,T.J. and Cao,D.S. (2019) Structural analysis and identification of colloidal aggregators in drug discovery. J. Chem. Inf. Model., 59, 3714–3726.
35. Yang,Z.Y., Dong,J., Yang,Z.J., Yin,M., Jiang,H.L., Lu,A.P., Chen,X., Hou,T.J. and Cao,D.S. (2021) ChemFLuo: a web-server for structure analysis and identification of fluorescent compounds. Brief. Bioinform., 22, bbaa282.
36. Yang,Z.Y., Dong,J., Yang,Z.J., Lu,A.P., Hou,T.J. and Cao,D.S. (2020) Structural analysis and identification of false positive hits in luciferase-based assays. J. Chem. Inf. Model., 60, 2031–2043.
37. Yang,Z.Y., He,J.H., Lu,A.P., Hou,T.J. and Cao,D.S. (2020) Frequent hitters: nuisance artifacts in high-throughput screening. Drug Discov. Today, 25, 657–667.
38. Kingma,D.P. and Ba,J. (2014) Adam: a method for stochastic optimization. arXiv doi: https://doi.org/10.48550/arXiv.1412.6980, 22 December 2014, preprint: not peer reviewed.
39. Chicco,D. and Jurman,G. (2020) The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics, 21, 6.
40. Soleimany,A.P., Amini,A., Goldman,S., Rus,D., Bhatia,S.N. and Coley,C.W. (2021) Evidential deep learning for guided molecular property prediction and discovery. ACS Cent. Sci., 7, 1356–1367.
41. Gal,Y. and Ghahramani,Z. (2016) In: Maria Florina,B. and Kilian,Q.W. (eds.) Proceedings of the 33rd International Conference on Machine Learning. PMLR, Proceedings of Machine Learning Research, Vol. 48, pp. 1050–1059.
42. Chen,C., Zhang,Q., Yu,B., Yu,Z., Lawrence,P.J., Ma,Q. and Zhang,Y. (2020) Improving protein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier. Comput. Biol. Med., 123, 103899.

Received: January 29, 2024. Revised: March 10, 2024. Editorial Decision: March 18, 2024. Accepted: March 21, 2024
© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License
(http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For
commercial re-use, please contact journals.permissions@oup.com
