Gkae 236
https://doi.org/10.1093/nar/gkae236
Web Server Issue
Abstract
ADMETlab 3.0 is the second updated version of the web server that provides a comprehensive and efficient platform for evaluating ADMET-
related parameters as well as physicochemical properties and medicinal chemistry characteristics involved in the drug discovery process. This
new release addresses the limitations of the previous version and offers broader coverage, improved performance, API functionality, and decision
support. For supporting data and endpoints, this version includes 119 features, an increase of 31 compared to the previous version. The updated
dataset, with over 400 000 entries, is 1.5 times larger than that of the previous version. ADMETlab 3.0 incorporates a multi-task DMPNN
architecture coupled with molecular descriptors, a method that not only guaranteed fast simultaneous calculation of all endpoints, but also
achieved superior accuracy and robustness. In addition, an API has been introduced to meet the growing demand for
programmatic access to large amounts of data in ADMETlab 3.0. Moreover, this version includes uncertainty estimates in the prediction results,
aiding in the confident selection of candidate compounds for further studies and experiments. ADMETlab 3.0 is publicly accessible without the
need for registration at: https://admetlab3.scbdd.com.
Received: January 29, 2024. Revised: March 10, 2024. Editorial Decision: March 18, 2024. Accepted: March 21, 2024
© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License
(http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the
original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
2 Nucleic Acids Research, 2024
primarily owed to its multi-task graph neural network structure. To inherit the multi-tasking
benefits from the previous version (simultaneously improving model performance and computational
speed), we implemented Chemprop’s multi-task training approach in ADMETlab 3.0.

Recent studies have demonstrated that incorporating external features, rather than relying solely
on SMILES, can further enhance the performance of DMPNN (14,27). The advantage of this combination
likely stems from the complementary relationship between the global information of molecules from
descriptors and the local information ex-

under the ROC curve (AUC), accuracy (ACC), and Matthews correlation coefficient (MCC) (39) were
applied for classification tasks. To obtain robust and accurate prediction models, each training
process was repeated five times with randomly partitioned data, and the best-performing models
were incorporated into the online platform.

Webserver implementation
Similar to its predecessor, the ADMETlab 3.0 application has been crafted using Django as its
backbone. The front end, using frameworks such as Bootstrap and JQuery, is responsi-
Figure 1. Overview scheme of ADMETlab 3.0 data and DMPNN-Des model framework.
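The three classification metrics named in the text (AUC, ACC and MCC) can be illustrated with a minimal, dependency-free sketch. This is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision cut-off are assumptions made purely for illustration, and the AUC is computed in its simple pairwise form (the fraction of positive/negative pairs ranked correctly, with ties counted as one half).

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """ACC = (TP + TN) / N."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (O(n^2), fine for small sets)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # the 0.5 cut-off is an assumption

print(f"ACC={accuracy(y_true, y_pred):.3f} "
      f"MCC={mcc(y_true, y_pred):.3f} "
      f"AUC={roc_auc(y_true, scores):.3f}")
```

On this toy set the script prints ACC=0.667 MCC=0.333 AUC=0.889. Unlike ACC and MCC, AUC is computed from the raw scores and does not depend on the cut-off.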
For absorption, two endpoints were integrated: human oral bioavailability (50%) and Parallel
Artificial Membrane Permeability Assay (PAMPA). Distribution data covers transmembrane transporter
inhibitors. Metabolism updates feature CYP inhibition and human liver microsomal stability data.
The toxicity section now includes five types of toxicity (nephrotoxicity, neurotoxicity,
ototoxicity, hematotoxicity, genotoxicity) and in vitro toxicity assessments (hERG blockers,
RPMI-8226 immunotoxicity, A549 and Hek293 cytotoxicity). Notably, hematotoxicity data stems from
recent research on chemical structure and hematotoxicity relationships.

Increased diversity and quantity of data contribute to a more general understanding of the
relationships between molecular features and the corresponding properties to be learned, leading
to more robust and accurate predictions. Therefore, in addition to the above new endpoints and
their datasets, we also expanded the volume of data for four important datasets for drug
absorption, distribution, and excretion, namely Caco-2, logD7.4, VDss and half-life, by 20%, 20%,
125% and 20%, respectively. We thoroughly analysed the data distribution and diversity for each
endpoint. Users can access this information in the ‘Diversity & Distribution’ item under the
‘Help’ section of the website to gain a deeper understanding of the data composition for the
model.

Overall, the high-quality updates to the data have made ADMETlab 3.0 the most comprehensive ADMET
online prediction platform known to date. These processed datasets laid a solid foundation for
subsequent model construction, accurate prediction, and in vitro or in vivo in-depth study of
ADMET properties of new compound molecules.

Robust and accurate multi-task DMPNN models
There are a total of 77 predictive models based on DMPNN-Des or DMPNN, of which 18 are regression
models and 59 are classification models. The performance scores for the classification and
regression models for both MPNN and MPNN-Des are summarized in Supplementary Tables S4 and S5,
respectively. For the regression models, the R² values of the existing 13 endpoints mainly ranged
from 0.75 to 0.95. Even for the endpoint with the lowest performance, LC50FM, the R² value still
reaches a satisfactory 0.68. Regarding the five newly added regression endpoints, the R² values
for four of them fell between 0.8 and 0.9. The remaining one, T1/2 (a classification endpoint in
ADMETlab 2.0), although hindered by its complex pharmacological mechanism and a small dataset,
still achieved an R² of nearly 0.7. For the classification models, the AUC values of the existing
40 endpoints were in the range of 0.72 to 0.99, and the AUC values of the newly added 20
classification endpoints were between 0.73 and 0.96, which have also been proven to be practical
and reliable to a certain extent. In conclusion, the overall performance of the ADMET predictive
models based on DMPNN was excellent for both regression and classification endpoints, and we
believe that the online predictive webserver developed based on these models can provide
extensive, accurate and detailed ADMET information for drug developers and medicinal chemists.

To further assess the performance and efficacy of our models, we conducted a comprehensive
comparison of the two types of models embedded in ADMETlab 3.0, namely DMPNN-Des and DMPNN,
against MGA models. All these
models were trained and tested on the same dataset and split method used in this study. The
outcomes of this comparative analysis are depicted in Figure 3. As illustrated in Figure 3A, for
most regression tasks, both DMPNN-Des and DMPNN models exhibited significantly superior R²
performance compared to MGA.

However, it is noteworthy that for certain specific tasks, such as melting point, LC50FM and
half-life, MGA demonstrated slightly superior performance compared to the naïve MPNN in the past.
Nevertheless, its performance remained lower than that achieved by MPNN-Des. In classification
tasks, both DMPNN-Des and DMPNN consistently showed significantly higher AUC scores when compared
with MGA. Notably, these methods outperformed MGA in 47 out of 59 tasks, underscoring their
superior performance in the classification domain. In terms of the AUC of ADME (Figure 3B), MGA
demonstrated superior AUC scores for only four of the 27 tasks, including two cytochrome enzyme
endpoints and two absorption endpoints. For the toxicity tasks (Figure 3C), MGA achieved higher
AUC scores for only seven out of 32 endpoints, including AMES, respiratory toxicity,
nephrotoxicity, hERG 10 μm, and three others. The above results suggested that DMPNN-Des or DMPNN
generally exhibited a more dominant and superior classification performance over
the MGA method. The overall performance of DMPNN-Des was slightly better than that of DMPNN;
however, the addition of descriptors can slightly reduce processing speed (see the following
runtime analysis). To provide users with flexibility, both DMPNN-Des and DMPNN are available as
options in the webserver API, allowing users to select their preferred model for compound
evaluation.

API integration and architecture upgrades
The Application Programming Interface (API) in ADMETlab 3.0 provides the option for researchers to
use the command line for efficient access. This is particularly beneficial for tasks involving
large volumes of data. The API achieves accessibility by employing well-established protocols that
are compatible with widely used programming languages, thus simplifying interactions with the core
functionalities of the web server. Users can retrieve comprehensive calculation results
conveniently via a simple script (Figure 4). A tutorial introducing the utility and providing
detailed code examples can be found in the ‘API Tutorial’ section on the website.

ADMETlab 3.0’s API provides two key functions:
(i) Molecule Wash. This includes practical functions that are crucial for generating a reliable
outcome: standardizing molecules, addressing fragments, assigning charges, handling tautomerism,
isotope correction, and managing stereochemistry.
(ii) Off-website Batch Prediction. The API returns all the 119 ADMET-related properties for a
requested molecule. Users can follow the tutorial or use the example scripts to obtain the full
prediction results in CSV format.

Additionally, the API can return the ‘structure’ as an SVG string representing the molecular
structure and a unique identifier ‘taskid’ for result retrieval. Note that uncertainty estimates
for each prediction result are exclusively available through the API. To maintain server
stability, we recommend no more than five requests per second. For optimal user experience, web
interface queries take precedence over API requests. Notably, users can choose between DMPNN or
DMPNN-Des as the prediction model. The API’s flexibility also encourages developers to leverage
its functionality for broader applications or integrations, including the development of
repositories, graphical user interfaces, and web applications for ADMET evaluation. By making full
use of API functionality, the efficiency of high-throughput ADMET evaluation by medicinal chemists
is expected to be greatly improved.

Uncertainty evaluation for predicted results
In predictive modelling, uncertainty estimates for predicted results are a crucial means of
evaluating the accuracy and reliability of predictions. The uncertainty reflects the model’s
confidence, with lower uncertainty indicating increased confidence and higher uncertainty
signalling an unreliable prediction for a given molecule.

For the regression model, an evidence-based deep learning approach is utilized to predict target
properties and deduce the parameters of the underlying evidential distribution (40). The method
integrates an evidential layer after feature extraction in a strategic position to capture nuanced
support for each prediction. We adopted the ‘evidential_total’ estimation method implemented in
Chemprop for the regression model, which goes beyond conventional uncertainty assessments by
calculating both evidential epistemic and aleatoric uncertainty. For the classification model,
Monte Carlo dropout is employed (41). Monte Carlo dropout assesses uncertainty across different
properties by simulating an ensemble of sub-models within a single neural network. It generates a
distribution of outcomes, offering a prediction mean as the predicted value and a variance as an
uncertainty score for each property.

We presented the model’s RMSE range within distinct uncertainty intervals for regression tasks and
provided uncertainty thresholds with their corresponding maximum Youden’s index for classification
tasks. In classification, an uncertainty score exceeding the threshold at the maximum Youden’s
index designates the model’s prediction as ‘low confidence’, while an uncertainty score below this
threshold indicates ‘high confidence’ in the model’s prediction. Users can retrieve prediction
results with these confidence labels through the API, and additional details on confidence values
can be found in Supplementary Tables S6 and S7.

Comparison with other web-based tools
We compared endpoint information and processing efficiency among ADMETlab 3.0, ADMETlab 2.0, and
several popular ADMET prediction platforms, including SwissADME, admetSAR2.0, FAF-Drugs4, pkCSM,
vNN-ADMET, ADMETboost and Interpretable-ADMET. Details are summarized in Table 1.

Table 1. Comparison of the main features of ADMETlab (versions 2.0 and 3.0) with other web-based platforms
Features                  ADMETlab 3.0  ADMETlab 2.0  SwissADME  admetSAR 2.0  FAF-Drugs4  pkCSM  vNN-ADMET  ADMETboost  Interpretable-ADMET
Physicochemical property  21            17            12         5             20          6      0          7           10
Medicinal chemistry       20            13            10         0             16          0      0          0           0
ADME                      34            23            9          35            0           20     9          18          20
Toxicity                  36            27            0          12            0           10     6          4           29
Toxicophoric rule         8             8             0          0             4           0      0          0           0
PAINS included            Yes           Yes           Yes        No            Yes         No     No         No          Yes
Batch evaluation/API      +++           ++            +          +             ++          ++     ++         +           +

As indicated by the results, ADMETlab 3.0 and 2.0 exhibit superior data support and evaluation
performance over SwissADME, admetSAR2.0, FAF-Drugs4, pkCSM and vNN-ADMET. In comparison with the
two latest platforms, ADMETboost and Interpretable-ADMET, ADMETlab 3.0 also showed overall better
coverage and utility performance. ADMETboost covered 22 ADMET-related parameters plus 7
physicochemical properties. Its optimal model turned out to be extreme gradient boosting, a method
that has proven powerful on moderate or smaller training datasets (42). Interpretable-ADMET, on
the other hand, can predict 59 ADMET and physicochemical properties. Specifically,
Interpretable-ADMET not only made predictions but also offered an interpretation module and an
optimization module to identify key substructures for specific properties and optimize the
structure afterwards. However, due to its limitations on user access and the lack of an interface
to perform batch calculations, its practicality is largely limited.

ADMETlab 3.0, on the other hand, demonstrates interpretability by providing uncertainty estimation
scores for prediction results, utilizing coloured dots to represent the empirical decision state
of each output, and highlighting alert substructures. Additionally, we have significantly improved
the user experience by introducing elaborate documentation, better guidance, and enhanced web
design compared to version 2.0. The detailed information for endpoint explanation, data
distribution and other aspects is summarized in the Supplementary Document and the ‘help’ section
on the website.

In runtime analysis, ADMETlab 3.0 exhibited a slightly longer runtime compared to ADMETlab 2.0 in
the web portal. This is considered a commendable performance, given that ADMETlab 3.0 has 31 more
endpoints to calculate. Detailed runtime assessment results of ADMETlab 3.0 with different
modelling options can be found in Supplementary Table S8 and Supplementary Figure S1. For users
who prioritize uncertainty reference, the combination of DMPNN-Des and uncertainty demonstrated
optimal performance on runtime. However, for those primarily concerned with runtime efficiency,
DMPNN-Des can be the preferred choice, as it exhibited superior computational efficiency across
varying dataset sizes while maintaining a better performance over sole DMPNN.
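The recommendation of no more than five API requests per second can be respected on the client side with a small throttle. The sketch below is only an illustration under stated assumptions: `Throttle`, `predict_batch` and the `send` callable are hypothetical names, and `send` stands in for whatever request function the website's ‘API Tutorial’ provides. No real ADMETlab endpoint, URL or payload format is shown or implied.

```python
import time
from collections import deque


class Throttle:
    """Sliding-window limiter: at most `max_calls` calls in any `period` seconds."""

    def __init__(self, max_calls=5, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.stamps = deque()   # start times of the most recent calls

    def wait(self):
        """Block until one more call is allowed, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have left the sliding window.
            while self.stamps and now - self.stamps[0] >= self.period:
                self.stamps.popleft()
            if len(self.stamps) < self.max_calls:
                self.stamps.append(now)
                return
            # Window is full: sleep until the oldest call expires, then re-check.
            self.sleep(self.period - (now - self.stamps[0]))


def predict_batch(smiles_list, send, throttle=None):
    """Call the hypothetical `send(smiles)` for each molecule, never exceeding the limit."""
    throttle = throttle or Throttle()
    results = []
    for smiles in smiles_list:
        throttle.wait()
        results.append(send(smiles))
    return results
```

A caller would pass its own request function, for example `predict_batch(["CCO", "c1ccccc1"], send=my_request)`, where `my_request` is a placeholder for the user's code. Keeping the limiter separate from the request logic means the same throttle can wrap either API function described in the API section.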
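The classification branch of the uncertainty evaluation (Monte Carlo dropout followed by a Youden-based confidence label) can be sketched as follows. This is a hedged illustration rather than the authors' implementation. All function names and toy numbers are hypothetical, and the way the threshold is derived here is an assumption, since the text does not spell out the procedure: wrongly predicted molecules are treated as the positive class, and the cut-off is the uncertainty value that maximizes J = sensitivity + specificity − 1. The evidential method used for regression is not covered.

```python
from statistics import mean, pvariance


def mc_dropout_summary(samples):
    """Summarize repeated stochastic forward passes for one molecule.

    `samples` are the predicted probabilities obtained with dropout kept active;
    the mean is the reported prediction and the variance the uncertainty score.
    """
    return mean(samples), pvariance(samples)


def youden_threshold(uncertainties, is_wrong):
    """Return (threshold, J) maximizing Youden's index on a validation set.

    Assumption: a prediction is flagged when its uncertainty exceeds the
    threshold, and wrong predictions are the class we want to flag.
    """
    n_pos = sum(1 for w in is_wrong if w)
    n_neg = len(is_wrong) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both correct and wrong predictions")
    best_t, best_j = None, -1.0
    for t in sorted(set(uncertainties)):
        tp = sum(1 for u, w in zip(uncertainties, is_wrong) if w and u > t)
        tn = sum(1 for u, w in zip(uncertainties, is_wrong) if not w and u <= t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def confidence_label(uncertainty, threshold):
    """Map an uncertainty score to the two labels described in the text."""
    return "low confidence" if uncertainty > threshold else "high confidence"


# Toy validation data, invented for illustration only.
val_uncertainty = [0.01, 0.02, 0.03, 0.10, 0.12, 0.20]
val_is_wrong = [False, False, False, True, False, True]
threshold, j_max = youden_threshold(val_uncertainty, val_is_wrong)

prediction, score = mc_dropout_summary([0.8, 0.9, 0.7, 0.8])
print(threshold, round(j_max, 2), confidence_label(score, threshold))
```

On the toy data the selected threshold is 0.03 with J = 0.75, and the example molecule (variance of about 0.005) is labelled ‘high confidence’.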