A Self-Attention Based Message Passing Neural Network for Predicting Molecular Lipophilicity and Aqueous Solubility
Abstract
Efficient and accurate prediction of molecular properties, such as lipophilicity and solubility, is highly desirable for rational compound design in the chemical and pharmaceutical industries. To this end, we build and apply a graph-neural-network framework called self-attention-based message-passing neural network (SAMPN) to study the relationship between chemical properties and structures in an interpretable way. The main advantages of SAMPN are that it directly uses chemical graphs and breaks the black-box mold of many machine/deep learning methods. Specifically, its attention mechanism indicates the degree to which each atom of the molecule contributes to the property of interest, and these results are easily visualized. Further, SAMPN outperforms random forests and the deep learning framework MPN from Deepchem. In addition, another formulation of SAMPN (Multi-SAMPN) can simultaneously predict multiple chemical properties with higher accuracy and efficiency than other models that predict one specific chemical property. Moreover, SAMPN can generate chemically visible and interpretable results, which can help researchers discover new pharmaceuticals and materials. The source code of the SAMPN prediction pipeline is freely available at GitHub (https://github.com/tbwxmu/SAMPN).
Keywords: Message passing network, Attention mechanism, Deep learning, Lipophilicity, Aqueous solubility
molecular structure into a series of binary digits (a bit vector) [10] based on substructures that may or may not be pre-defined, depending on the class of fingerprints being used. For example, extended-connectivity fingerprints (ECFP) split one molecule into many substructures (not pre-defined) and encode all of them into one bit vector with different identifiers [11]. Alternatively, bit vectors may be extended into count vectors that indicate the number of occurrences of each substructure in the molecule, not just its presence or absence.

Compared to the previously mentioned traditional methods, artificial neural networks (ANNs) have become increasingly popular for predicting molecular properties. For example, a three-layered ANN with E-state indices was used to predict the aqueous solubility of organic molecules [15]. More recently, graph-based networks were applied to predict lipophilicity and solubility [16]. These network-based models have shown impressive results and have contributed to the development of new methods.

Fixed fingerprint feature-extraction rules are useful for accurately reflecting underlying chemical substructures, though they may not be the best-suited representation for all tasks. Hence, researchers have to spend much time and effort carefully determining which features are most relevant to their models. This is especially problematic when physical features are used, which may require advanced variable-selection techniques or a high level of empirical knowledge. In contrast, some deep learning networks based on the simplified molecular input line entry system (SMILES) [12] can learn the molecular features automatically [13, 14]. However, this may cause the model to focus on the SMILES grammar rather than the implied molecular structure. This limitation of SMILES-based deep learning models is hard to avoid, as the SMILES representation is not designed to capture molecular similarity: molecules with similar chemical structures can generally be encoded into very different SMILES strings. Even for the same molecular structure, there are often non-unique SMILES strings, as Fig. 1a displays. Though the process of generating canonical SMILES is well known, it is inconsistent among chemical toolkits. For example, the 'canonical' SMILES code for caffeine is CN1C=NC2=C1C(=O)N(C)C(=O)N2C according to RDKit, Cn1cnc2c1c(=O)n(C)c(=O)n2C according to Open Babel, and CN1C=NC2=C1C(=O)N(C(=O)N2C)C according to PubChem.

Fig. 1 Conversion of a chemical structure into a mathematical graph. a A chemical structure usually has a unique graph but multiple SMILES strings. b Relationship list between node indices and edge indices, converted from the chemical graph. c The lists of Node2Edge, Edge2Node, Edge2Revedge and Node2NeiNode, derived from (b)
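This toolkit dependence is easy to reproduce. Below is a minimal sketch (assuming only that RDKit is installed) showing that the three vendor strings quoted above collapse to a single canonical form once re-canonicalized by one toolkit, i.e., they differ only in notation, not in structure:

```python
from rdkit import Chem

# Three 'canonical' SMILES for caffeine, as emitted by different toolkits
# (RDKit, Open Babel and PubChem respectively, quoted in the text above)
variants = [
    "CN1C=NC2=C1C(=O)N(C)C(=O)N2C",
    "Cn1cnc2c1c(=O)n(C)c(=O)n2C",
    "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
]

# RDKit parses all three strings to the same molecular graph and then
# re-canonicalizes them to a single SMILES string
canonical = {Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in variants}
print(canonical)  # a set with exactly one element
```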
Using the natural chemical graph instead of the SMILES representation may be more suitable for chemical property prediction. Briefly, a graph consists of nodes and of edges that connect nodes to one another. Analogously, a chemical graph considers atoms as nodes and bonds as the edges connecting atoms to one another. Our formulation considers these edges as bidirectional, meaning that the bond connecting atom A to atom B is the same as the bond connecting atom B to atom A. An example chemical graph can be seen in Fig. 1a.

Essential chemical properties such as molecular validity are more easily represented in two-dimensional chemical graphs than in linear SMILES. Unlike SMILES codes, chemical graphs are invariant to molecule permutations, i.e., one molecular structure has one graph but multiple SMILES representations. Recently, graph-based deep learning models have been reported in QSAR and QSPR studies [7, 17–21]. However, according to these references, the predictions are difficult to interpret, since most neural networks act as black boxes [22].

In this paper, we describe a self-attention-based message-passing neural network (SAMPN) model, a modification of Deepchem's state-of-the-art MPN [16]. It directly learns the most relevant features of each QSAR/QSPR task during the learning process and assigns a degree of importance to substructures, improving the interpretability of the predictions. Our SAMPN graph network utilizes the chemical graph structure described above, where each edge is derived from a chemical bond and each atom is a node. Both our message passing neural network (MPN) and our SAMPN model can be used as multi-target models (Multi-MPN or Multi-SAMPN), which can learn not only the relationship between chemical structures and properties, but also the relationships among intrinsic attributes of molecules. To demonstrate our computational methods, we chose lipophilicity and aqueous solubility as the target properties, as they are important chemical descriptors that pervade every aspect of bioactivity, drug metabolism and pharmacokinetic (DMPK) profiles [23].

To our knowledge, this is the first time that a model like SAMPN has been used to predict chemical properties from experimental data for QSPR studies. The results from our experiments demonstrate that our SAMPN network yields superior performance relative to traditional ML-based models and previous deep-learning models (i.e., Deepchem's MPN [16]). Furthermore, the predictions of SAMPN are easily understood and visualized, since the integrated attention mechanism can color the atoms of a molecule based on their contributions to the property of interest.

Methods and materials
Datasets and data process
Datasets of molecular lipophilicity and aqueous solubility were used for developing and testing our method. Lipophilicity is usually quantified by the n-octanol/water partition coefficient P, preferentially displayed in logarithmic form as logP. The raw lipophilicity data were downloaded from CHEMBL3301361, deposited by AstraZeneca [24], and include 4200 molecules. Aqueous solubility is the saturated concentration of a chemical in the aqueous phase; it is usually expressed in log(mol/L) and represented as logS. This dataset was downloaded from the online chemical database and modeling environment (OCHEM) [25] and includes 1311 experimental records. The dataset distributions are plotted in Additional file 1: Fig. S1.

As both datasets are small relative to the typical size requirements of deep learning models, we used tenfold stratified cross-validation [13, 23, 35], where each dataset was randomly split into a training set and a validation set (80% and 10%, respectively) for parameter selection and a test set (10%) for model comparisons. We then repeated all experiments three times with different random seeds. This process ensures that the model does not simply memorize the training data and is capable of generalizing to new molecules.
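As an illustration, the split-and-repeat protocol can be sketched as follows. This is a simplified sketch, not the repository code: it uses a plain random split in place of the full stratified procedure, and the file name is hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def split_80_10_10(df: pd.DataFrame, seed: int):
    """80% train / 10% validation / 10% test, as described above."""
    train, rest = train_test_split(df, test_size=0.2, random_state=seed)
    valid, test = train_test_split(rest, test_size=0.5, random_state=seed)
    return train, valid, test

# hypothetical input file with the two columns kept after preprocessing
df = pd.read_csv("lipophilicity.csv")  # columns: 'smiles', 'experimental value'
for seed in (0, 1, 2):  # repeat every experiment with different random seeds
    train, valid, test = split_80_10_10(df, seed)
```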
For the initial data preprocessing, duplicate molecules were removed so that each chemical structure in the data was unique, keeping the maximum of the related property values. Molecules unrecognized by RDKit (version 2019.3) [26], a cheminformatics toolkit implemented in Python, were also deleted. Only two columns ('smiles' and 'experimental value') were kept as the input data to our models. Each downloaded SMILES representation was then converted into a directed graph before training the SAMPN model with the MPN encoder, which was adapted from Deepchem and Chemprop [27, 28]. The directed graphs are mainly composed of the index lists of nodes and edges shown in Fig. 1c. Take the substructure N–C as an example: the chemical bond between the N and C atoms yields two directed edges (C:0 → N:1 and N:0 → C:1). The number of nodes is equal to the number of atoms, and the number of edges is always double the number of bonds, since we consider edges to be bidirectional.
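The edge-doubling rule can be made concrete with RDKit. The sketch below is a simplified stand-in for the repository's featurizer: every bond reported by RDKit is expanded into a pair of opposing directed edges.

```python
from rdkit import Chem

def smiles_to_directed_edges(smiles: str):
    """Expand each chemical bond into two directed edges."""
    mol = Chem.MolFromSmiles(smiles)
    edges = []
    for bond in mol.GetBonds():
        a, b = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        edges.append((a, b))  # e.g. C:0 -> N:1
        edges.append((b, a))  # and the reverse edge N:1 -> C:0
    return edges

print(smiles_to_directed_edges("CN"))  # methylamine: [(0, 1), (1, 0)]
# the edge count is always 2 * mol.GetNumBonds(), matching the text above
```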
Message passing network encoder
Instead of manually selected features, using molecular graph structures directly was first reported in 1994 [29]. In recent years, graph-based methods have been used to analyze various aspects of chemical systems [14, 30] and compared with fingerprints [31]. Graph-based models provide a natural way to describe chemical molecules, where atoms in the molecule are equivalent to nodes and chemical bonds to edges in a graph. The message-passing network is a variant of these graph-theoretical approaches that gradually merges information from distant atoms by extending radially through bonds, as displayed in Fig. 2. The passed messages are used to encode all substructures of a molecule by an adaptive learning approach, which extracts representations of molecules suited to the target predictions.

Fig. 2 Representation of the SAMPN architecture. The main part of the MPN encoder converts the neighbor features to a molecule matrix, followed by a self-attention layer and fully connected networks to make a final prediction

The message passing network encoder works as follows in Eqs. (1–3). The passing message M from atom x to atom y in the d-th iteration (message passing depth) is calculated as:

M_{xy}^{d=1} = \mathrm{Re}\left( W_{inp} \cdot f_x f_{xy} \right)    (1)

M_{xy}^{d>1} = \mathrm{Re}\left( W_{inp} \cdot f_x f_{xy} + W_h \sum_{z \in N(x) \setminus y} M_{zx}^{d-1} \right)    (2)

Here, Re is the activation function (ReLU), and W_{inp} and W_h are learned weight matrices. As we use an edge-dependent neural network to pass messages, the node feature f_x is concatenated with the edge feature f_{xy} to form the merged node–edge feature f_x f_{xy}. The node feature f_x is derived from atom type, formal charge, valence and aromaticity; similarly, the edge feature f_{xy} is derived from bond order, ring status and connection direction. The definitions of the node features f_x and edge features f_{xy} are displayed in Table 1.

Table 1 Descriptions of node and edge features

  Attribute          Description                                                             Dimension
  Node
    Atom type        All currently known chemical elements                                   118
    Degree           Number of heavy atom neighbors                                          6
    Formal charge    Charge assigned to an atom (−2, −1, 0, 1, 2)                            5
    Chirality label  R, S, unspecified and unrecognized type of chirality                    4
    Hybridization    sp, sp2, sp3, sp3d, or sp3d2                                            5
    Aromaticity      Aromatic atom or not                                                    1
  Edge
    Bond type        Single, double, triple, or aromatic                                     4
    Ring             Whether the bond is in a ring                                           1
    Bond stereo      Nature of the bond's stereochemistry (none, any, Z, E, cis, or trans)   6

The initial message M_{xy}^{d=1}, which x sends to y, is generated from the merged node–edge feature f_x f_{xy} by a neural network as described in Eq. (1). In a chemical graph, the atoms form the node set, x ∈ V, and the bonds form the edge set, (x, y) ∈ E. Each edge has its own direction in the SAMPN model. N(x) and N(y) stand for the sets of neighbor nodes of x and y, respectively, and z ∈ N(x)\y means the neighbors of x excluding y. Node x is allowed to send a message to a neighbor node y only after node x has received messages from all its neighbor nodes except y. We use a skip connection in the message passing steps, as in Fig. 2 (displayed between the neighbor features and self-features). This skip connection allows a message to travel a very long distance without the vanishing-gradient problem during backpropagation. The generated messages are exchanged and updated based on the merged node–edge feature and the previous message passing step, as Eq. (2) defines.

The latent vector h_y of each node (take Node 2's latent vector h_2 in Fig. 2 as an example) is obtained by aggregating its neighbor messages after the message-passing process, as in Eq. (3):

h_y = \mathrm{Re}\left( W_o \left( W_{ah} \cdot f_y + \sum_{z \in N(y)} M_{zy}^{d} \right) \right)    (3)
where h_y captures the local chemical-structure features based on the passing depth, and W_o and W_{ah} are learned weight matrices. More detailed information on the SAMPN algorithm can be found in Additional file 1: Table S1 in the Supporting Materials. Applying Eqs. (1–3) to a chemical graph generates the final graph representation G = {h_1 … h_i … h_n}, which is combined with the self-attention mechanism and fully-connected neural networks to make the final prediction.
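For concreteness, a loop-based PyTorch sketch of Eqs. (1–3) follows. The class name and dimensions are illustrative; the actual SAMPN encoder (adapted from Deepchem/Chemprop) vectorizes the neighbor sums with the index lists of Fig. 1c instead of Python loops.

```python
import torch
import torch.nn as nn

class MPNEncoderSketch(nn.Module):
    """Minimal sketch of Eqs. (1)-(3): messages flow along directed edges."""

    def __init__(self, node_dim, edge_dim, hidden, depth):
        super().__init__()
        self.W_inp = nn.Linear(node_dim + edge_dim, hidden, bias=False)
        self.W_h = nn.Linear(hidden, hidden, bias=False)
        self.W_ah = nn.Linear(node_dim, hidden, bias=False)
        self.W_o = nn.Linear(hidden, hidden, bias=False)
        self.depth = depth

    def forward(self, f_node, f_edge, edges):
        # edges: list of directed pairs (x, y); f_edge[k] belongs to edges[k]
        src = torch.tensor([x for x, _ in edges])
        inp = self.W_inp(torch.cat([f_node[src], f_edge], dim=1))
        m = torch.relu(inp)                                      # Eq. (1)
        for _ in range(self.depth - 1):
            agg = torch.zeros_like(m)
            for k, (x, y) in enumerate(edges):
                # sum incoming messages z -> x over z in N(x) \ y
                for j, (z, w) in enumerate(edges):
                    if w == x and z != y:
                        agg[k] = agg[k] + m[j]
            m = torch.relu(inp + self.W_h(agg))                  # Eq. (2)
        h = []
        for y in range(f_node.size(0)):
            msg = torch.zeros(m.size(1))
            for j, (_, w) in enumerate(edges):
                if w == y:
                    msg = msg + m[j]
            h.append(torch.relu(self.W_o(self.W_ah(f_node[y]) + msg)))  # Eq. (3)
        return torch.stack(h)  # G = {h_1 ... h_n}, one row per atom
```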
Self-attention mechanism
Directly combining all the hidden states of the nodes into a single vector may not make the differences among the learned features explainable [32]. A better way is to apply the attention mechanism to obtain a context vector for the target node by focusing on its neighbors and local environment. Take Node 2 as an example (the blue node in Fig. 2): after several message passing steps, Node 2 has hidden state h_2, which represents the substructure centered at Atom 2. Meanwhile, all the other nodes undergo the same process, and h_n represents the substructure centered at Atom n. Since different substructures contribute differently to the molecular property, we can use the attention mechanism to capture the different influences of the substructures on the target molecular property.

A self-attention layer is then added to identify the relationship between each substructure's contribution and the target property of a molecule. A dot-product attention algorithm was implemented that takes the whole molecular graph representation G as input. The self-attention-weighted molecule graph embedding is formed as follows:

W_{att} = \mathrm{softmax}\left( G \cdot G^{T} \right)    (4)

E_G = W_{att} \cdot G    (5)

where W_{att} is the self-attention score matrix that implicitly indicates the contribution of each local chemical graph to the target property. As G = {h_1 … h_i … h_n}, the i-th row of W_{att} holds the attention weights between the i-th atom and the rest of the atoms in the molecule. E_G is the attentive embedding matrix, where each row corresponds to the attention-weighted hidden vector of a node. Then, global average pooling is applied to the sum of G and E_G to obtain the molecule latent vector, as Fig. 2 shows in the purple rectangle. Finally, the latent vector is fed through several layers of fully connected networks for the target property prediction.
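Equations (4–5) and the pooling step amount to a few tensor operations. A minimal sketch (the hidden size of 384 is only an example):

```python
import torch

def attentive_molecule_vector(G: torch.Tensor) -> torch.Tensor:
    """Sketch of Eqs. (4)-(5) plus the global pooling described above.
    G holds one row h_i per atom (the output of the MPN encoder)."""
    W_att = torch.softmax(G @ G.T, dim=-1)  # Eq. (4): self-attention scores
    E_G = W_att @ G                         # Eq. (5): attentive embedding
    # global average pooling over the sum of G and E_G gives the molecule
    # latent vector that is fed to the fully connected layers
    return (G + E_G).mean(dim=0)

mol_vec = attentive_molecule_vector(torch.randn(9, 384))  # 9 atoms
print(mol_vec.shape)  # torch.Size([384])
```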
Model training and hyperparameter optimization
The code for the MPN encoder was mainly adapted from Deepchem and Chemprop [27, 28]. Both the MPN encoder and the self-attention mechanism were implemented in Python with PyTorch version 1.0, an open-source framework for deep learning [33]. The MPN, Multi-MPN, SAMPN and Multi-SAMPN models were trained with the Adam optimizer using the same learning-rate schedule as in [34].

Multiple metrics were used to evaluate the performance of our models: mean absolute error (MAE), root mean squared error (RMSE), mean squared error (MSE), coefficient of determination (R2) and Pearson correlation coefficient (PC). Lower values of MAE, MSE and RMSE indicate better predictive performance; conversely, higher values of PC and R2 indicate better models or better fits to the data. While some of these metrics tell the same story, including all of them may provide a rich benchmark for future studies.

A grid-search algorithm was used to adjust the hyperparameters with the Hyperopt package, version 0.1.2 [35]. Table 2 shows the hyperparameters to be optimized and the search space. We chose RMSE on the validation set as the metric for finding the most suitable combination of hyperparameters within the search space. In the lipophilicity-QSPR task, one of the best combinations of hyperparameters was {'activation': 'ReLU'; 'depth': 4; 'dropout': 0.25; 'layers of fully connected networks': 2; 'hidden size': 384}. All the message passing neural network models (MPN, SAMPN, Multi-MPN and Multi-SAMPN) used these hyperparameters to test the final performance with tenfold stratified cross-validation on the whole dataset.

Table 2 Hyperparameter optimization for MPN and SAMPN

  Hyperparameter                       Range (interval)
  Activation function                  Tanh, ELU, LeakyReLU, ReLU, PReLU, SELU
  Steps of message passing             2–6 (1)
  Graph embedding size                 32–512 (32)
  Dropout rate                         0.0–0.4 (0.05)
  Layers of fully connected network    1–3 (1)
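With Hyperopt, the search over Table 2 can be sketched as below. The objective shown is a dummy stand-in: in the real pipeline it would train a SAMPN model with the sampled hyperparameters and return the validation-set RMSE.

```python
from hyperopt import fmin, tpe, hp

# search space mirroring Table 2
space = {
    "activation": hp.choice("activation",
                            ["Tanh", "ELU", "LeakyReLU", "ReLU", "PReLU", "SELU"]),
    "depth": hp.quniform("depth", 2, 6, 1),
    "hidden_size": hp.quniform("hidden_size", 32, 512, 32),
    "dropout": hp.quniform("dropout", 0.0, 0.4, 0.05),
    "ffn_layers": hp.quniform("ffn_layers", 1, 3, 1),
}

def objective(params):
    # hypothetical stand-in: train SAMPN with `params` and return the
    # validation RMSE; a toy quadratic keeps this sketch runnable
    return (params["depth"] - 4) ** 2

best = fmin(objective, space, algo=tpe.suggest, max_evals=50)
print(best)  # best hyperparameter combination found
```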
In addition to using the published results from Deepchem's MPN, we also built a pure MPN model to establish a baseline without self-attention, keeping all other configurations the same as in SAMPN. To compare single-task and multi-target deep learning networks, we built Multi-MPN and Multi-SAMPN. The multi-target models used a merged molecule dataset from 'Lipophilicity' and 'Water Solubility', as described in the Supporting Materials. All the parameters used were kept the same between MPN and SAMPN.

Random forest
To compare our SAMPN method with traditional machine learning methods, we chose a random forest model as the baseline. Random forest (RF) [36] is a supervised learning algorithm built on an ensemble of decision trees generated from a bootstrapped (bagged) sampling of compounds and features. It is widely used in traditional structure–property relationship research [37] and has been considered a "gold standard" owing to its robustness, ease of use and high prediction accuracy [38]. Here, ECFP with a fixed length of 1024 [12] was used with the RF model, as implemented in scikit-learn [40].

Table 3 Models' performance (root-mean-square error) on the lipophilicity and water solubility datasets

  Dataset (size)           Model                   RMSE
  Lipophilicity (4200)     RF                      0.824 ± 0.041
                           MPN (Deepchem)^a        0.630 ± 0.059
                           MPN (Deepchem)^b        0.652 ± 0.061
                           MPN                     0.630 ± 0.059
                           SAMPN                   0.579 ± 0.036
                           Multi-MPN               0.594 ± 0.039
                           Multi-SAMPN             0.571 ± 0.032
  Water solubility (1311)  RF                      1.096 ± 0.092
                           MPN (Deepchem-1128)^a   0.580 ± 0.030
                           MPN (Deepchem)^b        0.676 ± 0.022
                           MPN                     0.694 ± 0.050
                           SAMPN                   0.688 ± 0.057
                           Multi-MPN               0.674 ± 0.074
                           Multi-SAMPN             0.661 ± 0.063

Italics represent the best performance in the results
^a Values were reported in [16]. In the lipophilicity prediction, we use the same dataset as Deepchem; in the water solubility prediction, our dataset is larger than the one Deepchem used (1128 molecules)
^b Values were calculated from the same data with the same stratified cross-validation protocol as in our work
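A minimal sketch of such an RF baseline follows; the two molecules and their logP values are illustrative stand-ins for the training data.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestRegressor

def ecfp(smiles: str, n_bits: int = 1024) -> np.ndarray:
    """1024-bit ECFP (Morgan fingerprint, radius 2)."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=n_bits)
    return np.array(fp)

# hypothetical training data (SMILES and measured logP)
train_smiles = ["CCO", "c1ccccc1O"]
train_logp = [-0.31, 1.46]

X = np.stack([ecfp(s) for s in train_smiles])
rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(X, train_logp)
print(rf.predict(ecfp("CCN").reshape(1, -1)))  # predict a new molecule
```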
Fig. 3 Models’ performance on lipophilicity (a, c) and aqueous solubility (b, d) with the same tenfold stratified cross-validation. Error bars represent
standard deviations
As Fig. 3 and Table 3 show, Multi-SAMPN achieved better prediction performance than the single-target SAMPN or MPN. Although our case indicates that a multi-target model performs better than a single-target model, more studies are required to show whether this is general, since our case used only one lipophilicity and one water solubility dataset.

Visualize the attention
While higher prediction accuracy is always desirable, the ability to interpret a QSPR model is also important. Model comparison and interpretation can be facilitated by a visualization technique, making it possible to identify the learned features that drive compound property predictions. In the SAMPN model, we can obtain the attention weight scores from the self-attention mechanism. For a specific molecule, we compute the difference between each atom's weight score and the average attention weight of the molecule. We define this difference as the attention coefficient of the atom; these attention coefficients are very useful for gaining insight into which parts of a molecule increase the target molecular property and which decrease it.

By using heat-map coloring on each molecule (such as in Fig. 4a–f), it is easy to see which parts of the molecule play a more important role in its lipophilicity or water solubility. The lipophilicity and solubility heat maps are helpful for chemists seeking to optimize the lipophilicity and solubility of a particular molecule. Consider Fig. 4b, a depiction of 1H-indazole after applying our model. This molecule has a relatively high lipophilicity, as it has a large π-electron-conjugated system in its fused aromatic ring. However, the nitrogen-containing section of the molecule displays strong anti-lipophilic properties relative to the rest of the molecule. This may, in part, be due to nitrogen's contribution (as 'N' or 'NH') to a hydrogen-bonding network with its surroundings. Thus, altering 1H-indazole to disrupt that potential network may increase the molecule's lipophilicity. To test this hypothesis, we used SAMPN to predict the lipophilicity of benzo[d]isothiazole (Additional file 1: Fig. S2), the molecule made by exchanging the 'NH' of 1H-indazole with 'S' (sulfur). As expected, this change did increase the molecule's lipophilicity. Another example is the primary amine group in Fig. 4f, which can easily form hydrogen bonds with water molecules; this is reflected in red as a predicted soluble feature.

Fig. 4 Heat map molecule coloring on lipophilicity (a–c) and solubility (d–f). a–c Red indicates a predicted anti-lipophilic feature and blue indicates a predicted lipophilic feature. d–f Red indicates a predicted soluble feature and blue indicates a predicted anti-soluble feature
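One way to render such heat maps is RDKit's similarity-map utility. The sketch below (written against RDKit 2019-era releases, whose drawing API this matches) uses random numbers in place of the attention weights that a trained SAMPN would supply:

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem.Draw import SimilarityMaps

def color_by_attention(mol, att_weights):
    """Color each atom by its attention coefficient: the atom's weight
    minus the molecule's mean attention weight, as defined above."""
    coeffs = np.asarray(att_weights) - float(np.mean(att_weights))
    return SimilarityMaps.GetSimilarityMapFromWeights(mol, coeffs.tolist())

mol = Chem.MolFromSmiles("c1ccc2[nH]ncc2c1")  # 1H-indazole (cf. Fig. 4b)
weights = np.random.rand(mol.GetNumAtoms())   # stand-in for SAMPN attention
fig = color_by_attention(mol, weights)        # matplotlib figure with heat map
```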
Supplementary information
Supplementary information accompanies this paper at https://doi.org/10.1186/s13321-020-0414-z.
References
1. Hansen K, Biegler F, Ramakrishnan R, Pronobis W, Von Lilienfeld OA, Müller K-R, Tkatchenko A (2015) Machine learning predictions of molecular properties: accurate many-body potentials and non-locality in chemical space. J Phys Chem Lett 6:2326–2331
2. Cherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, Dearden J, Gramatica P, Martin YC, Todeschini R (2014) QSAR modeling: where have you been? Where are you going to? J Med Chem 57:4977–5010
3. Chen H, Engkvist O, Wang Y, Olivecrona M, Blaschke T (2018) The rise of deep learning in drug discovery. Drug Discov Today 23:1241–1250
4. Le T, Epa VC, Burden FR, Winkler DA (2012) Quantitative structure–property relationship modeling of diverse materials properties. Chem Rev 112:2889–2919
5. Gómez-Bombarelli R, Aguilera-Iparraguirre J, Hirzel TD, Duvenaud D, Maclaurin D, Blood-Forsythe MA, Chae HS, Einzinger M, Ha D-G, Wu T (2016) Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach. Nat Mater 15:1120
6. Mannodi-Kanakkithodi A, Pilania G, Huan TD, Lookman T, Ramprasad R (2016) Machine learning strategy for accelerated design of polymer dielectrics. Sci Rep 6:20952
7. Feinberg EN, Sheridan R, Joshi E, Pande VS, Cheng AC (2019) Step change improvement in ADMET prediction with PotentialNet deep featurization. arXiv preprint arXiv:1903.11789
8. Ju S, Shiga T, Feng L, Hou Z, Tsuda K, Shiomi J (2017) Designing nanostructures for phonon transport via Bayesian optimization. Phys Rev X 7:021024
9. Hansch C, Maloney PP, Fujita T, Muir RM (1962) Correlation of biological activity of phenoxyacetic acids with Hammett substituent constants and partition coefficients. Nature 194:178
10. Riniker S, Landrum GA (2013) Open-source platform to benchmark fingerprints for ligand-based virtual screening. J Cheminform 5:26
11. Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50:742–754
12. Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28:31–36
13. Olivecrona M, Blaschke T, Engkvist O, Chen H (2017) Molecular de-novo design through deep reinforcement learning. J Cheminform 9:48
14. Li X, Yan X, Gu Q, Zhou H, Wu D, Xu J (2019) DeepChemStable: chemical stability prediction with an attention-based graph convolution network. J Chem Inf Model 59:1044–1049
15. Tetko IV, Tanchuk VY, Kasheva TN, Villa AE (2001) Estimation of aqueous solubility of chemical compounds using E-state indices. J Chem Inf Comput Sci 41:1488–1493
16. Wu Z, Ramsundar B, Feinberg EN, Gomes J, Geniesse C, Pappu AS, Leswing K, Pande V (2018) MoleculeNet: a benchmark for molecular machine learning. Chem Sci 9:513–530
17. Réti T, Sharafdini R, Dregelyi-Kiss A, Haghbin H (2018) Graph irregularity indices used as molecular descriptors in QSPR studies. MATCH Commun Math Comput Chem 79:509–524
18. Sarkar D, Sharma S, Mukhopadhyay S, Bothra AK (2016) QSAR studies of FabH inhibitors using graph theoretical & quantum chemical descriptors. Pharmacophore 7
19. Shao Z, Hirayama Y, Yamanishi Y, Saigo H (2015) Mining discriminative patterns from graph data with multiple labels and its application to quantitative structure–activity relationship (QSAR) models. J Chem Inf Model 55:2519–2527
20. Wang X, Li Z, Jiang M, Wang S, Zhang S, Wei Z (2019) Molecule property prediction based on spatial graph embedding. J Chem Inf Model 59:3817–3828
21. Liu K, Sun X, Jia L, Ma J, Xing H, Wu J, Gao H, Sun Y, Boulnois F, Fan J (2019) Chemi-Net: a molecular graph convolutional network for accurate drug property prediction. Int J Mol Sci 20:3389
22. Goulon A, Picot T, Duprat A, Dreyfus G (2007) Predicting activities without computing descriptors: graph machines for QSAR. SAR QSAR Environ Res 18:141–153
23. Arnott JA, Planey SL (2012) The influence of lipophilicity in drug discovery and design. Expert Opin Drug Discov 7:863–875
24. AstraZeneca (2016) Experimental in vitro DMPK and physicochemical data on a set of publicly disclosed compounds. https://doi.org/10.6019/Chembl3301361
25. Sushko I, Novotarskyi S, Körner R, Pandey AK, Rupp M, Teetz W, Brandmaier S, Abdelaziz A, Prokopenko VV, Tanchuk VY et al (2011) Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information. J Comput Aided Mol Des 25:533–554
26. Landrum G (2006) RDKit: open-source cheminformatics
27. Ramsundar B, Eastman P, Walters P, Pande V (2019) Deep learning for the life sciences: applying deep learning to genomics, microscopy, drug discovery, and more. O'Reilly Media, Newton
28. Yang K, Swanson K, Jin W, Coley CW, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M (2019) Analyzing learned molecular representations for property prediction. J Chem Inf Model 59:3370–3388
29. Kireev DB (1995) ChemNet: a novel neural network based method for graph/property mapping. J Chem Inf Comput Sci 35:175–180
30. Coley CW, Jin W, Rogers L, Jamison TF, Jaakkola TS, Green WH, Barzilay R, Jensen KF (2019) A graph-convolutional neural network model for the prediction of chemical reactivity. Chem Sci 10:370–377
31. Kearnes S, McCloskey K, Berndl M, Pande V, Riley P (2016) Molecular graph convolutions: moving beyond fingerprints. J Comput Aided Mol Des 30:595–608
32. Duvenaud DK, Maclaurin D, Iparraguirre J, Bombarell R, Hirzel T, Aspuru-Guzik A, Adams RP (2015) Convolutional networks on graphs for learning molecular fingerprints. In: Advances in neural information processing systems, pp 2224–2232
33. Paszke A, Gross S, Chintala S, Chanan G (2017) PyTorch: tensors and dynamic neural networks in Python with strong GPU acceleration
34. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008
35. Bergstra J, Komer B, Eliasmith C, Yamins D, Cox DD (2015) Hyperopt: a Python library for model selection and hyperparameter optimization. Comput Sci Discov 8:014008
36. Breiman L (2001) Random forests. Mach Learn 45:5–32
37. Polishchuk P (2017) Interpretation of quantitative structure–activity relationship models: past, present, and future. J Chem Inf Model 57:2618–2639
38. Ma J, Sheridan RP, Liaw A, Dahl GE, Svetnik V (2015) Deep neural nets as a method for quantitative structure–activity relationships. J Chem Inf Model 55:263–274
39. Oliphant TE (2007) Python for scientific computing. Comput Sci Eng 9:10–20
40. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830