Equivariant message passing for the prediction of tensorial properties and molecular spectra
Kristof T. Schütt,1, 2 Oliver T. Unke,1, 2 and Michael Gastegger1, 3
1) Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
2) Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
3) BASLEARN – TU Berlin/BASF Joint Lab for Machine Learning, 10587 Berlin, Germany
(Dated: 8 June 2021)
Message passing neural networks have become a method of choice for learning on graphs, in particular the
prediction of chemical properties and the acceleration of molecular dynamics studies. While they readily scale
to large training data sets, previous approaches have proven to be less data efficient than kernel methods. We
identify limitations of invariant representations as a major reason and extend the message passing formulation
to rotationally equivariant representations. On this basis, we propose the polarizable atom interaction neural
network (PaiNN) and improve on common molecule benchmarks over previous networks, while reducing
model size and inference time. We leverage the equivariant atomwise representations obtained by PaiNN for
the prediction of tensorial properties. Finally, we apply this to the simulation of molecular spectra, achieving
speedups of 4-5 orders of magnitude compared to the electronic structure reference.
banks according to known symmetries of associated feature types. While these approaches work on grids, architectures such as Tensor Field Networks19, Cormorant20 and NequIP30 use equivariant convolutions based on spherical harmonics (SH) and Clebsch-Gordan (CG) transforms for point clouds. Kondor and Trivedi31 have described a general framework on equivariance and convolutions in neural networks focusing on irreducible representations. In contrast, PaiNN models equivariant interactions in Cartesian space, which is conceptually simpler and does not require tensor contractions with CG coefficients. A similar approach was proposed by Jing et al.32 with the GVP-GNN. Both approaches are designed for distinct applications, resulting in crucial differences in the design of the neural network architectures, most notably the message functions. The GVP-GNN has been designed for single-point protein sequence predictions, e.g., allowing the use of non-smooth components. In contrast, PaiNN is designed for simulations with millions of inference steps, requiring a fast message function and a smoothly differentiable model.

conventional CNNs can be cast in this general framework. Rotational invariance of the representation can be ensured by choosing rotationally invariant message and update functions, which by definition need to fulfill

$$f(\vec{x}) = f(R\vec{x}), \quad (3)$$

for any rotation matrix $R \in \mathbb{R}^{3\times3}$.

B. Building equivariant MPNNs

To obtain more expressive representations of local environments, neurons do not have to be scalar, but can be geometric objects such as vectors and tensors16,19,20,33. For the purpose of this work, we restrict ourselves to scalar and vectorial representations $s_i^t$ and $\vec{v}_i^t$, respectively, such that a corresponding message pass can be written as

$$\vec{m}_i^{\,v,t+1} = \sum_{j \in \mathcal{N}(i)} \vec{M}_t\!\left(s_i^t, s_j^t, \vec{v}_i^t, \vec{v}_j^t, \vec{r}_{ij}\right). \quad (4)$$
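To make Eqs. (3) and (4) concrete, the following minimal sketch implements a message pass of this form and verifies both behaviors under a random rotation. It assumes PyTorch; the toy filter, feature sizes and message functions are illustrative stand-ins, not the actual PaiNN blocks introduced in Sec. IV.

```python
import torch

def message_pass(s, v, r, cutoff=5.0):
    """Minimal message pass in the form of Eq. (4); illustrative, not the PaiNN blocks.

    s: (N, F) scalar features, v: (N, F, 3) vector features, r: (N, 3) positions.
    Scalar messages depend on distances only (invariant, Eq. 3); vector messages
    propagate unit directions r_ij/|r_ij| and neighboring vector features.
    """
    n = r.shape[0]
    rij = r[None, :, :] - r[:, None, :]                  # (N, N, 3), rij[i, j] = r_j - r_i
    dij = rij.norm(dim=-1, keepdim=True)                 # (N, N, 1) pairwise distances
    mask = (dij < cutoff) & ~torch.eye(n, dtype=torch.bool)[..., None]  # N(i), no self-message
    w = torch.where(mask, torch.sin(dij) / dij.clamp(min=1e-9), torch.zeros_like(dij))
    ds = (w * s[None]).sum(dim=1)                        # invariant scalar messages
    dirs = rij / dij.clamp(min=1e-9)                     # unit directions to neighbors
    dv = (w[..., None] * (v[None] + s[None, :, :, None] * dirs[:, :, None, :])).sum(dim=1)
    return s + ds, v + dv                                # residual update

torch.manual_seed(0)
s, v, r = torch.randn(5, 8), torch.randn(5, 8, 3), torch.randn(5, 3)
R = torch.linalg.qr(torch.randn(3, 3)).Q                 # random orthogonal matrix
R = -R if torch.det(R) < 0 else R                        # make it a proper rotation
s1, v1 = message_pass(s, v, r)
s2, v2 = message_pass(s, v @ R.T, r @ R.T)
print(torch.allclose(s1, s2, atol=1e-4))                 # scalars fulfill f(x) = f(Rx)
print(torch.allclose(v1 @ R.T, v2, atol=1e-4))           # vectors rotate with the input
```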
While a single pass is limited to interactions within a local environment, successive message passes are supposed to construct more complex representations and propagate local information beyond the neighborhood. This raises the question whether rotationally invariant representations of atomic environments $h_i$ are sufficient here.

TABLE I: Comparison of three simplified message functions for the example of a water molecule.

                              distances   angles     directions
Scaling with neighbors        O(|N|)      O(|N|²)    O(|N|)
Resolve change of ‖r⃗_1j‖      yes         no         no
Resolve change of α_213       no          yes        yes

Tab. I compares three simplified message functions for the example of a water molecule. Using a distance-based message function, the representation of atom 1 (oxygen) is able to resolve changing bond lengths to atoms 2 and 3 (hydrogens); however, it is not sensitive to changes of the bond angle. On the other hand, using the angle directly as part of the message function cannot resolve the distances. Therefore, a combination of distances and angles is required to obtain a more expressive message function15. Unfortunately, including angles in the messages scales O(|N|²) with the number of neighbors. Alternatively, directions to neighboring atoms may be used as messages $\vec{M}_t(\vec{r}_{ij}) = \vec{r}_{ij}/\|\vec{r}_{ij}\|$. Employing the update function $U_t(\vec{m}) = \|\vec{m}\|^2$, this is related to angles as follows:

$$\left\| \sum_{j=1}^{N} \frac{\vec{r}_{ij}}{\|\vec{r}_{ij}\|} \right\|^2 = \sum_{j,k} \left\langle \frac{\vec{r}_{ij}}{\|\vec{r}_{ij}\|}, \frac{\vec{r}_{ik}}{\|\vec{r}_{ik}\|} \right\rangle = \sum_{j=1}^{N} \sum_{k=1}^{N} \cos \alpha_{jik}. \quad (5)$$

Thus, using equivariant messages, the runtime complexity remains O(|N|) while angular information can be resolved. Note that this update function contracts the equivariant messages to a rotationally invariant representation.

Beyond the computational benefits, equivariant representations allow directional information to be propagated beyond the neighborhood, which is not possible in the invariant case. Fig. 1 illustrates this with a minimal example, where four atoms are arranged in an equidistant chain with the cutoff chosen such that only neighboring atoms are within range. We observe that for the two arrangements the angles are equal as well (Fig. 1, left). Therefore, they are indistinguishable for invariant message passing with distances and angles. In contrast, the equivariant representations differ in the sign of their components (Fig. 1, right). When contracting them to invariant representations, as in the previous example, this information is lost. However, we may instead retain equivariance in the representation and design a message function $\vec{M}_t$ that propagates not only directions to neighboring atoms, but those of equivariant representations as well. This enables the efficient propagation of directional information while scaling linearly with the number of neighbors and keeping the required cutoff small.

Note that neither many-body representations using angles, dihedral angles etc., nor equivariant representations corresponding to a multipole expansion are complete. It has been shown that even when including up to 4-body invariants, there are structures with as few as eight atoms which cannot be distinguished35, and that a many-body expansion up to n-body contributions is necessary to guarantee convergence for a system consisting of n atoms36. The same holds for multipole expansions, where scalars and vectors correspond to the 0th and 1st order. Thus, even spherical harmonics expansions with less than infinite degree are unable to represent arbitrary equivariant n-body functions. Instead, a practically sufficient and computationally efficient approach for the problem at hand is desirable.

In the following, we propose a neural network architecture that takes advantage of these properties. It overcomes the limitations of invariant representations discussed above, which we will demonstrate with the example of an organometallic compound in Section V C 2.
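As a quick numerical check of the identity in Eq. (5), assuming NumPy and arbitrary test positions:

```python
import numpy as np

rng = np.random.default_rng(0)
ri, rj = rng.normal(size=3), rng.normal(size=(6, 3))            # central atom i, neighbors j
u = (rj - ri) / np.linalg.norm(rj - ri, axis=1, keepdims=True)  # messages r_ij / |r_ij|

lhs = np.linalg.norm(u.sum(axis=0)) ** 2  # update U_t(m) = |m|^2 on the summed messages
rhs = (u @ u.T).sum()                     # sum over all pairs (j, k) of cos(alpha_jik)
assert np.isclose(lhs, rhs)               # O(|N|) messages still resolve angular information
```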
IV. POLARIZABLE ATOM INTERACTION NEURAL NETWORK (PAINN)

FIG. 2: The architecture of PaiNN with the full architecture (a) as well as the message (b) and update blocks (c) of the equivariant message passing. In all experiments, we use 128 features for $s_i$ and $\vec{v}_i$ throughout the architecture. Other layer sizes are annotated in grey.
The potential energy surface $E(Z_1, \ldots, Z_N, \vec{r}_1, \ldots, \vec{r}_N)$, with nuclear charges $Z_i \in \mathbb{N}$ and atom positions $\vec{r}_i \in \mathbb{R}^3$, exhibits certain symmetries. Generally, they include invariance of the energy towards the permutation of atom indices i, as well as rotations and translations of the molecule. A neural network potential should encode these constraints to ensure the symmetries of the predicted energy surface and increase data efficiency. A common inductive bias of the neural network potential is a decomposition of the energy into atomwise contributions $E = \sum_{i=1}^{N} \epsilon(s_i)$, where an output network predicts energy contributions from atoms embedded within their chemical environment, represented by $s_i \in \mathbb{R}^F$ 21,37. While properties of chemical compounds may be such rotationally invariant scalars, they can also be equivariant tensorial properties, e.g., the multipole expansion of the electron density.

Next, we define message and update functions as introduced in Sec. III. We use a residual structure of interchanging message and update blocks (Fig. 2a), resulting in coupled scalar and vectorial representations. For the residual of the scalar message function, we adopt the feature-wise continuous-filter convolutions introduced by Schütt et al.13:

$$\Delta s_i^m = (\phi_s(s) * W_s)_i = \sum_j \phi_s(s_j) \circ W_s(\|\vec{r}_{ij}\|), \quad (7)$$

where $\phi_s$ consists of atomwise layers as shown in Fig. 2b. The rotationally invariant filters $W_s$ are linear combinations of radial basis functions $\sin(\frac{n\pi}{r_\text{cut}} \|\vec{r}_{ij}\|)/\|\vec{r}_{ij}\|$.
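A minimal sketch of the scalar residual in Eq. (7), assuming PyTorch; the number of basis functions, the two-layer $\phi_s$, and the omission of a smooth cutoff on the filters are simplifying assumptions rather than the exact PaiNN implementation:

```python
import math
import torch
import torch.nn as nn

class ScalarMessage(nn.Module):
    """Feature-wise continuous-filter convolution as in Eq. (7) (sketch)."""

    def __init__(self, n_features=128, n_rbf=20, r_cut=5.0):
        super().__init__()
        self.register_buffer("n", torch.arange(1, n_rbf + 1).float())  # basis index n
        self.r_cut = r_cut
        self.filter_net = nn.Linear(n_rbf, n_features)  # W_s: linear combination of the basis
        self.phi_s = nn.Sequential(                      # atomwise layers (Fig. 2b)
            nn.Linear(n_features, n_features), nn.SiLU(),
            nn.Linear(n_features, n_features),
        )

    def forward(self, s, idx_i, idx_j, d_ij):
        # radial basis sin(n*pi*d/r_cut)/d for every neighbor pair (i, j)
        rbf = torch.sin(self.n * math.pi * d_ij[:, None] / self.r_cut) / d_ij[:, None]
        W = self.filter_net(rbf)                         # filter values W_s(|r_ij|)
        msg = self.phi_s(s)[idx_j] * W                   # phi_s(s_j) ∘ W_s(|r_ij|)
        return s + torch.zeros_like(s).index_add_(0, idx_i, msg)  # sum over j in N(i)
```

Here `idx_i` and `idx_j` enumerate the neighbor pairs within the cutoff, and `d_ij` holds the corresponding distances.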
We apply smoothing to the validation loss to reduce the impact of fluctuations, which are particularly common when training with both energies and forces. Please refer to the supplement for further details on training parameters.

A. Chemical compound space

We use the QM9 dataset of ≈130k small organic molecules45 with up to nine heavy atoms to evaluate the performance of PaiNN for the prediction of scalar properties across chemical compound space. We predict the magnitude of the dipole moment using Eq. 13 and the electronic spatial extent by

$$R^2 = \sum_{i=1}^{N} q_\text{atom}(s_i) \, \|\vec{r}_i\|^2,$$

as implemented by SchNetPack. The remaining properties are predicted as sums over atomic contributions. PaiNN is trained on 110k examples, while 10k molecules are used as a validation set for decaying the learning rate and early stopping. The remaining data is used as the test set, and results are averaged over three random splits. For the isotropic polarizability α, we first observed validation MAEs of 0.054 a0. Upon closer inspection, we noticed that for this property both the squared loss as well as the MAE can be reduced when minimizing the MAE directly (as done by Klicpera, Groß, and Günnemann34). This yields both validation and test MAEs of 0.045 a0 that are comparable to those of DimeNet++.
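As a sketch of how such targets are assembled from atomwise predictions, assuming NumPy: `q_atom` stands in for the latent charges from the output network, and the atomic-dipole term in the dipole prediction is an assumption about Eq. 13 (combining latent charges with the equivariant features), not a verbatim reproduction of it.

```python
import numpy as np

def spatial_extent(q, r):
    """Electronic spatial extent R^2 = sum_i q_atom(s_i) |r_i|^2 (equation above)."""
    return np.sum(q * (r ** 2).sum(axis=1))

def dipole_magnitude(q, mu_atom, r):
    """|mu| from latent charges and latent atomic dipoles (simplified stand-in for Eq. 13)."""
    return np.linalg.norm((q[:, None] * r + mu_atom).sum(axis=0))

rng = np.random.default_rng(0)
r = rng.normal(size=(9, 3))         # atom positions of a toy molecule
q = rng.normal(size=9)              # latent charges predicted from scalar features s_i
mu_atom = rng.normal(size=(9, 3))   # latent dipoles predicted from vector features v_i
print(spatial_extent(q, r), dipole_magnitude(q, mu_atom, r))
```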
Tab. II shows the mean absolute error (MAE) of PaiNN for 12 target properties of QM9 in comparison with previous approaches. SchNet13 and PhysNet27 are MPNNs with distance-based interactions, DimeNet++34 includes additional angular information and is an improved variant of DimeNet15, and L1Net14 and Cormorant20 are equivariant neural networks based on spherical harmonics and Clebsch-Gordan coefficients.

PaiNN achieves state-of-the-art results in six target properties and yields comparable results to DimeNet++ on another two targets. On the remaining properties, PaiNN achieves the second-best results after DimeNet++. Note that PaiNN, using about 600k parameters, is significantly smaller than DimeNet++ with about 1.8M parameters. For random batches of 50 molecules from QM9, the inference time is reduced from 45 ms to 13 ms, i.e., an improvement of more than 70%, when comparing PaiNN to the reference implementation of DimeNet++46 using an NVIDIA V100.
B. Molecular dynamics trajectories

We evaluate the ability to predict combined energies and forces on the MD17 benchmark10, which includes molecular dynamics trajectories of small organic molecules. While the atomic forces could be predicted directly from vectorial features, we employ the gradients of the energy model, $\vec{F}_i = -\partial E / \partial \vec{r}_i$, to ensure conservation of energy. This property is crucial to run stable molecular dynamics simulations. To demonstrate the data efficiency of PaiNN, we use the more challenging setting with 1k known structures, of which we use 950 for training and 50 for validation, where a separate model is trained for each trajectory. Tab. III shows the comparison with sGDML4 and NequIP30, which were trained on forces only, as well as SchNet, PhysNet, DimeNet and FCHL1911, which were trained on a combined loss of energies and forces. Christensen and von Lilienfeld47 have found that the energies of MD17 are noisy. Thus, depending on the molecule and chosen tradeoff, using energies for training is not always beneficial. For this reason, we train two PaiNN models per trajectory: only on forces, and on a combined loss including energy with a force error weight of ρ = 0.95. PaiNN achieves the lowest mean absolute errors for 12 out of 14 targets on models trained only on forces and exhibits errors in a similar range as Gaussian regression with the FCHL19 kernel. Overall, PaiNN performs best or equal to FCHL19 on 9 out of 14 targets. This demonstrates that equivariant neural network approaches such as PaiNN are able to compete with kernel methods in the small data regime, while being able to scale to large data sets at the same time.

C. Advantages of equivariant features

1. Ablation studies

We evaluate the impact of equivariant vector features with the example of the aspirin MD trajectory from the previous section. Compared to the full model, we remove the scalar product of vector features in Eq. 9 from the update block and the convolution over vector features in Eq. 8 (i.e., Wvv = 0). Table IV shows the results for the various ablations. The number of parameters is kept approximately constant by raising the number of node features F accordingly. We observe that all ablated components contribute to the final accuracy of the model, where the convolution over equivariant features in the message function has a slightly larger impact. This component also enables the propagation of directional information, which will be examined in Sec. V C 2. Finally, we remove all vector features from the model, resulting in an invariant model. Despite keeping the number of parameters constant by increasing the number of atom features to F = 174, the mean absolute error of the forces increases beyond 1 kcal/mol/Å.
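The conservative forces $\vec{F}_i = -\partial E / \partial \vec{r}_i$ used in Sec. V B can be obtained via automatic differentiation. A minimal sketch assuming PyTorch and any energy model `model(Z, r)` that returns a scalar energy (the name and signature are placeholders):

```python
import torch

def energy_and_forces(model, Z, r, training=False):
    """Predict energy E and conservative forces F_i = -dE/dr_i via autograd."""
    r = r.clone().requires_grad_(True)     # track gradients w.r.t. atom positions
    E = model(Z, r)                        # scalar potential energy
    # create_graph=True retains the graph so a force loss can be backpropagated
    F = -torch.autograd.grad(E, r, create_graph=training)[0]
    return E, F
```

Because the predicted force field is the gradient of a scalar, energy conservation holds by construction, which the text above identifies as crucial for stable dynamics.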
TABLE II: Mean absolute errors on QM9 dataset for various chemical properties. Results for PaiNN are averaged
over three random splits. Best in bold.
TABLE III: Mean absolute errors on MD17 dataset for energy and force predictions in kcal/mol and kcal/mol/Å,
respectively. Batzner et al. 30 only reported force errors for NequIP. Results for PaiNN are averaged over three
random splits. Best in bold.
TABLE IV: Ablation study for the prediction of energies [kcal/mol] and forces [kcal/mol/Å] for aspirin trajectories from MD17.

Ablation                                        # params   F     energy MAE   force MAE
no ablation                                     588.3k     128   0.159        0.371
no scalar product of vector features in Eq. 9   589.1k     134   0.173        0.420
no vector propagation (Wvv = 0 in Eq. 8)        589.2k     135   0.183        0.441
remove both                                     590.1k     142   0.200        0.507
no vector features                              590.3k     174   0.449        1.194

2. Propagation of directional information in substituted ferrocene

To demonstrate the advantages of equivariant over invariant representations in practice, we consider a ferrocene derivative where one hydrogen atom in each cyclopentadienyl ring is substituted by fluorine (see Fig. 4). This molecule has been chosen as it features small energy fluctuations (<1 kcal/mol) when the rings rotate relative to each other. Since the torsional energy profile depends mainly on the orientation of the distant fluorine atoms (measured by the rotation angle θ), it is a challenging prediction target for models without equivariant representations. Fig. 4 shows the predicted energy profiles for a full rotation of the cyclopentadienyl ring using cutoffs rcut ∈ {2.5, 3.0, 4.0} Å. All models are trained on energies of 10k structures sampling thermal fluctuations (300 K) and ring rotations of substituted ferrocene.
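The rotation angle θ can be measured as a dihedral angle; a small sketch assuming NumPy, where the F–centroid–centroid–F convention is an assumption about how θ is defined in Fig. 4:

```python
import numpy as np

def ring_rotation_angle(f1, c1, c2, f2):
    """Dihedral F1-C1-C2-F2 in degrees (C1, C2: ring centroids; assumed convention)."""
    b1, b2, b3 = c1 - f1, c2 - c1, f2 - c2
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)       # normals of the two half-planes
    m1 = np.cross(n1, b2 / np.linalg.norm(b2))        # frame vector orthogonal to n1 and b2
    return np.degrees(np.arctan2(m1 @ n2, n1 @ n2))   # signed angle in (-180, 180]
```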
FIG. 5: IR (top) and Raman (bottom) spectra of ethanol and aspirin. Spectra calculated with the reference method
using the harmonic oscillator approximation are shown in black (QM harmonic). The inset table shows the mean
absolute errors on the respective test set.
A simulation that previously would have taken years (3140 seconds / step) now takes one hour (15 ms / step), a speedup of roughly 2 × 10⁵.

VI. CONCLUSIONS

We have given general guidelines to design equivariant MPNNs and discussed the advantages of equivariant representations over angular features in terms of computational efficiency as well as their ability to propagate directional information. On this basis, we have proposed PaiNN, which yields fast and accurate predictions of scalar and tensorial molecular properties. Thereby, equivariant message passing allows us to significantly reduce both model size and inference time compared to directional message passing while retaining accuracy. Finally, we have demonstrated that PaiNN can be applied to the prediction of tensorial properties, which we leverage to accelerate the simulation of molecular spectra by 4-5 orders of magnitude – from years to hours.

In future work, the equivariant representations of PaiNN as well as the ability to predict tensorial properties may be leveraged in generative models of 3d geometries52–54 or the prediction of wavefunctions55–57. We see further applications of equivariant message passing in 3d shape recognition and graph embedding58.

Many challenges remain for the fast and accurate prediction of molecular properties, e.g., the modeling of enzymatic active sites or surface reactions. To describe such phenomena, highly accurate reference methods are required. Due to their computational cost, reference data generation can become a bottleneck, making data-efficient MPNNs such as PaiNN invaluable for future chemistry research.

ACKNOWLEDGEMENTS

KTS acknowledges support by the Federal Ministry of Education and Research (BMBF) for the Berlin Center for Machine Learning / BIFOLD (01IS18037A). MG works at the BASLEARN – TU Berlin/BASF Joint Lab for Machine Learning, co-financed by TU Berlin and BASF SE. OTU acknowledges funding from the Swiss National Science Foundation (Grant No. P2BSP2 188147).

1 J. Behler, "Perspective: Machine learning potentials for atomistic simulations," J. Chem. Phys. 145, 170901 (2016).
2 O. T. Unke, S. Chmiela, H. E. Sauceda, M. Gastegger, I. Poltavsky, K. T. Schütt, A. Tkatchenko, and K.-R. Müller, "Machine Learning Force Fields," arXiv preprint arXiv:2010.07067 (2020).
3 O. A. von Lilienfeld, K.-R. Müller, and A. Tkatchenko, "Exploring chemical compound space with quantum-based machine learning," Nat. Rev. Chem. 4, 347–358 (2020).
4 S. Chmiela, H. E. Sauceda, K.-R. Müller, and A. Tkatchenko, "Towards exact molecular dynamics simulations with machine-learned force fields," Nat. Commun. 9, 3887 (2018).
5 J. Westermayr, M. Gastegger, and P. Marquetand, "Combining SchNet and SHARC: The SchNarc machine learning approach for excited-state dynamics," J. Phys. Chem. Lett. 11, 3828–3834 (2020).
6 T. Morawietz, A. Singraber, C. Dellago, and J. Behler, "How van der Waals interactions determine the unique properties of water," Proc. Natl. Acad. Sci. 113, 8368–8373 (2016).
7 A. P. Bartók, J. Kermode, N. Bernstein, and G. Csányi, "Machine learning a general-purpose interatomic potential for silicon," Phys. Rev. X 8, 041048 (2018).
8 D. Lu, H. Wang, M. Chen, J. Liu, L. Lin, R. Car, W. Jia, L. Zhang, et al., "86 PFLOPS Deep Potential Molecular Dynamics simulation of 100 million atoms with ab initio accuracy," arXiv preprint arXiv:2004.11658 (2020).
9 J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, "Neural message passing for quantum chemistry," International Conference on Machine Learning (2017).
10 S. Chmiela, A. Tkatchenko, H. E. Sauceda, I. Poltavsky, K. T. Schütt, and K.-R. Müller, "Machine Learning of Accurate Energy-Conserving Molecular Force Fields," Sci. Adv. 3, e1603015 (2017).
11 A. S. Christensen, L. A. Bratholm, F. A. Faber, and O. Anatole von Lilienfeld, "FCHL revisited: Faster and more accurate quantum machine learning," J. Chem. Phys. 152, 044107 (2020).
12 A. P. Bartók, M. C. Payne, R. Kondor, and G. Csányi, "Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons," Phys. Rev. Lett. 104, 136403 (2010).
13 K. Schütt, P.-J. Kindermans, H. E. Sauceda, S. Chmiela, A. Tkatchenko, and K.-R. Müller, "SchNet: A continuous-filter convolutional neural network for modeling quantum interactions," Advances in Neural Information Processing Systems, 991–1001 (2017).
14 B. K. Miller, M. Geiger, T. E. Smidt, and F. Noé, "Relevance of rotationally equivariant convolutions for predicting molecular properties," arXiv preprint arXiv:2008.08461 (2020).
15 J. Klicpera, J. Groß, and S. Günnemann, "Directional message passing for molecular graphs," International Conference on Learning Representations (2020).
16–18 "…Learning Steerable Filters for Rotation Equivariant CNNs," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 849–858 (2018); "…," European Conference on Computer Vision (ECCV), 567–584 (2018).
19 N. Thomas, T. Smidt, S. Kearnes, L. Yang, L. Li, K. Kohlhoff, and P. Riley, "Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds," arXiv preprint arXiv:1802.08219 (2018).
20 B. Anderson, T. S. Hy, and R. Kondor, "Cormorant: Covariant Molecular Neural Networks," Advances in Neural Information Processing Systems 32, 14537–14546 (2019).
21 J. Behler and M. Parrinello, "Generalized neural-network representation of high-dimensional potential-energy surfaces," Phys. Rev. Lett. 98, 146401 (2007).
22 F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, "The graph neural network model," IEEE Trans. Neural Netw. 20, 61–80 (2008).
23 D. K. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams, "Convolutional networks on graphs for learning molecular fingerprints," Advances in Neural Information Processing Systems 28, 2224–2232 (2015).
24 S. Kearnes, K. McCloskey, M. Berndl, V. Pande, and P. Riley, "Molecular graph convolutions: moving beyond fingerprints," J. Comput. Aided Mol. Des. 30, 595–608 (2016).
25 K. T. Schütt, F. Arbabzadah, S. Chmiela, K. R. Müller, and A. Tkatchenko, "Quantum-chemical insights from deep tensor neural networks," Nat. Commun. 8, 13890 (2017).
26 K. T. Schütt, H. E. Sauceda, P.-J. Kindermans, A. Tkatchenko, and K.-R. Müller, "SchNet – A deep learning architecture for molecules and materials," J. Chem. Phys. 148, 241722 (2018).
27 O. T. Unke and M. Meuwly, "PhysNet: A Neural Network for Predicting Energies, Forces, Dipole Moments, and Partial Charges," J. Chem. Theory Comput. 15, 3678–3693 (2019).
28 N. Lubbers, J. S. Smith, and K. Barros, "Hierarchical modeling of molecular energies using a deep neural network," J. Chem. Phys. 148, 241715 (2018).
29 M. Weiler, M. Geiger, M. Welling, W. Boomsma, and T. S. Cohen, "3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data," Advances in Neural Information Processing Systems 31, 10381–10392 (2018).
30 S. Batzner, T. E. Smidt, L. Sun, J. P. Mailoa, M. Kornbluth, N. Molinari, and B. Kozinsky, "SE(3)-Equivariant Graph Neural Networks for Data-Efficient and Accurate Interatomic Potentials," arXiv preprint arXiv:2101.03164 (2021).
31 R. Kondor and S. Trivedi, "On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups," International Conference on Machine Learning, 2747–2755 (2018).
32 B. Jing, S. Eismann, P. Suriana, R. J. L. Townshend, and R. Dror, "Learning from protein structure with geometric vector perceptrons," International Conference on Learning Representations (2021).
33 G. E. Hinton, A. Krizhevsky, and S. D. Wang, "Transforming auto-encoders," International Conference on Artificial Neural Networks, 44–51 (2011).
34 J. Klicpera, J. Groß, and S. Günnemann, "Fast and Uncertainty-Aware Directional Message Passing for Non-Equilibrium Molecules," Machine Learning for Molecules Workshop at NeurIPS (2020).
35 S. N. Pozdnyakov, M. J. Willatt, A. P. Bartók, C. Ortner, G. Csányi, and M. Ceriotti, "Incompleteness of atomic structure representations," Phys. Rev. Lett. 125, 166001 (2020).
36 A. Hermann, R. P. Krawczyk, M. Lein, P. Schwerdtfeger, I. P. …
37 A. P. Bartók, R. Kondor, and G. Csányi, "On representing chemical environments," Phys. Rev. B 87, 184115 (2013).
38 M. Gastegger, K. T. Schütt, and K.-R. Müller, "Machine learning …"
39 J. Behler, "Atom-centered symmetry functions for constructing high-dimensional neural network potentials," J. Chem. Phys. 134, 074106 (2011).
40 M. Gastegger, J. Behler, and P. Marquetand, "Machine learning molecular dynamics for the simulation of infrared spectra," Chem. Sci. 8, 6924–6935 (2017).
41 M. Veit, D. M. Wilkins, Y. Yang, R. A. DiStasio Jr, and M. Ceriotti, "Predicting molecular dipole moments by combining atomic partial charges and atomic dipoles," J. Chem. Phys. 153, 024113 (2020).
42 A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., "PyTorch: An imperative style, high-performance deep learning library," arXiv preprint arXiv:1912.01703 (2019).
43 K. Schütt, P. Kessel, M. Gastegger, K. Nicoli, A. Tkatchenko, and K.-R. Müller, "SchNetPack: A deep learning toolbox for atomistic systems," J. Chem. Theory Comput. 15, 448–455 (2018).
44 D. P. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization," arXiv preprint arXiv:1412.6980 (2014).
45 R. Ramakrishnan, P. O. Dral, M. Rupp, and O. A. von Lilienfeld, "Quantum chemistry structures and properties of 134 kilo molecules," Sci. Data 1, 1–7 (2014).
46 https://github.com/klicperajo/dimenet.
47 A. S. Christensen and O. A. von Lilienfeld, "On the role of gradients for machine learning of molecular energies and forces," Mach. Learn.: Sci. Technol. 1, 045018 (2020).
48 M. Thomas, M. Brehm, R. Fligg, P. Vöhringer, and B. Kirchner, "Computing vibrational spectra from ab initio molecular dynamics," Phys. Chem. Chem. Phys. 15, 6608–6622 (2013).
49 H. E. Sauceda, V. Vassilev-Galindo, S. Chmiela, K.-R. Müller, and A. Tkatchenko, "Dynamical strengthening of covalent and non-covalent molecular interactions by nuclear quantum effects at finite temperature," Nat. Commun. 12, 1–10 (2021).
50 P. Linstrom and W. G. Mallard, NIST Chemistry WebBook, NIST Standard Reference Database Number 69 (National Institute of Standards and Technology, Gaithersburg, MD, 2020), doi:10.18434/T4D303 (retrieved September 24, 2020).
51 J. Kiefer, "Simultaneous acquisition of the polarized and depolarized Raman signal with a single detector," Anal. Chem. 89, 5725–5728 (2017).
52 N. Gebauer, M. Gastegger, and K. Schütt, "Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules," Advances in Neural Information Processing Systems, 7564–7576 (2019).
53 J. Köhler, L. Klein, and F. Noé, "Equivariant Flows: sampling configurations for multi-body systems with symmetric energies," Proceedings of the 37th International Conference on Machine Learning (2020).
54 G. N. Simm, R. Pinsler, G. Csányi, and J. M. Hernández-Lobato, "…"
63 F. Weigend and R. Ahlrichs, "Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy," Phys. Chem. Chem. Phys. 7, 3297–3305 (2005).
64 F. Neese, "The ORCA program system," WIREs Comput. Mol. Sci. …
69 R. B. Blackman and J. W. Tukey, "The measurement of power spectra …"
Data set    batch size   learning rate   decay patience   stopping patience   rcut [Å]
QM9         100          5 · 10⁻⁴        5                30                  5.0
MD17        10           1 · 10⁻³        50               150                 5.0
Ferrocene   10           1 · 10⁻³        10               30                  2.5–4.0
Spectra     10           5 · 10⁻⁴        15               50                  2.7
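For convenience, the table above translates into a configuration like the following sketch; the dictionary layout and key names are ours, the patience values are assumed to be counted in validation epochs, and training otherwise uses the Adam optimizer44 with learning-rate decay and early stopping.

```python
# Training hyperparameters from the table above (sketch; key names are illustrative).
TRAIN_CONFIG = {
    "QM9":       dict(batch_size=100, lr=5e-4, decay_patience=5,  stop_patience=30,  r_cut=5.0),
    "MD17":      dict(batch_size=10,  lr=1e-3, decay_patience=50, stop_patience=150, r_cut=5.0),
    "Ferrocene": dict(batch_size=10,  lr=1e-3, decay_patience=10, stop_patience=30,  r_cut=(2.5, 4.0)),  # 2.5-4.0 Å
    "Spectra":   dict(batch_size=10,  lr=5e-4, decay_patience=15, stop_patience=50,  r_cut=2.7),
}
```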