Available online at www.sciencedirect.com

ScienceDirect

Simulations meet machine learning in structural biology

Adrià Pérez1, Gerard Martínez-Rosell1 and Gianni De Fabritiis1,2

Classical molecular dynamics (MD) simulations will be able to reach sampling in the second timescale within five years, producing petabytes of simulation data at current force field accuracy. Notwithstanding this, MD will still be in the regime of low-throughput, high-latency predictions with average accuracy. We envisage that machine learning (ML) will be able to solve both the accuracy and time-to-prediction problem by learning predictive models using expensive simulation data. The synergies between classical, quantum simulations and ML methods, such as artificial neural networks, have the potential to drastically reshape the way we make predictions in computational structural biology and drug discovery.

Addresses
1 Computational Biophysics Laboratory (GRIB-IMIM), Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Doctor Aiguader 88, 08003 Barcelona, Spain
2 Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, Barcelona 08010, Spain

Corresponding author: De Fabritiis, Gianni ([email protected])

Current Opinion in Structural Biology 2018, 49:139–144
This review comes from a themed issue on Theory and simulation
Edited by Robert Best and Kresten Lindorff-Larsen
For a complete overview see the Issue and the Editorial
Available online 21st February 2018
https://doi.org/10.1016/j.sbi.2018.02.004
0959-440X/© 2018 Elsevier Ltd. All rights reserved.

Introduction
Molecular dynamics (MD) simulations are one of the predominant techniques to study protein dynamics. MD is often used to capture dynamical processes of proteins across different timescales with atomistic details in order to rationalize biological phenomena. Despite the potential to become a surrogate model of real protein dynamics, some important issues still remain to be solved, mainly: high computational cost and sampling limitations [1] and force field accuracy [2–4].

Classical MD simulations constitute a balance between accuracy and efficiency. For example, quantum-level phenomena such as enzymatic reactions, polarizability and proton transfers are neglected in exchange for computational speed. Commonly used force fields, based on a parameterization of a closed form potential, are fast to compute, but use approximations that forfeit accuracy. The extent to which these limitations may affect the validity of the results depends on the system and the biological question at hand. Quantum mechanics (QM) calculations can be used to obtain an accurate description of a molecule, but are computationally demanding and very limited in terms of sampling. Ideally, one would like to simulate at quantum level accuracy, which describes the physics and chemistry precisely, but at the sampling regime of current classical simulations.

The first simulation of protein dynamics dates from 1977 and consisted of a 9.2 ps trajectory of the bovine pancreatic trypsin inhibitor (BPTI) in vacuum [5]. In 2010, Shaw et al. [6] reported a 1 ms trajectory of the same protein in explicit solvent, which constitutes a 100-million-fold increase in trajectory length compared to the first simulation. In 30 years, MD simulations have increased sampling capabilities over 8 orders of magnitude, with increasing accuracy in the force fields [2–4]. In the last 10 years, MD has evolved from single simulations [7–9] to high-throughput molecular dynamics studies [10–15,16], where hundreds of microseconds of simulations are computed in independent trajectories to obtain converged statistics. Software and hardware innovations, such as the implementation of MD codes for GPUs [17–20], distributed computing projects like Folding@home [21], GPUGRID [22] and the development of special-purpose supercomputers like ANTON [23], are steadily decreasing the computational cost of molecular simulations. Additionally, the development of adaptive sampling schemes has introduced more efficient ways to sample conformational space, decreasing the number of simulations needed [24–26].

In a recent review we estimated that MD would reach seconds of aggregated sampling using commodity hardware by 2022 [27] (Figure 1a), generating petabytes of simulation data. For instance, the file size of one second of simulation data of a 60 000-atom system (e.g. a GPCR system) at 0.1 ns per frame is 7.2 petabytes (reduced to a third using compressed trajectory file formats). This amount of data constitutes a valuable source of information, but the knowledge extracted from it is mainly used to rationalize a particular protein system at hand, not to generalize it to other systems. In this review, we envision a paradigm change in the near future where expensive simulations (QM and MD) are not used to predict but to learn models, so that further predictions can be drawn using ML approaches. By doing so, the large computational cost required by simulations becomes justifiable, in particular if the results are more accurate by the use of more expensive simulation methods.


Figure 1

[Figure 1 image not reproduced. Panel (a), MD data generation: simulation length (ms, logarithmic axis from 0.1 to 1,000) against publication year (2010 to 2022), with the file-size estimate 10^10 frames (1 second at 0.1 ns per frame) × ~60,000 atoms (GPCR system) × 12 bytes (one atom's 3D coordinates) = 7.2 petabytes. Panel (b), QM/ML: a neural network takes a molecule dihedral and predicts its energies; QM and ML predicted energies are plotted against Phi (−150 to 150). Panel (c), MD as data augmentation tool: PDB (59,805 valid protein–ligand structures reported in PDBBind), PDBBind ("general set" of 17,900 protein–ligand structures with annotated affinity) and BindingDB (1,419,347 protein–ligand affinities); MD connects available structure and available affinity data to protein–ligand binding pose and binding affinity.]

Overview of a combined simulation and machine learning approach. (a) MD data generation is expected to reach the second aggregated timescale, with output files of several petabytes, by 2022, based on a trend of maximum aggregated time per paper per year using the ACEMD software. Chart adapted from [27]. Referenced publications correspond to [12,13,15,29,56–58]. (b) A first example of ML replacing QM to predict dihedral energies given a neural network trained with QM simulations. (c) An example of data augmentation by MD: augment protein–ligand binding poses for a set of protein–ligand pairs with unknown binding mode; augment binding affinity data for a set of resolved protein–ligand complex structures of unknown affinities.

Machine learning applied to structural biology
ML approaches are not new in simulation analysis. For instance, the common analysis pipeline for MD simulations involves dimensionality reduction [28–33] and clustering algorithms.

In the last few years, ML applications have grown exponentially. One of the main factors driving this growth is the broad popularization of a particular type of ML called deep neural networks [34,35]. An artificial neural network (NN) is a simple mathematical framework organized in layers, each of them performing a matrix multiplication and a non-linear function of the input variables x. The output of a single neuron f in each layer is given by $f = \phi(w^{T} x + b)$, where w are learnable weights, b is a bias and $\phi$ is some nonlinear function. NNs can have from several to hundreds of nested layers, in which case they are called “deep”. Given enough parameters, a NN is capable of interpolating any continuous function [36,37].

The application of NN models in computational biology is steadily increasing [38]. For instance, the Merck molecular activity challenge demonstrated the potential of deep neural network models in drug discovery [39]. DeepTox [40] is a deep learning-based model for toxicity prediction of compounds, which won the Tox21 toxicology prediction challenge in 2014 by a large margin. Variational autoencoders [41], a generative flavor of deep NNs, were recently applied to convert discrete representations of molecules to and from a multidimensional continuous representation [42], allowing for efficient search and optimization through open-ended spaces of chemical compounds. Additionally, autoencoders have also been used for dimensionality reduction in MD [43–45]. VAMPnets [46] fit a Markov state model from the system-specific molecular simulation data. NNs have also been used to reproduce the free-energy surface of molecules [47]. Deep convolutional neural networks (CNN) [48] have become increasingly popular due to their performance in machine vision, a property that has been exploited by us and others to apply them to structural biology by treating proteins as 3D images. CNNs have been used for ligand binding site detection [49], ligand pose prediction [50], ligand active/inactive classification [51], ligand binding affinity prediction [52] and protein design [53]. Also, the DeepChem software [54] and the MoleculeNet challenge [55] provide multiple featurization algorithms and access to relevant QSAR prediction datasets.


Case 1: ML models to predict quantum forces using QM simulation data
One important application case of ML is apparent in quantum mechanics. QM simulations are notoriously computationally expensive and, depending on the level of theory, scale poorly with the number of atoms of the system [59]. It is therefore not surprising that there have been efforts to interpolate the QM many-body potential with NNs to obtain a predictive model of QM forces.

Many studies on approximating QM with NNs were performed before the recent NN resurgence. In particular, Behler et al. [60,61,62] contributed significantly to the field for small molecules and on ways to fit quantum observables, for example, infrared spectra [63]. The initial effort went into providing usable symmetry functions that could guarantee basic physical principles, like translational and rotational invariance [64] and invariance to atom reordering, in the learned potentials. Transferability, however, was limited until recently [65].

In [65,66,67,68], NNs are trained from QM simulation data to generate the potential energy surface and forces for small molecules, generalizing to unseen molecules, including some preliminary tests on proteins. In the same way as MD force fields do, forces are true derivatives of the interpolated potential energy surface using the gradients of the NN, and can be used to run dynamics. This guarantees that the forces produced by the NNs yield a conservative field [69]. The QM energy potential is therefore learned with the accuracy of first-principle based methods, using generated datasets for many molecules. The computational cost of generating the datasets is of course very large, but once trained, the NN inference cost is many orders of magnitude lower than that of the QM computational model and comparable to that of standard classical MD (Figure 1b).

Case 2: ML models to predict binding affinities using MD simulation data
MD software for GPU has made simulations of full protein–ligand binding processes possible, allowing the prediction of thermodynamic and kinetic properties [12,13]. At the moment, a trade-off between accuracy and sampling restricts the applicability of MD compared to other methods commonly used in drug design, like docking, which is less accurate but significantly faster. Even if the sampling problem is solved via brute force, MD does not currently guarantee that the results are correct because of the approximations of the force fields. The last point can be mitigated by the use of QM/ML force fields in the future.

Recently, we explored the use of machine vision NN models for binding affinity prediction. KDEEP [52] is an ML model for predicting binding affinities that consists of a CNN trained on the PDBBind database [70]. The method shows an overall good mean correlation (0.82) when tested against the PDBBind’s core set. This set contains several targets clustered by sequence similarity, in order to define a representative, non-redundant subset of proteins. For a few of these protein clusters, however, the correlation disappears or even becomes negative. This fact might be explained by a lack of training data for specific protein pockets, which ultimately leads to a poor generalization in these cases. KDEEP is structure-based, that is, it requires labelled data in the form of the structure of the protein–ligand complex and its affinity. One way to address this issue would be to extend the available training datasets by obtaining new affinity data or structures, either experimentally or computationally (Figure 1c). Experiments are of course a possibility, viable for pharmaceutical companies and some academic groups. Here we prefer to look at the computational options, which can be more automated in active learning methods and are likely to become exponentially cheaper in the future.

A potential synergy between MD and ML would improve the accuracy of predictive NN models, delivering predictions several orders of magnitude faster than simulations. This level of performance is needed for large prediction studies in drug design, where thousands of molecules need to be evaluated, such as in virtual screening. As for training KDEEP, the two most popular binding affinity databases are PDBBind [70] and Binding MOAD [71]. PDBBind’s latest release (v2017) screened the 124 962 structures in the PDB database [72] (as of Jan 1st, 2017) and identified 59 805 valid molecular complex structures, classified into four main categories: protein–small ligand, nucleic acid–small ligand, protein–nucleic acid and protein–protein complexes. From this set of structures, they defined the general set, providing binding affinity data (KD/KI and IC50) for a total of 17 900 biomolecular complexes in the PDB database, including protein–ligand (14 761), nucleic acid–ligand (121), protein–nucleic acid (837), and protein–protein complexes (2181). The other dataset, Binding MOAD, contains binding information for 9142 structures, 6862 of which overlap with PDBBind. This makes a total of 20 065 co-crystal structures with binding data, out of the 59 805 complex structures detected in the PDB. A naive example of synergy could be to increase the available affinity data for the remaining 39 740 structures by simulations. This, however, would be very expensive and arguably impractical in a prediction study. Yet in the context of generating a database for training NN models, it only needs to be performed once, and possibly at very high accuracy using QM/ML-based force fields to obtain very accurate data. The resulting NN will be used for predictions. Another possible example comes from the BindingDB dataset [73], which contains about 1 419 347 binding data points for 7000 proteins and over 635 301 drug-like molecules, but for most of the protein–ligand pairs there is no co-crystal structural
information. To fill up this gap, simulations could be used to predict ligand binding poses. As a rough estimation, if approximately 10 μs are needed to obtain the ligand binding pose for a protein–ligand pair using adaptive sampling methods [26], with 1 s of aggregate time one could generate 100 000 new predicted protein–ligand structures over the course of one year [27]. This augmented dataset, built at high computational and time cost, is then used for learning fast predictive models, for example, KDEEP.

Discussion
In this article we illustrate how generated data produced by simulations might be used to develop new and better predictive ML models. Dataset generation is not constrained by the need for fast return times, which means that better simulation methods can be used, while ML is used to obtain fast predictions. One existing example of this approach is QM simulations of biomolecules, used to generate data for learning a NN QM potential, a paradigm that could improve on classical force fields in the near future. A further possible example, built upon the experience obtained in KDEEP, is where simulations are used as a data augmentation tool, delegating the binding affinity prediction to ML-based methods.

Acknowledgements
The authors thank Acellera Ltd. for funding. G.D.F. acknowledges support from MINECO (BIO2017-82628-P) and FEDER, as well as ‘Unidad de Excelencia María de Maeztu’, funded by MINECO (MDM-2014-0370). The authors thank the European Union’s Horizon 2020 research and innovation programme under grant agreement No 675451 (CompBioMed project).

References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
of special interest
of outstanding interest

1. Freddolino PL, Harrison CB, Liu Y, Schulten K: Challenges in protein-folding simulations. Nat Phys 2010, 6:751-758.
2. Beauchamp KA, Lin YS, Das R, Pande VS: Are protein force fields getting better? A systematic benchmark on 524 diverse NMR measurements. J Chem Theory Comput 2012, 8:1409-1414.
3. Lindorff-Larsen K, Maragakis P, Piana S, Eastwood MP, Dror RO, Shaw DE: Systematic validation of protein force fields against experimental data. PLoS ONE 2012, 7.
4. Piana S, Klepeis JL, Shaw DE: Assessing the accuracy of physical models used in protein-folding simulations: quantitative evidence from long molecular dynamics simulations. Curr Opin Struct Biol 2014, 24:98-105.
5. McCammon JA, Gelin BR, Karplus M: Dynamics of folded proteins. Nature 1977, 267:585-590.
6. Shaw DE, Maragakis P, Lindorff-Larsen K, Piana S, Dror RO, Eastwood MP, Bank JA, Jumper JM, Salmon JK, Shan Y, Wriggers W: Atomic-level characterization of the structural dynamics of proteins. Science 2010, 330:341-346.
7. Duan Y: Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution. Science 1998, 282:740-744.
8. Grossfield A, Pitman MC, Feller SE, Soubias O, Gawrisch K: Internal hydration increases during activation of the G-protein-coupled receptor rhodopsin. J Mol Biol 2008, 381:478-486.
9. Dror RO, Arlow DH, Borhani DW, Jensen MO, Piana S, Shaw DE: Identification of two distinct inactive conformations of the β2-adrenergic receptor reconciles structural and biochemical observations. Proc Natl Acad Sci U S A 2009, 106:4689-4694.
10. Snow CD, Zagrovic B, Pande VS: The Trp cage: folding kinetics and unfolded state topology via molecular dynamics simulations. J Am Chem Soc 2002, 124:14548-14549.
11. Noé F, Schütte C, Vanden-Eijnden E, Reich L, Weikl TR: Constructing the equilibrium ensemble of folding pathways from short off-equilibrium simulations. Proc Natl Acad Sci U S A 2009, 106:19011-19016.
12. Buch I, Giorgino T, De Fabritiis G: Complete reconstruction of an enzyme-inhibitor binding process by molecular dynamics simulations. Proc Natl Acad Sci U S A 2011, 108:10184-10189.
13. Ferruz N, Harvey MJ, Mestres J, De Fabritiis G: Insights from fragment hit binding assays by molecular simulations. J Chem Inf Model 2015, 55:2200-2205.
14. Pan AC, Xu H, Palpant T, Shaw DE: Quantitative characterization of the binding and unbinding of millimolar drug fragments with molecular dynamics simulations. J Chem Theory Comput 2017, 13:3372-3377.
15. Stanley N, Pardo L, De Fabritiis G: The pathway of ligand entry from the membrane bilayer to a lipid G protein-coupled receptor. Sci Rep 2016, 6:22639.
16. Plattner N, Doerr S, De Fabritiis G, Noé F: Complete protein–protein association kinetics in atomic detail revealed by molecular dynamics simulations and Markov modelling. Nat Chem 2017.
Plattner et al. managed to simulate barnase–barstar protein–protein association.
17. Friedrichs MS, Eastman P, Vaidyanathan V, Houston M, Legrand S, Beberg AL, Ensign DL, Bruns CM, Pande VS: Accelerating molecular dynamic simulation on graphics processing units. J Comput Chem 2009, 30:864-872.
18. Harvey MJ, De Fabritiis G: An implementation of the smooth particle mesh Ewald method on GPU hardware. J Chem Theory Comput 2009, 5:2371-2377.
19. Harvey MJ, Giupponi G, De Fabritiis G: ACEMD: accelerating biomolecular dynamics in the microsecond time scale. J Chem Theory Comput 2009, 5:1632-1639.
20. Eastman P, Swails J, Chodera JD, McGibbon RT, Zhao Y, Beauchamp KA, Wang LP, Simmonett AC, Harrigan MP, Stern CD et al.: OpenMM 7: rapid development of high performance algorithms for molecular dynamics. PLoS Comput Biol 2017, 13.
21. Shirts M, Pande VS: COMPUTING: screen savers of the world unite! Science 2000, 290:1903-1904.
22. Buch I, Harvey MJ, Giorgino T, Anderson DP, De Fabritiis G: High-throughput all-atom molecular dynamics simulations using distributed computing. J Chem Inf Model 2010, 50:397-403.
23. Shaw DE, Chao JC, Eastwood MP, Gagliardo J, Grossman JP, Ho CR, Ierardi DJ, Kolossváry I, Klepeis JL, Layman T et al.: Anton, a special-purpose machine for molecular dynamics simulation. Commun ACM 2008, 51:91.
24. Singhal N, Pande VS: Error analysis and efficient sampling in Markovian state models for molecular dynamics. J Chem Phys 2005, 123.
25. Hinrichs NS, Pande VS: Calculation of the distribution of eigenvalues and eigenvectors in Markovian state models for molecular dynamics. J Chem Phys 2007, 126.
26. Doerr S, De Fabritiis G: On-the-fly learning and sampling of ligand binding by high-throughput molecular simulations. J Chem Theory Comput 2014, 10:2064-2069.
27. Martínez-Rosell G, Giorgino T, Harvey MJ, de Fabritiis G: Drug discovery and molecular dynamics: methods, applications
and perspective beyond the second timescale. Curr Top Med Chem 2017, 17:2617-2625.
28. Noé F, Nüske F: A variational approach to modeling slow processes in stochastic dynamical systems. Multiscale Model Simul 2013, 11:635-655.
29. Pérez-Hernández G, Paul F, Giorgino T, De Fabritiis G, Noé F: Identification of slow molecular order parameters for Markov model construction. J Chem Phys 2013, 139.
30. Schwantes CR, Pande VS: Improvements in Markov State Model construction reveal many non-native interactions in the folding of NTL9. J Chem Theory Comput 2013, 9:2000-2009.
31. Amadei A, Linssen ABM, Berendsen HJC: Essential dynamics of proteins. Proteins Struct Funct Bioinform 1993, 17:412-425.
32. Lange OF, Grubmüller H: Can principal components yield a dimension reduced description of protein dynamics on long time scales? J Phys Chem B 2006, 110:22842-22852.
33. David CC, Jacobs DJ: Principal component analysis: a method for determining the essential dynamics of proteins. Methods Mol Biol 2014, 1084:193-226.
34. LeCun Y, Bengio Y, Hinton G: Deep learning. Nature 2015, 521:436-444.
35. Schmidhuber J: Deep Learning in neural networks: an overview. Neural Netw 2015, 61:85-117.
36. Hornik K: Approximation capabilities of multilayer feedforward networks. Neural Netw 1991, 4:251-257.
37. Andoni A, Panigrahy R, Valiant G, Zhang L: Learning polynomials with neural networks. In Proceedings of the 31st International Conference on Machine Learning, vol 330. 2014:1-9.
38. Angermueller C, Pärnamaa T, Parts L, Stegle O: Deep learning for computational biology. Mol Syst Biol 2016, 12:878.
39. Dahl GE, Jaitly N, Salakhutdinov R: Multi-task Neural Networks for QSAR Predictions. arXiv; 2014:1-21.
40. Mayr A, Klambauer G, Unterthiner T, Hochreiter S: DeepTox: toxicity prediction using deep learning. Front Environ Sci 2016, 3.
41. Kingma DP, Welling M: Auto-Encoding Variational Bayes. arXiv; 2013:1-14.
42. Gómez-Bombarelli R, Duvenaud D, Hernández-Lobato JM, Aguilera-Iparraguirre J, Hirzel TD, Adams RP, Aspuru-Guzik A: Automatic Chemical Design Using a Data-driven Continuous Representation of Molecules. arXiv; 2016:1-28.
43. Wehmeyer C, Noé F: Time-lagged Autoencoders: Deep Learning of Slow Collective Variables for Molecular Kinetics. arXiv; 2017:1-8.
44. Doerr S, Ariz-Extreme I, Harvey MJ, De Fabritiis G: Dimensionality Reduction Methods for Molecular Simulations. arXiv; 2017:1-11.
45. Hernández CX, Wayment-Steele HK, Sultan MM, Husic BE, Pande VS: Variational Encoding of Complex Dynamics. arXiv; 2017:1-12.
46. Mardt A, Pasquali L, Wu H, Noé F: VAMPnets: Deep Learning of Molecular Kinetics. arXiv; 2017:1-14.
47. Schneider E, Dai L, Topper RQ, Drechsel-Grau C, Tuckerman ME: Stochastic neural network approach for learning high-dimensional free energy surfaces. Phys Rev Lett 2017, 119.
48. Krizhevsky A, Sutskever I, Hinton GE: ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 2012, 60:84-90.
49. Jiménez J, Doerr S, Martínez-Rosell G, Rose AS, De Fabritiis G: DeepSite: protein-binding site predictor using 3D-convolutional neural networks. Bioinformatics 2017, 33:3036-3042.
DeepSite is a deep convolutional neural network trained with protein structural data, treating proteins as 3D images. The network predicts the presence of druggable pockets, and demonstrates superior performance to the state-of-the-art.
50. Ragoza M, Hochuli J, Idrobo E, Sunseri J, Koes DR: Protein–ligand scoring with convolutional neural networks. J Chem Inf Model 2017, 57:942-957.
51. Wallach I, Dzamba M, Heifets A: AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery. arXiv; 2015:1-11.
52. Jiménez J, Škalič M, Martínez-Rosell G, De Fabritiis G: KDEEP: protein–ligand absolute binding affinity prediction via 3D-convolutional neural networks. J Chem Inf Model 2018:1-26 http://dx.doi.org/10.1021/acs.jcim.7b00650. (in press).
KDEEP is a deep convolutional neural network trained over the PDBBind’s dataset, treating proteins as 3D images, to perform predictions on protein–ligand binding affinity.
53. Torng W, Altman RB: 3D deep convolutional neural networks for amino acid environment similarity analysis. BMC Bioinform 2017, 18.
54. DeepChem, Deepchem, a python library democratizing deep learning for science. http://www.deepchem.io (accessed 21.09.17).
DeepChem is a Python library that aims to provide a high-quality open source tool for deep learning applied to computational chemistry, making the usage of deep neural networks more accessible for drug discovery, materials science, quantum chemistry and biology.
55. Wu Z, Ramsundar B, Feinberg EN, Gomes J, Geniesse C, Pappu AS, Leswing K, Pande V: MoleculeNet: A Benchmark for Molecular Machine Learning. arXiv; 2017:1-39.
56. Sadiq SK, de Fabritiis G: Explicit solvent dynamics and energetics of HIV-1 protease flap opening and closing. Proteins Struct Funct Bioinform 2010, 78:2873-2885.
57. Sadiq SK, Noé F, De Fabritiis G: Kinetic characterization of the critical step in HIV-1 protease maturation. Proc Natl Acad Sci U S A 2012, 109:20449-20454.
58. Stanley N, Esteban-Martín S, De Fabritiis G: Kinetic modulation of a disordered protein domain by phosphorylation. Nat Commun 2014, 5.
59. Carleo G, Troyer M: Solving the quantum many-body problem with artificial neural networks. Science 2017, 355:602-606.
60. Behler J, Parrinello M: Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys Rev Lett 2007, 98.
One of the first contributions on learning the potential energy surface of molecules using neural networks.
61. Behler J: Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J Chem Phys 2011, 134:074106.
62. Behler J: Constructing high-dimensional neural network potentials: a tutorial review. Int J Quantum Chem 2015, 115:1032-1050.
63. Gastegger M, Behler J, Marquetand P: Machine learning molecular dynamics for the simulation of infrared spectra. Chem Sci 2017, 8:6924-6935.
64. Boomsma W, Frellsen J: Spherical convolutions and their application in molecular modelling. Neural Inf Process Syst (NIPS) 2017.
65. Smith JS, Isayev O, Roitberg AE: ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem Sci 2017, 8:3192-3203.
In this paper they present ANI-1, a neural network trained with QM simulation data to generate the potential energy surface and forces for small molecules.
66. Yao K, Herr JE, Toth DW, Mcintyre R, Parkhill J: The TensorMol-0.1 Model Chemistry: A Neural Network Augmented with Long-Range Physics. arXiv; 2017:1-8.
TensorMol is a neural network potential trained over quantum mechanics simulations that is able to generate the potential energy surface and forces of small molecules.
67. Zhang L, Han J, Wang H, Car R, E W: Deep Potential Molecular Dynamics: A Scalable Model with the Accuracy of Quantum Mechanics. arXiv; 2017:1-22.
68. Faber FA, Hutchison L, Huang B, Gilmer J, Schoenholz SS, Dahl GE, Vinyals O, Kearnes S, Riley PF, Von Lilienfeld OA: Prediction errors of molecular machine learning models lower than hybrid DFT error. J Chem Theory Comput 2017, 13:5255-5264.
69. Chmiela S, Tkatchenko A, Sauceda HE, Poltavsky I, Schütt KT, Müller K-R: Machine learning of accurate energy-conserving molecular force fields. Sci Adv 2017, 3.
70. Liu Z, Su M, Han L, Liu J, Yang Q, Li Y, Wang R: Forging the basis for developing protein–ligand interaction scoring functions. Acc Chem Res 2017, 50:302-309.
71. Ahmed A, Smith RD, Clark JJ, Dunbar JB Jr, Carlson HA: Recent improvements to Binding MOAD: a resource for protein–ligand binding affinities and structures. Nucleic Acids Res 2015, 43:D465-D469.
72. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The protein data bank. Nucleic Acids Res 2000, 28:235-242.
73. Gilson MK, Liu T, Baitaluk M, Nicola G, Hwang L, Chong J: BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res 2016, 44:D1045-D1053.

Current Opinion in Structural Biology 2018, 49:139–144 www.sciencedirect.com

