Advances of Machine Learning in Materials Science: Ideas and Techniques
1 Department of New Energy Science and Engineering, Xiamen University Malaysia, Sepang 43900, Malaysia
2 Engineering Research Center of Micro-nano Optoelectronic Materials and Devices, Ministry of Education;
Fujian Key Laboratory of Semiconductor Materials and Applications, CI Center for OSED,
and Department of Physics, Xiamen University, Xiamen 361005, China
*These two authors contributed equally.
ABSTRACT
In this big data era, the use of large datasets in conjunction with machine learning (ML) has become increasingly popular in both industry and academia. In recent times, the field of materials science has also been undergoing a big data revolution, with large databases and repositories appearing everywhere. Traditionally, materials science has been a trial-and-error field, in both its computational and experimental branches. With the advent of machine learning-based techniques, there has been a paradigm shift: materials can now be screened quickly using ML models and even generated based on materials with similar properties; ML has also quietly infiltrated many sub-disciplines of materials science. However, ML remains relatively new to the field and is expanding its reach quickly. There is a plethora of readily available big data architectures and an abundance of ML models and software; the call to integrate all these elements into a comprehensive research procedure is becoming an important direction of materials science research. In this review, we attempt to provide an introduction to and reference on ML for materials scientists, covering as much as possible the commonly used methods and applications, and discussing future possibilities.
Keywords machine learning, materials science
arXiv: 2307.14032
https://doi.org/10.1007/s11467-023-1325-z
Fig. 1 List of the conventional machine learning tasks and the problems tackled [68].
neural networks, materials science-based learning.

In ML, the "standard" or conventional learning tasks have been extensively studied; these include classification, regression, ranking, clustering, and dimensionality reduction or manifold learning [68]. The problems related to the above tasks are listed in Fig. 1. The definitions and terminology commonly used in ML for the different learning stages are listed in Fig. 2. The typical stages of a learning process are also shown in Fig. 3, which can be briefly described as follows: with a given collection of labeled examples, one first divides the data/samples into three groups, namely, training samples, validation data and test samples; then the relevant features associated with the desired properties are chosen, which are next used to train the pre-determined learning algorithm. This is done by adjusting the hyperparameters Θ in order to ensure that the hypothesis Θ0 has the best performance on the validation sample. Typical learning scenarios include supervised learning, unsupervised learning, semi-supervised learning, transductive inference, on-line learning, reinforcement learning, active learning, and other more complex learning scenarios. Different from traditional data analysis, ML is fundamentally about generalization [68]. Spectacularly, neural network-based ML is able to approximate functions in very high dimensions with unprecedented efficiency and accuracy [2], and it can therefore be used for complex tasks in a wide range of applications.

3 Recent progress in machine learning

Recently, the ML community has seen breakthroughs in many traditional AI tasks and classical challenging scientific tasks. This leap of improvement is being powered by new grounds in the underlying theory and the overall implementation and architecture, as well as the massive surge in data and data infrastructures. This section covers the advancement of ideas from various areas of application of Artificial Intelligence – Natural Language Processing (NLP), Computer Vision (CV), Reinforcement Learning (RL), Explainable Artificial Intelligence (XAI), etc.

3.1 Classical machine learning application areas

In the field of natural language processing and understanding, ML models have made huge progress with attention-based Transformer networks [69] and pre-training techniques. SuperGLUE [70] is a natural language understanding benchmark consisting of many tasks, which requires in-depth understanding of short prose passages and sentences. With superhuman performances on the SuperGLUE benchmarks, it has been demonstrated that ML is able to model both the understanding of natural language and the generation of relevant natural language in context. The technique that has led to this leap in performance is pre-training [71], which refers to "training a model with one task to help it form parameters that can be used in other tasks". Prompt learning is a form of ML that works with large models, learning knowledge from a language model simply by prompting the learnt model with various types of prompts. BERT-like models have also been extended to process data from realms outside natural language, like programming languages, e.g., CodeBERT [72], and images [73], and have been very successful in these realms too. Table 1 lists works relevant to several main ideas in machine learning for Natural Language Processing (NLP).

Unsupervised learning has made strides in computer vision tasks, with models being able to identify subjects in video, or identify poses of objects from point clouds in
learning models, which effectively decompose the cooperation task into role-learning by large neural networks [142], amongst many other techniques [143, 144]. This is a breakthrough for multi-agent reinforcement learning.

3.2 On quantum machine learning

Quantum ML is one of the big next steps of ML [145]. While error-correction still limits our ability to build a fully quantum computer, it is possible to innovate with hybrid algorithms that use quantum sub-algorithms or components to speed up or robustify ML, or simply to expand the theoretical boundaries of ML with 2-norm probabilities. In quantum computing, we can compute the similarity between feature vectors with state overlaps (denoted by bra and ket) instead of kernels via inner products. Consider a simple quantum ML scenario below:

$$K(x, x') = \langle \phi(x), \phi(x')\rangle \;\rightarrow\; \mathrm{QKE}(x, x') = \langle \phi(x) \mid \phi(x')\rangle. \tag{3.1}$$

The feature space in quantum ML can be obtained by state preparation. For instance,

$$\phi: [x_1; x_2] \rightarrow (x_1|0\rangle + x_2|1\rangle)\otimes(x_1|0\rangle + x_2|1\rangle). \tag{3.2}$$

The corresponding circuit is denoted by

$$S_x^{AB}\left(|0\rangle_A \otimes |0\rangle_B\right) = S_x^A|0\rangle_A \otimes S_x^B|0\rangle_B. \tag{3.3}$$

We can have quantum kernel estimation [see Eq. (3.1)] [146], or quantum feature spaces [147] in hybrid ML algorithms or intermediate-scale hybrid machines [148]. This offers new insight into the types of kernels and linear algebra that we can use to improve ML in the classical sense. Quantum physics or chemistry can be more effectively simulated, in the primitive sense, using quantum ML algorithms. Hybrid ML coupled with quantum materials science is potentially an important stepping stone for materials scientists and computer scientists alike to innovate and research more efficiently.

3.3 Theory, explainable AI and verification

In classical computer science, the very hard case of the Travelling Salesman Problem (TSP), a classical NP problem, has been solved with very satisfactory results based on neural networks, which either blend with pre-training of a solver of a mini-TSP or use a reinforcement learning-based [149] strategy selector combined with heuristics. Other prominent NP problems like Maximum Independent Set (MIS) or Satisfiability Modulo Theories (SMT) have also been solved satisfactorily with ML-guided heuristic search [150]. This demonstrates that ML models have been able to push through boundaries that had been set forth by traditional theoretical computer science. This breakthrough has been made possible by effective latent representation learning of essential features of the problem itself and the solver. Explainable AI (XAI) techniques like Integrated Gradients (IG) [151], Local Interpretable Model-agnostic Explanations (LIME) [152], Shapley Additive Explanations (SHAP) [153], SimplEx [154] and various others have gained much attention. LIME attempts to identify hot areas in the image responsible for the features that result in the prediction. SimplEx [154] is an explainability technique that attempts to explain a prediction with linear combinations of samples drawn from the corpus of training data; the technique returns a combination of training samples that has contributed to the predictions. There are also efforts to incorporate explainability by adding a layer at the end of neural networks for capturing explainability information.
Explainable Graph Neural Network (GNN) techniques, which apply specifically to graph neural networks, are broadly classified into several classes: gradients/features-based methods such as Guided Backpropagation (BP) [155], perturbation-based methods such as GNNExplainer [156] and SubgraphX [157], decomposition-based methods, surrogate methods such as GraphLIME [158], and generation-based methods [159]. These GNN XAI techniques are well-suited for explaining feature importance for predictions at either the node level, edge level or graph level.

Verification is important for protecting neural network models against adversarial behaviours; adversarial behaviours can be characterized by ill-intent shifts of the planes of separation in the model, so that it is more likely to err on otherwise correctly classified samples, or by corrupting input samples with noise or otherwise. Neural network robustness verification techniques like Rectified Linear Unit-Plex (ReLUplex) [160] and alpha-beta CROWN [161] have also made huge progress. The latter is a numerical bound back-propagation technique, where the score boundaries for each class are back-propagated throughout the network to determine the overlap between class scores; specifically, in the non-linear portions of the neural network, the ReLU activation functions are bounded with linear functions. Safety-critical applications have also been secured with neural network verification techniques: the Airborne Collision Avoidance System for Unmanned Aircraft (ACAS Xu) [162] is an ensemble of 45 neural networks whose purpose is to give anti-collision advice to flying planes, and ReLUplex methods are utilized to make their advice robust.

3.4 Stack optimizations for deep learning

Graphical Processing Units (GPUs) are processors capable of processing instructions in parallel. Standard GPU deep learning speedup techniques include convolutional layer reuse, feature-map reuse and filter reuse, and memory access is a common bottleneck [163]. The basic idea is that functions that are computed many times should be optimized on all levels, from high to low, including the instruction set level. The entire software stack, compiler technologies, and code generation have been optimized for deep learning computations on GPUs. Deep learning on GPUs is known for its high energy usage; reducing energy usage is an essential objective of GPU optimization research [164]. The requirement for the scale of hardware architecture for ML is also loosening up, as engineers are packing engineering insights from large systems into smaller and energy-conserving systems such as TensorFlow Lite Micro [165].

ML theory and practice have made massive progress in recent years. ML is now transforming scientific methods and has become deeply integrated with many scientific and humanities [166] fields. Application-wise, ML models have been trusted to make more and more crucial decisions for the well-functioning of society. For instance, in the criminal justice setting [167], ML models have been used to set bail for defendants; in the finance sector, models can help make decisions [168]; in the energy sector, they predict power generation efficiency for wind power stations. While neural networks might still be black boxes and can be hard to verify at times, their effectiveness as predictors and sometimes generators has already been relied upon by many societal sectors for greater efficiency and effectiveness.

4 Development trend of machine learning for materials science

ML has helped materials scientists achieve their study aims in a wide variety of tasks, most prominently as a screening tool in the design of a large variety of materials, including energy materials, semiconductors, polymers, catalysts, high entropy alloys, etc. The trend of going from processing a single dataset to achieve a specific aim, to learning a latent representation of the underlying structure which can later be fine-tuned to perform specific tasks, such as predicting the energetically stable structure across datasets, is rather prominent.

4.1 From numerical analysis to feature engineering

Traditionally, ML has been used as an advanced numerical regression tool to analyse experimental data in materials science and many other fields [169, 170]. The remarkable ability of ML to interpolate data has allowed scientists to explain phenomena and verify hypotheses effectively. Traditional materials science ML practitioners often concern themselves with explicit feature engineering of specific materials [171]. Bhadeshia [171] has outlined four categories of models in materials science; traditionally, ML models are "models used to express data, reveal patterns, or for implementation in control algorithms". The classical works that involve material property prediction mostly fall into the fourth category. Figure 4 illustrates the feature engineering process for materials science, which encompasses four stages: feature extraction; feature analysis; correlation and importance analysis; and feature selection [172].

In the material space, there are many degrees of freedom, such as the atomic coordinates, coordination numbers, interatomic distances, and the positions of the various species. Often, they are impractical to use as direct inputs to the algorithms, as they are not invariant under translation and rotation. In feature extraction [see Fig. 4(a)], we seek to convert them into descriptors, which extract the underlying symmetry and distinguish systems that are truly different and not just a product of translations and/or rotations.
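As a toy illustration of such a conversion (a deliberately simple sketch, not any specific descriptor from the literature), the snippet below maps a set of Cartesian coordinates to the sorted vector of interatomic distances, which is unchanged under translations, rotations, reflections, and relabelling of the atoms:

```python
import numpy as np

def sorted_distance_descriptor(coords):
    """Map Cartesian coordinates (N, 3) to the sorted vector of pairwise distances.
    Sorting removes the dependence on atom ordering; the distances themselves are
    invariant under translation, rotation, and reflection."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]   # (N, N, 3) displacement vectors
    dist = np.linalg.norm(diff, axis=-1)             # (N, N) distance matrix
    iu = np.triu_indices(len(coords), k=1)           # each pair counted once
    return np.sort(dist[iu])

# A small toy geometry, and the same geometry rotated and translated:
mol = np.array([[0.00, 0.00, 0.00],
                [0.96, 0.00, 0.00],
                [-0.24, 0.93, 0.00]])
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0,            0.0,           1.0]])
print(sorted_distance_descriptor(mol))
print(sorted_distance_descriptor(mol @ rot.T + np.array([1.0, 2.0, 3.0])))  # identical output
```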
Fig. 4 Feature engineering for ML applications. (a) Feature extraction process: starting from the material space, one can extract information from the material space into chemical structures and then into the descriptor space. (b) Typical ML feature analysis methods. "FEWD" refers to the Filter method, Embedded method, Wrapper method, and Deep learning. (c) Correlation and importance analysis of the selected features. The feature correlations are visualized in the diagram on the left; the diagram on the right is a normalized version of the left one, where the colors indicate the relative correlation of every other feature for the prediction of the row/column feature. (d) Various feature subsets obtained from the feature engineering analysis. One can construct features from linearly independent combinations of subsets; in other words, the subsets of features form a basis. Reproduced with permission from Ref. [172].
After the features are extracted, they undergo a series of analyses to fine-tune and reduce the dimensionality of the descriptor space. The four commonly used methods, shown in Fig. 4(b), are the filter method, embedded method, wrapper method, and deep learning method. With the analysis process completed, a mapping, as illustrated in Fig. 4(c), which relates the importance of and correlations among the selected features, can be used to visualize their dependence. In turn, this aids the process of feature selection, in which many suitable subsets of features [see Fig. 4(d)] are chosen to proceed to the next stage – fed into the ML algorithm and compared to obtain the best performing minimal subset.

4.2 From feature engineering to representation learning

While explicit feature engineering is a practical and valuable task, it often restricts the type of task that ML can perform and does not fully use its ability to learn a generalized representation or a sound separation of features, nor its ability to interpolate or extrapolate along those dimensions. Moreover, the task of sifting through a vast dataset is laborious and hard to manage for individuals. Furthermore, with the ever-expanding computing power, the dimensionality of the features that is computationally feasible also rapidly scales up, allowing the consideration of more factors, which ultimately improves the accuracy of the prediction while also widening the coverage of material types screened. Thus, there is a push towards representation learning, an automation of the feature engineering of a large material dataset [173], which better captures the internal latent features [174]. This trend encourages a deeper integration of the development trends in ML and materials science, coupled with a concise selection of ML tools, which requires an intuitive understanding of the mathematical and theoretical computer science ideas behind these tools.

In representation learning, the features are automatically discovered and extracted from the raw data, and thus complicated patterns that are hidden from the human user but are highly relevant can boost the accuracy and efficiency of the ML model, which is highly dependent on the quality of the selected features. Therefore, representation learning excels in applications where the data dimensionality is high and feature extraction is difficult, such as speech recognition and signal processing, object recognition, and natural language processing [175].
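The correlation-and-selection step sketched in Fig. 4(c) and (d) can be written compactly; the feature names and the 0.95 threshold below are purely illustrative and not taken from Ref. [172]:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Toy feature table: "radius_b" is nearly redundant with "radius_a".
df = pd.DataFrame({
    "radius_a": rng.normal(1.5, 0.2, 200),
    "electronegativity": rng.normal(2.0, 0.5, 200),
    "valence": rng.integers(1, 7, 200).astype(float),
})
df["radius_b"] = df["radius_a"] * 1.02 + rng.normal(0, 0.01, 200)

corr = df.corr().abs()                                        # correlation map, cf. Fig. 4(c)
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
redundant = [c for c in upper.columns if (upper[c] > 0.95).any()]
selected = [c for c in df.columns if c not in redundant]      # one candidate subset, cf. Fig. 4(d)
print("drop:", redundant, "keep:", selected)
```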
Fig. 5 Infographic of an end-to-end model. End-to-end models take multi-modal datasets as inputs and encode them into vectors for the surrogate model. The surrogate model then learns the latent representation, which makes the internal patterns of these datasets indexable. One is then able to decode the latent representation into an output form of one's choice, which includes property predictions, generated novel materials and co-pilot simulation engines.
Neural networks can be packed into layers or attention blocks that can be integrated into a single neural network. Effective embedding of information, acting as a dimensionality-reduction tool, reduces the complexity of the model and, when appended to the training pipeline, brings us to end-to-end learning. Figure 5 shows a simplified pipeline for a materials science end-to-end model, where datasets are turned into vectors by the encoder to be used as the input for the surrogate model, which attempts to identify the latent representation that can be decoded to generate predictions.

Representation learning has been applied in materials science. By using raw experimental X-ray absorption near edge structure (XANES) spectra, Routh et al. [176] managed to obtain latent features after applying unsupervised ML methods. The raw experimental data are fed into an autoencoder that includes the encoder and decoder, which uses the input data as the output data, while the information is passed through a bottleneck layer, as illustrated in Fig. 6.

4.3 From representation learning to inverse design

After learning the representations that are critical in influencing the functionality of the materials, we ought to ask: could we use them inversely, to generate novel and maybe better materials? This question was already pursued in 1999 by Franceschetti and Zunger [177], who successfully searched for the alloy of fixed elements with a targeted electronic structure, using the Monte Carlo method only.
Fig. 6 Schematic of the representation learning methods used in the structural characterization of catalysts, where the
autoencoder, which includes the encoder and decoder, is used, with the input and output data being the same. Reproduced
with permission from Ref. [176].
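A minimal PyTorch sketch of the encoder/bottleneck/decoder arrangement in Fig. 6, trained with the input spectrum as its own target; the layer sizes and the use of dense layers are arbitrary choices for illustration, not those of Ref. [176]:

```python
import torch
from torch import nn

class SpectrumAutoencoder(nn.Module):
    def __init__(self, n_points=100, n_latent=3):
        super().__init__()
        # The encoder compresses the spectrum into a small latent vector (the bottleneck).
        self.encoder = nn.Sequential(nn.Linear(n_points, 32), nn.ReLU(), nn.Linear(32, n_latent))
        # The decoder tries to rebuild the original spectrum from the latent vector.
        self.decoder = nn.Sequential(nn.Linear(n_latent, 32), nn.ReLU(), nn.Linear(32, n_points))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

model = SpectrumAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

spectra = torch.rand(64, 100)        # stand-in for normalized spectra
for _ in range(200):                 # reconstruction training: target == input
    recon, latent = model(spectra)
    loss = loss_fn(recon, spectra)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(latent.shape)                  # the learnt latent features, here (64, 3)
```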
Fig. 7 Depending on the degree of freedom (DOF) involved, the machine learning methodologies of the photonic design
vary. The analytical methods that are suitable for DOF of order unity are replaced by the discriminative model of ML. As
DOF increases, generative model is leveraged to bring down the dimensionality. Reproduced with permission from Ref. [178].
This limited yet profound result shows us the vast usefulness of solving the inverse problem. Now, armed with greater computational power and advancements in ML, we are in a better position to answer this question. Generative models like Variational Autoencoders (VAE) and Generative Adversarial Networks (GAN) have been applied to the inverse design of molecules and solid-state crystals.

By combining the power of representation learning and generative models into a single extensive model, that is, joining the neural networks from several parts of the workflow into a single network, many benefits can be reaped. First, an extensive network is better able to counter noise in the training dataset, resulting in better predictions or better-generated solutions. Secondly, the latent representation learnt from each part of the pipeline is more consistent with the final goal of experimentation or design. Thirdly, the absence of error-prone non-ML human intervention helps experimenters focus on the overall goal and architecture.

By using discriminative models, generative models, and rapid simulation, whether standalone or in combination, one can construct sophisticated models that tackle problems ranging from predicting density functional theory (DFT) properties to inverse device design with confidence. One can also explore material design at different scales and granularities with ML models as an aide. An example is shown in Fig. 7, where both discriminative and generative models are used jointly to design photonics. When the dimensionality of the photonic structures involved is very low, of order unity, analytical methods are well-suited. However, as the dimensionality increases, the analytical methods are no longer feasible, and ML methods are required. On their own, discriminative models are suitable for slightly larger parameter spaces, but when the number of degrees of freedom scales up considerably, generative models can be employed to reduce the dimensionality.

5 Databases in material science

Data is prevalent in materials science; data originating from every aspect and process of the materials science research endeavour has varying types, accuracy and genres. Table 4 lists the typical data types and databases that are used in ML models. A materials science task often includes processing a combination of the data types listed.

The broad spectrum of data types and the multi-modal nature of the input data dictate that materials science models need to learn to integrate multi-modal data to produce meaningful research results. This trend also means that the materials science community needs to embrace the software and statistical revolution that will propel the field forward.
Table 5 Machine learning libraries. All descriptions were adapted from the references therein.
In order to use computer systems to process material information, material-related nomenclatures have to adapt to computer processing norms, like strings. Both the atomic composition and the structure of molecules should be evident from parsing strings, e.g., the Simplified Molecular-Input Line-Entry System (SMILES), BigSMILES [197], Self-referencing embedded strings (SELFIES) [198], and the Physical Information File (PIF) [199]. Materials science datasets are often implemented in neural network data loaders like the Deep Graph Library [200]. The ML community's datasets have codebases that organize information so that software engineers can easily call and process them with a library. Most quantum chemistry software exposes a software-engineering-based Application Programming Interface (API) to share and process quantum chemistry data; it is written to store quantum calculation data with well-tested and scalable database norms (like schemas) and to ease or speed up batch processing of data. The basis sets for quantum chemistry libraries are standardized; typical ones include the Gaussian-type orbital basis (GTO), plane-wave basis (PW), and numerically tabulated atom-centered orbitals (NAO). Table 5 lists software which might be useful: the first portion lists general deep learning libraries (APIs), the second portion lists useful libraries for machine learning tasks, and the third portion lists tools that might be useful for materials science.

6 Machine learning descriptors for material science

Materials science datasets are often comprised of atomistic information with the coordinates of the atoms, the charges on the atoms, and their compositions.
Fig. 8 (a) The mathematical description of the Weyl and Coulomb matrices. (b) The construction of the PRDF sums, where the atoms covered by the yellow shell spanning the radii (r, r + dr) are considered. (b) Reproduced with permission from Ref. [221].
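As a concrete sketch of the pair-wise descriptors of Fig. 8(a), the snippet below builds a Coulomb matrix in its commonly quoted form (off-diagonal $Z_IZ_J/|\mathbf{R}_I-\mathbf{R}_J|$, diagonal $0.5Z_I^{2.4}$) and reduces it to a fixed-length vector of sorted eigenvalues, in the spirit of the approach of Rupp et al. [217] discussed in Section 6.1 below; the exact conventions of that work may differ:

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix: M_IJ = Z_I Z_J / |R_I - R_J| for I != J,
    M_II = 0.5 * Z_I**2.4 (a common convention for the diagonal)."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    with np.errstate(divide="ignore"):
        M = np.outer(Z, Z) / dist
    np.fill_diagonal(M, 0.5 * Z ** 2.4)
    return M

def eigenvalue_descriptor(Z, R, size):
    """Sorted eigenvalues, zero-padded to a fixed length so that molecules of
    different sizes can be compared by a simple Euclidean distance."""
    eig = np.sort(np.linalg.eigvalsh(coulomb_matrix(Z, R)))[::-1]
    return np.pad(eig, (0, size - len(eig)))

# CH4-like toy geometry (coordinates are purely illustrative)
Z = [6, 1, 1, 1, 1]
R = [[0, 0, 0], [1.09, 0, 0], [-0.36, 1.03, 0], [-0.36, -0.51, 0.89], [-0.36, -0.51, -0.89]]
print(eigenvalue_descriptor(Z, R, size=8))
```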
To capture the spatially invariant information, the local environment on the atomic scale is often extracted, such as the list of neighbouring atoms and their relative spatial positions. They are then compactified and propagated as descriptors, in the form of a vector, in a neural network which maps this information to the properties of interest: total energy, mass density, bulk moduli, etc. In general, a good descriptor needs to have the following qualities:
i) Invariant under spatial transformations (arbitrary translations, rotations, and reflections).
ii) Invariant under permutation/exchange of atoms of identical species, i.e., only a unique representation for each arrangement of atoms.
iii) Computationally cheap and easy to implement.
iv) Minor deviation under small perturbations in the atomic structure.
Clearly, the Cartesian coordinates of the atoms do not satisfy points i) and ii), even though they are the easiest imaginable input. There are many different descriptors that have been tried and tested in materials science, which we will attempt to briefly summarize in this section, though the list is by no means exhaustive. For further information and use examples of descriptors, the reader is referred to the articles of Li et al. [215] and Schmidt et al. [216].

6.1 Pair-wise descriptor

The pair-wise descriptor is a type of descriptor that considers each and every possible pair of atoms in the system. Examples include Z-matrices, Weyl matrices, and more recently, the Coulomb matrices [216]. A figure briefly describing the Weyl matrices and Coulomb matrices is shown in Fig. 8(a). In the work of Rupp et al. [217], Coulomb matrices were constructed for a set of organic molecules; their eigenvalues are extracted numerically and sorted in descending order, then the Euclidean difference between the vectors of eigenvalues is computed and defined as the distance between two molecules (with different dimensions accounted for by adding trailing zeroes to the vectors). Using this as the sole descriptor, they developed an ML approach for fast and accurate prediction of molecular atomization energies. The same eigenvalue-based method has also been used in a number of recent studies [218, 219]. The downsides of this method are the inability to differentiate enantiomers [220] and the loss of information, as the dimensions are reduced from N² to N, which can sometimes be an advantage [219].

As described, the Coulomb matrix methods are only viable for finite systems. To extend the pairwise descriptor to infinite periodic systems, Faber et al. [220] proposed three different methods: Ewald sum matrices, sine matrices, and an extended Coulomb-like matrix; their results show that the sine matrix is the most efficient and outputs the smallest error.

Another alternative, the partial radial distribution function (PRDF), was proposed by Schütt et al. [221] and used in their work to perform fast prediction of the density of states at the Fermi level for different types of solids. The pairwise distances $d_{\alpha\beta}$ between two atom types are considered, in the following equation for the PRDF:

$$g_{\alpha\beta}(r) = \frac{1}{N_\alpha V_r}\sum_{i=1}^{N_\alpha}\sum_{j=1}^{N_\beta} \theta\left(d_{\alpha_i\beta_j} - r\right)\,\theta\left(r + dr - d_{\alpha_i\beta_j}\right), \tag{6.1}$$

where θ(x) is the step function, $V_r$ is the volume of the primitive cell, while $N_\alpha$ and $N_\beta$ are the numbers of atoms of types α and β. Only the atoms in the primitive cell are considered as the shell centers, i.e., the atoms $\alpha_i$, see Fig. 8(b). This function is invariant under translation, rotation, and different choices of the unit cell.

6.2 Local descriptor

The most intuitive method to describe a system of atoms that also takes the geometrical aspect into account is the neighbour-based or local descriptor, as the electron density is only weakly affected by distant atoms. By considering the neighbouring atoms of a selected atom within a pre-determined cutoff radius, we can store the information about their bonds, such as the bond distances and angles.

Behler and Parrinello [222] proposed the use of two types of atom-centered symmetry functions to describe the local atomic environment: a radial one [Eq. (6.2)], built from Gaussians of the interatomic distances within a cutoff, and an angular one,
$$G_i^2 = 2^{1-\zeta}\sum_{j,k\neq i}^{\mathrm{all}} (1+\lambda\cos\theta_{ijk})^{\zeta}\, \mathrm{e}^{-\eta\left(R_{ij}^2+R_{ik}^2+R_{jk}^2\right)}\, f_c(R_{ij})\, f_c(R_{ik})\, f_c(R_{jk}), \tag{6.3}$$

where $R_{ij}$ is the distance between atoms i and j, and $\theta_{ijk}$ is the angle between the three atoms i, j, k. There are four free parameters in total, λ (= +1, −1), η, ζ, and the implicit $R_c$ in $f_c$, defined as

$$f_c(R_{ij}) = \begin{cases} 0.5\times\left[\cos\left(\dfrac{\pi R_{ij}}{R_c}\right)+1\right], & R_{ij}\le R_c,\\[4pt] 0, & R_{ij} > R_c.\end{cases} \tag{6.4}$$

The symmetry functions capture the local environment of an atom and are invariant to permutation, translation, rotation, and changes in coordination number. They have been used to reproduce potential energy surfaces (PES) at DFT accuracy. This formalism was extended and studied in extensive detail by Behler [223], where the set of symmetry functions is coined the "Atom-centered Symmetry Functions (ACSFs)". A further generalization was done by Seko et al. [224], which included basis functions other than the Gaussian in Eq. (6.2), such as Neumann functions and Bessel functions. They also introduced the use of the Least Absolute Shrinkage and Selection Operator (LASSO) technique to optimize the basis set and find the sparsest representation to speed up computation. This was successfully used to reproduce almost DFT-accuracy phonon dispersion and specific heat for hcp Mg. More recent work to reduce the undesirable scaling of ACSFs has also been discussed [225].

Another approach, using the bispectrum, a three-point correlation function, was introduced by Bartók et al. [226]. In this approach, they first construct a local atomic density function for each atom i, as

$$\rho_i(\mathbf{r}) = \delta(\mathbf{r}) + \sum_j \delta(\mathbf{r}-\mathbf{r}_{ij})\, f_c(|\mathbf{r}_{ij}|), \tag{6.5}$$

where the δ(r)'s are Dirac delta functions. This atomic density is then projected onto the surface of a 4D sphere, by expanding the atomic density using the 4D spherical harmonics $U^j_{m'm}$ (index i omitted):

$$c^j_{m'm} = \langle U^j_{m'm}\,|\,\rho\rangle, \tag{6.6}$$

and the bispectrum is then built from these coefficients using the ordinary Clebsch–Gordan coefficients $C^{jm}_{j_1m_1j_2m_2}$ of the SO(4) group.

The Smooth Overlap of Atomic Positions (SOAP) descriptor [226] uses the atomic density defined in Eq. (6.5), but with the Dirac delta functions replaced by Gaussians, expanded in terms of spherical harmonics:

$$\exp\left(-\alpha|\mathbf{r}-\mathbf{r}_i|^2\right) = 4\pi \exp\left[-\alpha\left(r^2+r_i^2\right)\right] \sum_{lm} h_l(2\alpha r r_i)\, Y_{lm}(\hat{\mathbf{r}})\, Y^*_{lm}(\hat{\mathbf{r}}_i), \tag{6.8}$$

where the $h_l$'s are the modified spherical Bessel functions of the first kind and the $Y_{lm}$ are the spherical harmonics. A similarity kernel $k(\rho,\rho') \equiv \int d\hat{R}\left(\int \rho(\mathbf{r})\,\rho'(\hat{R}\mathbf{r})\,d\mathbf{r}\right)^n$ was introduced to compare two different atomic environments, where n = 2 is used in their study. The normalized kernel, or SOAP kernel,

$$K(\rho,\rho') = \left(\frac{k(\rho,\rho')}{\sqrt{k(\rho,\rho)\,k(\rho',\rho')}}\right)^{\xi}, \tag{6.9}$$

where ξ is any positive integer chosen to control the sensitivity, goes into the PES of the form

$$\varepsilon(q) = \sum_{k=1}^{N} \alpha_k\, K\left(q, q^{(k)}\right), \tag{6.10}$$

where the $q^{(k)}$ are the training set configurations. The SOAP descriptor is now widely adopted, especially in the machine learning of potentials [227–230].

Based on the SOAP approach, Artrith et al. [216] introduced another descriptor for machine-learnt potentials, which does not scale with the number of chemical species, a feature that SOAP lacks. This is carried out by taking the union of a set of invariant coordinates which maps the atomic structure and another one that maps the chemical composition, which are both described by the radial and angular distribution functions:

$$\mathrm{RDF}_i(r) = \sum_{\alpha} c^{(2)}_{\alpha}\,\phi_{\alpha}(r), \quad 0 \le r \le R_c, \tag{6.11}$$

$$\mathrm{ADF}_i(\theta) = \sum_{\alpha} c^{(3)}_{\alpha}\,\phi_{\alpha}(\theta), \quad 0 \le \theta \le \pi, \tag{6.12}$$

where $R_c$ is the cutoff radius and the $\phi_\alpha$ form a complete basis set, which in their work is the Chebyshev polynomials of the first kind.
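The ingredients shared by these local descriptors, a smooth cutoff like Eq. (6.4) and sums over neighbours within $R_c$, are straightforward to write down. The sketch below transcribes Eqs. (6.3) and (6.4) directly for a single atom with one fixed parameter set; it is meant to make the notation concrete, not to be an efficient ACSF implementation:

```python
import numpy as np

def f_cut(r, r_c):
    """Cosine cutoff of Eq. (6.4): smoothly goes to zero at the cutoff radius r_c."""
    return np.where(r <= r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def angular_symmetry(coords, i, eta=0.5, zeta=1.0, lam=1.0, r_c=6.0):
    """Angular symmetry function of Eq. (6.3) for atom i, summing over all pairs j, k != i."""
    g = 0.0
    n = len(coords)
    for j in range(n):
        for k in range(n):
            if i in (j, k) or j == k:
                continue
            rij = np.linalg.norm(coords[j] - coords[i])
            rik = np.linalg.norm(coords[k] - coords[i])
            rjk = np.linalg.norm(coords[k] - coords[j])
            cos_theta = np.dot(coords[j] - coords[i], coords[k] - coords[i]) / (rij * rik)
            g += ((1.0 + lam * cos_theta) ** zeta
                  * np.exp(-eta * (rij**2 + rik**2 + rjk**2))
                  * f_cut(rij, r_c) * f_cut(rik, r_c) * f_cut(rjk, r_c))
    return 2.0 ** (1.0 - zeta) * g

coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.2, 0.0], [0.9, 0.9, 0.8]])
print(angular_symmetry(coords, i=0))
```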
Fig. 9 (a) Structure graph for 2,3,4-trimethylhexane and (b) the related adjacency and distance matrix. Reproduced with
permission from Ref. [231]. (c) The Universal fragment descriptors. The crystal structure is analysed for atomic neighbours
via Voronoi tessellation with the infinite periodicity taken into account. Reproduced with permission from Ref. [232].
6.3 Graph-based descriptor

By converting the atoms and bonds in a molecule into vertices and edges, we can turn the molecule into a graph, as depicted in Fig. 9(a). The information about the edges and the edge distances between vertices can then be encoded into the adjacency and distance matrices [231], shown in Fig. 9(b). This graph-theoretic approach is known as the structure graph, which was devised long ago, in 1863. Despite their simplicity and apparent loss of 3D information, structure graphs have seen widespread use in comparing the structures of molecules.

The generalization of the structure graph to periodic systems is the Universal Fragment Descriptor (UFD) [232], which uses the Voronoi tessellation [see Fig. 9(c)] to determine the connectivity of atoms, in the following two steps:
i) The crystal is partitioned into atom-centered Voronoi–Dirichlet polyhedra;
ii) Atoms that share a perpendicular-bisecting Voronoi face, with an interatomic distance smaller than the Cordero covalent radii (with 0.25 Å tolerance), are determined to be connected (periodic atoms are considered), which defines the graph.
Subgraphs are also constructed corresponding to the individual fragments, which include linear paths connecting at most 4 atoms and circular fragments representing the coordination polyhedra of an atom. Then, an adjacency matrix A is constructed based on the determined connectivity, along with a reciprocal distance matrix ($D_{ij} = 1/r_{ij}^2$), which when multiplied together give the Galvez matrix M ≡ A · D. The information about the atomic/elemental reference property q (which could be the Mendeleev group and period number, number of valence electrons, electron affinity, covalent radii, etc.) is then incorporated in the pair of descriptors for a particular property q,

$$T^E = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} |q_i - q_j|\, M_{ij}, \tag{6.13}$$

$$T^E_{\mathrm{bond}} = \sum_{\{i,j\}\in\mathrm{bonds}} |q_i - q_j|\, M_{ij}, \tag{6.14}$$

where the former runs over all pairs of atoms while the latter only considers bonded pairs of atoms.

Xie et al. [233] proposed a framework, the generalized crystal graph convolutional neural networks (CGCNN), which introduced another graph-based descriptor that is inspired by the UFD. Their construction of the crystal graph is illustrated in Fig. 10, where the connectivity determination is the same as in the UFD, but they used one-hot encoding to encode the atom and bond properties in two separate feature vectors: node feature vectors and edge feature vectors. These are the descriptors, which are then sent through a convolutional layer that further extracts critical features while reducing the dimensions. Convolutional neural networks are discussed in Section 7.

6.4 Topological descriptor

Topology famously does not differentiate between a donut and a coffee mug, as they both have a hole.
Fig. 12 Persistence barcode plot for the selected Na atom inside a NaCl crystal, surrounded by only (a) Na atoms and (b) Cl atoms. (c) Construction of the crystal topological descriptor, taking into account the different chemical environment. Reproduced with permission from Ref. [236].
palette chosen for the patterns obtained from the rotation about the different axes [e.g., red for the x-axis, green for the y-axis, and blue for the z-axis, see Fig. 13(b)]. The final obtained pattern is then used as the descriptor, fed into a convolutional network, similar to image-based object recognition. The benefits of the XRD descriptor are that its dimension is independent of the system size and that it is very robust to defects [compare Fig. 13(b) and Fig. 13(c)].

The more conventional XRD is the 1D XRD, shown in Fig. 13(d) for different crystals, which is obtained based on Bragg's law, mapping the 3D structures into 1D fingerprints. 1D XRD-based descriptors have been used to classify crystal structures [240] and predict their properties [241]. In the latter, the group used a modified XRD, where only the anion sublattice is considered, with the cations removed, and the pymatgen package is used to generate the XRD computationally. They successfully distinguished solid-state lithium-ion conductors with this descriptor using unsupervised learning.

6.6 Reduction of descriptor dimension

In materials science, there are many possible combinations of various properties that can be used as descriptors. It is often difficult to select and fine-tune the descriptor space manually. This is a common problem in the field of ML, and several methods have been developed to tackle this issue: principal component analysis (PCA) [242, 243], the least absolute shrinkage and selection operator (LASSO) [244], and the sure independence screening and sparsifying operator (SISSO) [245]. However, they mainly work for models that are linear, hence they are not directly applicable to neural network-based models [239].
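As an illustration of the first of these, a PCA reduction of a descriptor matrix takes only a few lines with scikit-learn; the matrix below is random stand-in data, not a real materials dataset:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 40))                 # 500 materials x 40 raw descriptor components
X[:, 20:] = X[:, :20] * 0.5 + 0.05 * rng.normal(size=(500, 20))   # deliberately redundant directions

pca = PCA(n_components=0.95)                   # keep enough components for 95% of the variance
X_reduced = pca.fit_transform(X)
print(X.shape, "->", X_reduced.shape)
print(pca.explained_variance_ratio_[:5])
```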
Fig. 13 (a) Experimental XRD method, where X-ray plane wave incidents on a crystal, resulting in diffraction fingerprints.
(b, c) XRD-based image descriptor for a crystal where each RGB colour corresponds to rotation about the x, y, z axes. The
robustness of the descriptor against defects can be observed by comparing (b) to (c). (d) Examples of 1D XRD. (a–c) Reproduced
with permission from Ref. [239], (d) Reproduced with permission from Ref. [241].
Table 6 List of Machine Learning (ML) algorithms used by various tools or framework developed in materials science.
ML algorithm: Tools/References
Support vector machine (SVM): Refs. [260, 261, 262, 246]
Kernel ridge regression (KRR): Refs. [237, 263, 247, 264]
Deep neural network: VampNet [257], DTNN [265], ElemNet [266], IrNet [267], PhysNet [268], DeepMolNet [269], SIPFENN [270], SpookyNet [250]
Convolutional neural network (CNN): SchNet [271], Refs. [239, 240, 272, 273]
Graph neural network (GNN): CGCNN [274], MEGNet [275], GATGNN [276], OrbNet [277], DimeNET [278], ALIGNN [279], MXMNet [280], GraphVAMPNet [281], GdyNets [282], NequIP [283], PaiNN [284], CCCGN [285, 286], FFiNet [287]
Generative adversarial networks (GAN): Ref. [288], CrystalGAN [246], MatGAN [289]
Variational autoencoder (VAE): FTCP [290], CDVAE [291], Refs. [292, 263]
Random forest / decision tree: Refs. [236, 293, 294, 251, 295, 296]
Unsupervised clustering: Refs. [241, 282, 252, 297, 298]
Transfer learning: Roost [299], AtomSets [288], XenonPy.MDL [289], TDL [290], Refs. [256, 291, 292, 300, 301]
7 Machine learning algorithms for material science

In this section, we collate the recently developed ML-based tools and frameworks in materials science, grouping them together by the ML algorithms used. We then briefly describe the commonly used algorithms and also introduce some of the emerging algorithms, which could further unlock the potential of materials science ML applications.

7.1 Currently utilized algorithms

Table 6 enumerates the ML algorithms used in recently developed tools and frameworks in materials science.
Fig. 14 (a) Neural network (NN) with 3 layers: input, hidden, and output. (b) Deep NN with 3 hidden layers. Reproduced
with permission from Ref. [257].
It can be seen that the convolutional and graph neural networks are the popular algorithms, with transfer learning also picking up pace. We will briefly introduce the algorithms, drawing examples from their materials science implementations.

7.1.1 Kernel-based linear algorithms

Support vector machines (SVM) and kernel ridge regression (KRR) are kernel-based ML algorithms, which utilize kernel functions K(x, x′) that allow high-dimensional features to be used implicitly, without actually computing the feature coordinates explicitly, hence speeding up computation. Furthermore, this allows non-linear problems to be solved using linear algorithms by mapping the problem into a higher-dimensional space. Examples of commonly used kernel functions include the
linear kernel:

$$K(x, x') = (x_i \cdot x_j + \theta); \tag{7.1}$$

polynomial kernel:

$$K(x, x') = (x_i \cdot x_j + \theta)^d; \tag{7.2}$$

Gaussian kernel/radial basis function (RBF):

$$K(x, x') = \exp\left(\frac{-\|x_i - x_j\|^2}{\sigma^2}\right), \tag{7.3}$$

where x is the input data, while θ and σ are adjustable parameters. SVM is used for both classification and regression problems, denoted as SVC and SVR, respectively. On the other hand, KRR is used only for regression problems and is very similar to SVR, except for the different loss functions.

Applications of both types of SVM are demonstrated in the work of Lu et al. [246]. Using atomic parameters such as electronegativity, atomic radius, atomic mass, valence, and functions of these parameters, they constructed a classifier for the formability of the perovskite structure, and regression models to predict the band gaps of binary compounds, using the polynomial kernel [Eq. (7.2)] with d = 2 and θ = 1.

Wu et al. [247] used KRR to assist in non-adiabatic molecular dynamics (NA-MD) simulations, particularly in the prediction of excitation energies and the interpolation of nonadiabatic couplings. KRR was chosen over neural networks because of the fewer hyperparameters KRR possesses, and because KRR requires only simple matrix operations to find the global minimum. By providing only a small fraction (4%) of sampled points, KRR gives a reliable estimate while saving MD computational effort of over an order of magnitude.

7.1.2 Neural network

Artificial Neural Networks (ANNs, shortened to NNs) are a type of ML architecture that aims to mimic the neural structure of the brain. In NNs, there are 3 types of layers consisting of interconnected nodes: the input layer, hidden layer(s), and output layer, as shown in Fig. 14(a). The input layer receives the raw input data, which is then propagated to the hidden layer(s), where the nodes are functions of the backward-connected nodes and each connection is weighted. The function of a hidden layer node m, with x being the node values of the previous layer, takes the form

$$h_m = \sigma\left(b + \sum_i^n \omega_i\, x_i\right), \tag{7.4}$$

where b is the bias term and σ(z) is known as the activation function; common choices are the sigmoid function σ(z) = 1/(1 + e^{−z}) and the ReLU (Rectified Linear Unit) function, simply defined as σ(z) = max(0, z). After the hidden layer(s), the information is then passed toward the output layer, which is another function of the nodes of the final hidden layer. The outputs are measured against the true values using a pre-defined cost function, the simplest example for regression problems being the sum of squared errors

$$J(\omega) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{2}\left(y_i - \hat{y}_i\right)^2,$$

where $y_i$ and $\hat{y}_i$ are the true and predicted values respectively, and the sum is taken over the whole training set, with n being the size of the training set. The weights are then optimized iteratively using the backpropagation method, which is a function of the gradient of the cost function. For a detailed discussion, the book [248] is recommended.
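Equation (7.4) and the squared-error cost can be made concrete in a few lines of numpy; the weights below are random and there is no training loop, since the point is only to mirror the notation above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_layer(x, W, b):
    """Eq. (7.4) for a whole layer: each node m computes sigma(b_m + sum_i W_mi * x_i)."""
    return sigmoid(W @ x + b)

def sum_squared_error(y_true, y_pred):
    """Cost averaged over the training set: (1/n) * sum_i (1/2)(y_i - y_hat_i)^2."""
    return np.mean(0.5 * (y_true - y_pred) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                 # node values of the previous layer
W = rng.normal(size=(3, 4))            # one row of weights per hidden node
b = rng.normal(size=3)                 # bias terms
h = hidden_layer(x, W, b)
print(h)
print(sum_squared_error(np.array([0.2, 0.7, 0.1]), h))
```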
Fig. 15 The CNN architecture used in the work of Ziletti et al. [239]. (a) A kernel or learnable filter is applied all over the
image, taking scalar product between the filter and the image data at every point, resulting in an activation map. This
process is repeated in (b), which is then coarse grained in (c), reducing the dimension. The map is then transferred to regular
NNs hidden layers (d) before it is used to classify the crystal structure (e). Reproduced with permission from Ref. [239].
Deep NNs are NNs with more than one hidden layer [see Fig. 14(b)]. By having more hidden layers, the model is better positioned to capture the nonlinearities in the data. However, having too many hidden layers can cause the convergence or learning to become slow and difficult, because the gradients used in backpropagation will tend to become vanishingly small. To overcome this issue, the residual block has been devised [249], which introduces shortcuts between layers.

SpookyNet [250] is a DNN-based model built to construct force fields that explicitly include nonlocal effects. In their DNN architecture, the generalized sigmoid linear unit (SiLU) activation function is used, given by $\mathrm{silu}(x) = \frac{\alpha x}{1 + \mathrm{e}^{-\beta x}}$, where both α and β are learnable parameters. They noted that a smooth activation function is crucial for the prediction of potential energies, as discontinuities in the atomic forces would otherwise be introduced. They introduced a loss function that has 3 components, energies, forces, and dipole moments, which is minimized by optimizing the weights using mini-batch gradient descent. They also incorporated residual blocks, which allowed them to use a large number of hidden layers.

Convolutional NNs (CNNs) are primarily used in image pattern recognition, and differ from deep NNs by having a few extra layers, namely the convolutional and pooling layers. The extra layers filter and convolute the data to capture crucial features and also reduce the input dimension, which scales quickly with resolution in image recognition problems. The work of Ziletti et al. [239] used a CNN architecture, as depicted in Fig. 15. The convolution layers capture elements that are discriminative and discard unimportant details.

Graph NNs are specifically designed for input data that are structured as graphs, which contain nodes and edges, and can handle inputs of different sizes. There are several different types of graph NNs, such as graph convolutional networks (GCNNs), graph attention networks, and message passing neural networks.

7.1.3 Decision tree and ensembles

The decision tree is a supervised method for solving both classification and regression problems, which resembles a tree. A typical decision tree is shown in Fig. 16(a), where each internal node represents a feature or attribute, each branch contains a decision rule, and each leaf node is a class label or a numerical value, depending on the type of problem solved. The number of node layers a decision tree contains is known as its depth, which needs to be tuned. An important metric used in measuring the performance of a decision tree in classification is the information gain, defined as the difference of the information entropy between the parent and child nodes; for regression problems, the variance reduction is the performance evaluation metric. Decision trees are advantageous when it comes to interpretability, but they suffer from overfitting, especially when the tree is too deep and complex. They can also be overly sensitive to data changes.

Random forest is an algorithm that combines multiple decision trees, with each of them trained on randomized subsets of samples, where both training data and features are chosen at random with replacement in a process known as bootstrapping. The final decision is then made by aggregating the results from each decision tree and taking the majority vote for classification or the average for regression. The steps taken above are collectively known as bagging, which helps ensure that the random forest algorithm is less sensitive to changes in the dataset and more robust to overfitting.
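A minimal scikit-learn sketch of the bagged-tree workflow just described, using synthetic stand-in data rather than the XANES features of Ref. [251]:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a descriptor matrix and class labels (e.g., coordination environments).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

forest = RandomForestClassifier(n_estimators=50, bootstrap=True, random_state=0)
forest.fit(X_train, y_train)                      # each tree is fitted on a bootstrapped sample
print("test accuracy:", forest.score(X_test, y_test))
print("top features:", np.argsort(forest.feature_importances_)[::-1][:5])
```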
Fig. 16 (a) An example of a decision tree, where each square represents an internal node or feature, each arrow represents a branch or decision rule, and the green circles are leaves representing class labels or numerical values. (b) Dendrogram obtained via agglomerative hierarchical clustering (AHC), where the dashed line indicates the optimal clustering. (a) Reproduced with permission from Ref. [258], (b) Reproduced with permission from Ref. [241].
The Gradient Boosting Decision Tree (GBDT) is another method that uses ensembles of decision trees, but in sequence rather than in parallel. GBDT works by adding decision trees iteratively, with each one attempting to improve upon the errors of the previous tree. The final output of the tree ensemble is then taken as a weighted average of the decision trees' outputs.

Random forest models are used in the work of Zheng et al. [251], which predicts the atomic environment labels from the X-ray absorption near-edge structure (XANES). Using the random forest classifier of the scikit-learn package, they found that a 50-tree ensemble gave the best performance, even better than other classifiers such as CNN and SVC. On the other hand, GBDT has been used for regression in the topology-based formation energy predictor [236]. Also using the scikit-learn package, they added one tree to their model at a time and used bootstrapping to reduce overfitting. This topology-based model is able to achieve a high accuracy in cross-validation, with a mean absolute error of only 61 meV/atom, outperforming previous works that use Voronoi tessellations and the Coulomb matrix method.

7.1.4 Unsupervised clustering

K-means clustering is a popular unsupervised classification algorithm which aims to group similar data points together into K different clusters. K points known as cluster centroids are initialized randomly, and each data point is assigned to the cluster centroid that is closest in Euclidean distance to the data point. The centroids are then moved to a new location that is the arithmetic mean of the assigned data points. This repeats until convergence, i.e., when there is no more movement among the centroids. The number K determines the number of classes in the data, which can be known beforehand if the dataset has a clear distinction, e.g., metal vs. non-metal, or can be optimized using the elbow method, which has an associated cost function J that is optimized by the best choice of K. Despite its popularity, K-means clustering has some limitations, such as sensitivity to outliers, dependence on the initialization of the centroid positions, ineffectiveness for datasets with uneven distributions, and the predetermined number of clusters.

Several alternatives have been proposed which improve upon the limitations of K-means clustering. Agglomerative hierarchical clustering (AHC), used in the work of Zhang et al. [241], is initialized by treating each data point as a single cluster, then iteratively merging the clusters of the closest points until one big cluster is left. Then, a dendrogram, or bottom-up hierarchical tree diagram, as shown in Fig. 16(b), can be cut at a desired precision, as indicated in the figure via a dashed line, where 7 groups are obtained. To verify the results, they performed spectral clustering, which splits the samples into K chosen groups based on the eigenvalues of the similarity matrix constructed from the data. This process is recursively applied bisectionally, and they obtained similar clusters as with the AHC. There is also the mean-shift algorithm, utilized in Ref. [252].
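The K-means procedure and the elbow criterion described above can be sketched with scikit-learn; the blob data below is synthetic and purely illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=1.0, random_state=0)

# Elbow method: the within-cluster sum of squares (the cost J) versus K.
for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_, 1))

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", np.bincount(labels))
```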
Fig. 17 The architectures of the two generative models, Generative adversarial networks (GAN) and Variational auto
encoders (VAE). Reproduced with permission from Ref. [259].
and they obtained clusters similar to those from the AHC. There is also the mean-shift algorithm, utilized in Ref. [252].

7.1.5 Generative models (GAN and VAE)

Generative models attempt to learn the underlying distribution of a training dataset and use it to generate new samples that resemble the original data. Two popular types of generative models are Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). As can be seen from Fig. 17, both models consist of two neural networks: a GAN contains a generator and a discriminator network, while a VAE has an encoder and a decoder network. In a GAN, random noise is injected into the generator network, which outputs a sample that is then fed to the discriminator network and classified as either real or generated. The two networks are trained together until the generated samples are able to convince the discriminator that they are real. A VAE, on the other hand, learns latent representations from the training data and generates new samples based on them using a probabilistic approach.

A variant of GAN, the Wasserstein GAN, was applied in the work of Kim et al. [253] to generate Mg–Mn–O ternary materials that can potentially be used as photoanode materials. An overview of their GAN architecture is shown in Fig. 18(a); after training, it takes in a random Gaussian noise vector Z and an encoded composition vector and outputs new, unseen crystals. The new crystals are then passed to a critic and a classifier, where the former computes the Wasserstein distance, which measures the dissimilarity between the generated and true data distributions and is used to improve the realism of the generated materials, while the latter ensures that the generated materials meet the composition condition. Using this model, they found 23 previously unknown crystals with suitable stability and band gap.

An example of inverse design using a VAE was demonstrated in the work of Noh et al. [254]; their proposed two-step VAE-based generator is shown in Fig. 18(b). In the first step, the materials data are passed to a convolutional autoencoder with four convolutional layers, which outputs a compressed intermediate vector; a decoder then aims to map this vector back to the input. In the second step, the intermediate vector is fed into the VAE to learn the latent materials space. To generate completely novel polymorphs, the materials space around known stable structures is sampled using random Gaussian-distributed vectors, and the resulting latent vectors are decoded in a series of steps to output new stable structures. The model was able to recover 25 out of 31 known structures that were not included in the training, and 47 new valid compositions were discovered that had eluded genetic algorithms.
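To make the generative workflow concrete, the following is a minimal VAE sketch in PyTorch. It is illustrative only and is not the architecture of Ref. [253] or Ref. [254]; the 64-dimensional material descriptor, the network sizes, and the random training tensors are placeholder assumptions.

```python
# Minimal VAE sketch (PyTorch). Illustrative only: the 64-dim "material
# descriptor" and the training tensors below are placeholders, not the
# representations used in Refs. [253, 254].
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, n_features=64, n_latent=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 128), nn.ReLU())
        self.to_mu = nn.Linear(128, n_latent)       # mean of latent Gaussian
        self.to_logvar = nn.Linear(128, n_latent)   # log-variance of latent Gaussian
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, 128), nn.ReLU(), nn.Linear(128, n_features))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        return self.decoder(z), mu, logvar

def vae_loss(x, x_rec, mu, logvar):
    # Reconstruction error plus KL divergence to the unit Gaussian prior.
    rec = nn.functional.mse_loss(x_rec, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
data = torch.randn(256, 64)            # placeholder for encoded materials data
for epoch in range(50):
    x_rec, mu, logvar = model(data)
    loss = vae_loss(data, x_rec, mu, logvar)
    opt.zero_grad(); loss.backward(); opt.step()

# Generate new candidates by sampling the latent space around a known material,
# in the spirit of sampling around known stable structures.
with torch.no_grad():
    _, mu, _ = model(data[:1])
    new_candidates = model.decoder(mu + 0.1 * torch.randn(10, 8))
```

A GAN sketch would differ mainly in replacing the encoder with a discriminator that is trained adversarially against the generator.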
Fig. 18 (a) Composition-conditioned crystal GAN, designed to generate crystals that can be applied as photoanodes.
(b) Simplified VAE architecture used in the inverse design of VxOy materials. (a) Reproduced with permission from Ref.
[253], (b) Reproduced with permission from Ref. [254].
7.1.6 Transfer learning

In materials science, high-quality data for a specific type of material are usually scarce, which severely impedes the application of ML in generating high-quality predictions [255]. Transfer learning (TL) is a method that can be applied to overcome this data-scarcity issue. In transfer learning, the parameters of a model that has been pre-trained on a large dataset, but for a different task or purpose, are used to initialize training on another, data-scarce task; for example, the parameters of a model used for predicting formation energies can later be reused to train another model for predicting band gaps.
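A minimal sketch of this parameter reuse is shown below; it is not the model of Ref. [255] or Ref. [256], and the datasets, layer sizes, and the choice of freezing the whole feature extractor are illustrative assumptions.

```python
# Transfer-learning sketch (PyTorch): reuse a pre-trained feature extractor
# for a data-scarce target task. Datasets and sizes are illustrative placeholders.
import torch
import torch.nn as nn

features = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
source_head = nn.Linear(64, 1)   # e.g., formation energy
target_head = nn.Linear(64, 1)   # e.g., band gap

def fit(model, x, y, epochs=100, lr=1e-3):
    opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=lr)
    for _ in range(epochs):
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()

# 1) Pre-train on the large source dataset.
x_src, y_src = torch.randn(5000, 32), torch.randn(5000, 1)
fit(nn.Sequential(features, source_head), x_src, y_src)

# 2) Freeze the shared feature extractor and train only a new head
#    on the small target dataset.
for p in features.parameters():
    p.requires_grad = False
x_tgt, y_tgt = torch.randn(50, 32), torch.randn(50, 1)
fit(nn.Sequential(features, target_head), x_tgt, y_tgt)
```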
Chang et al. [256] combined pairwise transfer learning and a mixture-of-experts (MoE) framework in their model. In pairwise transfer learning, a model is pre-trained on a source task (a task designed for the large dataset), and a subset of the pre-trained model parameters is used to produce generalizable features of an atomic structure; this subset is defined as a feature extractor. The extractor produces a feature vector from an atomic structure, which can be used to predict a scalar property after passing through a neural network. The MoE, on the other hand, contains multiple neural network models that specialize in different regions of the input space, known as "experts", and each of them is activated and controlled by a gating function. The outputs of the experts are then aggregated through an aggregation function. Using this architecture, the authors performed many downstream, data-scarce tasks, such as predicting band gaps, Poisson ratios, exfoliation energies of 2D materials, and experimental formation energies.
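The gating-and-aggregation mechanism described above can be sketched generically as follows; this is a plain soft-gated mixture of experts, not the implementation of Ref. [256], and the number of experts and the feature dimension are arbitrary.

```python
# Generic mixture-of-experts sketch (PyTorch): a gating network produces weights
# that blend the outputs of several "expert" networks. Sizes are placeholders.
import torch
import torch.nn as nn

class MixtureOfExperts(nn.Module):
    def __init__(self, n_features=64, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 1))
             for _ in range(n_experts)])
        self.gate = nn.Linear(n_features, n_experts)   # gating function

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)              # (batch, n_experts)
        outputs = torch.cat([e(x) for e in self.experts], dim=-1)  # (batch, n_experts)
        return (weights * outputs).sum(dim=-1, keepdim=True)       # aggregated prediction

model = MixtureOfExperts()
print(model(torch.randn(8, 64)).shape)   # torch.Size([8, 1])
```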
7.2 Emerging ML methods

7.2.1 Explainable AI (XAI) methods

The DNN-based approaches discussed above have proved to be of great help in assisting and speeding up materials research, but their black-box nature has made understanding and explaining the results difficult, a problem that has also plagued the general ML community [302]. In systems where trust, fairness, and ethics are highly critical, such as healthcare, finance, and autonomous driving, the decisions made by AI cannot be blindly trusted without understanding the motivation and reasoning behind them. Furthermore, when the black box returns results that are erroneous and puzzling, it can be difficult to diagnose and correct the model without knowing what exactly went wrong. To overcome these issues, XAI techniques were introduced, which try to explain the reasoning and connections behind a prediction or classification.

There are many post-hoc (i.e., applied after model fitting) XAI methods proposed for the general ML community [303], including gradient-based attribution (Gradients, Integrated Gradients, and DeepLIFT), deconvolution-based methods (Guided Backpropagation, Deconvolution, Class Activation Maps (CAM), Grad-CAM), and model-agnostic techniques (Shapley additive explanations (SHAP), local interpretable model-agnostic explanations (LIME), Anchors).

Another type of XAI is the use of models that are inherently interpretable or explainable, which have one or more of the following features [304]: sparsity, simulatability, and modularity. A model with a limited number of nonzero parameters is known as sparse, which can be obtained with the LASSO method; a model that can be easily comprehended and mentally simulated by a human user, such as a decision-tree-based model, is simulatable; and a modular model is one that combines several modules which can be interpreted independently.
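As a small illustration of the gradient-based attribution family listed above (a generic sketch, not a materials-specific implementation), the snippet below ranks the input features of a trained property predictor by the magnitude of the output gradient with respect to each feature; the model and the descriptor vector are placeholders, and SHAP, LIME, or Integrated Gradients would be applied analogously through their respective libraries.

```python
# Gradient-based attribution sketch (PyTorch): rank input features of a trained
# property predictor by |d(prediction)/d(feature)|. Model and input are placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
model.eval()

x = torch.randn(1, 16, requires_grad=True)    # one material, 16 descriptor values
prediction = model(x)
prediction.backward()                         # gradient of the scalar output

saliency = x.grad.abs().squeeze()             # feature-importance scores
top = torch.argsort(saliency, descending=True)[:5]
print("most influential descriptors:", top.tolist())
```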
Fig. 19 (a) Overview of explainable DNNs approaches. (b) Feature visualization in the form of heat map used in determining
the ionic conductivity from SEM images. (a) Reproduced with permission from Ref. [302], (b) Reproduced with permission
from Ref. [305].
In the field of materials science, understanding the physical and chemical intuition behind a prediction is paramount, as it opens the door to uncovering hidden connections and physics and improves the efficiency of future studies by providing insights from previous work. The importance and implementation of XAI in materials ML tools (refer to Table 6) have been discussed in the reviews of Oviedo et al. [58] and Zhong et al. [302]. Zhong et al. [302] presented an overview of DNN-based XAI, as shown in Fig. 19(a), which highlights two fundamental motivations for XAI: the need to explain how the results are obtained from the input (model processing), and the need to explain what information is contained in the network (model representations). The design of intrinsically explainable DNNs will prove important in answering these questions, but is itself a highly difficult task. In the following, we illustrate some materials science XAI implementations, which are still in their infancy.

Kondo et al. [305] used heat maps to highlight feature importance, particularly in identifying the positive and negative features that affect ionic conductivity in ceramics, using scanning electron microscope (SEM) images. Their CNN-based model used a feature visualization method that is very similar to the deconvolution method used in CAM and Grad-CAM. By defining a mask map, they obtained masked SEM images [see Fig. 19(b)] that show the features that determine low and high ionic conductivities.

A recent implementation of XAI for crystals is CrysXPP [306], which is built upon an autoencoder-based architecture, CrysAE, containing a deep encoding module capable of capturing the important structural and compositional information in a crystal graph. The information learnt is then transferred to the GCNN contained within CrysXPP [shown in Fig. 20(a)], which takes in features selected from the crystal graph. The feature selector contains trainable weights that select a weighted subset of important features, and it is fine-tuned with LASSO to improve the sparsity of the features. An example of the explainable results obtained is shown in Fig. 20(b), where the features that affect the band gap of the GaP crystal are weighted and compared.

The compositionally restricted attention-based network (CrabNet) [307, 308] is an example of an explainable DNN in materials science that is based on the Transformer architecture.

Fig. 20 (a) The architecture of CrysXPP, which is capable of producing explainable results, as seen in (b) the bar chart of features affecting the band gap of GaP crystal. Reproduced with permission from Ref. [306].
As mentioned above, high-quality, properly labeled data are scarce in materials science. This issue is exacerbated for experimental data, which, unlike computationally produced data, are plagued with issues stemming from different experimental equipment and variable environments. Therefore, few-shot learning (FSL), which specifically targets situations where data are limited, has enticed materials scientists, especially experimentalists. There are several approaches to FSL, as discussed in Refs. [309, 310], such as metric-based, optimization-based, and model-based approaches. FSL is still a relatively young and unrefined method, but it has already attracted a lot of attention. FSL has been implemented in the prediction of molecular properties [311, 312], the classification of space groups from electron backscatter diffraction (EBSD) data [313], and the segmentation of electron microscopy data [314].
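To give a flavour of the metric-based approach mentioned above (a generic prototype-based sketch using placeholder embeddings, not the methods of Refs. [311–314]), a query sample can be classified by its distance to class prototypes computed from a few labelled support examples:

```python
# Metric-based few-shot sketch: classify a query by its distance to class
# "prototypes" (mean embeddings of a handful of labelled support examples).
import torch

def prototypes(support_x, support_y, n_classes):
    return torch.stack([support_x[support_y == c].mean(dim=0) for c in range(n_classes)])

support_x = torch.randn(10, 32)                 # 5 examples x 2 classes (placeholder embeddings)
support_y = torch.tensor([0] * 5 + [1] * 5)
query = torch.randn(3, 32)

protos = prototypes(support_x, support_y, n_classes=2)
dists = torch.cdist(query, protos)              # Euclidean distance to each prototype
print("predicted classes:", dists.argmin(dim=1).tolist())
```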
8 Machine learning tasks for material science

This section discusses the materials science tasks that ML tools have been utilized to assist in tackling. The common ML tasks in materials science often coincide with the traditional ML tasks, which have been extensively studied and optimized. The tasks of inferring a material property from structural and compositional data, generative modelling from a latent representation of desired properties, and the generation of DFT functionals are analogous to tasks that ML has traditionally performed well, including object classification, image and text generation from text cues, and natural language processing (NLP).

8.1 Potentials, functionals, and parameters generation

Traditionally, the XC functionals used in DFT, such as the Perdew–Zunger exchange, are generated through mathematical approaches guided by empirical data, while the exact XC functional remains elusive. The search for an improved XC functional above the currently popular GGA on Perdew's Jacob's ladder [315] is desirable. ML techniques have started to be utilized in the generation of new XC functionals [316–320], with the aim of improving the calculated accuracy while maintaining efficiency. Transferability remains a huge challenge, which will require a huge and diverse dataset to overcome.

The potentials and force fields used in molecular dynamics (MD) are critical in determining the reliability and accuracy of the output [321]. MD based on fixed potentials rather than a first-principles approach is in general less accurate than ab initio MD (AIMD), but it can be applied to large systems and long time scales, where AIMD is too costly. As such, one would hope that standard MD can bring about results similar to AIMD. Developed in 2017, DeePMD [322] accurately reproduced the water model obtained from DFT. The same team developed an open-source tool for the on-the-fly generation of MD potentials, known as DP-GEN [323], available at URL: github.com/deepmodeling/dpgen. In 2020, the team won the ACM Gordon Bell Prize for the DeePMD work, as it can be scaled efficiently on the best HPC systems. Similar works have also been carried out by other teams [324, 325].

Another materials modeling technique is the density functional tight binding (DFTB) method, which is less computationally expensive than DFT-based first-principles calculations. Efforts have been made to apply ML to obtain the TB parameters [252, 326].
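A heavily simplified sketch of the underlying idea of fitting a potential energy surface to reference data is given below: a kernel ridge regression maps a structural descriptor to an energy, and the fitted model is then queried for unseen configurations. The descriptors and reference energies are random placeholders; real ML potentials such as DeePMD use atomic-environment descriptors and also fit forces.

```python
# Potential-energy-surface regression sketch (scikit-learn): learn a mapping from
# structural descriptors to reference energies. Data are random placeholders; real
# ML potentials use atomic-environment descriptors and fit forces as well.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(500, 30))     # e.g., symmetry functions per structure
energies = np.sum(descriptors**2, axis=1)    # placeholder for DFT reference energies

model = KernelRidge(kernel="rbf", alpha=1e-3, gamma=0.05)
model.fit(descriptors, energies)

new_configs = rng.normal(size=(5, 30))       # unseen configurations, e.g., from an MD run
print("predicted energies:", model.predict(new_configs))
```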
Fig. 21 High-throughput screening with learnt interatomic potential embedding from Ref. [330]. With the integration of active learning and DFT in the screening pipeline, the throughput efficiency or the quality of the output obtained from calculation can be improved. Reproduced with permission from Ref. [330].

8.2 Screening of materials

There are many methods to compress the design space. To name a few, one could train a model that predicts a material property given material information, or perform ML-guided simulation of new materials to predict material properties.
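The first of these routes can be sketched with a standard regression workflow; this is illustrative only and not the pipeline of Ref. [330], and the descriptors, target property, and shortlist size are placeholder assumptions.

```python
# Screening sketch (scikit-learn): train a property predictor on known materials,
# then rank unexplored candidates by the predicted property. Features, the target
# property, and the screening threshold are illustrative placeholders.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                                   # descriptors of known materials
y = X[:, 0] - 0.5 * X[:, 3] + rng.normal(scale=0.1, size=1000)    # known property values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
print("held-out R^2:", model.score(X_test, y_test))

candidates = rng.normal(size=(20000, 20))                   # unexplored design space
predicted = model.predict(candidates)
shortlist = candidates[np.argsort(predicted)[::-1][:100]]   # top-100 for DFT/experiment
```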
Fig. 22 Schematic of a generative adversarial network. Reproduced with permission from Ref. [332].
ment, with the recent GPT-3 [340] and now GPT-4 [341] models making strides not just academically, but in use just about everywhere. The trained AI models are able to hold realistic conversations with humans, take standardized exams [342], write code in various programming languages, and so on.

Most of the published results in materials science are not stored in a centralized database, which hinders the overall effort of ML applications. NLP techniques can help by scraping information, such as materials structures and properties, from the published literature. Tshitoyan et al. [343] demonstrated an unsupervised learning model that can capture complicated underlying knowledge from the literature and recommend materials for functional applications before their actual discovery. NLP can also help with hypothesis formation and provide knowledge on the current trends in the field [344]. NLP methods can furthermore serve as efficient knowledge extractors from the vast materials science literature, making the literature review process more efficient and thorough for researchers [345].

9 Perspectives on the integration of machine learning in materials science

In the following, we list perspectives on the integration of machine learning in materials science from the machine learning point of view and from the materials science point of view, respectively.

9.1 Perspectives from machine learning viewpoint

As ML techniques and ideas become ever more prevalent, we believe algorithmic templates and ML ideas will eventually become either the target modes of computation or the guidelines that decide which areas of materials science garner attention and gain resources. Machine translation, for instance, has evolved from rule-based approaches coupled with statistical models to a very data-driven approach, and researchers now discuss the translation task with less and less reference to specific source and target languages, pivoting towards advancing the mode of computation for the task as a whole.

9.1.1 More deep integrations

We might also observe a trend of attempting to learn descriptors for parts of complex systems with ML models, so as to make them either more computationally efficient or more human-interpretable and editable. Instead of scientists attempting to describe a system with equations from first principles, ML models can help scientists discover a better set of descriptors for systems across all datasets. For example, descriptors could be descriptors for the input data (atomistic information/reaction space) or for the labels (crystal structures). These discoveries can hugely impact physics and chemistry theory [346], experiments, and research methods. Physics-informed neural networks [347] can improve both neural network performance and the efficiency of physics research.

Increasingly, the focus could shift more and more towards the mapping of descriptors. We can imagine that, with the emergence of more sophisticated models, it is possible to advance a particular segment of study in materials science, such as polymer design, by completing a well-defined sophisticated task with a model or model of computation, where only the lack of relevant databases will limit its advancement. Tracing the development of computer science, sophisticated models which perform generic materials science tasks well will again be integrated into a giant multi-purpose model, much like a generic processing chip, which we can prompt for insights that were previously only gained by human experimentation at a much slower rate. ML models will bring materials scientists closer to the many possibilities already inherent in big data itself, allowing us to explore and exploit these possibilities with greater efficiency. The tasks a materials scientist will be able to complete with the help of machine learning will become more integrated and sophisticated, from the screening of materials to the design of a material as a complete task. Then, with the design of a material as an atomic/primitive transaction, we will be able to build new science on top of material design as a whole.

9.1.2 Systematic generalization

In our stride towards artificial general intelligence (AGI), researchers have drawn many parallels and inspirations from neuroscience [348] and from how humans learn and teach each other, in order to develop models which generalize better to novel situations and objects. We expect a body of materials science knowledge and ideas to become generalized and accessible to other fields, conveyed by advanced models in the future, where we can generalize or verbalize the properties of imagined materials, or predict the performance of a material in novel situations with high accuracy, with the formal deduction process generated by models. We can also observe the interaction, cooperation, or contradiction between bodies of materials science knowledge for novel materials and circumstances, and perform research on the intersection of bodies of knowledge with more depth and rigor. ML models can also learn to identify potential directions for exploration, come up with a comprehensive experimentation plan, and collaborate with human researchers as a navigation co-pilot. The directions identified will be novel and comprehensive because models can learn from passive observation of the large materials science literature, its publication trends [349], and the insight analyses of researchers.
9.1.3 Huge computational models

With the development of reaction environment models, one might also reasonably expect reinforcement learning and game-play learning to learn an agent policy for a material, i.e., to first learn a material behaviour that is desirable for a particular purpose, followed by an automatic search/generation of materials which suit the specification. In general, the compounding of learning methods to obtain a solution for an even more vaguely defined objective, but with a more analyzable process for that solution, results in a human-verifiable solution for large or vague problems. Moreover, the increasing synthesizability and explainability of the solutions to vague problems will help materials scientists navigate methods for solving huge overarching/generic problems with more finesse, evolving the subject through large models [350].

The model of computation might become the common language of materials scientists and researchers from other fields. Task definition might become the lingua franca, or the leading concern, for ML practitioners in materials science. This broader definition of materials science might then, in turn, propel the advancement of machine learning. In general, the barrier to entry to both advanced materials science and advanced ML will be lowered, allowing more experts from other fields, and more individuals, to contribute their efforts and ideas to the development of both fields.

Mechanisms in quantum ML will become readily available to be integrated with quantum physics, chemistry, and subsequently materials science. As classical-quantum hybrid infrastructure and architectures [351] become more available, quantum learning for materials science might incorporate mechanisms of both quantum computing and quantum analysis of materials as primitives. This trend is expected to speed up the interdisciplinary mixing of these fields on both engineering and theoretical grounds.

The resulting phenomenon is the emergence of an ever more integrated huge ML model, a "super deep-learning model", which will tackle most if not all of the most fundamental underlying problems in materials science. It will integrate fundamental engineering ideas from computer science with domain invariants of materials science, and it will be designed to perform well for various tasks both on super-computing facilities, quantum or otherwise, and on limited-resource devices [352], scalable yet robust. Moreover, by integrating the best training and privacy practices from ML software and hardware development experience, future materials scientists can expect robust downstream materials science models running smoothly and reliably as applications on widely available and portable devices like a cell phone.

9.2 Perspectives from material science viewpoint

Currently, one of the biggest challenges is the availability of high-quality data. The increasing number of research groups adopting the open-data approach and the growing availability of internet of things (IoT) devices will solve this problem, albeit gradually. We have also discussed several possible methods to overcome this issue; the advancement of small-training-sample ML algorithms, such as transfer learning and few-shot learning, will also be among the possible solutions.

9.2.1 Theoretical and computational materials science

The various computational techniques in materials science, such as DFT, molecular dynamics (MD) in its various forms, Monte Carlo methods, and the density functional tight binding method, have started to benefit from the application of ML and will continue to do so in a dramatic manner.

As of now, Kohn–Sham DFT remains a reliable and popular method for determining various material properties. However, the accuracy of DFT calculations heavily relies on the quality of the approximations employed, such as the exchange-correlation (XC) functional. The search for improved approximations, including the exact functional, using ML has only recently commenced. Another area for improvement in DFT is reducing computational costs. Recently, ML-refined numerical techniques have emerged that offer faster speeds compared to their traditional counterparts [353–355]. It is hoped that these advancements can eventually be applied to accelerate DFT computations.

The integration of ML into MD, exemplified by methods like DeePMD, has demonstrated the potential to achieve DFT-level accuracy while maintaining the computational efficiency of classical MD. This breakthrough opens up new possibilities for conducting calculations with ab initio molecular dynamics (AIMD) accuracy on extremely large systems (with over 100 million atoms) or over very long timescales (beyond 1 nanosecond) [333, 355]. By enabling adequate sampling of phase space, these advancements allow more comprehensive investigations across various applications, including (electro-)catalysis, sensors, fabrication, drug interactions, and more.

9.2.2 Experimental materials science

The availability of a vast number of predicted materials with desired properties is highly advantageous for experimentalists. With a large number of possible candidates, the experimentalist can focus on the materials that can be synthesized and tested on the available facilities and equipment.
Additionally, the automated learning of fabrication parameters and conditions has been on the rise recently [356–359]. The advancement of MD will also enable comprehensive simulations of fabrication processes and help find the best experimental conditions for the successful synthesis of new materials [360, 361]. Furthermore, the analysis of the data, a highly time-consuming and laborious task, is being increasingly supported by ML algorithms. The implementation of on-the-fly, accurate inference on experimental data will increase the productivity and efficiency of the fabrication process, enabling experimentalists to determine whether the samples have been fabricated successfully and move on to the next attempt quickly.

This review also covered the tasks and issues in materials science that have been tackled with the use of machine learning. We also discussed our vision for the future of materials science as the field matures with the integration of machine learning, which will be drastically different from what we know today.

Declarations The authors declare that they have no competing interests and there are no conflicts.

Acknowledgements This research was supported by the Ministry of Higher Education Malaysia through the Fundamental Research Grant Scheme (No. FRGS/1/2021/STG05/XMU/01/1).
An integrated study of electron, neutron, and X-ray CaCu3Ti4O12: A new route to the enhanced dielectric
diffraction, X-ray absorption fine structure, and first- response, Phys. Rev. Lett. 99(3), 037602 (2007)
principles calculations, Phys. Rev. B 81(14), 144203 29. J. C. Zheng, H. Q. Wang, A. T. S. Wee, and C. H. A.
(2010) Huan, Structural and electronic properties of Al
15. N. Sa, S. S. Chong, H. Q. Wang, and J. C. Zheng, nanowires: An ab initio pseudopotential study, Int. J.
Anisotropy engineering of ZnO nanoporous frameworks: Nanosci. 01(02), 159 (2002)
A lattice dynamics simulation, Nanomaterials (Basel) 30. J. C. Zheng, H. Q. Wang, A. T. S. Wee, and C. H. A.
12(18), 3239 (2022) Huan, Possible complete miscibility of (BN)x(C2)1–x
16. H. Cheng and J. C. Zheng, Ab initio study of alloys, Phys. Rev. B 66(9), 092104 (2002)
anisotropic mechanical and electronic properties of 31. J. C. Zheng, H. Q. Wang, C. H. A. Huan, and A. T. S.
strained carbon-nitride nanosheet with interlayer bond- Wee, The structural and electronic properties of
ing, Front. Phys. 16(4), 43505 (2021) (AlN)x(C2)1–x and (AlN)x(BN)1–x alloys, J. Phys.:
17. Y. Huang, C. Y. Haw, Z. Zheng, J. Kang, J. C. Zheng, Condens. Matter 13(22), 5295 (2001)
and H. Q. Wang, Biosynthesis of zinc oxide nanomaterials 32. H. Q. Wang, J. C. Zheng, R. Z. Wang, Y. M. Zheng,
from plant extracts and future green prospects: A topical and S. H. Cai, Valence-band offsets of III–V alloy
review, Adv. Sustain. Syst. 5(6), 2000266 (2021) heterojunctions, Surf. Interface Anal. 28(1), 177 (1999)
18. Z. Q. Wang, H. Cheng, T. Y. Lü, H. Q. Wang, Y. P. 33. J. C. Zheng, R. Z. Wang, Y. M. Zheng, and S. H. Cai,
Feng, and J. C. Zheng, A super-stretchable boron Valence offsets of three series of alloy heterojunctions,
nanoribbon network, Phys. Chem. Chem. Phys. 20(24), Chin. Phys. Lett. 14(10), 775 (1997)
16510 (2018) 34. J. C. Zheng, Y. Zheng, and R. Wang, Valence offsets
19. Y. Li, H. Q. Wang, T. J. Chu, Y. C. Li, X. Li, X. Liao, of ternary alloy heterojunctions InxGa1-xAs/InxAl1-xAs,
X. Wang, H. Zhou, J. Kang, K. C. Chang, T. C. Chin. Sci. Bull. 41(24), 2050 (1996)
Chang, T. M. Tsai, and J. C. Zheng, Tuning the 35. L. Liu, T. Wang, L. Sun, T. Song, H. Yan, C. Li, D.
nanostructures and optical properties of undoped and Mu, J. Zheng, and Y. Dai, Stable cycling of all‐solid‐
N-doped ZnO by supercritical fluid treatment, AIP state lithium metal batteries enabled by salt engineering
Adv. 8(5), 055310 (2018) of PEO‐based polymer electrolytes, Energy Environ.
20. Y. L. Li, Z. Fan, and J. C. Zheng, Enhanced thermoelectric Mater. (Feb.), e12580 (2023)
performance in graphitic ZnO (0001) nanofilms, J. 36. W. Zhang, F. Y. Du, Y. Dai, and J. C. Zheng, Strain
Appl. Phys. 113(8), 083705 (2013) engineering of Li+ ion migration in olivine phosphate
21. J. He, I. D. Blum, H. Q. Wang, S. N. Girard, J. Doak, cathode materials LiMPO4 (M = Mn, Fe, Co) and
L. D. Zhao, J. C. Zheng, G. Casillas, C. Wolverton, M. (LiFePO4)n(LiMnPO4)m superlattices, Phys. Chem.
Jose-Yacaman, D. N. Seidman, M. G. Kanatzidis, and Chem. Phys. 25(8), 6142 (2023)
V. P. Dravid, Morphology control of nanostructures: 37. B. Zhang, L. Wu, J. Zheng, P. Yang, X. Yu, J. Ding,
Na-doped PbTe–PbS system, Nano Lett. 12(11), 5979 S. M. Heald, R. A. Rosenberg, T. V. Venkatesan, J.
(2012) Chen, C. J. Sun, Y. Zhu, and G. M. Chow, Control of
22. Z. Fan, J. Zheng, H. Q. Wang, and J. C. Zheng, magnetic anisotropy by orbital hybridization with
Enhanced thermoelectric performance in three-dimen- charge transfer in (La0.67Sr0.33MnO3)n/(SrTiO3)n super-
sional superlattice of topological insulator thin films, lattice, NPG Asia Mater. 10(9), 931 (2018)
Nanoscale Res. Lett. 7(1), 570 (2012) 38. L. Zhang, T. Y. Lü, H. Q. Wang, W. X. Zhang, S. W.
23. N. Wei, H. Q. Wang, and J. C. Zheng, Nanoparticle Yang, and J. C. Zheng, First principles studies on the
manipulation by thermal gradient, Nanoscale Res. Lett. thermoelectric properties of (SrO)m(SrTiO3)n superlat-
7(1), 154 (2012) tice, RSC Adv. 6(104), 102172 (2016)
24. N. Wei, Z. Fan, L. Q. Xu, Y. P. Zheng, H. Q. Wang, 39. J. C. Zheng, C. H. A. Huan, A. T. S. Wee, M. A. V.
and J. C. Zheng, Knitted graphene-nanoribbon sheet: Hove, C. S. Fadley, F. J. Shi, E. Rotenberg, S. R.
A mechanically robust structure, Nanoscale 4(3), 785 Barman, J. J. Paggel, K. Horn, P. Ebert, and K.
(2012) Urban, Atomic scale structure of the 5-fold surface of a
25. J. Q. He, J. R. Sootsman, L. Q. Xu, S. N. Girard, J. C. AlPdMn quasicrystal: A quantitative X-ray photoelec-
Zheng, M. G. Kanatzidis, and V. P. Dravid, Anomalous tron diffraction analysis, Phys. Rev. B 69(13), 134107
electronic transport in dual-nanostructured lead (2004)
telluride, J. Am. Chem. Soc. 133(23), 8786 (2011) 40. H. Q. Wang, J. Xu, X. Lin, Y. Li, J. Kang, and J. C.
26. N. Wei, L. Xu, H. Q. Wang, and J. C. Zheng, Strain Zheng, Determination of the embedded electronic
engineering of thermal conductivity in graphene sheets states at nanoscale interface via surface-sensitive
and nanoribbons: A demonstration of magic flexibility, photoemission spectroscopy, Light Sci. Appl. 10(1), 153
Nanotechnology 22(10), 105705 (2011) (2021)
27. J. He, J. R. Sootsman, S. N. Girard, J. C. Zheng, J. 41. M. A. Van Hove, K. Hermann, and P. R. Watson, The
Wen, Y. Zhu, M. G. Kanatzidis, and V. P. Dravid, On NIST surface structure database – SSD version 4, Acta
the origin of increased phonon scattering in nanostruc- Crystallogr. B 58(3), 338 (2002)
tured PbTe-based thermoelectric materials, J. Am. 42. H. Q. Wang, E. Altman, C. Broadbridge, Y. Zhu, and
Chem. Soc. 132(25), 8669 (2010) V. Henrich, Determination of electronic structure of
28. Y. Zhu, J. C. Zheng, L. Wu, A. I. Frenkel, J. Hanson, oxide-oxide interfaces by photoemission spectroscopy,
P. Northrup, and W. Ku, Nanoscale disorder in Adv. Mater. 22, 2950 (2010)
43. H. Zhou, L. Wu, H. Q. Wang, J. C. Zheng, L. Zhang, and A. Walsh, Machine learning for molecular and
K. Kisslinger, Y. Li, Z. Wang, H. Cheng, S. Ke, Y. Li, materials science, Nature 559(7715), 547 (2018)
J. Kang, and Y. Zhu, Interfaces between hexagonal 58. F. Oviedo, J. L. Ferres, T. Buonassisi, and K. T.
and cubic oxides and their structure alternatives, Nat. Butler, Interpretable and explainable machine learning
Commun. 8(1), 1474 (2017) for materials science and chemistry, Acc. Mater. Res.
44. J. D. Steiner, H. Cheng, J. Walsh, Y. Zhang, B. 3(6), 597 (2022)
Zydlewski, L. Mu, Z. Xu, M. M. Rahman, H. Sun, F. 59. J. F. Rodrigues Jr, M. C. F. Florea, D. de Oliveira, D.
M. Michel, C. J. Sun, D. Nordlund, W. Luo, J. C. Diamond, and O. N. Oliveira Jr, Big data and machine
Zheng, H. L. Xin, and F. Lin, Targeted surface doping learning for materials science, Discover Materials 1(1),
with reversible local environment improves oxygen 12 (2021)
stability at the electrochemical interfaces of nickel-rich 60. K. Choudhary, B. DeCost, C. Chen, A. Jain, F.
cathode materials, ACS Appl. Mater. Interfaces 11(41), Tavazza, R. Cohn, C. W. Park, A. Choudhary, A.
37885 (2019) Agrawal, S. J. L. Billinge, E. Holm, S. P. Ong, and C.
45. J. C. Zheng, H. Q. Wang, A. T. S. Wee, and C. H. A. Wolverton, Recent advances and applications of deep
Huan, Trends in bonding configuration at SiC/III–V learning methods in materials science, npj Comput.
semiconductor interfaces, Appl. Phys. Lett. 79(11), Mater. 8, 59 (2022)
1643 (2001) 61. L. Samuel, Some studies in machine learning using the
46. H. Q. Wang, J. C. Zheng, A. T. S. Wee, and C. H. A. game of checkers, IBM J. Res. Develop. 3(3), 210
Huan, Study of electronic properties and bonding (1959)
configuration at the BN/SiC interface, J. Electron 62. L. Breiman, J. H. Friedman, R. A. Olshen, and C. J.
Spectrosc. Relat. Phenom. 114–116, 483 (2001) Stone, Classification and Regression Trees, 1983
47. S. Lin, B. Zhang, T. Y. Lü, J. C. Zheng, H. Pan, H. 63. L. G. Valiant, A theory of the learnable, in: STOC ’84
Chen, C. Lin, X. Li, and J. Zhou, Inorganic lead-free B- Proceedings of the Sixteenth Annual ACM Symposium
γ-CsSnI 3 perovskite solar cells using diverse electron- on Theory of Computing, pp 436–445, 1984
transporting materials: A simulation study, ACS 64. T. Mitchell, Machine Learning, New York, USA:
Omega 6(40), 26689 (2021) McGrawHill, 1997
48. F. Y. Du, W. Zhang, H. Q. Wang, and J. C. Zheng, 65. S. Roweis and Z. Ghahramani, A unifying review of
Enhancement of thermal rectification by asymmetry linear gaussian models, Neural Comput. 11(2), 305
engineering of thermal conductivity and geometric (1999)
structure for the multi-segment thermal rectifier, Chin. 66. J. C. Zheng, J. Y. Chen, J. W. Shuai, S. H. Cai, and R.
Phys. B 32(6), 064402 (2023) Z. Wang, Storage capacity of the Hopfield neural
49. M. Kulichenko, J. S. Smith, B. Nebgen, Y. W. Li, N. network, Physica A 246(3), 313 (1997)
Fedik, A. I. Boldyrev, N. Lubbers, K. Barros, and S. 67. J. W. Shuai, J. C. Zheng, Z. X. Chen, R. T. Liu, and
Tretiak, The rise of neural networks for materials and B. X. Wu, The three-dimensional rotation neural
chemical dynamics, J. Phys. Chem. Lett. 12(26), 6227 network, Physica A 238), 23 (1997)
(2021) 68. M. Mohri, A. Rostamizadeh, and A. Talwalkar, Foun-
50. W. Sha, Y. Guo, Q. Yuan, S. Tang, X. Zhang, S. Lu, dations of Machine Learning, 2nd Ed. , Adaptive
X. Guo, Y. C. Cao, and S. Cheng, Artificial intelligence Computation and Machine Learning. Cambridge, MA:
to power the future of materials science and engineer- MIT Press, 2018
ing, Adv. Intell. Syst. 2(4), 1900143 (2020) 69. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L.
51. S. Leonelli, Scientific research and big data, in: The Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin,
Stanford Encyclopedia of Philosophy, Summer 2020 Attention is all you need, arXiv: 1706.03762 (2017)
Ed., edited by E. N. Zalta, Metaphysics Research Lab, 70. A. Wang, Y. Pruksachatkun, N. Nangia, A. Singh, J.
Stanford University, 2020 Michael, F. Hill, O. Levy, and S. R. Bowman, Super-
52. J. Westermayr, M. Gastegger, K. T. Schütt, and R. J. GLUE: A stickier benchmark for general-purpose
Maurer, Perspective on integrating machine learning language understanding systems, arXiv: 1905.00537
into computational chemistry and materials science, J. (2019)
Chem. Phys. 154(23), 230903 (2021) 71. D. Erhan, Y. Bengio, A. Courville, P. A. Manzagol, P.
53. D. Morgan and R. Jacobs, Opportunities and challenges Vincent, and S. Bengio, Why does unsupervised pre-
for machine learning in materials science, Annu. Rev. training help deep learning, J. Mach. Learn. Res. 11,
Mater. Res. 50(1), 71 (2020) 625 (2010)
54. C. Chen, Y. Zuo, W. Ye, X. Li, Z. Deng, and S. P. 72. Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M.
Ong, A critical review of machine learning of energy Gong, L. Shou, B. Qin, T. Liu, D. Jiang, and M. Zhou,
materials, Adv. Energy Mater. 10(8), 1903242 (2020) CodeBERT: A pre-trained model for programming and
55. J. Wei, X. Chu, X. Y. Sun, K. Xu, H. X. Deng, J. natural languages, arXiv: 2002.08155 (2020)
Chen, Z. Wei, and M. Lei, Machine learning in materials 73. H. Bao, L. Dong, and F. Wei, BEIT: BERT pre-training
science, InfoMat 1(3), 338 (2019) of image transformers, arXiv: 2106.08254 (2021)
56. G. Pilania, Machine learning in materials science: 74. K. Hakhamaneshi, M. Nassar, M. Phielipp, P. Abbeel,
From explainable predictions to autonomous design, and V. Stojanović, Pretraining graph neural networks
Comput. Mater. Sci. 193, 110360 (2021) for few-shot analog circuit modeling and design, arXiv:
57. K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev, 2203.15913 (2022)
75. J. Li, D. Li, C. Xiong, and S. Hoi, BLIP: Bootstrapping arXiv: 2205.01068 (2022)
language-image pre-training for unified vision-language 90. O. Lieber, O. Sharir, B. Lenz, and Y. Shoham, Jurassic-
understanding and generation, arXiv: 2201.12086 1: Technical Details and Evaluation, AI21 Labs, Tech.
(2022) Rep., 2021
76. K. Lu, A. Grover, P. Abbeel, and I. Mordatch, 91. T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D.
Pretrained transformers as universal computation Kaplan, et al., Language models are few-shot learners,
engines, arXiv: 2103.05247 (2021) in: Advances in Neural Information Processing Systems,
77. M. Reid, Y. Yamada, and S. S. Gu, Can Wikipedia edited by H. Larochelle, M. Ranzato, R. Hadsell, M.
help offline reinforcement learning? arXiv: 2201.12122 Balcan, and H. Lin, 33 Curran Associates, Inc., 2020,
(2022) pp 1877–1901, arXiv: 2005.14165
78. C. Sun, X. Qiu, Y. Xu, and X. Huang, How to fine- 92. A. Bapna, I. Caswell, J. Kreutzer, O. Firat, D. van
tune BERT for text classification? arXiv: 1905.05583 Esch, et al., Building machine translation systems for
(2019) the next thousand languages, arXiv: 2205.03983 (2022)
79. H. Liu, D. Tam, M. Muqeeth, J. Mohta, T. Huang, M. 93. T. Mikolov, K. Chen, G. Corrado, and J. Dean,
Bansal, and C. Raffel, Few-shot parameter-efficient Efficient estimation of word representations in vector
fine-tuning is better and cheaper than in-context learn- space, arXiv: 1301.3781 (2013)
ing, Advances in Neural Information Processing 94. J. Pennington, R. Socher, and C. Manning, GloVe:
Systems 35, 1950 (2022) Global vectors for word representation, in: Proceedings
80. J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, of the 2014 Conference on Empirical Methods in Natural
BERT: Pre-training of deep bidirectional transformers Language Processing (EMNLP). Doha, Qatar: Associa-
for language understanding, arXiv: 1810.04805 (2018) tion for Computational Linguistics, Oct. 2014, pp
81. Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, 1532–1543
and R. Soricut, ALBERT: A lite BERT for self-supervised 95. O. Melamud, J. Goldberger, and I. Dagan,
learning of language representations, arXiv: 1909.11942 Context2vec: Learning generic context embedding with
(2019) bidirectional LSTM, in: Proceedings of the 20th
82. Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. SIGNLL Conference on Computational Natural
Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, Language Learning. Berlin, Germany: Association for
ROBERTA: A robustly optimized BERT pretraining Computational Linguistics, Aug. 2016, pp 51–61
approach, arXiv: 1907.11692 (2019) 96. H. Dai, B. Dai, and L. Song, Discriminative embeddings
83. J. Vig and Y. Belinkov, Analyzing the structure of of latent variable models for structured data, arXiv:
attention in a transformer language model, arXiv: 1603.05629 (2016)
1906.04284 (2019) 97. J. Yang, R. Zhao, M. Zhu, D. Hallac, J. Sodnik, and J.
84. S. Zhang and L. Xie, Improving attention mechanism Leskovec, Driver2vec: Driver identification from auto-
in graph neural networks via cardinality preservation, motive data, arXiv: 2102.05234 (2021)
in: Proceedings of the Twenty-Ninth International 98. S. Schneider, A. Baevski, R. Collobert, and M. Auli,
Joint Conference on Artificial Intelligence, International Wav2vec: Unsupervised pre-training for speech recog-
Joint Conferences on Artificial Intelligence Organiza- nition, arXiv: 1904.05862 (2019)
tion, 2020, page 1395 99. H. Zhou, S. Zhang, J. Peng, S. Zhang, J. Li, H. Xiong,
85. Y. Tay, V. Q. Tran, M. Dehghani, J. Ni, D. Bahri, H. and W. Zhang, Informer: Beyond efficient transformer
Mehta, Z. Qin, K. Hui, Z. Zhao, J. Gupta, T. Schuster, for long sequence time-series forecasting, in: Proceedings
W. W. Cohen, and D. Metzler, Transformer memory of the AAAI Conference on Artificial Intelligence
as a differentiable search index, Advances in Neural 35(12), 11106 (2021), arXiv: 2012.07436
Information Processing Systems 35, 21831 (2022) 100. I. Beltagy, M. E. Peters, and A. Cohan, Longformer:
86. C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, The long-document transformer, arXiv: 2004.05150
M. Matena, Y. Zhou, W. Li, and P. J. Liu, Exploring (2020)
the limits of transfer learning with a unified text-to- 101. K. Han, Y. Wang, H. Chen, X. Chen, J. Guo, Z. Liu,
text transformer, J. Mach. Learn. Res. 21(1), 5485 Y. Tang, A. Xiao, C. Xu, Y. Xu, Z. Yang, Y. Zhang,
(2020) and D. Tao, A survey on vision transformer, IEEE
87. T. Shin, and Y. Razeghi, R. L. L. IV, E. Wallace, and Trans. Pattern Anal. Mach. Intell. 45(1), 87 (2023)
S. Singh, AutoPrompt: Eliciting knowledge from 102. J. B. Alayrac, J. Donahue, P. Luc, A. Miech, I. Barr,
language models with automatically generated prompts, et al., Flamingo: A visual language model for few-shot
arXiv: 2010.15980 (2020) learning, Advances in Neural Information Processing
88. N. Ding, S. Hu, W. Zhao, Y. Chen, Z. Liu, H. -T. Systems 35, 23716 (2022), arXiv: 2204.14198
Zheng, and M. Sun, Openprompt: An open-source 103. J. Yu, Z. Wang, V. Vasudevan, L. Yeung, M. Seyed-
framework for prompt-learning, arXiv: 2111.01998 hosseini, and Y. Wu, COCA: Contrastive captioners
(2021) are image-text foundation models, arXiv: 2205.01917
89. S. Zhang, S. Roller, N. Goyal, M. Artetxe, M. Chen, S. (2022)
Chen, C. Dewan, M. Diab, X. Li, X. V. Lin, T. 104. X. Liu, C. Gong, L. Wu, S. Zhang, H. Su, and Q. Liu,
Mihaylov, M. Ott, S. Shleifer, K. Shuster, D. Simig, P. Fusedream: Training-free text-to-image generation
S. Koura, A. Sridhar, T. Wang, and L. Zettlemoyer, with improved CLIP+GAN space optimization, arXiv:
OPT: Open pre-trained transformer language models, 2112.01573 (2021)
105. A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. SARSA with linear function approximation, arXiv:
Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. 1902.02234 (2019)
Clark, G. Krueger, and I. Sutskever, Learning transferable 119. C. J. C. H. Watkins and P. Dayan, Q-learning,
visual models from natural language supervision, arXiv: Machine Learning 8(3), 279 (1992)
2103.00020 (2021) 120. P. Abbeel and A. Y. Ng, Apprenticeship learning via
106. L. He, Q. Zhou, X. Li, L. Niu, G. Cheng, X. Li, W. Liu, inverse reinforcement learning, in Proceedings of the
Y. Tong, L. Ma, and L. Zhang, End-to-end video Twenty-First International Conference on Machine
object detection with spatial-temporal transformers, in: Learning, Ser. ICML ’04. New York, NY, USA: Associ-
Proceedings of the 29th ACM International Conference ation for Computing Machinery, 2004
on Multimedia, 2021, pp 1507–1516, arXiv: 2105.10920 121. C. Finn, P. Abbeel, and S. Levine, Model-agnostic
107. X. Zhai, X. Wang, B. Mustafa, A. Steiner, D. Keysers, meta-learning for fast adaptation of deep networks, In
A. Kolesnikov, and L. Beyer, LIT: Zero-shot transfer International conference on machine learning, 2017, pp
with locked-image text tuning, in: Proceedings of the 1126–1135, arXiv: 1703.03400
IEEE/CVF Conference on Computer Vision and 122. C. Fifty, E. Amid, Z. Zhao, T. Yu, R. Anil, and C.
Pattern Recognition, 2022, pp 18123–18133, arXiv: Finn, Efficiently identifying task groupings for multi-
2111.07991 task learning, Advances in Neural Information
108. A. Trockman and J. Z. Kolter, Patches are all you Processing Systems 34, 27503 (2021), arXiv:
need? arXiv: 2201.09792 (2022) 2109.04617
109. A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. 123. N. Anand and D. Precup, Preferential temporal difference
Radford, M. Chen, and I. Sutskever, Zeroshot text-to- learning, arXiv: 2106.06508 (2021)
image generation, in: International Conference on 124. K. Chen, R. Cao, S. James, Y. Li, Y. H. Liu, P.
Machine Learning, 2021, pp 8821–8831, arXiv: Abbeel, and Q. Dou, Sim-to-real 6d object pose estimation
2102.12092 via iterative self-training for robotic bin-picking, in:
110. A. Tewari, J. Thies, B. Mildenhall, P. Srinivasan, E. Computer Vision–ECCV 2022: 17th European Confer-
Tretschk, Y. Wang, C. Lassner, V. Sitzmann, R. ence, Tel Aviv, Israel, October 23–27, 2022, Proceed-
Martin-Brualla, S. Lombardi, T. Simon, C. Theobalt, ings, Part XXXIX (pp 533–550). Cham: Springer
M. Niessner, J. T. Barron, G. Wetzstein, M. Zollhoefer, Nature Switzerland, arXiv: 2204.07049
and V. Golyanik, Advances in neural rendering, 125. V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I.
Computer Graphics Forum 41(2), 703 (2022), arXiv: Antonoglou, D. Wierstra, and M. Riedmiller, Playing
2111.05849 atari with deep reinforcement learning, arXiv:
111. B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. 1312.5602 (2013)
Barron, R. Ramamoorthi, and R. Ng, NERF: Repre- 126. T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T.
senting scenes as neural radiance fields for view Erez, Y. Tassa, D. Silver, and D. Wierstra, Continuous
synthesis, Communications of the ACM 65(1), 99 control with deep reinforcement learning, arXiv:
(2021), arXiv: 2003.08934 1509.02971 (2015)
112. S. Zheng, J. Pan, C. Lu, and G. Gupta, Pointnorm: 127. D. Yarats, D. Brandfonbrener, H. Liu, M. Laskin, P.
Normalization is all you need for point cloud analysis, Abbeel, A. Lazaric, and L. Pinto, Don’t change the
arXiv: 2207.06324 (2022) algorithm, change the data: Exploratory data for
113. H. Ran, J. Liu, and C. Wang, Surface representation offline reinforcement learning, arXiv: 2201.13425
for point clouds, in: Proceedings of the IEEE/CVF (2022)
Conference on Computer Vision and Pattern Recogni- 128. M. Ahn, A. Brohan, N. Brown, Y. Chebotar, O. Cortes,
tion, 2022, pp 18942–18952, arXiv: 2205.05740 et al., Do as I can, not as I say: Grounding language in
114. X. Ma, C. Qin, H. You, H. Ran, and Y. Fu, Rethinking robotic affordances, in: Conference on Robot Learning,
network design and local geometry in point cloud: A 2023, pp 287–318, arXiv: 2204.01691
simple residual MLP framework, arXiv: 2202.07123 129. S. James and P. Abbeel, Coarse-to-fine Q-attention
(2022) with learned path ranking, arXiv: 2204.01571 (2022)
115. Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bron- 130. C. Qi, P. Abbeel, and A. Grover, Imitating, fast and
stein, and J. M. Solomon, Dynamic graph CNN for slow: Robust learning from demonstrations via decision-
learning on point clouds, arXiv: 1801.07829 (2018) time planning, arXiv: 2204.03597 (2022)
116. D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, 131. L. Wang, X. Zhang, K. Yang, L. Yu, C. Li, L. Hong, S.
A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Zhang, Z. Li, Y. Zhong, and J. Zhu, Memory replay
Bolton, Y. Chen, T. Lillicrap, F. Hui, L. Sifre, G. van with data compression for continual learning, arXiv:
den Driessche, T. Graepel, and D. Hassabis, Mastering 2202.06592 (2022)
the game of Go without human knowledge, Nature 132. L. Chen, K. Lu, A. Rajeswaran, K. Lee, A. Grover, M.
550(7676), 354 (2017) Laskin, P. Abbeel, A. Srinivas, and I. Mordatch, Decision
117. E. Zhao, R. Yan, J. Li, K. Li, and J. Xing, Alphahol- transformer: Reinforcement learning via sequence
dem: High-performance artificial intelligence for heads- modeling, Advances in Neural Information Processing
up no-limit poker via end-to-end reinforcement learn- Systems 34, 15084 (2021), arXiv: 2106.01345
ing, in: Proceedings of the AAAI Conference on Artificial 133. J. Parker-Holder, M. Jiang, M. Dennis, M. Samvelyan,
Intelligence 36(4), 4689 (2022) J. Foerster, E. Grefenstette, and T. Rocktäschel,
118. S. Zou, T. Xu, and Y. Liang, Finite-sample analysis for Evolving curricula with regret-based environment
design, in: International Conference on Machine Learn- Nature 567(7747), 209 (2019)
ing, 2022, pp 17473–17498, arXiv: 2203.01302 148. S. Moradi, C. Brandner, C. Spielvogel, D. Krajnc, S.
134. R. Wang, J. Lehman, J. Clune, and K. O. Stanley, Hillmich, R. Wille, W. Drexler, and L. Papp, Clinical
Paired open-ended trailblazer (POET): Endlessly data classification with noisy intermediate scale quantum
generating increasingly complex and diverse learning computers, Sci. Rep. 12(1), 1851 (2022)
environments and their solutions, arXiv: 1901.01753 149. J. Zheng, K. He, J. Zhou, Y. Jin, and C. M. Li,
(2019) Combining reinforcement learning with lin-kernighan-
135. Z. Li, L. Li, Z. Ma, P. Zhang, J. Chen, and J. Zhu, helsgaun algorithm for the traveling salesman problem,
Read: Large-scale neural scene rendering for in: Proceedings of the AAAI Conference on Artificial
autonomous driving, arXiv: 2205.05509 (2022) Intelligence 35(14), 12445 (2021), arXiv: 2012.04461
136. W. Tang, C. J. Ho, and Y. Liu, Bandit learning with 150. Z. Li, Q. Chen, and V. Koltun, Combinatorial opti-
delayed impact of actions, in: Advances in Neural mization with graph convolutional networks and
Information Processing Systems, edited by A. Beygelz- guided tree search, Advances in Neural Information
imer, Y. Dauphin, P. Liang, and J. W. Vaughan, 2021, Processing Systems 31, 2018, arXiv: 1810.10659
arXiv: 1904.01763 151. M. Sundararajan, A. Taly, and Q. Yan, Axiomatic
137. Z. Gao, Y. Han, Z. Ren, and Z. Zhou, Batched multi- attribution for deep networks, in: International Confer-
armed bandits problem, in: Advances in Neural Infor- ence on Machine Learning, 2017, pp 3319–3328, arXiv:
mation Processing Systems, edited by H. Wallach, H. 1703.01365
Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, 152. M. T. Ribeiro, S. Singh, and C. Guestrin, Why Should
and R. Garnett, Curran Associates, Inc., 2019, arXiv: I Trust You? Explaining the predictions of any classi-
1904.01763 fier, in: Proceedings of the 22nd ACM SIGKDD Inter-
138. Y. Yue, J. Broder, R. Kleinberg, and T. Joachims, The national Conference on Knowledge Discovery and Data
k-armed dueling bandits problem, J. Comput. Syst. Sci. Mining, 2016, pp 1135–1144, arXiv: 1602.04938
78(5), 1538 (2012) 153. S. Lundberg and S. I. Lee, A unified approach to inter-
139. A. Carpentier, A. Lazaric, M. Ghavamzadeh, R. preting model predictions, arXiv: 1705.07874 (2017)
Munos, P. Auer, and A. Antos, Upperconfidence- 154. J. Crabbe, Z. Qian, F. Imrie, and M. van der Schaar,
bound algorithms for active learning in multi-armed Explaining latent representations with a corpus of
bandits, in: Algorithmic Learning Theory: 22nd Inter- examples, in: Advances in Neural Information Processing
national Conference, ALT 2011, Espoo, Finland, October Systems, edited by M. Ranzato, A. Beygelzimer, Y.
5–7, 2011. Proceedings 22 (pp 189–203), Springer Dauphin, P. Liang, and J. W. Vaughan, Curran Asso-
Berlin Heidelberg, arXiv: 1507.04523 ciates, Inc., 2021, pp 12154–12166, arXiv: 2110.15355
140. W. Ye, S. Liu, T. Kurutach, P. Abbeel, and Y. Gao, 155. J. T. Springenberg, A. Dosovitskiy, T. Brox, and M.
Mastering Atari games with limited data, Advances in Riedmiller, Striving for simplicity: The all convolutional
Neural Information Processing Systems 34, 25476 net, arXiv: 1412.6806 (2014)
(2021), arXiv: 2111.00210 156. R. Ying, D. Bourgeois, J. You, M. Zitnik, and J.
141. M. Samvelyan, T. Rashid, C. Schroeder de Witt, G. Leskovec, Gnnexplainer: Generating explanations for
Farquhar, N. Nardelli, T. G. J. Rudner, C. M. Hung, graph neural networks, arXiv: 1903.03894 (2019)
P. H. S. Torr, J. Foerster, and S. Whiteson, The StarCraft 157. H. Yuan, H. Yu, J. Wang, K. Li, and S. Ji, On
multi-agent challenge, in: Proceedings of the 18th explainability of graph neural networks via subgraph
International Conference on Autonomous Agents and explorations, in: International Conference on Machine
MultiAgent Systems, 2019, arXiv: 1902.04043 Learning, 2021, pp 12241–12252, arXiv: 2102.05152
142. T. Wang, T. Gupta, A. Mahajan, B. Peng, S. White- 158. Q. Huang, M. Yamada, Y. Tian, D. Singh, D. Yin, and
son, and C. Zhang, Rode: Learning roles to decompose Y. Chang, GraphLIME: Local interpretable model
multi-agent tasks, arXiv: 2010.01523 (2020) explanations for graph neural networks, IEEE Trans-
143. O. Vinyals, I. Babuschkin, W. M. Czarnecki, M. Math- actions on Knowledge and Data Engineering, 35(7),
ieu, A. Dudzik, et al., Grandmaster level in StarCraft 6968 (2023), arXiv: 2001.06216
II using multi-agent reinforcement learning, Nature 159. H. Yuan, H. Yu, S. Gui, and S. Ji, Explainability in
575(7782), 350 (2019) graph neural networks: A taxonomic survey, IEEE
144. W. Du and S. Ding, A survey on multi-agent deep Transactions on Pattern Analysis and Machine Intelli-
reinforcement learning: From the perspective of chal- gence 45(5), 5782 (2023), arXiv: 2012.15445
lenges and applications, Artif. Intell. Rev. 54(5), 3215 160. G. Katz, C. Barrett, D. Dill, K. Julian, and M.
(2021) Kochenderfer, ReLUPlex: An efficient smt solver for
145. J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. verifying deep neural networks, in: Computer Aided
Wiebe, and S. Lloyd, Quantum machine learning, Verification: 29th International Conference, CAV 2017,
Nature 549, 195 (2017) Heidelberg, Germany, July 24–28, 2017, Proceedings,
146. Y. Liu, S. Arunachalam, and K. Temme, A rigorous Part I 30, pp 97–117. Springer International Publishing,
and robust quantum speed-up in supervised machine arXiv: 1702.01135
learning, Nat. Phys. 17(9), 1013 (2021) 161. S. Wang, H. Zhang, K. Xu, X. Lin, S. Jana, C. J.
147. V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, Hsieh, and J. Z. Kolter, Beta-CROWN: Efficient
A. Kandala, J. M. Chow, and J. M. Gambetta, Supervised bound propagation with per-neuron split constraints
learning with quantum-enhanced feature spaces, for complete and incomplete neural network verifica-
deep-learning models for high-precision atom segmen- Canova, Y. S. Ranawat, D. Z. Gao, P. Rinke, and A. S.
tation, localization, denoising, and deblurring of atomic- Foster, Dscribe: Library of descriptors for machine
resolution images, Sci. Rep. 11(1), 5386 (2021) learning in materials science, Comput. Phys. Commun.
195. L. Han, H. Cheng, W. Liu, H. Li, P. Ou, R. Lin, H.-T. 247, 106949 (2020)
Wang, C.-W. Pao, A. R. Head, C.-H. Wang, X. Tong, 212. W. Hu, M. Fey, M. Zitnik, Y. Dong, H. Ren, B. Liu,
C.-J. Sun, W.-F. Pong, J. Luo, J.-C. Zheng, and H. L. M. Catasta, and J. Leskovec, Open graph benchmark:
Xin, A single-atom library for guided monometallic Datasets for machine learning on graphs, Advances in
and concentration-complex multimetallic designs, Nat. neural information processing systems 33, 22118 (2020),
Mater. 21, 681 (2022) arXiv: 2005.00687