
Chemie Ingenieur Technik
Review Article

Machine Learning in Chemical Engineering: A Perspective
Artur M. Schweidtmann1,2,*, Erik Esche3, Asja Fischer4, Marius Kloft5, Jens-Uwe Repke3,
Sebastian Sager6, and Alexander Mitsos2,7,8
DOI: 10.1002/cite.202100083
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any
medium, provided the original work is properly cited.

The transformation of the chemical industry to renewable energy and feedstock supply requires new paradigms for the design of flexible plants, (bio-)catalysts, and functional materials. Recent breakthroughs in machine learning (ML) provide unique opportunities, but only joint interdisciplinary research between the ML and chemical engineering (CE) communities will unfold the full potential. We identify six challenges that will open new methods for CE and formulate new types of problems for ML: (1) optimal decision making, (2) introducing and enforcing physics in ML, (3) information and knowledge representation, (4) heterogeneity of data, (5) safety and trust in ML applications, and (6) creativity. Under the umbrella of these challenges, we discuss perspectives for future interdisciplinary research that will enable the transformation of CE.
Keywords: Deep learning, Hybrid modeling, Machine learning, Optimization, Reinforcement learning
Received: May 28, 2021; revised: August 25, 2021; accepted: October 12, 2021

1 Prof. Artur M. Schweidtmann ([email protected]), Delft University of Technology, Department of Chemical Engineering, Van der Maasweg 9, 2629 HZ Delft, The Netherlands.
2 Prof. Artur M. Schweidtmann, Prof. Alexander Mitsos, Ph.D., RWTH Aachen University, Aachener Verfahrenstechnik, Forckenbeckstr. 51, 52074 Aachen, Germany.
3 Dr. Erik Esche, Prof. Dr. Jens-Uwe Repke, Technische Universität Berlin, Fachgebiet Dynamik und Betrieb technischer Anlagen, Straße des 17. Juni 135, 10623 Berlin, Germany.
4 Prof. Dr. Asja Fischer, Ruhr-Universität Bochum, Department of Mathematics, Universitätsstraße 150, 44801 Bochum, Germany.
5 Prof. Dr. Marius Kloft, Technische Universität Kaiserslautern, Department of Computer Science, Erwin-Schrödinger-Straße 52, 67663 Kaiserslautern, Germany.
6 Prof. Dr. Sebastian Sager, Otto-von-Guericke-Universität Magdeburg, Department of Mathematics, Universitätsplatz 2, 39106 Magdeburg, Germany.
7 Prof. Alexander Mitsos, Ph.D., JARA Center for Simulation and Data Science (CSD), Aachen, Germany.
8 Prof. Alexander Mitsos, Ph.D., Forschungszentrum Jülich, Institute for Energy and Climate Research IEK-10 Energy Systems Engineering, Wilhelm-Johnen-Straße, 52428 Jülich, Germany.

1 Introduction

The chemical industry must convert to using renewable energy and feedstock supply; otherwise, chemical production might become the largest driver of global oil consumption by 2030 [1–4]. However, renewable resources fluctuate over time and space, requiring dynamic operation and a new paradigm for identifying new process routes and the design of flexible plants [3]. At the same time, chemical companies are facing increased competition and must ensure optimal operation and short development cycles for new processes. Facilitating this radical change poses difficulties, as conventional methods for process synthesis and operation may not be sufficient. To make optimal decisions in complex environments, models are conventionally developed based on mechanistic understanding and then optimized. However, the development of physicochemical models is expensive, and many phenomena cannot be fully described by computationally tractable models.

Machine learning (ML) has the potential to overcome the limitations of mechanistic modeling, as ML methods can learn complex behaviors, model development is cheaper, and ML models can be advantageous for optimization [5, 6]. Chemical engineering (CE) already experienced two big waves of ML applications between the 1980s and 2008, i.e., expert systems and (shallow) artificial neural networks (ANNs) (c.f. [7, 8]). These waves had limited impact for several reasons [7]: i) lack of data, data accessibility, computation power, and programming environments/paradigms, and ii) competing successful emerging technologies for CE, in particular mechanistic modeling, optimization, and model predictive control.


Today, we have cheap and powerful computing, easy-to-use programming environments (e.g., Python & TensorFlow), and a large open-source community in ML. At the same time, ML has seen a surge in automatic feature learning by deep ANNs [9, 10]. This development, together with advances in hardware – most importantly GPU computing – led to breakthrough results in image recognition [11], especially when based on convolutional neural networks (CNNs) [7], and in game playing [12–14], and generally to a technology push in ML [15].

CE currently undergoes a transformation towards digitization and full automation of industry and research. This leads to an ever-increasing availability of data and the need for automated optimal decision-making based on data, allowing for more sustainable process operations [16]. We thus have a technology-push and industry-pull situation, where ML opens up new possibilities to overcome pressing challenges in CE [7]. In this perspective, we first review ML methods already established in CE (Sect. 2). Then, we identify six emerging ML challenges with great potential in CE (Sect. 3).

2 Established Machine Learning Methods in Chemical Engineering

ML is a subclass of artificial intelligence (AI). ML has roots in computer science and mathematics and gives computers the ability to learn from data without being explicitly programmed. ML is broadly classified into supervised learning and unsupervised learning [17]. Other types of ML are reinforcement learning (RL) as well as hybrids such as semi-supervised learning.

First applications of classical AI in CE were proposed in the 1980s with the advent of expert systems, e.g., for thermophysical properties [18] and catalyst design [19]. They did not achieve breakthroughs, mainly because implementation, training, and maintenance were costly and time-consuming [7] (c.f. Sect. 1). As ML theory, computer hardware, and programming languages advanced, ML was applied to experimental and simulated data to extract information, recognize patterns, and make predictions [20]. Overall, ML methods for process monitoring, fault detection, and soft sensing are mostly mature and commercially available in CE.

2.1 Unsupervised Learning

Unsupervised learning describes the collection of techniques that investigate "unlabeled data", i.e., data with no explicit input-output connection. The main purpose is to find hidden structure in data, e.g., for clustering, feature extraction, compression, or anomaly detection. Unsupervised learning is popular and attractive from a practical point of view, as input-output connections are oftentimes unavailable in applications.

In CE, process monitoring and fault detection have seen many applications over the past decades, leading to commercial tools and industrial applications. Process monitoring is mainly based on classical principal component analysis (PCA) [21], while some researchers have investigated independent component analysis for non-Gaussian processes [22] and kernel density estimation for applications with unknown distributions, e.g., for data smoothing [23]. Further advances are monitoring platforms, e.g., using self-organizing maps for a wastewater treatment plant [24] and Gaussian mixture models for the Tennessee Eastman process [25]. Fault detection is another common application of unsupervised learning to process data. In the previous literature, variations of PCA have been used frequently for fault detection [26–28]. Furthermore, other advanced methods have been applied to distinguish between normal and faulty batches (e.g., support vector data description [29] and k-means clustering [30]). Today, first fault detection tools are commercially available for the process industry.

2.2 Supervised Learning

Supervised learning methods train a model on labeled data with an explicit input-output structure and learn functions mapping an input to an output. Regression is a supervised ML tool that is part of the standard repertoire in process systems engineering (PSE) and has long been used for modeling and subsequent optimal design of processes. Regarding soft sensor applications, i.e., online prediction of process qualities, a large variety of methods has been used, including partial least squares [31–33], principal component regression [34, 35], support vector machines (SVM) [36], ANNs [37], and Gaussian process (GP) regression [38–40]. Applications of these include complex large-scale processes such as air separation units [32], injection molding [35], reverse osmosis of seawater [33], and further chemical production processes [37, 38, 41].

Supervised learning has long been used for dynamic systems in operations and control. A wide range of models is applied to describe dynamic processes based on data in discrete-time and continuous-time approaches [42]. There are state-space models, Hammerstein-Wiener models, scale-bridging surrogate models, linear autoregressive integrated moving average models with exogenous inputs (ARIMAX), and nonlinear ARMAX models (NARMAX) [43]. Identification of these models is well established. However, they are limited to Markovian systems, where the current state completely describes the system, i.e., effects of hysteresis cannot be described. In ML, recurrent neural networks (RNNs) have been introduced to include non-Markovian effects. For example, [44] use an RNN to learn the policy for operating a batch bioprocess. In cases where large data sets are present and long-term dependencies are relevant, training of standard RNNs suffers from vanishing gradients, and gated recurrent neural networks like long short-term memory (LSTM) architectures are suitable [45].
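As an illustration of such a gated recurrent model, the following minimal sketch (not taken from the cited works; the data, window length, and architecture are arbitrary assumptions) trains an LSTM that maps a window of past process measurements to the next value of a quality variable:

```python
# Minimal sketch (synthetic data, assumed architecture): an LSTM as a data-driven
# dynamic model that can capture non-Markovian effects in process time series.
import numpy as np
import tensorflow as tf

# synthetic multivariate time series: 3 measured inputs, 1 quality variable to predict
rng = np.random.default_rng(0)
series = rng.normal(size=(5000, 3)).cumsum(axis=0)
target = series[:, 0] * 0.5 + np.roll(series[:, 1], 20) * 0.3   # toy lagged dependency

window = 50
X = np.stack([series[i:i + window] for i in range(len(series) - window)])
y = target[window:]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(window, 3)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=64, verbose=0)   # short run; illustrative only

print(model.predict(X[:1], verbose=0))                # one-step-ahead prediction
```

In practice, the window length and network size would be tuned, and the data would come from plant historians or simulations rather than a synthetic series.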

3 Emerging Machine Learning Challenges in Chemical Engineering

Beyond the previously summarized topics deeply rooted in data mining and analytics, we identify six emerging challenges of ML with a large potential for CE [46].

3.1 Optimal Decision Making

Optimal decision making is a prominent topic in CE, for process synthesis and control as well as for solvent, catalyst, or adsorbent selection. All these decisions need to be made based on existing information, which can be in the form of data and mechanistic knowledge, e.g., models. As shown in Fig. 1, optimal decision making based on data can be done by training data-driven or hybrid models and subsequent optimization with the data-driven models embedded [5, 6].

Figure 1. Illustration of the data-driven modeling and optimization approach [5, 6].

Within PSE, various data-driven models have been used for regression and subsequent process optimization. As illustrated in Fig. 2, the complexity of applied data-driven models ranges from linear approximations to deep ANNs. For a long time, the literature focused on linear models to approximate simulation and experimental data [47]. Since the 1990s, shallow ANNs have been used extensively. Shallow ANNs can theoretically approximate any nonlinear, smooth function on a training data set to any given positive accuracy, provided a sufficient number of neurons [48]. In many PSE applications, ANNs are fitted to complete processes (black-box approach, e.g., [49]), or ANNs are combined with mechanistic model equations (hybrid modeling approach, e.g., [50–52]). Subsequently, the obtained process models can be optimized, e.g., to identify a process design [53]. Today, the re-emergence of ML is mostly driven by deep ANNs and big data [15]. Deep ANNs are believed to become more important in PSE because abundant data becomes available, e.g., through smart manufacturing, high-throughput experiments, and simulation studies [16]. However, applications of deep ANNs are still limited in process design in PSE [8].

Figure 2. Overview of data-driven models embedded in optimization problems in CE: linear [47], convex region linear surrogate model [57], nonlinear basis functions, e.g., ALAMO [56], piecewise polynomial function [59], spline function [60], Gaussian process [54, 55], support vector machine, ensemble tree model (e.g., random forest, gradient boosted trees [58, 64]), ANNs with ReLU activation [61–63], and other ANNs with more complex activation functions [6, 54]. Note that the models are ordered by their estimated ability to learn complex dependencies.

While optimization problems with linear models can be solved globally on a large scale, e.g., for structural optimization [47], linear models cannot learn high-dimensional nonlinear problems accurately. On the other hand, the consideration of more complex data-driven models like ANNs and GPs, which could reflect high-dimensional nonlinear problems better, has long been limited to local or stochastic solution approaches [51–55]. Although some tailor-made data-driven models can be solved using state-of-the-art global solvers, these are also limited to low-dimensional problems [56, 57]. A few researchers in CE and ML have developed tailored optimization approaches for problems with ML models embedded. Mistry et al. [58] proposed a tailored algorithm for problems with gradient-boosted trees embedded. Grimstad and coworkers proposed algorithms for the optimization of piecewise polynomial functions [59] and spline functions [60]. Some previous works also used general-purpose global solvers to solve optimization problems with complex surrogate models embedded [54, 55] but observed high computational burdens. Recently, Schweidtmann and Mitsos [6] proposed an efficient reduced-space formulation for the global optimization of deep ANNs. Notably, ANNs with rectified linear unit (ReLU) activations have recently been reformulated as mixed-integer linear programs (MILPs) [61–63]. In the MILP formulations, binary variables are introduced to divide the domain of the piecewise linear ReLU activation functions into two linear sub-domains. Similarly, tree models can be reformulated as MILPs [58, 64, 65]. However, the number of integer variables and constraints grows linearly with the model complexity (e.g., the number of nodes in the ANN).
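To make the MILP idea concrete, the following minimal sketch encodes a single ReLU neuron a = max(0, w'x + b) with the standard big-M constraints; the weights, bounds, big-M value, and solver choice are illustrative assumptions and not the formulations of the cited works. In a full network, one such constraint block is written per neuron, which is why the number of binaries grows with the network size.

```python
# Minimal sketch: big-M MILP encoding of one ReLU neuron, a = max(0, w^T x + b).
# Assumes the pre-activation z is bounded in [-M, M]; solver name is illustrative.
import pyomo.environ as pyo

w, b, M = [1.0, -2.0], 0.5, 100.0               # toy weights, bias, big-M bound (assumed)

m = pyo.ConcreteModel()
m.x = pyo.Var(range(2), bounds=(-10, 10))       # network inputs (degrees of freedom)
m.z = pyo.Var(bounds=(-M, M))                   # pre-activation
m.a = pyo.Var(bounds=(0, M))                    # post-activation, a = max(0, z)
m.s = pyo.Var(domain=pyo.Binary)                # sigma = 1 on the active (z >= 0) branch

m.pre = pyo.Constraint(expr=m.z == sum(w[i] * m.x[i] for i in range(2)) + b)
m.c1 = pyo.Constraint(expr=m.a >= m.z)                   # a >= z
m.c2 = pyo.Constraint(expr=m.a <= m.z + M * (1 - m.s))   # a <= z + M(1 - sigma)
m.c3 = pyo.Constraint(expr=m.a <= M * m.s)               # a <= M*sigma (a >= 0 via bounds)

m.obj = pyo.Objective(expr=m.a, sense=pyo.minimize)      # e.g., minimize the neuron output
# pyo.SolverFactory("glpk").solve(m)  # any MILP solver; call left commented as a sketch
```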


Overall, optimal decision-making based on data has already seen some work on optimal process synthesis and optimal process operation. However, there are still severe limitations in integrating learning and optimization frameworks that exhibit complex ML models.

Promising future research in optimal decision making includes ML-assisted/embedded optimization and ML-assisted control. In both areas, the complexity of real-life processes and the inclusion of non-Markovian effects promise to uncover insufficiencies of ML methods, to nurture new developments, and to open promising new avenues of research in both fields. This area has excellent synergy potential with the other five.

3.2 Introducing and Enforcing Physics in Machine Learning

The physicality of ML models is a frequently called-for development in CE. Across many disciplines, supervised learning is directly applied in a black-box approach. However, black-box approaches have severe drawbacks in interpretability, extrapolation, data demand, and reliability. These drawbacks can limit applications of ML and lead to fatal errors when ML is applied in industry without the necessary checks. On the other hand, mechanistic, physicochemical models can provide structural knowledge that can be combined with data-driven models.

The combination of mechanistic and data-driven models is called hybrid (semi-parametric) modeling. It promises advantages such as better interpretability, enhanced extrapolation properties, and higher prediction accuracy [66–68]. Fig. 3 illustrates the enhanced extrapolation properties of hybrid models. In the illustrative example, some training data points are distributed on low-dimensional manifolds that are illustrated by dashed lines. In this case, a standard black-box approach (e.g., an ANN with two inputs and one output) should not be evaluated outside the manifolds of the training data points. In the illustrative case, a mechanistic model structure g(f1(x1), f2(x2)) is known a priori, where f1 and f2 are unknown functions and g is a known mechanistic model. This allows building a hybrid model that can be evaluated outside the initial training data manifold, because each black-box model has only one single input and is thus evaluated within its training data range. Consequently, the hybrid model structure avoids extrapolation of the black-box model parts. Hybrid modeling has had numerous applications in CE and biotechnology since the early 1990s, e.g., in process [69, 70] and reactor modeling [71], polymerization [72], crystallization, distillation, and drying processes [73], and process control [74, 75]. Also, many empirical constitutive equations in CE can be interpreted as simple data-driven parts in a hybrid model. Hybrid modeling has strong theoretical foundations within the CE community and is believed to gain importance within and beyond CE: Fiedler and Schuppert [76] and Kahrs and Marquardt [50] provided fundamental insight into the identification of hybrid models and their extrapolation properties. Furthermore, Kahrs and Marquardt [77] developed methods for determining a valid input domain for hybrid models.

The ML community has identified the need for the incorporation of a priori knowledge for many applications. For instance, researchers in the ML community incorporate prior knowledge as a penalty term in the training and thus enforce physics-informed ANNs [78]. A few works also aim to extract physical knowledge from data or data-driven models. Symbolic regression was used to identify physical laws from kinematic data [79]. Interestingly, advanced optimization formulations for symbolic regression have been developed [80] and applied in CE [81].

Overall, CE has a tremendous record in physicochemical modeling and formulating predictive models. Exploiting these capabilities for hybrid modeling is promising to ensure interpretability, extrapolation, reliability, and trust of ML models. At the same time, CE has a strong foothold in (global) optimization of constrained mixed-integer nonlinear problems. Bringing these concepts to the training of hybrid ML methods should reap further profit.

Figure 3. Comparison of the hybrid model structure and the black-box modeling approach, assuming data on a low-dimensional manifold. The dashed lines represent the manifold of the training data points. The mechanistic model g is known a priori. The figure is adapted from a lecture of Andreas Schuppert on hybrid modeling.
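The following minimal sketch illustrates the serial hybrid idea on an invented example (toy data, an assumed Arrhenius-type dependency, and arbitrary model sizes): a small ANN learns an unknown rate "constant" as a function of temperature only, while a known steady-state CSTR relation handles the residence-time dependence, so the black-box part is never evaluated outside its one-dimensional training range.

```python
# Minimal sketch (illustrative, not from the cited works): serial hybrid model
# X = g(f(T), tau), where g is a known first-order CSTR conversion relation and
# f(T) is a small data-driven model for the unknown rate "constant".
import numpy as np
from sklearn.neural_network import MLPRegressor

def g(k, tau):
    """Known mechanistic part: steady-state first-order CSTR conversion."""
    return k * tau / (1.0 + k * tau)

# synthetic "measurements": conversion X at various temperatures T and residence times tau
rng = np.random.default_rng(0)
T = rng.uniform(300, 400, 200)
tau = rng.uniform(1.0, 10.0, 200)
k_true = 1e5 * np.exp(-4000.0 / T)                      # hidden Arrhenius behavior (assumed)
X_meas = np.clip(g(k_true, tau) + rng.normal(0, 0.01, 200), 1e-3, 0.999)

# Back out a target for the data-driven part from the mechanistic relation
# (k = X / (tau * (1 - X))) and fit a small ANN mapping T -> k.
k_target = X_meas / (tau * (1.0 - X_meas))
f = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
f.fit(T.reshape(-1, 1), k_target)

# Hybrid prediction: the ANN sees only T (within its training range); the physics
# handles tau, so the black-box part is not extrapolated even for unseen tau.
X_pred = g(f.predict(np.array([[350.0]]))[0], 20.0)     # tau = 20 lies outside training data
print(X_pred)
```

Because the residence time enters only through the mechanistic part, the prediction at tau = 20 (outside the training range of tau) remains structurally sensible, which is exactly the extrapolation benefit sketched in Fig. 3.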


Hybrid models that combine data-driven and mechanistic models can avoid extrapolation (c.f. the earlier discussion of Fig. 3) and are essential for many CE applications. Furthermore, they improve the explainability of ML models, which matches a current trend in the ML community. In other words, hybrid models are often more explainable compared to a black-box model. For example, intermediate variables in hybrid models commonly have a physical meaning that can facilitate explainability of the predictions. In addition, there is a smooth transition between hybrid modeling and physically motivated ML model architectures. Including physical knowledge in the ML architectures has the potential to enhance generalization and explainability of ML models. For this, CE can build on results from the early 1990s that are today mostly not recognized by the ML community. Most of the previous hybrid modeling efforts can be understood as a top-down approach, where a hybrid modeling structure is dictated by the physical and chemical understanding of the system. However, we know from the analogy to expert systems that this top-down approach requires system expertise and leads to high maintenance. Deep learning is successful because it allows for bottom-up (or end-to-end) data utilization. Using the structural information contained in process flowsheets will generate new training schemes for hybrid models and at the same time ensure increased physicality.

3.3 Information and Knowledge Representation

The significant surge of ML applications in social media platforms, online shopping, and video-on-demand services heavily relies on vast amounts of structured data. In CE, however, only a tiny fraction of knowledge and information is accessible for ML methods, while the majority is only available in analog or non-standardized digital form. Currently, ML techniques commonly process data from computations, sensors, and measurements. However, molecular data, process flow charts, P&IDs, publications, lab books, etc. are often not accessible to standard ML techniques. This is a major hurdle for finding and exploiting more complex relationships by ML techniques.

Information extraction is the process of (semi-)automated retrieval of structured information from unstructured data sources [82]. For example, natural language processing (NLP) algorithms can recognize entities in unstructured text and extract their relations [83]. Although transformer-based language models have recently demonstrated great advances in NLP [84], automated named entity recognition and relation extraction are still challenging tasks that require future research. In addition, figures and tables provide valuable information. Extracting information from tables is domain-independent, and multiple tools exist for this task [85]. However, the extraction of information from figures is often domain-specific and requires joint research efforts [86].

Semantic web technologies connect knowledge and data by using graphs as a unified data model [87]. In particular, knowledge graphs combine data with ontologies, i.e., semantic data models [88]. Currently, there exist only a few chemical engineering ontologies (e.g., ONTOCAPE [89]) and knowledge graphs (e.g., the J-Park Simulator [90]).

In the future, finding new representations for the information and knowledge of CE will allow for further analysis, new information and knowledge, and subsequent use, e.g., for optimal decision making. This crosscutting field is hence of great importance to the overall success of ML in CE. In the future, CE data will be extracted from scientific literature and other CE data sources. Moreover, we believe that it will be structured through ontologies and will be saved in knowledge graphs. Using knowledge graph embeddings or other representations allows for automated learning of information [91]. At the same time, the ML methods that work on these specialized knowledge representations need to be tailored to the applications, requiring research in both ML and CE. Handling and representing this highly heterogeneous, noisy, and sometimes scarce data is challenging and a key issue that should be addressed in the future; it opens huge potentials in CE but requires both ML know-how and domain-specific insight from CE.
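As a minimal illustration of such graph-based knowledge representation, the sketch below encodes a toy piece of flowsheet knowledge as RDF triples with rdflib; the namespace, class, and property names are invented stand-ins and not terms from ONTOCAPE or the J-Park Simulator.

```python
# Minimal sketch (hypothetical vocabulary): flowsheet knowledge as RDF triples
# that can later be queried, linked to other data, or embedded for ML.
from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.org/plant#")      # assumed, illustrative namespace

g = Graph()
g.bind("ex", EX)
g.add((EX.R101, RDF.type, EX.Reactor))
g.add((EX.C201, RDF.type, EX.DistillationColumn))
g.add((EX.R101, EX.hasOutletStream, EX.S5))
g.add((EX.C201, EX.hasInletStream, EX.S5))
g.add((EX.S5, EX.temperatureK, Literal(351.2)))

print(g.serialize(format="turtle"))               # human- and machine-readable knowledge
```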
3.4 Heterogeneity of Data

Heterogeneity of data in CE has many sources (e.g., lab books, measurements, property data, molecular simulations, publications, simulation files), and processing heterogeneous data is a major hurdle in CE. Heterogeneity in CE stems, e.g., from (1) multiple scales in time and space (e.g., ms to months in control and scheduling, or nm to m in pore diffusion and pressure swing adsorption), (2) a variety of data sources, which need to be combined to understand chemical processes (e.g., process data, alarms, property data, equipment specifications), and (3) highly different data frequencies (e.g., continuous measurement data appears once per ms, while quality data is gathered every other hour or day). All of this is exacerbated by the frequently high dimensionality of data sets [20].

Unsupervised machine learning has emerged in CE for treating high-dimensional problems and performing dimensionality reduction. Recently, an outlier detection algorithm identified strategic molecules for circular supply chains within the "network of organic chemistry" [92, 93] with roughly one million reactions [94]. Breaking down the dimensionality makes huge networks accessible to reaction pathway optimization methods [95–97]. Furthermore, PCA is, for example, applied [98] to design features from a set of molecular descriptors for solvent selection.

Considering increasingly high-dimensional data sets, manifold learning has become ever more important. Among others, Aimin et al. [99, 100] use manifold learning as the basis for soft-sensor developments for a fermentation process and a debutanizer column.
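A minimal sketch of such unsupervised dimensionality reduction is given below (with synthetic, assumed data); in a real study, the rows would be molecules or process samples and the columns computed descriptors or plant measurements.

```python
# Minimal sketch (synthetic data): compressing a high-dimensional descriptor block
# into a few latent features with PCA before building a downstream model.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.random.default_rng(1).normal(size=(500, 120))   # 500 samples x 120 descriptors (toy)

Z = StandardScaler().fit_transform(X)                   # descriptors live on different scales
pca = PCA(n_components=0.95)                            # keep 95 % of the variance
features = pca.fit_transform(Z)

print(features.shape, pca.explained_variance_ratio_[:5])
```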


To tackle highly different data frequencies, or data which is erroneous or incomplete, [20] suggest using semi-supervised learning. It can be employed in case of a mismatch between input and output data, e.g., when a data basis consists of labeled and unlabeled data. A small amount of labeled data can be augmented by larger unlabeled data sets [101]. First preliminary applications in CE use semi-supervised learning to predict missing data for a soft sensor in penicillin production [102].

To describe heterogeneous data, specialized representations of CE information and knowledge are required. For a long time, researchers have designed manual features to describe data. For example, molecules can be described through molecular counts or group contribution methods [103, 104]. However, this manual feature design requires expert knowledge and can lead to a model bias. A promising solution is end-to-end learning, where gradient-based learning is applied to a complete system, from the information representation to the output [15]. This has led to breakthrough results in many complex applications, including self-driving cars [105] and speech recognition [106]. Recently, molecules and crystals have been represented as graphs and processed by specialized ML algorithms for end-to-end learning [107, 108]. Graph neural networks (GNNs) directly operate on the graph structure and have shown promising results for predicting structure-property relationships [109, 110]. Through graph convolutions, GNNs can learn optimal molecular representations and map these representations to the physicochemical properties. As illustrated in Fig. 4, the end-to-end learning approach eliminates the need for manual feature selection. Recent work applied GNNs to predict quantitative structure-activity and property relationships, e.g., octanol solubility, aqueous solubility, melting point, and toxicity [110, 111]. Further, Xie and Grossman [112] represent crystal structures by a crystal graph that encodes atomic information and bonding interactions for the prediction of target properties. In addition to these applications, GNNs have also been extended to recognize higher-order features from graphs [113]. There have also been some first efforts to represent reaction networks [94, 114] and flowsheets [115, 116] as graphs and apply ML to this data. Venkatasubramanian [7] identified information representation as a promising building block to further advance the field of CE.

Figure 4. Illustration of the concept of end-to-end learning in comparison to manual feature extraction.
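The sketch below illustrates a single graph-convolution step of the kind such GNNs stack before pooling node embeddings into a molecular fingerprint; the tiny graph, one-hot features, and untrained random weights are assumptions for illustration only.

```python
# Minimal sketch (toy graph, untrained weights): one graph-convolution step and
# sum pooling, the building blocks of GNN-based property prediction.
import numpy as np

# ethanol heavy-atom graph C-C-O; one-hot node features [C, O]
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)            # adjacency matrix
X = np.array([[1, 0],
              [1, 0],
              [0, 1]], dtype=float)               # node features

A_hat = A + np.eye(3)                             # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
W = np.random.default_rng(0).normal(size=(2, 8))  # untrained layer weights (assumption)

H = np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W)  # one convolution + ReLU
fingerprint = H.sum(axis=0)                       # pool node embeddings to a molecule vector
print(fingerprint.shape)                          # (8,) -> input to a property-prediction head
```

Stacking several such layers with trained weights and feeding the pooled vector into a regression head yields end-to-end property models of the kind cited above.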
In the future, data of different length and time scales will be combined through ML. For example, simulation and experimental data will be integrated. Beyond this heterogeneity in continuous data, completely different types of data sources will also be integrated. We also expect advanced novel training procedures to construct ML models based on heterogeneous data. For example, recent work predicts experimental procedures from chemical reactions using a transformer language model [117]. Regarding the introduction of data from various scales in space, we expect new modeling paradigms, which automatically rank and subsequently filter the influence of phenomena at multiple scales. The targeted analysis of scientific publications regarding, e.g., thermodynamic information will allow for the use of historic data in new applications. We envision an automatic identification of similar data sets within a database and subsequent domain adaptation; this necessitates advances in transfer learning and domain adaptation. Vice versa, there is a lot of heuristic knowledge in CE on small-scale phenomena which have tremendous effects at larger scales, e.g., capillary forces and their ramifications for membranes and filters. Formalizing this knowledge and turning it into modeling paradigms for multiscale problems should also be beneficial.

3.5 Safety and Trust in Machine Learning Applications

In CE, a failure could amount to a runaway reaction causing damage to equipment. Safety and trust in ML applications are related to the call for the introduction of physical laws into ML techniques but go well beyond it. It is well known that the extrapolation capacity of data-driven models beyond their initial training domain is limited [77, 99, 118]. Thus, models describing the validity domain of data-driven models are desired [77, 99, 118]. However, when training data-driven models on industry data, defining and modeling the validity domain is a major issue [99]. Similar issues can arise when applying GNNs to molecular property prediction [119, 120] or when applying RL to control processes [44, 121]. Overcoming this hurdle is a relevant issue where ML and chemical engineering together can generate considerable added value in terms of research and which would pave the way for new applications.
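As a minimal illustration (a simple distance-based heuristic, not one of the cited validity-domain methods), the sketch below flags query points that lie far from the training data of a surrogate model before its prediction is trusted.

```python
# Minimal sketch (assumed data, heuristic threshold): flagging extrapolation by
# checking the distance of a query to the training data of a data-driven model.
import numpy as np
from sklearn.neighbors import NearestNeighbors

X_train = np.random.default_rng(2).uniform(0.0, 1.0, size=(300, 4))   # assumed training inputs

nn = NearestNeighbors(n_neighbors=5).fit(X_train)
d_train, _ = nn.kneighbors(X_train)
threshold = np.quantile(d_train[:, -1], 0.99)      # typical in-domain neighborhood radius

def in_validity_domain(x_query):
    """Return True if the query is close enough to the training data."""
    d, _ = nn.kneighbors(np.atleast_2d(x_query))
    return bool(d[:, -1] <= threshold)

print(in_validity_domain([0.5, 0.5, 0.5, 0.5]))    # inside the training box -> True
print(in_validity_domain([3.0, 3.0, 3.0, 3.0]))    # far outside -> False
```

More rigorous alternatives include convex-hull checks, one-class classifiers, or the predictive variance of a Gaussian process.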
RL is well known for its application in game playing, where an agent automatically determines actions that maximize an expected reward [122].


RL has seen some first and promising applications in chemical engineering for control [44, 121] and scheduling [8]. However, all these initial attempts were purely simulation-based, because RL is trial-and-error-based, meaning that experiments can fail. Finally, the field of ML is getting more interested in "explainability" and "interpretability". The usage of black-box models has proven to be a trust issue, and more transparent architectures are coming into focus.

In the future, research needs to focus on the safety and trustworthiness of ML methods, going beyond just ensuring the physicality of ML models. To ensure safety in CE applications, the interpretability of decisions, the provable robustness of models, and the quantification of uncertainty are crucial avenues for future research. In process automation and control, causality- and control-based approaches ensure safety. Integrating these with the promising advances in RL could generate novel methods of use for multiple fields. Similarly, there has been a lot of work on uncertainty quantification, both for physicochemical models in CE and for data-driven ML models. These should be brought together for proper descriptions of uncertainty in hybrid models and, of course, for sound design and operation of CE applications under uncertainty.

3.6 Creativity

Creativity is a feature that ML has become quite famous for. Some examples are generating new texts, new sounds, and new images [123]. In CE, a lot of effort using non-ML techniques is currently going into inverse problems, e.g., finding a new catalyst or a novel solvent for a given application [124]. Also, deriving new process structures or new control structures is a desired goal. In ML, matrix completion is a semi-supervised technique that generated a lot of attention through its large-scale application to the "Netflix problem" [125]. Here, predictions are made for non-rated movies based on a large (and sparse) matrix of viewers and ratings. Recently, this technique has been applied to the prediction of activity coefficients for component mixtures that were never experimentally investigated [126].

Further progress came with generative adversarial networks (GANs), which are deep neural network architectures consisting of two nets competing against one another ("adversarial") [127]. GANs quickly proved to be highly capable at creative tasks such as creating new works of art. Another type of promising generative model is the variational autoencoder (VAE) [128], which is also used for data generation (image, sound, text) and missing data imputation.

Overall, CE is a field where novel process designs, products, and materials can currently only be found or discovered by experimental trial and error or by human design. Supporting this with creative techniques from ML might allow for discoveries as of yet unimaginable, and research in this direction is hence highly desirable.

In the future, we expect ML methods to solve creative tasks from the thermodynamic phenomena scale up to whole flowsheets and enterprises. Regarding the former, thermodynamic properties for never-measured systems will be inferred by matrix completion techniques.

So far, ML's recent advances on creativity have mainly been applied to images, texts, and sounds. Transferring these to molecules, control structures, and flowsheets is highly promising, as a lot of manual work can be automated and potentially a huge new set of candidate solutions will be found by these techniques. These candidate solutions will also be of great help regarding optimal decision-making. Given the novelty of the applications, further advances in ML methods can be expected regarding training techniques, structuring of data, and metrics for the sensibility of outputs.
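The sketch below illustrates the matrix completion idea on synthetic data (assumed rank, sparsity, and regularization; not the Bayesian approach of the cited work): a sparsely observed solute-solvent property matrix is completed by a low-rank factorization fitted with alternating least squares.

```python
# Minimal sketch (synthetic data, assumed rank): completing a sparse property matrix
# (e.g., a mixture property for solute i in solvent j) with a rank-r factorization
# fitted by alternating least squares on the observed entries only.
import numpy as np

rng = np.random.default_rng(3)
n_solutes, n_solvents, r = 40, 30, 3
A_true = rng.normal(size=(n_solutes, r)) @ rng.normal(size=(r, n_solvents))  # hidden low-rank "truth"
mask = rng.random(A_true.shape) < 0.2                   # only 20 % of entries "measured"
A_obs = np.where(mask, A_true, np.nan)

U = rng.normal(size=(n_solutes, r))
V = rng.normal(size=(n_solvents, r))
lam = 1e-2                                              # ridge term keeps the updates well-posed
for _ in range(50):                                     # alternating least squares
    for i in range(n_solutes):                          # update solute factors row by row
        obs = mask[i]
        U[i] = np.linalg.solve(V[obs].T @ V[obs] + lam * np.eye(r), V[obs].T @ A_obs[i, obs])
    for j in range(n_solvents):                         # update solvent factors column by column
        obs = mask[:, j]
        V[j] = np.linalg.solve(U[obs].T @ U[obs] + lam * np.eye(r), U[obs].T @ A_obs[obs, j])

A_pred = U @ V.T                                        # predictions for never-measured pairs
print(np.sqrt(np.mean((A_pred[~mask] - A_true[~mask]) ** 2)))   # error on unobserved entries
```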
4 Conclusion and Outlook

We identified six challenges for interdisciplinary research that will open up new methods for CE and formulate new types of problems for ML: optimal decision making, introducing and enforcing physics in ML, information and knowledge representation, heterogeneity of data, safety and trust in ML applications, and creativity.

The German Research Foundation recently established the Priority Program "Machine Learning in Chemical Engineering: Knowledge meets data." (SPP 2331). The first batch of projects is expected to start in the fall of 2021. With this Priority Program, researchers from ML and CE will jointly work to tackle these emerging challenges. In the meantime, other initiatives have begun researching at the interface of CE and ML. These initiatives promise exciting new research directions in the next few years and will undoubtedly aid in educating a new generation of engineers fluent in methods from both worlds.

The authors gratefully acknowledge the DFG for establishing the Priority Programme SPP 2331 "Machine learning in chemical engineering. Knowledge meets data: Interpretability, Extrapolation, Reliability, Trust". AMS is supported by the TU Delft AI Labs Programme. MK acknowledges support by the Carl-Zeiss Foundation, by the German Research Foundation (DFG) award KL 2698/2-1, and by the Federal Ministry of Science and Education (BMBF) awards 01IS18051A and 031B0770E.


Artur M. Schweidtmann is an assistant professor for chemical engineering at Delft University of Technology and co-director of the KDAI Lab, which is part of the TU Delft AI Labs Programme. He received his Master of Science from RWTH Aachen University in 2017 and defended his Ph.D. at RWTH in 2021, both in chemical engineering. During his studies, he spent the academic year 2013/2014 at Carnegie Mellon University as a visiting student via the DAAD ISAP program. He performed his Master thesis at the University of Cambridge. His research focuses on the combination of artificial intelligence and chemical engineering.

Marius Kloft is full professor (W3) for computer science at Technische Universität Kaiserslautern. Previously, he was an assistant professor at HU Berlin (2014–2017) and a postdoctoral fellow at the Courant Institute of Mathematical Sciences, New York. He earned his PhD at UC Berkeley and TU Berlin (2011). He is interested in theory and algorithms of statistical machine learning, especially unsupervised deep learning, and its applications in chemical process engineering. In 2014, he was awarded the Google Most Influential Papers award.

Erik Esche is a postdoctoral researcher with Prof. Jens-Uwe Repke at TU Berlin. He leads the group's work on the development and application of methods for mathematical optimization and machine learning for the operation and design of chemical processes. His research focuses on uncertainty in models and measurements and its consequences for the reliable operation of chemical processes.

Jens-Uwe Repke is full professor (W3) for Process Dynamics and Operations at Technische Universität Berlin. He received his Dipl.-Ing. in 1996 and his Dr.-Ing. in 2002, both from Technische Universität Berlin. From 2010 to 2016, he was a full professor for Thermal Separation Technologies at TU Bergakademie. His research focuses on optimal process design and operation.

Asja Fischer is full professor (W3) for mathematics at Ruhr-Universität Bochum. She received her Master of Science in Cognitive Science from the University of Osnabrück in 2009 and her Ph.D. in Computer Science in 2014 from the University of Copenhagen. Her research focuses on the theory and application of machine learning, with a focus on deep learning and probabilistic models.

Sebastian Sager is full professor (W3) for algorithmic optimization at Otto-von-Guericke-Universität Magdeburg. He received his Diploma (2001), PhD (2006), and habilitation (2012) from Universität Heidelberg, all in mathematics. His research focuses on mixed-integer nonlinear optimization of complex processes and applications in renewable energy, mobility, and clinical decision support.


Alexander Mitsos is a full professor (W3) for chemical engineering at RWTH Aachen University and the director of IEK-10 Energy Systems Engineering at Forschungszentrum Jülich. He received his Dipl.-Ing. from University of Karlsruhe in 1999 and his Ph.D. from MIT in 2006, both in chemical engineering. His research focuses on optimization of energy and chemical systems and development of enabling numerical algorithms. He is the coordinator of SPP 2331.

Abbreviations

ANN Artificial neural network
ARIMAX Autoregressive integrated moving average with exogenous inputs
CE Chemical engineering
CNN Convolutional neural network
GNN Graph neural network
GP Gaussian process
LSTM Long short-term memory
MILP Mixed-integer linear program
ML Machine learning
PCA Principal component analysis
PSE Process systems engineering
RL Reinforcement learning
RNN Recurrent neural network
SVM Support vector machine

References

[1] A. Kätelhön, R. Meys, S. Deutz, S. Suh, A. Bardow, Proc. Natl. Acad. Sci. 2019, 116 (23), 11187–11194.
[2] A. A. Lapkin, in Handbook of Green Chemistry, Vol. 12, Wiley-VCH, Weinheim 2018, 1–16.
[3] A. Mitsos, N. Asprion, C. A. Floudas, M. Bortz, M. Baldea, D. Bonvin, A. Caspari, P. Schäfer, Comput. Chem. Eng. 2018, 113, 209–221.
[4] J. Artz, T. E. Müller, K. Thenert, J. Kleinekorte, R. Meys, A. Sternberg, A. Bardow, W. Leitner, Chem. Rev. (Washington, DC, U. S.) 2017, 118 (2), 434–504.
[5] K. McBride, K. Sundmacher, Chem. Ing. Tech. 2019, 91 (3), 228–239. DOI: https://doi.org/10.1002/cite.201800091
[6] A. M. Schweidtmann, A. Mitsos, J. Optim. Theory Appl. 2019, 180 (3), 925–948. DOI: https://doi.org/10.1007/s10957-018-1396-0
[7] V. Venkatasubramanian, AIChE J. 2019, 65 (2), 466–478.
[8] J. H. Lee, J. Shin, M. J. Realff, Comput. Chem. Eng. 2018, 114, 111–121.
[9] Y. Bengio, A. C. Courville, P. Vincent, Representation Learning: A Review and New Perspectives, arXiv 2012. https://arxiv.org/abs/1206.5538
[10] G. E. Hinton, S. Osindero, Y.-W. Teh, Neural Comput. 2006, 18 (7), 1527–1554.
[11] G. B. Huang, H. Lee, E. Learned-Miller, in 2012 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Piscataway, NJ 2012.
[12] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller, Playing Atari with Deep Reinforcement Learning, arXiv 2013. https://arxiv.org/abs/1312.5602
[13] V. Mnih et al., Nature 2015, 518 (7540), 529.
[14] D. Silver et al., Nature 2016, 529 (7587), 484.
[15] Y. LeCun, Y. Bengio, G. Hinton, Nature 2015, 521 (7553), 436–444.
[16] S. J. Qin, AIChE J. 2014, 60 (9), 3092–3100. DOI: https://doi.org/10.1002/aic.14523
[17] C. M. Bishop, Pattern Recognition and Machine Learning, 8th ed., Information Science and Statistics, Springer, New York 2009.
[18] R. Banares-Alcantara, A. Westerberg, M. Rychener, Comput. Chem. Eng. 1985, 9 (2), 127–142.
[19] R. Banares-Alcantara, E. I. Ko, A. W. Westerberg, M. D. Rychener, Comput. Chem. Eng. 1988, 12 (9), 923–938. DOI: https://doi.org/10.1016/0098-1354(88)87018-2
[20] Z. Ge, Z. Song, S. X. Ding, B. Huang, IEEE Access 2017, 5, 20590–20616.
[21] B. De Ketelaere, M. Hubert, E. Schmitt, J. Qual. Technol. 2015, 47 (4), 318–335. DOI: https://doi.org/10.1080/00224065.2015.11918137
[22] C.-C. Hsu, M.-C. Chen, L.-S. Chen, Control Eng. Pract. 2010, 18 (3), 242–253. DOI: https://doi.org/10.1016/j.conengprac.2009.11.002
[23] M. He, C. Yang, X. Wang, W. Gui, L. Wei, Miner. Eng. 2013, 53, 203–212. DOI: https://doi.org/10.1016/j.mineng.2013.08.011
[24] M. Liukkonen, I. Laakso, Y. Hiltunen, Environ. Model. Software 2013, 48, 193–201. DOI: https://doi.org/10.1016/j.envsoft.2013.07.005
[25] J. Yu, J. Process Control 2012, 22 (4), 778–788. DOI: https://doi.org/10.1016/j.jprocont.2012.02.012
[26] A. Prieto-Moreno, O. Llanes-Santiago, E. García-Moreno, J. Process Control 2015, 33, 14–24. DOI: https://doi.org/10.1016/j.jprocont.2015.06.003
[27] M. A. F. Pimentel, D. A. Clifton, L. Clifton, L. Tarassenko, Signal Process. 2014, 99, 215–249.
[28] L. H. Chiang, R. J. Pell, M. B. Seasholtz, J. Process Control 2003, 13 (5), 437–449. DOI: https://doi.org/10.1016/S0959-1524(02)00068-9
[29] M. Yao, H. Wang, W. Xu, J. Process Control 2014, 24 (7), 1085–1097. DOI: https://doi.org/10.1016/j.jprocont.2014.05.015
[30] Z. Lv, X. Yan, Q. Jiang, Chemom. Intell. Lab. Syst. 2014, 137, 128–139. DOI: https://doi.org/10.1016/j.chemolab.2014.06.010
[31] J. V. Kresta, T. E. Marlin, J. F. MacGregor, Comput. Chem. Eng. 1994, 18 (7), 597–611. DOI: https://doi.org/10.1016/0098-1354(93)E0006-U
[32] J. Liu, J. Process Control 2014, 24 (7), 1046–1056. DOI: https://doi.org/10.1016/j.jprocont.2014.05.014
[33] S. S. Kolluri, I. J. Esfahani, P. S. N. Garikiparthy, C. Yoo, Korean J. Chem. Eng. 2015, 32 (8), 1486–1497. DOI: https://doi.org/10.1007/s11814-014-0356-0
[34] Z. Ge, F. Gao, Z. Song, Chemom. Intell. Lab. Syst. 2011, 105 (1), 91–105. DOI: https://doi.org/10.1016/j.chemolab.2010.11.004
[35] Y. Yang, F. Gao, Polym. Eng. Sci. 2006, 46 (4), 540–548.
[36] P. Jain, I. Rahman, B. D. Kulkarni, Chem. Eng. Res. Des. 2007, 85 (2), 283–287. DOI: https://doi.org/10.1205/cherd05026
[37] J. C. B. Gonzaga, L. A. C. Meleiro, C. Kiang, R. Maciel-Filho, Comput. Chem. Eng. 2009, 33 (1), 43–49. DOI: https://doi.org/10.1016/j.compchemeng.2008.05.019


[38] Z. Ge, T. Chen, Z. Song, Control Eng. Pract. 2011, 19 (5), 423–432. DOI: https://doi.org/10.1016/j.conengprac.2011.01.002
[39] C. E. Rasmussen, in Advanced Lectures on Machine Learning (Eds: O. Bousquet, U. von Luxburg, G. Rätsch), Lecture Notes in Computer Science, Vol. 3176, Springer, Berlin 2004.
[40] C. K. I. Williams, C. E. Rasmussen, in Advances in Neural Information Processing Systems, MIT Press, Cambridge, MA 1996.
[41] H. Kaneko, M. Arakawa, K. Funatsu, Comput. Chem. Eng. 2011, 35 (6), 1135–1142. DOI: https://doi.org/10.1016/j.compchemeng.2010.09.003
[42] R. Rico-Martínez, R. A. Adomaitis, I. G. Kevrekidis, Comput. Chem. Eng. 2000, 24 (11), 2417–2433.
[43] K. J. Keesman, System Identification, Advanced Textbooks in Control and Signal Processing, Springer-Verlag, London 2011.
[44] P. Petsagkourakis, I. O. Sandoval, E. Bradford, D. Zhang, E. A. Del Rio-Chanona, Comput. Chem. Eng. 2020, 133, 106649.
[45] S. Hochreiter, J. Schmidhuber, Neural Comput. 1997, 9 (8), 1735–1780.
[46] A. Mitsos, A. Fischer, M. Kloft, J.-U. Repke, S. Sager, Priority Programme Machine Learning in Chemical Engineering. Knowledge Meets Data: Interpretability, Extrapolation, Reliability, Trust (SPP 2331) 2020.
[47] S. A. Papoulias, I. E. Grossmann, Comput. Chem. Eng. 1983, 7 (6), 695–706.
[48] K. Hornik, M. Stinchcombe, H. White, Neural Networks 1989, 2 (5), 359–366. DOI: https://doi.org/10.1016/0893-6080(89)90020-8
[49] J. D. Smith, A. A. Neto, S. Cremaschi, D. W. Crunkleton, Ind. Eng. Chem. Res. 2013, 52 (22), 7181–7188. DOI: https://doi.org/10.1021/ie302478d
[50] O. Kahrs, W. Marquardt, Comput. Chem. Eng. 2008, 32 (4–5), 694–705.
[51] C. A. Henao, C. T. Maravelias, AIChE J. 2011, 57 (5), 1216–1232. DOI: https://doi.org/10.1002/aic.12341
[52] I. Fahmi, S. Cremaschi, Comput. Chem. Eng. 2012, 46, 105–123. DOI: https://doi.org/10.1016/j.compchemeng.2012.06.006
[53] C. Nentwich, S. Engell, in 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, Piscataway, NJ 2016.
[54] F. Boukouvala, M. F. Hasan, C. A. Floudas, J. Global Optim. 2017, 67 (1–2), 3–42.
[55] T. Keßler, C. Kunde, K. McBride, N. Mertens, D. Michaels, K. Sundmacher, A. Kienle, Chem. Eng. Sci. 2019, 197, 235–245. DOI: https://doi.org/10.1016/j.ces.2018.12.002
[56] Z. T. Wilson, N. V. Sahinidis, Comput. Chem. Eng. 2017, 106, 785–795. DOI: https://doi.org/10.1016/j.compchemeng.2017.02.010
[57] Q. Zhang, I. E. Grossmann, A. Sundaramoorthy, J. M. Pinto, Optim. Eng. 2016, 17 (2), 289–332. DOI: https://doi.org/10.1007/s11081-015-9288-8
[58] M. Mistry, D. Letsios, G. Krennrich, R. M. Lee, R. Misener, Mixed-Integer Convex Nonlinear Optimization with Gradient-Boosted Trees Embedded, arXiv 2019. https://arxiv.org/abs/1803.00952
[59] B. Grimstad, B. R. Knudsen, J. Global Optim. 2020, 1–32.
[60] B. Grimstad, A. Sandnes, J. Global Optim. 2016, 65 (3), 401–439.
[61] B. Grimstad, H. Andersson, Comput. Chem. Eng. 2019, 131, 106580.
[62] R. Anderson, J. Huchette, W. Ma, C. Tjandraatmadja, J. P. Vielma, Math. Program. 2020, 1–37.
[63] J. Katz, I. Pappas, S. Avraamidou, E. N. Pistikopoulos, Comput. Chem. Eng. 2020, 106801.
[64] A. Thebelt, J. Kronqvist, R. M. Lee, N. Sudermann-Merx, R. Misener, Comput.-Aided Chem. Eng. 2020, 48, 1981–1986.
[65] A. Thebelt, J. Kronqvist, M. Mistry, R. M. Lee, N. Sudermann-Merx, R. Misener, ENTMOOT: A Framework for Optimization over Ensemble Tree Models, arXiv 2020. https://arxiv.org/abs/2003.04774
[66] D. C. Psichogios, L. H. Ungar, AIChE J. 1992, 38 (10), 1499–1511. DOI: https://doi.org/10.1002/aic.690381003
[67] A. A. Schuppert, in Equadiff 99: (In 2 Volumes), World Scientific, Singapore 2000.
[68] M. von Stosch, R. Oliveira, J. Peres, S. F. de Azevedo, Comput. Chem. Eng. 2014, 60, 86–101. DOI: https://doi.org/10.1016/j.compchemeng.2013.08.008
[69] H. A. Te Braake, H. J. van Can, H. B. Verbruggen, Eng. Appl. Artif. Intell. 1998, 11 (4), 507–515.
[70] D. Rall, A. M. Schweidtmann, M. Kruse, E. Evdochenko, A. Mitsos, M. Wessling, J. Membr. Sci. 2020, 118208.
[71] M. Dors, R. Simutis, A. Lübbert, in Biosensor and Chemical Sensor Technology, ACS Publications, Washington, DC 1995.
[72] G. Mogk, T. Mrziglod, A. Schuppert, Comput.-Aided Chem. Eng. 2002, 10, 931–936.
[73] M. von Stosch et al., Biotechnol. J. 2014, 9 (6), 719–726.
[74] P. Schäfer, A. Caspari, K. Kleinhans, A. Mhamdi, A. Mitsos, AIChE J. 2019, 65 (5), e16568. DOI: https://doi.org/10.1002/aic.16568
[75] P. Schäfer, A. Caspari, A. M. Schweidtmann, Y. Vaupel, A. Mhamdi, A. Mitsos, Chem. Ing. Tech. 2020, 92 (12), 1910–1920. DOI: https://doi.org/10.1002/cite.202000048
[76] B. Fiedler, A. Schuppert, J. Inst. Math. Its Appl. 2008, 73 (3), 449–476.
[77] O. Kahrs, W. Marquardt, Chem. Eng. Process. 2007, 46 (11), 1054–1066. DOI: https://doi.org/10.1016/j.cep.2007.02.031
[78] M. Raissi, P. Perdikaris, G. E. Karniadakis, J. Comput. Phys. 2019, 378, 686–707.
[79] M. Schmidt, H. Lipson, Science 2009, 324 (5923), 81–85.
[80] A. Cozad, N. V. Sahinidis, Math. Program. 2018, 170 (1), 97–119.
[81] P. Neumann, L. Cao, D. Russo, V. S. Vassiliadis, A. A. Lapkin, Chem. Eng. Trans. 2020, 387, 123412.
[82] S. Sarawagi, FNT in Databases 2007, 1 (3), 261–377. DOI: https://doi.org/10.1561/1900000003
[83] C. Giuliano, A. Lavelli, L. Romano, ACM Trans. Speech Lang. Process. 2007, 5 (1), 1–26. DOI: https://doi.org/10.1145/1322391.1322393
[84] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is All you Need, in Advances in Neural Information Processing Systems 30 (NIPS 2017) (Eds: I. Guyon et al.), 2017.
[85] D. Pinto, A. McCallum, X. Wei, W. B. Croft, in Proc. of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval – SIGIR '03 (Eds: C. Clarke et al.), ACM Press, New York 2003.
[86] N. Siegel, N. Lourie, R. Power, W. Ammar, in Proc. of the 18th ACM/IEEE Joint Conference on Digital Libraries, Association for Computing Machinery, New York 2018, 223–232. DOI: https://doi.org/10.1145/3197026.3197040
[87] P. Hitzler, M. Krötzsch, S. Rudolph, Foundations of Semantic Web Technologies, Chapman and Hall/CRC, London/Boca Raton, FL 2009.
[88] A. Hogan, E. Blomqvist, M. Cochez, C. D'amato, G. de Melo, C. Gutierrez, S. Kirrane, J. E. L. Gayo, R. Navigli, S. Neumaier, A.-C. N. Ngomo, A. Polleres, S. M. Rashid, A. Rula, L. Schmelzeisen, J. Sequeda, S. Staab, A. Zimmermann, ACM Comput. Surv. 2021, 54 (4), 1–37. DOI: https://doi.org/10.1145/3447772
[89] J. Morbach, A. Yang, W. Marquardt, Eng. Appl. Artif. Intell. 2007, 20 (2), 147–161. DOI: https://doi.org/10.1016/j.engappai.2006.06.010


[90] A. Eibeck, M. Q. Lim, M. Kraft, Comput. Chem. Eng. 2019, 131, 106586. DOI: https://doi.org/10.1016/j.compchemeng.2019.106586
[91] Y. Lin et al., Learning Entity and Relation Embeddings for Knowledge Graph Completion, in AAAI'15: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Association for Computing Machinery, New York 2015, 2181–2187.
[92] B. A. Grzybowski, K. J. M. Bishop, B. Kowalczyk, C. E. Wilmer, Nat. Chem. 2009, 1 (1), 31–36. DOI: https://doi.org/10.1038/nchem.136
[93] M. Fialkowski, K. J. M. Bishop, V. A. Chubukov, C. J. Campbell, B. A. Grzybowski, Angew. Chem. 2005, 117 (44), 7429–7435. DOI: https://doi.org/10.1002/ange.200502272
[94] J. M. Weber, P. Lió, A. A. Lapkin, React. Chem. Eng. 2019, 4 (11), 1969–1981.
[95] K. Ulonska, M. Skiborowski, A. Mitsos, J. Viell, AIChE J. 2016, 62 (9), 3096–3108.
[96] K. Ulonska, A. König, M. Klatt, A. Mitsos, J. Viell, Ind. Eng. Chem. Res. 2018, 57 (20), 6980–6991.
[97] J. M. Weber, A. M. Schweidtmann, E. Nolasco, A. A. Lapkin, Comput.-Aided Chem. Eng. 2020, 48, 1843–1848.
[98] Y. Amar, A. M. Schweidtmann, P. Deutsch, L. Cao, A. Lapkin, Chem. Sci. 2019, 10 (27), 6697–6706. DOI: https://doi.org/10.1039/C9SC01844A
[99] A. M. Schweidtmann, J. M. Weber, C. Wende, L. Netze, A. Mitsos, Optim. Eng., in press. DOI: https://doi.org/10.1007/s11081-021-09608-0
[100] M. Aimin, L. Peng, Y. Lingjian, Chemom. Intell. Lab. Syst. 2015, 147, 86–94. DOI: https://doi.org/10.1016/j.chemolab.2015.07.012
[101] X. Zhu, A. B. Goldberg, Introduction to Semi-Supervised Learning, Morgan & Claypool, San Rafael, CA 2009.
[102] J. Ji, H. Wang, K. Chen, Y. Liu, N. Zhang, J. Yan, J. Taiwan Inst. Chem. Eng. 2012, 43 (1), 67–76. DOI: https://doi.org/10.1016/j.jtice.2011.06.002
[103] L. Constantinou, R. Gani, AIChE J. 1994, 40 (10), 1697–1710.
[104] A. Fredenslund, Vapor-Liquid Equilibria Using UNIFAC: A Group-Contribution Method, Elsevier, Amsterdam 2012.
[105] M. Bojarski et al., End to End Learning for Self-Driving Cars, arXiv 2016. https://arxiv.org/abs/1604.07316
[106] D. Amodei et al., in International Conference on Machine Learning, New York, June 2016.
[107] T. Gärtner, P. Flach, S. Wrobel, in Learning Theory and Kernel Machines (Eds: B. Schölkopf, M. K. Warmuth), Springer, Heidelberg 2003.
[108] D. Oglic, S. A. Oatley, S. J. F. Macdonald, T. Mcinally, R. Garnett, J. D. Hirst, T. Gärtner, Mol. Inform. 2018, 37 (1–2), 1700130. DOI: https://doi.org/10.1002/minf.201700130
[109] D. K. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik, R. P. Adams, in Advances in Neural Information Processing Systems 28 (NIPS 2015) (Eds: C. Cortes et al.), MIT Press, Cambridge, MA 2015.
[110] A. M. Schweidtmann, J. G. Rittig, A. König, M. Grohe, A. Mitsos, M. Dahmen, Energy Fuels 2020, 34 (9), 11395–11407. DOI: https://doi.org/10.1021/acs.energyfuels.0c01533
[111] C. W. Coley, R. Barzilay, W. H. Green, T. S. Jaakkola, K. F. Jensen, J. Chem. Inf. Model. 2017, 57 (8), 1757–1772. DOI: https://doi.org/10.1021/acs.jcim.6b00601
[112] T. Xie, J. C. Grossman, Phys. Rev. Lett. 2018, 120 (14), 145301.
[113] C. Morris, M. Ritzert, M. Fey, W. Hamilton, J. E. Lenssen, G. Rattan, M. Grohe, in Proc. of the 33rd AAAI Conference on Artificial Intelligence, AAAI Press, Palo Alto, CA 2019, 4602–4609.
[114] P.-M. Jacob, A. Lapkin, React. Chem. Eng. 2018, 3 (1), 102–118.
[115] T. Zhang, N. V. Sahinidis, J. J. Siirola, AIChE J. 2019, 65 (2), 592–603.
[116] L. d'Anterroches, R. Gani, Fluid Phase Equilib. 2005, 228–229, 141–146. DOI: https://doi.org/10.1016/j.fluid.2004.08.018
[117] A. C. Vaucher, P. Schwaller, J. Geluykens, V. H. Nair, A. Iuliano, T. Laino, Nat. Commun. 2021, 12 (1), 2573. DOI: https://doi.org/10.1038/s41467-021-22951-1
[118] P. Courrieu, Neural Networks 1994, 7 (1), 169–174.
[119] M. Grohe, in Proc. of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (Eds: D. Suciu, Y. Tao, Z. Wei), Association for Computing Machinery, New York 2020.
[120] L. Hirschfeld, K. Swanson, K. Yang, R. Barzilay, C. W. Coley, J. Chem. Inf. Model. 2020, 60 (8), 3770–3780. DOI: https://doi.org/10.1021/acs.jcim.0c00502
[121] J. H. Lee, W. Wong, J. Process Control 2010, 20 (9), 1038–1048.
[122] R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA 2018.
[123] L. A. Gatys, A. S. Ecker, M. Bethge, A Neural Algorithm of Artistic Style, arXiv 2015. https://arxiv.org/abs/1508.06576
[124] A. Bardow, K. Steur, J. Gross, Ind. Eng. Chem. Res. 2010, 49 (6), 2834–2840.
[125] E. J. Candès, B. Recht, Exact Matrix Completion via Convex Optimization, arXiv 2008. https://arxiv.org/abs/0805.4471
[126] F. Jirasek, R. A. S. Alves, J. Damay, R. A. Vandermeulen, R. Bamler, M. Bortz, S. Mandt, M. Kloft, H. Hasse, J. Phys. Chem. Lett. 2020, 11 (3), 981–985.
[127] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, in Advances in Neural Information Processing Systems (Eds: Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, K. Q. Weinberger), 2014.
[128] D. P. Kingma, M. Welling, An Introduction to Variational Autoencoders, arXiv 2019. https://arxiv.org/abs/1906.02691


DOI: 10.1002/cite.202100083

Machine Learning in Chemical Engineering: A Perspective

Artur M. Schweidtmann*, Erik Esche, Asja Fischer, Marius Kloft, Jens-Uwe Repke, Sebastian Sager, Alexander Mitsos

Review Article: Recent breakthroughs in machine learning provide unique opportunities for chemical engineering, but only joint interdisciplinary research will unfold the full potential of machine learning in chemical engineering. We identify six challenges of interdisciplinary research that will open up new methods for chemical engineering and formulate new types of problems for ML.

