Physics-Based Synthetic Data Model for Automated Segmentation in Catalysis Microscopy

Maurits Vuijk1, Gianmarco Ducci1, Luis Sandoval1, Markus Pietsch2, Karsten Reuter1, Thomas Lunkenbein1, and Christoph Scheurer1,3

1 Fritz-Haber-Institut der Max-Planck-Gesellschaft, Berlin, 14195, Germany
2 Technical University of Munich; TUM School of Natural Sciences, Department of Chemistry and Catalysis Research Center, 85748 Garching, Germany
3 Institut für Energie und Klimaforschung (IEK-9), Forschungszentrum Jülich GmbH, Jülich, Germany

Abstract
In catalysis research, the amount of microscopy data acquired when imaging dynamic processes is
often too much for non-automated quantitative analysis. Developing machine learned segmentation
models is challenged by the requirement of high-quality annotated training data. We thus substitute
expert-annotated data with a physics-based sequential synthetic data model. We study environmental
SEM (ESEM) data collected from isopropanol oxidation to acetone over cobalt oxide. Upon applying
a temperature program during the reaction a phase transition occurs, reducing the catalyst selectivity
towards acetone. This is accompanied, on the µm scale of the ESEM images, by the formation of cracks between the pores
of the catalyst surface. We aim to generate synthetic data to train a neural network capable of semantic
segmentation (pixel-wise labelling) of this ESEM data. This analysis will lead to insights into this phase
transition. To generate synthetic data that approximates this transition, our algorithm composes the
ESEM images of the room-temperature catalyst with dynamically evolving synthetic cracks satisfying
physical construction principles, gathered from qualitative knowledge accessible in the ESEM data. We
mimic the surface crack growth propagation along surface paths, avoiding close vicinity to nearby pores.
This physics-based approach results in a lowered rate of false positives compared to a random approach.

Keywords: machine learning, computer vision, synthetic data, ESEM, LSTM, U-NET.

1 Introduction
Catalyst research frequently involves the use of sequential operando electron microscopy imaging techniques
that capture time-series data. These imaging techniques are powerful tools to capture dynamic processes
and phase changes at different scales of the catalyst. (Uwins et al. [1997]) However, handling and analysis
of the large amount of data acquired at high frame rates (often over days) during such imaging processes
presents a significant challenge for researchers. Thus, there is a growing need for automated segmentation
processes that can efficiently analyze these large sets of electron microscopy data. For this purpose, many

https://doi.org/10.26434/chemrxiv-2024-6b4h0 ORCID: https://orcid.org/0000-0003-0681-0074 Content not peer-reviewed by ChemRxiv. License: CC BY-NC-ND 4.0
Physics-Based Synthetic Data Model Vuijk et al.

different computer vision algorithms and neural networks have been developed which enable automatic
analyses focusing on different aspects of the collected image data, such as interface boundaries (Schneider
et al. [2016]) and carbon nanostructures (Toth et al. [2013]).
For scanning electron microscopy (SEM) data, reliable semantic segmentation (pixel-wise labelling) is required
to extract meaningful statistical data from the image sequences. Manually segmenting such datasets
is tedious and time-consuming. (Madsen et al. [2018]) As a solution, machine learning models have been
successfully employed for this purpose (Groschner et al. [2021]). Besides saving time, using an automated
solution also lowers bias and provides reproducibility and self-consistency. However, to train a network capable
of reliable semantic segmentation, a large amount of annotated training data is required. Alternatively,
unsupervised learning methods can be used, but these methods tend to lead to less reliable results. (Ede
[2021])
Automated segmentation processes using Machine Learning models have shown great utility for image
segmentation and classification in electron microscopy. A large amount of research has been done for different
model architectures and applications within this field. Supervised semantic segmentation models utilizing
Convolutional Neural Network (CNN) architectures have since become widespread, with many different
architectures and implementations being developed, such as the PixelNet, which segments microstructures
in SEM images (DeCost et al. [2019]). By convolution of trained filters with input images at different
resolutions, CNNs are able to extract feature information between details and domains of images. The
power of the CNN has also been combined with recurrent neural network elements (Liu et al. [2021]), such
as the Long Short Term Memory (LSTM) element (Hochreiter and Schmidhuber [1997]), to allow networks
to form connections between images in image stacks. The LSTM cell is well suited for the identification of
evolving features in an image sequence, due to its ability to remember relevant information (newly appearing
and established features) and forget obsoleted information (fading features).
One widely used model is U-NET, which was originally developed for biomedical applications to segment
electron microscopy images of cells. The core idea of the U-NET architecture is an encoder-decoder-like
structure with skip connections. Each of the contracting units in the encoder is mirrored with an inverse
expanding unit in the decoder, linked from the corresponding contracting unit by a skip connection. This
leads to the namesake U shape of the network. (Ronneberger et al. [2015])
The U-NET architecture has been shown to be versatile and applicable beyond the original application of
biomedical imaging. It is well suited for segmenting transmission electron microscopy (TEM) images, where
it can exploit the regular structures in the images (Bermudez-Chacon et al. [2018]), as well as SEM images
of materials and surfaces. (Shah et al. [2023])
As an alternative to assembling expert-annotated datasets, synthetic data methods have been developed.
Synthetic data methods involve generating data along with ground truth using a model, and then using the
generated data to train a neural network for segmentation of the target (real) data. With atomic-resolution
microscopy techniques such as TEM, it is possible to use established theoretical models to generate accurate
synthetic data. (Ziatdinov et al. [2017]) However, when looking at lower-magnification SEM images of
catalyst surfaces, the often chaotic nature of the macroscopic surface structure demands a more ad-hoc
visual approach to synthetic data generation. By examining the important features of the data empirically,
it is possible to create a synthetic data model that generates visually similar images that respect the physical
behaviour of the features. The work of (Trampert et al. [2021]) shows such results that directly inspired this
work. In this work, they use exemplar-based inpainting as a synthetic data generator to train a classification
network. To do so, a texture dictionary is built up from real samples to generate new backgrounds. Features
are then generated according to predefined parameters and superimposed on these backgrounds to generate
the synthetic data.

In this work, we build upon this previous work in two ways: We use a physics-based model to generate
sequential synthetic data with inpainted evolving features rather than a purely parametric model. The
physics-based model naturally generates a distribution of cracks with similar qualities (length, tortuosity) to
those found in the real data, and confers an important improvement in accuracy, especially when it comes to
false positives, over the parametric model. As our data consists of a time-series in which slow evolution of the
features is clearly visible, we introduce a temporal element to the neural network architecture by appending
a convolutional LSTM block (Chang and Liao [2019]) to the usual U-NET architecture. The convolutional
LSTM block allows the neural network to take advantage of relations across the time dimension of the
time-series in addition to the spatial dimensions of the images. To train a neural network containing an
LSTM block, sequential annotated training data is required. For this reason, the synthetic data generator
was modified to generate sequences. In addition, an important advantage of using the U-NET in the context
of synthetic data is that the U-NET architecture is very good at learning transformational invariance from
augmented datasets. (Wilms et al. [2017]) Because a synthetic data approach can be regarded as an extreme
form of data augmentation, and is here additionally combined with data augmentation of the synthetic dataset, this
property of the network architecture is advantageous. (Fawaz et al. [2018])
We thus propose an approach to overcome the annotation challenge in training semantic segmentation
neural networks. Instead of relying on expert-annotated data, we substitute it with a physics-based sequential
synthetic data model. This model mimics the dynamic processes involving the relevant features observed in
our example system, the formation of crack features during the oxidation of isopropanol to acetone over a
cobalt oxide catalyst surface. By training a U-NET with LSTM block using synthetic data we aim to train
a model that can reliably segment this crack feature.

2 Methodology
2.1 ESEM Experiment and Sample
In this section, we introduce the sample and the experiment that was performed to obtain the dataset,
and we describe the goals and challenges with regards to the analysis of the dataset. The system we are
investigating is a CoO foil catalyst used in the oxidation of isopropanol to acetone. A constant gas flow of
isopropanol and oxygen was present during the experiment and the catalyst was heated with staggered heating
to 500 °C using a laser heating setup. A time-series, consisting of 1600 frames with a scan time of 35.40 s, of
the evolution of the catalyst surface was gathered during the experiment in an operando ESEM experiment,
as depicted in Figure 1. For further details about the experimental setup, see Supporting Information
S1.
The initial surface of the catalyst displays a complex morphology with prominent (bright) ridges and
(dark) holes (see Figure 3 below). After being heated to 400 °C, a change in the surface texture occurs that
coincides with a large drop in the selectivity towards acetone of the catalyst, visible in Figure 2. On the
ridges of the catalyst surface, crack structures start to appear, i.e. the cracks remain spatially separated from
the original holes of the catalyst morphology. Following the formation of the crack structures, the surface
structure of the catalyst now has two key (dark) features: "holes" and "cracks". During the formation of
the cracks in the ESEM experiment, it can also be seen in Figure 3 that the cracks mostly form in the
middle of the ridges of the catalyst structure, i.e. the parts of the surface in between the holes. This is an
irreversible phase change, where the cracks persist throughout the cooling and the second heating cycle of
the experiment. It can also be seen from the quadrupole mass spectrometer (QMS) data in Figure 2 that
a second heating cycle of the experiment with the same catalyst and the already present cracks does not
exhibit the same selectivity towards acetone as the first cycle.
The goal is to semantically segment this crack feature (assigning a binary classification per pixel). The
formation of the crack feature in the image is the manifestation of a change in the catalyst surface, and it
appears following a significant change in selectivity towards acetone. With the semantic segmentation data,
statistical analysis can be performed, providing a deeper understanding of this process. (Treder et al. [2023])
For this dataset, trivial image processing methods are not sufficient to isolate the cracks: The holes and
the cracks occupy the same color space and have a similar color difference from the background, making
techniques like thresholding or edge detection unsuitable.

2.2 Synthetic Dataset for Model Training


A set of backgrounds was manually taken from catalyst surface images before the surface transformation.
The set of background images was augmented through random rotation, translation and scaling. To generate
physics-based crack structures, the holes in the surface structure of the background image are first identified
by using a threshold function, using an adjusted form of Otsu’s method (Otsu [1979]). The threshold
value resulting from Otsu’s method is multiplied by 0.9 to improve the definition of the holes. A distance
transform of the resulting binary hole map is taken to calculate the distance of each surface pixel to the
nearest hole, resulting in the distance transform map J. The synthetic cracks are then generated according
to the algorithm illustrated in Figure 4 and superimposed on the corresponding background image. The
algorithm takes the distance transform map J as an input and outputs a trajectory for a synthetic crack
feature that respects the physical characteristics of the real crack features.
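As a rough illustration of the preprocessing described above, the following pure-Python sketch computes an Otsu threshold, scales it by 0.9 to obtain the binary hole map, and derives the distance map J by brute force. The function names (`otsu_threshold`, `hole_mask`, `distance_map`), the convention that pixels at or below the scaled threshold count as holes, and the use of the Euclidean metric are assumptions made for illustration; the source states only that OpenCV was used and does not list the calls.

```python
import math


def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance
    when splitting integer gray values into {<= t} and {> t}."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of gray values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, the variance is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def hole_mask(image, scale=0.9):
    """Mark dark pixels as holes using the Otsu threshold scaled by `scale`
    (0.9 in the text). The <= convention is an assumption."""
    flat = [p for row in image for p in row]
    t = scale * otsu_threshold(flat)
    return [[p <= t for p in row] for row in image]


def distance_map(mask):
    """Brute-force Euclidean distance of every pixel to the nearest hole.
    Suitable for small images only; hole pixels get distance 0."""
    holes = [(y, x) for y, row in enumerate(mask)
             for x, is_hole in enumerate(row) if is_hole]
    if not holes:
        return [[math.inf] * len(row) for row in mask]
    return [[min(math.hypot(y - hy, x - hx) for hy, hx in holes)
             for x in range(len(row))]
            for y, row in enumerate(mask)]
```

For full-size 512x512 frames, a library distance transform (for example the one provided by OpenCV) would be the practical choice; the brute-force loop here is kept only for readability.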
The algorithm utilizes an iterative process to generate these trajectories: The starting point of the
trajectory is given a random x, y coordinate $(L_x^0, L_y^0)$ and angle value $L_\theta^0$ (rad). The position and angle of each
following point are then given by the following, where s determines the step size and t is randomly drawn
from its associated interval at each step:
\[ L_x^{n} = L_x^{n-1} + s \cos\left(L_\theta^{n-1}\right) \tag{1} \]
\[ L_y^{n} = L_y^{n-1} + s \sin\left(L_\theta^{n-1}\right) \tag{2} \]
\[ L_\theta^{n} = L_\theta^{n-1} + t, \quad t \in \mathbb{R},\ t \in \left[-\frac{\pi}{6}, \frac{\pi}{6}\right] \tag{3} \]
These equations describe the movement of the point in the trajectory according to its assigned angle in
Eqn. 1, 2, followed by a limited randomization of its direction in Eqn. 3. After each iteration, the value
of the distance transform map J at the current position $(L_x^n, L_y^n)$ is checked to be above the threshold of
5 pixels. If this condition is not met, the trajectory ends. As such, the trajectory of the synthetic cracks
cannot come within a radius of 5 pixels of a hole in the background image, and thus stays on the ridges of
the catalyst surface. In Figure 4, this is represented by the blue circle. Multiple trajectories are generated
per background image through repetition of this algorithm. Values such as the threshold and angle change
intervals were chosen based on empirical knowledge from the real dataset.
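The iteration in Eqn. 1 to 3 together with the distance check can be sketched as follows. This is a minimal illustration, not the published implementation: the function name `generate_trajectory`, the `max_steps` cap, the termination when the walker leaves the image, the check of the starting point itself, and the nearest-pixel lookup of J are all assumptions. The `guided` flag is likewise hypothetical; switching it off omits the distance-based end condition.

```python
import math
import random


def generate_trajectory(J, s=1.0, min_dist=5.0, max_steps=500,
                        guided=True, rng=None):
    """Random-walk crack trajectory on the distance map J (list of rows).

    Returns a list of (x, y) points. With guided=True the walk ends as soon
    as J at the current position is not above min_dist (5 pixels in the text).
    """
    if rng is None:
        rng = random.Random()
    height, width = len(J), len(J[0])
    # Random starting position and heading (radians).
    x = rng.uniform(0, width - 1)
    y = rng.uniform(0, height - 1)
    theta = rng.uniform(-math.pi, math.pi)
    points = []
    for _ in range(max_steps):
        ix, iy = int(round(x)), int(round(y))
        if not (0 <= ix < width and 0 <= iy < height):
            break  # assumption: stop when the walker leaves the image
        if guided and J[iy][ix] <= min_dist:
            break  # too close to a hole: the trajectory ends
        points.append((x, y))
        # Eqn. 1 and 2: step of length s along the current heading.
        x += s * math.cos(theta)
        y += s * math.sin(theta)
        # Eqn. 3: bounded random change of direction, t in [-pi/6, pi/6].
        theta += rng.uniform(-math.pi / 6, math.pi / 6)
    return points
```

Calling the function repeatedly on the same J yields multiple trajectories per background image, and passing a seeded `random.Random` makes a generated dataset reproducible.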
Due to the presence of the LSTM element in the neural network architecture, the neural network must be
trained with sequential time-series data. The trajectories generated by the previously described algorithm
are therefore used to generate synthetic data sequences with evolving crack features. This is achieved by
adding the values of a parametric kernel around each point of the sequence. The kernel parameters control
the intensity and width of the kernel, and thereby the resulting crack feature. The kernel parameters are
systematically varied within the sequence to mimic the evolution of the cracks on the surface over time.
A temporal parameter τ is varied over the length of the sequence, which influences the kernel parameters,
generating thicker and more intense lines for higher values of τ . To generate a diverse dataset, four different
types of sequences are generated as follows, where N is the length of the sequence:

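\[ \text{Full sequence:} \quad \tau = \{0, 1, 2, \ldots, N\} \tag{4} \]
\[ \text{Slow sequence:} \quad \tau = \left\{n,\ n + \frac{1j}{N},\ n + \frac{2j}{N},\ \ldots,\ n + j\right\}, \quad \text{with } n, j \in \mathbb{Z},\ n \in [0, N],\ j \in [1, 3] \tag{5} \]
\[ \text{Constant sequence:} \quad \tau = n, \quad n \in \mathbb{Z},\ n \in [1, N] \tag{6} \]
\[ \text{Blank sequence:} \quad \tau = 0 \tag{7} \]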

These ranges describe the different possibilities for the temporal parameter τ. In the full sequence (Eqn. 4),
the range of τ comprises the full possible range, resulting in a sequence that shows synthetic cracks growing
from nothing to full intensity. The slow sequence (Eqn. 5) comprises a subset of the full sequence with a
higher time resolution. The constant sequence (Eqn. 6) keeps τ fixed at a random point. The blank sequence
(Eqn. 7) keeps τ fixed at 0, adding the backgrounds with no cracks into the dataset. This algorithm was
implemented using Python, with the OpenCV library. (Bradski [2000])
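The four schedules for τ in Eqn. 4 to 7 can be written down directly. The sketch below is illustrative only: the function names are hypothetical, and it assumes that every schedule holds N + 1 values, which follows Eqn. 4 literally. As in Eqn. 5, the slow sequence is not clipped, so it may run past N when n is close to N.

```python
import random


def full_sequence(N):
    """Eqn. 4: tau runs over 0, 1, ..., N."""
    return [float(k) for k in range(N + 1)]


def slow_sequence(N, rng):
    """Eqn. 5: tau rises from a random integer n to n + j in steps of j / N,
    with n drawn from [0, N] and j from [1, 3]."""
    n = rng.randint(0, N)
    j = rng.randint(1, 3)
    return [n + k * j / N for k in range(N + 1)]


def constant_sequence(N, rng):
    """Eqn. 6: tau stays fixed at a random integer n in [1, N]."""
    n = rng.randint(1, N)
    return [float(n)] * (N + 1)


def blank_sequence(N):
    """Eqn. 7: tau stays at 0, i.e. background frames without cracks."""
    return [0.0] * (N + 1)
```

Each returned list supplies one τ value per frame, which then sets the kernel parameters for that frame.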
Finally, a noise layer was added to the image, to approximate the noise in the real dataset. This is
represented as a random value from a white noise distribution taken for each pixel and added to its value.
Representative results of this algorithm can be seen in Figure 5. By construction, the generated cracks avoid
the holes in the background, following the distance map. Using this algorithm, a physics-based synthetic
dataset was generated and used to train a neural network. To provide a point of comparison, a random
synthetic dataset was also generated and used to train a different neural network as described below. The
random synthetic dataset was generated using the same algorithm as the physics-based synthetic dataset,
with the end condition based on the distance transform of the background omitted. This results in a dataset
where the synthetic crack trajectories are taken from unguided random walkers. The main difference between
the two synthetic datasets will thereby be the presence of overlapping holes and cracks within the random
synthetic dataset. It should be noted that the trajectories of the random synthetic dataset
are still constructed with the restricted angle movement based on the real cracks, and therefore are still
partially based on the physics of the real system.
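The rendering step described in this section, a parametric kernel placed around each trajectory point followed by additive white noise, might look roughly like the sketch below. The source does not specify the kernel, so the Gaussian profile, the linear scaling of depth and width with τ, the parameter values (`max_depth`, `max_sigma`, `mask_level`, `noise_sigma`) and the function names are all assumptions. Because the cracks appear dark, the kernel values are subtracted from the background here.

```python
import math
import random


def crack_depth_map(shape, trajectory, tau, tau_max,
                    max_depth=80.0, max_sigma=1.5):
    """Per-pixel darkening caused by a crack at temporal parameter tau.
    Assumed kernel: a Gaussian whose depth and width grow linearly with tau."""
    height, width = shape
    depth = [[0.0] * width for _ in range(height)]
    if tau <= 0:
        return depth  # blank frame: no crack yet
    frac = min(tau / tau_max, 1.0)
    amplitude = max_depth * frac
    sigma = max_sigma * frac
    radius = int(math.ceil(3 * sigma))
    for px, py in trajectory:
        cx, cy = int(round(px)), int(round(py))
        for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
                d2 = (x - px) ** 2 + (y - py) ** 2
                value = amplitude * math.exp(-d2 / (2 * sigma ** 2))
                # Keep the maximum so overlapping stamps do not accumulate.
                if value > depth[y][x]:
                    depth[y][x] = value
    return depth


def render_frame(background, trajectory, tau, tau_max,
                 noise_sigma=5.0, mask_level=10.0, rng=None):
    """Return (image, mask): the background with the crack subtracted plus
    white noise, and the ground-truth crack mask for the same frame."""
    if rng is None:
        rng = random.Random()
    shape = (len(background), len(background[0]))
    depth = crack_depth_map(shape, trajectory, tau, tau_max)
    image = []
    for bg_row, d_row in zip(background, depth):
        row = []
        for b, d in zip(bg_row, d_row):
            noise = rng.gauss(0.0, noise_sigma) if noise_sigma > 0 else 0.0
            row.append(min(255.0, max(0.0, b - d + noise)))  # clamp to 8-bit range
        image.append(row)
    mask = [[d > mask_level for d in row] for row in depth]
    return image, mask
```

Rendering one frame per value of a τ schedule, with the same background and trajectories, gives an image sequence and its matching sequence of ground-truth masks.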

2.3 Neural Network Architecture for Segmentation Model


The network architecture that was used is based on the U-NET architecture as described in the Introduction,
with ResizeConv (Odena et al. [2016]) layers replacing ConvTranspose layers. The ConvTranspose layer used
in the original U-NET introduces checkerboard artifacts into the segmentation mask with each upsampling.
ResizeConv layers provide an alternative upsampling method: the image is resized, then convolution is
performed with a 3x3 trainable kernel. Additionally, a ConvLSTM (Shi et al. [2015]) layer was placed after
the final layer of the U-NET. The ConvLSTM layer introduces a recurrent element to the network, allowing
it to take advantage of the sequential nature of the input dataset.
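To make the resize-then-convolve idea concrete, the following framework-free sketch upsamples a single-channel image by nearest-neighbour repetition and then applies a 3x3 kernel with zero padding. It is an assumption-laden illustration rather than the layer used in the model: the actual network is implemented in PyTorch with trainable multi-channel kernels, whereas here the kernel is a fixed argument and the function names are hypothetical.

```python
def upsample_nearest(image, factor=2):
    """Nearest-neighbour resize: repeat every pixel `factor` times per axis."""
    return [[v for v in row for _ in range(factor)]
            for row in image for _ in range(factor)]


def conv3x3(image, kernel):
    """3x3 convolution (cross-correlation form, as in deep learning
    frameworks) with zero padding, so the output keeps the input size."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[ky][kx] * image[yy][xx]
            out[y][x] = acc
    return out


def resize_conv(image, kernel, factor=2):
    """Upsampling step: resize first, then convolve with a 3x3 kernel."""
    return conv3x3(upsample_nearest(image, factor), kernel)
```

Because every output pixel is computed from the same number of kernel taps (away from the border), a constant input gives a constant interior output. A strided transposed convolution lacks this property: its kernel footprints overlap unevenly, which is the origin of the checkerboard pattern discussed by Odena et al. [2016].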
The model was implemented and trained using the PyTorch library (Paszke et al. [2019]), using cross
entropy loss and a stochastic gradient descent optimizer with a learning rate of 0.8. The final generated
dataset consisted of 768 synthetic training sequences consisting of 16 512x512 images per sequence, and 256
synthetic validation sequences, with a minibatch size of 8 sequences. These parameters were chosen through
hyperparameter optimization. The model was trained on an NVIDIA A100 80GB GPU for 70 epochs.
The optimal epoch was then chosen by minimizing the validation and training error, and by evaluating the
performance on the real dataset. The implementation of the neural network and synthetic data generator
can be found in Supporting Information S2.

3 Results
3.1 Model Performance Benchmark
To show the impact of the physics-based distance map component of the synthetic dataset described in the
Synthetic Dataset for Model Training section, a benchmark was performed between two networks with an identical architecture (described
in the Neural Network Architecture for Segmentation Model section). The physics-based network was trained with the
physics-based synthetic dataset. The random network was trained with the random synthetic dataset.
The benchmark consists of a challenging synthetic test set. To generate the test set, the same algorithm
was used as for the physics-based training set, with fewer lines generated per image. Furthermore, temporal
information was not supplied in this case: each frame in the sequence was generated with the same iteration
value τ , leading to sequences with differences only in noise. In this process, an individual mask was stored
for each synthetic crack. The binarized output of the models was then evaluated as follows: First, to check
for true positives (TP), a threshold of pixels marked as cracks by the model within each individual mask
must be reached. Any sufficiently large cluster of marked pixels that is not within any of the masks is then
marked as a false positive (FP), and any unmarked pixels within the masks are marked as false negative
(FN). Any remaining marked pixels that were not assigned to the other categories are counted as noise
pixels. Lastly, a pixel-based Dice score (Dice [1945]) is calculated as
\[ \frac{2TP}{2TP + FP + FN} \tag{8} \]
where the FP value includes both clustered false positive pixels and noise pixels. (Carass et al. [2020])
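For reference, Eqn. 8 evaluated on a pair of binary masks reduces to a few lines. The sketch below is a simplification: it counts every marked pixel outside the ground truth as a false positive and omits the per-crack masks and the cluster-size logic described above. The function names and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def confusion_counts(pred, truth):
    """Pixel-wise TP, FP and FN counts for two binary masks of equal shape."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1  # marked by the model, absent from the ground truth
            elif t:
                fn += 1  # present in the ground truth, missed by the model
    return tp, fp, fn


def dice_score(pred, truth):
    """Eqn. 8: 2TP / (2TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if denom == 0 else 2 * tp / denom
```

As a worked example, a prediction with one correct pixel, one spurious pixel and one missed pixel gives TP = FP = FN = 1 and hence a Dice score of 2 / 4 = 0.5.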
The results of this benchmark can be seen in Figure 6. It is clear that the physics-based model, in
comparison to the random walk model, exhibits an overall decrease in the amount of false positives at the
cost of a minor penalty to the amount of true positives when evaluating a synthetic test dataset. The
Gaussian shapes of the distributions of False Positives in Figure 6 are expected (following the central
limit theorem), and show that the synthetic dataset generated for the benchmark contains a reliable and
diverse distribution of challenging segmentation tasks for the model. The effective result of the physics-based
component included in the dataset is that the model becomes much more specific towards the crack
feature, rather than the holes. This is in accordance with the expectation, as the random synthetic dataset
frequently includes the situation where a synthetic crack feature overlaps with a hole, confusing the model’s
trained filters. Despite coming at a small cost to the rate of recall of the feature, this change still yields an
improvement of the Dice score from the reduction in false positives.

3.2 ESEM Data Timeseries Segmentation


In this section we report the results obtained by applying the neural network trained with the physics-based
synthetic dataset (as described in the Methodology section) to the ESEM experiment dataset.
The model was applied to the whole dataset of the ESEM time series. The result of this can be found as
a video in Supporting Information S3. It can be seen that the model provides a reliable segmentation of
the dataset. The model is able to segment the crack features as soon as they begin forming on the surface,
without any false positives prior to the formation of the first crack feature. The segmentation remains
stable through the sudden scene changes and drifting within the microscopy data resulting from changes in
temperature in the experiment. For the intended application of the detection of the formation of the crack
feature on the surface, the reduction of false positives is of great importance. The observed reduction in the
area under the curve of the physics-based segmentation output, when compared to the random segmentation
output (as illustrated in Figure 7), can be attributed to such a decrease in false positives. This suggests that
the physics-based model provides segmentation results that are more aligned with the ground truth. Notably,
within the low-temperature regime (prior to crack formation, occurring before 300 minutes), the physics-
based component significantly reduces false positives, highlighting the role of the physics-based component
in the ability of the model to accurately identify the initial nucleation point of the crack feature.

4 Discussion
Comparing the segmented output of the model (crack amount) in Figure 7 to the selectivity of the catalyst
over the experiment, we can get a full picture of the relation between these factors. From this, it can be seen
that the crack formation occurred on the surface approximately 100 minutes after the acetone selectivity peak
ceased, and that a continuously increasing temperature is required to facilitate further crack growth. As the
relatively low resolution of the ESEM does not show us the full nanoscale image of the surface, it appears to
us that the cracks are a manifestation of an overall surface phase change that is initiated around 350 °C and
continues to affect the surface until the cracks are fully formed. We intend to investigate the nucleation of
the cracks further, by letting the trained neural network control the microscope. Upon detection of the onset
of crack formation, the neural network would send a command to the microscope to immediately quench the
reaction, stopping the heating and gas flow. Then, the catalyst can be placed in a higher-resolution electron
microscopy setup such as immersion-SEM or TEM to gain insights into the nature of the phase change.
(Kaegi and Holzer [2003])
A restriction of the physics-based synthetic data model approach is the fact that it must be possible to
approximately model the physical properties of the system. Some systems may be too complex or otherwise
infeasible to apply this approach to. Furthermore, a diverse enough set of background (areas with no features)
images is a prerequisite to apply this method. We plan to investigate the integration of theoretical simulation
techniques into the construction of synthetic data models. Data obtained from such simulations could be
used directly in the construction of synthetic data models (through rendering of structures into images) or
supplement known experimental results.
It is likely that the segmentation performance of the physics-based synthetic data model could be further
improved through experimentation with hyperparameters of the U-NET network architecture, entirely
different network architectures and other machine learning techniques such as utilizing pre-trained networks
(Hinterstoisser et al. [2017]). However, as the focus of this work was on the development of the physics-based
synthetic data generation method, this was considered out of scope for this work.

5 Summary
In this paper, we have shown that the segmentation of a feature on a complex catalyst surface can be
accurately performed by a neural network trained from random initial parameters only with synthetic data.
By integrating physical characteristics of the features on the catalyst surface into the synthetic training data
generation algorithm, we achieved superior accuracy and greatly decreased false positives.

6 Data Availability
Source code and data used in this article can be found at Scheurer et al. [2024].

7 Acknowledgements
We acknowledge funding from the German Federal Ministry of Education and Research in the framework
of the project CatLab (03EW0015A). This project was in part funded by the Deutsche Forschungsgemeinschaft
(DFG, German Research Foundation) - 388390466-TRR 247, subproject B6. This work was further
supported by the DFG under Germany’s Excellence Strategy – EXC 2089/1 – 390776260.

8 Competing Interests
The authors declare no competing interests.

References
R. Bermudez-Chacon, P. Marquez-Neila, M. Salzmann, and P. Fua. A domain-adaptive two-stream U-Net
for electron microscopy image segmentation. In 2018 IEEE 15th International Symposium on Biomedical
Imaging (ISBI 2018). IEEE, 2018. doi: 10.1109/isbi.2018.8363602.
G. Bradski. The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 2000.
A. Carass, S. Roy, A. Gherman, J. C. Reinhold, A. Jesson, T. Arbel, O. Maier, H. Handels, M. Ghafoorian,
B. Platel, A. Birenbaum, H. Greenspan, D. L. Pham, C. M. Crainiceanu, P. A. Calabresi, J. L. Prince,
W. R. G. Roncal, R. T. Shinohara, and I. Oguz. Evaluating white matter lesion segmentations with refined
Sørensen-Dice analysis. Sci. Rep., 10(1), 2020. ISSN 2045-2322. doi: 10.1038/s41598-020-64803-w.
S. W. Chang and S. W. Liao. KUnet: Microscopy image segmentation with deep Unet based convolutional
networks. In 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC). IEEE, 2019.
doi: 10.1109/smc.2019.8914048.
G. D. Danilatos. Introduction to the ESEM instrument. Microsc. Res. Tech., 25(5–6):354–361, 1993. ISSN
1097-0029. doi: 10.1002/jemt.1070250503.
B. L. DeCost, B. Lei, T. Francis, and E. A. Holm. High throughput quantitative metallography for complex
microstructures using deep learning: A case study in ultrahigh carbon steel. Microsc Microanal, 25(1):
21–29, 2019. ISSN 1435-8115. doi: 10.1017/s1431927618015635.
L. R. Dice. Measures of the amount of ecologic association between species. Ecology, 26(3):297–302, 1945.
ISSN 1939-9170. doi: 10.2307/1932409.
J. M. Ede. Deep learning in electron microscopy. Mach. Learn.: Sci. Technol., 2(1):011004, 2021. doi:
10.1088/2632-2153/abd614.
H. I. Fawaz, G. Forestier, J. Weber, L. Idoumghar, and P. Muller. Data augmentation using synthetic data
for time series classification with deep residual networks. CoRR, abs/1808.02455, 2018. doi: 10.48550/
arXiv.1808.02455.
C. K. Groschner, C. Choi, and M. C. Scott. Machine learning pipeline for segmentation and defect identification
from high-resolution transmission electron microscopy data. Microsc Microanal, 27(3):549–556,
2021. doi: 10.1017/s1431927621000386.
S. Hinterstoisser, V. Lepetit, P. Wohlhart, and K. Konolige. On pre-trained image features and synthetic
images for deep learning. CoRR, abs/1710.10710, 2017. doi: 10.48550/arXiv.1710.10710.
S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Comput., 9(8):1735–1780, 1997. ISSN
1530-888X. doi: 10.1162/neco.1997.9.8.1735.
R. Kaegi and L. Holzer. Transfer of a single particle for combined ESEM and TEM analyses. Atmos.
Environ., 37(31):4353–4359, 2003. ISSN 1352-2310. doi: 10.1016/s1352-2310(03)00574-0.
J. Liu, B. Hong, X. Chen, Q. Xie, Y. Tang, and H. Han. An effective AI integrated system for neuron
tracing on anisotropic electron microscopy volume. Biomed. Signal Process. Control, 69:102829, 2021.
ISSN 1746-8094. doi: 10.1016/j.bspc.2021.102829.
J. Madsen, P. Liu, J. Kling, J. B. Wagner, T. W. Hansen, O. Winther, and J. Schiøtz. A deep learning
approach to identify local structures in atomic-resolution transmission electron microscopy images. Advanced
Theory and Simulations, 1(8), 2018. ISSN 2513-0390. doi: 10.1002/adts.201800037.
A. Odena, V. Dumoulin, and C. Olah. Deconvolution and checkerboard artifacts. Distill, 1(10), 2016. doi:
10.23915/distill.00003.
N. Otsu. A threshold selection method from gray-level histograms. IEEE Trans. Syst., Man, Cybern., 9(1):
62–66, 1979. ISSN 2168-2909. doi: 10.1109/tsmc.1979.4310076.
A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga,
A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang,
J. Bai, and S. Chintala. PyTorch: An imperative style, high-performance deep learning library. In Advances
in Neural Information Processing Systems 32, pages 8024–8035. Curran Associates, Inc., 2019.
O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation.
CoRR, abs/1505.04597, 2015. doi: 10.48550/arXiv.1505.04597.
C. Scheurer, M. Vuijk, G. Ducci, L. Sandoval, M. Pietsch, K. Reuter, and T. Lunkenbein. Physics-Based
Synthetic Data Model for Automated Segmentation in Catalysis Microscopy; Data and Source Code. 2024.
doi: 10.17617/3.NWOKER.
N. M. Schneider, J. H. Park, M. M. Norton, F. M. Ross, and H. H. Bau. Automated analysis of evolving
interfaces during in situ electron microscopy. Advanced Structural and Chemical Imaging, 2(1), 2016. doi:
10.1186/s40679-016-0016-z.
A. Shah, J. A. Schiller, I. Ramos, J. Serrano, D. K. Adams, S. Tawfick, and E. Ertekin. Automated image
segmentation of scanning electron microscopy images of graphene using U-Net neural network. Mater.
Today Commun., 35:106127, 2023. ISSN 2352-4928. doi: 10.1016/j.mtcomm.2023.106127.

X. Shi, Z. Chen, H. Wang, D. Yeung, W. Wong, and W. Woo. Convolutional LSTM network: A machine
learning approach for precipitation nowcasting. CoRR, abs/1506.04214, 2015.
doi: 10.48550/arXiv.1506.04214.
P. Toth, J. Farrer, A. Palotas, J. Lighty, and E. Eddings. Automated analysis of heterogeneous carbon
nanostructures by high-resolution electron microscopy and on-line image processing. Ultramicroscopy,
129:53–62, 2013. doi: 10.1016/j.ultramic.2013.02.017.
P. Trampert, D. Rubinstein, F. Boughorbel, C. Schlinkmann, M. Luschkova, P. Slusallek, T. Dahmen, and
S. Sandfeld. Deep neural networks for analysis of microscopy images—synthetic data generation and
adaptive sampling. Crystals, 11(3):258, 2021. doi: 10.3390/cryst11030258.

K. P. Treder, C. Huang, C. G. Bell, T. J. A. Slater, M. E. Schuster, D. Özkaya, J. S. Kim, and A. I. Kirkland.
nNPipe: a neural network pipeline for automated analysis of morphologically diverse catalyst systems.
npj Comput. Mater., 9(1), 2023. doi: 10.1038/s41524-022-00949-7.
P. J. Uwins, G. J. Millar, and M. L. Nelson. Dynamic imaging of structural changes in silver catalysts by
environmental scanning electron microscopy. Microsc. Res. Tech., 36(5):382–389, 1997.
doi: 10.1002/(SICI)1097-0029(19970301)36:5<382::AID-JEMT8>3.0.CO;2-N.
M. Wilms, H. Handels, and J. Ehrhardt. Multi-resolution multi-object statistical shape models based on the
locality assumption. Med. Image Anal., 38:17–29, 2017. ISSN 1361-8415. doi: 10.1016/j.media.2017.02.003.
M. Ziatdinov, O. Dyck, A. Maksov, X. Li, X. Sang, K. Xiao, R. R. Unocic, R. Vasudevan, S. Jesse, and
S. V. Kalinin. Deep learning of atomically resolved scanning transmission electron microscopy images:
Chemical identification and tracking local transformations. ACS Nano, 11(12):12742–12752, 2017. ISSN
1936-086X. doi: 10.1021/acsnano.7b07504.

Fig. 1: Experimental ESEM setup. ESEM is a technique that allows SEM images to be taken under
atmospheric conditions (Danilatos [1993]). Pumps create a pressure gradient that satisfies both the high
vacuum required to operate the electron gun and the low vacuum required for the gas flow of catalytic
experiments within the sample chamber. QMS denotes a quadrupole mass spectrometer, which collects data
on the output of the experiment.

Fig. 2: Selected QMS output of the ESEM experiment. The graph shows the ion current intensities for the
desired partial oxidation of isopropanol to acetone (blue curve) and for the undesired full oxidation to
carbon dioxide (red curve). The acetone production shows a peak (outlined) in the first heating cycle that
is absent in the second cycle.

Fig. 3: Crack formation in the reaction time-series. The frames shown are 50 frames apart, with a scan time
of 35.40 s per frame. Contrast has been increased to make features easier to discern. The important
features are the "holes", dark areas visible in a, b, and c, and the "cracks", thin dark lines additionally
visible in b and c.

Fig. 4: Synthetic crack generation algorithm. 1) A point is placed at random coordinates L_x, L_y with a
random angle L_θ. 2) The location of the next point is determined from the angle and position of the
current point. 3) The distance transform map value at the new coordinates L_x, L_y is checked against the
threshold; if it is above the threshold, the angle L_θ is given a new random value. 4) Steps 2 and 3 are
repeated until the distance transform map value falls below the threshold of 5 pixels distance from a hole,
at which point the algorithm ends and all points are saved as a sequence. If an initial point fails the
threshold check, the trajectory is discarded, so the number of trajectories varies between sequences.
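
The four steps in the caption of Fig. 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `distance_to_holes` and `grow_crack`, the step length `step`, the cap `max_points`, the uniform redraw of the angle, and the rule that leaving the image ends a crack are all hypothetical choices. Only the threshold of 5 pixels is taken from the caption. The brute-force distance map is meant for small test images only.

```python
import math
import random


def distance_to_holes(hole_mask):
    """Euclidean distance from each pixel to the nearest hole pixel (brute force).

    hole_mask is a nested list of 0/1 values and must contain at least one 1.
    """
    holes = [(r, c) for r, row in enumerate(hole_mask)
             for c, value in enumerate(row) if value]
    return [[min(math.hypot(r - hr, c - hc) for hr, hc in holes)
             for c in range(len(row))]
            for r, row in enumerate(hole_mask)]


def grow_crack(dist_map, rng, step=2.0, threshold=5.0, max_points=500):
    """Return one crack trajectory as a list of (x, y) points, or None if discarded."""
    height, width = len(dist_map), len(dist_map[0])
    # Step 1: random starting position and random angle.
    x, y = rng.uniform(0, width - 1), rng.uniform(0, height - 1)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    if dist_map[round(y)][round(x)] < threshold:
        return None  # initial point fails the threshold check: discard
    points = [(x, y)]
    while len(points) < max_points:  # assumed cap so the walk always terminates
        # Step 2: next point from the current position and angle.
        x += step * math.cos(theta)
        y += step * math.sin(theta)
        if not (0 <= x <= width - 1 and 0 <= y <= height - 1):
            break  # assumption: a crack leaving the image simply ends
        points.append((x, y))
        # Step 3: threshold check on the distance transform map.
        if dist_map[round(y)][round(x)] < threshold:
            break  # Step 4: within 5 pixels of a hole, so the crack ends here
        theta = rng.uniform(0.0, 2.0 * math.pi)  # new random angle (assumed uniform)
    return points


if __name__ == "__main__":
    # Toy 40 x 40 image with a single square hole in the centre.
    mask = [[1 if 18 <= r <= 21 and 18 <= c <= 21 else 0 for c in range(40)]
            for r in range(40)]
    dist = distance_to_holes(mask)
    rng = random.Random(0)
    cracks = [grow_crack(dist, rng) for _ in range(5)]
    print([None if c is None else len(c) for c in cracks])
```

Because discarded seeds return `None`, calling `grow_crack` a fixed number of times yields a varying number of trajectories, which matches the behaviour described in the caption.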

Fig. 5: a) Background taken from an initial room-temperature catalyst image before the heating cycle and
thus without cracks present; b, c) Examples of synthetic crack evolution superimposed on the background
image at temporal parameter τ = 7 and τ = 15; d) Distance transform: brighter pixels have greater distance
from voids in a; e, f) Synthetically generated masks corresponding to b, c.

Fig. 6: Accuracy distribution of the synthetic data benchmark. Data from 1024 synthetic sequences. Each
input sequence contained five generated crack features persistently present across all of its frames, so
an ideal segmentation would yield 5 true positives and no false positives. The sequences were segmented
and classified, as described, by two neural networks, one trained with a physics-based synthetic data
model, the other with a random synthetic data model. The resulting Dice scores are 0.76 (physics-based)
and 0.64 (random).
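
For reference, the Dice score (Dice, 1945) of a predicted mask against a ground-truth mask is 2·TP / (2·TP + FP + FN). The sketch below computes it per pixel for binary masks given as nested lists. The caption does not state whether the reported scores are pixel-wise or feature-wise, so the pixel-wise form is an assumption, as are the function name `dice_score` and the convention of returning 1.0 when both masks are empty.

```python
def dice_score(pred, truth):
    """Pixel-wise Dice coefficient between two binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # pixel marked in both masks
            elif p:
                fp += 1  # pixel marked only in the prediction
            elif t:
                fn += 1  # pixel marked only in the ground truth
    denominator = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denominator == 0 else 2 * tp / denominator


if __name__ == "__main__":
    prediction = [[1, 1, 0], [0, 0, 0]]
    ground_truth = [[1, 0, 0], [0, 1, 0]]
    print(dice_score(prediction, ground_truth))  # one TP, one FP, one FN
```

A score of 1 means the two masks coincide exactly, and a score of 0 means they share no activated pixel.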

Fig. 7: Segmentation output of the physics-based and random models over the first 800 minutes of the
ESEM experiment dataset. Area is the sum of activated pixels in the model output. Sharp dips in the
output result from image shifts caused by temperature changes.
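
The area measure used in Fig. 7 amounts to counting activated pixels in each frame. A minimal sketch follows; the names `crack_area` and `area_series` and the activation level of 0.5 are assumptions for illustration, since the caption does not state how a pixel is counted as activated.

```python
def crack_area(frame, activation=0.5):
    """Number of pixels in one model output frame above the activation level."""
    return sum(1 for row in frame for value in row if value > activation)


def area_series(frames, activation=0.5):
    """Crack area for every frame of a sequence, in frame order."""
    return [crack_area(frame, activation) for frame in frames]


if __name__ == "__main__":
    # Two toy 2 x 2 output frames with values in [0, 1].
    frames = [
        [[0.1, 0.9], [0.7, 0.2]],
        [[0.0, 0.0], [0.0, 0.6]],
    ]
    print(area_series(frames))
```

Plotting the resulting list against acquisition time gives a curve of the kind shown in the figure.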
