Micron 171 (2023) 103473

Contents lists available at ScienceDirect

Micron
journal homepage: www.elsevier.com/locate/micron

Machine Learning for grouping nano-objects based on their morphological parameters obtained from SEM analysis
Paweł Kozikowski
Central Institute for Labour Protection – National Research Institute, ul. Czerniakowska 16, 00-701 Warsaw, Poland

ARTICLE INFO

Keywords:
Nanoparticles
Electron microscopy
Size distribution analysis
Machine learning

ABSTRACT

Nanoparticles have unique properties that make them useful in a variety of applications, but their potential toxicity raises concerns about their safety. Accurate characterization of nanoparticles is essential for understanding their behavior and potential risks. In this study, we employed machine learning algorithms to automatically identify nanoparticles based on their morphological parameters, achieving high classification accuracy. Our results demonstrate the effectiveness of machine learning for nanoparticle identification and highlight the need for more precise characterization methods to ensure their safe use in various applications.

1. Introduction

Aerosols in the work environment are often classified by their aerodynamic or electrical mobility equivalent diameter (Kuhlbusch et al., 2011; Glytsos et al., 2010). A widely used method is electrical mobility sizing coupled with a condensation nucleus counting technique, as in a scanning mobility particle spectrometer (SMPS), which has been proposed as one of the standard instruments for measuring nanoparticles. The particle number size distribution of the atmospheric aerosol is used to calculate the effects of aerosols on climate, human health, and ecosystems (Wiedensohler et al., 2012). However, these instruments provide only limited information on nanoparticle shape, because the analysis process loses information on particle morphology (Chen et al., 2016; Brostrøm et al., 2019; Braakhuis et al., 2014). Such classification has its limitations, since workers are usually exposed to various processes simultaneously, resulting in different concentrations, dimensions, aggregations, and agglomerations that can alter the toxicity of the nanoparticles (Brouwer et al., 2013; Modena et al., 2019).

This reveals a limitation of real-time instruments such as SMPS, ELPI, and CPC: these methods describe particles only in terms of concentration and particle mobility diameter. In part, this parameter is similar to the equivalent diameter, since it allows any irregularly shaped particle to be described as a spherical particle of a given diameter (DeCarlo et al., 2004). However, this is an oversimplification and a broad approximation, as all indirect methods assume the particle to be spherical (Schmid et al., 2007). Since both particle size and shape influence toxicity, a better understanding of the morphology of the produced nanoparticles is necessary to define the health risks (Kelly and Fussell, 2020; Sharma, 2009; Donaldson and Poland, 2013). Non-spherical particles can be described using multiple length and width measures as well as parameters describing their shape (Buckland et al., 2021; Arenas-Guerrero et al., 2018; Frank et al., 2022; Vippola et al., 2016). Such descriptions provide greater accuracy but also greater complexity. Microscopy is the only technique that can describe and identify nanoparticles morphologically using various parameters (Pellegrino et al., 2022). SEM analysis not only allows statistical analysis of the particle size distribution (albeit not as precisely as SMPS analysis), but also allows the identification of nano-objects in the working environment, the classification of particles in terms of morphology, and the determination of parameters (Brostrøm et al., 2020; Rühle et al., 2021).

The aim of this study is to determine the usability of microscopic methods coupled with machine learning for automatically identifying nanoparticles. Graphite nanoparticles were generated with varying parameters, specifically spark frequency, which changes the particle size and morphology. A test aerosol consisting of these particles was collected on membrane filters and observed using SEM. Automatic segmentation and in-depth particle analysis based on SEM images were conducted, revealing differences in morphology. Each particle was described in detail, and the data were used in several machine learning algorithms to automatically group the particles based on their descriptive parameters.

2. Materials and methods

Graphite nanoparticles were generated using a PALAS GFG 1000

E-mail address: [email protected].

https://doi.org/10.1016/j.micron.2023.103473
Received 1 March 2023; Received in revised form 25 April 2023; Accepted 25 April 2023
Available online 29 April 2023
0968-4328/© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

nano-aerosol generator (Palas, Germany) (Kozikowski and Sobiech, 2022). This device generates fine carbon particles by creating a spark between two graphite electrodes in an inner chamber of the generator under a protective argon atmosphere (Evans et al., 2003). The carbon released during sparking is carried away by the argon and forms aggregates and agglomerates of elementary carbon particles. For the purpose of this study, nanoparticles were produced at two spark frequencies, 15 and 45 Hz, to show how the morphology of the generated particles changes with increasing spark frequency. The constant energy of each individual spark guarantees a stable particle size distribution over a broad size range (geometric standard deviation σg > 1.6) (Roth et al., 2004).

Nanoparticles were collected using a Mini Moudi 135 cascade impactor (TSI) together with an SG 10–2 personal pump. The flow rate was set to 2 l/min. A cascade impactor separates and collects airborne particles by size using a series of plates with decreasing hole sizes, on which particles of different sizes are impacted and collected according to their inertial properties. In this study the cascade impactor was used only to separate the "bigger" particles (cascades 1–6) from the nanoparticles. The particles of interest for microscopic analysis were collected only from the nucleopore membranes (1 µm) located behind the impactor.

The collected particles were observed using a Hitachi SU8010 cold field emission scanning electron microscope (FE-SEM) (Hitachi, Japan). Observations were carried out at an accelerating voltage of 10 kV and at 20 000x magnification (pixel size of 2.48 nm). Since the membranes are polycarbonate, the substrates were sputtered with a conductive layer. The sputtering was performed on clean filters before the experiment, since sputtering loaded substrates after aerosol exposure would affect the particle size. The conductive layer was deposited using a Quorum Technologies Q150ES sputtering machine. This process was carried out in a protective atmosphere of argon supplied at a pressure of 0.3 bar. Additionally, nitrogen was supplied to vent the sputter chamber.

Images were binarized using an in-house Python script for image segmentation. This semi-automated method first normalizes the image based on the intensity of the substrate and then thresholds the particles based on their intensity. The binary images are processed with the scikit-image library to label each particle (Van Der Walt et al., 2014). The resulting data are analysed using machine learning techniques from scikit-learn, an open-source machine learning library for Python (Pedregosa et al., 2012), and Keras/TensorFlow (Abadi et al., 2016) for the deep neural network. The following supervised machine learning algorithms were used: k-Nearest Neighbors, Logistic Regression, Support Vector Machine (SVM), Gaussian Naive Bayes (GNB), Decision Tree, Random Forest and Deep Neural Network. All methods used default hyperparameters in order to establish the baseline accuracy of each model.

The NN architecture was designed using the Keras API with the TensorFlow backend. The model consists of four densely connected layers with 16, 4, 4, and 1 nodes, respectively (the number of nodes in the input layer was equal to the number of features in the dataset). The rectified linear unit (ReLU) activation function was applied to all layers except the output layer, which used the sigmoid activation function to output a probability value. The model was trained using the Adam optimizer with the binary cross-entropy loss function.

To prevent overfitting, an early stopping strategy was implemented with a patience value of 25 epochs: the validation loss was monitored during training, and training was stopped if the validation loss did not improve for 25 consecutive epochs. The training dataset was split into a training set and a validation set, with the latter used for early stopping. The model was trained for a maximum of 250 epochs.

3. Results and discussion

Fig. 1 displays SEM images of nanoparticles generated at two spark frequencies (15 and 45 Hz) after a 30-minute collection time. For the 15 Hz spark frequency, spherical particles and their aggregates can be observed (Fig. 1a). The particles are regular and circular, and no agglomerates of particles are present. The particles are uniformly deposited on the substrate without any localized deposition around the membrane openings or distinct clusters of overlapping particles.

Fig. 1b shows SEM images for a spark frequency of 45 Hz. At this frequency, the particles are no longer spherical but have a distinctly different morphology. These particles are characterized by a smaller primary particle size and resemble agglomerates with a fractal-like structure. Their contrast with the substrate is greater than that of the spherical particles due to the greater surface area of the agglomerates. The particles are uniformly deposited on the substrate without any localized deposition around the membrane holes or distinct clusters of superimposed particles. Because two types of particles were present in the aerosol, all particles were labelled either as "spherical particles" or as "agglomerates".

Eleven images were taken for the 15 Hz spark frequency and seven images for the 45 Hz spark frequency (Fig. 2). Image analysis retrieved 5925 particles labelled "spherical particles" and 5673 labelled "agglomerates". The number of analysed images was chosen to ensure a balanced dataset (roughly equal number of labels), since an unbalanced dataset can cause issues with classification predictions (Viloria et al., 2020; Lemaître et al., 2017; Prati et al., 2009).

Each particle was described using the parameters shown in Table 1. In total, 16 parameters, including area, equivalent diameter, circularity, and intensity, were calculated for each particle, with the addition of one target label.

Most of the parameters exhibit high positive skewness, indicating that a larger number of data points have lower values. This can lead to machine learning models performing better at predicting particles with lower values and less effectively at predicting particles with higher values (Hammouri et al., 2020; Feng et al., 2014, 2013). Since skewed data can affect machine learning models, parameters with a skewness greater than 0.5 were log-transformed to approximate a normal distribution.

Fig. 3 shows an example of two particles, characterized as

Fig. 1. SEM images of graphite particles deposited on the filter membrane after 30 min of collection; sparking frequency a) 15 Hz, b) 45 Hz.
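The early-stopping rule described in Section 2 (a patience of 25 epochs and at most 250 epochs) can be illustrated without any deep learning library. The sketch below is a minimal, framework-independent version of that rule; the function name `early_stopping_epoch` and the synthetic loss curve are assumptions made for illustration and are not taken from the author's script, which relied on Keras.

```python
def early_stopping_epoch(val_losses, patience=25, max_epochs=250):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has not improved for
    `patience` consecutive epochs, or after `max_epochs` epochs.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    last_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        last_epoch = epoch
        if loss < best_loss:
            # New best validation loss: reset the patience counter.
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return last_epoch


# Hypothetical validation losses: improve for 100 epochs, then plateau.
synthetic_losses = [1.0 / epoch for epoch in range(1, 101)] + [0.02] * 150
stop_epoch = early_stopping_epoch(synthetic_losses)
```

With this synthetic curve the last improvement occurs at epoch 100, so the rule halts training 25 epochs later rather than running to the 250-epoch limit.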


Fig. 2. SEM images of graphite particles deposited on the filter membrane after 30 min of collection; sparking frequency a) 15 Hz, b) 45 Hz.
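The log-transformation of skewed parameters described in Section 3 can be sketched as follows. This is a minimal illustration using only the Python standard library; the helper names, the moment-based skewness estimator, and the example values are assumptions, since the paper does not state which estimator or implementation was used. Only the 0.5 threshold comes from the text.

```python
import math


def skewness(values):
    """Moment-based sample skewness g1 = m3 / m2**1.5 (biased estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_transform_if_skewed(values, threshold=0.5):
    """Log-transform a strictly positive feature if its skewness exceeds the threshold."""
    if skewness(values) > threshold:
        return [math.log(v) for v in values]
    return list(values)


# Hypothetical particle areas: many small particles and a few large ones.
areas = [1.0, 10.0, 100.0, 1000.0, 10000.0]
transformed = log_transform_if_skewed(areas)
```

The logarithm is only defined for positive values, so a rule of this kind applies to size-like parameters such as area or perimeter; it cannot correct negative skewness.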

agglomerates and spherical particles, respectively, with their characteristic parameters listed in Table 2. Automated segmentation performed reasonably well in distinguishing the particles from the background, enabling in-depth characterization of each particle.

Table 1
Summary of characteristic parameters and their skewness.

Feature name  Definition                                                                  Skewness
AC            Area of the convex hull                                                     6.500214
A             Area of the particle                                                        4.863661
AF            Fraction of the particle's pixels in the image                              4.814997
P             Perimeter of the particle                                                   4.262473
a             Length of the major axis of the enclosed ellipse                            3.269451
FMAX          Maximum Feret's diameter, computed as the longest distance between points   3.089374
PC            Perimeter of the convex area                                                2.800444
b             Length of the minor axis of the enclosed ellipse                            2.792951
dE            Diameter of a circle with the same area as the region                       2.023977
CX            Ratio of convex perimeter to perimeter                                      -1.16496
EC            Eccentricity of the ellipse that has the same second moments as the particle  -1.11203
S             Ratio of area to convex area                                                -0.54236
I             Mean intensity of the particle                                              0.539526
C             Circularity, defined as $C = \sqrt{4\pi A / P^{2}}$                          -0.35652
label         0 for "spherical particles" and 1 for "agglomerates"                        -0.10976
AR            Ratio of minor to major axis length                                         0.106103
EX            Ratio of pixels in the region to pixels in the total bounding box           -0.09734

Fig. 4 shows the Pearson linear correlation coefficient (Mu et al., 2018) between the characteristic parameters and the label. The calculated coefficients revealed a relatively high linear correlation between the mean intensity of particles and the label, confirming that aggregates tend to have higher contrast with the background than spherical particles. A negative linear correlation was found for most characteristic parameters describing shape and size. The lowest correlation was found for the length of the major axis of the enclosed ellipse, indicating that this parameter is not significant in distinguishing between the two labels. Surprisingly, the commonly used Feret diameter did not show a high linear correlation with the label. To improve the learning capabilities of distance-based models and neural networks (Wang et al., 2022), all features were standardized using a standard scaler.

The data were analyzed using several classification algorithms to automatically assign the labels 0 ("spherical particles") and 1 ("agglomerates"). The classification performance of six algorithms was assessed on the test set using accuracy, and the related uncertainties were estimated. To minimize the effect of overfitting, 10-fold cross-validation was used. The accuracy of the algorithms was estimated by splitting the data into a training set and a test set, fitting the model, and computing the accuracy, ten consecutive times with different splits each time. Fig. 6 shows the classification accuracy and standard deviation for each

Fig. 3. SEM image and binary image of agglomerate (a) and spherical nanoparticle (b).
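The two preprocessing and screening steps mentioned in Section 3, Pearson correlation with the label and standardization of the features, can be written out in a few lines. The sketch below uses only the standard library; the function names and the intensity values are hypothetical, and the population standard deviation is used because that is the convention of scikit-learn's `StandardScaler`.

```python
import math


def standardize(values):
    """Scale a feature to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mean intensities and labels (0 = spherical, 1 = agglomerate).
intensity = [30.0, 32.0, 34.0, 38.0, 39.0, 41.0]
label = [0, 0, 0, 1, 1, 1]

r_raw = pearson(intensity, label)
r_scaled = pearson(standardize(intensity), label)
```

Standardization is a linear rescaling, so it leaves the correlation with the label unchanged; its purpose is to put all features on a common scale for distance-based models and neural networks.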


Table 2
Summary of characteristic parameters for the agglomerate and the spherical nanoparticle shown in Fig. 3.

AC     A [nm]  AF    P     a    FMAX  PC   b    dE   CX    EC    S     I   C     AR    EX    label
40681  15012   5E-4  1837  349  315   828  156  138  0.45  0.89  0.36  39  0.23  0.45  0.30  1
33354  14871   6E-4  1449  299  275   778  168  137  0.53  0.83  0.44  34  0.30  0.56  0.29  0
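Several entries in Table 2 are derived quantities and can be recomputed from the primary measurements using the definitions in Table 1. The sketch below does this for the agglomerate (first row). The function names are illustrative, and the expression used for the equivalent diameter, the diameter of a circle of equal area, is the standard one; Table 1 gives that definition only in words.

```python
import math


def circularity(area, perimeter):
    """C = sqrt(4*pi*A / P**2), following the definition in Table 1."""
    return math.sqrt(4.0 * math.pi * area / perimeter ** 2)


def equivalent_diameter(area):
    """Diameter of a circle with the same area as the particle."""
    return math.sqrt(4.0 * area / math.pi)


def aspect_ratio(minor_axis, major_axis):
    """AR: ratio of minor to major axis length."""
    return minor_axis / major_axis


def solidity(area, convex_area):
    """S: ratio of particle area to convex-hull area."""
    return area / convex_area


# Primary measurements of the agglomerate in Table 2 (first row).
A, AC, P, a, b = 15012.0, 40681.0, 1837.0, 349.0, 156.0

C = circularity(A, P)           # Table 2 lists 0.23
d_E = equivalent_diameter(A)    # Table 2 lists 138
AR = aspect_ratio(b, a)         # Table 2 lists 0.45
S = solidity(A, AC)             # Table 2 lists 0.36
```

The recomputed values agree with the tabulated ones to within the two-digit precision of the table, which supports reading the circularity entry of Table 1 as a square root taken over the whole fraction.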

tested model. The accuracy strongly depends on the model, with the decision tree having the lowest accuracy (0.85) and the neural network the highest (0.95). Logistic Regression achieved the second-best score but also had the highest standard deviation, which suggests a high degree of overfitting that depends on the datasets used. This is in agreement with the literature, as this algorithm tends to overfit high-dimensional data (Bartlett et al., 2020). The neural network had both the highest accuracy and the lowest standard deviation, which was achieved after more than 100 epochs. Fig. 5 shows the decrease of the training and testing losses with increasing epoch number. The decrease was rapid up to 25 epochs and slowed down after that. The loss on the testing dataset was close to that on the training dataset throughout the training process, implying that no overfitting occurred. The fitting process was stopped at the point where the two losses started to diverge, which signalled the onset of overfitting. Fig. 6.

Fig. 4. Pearson correlation coefficient between characteristic parameters and label.

Table 3 shows the confusion matrix for each model using the entire dataset. The rows represent the true label of the particles, while the columns represent the predicted label. A confusion matrix summarizes a binary classifier's predictions by assigning each of them to one of four cells (TP, true positive; FP, false positive; FN, false negative; TN, true negative). A purely diagonal matrix means that the predictions are perfect. Non-zero counts in the top right or bottom left cell mean that the model mislabelled some particles. Looking at the confusion matrices, it can be observed that for the majority of models, spherical particles were predicted with higher accuracy than aggregates. The highest accuracy was achieved by the neural network model.

The overall accuracy of a machine learning model is highly dependent on the quality of the input data. Poor-quality data can significantly reduce the model's predictive capabilities. In the case of nanoparticle analysis, the input data come from 2D images collected using a microscope. Table 4 shows the number of particles that were correctly and incorrectly labelled for each 2D image using a random forest classifier. The accuracy of the classifier ranges from 0.85 to 0.96 depending on the image, which can be attributed to artefacts present in the images due to the semi-automatic segmentation process. Manual cleaning can improve the quality of the input data, but it is a time-consuming process that undermines the use of machine learning. To improve the accuracy of the model, we should focus on obtaining higher-quality images at higher resolutions, improving automatic segmentation, and optimizing hyperparameters.

4. Conclusions

The study used image analysis augmented by machine learning algorithms to automatically identify particles based on their morphological parameters. Machine learning proved effective in identifying nanoparticles, with the best-performing model, the neural network, achieving an accuracy of 0.95. Logistic regression achieved the second-best score but had the highest standard deviation, which can be explained by overfitting on the training data. Simpler algorithms, such as random forest and decision trees, achieved lower scores. Even with the best model, improvements can still be made, especially in terms of better image segmentation, hyperparameter optimization, larger datasets, and cleaning the images of artefacts that are unavoidable in automatic image processing. Unfortunately, the latter improvement is time-consuming, and moving forward, manual handling of large-scale data should be avoided.

The presented work shows that a more precise description of the produced nanoparticles can be obtained with microscopic and machine learning methods than with real-time instruments. Such detailed characterization is increasingly important for understanding the

Fig. 5. Binary cross entropy loss function of training and test datasets.

Fig. 6. Mean classification accuracy of machine learning algorithms.
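The accuracy and per-class rates discussed in Section 3 follow directly from the cells of a confusion matrix. The sketch below recomputes them for the neural network from the counts given in Table 3, taking rows as true labels and columns as predicted labels, as stated in the text. The function names are illustrative and not part of the author's code.

```python
def accuracy(matrix):
    """Fraction of correctly classified particles (sum of diagonal / total)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Neural network confusion matrix from Table 3.
# Rows: true label (0 = spherical, 1 = agglomerate); columns: predicted label.
nn_matrix = [
    [1638, 121],
    [86, 1635],
]

nn_accuracy = accuracy(nn_matrix)          # Table 3 lists 0.940
nn_recall = per_class_recall(nn_matrix)
```

The off-diagonal cells (121 and 86) are the mislabelled particles, and the per-class recall shows how the errors are distributed between the two labels.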


behaviour and potential risks of nanoparticles.

Table 3
Confusion matrices of the models for the classification of spherical particles (0) and agglomerates (1) on the test dataset.

Neural Network (accuracy 0.940)
        0      1
0    1638    121
1      86   1635

Logistic Regression (accuracy 0.847)
        0      1
0    1627    187
1     298   1368

Random Forest Classifier (accuracy 0.903)
        0      1
0    1697    117
1     196   1470

k-Neighbours Classifier (accuracy 0.873)
        0      1
0    1650    164
1     257   1409

SVM (accuracy 0.895)
        0      1
0    1709    105
1     240   1426

Decision Tree (accuracy 0.856)
        0      1
0    1581    233
1     229   1437

Table 4
Percentage of falsely labelled particles for each 2D image.

Image file  Total number of particles  Properly labelled  Falsely labelled  Accuracy
15 Hz_1     629                        580                49                0.92
15 Hz_2     485                        441                44                0.90
15 Hz_3     597                        552                45                0.92
15 Hz_4     573                        531                42                0.92
15 Hz_5     668                        629                39                0.94
15 Hz_6     563                        532                31                0.94
15 Hz_7     590                        555                35                0.94
15 Hz_8     533                        505                28                0.94
15 Hz_9     415                        388                27                0.93
15 Hz_10    357                        336                21                0.94
15 Hz_11    515                        497                18                0.96
45 Hz_1     582                        510                72                0.86
45 Hz_2     629                        547                82                0.85
45 Hz_3     989                        875                114               0.87
45 Hz_4     760                        689                71                0.90
45 Hz_5     839                        769                70                0.91
45 Hz_6     977                        846                131               0.85
45 Hz_7     897                        763                134               0.82
Total       11598                      10545              1053

Declaration of Competing Interest

All authors have participated in (a) conception and design, or analysis and interpretation of the data; (b) drafting the article or revising it critically for important intellectual content; and (c) approval of the final version. This manuscript has not been submitted to, nor is under review at, another journal or other publishing venue. The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript. The following authors have affiliations with organizations with direct or indirect financial interest in the subject matter discussed in the manuscript.

Data Availability

Data will be made available on request.

Acknowledgement

This paper is published and based on the results of a research task carried out within the scope of the fifth stage of the National Programme "Improvement of safety and working conditions", supported within the scope of state services by the Ministry of Family and Social Policy, task no. 2.SP.13, UM-2/DPR/PD/2020/02, entitled "Development of nano-objects sampling method and their analysis using advanced imaging techniques". The Central Institute for Labour Protection – National Research Institute is the Programme's main co-ordinator.

I would like to thank Piotr Sobiech for his support in setting up the experimental setup.

References

Abadi, M., et al., 2016. TensorFlow: large-scale machine learning on heterogeneous distributed systems. arXiv. https://doi.org/10.48550/ARXIV.1603.04467.
Arenas-Guerrero, P., et al., 2018. Determination of the size distribution of non-spherical nanoparticles by electric birefringence-based methods. Sci. Rep. 8 (1), 1–10. https://doi.org/10.1038/s41598-018-27840-0.
Bartlett, P.L., Long, P.M., Lugosi, G., Tsigler, A., 2020. Benign overfitting in linear regression. Proc. Natl. Acad. Sci. U. S. A. 117 (48), 30063–30070. https://doi.org/10.1073/pnas.1907378117.
Braakhuis, H.M., Park, M.V.D.Z., Gosens, I., De Jong, W.H., Cassee, F.R., 2014. Physicochemical characteristics of nanomaterials that affect pulmonary inflammation. Part. Fibre Toxicol. 11 (1). https://doi.org/10.1186/1743-8977-11-18.
Brostrøm, A., Kling, K.I., Koponen, I.K., Hougaard, K.S., Kandler, K., Mølhave, K., 2019. Improving the foundation for particulate matter risk assessment by individual nanoparticle statistics from electron microscopy analysis. Sci. Rep. 9 (1), 1–13. https://doi.org/10.1038/s41598-019-44495-7.
Brostrøm, A., Kling, K.I., Hougaard, K.S., Mølhave, K., 2020. Complex aerosol characterization by scanning electron microscopy coupled with energy dispersive X-ray spectroscopy. Sci. Rep. 10 (1), 1–15. https://doi.org/10.1038/s41598-020-65383-5.
Brouwer, D.H., et al., 2013. Workplace air measurements and likelihood of exposure to manufactured nano-objects, agglomerates, and aggregates. J. Nanopart. Res. 15 (11). https://doi.org/10.1007/s11051-013-2090-7.
Buckland, H.M., et al., 2021. Measuring the size of non-spherical particles and the implications for grain size analysis in volcanology. J. Volcanol. Geotherm. Res. 415, 107257. https://doi.org/10.1016/j.jvolgeores.2021.107257.
Chen, B.T., et al., 2016. Performance of a scanning mobility particle sizer in measuring diverse types of airborne nanoparticles: Multi-walled carbon nanotubes, welding fumes, and titanium dioxide spray. J. Occup. Environ. Hyg. 13 (7), 501–518. https://doi.org/10.1080/15459624.2016.1148267.
DeCarlo, P.F., Slowik, J.G., Worsnop, D.R., Davidovits, P., Jimenez, J.L., 2004. Particle morphology and density characterization by combined mobility and aerodynamic diameter measurements. Part 1: theory. Aerosol Sci. Technol. 38 (12), 1185–1205. https://doi.org/10.1080/027868290903907.
Donaldson, K., Poland, C.A., 2013. Nanotoxicity: challenging the myth of nano-specific toxicity. Curr. Opin. Biotechnol. 24 (4), 724–734. https://doi.org/10.1016/j.copbio.2013.05.003.
Evans, D.E., Harrison, R.M., Ayres, J.G., 2003. The generation and characterisation of elemental carbon aerosols for human challenge studies. J. Aerosol Sci. 34 (8), 1023–1041. https://doi.org/10.1016/S0021-8502(03)00069-7.
Feng, C., et al., 2014. Log-transformation and its implications for data analysis. Shanghai Arch. Psychiatry 26 (2), 105–109. https://doi.org/10.3969/j.issn.1002-0829.2014.02.
Feng, C., Wang, H., Lu, N., Tu, X.M., 2013. Log transformation: application and interpretation in biomedical research. Stat. Med. 32 (2), 230–239. https://doi.org/10.1002/sim.5486.
Frank, U., Uttinger, M.J., Wawra, S.E., Lübbert, C., Peukert, W., 2022. Progress in multidimensional particle characterization. KONA Powder Part. J. 39 (39), 3–28. https://doi.org/10.14356/kona.2022005.
Glytsos, T., Ondráček, J., Džumbová, L., Kopanakis, I., Lazaridis, M., 2010. Characterization of particulate matter concentrations during controlled indoor activities. Atmos. Environ. 44 (12), 1539–1549. https://doi.org/10.1016/j.atmosenv.2010.01.009.
Hammouri, H.M., Sabo, R.T., Alsaadawi, R., Kheirallah, K.A., 2020. Handling skewed data: a comparison of two popular methods. Appl. Sci. 10 (18). https://doi.org/10.3390/APP10186247.
Kelly, F.J., Fussell, J.C., 2020. Toxicity of airborne particles - established evidence, knowledge gaps and emerging areas of importance: topical aspects of particle toxicity. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 378 (2183). https://doi.org/10.1098/rsta.2019.0322.
Kozikowski, P., Sobiech, P., 2022. Comparison of nanoparticles' characteristic parameters derived from SEM and SMPS analyses. J. Nanopart. Res. 24 (6). https://doi.org/10.1007/s11051-022-05480-w.
Kuhlbusch, T.A.J., Asbach, C., Fissan, H., Göhler, D., Stintz, M., 2011. Nanoparticle exposure at nanotechnology workplaces: a review. Part. Fibre Toxicol. 8 (1), 22. https://doi.org/10.1186/1743-8977-8-22.
Lemaître, G., Nogueira, F., Aridas, C.K., 2017. Imbalanced-learn: a python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 18, 1–5.
Modena, M.M., Rühle, B., Burg, T.P., Wuttke, S., 2019. Nanoparticle characterization: what to measure? Adv. Mater. 31 (32). https://doi.org/10.1002/adma.201901556.
Mu, Y., Liu, X., Wang, L., 2018. A Pearson's correlation coefficient based decision tree and its parallel implementation. Inf. Sci. (N. Y.) 435, 40–58. https://doi.org/10.1016/j.ins.2017.12.059.
Pedregosa, F., et al., 2012. Scikit-learn: Machine Learning in Python. https://doi.org/10.48550/ARXIV.1201.0490.
Pellegrino, F., Ortel, E., Mielke, J., Schmidt, R., Maurino, V., Hodoroaba, V.D., 2022. Customizing new titanium dioxide nanoparticles with controlled particle size and shape distribution: a feasibility study toward reference materials for quality


assurance of nonspherical nanoparticle characterization. Adv. Eng. Mater. 24 (6), 1–10. https://doi.org/10.1002/adem.202101347.
Prati, R.C., Batista, G.E.A.P.A., Monard, M.C., 2009. Data mining with unbalanced class distributions: concepts and methods. Proc. 4th Indian Int. Conf. Artif. Intell. IICAI 359–376.
Roth, C., et al., 2004. Generation of ultrafine particles by spark discharging. Aerosol Sci. Technol. 38 (3), 228–235. https://doi.org/10.1080/02786820490247632.
Rühle, B., Krumrey, J.F., Hodoroaba, V.D., 2021. Workflow towards automated segmentation of agglomerated, non-spherical particles from electron microscopy images using artificial neural networks. Sci. Rep. 11 (1), 1–10. https://doi.org/10.1038/s41598-021-84287-6.
Schmid, O., Karg, E., Hagen, D.E., Whitefield, P.D., Ferron, G.A., 2007. On the effective density of non-spherical particles as derived from combined measurements of aerodynamic and mobility equivalent size. J. Aerosol Sci. 38 (4), 431–443. https://doi.org/10.1016/j.jaerosci.2007.01.002.
Sharma, V.K., 2009. Aggregation and toxicity of titanium dioxide nanoparticles in aquatic environment - a review. J. Environ. Sci. Heal. Part A 44 (14), 1485–1495. https://doi.org/10.1080/10934520903263231.
Van Der Walt, S., et al., 2014. Scikit-image: Image processing in python. PeerJ 2014 (1), 1–18. https://doi.org/10.7717/peerj.453.
Viloria, A., Lezama, O.B.P., Mercado-Caruzo, N., 2020. Unbalanced data processing using oversampling: machine learning. Procedia Comput. Sci. 175, 108–113. https://doi.org/10.1016/j.procs.2020.07.018.
Vippola, M., Valkonen, M., Sarlin, E., Honkanen, M., Huttunen, H., 2016. Insight to nanoparticle size analysis - novel and convenient image analysis method versus conventional techniques. Nanoscale Res. Lett. 11 (1), 6–11. https://doi.org/10.1186/s11671-016-1391-z.
Wang, S., Lu, H., Khan, A., Hajati, F., Khushi, M., Uddin, S., 2022. A machine learning software tool for multiclass classification. Softw. Impacts 13, 100383. https://doi.org/10.1016/j.simpa.2022.100383.
Wiedensohler, A., et al., 2012. Mobility particle size spectrometers: Harmonization of technical standards and data structure to facilitate high quality long-term observations of atmospheric particle number size distributions. Atmos. Meas. Tech. 5 (3), 657–685. https://doi.org/10.5194/amt-5-657-2012.
