Ctabdomin Ifj
https://doi.org/10.1007/s11042-024-18578-1
Abstract
Abdominal organs such as the liver, pancreas, spleen, and kidneys must be carefully delineated in order to properly diagnose and treat abdominal illnesses. Although deep learning segmentation methods perform well, many of them still struggle with partial volume effects, image noise, and data imbalance. The purpose of this research is to colourize CT scans in order to improve these segmentation techniques. For the purpose of segmenting various organs in thoracic CT images, we suggest an adversarial training technique for deep neural networks; the suggested adversarial architecture is a U-Net generative adversarial network (U-Net-GAN). Such training typically needs large quantities of GPU RAM (perhaps exceeding hardware restrictions) and takes a long time. With the findings of this publication, which highlight how strong outcomes are still possible with a reduced-resource architecture, our goal is to get more scientists interested in and involved with deep neural networks. To accomplish multi-organ segmentation, we combine cutting-edge pre-processing methods, a well-designed model, and fusion amongst models that have been trained on the same datasets. Using state-of-the-art approaches from a public competition as a benchmark, we show that our design performs substantially better.
1 Introduction
The realm of medical imaging presents a disparity between real-world clinical scenarios
and the training data used for organ or pathology segmentation in deep learning models.
While publicly available annotated datasets are limited, contemporary medical facilities
possess vast amounts of unlabeled patient imaging data [1]. Annotated data scarcity, driven
by privacy concerns and the laborious nature of expert annotation, necessitates the explora-
tion of methods for model training without direct access to previous datasets. Fine-tuning
(FT) [2] stands out as a common sequential training method, but it encounters challenges
in accurately segmenting abdominal organs crucial for diagnosing and treating abdominal
diseases, such as the pancreas, liver, spleen, and kidneys. Despite various effective deep
learning segmentation approaches, inherent issues like partial volume effects, image noise,
and data distribution disparities persist [3].
Clinical practitioners increasingly rely on automated organ delineation, especially
from multi-sequence MR images, crucial for radiotherapy planning. However, segmenting
organs from MR volumes poses challenges due to variations across sequences. Advance-
ments in convolutional neural networks (CNNs) offer promise for automating organ deline-
ation, yet there’s scant research comparing CNN performance to emerging techniques like
adversarial learning (AL) in segmentation [4]. In radiotherapy preparations for patients
with cancer of the head and neck, precise delineation of multiple organs from scheduling
CT images is vital to minimize toxicity and preserve normal tissue. However, the diverse
size and class variation among head and neck structures pose challenges in training a sin-
gle deep neural network to encompass them all [5]. Adaptive radiotherapy, utilizing Cone-
Beam CT images before radiation administration, holds potential to enhance radiation
delivery accuracy and treatment outcomes by modifying contouring and treatment plans
[6]. While organ delineation is crucial for diagnosis and treatment, its subjectivity and time-
consuming nature make automation challenging. Dual energy CT, with its higher picture
contrast compared to traditional single energy CT, shows promise in aiding autonomous
organ segmentation [7]. Abdominal organ segmentation, pivotal in computer-aided diag-
nostics and laparoscopic surgery, faces limitations in addressing the diversity in organ form
and location using current approaches [8], often focusing on specific organs.
Deep learning heavily relies on well-labeled histology data for successful nuclei seg-
mentation, impacting its quality and efficacy [9]. While deep learning models have shown
significant success in multi-organ segmentation, their dependence on extensive labeled
datasets for each organ poses a challenge, especially with small or partially labeled medi-
cal image datasets [10]. The laborious manual effort involved in delineating organs-at-risk
for radio therapeutic treatment planning remains a bottleneck [11]. Managing mobility
during radiation treatment for the pancreas is challenging. CBCT-based adaptive radio-
therapy offers innovative offline or real-time plan adjustments, although manual delinea-
tion remains a critical step in adaptive re-planning, particularly for rapid re-planning needs
[12]. Organ segmentation in medical images, crucial for disease detection, faces difficul-
ties due to fluid boundaries, varying sizes, and conventional segmentation methods that
might overlook vital geographical information, especially for organs like the pancreas with
diverse geometries and small volumes [13]. Developing deep-learning-based techniques
for accurate and robust auto-segmentation of organs-at-risk (OARs), such as those in pan-
creatic cancer cases using contrast-enhanced CT images, offers a solution to the challeng-
ing manual outlining of OARs [14]. Non-contrast CT scans are invaluable for diagnosing
various conditions, but the inability to detect intra-organ borders due to the lack of contrast
presents a significant limitation [15].
1.1 Research aim
Our goal was to demonstrate that architectures that operate in low-resource settings can nonetheless produce segmentation results that place at the top of a well-known multi-organ task or contest. In this way, researchers with a range of hardware capacities may employ deep neural networks to make strides in the area.
To that end, we imposed several constraints. The first was to use a deep learning framework that could be trained on a GPU with up to 8 GB of
memory. A moderately priced video card can satisfy this requirement. We adopted this constraint to demonstrate that a reliable DL architecture can provide satisfactory results even with less memory than is used by current modern studies (which utilise up to 24 GB GPUs).
The second constraint we set for ourselves was to choose a well-known yet straightforward DL network. We settled on the 3D U-Net framework since it has been extensively studied and has repeatedly been shown to be effective in medical segmentation.
Given present computing capabilities and the pace of technological improvement in DL, these are strict requirements; yet a well-designed architecture is still able to deliver significant and predictable results.
2 Related work
Using only a small labelled cohort of single-phase images, the authors of [16] estab-
lished a novel segmentation method called co-heterogeneous as well as adaptive segments
(CHASe) that could adapt to all unlabeled cohorts of different multi-phase data, potentially
introducing innovative medical situations and diseases. They offered a flexible system that
combined semi-supervision based on appearance, adversarial domain adaptation based on
masks, and pseudo-labeling to achieve this goal. Additionally, they provided co-heteroge-
neous training, a unique approach that combined co-training with hetero modality learning.
CHASe was evaluated on a clinically broad and challenging dataset consisting of 1147 peo-
ple and 4577 3D images from multi-phase CT imaging scans. When used on non-contrast
CT scans, for instance, CHASe had the potential to increase pathological liver mask Dice-Sorensen scores by anywhere from 4.2% to 9.4%, reaching as high as 84.6% and 94.0%.
In the [17] scenario, a model improved its ability to learn new tasks at the expense of its
ability to perform previously acquired ones. To solve this problem, the Learning without
Forgetting (LwF) method replayed its predictions for previous tasks during the modeling
phase. In that study, the authors compared FT and LwF on the publicly accessible AAPM dataset for class-incremental learning in multi-organ segmentation. They demonstrated that
LwF was capable of remembering information from previously completed segmentations;
nevertheless, its capacity for learning a new class declined as more classes were added.
Researchers proposed a continual adversarial learning segmentation method (ACLSeg) to decompose the feature space into task-specific and task-invariant features as a solution to this issue. This facilitated efficient learning and retention of prior skills.
The study [18] used deep learning to separate numerous organ areas from a single non-
contrast CT volume. They also demonstrated the efficacy of fine-tuning using a small
sample of training data for segmenting several organs. Recognizing patient-specific ana-
tomic elements in medical pictures was crucial for any imaging analysis system; hence,
they explored the application of 3D U-Net to segment several organ regions from contrast-
enhanced abdominal CT volumes. Since non-contrast CT images were so often used in medicine, it was crucial that a medical image analysis system could distinguish multi-organ regions from non-contrast CT volumes. In this research, they employed a 3D U-Net trained on
sparse data to extract low-contrast organ areas from a CT image. Using a model that had
already been trained using data from the literature, they then performed fine-tuning. The
3D U-Net model had already been trained using data from several contrast-enhanced CT
volumes. When that was done, a modest number of non-contrast CT data were used for
fine-tuning. The experiments proved that the optimized 3D U-Net system could separate
several organs from a CT scan with no contrast. Several organ regions could be effectively
segmented using the proposed fine-tuning-based training approach even when only a small
quantity of training information was provided.
Researchers in [19] hypothesized that colorizing CT images could aid segmentation approaches. They provided a new design for colorizing CT images based on residual U-blocks and spectral-normalized stacked generative adversarial networks, introducing artificial color. DenseV-Net was utilized to train and test V-Net on abdominal CT scans, aiming to segment multiple organs simultaneously. Clinical investigations showed that colorizing CT scans improved the (Dice similarity coefficient, Hausdorff distance) pair from (0.32, 302.7) to (0.67, 78.2). Researchers in [20] recognized the need to
address class imbalance issues during training for each resultant class (one subcategory per
component) by developing a single-class segmentation algorithm for each of the 12 OARs. They detailed a straightforward weight-averaging methodology that initializes a model as the mean of several models trained on distinct organs, along with a transfer learning approach for reusing characteristics learned on one set of OARs in another. They utilized the same
U-net foundation. Experiments conducted on 200 head-and-neck cancer patients who had received external beam radiation therapy showed that the proposed model significantly outperformed the baseline multi-organ segmentation strategy, which trained several OARs simultaneously. The suggested approach used transfer learning across OARs and a weight-averaging strategy to achieve an overall Dice score of 0.75 ± 0.12, demonstrating that acceptable segmentation performance could be achieved by incorporating data from nearby structures to reduce the uncertainty of ground-truth annotations. The authors of [21] aimed to convince more aca-
demics to utilize and study deep neural networks by demonstrating that remarkable results
could be attained even when choosing a low-resource design. Experts combined cutting-
edge pre-processing methods, an efficient model architecture, and model fusion among models trained on the same dataset to accomplish multi-organ segmentation. The success
of their strategy in competing alongside state-of-the-art alternatives in an openly published
competition demonstrated its effectiveness (Table 1).
As recommended by the authors in [22], commonly acquired thoracic CT scans could be automatically segmented into several organs at risk using a GAN. To facilitate comprehensive segmentation, the generator employed a multi-label U-Net. The GAN was trained on ROIs corresponding to the esophagus and the spinal cord. The final contour was reconstructed by fusing the trained network's probability maps for a new thoracic multi-organ CT scan. Twenty patients' records were analyzed, all of whom had
chest CT scans with manual contouring. The average Dice similarity coefficient was
0.73 ± 0.04 for the esophagus, 0.85 ± 0.04 for the heart, 0.96 ± 0.01 for the left lungs,
0.97 ± 0.02 for the right lungs, and 0.88 ± 0.03 for the spinal cord. This advanced deep
learning-based solution utilized the GAN approach, allowing the automatic and precise
segmentation of a large number of OARs in lung CT scans, suggesting it might be use-
ful for improving the effectiveness of lung radiation therapy scheduling. The authors of
[23] proposed a synthetic multi-organ segment guided by MRI and cone-beam CT for
use in adaptive radiation of the prostate. To create sMRI from a CBCT picture, they
first trained cycle-consistent generative adversarial networks using pairs of pre-aligned CBCT and MRI images. Then, two distinct U-Nets were used to derive attribute maps from the CBCT and the sMRI. Attention gates were used to merge the attribute maps before feeding them as input to a convolutional neural network to predict the final segmentation of these crucial components. Using data from 100 patients, the segmentation findings were analyzed: bladder, rectum, prostate, right femoral head, and left femoral head dice similarity coefficients were 0.96 ± 0.03, 0.91 ± 0.08, 0.93 ± 0.04, and 0.95 ± 0.05, respectively.

Table 1 In terms of segmentation performance, VGG16 outperforms 2DCNN and 3DCNN

Learning Model   Dice Value   Number of Layers
U-Net-GAN        0.741        16
U-Net            0.726        50
3DCNN            0.690        121
In [24], researchers proposed a framework for the unpaired translation of medical
images from the portal-venous phase to non-contrast CT volumes. The potential use of
image-to-image translation in medical image analysis applications, such as segmentation,
was enormous. Several deep learning-based segmentation algorithms for contrast-enhanced
CT volumes had been suggested recently. However, patients sensitive to contrast media
might only have non-contrast CT scans, emphasizing the significance of segmentation
with non-contrast CT volumes. To address this issue, image conversion from regular CT
to contrast-enhanced CT was explored. The study employed the unpaired image-to-image translation network (UNIT) and the cycle-consistent GAN (CycleGAN) to translate images. They trained a
segmentation algorithm using U-Net on contrast-enhanced CT scans to evaluate the trans-
lation efficiency for multi-organ segmentation. The experiments showed that image trans-
lation significantly improved multi-organ segmentation, notably enhancing segmentation
accuracy.
To tackle these difficulties, researchers demonstrated a method for nuclei segmenta-
tion using conditional generative adversarial networks trained on both synthetic and real
data [25]. Employing an unpaired GAN architecture, they produced a sizable library of
H&E training pictures with accurate labels for nuclei segmentation. To enhance nuclei
segmentation, a conditioned GAN was developed using both synthetic and real histolog-
ical data from six distinct organs. Compared to standard CNN models, their adversarial
regression methodology demonstrated more reliable enforcement of high-order spatial
consistency. The method outperformed the status quo in nuclei segmentation, especially
in identifying individual and overlapping nuclei, proving applicable across various organs,
locations, patients, and disease states.
The authors of research [26] presented a completely automated delineation approach to accelerate adaptive radiotherapy re-planning as well as the evaluation and tracking of dose-volume-based plans. They used a cycle-consistent adversarial architecture to create synthetic CT from CBCT in order to decrease scatter artefacts and enhance image quality. They then used a mask-scoring region-based convolutional neural network (R-CNN) trained on synthetic CT scans to extract the characteristics required for final segmentation. Their technique was evaluated using multiple metrics, including the Dice similarity coefficient, the 95th-percentile Hausdorff distance, mean surface distance, and residual mean square distance. They obtained DSC
values between 0.82 and 0.94 across eight different organs. The methodology proposed by
researchers in [27] involved three steps to gather specific information for precise segmenta-
tion: collecting global spatial features from multiple receptive fields using multi-scale field selection, and combining multi-level features located at different nodes of the network using a multi-channel fusion module (MCFM) to increase consistency between the segmentor's output likelihood maps and the original samples. Their proposed MSC-DUnet showed a 5.1% higher Dice similarity coefficient than the baseline network in their analyses of the NIH Pancreas-CT data collection, demonstrating the enormous potential of MSC-DUnet for pancreatic segmentation.
In [28], the authors studied confocal immunofluorescent (IF) images of lung cells
from the LungMAP dataset to understand more about lung development. They aimed for
accurate multi-class segmentation of lung confocal IF images and explored the utilization
of a state-of-the-art deep learning-based system. However, the widespread deployment
of deep neural network models was impeded by a lack of high-quality training data and
ground-truth segmentation labels. To address this, they developed multi-class segmenta-
tion and emphasized using synthetic photographs in the classification of IF images, thereby
expanding the training dataset and enhancing overall segmentation precision using GAN
models. Experimental results showcased a 15.1% improvement in six-class segmentation
accuracy by using Mask R-CNN. Notably, their suggested strategy significantly enhanced
precision despite limited data availability. They highlighted the potential of synthetic
datasets to rectify data imbalances and augment the overall dataset size. In [29], a team of researchers proposed a distinctive approach to automated abdominal multi-organ segmentation by incorporating spatial information into the supervoxel classification procedure. They segmented images using Simple Linear Iterative Clustering (SLIC) to eliminate supervoxels located near anatomical edges. A random forest classifier predicted supervoxel labels, considering both spatial and intensity information, to separate the liver, left kidney,
and spleen from other abdominal contents using 30 CT scans. The experimental outcomes
indicated that their suggested approach outperformed the previous model-based segmen-
tation method used. In [30], the authors introduced an AutoML method to expedite the
calculation time required for effective hyperparameter optimization in predictive learning.
Their method utilized proxy data instead of the entire dataset. Qualitative and quantitative
results illustrated that their system could construct more effective segmentation systems
compared to AutoML, which employed poorly calibrated hyperparameters and randomly
selected training groups, as well as manually generated deep learning-based methods. The
median Dice score for segmenting abdominal organs using 10 classes was reported as 85.7.
The reviewed studies encompass various novel approaches in different domains of
machine learning and cybersecurity. A Light Deep Learning (LightDL) recommender
system [31] was proposed, utilizing Twitter-based reviews to enhance sentiment anal-
ysis in recommender systems. By incorporating semantic, syntactic, symbolic, and
tweet-based features, this model accurately categorized sentiments and achieved a com-
mendable 95% accuracy for Twitter data. To address the scarcity of annotated data in
Chinese Named Entity Recognition (NER) tasks, a method combining local adversarial
training, attention mechanisms, and transfer learning using ALBERT was introduced
[32]. This approach substantially improved dataset size, bolstered accuracy, and stabi-
lized the NER model, validated on People’s Daily 2004 and Tsinghua University data-
sets. Study [33] introduced FDGUA, a forward-derivative-based method for crafting
graph universal adversarial attacks targeting Graph Neural Networks (GNNs). This
technique demonstrated high attack success rates of over 80% on the Cora dataset, uti-
lizing merely three anchor nodes. Additionally, the authors proposed Graph Universal
Adversarial Training (GUAT) as a defense mechanism against such attacks, effectively
enhancing GNN robustness without compromising accuracy. Addressing the limita-
tions of Intrusion Detection Systems (IDS), a novel intrusion detection model named
NMAIFS MOP-AQAI was introduced [34]. It leveraged NMAIFS for efficient feature
reduction and MOP-AQAI for classification, resulting in improved intrusion detection
performance. The exploration of Deep Learning architectures for image generation from
text descriptions was detailed in another study [35]. This review encompassed various
GAN variations such as DCGAN, StackGAN, StackGAN++, and AttnGAN, assess-
ing their capabilities in converting textual descriptions into semantically similar images.
These models focused on iterative refinement, tree-like structures, and attentional
mechanisms to enhance the accuracy and quality of generated images. Lastly, a study
proposed a white-hat worm launcher employing machine learning for Botnet Defense
Systems (BDS) in large-scale IoT networks [36]. This system aimed to strategically
deploy white-hat worms across extensive IoT networks, effectively combating malicious
botnets and reducing infected devices by approximately 30–40%.
3 Proposed work
3.1 Research gaps
3.2 Preprocessing
This process is the first stage in the pipeline. CT scans have variable voxel sizes due to the diverse configurations of CT scanners and the patients' unique anatomical features. The many image artifacts caused by these variables make segmentation more difficult. As the first stage of preprocessing, the CT scans are resampled to standardize the slice thickness and reduce the image size (by a factor of 2).
As a second stage, we clipped voxel intensities on the Hounsfield scale to the range relevant to the target organs. As a last step, we applied Z-score normalization to the images. Integration of colorization into CT scans significantly aug-
ments segmentation techniques, notably within thoracic CT imaging contexts. This
enhancement introduces crucial advantages by enhancing visual contrast, aiding in bet-
ter differentiation among various anatomical structures. By assigning distinct colors
to different organs, colorization renders their boundaries clearer, assisting segmenta-
tion algorithms in more precise identification and segmentation. Moreover, it serves
as a complementary feature, providing additional information in cases where grayscale
imaging lacks sufficient contrast. Through improved visual cues, colorization aids in
organ boundary delineation, particularly in complex thoracic CT scans, ultimately con-
tributing to refined segmentation accuracy and reduced errors in organ segmentation.
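To make the two preprocessing stages above concrete, here is a minimal sketch assuming SimpleITK and NumPy; the target spacing and the Hounsfield window are illustrative placeholders, since the exact values are not stated above.

```python
import SimpleITK as sitk
import numpy as np

def preprocess_ct(path, new_spacing=(1.5, 1.5, 3.0), hu_window=(-200, 300)):
    """Resample to a common voxel spacing, clip to an HU window, z-score normalize."""
    image = sitk.ReadImage(path)
    # Resample to a uniform spacing so the slice thickness is standardized.
    orig_spacing = image.GetSpacing()
    orig_size = image.GetSize()
    new_size = [int(round(sz * sp / nsp))
                for sz, sp, nsp in zip(orig_size, orig_spacing, new_spacing)]
    image = sitk.Resample(image, new_size, sitk.Transform(), sitk.sitkLinear,
                          image.GetOrigin(), new_spacing, image.GetDirection(),
                          0.0, image.GetPixelID())
    vol = sitk.GetArrayFromImage(image).astype(np.float32)
    # Clip voxel intensities on the Hounsfield scale to the organs of interest.
    vol = np.clip(vol, hu_window[0], hu_window[1])
    # Z-score normalization.
    vol = (vol - vol.mean()) / (vol.std() + 1e-8)
    return vol
```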
In most cases, a set-theoretical framework adequately characterizes the process of medical image segmentation: given a medical image I and similarity criteria C_i (i = 1, 2, ...), I is segmented so that it may be divided into smaller pieces, or segments:

$$\bigcup_{x=1}^{N} R_x = I, \quad R_x \cap R_y = \emptyset, \;\; \forall x \neq y, \; x, y \in [1, N] \tag{1}$$

where each region $R_x$ satisfies the similarity criteria $C_i$ (i = 1, 2, ...), and $R_y$ likewise denotes a distinct region. After division, there are at most N regions, where N is a positive integer greater than 2. Medical image segmentation may be broken down into the following steps:
1. Acquire a dataset of medical images consisting of a training set, a validation set, and a test set. When machine learning is applied to image processing, the data are typically split into these three sections; it is common practice to use the three distinct sets while developing a neural network framework.
2. To enlarge the dataset, first pre-process the images, which usually entails standardizing the original image and then randomly rotating and scaling it (a minimal augmentation sketch follows this list).
3. Segment the medical image using a reliable approach, and then display the results.
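As an illustration of step 2, a hedged augmentation sketch is given below, assuming SciPy; the rotation and scale ranges are illustrative assumptions, not values taken from the paper.

```python
import numpy as np
from scipy.ndimage import rotate, zoom

def augment(volume, rng=np.random.default_rng()):
    """Randomly rotate and scale a normalized 3D CT volume (axis 0 = slices)."""
    # Small in-plane rotation; order=1 keeps interpolation cheap.
    angle = rng.uniform(-10, 10)
    volume = rotate(volume, angle, axes=(1, 2), reshape=False,
                    order=1, mode="nearest")
    # Random isotropic scaling around 1.0.
    factor = rng.uniform(0.9, 1.1)
    scaled = zoom(volume, factor, order=1)
    # Crop or zero-pad back to the original shape (corner-aligned for brevity).
    out = np.zeros_like(volume)
    src = [min(s, o) for s, o in zip(scaled.shape, volume.shape)]
    out[:src[0], :src[1], :src[2]] = scaled[:src[0], :src[1], :src[2]]
    return out
```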
3.4 3D U-Net
There is just one encoding path along with a single decoding path, much like U-Net. Each path has four resolution levels. Every encoding level consists of two 3 × 3 × 3 convolutions, each followed by a ReLU layer. A max-pooling layer is used to accomplish the reduction in dimensionality. Levels in the decoding path begin with a 2 × 2 × 2 deconvolution layer with a stride of 2, followed by two 3 × 3 × 3 convolution layers; a ReLU layer is added after each convolution. Feature maps from the encoding path with the same resolution as those in the decoding path are passed directly to the latter, supplying it with its original, higher-resolution characteristics. The network can segment 3D images because it receives as input a contiguous sequence of 2D slices from a 3D volume. In addition to being able to train on a sparsely labelled dataset and estimate additional unlabeled locations within that dataset, this network can also train on several sparsely labelled datasets and make predictions about completely new data. The input is a volumetric image with three channels (132 × 132 × 116 voxels), which is larger than the input used by U-Net; the output is a 44 × 44 × 28 segmentation. The 3D U-Net keeps the best of both FCN and U-Net. The development of this technology greatly aids volumetric imaging.
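The following is a minimal two-level PyTorch sketch of the encoder-decoder pattern described above; the real network uses four resolution levels, and the channel and class counts here are illustrative assumptions.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    # Two 3x3x3 convolutions, each followed by ReLU, as in one 3D U-Net level.
    return nn.Sequential(
        nn.Conv3d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv3d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class UNet3D(nn.Module):
    """Minimal two-level 3D U-Net sketch; spatial dims must be divisible by 2."""
    def __init__(self, in_ch=1, n_classes=5, base=16):
        super().__init__()
        self.enc1 = block(in_ch, base)
        self.enc2 = block(base, base * 2)
        self.pool = nn.MaxPool3d(2)  # dimensionality reduction
        self.up = nn.ConvTranspose3d(base * 2, base * 2, 2, stride=2)  # 2x2x2, stride 2
        self.dec1 = block(base * 2 + base, base)  # skip connection concatenated
        self.head = nn.Conv3d(base, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.head(d1)
```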
3.5 Proposed methodology
A U-shaped CNN is trained to map the input abdominal scans to multi-organ segmentation maps, as shown in Fig. 1. An encoder and a decoder are the typical components of such a network. To extract the encoded semantic data from the input scans, a contracting path with several downsampling blocks is used in the encoder. The final segmentation map for every organ is obtained after passing the features through a decoder, an expanding path that restores the characteristics to their previous size. The network is optimized using the following objective function, given the target segmentation map (g), network (f), and input abdominal scan (x):

$$\arg\min_{f} L_{seg}(x, f, g) = \arg\min_{f} \left[ \gamma \, \mathrm{Dice}(f(x), g) + (1 - \gamma) \, \mathrm{CE}(f(x), g) \right] \tag{2}$$

where Dice is the dice loss, CE is the cross-entropy loss, and γ is a weight that was empirically determined to be 0.5.
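A sketch of the objective in Eq. (2), assuming PyTorch with logits of shape (batch, classes, D, H, W) and integer labels; the soft-Dice formulation below is one common choice, since the paper does not spell out its exact Dice variant.

```python
import torch
import torch.nn.functional as F

def seg_loss(logits, target, gamma=0.5, eps=1e-6):
    """Eq. (2): gamma * Dice loss + (1 - gamma) * cross-entropy."""
    ce = F.cross_entropy(logits, target)
    probs = F.softmax(logits, dim=1)
    onehot = F.one_hot(target, probs.shape[1]).movedim(-1, 1).float()
    dims = tuple(range(2, probs.ndim))  # sum over spatial dimensions
    inter = (probs * onehot).sum(dims)
    dice = (2 * inter + eps) / (probs.sum(dims) + onehot.sum(dims) + eps)
    dice_loss = 1 - dice.mean()
    return gamma * dice_loss + (1 - gamma) * ce
```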
In order to purposefully impair network performance, we compute the statistics of the generated adversarial noise and inject them into clean features. The noise injection of the adversarial feature is defined as follows:

$$f_i^{noisy, S_r}(x) = \sigma^{adv} \, \frac{f_i^{clean} - \mu^{clean}}{\sigma^{clean}} + \mu^{adv} \tag{3}$$

where $\mu$ is the first-order moment, $\sigma$ is the second-order moment, $(\mu^{clean}, \sigma^{clean})$ is computed from $f_i^{clean}$, and $(\mu^{adv}, \sigma^{adv})$ is computed from $f_i^{adv, S_r}$. The noisy adversarial feature $f_i^{noisy, S_r}$ is afterwards fed into the network in place of the clean feature $f_i^{clean}$ (Fig. 2).
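Eq. (3) amounts to re-normalizing the clean feature map with the adversarial moments; a minimal PyTorch sketch follows, where computing the statistics per channel over the spatial dimensions is an assumption.

```python
import torch

def inject_adversarial_moments(f_clean, f_adv, eps=1e-6):
    """Eq. (3): standardize the clean feature map with its own moments, then
    re-scale and re-center it with the adversarial feature's moments."""
    dims = tuple(range(2, f_clean.ndim))  # spatial dimensions
    mu_c = f_clean.mean(dims, keepdim=True)
    sd_c = f_clean.std(dims, keepdim=True) + eps
    mu_a = f_adv.mean(dims, keepdim=True)
    sd_a = f_adv.std(dims, keepdim=True)
    return sd_a * (f_clean - mu_c) / sd_c + mu_a
```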
The proposed adversarial training technique employs a two-part framework integrating generative models with adversarial networks, specifically a generative adversarial network (GAN), as indicated in Fig. 1. This approach involves two networks, the generator G and the discriminator D, engaged in a competitive learning process. The first component, the generator, takes in a random signal z (a noise vector) and uses it to create an image. The second component, the discriminator, determines whether or not an image is fake: it takes an image (represented by the variable x) as input and returns a probability, denoted D(x), that x is in fact a real image. The two networks are made to contend with one another via training. The adversarial network evaluates the validity of the generator's output with the help of the discriminator; it is hoped, ultimately, that the generator's output will be plausibly realistic.
To assess the likelihood that a data sample was drawn from the training samples rather than from G, a discriminative model is used. The discriminative model D is optimised during training such that it has a high probability of accurately labelling both the original samples and the samples acquired from G. A sample x = G(z) is produced by feeding a noise sample z from the latent space into the generator G. The statistical distribution of the produced samples, $P_G(x)$, may take a rather intricate shape for neural networks. The probability distributions $P_G(x)$ and $P_{data}(x)$ are trained to be as similar as feasible while the generative model G is being developed. In order to trick the discriminative model D, the generative model G attempts to create fictitious data, while the discriminator D attempts to tell the difference between real and fictitious examples. Adversarial learning ensures that G's produced distribution becomes closer to the true statistical distribution over time.
To further understand how adversarial learning works, consider the following optimi-
zation issue:
$$G^{*} = \arg\min_{G} \mathrm{Div}\left(P_G(x), P_{data}(x)\right) \tag{4}$$

where Div represents how far apart $P_{data}(x)$ and $P_G(x)$ are from one another. Mathematically, the discriminator D solves

$$D^{*} = \arg\max_{D} V(G, D) \tag{5}$$

where the goal function V(G, D) is a two-player minimax game between models D and G:

$$V(G, D) = \mathbb{E}_{x \sim P_{data}}[\log D(x)] + \mathbb{E}_{z \sim P_z}[\log(1 - D(G(z)))] \tag{6}$$

This allows us to phrase the challenge of optimising the GAN's parameters as

$$G^{*} = \arg\min_{G} \max_{D} V(G, D) \tag{7}$$
Through iterative optimization, both G and D are continually updated. G's generated data will become more like the real data as the adversarial learning process progresses (Figs. 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15).
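For concreteness, one iteration of the minimax game of Eqs. (5)-(7) can be sketched as follows. This is a generic GAN update, not the paper's exact routine; the non-saturating generator loss is a common substitute for the second term of Eq. (6), and D is assumed to end in a sigmoid.

```python
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_g, opt_d, real, z):
    """One discriminator and one generator update of the minimax game."""
    # Discriminator: maximize log D(x) + log(1 - D(G(z))).
    fake = G(z).detach()
    d_real, d_fake = D(real), D(fake)
    d_loss = (F.binary_cross_entropy(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: non-saturating form, maximize log D(G(z)).
    d_fake = D(G(z))
    g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```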
To get accurate segmentation maps for the intended organs, the network is trained with both clean and adversarial features together after the adversarial features have been produced, instead of with clean features only. To further improve the generalization and resilience of the model, we train with multi-scale feature attacks. The suggested GAN-based augmentation training scheme's ultimate goal function is:
$$\arg\min_{f} \sum_{k = 0.1,\, 0.05,\, 0.025,\, 0.0125} L_{seg}\left(x, f_i^{noisy, S_k}, f_{1-i}, g^{*}\right) \tag{8}$$
By injecting the generated adversarial features into the segmentation network's training, the model learns to handle more complex features and
better adapts to diverse imaging variations. This augmentation training not only allows the
network to learn from clean features but also from adversarial perturbed features, improv-
ing its robustness and generalization capabilities. Consequently, this approach aids in cre-
ating more accurate segmentation maps for organs in thoracic CT scans by enhancing the
network’s ability to handle variations, nuances, and different anatomical structures within
the images.
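A hedged sketch of how Eq. (8)'s training over clean and multi-scale adversarial features might be wired together is given below; `make_adversarial`, `model.features`, and `model.head_from_features` are hypothetical helpers, since the paper does not detail the attack implementation or where the features are tapped. It reuses `seg_loss` and `inject_adversarial_moments` from the sketches above.

```python
# Hypothetical training step for Eq. (8), under the assumptions stated above.
SCALES = (0.1, 0.05, 0.025, 0.0125)

def training_step(model, x, g, opt):
    loss = seg_loss(model(x), g)                   # clean branch
    f_clean = model.features(x)                    # hypothetical feature tap
    for k in SCALES:
        f_adv = make_adversarial(model, x, g, k)   # hypothetical attack at scale S_k
        f_noisy = inject_adversarial_moments(f_clean, f_adv)
        loss = loss + seg_loss(model.head_from_features(f_noisy), g)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```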
4 Results and discussion

All experimental procedures are discussed in great detail here. The experimental setting and the design of the comparison experiments are described after the databases and data preparation have been introduced. The analysis of the experimental findings follows.
The five network components that make up the technique were each trained independently on identical hardware for up to 500 epochs. The learning rate was decreased by a factor of 10 (with 0.00001 being the minimum allowed) if the loss did not improve for 20 epochs. After a period of training at the minimum learning rate, learning is complete if the loss does not decrease further. Using this strategy, we were able to train every model in under 350 epochs. From a computational standpoint, the total time required to train a single DL network was close to 24 h.
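The described schedule maps directly onto a plateau-based learning-rate scheduler; a sketch assuming PyTorch, where `model`, `loader`, `train_one_epoch`, and the initial learning rate are assumptions not given in the text.

```python
import torch

def fit(model, loader, train_one_epoch, max_epochs=500):
    """Divide the LR by 10 when the loss has not improved for 20 epochs,
    with a floor of 1e-5, matching the schedule described above."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # initial LR assumed
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.1, patience=20, min_lr=1e-5)
    for epoch in range(max_epochs):
        epoch_loss = train_one_epoch(model, loader, optimizer)
        scheduler.step(epoch_loss)
```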
The suggested method's performance was measured using a variety of metrics: the Dice similarity coefficient (DSC), sensitivity, specificity, the residual mean square distance (RMSD), the mean surface distance (MSD), and the 95% Hausdorff distance (HD95). The degree to which the generated contours match the ground-truth outlines is measured by the DSC:
$$DSC = \frac{2 \, |X \cap Y|}{|X| + |Y|} \tag{9}$$

where X represents the ground-truth contours and Y represents the contours derived using the suggested approach. The degree of similarity between an object and the ground-truth volume may be determined using the sensitivity and specificity metrics:
$$\mathrm{Sensitivity} = \frac{|X \cap Y|}{|X|}, \qquad \mathrm{Specificity} = \frac{|\bar{X} \cap \bar{Y}|}{|\bar{X}|} \tag{10}$$

where $\bar{X}$ and $\bar{Y}$ are the regions outside the ground-truth and auto-segmented contours, respectively.
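Eqs. (9) and (10) translate directly into mask arithmetic; a NumPy sketch over boolean masks follows, with the complement-based specificity reflecting the reading above.

```python
import numpy as np

def dsc(x, y):
    """Eq. (9): Dice similarity between boolean masks x (truth) and y (prediction)."""
    return 2.0 * np.logical_and(x, y).sum() / (x.sum() + y.sum())

def sensitivity(x, y):
    """Eq. (10): fraction of the ground-truth volume recovered by the prediction."""
    return np.logical_and(x, y).sum() / x.sum()

def specificity(x, y):
    """Eq. (10): agreement outside the contours, computed on the complements."""
    return np.logical_and(~x, ~y).sum() / (~x).sum()
```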
The MSD is the average of the two directional mean surface distances.
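The MSD formula itself appears to have been lost to a page break; assuming the standard symmetric definition over the contour surfaces $\partial X$ and $\partial Y$, it would read:

$$\mathrm{MSD}(X, Y) = \frac{1}{2} \left( \frac{1}{|\partial X|} \sum_{x \in \partial X} d(x, \partial Y) + \frac{1}{|\partial Y|} \sum_{y \in \partial Y} d(y, \partial X) \right)$$

with $d(\cdot, \partial S)$ the shortest distance from a point to the surface $\partial S$.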
Table 2 Comparison of segmentation performance on five thoracic structures (mean ± standard deviation)

DSC
Method                Esophagus        Heart            Left Lung        Right Lung       Spinal Cord
U-Net                 0.71 ± 0.08      0.85 ± 0.05      0.97 ± 0.01      0.96 ± 0.01      0.83 ± 0.05
3DCNN                 0.69 ± 0.05      0.83 ± 0.01      0.87 ± 0.04      0.94 ± 0.03      0.80 ± 0.02
conditional GAN       0.67 ± 0.06      0.80 ± 0.01      0.84 ± 0.04      0.92 ± 0.03      0.78 ± 0.02
SLIC                  0.65 ± 0.06      0.78 ± 0.01      0.80 ± 0.03      0.88 ± 0.03      0.74 ± 0.02
U-Net-GAN             0.75 ± 0.08      0.87 ± 0.05      0.97 ± 0.01      0.97 ± 0.01      0.90 ± 0.04

Sensitivity
U-Net                 0.71 ± 0.09      0.94 ± 0.05      0.97 ± 0.02      0.96 ± 0.02      0.97 ± 0.01
3DCNN                 0.68 ± 0.03      0.89 ± 0.07      0.95 ± 0.02      0.93 ± 0.08      0.94 ± 0.10
conditional GAN       0.67 ± 0.02      0.86 ± 0.03      0.93 ± 0.10      0.91 ± 0.03      0.91 ± 0.05
SLIC                  0.64 ± 0.07      0.83 ± 0.10      0.89 ± 0.08      0.87 ± 0.07      0.87 ± 0.02
U-Net-GAN             0.73 ± 0.10      0.89 ± 0.07      0.97 ± 0.02      0.96 ± 0.02      0.93 ± 0.03

Specificity
U-Net [9]             0.9996 ± 0.0002  0.9958 ± 0.0029  0.9989 ± 0.0010  0.9992 ± 0.0007  0.9995 ± 0.0001
3DCNN                 0.9994 ± 0.0020  0.9953 ± 0.0007  0.9984 ± 0.0003  0.9990 ± 0.0003  0.9991 ± 0.0032
conditional GAN [25]  0.9995 ± 0.0007  0.9950 ± 0.0013  0.9982 ± 0.0020  0.9988 ± 0.0017  0.9989 ± 0.0020
SLIC [29]             0.9990 ± 0.0005  0.9948 ± 0.0023  0.9979 ± 0.0023  0.9986 ± 0.0020  0.9987 ± 0.0007
U-Net-GAN             0.9997 ± 0.0001  0.9977 ± 0.0020  0.9989 ± 0.0010  0.9992 ± 0.0007  0.9998 ± 0.0000

HD95 (mm)
U-Net                 4.91 ± 4.13      6.45 ± 4.03      2.07 ± 1.92      2.50 ± 3.33      1.98 ± 1.52
3DCNN                 5.02 ± 4.18      6.50 ± 4.15      2.13 ± 1.96      2.54 ± 3.38      2.04 ± 1.63
conditional GAN       5.09 ± 4.23      6.53 ± 4.19      2.19 ± 2.04      2.59 ± 3.40      2.08 ± 1.72
SLIC                  5.14 ± 4.28      6.60 ± 4.27      2.23 ± 2.12      2.62 ± 3.42      2.14 ± 1.76
U-Net-GAN             4.52 ± 3.81      4.58 ± 3.67      2.07 ± 1.93      2.50 ± 3.34      1.19 ± 0.46

MSD (mm)
U-Net                 1.09 ± 0.67      1.91 ± 0.95      0.61 ± 0.73      0.65 ± 0.53      0.54 ± 0.29
3DCNN                 1.14 ± 0.69      1.96 ± 0.89      0.68 ± 0.75      0.63 ± 0.50      0.57 ± 0.32
conditional GAN       1.17 ± 0.67      2.06 ± 0.92      0.73 ± 0.70      0.69 ± 0.54      0.59 ± 0.32
SLIC                  1.19 ± 0.66      2.12 ± 0.78      0.76 ± 0.77      0.72 ± 0.55      0.62 ± 0.29
U-Net-GAN             1.05 ± 0.66      1.49 ± 0.85      0.61 ± 0.73      0.65 ± 0.53      0.38 ± 0.27

RMSD (mm)
U-Net                 2.37 ± 1.40      3.68 ± 2.24      2.12 ± 2.32      2.66 ± 2.45      1.08 ± 1.32
3DCNN                 2.40 ± 1.34      3.70 ± 2.34      2.16 ± 2.62      2.74 ± 2.51      1.13 ± 1.42
conditional GAN       2.42 ± 1.43      3.72 ± 2.16      2.20 ± 2.14      2.81 ± 2.57      1.24 ± 1.50
SLIC                  2.43 ± 1.37      3.76 ± 2.64      2.23 ± 2.43      2.86 ± 2.64      1.27 ± 1.56
U-Net-GAN             2.24 ± 1.36      3.14 ± 2.19      2.12 ± 2.32      2.66 ± 2.46      0.82 ± 0.85
Table 3 Comparison of the proposed method with benchmark configurations (100–700) on five thoracic structures

DSC
Model       Esophagus        Left Lung        Right Lung       Heart            Spinal Cord
100         0.72 ± 0.10      0.97 ± 0.02      0.97 ± 0.02      0.93 ± 0.02      0.88 ± 0.037
200         0.64 ± 0.20      0.98 ± 0.01      0.97 ± 0.02      0.92 ± 0.02      0.89 ± 0.042
300         0.71 ± 0.12      0.98 ± 0.02      0.97 ± 0.02      0.91 ± 0.02      0.87 ± 0.110
400         0.64 ± 0.11      0.97 ± 0.01      0.97 ± 0.02      0.90 ± 0.03      0.88 ± 0.045
500         0.61 ± 0.11      0.96 ± 0.03      0.95 ± 0.05      0.92 ± 0.02      0.85 ± 0.035
600         0.58 ± 0.11      0.96 ± 0.01      0.96 ± 0.02      0.90 ± 0.02      0.87 ± 0.022
700         0.55 ± 0.20      0.95 ± 0.03      0.96 ± 0.02      0.85 ± 0.04      0.83 ± 0.080
Proposed    0.75 ± 0.08      0.97 ± 0.01      0.97 ± 0.01      0.87 ± 0.05      0.90 ± 0.04

MSD (mm)
100         2.23 ± 2.82      0.74 ± 0.31      1.08 ± 0.54      2.05 ± 0.62      0.73 ± 0.21
200         6.30 ± 9.08      0.61 ± 0.26      0.93 ± 0.53      2.42 ± 0.82      0.69 ± 0.25
300         2.08 ± 1.94      0.62 ± 0.35      0.91 ± 0.52      2.98 ± 0.93      0.76 ± 0.60
400         2.03 ± 1.94      0.79 ± 0.27      1.06 ± 0.63      3.00 ± 0.96      0.71 ± 0.25
500         2.48 ± 1.15      2.90 ± 6.94      2.70 ± 4.84      2.61 ± 0.69      1.03 ± 0.84
600         2.63 ± 1.03      1.16 ± 0.43      1.39 ± 0.61      3.15 ± 0.85      0.78 ± 0.14
700         13.10 ± 10.39    1.22 ± 0.61      1.13 ± 0.49      4.55 ± 1.59      2.10 ± 2.49
Proposed    1.05 ± 0.66      0.61 ± 0.73      0.65 ± 0.53      1.49 ± 0.85      0.38 ± 0.27

HD95 (mm)
100         7.3 ± 10.31      2.9 ± 1.32       4.7 ± 2.50       5.8 ± 1.98       2.0 ± 0.37
200         19.7 ± 25.90     2.2 ± 10.79      3.6 ± 2.30       7.1 ± 3.73       1.9 ± 0.49
300         7.8 ± 8.17       2.3 ± 1.30       3.7 ± 2.08       9.0 ± 4.29       2.0 ± 1.15
400         6.8 ± 3.93       3.0 ± 1.08       4.6 ± 3.45       9.9 ± 4.16       2.0 ± 0.62
500         8.0 ± 3.80       7.8 ± 19.13      14.5 ± 34.4      8.8 ± 5.31       2.3 ± 0.50
600         8.6 ± 3.82       4.5 ± 1.62       5.6 ± 3.16       9.2 ± 3.10       2.1 ± 0.35
700         37.0 ± 26.88     4.4 ± 3.41       4.1 ± 2.11       13.8 ± 5.49      8.1 ± 10.72
Proposed    4.52 ± 3.81      2.07 ± 1.93      2.50 ± 3.34      4.58 ± 3.67      1.19 ± 0.46
As Table 2 shows, U-Net-GAN achieves the highest DSC scores for the Esophagus (0.75), Heart (0.87), Left Lung (0.97), Right Lung (0.97), and Spinal Cord (0.90) compared to the other methods, showcasing its effectiveness in segmentation
accuracy. In terms of Sensitivity, U-Net-GAN also outperforms other methods for most
structures, demonstrating higher sensitivity values for Esophagus (0.73), Heart (0.89),
Left Lung (0.97), Right Lung (0.96), and Spinal Cord (0.93), indicating its ability to accu-
rately capture true positives. Specificity values are consistently high across all methods,
with U-Net-GAN showing slightly improved or comparable specificity scores for the struc-
tures compared to other techniques. Regarding geometric evaluation metrics, U-Net-GAN
demonstrates promising results with lower Hausdorff Distance at 95th percentile (HD95),
MSD, and RMSD values for various structures like Esophagus, Heart, Left Lung, Right
Lung, and Spinal Cord compared to the other methods. Notably, U-Net-GAN achieves
competitive values such as HD95 (ranging from 1.19 mm to 4.52 mm), MSD (ranging
from 0.38 mm to 1.49 mm), and RMSD (ranging from 0.82 mm to 3.14 mm), indicating
its efficiency in accurately delineating anatomical structures with minimal deviation from
ground truth data. Overall, the U-Net-GAN method demonstrates superior performance in
terms of segmentation accuracy, sensitivity, specificity, and geometric evaluation metrics,
making it a promising approach for accurate anatomical structure segmentation in medical
imaging applications.
Table 3 shows that boundary-regularized models are effective at compensating for the rarity of some organs in the sample by improving their segmentation performance. Figures 13 and 14 provide qualitative findings that confirm the beneficial effect of training the model to recognise organ boundaries: they give qualitative instances of how the boundary constraint has decreased the incidence of over- or under-segmented organs, demonstrating our model's capacity to learn better representations of numerous organs concurrently. The design proves that state-of-the-art performance can be obtained with less GPU RAM. The design also contributes significantly via the use of a new patching technique that replicates the structure of an organ, and via the smart fusion of findings amongst many deep neural networks.
The system's performance varied across anatomical structures, with the esophagus displaying the best results and the trachea exhibiting the poorest performance due to its low Hounsfield values and unique shape. Our models struggled with trachea segmentation, affecting outcomes in multi-organ scenarios where the proximity of the trachea and esophagus led to unintended impacts on each other's segmentation quality. Although attempts were made to enhance trachea segmentation by merging it with other organs, the underlying issue persisted, stemming from poor trachea segmentation accuracy in the multi-organ network.
Testing the system on GPUs with larger memory allowed experimentation with bigger
patches and batches, reproducing results but without significant enhancements. Despite
limitations, the patching technique appeared effective in addressing training issues with
incomplete CT data. Notably, even with moderate GPU resources, the system achieved
state-of-the-art outcomes through thorough pre-processing, strategic patching, multiple
neural networks, and consistent result fusion. However, there’s room for improvement by
augmenting the dataset size, utilizing GANs to tackle data scarcity, or integrating RRNs for
enhanced segmentation. Despite its strengths in superior segmentation metrics and robust
delineation of anatomical structures, the system faces challenges in organ-specific segmen-
tation and requires further improvements in data augmentation and network designs for
wider applicability.
5 Conclusion
In this study, the necessity of precise organ segmentation for gastrointestinal diagnostics
is acknowledged; yet existing deep learning segmentation approaches grapple with chal-
lenges like partial volume effects, image noise, and data imbalance. Our research focuses
on leveraging CT scan colorization to address these limitations and enhance segmentation
methods. Introducing the U-Net-GAN adversarial network architecture aimed at multi-
organ segmentation from thoracic CT images, we acknowledge the computational demands
inherent in our approach, which can surpass current hardware capacities. Nevertheless, this study sheds light on the potential of deep neural networks despite resource constraints.
Although our technique demands extensive training time and significant GPU memory,
our findings emphasize that commendable results can still be achieved with more mod-
est hardware configurations. Notably, our innovative preprocessing methodologies, meticu-
lous model architecture, and ensemble training of multiple models on identical datasets
enabled efficient multi-organ separation. Yet, this research has limitations; it confronts the
challenge of prolonged training periods and high GPU memory requirements, posing con-
straints on practical implementation. While our approach shows promise in competing with
recent techniques in publicly accessible competitions, the study’s scope primarily focuses
on thoracic CT images and needs expansion to encompass a broader spectrum of medical
imaging. Future efforts will concentrate on enriching datasets, employing GANs to miti-
gate data scarcity, and enhancing segmentation precision using RRNs. This emphasizes the
urgency for broader acceptance and thorough investigation of deep neural networks for the
interpretation of medical images. Additionally, forthcoming work aims to augment trans-
parency in experiment execution by presenting segmented results and integrating training
along with validation curves.
Acknowledgements There is no acknowledgement involved in this work.
Data availability Data sharing not applicable to this article as no datasets were generated or analysed during
the current study.
Declarations
Ethics approval and consent to participate No participation of humans takes place in this implementation
process.
Human and animal rights No violation of Human and Animal Rights is involved.
References
1. Murugesan GK, McCrumb D, Brunner E, Kumar J, Soni R, Grigorash V, Chang A, Peck A, VanOss J,
Moore S (2023) Automatic abdominal multi organ segmentation using residual UNet. bioRxiv
2. Lei Y, Dong X, Tian S, Wang T, Patel PR, Curran WJ, Jani AB, Liu T, Yang X (2020) Multi-organ
segmentation in pelvic CT images with CT-based synthetic MRI. Medical Imaging: Biomedical Appli-
cations in Molecular, Structural, and Functional Imaging
3. Kim H, Jung J, Kim J, Cho B, Kwak J, Jang JY, Lee S, Lee J, Yoon SM (2019) Abdominal multi-organ
auto-segmentation using 3D-patch-based deep convolutional neural network. Scientific Reports, 10
4. Lewis S, Inglis SD, Doyle S (2023) The role of anatomical context in soft-tissue multi-organ segmen-
tation of cadaveric non-contrast enhanced whole body CT. Medical physics
5. Segre L, Hirschorn O, Ginzburg D, Raviv D (2022) Shape-consistent generative adversarial networks
for multi-modal medical segmentation maps. 2022 IEEE 19th International Symposium on Biomedical
Imaging (ISBI), 1–5
6. Kuang H, Menon BK, Qiu W (2020) Automated stroke lesion segmentation in non-contrast CT scans
using dense multi-path contextual generative adversarial network. Physics in Medicine & Biology, 65
7. Peng Z, Fang X, Yan P, Shan H, Liu T, Pei X, Wang G, Liu B, Kalra MK, Xu XG (2019) A method of
rapid quantification of patient-specific organ dose for CT using coupled deep multi-organ segmentation
algorithms and GPU-accelerated Monte Carlo Dose computing code
8. Kan CN, Gilat-Schmidt T, Ye D (2021) Enhancing reproductive organ segmentation in pediatric CT
via adversarial learning. Medical Imaging
9. Huang J, Li X, Wang J, Yu X, Zhu L, Zhan Y, Gao Y, Huang C (2021) Cross-Dataset Multiple
Organ Segmentation From CT Imagery Using FBP-Derived Domain Adaptation. IEEE Access
9:25025–25035
10. Wang S, Zhang X, Hui H, Li F, Wu Z (2022) Multimodal CT image synthesis using unsupervised deep
generative adversarial networks for stroke lesion segmentation. Electronics
11. Yao H, Wan W, Li X (2022) A deep adversarial model for segmentation-assisted COVID-19 diagnosis
using CT images. EURASIP Journal on Advances in Signal Processing, 2022
12. Liu Y, Fu W, Selvakumaran V, Phelan M, Segars WP, Samei E, Mazurowski MA, Lo JY, Rubin GD,
Henao R (2019) Deep learning of 3D computed tomography (CT) images for organ segmentation
using 2D multi-channel SegNet model. Medical Imaging
13. Chen S, Zhong X, Hu S, Dorn S, Kachelriess M, Lell MM, Maier AK (2018) Automatic multi-organ
segmentation in dual energy CT using 3D fully convolutional network
14. Sherwani MK, Marzullo A, De Momi E, Calimeri F (2022) Lesion segmentation in lung CT scans
using unsupervised adversarial learning. Med Biol Eng Compu 60:3203–3215
15. Dinh TL, Kwon SG, Lee SH, Kwon KR (2021) Breast tumor cell nuclei segmentation in histopathology images using EfficientUnet++ and multi-organ transfer learning
16. Raju A, Cheng C, Huo Y, Cai J, Huang J, Xiao J, Lu L, Liao C, Harrison AP (2020) Co-Heterogeneous
and Adaptive Segmentation from Multi-Source and Multi-Phase CT Imaging Data: A Study on Patho-
logical Liver and Lesion Segmentation. ArXiv, abs/2005.13201
17. Elskhawy A, Lisowska A, Keicher M, Henry J, Thomson P, Navab N (2020) Continual Class Incre-
mental Learning for CT Thoracic Segmentation. DART/DCL@MICCAI
18. Hayashi Y, Shen C, Roth HR, Oda M, Misawa K, Jinzaki M, Hashimoto M, Kumamaru KK, Aoki S,
Mori K (2020) Usefulness of fine-tuning for deep learning based multi-organ regions segmentation
method from non-contrast CT volumes using small training dataset. Medical Imaging
19. Chandra V, Fan W, Chen Y, Luó X (2022) Residual u-structure nested conditional adversarial nets
colorized CT Improves deep learning based abdominal multi-organ segmentation. 2022 IEEE Interna-
tional Conference on Image Processing (ICIP), 2061–2065
20. Cros S, Vorontsov E, Kadoury S (2021) Managing class imbalance in multi-organ CT segmentation
in head and neck cancer patients. 2021 IEEE 18th International Symposium on Biomedical Imaging
(ISBI), 1360–1364
21. Ogrean V, Brad R (2022) Multi-organ segmentation using a low-resource architecture. Inf 13:472
22. Lei Y, Liu Y, Dong X, Tian S, Wang T, Jiang X, Higgins K, Beitler JJ, Yu DS, Liu T, Curran WJ, Fang
Y, Yang X (2019) Automatic multi-organ segmentation in thorax CT images using U-Net-GAN. Medi-
cal Imaging
23. Fu Y, Lei Y, Wang T, Tian S, Patel PR, Jani AB, Curran WJ, Liu T, Yang X (2021) Daily cone-beam
CT multi-organ segmentation for prostate adaptive radiotherapy. Medical Imaging
24. Shen C, Hayashi Y, Oda M, Misawa K, Mori K (2021) Unpaired medical image translation between
portal-venous phase and non-contrast CT volumes for multi-organ segmentation. Other Conferences
25. Mahmood F, Borders D, Chen RJ, McKay GN, Salimian KJ, Baras AS, Durr N (2018) Deep Adversar-
ial Training for Multi-Organ Nuclei Segmentation in Histopathology Images. IEEE Trans Med Imag-
ing 39:3257–3267
26. Dai X, Lei Y, Janopaul-Naylor JR, Wang T, Roper JR, Zhou J, Curran WJ, Liu T, Patel PR, Yang X
(2021) Synthetic CT-based multi-organ segmentation in cone beam CT for adaptive pancreatic radio-
therapy. Medical Imaging
27. Li M, Lian F, Guo S (2021) Multi-scale selection and multi-channel fusion model for pancreas seg-
mentation using adversarial deep convolutional nets. J Digit Imaging 35:47–55
28. Katsuma D, Kawanaka H, Prasath V, Aronow BJ (2022) Data augmentation using generative adver-
sarial networks for multi-class segmentation of lung confocal IF images. J Adv Comput Intell Intell
Inform 26:138–146
29. Wu J, Li G, Lu H, Kamiy T (2021) A supervoxel classification based method for multi-organ segmen-
tation from abdominal CT images. J Image Graph 9:9–14
30. Shen C, Roth HR, Nath V, Hayashi Y, Oda M, Misawa K, Mori K (2022) Effective hyperparameter
optimization with proxy data for multi-organ segmentation. Medical Imaging
31. Chiranjeevi P, Rajaram A (2023) A lightweight deep learning model based recommender system by
sentiment analysis. J Intell Fuzzy Syst, (Preprint), 1–14
32. Zhang R, Li L, Yin L, Liu J, Xu W, Cao W, Chen Z (2022) Chinese named entity recognition method combining ALBERT and a local adversarial training and adding attention mechanism. Intl J Semantic Web Inform Syst (IJSWIS) 18(1):20
33. Wang T, Pan Z, Hu G, Duan Y, Pan Y (2022) Understanding universal adversarial attack and defense on graph. Intl J Semantic Web Inform Syst 18(1):1–21
34. Ling Z, Hao ZJ (2022) An intrusion detection system based on normalized mutual information anti-
bodies feature selection and adaptive quantum artificial immune system. Intl J Semantic Web Inform
Syst, 18(1), 1–25
35. Chopra M, Singh SK, Sharma A, Gill SS (2022) A comparative study of generative adversarial networks for text-to-image synthesis. Intl J Softw Sci Comput Intell 14(1):1–12
36. Pan X, Yamaguchi S, Kageyama T, Kamilin MH (2022) Machine-learning-based white-hat worm
launcher in botnet defense system. Intl J Softw Sci Comput Intell (IJSSCI) 14(1):1–14
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under
a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of such publishing agreement and applicable
law.