
Journal of King Saud University – Computer and Information Sciences 34 (2022) 6977–6988


The effect of loss function on conditional generative adversarial networks

Alaa Abu-Srhan a, Mohammad A.M. Abushariah b, Omar S. Al-Kadi b,*

a Department of Basic Science, Faculty of Science, The Hashemite University, Zarqa 13133, Jordan
b King Abdullah II School of Information Technology, The University of Jordan, Amman 11942, Jordan

* Corresponding author. E-mail address: [email protected] (O.S. Al-Kadi).
https://doi.org/10.1016/j.jksuci.2022.02.018

ARTICLE INFO

Article history:
Received 4 August 2021
Revised 2 February 2022
Accepted 16 February 2022
Available online 4 March 2022

Keywords:
Generative adversarial network
Conditional generative adversarial network
Pixel2Pixel
Loss functions

ABSTRACT

Conditional Generative Adversarial Network (cGAN) is a general-purpose approach for many image-to-image translation tasks, which aims to translate images from one form to another, resulting in high-quality translated images. In this paper, the loss function of the cGAN model is modified by combining the adversarial loss of state-of-the-art Generative Adversarial Network (GAN) models with a new combination of non-adversarial loss functions to enhance model performance and generate more realistic images. Specifically, the effects of the Wasserstein GAN (WGAN), the WGAN with Gradient Penalty (WGAN-GP), and least Squared GAN (lsGAN) adversarial loss functions are explored. Several comparisons are performed to select an optimized combination of L1 with structure, gradient, content-based, Kullback-Leibler divergence, and softmax non-adversarial loss functions. For experimentation purposes, the Facades dataset is used for the image-to-image translation task. Peak-signal-to-noise-ratio (PSNR), Structural Similarity Index (SSIM), Universal Quality Index (UQI), and Visual Information Fidelity (VIF) are used to quantitatively evaluate the translated images. Based on our experimental results, the best combination of loss functions for image-to-image translation on the facade dataset is the WGAN adversarial loss with the L1 and content non-adversarial loss functions. The model generates fine-structure images and captures both high and low frequency details of the translated images. Image in-painting and lesion segmentation are investigated to demonstrate the practicality of the proposed work.

© 2022 The Authors. Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Image-to-image translation is a class of computer graphics, computer vision, and image processing problems that aims to map an image from the source domain to the target domain (Tang et al., 2021). In fact, many image-to-image translation problems are difficult to handle, since a single input image could be mapped to many possible outputs, thus causing inconsistencies in image feature distributions or misrepresentation in the latent space.

GAN is a powerful model with a wide range of interesting applications that are prevalent in many domains. Many essential applications have been published in the literature. The essential applications include image generation, in which the model can generate training data for machine learning models if the data is small or expensive to collect (Waheed et al., 2020). GAN can be used to generate 2D and 3D objects, faces, anime characters, and music. Image translation (Emami et al., 2020), super-resolution, classification and recognition (Liu et al., 2019), image segmentation (Andreini et al., 2020), image in-painting (Liu et al., 2021), and many other GAN applications are available. GAN can be used in medical fields, where it produces impressive results in tumor detection (Han et al., 2021), drug discovery (Padalkar et al., 2021), and other areas.

Image-to-image translation tasks have been extensively investigated with the GAN model as a promising deep learning method for this kind of problem. The GAN model consists of two deep neural networks, namely a generator and a discriminator. It was originally intended to be used for image generation, and was later modified to be used for image-to-image translation. For image-to-image translation, the GAN model does not generate images from a random noise vector; the input is an image that needs to be mapped to another image domain (Goel et al., 2021). Many GAN flavors have been developed to enhance the performance of image-to-image translation, including the conditional generative adversarial network (cGAN) model, which is considered to be a general solution for image-to-image translation tasks, since it can be used in a wide range of image domains and is considered to be the beginning of using GAN for image-to-image translation (Tang et al., 2020).
The definition of the loss function is a very critical aspect in the design of GAN models. The loss function used by GAN is called an adversarial loss function, which calculates the distance between the distribution of the generated data and the distribution of the actual data. Any GAN model has two loss functions, one to train the generator network and the other to train the discriminator network. These two loss functions work together to form an adversarial loss function (Tzeng et al., 2017). It is beneficial to combine the GAN adversarial loss function of the generator network with traditional loss functions, as in the pixel-to-pixel (Pix2Pix) model (Isola et al., 2017), an interesting extension of the cGAN architecture in which the loss function has been modified by adding an L1 loss function to produce more powerful results. It should be noted that both the cGAN and the Pix2Pix models need to be trained using paired images, where a mapping between the input and the target images exists.

The research done in the field of GAN can be divided into two categories: architecture improvements and loss function improvements. Many architecture variants and loss variants of GAN have been introduced as a result of this research. The architecture variants aim to improve performance by focusing on vanishing gradients, image quality, and mode diversity, while the loss variants aim to improve performance based on mode collapse, vanishing gradients, and image quality.

One of the most common failure modes in GAN is mode collapse. Pix2Pix and cGAN models, as loss variants of GAN, are less vulnerable to the mode collapse problem, which can be minimized by combining multiple loss functions. Furthermore, using conditional generative models that explicitly maximize likelihood, such as cGAN, could avoid the mode collapse problem.

In this paper, we focus on the adversarial loss functions used to train the cGAN to improve its performance in terms of the quality of the generated images. The adversarial loss function of the cGAN model is replaced based on a comparison of a set of state-of-the-art adversarial loss functions. In addition, we implemented different effective loss functions that are used in the literature and combined them with the cGAN adversarial loss function.

The main contributions of this research are as follows:

1. The effect of the loss function on the cGAN model has been investigated.
2. Both high and low frequency components of the output images are captured by enhancing the adversarial loss function with an optimized combination of non-adversarial loss functions.
3. Performance is quantitatively evaluated by four different well-known image quality assessment metrics.
4. The proposed approach has been experimented with image-to-image translation on the facade dataset, and applied to image segmentation and image in-painting tasks.

Table 1 describes the abbreviations used throughout the paper.

Table 1
List of abbreviations.

Abbreviation  Meaning
cGAN          Conditional Generative Adversarial Network
GAN           Generative Adversarial Network
WGAN          Wasserstein Generative Adversarial Network
WGAN-GP       Wasserstein GAN with Gradient Penalty
lsGAN         least Squared GAN
PSNR          Peak-signal-to-noise-ratio
SSIM          Structural Similarity Index
UQI           Universal Quality Index
VIF           Visual Information Fidelity
Pix2Pix       pixel-to-pixel
PAN           Perceptual Adversarial Network
ID-CGAN       Image De-raining cGAN
ReLU          Rectified Linear Unit
GDL           Gradient Difference Loss Function
KLD           Kullback-Leibler Divergence
FR-IQA        Full-Reference Image Quality Assessment
MSE           Mean Square Error
NSS           Natural Scene Statistics
HVS           Human Visual System
Std           Standard deviation
MR            Magnetic Resonance

The rest of the paper is organized as follows: Section 2 presents a review of the literature and the contributions to the cGAN and Pix2Pix models. Section 3 provides an in-depth description of the methods and the loss functions used in our work. Section 4 contains the experimental results and discussion. We finally summarize the main findings and conclusions in Section 5.

2. Related works

Goodfellow proposed the GAN framework in 2014 (Goodfellow et al., 2014), which consists of two deep neural networks, namely a generator and a discriminator. The generator is used to generate images, whereas the function of the discriminator is to distinguish between real images and the generated images. Fig. 1 shows the architecture of GAN: the generator maps random noise to a fake image, and the discriminator classifies samples drawn from the training set or produced by the generator as real or fake.

Fig. 1. A typical GAN architecture.

The original GAN starts with generating new images from noise data. However, the idea then expanded to solve the problem of image-to-image translation with more promising results. Many works were done to enhance the performance of GAN, either by enhancing the GAN architecture or by modifying the loss function used to train the model, which led to an enhanced training process.

Mirza et al. proposed a novel model named cGAN, which is considered an extension of the original GAN, but with a modification to the input of the discriminator: the input image of the generator is also fed to the discriminator. They applied the model to MNIST digits and obtained results that outperform other existing models, which is considered a novel training method for the GAN. Fig. 2 shows the cGAN architecture (Tang et al., 2020).

Fig. 2. cGAN architecture.

Later on, Isola et al. (2017) extended the work of cGAN by modifying the loss function. This extension was able to improve the results of image-to-image translation, where the loss function of the generator network was modified by adding L1 to its adversarial loss. The proposed model was very effective in image synthesis, achieved reasonable results, was considered widely applicable in many applications, and was easy to adopt.

Several works have been done to modify the GAN architecture to improve the quality of the generated images. For instance, Wang et al. (2018) proposed the Perceptual Adversarial Network (PAN), where the pixel loss of the Pix2Pix model was replaced with a feature matching loss. In addition, Johnson et al. (2016) replaced the pixel loss with a perceptual loss; the experiments were carried out on a single-image super-resolution task and obtained visually pleasing results. Zhang et al. (2017) improved the cGAN framework by reducing artifacts introduced by GAN and ensuring better visual quality through the use of a newly refined loss function. Based on this cGAN modification, they introduced the Image De-raining cGAN (ID-CGAN) method. The proposed model outperforms many image synthesis models in terms of performance.

Chrysos et al. (2018) focused on making cGAN more robust to noise by producing the RoCGAN model. They augmented the generator with an unsupervised pathway, which forces the generator outputs to span the target manifold even when noise exists. RoCGAN produced excellent results in a variety of domains. Liu et al. (2021) proposed the SCCGAN model based on cGAN for a style and character inpainting task.

Most previous works lack proper synthesis of details inside the image. This is especially evident with conditional image synthesis, which still remains a challenge and is usually hindered by the problem of capturing many of the important distinctive characteristics between input and output images, where using L1 or L2 will capture low frequency details but not high frequency details. Previous works have primarily concentrated on capturing either low frequency or high frequency details.
Although conditional distributions are multi-modal in practice, most cGAN approaches learn an overly simplified distribution in which an input is always mapped to a single output. More investigation is required to implement this rich model so that it performs as expected on multi-modal datasets. Another limitation of previous models is mode collapse, which must be addressed in order to improve output accuracy.

3. Methods

To explore the generality of the cGAN model, we expand the work conducted by Isola et al. (2017), where the L1 loss function was added to the cGAN adversarial loss for the generator network. We replaced the cGAN adversarial loss with the adversarial loss function used by one of the state-of-the-art GAN models, namely WGAN, WGAN-GP, and lsGAN. We then used various traditional loss functions and added them to the adversarial loss for the generator network, such as the L1, structure, gradient, KLD, content, and softmax loss functions. These models are evaluated on the "facades" dataset, trained for 200 epochs. Peak-signal-to-noise-ratio (PSNR), Structural Similarity Index (SSIM), Universal Quality Index (UQI), and Visual Information Fidelity (VIF) are used to evaluate the results and decide the best combination of adversarial loss and traditional loss functions that is suited to the cGAN architecture. Algorithm 1 presents the loss function selection methodology; a sketch of this selection loop is given below.

Algorithm 1: Loss function selection methodology.
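The original algorithm listing is not reproduced here. As a rough reconstruction of the selection procedure described in the text (and only that), the loop below sketches the idea in Python; train_cgan() and evaluate() are hypothetical helpers, not part of the paper.

```python
# Hypothetical sketch of the loss-selection loop of Algorithm 1 (assumed structure).
ADVERSARIAL = ["cGAN", "WGAN", "WGAN-GP", "lsGAN"]
NON_ADVERSARIAL = ["structure", "gradient", "KLD", "content", "softmax"]
METRICS = ["PSNR", "SSIM", "UQI", "VIF"]

def select_loss_combination(dataset):
    results = {}
    # Step 1: compare adversarial losses, with and without the L1 term.
    for adv in ADVERSARIAL:
        for extra in ([], ["L1"]):
            model = train_cgan(dataset, adv_loss=adv, extra_losses=extra, epochs=200)
            results[(adv, "+".join(extra) or "-")] = evaluate(model, dataset, METRICS)
    # Step 2: add one non-adversarial loss at a time to the best adversarial + L1 setups.
    for adv in ("WGAN", "lsGAN"):  # best performers from step 1
        for extra in NON_ADVERSARIAL:
            model = train_cgan(dataset, adv_loss=adv, extra_losses=["L1", extra], epochs=200)
            results[(adv, "L1+" + extra)] = evaluate(model, dataset, METRICS)
    # Keep the combination with the best aggregate metric scores.
    return max(results, key=lambda k: sum(results[k].values()))
```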
3.1. Network architecture

We adopted the architecture of the cGAN model from Isola et al. (2017), where the generator and the discriminator networks use convolution-BatchNorm-ReLU (U-Net with batch normalization) modules. The Rectified Linear Unit (ReLU) activation function is used in all generator layers except the output layer, which uses the Tanh function. Batch normalization is used to help the generative model generate a better result from the underlying data distribution (Yu et al., 2017). For the discriminator, the PatchGAN architecture is used, where the discriminator works on each patch of an image, classifying it as real or fake. This is done convolutionally for all the patches, and the final decision is made by averaging all responses to decide the final output of the discriminator. A rough sketch of these building blocks is shown below.
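As an illustration only (not the authors' code), the following PyTorch-style sketch shows a convolution-BatchNorm-ReLU block and a PatchGAN-style discriminator whose patch responses are averaged in the loss; all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, down=True):
    # Convolution-BatchNorm-ReLU module used in both networks (sizes assumed).
    conv = (nn.Conv2d(in_ch, out_ch, 4, stride=2, padding=1) if down
            else nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1))
    return nn.Sequential(conv, nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

class PatchDiscriminator(nn.Module):
    # PatchGAN: outputs a grid of real/fake scores, one per image patch.
    def __init__(self, in_ch=6):  # input image and target/generated image concatenated
        super().__init__()
        self.net = nn.Sequential(
            conv_block(in_ch, 64),
            conv_block(64, 128),
            conv_block(128, 256),
            nn.Conv2d(256, 1, 4, stride=1, padding=1),  # patch-level logits
        )

    def forward(self, x, y):
        # The per-patch logits are averaged inside the loss to form the final decision.
        return self.net(torch.cat([x, y], dim=1))
```

The U-Net generator follows the same conv_block pattern with skip connections and a Tanh output layer, and is omitted here for brevity.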

3.2. Loss functions

One of the most important parts of GAN models is the choice of an appropriate loss function. This section presents a description of the loss functions used during the experiments, where we employed state-of-the-art adversarial loss functions with the cGAN model to improve its performance.

1. L1 Loss Function
The L1 loss function is the absolute distance between two images: the generated image and the ground truth image. The idea behind the L1 loss function is to minimize the absolute difference between the generated and ground truth images (Ma et al., 2020). L_{L1} is described in Eq. 1.

L_{L1}(G) = E_{x,y,z}[ \| y - G(x,z) \|_1 ]    (1)

Where x is the input image, y is the ground truth image, z is the random noise, and G(x,z) is the generated image.
The main goal of using L1 is to capture the low frequency details. Thus, using the L1 loss function will enforce low-frequency correctness. It is effective to combine this type of loss function with another loss function to improve the quality of the generated images (Isola et al., 2017).

2. The Structural Similarity Index (SSIM)
SSIM is a perception-based model that has been widely used to evaluate image processing algorithms and as a loss function for many image processing applications (Abobakr et al., 2019; Setiadi, 2021). The SSIM loss function L_{structure} is defined in Eq. 2.

L_{structure}(p_1, p_2) = \frac{1}{N} \sum_{p_1, p_2} [ 1 - SSIM(p_1, p_2) ]    (2)

Where SSIM(p_1, p_2) is the SSIM for pixels p_1 and p_2 and can be defined as in Eq. 3.

SSIM(p_1, p_2) = \frac{(2\mu_{p_1}\mu_{p_2} + c_1)(2\sigma_{p_1 p_2} + c_2)}{(\mu_{p_1}^2 + \mu_{p_2}^2 + c_1)(\sigma_{p_1}^2 + \sigma_{p_2}^2 + c_2)}    (3)

Where \mu_{p_1} is the average of p_1, \mu_{p_2} is the average of p_2, \sigma_{p_1}^2 is the variance of p_1, \sigma_{p_2}^2 is the variance of p_2, and \sigma_{p_1 p_2} is the covariance of p_1 and p_2.

3. Gradient Difference Loss Function (GDL)
The GDL penalizes differences in the image gradient prediction in order to sharpen the images (Hognon et al., 2020). In addition, it can be used for texture matching and robust features. This type of loss function is used to overcome the blurry-output-image problem (Bhattacharjee and Das, 2018).
The GDL loss function between the generated image G(X) and the ground truth image Y is defined in Eq. 4 (Hognon et al., 2020).

L_{gradient}(G(X), Y) = \sum_{i,j} \big| |Y_{i,j} - Y_{i-1,j}| - |G(X)_{i,j} - G(X)_{i-1,j}| \big| + \big| |Y_{i,j} - Y_{i,j-1}| - |G(X)_{i,j} - G(X)_{i,j-1}| \big|    (4)

Where the GDL loss function computes the average gradient difference between the generated and ground truth images. |Y_{i,j} - Y_{i-1,j}| and |Y_{i,j} - Y_{i,j-1}| are the components of the temporal difference loss.

4. Kullback-Leibler Divergence (KLD)
The KLD (relative entropy) is a difference between two probability distributions (a distribution-wise measure). It is therefore necessary to transform the image into a probability distribution in order to apply this type of loss function to the image generation task. In simple terms, if the result of the KLD between two distributions is 0, the two distributions are identical; otherwise, there are some differences between the two distributions. It is related to maximum likelihood estimation, which is easy to optimize and is becoming popular in many applications such as applied statistics, fluid mechanics, and machine learning (Bellemare et al., 2017). The definition of KLD as a loss function is shown in Eq. 5.

L_{KLD} = Y_{true} \cdot \log( Y_{true} / Y_{pred} )    (5)

Where Y_{true} is the ground truth image, and Y_{pred} is the generated image.

5. The Content Loss
The content loss (reconstruction loss) was proposed by Gatys et al. (2016) and can be used with an adversarial loss to form a perceptual loss function. It is a feature-domain element-wise loss that is computed from a pre-trained network such as VGG, where VGG is a network that has been pre-trained on the ImageNet dataset. For image generation, the content loss works with the content representations of the source and the generated images in order to minimize the difference between them. If we have the source image p and the generated image x for a given layer l, then the content loss is defined as in Eq. 6.

L_{content}(p, x, l) = \frac{1}{2} \sum_{i,j} ( F^{l}_{i,j} - P^{l}_{i,j} )^2    (6)

where P^{l}_{i,j} and F^{l}_{i,j} are the content representations of the p and x images in layer l, respectively.
The content loss is also known as the reconstruction loss, which offers the training stability required for convergence. It therefore leads to powerful results.

6. Softmax Cross-entropy Loss Function
The softmax cross-entropy loss function is based on the softmax, a soft version of the max function that takes an N-dimensional real-valued vector and converts it to a vector in the range (0, 1). The output of this function is a probability distribution, which makes it suitable for use in many classification tasks and deep learning applications. Eq. 7 shows the log-softmax loss function for the generated image x and the real image y (Lin, 2017).

L_{softmax}(x, y) = y - \log \sum_{i} e^{x_i}    (7)

As shown in Eq. 7, the log-softmax loss function can be considered an exponential loss function.

7. Adversarial Loss Function
Below is the description of the adversarial loss functions used in the cGAN, lsGAN, WGAN, and WGAN-GP models:

(a) Adversarial and Conditional Loss
The formulation of the conditional loss for the generative adversarial network is defined in Eq. 8 (Isola et al., 2017):

L_{cGAN}(G, D) = E_{x,y}[ \log D(x, y) ] + E_{x,z}[ \log(1 - D(x, G(x, z))) ]    (8)

where G is the generator, D is the discriminator, x is the input image, y is the ground truth image, and z is the random noise vector.
The cGAN loss function differs from the original GAN adversarial loss in that the discriminator observes the input image in the cGAN, but not in the original GAN. The formulation of the original GAN adversarial loss function is shown in Eq. 9:

L_{GAN}(G, D) = E_{y}[ \log D(y) ] + E_{x,z}[ \log(1 - D(G(x, z))) ]    (9)

where \log(D(x)) denotes the likelihood that the discriminator correctly classifies the real image. Maximizing \log(1 - D(G(z))) would help it correctly label the fake image that comes from the generator.
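For illustration, Eq. 8 corresponds to the usual binary cross-entropy formulation over the conditional discriminator's patch outputs. The following is a sketch of the common Pix2Pix-style implementation, not the authors' exact code.

```python
import torch
import torch.nn.functional as F

def cgan_d_loss(D, x, y_real, y_fake):
    # Discriminator sees the input image x paired with a real or generated image (Eq. 8).
    real_logits = D(x, y_real)
    fake_logits = D(x, y_fake.detach())
    loss_real = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
    loss_fake = F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    return loss_real + loss_fake

def cgan_g_loss(D, x, y_fake):
    # Generator tries to get its output labelled as real by the discriminator.
    fake_logits = D(x, y_fake)
    return F.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits))
```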
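Before turning to the WGAN, WGAN-GP, and lsGAN variants, the non-adversarial terms of Eqs. 1, 4, and 5 can be sketched directly; the SSIM-based term of Eq. 2 and the VGG content term of Eq. 6 are usually taken from existing implementations and are only indicated in comments. Reductions and the image normalization used for the KLD are assumptions, not the paper's specification.

```python
import torch

def l1_loss(generated, target):
    # Eq. 1: mean absolute difference between generated and ground-truth images.
    return torch.mean(torch.abs(target - generated))

def gradient_difference_loss(generated, target):
    # Eq. 4: differences of horizontal and vertical image gradients; shape (B, C, H, W).
    dy_t = torch.abs(target[:, :, 1:, :] - target[:, :, :-1, :])
    dy_g = torch.abs(generated[:, :, 1:, :] - generated[:, :, :-1, :])
    dx_t = torch.abs(target[:, :, :, 1:] - target[:, :, :, :-1])
    dx_g = torch.abs(generated[:, :, :, 1:] - generated[:, :, :, :-1])
    return torch.mean(torch.abs(dy_t - dy_g)) + torch.mean(torch.abs(dx_t - dx_g))

def kld_loss(generated, target, eps=1e-8):
    # Eq. 5: each (non-negative) image is first normalized into a probability distribution.
    p = target.flatten(1) / (target.flatten(1).sum(dim=1, keepdim=True) + eps)
    q = generated.flatten(1) / (generated.flatten(1).sum(dim=1, keepdim=True) + eps)
    return torch.sum(p * torch.log((p + eps) / (q + eps)), dim=1).mean()

# The SSIM loss of Eq. 2 (1 - SSIM) and the VGG-feature content loss of Eq. 6 are
# typically built from library implementations (an SSIM package and a frozen,
# pre-trained VGG feature extractor) and are omitted here.
```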
(b) The Wasserstein GAN (WGAN)
The WGAN model uses the Wasserstein distance, which calculates the difference between the generated and target distributions, instead of the loss function used by the original GAN model. The WGAN model is easy to train compared to the original GAN model and has achieved impressive results (Alotaibi, 2020). Eq. 10 and Eq. 11 show the adversarial loss used by WGAN, which includes the generator and the discriminator loss functions, respectively.

L_{Wasserstein_G} = \frac{1}{m} \sum_{i=1}^{m} f( G(z^{(i)}) )    (10)

L_{Wasserstein_D} = \frac{1}{m} \sum_{i=1}^{m} [ f( x^{(i)} ) - f( G(z^{(i)}) ) ]    (11)

where m is the number of pixels in the image, x is the input image, and z is the random noise.

(c) The WGAN with Gradient Penalty (WGAN-GP)
The WGAN-GP model (Gulrajani et al., 2017) is an extension of the WGAN model that overcomes its drawbacks: the WGAN sometimes fails to converge and can generate low-quality images. WGAN-GP uses a gradient penalty instead of weight clipping, leading to performance improvements over the WGAN model and high-quality image generation.
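For illustration, the following sketch shows a conditional Wasserstein critic loss with a gradient penalty term (Gulrajani et al., 2017); the penalty weight and the interpolation scheme follow the common formulation and are assumptions here, and the signs are flipped for minimization.

```python
import torch

def wgan_gp_d_loss(D, x, y_real, y_fake, gp_weight=10.0):
    # Wasserstein critic loss (Eq. 11, negated for minimization) plus gradient penalty.
    loss = D(x, y_fake.detach()).mean() - D(x, y_real).mean()
    eps = torch.rand(y_real.size(0), 1, 1, 1, device=y_real.device)
    interp = (eps * y_real + (1 - eps) * y_fake.detach()).requires_grad_(True)
    grad = torch.autograd.grad(D(x, interp).sum(), interp, create_graph=True)[0]
    penalty = ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
    return loss + gp_weight * penalty

def wgan_g_loss(D, x, y_fake):
    # Generator maximizes the critic score of its output (Eq. 10, negated for minimization).
    return -D(x, y_fake).mean()
```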
(d) least Squared GAN (lsGAN)
The lsGAN model (Mao et al., 2017) uses the least-squares distance (ls) as an adversarial loss function for both the generator and the discriminator networks, where ls (or L2) is the average squared difference between the predicted and the ground truth images. The results of this loss function are always positive, so it can be used in a minimization process and is also a stable alternative to the original GAN loss function. Eq. 12 shows the L2 loss for the generated image G(z) and the real image y (Anas et al., 2020).

L_2 = \frac{1}{2} E_{x,y,z}[ ( D(G(z)) - 1 )^2 ]    (12)

where x is the input image, y is the output image, z is the random noise, and D(G(z)) is the output of the discriminator when its input is the image produced by the generator G.

A short description of the loss functions used in our experiments is summarized in Table 2.

Table 2
Loss function definitions.

Loss function   Objective
cGAN            Conditional adversarial loss, where the discriminator observes the input image.
L_L1            The absolute distance between two images.
L_structure     Uses the SSIM index to compute the similarity between two images.
L_gradient      Penalizes differences in the image gradient predictions in order to sharpen the image.
L_KLD           The difference between two probability distributions.
L_content       Minimizes the difference between the content representations of the source and generated images.
L_softmax       Takes a vector of K real numbers as input and normalizes it into a probability distribution.

3.3. Evaluation metrics

Automatic perceptual quality evaluation of a distorted image in comparison with a reference image is called Full-Reference Image Quality Assessment (FR-IQA). PSNR and SSIM are FR-IQA's state-of-the-art evaluation metrics used to evaluate image performance over test sets (Saha and Wu, 2016). In the case of image synthesis, these metrics calculate the amount of distortion in the generated images. The simplest way to assess image quality is to calculate PSNR. However, PSNR does not always correlate with human visual perception and image quality. Additional metrics, such as SSIM, were recommended to resolve this constraint of PSNR. We also used UQI and VIF to evaluate the results.

1. PSNR
The most common measurement method to evaluate synthesised images is PSNR, which is widely used by many researchers for image comparison and image synthesis because it is simple and easy to implement (Setiadi, 2021; Sara et al., 2019). PSNR is a pixel-loss-based evaluation metric that measures how far the generated image pixels are from the ground truth. The testing dataset consists of paired images, which is what pixel-loss-based metrics such as PSNR and SSIM require. The higher the PSNR, the better the quality of the generated image. Eq. 13 shows the formula of PSNR:

PSNR = 20 \cdot \log_{10}\left( \frac{R}{\sqrt{MSE}} \right)    (13)

Based on Eq. 13, R is the maximum fluctuation in the input image data type. If the input image has a double-precision floating-point data type, then R is 1. If it has an 8-bit unsigned integer data type, then R is 255.
In addition, the Mean Square Error (MSE) is formulated as the cumulative squared error between the generated image X(m, n) and the ground truth image Y(m, n). The lower the value of MSE, the lower the error. Eq. 14 shows the MSE formula:

MSE = \frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N} ( X(m, n) - Y(m, n) )^2    (14)

Where M and N denote the number of rows and columns in the image, respectively.

2. UQI
UQI is mathematically determined without the use of any human visual system model, and it is designed to provide a comparison of the distortion information between the original image and the distorted image. UQI is a combination of three factors, namely loss of correlation, distortion of luminance, and distortion of contrast. This metric is easy to calculate and can be used in various image processing applications (Fadl et al., 2018). UQI is defined in Eq. 15:

UQI = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \cdot \frac{2\bar{x}\bar{y}}{\bar{x}^2 + \bar{y}^2} \cdot \frac{2\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}    (15)

The three factors of the equation represent loss of correlation, distortion of luminance, and distortion of contrast, where x is the original image and y is the generated image. \bar{x} and \bar{y} are defined in Eq. 16 and Eq. 17, respectively.

\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i    (16)

\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i    (17)

To perform the multiplication and addition operations, the x and y images must be square images of size N x N.

3. VIF
VIF is an information quality measure based on Natural Scene Statistics (NSS) and the notion of image information extracted by the human visual system (Saha and Wu, 2016). VIF has three components, namely source, distortion, and the Human Visual System (HVS), as shown in Fig. 3, where C is the source image, D is the distorted image, E is the output of the HVS for the source image, and F is the output of the HVS for the distorted image. VIF is expressed in Eq. 18, where E is the Reference Image Information and F is the Distorted Image Information.

VIF = \frac{\text{Distorted Image Information}}{\text{Reference Image Information}}    (18)

Fig. 3. VIF components (source C, distortion D, and HVS outputs E and F).
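To close the section, a compact NumPy sketch of the PSNR, MSE, and UQI formulas in Eqs. 13-15 is given below; it uses whole-image statistics, whereas sliding-window variants and the VIF measure are left to dedicated libraries.

```python
import numpy as np

def mse(x, y):
    # Eq. 14: cumulative squared error averaged over all pixels.
    return np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)

def psnr(x, y, r=255.0):
    # Eq. 13: peak signal-to-noise ratio in dB; use r = 1.0 for floating-point images.
    return 20.0 * np.log10(r / np.sqrt(mse(x, y)))

def uqi(x, y):
    # Eq. 15: correlation, luminance, and contrast factors computed over whole images.
    x, y = x.astype(np.float64).ravel(), y.astype(np.float64).ravel()
    mx, my, vx, vy = x.mean(), y.mean(), x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    correlation = cov / np.sqrt(vx * vy)
    luminance = 2 * mx * my / (mx ** 2 + my ** 2)
    contrast = 2 * np.sqrt(vx * vy) / (vx + vy)
    return correlation * luminance * contrast
```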
4. Experimental results and discussion

4.1. Dataset

1. Image-to-Image Translation Dataset
Our model is applied to a facade dataset that contains 606 images of facades collected from different sources. This dataset's images are from various cities around the world, with a variety of architectural styles. The facade dataset is made up of 12 basic classes and sub-classes: facade, molding, cornice, pillar, window, door, sill, blind, balcony, shop, deco, and background. This dataset has been manually annotated (Tylecek, 2012). We divide the dataset into training and testing sets, where the training set contains 506 aligned pairs of images and the testing set contains 100 aligned image pairs.

2. Image In-painting Dataset
We modified the facade dataset to be used for image in-painting. The images in this dataset are paired, with each pair consisting of a modified facade image and its corresponding original facade image. The modified facade image has been prepared with a white rectangle indicating the lost area (a simple masking sketch is given after this list). A sample of the used image in-painting dataset is shown in Fig. 4.

Fig. 4. In-painting dataset example showing (a) input image, (b) ground truth image.

3. Lesion Segmentation Dataset
For the lesion segmentation dataset, we used our prepared Magnetic Resonance (MR) dataset. The images in the MR dataset are paired: each pair contains an MR image and its corresponding manual segmentation mask. Our dataset consists of 179 paired images (MR image and its corresponding mask). The manual segmentation mask is a black-and-white image with the white area indicating the lesion. Fig. 5 shows an example of the MR lesion segmentation dataset.

Fig. 5. A segmentation dataset example showing (a) input image, (b) manual segmentation mask (the lesion is the white area).
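Regarding the in-painting dataset above, the masking step can be sketched as follows; the rectangle position and size are illustrative assumptions, since the paper does not specify them.

```python
import numpy as np

def mask_with_white_rectangle(image, top, left, height, width):
    # Build the in-painting input by covering a region of the facade image with a
    # white rectangle indicating the lost area; (masked, original) form one pair.
    masked = image.copy()
    masked[top:top + height, left:left + width, :] = 255
    return masked
```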

4.2. Experiments

1. Image-to-Image Translation Results

We started our experiments by replacing the cGAN and Pix2Pix (cGAN + L_L1) adversarial loss with the adversarial loss of the state-of-the-art GAN models, namely WGAN, WGAN-GP, and lsGAN. Our goal is to identify the adversarial loss that is most suitable for the cGAN and Pix2Pix architectures. Table 3 shows the evaluation metric results of the different GAN models on the facades dataset. Fig. 6 and Fig. 7 show the results of four images using the different GAN models on the cGAN and Pix2Pix architectures, respectively. We found that WGAN + L_L1 and lsGAN + L_L1 are among the best adversarial loss functions, since they obtained the highest mean value and lowest standard deviation (Std) value for all evaluation metrics. Furthermore, the results show that adding the L1 loss function to the cGAN model produces better performance regardless of the adversarial loss used, with the exception of the UQI and VIF metrics for WGAN-GP. Therefore, it is beneficial to use a suitable adversarial loss function combined with the L1 loss. Based on the findings, we observe that the WGAN-GP adversarial loss is incapable of producing improved results for this type of translation. We are going further in our experiments by adding non-adversarial loss functions to the cGAN and the Pix2Pix models; a sketch of such a combined generator objective is given below.
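To make the combinations concrete, a generator update might sum the adversarial term with weighted non-adversarial terms, as sketched below; the weighting factors are assumptions, since the paper does not report its weights.

```python
import torch

def generator_objective(D, x, y_real, y_fake, adv_g_loss, content_loss=None,
                        lambda_l1=100.0, lambda_content=1.0):
    # Adversarial term, e.g. the cGAN, WGAN, or lsGAN generator loss sketched in Section 3.2.
    loss = adv_g_loss(D, x, y_fake)
    # L1 term (Eq. 1) keeps low-frequency structure.
    loss = loss + lambda_l1 * torch.mean(torch.abs(y_real - y_fake))
    # Optional content term (Eq. 6), e.g. for the WGAN + L_content + L_1 combination.
    if content_loss is not None:
        loss = loss + lambda_content * content_loss(y_fake, y_real)
    return loss
```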

Table 3
Image quality evaluation metrics for cGAN and Pix2Pix models with an adversarial loss replacement. Values are Mean (Std). Bold in the original indicates the highest two metric scores.

GAN model                PSNR             SSIM            UQI             VIF
cGAN                     23.772 (2.319)   0.238 (0.036)   0.685 (0.074)   0.011 (0.003)
WGAN                     23.268 (2.357)   0.176 (0.028)   0.713 (0.075)   0.009 (0.002)
WGAN-GP                  24.534 (3.080)   0.197 (0.073)   0.751 (0.082)   0.172 (0.096)
lsGAN                    23.661 (3.313)   0.185 (0.067)   0.742 (0.078)   0.155 (0.097)
Adding L_L1 (Pix2Pix)
cGAN + L_L1 (Pix2Pix)    24.966 (2.015)   0.269 (0.025)   0.718 (0.064)   0.012 (0.003)
WGAN + L_L1              27.991 (2.623)   0.332 (0.047)   0.754 (0.068)   0.269 (0.179)
WGAN-GP + L_L1           25.165 (2.594)   0.205 (0.065)   0.738 (0.074)   0.149 (0.107)
lsGAN + L_L1             28.089 (2.561)   0.352 (0.090)   0.900 (0.074)   0.283 (0.091)
Fig. 6. Results of the cGAN model with an adversarial loss replacement: (a) input, (b) ground truth, (c) cGAN, (d) lsGAN, (e) WGAN, (f) WGAN-GP, respectively.

Fig. 7. Results of the Pix2Pix model with an adversarial loss replacement: (a) input, (b) ground truth, (c) cGAN, (d) lsGAN, (e) WGAN, (f) WGAN-GP, respectively.

Table 4 shows the image evaluation metric (PSNR, SSIM, UQI, and VIF) results after adding non-adversarial loss functions to the cGAN and Pix2Pix adversarial loss. Fig. 8 and Fig. 9 show the results of four images from the cGAN and Pix2Pix models after modifying their loss function by adding the non-adversarial loss functions mentioned earlier, respectively. The results show that adding L1 to any of the applied non-adversarial loss functions gives better results. For instance, the SSIM loss preserves contrast in high-frequency regions, while L1 maintains low frequencies. This indicates that the combination of the SSIM and L1 loss functions produces better results than using SSIM alone. In addition, adding a non-adversarial loss function to the Pix2Pix model produces better results than using L1 alone. However, adding a non-adversarial loss function to the cGAN model will not always help, because it does not use the L1 loss function and thus will not capture the low-frequency components of the image correctly. For example, the SSIM values of cGAN + L_gradient, cGAN + L_content, and cGAN + L_structural are less than the cGAN value. This highlights the significance of combining non-adversarial loss with the L1 loss function. Furthermore, we notice that the Pix2Pix model outperforms the cGAN model. We proceed by adding the non-adversarial loss functions to the WGAN and lsGAN adversarial loss functions in addition to the L1 loss function. One loss function is added at a time. We compared the results to determine the best combination of loss functions, as shown in Table 5.
Table 4
Image quality evaluation metrics for cGAN and Pix2Pix after adding a non-adversarial loss function to the generator network. Values are Mean (Std). Bold in the original indicates the highest metric.

GAN model                PSNR             SSIM            UQI             VIF
cGAN                     23.772 (2.319)   0.238 (0.036)   0.685 (0.074)   0.011 (0.003)
cGAN + L_gradient        23.565 (2.129)   0.135 (0.027)   0.690 (0.066)   0.051 (0.026)
cGAN + L_KLD             23.649 (2.640)   0.244 (0.036)   0.739 (0.075)   0.073 (0.037)
cGAN + L_softmax         24.335 (2.241)   0.208 (0.031)   0.713 (0.069)   0.063 (0.030)
cGAN + L_content         24.713 (2.368)   0.173 (0.038)   0.709 (0.074)   0.079 (0.051)
cGAN + L_structural      22.805 (2.862)   0.179 (0.033)   0.688 (0.083)   0.093 (0.058)
cGAN + L_L1 (Pix2Pix)
Pix2Pix                  24.966 (2.015)   0.269 (0.025)   0.718 (0.064)   0.012 (0.003)
Pix2Pix + L_gradient     26.722 (2.445)   0.294 (0.031)   0.724 (0.066)   0.086 (0.061)
Pix2Pix + L_KLD          26.357 (2.178)   0.305 (0.034)   0.725 (0.066)   0.075 (0.040)
Pix2Pix + L_softmax      26.625 (2.464)   0.285 (0.033)   0.732 (0.069)   0.081 (0.051)
Pix2Pix + L_content      27.043 (2.451)   0.291 (0.037)   0.731 (0.068)   0.088 (0.059)
Pix2Pix + L_structural   26.578 (2.272)   0.291 (0.037)   0.715 (0.068)   0.073 (0.049)

Fig. 8. Results of the cGAN model after adding a non-adversarial loss function to the generator network: (a) input, (b) ground truth, (c) L_gradient, (d) L_KLD, (e) L_structural, (f) L_softmax, (g) L_content, respectively.

Fig. 9. Results of the Pix2Pix model after adding a non-adversarial loss function to the generator network: (a) input, (b) ground truth, (c) L_gradient, (d) L_KLD, (e) L_structural, (f) L_softmax, (g) L_content, respectively.
Table 5
Image quality evaluation metrics for lsGAN and WGAN after adding a non-adversarial loss function to the generator network. Values are Mean (Std). Bold in the original indicates the best metric scores.

GAN model                    PSNR             SSIM            UQI             VIF
lsGAN
lsGAN + L1                   28.089 (2.56)    0.352 (0.090)   0.805 (0.074)   0.283 (0.091)
lsGAN + L_gradient + L1      27.970 (2.578)   0.331 (0.047)   0.744 (0.067)   0.267 (0.156)
lsGAN + L_KLD + L1           27.952 (2.649)   0.331 (0.047)   0.745 (0.071)   0.283 (0.180)
lsGAN + L_softmax + L1       27.953 (2.468)   0.334 (0.045)   0.741 (0.067)   0.255 (0.172)
lsGAN + L_content + L1       28.189 (2.686)   0.336 (0.047)   0.848 (0.067)   0.293 (0.126)
lsGAN + L_structural + L1    27.906 (2.630)   0.338 (0.047)   0.745 (0.070)   0.239 (0.133)
WGAN
WGAN + L1                    27.991 (2.623)   0.332 (0.047)   0.754 (0.068)   0.269 (0.179)
WGAN + L_gradient + L1       28.305 (2.452)   0.330 (0.049)   0.742 (0.069)   0.271 (0.165)
WGAN + L_KLD + L1            27.965 (2.559)   0.328 (0.046)   0.747 (0.069)   0.252 (0.167)
WGAN + L_softmax + L1        27.799 (2.658)   0.328 (0.046)   0.745 (0.066)   0.262 (0.160)
WGAN + L_content + L1        28.011 (2.692)   0.329 (0.044)   0.859 (0.069)   0.303 (0.142)
WGAN + L_structural + L1     27.923 (2.723)   0.336 (0.047)   0.736 (0.068)   0.256 (0.152)

Fig. 10. Results of the lsGAN model after adding a non-adversarial loss function to the generator network: (a) input, (b) ground truth, (c) L_gradient + L1, (d) L_KLD + L1, (e) L_structural + L1, (f) L_softmax + L1, (g) L_content + L1, respectively.

Fig. 11. Results of the WGAN model after adding a non-adversarial loss function to the generator network: (a) input, (b) ground truth, (c) L_gradient + L1, (d) L_KLD + L1, (e) L_structural + L1, (f) L_softmax + L1, (g) L_content + L1, respectively.
Table 6
Image in-painting quality evaluation metrics for lsGAN and WGAN after adding a non-adversarial loss function to the generator network. Values are Mean (Std). Bold in the original indicates the best metric scores.

GAN model                    PSNR             SSIM            UQI             VIF
lsGAN
lsGAN + L1                   56.014 (3.141)   0.931 (0.021)   0.901 (0.017)   0.812 (0.042)
lsGAN + L_gradient + L1      56.209 (2.951)   0.938 (0.058)   0.934 (0.022)   0.848 (0.051)
lsGAN + L_KLD + L1           56.733 (3.241)   0.938 (0.061)   0.927 (0.031)   0.873 (0.024)
lsGAN + L_softmax + L1       57.737 (3.021)   0.941 (0.023)   0.943 (0.018)   0.882 (0.043)
lsGAN + L_content + L1       58.133 (3.024)   0.943 (0.027)   0.950 (0.029)   0.880 (0.037)
lsGAN + L_structural + L1    57.714 (3.145)   0.935 (0.027)   0.941 (0.032)   0.876 (0.029)
WGAN
WGAN + L1                    55.914 (3.044)   0.929 (0.028)   0.910 (0.012)   0.809 (0.027)
WGAN + L_gradient + L1       57.547 (3.124)   0.946 (0.021)   0.949 (0.032)   0.824 (0.017)
WGAN + L_KLD + L1            56.850 (3.102)   0.938 (0.027)   0.938 (0.027)   0.824 (0.021)
WGAN + L_softmax + L1        57.279 (2.991)   0.943 (0.014)   0.942 (0.019)   0.883 (0.024)
WGAN + L_content + L1        58.455 (2.874)   0.947 (0.034)   0.949 (0.023)   0.902 (0.012)
WGAN + L_structural + L1     56.995 (3.001)   0.942 (0.029)   0.923 (0.020)   0.831 (0.030)

Fig. 12. Image in-painting results of the lsGAN model after adding a non-adversarial loss function to the generator network: (a) input, (b) ground truth, (c) L_gradient + L1, (d) L_KLD + L1, (e) L_structural + L1, (f) L_softmax + L1, (g) L_content + L1, respectively.

Fig. 13. Image in-painting results of the WGAN model after adding a non-adversarial loss function to the generator network: (a) input, (b) ground truth, (c) L_gradient + L1, (d) L_KLD + L1, (e) L_structural + L1, (f) L_softmax + L1, (g) L_content + L1, respectively.

Fig. 10 and Fig. 11 show the results of four images using the combination of lsGAN and WGAN with the non-adversarial loss functions, respectively. The results show that the best loss function is WGAN + L_content + L1 or WGAN + L_gradient + L1. Utilizing the WGAN adversarial loss results in the highest possible performance. Furthermore, the use of the content or gradient loss with the L1 loss function results in an overall improved performance. Output samples that use the L1 loss function on its own are blurred and lack high-frequency structure, while the content loss offers the training stability required for convergence.
The L1 loss function handles the low-frequency components of the image, and the content loss function deals with the high-frequency image components. Thus, the combination of these two functions can handle both low and high frequency components. The content loss is used to detect the features in images, which allows the loss function to know what features are in the target ground truth image rather than merely comparing pixel differences. This allows a model trained with this loss function to produce much finer detail in the generated features and outputs.

2. Image In-painting Results
cGAN and Pix2Pix models can be used for image in-painting. We also evaluate the proposed method on the in-painting problem of the facade dataset. We combine L1 and the non-adversarial loss functions when the WGAN and lsGAN adversarial loss functions are used. Table 6 shows the comparison that we performed to determine the best combination of loss functions. Fig. 12 and Fig. 13 show the results of three images using the combination of lsGAN+L1 and WGAN+L2 with the non-adversarial loss functions, respectively. The results show that the best loss function is WGAN + L_content + L1. Furthermore, the results show that incorporating non-adversarial loss functions alongside the L1 loss function improves the results.

3. Lesion Segmentation Results
Pix2Pix can be used for image segmentation. We evaluate the proposed method on the segmentation problem of brain MR images. We modify the Pix2Pix model's loss function to improve its segmentation performance and to show the effect of loss function replacement on this task. We replace L1 with the Dice, Hausdorff Distance (HD), and cross-entropy loss functions. The segmentation results of the Pix2Pix model after its loss function replacement are shown in Fig. 14. The Dice and Jaccard results of Pix2Pix after loss function replacement are shown in Table 7. The results show that using the Dice loss function with the Pix2Pix model outperforms the use of the other loss functions.
3. Lesion Segmentation Results
more, the proposed model could be combined with another unsu-
Pix2pix can be used for image segmentation. We evaluate the
pervised GAN model to construct a model that can take advantage
proposed method on the segmentation problem of brain MR
of the combined models. This paper demonstrates the effect of loss
images. We modify the pix2pix model’s loss function to
functions on cGAN with a single output. This work can be extended
improve its segmentation performance and to show the effect
to multi-modeling, in which the model produces multiple outputs.
of loss function replacement on this task. We replace L1 with

Declaration of Competing Interest


Table 7
Performance of Pix2pix with Loss Function Replacement on the MR Dataset in Terms
The authors declare that they have no known competing finan-
of Dice Index, Jaccard.
cial interests or personal relationships that could have appeared
GAN model Dice Jaccard to influence the work reported in this paper.
L1 0.871 0.794
Dice Loss 0.894 0.802
Cross-entropy Loss 0.782 0.768 References
HD Loss 0.773 0.734
Abobakr, A., Hossny, M., Nahavandi, S., 2019. Ssimlayer: towards robust deep
Bold indicates the highest metric. representation learning via nonlinear structural similarity. In: 2019 IEEE

6987

Abobakr, A., Hossny, M., Nahavandi, S., 2019. Ssimlayer: towards robust deep representation learning via nonlinear structural similarity. In: 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC). IEEE, pp. 1234–1238.
Alotaibi, A., 2020. Deep generative adversarial networks for image-to-image translation: A review. Symmetry 12 (10), 1705.
Anas, E.R., Onsy, A., Matuszewski, B.J., 2020. Ct scan registration with 3d dense motion field estimation using lsgan. In: Annual Conference on Medical Image Understanding and Analysis. Springer, pp. 195–207.
Andreini, P., Bonechi, S., Bianchini, M., Mecocci, A., Scarselli, F., 2020. Image generation by gan and style transfer for agar plate image segmentation. Comput. Methods Programs Biomed. 184, 105268.
Bellemare, M.G., Danihelka, I., Dabney, W., Mohamed, S., Lakshminarayanan, B., Hoyer, S., Munos, R., 2017. The cramer distance as a solution to biased wasserstein gradients. arXiv preprint arXiv:1705.10743.
Bhattacharjee, P., Das, S., 2018. Context graph based video frame prediction using locally guided objective. In: Proceedings of the European Conference on Computer Vision (ECCV).
Chrysos, G.G., Kossaifi, J., Zafeiriou, S., 2018. Robust conditional generative adversarial networks. arXiv preprint arXiv:1805.08657.
Emami, H., Aliabadi, M.M., Dong, M., Chinnam, R.B., 2020. Spa-gan: Spatial attention gan for image-to-image translation. IEEE Trans. Multimedia 23, 391–401.
Fadl, S., Han, Q., Li, Q., 2018. Surveillance video authentication using universal image quality index of temporal average. In: International Workshop on Digital Watermarking. Springer, pp. 337–350.
Gatys, L.A., Ecker, A.S., Bethge, M., 2016. Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2414–2423.
Goel, T., Murugan, R., Mirjalili, S., Chakrabartty, D.K., 2021. Automatic screening of covid-19 using an optimized generative adversarial network. Cogn. Comput., 1–16.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680.
Padalkar, G.R., Patil, S.D., Hegadi, M.M., Jaybhaye, N.K., 2021. Drug discovery using generative adversarial network with reinforcement learning. In: 2021 International Conference on Computer Communication and Informatics (ICCCI). IEEE, pp. 1–3.
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C., 2017. Improved training of wasserstein gans. In: Advances in Neural Information Processing Systems, pp. 5767–5777.
Han, C., Rundo, L., Murao, K., Noguchi, T., Shimahara, Y., Milacski, Z.Á., Koshino, S., Sala, E., Nakayama, H., Satoh, S., 2021. Madgan: unsupervised medical anomaly detection gan using multiple adjacent brain mri slice reconstruction. BMC Bioinf. 22 (2), 1–20.
Hognon, C., Tixier, F., Colin, T., Gallinato, O., Visvikis, D., Jaouen, V., 2020. Influence of gradient difference loss on mr to pet brain image synthesis using gans.
Isola, P., Zhu, J.-Y., Zhou, T., Efros, A.A., 2017. Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134.
Johnson, J., Alahi, A., Fei-Fei, L., 2016. Perceptual losses for real-time style transfer and super-resolution. In: European Conference on Computer Vision. Springer, pp. 694–711.
Lin, M., 2017. Softmax gan. arXiv preprint arXiv:1704.06191.
Liu, F., Jiao, L., Tang, X., 2019. Task-oriented gan for polsar image classification and clustering. IEEE Trans. Neural Networks Learn. Syst. 30 (9), 2707–2719.
Liu, R., Wang, X., Lu, H., Wu, Z., Fan, Q., Li, S., Jin, X., 2021. Sccgan: Style and characters inpainting based on cgan. Mobile Networks Appl., 1–10.
Liu, H., Wan, Z., Huang, W., Song, Y., Han, X., Liao, J., 2021. Pd-gan: Probabilistic diverse gan for image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9371–9381.
Ma, Y., Wei, B., Feng, P., He, P., Guo, X., Wang, G., 2020. Low-dose ct image denoising using a generative adversarial network with a hybrid loss function for noise learning. IEEE Access 8, 67519–67529.
Mao, X., Li, Q., Xie, H., Lau, R.Y., Wang, Z., Paul Smolley, S., 2017. Least squares generative adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2794–2802.
Saha, A., Wu, Q.J., 2016. Full-reference image quality assessment by combining global and local distortion measures. Signal Process. 128, 186–197.
Sara, U., Akter, M., Uddin, M.S., 2019. Image quality assessment through fsim, ssim, mse and psnr: a comparative study. J. Comput. Commun. 7 (3), 8–18.
Setiadi, D.R.I.M., 2021. Psnr vs ssim: imperceptibility quality assessment for image steganography. Multimedia Tools Appl. 80, 8423–8444.
Tang, Y., Yang, X., Wang, N., Song, B., Gao, X., 2020. Cgan-tm: A novel domain-to-domain transferring method for person re-identification. IEEE Trans. Image Process. 29, 5641–5651.
Tang, H., Liu, H., Xu, D., Torr, P.H., Sebe, N., 2021. Attentiongan: Unpaired image-to-image translation using attention-guided generative adversarial networks. IEEE Trans. Neural Networks Learn. Syst., 1–16, early access.
Tylecek, R., 2012. The cmp facade database. Research Report CTU–CMP–2012–24 (Tech. Rep.). Czech Technical University.
Tzeng, E., Hoffman, J., Saenko, K., Darrell, T., 2017. Adversarial discriminative domain adaptation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7167–7176.
Waheed, A., Goyal, M., Gupta, D., Khanna, A., Al-Turjman, F., Pinheiro, P.R., 2020. Covidgan: data augmentation using auxiliary classifier gan for improved covid-19 detection. IEEE Access 8, 91916–91923.
Wang, C., Xu, C., Wang, C., Tao, D., 2018. Perceptual adversarial networks for image-to-image transformation. IEEE Trans. Image Process. 27 (8), 4066–4079.
Yu, Y., Gong, Z., Zhong, P., Shan, J., 2017. Unsupervised representation learning with deep convolutional neural network for remote sensing images. In: International Conference on Image and Graphics. Springer, pp. 97–108.
Zhang, H., Sindagi, V., Patel, V.M., 2017. Image de-raining using a conditional generative adversarial network. arXiv preprint arXiv:1701.05957.
