Efficient Image Colorization
Abstract—This paper proposes a deep learning-based approach to colorizing low-resolution 32 × 32 grayscale images, without transfer learning, using Conditional Generative Adversarial Networks (CGANs). New generator and discriminator architectures, motivated by the Pix2Pix framework, are proposed so that small image sizes are handled efficiently. The paper experimentally studies the models' performance on 1,000–10,000-sample subsets of CIFAR-10 to measure how dataset size affects performance. A trade-off between dataset size and image quality does exist, yet the model performs comparably at the smaller dataset sizes under SSIM and PSNR. Such findings are particularly relevant for application areas like medical imaging, where access to clear, high-resolution images or large datasets is often limited; they demonstrate that CGANs can deliver high-quality results with limited data and without any form of pre-trained models.

Index Terms—Conditional GANs, image-to-image translation, small datasets, SSIM, PSNR, grayscale-to-color translation, Pix2Pix.
I. INTRODUCTION

Colorization is the task of predicting what colors should be applied to a grayscale image, converting it into a three-channel RGB image. Traditionally, colorization was a long and arduous manual process. With advances in machine learning and computer vision, automatic colorization techniques have developed into an efficient and accurate alternative. Applications range from image restoration and entertainment to medical imaging.

Colorization approaches fall into two broad categories: user-guided and automatic. The user-guided method relies on manual input, where users provide color hints or reference images so that the algorithm can propagate color across the grayscale image. These methods can be effective, but they are resource-intensive and labor-heavy for images with intricate detail. The automatic method requires no manual input. With the rise of deep learning, the dominant automatic approach uses Convolutional Neural Networks trained on large datasets of colored images to learn the mapping from grayscale to color.

Generative Adversarial Networks, introduced by Goodfellow et al., have changed the face of image generation tasks such as colorization [1]. Conditional GANs in particular enable controlled generation of images conditioned on input data, such as a low-resolution grayscale image. In CGAN-based colorization, the generator produces a colored version of the given grayscale image, and the discriminator learns to distinguish real from fake colored images, pushing the generator to produce more realistic outputs [2], [3].

This paper proposes a lightweight CGAN-based approach to the colorization of low-resolution grayscale images. Experiments focus on 32 × 32 pixel images from the CIFAR-10 dataset and explore whether CGANs are suitable for training on sparse data without pre-trained models. This is particularly important in medical imaging domains, where high-resolution images or large datasets are not usually available. The proposed method shows that CGANs can generate colorizations of comparable quality while using far fewer data samples and lower-resolution images, which is very practical for resource-limited scenarios [4], [5].

A. Contributions

The significant contributions of the paper include:
• Employing CGANs to colorize low-resolution (32 × 32) grayscale images, with results that transfer to small-scale datasets ranging from 1,000 to 10,000 samples.
• Developing generator and discriminator architectures adapted from the Pix2Pix framework so that much smaller image sizes than is standard can be handled.
• Making an elaborate comparison with state-of-the-art colorization techniques, benchmarking the results using PSNR and SSIM scores across different methods in Table IV.
• Showing that the model, when trained with a limited amount of data, still achieves competitive results in terms of Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR). The paper establishes how far model performance depends on dataset size: the model performs comparably with as few as 5,000 images and improves with dataset size. CGAN-based methods can thus be applied to low-resolution images, making them applicable to domains such as medical imaging, where resolution and dataset size are sometimes limited.
The remainder of the paper is organized as follows: Section II gives an overview of previous research on image colorization using GAN-based methods and other related work. Section III describes the experimental methodology, the CGAN architectures used, and the procedure for analyzing the results. Results analysis and a discussion of the findings are presented in Section V, potential improvements and future research directions are proposed in Section VI, and Section VII concludes the paper and summarizes its findings.
II. RELATED WORK

Image colorization is a research area that has made remarkable progress through deep learning methods, particularly generative models. Zhang et al. [19] first presented a CNN-based architecture for automatically colorizing grayscale images. It forecasts color distributions for each pixel of a grayscale image and achieved good results on large-scale datasets. Although effective, the method is computationally expensive and depends on large training sets, making it less practical for resource-scarce scenarios or where data availability is a concern.

Generative Adversarial Networks (GANs) [1] offered a much stronger framework for image colorization, allowing far more realistic image synthesis through the adversarial process between generator and discriminator. Mirza and Osindero [2] introduced conditional GANs, an extension of this idea in which both the generator and the discriminator are conditioned on additional information. Isola et al. [3] made significant contributions by extending this into Pix2Pix, a widely used architecture for general image-to-image translation tasks, including colorization. Although CGAN-based methods like Pix2Pix are quite successful, they often rely on large amounts of training data to ensure generalization and are thus not feasible in data-constrained environments.

Cao et al. [7] further advanced this line of work by using GANs for automatic image colorization, focusing on bright and realistic colors. Nazeri et al. [8] improved on this by proposing perceptual losses that increase the perceptual quality of the colored image. These methods produce high-quality results, but they depend on very large-scale datasets for best performance.
Transfer learning has also been explored to improve performance. Iizuka et al. [9] introduced a model that uses both local and global context features for colorization, taking advantage of pretrained networks to produce high-quality results. Such transfer learning-based approaches, however, bring computational overhead and rely on pre-trained models that may not always be available or appropriate, especially in fields such as medical imaging. This work differs from the above approaches because it emphasizes a lightweight CGAN architecture independent of transfer learning, and is thus applicable in scenarios where both data and computational resources are limited.

More recent work has attempted to push training efficiency with the help of self-supervised learning. Suarez et al. [4] and Vitoria et al. [10] discussed the use of self-supervision for colorization, showing that GANs remain effective even when data availability is small. These approaches still require some level of supervision, however, and colorization quality degrades somewhat when training data is very minimal.

Other studies, for instance Kumar et al. [11], used CGANs for specialized tasks such as medical image colorization, demonstrating their capability across different domains. For portrait colorization, Serrano et al. [12] demonstrated the applicability of CGANs to coloring low-resolution grayscale portraits. Although these works affirm the suitability of CGANs for a variety of applications, they typically target high-resolution images and large datasets, which may not be feasible in resource-limited settings.

Apart from GAN-based methods, the quality assessment of colorized images has usually relied heavily on SSIM [13] and PSNR [14]. These two metrics have served as the default benchmarks for judging the quality of colorized images in a perceptual sense. This paper therefore uses both as an established ground for comparing model performance.

A. Limitations of Previous Approaches

While GANs and CGANs have significantly advanced image colorization, several weaknesses remain. First, most approaches require large datasets for adequate performance. Many models, including Pix2Pix, fail to generalize well when little training data is available. This makes them less applicable in domains where large, high-resolution image collections may not be available, especially in medical image scanning. Besides, transfer learning methods, as proposed in [9], introduce additional overhead and a reliance on external pre-trained models that is not always possible. Lastly, self-supervised approaches, as described in [4], typically sacrifice quality when data is highly sparse.

B. Paper's Approach

To overcome these limitations, this work proposes a CGAN-based model specific to the colorization of low-resolution (32 × 32) grayscale images on smaller datasets. Transfer learning can be eliminated despite the constrained image resolutions and dataset sizes. Experimental evaluations demonstrate that the model obtains competitive PSNR and SSIM metrics even in very resource-constrained environments. This allows the proposed approach to apply well in specialized fields such as medical imaging, where high resolution is extremely difficult to obtain and data is often scarce.
III. METHODOLOGY

A. Dataset and Preprocessing

This work uses CIFAR-10, composed of 32 × 32 RGB images, to test how the model performs at different dataset sizes. For the analysis, the paper experiments on six subsets of CIFAR-10 with 1000, 2000, 3000, 4000, 5000, and 10000 samples, respectively.

The dataset is preprocessed such that the pixel values of each image are normalized to the [−1, 1] range, and the RGB images are converted to grayscale inputs for the generator. The normalization function used in preprocessing scales the images as follows:

x′ = x / 127.5 − 1

where x represents the original pixel values of the image.

Each grayscale image is paired with its corresponding color image from the dataset. The grayscale image is passed to the generator, and the discriminator receives both the grayscale (input) and color (target) images.
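The paper does not reproduce its data pipeline; the following is a minimal TensorFlow/Keras sketch of the preprocessing described above, assuming CIFAR-10 is loaded via tf.keras.datasets:

    import tensorflow as tf

    def preprocess(images):
        # images: uint8 RGB batch of shape (N, 32, 32, 3) from CIFAR-10
        color = tf.cast(images, tf.float32) / 127.5 - 1.0  # x' = x / 127.5 - 1, into [-1, 1]
        gray = tf.image.rgb_to_grayscale(color)            # (N, 32, 32, 1) generator input
        return gray, color

    (x_train, _), _ = tf.keras.datasets.cifar10.load_data()
    gray, color = preprocess(x_train[:5000])               # e.g. the 5,000-sample subset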
B. Generator Architecture

The generator in this model is designed to convert 32 × 32 grayscale images into 32 × 32 RGB images. It consists of an encoder-decoder structure with skip connections, similar to the Pix2Pix framework. The generator network begins with an input grayscale image, progressively downsamples it to a low-dimensional latent space, and then upsamples it back to the original resolution while adding the necessary color information.

The architecture of the generator is as follows (a Keras sketch is given after the listing):

Encoder:
• Conv2D (64 filters, kernel size = 4, strides = 2) + LeakyReLU
• Conv2D (128 filters, kernel size = 4, strides = 2) + Batch Normalization + LeakyReLU

Bridge:
• Conv2D (256 filters, kernel size = 4, strides = 1) + Batch Normalization + LeakyReLU

Decoder:
• Conv2DTranspose (128 filters, kernel size = 4, strides = 2) + Batch Normalization + ReLU
• Concatenate with encoder output
• Conv2DTranspose (64 filters, kernel size = 4, strides = 2) + Batch Normalization + ReLU
• Concatenate with input image
• Conv2D (3 filters, kernel size = 4, strides = 1, activation = tanh)
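A Keras sketch of this layer listing follows; 'same' padding is assumed throughout (the paper does not state its padding) so that the skip-connection shapes align:

    from tensorflow.keras import layers, Model

    def build_generator():
        inp = layers.Input(shape=(32, 32, 1))             # 32x32 grayscale in [-1, 1]
        # Encoder
        e1 = layers.LeakyReLU()(
            layers.Conv2D(64, 4, strides=2, padding="same")(inp))         # 16x16x64
        e2 = layers.Conv2D(128, 4, strides=2, padding="same")(e1)         # 8x8x128
        e2 = layers.LeakyReLU()(layers.BatchNormalization()(e2))
        # Bridge
        b = layers.Conv2D(256, 4, strides=1, padding="same")(e2)          # 8x8x256
        b = layers.LeakyReLU()(layers.BatchNormalization()(b))
        # Decoder with skip connections
        d1 = layers.Conv2DTranspose(128, 4, strides=2, padding="same")(b) # 16x16x128
        d1 = layers.ReLU()(layers.BatchNormalization()(d1))
        d1 = layers.Concatenate()([d1, e1])               # skip from encoder
        d2 = layers.Conv2DTranspose(64, 4, strides=2, padding="same")(d1) # 32x32x64
        d2 = layers.ReLU()(layers.BatchNormalization()(d2))
        d2 = layers.Concatenate()([d2, inp])              # skip from the input image
        out = layers.Conv2D(3, 4, strides=1, padding="same",
                            activation="tanh")(d2)        # 32x32x3 RGB in [-1, 1]
        return Model(inp, out, name="generator")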
Fig. 2: Transformation from Original to Grayscale to Generated Color Image.

Given a grayscale input image, the generator outputs a 32 × 32 × 3 color image. During training, the loss function consists of an adversarial loss combined with an L1 loss in order to force the generator to produce images close to the target color images:

LG = Ex,y [log D(x, G(x))] + λ · Ex,y [∥y − G(x)∥1]

where D(x, G(x)) is the discriminator's output for the generated image, and λ = 100 controls the weight of the L1 loss. The discriminator is trained with the standard conditional adversarial objective

LD = Ex,y [log D(x, y)] + Ex [log(1 − D(x, G(x)))]

where D(x, y) is the discriminator's output for a real image, and D(x, G(x)) is its output for a generated image.
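In code, these objectives can be sketched as below. The discriminator network itself is not specified in the text above, so the small conditional discriminator here is an assumed Pix2Pix-style stand-in, not the paper's exact architecture:

    import tensorflow as tf
    from tensorflow.keras import layers, Model

    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    LAMBDA = 100.0  # weight of the L1 term, as in the paper

    def generator_loss(disc_fake, real_rgb, fake_rgb):
        # Adversarial term pushes D(x, G(x)) toward "real"; L1 keeps colors near target
        adv = bce(tf.ones_like(disc_fake), disc_fake)
        l1 = tf.reduce_mean(tf.abs(real_rgb - fake_rgb))
        return adv + LAMBDA * l1

    def discriminator_loss(disc_real, disc_fake):
        # D(x, y) toward "real", D(x, G(x)) toward "fake"
        return (bce(tf.ones_like(disc_real), disc_real)
                + bce(tf.zeros_like(disc_fake), disc_fake))

    def build_discriminator():
        # Hypothetical conditional discriminator on (grayscale, color) pairs
        gray = layers.Input(shape=(32, 32, 1))
        rgb = layers.Input(shape=(32, 32, 3))
        x = layers.Concatenate()([gray, rgb])
        for filters in (64, 128):
            x = layers.LeakyReLU()(
                layers.Conv2D(filters, 4, strides=2, padding="same")(x))
        out = layers.Conv2D(1, 4, padding="same")(x)  # patch of real/fake logits
        return Model([gray, rgb], out, name="discriminator")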
Table III gives detailed SSIM and PSNR results for each sample size, reported for epochs from 1 to 100. SSIM ranges between 0.36 and 0.89, and PSNR ranges between 14.43 and 24.35. Both metrics improve as the number of epochs increases, and larger sample sizes provide better overall performance.

The SSIM and PSNR values for the different sample sizes are visualized in Figures 5 and 6, respectively. These plots illustrate the trends in model performance as training progresses.

• Figure 5 demonstrates how the SSIM values increase significantly after the first epoch and then stabilize across different sample sizes. For all sample sizes, SSIM converges to values above 0.8 after approximately 10 epochs, indicating that the generated images become increasingly similar to the ground truth over time.
• Figure 6 shows a similar trend for the PSNR values, where performance improves rapidly in the early stages of training and stabilizes as the number of epochs increases. Larger sample sizes achieve slightly higher PSNR values, with 5000 and 10000 samples reaching PSNR values above 23 by the 100th epoch. This suggests that the reconstruction quality of the generated images improves with more training data.

Both figures emphasize the effectiveness of increasing the dataset size and training duration: larger datasets and more epochs result in higher similarity (SSIM) and better reconstruction quality (PSNR).

Fig. 6: PSNR across sample sizes and epochs.
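The metric computation is not shown in the paper; a minimal TensorFlow sketch, assuming SSIM and PSNR are computed on images rescaled to [0, 1], would be:

    import tensorflow as tf

    def evaluate(generator, gray, color):
        # Mean SSIM and PSNR of generated vs. ground-truth color images
        fake = generator(gray, training=False)
        fake01 = (fake + 1.0) / 2.0   # undo the [-1, 1] normalization
        real01 = (color + 1.0) / 2.0
        ssim = tf.reduce_mean(tf.image.ssim(real01, fake01, max_val=1.0))
        psnr = tf.reduce_mean(tf.image.psnr(real01, fake01, max_val=1.0))
        return float(ssim), float(psnr)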
V. FINDINGS AND DISCUSSION

Analysis of Conditional Generative Adversarial Networks applied to the CIFAR-10 dataset for colorization gives several important insights into the relationship between dataset size, number of epochs, and model performance. The key takeaways from the SSIM and PSNR results across sample sizes and epochs are as follows.

A. Impact of Dataset Size on Model Performance

Both SSIM and PSNR trend upward as the sample size increases from 1000 to 10,000, especially at larger epochs. Results remain robust at the smaller dataset sizes, while larger amounts of data consistently give high performance:

• For 1000 samples, the SSIM improves from 0.3618 at epoch 1 to 0.8773 at epoch 100, a relative improvement of 142.4%, while the PSNR improves from 14.43 dB to 22.66 dB, a 57.1% increase.
TABLE III: SSIM and PSNR results for different sample sizes across epochs.

           1000           2000           3000           4000           5000           10000
Epoch   SSIM   PSNR    SSIM   PSNR    SSIM   PSNR    SSIM   PSNR    SSIM   PSNR    SSIM   PSNR
   1    0.3618 14.43   0.4690 15.29   0.5413 16.05   0.6631 17.70   0.7380 19.27   0.7447 19.47
  10    0.7915 20.33   0.8365 21.63   0.8128 20.51   0.8554 21.91   0.8226 20.70   0.8701 22.17
  20    0.8135 21.24   0.8395 21.19   0.7937 19.22   0.8616 22.33   0.8625 22.01   0.8564 21.70
  30    0.8182 21.14   0.8595 22.17   0.7976 20.17   0.8151 21.17   0.8693 22.31   0.8326 21.75
  40    0.8539 22.24   0.8326 21.62   0.8408 21.19   0.8631 22.26   0.8669 22.27   0.8747 22.28
  50    0.8253 20.92   0.8502 22.09   0.8629 22.21   0.8235 21.09   0.8683 22.50   0.8756 22.64
  60    0.8334 21.45   0.8743 22.88   0.8691 22.49   0.8912 23.17   0.8768 22.64   0.8635 22.09
  70    0.8639 22.40   0.8679 22.39   0.8675 22.29   0.8888 23.02   0.8889 23.12   0.8892 23.08
  80    0.8535 22.35   0.8595 22.14   0.8832 22.95   0.8864 23.20   0.8899 23.52   0.8540 23.17
  90    0.8766 22.61   0.8855 22.90   0.8862 23.14   0.8926 23.33   0.8912 23.88   0.8219 23.98
 100    0.8773 22.66   0.8885 22.98   0.8977 23.63   0.8931 23.50   0.8975 24.31   0.7878 24.35
TABLE IV: PSNR and SSIM results for different image colorization methods.

Citation / Source               Dataset Size (Images)  Image Resolution  Total Pixels (approx.)  Epochs  PSNR (dB)  SSIM
Zhang et al., 2016 [19]         1,000                  256x256           65,536,000              100     24.8       0.86
Iizuka et al., 2016 [9]         3,000                  224x224           150,528,000             50      24.5       0.85
Isola et al., 2017 [3]          400                    256x256           26,214,400              200     22.9       0.81
Nazeri et al., 2018 [8]         2,000                  128x128           32,768,000              50      23.1       0.79
Vitoria et al., 2020 [15]       1,500                  256x256           98,304,000              100     24.3       0.83
Su et al., 2019 [16]            600                    256x256           39,321,600              150     23.8       0.82
Sartaj et al., 2021 [17]        1,200                  128x128           19,660,800              80      23.5       0.84
Bhattacharjee et al., 2022 [5]  800                    128x128           13,107,200              120     24.1       0.81
Proposed Methodology            5,000                  32x32             5,120,000               100     24.31      0.8975
• With 2000 samples, the SSIM increases from 0.4690 to 0.8885 (89.4%) over 100 epochs, and the PSNR from 15.29 dB to 22.98 dB, a 50.3% increase.
• For 3000 samples, the SSIM rises from 0.5413 to 0.8977 (65.8%), and the PSNR improves from 16.05 dB to 23.63 dB, a 47.2% gain.
• When the dataset size is 4000, the SSIM improves from 0.6631 to 0.8931 (34.7%), while the PSNR increases from 17.70 dB to 23.50 dB, a 32.8% increase.
• At 5000 samples, the SSIM starts at 0.7380 and reaches 0.8975 (21.6%), while the PSNR increases from 19.27 dB to 24.31 dB, a 26.2% rise.
• For the largest dataset size (10,000 samples), SSIM improves from 0.7447 to 0.7878 (5.8%), while the PSNR rises from 19.47 dB to 24.35 dB, a 25.1% increase.

B. Impact of Epochs on Model Performance

Across all dataset sizes, increasing the number of training epochs results in a steady improvement in both SSIM and PSNR. The model shows rapid improvement within the first 10 epochs, with diminishing returns at later epochs:

• For 1000 samples, SSIM increases by 118.8% from epoch 1 to 10, while PSNR improves by 40.9%. However, the change from epoch 10 to epoch 100 is more modest, with SSIM improving by 10.8% and PSNR by 11.5%.
• Similar trends are observed for other dataset sizes, with rapid initial improvements followed by slower gains. For instance, for 5000 samples, SSIM improves by 9.1% between epochs 10 and 100, while PSNR improves by 17.4%.

C. Comparison Between Small and Large Datasets

Final SSIM and PSNR values vary with dataset size at the same number of epochs. For instance, at epoch 100, the SSIM for 10,000 samples is 0.7878 versus 0.8773 for 1000 samples, while the PSNR for 10,000 samples is 24.35 dB versus 22.66 dB for 1000 samples. It is notable that, even when the dataset is small, the model delivers reasonably competitive performance, which shows that CGANs can perform well even on small datasets.

D. Overall Findings

The conducted experiments demonstrate that, in general, increasing the dataset size improves performance, yet important gains can also be achieved by longer training on smaller datasets. For example, the comparison between 5000 and 10,000 samples at epoch 100 shows nearly identical PSNR (24.31 dB vs. 24.35 dB), while SSIM is actually higher for the smaller subset (0.8975 vs. 0.7878). These results show that CGAN-based colorization can be quite efficient even with relatively small datasets, and that the approach appears viable when data are scarce.

E. Comparison with Related Work

When compared with the results of previous research studies, some major differences can be noticed. Zhang et al. [19] use a dataset of 1,000 images at 256 × 256 resolution and achieve 24.8 dB PSNR and 0.86 SSIM. The proposed method's PSNR is slightly lower at 24.31 dB, but its SSIM of 0.8975 is higher than Zhang et al.'s, although achieved on much smaller 32 × 32 images with a sample size of 5,000. This shows
that the approach can produce perceptually better images even at lower resolutions.

Along the same lines, Iizuka et al. [9] achieve a PSNR of 24.5 dB and an SSIM of 0.85 from a dataset of 3,000 images at a higher resolution of 224 × 224. Although the proposed approach uses far fewer pixels at a comparable number of epochs, it is superior with respect to SSIM. This implies that the CGAN-based approach can attain very competitive results even from lower-resolution images.

Similarly, in comparison with Vitoria et al. [15], who trained their model on 1,500 images at 256 × 256 resolution for 100 epochs, the proposed method achieves higher PSNR and SSIM (24.31 dB vs. 24.3 dB, and 0.8975 vs. 0.83) using a larger number of samples but significantly fewer pixels. This further supports the scalability of the proposed method when training on lower-resolution data.

Moreover, Nazeri et al. [8] and Isola et al. [3], using 128 × 128 and 256 × 256 images, achieved comparable results at comparable epoch counts. The efficiency of the proposed model on 32 × 32 images demonstrates that CGANs are capable of producing high-quality outputs even at lower image resolutions, which opens the opportunity to extend this work to a wider range of image sizes. As an example, Sartaj et al. [17] use images of size 128 × 128 and 80 epochs, yet the proposed model's performance at epoch 100 is still superior, as shown in Table IV.
VI. POTENTIAL IMPROVEMENTS

A. Data Preprocessing

• Normalization Range: Currently, the preprocessing step normalizes images to the range [−1, 1]. A potential improvement could be to try different normalization ranges, such as [0, 1], depending on the activation functions used in the generator and discriminator models.
• Data Augmentation: Introduce data augmentation techniques such as random flips, rotations, and crops (see the sketch after this list). This would increase the diversity of the training data and might improve the generalization of the model.
• Larger Dataset: The number of samples for both training and testing is restricted (3,000 for training and 100 for testing). A larger dataset could improve performance, given that models like GANs typically benefit from more data.
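For paired image-to-image data, augmentations must be applied identically to input and target. A minimal tf.data-style sketch (illustrative, not the paper's code):

    import tensorflow as tf

    def augment(gray, color):
        # Concatenate the pair so every random op hits both images identically
        both = tf.concat([gray, color], axis=-1)               # (32, 32, 1+3)
        both = tf.image.random_flip_left_right(both)
        both = tf.image.rot90(both, tf.random.uniform((), 0, 4, dtype=tf.int32))
        both = tf.image.resize_with_crop_or_pad(both, 36, 36)  # pad, then random crop
        both = tf.image.random_crop(both, size=(32, 32, 4))
        return both[..., :1], both[..., 1:]

    # ds = tf.data.Dataset.from_tensor_slices((gray, color)).map(augment).batch(64)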
B. Model Architecture

• Skip Connections in Generator: The skip connections currently only link certain layers. Using more detailed skip connections, as in the U-Net architecture, might improve image restoration since they preserve high-resolution details from earlier layers.
• Deeper Generator Network: Increasing the depth of the generator by adding more convolutional and transpose convolutional layers might help capture more complex features from the input images.
• Residual Blocks: Introducing residual blocks into the generator (sketched below) could help improve image reconstruction. Residual connections allow better gradient flow and may accelerate training convergence.
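A generic Keras residual block of the kind this suggestion refers to could look as follows; the filter counts and the placement inside the generator would be design choices, not values from the paper:

    from tensorflow.keras import layers

    def residual_block(x, filters):
        # Two convolutions plus an identity shortcut for better gradient flow
        shortcut = x
        if shortcut.shape[-1] != filters:                  # match channels if needed
            shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
        y = layers.Conv2D(filters, 3, padding="same")(x)
        y = layers.ReLU()(layers.BatchNormalization()(y))
        y = layers.Conv2D(filters, 3, padding="same")(y)
        y = layers.BatchNormalization()(y)
        return layers.ReLU()(layers.Add()([shortcut, y]))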
C. Loss Functions and Optimization

• Perceptual Loss: Instead of relying solely on L1 loss, consider adding a perceptual loss (VGG-based loss), which compares high-level features of generated and real images; this could improve the visual quality of generated images (a sketch follows this list).
• Gradient Penalty: A gradient penalty could be introduced in the discriminator, especially if the GAN suffers from instability during training. This could be implemented as in Wasserstein GAN with Gradient Penalty (WGAN-GP) for more stable training (also sketched after this list).
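A hedged sketch of such a VGG-based perceptual loss follows; the choice of VGG16 and of the block3_conv3 feature layer is illustrative, not prescribed by the paper:

    import tensorflow as tf

    vgg = tf.keras.applications.VGG16(include_top=False, weights="imagenet")
    feat = tf.keras.Model(vgg.input, vgg.get_layer("block3_conv3").output)
    feat.trainable = False

    def perceptual_loss(real_rgb, fake_rgb):
        # VGG expects 0-255 RGB run through its own preprocessing; inputs are in [-1, 1]
        pre = lambda t: tf.keras.applications.vgg16.preprocess_input((t + 1.0) * 127.5)
        return tf.reduce_mean(tf.square(feat(pre(real_rgb)) - feat(pre(fake_rgb))))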
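Likewise, a WGAN-GP-style gradient penalty for the discriminator might be sketched as below (the weight of 10 is the value commonly used in the WGAN-GP literature, not the paper's):

    import tensorflow as tf

    def gradient_penalty(disc, gray, real_rgb, fake_rgb, gp_weight=10.0):
        # Penalize the discriminator's gradient norm away from 1 on
        # random interpolations between real and generated images
        eps = tf.random.uniform([tf.shape(real_rgb)[0], 1, 1, 1], 0.0, 1.0)
        interp = eps * real_rgb + (1.0 - eps) * fake_rgb
        with tf.GradientTape() as tape:
            tape.watch(interp)
            pred = disc([gray, interp], training=True)
        grads = tape.gradient(pred, interp)
        norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
        return gp_weight * tf.reduce_mean(tf.square(norm - 1.0))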
D. Training Process

• Dynamic Learning Rate: The learning rate could be adjusted dynamically using learning rate schedulers (see the sketch after this list). A decreasing learning rate over time could help the models converge more efficiently.
• Discriminator Training Frequency: Updating the discriminator more or less frequently relative to the generator might improve training stability.
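As a sketch, an exponentially decaying schedule in Keras (the initial rate and decay values are illustrative; the paper does not report its optimizer settings):

    import tensorflow as tf

    lr = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=2e-4,  # common GAN starting point, assumed here
        decay_steps=1000,
        decay_rate=0.95)
    gen_opt = tf.keras.optimizers.Adam(learning_rate=lr, beta_1=0.5)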
E. Metrics and Evaluation

• FID Score: In addition to SSIM and PSNR, consider using the Fréchet Inception Distance (FID) score, which compares the distributions of real and generated images using a pre-trained Inception network and provides a more comprehensive measure of image quality (a sketch follows this list).
• Visual Quality of Results: Visual inspection of the generated results at intermediate epochs can provide more intuition about the quality of generated images over time. Include more frequent visual checkpoints.
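A minimal FID sketch using InceptionV3 pooled features is given below; note that 32 × 32 images must be upscaled to Inception's 299 × 299 input, which is itself an approximation:

    import numpy as np
    import tensorflow as tf
    from scipy import linalg

    inception = tf.keras.applications.InceptionV3(
        include_top=False, pooling="avg", input_shape=(299, 299, 3))

    def activations(images01):                      # images in [0, 1]
        x = tf.image.resize(images01 * 255.0, (299, 299))
        x = tf.keras.applications.inception_v3.preprocess_input(x)
        return inception.predict(x, verbose=0)      # (N, 2048) features

    def fid(real01, fake01):
        a, b = activations(real01), activations(fake01)
        mu1, mu2 = a.mean(axis=0), b.mean(axis=0)
        c1 = np.cov(a, rowvar=False)
        c2 = np.cov(b, rowvar=False)
        covmean = linalg.sqrtm(c1 @ c2).real        # matrix square root
        return float(np.sum((mu1 - mu2) ** 2) + np.trace(c1 + c2 - 2.0 * covmean))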
VII. CONCLUSION

This paper introduced a lightweight Conditional GAN model, evaluated on CIFAR-10, for the task of colorizing grayscale images. Its compact architecture does not require large-scale or high-resolution inputs, which makes the model especially useful for fields such as medical imaging, where data is usually scarce and high-quality labeled datasets are expensive.

Throughout the experimentation process, the developed model showed promising performance, achieving competitive PSNR and SSIM scores despite working at a low image resolution and with a tiny dataset compared to almost all prior work. Since this research targets small datasets and low resolutions, precisely the settings where data availability is the critical constraint in realistic applications, the quality of the colorized images, reflected in remarkably strong PSNR and SSIM scores, is particularly notable.
Future improvements may include data augmentation, perceptual loss, and deeper network architectures, which could make the generated images even more photorealistic and of higher quality.
REFERENCES

[1] I. Goodfellow et al., "Generative Adversarial Nets," NIPS, 2014.
[2] M. Mirza and S. Osindero, "Conditional Generative Adversarial Nets," arXiv preprint arXiv:1411.1784, 2014.
[3] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2017, pp. 1125–1134.
[4] J. Suarez et al., "Self-Supervised Learning for Image Colorization," IEEE Access, 2022.
[5] S. Bhattacharjee, V. Singh, and D. S. Kushwaha, "Efficient image colorization using conditional GANs and perceptual loss," in Proc. IEEE Conf. Image Process. (ICIP), 2022.
[6] R. Zhang, P. Isola, and A. A. Efros, "Colorful image colorization," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2016, pp. 649–666.
[7] Y. Cao, Z. Zhu, and Z. Zhang, "Image Colorization Using Generative Adversarial Networks," International Journal of Advanced Computer Science and Applications, 2020.
[8] K. Nazeri, E. Ng, T. Joseph, F. Qureshi, and M. Ebrahimi, "Image colorization using generative adversarial networks," in Proc. Int. Conf. Artif. Intell. Appl. (AAIA), 2018.
[9] S. Iizuka, E. Simo-Serra, and H. Ishikawa, "Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification," ACM Trans. Graph., vol. 35, no. 4, pp. 110:1–110:11, 2016.
[10] P. Vitoria and L. Zhang, "ChromaGAN: Colorization with a GAN using chrominance and luminance color spaces," Pattern Recognition, 2022.
[11] S. Kumar, R. Srivastava, and S. Yadav, "Image Colorization Using Generative Adversarial Networks and Transfer Learning," International Journal of Innovative Technology and Exploring Engineering (IJITEE), 2021.
[12] P. Serrano, S. Bharadwaj, and R. Diaz, "Portrait Image Colorization Using Conditional GANs," WACV, 2017.
[13] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, "Image Quality Assessment: From Error Visibility to Structural Similarity," IEEE Transactions on Image Processing, 2004.
[14] Q. Huynh-Thu and M. Ghanbari, "Scope of Validity of PSNR in Image/Video Quality Assessment," Electronics Letters, 2008.
[15] P. Vitoria, L. Sousa, and P. Quelhas, "ChromaGAN: Adversarial picture colorization with semantic class distribution," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), 2020.
[16] Z. Su, J. Wang, and C. Hu, "Lightweight image colorization with generative adversarial networks," IEEE Access, vol. 7, pp. 170804–170816, 2019.
[17] S. N. Ali, P. Kumar, and S. Jain, "cGAN-based image colorization using semantic segmentation," in Proc. Int. Conf. Pattern Recognit. Mach. Intell. (PRMI), 2021, pp. 334–342.
[18] A. Radford, L. Metz, and S. Chintala, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks," ICLR, 2016.
[19] R. Zhang, P. Isola, and A. Efros, "Colorful Image Colorization," ECCV, 2016.