Efficient Image Colorization
Abstract—This paper proposes a deep learning-based approach to colorizing low-resolution 32 × 32 grayscale images, without transfer learning, using Conditional Generative Adversarial Networks (CGANs). New generator and discriminator architectures, motivated by the Pix2Pix framework, are proposed so that small image sizes are handled efficiently. The paper experimentally studies the models' performance on 1,000–10,000-sample subsets of CIFAR-10 to measure how dataset size affects performance. A trade-off between dataset size and image quality does exist, yet the model performs comparably at the smaller dataset sizes under SSIM and PSNR. Such findings are particularly relevant for application areas like medical imaging, where access to clear, high-resolution images or large datasets is often limited; they demonstrate that CGANs can deliver high-quality results with limited data and without any form of pre-trained models.

Index Terms—Conditional GANs, image-to-image translation, small datasets, SSIM, PSNR, grayscale-to-color translation, Pix2Pix.
I. INTRODUCTION

Colorization is the task of predicting what colors should be applied to a grayscale image, converting it into a three-channel RGB image. Traditionally, colorization was a long and arduous manual process. With advances in machine learning and computer vision, automatic colorization techniques have developed into an efficient and accurate alternative. Applications range from image restoration and entertainment to medical imaging.

Colorization approaches fall into two broad categories: user-guided and automatic. The user-guided method relies on manual input, where users provide color hints or reference images so that the algorithm can propagate color across the grayscale image. These methods can be effective, but they are resource-intensive and labor-heavy for images with intricate detail. The automatic method requires no manual input. With the rise of deep learning, the dominant automatic approach uses Convolutional Neural Networks trained on large datasets of colored images to learn the mapping from grayscale to color.

Generative Adversarial Networks, introduced by Goodfellow et al., have changed the face of image generation tasks such as colorization [1]. Conditional GANs in particular enable controlled generation of images conditioned on input data, such as a low-resolution grayscale image. In CGAN-based colorization, the generator produces a colored version of the given grayscale image, and the discriminator learns to distinguish real from fake colored images, pushing the generator to produce more realistic outputs [2], [3].

This paper proposes a lightweight CGAN-based approach to the colorization of low-resolution grayscale images. Experiments focus on 32 × 32 pixel images from the CIFAR-10 dataset and explore whether CGANs are suitable for training on sparse data without pre-trained models. This is particularly important in medical imaging domains, where high-resolution images or large datasets are not usually available. The proposed method shows that CGANs can generate colorizations of comparable quality while using far fewer data samples and lower-resolution images, which is very practical for resource-limited scenarios [4], [5].

A. Contributions

The significant contributions of the paper include:
• Employing CGANs to colorize low-resolution (32 × 32) grayscale images, with results that transfer to small-scale datasets ranging from 1,000 to 10,000 samples.
• Developing generator and discriminator architectures adapted from the Pix2Pix framework so that much smaller image sizes than is standard can be handled.
• Making an elaborate comparison with state-of-the-art colorization techniques, benchmarking the results using PSNR and SSIM scores across different methods in Table IV.
• Showing that the model, when trained with a limited amount of data, still achieves competitive results in terms of Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR). The paper establishes how far model performance depends on dataset size: the model performs comparably with as few as 5,000 images and improves with dataset size. CGAN-based methods can thus be applied to low-resolution images, making them applicable to domains such as medical imaging, where resolution and dataset size are sometimes limited.
The remainder of the paper is organized as follows: Section II gives an overview of previous research on image colorization using GAN-based methods and other related work. Section III describes the experimental methodology, the CGAN architectures used, and the procedure for analyzing the results. Results analysis and a discussion of the findings are presented in Section V, potential improvements and future research directions are proposed in Section VI, and Section VII concludes the paper and summarizes its findings.
II. RELATED WORK

Image colorization is a research area that has made remarkable progress through deep learning methods, particularly generative models. Zhang et al. [19] first presented a CNN-based architecture for automatically colorizing grayscale images. It forecasts color distributions for each pixel of a grayscale image and achieved good results on large-scale datasets. Although effective, the method is computationally expensive and depends on large training sets, making it less practical for resource-scarce scenarios or where data availability is a concern.

Generative Adversarial Networks (GANs) [1] offered a much stronger framework for image colorization, allowing far more realistic image synthesis through the adversarial process between generator and discriminator. Mirza and Osindero [2] introduced conditional GANs, an extension of this idea in which both the generator and the discriminator are conditioned on additional information. Isola et al. [3] made significant contributions by extending this into Pix2Pix, a widely used architecture for general image-to-image translation tasks, including colorization. Although CGAN-based methods like Pix2Pix are quite successful, they often rely on large amounts of training data to ensure generalization and are thus not feasible in data-constrained environments.

Cao et al. [7] further advanced this line of work by using GANs for automatic image colorization, focusing on bright and realistic colors. Nazeri et al. [8] improved on this by proposing perceptual losses that increase the perceptual quality of the colored image. These methods produce high-quality results, but they depend on very large-scale datasets for best performance.
Transfer learning has also been explored to improve performance. Iizuka et al. [9] introduced a model that uses both local and global context features for colorization, taking advantage of pretrained networks to produce high-quality results. Such transfer learning-based approaches, however, bring computational overhead and rely on pre-trained models that may not always be available or appropriate, especially in fields such as medical imaging. This work differs from the above approaches because it emphasizes a lightweight CGAN architecture independent of transfer learning, and is thus applicable in scenarios where both data and computational resources are limited.

More recent work has attempted to push training efficiency with the help of self-supervised learning. Suarez et al. [4] and Vitoria et al. [10] discussed the use of self-supervision for colorization, showing that GANs remain effective even when data availability is small. These approaches still require some level of supervision, however, and colorization quality degrades somewhat when training data is very minimal.

Other studies, for instance Kumar et al. [11], used CGANs for specialized tasks such as medical image colorization, demonstrating their capability across different domains. For portrait colorization, Serrano et al. [12] demonstrated the applicability of CGANs to coloring low-resolution grayscale portraits. Although these works affirm the suitability of CGANs for a variety of applications, they typically target high-resolution images and large datasets, which may not be feasible in resource-limited settings.

Apart from GAN-based methods, the quality assessment of colorized images has usually relied heavily on SSIM [13] and PSNR [14]. These two metrics have served as the default benchmarks for judging the quality of colorized images in a perceptual sense. This paper therefore uses both as an established ground for comparing model performance.

A. Limitations of Previous Approaches

While GANs and CGANs have significantly advanced image colorization, several weaknesses remain. First, most approaches require large datasets for adequate performance. Many models, including Pix2Pix, fail to generalize well when little training data is available. This makes them less applicable in domains where large, high-resolution image collections may not be available, especially in medical image scanning. Besides, transfer learning methods, as proposed in [9], introduce additional overhead and a reliance on external pre-trained models that is not always possible. Lastly, self-supervised approaches, as described in [4], typically sacrifice quality when data is highly sparse.

B. Paper's Approach

To overcome these limitations, this work proposes a CGAN-based model specific to the colorization of low-resolution (32 × 32) grayscale images on smaller datasets. Transfer learning can be eliminated despite the constrained image resolutions and dataset sizes. Experimental evaluations demonstrate that the model obtains competitive PSNR and SSIM metrics even in very resource-constrained environments. This allows the proposed approach to apply well in specialized fields such as medical imaging, where high resolution is extremely difficult to obtain and data is often scarce.
III. METHODOLOGY

A. Dataset and Preprocessing

This work uses CIFAR-10, composed of 32 × 32 RGB images, to test how the model performs at different dataset sizes. For the analysis, the paper experiments on six subsets of CIFAR-10 with 1000, 2000, 3000, 4000, 5000, and 10000 samples, respectively.

The dataset is preprocessed such that the pixel values of each image are normalized to the [−1, 1] range, and the RGB images are converted to grayscale inputs for the generator. The normalization function used in preprocessing scales the images as follows:

x′ = x / 127.5 − 1

where x represents the original pixel values of the image.

Each grayscale image is paired with its corresponding color image from the dataset. The grayscale image is passed to the generator, and the discriminator receives both the grayscale (input) and color (target) images.
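The paper does not reproduce its data pipeline; the following is a minimal TensorFlow/Keras sketch of the preprocessing described above, assuming CIFAR-10 is loaded via tf.keras.datasets:

    import tensorflow as tf

    def preprocess(images):
        # images: uint8 RGB batch of shape (N, 32, 32, 3) from CIFAR-10
        color = tf.cast(images, tf.float32) / 127.5 - 1.0  # x' = x / 127.5 - 1, into [-1, 1]
        gray = tf.image.rgb_to_grayscale(color)            # (N, 32, 32, 1) generator input
        return gray, color

    (x_train, _), _ = tf.keras.datasets.cifar10.load_data()
    gray, color = preprocess(x_train[:5000])               # e.g. the 5,000-sample subset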
B. Generator Architecture

The generator in this model is designed to convert 32 × 32 grayscale images into 32 × 32 RGB images. It consists of an encoder-decoder structure with skip connections, similar to the Pix2Pix framework. The generator network begins with an input grayscale image, progressively downsamples it to a low-dimensional latent space, and then upsamples it back to the original resolution while adding the necessary color information.

The architecture of the generator is as follows (a Keras sketch is given after the listing):

Encoder:
• Conv2D (64 filters, kernel size = 4, strides = 2) + LeakyReLU
• Conv2D (128 filters, kernel size = 4, strides = 2) + Batch Normalization + LeakyReLU

Bridge:
• Conv2D (256 filters, kernel size = 4, strides = 1) + Batch Normalization + LeakyReLU

Decoder:
• Conv2DTranspose (128 filters, kernel size = 4, strides = 2) + Batch Normalization + ReLU
• Concatenate with encoder output
• Conv2DTranspose (64 filters, kernel size = 4, strides = 2) + Batch Normalization + ReLU
• Concatenate with input image
• Conv2D (3 filters, kernel size = 4, strides = 1, activation = tanh)
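A Keras sketch of this layer listing follows; 'same' padding is assumed throughout (the paper does not state its padding) so that the skip-connection shapes align:

    from tensorflow.keras import layers, Model

    def build_generator():
        inp = layers.Input(shape=(32, 32, 1))             # 32x32 grayscale in [-1, 1]
        # Encoder
        e1 = layers.LeakyReLU()(
            layers.Conv2D(64, 4, strides=2, padding="same")(inp))         # 16x16x64
        e2 = layers.Conv2D(128, 4, strides=2, padding="same")(e1)         # 8x8x128
        e2 = layers.LeakyReLU()(layers.BatchNormalization()(e2))
        # Bridge
        b = layers.Conv2D(256, 4, strides=1, padding="same")(e2)          # 8x8x256
        b = layers.LeakyReLU()(layers.BatchNormalization()(b))
        # Decoder with skip connections
        d1 = layers.Conv2DTranspose(128, 4, strides=2, padding="same")(b) # 16x16x128
        d1 = layers.ReLU()(layers.BatchNormalization()(d1))
        d1 = layers.Concatenate()([d1, e1])               # skip from encoder
        d2 = layers.Conv2DTranspose(64, 4, strides=2, padding="same")(d1) # 32x32x64
        d2 = layers.ReLU()(layers.BatchNormalization()(d2))
        d2 = layers.Concatenate()([d2, inp])              # skip from the input image
        out = layers.Conv2D(3, 4, strides=1, padding="same",
                            activation="tanh")(d2)        # 32x32x3 RGB in [-1, 1]
        return Model(inp, out, name="generator")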
Fig. 2: Transformation from Original to Grayscale to Generated Color Image.

Given a grayscale input image, the generator outputs a 32 × 32 × 3 color image. During training, the loss function consists of an adversarial loss combined with an L1 loss in order to force the generator to produce images close to the target color images:

LG = Ex,y [log D(x, G(x))] + λ · Ex,y [∥y − G(x)∥1]

where D(x, G(x)) is the discriminator's output for the generated image, and λ = 100 controls the weight of the L1 loss. The discriminator is trained with the standard conditional adversarial objective

LD = Ex,y [log D(x, y)] + Ex [log(1 − D(x, G(x)))]

where D(x, y) is the discriminator's output for a real image, and D(x, G(x)) is its output for a generated image.
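In code, these objectives can be sketched as below. The discriminator network itself is not specified in the text above, so the small conditional discriminator here is an assumed Pix2Pix-style stand-in, not the paper's exact architecture:

    import tensorflow as tf
    from tensorflow.keras import layers, Model

    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    LAMBDA = 100.0  # weight of the L1 term, as in the paper

    def generator_loss(disc_fake, real_rgb, fake_rgb):
        # Adversarial term pushes D(x, G(x)) toward "real"; L1 keeps colors near target
        adv = bce(tf.ones_like(disc_fake), disc_fake)
        l1 = tf.reduce_mean(tf.abs(real_rgb - fake_rgb))
        return adv + LAMBDA * l1

    def discriminator_loss(disc_real, disc_fake):
        # D(x, y) toward "real", D(x, G(x)) toward "fake"
        return (bce(tf.ones_like(disc_real), disc_real)
                + bce(tf.zeros_like(disc_fake), disc_fake))

    def build_discriminator():
        # Hypothetical conditional discriminator on (grayscale, color) pairs
        gray = layers.Input(shape=(32, 32, 1))
        rgb = layers.Input(shape=(32, 32, 3))
        x = layers.Concatenate()([gray, rgb])
        for filters in (64, 128):
            x = layers.LeakyReLU()(
                layers.Conv2D(filters, 4, strides=2, padding="same")(x))
        out = layers.Conv2D(1, 4, padding="same")(x)  # patch of real/fake logits
        return Model([gray, rgb], out, name="discriminator")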
Table III gives detailed SSIM and PSNR results for each sample size, reported for epochs from 1 to 100. SSIM ranges between 0.36 and 0.89, and PSNR ranges between 14.43 and 24.35. Both metrics improve as the number of epochs increases, and larger sample sizes provide better overall performance.

The SSIM and PSNR values for the different sample sizes are visualized in Figures 5 and 6, respectively. These plots illustrate the trends in model performance as training progresses.

• Figure 5 demonstrates how the SSIM values increase significantly after the first epoch and then stabilize across different sample sizes. For all sample sizes, SSIM converges to values above 0.8 after approximately 10 epochs, indicating that the generated images become increasingly similar to the ground truth over time.
• Figure 6 shows a similar trend for the PSNR values, where performance improves rapidly in the early stages of training and stabilizes as the number of epochs increases. Larger sample sizes achieve slightly higher PSNR values, with 5000 and 10000 samples reaching PSNR values above 23 by the 100th epoch. This suggests that the reconstruction quality of the generated images improves with more training data.

Both figures emphasize the effectiveness of increasing the dataset size and training duration: larger datasets and more epochs result in higher similarity (SSIM) and better reconstruction quality (PSNR).

Fig. 6: PSNR across sample sizes and epochs.
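The metric computation is not shown in the paper; a minimal TensorFlow sketch, assuming SSIM and PSNR are computed on images rescaled to [0, 1], would be:

    import tensorflow as tf

    def evaluate(generator, gray, color):
        # Mean SSIM and PSNR of generated vs. ground-truth color images
        fake = generator(gray, training=False)
        fake01 = (fake + 1.0) / 2.0   # undo the [-1, 1] normalization
        real01 = (color + 1.0) / 2.0
        ssim = tf.reduce_mean(tf.image.ssim(real01, fake01, max_val=1.0))
        psnr = tf.reduce_mean(tf.image.psnr(real01, fake01, max_val=1.0))
        return float(ssim), float(psnr)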
V. FINDINGS AND DISCUSSION

Analysis of Conditional Generative Adversarial Networks applied to the CIFAR-10 dataset for colorization gives several important insights into the relationship between dataset size, number of epochs, and model performance. The key takeaways from the SSIM and PSNR results across sample sizes and epochs are as follows.

A. Impact of Dataset Size on Model Performance

Both SSIM and PSNR trend upward as the sample size increases from 1000 to 10,000, especially at larger epochs. Results remain robust at the smaller dataset sizes, while larger amounts of data consistently give high performance:

• For 1000 samples, the SSIM improves from 0.3618 at epoch 1 to 0.8773 at epoch 100, a relative improvement of 142.4%, while the PSNR improves from 14.43 dB to 22.66 dB, a 57.1% increase.
TABLE III: SSIM and PSNR results for different sample sizes across epochs.

           1000           2000           3000           4000           5000           10000
Epoch   SSIM   PSNR    SSIM   PSNR    SSIM   PSNR    SSIM   PSNR    SSIM   PSNR    SSIM   PSNR
   1    0.3618 14.43   0.4690 15.29   0.5413 16.05   0.6631 17.70   0.7380 19.27   0.7447 19.47
  10    0.7915 20.33   0.8365 21.63   0.8128 20.51   0.8554 21.91   0.8226 20.70   0.8701 22.17
  20    0.8135 21.24   0.8395 21.19   0.7937 19.22   0.8616 22.33   0.8625 22.01   0.8564 21.70
  30    0.8182 21.14   0.8595 22.17   0.7976 20.17   0.8151 21.17   0.8693 22.31   0.8326 21.75
  40    0.8539 22.24   0.8326 21.62   0.8408 21.19   0.8631 22.26   0.8669 22.27   0.8747 22.28
  50    0.8253 20.92   0.8502 22.09   0.8629 22.21   0.8235 21.09   0.8683 22.50   0.8756 22.64
  60    0.8334 21.45   0.8743 22.88   0.8691 22.49   0.8912 23.17   0.8768 22.64   0.8635 22.09
  70    0.8639 22.40   0.8679 22.39   0.8675 22.29   0.8888 23.02   0.8889 23.12   0.8892 23.08
  80    0.8535 22.35   0.8595 22.14   0.8832 22.95   0.8864 23.20   0.8899 23.52   0.8540 23.17
  90    0.8766 22.61   0.8855 22.90   0.8862 23.14   0.8926 23.33   0.8912 23.88   0.8219 23.98
 100    0.8773 22.66   0.8885 22.98   0.8977 23.63   0.8931 23.50   0.8975 24.31   0.7878 24.35
TABLE IV: PSNR and SSIM results for different image colorization methods.

Citation / Source               Dataset Size (Images)  Image Resolution  Total Pixels (approx.)  Epochs  PSNR (dB)  SSIM
Zhang et al., 2016 [19]         1,000                  256x256           65,536,000              100     24.8       0.86
Iizuka et al., 2016 [9]         3,000                  224x224           150,528,000             50      24.5       0.85
Isola et al., 2017 [3]          400                    256x256           26,214,400              200     22.9       0.81
Nazeri et al., 2018 [8]         2,000                  128x128           32,768,000              50      23.1       0.79
Vitoria et al., 2020 [15]       1,500                  256x256           98,304,000              100     24.3       0.83
Su et al., 2019 [16]            600                    256x256           39,321,600              150     23.8       0.82
Sartaj et al., 2021 [17]        1,200                  128x128           19,660,800              80      23.5       0.84
Bhattacharjee et al., 2022 [5]  800                    128x128           13,107,200              120     24.1       0.81
Proposed Methodology            5,000                  32x32             5,120,000               100     24.31      0.8975
• With 2000 samples, the SSIM increases from 0.4690 to 0.8885 (89.4%) over 100 epochs, and the PSNR from 15.29 dB to 22.98 dB, a 50.3% increase.
• For 3000 samples, the SSIM rises from 0.5413 to 0.8977 (65.8%), and the PSNR improves from 16.05 dB to 23.63 dB, a 47.2% gain.
• When the dataset size is 4000, the SSIM improves from 0.6631 to 0.8931 (34.7%), while the PSNR increases from 17.70 dB to 23.50 dB, a 32.8% increase.
• At 5000 samples, the SSIM starts at 0.7380 and reaches 0.8975 (21.6%), while the PSNR increases from 19.27 dB to 24.31 dB, a 26.2% rise.
• For the largest dataset size (10,000 samples), SSIM improves from 0.7447 to 0.7878 (5.8%), while the PSNR rises from 19.47 dB to 24.35 dB, a 25.1% increase.

B. Impact of Epochs on Model Performance

Across all dataset sizes, increasing the number of training epochs results in a steady improvement in both SSIM and PSNR. The model shows rapid improvement within the first 10 epochs, with diminishing returns at later epochs:

• For 1000 samples, SSIM increases by 118.8% from epoch 1 to 10, while PSNR improves by 40.9%. However, the change from epoch 10 to epoch 100 is more modest, with SSIM improving by 10.8% and PSNR by 11.5%.
• Similar trends are observed for other dataset sizes, with rapid initial improvements followed by slower gains. For instance, for 5000 samples, SSIM improves by 9.1% between epochs 10 and 100, while PSNR improves by 17.4%.

C. Comparison Between Small and Large Datasets

Final SSIM and PSNR values vary with dataset size at the same number of epochs. For instance, at epoch 100, the SSIM for 10,000 samples is 0.7878 versus 0.8773 for 1000 samples, while the PSNR for 10,000 samples is 24.35 dB versus 22.66 dB for 1000 samples. It is notable that, even when the dataset is small, the model delivers reasonably competitive performance, which shows that CGANs can perform well even on small datasets.

D. Overall Findings

The conducted experiments demonstrate that, in general, increasing the dataset size improves performance, yet important gains can also be achieved by longer training on smaller datasets. For example, the comparison between 5000 and 10,000 samples at epoch 100 shows nearly identical PSNR (24.31 dB vs. 24.35 dB), while SSIM is actually higher for the smaller subset (0.8975 vs. 0.7878). These results show that CGAN-based colorization can be quite efficient even with relatively small datasets, and that the approach appears viable when data are scarce.

E. Comparison with Related Work

When compared with the results of previous research studies, some major differences can be noticed. Zhang et al. [19] use a dataset of 1,000 images at 256 × 256 resolution and achieve 24.8 dB PSNR and 0.86 SSIM. The proposed method's PSNR is slightly lower at 24.31 dB, but its SSIM of 0.8975 is higher than Zhang et al.'s, although achieved on much smaller 32 × 32 images with a sample size of 5,000. This shows
that the approach can produce perceptually better images even at lower resolutions.

Along the same lines, Iizuka et al. [9] achieve a PSNR of 24.5 dB and an SSIM of 0.85 from a dataset of 3,000 images at a higher resolution of 224 × 224. Although the proposed approach uses far fewer pixels at a comparable number of epochs, it is superior with respect to SSIM. This implies that the CGAN-based approach can attain very competitive results even from lower-resolution images.

Similarly, in comparison with Vitoria et al. [15], who trained their model on 1,500 images at 256 × 256 resolution for 100 epochs, the proposed method achieves higher PSNR and SSIM (24.31 dB vs. 24.3 dB, and 0.8975 vs. 0.83) using a larger number of samples but significantly fewer pixels. This further supports the scalability of the proposed method when training on lower-resolution data.

Moreover, Nazeri et al. [8] and Isola et al. [3], using 128 × 128 and 256 × 256 images, achieved comparable results at comparable epoch counts. The efficiency of the proposed model on 32 × 32 images demonstrates that CGANs are capable of producing high-quality outputs even at lower image resolutions, which opens the opportunity to extend this work to a wider range of image sizes. As an example, Sartaj et al. [17] use images of size 128 × 128 and 80 epochs, yet the proposed model's performance at epoch 100 is still superior, as shown in Table IV.
VI. POTENTIAL IMPROVEMENTS

A. Data Preprocessing

• Normalization Range: Currently, the preprocessing step normalizes images to the range [−1, 1]. A potential improvement could be to try different normalization ranges, such as [0, 1], depending on the activation functions used in the generator and discriminator models.
• Data Augmentation: Introduce data augmentation techniques such as random flips, rotations, and crops (see the sketch after this list). This would increase the diversity of the training data and might improve the generalization of the model.
• Larger Dataset: The number of samples for both training and testing is restricted (3,000 for training and 100 for testing). A larger dataset could improve performance, given that models like GANs typically benefit from more data.
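For paired image-to-image data, augmentations must be applied identically to input and target. A minimal tf.data-style sketch (illustrative, not the paper's code):

    import tensorflow as tf

    def augment(gray, color):
        # Concatenate the pair so every random op hits both images identically
        both = tf.concat([gray, color], axis=-1)               # (32, 32, 1+3)
        both = tf.image.random_flip_left_right(both)
        both = tf.image.rot90(both, tf.random.uniform((), 0, 4, dtype=tf.int32))
        both = tf.image.resize_with_crop_or_pad(both, 36, 36)  # pad, then random crop
        both = tf.image.random_crop(both, size=(32, 32, 4))
        return both[..., :1], both[..., 1:]

    # ds = tf.data.Dataset.from_tensor_slices((gray, color)).map(augment).batch(64)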
B. Model Architecture

• Skip Connections in Generator: The skip connections currently only link certain layers. Using more detailed skip connections, as in the U-Net architecture, might improve image restoration since they preserve high-resolution details from earlier layers.
• Deeper Generator Network: Increasing the depth of the generator by adding more convolutional and transpose convolutional layers might help capture more complex features from the input images.
• Residual Blocks: Introducing residual blocks into the generator (sketched below) could help improve image reconstruction. Residual connections allow better gradient flow and may accelerate training convergence.
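A generic Keras residual block of the kind this suggestion refers to could look as follows; the filter counts and the placement inside the generator would be design choices, not values from the paper:

    from tensorflow.keras import layers

    def residual_block(x, filters):
        # Two convolutions plus an identity shortcut for better gradient flow
        shortcut = x
        if shortcut.shape[-1] != filters:                  # match channels if needed
            shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
        y = layers.Conv2D(filters, 3, padding="same")(x)
        y = layers.ReLU()(layers.BatchNormalization()(y))
        y = layers.Conv2D(filters, 3, padding="same")(y)
        y = layers.BatchNormalization()(y)
        return layers.ReLU()(layers.Add()([shortcut, y]))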
C. Loss Functions and Optimization

• Perceptual Loss: Instead of relying solely on L1 loss, consider adding a perceptual loss (VGG-based loss), which compares high-level features of generated and real images; this could improve the visual quality of generated images (a sketch follows this list).
• Gradient Penalty: A gradient penalty could be introduced in the discriminator, especially if the GAN suffers from instability during training. This could be implemented as in Wasserstein GAN with Gradient Penalty (WGAN-GP) for more stable training (also sketched after this list).
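A hedged sketch of such a VGG-based perceptual loss follows; the choice of VGG16 and of the block3_conv3 feature layer is illustrative, not prescribed by the paper:

    import tensorflow as tf

    vgg = tf.keras.applications.VGG16(include_top=False, weights="imagenet")
    feat = tf.keras.Model(vgg.input, vgg.get_layer("block3_conv3").output)
    feat.trainable = False

    def perceptual_loss(real_rgb, fake_rgb):
        # VGG expects 0-255 RGB run through its own preprocessing; inputs are in [-1, 1]
        pre = lambda t: tf.keras.applications.vgg16.preprocess_input((t + 1.0) * 127.5)
        return tf.reduce_mean(tf.square(feat(pre(real_rgb)) - feat(pre(fake_rgb))))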
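Likewise, a WGAN-GP-style gradient penalty for the discriminator might be sketched as below (the weight of 10 is the value commonly used in the WGAN-GP literature, not the paper's):

    import tensorflow as tf

    def gradient_penalty(disc, gray, real_rgb, fake_rgb, gp_weight=10.0):
        # Penalize the discriminator's gradient norm away from 1 on
        # random interpolations between real and generated images
        eps = tf.random.uniform([tf.shape(real_rgb)[0], 1, 1, 1], 0.0, 1.0)
        interp = eps * real_rgb + (1.0 - eps) * fake_rgb
        with tf.GradientTape() as tape:
            tape.watch(interp)
            pred = disc([gray, interp], training=True)
        grads = tape.gradient(pred, interp)
        norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
        return gp_weight * tf.reduce_mean(tf.square(norm - 1.0))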
D. Training Process

• Dynamic Learning Rate: The learning rate could be adjusted dynamically using learning rate schedulers (see the sketch after this list). A decreasing learning rate over time could help the models converge more efficiently.
• Discriminator Training Frequency: Updating the discriminator more or less frequently relative to the generator might improve training stability.
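As a sketch, an exponentially decaying schedule in Keras (the initial rate and decay values are illustrative; the paper does not report its optimizer settings):

    import tensorflow as tf

    lr = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=2e-4,  # common GAN starting point, assumed here
        decay_steps=1000,
        decay_rate=0.95)
    gen_opt = tf.keras.optimizers.Adam(learning_rate=lr, beta_1=0.5)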
E. Metrics and Evaluation

• FID Score: In addition to SSIM and PSNR, consider using the Fréchet Inception Distance (FID) score, which compares the distributions of real and generated images using a pre-trained Inception network and provides a more comprehensive measure of image quality (a sketch follows this list).
• Visual Quality of Results: Visual inspection of the generated results at intermediate epochs can provide more intuition about the quality of generated images over time. Include more frequent visual checkpoints.
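A minimal FID sketch using InceptionV3 pooled features is given below; note that 32 × 32 images must be upscaled to Inception's 299 × 299 input, which is itself an approximation:

    import numpy as np
    import tensorflow as tf
    from scipy import linalg

    inception = tf.keras.applications.InceptionV3(
        include_top=False, pooling="avg", input_shape=(299, 299, 3))

    def activations(images01):                      # images in [0, 1]
        x = tf.image.resize(images01 * 255.0, (299, 299))
        x = tf.keras.applications.inception_v3.preprocess_input(x)
        return inception.predict(x, verbose=0)      # (N, 2048) features

    def fid(real01, fake01):
        a, b = activations(real01), activations(fake01)
        mu1, mu2 = a.mean(axis=0), b.mean(axis=0)
        c1 = np.cov(a, rowvar=False)
        c2 = np.cov(b, rowvar=False)
        covmean = linalg.sqrtm(c1 @ c2).real        # matrix square root
        return float(np.sum((mu1 - mu2) ** 2) + np.trace(c1 + c2 - 2.0 * covmean))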
VII. CONCLUSION

This paper introduced a lightweight Conditional GAN model, evaluated on CIFAR-10, for the task of colorizing grayscale images. Its compact architecture does not require large-scale or high-resolution inputs, which makes the model especially useful for fields such as medical imaging, where data is usually scarce and high-quality labeled datasets are expensive.

Throughout the experimentation process, the developed model showed promising performance, achieving competitive PSNR and SSIM scores despite working at a low image resolution and with a tiny dataset compared to almost all prior work. Since this research targets small datasets and low resolutions, precisely the settings where data availability is the critical constraint in realistic applications, the quality of the colorized images, reflected in remarkably strong PSNR and SSIM scores, is particularly notable.
Future improvements may include data augmentation, perceptual loss, and deeper network architectures, which could make the generated images even more photorealistic and of higher quality.
REFERENCES

[1] I. Goodfellow et al., "Generative Adversarial Nets," NIPS, 2014.
[2] M. Mirza and S. Osindero, "Conditional Generative Adversarial Nets," arXiv preprint arXiv:1411.1784, 2014.
[3] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2017, pp. 1125–1134.
[4] J. Suarez et al., "Self-Supervised Learning for Image Colorization," IEEE Access, 2022.
[5] S. Bhattacharjee, V. Singh, and D. S. Kushwaha, "Efficient image colorization using conditional GANs and perceptual loss," in Proc. IEEE Conf. Image Process. (ICIP), 2022.
[6] R. Zhang, P. Isola, and A. A. Efros, "Colorful image colorization," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2016, pp. 649–666.
[7] Y. Cao, Z. Zhu, and Z. Zhang, "Image Colorization Using Generative Adversarial Networks," International Journal of Advanced Computer Science and Applications, 2020.
[8] K. Nazeri, E. Ng, T. Joseph, F. Qureshi, and M. Ebrahimi, "Image colorization using generative adversarial networks," in Proc. Int. Conf. Artif. Intell. Appl. (AAIA), 2018.
[9] S. Iizuka, E. Simo-Serra, and H. Ishikawa, "Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification," ACM Trans. Graph., vol. 35, no. 4, pp. 110:1–110:11, 2016.
[10] P. Vitoria and L. Zhang, "ChromaGAN: Colorization with a GAN using chrominance and luminance color spaces," Pattern Recognition, 2022.
[11] S. Kumar, R. Srivastava, and S. Yadav, "Image Colorization Using Generative Adversarial Networks and Transfer Learning," International Journal of Innovative Technology and Exploring Engineering (IJITEE), 2021.
[12] P. Serrano, S. Bharadwaj, and R. Diaz, "Portrait Image Colorization Using Conditional GANs," WACV, 2017.
[13] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, "Image Quality Assessment: From Error Visibility to Structural Similarity," IEEE Transactions on Image Processing, 2004.
[14] Q. Huynh-Thu and M. Ghanbari, "Scope of Validity of PSNR in Image/Video Quality Assessment," Electronics Letters, 2008.
[15] P. Vitoria, L. Sousa, and P. Quelhas, "ChromaGAN: Adversarial picture colorization with semantic class distribution," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), 2020.
[16] Z. Su, J. Wang, and C. Hu, "Lightweight image colorization with generative adversarial networks," IEEE Access, vol. 7, pp. 170804–170816, 2019.
[17] S. N. Ali, P. Kumar, and S. Jain, "cGAN-based image colorization using semantic segmentation," in Proc. Int. Conf. Pattern Recognit. Mach. Intell. (PRMI), 2021, pp. 334–342.
[18] A. Radford, L. Metz, and S. Chintala, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks," ICLR, 2016.
[19] R. Zhang, P. Isola, and A. Efros, "Colorful Image Colorization," ECCV, 2016.