Image Processing Project Review 3


CSE4019: IMAGE PROCESSING

J - COMPONENT
REVIEW 1

TITLE
Colourization of Grayscale Images Using Deep Learning

SUBMITTED TO
Dr. Akila Victor
SCOPE

TEAM MEMBERS
Aashish Sharma - 19BCE0971
Gokul S Nambiar - 19BCI0017
Shivang Kohli - 19BCI0066
TABLE OF CONTENTS

S.no. Topic Name Page No.


1. Abstract 2
2. Introduction 2
3. Aim 2
4. Objective 3
5. Literature Survey 3
6. Proposed Methodology 4
7. Results and Discussion 4
8. Conclusion 4
9. Future Enhancements 5
10. References 5

Colourization of Grayscale Images
Using Deep Learning
Aashish Sharma#1, Gokul S Nambiar*2, Shivang Kohli#3

Abstract— Colourization is the process of adding colour to a monochrome picture or film. The process typically involves segmenting images into regions and tracking these regions across image sequences. It requires extensive user intervention and is a monotonous, tedious and costly task. For a long time, many filmmakers resisted colorizing their black-and-white motion pictures and considered it vandalism of their craft. The technology itself has moved from painstaking hand colourization to today's largely automated methods. We aim to use deep learning and convolutional neural networks to develop a program that can predict the colours of a monochrome image and colour it for us. This can be used not only to colour old films and pictures but also to improve the quality of face detection in biometric machines.

Keywords— Include at least 5 keywords or phrases

I. INTRODUCTION

To colour a grayscale image, we first need to understand what the different colour spaces are and the difference between a coloured image and a grayscale image. Colour images are a collection of information about the intensity at each point. For a grayscale image, the only information required to make the whole image is the intensity of each pixel, which can vary from 0 to 255. As each pixel stores only this one parameter, the pixel size is normally 8 bits. For a coloured image, more information needs to be handled: along with the intensity, the colour information for each pixel also needs to be stored. For an RGB colour space, this means storing the Red, Green and Blue components of the pixel, the three primary colours that can be mixed in different proportions to obtain all visible colours. As three values need to be stored, each taking 1 byte (8 bits) and again ranging from 0 to 255, a single pixel in a coloured image stores 3 bytes (24 bits) of information. Converting a grayscale image into a coloured image therefore involves generating these three pieces of information (depending on the colour space chosen) from the single byte of information in each grayscale pixel. Generating these values accurately has long been a difficult problem, as many conditions need to be taken into account and some details are often left behind. The actual ab colour space is not discrete but continuous, and training a CNN model on such data would take a very long time; however, since we have quantized it into 313 bins, the training can be done on a standard laptop. There are two popular methods for colorization: one in which a colour is assigned to each individual pixel based on its intensity, as learned from a colour image with similar content, and another in which the image is segmented into regions, each of which is then assigned a single hue. Assigning colour to a grayscale image is a difficult problem: for a given image, there is often no single “correct” colour. The applications of such a method allow a new appreciation of old, black-and-white photographs and cinema, along with better interpretation of modern grayscale images such as those from CCTV cameras, astronomical photography, or electron microscopy.

II. AIM

To develop a deep learning model that can be used to colourise grayscale images using convolutional neural networks.

III. OBJECTIVE

To convert a grayscale image to a coloured image, we must first generate three pieces of information (the RGB channels) from the single byte of information in each grayscale pixel.

The main objective is to create a convolutional neural network that predicts the colours of a monochrome input image and produces a coloured image. Different layers of the CNN model are used for this colorization. The input to the model is grayscale images and the output is colorized images. The true ab colour space is continuous, not discrete, and training a CNN model on such data would take a very long time; we have therefore quantized it into 313 bins, so training can be completed on a regular laptop.

Convolutional neural networks are biologically inspired multilayer perceptrons designed to replicate the function of the visual cortex. By exploiting the strong spatially local correlation observed in natural images, these models circumvent the restrictions of the MLP architecture.

After creating and defining the convolutional neural network and its multiple layers, we test the model using a sample image and check for errors. We then train the model for a number of epochs and predict the colorized output of the input grayscale test images.
CNN model is implemented as a learning pipeline.
It takes greyscale images as input along with the
IV. LITERATURE SURVEY

He et al. [1] (2018) were the first to use a CNN to directly select, propagate and predict colours from an aligned reference for a greyscale image. Their Similarity sub-net is a preprocessing step that prepares the input by measuring the semantic similarity between reference and target using a VGG-19 network.

Blanch et al. [2] (2019) proposed a mapping convolutional model trained adversarially with conditional GANs using the pix2pix framework, and used a novel generator-discriminator setting that adapts the IBN paradigm to an encoder-decoder architecture. They then applied spectral normalization to improve the generalization of adversarial colorization, and used multi-scale discriminators to obtain better colour generation in small areas and boosted details.

Pucci et al. [3] (2021) proposed a single network called UCapsNet that considers both the image-level features obtained through convolutions and the entity-level features captured by capsules, then enforced collaboration between these convolutional and entity factors to produce high-quality coloured images. Their approach consists of two phases: a downsample phase that learns the image-level and entity-level features, and an upsample phase that leverages these features to generate the colorization.

Pandey et al. [4] (2020) built a deep convolutional neural network with four parts: an encoder component that produces mid-level features, a feature-extraction component that produces high-level features, a fusion layer that merges the two, and a decoder component that generates the output.

Joshi et al. [5] (2020) start with dataset collection and preprocessing, removing images with unusual aspect ratios and resizing the rest to 256 x 256. The images are then converted into CIE L*a*b*, where L is a single luminance layer and the three RGB layers are packed into two chroma layers (a* and b*). A CNN model is then implemented as a learning pipeline: it takes greyscale images as input along with the luminance, and the a* and b* channels are extracted as the target values. L*a*b*-to-RGB conversion is applied to produce the final output.

Saleem et al. [6] (2020) select a reference image that is similar to the greyscale image in content and structure. Both images are converted into the YCbCr colour space. Each pixel in the greyscale image is compared with the pixels of the reference image using Euclidean distance, and the colour information of the reference pixel with the minimum distance is transferred to the greyscale pixel. In this way the whole image is converted into a colorized image.

V. PROPOSED METHODOLOGY

A. Algorithm in Methodology

The project uses the following algorithm to colorize a given grayscale image. The model uses the following Keras APIs to build its hidden layers:

● Conv2D layer with a relu activation function.
● Conv2D layer with a relu activation function.
● Conv2DTranspose layer with a relu activation function.
● Conv2DTranspose layer with a relu activation function.

These layers make up the whole deep learning model. A Conv2D layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. This process downscales the images, so to conserve image quality we upscale them again using Conv2DTranspose layers.

A Conv2DTranspose layer works in the same way as a Conv2D layer, but with the convolution applied in the transposed direction. The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input, while maintaining a connectivity pattern that is compatible with said convolution.

These four convolutional layers make up our CNN model, which detects areas, colorizes them, and outputs a coloured image.

VI. RESULTS AND DISCUSSION

[Table: Original Image | Colourised Image | PSNR Value — three sample image pairs with PSNR = 34.36 dB, 37.26 dB and 42.21 dB; the images are not reproduced in this text version.]

For this model we analysed the PSNR values of 10 sample images, which fell in the range of 30 to 50 dB; 3 of them are shown above. Images with higher PSNR values indicate higher quality.
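As an illustration, PSNR figures of this kind can be reproduced with a short script of the following form. This is a minimal sketch assuming 8-bit RGB images of identical shape; the file names are placeholders, not the project's actual evaluation code.

    # Minimal PSNR sketch: 10 * log10(MAX^2 / MSE) for 8-bit images.
    import numpy as np
    from PIL import Image

    def psnr(original, colourised, max_val=255.0):
        diff = original.astype(np.float64) - colourised.astype(np.float64)
        mse = np.mean(diff ** 2)
        if mse == 0:
            return float("inf")  # identical images
        return 10.0 * np.log10((max_val ** 2) / mse)

    original = np.asarray(Image.open("original.png").convert("RGB"))
    colourised = np.asarray(Image.open("colourised.png").convert("RGB"))
    print(f"PSNR = {psnr(original, colourised):.2f} dB")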
VII. CONCLUSION

The purpose of this study is to design a new, fully autonomous colorization method that employs convolutional neural networks to reduce human effort and reliance on example colour photos. As informative yet discriminative features, a patch feature and a new semantic feature are extracted and fed into the neural network. An adaptive image clustering algorithm is adopted to incorporate global image information. The output chrominance values are further adjusted using combined bilateral filtering to improve colorization quality. Because the proposed colorization is completely automated, it is stronger and more stable than older methods. It does, however, use machine learning techniques and has its own set of limitations: it is expected to be trained on a huge reference photo library that includes all possible objects.

VIII. FUTURE ENHANCEMENT

One extended application is to implement the same model with reinforcement learning. The current model uses shared weights; to improve its accuracy we can adopt more specialised configurations, such as additional layers or, for more optimised performance, fewer attributes and more layers. The current edge-detection model does not produce perfect segmentation boundaries, as seen previously, and a better model could be used to fix such issues. Although the dataset used to train the model is 3 GB, in the domain of machine learning and AI this is still a small amount of data to train on; this can be addressed either by providing a larger dataset or by using one of the methods suggested previously.

REFERENCES
[1] He, M., Chen, D., Liao, J., Sander, P. V., & Yuan, L. (2018). Deep exemplar-based colorization. ACM Transactions on Graphics (TOG), 37(4), 1-16.
[2] Blanch, M. G., Mrak, M., Smeaton, A. F., & O'Connor, N. E. (2019, September). End-to-end conditional GAN-based architectures for image colourisation. In 2019 IEEE 21st International Workshop on Multimedia Signal Processing (MMSP) (pp. 1-6). IEEE.
[3] Pucci, R., Micheloni, C., & Martinel, N. (2021). Collaborative image and object level features for image colourisation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2160-2169).
[4] Pandey, A., Sahay, R., & Jayavarthini, C. (2020, March). Automatic image colorization using deep learning. International Journal of Recent Technology and Engineering (IJRTE), 8(6). ISSN: 2277-3878.
[5] Joshi, M. R., Nkenyereye, L., Joshi, G. P., Islam, S. M. R., Abdullah-Al-Wadud, M., & Shrestha, S. (2020). Auto-colorization of historical images using deep convolutional neural networks. Mathematics, 8(12), 2258.
[6] Saleem, E., & El Abbadi, N. K. (2020). Auto colorization of gray-scale image using YCbCr color space. Iraqi Journal of Science, 61(12), 3379-3386.
[7] Charpiat, G., Hofmann, M., & Schölkopf, B. (2008). Automatic image colorization via multimodal predictions. In Forsyth, D., Torr, P., & Zisserman, A. (Eds.), Computer Vision – ECCV 2008. Lecture Notes in Computer Science, vol 5304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88690-7_10
[8] Royer, A., Kolesnikov, A., & Lampert, C. H. (2017). Probabilistic image colorization. arXiv preprint arXiv:1705.04258.
[9] Nguyen-Quynh, T.-T., Kim, S.-H., & Do, N.-T. (2020). Image colorization using the global scene-context style and pixel-wise semantic segmentation. IEEE Access, 8, 214098-214114. doi: 10.1109/ACCESS.2020.3040737
[10] Wang, X.-H., Jia, J., Liao, H.-Y., & Cai, L.-H. (2012, November). Affective image colorization. Journal of Computer Science and Technology, 27(6), 1119-1128. DOI 10.1007/s11390-012-1290-4
[11] Reinhard, E., Ashikhmin, M., Gooch, B., & Shirley, P. (2001). Color transfer between images. IEEE Computer Graphics and Applications, 21(5), 34-41.
[12] Halder, S. S., De, K., & Roy, P. P. (2019). Perceptual conditional generative adversarial networks for end-to-end image colourization. In Jawahar, C., Li, H., Mori, G., & Schindler, K. (Eds.), Computer Vision – ACCV 2018. Lecture Notes in Computer Science, vol 11362. Springer, Cham.
[13] Teng, X., Li, Z., Liu, Q., Pointer, M. R., Huang, Z., & Sun, H. (2020). Subjective evaluation of colorized images with different colorization models. Wiley Periodicals LLC.
[14] Alhadidi, A. S. An approach for automatic colorization of grayscale images. International Journal of Science and Research (IJSR). ISSN (Online): 2319-7064.
[15] Anwar, S., Tahir, M., Li, C., Mian, A., Khan, F. S., & Muzaffar, A. W. Image colorization: A survey and dataset.

CSE4019: IMAGE PROCESSING
J - COMPONENT
REVIEW 1

TITLE
Colourization of Grayscale Images Using Deep Learning

SUBMITTED TO
Dr. Akila Victor
SCOPE

TEAM MEMBERS
Aashish Sharma - 19BCE0971
Gokul S Nambiar - 19BCI0017
Shivang Kohli - 19BCI0066
ABSTRACT

Colourization is the process of adding colour to a monochrome picture or film. The process typically involves segmenting images into regions and tracking these regions across image sequences. It requires extensive user intervention and is a monotonous, tedious and costly task. For a long time, many filmmakers resisted colorizing their black-and-white motion pictures and considered it vandalism of their craft. The technology itself has moved from painstaking hand colourization to today's largely automated methods. We aim to use deep learning and convolutional neural networks to develop a program that can predict the colours of a monochrome image and colour it for us. This can be used not only to colour old films and pictures but also to improve the quality of face detection in biometric machines.

INTRODUCTION

To colour a grayscale image, we first need to understand what the different colour spaces are and the difference between a coloured image and a grayscale image. Colour images are a collection of information about the intensity at each point. For a grayscale image, the only information required to make the whole image is the intensity of each pixel, which can vary from 0 to 255. As each pixel stores only this one parameter, the pixel size is normally 8 bits. For a coloured image, more information needs to be handled: along with the intensity, the colour information for each pixel also needs to be stored. For an RGB colour space, this means storing the Red, Green and Blue components of the pixel, the three primary colours that can be mixed in different proportions to obtain all visible colours. As three values need to be stored, each taking 1 byte (8 bits) and again ranging from 0 to 255, a single pixel in a coloured image stores 3 bytes (24 bits) of information. Converting a grayscale image into a coloured image therefore involves generating these three pieces of information (depending on the colour space chosen) from the single byte of information in each grayscale pixel. Generating these values accurately has long been a difficult problem, as many conditions need to be taken into account and some details are often left behind. The actual ab colour space is not discrete but continuous, and training a CNN model on such data would take a very long time; however, since we have quantized it into 313 bins, the training can be done on a standard laptop. There are two popular methods for colorization: one in which a colour is assigned to each individual pixel based on its intensity, as learned from a colour image with similar content, and another in which the image is segmented into regions, each of which is then assigned a single hue. Assigning colour to a grayscale image is a difficult problem: for a given image, there is often no single “correct” colour. The applications of such a method allow a new appreciation of old, black-and-white photographs and cinema, along with better interpretation of modern grayscale images such as those from CCTV cameras, astronomical photography, or electron microscopy.
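The channel arithmetic above is easy to verify directly. The following is a small illustrative sketch, assuming scikit-image (which is not among this project's listed tools) and one of its bundled test images:

    # Sketch of the storage figures above and of the L*a*b* split used later.
    from skimage import data
    from skimage.color import rgb2gray, rgb2lab

    rgb = data.astronaut()          # uint8 RGB image: 8 bits x 3 channels = 24 bits/pixel
    print(rgb.dtype, rgb.shape)     # uint8 (512, 512, 3)

    gray = rgb2gray(rgb)            # one intensity value per pixel
    print(gray.shape)               # (512, 512)

    lab = rgb2lab(rgb)              # L: lightness in [0, 100]; a, b: chroma channels
    L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
    print(L.min(), L.max(), a.min(), b.max())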

LITERATURE REVIEWS

S.N Name and Input Output Methodology Advantages Disadvantages


o Author
Name

1. End-to-End The user The model They proposed a mapping It takes into It is a very
Conditional gives a produced convolutional model account the complex
GAN-based grayscale colorized trained using adversarial adversarial loss method that
Architectures image as images as methodology with during training requires
for Image input to the output with conditional GANs using the GAN frequent
Colorization model. For accuracy of pix2pix framework and network. rebalancing of
training the 89% and used a novel weights, and
Mingming model, Peak generator-discriminator It achieves good the high
He, training Signal-toNoi setting that adapts the training instability
Dongdong examples se IBN paradigm to stability. during training
Chen, Jing generated Ratio(PSNR) encoder-decoder when a GAN
Liao, Pedro from value of architecture. After this This paper deals with high
V. Sander, Lu ImageNet 26.77 dB. they used Spectral shows that by resolution
Yuan dataset Normalization for boosting the images can lead
containing improving the performance of the pix2pix
50,000 RGB generalization of adversarial framework to
images were adversarial colorization framework, collapse. This
used. and used multi scale reduction of reduces the
discriminators for getting desaturation contribution of
improved color effect can be adversarial
generation in small areas achieved. loss.
and boosted details.

2 Deep The user They They chose the first CNN 1. A significant The quality of
Exemplar provides a generated to directly select, advantage of the final results
based grayscale colorized propagate and predict their network is depends on the
Colorization image to be images from colors from an aligned the robustness to choice of the
colorized. grayscale reference for a greyscale reference reference
Marc Górriz, Users can images with image. First the Similarity selection when samples. Also
Marta Mrak, also give a top 5 class sub-net is a preprocessing compared with the perceptual
Alan F. some accuracy of step that provides the traditional loss based on
Smeaton, reference 85.94%.and input by measuring the exemplar-based the
Noel E. images to achieved semantic similarity Colorization classification
O'Connor enhance PSNR of between reference and network (VGG)
colorization. 25.50 dB target using VGG-19 2. Their method cannot penalize
network. Then the benefits from incorrect colors
They used a Colorization sub-net getting in regions with
training provides a more general references and less semantic
dataset based colorization solution for hence is able to importance.
on ImageNet either similar or work on unseen
dataset by dissimilar pixel pairs. images just as
sampling This employs multi-task effectively and
from 7 learning which share the doesn’t fail like
categories: same network but 2 previous
animals different loss functions: learning based
(15%), plants Chrominance Loss and methods trained
(15%), Perceptual Loss. This on natural
people ensures proper images.
(20%), colorization from
scenery large-scale data. 3. They
(25%), food achieved a
(5%), Fooling Rate of
transportatio 38.08 % .
n (15%) and
artifacts
(5%).

3. Collaborative The user They They proposed a single 1. Their method Occasionally
Image and provides a successfully network called UCapsNet provides a their method
Object Level greyscale colorized the that considers the image consistent fails to predict
Features for image to be greyscale level features obtained object/backgrou colors for some
Image colorized. images with through convolutions and nd separation local regions.
Colorization For training an increase in entity level features also reducing Their network
their model, PSNR of captures by capsules, then the color cannot colorize
Rita Pucci, they used 3 more than they enforced blurring on objects with
Christian datasets: 10% as collaboration between contours thus unusual or
Micheloni, ImageNet compared to such convolutional and generating more artistic colors.
Niki Martinel containing the existing entity factors to produce detailed outputs.
10k images, approaches. high quality colored
COCOStuff image. Their approach 2. Their method
containing consists of 2 phases: 1. produced
5k images, Downsample phase is for images with no
and learning the image level splotches.
Places205 and entity-level features
containing 2. Upsample phase 3. They
20500 leverages these features to conducted user
validation generate image study with 200
images. colorization. Model learns random images,
a color distribution over on collecting
pixels used to predict the votes, their
color channels. images were
preferred over
other author’s
models (53% vs
47% )

4. Automatic A single The model Their proposed It is an efficient Most of the


Image black and produces methodology included way of coloring colors were low
Colorization white image images with building a deep images using saturated
using Deep in 256 x 256 realistic convolutional neural deep CNN. because of less
Learning pixels was colors. network which includes 4 Nature elements diverse data set.
given as parts- the encoder involving rivers, Some of the
Abhishek input. 18 component to produce trees, etc. have objects are not
Pandey, Rohit gigabytes of mid-level features, the above 80% colorized well
Sahay, C. images from feature extraction accuracy. and for that
Jayavarthini ImageNet component to produce model has
database are high-level features, these produced next
being used two are then merged into probable
for training. the fusion layer and the colors.
All the output is generated using
images are the decoder component.
being
rescaled to
224 x 224
and 299 x
299 for
encoding and
inception.

5. Auto-Coloriz The dataset Colorized The model starts with The framework Poor
ation of created to images were dataset collection and offers image performance
Historical train and test generated preprocessing the images colorization for some
Images Using the model to remove images having with improved images due to
Deep contains unusual aspect ratios and or comparable small data size
Convolutional historical, resizing images to 256 x performances as and variability
Neural heritage and 256. Then the images are compared to of images in
Networks cultural converted into CIE various existing training set.
image L*a*b* where L is one approaches from Poor coloring
Joshi MR, repositories layer for luminance and enhanced signal results were
Nkenyereye of Nepal. has packed three RGB energy obtained
L, Joshi GP, The images layers into two chroma perspectives.
Islam SMR, were layers (a* and b*). Then The Mean
Abdullah-Al- collected the CNN model is Squared Error
Wadud M, from the implemented as a learning (MSE) was
Shrestha S. ImageNet pipeline. 6.08%, Peak
database and It takes greyscale images Signal-to-Noise
the internet. as input along with the Ratio (PSNR)
1200 images luminance. The a* and b* 34.65 dB, and
of 256 x 256 channels are extracted as 75.23% model
were the target values. L*a*b* accuracy.
collected. to RGB conversion is
applied as the final
output.

6 Auto A greyscale Colorized A reference image has to This model Color


Colorization image to be images were be selected which is provides a fully deformation
of Gray-Scale colorized is generated. similar to the greyscale automated takes place
Image Using provided as image in content and method to when there is a
YCbCr Color the input. An structure. Then both the colorize images difference in
Space RGB image images are converted into based on YCbCr the color
for reference YCbCr color space. For color space and histogram
Nidhal K. El is also each pixel present in the a reference between black
Abbadi, provided greyscale image, it is image which and white
Eman Saleem compared with the pixels provided good image and the
of the reference image results. RMSE reference
using Euclidean distance (Root Mean image.
and the pixel with the Square Error)
minimum distance in the equal to 10,
reference image, their average PSNR
color information is (Peak
transferred to the Signal-toNoise
greyscale image. Just like Ratio) value
this the whole image is equal to 30 and
converted into colorized average MD
image. (Maximum
Difference)
value as 100.

7 Automatic The input for The model Using a multimodal The model is With an image
Image the image reproduces model which helps in also based on with many
Colorization described is colors for distinguishing similar the user input different
Via a grey scaled greyscale objects of different colors and user can objects and
Multimodal image taken images. The to help achieve a greater interact, add textures, such
Predictions from a multimodalit accuracy in terms of color more color as a brick wall,
colored y framework correction and image points if needed, a door, a dog, a
Guillaume image to test proves visibility. until a satisfying head, hands, a
Charpiat,Matt its accuracy. extremely result is reached, loose suit.
hias The paper useful in or even place Because of the
Hofmann,Ber uses the grey areas such as color points number of
nhard scaled image Mona Lisa’s strategically in objects, and
Schölkopf of Monalisa forehead or order to give because of their
by Da Vinci neck where indirect particular
to test its the texture of information on arrangement, it
accuracy skin can be the location of is unlikely to
easily color find a single
mistaken boundaries. color image
with the Accuracy of the with a similar
texture of sky model is base on scene that we
at the local the time and would use as a
level. dataset provide learning image.
for the current
dataset in the
paper the
accuracy for
images such as a
zebra is 95%
and above but
for images with
a lot of objects
the accuracy
ranges from
70-85%.

8 Probabilistic grayscale A colorized At training time, all The main Though the
Image image to an image from variables in the factors are innovation is model is able to
Colorization embedding, an input observed, so a model can that they treat capture the
which can be greyscale be efficiently trained by colorization as a vibrance of the
Amelie encoded with image learning all factors in classification image it is
Royer, color parallel. The paper used rather than unable to
Alexander information gated residual blocks as regression predict the
Kolesnikov, the main building task,combined correct color
Christoph H. component for the both with for a specific
Lampert networks. class-rebalancin image as it is
g in the training trained in a
loss to favor rare different way
colors and more than most
vibrant samples. neural
The model is networks.
highly
competitive with
other
approaches and
tends to produce
more saturated
colors on
average.

9 Image Greyscale A colorized Using an auto encoder It can be paired This approach
Colorization image image based architecture along with with other can introduce
Using the on the applying semantic machine some red noise
Global Scene training set segmentation based on learning and in the image
Context Style provided. pixel level to get a greater deep learning leading to a
and accuracy to detect areas models to better badly colorized
Pixel-Wise of an object. improve the image with
Semantic accuracy and undefined
Segmentation color correction. edges
The accuracy of
Tram-tran the approach is:
nguyen-quyn 0.823
h , Soo-hyung
kim and
Nhu-tai do

10 Affective Greyscale Colorized We firstly jointly use text Different image Selection of
Image image and image labels to semantically colorizations are color themes
Colorization reference filter internet images with generated from leads to build
images correct references Then the varied up of bias.
Xiao-Hui secondly , we select a set reference
Wang, Jia Jia, of color themes in pictures, and a
Han-Yu Liao accordance to the graphical
& Lian-Hong affective word based on computer
Cai art theories. program is
provided to
simply choose
the required
result.

11 Color Synthetic Transferred We first describe a viable Because the Because of its
Transfer image image approach of translating synthetic image simplicity, we
between RGB signals to Ruderman and the may use it as a
Images et al perception-based .'s photograph have plug-in for a
color space l. The goal is similar variety of
Erik to modify RGB images, compositions, commercial
Reinhard, which are often of we may use the graphics
Michael unknown phosphor synthetic image programmes.
Ashikhmin, chromaticity. We convert to mimic the Finally, we
Bruce Gooch, the image to LMS space appearance of believe that
and Peter in two steps since l is a the photograph. academics will
Shirley transform of LMS cone be able to use
space. The initial step is the l color
to convert RGB values to space
XYZ tristimulus values. successfully for
additional tasks
such as color
quantization.
12 Perceptual Grayscale Colorized This paper proposes a Including the When the
Conditional images, as image Conditional GANs variant perceptual loss classification
Generative well as those that attempts to learn a and the loss of the
Adversarial with minor functional mapping from classification generator
Networks for color an input grayscale image loss in the objective is
End-to-End variations to an output colorized objective calculated
Image image while minimizing function, in using the
Colourization the per pixel loss, addition to the attributes'
classification loss, and adversarial loss cross-entropy
Halder S.S., adversarial loss, as well and the loss, it is
Kanjar De , as the high-level per-pixel greater in a
and Partha perceptual loss. In constraint, different
Pratim Roy addition, an in-depth improves the dataset. It does
qualitative and coloriathio and not produce
quantitative comparison yields promising satisfactory
with existing methods is results for results when
performed. CuPGAN, the real-world the images used
proposed model, is images. have irregular
trained from start to borders and
finish. significant
artefacts,
making colour
estimation a
difficult task.

13 Subjective Colorized Evaluated Elaborate evaluations of It is found that, The


evaluation of image statistics generated color images in future work, performance of
colorized were performed using two due various models
images with visual responses: consideration varied
different preference and perceived should be given depending on
colorization similarity. The to human visual the image
models experiments were carried perception when content.
out using the traditional evaluating and Furthermore,
Xiao Teng,| method of converting the quantifying the none of the
Zhijiang Li, original color image to performance of three metrics
Qiang Liu1, grayscale and then colorization performed well
Michael R. recoloring it. Portraits, models. v in predicting
Pointer, natural scenes, objects, human visual
Zheng Huang pets, and vehicles were perceptions,
,Hongguang chosen as representative emphasizing
Sun images with varying the need for
content. improved
measures.

14 An Approach Grayscale Colorized The user provided Micro scribbles To better


for Automatic images and image reference image needs to at high describe texture
Colorization reference be divided into some confidence areas information of
of Grayscale images segments differentiating are generated a pixel more
Images among classes of objects. and the advanced
For this method we use complete techniques like
Ahmad S. mean shift based colorization is Maximum
Alhadidi segmentation of images. performed using Response filter
In mean shift Levin’s bank can be
segmentation a window Colorization used.
around the data point is using
defined and the mean of optimization
the data in the current algorithm. The
window is calculated. combination of
Then the center of the these two
window is shifted to the methods gives
mean point. This process highly
is repeated until optimized
convergence colorization of
images.

15 Image Image Survey of This article describes the Deep State-of-the-art


Colorization: Colorization techniques essential block structures, convolutional methods
A Survey and techniques and inputs, optimizers, loss algorithms for application to
Dataset evaluation functions, training picture critical
procedures, and training colorization real-world
Saeed Anwar data of contemporary have seen scenarios is
, Muhammad state-of-the-art deep tremendous restricted due
Tahir, learning-based picture expansion as a to inadequate
Chongyi Li, colorization approaches. result of deep metrics,
Ajmal Mian, It divides existing learning network
Fahad colorization approaches approaches' complexity, and
Shahbaz into seven categories and extraordinary failure to
Khan, Abdul addresses key issues. performance. handle real-life
Wahab benchmark datasets and Various degradations.
Muzaffar assessment metrics, for strategies based
example, are elements on exciting
that influence their advances, such
performance. We point as network
out the flaws in the architectures,
current system. training
colorization datasets, as methodologies,
well as a new dataset and learning
dedicated to colorization. paradigms, are
We undertake a suggested.
comprehensive analysis This article
using the current datasets provides a
as well as our new one. comprehensive
Existing picture overview of
colorization technologies picture
are being tested in an colorization
experimental setting. approaches
Finally, we evaluate the based on deep
shortcomings of current learning.
techniques and provide
recommendations. For
this fast expanding field
of deep picture
colorization, potential
solutions as well as future
research possibilities are
discussed.
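As a concrete illustration of the reference-based approach surveyed in entry 6, the sketch below transfers chroma in YCbCr space by nearest-luma matching. It is a deliberately simplified toy version with hypothetical file names, a subsampled reference, and a per-gray-level lookup instead of a true per-pixel search; it is not the surveyed paper's code.

    # Toy YCbCr colour transfer: copy Cb/Cr from the reference pixel whose
    # luma is closest to each gray level, then rebuild an RGB image.
    import numpy as np
    from PIL import Image

    gray = np.asarray(Image.open("gray.png").convert("L"))
    ref = np.asarray(Image.open("reference.png").convert("YCbCr")).astype(np.float64)
    ref_y, ref_cb, ref_cr = ref[..., 0].ravel(), ref[..., 1].ravel(), ref[..., 2].ravel()

    # Subsample the reference so the 256 x N distance table stays small.
    rng = np.random.default_rng(0)
    idx = rng.choice(ref_y.size, size=min(ref_y.size, 5000), replace=False)
    ref_y, ref_cb, ref_cr = ref_y[idx], ref_cb[idx], ref_cr[idx]

    # For each possible gray level 0..255, find the reference pixel with the
    # nearest luma (Euclidean distance in 1-D is just an absolute difference).
    levels = np.arange(256, dtype=np.float64)
    nearest = np.abs(levels[:, None] - ref_y[None, :]).argmin(axis=1)

    out = np.stack([gray.astype(np.float64),
                    ref_cb[nearest[gray]],
                    ref_cr[nearest[gray]]], axis=-1)
    Image.fromarray(out.astype(np.uint8), mode="YCbCr").convert("RGB").save("colourised.png")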

PROPOSED METHODOLOGY

Colorization of grayscale images provides insightful innovation in the domain of computer vision. Reconstructing RGB images from grayscale images is an innovative idea, as it can be used and applied to several real-life problems. A few innovative solutions that can be built on this technology are listed below:

1. Conversion of recorded videography from older times:

The development of basic videography began in the late 19th century and was limited to rapid frame playback and black-and-white composition. Through the process of colorization, we can reconstruct the RGB channels of each frame and make these recordings coloured.

2. Highly detailed image reconstruction:

Grayscale-to-RGB image reconstruction leads to images with higher detail and greater visual appeal, allowing better illustration of image quality after reconstruction.

Convolutional neural networks are biologically inspired forms of multilayer perceptrons that are designed to mimic visual brain activity. These models overcome the limitations of the MLP design by utilizing the significant spatially local correlation found in natural images. CNNs, as opposed to MLPs, have the following characteristics (a small sketch follows this list):

1. 3D volumes of neurons: The network is constructed by arranging and connecting neurons in 3 dimensions: height, width and depth. Each neuron has a limited receptive field and is connected only to part of the layer before it. To build a CNN architecture, these neurons are stacked and connected.

2. Local connectivity: CNNs exploit spatial locality by establishing a local connection pattern between neurons in adjacent layers, much like receptive fields. As a result of this architecture, the learned "filters" produce the strongest response to a spatially confined input pattern. Non-linear filters become progressively global by stacking many of these layers, allowing the network to first build representations of small parts of the input and then assemble representations of larger areas from them.

3. Shared weights: In a CNN, each filter is replicated across the entire visual field. These replicated units form feature maps using the same weights and bias, so within a feature map they respond in the same way wherever the convolutional response fires. By replicating units in this way, the resulting feature map is equivariant under changes in the locations of input features in the visual field.

4. Pooling: Feature maps are partitioned into rectangular subregions in a CNN's pooling layers, and the features in each rectangle are down-sampled to a single value, usually by taking their average or maximum value. In addition to lowering the size of feature maps, the pooling method gives the features contained therein a degree of translational invariance, allowing the CNN to be more resilient to changes in their positions.

Convolutional neural networks (CNNs) have achieved remarkable results in a variety of fields, including medical studies, and there is growing interest in radiology. Although deep learning has been the preferred method for a range of challenging tasks including picture classification and object recognition, it is not a panacea. Knowing the core ideas and benefits of CNNs, as well as the limitations of deep learning, is critical for using them in radiology research with the goal of enhancing radiologist performance.
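The sketch below makes points 2-4 concrete by counting the parameters of a small Keras convolutional layer; it is an illustrative example only (assuming TensorFlow is available), with layer sizes chosen for the demonstration rather than taken from the project's model.

    # Local connectivity and weight sharing: a 3x3 Conv2D over a 256x256 image
    # needs only 160 parameters, while pooling halves the spatial size.
    import tensorflow as tf

    x = tf.keras.Input(shape=(256, 256, 1))                    # 65,536 input pixels
    conv = tf.keras.layers.Conv2D(16, (3, 3), padding="same")  # 3x3 local windows
    pool = tf.keras.layers.MaxPooling2D((2, 2))                # 2x2 down-sampling

    y = pool(conv(x))
    print(conv.count_params())  # 16 filters * (3*3*1 weights + 1 bias) = 160
    print(y.shape)              # (None, 128, 128, 16): spatial size halved by pooling

    # A fully connected layer mapping the same input to a 256x256x16 output
    # would need billions of weights; sharing local filters avoids this.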
INPUT, STEPS, EXPECTED OUTPUT

Input: A grayscale image in LAB format.

Steps (a sketch of the data-preparation steps follows below):
1. Import the grayscale images dataset as a NumPy array.
2. Convert the grayscale images in LAB (L - Lightness, A - Green and Magenta, B - Blue and Yellow) format to RGB format.
3. Divide the dataset into a training set and a test set.
4. Create a Convolutional Neural Network using multiple layers in Keras.
5. Use the Adam optimizer and a loss function to monitor the loss while training the model.
6. Check for errors using a sample image.
7. Train the model for a number of epochs.
8. Predict the colorized output of the input test images.

Expected Output: Colorized image.
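A minimal sketch of steps 1-3, assuming (hypothetically) that the dataset ships as NumPy arrays of L channels and paired ab channels and using scikit-learn for the split; the file names, shapes and normalisation below are assumptions, not the dataset's documented layout.

    # Steps 1-3: load the arrays, normalise, and split into train/test sets.
    import numpy as np
    from sklearn.model_selection import train_test_split

    L = np.load("gray_scale.npy")     # assumed shape (N, 224, 224), L in [0, 100]
    ab = np.load("ab1.npy")           # assumed shape (N, 224, 224, 2)

    X = (L / 100.0)[..., np.newaxis]  # network input: L scaled to [0, 1]
    y = ab / 128.0                    # targets: a, b scaled to roughly [-1, 1]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    print(X_train.shape, y_train.shape)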

DATASET INFORMATION

Dataset: Kaggle dataset -
https://www.kaggle.com/shravankumar9892/image-colorization

Tools:
1. tensorflow.keras - a powerful and easy-to-use open-source library for developing and evaluating deep learning models. It allows us to define and train neural networks in just a few lines of code.
2. NumPy - for mathematical operations on arrays and matrices; an extension of Numeric Python and NumArray.
3. Google Colab - a cloud-based Jupyter notebook environment.

ALGORITHM

The project uses the following algorithm to colorize a given grayscale image.

The model uses the following Keras APIs to build its hidden layers (a sketch of the resulting model follows below):
- Conv2D layer with a relu activation function.
- Conv2D layer with a relu activation function.
- Conv2D transpose layer with a relu activation function.
- Conv2D transpose layer with a relu activation function.

These layers make up the whole deep learning model. A Conv2D layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. This process downscales the images, so to conserve image quality we upscale them again using a Conv2DTranspose layer.

A Conv2DTranspose layer works in the same way as a Conv2D layer, but with the convolution applied in the transposed direction. The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input, while maintaining a connectivity pattern that is compatible with said convolution.

These four layers of convolution make up our CNN model, which detects areas, colorizes them, and outputs a coloured image.
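A minimal Keras sketch of this four-layer architecture is shown below. The filter counts, strides, and the tanh on the final layer (used here so the outputs cover the normalised ab range from the earlier sketch) are illustrative assumptions; the report itself lists relu on all four layers.

    # Two Conv2D layers downscale the L channel; two Conv2DTranspose layers
    # upscale back and predict the two ab channels.
    import tensorflow as tf
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=(224, 224, 1)),  # L channel
        layers.Conv2D(64, (3, 3), strides=2, padding="same", activation="relu"),
        layers.Conv2D(128, (3, 3), strides=2, padding="same", activation="relu"),
        layers.Conv2DTranspose(64, (3, 3), strides=2, padding="same", activation="relu"),
        layers.Conv2DTranspose(2, (3, 3), strides=2, padding="same", activation="tanh"),
    ])

    model.compile(optimizer="adam", loss="mse")
    model.summary()
    # Training then follows the steps listed earlier, e.g.:
    # model.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test))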
BLOCK DIAGRAM

[Block diagram figure not reproduced in this text version.]

REFERENCES

1. He, M., Chen, D., Liao, J., Sander, P. V., & Yuan, L. (2018). Deep exemplar-based colorization. ACM Transactions on Graphics (TOG), 37(4), 1-16.
2. Blanch, M. G., Mrak, M., Smeaton, A. F., & O'Connor, N. E. (2019, September). End-to-end conditional GAN-based architectures for image colourisation. In 2019 IEEE 21st International Workshop on Multimedia Signal Processing (MMSP) (pp. 1-6). IEEE.
3. Pucci, R., Micheloni, C., & Martinel, N. (2021). Collaborative image and object level features for image colourisation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2160-2169).
4. Pandey, A., Sahay, R., & Jayavarthini, C. (2020, March). Automatic image colorization using deep learning. International Journal of Recent Technology and Engineering (IJRTE), 8(6). ISSN: 2277-3878.
5. Joshi, M. R., Nkenyereye, L., Joshi, G. P., Islam, S. M. R., Abdullah-Al-Wadud, M., & Shrestha, S. (2020). Auto-colorization of historical images using deep convolutional neural networks. Mathematics, 8(12), 2258.
6. Saleem, E., & El Abbadi, N. K. (2020). Auto colorization of gray-scale image using YCbCr color space. Iraqi Journal of Science, 61(12), 3379-3386.
7. Charpiat, G., Hofmann, M., & Schölkopf, B. (2008). Automatic image colorization via multimodal predictions. In Forsyth, D., Torr, P., & Zisserman, A. (Eds.), Computer Vision – ECCV 2008. Lecture Notes in Computer Science, vol 5304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88690-7_10
8. Royer, A., Kolesnikov, A., & Lampert, C. H. (2017). Probabilistic image colorization. arXiv preprint arXiv:1705.04258.
9. Nguyen-Quynh, T.-T., Kim, S.-H., & Do, N.-T. (2020). Image colorization using the global scene-context style and pixel-wise semantic segmentation. IEEE Access, 8, 214098-214114. doi: 10.1109/ACCESS.2020.3040737
10. Wang, X.-H., Jia, J., Liao, H.-Y., & Cai, L.-H. (2012, November). Affective image colorization. Journal of Computer Science and Technology, 27(6), 1119-1128. DOI 10.1007/s11390-012-1290-4
11. Reinhard, E., Ashikhmin, M., Gooch, B., & Shirley, P. (2001). Color transfer between images. IEEE Computer Graphics and Applications, 21(5), 34-41.
12. Halder, S. S., De, K., & Roy, P. P. (2019). Perceptual conditional generative adversarial networks for end-to-end image colourization. In Jawahar, C., Li, H., Mori, G., & Schindler, K. (Eds.), Computer Vision – ACCV 2018. Lecture Notes in Computer Science, vol 11362. Springer, Cham.
13. Teng, X., Li, Z., Liu, Q., Pointer, M. R., Huang, Z., & Sun, H. (2020). Subjective evaluation of colorized images with different colorization models. Wiley Periodicals LLC.
14. Alhadidi, A. S. An approach for automatic colorization of grayscale images. International Journal of Science and Research (IJSR). ISSN (Online): 2319-7064.
15. Anwar, S., Tahir, M., Li, C., Mian, A., Khan, F. S., & Muzaffar, A. W. Image colorization: A survey and dataset.
