Automatic Image Colorization using Deep Learning

Digambar Shukla

Abstract: Image colorization has become an active area of research in recent years. In this project, we colorize black and white images with the help of Deep Learning. Some past methods required human assistance and resulted in desaturated images. We build a Deep Convolutional Neural Network (CNN) that is trained on over a lakh of images. The output generated by the model depends entirely on the images it has seen during training and requires no manual help. Images are taken from sources such as ImageNet, Reddit, and many more. The model has many hidden layers, which makes the results more accurate. It is a fully automatic model that generates images with accurate colors in most regions. The end goal of this project is to produce images colored so realistically that they can easily fool a viewer, who would not be able to tell them apart from real photographs. Our project has wide practical applications: historical video restoration, image enhancement for better interpretability, colorization of famous black and white documents, and many more.

Keywords: CNN, CSV, DL, ML, GAN
I. INTRODUCTION

The colorization of black and white images can impact a huge variety of domains. Some of the applications of black and white image colorization are remastering of historical images and surveillance feed improvement. The content of black and white images is very limited, so if we add color components, we can improve the semantics of the image. Prebuilt models like Inception and ResNet are trained using datasets of colored images. When we apply these neural networks on black and white images, we can improve the results if we colorize them beforehand. However, it is very challenging to design and implement a system that is both effective and reliable enough to automate the whole colorization process.

II. LITERATURE REVIEW

In this paper gray-scale images have been colored using various deep learning approaches. A model is proposed which is based on a neural network. This model starts from scratch and various high-quality features are extracted. There is a pre-trained model by the name of "Inception-ResNet-v2". In this paper, a particular encoder-decoder model is presented which can work on images of any size and ratio. After calculating the results, the "public acceptance" of the model is evaluated; a separate user study is carried out for this. Then a separate menu of applications that has different types of images is presented. So, the steps which are carried out in this model can be summarized as follows:
 High-level features are extracted using the pre-trained model specified above.
 Analysis of the architecture is carried out using CNN.
 A separate user study is carried out to check whether the model is publicly accepted or not.
 A separate set of old pictures is presented and the model is tested on it.
So, the main components are: an encoder, a feature extraction engine, and then an activation function, which is the hyperbolic tangent. This project is suitable for carrying out certain image colorization tasks. It is able to color elements such as oceans, forests, and sky [1].
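As a minimal sketch of that final activation (numpy only; the layer shape here is an illustrative assumption, not the configuration used in [1]):

```python
import numpy as np

np.random.seed(0)

# Hypothetical raw outputs of a colorization network's final layer:
# one value per pixel for each of the two color channels (H x W x 2).
raw_outputs = np.random.randn(4, 4, 2) * 3.0

# The hyperbolic tangent squashes every prediction into (-1, 1) ...
normalized_ab = np.tanh(raw_outputs)

# ... which can then be rescaled to the ab value range of the Lab space.
ab_channels = normalized_ab * 128.0

assert normalized_ab.shape == (4, 4, 2)
assert np.all(np.abs(normalized_ab) < 1.0)
```

The bounded output makes the rescaled predictions stay inside a valid ab range regardless of how large the raw network outputs are.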
In this paper the styles and formats of colored images are mixed with the contents of grayscale images, and the result is obtained as colorized grayscale pictures. A CNN is used to extract color information from a particular picture, which is then transferred to another picture.
The approach used is that both the content-image and the style-image are passed into a pre-trained CNN network, and then the content representation and style representation are extracted. The same is then done to a noisy picture. L-BFGS was used for optimization, and once the parameters were properly tuned, the images produced were much better than those obtained using stochastic gradient descent [2].
In this paper an architecture is proposed which is based on neural networks. It colors black and white pictures without the use of any human interference. Many network models and problem objectives are focused on. The final architecture produces colored pictures that are more useful and pleasing than the previously made base-line regression models. This system uses various datasets. There are several thousands of pictures divided into 8 categories in the MIT CVCL Urban and Natural Scene Categories dataset. About 411 pictures were used in experiments to check the reliability of the system. A pipeline is built by making the program read pictures of certain constrained dimensions in the red, green, blue color space. This pipeline consists of a neural network. This model also solved issues of image inconsistency. The system can be made to learn to produce pictures that could be compared with real images [3].
Traditionally picture colorization is done using scribbling methods that work manually. In this paper, an automated method is proposed. Two distinct convolutional neural network architectures are compared and trained using various loss functions. Each variant's result is obtained in the form of pictures and videos and then compared. The main goal of this paper is to determine whether there is any possibility of using neural networks for the colorization of grayscale images in an automated manner or not. The images would differ from natural images by containing a smaller amount of textural material, making the process of obtaining information harder. So, several variants of neural networks were built and their performances compared. Two architectures are considered: one is a traditional, plain network and the other is inspired by a residual network that has not been put to use previously [4].
The aim of this paper is to make the output image a realistic picture like the input but not necessarily the same as the original. A neural network is explored first. Then the model is combined with a classifier named Inception-ResNet-v2 which has been trained using 1.2 million pictures in order to obtain a more realistic output. A CNN has been used to color images. This model had advantages compared to earlier models which used mean squared errors, which led to desaturated photos. New models such as colorful image colorization help to encourage bolder pixel choices as compared to those which were more conservative. The dataset used is that of Unsplash, which consists of 10,000 pictures, 95% of which is used in the training set, 2.5% in the development set and 2.5% in the test set. Various transformations such as image zooming and flipping were also performed to avoid overfitting. A very simple survey was done at the end to determine the frequency of colorings which were accurate, but this approach proved to be too slow and blunt [5].
In this paper, the ImageNet database has been used which contains 7 million pictures. Using the concept of CNN, the density and diversity of datasets have been evaluated. The difference between object-centric and scene-centric networks has been shown. A linear SVM and the pre-trained ImageNet database have been used. Visualization of CNN layers has been used to show object-centric and scene-centric networks. The Places database is very large; it achieves the best performance when the whole of the set is used as a training set. The workers were presented with different sets of images and had to choose the set which is similar. The deep features obtained from ImageNet were not competitive enough to perform the tasks. The Places dataset was 60 times larger than the SUN database [6].
In this paper, a generic framework has been introduced without any supervision. An online algorithm has been proposed for big image databases. This approach is less efficient for training than other supervised methods. End-to-end training occurs, and this project aims at learning discriminative features; very few assumptions are made, so it is very easy to train. A separate mean square function has been used, so the model can be used to train on millions of images. A SoftMax function has been used as a loss function. After performing several experiments, the quality of the features has been evaluated. The ImageNet database has been used and object classification and detection have been done [7].

III. PROPOSED WORK

In this approach, we build a deep convolutional neural network that takes a grayscale image as an input and produces a colorized image. Firstly, we convert our black and white image into 256 x 256 pixels. We give this as an input to our neural network. Our model is trained to produce photos with realistic colors by training on colorful images. The images produced would easily fool a viewer.
The RGB color space is a 3-channel color space. The CIE Lab color space is similar to the RGB color space, but the difference is that the color information is encoded only in the "a" and "b" channels. The L (lightness) channel only encodes the intensity, so we can use it as our grayscale input to the neural network. The trained network will predict the ab channels. Then we will combine the produced ab channels with the L channel. Finally, we will convert the "Lab" image back to the RGB color space.

IV. ARCHITECTURE

On giving our model the L component of an image as an input, it calculates the ab components. It then combines them with the input to form the colored image. The architecture is proposed in Fig. 1.
The CNN is divided into 4 parts. The encoder component produces mid-level features and the feature extraction component produces high-level features. These are then merged in the fusion layer. Finally, the output is generated with the help of the decoder component.
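The channel handling in the proposed pipeline can be sketched as follows (a minimal numpy illustration; the actual RGB-to-Lab conversion would be done with an image-processing library, and the network's prediction is stood in for by the ground-truth ab values):

```python
import numpy as np

def split_lab(lab_image):
    """Separate a Lab image into the network input (L) and target (ab)."""
    return lab_image[..., :1], lab_image[..., 1:]

def recombine(l_channel, ab_channels):
    """Stack predicted ab channels back onto the lightness channel."""
    return np.concatenate([l_channel, ab_channels], axis=-1)

# Toy array standing in for a 256 x 256 photo already converted to Lab.
lab = np.random.rand(256, 256, 3)

L, ab = split_lab(lab)  # L: (256, 256, 1), ab: (256, 256, 2)

# A trained network would predict ab from L; here the ground-truth ab
# stands in for that prediction.
restored = recombine(L, ab)

assert restored.shape == (256, 256, 3)
assert np.array_equal(restored, lab)
```

At inference time only the predicted ab channels change; the L channel is passed through unchanged, which is why the output always preserves the structure of the grayscale input.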

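The shape bookkeeping of these four parts can be sketched as follows (a hypothetical numpy walk-through of the dimensions reported in the following subsections; the encoder depth of 256 is an assumption chosen so that the concatenation matches the 1257-deep fusion volume described below):

```python
import numpy as np

H = W = 256                  # input size used in this project
h, w = H // 8, W // 8        # spatial size after three stride-2 layers

# Encoder output (mid-level features); the depth of 256 is an assumption
# chosen so that 256 + 1001 = 1257, the fusion volume quoted below.
encoder_out = np.zeros((h, w, 256))

# Feature extractor output: a 1001-dimensional Inception-style embedding.
embedding = np.zeros(1001)

# Fusion: replicate the embedding at every one of the h * w (= HW / 8^2)
# spatial positions and attach it along the depth axis.
tiled = np.broadcast_to(embedding, (h, w, 1001))
fused = np.concatenate([encoder_out, tiled], axis=-1)

assert fused.shape == (32, 32, 1257)

# A 1 x 1 convolution with 256 kernels would reduce this to (h, w, 256),
# and the decoder's up-sampling layers would map it to an (H, W, 2)
# output holding the predicted ab channels.
```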
A. Preprocessing
The pixel values of the images are scaled to the range (-1, 1) for correct learning.

Fig. 1. Architecture

B. Encoder
The input given is an (H x W) black and white image, and it is processed into an (H/8 x W/8 x 512) volume. In this process, 8 convolutional layers are used with (3 x 3) kernels. To preserve the input size of the layers, padding is used. The 1st, 3rd and 5th layers have a stride equal to 2. This causes the output dimensions to be halved and therefore reduces the required computations.

C. Feature Extractor
For this we are using a pre-trained Inception model. Firstly, we scale the image to (299 x 299) and then we stack it with itself to produce a 3-channel image. Then we feed this into the network, and just before the SoftMax function we extract the output. The resultant is a (1001 x 1 x 1) embedding.

D. Fusion
The feature vector is replicated HW/8² times and is attached along the depth axis to the feature volume which is the output of the encoder. So, a single volume of shape (H/8 x W/8 x 1257) is obtained. Finally, a volume of (H/8 x W/8 x 256) dimensions is obtained after applying 256 convolutional kernels of size (1 x 1).

E. Decoder
The input to the decoder is the (H/8 x W/8 x 256) volume. It is passed through a series of up-sampling and convolutional layers. It outputs a layer of size (H x W x 2).

V. OBJECTIVE FUNCTION

Optimal values for the model are calculated by minimizing the objective function, which is defined over the target and the estimated output. For this, we calculate the mean square error between the real value of the pixel colors of the ab components and their estimated values. It is given by:

C(X, Θ) = 1/(2HW) Σ_k∈{a,b} Σ_i=1..H Σ_j=1..W (Xk(i,j) − X̃k(i,j))²

Θ: Model parameters.
Xk(i,j): ij:th pixel value of the k:th component of the target.
X̃k(i,j): ij:th pixel value of the k:th component of the reconstructed image.
The Adam optimizer is used while training so that the loss gets backpropagated and the model parameters get updated.

VI. EXPERIMENTS

The ImageNet database is used for the most part of the training process. The database consists of millions of images, and they come in different sets. We have trained our model on 18 gigabytes of the images. The shapes of the pictures in the ImageNet database are heterogeneous, so all the images are rescaled to (224 x 224) and (299 x 299) for the encoder and Inception respectively. The training time was around 8 hours. An Nvidia GeForce 1050Ti GPU was used for speeding up the process.

VII. RESULT

After training our model, we tried colorizing some black and white images. Nature elements like rivers, trees, grass, etc. are colorized well, but some objects are not always colorized correctly. For those objects, our model has produced the next most probable colors.
We have compared our results with Zhang's, who has used the same training set of images. We both have used different loss functions. We observed that although the results were good most of the time, some of the results were low saturated because of a less diverse data set.

VIII. CONCLUSION AND FUTURE WORK

In this project, we have presented an efficient way of coloring images using a deep CNN, unlike the older manual procedure. The aim of this paper is to make the output image a realistic picture like the input but not necessarily the same as the original. Various transformations such as image zooming and flipping were also performed to avoid overfitting. High-level features are extracted using the pre-trained model.

Our future work will include the colorization of historical videos. This technique will make old documentaries look visually appealing.
Altogether, some human intervention is still required in image colorization techniques, but the field has great future potential.

REFERENCES
1. Federico Baldassarre, Diego González Morín and Lucas Rodés-Guirao, "Deep Koalarization: Image Colorization using CNNs and Inception-ResNet-v2". arXiv:1712.03400, 2017.
2. Tung Nguyen, Kazuki Mori and Ruck Thawonmas, "Image Colorization Using a Deep Convolutional Neural Network". arXiv:1604.07904, April 2016.
3. Jeff Hwang and You Zhou, "Image Colorization with Deep Convolutional Neural Networks", Stanford University.
4. David Futschik, "Colorization of black-and-white images using deep neural networks", January 2018.
5. Alex Avery and Dhruv Amin, "Image Colorization". CS230, Winter 2018.
6. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A., "Learning Deep Features for Scene Recognition Using Places Database".
7. Krähenbühl, P., Doersch, C., Donahue, J., Darrell, T., "Data-dependent Initializations of Convolutional Neural Networks". ICLR, 2016.
