Colorization Report
Introduction
Image colorization is the process of assigning colors to a grayscale image to make it more
aesthetically appealing and perceptually meaningful. It is recognized as a sophisticated
task that often requires prior knowledge of image content and manual adjustments to achieve
artifact-free quality. Moreover, since objects can have different colors, there are many possible
ways to assign colors to the pixels of an image, which means there is no unique solution to this
problem.
Nowadays, image colorization is usually done by hand in Photoshop. Many institutions use
image colorization services to assign colors to grayscale historic images, and colorization is
also used for documentation images. However, colorizing by hand in Photoshop demands
considerable time and effort. One solution to this problem is to use machine
learning / deep learning techniques.
Recently, deep learning has gained increasing attention among researchers in the fields of
computer vision and image processing. As a typical technique, convolutional neural
networks (CNNs) have been well studied and successfully applied to several tasks such as
image recognition, image reconstruction, and image generation (Nguyen et al., 2016).
A CNN consists of multiple layers of small computational units that only process portions of
the input image in a feed-forward fashion. Each layer is the result of applying various image
filters, each of which extracts a certain feature of the input image, to the previous layer. Thus,
each layer may contain useful information about the input image at different levels of
abstraction.
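To make the feed-forward idea concrete, here is a minimal sketch (assuming NumPy is available; the kernel and toy image are illustrative choices, not the report's actual network) of chaining two filter applications, where each layer is computed from the previous one:

```python
import numpy as np

def conv2d(image, kernel):
    """Apply one filter to a 2D grayscale grid (valid convolution, no padding)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output unit only sees a small kh x kw portion of its input.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter: responds where brightness changes left-to-right.
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])

# Toy 7x7 image whose brightness increases left-to-right.
img = np.tile(np.arange(7.0), (7, 1))

layer1 = conv2d(img, edge_kernel)     # first level: detects the horizontal gradient
layer2 = conv2d(layer1, edge_kernel)  # second level: computed from layer1, not from img
```

Here `layer1` is constant (the gradient is uniform, so every unit responds equally), and `layer2` is all zeros because the first layer's output has no further left-to-right variation; each layer captures information at a different level of abstraction.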
Color Representation
So, how do we render an image? Let's cover the basics of digital color, which underpin the
main logic of our neural network. A grayscale image can be represented as a grid of pixels.
Each pixel has a value that corresponds to its brightness. The values span from 0–255, from
black to white. A color image, in contrast, consists of three layers: a Red, a Green, and a Blue
(RGB) layer. Imagine splitting a green leaf on a white background into these three channels.
Intuitively, the color of the leaf comes only from the green layer, but the leaf is actually
present in all three layers. The layers determine not only color but also brightness.
Just like grayscale images, each layer in a color image has values from 0 to 255. A value of 0
means the pixel has no contribution from that layer; if the value is 0 for all three color channels,
the pixel is black. A neural network creates a relationship between input values and output values.
In this project the network needs to find the traits that link grayscale images with colored ones;
that is, we should search for the features that link a grid of grayscale values to the three color grids.
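A tiny sketch of these facts (NumPy assumed; the pixel values are illustrative): a pixel is black when all three channels are 0, and an RGB image splits naturally into its three layers:

```python
import numpy as np

# A 2x2 RGB image: shape (height, width, channels), values 0-255.
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = [255, 255, 255]   # white: full brightness in every layer
img[0, 1] = [0, 255, 0]       # pure green: only the green layer contributes
# img[1, :] stays [0, 0, 0],  i.e. black: no contribution from any layer

# Splitting the image into its three layers (channels):
red, green, blue = img[..., 0], img[..., 1], img[..., 2]
```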
Our final output is a colored image. We have a grayscale image as the input, and we want to
predict the two color layers, a and b in Lab. To create the final color image we include the
L (grayscale) layer we used as the input. The result is a complete Lab image.
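As a sketch (NumPy assumed; `predicted_ab` is a hypothetical stand-in for the network's output), assembling the final Lab image is just stacking the input L channel with the two predicted channels:

```python
import numpy as np

h, w = 32, 32
L = np.random.uniform(0, 100, size=(h, w))              # input lightness (grayscale) channel
predicted_ab = np.random.uniform(-128, 128, (h, w, 2))  # stand-in for the network's a/b output

# Stack L with the predicted a and b channels to form the (h, w, 3) Lab image.
lab_image = np.concatenate([L[..., None], predicted_ab], axis=-1)
```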
How do we turn one layer into two? We use convolutional filters. Think of them as the
red and blue lenses in 3D glasses: each can highlight or suppress something to extract
information from the picture. The network can either create a new image from a single filter or
combine several filters into one image.
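A minimal sketch of the one-layer-in, two-layers-out idea (NumPy assumed; the two edge filters are illustrative choices): applying two different filters to the same grayscale grid yields one output map per filter, which is how a convolutional layer multiplies channels:

```python
import numpy as np

def apply_filter(image, kernel):
    """Valid 2D convolution of one filter over one grayscale layer."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(ow)] for i in range(oh)])

gray = np.arange(25, dtype=float).reshape(5, 5)  # toy 5x5 grayscale layer

horizontal = np.array([[1., 1., 1.], [0., 0., 0.], [-1., -1., -1.]])  # top-bottom changes
vertical   = np.array([[1., 0., -1.], [1., 0., -1.], [1., 0., -1.]])  # left-right changes

# One input layer in, two output layers out: one per filter.
two_layers = np.stack([apply_filter(gray, horizontal),
                       apply_filter(gray, vertical)], axis=-1)
```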
This part is important. Because we are working with color (RGB) images, every channel
matters, and we would need to predict the value in every channel. Instead of doing that, the
easier approach for this project is to convert the RGB images to Lab, so that only the two
color channels need to be predicted. Before we jump into the code, we should understand the
CIELAB color space, illustrated in the diagram.
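To see what the conversion involves, here is a minimal pure-Python sketch of the standard sRGB-to-CIELAB conversion for a single pixel (D65 white point); in practice a library routine such as `skimage.color.rgb2lab` would typically be used instead:

```python
def rgb_to_lab(r, g, b):
    """Convert one sRGB pixel (0-255 per channel) to CIELAB (D65 white point)."""
    # 1. Undo the sRGB gamma curve to get linear RGB in [0, 1].
    def linearize(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = linearize(r), linearize(g), linearize(b)

    # 2. Linear RGB -> CIE XYZ (sRGB primaries, D65).
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b

    # 3. XYZ -> Lab, normalized by the D65 reference white.
    def f(t):
        return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

L, a, b_ = rgb_to_lab(255, 255, 255)  # white: L near 100, a and b near 0
```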
From the left side we have the grayscale input, our filters, and the prediction from our neural
network. We map the predicted values and the true values into the same interval so that the
values can be compared. The interval ranges from -1 to 1. To map the predicted values, we
use a tanh activation function: for any input, tanh returns a value between -1 and 1.
The true color values range between -128 and 128, the default interval in the Lab
color space. By dividing them by 128, they too fall within the -1 to 1 interval. This
“normalization” enables us to compute the error of our prediction. After calculating the
final error, the network updates the filters to reduce the total error. The network continues in
this loop until the error is as low as possible.
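A minimal sketch of this normalization and error computation (NumPy assumed; the raw outputs and true values are made-up stand-ins, not real network data):

```python
import numpy as np

# Raw network outputs (pre-activation) for a few a/b predictions -- made-up numbers.
raw_outputs = np.array([-3.0, 0.0, 1.5])
predicted = np.tanh(raw_outputs)          # squashed into (-1, 1) by the tanh activation

# True a/b values in Lab's default interval [-128, 128].
true_ab = np.array([-120.0, 5.0, 90.0])
normalized_true = true_ab / 128.0         # also in [-1, 1], so the two are comparable

# Mean squared error between prediction and normalized target;
# training updates the filters to shrink this number, loop after loop.
error = np.mean((predicted - normalized_true) ** 2)
```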