Image Resolution Using Super Resolution Convolutional Neural Network (SRCNN)
I. INTRODUCTION
Convolutional neural networks (CNNs) are a type of deep learning network used mainly for image classification and image resolution enhancement. The architecture is loosely inspired by the visual cortex of the animal brain, which gives it some useful properties for processing data such as audio, video and images. A CNN is a stack of convolution layers that are responsible for the image processing. Convolution is a technique that extracts the visual features of an image in small chunks. Each layer contains filters (kernels) that determine which cluster of neurons responds. Depending on the kernel, a convolution can reproduce the image unchanged, blur it, sharpen its edges, and so on. This is done by multiplying the original image values with the convolution matrix (the kernel).
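As a minimal sketch of this idea, the following NumPy code slides a kernel over an image; the identity and sharpening kernels shown are standard textbook examples, not filters taken from any particular trained network:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image (valid mode, no padding)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Multiply the image patch with the kernel and sum the result.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# An identity kernel reproduces the pixel values unchanged;
# a sharpening kernel emphasises edges.
identity = np.array([[0, 0, 0],
                     [0, 1, 0],
                     [0, 0, 0]], dtype=float)
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype=float)

image = np.arange(25, dtype=float).reshape(5, 5)
identity_out = convolve2d(image, identity)
print(identity_out.shape)  # (3, 3) — a 5x5 image shrinks under a 3x3 valid convolution
```

With the identity kernel the output equals the central crop of the input, which is one way to see that convolution without padding loses the border pixels.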
Image super-resolution is a challenging problem because a single low-resolution (LR) input corresponds to many possible high-resolution (HR) images, so mapping the LR input to the HR space is ill-posed. Conventional approaches have drawbacks such as an unclear definition of the mapping and inefficiency in establishing a high-dimensional mapping from raw data. SRCNN was introduced to overcome these drawbacks and to produce high-resolution images in which pixel breakage is minimal when the image is zoomed.
SRCNN (Super-Resolution Convolutional Neural Network) is a deep learning method for super-resolution that learns a direct end-to-end mapping between LR and HR images. SRCNN consists of three layers, each a convolution layer followed by an activation function. A bicubic-interpolated low-resolution image is the input to the network, which produces a high-resolution output of the same size.
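The three-layer pipeline can be sketched as a NumPy forward pass. This is a simplified illustration: the weights are random stand-ins (the published SRCNN uses trained filters with 64 and 32 channels), and the channel widths here are shrunk to keep the demo fast, but the 9-1-5 filter sizes and the same-size output match the architecture described above:

```python
import numpy as np

def conv_layer(x, weights, relu=True):
    """x: (H, W, C_in); weights: (k, k, C_in, C_out). 'Same' zero padding
    keeps the spatial size, so the output image matches the input image."""
    k = weights.shape[0]
    p = k // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    H, W = x.shape[:2]
    out = np.zeros((H, W, weights.shape[3]))
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + k, j:j + k, :]
            out[i, j] = np.tensordot(patch, weights, axes=([0, 1, 2], [0, 1, 2]))
    return np.maximum(out, 0) if relu else out

rng = np.random.default_rng(0)
w1 = rng.normal(0, 0.01, (9, 9, 1, 8))  # layer 1: patch extraction (9x9)
w2 = rng.normal(0, 0.01, (1, 1, 8, 4))  # layer 2: non-linear mapping (1x1)
w3 = rng.normal(0, 0.01, (5, 5, 4, 1))  # layer 3: reconstruction (5x5, no ReLU)

lr_upscaled = rng.random((16, 16, 1))   # stand-in for the bicubic-upscaled input
sr = conv_layer(conv_layer(conv_layer(lr_upscaled, w1), w2), w3, relu=False)
print(sr.shape)  # (16, 16, 1) — same spatial size as the input
```

The key point the sketch demonstrates is the end-to-end mapping: the network takes an upscaled LR image in and emits an image of identical spatial dimensions out.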
II. METHODOLOGY
Images resampled with bicubic interpolation have a smooth surface and very few interpolation artifacts. We therefore choose a bicubic-interpolated image as the input and pass it through three convolution layers for further processing.
1. Convolution layer 1: In this layer patch extraction is performed. Patch extraction is the process of selecting a patch, i.e. a set of pixels, from the image. SRCNN performs patch extraction rather than operating on the entire image at once, which makes the process much easier.
2. Convolution layer 2: In this layer non-linear mapping is performed using the rectified linear unit (ReLU). ReLU is an activation function that returns 0 for any negative input:
f(x) = max(0, x).
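The two operations above can be sketched in a few lines of NumPy; the patch size and stride here are arbitrary illustration values, not the trained SRCNN parameters:

```python
import numpy as np

def extract_patches(image, size, stride):
    """Layer-1 style patch extraction: collect size x size pixel patches
    by sliding a window over the 2-D image."""
    patches = []
    for i in range(0, image.shape[0] - size + 1, stride):
        for j in range(0, image.shape[1] - size + 1, stride):
            patches.append(image[i:i + size, j:j + size])
    return np.stack(patches)

def relu(x):
    """Layer-2 activation: f(x) = max(0, x), so negatives become 0."""
    return np.maximum(0, x)

img = np.arange(36, dtype=float).reshape(6, 6)
patches = extract_patches(img, size=3, stride=3)
print(patches.shape)  # (4, 3, 3): four non-overlapping 3x3 patches
print(relu(np.array([-2.0, 3.0])))  # [0. 3.]
```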
Padding is the process of adding layers of zeros around the input image. Without it, the convolution effectively extracts only the middle part of the image, and the information at the borders is not preserved. To avoid this problem, padding is applied at the second convolution layer.
Padding is of two types: valid padding and same padding.
1. Valid padding: this simply indicates no padding at all, i.e. the image is left unaltered, so the output shrinks:
[(n × n) image] * [(f × f) filter] → [(n−f+1) × (n−f+1) image]
Here * represents the convolution operation.
2. Same padding: to keep the output at the same dimensions as the input image, same padding adds p layers of zeros on each side, with p = (f−1)/2:
[(n+2p) × (n+2p) image] * [(f × f) filter] → [(n × n) image]
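The two padding rules above reduce to one output-size formula, sketched below (the 32 and 9 are example values, not parameters from the paper):

```python
def conv_output_size(n, f, p=0):
    """Output side length of an n x n image convolved with an f x f filter
    after adding p rows/columns of zero padding on each side."""
    return n + 2 * p - f + 1

# Valid padding (p = 0): the image shrinks.
print(conv_output_size(32, 9))       # 24
# Same padding: choosing p = (f - 1) / 2 keeps the output at n x n.
print(conv_output_size(32, 9, p=4))  # 32
```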
A pooling layer summarizes the features present in the feature maps, so that subsequent convolution layers operate on these summarized features. This increases the robustness of the model to variations in the position of features in the input image. There are two types of pooling: max pooling and average pooling.
1. Max pooling: this layer extracts the maximum pixel value from each window of the feature map, and these summarized values are passed on for further processing.
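A minimal max-pooling sketch in NumPy, using an arbitrary 4×4 example and non-overlapping 2×2 windows (swap `max` for `mean` to get average pooling):

```python
import numpy as np

def max_pool(x, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    # Reshape so each window becomes its own pair of axes, then reduce.
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [4, 6, 1, 2],
              [7, 2, 9, 1],
              [3, 5, 4, 8]])
print(max_pool(x))  # [[6 2]
                    #  [7 9]]
```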
III. RESULTS
[Figure: original image, degraded (low-resolution) image, and SRCNN-reconstructed image]
Table 1. PSNR and MSE (Mean Squared Error) Values of images
In this process the PSNR value of the SRCNN image is increased, and a higher PSNR corresponds to better image quality. In the figure above, the first convolution layer applies 64 filters of size 9×9, the second layer performs max pooling and average pooling using kernel operations, and the final output is the same size as the original image but with higher resolution and better texture quality.
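The PSNR values in Table 1 follow directly from the MSE; a minimal sketch of the standard formula (the pixel values below are illustrative, not the paper's test images):

```python
import numpy as np

def psnr(original, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio in dB; a higher value means the
    reconstruction is closer to the original image."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10 * np.log10(max_val ** 2 / mse)

a = np.full((8, 8), 100.0)
b = a + 5.0  # every pixel off by 5, so MSE = 25
print(round(psnr(a, b), 2))  # 34.15
```

This makes the relationship in Table 1 concrete: as the MSE of the SRCNN output drops, its PSNR rises.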
IV. CONCLUSION
Large-scale super-resolution and SISR with corruption are the two major challenges in the super-resolution community. Deep learning algorithms are well suited to overcoming these drawbacks. Combining loss functions for image super-resolution yields perceptually better image quality. Among many algorithms, SRCNN has proved to give the best resolution, with no visible pixel breakage when the image is pinched or zoomed. A state-of-the-art benchmark has thus been reached. This application is especially useful for MRI scans in medical imaging and for satellite images.
ACKNOWLEDGEMENTS
We thank our guide, Dr. Rajashree V. Biradar, Ph.D., for guiding every one of us and infusing the enthusiasm to work successfully. We express our sincere thanks to the respected Head of the Department, Dr. R.N. Kulkarni, Ph.D., whose moral support encouraged us throughout the project. We offer our sincere gratitude to the project coordinators and the non-teaching staff for supporting us in difficult times.