
IMAGE SUPER RESOLUTION USING DEEP LEARNING

A Project report submitted in partial fulfillment of the requirements for

the award of the degree of

BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING

Submitted by
CH.ADITYA MANOHAR (315126512030)
D SOWMYA KRISHNA (315126512037)
C KIRAN KUMAR (315126512028)
C MAHESH (315126512029)

Under the guidance of


BIBEKANANDA JENA, B.Tech, M.Tech
Assistant Professor

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

ANIL NEERUKONDA INSTITUTE OF TECHNOLOGY AND SCIENCES

(UGC AUTONOMOUS)

(Permanently Affiliated to AU, Approved by AICTE and Accredited by NBA & NAAC with ‘A’ Grade)

SANGIVALASA, BHEEMILI MANDAL, VISAKHAPATNAM DIST.(A.P)

2018-2019
ACKNOWLEDGEMENT
I would like to express deep gratitude to my project guide BIBEKANANDA JENA, Assistant
Professor, Department of Electronics and Communication Engineering, ANITS, for his guidance
with unsurpassed knowledge and immense encouragement. I am grateful to Dr. V.
Rajyalakshmi, Head of the Department, Electronics and Communication Engineering, for
providing me with the required facilities for the completion of the project work.

I am very much thankful to the Principal and Management, ANITS, Sangivalasa, for their
encouragement and cooperation to carry out this work.

I express thanks to all teaching faculty of Department of ECE, whose suggestions during
reviews helped me in accomplishment of my project. I would like to thank all non-teaching
staff of the Department of ECE, ANITS for providing great assistance in accomplishment of my
project.

I would like to thank our parents, friends, and classmates for their encouragement throughout my
project period. Last but not least, I thank everyone for supporting me directly or indirectly
in completing this project successfully.

CH.ADITYA MANOHAR (315126512030)

D SOWMYA KRISHNA (315126512037)

C KIRAN KUMAR (315126512028)

C MAHESH (315126512029)
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
ANIL NEERUKONDA INSTITUTE OF TECHNOLOGY AND SCIENCES
(UGC AUTONOMOUS)
(Permanently Affiliated to AU, Approved by AICTE and Accredited by NBA & NAAC with ‘A’ Grade)
SANGIVALASA, BHEEMILI MANDAL, VISAKHAPATNAM DIST.(A.P)

CERTIFICATE
This is to certify that the project report entitled “IMAGE SUPER RESOLUTION USING
DEEP LEARNING” submitted by CH. ADITYA MANOHAR (315126512030),
DODDAPANENI SOWMYA KRISHNA (315126512037), CHINTAPALLI KIRAN
KUMAR (315126512028), and CHITTULURI MAHESH (315126512029), in partial fulfillment
of the requirements for the award of the degree of Bachelor of Technology in Electronics &
Communication Engineering (2015-2019) of Andhra University, Visakhapatnam, is a
record of bonafide work carried out under my guidance and supervision.

Project Guide: B. Jena, M.Tech, Asst. Professor, Department of ECE, ANITS

Head of the Department: Dr. V. Rajyalakshmi, M.E., Ph.D., MHRM, MIEEE, MIE, MISTE, Professor, Department of ECE, ANITS
CONTENTS

LIST OF SYMBOLS

LIST OF FIGURES

LIST OF ABBREVIATIONS

ABSTRACT

CHAPTER 1 - INTRODUCTION
1.1 Introduction to Super Resolution
1.2 Introduction to Digital Image Processing
1.3 Gray Scale Image
1.4 Color Image

CHAPTER 2 - BASIC INTERPOLATION METHODS
2.1 Nearest Neighbour Interpolation
2.2 Bilinear Interpolation
2.3 Bicubic Interpolation

CHAPTER 3 - CONVOLUTIONAL NEURAL NETWORK
3.1 Neural Network
3.2 Convolutional Neural Network
3.3 Convolutional Layer
3.4 Advantages

CHAPTER 4 - SUPER RESOLUTION IMAGING USING DEEP CNN
4.1 The SRCNN Properties
4.2 Patch Extraction and Representation
4.3 Non-linear Mapping
4.4 Reconstruction
4.5 Training

CHAPTER 5 - BACK PROPAGATION
5.1 Model Initialization
5.2 Forward Propagation
5.3 Loss Function
5.4 Differentiation
5.5 Back-propagation
5.6 Weight Update
5.7 Iterate Until Convergence

RESULTS

CONCLUSION

REFERENCES
List of symbols:
1. ∑ : summation

2. Wi : the value of the weight in a filter for a particular index i

3. η : the learning rate of the network

4. ∗ : the convolution operation

List of figures:

Figure-1.1: The concept of multi-frame super-resolution
Figure-1.2: Digital image
Figure-1.3: Types of image processing
Figure-2.1: Hierarchy of interpolation
Figure-2.2: 1-D & 2-D nearest neighbor interpolation
Figure-2.3: Example for nearest neighbor interpolation
Figure-2.4: Input image to nearest neighbor interpolation
Figure-2.5: Output image to nearest neighbor interpolation
Figure-2.6: Linear interpolation
Figure-2.7: Operation of bilinear interpolation
Figure-2.8: Bilinear interpolation
Figure-2.9: Example for bilinear interpolation
Figure-2.10: Bilinear interpolation input
Figure-2.11: Bilinear interpolation output
Figure-2.12: Bicubic interpolation
Figure-2.13: Bicubic interpolation example
Figure-2.14: Bicubic interpolation input
Figure-2.15: Bicubic interpolation output
Figure-3.1: Simple neural network
Figure-3.2: Basic structure of CNN
Figure-3.3: Example for a CNN network
Figure-3.4: Complete flow of CNN to process an input image
Figure-3.5: Convolution with a filter example
Figure-3.6: Output of convolution layer
Figure-3.7: Max pooling
Figure-3.8: Fully connected layer
Figure-3.9: Overall view of CNN structure
Figure-4.1: Overall view of CNN process
Figure-4.2: Patch extraction
Figure-4.3: CNN as a black box
Figure-5.1: Weight-error function plot
Figure-5.2: Block diagram of back propagation
Figure-A: Input to CNN
Figure-B: Output of CNN
Figure-C: Input to CNN
Figure-D: Output of CNN
Figure-E: Input to CNN
Figure-F: Output of CNN

List of abbreviations:

1. PSNR: Peak Signal to Noise Ratio

2. MSE: Mean Square Error

3. CNN: Convolutional Neural Network

4. LR: Low Resolution

5. HR: High Resolution

6. SR: Super Resolution

7. SRCNN: Super Resolution Convolutional Neural Network

8. ReLU: Rectified Linear Unit

9. ANN: Artificial Neural Network

10. RGB: Red Green Blue

11. YCbCr: Luma (Y), Blue-difference Chroma (Cb), Red-difference Chroma (Cr)

12. EM: Electromagnetic

13. DIP: Digital Image Processing
Abstract:
Image super-resolution (SR) is a technique to estimate or synthesize a high-resolution (HR)
image from one or several low-resolution (LR) images; it reconstructs a higher-resolution
image or sequence from the observed LR images. In this project we present the methods used
in super resolution and the advancements taking place in the field, since it has many
applications in various domains. As SR has been developed for more than three decades, both
multi-frame and single-frame SR have significant applications in our daily life. Most
super-resolution techniques are based on the same idea: using information from several
different images to create one upsized image. Algorithms try to extract details from every
image in a sequence to reconstruct other frames. This multi-frame approach differs
significantly from sophisticated single-frame image upsizing methods, which try to synthesize
artificial details. SR methods are usually based on two important algorithms: high-quality
spatial (in-frame) up-scaling, and motion compensation for finding corresponding areas in
neighboring frames. Finally, some future challenges and how to overcome the drawbacks are
discussed.
CHAPTER-1
INTRODUCTION TO SUPER RESOLUTION

Introduction

1.1 Introduction to Super Resolution
In the last decade the world has seen an immense global advancement in technology, both
in hardware and software. The industries took advantage of the advanced technology to produce
electronic gadgets such as computers, mobile phones, PDAs and many more at affordable prices.
The camera sensor manufacturing units also advanced in their manufacturing techniques to
produce good-quality high-resolution (HR) digital cameras. Although HR digital cameras are
available, many computer vision applications such as satellite imaging, target detection, medical
imaging and many more still have a strong demand for higher-resolution imagery, which very
often exceeds the capabilities of these HR digital cameras. To cope with this strong demand
for higher-resolution imagery, these applications turned to image-processing techniques for a
solution to generate good-quality HR imagery.

Super-resolution image reconstruction is a promising digital imaging technique which
attempts to reconstruct HR imagery by fusing the partial information contained within a
number of under-sampled low-resolution (LR) images of that scene during the image
reconstruction process. Super-resolution image reconstruction involves up-sampling of
under-sampled images, thereby filtering out distortions such as noise and blur. In comparison
to various image enhancement techniques, the super-resolution image reconstruction technique
not only improves the quality of under-sampled, low-resolution images by increasing their
spatial resolution but also attempts to filter out distortions.

The central aim of Super-Resolution (SR) is to generate a higher-resolution image from
lower-resolution images. A high-resolution image offers a high pixel density and thereby more
details about the original scene. The need for high resolution is common in computer vision
applications for better performance in pattern recognition and analysis of images. High
resolution is of importance in medical imaging for diagnosis. Many applications require zooming
of a specific area of interest in the image, wherein high resolution becomes essential, e.g.
surveillance, forensic, and satellite imaging applications.
However, high-resolution images are not always available. This is because the setup for
high-resolution imaging proves expensive, and it may not always be feasible due to the inherent
limitations of the sensor and optics manufacturing technology. These problems can be overcome
through the use of image-processing algorithms, which are relatively inexpensive, giving rise to
the concept of super-resolution. It provides an advantage as it may cost less, and the existing
low-resolution imaging systems can still be utilized.

Super-resolution is based on the idea that a combination of low resolution (noisy) sequence
of images of a scene can be used to generate a high resolution image or image sequence. Thus it
attempts to reconstruct the original scene image with high resolution given a set of observed
images at lower resolution. The general approach considers the low resolution images as
resulting from resampling of a high resolution image. The goal is then to recover the high
resolution image which when resampled based on the input images and the imaging model, will
produce the low resolution observed images. Thus the accuracy of imaging model is vital for
super-resolution and an incorrect modeling, say of motion, can actually degrade the image
further.

Image spatial resolution refers to the capability of the sensor to observe or measure the
smallest object, which depends upon the pixel size. As two-dimensional signal records, digital
images with a higher resolution are always desirable in most applications. Imaging techniques
have been rapidly developed in the last decades, and the resolution has reached a new level. The
question is therefore: are image resolution enhancement techniques still required?

The fact is, although high-definition displays in recent years have reached a new level
(e.g., 1920×1080 for HDTV, 3840×2160 for some ultra HDTV, and 2048×1536 for some mobile
devices), the need for resolution enhancement cannot be ignored in many applications. For
instance, to guarantee the long-term stable operation of the recording devices, as well as the
appropriate frame rate for dynamic scenes, digital surveillance products tend to sacrifice
resolution to some degree. A similar situation exists in the remote sensing field: there is always a
tradeoff between the spatial, spectral, and temporal resolutions. As for medical imaging, within
each imaging modality, specific physical laws are in control, defining the meaning of noise and
the sensitivity of the imaging process. How to extract 3D models of the human structure with
high-resolution images while reducing the level of radiation still remains a challenge.

Based on these facts, the current techniques cannot yet satisfy the demands. Resolution
enhancement is therefore still necessary, especially in fields such as video surveillance, medical
diagnosis, and remote sensing applications. Considering the high cost and the limitations of
resolution enhancement through “hardware” techniques, especially for large-scale imaging
devices, signal processing methods, which are known as super-resolution (SR), have become a
potential way to obtain high-resolution (HR) images. With SR methods, we can go beyond the
limit of the low-resolution (LR) observations, rather than improving the hardware devices.

SR is a technique which reconstructs a higher-resolution image or sequence from the
observed LR images. Technically, SR can be categorized as multi-frame or single-frame based
on the input LR information. If multiple images of the same scene with sub-pixel misalignment
can be acquired, the complementary information between them can be utilized to reconstruct a
higher-resolution image or image sequence, as Fig. 1.1 shows. However, multiple LR images may
sometimes not be available for the reconstruction, and thus we need to recover the HR image
using the limited LR information, which is defined as single-frame SR.

Although SR techniques have been comprehensively summarized in several studies, this
chapter aims to provide a review from the perspective of techniques and applications, and
especially the main contributions in recent decades. It provides a more detailed description
of the most commonly employed regularized SR methods, including fidelity models,
regularization models, parameter estimation methods, optimization algorithms, acceleration
strategies, etc. Moreover, we present a summary of the current applications using SR
techniques, such as the recent Google Skybox satellite application and unmanned aerial vehicle
(UAV) surveillance sequences. The current obstacles for future research are also discussed.

Fig.1.1: The concept of multi-frame super-resolution. The grids on the left side represent the LR
images of the same scene with sub-pixel misalignment; the HR image (the grid on the right
side) can be acquired by fusing the complementary information with SR methods.

1.2 DIGITAL IMAGE PROCESSING

Digital image processing is an area characterized by the need for extensive experimental work
to establish the viability of proposed solutions to a given problem. An important characteristic
underlying the design of image processing systems is the significant level of testing and
experimentation that typically is required before arriving at an acceptable solution. This
characteristic implies that the ability to formulate approaches and quickly prototype candidate
solutions generally plays a major role in reducing the cost and time required to arrive at a
viable system implementation.

WHAT IS DIP?
An image may be defined as a two-dimensional function f(x, y), where x and y are spatial
coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or
gray level of the image at that point. When x, y, and the amplitude values of f are all finite,
discrete quantities, we call the image a digital image. The field of DIP refers to processing
digital images by means of a digital computer. A digital image is composed of a finite number
of elements, each of which has a particular location and value. These elements are called pixels.

Vision is the most advanced of our senses, so it is not surprising that images play the single
most important role in human perception. However, unlike humans, who are limited to the
visual band of the EM spectrum, imaging machines cover almost the entire EM spectrum,
ranging from gamma rays to radio waves. They can also operate on images generated by
sources that humans are not accustomed to associating with images.

There is no general agreement among authors regarding where image processing stops and
other related areas, such as image analysis and computer vision, begin. Sometimes a
distinction is made by defining image processing as a discipline in which both the input and
output of a process are images. This is a limiting and somewhat artificial boundary. The area
of image analysis lies between image processing and computer vision.

There are no clear-cut boundaries in the continuum from image processing at one end to
complete vision at the other. However, one useful paradigm is to consider three types of
computerized processes in this continuum: low-, mid-, and high-level processes. Low-level
processes involve primitive operations, such as image preprocessing to reduce noise, contrast
enhancement, and image sharpening. A low-level process is characterized by the fact that both
its inputs and outputs are images.

Mid-level processes on images involve tasks such as segmentation, description of objects to
reduce them to a form suitable for computer processing, and classification of individual
objects. A mid-level process is characterized by the fact that its inputs generally are images,
but its outputs are attributes extracted from those images. Finally, higher-level processing
involves "making sense" of an ensemble of recognized objects, as in image analysis, and, at
the far end of the continuum, performing the cognitive functions normally associated with
human vision. Digital image processing, as already defined, is used successfully in a broad
range of areas of exceptional social and economic value.

WHAT IS AN IMAGE?

An image is represented as a two-dimensional function f(x, y), where x and y are spatial
coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity of
the image at that point.

Fig 1.2: DIGITAL IMAGE

Processing on an image:

Processing on an image can be of three types. They are low-level, mid-level, and high-level.

Low-level Processing :

 Preprocessing to remove noise.

 Contrast enhancement.

 Image sharpening.

Medium Level Processing :

 Segmentation.

 Edge detection.

 Object extraction.

High Level Processing :

 Image analysis.

 Scene interpretation.

Why Image Processing?

Since the digital image is invisible, it must be prepared for viewing on one or more
output devices (laser printer, monitor, etc.). The digital image can be optimized for the
application by enhancing the appearance of the structures within it.

There are three types of image processing transformations. They are

 Image to Image transformation

 Image to Information transformations

 Information to Image transformations

Fig.1.3: Types of image processing

Pixel:

A pixel is the smallest element of an image. Each pixel corresponds to a single value. In an
8-bit gray scale image, the value of a pixel lies between 0 and 255. Each pixel stores a value
proportional to the light intensity at that particular location. Pixel density is indicated in
either pixels per inch or dots per inch.

Resolution:

Resolution can be defined in many ways, such as pixel resolution, spatial resolution,
temporal resolution, and spectral resolution. In pixel resolution, the term resolution refers to
the total number of pixels in a digital image. For example, if an image has M rows and N
columns, then its resolution can be defined as M × N. The higher the pixel resolution, the
higher the quality of the image.

The resolution of an image is generally of two types:

 Low Resolution image

 High Resolution image


Since high-resolution imaging is not a cost-effective process, it is not always possible to
achieve high-resolution images at low cost. Hence it is desirable to go for Super Resolution
imaging. In Super Resolution imaging, with the help of certain methods and algorithms,
we can produce high-resolution images from low-resolution images.

1.3 GRAY SCALE IMAGE

A gray scale image is a function I(x, y) of the two spatial coordinates of the image plane,
where I(x, y) is the intensity of the image at the point (x, y). I(x, y) takes non-negative
values; assuming the image is bounded by a rectangle [0, a] × [0, b], we have I: [0, a] × [0, b] → [0, Imax].

1.4 COLOR IMAGE

A color image can be represented by three functions: R(x, y) for red, G(x, y) for green, and
B(x, y) for blue. An image may be continuous with respect to the x and y coordinates and
also in amplitude. Converting such an image to digital form requires that both the coordinates
and the amplitude be digitized. Digitizing the coordinate values is called sampling; digitizing
the amplitude values is called quantization.

CHAPTER-2
BASIC INTERPOLATION METHODS

2. Basic Interpolation Methods
Interpolation is a method of constructing new data points within the range of a discrete
set of known data points.

Fig.2.1: Hierarchy of Interpolation

The three basic methods of interpolation used here are:

1) Nearest Neighbor Interpolation


2) Bilinear Interpolation
3) Bicubic Interpolation

2.1 Nearest Neighbor Interpolation

Nearest-neighbor interpolation (also known as proximal interpolation or, in some contexts,
point sampling) is a simple method of multivariate interpolation in one or more dimensions.
In numerical analysis, multivariate interpolation or spatial interpolation is interpolation on
functions of more than one variable.

Interpolation is the problem of approximating the value of a function for a non-given
point in some space when given the value of that function in points around (neighboring) that
point. The nearest neighbor algorithm selects the value of the nearest point and does not consider
the values of neighboring points at all, yielding a piecewise-constant interpolant.

Fig.2.2: 1-D & 2-D nearest neighbour interpolation

Nearest neighbour interpolation is a simple approach to interpolation. This method
determines the nearest neighbouring pixel and assumes its intensity value. It is the simplest
interpolation, requires the least computation and less processing time, and is the fastest
interpolation method.

Let us consider the example shown below:

Fig.2.3: Example for nearest neighbor interpolation

The pictorial representation depicts that a 3x3 matrix is interpolated to a 6x6 matrix. The values
in the interpolated matrix are taken from the input matrix; no new value is added.

Fig.2.4: Original Image

Fig.2.5: Nearest Neighbor Interpolated Image

Drawbacks:
 The nearest neighbour algorithm simply selects the pixel value of the nearest pixel and
does not consider the values of other neighbouring pixels at all.

 The value of the missing pixel in the new image is the value of the nearest pixel in the
original image. It simply makes each pixel bigger.

 This type of interpolation can only be used for closer examination of digital images,
because it does not change the pixel information of the image and does not introduce any
anti-aliasing.

Linear Interpolation:
Linear interpolation is a method of curve fitting that uses linear polynomials to
construct new data points within the range of a discrete set of known data points. The
linear method is a more effective image-processing method than the nearest neighbour
method because of the improved resolution of the resulting image.

Fig.2.6: Linear interpolation

2.2 Bilinear Interpolation:
Bilinear interpolation is an extension of linear interpolation for interpolating functions of
two variables (e.g., x and y) on a rectilinear 2D grid. The key idea is to perform linear
interpolation first in one direction, and then again in the other direction.

Fig.2.7: The four red dots show the data points and the green dot is the point at which we
want to interpolate.

Fig.2.8: Bilinear Interpolation

Procedure for bilinear interpolation:


 Suppose that we want to find the value of the unknown function f at the point (x, y). It is
assumed that we know the value of f at the four points Q11 = (x1, y1), Q12 = (x1, y2), Q21 =
(x2, y1), and Q22 = (x2, y2).

 We first do linear interpolation in the x-direction. This yields:

 f(x, y1) = ((x2 - x)/(x2 - x1)) * f(Q11) + ((x - x1)/(x2 - x1)) * f(Q21)

 f(x, y2) = ((x2 - x)/(x2 - x1)) * f(Q12) + ((x - x1)/(x2 - x1)) * f(Q22)

Now interpolation is completed in the x direction. Next, interpolation is done along the y
direction as follows:

 f(x, y) = ((y2 - y)/(y2 - y1)) * f(x, y1) + ((y - y1)/(y2 - y1)) * f(x, y2)

 Now interpolation is done in both directions; a code sketch follows the worked example below.

Fig.2.9: Example for bilinear interpolation:

 Interpolating ‘i’ value,

 (i-A) / w = (B-A) / W

 i = A + w * (B-A) / W

 Interpolating ‘j’ value,

 (j-C) / w = (D-C) / W

 j = C + w * (D-C) / W

 (Y-i) / h = (j-i) / H

 Y = i + h * (j-i) / H

 By substituting for i and j,

 Y = A + w * (B-A) / W + h * (D-C) / H + w * h * (D-C-B+A) / (W * H)
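The procedure above translates directly into code. A small sketch (the corner naming follows the Q11..Q22 convention used earlier; the sample values are ours):

def bilinear_interpolate(x, y, x1, x2, y1, y2, q11, q21, q12, q22):
    # Linear interpolation in the x-direction at y1 and y2
    fxy1 = (x2 - x) / (x2 - x1) * q11 + (x - x1) / (x2 - x1) * q21
    fxy2 = (x2 - x) / (x2 - x1) * q12 + (x - x1) / (x2 - x1) * q22
    # Then linear interpolation in the y-direction between the two results
    return (y2 - y) / (y2 - y1) * fxy1 + (y - y1) / (y2 - y1) * fxy2

# Centre of a unit cell with corner values 10, 20, 30, 40 -> 25.0
print(bilinear_interpolate(0.5, 0.5, 0, 1, 0, 1, 10, 20, 30, 40))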

Fig.2.10: Original Image

Fig.2.11: Bilinear Interpolated Image

Advantages:
 It provides better resolution than the nearest neighbour method.

 The speed at which it processes the image is good because of its simple algorithm.

Disadvantages:
The resolution is lower when compared to many other interpolation techniques, such as
bicubic interpolation and some deep learning techniques.

2.3 Bicubic Interpolation:
Bicubic interpolation is an extension of cubic interpolation for interpolating data
points on a two-dimensional regular grid. The interpolated surface is smoother than the
corresponding surfaces obtained by bilinear interpolation or nearest neighbor
interpolation. Bicubic interpolation can be accomplished using either cubic splines or the
cubic convolution algorithm.

In contrast to bilinear interpolation, which only takes 4 pixels (2×2) into account,
bicubic interpolation considers 16 pixels (4×4).
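For comparison across the three methods of this chapter, a short OpenCV sketch (the file names and the scale factor of 3 are placeholders; cv2.INTER_NEAREST, cv2.INTER_LINEAR, and cv2.INTER_CUBIC select the algorithms discussed above):

import cv2

img = cv2.imread("input.png")              # any low-resolution image
h, w = img.shape[:2]
size = (w * 3, h * 3)                      # upscale by a factor of 3

cv2.imwrite("nearest.png", cv2.resize(img, size, interpolation=cv2.INTER_NEAREST))
cv2.imwrite("bilinear.png", cv2.resize(img, size, interpolation=cv2.INTER_LINEAR))  # 2x2 neighbourhood
cv2.imwrite("bicubic.png", cv2.resize(img, size, interpolation=cv2.INTER_CUBIC))    # 4x4 neighbourhood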

Fig.2.12: Bicubic Interpolation

Fig.2.13: Example for bicubic interpolation

Fig.2.14: Original Image

Fig.2.15: Bicubic Interpolated Image

Advantages :
 Bicubic interpolation makes use of more data, hence its results are generally smoother.

 Bicubic interpolation creates smoother curves than bilinear interpolation, and introduces
fewer "artifacts," or pixels that stand out as conspicuously deteriorating the apparent
quality of the image.

Drawbacks :
 The increased smoothness of bicubic interpolation comes at a substantial cost in terms of
processing time; the algorithms and formulas used for the bicubic method are much more
complex.

 Accordingly, while bilinear interpolation is fairly quick and may not be that much slower
than nearest-neighbor calculations, bicubic interpolation is slower, at times by an order of
magnitude.

CHAPTER-3
INTRODUCTION TO CONVOLUTIONAL NEURAL
NETWORKS

Introduction to Convolutional Neural Network

3.1 Neural Network:


A neural network is a network or circuit of neurons, or in a modern sense, an artificial
neural network, composed of artificial neurons or nodes.[1] Thus a neural network is either
a biological neural network, made up of real biological neurons, or an artificial neural network,
for solving artificial intelligence (AI) problems. The connections of the biological neuron are
modeled as weights. A positive weight reflects an excitatory connection, while negative values
mean inhibitory connections. All inputs are modified by a weight and summed. This activity is
referred to as a linear combination. Finally, an activation function controls the amplitude of the
output. For example, an acceptable range of output is usually between 0 and 1, or it could be −1
and 1.

These artificial networks may be used for predictive modeling, adaptive control, and
applications where they can be trained via a dataset. Self-learning resulting from experience can
occur within networks, which can derive conclusions from a complex and seemingly unrelated
set of information.

Fig.3.1: A simple Neural Network

A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers
between the input and output layers. The DNN finds the correct mathematical manipulation to
turn the input into the output, whether it be a linear or a non-linear relationship.

3.2 Convolutional Neural Network:
In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of
deep, feed-forward artificial neural networks, most commonly applied to analyzing visual
imagery. Convolutional networks were inspired by biological processes in that the connectivity
pattern between neurons resembles the organization of the visual cortex. CNNs use relatively
little pre-processing compared to other image classification algorithms. A CNN is a special kind
of multi-layer NN applied to 2-D arrays (usually images), based on spatially localized neural
input. CNNs generate 'patterns of patterns' for pattern recognition, and each layer combines
patches from previous layers.

Fig.3.2: Basic structure of a CNN, where C1, C3 are convolution layers and S2, S4 are
pooled/sampled layers.

Convolutional networks are trainable multistage architectures composed of multiple
stages. The input and output of each stage are sets of arrays called feature maps. At the output,
each feature map represents a particular feature extracted at all locations on the input. Each
stage is composed of a filter bank layer, a non-linearity layer, and a feature pooling layer. A
ConvNet is composed of 1, 2 or 3 such 3-layer stages, followed by a classification module.

Filter: A trainable filter (kernel) in the filter bank connects an input feature map to an output
feature map.

Convolutional layers apply a convolution operation to the input, passing the result to the
next layer. The convolution emulates the response of an individual neuron to visual stimuli.

Pooling: Convolutional networks may include local or global pooling layers, which combine
the outputs of neuron clusters at one layer into a single neuron in the next layer.

Max pooling uses the maximum value from each cluster of neurons at the prior layer;
average pooling uses the average value from each cluster of neurons at the prior layer.

Fig.3.3: Example for a CNN network

To train and test deep learning CNN models, each input image is passed through a
series of convolution layers with filters (kernels), pooling layers, and fully connected (FC)
layers, and then a function is applied to classify the object with probabilistic values between 0 and 1.

Fig.3.4: Complete flow of a CNN that processes an input image and classifies the objects
based on values

3.3 Convolution Layer:
Convolution is the first layer used to extract features from an input image. Convolution
extracts features using small squares of input data. It is a mathematical operation that takes
two inputs, such as an image matrix and a filter or kernel:

 An image matrix of dimension (h x w x d)

 A filter of dimension (fh x fw x d)

 An output volume of dimension (h - fh + 1) x (w - fw + 1) x 1

Consider a 5 x 5 image whose pixel values are 0 or 1 and a 3 x 3 filter matrix, as shown below.

Fig.3.5: Convolution with a filter example

Then the 5 x 5 image matrix is convolved with the 3 x 3 filter matrix, and the output is called
the "Feature Map", as shown below.

Fig.3.6: Output of Convolution layer
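The feature-map computation can be sketched in a few lines of NumPy (the 5x5 image and 3x3 filter below are illustrative values, and the sliding-window operation is the cross-correlation commonly implemented by CNN frameworks):

import numpy as np

def conv2d_valid(image, kernel):
    # 'Valid' convolution: output size is (h - fh + 1) x (w - fw + 1), as stated above.
    h, w = image.shape
    fh, fw = kernel.shape
    out = np.zeros((h - fh + 1, w - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + fh, j:j + fw] * kernel)
    return out

img = np.array([[1, 1, 1, 0, 0],
                [0, 1, 1, 1, 0],
                [0, 0, 1, 1, 1],
                [0, 0, 1, 1, 0],
                [0, 1, 1, 0, 0]])
flt = np.array([[1, 0, 1],
                [0, 1, 0],
                [1, 0, 1]])
print(conv2d_valid(img, flt))   # the 3x3 feature map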

Pooling Layer:
Pooling layers reduce the number of parameters when the images are too large. Spatial
pooling (also called subsampling or downsampling) reduces the dimensionality of each
feature map but retains the important information.

Max pooling takes the largest element from the rectified feature map. Taking the average
instead is called average pooling, and taking the sum of all elements in the feature map is
called sum pooling.
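A minimal sketch of 2x2 max pooling with stride 2 (window size, stride, and the sample feature map are assumed values):

import numpy as np

def max_pool(fmap, size=2, stride=2):
    # Keep only the largest value in each size x size window.
    h, w = fmap.shape
    out_h, out_w = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = fmap[i * stride:i * stride + size,
                             j * stride:j * stride + size].max()
    return out

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 0],
                 [1, 8, 3, 4]])
print(max_pool(fmap))   # [[6. 4.] [8. 9.]]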

Fig.3.7: Max pooling

Fully Connected Layer:

In the layer we call the FC layer, we flatten our matrix into a vector and feed it into a fully
connected layer, like a conventional neural network.

Figure-3.8: fully connected layer

In the above diagram, the feature map matrix is converted into a vector (x1, x2, x3, ...). With
the fully connected layers, we combine these features together to create a model. Finally, we
have an activation function to classify the outputs.

Fig.3.9: Overall view of CNN structure

3.4 Advantages:
In terms of performance, CNNs outperform NNs on conventional image recognition tasks
and many other tasks. For a completely new task or problem, CNNs are very good feature
extractors. This means that you can extract useful attributes from an already trained CNN,
with its trained weights, by feeding your data through each level and tuning the CNN a bit
for the specific task. The usage of CNNs is motivated by the fact that they can capture or
learn relevant features from an image or video at different levels, similar to a human brain.

CHAPTER-4
SUPER RESOLUTION IMAGING USING DEEP CNN

4. Super Resolution Imaging Using Deep CNN:
Single image super-resolution aims at recovering a high-resolution image from a
single low-resolution image. We consider a convolutional neural network that directly learns an
end-to-end mapping between low- and high-resolution images. Our method differs
fundamentally from existing external example-based approaches, in that ours does not explicitly
learn the dictionaries or manifolds for modeling the patch space. These are implicitly achieved
via hidden layers. Furthermore, the patch extraction and aggregation are also formulated as
convolutional layers, so they are involved in the optimization. In our method, the entire SR
pipeline is fully obtained through learning, with little pre/post-processing.[1]

4.1 The SRCNN has several appealing properties:

1. First, its structure is intentionally designed with simplicity in mind, and yet it provides
superior accuracy compared with state-of-the-art example-based methods.

2. With moderate numbers of filters and layers, our method achieves fast speed. Our method is
faster than a number of example-based methods, because it is fully feed-forward and does not
need to solve any optimization problem at usage time.

Overall, the contributions of this study are mainly in two aspects:

1) We present a fully convolutional neural network for image super-resolution. The


network directly learns an end-to-end mapping between low and high-resolution images,
with little pre/post processing beyond the optimization.

2) We demonstrate that deep learning is useful in the classical computer vision problem
of super resolution, and can achieve good quality and speed.

Firstly, we improve the SRCNN by introducing a larger filter size in the non-linear mapping
layer, and explore deeper structures by adding non-linear mapping layers. Secondly, we extend
the SRCNN to process three color channels (either in YCbCr or RGB color space) simultaneously.

According to the image priors, single-image super resolution algorithms can be categorized into
four types – prediction models, edge based methods, image statistical methods and patch based
(or example-based) methods.

The internal example-based methods exploit the self-similarity property and generate exemplar
patches from the input image. This was first proposed in Glasner's work, and several improved
variants have been proposed to accelerate the implementation. The external example-based
methods learn a mapping between low/high-resolution patches from external datasets. These
studies vary on how to learn a compact dictionary or manifold space to relate
low/high-resolution patches, and on how representation schemes can be conducted in such
spaces. In the pioneering work of Freeman et al., the dictionaries are directly presented as
low/high-resolution patch pairs, and the nearest neighbour (NN) of the input patch is found in
the low-resolution space, with its corresponding high-resolution patch used for reconstruction.

The majority of SR algorithms focus on gray-scale or single-channel image super-resolution. For


color images, the aforementioned methods first transform the problem to a different color space
(YCbCr or YUV), and SR is applied only on the luminance channel. There are also works
attempting to super-resolve all channels simultaneously.

Convolutional neural networks (CNNs) date back decades, and deep CNNs have recently shown
an explosive popularity, partially due to their success in image classification. They have also been
successfully applied to other computer vision fields, such as object detection, face recognition,
and pedestrian detection.

Several factors are of central importance in this progress:

(i) the efficient training implementation on modern powerful GPUs

(ii) the proposal of the Rectified Linear Unit (ReLU), which makes convergence much
faster while still presenting good quality; and

(iii) the easy access to an abundance of data for training larger models. Our method also
benefits from this progress.

The convolutional neural network is applied for natural image de-noising and removing
noisy patterns (dirt/rain). These restoration problems are more or less de-noising-driven.

Given a single low-resolution image, we first upscale it to the desired size using
bicubic interpolation, which is the only pre-processing we perform. Let us denote the
interpolated image as Y. Our goal is to recover from Y an image F(Y) that is as similar as
possible to the ground-truth high-resolution image X. For ease of presentation, we still
call Y a "low-resolution" image, although it has the same size as X. We wish to learn a
mapping F, which conceptually consists of three operations:

1) Patch extraction and representation: This operation extracts (overlapping)
patches from the low-resolution image Y and represents each patch as a high-dimensional
vector. These vectors comprise a set of feature maps, of which the number equals the
dimensionality of the vectors.[1]

2) Non-linear mapping: This operation nonlinearly maps each high-dimensional


vector onto another high-dimensional vector. Each mapped vector is conceptually the
representation of a high-resolution patch. These vectors comprise another set of feature
maps.[1]

3) Reconstruction: This operation aggregates the above high-resolution patch-wise
representations to generate the final high-resolution image. This image is expected to be
similar to the ground truth X.

We will show that all of these operations form a convolutional neural network. An overview
of the network is depicted in the figure below. Next, we detail our definition of each
operation.

Fig.4.1: Overall view of CNN process

4.2 Patch extraction and representation:
A popular strategy in image restoration is to densely extract patches and then represent
them by a set of pre-trained bases. This is equivalent to convolving the image by a set of
filters, each of which is a basis.

In our formulation, we involve the optimization of these bases into the optimization of the
network. Formally, our first layer is expressed as an operation F1:

F1(Y) = max(0, W1 ∗ Y + B1)

where W1 and B1 represent the filters and biases respectively, and '∗' denotes the
convolution operation.

Here, W1 corresponds to n1 filters of support c × f1 × f1, where c is the number of channels in
the input image and f1 is the spatial size of a filter.

Fig.4.2: Patch extraction

4.3 Non-linear mapping:
The first layer extracts an n1-dimensional feature for each patch. In the second operation,
we map each of these n1-dimensional vectors into an n2-dimensional one. This is
equivalent to applying n2 filters which have a trivial spatial support of 1 × 1.

This interpretation is only valid for 1 × 1 filters. But it is easy to generalize to larger filters
like 3 × 3 or 5 × 5. In that case, the non-linear mapping is not on a patch of the input
image; instead, it is on a 3 × 3 or 5 × 5 patch of the feature map.

The operation of the second layer is:

F2(Y) = max(0, W2 ∗ F1(Y) + B2)

Here W2 contains n2 filters of size n1 × f2 × f2, and B2 is n2-dimensional.

Each of the output n2-dimensional vectors is conceptually a representation of a
high-resolution patch that will be used for reconstruction. It is possible to add more
convolutional layers to increase the non-linearity. But this can increase the complexity of
the model (n2 × f2 × f2 × n2 parameters for one layer), and thus demands more training time.

4.4 Reconstruction:
In the traditional methods, the predicted overlapping high-resolution patches are often
averaged to produce the final full image.

The averaging can be considered as a pre-defined filter on a set of feature maps (where
each position is the flattened vector form of a high-resolution patch).

Here, we define a convolutional layer to produce the final high-resolution image:

F(Y) = W3 ∗ F2(Y) + B3

Here W3 corresponds to c filters of size n2 × f3 × f3, and B3 is a c-dimensional vector.

If the representations of the high-resolution patches are in the image domain (we can
simply reshape each representation to form the patch), we expect the filters to act like an
averaging filter. If the representations are in some other domain (like coefficients in terms of
some bases), we expect W3 to behave like first projecting the coefficients onto the image
domain and then averaging. Either way, W3 is a set of linear filters.

Interestingly, although the above three operations are motivated by different intuitions,
they all lead to the same form as a convolutional layer. We put all three operations
together and form a convolutional neural network (see the figure). In this model, all the
filtering weights and biases are to be optimized. Despite the succinctness of the overall
structure, our SRCNN model is carefully developed by drawing on extensive experience
resulting from significant progress in super-resolution.
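Putting the three operations together, below is a minimal PyTorch sketch of the SRCNN (the 9-1-5 filter sizes and the n1 = 64, n2 = 32 filter counts follow the published SRCNN; padding is added here only to keep the output size, whereas the network described above uses no padding and produces a smaller output):

import torch
import torch.nn as nn

class SRCNN(nn.Module):
    # Patch extraction -> non-linear mapping -> reconstruction
    def __init__(self, channels=1, n1=64, n2=32):
        super().__init__()
        self.patch_extraction = nn.Conv2d(channels, n1, kernel_size=9, padding=4)  # f1 = 9
        self.nonlinear_mapping = nn.Conv2d(n1, n2, kernel_size=1)                  # f2 = 1
        self.reconstruction = nn.Conv2d(n2, channels, kernel_size=5, padding=2)    # f3 = 5
        self.relu = nn.ReLU()

    def forward(self, y):
        # y is the bicubic-upscaled LR image; the output approximates the HR image X.
        f1 = self.relu(self.patch_extraction(y))     # F1(Y) = max(0, W1 * Y + B1)
        f2 = self.relu(self.nonlinear_mapping(f1))   # F2(Y) = max(0, W2 * F1(Y) + B2)
        return self.reconstruction(f2)               # F(Y) = W3 * F2(Y) + B3

model = SRCNN()
y = torch.randn(1, 1, 33, 33)    # one 33x33 luminance sub-image
print(model(y).shape)            # torch.Size([1, 1, 33, 33])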

4.5 Training:
Learning the end-to-end mapping function F requires the estimation of the network parameters
Θ = {W1, W2, W3, B1, B2, B3}. This is achieved by minimizing the loss between the
reconstructed images F(Y; Θ) and the corresponding ground-truth high-resolution images X.
Given a set of high-resolution images {Xi} and their corresponding low-resolution images
{Yi}, we use the Mean Squared Error (MSE) as the loss function:

L(Θ) = (1/n) Σi ||F(Yi; Θ) − Xi||²

where n is the number of training samples. Using MSE as the loss function favors a high
PSNR. The PSNR is a widely used metric for quantitatively evaluating image restoration
quality, and is at least partially related to the perceptual quality. It is worth noting that
convolutional neural networks do not preclude the usage of other kinds of loss functions, as
long as the loss functions are differentiable. If a better perceptually motivated metric is given
during training, it is flexible for the network to adapt to that metric. On the contrary, such
flexibility is in general difficult to achieve for traditional "handcrafted" methods. Despite the
fact that the proposed model is trained favoring a high PSNR, we still observe satisfactory
performance.

The loss is minimized using stochastic gradient descent with standard backpropagation.
In particular, the weight matrices are updated as

Δ_{i+1} = 0.9 · Δ_i − η · (∂L/∂W_i^ℓ),   W_{i+1}^ℓ = W_i^ℓ + Δ_{i+1}

where ℓ ∈ {1, 2, 3} and i are the indices of layers and iterations, and η is the learning rate.
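Continuing from the SRCNN sketch above, a hedged PyTorch version of this objective and update rule (the dataloader of (Yi, Xi) pairs is an assumption; the per-layer learning rates follow the values quoted below, and momentum 0.9 mirrors the update equation):

import torch.nn as nn
import torch.optim as optim

criterion = nn.MSELoss()   # the MSE loss L(Θ), which favors a high PSNR

# A smaller learning rate for the last layer, as recommended in the training notes below.
optimizer = optim.SGD([
    {"params": model.patch_extraction.parameters(), "lr": 1e-4},
    {"params": model.nonlinear_mapping.parameters(), "lr": 1e-4},
    {"params": model.reconstruction.parameters(), "lr": 1e-5},
], momentum=0.9)

for lr_batch, hr_batch in dataloader:            # assumed DataLoader of (Yi, Xi) pairs
    optimizer.zero_grad()
    loss = criterion(model(lr_batch), hr_batch)  # L(Θ) on this mini-batch
    loss.backward()                              # back-propagate the gradients
    optimizer.step()                             # SGD weight update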

The filter weights of each layer are initialized by drawing randomly from a Gaussian
distribution with zero mean and standard deviation 0.001 (and 0 for biases). The learning rate
is 10−4 for the first two layers, and 10−5 for the last layer. We empirically find that a smaller
learning rate in the last layer is important for the network to converge (similar to the
denoising case [22]). In the training phase, the ground truth images {Xi} are prepared as
fsub×fsub×c-pixel sub-images randomly cropped from the training images. By “sub-images”
we mean these samples are treated as small “images” rather than “patches”, in the sense that
“patches” are overlapping and require some averaging as post-processing but “sub-images”
need not. To synthesize the low-resolution samples {Yi}, we blur a sub-image by a Gaussian
kernel, sub-sample it by the upscaling factor, and upscale it by the same factor via bicubic
interpolation. To avoid border effects during training, all the convolutional layers have no
padding, and the network produces a smaller output ((fsub − f1 − f2 − f3 + 3)^2 × c). The
MSE loss function is evaluated only by the difference between the central pixels of Xi and
the network output. Although we use a fixed image size in training, the convolutional neural
network can be applied on images of arbitrary sizes during testing.
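The sub-image preparation described above can be sketched as follows (the Gaussian sigma, the scale factor, and the use of cv2.resize for the sub-sampling step are assumptions):

import cv2

def make_lr_sample(hr_sub, scale=3, sigma=1.0):
    # Blur by a Gaussian kernel, sub-sample by the upscaling factor,
    # then upscale back by the same factor via bicubic interpolation.
    h, w = hr_sub.shape[:2]
    blurred = cv2.GaussianBlur(hr_sub, (0, 0), sigma)
    small = cv2.resize(blurred, (w // scale, h // scale), interpolation=cv2.INTER_AREA)
    return cv2.resize(small, (w, h), interpolation=cv2.INTER_CUBIC)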

Different channels:

• Y only: this is our baseline method, which is a single-channel (c = 1) network trained only
on the luminance channel. The Cb, Cr channels are upscaled using bicubic interpolation.

• YCbCr: training is performed on the three channels of the YCbCr space.

• Y pre-train: first, to guarantee the performance on the Y channel, we only use the MSE of
the Y channel as the loss to pre-train the network. Then we employ the MSE of all channels
to fine-tune the parameters.

• CbCr pre-train: we use the MSE of the Cb, Cr channels as the loss to pre-train the network,
then fine-tune the parameters on all channels.

• RGB: training is performed on the three channels of the RGB space.

4.6 Training a Network:
A supervised neural network, at the highest and simplest level of abstraction, can be
presented as a black box with two methods, learn and predict, as follows:

Fig.4.3: CNN as a black box

The learning process takes the inputs and the desired outputs and updates its internal state
accordingly, so the calculated output gets as close as possible to the desired output. The
predict process takes an input and generates an output using the internal state.

Take for instance the dataset shown above. For this example, it might seem very obvious
that output = 2 × input; however, this is not the case for most real datasets, where the
relationship between the input and output is highly non-linear.

CHAPTER-5

BACK PROPAGATION

5. BACK PROPAGATION

5.1: Model initialization:


Let us assume that the desired relation between output Y and input X is Y=2.X

Now, we need to optimize this function using a neural network.

We are exploring which model of the generic form Y=W.X can best fit the current
dataset, where W is called the weight of the network and can be initialized randomly.

5.2: Forward propagate:


The natural step after initializing the model at random is to check its performance.
We start from the inputs we have, pass them through the network layer, and calculate
the actual output of the model straightforwardly.

This step is called forward-propagation, because the calculation flow is going in the
natural forward direction from the input -> through the neural network -> to the output.

5.3: Loss function:


At this stage, on one hand, we have the actual output of the randomly initialized neural
network. On the other hand, we have the desired output we would like the network to learn.

In order to be able to generalize to any problem, we define what we call a loss function.
Basically, it is a performance metric on how well the NN manages to reach its goal of
generating outputs as close as possible to the desired values.

The most intuitive loss function is simply loss = (desired output − actual output).
However, this loss function returns positive values when the network undershoots
(prediction < desired output), and negative values when the network overshoots
(prediction > desired output). If we want the loss function to reflect an absolute error on the
performance, regardless of whether it is overshooting or undershooting, we can define it as:

loss = |desired − actual|

The error function: E = ∑ (desired − actual)²

However, several situations can lead to the same total sum of errors: for instance, a lot of
small errors or a few big errors can sum up to exactly the same total amount of error. Since
we would like the prediction to work under any situation, it is preferable to have a
distribution of many small errors rather than a few big ones.
In order to encourage the NN to converge to such a situation, we can define the loss
function to be the sum of squares of the absolute errors (which is the most famous loss
function in NNs). This way, small errors are counted much less than large errors.

As a summary, the loss function is an error metric, that gives an indicator on how much
precision we lose, if we replace the real desired output by the actual output generated by
our trained neural network model. That’s why it’s called loss.

5.4: Differentiation:
Obviously, we can use any optimization technique that modifies the internal weights of
neural networks in order to minimize the total loss function that we previously defined. These
techniques can include genetic algorithms, greedy search, or even a simple brute-force search.

There is a powerful concept in mathematics that can guide us how to optimise the weights
called differentiation. Basically it deals with the derivative of the loss function. In mathematics,
the derivative of a function at a certain point, gives the rate or the speed of which this function is
changing its values at this point.

In order to see the effect of the derivative, we ask how much the total error will change if we
change the internal weight of the neural network by a certain small value δW.

But what we really care about is the rate at which the error changes relative to the
changes in the weight.

We could have guessed this rate by calculating directly the derivative of the loss function.
The advantage of using the mathematical derivative is that it is much faster and more precise to
calculate (less floating point precision problems).

Derivatives represent a slope on a curve. Also, the derivative measures the steepness of
the graph of a function at some particular point on the graph. In computational networks, the
activation function of a node defines the output of that node given an input or set of inputs.

When constructing Artificial Neural Network (ANN) models, one of the key
considerations is selecting activation functions for the hidden and output layers that are
differentiable. This is because the back-propagated error used to determine the ANN
parameter updates requires the gradient of the activation function when updating the layers.

Here is what our loss function looks like:

 If w=2, we have a loss of 0, since the neural network actual output will fit perfectly the training
set.
 If w<2, we have a positive loss function, but the derivative is negative, meaning that an
increase of weight will decrease the loss function.
 At w=2, the loss is 0 and the derivative is 0, we reached a perfect model, nothing is needed.
 If w>2, the loss becomes positive again, but the derivative is also positive, meaning that any
further increase in the weight will increase the loss even more.

Let’s check the derivative.


- If it is positive, meaning the error increases if we increase the weights, then we should
decrease the weight.
- If it’s negative, meaning the error decreases if we increase the weights, then we should
increase the weight.
- If it’s 0, we do nothing, we reach our stable point.
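These cases can be verified numerically for the toy model Y = w.X with target weight 2 (a sketch; the sample inputs and the finite-difference step are assumed values):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
desired = 2.0 * x                 # the relation we want the network to learn

def loss(w):
    return np.sum((desired - w * x) ** 2)        # E = sum of squared errors

def dloss_dw(w):
    return np.sum(-2.0 * x * (desired - w * x))  # analytic derivative dE/dw

w, dw = 1.5, 1e-6
numeric = (loss(w + dw) - loss(w)) / dw          # finite-difference estimate
print(numeric, dloss_dw(w))   # both ~ -30: negative for w < 2, so we should increase w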

Fig.5.1: Weight-Error function plot (y-axis: error function; x-axis: weight w)

5.5: Back-propagation:
In this example, we used only one layer inside the neural network between the inputs and
the outputs. In many cases, more layers are needed in order to reach more variations in the
functionality of the neural network.
For sure, we can always create one complicated function that represents the composition
over the whole layers of the network. For instance, if layer 1 is doing: 3.x to generate a hidden
output z, and layer 2 is doing: z² to generate the final output, the composed network will be
doing (3.x)² = 9.x². However in most cases composing the functions is very hard. Plus for
every composition one has to calculate the dedicated derivative of the composition (which is
not at all scalable and very error prone).

In order to solve this problem, luckily for us, the derivative is decomposable and thus can be
back-propagated.
We have the starting point of errors, which is the loss function, and we know how to
derivate it; if we know how to derivate each function in the composition, we can propagate
the error back from the end to the start.

Back propagation is shorthand for "the backward propagation of errors," since an error is
computed at the output and distributed backwards throughout the network's layers. It is
commonly used to train deep neural networks, a term referring to neural networks with more
than one hidden layer.

Input -> Forward calls -> Loss function -> derivative -> backpropagation of errors.

5.6: Weight update:
For real-life problems we shouldn't update the weights with such big steps. Since there are
a lot of non-linearities, any big change in weights will lead to chaotic behaviour. We
should not forget that the derivative is only local at the point where we are calculating the
derivative.

New weight = old weight − derivative rate × learning rate

The learning rate is introduced as a constant (usually very small), in order to force the
weight to get updated very smoothly and slowly.

In order to validate this equation:

If the derivative rate is positive, it means that an increase in weight will increase the
error, thus the new weight should be smaller.

If the derivative rate is negative, it means that an increase in weight will decrease the
error, thus we need to increase the weights.

If the derivative is 0, it means that we are in a stable minimum. Thus, no update on the
weights is needed -> we reached a stable state.

5.7: Iterate until convergence:

Since we update the weights with a small delta step at a time, it will take several
iterations in order to learn.
This is very similar to genetic algorithms, where after each generation we apply a
small mutation rate and the fittest survive.
In a neural network, after each iteration, the gradient descent force updates the
weights towards a smaller and smaller global loss function.
The similarity is that the delta rule acts as a mutation operator, and the loss
function acts as a fitness function to minimize.
The difference is that in genetic algorithms, the mutation is blind. Some mutations
are bad, some are good, but the good ones have a higher chance to survive. The weight
updates in a NN are, however, smarter, since they are guided by the decreasing gradient
force over the error.
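Tying sections 5.1 to 5.7 together, a toy gradient-descent loop for the Y = 2.X example (the initial weight, learning rate, and iteration count are assumed values):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
desired = 2.0 * x

w = np.random.randn()    # 5.1 model initialization: a random weight
lr = 0.01                # a small learning rate for smooth, slow updates

for step in range(100):                              # 5.7 iterate until convergence
    actual = w * x                                   # 5.2 forward propagation
    error = np.sum((desired - actual) ** 2)          # 5.3 loss: sum of squared errors
    grad = np.sum(-2.0 * x * (desired - actual))     # 5.4/5.5 derivative of the loss
    w = w - lr * grad                                # 5.6 weight update rule
print(w, error)          # w converges toward 2 and the error toward 0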

Figure-5.2: Block diagram of back propagation

The Proposed Cascaded CNN Framework:

We aim to learn a cascaded CNN model that discerns the statistical relations between the
hazy image and the corresponding medium transmission and global atmospheric light. The
specific design of our cascaded CNN is presented in Figure 2 for a clear explanation.
In Figure 2, the cascaded CNN includes three parts: the shared hidden layers part, which
extracts common features for the subsequent sub-networks; the global atmospheric light
estimation sub-network, which takes the outputs of the shared hidden layers part as inputs to
map the global atmospheric light; and the medium transmission estimation sub-network, which
takes the outputs of the shared hidden layers part as inputs to map the medium transmission.
With such a network architecture, our cascaded CNN can predict the global atmospheric light
and the medium transmission simultaneously.
The shared hidden layers part includes 4 convolutional layers with filter size fi × fi × ni =
3 × 3 × 16, each followed by a ReLU nonlinearity. Here, fi is the spatial support of a filter and
ni is the number of filters. Since we found that the task of global atmospheric light estimation is
easy for a CNN, we employ a light-weight CNN architecture for the global atmospheric light
estimation sub-network. Specifically, the global atmospheric light estimation sub-network
includes 4 convolutional layers with filter size 3 × 3 × 8, each followed by a ReLU nonlinearity,
except for the last one. The medium transmission estimation sub-network architecture is
inspired by the densely connected network, which stacks early layers at the end of each block;
this strengthens feature propagation and alleviates the vanishing-gradient problem. Specifically,
the medium transmission estimation sub-network includes 7 convolutional layers with filter size
3 × 3 × 16, each followed by a ReLU nonlinearity, except for the last one. The network
parameter settings will be discussed in Section IV. Next, we describe the loss functions used in
the cascaded CNN optimization.
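A hedged PyTorch skeleton of this cascaded architecture (the layer counts and filter sizes follow the description above; the padding, the output heads, and the omission of the dense connections are simplifying assumptions):

import torch
import torch.nn as nn

def conv_stack(n_layers, in_ch, out_ch, relu_on_last=True):
    # n_layers 3x3 convolutions, each followed by ReLU except (optionally) the last.
    mods, ch = [], in_ch
    for i in range(n_layers):
        mods.append(nn.Conv2d(ch, out_ch, kernel_size=3, padding=1))
        if relu_on_last or i < n_layers - 1:
            mods.append(nn.ReLU())
        ch = out_ch
    return nn.Sequential(*mods)

class CascadedCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.shared = conv_stack(4, 3, 16)                            # shared hidden layers
        self.atmospheric = conv_stack(4, 16, 8, relu_on_last=False)   # light-weight sub-network
        self.transmission = conv_stack(7, 16, 16, relu_on_last=False) # deeper sub-network

    def forward(self, hazy):
        feats = self.shared(hazy)   # common features for both sub-networks
        return self.atmospheric(feats), self.transmission(feats)

a_light, t_map = CascadedCNN()(torch.randn(1, 3, 64, 64))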

RESULTS:
Fig-A: Input image

PSNR for SRCNN Reconstruction: 51.033937 dB

Fig-B: CNN output

Fig-C: Input image

PSNR for SRCNN Reconstruction: 59.099285 dB

Fig-D: CNN output

Fig-E: Input image

PSNR for SRCNN Reconstruction: 55.7862 dB

Fig-F: CNN output

CONCLUSION:
In most applications, higher resolution is desired to obtain detailed information from the
captured image. This can be achieved either with better image sensors or with advanced optics,
but both have cost implications and hardware limitations. We can opt for signal processing,
especially image processing, to overcome these limitations of the sensors and the optics
manufacturing technology. An economical and effective solution for obtaining an HR image
from low-resolution images is the use of image super-resolution algorithms. The
super-resolution methods combine the details of multiple low-quality images to extract
information about the unknown pixels of the high-quality image. Thus, the fusion of
information from multiple LR images makes the reconstruction of an HR image easier
compared to using a single LR image. This is because multiple LR images contain different
information, and this additional information can be exploited to obtain an HR image. It
therefore remains a challenging task to reconstruct an HR image from a single LR image,
which has less information at hand. The proposed work is primarily concentrated on
single-image super-resolution.

REFERENCES:
[1] Hassan Aftab, "A New Single Image Interpolation Technique for Super Resolution," Department of Avionics Engineering, National University of Sciences and Technology, Pakistan.

[2] T. Blu, P. Thevenaz and M. Unser, "Linear interpolation revitalized," IEEE Transactions on Image Processing, vol. 13, no. 5, pp. 710-719, May 2004.

[3] E. Maeland, "On the comparison of interpolation methods," IEEE Transactions on Medical Imaging, vol. 7, no. 3, pp. 213-217, Sept. 1988.

[4] Prachi R. Rajarapollu and Vijay R. Mankar, "Bicubic Interpolation Algorithm Implementation for Image Appearance Enhancement," Amaravati, Maharashtra, India.

[5] R. Matsuoka, M. Sone, N. Sudo and H. Yokotsuka, "Comparison of Image Interpolation Methods Applied to Least Squares Matching," 2008 International Conference on Computational Intelligence for Modelling, Control & Automation, Vienna, 2008.

[6] Jonathan Simpkins and Robert L. Stevenson, "An Introduction to Super-Resolution Imaging," University of Notre Dame, 275 Fitzpatrick Hall, Notre Dame, IN 46556, USA.

[7] Azade Mokari, "An adaptive single image method for super resolution," Electrical and Robotic Engineering Department, Shahrood University, Shahrood.

[8] Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang, "Image Super-Resolution Using Deep Convolutional Networks."
