
Image Processing in Medical Field with Correction in Image Blurring

A project report submitted in fulfilment of the requirements for the degree of Master of Technology

by

VIPUL VIJAY

DEPARTMENT OF ELECTRONICS AND COMMUNICATION
MALVIYA NATIONAL INSTITUTE OF TECHNOLOGY JAIPUR
Chapter 1
Introduction

Image processing means processing a digital image in order to obtain an enhanced image or to extract some useful information from it. The term digital image processing refers to the processing of a two-dimensional picture or image by a digital computer. It implies the processing of two-dimensional data: an array of real or complex numbers represented by a finite number of bits. An image is first digitized and stored as a matrix of binary values, and then processed or displayed on a monitor.
This field has a vast range of applications, such as medical processing, radar and sonar, satellite imaging, robotics, and automated inspection.
In the medical field the requirement is concerned with the processing of X-rays and ultrasonic scans; these images are sometimes used for detecting tumours or other diseases in human beings. Radar and sonar images are used to detect targets or in the guidance and manoeuvring of aircraft or missile systems. In satellite imaging, acquired images are used to track earth resources, for geographical mapping, for prediction of agricultural crops and natural calamities like floods and forest fires, and for weather forecasting. Images gathered in space are processed, recognised, and analysed to find any approaching asteroid or nearby galaxies. There are also many applications in robotics and in industry, such as controlling heavy machinery parts.
Several tasks are associated with image processing:
• Image representation and modelling
• Image enhancement
• Image restoration
• Image analysis
• Image data compression

Image representation
In this task one is concerned with the characteristics that each pixel represents. For example, an image could represent the luminance of objects in a scene, the absorption characteristics of body tissue obtained from X-ray imaging, or the temperature profile of a region. An important consideration in image representation is the fidelity criterion for measuring the quality of an image or the performance of a processing technique. Specification of such measures requires models of the perception of contrast, spatial frequencies, colour, and so on.
Image enhancement
In image enhancement, the goal is to accentuate certain image features for subsequent analysis or for image display. Examples include contrast and edge enhancement, pseudo-colouring, noise filtering, sharpening, and magnification. Image enhancement is useful in feature extraction, image analysis, and visual information display. Enhancement techniques such as contrast stretching map each grey level into another grey level by a predetermined transformation. An example is the histogram equalization method, where the input grey levels are mapped so that the output grey-level distribution is uniform.
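As a minimal illustration (not from the report), histogram equalization for an 8-bit grayscale image can be sketched in a few lines of Python: the grey levels are remapped through the normalized cumulative histogram.

```python
# Minimal sketch of histogram equalization for an 8-bit grayscale image.
import numpy as np

def equalize(img):                            # img: 2-D uint8 array
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum() / img.size            # normalized cumulative histogram
    lut = np.round(255 * cdf).astype(np.uint8)
    return lut[img]                           # map each grey level via the LUT
```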
Image restoration
Image restoration refers to the removal or minimization of known degradation in an image. This includes deblurring of images degraded by the limitations of a sensor or its environment, noise filtering, and correction of geometric distortions or non-linearities due to sensors.

Image analysis
Image analysis is concerned with making quantitative measurements from an image to produce a description of it. For example, the task may be reading a label on a grocery item or sorting different parts on an assembly line. Image analysis techniques require the extraction of certain features that aid in the identification of an object. Segmentation techniques are used to isolate the desired object from the scene so that measurements can subsequently be made on it.
Image data compression
The amount of data associated with visual information is so large that its storage would require enormous capacity. Image data compression techniques are therefore concerned with reducing the number of bits required to store or transmit images without any appreciable loss of information. Applications lie in the fields of broadcast television, remote sensing via satellite and aircraft, radar, sonar, etc.
Chapter 2
Literature survey

[1]
In this paper an image modelling and estimation algorithm is developed that can be interpreted as an approach to nonlocal adaptive nonparametric filtering. The proposed approach can be adapted to various noise models, such as additive coloured noise, non-Gaussian noise, etc., by modifying the calculation of the coefficients' variances in the basic and Wiener parts of the algorithm. In addition, the developed method can be modified for denoising 1-D signals and video, for image restoration, and for other problems that can benefit from highly sparse signal representations.
The enhancement of sparsity is achieved by grouping similar 2-D image fragments (e.g., blocks) into 3-D data arrays called "groups". Collaborative filtering is a special procedure developed to deal with these 3-D groups.

Collaborative filtering consists of three successive steps:


• 3-D transformation of a group,
• shrinkage of the transform spectrum,
• inverse 3-D transformation.

The result is a 3-D estimate that consists of the jointly filtered grouped image blocks. By attenuating the noise, collaborative filtering reveals even the finest details shared by the grouped blocks and, at the same time, preserves the essential unique features of each individual block. The filtered blocks are then returned to their original positions. Because these blocks overlap, each pixel obtains many different estimates which need to be combined. Aggregation is a particular averaging procedure exploited to take advantage of this redundancy. A simplified sketch of the three filtering steps is given below.
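The sketch below covers only the three steps for one group of stacked blocks; the threshold and the choice of an orthonormal 3-D DCT are illustrative assumptions, and the full BM3D algorithm additionally performs block matching, a Wiener-filtering pass, and weighted aggregation.

```python
# Highly simplified sketch of collaborative filtering for one group
# of similar blocks (threshold and transform choice are illustrative).
import numpy as np
from scipy.fft import dctn, idctn

def collaborative_filter(group, threshold=25.0):
    # group: 3-D array of stacked similar 2-D blocks, shape (n_blocks, B, B)
    spectrum = dctn(group, norm='ortho')          # 1) 3-D transform of the group
    spectrum[np.abs(spectrum) < threshold] = 0.0  # 2) hard-threshold shrinkage
    return idctn(spectrum, norm='ortho')          # 3) inverse 3-D transform
```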
Plenty of denoising methods exist, originating from various disciplines such as probability theory, statistics, partial differential equations, linear and nonlinear filtering, and spectral and multiresolution analysis. All of these methods rely on some explicit or implicit assumptions about the true (noise-free) signal in order to separate it properly from the random noise.

Multiresolution transforms can achieve good sparsity for spatially localized details, such as edges and singularities. Because such details are typically abundant in natural images and convey a significant portion of the information embedded therein, these transforms have found significant application in image denoising.

A group is a 3-D array formed by stacking together similar image neighbourhoods. If the neighbourhoods have the same shape and size, the formed 3-D array is a generalized cylinder. The importance of grouping is that it enables higher-dimensional filtering of each group, which exploits the potential similarity between grouped fragments in order to estimate the true signal in each of them. The authors denominate this approach collaborative filtering.

This collaborative filtering is essentially different because the model induced by hard-thresholding has low complexity only in relation to the group as a whole. For the block-wise estimates and for the image overall, the model can instead be highly complex and redundant, as each block can enter many groups and thus participate in many collaborative estimates. This redundancy gives very good noise attenuation and avoids the artefacts typical of standard thresholding schemes.
[2]
In this paper a novel linear colour attenuation prior is proposed, based on the difference between the brightness and the saturation of the pixels within the hazy image.

Single image haze removal has been a challenging problem due to its ill-
posed nature. This algorithm is a simple but powerful colour attenuation
prior for haze removal from a single input hazy image.

Outdoor images taken in bad weather usually lose contrast and fidelity, resulting from the fact that light is absorbed and scattered by turbid media such as particles and water droplets in the atmosphere during propagation. Moreover, most automatic systems, which strongly depend on the definition of the input images, fail to work normally on degraded images. Therefore, improving image haze removal techniques will benefit many image understanding and computer vision applications such as aerial imagery, image classification, image/video retrieval, remote sensing, and video analysis and recognition.

The dehazing effect of early single-image methods is limited, because a single hazy image can hardly provide much information. Later, researchers tried to improve dehazing performance with multiple images; polarization-based methods, for instance, perform dehazing with multiple images taken with different degrees of polarization.

One haze removal method maximizes the local contrast of the image based on a Markov Random Field; although this approach is able to achieve impressive results, it tends to produce over-saturated images. Fattal proposes to remove haze from colour images based on Independent Component Analysis, but that approach is time-consuming and cannot be used for grayscale image dehazing; furthermore, it has difficulty with dense-haze images. The proposed prior is instead inspired by the widely used dark-object subtraction technique and based on a large number of experiments on haze-free images.

In this algorithm, by learning the parameters of the linear model with a supervised learning method, the bridge between the hazy image and its corresponding depth map is built effectively. With the recovered depth information, the haze can easily be removed from a single hazy image (a sketch of the linear depth model follows the list below). Several factors are involved in dehazing:
• ATMOSPHERIC SCATTERING
• COLOR ATTENUATION
• SCENE DEPTH
• TRAINING DATA AND LEARNING STRATEGY
• SCENE RADIANCE
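As a hedged sketch, the prior ties scene depth to each pixel's brightness and saturation. The coefficients below are illustrative placeholders; the paper learns them by supervised regression on synthetic training data.

```python
# Sketch of the colour attenuation prior's linear depth model
# (theta0..theta2 are placeholder values, not the paper's learned ones).
import cv2
import numpy as np

def estimate_depth(img_bgr, theta0=0.12, theta1=0.96, theta2=-0.78):
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV).astype(np.float32) / 255.0
    s, v = hsv[..., 1], hsv[..., 2]   # saturation and brightness (value)
    # Under the prior, depth grows with brightness and falls with saturation.
    return theta0 + theta1 * v + theta2 * s
```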
[3]
In this paper the authors introduce an effective technique to enhance images captured underwater and degraded by scattering and absorption in the medium. Underwater imaging has been an important source of interest in different branches of technology and scientific research. Underwater images differ from common images in that they suffer from poor visibility resulting from attenuation of the propagated light, mainly due to absorption and scattering effects.
Absorption substantially reduces the light energy, while scattering changes the light propagation direction; because of this, the image appears foggy and distant objects become misty. There have been several attempts to restore and enhance the visibility of such degraded images. Since the deterioration of underwater scenes results from a combination of multiplicative and additive processes, traditional enhancing techniques such as gamma correction and histogram equalization appear strongly limited for this task. This paper introduces a novel approach to remove the haze in underwater images based on a single image captured with a conventional camera. The approach builds on the fusion of multiple inputs, but derives the two inputs to combine by correcting the contrast and by sharpening a white-balanced version of a single native input image. The white-balancing stage aims at removing the colour cast induced by underwater light scattering, so as to produce a natural appearance of the sub-sea images. The multi-scale implementation of the fusion process results in artefact-free blending. The paper also covers the basic principles underlying light propagation in water, and reviews the main approaches that have been considered to restore or enhance images captured under water.
Related work in this area:
• A divergent-beam underwater Lidar imaging (UWLI) system uses an optical/laser-sensing technique to capture turbid underwater images. Unfortunately, these complex acquisition systems are very expensive and power-consuming.
• A second class consists of polarization-based methods. These approaches use several images of the same scene captured with different degrees of polarization, as obtained by rotating a polarizing filter fixed to the camera.
• The Deep Photo system is able to restore images by employing existing georeferenced digital terrain and urban 3D models. Since this additional information (images and depth approximation) is generally not available, these methods are impractical for common users.
• A fourth class of methods exploits the similarities between light propagation in fog and under water. Recently, several single-image dehazing techniques have been introduced to restore images of outdoor foggy scenes.
• Recently, several algorithms that specifically restore underwater images based on the Dark Channel Prior (DCP) have been introduced. The DCP was initially proposed for dehazing outdoor scenes. It assumes that the radiance of an object in a natural scene is small in at least one of the colour components, and consequently defines regions of small transmission as those with a large minimal value across colours.

The image enhancement approach adopts a two-step strategy, combining white balancing and image fusion, to improve underwater images without resorting to explicit inversion of the optical model. White balancing aims at improving the image's appearance, primarily by removing the undesired colour casts due to various illumination or medium attenuation properties.

The fusion then combines two inputs derived from this white-balanced image, one contrast-corrected and one sharpened, and the multi-scale implementation of the fusion process results in artifact-free blending.
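As a small sketch of a possible white-balancing stage, a simple gray-world correction is shown below; the paper uses its own refined variant, so this is only an approximation of that step.

```python
# Minimal sketch of gray-world white balance: scale each channel so the
# average colour of the image becomes neutral grey.
import numpy as np

def gray_world_white_balance(img):            # img: float32 RGB in [0, 1]
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gain = channel_means.mean() / (channel_means + 1e-6)
    return np.clip(img * gain, 0.0, 1.0)      # per-channel gain correction
```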

In my view this technique is suitable for computer vision applications, given the utility and relevance of the proposed image enhancement technique for several challenging underwater computer vision tasks.
[4]
This paper presents, analyses, and compares ten state-of-the-art algorithms for 3D action recognition on six benchmark datasets. These algorithms cover the use of handcrafted and deep-learning features computed from depth and skeleton video sequences.

The Kinect camera was an attempt to broaden the 3D gaming experience of the Xbox 360's audience, and it can capture real-time RGB and depth videos.

Human action recognition methods using Kinect data can be classified into two categories, based on how the feature descriptors are extracted to represent the human actions.

The first category is handcrafted features. Action recognition methods using handcrafted features require two complex hand-designed stages, namely feature extraction and feature representation. The feature extraction stage may involve computing depth (and/or colour) gradients, histograms, and other more complex transformations of the video data.

The feature representation stage may involve simple concatenation of the feature components extracted in the previous stage, a more complex fusion step of these components, or even a machine learning technique, to get the final feature descriptor. Deep neural networks have also been used to extract high-level features from video sequences for many different applications, including 3D human action analysis.

Existing skeleton-based action recognition methods can be grouped into two categories: joint-based methods and body-part-based methods. Joint-based methods model the positions and motion of the joints (either individual or in combination) using the coordinates of the joints extracted by the OpenNI tracking framework. In body-part-based methods, the human body parts are used to model the human's articulated system; these body parts are usually modelled as rigid cylinders connected by joints. In addition, the variations of each joint's movement volume were incorporated into the global feature vector to form spatial-temporal joint features.
It is found that skeleton-based features are more robust than depth-based features for both cross-subject and cross-view action recognition. Handcrafted features performed better than deep-learning features on smaller datasets; however, deep-learning methods achieved very good results when trained on large datasets. While accuracy as high as 90% has been achieved by some algorithms on cross-subject action recognition, the average accuracy on cross-view action recognition is much lower.
[5]
In this paper the authors summarize the traditional approach to image quality assessment based on error sensitivity and propose the use of structural similarity as an alternative motivating principle for the design of image quality measures.
The objective here is assessing perceptual image quality. Traditional approaches attempted to quantify the visibility of errors between a distorted image and a reference image using a variety of known properties of the human visual system. In this paper the authors develop a Structural Similarity Index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.

Digital images are subject to a wide variety of distortions during acquisition, processing, compression, storage, transmission and reproduction, any of which may result in a degradation of visual quality. However, subjective evaluation is usually too inconvenient, time-consuming and expensive. The goal of research in objective image quality assessment is to develop quantitative measures. An objective image quality metric can play a variety of roles in image processing applications. First, it can be used to dynamically monitor and adjust image quality. Second, it can be used to optimize algorithms and parameter settings of image processing systems.
In one family of methods, the reference image is only partially available, in the form of a set of extracted features made available as side information to help evaluate the quality of the distorted image; this is referred to as reduced-reference quality assessment. This paper focuses on full-reference image quality assessment.

The simplest and most widely used full-reference quality metrics are the mean squared error (MSE), computed by averaging the squared intensity differences of distorted and reference image pixels, and the related peak signal-to-noise ratio (PSNR). These are appealing because they are simple to calculate, have clear physical meanings, and are mathematically convenient in the context of optimization, but they are not well matched to perceived visual quality.
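For reference, a minimal sketch of both metrics for 8-bit images (MAX = 255):

```python
# Minimal sketch of MSE and PSNR between a reference and a distorted image.
import numpy as np

def mse(ref, dist):
    return np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)

def psnr(ref, dist, max_val=255.0):
    m = mse(ref, dist)
    return float('inf') if m == 0 else 10.0 * np.log10(max_val ** 2 / m)
```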
Most perceptual quality assessment models can be described with a similar diagram, although they differ in detail. The stages of the diagram are as follows:
• Pre-processing
• CSF Filtering
• Channel Decomposition
• Error Normalization
• Error Pooling
A new framework for the design of image quality measures was
proposed, based on the assumption that the human visual system is
highly adapted to extract structural information from the viewing field. It
follows that a measure of structural information change can provide a
good approximation to perceived image distortion.
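A compact sketch of the SSIM index over a pair of aligned grayscale windows is shown below, using the standard constants from the paper (K1 = 0.01, K2 = 0.03, L = 255); the full method applies this within a sliding window and averages the results.

```python
# Sketch of the SSIM index between two equally sized grayscale windows.
import numpy as np

def ssim(x, y, L=255.0, K1=0.01, K2=0.03):
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2      # stabilizing constants
    x, y = x.astype(np.float64), y.astype(np.float64)
    mx, my = x.mean(), y.mean()                # local means
    vx, vy = x.var(), y.var()                  # local variances
    cov = ((x - mx) * (y - my)).mean()         # local covariance
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
```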

[Figure: diagram of the SSIM measurement system]
[6]
In this paper the authors propose a high-resolution multi-scale encoder-decoder network (HMEDN) to segment medical images, especially for challenging cases with blurry and vanishing boundaries caused by low tissue contrast. In this network, three kinds of pathways are integrated to extract meaningful features that capture accurate location and semantic information.

Medical image analysis develops methods for solving problems pertaining to medical images and their use in clinical care. Among these methods and applications, automatic image segmentation plays an important role in therapy planning.

The primary challenges for medical image segmentation lie mainly in three aspects.

(1) Complex boundary interactions: the main target organs of pelvic CT image segmentation are three adjacent soft tissues, i.e., the prostate, bladder, and rectum. Since these organs are adjacent to each other and their shapes and scales can change easily and significantly with different amounts of urine or bowel gas inside the organs, the boundary interactions of these organs can be complicated.

(2) Large appearance variation: the appearance of the main pelvic organs may change dramatically for cases with or without bowel gas, contrast agents, fiducial markers, and metal implants.

(3) Low tissue contrast: CT images, especially those from the pelvic area, have blurry and vanishing boundaries. This last challenge poses the most severe problem for image segmentation algorithms; compared with natural or MR images, CT images visibly lack rich and stable texture information.

The weak or even vanishing edges caused by low- and noisy-contrast acquisition make the actual boundaries of organs easily contaminated or even partially concealed by a large number of artefacts. As a consequence, a whole organ can be accidentally split into isolated parts of various sizes, while independent organs can be visually merged into one. The remaining clues to the correct location of boundaries can be trivial and vulnerable.
The main contributions of the paper are threefold:
1) It shows that popular encoder-decoder neural networks for low-contrast image segmentation lack a mechanism to locate touching, blurry, or vanishing boundaries accurately.
2) To solve this problem, a novel high-resolution multi-scale encoder-decoder network (HMEDN) with three different kinds of pathways and a difficulty-aware loss function is introduced.
3) Extensive experiments on CT, MR, and microscopic image datasets, on both semantic and instance segmentation tasks with 2D and 3D models, verify the effectiveness of the proposed network.

The High-Resolution Multi-Scale Encoder-Decoder Network (HMEDN) is designed for segmentation of low-contrast medical images. It works as follows: first, the distilling network carefully distils and preserves semantic information; then the high-resolution pathway, constructed from densely connected dilated convolution operations, exploits high-resolution semantic information; next, the task of contour regression is integrated with the task of organ segmentation for accurate boundary localization; finally, the network is forced to concentrate more on ambiguous boundary areas by a difficulty-guided cross-entropy loss function (sketched below).
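A hedged sketch of such a loss is given below; the weighting scheme, which scales per-pixel cross-entropy by a supplied difficulty map, is an illustrative assumption rather than the paper's exact definition.

```python
# Illustrative difficulty-guided cross-entropy: pixels flagged as hard
# (e.g., near ambiguous boundaries) receive larger weight.
import torch
import torch.nn.functional as F

def difficulty_guided_ce(logits, target, difficulty, alpha=1.0):
    # logits: (N, C, H, W); target: (N, H, W) long; difficulty: (N, H, W) in [0, 1]
    per_pixel = F.cross_entropy(logits, target, reduction='none')
    weights = 1.0 + alpha * difficulty        # boost hard (boundary) pixels
    return (weights * per_pixel).mean()
```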

Through the experiments, several observations are made:
(1) Skip connections, which are usually adopted in encoder-decoder networks, are not enough for detecting the blurry and vanishing boundaries in medical images.
(2) Finding a good balance between semantic feature resolution and network complexity is an important factor for segmentation performance.
[7]
In this paper the authors address the low-light image enhancement problem via a hybrid deep network. In the proposed deep model, the content stream is used to enhance the visibility of the low-light input and learn a holistic estimation of the scene content, while the edge stream network is devoted to refining the edge information using both the input and its gradients, based on an improved spatially variant RNN.

A recurrent neural network (RNN) forms the edge stream that models edge details, with the guidance of another auto-encoder. The experimental results show that the proposed network performs favourably against state-of-the-art low-light image enhancement algorithms.

Images captured in poorly lit environments are often of low visibility and affect many high-level computer vision tasks such as detection and recognition.

Previously, deep convolutional neural networks (CNNs) applied to image enhancement have learned a mapping between photos from mobile devices and a DSLR camera based on an end-to-end residual network. This model uses a perceptual error function that combines content, colour, and adversarial losses. Edge details are critical in image enhancement; however, this method does not particularly consider edge information when enhancing degraded inputs. In addition, bilateral grid processing has been embedded in a neural network for real-time image enhancement; however, that method requires producing affine coefficients before obtaining outputs, which lacks direct supervision from the targets. For computer vision tasks, the number of affine coefficients is usually very large, which becomes the performance and speed bottleneck.

To preserve image naturalness and generate more accurate enhancement results, the authors present an automatic low-light image enhancement method based on a hybrid neural network.

There are four key strategies:
1) brighten the input images with a content stream;
2) propose an edge stream network, combining a spatially variant RNN, to incorporate edge-aware feature maps and predict accurate image structures;
3) add a small level of Gaussian noise to the training data, so that the proposed model can suppress noise to some extent (see the sketch after this list);
4) further incorporate perceptual and adversarial losses to improve the visual quality of the enhanced results.

Taking these four strategies together allows effective low-light image enhancement.
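A minimal sketch of strategy 3, adding Gaussian noise to training images, is shown below; the noise level is an illustrative choice, not the paper's value.

```python
# Augment training images with a small amount of Gaussian noise so the
# model learns to suppress sensor noise (sigma is illustrative).
import numpy as np

def add_gaussian_noise(img, sigma=0.01, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    noisy = img + rng.normal(0.0, sigma, img.shape).astype(np.float32)
    return np.clip(noisy, 0.0, 1.0)            # img: float32 array in [0, 1]
```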
[8]
In this paper the authors propose an effective method to enhance low-light images. The main factor in image enhancement is how well the illumination map is estimated. The authors developed a structure-aware smoothing model to enhance illumination consistency, and created two algorithms: one obtains the exact optimal solution to the target problem, while the other solves an approximate problem with significant savings in time.

First method is based on the Retinex-based category, which intends to


enhance a low-light image by estimating its illumination map The
illumination map is first constructed by finding the maximum intensity of
each pixel in R, G and B channels. Then, exploit the structure of the
illumination to refine the illumination map.

This differs from intrinsic image decomposition, which attempts to decompose the input into two components: the goal there is to recover the reflectance component and the shading.

In the low-light image enhancement (LIME) method, more concretely, the illumination of each pixel is first estimated individually by finding the maximum value among the R, G, and B channels (a sketch of this initial estimate is given below).
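A hedged sketch of this initial estimate, and the Retinex-style division it enables, is given below; the gamma value is illustrative, and the structure-aware refinement of the map is omitted.

```python
# Sketch of LIME's initial illumination estimate: per-pixel maximum over
# the colour channels; the paper then refines this map with a
# structure-aware smoothing step not shown here.
import numpy as np

def initial_illumination(img):                 # img: float32 RGB in [0, 1]
    T = img.max(axis=2)                        # per-pixel max over R, G, B
    return np.clip(T, 1e-3, 1.0)               # avoid division by zero later

def enhance(img, gamma=0.8):
    T = initial_illumination(img) ** gamma     # gamma-adjusted illumination
    return np.clip(img / T[..., None], 0.0, 1.0)  # Retinex-style division
```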

The relative order of lightness represents the light source directions and the lightness variation; the naturalness of an enhanced image is related to the relative order of lightness in different local areas.

This low-light image enhancement (LIME) technique can feed many vision-based applications, such as edge detection, feature matching, and object recognition and tracking, with high-visibility inputs, and thus improve their performance.
[9]
In this paper the authors present a CNN with attention mechanism (ACNN) for facial expression recognition in the presence of occlusions. Facial expression recognition in the wild is challenging due to various unconstrained conditions. Although existing facial expression classifiers have been almost perfect at analysing constrained frontal faces, they fail to perform well on partially occluded faces, which are common in the wild.

The proposed convolutional neural network with attention mechanism (ACNN) can perceive the occluded regions of the face and focus on the most discriminative un-occluded regions.

ACNN is an end-to-end learning framework. It combines multiple representations from facial regions of interest (RoIs). Each representation is weighed via a proposed gate unit that computes an adaptive weight from the region itself according to its unobstructedness and importance. Considering different RoIs, the authors introduce two versions of ACNN: patch-based ACNN (pACNN) and global-local-based ACNN (gACNN). pACNN only pays attention to local facial patches, while gACNN integrates local representations at patch level with a global representation at image level.

Facial expression recognition (FER) has received significant interest from computer scientists and psychologists over recent decades, as it holds promise for an abundance of applications, such as human-computer interaction, affect analysis, and mental health assessment. Although many facial expression recognition systems have been proposed and implemented, the majority of them are built on images captured in controlled environments, such as CK+, MMI, Oulu-CASIA, and other lab-collected datasets. The controlled faces are frontal and without any occlusion. FER systems that perform perfectly on lab-collected datasets are likely to perform poorly when recognizing human expressions under natural, uncontrolled conditions. To fill the gap between FER accuracy on controlled and uncontrolled faces, researchers have made efforts to collect large-scale facial expression datasets in the wild. Despite the usage of data from the wild, facial expression recognition is still challenging due to the existence of partially occluded faces.
Here a convolutional neural network with attention mechanism (ACNN) is used for facial expression recognition with partial occlusions. To address the occlusion issue, ACNN endeavours to focus on different regions of the facial image and weighs each region according to its obstructedness (to what extent the patch is occluded) as well as its contribution to FER. For facial analysis tasks, occlusion is one of the biggest challenges in real-world facial expression recognition. Previous approaches that address facial occlusions can be classified into two categories: holistic-based and part-based methods.

Holistic-based approaches treat the face as a whole and do not explicitly divide it into sub-regions. One holistic way is to learn a generative model that can reconstruct a complete face from the occluded one; such generative methods rely on training data with varied occlusion conditions.

Part-based methods explicitly divide the face into several overlapping or non-overlapping segments. To determine the patches on the face, existing works either divide the facial image into several uniform parts, take the patches around the facial landmarks, obtain the patches by a sampling strategy, or explicitly detect the occluders. The part-based methods then detect and compensate for the missing part, re-weight the occluded and non-occluded patches differently, or ignore the occluded parts.

ACNNs differ from previous part-based or holistic-based methods in two ways. First, ACNNs need not explicitly handle occlusions, which avoids propagating detection/inpainting errors afterwards. Second, ACNNs unify representation learning and occlusion-pattern encoding in an end-to-end CNN.

The gate unit in ACNN enables the model to shift attention from the occluded patches to other unobstructed and distinctive facial regions. Considering that facial expression is distinguished in specific facial regions, the authors designed a patch-based pACNN that incorporates region decomposition to find typical facial parts related to expression. A sketch of such a gate unit follows.
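The hedged sketch below maps a region's feature vector to a scalar attention weight in (0, 1); the layer sizes are illustrative assumptions, not the paper's exact architecture.

```python
# Illustrative gate unit: a small network scores a facial region's
# feature vector, producing an adaptive weight in (0, 1).
import torch
import torch.nn as nn

class GateUnit(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 1), nn.Sigmoid())   # weight reflects unobstructedness

    def forward(self, region_feat):            # region_feat: (N, feat_dim)
        return self.fc(region_feat)            # (N, 1) attention weight
```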
[10]
In this paper the authors present a single-image SR algorithm based on a rational fractal interpolation model, which is more suitable for describing the structures of an image.

Interpolation-based methods estimate the unknown pixels in the HR grid by employing their known neighbours. Traditional interpolation algorithms, including bilinear and bicubic interpolation, are the most widely used methods in practice. However, the kernel functions used in these methods are isotropic and cannot fully reflect the intrinsic structures of images. Thus, these interpolation approaches are prone to producing zigzagging artefacts along edges and blurring details in textures. To compensate for the deficiencies of traditional methods, edge-directed interpolation methods have recently been proposed.
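For context, the bicubic baseline that such methods improve upon can be run in a single call; its isotropic kernel is precisely what causes the zigzag and blur artefacts noted above.

```python
# Bicubic upscaling baseline using OpenCV.
import cv2

def bicubic_upscale(lr_img, scale=2):
    h, w = lr_img.shape[:2]
    return cv2.resize(lr_img, (w * scale, h * scale),
                      interpolation=cv2.INTER_CUBIC)
```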

The steps involved in this algorithm are as follows. First, for each LR image patch, the isocline method is employed to detect texture, so that more detailed textures can be obtained, and the LR image is divided into texture and non-texture regions. Second, in the interpolation model the scaling factors play an important role, whereas the influence of the shape parameters is minor. Based on the relationship between the scaling factors and the fractal dimension, the scaling factors are accurately calculated from the image's local structure features.

A suitable range of shape parameters is then obtained using a number of training images. Rational fractal interpolation and rational interpolation are applied in the texture region and the non-texture region, respectively. Specifically, each LR image patch is first interpolated, and the interpolation is extended to the entire image by traversing each patch. Finally, an HR image is obtained by pixel mapping. Because the proposed rational fractal interpolation function is an IFS, the image can be amplified at any integral multiple by selecting a suitable mapping. The experimental results demonstrate that the proposed algorithm achieves competitive performance and generates high-quality SR images with sharp edges and rich texture.
[11]

This paper is based on image deblurring, which is widely used in correcting medical images, in remote sensing, and in computer vision. Blurring mostly happens when there is a relative shift between the camera and the scene where the photo is taken. This motion blur can be modelled as

I = L ⊗ k + N,

where I is the blurred image, L the sharp latent image, k the blur kernel, ⊗ convolution, and N unknown sensor noise. The task here is to find L without knowing k.
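To make the model concrete, the sketch below simulates it by convolving a sharp image with a horizontal motion kernel and adding Gaussian noise; the kernel shape and noise level are illustrative assumptions.

```python
# Simulate the blur model I = L (*) k + N for a grayscale image L.
import numpy as np
from scipy.signal import convolve2d

def simulate_blur(L, kernel_len=9, sigma=0.01, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    k = np.zeros((kernel_len, kernel_len), dtype=np.float32)
    k[kernel_len // 2, :] = 1.0 / kernel_len           # horizontal motion kernel
    I = convolve2d(L, k, mode='same', boundary='symm')  # L convolved with k
    return I + rng.normal(0.0, sigma, I.shape)          # add sensor noise N
```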

Camera motion during image capture, high-intensity point light sources (pointolites) in low-light conditions, or reflected lights in both low-light and normal illumination conditions often produce light streaks. When the dynamic range of the scene is higher than that of the camera, the captured intensities are clipped to the camera's dynamic range, i.e., to the maximum or minimum intensity of that range. Thus, high-intensity light streaks are prone to saturation.

[Figure: flowchart of the proposed technique]

This paper presents a simple yet effective method to select the optimal light-streak patch according to the properties of light streaks and the blur kernel. The light streak is then used as a reference to estimate the blur kernel, so that the blur kernel and the light streak have a similar shape.

The method is limited, however, because saturated light streaks and point light sources in low-light images cause a serious problem for blind deconvolution: they violate the assumption of linear convolution in the blurring process. This can be considered future scope for this paper.
[12]
This paper is based on blind image deblurring. Since various pairs of latent image and blur kernel can explain the same blurred image, the paper adopts a surface-aware strategy derived from intrinsic geometrical considerations. This approach facilitates blur kernel estimation thanks to the sharp edges preserved in the intermediate latent images. Extensive experiments demonstrate that the method outperforms state-of-the-art methods on deblurring text and natural images; it also achieves better results on low-illumination images with large saturated regions and impulse noise.
The image deblurring task is to seek the underlying true image from the blurred measurement. It can be roughly classified into two categories: non-blind deblurring and blind deblurring. If the blur kernel is known, the problem is the classic non-blind image deblurring problem, i.e., the task is to estimate the latent image. The paper further classifies blind image deblurring into two categories: one estimates the latent image and kernel simultaneously; the other estimates the blur kernel first and then uses it to find the latent sharp image via a non-blind deblurring algorithm. In this work, the main focus is on the second type, since the estimated kernel plays a central role in the whole system. The blind image deblurring problem is typically ill-posed, since infinitely many pairs can explain a blurred image; e.g., one undesirable solution that perfectly explains the blurred image is the no-blur explanation: the blurred image itself and the delta blur kernel. To get a stable and reasonable result, additional proper prior knowledge about both the latent image and the blur kernel must be introduced.

Many previous works have illustrated the importance of edge information in blur kernel estimation. Some assume that edges are detectable even if the feature strength is weak and use a sub-pixel difference-of-Gaussian edge detector to find the location and orientation of edges. Others use explicit edge prediction, consisting of a bilateral filter and a shock filter, to generate salient image edges. Although these methods behave well for small blur degradations, they are not suitable for large-scale kernels.
L0 gradient minimization can localize the sharp edges in the image and segment the image into parts bounded by the salient edges. However, it generates grid/spiky artifacts in the image, illustrating the limitation of using only the L0 norm of the gradient.

This method uses an extra surface prior, and the intermediate latent image has much clearer sharp edges and fewer artifacts. Hence, it is natural that the method can estimate an accurate kernel, which helps find a clean final deblurred image.
[13]
This paper states that blind image deblurring is a long-standing, challenging problem in image processing and low-level vision. Recently, sophisticated priors such as the dark channel prior, the extreme channel prior, and the local maximum gradient prior have shown promising effectiveness; however, these methods are computationally expensive.

To address these problems, the authors first propose a simplified sparsity prior of local minimal pixels, namely patch-wise minimal pixels (PMP). The PMP of clear images is much sparser than that of blurred ones, and hence is very effective in discriminating between clear and blurred images. An algorithm is then proposed to exploit the sparsity of PMP efficiently in deblurring: rather than directly using the half-quadratic splitting algorithm, it flexibly imposes sparsity-inducing regularization on the PMP under the maximum a posteriori (MAP) framework. In this way it avoids the non-rigorous approximate solutions of existing algorithms when jointly handling multiple non-explicit priors, while being much more computationally efficient. Extensive experiments on both natural and specific images demonstrated that it not only achieves state-of-the-art deblurring quality, but also substantially improves practical stability and computational efficiency. A sketch of computing the PMP map follows.
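The hedged sketch below computes the minimum intensity within each non-overlapping patch; the patch size is an illustrative assumption.

```python
# Illustrative patch-wise minimal pixels (PMP) map for a grayscale image.
import numpy as np

def pmp_map(img, patch=8):                    # img: 2-D grayscale array
    h, w = img.shape
    h2, w2 = h - h % patch, w - w % patch     # crop to a multiple of patch
    blocks = img[:h2, :w2].reshape(h2 // patch, patch, w2 // patch, patch)
    return blocks.min(axis=(1, 3))            # one minimal pixel per patch
```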
[14]
This paper is based on active and dynamic sensors. It states that active pixel sensors can only record scenes at low frame rates, while the dynamic vision sensor (DVS) can capture high-speed scenes under challenging lighting conditions; however, the DVS discards absolute light intensity information. The dynamic and active pixel vision image sensor (DAVIS) can output events and low-frame-rate grayscale images simultaneously. This paper proposes a high-frame-rate video reconstruction and deblurring algorithm based on events and low-speed blurred image sequences from DAVIS.
The authors used the extended 3-D partial recursive search (E-3DPRS) method to perform block matching on adjacent event images. The obtained motion vector field (MVF) represents the movement of objects with high temporal resolution. In this way, E-3DPRS improves the accuracy of the MVFs and reduces the calculation complexity: the algorithm is run between adjacent de-noised event images to obtain improved MVFs at low computational cost. In addition, the deblurring of the low-frame-rate grayscale image sequence is also considered: the optical flows are initialized with the MVF, and then the optical flows are alternately optimized and the latent sharp image restored. Finally, frames are interpolated from the sharp image sequence and the MVFs to reconstruct a high-frame-rate sharp video. On real datasets recorded by DAVIS, this method reconstructs sharp interpolated frames with better perceptual quality.
The method has limitations, though. E-3DPRS and interpolated-frame generation can run in real time, but the primal-dual optimization of the image deblurring process and the de-noising steps for event images take a long running time; therefore, the overall high-frame-rate video generation and deblurring algorithm cannot run in real time. Also, the superposition of motion vectors between multiple frames inevitably lowers the quality of the interpolated frames. As future work, a smooth motion trajectory could be generated from the obtained optical flow, thereby optimizing the MVs' superposition between multiple frames.
References
[1] Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen
Egiazarian, “Image Denoising by Sparse 3-D Transform-Domain Collaborative
Filtering” ,IEEE Transactions on Image Processing (Volume: 16, Issue: 8, Aug.
2007)

[2] Qingsong Zhu, Jiaming Mai, and Ling Shao, “A Fast Single Image Haze
Removal Algorithm Using Color Attenuation Prior”, IEEE Transactions on
Image Processing (Volume: 24, Issue: 11, Nov. 2015)

[3] Codruta O. Ancuti, Cosmin Ancuti, Christophe De Vleeschouwer, and Philippe Bekaert, "Color Balance and Fusion for Underwater Image Enhancement", IEEE Transactions on Image Processing (Volume: 27, Issue: 1, Jan. 2018)

[4] Lei Wang, Du Q. Huynh, and Piotr Koniusz, "A Comparative Review of Recent Kinect-Based Action Recognition Algorithms", IEEE Transactions on Image Processing (Volume: 29, 2019)

[5] Zhou Wang, Alan Conrad Bovik, Hamid Rahim Sheikh, and Eero P.
Simoncelli, “Image Quality Assessment: From Error Visibility to Structural
Similarity”, IEEE Transactions on Image Processing (Volume: 13, Issue: 4,
April 2004)

[6] Sihang Zhou, Dong Nie, Ehsan Adeli, Jianping Yin, Jun Lian, and Dinggang Shen, "High-Resolution Encoder-Decoder Networks for Low-Contrast Medical Image Segmentation", IEEE Transactions on Image Processing (Volume: 29, 2019)

[7] Wenqi Ren, Sifei Liu, Lin Ma, Qianqian Xu, Xiangyu Xu, Xiaochun Cao,
Junping Du, and Ming-Hsuan Yang, “Low-Light Image Enhancement via a
Deep Hybrid Network” , IEEE Transactions on Image Processing (Volume:
28, Issue: 9, Sept. 2019)

[8] Xiaojie Guo, Yu Li, and Haibin Ling, “LIME: Low-Light Image
Enhancement via Illumination Map Estimation”, IEEE Transactions on Image
Processing (Volume: 26, Issue: 2, Feb. 2017)

[9] Yong Li, Jiabei Zeng, Shiguang Shan, and Xilin Chen, "Occlusion Aware Facial Expression Recognition Using CNN with Attention Mechanism", IEEE Transactions on Image Processing (Volume: 28, Issue: 5, May 2019)
[10] Yunfeng Zhang, Qinglan Fan, Fangxun Bao, Yifang Liu, and Caiming
Zhang , “Single-Image Super-Resolution Based on Rational Fractal
Interpolation” IEEE Transactions on Image Processing (Volume: 27, Issue: 8,
Aug. 2018)

[11] Xinxin Zhang, Ronggang Wang, Da Chen, Yang Zhao, and Wen Gao, "Handling Outliers by Robust M-Estimation in Blind Image Deblurring", IEEE Transactions on Multimedia (Early Access), 2020

[12] Jun Liu, Ming Yan, and Tieyong Zeng, “Surface-aware Blind Image
Deblurring”, IEEE Transactions on Pattern Analysis and Machine
Intelligence (Early Access), 2019

[13] Fei Wen, Rendong Ying, Yipeng Liu, Peilin Liu, and Trieu-Kien Truong,
“A Simple Local Minimal Intensity Prior and an Improved Algorithm for Blind
Image Deblurring”, IEEE Transactions on Circuits and Systems for Video
Technology (Early Access), 2020

[14] Kaiming Nie, Xiaopei Shi, Silu Cheng, Zhiyuan Gao, Jiangtao Xu, “High
Frame Rate Video Reconstruction and Deblurring based on Dynamic and
Active Pixel Vision Image Sensor” IEEE Transactions on Circuits and Systems
for Video Technology, 2020
