Efficient Block Matching For Removing Impulse Noise
Abstract
INTRODUCTION
Impulse noise has long been removed successfully by the median filter. Most median
filters consider only the pixels neighboring the pixel to be denoised, which limits how
accurately the true value can be estimated. This problem can be addressed by estimating
the true value from similar blocks elsewhere in the image. The block-matching-based
approach rests on a reasonable premise: corrupted pixel values can be recovered almost
perfectly from the pixel values of blocks that are similar to the block being denoised.
The idea of block matching is not new; it has been studied in video processing and
image registration [1], [2]. Block matching in denoising applications is, however, quite
different in nature from block matching in other image-processing applications, because
noisy pixels cannot be directly used in matching. A block-matching-based denoising
method generally has to conduct an exhaustive search over the entire image to collect a
sufficient number of similar blocks, and consequently it is computationally very
demanding.
Ahn et al. [4] proposed a convolutional neural network (CNN) based block-matching
method for image denoising. This scheme first needs to apply a denoising algorithm to the
noisy image to obtain a pilot signal for training a CNN. In addition to this overhead, neural
networks generally require long training times, which prevents their use in real-time
applications. Lu et al. [5] proposed a three-values-weighted method in which the number of
pixels in the maximum or minimum group determines the centroid of the middle group.
Roy et al. [6] used a support vector machine and a fuzzy filter to denoise gray-scale
images, and reported that the method is superior in preserving the image’s local structure.
Roy and Laskar [7] presented a linear-prediction-based adaptive filter to denoise color
images. Noisy pixels are identified by comparing the linear prediction error with a predefined
threshold, and adaptive vector median filtering is applied to the pixels with error greater than
the threshold.
We propose a method called block matching by lookup table (BMLUT), which can avoid
the exhaustive search otherwise needed to find all matching blocks for the block being
denoised. The
rationale behind using a lookup table is that computation of the distance between pixels
corrupted by impulse noise is meaningless, and hence a simple measure, such as the
Euclidean distance, cannot be directly employed.
Deep learning is a subset of machine learning that focuses on training deep neural
networks with multiple layers to learn and represent complex patterns in data. Deep neural
networks are composed of interconnected layers of artificial neurons that simulate the
structure and functioning of the human brain.
Deep learning is so named because it is based on artificial neural networks (ANNs) with
many layers, also known as deep neural networks (DNNs). These networks are inspired by
the structure and function of the biological neurons in the human brain, and they are
designed to learn from large amounts of data.
Neural Networks: Deep learning relies on neural networks with multiple hidden layers,
allowing the network to learn hierarchical representations of the data. Each layer in the
network extracts higher-level features from the representations learned in the previous
layer. Deep neural networks can automatically learn and extract relevant features from
raw data, eliminating the need for manual feature engineering.
Training Process: Deep learning models are trained with backpropagation, in which the
network adjusts its internal parameters (weights and biases) to minimize the difference
between the predicted output and the target output. This process involves propagating
errors backward through the network and updating the parameters using gradient descent
optimization algorithms.
Large-Scale Data: Deep learning models typically require a large amount of labeled data
for training. The availability of big data and advances in computing power have enabled
the success of deep learning models. The large-scale data allows deep neural networks to
learn complex representations and generalize well to new, unseen data.
Applications: Deep learning has shown remarkable performance in various fields,
including computer vision, natural language processing, speech recognition, and
recommendation systems. It has achieved state-of-the-art results in tasks such as image
classification, object detection, machine translation, and speech synthesis.
Deep learning excels in handling complex and high-dimensional data, capturing intricate
patterns, and achieving state-of-the-art performance in many AI tasks. However, it
typically requires more computational resources and data compared to traditional machine
learning approaches.
In summary, machine learning focuses on training algorithms to learn patterns and make
predictions or decisions, while deep learning is a specific approach within machine learning
that utilizes deep neural networks to learn complex representations. Deep learning has gained
significant attention and has been particularly successful in solving tasks that involve
complex data such as images, audio, and text.
Lane detection using deep learning is a popular approach that leverages the power of deep
neural networks to detect and track lane markings on the road. Deep learning models excel in
learning complex patterns and can effectively capture the distinctive characteristics of lane
markings, making them well-suited for this task. Here is a high-level overview of the lane
detection process using deep learning:
Dataset Preparation: The first step is to collect or create a dataset of labeled images or
videos, where the lane markings are manually annotated. The annotations typically
involve marking the pixels or regions corresponding to the lane markings in the images or
videos.
Data Pre-processing: The collected dataset is pre-processed to prepare it for training.
This may involve resizing the images, normalizing pixel values, and splitting the dataset
into training and validation sets.
Model Architecture: A deep learning model architecture needs to be selected or designed
for lane detection. Convolutional Neural Networks (CNNs) are commonly used due to
their ability to capture spatial dependencies in images. The model architecture may
consist of multiple convolutional layers followed by pooling, fully connected layers, and
output layers.
Training: The deep learning model is trained using the labeled dataset. The training
process involves feeding the input images into the model, comparing the predicted output
(lane markings) with the ground truth annotations, and updating the model's weights
through backpropagation and gradient descent optimization algorithms. The objective is
to minimize the difference between the predicted output and the ground truth annotations.
Post-processing: Once the model is trained, the lane detection results may undergo post-
processing steps to refine the detected lane markings. This may include techniques such
as filtering outliers, curve fitting, and extrapolation to extend the detected lanes.
Evaluation and Testing: The trained model is evaluated on a separate test dataset to
assess its performance. Evaluation metrics such as accuracy, precision, recall, and F1
score can be used to measure the model's lane detection performance.
Deployment: The trained lane detection model can be deployed in real-time applications,
such as autonomous vehicles or advanced driver-assistance systems (ADAS), to detect
and track lane markings in real-world scenarios.
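The evaluation step above names precision, recall, and F1 score. As a minimal sketch of how these could be computed for a predicted lane mask, assuming masks are given as flat lists of 0/1 labels (the function name `lane_pixel_scores` and this input format are illustrative assumptions, not taken from any particular lane detection system):

```python
def lane_pixel_scores(predicted, truth):
    """Return (precision, recall, f1) for two flat 0/1 masks of equal length."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # lane pixels found
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false alarms
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # missed lane pixels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    print(lane_pixel_scores([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Guarding each division keeps the function defined when a mask contains no predicted or no true lane pixels.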
It's worth noting that there are different variations and approaches for lane detection using
deep learning, including single-image-based methods and video-based methods. Additionally,
techniques like semantic segmentation and instance segmentation can also be employed to
precisely detect and differentiate lane markings from other objects on the road.
Deep learning-based lane detection has shown promising results and has been
successfully applied in various real-world applications. However, it's important to fine-tune
and validate the model on diverse datasets and consider factors such as different weather
conditions, road types, and lighting variations to ensure robust and reliable lane detection
performance.
1.6.3 Application of Deep learning
Deep learning is a subset of machine learning that uses artificial neural networks
(ANNs) to model and solve complex problems. It is based on the idea of building artificial
neural networks with multiple layers, called deep neural networks, that can learn
hierarchical representations of the data.
Deep learning algorithms use a layered architecture, where the input data is passed
through an input layer and then propagated through multiple hidden layers, before reaching
the output layer. Each layer applies a set of mathematical operations, parameterized by
weights and biases, to its input, and the output of one layer serves as the input to the next.
The process of training a deep learning model involves adjusting the weights and
biases of the model to minimize the error between the predicted output and the true output.
This is typically done using a variant of gradient descent, an optimization algorithm that
adjusts the weights and biases in the direction of the steepest decrease in the error.
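To make the training step concrete, the following is a minimal sketch of gradient descent on the smallest possible model, a single linear neuron y = w*x + b trained with a mean squared error loss. The function name, learning rate, and epoch count are illustrative assumptions; a real deep network repeats the same update across many layers, with the gradients supplied by backpropagation.

```python
def train_linear_neuron(samples, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b to (x, y) pairs by batch gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in samples:
            error = (w * x + b) - y        # predicted output minus true output
            grad_w += 2 * error * x / n    # d(MSE)/dw
            grad_b += 2 * error / n        # d(MSE)/db
        # Step against the gradient, i.e. in the direction of steepest decrease.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


if __name__ == "__main__":
    data = [(x, 2 * x + 1) for x in range(5)]
    print(train_linear_neuron(data))
```

With data generated from y = 2x + 1, the learned parameters approach w = 2 and b = 1.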
Deep learning has a wide range of applications, including image and speech
recognition, natural language processing, and computer vision. Its main advantages are
listed below.
1. Automatic feature learning: Deep learning algorithms can automatically learn features
from the data, which means that they don’t require the features to be hand-engineered.
This is particularly useful for tasks where the features are difficult to define, such as
image recognition.
2. Handling large and complex data: Deep learning algorithms can handle large and
complex datasets that would be difficult for traditional machine learning algorithms to
process. This makes it a useful tool for extracting insights from big data.
3. Improved performance: Deep learning algorithms have been shown to achieve state-
of-the-art performance on a wide range of problems, including image and speech
recognition, natural language processing, and computer vision.
4. Handling non-linear relationships: Deep learning can uncover non-linear
relationships in data that would be difficult to detect through traditional methods.
5. Handling structured and unstructured data: Deep learning algorithms can handle
both structured and unstructured data such as images, text, and audio.
6. Predictive modeling: Deep learning can be used to make predictions about future
events or trends, which can help organizations plan for the future and make strategic
decisions.
7. Handling missing data: With suitable imputation or masking, deep learning models can
still make predictions when some inputs are missing, which is useful in real-world
applications where data is often incomplete.
8. Handling sequential data: Deep learning algorithms such as Recurrent Neural
Networks (RNNs) and Long Short-term Memory (LSTM) networks are particularly
suited to handle sequential data such as time series, speech, and text. These algorithms
have the ability to maintain context and memory over time, which allows them to make
predictions or decisions based on past inputs.
9. Scalability: Deep learning models can be easily scaled to handle an increasing amount
of data and can be deployed on cloud platforms and edge devices.
10. Generalization: Deep learning models can generalize well to new situations or
contexts, as they are able to learn abstract and hierarchical representations of the data.
CHAPTER 2
LITERATURE SURVEY
[1] N. Yao and K. K. Ma, “Adaptive rood pattern search for fast block-matching motion
estimation,” IEEE Trans. Image Process., vol. 11, no. 12, pp. 1442–1448, Dec. 2002.
We propose a novel and simple fast block-matching algorithm (BMA), called adaptive
rood pattern search (ARPS), which consists of two sequential search stages: (1) initial search
and (2) refined local search. For each macroblock (MB), the initial search is performed only
once at the beginning in order to find a good starting point for the follow-up refined local
search. By doing so, unnecessary intermediate search and the risk of being trapped into local
minimum matching error points could be greatly reduced in long search case. For the initial
search stage, an adaptive rood pattern (ARP) is proposed, and the ARP's size is dynamically
determined for each MB, based on the available motion vectors (MVs) of the neighboring
MBs.
In the refined local search stage, a unit-size rood pattern (URP) is exploited
repeatedly, and unrestrictedly, until the final MV is found. To further speed up the search,
zero-motion prejudgment (ZMP) is incorporated in our method, which is particularly
beneficial to those video sequences containing small motion contents. Extensive experiments
conducted based on the MPEG-4 Verification Model (VM) encoding platform show that the
search speed of our proposed ARPS-ZMP is about two to three times faster than that of the
diamond search (DS), and our method even achieves higher peak signal-to-noise ratio
(PSNR) particularly for those video sequences containing large and/or complex motion
contents.
The block-matching algorithm (BMA) for motion estimation (ME) has been widely
adopted by current video coding standards such as H.261, H.263, MPEG-1, MPEG-2,
MPEG-4, and H.264 due to its effectiveness and simplicity of implementation. The most
straightforward BMA is the full search (FS), which exhaustively searches for the best
matching block within the search window. However, FS has very high computational
complexity and makes ME the main bottleneck in real-time video coding applications. Thus,
a fast BMA is indispensable for reducing the computational cost. In our view, existing
fast BMAs can be classified into four categories.
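The full search (FS) baseline described above can be sketched as follows. This is a minimal illustration rather than the code of [1]: it assumes frames are lists of rows of integer intensities and uses the sum of absolute differences (SAD) as the matching error, which is a common choice but an assumption here. The names `sad` and `full_search` are hypothetical.

```python
def sad(frame_a, frame_b, ay, ax, by, bx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(frame_a[ay + r][ax + c] - frame_b[by + r][bx + c])
        for r in range(size)
        for c in range(size)
    )


def full_search(current, reference, y, x, size, radius):
    """Test every displacement within +/-radius; return (best_mv, best_cost)."""
    height, width = len(reference), len(reference[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ry, rx = y + dy, x + dx
            # Skip candidate blocks that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            cost = sad(current, reference, y, x, ry, rx, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost
```

Every block costs up to (2*radius + 1)^2 SAD evaluations, which is the expense that fast BMAs such as ARPS aim to reduce by testing far fewer candidate positions.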
In this paper, we have proposed a novel and simple fast block-matching algorithm,
called adaptive rood pattern search (ARPS). By exploiting higher distribution of MVs in the
horizontal and vertical directions and the spatial inter-block correlation, ARP adaptively
exploits adjustable rood-shaped search pattern (which is powerful in tracking motion trend),
together with the search point indicated by the predicted MV, to match different motion
contents of video sequence for each macroblock.
[2] C. Je and H.-M. Park, “Optimized hierarchical block matching for fast and accurate
image registration,” Signal Process.: Image Commun., vol. 28, no. 7, pp. 779–791, 2013.
Camera resolution has increased greatly in recent years, and registration between
high-resolution images is computationally expensive even when hierarchical block
matching is used. This paper presents a novel optimized hierarchical block matching algorithm
in which the computational cost is minimized for the scale factor and the number of levels in
the hierarchy. The algorithm is based on a generalized version of the Gaussian pyramid and
its inter-layer transformation of coordinates. The search window size is properly determined
to resolve possible error propagation in hierarchical block matching.
In addition, we also propose a simple but effective method for aligning colors
between two images based on color distribution adjustment as a preprocessing. Simplifying a
general color imaging model, we show much of the color inconsistency can be compensated
by our color alignment method. The experimental results show that the optimized hierarchical
block matching and color alignment methods increase the block matching speed and
accuracy, and thus improve image registration. Using our algorithm, it takes about 1.28 s for
overall registration process with a pair of images in 5 mega-pixel resolution.
Image registration is the process of geometrically transforming an image to make it
match with another image. It is one of the fundamental and important operations in image
processing and its applications, and has been investigated by many researchers [1], [2], [3],
[4]. Image registration has vast applications such as panoramic image mosaics [5], 3D
modeling [6], remote sensing [7], image deblurring [8], image resolution enhancement [9],
image compression [10], and medical applications [3], [4].
[3] K. Dabov, A. Foi, and K. Egiazarian, “Image denoising by sparse 3-D transform-
domain collaborative filtering,” IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–
2095, Aug. 2007.
The filtered blocks are then returned to their original positions. Because these blocks
are overlapping, for each pixel, we obtain many different estimates which need to be
combined. Aggregation is a particular averaging procedure which is exploited to take
advantage of this redundancy. A significant improvement is obtained by a specially developed
collaborative Wiener filtering. An algorithm based on this novel denoising strategy and its
efficient implementation are presented in full detail; an extension to color-image denoising is
also developed. The experimental results demonstrate that this computationally scalable
algorithm achieves state-of-the-art denoising performance in terms of both peak signal-to-
noise ratio and subjective visual quality.
Because such details are typically abundant in natural images and convey a significant
portion of the information embedded therein, these transforms have found a significant
application for image denoising. Recently, a number of advanced denoising methods based on
multiresolution transforms have been developed, relying on elaborate statistical dependencies
between coefficients of typically overcomplete (e.g., translation-invariant and multiply-
oriented) transforms. Examples of such image denoising methods can be seen in [1]–[4].
[4] B. Ahn and N. I. Cho, “Block-matching convolutional neural network for image
denoising,” CoRR abs/1704.00524, 2017.
There are two main streams in up-to-date image denoising algorithms: non-local self
similarity (NSS) prior based methods and convolutional neural network (CNN) based
methods. The NSS based methods are favorable on images with regular and repetitive
patterns while the CNN based methods perform better on irregular structures. In this paper,
we propose a block matching convolutional neural network (BMCNN) method that combines
NSS prior and CNN. Initially, similar local patches in the input image are integrated into a 3D
block. In order to prevent the noise from messing up the block matching, we first apply an
existing denoising algorithm on the noisy image.
The denoised image is employed as a pilot signal for the block matching, and then
denoising function for the block is learned by a CNN structure. Experimental results show
that the proposed BMCNN algorithm achieves state-of-the-art performance. In detail,
BMCNN can restore both repetitive and irregular structures. With current image capturing
technologies, noise is inevitable especially in low light conditions. Moreover, captured
images can be affected by additional noises through the compression and transmission
procedures. Since the noise degrades visual quality, compression performance, and also the
performance of computer vision algorithms, the image denoising has been extensively studied
over several decades [1]–[13].
In order to estimate a clean image from its noisy observation, a number of methods
that take account of certain image priors have been developed. Among various image priors,
the NSS is considered a remarkable one such that it is employed in most of current state-of-
the-art methods. The NSS implies that some patterns occur repeatedly in an image and the
image patches that have similar patterns can be located far from each other. The nonlocal
means filter [1] is a seminal work that exploits this NSS prior.
The employment of NSS prior has boosted the performance of image denoising
significantly, and many up-to-date denoising algorithms [3], [7], [14]–[16] can be categorized
as NSS based methods. Most NSS based methods consist of following steps. First, they find
similar patches and integrate them into a 3D block. Then the block is denoised using some
other priors such as low-band prior [3], sparsity prior [15], [16] and low-rank prior [7]. Since
the patch similarity can be easily disrupted by noise, the NSS based methods are usually
implemented as two-step or iterative procedures.
[5] C. T. Lu, Y. Y. Chen, L. L. Wang, and C. F. Chang, “Removal of salt-and-pepper
noise in corrupted image using three-values-weighted approach with variable-size
window,” Pattern Recognit. Lett., vol. 80, pp. 188–199, 2016.
The quality of a digital image deteriorates by the corruption of impulse noise in the
record or transmission. The process of efficiently removing this impulse noise from a
corrupted image is an important research task. This investigation presents a novel three-
values-weighted method for the removal of salt-and-pepper noise. Initially, a variable-size
local window is employed to analyze each extreme pixel (0 or 255 for an 8-bit gray-level
image). Each non-extreme pixel is classified and placed into the maximum, middle, or
minimum groups in the local window.
The number of non-extreme pixels belonging to the maximum or the minimum group
determines the centroid of the middle group. The distribution ratios of these three groups are
employed to weight the non-extreme pixels with the maximum, middle, and minimum pixel
values. The center pixel with an extreme value is replaced by the weighted value, thus
enabling the noisy pixels to be restored. Experimental results show that the proposed method
can efficiently remove salt-and-pepper noise (only for known extreme values of 0 and 255)
from a corrupted image for different noise corruption densities (from 10% to 90%);
meanwhile, the denoised image is freed from the blurred effect.
Many techniques have been proposed for the restoration of the images contaminated
by salt-and-pepper noise. The median and the mean filters were popular for the removal of
salt-and-pepper noise due to their good denoising power and computational simplicity.
However, when the noise density is over 60%, some details and edges of the original image
cannot be restored well by these filters. This is because the number of noise-free pixels in a
local window is not sufficient to reconstruct the pixels corrupted by noise.
An adaptive median filter [10] selects the median value in an adaptive window for
each pixel, thus enabling the noisy pixels to be removed. This method performed well at low
noise densities. However, for high noise densities the window size has to be increased which
may lead to a blur in the denoised image. In the modified switching median filters [3], [20],
the decision of an impulse noise pixel is based on a pre-defined threshold value. The major
drawback of these methods is that defining a robust decision threshold becomes difficult. In
addition, these filters did not take into account the local features because the details and the
edges could not be adequately restored, in particular when noise density was high.
[6] A. Roy, J. Singha, S. S. Devi, and R. H. Laskar, “Impulse noise removal using SVM
classification based fuzzy filter from gray scale images,” Signal Process., vol. 28, pp.
262–273, 2016.
In this paper, support vector machine (SVM) classification based Fuzzy filter (FF) is
proposed for removal of impulse noise from gray scale images. When an image is affected by
impulse noise, the quality of the image is distorted since the homogeneity among the pixels is
broken. SVM is incorporated for detection of impulse noise from images. Here, a system is
trained with an optimal feature set. When an image under test is processed through the trained
system, all the pixels under test image will be classified into two classes: noisy and non-
noisy.
Fuzzy filtering is performed according to the decision reached during the testing
phase. The classifier achieves about 98.5% true recognition of noisy and non-noisy pixels
when the image is corrupted by 90% impulse noise, and the proposed system improves the
peak signal-to-noise ratio to 22.2437 dB at that noise level. The simulation results also
suggest that this system outperforms some state-of-the-art methods while preserving
structural similarity to a large extent.
Impulse noise is an “on–off” noise that affects an image drastically. The intensity of
impulse noise tends to be either very large or very small. When an image is affected by
impulse noise, the homogeneity among pixels is distorted, and as a result the image quality
deteriorates considerably. Impulse noise is generated when images are captured through
noisy sensors, transmitted through corrupted channels, or affected by atmospheric
disturbances such as lightning. Impulse noise must therefore be removed from the degraded
image so that its quality improves with little blurring while edges and other fine details
are preserved.
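For experiments, the “on–off” behaviour described above is often simulated with the salt-and-pepper model, in which each corrupted pixel is set to one of the two extreme values. The sketch below assumes an 8-bit gray-scale image stored as a list of rows and an equal probability of salt (255) and pepper (0); the function name and the `seed` parameter are illustrative assumptions.

```python
import random


def add_salt_and_pepper(image, density, seed=0):
    """Return a copy of image in which a fraction `density` of pixels is corrupted."""
    rng = random.Random(seed)          # seeded for reproducible experiments
    noisy = [row[:] for row in image]  # leave the input image untouched
    for r in range(len(noisy)):
        for c in range(len(noisy[0])):
            if rng.random() < density:
                # Corrupted pixels take an extreme value: salt or pepper.
                noisy[r][c] = 255 if rng.random() < 0.5 else 0
    return noisy


if __name__ == "__main__":
    clean = [[128] * 8 for _ in range(8)]
    for row in add_salt_and_pepper(clean, density=0.3):
        print(row)
```

Random-valued impulse noise, which the survey also mentions, would replace the corrupted pixel with any value in the intensity range instead of only 0 or 255.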
The median filter operates on every pixel of the noisy image regardless of whether the
pixel is corrupted. As a result, it introduces a blurring effect into the denoised image. The
weighted median filter (WMF) [5] and the center weighted median filter (CWMF) [6]
alleviate this difficulty by giving more weight to particular pixels within the kernel under
operation. Uncorrupted pixels are nevertheless distorted while the corrupted pixels are
restored, so the performance of these filters is not satisfactory when removing
high-density impulse noise.
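The effect of the center weight can be seen in a small sketch. It assumes a 3 × 3 window with border replication and an odd integer weight for the center pixel; the function name and defaults are illustrative and not taken from the cited filters.

```python
def center_weighted_median(image, r, c, center_weight=3):
    """Median of the 3x3 window at (r, c), counting the center center_weight times."""
    height, width = len(image), len(image[0])
    values = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            # Replicate border pixels when the window leaves the image.
            rr = min(max(r + dr, 0), height - 1)
            cc = min(max(c + dc, 0), width - 1)
            copies = center_weight if (dr == 0 and dc == 0) else 1
            values.extend([image[rr][cc]] * copies)
    values.sort()
    return values[len(values) // 2]
```

With `center_weight=1` this is the plain median filter. Increasing the weight makes the filter more likely to keep the center value, which preserves detail but also lets isolated impulses survive.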
[7] A. Roy and R. H. Laskar, “Non-casual linear prediction based adaptive filter for
removal of high density impulse noise from color images,” AEU Int. J. Electron.
Commun. vol. 72, pp. 114–124, 2017.
In this paper, a non-causal linear prediction based adaptive vector median filter is
proposed for removal of high density impulse noise from color images. Generally, when an
image is affected by high density of impulse noise, homogeneity amongst the pixels is
distorted. In the proposed method, if the pixel under operation is found to be corrupted, the
filtering operation will be carried out. The decision about a particular pixel of being corrupted
or not depends on the linear prediction error calculated from the non-causal region around the
pixel under operation. If the error of the central pixel of the kernel exceeds some predefined
threshold value, adaptive window based vector median filtering operation will be performed.
The size of the adaptive window depends on the level of error relative to the
predefined threshold. The proposed filter improves the peak signal-to-noise ratio (PSNR) over
that of the modified histogram based fuzzy filter (MHFC) by approximately 4.5 dB. The results
of structural similarity index measure (SSIM) suggest that the image details are maintained
significantly better in the proposed method as compared to earlier approaches. It may be
observed from subjective evaluation that the proposed method outperforms some of the
existing filters.
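PSNR, the metric quoted above, is conventionally computed as 10·log10(peak² / MSE), where MSE is the mean squared error between the original and restored images. The sketch below assumes 8-bit gray-scale images stored as lists of rows; the function name is illustrative.

```python
import math


def psnr(original, restored, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    count = len(original) * len(original[0])
    squared_error = sum(
        (o - r) ** 2
        for row_o, row_r in zip(original, restored)
        for o, r in zip(row_o, row_r)
    )
    mse = squared_error / count
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, a gain of 4.5 dB corresponds to an MSE that is lower by a factor of about 10^0.45, or roughly 2.8.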
Impulse noise is an “on-off” noise that affects the image abruptly. Impulse noise is
generated at the time of capturing images through sensors, during the transmission of images
or due to the atmospheric variations such as lightning, etc. When an image is corrupted by the
impulse noise, the quality of the image is degraded to a large extent. So, the removal of
impulse noise from the captured image is necessary to enhance the quality of the image. The
intensity of impulse noise tends to be either very low or very high.
Therefore, the image quality is severely affected by high-density impulse noise.
Preserving the image details and attenuation of noise are the two important aspects of image
restoration. In general, the linear filters are effective for additive noises.
[8] G. Pok, J. C. Liu, and A. S. Nair, “Selective removal of impulse noise based on
homogeneity level information,” IEEE Trans. Image Process., vol. 12, no. 1, pp. 85–92,
Jan. 2003.
The noise detection scheme does not use a quantitative decision measure, but uses
qualitative structural information, and it is not subject to burdensome computations for
optimization of the threshold values. Empirical results indicate that our scheme performs
significantly better than other median filters, in terms of noise suppression and detail
preservation. The median filter is a nonlinear filtering technique widely used for removal of
impulse noise [2], [6], [10]. Despite its effectiveness in smoothing noise, the median filter
tends to remove fine details when it is applied to an image uniformly.
To address this drawback, a number of modified median filters have been proposed,
e.g., minimum–maximum exclusive mean (MMEM) filter [7], prescanned minmax center-
weighted (PMCW) filter [14], and decision-based median filters [4], [5], [8], [13]. In these
methods, the filtering operation adapts to the local properties and structures in the image. In
the decision-based filtering, for example, image pixels are first classified as corrupted and
uncorrupted, and then passed through the median and identity filters, respectively.
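The decision-based scheme described above can be sketched as a switching filter: pixels flagged as corrupted receive the window median, and all other pixels pass through unchanged. The detector is passed in as a function because the detectors in the cited works differ; the extreme-value test in the usage example is only a stand-in assumption, as are the function names.

```python
def decision_based_median(image, is_corrupted):
    """Apply a 3x3 median to flagged pixels and the identity filter to the rest."""
    height, width = len(image), len(image[0])
    output = [row[:] for row in image]
    for r in range(height):
        for c in range(width):
            if not is_corrupted(image[r][c]):
                continue  # identity filter: keep the uncorrupted pixel
            window = sorted(
                image[min(max(r + dr, 0), height - 1)][min(max(c + dc, 0), width - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            )
            output[r][c] = window[4]  # median of the nine window values
    return output


if __name__ == "__main__":
    noisy = [[10, 20, 30], [40, 255, 60], [70, 80, 90]]
    print(decision_based_median(noisy, lambda v: v in (0, 255)))
```

Only the flagged pixel changes, so uncorrupted detail is not blurred. The quality of the result therefore depends almost entirely on the decision rule.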
The main issue of the decision-based filter lies in building a decision rule, or a noise
measure, that can discriminate the uncorrupted pixels from the corrupted ones as precisely as
possible. In the method proposed by Han et al. [7], pixels that have values close to the
maximum and minimum in a filter window are discarded, and the average of the remaining
pixels in the window is computed.
[9] Y. Dong and S. Xu, “A New Directional weighted median filter for removal of
random-valued impulse noise,” IEEE Signal Process. Lett.,vol. 14, no. 3, pp. 193–196,
Mar. 2007.
The known median-based denoising methods tend to work well for restoring the
images corrupted by random-valued impulse noise with low noise level but poorly for highly
corrupted images. This letter proposes a new impulse detector, which is based on the
differences between the current pixel and its neighbors aligned with four main directions.
Then, we combine it with the weighted median filter to get a new directional weighted
median (DWM) filter. Extensive simulations show that the proposed filter not only can
provide better performance of suppressing impulse with high noise level but can preserve
more detail features, even thin lines.
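The idea behind a directional impulse detector can be illustrated with a simplified sketch. It assumes a 5 × 5 window, sums the absolute differences between the center pixel and the four pixels along each of the four main directions, and flags the pixel when even the smallest sum exceeds a threshold. The published DWM filter weights these differences and tunes the threshold; the unweighted sums, the default threshold of 100, and the names used here are assumptions made for illustration only.

```python
# Offsets (row, column) of the four neighbors along each main direction.
DIRECTIONS = {
    "horizontal": [(0, -2), (0, -1), (0, 1), (0, 2)],
    "vertical": [(-2, 0), (-1, 0), (1, 0), (2, 0)],
    "diagonal": [(-2, -2), (-1, -1), (1, 1), (2, 2)],
    "antidiagonal": [(-2, 2), (-1, 1), (1, -1), (2, -2)],
}


def min_directional_difference(image, r, c):
    """Smallest summed absolute difference over the four directions.

    (r, c) must lie at least two pixels away from every image border.
    """
    center = image[r][c]
    return min(
        sum(abs(image[r + dr][c + dc] - center) for dr, dc in offsets)
        for offsets in DIRECTIONS.values()
    )


def is_impulse(image, r, c, threshold=100):
    """Flag the pixel when it differs strongly from its neighbors in every direction."""
    return min_directional_difference(image, r, c) > threshold
```

A pixel on a thin line matches its neighbors along the line, so at least one directional sum stays small and the pixel is kept. An isolated impulse differs in all four directions and is flagged.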
As extended to restoring corrupted color images, this filter also performs very well.
Impulse noise is often introduced into images during acquisition and transmission. Based
on the noise values, it can be classified as the easier-to-restore salt-and-pepper noise and the
more difficult random-valued impulse noise [1]. There have been many more methods for
removing the former, and some of them have performed very well [2]–[5]. So here, we only
focus on removing the latter. Among all kinds of methods for impulse noise, the median filter
[6] is used widely because of its effective noise suppression capability and high
computational efficiency. However, it uniformly replaces the gray-level value of every pixel
by the median of its neighbors.
Consequently, some desirable details are also removed, especially when the window
size is large. In order to improve the median filter, many filters with an impulse detector are
proposed, such as signal-dependent rank order mean (SD-ROM) filter [7], multistate median
(MSM) filter [1], adaptive center weighted median (ACWM) filter [8], the pixelwise MAD
(PWMAD) filter [9], and iterative median filter [10]. These filters usually perform well, but
when the noise level is higher than 30%, they tend to remove many features from the images or
retain too much impulse noise.
CHAPTER 3
PROPOSED METHOD
nonmatching blocks. Among matching blocks, the block currently being denoised is called the
target block. Noisy pixels in target blocks are denoised by using the clean pixels in matching
blocks.
For each pixel value i and a given percentage δ, we define the range of δ-homogeneity
by two values: 1) lowδ(i); and 2) upδ(i), so that the sum of H(i, j) with j running from lowδ(i)
to upδ(i) equals δ percent of the sum of all the bins on the cross section at i.
After all of lowδ(i) and upδ(i) for all i from 0 to 255 have been computed, noisy
pixels are determined as follows. For each pixel q adjacent to p, if p is in the range of q’s δ-
homogeneity, then the counter is increased by 1. After all the eight neighboring pixels are
considered, if the counter is less than a threshold, then p is classified as noise.
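The neighbor-counting rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the ranges `low` and `up` are taken as already computed, the function name and the list-of-rows image layout are hypothetical, and neighbors that fall outside the image are simply skipped, which the text does not specify.

```python
def detect_noisy_pixels(image, low, up, threshold):
    """Return a grid of booleans; True marks a pixel classified as noise.

    image     : list of rows of integer gray levels (0..255)
    low, up   : sequences indexed by gray level giving each value's
                delta-homogeneity range (assumed to be precomputed)
    threshold : a pixel with fewer agreeing neighbors than this is noise
    """
    rows, cols = len(image), len(image[0])
    noisy = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            p = image[r][c]
            counter = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # p is not its own neighbor
                    rr, cc = r + dr, c + dc
                    # Assumption: neighbors outside the image are ignored.
                    if 0 <= rr < rows and 0 <= cc < cols:
                        q = image[rr][cc]
                        if low[q] <= p <= up[q]:
                            counter += 1
            noisy[r][c] = counter < threshold
    return noisy
```

Because border pixels have fewer than eight neighbors under this assumption, a practical implementation would probably pad the image or lower the threshold near the border.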
Once all noisy pixels have been determined, a sliding window W of size 3 × 3 moves
over the input image I pixel by pixel, and noisy pixels in W are counted. If the number of
noisy pixels in W is greater than a predefined threshold, it is classified as a noisy block.
Otherwise, the block is initially classified as a matching block.
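A minimal Python sketch of this block classification is given here. It assumes the noise mask is a grid of booleans and identifies each window by its top-left corner; the function name and the parameter `max_noisy` (standing in for the predefined threshold) are hypothetical.

```python
def classify_blocks(noisy, max_noisy):
    """Label every 3x3 window, keyed by its top-left corner (r, c).

    noisy     : grid of booleans, True where a pixel was flagged as noise
    max_noisy : a window with more noisy pixels than this is a noisy block
    Returns a dict mapping (r, c) to 'noisy' or 'matching'.
    """
    rows, cols = len(noisy), len(noisy[0])
    labels = {}
    for r in range(rows - 2):
        for c in range(cols - 2):
            # Count the flagged pixels inside the 3x3 window at (r, c).
            count = sum(noisy[r + i][c + j] for i in range(3) for j in range(3))
            labels[(r, c)] = "noisy" if count > max_noisy else "matching"
    return labels
```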
Among the matching blocks, the blocks including at least one noisy pixel are target
blocks to be denoised using other similar matching blocks. In order to reduce the computing
time in searching similar blocks, we represent the pixel locations in a 3 × 3 window as
numbers from 1 through 9, and maintain a set of 256-D arrays of pointers {Ai[0 . . . 255]: i
= 1, 2, . . . , 9}, where each i corresponds to a pixel location as shown in Fig. 2.
Arrays {Ai} play the role of a lookup table in which the information that “a pixel
of value p occurs at position i in a 3 × 3 block” is recorded in the list pointed to by Ai[p].
Thus, because Ai[p] points to the list containing the coordinates of all the matching blocks
whose ith pixel has the value p, the search for similar blocks can be done just by looking up
the lists. In practice, the conceptual structure in Fig. 2 can be built as a 3-D array, A[0 . . . 8]
[0 . . . 255][0 . . . MAX_PIX], where MAX_PIX is the number of occurrences of the most
frequent pixel value. The coordinate (r, c) of a block is converted to a scalar value by the
expression r × w + c, where w is the width of the image.
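The lookup structure can be sketched in Python with one dictionary of lists per block position instead of a fixed-size 3-D array. The row-major numbering of the nine positions (0 to 8 here, 1 to 9 in the text) is an assumption, since Fig. 2 is not reproduced, and the function name is hypothetical. For simplicity the sketch records every pixel of each block, including any that were flagged as noisy.

```python
from collections import defaultdict

def build_lookup(image, block_corners):
    """Build A[i][p] -> list of scalar coordinates of matching blocks.

    image         : list of rows of gray levels (0..255)
    block_corners : iterable of (r, c) top-left corners of matching blocks
    Position i runs 0..8, row-major within the 3x3 block (assumed layout).
    """
    w = len(image[0])
    A = [defaultdict(list) for _ in range(9)]
    for r, c in block_corners:
        key = r * w + c                      # scalar coordinate r*w + c
        for i in range(9):
            p = image[r + i // 3][c + i % 3]
            A[i][p].append(key)              # "value p occurs at position i"
    return A
```

A scalar key can be turned back into block coordinates with `divmod(key, w)`.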
C. Noise Removal
Denoising of Target Blocks. Let Inoise denote the set of noisy pixels in I. Suppose the
arrays Ai, i = 1, 2, . . . , 9, have been constructed as described earlier. The next step is to collect
similar matching blocks for each target block. Computation of the distance between target
and matching blocks involves noise, and hence the Euclidean distance cannot be directly
employed because comparison of noise is meaningless. We propose a concept of tolerance
level d and define a tolerance function
The tolerance function determines whether two pixels are similar or not. Now, suppose
we are comparing a target block T = [t1, t2, . . . , t9] where ti is a pixel value at position i and
a matching block M = [p1, p2, . . . , p9] which is stored in the lists pointed by Ai’s. We define
the similarity of T and M as
Where
In (3), noise cannot be involved in computing the similarity of two blocks. Our goal is
to build a set of matching blocks Σ(T) for T, which satisfy the following condition for a
threshold τ:
The search process continues until a sufficient number #S, say at least five, of matching
blocks is collected in Σ(T). If the number of matching blocks in Σ(T) is not sufficient, the
tolerance level d is increased by Δd and the search process starts again, repeating until a
sufficient number of matching blocks is collected. Denoising of T is done by replacing the noisy
pixels in T with the average of all the clean pixels at the corresponding location of all the
matching blocks in Σ(T).
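The matching and averaging steps can be sketched in Python as follows. Since the defining equations are not reproduced here, the sketch rests on assumptions that should be checked against the original formulas: two clean pixels are taken to be similar when they differ by at most d, the similarity of two blocks is the number of such positions, and a block is accepted when that count is at least τ. Noisy pixels are represented by `None`, candidates are scanned linearly rather than through the lookup lists, and the `max_d` guard is added only so the loop always terminates.

```python
NOISE = None  # marker for a pixel flagged as noisy (assumed representation)

def similar(a, b, d):
    """Assumed tolerance function: clean values match if they differ by at most d."""
    return abs(a - b) <= d

def simil(T, M, d):
    """Count positions where both pixels are clean and within tolerance d."""
    return sum(
        1 for t, p in zip(T, M)
        if t is not NOISE and p is not NOISE and similar(t, p, d)
    )

def denoise_target(T, candidates, d, tau, min_blocks=5, delta_d=1, max_d=255):
    """Replace noisy pixels of T by the mean of clean pixels from matched blocks."""
    while True:
        matched = [M for M in candidates if simil(T, M, d) >= tau]
        if len(matched) >= min_blocks or d >= max_d:
            break
        d += delta_d                    # relax the tolerance and search again
    out = list(T)
    for i, t in enumerate(T):
        if t is NOISE:
            clean = [M[i] for M in matched if M[i] is not NOISE]
            if clean:                   # otherwise the pixel stays flagged
                out[i] = sum(clean) / len(clean)
    return out
```

A real implementation would gather candidates from the lists Ai[p − d . . p + d] instead of scanning every block, which is where the reported speed-up comes from.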
Fig. 3 shows an example of searching matching blocks when the tolerance level d is 1
and τ in (5) is 3. On the left-hand side, target block T = [10, 15, 9, •, •, •, 8, •, 5] is given,
where “•” denotes noise. As t1 of T is 10 and d is 1, all the blocks pointed to by A1[10 − d . . 10
+ d] = A1[9 . . 11] are candidates for matching to T.
Among them, Fig. 3 shows that only M1 and M2 match T, because simil(T,
M1) = 4 and simil(T, M2) = 4, and hence M1 and M2 satisfy the condition of (5).
Denoising of Noisy Blocks. We divide a noisy block B into four surrounding blocks Ti, i = 1,
2, . . . , 4, as shown in Fig. 4(a). As denoising of all the target blocks is complete before
denoising of noisy blocks starts, it is very probable that the surrounding blocks of a noisy
block are target blocks.
When two or more noisy blocks happen to be adjacent, like B1 and B2 in Fig. 4(b),
due to the cascading effect, denoising of T1 makes T2 a target block. After converting
noisy blocks into target blocks, denoising starts for the target blocks, as described above. The
overall denoising algorithm is summarized in Fig. 5. In the algorithm, step 10 takes care of
the situation when some target blocks cannot be matched to any matching blocks within the
given tolerance level. If this happens, then instead of increasing the tolerance level as in
step 4, the target and noisy blocks are divided into 2 × 2 blocks that contain only one noisy
pixel, and then block matching is performed for these 2 × 2 blocks repeatedly.
CHAPTER 4
• RAM: 4 GB
• Tool: MATLAB
CHAPTER 5
RESULT
CHAPTER 6
CONCLUSION
We proposed an efficient block-matching method for denoising input images
corrupted by random-valued impulse noise. In the preprocessing stage, the noisy pixels
are identified using the local homogeneity property. With this information of locations of
noisy pixels, the proposed method divides blocks into matching and noisy blocks, based on
the number of noisy pixels in the block. Among matching blocks, nonoverlapping blocks
containing at least one noisy pixel are called target blocks and are subject to denoising. The
denoising operation is carried out using the array of pointers, which makes it possible to
determine what pixel values occur at which locations in the image, and hence reduces the
search time to find the matching blocks. Our method has shown superior performance in terms of
computing time and quality of the restored images. One future research direction to
improve the proposed method is to quantize the second dimension of the A arrays in Fig. 2 by
Q, so that the index range is set to [0 . . . 255/Q] instead of [0 . . . 255]. This modification is
expected to improve the search time. Another research direction is to investigate the extent of
the effects that a high noise rate has on identifying noisy pixels and subsequently on the
denoising performance.
REFERENCES
[1] N. Yao and K. K. Ma, “Adaptive rood pattern search for fast block-matching motion
estimation,” IEEE Trans. Image Process., vol. 11, no. 12, pp. 1442–1448, Dec. 2002.
[2] C. Je and H.-M. Park, “Optimized hierarchical block matching for fast and accurate image
registration,” Signal Process.: Image Commun., vol. 28, no. 7, pp. 779–791, 2013.
[3] K. Dabov, A. Foi, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain
collaborative filtering,” IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–2095, Aug.
2007.
[4] B. Ahn and N. I. Cho, “Block-matching convolutional neural network for image
denoising,” CoRR abs/1704.00524, 2017.
[5] C. T. Lu, Y. Y. Chen, L. L. Wang, and C. F. Chang, “Removal of salt-and-pepper noise in
corrupted image using three-values-weighted approach with variable-size window,” Pattern
Recognit. Lett., vol. 80, pp. 188–199, 2016.
[6] A. Roy, J. Singha, S. S. Devi, and R. H. Laskar, “Impulse noise removal using SVM
classification based fuzzy filter from gray scale images,” Signal Process., vol. 28,
pp. 262–273, 2016.
[7] A. Roy and R. H. Laskar, “Non-casual linear prediction based adaptive filter for removal
of high density impulse noise from color images,” AEU Int. J. Electron. Commun., vol. 72,
pp. 114–124, 2017.
[8] G. Pok, J. C. Liu, and A. S. Nair, “Selective removal of impulse noise based on
homogeneity level information,” IEEE Trans. Image Process., vol. 12, no. 1, pp. 85–92, Jan.
2003.
[9] Y. Dong and S. Xu, “A new directional weighted median filter for removal of
random-valued impulse noise,” IEEE Signal Process. Lett., vol. 14, no. 3, pp. 193–196, Mar. 2007.
Appendix A
MATLAB
A.1 Introduction
The basic building block of MATLAB is the matrix. The fundamental data type is the
array. Vectors, scalars, real matrices, and complex matrices are handled as specific classes of
this basic data type. The built-in functions are optimized for vector operations. No dimension
statements are required for vectors or arrays.
A.2.1 MATLAB Window
The command window is where the user types MATLAB commands and expressions
at the prompt (>>) and where the output of those commands is displayed. It is opened when
the application program is launched. All commands including user-written programs are
typed in this window at MATLAB prompt for execution.
MATLAB defines the workspace as the set of variables that the user creates in a work
session. The workspace browser shows these variables and some information about them.
Double clicking on a variable in the workspace browser launches the Array Editor, which can
be used to obtain information.
The Current Directory tab shows the contents of the current directory, whose path is
shown in the current directory window. For example, in the Windows operating system the
path might be as follows: C:\MATLAB\Work, indicating that directory “Work” is a
subdirectory of the main directory “MATLAB”, which is installed in drive C. Clicking on the
arrow in the current directory window shows a list of recently used paths. MATLAB uses a
search path to find M-files and other MATLAB related files. Any file run in MATLAB must
reside in the current directory or in a directory that is on the search path.
The Command History Window contains a record of the commands a user has entered
in the command window, including both current and previous MATLAB sessions. Previously
entered MATLAB commands can be selected and re-executed from the command history
window by right clicking on a command or sequence of commands. This action launches a
menu from which to select various options in addition to executing the commands, and is a
useful feature when experimenting with various commands in a work session.
Editor Window
The MATLAB editor is both a text editor specialized for creating M-files and a
graphical MATLAB debugger. The editor can appear in a window by itself, or it can be a sub
window in the desktop. In this window one can write, edit, create and save programs in files
called M-files.
MATLAB editor window has numerous pull-down menus for tasks such as saving,
viewing, and debugging files. Because it performs some simple checks and also uses color to
differentiate between various elements of code, this text editor is recommended as the tool of
choice for writing and editing M-functions.
The output of all graphic commands typed in the command window is displayed in a
figure window.
MATLAB provides online help for all its built-in functions and programming
language constructs. The principal way to get help online is to use the MATLAB help
browser, opened as a separate window either by clicking on the question mark symbol (?) on
the desktop toolbar, or by typing helpbrowser at the prompt in the command window. The
Help Browser is a web browser integrated into the MATLAB desktop that displays
Hypertext Markup Language (HTML) documents. The Help Browser consists of two panes:
the help navigator pane, used to find information, and the display pane, used to view the
information. Self-explanatory tabs in the navigator pane are used to perform a search.
MATLAB Files
MATLAB has two types of files for storing information: M-files and MAT-files.
M-Files
These are standard ASCII text files with the .m extension. Users can create their own
matrices and programs using M-files, which are text files containing MATLAB code. The
MATLAB editor or another text editor is used to create a file containing the same statements
that would be typed at the MATLAB command line, and the file is saved under a name that
ends in .m. There are two types of M-files.
1. Script Files
A script file is an M-file containing a set of MATLAB commands; it is executed by typing the
name of the file on the command line. Script files operate on the variables currently present
in the workspace.
2. Function Files
A function file is also an M-file, except that the variables in a function file are all
local. This type of file begins with a function definition line.
MAT-Files
These are binary data files with the .mat extension that are created by
MATLAB when data is saved. The data is written in a special format that MATLAB
can read. MAT-files are loaded into MATLAB with the ‘load’ command.
SOME BASIC COMMANDS:
Scalar Calculations:
+ Addition
- Subtraction
* Multiplication
/ Right division (a/b means a ÷ b)
^ Exponentiation
Array or element-by-element operations are executed when the operator is preceded by a '.'
(Period):
The MATLAB desktop is the main MATLAB application window. The desktop contains five
subwindows: the command window, the workspace browser, the current directory window, the
command history window, and one or more figure windows, which are shown only when the
user displays a graphic.
The command window is where the user types MATLAB commands and expressions at
the prompt (>>) and where the output of those commands is displayed. MATLAB defines the
workspace as the set of variables that the user creates in a work session.
The workspace browser shows these variables and some information about them.
Double clicking on a variable in the workspace browser launches the Array Editor, which can
be used to obtain information and, in some instances, edit certain properties of the variable.
The Current Directory tab above the workspace tab shows the contents of the current
directory, whose path is shown in the current directory window. For example, in the Windows
operating system the path might be as follows: C:\MATLAB\Work, indicating that directory
“Work” is a subdirectory of the main directory “MATLAB”, which is installed in
drive C. Clicking on the arrow in the current directory window shows a list of recently used
paths. Clicking on the button to the right of the window allows the user to change the current
directory.
MATLAB uses a search path to find M-files and other MATLAB related files, which
are organized in directories in the computer file system. Any file run in MATLAB must reside
in the current directory or in a directory that is on the search path. By default, the files supplied
with MATLAB and MathWorks toolboxes are included in the search path. The easiest way to
see which directories are on the search path, or to add or modify a search path, is to select
Set Path from the File menu on the desktop, and then use the Set Path dialog box. It is good
practice to add any commonly used directories to the search path to avoid repeatedly having
to change the current directory.
The Command History Window contains a record of the commands a user has entered
in the command window, including both current and previous MATLAB sessions. Previously
entered MATLAB commands can be selected and re-executed from the command history
window by right clicking on a command or sequence of commands.
This action launches a menu from which to select various options in addition to
executing the commands. This is a useful feature when experimenting with various
commands in a work session.
APPENDIX B
Digital image processing is the use of algorithms to perform image processing of digital
images. As a subcategory or field of digital signal processing, digital image processing has
many advantages over analog image processing. It allows a much wider range of algorithms to
be applied to the input data and can avoid problems such as the build-up of noise and signal
distortion during processing. Since images are defined over two dimensions (perhaps more)
digital image processing may be modelled in the form of multidimensional systems.
Image processing in its broadest sense is an umbrella term for representing and
analyzing data in visual form. More narrowly, image processing is the manipulation of
numeric data contained in a digital image for the purpose of enhancing its visual appearance.
Through image processing, faded pictures can be enhanced, medical images clarified, and
satellite photographs calibrated. Image processing software can also translate numeric
information into visual images that can be edited, enhanced, filtered, or animated in order to
reveal relationships previously not apparent. Image analysis, in contrast, involves collecting
data from digital images in the form of measurements that can then be analyzed and
transformed.
Originally developed for space exploration and biomedicine, digital image processing
and analysis are now used in a wide range of industrial, artistic, and educational applications.
Software for image processing and analysis is widely available on all major computer
platforms. This software supports the modern adage that "a picture is worth a thousand words,
but an image is worth a thousand pictures."
1.1 History
Many of the techniques of digital image processing, or digital picture processing as it often
was called, were developed in the 1960s at the Jet Propulsion Laboratory, Massachusetts
Institute of Technology, Bell Laboratories, University of Maryland, and a few other research
facilities, with application to satellite imagery, wire-photo standards conversion, medical
imaging, videophone, character recognition, and photograph enhancement. The cost of
processing was fairly high, however, with the computing equipment of that era. That changed
in the 1970s, when digital image processing proliferated as cheaper computers and dedicated
hardware became available. Images then could be processed in real time, for some dedicated
problems such as television standards conversion. As general-purpose computers became
faster, they started to take over the role of dedicated hardware for all but the most specialized
and computer-intensive operations.
To create a digital image, we need to convert the continuous sensed data into digital form. This
involves two processes:
1) Sampling
2) Quantization.
The basic idea behind sampling and quantization is illustrated in Fig. 1.4 and Fig. 1.5. An
image may be continuous with respect to the x- and y-coordinates, and also in amplitude. To
convert it to digital form, we have to sample the function in both coordinates and in amplitude.
Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called
quantization.
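The two steps can be illustrated with a short Python sketch for a single scan line. It assumes the continuous signal is available as a function returning amplitudes in [0, 1]; the function name, the uniform sampling grid, and the rounding rule are illustrative choices rather than part of the description above.

```python
import math

def sample_and_quantize(f, x_start, x_end, num_samples, levels):
    """Sample f on a uniform grid, then map each amplitude to an integer level.

    Assumes f returns amplitudes in [0, 1]; values outside are clamped.
    """
    step = (x_end - x_start) / (num_samples - 1)
    # Sampling: digitize the coordinate values.
    samples = [f(x_start + k * step) for k in range(num_samples)]
    # Quantization: digitize the amplitude values into 0..levels-1.
    return [min(levels - 1, max(0, round(s * (levels - 1)))) for s in samples]

# One scan line of a hypothetical continuous image: 5 samples, 256 gray levels.
line = sample_and_quantize(lambda x: 0.5 + 0.5 * math.sin(x), 0.0, math.pi, 5, 256)
```

Increasing `num_samples` refines the spatial resolution, while increasing `levels` refines the gray-level resolution.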
Fig 1.2 Continuous image.
Fig 1.3 A plot of amplitude values along a line of the continuous image.
Fig 1.6 (a) Continuous image projected onto a sensor array. (b) Result of image sampling and
quantization.
The result of sampling and quantization is a matrix of real numbers. We will use two
principal ways to represent digital images. Assume that an image f(x, y) is sampled so that the
resulting digital image has M rows and N columns. The values of the coordinates (x, y) now
become discrete quantities. For notational clarity and convenience, we shall use integer values
for these discrete coordinates. Thus, the values of the coordinates at the origin are (x, y) = (0,
0). The next coordinate values along the first row of the image are represented as (x, y) = (0,
1). It is important to keep in mind that the notation (0, 1) is used to signify the second sample
along the first row. It does not mean that these are the actual values of physical coordinates
when the image was sampled.
Fig 1.7 Coordinate convention to represent digital images
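The notation introduced in the preceding paragraph allows us to write the complete
M × N digital image in the following compact matrix form:
\[
f(x, y) =
\begin{bmatrix}
f(0,0) & f(0,1) & \cdots & f(0,N-1) \\
f(1,0) & f(1,1) & \cdots & f(1,N-1) \\
\vdots & \vdots & & \vdots \\
f(M-1,0) & f(M-1,1) & \cdots & f(M-1,N-1)
\end{bmatrix}
\]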
The right side of this equation is by definition a digital image. Each element of this
matrix array is called an image element, picture element, pixel, or pel. The terms image and
pixel will be used throughout the rest of our discussions to denote a digital image and its
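elements. In some discussions, it is advantageous to use a more traditional matrix notation to
denote a digital image and its elements:
\[
A =
\begin{bmatrix}
a_{0,0} & a_{0,1} & \cdots & a_{0,N-1} \\
a_{1,0} & a_{1,1} & \cdots & a_{1,N-1} \\
\vdots & \vdots & & \vdots \\
a_{M-1,0} & a_{M-1,1} & \cdots & a_{M-1,N-1}
\end{bmatrix},
\qquad a_{i,j} = f(i, j).
\]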
Image editing encompasses the process of altering images, whether they are digital
photographs, traditional analog photographs, or illustrations. Traditional analog image editing
is known as photo retouching, using tools such as an airbrush to modify photographs, or
editing illustrations with any traditional art medium. Graphic software programs, which can be
broadly grouped into vector graphics editors, raster graphics editors, and 3D modellers, are the
primary tools with which a user may manipulate, enhance, and transform images. Many image
editing programs are also used to render or create computer art from scratch.
To apply a brightness filter, you simply add a fixed amount to every pixel in the image
and then clamp the result to ensure it remains in the range 0 - 255. To apply a contrast filter,
you determine whether a pixel is lighter or darker than a threshold amount. If it is lighter, you
scale the pixel's intensity up; otherwise, you scale it down. In code, this is done by subtracting
the threshold from a pixel, multiplying by the contrast factor, and adding the threshold value
back again. As with the brightness filter, the resulting value needs to be clamped to ensure it
remains in the range 0 - 255.
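Both filters can be sketched in a few lines of Python operating on a flat list of 8-bit gray values. The function names and the default threshold of 128 are assumptions made for illustration.

```python
def clamp(v):
    """Keep a value inside the 8-bit range 0..255."""
    return max(0, min(255, int(round(v))))

def brightness(pixels, amount):
    """Add a fixed amount to every pixel, then clamp."""
    return [clamp(p + amount) for p in pixels]

def contrast(pixels, factor, threshold=128):
    """Scale each pixel's distance from the threshold by `factor`, then clamp."""
    return [clamp((p - threshold) * factor + threshold) for p in pixels]
```

With a factor greater than 1, pixels above the threshold move toward white and pixels below it move toward black; a factor between 0 and 1 reduces contrast instead.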
In photography and computing, a gray scale or grey scale digital image is an image in
which the value of each pixel is a single sample, that is, it carries only intensity information.
Images of this sort, also known as black-and-white, are composed exclusively of shades of
gray, varying from black at the weakest intensity to white at the strongest. Gray scale images
are distinct from one-bit bi-tonal black-and-white images, which in the context of computer
imaging are images with only the two colors, black, and white (also called bi-level or binary
images). Gray scale images have many shades of gray in between. Gray scale images are also
called monochromatic, denoting the absence of any chromatic variation (i.e., one color).
Gray scale images are often the result of measuring the intensity of light at each pixel in
a single band of the electromagnetic spectrum (e.g. infrared, visible light, ultraviolet, etc.), and
in such cases they are monochromatic proper when only a given frequency is captured. They
can also be synthesized from a full color image.
1.4.4 Inverted Image
An inverted image can be interpreted as a digital version of an image negative. Informally,
after inversion every color is replaced by its opposite. More precisely, a positive image is a
normal, original RGB or gray image. A negative
image denotes a tonal inversion of a positive image, in which light areas appear dark and dark
areas appear light. In negative images, a color reversal is also achieved, such that red
areas appear cyan, greens appear magenta, and blues appear yellow. In the simpler
gray scale case, using 0 for black and 255 for white, a near-black
pixel value of 5 will be converted to 250, or near-white.
Image inversion is one of the easiest techniques in image processing. Therefore, it is
well suited to demonstrations of performance, acceleration, and optimization. Many
state-of-the-art image processing libraries, such as OpenCV, Gandalf, and VXL, perform this
operation as fast as possible, even though further acceleration using parallel hardware is
possible.
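For 8-bit data, inversion amounts to subtracting each sample from 255. A minimal Python sketch, with a hypothetical function name, is:

```python
def invert(values, max_value=255):
    """Return the negative of a list of 8-bit samples (gray levels or RGB channels)."""
    return [max_value - v for v in values]
```

Applied to the gray value 5 this gives 250, and applied to the channels of pure red (255, 0, 0) it gives cyan (0, 255, 255), in line with the description above.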
1.4.5 Blurring
Blurring an image usually makes the image appear unfocused. In signal processing, blurring is
generally obtained by convolving the image with a low-pass filter. The
amount of blurring is increased by increasing the pixel radius of the filter.
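One of the simplest low-pass filters is the box (mean) filter, sketched here in Python as an illustration. The choice of a box filter, the function name, and the handling of borders (averaging only over pixels that lie inside the image) are assumptions.

```python
def box_blur(image, radius):
    """Replace each pixel by the mean of its (2*radius+1)^2 neighborhood.

    image : list of rows of gray levels
    Near the border, only pixels inside the image are averaged (assumed policy).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0, 0
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += image[rr][cc]
                    count += 1
            out[r][c] = total / count
    return out
```

A larger `radius` averages over a wider neighborhood and therefore blurs more strongly; a radius of 0 leaves the image unchanged.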
Histogram:
Histograms are the basis for numerous spatial domain processing techniques. Histogram
manipulation can be used effectively for image enhancement. The histogram of a digital image
with gray levels in the range [0, L-1] is a discrete function h(rk) = nk, where rk is the kth gray
level and nk is the number of pixels in the image having gray level rk.
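The definition translates directly into a counting loop. A minimal Python sketch, assuming integer gray levels stored as a list of rows and using a hypothetical function name, is:

```python
def histogram(image, levels=256):
    """Return h, where h[k] is the number of pixels with gray level k."""
    h = [0] * levels
    for row in image:
        for r_k in row:
            h[r_k] += 1
    return h
```

The entries of `h` always sum to the total number of pixels in the image.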
Equalization:
Consider for a moment continuous functions, and let the variable r represent the gray
levels of the image to be enhanced. In the initial part of our discussion, we assume that r has
been normalized to the interval [0, 1], with r = 0 representing black and r = 1 representing white.
Later, we consider a discrete formulation and allow pixel values to be in the interval [0, L-1].
For any r in the normalized interval, we consider transformations of the form s = T(r),
0 ≤ r ≤ 1, that produce a level s for every pixel value r in the original image. For reasons that
will become obvious shortly, we assume that the transformation function T(r) satisfies the
following conditions: (a) T(r) is single-valued and monotonically increasing in the interval
0 ≤ r ≤ 1; and (b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.
The requirement in (a) that T(r) be single valued is needed to guarantee that the inverse
transformation will exist, and the monotonicity condition preserves the increasing order from
black to white in the output image. A transformation function that is not monotonically
increasing could result in at least a section of the intensity range being inverted, thus
producing some inverted gray levels in the output image. While this may be a desirable effect
in some cases, that is not what we are after in the present discussion. Finally, condition (b)
guarantees that the output gray levels will be in the same range as the input levels.
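In the discrete case, a common choice of T that satisfies both conditions is the scaled cumulative histogram. The Python sketch below uses that standard formulation, s_k = round((L − 1) · Σ_{j ≤ k} n_j / n); it is offered as an illustration under that assumption, since the discrete formula is not derived in the text above, and the function name is hypothetical.

```python
def equalize(image, levels=256):
    """Histogram-equalize an image given as a list of rows of gray levels."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    # Build the mapping from the running (cumulative) histogram; because the
    # running sum never decreases, the mapping is monotonically nondecreasing.
    mapping, running = [], 0
    for count in hist:
        running += count
        mapping.append(round((levels - 1) * running / total))
    return [[mapping[v] for v in row] for row in image]
```

The output levels stay within [0, L − 1], which is the discrete counterpart of condition (b).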
Image Formats:
Raster formats:
JPEG stands for "Joint Photographic Experts Group" and, as its name suggests, was specifically
developed for storing photographic images. It has also become a standard format for storing
images in digital cameras and displaying photographic images on internet web pages. JPEG
files are significantly smaller than those saved as TIFF; however, this comes at a cost, since
JPEG employs lossy compression. A great thing about JPEG files is their flexibility. The JPEG
file format is really a toolkit of options whose settings can be altered to fit the needs of each
image.