Efficient Block Matching For Removing Impulse Noise

Uploaded by

lakshmiv91163
Copyright
© All Rights Reserved
Efficient Block Matching for Removing Impulse Noise

Abstract

A number of block-based image-denoising methods have been presented in the literature.
Those methods, however, are generally designed for Gaussian noise and consequently do not
perform well on random-valued impulse noise or salt-and-pepper noise. We propose an
efficient block-based image-denoising method devised specifically for fast removal of
impulse noise. The method first constructs a set of arrays of pointers to the image blocks
that contain a specific pixel value at a specific location. With this scheme, the blocks
similar to a given block can be found by considering only the blocks referenced by the
pointers that correspond to the pixel values of the given block, without comparing against
all the blocks in the input image. The experimental results show that the proposed method
achieves superior denoising performance in terms of computational time and signal-to-noise
ratio.

Index Terms: Array of pointers, block matching, denoising, homogeneity level, impulse
noise.

CHAPTER 1

INTRODUCTION

IMPULSE noise has been successfully removed by the median filter. Most median
filters consider only the neighboring pixels around the pixel to be denoised, and therefore
the accuracy of the estimated ground-truth value is limited. This problem can be addressed by
obtaining the estimate of the true value from similar blocks in the image. The
block-matching-based approach rests on the reasonable premise that corrupted pixel values
can be nearly perfectly recovered from the pixel values of blocks similar to the block
being denoised.
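
The neighborhood-only median filtering described above can be sketched as follows. This is a minimal illustration, not the filter evaluated in this work: the function name median_filter_3x3, the fixed 3x3 window, and the border handling by replication are assumptions.

```python
from statistics import median


def median_filter_3x3(img):
    """Replace every pixel by the median of its 3x3 neighborhood.

    `img` is a list of rows of integer gray levels. Coordinates outside
    the image are clamped to the border (an assumed border policy).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    window.append(img[yy][xx])
            # Nine values, so the median is always one of the samples.
            out[y][x] = int(median(window))
    return out


# A single impulse (255) surrounded by clean pixels.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
restored = median_filter_3x3(noisy)
```

Because only the nine nearest pixels are consulted, the estimate degrades once most of the window is itself corrupted, which is the limitation that motivates searching for similar blocks elsewhere in the image.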

The idea of block matching is not new; it has been studied in video processing and
image registration [1], [2]. Block matching in denoising applications is, however, quite
different in nature from block matching in other image-processing applications because of
the presence of noisy pixels that cannot be directly used in matching. A block-matching-based
denoising method generally has to conduct an exhaustive search over the entire image to
collect a sufficient number of similar blocks, and consequently it is computationally
demanding.

BM3D [3] has addressed this problem by searching for candidate matching blocks in a
local neighborhood of the currently processed location. The authors did not explain
how to set the search range, which measure to use for comparing two blocks corrupted by
impulse noise, or how to cope with the situation in which the collected matching blocks
provide mostly noisy pixels for certain locations.

Ahn et al. [4] proposed a convolutional neural network (CNN) based block-matching
method for image denoising. This scheme first needs to apply a denoising algorithm to the
noisy image to obtain a pilot signal for training a CNN. In addition to this overhead, neural
networks generally require a long training time, which prevents the method from being used
in real-time applications. Lu et al. [5] proposed a three-values-weighted method where the
number of pixels in the maximum or minimum group determines the centroid of the middle group.

Roy et al. [6] used a support vector machine and a fuzzy filter to denoise gray-scale
images, and reported that the method is superior in preserving the image’s local structure.
Roy and Laskar [7] presented a linear-prediction-based adaptive filter to denoise color
images. Noisy pixels are identified by comparing the linear prediction error with a predefined
threshold, and adaptive vector median filtering is applied to the pixels with error greater than
the threshold.

We propose a method called the block matching by a lookup table (BMLUT), which
can avoid the exhaustive search to find all matching blocks for the block being denoised. The
rationale behind using a lookup table is that computation of the distance between pixels
corrupted by impulse noise is meaningless, and hence a simple measure, such as the
Euclidean distance, cannot be directly employed.

In addition, because the number of clean pixels participating in the distance computation
varies among noisy blocks, the number of similar pixel values is a more useful measure than
the conventional distance measures. To address these issues, we have devised an
efficient and fast lookup-table-based scheme to find similar blocks for a given block.
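
The lookup-table idea can be illustrated with a small sketch. It is written under stated assumptions and is not the BMLUT implementation itself: the table is keyed by (offset within the block, pixel value), the extreme values 0 and 255 are assumed to be the noisy ones, values must match exactly, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

NOISY_VALUES = (0, 255)  # assumed salt-and-pepper extremes


def extract_blocks(img, size):
    """Return {(top, left): flattened block} for every size x size block."""
    h, w = len(img), len(img[0])
    return {
        (top, left): [img[top + dy][left + dx]
                      for dy in range(size) for dx in range(size)]
        for top in range(h - size + 1)
        for left in range(w - size + 1)
    }


def build_lookup_table(blocks):
    """Map (offset in block, pixel value) to the blocks holding that value there."""
    table = defaultdict(list)
    for pos, pixels in blocks.items():
        for offset, value in enumerate(pixels):
            if value not in NOISY_VALUES:
                table[(offset, value)].append(pos)
    return table


def find_similar_blocks(query_pos, blocks, table, min_matches):
    """Count, per candidate block, how many clean pixels agree with the query."""
    votes = Counter()
    for offset, value in enumerate(blocks[query_pos]):
        if value in NOISY_VALUES:
            continue  # a noisy pixel carries no matching information
        for pos in table.get((offset, value), ()):
            if pos != query_pos:
                votes[pos] += 1
    return sorted(pos for pos, n in votes.items() if n >= min_matches)


img = [
    [1, 2, 1, 2],
    [3, 4, 3, 4],
    [1, 2, 1, 255],   # 255 is an impulse
    [3, 4, 3, 4],
]
blocks = extract_blocks(img, 2)
table = build_lookup_table(blocks)
similar = find_similar_blocks((2, 2), blocks, table, min_matches=3)
```

Only the blocks listed under the query block's own clean pixel values are visited, so no distance is ever computed against the remaining blocks, and the vote count plays the role of the "number of similar pixel values" measure mentioned above.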

1.6 Deep Learning

1.6.1 Deep Learning:

Deep learning is a subset of machine learning that focuses on training deep neural
networks with multiple layers to learn and represent complex patterns in data. Deep neural
networks are composed of interconnected layers of artificial neurons that simulate the
structure and functioning of the human brain.

Deep learning is a branch of machine learning based on artificial neural networks. It
is capable of learning complex patterns and relationships within data, so not every rule
has to be programmed explicitly. It has become increasingly popular in recent years due to
advances in processing power and the availability of large datasets.

Deep learning is based on artificial neural networks (ANNs) with many layers, also
known as deep neural networks (DNNs). These neural networks are inspired by the structure
and function of the human brain’s biological neurons, and they are designed to learn from
large amounts of data.

1.6.2 Key aspects of deep learning include:

• Neural Networks: Deep learning relies on neural networks with multiple hidden layers,
allowing the network to learn hierarchical representations of the data. Each layer in the
network extracts higher-level features from the representations learned in the previous
layer. Deep neural networks can automatically learn and extract relevant features from
raw data, eliminating the need for manual feature engineering.
• Training Process: Deep learning models are trained through a process called back
propagation, where the network adjusts its internal parameters (weights and biases) to
minimize the difference between the predicted output and the target output. This process
involves propagating errors backward through the network and updating the parameters
using gradient descent optimization algorithms.
• Large-Scale Data: Deep learning models typically require a large amount of labeled data
for training. The availability of big data and advances in computing power have enabled
the success of deep learning models. The large-scale data allows deep neural networks to
learn complex representations and generalize well to new, unseen data.
• Applications: Deep learning has shown remarkable performance in various fields,
including computer vision, natural language processing, speech recognition, and
recommendation systems. It has achieved state-of-the-art results in tasks such as image
classification, object detection, machine translation, and speech synthesis.
• Deep learning excels in handling complex and high-dimensional data, capturing intricate
patterns, and achieving state-of-the-art performance in many AI tasks. However, it
typically requires more computational resources and data compared to traditional machine
learning approaches.

In summary, machine learning focuses on training algorithms to learn patterns and make
predictions or decisions, while deep learning is a specific approach within machine learning
that utilizes deep neural networks to learn complex representations. Deep learning has gained
significant attention and has been particularly successful in solving tasks that involve
complex data such as images, audio, and text.

Lane detection using deep learning is a popular approach that leverages the power of deep
neural networks to detect and track lane markings on the road. Deep learning models excel in
learning complex patterns and can effectively capture the distinctive characteristics of lane
markings, making them well-suited for this task. Here is a high-level overview of the lane
detection process using deep learning:

• Dataset Preparation: The first step is to collect or create a dataset of labeled images or
videos, where the lane markings are manually annotated. The annotations typically
involve marking the pixels or regions corresponding to the lane markings in the images or
videos.
• Data Pre-processing: The collected dataset is pre-processed to prepare it for training.
This may involve resizing the images, normalizing pixel values, and splitting the dataset
into training and validation sets.
• Model Architecture: A deep learning model architecture needs to be selected or designed
for lane detection. Convolutional Neural Networks (CNNs) are commonly used due to
their ability to capture spatial dependencies in images. The model architecture may
consist of multiple convolutional layers followed by pooling, fully connected layers, and
output layers.
• Training: The deep learning model is trained using the labeled dataset. The training
process involves feeding the input images into the model, comparing the predicted output
(lane markings) with the ground truth annotations, and updating the model's weights
through back propagation and gradient descent optimization algorithms. The objective is
to minimize the difference between the predicted output and the ground truth annotations.
• Post-processing: Once the model is trained, the lane detection results may undergo post-
processing steps to refine the detected lane markings. This may include techniques such
as filtering outliers, curve fitting, and extrapolation to extend the detected lanes.
• Evaluation and Testing: The trained model is evaluated on a separate test dataset to
assess its performance. Evaluation metrics such as accuracy, precision, recall, and F1
score can be used to measure the model's lane detection performance.
• Deployment: The trained lane detection model can be deployed in real-time applications,
such as autonomous vehicles or advanced driver-assistance systems (ADAS), to detect
and track lane markings in real-world scenarios.
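
The evaluation metrics named in the list above can be computed as in the following sketch. It assumes that the predicted and ground-truth lane masks are given as flat lists of 0/1 pixel labels; the function name and the sample data are illustrative only.

```python
def precision_recall_f1(predicted, truth):
    """Return (precision, recall, F1) for two equally long 0/1 label lists."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)        # lane pixel found
    fp = sum(1 for p, t in pairs if p and not t)    # false alarm
    fn = sum(1 for p, t in pairs if not p and t)    # missed lane pixel
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


predicted_mask = [1, 1, 0, 0, 1, 0, 0, 0]
truth_mask = [1, 0, 0, 0, 1, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(predicted_mask, truth_mask)
```

Plain accuracy is omitted here because lane pixels usually form a small fraction of the image, so a model that predicts no lane at all can still score a high accuracy.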

It's worth noting that there are different variations and approaches for lane detection using
deep learning, including single-image-based methods and video-based methods. Additionally,
techniques like semantic segmentation and instance segmentation can also be employed to
precisely detect and differentiate lane markings from other objects on the road.

Deep learning-based lane detection has shown promising results and has been
successfully applied in various real-world applications. However, it's important to fine-tune
and validate the model on diverse datasets and consider factors such as different weather
conditions, road types, and lighting variations to ensure robust and reliable lane detection
performance.
1.6.3 Application of Deep learning

Deep learning is a subset of machine learning that uses artificial neural networks
(ANNs) to model and solve complex problems. It is based on the idea of building artificial
neural networks with multiple layers, called deep neural networks, that can learn
hierarchical representations of the data.

Deep learning algorithms use a layered architecture, where the input data is passed
through an input layer and then propagated through multiple hidden layers, before reaching
the output layer. Each layer applies a set of mathematical operations, parameterized by its
weights and biases, to its input, and the output of one layer serves as the input to the next.
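
As a minimal sketch of this layer-by-layer propagation, the following code passes an input vector through fully connected layers. The ReLU activation, the layer sizes, and the parameter values are assumptions chosen only to keep the example small.

```python
def dense_layer(inputs, weights, biases):
    """Weighted sum per neuron; weights[j][i] links input i to neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def forward(x, layers):
    """Propagate x through hidden layers (with ReLU) and a linear output layer."""
    for weights, biases in layers[:-1]:
        x = relu(dense_layer(x, weights, biases))
    weights, biases = layers[-1]
    return dense_layer(x, weights, biases)


layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]),  # hidden layer: 2 inputs, 2 neurons
    ([[2.0, 1.0]], [-1.0]),                   # output layer: 2 inputs, 1 neuron
]
output = forward([3.0, 1.0], layers)
```

The hidden layer turns the input (3, 1) into (2, 3), and the output layer combines these two values into a single number, which shows how the output of one layer becomes the input of the next.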

The process of training a deep learning model involves adjusting the weights and
biases of the model to minimize the error between the predicted output and the true output.
This is typically done using a variant of gradient descent, an optimization algorithm that
adjusts the weights and biases in the direction of the steepest decrease in the error.
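
The weight and bias update can be illustrated on the smallest possible model, a single linear neuron trained with batch gradient descent on a mean squared error. The learning rate, the number of epochs, and the data are assumptions; a deep network applies the same update rule to every layer, with the gradients obtained by back propagation.

```python
def train_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient, i.e. toward lower error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Samples drawn from y = 2x + 1.
w, b = train_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```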

Deep learning has a wide range of applications, including image and speech
recognition, natural language processing, and computer vision. One of the main advantages
of deep learning is that it can automatically learn features from the data, which means that
it doesn’t require the features to be hand-engineered. This is particularly useful for tasks
where the features are difficult to define, such as image recognition.

1.6.4 Advantages of Deep Learning:


Deep learning has several advantages over traditional machine learning methods. Some of
the main ones are:

1. Automatic feature learning: Deep learning algorithms can automatically learn features
from the data, which means that they don’t require the features to be hand-engineered.
This is particularly useful for tasks where the features are difficult to define, such as
image recognition.
2. Handling large and complex data: Deep learning algorithms can handle large and
complex datasets that would be difficult for traditional machine learning algorithms to
process. This makes it a useful tool for extracting insights from big data.
3. Improved performance: Deep learning algorithms have been shown to achieve state-
of-the-art performance on a wide range of problems, including image and speech
recognition, natural language processing, and computer vision.
4. Handling non-linear relationships: Deep learning can uncover non-linear
relationships in data that would be difficult to detect through traditional methods.
5. Handling structured and unstructured data: Deep learning algorithms can handle
both structured and unstructured data such as images, text, and audio.
6. Predictive modeling: Deep learning can be used to make predictions about future
events or trends, which can help organizations plan for the future and make strategic
decisions.
7. Handling missing data: Deep learning algorithms can handle missing data and still
make predictions, which is useful in real-world applications where data is often
incomplete.
8. Handling sequential data: Deep learning algorithms such as Recurrent Neural
Networks (RNNs) and Long Short-term Memory (LSTM) networks are particularly
suited to handle sequential data such as time series, speech, and text. These algorithms
have the ability to maintain context and memory over time, which allows them to make
predictions or decisions based on past inputs.
9. Scalability: Deep learning models can be easily scaled to handle an increasing amount
of data and can be deployed on cloud platforms and edge devices.
10. Generalization: Deep learning models can generalize well to new situations or
contexts, as they are able to learn abstract and hierarchical representations of the data.
CHAPTER 2

LITERATURE SURVEY

[1] N. Yao and K. K. Ma, “Adaptive rood pattern search for fast block-matching motion
estimation,” IEEE Trans. Image Process., vol. 11, no. 12, pp. 1442–1448, Dec. 2002.

We propose a novel and simple fast block-matching algorithm (BMA), called adaptive
rood pattern search (ARPS), which consists of two sequential search stages: (1) initial search
and (2) refined local search. For each macroblock (MB), the initial search is performed only
once at the beginning in order to find a good starting point for the follow-up refined local
search. By doing so, unnecessary intermediate search and the risk of being trapped into local
minimum matching error points could be greatly reduced in long search case. For the initial
search stage, an adaptive rood pattern (ARP) is proposed, and the ARP's size is dynamically
determined for each MB, based on the available motion vectors (MVs) of the neighboring
MBs.

In the refined local search stage, a unit-size rood pattern (URP) is exploited
repeatedly, and unrestrictedly, until the final MV is found. To further speed up the search,
zero-motion prejudgment (ZMP) is incorporated in our method, which is particularly
beneficial to those video sequences containing small motion contents. Extensive experiments
conducted based on the MPEG-4 Verification Model (VM) encoding platform show that the
search speed of our proposed ARPS-ZMP is about two to three times faster than that of the
diamond search (DS), and our method even achieves higher peak signal-to-noise ratio
(PSNR) particularly for those video sequences containing large and/or complex motion
contents.

BLOCK-MATCHING algorithm (BMA) for motion estimation (ME) has been widely
adopted by current video coding standards such as H.261, H.263, MPEG-1, MPEG-2,
MPEG-4, and H.264 due to its effectiveness and simplicity for implementation. The most
straightforward BMA is the full search (FS), which exhaustively searches for the best
matching block within the search window. However, FS yields very high computational
complexity and makes ME the main bottleneck in real-time video coding applications. Thus,
using a fast BMA is indispensable to reduce the computational cost. In our view, existing
fast BMAs can be classified into four categories.
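
The full search that fast BMAs try to avoid can be sketched as follows. The sum of absolute differences (SAD) is assumed as the matching error, and the function names, frame contents, and search range are illustrative rather than taken from the cited paper.

```python
def sad(current, reference, top, left, ref_top, ref_left, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(current[top + dy][left + dx]
                   - reference[ref_top + dy][ref_left + dx])
               for dy in range(size) for dx in range(size))


def full_search(current, reference, top, left, size, search_range):
    """Test every displacement in the search window and keep the best one."""
    h, w = len(reference), len(reference[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = top + dy, left + dx
            if ry < 0 or rx < 0 or ry + size > h or rx + size > w:
                continue  # candidate block falls outside the reference frame
            cost = sad(current, reference, top, left, ry, rx, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


def make_frame(h, w, top, left):
    """Black frame with one textured 2x2 patch at (top, left)."""
    frame = [[0] * w for _ in range(h)]
    patch = [[50, 60], [70, 80]]
    for dy in range(2):
        for dx in range(2):
            frame[top + dy][left + dx] = patch[dy][dx]
    return frame


reference = make_frame(6, 6, 1, 1)
current = make_frame(6, 6, 2, 3)   # the patch moved down 1 and right 2
motion_vector, cost = full_search(current, reference, 2, 3, 2, 2)
```

With a search range of R the loop visits up to (2R + 1)^2 candidate positions per block, which is the cost that pattern-based searches such as ARPS reduce by testing only a few points.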

In this paper, we have proposed a novel and simple fast block-matching algorithm,
called adaptive rood pattern search (ARPS). By exploiting higher distribution of MVs in the
horizontal and vertical directions and the spatial inter-block correlation, ARP adaptively
exploits adjustable rood-shaped search pattern (which is powerful in tracking motion trend),
together with the search point indicated by the predicted MV, to match different motion
contents of video sequence for each macroblock.

In addition, an optional zero-motion prejudgment (ZMP) is incorporated into ARPS to
further benefit small-motion video sequences. When compared with DS, ARPS–ZMP
significantly increases the computational gain by factors ranging from 1.9 to 3.4 with
little reduction in average PSNR. In addition, ARPS–ZMP improves average PSNR
performance in large-motion video sequences (e.g., 0.24 dB higher in Foreman and 0.39 dB
higher in Coastguard). Meanwhile, ARPS’s simplicity and regularity are very desirable and
attractive for hardware implementation.

[2] C. Je and H.-M. Park, “Optimized hierarchical block matching for fast and accurate
image registration,” Signal Process.: Image Commun., vol. 28, no. 7, pp. 779–791, 2013.

Camera resolution has recently increased considerably, and the registration of
high-resolution images is computationally expensive even when hierarchical block
matching is used. This paper presents a novel optimized hierarchical block matching algorithm
in which the computational cost is minimized for the scale factor and the number of levels in
the hierarchy. The algorithm is based on a generalized version of the Gaussian pyramid and
its inter-layer transformation of coordinates. The search window size is properly determined
to resolve possible error propagation in hierarchical block matching.

In addition, we propose a simple but effective method for aligning colors
between two images based on color distribution adjustment as a preprocessing step. Simplifying a
general color imaging model, we show much of the color inconsistency can be compensated
by our color alignment method. The experimental results show that the optimized hierarchical
block matching and color alignment methods increase the block matching speed and
accuracy, and thus improve image registration. Using our algorithm, it takes about 1.28 s for
overall registration process with a pair of images in 5 mega-pixel resolution.

Image registration is the process of geometrically transforming an image to make it
match with another image. It is one of the fundamental and important operations in image
processing and its applications, and has been investigated by many researchers [1], [2], [3],
[4]. Image registration has vast applications such as panoramic image mosaics [5], 3D
modeling [6], remote sensing [7], image deblurring [8], image resolution enhancement [9],
image compression [10], and medical applications [3], [4].

Hierarchical (multiresolution) matching methods, mostly based on image pyramids,
are frequently employed to speed up the matching process and to reduce the matching error in
image registration. Camera resolution has recently increased considerably, and the
resolutions of images captured by many commercial cameras are higher than 5 mega pixels.
For such high-resolution images, image registration takes a long time even when using
hierarchical block matching [23], [22], [17], which is usually preferred to non-hierarchical
block matching [26] because of its speed and accuracy.

We present a novel optimized hierarchical block matching algorithm in which the
computational cost is minimized by optimizing the scale factor and the number of levels in
the hierarchy. The algorithm is based on a general Gaussian pyramid (a generalized version of
the Gaussian pyramid which is described in Section 4) and its inter-layer transformation of
coordinates. The search window size is properly determined to resolve possible error
propagation in hierarchical block matching. To the best of our knowledge, this is the first
work on optimally selecting the scale factor and the number of levels in image pyramids to
minimize the computational cost.
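
An image pyramid and its inter-layer coordinate mapping can be sketched as below. This is a simplification and not the general Gaussian pyramid of the cited paper: 2x2 box averaging stands in for Gaussian smoothing, the scale factor is fixed at 2, and the function names are hypothetical.

```python
def downsample_by_two(img):
    """Halve the resolution by averaging each 2x2 group of pixels."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)]
            for y in range(h)]


def build_pyramid(img, levels):
    """Return [finest, ..., coarsest]; each level is half the size of the previous."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample_by_two(pyramid[-1]))
    return pyramid


def to_finer_level(position):
    """Map a (row, column) match at one level to the next finer level."""
    y, x = position
    return (2 * y, 2 * x)


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
pyramid = build_pyramid(image, levels=3)
```

In hierarchical block matching a match is first located at the coarsest level, mapped with a transformation such as to_finer_level, and then refined within a small search window at each finer level, so a wrong coarse match can propagate unless that window is large enough.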

[3] K. Dabov, A. Foi, and K. Egiazarian, “Image denoising by sparse 3-D transform-
domain collaborative filtering,” IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–
2095, Aug. 2007.

We propose a novel image denoising strategy based on an enhanced sparse
representation in transform domain. The enhancement of the sparsity is achieved by grouping
similar 2D image fragments (e.g., blocks) into 3D data arrays which we call "groups."
Collaborative filtering is a special procedure developed to deal with these 3D groups. We
realize it using the three successive steps: 3D transformation of a group, shrinkage of the
transform spectrum, and inverse 3D transformation. The result is a 3D estimate that consists
of the jointly filtered grouped image blocks. By attenuating the noise, the collaborative
filtering reveals even the finest details shared by grouped blocks and, at the same time, it
preserves the essential unique features of each individual block.
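
The transform, shrinkage, and inverse-transform steps can be illustrated with a reduced sketch. It is not the BM3D algorithm: only the transform across the grouped blocks is applied (no 2D transform within each block), an orthonormal Walsh-Hadamard transform is swapped in for the transforms used in the paper, the group size must be a power of two, and the threshold is an arbitrary assumption.

```python
def hadamard(values):
    """Orthonormal Walsh-Hadamard transform (it is its own inverse).

    The number of values must be a power of two.
    """
    out = list(values)
    n = len(out)
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                a, b = out[i], out[i + step]
                out[i], out[i + step] = a + b, a - b
        step *= 2
    scale = n ** 0.5
    return [v / scale for v in out]


def collaborative_hard_threshold(group, threshold):
    """Filter a group of flattened, similar blocks jointly.

    For each pixel position: transform across the blocks, zero the small
    coefficients (shrinkage), and transform back.
    """
    n_pixels = len(group[0])
    filtered = [[0.0] * n_pixels for _ in group]
    for p in range(n_pixels):
        spectrum = hadamard([block[p] for block in group])
        kept = [c if abs(c) >= threshold else 0.0 for c in spectrum]
        for b, value in enumerate(hadamard(kept)):
            filtered[b][p] = value
    return filtered


# Four similar two-pixel blocks; the first pixel carries small noise.
group = [[100, 50], [102, 50], [98, 50], [100, 50]]
filtered = collaborative_hard_threshold(group, threshold=3.0)
```

Because the blocks are similar, almost all of the energy across the group falls into one coefficient, and the small coefficients removed by the threshold correspond mostly to noise.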

The filtered blocks are then returned to their original positions. Because these blocks
are overlapping, for each pixel, we obtain many different estimates which need to be
combined. Aggregation is a particular averaging procedure which is exploited to take
advantage of this redundancy. A significant improvement is obtained by a specially developed
collaborative Wiener filtering. An algorithm based on this novel denoising strategy and its
efficient implementation are presented in full detail; an extension to color-image denoising is
also developed. The experimental results demonstrate that this computationally scalable
algorithm achieves state-of-the-art denoising performance in terms of both peak signal-to-
noise ratio and subjective visual quality.

PLENTY of denoising methods exist, originating from various disciplines such as
probability theory, statistics, partial differential equations, linear and nonlinear filtering, and
spectral and multiresolution analysis. All these methods rely on some explicit or implicit
assumptions about the true (noise-free) signal in order to separate it properly from the random
noise. In particular, the transform-domain denoising methods typically assume that the true
signal can be well approximated by a linear combination of few basis elements. That is, the
signal is sparsely represented in the transform domain.

Hence, by preserving the few high-magnitude transform coefficients that convey
mostly the true-signal energy and discarding the rest which are mainly due to noise, the true
signal can be effectively estimated. The sparsity of the representation depends on both the
transform and the true-signal’s properties. The multiresolution transforms can achieve good
sparsity for spatially localized details, such as edges and singularities.

Because such details are typically abundant in natural images and convey a significant
portion of the information embedded therein, these transforms have found a significant
application for image denoising. Recently, a number of advanced denoising methods based on
multiresolution transforms have been developed, relying on elaborate statistical dependencies
between coefficients of typically overcomplete (e.g., translation-invariant and multiply-
oriented) transforms. Examples of such image denoising methods can be seen in [1]–[4].

[4] B. Ahn and N. I. Cho, “Block-matching convolutional neural network for image
denoising,” CoRR abs/1704.00524, 2017.

There are two main streams in up-to-date image denoising algorithms: non-local self
similarity (NSS) prior based methods and convolutional neural network (CNN) based
methods. The NSS based methods are favorable on images with regular and repetitive
patterns while the CNN based methods perform better on irregular structures. In this paper,
we propose a block matching convolutional neural network (BMCNN) method that combines
NSS prior and CNN. Initially, similar local patches in the input image are integrated into a 3D
block. In order to prevent the noise from messing up the block matching, we first apply an
existing denoising algorithm on the noisy image.

The denoised image is employed as a pilot signal for the block matching, and then
denoising function for the block is learned by a CNN structure. Experimental results show
that the proposed BMCNN algorithm achieves state-of-the-art performance. In detail,
BMCNN can restore both repetitive and irregular structures. With current image capturing
technologies, noise is inevitable especially in low light conditions. Moreover, captured
images can be affected by additional noises through the compression and transmission
procedures. Since the noise degrades visual quality, compression performance, and also the
performance of computer vision algorithms, the image denoising has been extensively studied
over several decades [1]–[13].

In order to estimate a clean image from its noisy observation, a number of methods
that take account of certain image priors have been developed. Among various image priors,
the NSS is considered a remarkable one such that it is employed in most of current state-of-
the-art methods. The NSS implies that some patterns occur repeatedly in an image and the
image patches that have similar patterns can be located far from each other. The nonlocal
means filter [1] is a seminal work that exploits this NSS prior.

The employment of NSS prior has boosted the performance of image denoising
significantly, and many up-to-date denoising algorithms [3], [7], [14]–[16] can be categorized
as NSS based methods. Most NSS based methods consist of the following steps. First, they find
similar patches and integrate them into a 3D block. Then the block is denoised using some
other priors such as low-band prior [3], sparsity prior [15], [16] and low-rank prior [7]. Since
the patch similarity can be easily disrupted by noise, the NSS based methods are usually
implemented as two-step or iterative procedures.

[5] C. T. Lu, Y. Y. Chen, L. L. Wang, and C. F. Chang, “Removal of salt-and-pepper
noise in corrupted image using three-values-weighted approach with variable-size
window,” Pattern Recognit. Lett., vol. 80, pp. 188–199, 2016.

The quality of a digital image deteriorates when it is corrupted by impulse noise during
recording or transmission. The process of efficiently removing this impulse noise from a
corrupted image is an important research task. This investigation presents a novel three-
values-weighted method for the removal of salt-and-pepper noise. Initially, a variable-size
local window is employed to analyze each extreme pixel (0 or 255 for an 8-bit gray-level
image). Each non-extreme pixel is classified and placed into the maximum, middle, or
minimum groups in the local window.

The numbers of non-extreme pixels belonging to the maximum or the minimum group
determine the centroid of the middle group. The distribution ratios of these three groups are
employed to weight the non-extreme pixels with the maximum, middle, and minimum pixel
values. The center pixel with an extreme value is replaced by the weighted value, thus
enabling the noisy pixels to be restored. Experimental results show that the proposed method
can efficiently remove salt-and-pepper noise (only for known extreme values of 0 and 255)
from a corrupted image for different noise corruption densities (from 10% to 90%);
meanwhile, the denoised image is freed from the blurred effect.

Images are inevitably corrupted by impulse noise, caused by malfunctioning pixels in
camera sensors, faulty memory locations in hardware, transmission in a noisy channel, and bit
errors in transmission. There are two types of impulse noise: salt-and-pepper noise and
random-valued noise. Salt-and-pepper noise can seriously corrupt an image, because each
corrupted pixel takes either the maximum or the minimum gray level. This noise can
significantly deteriorate the quality of an image. The process to remove this kind of impulse
noise efficiently is an important research task.
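
The two impulse noise types can be simulated as in the following sketch. It assumes 8-bit gray levels, an equal probability of salt and pepper, and a uniform distribution for random-valued impulses; the function names and the noise density are illustrative.

```python
import random


def add_salt_and_pepper(img, density, rng):
    """Replace each pixel, with probability `density`, by 0 or 255."""
    out = [row[:] for row in img]
    for row in out:
        for x in range(len(row)):
            if rng.random() < density:
                row[x] = rng.choice((0, 255))
    return out


def add_random_valued(img, density, rng):
    """Replace each pixel, with probability `density`, by a uniform value in 0..255."""
    out = [row[:] for row in img]
    for row in out:
        for x in range(len(row)):
            if rng.random() < density:
                row[x] = rng.randint(0, 255)
    return out


rng = random.Random(0)  # fixed seed so the experiment is repeatable
clean = [[100] * 4 for _ in range(4)]
salted = add_salt_and_pepper(clean, 0.3, rng)
random_valued = add_random_valued(clean, 0.3, rng)
```

Salt-and-pepper impulses can be located simply by testing for the extreme values, whereas random-valued impulses may take any gray level and are therefore harder to detect.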

Many techniques have been proposed for the restoration of the images contaminated
by salt-and-pepper noise. The median and the mean filters were popular for the removal of
salt-and-pepper noise due to their good denoising power and computational simplicity.
However, when the noise density is over 60%, some details and edges of the original image
cannot be restored well by these algorithms. This is because the number of noise-free pixels
in a local window is not sufficient to reconstruct the pixels corrupted by noise.

An adaptive median filter [10] selects the median value in an adaptive window for
each pixel, thus enabling the noisy pixels to be removed. This method performed well at low
noise densities. However, for high noise densities the window size has to be increased which
may lead to a blur in the denoised image. In the modified switching median filters [3], [20],
the decision of an impulse noise pixel is based on a pre-defined threshold value. The major
drawback of these methods is that defining a robust decision threshold becomes difficult. In
addition, these filters did not take the local features into account, so the details and the
edges could not be adequately restored, in particular when noise density was high.
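
The adaptive-window idea can be sketched in its commonly described two-stage form. This is an assumption about the general technique and not necessarily the exact variant cited as [10]; the maximum window radius and the function names are illustrative.

```python
from statistics import median_low


def window_values(img, y, x, radius):
    """Pixels of the (2*radius+1)^2 window around (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    return [img[yy][xx]
            for yy in range(max(0, y - radius), min(h, y + radius + 1))
            for xx in range(max(0, x - radius), min(w, x + radius + 1))]


def adaptive_median_pixel(img, y, x, max_radius=3):
    """Grow the window until its median is not an extreme value."""
    for radius in range(1, max_radius + 1):
        values = window_values(img, y, x, radius)
        lo, med, hi = min(values), median_low(values), max(values)
        if lo < med < hi:
            # The median looks clean: keep the pixel unless it is itself extreme.
            return img[y][x] if lo < img[y][x] < hi else med
    return med  # window limit reached: fall back to the last median


def adaptive_median_filter(img, max_radius=3):
    """Apply adaptive_median_pixel to every pixel of the image."""
    return [[adaptive_median_pixel(img, y, x, max_radius)
             for x in range(len(img[0]))]
            for y in range(len(img))]


# Smooth ramp with one salt (255) and one pepper (0) impulse.
image = [[100 + 5 * y + x for x in range(5)] for y in range(5)]
image[2][2] = 255
image[2][3] = 0
filtered = adaptive_median_filter(image)
```

At high noise densities the loop has to reach large radii before a non-extreme median appears, which is the source of the blurring noted above.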

[6] A. Roy, J. Singha, S. S. Devi, and R. H. Laskar, “Impulse noise removal using SVM
classification based fuzzy filter from gray scale images,” Signal Process., vol. 28, pp.
262–273, 2016.

In this paper, support vector machine (SVM) classification based Fuzzy filter (FF) is
proposed for removal of impulse noise from gray scale images. When an image is affected by
impulse noise, the quality of the image is distorted since the homogeneity among the pixels is
broken. SVM is incorporated for detection of impulse noise from images. Here, a system is
trained with an optimal feature set. When an image under test is processed through the trained
system, all the pixels under test image will be classified into two classes: noisy and non-
noisy.

Fuzzy filtering will be performed according to the decision achieved during the
testing phase. It provides about 98.5% true recognition when classifying noisy
and non-noisy pixels in an image corrupted by 90% impulse noise. It leads to an
improvement of the peak signal-to-noise ratio to 22.2437 dB for the proposed system when an
image is corrupted by 90% impulse noise. The simulation results also suggest that this
system outperforms some of the state-of-the-art methods while preserving structural similarity to
a large extent.

Impulse noise is an “on–off” noise that affects an image drastically. The intensity of
impulse noise tends to be either very large or very small. When an image is affected by
impulse noise, the homogeneity among pixels is distorted. As a result, the quality of the image
deteriorates considerably. Impulse noise is generated when images are captured through noisy
sensors, during the transmission of images through a corrupted channel, or due to
atmospheric disturbances such as lightning. Impulse noise therefore has to be removed from the
degraded image so that its quality is improved with little blurring
while preserving edges and other fine details.

In many image processing applications, it is necessary to remove the impulse noise
before further processing. Linear filters are not satisfactory for removing impulse noise.
Therefore, non-linear filters [1], [2] are preferred over linear filters in this context. The
median filter [3] is a commonly used order statistics filter. Many methods have been
proposed for removal of impulse noise. In the standard median filter (SMF) [4], impulse
noise is removed from the images by exploiting the fact that an impulse has a gray level
value that stands out when it is compared to the uncorrupted neighborhood pixels within the processing
kernel.

The median filter operates on every pixel of the noisy image, irrespective of whether the
pixel is corrupted or not. As a result, it introduces a blurring effect in the de-noised image. The
shortcomings of the median filter are addressed by the weighted median filter (WMF) [5] and the center
weighted median filter (CWMF) [6]. In these filters, more weight is given to a particular pixel
within the kernel under operation. However, uncorrupted pixels are still distorted while
the corrupted pixels are restored. So, the performance of these filters is not satisfactory when
removing high density impulse noise.

[7] A. Roy and R. H. Laskar, “Non-casual linear prediction based adaptive filter for
removal of high density impulse noise from color images,” AEU Int. J. Electron.
Commun. vol. 72, pp. 114–124, 2017.

In this paper, a non-causal linear prediction based adaptive vector median filter is
proposed for removal of high density impulse noise from color images. Generally, when an
image is affected by high density of impulse noise, homogeneity amongst the pixels is
distorted. In the proposed method, if the pixel under operation is found to be corrupted, the
filtering operation will be carried out. The decision about a particular pixel of being corrupted
or not depends on the linear prediction error calculated from the non-causal region around the
pixel under operation. If the error of the central pixel of the kernel exceeds some predefined
threshold value, adaptive window based vector median filtering operation will be performed.

The size of the adaptive window depends on the level of error relative to the
predefined threshold. The proposed filter improves the peak signal to noise ratio (PSNR) over
that of the modified histogram based fuzzy filter (MHFC) by approximately 4.5 dB. The results
for the structural similarity index measure (SSIM) suggest that the image details are maintained
significantly better by the proposed method than by earlier approaches. Subjective
evaluation also indicates that the proposed method outperforms some of the
existing filters.

Impulse noise is an “on-off” noise that affects the image abruptly. Impulse noise is
generated at the time of capturing images through sensors, during the transmission of images
or due to the atmospheric variations such as lightning, etc. When an image is corrupted by the
impulse noise, the quality of the image is degraded to a large extent. So, the removal of
impulse noise from the captured image is necessary to enhance the quality of the image. The
intensity of impulse noise tends to be either relatively too low or too high.
Therefore, the image quality is severely affected by high density impulse noise.
Preserving the image details and attenuating the noise are the two important aspects of image
restoration. In general, linear filters are effective for additive noise.
However, their performance for removal of impulse noise is not satisfactory.
Therefore, non-linear filters are preferred for removal of impulse noise. The median filter is a
commonly used non-linear filter for removal of impulse noise [1], [2], [3]. The median filter is
an order statistics filter that exhibits very good noise reduction capabilities with minimal
blurring. The concept of a median-based operation was previously used by Tukey in
time-series analysis [2]. The median filter is generally effective for gray scale images.
According to RGB model [4], one pixel in a color image consists of three channels (R = Red,
G = Green, B = Blue). Processing the three channels individually may produce color
distortion in the de-noised image. Thus, the vectored version of the median filter known as
vector median filter is used for removal of impulse noise from color images.

[8] G. Pok, J. C. Liu, and A. S. Nair, “Selective removal of impulse noise based on
homogeneity level information,” IEEE Trans. Image Process., vol. 12, no. 1, pp. 85–92,
Jan. 2003.

In this paper, we propose a decision-based, signal-adaptive median filtering algorithm
for removal of impulse noise. Our algorithm achieves accurate noise detection and high SNR
measures without smearing the fine details and edges in the image. The notion of
homogeneity level is defined for pixel values based on their global and local statistical
properties. The co-occurrence matrix technique is used to represent the correlations between a
pixel and its neighbors, and to derive the upper and lower bound of the homogeneity level.
Noise detection is performed at two stages: noise candidates are first selected using the
homogeneity level, and then a refining process follows to eliminate false detections.

The noise detection scheme does not use a quantitative decision measure, but uses
qualitative structural information, and it is not subject to burdensome computations for
optimization of the threshold values. Empirical results indicate that our scheme performs
significantly better than other median filters in terms of noise suppression and detail
preservation.

The median filter is a nonlinear filtering technique widely used for removal of
impulse noise [2], [6], [10]. Despite its effectiveness in smoothing noise, the median filter
tends to remove fine details when it is applied to an image uniformly.

To address this drawback, a number of modified median filters have been proposed,
e.g., minimum–maximum exclusive mean (MMEM) filter [7], prescanned minmax center-
weighted (PMCW) filter [14], and decision-based median filters [4], [5], [8], [13]. In these
methods, the filtering operation adapts to the local properties and structures in the image. In
the decision-based filtering, for example, image pixels are first classified as corrupted and
uncorrupted, and then passed through the median and identity filters, respectively.
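
The decision-based idea described above can be illustrated with a minimal Python sketch. It is not the algorithm of any cited paper: the noise mask `is_corrupted` is assumed to come from some detector that is not specified here, the function name `decision_based_median` is hypothetical, and a fixed 3 × 3 window clipped at the image border is an assumption of this sketch.

```python
from statistics import median


def decision_based_median(image, is_corrupted):
    """Median-filter only the pixels flagged as corrupted.

    image        : list of rows of integer gray levels (0..255)
    is_corrupted : list of rows of booleans, same shape (assumed detector output)
    Uncorrupted pixels pass through unchanged (identity filter).
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input is not modified
    for r in range(rows):
        for c in range(cols):
            if not is_corrupted[r][c]:
                continue  # identity filter for uncorrupted pixels
            # 3x3 window, clipped at the image border (assumption of this sketch)
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = int(median(window))
    return out
```

Because only flagged pixels are replaced, the quality of the result depends entirely on the decision rule that produces the mask, which is the issue discussed next.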

The main issue of the decision-based filter lies in building a decision rule, or a noise
measure, that can discriminate the uncorrupted pixels from the corrupted ones as precisely as
possible. In the method proposed by Han et al. [7], pixels that have values close to the
maximum and minimum in a filter window are discarded, and the average of the remaining
pixels in the window is computed.

[9] Y. Dong and S. Xu, “A New Directional weighted median filter for removal of
random-valued impulse noise,” IEEE Signal Process. Lett.,vol. 14, no. 3, pp. 193–196,
Mar. 2007.

The known median-based denoising methods tend to work well for restoring the
images corrupted by random-valued impulse noise with low noise level but poorly for highly
corrupted images. This letter proposes a new impulse detector, which is based on the
differences between the current pixel and its neighbors aligned with four main directions.
Then, we combine it with the weighted median filter to get a new directional weighted
median (DWM) filter. Extensive simulations show that the proposed filter not only can
provide better performance of suppressing impulse with high noise level but can preserve
more detail features, even thin lines.
When extended to restoring corrupted color images, this filter also performs very well.

Impulse noise is often introduced into images during acquisition and transmission. Based
on the noise values, it can be classified as the easier-to-restore salt-and-pepper noise and the
more difficult random-valued impulse noise [1]. There have been many more methods for
removing the former, and some of them have performed very well [2]–[5]. So here, we only
focus on removing the latter. Among all kinds of methods for impulse noise, the median filter
[6] is used widely because of its effective noise suppression capability and high
computational efficiency. However, it uniformly replaces the gray-level value of every pixel
by the median of its neighbors.

Consequently, some desirable details are also removed, especially when the window
size is large. In order to improve the median filter, many filters with an impulse detector are
proposed, such as signal-dependent rank order mean (SD-ROM) filter [7], multistate median
(MSM) filter [1], adaptive center weighted median (ACWM) filter [8], the pixelwise MAD
(PWMAD) filter [9], and iterative median filter [10]. These filters usually perform well, but
when the noise level is higher than 30%, they tend to remove many features from the images or
retain too much impulse noise.

CHAPTER 3

PROPOSED METHOD

We assume a random-valued impulse image model. For an input image I, a block in I
is defined as an n × n patch where n is an odd integer. In this letter, we fix n at 3. We
define a matching block as a block that contains fewer noisy pixels than a predefined threshold,
and a noisy block as any other (nonmatching) block. Among matching blocks, the block
currently being denoised is called the target block. Noisy pixels in target blocks are
denoised by using the clean pixels in matching blocks.

Fig. 1. (a) Color map of histogram H for the Lena image with 30% impulse noise. The white
line shows the vertical cross section of H at 150, which is plotted as (b) a blue curve.

A. Identification of Noisy Pixels

We determine whether a pixel is corrupted by using the local homogeneity assumption around the
pixel [8]. First, we build a two-dimensional (2-D) histogram H of size 256 × 256, where the
(i, j)th element, H(i, j), is the total number of neighboring pixels of value j around pixels of value
i. Fig. 1 shows the shape of H, where the vertical cross section of H at a point i on the
x-axis denotes the distribution of pixel values adjacent to all the pixels of value i.

For each pixel value i and a given percentage δ, we define the range of δ-homogeneity
by two values, low_δ(i) and up_δ(i), such that the sum of H(i, j) with j running from low_δ(i)
to up_δ(i) equals δ percent of the sum of all the bins on the cross section at i.

After low_δ(i) and up_δ(i) have been computed for all i from 0 to 255, noisy
pixels are determined as follows. For each pixel q adjacent to a pixel p, if the value of p is in the range of q’s
δ-homogeneity, then a counter is increased by 1. After all the eight neighboring pixels have been
considered, if the counter is less than a threshold, then p is classified as noise.
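
The detection step above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the text does not say how the δ-homogeneity range is positioned, so the sketch assumes that the excluded mass is trimmed equally from both tails of the cross section; border pixels simply have fewer than eight neighbors; and the function names and the default values of `delta` and `min_votes` are hypothetical.

```python
def neighbours(r, c, rows, cols):
    """Yield the coordinates of the (up to eight) neighbours of (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                yield rr, cc


def build_histogram(image):
    """H[i][j] = number of neighbouring pixels of value j around pixels of value i."""
    rows, cols = len(image), len(image[0])
    H = [[0] * 256 for _ in range(256)]
    for r in range(rows):
        for c in range(cols):
            for rr, cc in neighbours(r, c, rows, cols):
                H[image[r][c]][image[rr][cc]] += 1
    return H


def homogeneity_range(H_row, delta):
    """Return (low, up) covering roughly delta percent of the mass of H_row.

    ASSUMPTION: the remaining (100 - delta) percent is trimmed equally
    from the lower and upper tails; the source does not specify this.
    """
    total = sum(H_row)
    if total == 0:
        return 0, 255  # value never occurs: accept everything
    tail = total * (100.0 - delta) / 200.0
    low, acc = 0, 0
    while low < 255 and acc + H_row[low] <= tail:
        acc += H_row[low]
        low += 1
    up, acc = 255, 0
    while up > low and acc + H_row[up] <= tail:
        acc += H_row[up]
        up -= 1
    return low, up


def detect_noise(image, delta=80.0, min_votes=3):
    """Return a boolean mask; True marks pixels classified as noise."""
    rows, cols = len(image), len(image[0])
    H = build_histogram(image)
    ranges = [homogeneity_range(H[i], delta) for i in range(256)]
    noisy = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            p = image[r][c]
            votes = 0
            for rr, cc in neighbours(r, c, rows, cols):
                low, up = ranges[image[rr][cc]]  # range of neighbour q
                if low <= p <= up:
                    votes += 1
            noisy[r][c] = votes < min_votes
    return noisy
```

In this sketch a pixel is kept as clean when at least `min_votes` of its neighbors consider its value plausible; both parameters would have to be tuned for real images.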

B. Construction of Pointer Arrays for Matching Blocks

Once all noisy pixels have been determined, a sliding window W of size 3 × 3 moves
over the input image I pixel by pixel, and noisy pixels in W are counted. If the number of
noisy pixels in W is greater than a predefined threshold, it is classified as a noisy block.
Otherwise, the block is initially classified as a matching block.
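
A minimal Python sketch of this classification step is given below. It assumes that the noise mask from the previous subsection is available as a list of rows of booleans, that each block is identified by the coordinate of its center pixel, and that only windows lying fully inside the image are considered; the name `classify_blocks` and the default threshold `max_noisy` are hypothetical.

```python
def classify_blocks(noisy, max_noisy=4):
    """Classify every full 3x3 window as 'noisy' or 'matching'.

    noisy     : list of rows of booleans (True = pixel classified as noise)
    max_noisy : hypothetical threshold; more noisy pixels than this
                makes the window a noisy block
    Returns a dict mapping the window centre (r, c) to (label, noisy_count).
    """
    rows, cols = len(noisy), len(noisy[0])
    labels = {}
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            count = sum(noisy[rr][cc]
                        for rr in range(r - 1, r + 2)
                        for cc in range(c - 1, c + 2))
            label = "noisy" if count > max_noisy else "matching"
            labels[(r, c)] = (label, count)
    return labels
```

The noisy-pixel count is returned alongside the label because it is needed again later, for example to tell which matching blocks contain at least one noisy pixel.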

Among the matching blocks, the blocks that include at least one noisy pixel are target
blocks, to be denoised using other similar matching blocks. To reduce the computing
time spent searching for similar blocks, we represent the pixel locations in a 3 × 3 window as
numbers from 1 through 9, and maintain a set of 256-D arrays of pointers {Ai[0 . . . 255]: i
= 1, 2, . . . , 9}, where each i corresponds to a pixel location as shown in Fig. 2.

Fig. 2. Pixel locations Xi corresponding to array elements Ai.

The arrays {Ai} play the role of a lookup table in which the information “a pixel
of value p occurs at position i in a 3 × 3 block” is recorded in the list pointed to by Ai[p].
Thus, because Ai[p] points to the list containing the coordinates of all the matching blocks
whose ith pixel has value p, the search for similar blocks can be done just by looking up the
lists. In practice, the conceptual structure in Fig. 2 can be built as a 3-D array, A[0... 8][0...
255][0... MAX_PIX], where MAX_PIX is the count of the most frequent pixel value.
The coordinate (r, c) of a block is converted to a scalar value by the expression rw + c, where w is the
width of the image.
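
The lookup structure can be sketched in Python as below. This is an illustration under several assumptions rather than the authors' code: Python lists of variable length replace the fixed third dimension MAX_PIX, positions are numbered 0 to 8 in row-major order (the actual layout of Fig. 2 is not recoverable from the text), (r, c) is taken to be the top-left corner of a block, noisy pixels are not entered in the lists, and the names `build_pointer_arrays` and `max_noisy` are hypothetical.

```python
def build_pointer_arrays(image, noisy, max_noisy=4):
    """Build A[i][p]: ids of matching blocks whose i-th pixel is clean with value p.

    image : list of rows of gray levels (0..255)
    noisy : list of rows of booleans (True = noisy pixel)
    A block id is the scalar r * w + c of its top-left corner (assumption).
    """
    rows, cols = len(image), len(image[0])
    w = cols
    A = [[[] for _ in range(256)] for _ in range(9)]
    for r in range(rows - 2):
        for c in range(cols - 2):
            cells = [(r + dr, c + dc) for dr in range(3) for dc in range(3)]
            if sum(noisy[rr][cc] for rr, cc in cells) > max_noisy:
                continue  # noisy block: not entered in the lookup table
            block_id = r * w + c
            for i, (rr, cc) in enumerate(cells):
                if not noisy[rr][cc]:  # only clean pixels are indexed (assumption)
                    A[i][image[rr][cc]].append(block_id)
    return A
```

A block id can be converted back to its coordinate with `divmod(block_id, w)`, which returns `(r, c)`.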

C. Noise Removal

Denoising of Target Blocks. Let Inoise denote the set of noisy pixels in I. Suppose the
arrays Ai, i = 1, 2, . . . , 9, have been constructed as described earlier. The next step is to collect
similar matching blocks for each target block. Computing the distance between target
and matching blocks involves noise, and hence the Euclidean distance cannot be directly
employed because comparing noise is meaningless. We propose a concept of tolerance
level d and define a tolerance function

The tolerance function determines whether two pixels are similar or not. Now, suppose
we are comparing a target block T = [t1, t2, . . . , t9], where ti is the pixel value at position i, and
a matching block M = [p1, p2, . . . , p9], which is stored in the lists pointed to by the Ai’s. We define
the similarity of T and M as

Where

In (3), noise cannot be involved in computing the similarity of two blocks. Our goal is
to build a set of matching blocks Σ(T) for T, which satisfy the following condition for a
threshold τ:

Fig. 3. Visual illustration of finding matching blocks.


Fig. 4. (a) A noisy block is divided into four surrounding blocks. (b) After denoising, adjacent
noisy blocks may become target blocks.

The search process continues until a sufficient number #S of matching blocks, say at least five,
has been collected in Σ(T). If the number of matching blocks in Σ(T) is not sufficient, the
tolerance level d is increased by Δd and the search process starts again, until a sufficient
number of matching blocks has been collected. Denoising of T is done by replacing each noisy
pixel in T with the average of all the clean pixels at the corresponding location of all the
matching blocks in Σ(T).
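
The search and replacement steps can be sketched in Python as follows. The defining equations are not reproduced in this text, so the sketch relies on assumptions inferred from the prose: the similarity counts the positions at which both pixels are clean and differ by at most d, and a block is accepted when this count is at least τ. Blocks are represented as lists of nine values with `None` marking a noisy pixel, a dictionary-based index stands in for the arrays Ai, and `d_max` is a hypothetical safeguard that stops the tolerance from growing without bound. All function and parameter names are illustrative.

```python
NOISE = None  # marker for a noisy pixel inside a block (assumption of this sketch)


def similarity(T, M, d):
    """Count positions where both pixels are clean and differ by at most d."""
    return sum(1 for t, p in zip(T, M)
               if t is not NOISE and p is not NOISE and abs(t - p) <= d)


def build_index(blocks):
    """index[i][v] -> ids of blocks whose i-th pixel is clean with value v."""
    index = [dict() for _ in range(9)]
    for block_id, block in enumerate(blocks):
        for i, v in enumerate(block):
            if v is not NOISE:
                index[i].setdefault(v, []).append(block_id)
    return index


def find_matches(T, blocks, index, d=1, tau=3, min_matches=5,
                 delta_d=1, d_max=32):
    """Collect matching blocks for T, widening the tolerance when too few are found."""
    while True:
        candidates = set()
        for i, t in enumerate(T):
            if t is NOISE:
                continue  # noise is never used as a lookup key
            for v in range(max(0, t - d), min(255, t + d) + 1):
                candidates.update(index[i].get(v, ()))
        matches = [b for b in sorted(candidates)
                   if similarity(T, blocks[b], d) >= tau]
        if len(matches) >= min_matches or d >= d_max:
            return matches, d
        d += delta_d


def denoise_block(T, blocks, matches):
    """Replace each noisy pixel of T by the mean of the clean pixels at that position."""
    out = list(T)
    for i, t in enumerate(T):
        if t is NOISE:
            clean = [blocks[b][i] for b in matches if blocks[b][i] is not NOISE]
            if clean:  # leave the pixel noisy if no clean value is available
                out[i] = round(sum(clean) / len(clean))
    return out
```

Only the blocks reached through the lookup lists are ever compared with T, which is where the saving over an exhaustive search comes from. In a full implementation the target block itself would also have to be excluded from its own candidate set.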

Fig. 3 shows an example of searching matching blocks when the tolerance level d is 1
and τ in (5) is 3. On the left-hand side, target block T = [10, 15, 9, •, •, •, 8, •, 5] is given,
where “•” denotes noise. As x1 of T is 10 and d is 1, all the blocks pointed to by A1[10 − d..10
+ d] = A1[9..11] are candidates for matching to T.

Among them, Fig. 3 shows that only M1 and M2 match T, because simil(T,
M1) = 4 and simil(T, M2) = 4, and hence M1 and M2 satisfy the condition of (5).

Denoising of Noisy Blocks. We divide a noisy block B into four surrounding blocks Ti, i = 1, 2, . . . , 4,
as shown in Fig. 4(a). As denoising of all the target blocks is complete before denoising of
noisy blocks starts, it is very probable that the surrounding blocks of a noisy block are target
blocks.

When two or more noisy blocks happen to be adjacent, like B1 and B2 in Fig. 4(b),
denoising of T1 turns T2 into a target block through a cascading effect. After the noisy
blocks have been converted into target blocks, denoising proceeds for the target blocks as described above. The
overall denoising algorithm is summarized in Fig. 5. In the algorithm, step 10 takes care of
the situation in which some target blocks cannot be matched to any matching blocks within the
given tolerance level. If this happens, then instead of increasing the tolerance level as in
step 4, the target and noisy blocks are divided into 2 × 2 blocks that contain only one noisy
pixel, and block matching is then performed repeatedly for these 2 × 2 blocks.

Fig. 5. Algorithm of the proposed BMLUT denoising method.

Fig. 6. Test images (Boat, Lena, Barbara, and Bridge from the left) and corresponding noisy
images corrupted by 30% impulse noise.
CHAPTER 4

SOFTWARE AND HARDWARE REQUIREMENTS

4.1 HARDWARE REQUIREMENTS:

• System : Pentium Dual Core.

• Hard Disk : 120 GB.

• Monitor : 15’’ LED

• Input Devices : Keyboard, Mouse

• RAM : 4 GB

4.2 SOFTWARE REQUIREMENTS:

• Operating system : Windows 10

• Coding Language : MATLAB

• Tool : MATLAB
CHAPTER 5

RESULT
CHAPTER 6

CONCLUSION

We proposed an efficient block-matching method for denoising an input
image corrupted by random-valued impulse noise. In the preprocessing stage, the noisy pixels
are identified using the local homogeneity property. With this information on the locations of
noisy pixels, the proposed method divides blocks into matching and noisy blocks, based on
the number of noisy pixels in the block. Among matching blocks, nonoverlapping blocks
containing at least one noisy pixel are called target blocks and are subject to denoising. The
denoising operation is carried out using the array of pointers, which makes it possible to determine what
pixel values occur at which locations in the image, and hence reduces the search time needed to find
the matching blocks. Our method has shown a performance gain in terms of both
computing time and quality of the restored images. One future research direction to
improve the proposed method is to quantize the second dimension of the A arrays in Fig. 2 by
Q, so that the index range is set to [0 . . . 255/Q] instead of [0 . . . 255]. This modification is
expected to improve the search time. Another research direction is to investigate the extent of
the effects that a high noise rate has on identifying noisy pixels and subsequently on the
denoising performance.
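
The effect of the suggested quantization on a lookup can be illustrated with a short Python sketch. It only counts how many index bins have to be visited for a pixel value t and tolerance d; the function name `bins_to_visit` is hypothetical, and integer division by Q is assumed as the quantization rule.

```python
def bins_to_visit(t, d, Q=1):
    """Return the quantized index bins covering the values t - d .. t + d.

    Q = 1 reproduces the unquantized lookup over [0 .. 255];
    a larger Q shrinks the index range to [0 .. 255 // Q].
    """
    lo = max(0, t - d) // Q
    hi = min(255, t + d) // Q
    return list(range(lo, hi + 1))
```

For t = 10 and d = 4, nine lists are visited when Q = 1 but only two when Q = 8. A quantized bin may also hold values outside the tolerance, so the final similarity check on each candidate block would still be needed.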
REFERENCES

[1] N. Yao and K. K. Ma, “Adaptive rood pattern search for fast block-matching motion
estimation,” IEEE Trans. Image Process., vol. 11, no. 12, pp. 1442–1448, Dec. 2002.

[2] C. Je and H.-M. Park, “Optimized hierarchical block matching for fast and accurate image
registration,” Signal Process.: Image Commun., vol. 28, no. 7, pp. 779–791, 2013.

[3] K. Dabov, A. Foi, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain
collaborative filtering,” IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–2095, Aug.
2007.

[4] B. Ahn and N. I. Cho, “Block-matching convolutional neural network for image
denoising,” CoRR abs/1704.00524, 2017.

[5] C. T. Lu, Y. Y. Chen, L. L. Wang, and C. F. Chang, “Removal of salt-and-pepper noise in
corrupted image using three-values-weighted approach with variable-size window,” Pattern
Recognit. Lett., vol. 80, pp. 188–199, 2016.

[6] A. Roy, J. Singha, S. S. Devi, and R. H. Laskar, “Impulse noise removal using SVM
classification based fuzzy filter from gray scale images,” Signal Process., vol. 28, pp. 262–
273, 2016.

[7] A. Roy and R. H. Laskar, “Non-casual linear prediction based adaptive filter for removal
of high density impulse noise from color images,” AEU Int. J. Electron. Commun. vol. 72,
pp. 114–124, 2017.

[8] G. Pok, J. C. Liu, and A. S. Nair, “Selective removal of impulse noise based on
homogeneity level information,” IEEE Trans. Image Process., vol. 12, no. 1, pp. 85–92, Jan.
2003.

[9] Y. Dong and S. Xu, “A New Directional weighted median filter for removal of random-
valued impulse noise,” IEEE Signal Process. Lett.,vol. 14, no. 3, pp. 193–196, Mar. 2007.
Appendix A

MATLAB

A.1 Introduction

MATLAB is a high-performance language for technical computing. It integrates
computation, visualization, and programming in an easy-to-use environment where problems
and solutions are expressed in familiar mathematical notation. MATLAB stands for matrix
laboratory, and was written originally to provide easy access to matrix software developed by
the LINPACK (linear system package) and EISPACK (eigensystem package) projects.
MATLAB is therefore built on a foundation of sophisticated matrix software in which the
basic element is an array that does not require pre-dimensioning, which makes it possible to solve many technical
computing problems, especially those with matrix and vector formulations, in a fraction of
the time.
MATLAB features a family of application-specific solutions called toolboxes. Very
important to most users of MATLAB, toolboxes allow learning and applying specialized
technology. These are comprehensive collections of MATLAB functions (M-files) that extend
the MATLAB environment to solve particular classes of problems. Areas in which toolboxes
are available include signal processing, control systems, neural networks, fuzzy logic,
wavelets, simulation and many others.
Typical uses of MATLAB include math and computation; algorithm development;
data acquisition; modeling, simulation, and prototyping; data analysis, exploration, and visualization;
scientific and engineering graphics; and application development, including graphical user
interface building.

A.2 Basic Building Blocks of MATLAB

The basic building block of MATLAB is the matrix. The fundamental data type is the
array. Vectors, scalars, real matrices and complex matrices are handled as specific classes of this
basic data type. The built-in functions are optimized for vector operations. No dimension
statements are required for vectors or arrays.
A.2.1 MATLAB Window

MATLAB works with seven windows: Command window, Workspace
window, Current directory window, Command history window, Editor window, Graphics
window and Online-help window.

A.2.1.1 Command Window

The command window is where the user types MATLAB commands and expressions
at the prompt (>>) and where the output of those commands is displayed. It is opened when
the application program is launched. All commands including user-written programs are
typed in this window at MATLAB prompt for execution.

Work Space Window

MATLAB defines the workspace as the set of variables that the user creates in a work
session. The workspace browser shows these variables and some information about them.
Double clicking on a variable in the workspace browser launches the Array Editor, which can
be used to obtain information.

Current Directory Window

The Current Directory tab shows the contents of the current directory, whose path is
shown in the current directory window. For example, in the Windows operating system the
path might be as follows: C:\MATLAB\Work, indicating that directory “work” is a
subdirectory of the main directory “MATLAB”, which is installed in drive C. Clicking on the
arrow in the current directory window shows a list of recently used paths. MATLAB uses a
search path to find M-files and other MATLAB related files. Any file run in MATLAB must
reside in the current directory or in a directory that is on the search path.

Command History Window

The Command History Window contains a record of the commands a user has entered
in the command window, including both current and previous MATLAB sessions. Previously
entered MATLAB commands can be selected and re-executed from the command history
window by right clicking on a command or sequence of commands. This action launches a menu
from which to select various options in addition to executing the commands, which is a useful feature when
experimenting with various commands in a work session.

Editor Window

The MATLAB editor is both a text editor specialized for creating M-files and a
graphical MATLAB debugger. The editor can appear in a window by itself, or it can be a sub
window in the desktop. In this window one can write, edit, create and save programs in files
called M-files.
MATLAB editor window has numerous pull-down menus for tasks such as saving,
viewing, and debugging files. Because it performs some simple checks and also uses color to
differentiate between various elements of code, this text editor is recommended as the tool of
choice for writing and editing M-functions.

Graphics or Figure Window

The output of all graphic commands typed in the command window is seen in this
window.

Online Help Window

MATLAB provides online help for all its built-in functions and programming
language constructs. The principal way to get help online is to use the MATLAB help
browser, opened as a separate window either by clicking on the question mark symbol (?) on
the desktop toolbar, or by typing helpbrowser at the prompt in the command window. The
Help Browser is a web browser integrated into the MATLAB desktop that displays
Hypertext Markup Language (HTML) documents. The Help Browser consists of two panes:
the help navigator pane, used to find information, and the display pane, used to view the
information. Self-explanatory tabs in the navigator pane are used to perform a search.

MATLAB Files

MATLAB has two types of files for storing information: M-files and MAT-files.

M-Files

These are standard ASCII text files with a .m extension that contain MATLAB code.
The MATLAB editor
or another text editor is used to create a file containing the same statements that would be typed
at the MATLAB command line, and the file is saved under a name that ends in .m. There are two
types of M-files.

1. Script Files

A script file is an M-file that contains a set of MATLAB commands and is executed by typing the name of the file
on the command line. These files work on global variables currently present in that
environment.

2. Function Files

A function file is also an M-file, except that the variables in a function file are all
local. This type of file begins with a function definition line.

MAT-Files

These are binary data files with a .mat extension that are created by
MATLAB when data is saved. The data is written in a special format that only MATLAB
can read. These files are loaded into MATLAB with the ‘load’ command.
SOME BASIC COMMANDS:

pwd prints the working directory

demo demonstrates what is possible in MATLAB

who lists all of the variables in your MATLAB workspace

whos lists the variables and describes their matrix size

clear erases variables and functions from memory

clear x erases the matrix 'x' from your workspace

close by itself, closes the current figure window

figure creates an empty figure window

ALGEBRAIC OPERATIONS IN MATLAB:

Scalar Calculations:

+ Addition

- Subtraction

* Multiplication
/ Right division (a/b means a ÷ b)

\ left division (a\b means b ÷ a)

^ Exponentiation

For example, 3*4 executed in MATLAB gives ans = 12

4/5 gives ans = 0.8

Array products: Recall that addition and subtraction of matrices involve
addition or subtraction of the individual elements of the matrices. Sometimes it is desired to
simply multiply or divide each element of a matrix by the corresponding element of another
matrix. These are called 'array operations'.

Array or element-by-element operations are executed when the operator is preceded by a '.'
(Period):

a .* b multiplies each element of a by the respective element of b

a ./ b divides each element of a by the respective element of b

a .\ b divides each element of b by the respective element of a

a .^ b raises each element of a to the power of the respective element of b

MATLAB WORKING ENVIRONMENT:


MATLAB DESKTOP

The MATLAB desktop is the main MATLAB application window. The desktop contains five
subwindows: the command window, the workspace browser, the current directory window, the
command history window, and one or more figure windows, which are shown only when the
user displays a graphic.

The command window is where the user types MATLAB commands and expressions at
the prompt (>>) and where the output of those commands is displayed. MATLAB defines the
workspace as the set of variables that the user creates in a work session.

The workspace browser shows these variables and some information about them.
Double clicking on a variable in the workspace browser launches the Array Editor, which can
be used to obtain information and in some instances edit certain properties of the variable.
The Current Directory tab above the workspace tab shows the contents of the current
directory, whose path is shown in the current directory window. For example, in the Windows
operating system the path might be as follows: C:\MATLAB\Work, indicating that directory
“work” is a subdirectory of the main directory “MATLAB”, which is installed in
drive C. Clicking on the arrow in the current directory window shows a list of recently used
paths. Clicking on the button to the right of the window allows the user to change the current
directory.

MATLAB uses a search path to find M-files and other MATLAB related files, which
are organized in directories in the computer file system. Any file run in MATLAB must reside
in the current directory or in a directory that is on the search path. By default, the files supplied
with MATLAB and MathWorks toolboxes are included in the search path. The easiest way to
see which directories are on the search path, or to add or modify a search path, is to select
Set Path from the File menu on the desktop, and then use the Set Path dialog box. It is good
practice to add any commonly used directories to the search path to avoid repeatedly having
to change the current directory.

The Command History Window contains a record of the commands a user has entered
in the command window, including both current and previous MATLAB sessions. Previously
entered MATLAB commands can be selected and re-executed from the command history
window by right clicking on a command or sequence of commands.

This action launches a menu from which to select various options in addition to
executing the commands. This is a useful feature when experimenting with various commands in a work
session.

Getting Help:

The principal way to get help online is to use the MATLAB help browser, opened as a
separate window either by clicking on the question mark symbol (?) on the desktop toolbar,
or by typing helpbrowser at the prompt in the command window. The Help Browser is a web
browser integrated into the MATLAB desktop that displays Hypertext Markup
Language (HTML) documents. The Help Browser consists of two panes: the help navigator
pane, used to find information, and the display pane, used to view the information.
APPENDIX B

DIGITAL IMAGE PROCESSING

Digital image processing is the use of algorithms to perform image processing of digital
images. As a subcategory or field of digital signal processing, digital image processing has
many advantages over analog image processing. It allows a much wider range of algorithms to
be applied to the input data and can avoid problems such as the build-up of noise and signal
distortion during processing. Since images are defined over two dimensions (perhaps more),
digital image processing may be modelled in the form of multidimensional systems.

Image processing in its broadest sense is an umbrella term for representing and
analyzing of data in visual form. More narrowly, image processing is the manipulation of
numeric data contained in a digital image for the purpose of enhancing its visual appearance.
Through image processing, faded pictures can be enhanced, medical images clarified, and
satellite photographs calibrated. Image processing software can also translate numeric
information into visual images that can be edited, enhanced, filtered, or animated in order to
reveal relationships previously not apparent. Image analysis, in contrast, involves collecting
data from digital images in the form of measurements that can then be analyzed and
transformed.

Originally developed for space exploration and biomedicine, digital image processing
and analysis are now used in a wide range of industrial, artistic, and educational applications.
Software for image processing and analysis is widely available on all major computer
platforms. This software supports the modern adage that "a picture is worth a thousand words,
but an image is worth a thousand pictures."

Fig 1.1: Standard processing of space borne images


Each of the pixels that represent an image stored inside a computer has a pixel value which
describes how bright that pixel is, and/or what color it should be. In the simplest case of
binary images, the pixel value is a 1-bit number indicating either foreground or background.
For gray scale images, the pixel value is a single number that represents the brightness of
the pixel. The most common pixel format is the byte image, where this number is stored as
an 8-bit integer giving a range of possible values from 0 to 255. Typically, zero is taken to be
black and 255 is taken to be white. Values in between make up the different shades of
gray.
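The pixel formats described above can be illustrated with a short sketch. It assumes an image is stored as a plain list of rows; the sample values, the function names and the threshold of 128 are illustrative choices, not part of the original text.

```python
# A tiny 8-bit gray scale image stored as a list of rows (values 0..255).
gray = [
    [0, 64, 128],
    [192, 255, 32],
]


def levels(bits):
    """Number of distinct pixel values representable with `bits` bits."""
    return 2 ** bits


def to_binary(image, threshold=128):
    """Map each pixel to 1 (foreground) if it is >= threshold, else 0 (background).

    The threshold is an assumed, illustrative value.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in image]


binary = to_binary(gray)
```

Here levels(1) gives the two values of a binary image and levels(8) gives the 256 values of a byte image.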

1.1 History

Many of the techniques of digital image processing, or digital picture processing as it often
was called, were developed in the 1960s at the Jet Propulsion Laboratory, Massachusetts
Institute of Technology, Bell Laboratories, University of Maryland, and a few other research
facilities, with application to satellite imagery, wire-photo standards conversion, medical
imaging, videophone, character recognition, and photograph enhancement. The cost of
processing was fairly high, however, with the computing equipment of that era. That changed
in the 1970s, when digital image processing proliferated as cheaper computers and dedicated
hardware became available. Images then could be processed in real time, for some dedicated
problems such as television standards conversion. As general-purpose computers became
faster, they started to take over the role of dedicated hardware for all but the most specialized
and computer-intensive operations.

1.2 Image Sampling and Quantization

To create a digital image, we need to convert the continuous sensed data into digital form. This
involves two processes:

1) Sampling

2) Quantization.

The basic idea behind sampling and quantization is illustrated in Fig. 1.4 and Fig. 1.5. An
image may be continuous with respect to the x- and y-coordinates, and also in amplitude. To
convert it to digital form, we have to sample the function in both coordinates and in amplitude.
Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called
quantization.

Fig 1.2 Continuous image.

Fig 1.3 A plot of amplitude values along line AB of the continuous image.

Fig 1.4 Sampling of the above continuous image.

Fig 1.5 Quantization of the sampled image.


The one-dimensional function shown in Fig. 1.3 is a plot of amplitude (gray level)
values of the continuous image along the line segment AB in Fig. 1.2. The random variations
are due to image noise. To sample this function, we take equally spaced samples along line
AB, as shown in Fig. 1.4. The location of each sample is given by a vertical tick mark in the
bottom part of the figure. The samples are shown as small white squares superimposed on the
function.
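The two steps can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical continuous amplitude profile (a sine curve scaled to the range 0 to 1) in place of the real scan line AB; the function names and the choice of 8 samples and 4 levels are assumptions for the example.

```python
import math


def continuous_signal(t):
    """Hypothetical continuous amplitude profile along a line, with values in [0, 1]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * t)


def sample(f, num_samples):
    """Sampling: evaluate f at equally spaced positions in [0, 1)."""
    return [f(i / num_samples) for i in range(num_samples)]


def quantize(samples, num_levels):
    """Quantization: map each amplitude in [0, 1] to the nearest integer level."""
    top = num_levels - 1
    return [max(0, min(top, int(s * top + 0.5))) for s in samples]


samples = sample(continuous_signal, 8)   # digitize the coordinate
digital = quantize(samples, 4)           # digitize the amplitude
```

Increasing num_samples refines the spatial resolution, while increasing num_levels refines the gray-level resolution.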

Fig 1.6 a) Continuous image projected onto a sensor array. b) Result of image sampling and
quantization.

1.3 Image Representation

The result of sampling and quantization is a matrix of real numbers. We will use two
principal ways to represent digital images. Assume that an image f(x, y) is sampled so that the
resulting digital image has M rows and N columns. The values of the coordinates (x, y) now
become discrete quantities. For notational clarity and convenience, we shall use integer values
for these discrete coordinates. Thus, the values of the coordinates at the origin are (x, y) = (0,
0). The next coordinate values along the first row of the image are represented as (x, y) = (0,
1). It is important to keep in mind that the notation (0, 1) is used to signify the second sample
along the first row. It does not mean that these are the actual values of physical coordinates
when the image was sampled.

Fig 1.7 Coordinate convention to represent digital images

The notation introduced in the preceding paragraph allows us to write the complete
M × N digital image in the following compact matrix form:

Fig. 1.8 Image in matrix form

The right side of this equation is by definition a digital image. Each element of this
matrix array is called an image element, picture element, pixel, or pel. The terms image and
pixel will be used throughout the rest of our discussions to denote a digital image and its
elements. In some discussions, it is advantageous to use a more traditional matrix notation to
denote a digital image and its elements:

Fig 1.9 Arrays
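The coordinate convention of this section can be sketched with a nested list, assuming the first index x selects the row and the second index y selects the column. The sizes and pixel values below are made up for illustration.

```python
M, N = 3, 4  # number of rows and columns (illustrative sizes)

# f[x][y]: x runs over rows 0..M-1, y runs over columns 0..N-1.
# The stored values are arbitrary placeholders.
f = [[10 * x + y for y in range(N)] for x in range(M)]

origin = f[0][0]                # sample at (x, y) = (0, 0)
second_in_first_row = f[0][1]   # sample at (x, y) = (0, 1)
last = f[M - 1][N - 1]          # sample at (x, y) = (M-1, N-1)
```

As the text notes, the index pair (0, 1) only identifies the second sample along the first row; it says nothing about physical coordinates.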

1.4 Image Operations

1.4.1 Color depth

Image editing encompasses the process of altering images, whether they are digital
photographs, traditional analog photographs, or illustrations. Traditional analog image editing
is known as photo retouching, using tools such as an airbrush to modify photographs, or
editing illustrations with any traditional art medium. Graphic software programs, which can be
broadly grouped into vector graphics editors, raster graphics editors, and 3D modelers, are the
primary tools with which a user may manipulate, enhance, and transform images. Many image
editing programs are also used to render or create computer art from scratch.

1.4.2 Contrast of Image

To apply a brightness filter, you simply add a fixed amount to every pixel in the image and
then clamp the result to ensure it remains in the range 0 - 255. To apply a contrast filter, you
determine whether a pixel is lighter or darker than a threshold amount. If it is lighter, you scale
the pixel's intensity up; otherwise, you scale it down. In code, this is done by subtracting the
threshold from the pixel value, multiplying by the contrast factor, and adding the threshold
value back again. As with the brightness filter, the resulting value needs to be clamped to
ensure it remains in the range 0 - 255.
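The brightness and contrast filters described above can be sketched as follows. The function names and the default threshold of 128 are assumptions made for the example; the image is taken to be a list of rows of 8-bit gray values.

```python
def clamp(value, low=0, high=255):
    """Keep a pixel value inside the valid 8-bit range."""
    return max(low, min(high, value))


def adjust_brightness(image, amount):
    """Add a fixed amount to every pixel, then clamp."""
    return [[clamp(p + amount) for p in row] for row in image]


def adjust_contrast(image, factor, threshold=128):
    """Subtract the threshold, scale by the contrast factor, add the threshold back, clamp."""
    return [
        [clamp(int(round((p - threshold) * factor + threshold))) for p in row]
        for row in image
    ]


brighter = adjust_brightness([[250, 10]], 20)
stretched = adjust_contrast([[100, 128, 200]], 2.0)
```

With a factor above 1, pixels darker than the threshold move toward 0 and lighter pixels move toward 255, which is what raises the contrast.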

Fig 1.10 low contrast to high contrast

1.4.3 RGB to Gray scale

In photography and computing, a gray scale or grey scale digital image is an image in
which the value of each pixel is a single sample, that is, it carries only intensity information.
Images of this sort, also known as black-and-white, are composed exclusively of shades of
gray, varying from black at the weakest intensity to white at the strongest. Gray scale images
are distinct from one-bit bi-tonal black-and-white images, which in the context of computer
imaging are images with only the two colors, black, and white (also called bi-level or binary
images). Gray scale images have many shades of gray in between. Gray scale images are also
called monochromatic, denoting the absence of any chromatic variation (i.e., one color).

Gray scale images are often the result of measuring the intensity of light at each pixel in
a single band of the electromagnetic spectrum (e.g. infrared, visible light, ultraviolet, etc.), and
in such cases they are monochromatic proper when only a given frequency is captured. They
can also be synthesized from a full color image.
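One way to synthesize a gray scale image from a full color image is a weighted sum of the red, green and blue samples. The text does not specify the weights; the sketch below assumes the widely used ITU-R BT.601 luma weights (0.299, 0.587, 0.114), and the function names are illustrative.

```python
def rgb_to_gray(pixel):
    """Convert one (r, g, b) pixel to a single intensity value.

    The weights are the ITU-R BT.601 luma coefficients, assumed here
    because the source text does not name a specific formula.
    """
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def image_to_gray(image):
    """Apply rgb_to_gray to every pixel of an image stored as rows of (r, g, b) tuples."""
    return [[rgb_to_gray(p) for p in row] for row in image]


gray_row = image_to_gray([[(255, 255, 255), (0, 0, 0), (255, 0, 0)]])
```

Green receives the largest weight because the eye is most sensitive to it, so a pure green pixel maps to a brighter gray than a pure red or blue one.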

1.4.4 Inverted Image

An inverted image can be interpreted as a digital version of a photographic negative. After
inversion, every color is replaced by its opposite. More precisely, a positive image is
a normal, original RGB or gray image. A negative
image denotes a tonal inversion of a positive image, in which light areas appear dark and dark
areas appear light. In negative images, the colors are also reversed, such that red
areas appear cyan, greens appear magenta, and blues appear yellow. For the
gray scale case, using 0 for black and 255 for white, a near-black
pixel value of 5 is converted to 250, or near-white.

Image inversion is one of the easiest techniques in image processing. It is therefore
well suited to demonstrations of performance, acceleration, and optimization. Many
state-of-the-art image processing libraries, such as OpenCV, Gandalf, and VXL, perform this
operation as fast as possible, although further acceleration using parallel hardware is
possible.
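Inversion amounts to subtracting each sample from the maximum value. A minimal sketch, assuming 8-bit samples and illustrative function names:

```python
def invert_gray(image, max_value=255):
    """Invert a gray scale image stored as a list of rows."""
    return [[max_value - p for p in row] for row in image]


def invert_rgb(pixel, max_value=255):
    """Invert one (r, g, b) pixel channel by channel."""
    return tuple(max_value - c for c in pixel)


negative = invert_gray([[5, 0, 255]])
cyan = invert_rgb((255, 0, 0))  # pure red becomes cyan
```

Applying the operation twice returns the original image.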

1.4.5 Blurring

Blurring an image usually makes the image appear unfocused. In signal processing, blurring is
generally obtained by convolving the image with a low pass filter. The
amount of blurring increases as the pixel radius of the filter increases.

Fig 1.11 Blurring of image
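A simple low pass filter is the box (mean) filter, sketched below. The text does not say which filter is used, so the box filter, the border handling (the window is clipped at the image edges) and the function name are assumptions for illustration.

```python
def box_blur(image, radius=1):
    """Replace each pixel by the mean of its (2*radius+1) x (2*radius+1) neighborhood.

    The neighborhood is clipped at the image borders, so edge pixels
    are averaged over fewer samples.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for x in range(rows):
        out_row = []
        for y in range(cols):
            total = 0
            count = 0
            for i in range(max(0, x - radius), min(rows, x + radius + 1)):
                for j in range(max(0, y - radius), min(cols, y + radius + 1)):
                    total += image[i][j]
                    count += 1
            out_row.append(total // count)  # integer mean
        out.append(out_row)
    return out


spot = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
blurred = box_blur(spot, radius=1)
```

A larger radius averages over more neighbors and therefore spreads the bright spot further.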


1.4.6 Sharpening an Image

Sharpening is one of the most impressive transformations you can apply to an image
since it seems to bring out image detail that was not there before. What it actually does,
however, is emphasize edges in the image and make them easier for the eye to pick out.
While the visual effect is to make the image seem sharper, no new details are actually created.

Fig 1.12 Sharpened images
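Edge emphasis can be sketched with a small 3 x 3 kernel. The text does not name a specific sharpening method; the Laplacian-based kernel below is a common choice and is assumed here, as is the decision to leave border pixels unchanged.

```python
# Assumed sharpening kernel: identity plus a 4-neighbor Laplacian.
KERNEL = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]


def sharpen(image):
    """Apply KERNEL to every interior pixel; border pixels are copied unchanged."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += KERNEL[i][j] * image[x + i - 1][y + j - 1]
            out[x][y] = max(0, min(255, acc))  # clamp to the 8-bit range
    return out


edge = [[100, 100, 120], [100, 100, 120], [100, 100, 120]]
sharpened = sharpen(edge)
```

In a flat region the kernel leaves the pixel unchanged, while next to an edge it pushes the pixel away from its neighbor's value, which increases the local contrast without adding detail.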

Histogram:

Histograms are the basis for numerous spatial domain processing techniques. Histogram
manipulation can be used effectively for image enhancement. The histogram of a digital image
with gray levels in the range [0, L-1] is a discrete function h(r_k) = n_k, where r_k is the kth gray
level and n_k is the number of pixels in the image having gray level r_k.

It is common practice to normalize a histogram by dividing each of its values by the
total number of pixels in the image, denoted by n. Thus, a normalized histogram is given by
p(r_k) = n_k/n, for k = 0, 1, ..., L-1. Loosely speaking, p(r_k) gives an estimate of the probability of
occurrence of gray level r_k. Note that the sum of all components of a normalized histogram is
equal to 1.

Fig 1.13 Image and its histogram
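The histogram and its normalized form can be computed directly from the definitions. A minimal sketch, assuming an image stored as a list of rows of integer gray levels; the function names and the 4-level test image are illustrative.

```python
def histogram(image, num_levels=256):
    """h[k] = number of pixels whose gray level equals k."""
    h = [0] * num_levels
    for row in image:
        for p in row:
            h[p] += 1
    return h


def normalized_histogram(image, num_levels=256):
    """p[k] = h[k] / n, where n is the total number of pixels."""
    h = histogram(image, num_levels)
    n = sum(h)
    return [count / n for count in h]


image = [[0, 1, 1], [2, 1, 0]]          # a tiny image with L = 4 gray levels
h = histogram(image, num_levels=4)
p = normalized_histogram(image, num_levels=4)
```

The entries of p sum to 1, so they can be read as estimated probabilities of each gray level.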

Equalization:

Consider for a moment continuous functions, and let the variable r represent the gray
levels of the image to be enhanced. In the initial part of our discussion, we assume that r has
been normalized to the interval [0, 1], with r = 0 representing black and r = 1 representing white.
Later, we consider a discrete formulation and allow pixel values to be in the interval [0, L-1].

For any r satisfying the aforementioned conditions, we focus attention on transformations of
the form

s = T(r), 0 ≤ r ≤ 1

that produce a level s for every pixel value r in the original image. For reasons that will
become obvious shortly, we assume that the transformation function T(r) satisfies the
following conditions:

(a) T(r) is single-valued and monotonically increasing in the interval 0 ≤ r ≤ 1; and

(b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.

The requirement in (a) that T(r) be single valued is needed to guarantee that the inverse
transformation will exist, and the monotonicity condition preserves the increasing order from
black to white in the output image. A transformation function that is not monotonically
increasing could result in at least a section of the intensity range being inverted, thus
producing some inverted gray levels in the output image. While this may be a desirable effect
in some cases, that is not what we are after in the present discussion. Finally, condition (b)
guarantees that the output gray levels will be in the same range as the input levels.
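A discrete transformation with these properties can be sketched as follows. The discrete formula is not given in the text above; the sketch assumes the commonly used form in which level k is mapped to (L-1) times the cumulative normalized histogram up to k. The function name and the 4-level test image are illustrative.

```python
def equalize(image, num_levels=256):
    """Histogram equalization of an image stored as rows of integer gray levels.

    Assumed mapping: s_k = (L - 1) * sum_{j <= k} n_j / n, rounded to an integer.
    Returns the equalized image and the level mapping.
    """
    hist = [0] * num_levels
    for row in image:
        for p in row:
            hist[p] += 1
    n = sum(hist)

    mapping = []
    cumulative = 0
    for count in hist:
        cumulative += count
        # Round half up; the result stays within [0, L-1].
        mapping.append(int((num_levels - 1) * cumulative / n + 0.5))

    return [[mapping[p] for p in row] for row in image], mapping


image = [[0, 0, 0, 1], [1, 2, 3, 3]]     # L = 4 gray levels
equalized, mapping = equalize(image, num_levels=4)
```

Because the cumulative sum never decreases, the mapping is monotonically non-decreasing, and because it ends at n, the output stays within the range [0, L-1]. In the discrete case several input levels can map to the same output level, so the mapping need not be strictly increasing.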

Image Formats:

Raster formats:

a) Joint Photographic Experts Group (JPEG):

JPEG stands for "Joint Photographic Experts Group" and, as its name suggests, was specifically
developed for storing photographic images. It has also become a standard format for storing
images in digital cameras and displaying photographic images on internet web pages. JPEG
files are significantly smaller than those saved as TIFF; however, this comes at a cost, since
JPEG employs lossy compression. A great thing about JPEG files is their flexibility. The JPEG
file format is really a toolkit of options whose settings can be altered to fit the needs of each
image.

b) Tagged Image File Format (TIFF):

TIFF stands for "Tagged Image File Format" and is a standard in the printing and publishing
industry. TIFF files are significantly larger than their JPEG counterparts, and can be either
uncompressed or compressed using lossless compression. Unlike JPEG, TIFF files can have a
bit depth of either 16 bits per channel or 8 bits per channel, and multiple layered images can
be stored in a single TIFF file.

TIFF files are an excellent option for archiving intermediate files which you may edit later,
since they introduce no compression artefacts. Many cameras have an option to create images as
TIFF files, but these can consume excessive space compared to the same JPEG file. If your
camera supports the RAW file format, this is a superior alternative, since RAW files are significantly
smaller and can retain even more information about your image.

c) Portable Network Graphics (PNG):

PNG uses DEFLATE (ZIP-style) compression, which is lossless and slightly more effective than
the LZW compression used by GIF (slightly smaller files). PNG is a newer format, designed to be
both versatile and royalty free, back when the LZW patent was disputed. The PNG file format was
created as the free, open successor to GIF. The PNG file format supports true color
(16 million colors) while GIF supports only 256 colors. PNG excels when the
image has large, uniformly colored areas. The lossless PNG format is best suited for editing
pictures, and lossy formats, like JPEG, are best for the final distribution of photographic
images.

d) Graphics Interchange Format (GIF):

GIF is limited to an 8-bit palette, or 256 colors. This makes the GIF format suitable for storing
graphics with relatively few colors such as simple diagrams, shapes, logos and cartoon style
images. The GIF format supports animation and is still widely used to provide image
animation effects. It also uses lossless (LZW) compression, which is more effective when large
areas have a single color and less effective for detailed or dithered images.

e) Bitmap Format (BMP):

The BMP file format (Windows bitmap) handles graphics files within the Microsoft Windows
OS. Typically, BMP files are uncompressed, hence they are large; the advantage is their
simplicity and wide acceptance in Windows programs.

f) RAW:

RAW refers to a family of raw image formats that are options available on some digital
cameras. These formats usually use a lossless or nearly-lossless compression, and produce file
sizes much smaller than the TIFF formats of full-size processed images from the same
cameras. Although there is a standard raw image format (ISO 12234-2, TIFF/EP), the raw
formats used by most cameras are not standardized or documented, and differ among camera
manufacturers. Many graphic programs and image editors may not accept some or all of them,
and some older ones have been effectively orphaned already. Adobe's Digital Negative (DNG)
specification is an attempt at standardizing a raw image format to be used by cameras, or for
archival storage of image data converted from undocumented raw image formats, and is used
by several niche and minority camera manufacturers including Pentax, Leica, and Samsung.
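
The color counts and size remarks in the raster format descriptions above follow from simple arithmetic, sketched below. The function names and the 1920 x 1080 example are illustrative; the size estimate ignores file headers and any row padding a real format may add.

```python
def palette_size(bits_per_pixel):
    """Number of distinct colors representable with the given number of bits."""
    return 2 ** bits_per_pixel


def uncompressed_bytes(width, height, channels=3, bits_per_channel=8):
    """Approximate size of raw pixel data, ignoring headers and row padding."""
    return width * height * channels * bits_per_channel // 8


gif_colors = palette_size(8)                 # 8-bit palette
true_colors = palette_size(24)               # 8 bits for each of 3 channels
rgb8_size = uncompressed_bytes(1920, 1080)   # 8 bits per channel
rgb16_size = uncompressed_bytes(1920, 1080, bits_per_channel=16)
```

An 8-bit palette gives the 256 colors quoted for GIF, 24 bits give the roughly 16 million colors of true color, and doubling the bit depth per channel doubles the uncompressed size.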

Vector Formats:

a) Computer Graphics Metafile (CGM):

CGM is a file format for 2D vector graphics, raster graphics, and text, and is defined by
ISO/IEC 8632. All graphical elements can be specified in a textual source file that can be
compiled into a binary file or one of two text representations. CGM provides a means of
graphics data interchange for computer representation of 2D graphical information
independent from any particular application, system, platform, or device. It has been adopted
to some extent in the areas of technical illustration and professional design, but has largely
been superseded by formats like SVG.

b) Scalable Vector Graphics (SVG):

SVG is an open standard created and developed by the World Wide Web Consortium to
address the need (and attempts of several corporations) for a versatile, scriptable and all-
purpose vector format for the web and otherwise. The SVG format does not have a
compression scheme of its own, but due to the textual nature of XML, an SVG graphic can be
compressed using a program such as gzip. Because of its scripting potential, SVG is a key
component in web applications: interactive web pages that look and act like applications.
