
Article

Method for Noise Reduction by Averaging the Filtering Results on Circular Displacements Using Wavelet Transform and Local Binary Pattern

Petrica Ciotirnae 1, Catalin Dumitrescu 2,*, Ionut Cosmin Chiva 2, Augustin Semenescu 3, Eduard Cristian Popovici 4 and Diana Dranga 4

1 Communications Department of Military Technical Academy “Ferdinand I”, 39-49 George Coșbuc Avenue, 050141 Bucharest, Romania; [email protected]
2 Department of Telematics and Electronics for Transports, University “Politehnica” of Bucharest, 060042 Bucharest, Romania; [email protected]
3 Department of Engineering and Management for Transports, University “Politehnica” of Bucharest, 060042 Bucharest, Romania; [email protected]
4 Department of Electronics and Telecommunication, Faculty of Electronics, Telecommunications and Information Technology, National University of Science and Technology POLITEHNICA from Bucharest, 060042 Bucharest, Romania; [email protected] (E.P.); [email protected] (D.D.)
* Correspondence: [email protected]

Abstract: Algorithms for noise reduction that use the translation-invariant wavelet transform indirectly are spatially selective filtering algorithms in the wavelet domain. These algorithms use the undecimated wavelet transform to accurately determine the coefficients corresponding to the contours in the images, these being processed differently from the other wavelet coefficients. The use of the undecimated wavelet transform in image noise reduction applications leads not only to an improvement in terms of Mean Square Error (MSE), but also in terms of the content quality of the processed images. In the case of noise reduction procedures by truncation of wavelet coefficients, artifacts appear, especially in the approximation of singularities, due to some pseudo-Gibbs phenomena. These artifacts, which appear locally, are troublesome in the case of object recognition applications from images acquired in conditions of nonuniform illumination and low contrast. In this work we propose a feature extraction method based on the undecimated wavelet transform (UWT) and local binary pattern (LBP). The results obtained on images acquired from drones in adverse conditions are promising in terms of accuracy. The authors show that the displacement-invariant wavelet transform is a very good method of compression and noise reduction in signals.

Keywords: UWT; LBP; CNN; object recognition

Citation: Ciotirnae, P.; Dumitrescu, C.; Chiva, I.C.; Semenescu, A.; Popovici, E.C.; Dranga, D. Method for Noise Reduction by Averaging the Filtering Results on Circular Displacements Using Wavelet Transform and Local Binary Pattern. Electronics 2024, 13, 4119. https://doi.org/10.3390/electronics13204119

Academic Editor: Krzysztof Okarma

Received: 10 September 2024; Revised: 15 October 2024; Accepted: 17 October 2024; Published: 18 October 2024

Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
The wavelet transform can be found in redundant or non-redundant form. Continuous wavelet transforms [1] and frame decompositions [2] are part of the class of redundant wavelet transforms, while orthogonal [3] or biorthogonal decompositions are part of the class of non-redundant wavelet transforms. Wavelet packets, which are a generalization of wavelet decomposition, can also be found in redundant or non-redundant [4] form. Non-redundant wavelet transforms are useful for several reasons. First, the compression ability of wavelet transforms is better appreciated if only those components relevant for signal reconstruction are retained. Furthermore, the efficient implementation of the analysis/synthesis sections by decimated filter banks also makes the use of non-redundant wavelet transforms attractive. From a statistical point of view, a property that gives the orthogonal wavelet transform significant utility in digital processing is the statistical decorrelation of the wavelet coefficients. Unlike the Fourier transform, the wavelet transform makes it possible to represent functions that have peaks and discontinuities.
Another advantage is the ability to deconstruct (analyze) and reconstruct (synthesize) non-periodic and dynamic signals. Wavelet algorithms are of two types: continuous wavelet transforms (CWT) and discrete wavelet transforms (DWT).
While continuous transformations can act on any translation or scaling, discrete transformations use a specific subset of values for these operations [5–7]. Table 1 summarizes the characterization of wavelet transforms [8].

Table 1. Characterization of wavelet transforms.

Wavelet Transform | Use Cases | Advantages | Limitations
Continuous Wavelet Transform | Nonstationary signals | Precise time–frequency localization | Computationally intensive
Discrete Wavelet Transform | Signal compression, denoising, imaging | Efficient, hierarchical representation | Loss of phase information
Wavelet Packet Transform | Detailed frequency analysis | More detailed signal exploration | Increased computation due to higher decomposition
Multiwavelet Transform | Signal approximation and feature extraction | Better symmetry | Increased complexity
Curvelet Transform | Medical imaging data analysis | Superior edge and curve representation | High computational demands
Ridgelet Transform | Seismic data analysis | Efficient representation of linear features | Limited application in nonlinear structures
Contourlet Transform | Image compression, denoising | Excellent edge and texture representation | Computationally intensive, increased complexity
Integer Wavelet Transform | Embedded systems, integer-based data | Faster computation, reduced memory | Limited to integer-based data, reduced flexibility

The main disadvantage of the non-redundant wavelet transform is the lack of displacement invariance. Thus, the wavelet coefficients of a signal shifted by several samples do not represent the version shifted by the same number of samples of the coefficients corresponding to the original signal. The displacement invariance property is important in a few statistical signal processing applications, such as the detection or estimation of some signal parameters, noise reduction, the detection of singularities in signals or motion estimation [9–11].
The translation-invariant wavelet transform has been discovered independently several times, for different purposes, such as by Ref. [12] under this name, or under other names such as the undecimated wavelet transform, the stationary wavelet transform, or the redundant wavelet transform. In fact, this transformation is redundant, invariant to displacement and constitutes a denser approximation of the continuous wavelet transform than the approximation obtained by the orthogonal discrete wavelet transform.
The use of the translation-invariant wavelet transform in noise reduction applications can be undertaken directly or indirectly. There are three types of filtering algorithms that use the translation-invariant wavelet transform directly, namely (1) algorithms that consist of averaging the results obtained by hard or soft truncation with a threshold (or even other ways of processing the wavelet coefficients) on the circular displacements of the input signal [13], (2) algorithms that are based on determining the optimal decomposition into translation-invariant wavelet packets [14] and (3) algorithms that proceed to signal reconstruction based on wavelet maxima [15]. The implementation of the noise reduction algorithm using the translation-invariant wavelet transform, according to the proposal in this article, in the MATLAB programming environment allowed researchers to study the effects of different parameters, such as the wavelet coefficients used, the number of displacements considered and how to choose them, or the effect that the threshold values used have on the quality of the restored images. In assessing the quality of the restored images, we considered both the amount of noise removed from the image, expressed by the peak signal-to-noise ratio (PSNR), and the preservation or even improvement of contours, expressed by the C coefficient.
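As an illustration, a minimal Python sketch of the PSNR metric used in this evaluation is given below; the contour coefficient C is not reproduced here, and the 8-bit peak value of 255 is an assumption of the sketch.

```python
import numpy as np

def psnr(reference: np.ndarray, restored: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio between a reference image and a restored image.

    Assumes 8-bit images (peak value 255); adjust `peak` for other dynamic ranges.
    """
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```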
The research was carried out over approximately two years as part of an interdisciplinary research project conducted within the universities.
From the point of view of the filter banks, the undecimated wavelet transform is obtained by keeping both the even and the odd samples resulting from the filtering and decimation operations, and further decomposing all the samples corresponding to the low-pass subbands. This can be explained as follows: following the convolution operation with the analysis filter, a set of coefficients is obtained from which, through the operation of decimation by two, only the coefficients with even indices are retained. If a shift by one position is performed, following the convolution operation, the same set of coefficients is obtained, but also shifted by one position. Through the decimation operation, the coefficients with odd indices from the set of coefficients corresponding to the original signal are retained this time. If a new shift is performed, followed by the convolution and decimation operations, the coefficients with even indices from the set of coefficients corresponding to the original signal are obtained again, but shifted by one position. Thus, at one level of wavelet decomposition, N different coefficients are obtained for all possible circular displacements of the input signal. If we proceed to L levels of wavelet decomposition, there will be LN different coefficients.
Algorithms for noise reduction that use the translation-invariant wavelet transform indirectly are spatially selective filtering algorithms in the wavelet domain. These algorithms use the undecimated wavelet transform to accurately determine the coefficients corresponding to the contours in the images, these being processed differently from the other wavelet coefficients.
The use of the undecimated wavelet transform in image noise reduction applications leads not only to an improvement in terms of Mean Square Error (MSE), but also in terms of the visual quality of the processed images.
Hu H. and colleagues [16] show that even in the case of noise reduction procedures performed by the truncation of wavelet coefficients, artifacts appear, especially in the approximation of singularities, due to some pseudo-Gibbs phenomena. These artifacts, which appear locally, are less annoying than those generated by noise reduction procedures based on the Fourier transform, which have a global character. Thus, the displacement-invariant wavelet transform constitutes an efficient method of compression and noise reduction in signals.
In this article, we propose an algorithm consisting of a convolutional neural network (CNN) structure and, to extract the important features of the objects in the image, a combination of the undecimated wavelet transform (UWT) and local binary pattern (LBP). We proposed this combination of algorithms because noise reduction and contour detection represent an important element in object recognition systems.
In images, the most important features are often sudden variations in pixel intensities called contours, the respective pixels being contour points [17]. Thus, noise reduction and contour detection represent a useful application of image processing.
The proposed method for noise reduction based on contour detection uses the undecimated wavelet transform in combination with LBP, which performs the processing on the wavelet coefficients resulting from the representation.
Singularities and irregular structures often contain the most information in a signal. In images, intensity discontinuities lead to locating the contours of objects, an important aspect when the final goal is to recognize the scene contained in the image or some objects in the scene. In the case of other types of signals, such as those encountered in radar applications or those corresponding to electrocardiograms, the information of interest is given by transient phenomena, such as peak values [18,19].

If a function has singularities with positive regularity, then the maximum values of the wavelet transform increase or remain constant as the scale increases. On the contrary, the values of wavelet maxima determined by noise decrease on average with increasing scale. This different behavior of the signal-induced and noise-induced peak values is used to select the signal-induced peaks, based on which the signal is then reconstructed. Thus, all those values of the local maxima that do not propagate along enough scales, or those whose average amplitude increases with the decrease in the scale, are canceled, leaving only those maxima induced by the signal. In this way, additive noise can be removed from the signal. The algorithm proposed in the article was tested on real images acquired from the drone in low-contrast conditions, and the results obtained are promising.

2. Materials and Methods


The architecture of the proposed algorithm is presented in Figure 1 and consists of the following stages: acquisition of data from drones in uneven lighting and low-contrast conditions; pre-processing of the images by segmentation and extraction of the region of interest (ROI) to obtain the part of the image that contains the most information; the proposed algorithm for noise reduction and feature extraction based on UWT and LBP; and the recognition of object features using a convolutional neural network (CNN).

Figure 1. The proposed model for object recognition using a hybrid feature extractor based on UWT-LBP.

2.1. Data Acquisition


The images were acquired in unfavorable conditions (low lighting and very low contrast) with a visible-spectrum camera installed on the drone, a Seeker Max SXL2 video camera. The Seeker Max SXL2 is a high-precision 3-axis gimbal that integrates a 40× optical zoom EO sensor and an IR thermal sensor with a 19 mm, 640 × 512 lens. The Seeker Max SXL2 is designed for use in the UAV industries for public safety, power, firefighting, aerial photography and other industrial applications.

2.2. Feature Extraction


The feature extraction process identifies and extracts the unique features of an object. For images acquired in unfavorable conditions, extracting the characteristics of objects is a difficult problem. To solve this problem, in this article, we propose a method based on UWT and LBP.

2.2.1. Data Preprocessing


The processing steps consist of converting the color images into grayscale images and performing segmentation to obtain the region of interest (ROI), depending on the object we want to recognize. In this work we focus on the detection of human forms. The images are then resized to 256 × 256 pixels.
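A minimal preprocessing sketch in Python, assuming the scikit-image library, is given below; the ROI is passed as an explicit crop window because the segmentation step itself is application-specific and not detailed here.

```python
from skimage import color, io, transform

def preprocess(path: str, roi=None, size=(256, 256)):
    """Convert a color image to grayscale, optionally crop a region of interest,
    and resize it to 256 x 256 pixels."""
    image = io.imread(path)
    if image.ndim == 3:            # color image -> grayscale
        image = color.rgb2gray(image)
    if roi is not None:            # roi = (row0, row1, col0, col1), placeholder for the segmentation output
        r0, r1, c0, c1 = roi
        image = image[r0:r1, c0:c1]
    return transform.resize(image, size, anti_aliasing=True)
```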

2.2.2. Undecimated Wavelet Transform


When processing the images acquired from drones in unfavorable conditions, we found that, in the case of noise reduction methods using wavelets, artifacts appear near singularities. In their vicinity, non-linear processing by the hard or soft thresholding method generates a phenomenon called pseudo-Gibbs, consisting of alternating values above and below a certain level.
Figure 2 shows the implementation diagram of an undecimated wavelet transformation using two levels of decomposition.

Figure 2. The undecimated, direct and inverse wavelet transform using two levels of decomposition,
the one-dimensional case.

In the case of the undecimated direct wavelet transform (Figure 2), both the even and the odd samples are retained for the detail subbands, and in the case of the low-frequency subbands, both the even and the odd samples are further decomposed. For the inverse transformation, both the even and the odd parts are inverted, and the result is averaged. It is observed that in the case of two levels of decomposition, we obtain 4 transformations. If we generalize such a decomposition for several levels, we will find that the number of transformations that can be obtained is $2^L$, where L is the number of decomposition levels. In fact, within a translation-invariant wavelet transformation, there are only $N \log_2 N$ different coefficients among the $N^2$ coefficients generated when considering all possible circular displacements, and the computational complexity is of the order of $N \log_2 N$.
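For illustration, the undecimated transform described above can be computed with the PyWavelets library, where it is exposed as the stationary wavelet transform (`swt2`/`iswt2`); the wavelet name and the random test image in the sketch below are assumptions, not the exact settings of our experiments.

```python
import numpy as np
import pywt

# Two-level undecimated (stationary) wavelet decomposition of an image.
# Every subband keeps the full image size, so no coefficients are discarded by decimation.
image = np.random.rand(256, 256)           # placeholder for a preprocessed drone image
coeffs = pywt.swt2(image, wavelet="coif1", level=2)

for approx, (horiz, vert, diag) in coeffs:
    print(approx.shape, horiz.shape)       # every subband stays 256 x 256

# The transform is redundant but invertible: the inverse reconstructs the original image.
reconstructed = pywt.iswt2(coeffs, wavelet="coif1")
print(np.allclose(image, reconstructed))
```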
The size of these artifacts depends on their location and the type of coefficients chosen to implement the wavelet transform. Basic Haar coefficients [20] have been used over time in the versatile recognition of images and sounds. A Haar parameter is a simple differential filter of low computational cost, requiring only additions and subtractions to extract the value of a feature. By applying a single Haar parameter (1-D) to a temporal signal, we obtain a simple bandpass filter, but by forming a linear combination of several Haar coefficients, a powerful classifier is created, with a high ability to accurately recognize various shapes. This type of classifier proves versatile by the simple fact that we can change the linear combination.
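A minimal sketch of such a 1-D Haar feature, computed with additions and subtractions only, is shown below; the window position, width and toy signal are illustrative assumptions.

```python
import numpy as np

def haar_feature_1d(signal: np.ndarray, start: int, width: int) -> float:
    """Value of a simple 1-D Haar feature: the sum over the second half of a window
    minus the sum over the first half, i.e. a crude differential (band-pass) filter."""
    half = width // 2
    left = signal[start:start + half].sum()
    right = signal[start + half:start + width].sum()
    return float(right - left)

x = np.concatenate([np.zeros(32), np.ones(32)])   # toy signal with a step edge at sample 32
print(haar_feature_1d(x, start=16, width=32))     # large response because the window straddles the edge
```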
Thus, in the case of Haar coefficients, a discontinuity located at N/2 practically does not generate pseudo-Gibbs phenomena; instead, a discontinuity located at N/3 will generate significant pseudo-Gibbs oscillations. The mentioned artifacts depend on the relative position between the characteristics of the signal and the characteristics of the base used: signals with similar features but slightly misaligned in time or frequency will generate fewer artifacts. One solution for correcting the misalignments between the signal and base characteristics, which have negative effects on the quality of the restored signal, is to shift the signal so that the relative positions of the signal and base characteristics reach a more favorable alignment. It is difficult to determine, however, the optimal displacement to avoid artifacts.
Therefore, we choose an alternative solution, which consists of applying the noise removal procedure for all possible circular displacements of the given signal, or only for a part of them, using the formula below:

$$\bar{T}\big(x; (S_h)_{h \in H}\big) = \mathrm{Ave}_{h \in H}\, S_{-h}\big(T(S_h(x))\big) \tag{1}$$

where $H$ is the set of considered circular displacements, $(S_h x)_t = x_{(t+h) \bmod n}$ is a circular displacement in time by $h$, and $\mathrm{Ave}_{h \in H}$ represents the arithmetic mean over all the values of $H$. The sequence of operations is as follows:
1. circular shift and usual wavelet transform;
2. soft (hard) truncation of the coefficients with a threshold;
3. inverse wavelet transform + reverse circular shift.
The complete execution of the $N$ possible displacements of a signal of length $N$ requires the memorization of $N^2$ coefficients, the computational complexity being of the order of $N^2$. As we have already mentioned, it is not necessary to perform all possible displacements because we use the averaging of the filtering results, the number of independent coefficients being a function of the number of wavelet decomposition levels.
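A minimal one-dimensional Python sketch of this averaging procedure (Equation (1)), assuming the PyWavelets library, is given below; the wavelet, decomposition level and threshold value are illustrative assumptions rather than the settings used in our experiments.

```python
import numpy as np
import pywt

def denoise_shift(x: np.ndarray, shift: int, wavelet: str = "db4",
                  level: int = 3, thr: float = 0.1) -> np.ndarray:
    """Steps 1-3 for a single circular displacement: shift, wavelet transform,
    soft thresholding of the detail coefficients, inverse transform, unshift."""
    shifted = np.roll(x, shift)
    coeffs = pywt.wavedec(shifted, wavelet, level=level)
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    restored = pywt.waverec(coeffs, wavelet)[: len(x)]
    return np.roll(restored, -shift)

def cycle_spin_denoise(x: np.ndarray, shifts) -> np.ndarray:
    """Equation (1): average the filtering results over the chosen circular displacements."""
    return np.mean([denoise_shift(x, h) for h in shifts], axis=0)

# Example: average over the 2**L = 8 non-redundant shifts for L = 3 decomposition levels.
noisy = np.sin(np.linspace(0, 8 * np.pi, 1024)) + 0.1 * np.random.randn(1024)
denoised = cycle_spin_denoise(noisy, shifts=range(8))
```

A general-purpose helper of the same shift-average form is also available in scikit-image as `skimage.restoration.cycle_spin`, which can wrap an arbitrary denoising function.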

2.2.3. Local Binary Pattern


The local binary pattern (LBP) operator classifies and labels the pixels in the images with binary values by setting a threshold over each pixel's neighborhood. Due to its high discriminative power and computational simplicity, LBP has become popular in various fields of computer vision, such as texture description, face recognition, and object recognition and classification. The most important feature of the LBP operator is its invariance to changes in illumination and scaling. The original version of the LBP operator used the neighborhood of each current pixel, usually of size 3 × 3; later versions use different types of neighborhoods or spatial pyramids [16]. The descriptor calculation steps are as follows:
- For each pixel in the image, the values in the vicinity of the point are thresholded according to the value of the central pixel (Figure 3).

Figure 3. Calculation mode of the LBP operator.

For each pixel, the following is calculated:

$$\mathrm{LBP}_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{2}$$

where $g_c$ is the gray value of the center pixel, $g_p$ ($p = 0, \ldots, P-1$) are the gray values of the $P$ neighboring pixels on a circle of radius $R$, $s$ is the thresholding function, and the output of the operator is a P-bit binary pattern with $2^P$ distinct values.
- A histogram of the $\mathrm{LBP}_{P,R}$ values is created;
- the histograms are concatenated if the binarization process is performed at several image scales.
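For illustration, the LBP codes of Equation (2) and the resulting multi-scale histogram descriptor can be computed with scikit-image; the choices of P, R and the random test image in the sketch below are assumptions.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray_image: np.ndarray, P: int = 8, R: float = 1.0) -> np.ndarray:
    """Compute the LBP code of every pixel (Equation (2)) and summarize the image
    by the normalized histogram of the 2**P possible patterns."""
    codes = local_binary_pattern(gray_image, P, R, method="default")
    hist, _ = np.histogram(codes, bins=2 ** P, range=(0, 2 ** P))
    return hist / hist.sum()

# Histograms computed at several radii (image scales) are concatenated into one descriptor.
img = np.random.rand(256, 256)
descriptor = np.concatenate([lbp_histogram(img, P=8, R=r) for r in (1.0, 2.0, 3.0)])
```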

2.2.4. Hybrid UWT and LBP Feature Extraction


A block diagram of the proposed algorithm used to extract the characteristics of the objects from the images is presented in Figure 4. The image is processed with the Coiflet 1 wavelet function to determine the approximation coefficients used for computing the proximal gradients of magnitude and direction. Based on these gradients, the LBP algorithm extracts the features (contours) of the objects of interest.

Figure 4. Block diagram for feature extraction using UWT and LBP.

2.3. Recognition Using a Convolutional Neural Network (CNN)

The basic architecture of the CNN (Figure 5) is composed of three convolutional layers, three pooling layers and two fully connected layers with ReLU activation functions. The last fully connected layer uses a softmax activation function. The first convolution layer has a kernel with a size of 3 × 3 pixels, contains 32 filters and uses a ReLU activation function; its role is to take in images of 50 × 50 pixels. The second layer is of the pooling type, with a size of 2 × 2 and a stride of 2. This layer reduces the spatial dimension by keeping the maximum value and eliminating the other values, thus controlling the overtraining effect. Also, based on the maximum value, the hyperparameters are reduced. The last hidden layer is a convolutional layer with a kernel size of 3 × 3 pixels, containing 32 filters, with a ReLU activation function. This layer is followed by a new max-pooling layer that has the characteristics of the previously described layer. The last 2 layers are like those described previously [21–25].

Figure 5. CNN network architecture.

3. Results
This article proposes an algorithm for extracting the feature contours of objects via hybrid feature extraction using the undecimated wavelet transform (UWT) and local binary pattern (LBP) with a convolutional neural network (CNN) for the detection of objects in images acquired from drones in unfavorable visibility conditions (contrast and lighting).
From the point of view of the filter banks, the undecimated wavelet transformation is obtained by keeping all even and odd samples resulting from the filtering and decimation operations, continuing the decompositions for the corresponding low-pass samples. In the case of the undecimated direct wavelet transform, all even/odd samples are retained for the detail sub-bands, and in the case of the low-frequency sub-bands, all samples are further decomposed. For the inverse transformation, the even/odd parts are reversed, and the result is averaged.
It is observed that in the case of two levels of decomposition, we obtain 4 transformations. If we generalize such a decomposition for several levels, we will find that the number of transformations that can be obtained is $2^L$, where L is the number of decomposition levels.
This can be explained as follows: following the convolution operation with the analysis filter, a set of coefficients is obtained from which, through the operation of decimation by two, only the coefficients with even indices are retained. If a shift by one position is performed, following the convolution operation, the same set of coefficients is obtained, but also shifted by one position. Through the decimation operation, the coefficients with odd indices from the set of coefficients corresponding to the original signal are retained this time. If a new shift is performed, followed by the convolution and decimation operations, the coefficients with even indices from the set of coefficients corresponding to the original signal are obtained again, but shifted by one position. Thus, at one level of wavelet decomposition, N different coefficients are obtained for all possible circular displacements of the input signal. If we proceed to L levels of wavelet decomposition, there will be LN different coefficients.
This transformation generates only the non-redundant coefficients at each scale of decomposition. It was found that for a one-dimensional wavelet transformation, for the first level of decomposition, only the displacements k = 0 and k = 1 generate independent (non-redundant) coefficients. For all other displacements, the obtained coefficients can be expressed in terms of those corresponding to these two displacements. In the case of two levels of decomposition, the displacements k = 0, k = 1, k = 2 and k = 3 are sufficient to obtain the independent coefficients. Considering M levels of wavelet decomposition, we demonstrated that non-redundant coefficients are obtained only for the displacements $k = 0, 1, 2, \ldots, 2^M - 1$, the rest being redundant. Generalizing the results, we can say that in the case of an image, for three levels of decomposition, we will have 64 displacements that generate independent coefficients, namely $\{(k_x, k_y),\ k_x, k_y = 0, 1, \ldots, 7\}$.
This is also highlighted by the test images in Figure 6 and the graphs in Figure 7. In both cases represented in these figures, it can be observed that for more than 64 displacements (the case that corresponds to a two-dimensional wavelet decomposition on three levels), virtually no improvement in image quality is obtained.
Regarding the quality of the contours, there is also no improvement when more than 64 displacements are performed (Figures 7 and 8).
Also, when performing the simulations, it was found that the choice of the 64 displacements is important for obtaining the best possible results. Such results are presented in Figures 7 and 8. Two methods of choosing the displacements were considered: (A) displacements represented by indices of the form $\{(i, j),\ i = 0, 1, \ldots, 7,\ j = 0, 1, \ldots, 7\}$; (B) displacements represented by indices of the form $\{(i, i),\ i = 0, 1, \ldots, 7\}$. It is observed that in case (A), better results are obtained, by at least 0.1 dB in PSNR terms; however, in case (B), the improvement is faster for a reduced number of displacements. This is important in applications where it is necessary to quickly obtain results close to the best possible.
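The two displacement-selection schemes can be written down directly; the short Python sketch below only enumerates the index sets (A) and (B) described above.

```python
# Displacement-selection schemes compared in Figures 7 and 8
# (for three 2-D decomposition levels, the shifts range over 0..7 on each axis).
scheme_a = [(i, j) for i in range(8) for j in range(8)]   # (A): full grid, 64 displacements
scheme_b = [(i, i) for i in range(8)]                     # (B): diagonal only, 8 displacements

# A partial run of scheme (A) uses only the first n displacements, e.g. n = 20.
partial_a = scheme_a[:20]
```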

Figure 6. Test images acquired with the drone in unfavorable conditions. Images (a) and (b) were acquired in unfavorable lighting and contrast conditions, so that the person in the images is difficult to identify visually.

Figure 6a,b show two images acquired with the drone in unfavorable conditions:
strong lighting (left image) and very low contrast (right image). These images will be used
to test the algorithm proposed in the article for pattern recognition.

Figure 7. Graphical representation of the dependence of the PSNR on the number of circular displacements performed, in the two cases considered: (a) and (b).

Figure 8. Graphical representation of the dependence of the coefficient C on the number of circular displacements performed, in the two cases considered: (a) and (b).

Figure 7 shows the graphical representation of the PSNR as a function of the number of circular displacements used, in the two cases considered: (A) displacements represented by indices of the form $\{(i, j),\ i = 0, 1, \ldots, 7,\ j = 0, 1, \ldots, 7\}$, in blue; (B) displacements represented by indices of the form $\{(i, i),\ i = 0, 1, \ldots, 7\}$, in green. The simulations were performed on two images of 256 × 256 pixels, with dispersion σ = 0.05, on the images acquired with the drone in unfavorable conditions shown in Figure 6a,b.
Figure 8 shows the graphical representation of the dependence of the coefficient C on the number of circular displacements performed, in the two cases considered: (A) displacements represented by indices of the form $\{(i, j),\ i = 0, 1, \ldots, 7,\ j = 0, 1, \ldots, 7\}$ in blue; (B) displacements represented by indices of the form $\{(i, i),\ i = 0, 1, \ldots, 7\}$ in green. The simulations were performed on two images of 256 × 256 pixels, with dispersion σ = 0.05, for Figure 6a,b.
The results on the two image databases used to carry out the tests highlight the efficiency of using the UWT and LBP algorithms. The joint use of the two algorithms at multiple resolutions showed equal or better results compared to the existing algorithms in the specialized literature (Figure 9).

Figure 9. Schematic diagram of using the LBP and detector structure method to fuse features.

For the recognition of objects from aerial images, we used a CNN network. For classification, we used 28 × 28 images at the input of the network, which are reshaped to 28 × 28 × 1. The CNN network used has three convolutional stages with the following characteristics: the first layer has 32 3 × 3 feature maps (filters), the second layer has 64 3 × 3 feature maps, and the third layer has 128 3 × 3 feature maps. We also use 3 max-pooling stages, each with a size of 2 × 2. Next, two dense layers are used, the last of which uses the softmax function with a size of 10 units for multiclass classification, together with a Flatten layer and a Dropout layer.
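A minimal Keras sketch matching this description is given below; the width of the first dense layer and the dropout rate are not specified in the text and are therefore assumptions of the sketch.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", padding="same", input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),    # width of the first dense layer is an assumption
    layers.Dropout(0.5),                     # dropout rate is an assumption
    layers.Dense(10, activation="softmax"),  # 10 output units for multiclass classification
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# Training as described in the text:
# model.fit(x_train, y_train, epochs=200, batch_size=16, validation_data=(x_val, y_val))
```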
To validate the performance of the CNN network used, the confusion matrix is calculated. This matrix is formed as follows: the first dimension corresponds to the real values and the second dimension to the predicted values, so each class has a row and a column. The values on the diagonal of the matrix represent the results that are classified correctly. The metrics derived from the matrix are Precision, Recall and F1-score.
Precision: the ratio between the number of samples correctly predicted as containing the object and the total number of samples predicted as containing the object.
Recall: the ratio between the number of samples correctly predicted as containing the object and the number of samples that actually contain the object.
F measure: defined as the harmonic mean of precision and recall.
F1-score: determined for each class; weights are determined for each class from the corresponding number of samples.
The network was trained for 200 epochs, with a batch size of 16. The results obtained are a classification accuracy of 95.7%, a recall of 0.96 and an average F1-score of 0.96. The confusion matrix is shown in Figure 10 and the classification report is shown in Table 2.
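For illustration, the confusion matrix and the per-class precision, recall and F1-score can be obtained with scikit-learn; in the sketch below, the predicted labels are random placeholders standing in for the CNN outputs, and the class names follow Table 2.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

class_names = ["Background Noise", "Single Target", "Multiple Targets"]

# y_true holds the real labels, y_pred the labels predicted by the CNN
# (random placeholders here; in practice take the argmax of the softmax outputs).
rng = np.random.default_rng(0)
y_true = rng.integers(0, 3, size=200)
y_pred = np.where(rng.random(200) < 0.95, y_true, rng.integers(0, 3, size=200))

cm = confusion_matrix(y_true, y_pred)   # rows: true classes, columns: predicted classes
print(cm)
print(classification_report(y_true, y_pred, target_names=class_names))
```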

Figure 10. Creating the confusion matrix.

Table 2. The results for precision, recall and F1-score.

Classes | Precision | Recall | F1-Score
Background Noise | 1 | 1 | 1
Single Target | 0.96 | 0.95 | 0.95
Multiple Targets | 0.96 | 0.97 | 0.96
Avg/total | 0.95 | 0.96 | 0.96

Creating the Confusion Matrix


The confusion matrix collects the numbers of correct and incorrect predictions for each class. It is necessary in order to calculate the efficiency of the machine learning algorithm used to classify the objects in the images using the pre-trained network. The square confusion matrix, denoted by C, has values $C_{ij}$ that correspond to the true labels of group i and the predicted labels of group j. The confusion matrix C is presented in Figure 10.
To validate the proposed algorithm for detecting objects from aerial images acquired
in unfavorable conditions, we tested the algorithm on a suite of images, and the results of
applying the feature extraction algorithm using the proposed UWT and LBP are presented
in Figure 11.
Figure 11. The images acquired from the drone (left) in poor visibility conditions and the results obtained for person recognition (red rectangle) with the proposed algorithm (right).

4. Discussion
In this article we propose a method for determining the contours of objects from aerial images acquired in unfavorable visibility conditions (low lighting and very low contrast), based on hybrid feature extraction using the undecimated wavelet transform (UWT) and local binary pattern (LBP) on a simple convolutional neural network (CNN). The focus is on the realization of the UWT and LBP feature extraction model for contour extraction. We used a simple CNN structure to validate the feature extraction method and to verify the proposed model on the test images.
In assessing the quality of the restored images, we considered both the amount of noise removed from the image, expressed by the PSNR, and the preservation or even improvement of contours, expressed by the coefficient C. The simulations were performed on several images with different spectral characteristics, under conditions of degradation with additive white Gaussian noise (laboratory conditions), but also on images acquired from the drone in unfavorable conditions of contrast and lighting or at different dispersions. We then highlighted the fact that the algorithm based on the translation-invariant wavelet transform proposed in this article is particularly effective, considering both the metrics mentioned in the article and the visual appreciation of the images. Thus, in terms of PSNR, the improvement occurs for all tested images and is generally about 1 dB, while the C coefficient increases only for some of the images, where it can reach up to 5 percent.
The proposed solution has the following innovative features:
• We use results from the theory of learning and shape recognition to determine the necessary conditions for the realization of a new feature detector based on UWT and LBP;
• We optimize the proposed algorithm through methods inspired by the simplification of artificial neural networks (for the selection of components that influence the learning process) and local temporal representations. While most target classification systems for video images are based on using one representation for all sequences, losing the temporal notion, the proposed imposed-structure detector model creates a fixed-length representation based on the temporal information structure. This model combines the benefits of generative and discriminative algorithms with a general character, depending on the selected problem;
• We propose an algorithm for extracting the features of local binary patterns based on the spatial plane and temporal band. Then, we merge the two types of features: one is the average local binary pattern features and the linear region structure imposition extracted from the temporal linear plane, and the other is the local binary pattern features and the sector region structure imposition extracted from the spatial plane, so that the two features are converted to the frame feature by the feature correlation function;
• We propose a non-linear approach for image description and classification. The performance of the proposed features is validated both in the context of a classification system and from the perspective of a system that searches images by content. The proposed algorithm represents a good alternative to classical texture descriptors, as it shows similar or higher performance compared to the algorithms presented in the literature.
The main advantages and limitations derived from the simulations performed using the solution proposed in the article are the following:
(1) The method of averaging the results of filtering on circular displacements leads to superior results compared to the standard methods of noise reduction in the wavelet domain, ensuring the removal of a greater amount of noise and a better preservation of contours; the images thus processed present a better visual quality, with fewer artifacts;
(2) Considering more circular displacements than the number of non-redundant displacements (64, considering three levels of wavelet decomposition in the case of images) does not lead to better results, regardless of the image;
(3) When a limited number of circular displacements is used, their choice is important; although the choice of displacements along the diagonal of the image (case B presented in Figure 7) leads to a faster improvement, the other considered case (A) ensures superior results when more than 20 circular displacements are performed;
(4) The use of different wavelet coefficients does not lead to very different results in terms of PSNR, but considering the coefficient C, the difference is more important. This observation leads us to the idea of a noise reduction algorithm based on multiple filtering that averages the circular displacements through the algorithm proposed in the article using different sets of wavelet coefficients. It is observed that, in the case of some images, an improvement of the contours is obtained, expressed by higher values of the C coefficient.
Shu X. and colleagues [26] present the multichannel local binary pattern (MCLBP) algorithm for many practical applications in pattern recognition, using this algorithm for the representation and classification of color texture. Syazwani, Asraf, Amin and Dalila [27] present the automatic identification and recognition of fruits using image processing, an important element in modern agriculture for the detection of objects in agricultural fields, using ANN-GDX classification with 94.4% accuracy. In [28], Li K. and colleagues present the identification of oil slicks resulting from oil spills at sea, based on the Gray Level Co-occurrence Matrix (GLCM) and Support Vector Machine (SVM) method, with an accuracy of 95%. Wang, Zhan and Yang [29] present LBP-CGAN (conditional generative adversarial network)-based rain removal algorithms, and the results remove rain from images with a PSNR of 38.79. Liu and Zhang [30] present a novel denoising algorithm using wavelets and obtain a PSNR of 23.3 dB, and Mustaqim, Tsaniya and Suciati [31] propose the CWT-LBP fusion feature for face recognition, with the experimental results showing a detection accuracy of 99.7% in ideal conditions. The results obtained by applying hybrid feature extraction using UWT and LBP led to a classification accuracy of 95.7%, a result comparable to or better than the results obtained by the previously mentioned algorithms for the detection of objects in images obtained in unfavorable conditions of low lighting and very low contrast.
In the framework of future research directions, a multiple filtering procedure will be developed by averaging over the circular displacements to improve the proposed algorithm. Also, a Wiener filter in the wavelet domain will be designed to increase the recognition accuracy.

5. Conclusions
This article proposes an algorithm for extracting the feature contours of objects using hybrid feature extraction based on the undecimated wavelet transform (UWT) and local binary pattern (LBP) with a convolutional neural network (CNN) for the detection, recognition and classification of objects from images acquired from drones in unfavorable visibility conditions (low lighting and very low contrast).
The proposed algorithm combines the efficiency of UWT and LBP for extracting the features required by a pattern recognition system. Since using wavelet algorithms to reduce the noise in the images introduces artifacts and generates the pseudo-Gibbs phenomenon, in this article we proposed a solution for correcting the misalignments between the signal characteristics and those of the base, which have negative effects on the quality of the restored signal. We found that the signal could be shifted so that the relative positions between the signal and the base characteristics reach the most favorable alignment. In practice, however, it is difficult to determine the optimal displacement to avoid artifacts. That is why the proposed solution uses an alternative approach that consists of applying the noise removal procedure to all possible circular displacements of the given signal, or only to a part of them, thus extracting the main characteristics of the objects from the images.
The results presented for the detection and recognition of people from aerial images taken in very low-contrast conditions showed that the algorithm proposed here, combining a CNN network with UWT and LBP, has a high detection accuracy. According to the results, the proposed UWT-LBP-CNN method is a promising method for recognizing objects in aerial images acquired in unfavorable visibility conditions.

Author Contributions: Conceptualization, P.C. and C.D.; methodology, C.D.; software, C.D. and D.D.; validation, P.C. and E.P.; formal analysis, P.C., I.C.C. and C.D.; investigation, C.D., E.P. and P.C.; resources, P.C. and D.D.; data curation, C.D., D.D. and E.P.; writing—original draft preparation, C.D. and P.C.; writing—review and editing, P.C., A.S. and I.C.C.; visualization, P.C., C.D. and A.S.; supervision, C.D. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: No new data were created or analyzed in this study. Data sharing is
not applicable to this article.
Acknowledgments: The authors thank the Energy & Eco Concept Company for their support and for producing and making available the aerial images, within the project Innovative system for combating cross-border terrorism, organized crime, illegal trafficking of goods and people—357/390033/27.09.2021, Code SMIS: 121596.
Conflicts of Interest: The authors declare no conflicts of interest.

References
1. Grossmann, A.; Morlet, J. Decomposition of Hardy Functions into Square Integrable Wavelets of Constant Shape. SIAM J. Math.
Anal. 1984, 15, 723–736. https://doi.org/10.1137/0515056.
2. Daubechies, I. The wavelet transform, time-frequency localization and signal analysis. IEEE Trans. Inf. Theory 1990, 36, 961–1005.
https://doi.org/10.1109/18.57199.
3. Mallat, S.G. Multiresolution Approximations and Wavelet Orthonormal Bases of L2(R). Trans. Am. Math. Soc. 1989, 315, 69–87.
https://doi.org/10.2307/2001373.
4. Tao, K.; Zhu, J.J. A comparative study on validity assessment of wavelet de-noising. J. Geod. Geodyn. 2012, 32, 128–133.
https://doi.org/10.1016/j.ijleo.2021.166652.
5. Vilimek, D.; Kubicek, J.; Golian, M.; Jaros, R.; Kahankova, R.; Hanzlikova, P.; Barvik, D.; Krestanova, A.; Penhaker, M.; Cerny,
M.; et al. Comparative analysis of wavelet transform filtering systems for noise reduction in ultrasound images. PLoS ONE 2021,
17, e0270745. https://doi.org/10.1371/journal.pone.0270745.
6. Vaiyapuri, T.; Alaskar, H.; Sbai, Z.; Devi, S. GA-based multi-objective optimization technique for medical image denoising in
wavelet domain. J. Intell. Fuzzy Syst. 2021, 41, 1575–1588.
7. Chen, H.; Zhou, C.H.; Wang, S.Z. Research based on mathematics morphology image chirp method. J. Eng. Graph. 2003, 2, 116–
119.
8. Ramakrishnan, S. Modern Applications of Wavelet Transform; IntechOpen, 2024; pp. 51–52. https://doi.org/10.5772/intechopen.1003981; ISBN 978-0-85466-235-7.
9. Mao, J.D. Noise reduction for lidar returns using local threshold wavelet analysis. Opt. Quantum Electron. 2012, 43, 59–68.
https://doi.org/10.3788/CJL20113802.0209001.
10. Yin, Q.S.; Dai, S.G. Research on image denoising algorithm based on improved wavelet threshold. Softw. Guide 2018, 17, 89–91.
11. Zhao, G.C.; Zhang, L.; Wu, F.B. Application of improved median filtering algorithm to image de-noising. J. Appl. Opt. 2011, 32,
678–682.
12. Radulescu, V.M.; Maican, C.A. Algorithm for image processing using a frequency separation method. In Proceedings of the
23rd International Carpathian Control Conference (ICCC), Sinaia, Romania, 29 May–1 June 2022; pp. 181–185.
https://doi.org/10.1109/ICCC54292.2022.9805961.
13. Zhou, Z.R.; Hua, D.X.; Yang, R. De-noising method for mie scattering lidar echo signal based on wavelet theory. Acta Photonica
Sin 2016, 45, 144–149. https://doi.org/10.1109/SDPC52933.2021.9563606.
14. Mallat, S.; Hwang, W.L. Singularity detection and processing with wavelets. IEEE Trans. Inf. Theory 1992, 38, 617–643.
https://doi.org/10.1109/18.119727.
15. Zhang, Y.D.; Hou, M.; Liu, Z.L.; Xie, P. Empirical mode decomposition with wavelet de-noising. In Proceedings of the 2010 IEEE
International Conference on Intelligent Computing and Intelligent Systems, Xiamen, China, 29–31 October 2010; pp. 183–186.
16. Hu, H.; Ao, Y.; Yan, H.; Bai, Y.; Shi, N. Signal Denoising Based on Wavelet Threshold Denoising and Optimized Variational
Mode Decomposition. J. Sens. 2021, 2021, 5599096. https://doi.org/10.1155/2021/5599096.

17. Mert, A.; Akan, A. Detrended fluctuation thresholding for empirical mode decomposition based denoising. Digit. Signal Process.
2014, 32, 48–56.
18. Yousefzadeh, R.; Huang, F. Using Wavelets and Spectral Methods to Study Patterns in Image-Classification Datasets. arXiv 2020,
https://doi.org/10.48550/arXiv.2006.09879.
19. Ke, P.; Cai, M.; Wang, H.; Chen, J. A novel face recognition algorithm based on the combination of LBP and CNN. In Proceedings
of the 14th IEEE International Conference on Signal Processing (ICSP), Beijing, China, 12–16 August 2018; pp. 539–543.
https://doi.org/10.1109/ICSP.2018.8652477.
20. Cao, Y.; Li, X. Image Recognition Based on Denoising and Edge Detection. In Proceedings of the IEEE International Conference on Computer Science, Electronic Information Engineering and Intelligent Control Technology (CEI), Fuzhou, China, 24–26 September 2021; pp. 550–559. https://doi.org/10.1109/CEI52496.2021.9574534.
21. Chen, T.; Gao, T.; Li, S.; Zhang, X.; Cao, J.; Yao, D.; Li, Y. A novel face recognition method based on fusion of LBP and HOG. IET
Image Process. 2021, 15, 3559–3572. https://doi.org/10.1049/ipr2.12192.
22. Wu, J.; Shen, T.; Wang, Q.; Tao, Z.; Zeng, K.; Song, J. Local Adaptive Illumination-Driven Input-Level Fusion for Infrared and
Visible Object Detection. Remote Sens. 2023, 15, 660. https://doi.org/10.3390/rs15030660.
23. Ding, H.; Xu, L.; Wu, Y.; Shi, W. Classification of hyperspectral images by deep learning of spectral-spatial features. Arab. J.
Geosci. 2020, 13, 464.
24. Jiang, J.; Sun, H.; Liu, X.; Ma, J. Learning spatial-spectral prior for super-resolution of hyperspectral imagery. IEEE Trans. Comput. Imaging 2020, 6, 1082–1096. https://doi.org/10.1109/TCI.2020.2996075.
25. Usui, K.; Ogawa, K.; Goto, M.; Sakano, Y.; Kyougoku, S.; Daida, H. Quantitative evaluation of deep convolutional neural network-based image denoising for low-dose computed tomography. Vis. Comput. Ind. Biomed. Art 2021, 4, 21.
26. Xue, P.; He, H. Research of Single Image Rain Removal Algorithm Based on LBP-CGAN Rain Generation Method. Math. Probl.
Eng. 2021, 2021, 886843. https://doi.org/10.1155/2021/8865843.
27. Syazwani, R.W.N.; Asraf, H.M.; Amin, M.M.S.; Dalila, K.N. Automated image identification, detection and fruit counting of top-view pineapple crown using machine learning. Alex. Eng. J. 2022, 61, 1265–1276. https://doi.org/10.1016/j.aej.2021.06.053.
28. Li, K.; Yu, H.; Xu, Y.; Luo, X. Detection of oil spills based on gray level co-occurrence matrix and support vector machine. Front.
Environ. Sci. 2022, 10, 1049880. https://doi.org/10.3389/fenvs.2022.1049880.
29. Wang, Z.; Zhan, J.; Duan, C.; Guan, X.; Yang, K. Vehicle detection in severe weather based on pseudo-visual search and HOG-LBP feature fusion. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2021, 236, 1607–1618. https://doi.org/10.1177/095440702110036311.
30. Liu, C.; Zhang, L. A Novel Denoising Algorithm Based on Wavelet and Non-Local Moment Mean Filtering. Electronics 2023, 12,
1461. https://doi.org/10.3390/electronics12061461.
31. Mustaqim, T.; Tsaniya, H.; Adhiyaksa, F.A.; Suciati, N. Wavelet Transformation and Local Binary Pattern for Data Augmentation in Deep Learning-based Face Recognition. In Proceedings of the 10th International Conference on Information and Communication Technology (ICoICT), Bandung, Indonesia, 2–3 August 2022; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2022; pp. 362–367. https://doi.org/10.1109/ICoICT55009.2022.9914875.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
