Speech Denoising With Maximal Overlap Discrete Wavelet Transform
Abstract— In this paper, the effectiveness of the maximal overlap discrete wavelet transform (MODWT) method for denoising speech signals is tested and examined. Ensuring the intelligibility of speech in noisy environments by separating it from the noise is a widely researched topic today. Recovering the original speech from the noisy signal with minimal distortion remains a challenge, however, because the background noise is difficult to remove and numerous environmental factors can interfere with the signal. In this study, the performance of several discrete wavelet transform methods is experimentally analyzed using different wavelet filters. The analysis program was implemented in the MATLAB environment. The noisy inputs were speech recordings containing different environmental background noises (train, car, station, plane, etc.). During the tests, the noise was filtered out of these noisy input signals by wavelet analysis: the noisy speech signal is decomposed into wavelet coefficients, which are then processed with different thresholding methods. The reconstructed speech was evaluated by measuring the signal-to-noise ratio (SNR) between the noisy input signal and the smoothed output signals. The scientific contributions of the study include a detailed comparative analysis of the performance of various wavelet methods against different environmental background noises.

Keywords— Speech enhancement, wavelet thresholding, signal denoising, discrete wavelet transform, maximal overlap

I. INTRODUCTION

Severe problems can arise with almost any speech signal processing method because noise in a speech signal degrades intelligibility. Noise removal is therefore a common problem across these fields of research. Wavelets are a very good alternative for obtaining better noise reduction at a reasonable computational complexity. During wavelet-based noise removal, also called blocking, the first step is to calculate the coefficients of the wavelet transform, the second is to threshold the coefficients, and the last is to invert the thresholded coefficients by applying the inverse wavelet transform.

The wavelet transform allows a multi-resolution decomposition of a signal into different kinds of coefficients while preserving the key signal information. As a result, it has been employed in a variety of scientific and engineering applications, such as medical signal processing, audio processing, image processing, data compression, denoising, and seismic measurements. Perfect reconstruction is also possible, with good approximations to smooth functions; orthonormal wavelets make this simple to realize [1]. Meyer [2] proposed a very compact construction of orthonormal wavelets, which became a widely adopted wavelet technique. Since then, other scientists have studied multi-resolution signal analysis in a wide range of scientific applications. The theory of wavelets emerged from the exploration of wave propagation, time-frequency signal analysis and sampling theories.

Unlike the Fourier transform, the wavelet transform can focus on local information, including abrupt changes in a signal, because it can examine the signal in a local region of the time-frequency plane as finely as needed. For this reason, it is widely used today. The wavelet transform can be regarded as an enhanced version of the Fourier transform: for a signal that is not stationary and contains sudden changes, it allows the signal components to be isolated and analyzed in the desired region, which the Fourier transform does not. The wavelet transform is divided into two groups: the continuous wavelet transform (CWT) for analog (continuous-time) signals and the discrete wavelet transform (DWT) for digital signals. The DWT uses a particular subset, or grid, of translation and scale values.

Wavelets are very useful in interpreting data because they are characterized by scale and position. How well fluctuations in the size and position of a signal are captured varies with the wavelet size, so a good choice of mother wavelet, and of the resulting wavelet coefficients, is important for describing the fluctuations in a signal accurately. Compared to traditional signal processing transformations, wavelets have numerous benefits in terms of time and frequency analysis. Because wavelets can be compressed at small scales, they can capture very fast or suddenly changing signals. Wavelets can be scaled up for slowly changing low-frequency features and scaled down for fast changes. The scaled versions are called "sister wavelets" of an oscillating waveform called the "mother wavelet". At the first stage of the DWT algorithm, the discrete signal is decomposed into two sets of coefficients: the approximation coefficients, obtained with the low-pass filter, and the detail coefficients, obtained with the high-pass filter. After filtering, the outputs are decimated by down-sampling with a factor of 2 [3].

II. WAVELET THRESHOLDING

A. Discrete Wavelet Transform

In the discrete wavelet transform (DWT), an input signal is separated into various constituents. The DWT captures the frequency content of an input at different scales and positions [4]. Wavelet analysis examines a signal at several resolutions with localization in both frequency and time [5]. The DWT is represented mathematically by the equation in (1).

[Equation (1): DWT definition; the symbols did not survive extraction]
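As a rough illustration of the first DWT stage described in the introduction (low-pass and high-pass filtering followed by down-sampling by 2), the following Python sketch uses the Haar filter pair. The Haar choice, the sample values and the function names `haar_dwt_level` and `haar_idwt_level` are assumptions made for this example only; the study itself was carried out in MATLAB with several wavelet filters.

```python
import math

def haar_dwt_level(x):
    """One DWT stage with Haar filters: filter, then keep every second output."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    # Low-pass branch (scaled pairwise sums) gives the approximation coefficients.
    approx = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    # High-pass branch (scaled pairwise differences) gives the detail coefficients.
    detail = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt_level(approx, detail):
    """Invert one Haar stage; orthonormality gives perfect reconstruction."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([s * (a + d), s * (a - d)])
    return x

# Hypothetical 8-sample signal used only to exercise the functions.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_dwt_level(signal)
restored = haar_idwt_level(approx, detail)
```

Each branch returns half as many coefficients as the input has samples, which is the decimation step; applying the same stage again to `approx` would give the next decomposition level.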
$$T_{\mathrm{hard}}(X,\lambda)=\begin{cases} X; & |X|>\lambda \\ 0; & |X|\le\lambda \end{cases} \tag{5}$$

where X represents the noisy wavelet coefficients and λ represents a threshold value [1]. Soft thresholding sets all wavelet detail coefficients below the threshold to zero and shrinks all detail coefficients above the threshold toward zero by λ, as can be seen in (6).

$$T_{\mathrm{soft}}(X,\lambda)=\begin{cases} \operatorname{sgn}(X)\,(|X|-\lambda); & |X|>\lambda \\ 0; & |X|\le\lambda \end{cases} \tag{6}$$

In this study, the sqtwolog, rigrsure, heursure and minimax noise threshold estimation methods are discussed.

In the sqtwolog approach, the threshold (Tsq) is the standard deviation multiplied by the square root of twice the logarithm of the signal length, where σ denotes the standard deviation and N represents the length of the signal. Tsq is given by

$$T_{sq}=\sigma\sqrt{2\log N} \tag{7}$$

MODWT is an undecimated version of the DWT and is mainly used to analyze signals at different scales. Unlike the DWT, it is time-shift invariant; as in the DWT, an input signal is decomposed into high- and low-frequency components to find the detail and approximation coefficients. The important differences between MODWT and the DWT are that MODWT is a highly redundant and non-orthogonal transformation. MODWT has many advantages over the DWT. For example, its redundancy makes it easy to align the decomposed coefficients at each scale level with the time series, thus allowing a ready-made comparison between the series and its decomposition. The redundancy of the MODWT wavelet coefficients also increases the effective degrees of freedom at each scale, thus reducing the variance of wavelet-based statistical estimates. By aligning the wavelet coefficients at each time point with the original data index, MODWT can jointly analyze localized signal changes in scale and time, and their temporal relationship to events. MODWT keeps the values that would normally be discarded by down-sampling at each level of the DWT decomposition. Another advantage of MODWT is that it can be
2022 International Conference on Electrical and Computing Technologies and Applications (ICECTA)
CONCLUSION
The wavelet transform is a modern technique for improving speech signals. In this study, we evaluated how well wavelet families perform on input signals with various noisy audio backgrounds. The aim was to use the wavelet transform to remove various background disturbances from an input speech signal. We then applied several thresholding techniques to the noisy input signals and evaluated how well known wavelet families perform at noise removal, measuring the performance of the thresholding strategies by their SNR values. With MODWT thresholding and the sqtwolog threshold, we obtained better SNR results for (0SNR, 5SNR and 10SNR) than with the other methods. On the other hand, for 10dB SNR, we achieved better results with the heursure and rigrsure thresholding methods. Overall, we observed that the MODWT method gave very good results.
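The thresholding rules in (5) to (7), applied to an undecimated transform and scored by SNR, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the MATLAB program used in the study: it uses a single level of a Haar MODWT-style transform with circular filtering, takes `log` in (7) as the natural logarithm, treats the noise level `sigma` as given, and defines SNR as the ratio of reference energy to error energy in decibels. All function names and sample values are hypothetical.

```python
import math

def hard_threshold(x, lam):
    """Rule (5): keep a coefficient only if its magnitude exceeds lam."""
    return x if abs(x) > lam else 0.0

def soft_threshold(x, lam):
    """Rule (6): zero small coefficients, shrink the rest toward zero by lam."""
    return math.copysign(abs(x) - lam, x) if abs(x) > lam else 0.0

def sqtwolog_threshold(sigma, n):
    """Rule (7): sigma * sqrt(2 * log(N)); natural log is an assumption."""
    return sigma * math.sqrt(2.0 * math.log(n))

def haar_modwt_level(x):
    """One undecimated Haar stage with circular filtering (no down-sampling)."""
    n = len(x)
    wavelet = [(x[t] - x[t - 1]) / 2.0 for t in range(n)]  # detail coefficients
    scaling = [(x[t] + x[t - 1]) / 2.0 for t in range(n)]  # approximation coefficients
    return wavelet, scaling

def haar_imodwt_level(wavelet, scaling):
    """Invert the stage above; exact when the coefficients are unmodified."""
    n = len(wavelet)
    return [
        (wavelet[t] - wavelet[(t + 1) % n] + scaling[t] + scaling[(t + 1) % n]) / 2.0
        for t in range(n)
    ]

def snr_db(reference, estimate):
    """SNR in dB: reference energy over error energy (assumed definition)."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical noisy samples and noise level, used only to run the three steps:
# transform, threshold the detail coefficients, inverse transform.
noisy = [0.1, 2.0, 2.1, 1.9, -0.2, 0.0, 3.0, 3.2]
wavelet, scaling = haar_modwt_level(noisy)
lam = sqtwolog_threshold(sigma=0.1, n=len(noisy))
denoised = haar_imodwt_level([soft_threshold(c, lam) for c in wavelet], scaling)
```

Because nothing is discarded by down-sampling, `wavelet` and `scaling` each have the same length as the input, so every coefficient stays aligned with a sample index of the original signal. Swapping `soft_threshold` for `hard_threshold` in the last line gives the other rule.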
REFERENCES
[1] Dai-Wei Wang and X. Zhang, "Design of Hilbert transform pairs of
orthonormal wavelet bases using IIR filters," 2010 10th International
Symposium on Communications and Information Technologies, 2010, pp.
554-559, doi: 10.1109/ISCIT.2010.5665051.
[2] Meyer Y, “Wavelets and operators” (Vol. 1). Cambridge university
press., (1992).
[3] Chavan MS, Mastorakis N, “Studies on implementation of Harr and
Daubechies wavelet for denoising of speech signal”, International journal of
circuits, systems and signal processing, 4 (3), 83-96, (2010).
[4] Munegowda BK, “Performance and Comparative Analysis of Wavelet
Transform in Denoising Audio Signal from Various Realistic Noise”.
Doctoral dissertation, Napier University, Edinburgh, Scotland, United
Kingdom, (2016).
[5] Verma N, Verma AK, “Performance analysis of wavelet thresholding
methods in denoising of audio signals of some Indian Musical
Instruments”. International Journal of Engineering Science and Technology,
4 (5), 2040-2045, (2012).
[6] Venkateswarlu SC, Reddy AS, Prasad KS, “Speech Enhancement in
terms of Objective Quality Measures Based on Wavelet Hybrid Thresholding
the Multitaper Spectrum”. International Journal of Advanced Research in
Electrical, Electronics and Instrumentation Engineering, 5 (1), 201-219,
(2016).
[7] Mihov S.G., Ivanov, R.M., & Popov, A. N., “Denoising Speech Signals
by Wavelet Transform”, Annual Journal of Electronics, ISSN 1313-1842,
pp:69-72, (2009).
[8] Du, L., Xu, R., Xu, F., et al, “Research on key parameters of speech
denoising algorithm based on wavelet packet transform”, IEEE 3rd
International Conference on Computer Science and Information Technology,
2010, pp:551-556, doi: 10.1109/ICCSIT.2010.5564729
[9] Ozaydın, S. & Alak, İ. K, “Speech Enhancement using Maximal Overlap
Discrete Wavelet Transform”, Gazi University Journal of Science Part A:
Engineering and Innovation , 5 (4) , pp.159-171. 2018.
[10] D. K. Alves, C. M. S. Neto, F. B. Costa and R. L. A. Ribeiro, "Power
measurement using the maximal overlap discrete wavelet transform," 2014
11th IEEE/IAS International Conference on Industry Applications, 2014, pp.
1-7, doi: 10.1109/INDUSCON.2014.7059455
[11] Loizou P, “NOIZEUS: A noisy speech corpus for evaluation of speech
enhancement algorithms”. Speech Communication, 49, 588-601, (2017).