The Comparison of The Effect of Haimming Window An
5 authors, including:
Li da Fan Lin
Shanghai Jiao Tong University Xiamen University
All content following this page was uploaded by Fan Lin on 25 November 2016.
Abstract. Real-time pitch shifting is widely used in many kinds of music production.
Pitch-shifting techniques fall into two major types: time-domain and frequency-domain.
Compared with the time-domain method, the frequency-domain method offers a larger shifting
range, a lower total computational cost, and a more flexible algorithm. However, the use of the
Fourier transform in frequency-domain processing inevitably introduces frequency leakage,
which reduces the accuracy of the pitch shift. To suppress this side effect of the Fourier
transform, window functions are used to reduce the spectrum aliasing. In practice, the Hamming
window and the Blackman window are frequently used. In this paper, we compare the two window
functions, both in how well they suppress frequency leakage and in their subjective performance
and accuracy, based on the traditional phase vocoder[1]. Experiments show that the Hamming
window is generally better than the Blackman window for pitch shifting.
Introduction
From the frequency point of view, audio can be seen as a discrete signal composed of sine waves
that change over time. A music signal can be treated as stationary over a short period of time
(usually 10 ~ 30 ms): within that period it is relatively stable and simple, and subjectively it
sounds like a steady tone. Because of this short-term stability of music, the Short-Time Fourier
Transform (STFT)[2] is widely used. The signal within one such period is usually called a frame.
All frames can be extracted by sliding a window along the signal.
Short-time Fourier transform (STFT) analysis-synthesis is an effective solution to the problem of
phase discontinuity. The method consists of windowing with a hop increment, Fourier transform,
frequency/phase adaptation, synthesis windowing, and overlap-add (stacking)[3]. It effectively
eliminates the echo effect and is known as phase synthesis.
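The processing chain described above (windowing, Fourier transform, an optional frequency/phase modification, synthesis windowing, and overlap-add) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `hamming` and `stft_roundtrip`, the `modify` hook, and the values M = 1024 and R = 256 are assumptions chosen for the example. With no modification applied, the sketch should return the input signal, which is a useful sanity check before any pitch-shifting step is inserted.

```python
import numpy as np

def hamming(M):
    """Periodic Hamming window: 0.54 - 0.46*cos(2*pi*n/M) for 0 <= n < M."""
    n = np.arange(M)
    return 0.54 - 0.46 * np.cos(2 * np.pi * n / M)

def stft_roundtrip(x, M=1024, R=256, modify=None):
    """Analyze x frame by frame and resynthesize it by overlap-add.

    M is the window length, R the hop size (assumed values).
    `modify` is a hypothetical hook where a frequency/phase
    adaptation step would go; None leaves each spectrum unchanged.
    """
    w = hamming(M)
    y = np.zeros(len(x))
    norm = np.zeros(len(x))
    for start in range(0, len(x) - M + 1, R):
        frame = x[start:start + M] * w           # analysis windowing
        spec = np.fft.rfft(frame)                # Fourier transform
        if modify is not None:
            spec = modify(spec)                  # frequency/phase adaptation
        out = np.fft.irfft(spec, n=M) * w        # synthesis windowing
        y[start:start + M] += out                # overlap-add
        norm[start:start + M] += w * w           # accumulated window energy
    covered = norm > 1e-8
    y[covered] /= norm[covered]                  # undo the window weighting
    return y

# Identity round trip on a 440 Hz test tone (assumed test signal).
fs = 44100
t = np.arange(8192) / fs
x = np.sin(2 * np.pi * 440.0 * t)
y = stft_roundtrip(x)
max_error = np.max(np.abs(y - x))
```

Dividing by the accumulated squared window is one common way to restore the original amplitude after the same window has been applied twice; other normalizations are possible.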
This is an open access article under the CC-BY 4.0 license (https://creativecommons.org/licenses/by/4.0/)
222 Emerging Engineering Approaches and Applications
compare different audio frequency samples. We found that the effect of the Hamming window is
better than that of the Blackman window. The experiment also shows that the Hamming window
suppresses frequency leakage better. The sequence after windowing needs reconstruction to
restore its original energy. For both the Blackman window and the Hamming window, the
reconstruction process can use the same window function. The time-domain signal can then be
restored by the overlap-add synthesis method.
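Blackman Window
$$w_h(n) = \begin{cases} 0.42 - 0.5\cos\left(\dfrac{2\pi n}{M}\right) + 0.08\cos\left(\dfrac{4\pi n}{M}\right), & 0 \le n < M \\ 0, & \text{otherwise} \end{cases} \tag{1}$$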
Hamming Window
$$w_m(n) = \begin{cases} 0.54 - 0.46\cos\left(\dfrac{2\pi n}{M}\right), & 0 \le n < M \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
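Equations (1) and (2) translate directly into code. The sketch below assumes the cosine argument is 2πn/M with 0 ≤ n < M, as written in the equations; the function names are illustrative. Note that NumPy's built-in `np.hamming` and `np.blackman` use M − 1 in the denominator (the symmetric form), so a window of length M + 1 truncated to M samples is used for the comparison.

```python
import numpy as np

def blackman_window(M):
    """Equation (1): Blackman window on 0 <= n < M."""
    n = np.arange(M)
    return (0.42
            - 0.5 * np.cos(2 * np.pi * n / M)
            + 0.08 * np.cos(4 * np.pi * n / M))

def hamming_window(M):
    """Equation (2): Hamming window on 0 <= n < M."""
    n = np.arange(M)
    return 0.54 - 0.46 * np.cos(2 * np.pi * n / M)

M = 8  # small length chosen only for illustration
w_h = blackman_window(M)
w_m = hamming_window(M)

# Cross-check against NumPy's symmetric windows of length M + 1.
matches_numpy = (np.allclose(w_m, np.hamming(M + 1)[:M])
                 and np.allclose(w_h, np.blackman(M + 1)[:M]))
```

The Hamming window starts at 0.08 rather than at zero, while the Blackman window starts at zero; both reach 1 at the center n = M/2.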
Fig. 1. The spectrum of the audio signal after applying the Hamming window function and the Fourier transform
Fig. 2. The spectrum of the audio signal after applying the Blackman window function and the Fourier transform
The comparison shows that the Hamming window confines the signal's spectral energy to a narrow
band near 100 Hz. A signal windowed with the Hamming window has a narrower main lobe, more
concentrated energy, and a more accurate frequency estimate, which improves the performance of
the pitch-shifting process. The Blackman window clearly has a wider main lobe, with energy leaking
into neighboring frequencies. Meanwhile, it should be pointed out that the signal with the Hamming
window shows more side lobes in the experiment, which to a certain extent offsets the concentration
effect of the narrow main lobe. In general, the Hamming window is better than the Blackman window
at suppressing frequency leakage.
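The trade-off described above, a narrower main lobe for the Hamming window against stronger side lobes, can be checked on the window spectra themselves. The sketch below is an illustration under assumptions, not a reproduction of the paper's experiment: it measures the windows alone rather than a windowed audio signal, and the helper names, M = 1024, and the zero-padding factor of 64 are chosen for the example.

```python
import numpy as np

def hamming_window(M):
    n = np.arange(M)
    return 0.54 - 0.46 * np.cos(2 * np.pi * n / M)

def blackman_window(M):
    n = np.arange(M)
    return (0.42 - 0.5 * np.cos(2 * np.pi * n / M)
            + 0.08 * np.cos(4 * np.pi * n / M))

def window_spectrum_db(w, pad=64):
    """Magnitude spectrum in dB, normalized to its peak, zero-padded by `pad`."""
    W = np.abs(np.fft.rfft(w, n=pad * len(w)))
    return 20 * np.log10(np.maximum(W / W.max(), 1e-12))

def main_lobe_and_sidelobe(w, pad=64):
    """Return (null-to-null main-lobe width in DFT bins, peak side-lobe level in dB)."""
    db = window_spectrum_db(w, pad)
    k = 0
    # Walk down the main lobe until the spectrum stops decreasing.
    while k < len(db) - 1 and db[k + 1] < db[k]:
        k += 1
    width_bins = 2 * k / pad          # the lobe is symmetric around bin 0
    peak_sidelobe_db = db[k:].max()   # highest level beyond the first null
    return width_bins, peak_sidelobe_db

M = 1024
width_hamming, sidelobe_hamming = main_lobe_and_sidelobe(hamming_window(M))
width_blackman, sidelobe_blackman = main_lobe_and_sidelobe(blackman_window(M))
```

For these two windows the main lobe comes out at roughly 4 bins for Hamming and 6 bins for Blackman, while the Blackman side lobes sit lower than the Hamming ones, which matches the qualitative description in the text.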
$$x(n) = \frac{1}{M} \sum_{s=-\infty}^{\infty} x_w(n)\, f(sR - n) \tag{4}$$
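$$\sum_{s=-\infty}^{\infty} w_m(sR - n)\, w_m[p \cdot (sR - n)] = \begin{cases} \dfrac{M}{R}\left(0.08 + 0.92^2 \cdot \dfrac{1}{4}\right), & p \ne 1 \\ \dfrac{M}{R}\left(0.08 + 0.92^2 \cdot \dfrac{3}{8}\right), & p = 1 \end{cases} \tag{6}$$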
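The p = 1 case of equation (6) can be checked numerically: overlapping squared Hamming windows at hop R should add up to the constant (M/R)(0.08 + 0.92² · 3/8). The sketch below is only a sanity check under assumptions: M = 1024, R = 256, the frame count, and the function names are chosen for the example, and R is assumed to divide M with M/R ≥ 3 so that the cosine terms cancel. The p ≠ 1 case is not covered here.

```python
import numpy as np

def hamming_window(M):
    n = np.arange(M)
    return 0.54 - 0.46 * np.cos(2 * np.pi * n / M)

def overlap_sum_of_squares(M, R, frames=32):
    """Overlap-add w_m^2 at hop R and return the fully overlapped middle part."""
    w = hamming_window(M)
    acc = np.zeros(R * (frames - 1) + M)
    for s in range(frames):
        acc[s * R:s * R + M] += w * w
    # Drop the first and last M samples, where fewer frames overlap.
    return acc[M:-M]

M, R = 1024, 256
steady = overlap_sum_of_squares(M, R)
predicted = (M / R) * (0.08 + 0.92 ** 2 * 3 / 8)
agrees = np.allclose(steady, predicted)
```

Because the overlapped sum is constant, dividing the overlap-add output by this value restores the original amplitude when the same Hamming window is used for analysis and synthesis.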
In actual processing, the state of the sequence may change after it has been processed, so
synthesis windowing and overlap-add cannot guarantee that all of the energy is restored. In
specific circumstances [5], however, this process can restore most of the energy. The experiments
show that the Hamming window is better than the Blackman window at suppressing frequency leakage.
In the improved algorithm, we use the Hamming window as the convolution window in both the
analysis and the synthesis process.
Fourier transform. The truncated signal spreads its energy into neighboring frequencies, an effect
known as frequency leakage. Using the Hamming window with the Fourier transform minimizes
frequency leakage better than using the Blackman window. Digital audio is a discrete sequence
obtained by sampling an analog signal, so the Fourier transform in audio processing usually refers
to the Discrete Fourier Transform (DFT). The transform and its inverse are given as follows:
(7) Fourier transform; (8) inverse Fourier transform.
(7)
(8)
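For reference, the DFT pair in its common textbook form is written out below. This is an assumption about what equations (7) and (8) contain: the symbols X(k), x(n), and the transform length N, as well as placing the 1/N factor on the inverse, follow the usual convention and are not confirmed by the text.

```latex
% (7) Discrete Fourier Transform, assumed standard form
X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \dots, N-1

% (8) Inverse Discrete Fourier Transform, assumed standard form
x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k n / N}, \qquad n = 0, 1, \dots, N-1
```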
2.7 The relationship between subjective characteristics, the window length M, and the number R of
skipped samples
The larger M is, the longer the span the window covers and the smaller the error of the spectral
analysis [9]. Experiments show that the human ear is sensitive to frequency errors in the
high-pitched parts of music. A large window increases the accuracy of spectral analysis. For
44.1 kHz audio, choosing M > 2048 gives a good result for pitch-shifting coefficients between 1
and 2. The music keeps its original pitch when the pitch coefficient equals 1.
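One way to see why a larger M improves spectral accuracy is to look at the spacing between DFT bins, which is the sample rate divided by M. The short sketch below works this out for 44.1 kHz audio; the function name and the second window length of 4096 are assumptions added for comparison.

```python
def frame_resolution(sample_rate_hz, M):
    """Return (DFT bin spacing in Hz, frame duration in ms) for window length M."""
    bin_spacing_hz = sample_rate_hz / M
    frame_duration_ms = 1000.0 * M / sample_rate_hz
    return bin_spacing_hz, frame_duration_ms

# 44.1 kHz audio with the window length mentioned in the text, and a longer one.
spacing_2048, duration_2048 = frame_resolution(44100, 2048)
spacing_4096, duration_4096 = frame_resolution(44100, 4096)
```

At M = 2048 the bins are about 21.5 Hz apart and one frame lasts about 46 ms; doubling M halves the bin spacing at the cost of a frame twice as long.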
Summary
In the Phase Vocoder [4] pitch-shifting algorithm, part of the low-frequency content is lost when
the frequency is doubled. The improved algorithm preserves the low-frequency content of the
original wave and composes a new wave that doubles the original frequency naturally. Consider
music divided into background instruments and voice. For pitch raising, applying frequency
modulation directly to the background instruments produces an echo-like reverberation of the
instruments, and the human voice is audibly duplicated; for pitch lowering, frequency modulation
produces noise in the region where two frames join. The phase-synthesis pitch-shifting algorithm
solves both problems effectively. The Phase Vocoder[4] yields a voice of low smoothness. The
improved algorithm keeps the subjective loudness of the music consistent before and after the
change, and the quality of the voice is markedly improved.
Because of the inherent frequency leakage of the short-time Fourier transform, the phase synthesis
method applies a window in the time domain to reduce the spectrum aliasing. Modulating the
frequency directly makes the phases of successive analysis frames inconsistent. Adjusting the
phase differences and phase sums during the phase-modification step connects the frames and gives
a subjectively very smooth result. Windowing changes the energy of the original sequence, but the
synthesis windowing and overlap-add process restores it. Appropriate parameters should be chosen
for each case: a large window length for raising the frequency and a small step size for lowering
it.
References
[1] J. Laroche, “Time and pitch scale modification of audio signals,” in Applications of Digital Signal
Processing to Audio and Acoustics, M. Kahrs and K. Brandenburg, Eds. Kluwer, Norwell, MA,
1998.
[2] J.L. Flanagan and R.M. Golden, “Phase vocoder,” Bell Syst. Tech. J., vol. 45, pp. 1493–1509,
Nov 1966.
[3] J. B. Allen and L. R. Rabiner, “A unified approach to short-time Fourier analysis and synthesis,”
Proc. IEEE, vol. 65, no. 11, pp. 1558–1564, Nov. 1977.
[4] R. Portnoff, “Time-scale modifications of speech based on short-time Fourier analysis,” IEEE
Trans. Acoust., Speech, Signal Processing, vol. 29, no. 3, pp. 374–390, 1981.
[5] M.S. Puckette, “Phase-locked vocoder,” in Proc. IEEE ASSP Workshop on App. of Sig. Proc. to
Audio and Acous., New Paltz, NY, 1995.
[6] J. Laroche and M. Dolson, “Improved phase vocoder time-scale modification of audio,” to appear
in May issue of IEEE Trans. Speech and Audio Proc., 1999.
[7] L.B. Almeida and F.M. Silva, “Variable-frequency synthesis: an improved harmonic coding
scheme,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, 1984, pp. 27.5.1–27.5.4.
[8] R. J. McAulay and T. F. Quatieri, “Speech analysis/synthesis based on a sinusoidal
representation,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, no. 4,
pp. 744–754, Aug 1986.
[9] X. Serra and J. Smith, “Spectral modeling synthesis: A sound analysis/synthesis system based on
a deterministic plus stochastic decomposition,” Computer Music J., vol. 14, no. 4,
pp. 12–24, Winter 1990.
[10] E. B. George and M. J. T. Smith, “Analysis-by-synthesis/overlap-add sinusoidal modeling
applied to the analysis and synthesis of musical tones,” J. Audio Eng. Soc., vol. 40, no. 6,
pp. 497–516, 1992.
[11] S. Tassart and P. Depalle, “Analytical approximations of fractional delays: Lagrange
interpolators and allpass filters,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing,
Munich, Germany, 1997.
[12] T.I. Laakso, V. Valimaki, M. Karjalainen, and U. K. Laine, “Splitting the unit delay [FIR/all pass
filters design],” IEEE Signal Processing Mag., vol. 13, no. 1, pp. 30–60, Jan 1996.