Speech Processing Using Multirate DSP
M. SATISH, ECE, GVPCOE, Madhurawada, Visakhapatnam-41. T. VENKATESWARA SWAIN, ECE, GVPCOE, Madhurawada, Visakhapatnam-41.
1. Abstract
In conventional speech processing applications, the speech signal is encoded using a fixed number of bits over the entire speech band. As a result, the bandwidth required for speech transmission is relatively high, which is a concern. Quadrature Mirror Filter (QMF) banks are the fundamental building blocks for spectral splitting. The QMF structure allows spectral decomposition into contiguous, overlapping sub bands in such a way that the aliasing incurred in the initial analysis stage is eliminated during signal reconstruction by the synthesis stage. Techniques have been developed to design the so-called perfect reconstruction QMF bank, which allows complete elimination of amplitude and phase distortion in the reconstructed signal. A QMF bank can be formed from FIR or IIR filters. The aim of the paper is to design a QMF filter and then pass a speech signal through it. The low pass filtered signal is decimated and encoded with a larger number of bits, and the high pass filtered signal is also decimated and encoded with a smaller number of bits. These two bit streams are multiplexed and transmitted. At the receiver, the received signal is demultiplexed and decoded. The signal is passed through the interpolators and then through the synthesis filters so as to reconstruct the speech signal. The reconstructed signal is compared with the original speech signal.
2. Introduction
In practice we often encounter signals where most of the energy content, which is important to us, is present in a
particular frequency band. One of the best examples of this type of signal is speech, in which most of the energy is present in the lower frequency bands. Coding the complete signal by allocating the same number of bits across the entire band is not an efficient way of coding it for either transmission or storage. Signal coding is the act of transforming the signal at hand into a more compact form, which can then be transmitted with considerably less bandwidth or stored in considerably less memory. The motivation is that access to an unlimited amount of bandwidth is not possible. Therefore there is a need to code and compress speech signals. By taking advantage of the fact that most of the energy is present in a particular frequency band, we can split the signal into bands according to their information content and then code the subband signals separately. One of the most popular techniques using this concept is subband coding. Subband coding is a method in which the signal is first split into a number of sub bands depending on the information content. After splitting the signal, each
sub band is encoded separately: more bits are allocated to sub bands containing more information and fewer bits to sub bands containing less information. In this paper we implement a subband coding system using QMF (Quadrature Mirror Filter) banks. Other coding techniques can also convert the signal into a compact form for later storage or transmission, but coding speech signals using QMF banks is a very popular technique that gives efficient results. In practice, a subband coding system can be implemented effectively and efficiently using the so-called perfect reconstruction QMF banks. In QMF banks the filters are designed in such a way that the aliasing that occurs in the analysis section is completely eliminated in the synthesis section. By eliminating aliasing we are able to reconstruct the signal with high accuracy.
3. Multirate signal processing system
The basic theory of multirate digital signal processing is introduced in this section, along with the two sampling rate alteration devices, namely the up-sampler and the down-sampler. In many practical
applications, a signal of a given sampling rate needs to be converted into an equivalent signal with a different sampling rate. For example, in digital audio, three different sampling rates are presently employed: 32 kHz in broadcasting, 44.1 kHz in digital compact disc and 48 kHz in digital audio tape (DAT) and other applications. Conversion of audio signals between these three rates is therefore often necessary. Discrete-time systems with unequal sampling rates at various parts of the system are called multirate systems. In single-rate systems, by contrast, the sampling rates at the input, at the output and at all internal nodes are the same. To achieve different sampling rates at different stages, multirate digital signal processing systems employ the down-sampler and the up-sampler, the basic sampling rate alteration devices, in addition to conventional elements such as the adder, the multiplier and the delay. Many multirate systems employ a bank of filters with either a common input or a summed output. For sampling rate alterations,
the basic sampling rate alteration devices are invariably employed together with low pass digital filters. An up-sampler is a device that increases the sampling rate by an integer factor. The up-sampler is also called a sampling rate expander, or simply an expander. In its block diagram the input is x[n] and the output is xu[n].
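The action of the expander can be sketched in a few lines of Python. This is a minimal illustration rather than code from the paper; the function name `upsample` and the factor name `L` are assumptions made for the example. The expander inserts L-1 zero-valued samples between consecutive input samples, so that xu[n] = x[n/L] when n is a multiple of L and xu[n] = 0 otherwise.

```python
def upsample(x, L):
    """Sampling rate expander: insert L-1 zeros between consecutive samples."""
    xu = [0] * (len(x) * L)
    xu[::L] = x          # original samples land on multiples of L
    return xu

x = [1, 2, 3]
print(upsample(x, 2))    # [1, 0, 2, 0, 3, 0]
```

The inserted zeros create spectral images of the input, which is why an expander is normally followed by a low pass (interpolation) filter.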
A down-sampler is a device that reduces the sampling rate by an integer factor. The down-sampler is also called a sampling rate compressor. In its block diagram the input is x[n] and the output is y[n].
Figure 3.2 Block diagram of downsampler
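A short Python sketch of the compressor follows. The function name `downsample` and the factor name `M` are assumptions made for illustration. The compressor keeps every M-th sample, y[n] = x[nM]. The example also suggests why the device is paired with a low pass filter: an alternating sequence, the highest frequency a discrete-time signal can hold, turns into a constant after down-sampling by 2, which is aliasing.

```python
def downsample(x, M):
    """Sampling rate compressor: keep every M-th sample, y[n] = x[n*M]."""
    return x[::M]

x = [1, -1, 1, -1, 1, -1, 1, -1]   # alternating sequence (frequency pi rad/sample)
y = downsample(x, 2)
print(y)                           # [1, 1, 1, 1] : the component has aliased to DC
```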
4. Subband coding
Sub Band Coding (SBC) is a frequency domain coding technique in which the input signal is decomposed into a number of sub bands so that each of these frequency bands can be encoded separately. This technique was originally proposed by Crochiere, Webber and Flanagan as a means to reduce the effect of quantizing noise due to coding and therefore to improve the quality of speech coding systems. Encoding in sub bands offers several advantages that can be effectively used to achieve noise reduction. In the sub band coding system the input signal, after being sampled at its Nyquist rate, is divided into channels by first being passed through a bank of low pass and high pass filters. The output of each filter is decimated to a rate determined by the number of sub bands, and each of these channel outputs is then encoded separately. At the receiver the signals, after being decoded, are interpolated back to the original sampling rate by a bank of interpolation filters and then summed to reconstruct the input signal. It is important that in subband coding systems the individual channel signals be decimated in such a way that the number of samples coded and transmitted does not exceed the number of samples in the original signal, since this number is necessary and sufficient for the recovery of the original signal. Under this constraint and in the absence of the channel coders, the overall system response indicates the quality of the system. Ideally, the filtering part of the system must be reversible, i.e. the overall system response must be a pure delay so that the input signal can be perfectly reconstructed at the receiver. However, in general reversibility cannot be achieved, and sub band coding systems suffer from three different types of distortion: interband aliasing, phase distortion and amplitude distortion. Clearly the quality of the reconstructed signal can be no better than the quality of the system response. On top of that, the quality of the reconstructed signal degrades further if coders are introduced into the channels. Over the past several years, a number of sub band coding systems have been introduced in an attempt to minimize or remove the three types of distortion mentioned above, as well as to minimize the overall number of computations needed for the implementation of these
systems. The original sub band coding system, presented by Crochiere, Webber and Flanagan, used finite impulse response (FIR) filters, and the overall response of the system suffered from aliasing and amplitude distortion as well as distortion due to coding. In a later work presented by Crochiere, infinite impulse response (IIR) elliptic filters were used. These filters introduce, to some degree, phase distortion as well. Croisier, Esteban and Galand managed to remove the interband aliasing by introducing the concept of quadrature mirror filters (QMF) to realize a two-band splitting analysis/reconstruction system. The input signal could be divided into more sub bands by using this two-band splitting system in a tree structure. It was also shown that if equal-length, linear phase FIR quadrature mirror filters are used, phase distortion is also eliminated, leaving only the amplitude distortion. The amplitude distortion cannot be removed by using linear phase FIR QMF sub band splitting, except for the trivial case in which the resulting filters
have no frequency selectivity. Johnston, however, used an iterative approach to design a number of linear phase FIR filters that produce minimum amplitude distortion in the overall system response.
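The alias cancellation idea can be checked numerically with a small Python sketch. Everything named here is an assumption made for illustration: the helper names, the two-tap averaging filter used as the low pass prototype (a trivial case with very poor frequency selectivity), and the synthesis filters F0(z) = 2 H0(z), F1(z) = -2 H1(z), a common textbook choice that cancels aliasing when H1(z) = H0(-z). It is not the filter design used in this paper.

```python
def convolve(x, h):
    """Direct-form linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def expand_by_2(v):
    """Up-sample by 2: place a zero after every sample."""
    u = [0.0] * (2 * len(v))
    u[::2] = v
    return u

def qmf_two_channel(x, h0):
    """Two-band split and reconstruction with mirror-image filters."""
    h1 = [c if n % 2 == 0 else -c for n, c in enumerate(h0)]  # H1(z) = H0(-z)
    f0 = [2.0 * c for c in h0]                                # F0(z) = 2 H0(z)
    f1 = [-2.0 * c for c in h1]                               # F1(z) = -2 H1(z)
    v0 = convolve(x, h0)[::2]                                 # low band, decimated by 2
    v1 = convolve(x, h1)[::2]                                 # high band, decimated by 2
    y0 = convolve(expand_by_2(v0), f0)
    y1 = convolve(expand_by_2(v1), f1)
    return [a + b for a, b in zip(y0, y1)]

x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
h0 = [0.5, 0.5]                      # two-tap prototype, N = 2
y = qmf_two_channel(x, h0)
delay = len(h0) - 1                  # expected delay of N - 1 samples
print(y[delay:delay + len(x)] == x)  # True
```

With this prototype the output is the input delayed by one sample. A longer, more selective linear phase prototype keeps the alias cancellation but leaves some amplitude distortion, which is what the iterative designs mentioned above aim to minimize.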
5. Two channel QMF bank
In many applications, a discrete-time signal x[n] is first split into a number of sub band signals by means of an analysis filter bank; the sub band signals are then processed and finally combined by a synthesis filter bank, resulting in an output signal y[n]. If the sub band signals are band limited to frequency ranges much smaller than that of the original input signal, they can be down-sampled before processing. Because of the lower sampling rate, the processing of the down-sampled signals can be carried out efficiently. After processing, these signals are up-sampled before being combined by the synthesis bank into a higher-rate signal. The combined structure is called a quadrature mirror filter (QMF) bank. If the down-sampling and up-sampling factors are equal to or greater than the number of bands of the filter bank, then the output y[n] can be made to retain some or all of the characteristics of the input x[n] by properly choosing the filters in the structure.
Block diagram of 2-channel QMF bank
The two channel QMF bank is a multirate digital filter structure that employs two down-samplers in the signal analysis section and two up-samplers in the signal synthesis section. The input signal x[n] is first passed through a two-band analysis filter bank containing the low pass and high pass filters with transfer functions H0(z) and H1(z). Their corresponding impulse responses are h0(n) and h1(n) respectively, with a cutoff frequency at π/2, as shown in the figure.
Figure 5.2 Frequency response characteristics of the QMF bank
The sub-band signals v0(n) and v1(n) are then down-sampled by a factor of 2. Each down-sampled sub-band signal is encoded by exploiting the special spectral properties of the signal, such as energy levels and perceptual importance. The coded sub-band signals are combined into one sequence by multiplexing and either stored for later retrieval or transmitted. At the receiving end, the coded sub-band signals are first recovered by demultiplexing, and decoders are used to produce approximations of the original down-sampled signals. The decoded signals are then up-sampled by a factor of 2 and passed through the synthesis filter bank, composed of the low pass and high pass filters with transfer functions F0(z) and F1(z), whose outputs are added to yield y[n]. It follows from the figure that the sampling rates of the input signal x[n] and output signal y[n] are the same. The analysis and synthesis filters in the QMF bank are chosen so as to ensure that the reconstructed output y[n] is a reasonable replica of the input x[n]. Moreover, they are also designed to provide good frequency selectivity in order to ensure that the sum of the power of the sub-band signals is reasonably close to the input signal power. In practice, various errors are generated in this scheme. In
addition to the coding error and errors caused by transmission through the channel, the QMF bank itself introduces several errors due to the sampling rate alterations and imperfect filters. We ignore the coding and channel errors, and investigate only the errors generated by the down-samplers and up-samplers in the filter bank and their effects on the performance of the system. In this work we designed an FIR filter for implementing the quadrature mirror filter bank with the minimum possible error.
6. Results and Conclusion
We have successfully implemented the sub-band coding system by designing an optimum four channel QMF bank. The frequency response characteristics of the LPF and HPF used in the QMF bank are given in fig 6.1. From these characteristics it is seen that the response of the QMF filter closely approaches the ideal all-pass filter
characteristic, which results in perfect reconstruction of the input speech signal.
6.1 Input speech signal and its specifications
The speech signal on which sub-band coding is to be performed is given as an input to the QMF bank discussed in the previous sections. For this we recorded a speech signal using the Sound Recorder tool available in Windows, with the following specifications:
Format: PCM
Attributes: 8 kHz, 8 bit, Mono
The recorded speech signal is of two seconds duration with a length of 21600 samples. The speech signal is sampled at 8 kHz and coded with 8 bits per sample. The input speech waveforms and the reconstructed output speech waveforms are shown below. From these waveforms we can observe that there is a delay of 31 samples between the input and output speech, which is equal to N-1 (where N=32 is the length of the filter). The data rate reduction depends upon the number of bits allocated to the low-pass and high-pass sections.
6.2 Conclusion
The intention of this work is to design and implement a subband coding system. We have successfully designed an optimum low pass filter for a four channel QMF bank to minimize the amplitude distortion. From this low pass filter we have designed a high pass filter. Using these filters we have successfully simulated a two channel
QMF bank for sub-band coding of the input speech signal. The result shows that the output is a perfect reconstruction of the input speech signal.
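The reported delay of N-1 = 31 samples and the dependence of the data rate on the bit allocation can be illustrated with the following Python sketch. It rests on several assumptions that are not taken from the paper: the 32-tap prototype is a plain Hamming-windowed sinc filter standing in for the optimized low pass filter (so it leaves noticeable amplitude distortion around π/2), the synthesis filters are F0(z) = 2 H0(z) and F1(z) = -2 H1(z), all function names are hypothetical, and the allocation of 6 bits to the low band and 2 bits to the high band is an arbitrary example.

```python
import math

def lowpass_fir(num_taps=32, cutoff=0.5):
    """Hamming-windowed sinc low pass; cutoff is a fraction of Nyquist (0.5 -> pi/2)."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = cutoff if t == 0 else math.sin(math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def convolve(x, h):
    """Direct-form linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def qmf_bank(x, h0):
    """Two channel analysis/synthesis bank without coding of the sub bands."""
    h1 = [c if n % 2 == 0 else -c for n, c in enumerate(h0)]   # H1(z) = H0(-z)
    f0 = [2.0 * c for c in h0]                                 # assumed F0(z) = 2 H0(z)
    f1 = [-2.0 * c for c in h1]                                # assumed F1(z) = -2 H1(z)
    branches = []
    for h, f in ((h0, f0), (h1, f1)):
        v = convolve(x, h)[::2]              # filter, then down-sample by 2
        u = [0.0] * (2 * len(v))
        u[::2] = v                           # up-sample by 2
        branches.append(convolve(u, f))      # synthesis filter
    return [a + b for a, b in zip(*branches)]

def subband_bit_rate(fs_hz, bits_low, bits_high):
    """Bit rate of a two-band coder in which each band runs at fs/2."""
    return (fs_hz // 2) * (bits_low + bits_high)

h0 = lowpass_fir(32)
impulse = [1.0] + [0.0] * 63
response = qmf_bank(impulse, h0)
delay = max(range(len(response)), key=lambda n: abs(response[n]))
print(delay)                           # 31, i.e. N - 1 for N = 32
print(subband_bit_rate(8000, 8, 8))    # 64000 bit/s, same as 8 kHz x 8 bit PCM
print(subband_bit_rate(8000, 6, 2))    # 32000 bit/s for the hypothetical 6/2 split
```

In this sketch the overall impulse response peaks at sample 31, matching the N-1 delay quoted above, and the hypothetical 6/2 bit split halves the 64 kbit/s rate of the original 8 kHz, 8 bit recording.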