Lowpass and Bandpass Filter On Speech Signal Using Matlab Tools-Tutorial
Question 1:
Capture your voice with the help of a microphone using Matlab.
- Plot the captured voice in the frequency domain.
- Now apply a filter to block the highest 1 kHz of frequencies.
- Play this clip and show the new signal in the frequency domain.
- Keep repeating the above two steps until you have filtered out the frequencies above 1 kHz.
At the end, block the frequencies other than 300-3400 Hz from the original signal and play the resulting signal.
Note: the voice sample should be a short one, like Allah-o-Akbar.
I have used the built-in Matlab tools to filter the data and plot it, instead of designing a filter by hand and applying it to the speech signal, which takes more effort and more time to produce results. I used audiorecorder for capturing the voice. The benefit of audiorecorder is that we can record a sound of any length (3, 5, or 10 seconds, or anything in between), and later get the length of the recorded speech if required, for plotting etc., by using the command y=length(x)
Step by step procedure:
1. Recording and playing of voice
obj=audiorecorder
Creates an audiorecorder object, which uses a sampling frequency Fs of 8000 Hz by default. You can also use higher sampling frequencies with the syntax:
obj=audiorecorder(Fs,nbits,nchannel)
Valid nbits values on Windows are 8, 16, and 24 bits. The number of channels can be 1 (mono) or 2 (stereo).
record(obj)
stop(obj)
record(obj) starts the recording and stop(obj) stops it. You can also define the recording length in seconds before recording, in which case the stop() command is not needed. For this, use the command recordblocking():
recordblocking(obj, length)
speech=getaudiodata(obj)
play(obj)
getaudiodata stores the captured voice in the vector speech, and play(obj) plays the recorded voice clip. To play it you can also use sound(speech,Fs) or
wavplay(speech,Fs)
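The recording commands above can be combined into one short script. This is only a minimal sketch: the 16-bit depth, the 3-second duration, and the fallback test tone (used when no microphone is found) are illustrative assumptions, not values from the original exercise.

```matlab
% Minimal recording sketch; nbits, duration and the fallback tone are assumed values.
Fs       = 8000;   % sampling frequency in Hz (the audiorecorder default)
nbits    = 16;     % bits per sample
nchannel = 1;      % mono
duration = 3;      % recording length in seconds

if audiodevinfo(1) > 0                      % at least one input device is present
    obj = audiorecorder(Fs, nbits, nchannel);
    disp('Start speaking.');
    recordblocking(obj, duration);          % returns after 'duration' seconds
    disp('Recording finished.');
    speech = getaudiodata(obj);             % column vector of double samples
else
    % No microphone available: use a 440 Hz test tone as a stand-in.
    t = (0:Fs*duration-1)' / Fs;
    speech = 0.5 * sin(2*pi*440*t);
end

if audiodevinfo(0) > 0                      % at least one output device is present
    sound(speech, Fs);                      % play back at the recording rate
end
y = length(speech);                         % number of recorded samples
```

Passing Fs to sound keeps the playback rate equal to the recording rate; without it, sound falls back to its own default rate.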
wvtool(speech)
Right-click in the window, go to Analysis Parameters, change the frequency units to Hz, set the sampling frequency to 8000, and then press OK.
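If a scripted plot is preferred over the wvtool window, the magnitude spectrum can also be computed directly with fft. The sketch below is an alternative to the GUI step, not the method used in this write-up; the 1 kHz sine is only a stand-in, so replace that line with your recorded speech vector.

```matlab
% Frequency-domain plot with fft; the test tone stands in for recorded speech.
Fs = 8000;                              % sampling frequency in Hz
t  = (0:Fs-1)' / Fs;                    % one second of samples
speech = sin(2*pi*1000*t);              % stand-in signal (assumption)

N     = length(speech);
X     = fft(speech);                    % complex spectrum
half  = 1:floor(N/2)+1;                 % keep 0 Hz up to Fs/2 only
f     = (half-1)' * Fs / N;             % frequency axis in Hz
magdB = 20*log10(abs(X(half))/N + eps); % magnitude in dB (eps avoids log of 0)

plot(f, magdB);
xlabel('Frequency (Hz)');
ylabel('Magnitude (dB)');
grid on;

[~, k] = max(magdB);
peakHz = f(k);                          % frequency of the strongest component
```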
sptool
Import the speech signal from the workspace, set the sampling frequency to 8000, and name the signal speech. Then click OK.
Now press the New button under Filters in the sptool window; the fdatool window will appear.
In Response Type select Lowpass, and in Design Method select FIR - Equiripple. Set Fs, Fpass, and Fstop as shown in the figure, and then press the Design Filter button. Your first lowpass filter is now ready. Close fdatool; the new filter now appears in the Filters list of sptool. Select that filter and your speech signal, and press the Apply button. Then name the output signal and press OK.
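The same kind of equiripple lowpass filter can be designed from the command line with firpmord and firpm (Signal Processing Toolbox). This is a hedged sketch, not the exact filter from the figure: the band edges (Fpass = 2800 Hz, Fstop = 3000 Hz) and the ripple targets are assumptions chosen so that the top 1 kHz of the 0-4000 Hz range is blocked, and the two-tone signal stands in for the recorded speech.

```matlab
% Equiripple FIR lowpass from the command line; band edges and ripples are assumed.
Fs    = 8000;          % sampling frequency in Hz
Fpass = 2800;          % passband edge in Hz (assumed)
Fstop = 3000;          % stopband edge in Hz (assumed)
Apass = 1;             % passband ripple in dB (assumed)
Astop = 60;            % stopband attenuation in dB (assumed)

% Convert the dB figures to the linear deviations firpmord expects.
devPass = (10^(Apass/20) - 1) / (10^(Apass/20) + 1);
devStop = 10^(-Astop/20);

[n, fo, ao, w] = firpmord([Fpass Fstop], [1 0], [devPass devStop], Fs);
b = firpm(n, fo, ao, w);               % filter coefficients

% Stand-in signal: one tone inside the passband, one inside the stopband.
t = (0:Fs-1)' / Fs;
speech    = sin(2*pi*500*t) + sin(2*pi*3500*t);
newspeech = filter(b, 1, speech);      % apply the FIR filter

% Check the gain at the two tone frequencies.
H = abs(freqz(b, 1, [500 3500], Fs));
fprintf('Gain at 500 Hz: %.3f, at 3500 Hz: %.5f\n', H(1), H(2));
```

firpmord only estimates the order, so if the design misses the ripple targets, increasing n by a few taps before calling firpm usually helps.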
Now, in order to plot or play this new voice signal, you have to export it to the workspace. Go to File and click Export. Select the newspeech signal and export it to the workspace. The filter has now been applied to block the highest 1 kHz of frequencies.
4. Play the filtered clip and plot
sound(newspeech1.data)
wvtool(newspeech1.data)
sound plays the filtered clip contained in the newspeech1 data exported from sptool, and wvtool plots the filtered speech signal.
5. Now repeat the last two steps until all the frequencies above 1 kHz have been filtered out: go to sptool again and edit the filter you designed previously. In fdatool, set Fpass and Fstop 1 kHz lower than before in each repetition. Then apply the filter, export the result to the workspace, and use wvtool to plot it. We get these results:
Conclusion
Each time more of the higher frequencies are removed, the voice sounds heavier and more muffled. In a sense little information is lost, because the words can still be understood; what changes is the perceived tone of the voice. However, some of the thin, high-pitched syllables fade, and the high-frequency background noise is removed as well.
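The repeated filtering of step 5 can also be sketched as a loop that lowers the cutoff by 1 kHz on each pass. The 200 Hz transition width, the ripple targets, and the four-tone stand-in signal are assumptions made for illustration; with real data, use the recorded speech vector instead.

```matlab
% Repeated lowpass filtering with the cutoff lowered by 1 kHz per pass (sketch).
Fs         = 8000;                    % sampling frequency in Hz
stopEdges  = [3000 2000 1000];        % stopband edge for each repetition, in Hz
transition = 200;                     % transition width in Hz (assumed)
dev        = [0.0575 0.001];          % passband / stopband deviations (assumed)

% Stand-in signal with one tone in each 1 kHz band.
probeHz = [500 1500 2500 3500];
t = (0:Fs-1)' / Fs;
speech = sum(sin(2*pi*t*probeHz), 2);

gain     = zeros(numel(stopEdges), numel(probeHz));
filtered = cell(1, numel(stopEdges));
for k = 1:numel(stopEdges)
    Fstop = stopEdges(k);
    Fpass = Fstop - transition;
    [n, fo, ao, w] = firpmord([Fpass Fstop], [1 0], dev, Fs);
    b = firpm(n, fo, ao, w);
    filtered{k} = filter(b, 1, speech);   % always filter the original signal
    h = freqz(b, 1, probeHz, Fs);
    gain(k, :) = abs(h(:)).';             % gain at each probe tone
    % To listen or plot: sound(filtered{k}, Fs); wvtool(filtered{k});
end
disp(gain);   % rows: repetitions, columns: gain at 500, 1500, 2500, 3500 Hz
```

Each row of gain shows one more band dropping out, which matches the progressively heavier sound described above.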
6. At the end, block the frequencies other than 300-3400 Hz: in sptool, via fdatool, design a bandpass filter this time, using the same procedure as above. Then apply this filter to the original speech signal and again export the output signal to the workspace. Listen to it using sound(xyz.data), and plot it using wvtool(xyz.data). The plot is shown below:
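As a command-line counterpart to the bandpass design in step 6, the sketch below builds a 300-3400 Hz equiripple bandpass filter with firpmord and firpm. The stopband edges (200 Hz and 3500 Hz) and ripple targets are assumptions, since the fdatool settings are not given in the text, and the three-tone signal is a stand-in for the recorded speech.

```matlab
% Equiripple FIR bandpass for 300-3400 Hz; stopband edges and ripples are assumed.
Fs    = 8000;                          % sampling frequency in Hz
edges = [200 300 3400 3500];           % stop, pass, pass, stop edges in Hz
amps  = [0 1 0];                       % desired gain: stop, pass, stop
dev   = [0.001 0.0575 0.001];          % allowed deviation per band (assumed)

[n, fo, ao, w] = firpmord(edges, amps, dev, Fs);
b = firpm(n, fo, ao, w);

% Stand-in signal: tones below, inside, and above the passband.
t = (0:Fs-1)' / Fs;
speech     = sin(2*pi*100*t) + sin(2*pi*1000*t) + sin(2*pi*3800*t);
bandspeech = filter(b, 1, speech);

% Gain at the three tone frequencies.
H = abs(freqz(b, 1, [100 1000 3800], Fs));
fprintf('Gain at 100 Hz: %.4f, 1000 Hz: %.3f, 3800 Hz: %.4f\n', H(1), H(2), H(3));
```

The narrow 100 Hz transition bands make this filter considerably longer than the lowpass ones; widening them shortens the filter at the cost of a softer band edge.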
Question 2: Study G.722, and explain in your own words in one page.
G.722 is an International Telecommunication Union (ITU-T) standard for audio (speech) compression and decompression that is used in digital transmission systems, in particular for the coding of analog signals into digital signals. It is a codec that uses sub-band adaptive differential pulse code modulation (SB-ADPCM) within a bit rate of 64 kbit/s, and the system is referred to as 64 kbit/s (7 kHz) audio coding. SB-ADPCM splits the frequency band into two sub-bands (higher and lower), and the signal in each sub-band is encoded using ADPCM.
Extensions
- Operates at 64, 56, or 48 kbit/s.
- Converts audio signals to uniform digital signals, which are coded using 14 bits with 16 kHz sampling.
- Utilizes an SB-ADPCM decoder, which performs the reverse operation to the encoder.
- User-selectable processing frame size and I/O formats.
- Capable of in-band synchronization.
- Complies with the ITU-T H.320 standard for video conferencing.
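The three bit rates follow from simple arithmetic on the two sub-bands. The short sketch below works it out; note that the per-sample bit allocations (6, 5, or 4 bits for the lower sub-band and 2 bits for the higher one) are taken from the G.722 Recommendation and are not stated in the text above.

```matlab
% G.722 bit-rate arithmetic; bit allocations come from the Recommendation itself.
inputRate   = 16000;        % input sampling rate in Hz
inputBits   = 14;           % bits per uniform input sample
subbandRate = inputRate/2;  % each sub-band is sampled at 8000 Hz after the split

upperBits = 2;              % bits per sample, higher sub-band
lowerBits = [6 5 4];        % bits per sample, lower sub-band, for the three modes

bitRate = subbandRate * (lowerBits + upperBits);   % bit/s for each mode
rawRate = inputRate * inputBits;                   % uncoded rate in bit/s

fprintf('Mode bit rates: %d, %d, %d bit/s\n', bitRate);
fprintf('Uncoded rate: %d bit/s, ratio at 64 kbit/s: %.1f\n', ...
        rawRate, rawRate / bitRate(1));
```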