2006 Speech/Audio Signal Processing in MATLAB/Simulink

Speech/Audio Signal Processing in MATLAB/Simulink
J.-S. Roger Jang (張智星)
CS Dept, Tsing-Hua Univ, Taiwan
http://www.cs.nthu.edu.tw/~jang
Outline
• Wave file manipulation: reading, writing, recording ...
• Time-domain processing: delay, filtering, sptool ...
• Frequency-domain processing: spectrogram
• Pitch determination: auto-correlation, SIFT, AMDF, HPS ...
• Others: formant estimation, speech coding


Toolbox/Blockset Used
MATLAB
Simulink
Signal Processing Toolbox
DSP Blockset


MATLAB Primer
Before you start, you need to get familiar with MATLAB. Please read “MATLAB Primer” at the following page:
http://neural.cs.nthu.edu.tw/jang/demo/demoDownload.asp

Exercise:
1. Please plot two curves y=sin(2*t) and y=cos(3*t) in the same figure.
2. Please plot x vs. y where x=sin(2*t) and y=cos(3*t).


To Read a Wave File


To read an MS .wav file (PCM format only): wavread
y = wavread(file)
[y, fs, nbits] = wavread(file)
[y, fs, nbits, opts] = wavread(file)
[…] = wavread(file, n)
[…] = wavread(file, [n1, n2])
If the wav file is stereo, y will be a two-column matrix.
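
In recent MATLAB releases, wavread and wavwrite have been replaced by audioread and audiowrite. The following is a minimal sketch, assuming a release that provides these functions. It writes a short stereo test tone to a temporary file and reads it back, including a sample range. The sampling rate, tone frequencies, and file name are illustrative assumptions, not values from the slides.

```matlab
% Sketch only: assumes a MATLAB release that provides audioread/audiowrite.
fs = 8000;                                       % sampling rate in Hz (assumed value)
t = (0:fs-1)'/fs;                                % one second of time stamps
yOut = 0.5*[sin(2*pi*440*t), sin(2*pi*660*t)];   % stereo: one column per channel
file = [tempname, '.wav'];                       % hypothetical temporary file name
audiowrite(file, yOut, fs);                      % 16-bit PCM by default for .wav

[y, fsRead] = audioread(file);                   % counterpart of [y, fs] = wavread(file)
yPart = audioread(file, [101, 200]);             % counterpart of wavread(file, [n1, n2])
info = audioinfo(file);                          % bit depth is reported here
delete(file);                                    % clean up the temporary file
fprintf('%d samples, %d channels, %d Hz, %d bits\n', ...
        size(y,1), size(y,2), fsRead, info.BitsPerSample);
```

As with wavread, a stereo file comes back as a two-column matrix.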


To Read a Wav File


Example (wavRead01.m):
[y, fs] = wavread('singapore.wav');
plot((1:length(y))/fs, y);
xlabel('Time in seconds');
ylabel('Amplitude');

Exercise:
1. Plot the waveform of “rrrrr.wav”. Use MATLAB’s “zoom” button to find where the consecutive curled “R” sounds occur.
2. Plot the two-channel waveform in “flanger.wav”.


Solution to the Previous Exercise


wavRead02.m:
[y, fs] = wavread('flanger.wav');       % y has one column per channel
t = (1:size(y,1))/fs;                   % time axis in seconds
subplot(2,1,1), plot(t, y(:,1));        % first channel
subplot(2,1,2), plot(t, y(:,2));        % second channel


To Play Wav Files


To play sound using the Windows audio output device: wavplay, sound, soundsc
wavplay(y, fs)
wavplay(y, fs, 'async'): non-blocking call
wavplay(y, fs, 'sync'): blocking call
sound(y, fs)
soundsc(…): autoscale the sound
Example (wavPlay01.m):
[y, fs] = wavread('rrrrr.wav');
wavplay(y, fs);
Exercise :
Follow the example to play “flanger.wav”.


To Read/Play Using DSP Blocks


To read/play sound using DSP Blockset:
DSP Blockset/DSP Sources/From Wave File
DSP Blockset/DSP Sinks/To Wave Device
Example:

Frame-based operation!
Exercise:
Create a model as shown above.


Solution
Solution to the previous exercise:
slWavFilePlay01.mdl


To Write a Wave File


To write MS wave files: wavwrite
wavwrite(y, fs, nbits, wavefile)
“nbits” must be 8 or 16.
“y” must have two columns for stereo data.
Amplitude values outside [-1,1] are clipped.
Example (wavWrite01.m):
[y, fs] = wavread('rrrrr.wav');
wavwrite(y, fs*1.2, 8, 'testout.wav');   % 8-bit output; plays back 1.2 times faster
!start testout.wav
Exercise :
Try out the above example.


To Record a Wave File


To record wave files:
1. Use the recording utility under WinXP.
2. Use “wavrecord” under MATLAB.
3. Use “From Wave Device” under Simulink, under “DSP Blockset/Platform Specific IO/Windows (Win32)”
Example :
1. Go ahead and try WinXP recording utility!
2. Try “wavRecord01.m”
3. Try “slWavFileRecord01.mdl”
Exercise:
Try out the above examples.


Time-Domain Speech Signals


A typical time-domain plot of speech signals:

Amplitude: volume or intensity


Frequency: pitch


Changing Wave Playback Param.


To control the play of a sound:
• Normal: wavplay(y, fs)
• High volume: wavplay(2*y, fs)
• Low volume: wavplay(0.5*y, fs)
• High pitch (and faster): wavplay(y, 1.2*fs)
• Low pitch (and slower): wavplay(y, 0.8*fs)
Exercise:
• Try “wavPlay01.m” and trace the code.
• Create “wavPlay02.m” such that you can record your own voice on the fly.
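
The effect of changing the playback rate can be checked with simple arithmetic. This sketch assumes an illustrative sampling rate and test tone, neither of which comes from the slides. Playing N samples at a rate of r*fs divides the duration by r and multiplies every frequency by r, which is why pitch and speed change together.

```matlab
% Sketch: effect of playing the same samples at r*fs instead of fs.
fs = 8000;                     % original sampling rate in Hz (assumed)
f0 = 200;                      % frequency of a test tone in Hz (assumed)
N  = 2*fs;                     % two seconds of samples at the original rate
rates = [1.2, 0.8];            % the two factors used in the list above
for r = rates
    duration  = N/(r*fs);      % seconds of playback at the new rate
    perceived = f0*r;          % the tone is heard at this frequency
    fprintf('r = %.1f: %.2f s, %.0f Hz\n', r, duration, perceived);
end
```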


Time-Domain Signal Processing


Take-home exercise:
How can you raise the pitch while keeping the same time span?


Synthetic Sounds
Use a sine wave generator (under DSP Blockset) to produce sounds

Single frequency:

Multiple frequencies:

Amplitude modulation:

Exercise:
Create the above models.
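
The same three signal types can also be generated in plain MATLAB. This is a sketch only, not the Simulink models named on the solution slide. All frequencies and the modulation rate are assumed values.

```matlab
% Sketch of the three signal types in plain MATLAB (all frequencies assumed).
fs = 8000;                                   % sampling rate in Hz
t  = (0:fs-1)'/fs;                           % one second of time stamps
single   = sin(2*pi*440*t);                  % single frequency
multiple = (sin(2*pi*440*t) + sin(2*pi*554*t) + sin(2*pi*659*t))/3;  % sum of sines
carrier  = sin(2*pi*440*t);                  % tone to be modulated
envelope = 0.5*(1 + sin(2*pi*4*t));          % 4 Hz modulator with values in [0, 1]
am       = envelope .* carrier;              % amplitude modulation
% To listen: sound(single, fs); sound(multiple, fs); sound(am, fs);
```

Dividing the sum of sines by the number of components keeps the result inside [-1, 1], so it is not clipped on playback or when written to a file.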


Solution
Solution to the previous exercise:
sineSource01
sineSource02
sineSource03


Delay in Speech/Audio
What is a delay in a signal?
y(n) --> y(n-k)
What effects can delay generate?
Echo
Reverberation
Chorus
Flanging


Single Delay in Audio Signal


Block diagram:
Input u(n) -> delay z^(-k) -> gain a -> added to u(n) -> Output y(n)
y(n) = u(n) + a*u(n-k)

Simulink model:

Exercise:
Create the above model.
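
The single-delay equation above can be sketched in MATLAB as an FIR filter. The sampling rate, delay, and gain are assumed values. A unit impulse is used as input so that the echo is easy to see in the output.

```matlab
% Sketch: single echo y(n) = u(n) + a*u(n-k) implemented with filter().
fs = 8000;                      % sampling rate in Hz (assumed)
k  = round(0.25*fs);            % delay in samples, here 0.25 s (assumed)
a  = 0.6;                       % echo gain (assumed)
u  = [1; zeros(fs-1, 1)];       % unit impulse as a test input
b  = [1, zeros(1, k-1), a];     % coefficient 1 at lag 0 and a at lag k
y  = filter(b, 1, u);           % y(n) = u(n) + a*u(n-k)
```

With a recorded signal in place of the impulse, the result is the original followed by one attenuated copy k/fs seconds later.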


Multiple Delay in Audio Signal


How to create “karaoke” effects:
Block diagram: Input u(n) -> adder -> Output y(n), with the output fed back to the adder through delay z^(-k) and gain a
y(n) = u(n) + a*u(n-k) + a^2*u(n-2k) + a^3*u(n-3k) + ...


Simulink model:


Multiple Delay in Audio Signal


Parameter values:
• Feedback gain a < 1
• Actual delay time = k/fs
Exercise:
• Create the above model and change some parameters to see their effects.
• Modify the model to take microphone input (so you can start singing karaoke now!)
• Use a “configurable subsystem” to include all possible input files and the microphone. (See next page.)
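
The multiple-delay series y(n) = u(n) + a*u(n-k) + a^2*u(n-2k) + ... is equivalent to the recursion y(n) = u(n) + a*y(n-k), which is a feedback filter. A minimal MATLAB sketch follows, with assumed values for the sampling rate, delay, and feedback gain.

```matlab
% Sketch: repeated echoes via the recursion y(n) = u(n) + a*y(n-k).
fs = 8000;                         % sampling rate in Hz (assumed)
k  = round(0.1*fs);                % delay in samples (assumed: 0.1 s)
a  = 0.5;                          % feedback gain, must satisfy |a| < 1
u  = [1; zeros(fs-1, 1)];          % unit impulse as a test input
den = [1, zeros(1, k-1), -a];      % denominator 1 - a*z^(-k)
y  = filter(1, den, u);            % echoes of size a, a^2, a^3, ... every k samples
delaySeconds = k/fs;               % actual delay time between echoes
```

A gain of 1 or more makes each echo at least as loud as the previous one, so the output never dies away. This is why the feedback gain must stay below 1.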


Multiple Delay in Audio Signal


How to use “configurable subsystem” block?
1. Create a library (say, wavinput.mdl)

2. Get a block of “configurable subsystem”


3. Fill the dialog box with the library name


Audio Flanging
Flanging sound:
• A sound similar to the sound of a jet plane flying overhead, or a "whooshing" sound
• “Pitch modulation” due to a variable delay
Simulink demo:
• dspafxf.mdl (all platforms)
• dspafxf_nt.mdl (for 95/98/NT)
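
A variable delay can be sketched in plain MATLAB as follows. This is not the dspafxf demo, only an illustration under assumed parameters (maximum delay, sweep rate, mix gain, and test tone). The delay is swept by a slow sine wave and the delayed sample is read with linear interpolation.

```matlab
% Sketch: flanger y(n) = x(n) + g*x(n - d(n)) with a slowly varying delay d(n).
fs = 8000;                                    % sampling rate in Hz (assumed)
t  = (0:2*fs-1)'/fs;                          % two seconds of time stamps
x  = sin(2*pi*440*t);                         % test input (assumed)
maxDelay = round(0.003*fs);                   % 3 ms maximum delay in samples (assumed)
lfoRate  = 0.5;                               % sweep rate in Hz (assumed)
g        = 0.7;                               % mix gain of the delayed path (assumed)
d = 0.5*maxDelay*(1 + sin(2*pi*lfoRate*t));   % delay in samples, from 0 to maxDelay
y = x;                                        % the first samples pass through unchanged
for n = maxDelay+2 : numel(x)
    pos  = n - d(n);                          % fractional read position
    i0   = floor(pos);                        % sample just before the read position
    i1   = min(i0 + 1, numel(x));             % sample just after, kept in range
    frac = pos - i0;                          % interpolation weight
    delayed = (1 - frac)*x(i0) + frac*x(i1);  % linear interpolation
    y(n) = x(n) + g*delayed;                  % mix direct and delayed signal
end
```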


Audio Flanging
Simulink model:

Original spectrogram: Modified spectrogram:


Signal Processing Using sptool


To invoke sptool, type “sptool”.


Speech Production
How is speech produced?
Speech is produced when air is forced from the
lungs through the vocal cords (glottis) and along
the vocal tract.
Analogy to System Theory:
Input: air forced into the vocal cords
Output: vibration of the medium (air)
System (or filter): vocal tract
Pitch frequency: frequency of the input
Formant frequency: resonant frequency


Source Filter Model of Speech


The source-filter model of speech production:
Speech is split into a rapidly varying excitation
signal and a slowly varying filter. The envelope of
the power spectra contains the vocal tract
information.

Two important characteristics of the model are the fundamental (pitch) frequency (f0) and the formants (F1, F2, F3, …).
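
The source-filter idea can be illustrated with a toy example: an impulse train at the pitch frequency (the source) is passed through a two-pole resonator (the filter) with one formant. The values of f0, F1, and the bandwidth are assumptions chosen for illustration, and real speech has several formants and a more complex excitation.

```matlab
% Sketch: impulse-train excitation through a one-formant resonator.
fs = 8000;  f0 = 100;                     % sampling rate and pitch in Hz (assumed)
F1 = 700;   bw = 100;                     % formant frequency and bandwidth in Hz (assumed)
N  = fs/2;                                % half a second of samples
excitation = zeros(N, 1);
excitation(1:fs/f0:end) = 1;              % one pulse per pitch period
r     = exp(-pi*bw/fs);                   % pole radius set by the bandwidth
theta = 2*pi*F1/fs;                       % pole angle set by the formant frequency
a     = [1, -2*r*cos(theta), r^2];        % resonator denominator
speechLike = filter(1, a, excitation);    % slowly varying filter applied to the source
spec = abs(fft(speechLike));              % magnitude spectrum
[~, peakBin] = max(spec(1:N/2));          % strongest component below fs/2
peakHz = (peakBin - 1)*fs/N;              % lies at the harmonic of f0 nearest F1
```

The spectrum shows harmonics spaced f0 apart (the source), and their envelope peaks near F1 (the filter).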

Frame Analysis of Speech Signal

Speech waveform, zoomed in to show individual frames and the overlap between adjacent frames.


Spectrogram
Spectrogram (specgram.m) displays short-time frequency content:

Wave form :

Spectrogram :
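
To make the frame-based computation explicit, here is a sketch that builds a spectrogram by hand with fft instead of calling specgram. The frame size, overlap, window, and the 1 kHz test tone are all assumptions. The signal is cut into overlapping frames, each frame is windowed, and the FFT magnitude of each frame becomes one column.

```matlab
% Sketch: short-time spectrum computed frame by frame (no toolbox functions).
fs = 8000;                                   % sampling rate in Hz (assumed)
t  = (0:fs-1)'/fs;                           % one second of time stamps
x  = sin(2*pi*1000*t);                       % 1 kHz test tone (assumed)
frameSize = 256;  overlap = 128;             % frame length and overlap in samples (assumed)
hop = frameSize - overlap;                   % shift between consecutive frames
win = 0.54 - 0.46*cos(2*pi*(0:frameSize-1)'/(frameSize-1));   % Hamming window
nFrames = floor((numel(x) - frameSize)/hop) + 1;
S = zeros(frameSize/2 + 1, nFrames);         % one column per frame
for m = 1:nFrames
    seg = x((m-1)*hop + (1:frameSize)');     % m-th frame
    X   = fft(seg .* win);                   % spectrum of the windowed frame
    S(:, m) = abs(X(1:frameSize/2 + 1));     % keep bins from 0 to fs/2
end
freq      = (0:frameSize/2)'*fs/frameSize;             % frequency of each row in Hz
frameTime = ((0:nFrames-1)*hop + frameSize/2)/fs;      % centre time of each frame in s
[~, peakBin] = max(S(:, 1));                 % strongest bin of the first frame
% To display: imagesc(frameTime, freq, 20*log10(S + eps)); axis xy;
```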


Real-time Spectrogram
Try “dspstfft_win32”:

Spectrum: Spectrogram:


Pitch and Formants


Pitch and formants can be defined visually:
Pitch period = 1/f0
First formant: F1
Second formant: F2


Spectrogram Reading
• http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html

Waveform:

Spectrogram:

Example word: “compute”

Pitch Determination Algorithms


Time-domain:
• Auto-correlation
• AMDF (Average Magnitude Difference Function)
• Gold-Rabiner algorithm (1969)
Frequency-domain:
• Cepstrum (Noll 1964)
• Harmonic product spectrum (Schroeder 1968)
Others:
• SIFT (Simplified inverse filter tracking)
• Maximum likelihood
• Neural network approach
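
Of the time-domain methods listed above, AMDF is perhaps the simplest to sketch. For each candidate lag, average the absolute difference between the frame and a shifted copy of itself, then pick the lag with the smallest value. The test frame and the search range below are assumptions. The range is kept below two pitch periods because a wider range would also contain a minimum at twice the period.

```matlab
% Sketch: pitch estimate from the average magnitude difference function.
fs = 8000;                                 % sampling rate in Hz (assumed)
f0 = 200;                                  % pitch of the test frame in Hz (assumed)
n  = (0:255)';                             % sample indices of one frame
s  = sin(2*pi*f0*n/fs) + 0.5*sin(2*pi*2*f0*n/fs);   % frame with two harmonics
minLag = round(fs/400);                    % shortest lag searched (400 Hz, assumed)
maxLag = round(fs/125);                    % longest lag searched (125 Hz, assumed)
amdf = zeros(maxLag, 1);
for tau = 1:maxLag
    amdf(tau) = mean(abs(s(1+tau:end) - s(1:end-tau)));   % mean |s(k) - s(k-tau)|
end
[~, idx] = min(amdf(minLag:maxLag));       % deepest valley inside the search range
pitchLag = minLag + idx - 1;               % pitch period in samples
pitchHz  = fs/pitchLag;                    % pitch frequency in Hz
```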


Autocorrelation of Each Frame


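Let s(k) be a frame of size 128.
s(k): the original frame
s(k-τ): the frame shifted by τ samples
For τ = 30, x(30) = dot product of the overlapped part = sum(s(31:128).*s(1:98))
Autocorrelation x(τ): the lag of the highest peak away from τ = 0 gives the pitch period (30 samples in this example).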
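
The frame-wise autocorrelation can be sketched as follows. The frame is a synthetic decaying pulse train with a period of 30 samples, chosen to match the example lag. The sampling rate, pulse shape, and search range are assumptions.

```matlab
% Sketch: pitch period of one frame from the peak of its autocorrelation.
fs = 8000;                                  % sampling rate in Hz (assumed)
frameSize = 128;                            % frame size used in the example
period = 30;                                % true pitch period in samples (assumed)
pulses = zeros(frameSize, 1);
pulses(1:period:end) = 1;                   % one pulse every 30 samples
s = filter(1, [1 -0.8], pulses);            % decaying pulse train as a test frame
minLag = 16;  maxLag = 100;                 % lag range searched (assumed)
x = zeros(maxLag, 1);
for tau = 1:maxLag
    % dot product of the overlapped part, e.g. s(31:128).*s(1:98) for tau = 30
    x(tau) = sum(s(1+tau:frameSize) .* s(1:frameSize-tau));
end
[~, idx] = max(x(minLag:maxLag));           % highest peak away from lag 0
pitchLag = minLag + idx - 1;                % pitch period in samples
pitchHz  = fs/pitchLag;                     % pitch frequency in Hz
```

Because the overlap shrinks as the lag grows, peaks at multiples of the pitch period are lower than the first one, which helps the search land on the true period.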

Autocorrelation via DSP Blockset


Real-time autocorrelation demo:

Exercise:
Construct the above model and try it.


Pitch Tracking via Autocorrelation


Real-time pitch tracking via autocorrelation:
pitch2.mdl


Formant Analysis
Characteristics of formants:
• Formants are perceptually defined.
• The corresponding physical property is the set of resonance frequencies of the vocal tract.
• Formant analysis is useful because the positions of the first two formants largely identify a vowel.
Computation methods:
• Peak picking on the smoothed spectrum
• Peak picking on the LP spectrum
• Factoring for the LP roots
• Fitting of mixture of Gaussians
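
As an illustration of the LP-root method listed above, the sketch below estimates linear prediction coefficients with the autocorrelation method and reads the formants from the angles of the complex roots. The test signal is the impulse response of a synthetic two-formant filter. The formant values, bandwidth, and model order are assumptions. With real speech one would normally also pre-emphasize and window each frame, and the Signal Processing Toolbox function lpc could replace the hand-written normal equations.

```matlab
% Sketch: formant frequencies from the roots of a linear-prediction polynomial.
fs = 8000;                                  % sampling rate in Hz (assumed)
F  = [700, 1200];  bw = 100;                % true formants and bandwidth in Hz (assumed)
aTrue = 1;
for f = F
    r = exp(-pi*bw/fs);                     % pole radius from the bandwidth
    aTrue = conv(aTrue, [1, -2*r*cos(2*pi*f/fs), r^2]);   % add one resonance
end
s = filter(1, aTrue, [1; zeros(511, 1)]);   % synthetic "vowel": all-pole impulse response
p = 2*numel(F);                             % LP order: two poles per formant
R = zeros(p+1, 1);
for k = 0:p
    R(k+1) = sum(s(1+k:end) .* s(1:end-k)); % autocorrelation at lag k
end
alpha = toeplitz(R(1:p)) \ R(2:p+1);        % solve the normal equations
a = [1; -alpha];                            % prediction-error polynomial A(z)
rts = roots(a);                             % LP roots
rts = rts(imag(rts) > 0);                   % keep one root of each conjugate pair
formants = sort(angle(rts)*fs/(2*pi));      % root angles converted to Hz
```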


Formant Analysis
Track Draw:
• A package for formant synthesis with options to sketch formant tracks on a spectrogram.
• http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html
Formant Location Algorithm
• MATLAB code by Michelle Jamrozik
• http://ece.clemson.edu/speech/files.htm


Speech Waveform Coding


Time domain coding
• PCM: Pulse Code Modulation
• DPCM: Differential PCM
• ADPCM: Adaptive Differential PCM (dspadpcm.mdl)
Frequency domain coding
• Sub-band coding
• Transform coding
Speech Coding in MATLAB
http://www.eas.asu.edu/~speech/education/educ1.html
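
The difference between PCM and DPCM can be sketched with a uniform quantizer. PCM quantizes each sample directly, while DPCM quantizes the difference between the current sample and the previously reconstructed one, so the same number of bits covers a smaller range with a finer step. The test tone, bit depth, and difference range are assumptions. This is not the dspadpcm demo, and it has no adaptive step size.

```matlab
% Sketch: 4-bit PCM versus 4-bit DPCM on a slowly varying test tone.
fs = 8000;  t = (0:199)'/fs;                % sampling rate and time stamps (assumed)
x  = 0.8*sin(2*pi*200*t);                   % test signal (assumed)
nbits = 4;                                  % bits per sample for both coders (assumed)

% PCM: uniform quantizer covering [-1, 1)
stepPcm = 2/2^nbits;
xPcm = max(min(stepPcm*round(x/stepPcm), 1 - stepPcm), -1);

% DPCM: quantize the difference to the previous reconstructed sample
maxDiff = 0.2;                              % assumed range of sample differences
stepD   = 2*maxDiff/2^nbits;                % finer step for the same bit count
xHat = zeros(size(x));
prev = 0;                                   % decoder state, starts at zero
for n = 1:numel(x)
    d  = x(n) - prev;                       % prediction error
    dq = stepD*round(d/stepD);              % quantized difference
    dq = max(min(dq, maxDiff - stepD), -maxDiff);   % limit to the quantizer range
    prev = prev + dq;                       % reconstruction, as the decoder would do
    xHat(n) = prev;
end
fprintf('max error: PCM %.4f, DPCM %.4f\n', max(abs(x - xPcm)), max(abs(x - xHat)));
```

The advantage holds only while sample-to-sample changes stay inside the assumed difference range. A faster or louder signal would overload the DPCM quantizer.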


Conclusions
Ideal tools for speech/audio signal processing:
• MATLAB
• Simulink
• Signal Processing Toolbox
• DSP Blockset
Advantages:
• Reliable functions: well-established and tested
• Visible graphical algorithm design tools
• High-level programming language yet C-compatible
• Powerful visualization capabilities
• Easy debugging
• Integrated environment


