
SINUSOIDAL SYNTHESIS OF SPEECH USING MATLAB

Thesis submitted in partial fulfillment of the requirement of BITS C421T Thesis

BY

AKSHAY VIJAY JAIN
2009B4A8568P

Under the supervision of
Dr. RAHUL SINGHAL
Assistant Professor, EEE Dept., BITS-Pilani

AT

BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI
November, 2013



ACKNOWLEDGEMENT
I would like to thank the Almighty first of all for his blessings. I am obliged to Prof. B.N. Jain, Vice Chancellor, Birla Institute of Technology & Science, Pilani, for providing us with a course pattern in which a student gets exposure to projects. I wish to express a deep sense of gratitude to Dr. Rahul Singhal, my supervisor for the thesis titled Sinusoidal Synthesis of Speech using MATLAB, for providing me this wonderful opportunity to learn about the various parameters associated with speech and the synthesis of speech from a spectrogram. I would also like to thank him for his constant advice, encouragement, and support during the study. I wish to express gratitude to all the other people, as well as all the websites, whose content supported this research work. Last but not least, I would like to thank my parents for their constant support and motivation.

CERTIFICATE This is to certify that the thesis entitled Sinusoidal Synthesis of Speech using Matlab, submitted by Akshay Vijay Jain, ID No. 2009B4A8568P, in partial fulfillment of the requirement of the BITS C421T Thesis, embodies the work done by him under my supervision.

Signature of Supervisor
Date: 25 November 2013
Dr. Rahul Singhal
Assistant Professor, EEE Department, BITS Pilani, Pilani Campus

Thesis Abstract This thesis report discusses the speech signal: how it is stored on a computer, how it is analyzed, and how it is synthesized. One way of analyzing a speech signal is the Short-Time Fourier Transform, which is discussed in the report along with its parameters. Based on this analysis of the speech signal, we extract a matrix containing the frequencies present in the signal as a function of time. Then, having obtained this matrix from the spectrogram generated in MATLAB, we resynthesize the speech signal by sinusoidal addition using MATLAB code.

TABLE OF CONTENTS
1) Introduction
2) Recording of speech signal
3) Analysis of speech signal
   a) Long-term frequency analysis
   b) Window sequence
   c) Effect of the window
   d) Choice of window
   e) Parameters of the short-term frequency spectrum
   f) Time-frequency domain: spectrogram
   g) Length of the window and fundamental frequency
4) Why sinusoids?
5) Additive synthesis
6) Frequency Vs Time matrix from spectrogram in MATLAB
   a) GenerateFreqVsTime Matlab code
   b) Croplimits Matlab code
   c) Screenshots
7) Speech signal from Frequency Vs Time matrix in MATLAB
   a) GenerateSoundData Matlab code
   b) TestAtLevel Matlab code
8) Results
9) Conclusion
10) Bibliography/References

1) Introduction

We all know that speech is an acoustic signal; by that we mean it is a mechanical wave, an oscillation of pressure transmitted through a solid, liquid, or gas, composed of frequencies within the hearing range. Sound is a sequence of pressure waves that propagates through compressible media such as air or water. (Sound can propagate through solids as well, but with additional modes of propagation.) Sound perceptible by humans has frequencies from about 20 Hz to 20,000 Hz. In air at standard temperature and pressure, the corresponding wavelengths of sound waves range from 17 m to 17 mm. During propagation, waves can be reflected, refracted, or attenuated by the medium.
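The quoted wavelength range follows directly from the relation between wavelength, the speed of sound, and frequency; assuming a speed of sound of roughly 343 m/s in air:

$\lambda = \frac{c}{f}, \qquad \frac{343~\mathrm{m/s}}{20~\mathrm{Hz}} \approx 17~\mathrm{m}, \qquad \frac{343~\mathrm{m/s}}{20\,000~\mathrm{Hz}} \approx 17~\mathrm{mm}$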

Figure 1. Typical sound signal

2) Recording of Speech

Sound recording is an electrical or mechanical inscription of sound waves, such as spoken voice, singing, instrumental music, or sound effects. The two main classes of sound recording technology are analog recording and digital recording. Acoustic analog recording is achieved by a small microphone diaphragm that detects changes in atmospheric pressure (acoustic sound waves) and records them as a graphic representation of the sound waves on a medium such as a phonograph (in which a stylus senses grooves on a record). In magnetic tape recording, the sound waves vibrate the microphone diaphragm and are converted into a varying electric current, which is then converted to a varying magnetic field by an electromagnet, making a representation of the sound as magnetized areas on a plastic tape with a magnetic coating. Digital recording converts the analog sound signal picked up by the microphone to digital form by a process of digitization, allowing it to be stored and transmitted by a wider variety of media. Digital recording stores audio as a series of binary numbers representing samples of the amplitude of the audio signal at equal time intervals, at a sample rate high enough to convey all sounds capable of being heard. Digital recordings are considered higher quality than analog recordings not necessarily because they have higher fidelity (wider frequency response or dynamic range), but because the digital format can prevent much of the loss of quality found in analog recording due to noise and electromagnetic interference in playback, and to mechanical deterioration or damage to the storage medium. A digital audio signal must be reconverted to analog form during playback before it is applied to a loudspeaker or earphones.
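As a minimal sketch of this digitization process (the tone and values below are assumed for illustration, though they mirror the 8 kHz, 8-bit, mono settings used in the recording code later in this thesis):

fs = 8000;                  % sample rate in Hz
t  = 0:1/fs:1;              % one second of sampling instants
x  = sin(2*pi*440*t);       % stand-in for the analog microphone signal
xq = round(x*127)/127;      % uniform signed 8-bit quantization
soundsc(xq, fs);            % reconvert to analog form for playback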

3) Analysis of Speech Signal

The long-term frequency analysis of speech signals yields good information about the overall frequency spectrum of the signal, but no information about the temporal location of those frequencies. Since speech is a very dynamic signal with a time-varying spectrum, it is often insightful to look at the frequency spectra of short sections of the speech signal.

a) Long-term frequency analysis

The frequency response of a system is defined as the discrete-time Fourier transform (DTFT) of the system's impulse response h[n]:

$H(\omega) = \sum_{n=-\infty}^{\infty} h[n]\, e^{-j\omega n}$

Similarly, for a sequence x[n], its long-term frequency spectrum is defined as the DTFT of the sequence:

$X(\omega) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j\omega n}$

Theoretically, we must know the sequence x[n] for all values of n (from $n=-\infty$ to $n=\infty$) in order to compute its frequency spectrum. Fortunately, all terms where x[n] = 0 contribute nothing to the sum, and therefore an equivalent expression for the sequence's spectrum is

$X(\omega) = \sum_{n=0}^{N-1} x[n]\, e^{-j\omega n}$

Here we've assumed that the sequence starts at 0 and is N samples long. This tells us that we can apply the DTFT to only the non-zero samples of x[n] and still obtain the sequence's true spectrum $X(\omega)$. But what is the correct mathematical expression to compute the spectrum over a short section of the sequence, that is, over only part of the non-zero samples of the sequence?
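Before addressing that question, note that the finite sum above can be evaluated numerically; the following is a minimal sketch with an assumed example sequence (not part of the thesis code):

x = [1 2 3 2 1];                 % assumed sequence, N = 5, starting at n = 0
N = length(x);
w = linspace(-pi, pi, 512);      % frequency grid in rad/sample
X = zeros(size(w));
for n = 0:N-1
    X = X + x(n+1) * exp(-1j*w*n);   % accumulate x[n] e^{-j w n}
end
figure; plot(w, abs(X));
xlabel('\omega (rad/sample)'); ylabel('|X(\omega)|');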

b) Window sequence

It turns out that the mathematically correct way to do that is to multiply the sequence x[n] by a window sequence w[n] that is non-zero only for $n = 0, \ldots, L-1$, where L, the length of the window, is smaller than the length N of the sequence x[n]:

$x_w[n] = x[n]\, w[n]$

Then we compute the spectrum of the windowed sequence $x_w[n]$ as usual:

$X_w(\omega) = \sum_{n=-\infty}^{\infty} x_w[n]\, e^{-j\omega n}$

The following figure illustrates how a window sequence w[n] is applied to the sequence x[n]:

Figure 2 Result of application of windowed sequence to data sequence

As the figure shows, the windowed sequence is shorter than the original sequence, so we can further truncate the DTFT of the windowed sequence:

$X_w(\omega) = \sum_{n=0}^{L-1} x_w[n]\, e^{-j\omega n}$
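To make these two steps concrete, windowing a section and evaluating its spectrum, here is a minimal Matlab sketch; the sequence, window length, and section start below are assumed for illustration:

x  = randn(1, 4000);             % stand-in for a long sequence x[n]
L  = 256;                        % window length, L < N
n0 = 1001;                       % start of the analysis section
w  = hamming(L)';                % window sequence w[n]
xw = x(n0:n0+L-1) .* w;          % windowed sequence xw[n]
Xw = fft(xw, 1024);              % short-term spectrum sampled at 1024 points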

Using this windowing technique, we can select any section of arbitrary length of the input sequence x[n] by choosing the length and location of the window accordingly. The only question that remains is: how does the window sequence w[n] affect the short-term frequency spectrum?

c) Effect of the window

To answer that question, we need to introduce an important property of the Fourier transform. The diagram below illustrates the property graphically:

I. Implementation of an LTI system in the time domain.

II. Equivalent implementation of an LTI system in the frequency domain.


The two implementations of an LTI system are equivalent: they give the same output for the same input. Hence, convolution in the time domain equals multiplication in the frequency domain:

$y[n] = x[n] * h[n] \quad\Longleftrightarrow\quad Y(\omega) = X(\omega)\, H(\omega)$

And since the time domain and the frequency domain are each other's dual in the Fourier transform, it is also true that multiplication in the time domain equals convolution in the frequency domain:

$x_w[n] = x[n]\, w[n] \quad\Longleftrightarrow\quad X_w(\omega) = \frac{1}{2\pi}\, X(\omega) * W(\omega)$
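This duality can be checked numerically; the sketch below compares the DFT of a pointwise product against the circular convolution of the individual DFTs (an illustrative check; cconv is a Signal Processing Toolbox function):

N   = 64;
x   = randn(1, N);
w   = hamming(N)';
lhs = fft(x .* w);                   % multiplication in the time domain
rhs = cconv(fft(x), fft(w), N) / N;  % circular convolution in the frequency domain
max(abs(lhs - rhs))                  % agreement up to rounding error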

This shows that multiplying the sequence x[n] by the window sequence w[n] in the time domain is equivalent to convolving the spectrum of the sequence, $X(\omega)$, with the spectrum of the window, $W(\omega)$. The result of this convolution in the frequency domain is that the spectrum of the sequence is smeared by the spectrum of the window. This is best illustrated by the example in the figure below:


Figure 3 Result of application of window sequence in time and frequency domain

d) Choice of window

Because the window determines the spectrum of the windowed sequence to a great extent, the choice of window is important. Matlab supports a number of common windows, each with its own strengths and weaknesses. Some common choices are shown below.

Figure 4 Rectangular window sequence



Figure 5 Triangular and Hamming window sequences

All windows share the same general characteristics: their spectrum has a peak, called the main lobe, and ripples to the left and right of the main lobe called the side lobes. The width of the main lobe and the relative height of the side lobes differ for each window. The main-lobe width determines how accurately a window can resolve different frequencies: wider is less accurate. The side-lobe height determines how much spectral leakage the window has. An important thing to realize is that we can't have short-term frequency analysis without a window: even if we don't explicitly use a window, we are implicitly using a rectangular window.

e) Parameters of the short-term frequency spectrum

Besides the type of window (rectangular, Hamming, etc.), there are two other factors in Matlab that control the short-term frequency spectrum: the window length and the number of frequency sample points. The window length controls the fundamental trade-off between time resolution and frequency resolution of the short-term spectrum, irrespective of the window's shape. A long window gives poor time resolution but good frequency resolution. Conversely, a short window gives good time resolution but poor frequency resolution. For example, a 250-millisecond window can, roughly speaking, resolve frequency components when they are 4 Hz or more apart (1/0.250 = 4), but it can't tell where in those 250 milliseconds those frequency components occurred. On the other hand, a 10-millisecond window can only resolve frequency components when they are 100 Hz or more apart (1/0.010 = 100), but the uncertainty in time about the location of those frequencies is only 10 milliseconds. The result of short-term spectral analysis using a long window is referred to as a narrowband spectrum (because a long window has a narrow main lobe), and the result of short-term spectral analysis using a short window is called a wideband spectrum. In short-term spectral analysis of speech, the window length is often chosen with respect to the fundamental period of the speech signal, i.e., the duration of one period of the fundamental frequency. A common choice for the window length is either less than one fundamental period, or greater than 2-3 times the fundamental period. Examples of narrowband and wideband short-term spectral analysis of speech are given in the figures below:

Figure 6 Wideband and narrowband analysis of speech

The other factor controlling the short-term spectrum in Matlab is the number of points at which the frequency spectrum $X_w(\omega)$ is evaluated. The number of points is usually equal to the length of the window. Sometimes a greater number of points is chosen to obtain a smoother-looking spectrum. Evaluating the spectrum at fewer points than the window length is possible, but very rare.
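The resolution trade-off described above can be demonstrated with two tones close together in frequency; in this sketch (assumed values, not from the thesis) a 250 ms window resolves tones 10 Hz apart while a 10 ms window merges them:

fs = 8000;
t  = 0:1/fs:1;
x  = sin(2*pi*1000*t) + sin(2*pi*1010*t);   % two tones 10 Hz apart
for L = [round(0.250*fs) round(0.010*fs)]   % 250 ms and 10 ms windows
    X = fft(x(1:L) .* hamming(L)', 8192);   % zero-padded for a smooth plot
    f = (0:4095)/8192 * fs;                 % frequency axis up to fs/2
    figure; plot(f, abs(X(1:4096)));
    title(sprintf('Window length = %d samples', L));
end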

f) Time-frequency domain: Spectrogram

An important use of short-term spectral analysis is the short-time Fourier transform, or spectrogram, of a signal. The spectrogram of a sequence is constructed by computing the short-term spectrum of a windowed version of the sequence, then shifting the window to a new location and repeating this process until the entire sequence has been analyzed. The whole process is illustrated in the figure below:

Figure 7 Demonstration of the making of a spectrogram

Together, these short-term spectra (bottom row) make up the spectrogram, and are typically shown in a two-dimensional plot, where the horizontal axis is time, the vertical axis is frequency, and magnitude is the color or intensity of the plot. For example:


Figure 8 A typical spectrogram

The appearance of the spectrogram is controlled by a third parameter: window overlap. Window overlap determines how far the window is shifted between repeated computations of the short-term spectrum. Common choices for window overlap are 50% or 75% of the window length. For example, if the window length is 200 samples and the window overlap is 50%, the window is shifted 100 samples between each short-term spectrum; if the overlap is 75%, the window is shifted 50 samples. The choice of window overlap depends on the application. When a temporally smooth spectrogram is desirable, the window overlap should be 75% or more. When computation should be kept to a minimum, no overlap or 50% overlap are good choices. If computation is not an issue, you could even compute a new short-term spectrum for every sample of the sequence; in that case, window overlap = window length - 1, and the window shifts only 1 sample between spectra. But doing so is wasteful when analyzing speech signals, because the spectrum of speech does not change at such a high rate. It is more practical to compute a new spectrum every 20-50 milliseconds, since that is the rate at which the speech spectrum changes.
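The overlap arithmetic above translates directly into Matlab's spectrogram call; a minimal sketch with an assumed stand-in signal (in the thesis code of Section 6, myRecording plays this role):

x        = randn(8000, 1);           % stand-in for one second of 8 kHz speech
winlen   = 200;                      % window length in samples
noverlap = round(0.75 * winlen);     % 75% overlap -> 150 samples
hop      = winlen - noverlap;        % window shift per frame
fprintf('window shifts %d samples per frame\n', hop);
spectrogram(x, hamming(winlen), noverlap, 1024, 8000, 'yaxis');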

g) Length of the window and fundamental frequency



In a wideband spectrogram (i.e., using a window shorter than the fundamental period), the fundamental frequency of the speech signal resolves in time. That means you can't really tell what the fundamental frequency is by looking at the frequency axis, but you can see energy fluctuations at the rate of the fundamental frequency along the time axis. In a narrowband spectrogram (i.e., using a window 2-3 times the fundamental period), the fundamental frequency resolves in frequency, i.e., you can see it as an energy peak along the frequency axis. See for example the figures below:

Figure 9. Wideband Speech Spectrogram

Figure 10. Narrowband Speech Spectrogram
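As a worked example of the window-length rules in this section, assume a speaker with fundamental frequency f0 = 100 Hz sampled at 8 kHz (illustrative values, not from the thesis):

fs = 8000; f0 = 100;
T0 = fs / f0;                      % fundamental period = 80 samples (10 ms)
widebandLen   = round(0.5 * T0);   % shorter than one period: 40 samples (5 ms)
narrowbandLen = 3 * T0;            % 2-3 periods: 240 samples (30 ms)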


4) Why Sinusoids?

In general, the goal of modelling a signal is to reduce redundancy and obtain a more compact representation of the data. There are different techniques for modelling a time series, and which technique to apply depends on the signal. Sinusoids are especially well suited for modelling speech with harmonic content. Most natural acoustical sounds exhibit this attribute, and the reason for this sinusoidal character can be found in the way speech is produced. The human voice production system consists of two fundamental parts working together: the vocal cords (the excitation source) and the pharynx with the mouth and nasal cavities acting as an acoustical filter. During voiced parts of speech, the vocal cords open and close at a certain frequency (the fundamental frequency, f0), modulating the airstream coming from the lungs. The harmonic overtone structure results from the structure of the pharynx, which in a simplified way can be seen as an open tube in which all overtones develop, with f1 ... fn being integer multiples of the fundamental f0.

5) Additive Synthesis

Sine waves can be considered the building blocks of speech. In fact, it was shown in the 19th century by the mathematician Joseph Fourier that any periodic function can be expressed as a series of sinusoids of varying frequencies and amplitudes. This concept of constructing a complex sound out of sinusoidal terms is the basis for additive synthesis, sometimes called Fourier synthesis for the aforementioned reason. In addition, the concepts of additive synthesis have existed since the introduction of the organ, where different pipes of varying pitch are combined to create a sound or timbre. A simple block diagram of the additive form may appear as follows:

Figure 11. Block diagram representation of sinusoidal synthesis

Its mathematical form, based on the Fourier series, is

$y(t) = a_0 + \sum_{k=1}^{K} a_k \sin(2\pi k f_0 t)$

where $a_0$ is an offset value for the whole function (typically 0), $a_k$ is the amplitude weighting for each sine term, and $k$ is the frequency multiplier value. With hundreds of terms, each with its own individual frequency and amplitude weighting, we can design and specify some incredibly complex sounds, especially if we can modulate the parameters over time.
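A minimal additive-synthesis sketch of the series above, assuming a 1/k amplitude roll-off and a 200 Hz fundamental (illustrative values, not from the thesis):

fs = 8000; f0 = 200; K = 10;          % sample rate, fundamental, number of terms
t  = 0:1/fs:1;                        % one second of time samples
y  = zeros(size(t));
for k = 1:K
    a_k = 1/k;                        % assumed amplitude weighting for term k
    y   = y + a_k * sin(2*pi*k*f0*t); % add the k-th harmonic
end
soundsc(y, fs);                       % play the synthesized tone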


6) Frequency Vs Time Matrix from Spectrogram in MATLAB


Determination of the frequency content present in speech at a particular instant of time is possible, approximately, via the Short-Time Fourier Transform (STFT). For our thesis work we are using the narrowband spectrogram produced in Matlab. We chose narrowband because it gives better frequency resolution and acceptable time resolution; we tried a wideband spectrogram, but the speech synthesized using information from the wideband spectrogram was very noisy.

First of all, we take the spectrogram of the speech signal with the help of the MATLAB command spectrogram. The spectrogram produced by this command is an RGB image on a decibel scale, in which intensities above 0 dB are expressed in varying shades of red, so we separate out the red component from the RGB image. In the separated component we can easily identify the frequencies that had higher intensities in the speech, since the pixels corresponding to high-intensity frequencies appear white while others appear black, with intermediate values in gray scale.

The red component is then appropriately cropped and resized to 400 rows, so that each row covers a 10 Hz range, and to a number of columns equal to one hundred times the duration of the speech signal, so that each column corresponds to 10 milliseconds of speech. It has been found that when we convert the resized image into black and white, mapping gray pixels nearer to white into white and gray pixels nearer to black into black, the quality of the synthesized speech is very near to the original speech. So we produce the black-and-white image, which corresponds to a Frequency Vs Time graph for the speech signal.

a) The MATLAB code for performing the above task is as follows:


% function GenerateFreqVsTime()
% Record your voice.
f=input('Enter the time in seconds for which you want to record: ');
recObj=audiorecorder(8000,8,1);            % 8 kHz, 8-bit, mono
disp('Start speaking.');
recordblocking(recObj,f);
disp('End of Recording.');
% Play back the recording.
play(recObj);
% Store data in a double-precision array.
myRecording=getaudiodata(recObj);
figure(1)
plot(myRecording); title('sound');
% Plot the spectrogram.
figure(2)
spectrogram(myRecording,1000,923,1024,8E3,'yaxis');
h=gcf;
set(gcf,'Position',get(0,'Screensize'));   % Maximize figure.
level=input('Please enter level between 0 and 1: ');
saveas(h,'spectrogram1.jpg');
fig=imread('spectrogram1.jpg');
figGray=rgb2gray(fig);
figure(9)
imshow(figGray); title('FigGray');
figRed=fig(:,:,1);                         % red channel of the RGB spectrogram image
figure(3)
imshow(figRed); title('figRed');
[xmin ymin width height]=croplimits(figRed);
figure(4)
figRedCropped=imcrop(figRed,[xmin ymin width height]);
imshow(figRedCropped); title('figRed Cropped');
figure(5)
figRedCroppedResized=imresize(figRedCropped,[400 100*f]);   % 400 rows, 100 columns per second
imshow(figRedCroppedResized); title('figRedCroppedResized');
figRedCroppedResizedCorrected=flipud(figRedCroppedResized); % first row = lowest frequency
figure(6)
figRedCroppedResizedBW=im2bw(figRedCroppedResized,level);   % threshold to black and white
imshow(figRedCroppedResizedBW); title('figRedCroppedResizedBW');
figure(7)
figRedCroppedResizedBWCorrected=flipud(figRedCroppedResizedBW);
imshow(figRedCroppedResizedBWCorrected);

b) The Matlab code for the croplimits function used in the above code is as follows:


function [xmin ymin width height]=croplimits(input)
% Find the bounding box of the non-white region of the spectrogram
% image so that the axes and margins can be cropped away.
xmin=0;r2=0;ymin=0;c2=0;
[row,column]=size(input);
% Top edge: scan down the middle column for the first non-white pixel.
for i=30:90
    if(input(i,column/2)~=255)
        ymin=i+5;
        break
    end
end
count=0;
% Bottom edge: scan up from the last row; accept a row if most pixels
% in a 50-pixel strip around the middle column are non-white.
for ki=row:-1:row-120
    if(input(ki,column/2)~=255)
        for kj=column/2:column/2+50
            if(input(ki,kj)~=255)
                count=count+1;
            else
                count=count-1;
            end
        end
        if(count>0)
            r2=ki;
            break
        end
        count=0;
    end
end
count=0;
% Left edge: scan right; accept a column if a 40-pixel strip is mostly non-white.
for j=80:180
    if(input(row/3,j)~=255)
        for i=row/2:row/2+40
            if(input(i,j)~=255)
                count=count+1;
            else
                count=count-1;
            end
        end
        if(count>24)
            xmin=j+8;break;
        end
        count=0;
    end
end
count=0;
% Right edge: scan left from the last column with a 100-pixel strip.
for j=column:-1:column-120
    if(input(row/2,j)~=255)
        for i=row/2:row/2+100
            if(input(i,j)~=255)
                count=count+1;
            else
                count=count-1;
            end
        end
        if(count>0)
            c2=j;break;
        end
        count=0;
    end
end
height=r2-ymin+1;
width=c2-xmin+1;
end

c) Screenshots

i. Speech Waveform

Figure 12 Speech Waveform



ii. Spectrogram of above speech using Matlab

Figure 13 Spectrogram of above speech using Matlab


iii. Grayscale Spectrogram

Figure 14 Grayscale Spectrogram



iv. Image of the red component of the spectrogram, since the red component represents positive magnitude

Figure 15 Red component of spectrogram


v. Same figure after being cropped by the Matlab function croplimits

Figure 16. Same figure after being cropped by the Matlab function croplimits

vi. Above figure resized by the Matlab function imresize so that each column of pixels corresponds to 10 milliseconds

Figure 17 Resized using Matlab


vii. Above figure inverted so as to make the first row correspond to 10 Hz, the next row to 20 Hz, and the last (400th) row to 4 kHz

Figure 18 Same figure as previous but inverted



viii. Same figure as above with pixels having intensity less than 0.9 reduced to zero while others extended to 1

Figure 19 Same figure as above with pixels having intensity less than 0.9 reduced to zero while others extended to 1


7) Speech signal from Frequency Vs Time Matrix in MATLAB


Once we have the Frequency Vs Time matrix, we can generate all the frequencies using the sin function of MATLAB, add them all together, and do this for each column, where each column corresponds to 10 milliseconds. We then concatenate the data generated for each column, and the result is the speech signal. The MATLAB code for performing the above series of tasks is as follows: a) GenerateSoundData Matlab Code:
function sounddata=GenerateSoundData(image)
[row column]=size(image);
image=image/.255;                        % intensity scaling (a white pixel, 1, becomes ~3.92)
sounddata=zeros(1,80*column);            % 80 samples per 10 ms column
timeResolution=.01;                      % 10 milliseconds
samplingRate=8000;                       % 8000 Hz
time=1/samplingRate:1/samplingRate:timeResolution;
for i=1:column
    y=sqrt(double(image(10,i)))*sin(2*pi*time*10*10);       % row 10 -> 100 Hz
    for j=11:row-100                                        % rows 11..300 -> 110 Hz to 3000 Hz
        y=y+sqrt(double(image(j,i)))*sin(2*pi*time*j*10);   % row j -> j*10 Hz
    end
    sounddata(80*(i-1)+1:80*i)=y;        % concatenate the 10 ms segment
end
sounddata=sounddata';

In this code we only generate frequencies in the range 100 Hz to 3000 Hz, because the other frequencies contribute little to what is heard. b) TestAtLevel Matlab Code:

function sdata=TestAtLevel(spectrograph,level)
bwspectrograph=im2bw(spectrograph,level);  % threshold the image at the given level
sdata=GenerateSoundData(bwspectrograph);   % resynthesize speech from the BW image
soundsc(sdata,8000);                       % play back at 8 kHz
end

In the above function, TestAtLevel, we pass the matrix figRedCroppedResizedCorrected obtained from the GenerateFreqVsTime function, along with the level, which specifies the threshold below which values are converted to zero while values greater than the level become 1.
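For example, a call mirroring the threshold found best in the Results section would look like this (using the matrix name from the GenerateFreqVsTime code above):

sdata=TestAtLevel(figRedCroppedResizedCorrected,0.9);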

8) Results

The speech waveforms generated with different values of level for the conversion of the red component of the spectrogram into a black-and-white image are shown below, along with their spectrograms.

a) Level = 0.8


b) Level = 0.9


c) Level = 0.95


9) Conclusion

From the above three speech waveforms, it appears that a level of around 0.9 is the best threshold for the red component of the spectrogram generated in Matlab, in the sense that the speech generated using the GenerateSoundData function best matches the original speech. The sinusoidal model, a framework for modelling speech and music signals, has been presented. Sinusoidal synthesis of speech by extracting frequency and time information from the spectrogram gives acceptable speech quality. Another strategy would be to decompose the signal into deterministic and stochastic parts and use different models for the different portions of a speech signal, as proposed in [5].

10) Bibliography/References
[1] R. McAulay and T. Quatieri, "Speech Analysis/Synthesis Based on a Sinusoidal Representation," IEEE Transactions on Acoustics, Speech, and Signal Processing, August 1986.
[2] J. Smith III and X. Serra, "PARSHL: An Analysis/Synthesis Program for Non-Harmonic Sounds Based on a Sinusoidal Representation."
[3] K. Fitz and L. Haken, "On the Use of Time-Frequency Reassignment in Additive Sound Modelling."
[4] M. Lagrange, S. Marchand, M. Raspaud, and J.-B. Rault, "Enhanced Partial Tracking Using Linear Prediction," in Proc. of the 6th Int. Conference on Digital Audio Effects (DAFx-03), September 2003.
[5] X. Serra, "A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic plus Stochastic Decomposition," Ph.D. thesis, Stanford University, 1989.
