MATLAB Based Speech Processing

This document discusses speech recognition and related works. It gives an overview of speech recognition, explaining how speech signals can be converted to electrical waveforms and manipulated through analog and digital signal processing. It also summarizes several related works from 1995 to 2003 that explored neural networks and other statistical techniques for speech recognition. Finally, it proposes an image-processing-based approach to speech recognition and includes a MATLAB program implementing a correlation-based recognition method.

Uploaded by

smoke

INTRODUCTION

Speech is the most fundamental analog form of communication. A speech signal is a
sound wave, which a microphone can convert to an electrical waveform. Analog and
digital signal processing methods can then be used to manipulate the signal, and a
loudspeaker or headphone can convert it back to acoustic form. Speech recognition
is a technology that translates spoken words into text. Some speech recognition
systems also analyze a specific person's voice in order to recognize that person's
speech more accurately. Systems are therefore of two types: speaker-dependent and
speaker-independent. Speaker-dependent systems are trained on the voice of an
individual user, whereas speaker-independent systems do not require such per-user
training.

Related works

In 1995, Joe Tebelskis proposed speech recognition using neural networks, examining
how artificial neural networks can benefit a large-vocabulary, speaker-independent,
continuous speech recognition system. He explored two different ways to use neural
networks for acoustic modeling: prediction and classification. He found that
predictive networks yield poor results because of a lack of discrimination, but
classification networks gave excellent results. He also verified that, in accordance
with theory, the output activations of a classification network form highly accurate
estimates of the posterior probabilities P(class|input), and he showed how these can
easily be converted to likelihoods P(input|class) for standard HMM recognition
algorithms.

In 2003, Chulhee Lee, Donghoon Hyun, Euisun Choi, Jinwook Go, and Chungyong Lee, in
their paper "Optimizing feature extraction for speech recognition", proposed a
method to minimize the loss of information during the feature extraction stage of
speech recognition by optimizing the parameters of the mel-cepstrum transformation,
a transform that is widely used in speech recognition. The mel-cepstrum is obtained
with critical-band filters, whose characteristics play an important role in
converting a speech signal into a sequence of vectors. First, they analyzed the
performance of the mel-cepstrum while changing filter parameters such as shape,
center frequency, and bandwidth. Then they proposed an algorithm that optimizes the
filter parameters using the simplex method.

In 1997, K. Ohtsuki, S. Matsunaga, T. Matsuoka, and S. Furui, in their paper "Topic
extraction based on continuous speech recognition in broadcast news speech", studied
the extraction of several topic words from broadcast news using continuous speech
recognition and found that a combination of multiple topic words represents the
content of the news. They trained the topic extraction model with 5 years of
newspapers, using the frequency of topic words taken from headlines and of words in
articles. The degree of relevance between topic words and words in articles is
calculated on the basis of statistical measures, namely mutual information or the
χ² value. In topic extraction experiments on recognized broadcast news speech, they
extracted 5 topic words using a χ²-based model and found that 75% of them agreed
with topic words chosen by subjects.

3. An overview of speech recognition

Speech recognition is a technique that converts pulse code modulation (PCM) digital
audio from a sound card into recognized speech. Plotted against time, PCM audio is a
wavy line that looks like the output of an oscilloscope. The recognizer transforms
the PCM digital audio into the frequency domain in order to identify the frequency
components of a sound. The main objective of a speech recognition system is to
recognize what the user has said, so it must identify the phonemes of each spoken
word. Unfortunately, this is difficult for the following reasons. A word sounds
different every time the user speaks it, and users may not produce exactly the same
sound for the same phoneme. In addition, background noise from the microphone and
the user's room can cause the recognizer to hear a different sound than it would
have if the user were in a quiet room with a high-quality microphone. Methods used
for speech recognition include the fast Fourier transform, training with neural
networks, and various statistical techniques. Here we suggest a different approach
to speech recognition, based on image processing.
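
The step of moving PCM samples into the frequency domain can be illustrated with a minimal sketch. The sample rate, the two test tones, and all variable names below are assumptions chosen for the illustration; a synthetic signal stands in for recorded speech.

```matlab
% Sketch: find the dominant frequency component of a PCM-like signal.
fs = 8000;                          % assumed sample rate in Hz
n  = 0:fs-1;                        % one second of sample indices
x  = sin(2*pi*440*n/fs) + 0.5*sin(2*pi*1000*n/fs);   % stand-in for audio

X   = fft(x);                       % discrete Fourier transform
N   = numel(x);
mag = abs(X(1:N/2+1)) / N;          % one-sided magnitude spectrum
f   = (0:N/2) * fs / N;             % frequency of each bin in Hz

[~, idx]   = max(mag);              % strongest frequency bin
dominantHz = f(idx);
fprintf('Dominant frequency: %g Hz\n', dominantHz);
```

With one second of audio the bins are 1 Hz apart, so both tones fall exactly on a bin and the stronger 440 Hz tone is reported. Real speech would show many components that change over time.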

4. Basic image processing concepts

A digital image is composed of a grid of pixels and is stored as an array. A single
pixel represents a value of either light intensity or color. Images are processed to
extract information beyond what is directly visible in the initial pixel values.

4.1 Binary Image

A binary image consists of pixels that take only two values, 0 or 1. This type of
image is commonly used as a multiplier to mask regions within another image.
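
As a small illustration of masking by multiplication, the sketch below thresholds a greyscale array into a binary image and multiplies the two. The 3x3 pixel values and the threshold of 128 are hypothetical.

```matlab
% Sketch: build a binary mask and use it as a multiplier.
img = [ 10 200  30;
       220  40 250;
        60 180  90];            % hypothetical greyscale intensities

mask   = img > 128;             % binary image: 1 where the pixel is bright
masked = img .* mask;           % pixels outside the mask become 0

disp(mask);
disp(masked);
```

Only the pixels where the mask is 1 keep their original intensity.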

4.2 Grey scale Image

A greyscale digital image is an image in which the value of each pixel has a single
component, carrying only intensity information. Images of this type, also known as
black-and-white images, are composed exclusively of shades of grey, varying from
black at the lowest intensity to white at the highest. Greyscale images are distinct
from one-bit, bi-tonal black-and-white images, which have only two colors, black and
white; greyscale images have many shades of grey in between. Greyscale images are
also called monochromatic, denoting the presence of only one color.

4.3 RGB Image

An RGB image is a three-dimensional array: two of the dimensions specify the
location of a pixel within the image, and the third specifies the color of the
pixel. The color dimension has three components, the red, green, and blue color
bands. In the RGB color model, a color image can be represented by the intensity
function I_RGB = (F_R, F_G, F_B), where F_R(x,y) is the intensity of pixel (x,y) in
the red channel, F_G(x,y) is the intensity of pixel (x,y) in the green channel, and
F_B(x,y) is the intensity of pixel (x,y) in the blue channel. During RGB to
greyscale conversion, the luminance of the greyscale image is matched to the
luminance of the color image. One method is to obtain the red, green, and blue
primaries in linear intensity encoding by gamma expansion, and then add together 30%
of the red value, 59% of the green value, and 11% of the blue value.
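
The weighted sum can be sketched as follows. The 2x2 test image is hypothetical, and its values are assumed to be linear intensities in the range 0 to 1, so the gamma expansion step is skipped here.

```matlab
% Sketch: RGB to greyscale with the 30/59/11 weights from the text.
rgb = zeros(2, 2, 3);
rgb(:,:,1) = [1 0; 0 1];        % red channel
rgb(:,:,2) = [0 1; 0 1];        % green channel
rgb(:,:,3) = [0 0; 1 1];        % blue channel

grey = 0.30*rgb(:,:,1) + 0.59*rgb(:,:,2) + 0.11*rgb(:,:,3);
disp(grey);
```

Pure red, green, and blue pixels map to 0.30, 0.59, and 0.11, and a white pixel maps to 1. The Image Processing Toolbox function rgb2gray uses slightly more precise weights (0.2989, 0.5870, 0.1140).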

4.4 Histogram

A histogram is a graphical display of data using bars of different heights. It
groups numbers into ranges chosen by the user and gives a visual impression of the
distribution of the data. The distribution is shown by adjacent rectangles drawn
over discrete intervals, each with an area proportional to the frequency of the
observations in its interval, so that the total area of the histogram represents the
total number of data points. An image histogram shows, in graphical form, the
distribution of lightness or brightness in a digital image. The horizontal axis
represents the brightness value and the vertical axis represents the number of
pixels in the image that have that value.
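
An image histogram for an 8-bit image can be computed by counting how many pixels take each brightness value. The sketch below uses only base MATLAB; the 3x3 image is hypothetical.

```matlab
% Sketch: brightness histogram of an 8-bit greyscale image.
img = uint8([  0 0 1;
               2 2 2;
             255 1 0]);         % hypothetical pixel values

% counts(v+1) is the number of pixels whose brightness equals v.
counts = accumarray(double(img(:)) + 1, 1, [256 1]);

bar(0:255, counts);
xlabel('Brightness value');
ylabel('Number of pixels');
```

The counts always sum to the number of pixels in the image, here 9.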

4.5 Correlation coefficients

The correlation coefficient computed from sample data measures the strength and
direction of a relationship between two variables. It is a number between -1 and 1,
where the sign gives the direction of the relationship. If there is no relationship
between the predicted values and the actual values, the correlation coefficient is 0
or very low. As the strength of the relationship between the predicted values and
the actual values increases, so does the correlation coefficient. A perfect fit
gives a coefficient of 1.0. Thus, the higher the correlation coefficient, the better
[9, 11]. In MATLAB, corr2 computes the correlation coefficient between two matrices
of the same size.
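
The formula itself is missing from the source. As a hedged reconstruction, the two-dimensional correlation coefficient is commonly defined as r = sum((A - mean(A)) .* (B - mean(B))) / sqrt(sum((A - mean(A)).^2) * sum((B - mean(B)).^2)), with sums and means taken over all elements. The sketch below implements that definition in base MATLAB; the helper name corr2d and the test matrices are assumptions, and this is not claimed to be the toolbox implementation.

```matlab
% Sketch: 2-D correlation coefficient between equally sized matrices.
% corr2d is a hypothetical helper defined here, not a MATLAB built-in.
corr2d = @(A, B) sum((A(:) - mean(A(:))) .* (B(:) - mean(B(:)))) / ...
    sqrt(sum((A(:) - mean(A(:))).^2) * sum((B(:) - mean(B(:))).^2));

A = [1 2; 3 4];
rSame     = corr2d(A, 2*A + 1);   % perfect positive linear relationship
rOpposite = corr2d(A, -A);        % perfect negative linear relationship

fprintf('r = %g and r = %g\n', rSame, rOpposite);
```

The second case shows why the coefficient ranges from -1 to 1 rather than from 0 to 1.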

MATLAB PROGRAM

% Speech recognition using the cross-correlation method.
% Run this script from the MATLAB Command Window.
% xcorr requires the Signal Processing Toolbox.
clc;
clear;
close all;

% Keyword (reference) recording: keep the first channel as a column vector.
% File names are kept exactly as given in the source.
x = audioread('ok_gogle.wav');
x = x(:,1);

% Recordings to compare against the keyword, and their plot titles.
files  = {'ok_google.wav', 'whatsup.wav', 'hey_there.wav', 'hello.wav'};
titles = {'OK GOOGLE', 'WHATs UP', 'HEY THERE', 'HELLO'};
n = numel(files);
z = cell(1, n);    % cross-correlation sequences
t = cell(1, n);    % lag axes, in samples
m = zeros(1, n);   % peak cross-correlation per recording

for k = 1:n
    y = audioread(files{k});
    y = y(:,1);                      % first channel only
    [z{k}, t{k}] = xcorr(x, y);      % correlation values and matching lags
    m(k) = max(z{k});
end

% Common vertical limits so the four plots are directly comparable.
zmax = max(m);
zmin = min(cellfun(@min, z));

for k = 1:n
    subplot(2, 2, k);
    plot(t{k}, z{k});
    grid;
    title(titles{k});
    axis([min(t{k}) max(t{k}) zmin zmax]);
end
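
The program above plots the four cross-correlations but leaves the final decision to visual inspection. One possible way to automate that step is sketched below, under several assumptions: synthetic tones stand in for the recordings, the peak is normalized by the signal energies so that loudness and length do not dominate, and conv with a reversed signal replaces xcorr so that only base MATLAB is needed. The signal names and frequencies are hypothetical.

```matlab
% Sketch: pick the candidate whose normalized correlation peak is largest.
fs = 8000;                                  % assumed sample rate in Hz
tt = (0:fs/4-1)' / fs;                      % 0.25 s time axis
keyword = sin(2*pi*300*tt);                 % stand-in for the keyword

candidates = {[zeros(100,1); keyword], ...  % delayed copy of the keyword
              sin(2*pi*720*tt), ...
              cos(2*pi*1500*tt)};
names = {'delayed keyword', 'tone 720 Hz', 'tone 1500 Hz'};

score = zeros(1, numel(candidates));
for k = 1:numel(candidates)
    y = candidates{k};
    c = conv(keyword, flipud(y));           % cross-correlation over all lags
    score(k) = max(abs(c)) / (norm(keyword) * norm(y));
end

[~, best] = max(score);
fprintf('Best match: %s (score %.3f)\n', names{best}, score(best));
```

A normalized score close to 1 means one signal is nearly a shifted, scaled copy of the other. Recorded speech of the same word will usually score well below 1 because timing and pronunciation vary between utterances.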

RESULT

References

https://www.google.co.in/

https://en.wikipedia.org/

https://www.ieee.org/

https://in.mathworks.com/
