DCT Application in Speech Recognition: A Survey

This document surveys the application of Discrete Cosine Transform (DCT) in speech recognition, highlighting its effectiveness in feature extraction and noise reduction. It discusses various methods, including the use of DCT in conjunction with Mel Frequency Cepstral Coefficients (MFCC) and hybrid systems like Genetic-Fuzzy Inference for improved recognition accuracy. The findings suggest that DCT enhances speech recognition performance by providing better energy compaction and reducing noise in voice signals.

International Journal of Engineering and Techniques - Volume 5 Issue4 , August 2019

DCT APPLICATION IN SPEECH RECOGNITION: A SURVEY
Atul Narkhede (1), Dr. Naveen Sen (2), Dr. Milind Nemade (3)
(1) Research Scholar, Faculty of Engineering, Pacific Academy of Higher Education and Research University, Udaipur
(2) Associate Prof., Pacific Academy of Higher Education and Research University, Udaipur
(3) Professor, Department of Electronics Engineering, K. J. Somaiya Institute of Engineering & Information Technology, Mumbai, India

Abstract: Automatic speech recognition by machine has been an important research area for over forty
years. Because speech is a very rich information signal, digital processing of the speech signal is
an efficient tool for highly accurate automatic speech recognition. Speech recognition has found
applications in many areas of daily life, such as telephone answering machines, text transcription
and sending voice commands to machines. Feature extraction and classification are major parts of the
ASR process. Selecting the feature extraction method is central to improving a speech processing
system, because it plays an important role in the accuracy of the system. This paper gives a brief
overview of various speech processing methods in which the DCT is used to extract features
efficiently in different ways.

Keywords: DCT, MMSE, MFCC.

I. Introduction

Automatic speech recognition by machine has been a research goal for over four decades. Science has
long been interested in computers that can imitate human abilities. The idea of building speech
recognition systems arose because it is convenient for humans to interact with a computer, a robot
or any other machine by voice rather than through difficult instructions. Humans have long been
inspired to create a computer that can understand and speak like a human. Speech recognition is the
process by which the computer maps an acoustic speech signal to some form of abstract meaning of the
speech. This process is very difficult, since the incoming sound must be matched against stored
sound fragments, and further analysis is needed because the incoming fragments do not exactly match
the pre-existing ones. Various feature extraction and pattern matching techniques are used to build
better speech recognition systems. The feature extraction and pattern matching techniques play an
important role in a speech recognition system in maximizing the recognition rate across different
speakers. The following sections describe some methods that illustrate the advantages and
disadvantages of the DCT.

II. DCT for Noise Reduction:

The work in [1] illustrates the advantages of using the discrete cosine transform (DCT) over the
standard discrete Fourier transform (DFT) for removing the noise embedded in a speech signal. A
minimum mean square error (MMSE) filter is derived from a statistical model of the DCT coefficients.
An over-attenuation factor is also derived, based on the fact that speech energy is not always
present in the noisy signal at all times or in all coefficients. This over-attenuation factor is
useful for suppressing any residual musical noise that may be present. Speech enhancement by noise
removal is often necessary in speech processing systems operating in noisy environments. The energy of
white noise is uniformly distributed across the spectrum, whereas the energy of speech, particularly
of voiced sound, is concentrated at certain frequencies. The advantage of using a real transform,
such as the DCT considered in [1], is therefore that leaving the phase uncorrected has less serious
consequences. The DCT is widely used in image compression because of its excellent energy compaction
property, which is also useful for noise removal. The DCT provides significantly higher energy
compaction than the DFT [1].

III. Hybrid Method Genetic-Fuzzy Inference System:

In this system, a speech signal is encoded and parameterized in a two-dimensional time matrix with
four parameters of the speech signal. After encoding, the mean and variance of each pattern are used
to generate the rule base of a Mamdani fuzzy inference system. The mean and variance are optimized
using a genetic algorithm to obtain the best performance of the recognition system. The patterns
considered are the Brazilian spoken digits 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9. The discrete cosine
transform (DCT) is used to encode the speech patterns. The use of the DCT in data compression and
pattern classification has increased in recent years, mainly because its performance is very close
to that of the Karhunen-Loève transform, which is considered optimal for a variety of criteria such
as mean square truncation error and entropy. That work demonstrates the potential of the DCT and the
fuzzy inference system in speech recognition; the two tools gave good results in the temporal
modeling of the speech signal [2].

Fig. 1. Block diagram of the proposed recognition system HMFE

IV. MMSE filter using DCT:

The method in [3] examines the properties of the discrete cosine transform (DCT) relative to the
standard discrete Fourier transform (DFT) for removing noise from speech. The results show that the
DCT has better energy compaction and requires fewer computations than the DFT. The proposed
algorithm reduces residual noise using a speech absence probability technique. It uses adaptive
schemes that track the probability of speech absence in noisy speech. The spectral estimate is
obtained from a binary classification, that is, speech is either present or absent.

The presence of different kinds of noise, such as:
• Background noise
• Channel noise
• Quantization noise
significantly degrades the performance of systems such as speech coders and speech recognition
systems, so these systems need a preprocessing step that incorporates speech enhancement to remove
the noise. A filtering process must be applied to the signal to remove the noise. Filtering can be
defined as follows: the process of extracting the information-carrying signal X(n) from the observed
signal Y(n), where Y(n) = X(n) + N(n) and N(n) is a noise process, is called filtering. Different
algorithms, in both the time domain and the frequency domain, are used to remove the noise embedded
in the noisy speech signal [3].

V. DCT and MFCC:

The work in [4] presents an approach to speech signal recognition that uses spectral information on
the Mel frequency scale, which is a dominant feature for speech recognition. The Mel frequency
cepstral coefficients (MFCCs) are coefficients that collectively represent the short-term power
spectrum of a sound, based on a linear cosine transform of a log power spectrum on a non-linear mel
frequency scale. The performance of the MFCCs is influenced by the number of filters, the shape of
the filters, the filter spacing and the way the power spectrum is warped. In that work, the optimal
values of
the above parameters are chosen to obtain an efficiency of 99.5% with a very short audio file
length.

Figure 2. Process model for extracting MFCCs from an audio speech

a) Pre-emphasis: a first-order FIR filter (a single coefficient) is normally used as the
pre-emphasis filter.
b) Framing: frames are generally 20-30 ms long with an overlap of 10-15 ms.
c) Windowing: a Hamming or Hanning window function can be used.
d) DFT: converts each frame of N time-domain samples into the frequency domain.
e) Mel filtering: the magnitude frequency response of each filter is triangular, equal to unity at
its center frequency and decreasing linearly to zero at the center frequencies of the two adjacent
filters.
f) DCT: converts the log Mel spectrum back into the time domain using the DCT. The result of the
conversion is called the Mel frequency cepstral coefficients. The set of coefficients is called an
acoustic vector. Therefore, each input utterance is transformed into a sequence of acoustic
vectors [4].

The work in [5] focuses on improving the performance of a speech recognition algorithm by
integrating digital signal processing with speech recognition techniques. The approach uses a
Butterworth band-stop filter and speech compression based on the discrete cosine transform with
inverse wave transformation. The main objective is to integrate the filter with the speech
recognition algorithm to improve the results when noise is present in the signal. Matching is
performed using inverse wave transformations, which reduces the recognition time. The proposed
algorithm is designed and implemented in MATLAB. It was tested on the supplied samples and evaluated
using different recognizable and unrecognizable samples, achieving a recognition rate of
approximately 98%. The authors report that the algorithm gives better results than existing
techniques and increases the accuracy of the speech recognition system. The goal of the method is to
identify the speaker from previously recorded wave samples, with the main focus on accuracy and
speed.

Fig. 3. Flow chart for the speech recognition algorithm
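The MFCC extraction steps (a) to (f) listed earlier in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in [4] or [5]. The function names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`, `mfcc`), the pre-emphasis coefficient 0.97, the 25 ms frame with a 10 ms hop (15 ms overlap), the 26 filters and the 13 retained coefficients are all values assumed here for illustration; the source only gives the ranges quoted in steps (a) to (f). The mel-scale formula is the commonly used one and is likewise an assumption.

```python
import numpy as np
from scipy.fft import dct


def hz_to_mel(f):
    # Common mel-scale formula (assumed; the source does not specify one).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Step (e): triangular filters, unity at the centre, zero at the neighbouring centres."""
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    edges_hz = mel_to_hz(mel_edges)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, centre, right = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (freqs - left) / (centre - left)
        falling = (right - freqs) / (right - centre)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def mfcc(signal, sample_rate, frame_ms=25.0, hop_ms=10.0,
         n_filters=26, n_ceps=13, pre_emph=0.97):
    """Return an array of shape (n_frames, n_ceps); the signal must cover at least one frame."""
    x = np.asarray(signal, dtype=float)

    # (a) Pre-emphasis: first-order FIR filter y[n] = x[n] - a * x[n-1].
    emphasized = np.append(x[0], x[1:] - pre_emph * x[:-1])

    # (b) Framing: overlapping frames (25 ms long, 10 ms hop by default).
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx]

    # (c) Windowing: Hamming window applied to every frame.
    frames = frames * np.hamming(frame_len)

    # (d) DFT: power spectrum of each frame, zero-padded to a power of two.
    n_fft = 1 << (frame_len - 1).bit_length()
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft

    # (e) Mel filtering followed by the logarithm.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)  # small offset avoids log(0)

    # (f) DCT of the log mel spectrum; keep the first n_ceps coefficients.
    return dct(log_energies, type=2, axis=1, norm="ortho")[:, :n_ceps]


# Usage on a synthetic one-second tone sampled at 16 kHz (not real speech).
t = np.arange(16000) / 16000.0
features = mfcc(np.sin(2.0 * np.pi * 440.0 * t), 16000)
print(features.shape)
```

With a 16 kHz signal these settings give 400-sample frames and a 512-point FFT. The number of filters and the filter spacing are among the parameters that this section identifies as influencing MFCC performance, which is why the sketch exposes them as arguments rather than fixing them.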

Speech compression based on the discrete cosine transform (DCT) is used to reduce the size of the
speech data. It speeds up the system by eliminating redundancy in the audio information. Compression
is the process of eliminating redundancy and duplication. The DCT is very commonly used for encoding
video and audio tracks on computers [5].

Fig. 4. Feature extraction methods
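The DCT-based compression described above can be illustrated with a short Python sketch. It is only one plausible scheme, assumed here for illustration: each frame is transformed with an orthonormal DCT, the `keep` largest-magnitude coefficients are stored with their positions, and the frame is rebuilt with the inverse DCT. The names `compress_frame` and `decompress_frame`, the frame length of 256 and the synthetic test frame are hypothetical; the source does not say how [5] selects or stores coefficients.

```python
import numpy as np
from scipy.fft import dct, idct


def compress_frame(frame, keep):
    """Return the positions and values of the `keep` largest-magnitude DCT coefficients."""
    coeffs = dct(np.asarray(frame, dtype=float), type=2, norm="ortho")
    order = np.argsort(np.abs(coeffs))[::-1]   # largest magnitude first
    kept_idx = np.sort(order[:keep])
    return kept_idx, coeffs[kept_idx]


def decompress_frame(kept_idx, kept_values, frame_len):
    """Rebuild a frame from the stored coefficients; discarded ones are treated as zero."""
    coeffs = np.zeros(frame_len)
    coeffs[kept_idx] = kept_values
    return idct(coeffs, type=2, norm="ortho")


# Usage on a synthetic decaying tone (not real speech).
n = np.arange(256)
frame = np.cos(2.0 * np.pi * 5.3 * n / 256.0) * np.exp(-n / 200.0)
idx, vals = compress_frame(frame, keep=32)
rebuilt = decompress_frame(idx, vals, len(frame))
relative_error = np.sum((frame - rebuilt) ** 2) / np.sum(frame ** 2)
print(f"kept {len(idx)} of {len(frame)} coefficients, relative error {relative_error:.2e}")
```

Because the transform is orthonormal, the squared reconstruction error equals the energy of the discarded coefficients, so the better the energy compaction, the fewer coefficients are needed for a given error.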

VI. 2D DCT:

The method proposed in [6] uses coefficients extracted from the two-dimensional discrete cosine
transform (2D-DCT) of log Mel filter bank energies to improve speaker recognition over the
traditional Mel frequency cepstral coefficients (MFCC) with appended deltas and double deltas
(MFCC/deltas). Selecting the relevant coefficients proved to be crucial, which led to the proposal
of a zigzag parsing strategy. Although the 2D-DCT coefficients gave significant gains over
MFCC/deltas, the parsing strategy remains sensitive to the number of filter bank outputs and to the
size of the analysis window. The work analyzes this sensitivity and proposes two new data-driven
methods of using the DCT coefficients for speaker recognition: rankDCT and pcaDCT. The first,
rankDCT, is an automatic coefficient selection strategy based on the highest average intra-frame
energy rank. The alternative method, pcaDCT, avoids the need for selection and instead projects the
DCT coefficients onto the desired dimensionality via principal component analysis (PCA). All
features, including MFCC/deltas, are tuned on a subset of the PRISM database to subsequently
highlight the parameter sensitivity of each feature. Evaluated on the recent NIST SRE'12 corpus,
pcaDCT consistently outperforms both rankDCT and zzDCT features and offers an average relative
improvement of 20% over MFCC/deltas across all conditions [6].

Another approach is a 2D DCT-based method for compressing acoustic features for remote speech
recognition applications. The coding scheme involves computing a 2D DCT on blocks of feature
vectors, followed by uniform scalar quantization, run-length coding and Huffman coding. Digit
recognition experiments were conducted in which training used unquantized cepstral features from
clean speech and testing used the same features after encoding and decoding with the 2D DCT and
entropy coding, at various noise levels. The coding scheme yields recognition performance comparable
to that obtained with unquantized features at low bit rates. 2D DCT coding of MFCCs, together with a
variable frame rate analysis method [Zhu and Alwan, 2000] and peak isolation [Strope and Alwan,
1997], maintains the noise robustness of these algorithms at low SNRs even at 624 bps.

Figure 5. Block diagram of the DCT and entropy encoder.

At the client, the input is first segmented into frames, features are computed for each frame, and
blocks of features are then formed. A 2D DCT is then performed on each block and the components with
the lowest energy are set to zero. This is followed by scalar quantization, run-length coding and
Huffman coding. A block diagram of the encoder is shown in Figure 5. In the receiver,
decoding and the IDCT are performed, and the feature vectors corresponding to each frame are passed
to the ASR system. Only the feature vectors are encoded and sent to the recognition server; the
first and second derivatives are computed at the server from the decoded features [7].

VII. BDCT Method:

Robust speech recognition has become an important area of research in recent years. Multi-band
features can be combined in different ways to perform the speech recognition task. For multi-band
feature extraction, a block discrete cosine transform (BDCT) is proposed, whose transform kernel
matrix is derived from a decomposition of the discrete cosine transform (DCT) kernel. The BDCT is
shown to approximate the DCT in preserving the correlation information of a sequence. When the BDCT
is applied to the Mel-frequency filter bank energies (FBE) in place of the DCT to convert them into
cepstral coefficients, a new type of MFCC is produced [8].

VIII. Conclusion

This paper briefly explains different DCT-based methods used for speech recognition. They show that
the DCT is well suited to noise reduction and that its energy compaction property can improve both
speed and recognition rate.

References:

[1] Ing Yann Soon, Soo Ngee Koh, Chai Kiat Yeo, “Noisy speech enhancement using discrete cosine
transform,” Elsevier, Speech Communication, pp. 249-257, 1998.
[2] Washington Silva and Ginalber Serra, “An Intelligent System Based on Discrete Cosine Transform
for Speech Recognition,” ResearchGate, IBERAMIA, LNAI 7637, pp. 320-329, November 2012.
[3] Muhammad Safder Shafi, Mansoor Khan, “Transform Based Speech Enhancement Using DCT Based MMSE
Filter, & Its Comparison With DFT Filter,” Journal of Space Technology, Vol. 1, No. 1, pp. 47-52,
July 2012.
[4] Garima Vyas, Barkha Kumari, “Speaker Recognition System Based On MFCC and DCT,” International
Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958, Volume-2, Issue-5,
pp. 167-169, June 2013.
[5] Sukhdeep Kaur, Er. Gurwinder Kaur, “Enhancement of Speech Recognition Algorithm Using DCT and
Inverse Wave Transformation,” International Journal of Engineering Research and Applications,
ISSN: 2248-9622, Vol. 3, Issue 6, pp. 749-754, Nov-Dec 2013.
[6] Mitchell McLaren, Yun Lei, “Improved Speaker Recognition Using DCT Coefficients as Features,”
IEEE International Conference on Acoustics, Speech and Signal Processing, 978-1-4673-6997-8,
pp. 4430-4434, April 2015.
[7] Qifeng Zhu and Abeer Alwan, “An Efficient And Scalable 2D DCT-Based Feature Coding Scheme For
Remote Speech Recognition,” IEEE International Conference on Acoustics, Speech, and Signal
Processing, ISSN 1520-6149, May 2011.
[8] Suman K. Saksamudre, R. R. Deshmukh, “Comparative Study of Isolated Word Recognition System for
Hindi Language,” International Journal of Engineering Research & Technology (IJERT),
ISSN: 2278-0181, Vol. 4 Issue 07, July-2015.
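As a supplementary illustration, the energy compaction property cited in Section II and in the Conclusion can be checked numerically. The Python sketch below measures what fraction of a frame's energy is held by its `k` largest-magnitude coefficients under an orthonormal DCT and under an orthonormal DFT. The function names `topk_energy_fraction` and `compaction` are hypothetical, and the test frame is a linear ramp, chosen only because it is a simple non-periodic signal whose result can be verified by hand. It is not speech data, so real speech frames should be substituted before drawing conclusions about speech.

```python
import numpy as np
from scipy.fft import dct


def topk_energy_fraction(coeffs, k):
    """Fraction of the total energy held by the k largest-magnitude coefficients."""
    energy = np.sort(np.abs(coeffs) ** 2)[::-1]
    return float(energy[:k].sum() / energy.sum())


def compaction(frame, k):
    """Return (DCT fraction, DFT fraction) for the same frame and the same k."""
    x = np.asarray(frame, dtype=float)
    dct_coeffs = dct(x, type=2, norm="ortho")   # N real coefficients
    dft_coeffs = np.fft.fft(x, norm="ortho")    # N complex coefficients
    return topk_energy_fraction(dct_coeffs, k), topk_energy_fraction(dft_coeffs, k)


# Usage on a linear ramp (an assumed test frame, not speech).
ramp = np.linspace(0.0, 1.0, 256)
dct_frac, dft_frac = compaction(ramp, k=3)
print(f"energy in top 3 coefficients: DCT {dct_frac:.4f}, DFT {dft_frac:.4f}")
```

For this ramp the DCT fraction comes out higher than the DFT fraction. The DFT treats the frame as periodic and therefore sees a jump at the frame boundary, whereas the DCT corresponds to an even extension of the frame that has no such jump.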
