SPECTRAL ENTROPY AS SPEECH FEATURES FOR SPEECH RECOGNITION


Aik Ming Toh, School of Electrical, Electronic, and Computer Engineering, The University of Western Australia
Roberto Togneri, School of Electrical, Electronic, and Computer Engineering, The University of Western Australia
Sven Nordholm, Western Australian Telecommunications Research Institute

Abstract— This paper presents an investigation of spectral entropy features, used for voice activity detection, in the context of speech recognition. Entropy is a measure of disorganization and can be used to measure the peakiness of a distribution. We compute the entropy features from the short-time Fourier transform spectrum, normalized as a PMF. The concept of entropy implies that voiced regions of speech have lower entropy since they contain clear formants, whereas the flat distribution of silence or noise induces high entropy values. In this paper, we investigate the use of entropy as a speech feature for speech recognition. We evaluate different sub-band spectral entropy features on the TI-DIGIT database. We have also explored the use of multi-band entropy features to create higher dimensional entropy features. Furthermore, we append the entropy features to baseline MFCC 0 and evaluate them in clean, additive babble noise and reverberant environments. The results show that entropy features improve the baseline performance and robustness in additive noise.

I. INTRODUCTION

Speech recognition systems typically use speech features based on the short-term spectrum of the speech signal. The state-of-the-art feature used in most speech recognizers is the Mel-frequency cepstral coefficients (MFCC), with enhancements such as regression features and normalization strategies. Other speech features such as perceptual linear prediction (PLP) and its variant RASTA [1] are also popular. An additional type of feature, based on entropy, has recently emerged in the context of speech recognition. It is also known as the Wiener entropy since it measures the spectral flatness of the power spectrum. Misra proposed the use of entropy features as speech features for speech recognition [2].

Entropy is usually used in the context of pattern classification and information theory. Entropy was originally defined for information sources by Shannon [3]. It is a measure of disorganization or uncertainty in a random variable; the information content of an outcome can be interpreted as the negative logarithm of its probability.

The application of the entropy concept to speech recognition is based on the assumption that the speech spectrum is more organized during speech segments than during noise segments. In addition, the spectral peaks of the spectrum are supposed to be more robust to noise. Thus a voiced region of speech would induce low entropy since there are clear formants in the region, whereas the spectra of noise or unvoiced regions would have a flatter distribution and thus higher entropy. This concept has enabled entropy to be considered for voice activity detection [4] and speech recognition [2].

In this paper we investigate the entropy feature for its performance in speech recognition. Misra evaluated entropy features for phoneme recognition; we extend the entropy features to connected digit recognition on the TI-DIGIT database and study their robustness in noisy environments. In addition, we append the entropy features to MFCC 0 features rather than the PLP features used in [5]. Our experiments in [6] showed that MFCC 0 outperformed the PLP features in both recognition performance and robustness, and we want to determine the contribution of entropy features on top of this state-of-the-art feature. We have also generated multi-band entropy features from smaller sub-band entropy features. Furthermore, we evaluate the entropy features with MFCC 0 in additive babble noise and reverberant noise for robustness.

The paper is organized as follows: Section II presents an overview of the spectral entropy features and their derivation, Section III describes the sub-band entropy features, Section IV explains the experimental setup, and Section V reports the results. The final section presents the conclusions of the work.

II. SPECTRAL ENTROPY

Entropy has been used to detect silence and voiced regions of speech in voice activity detection. The discriminatory property of this feature gives rise to its use in speech recognition. Entropy can be used to capture the formants or the peakiness of a distribution. Formants and their locations have been considered important for speech tracking; thus, the peak-capturing ability of entropy was employed for speech recognition.

We converted the spectrum into a probability mass function (PMF) by normalizing the spectrum in each sub-band. Misra also suggested computing the entropy from the full-band normalized spectrum [5]. Equation (1) is used for sub-band normalization:

    x_i = X_i / ( Σ_{j=1}^{N} X_j ),   for i = 1 to N        (1)

where X_i represents the energy of the i-th frequency component of the spectrum and x_i is the resulting PMF of the spectrum.
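As a minimal sketch (assuming numpy), the PMF normalization of Eq. (1) followed by the Shannon entropy computation of Eq. (2) for one analysis frame can be written as:

```python
import numpy as np

def spectral_entropy(power_spectrum):
    """Full-band spectral entropy of one STFT frame.

    Normalizes the spectral energies into a PMF (Eq. 1) and
    computes the Shannon entropy in bits (Eq. 2)."""
    x = np.asarray(power_spectrum, dtype=float)
    pmf = x / x.sum()                    # Eq. (1): x_i = X_i / sum_j X_j
    pmf = pmf[pmf > 0]                   # convention: 0 * log2(0) = 0
    return -np.sum(pmf * np.log2(pmf))   # Eq. (2)
```

A flat, noise-like spectrum of N points attains the maximum entropy of log2(N) bits, while a peaky spectrum with clear formants scores lower, which is the discriminative behavior the paper relies on.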
Fig. 1. The entropy contour of the connected digit utterance "1-9-8-6" in clean conditions (dashed line) and corrupted with babble noise at SNR 10dB (solid line).

Fig. 2. The entropy contour of the digit utterance "1-9-8-6" in clean conditions (dashed line) and corrupted speech with RT 0.2s (solid line).

The area under the normalized spectrum in each sub-band should sum to 1. The normalized spectra were treated as a PMF and used for the entropy computation. The entropy was computed with equation (2):

    H(X) = − Σ_{i=1}^{N} x_i · log2(x_i)        (2)

Figure 1 shows the full-band entropy contour of the speech utterance "1-9-8-6". The dashed line illustrates the reference, i.e. the entropy contour of the clean utterance, while the solid line represents the entropy contour of the utterance corrupted by babble noise at SNR 10dB. The figure shows that the entropy feature managed to track most of the formants, represented by the troughs of the contour, even at the low SNR of 10dB. The beginning and end of the entropy contour show that the entropy of the distribution increases as the level of noise increases. We have also compared the full-band contours of the utterance under babble noise at 20dB and 10dB. The plots revealed that the locations of the peaks remained largely in place and the noise-region contours were almost identical. The entropy features were thus able to discriminate the speech region from the noise region.

We have also carried out the same analysis for reverberant noise, and initial results on slight reverberation indicate that spectral entropy may not be suitable for speech recognition in reverberant conditions. Figure 2 illustrates the contours of the clean utterance (dashed line) and the utterance under reverberant influence of RT 0.2s. The contour of the reverberant utterance has been shifted. Reverberation has been shown to introduce a temporal smearing effect [6]. The formants were displaced by this temporal smearing, which is not ideal for speech recognition.

III. SUB-BANDS ENTROPY

The full-band entropy captures the gross peakiness of the spectrum. We partitioned the STFT spectrum into sub-bands for improved resolution. The distribution is separated into K non-overlapping sub-bands. We refer to these as K sub-band entropy features, whereas Misra called them multi-band entropy. Owing to the number of points in the STFT spectrum, we divided the distribution into sub-bands of equal size, with the remainder allocated to the last sub-band. The K values of interest were the full-band, 2, 3, 4, 5, 6, 8, 12 and 13 sub-bands. The performance of the 12 and 13 sub-band entropy features will be compared against the baseline MFCC and MFCC 0 features. Initially we only proposed to compute entropy features up to 13 sub-bands, but Misra [5] reports that 16, 24 and 32 sub-bands yield surprisingly good performance, so we decided to investigate those sub-bands as well.

We also appended the sub-band entropy features together to create larger dimensional entropy features. Misra [2] only evaluated the performance of a 15-dimensional entropy feature computed from the full-band, 2, 3, 4 and 5 sub-band entropy features. We computed several multi-dimensional entropy features to show the contribution of smaller sub-bands in comparison with conventional entropy features of the same size.

IV. EXPERIMENTAL SETUP

The TI-DIGIT corpus was used in the speech recognition experiments. The database comprises both isolated and connected digit utterances. The training data contained utterances of 24 male and 24 female speakers; there were 8 male and 8 female speakers in the testing data. Each subset was composed of 77 digit utterances.

The babble noise from the NOISEX 92 database was used to corrupt the testing data for evaluation in noisy environments. The babble noise represents a real-world non-stationary environmental noise. The test data was corrupted with babble noise at five different signal-to-noise ratios (SNRs) ranging from 0dB to 40dB at intervals of 10dB.

Reverberant effects were captured by estimating the impulse response of the room environment from long segments of speech. The experiment used a room impulse response designed to match the characteristics of a room 2.2m high, 3.1m wide and 3.5m long. The microphone and the speakers were located 0.5m from the walls at opposite ends. The speech was convolved with the RT60 room impulse response.
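The sub-band partitioning described above (equal-width sub-bands, with the remainder of the STFT points allocated to the last sub-band, each normalized into its own PMF) can be sketched as follows, assuming numpy:

```python
import numpy as np

def subband_entropies(power_spectrum, k):
    """Entropy of each of k non-overlapping sub-bands of one frame.

    The spectrum is split into k equal-width sub-bands, with any
    remainder points allocated to the last sub-band; each sub-band
    is normalized into its own PMF before the entropy is computed."""
    x = np.asarray(power_spectrum, dtype=float)
    n, width = len(x), len(x) // k
    feats = []
    for b in range(k):
        band = x[b * width:(b + 1) * width if b < k - 1 else n]
        pmf = band / band.sum()
        pmf = pmf[pmf > 0]                      # 0 * log2(0) = 0
        feats.append(-np.sum(pmf * np.log2(pmf)))
    return np.array(feats)
```

With the paper's 257-point STFT spectrum and k = 4, for example, the first three sub-bands hold 64 points each and the last holds 65.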
TABLE I
WORD ERROR RATES FOR SUB-BAND ENTROPY FEATURES

    Entropy        WER (%)
    Full-band      46.96
    2 sub-bands    39.52
    3 sub-bands    34.98
    4 sub-bands    31.87
    5 sub-bands    32.96
    6 sub-bands    25.67
    8 sub-bands    24.31
    12 sub-bands   23.50
    13 sub-bands   24.78

TABLE II
WORD ERROR RATES FOR DIMENSIONAL ENTROPY FEATURES

    DIM Entropy              WER (%)
    12 sub-bands (3,4,5)     21.31
    12 sub-bands (2,4,6)     18.98
    13 sub-bands (1,3,4,5)   18.96
    13 sub-bands (1,4,8)     15.79
    13 sub-bands (1,2,4,6)   17.87
    14 sub-bands (2,4,8)     16.49
    15 sub-bands (1,2,3,4,5) 20.54
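The dimensional entropy features in Table II are concatenations of several sub-band partitions; for example, "13 sub-bands (1,4,8)" stacks the full-band, 4-band and 8-band entropies into one 13-dimensional vector. A minimal sketch, assuming numpy; `subband_entropies` is a hypothetical helper implementing the equal-width partition with the remainder in the last band:

```python
import numpy as np

def subband_entropies(spectrum, k):
    """Entropy of each of k equal sub-bands (remainder in the last)."""
    x = np.asarray(spectrum, dtype=float)
    width = len(x) // k
    out = []
    for b in range(k):
        band = x[b * width:(b + 1) * width if b < k - 1 else len(x)]
        p = band / band.sum()
        p = p[p > 0]
        out.append(-np.sum(p * np.log2(p)))
    return np.array(out)

def dimensional_entropy(spectrum, partitions=(1, 4, 8)):
    """Stack several sub-band partitions into one feature vector;
    (1, 4, 8) gives a 1 + 4 + 8 = 13 dimensional feature."""
    return np.concatenate([subband_entropies(spectrum, k) for k in partitions])
```

Changing `partitions` to, e.g., (2, 4, 8) or (1, 2, 3, 4, 5) reproduces the 14- and 15-dimensional variants listed in the table.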

The number of filter coefficients was adjusted according to the reverberation time.

We chose MFCC 0 features as the baseline because of their broad use in speech research and their adoption as the state-of-the-art feature in speech recognition systems. In our previous study [6], we showed that the PLP features fell short in both recognition performance and robustness, so we adopted MFCC 0 as the baseline feature. All speech files were pre-emphasized and windowed with a Hamming window. The speech signal was analyzed every 10ms with a frame width of 25ms. The number of points of the STFT spectrum is 257. A Mel-scale triangular filterbank with 26 filterbank channels was used to generate the Mel-frequency cepstral coefficient (MFCC) features. The MFCC 0 coefficients comprise the 12 static MFCC coefficients and the zeroth cepstral coefficient. The HMM model used 15 states and 5 mixtures for the connected digit recognition. We did not use any penalty factor to optimize the recognition accuracy in these experiments.

V. EXPERIMENT RESULTS

A. Subband Entropy

Table I shows the word error rates (WERs) for spectral entropy features up to 13 sub-bands. We conjectured that entropy features with more than 5 sub-bands would contribute more to the recognition performance, which was not shown in [2]. We could observe the contribution of entropy features with better resolution: the WERs decreased as the dimension of the entropy increased, as shown in Table I. However, the 12 and 13 sub-band entropy features did not outperform the baseline MFCC and MFCC 0, whose WERs were just 2.97% and 2.05% respectively.

Misra extended their entropy computations to 32 sub-bands in [5], reporting that 16-band, 24-band and 32-band entropy features gave WERs of 15% to 18%. Our results give a different perspective, as the WERs for more than 16 sub-bands were typically above 25%. This raises the issue of redundancy in excessive sub-band entropy computation.

B. Dimensional Entropy

Table II displays the WERs for dimensional entropy computed from smaller sub-bands. The dimensional entropy features performed better than the conventional sub-band entropy features. We obtained an improvement of about 5% to 9% for the 13-dimensional entropy compared to the 13 sub-band entropy features. The 14- and 15-dimensional entropy features also outperformed the conventional sub-band features, by 10.5% and 9.7% respectively; the WERs for the conventional 14 and 15 sub-band entropy features were 27.03% and 30.30% respectively.

Our primary aim in investigating the 12- and 13-dimensional sub-bands was to evaluate their performance against the baseline MFCC and MFCC 0 features. The results showed that the dimensional entropy features were still unable to match the performance of the baseline feature, indicating that spectral entropy features are not suitable as baseline speech features. We therefore decided to use them as additional features, appending them to our baseline MFCC 0 feature for speech recognition.

C. MFCC and Subband Entropy

Since the entropy features were not competitive with the baseline MFCC 0 features on their own, we appended them to the baseline to assess their performance as additional features. The entropy features did improve the baseline recognition accuracy slightly: most of the entropy features reduced the WERs to less than 2.00% in the clean environment.

We also appended the dimensional entropy features to the baseline MFCC 0. The results did not show much contribution from the dimensional entropy features, and in some cases they caused slight degradation. The use of dimensional entropy features should enhance the performance of the baseline, but the experimental results showed otherwise.

D. Spectral Entropy in Noisy Environments

We then performed speech recognition with MFCC 0 and entropy features in additive babble noise. The NOISEX 92 noise characterizes the background noise of a real-world environment. The baseline MFCC 0 results from [6] were used as the benchmark. Table III shows the WERs for MFCC 0 appended with entropy features in the additive noise environment.
TABLE III
WORD ERROR RATES (%) FOR MFCC 0 WITH ENTROPY FEATURES IN ADDITIVE BABBLE NOISE

    Feature              Clean   SNR 40  SNR 30  SNR 20  SNR 10  SNR 0
    MFCC 0               2.05    2.33    3.09    17.97   59.93   90.20
    MFCC 0 + Full-band   2.03    2.15    4.98    29.73   64.43   91.83
    MFCC 0 + 2           1.76    1.98    4.41    22.85   55.00   86.83
    MFCC 0 + 3           1.76    1.81    4.53    24.21   61.34   93.14
    MFCC 0 + 4           1.93    2.08    3.34    17.48   51.68   88.42
    MFCC 0 + 5           1.88    2.00    3.54    19.83   59.38   94.41
    MFCC 0 + 6           1.76    1.81    3.42    17.95   52.52   89.48
    MFCC 0 + 8           2.08    2.10    2.97    19.18   55.17   94.90
    MFCC 0 + 12          1.91    2.00    2.45    14.73   51.31   88.25
    MFCC 0 + 13          2.13    2.13    2.72    13.42   50.22   88.04
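Appending entropy features to the baseline, as in Table III, amounts to frame-wise concatenation of the two feature streams before HMM training. A minimal numpy sketch with hypothetical array names (`mfcc0` of shape frames x 13, `entropies` of shape frames x K):

```python
import numpy as np

def append_features(mfcc0, entropies):
    """Frame-wise concatenation of baseline MFCC_0 vectors (rows =
    frames) with sub-band entropy features for the same frames."""
    mfcc0, entropies = np.atleast_2d(mfcc0), np.atleast_2d(entropies)
    assert mfcc0.shape[0] == entropies.shape[0], "frame counts must match"
    return np.hstack([mfcc0, entropies])
```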

All the spectral entropy features contributed to robustness, as shown by the performance at SNR 40dB. Moreover, the entropy features with more sub-bands performed better than those with fewer sub-bands as the SNR decreased. The contribution of sub-band entropy features was most evident for the 12 and 13 sub-band entropies, which showed robustness in additive noise across the different SNR levels. Other sub-bands, such as 2, 4 and 6, were also robust to additive babble noise. The results in Table III demonstrate that MFCC 0 with higher sub-band entropy features outperformed the baseline MFCC 0 features across the different levels of additive noise.

We have also evaluated speech recognition with MFCC 0 and spectral entropy features in reverberant environments. Table IV records the WERs for speech recognition with MFCC 0 and entropy features in reverberant environments. Preliminary results in light reverberation of RT 0.1s and RT 0.2s did not show any significant contribution or robustness from the spectral entropy features. The WERs for the baseline MFCC 0 in reverberant conditions were 4.48% and 7.40% for RT60 of 0.1s and 0.2s respectively. The performance of the entropy features deteriorated greatly as the reverberation level increased.

TABLE IV
WORD ERROR RATES (%) FOR MFCC 0 AND ENTROPY FEATURES IN REVERBERANT ENVIRONMENTS

    Entropy        RT 0.1s   RT 0.2s
    Full-band      3.56      8.89
    2 sub-bands    3.96      11.01
    3 sub-bands    3.42      9.95
    4 sub-bands    4.43      10.82
    5 sub-bands    4.23      9.13
    6 sub-bands    4.41      9.18
    8 sub-bands    5.07      9.53
    12 sub-bands   5.15      8.89
    13 sub-bands   5.22      8.96

One reason is the weakness of spectral entropy in capturing shifted spectral peaks. Figure 2 illustrates the effects of reverberation on the spectral entropy contour. The temporal smearing effect induced by reverberation shifted the formants and the entropy distribution. Thus, spectral entropy failed to perform for reverberant speech recognition.

VI. CONCLUSION

Spectral entropy features have been adopted in speech activity detection and speech recognition. We have investigated the use of spectral entropy as speech features and evaluated them on the TI-DIGIT connected digit database in clean and noisy environments. The spectral entropy features with better resolution, such as 12 and 13 sub-bands, performed better than the other sub-bands. The spectral entropy features alone, however, were not able to surpass the performance of cepstral features such as the baseline MFCC 0. Used as additional features, spectral entropy showed improvements in recognition accuracy and robustness against additive noise when compared with the baseline MFCC 0 features.

Both the analysis and the results showed that entropy features were less affected by additive noise such as babble noise; the entropy contours also demonstrated that the formants were less affected by noise. However, spectral entropy features were not suitable for speech recognition in reverberant environments: both temporal smearing and shifted formants contributed to the poor performance of spectral entropy there.

REFERENCES

[1] H. Hermansky and N. Morgan, "RASTA processing of speech," IEEE Trans. Speech and Audio Processing, vol. 2, no. 4, Oct. 1994.
[2] H. Misra, S. Ikbal, H. Bourlard, and H. Hermansky, "Spectral entropy based feature for robust ASR," in Proc. ICASSP, May 2004, pp. 193-196.
[3] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, July, Oct. 1948.
[4] P. Renevey and A. Drygajlo, "Entropy based voice activity detection in very noisy conditions," in Proc. Eurospeech, USA, Sept. 2001, pp. 1887-1890.
[5] H. Misra, S. Ikbal, S. Sivadas, and H. Bourlard, "Multi-resolution spectral entropy feature for robust ASR," in Proc. ICASSP, March 2005, pp. 253-256.
[6] A. M. Toh, R. Togneri, and S. Nordholm, "Investigation of robust features for speech recognition in hostile environments," in Proc. APCC, 2005.
