Classification of Reverberant Acoustic Situations

Jens Schröder¹, Thomas Rohdenburg², Volker Hohmann¹, Stephan D. Ewert¹

¹ University of Oldenburg, Institute of Physics, Medical Physics, Germany,
Email: [email protected]; [email protected]
² Fraunhofer Institute for Digital Media Technology, 26129 Oldenburg, Germany

Introduction
In daily communication, speech intelligibility depends on the acoustic surroundings, or acoustic situation. Particularly for hearing-impaired persons, speech understanding is often problematic if speech is distorted by (room) reverberation, noise or competing talkers. Acoustic situations are characterized by different dominating types of distortion. Hearing aids might provide appropriate algorithms to enhance speech intelligibility in the different acoustic situations. A robust and fast automatic classification of the acoustic situation should therefore select the appropriate hearing aid algorithm without requiring any action by the hearing aid wearer. This study is concerned with the automatic estimation of the reverberation time (T60) in natural situations and with an unknown excitation signal.
Acoustic test situations were generated by convolving speech signals with artificial and real room impulse responses with T60 times ranging from 0.05 to 4 s. Features derived from the cepstral mean, the autocorrelation function and the distribution of modulation energy were used to blindly estimate different reverberation times.

Impulse Response Model

In natural environments, sound is often received as a superposition of direct sound and sound reflected from walls or objects in a room. The direct sound and the early reflections arrive first at the ear; after multiple reflections and superpositions of reflections from many objects, the sound becomes diffuse and is called reverberation. The level of the reflections in a room impulse response decreases due to attenuation effects and scattering. This decay is often assumed to be nearly exponential ([1], [6]). Thus, a reverberant impulse response can be approximated by an exponentially decaying part and, if measurement noise or background noise is present, by a constant part [1]:

$$h(t) = A_{exp}\, e^{-t/\tau}\, n_1(t) + A_{noise}\, n_2(t), \tag{1}$$

where A_exp and A_noise are scalars, τ is the decay parameter in seconds, t is the time in seconds, and n_1(t) and n_2(t) represent two independent noise processes.
A common measure of reverberation is the time until the impulse response has decreased by 60 dB. In (1), the reverberation time, T60, can be calculated directly from the decay parameter τ:

$$T_{60} = \ln(10^3)\,\tau \approx 6.908\,\tau. \tag{2}$$
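As a minimal sketch of equations (1) and (2), the following Python fragment converts between τ and T60 and synthesizes an impulse response of the assumed form. The function names, the choice of Gaussian white noise for n_1 and n_2, and all default values are illustrative assumptions, not taken from the study.

```python
import math
import random

def t60_from_tau(tau):
    """Equation (2): time for the envelope exp(-t/tau) to fall by 60 dB."""
    return math.log(1e3) * tau

def tau_from_t60(t60):
    """Inverse of equation (2)."""
    return t60 / math.log(1e3)

def synth_impulse_response(t60, fs=16000, duration=1.0,
                           a_exp=1.0, a_noise=1e-3, seed=0):
    """Equation (1), with n1 and n2 drawn as independent Gaussian
    white noise (an assumption made for this sketch)."""
    rng = random.Random(seed)
    tau = tau_from_t60(t60)
    n = int(duration * fs)
    return [a_exp * math.exp(-(i / fs) / tau) * rng.gauss(0, 1)
            + a_noise * rng.gauss(0, 1)
            for i in range(n)]

h = synth_impulse_response(t60=0.4, duration=0.5)
print(round(t60_from_tau(0.1), 4))  # 0.6908
```

The factor 6.908 follows from requiring the amplitude envelope to drop to 10^-3 of its initial value, which corresponds to 60 dB.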

Different approaches exist for calculating the T60 time from a measured impulse response. A procedure suggested in [1] is to fit the power by a least-squares fit. From equation (1), the instantaneous power can be derived as

$$a(t) = \sqrt{A_{exp}^2\, e^{-2t/\tau} + A_{noise}^2}. \tag{3}$$

The parameters A_exp, τ and A_noise are evaluated by a least-squares fit of the form

$$\min \int \left[a^s(t) - y^s(t)\right]^2 dt, \tag{4}$$

where s = 0.5 is a scaling factor to improve the results [1]. The T60 time was then calculated as stated in (2).
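The fit in (3) and (4) can be sketched as follows, under two assumptions that the text does not state: y(t) is taken to be the measured decay envelope, and a coarse grid search stands in for whatever nonlinear solver was actually used. For each candidate pair (τ, r = A_noise / A_exp) the best amplitude follows in closed form, because a(t)^s is linear in A_exp^s. The function name `fit_decay` and the grid ranges are hypothetical.

```python
import math

def fit_decay(y, fs, s=0.5, n_tau=60, n_ratio=17):
    """Grid-search least-squares fit of a(t) from (3) to an envelope y,
    minimizing the discrete form of (4). Returns (A_exp, tau, A_noise)."""
    ys = [v ** s for v in y]
    t = [i / fs for i in range(len(y))]
    best = None
    for i in range(n_tau):
        tau = 0.005 * 200.0 ** (i / (n_tau - 1))      # 5 ms to 1 s, log-spaced
        for j in range(n_ratio):
            r = 10.0 ** (-4 + 4 * j / (n_ratio - 1))  # 1e-4 to 1, log-spaced
            # Shape of a(t)^s up to the factor c = A_exp^s.
            g = [(math.exp(-2 * ti / tau) + r * r) ** (s / 2) for ti in t]
            c = sum(gi * yi for gi, yi in zip(g, ys)) / sum(gi * gi for gi in g)
            err = sum((c * gi - yi) ** 2 for gi, yi in zip(g, ys))
            if best is None or err < best[0]:
                a_exp = c ** (1 / s)
                best = (err, a_exp, tau, r * a_exp)
    _, a_exp, tau, a_noise = best
    return a_exp, tau, a_noise

# Noise-free synthetic envelope with known parameters (illustrative values).
fs = 200
true_tau, true_a, true_noise = 0.1, 1.0, 0.01
y = [math.sqrt(true_a ** 2 * math.exp(-2 * (i / fs) / true_tau) + true_noise ** 2)
     for i in range(fs)]
a_exp, tau, a_noise = fit_decay(y, fs)
t60 = math.log(1e3) * tau  # equation (2)
print(f"tau = {tau:.3f} s, T60 = {t60:.2f} s")
```

The accuracy of this sketch is limited by the grid spacing; a gradient-based least-squares routine would refine the result further.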

Blind Estimation Procedures

The goal of this study is to estimate the T60 time from a reverberated speech sample without explicit information about either the impulse response or the underlying clean speech.
Three different methods are used in the following and their results are compared.

Cepstral Mean
The theory of blind homomorphic deconvolution [3], [4], [5] can be used to estimate the impulse response from an unknown reverberated signal. Here, a reverberated speech signal is modeled as

$$s_{ir}(t) = s(t) * h(t), \tag{5}$$

where * denotes the convolution product, s(t) the clean speech and h(t) the impulse response. A Fourier transformation turns the convolution into a multiplication. The logarithm transforms this product into a sum. A further inverse Fourier transformation converts the sum into the cepstral domain, where the additivity is preserved (see figure 1). If it is assumed that the cepstrum of the clean speech is nearly uncorrelated in adjacent windows, averaging over a few cepstra estimates the mean cepstrum of the impulse response. By inverting the cepstrum, h(t) can be derived.

$$s(t) * h(t) \;\xrightarrow{\;\mathcal{F}\;}\; S(f) \cdot H(f) \;\xrightarrow{\;\log\;}\; \hat{S}(f) + \hat{H}(f) \;\xrightarrow{\;\mathcal{F}^{-1}\;}\; \hat{s}(q) + \hat{h}(q)$$

Figure 1: Calculation of the cepstrum from a convolved input signal

To keep the inverse cepstrum stable and causal, only the minimum phase part of the deconvolved impulse response is taken [3]. The whole deconvolution scheme by cepstral mean is shown in figure 2. The T60 time can then be estimated from the resulting impulse response by the above-mentioned least-squares fit.
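The additivity that the cepstral approach relies on can be checked numerically. The sketch below uses the real cepstrum (magnitude only), which is an assumption made to avoid phase unwrapping; the random test signals and the helper names `dft`, `real_cepstrum` and `convolve` are likewise illustrative. It shows only the additivity step, not the averaging or the minimum phase reconstruction.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, adequate for a short demo."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def real_cepstrum(x, n):
    """Zero-pad x to length n, then apply F^-1(log|F(x)|)."""
    padded = list(x) + [0.0] * (n - len(x))
    log_mag = [math.log(abs(v)) for v in dft(padded)]
    return [v.real for v in dft(log_mag, inverse=True)]

def convolve(a, b):
    """Direct linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

rng = random.Random(1)
s = [rng.gauss(0, 1) for _ in range(24)]                       # stand-in for a speech frame
h = [math.exp(-m / 4.0) * rng.gauss(0, 1) for m in range(16)]  # decaying noise, cf. (1)
n = 64                                                         # at least len(s) + len(h) - 1

c_sum = [a + b for a, b in zip(real_cepstrum(s, n), real_cepstrum(h, n))]
c_conv = real_cepstrum(convolve(s, h), n)
max_diff = max(abs(a - b) for a, b in zip(c_sum, c_conv))
print(max_diff < 1e-9)
```

Zero-padding to at least the length of the linear convolution is what makes the product relation between the spectra exact in this discrete setting.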

Figure 2: Blind deconvolution of reverberated speech by cepstral mean estimation of the minimum phase part of the impulse response

Autocorrelation
The autocorrelation function R_{sir,sir}(t) of a reverberant signal s_ir(t) is the convolution product of the autocorrelation functions of the underlying clean speech s(t) and the room impulse response h(t):

$$R_{s_{ir},s_{ir}}(t) = \left[s(t) * h(t)\right] \star \left[s(t) * h(t)\right] = R_{s,s}(t) * R_{h,h}(t), \tag{6}$$

where ⋆ denotes correlation. If the clean speech is assumed to have a sharply peaked autocorrelation function, the following approximation is possible:

$$R_{s_{ir},s_{ir}}(t) \approx R_{h,h}(t). \tag{7}$$

For an exponential function, the autocorrelation function for positive times has the same exponential decay parameter τ. Thus, we assume the autocorrelation function of the reverberated signal to decay like the underlying impulse response. To reduce estimation errors, averaging over overlapping windows was performed. From the averaged autocorrelation function, the T60 time was estimated according to equation (4).
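The statement that an exponential and its autocorrelation share the same decay parameter can be illustrated with a noise-free decay. This is a minimal sketch: the decay length of 20 samples, the lag used for the estimate and the function name `autocorr` are arbitrary illustrative choices.

```python
import math

def autocorr(x, max_lag):
    """Raw (unnormalized) autocorrelation R(k) = sum_n x[n] * x[n + k]."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]

tau = 20.0                                     # decay parameter in samples
h = [math.exp(-i / tau) for i in range(2000)]  # noise-free exponential decay
r = autocorr(h, max_lag=100)

lag = 50
tau_est = -lag / math.log(r[lag] / r[0])       # uses R(k) / R(0) = exp(-k / tau)
print(round(tau_est, 6))
```

For h[n] = exp(-n/τ), each lag k contributes a common factor exp(-k/τ) to every product in the sum, which is why the ratio R(k)/R(0) recovers the original decay.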

Speech to Reverberation Modulation Energy Ratio (SRMR)

Typically, clean speech shows the strongest modulation energy at a modulation frequency of about 4 Hz. For reverberated speech, the distribution of modulation energy shifts to higher modulation frequencies as a consequence of the whitening effect of the impulse response, which is considered to be damped Gaussian white noise. The higher the reverberation time, the more modulation energy occurs at high modulation frequencies. Thus, [7] suggested a comparison of energies at low and high modulation frequencies. Here, the modulation spectrogram was separated into low and high modulation frequency regions and the ratio of the respective energies was calculated. This ratio is called SRMR (Speech to Reverberation Modulation energy Ratio).
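The idea of comparing low and high modulation energy can be sketched with a strongly simplified, single-band version. The assumptions are substantial: the envelope is given directly instead of being extracted per auditory filter, no modulation spectrogram or weighting is used, and the split frequency `cutoff_hz` is a hypothetical parameter whose default mirrors the 28.9 Hz limit used in this study. It is not the SRMR implementation of [7].

```python
import cmath
import math

def modulation_energy_ratio(envelope, fs, cutoff_hz=28.9):
    """Energy below cutoff_hz divided by energy at or above it (DC excluded)."""
    n = len(envelope)
    low = high = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(envelope[m] * cmath.exp(-2j * math.pi * k * m / n)
                    for m in range(n))
        if k * fs / n < cutoff_hz:
            low += abs(coeff) ** 2
        else:
            high += abs(coeff) ** 2
    return low / high

def make_envelope(a_4hz, a_40hz, fs=200, duration=1.0):
    """Synthetic envelope with modulation components at 4 Hz and 40 Hz."""
    return [1.0
            + a_4hz * math.sin(2 * math.pi * 4 * i / fs)
            + a_40hz * math.sin(2 * math.pi * 40 * i / fs)
            for i in range(int(fs * duration))]

fs = 200
dry = modulation_energy_ratio(make_envelope(0.4, 0.1), fs)  # mostly 4 Hz modulation
wet = modulation_energy_ratio(make_envelope(0.1, 0.4), fs)  # mostly 40 Hz modulation
print(dry > 1.0 > wet)
```

An envelope dominated by slow modulation yields a ratio above one, while shifting energy towards fast modulation, as reverberation is described to do, pushes the ratio below one.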
In comparison to [7], the weighting of the high and low modulation frequency regions was changed. Low modulation frequencies were defined to be smaller than 28.9 Hz or smaller than 10% of the corresponding bandwidth of the auditory filter.

Figure 3: Algorithm for the estimation of SRMR

Stimuli and Methods

To analyse the three estimation methods, clean speech material from four speaker sets sampled at 16 kHz was used: two male German speakers, one female German speaker and a set of female English speakers. The clean speech material was reverberated by convolution with room impulse responses.
To characterize the impulse responses, the T60 times were estimated from the impulse responses using the method described in (4) and are referred to as real T60 times in the following.
For the first test setup, referred to as the artificial impulse response setup (artificial IR setup), impulse responses were generated by the image source method [2] to achieve a wide range of T60 times with controllable (white) spectral properties. For these impulse responses, a rectangular room was simulated and its size and reflection coefficients were adjusted to produce T60 times ranging from 50 ms to 4 s, increasing by a factor of two. The sound source and the receiver were placed randomly in the room and the positions differed between the impulse responses. Independent of the T60 time, the length of each impulse response was 2 s at a sampling frequency of 16 kHz. All reverberant speech samples were derived from the same underlying clean speech. In this paper, the results for one of the male German speakers are shown.
The second test setup, called the real impulse response setup (real IR setup), consisted of real impulse responses selected from a commercial impulse response library. Three groups of T60 times were defined: a dry one (0.16 - 0.36 s) with a mean T60 = 0.3 s, a medium reverberated one (0.72 - 1.02 s) with a mean T60 = 0.9 s and a reverberant one (1.71 - 1.98 s) with a mean T60 = 1.9 s. Each group consisted of four impulse responses. Each of the four impulse responses of a group was convolved with different speech material from one of the four speaker sets.
To test all three features, an estimate of the T60 time or, respectively, the SRMR was computed every 0.5 s for a total time of 100 seconds for the artificial IR setup and 40 seconds for each speaker/impulse response of a T60 group for the real IR setup.
For the cepstral mean and the autocorrelation feature, the window lengths were 0.05 s, 0.2 s, 0.5 s, 1 s, 2 s, 3 s and 4 s. The overlap between two windows was 7/8 and the averaging time was 5 times the window length (≙ 33 windows). The first 6.3 ms (≙ 100 time samples at 16 kHz sampling frequency) of the deconvolved impulse responses were skipped for the fitting.
The window length for the SRMR feature was 1 s.

Results

Cepstral Mean
The means of 200 T60-time estimates for the artificial IR setup are plotted in Fig. 4 (solid lines) as a function of the analysis window duration for three different real T60 times indicated by the dotted lines. The T60-time estimates depend on the analysis window duration, starting at small values for short window durations and asymptotically approaching the real T60 times with increasing window duration. The T60-time curves flatten off at a window duration corresponding to about two times the real T60 time.

Figure 4: Cepstral mean: Means and standard deviations of 200 estimated T60 times per window length and impulse response of the artificial IR setup (solid lines). The dotted lines of the same color represent the real T60 time. (Plot axes: window length /s versus estimated T60 /s.)

Since the estimates depend on the window duration, the proper window for a good T60-time estimation has to be chosen blindly. To do so, the algorithm suggested here successively calculates T60-time estimates for increasing analysis window durations. Following the slope analysis given above, the validity of the estimates is judged blindly by monitoring the differences of the T60-time estimates derived for two successive analysis window durations. If the slope calculated from the last two estimates is smaller than an empirically adjusted criterion of 0.1, the last estimate is considered to be valid. A second criterion to judge the validity of the estimate is to monitor the ratio between the energy of the fitted noise and the energy in the fitted exponential decay, calculated over the duration of the current T60-time estimate:

$$\frac{N}{S} = \frac{\sum_{t=0}^{T_{60}} A_{noise}^2}{\sum_{t=0}^{T_{60}} \left(A_{exp}\, e^{-t/\tau}\right)^2}. \tag{8}$$

Again, an empirically adjusted criterion of N/S < 0.25 has to be met in order to judge the estimate as valid. If both criteria are met, a valid T60-time estimate has been calculated by the algorithm.
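The two validity criteria can be sketched as follows. The thresholds 0.1 and 0.25 and the window lengths are those given in the text; the T60 estimates and N/S values in the usage example are made up for illustration, the function names are hypothetical, and taking the absolute value of the slope is an assumption (the text only states "smaller than").

```python
import math

def noise_to_signal(a_exp, tau, a_noise, t60, fs=16000):
    """Equation (8), summed over the samples from t = 0 to the current T60."""
    n = int(t60 * fs) + 1
    noise = n * a_noise ** 2
    signal = sum((a_exp * math.exp(-(i / fs) / tau)) ** 2 for i in range(n))
    return noise / signal

def first_valid_estimate(windows, estimates, ns_ratios,
                         max_slope=0.1, max_ns=0.25):
    """Return (window, T60) for the first estimate meeting both criteria,
    or None if no estimate qualifies."""
    for k in range(1, len(windows)):
        slope = (estimates[k] - estimates[k - 1]) / (windows[k] - windows[k - 1])
        # abs() is an assumption; the estimates normally rise with window length.
        if abs(slope) < max_slope and ns_ratios[k] < max_ns:
            return windows[k], estimates[k]
    return None

windows = [0.05, 0.2, 0.5, 1.0, 2.0, 3.0, 4.0]          # window lengths in s
estimates = [0.20, 0.35, 0.55, 0.72, 0.79, 0.80, 0.80]  # made-up T60 estimates in s
ns_ratios = [0.40, 0.30, 0.20, 0.10, 0.05, 0.05, 0.05]  # made-up N/S values
print(first_valid_estimate(windows, estimates, ns_ratios))
```

With these illustrative numbers the first accepted estimate occurs at the 2 s window, in line with the observation that the curves flatten at roughly twice the real T60 time.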
The means of the valid T60-time estimates are plotted in Fig. 5 as a function of the real T60 times. The left panel shows the results for the artificial IR setup and the right panel those for the real IR setup. The numbers indicate the percentage of T60-time estimates which satisfied both validity criteria. Apparently, a lower limit for the estimated T60 times exists at about 200 ms for small real T60 times. For longer real T60 times, the estimates closely match the real values. The number of valid T60 times decreases for longer real T60 times, since these need longer window durations for a proper estimation. The calculation time for a valid estimate is about ten times the real T60 time: the best window duration is about two times the real T60 time and the averaging time is five times the window length. Although the computation time for the very accurate T60-time estimate appears quite long, the successive lengthening of the window duration in the algorithm allows an early estimation of the lowest possible value for the currently observed T60 time: the longer the current window duration is, the longer the T60 time will be.

Figure 5: Cepstral mean: Means and standard deviations of valid T60-time estimates. The numbers indicate the percentage of the valid estimates in relation to all estimates. Left panel: artificial IR setup, 200 estimations per impulse response. Right panel: real IR setup, 320 estimations per impulse response group. (Plot axes: real T60 /s versus estimated T60 /s.)
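The timing figures quoted in this paper (33 windows per average, 200 estimates in 100 s, and roughly ten times the real T60 per valid estimate) follow from simple arithmetic, sketched below. The function names are hypothetical; the numeric defaults are the values stated in the text.

```python
from fractions import Fraction

def windows_per_average(overlap=Fraction(7, 8), averaging_factor=5):
    """Number of windows of length W, hopped by (1 - overlap) * W,
    that fit into an averaging time of averaging_factor * W."""
    hop = 1 - overlap
    return int((averaging_factor - 1) / hop) + 1

def time_to_valid_estimate(real_t60, window_factor=2, averaging_factor=5):
    """Approximate signal duration needed: a window of about 2 * T60,
    averaged over 5 window lengths."""
    return averaging_factor * window_factor * real_t60

print(windows_per_average())          # 33
print(time_to_valid_estimate(0.8))    # 8.0
print(int(100 / 0.5))                 # 200 estimates in 100 s at one per 0.5 s
print(100 * 1000 / 16000)             # 6.25 ms skipped (100 samples at 16 kHz)
```

Exact fractions are used for the overlap so that the window count does not depend on floating-point rounding.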

Autocorrelation
For the autocorrelation feature, the same sound material and parameters as for the cepstral mean feature were used (see above).
The means of 200 T60-time estimates per window and impulse response of the artificial IR setup are plotted in Figure 6, in the same way as in Figure 4. As for the cepstral mean feature, the estimated T60 times of the autocorrelation feature asymptotically approach the real T60, though the deviations are much larger than for the cepstral mean feature. The successive lengthening of the analysis window duration was used again, and valid estimates were derived when the same slope and N/S criteria as for the cepstral mean feature were met. The results for the valid T60-time estimates are shown in Figure 7. The results are similar to those from the cepstral mean. Again, there is a lower limit at about 200 ms for the T60-time estimates. Above 200 ms, the means of the estimated T60 times match the real ones well, though the deviations are larger than those of the cepstral mean. The number of valid T60 times decreases heavily for all impulse responses.

Figure 6: Autocorrelation: The estimated T60 times as means with standard deviations of 200 estimations per window and impulse response of the artificial IR setup (solid lines). The real T60 times are indicated as dotted lines of the same color. (Plot axes: window length /s versus estimated T60 /s; real T60 of 0.1 s, 1.6 s and 3.2 s.)

Figure 7: Autocorrelation: Means and standard deviations of valid T60-time estimates for the different impulse responses. The numbers show the percentage of valid T60-time estimates in relation to all estimates. Left panel: artificial IR setup, 200 estimations per impulse response; right panel: real IR setup, 320 estimations per impulse response group. (Panel titles: "means of valid T60s (double logarithmic plot)" and "means of valid T60s"; axes: real T60 /s versus estimated T60 /s.)

Speech to Reverberation Modulation Energy Ratio (SRMR)

In Figure 8, the means and standard deviations of the SRMRs are plotted. The left panel shows the results for the artificial IR setup.

Figure 8: Means and standard deviations of calculated SRMRs per impulse response over the real T60 times. Left panel: artificial IR setup, 200 estimations per impulse response; right panel: real IR setup, 320 estimations per impulse response group. (Panel titles: "means (semi logarithmic plot)" and "means"; axes: real T60 /s versus SRMR.)

The SRMRs for real T60 times between 50 and 400 ms are nearly equal, and the SRMRs decrease beyond about 400 ms real T60 time. The large standard deviations indicate that even a meaningful classification into rough T60-time categories would fail without averaging over a large number of SRMRs.

Conclusions
Three different methods for the estimation of the reverberation time T60 were presented. It was shown that for the cepstral mean and the autocorrelation feature an estimation of the T60 time via the N/S and slope criteria is possible with very good accuracy above about 200 ms. The lower limit of estimated T60 times at about 200 ms is most likely related to the statistical features of speech. Both methods assume that the speech signal is statistically independent in successive time windows, which is not the case. Shorter T60 times could only be estimated with an input signal of significantly shorter correlation duration, like white Gaussian noise.
Another observation is that the number of valid T60-time estimates drops towards longer real T60 times with the current choice of the longest analysis window duration of 4 s, particularly for the autocorrelation feature. The use of even longer analysis window durations does not seem feasible with the practical application in mind. The calculation times that were achieved are about 10 times the T60 time (window length = 2 × T60, averaging time = 5 × window length). Nevertheless, an early classification into short and long T60 times is possible by monitoring the active window duration in the algorithm: the larger the active window (even if the T60 time is not valid), the longer the T60 time.
For the SRMR feature, the reliable results from [7] could not be reproduced. By averaging over several SRMRs or modulation spectrograms, this feature might, however, be useful in combination with the cepstral mean or autocorrelation feature.
In the next step, all three features will be combined in a classifier with a Gaussian mixture model (GMM) for robustness.

Acknowledgements
This work was supported by the Bundesministerium für Bildung und Forschung (BMBF) project Modellbasierte Hörsysteme.

References
[1] M. Karjalainen, P. Antsalo, A. Mäkivirta, T. Peltonen, V. Välimäki, Estimation of Modal Decay Parameters from Noisy Response Measurements, J. Audio Eng. Soc., Vol. 50, No. 11, pp. 867-878, Nov. 2002
[2] J. B. Allen, D. A. Berkley, Image method for efficiently simulating small-room acoustics, J. Acoust. Soc. Am., Vol. 65, No. 4, pp. 943-950, Apr. 1979
[3] A. Baskind, O. Warusfel, Monaural and binaural processing for automatic estimation of room acoustics perceptual attributes, IRCAM meeting, Sep. 2001
[4] A. V. Oppenheim, R. W. Schafer, Zeitdiskrete Signalverarbeitung, Oldenbourg Verlag, 3. Auflage, 1999
[5] T. G. Stockham, Jr., T. M. Cannon, R. B. Ingebretsen, Blind Deconvolution Through Digital Signal Processing, Proceedings of the IEEE, Vol. 63, No. 4, April 1975
[6] M. Hansen, A Method for Calculating Reverberation Time from Musical Signals, Technical Report 60, The Acoustic Laboratory, Technical University of Denmark
[7] T. H. Falk, W.-Y. Chan, A Non-Intrusive Quality Measure of Dereverberated Speech, Intl. Workshop for Acoustic Echo and Noise Control, 2008
