Acoustic Noise and Echo Canceling With Microphone Array: Mattias Dahl and Ingvar Claesson
Abstract: A novel method of performing acoustic echo cancelling using microphone arrays is presented. The method employs a digital self-calibrating microphone system. The calibration process is a simple indirect on-site calibration that adapts
to the particulars of the acoustic environment and the electronic
equipment in use. Primarily intended for handsfree telephones in
automobiles, the method simultaneously suppresses the handsfree
loudspeaker and car noise. The system also continuously takes
into account disturbances such as fan noise. Examples from an
extensive evaluation in a car are also included. Typical performance results demonstrate 20-dB echo cancellation and 10-dB
noise reduction simultaneously.
Index Terms: Adaptive array, array signal processing, echo
suppression, microphone array, speech enhancement.
I. INTRODUCTION
DAHL AND CLAESSON: ACOUSTIC NOISE AND ECHO CANCELLING WITH MICROPHONE ARRAY
Fig. 1. Two-way handsfree communication conversation between the far- and near-end speaker.
Fig. 2. On-site calibration of the array system by using the existing handsfree loudspeaker (the jammer). The figure shows the procedure for the jammer.
The corresponding operation is performed for the target.
1) Calibration Signals: The calibration signals are gathered from the desired talker position and from the unwanted handsfree loudspeaker position, respectively, and should have approximately
the same spectral content as the true signals. There are different
methods to facilitate this. A very simple approach is to collect
and superpose human utterances from the target position and
TABLE I
OPERATION MODES
where [...], and [...] is a vector containing the [...] most recent samples of [...] in reverse order.
The task of the adaptive lower beamformer filters during the I and Rx modes is to make the lower beamformer output resemble a linear combination of the memory target signals. This is achieved by minimizing the composite error between the desired signal and the output from the lower beamformer. The lower beamformer input [...]
(2)
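The error-minimization idea can be illustrated with a minimal sketch. It assumes a generic multichannel normalized LMS (NLMS) update, which is a stand-in chosen for illustration and not confirmed by this text as the authors' update rule. The function name `nlms_beamformer`, the parameters `num_taps`, `mu`, and `eps`, and the synthetic signals are all hypothetical. One FIR filter per microphone is adapted so that the summed output tracks a desired signal.

```python
import numpy as np

def nlms_beamformer(x, d, num_taps=8, mu=0.5, eps=1e-8):
    """Adapt one FIR filter per microphone so the summed output tracks d.

    x: array (num_mics, num_samples) of input signals.
    d: array (num_samples,) holding the desired signal.
    Returns (output, error, weights); weights has shape (num_mics, num_taps).
    Generic NLMS, assumed for illustration only.
    """
    num_mics, num_samples = x.shape
    w = np.zeros((num_mics, num_taps))
    buf = np.zeros((num_mics, num_taps))  # most recent samples, newest first
    y = np.zeros(num_samples)
    e = np.zeros(num_samples)
    for k in range(num_samples):
        buf = np.roll(buf, 1, axis=1)     # shift the delay lines by one sample
        buf[:, 0] = x[:, k]
        y[k] = np.sum(w * buf)            # beamformer output
        e[k] = d[k] - y[k]                # error against the desired signal
        w += mu * e[k] * buf / (np.sum(buf * buf) + eps)
    return y, e, w

# Synthetic check: the desired signal is a known FIR combination of the inputs.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4000))
h = rng.standard_normal((2, 8))
d = sum(np.convolve(x[i], h[i])[:4000] for i in range(2))
y, e, w = nlms_beamformer(x, d)
```

In this noise-free toy case the adapted weights `w` approach the generating filters `h`, so the error power decays toward zero; with real recordings the residual depends on noise, filter length, and step size.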
where [...] and where [...]
(9)
The performance of the adaptive algorithm can be controlled by
1) factor [...] controls the near-end memorized speech signal amplification/attenuation;
Fig. 4. Linear microphone geometry [one-dimensional (1-D)] with six microphones. The distance between elements is 50 mm.
2) factor
[...] AND EVALUATION
MSIR [...]
(10)
where [...] and [...] denote the stored calibration signals for target and jammer, respectively. In the handsfree mode, this
1 http://www.its.hk-r.se.
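The expression in (10) is not recoverable from this text. As a rough sketch only, assuming MSIR denotes the ratio of the stored target calibration power to the stored jammer calibration power in decibels (an assumption, not the authors' stated definition), a prescribed ratio can be set by scaling the jammer signal. The function names `power_ratio_db` and `scale_to_ratio` and the random stand-in signals are hypothetical.

```python
import numpy as np

def power_ratio_db(target, jammer):
    """10*log10 of mean target power over mean jammer power (assumed MSIR form)."""
    return 10.0 * np.log10(np.mean(target ** 2) / np.mean(jammer ** 2))

def scale_to_ratio(target, jammer, ratio_db):
    """Return the jammer scaled so that the target-to-jammer ratio equals ratio_db."""
    current = power_ratio_db(target, jammer)
    gain = 10.0 ** ((current - ratio_db) / 20.0)  # amplitude gain applied to jammer
    return gain * jammer

# Stand-in calibration signals (random noise, for illustration only).
rng = np.random.default_rng(1)
target = rng.standard_normal(12_000)
jammer = 3.0 * rng.standard_normal(12_000)
scaled = scale_to_ratio(target, jammer, 5.0)
```

Scaling only the jammer leaves the target level untouched, which keeps the comparison across ratio settings simple.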
Fig. 7. Output power versus number of filter taps with two microphones, flat noise training signals. Number of taps is noted at the bottom of the figure.
Fig. 8. Output power versus number of filter taps with six microphones, flat noise training signals. Number of taps is noted at the bottom of the figure.
MSNR [...]
(11)
where [...] and [...] denote the stored calibration signals, i.e., the target and the actual car noise. In the car environment, [...] corresponds to environmental car noise, which, in this evaluation, includes radio music, fan noise, or noise from a side window wound down. Common information for all plots is as follows.
• Near-end speech, coming from the target position, is denoted Speaker.
• Far-end speech signal, i.e., handsfree loudspeaker, is denoted Echo.
• The recorded calibration signals are flat noise. Speech-colored noise or overlaid human speech gives similar results.
• All sequences are 7 s long and are subsequently merged together, i.e., a new sequence starts at 0, 7, 14, ... s.
  1) The near-end talker is active from 1.5 to 3.5 s.
  2) The far-end talker is active from 4.5 to 6.5 s.
  3) In the remaining time, only background noise is present, unless otherwise declared.
• All figures begin with a 7-s sequence with an unadapted single microphone signal, i.e., a plain unfiltered single-channel microphone signal.
• The results are presented as short-time (20 ms) power estimates in decibels.
• All signals are limited to telephone bandwidth (300-3400 Hz).
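The short-time power presentation described above can be sketched as follows. This is a minimal illustration assuming non-overlapping 20-ms frames at the 12 000-Hz sample rate used in the evaluation; the framing scheme, the function name `short_time_power_db`, and the `floor` guard against log of zero are assumptions, and the band-limiting to telephone bandwidth is not included.

```python
import numpy as np

def short_time_power_db(signal, sample_rate=12_000, frame_ms=20.0, floor=1e-12):
    """Mean power per non-overlapping frame, in decibels.

    Trailing samples that do not fill a whole frame are dropped.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))  # 240 samples at 12 kHz
    num_frames = len(signal) // frame_len
    frames = signal[:num_frames * frame_len].reshape(num_frames, frame_len)
    power = np.mean(frames ** 2, axis=1)
    return 10.0 * np.log10(np.maximum(power, floor))

# One 7-s sequence of constant amplitude 0.1 as a trivial input.
levels = short_time_power_db(0.1 * np.ones(7 * 12_000))
```

A 7-s sequence yields 350 frames at these settings, so each merged sequence contributes 350 points to a plot.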
The number of filter taps needed is a crucial parameter;
we found that 128-256 filter taps are sufficient (see Figs. 7
and 8). Since the evaluation was performed at the sample
rate 12 000 Hz, the number of taps could be reduced by
Fig. 9. Output power versus MSIR for two- and six-microphone array, left- and right-hand plots, respectively. MSIR in decibels (+15; +5; 0; -5; -15) is noted at the bottom of the figure.
Fig. 10. Different microphone placement results, 1-D array. Used microphones in the linear geometry, see Fig. 4, are noted at the bottom of the figure.
Fig. 11. Different microphone placement results, 2-D array. Microphones used in the nonlinear geometry, see Fig. 5, are noted at the bottom of the figure.
[...] + fan noise results using two and six microphones. Microphones used in the linear geometry, see Fig. 4, [...]
[...] side window noise using two and six microphones. Microphones used in the linear [...]
VI. SUMMARY AND CONCLUSIONS
Mattias Dahl (S'94-A'95) was born in Uddevalla, Sweden. He received the B.S. degree
in electrical engineering from the Chalmers
Institute of Technology, Gothenburg, Sweden,
the M.S. degree in telecommunication and signal
processing from Luleå University of Technology,
Luleå, Sweden, and the Licentiate degree in
signal processing from Lund University of
Technology, Lund, Sweden, in 1988, 1993, and
1997, respectively.
Since 1993, he has been with the Department
of Signal Processing, University of Karlskrona/Ronneby, Ronneby, Sweden,
where he is involved in adaptive beamforming, speech enhancement, and
active noise control research projects.
Ingvar Claesson (M'91) was born in Broby, Sweden, in 1957. He received the Dipl.Eng. and Ph.D.
degrees from Lund University, Lund, Sweden, in
1980 and 1986, respectively.
He was appointed Senior Lecturer in Telecommunication Theory at Lund University in 1986 and
became an Associate Professor in 1992. Since May
1998, he has held the Chair of Signal Processing
at the University of Karlskrona/Ronneby, Ronneby,
Sweden. He is also currently the Head of Research
and Principal Supervisor in Signal Processing. In
1990, he was one of the founders of the Department of Signal Processing. His
current research interests are in adaptive signal processing, blind equalization,
adaptive beamforming, speech enhancement, active noise control, filter design,
and antenna arrays.