2003 Internoise CrossSpecBF
The individual time delays $\Delta_m$ are chosen with the aim of achieving selective directional sensitivity in a specific direction, characterized here by a unit vector $\boldsymbol{\kappa}$. In the frequency domain the beamformer output is

$$B(\boldsymbol{\kappa},\omega) = \sum_{m=1}^{M} P_m(\omega)\, e^{-j\omega\Delta_m(\boldsymbol{\kappa})} = \sum_{m=1}^{M} P_m(\omega)\, e^{j\mathbf{k}\cdot\mathbf{r}_m}. \tag{3}$$

Here, $\omega$ is the temporal angular frequency, $\mathbf{k} \equiv -k\boldsymbol{\kappa}$ is the wave number vector of a plane wave incident from the direction $\boldsymbol{\kappa}$ in which the array is focused [Figure 1] and $k = \omega/c$ is the wave number. In equation (3) an implicit time factor equal to $e^{j\omega t}$ is assumed.

Figure 1. Illustration of a phased microphone array, a directional sensitivity represented by a mainlobe and sidelobes, and a plane wave incident from the direction of the mainlobe.

Figure 2. In near field focusing, spherical waves emitted by a monopole source at the focus point $\mathbf{r}$ are assumed. Signal delays are computed according to equation (14).
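As an illustration of equation (3), the following minimal sketch steers a small array towards a plane wave. The delay form $\Delta_m = \boldsymbol{\kappa}\cdot\mathbf{r}_m/c$, the speed of sound, the five microphone positions and the 3 kHz frequency are assumptions made for this example only; they are not taken from the paper's measurements.

```python
import cmath
import math

C_SOUND = 343.0  # assumed speed of sound in air, m/s


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def plane_wave_spectra(mics, kappa0, omega):
    """Unit plane wave arriving from direction kappa0.

    With k0 = -k*kappa0 the recorded spectrum is P_m = exp(-j k0 . r_m).
    """
    k = omega / C_SOUND
    return [cmath.exp(1j * k * dot(kappa0, r_m)) for r_m in mics]


def das_output(spectra, mics, kappa, omega):
    """B = sum_m P_m exp(-j*omega*Delta_m), assuming Delta_m = kappa . r_m / c."""
    total = 0j
    for P_m, r_m in zip(spectra, mics):
        delay = dot(kappa, r_m) / C_SOUND
        total += P_m * cmath.exp(-1j * omega * delay)
    return total


# Hypothetical planar array (z = 0), coordinates in metres.
mics = [(0.0, 0.0, 0.0), (0.31, 0.05, 0.0), (-0.12, 0.27, 0.0),
        (-0.25, -0.18, 0.0), (0.14, -0.33, 0.0)]
M = len(mics)
omega = 2 * math.pi * 3000.0

wave_dir = (math.sin(math.radians(30)), 0.0, math.cos(math.radians(30)))
broadside = (0.0, 0.0, 1.0)

P = plane_wave_spectra(mics, wave_dir, omega)
B_focused = das_output(P, mics, wave_dir, omega)   # all terms add in phase
B_other = das_output(P, mics, broadside, omega)    # terms partly cancel
print(abs(B_focused), abs(B_other))
```

When the focus direction coincides with the arrival direction, every term in the sum has zero phase and the magnitude reaches M; for any other direction it cannot exceed M.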
For a given array geometry $\{\mathbf{r}_m\}$ the structure of the directional sensitivity is contained in the Array Pattern function [1] defined in wave number space as

$$W(\mathbf{K}) \equiv \sum_{m=1}^{M} e^{j\mathbf{K}\cdot\mathbf{r}_m}. \tag{4}$$
It has the form of a generalized spatial DFT of a weighting function, which equals one over
the array area and zero outside. Because the microphone positions $\mathbf{r}_m$ have z-coordinate equal to zero, the Array Pattern is independent of $K_z$. We shall therefore consider the Array Pattern W only in the $(K_x, K_y)$ plane. There, W has an area with high values around the origin with a peak value equal to M at $(K_x, K_y) = (0,0)$. According to the following section, this peak represents the high sensitivity to plane waves coming from the direction $\boldsymbol{\kappa}$, in which
the array is focused. Figure 1 contains an illustration of that peak, which is called the
mainlobe. Other directional peaks are called sidelobes and a good phased array design is
characterized by having low Maximum Sidelobe Level (MSL), measured relative to the main
lobe level [2]. The highest sidelobe is identified as the highest secondary peak in the Power
Array Pattern
$$U(\mathbf{K}) \equiv |W(\mathbf{K})|^2 = \sum_{m,n=1}^{M} e^{j\mathbf{K}\cdot(\mathbf{r}_m-\mathbf{r}_n)}, \tag{5}$$

for $|\mathbf{K}| < 2\omega_{\max}/c$, $\omega_{\max}$ being the upper frequency of the array's intended use.
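The definitions (4) and (5) translate directly into code. The sketch below evaluates W and U for a hypothetical five-microphone planar array (the coordinates and the sample wave number are illustrative assumptions) and expresses one off-axis value relative to the mainlobe in dB, which is how a sidelobe level would be read off.

```python
import cmath
import math


def array_pattern(K, positions):
    """W(K) = sum_m exp(j K . r_m) for planar positions (x, y) and K = (Kx, Ky)."""
    return sum(cmath.exp(1j * (K[0] * x + K[1] * y)) for x, y in positions)


def power_array_pattern(K, positions):
    """U(K) = |W(K)|^2."""
    return abs(array_pattern(K, positions)) ** 2


# Hypothetical array geometry, coordinates in metres.
positions = [(0.0, 0.0), (0.31, 0.05), (-0.12, 0.27), (-0.25, -0.18), (0.14, -0.33)]
M = len(positions)

mainlobe = power_array_pattern((0.0, 0.0), positions)   # peak of U at the origin
off_axis = power_array_pattern((12.0, -7.0), positions)  # arbitrary sample point, rad/m
level_db = 10 * math.log10(off_axis / mainlobe)
print(mainlobe, off_axis, level_db)
```

A Maximum Sidelobe Level estimate would follow by scanning K over the disc $|\mathbf{K}| < 2\omega_{\max}/c$ outside the mainlobe and taking the largest such level.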
CROSS SPECTRAL BEAMFORMING WITH AUTO-SPECTRA EXCLUSION
For a stationary sound field it is natural to consider the time average power output,
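$$V(\boldsymbol{\kappa},\omega) \equiv \overline{|B(\boldsymbol{\kappa},\omega)|^2} = \sum_{m,n=1}^{M} \overline{P_m(\omega)P_n^*(\omega)}\, e^{j\mathbf{k}\cdot(\mathbf{r}_m-\mathbf{r}_n)} = \sum_{m,n=1}^{M} C_{nm}(\omega)\, e^{j\mathbf{k}\cdot(\mathbf{r}_m-\mathbf{r}_n)}, \tag{6}$$

of the beamformer, where we have introduced the cross-spectrum matrix $C_{nm}(\omega) \equiv \overline{P_m(\omega)P_n^*(\omega)}$, the overline denoting time averaging. We may split (6) into an auto-spectrum part and a cross-spectrum part,

$$V(\boldsymbol{\kappa},\omega) = \sum_{m=1}^{M} C_{mm} + \sum_{m\neq n}^{M} C_{nm}\, e^{j\mathbf{k}\cdot(\mathbf{r}_m-\mathbf{r}_n)}. \tag{7}$$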
Here, the auto-spectra $C_{mm}$ will contain self-noise from the individual channels such as wind-noise and electronic noise from the data acquisition hardware. For that reason it would be desirable to omit the first sum in equation (7). Ideally, the cross-spectra $C_{nm}$, $m \neq n$, are not affected by the self-noise, because the self-noise in one channel is generally incoherent with the self-noise in any other channel. Under that condition, averaging will suppress contributions from self-noise in the cross-spectra. We can assess the effect of excluding the auto-spectra by relating the plane wave response of the cross-spectral beamformer to the power array pattern. For a unit amplitude plane wave with wave number vector $\mathbf{k}_0$ the spectrum recorded by the mth microphone is $P_m = \exp(-j\mathbf{k}_0\cdot\mathbf{r}_m)$. Insertion of that in equation (6) leads to the following expression for the beamformer power output:

$$V(\boldsymbol{\kappa},\omega) = \sum_{m,n=1}^{M} e^{j(\mathbf{k}-\mathbf{k}_0)\cdot(\mathbf{r}_m-\mathbf{r}_n)} = U(\mathbf{k}-\mathbf{k}_0), \tag{8}$$

where we have used formula (5) for the power array pattern U. In a similar way we see that the self-term free versions of the power array pattern (5) and the cross-spectral beamformer response (7),

$$U'(\mathbf{K}) \equiv \sum_{m\neq n}^{M} e^{j\mathbf{K}\cdot(\mathbf{r}_m-\mathbf{r}_n)} \quad\text{and}\quad V'(\boldsymbol{\kappa},\omega) \equiv \sum_{m\neq n}^{M} C_{nm}\, e^{j\mathbf{k}\cdot(\mathbf{r}_m-\mathbf{r}_n)}, \tag{9}$$

are for plane waves related by

$$V'(\boldsymbol{\kappa},\omega) = U'(\mathbf{k}-\mathbf{k}_0). \tag{10}$$
Thus, removal of the auto-spectral terms from the cross-spectral beamformer (7) corresponds to omitting the self-terms from the definition of the power array pattern (5). Provided the reduced array pattern $U'$ has lower sidelobe level than U, we can therefore reduce the level of ghost images in cross-spectral beamformer output by omitting the auto-spectra.
Comparing the definitions of the array pattern U and the reduced version $U'$ we find that

$$U'(\mathbf{K}) = U(\mathbf{K}) - M. \tag{11}$$
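Relations (10) and (11) can be checked numerically. The sketch below does so for a hypothetical planar array and arbitrarily chosen in-plane wave number components; the geometry and the values of k and k0 are assumptions made for the example. Since the microphones lie in the plane z = 0, only the in-plane components of the wave number vectors enter.

```python
import cmath


def power_pattern(K, pos):
    """U(K) = |sum_m exp(j K . r_m)|^2."""
    return abs(sum(cmath.exp(1j * (K[0] * x + K[1] * y)) for x, y in pos)) ** 2


def reduced_pattern(K, pos):
    """U'(K): the double sum of (5) restricted to m != n."""
    total = 0j
    for m, (xm, ym) in enumerate(pos):
        for n, (xn, yn) in enumerate(pos):
            if m != n:
                total += cmath.exp(1j * (K[0] * (xm - xn) + K[1] * (ym - yn)))
    return total.real  # imaginary parts cancel between (m, n) and (n, m)


def reduced_beam_power(k, k0, pos):
    """V' for a unit plane wave: C_nm = P_m P_n^*, P_m = exp(-j k0 . r_m)."""
    P = [cmath.exp(-1j * (k0[0] * x + k0[1] * y)) for x, y in pos]
    total = 0j
    for m, (xm, ym) in enumerate(pos):
        for n, (xn, yn) in enumerate(pos):
            if m != n:
                C_nm = P[m] * P[n].conjugate()
                total += C_nm * cmath.exp(1j * (k[0] * (xm - xn) + k[1] * (ym - yn)))
    return total.real


# Hypothetical geometry (metres) and in-plane wave number components (rad/m).
pos = [(0.0, 0.0), (0.31, 0.05), (-0.12, 0.27), (-0.25, -0.18), (0.14, -0.33)]
M = len(pos)
k = (20.0, -5.0)     # focus wave number vector
k0 = (8.0, 11.0)     # wave number vector of the incident plane wave
K = (k[0] - k0[0], k[1] - k0[1])

lhs_11 = reduced_pattern(K, pos)
rhs_11 = power_pattern(K, pos) - M
lhs_10 = reduced_beam_power(k, k0, pos)
print(lhs_11, rhs_11, lhs_10)
```

All three printed values should agree to rounding error, which is the content of equations (10) and (11).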
The mainlobe is therefore reduced from $M^2$ (for U) to $M^2 - M$ (for $U'$), and the highest sidelobe is reduced from $M^2\cdot 10^{MSL/10}$ to $M^2\cdot 10^{MSL/10} - M$. Assuming first that $U'$ does not become negative, this leads to the following Maximum Sidelobe Level for $U'$:

$$MSL' = 10\log_{10}\!\left(\frac{M^2\, 10^{MSL/10} - M}{M^2 - M}\right) = 10\log_{10}\!\left(\frac{M\, 10^{MSL/10} - 1}{M - 1}\right),$$

which is easily shown to be always smaller (better) than MSL. As an example, assume a 90-channel array with an MSL equal to -15 dB. In that case $MSL'$ equals -16.83 dB, meaning that the highest sidelobe has been reduced by 1.83 dB. If the power array pattern U contains values less than M then the reduced array pattern $U'$ will have areas with negative values. The worst case is when U has a null. In that case the minimum value of $U'$ equals $-M$, which will have the same effect as a sidelobe with amplitude equal to M. Such a sidelobe will not affect $MSL'$ as long as M is smaller than $M^2\cdot 10^{MSL/10} - M$. This condition has been fulfilled for all the arrays that we have been designing [2]. In addition, this worst-case condition will not occur when only array geometries without redundant spacing vectors are used.
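The 90-channel example can be reproduced with a few lines of code. The function names below are hypothetical helpers introduced for this sketch; the formulas are the expression for $MSL'$ and the worst-case condition stated above.

```python
import math


def reduced_msl(msl_db, M):
    """MSL' = 10 log10((M * 10^(MSL/10) - 1) / (M - 1)), assuming U' stays non-negative."""
    sidelobe = M * 10 ** (msl_db / 10) - 1
    if sidelobe <= 0:
        raise ValueError("highest sidelobe of U' is not positive; formula does not apply")
    return 10 * math.log10(sidelobe / (M - 1))


def null_is_harmless(msl_db, M):
    """True if a null of U (value -M in U') stays below the highest sidelobe of U'."""
    return M < M ** 2 * 10 ** (msl_db / 10) - M


msl_prime = reduced_msl(-15.0, 90)
improvement = -15.0 - msl_prime
harmless = null_is_harmless(-15.0, 90)
print(round(msl_prime, 2), round(improvement, 2), harmless)
```

For M = 90 and MSL = -15 dB this gives approximately -16.83 dB, an improvement of about 1.83 dB, and the worst-case condition is satisfied.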
Near Field Beamforming
Up to now we have considered only the case of sources in the far field. In that case each
source will create a plane wave in the region occupied by the array, meaning that different
sources can be located by identifying associated plane waves. For sources in the near field this
will not be the case, and we assume instead a distribution of monopole point sources on the
focus plane. In this case the pressure measured by the microphones will be
$$P_m(\omega) = \sum_i P_i\, \frac{e^{-j[k\, r_m(\mathbf{r}_i) + \varphi_i]}}{r_m(\mathbf{r}_i)}, \tag{12}$$

where $\mathbf{r}_i$ are the source positions, $P_i$ and $\varphi_i$ are the individual source strengths and phases and $r_m(\mathbf{r}_i) \equiv |\mathbf{r}_m - \mathbf{r}_i|$ is the microphone to source distance. The expression for delay-and-sum beamforming (3) must be restated for point focusing:

$$B(\mathbf{r},\omega) = \sum_{m=1}^{M} P_m(\omega)\, e^{-j\omega\Delta_m(\mathbf{r})}, \tag{13}$$

where we have replaced the delays (2) with the form

$$\Delta_m(\mathbf{r}) = \bigl(|\mathbf{r}| - r_m(\mathbf{r})\bigr)/c, \tag{14}$$

appropriate for a spherical wave [Figure 2]. The near field version of equation (7) for the beamformer power appears as

$$V(\mathbf{r},\omega) = \sum_{m=1}^{M} C_{mm} + \sum_{m\neq n}^{M} C_{nm}\, e^{-j\omega[\Delta_m(\mathbf{r})-\Delta_n(\mathbf{r})]}. \tag{15}$$
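A minimal sketch of near field focusing according to equations (12), (14) and (15) is given below. It assumes a single noise-free monopole, so that one snapshot suffices for the cross-spectra; the microphone layout, the source position, the scan line and the speed of sound are illustrative assumptions.

```python
import cmath
import math

C_SOUND = 343.0  # assumed speed of sound, m/s


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cross_spectra(P):
    """C[n][m] = P_m * conj(P_n); a single snapshot stands in for the time average."""
    return [[P_m * P_n.conjugate() for P_m in P] for P_n in P]


def beam_power(C, mics, focus, omega, exclude_auto=True):
    """Equation (15); with exclude_auto=True the auto-spectrum sum is dropped."""
    origin = (0.0, 0.0, 0.0)
    delays = [(distance(focus, origin) - distance(r_m, focus)) / C_SOUND
              for r_m in mics]
    total = 0.0
    for m in range(len(mics)):
        for n in range(len(mics)):
            if m == n:
                if not exclude_auto:
                    total += C[m][m].real
                continue
            phase = -omega * (delays[m] - delays[n])
            total += (C[n][m] * cmath.exp(1j * phase)).real
    return total


# Hypothetical planar array in z = 0 and a monopole on the plane z = 1 m.
mics = [(0.0, 0.0, 0.0), (0.3, 0.1, 0.0), (-0.2, 0.25, 0.0),
        (-0.15, -0.3, 0.0), (0.25, -0.2, 0.0), (0.05, 0.35, 0.0)]
source = (0.2, 0.0, 1.0)
omega = 2 * math.pi * 3000.0
k = omega / C_SOUND

P = [cmath.exp(-1j * k * distance(r_m, source)) / distance(r_m, source) for r_m in mics]
C = cross_spectra(P)

grid_x = [round(-0.5 + 0.1 * i, 1) for i in range(11)]
powers = [beam_power(C, mics, (x, 0.0, 1.0), omega) for x in grid_x]
peak_x = grid_x[powers.index(max(powers))]
print(peak_x)
```

At the true source position all cross terms are real and positive, so the scan peaks there. Note that no amplitude compensation is applied, which is the limitation addressed in the next section.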
CROSS SPECTRAL BEAMFORMING WITH AMPLITUDE CORRECTION
Equation (15) for finite distance beamforming contains no compensation for the fact that
different positions on the assumed source plane have different distances to the array
transducers and therefore are attenuated by different amounts. For a single source at $\mathbf{r}_i$, a possible correction could be to replace the cross-spectrum matrix by the scaled version $C_{nm}\, r_m(\mathbf{r}_i)\, r_n(\mathbf{r}_i)$. The introduction of a scaled cross-matrix is, however, an ad-hoc correction with uncontrolled effects. A sound approach can be achieved by assuming a model where the recorded sound field is generated by a monopole distribution. Based on this assumption we determine the distribution of source positions and amplitudes which minimizes an error function between the measured cross-spectra and the model cross-spectra. The approach is inspired by reference [3].
Let $\mathbf{r}_m$, $m = 1,\dots,M$, be the transducer coordinates and let $\mathbf{r}$ be the position of a monopole. The field, $p_m$, recorded by the mth transducer is then $p_m = p_0\, v(\mathbf{r}_m - \mathbf{r}) \equiv p_0\, v_m(\mathbf{r})$, where $p_0$ is the source strength and $v(\mathbf{r})$ is the steering vector given by

$$v(\mathbf{r}) \equiv e^{-jk|\mathbf{r}|} / |\mathbf{r}|. \tag{16}$$

According to our model the cross-spectrum, $C^{\mathrm{mod}}_{nm}$, between channel m and n is

$$C^{\mathrm{mod}}_{nm} = p_n^*\, p_m \equiv a\, v_n^*(\mathbf{r})\, v_m(\mathbf{r}), \tag{17}$$

where a is a real amplitude coefficient. Then we define an error function, $E(a,\mathbf{r})$, between the model cross-spectra and the measured cross-spectra, $C_{nm}$,

$$E(a,\mathbf{r}) \equiv \sum_{m,n=1}^{M} \bigl|C_{nm} - C^{\mathrm{mod}}_{nm}\bigr|^2 = \sum_{m,n=1}^{M} \bigl|C_{nm} - a\, v_n^*(\mathbf{r})\, v_m(\mathbf{r})\bigr|^2. \tag{18}$$
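We can simplify this expression by introducing the column matrices

$$\mathbf{g} \equiv [C_{nm}] \quad\text{and}\quad \mathbf{h} \equiv [v_n^* v_m]. \tag{19}$$

Then (18) appears as

$$E(a,\mathbf{r}) = |\mathbf{g} - a\mathbf{h}|^2 = \mathbf{g}^\dagger\mathbf{g} - a\,\mathbf{h}^\dagger\mathbf{g} - a\,\mathbf{g}^\dagger\mathbf{h} + a^2\,\mathbf{h}^\dagger\mathbf{h}, \tag{20}$$

where we have used that a is real. Minimizing first with respect to a, we find $\mathbf{g} = a\mathbf{h}$ in the least-squares sense, which upon multiplication from the left with $\mathbf{h}^\dagger$ leads to

$$a = \mathbf{h}^\dagger\mathbf{g} / \mathbf{h}^\dagger\mathbf{h}. \tag{21}$$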
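We have to make sure that the right-hand side is real. Appealing to the fact that the cross-spectral matrix is Hermitian and to the definition (19) of $\mathbf{h}$ and $\mathbf{g}$ we see that

$$\mathbf{h}^\dagger\mathbf{g} = \sum_{m,n=1}^{M} C_{nm}\, v_n v_m^* = \sum_{m,n=1}^{M} C_{mn}\, v_m v_n^* = \sum_{m,n=1}^{M} C_{nm}^*\, v_n^* v_m = \mathbf{g}^\dagger\mathbf{h}, \tag{22}$$

implying that $\mathbf{h}^\dagger\mathbf{g}$ is real. With this observation and by insertion of (21) into (20) the error function (20) can be rewritten as

$$E(a,\mathbf{r}) = \mathbf{g}^\dagger\mathbf{g} - a^2\,\mathbf{h}^\dagger\mathbf{h} = \mathbf{g}^\dagger\mathbf{g}\left[1 - \frac{(\mathbf{h}^\dagger\mathbf{g})^2}{\mathbf{g}^\dagger\mathbf{g}\;\mathbf{h}^\dagger\mathbf{h}}\right]. \tag{23}$$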
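The step from (20) to (23) can be written out explicitly. The following is a sketch that uses only the fact, argued above, that $\mathbf{h}^\dagger\mathbf{g} = \mathbf{g}^\dagger\mathbf{h}$ is real; the symbol $E_{\min}$ is introduced here for the error evaluated at the optimal amplitude.

$$
\begin{aligned}
E(a,\mathbf{r}) &= \mathbf{g}^\dagger\mathbf{g} - 2a\,\mathbf{h}^\dagger\mathbf{g} + a^2\,\mathbf{h}^\dagger\mathbf{h}, \\
\frac{\partial E}{\partial a} &= -2\,\mathbf{h}^\dagger\mathbf{g} + 2a\,\mathbf{h}^\dagger\mathbf{h} = 0
\;\;\Rightarrow\;\; a = \frac{\mathbf{h}^\dagger\mathbf{g}}{\mathbf{h}^\dagger\mathbf{h}}, \\
E_{\min}(\mathbf{r}) &= \mathbf{g}^\dagger\mathbf{g} - \frac{(\mathbf{h}^\dagger\mathbf{g})^2}{\mathbf{h}^\dagger\mathbf{h}}.
\end{aligned}
$$

Since $\mathbf{g}^\dagger\mathbf{g}$ depends only on the measured data, the focus position $\mathbf{r}$ enters $E_{\min}$ only through the second term.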
Minimizing the error function over all $\mathbf{r}$ thus corresponds to maximizing the Imaging Function, $I_4(\mathbf{r},\omega)$,

$$I_4(\mathbf{r},\omega) \equiv \frac{(\mathbf{h}^\dagger\mathbf{g})^2}{\mathbf{h}^\dagger\mathbf{h}} = \frac{\Bigl|\sum_{m,n=1}^{M} C_{nm}\, v_m^*(\mathbf{r})\, v_n(\mathbf{r})\Bigr|^2}{\sum_{m,n=1}^{M} |v_n(\mathbf{r})|^2\, |v_m(\mathbf{r})|^2}, \tag{24}$$

over all $\mathbf{r}$ (we choose the notation $I_4$ since (24) has unit of power squared). In practice $I_4(\mathbf{r},\omega)$ is computed over a discrete mesh covering the focus area. In the resulting map, peaks are interpreted as areas with a high probability of finding a source. This interpretation can be justified if we compare the imaging function in the far field with the corresponding expression (3) for the Delay-And-Sum beamformer. For large $R \equiv |\mathbf{r}|$ the approximation $R_m \equiv |\mathbf{r}_m - \mathbf{r}| \approx R$ is valid for the amplitude factors. In the far field limit (24) can therefore be approximated by

$$I_4(\mathbf{r},\omega) \approx \frac{R^{-4}\,\Bigl|\sum_{m,n=1}^{M} C_{nm}\, e^{jkR_m} e^{-jkR_n}\Bigr|^2}{R^{-4}\sum_{m,n=1}^{M} |e^{-jkR_n}|^2\, |e^{-jkR_m}|^2} = \frac{1}{M^2}\,\Bigl|\sum_{m,n=1}^{M} C_{nm}\, e^{-jk(R_n-R_m)}\Bigr|^2. \tag{25}$$
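Now, using the fact that the difference in travel paths $R_n - R_m$ equals the projection $\boldsymbol{\kappa}\cdot(\mathbf{r}_m - \mathbf{r}_n)$ of the position difference onto the focus direction [Figure 3] we find that the beamformer power (7) is

$$V = \sum_{m,n=1}^{M} C_{nm}\, e^{j\mathbf{k}\cdot(\mathbf{r}_m-\mathbf{r}_n)} = \sum_{m,n=1}^{M} C_{nm}\, e^{-jk(R_n-R_m)}.$$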
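The far field relation between the imaging function and the plane wave beamformer power, which by (25) amounts to $M^2 I_4 \approx V^2$, can be checked numerically. The sketch below is only an illustration: the array geometry, the 3 kHz frequency, the focus distance of $10^6$ m and the two directions are assumptions, and the spectra are scaled by R purely to keep the numbers of order one.

```python
import cmath
import math


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    norm = math.sqrt(dot(v, v))
    return tuple(x / norm for x in v)


def imaging_function(C, mics, r, k):
    """I_4 of equation (24) with v_m = exp(-j k |r_m - r|) / |r_m - r|."""
    v = [cmath.exp(-1j * k * distance(r_m, r)) / distance(r_m, r) for r_m in mics]
    M = len(mics)
    num = sum(C[n][m] * v[m].conjugate() * v[n] for m in range(M) for n in range(M))
    den = sum(abs(v_m) ** 2 for v_m in v) ** 2
    return abs(num) ** 2 / den


def plane_wave_power(C, mics, kappa, k):
    """V of equation (7), auto-spectra included, with wave number vector -k*kappa."""
    total = 0j
    for m in range(len(mics)):
        for n in range(len(mics)):
            diff = tuple(a - b for a, b in zip(mics[m], mics[n]))
            total += C[n][m] * cmath.exp(-1j * k * dot(kappa, diff))
    return total.real


mics = [(0.0, 0.0, 0.0), (0.31, 0.05, 0.0), (-0.12, 0.27, 0.0),
        (-0.25, -0.18, 0.0), (0.14, -0.33, 0.0)]
M = len(mics)
k = 2 * math.pi * 3000.0 / 343.0
R = 1.0e6                                   # far field focus distance, metres
kappa_src = unit((0.2, -0.1, 1.0))          # direction of the simulated source
kappa_off = unit((-0.1, 0.3, 1.0))          # some other focus direction

src = tuple(R * c for c in kappa_src)
P = [R * cmath.exp(-1j * k * distance(r_m, src)) / distance(r_m, src) for r_m in mics]
C = [[P_m * P_n.conjugate() for P_m in P] for P_n in P]   # C[n][m] = P_m P_n^*

V_peak = plane_wave_power(C, mics, kappa_src, k)
I4_peak = imaging_function(C, mics, src, k)
V_off = plane_wave_power(C, mics, kappa_off, k)
I4_off = imaging_function(C, mics, tuple(R * c for c in kappa_off), k)
print(M ** 2 * I4_peak, V_peak ** 2, M ** 2 * I4_off, V_off ** 2)
```

The two quantities agree both at the source direction and away from it, up to the small error of the plane wave approximation at this distance.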
Obviously we have $M^2 I_4 = V^2$, showing us that apart from a constant factor the imaging
function in the far field equals the output of the Delay-And-Sum beamformer, which justifies
the chosen interpretation. Due to this connection with the plane wave case we can expect
improved side lobe levels from the self-term free version of the imaging function (24):
$$J_4(\mathbf{r},\omega) \equiv \frac{\Bigl|\sum_{m\neq n}^{M} C_{nm}\, v_m^*(\mathbf{r})\, v_n(\mathbf{r})\Bigr|^2}{\sum_{m\neq n}^{M} |v_n(\mathbf{r})|^2\, |v_m(\mathbf{r})|^2}. \tag{26}$$

Figure 3. For a source in the extreme far field the difference, $R_n - R_m$, in the propagation path length to the transducers at $\mathbf{r}_n$ and $\mathbf{r}_m$ can be calculated from the vector diagram.

Figure 4. An example of a planar 66-channel beamformer. The microphone positions are randomly distributed inside the disc.

Figure 5. Comparison of the output of three different beamforming algorithms for a configuration with two incoherent 3 kHz monopole sources of equal strength. The data were generated using the array shown in Figure 4. In the legend I refers to the full cross-spectral imaging function (24), J is the cross-spectral imaging function (26) which excludes the auto-spectra, and V is the near-field delay-and-sum algorithm (15). All curves are normalized to 0 dB maximum level.
A comparison of the self-term free algorithm (26) and the full cross-spectrum methods (15)
and (24) confirms that auto-spectra exclusion provides lower sidelobe levels [Figure 5].
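The difference between the imaging functions (24) and (26) can be sketched in a few lines. The example below is not a reproduction of Figure 5: it assumes a single monopole, a hypothetical six-microphone array, and an idealized noise model in which averaging has removed self-noise from the cross-spectra so that it only adds a positive term to each auto-spectrum.

```python
import cmath
import math


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def imaging(C, mics, r, k, exclude_auto):
    """I_4 of (24) if exclude_auto is False, J_4 of (26) if it is True."""
    v = [cmath.exp(-1j * k * distance(r_m, r)) / distance(r_m, r) for r_m in mics]
    num, den = 0j, 0.0
    for m in range(len(mics)):
        for n in range(len(mics)):
            if exclude_auto and m == n:
                continue
            num += C[n][m] * v[m].conjugate() * v[n]
            den += abs(v[n]) ** 2 * abs(v[m]) ** 2
    return abs(num) ** 2 / den


mics = [(0.0, 0.0, 0.0), (0.3, 0.1, 0.0), (-0.2, 0.25, 0.0),
        (-0.15, -0.3, 0.0), (0.25, -0.2, 0.0), (0.05, 0.35, 0.0)]
source = (0.2, 0.0, 1.0)
k = 2 * math.pi * 3000.0 / 343.0

# Noise-free cross-spectra of a unit-strength monopole, C[n][m] = p_m p_n^*.
p = [cmath.exp(-1j * k * distance(r_m, source)) / distance(r_m, source) for r_m in mics]
C_clean = [[p_m * p_n.conjugate() for p_m in p] for p_n in p]

# Hypothetical self-noise powers, present in the auto-spectra only.
noise = [0.5, 2.0, 0.1, 1.5, 0.8, 3.0]
C_noisy = [row[:] for row in C_clean]
for m, sigma2 in enumerate(noise):
    C_noisy[m][m] += sigma2

I_clean = imaging(C_clean, mics, source, k, exclude_auto=False)
I_noisy = imaging(C_noisy, mics, source, k, exclude_auto=False)
J_clean = imaging(C_clean, mics, source, k, exclude_auto=True)
J_noisy = imaging(C_noisy, mics, source, k, exclude_auto=True)

grid_x = [round(-0.5 + 0.1 * i, 1) for i in range(11)]
J_map = [imaging(C_noisy, mics, (x, 0.0, 1.0), k, exclude_auto=True) for x in grid_x]
peak_x = grid_x[J_map.index(max(J_map))]
print(I_clean, I_noisy, J_clean, J_noisy, peak_x)
```

Under this noise model $J_4$ is unchanged by the self-noise and still peaks at the source, whereas $I_4$ is biased by the noisy auto-spectra.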
SUMMARY
In this paper we have discussed the possible benefits of excluding the auto spectra in cross-
spectral beamforming algorithms for stationary sound fields. Furthermore we have presented
a self-contained derivation of a near field cross-spectral beamforming algorithm, which
includes amplitude corrections.
REFERENCES
1. D. H. Johnson and D. E. Dudgeon, Array Signal Processing (Prentice Hall, New Jersey, 1993).
2. J. Hald and J.J. Christensen, A class of optimal broadband phased array geometries designed for easy
construction, Proceedings of Internoise 2002.
3. G. Elias, Proceedings of Internoise 1995, pp. 1175-1178.