Zhang 2023 Distributed
ABSTRACT

Personal sound zone (PSZ) systems, which aim to create listening (bright) and silent (dark) zones in neighboring regions of space, often operate under time-varying acoustics. Conventional adaptive methods for PSZ tasks collect and process the acoustic transfer functions (ATFs) between all the matching microphones and all the loudspeakers in a centralized manner, resulting in high computational complexity and costly accuracy requirements. This paper presents a distributed pressure-matching (PM) method relying on diffusion adaptation (DPM-D) to spread the computational load among nodes in order to overcome these issues. The global PM problem is defined as a sum of local costs, and the diffusion adaptation approach is then used to create a distributed solution that only needs local information exchanges. Simulations over multi-frequency bins and a computational complexity analysis are conducted to evaluate the properties of the algorithm and to compare it with centralized counterparts.

Index Terms: Personal sound zone, pressure matching, distributed networks, diffusion adaptation.

1. INTRODUCTION

Personal sound zone (PSZ) systems aim to generate individual zones for users within a spatial control region by employing a loudspeaker array [1]. They can be used in a wide range of commercial audio applications, including neckband headsets [2], car cabin audio [3], mobile devices [4], and television sound systems [5], to cite a few. Among the methods adopted for sound zone generation, there are two typical approaches, namely the acoustic contrast control (ACC) methods [6–8] and the pressure matching (PM) methods [9–12]. The ACC algorithms maximize the energy ratio between the bright and dark zones. The PM algorithms, on the other hand, minimize the mean-square error (MSE) between the reproduced sound field in the zones and the target sound field. Note that these methods should be implemented with online estimates of room impulse responses (RIRs), which are often time-variant [13, 14].

Existing adaptive PSZ algorithms employ centralized approaches to gather and process the acoustic transfer functions (ATFs) between all loudspeakers and all matching points [15]. However, the practicality of these centralized solutions is limited in large-scale applications and distributed system deployments due to their high computational complexity. To address this, network-wide ACC-based methods have been proposed, which leverage a wireless acoustic sensor and actuator network (WASAN) to distribute the computational burden across nodes [16, 17]. In the method described in [16], each zone is treated as being covered by a single acoustic node, composed of $L$ loudspeakers and one microphone. However, this approach requires significant processing capacity for each node. Alternatively, in [17], a distributed adaptive ACC algorithm using a gradient-based generalized eigenvalue decomposition (GEVD) approach is devised to solve the centralized problem, achieving performance comparable to that of its centralized counterpart. Nevertheless, this method relies on a root node to compute the global gradient vector and disseminate it to all other nodes through a communication tree. If a communication or computation failure occurs on the root node, the entire network can fail. To overcome these limitations, the diffusion adaptation strategy [18–20] presents an attractive solution with enhanced adaptation performance and wider stability ranges. It operates without relying on a root node and allows each node to share its local estimates with its immediate neighbors, optimizing a global cost collaboratively. Moreover, the optimization objective of the PM method, which is a quadratic form, can be decomposed into a sum of local cost functions, making it well suited to exploiting the benefits of diffusion strategies. These factors motivated us to develop a distributed PM algorithm using diffusion adaptation.

In this paper, we examine a real-world PSZ application scenario where the ATFs exhibit time-varying characteristics and are subject to regular perturbations caused by physical and environmental factors [12, 21]. Within this context, we propose a distributed PM algorithm designed for networks. Our approach considers a system architecture where each node consists of multiple microphones, multiple loudspeakers, and a local processor with communication and computing capabilities. Moreover, we impose the constraint that each node $k$ can only access its local data, specifically the measured ATFs between its microphones and all the loudspeakers. Subsequently, we decompose the global cost function of the PM algorithm across the nodes to address the optimization problem in a distributed manner. We conduct simulations to validate the proposed distributed pressure matching via diffusion adaptation (DPM-D) algorithm across multiple frequency bins. This approach stands out significantly from the existing literature, which predominantly emphasizes the use of acoustic contrast control (ACC)-based distributed methods. Our proposed method overcomes a limitation of the distributed ACC approach, which necessitates a root node for communication and computation.

The work of M. Zhang was supported partly by the China Scholarship Council and partly by the Innovation Foundation for Doctor Dissertation of Northwestern Polytechnical University. The work of J. Chen is partially supported by NSFC grant 61671382, Shaanxi Key Industrial Innovation Chain Project 2022ZDLGY01-02, and Xi'an Technology Industrialization Plan XA2020-RGZNTJ-0076. The work of C. Richard was funded in part by the PIA program under its IDEX UCAJEDI project (ANR-15-IDEX-0001) and by ANR under grant ANR-19-CE48-0002.

Notation. Normal font letters $x$, boldface small letters $\mathbf{x}$ and capital letters $\mathbf{X}$ denote scalars, column vectors and matrices, respectively. $\mathbb{C}$ denotes the complex field. The transpose and Hermitian transpose of a vector or matrix are denoted by $(\cdot)^\top$ and $(\cdot)^H$, respectively. $(\cdot)^*$
denotes the complex conjugation operator. $\mathbf{1}_N$ denotes the all-one vector of size $N$. $\mathcal{N}_k$ denotes the set of one-hop neighboring nodes of node $k$, including $k$, and $|\mathcal{N}_k|$ denotes the cardinality of $\mathcal{N}_k$.

2. PROBLEM FORMULATION AND CENTRALIZED ADAPTATION STRATEGY

2.1. System model

As shown in Fig. 1(a), we assume an array of $L$ loudspeakers and two spatial regions with $M$ control points in total, delimited by $M_b$ (bright zone) and $M_d$ (dark zone) control points, respectively. The objective of this system is to render a target sound field in the bright zone with minimal interference in the dark zone. The subscripts $b$ and $d$ denote the bright and dark zones, respectively.

With the frequency domain approach, the sound pressure $p_m$ at the $m$-th control point is given by:

$$p_m(f) = \sum_{l=1}^{L} h_{m,l}(f)\, g_l(f) = \mathbf{h}_m(f)\,\mathbf{g}(f), \tag{1}$$

where $f$ is the frequency, $h_{m,l}(f) \in \mathbb{C}$ denotes the ATF between the $l$-th loudspeaker and the $m$-th control point, $g_l(f) \in \mathbb{C}$ denotes the control filter of the $l$-th loudspeaker, and:

$$\mathbf{h}_m(f) = [h_{m,1}(f) \ \cdots \ h_{m,L}(f)], \tag{2}$$

$$\mathbf{g}(f) = [g_1(f) \ \cdots \ g_L(f)]^\top. \tag{3}$$

We combine the ATF matrices of the bright and dark zones into a matrix $\mathbf{H}(f)$ as follows:

$$\mathbf{H}(f) = \begin{bmatrix} \mathbf{H}_b \\ \mathbf{H}_d \end{bmatrix} = \begin{bmatrix} \mathbf{h}_1(f) \\ \vdots \\ \mathbf{h}_{M_b}(f) \\ \mathbf{h}_{M_b+1}(f) \\ \vdots \\ \mathbf{h}_M(f) \end{bmatrix}, \tag{4}$$

where $\mathbf{H}_b$, consisting of the first $M_b$ row vectors of $\mathbf{H}$, denotes the ATFs for the bright zone. $\mathbf{H}_d$, consisting of the other $M_d = M - M_b$ row vectors of $\mathbf{H}$, denotes the ATFs for the dark zone. Likewise, the vector $\mathbf{p}(f)$ containing the sound pressure at the $M$ control points is defined as:

$$\mathbf{p}(f) = [p_1(f) \ \cdots \ p_{M_b}(f), \ p_{M_b+1}(f) \ \cdots \ p_M(f)]^\top = \mathbf{H}(f)\,\mathbf{g}(f), \tag{5}$$

and the desired signal $\mathbf{d}(f)$ at all the $M$ control points as:

$$\mathbf{d}(f) = [d_1(f) \ \cdots \ d_{M_b}(f), \ d_{M_b+1}(f) \ \cdots \ d_M(f)]^\top. \tag{6}$$

From (5) and (6), the estimated error at the control points is given by:

$$\mathbf{e}(f) = \mathbf{p}(f) - \mathbf{d}(f). \tag{7}$$

2.2. Centralized adaptive PM

ATFs are usually measured beforehand, during a pre-calibration stage, and then used to compute the control filter. Nonetheless, perturbations are unavoidable during the actual measurement of the ATFs because of, e.g., a position mismatch of sensors, RIR variations caused by changes in room temperature and humidity, changes in the electroacoustic response, and background noise in the ATF measurement procedure. It is therefore reasonable to continuously estimate the control filter $\mathbf{g}$ with the ATFs being updated over time. We rewrite the estimated error of (7) at the $n$-th time block as:

$$\mathbf{e}(n,f) = \mathbf{p}(n,f) - \mathbf{d}(n,f). \tag{8}$$

We consider the mean-square error criterion $J(n,f) \triangleq \|\mathbf{e}(n,f)\|^2$ to formulate the problem of frequency-domain adaptive PM [15]:

$$\mathbf{g}^o(n,f) = \arg\min_{\mathbf{g}} J(n,f). \tag{9}$$

Taking the derivative of $J(n,f)$ w.r.t. $\mathbf{g}$ and considering the complex least mean squares (LMS) algorithm [22], the adaptive update equation can be written as:

$$\mathbf{g}(n+1,f) = \mathbf{g}(n,f) - \mu\,\mathbf{H}^H(n,f)\,\mathbf{e}(n,f), \tag{10}$$

where $\mu > 0$ is the step size. Inspecting (10), we observe that the error signals as well as the ATFs between each control point and all the loudspeakers are necessary for computing the control filter at each iteration. This means that (10) must be processed in a centralized manner [23].

3. DISTRIBUTED PRESSURE MATCHING VIA DIFFUSION ADAPTATION

3.1. Distributed diffusion adaptation strategy

Consider solving (9) in a collaborative and distributed manner. Before devising the proposed algorithm, we briefly introduce the diffusion adaptation strategy. Consider a connected network composed of $N$ nodes. The problem is to estimate an unknown complex vector $\mathbf{g}^o \in \mathbb{C}^{L \times 1}$ such that the following global cost is minimized:

$$J^{\text{glob}}(\mathbf{g}) = \sum_{k=1}^{N} J_k(\mathbf{g}), \tag{11}$$

where $J_k(\mathbf{g})$ denotes a real-valued function accessible to node $k$ that is assumed to be convex.

The typical adapt-then-combine (ATC) diffusion LMS strategy is written in the following form [19]:

$$\boldsymbol{\psi}_{k,n+1} = \mathbf{g}_{k,n} - \mu_k \big[\widehat{\nabla J}_k(\mathbf{g}_{k,n})\big]^*, \tag{12}$$

$$\mathbf{g}_{k,n+1} = \sum_{\ell \in \mathcal{N}_k} a_{\ell k}\, \boldsymbol{\psi}_{\ell,n+1}, \tag{13}$$

where $\widehat{\nabla J}_k(\mathbf{g}_{k,n})$ denotes a stochastic approximation of the true local gradient $\nabla J_k(\mathbf{g}_{k,n})$, the nonnegative coefficients $a_{\ell k}$ denote the $(\ell,k)$-th entries of a left-stochastic matrix $\mathbf{A}$, satisfying:

$$\mathbf{A}^\top \mathbf{1}_N = \mathbf{1}_N, \quad a_{\ell k} = 0 \ \text{if} \ \ell \notin \mathcal{N}_k, \tag{14}$$

to ensure convergence in the mean sense towards $\mathbf{g}^o$, and $\mu_k$ is a positive step size at node $k$.

3.2. Distributed PM via diffusion LMS

To facilitate the presentation of the proposed strategy, we define the network model as follows. We focus on an $N$-node distributed personal sound zone network. Node $k$ is a module consisting of one or more microphones in the control zone, one or more loudspeakers, and a processor with communication and computation capabilities. A network of multi-channel nodes capable of processing the data of several microphones and loudspeakers is depicted in Fig. 1(b)-1(c). Note that, in practice, the system architecture matches the characteristics of the application at hand (number of processors, number of microphones and loudspeakers, room size, etc.).

For simplicity, in this paper, we consider the case where each node is equipped with $M_k$ microphones and $L_k$ loudspeakers. We write $M = \sum_{k=1}^{N} M_k$ and $L = \sum_{k=1}^{N} L_k$. Let $\{\mathcal{C}_k\}_{k=1}^{N}$ be a partition of the set of indices $\mathcal{C} = \{1, \cdots, M\}$, specifically,

$$\bigcup_{k=1}^{N} \mathcal{C}_k = \mathcal{C}, \quad \mathcal{C}_k \cap \mathcal{C}_\ell = \emptyset \ \text{if} \ k \neq \ell, \tag{15}$$
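As an illustration of Sections 2.2 and 3.1, the following minimal sketch contrasts the centralized LMS update (10) with the ATC recursion (12)-(13) for a single frequency bin. It is not the paper's DPM-D update, which is not reproduced in this section. It assumes local quadratic costs $J_k(\mathbf{g}) = \|\mathbf{H}_k\mathbf{g} - \mathbf{d}_k\|^2$ built from the rows of $\mathbf{H}$ indexed by $\mathcal{C}_k$, static ATFs, a common step size for all nodes, and a target synthesized as $\mathbf{d} = \mathbf{H}\mathbf{g}_{\text{ref}}$ so that convergence is easy to check. The function names, the ATF values, the three-node line topology and the combination weights are all hypothetical.

```python
# Sketch: centralized LMS update (10) versus ATC diffusion (12)-(13) for a
# single frequency bin. All numerical values are illustrative assumptions.

def gradient(H_rows, d_vals, g):
    """Return H^H (H g - d), the LMS direction used in (10)."""
    err = [sum(h * x for h, x in zip(row, g)) - d
           for row, d in zip(H_rows, d_vals)]
    return [sum(row[i].conjugate() * e for row, e in zip(H_rows, err))
            for i in range(len(g))]


def centralized_lms(H, d, mu, n_iter):
    """Iterate g <- g - mu * H^H e with all ATFs held by one processor."""
    g = [0j] * len(H[0])
    for _ in range(n_iter):
        g = [x - mu * s for x, s in zip(g, gradient(H, d, g))]
    return g


def atc_diffusion(H_nodes, d_nodes, A, mu, n_iter):
    """Adapt-then-combine: node k uses only (H_k, d_k) and its neighbors."""
    n_nodes, n_spk = len(H_nodes), len(H_nodes[0][0])
    g = [[0j] * n_spk for _ in range(n_nodes)]
    for _ in range(n_iter):
        # Adaptation step (12): local gradient step at every node.
        psi = [[x - mu * s
                for x, s in zip(g[k], gradient(H_nodes[k], d_nodes[k], g[k]))]
               for k in range(n_nodes)]
        # Combination step (13): g_k = sum_l a_{lk} psi_l, with A[l][k] = a_{lk}.
        g = [[sum(A[l][k] * psi[l][i] for l in range(n_nodes))
              for i in range(n_spk)]
             for k in range(n_nodes)]
    return g


# Hypothetical setup: L = 2 loudspeakers, N = 3 nodes with one microphone each.
H = [[1 + 0j, 0j], [0j, 1j], [0.6 + 0j, 0.8j]]   # row m plays the role of h_m
g_ref = [0.5 - 0.5j, 1j]                          # filter that generates the target
d = [sum(h * x for h, x in zip(row, g_ref)) for row in H]

# Line topology 1-2-3: columns sum to one and a_{lk} = 0 for non-neighbors (14).
A = [[2 / 3, 1 / 3, 0.0],
     [1 / 3, 1 / 3, 1 / 3],
     [0.0, 1 / 3, 2 / 3]]

H_nodes = [[row] for row in H]   # C_k = {k}: node k owns control point k
d_nodes = [[v] for v in d]

g_central = centralized_lms(H, d, mu=0.05, n_iter=3000)
g_nodes = atc_diffusion(H_nodes, d_nodes, A, mu=0.05, n_iter=3000)

err_central = max(abs(a - b) for a, b in zip(g_central, g_ref))
err_nodes = max(abs(a - b) for g_k in g_nodes for a, b in zip(g_k, g_ref))
print("centralized max error:", err_central)
print("diffusion max error over nodes:", err_nodes)
```

In this toy setup no single node can identify both filter coefficients from its own row of $\mathbf{H}$ (node 1 only observes the first loudspeaker, for instance), yet every node reaches the common solution through exchanges with its immediate neighbors, and no root node is involved.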
[Figure graphics not recoverable from extraction. Surviving diagram labels: "Dark zone", "Bright zone"; dimensions 0.075 m, 0.225 m, 1 m, 1.5 m, 7.346 m, 8.088 m.]
Fig. 2: Comparison of the NMSE and AC learning curves. Shaded regions in (a) represent the three standard deviations of the estimates.
Fig. 3: Multi-frequency control performances of the proposed DPM-D algorithm for Systems 1 and 2 compared with the CPM algorithm on
control points. (a) NMSE on frequency bins after 5000 iterations; (b) AC on frequency bins after 5000 iterations.
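Table 1: Comparison of computational complexity

| Algorithms | Additions | Multiplications |
|---|---|---|
| CPM | $(M + L) \times F \log_2 F + M \times L$ | $(M + L) \times \frac{F}{2} \log_2 F + (M + 1) \times L$ |
| Proposed DPM-D | $(M_k + L_k) \times F \log_2 F + (\lvert\mathcal{C}_k\rvert + \lvert\mathcal{N}_k\rvert - 1) \times L$ | $(M_k + L_k) \times \frac{F}{2} \log_2 F + (\lvert\mathcal{C}_k\rvert + \lvert\mathcal{N}_k\rvert + 1) \times L$ |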
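To give a feel for the orders of magnitude in Table 1, the short sketch below evaluates the per-iteration operation counts for one hypothetical configuration. It assumes that the FFT term of the multiplication count reads $\frac{F}{2}\log_2 F$; the function names and the sizes used (16 control points, 8 loudspeakers, $F = 512$, and a node with 2 microphones, 1 loudspeaker and 3 nodes in its neighborhood) are illustrative choices, not values from the paper.

```python
from math import log2


def cpm_ops(M, L, F):
    """Per-iteration (additions, multiplications) of CPM, following Table 1."""
    additions = (M + L) * F * log2(F) + M * L
    multiplications = (M + L) * (F / 2) * log2(F) + (M + 1) * L
    return additions, multiplications


def dpmd_ops(M_k, L_k, L, n_ctrl_k, n_neigh_k, F):
    """Per-iteration (additions, multiplications) of DPM-D at node k.

    n_ctrl_k is |C_k| and n_neigh_k is |N_k| (the neighborhood includes k).
    """
    additions = (M_k + L_k) * F * log2(F) + (n_ctrl_k + n_neigh_k - 1) * L
    multiplications = ((M_k + L_k) * (F / 2) * log2(F)
                       + (n_ctrl_k + n_neigh_k + 1) * L)
    return additions, multiplications


# Hypothetical sizes, chosen only to illustrate the formulas.
central = cpm_ops(M=16, L=8, F=512)
per_node = dpmd_ops(M_k=2, L_k=1, L=8, n_ctrl_k=2, n_neigh_k=3, F=512)
print("CPM   (add, mul):", central)
print("DPM-D (add, mul):", per_node)
```

For these sizes the FFT term dominates both counts, so the per-node load scales roughly with $(M_k + L_k)/(M + L)$ relative to the centralized processor.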
[...] NMSE convergence behavior at the control and validation points for the bright zone, and Fig. 2(b) reports the AC at the control and validation points (which includes information from both the bright and the dark zones). At the control and validation points, the DPM-D algorithm reaches a steady state comparable to that of the CPM algorithm. At steady state on the control points, the NMSE and the AC are approximately equal to −16 dB and 16 dB, respectively. At steady state on the validation points, the NMSE and the AC are approximately equal to −14 dB and 14 dB, respectively. Regarding the DPM-D method, the distributed PM with System 2 achieves superior convergence performance. The reason is that nodes in System 2 use more microphone measurements during the update step (18). However, the resulting computational complexity is larger because it grows with the number of microphones maintained by each node.

In order to examine the performance of the proposed algorithm over multi-frequency bins, we then tested the desired signal for the bright zone with frequency bins ranging from 100 to 4000 Hz with a step of 100 Hz. Other experimental settings were identical to those considered before with the single-frequency signal. Fig. 3 shows the NMSE and AC behaviors at various frequencies. It can be observed that the DPM-D with both systems and the CPM perform almost the same over most frequency bins after 5000 iterations. Fig. 3(a) and Fig. 3(b) indicate that adding microphones to a node in a distributed system does not significantly improve its NMSE steady-state performance and its AC steady-state performance, respectively.

We further conducted an analysis of the computational complexity of the frequency domain-based PM methods, as shown in Table 1. The parameter $F$ in this table represents the number of FFT operations. We evaluated the computational load at each iteration on the centralized processor for centralized algorithms and on the processor of each node $k$ for distributed algorithms. The computational complexity was divided into two components: the FFT operation and the frequency domain processing. The results presented in Table 1 demonstrate that the proposed distributed method effectively distributes the computational load among the processors of the nodes, thereby reducing the communication load compared to the centralized approach. This improvement in load distribution enhances the scalability of the system.

5. CONCLUSION

The novelty of the work presented in this paper resides in the utilization of a distributed PM approach based on diffusion LMS for handling PSZ tasks. This approach stands out significantly from the existing literature, which predominantly emphasizes the use of acoustic contrast control (ACC)-based distributed methods. Our proposed method overcomes a limitation of the distributed ACC approach, which necessitates a root node for communication and computation, by enabling each node to independently estimate and share information with its neighboring nodes.
6. REFERENCES

[1] T. Betlehem, W. Zhang, M. A. Poletti, and T. D. Abhayapala, “Personal sound zones: Delivering interface-free audio to multiple listeners,” IEEE Signal Process. Mag., vol. 32, no. 2, pp. 81–91, 2015.

[2] S. W. Jeon and J. W. Choi, “Personal audio system for neckband headset with low computational complexity,” J. Acoust. Soc. Am., vol. 148, no. 6, pp. 3913–3927, 2020.

[3] H. So and J. W. Choi, “Subband optimization and filtering technique for practical personal audio systems,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2019, pp. 8494–8498.

[4] J. Cheer, S. J. Elliott, Y. Kim, and J.-W. Choi, “Practical implementation of personal audio in a mobile device,” J. Audio Eng. Soc., vol. 61, no. 5, pp. 290–300, 2013.

[5] M. F. Simón Gálvez, S. J. Elliott, and J. Cheer, “Personal audio loudspeaker array as a complementary TV sound system for the hard of hearing,” IEICE Trans. Fundamentals, vol. 97, no. 9, pp. 1824–1831, 2014.

[6] J.-W. Choi and Y.-H. Kim, “Generation of an acoustically bright zone with an illuminated region using multiple sources,” J. Acoust. Soc. Am., vol. 111, no. 4, pp. 1695–1700, 2002.

[7] M. Shin, S. Q. Lee, F. M. Fazi, P. A. Nelson, D. Kim, S. Wang, K. H. Park, and J. Seo, “Maximization of acoustic energy difference between two spaces,” J. Acoust. Soc. Am., vol. 128, pp. 121–131, 2010.

[8] P. Coleman, P. J. B. Jackson, M. Olik, M. B. Møller, M. Olsen, and J. A. Pederson, “Acoustic contrast, planarity and robustness of sound zone methods using a circular loudspeaker array,” J. Acoust. Soc. Am., vol. 135, no. 4, pp. 1929–1940, 2014.

[9] F. Olivieri, F. M. Fazi, S. Fontana, D. Menzies, and P. A. Nelson, “Generation of private sound with a circular loudspeaker array and the weighted pressure matching method,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 25, no. 8, pp. 1579–1591, 2017.

[10] V. Molés-Cases, G. Piñero, A. Gonzalez, and M. d. Diego, “Providing spatial control in personal sound zones using graph signal processing,” in Proc. IEEE Eur. Signal Process. Conf., 2019, pp. 1–5.

[11] V. Molés-Cases, S. J. Elliott, J. Cheer, G. Piñero, and A. Gonzalez, “Weighted pressure matching with windowed targets for personal sound zones,” J. Acoust. Soc. Am., vol. 151, no. 1, pp. 334–345, 2022.

[12] J. Zhang, L. Shi, M. G. Christensen, W. Zhang, L. Zhang, and J. Chen, “Robust pressure matching with ATF perturbation constraints for sound field control,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Singapore, May 2022, pp. 8712–8716.

[13] S. Spors, H. Buchner, R. Rabenstein, and W. Herbordt, “Active listening room compensation for massive multichannel sound reproduction systems using wave-domain adaptive filtering,” J. Acoust. Soc. Am., vol. 122, no. 1, pp. 354–369, 2007.

[14] L. J. Brännmark, A. Bahne, and A. Ahlén, “Compensation of loudspeaker–room responses in a robust MIMO control framework,” IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 6, pp. 1201–1216, 2013.

[15] L. Vindrola, M. Melon, J. C. Chamard, and B. Gazengel, “Use of the filtered-x least-mean-squares algorithm to adapt personal sound zones in a car cabin,” J. Acoust. Soc. Am., vol. 150, no. 3, pp. 1779–1793, 2021.

[16] G. Piñero, C. Botella, M. d. Diego, M. Ferrer, and A. González, “On the feasibility of personal audio systems over a network of distributed loudspeakers,” in Proc. IEEE Eur. Signal Process. Conf., 2017, pp. 2729–2733.

[17] R. V. Rompaey and M. Moonen, “Distributed adaptive acoustic contrast control for node-specific sound zoning in a wireless acoustic sensor and actuator network,” in Proc. IEEE Eur. Signal Process. Conf., 2021, pp. 481–485.

[18] C. G. Lopes and A. H. Sayed, “Diffusion least-mean squares over adaptive networks: Formulation and performance analysis,” IEEE Trans. Signal Process., vol. 56, no. 7, pp. 3122–3136, July 2008.

[19] A. H. Sayed, “Diffusion adaptation over networks,” in Academic Press Library in Signal Processing, M. Z. Abdelhak, V. Mats, C. Rama, and T. Sergios, Eds., vol. 3, pp. 323–454. Elsevier, 2014.

[20] J. Chen, C. Richard, and A. H. Sayed, “Diffusion LMS over multitask networks,” IEEE Trans. Signal Process., vol. 63, no. 11, pp. 2733–2748, Jun. 2015.

[21] Q. Zhu, P. Coleman, M. Wu, and J. Yang, “Robust personal audio reproduction based on acoustic transfer function modelling,” in Proc. AES Int. Conf. Sound Field Control, 2016.

[22] B. Widrow, J. McCool, and M. Ball, “The complex LMS algorithm,” Proc. of the IEEE, vol. 63, no. 4, pp. 719–720, 1975.

[23] A. H. Sayed et al., “Adaptation, learning, and optimization over networks,” Foundations and Trends® in Machine Learning, vol. 7, no. 4-5, pp. 311–801, 2014.

[24] E. A. P. Habets, “Room impulse response generator,” Technische Universiteit Eindhoven, Tech. Rep, vol. 2, no. 2.4, pp. 1, 2006.

[25] V. Molés-Cases, G. Piñero, M. d. Diego, and A. Gonzalez, “Personal sound zones by subband filtering and time domain optimization,” IEEE Trans. Audio, Speech, Lang. Process., vol. 28, pp. 2684–2696, 2020.