Pagan: A phase-adapted generative adversarial networks for speech enhancement

P Li, Z Jiang, S Yin, D Song, P Ouyang… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
P Li, Z Jiang, S Yin, D Song, P Ouyang, L Liu, S Wei
ICASSP 2020 - 2020 IEEE International Conference on Acoustics …, 2020 - ieeexplore.ieee.org
Deep neural networks (DNNs) are becoming increasingly popular in speech enhancement. Most DNN-based speech enhancement approaches currently operate on magnitude spectra and ignore the phase mismatch between noisy and clean speech, which greatly limits speech enhancement performance. This paper presents a new approach that addresses the phase mismatch problem by training a traditional DNN adversarially with a time-domain discriminator. Instead of estimating a more accurate phase, the DNN is trained to be better adapted to the noisy phase and to minimize the influence of the phase mismatch. We also propose a new evaluation metric to judge the degree of adaptation to the noisy phase. Experimental results show that adding a time-domain discriminator yields a more phase-adapted generator and significantly improves speech enhancement performance.
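The phase mismatch described in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's GAN, discriminator, or proposed metric; it only shows, on a single synthetic frame, that even a perfect magnitude estimate leaves a time-domain error once it is resynthesized with the noisy phase. The signal, the noise level, the frame length, and all helper names (`dft`, `idft`, `resynthesize`, `mse`) are assumptions made for illustration.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spec)
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def resynthesize(magnitude, phase):
    """Combine a magnitude spectrum with a phase spectrum and go back to time."""
    return idft([m * cmath.exp(1j * p) for m, p in zip(magnitude, phase)])


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


random.seed(0)
n = 64  # hypothetical frame length
clean = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
noisy = [c + 0.5 * random.gauss(0.0, 1.0) for c in clean]  # assumed noise level

clean_spec = dft(clean)
noisy_spec = dft(noisy)
clean_mag = [abs(c) for c in clean_spec]
clean_phase = [cmath.phase(c) for c in clean_spec]
noisy_phase = [cmath.phase(c) for c in noisy_spec]

# Oracle magnitude with the clean phase reconstructs the clean frame.
ideal = resynthesize(clean_mag, clean_phase)
# The same oracle magnitude with the noisy phase does not.
mismatched = resynthesize(clean_mag, noisy_phase)

err_ideal = mse(clean, ideal)
err_mismatch = mse(clean, mismatched)
print("MSE with clean phase:", err_ideal)
print("MSE with noisy phase:", err_mismatch)
```

The gap between the two errors is the residual that a magnitude-only objective cannot see. A discriminator that judges the resynthesized waveform, as the abstract proposes, is exposed to exactly this time-domain error.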