Morlet wavelet-based voice liveness detection using convolutional neural network

P Gupta, PK Chodingala… - 2022 30th European Signal …, 2022 - ieeexplore.ieee.org
2022 30th European Signal Processing Conference (EUSIPCO), 2022
Because an attacker is free to use any spoofing attack, liveness detection approaches are needed that can distinguish live speech from all types of spoofed speech. To that end, we propose a Morlet wavelet-based approach for Voice Liveness Detection (VLD). We use the acoustic cues of pop noise to discriminate a live speech signal from a spoofed one. Pop noise is present in live speech signals at low frequencies and is caused by human breath reaching a closely placed microphone. Compared with the STFT-based baseline, which has an overall accuracy of 62.08%, we obtain significantly improved performance: an overall accuracy of 80.00% on the evaluation set with 45-D handcrafted Morlet wavelet-based features, and 86.23% on the evaluation set with the Morlet scalogram. These results indicate that, for VLD, the wavelet transform-based time-frequency representation (scalogram) is more effective than the conventional STFT-based spectrogram. Furthermore, we analyze the effect of various phoneme types on the VLD performance of the proposed approach.
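To make the scalogram idea concrete, the following is a minimal NumPy sketch of a Morlet scalogram evaluated on a low-frequency grid, where pop noise is said to appear. It is an illustration only, not the authors' implementation: the function name `morlet_scalogram`, the wavelet parameter `w0 = 6.0`, the four-standard-deviation truncation, and the choice of centre frequencies are all assumptions that do not come from the abstract.

```python
import numpy as np

def morlet_scalogram(signal, fs, freqs, w0=6.0):
    """Magnitude of a Morlet wavelet transform at the given centre frequencies.

    signal : 1-D array of samples
    fs     : sampling rate in Hz
    freqs  : centre frequencies in Hz (assumed grid, not taken from the paper)
    w0     : Morlet centre-frequency parameter (assumed value)
    Returns an array of shape (len(freqs), len(signal)).
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    rows = []
    for f in freqs:
        # Scale (in seconds) whose Morlet wavelet is centred at f Hz.
        scale = w0 / (2.0 * np.pi * f)
        # Truncate the Gaussian envelope at +/- 4 standard deviations.
        half = int(np.ceil(4.0 * scale * fs))
        t = np.arange(-half, half + 1) / fs
        wavelet = np.exp(1j * w0 * t / scale) * np.exp(-0.5 * (t / scale) ** 2)
        # Approximate unit-energy normalisation for the sampled wavelet.
        wavelet /= np.sqrt(scale * fs) * np.pi ** 0.25
        # For the Morlet wavelet, psi(-t) equals conj(psi(t)), so convolving
        # with psi is the same as correlating with its complex conjugate.
        full = np.convolve(signal, wavelet, mode="full")
        # Keep the n samples aligned with the input time axis.
        rows.append(np.abs(full[half:half + n]))
    return np.vstack(rows)
```

For example, calling `morlet_scalogram(x, fs, np.arange(5, 101, 5))` on a recording `x` would give a 20-row time-frequency magnitude map restricted to 5 to 100 Hz. Such a map could then be passed to a classifier as an image-like input, in the spirit of the scalogram-based approach described above.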