Generalized Spoofing Detection Inspired from Audio Generation Artifacts

Gao, Yang; Vuong, Tyler; Elyasi, Mahsa; Bharaj, Gaurav; Singh, Rita


arXiv:2104.04111 (cs)

[Submitted on 8 Apr 2021 (v1), last revised 26 Jun 2021 (this version, v2)]



Abstract: State-of-the-art methods for audio generation suffer from fingerprint artifacts and repeated inconsistencies across the temporal and spectral domains. Such artifacts can be captured well by frequency-domain analysis of the spectrogram. We therefore propose a novel use of a long-range spectro-temporal modulation feature, the 2D DCT over the log-Mel spectrogram, for audio deepfake detection. We show that this feature captures such artifacts better than the log-Mel spectrogram, CQCC, and MFCC. Alongside this feature, we employ spectrum augmentation and feature normalization to reduce overfitting and bridge the gap between the training and test datasets. We developed a CNN-based baseline that achieved a t-DCF of 0.0849 and outperformed the previous top single systems reported in the ASVspoof 2019 challenge. Finally, by combining our baseline with the proposed 2D DCT spectro-temporal feature, we reduce the t-DCF by 14% to 0.0737, making it a state-of-the-art system for spoofing detection. Furthermore, we evaluate our model on two external datasets, demonstrating the proposed feature's generalization ability. We also provide analysis and ablation studies for the proposed feature and results.
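The feature named in the abstract, a 2D DCT over a log-Mel spectrogram, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dct2_matrix` and `modulation_feature`, the orthonormal DCT-II scaling, and the 8-band, 16-frame toy input are all assumptions made here for illustration, and the log-Mel spectrogram is taken as a ready-made array rather than computed from audio.

```python
import numpy as np


def dct2_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix; row k is the k-th basis vector."""
    k = np.arange(n)[:, None]   # basis (frequency) index
    i = np.arange(n)[None, :]   # sample index
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] *= np.sqrt(1.0 / n)   # scaling that makes the rows orthonormal
    basis[1:, :] *= np.sqrt(2.0 / n)
    return basis


def modulation_feature(log_mel: np.ndarray) -> np.ndarray:
    """Apply a 2D DCT-II to a (n_mels, n_frames) log-Mel spectrogram.

    Hypothetical helper: the DCT runs along the Mel axis and then along the
    time axis, so entry (p, q) measures a ripple with index p across
    frequency and index q across time.
    """
    n_mels, n_frames = log_mel.shape
    return dct2_matrix(n_mels) @ log_mel @ dct2_matrix(n_frames).T


# Toy input (assumed, not real audio): every Mel band carries the same
# periodic ripple along time, standing in for a repeated generation artifact.
frames = np.arange(16)
ripple = np.cos(np.pi * (2 * frames + 1) * 5 / 32)
toy_log_mel = np.tile(ripple, (8, 1))

feature = modulation_feature(toy_log_mel)
peak = np.unravel_index(np.argmax(np.abs(feature)), feature.shape)
```

In this toy case the ripple that is spread over every frame of the spectrogram collapses into a single dominant coefficient, located at `peak`, which is the intuition behind using a spectro-temporal modulation feature to expose repeated artifacts. With SciPy available, `scipy.fft.dctn(log_mel, type=2, norm="ortho")` computes the same transform.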

Comments:	Camera ready version. Accepted by INTERSPEECH 2021
Subjects:	Sound (cs.SD); Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2104.04111 [cs.SD]
	(or arXiv:2104.04111v2 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2104.04111

Submission history

From: Yang Gao
[v1] Thu, 8 Apr 2021 23:02:56 UTC (13,212 KB)
[v2] Sat, 26 Jun 2021 00:14:30 UTC (13,046 KB)
