Learn from Real: Reality Defender's Submission to ASVspoof5 Challenge

Zhu, Yi; Goel, Chirag; Koppisetti, Surya; Tran, Trang; Kumar, Ankur; Bharaj, Gaurav

arXiv:2410.07379 (eess)

[Submitted on 9 Oct 2024]

Abstract: Audio deepfake detection is crucial to combat the malicious use of AI-synthesized speech. Among many efforts undertaken by the community, the ASVspoof challenge has become one of the benchmarks to evaluate the generalizability and robustness of detection models. In this paper, we present Reality Defender's submission to the ASVspoof5 challenge, highlighting a novel pretraining strategy which significantly improves generalizability while maintaining low computational cost during training. Our system SLIM learns the style-linguistics dependency embeddings from various types of bonafide speech using self-supervised contrastive learning. The learned embeddings help to discriminate spoof from bonafide speech by focusing on the relationship between the style and linguistics aspects. We evaluated our system on ASVspoof5, ASV2019, and In-the-wild. Our submission achieved minDCF of 0.1499 and EER of 5.5% on ASVspoof5 Track 1, and EER of 7.4% and 10.8% on ASV2019 and In-the-wild, respectively.
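
The abstract reports results as equal error rate (EER), the operating point at which the false acceptance rate on spoofed speech equals the false rejection rate on bonafide speech. As a rough illustration of how such a figure can be obtained from detector scores, here is a minimal sketch. It is not the challenge's official scoring code, and the function name `compute_eer`, the score convention (higher means more bonafide-like), and the toy scores are all assumptions made for this example.

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold). Assumes a higher score means 'more bonafide'."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every distinct score observed, in ascending order.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    # False rejection rate: fraction of bonafide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: fraction of spoof trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # The EER lies where the two error curves cross; take the closest sampled point.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return eer, thresholds[idx]

# Hypothetical toy scores, for illustration only.
eer, thr = compute_eer([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.6])
print(f"EER = {eer:.2%} at threshold {thr}")
```

With only a handful of trials the two error curves may not cross exactly, so the sketch averages the two rates at the nearest sampled threshold; evaluation toolkits typically interpolate instead.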

Comments:	Accepted into ASVspoof5 workshop
Subjects:	Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2410.07379 [eess.AS]
	(or arXiv:2410.07379v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2410.07379

Submission history

From: Yi Zhu
[v1] Wed, 9 Oct 2024 18:55:28 UTC (4,564 KB)
