DFCon: Attention-Driven Supervised Contrastive Learning for Robust Deepfake Detection

Shanto, MD Sadik Hossain; Dihan, Mahir Labib; Ghosh, Souvik; Anonto, Riad Ahmed; Chowdhury, Hafijul Hoque; Muhtasim, Abir; Ahsan, Rakib; Hassan, MD Tanvir; Sojib, MD Roqunuzzaman; Hakim, Sheikh Azizul; Rahman, M. Saifur

arXiv:2501.16704 (cs)

[Submitted on 28 Jan 2025]

Abstract: This report presents our approach for the IEEE SP Cup 2025: Deepfake Face Detection in the Wild (DFWild-Cup), which focuses on detecting deepfakes across diverse datasets. Our methodology employs three backbone models, MaxViT, CoAtNet, and EVA-02, each fine-tuned with a supervised contrastive loss to enhance feature separation. These models were chosen for their complementary strengths. MaxViT integrates convolution layers with strided attention, which makes it well suited to detecting local features. CoAtNet combines convolution and attention mechanisms in a hybrid design that effectively captures multi-scale features. EVA-02 benefits from robust pretraining with masked image modeling and excels at capturing global features. After contrastive fine-tuning, we freeze the parameters of these models and train the classification heads. Finally, a majority voting ensemble combines the predictions of the three models, improving robustness and generalization to unseen scenarios. The proposed system addresses the challenges of detecting deepfakes in real-world conditions and achieves an accuracy of 95.83% on the validation dataset.
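
The two mechanisms named in the abstract, a supervised contrastive loss and a majority voting ensemble, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the commonly cited form of the supervised contrastive loss on plain Python lists, and the function names (`supcon_loss`, `majority_vote`), the temperature of 0.1, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math
from collections import Counter


def l2_normalize(v):
    """Scale a vector to unit length; the loss below compares normalized embeddings."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors with at least one positive.

    For each anchor, samples sharing its label are positives and every other
    sample in the batch appears in the softmax denominator.
    The temperature value is an illustrative assumption, not taken from the report.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def majority_vote(predictions_per_model):
    """Combine one list of predicted labels per model into one label per sample."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


# Toy 2-D embeddings: two tight clusters on opposite sides of the origin.
toy = [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0], [-1.0, 0.1]]
loss_separated = supcon_loss(toy, [0, 0, 1, 1])  # labels agree with clusters
loss_mixed = supcon_loss(toy, [0, 1, 0, 1])      # labels cut across clusters

# Three hypothetical models voting on three images (0 = real, 1 = fake).
ensemble = majority_vote([[0, 1, 1], [0, 0, 1], [1, 1, 1]])
```

The loss is lower when same-label embeddings already sit close together, which is the feature separation the fine-tuning stage aims for. With three models and binary labels, as in the abstract, a vote can never tie; other configurations would need an explicit tie-breaking rule.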

Comments:	Technical report for IEEE Signal Processing Cup 2025, 7 pages
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Signal Processing (eess.SP)
Cite as:	arXiv:2501.16704 [cs.CV]
	(or arXiv:2501.16704v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.16704

Submission history

From: Mahir Labib Dihan
[v1] Tue, 28 Jan 2025 04:46:50 UTC (1,231 KB)
