Deepfake Detection Report
BELAGAVI-590018
Submitted by
Ruchitha MA 1RN21CS126
2024-2025
RNS INSTITUTE OF TECHNOLOGY
Affiliated to VTU, Recognized by GOK, Approved by AICTE, New Delhi
NAAC 'A+ Grade' Accredited, NBA Accredited (UG-CSE, ECE, ISE, EIE and EEE)
Channasandra, Dr.Vishnuvardhan Road, Bengaluru-560098
Ph:(080)28611880,28611881 URL:www.rnsit.ac.in
Department of Computer Science and Engineering
CERTIFICATE
Certified that the Technical Seminar entitled “Deepfake Detection Using ResNeXt and LSTM: A
Hybrid Deep Learning Approach” has been successfully carried out at RNSIT by Ruchitha MA,
bearing USN 1RN21CS126, a bona fide student of RNS Institute of Technology, in partial fulfillment
of the requirements of final year degree in Bachelor of Engineering in Computer Science and
The Seminar report has been approved as it satisfies the academic requirements in respect of Seminar
————————— ————————-
Ms. Soumya N G Dr. Vidya Y
Assistant professor Technical Seminar Coordinator
Dept. of CSE Associate Professor, Dept. of
CSE
————————— ————————-
Dr. Kavitha C Dr. Ramesh Babu H S
Dean and Head Principal
Dept. of CSE RNSIT
Acknowledgement
At the very outset, I would like to place on record my gratitude to all those people who have helped
me in making this seminar a reality. Our Institution has played a paramount role in guiding me in the
right direction. I would like to profoundly thank the Management of RNS Institute of Technology for
providing such a healthy environment for the successful completion of this Seminar.
I would like to thank our beloved Director, Dr. M K Venkatesha, for providing the necessary facilities
I would like to thank our beloved Principal, Dr. Ramesh Babu H S, for providing the necessary
Engineering, for having agreed to patronize me in the right direction with all her wisdom.
I would like to express my sincere thanks to our Coordinator, Dr. Vidya Y, Associate Professor, and
guide, Ms. Soumya N G, Assistant Professor, for their constant encouragement that motivated me
for the successful completion of this work. Last but not least, I am thankful to all the teaching
and non-teaching staff members of the Computer Science and Engineering Department for their
Ruchitha MA (1RN21CS126)
Abstract
Growing computational power has made deep learning algorithms so powerful that creating an
indistinguishable, human-synthesized video, popularly called a deepfake, has become very simple.
It is easy to envision scenarios in which these realistic face-swapped deepfakes are used to create
political distress, fake terrorism events, revenge porn, or to blackmail people. In this work, we describe
a new deep learning-based method that can effectively distinguish AI-generated fake videos from
real videos. Our method is capable of automatically detecting replacement and reenactment
deepfakes. We are trying to use Artificial Intelligence (AI) to fight Artificial Intelligence (AI). Our system
uses a ResNeXt Convolutional Neural Network to extract frame-level features, and these features are
then used to train a Long Short-Term Memory (LSTM) based Recurrent Neural Network (RNN)
to classify whether the video has been subjected to any kind of manipulation, i.e., whether the video is
a deepfake or a real video. To emulate real-time scenarios and make the model perform better on
real-time data, we evaluate our method on a large, balanced, mixed dataset prepared by
combining several available datasets: FaceForensics++ [1], Deepfake Detection Challenge [2], and
Celeb-DF [3]. We also show how our system can achieve competitive results using a very simple and
robust approach.
Contents
Acknowledgement
Abstract
List of Figures
List of Tables
1 INTRODUCTION
2 LITERATURE SURVEY
2.2.5 Deepfake Face Mask Dataset for Detection in the Infectious Disease Era
3 PROBLEM STATEMENT
3.1 Objectives
4 METHODOLOGY
4.7 Outcome
6 CONCLUSION
6.1 Conclusion
6.3 Summary
References
List of Figures
List of Tables
Chapter 1
INTRODUCTION
In a world of ever-growing social media platforms, deepfakes are considered a major threat
posed by AI. It is easy to envision scenarios in which these realistic face-swapped deepfakes are used
to create political distress, fake terrorism events, revenge porn, or to blackmail people. Examples
include deepfake nude videos of Brad Pitt and Angelina Jolie. It therefore becomes very important to
spot the difference between a deepfake and a pristine video. We are using AI to fight AI. Deepfakes are
created using tools like FaceApp [11] and Face Swap [12], which use pre-trained neural networks
such as GANs or autoencoders to create them. Our method uses a pre-trained ResNeXt CNN to
extract frame-level features and an LSTM-based artificial neural network to perform sequential
temporal analysis of the video frames. The frame-level features extracted by ResNeXt are
used to train the Long Short-Term Memory based artificial Recurrent Neural Network, which
classifies the video as deepfake or real. To emulate real-time scenarios and make the model perform
better on real-time data, we trained our method on a large, balanced combination of
available datasets: FaceForensics++ [1], Deepfake Detection Challenge [2], and Celeb-DF [3].
Further, to make the system ready to use, we have developed a front-end application where the
user uploads a video. The video is
processed by the model, and the output is rendered back to the user with the classification of the
1.1 INTRODUCTION ABOUT THE SEMINAR TOPIC
In today’s digital age, the rise of artificial intelligence (AI) has brought about significant advancements
in media generation and manipulation. One of the most concerning applications of AI is the creation
of deepfakes—highly realistic fake videos or images that can alter a person’s face, voice, or actions in
ways that are difficult to distinguish from real content. Deepfakes are generated using sophisticated AI
models like Generative Adversarial Networks (GANs) and Autoencoders, making them increasingly
The growing misuse of deepfake technology poses serious threats, including political misinfor-
mation, identity fraud, financial scams, blackmail, and fake news propagation. As deepfakes become
more prevalent, there is an urgent need for robust deepfake detection techniques to counteract their
harmful effects.
This seminar will explore the underlying technology behind deepfake creation, its implications
on society, and various AI-driven detection approaches. We will discuss our proposed deepfake
detection model, which leverages ResNeXt Convolutional Neural Networks (CNNs) for frame-level
feature extraction and a Long Short-Term Memory (LSTM) network for sequential video analysis.
Additionally, we will highlight the datasets used for training, real-world applications, and the
By the end of this seminar, participants will gain insights into how AI is being used both to
create and combat deepfakes, the challenges in detecting manipulated content, and the future scope
The current deepfake detection systems employ a variety of methods to identify manipulated
media. These approaches primarily focus on pixel-level analysis, handcrafted feature extraction,
and inconsistencies in facial expressions, eye movements, or lighting conditions. Some traditional
techniques also rely on motion analysis, frequency domain analysis, and physiological signal
However, despite these efforts, the existing systems face several limitations:
datasets and fail to generalize well when tested on deepfakes generated using different
methods struggle to keep up. Many of these systems rely on static feature-based detection,
3. High False-Negative and False-Positive Rates – Some models incorrectly classify real videos
as deepfakes (false positives) or fail to detect actual deepfakes (false negatives), reducing their
effectiveness in practical applications. This leads to misclassification risks and makes the
resources, making real-time detection challenging. This limits their usability in large-
scale applications like social media monitoring, video authentication, and law enforcement
investigations.
ignoring the temporal consistency of videos. This makes them less effective in detecting subtle
6. Inability to Detect Emerging Deepfake Techniques – New deepfake creation tools continu-
ously improve their ability to mimic real videos with minimal artifacts, making it harder for
older detection systems to adapt without constant retraining and dataset updates.
Due to these limitations, there is a growing need for more advanced, AI-driven deepfake detection
methodologies that can adapt to evolving threats, improve accuracy, and operate efficiently in real-
time environments.
The rise of deepfake technology presents a major challenge in today’s digital landscape. While AI-
driven synthetic media can be used for entertainment and creativity, the misuse of deepfakes poses
significant threats to privacy, security, and trust in digital content. The ability to create highly realistic
fake videos and images has led to political misinformation, financial fraud, identity theft, cybercrimes,
and reputational damage. As a result, deepfake detection has become an essential area of research and
technological development.
1. Preventing Misinformation and Fake News – Deepfakes can be used to spread false
information about public figures, governments, and global events, leading to social and political
instability. Reliable detection systems help combat fake news and media manipulation.
2. Enhancing Cybersecurity and Fraud Prevention – Attackers can use deepfakes to imperson-
ate individuals, conduct financial scams, or create phishing attacks. Detecting such fraudulent
3. Protecting Privacy and Preventing Harassment – Deepfake technology has been misused
violations. Effective detection helps mitigate such cybercrimes and supports legal enforcement.
4. Ensuring Trust in Media and Journalism – In an era where digital media plays a vital role
in communication, deepfake detection ensures that news organizations, social media platforms,
5. Forensic and Law Enforcement Applications – Law enforcement agencies require deepfake
models promotes ethical AI usage and ensures that AI technologies are used responsibly,
Chapter 2
LITERATURE SURVEY
Deepfake technology has gained prominence in recent years due to advancements in artificial
intelligence and deep learning. While it has potential applications in entertainment and media,
deepfakes pose significant threats in terms of misinformation, privacy violations, and fraud. Various
methods have been proposed to detect and mitigate deepfake content. This survey explores existing
• Feature Extraction: Analyzing pixel-level changes, facial landmarks, and temporal motion
dynamics.
• Machine Learning Models: Using classifiers like Support Vector Machines (SVM) and
Decision Trees.
• Limited Scope: Some systems are specific to face swaps and do not generalize well to other
manipulations.
detection.
• Dataset Dependency: Detection accuracy heavily depends on the quality and diversity of
training datasets.
• User Interface: Enabling users to upload and analyze videos for deepfake detection.
• Integration with Other Technologies: Some systems integrate with social media and
• Key Contribution: Proposes Dynamic Fine-Grained Difference Capture (DFDC) and Multi-
frame differences.
temporal inconsistencies.
• Key Contribution: Introduces Bipartite Group Sampling (BGS) and Multi-Rate Branches for
• Method: Uses an improved deep Convolutional Neural Network (D-CNN) for deepfake image
detection.
• Advantages: Achieves high accuracy across multiple datasets (AttGAN, GDWCT, StyleGAN,
etc.).
2.2.5 Deepfake Face Mask Dataset for Detection in the Infectious Disease Era
• Method: Introduces a Deepfake Face Mask Dataset (DFFMD) to enhance deepfake detection
in masked videos.
Recent advancements in deepfake detection have introduced novel techniques leveraging deep
learning, spatio-temporal analysis, and phase-based motion representation. Prashnani et al. (2024)
identify subtle manipulations in inter-frame differences. Similarly, Pang et al. (2023) developed
the Multi-Rate Excitation Network (MRE-Net), incorporating Bipartite Group Sampling (BGS) for
detecting inconsistencies in high-resolution deepfake videos. Patel et al. (2023) enhanced deep
Convolutional Neural Networks (D-CNN) for image-based deepfake detection, demonstrating high
accuracy across multiple datasets such as AttGAN and StyleGAN. Additionally, Alnaim et al. (2023)
addressed the emerging challenge of deepfake videos with face masks by introducing the Deepfake
Face Mask Dataset (DFFMD), enabling better detection under occluded conditions. These studies
highlight the ongoing improvements in generalization, dataset diversity, and real-time efficiency,
feature extraction. These innovations have improved detection accuracy and resilience against adver-
sarial manipulations. However, several critical challenges persist, including the high computational
cost of real-time processing, the heavy reliance on dataset quality and diversity, and the limited
generalization of models across different types of deepfake manipulations. Addressing these issues
requires a multi-faceted approach, integrating lightweight yet robust architectures, enhancing dataset
diversity to encompass a broader range of deepfake techniques, and developing adaptive algorithms
that can efficiently detect emerging threats in real-world scenarios. Future research should also focus
on improving interpretability, reducing false positives, and fostering collaboration between academia,
industry, and regulatory bodies to establish standardized benchmarks and countermeasures, ensuring
Chapter 3
PROBLEM STATEMENT
Convincing manipulations of digital images and videos have been demonstrated for several decades
through the use of visual effects, but recent advances in deep learning have led to a dramatic increase
in the realism of fake content and the ease with which it can be created. This so-called
AI-synthesized media is popularly referred to as deepfakes. Creating deepfakes with artificially
intelligent tools is a simple task, but detecting them is a major challenge.
There are already many examples in which deepfakes have been used as a powerful way
to create political tension, fake terrorism events, revenge porn, or to blackmail people. It is therefore
very important to detect these deepfakes and prevent them from percolating through social media
platforms. We have taken a step forward in detecting deepfakes using an LSTM-based artificial
neural network.
3.1 Objectives
• Detect and expose deepfake content to mitigate its negative impact on digital security and public
trust.
• Develop a classification system that differentiates between authentic (pristine) and manipulated
(deepfake) videos.
• Design a user-friendly interface that allows users to upload videos and receive real-time
authenticity assessments.
• Enhance computational efficiency to enable real-time deepfake detection with minimal process-
ing delay.
• Improve detection accuracy by leveraging diverse datasets and state-of-the-art algorithms for
Chapter 4
METHODOLOGY
The development of deepfake detection systems requires a systematic approach that involves problem
analysis, dataset processing, model training, and evaluation. This chapter details the methodologies
Solution Requirement: We began by analyzing the problem statement and assessing the feasibility
of developing an effective deepfake detection system. This involved an extensive literature review of
various academic papers (as discussed in Section 3.3) to understand existing approaches.
During dataset analysis, multiple training strategies were tested, including training models
exclusively on either fake or real videos. However, this introduced significant bias, leading to
inaccurate predictions. Extensive research and experimentation indicated that a balanced training
approach, incorporating both real and deepfake videos, reduced bias and variance, thereby improving
model accuracy.
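The balanced training approach described above can be sketched in a few lines. This is a minimal illustration, not the project's actual preprocessing code: the function name `balance_by_label`, the `(path, label)` tuple layout, the "REAL"/"FAKE" label strings, and the choice to downsample the larger class are all assumptions made for the example.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple

Sample = Tuple[str, Hashable]  # (video path, class label); layout is an assumption


def balance_by_label(samples: Sequence[Sample], seed: int = 0) -> List[Sample]:
    """Downsample every class to the size of the smallest class, then shuffle."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))
    if not by_label:
        return []

    # The smallest class decides how many videos each class contributes.
    per_class = min(len(items) for items in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible

    balanced: List[Sample] = []
    for label in sorted(by_label, key=str):
        balanced.extend(rng.sample(by_label[label], per_class))
    rng.shuffle(balanced)  # avoid long runs of a single class during training
    return balanced
```

Calling `balance_by_label` on a list with five "REAL" and two "FAKE" entries would return four entries, two per class. Downsampling is only one option; oversampling the smaller class or weighting the loss are common alternatives when discarding videos is too costly.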
• Cost of implementation
• Processing speed
• Level of expertise required
The key parameters for detecting deepfake videos were identified based on prior research and
empirical analysis:
• Inconsistent mustaches
• Lighting inconsistencies
• Pose misalignment
• Hairstyle irregularities
These parameters were leveraged to enhance model accuracy and improve deepfake detection
performance.
Based on our research and findings, we designed a deep learning-based system architecture optimized
for deepfake detection. The model consists of multiple layers, each fine-tuned to identify facial
The deepfake detection model consists of multiple layers optimized for video-based forgery detection.
ResNeXt CNN
We utilize the pre-trained ResNeXt model, a residual Convolutional Neural Network, specifically
resnext50_32x4d. This model has 50 layers and uses the 32x4d configuration, that is, a cardinality
of 32 parallel grouped-convolution paths with a bottleneck width of 4 channels per path.
ResNeXt is chosen due to its superior feature extraction capabilities, leveraging
Sequential Layer
A Sequential Layer is used to structure the feature vectors returned by ResNeXt in an ordered manner.
This ensures that the extracted feature maps are passed to the subsequent LSTM layer sequentially.
LSTM Layer
Long Short-Term Memory (LSTM) networks are employed to capture temporal dependencies
between frames. The extracted 2048-dimensional feature vectors serve as input to the LSTM layer.
• LSTM Layer: A single LSTM layer with 2048 latent dimensions and 2048 hidden units.
The LSTM layer processes video frames sequentially, analyzing temporal inconsistencies by
comparing frame differences over time. The model evaluates the frame at time t with previous frames
This hybrid CNN-LSTM architecture effectively detects inconsistencies within video sequences,
• Programming Language: Python 3, due to its extensive support for AI and deep learning
libraries.
• Framework: PyTorch, chosen for its ease of use, dynamic computation graph, and compatibil-
• Cloud Platform: Google Cloud Platform (GCP), utilized to train the model efficiently on a
large dataset.
The dataset was preprocessed by extracting frames from videos and resizing them to a uniform
resolution. Augmentation techniques such as flipping, rotation, and contrast adjustments were applied
To evaluate the performance of our deepfake detection model, we used a diverse dataset comprising
real and deepfake videos, including samples sourced from YouTube. We employed multiple
• Precision: Evaluates the proportion of true positive detections among all positive predictions.
• F1-Score: Provides a harmonic mean of precision and recall for a balanced evaluation.
• Confusion Matrix: Used to analyze false positives and false negatives, ensuring reliable
performance.
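The metrics listed above can all be derived from the four confusion-matrix counts. The following is a minimal sketch under stated assumptions: label 1 denotes a deepfake (the positive class) and label 0 a real video, and the function names `confusion_counts` and `classification_metrics` are illustrative rather than taken from the project.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN, treating label 1 (deepfake) as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    return counts


def _ratio(numerator: float, denominator: float) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    c = confusion_counts(y_true, y_pred)
    precision = _ratio(c["tp"], c["tp"] + c["fp"])
    recall = _ratio(c["tp"], c["tp"] + c["fn"])
    return {
        "accuracy": _ratio(c["tp"] + c["tn"], len(y_true)),
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": _ratio(2 * precision * recall, precision + recall),
    }
```

In this convention a false positive is a real video flagged as a deepfake, and a false negative is a deepfake that goes undetected, so precision penalises the former and recall the latter.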
The model was tested on an independent validation set to determine its generalizability.
The final outcome of our project is a trained deepfake detection model capable of analyzing videos
and determining their authenticity. Our solution provides an efficient and accurate mechanism to
Future improvements include real-time detection capabilities and integration into digital content
The proposed deepfake detection model leverages the ResNeXt Convolutional Neural Network
(CNN) for spatial feature extraction and the Long Short-Term Memory (LSTM) network for temporal
The model was trained and tested on a diverse dataset comprising real and deepfake videos. The
• The model achieved high accuracy on FaceForensics++ dataset, with a peak accuracy of
• Performance on the Celeb-DF dataset was slightly lower (93.97%), due to the dataset’s high-
quality deepfakes.
• When tested on the final custom dataset (6000 videos), the model maintained a balanced
accuracy of 89.34%.
5.1.3 Evaluation Metrics
To assess the model’s performance, the following evaluation metrics were employed:
The trained model was evaluated on multiple datasets, and the results are summarized in Table 5.1.
Chapter 6
CONCLUSION
6.1 Conclusion
Deepfake detection has become a crucial area of research due to the increasing threats posed by
using a hybrid architecture comprising ResNeXt CNN for spatial feature extraction and LSTM
for temporal analysis. The model was trained and evaluated on various datasets, including
FaceForensics++, Deepfake Detection Challenge (DFDC), and Celeb-DF, achieving high accuracy
Through extensive experimentation, we found that balanced dataset training significantly im-
proves the model’s ability to generalize across different types of deepfake manipulations. The
evaluation metrics, including accuracy, precision, recall, and F1-score, confirmed the effectiveness
While our system demonstrates strong performance, challenges remain in improving real-time
processing and handling highly sophisticated deepfakes with minimal detectable artifacts. This study
contributes to the ongoing development of robust deepfake detection systems and highlights the need
6.2 Future Enhancements
Despite the success of the proposed deepfake detection model, there is still room for improvement.
• Real-time Detection: Optimizing the model for real-time video analysis to detect deepfakes as
model that can run on edge devices such as smartphones and IoT systems.
• Generalization Across New Deepfake Techniques: Enhancing the model to detect emerging
• Multi-modal Analysis: Integrating audio and speech analysis alongside video detection to
• Integration with Social Media Platforms: Developing APIs and plugins that can be deployed
• Robust Adversarial Defense: Enhancing the system’s resistance to adversarial attacks that
• Dataset Expansion: Incorporating more diverse datasets, including high-resolution and low-
By addressing these areas, future iterations of the system can significantly enhance deepfake
6.3 Summary
This project presented a deep learning-based approach to deepfake detection, leveraging ResNeXt for
spatial feature extraction and LSTM for temporal sequence processing. The methodology involved
model demonstrated high accuracy in distinguishing between real and fake videos across multiple
datasets.
• A balanced dataset approach significantly improves generalization and reduces model bias.
detection performance.
• While the model is effective, real-time processing remains a challenge that requires further
optimization.
In conclusion, the study highlights the importance of AI-driven solutions in countering the rising
threat of deepfake technology. Future work will focus on enhancing real-time detection, improving
generalization across new deepfake methods, and integrating multi-modal verification techniques.
By continuing to refine deepfake detection methodologies, we can contribute to a more secure and
[1] Visual Deepfake Detection: Review of Techniques, Tools, Limitations, and Future Prospects
[2] A Comprehensive Review of DeepFake Detection Using Advanced Machine Learning and
Fusion Methods. Gourav Gupta, Kiran Raja, Mukesh Prasad, Manish Gupta,
[4] Deepfake Video Detection using Neural Networks. Abhijit Jadhav, Abhishek Patange, Jay
[5] G. Pang, B. Zhang, Z. Teng, Z. Qi and J. Fan, "MRE-Net: Multi-Rate Excitation Network
for Deepfake Video Detection," in IEEE Transactions on Circuits and Systems for Video
[6] Y. Patel et al., "An Improved Dense CNN Architecture for Deepfake Image Detection," in IEEE
"DFFMD: A Deepfake Face Mask Dataset for Infectious Disease Era With Deepfake Detection
[8] Q. Yin, W. Lu, B. Li and J. Huang, "Dynamic Difference Learning With Spatio–Temporal
Correlation for Deepfake Video Detection," in IEEE Transactions on Information Forensics and