Project Report Model
DEEP LEARNING
Submitted by
DINESHWAR C (810422243026)
ABINESH M (810422243002)
JEYARAJ K (810422243038)
AMRITH ALOISHIOUS A (810422243004)
Of
DHANALAKSHMI SRINIVASAN ENGINEERING COLLEGE, (AUTONOMOUS)
PERAMBALUR – 621 212
Submitted to the
Certified that this mini project report titled “DEEPFAKE DETECTION SYSTEM” is the bonafide work of
DINESHWAR C, ABINESH M, JEYARAJ K, and AMRITH ALOISHIOUS A, who carried out
the research under my supervision. Certified further, that to the best of my knowledge the work reported
herein does not form part of any other project report or dissertation on the basis of which a degree or
award was conferred on an earlier occasion on this or any other candidate.
SIGNATURE SIGNATURE
ACKNOWLEDGEMENT
It is with immense pleasure that I present my first venture in the field of real application
of computing in the form of project work. First, I am indebted to the Almighty for his choicest
blessings. I also extend my gratitude to Dhanalakshmi Srinivasan Engineering
College (Autonomous), Perambalur for the moral support and encouragement they have
rendered throughout the course. I express my sincere thanks to the Head of the Department
Dr.
K.V.M SHREE, M.E., Ph.D., for having provided us with all the necessary specifications.
We render our thanks to all the staff members and programmers of the department.
ABSTRACT
With the rise of manipulated media, deepfake content has become a serious concern in
digital security, social media, and public trust. These synthetic images and videos, generated
using AI, can be indistinguishable from real ones to the human eye, posing significant ethical
and security threats. Manual identification of deepfake images is not only challenging but also too
slow to counter the rapid spread of fake content online. This calls for an automated, reliable solution to
detect deepfakes accurately and efficiently. This project proposes the use of Convolutional
Neural Networks (CNNs), a powerful class of deep learning models, to automate the detection of
deepfake images. The system is trained on a labeled dataset of real and fake images, using a
custom-built CNN architecture consisting of convolutional, max-pooling, and dense layers. The
model takes preprocessed 128×128 RGB images as input and is trained using binary cross-
entropy loss with the Adam optimizer. The system achieves high classification accuracy and
effectively distinguishes between real and synthetic images. The proposed CNN-based deepfake
detection system provides a fast and scalable solution for identifying manipulated images. It can
serve as a valuable tool in digital forensics, content moderation, and media authentication.
TABLE OF CONTENTS
ABSTRACT iv
1 INTRODUCTION 1
1.1 INTRODUCTION 1
1.2 PURPOSE 1
1.4 MOTIVATION 2
1.5 OBJECTIVES 3
2 LITERATURE SURVEY 4
3 SYSTEM ANALYSIS 6
4 SYSTEM SPECIFICATIONS 8
5 SYSTEM IMPLEMENTATION 9
5.2.1 DATASET COLLECTION 9
6 SYSTEM DESIGN 12
7 SOFTWARE DESCRIPTION 17
7.1 OVERVIEW 17
7.2.5 INFERENCE MODULE 18
7.2.6 EXPLAINABILITY MODULE 18
7.2.7 STREAMLIT INTERFACE MODULE 18
7.2.8 UTILITY MODULE 18
7.3 FRAMEWORK OVERVIEW 19
7.4 FEATURES 19
8 SOFTWARE TESTING 20
9 CONCLUSION AND FUTURE ENHANCEMENT 28
9.1 CONCLUSION 28
APPENDICES 29
APPENDIX 2 - SCREENSHOTS 35
REFERENCES 42
LIST OF FIGURES
LIST OF ABBREVIATIONS
AI - ARTIFICIAL INTELLIGENCE
OS - OPERATING SYSTEM
UI - USER INTERFACE
CHAPTER 1
INTRODUCTION
1.1 INTRODUCTION
This project aims to build an effective deepfake detection system using deep learning
techniques. The proposed system analyzes facial and visual features in video frames and uses a
convolutional neural network (CNN) model to classify content as real or fake. This technology
has applications in digital forensics, social media moderation, and media verification.
1.2 PURPOSE
The purpose of this project is to design and implement a deep learning-based system
capable of accurately detecting deepfake videos and images. With the increasing availability of
tools that allow even non-experts to create highly realistic fake media, the integrity of digital
content has become a growing concern. This project aims to combat the misuse of such synthetic
content by developing a reliable and automated solution that can identify tampered visuals
through analysis of facial and visual inconsistencies. By leveraging convolutional neural
networks (CNNs) and other deep learning architectures, the system is expected to detect subtle
artifacts introduced
during the generation of deepfakes. Beyond technical implementation, the broader purpose of
this work is to support digital forensics, social media moderation, and public awareness efforts
by offering a scalable method for verifying the authenticity of digital media. This contributes to
safeguarding individuals, organizations, and societies from the potentially harmful consequences
of misinformation, impersonation, and fraud caused by deepfakes.
1.3 PROBLEM STATEMENT
Deepfakes present a serious threat to digital content authenticity, with potentially severe
implications for individuals, corporations, and governments. These AI-generated videos and
images can be manipulated to falsely represent people saying or doing things they never did,
leading to misinformation, defamation, identity theft, and political manipulation. The quality of
deepfakes has advanced to the point where they are nearly indistinguishable from genuine media,
making manual detection by human observers unreliable. Existing detection methods are often
limited in scope, lack real-time performance, and struggle to keep pace with rapidly evolving
deepfake generation techniques. Moreover, traditional forensic analysis techniques are time-
consuming and require expert intervention, making them unsuitable for large-scale content
verification. This project addresses these challenges by developing an intelligent, automated
system that uses deep learning to detect deepfakes with high accuracy. By training models on a
combination of real and fake media datasets, the system aims to identify subtle features that
distinguish authentic content from manipulated media, thus helping to restore trust in digital
communications.
1.4 MOTIVATION
The motivation behind this project stems from the growing misuse of AI-generated
content in malicious contexts such as political misinformation, fake news, financial scams, and
personal reputation damage. By developing a deepfake detection model, we can help combat
these threats and promote trust in digital content, while contributing to AI accountability and
responsible media practices.
1.5 OBJECTIVES
CHAPTER 2
LITERATURE SURVEY
With the rapid development of generative adversarial networks (GANs) and related
technologies, deepfakes have become one of the most challenging threats to digital media
authenticity. Consequently, researchers have actively explored various techniques for detecting
such manipulations using machine learning and deep learning approaches. This chapter reviews
significant existing work in the field of deepfake detection.
Nguyen et al. (2019) explored the use of capsule networks for deepfake detection,
highlighting their ability to preserve spatial hierarchies in facial structures. Their work showed
promise in scenarios where traditional CNNs struggled due to geometric transformations.
Li et al. (2020) proposed Face X-ray, a technique that identifies blending artifacts in
deepfakes by treating the problem as an image segmentation task. This method detects whether a
given image contains a combination of two facial regions—a common trait in face-swapping
deepfakes.
More recent approaches have leveraged transformer-based models and attention
mechanisms to improve detection accuracy. These models focus on capturing long-range
dependencies and facial expressions more effectively, which are often difficult to fake
consistently across frames.
CHAPTER 3
SYSTEM ANALYSIS
In the current landscape, deepfake detection systems face significant challenges due to
the rapid evolution of deepfake generation techniques. Many existing systems rely on manual
observation or traditional video forensics, which are time-consuming, inconsistent, and often
ineffective against high-quality deepfakes. While several machine learning approaches have been
introduced, many of them lack generalization capability and fail to maintain high accuracy across
various deepfake datasets.
The proposed system leverages deep learning, specifically convolutional neural networks
(CNNs), to detect deepfake content based on learned features rather than manually crafted ones.
This system is designed to automatically analyze image or video frames and identify subtle facial
distortions, pixel-level inconsistencies, or blending artifacts commonly found in manipulated
media. By training the model on a diverse dataset of real and fake videos/images, the system
aims to achieve high accuracy and robustness.
The architecture of the system may include a pretrained model (e.g., XceptionNet,
EfficientNet, or ResNet) fine-tuned on deepfake datasets such as FaceForensics++, DFDC, or
Celeb-DF. The model extracts deep visual features from each frame and classifies them as "real"
or "fake." The system can be extended to include temporal models (e.g., LSTM or 3D CNNs) to
analyze video frame sequences and improve detection based on motion inconsistencies.
This solution is scalable, fast, and capable of detecting both known and emerging
deepfake types. It can be deployed in content moderation systems, mobile apps, or browser
extensions to verify media authenticity in real-time.
CHAPTER 4
SYSTEM SPECIFICATIONS
RAM : 8-16 GB
Storage : 256 GB
Libraries : Scikit-learn, TensorFlow
CHAPTER 5
SYSTEM IMPLEMENTATION
The system is implemented through the following modules:
1. Dataset Collection
2. Data Preprocessing
3. Model Design and Training
4. Deepfake Detection
5. Result Evaluation and Visualization
5.2.1. Dataset Collection
The Dataset Collection module involves gathering a diverse and comprehensive dataset
that includes both real and deepfake images or video frames. Public datasets such as
FaceForensics++, DFDC, or Celeb-DF are often used for this purpose. These datasets provide a
wide range of manipulated content, ensuring variety in terms of facial expressions, lighting
conditions, backgrounds, and manipulation techniques. The goal is to collect enough data to train
a robust model capable of generalizing to different types of deepfake content.
5.2.2. Data Preprocessing
This module is responsible for preparing the collected data for training. It includes
extracting frames from video files, detecting and aligning faces using tools like MTCNN or
OpenCV, resizing images to the desired input size for the CNN (typically 224x224 pixels), and
normalizing pixel values. To enhance model generalization and reduce overfitting, data
augmentation techniques such as rotation, flipping, brightness adjustment, and noise addition are
also applied. Proper preprocessing ensures consistency and improves the efficiency of the model
training process.
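A minimal sketch of this preprocessing step is shown below, assuming the mtcnn package for face detection and OpenCV for resizing; the helper name, input path, and 224x224 target size are illustrative rather than the project's exact code.

import cv2
import numpy as np
from mtcnn import MTCNN

detector = MTCNN()

def preprocess_face(image_path, size=(224, 224)):
    # Read the image and convert OpenCV's BGR ordering to RGB for the detector
    image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    faces = detector.detect_faces(image)
    if not faces:
        return None  # handled later as "No face detected"
    x, y, w, h = faces[0]['box']
    face = image[max(y, 0):y + h, max(x, 0):x + w]
    # Resize to the CNN input size and scale pixel values to [0, 1]
    return cv2.resize(face, size).astype('float32') / 255.0

Augmentation such as random flips, rotations, and brightness changes can then be applied, for example with Keras preprocessing layers, before training.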
5.2.3. Model Design and Training
In this module, a Convolutional Neural Network (CNN) is designed and trained for
binary classification—determining whether an input is real or fake. This can involve building a
custom CNN architecture or fine-tuning a pre-trained model such as VGG16 or ResNet. The
model is trained using a binary crossentropy loss function and an optimizer like Adam. The
dataset is split into training, validation, and test sets to monitor the model's performance and
prevent overfitting. Training is carried out over multiple epochs, and key metrics such as training
accuracy and loss are recorded.
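A minimal sketch of the fine-tuning option mentioned above is given below; it assumes Keras' bundled VGG16 weights, a 224x224 input, and placeholder train_ds/val_ds datasets, not the project's exact configuration.

from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Frozen VGG16 feature extractor with a small binary classification head
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation='relu'),
    layers.Dense(1, activation='sigmoid'),   # real (0) vs fake (1)
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# train_ds and val_ds are placeholder tf.data datasets of (image, label) batches
# history = model.fit(train_ds, validation_data=val_ds, epochs=10)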
5.2.4. Deepfake Detection
Once the model is trained, it is used in this module to classify new or unseen media. The
input image or video frame undergoes the same preprocessing steps and is then passed through
the CNN model to predict the likelihood of it being real or fake. Based on the output probability,
the system labels the input accordingly. This module represents the core functionality of the
system— real-time or batch detection of deepfakes using the trained model.
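A minimal inference sketch is shown below; it reuses the preprocess_face helper assumed earlier and the deepfake_model.h5 file name used in the appendix code, and the mapping of 1 to "fake" is an assumption that must match the training labels.

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("deepfake_model.h5")

def classify(image_path, threshold=0.5):
    face = preprocess_face(image_path)   # normalized RGB face, or None if no face found
    if face is None:
        return "No face detected", None
    prob = float(model.predict(np.expand_dims(face, axis=0))[0][0])
    label = "Fake" if prob >= threshold else "Real"
    return label, prob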
5.2.5. Result Evaluation and Visualization
The final module of the system is designed to thoroughly evaluate model performance and
present the results through intuitive and insightful visualizations. It begins with the confusion
matrix, which clearly outlines the distribution of true positives, true negatives, false positives,
and false negatives, helping identify the model’s strengths and the nature of its
misclassifications. To further assess the quality of the classification, the module includes a
Receiver Operating Characteristic (ROC) curve, which demonstrates the trade-off between
sensitivity and specificity across different threshold values, providing a visual guide for selecting
an optimal decision boundary.
In addition, accuracy and loss graphs are plotted to monitor the model’s training process
over time, comparing training and validation metrics to detect issues like overfitting or
underfitting.
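These evaluation outputs can be produced with scikit-learn and Matplotlib, as in the following sketch, which uses placeholder labels and probabilities in place of the actual test-set results.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, roc_curve, auc

# Placeholder ground-truth labels and predicted probabilities; in the real
# pipeline these come from the test split and model.predict().
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.4, 0.8, 0.7, 0.9, 0.2, 0.6, 0.3])
y_pred = (y_prob >= 0.5).astype(int)

print(confusion_matrix(y_true, y_pred))   # rows: actual class, columns: predicted class

fpr, tpr, _ = roc_curve(y_true, y_prob)
plt.plot(fpr, tpr, label=f"AUC = {auc(fpr, tpr):.3f}")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()

# The history object returned by model.fit() can be plotted in the same way to
# compare training and validation accuracy across epochs.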
As an added feature, an optional interactive interface is available, allowing users to
upload video files and observe real-time detection results. This interface displays visual outputs
such as bounding boxes and labels on each frame, offering a hands-on, user-friendly way to test
and explore the system’s functionality. Collectively, these tools not only deliver a complete
evaluation of the model’s performance but also enhance its interpretability and accessibility for
both technical and non-technical users.
CHAPTER 6
SYSTEM DESIGN
6.2 USE CASE DIAGRAM
6.3 CLASS DIAGRAM
6.4 SEQUENCE DIAGRAM
6.5 ACTIVITY DIAGRAM
CHAPTER 7
SOFTWARE DESCRIPTION
7.1 OVERVIEW
The project leverages Python due to its robust ecosystem of AI and image processing
libraries. Using datasets containing both genuine and deepfake videos (e.g., FaceForensics++,
Celeb-DF), the system is trained to distinguish real from fake content. It provides a seamless
pipeline—from media upload to prediction output—through modular components.
The Data Ingestion Module handles the loading of datasets and the extraction of frames
from video files. It organizes data into training, validation, and test sets while managing
associated labels, enabling the model to learn from real-world examples of deepfakes.
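As an illustration of the frame-extraction step, the following sketch uses OpenCV's VideoCapture; the function name and the sampling interval are assumptions for this report.

import cv2

def extract_frames(video_path, every_n=10):
    # Keep every n-th frame of the video to limit redundancy between frames
    frames = []
    capture = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % every_n == 0:
            frames.append(frame)
        index += 1
    capture.release()
    return frames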
The Preprocessing Module is responsible for preparing the input data. It detects and crops
faces from images or frames using face detection libraries like MTCNN or dlib, then resizes and
normalizes them. This module also performs data augmentation to enhance model generalization.
The CNN Model Module defines the structure of the neural network used for
classification. It allows for the use of custom or pre-trained models, such as Xception or
EfficientNet, and
includes functionality for compiling, training, saving, and loading the model architecture and
weights.
The Training & Validation Module manages the model training loop, monitors
performance metrics like accuracy and F1 score, and applies callbacks such as early stopping.
This module ensures the model learns effectively while avoiding overfitting.
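A brief sketch of how such callbacks could be configured in Keras is shown below; the patience value, checkpoint file name, and the commented-out fit call are illustrative.

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # Stop when validation loss stops improving and restore the best weights
    EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True),
    # Keep only the best model seen so far on disk
    ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True),
]
# history = model.fit(train_ds, validation_data=val_ds, epochs=30, callbacks=callbacks)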
The Inference Module is used to predict whether new input media is real or fake. It
processes the input image or video, extracts faces, and applies the trained CNN model. It returns
a classification label with a confidence score.
7.3 FRAMEWORK OVERVIEW
TensorFlow / Keras
OpenCV
7.4 FEATURES
CHAPTER 8
SOFTWARE TESTING
Software testing in the context of deepfake detection is critical to ensure that the system
accurately and reliably differentiates between real and manipulated media. The aim is to identify
defects in the model logic, data preprocessing, and the user interface, and to validate that the
deep learning model generalizes well across unseen data. This chapter details the testing
approaches applied across all levels of the system to validate functionality, performance, and
usability.
8.2.1. Valid Real Image Input:
This test case involves providing a high-quality image with a clearly visible human face. The input
image should be in a supported format like .jpg or .png. The purpose is to verify that the model
accurately identifies real faces. The expected result is a "Real" prediction with high confidence,
typically above 90%. This confirms the model performs well with ideal input conditions.
8.2.2. Deepfake Image Input:
In this case, the model is tested using a confirmed deepfake image. The goal is to ensure the CNN
correctly classifies fake content. The model should return a prediction of "Fake" with high
confidence, validating that it has learned to distinguish synthetic facial features and manipulation
artifacts effectively.
8.2.3. No Face in the Image:
This test evaluates the system’s response when an image without a human face is submitted. For
example, images of landscapes, objects, or animals can be used. The model should return an
error message like "No face detected," demonstrating that the face detection preprocessing step is
functioning correctly and that unnecessary processing is avoided.
8.2.4. Low Confidence Output:
To test how the model handles uncertainty, a low-quality, blurry, or partially obscured face
image is input. The model should still attempt a prediction but may return an output flagged as "Low
confidence prediction." This helps the user understand that the model is uncertain and provides
guidance for corrective action.
8.2.5. High-Resolution Input:
A very high-resolution image or a 4K video is used as input to assess how the model handles
large data. The goal is to ensure the system does not crash due to memory overload or processing
timeouts. The output should still be correct, and the performance should remain stable,
confirming system scalability.
8.2.6. Adversarial Input:
This advanced test uses adversarial examples—images that are subtly modified to confuse the
model, often with added noise or slight distortions. The goal is to check if the model is robust
against minor perturbations. Ideally, the system should still classify the input as "Fake" if it's
indeed a deepfake, showing resilience against manipulation.
8.3.1. Unit Testing
For the deepfake classification system, unit tests are written for core functions such as frame extraction,
face detection, image preprocessing, and model prediction. Each of these components is
tested using Python’s unittest framework and assertions, which check whether the actual
output matches the expected result for various test cases. For example, the face detection
function may be tested by passing in an image with a known face and verifying that it
returns the correct bounding box. Unit testing enables early detection of bugs, simplifies
debugging, and helps maintain code quality during ongoing development.
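The following is a hypothetical unit test in this style; preprocess_face and the sample image path are assumed names used only for illustration.

import unittest
import numpy as np

class TestPreprocessing(unittest.TestCase):
    def test_face_is_resized_and_normalized(self):
        # preprocess_face is the assumed preprocessing helper under test
        face = preprocess_face("tests/sample_real_face.jpg")
        self.assertIsNotNone(face)
        self.assertEqual(face.shape, (224, 224, 3))
        self.assertTrue(np.all(face >= 0.0) and np.all(face <= 1.0))

if __name__ == "__main__":
    unittest.main()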
8.3.2. Integration Testing
Once individual components have been tested, integration testing is conducted to ensure
that these modules interact correctly when combined. This type of testing examines the
flow of data across modules—specifically from the frame extractor to the preprocessing
unit, then to the deep learning model, and finally to the output generation system. The
main goal is to verify that each component correctly passes formatted and expected data
to the next. For example, integration testing checks whether the preprocessed face images
output by one module are properly formatted and compatible with the input expected by
the classification model. This helps detect interface mismatches, improper data handling,
and communication failures between modules.
8.3.3. Functional Testing
Functional testing evaluates whether the overall system behaves as expected from a
user’s perspective. This includes testing the complete pipeline starting from the user
uploading an image or video, followed by system processing, classification (real or fake),
and finally the display of results on the interface. Test scenarios include valid uploads,
invalid file types, corrupted media, and edge cases like extremely small or blurry faces.
The goal is to ensure the system meets functional requirements, such as successful file
uploads, accurate deepfake detection, and timely feedback. This testing is vital in
validating the end-to-end functionality of the system in realistic usage scenarios.
8.3.4. Regression Testing
As the system evolves, with new features added or existing algorithms improved,
regression testing ensures that these changes do not unintentionally disrupt previously
working functionality. For example, if the face detection algorithm is enhanced or the
model is retrained for better accuracy, regression tests are used to retest all critical
features—such as correct image classification and proper result rendering—that were
already working in earlier versions. Automated test scripts are often used for this purpose
to quickly verify that nothing has been broken in the process of updates.
8.3.5. Performance Testing
Tools and scripts are used to simulate concurrent uploads and track the system’s ability to
maintain consistent response times, handle memory efficiently, and recover from
overload situations. A well-performing system ensures users experience minimal delays
even during peak usage.
8.3.6. Usability Testing
Usability testing focuses on the design and user interface of the application, ensuring that
it is intuitive and accessible for users with varying levels of technical expertise. Test
participants are asked to perform common tasks such as uploading files, interpreting
results, and troubleshooting errors. During testing, evaluators look for signs of confusion,
difficulty, or hesitation. Elements such as clear instructions, helpful tooltips, informative
error messages, and easy navigation are essential. For instance, if a user uploads an
unsupported file format, the system should provide a clear message indicating the
accepted formats. Based on usability feedback, the interface is adjusted to ensure a
smooth and user-friendly experience.
8.3.7. Black Box Testing
Black box testing treats the system as a "black box" where the internal code and
architecture are not considered. Instead, testing focuses purely on inputs and outputs.
Testers provide a variety of input media files and observe the output (real or fake
classification, error messages, etc.) to ensure correctness. They also evaluate how the
system responds to unexpected or incorrect input, such as uploading text files or
extremely large videos. The goal is to ensure the application behaves correctly and
predictably from a user's perspective, regardless of the underlying implementation.
8.3.8. White Box Testing
In contrast to black box testing, white box testing involves a detailed examination of the
internal workings of the system. This includes checking the structure of the code, data
transformations, model layer outputs, and normalization processes. For example, testers
may verify that pixel values are normalized to the correct range before being fed into the
model, or that the intermediate outputs of convolutional layers fall within expected
distributions. This kind of testing is particularly useful for debugging and optimizing
model performance and verifying that the architecture and data handling conform to
design specifications.
8.3.9. Output Testing
Output testing validates the accuracy and clarity of the final system outputs. The
classification results (i.e., "Real" or "Fake") are compared against a labeled test dataset to
assess prediction accuracy. Additionally, the visual presentation of results is examined—
for instance, checking whether the predicted label is displayed near the detected face
along
with a confidence score overlay. The correctness of the overlay, font clarity, color-coding
(e.g., red for fake, green for real), and alignment with detected features are tested to
ensure users can easily understand the results.
8.3.10. User Acceptance Testing
User Acceptance Testing is the final phase where the system is tested by actual end users
— typically a representative group of the intended audience. These users interact with the
system by uploading various media files and interpreting the detection results. Their
feedback is collected on several parameters, including the clarity of classification results,
usefulness of confidence levels, and ease of navigation and interaction. Based on this
feedback, minor enhancements are often implemented, such as more descriptive file
format alerts, better result styling, and improved layout responsiveness. UAT ensures that
the system is ready for deployment and meets real-world user expectations.
In any machine learning or AI-based system, testing is a critical phase that ensures
reliability, correctness, and performance under various conditions. In this project, several tools
have been employed to support both unit-level and system-level testing. The Python modules
unittest and pytest serve as automated unit testing frameworks. These tools allow the developer
to create test cases for individual components such as data loading, preprocessing, face detection,
and model prediction. They help maintain the integrity of the codebase by ensuring that newly
added functions do not break existing features. pytest in particular provides a more scalable and
user-friendly syntax and supports advanced features like fixtures and parameterized testing,
making it ideal for complex deep learning projects.
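A small hypothetical example of pytest's parameterized testing is shown below; the is_supported_format helper is an assumed name, not part of the project code.

import pytest

@pytest.mark.parametrize("filename,expected", [
    ("clip.mp4", True),
    ("photo.jpg", True),
    ("notes.txt", False),
    ("archive.zip", False),
])
def test_supported_formats(filename, expected):
    # Unsupported extensions should be rejected before any model inference runs
    assert is_supported_format(filename) == expected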
In addition, TensorBoard’s interactive graphs and histograms help in understanding how weights and
biases evolve during the training process.
Evaluating the performance of a deep learning model goes beyond simply reporting
accuracy. For a binary classification task such as deepfake detection, it is vital to use a set of
robust evaluation metrics that account for different types of prediction errors. Accuracy, while
commonly used, only indicates the overall correctness of predictions. It can be misleading in
imbalanced datasets where one class may dominate. For instance, if most videos are real, a
model predicting everything as real might still appear accurate.
To address this limitation, Precision is used to measure the number of correctly identified
fake instances divided by the total instances the model predicted as fake. High precision
indicates that when the model claims something is fake, it is likely correct—important in
minimizing false accusations of authenticity. Conversely, Recall focuses on the model’s ability
to detect actual fake content. It is calculated as the number of correctly predicted fake instances
divided by the total number of actual fake samples. High recall ensures the model doesn't miss
potential threats in the form of deepfakes.
The F1-Score serves as a balanced metric that considers both precision and recall. It is
especially useful when dealing with uneven class distributions or when both false positives and
false negatives carry significant consequences. An ideal deepfake detection model should aim for
a high F1-score to maintain balance between caution and coverage. Finally, the ROC-AUC
(Receiver Operating Characteristic - Area Under the Curve) metric is employed to evaluate the
trade-off between sensitivity (true positive rate) and specificity (false positive rate) across
various threshold settings.
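These metrics can be computed with scikit-learn, as in the following sketch with placeholder labels and scores standing in for the actual test-set outputs.

import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])          # placeholder ground truth
y_prob = np.array([0.1, 0.4, 0.8, 0.7, 0.9, 0.2, 0.6, 0.3])
y_pred = (y_prob >= 0.5).astype(int)

print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall:", recall_score(y_true, y_pred))         # TP / (TP + FN)
print("F1-score:", f1_score(y_true, y_pred))           # harmonic mean of precision and recall
print("ROC-AUC:", roc_auc_score(y_true, y_prob))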
Building a user-facing AI system demands that it not only performs accurately but also handles
unexpected situations gracefully. The error handling module in the deepfake detection system is
designed to provide informative and user-friendly responses to a variety of potential issues,
ensuring robustness and enhancing user experience.
One common scenario is when a user uploads an image or video where no recognizable
human face is present. In such cases, the system returns the message: “No human face detected.”
This prevents the model from processing irrelevant or non-human content, which could lead to
misleading outputs. This check is implemented early in the pipeline using face detection
algorithms like MTCNN or Haar cascades.
Another error addressed is invalid file formats. The system is designed to work with
specific media formats (e.g., .jpg, .png, .mp4), and when an unsupported file is uploaded, it
prompts the message: “Unsupported file type.” This safeguards the application from crashing due
to unrecognized data structures and guides the user toward acceptable input types.
If the system encounters a failure in loading the trained model—either due to file
corruption, incorrect path, or missing files—it raises an alert with the message: “Model loading
error.” This is a critical failure point, and the error message informs the user or developer to
recheck the deployment files.
Lastly, the system incorporates a confidence threshold mechanism. If the model makes a
prediction but with a confidence level below 60%, it triggers a warning: “Low confidence. Re-
upload suggested.” This acts as a safeguard against unreliable outputs and encourages users to
submit better-quality inputs, such as clearer images or videos with good lighting and frontal
faces. Collectively, these error-handling features make the system more reliable, user-oriented,
and capable of functioning well in real-world scenarios.
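A minimal sketch of how these checks might be ordered is given below; the preprocess_face helper is carried over from the earlier sketch, and the message strings mirror the ones quoted in this section.

import tensorflow as tf

SUPPORTED_TYPES = (".jpg", ".jpeg", ".png")   # video formats would be accepted after frame extraction

def analyse(path):
    if not path.lower().endswith(SUPPORTED_TYPES):
        return "Unsupported file type."
    try:
        model = tf.keras.models.load_model("deepfake_model.h5")
    except (OSError, ValueError):
        return "Model loading error."
    face = preprocess_face(path)          # assumed helper from the preprocessing sketch
    if face is None:
        return "No human face detected."
    prob = float(model.predict(face[None, ...])[0][0])
    if max(prob, 1 - prob) < 0.6:         # below the 60% confidence threshold
        return "Low confidence. Re-upload suggested."
    return "Fake" if prob >= 0.5 else "Real"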
CHAPTER 9
CONCLUSION AND FUTURE ENHANCEMENT
9.1 CONCLUSION
The increasing prevalence of deepfake media poses a significant threat to digital content
authenticity, personal identity, and information security. This project presents an effective
solution to detect deepfake videos and images using deep learning models. By leveraging
convolutional neural networks (CNNs), the system can learn and extract complex visual features
from input media to distinguish between real and fake content with high accuracy.
Throughout the project, various aspects of deepfake generation and detection were
explored. The proposed system was trained and tested on benchmark datasets and demonstrated
promising results in identifying synthetic facial manipulations. Unlike traditional manual or rule-
based methods, this system relies on learned features, making it more scalable, adaptive, and
suitable for real-world applications.
This work contributes to the broader field of digital forensics and can assist platforms,
law enforcement, and the general public in countering misinformation, fraud, and media
tampering. The proposed model successfully meets the core objectives of detecting manipulated
media and improving awareness regarding the risks of deepfake content.
9.2 FUTURE ENHANCEMENT
While the proposed deepfake detection system demonstrates effective performance, there
are opportunities for further development and improvement in future work. Some key areas of
enhancement include:
Incorporating Temporal Features: Current models often analyze frames individually. Adding
temporal models like 3D CNNs or LSTMs will enable better video-level analysis by capturing
motion-based inconsistencies.
Multi-modal Detection: Integrating both audio and video features will provide more robust
detection, particularly in detecting deepfakes that also manipulate voice and speech patterns.
Real-time Detection Capabilities: Optimization of the system for real-time processing can
allow for implementation in web applications, browser extensions, or mobile platforms for on-
the-fly deepfake analysis.
User Interface Development: Building a simple and interactive front-end interface would allow
non-technical users to upload and check media content for authenticity.
This project lays a solid foundation for future advancements in automated deepfake detection
and has the potential to evolve into a full-fledged system that plays a key role in combating the
spread of synthetic misinformation.
APPENDICES
APPENDIX 1
SOURCE CODE
Training.py
import os
import cv2
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

IMG_SIZE = 128  # the model takes 128x128 RGB images as input

def build_model():
    # Custom CNN: stacked convolution/pooling blocks followed by dense layers (filter counts are representative)
    model = keras.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(IMG_SIZE, IMG_SIZE, 3)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dense(1, activation='sigmoid')  # 0 = real, 1 = fake
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

def load_data(data_dir):
    # Expects data_dir/real and data_dir/fake sub-folders of face images
    images, labels = [], []
    def process_folder(folder, label):
        for name in os.listdir(os.path.join(data_dir, folder)):
            img_path = os.path.join(data_dir, folder, name)
            img = cv2.imread(img_path)
            if img is None:
                continue
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            img = cv2.resize(img, (IMG_SIZE, IMG_SIZE)) / 255.0
            images.append(img)
            labels.append(label)
    process_folder("real", 0)
    process_folder("fake", 1)
    return np.array(images, dtype='float32'), np.array(labels, dtype='float32')

# Train Model
X, y = load_data("dataset")
model = build_model()
model.fit(X, y, validation_split=0.2, epochs=10, batch_size=32)
model.save("deepfake_model.h5")
APP.PY
import streamlit as st
import tensorflow as tf
import numpy as np
import cv2
from PIL import Image

# Load the trained model (raw string keeps the Windows path intact)
@st.cache_resource
def load_model():
    return tf.keras.models.load_model(r"D:\jp\deepfake_model.h5")

# Preprocess image: resize to the model's 128x128 input and scale to [0, 1]
def preprocess_image(image):
    image = np.array(image.convert("RGB"))
    image = cv2.resize(image, (128, 128)) / 255.0
    return np.expand_dims(image.astype("float32"), axis=0)

# Prediction function
def predict(model, image):
    processed_image = preprocess_image(image)
    prediction = model.predict(processed_image)[0][0]
    return float(prediction)

# Streamlit UI
st.title("Deepfake Detection System")
model = load_model()

# Image Upload
uploaded_file = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
if uploaded_file is not None:
    image = Image.open(uploaded_file)
    st.image(image, caption="Uploaded image", use_column_width=True)
    score = predict(model, image)
    if score >= 0.5:
        st.error(f"Prediction: Fake (confidence {score:.2f})")  # label 1 = fake, as in Training.py
    else:
        st.success(f"Prediction: Real (confidence {1 - score:.2f})")
APPENDIX 2
SCREENSHOTS
INITIAL WEBPAGE
DEEPFAKE DETECTION
1. REAL IMAGES
Figure 10.3 DeepFake Detection prediction 2: Real
Figure 10.4 DeepFake Detection prediction 3: Real
2. FAKE IMAGES
Figure 10.6 DeepFake Detection prediction 5: Deepfake
REFERENCES
1. Rossler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., & Nießner, M. (2019).
FaceForensics++: Learning to detect manipulated facial images.
2. Afchar, D., Nozick, V., Yamagishi, J., & Echizen, I. (2018). MesoNet: a Compact Facial
Video Forgery Detection Network. In Proceedings of the IEEE International Workshop
on Information Forensics and Security (WIFS), 1–7.
3. Chollet, F. (2017). Xception: Deep Learning with Depthwise Separable Convolutions. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), 1251–1258.
4. Nguyen, H. H., Yamagishi, J., & Echizen, I. (2019). Capsule-forensics: Using capsule
networks to detect forged images and videos. In ICASSP 2019 - IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), 2307–2311.
5. Li, Y., Chang, M. C., & Lyu, S. (2020). Face X-ray for more general face forgery
detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition (CVPR), 5001–5010.
6. Dolhansky, B., Bitton, J., Pflaum, B., Lu, J., Howes, R., Wang, M., & Ferrer, C. C.
(2020). The Deepfake Detection Challenge (DFDC) Dataset. arXiv preprint
arXiv:2006.07397.
https://www.kaggle.com/c/deepfake-detection-challenge
7. Li, Y., & Lyu, S. (2019). Exposing DeepFake Videos By Detecting Face
Warping Artifacts. In Proceedings of the IEEE Conference (CVPRW).
8. Simonyan, K., & Zisserman, A. (2014). Very Deep Convolutional Networks for Large-
Scale Image Recognition. arXiv preprint arXiv:1409.1556.
9. Abavisani, M., & Patel, V. M. (2020). Exploring the Space of Deepfake Detection. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition Workshops (CVPRW), 1–8.
10. Kingma, D. P., & Ba, J. (2015). Adam: A Method for Stochastic Optimization. In
Proceedings of the International Conference on Learning Representations (ICLR).
https://arxiv.org/abs/1412.6980