Detection of Digital Media Manipulation Using Hybrid Ensembling Model Report
This sheet must be filled in (each box ticked to show that the condition has been met). It must be
signed and dated along with your student registration number and included with all assignments
you submit – work will not be marked unless this is done.
To be completed by the student for all assessments
I / We hereby certify that this assessment complies with the University's Rules and Regulations
relating to Academic misconduct and plagiarism**, as listed in the University Website,
Regulations, and the Education Committee guidelines.
I / We confirm that all the work contained in this assessment is my / our own except where
indicated, and that I / We have met the following conditions:
I understand that any false claim for this work will be penalized in accordance with the
University policies and regulations.
DECLARATION:
I am aware of and understand the University's policy on Academic misconduct and plagiarism and I certify
that this assessment is my / our own work, except where indicated by referencing, and that I have followed
the good academic practices noted above.
SIGNATURE SIGNATURE
EXAMINER 1 EXAMINER 2
ACKNOWLEDGEMENTS
We express our humble gratitude to Dr. C. Muthamizhchelvan, Vice-Chancellor,
SRM Institute of Science and Technology, for the facilities extended for the project work
and his continued support. We extend our sincere thanks to Dr. T. V. Gopal, Dean-CET,
SRM Institute of Science and Technology, for his invaluable support. We wish to thank
Dr. Revathi Venkataraman, Professor and Chairperson, School of Computing, SRM
Institute of Science and Technology, for her support throughout the project work.
We extend our sincere thanks to Dr. M. Pushpalatha, Professor and Associate
Chairperson, School of Computing, and Dr. C. Lakshmi, Professor and Associate
Chairperson, School of Computing, SRM Institute of Science and Technology, for their
invaluable support. We are incredibly grateful to our Head of the Department, Dr. R.
Annie Uthra, Professor, Department of Computational Intelligence, SRM Institute of
Science and Technology, for her suggestions and encouragement at all the stages of the
project work.
We want to convey our thanks to our Project Coordinators, Panel Head, and Panel
Members, Department of Computational Intelligence, SRM Institute of Science and
Technology, for their inputs during the project reviews and support. We register our
immeasurable thanks to our Faculty Advisor Dr. S. Amudha, Department of
Computational Intelligence, SRM Institute of Science and Technology, for leading and
helping us to complete our course.
Our inexpressible respect and thanks to our guide, Dr. M. Meenakshi, Department of
Computational Intelligence, SRM Institute of Science and Technology, for providing us
with an opportunity to pursue our project under her mentorship. She provided us with
the freedom and support to explore the research topics of our interest. Her passion for
solving problems and making a difference in the world has always been inspiring.
We sincerely thank all the staff and students of the Department of Computational
Intelligence, School of Computing, SRM Institute of Science and Technology, for their
help during our project. Finally, we would like to thank our parents, family members,
and friends for their unconditional love, constant support, and encouragement.
ABSTRACT
This project addresses the growing problem of digital media manipulation, targeting
in particular the detection of manipulated media: synthetically modified videos and
photographs that threaten the reliability of visual truth in the digital world. Such
content can be used to distribute fake news, impersonate celebrities, or commit online
fraud, creating a strong requirement for a robust detection mechanism.
TABLE OF CONTENTS
Chapter No. Title Page No.
ABSTRACT v
TABLE OF CONTENTS vi
LIST OF FIGURES viii
LIST OF TABLES ix
ABBREVIATIONS x
1 INTRODUCTION 1
1.1 Introduction to Manipulated Media Detection 1
1.2 Motivation 2
1.3 Sustainable Development Goal of the Project 2
2 LITERATURE SURVEY 4
2.1 Overview of the research area 4
2.2 Existing Models and Frameworks 4
2.3 Limitations of Existing Systems 5
2.4 Research Objectives 6
2.5 Product Backlog 6
2.6 Plan of Action 7
3 SYSTEM ARCHITECTURE AND DESIGN 9
4 SPRINT PLANNING AND METHODOLOGY 12
4.1 SPRINT I 12
4.2 SPRINT II 14
4.3 SPRINT III 16
4.4 Methodology 17
5 RESULTS AND DISCUSSION 19
5.1 Distribution of Data 19
5.2 Evaluation Metrics of Each Model 20
5.2.1 Evaluation using Xception for manipulated media detection 20
5.2.2 Evaluation using InceptionResnetV2 for manipulated media detection 22
5.2.3 Evaluation using VGG19 for manipulated media detection 24
5.2.4 Evaluation using NasNetMobile for manipulated media detection 26
5.2.5 Evaluation using Proposed Hybrid Ensembling Model (Xception+NasNetMobile) for manipulated media detection 28
5.3 Comparative Analysis of Each Model 30
6 CONCLUSION AND FUTURE ENHANCEMENTS 31
REFERENCES 32
APPENDIX
A CODING AND IMPLEMENTATION 33
B RESEARCH PAPER 53
C PAPER SUBMISSION FORM 54
D PLAGIARISM REPORT 55
LIST OF FIGURES
Figure No. Title Page No.
3.1 Proposed System Architecture 9
3.2 Dataflow Diagram 10
5.1 Data Distribution 19
5.2 Evaluation Graphs for Xception Model 20
5.3 Confusion Matrix for Xception Model 21
5.4 Evaluation Graphs for InceptionResnetV2 Model 22
LIST OF TABLES
ABBREVIATIONS
Abbreviation Full Form
DL Deep Learning
CE Categorical Cross-Entropy
TL Transfer Learning
TF TensorFlow
UI User Interface
CHAPTER 1
INTRODUCTION
1.1 Introduction to Manipulated Media Detection
This project, Manipulated Digital Image and Video Detection Using a Hybrid Ensembling
Approach, focuses on the detection of manipulated videos and images. The objective is to build
an intelligent system able to classify photographs and video frames as genuine or manipulated.
Using deep learning techniques, we train models to identify minute alterations and concealed
patterns that remain beyond the reach of the human eye.
The project implemented and tested several deep learning architectures: InceptionResNetV2,
VGG19, CNN, Xception, and NASNet Mobile. But single models often struggled to handle
the complex variations inherent in manipulated visual content. Hence, a hybrid ensemble of
Xception and NASNetMobile was proposed. The ensemble performed best among the tested
models, achieving an accuracy of 97.02%.
With editing tools and AI manipulation techniques now widely available, it has become
imperative to verify the authenticity of videos and images. This project seeks to help safeguard
digital spaces by providing a trustworthy and efficient means of detecting manipulated visual
content, thereby fostering trust and safety across platforms.
1.2 Motivation
The driving motivation of this project springs from the ever-growing danger that manipulated
videos and images pose to individuals, communities, and institutions. In a world where visual
content shapes opinions and drives decisions, whoever controls what people see holds great
influence over how they construct their realities. The effects can be far-reaching: whether
through a fake political speech, a non-existent news event, or a social media rumor built on
faked imagery, manipulated visuals can mislead millions and cause havoc.
This scenario motivated the development of a system that could detect manipulated videos and
pictures before they spread across the internet, a task that demanded immediate action.
Existing detection techniques kept pace with fairly simple manipulations, but as manipulation
methods grew more sophisticated, typical detection approaches became obsolete. The aim was
therefore to design a deep-learning-based solution, a hybrid ensemble combining multiple
detection models, capable of catching even the tiniest traces of manipulation. Technology
should serve the truth, rebuild trust in digital media, and protect people from the perils of
misinformation and visual deception.
This project grew from the conviction that technology is meant to solve problems, not create
more of them. Manipulated media is itself a product of technological advancement, and the
same advancement can be used to respond to it. By building a better detection system, we can
work toward restoring trust online and ensuring that media remains a force for good rather
than a tool for deception.
detrimental consequences of false narratives. Trustworthy digital media is indispensable for
social peace, public safety, and informed choice, which are all key pillars of strong institutions.
This project also supports Goal 9, "Industry, Innovation and Infrastructure", by encouraging
the application of advanced technology for the betterment of society. Through its hybrid
ensembling approach built on deep learning models, the project shows how innovation can
address modern problems in cybersecurity and digital ethics. By adding strong protection for
digital infrastructure against abusive manipulation, it also encourages responsible innovation
while protecting the industries most dependent on authentic media: journalism, legal systems,
education, and entertainment. With digital communication at the center of almost every sector
today, mechanisms that preserve the integrity of media are not merely a technological
achievement but a step towards sustainable, resilient digital infrastructure.
CHAPTER 2
LITERATURE SURVEY
simple manipulation traces. Modeling internal coherence refined the detection scheme to be
robust to realistic manipulations that would foil traditional artifact-based detectors. The work
thus opened the way for designing intelligent systems that probe deeper than surface heuristics,
lending inspiration to this project's use of ensemble approaches to capture inconsistency in
manipulated media.
Another major contribution to the area is that of S. Fernandes et al. (2020), who introduced an
attribution-based confidence metric for the detection of manipulated videos. Their approach is
not limited to binary classification (real or fake) but includes a measure of how confident the
model is in a decision. By assessing attribution maps, the zones within the image that had the
greatest influence on the model's output, they can flag cases where the model is not confident
in its decision, thereby minimizing false positives and false negatives. This approach is very
practical in settings where manipulated media detection systems must offer high reliability as
well as explainability. It concurs with the motivation behind adopting a hybrid ensembling
technique in this project, which seeks not only high accuracy but also greater stability and
trustworthiness of the model.
2. Simple deep learning algorithms such as CNN, RNN, and LSTM lack the flexibility
and robustness to handle complex deepfake alterations, which limits their detection
capability.
3. Existing systems do not use ensemble models that combine several algorithms to
improve accuracy and robustness in deepfake detection, resulting in overall
underperformance compared to advanced techniques.
4. Classic detection methods leave existing systems relatively static against emerging
manipulation trends, requiring timely revision and rework to keep pace with new
techniques.
5. Existing systems build on standard heuristic rules, which do not capture the most complex
patterns and offer limited proficiency in detecting advanced deepfakes created using new deep
learning methods.
2.4 Research Objectives
Based on the review of existing literature and the gaps identified, the following research
objectives have been formulated:
2. To ensure higher detection accuracy with sensitivity towards the subtle manipulation
artifacts and inconsistencies.
3. To make the proposed model resilient across different real-world conditions such as
low resolution, compression artifacts, and environmental noise.
US4: As a user, I want to create signup and login pages so that users can securely access the application.
Acceptance criteria: A trusted access control mechanism allows users to register accounts and log in to the application safely.

US5: As a user, I want to build a database for user authentication so that users' credentials are securely stored.
Acceptance criteria: A secure and encrypted back-end database stores users' credentials, thereby preserving the privacy of the data.

US6: As a user, I want to create an upload page for users to upload videos or images so that the application can analyze their authenticity.
Acceptance criteria: An intuitive upload interface allows users to upload their material for analysis and receive concise results on manipulation detection.

US7: As a user, I want to test the application so that it is free of bugs and meets performance standards.
Acceptance criteria: A thoroughly tested, dependable application, free of bugs and performing reliably in use.
2.6 Plan of Action
The first phase lays the groundwork by defining the objectives and scope of the project.
The project starts by collecting a relevant Kaggle dataset containing original and
manipulated videos. Each video is converted to frames, and only facial regions are
extracted for model training. The data is carefully annotated and preprocessed to
provide quality input to the model.
In this phase, the model architectures are chosen and training commences. Convolutional
neural networks, as well as an ensemble of base models for distinguishing manipulated
from original media, are deployed. Hyperparameters such as the learning rate, batch
size, and number of epochs are optimized to achieve the best detection accuracy.
After the model is developed, the system is built to incorporate it into a working
pipeline, including a front end for uploading videos or images and back-end services
to perform the analysis. The design emphasizes modularity, scalability, and real-time
performance.
Finally, the models are tested on unseen data to validate their performance. The
evaluations include accuracy, precision, recall, and F1 score, and the test conditions
ensure that the models are robust across varied types of videos and images.
CHAPTER 3
SYSTEM ARCHITECTURE AND DESIGN
The architecture governing the Manipulated Media Detection platform is a modular, orderly
flow that starts with data preprocessing. Initially, the input is a dataset of real and fake videos.
These videos go through preprocessing: each video is split into frames, faces are detected and
cropped, and only the facial regions are saved for further processing. This yields a processed
dataset containing only face images, free of background noise and irrelevant frames. Once the
dataset is ready, it is split into training and testing subsets using a controlled data-splitting
procedure for a clean evaluation setup. The training subset is fed into a hybrid ensembling
model that combines the Xception and NASNetMobile architectures for high-performance
manipulation detection.
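To make this preprocessing stage concrete, the following is a minimal sketch of the frame-splitting and face-cropping step. It assumes OpenCV's bundled Haar cascade detector, a 224x224 crop size, and sampling every tenth frame; the report's actual pipeline may use different choices.

import os

import cv2

# Haar cascade face detector shipped with OpenCV (an assumed detector choice).
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def video_to_face_crops(video_path, out_dir, every_nth=10):
    # Split a video into frames and keep only cropped, resized facial regions.
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        if idx % every_nth == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
                face = cv2.resize(frame[y:y + h, x:x + w], (224, 224))
                cv2.imwrite(os.path.join(out_dir, f"face_{saved}.jpg"), face)
                saved += 1
        idx += 1
    cap.release()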
Evaluation of the model's performance, derived from the confusion matrix, gives a view of
accuracy, precision, recall, and related metrics. Once trained, the hybrid model is saved for
future predictions. In practice, when a user uploads a new video, the same preprocessing steps
are executed within the system, namely splitting into frames, face detection, and cropping, and
the extracted face images are given as input to the pre-trained ensemble model. This path,
indicated in the architecture diagram, classifies the uploaded media as 'Real' or 'Fake'
accordingly. The modular setup also allows easy retraining and scaling, with prediction speeds
fast enough for manipulated media detection in real time.
The data flow diagram depicts the entire pipeline of the Manipulated Media Detection project,
from basic setup to final prediction. It starts with importing the required libraries. After the
environment is set up, the next step is to verify that all libraries and prerequisites are properly
loaded; if they are not, the process terminates abruptly with the message "NO PROCESS".
After this validation, the system proceeds to dataset ingestion, followed by image processing:
frame extraction, cropping of face images, and normalization. Data visualization comes into
play after preprocessing, mapping the distribution and quality of the dataset and ensuring that
the data is in a state appropriate for model building.
This phase includes model construction, implementing multiple architectures:
InceptionResNetV2, VGG19, CNN, Xception, and NASNetMobile, as well as the ensemble
(Xception + NASNetMobile) model. Once the models are constructed, they are trained on the
preprocessed dataset. In the later part of the system, a web-based interface is introduced
through which users can register and log in to access the service. Users upload a video or
picture to be processed by the model, which returns a final prediction on whether the medium
is fake or real. Finally, the output is presented to the user, completing the entire cycle of media
manipulation detection.
CHAPTER 4
SPRINT PLANNING AND METHODOLOGY
4.1 SPRINT I
The Sprint I objective was to set up a robust and clean dataset pipeline for training models
used in manipulated media detection. This sprint particularly targeted the collection of
deepfake videos, frame extraction, dataset cleaning, and preprocessing methods such as
resizing, face cropping, and normalization. The intent was to provide high-quality, properly
labeled datasets for training the models to establish deep features that can discriminate
between real and fake media.
1. /data/train/real/, /data/train/fake/
2. /data/test/real/, /data/test/fake/
• Splitting: 80% training, 20% testing (a minimal loading sketch is shown below)
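Under this layout, a minimal loading sketch using Keras' ImageDataGenerator might look as follows; the 224x224 target size and batch size of 32 are assumptions, not values taken from the report.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values to [0, 1]; class labels ('real'/'fake') are inferred
# from the subdirectory names.
datagen = ImageDataGenerator(rescale=1.0 / 255)
train_gen = datagen.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32,
    class_mode="categorical")
test_gen = datagen.flow_from_directory(
    "data/test", target_size=(224, 224), batch_size=32,
    class_mode="categorical")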
4.2 SPRINT II
Each pre-trained base network was extended with a custom classification head.
Ensemble Model:
1. Averaging the outputs of the Xception and NASNetMobile models.
2. Final decisions are made by majority voting or by averaging prediction
probabilities.
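A minimal Keras sketch of the averaging variant is shown below; the saved-model file names and the 224x224 input shape are assumptions, not taken from the report.

from tensorflow.keras.layers import Average, Input
from tensorflow.keras.models import Model, load_model

# Load the two already-trained base models (file names are assumed).
xception = load_model("xception.h5", compile=False)
nasnet = load_model("nasnetmobile.h5", compile=False)

# Feed one shared input through both models and average their softmax outputs.
inp = Input(shape=(224, 224, 3))
out = Average()([xception(inp), nasnet(inp)])
ensemble = Model(inputs=inp, outputs=out)

Majority voting, the alternative mentioned above, would instead take the argmax of each model's per-frame prediction and return the more frequent label.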
Training pipeline:
1. Data loaded from different directories
2. Rescaled
3. Passed through the models in sequential order
Evaluation Metrics:
Precision, recall, F1 score, sensitivity, specificity, and related metrics were evaluated using
their standard formulae, given below.
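In terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) taken from the confusion matrix, the standard formulae are:

\[
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{Precision} = \frac{TP}{TP + FP}
\]
\[
\text{Recall (Sensitivity)} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\]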
Block diagrams illustrated the flow from data input → model training → evaluation.
4.3 SPRINT III
5. Pass to trained model → Collect predictions
6. If video: Majority voting for final decision
7. Result returned on the frontend with visualization (confidence bars)
4.4 METHODOLOGY
4.4.1 InceptionResnetV2
InceptionResNetV2 combines the depth of Inception modules with the efficiency of residual
connections, maintaining a delicate trade-off between speed and accuracy. It enabled deeper
feature extraction, capturing the fine textures that differentiate real from fake faces. Multiple
parallel convolutional paths efficiently analysed manipulation at different scales. Transfer
learning, using pre-trained weights, allowed fast convergence on the manipulated media
classification task.
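A minimal transfer-learning sketch of this setup follows; the 224x224 input size, fully frozen base, and single-layer classification head are assumptions rather than the report's exact configuration.

from tensorflow.keras.applications import InceptionResNetV2
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# Pre-trained ImageNet backbone without its original classifier.
base = InceptionResNetV2(weights="imagenet", include_top=False,
                         input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained feature extractor

# Small classification head for the two classes: real vs. fake.
x = GlobalAveragePooling2D()(base.output)
outputs = Dense(2, activation="softmax")(x)
model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])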
4.4.2 Xception
The Xception architecture uses depthwise separable convolutions, making it very
parameter-efficient without loss of accuracy. It was particularly good at revealing subtle
deepfake artifacts by treating spatial and cross-channel correlations independently. The
lightweight nature of Xception enabled faster training and high accuracy, a necessary
requirement for ensemble learning.
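As a small illustration of this parameter efficiency (not code from the report), compare a standard and a depthwise separable 3x3 convolution over the same feature map in Keras:

from tensorflow.keras.layers import Conv2D, Input, SeparableConv2D
from tensorflow.keras.models import Model

inp = Input(shape=(224, 224, 64))
standard = Model(inp, Conv2D(128, 3, padding="same")(inp))
separable = Model(inp, SeparableConv2D(128, 3, padding="same")(inp))

# Standard:  3*3*64*128 + 128 biases = 73,856 parameters.
# Separable: 3*3*64 (depthwise) + 64*128 (pointwise) + 128 biases = 8,896.
print(standard.count_params(), separable.count_params())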
4.4.3 NasNetMobile
NASNetMobile was evolved through Neural Architecture Search (NAS), optimizing its
architecture for accuracy and speed. Its modular building blocks allowed enhanced feature
extraction, which proved essential for detecting imperfections in fake images. At the same
time, NASNetMobile provided a lightweight yet powerful solution, balancing mobile
deployment capability with competitive performance in deepfake detection.
4.4.4 VGG19
The depth and simplicity of VGG19 have made it a reliable baseline for manipulated media
detection. Its hierarchical feature learning, although computationally heavy, was beneficial
for noticing slight differences in facial textures and backgrounds associated with deepfakes.
Fine-tuning VGG19 on the pre-processed dataset established strong baseline benchmarks
against which the newer architectures were compared.
CHAPTER 5
RESULTS AND DISCUSSION
5.1 Distribution of Data
After preprocessing and extracting 30 random frames from each video, the dataset was split
into two distinct classes representing real and fake media. Of the total count of images, the
training set comprised 16,415 images while the validation set comprised 4,109 images. This
distribution shows the datasets are robustly sized, allowing models to learn the complex
patterns that differentiate real from manipulated media. A large training sample aids
generalization, while the validation set provides a useful performance monitor to mitigate
overfitting. Such careful frame extraction and dataset structuring create a balanced
environment for training deep learning models such as InceptionResNetV2, VGG19, CNN,
Xception, NASNetMobile, and ensemble methods, bringing accuracy and robustness to fake
media detection.
5.2 Evaluation Metrics of Each Model
5.2.1 Evaluation using Xception for manipulated media detection
Confusion Matrix for Xception
5.2.2 Evaluation using InceptionResnetV2 for manipulated media detection
Confusion Matrix for InceptionResnetV2
5.2.3 Evaluation using VGG19 for manipulated media detection
Confusion Matrix for VGG19
VGG19 is a deep CNN architecture well known for its simplicity and effectiveness in
image classification tasks. It consists of 19 layers, including convolutional, pooling, and
fully connected (FC) layers, making it capable of extracting hierarchical features from
input images. Its usage focuses on high-level image classification tasks, where the
network is trained to recognize various objects and patterns. The purpose of VGG19 is
to offer a highly interpretable and scalable model for feature extraction in visual
recognition tasks.
5.2.4 Evaluation using NasNetMobile for manipulated media detection
Confusion Matrix for NasNetMobile
5.2.5 Evaluation using Proposed Hybrid Ensembling Model (Xception+NasNetMobile)
Confusion Matrix for the Proposed Hybrid Model
The ensemble model combining Xception and NasNetMobile leverages the strengths of
both architectures to boost prediction accuracy and model robustness. By
combining the output of these models, the ensemble approach reduces the chances of
overfitting and enhances generalization across different datasets. This method is used
for tasks requiring high accuracy, such as deep fake detection, by exploiting the
complementary features learned by both models. The purpose of the ensemble is to
capitalize on the diversity of Xception’s feature extraction and NasNetMobile’s mobile
efficiency, achieving optimal performance.
5.3 Comparative Analysis of Each Model
The highest accuracy was obtained by the ensemble model, which achieved 97.02%, along
with the highest F1-score and the lowest errors (MSE and MAE). Both InceptionResNetV2
(95.35%) and NASNetMobile (95.32%) performed better than Xception (93.55%) and
VGG19 (91.75%), but none came close to the performance of the ensemble model. VGG19
performed the worst, with an accuracy of 91.75% and the highest error rates (MSE = 0.0604
and MAE = 0.1206), suggesting that such older architectures may not be sufficiently reliable
for media manipulation detection. The hybrid ensemble approach effectively combined
multiple architectures, achieving better classification through their joint strengths.
CHAPTER 6
CONCLUSION AND FUTURE ENHANCEMENTS
In conclusion, the deep fake detection system developed in this study demonstrates
significant promise in identifying manipulated media using advanced deep learning algorithms.
Among the various models tested, the ensemble approach combining Xception and
NasNetMobile emerged as the highest-performing algorithm, achieving a remarkable accuracy
of 97.01%, with corresponding precision, recall, and F1-score values that further highlight its
efficacy. The ensemble method's superior performance can be attributed to its ability to harness
the strengths of both models, resulting in enhanced generalization and robust detection
capabilities. This high accuracy suggests that the proposed system is well-suited to address the
growing concerns surrounding deep fake content and can be deployed in real-world scenarios
to safeguard against the risks posed by digital manipulations. By focusing on optimizing
detection methods, the research offers a practical solution to mitigate the harmful impact of
deep fakes in various sectors, including media, politics, and cybersecurity.
Future work will focus on enhancing the system's ability to identify different forms of
digital media manipulation across different types of datasets, e.g., images and videos of
varying resolutions, lighting, and intricate backgrounds. This will involve fine-tuning the
hybrid ensembling approach through model selection optimization, weight adjustment, and
feature fusion techniques to attain optimal detection accuracy. Additionally, exploring more
advanced data augmentation processes and adversarial training can enhance the model's
robustness against newly emerging manipulation techniques. Future work can also involve
multi-modal analysis (audio, textual, and visual input fusion) to further improve detection
performance on different types of media. Periodic benchmarking against newly emerging
manipulation techniques will keep the system effective in real-world applications.
REFERENCES
[1]. B. Zi, M. Chang, J. Chen, X. Ma, and Y.-G. Jiang, “Wilddeepfake: A challenging real-
world dataset for deepfake detection,” in Proceedings of the 28th ACM International
Conference on Multimedia, 2020, pp. 2382–2390.
[2]. T. Zhao, X. Xu, M. Xu, H. Ding, Y. Xiong, and W. Xia, “Learning self-consistency for
deepfake detection,” in Proceedings of the IEEE/CVF International Conference on
Computer Vision, 2021, pp. 15023–15033.
[3]. S. Fernandes, S. Raj, R. Ewetz, J. S. Pannu, S. K. Jha, E. Ortiz, I. Vintila, and M. Salter,
“Detecting deepfake videos using attribution-based confidence metric,” in Proceedings
of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops,
2020, pp. 308–309.
[4]. P. Yang, R. Ni, and Y. Zhao, “Recapture image forensics based on Laplacian
convolutional neural networks,” in International Workshop on Digital Watermarking.
Springer, 2016, pp. 119–128.
[5]. B. Bayar and M. C. Stamm, “A deep learning approach to universal image manipulation
detection using a new convolutional layer,” in Proceedings of the 4th ACM workshop
on information hiding and multimedia security, 2016, pp. 5–10.
[6]. J. Luttrell, Z. Zhou, Y. Zhang, C. Zhang, P. Gong, B. Yang, and R. Li, “A deep transfer
learning approach to fine-tuning facial recognition models,” in 2018 13th IEEE
Conference on Industrial Electronics and Applications (ICIEA). IEEE, 2018, pp.
2671–2676.
[7]. S. Tariq, S. Lee, H. Kim, Y. Shin, and S. S. Woo, “Detecting both machine and human
created fake face images in the wild,” in Proceedings of the 2nd international workshop
on multimedia privacy and security, 2018, pp. 81–87.
[8]. D. Afchar, V. Nozick, J. Yamagishi, and I. Echizen, “Mesonet: a compact facial video
forgery detection network,” in 2018 IEEE international workshop on information
forensics and security (WIFS). IEEE, 2018, pp. 1–7.
[9]. Y. Li, M.-C. Chang, and S. Lyu, “In ictu oculi: Exposing AI created fake videos by
detecting eye blinking,” in 2018 IEEE International Workshop on Information Forensics
and Security (WIFS). IEEE, 2018, pp. 1–7.
[10]. Y. Li and S. Lyu, “Exposing deepfake videos by detecting face warping
artifacts,” arXiv preprint arXiv:1811.00656, 2018.
APPENDIX A
CODING AND IMPLEMENTATION
Function for converting videos to frames
Creating Required Directories
Checking Data Distribution of each Class
Visualizing Train Data
Writing Functions for Evaluation Metrics
Saving the Trained Model
Confusion Matrix
Training InceptionResnetV2
Training VGG19 Model
Training NasNetMobile Model
Training the Hybrid Ensemble Model
Model Description
Evaluation Metric History
Function to Detect if the Input is a Video
Sample Prediction I (video as input)
Creating a Webpage for Uploading the Media
1. Home Page
2. Result Page
Result Page After Predicting the Uploaded Media
import os

import cv2
import numpy as np
from flask import Flask, request, render_template
from tensorflow.keras.models import load_model
from werkzeug.utils import secure_filename

app = Flask(__name__)
# Raw string avoids backslash-escape issues in the Windows path.
model = load_model(r"D:\Major Code\kaggle_output\EnsembleModel.h5",
                   compile=False)
class_labels = ['Fake', 'Real']
UPLOAD_FOLDER = 'uploads'
os.makedirs(UPLOAD_FOLDER, exist_ok=True)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

IMAGE_EXTS = {'png', 'jpg', 'jpeg'}  # assumed supported file types
VIDEO_EXTS = {'mp4', 'avi', 'mov'}

def preprocess_frame(frame):
    # Reconstructed helper (not shown in the original listing): resize to the
    # model's assumed 224x224 input, scale to [0, 1], add a batch dimension.
    frame = cv2.resize(frame, (224, 224))
    frame = frame.astype('float32') / 255.0
    return np.expand_dims(frame, axis=0)

def predict_image(image_path):
    image = cv2.imread(image_path)
    processed_image = preprocess_frame(image)
    prediction = model.predict(processed_image)
    return class_labels[np.argmax(prediction)]

def predict_video(video_path):
    cap = cv2.VideoCapture(video_path)
    results = []
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        processed = preprocess_frame(frame)
        pred = model.predict(processed)
        results.append(class_labels[np.argmax(pred)])
    cap.release()
    # Majority vote over the per-frame predictions.
    return max(set(results), key=results.count)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/upload', methods=['POST'])
def upload():
    if 'media' not in request.files:
        return "No file part"
    file = request.files['media']
    if file.filename == '':
        return "No selected file"
    filename = secure_filename(file.filename)
    filepath = os.path.join(app.config['UPLOAD_FOLDER'], filename)
    file.save(filepath)
    ext = filename.rsplit('.', 1)[-1].lower()
    if ext in IMAGE_EXTS:
        result = predict_image(filepath)
    elif ext in VIDEO_EXTS:
        result = predict_video(filepath)
    else:
        return "Unsupported file type"
    # Assumed result template; the original listing cuts off before the return.
    return render_template('result.html', result=result)

if __name__ == '__main__':
    app.run(debug=True)
APPENDIX B
RESEARCH PAPER
APPENDIX C
PAPER SUBMISSION FORM
APPENDIX D
PLAGIARISM REPORT