Report
Arvind Raghavendran
21f1005301
April 17, 2024
Contents
1 Introduction
2 Preprocessing
2.1 Data Augmentation
2.2 Normalization
4 Model Exploration
4.1 Simple CNNs
4.2 Transfer Learning
5 Model Training and Evaluation
5.1 Metrics
6 Observations
7 Conclusions
8 Future Work
9 Acknowledgement
Abstract
This report documents the process and findings of developing a machine learning model to classify real and fake images. The objective is
to combat misinformation and maintain the integrity of visual content in
the digital age. The report details the preprocessing steps, exploratory
data analysis (EDA), model selection, training, and evaluation, along with
conclusions and potential areas for further improvement.
1 Introduction
The proliferation of manipulated images in various media platforms has raised
concerns about the authenticity of visual content. To address this issue, this
project aims to develop a machine learning model capable of distinguishing
between real and fake images. The model’s performance is evaluated using
metrics such as accuracy, precision, and recall. This report provides a detailed
overview of the methodology, experimentation, and results obtained during the
project.
2 Preprocessing
The preprocessing stage involves preparing the image data for training and testing the model. It includes steps such as data augmentation and normalization.
2.1 Data Augmentation
Data augmentation artificially expands the training set by applying label-preserving transformations to the images, helping the model generalize from a limited number of samples.
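As an illustration, augmentation of this kind is often implemented with Keras preprocessing layers. The sketch below is an assumption about the pipeline; the specific transformations used in this project are not stated in the report.

```python
import tensorflow as tf

# Illustrative augmentation pipeline; the particular transformations
# chosen here are assumptions, not taken from the report.
augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),  # rotate by up to +/-10% of a full turn
    tf.keras.layers.RandomZoom(0.1),      # zoom in or out by up to 10%
])

# Applied on the fly during training, e.g. over a tf.data pipeline:
# train_ds = train_ds.map(lambda x, y: (augmentation(x, training=True), y))
```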
2.2 Normalization
Normalization is performed to scale the pixel values of the images to a standard
range, typically [0, 1]. This ensures that the model learns effectively without
being biased by the input data’s varying scales.
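Assuming the images arrive as standard 8-bit arrays, this scaling amounts to a single division (a minimal sketch, not necessarily the exact code used):

```python
import numpy as np

def normalize(images: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values from [0, 255] down to [0, 1]."""
    return images.astype("float32") / 255.0
```

Equivalently, a tf.keras.layers.Rescaling(1./255) layer can be placed at the front of the model so the scaling travels with it.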
4 Model Exploration
Various machine learning models were explored to identify the most suitable architecture for the real vs. fake image classification task. The models considered included simple convolutional neural networks (CNNs) as well as transfer learning-based approaches.
4.1 Simple CNNs
• Model 1: 3 convolutional layers with 64, 128, and 256 filters respectively, each followed by batch normalization, max-pooling, and dropout (dropout rate of 0.5); a sketch of this architecture is given below.
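A minimal Keras sketch of Model 1. The input size, kernel size, and the dense head are assumptions not stated in the report; only the three conv blocks with 64, 128, and 256 filters, batch normalization, max-pooling, and dropout at 0.5 come from the description above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model1(input_shape=(224, 224, 3)):
    model = models.Sequential([layers.Input(shape=input_shape)])
    # Three conv blocks with 64, 128, and 256 filters, each followed by
    # batch normalization, max-pooling, and dropout at rate 0.5.
    for filters in (64, 128, 256):
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling2D())
        model.add(layers.Dropout(0.5))
    model.add(layers.Flatten())
    model.add(layers.Dense(1, activation="sigmoid"))  # real vs. fake
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```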
4.2 Transfer Learning
• Model 3: VGG16 with the final dense layers replaced by two fully connected layers with 2048 neurons each, followed by ReLU activation and a sigmoid output layer; a sketch is given after this list.
• Model 4: Similar to Model 3, but with different numbers of neurons in the fully connected layers (e.g., 4096 neurons each).
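A possible realization of Model 3 in Keras. Using ImageNet weights and freezing the convolutional base are standard transfer-learning choices assumed here, not details given in the report.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

def build_model3(input_shape=(224, 224, 3), dense_units=2048):
    base = VGG16(weights="imagenet", include_top=False,
                 input_shape=input_shape)
    base.trainable = False  # freeze the pretrained convolutional base
    model = models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(dense_units, activation="relu"),
        layers.Dense(dense_units, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # real vs. fake
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Model 4 is the same sketch with dense_units=4096.
```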
5 Model Training and Evaluation
The selected model architecture is trained using the preprocessed data, and its
performance is evaluated using standard evaluation metrics.
5.1 Metrics
The model’s performance is assessed using metrics such as accuracy, precision,
recall, and F1 score. These metrics provide insights into the model’s ability to
correctly classify real and fake images and its balance between precision and
recall.
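These metrics can be computed from the model's thresholded predictions, for instance with scikit-learn. The labels and probabilities below are illustrative placeholders, not results from the project.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

# Illustrative ground-truth labels and sigmoid outputs (0 = fake, 1 = real).
y_true = np.array([1, 1, 1, 0, 1, 0, 1, 1])
y_prob = np.array([0.9, 0.8, 0.7, 0.6, 0.4, 0.2, 0.95, 0.55])
y_pred = (y_prob >= 0.5).astype(int)  # threshold the probabilities

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))  # rows: true class, cols: predicted
```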
6 Observations
1. Transfer learning gives a substantial boost over models constructed from scratch, because the pretrained networks have already learned general features from large, diverse datasets.
2. The metric graphs observed while training look promising, yet the confusion matrix reveals that the model does not perform well on the minority class, despite the use of class weights.
3. This discrepancy arises because these metrics can be misleading when the classes are imbalanced, for example when N ≪ P. A model that labels nearly every image as positive then achieves high recall simply because false negatives are scarce, while precision and accuracy also remain high because there are few negatives to misclassify. For instance, with 950 positives and 50 negatives, predicting every image as positive yields 100% recall, 95% precision, and 95% accuracy.
4. The graphs also show erratic trends in learning, likely due to the small number of samples and the resulting noise in each batch. Fortunately, the best model seen during training is saved using ModelCheckpoint; a sketch follows below.
Model performance would likely be better with more data, since even data augmentation is not closing the gap here.
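A sketch of how the checkpointing and class weighting mentioned above might be wired together in Keras. The file path, monitored metric, and weight values are illustrative assumptions, not taken from the report.

```python
import tensorflow as tf

# Save only the best weights seen so far, judged on validation accuracy.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model.keras",
    monitor="val_accuracy",
    save_best_only=True,
)

# Up-weight the minority class so its errors count more in the loss.
# The 1:4 ratio here is illustrative, not a value from the report.
class_weight = {0: 4.0, 1: 1.0}

# model.fit(train_ds, validation_data=val_ds, epochs=20,
#           callbacks=[checkpoint], class_weight=class_weight)
```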
7 Conclusions
In conclusion, the project demonstrates the effectiveness of transfer learning in
addressing the challenges of real vs. fake image classification. The chosen model architecture, fine-tuned VGG16, achieves significantly better performance compared to simple CNNs. However, further improvements are possible with additional experimentation and optimization. The project highlights the importance
of robust preprocessing, thorough EDA, and systematic model exploration in
developing effective machine learning solutions for image classification tasks.
8 Future Work
Potential areas for future work include:
• Collecting a larger and more balanced dataset, since data augmentation alone did not resolve the weak performance on the minority class.
• Further experimentation with pretrained architectures and hyperparameter tuning, building on the fine-tuned VGG16 baseline.
9 Acknowledgement
I'd like to thank IIT Madras and the CV team for this opportunity. This project solidified my understanding of computer vision and of how important transfer learning is. It has also improved my confidence in working with deep neural networks. This project is a fantastic addition to my resume, and I hope to work on more such projects.