DL Mid

This document discusses experiments conducted to develop a YOLO model for face detection. It is divided into two parts where a basic YOLO implementation is first trained for face detection and then modifications are made to create a personalized YOLO model optimized specifically for face detection. The personalized model demonstrates improved accuracy and efficiency compared to the basic implementation according to the results.


Deep Learning

Mid Term

Instructor:

Dr. Mirza Mubasher Baig

Submitted by:

Sana Farooq 23L-8000

Amna Akbar 23L-7802



Contents

Introduction
Methodology
Results and Discussion
    Part A
    Part B
Conclusion
References

Introduction

Object detection is a critical task in computer vision that enables machines to identify and locate objects within images or video frames. Among object detection methods, the You Only Look Once (YOLO) model stands out for its real-time performance and high accuracy. YOLO is a convolutional neural network-based approach that processes an image in a single pass, making it considerably faster than traditional detection systems that rely on sliding-window approaches.

The YOLO model operates by dividing the input image into a grid and predicting bounding boxes and class probabilities for each grid cell. This approach allows YOLO to detect multiple objects of different classes in a single inference step, making it well-suited for real-time applications such as autonomous driving, surveillance, and robotics [1].
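The grid idea can be illustrated with a minimal sketch. It assumes a square grid and a box centre given in pixels; the function name and the 416x416 input with a 13x13 grid are illustrative choices, not values taken from our experiments.

```python
def responsible_cell(cx, cy, img_w, img_h, grid_size):
    """Return (row, col) of the grid cell that contains the box centre.

    cx, cy are the box centre in pixels. In YOLO-style detectors the cell
    containing the centre is the one that predicts the object.
    """
    # Scale the centre to grid units, then clamp so a centre lying exactly
    # on the right or bottom edge still maps to the last cell.
    col = min(int(cx / img_w * grid_size), grid_size - 1)
    row = min(int(cy / img_h * grid_size), grid_size - 1)
    return row, col


# Illustrative values: a 416x416 image split into a 13x13 grid,
# so each cell covers 32x32 pixels.
print(responsible_cell(200, 100, 416, 416, 13))
```

Each cell then predicts box offsets, an objectness score, and class probabilities for the objects whose centres fall inside it.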

In this report, we apply YOLO to face detection, a task that is central to security systems, human-computer interaction, and biometric authentication. We undertake two main tasks:

1. Part A: Basic YOLO Implementation for Face Detection:
In this task, we obtain a face detection dataset from Kaggle containing images annotated with bounding boxes around faces. We then implement a basic YOLOv3 model using OpenCV and fine-tune it to detect faces specifically. The objective is to update the fully connected component of the YOLO model so that it specializes in detecting human faces. We train the model on the provided dataset and evaluate its performance on a separate validation set.

2. Part B: Development of Personalized YOLO for Face Detection:
Building upon the basic YOLO implementation, we explore modifications to the YOLO architecture to develop a personalized version optimized for face detection. This task involves experimenting with changes such as removing certain pre-trained layers, adjusting network parameters, or introducing new components. The goal is a streamlined version of YOLO tailored to detecting human faces with improved efficiency and accuracy.

Throughout both tasks, we aim not only to achieve accurate face detection but also to optimize the models for deployment on resource-constrained devices such as mobile phones and edge computing platforms. By customizing YOLO for face detection, we seek to address the challenges specific to this task and to support practical applications in real-world scenarios.



Experiment Description

In this experiment, we use the YOLO model for face detection. The basic YOLOv3 model is trained to detect objects across 80 different classes. For our task, however, the model must be adapted to detect faces exclusively.
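To make the effect of moving from 80 classes to a single face class concrete, the sketch below computes the number of output channels of one YOLOv3 detection layer. It relies on the standard YOLOv3 layout, in which each anchor predicts four box coordinates, one objectness score, and one score per class, with three anchors per detection scale; the function name is ours.

```python
def detection_channels(num_classes, anchors_per_scale=3):
    """Output channels of one YOLOv3 detection layer.

    Each anchor predicts 4 box coordinates + 1 objectness score
    + one score per class.
    """
    return anchors_per_scale * (5 + num_classes)


coco_channels = detection_channels(80)  # original 80-class model
face_channels = detection_channels(1)   # single "face" class
print(coco_channels, face_channels)
```

Adapting the model to faces therefore shrinks each detection layer considerably, which is why only the final prediction layers need to change shape.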

For Part A, we start by downloading a face detection dataset from Kaggle. We then implement a basic YOLOv3 model and fine-tune it to detect faces using this dataset. The model is modified so that only the fully connected component is updated to specialize in face detection; because YOLOv3 is fully convolutional, this corresponds to its final detection layers.

For Part B, we propose a modification to the YOLO architecture to develop a personalized version. We explore the impact of removing certain pre-trained layers from the original YOLO network and contrast the result with a single multi-layer perceptron. Additionally, we aim to reduce the number of trainable parameters while maintaining or improving performance.
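As a rough illustration of how trainable parameters are counted when comparing convolutional layers with a multi-layer perceptron, the sketch below uses the standard formulas for a convolutional layer and a dense layer. The layer sizes are hypothetical and are not the configurations used in our experiments.

```python
def conv_params(kernel_size, in_channels, out_channels, bias=True):
    """Trainable parameters of a 2D convolution with a square kernel."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def dense_params(in_features, out_features, bias=True):
    """Trainable parameters of a fully connected (dense) layer."""
    return in_features * out_features + (out_features if bias else 0)


# Hypothetical sizes: a 3x3 convolution mapping 256 to 512 channels,
# versus a dense layer fed with a flattened 13x13x256 feature map.
conv_total = conv_params(3, 256, 512)
dense_total = dense_params(13 * 13 * 256, 512)
print(conv_total, dense_total)
```

Because a convolution shares its weights across spatial positions, its parameter count does not depend on the feature-map size, whereas the dense layer grows with every input location.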

Methodology

In Part A of our experiment, we began by preparing the face detection dataset obtained from Kaggle. After organizing and preprocessing the dataset, including resizing images to a consistent resolution, we implemented the basic YOLOv3 model using the OpenCV library. We initialized the model with pre-trained weights and modified the fully connected component to focus exclusively on detecting human faces. The model was trained on the annotated dataset, with its parameters optimized iteratively to minimize the detection loss.
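One common preprocessing step for YOLO training is converting bounding-box annotations to normalized centre coordinates. The sketch below assumes the annotations are given as pixel corner coordinates (x_min, y_min, x_max, y_max); the dataset's actual annotation format is not described here, so the function name and input layout are assumptions.

```python
def to_yolo_label(x_min, y_min, x_max, y_max, img_w, img_h, class_id=0):
    """Convert a pixel corner box to (class, cx, cy, w, h), all in [0, 1]."""
    cx = (x_min + x_max) / 2 / img_w  # normalized centre x
    cy = (y_min + y_max) / 2 / img_h  # normalized centre y
    w = (x_max - x_min) / img_w       # normalized width
    h = (y_max - y_min) / img_h       # normalized height
    return class_id, cx, cy, w, h


# Hypothetical annotation: a face box in a 400x500 image.
label = to_yolo_label(100, 50, 300, 250, img_w=400, img_h=500)
print(label)
```

Normalized labels have the convenient property that they stay valid when an image is resized by plain scaling, since every coordinate is expressed as a fraction of the image size.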

For Part B, which involved developing a personalized YOLO model for face detection, we followed a similar approach with additional attention to model modification and optimization. Because the face detection dataset is large, with over 60,000 images, we trained the model on batches of 3,000 images to manage computational resources. This batch-wise strategy allowed us to update the model parameters iteratively while monitoring performance metrics for convergence. During training, we explored modifications to the YOLO architecture, experimenting with different network architectures, layer configurations, and optimization techniques. We aimed to streamline the model for face detection by removing unnecessary layers, adjusting network parameters, and introducing components tailored to the task. Throughout the experiments, we documented each change to the model architecture and tracked its impact on the performance metrics to inform our decisions.
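The batch-wise strategy can be sketched as follows. This is a minimal illustration that assumes the dataset is represented as a list of image identifiers; the count of exactly 60,000 is a stand-in for the "over 60,000" images in the dataset, and the training step itself is left out.

```python
def chunked(items, chunk_size):
    """Yield consecutive slices of `items` with at most `chunk_size` elements."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


# Stand-in for the dataset: 60,000 image identifiers (assumed count).
image_ids = list(range(60_000))
chunks = list(chunked(image_ids, 3_000))

for index, chunk in enumerate(chunks):
    # A real pipeline would load these images and run training on them here.
    pass

print(len(chunks), len(chunks[0]))
```

Only one chunk of images needs to be held in memory at a time, which is the main benefit of this scheme on limited hardware.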

After training and optimization, we evaluated the personalized YOLO model on a separate validation set. We compared it with the basic YOLO implementation on key metrics such as detection accuracy, inference speed, and model size. We also analyzed the trade-offs between model complexity and performance to identify the best configuration for face detection. Through this evaluation, we aimed to demonstrate the effectiveness and efficiency of the personalized YOLO model for real-world applications and to lay the groundwork for future work in computer vision and object detection.
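Detection accuracy for bounding boxes is commonly measured with intersection over union (IoU) between a predicted box and a ground-truth box. The sketch below shows that standard computation for boxes given as (x_min, y_min, x_max, y_max); it is offered as an illustration of the metric, not as the exact evaluation code used in this report.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: the prediction is shifted right by half the box width.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold, with 0.5 being a common choice.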

Results and Discussion

Part A

The basic YOLO implementation successfully detects faces in images from the provided dataset. However, images with occlusions or complex backgrounds can produce false positives or missed detections. Overall, the model shows promising results in detecting faces across different poses and lighting conditions.

Part B

In Part B, our personalized YOLO model for face detection demonstrated strong performance, localizing human faces with precise bounding boxes. Although it was trained on batches of 3,000 images drawn from a dataset of over 60,000 images, the model ran efficiently with real-time capability and outperformed the basic YOLO implementation.



An essential aspect of face detection is the confidence score associated with each detection. These scores range from 0 to 1 and reflect the model's certainty in its predictions. We observed that clear, well-defined faces yielded higher confidence scores, while occluded or ambiguous faces resulted in lower scores, reflecting the model's uncertainty in challenging scenarios. The validation confusion matrix has non-zero values only on the diagonal, indicating that the model correctly classified every object in the validation set.
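The link between confidence scores and a confusion matrix can be sketched for the single-class case. The example assumes each candidate region comes with a confidence score and a ground-truth flag saying whether it really is a face; the threshold of 0.5 and the sample scores are hypothetical, not values from our experiments.

```python
def confusion_counts(candidates, threshold=0.5):
    """Tally a 2x2 confusion matrix for face vs. background.

    `candidates` is an iterable of (confidence, is_face) pairs. A candidate
    counts as a detection when its confidence reaches the threshold.
    """
    tp = fp = fn = tn = 0
    for confidence, is_face in candidates:
        detected = confidence >= threshold
        if detected and is_face:
            tp += 1  # face correctly detected
        elif detected and not is_face:
            fp += 1  # background reported as a face
        elif is_face:
            fn += 1  # face missed (low confidence)
        else:
            tn += 1  # background correctly rejected
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Hypothetical scores: an occluded face (0.30) is missed and one
# background patch (0.60) is wrongly accepted.
sample = [(0.95, True), (0.80, True), (0.30, True), (0.60, False), (0.10, False)]
counts = confusion_counts(sample)
print(counts)
```

A matrix with non-zero values only on the diagonal corresponds to `fp` and `fn` both being zero in this tally.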



The following figures show the original image, the image with faces bounded in rectangles, and their respective confusion matrices.



Conclusion

In conclusion, we have demonstrated the effectiveness of YOLO for face detection. By fine-tuning the basic YOLOv3 model and developing a personalized version, we achieved accurate and efficient face detection. These models have applications in security, surveillance, and human-computer interaction. Future work may involve further optimizing the personalized YOLO architecture and exploring additional enhancements for robust face detection in challenging environments.



References

[1] F. Gurkan, B. Sagman and B. Gunsel, "YOLOv3 as a Deep Face Detector," 2019 11th International Conference on Electrical and Electronics Engineering (ELECO), Bursa, Turkey, 2019, pp. 605-609, doi: 10.23919/ELECO47770.2019.8990641.
