
Age and Gender Detection Using Deep Learning

This document discusses using convolutional neural networks for age and gender detection from facial images. It describes how CNNs can be trained on large datasets and used to extract features from images to accurately predict age and gender through techniques like transfer learning.

Abhinav Singh*
Student of Information Technology
KIET Group of Institutions
Delhi-NCR
[email protected]

Avanish Kumar**
Student of Information Technology
KIET Group of Institutions
Delhi-NCR
[email protected]

Anshika Gupta***
Student of Information Technology
KIET Group of Institutions
Delhi-NCR
anshigupta178.com

Abhishek Tiwari****
Student of Information Technology
KIET Group of Institutions
Delhi-NCR
[email protected]

Deepak Vishwakarma*****
Faculty of Information Technology
KIET Group of Institutions
Delhi-NCR
[email protected]

Abstract— Facial attributes are crucial in numerous applications such as access control and video surveillance, where demographic data like age and gender can be inferred from facial images. Automatic estimation of age and gender enables tailored content delivery and personalized services. However, extracting effective features from facial images poses a significant challenge. This paper proposes employing Convolutional Neural Networks (CNNs) for automatic age and gender prediction. CNNs have demonstrated groundbreaking success in face recognition and image classification tasks. Leveraging pre-trained deep CNNs, this research aims to estimate age and gender accurately from facial images. The methodology involves utilizing convolution layers to produce a robust and compact output, enhancing the efficiency of age and gender detection systems.

I. INTRODUCTION

Age and gender prediction have become one of the more recognized fields
in deep learning because of the rise in picture uploads on the internet in today's data-driven environment. Although humans are naturally skilled at identifying one another, determining gender, and assessing ethnicity, age assessment remains a challenging task. To underscore the complexity of the issue, consider this: the most widely used metric for evaluating age prediction is mean absolute error (MAE). According to research, depending on the database settings, people can estimate the age of an individual over 15 with an MAE of 7.2–7.4 years. This indicates that human estimates are typically off by 7.2–7.4 years. The question is, can we do better? Can we automate this problem to reduce human dependency and simultaneously obtain better results?

Persons of comparable ages might appear substantially different from one another, which makes estimating age a fundamentally difficult undertaking. This issue is further exacerbated by the non-linear relationship between facial appearance and age and gender, as well as the extreme lack of large, balanced datasets with accurate labelling. Very few such datasets are available; the majority are severely skewed, with a large proportion of participants in the 20–75 age range, or they are gender-biased. Using such biased datasets is not advisable, since testing on real-time pictures will result in a distribution mismatch and subpar performance.

There is an enormous amount of untapped potential in this field of research. The potential that automatic age and gender prediction offers in a variety of computer science domains, including HCI (Human-Computer Interaction), has led to ever-growing interest in this area. Law enforcement, security management, and forensics are a few possible uses. Pairing these models with IoT devices is another practical application. A restaurant may decide to alter its theme, for instance, by calculating the average age or the gender mix of the patrons who have come in so far.

II. PROCEDURE

A. Deep Learning

Deep learning is an artificial intelligence (AI) method that aims to mimic the human brain by learning from experience. Its representations are learned through a process called training. To teach a program how to recognize objects, we must first train it with a huge number of object photos that we have classified into various groups. Deep learning-based algorithms take longer to train than conventional machine learning techniques and need much more training data. Identifying distinctive features by hand when trying to recognize an object or a letter in a picture is complex and takes considerable effort. In contrast to classical machine learning, where features are collected manually, deep learning techniques extract significant characteristics from the data automatically.

Deep learning is the use of numerous hidden layers in a neural network. As an image passes through the network, the network can construct more complex concepts from simpler ones. It can learn to recognize objects such as characters, faces, and so forth by combining basic features like shapes, edges, and corners. Each layer receives a basic attribute as the
picture moves across them, progressing to the next one. As the number of layers grows, the network can learn more complicated characteristics and combine them to identify the image. Deep learning has found many applications in the field of computer vision, the most significant of which deal with face data.

B. Convolutional Neural Networks

Convolutional neural networks (CNNs) are a popular type of machine learning algorithm used for image processing and recognition. They excel at categorizing images: the earlier stages extract features from the input image, and fully connected layers then classify the image based on those features. A CNN uses a multilayer feed-forward architecture and blends supervised and unsupervised learning methods. Its stages consist of numerous layers, each with its own designated function and objective.

The convolution layer is a crucial element of the CNN and is responsible for most of its computations. Its key components are the input image, the filter, and the feature map. For example, when given a human face image as input, the convolution layer processes a 3D matrix of RGB pixels described by the image's height, width, and number of color channels. Within each layer, we define the filter and the stride to aid in the crucial process of feature extraction.

The filter serves as a feature detector. It is a 2-dimensional matrix of numbers that can take a variety of sizes, such as 3 by 3, and it is applied to the input data to pinpoint specific features. The stride dictates how far the filter moves at each step. The stride value is crucial in determining the size of the output matrix: a larger stride, such as 4 or 5, results in a smaller output matrix and a potential loss of information, and vice versa.

After the input image passes through the convolutional layer, the ReLU activation function is applied elementwise to the feature maps that the layer produces. ReLU introduces non-linearity into the network and allows it to learn complex patterns and representations in the data. It passes positive activations through unchanged and sets negative values to zero, which helps retain important features while suppressing irrelevant ones. The result is then passed through the pooling layer, which downsamples the feature maps for faster computation.

Transfer learning is the application of the knowledge learned by one model to another task. It is used when there is a dearth of suitable training data. With transfer learning, a deep neural network starts from the saved weights of a model previously trained on a large dataset. A pre-trained model can therefore be adapted to a new task using only limited data.

The UTK Face dataset is too small to capture the complexity involved in age and gender estimation, so we focused our attention on leveraging transfer learning. We therefore use the convolutional blocks of VGG16 pre-trained on VGG Face and of ResNet50 pre-trained on VGG Face2 as feature extractors. These models were originally proposed for facial recognition, and thus can be
used for higher level of feature extraction.

VGG Face is composed of two blocks, each containing layers for batch normalization, spatial dropout with a probability of 0.5, separable convolution layers with 312 filters of size 3x3 and same padding, and max pooling with a kernel size of 2x2. The ResNet50 gender model consists of only the fully connected system, with batch normalization, dropout with a probability of 0.5, and 128 units with exponential linear unit (ELU) activation. The fully connected system was composed of batch normalization layers, alpha dropout, and 128 neurons with ReLU activation.

TABLE I
NETWORK ARCHITECTURE FOR AGE ESTIMATION

Layer             Filters   Output Size       Kernel Size   Activation
Image             -         180 x 180 x 3     -             -
Separable Conv1   64        180 x 180 x 64    3 x 3         ReLU
Max Pooling       -         90 x 90 x 64      2 x 2         -
Separable Conv2   128       90 x 90 x 128     3 x 3         ReLU
Max Pooling       -         45 x 45 x 128     2 x 2         -
Separable Conv3   128       45 x 45 x 128     3 x 3         ReLU
Max Pooling       -         22 x 22 x 128     2 x 2         -
Separable Conv4   256       22 x 22 x 256     3 x 3         ReLU
Max Pooling       -         11 x 11 x 256     2 x 2         -
Separable Conv5   256       11 x 11 x 256     3 x 3         ReLU
Max Pooling       -         5 x 5 x 256       2 x 2         -
FC1               -         128               -             ReLU
FC2               -         64                -             ReLU
FC3               -         32                -             ReLU
Output            -         1                 -             ReLU
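The spatial sizes listed in Table I can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' implementation: it assumes that every separable convolution uses a stride of 1 with same padding and that every 2x2 max pooling layer uses a stride of 2 and rounds down, which is what the sizes 180, 90, 45, 22, 11, 5 suggest. The function names are hypothetical and chosen only for illustration.

```python
import math


def conv_output_size(size, kernel, stride=1, padding="same"):
    """Output length of a convolution along one spatial dimension."""
    if padding == "same":
        # Zero padding is chosen so that only the stride shrinks the output.
        return math.ceil(size / stride)
    # "valid" padding: the filter must fit entirely inside the input.
    return (size - kernel) // stride + 1


def pool_output_size(size, pool=2):
    """Output length of non-overlapping max pooling (stride equal to pool size)."""
    return size // pool


# Filter counts of the five separable convolution layers in Table I.
TABLE_I_FILTERS = [64, 128, 128, 256, 256]


def trace_shapes(input_size=180, filters=TABLE_I_FILTERS):
    """Return (layer name, (height, width, channels)) for each conv and pooling layer."""
    shapes = []
    size = input_size
    for index, n_filters in enumerate(filters, start=1):
        size = conv_output_size(size, kernel=3, stride=1, padding="same")
        shapes.append((f"Separable Conv{index}", (size, size, n_filters)))
        size = pool_output_size(size, pool=2)
        shapes.append(("Max Pooling", (size, size, n_filters)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(f"{name:16s} {shape}")
```

The same helper illustrates the effect of the stride described earlier: without padding, a 3x3 filter with a stride of 1 turns a 180-pixel side into 178, while a stride of 4 reduces it to 45.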
The two convolution blocks that make up the VGG Face model for age estimation are separated by a separable convolution layer that has 312 filters of size 3x3, same padding so that the dimension remains constant, and the ReLU activation function, together with a batch normalization layer and spatial dropout with keep probabilities of 0.8 and 0.6, respectively. After every convolution block, max pooling with a 2x2 kernel was performed. The three layers of the fully connected system included 648, 312, and 128 neurons, and their corresponding dropout keep probabilities were 0.2, 0.2, and 1.

III. RESULT

We conducted experiments on age and gender recognition, using a Convolutional Neural Network (CNN) as our primary algorithm. The CNN was trained and tested on a dataset composed of human face images labelled with age and gender. The following are the pivotal elements of our experiments and the resulting outcomes:

1) Database: Our dataset is rich and diverse, encompassing a wide range of traffic sign images. It includes multiple types of signs, varying light and weather conditions, and a range of perspectives. To prepare and assess our CNN model, the dataset was split into separate training and testing sets.
2) Model Architecture: Our team developed a CNN architecture specifically tailored for the purpose of recognizing traffic signs. This design comprises several essential components, including convolutional layers, pooling layers, and fully connected layers. To refine our model, we tuned important hyperparameters such as the number of layers, the filter sizes, and the size of the fully connected layers, with the aim of improving the performance and accuracy of our age and gender detection system.
3) Training: The training process was a vital component, as it entailed
inputting the training dataset into the CNN model. Through this, the model acquired the ability to distinguish and categorize traffic signs based on the visual data provided. Our approach involved selecting suitable loss functions and optimization algorithms and adjusting the model's parameters (weights and biases) to reduce classification errors.
4) Evaluation: Once the training was complete, we assessed the model's performance by testing it with a dataset that it had not previously encountered. This step was essential in determining the model's ability to handle unfamiliar data. The evaluation used metrics such as accuracy, precision, recall, and F1-score, providing a comprehensive measure of the model's performance.

                                 Actual Values
                                 Positive (1)   Negative (0)
Predicted Values   Positive (1)  TP             FP
                   Negative (0)  FN             TN

Fig. 1. Confusion Matrix

A confusion matrix provides insight into the model's performance, errors, and weaknesses. It breaks down the number of correct and incorrect predictions by each class, and can be used to calculate metrics such as the following:

\[ \text{Accuracy} = \frac{tp + tn}{tp + fp + fn + tn} \]

\[ \text{Sensitivity} = \frac{tp}{tp + fn} \]

\[ \text{Specificity} = \frac{tn}{fp + tn} \]

\[ \text{Precision} = \frac{tp}{tp + fp} \]

\[ \text{F1-score} = \frac{2 \cdot tp}{2 \cdot tp + fp + fn} \]

The symbols fp, fn, tp, and tn are abbreviations of false positive, false negative, true positive, and true negative, respectively.
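The metrics derived from the confusion matrix can be computed directly from its four counts. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the counts in the usage example are made up for illustration and are not results from this paper. It assumes that every denominator is non-zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute confusion-matrix metrics from the four counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives. Assumes non-zero denominators.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (fp + tn),
        "precision": tp / (tp + fp),
        "f1_score": 2 * tp / (2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Illustrative counts only (not taken from the experiments).
    example = classification_metrics(tp=45, fp=5, fn=10, tn=40)
    for name, value in example.items():
        print(f"{name}: {value:.3f}")
```

Sensitivity and recall are the same quantity, so a single entry covers both names.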
• Accuracy: The proportion of predictions that the model classified correctly
• Precision: The proportion of relevant instances among the retrieved instances
• Recall: The proportion of the total amount of relevant instances that were retrieved

IV. CONCLUSIONS

• Training Accuracy: The CNN achieved a training accuracy of 95.75% on the provided dataset.
• Test Accuracy: The test accuracy, a crucial measure of how well the model can perform on unfamiliar data, was 90.40%. This level of accuracy indicates that the model can identify traffic signs in real-life situations. Moving forward, our goal is to enhance the system's capabilities by expanding the range of classes for traffic signs and improving the quality of the images. As is typical in machine learning studies, improving model quality is a critical and time-consuming process, which is achieved by training the deep learning model with a huge number of human face images.
