Facial Age and Gender Prediction Using Deep Learning
P Himabindhu, S Kethavi
Department of Computer Science & Engineering,
Madanapalle Institute of Technology & Science,
Madanapalle, Andhra Pradesh.
[email protected] [email protected]
Authorized licensed use limited to: Raja Rajeswari College of Engineering. Downloaded on October 28,2024 at 07:03:14 UTC from IEEE Xplore. Restrictions apply.
979-8-3503-5306-8/24/$31.00 ©2024 IEEE 1571
II. LITERATURE SURVEY
[1] In "Feature extraction and classification using deep convolutional neural networks, PCA and SVC for face recognition," A. Bounoua and M. K. Benkaddour present a face recognition method that combines deep learning with more conventional machine learning approaches. Their primary goal is a facial recognition system that performs well by combining deep convolutional neural networks (CNNs), Support Vector Classification (SVC), and Principal Component Analysis (PCA). Trained on facial images, the CNN learns to extract the discriminative features that are crucial for precise face recognition.
[2] In "Effective Training of Convolutional Neural Networks for Face-Based Gender and Age Prediction," G. Antipov, M. Baccouche, and J. Dugelay discuss how to train CNNs to reliably estimate age and gender from face photos. The authors assess the trained models with suitable metrics: precision, recall, F1-score, and accuracy can be applied to gender prediction, while root mean square error (RMSE) and mean absolute error (MAE) are widely used for age prediction.
[3] The 2014 paper "Age and Gender Estimation of Unfiltered Faces" by Eidinger, Enbar, and Hassner, published in IEEE Transactions on Information Forensics and Security, tackles the difficult problem of estimating age and gender from unfiltered facial photos. Their primary goal is a precise method for determining a person's age and gender from unprocessed face photos. To map the extracted features to the target variables, the authors use machine learning techniques such as support vector machines (SVMs), random forests, and deep neural networks.
[4] Age-group classification from facial images is the focus of the paper "Subspace-based age-group classification using facial images under various lighting conditions" by K. Ueki, T. Hayashida, and T. Kobayashi, which appeared in the Proceedings of Automatic Face and Gesture Recognition. The paper specifically addresses the difficulties presented by different lighting conditions. Techniques such as principal component analysis (PCA) and eigenfaces are used to extract facial features that are invariant to changes in lighting. The authors use the extracted features to build a classification model that divides people into several age groups. The model uses methods such as linear discriminant analysis and support vector machines to obtain discriminative age-related features from the face photos.
[5] The paper "Age and Gender Classification Using Convolutional Neural Networks" by G. Levi and T. Hassner, presented at the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops, discusses the use of CNNs to classify gender and age from face pictures. The authors design a CNN architecture that works well for this task. To extract hierarchical information from the input images, the model stacks several convolutional layers with pooling layers. They assess the effectiveness of their approach for gender and age classification with appropriate evaluation measures.
III. METHODOLOGY
Dataset Preparation
In this study, we used the UTKFace dataset, which comprises more than 23,000 cropped and aligned face images labeled with age, gender, and ethnicity. Of the 23,708 images in total, only six had missing age labels. The dataset offers wide variability in facial attributes, including expression, illumination, pose, resolution, and occlusion, which makes it well suited to our research objectives. We selected it for its relatively uniform distributions across age, gender, and ethnicity and for its coverage of diverse conditions such as brightness, occlusion, and positioning, which reflect the general population [3]. This diversity was deemed essential for developing and assessing our models accurately. Each image in the dataset is associated with a three-part tuple of age, gender, and ethnicity, where female is represented as '1' and male as '0'. To ensure standardized results, we applied the same dataset partitioning strategy to both custom convolutional neural networks, with separate subsets for training, validation, and evaluation [8]. The partitioning also gives insight into the composition of the data splits with respect to gender and age.
Fig. 1. A few pictures from the UTKFace collection.
Data Preprocessing
In preparation for the age and gender estimation tasks, a systematic preprocessing pipeline was implemented for the image data. First, images
were resized to standard dimensions, 224 x 224 pixels for age estimation and 64 x 64 pixels for gender estimation, ensuring uniformity across the dataset [11]. Giving every image the same size avoids differences in image dimensions that could lead to differences in feature extraction and model performance. Color conversion varied with the task: images were converted to grayscale for age estimation to simplify processing, while the RGB format was retained for gender estimation to preserve color information. In addition, the pixel values of the images were normalized [13]. Normalization typically scales pixel intensities to a predetermined range (e.g., [0, 1] or [-1, 1]); here, pixel values were scaled to the range 0 to 1. Because the input features then have comparable magnitudes and no feature dominates during training, normalization helps stabilize the training process. Lastly, training and validation subsets of the dataset were formed after preprocessing, facilitating effective model training and evaluation. This preprocessing strategy laid a solid foundation for developing robust age and gender estimation models.
[Figure: model flowchart. Start, Input Conv2D Layer, AveragePooling2D Layer, Conv2D and AveragePooling2D Layers, GlobalAveragePooling2D Layer, Dense Layer, Output Dense Layer]
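The preprocessing steps described above (resizing, color conversion, and scaling to [0, 1]) can be sketched in plain Python as follows. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, and the nearest-neighbour resizing and the ITU-R BT.601 grayscale weights are assumptions, since the text does not state which interpolation or grayscale formula was applied.

```python
from typing import List, Sequence

Pixel = Sequence[int]          # (R, G, B), each channel in 0..255
Image = List[List[Pixel]]      # rows of pixels


def resize_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Resize with nearest-neighbour sampling (the interpolation is an assumption)."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def to_grayscale(img: Image) -> List[List[float]]:
    """Collapse RGB to one channel with the ITU-R BT.601 luma weights (assumed)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in img]


def scale_to_unit(values):
    """Recursively divide 8-bit intensities by 255 so they fall in [0, 1]."""
    if isinstance(values, (list, tuple)):
        return [scale_to_unit(v) for v in values]
    return values / 255.0


def preprocess_age(img: Image) -> List[List[float]]:
    """Age branch: 224 x 224, grayscale, scaled to [0, 1]."""
    return scale_to_unit(to_grayscale(resize_nearest(img, 224, 224)))


def preprocess_gender(img: Image) -> List[List[List[float]]]:
    """Gender branch: 64 x 64, RGB kept, scaled to [0, 1]."""
    return scale_to_unit(resize_nearest(img, 64, 64))


# Toy 2 x 2 image standing in for a UTKFace photo.
toy = [[(255, 255, 255), (0, 0, 0)],
       [(0, 0, 0), (255, 255, 255)]]
age_input = preprocess_age(toy)
gender_input = preprocess_gender(toy)
print(len(age_input), len(age_input[0]))                                  # 224 224
print(len(gender_input), len(gender_input[0]), len(gender_input[0][0]))   # 64 64 3
```

In practice an image library would handle decoding and resizing; the sketch only makes the order of operations and the two input shapes explicit.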
Model Architecture
• The Adam optimizer, widely used for training deep neural networks, was used to train the model with a fixed learning rate of 0.001.
• Training data was divided into batches of 32, which improves computational efficiency and enables parallel processing during training.
• The model was trained for 30 epochs, where one epoch is a full forward and backward pass of the network over the whole training dataset.
• Categorical cross-entropy was used as the loss function for age estimation, measuring the dissimilarity between predicted and actual age class distributions [2].
• Binary cross-entropy was used as the loss function for gender estimation, quantifying the disparity between predicted gender probabilities and ground truth labels.
Design Assessment
• The newly developed model was tested on the validation set for both the gender and age estimation tasks, ensuring an unbiased assessment of its performance.
• Accuracy and Mean Absolute Error (MAE) were used as evaluation metrics for age estimation, quantifying both the classification accuracy and the mean absolute difference between predicted and actual ages.
• Accuracy and F1-score were used as evaluation metrics for gender estimation, indicating how well the model categorizes gender labels and balances precision and recall [6].
• The evaluation results were compared with previous works in the field, enabling a thorough evaluation of the proposed model against existing methodologies.
• The model attained accuracy rates of around 78% for age estimation and 89% for gender estimation on the validation set, demonstrating its ability to predict age and gender from a face.
• Overall, the results underscore the efficacy of the CNN-based method for estimating gender and age from facial images and highlight its potential for real-world applications in various domains.
Age Detection Model
• Input layer: a Conv2D layer with 32 filters applies the convolution operation to the input images.
• An AveragePooling2D layer follows the input layer and applies average pooling to the feature maps to reduce their spatial dimensions.
• Three pairs of Conv2D and AveragePooling2D layers: the filters in the pairs are set to 256, 128, and 64, respectively, to capture more intricate details.
• A GlobalAveragePooling2D layer computes the average of every feature map over all spatial positions, producing one value per map.
• A fully connected Dense layer of 132 units introduces non-linearity and learns complex patterns.
• The final Dense layer uses the softmax activation to produce a probabilistic output.
Gender Prediction Model
• To preserve the most prominent features, the model uses MaxPooling2D layers rather than AveragePooling2D layers after each Conv2D layer; max pooling keeps the maximum value from each pooling window.
• The final Dense layer uses the softmax activation to produce a probabilistic output and comprises three nodes, each representing one of the three gender labels (male, female, or other) in the collection.
IV. TRAINING AND EVALUATION
Using the CNN architecture, the gender and age detection models were trained on 75% of the data and tested on the remaining 25% [5]. The age detection model was built from three pairs of Conv2D layers (one with 64 filters, the next with 128 filters, and so on) and AveragePooling2D layers, a GlobalAveragePooling2D layer, a Dense layer of 132 units, and a final Dense layer of seven units. The gender recognition architecture consisted of an input Conv2D layer with 32 filters, three sets of Conv2D layers with 256, 128, and 64 filters, respectively, MaxPooling2D layers, a Dense layer of 128 nodes, and a final Dense layer of 3 units. A set of 234,000 enhanced grayscale pictures from the dataset was used for training the gender detection algorithm. A ModelCheckpoint callback was created to save the best model based on validation accuracy. Training ran for 60 epochs with a batch size of 512, and the shuffle parameter was set to False in order to improve repeatability. During
this procedure, TensorBoard output was used to visualize the learning progress. A line chart of the accuracy and loss values was plotted to assess the performance of the model [12]. The chart showed accuracy rising and loss falling over the course of the epochs. The final model is used to estimate the age and gender of faces in new photos, with accuracy rates of 78.0% for age and 89.0% for gender identification.
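To make the age model's layer stack concrete, the following sketch traces feature-map shapes and trainable-parameter counts in plain Python. The text gives the layer types, the filter and unit counts, and the 224 x 224 grayscale input, but not the kernel size, padding, or pooling window. The 3 x 3 kernels, absence of padding, and 2 x 2 pooling used here are therefore assumptions, as is the 64, 128, 256 filter ordering (taken from the description in this section). The function names are illustrative only.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def conv2d(shape: Shape, filters: int, k: int = 3) -> Tuple[Shape, int]:
    """Unpadded k x k convolution, stride 1: returns (output shape, parameters)."""
    h, w, c = shape
    params = (k * k * c + 1) * filters  # kernel weights plus one bias per filter
    return (h - k + 1, w - k + 1, filters), params


def pool2d(shape: Shape, size: int = 2) -> Shape:
    """Non-overlapping pooling (average or max): shrinks height and width."""
    h, w, c = shape
    return (h // size, w // size, c)


def dense(inputs: int, units: int) -> int:
    """Parameter count of a fully connected layer (weights plus biases)."""
    return (inputs + 1) * units


def trace_age_model(input_shape: Shape = (224, 224, 1)) -> Tuple[List[str], int]:
    rows, total, shape = [], 0, input_shape
    for filters in (32, 64, 128, 256):       # input Conv2D, then three pairs
        shape, p = conv2d(shape, filters)
        total += p
        rows.append(f"Conv2D({filters}) -> {shape}, {p} params")
        shape = pool2d(shape)
        rows.append(f"AveragePooling2D -> {shape}")
    features = shape[2]                       # global pooling: one value per map
    rows.append(f"GlobalAveragePooling2D -> ({features},)")
    for units in (132, 7):                    # hidden Dense, then seven age classes
        p = dense(features, units)
        total += p
        rows.append(f"Dense({units}) -> ({units},), {p} params")
        features = units
    return rows, total


rows, total_params = trace_age_model()
print("\n".join(rows))
print("total trainable parameters:", total_params)
```

Under these assumptions the convolutional stack ends at 12 x 12 x 256 feature maps before global pooling, and the model has 422,695 trainable parameters. A different kernel size, padding mode, or filter ordering changes both figures.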
V. RESULT
Two measures are used to assess the effectiveness of the proposed methodology: classification accuracy and loss. Classification accuracy is computed by dividing the number of correctly identified photos in a subset by the total number of photos in that subset. This metric can be computed for every subset of the data, for both age and gender classification. The classification scores for age and gender are added together to yield the overall accuracy rate [14]. Likewise, the loss values for the gender and age categories add up to the total loss. The loss measures the difference between expected and actual values; a smaller number denotes better performance. Together, these metrics allow us to assess how well the proposed methodology identifies photographs by gender and age.
VI. CONCLUSION
Our CNN model for age prediction was built using an initial Conv2D layer with 32 filters and was trained on a dataset of 234,000 enhanced grayscale face photos. This layer was followed by three sets of Conv2D and AveragePooling2D layers with 256, 128, and 64 filters, respectively, and then by a GlobalAveragePooling2D layer. A first Dense layer with 132 units and a final Dense layer with seven units, representing different age ranges, were then added [10]. The model performed well, achieving an accuracy on the test dataset higher than 83.5%. However, it is critical to identify any potential differences in performance between circumstances and populations. Despite this, the age prediction CNN model holds promise across various domains, including law enforcement, healthcare, and marketing. Continued refinement and research stand to enhance the model's accuracy and applicability across diverse settings.
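For reference, the evaluation measures named in the Design Assessment and Result sections (accuracy, MAE, F1-score, and binary cross-entropy) can be computed as in the minimal sketch below. The function names and the toy labels are illustrative and are not taken from the experiments. Treating label 1 (female in the dataset's encoding) as the positive class for the F1-score is an assumption.

```python
import math
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Correctly classified samples divided by all samples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute difference between actual and predicted ages."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def binary_cross_entropy(y_true: Sequence[int], p_pred: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Average negative log-likelihood of the true binary labels."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


# Toy labels, made up for illustration only.
gender_true = [1, 0, 1, 1, 0, 0, 1, 0]
gender_pred = [1, 0, 0, 1, 0, 1, 1, 0]
ages_true = [25, 40, 8, 63]
ages_pred = [29, 35, 10, 60]

print(accuracy(gender_true, gender_pred))          # 0.75
print(f1_score(gender_true, gender_pred))          # 0.75
print(mean_absolute_error(ages_true, ages_pred))   # 3.5
```

Lower values are better for MAE and cross-entropy, while higher values are better for accuracy and F1-score.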
VII. REFERENCES
[1] M. K. Benkaddour, A. Bounoua, "Feature extraction and classification using deep convolutional neural networks, PCA and SVC for face recognition," International Information and Engineering Technology Association (IIETA), DOI: 10.3166/TS.34.77-91, 2017.
[2] J. Wu, W. A. P. Smith, E. Hancock, "Gender classification using shape from shading," ICIAR 2008: Image Analysis and Recognition, pp. 925-934, 2008.
[3] A. Golomb, T. Lawrence, T. Sejnowski, "Neural network identifies sex from human faces," Advances in Neural Information Processing Systems 3, 1990.
[4] A. Khan, A. Majid, A. Mirza, "Combination and optimization of classifiers in gender classification using genetic programming," International Journal of Knowledge-based and Intelligent Engineering Systems, vol. 9, no. 1, pp. 1-11, 2005.
[5] M. K. Yamaguchi, T. Kato, T. Akamatsu, "Relationship between physical traits and subjective impressions of the face - Age and sex information," IEICE Trans. J79-A(2), pp. 279-287, 1996.
[6] D. M. Burt, D. I. Perrett, "Perception of age in adult Caucasian male faces: computer graphic manipulation of shape and colour information," DOI: 10.1098/rspb.1995.0021, 1995.
[7] K. Ueki, T. Hayashida, T. Kobayashi, "Subspace-based age-group classification using facial images under various lighting conditions," Proceedings of Automatic Face and Gesture Recognition, 2006.
[8] H. Zhou, P. Miller, J. Zhang, "Age classification using Radon transform and entropy based scaling SVM," Proceedings of the British Machine Vision Conference, pp. 28.1-28.12, 2011.
[9] Y. H. Kwon, N. Lobo, "Age classification from facial images," Computer Vision and Image Understanding, 1999.
[10] J. Schmidhuber, "Deep learning in neural networks: An overview," Neural Networks, vol. 61, 2015.
[11] Y. Zhu, Y. Li, G. Mu, G. Guo, "A Study on Apparent Age Estimation," IEEE International Conference on Computer Vision Workshop, 2015.
[12] IMDB-WIKI datasets: https://data.vision.ee.ethz.ch/cvl/rrothe/imdbwiki/.
[13] E. Eidinger, R. Enbar, T. Hassner, "Age and gender estimation of unfiltered faces," IEEE, 2014.
[14] G. Levi, T. Hassner, "Age and Gender Classification Using Convolutional Neural Networks," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015.
[15] K. Zhang, C. Gao, L. Guo, M. Sun, X. Yuan, T. X. Han, Z. Zhao, B. Li, "Age Group and Gender Estimation in the Wild With Deep RoR Architecture," IEEE Access, vol. 5, 2017.
[16] Z. Qawaqneh, A. A. Mallouh, B. D. Barkana, "Deep convolutional neural network for age estimation based on VGG-face model," 2017.
[17] G. Antipov, M. Baccouche, J. Dugelay, "Effective Training of Convolutional Neural Networks for Face-Based Gender and Age Prediction," Pattern Recognition, vol. 72, 2017.
[18] A. Olatunbosun, S. Viriri, "Deeply Learned Classifiers for Age and Gender Predictions of Unfiltered Faces," The Scientific World Journal, 2020.
[19] S. M. Osman, N. Noor, S. Viriri, "Component-Based Gender Identification using local binary pattern," Computational Collective Intelligence: 11th International Conference, ICCCI, 2019.