

A Lightweight Deep Convolutional Neural Network Model for Real-Time Age and Gender Prediction


2020 Third International Conference on Advances in Electronics, Computers and Communications (ICAECC) | 978-1-7281-9183-6/20/$31.00 ©2020 IEEE | DOI: 10.1109/ICAECC50550.2020.9339503

Md. Nahidul Islam Opu∗ , Tanha Kabir Koly† , Annesha Das‡ and Ashim Dey§
Department of Computer Science & Engineering
Chittagong University of Engineering and Technology
Chittagong-4349, Bangladesh
Email: ∗ [email protected], † [email protected], ‡ [email protected], § [email protected]

Abstract—Recognition of age and gender has become a significant part of biometric systems, protection, and treatment. It is widely used to control access to age-related content, and social media platforms use it to deliver layered advertising and promotions that expand their reach. The application of face detection has grown to such an extent that it should be upgraded with various methods to achieve more accurate results. In this paper, we develop a lightweight deep convolutional neural network model for real-time age and gender prediction. To make the training dataset more diverse, the Wiki, UTKFace, and Adience datasets are merged into one containing 18728 images. Using this large mixed dataset, we achieve accuracies of 48.59% and 80.76% for age and gender, respectively. Further, the model is tested in real time. Experiments on the prepared dataset show that our model provides prediction accuracy competitive with most recent approaches.

Keywords—Convolutional Neural Network, Deep Learning, Facial Images, Age and Gender Prediction, Real-time recognition system.

I. INTRODUCTION

Human age and gender are considered important biometric traits for human identification. Age and gender prediction refers to the process of recognizing a person's face in a picture, identifying whether the person is male or female, and predicting the person's age. These two attributes play a vital role in our social life. Recognition of face attributes in real time is a very promising research topic. Recent research suggests that aging characteristics deeply learned from huge data contribute to a substantial improvement in facial image-based age estimation. Automatic age and gender detection has become important for a growing number of applications, especially after the rise of social networks and social media. In the real world, many technologies are related to age estimation and gender prediction, including product sales, biometrics, cosmetology, forensics, entertainment, etc. [1]. Age prediction also plays a significant role in crime investigation, as it helps to find the actual criminal based on the person's age.

However, the performance of established techniques on raw real-world images is still not up to the standard, particularly when compared to the enormous performance leaps for the associated face recognition task. While research on age estimation extends over decades, the study of apparent age, that is, age as interpreted from a face picture by other humans, is a recent effort. It should be noted that age detection from a single picture is not an easy task because the perceived age depends on several factors, and people of the same age look very different in different parts of the world. Successful age and gender prediction from facial pictures captured under real-world conditions will improve the results of identification. The applications of age and gender classification systems have been growing fast in recent years due to improved technology such as Haar cascade classifiers, deep multi-task learning and OpenCV [2]–[4]. Recently, deep neural networks have become popular for numerous applications to improve accuracy. A deep learning approach to age and gender classification is proposed in this paper, taking into account significant constraints of mobile applications.

In this work, we implemented a deep learning Convolutional Neural Network (CNN) solution to age and gender prediction from a single face image, combining three datasets with age and gender labels. This paper shows that with a deep CNN as presented here, it is possible to achieve remarkable performance. The main objectives of our work are:
• Build a lightweight CNN model.
• Train the CNN model using a large combined dataset.
• Estimate age and predict gender from facial images in real time.

Our multi-task learning system allows optimal features to be shared and learned in order to enhance recognition efficiency for both tasks. The CNN architecture that our model employs is designed specifically for age and gender estimation to increase time efficiency and reduce the model size while retaining consistent recognition output.

The rest of the paper is organized as follows. Related literature and studies are presented in Section II. The proposed methodology is described in Section III. In Section IV, results and performance analysis are presented. Finally, the paper is concluded in Section V.

II. LITERATURE REVIEW

Several notable studies have been carried out in the area of age and gender prediction using facial images. The early methods for


Authorized licensed use limited to: IEEE Xplore. Downloaded on July 09,2021 at 06:47:20 UTC from IEEE Xplore. Restrictions apply.
this task focused mainly on extracting and computing facial features.

In [5], a deep neural network was used that is computationally inexpensive and provides good accuracy on many competitive datasets. A deep CNN composed of locally connected hidden layers was proposed in [6]; the CAS-PEAL and FEI datasets were used for training.

Zukang Liao et al. [7] proposed using 9 overlapping patches per photo instead of a hundred patches to cover the whole region. The nine-patch method achieved 95.072% accuracy on the Labeled Faces in the Wild (LFW) dataset and 78.63% on the Adience dataset for gender prediction. For age classification, the accuracy was 40.25%. Rothe et al. [8] proposed a deep learning method for age prediction from a single face picture without using facial landmarks. Their main contributions are the IMDB-WIKI dataset, a regression formulation by deep classification, and an accuracy of 64.0%. In [9], a hybrid architecture was introduced that consists of a CNN and an extreme learning machine (ELM). Using the MORPH-II and Adience benchmark datasets, it achieved average accuracies of around 52% and 88% for age and gender prediction, respectively.

For age and gender prediction, a computationally efficient CNN model was designed for mobile platforms in [10]. In [4], the authors developed an Android app for age and gender using an LBP (Local Binary Patterns) classifier and an LBPH (Local Binary Pattern Histogram) model. A lightweight CNN was implemented for mobile applications in [11] using the Adience dataset; with LMTCNN-2-1, the top-1 accuracy was 44.26% for age and 85.16% for gender. A smartphone-based implementation using ocular images for gender prediction was presented in [12]. The VISOB dataset was used, and the accuracy was 78% and 86.2% on iPhone and Oppo images. A Gabor filter as CNN input and the Adience dataset were used in [13], achieving accuracies of 61.3% and 88.9% for age and gender, respectively. A video-based implementation in [14] used Dempster-Shafer theory to generate classifiers using different datasets such as IMFDB, Kinect, EmotiW 2018 and IJB-A. It increased the accuracy of age and gender detection by 2-5% and 5-10%, respectively. In [15], research was carried out using feed-forward propagation neural networks at a finer level with 3-sigma control limits. Using the JAFFE dataset, it achieved an accuracy of 95% for age and gender detection.

Using the LFW and Adience datasets, another local deep neural network was introduced in [16]. Accuracies of 96.02% and 80.64% for gender prediction were obtained on these two datasets, and an accuracy of 44.36% for age prediction was achieved by the proposed model. Another study [3] used two learning methods, single-task learning (STL) and deep multi-task learning (DMTL). The accuracy for CNN+STL and CNN+DMTL was 80.11% and 91.34% for gender prediction. For age prediction, the mean absolute error (MAE) was 4.01 and 4.00, respectively. Insha Rafique et al. [2] proposed a deep CNN for eight age groups and two gender groups. They obtained 79% accuracy using a deep CNN model and a Haar cascade classifier. Recently, in [17], a CNN was used with the ‘Google’ and ‘IMFDB’ datasets and reached approximately 80% accuracy. Another study used the IMDB-WIKI dataset and achieved approximately 96.50% accuracy for gender prediction and 85% for age prediction on its own dataset [1].

In recent years, other researchers developed a deep CNN architecture for gender detection with standard accuracy and low computational cost. They compared their architecture with other popular CNNs on common datasets, namely IMDB-WIKI, LFW, and Adience [18]. In [19], a lightweight model for age and gender estimation was implemented, using multi-task learning for an embedded system and multiple datasets such as MORPH-II, FG-NET, and MegaAge-Asian. In [20], the same MORPH-II and FG-NET datasets were used to develop an age estimation process based on a lightweight CNN and data augmentation. The authors proposed a mixed attention method combining regression and classification formulations. In [21], a system was developed to detect gender, age and emotion using a CNN, with a Haar cascade classifier and 10000 images used to train the model. A deep learning classifier was developed in [22] to predict age and gender from unfiltered images. Its CNN architecture consists of two parts that extract features and classify. The network was trained on the IMDB-WIKI, MORPH-II and OIU-Adience datasets and improved accuracy for age and gender prediction.

Studying and summarizing the above research, we find some drawbacks in these implementations, such as the CNN architecture, large CNN model size, high computing cost, dataset processing, and so on.

Analyzing the previous works, we can state two prime factors contributing to the success of age and gender prediction using CNNs:
• Changes in architecture.
• Dataset preparation.

III. METHODOLOGY

The main focus of this work is to develop a CNN model and train it using a large combined dataset. The methodology can be divided into two main parts:
• Training Phase
• Real-time Testing Phase

A. Training Phase

The steps of training are shown in Fig. 1. The first two steps are building a CNN and processing the dataset. After that, the CNN model is trained using the processed dataset. The last step is to save the model for future use. This is because training a deep learning model takes a lot of time and resources, so it is not feasible to train each time before predicting. After successful training, the model with its weights can be saved in memory. Whenever a prediction is required, the model is loaded from memory and used for prediction. This method also enables low-end devices, such as IoT devices and smartphones, to perform complex tasks such as

prediction or classification, since training is not possible on these low-end devices.

Fig. 1. Training Phase

We have learnt the two main ideas given below from deep learning research, and they are applied in our work:
• The more balanced and diverse the datasets are, the better the network learns to generalize and the more resilient it becomes to overfitting.
• The deeper the neural networks are, the greater the ability to model extremely non-linear transformations.

Dataset preparation and model creation and training are described here:

1) Preparing Dataset: For this study, we have selected three publicly available facial datasets: Wiki [8], UTKFace [23] and Adience [24]. The preprocessing steps are different for each of them. For the Wiki dataset, only the photos which contain a single face are selected. Then the age is calculated as given in (1).

age = date of photo taken − date of birth    (1)

For the UTKFace and Adience datasets, not much processing is needed as these datasets are well organized. We have merged these three datasets into one.

Fig. 2. Dataset Example

The total age range is divided into 8 age groups: 0-2, 4-6, 8-13, 15-20, 25-32, 38-43, 48-53 and 60-100, as is done in the Adience dataset. The total dataset is then down-sampled according to age so that each age group has an equal number of images, which results in a total of 18728 images. Fig. 2 shows some sample images from the dataset. The dataset distribution based on age and gender is shown in Fig. 3. Finally, the dataset is divided into three parts, with an 8:1:1 ratio of training, validation and testing.

Fig. 3. Dataset Distribution

2) Creating and Training CNN Model: The CNN is a deep neural network architecture used in computer vision tasks such as image recognition. Computer vision and image recognition are not new concepts but rather old ones. The architecture of a CNN, as shown in Fig. 4, has two parts: feature extraction and then classification. The feature extraction part contains several convolution layers and pooling layers. The convolution process extracts features from the input images. The output of this process is called a feature map.

Input → Conv → Pool → Conv → Pool → FC → Output
Fig. 4. A Simple CNN Architecture

The pooling layer is also called subsampling or down-sampling. This layer reduces the feature map by retaining only the most important information, for example by taking the average or maximum value. The classification unit generates output according to the input data. This unit has fully connected layers, meaning that all the neurons of the previous layer are connected with all the neurons in the next layer. Typically it uses the “Softmax” operation.

As stated earlier, a deeper model gives a better result. However, a deeper model also takes a lot of time to train and consumes a large amount of memory. Moreover, a deeper model is not well suited to integration into mobile devices. So, we have tried to make a trade-off between these.

The model that is used consists of a basic building block as shown in Fig. 5. 32 filters are used in each layer. Kernels of different sizes (3X3, 5X5, 3X1, 1X3, 1X5, 5X1 and 1X1) are applied in different layers. Max pooling is used for down-sampling without reducing dimension. By connecting three of these blocks, the CNN model is built. The last layers are fully
connected layers. The output layer for gender prediction has two output units or classes, 1 for male and 0 for female. The output layer for age prediction has 8 output units or classes.

[Figure: an input layer feeds parallel chains of Conv2D layers whose outputs are merged by Concatenate layers.]
Fig. 5. Basic Building Block of Proposed CNN Architecture

B. Real-time Testing Phase

In the real-time testing phase, after the completion of training, the model is used to predict on images captured via a camera and to display the result continuously on the screen. This involves several steps as shown in Fig. 6.

First, images are read from the camera. The face is detected in the images using OpenCV and Haar cascade classifiers. OpenCV is a library of programming functions specially targeted at real-time computer vision. A Haar cascade classifier can be explained as a trained machine learning model that can detect a face. The face is cropped from the main image and sent to the model for prediction. Then, the predicted age and gender are shown as output.

IV. EXPERIMENTAL RESULTS & ANALYSIS

Experiments were conducted on grayscale images for 100 epochs with a batch size of 32. Images were resized to 64x64 to reduce training time.

A. Performance Analysis

One of the primary goals of this paper is to create a lightweight model. By a lightweight model we mean a model that has a small number of parameters and a small model size that is suitable for mobile integration. The total number of parameters of the proposed model is 210,050, and the final model size is 2.60MB, which is light enough to execute on any mobile platform. By comparison, some well-known models like VGG and ResNet have sizes of more than 200MB [25]. After 100 epochs, training accuracies of 81.35% and 51.59% were achieved for gender and age, respectively. Test accuracy is 80.76% for gender and 48.59% for age on our combined dataset. Fig. 7 shows the model accuracy and loss for age and gender separately.
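The parameter count quoted in the performance analysis can be related to layer shapes and to storage with some simple arithmetic. The following is a minimal sketch, not the authors' code: the helper names and the example layer shapes are hypothetical, and the per-layer channel counts of the proposed model are not given in the paper, so only the standard counting formulas and the reported total of 210,050 are used.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Trainable parameters of one 2-D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)


def dense_params(in_features, units, use_bias=True):
    """Trainable parameters of one fully connected layer."""
    return in_features * units + (units if use_bias else 0)


def float32_size_mb(num_params):
    """Size of the raw weights in megabytes (10**6 bytes), 4 bytes per value."""
    return num_params * 4 / 1e6


# Hypothetical layer shapes, chosen only to illustrate the formulas:
# a 3x3 kernel on a 1-channel (grayscale) input with 32 filters,
# and a 1x5 kernel on a 32-channel feature map with 32 filters.
print(conv2d_params(3, 3, 1, 32))    # 3*3*1*32 + 32 = 320
print(conv2d_params(1, 5, 32, 32))   # 1*5*32*32 + 32 = 5152

# Raw float32 storage for the reported total of 210,050 parameters.
print(float32_size_mb(210_050))      # 0.8402
```

Under these assumptions the raw float32 weights alone would take about 0.84MB. The paper does not state what the saved 2.60MB file contains beyond the weights, so the remaining difference is not explained by this calculation.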

Fig. 6. Real-time Testing Phase

Fig. 7. Accuracy vs Epoch and Loss vs Epoch

In Table I, results are compared with several research works. The model is tested on the complete Adience dataset to compare with other research work, giving an accuracy of 46.71% for age and 81.38% for gender. The model is then trained only on the Adience dataset for the same purpose, after splitting the dataset into train, validation, and test sets (8:1:1). The training accuracy is 86.68% for gender and 64.22% for age, and the testing accuracy is 60.03% for age and 85.77% for gender. Our model is also

tested, after being trained on the Adience dataset alone, on both the Wiki and UTKFace datasets. It shows moderate gender accuracy of 79% and 72%, respectively. But it performs very poorly for age, showing less than 32% accuracy.

TABLE I
COMPARISON AMONG SEVERAL STATE OF THE ART

Research Work     Dataset Used                 Test Accuracy (Age)   Test Accuracy (Gender)
[5]               Adience                      61.3%                 91%
[7]               Adience + LFW                40.25%                78.63%
[9]               Adience + MORPH-II           52.3%                 88.2%
[13]              Adience                      61.3%                 88.9%
[11]              Adience                      44.26%                85.16%
[16]              Adience + LFW                44.36%                80.64%
Proposed Model    Adience                      60.03%                85.77%
Proposed Model    Adience + Wiki + UTKFace     48.59%                80.76%

B. Real-time Testing

The model trained on the mixed dataset was tested on real-time data. The image was processed as described in Section III-B. We got promising output in the real-time test. The run time of the model in real-time prediction averaged 0.0654 ± 0.0056 seconds per prediction over 100 predictions on the Windows platform. Fig. 8 shows the real-time test.

Fig. 8. Real-time Prediction

V. CONCLUSION & FUTURE WORKS

In this work, we developed a lightweight CNN model which is well suited to integration in mobile devices, and we achieved this without compromising too much accuracy. The model achieved an accuracy of 48.59% for age and 80.76% for gender using a large combined dataset. Compared with other state of the art works, the model built on the mixed dataset performs well on unknown data and shows good results in the real-time test. We plan to add more datasets from different sources and increase the accuracy for age. We also want to develop a smartphone application that can predict gender and age in real time using the proposed model. Our other idea is to upgrade the model for special cases such as faces with a mask.

REFERENCES

[1] B. Agrawal and M. Dixit, “Age estimation and gender prediction using convolutional neural network,” in International Conference on Sustainable and Innovative Solutions for Current Challenges in Engineering & Technology. Springer, 2019, pp. 163–175.
[2] I. Rafique, A. Hamid, S. Naseer, M. Asad, M. Awais, and T. Yasir, “Age and gender prediction using deep convolutional neural networks,” in 2019 International Conference on Innovative Computing (ICIC). IEEE, 2019, pp. 1–6.
[3] D. S. Al-Azzawi, “Human age and gender prediction using deep multi-task convolutional neural network,” Journal of Southwest Jiaotong University, vol. 54, no. 4, 2019.
[4] A. Salihbašić and T. Orehovački, “Development of android application for gender, age and face recognition using opencv,” in 2019 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO). IEEE, 2019, pp. 1635–1640.
[5] A. Dehghan, E. G. Ortiz, G. Shu, and S. Z. Masood, “Dager: Deep age, gender and emotion recognition using convolutional neural network,” arXiv preprint arXiv:1702.04280, 2017.
[6] K. Zeeshan, H. Kaleem, R. Malik, and S. Khalid, “Deepgender: real-time gender classification using deep learning for smartphones,” Journal of Real-Time Image Processing, 2017.
[7] Z. Liao, S. Petridis, and M. Pantic, “Local deep neural networks for age and gender classification,” arXiv preprint arXiv:1703.08497, 2017.
[8] R. Rothe, R. Timofte, and L. V. Gool, “Deep expectation of real and apparent age from a single image without facial landmarks,” International Journal of Computer Vision, vol. 126, no. 2-4, pp. 144–157, 2018.
[9] M. Duan, K. Li, C. Yang, and K. Li, “A hybrid deep learning cnn–elm for age and gender classification,” Neurocomputing, vol. 275, pp. 448–461, 2018.
[10] X. Zhang, X. Zhou, M. Lin, and J. Sun, “Shufflenet: An extremely efficient convolutional neural network for mobile devices,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 6848–6856.
[11] J.-H. Lee, Y.-M. Chan, T.-Y. Chen, and C.-S. Chen, “Joint estimation of age and gender from unconstrained face images using lightweight multi-task cnn for mobile applications,” in 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). IEEE, 2018, pp. 162–165.
[12] A. Rattani, N. Reddy, and R. Derakhshani, “Convolutional neural networks for gender prediction from smartphone-based ocular images,” IET Biometrics, vol. 7, no. 5, pp. 423–430, 2018.
[13] S. Hosseini, S. H. Lee, H. J. Kwon, H. I. Koo, and N. I. Cho, “Age and gender classification using wide convolutional neural network and gabor filter,” in 2018 International Workshop on Advanced Image Technology (IWAIT). IEEE, 2018, pp. 1–3.
[14] A. Kharchevnikova and A. V. Savchenko, “Neural networks in video-based age and gender recognition on mobile platforms,” Optical Memory and Neural Networks, vol. 27, no. 4, pp. 246–259, 2018.
[15] M. Dileep and A. Danti, “Human age and gender prediction based on neural networks and three sigma control limits,” Applied Artificial Intelligence, vol. 32, no. 3, pp. 281–292, 2018.
[16] Y. Zhang and T. Xu, “Landmark-guided local deep neural networks for age and gender classification,” Journal of Sensors, vol. 2018, 2018.
[17] K. Jain, M. Chawla, A. Gadhwal, R. Jain, and P. Nagrath, “Age and gender prediction using convolutional neural network,” in Proceedings of First International Conference on Computing, Communications, and Cyber-Security (IC4S 2019). Springer, 2020, pp. 247–259.
[18] A. Greco, A. Saggese, M. Vento, and V. Vigilante, “A convolutional neural network for gender recognition optimizing the accuracy/speed tradeoff,” IEEE Access, vol. 8, pp. 130 771–130 781, 2020.
[19] H.-T. Q. Bao and S.-T. Chung, “A light-weight gender/age estimation model based on multi-taking deep learning for an embedded system,” in Proceedings of the Korea Information Processing Society Conference. Korea Information Processing Society, 2020, pp. 483–486.
[20] X. Liu, Y. Zou, H. Kuang, and X. Ma, “Face image age estimation based on data augmentation and lightweight convolutional neural network,” Symmetry, vol. 12, no. 1, p. 146, 2020.
[21] S. Manasa, J. S. Abraham, A. Sharma, and K. Himapoornashree, “Age, gender and emotion detection using cnn,” International Journal of Advanced Research in Computer Science, vol. 11, no. Special Issue 1, p. 68, 2020.

[22] O. Agbo-Ajala and S. Viriri, “Deeply learned classifiers for age and
gender predictions of unfiltered faces,” The Scientific World Journal,
vol. 2020, 2020.
[23] Z. Zhang, Y. Song, and H. Qi, “Age progression/regression by condi-
tional adversarial autoencoder,” in IEEE Conference on Computer Vision
and Pattern Recognition (CVPR). IEEE, 2017.
[24] E. Eidinger, R. Enbar, and T. Hassner, “Age and gender estimation
of unfiltered faces,” IEEE Transactions on Information Forensics and
Security, vol. 9, no. 12, pp. 2170–2179, 2014.
[25] J. Fu and Y. Rui, “Advances in deep learning approaches for image
tagging,” APSIPA Transactions on Signal and Information Processing,
vol. 6, 2017.
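
As an illustration of the dataset preparation described in Section III-A (age from equation (1), the eight age groups, down-sampling to equal group sizes, and the 8:1:1 split), the steps can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' code: all function names and the `"group"` record field are hypothetical, ages are taken in whole years, ages that fall between two groups (for example 3 or 7) are simply left unlabeled because the paper does not say how they were handled, and the split uses integer rounding that the paper does not specify.

```python
import random
from collections import defaultdict
from datetime import date

# Age groups as listed in Section III-A (the Adience grouping).
AGE_GROUPS = [(0, 2), (4, 6), (8, 13), (15, 20),
              (25, 32), (38, 43), (48, 53), (60, 100)]


def age_in_years(photo_taken, birth):
    """Equation (1): date of photo taken minus date of birth, in whole years."""
    years = photo_taken.year - birth.year
    # Subtract one year if the birthday had not yet occurred in the photo year.
    if (photo_taken.month, photo_taken.day) < (birth.month, birth.day):
        years -= 1
    return years


def age_group(age):
    """Index of the age group containing `age`, or None for ages in a gap."""
    for index, (low, high) in enumerate(AGE_GROUPS):
        if low <= age <= high:
            return index
    return None


def balance_by_group(samples, rng):
    """Down-sample so every group keeps as many items as the smallest group."""
    buckets = defaultdict(list)
    for sample in samples:
        buckets[sample["group"]].append(sample)
    keep = min(len(items) for items in buckets.values())
    balanced = []
    for group in sorted(buckets):
        balanced.extend(rng.sample(buckets[group], keep))
    return balanced


def split_8_1_1(samples, rng):
    """Shuffle, then split into training, validation and test parts (8:1:1)."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# A photo taken on 2015-03-01 of a person born on 1985-07-15: 29 years old.
print(age_group(age_in_years(date(2015, 3, 1), date(1985, 7, 15))))  # 4
```

With 18728 images in 8 equal groups, each group would hold 2341 images, and this particular rounding rule would give 14982, 1872 and 1874 images for training, validation and testing; the paper itself reports only the total and the ratio.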
