Gender - and - Age - Detection - Using - Deep - Lear 2021
Article Info
Volume 7, Issue 3
Page Number: 604-610
Publication Issue: May-June-2021
Article History
Accepted: 10 June 2021
Published: 15 June 2021

ABSTRACT
For the past few years, gender and age detection has been an active area of study, and researchers have put considerable effort into contributing quality research in this area. The full pipeline, from preprocessing the data to building a model that gives high-precision results, is a tedious task for researchers. The field has immense untapped potential, as it can be used in monitoring, surveillance, human-computer interaction and security. However, existing methods still perform poorly on real-life images. Many difficult tasks in computer vision, speech recognition, and natural language processing are now handled well with deep learning. The deep learning approach is therefore growing remarkably, including in image classification. A comparative study of different algorithms for gender and age recognition is thus required in order to reach a high degree of precision.
Keywords : CNN, Adience dataset, Feature extraction, neural network, deep
learning
Copyright: © the author(s), publisher and licensee Technoscience Academy. This is an open-access article distributed under the
terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use,
distribution, and reproduction in any medium, provided the original work is properly cited
Utkarsha Kumbhar et al Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol, May-June - 2021, 7 (3) : 604-610
Deep learning is a subset of machine learning, itself a branch of artificial intelligence, that uses neural networks and can learn without supervision from unstructured or unlabelled data. It is also referred to as deep neural learning or deep neural networks. Deep learning offers numerous approaches and algorithms for the age and gender detection problem, and choosing which algorithms to use is a complex job. Data scarcity is also a major issue in determining gender and age.

Deep learning and machine learning can help solve a variety of problems. In this paper, a convolutional neural network is utilised to extract features from the Adience dataset, and a few classifiers are utilised to improve accuracy.

II. RELATED WORK

There are numerous systems on the market that can determine gender and age using various technologies, approaches, and techniques. Deep learning (DL) based techniques are an increasingly prominent set of approaches used by software engineering academics to automate development chores. The popularity of these techniques is due to their automated feature engineering capabilities, which help in the modelling of software artefacts. However, because deep learning techniques are being adopted at such a quick rate, it is difficult to distil the current research landscape's successes, failures, and prospects. Deep learning is a machine learning field with a lot of potential.

State-of-the-art studies in several scientific areas, including computer vision, object recognition, speech recognition, and natural language processing, have led to a revival of artificial intelligence over the past decade. Many researchers are currently using DL approaches to try to solve a variety of challenges in various domains.

Deep learning technology has substantially increased the quality of image classification and object recognition over the last seven years. Before deep learning methods were applied, the essential components of a gender recognition system were the feature descriptor and the classification approach. Many techniques have been used to extract discriminative features from facial images; they can be broadly classified into geometric and appearance-based methods. The former category uses geometric features such as the distance between the eyes, the length of the eyes and ears, and the length and width of the face. Appearance-based approaches, on the other hand, take the entire image into account rather than just a portion of it.

For the encoding of facial images, the Dyadic Wavelet Transform (DyWT) and Local Binary Pattern (LBP) were previously utilised. LBP is a state-of-the-art texture descriptor, and DyWT represents a facial image at multiple scales. DyWT decomposes an image into distinct sub-bands at different scales, which simplifies analysis. The DWT has also been utilised for face description; however, because it is not translation invariant, it is not the best choice for feature extraction. DyWT is a superior choice for face description since it is translation invariant. LBP, on the other hand, is superior at capturing local detail. A more recent method combines DyWT and LBP into a single description of facial images for the task of gender recognition. Deep learning provides a technique for dissecting an image into subcategories.

Convolutional neural networks (CNNs) have revolutionised the field of computer vision. They have not only improved image classification accuracy over time, but also help with generic feature extraction for tasks such as scene categorisation, object detection, image retrieval, image captioning and semantic segmentation. In image processing tasks, convolutional neural networks (CNNs) are one of the most powerful kinds
of deep neural networks and are widely utilised in computer vision applications. A convolutional neural network has three types of layers: convolution layers, subsampling layers, and fully connected layers.

III. PROPOSED METHOD AND MATERIALS

In the proposed system, the Adience dataset [15] is used for solving the problem. The Adience benchmark is used to categorise people based on their age and gender. The Adience set consists of images that were automatically uploaded to Flickr from smartphone devices. Because these photographs were posted without the deliberate filtering that is common on media webpages (e.g., photographs from the LFW collection [25]) or social websites (the Group Photos set [14]), viewing conditions in these photographs differ from those collections.
As a result, Adience photos capture extreme variations in head pose, lighting quality, and more. There are around 26K photos in the Adience collection, with 2,284 subjects. The collection is divided by age groups. A standard five-fold, subject-exclusive cross-validation methodology [15] is used to test both age and gender classification. These images are used in preference to more recent alignment approaches in order to emphasise the performance improvement attributable to the network architecture rather than to better pre-processing. Deep learning is used to determine a person's gender and age from a single photograph of their face. Gender is predicted as one of two classes, and age as one of the ranges (0-2), (4-6), (8-12), (15-20), (25-32), (38-43), (48-53) and (60-100).

Deep learning, also known as deep neural networks or neural learning, is a type of artificial intelligence (AI) that aims to mimic the brain's functions. It is a type of machine learning that uses functions to make decisions in a nonlinear way. Deep learning can make decisions without supervision on unstructured data. Object recognition, speech recognition, and language translation are some of the tasks performed through deep learning.

Deep learning is a subset of machine learning that analyses data using hierarchical neural networks. Within these hierarchical networks, neuron nodes are linked together in a way comparable to the human brain. Unlike standard linear programs, the hierarchical nature of deep learning allows a nonlinear approach, in which data is processed over a number of layers, each integrating further tiers of information.

Three common types of neural networks are ANN, RNN, and CNN. Each has its own importance in a number of situations. Feature engineering is a critical aspect of building a network-based model. It consists of two steps: feature extraction and feature selection. In the proposed system, a convolutional neural network (CNN) is employed to extract features from the dataset. CNN models are employed in a wide range of applications and domains, but they are especially common in image and video processing.

Figure 1. Proposed system neural architecture.
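The five-fold, subject-exclusive cross-validation protocol mentioned above means that all photos of one subject fall in the same fold, so no person appears in both the training and the test data. The sketch below illustrates the idea in plain Python. It is not the official Adience fold assignment: the function names, the `(image_id, subject_id)` sample format, the toy data and the greedy balancing rule are all assumptions made for illustration.

```python
from collections import defaultdict


def subject_exclusive_folds(samples, n_folds=5):
    """Split (image_id, subject_id) pairs into folds with disjoint subjects.

    Hypothetical helper: subjects are sorted by photo count and each one is
    placed in the currently smallest fold, keeping fold sizes roughly equal.
    """
    by_subject = defaultdict(list)
    for image_id, subject_id in samples:
        by_subject[subject_id].append(image_id)

    folds = [[] for _ in range(n_folds)]
    # Largest subjects first; each goes to the fold with the fewest images.
    for subject_id in sorted(by_subject, key=lambda s: (-len(by_subject[s]), s)):
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend((img, subject_id) for img in by_subject[subject_id])
    return folds


def train_test_splits(folds):
    """Yield (train, test) pairs, holding out one fold at a time."""
    for k, test in enumerate(folds):
        train = [s for i, fold in enumerate(folds) if i != k for s in fold]
        yield train, test


# Toy data (made up): 12 subjects with 1 to 3 photos each.
samples = [(f"img_{s}_{j}", f"subject_{s}") for s in range(12) for j in range(1 + s % 3)]
folds = subject_exclusive_folds(samples)
for train, test in train_test_splits(folds):
    train_subjects = {subj for _, subj in train}
    test_subjects = {subj for _, subj in test}
    # No subject may appear on both sides of a split.
    assert train_subjects.isdisjoint(test_subjects)
```

Grouping by subject before assigning folds is what makes the split subject-exclusive; shuffling individual images instead would let the same face leak into both training and test data.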
For both age and gender classification, we used our proposed network design throughout our studies. Figure 1 depicts our network architecture in its entirety; further details are given in the text. The network consists of only three convolutional layers and two fully-connected layers with a modest number of neurons, in contrast to the much bigger structures used in [16] and [17]. Our decision to use a smaller network was influenced both by our goal of avoiding overfitting and by the nature of the problem we are trying to solve: on the Adience set, age is classified into eight categories, and gender into only two. This is in contrast to, for example, the ten thousand identity classes used to train the facial recognition network in [48]. The network processes all three colour channels directly. Images are initially rescaled to 256 × 256 pixels, and a crop of 227 × 227 pixels is then sent to the network. The three convolutional layers are defined as follows.

1. In the first convolutional layer, 96 filters with a size of 3 × 7 × 7 pixels are applied to the input, followed by a rectified linear operator (ReLU), a max pooling layer that takes the maximum value of 3 × 3 regions with two-pixel strides, and a local response normalisation layer [28].
2. The second convolutional layer, which contains 256 filters of size 96 × 5 × 5 pixels, processes the preceding layer's 96 × 28 × 28 output. It is followed by ReLU, a max pooling layer, and a local response normalisation layer, with the same hyperparameters as before.
3. Finally, the third and final convolutional layer applies a set of 384 filters of size 256 × 3 × 3 pixels to the 256 × 14 × 14 blob, followed by ReLU and a max pooling layer.

In the CNN feature extraction strategy, only the top layer of the network is trained, and the rest of the network is kept fixed. There are several cases to consider with this method. If the new dataset is larger than the original one, the weights should be adjusted as well; this is the fine-tuning strategy.

Figure 2. Small dataset similar to the original dataset.

In Figure 2, transfer learning is applied to a small dataset that is closely similar to the original dataset. When the target dataset is small, a new fully connected layer is created while the previous weights are kept constant.

IV. RESULTS AND DISCUSSION

In the proposed system, a CNN is employed to extract features, and various classifiers are employed to achieve high accuracy.

The proposed system is organised into three primary modules. In the first two modules, images are captured from a live webcam or loaded from a saved location. Once an image is received as input, the OpenCV module is used to determine whether the person in the image is male or female and which age range they belong to.
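The layer sizes quoted in Section III (a 227 × 227 crop, a 96 × 28 × 28 output after the first block and a 256 × 14 × 14 blob after the second) can be checked with a little arithmetic. The sketch below is only a shape calculation in plain Python, not the network itself. The text does not state the convolution strides or padding, so the values used here (stride 4 without padding for the first layer, stride 1 with padding 2 and 1 for the second and third, and pooling that rounds up) are assumptions chosen because they reproduce the quoted sizes. The helper names are likewise hypothetical.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (rounded down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel=3, stride=2):
    """Spatial output size of 3x3, stride-2 max pooling, rounding up (assumed)."""
    return math.ceil((size - kernel) / stride) + 1


# (name, filters, kernel, stride, pad); stride and pad are assumptions.
LAYERS = [
    ("conv1", 96, 7, 4, 0),
    ("conv2", 256, 5, 1, 2),
    ("conv3", 384, 3, 1, 1),
]


def block_shapes(input_size=227):
    """Return (channels, height, width) after each conv + ReLU + pool block.

    ReLU and local response normalisation do not change the shape,
    so only the convolution and the pooling step are modelled.
    """
    shapes = []
    size = input_size
    for _name, filters, kernel, stride, pad in LAYERS:
        size = pool_out(conv_out(size, kernel, stride, pad))
        shapes.append((filters, size, size))
    return shapes


shapes = block_shapes()
for layer, shape in zip(LAYERS, shapes):
    print(layer[0], shape)
```

Under these assumptions the first two blocks come out as 96 × 28 × 28 and 256 × 14 × 14, matching the sizes given in the text, and the third block yields 384 × 7 × 7.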