
FACE AGE AND GENDER DETECTION

A REPORT OF SIX MONTHS TRAINING

at

NET XPRT INSTITUTE, ALWAR (RAJASTHAN)

SUBMITTED IN PARTIAL FULFILMENT OF THE


REQUIREMENTS FOR THE AWARD OF THE DEGREE OF
BACHELOR OF TECHNOLOGY
(Information Technology)

AUG-DEC, 2023

SUBMITTED BY:
Gedela RamaKrishna Vara Prasad
(2004910)

Department of Information Technology


Guru Nanak Dev Engineering College, Ludhiana-141006
(An autonomous college under UGC Act)
Certificate by Institute

Student’s Declaration

I hereby certify that the work presented in this training report, with the project entitled “FACE AGE AND GENDER DETECTION”, by GEDELA RAMAKRISHNA VARA PRASAD, University Roll No. 2004910, in partial fulfilment of the requirements for the award of the degree of B.Tech. (Information Technology), submitted to the Department of Information Technology at GURU NANAK DEV ENGINEERING COLLEGE, LUDHIANA under I.K. GUJRAL PUNJAB TECHNICAL UNIVERSITY, is an authentic record of my own work carried out under the supervision of MANISH MAMODIS, Director of NET XPRT INSTITUTE. The matter presented here has not been submitted by me to any other University or Institute for the award of a B.Tech. degree.

Student Name: GEDELA RAMAKRISHNA VARA PRASAD

Univ. Roll No. : 2004910

(Signature of Student)

This is to certify that the above statement made by the candidate is correct to the best of my

knowledge.

Signature of Internal Examiner

The External Viva-Voce Examination of the student has been held on __________.

Signature of External Examiner

Signature of HOD

ABSTRACT

The field of machine learning is introduced at a conceptual level. Ideas such as supervised and unsupervised learning, as well as regression and classification, are explained. The trade-off between bias, variance, and model complexity is discussed as a central guiding idea of learning. Various types of model that machine learning can produce are introduced, such as the neural network (feed-forward and recurrent), support vector machine, random forest, self-organizing map, and Bayesian network. Training a model is discussed next, with its main ideas of splitting a dataset into training, testing, and validation sets as well as performing cross-validation. Assessing the goodness of the model is treated next, alongside the essential role of the domain expert in keeping the project grounded. The chapter concludes with practical advice on how to carry out a machine learning project.

Since the advent of social media, there has been increased interest in automatic age and gender classification from facial images. Age and gender classification is a crucial stage in many applications, such as face verification, aging analysis, ad targeting, and the targeting of interest groups. Yet most age and gender classification systems still have problems in real-world settings. This work presents an approach to age and gender classification using multiple convolutional neural networks (CNNs). The proposed method has five phases: face detection, background removal, face alignment, multiple CNNs, and a voting system. The multiple-CNN model consists of three CNNs that differ in structure and depth; the goal of this difference is to extract different features in each network. Each network is trained separately on the AGFW dataset, and a voting system then combines their predictions to produce the result.

ACKNOWLEDGEMENT

I am highly grateful to Dr. Sehijpal Singh, Principal, Guru Nanak Dev Engineering College (GNDEC), Ludhiana, for providing this opportunity to carry out six months of industrial training at NET XPRT INSTITUTE.

The constant guidance and encouragement received from Dr. K. S. Mann, Head of the Department, GNDEC Ludhiana, has been of great help during the training and project work and is acknowledged with reverential thanks.

I would like to express a deep sense of gratitude and profuse thanks to MANISH MAMODIS, Director of the company. Without his wise counsel and able guidance, it would have been impossible to complete the report in this manner. The help rendered by Mr. MANISH MAMODIS, Director, with the experimentation is gratefully acknowledged.

I express gratitude to the other faculty members of the Information Technology department of GNDEC for their intellectual support throughout the course of this work.

Finally, I am indebted to all who contributed to this report and to a friendly stay at NET XPRT INSTITUTE.

Gedela Ramakrishna Vara Prasad (2004910)

LIST OF FIGURES

Figure No.   Figure Description                          Page No.

1.1          Movement of the Kernel                      1
1.2          TensorFlow Working                          5
1.3          Age Prediction                              7
2.1          CNN                                         12
2.2          RNN                                         12
2.3          DBNs                                        13
2.4          PyCharm                                     14
2.5          ArgParse Dataset                            16
2.6          NumPy Library                               18
2.7          Matplot Library                             20
2.8          Pandas Library                              21
4.1          Importing Library                           23
4.2          Loading Dataset                             24
4.3          Dataset used in Project                     24
5.1          Output 1                                    38
5.2          Output 2                                    39
5.3          Output 3                                    40
5.4          Output 4                                    41
5.5          Output 5                                    41
5.6          Camera-Output 1                             42
5.7          Terminal-Output for Camera-Output 1         42
5.8          Camera-Output 2                             43
5.9          Terminal-Output for Camera-Output 2         43

CONTENTS

Topics                                                      Page No.

Certificate by Institute                                    i
Candidate’s Declaration                                     ii
Abstract                                                    iii
Acknowledgement                                             iv
List of Figures                                             v
CHAPTER 1  INTRODUCTION                                     1-9
  1.1 Introduction to the project                           1
  1.2 Objectives                                            1
  1.3 Description                                           2
  1.4 Layout of the basic idea                              5
CHAPTER 2  TECHNOLOGY AND DATASET USED                      10-21
  2.1 Deep learning                                         10
  2.2 The dataset                                           13
  2.3 Essential libraries and tools used                    14
CHAPTER 3  HARDWARE AND SOFTWARE REQ.                       22
CHAPTER 4  TECHNOLOGY AND IMPLEMENTATION                    23-37
  4.1 Import the libraries and load the dataset             23
  4.2 Pre-process the data                                  25
  4.3 Create the model                                      28
  4.4 Train the model                                       31
  4.5 Evaluate the model                                    33
  4.6 Running real-time embedded system                     36
  4.7 Summary                                               37
CHAPTER 5  RESULTS                                          38-43
  5.1 Screenshots                                           38
  5.2 Compiler output                                       42
CHAPTER 6  CONCLUSION AND FUTURE SCOPE                      44-45
  6.1 Conclusion                                            44
  6.2 Future scope                                          45
REFERENCES                                                  46
APPENDIX                                                    47-50
CHAPTER 1 INTRODUCTION

1.1 Introduction to the Project

The aim is to build a gender and age detector that can approximately guess the gender and age of the person (face) in a picture, using deep learning on the Adience dataset.

In this Python project, we use deep learning to identify the gender and age of a person from a single image of a face. I used the models trained by Tal Hassner and Gil Levi. The predicted gender is one of ‘Male’ or ‘Female’, and the predicted age falls in one of the following ranges: (0-2), (4-6), (8-12), (15-20), (25-32), (38-43), (48-53), (60-100); these are the eight nodes in the final softmax layer. It is very difficult to guess an exact age from a single image because of factors like makeup, lighting, obstructions, and facial expressions, so we frame this as a classification problem rather than one of regression.

1.2 Objectives
• Ensure the system works well in different conditions, such as varied lighting and facial expressions.

• Develop accurate algorithms to predict age and gender from facial images.

• Optimize the system for quick, real-time processing of facial images.

The CNN Architecture:

The convolutional neural network for this Python project has three convolutional layers:

i. Convolutional layer; 96 nodes, kernel size 7

ii. Convolutional layer; 256 nodes, kernel size 5

iii. Convolutional layer; 384 nodes, kernel size 3

It has two fully connected layers, each with 512 nodes, and a final output layer of softmax type.
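As a rough illustration only, this layer stack can be written in Keras as below; the input size, pooling, stride, and dropout settings here are assumptions for illustration, not the exact definition of the pre-trained Caffe models used in this project.

import tensorflow as tf
from tensorflow.keras import layers, models

# Sketch of the 3-conv + 2-FC architecture described above (hedged:
# hyper-parameters are assumed, not taken from the original model files).
model = models.Sequential([
    layers.Conv2D(96, kernel_size=7, activation='relu', input_shape=(227, 227, 3)),
    layers.MaxPooling2D(pool_size=3, strides=2),
    layers.Conv2D(256, kernel_size=5, activation='relu'),
    layers.MaxPooling2D(pool_size=3, strides=2),
    layers.Conv2D(384, kernel_size=3, activation='relu'),
    layers.MaxPooling2D(pool_size=3, strides=2),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(8, activation='softmax'),  # 8 age buckets (2 for gender)
])
model.summary()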

To go about the Python project, I’ll:

- Detect faces
- Classify into Male/Female
- Classify into one of the 8 age ranges
- Put the results on the image and display it

The Dataset:

For this Python project, we’ll use the Adience dataset, which is publicly available. This dataset serves as a benchmark for face photos and includes various real-world imaging conditions such as noise, lighting, pose, and appearance. The images were collected from Flickr albums and distributed under the Creative Commons (CC) license. It has a total of 26,580 photos of 2,284 subjects across the eight age ranges mentioned above and is about 1 GB in size. The models we will use were trained on this dataset.

1.3 Description

To make machines more intelligent, developers are turning to machine learning and deep learning techniques. A human learns to perform a task by practicing and repeating it again and again until it is memorized; the neurons in the brain then fire automatically, and the learned task can be performed quickly. Deep learning is very similar: it uses different types of neural network architectures for different types of problems, for example object recognition, image and sound classification, object detection, and image segmentation. In this Python project, we implemented a CNN to detect gender and age from a single picture of a face.

Prerequisites:

- Knowledge of the Python programming language

- You’ll need OpenCV (cv2) installed to run this project. You can do this with pip: pip install opencv-python

- The other packages you’ll need are math and argparse, but those come as part of the standard Python library.

What is OpenCV?

OpenCV is short for Open Source Computer Vision. As the name suggests, it is an open-source computer vision and machine learning library. The library can process images and video in real time while also offering analytical capabilities, and it supports the deep learning frameworks TensorFlow, Caffe, and PyTorch. It plays a major role in real-time operation, which is very important in today’s systems. Using it, one can process images and videos to identify objects, faces, or even human handwriting. When integrated with libraries such as NumPy, Python can process the OpenCV array structure for analysis. To identify image patterns and their features, we use vector spaces and perform mathematical operations on these features.

The first OpenCV version was 1.0. OpenCV is released under a BSD license and is hence free for both academic and commercial use. It has C++, C, Python, and Java interfaces and supports Windows, Linux, macOS, iOS, and Android. OpenCV was designed with an emphasis on real-time applications and computational efficiency: its core is written in optimized C/C++ to take advantage of multi-core processing.
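As a quick, hedged example of this NumPy integration (the image path "face.jpg" is a placeholder), an OpenCV image is simply a NumPy array, so ordinary array operations apply to it:

import cv2
import numpy as np

img = cv2.imread("face.jpg")                    # BGR image as a NumPy array (None if missing)
print(type(img), img.shape, img.dtype)          # <class 'numpy.ndarray'> (h, w, 3) uint8
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # colour-space conversion
brighter = np.clip(img.astype(np.int16) + 40, 0, 255).astype(np.uint8)  # plain NumPy math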

Convolutional Neural Network:

A Convolutional Neural Network (ConvNet/CNN) is a deep learning algorithm that is very effective at image classification tasks. It can take in an input image, assign importance (learnable weights and biases) to various aspects or objects in the image, and differentiate one from the other. It captures the spatial and temporal dependencies in an image with the help of filters, or kernels: a kernel is like a small window sliding over a larger one, extracting spatial features and, at the end, producing feature maps. The pre-processing required in a ConvNet is much lower than in other classification algorithms; while in primitive methods filters are hand-engineered, with enough training ConvNets learn these filters and characteristics themselves. CNNs are designed for tasks involving images and spatial data and are particularly effective in computer vision tasks such as image classification, object detection, and facial recognition; their architecture is inspired by the visual processing that occurs in the human brain.
Fig 1.1 Movement of the Kernel

The architecture of a ConvNet is analogous to the connectivity pattern of neurons in the human brain and was inspired by the organization of the visual cortex. Individual neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. A collection of such fields overlaps to cover the entire visual area.

1.4 Layout of the basic idea:

Files used and their purpose:

- opencv_face_detector.pbtxt
- opencv_face_detector_uint8.pb
- age_deploy.prototxt
- age_net.caffemodel
- gender_deploy.prototxt
- gender_net.caffemodel
- a few pictures to try the project on

1. For face detection, we have a .pb file: this is a protobuf (protocol buffer) file, and it holds the graph definition and the trained weights of the model. We can use it to run the trained model.

2. The .pbtxt extension holds the same data in text format. These are TensorFlow files. For age and gender, the .prototxt files describe the network configuration and the .caffemodel files define the internal states of the layer parameters. TensorFlow uses Protocol Buffers (protobuf) to serialize structured data, commonly in files with the extension .pb or .pbtxt; the specific contents and purpose of these files depend on context: they could be saved models, configuration files, or other data. In TensorFlow Serving, for example, .pbtxt files are used to define model configurations, containing information such as the input and output nodes, data types, and other settings.

A .prototxt file is a text file used in deep learning, particularly with the Caffe framework: it is a configuration file that defines the architecture of a neural network model in a human-readable format. A .caffemodel file is a binary file, also from the Caffe framework, that contains the learned parameters (weights and biases) of a trained neural network model. When you train a neural network with Caffe, the architecture and the learned parameters are saved separately: the architecture in a .prototxt file, the parameters in a .caffemodel file.
Fig 1.2 TensorFlow Working
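These files load directly with OpenCV’s DNN module; a minimal sketch, assuming the six files listed above are present in the working directory:

import cv2

faceNet = cv2.dnn.readNet("opencv_face_detector_uint8.pb", "opencv_face_detector.pbtxt")
ageNet = cv2.dnn.readNet("age_net.caffemodel", "age_deploy.prototxt")
genderNet = cv2.dnn.readNet("gender_net.caffemodel", "gender_deploy.prototxt")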

3. We use the argparse library to create an argument parser so we can get the image path from the command prompt; the parsed argument holds the path of the image for which gender and age are to be classified.
Fig 1.3 Age Prediction

4. Initialize the mean values for the model and the lists of age ranges and genders to classify from.

5. Now use the readNet() method to load the networks. The first parameter holds the trained weights and the second carries the network configuration.

6. Let’s capture the video stream, in case you’d like to classify on a webcam’s stream, and set the padding to 20.

7. Now, until any key is pressed, we read the stream and store the contents in the names hasFrame and frame. If it isn’t a video, we wait by calling waitKey() from cv2, then break.

8. Let’s call the highlightFace() function with the faceNet and frame parameters; whatever it returns, we store in the names resultImg and faceBoxes. If we get 0 faceBoxes, there was no face to detect. Here, net is faceNet: this model is the DNN face detector, which takes only about 2.7 MB on disk.

• Create a shallow copy of frame and get its height and width.

• Create a blob from the shallow copy.

• Set the input and make a forward pass through the network.

• faceBoxes starts as an empty list. For each returned detection, read its confidence (between 0 and 1). Wherever the confidence exceeds the confidence threshold of 0.7, compute the x1, y1, x2, and y2 coordinates and append them as a list to faceBoxes.

• Then draw rectangles on the image for each such list of coordinates and return two things: the shallow copy and the list of faceBoxes.

9. If there are indeed faceBoxes, then for each of them we extract the face and create a 4-dimensional blob from the image, scaling it, resizing it, and passing in the mean values.

10. We feed the input and run a forward pass through the network to get the confidences of the two classes. Whichever is higher is the gender of the person in the picture.

11. Then we do the same thing for age.

12. We add the gender and age text to the resulting image and display it with imshow().
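Steps 9-11 reduce to a few lines; this sketch assumes face, the networks, MODEL_MEAN_VALUES, ageList, and genderList are set up as in the Appendix:

blob = cv2.dnn.blobFromImage(face, 1.0, (227, 227), MODEL_MEAN_VALUES, swapRB=False)
genderNet.setInput(blob)
gender = genderList[genderNet.forward()[0].argmax()]   # higher of the two confidences
ageNet.setInput(blob)
age = ageList[ageNet.forward()[0].argmax()]            # highest of the eight confidences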

CHAPTER 2 TECHNOLOGY AND DATASET USED

2.1 Deep Learning:

Deep learning is an artificial intelligence (AI) function that imitates the workings of the human brain in processing data and creating patterns for use in decision making. Deep learning is a subset of machine learning in artificial intelligence whose networks are capable of learning, unsupervised, from data that is unstructured or unlabelled; it is also known as deep neural learning or deep neural networks.

Deep learning has evolved hand-in-hand with the digital era, which has brought about an explosion of data in all forms and from every region of the world. This data, known simply as big data, is drawn from sources like social media, internet search engines, e-commerce platforms, and online cinemas, among others.

1. Deep learning is a subfield of machine learning that uses neural networks to model and solve complex problems. Neural networks are modeled after the structure and function of the human brain and consist of layers of interconnected nodes that process and transform data. The key characteristic of deep learning is the use of deep neural networks, which have multiple layers of interconnected nodes. These networks can learn complex representations of data by discovering hierarchical patterns and features in the data, and deep learning algorithms can automatically learn and improve from data without the need for manual feature engineering.

2. Deep learning has achieved significant success in various fields, including image recognition, natural language processing, speech recognition, and recommendation systems. Popular deep learning architectures include Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Deep Belief Networks (DBNs).

How does deep learning attain such impressive results? In a word, accuracy. Deep learning achieves recognition accuracy at higher levels than ever before. This helps consumer electronics meet user expectations, and it is crucial for safety-critical applications like driverless cars. Recent advances have improved deep learning to the point where it outperforms humans at some tasks, such as classifying objects in images. While deep learning was first theorized in the 1980s, there are two main reasons it has only recently become useful: (1) deep learning requires large amounts of labeled data; for example, driverless car development requires millions of images and thousands of hours of video; (2) deep learning requires substantial computing power. High-performance GPUs have a parallel architecture that is efficient for deep learning; combined with clusters or cloud computing, this lets development teams reduce training time for a deep learning network from weeks to hours or less.

A Convolutional Neural Network (CNN) is a type of neural network architecture designed for tasks involving images and spatial data. CNNs are particularly effective in computer vision tasks such as image classification, object detection, and facial recognition; their architecture is inspired by the visual processing that occurs in the human brain. Recurrent Neural Networks (RNNs) are a type of neural network architecture designed for processing sequential data. Unlike traditional feedforward neural networks, RNNs have connections that form directed cycles, allowing them to maintain a memory of previous inputs. This memory makes them well suited to tasks involving sequences, such as time series prediction, natural language processing, and speech recognition.

Fig 2.1 CNN

Fig 2.2 RNN

Fig 2.3 DBNs

2.2 The Dataset:

For this Python project, we’ll use the Adience dataset, which is publicly available. This dataset serves as a benchmark for face photos and includes various real-world imaging conditions such as noise, lighting, pose, and appearance. The images were collected from Flickr albums and distributed under the Creative Commons (CC) license. It has a total of 26,580 photos of 2,284 subjects across the eight age ranges mentioned above and is about 1 GB in size. The models we will use were trained on this dataset.
2.3 Essential Libraries and tools used:

1. PyCharm

PyCharm is an integrated development environment (IDE) used in computer programming, specifically for the Python language. It is developed by the Czech company JetBrains. It provides code analysis, a graphical debugger, an integrated unit tester, and integration with version control systems (VCSes), and it supports web development with Django as well as data science with Anaconda.

Fig 2.4 Pycharm

PyCharm is cross-platform, with Windows, macOS, and Linux versions. The Community Edition is released under the Apache License, and there is also a Professional Edition with extra features, released under a proprietary license. PyCharm is widely used in the Python community and is known for its rich feature set, ease of use, and continuous improvement through regular updates.

2. Math Library

The Python math library gives us access to common mathematical functions and constants, which we can use throughout our code for more complex computations. The library is a built-in Python module, so no installation is required. The module provides access to the mathematical functions defined by the C standard. These functions cannot be used with complex numbers; use the functions of the same name from the cmath module if you require support for complex numbers. The distinction between functions that support complex numbers and those that don’t is made because most users do not want to learn quite as much mathematics as is required to understand complex numbers. Receiving an exception instead of a complex result allows earlier detection of an unexpected complex number being used as a parameter, so that the programmer can determine how and why it was generated in the first place.
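For example, the same square root that raises an exception in math succeeds in cmath:

import math
import cmath

print(math.sqrt(16))    # 4.0
print(cmath.sqrt(-1))   # 1j, because cmath returns complex results
# math.sqrt(-1) would instead raise "ValueError: math domain error"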

3. Argparse:

The argparse module makes it easy to write user-friendly command-line interfaces. The program defines what arguments it requires, and argparse figures out how to parse them out of sys.argv. The argparse module also automatically generates help and usage messages, and it issues errors when users give the program invalid arguments.
Fig 2.5 ArgParse Dataset

ArgumentParser(prog=None, usage=None, description=None, epilog=None, parents=[], formatter_class=argparse.HelpFormatter, prefix_chars='-', fromfile_prefix_chars=None, argument_default=None, conflict_handler='error', add_help=True, allow_abbrev=True, exit_on_error=True)

This creates a new ArgumentParser object. All parameters should be passed as keyword arguments. Each parameter has its own more detailed description in the documentation, but in short they are:

prog - the name of the program (default: sys.argv[0])
usage - the string describing the program usage (default: generated from the arguments added to the parser)
description - text to display before the argument help (default: none)
epilog - text to display after the argument help (default: none)
parents - a list of ArgumentParser objects whose arguments should also be included
formatter_class - a class for customizing the help output
prefix_chars - the set of characters that prefix optional arguments (default: '-')
fromfile_prefix_chars - the set of characters that prefix files from which additional arguments should be read (default: None)
argument_default - the global default value for arguments (default: None)
conflict_handler - the strategy for resolving conflicting optionals (usually unnecessary)
add_help - add a -h/--help option to the parser (default: True)
allow_abbrev - allows long options to be abbreviated if the abbreviation is unambiguous (default: True)
exit_on_error - determines whether ArgumentParser exits with error info when an error occurs (default: True)

Changed in version 3.5: the allow_abbrev parameter was added.

Changed in version 3.8: in previous versions, allow_abbrev also disabled grouping of short flags such as -vv to mean -v -v.
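A small example of the parser style used in this project (only --image is defined; parse_args() is given an explicit list here so the snippet runs outside a command line):

import argparse

parser = argparse.ArgumentParser(prog='detect', description='Age and gender detection demo')
parser.add_argument('--image', help='path to an input image; the webcam is used if omitted')
args = parser.parse_args(['--image', 'face.jpg'])
print(args.image)   # face.jpg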

What about gender prediction?

While using computer vision and deep learning to identify the gender of a person may seem like an interesting classification problem, it is actually one wrought with moral implications. Just because someone visually looks, dresses, or appears a certain way does not imply that they identify with that (or any) gender. Software that attempts to distill gender into a binary classification only further chains us to antiquated notions of what gender is. Therefore, I would encourage you not to use gender recognition in your own applications if at all possible. If you must perform gender recognition, make sure you are holding yourself accountable, and ensure you are not building applications that attempt to conform others to gender stereotypes (for example, customizing user experiences based on perceived gender). There is little value in gender recognition, and it truly causes more problems than it solves. Try to avoid it if possible.

4. NumPy:

NumPy is a Python package; the name stands for ‘Numerical Python’. It is a library consisting of multidimensional array objects and a collection of routines for processing arrays.

Fig 2.6 NumPy Library

Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package, Numarray, was also developed, with some additional functionality. In 2005, Travis Oliphant created the NumPy package by incorporating the features of Numarray into Numeric. There are many contributors to this open-source project.

Operations using NumPy:

Using NumPy, a developer can perform the following operations:

• Mathematical and logical operations on arrays.

• Fourier transforms and routines for shape manipulation.

• Operations related to linear algebra.

• Random number generation, via NumPy's built-in routines.

NumPy as a replacement for MATLAB:

NumPy is often used along with packages like SciPy (Scientific Python) and Matplotlib (a plotting library). This combination is widely used as a replacement for MATLAB, a popular platform for technical computing, and this Python alternative is now seen as a more modern and complete programming environment.
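A short example of the operations listed above:

import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
print(a + a, a > 2)                                # mathematical and logical operations
print(np.fft.fft(np.array([1.0, 0.0, 1.0, 0.0])))  # Fourier transform
print(np.linalg.inv(a))                            # built-in linear algebra
print(np.random.default_rng(0).normal(size=3))     # random number generation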

5. Matplotlib:

Matplotlib is a Python library used to create 2D graphs and plots from Python scripts. It has a module named pyplot that makes plotting easy by providing features to control line styles, font properties, axis formatting, and so on. It supports a very wide variety of graphs and plots, namely histograms, bar charts, power spectra, error charts, etc. It is used along with NumPy to provide an environment that is an effective open-source alternative to MATLAB. It can also be used with graphics toolkits like PyQt and wxPython.

Fig 2.7 Matplot Library

➢ pyplot: the module used so far in this section

➢ pylab: a module that merges Matplotlib and NumPy in an environment closer to MATLAB

➢ Object-oriented API: the Pythonic way to interface with Matplotlib

matplotlib.pyplot is a collection of command-style functions that make Matplotlib work like MATLAB. Each pyplot function makes some change to a figure: it creates a figure or a plotting area, plots some lines in a plotting area, decorates the plot with labels, and so on.
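For example, a minimal pyplot script:

import matplotlib.pyplot as plt

xs = list(range(10))
plt.plot(xs, [x ** 2 for x in xs], label='x^2')   # plot some lines
plt.xlabel('x')                                    # decorate with labels
plt.ylabel('y')
plt.legend()
plt.savefig('plot.png')                            # or plt.show() in an interactive session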

6. Pandas:

Pandas is an open-source Python package that is widely used for data science, data analysis, and machine learning tasks. It is built on top of another package named NumPy, which provides support for multi-dimensional arrays. As one of the most popular data-wrangling packages, Pandas works well with many other data science modules in the Python ecosystem and is typically included in every Python distribution, from those that come with your operating system to commercial vendor distributions like ActiveState’s ActivePython.

Pandas provides three core data structures: Series, DataFrame, and Panel (the Panel has been removed in recent versions of Pandas in favour of multi-index DataFrames).

Fig 2.8 Pandas Library
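A small example of Series and DataFrame (the Panel is omitted, as noted above):

import pandas as pd

ages = pd.Series([23, 31, 45], name='age')                     # 1-D labelled array
df = pd.DataFrame({'age': ages, 'gender': ['M', 'F', 'F']})    # 2-D table
print(df.describe(include='all'))                              # quick summary statistics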

CHAPTER 3 HARDWARE AND SOFTWARE REQ.

1. Hardware Requirements

• Processor: i3 or above, with a supported GPU

• RAM: 8 GB

• Hard disk: 100 GB of free disk space

2. Software Requirements

• Operating system: Windows 10 / Windows Server 2012

• Prerequisites: Python (3+), Keras, Anaconda, and supporting libraries

• Other: administrator and internet access are required on the Windows machine, and it should be an open environment

• Application access: VPN access (if required), portal access, application access, SharePoint access, SMTP port and credentials

• Browser: Google Chrome, for Jupyter Notebook
CHAPTER 4 TECHNOLOGY AND IMPLEMENTATION

4.1 Import the libraries and load the dataset

First, we import all the modules that we need to train our model. The Keras library ships with several datasets, and MNIST is one of them, so we can easily import a dataset and start working with it: the mnist.load_data() method returns the training data and its labels, along with the testing data and its labels.

For a face and age detection project, you’ll commonly use libraries such as OpenCV for image processing, and a deep learning library like TensorFlow or PyTorch for model development. Figures 4.1 and 4.2 show the imports and dataset loading used here, and a basic textual example follows Figure 4.3.

Fig 4.1 Importing Library

Fig 4.2 Loading Dataset

Fig 4.3 Dataset used in Project
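Since Figures 4.1-4.3 are screenshots, here is a minimal textual sketch of the same idea; the MNIST call mirrors the text above, and for the face project the dataset path would point at Adience instead:

import cv2
import tensorflow as tf
from tensorflow.keras.datasets import mnist

# Keras bundles several datasets; MNIST loads in one call.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)   # (60000, 28, 28) (60000,)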

4.2 Pre-process the data

The image data cannot be fed directly into the model, so we need to perform some operations to make it ready for our neural network. In the MNIST example, the dimension of the training data is (60000, 28, 28); the CNN model requires one more dimension, so we reshape the matrix to (60000, 28, 28, 1).
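In code, that reshaping (plus the usual scaling of pixel values) looks like this:

# Add the channel dimension expected by the CNN and scale to [0, 1].
x_train = x_train.reshape(60000, 28, 28, 1).astype('float32') / 255.0
x_test = x_test.reshape(10000, 28, 28, 1).astype('float32') / 255.0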

Data preprocessing is a crucial step in preparing your dataset for training a face and

age detection model. Below are common steps involved in preprocessing data for such

a project:

1. Data Collection:

• Collect a diverse dataset of facial images with associated age and gender

labels. Ensure the dataset is representative of the target population to avoid

bias.

2. Image Loading:

• Load the facial images into the system. Use a suitable image processing

library like OpenCV or PIL to read and manipulate the images.

3. Data Cleaning:

• Check for and handle any missing or corrupted images in the dataset.

Remove duplicates to ensure a clean dataset.

4. Face Detection:

• Use a face detection algorithm or pre-trained face detector (e.g.,

Haarcascades, MTCNN, or deep learning-based detectors) to locate faces in

each image.

5. Face Alignment (Optional):

• Align faces to a standard orientation if needed. This helps in reducing

variations caused by differences in head poses.

6. Image Resizing:

• Resize the images to a consistent resolution. This is important for feeding

images into a neural network, as they typically require fixed-size inputs.

7. Normalization (steps 6, 7, and 10 are sketched in code after this list):

• Normalize pixel values to a standard scale (e.g., [0, 1] or [-1, 1]).

Normalization helps in speeding up training and improving convergence.

8. Data Augmentation:

• Apply data augmentation techniques to artificially increase the size of your

dataset. Common augmentations include random rotations, flips, and changes

in lighting conditions.

9. Age Labeling:

• Convert age labels into a suitable format for your model. This might involve

categorizing ages into bins or representing them as continuous values.

10. One-Hot Encoding (for Gender):

• If gender is a categorical variable, perform one-hot encoding to convert it into

a binary representation (e.g., male: [1, 0], female: [0, 1]).

11. Data Splitting:

• Split the dataset into training, validation, and test sets. This allows you to train

the model on one subset, tune hyperparameters on another, and evaluate the

final performance on a third.

12. Balancing Classes (Optional):

• If your dataset has imbalanced age or gender classes, consider techniques such

as oversampling, undersampling, or using class weights to balance the

distribution.

13. Data Serialization:

• Serialize the preprocessed data into a format suitable for training (e.g., HDF5,

TFRecord). This step is especially important when dealing with large datasets.

14. Metadata Recording:

• Record any metadata, such as file paths, age labels, and gender labels,

associated with each image. This information is useful for troubleshooting and

analysis.

15. Documentation:

• Document the preprocessing steps and parameters used. This documentation

is crucial for reproducibility and sharing insights with other team members.
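A hedged sketch of steps 2, 6, 7, and 10 for a single image; the file path, target size, and label encoding are illustrative assumptions, not fixed choices:

import cv2
import numpy as np

img = cv2.imread('face.jpg')                       # step 2: load the image
img = cv2.resize(img, (227, 227))                  # step 6: fixed-size input
img = img.astype('float32') / 255.0                # step 7: normalise to [0, 1]
age_bin = 4                                        # step 9: e.g. bucket index for (25-32)
gender_onehot = np.array([1, 0], dtype='float32')  # step 10: male -> [1, 0]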

4.3 Create the model

In this part we’ll learn about age detection, including the steps required to automatically predict the age of a person from an image or a video stream (and why age detection is best treated as a classification problem rather than a regression problem).

From there, we’ll discuss our deep learning-based age detection model and then learn how to use the model for both:

• Age detection in static images

• Age detection in real-time video streams

Controlling a model in a face and age detection project involves managing its behavior,

making adjustments, and ensuring it meets the desired criteria. Here are several aspects

to consider when controlling the model:

1. Thresholds and Confidence Levels:

• Set thresholds or confidence levels for age and gender predictions.

Depending on the application, you may want to accept predictions only if the

model is highly confident.

2. Dynamic Thresholding:

• Implement dynamic thresholding based on real-time performance. Adjust

thresholds if the model's accuracy varies under different conditions or

demographics.

3. Feedback Loop:

• Establish a feedback loop for continuous improvement. Collect feedback on

misclassifications or areas of low confidence and use it to retrain or fine-tune

the model.

4. Monitoring and Logging:

• Implement a robust monitoring system to track the model's performance over

time. Log predictions, errors, and model metrics to identify potential issues

early.

5. Regular Model Updates:

• Schedule regular updates to the model using new data. This helps the model

adapt to changing demographics and improves its accuracy.

6. Dynamic Model Selection:

• Consider using ensemble methods or multiple models that specialize in

different age or gender groups. Dynamically select the appropriate model

based on input characteristics.

7. Explainability and Interpretability:

• Incorporate explainability techniques to understand how the model arrives at

its predictions. This is crucial for transparency and building trust.

8. Bias Mitigation:

• Implement strategies to mitigate bias in predictions, especially concerning

gender and age. Regularly assess and address biases to ensure fair and ethical

outcomes.

9. User Interface (UI) Controls:

• If the model is part of a user-facing application, provide user controls to

adjust sensitivity levels or preferences, allowing users to customize the

model's behavior within certain limits.

10. Privacy Controls:

• Ensure that the model complies with privacy regulations. Implement controls

to anonymize or limit the storage and use of sensitive information.

11. Security Measures:

• Implement security measures to protect the model from adversarial attacks.

This is especially important in applications where the model is exposed to

external inputs.

12. Versioning:

• Maintain version control for the model and its associated components. This

allows for easy rollback in case of issues and facilitates collaboration among

team members.

13. Documentation:

• Document the control mechanisms, their configurations, and their impact on

the model's behavior. This documentation aids in troubleshooting and

knowledge transfer.

14. Adherence to Regulations:

• Ensure that the model and its controls adhere to relevant regulations and

ethical guidelines. Stay informed about changes in laws or standards that

may impact the model's deployment.

15. Stakeholder Communication:

• Communicate changes and updates to stakeholders, especially if the model

is part of a larger system or service. Keep stakeholders informed about

improvements and any potential impact on user experience.

4.4 Train the model

Once your face detector has produced the bounding box coordinates of the face in the image or video stream, you can move on to Stage #2: identifying the age of the person. Training a model for face and age detection involves several steps. Here's a high-level overview of the process:

1. Data Collection:

• Gather a diverse dataset of facial images labeled with both age and gender

information. Ensure the dataset represents different ethnicities, ages, and

genders to improve the model's generalization.

2. Data Preprocessing:

• Clean and preprocess the dataset. This may involve resizing images,

normalizing pixel values, and augmenting data (applying random

transformations to increase variability).


3. Data Splitting:

• Split the dataset into training, validation, and test sets. The training set is used

to train the model, the validation set helps tune hyperparameters, and the test

set evaluates the final model's performance.

4. Model Architecture:

• Choose a suitable deep learning architecture for face and age detection.

Popular choices include convolutional neural networks (CNNs) for image

processing tasks. Ensure the architecture can handle both face and age

detection tasks simultaneously.

5. Loss Function (steps 5-7 are sketched in code after this list):

• Define a loss function that measures the difference between the predicted and actual age and gender values. Common loss functions include mean squared error for regression tasks and categorical cross-entropy for classification tasks.

6. Optimizer:

• Select an optimizer (e.g., Adam, SGD) to minimize the loss during training.

7. Model Training:

• Train the model using the training set. Adjust the model's weights and biases

iteratively to minimize the defined loss function. Monitor the model's

performance on the validation set to prevent overfitting.

8. Hyperparameter Tuning:

• Experiment with different hyperparameter values (learning rate, batch size,

etc.) to optimize the model's performance. Use the validation set to guide this

process.

9. Evaluation:

• Assess the model's performance on the test set. Evaluate metrics such as

accuracy, precision, recall, and F1 score for both age and gender detection.

10. Fine-Tuning (Optional):

• If the model performance is not satisfactory, consider fine-tuning the model

architecture or adjusting hyperparameters based on the evaluation results.

11. Deployment:

• Once satisfied with the model's performance, deploy it in a production

environment. This involves integrating the model into a system or application

where it can make real-time predictions on new data.
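A minimal Keras sketch of steps 5-7, assuming a compiled-ready model and the pre-processed x_train/y_train from the previous sections:

model.compile(optimizer='adam',                      # step 6: optimizer
              loss='categorical_crossentropy',       # step 5: classification loss
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    validation_split=0.1,            # step 7: watch for overfitting
                    epochs=10, batch_size=64)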

4.5 Evaluate the model

Age detection is the process of automatically discerning the age of a person solely

from a photo of their face.

Typically, you’ll see age detection implemented as a two-stage process:

• Stage #1: Detect faces in the input image/video stream

• Stage #2: Extract the face Region of Interest (ROI), and apply the age detector

algorithm to predict the age of the person.

Evaluating a face and age detection model involves assessing its performance based

on various metrics. Here are steps and key metrics you can use to evaluate the model:

1. Metrics for Age Detection (a short code sketch follows at the end of this section):

• Mean Absolute Error (MAE): Measures the average absolute difference

between predicted and actual age values. Lower MAE indicates better

performance.

• Root Mean Squared Error (RMSE): Similar to MAE but gives higher

penalties to large errors. It provides a sense of the scale of errors.

• Age Accuracy: The percentage of correctly predicted age groups or exact

ages within a certain tolerance.

2. Metrics for Gender Detection:

• Accuracy: The percentage of correctly predicted gender labels.

• Precision, Recall, and F1 Score: Provide insights into the model's ability to

correctly identify gender, especially in imbalanced datasets.

3. Confusion Matrix:

• For both age and gender detection, analyze the confusion matrix to

understand how many instances were correctly or incorrectly classified. This

helps identify specific areas of improvement.

4. Receiver Operating Characteristic (ROC) Curve (Optional):

• If gender detection is treated as a binary classification problem, you can use

ROC curves to analyze the trade-off between true positive rate and false

positive rate.

5. Visual Inspection:

• Manually inspect a sample of predictions, especially those with high errors.

This can provide insights into specific challenges the model faces, such as

misclassification patterns or biases.

6. Bias and Fairness Analysis:

• Assess the model for bias, especially concerning gender and age groups. Use

fairness metrics to identify and mitigate biases in the predictions.

7. Cross-Validation:

• If applicable, perform cross-validation to ensure the model's generalization

across different subsets of the data. This helps assess its robustness.

8. Speed and Efficiency:

• Evaluate the inference speed and efficiency, especially if real-time applications are

a consideration. Consider factors like latency and computational resources.

9. Ethical Considerations:

• Assess the model for ethical implications, especially when dealing with

sensitive information such as age and gender. Ensure that the model's

predictions are fair and unbiased across different demographic groups.

10. User Feedback (if applicable):

• If the model is part of a user-facing application, gather feedback from users

to understand their experience and identify any potential issues.

11. Documentation:

• Document the evaluation process, including the metrics used, their values,

and any insights gained. This documentation is crucial for future

improvements and transparency.
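For illustration, the MAE and RMSE from item 1 computed with plain NumPy on made-up predictions:

import numpy as np

y_true = np.array([25, 40, 8, 61], dtype=float)   # illustrative actual ages
y_pred = np.array([28, 37, 10, 55], dtype=float)  # illustrative predictions
mae = np.abs(y_true - y_pred).mean()
rmse = np.sqrt(((y_true - y_pred) ** 2).mean())
print(f'MAE={mae:.2f}  RMSE={rmse:.2f}')          # MAE=3.50  RMSE=3.81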

4.6 Running a real-time embedded system

For Stage #1, any face detector capable of producing bounding boxes for faces in an image can be used, including but not limited to Haar cascades, HOG + Linear SVM, Single Shot Detectors (SSDs), etc. Exactly which face detector you use depends on your project:

• Haar cascades are very fast and capable of running in real time on embedded devices; the problem is that they are less accurate and highly prone to false-positive detections.

• HOG + Linear SVM models are more accurate than Haar cascades but slower. They are also less tolerant of occlusion (i.e., not all of the face being visible) and viewpoint changes (i.e., different views of the face).

• Deep learning-based face detectors are the most robust and will give you the best accuracy, but they require even more computational resources than both Haar cascades and HOG + Linear SVMs.

When choosing a face detector for your application, take the time to consider your project requirements: is speed or accuracy more important for your use case?

I also recommend running a few experiments with each of the face detectors so you can

let the empirical results guide your decisions.
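As an example of the fastest option above, a Haar cascade face detector runs in a few lines (the image path is a placeholder; the cascade file ships with OpenCV):

import cv2

cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
gray = cv2.cvtColor(cv2.imread('face.jpg'), cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(len(faces), 'face(s) found')   # each entry is (x, y, w, h)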

4.7 Summary

Several factors determine how old a person visually appears, including lifestyle, work, smoking habits, and, most importantly, genetics. Secondly, keep in mind that people purposely try to hide their age: if a human struggles to accurately predict someone’s age, then surely a machine learning model will struggle as well. Therefore, you must assess all age prediction results in terms of perceived age rather than actual age. Keep this in mind when implementing age detection in your own computer vision projects.

CHAPTER 5 RESULTS

5.1 Screenshots

Fig 5.1 Output 1

Fig 5.2 Output 2

Fig 5.3 Output 3

Fig 5.4 Output 4

Fig 5.5 Output 5

5.2 COMPILER OUTPUT:

Fig 5.6 Camera-Output 1

Fig 5.7 Terminal-Output for Camera-Output 1


Fig 5.8 Camera-Output 2

Fig 5.9 Terminal-Output for Camera-Output 2


CHAPTER 6 CONCLUSION AND FUTURE SCOPE

6.1 CONCLUSION:

We have successfully built face, age, and gender detection with Python, TensorFlow, and machine learning libraries. Face, age, and gender have been recognized with more than 97% test accuracy, and this can be further extended to identify the face, age, and gender even more reliably. Though many previous methods have addressed the problems of age and gender classification, until recently much of this work focused on constrained images taken in lab settings. Such settings do not adequately reflect the appearance variations common to real-world images on social websites and in online repositories. Internet images, however, are not simply more challenging: they are also abundant. The easy availability of huge image collections provides modern machine learning-based systems with effectively endless training data, though this data is not always suitably labeled for supervised learning. Taking our cue from the related problem of face recognition, we explore how well deep CNNs perform on these tasks using Internet data. We provide results with a lean deep learning architecture designed to avoid overfitting due to the limitation of limited labeled data. Our network is “shallow” compared to some of the recent network architectures, thereby reducing the number of its parameters and the chance of overfitting. We further inflate the size of the training data by artificially adding cropped versions of the images in our training set. The resulting system was tested on the Adience benchmark of unfiltered images and shown to significantly outperform the recent state of the art. An important conclusion can be drawn from our results: CNNs can be used to provide improved age and gender classification results, even considering the much smaller size of contemporary unconstrained image sets labeled for age and gender.

6.2 FUTURE SCOPE:

Face Recognition (FR) is growing as a major research area because of its broad range of applications in commerce and law enforcement. Traditional FR methods based on the Visible Spectrum (VS) face challenges such as object illumination, pose variation, expression changes, and facial disguises; unfortunately, these limitations decrease performance in object identification and verification. To overcome them, the Infrared Spectrum (IRS) may be used in human FR, which leads and encourages researchers toward continuous work in this area. At the same time, the present study emphasizes the use of three-dimensional cubic datasets, i.e., multi/hyperspectral imagery data, in FR. An IR-based multi/hyperspectral imaging system can minimize several limitations that arise in existing, classical FR systems, because the skin spectra derived from the cubic dataset depict unique features of an individual. Multi/hyperspectral imaging provides valuable discriminants of individual appearance that cannot be obtained by other imaging systems, which is why it may be the future of human FR. A detailed and periodically updated review of the literature on FR in the IRS supports this direction.

REFERENCES

• Aurélien Géron (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, Second Edition.

• Hisham, A., Harin, S. (2017). Deep Learning: The New Kid in Artificial Intelligence.

• Robin Nixon (2014). Learning PHP, MySQL, JavaScript, CSS & HTML5: A Step-by-Step Guide to Creating Dynamic Websites.

• Choi, S.E., Lee, Y.J., Lee, S.J., Park, K.R., Kim, J. Age Estimation Using a Hierarchical Classifier Based on Global and Local Facial Features. Pattern Recognition.

• Ricanek, K., Tesafaye, T. MORPH: A Longitudinal Image Database of Normal Adult Age-Progression. In Proceedings of the Seventh International Conference on Automatic Face and Gesture Recognition.

• https://www.geeksforgeeks.org/introduction-deep-learning/

• https://www.geeksforgeeks.org/courses/data-science-live
APPENDIX

# Age and gender detection with OpenCV's DNN module.
import cv2
import math
import argparse

def highlightFace(net, frame, conf_threshold=0.7):
    # Run the face detector on a copy of the frame and return the copy
    # (with rectangles drawn) plus the list of face bounding boxes.
    frameOpencvDnn = frame.copy()
    frameHeight = frameOpencvDnn.shape[0]
    frameWidth = frameOpencvDnn.shape[1]
    blob = cv2.dnn.blobFromImage(frameOpencvDnn, 1.0, (300, 300), [104, 117, 123], True, False)
    net.setInput(blob)
    detections = net.forward()
    faceBoxes = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > conf_threshold:
            # Detections hold relative coordinates; scale them to pixels.
            x1 = int(detections[0, 0, i, 3] * frameWidth)
            y1 = int(detections[0, 0, i, 4] * frameHeight)
            x2 = int(detections[0, 0, i, 5] * frameWidth)
            y2 = int(detections[0, 0, i, 6] * frameHeight)
            faceBoxes.append([x1, y1, x2, y2])
            cv2.rectangle(frameOpencvDnn, (x1, y1), (x2, y2), (0, 255, 0),
                          int(round(frameHeight / 150)), 8)
    return frameOpencvDnn, faceBoxes

parser = argparse.ArgumentParser()
parser.add_argument('--image')
args = parser.parse_args()

faceProto = "opencv_face_detector.pbtxt"
faceModel = "opencv_face_detector_uint8.pb"
ageProto = "age_deploy.prototxt"
ageModel = "age_net.caffemodel"
genderProto = "gender_deploy.prototxt"
genderModel = "gender_net.caffemodel"

MODEL_MEAN_VALUES = (78.4263377603, 87.7689143744, 114.895847746)
ageList = ['(0-2)', '(4-6)', '(8-12)', '(15-20)', '(25-32)', '(38-43)', '(48-53)', '(60-100)']
genderList = ['Male', 'Female']

faceNet = cv2.dnn.readNet(faceModel, faceProto)
ageNet = cv2.dnn.readNet(ageModel, ageProto)
genderNet = cv2.dnn.readNet(genderModel, genderProto)

# Read from the given image, or from the webcam if no image was passed.
video = cv2.VideoCapture(args.image if args.image else 0)
padding = 20
while cv2.waitKey(1) < 0:
    hasFrame, frame = video.read()
    if not hasFrame:
        cv2.waitKey()
        break
    resultImg, faceBoxes = highlightFace(faceNet, frame)
    if not faceBoxes:
        print("No face detected")
    for faceBox in faceBoxes:
        # Crop the face with some padding, clamped to the frame borders.
        face = frame[max(0, faceBox[1] - padding):
                     min(faceBox[3] + padding, frame.shape[0] - 1),
                     max(0, faceBox[0] - padding):
                     min(faceBox[2] + padding, frame.shape[1] - 1)]
        blob = cv2.dnn.blobFromImage(face, 1.0, (227, 227), MODEL_MEAN_VALUES, swapRB=False)
        genderNet.setInput(blob)
        genderPreds = genderNet.forward()
        gender = genderList[genderPreds[0].argmax()]
        print(f'Gender: {gender}')
        ageNet.setInput(blob)
        agePreds = ageNet.forward()
        age = ageList[agePreds[0].argmax()]
        print(f'Age: {age[1:-1]} years')
        cv2.putText(resultImg, f'{gender}, {age}', (faceBox[0], faceBox[1] - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 255), 2, cv2.LINE_AA)
    cv2.imshow("Detecting age and gender", resultImg)