FYP Hashim
KULLIYYAH OF ENGINEERING
MALAYSIA
MAY 2019
ABSTRACT
Features of the human face hold a lot of data that can be processed into useful
information such as age, emotional state, race and possibly the types of illness
a person suffers from. To extract this information, proper
analysis of facial data is required through image processing of facial images. Because
complete datasets of human faces labelled with their respective ages are scarce, it has
proved challenging to accurately determine the age of individuals based only on data
obtained from their faces.
In this paper, features of the human face are extracted using the OpenCV library for the
Python programming language, with functions and classes for edge detection,
Otsu thresholding and many other OpenCV operations. The specimens consist of
both images from volunteers and online data obtained from a data science website (kaggle.com).
The facial images used in this paper are of people aged 18 to 70 years. The obtained
data were used to train a deep convolutional neural network to estimate the
age of facial images in 4-year intervals. Part of this research will be conducted
using a JavaScript framework called Vue.js to estimate the age of each facial image
and compare the estimates with the results obtained using OpenCV and deep learning. JavaScript
will also be used to design the user interface of the resulting application. It
should however be noted that the JavaScript approach does not require a lot of testing data.
TABLE OF CONTENTS
Abstract
Table of Contents
List of Figures
Approval Page
Declaration
Acknowledgements
CHAPTER 1: INTRODUCTION
1.1 Background
1.2 Statement of the Problem
1.3 Objective of the Study
1.4 Scope of the Project
1.5 Significance of the Study
1.6 Organization of the Report
CHAPTER 3: METHODOLOGY
3.1 Introduction
3.2 Face Detection
3.3 Face Alignment
3.4 Face Age Feature Extraction
CHAPTER 5: CONCLUSION
REFERENCES
LIST OF FIGURES
APPROVAL PAGE
I certify that I have supervised and read this study and that, in my opinion, it conforms
quality, as a Final Year Project report as partial fulfilment for a degree of Bachelor of
……………………………………………………….
SEDIONO]
supervisor
DECLARATION
except where otherwise stated. I also declare that it has not been previously or
concurrently submitted as a whole for any other degrees at IIUM or other institutions.
ACKNOWLEDGEMENTS
All praises are due to Allah for providing me with the strength and means to
carry out my first final year project. My great thanks and appreciation go
to Dr. Wahju Sediono, who gave me all the guidelines required to see me through this
research. Owing to his guidance and perseverance throughout the semester, I managed to
complete my research. I would also like to thank the coordinators of this program, Dr.
Noor Hazrin Hany Bt Mohamad Hanif and Dr. Hazlina Bt Md. Yusof for giving us the
CHAPTER 1
INTRODUCTION
1.1 BACKGROUND
industrial and technical uses such as human-computer interfaces and video surveillance,
and for academic research, the need for facial age estimation cannot be overstated.
Facial age estimation is the process of assigning an exact age or a range of ages to the
facial image of a human being based on cues and features that can be extracted from the
Many different approaches have been used to estimate age; unfortunately, they
are either inaccurate or largely resource intensive. In the medical field, laboratories
depend on body fluid analysis, such as blood tests, cholesterol level, albumin
and other indicators, to determine a person’s “physiological age” and the
level of aging in a human being (Liao et al., 2018). However, the medical method of
estimating human age is largely inefficient and very expensive equipment and
features include the nose, eyes and lips. Equally important are shape-based and texture-based
cues that give information on the exact age or age range of a facial image. The
detection, face alignment, feature detection and then trained based on facial images
apparent in our society. There are very few available technologies that correctly
estimate the range of an individual’s age, much less the exact age of the individual.
Inaccurate age estimates are wrong data that make any decisions based on
them wildly inaccurate. Such decisions include, but are not limited to, commercial
purposes (e.g. targeted advertisement), social media (e.g. tagging systems on social
1. To develop a system that extracts all relevant data from an image of a human face
2. To develop a modular system for recognizing facial changes caused by aging
4. To evaluate the performance of the developed system
This project will mainly cover exploring methods of estimating age using texture-based
and shape-based cues obtained from facial images. Texture-based cues include
the smoothness, roughness and elasticity of the skin, while shape-based
cues are based on the symmetry of the face, the shape of the face, the jaw area and
the position of different features (eyes, nose and mouth) of the face. In addition,
anthropometric measurements of the face (size of the mouth, distance between the eyes,
the structure of the nose) will be extracted to add accuracy to the texture-based and
shape-based cues. Furthermore, all the data extracted from the face using the OpenCV
library will be used to train the deep convolutional neural network. The trained
algorithm will be tested on a fresh set of facial images to properly evaluate the
Figure 1.2: Different face shapes used in shape-based cues for age estimation
1.5 SIGNIFICANCE OF THE STUDY
The importance of this project lies in studying the human face and all the data that may be
hidden in it, studying the setbacks of existing systems, and designing a new system
with deep convolutional neural network (DCNN) algorithms that corrects the deficiency
This report consists of five chapters, namely the Introduction, Literature
Chapter 1 – Introduction: This chapter focuses on the basic ideas of the research, the
problem statement, its objectives and scope, as well as the background of the project
the research, giving detailed information about the research and information obtained
Chapter 3 – Research Methodology: This chapter elaborates on the methods that
will be used in this research to achieve results and reach conclusions. It will include
measurements and calculations.
Chapter 4 – Result and Discussion: This chapter examines the results of the experiments
Chapter 5 – Conclusion: This chapter draws important conclusions from the research
CHAPTER 2
LITERATURE REVIEW
2.1 INTRODUCTION
Automatic age estimation, which involves evaluating a person’s exact age or age
group, is a very important topic in human face image understanding. The task of
estimating exact human age adopts a fine-grained representation of the age labels, while the
task of age group estimation divides the labels into groups. The research focuses on
regression problem. Multi-class approaches simply treat the age values as independent
to-rank approaches to solve the age estimation problem recently. Unlike the
classification approaches that treat the labels naively as independent tags or the
regression approaches that treat the labels as simply proportional quantities, ranking
approaches can be trained by adopting the ordering property of the labels, thereby
In an effort to close the gap between the capabilities of age and gender
estimation systems and face recognition systems (let alone the gap between computer
and human capabilities) we take the following steps. First, we offer a public data-set
challenges of real-world age and gender estimation tasks. Besides being massive in
the number of images and subjects it includes, it is unique in the nature of its images.
These were obtained from online image albums captured by smart-phones and
therefore contain many images which would typically be removed by their owners as
seeks to model the high dimensional region in some feature space, which characterizes
how facial appearances change with age. To this end, previous methods have used
models, however, they assume accurate alignment of faces, and the underlying
assumption of the manifold structure of facial appearances across ages, which may not
strictly be true.
representing faces by obtaining facial measurements that change with a person’s age.
manually defined statistical models, to determine age. More recently, [18] presented a
similar approach for modeling the age of young people. These methods, based on their
need for accurate localization of facial feature detections, may not be suitable for the
A CNN is composed of convolutional layers with nonlinear activation functions. The output
convolutions are computed from a previous input layer. Convolutional layers apply
distinct filters and associate their results with the output. More specifically, the layer’s
parameters consist of a set of learnable filters, operating on small receptive fields that
extend through the full depth of the input. A sequence of these layers is responsible for
extracting features, and turning low-level features into high-level ones. The
network is responsible for learning the filter weights, allowing problems with different
image databases to use the same network. Nevertheless, the use of multiple
convolution layers will decrease the network processing speed, due to the number of
CHAPTER 3
METHODOLOGY
3.1 INTRODUCTION
In this chapter, the methods and procedures for testing and experimentation applied
throughout the project are explained in detail. The procedures are
1. Face Detection
2. Face Alignment
Different types of image dataset can be obtained: a labelled dataset that
comes with the year in which each facial image was taken, or an unlabelled dataset with no
information about when the facial image was taken. The former can be obtained from
online resources like flickr.com, while the latter can be obtained from data science
websites (like kaggle.com). It is however important to note that regardless of the type
of image dataset used (labelled or unlabelled), the process for face detection is similar
(Hu et al., 2016). For age estimation, the first level of computation is to identify the
faces in every input image using an OpenCV algorithm for face detection. Two main
modules are used for face detection. First, an input image is passed into a pyramid of 7
levels; the resulting image pyramid is used by the network to generate a new pyramid
of 256 feature maps at the fifth convolution layer (conv5). A 3 × 3 max filter is
applied to the feature pyramid at a stride of one to obtain the max5 layer, which
integrates the fifth convolution layer. Each face contains a bias which needs to be
pyramid, the resulting feature pyramid is called “norm5” (Ranjan et al., 2015). The
second module is trained on norm5 using a linear SVM. The root filter of a certain
width and height is convolved with the feature pyramid at each resolution level.
Scores are assigned to different regions of the image; scores above a certain threshold are
intersection over union (IoU) above 0.3 with a bounding rectangle to denote the
A facial image dataset often contains images of different sizes and
resolutions. The above method of face detection is advantageous because it can
detect faces regardless of the size and resolution of the image.
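The overlap measure used in the detection step can be illustrated with a short sketch. The function below computes the intersection over union of two rectangles, and a second helper applies a score threshold followed by greedy suppression of overlapping boxes. The helper names, the (x, y, w, h) box layout and the greedy strategy are assumptions made for illustration; the text above only states that scores are thresholded and that an IoU of 0.3 is involved.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Corners of the overlapping region (empty if the boxes do not touch).
    ix1, iy1 = max(ax, bx), max(ay, by)
    ix2, iy2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def merge_detections(scored_boxes, score_threshold, iou_threshold=0.3):
    """Keep high-scoring boxes and drop boxes that overlap a kept box.

    scored_boxes is a list of (score, (x, y, w, h)) pairs. The greedy
    strategy is an assumption, not necessarily the procedure used here.
    """
    candidates = sorted(
        (sb for sb in scored_boxes if sb[0] > score_threshold),
        key=lambda sb: sb[0],
        reverse=True,
    )
    kept = []
    for score, box in candidates:
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept
```

With this layout, two nearly identical rectangles around the same face collapse into the higher-scoring one, while a rectangle elsewhere in the image is kept as a separate face.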
3.3 FACE ALIGNMENT
Facial landmarks are parts of the face that are identifiably unique
and make its detection possible. Facial landmarks include, but are not limited to, the right
eye center, left eye center, nose base and lips. Consider the detected face
bounded by the rectangle in the previous section. The face alignment algorithm first crops
the facial image and assigns an initial mean shape S(0). The shape estimate S(t+1) at
stage (t+1) can be written in terms of the shape estimate at the previous stage, S(t), and a
shape regressor r_t, which determines the increment of the shape estimate using the
current shape estimate and the difference in pixel intensity values as the features.
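The cascaded update described above can be written compactly as follows. This is a reconstruction from the description in the text, where the symbol I for the input facial image is notation introduced here for illustration:

```latex
S^{(t+1)} = S^{(t)} + r_t\left(I, S^{(t)}\right), \qquad t = 0, 1, 2, \dots
```

Here S^{(0)} is the initial mean shape, and each regressor r_t returns the increment computed from pixel intensity differences of I indexed relative to the current shape estimate S^{(t)}.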
Once the facial landmarks have been detected, as shown in Figure 4 below, the facial
image is aligned into the canonical coordinate system with a similarity transform
computed from the facial landmarks. The image is then converted into grayscale with a resolution of
256 × 256.
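A minimal sketch of such a similarity transform is given below, assuming that only the two eye centers are used and that they should land at fixed canonical positions in the 256 × 256 output. The function name and the canonical eye positions (35% and 65% of the width, 40% of the height) are hypothetical choices for illustration, not values taken from this study.

```python
import numpy as np


def similarity_from_eyes(left_eye, right_eye, size=256,
                         target_left=(0.35, 0.40), target_right=(0.65, 0.40)):
    """Return a 2x3 matrix (rotation, uniform scale, shift) that maps the
    detected eye centers onto assumed canonical positions in a size x size image."""
    src_l = np.asarray(left_eye, dtype=float)
    src_r = np.asarray(right_eye, dtype=float)
    dst_l = np.asarray(target_left, dtype=float) * size
    dst_r = np.asarray(target_right, dtype=float) * size

    src_vec, dst_vec = src_r - src_l, dst_r - dst_l
    # Uniform scale and rotation that turn the detected eye line into the canonical one.
    scale = np.linalg.norm(dst_vec) / np.linalg.norm(src_vec)
    angle = np.arctan2(dst_vec[1], dst_vec[0]) - np.arctan2(src_vec[1], src_vec[0])

    cos_a, sin_a = scale * np.cos(angle), scale * np.sin(angle)
    rot = np.array([[cos_a, -sin_a], [sin_a, cos_a]])
    # Shift chosen so that the left eye lands exactly on its canonical position.
    shift = dst_l - rot @ src_l
    return np.hstack([rot, shift[:, None]])
```

The resulting matrix has the 2 × 3 shape expected by OpenCV's cv2.warpAffine, so it could be applied to the cropped face before the grayscale conversion.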
Deep convolutional neural networks have proven very efficient at extracting facial
features from facial images and detected landmarks. Hence, feature extraction can be
performed using the CNN learnt from the detected faces, which gives a vector
representation. The representation is compact and discriminative for faces. The
50,000 subjects as the classes. The aligned grayscale images obtained from the face
alignment procedure, with a resolution (shape) of 256 × 256, are used as input. The
DCNN used in this research consists of 10 convolution layers, 5 pooling layers and 1
fully connected layer. To estimate the age, a 3-layer neural network is used to learn
Each layer is followed by a parametric rectified linear unit (PReLU) activation
function except the last layer, which evaluates the age of the input facial image. The first
layer is the input layer, which takes the 320-dimensional feature vector obtained from
the face-identification task. The output of this layer, after the dropout and PReLU
operations, is fed to the first hidden layer containing 160 hidden units.
Subsequently, the output propagates to the second hidden layer containing 32 hidden
units. The output from this layer is used to generate a scalar value that would describe
the estimated age.
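The layer sizes described above (320 inputs, 160 and 32 hidden units, one scalar output) can be sketched as a plain NumPy forward pass. This is an illustrative sketch only: the class name, the random placeholder weights, the dropout rate of 0.5 and the PReLU slope of 0.25 are assumptions, since the trained parameters are not given in the text.

```python
import numpy as np


def prelu(x, alpha):
    """Parametric ReLU: identity for positive inputs, slope alpha otherwise."""
    return np.where(x > 0, x, alpha * x)


class AgeRegressor:
    """320 -> 160 -> 32 -> 1 network with placeholder (untrained) weights."""

    def __init__(self, seed=0, dropout_rate=0.5):
        self.rng = np.random.default_rng(seed)
        self.dropout_rate = dropout_rate  # assumed value
        self.w1 = self.rng.normal(0.0, 0.05, (320, 160))
        self.b1 = np.zeros(160)
        self.w2 = self.rng.normal(0.0, 0.05, (160, 32))
        self.b2 = np.zeros(32)
        self.w3 = self.rng.normal(0.0, 0.05, (32, 1))
        self.b3 = np.zeros(1)
        self.alphas = [0.25, 0.25, 0.25]  # assumed initial PReLU slopes

    def _dropout(self, x, training):
        if not training:
            return x
        keep = 1.0 - self.dropout_rate
        mask = self.rng.random(x.shape) < keep
        return x * mask / keep

    def forward(self, features, training=False):
        # Input layer: dropout followed by PReLU on the 320-d feature vector.
        x = prelu(self._dropout(features, training), self.alphas[0])
        # First hidden layer (160 units) and second hidden layer (32 units).
        x = prelu(x @ self.w1 + self.b1, self.alphas[1])
        x = prelu(x @ self.w2 + self.b2, self.alphas[2])
        # Last layer: no activation, one scalar per input face.
        return (x @ self.w3 + self.b3).squeeze(-1)
```

Calling forward on a batch of shape (N, 320) returns N scalar values; with trained weights these would be the age estimates.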
CHAPTER 4
4.1 RESULTS
The initial results obtained after testing the DCNN algorithm with different images are
explained in a modular fashion. First, the results of the face
detection are presented and discussed. Secondly, the face alignment algorithm is discussed at length,
together with the precautions that need to be taken in future studies of this nature. Lastly,
the section covers how the DCNN algorithm works to extract and estimate the age of each
The face detection aspect is largely done using the OpenCV Python library. The
the face image, which is in grayscale mode. The scale factor is used to adjust for
images that may be too close to the camera and, as a result, suffer from poor
exposure, which may lead to faces not being detected or non-facial regions being
detected as faces. The number of faces detected by the algorithm is determined
Figure 4.1: Face detection of a group of dancers
In this instance of testing, the algorithm detected a non-facial part of the body as a
face. This could be due to several issues. The first is the quality of the camera: it
produces a high resolution image. The picture was also taken from far away and in an area of high
light exposure that interferes with the lens of the camera. All these can be corrected by
that were carried out on the algorithm were accurate to a significant degree.
Figure 4.3: Face Detection Program Written in Python (OpenCV)
The importance of alignment has been stated in the methodology section.
Given a set of image data, the matrix shape of each individual image is different, and the
size and the color map (e.g. BGR, RGB, GRAY) of each individual image are
different. Alignment is the process of bringing all images in a dataset into a single common
form from which proper age estimation computation can be carried out. Without face
alignment, one would have to write different algorithms for different faces based on
their individual properties (size, shape, orientation and color map).
To successfully align every image in the dataset, the image must be centered and rotated
such that the left eye center and the right eye center lie on a
horizontal line, and the nose base and the lips lie on a vertical line.
Below are results of some alignment tests that were carried out on random images
It is worth noting that some images in the dataset may already be properly aligned, and
there will be no need to re-align them. In cases like this, it is important to save
computing resources by not passing them through the face alignment algorithm. Hence a
program is written to check whether each image in the dataset is aligned and, if it is,
to skip the alignment step for it.
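Such a check could be sketched as follows, using the alignment conditions stated earlier in this section: the eye centers should lie on a horizontal line and the nose base and lips on a vertical line. The function name and both tolerance values are hypothetical, and this sketch does not test whether the face is centered.

```python
import math


def is_aligned(left_eye, right_eye, nose_base, lip_center,
               max_tilt_deg=2.0, max_offset_ratio=0.05):
    """Return True if the landmarks already satisfy the alignment conditions.

    Landmarks are (x, y) pixel coordinates; tolerances are assumed values.
    """
    eye_dx = right_eye[0] - left_eye[0]
    eye_dy = right_eye[1] - left_eye[1]
    eye_dist = math.hypot(eye_dx, eye_dy)
    if eye_dist == 0:
        return False
    # Tilt of the eye line relative to the horizontal axis, in degrees.
    tilt = abs(math.degrees(math.atan2(eye_dy, eye_dx)))
    # Horizontal gap between nose base and lips, relative to the eye distance.
    offset = abs(nose_base[0] - lip_center[0]) / eye_dist
    return tilt <= max_tilt_deg and offset <= max_offset_ratio
```

Images for which the function returns True can be passed straight to feature extraction, while the rest go through the alignment algorithm.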
4.4 RESULT FOR FACE AGE FEATURE EXTRACTION AND AGE
ESTIMATION
Using a CNN, different factors that can affect the appearance of a human being are
taken into consideration. For example, people in different parts of the world age
differently depending on the climate of their place of origin, skin color and so on.
Moreover, emotion plays a huge role in the way people appear. As a result, extracting
features requires proper detection of landmarks on the face, as shown in Figure 11
below.
The estimated ages were given in ranges of 4-5 years because the difference in face attributes
over a period of one year is negligible; hence, the study will be more productive
with longer age ranges. Other important factors noted in age extraction are monitoring
the activity of the face and change in emotions of the face and the engagement of the
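To make the age ranges concrete, the sketch below maps an age to a 4-year interval over the 18 to 70 year span stated in the abstract. The bin boundaries (starting at 18 and closing with a shorter final bin) and the function name are assumptions for illustration; the report does not list the exact intervals used.

```python
def age_to_interval(age, min_age=18, max_age=70, width=4):
    """Map an integer age to (bin index, (lowest age, highest age)) of its interval."""
    if not min_age <= age <= max_age:
        raise ValueError(f"age {age} is outside the range {min_age}-{max_age}")
    index = (age - min_age) // width
    low = min_age + index * width
    # The last interval is cut off at max_age.
    high = min(low + width - 1, max_age)
    return index, (low, high)
```

Under these assumptions, an estimate reported as the interval (22, 25) covers every age from 22 to 25 inclusive, and the bin index can serve as the class label when the problem is treated as classification.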
Finally, the UI of the system was built with a JavaScript framework called Vue.js. It
makes it possible to host the whole system in a web browser since JavaScript is a web
which allows more image data to be tested on the system, thereby increasing its
efficiency.
CHAPTER 5
CONCLUSION
This is the first part of the study. The objectives highlighted in the introduction were all
achieved except monitoring the aging process. The scope covered in this study is a
process requires a lot of time and resources, particularly computing power,
because it involves many steps, including classifying all images
based on ethnicity, skin color, texture and shape of the face. Since all these classes
for them.
The results obtained in this experiment are based on voluntary testing of about 100
IIUM students and 10,000 facial images. Because of the small data sample, the
accuracy of the initial system is still low. For future studies, a larger amount of
data will be used to conduct the research. A DCNN algorithm only gets better when more
data is analyzed.
deeply researched and tested for possible further improvement and implementation
REFERENCES
Chang, K. and Chen, C. (2015) ‘A Learning Framework for Age Rank Estimation
based on Face Images with Scattering Transform’, IEEE Transactions on
Image Processing, 7149(c), pp. 1–14. doi: 10.1109/TIP.2014.2387379.
Eidinger, E., Enbar, R. and Hassner, T. (2014) ‘Age and Gender Estimation of
Unfiltered Faces’, IEEE Transactions on Information Forensics
and Security, 9(12), pp. 2170–2179.
Hu, Z. et al. (2016) ‘Facial Age Estimation with Age Difference’, IEEE
Transactions on Image Processing, 7149(c), pp. 1–11. doi:
10.1109/TIP.2016.2633868.
Liao, H. et al. (2018) ‘Age Estimation of Face Images Based on CNN and
Divide-and-Rule Strategy’, Mathematical Problems in Engineering, 2018, pp. 1–8.
Ranjan, R. et al. (2015) ‘Unconstrained Age Estimation with Deep Convolutional
Neural Networks’, IEEE International Conference on Computer Vision
Workshops, pp. 109–117. doi: 10.1109/ICCVW.2015.54.